Combinatorial number system

Lizheng Wang

2024-08-30 (Updated: 2024-08-31)

Mathematics

Introduction

The combinatorial number system represents a natural number as a sum of binomial coefficients. It gives a correspondence between the natural numbers $\mathbb{N}$ (taken to include 0) and k-combinations.

By fixing the number of binomial coefficients, say to $r$, any non-negative integer $m$ can be represented uniquely as a sum of $r$ binomial coefficients. A different value of $r$ results in a different combinatorial number system.
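
As a small worked example (the numbers are chosen here purely for illustration), take $m=72$. With $r=3$ we get

$72=\binom{8}{3}+\binom{6}{2}+\binom{1}{1}=56+15+1$

while with $r=2$ the same number is written as

$72=\binom{12}{2}+\binom{6}{1}=66+6$

In both cases the upper entries are strictly decreasing from left to right ($8>6>1$ and $12>6$).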

Definition (Combinadics). Fix $r \geq 1$. For every $m \in \mathbb{N}_{\geq 0}$, there exist unique $C_r, \cdots, C_i, \cdots, C_1$ such that $C_r > \cdots > C_1 \geq 0$ (that is, $C_j>C_i$ for $j>i$) and

$\begin{aligned} m & =\sum_{i=1}^r\binom{C_i}{i} \\ & =\binom{C_r}{r}+\binom{C_{r-1}}{r-1}+\cdots+\binom{C_i}{i}+\cdots+\binom{C_2}{2}+\binom{C_1}{1} \end{aligned}$
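
To make the definition concrete, here is a minimal Python sketch that converts between $m$ and its digits $C_r, \ldots, C_1$. The function names `to_combinadic` and `from_combinadic` are our own. Picking the largest possible $C_i$ at each step is one common way to find the representation; the definition itself does not prescribe it. The linear search is meant for small inputs only.

```python
from math import comb


def to_combinadic(m: int, r: int) -> list[int]:
    """Return [C_r, ..., C_1] with C_r > ... > C_1 >= 0 and sum of comb(C_i, i) == m."""
    if m < 0 or r < 1:
        raise ValueError("need m >= 0 and r >= 1")
    digits = []
    for i in range(r, 0, -1):
        # Start at i - 1, where comb(i - 1, i) == 0, and walk up to the
        # largest c with comb(c, i) <= m.
        c = i - 1
        while comb(c + 1, i) <= m:
            c += 1
        digits.append(c)
        m -= comb(c, i)
    return digits


def from_combinadic(digits: list[int]) -> int:
    """Inverse of to_combinadic: digits[k] is paired with the lower index r - k."""
    r = len(digits)
    return sum(comb(c, r - k) for k, c in enumerate(digits))


print(to_combinadic(72, 3))        # [8, 6, 1]
print(from_combinadic([8, 6, 1]))  # 72
```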

Lemma. Using Pascal’s triangle, we can get

$\binom{n+1}{r}=\sum_{j=0}^r\binom{n-r+j}{j}=\sum_{j=0}^r\binom{n-j}{r-j}$

Proof. We use the basic property of Pascal’s triangle: every entry is the sum of two entries in the preceding row, the one directly above it and the one above it to the left. Applying this property repeatedly to the last term gives

$\begin{aligned}\binom{n+1}{r}= & \binom{n}{r}+\binom{n}{r-1} \\ = & \binom{n}{r}+\binom{n-1}{r-1}+\binom{n-1}{r-2} \\ = & \binom{n}{r}+\binom{n-1}{r-1}+\binom{n-2}{r-2}+\binom{n-2}{r-3} \\ & \vdots \\ = & \binom{n}{r}+\binom{n-1}{r-1}+\cdots+\binom{n-r+i}{i}+\cdots+\binom{n-r+1}{1}+\binom{n-r}{0}\end{aligned}$

As the last term is $\binom{n-r}{0}=1$, we have

$\binom{n+1}{r}=\sum_{j=0}^{r-1}\binom{n-j}{r-j}+1 > \sum_{j=0}^{r-1}\binom{n-j}{r-j}$
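
The identity in the lemma is easy to spot-check numerically. The sketch below compares both sums against $\binom{n+1}{r}$ for small values with $0 \leq r \leq n$. The helper name `lemma_holds` is our own. This is only a sanity check, not a proof.

```python
from math import comb


def lemma_holds(n: int, r: int) -> bool:
    """Check comb(n+1, r) against both sums in the lemma, for 0 <= r <= n."""
    left = comb(n + 1, r)
    first_sum = sum(comb(n - r + j, j) for j in range(r + 1))
    second_sum = sum(comb(n - j, r - j) for j in range(r + 1))
    return left == first_sum == second_sum


# Example: comb(6, 2) = comb(5, 2) + comb(4, 1) + comb(3, 0) = 10 + 4 + 1
print(lemma_holds(5, 2))
print(all(lemma_holds(n, r) for n in range(30) for r in range(n + 1)))
```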

With this lemma we can prove the bijection between the natural numbers and the k-combinations.

Existence. We prove it using mathematical induction on $m$.

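Base case: for $m=0$, take $C_i=i-1$, so that every term vanishes: $0=\binom{r-1}{r} + \cdots + \binom{1}{2} + \binom{0}{1}$
Inductive hypothesis: assume that the statement is true for $m$, that is, $m = \binom{C_r}{r} + \cdots + \binom{C_2}{2} + \binom{C_1}{1}$ with $C_r > \cdots > C_1 \geq 0$.
Inductive step: adding 1 to both sides of the above equation gives $m + 1 = \binom{C_r}{r} + \cdots + \binom{C_2}{2} + \binom{C_1}{1} + 1$
Let $j \geq 1$ be the largest number such that the first $j$ of the $C_i$ are consecutive, that is, $C_{l+1}=C_l+1$ for $1 \leq l<j$. By the lemma (applied with $n=C_1+j-1$ and with $j$ in place of $r$), these $j$ terms and the extra 1 collapse into a single binomial coefficient: $\binom{C_1+j-1}{j}+\cdots+\binom{C_1+1}{2}+\binom{C_1}{1}+1 = \binom{C_1+j}{j}$ (for $C_1=0$ both sides equal 1).
Filling the lower $j-1$ positions with zero terms, we have $m + 1 = \binom{C_r}{r} + \cdots + \binom{C_1+j+\alpha}{j+1} + \binom{C_1+j}{j} + \binom{j-2}{j-1} + \cdots + \binom{1}{2} + \binom{0}{1}$
Here $C_{j+1}=C_1+j+\alpha$ with $\alpha \geq 1$, because $C_{j+1}$ is not consecutive with the first $j$ of the $C_i$ (if $j=r$, there is no such term). The upper entries are still strictly increasing, since $j-2 < C_1+j < C_1+j+\alpha$, so this is a valid representation of $m+1$.
So we have proved that every natural number has a representation of the required form.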
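
The inductive step is constructive, so it can be turned into a successor function. In the sketch below, the list `c` stores $C_1, \ldots, C_r$ in ascending order, so `c[i - 1]` holds $C_i$. The names `successor` and `value` are our own. Starting from the base case and applying `successor` repeatedly enumerates the representations of $0, 1, 2, \ldots$ in order.

```python
from math import comb


def value(c: list[int]) -> int:
    """Sum of comb(C_i, i), where c[i - 1] holds C_i."""
    return sum(comb(ci, i) for i, ci in enumerate(c, start=1))


def successor(c: list[int]) -> list[int]:
    """Given the representation of m, return the representation of m + 1."""
    r = len(c)
    # j = length of the maximal run of consecutive values starting at C_1.
    j = 1
    while j < r and c[j] == c[j - 1] + 1:
        j += 1
    nxt = list(c)
    nxt[j - 1] = c[j - 1] + 1      # C_j + 1, which equals C_1 + j
    for i in range(1, j):          # reset C_i to i - 1 for i < j (zero terms)
        nxt[i - 1] = i - 1
    return nxt


c = [0, 1, 2]                      # base case for r = 3, representing m = 0
for m in range(5):
    print(m, c, value(c))
    c = successor(c)
```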

Uniqueness. We prove it by contradiction. Assume that there are two different representations of $m$ as a sum of binomial coefficients, say

$\begin{aligned} & \sum A=\binom{a_r}{r}+\cdots+\binom{a_2}{2}+\binom{a_1}{1} \\ & \sum B=\binom{b_r}{r}+\cdots+\binom{b_2}{2}+\binom{b_1}{1} \end{aligned}$

Let $k$ be the largest index with $a_k \neq b_k$; it exists because the two representations are different. The terms with index above $k$ are the same in both sums and cancel. Write $\sum A^{\prime}=\sum_{i=1}^k\binom{a_i}{i}$ and $\sum B^{\prime}=\sum_{i=1}^k\binom{b_i}{i}$ for what remains.
Since $A$ and $B$ have equal sums:

$\sum A^{\prime} = \sum B^{\prime}$

Without loss of generality, assume $a_k > b_k$, so that $b_k + 1 \leq a_k$.
Since the $b_i$ are strictly increasing, $b_i \leq b_k-(k-i)$ for $1 \leq i \leq k$. Applying the lemma with $n=b_k$ and with $k$ in place of $r$, we get

$\sum B^\prime \leq \sum_{i=1}^{k}\binom{b_k-(k-i)}{i} = \binom{b_k+1}{k}-1 < \binom{b_k+1}{k}$

So we have

$\sum B^\prime < \binom{b_k+1}{k} \leq \binom{a_k}{k} \leq \sum A^\prime$

As $\sum A^\prime = \sum B^\prime$, we have a contradiction. So the representation of $m$ is unique.
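
As a sanity check on the bijection (not a substitute for the proof), the following sketch enumerates every $r$-element subset of $\{0, \ldots, n-1\}$ for small $n$ and computes its sum of binomial coefficients. If the representation is both existent and unique, the sums should be exactly $0, 1, \ldots, \binom{n}{r}-1$, each hit once. The function name `all_sums` is our own.

```python
from itertools import combinations
from math import comb


def all_sums(n: int, r: int) -> list[int]:
    """Sorted sums of comb(C_i, i) over all C_1 < ... < C_r drawn from range(n)."""
    sums = []
    for subset in combinations(range(n), r):   # tuples come out in increasing order
        sums.append(sum(comb(c, i) for i, c in enumerate(subset, start=1)))
    return sorted(sums)


# Every value from 0 to comb(7, 3) - 1 should appear exactly once.
print(all_sums(7, 3) == list(range(comb(7, 3))))
```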

Conclusion. Existence and uniqueness together show that, for a fixed $r$, the representation is a bijection between the natural numbers and the k-combinations with $k=r$.

Applications

In the CFS signature scheme[3], the combinatorial number system is used to compress the signature. The original signature is a word $z$ of length $n=2^{16}$ bits with weight $w=9$.
Because its Hamming weight is so low, we can expect it to compress well.
There are $\binom{2^{16}}{9} \approx 2^{125.5}$ such words in total, so 126 bits are enough to index the word $z$ instead of storing the whole of $z$.
By the lemma above, we have $\binom{2^{16}}{9} = \sum_{i=0}^9\binom{2^{16}-1-i}{9-i}$, so we can use the positions of the 1’s in $z$ to index $z$.
Let $i_1<\ldots<i_9$ denote the positions of the non-zero bits of $z$. We define the index $I_z$ of $z$ by:

$I_z=1+\binom{i_1}{1}+\binom{i_2}{2}+\ldots+\binom{i_9}{9}$
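
The index formula can be sketched in a few lines of Python. We assume here that bit positions are counted from 0, so that the indices run from 1 to $\binom{2^{16}}{9}$; the text above does not state the convention explicitly. The function name `cfs_index` is our own.

```python
from math import comb


def cfs_index(positions: list[int]) -> int:
    """I_z = 1 + comb(i_1, 1) + ... + comb(i_w, w) for 0-based positions i_1 < ... < i_w."""
    pos = sorted(positions)
    if len(set(pos)) != len(pos) or (pos and pos[0] < 0):
        raise ValueError("positions must be distinct and non-negative")
    return 1 + sum(comb(p, k) for k, p in enumerate(pos, start=1))


n, w = 2**16, 9
lowest = cfs_index(list(range(w)))            # 1's in the first nine positions
highest = cfs_index(list(range(n - w, n)))    # 1's in the last nine positions
print(lowest, highest == comb(n, w), highest.bit_length())
```

Under this assumption the smallest index is 1 and the largest is $\binom{2^{16}}{9}$, which fits in 126 bits.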

The full CFS signature scheme is as follows:

Signature algorithm

hash the document $D$ into $s=h(D)$
compute $s_i=h([\cdots s \cdots \mid \cdot i \cdot])$ for $i=0,1,2 \ldots$
find $i_0$ the smallest value of $i$ such that $s_i$ is decodable
use our trapdoor function to compute $z$ such that $H z^T=s_{i_0}$
compute the index $I_z$ of $z$ in the space of words of weight 9
use $\left[\cdots I_z \cdots \mid \cdot i_0 \cdot\right]$ as a signature for $D$

Verification algorithm

recover $z$ from its index $I_z$
compute $s_1=H z^T$ with the public key $H$
compute $s_2=h\left(\left[\cdots h(D) \cdots \mid \cdot i_0 \cdot\right]\right)$ with the public hash function
compare $s_1$ and $s_2$ : if they are equal the signature is valid
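
The first verification step, recovering $z$ from $I_z$, is the only part of the scheme that involves the combinatorial number system, so that is all we sketch here. The hash function, the matrix $H$ and the trapdoor are left out. We assume 0-based bit positions, and the function name `positions_from_index` and its parameters are our own. A binary search finds the largest admissible position at each step.

```python
from math import comb


def positions_from_index(index: int, w: int = 9, n: int = 2**16) -> list[int]:
    """Return the 0-based positions i_1 < ... < i_w of the 1's in z, given I_z."""
    if not 1 <= index <= comb(n, w):
        raise ValueError("index out of range")
    m = index - 1                  # undo the leading '1 +' in the definition of I_z
    positions = []
    upper = n                      # the next position must be strictly below this
    for k in range(w, 0, -1):
        # Largest p in [k - 1, upper - 1] with comb(p, k) <= m.
        lo, hi = k - 1, upper - 1
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if comb(mid, k) <= m:
                lo = mid
            else:
                hi = mid - 1
        positions.append(lo)
        m -= comb(lo, k)
        upper = lo
    return positions[::-1]


print(positions_from_index(1))     # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```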

Discussion

As we can see, there is a bijection between the index $I_z$ and the word $z$, so 126 bits is the theoretical minimum: no fixed-length encoding of all $\binom{2^{16}}{9}$ words can be shorter.
The easiest way to compress the signature is to store the 9 positions of the 1’s in $z$ directly. Every position takes $\log_2{2^{16}}=16$ bits, so we need $9 \times 16=144$ bits to index $z$.
Why are there $144-126=18$ redundant bits here, and where does the redundancy come from?
I think the redundancy comes from ordering: a list of 9 positions can be written in any of $9!$ orders, for example with a later 1 listed before an earlier one, but all of these orders describe the same word. Since $\log_2 9! \approx 18.5$, this accounts for almost all of the gap.
The few remaining bit patterns are lists in which a position is repeated, and these do not correspond to any word of weight 9.
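
The size of the gap can be checked with a short calculation. The sketch below compares the difference between the naive 144 bits and $\log_2\binom{2^{16}}{9}$ with $\log_2 9!$, the number of bits needed to record one of the $9!$ orderings of nine distinct positions. The function name `redundancy_bits` is our own.

```python
from math import comb, factorial, log2


def redundancy_bits(position_bits: int = 16, w: int = 9) -> tuple[float, float]:
    """Return (naive bits minus optimal bits, log2 of w!)."""
    n = 2 ** position_bits
    naive = w * position_bits          # w positions of position_bits bits each
    optimal = log2(comb(n, w))         # information content of a weight-w word
    return naive - optimal, log2(factorial(w))


gap, ordering = redundancy_bits()
print(f"gap = {gap:.3f} bits, log2(9!) = {ordering:.3f} bits")
```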

References

[1] http://math0.wvstateu.edu/~baker/cs405/code/Combinadics.html
[2] Siddique A B, Farid S, Tahir M. Proof of bijection for combinatorial number system[J]. arXiv preprint arXiv:1601.05794, 2016.
[3] Courtois N T, Finiasz M, Sendrier N. How to achieve a McEliece-based digital signature scheme[C]//Advances in Cryptology—ASIACRYPT 2001: 7th International Conference on the Theory and Application of Cryptology and Information Security Gold Coast, Australia, December 9–13, 2001 Proceedings 7. Springer Berlin Heidelberg, 2001: 157-174.
[4] Bernstein D J, Buchmann J, Dahmen E. Post-Quantum Cryptography. Mathematics and Statistics Springer-11649[R]. ZDB-2-SMA. Springer, Heidelberg, 2009.