The code in the implementation of the EnhancedDoubleHasher is based on the
wikipedia article https://en.wikipedia.org/wiki/Double_hashing.
The code that appears in the article is:
struct key; /// Opaque
/// Use other data types when needed. (Must be unsigned for guaranteed
wrapping.)
extern unsigned int h1(struct key const *), h2(struct key const *);
/// Calculate k hash values from two underlying hash functions
/// h1() and h2() using enhanced double hashing. On return,
/// hashes[i] = h1(x) + i*h2(x) + (i*i*i - i)/6.
/// Takes advantage of automatic wrapping (modular reduction)
/// of unsigned types in C.
void ext_dbl_hash(struct key const *x, unsigned int hashes[], unsigned int n)
{
unsigned int a = h1(x), b = h2(x), i;
for (i = 0; i < n; i++) {
hashes[i] = a;
a += b; // Add quadratic difference to get cubic
b += i; // Add linear difference to get quadratic
// i++ adds constant difference to get linear
}
}
And it is based on code from the paper "Bloom Filters in Probabilistic
Verification" by Peter C.
Dillinger and Panagiotis Manolios
(https://www.khoury.northeastern.edu/~pete/pub/bloom-filters-verification.pdf)
But the code that appears on that paper is:
x, y := a(δ) MOD m, b(δ) MOD m
f[0] := x
for i := 1 .. k-1
x := (x + y) MOD m
y := (y + i) MOD m
f[i] := x
which does the assignments to x, y (a, b in the wikipedia article) in a
different order.
The same algorithm appears the thesis "Adaptive Approximate State Storage"
(http://peterd.org/pcd-diss.pdf)
algorithm " Enhanced DoubleHashing"
input x : "the element being added or queried"
input m : "number of bits in bit vector"
input k : "number of indices to compute"
output g[0 .. k - 1] : "array of indices"
uses h1, h2 : "hash functions"
begin
a := h1(x)
b := h2(x)
g[0] := a
for i := 1 .. k - 1
begin
a := (a + b) MOD m
b := (b + i) MOD m
g[i] := a
end
end
It's easy to see that the computation the enhanced double hash computes
hash[i] = h1(x) + i * h2(x) + (i^3 - i) / 6
if we use a=h1(x) and b=h2(x), generates the sequence:
a, a + b, a + 2 * b + 1, a + 3 * b + 4, a + 4 * b + 10, ...
And both your implementation and the wikipedia article generates
a, a + b, a + 2 * b, a + 3 * b + 1, a + 4 * b + 4, ...
So the last term in the sum lags one step behind.
The solution, I think, is as simple as swapping the assignments of index and
inc in your implementation.
I suppose the impact of this is nil, but I'm not an expert for judging that.
Thanks,
Juan Manuel Gimeno