Re: symmetric toeplitz hashing

David Higgs Sat, 13 Jun 2020 05:47:32 -0700

On Fri, Jun 12, 2020 at 9:41 AM Theo Buehler <[email protected]> wrote:


> I finally found the time to think about the mathematics of this some
> more and I'm now convinced that it's a sound construction. I hope that
> one or the other observation below will be useful for you.
>
> The hash as it is now can be proved to produce values in the full range
> of uint16_t, so that's good.
>
> As we discussed already, you can simplify the construction further.
>

[snip]


> Next, I don't particularly like this magic STOEPLITZ_KEYSEED number, but
> I guess we can live with it.
>
> Another option would be to generate the key seed randomly. You will get
> a "good" hash that spreads out over all 16 bit values if and only if the
> random value has an odd number of binary digits.
>
> I haven't thought hard about this, but I don't immediately see a better
> way of generating such numbers than:
>
> int
> stoeplitz_good_seed(uint16_t seed)
> {
>         int             ones = 0;
>
>         while (seed > 0) {
>                 ones += seed % 2;
>                 seed /= 2;
>         }
>
>         return ones % 2;
> }
>
> uint16_t
> stoeplitz_seed_init(void)
> {
>         uint16_t        seed;
>
>         do {
>                 seed = arc4random() & UINT16_MAX;
>         } while (!stoeplitz_good_seed(seed));
>
>         return seed;
> }
>
> This will loop as long as it needs to get a good toeplitz key seed.
> Each time there is a 50% chance that it will find one, so this will
> need to loop n times with probability 1 / 2**n. This is basically the
> same situation as for arc4random_uniform() with an upper bound that is
> not a power of two.
>

I neither read nor understand the math but assuming you've described it
correctly, you can do this more simply by realizing that a random bad seed
can be made into a good seed by toggling one (random) bit.  You also
replace stoeplitz_good_seed() with __builtin_parity() and perhaps leverage
some intrinsics?

uint16_t
stoeplitz_seed_init(void) {
    uint16_t seed;
    seed = arc4random() & UINT16_MAX;
    if (!stoeplitz_good_seed(seed))
        seed ^= 1 << (arc4random() % 16);
}

--david

Re: symmetric toeplitz hashing

Reply via email to