I finally found the time to think about the mathematics of this some
more and I'm now convinced that it's a sound construction. I hope that
one or the other observation below will be useful for you.

The hash as it is now can be proved to produce values in the full range
of uint16_t, so that's good.

As we discussed already, you can simplify the construction further.

One trick is that stoeplitz_cache_entry() is linear in the second
argument -- you can think of the xor operation ^ as addition of vectors
(uint8_t and uint16_t) over the field with two elements {0, 1}.  That's
just the mathematician's way of expressing this relation:

        stoeplitz_cache_entry(scache, a ^ b)
            == stoeplitz_cache_entry(scache, a) ^ stoeplitz_cache_entry(scache, 
b);

Using this, the stoeplitz hash functions can be rewritten to this.
I would expect it to be a bit cheaper.

uint16_t
stoeplitz_hash_ip4(const struct stoeplitz_cache *scache,
    in_addr_t faddr, in_addr_t laddr)
{
        uint16_t lo, hi;

        lo  = faddr >> 0;
        lo ^= faddr >> 16;
        lo ^= laddr >> 0;
        lo ^= laddr >> 16;

        hi  = faddr >> 8;
        hi ^= faddr >> 24;
        hi ^= laddr >> 8;
        hi ^= laddr >> 24;

        return swap16(stoeplitz_cache_entry(scache, lo))
            ^ stoeplitz_cache_entry(scache, hi);
}

or another example:

uint16_t
stoeplitz_hash_ip6port(const struct stoeplitz_cache *scache,
    const struct in6_addr *faddr6, const struct in6_addr * laddr6,
    in_port_t fport, in_port_t lport)
{
        uint16_t hi = 0, lo = 0;
        size_t i;

        for (i = 0; i < nitems(faddr6->s6_addr32); i++) {
                uint32_t faddr = faddr6->s6_addr32[i];
                uint32_t laddr = laddr6->s6_addr32[i];

                lo ^= faddr >> 0;
                lo ^= faddr >> 16;
                lo ^= laddr >> 0;
                lo ^= laddr >> 16;

                hi ^= faddr >> 8;
                hi ^= faddr >> 24;
                hi ^= laddr >> 8;
                hi ^= laddr >> 24;
        }

        lo ^= fport >> 0;
        lo ^= lport >> 0;

        hi ^= fport >> 8;
        hi ^= lport >> 8;

        return swap16(stoeplitz_cache_entry(scache, lo))
            ^ stoeplitz_cache_entry(scache, hi);
}

and so on.


Next, I don't particularly like this magic STOEPLITZ_KEYSEED number, but
I guess we can live with it.

Another option would be to generate the key seed randomly. You will get
a "good" hash that spreads out over all 16 bit values if and only if the
random value has an odd number of binary digits.

I haven't thought hard about this, but I don't immediately see a better
way of generating such numbers than:

int
stoeplitz_good_seed(uint16_t seed)
{
        int             ones = 0;

        while (seed > 0) {
                ones += seed % 2;
                seed /= 2;
        }

        return ones % 2;
}

uint16_t
stoeplitz_seed_init(void)
{
        uint16_t        seed;

        do {
                seed = arc4random() & UINT16_MAX;
        } while (!stoeplitz_good_seed(seed));

        return seed;
}

This will loop as long as it needs to get a good toeplitz key seed.
Each time there is a 50% chance that it will find one, so this will
need to loop n times with probability 1 / 2**n. This is basically the
same situation as for arc4random_uniform() with an upper bound that is
not a power of two.

I don't know if something like that is acceptable early in init_main.c.
If it is, you can use a seed generated by this init function in place
of STOEPLITZ_KEYSEED.


Finally, I don't think it would be all bad to eliminate the cache
altogether and simply do this (and similarly for all the other
stoeplitz_hash_*() variants). This would be another way of eliminating
the magic number:

uint16_t
simple_hash_ip4(const struct stoeplitz_cache *scache,
    in_addr_t faddr, in_addr_t laddr)
{
        uint16_t lo, hi;

        lo  = faddr >> 0;
        lo ^= faddr >> 16;
        lo ^= laddr >> 0;
        lo ^= laddr >> 16;

        hi  = faddr >> 8;
        hi ^= faddr >> 24;
        hi ^= laddr >> 8;
        hi ^= laddr >> 24;

        return (swap16(lo) ^ hi);
}

I don't see how this is fundamentally different from sephe's
construction although it's not quite a special case.

If there's a problem with this, I'd be really curious to know why.

Reply via email to