Hello,

  Apologies for the delay in responding. The trouble has largely been with
reproducing our results using the test harness. I made a couple of changes
to the test harness you provided to report standard deviation and
variance; the results are at [1]. The sheet titles hopefully make sense,
let me know if they don't.
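
For context, the numbers I added are just the variance and standard
deviation over the per-bucket key counts. A minimal sketch of the
calculation (not the actual harness code, and the real changes differ in
detail) looks like this:

    #include <math.h>
    #include <stdio.h>

    /* Given the number of keys hashed into each of <nbuckets> buckets,
     * compute the variance and standard deviation of the per-bucket
     * counts. This is the metric reported in the sheets at [1].
     */
    static void bucket_stats(const unsigned int *counts, int nbuckets)
    {
        double mean = 0.0, var = 0.0;
        int i;

        for (i = 0; i < nbuckets; i++)
            mean += counts[i];
        mean /= nbuckets;

        for (i = 0; i < nbuckets; i++)
            var += ((double)counts[i] - mean) * ((double)counts[i] - mean);
        var /= nbuckets;

        printf("mean=%.2f variance=%.2f stddev=%.2f\n", mean, var, sqrt(var));
    }

    int main(void)
    {
        /* hypothetical per-bucket counts, for illustration only */
        unsigned int counts[4] = { 25, 25, 24, 26 };

        bucket_stats(counts, 4);
        return 0;
    }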

It took a while to find the data we based our decision on; the last sheet
contains it. The std dev for SDBM was 47 and for DJB2 was 30, and this
difference correlates with the varnish-to-application-backend connection
counts shown at [2]. I realize this is a second-order metric, but it makes
sense: the smaller standard deviation reflects a smoother distribution of
load from haproxy to varnish, which in turn means the connection counts
from the varnish pool to the application web nodes converge.

I spent some time tracking down how that data was obtained, and it turns
out it was gathered by performing HTTP requests. I ran this data through
the test harness [3] and immediately noticed the difference in the std
dev results.

As a next step I will use the same methodology as before (performing GET
requests) to determine the efficacy of the algorithms. However, are there
substantial differences between the test harness and the code in haproxy
that would in any way explain this discrepancy? Any other thoughts?
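
For reference, the two per-byte updates I am comparing, as I understand
them from the patches quoted below, are roughly the following (a sketch
only, not the exact haproxy code):

    #include <stdio.h>
    #include <string.h>

    /* Sketch of the per-byte hash updates over the Host header value.
     * <p> points at the header value, <len> is its length.
     */
    unsigned long hash_sdbm(const char *p, int len)
    {
        unsigned long hash = 0;

        while (len--)
            hash = *p++ + (hash << 6) + (hash << 16) - hash;
        return hash;
    }

    unsigned long hash_djb2(const char *p, int len)
    {
        unsigned long hash = 5381;

        while (len--)
            hash = *p++ + (hash << 5) + hash;
        return hash;
    }

    int main(void)
    {
        const char *host = "example.com";   /* hypothetical Host header value */
        int nbsrv = 50;                     /* hypothetical bucket count */

        printf("sdbm: %lu  djb2: %lu\n",
               hash_sdbm(host, (int)strlen(host)) % nbsrv,
               hash_djb2(host, (int)strlen(host)) % nbsrv);
        return 0;
    }

In our config the result then feeds the consistent-hashing selection
rather than a simple modulo, and 1.5 can also apply the avalanche step you
mention, which may be exactly the kind of difference between the harness
and haproxy that I am asking about.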

[1] http://tinyurl.com/obystx7
[2]
https://www.evernote.com/shard/s311/sh/2ad29dee-c66e-4775-9e43-572170d683bf/ae3cf3e4766dcfd62cca88a3563a015f
[3] https://gist.github.com/maddalab/7127644

Thanks
Bhaskar


On Sun, Oct 20, 2013 at 3:56 PM, Willy Tarreau <w...@1wt.eu> wrote:

> Hello Bhaskar,
>
> On Sun, Oct 20, 2013 at 11:25:31AM -0400, Bhaskar Maddala wrote:
> > Hello,
> >
> >    I did mean djb2, typo on my end, and it is the same as hash_djbx33.
>
> OK thanks for confirming.
>
> > I will take a look at 1.5 later today, and try to test it out
> >
> > Here [1] is a short page on the 2 functions.
>
> Ah OK, yes indeed the current hash function is the same as sdbm. I
> wasn't aware of this. I remember we did some experiments on hashes
> with Aleks many years ago and I think the current function appeared
> good enough in these tests.
>
> > The config we have in
> > production uses consistent hashing on the host header; the function
> > "get_server_hh" is invoked. We have patches that update backend.c [2] from
> >
> > hash = *p + (hash << 6) + (hash << 16) - hash;
> >
> > to
> >
> > hash = (px->lbprm.func & BE_LB_HASH_DJB2) ? 5381 : 0;
> > hash = *p + (hash << 5) + hash;
> >
> >
> > (the equivalent changes were required) when the "hash-func" option is
> > specified. The choice to use the host header was an application design
> > decision.
>
> This is a common choice, typically if you put a cache such as varnish
> behind haproxy.
>
> I ran a few tests on an old program I had to blow the dust off; we used
> it 6 years ago for some hash tests, and I've put it here :
>
>     http://1wt.eu/test-hash/hashcount.c
>
> It supports the following hashes in argv[1] :
>     0  test_hash
>     1  wt_hash
>     2  wt_hash5
>     3  wt_hash6
>     4  hash_djbx33
>     5  haproxy_uri_hash (=sdbm)
>
> The second argument is the number of buckets and if a third one is passed,
> a final avalanche is applied after the hash.
>
> I fed it with the top 1000 and top 10000 domains from alexa.com's top-1m :
>
>     http://www.alexa.com/topsites
>
> It appears that haproxy_uri_hash and wt_hash5 produce the smoothest
> distribution with a cleaner and longer gaussian bell.
>
> But in our case, we're not looking for the cleanest distribution in
> practice. What we want is to shorten the distance between the most
> commonly used and the least commonly used server. And it appears in
> tests that when hashing 1000 domains to 50 servers, djb33 provides a
> minimum usage of 14 for a max of 28 when the current algo does (12,28)
> and wt_hash even worse (8,32).
>
> Applying the avalanche on the hash result further increases the
> differences between the most and the least loaded servers in my
> tests on djb33 and sdbm but improves the results with the others :
>
>                    normal                  avalanche
>               min   max  ratio          min   max   ratio
>   test_hash     8    32   0.25 !!        10    28    0.36
>     wt_hash     8    32   0.25 !!         9    30    0.30
>    wt_hash5     7    27   0.26 !!        12    29    0.41
>    wt_hash6    14    27   0.52 <-        14    27    0.52 <-
>       djb33    14    28   0.50            8    33    0.24 !!
>        sdbm    12    28   0.43           11    33    0.33
>
> However, on a small number of servers (4), the results are the
> opposite. Hashing 100 names to 4 servers gives a high discrepancy with
> djb33 and the best results with wt_hash5/6, both with and without avalanche :
>
>                    normal                  avalanche
>               min   max  ratio          min   max   ratio
>   test_hash    20    30   0.67           19    31    0.61
>     wt_hash    14    31   0.45           20    36    0.56
>    wt_hash5    20    29   0.69           21    28    0.75
>    wt_hash6    24    26   0.92 <-        23    28    0.82 <-
>       djb33    17    42   0.40 !!        19    35    0.54
>        sdbm    16    33   0.48           22    30    0.73
>
> So first, it seems important not to apply the avalanche on top of djb33.
> Second, on a small number of servers, sdbm+avalanche will be much better
> than sdbm alone (so 1.5 is smoother than 1.4). However on a large number
> of servers, djb33 alone seems better than sdbm alone or avalanched. It's
> interesting to note that wt_hash6 outperforms all the other ones in both
> small and large sets, with avalanche and normal modes, which makes me
> happy :-)
>
> I'm now wondering whether it would make sense to add both djb33 and wt_hash6
> as options. We would also remove "avalanche" from hash-type and move it to
> an option of the hash algo (it's not supported in 1.4 anyway). In the tests,
> I also noticed that djb33 is very bad on server counts that are multiples
> of 33, so this must be considered carefully.
>
> So maybe it can make sense to add a new "hash-algo" directive with a few
> values such as "djb2", "sdbm", "wt6" and the same variants with
> "+avalanche".
>
> > The
> > choice of using DJB2 came about from testing load distribution on the
> > backends. A little dated description is available at [3]. The number of
> > haproxy, varnish and web nodes has grown substantially. I say this to
> > make clear that this might be specific to larger deployments.
>
> It's always when the number of servers increases that you face poor hash
> distribution, so that's not surprising.
>
> Best regards,
> Willy
>
>
