Re: Compatibility issue on consistent-hash between pre-3.0 and 3.0+

Anthony Deschamps Sat, 06 Sep 2025 20:34:29 -0700

Hello,

First off, my apologies for the bug. This change was my contribution,
and I hope the regression didn't cause serious issues for anyone.


Personally, I don't have any deployments that would be negatively
affected by simply applying the fix. Of course, it may cause a one-time
burst of cache misses for some users. I can't speak for every system, but
my general inclination is that it's better to take one-time mitigating
actions if necessary, such as adding capacity or deploying during a
low-traffic period, if that keeps the ongoing maintenance and complexity
to a minimum.

Regards,
Anthony

On Fri, Sep 5, 2025 at 10:47 AM Willy Tarreau <[email protected]> wrote:

> Hi all,
>
> I've got a report of consistent hash delivering different hashes since 3.0
> with commit faa8c3e02 ("MEDIUM: lb-chash: Deterministic node hashes based on
> server address").
>
> The cause is a mistake in the ID-based key calculation (the hash is applied
> twice and the ID range scaling was dropped). The fix is trivial:
>
>   --- a/src/lb_chash.c
>   +++ b/src/lb_chash.c
>   @@ -123,7 +123,7 @@ static inline u32 chash_compute_server_key(struct server *s)
>
>           case SRV_HASH_KEY_ID:
>           default:
>   -               key = full_hash(s->puid);
>   +               key = s->puid * SRV_EWGHT_RANGE;
>                   break;
>           }
>
> but I'm having a problem now: anyone who deployed haproxy with consistent
> hashing before 3.0 notices the problem (much higher miss rate on caches)
> and would want the fix to be applied, but those having enabled it first
> in 3.0+ on hash-key id don't know that something broke, and will be
> surprised by the fix which will change everything for them.
>
> To be honest, I really doubt that anyone just started to use consistent
> hash recently with 3.0 or 3.2 using server IDs while addresses are
> available and more robust. So I'm tempted to apply the fix in order to
> fix the situation for all those who are progressively upgrading their
> fleet from pre-3.0 to 3.0+.
>
> Another possibility would be to add a 4th hash-key setting to support
> pre-3.0 compatibility, but that would remain a mess for those upgrading
> anyway.
>
> Hence my question to our users: did anyone just start to use consistent
> hashing recently with the default hash-key (id), and would rightfully
> want to have a way to keep their keys distributed like this (i.e. with
> an incompatible algo), in which case we'd need to add a new setting to
> support this? Or should we consider that a regression is a regression
> and should be fixed?
>
> Thanks for sharing your insights,
> Willy
>
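
For readers following the thread, here is a minimal C sketch of why the
two server-key derivations in the diff above put the same server at
different positions on the hash ring. It is an illustration under stated
assumptions, not HAProxy's code: demo_hash() is a hypothetical stand-in
for full_hash(), DEMO_EWGHT_RANGE is an assumed value for
SRV_EWGHT_RANGE, and node_key() assumes that each ring node is placed by
hashing the server key plus a node index.

```c
#include <stdint.h>

/* Assumed stand-in for SRV_EWGHT_RANGE; the real value is defined in
 * HAProxy's headers and may differ. */
#define DEMO_EWGHT_RANGE 4096u

/* Hypothetical 32-bit integer mixer. This is NOT HAProxy's full_hash();
 * any well-mixing bijective hash illustrates the same point. */
static uint32_t demo_hash(uint32_t a)
{
    a ^= a >> 16;
    a *= 0x7feb352dU;
    a ^= a >> 15;
    a *= 0x846ca68bU;
    a ^= a >> 16;
    return a;
}

/* Server key as on the removed line of the diff: the ID is hashed. */
static uint32_t server_key_hashed(uint32_t puid)
{
    return demo_hash(puid);
}

/* Server key as on the added line of the diff: the ID is only scaled,
 * so consecutive IDs get keys DEMO_EWGHT_RANGE apart. */
static uint32_t server_key_scaled(uint32_t puid)
{
    return puid * DEMO_EWGHT_RANGE;
}

/* Assumption for this sketch: a ring node's position is the hash of the
 * server key plus the node index. With server_key_hashed() the ID is
 * therefore hashed twice; with server_key_scaled() it is hashed once. */
static uint32_t node_key(uint32_t server_key, uint32_t node_index)
{
    return demo_hash(server_key + node_index);
}
```

Under these assumptions, node_key(server_key_hashed(id), i) and
node_key(server_key_scaled(id), i) differ for the same id and i, so a
fleet mixing both derivations would send the same request key to
different servers. That is consistent with the higher cache miss rate
described in the report.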
