On Fri, Feb 16, 2024 at 05:03:56PM -0500, Anthony Deschamps wrote:
> From a031cf97da759eb2c2f9b6e191065ad503f821ed Mon Sep 17 00:00:00 2001
> From: Anthony Deschamps <anthony.j.descha...@gmail.com>
> Date: Fri, 16 Feb 2024 16:00:35 -0500
> Subject: [PATCH] MEDIUM: lb-chash: Deterministic node hashes based on server
>  address
> 
> Motivation: When services are discovered through DNS resolution, the order in
> which DNS records get resolved and assigned to servers is arbitrary. Therefore,
> even though two HAProxy instances using chash balancing might agree that a
> particular request should go to server3, it is likely the case that they have
> assigned different IPs and ports to the server in that slot.
>
> By deriving the keys for the chash tree nodes from a server's address and port
> we ensure that independent HAProxy instances will agree on routing decisions. If
> an address is not known then the key is derived from the server's puid as it was
> previously.
>
> When adjusting a server's weight, we now unconditionally remove all nodes first,
> since the server's address (and therefore the desired node keys) may have
> changed.

That's a good idea. It corresponds more or less to what is already
supported for stick-tables, where we can decide to synchronize the servers'
addresses instead of their IDs.

However, here you have replaced the existing method instead of offering
an option to select the new one. That's not correct, as it will break
a number of setups, for example those with multiple DCs and replicated
servers, those with dual-attached servers (each connected to one LB),
or those who want to have all ports going to the same server, etc.

I think that your use case would be desirable for a majority of users,
so that's definitely something we should do, but it has to be configurable.

Since here the hash key is a property of the server, it would make sense
to add a "hash-key" argument to the server and support "id" (the default),
"addr" (IP only, useful when learned over DNS), "addr-port" for even more
mixing, and maybe in the future even "fqdn" or "name" depending on what
users ask for.
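
To make the proposal a bit more concrete, here is a rough sketch in Python
of how the key could be selected per server. This is not HAProxy code: the
Server class, the server_key() function and the use of SHA-256 truncated to
32 bits are all assumptions made for illustration, and only the "id", "addr"
and "addr-port" modes suggested above are covered. The fallback to the ID
when no address is known follows what the patch description says.

```python
import hashlib
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    """Hypothetical stand-in for a server entry; not HAProxy's struct."""
    puid: int
    addr: Optional[str] = None   # None means "address not known yet"
    port: int = 0
    hash_key: str = "id"         # "id" (default), "addr" or "addr-port"


def _h32(data: bytes) -> int:
    # Illustrative 32-bit hash; the real hash function would differ.
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def server_key(srv: Server) -> int:
    """Derive the base hash key for a server according to its hash_key mode."""
    # Fall back to the ID whenever the address is unknown.
    if srv.hash_key == "id" or srv.addr is None:
        return _h32(b"id:" + str(srv.puid).encode())
    packed = ipaddress.ip_address(srv.addr).packed
    if srv.hash_key == "addr":
        # IP only: every port on the same address maps to the same key.
        return _h32(b"addr:" + packed)
    if srv.hash_key == "addr-port":
        # IP and port: different ports on one address get different keys.
        return _h32(b"addr-port:" + packed + srv.port.to_bytes(2, "big"))
    raise ValueError(f"unsupported hash-key mode: {srv.hash_key!r}")
```

With "addr", two servers that have different IDs but the same IP produce the
same key regardless of port, while "id" keeps today's behaviour where the
address plays no role.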

Another point to take care of is what happens when a server's address is
changed (DNS, CLI). Normally you'll need to unhash and rehash the server's
nodes at that point. In the current version the tree would keep keys derived
from the old address if the server doesn't first go down then up (this can
happen for example if health checks are too far apart and give enough time
for the server's address to be reassigned).
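
The unhash / rehash step can be sketched as follows, again in Python and
again purely as an illustration. The Ring class, its method names, the
16 nodes per server and the 64-bit truncated SHA-256 are all assumptions,
and the lookup simply takes the next node clockwise, which is a
simplification of what a real consistent-hash tree does.

```python
import bisect
import hashlib


class Ring:
    """Toy consistent-hash ring keyed on server address (illustrative only)."""

    def __init__(self, nodes_per_server: int = 16):
        self.nodes_per_server = nodes_per_server
        self._keys = []        # sorted node keys
        self._owner = {}       # node key -> server name
        self._by_server = {}   # server name -> node keys it owns
        self._addr = {}        # server name -> (addr, port)

    @staticmethod
    def _h64(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def remove(self, name: str) -> None:
        """Unhash: drop every node currently owned by this server."""
        for key in self._by_server.pop(name, []):
            del self._owner[key]
            del self._keys[bisect.bisect_left(self._keys, key)]
        self._addr.pop(name, None)

    def add(self, name: str, addr: str, port: int) -> None:
        """(Re)hash: insert nodes whose keys derive from addr and port."""
        self.remove(name)  # never leave nodes from a previous address
        owned = []
        for i in range(self.nodes_per_server):
            key = self._h64(f"{addr}:{port}#{i}".encode())
            if key in self._owner:
                continue  # extremely unlikely collision: skip the node
            bisect.insort(self._keys, key)
            self._owner[key] = name
            owned.append(key)
        self._by_server[name] = owned
        self._addr[name] = (addr, port)

    def set_address(self, name: str, addr: str, port: int) -> None:
        """Address changed (DNS, CLI): unhash the old nodes, hash new ones."""
        self.add(name, addr, port)

    def address_of(self, name: str):
        return self._addr[name]

    def lookup(self, request_key: bytes):
        """Return the server owning the first node at or after the key."""
        if not self._keys:
            return None
        idx = bisect.bisect_left(self._keys, self._h64(request_key))
        return self._owner[self._keys[idx % len(self._keys)]]
```

Two rings that assign the same addresses to different server slots route a
given key to the same address, and calling set_address() leaves the ring in
the same state as one built from scratch with the new address, so no stale
nodes survive even if the server never went down.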

Thanks!
Willy
