Greetings,
I've been chasing an issue similar to that described in
https://github.com/haproxy/haproxy/issues/51. I've had a look through the
source and found that the server `init-state` pondered therein has not yet been
implemented. I'm wondering if:
1. there is updated guidance on how to mitigate or work around backend servers
being marked UP before their health checks have passed?
2. the server `init-state` feature described in the above issue still aligns with
the goals of the HAProxy project? Would this feature still be a viable addition?
Cheers,
Aaron
PS: Some - perhaps familiar - details from my specific experiments for context:
I have 2 redis nodes (one primary, one replicating from the primary), 3
sentinels handling failover, and one HAProxy node (version 2.9.7-5742051,
2024/04/05) which exposes the primary to clients. The relevant HAProxy config:
haproxy.cfg (snippet)
```
...
resolvers k8s
    parse-resolv-conf
    hold other 10s
    hold refused 10s
    hold nx 10
    hold timeout 10s
    hold valid 10s
    hold obsolete 10s

frontend redis-master
    bind *:6379
    default_backend redis-master

backend redis-master
    mode tcp
    balance first
    option tcp-check
    tcp-check send role\r\n
    tcp-check expect string master
    server-template redis 2 _redis._tcp.redis.sandbox.svc.cluster.local:6379 check inter 1s resolvers k8s init-addr none
```
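For anyone who wants to reproduce the check by hand, here is a minimal Python sketch of what I understand the `tcp-check` rules above to do: send an inline `role` command and look for the substring `master` in the reply. The function names and the 1 second timeout are my own choices for illustration, not anything taken from HAProxy.

```python
import socket


def looks_like_primary(reply: bytes) -> bool:
    """Approximate `tcp-check expect string master` with a substring match."""
    return b"master" in reply


def check_role(host: str, port: int = 6379, timeout: float = 1.0) -> bool:
    """Send an inline ROLE command and report whether the reply mentions 'master'.

    Hypothetical helper: raises socket.timeout / OSError if the node
    does not answer in time or cannot be reached.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"role\r\n")
        reply = sock.recv(4096)
    return looks_like_primary(reply)
```

Calling `check_role("10.42.2.13")` against the replica should return `False`, since its ROLE reply starts with `slave` rather than `master`.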
When the HAProxy node (re)starts, a redis failover occurs, or the replica node
restarts, there's a brief window where HAProxy forwards client requests to the
read-only replica node. The following logs show that the replica node
`redis-master/redis1` is marked UP/READY about 2 seconds before the health
checks detect that it is a replica and place it in a DOWN state.
HAProxy logs:
```
2024-08-27T18:56:44.394842468Z [NOTICE] (1) : New worker (8) forked
2024-08-27T18:56:44.394868093Z [NOTICE] (1) : Loading success.
2024-08-27T18:57:38.505780382Z [WARNING] (8) : redis-master/redis1 changed its IP from (none) to 10.42.2.13 by DNS additional record.
2024-08-27T18:57:38.505818007Z [WARNING] (8) : redis-master/redis2 changed its IP from (none) to 10.42.1.46 by DNS additional record.
2024-08-27T18:57:39.507633591Z [WARNING] (8) : Server redis-master/redis1 ('10-42-2-13.redis-instance-1.redis-primary.svc.cluster.local') is UP/READY (resolves again).
2024-08-27T18:57:39.507672132Z [WARNING] (8) : Server redis-master/redis2 ('10-42-1-46.redis-instance-1.redis-primary.svc.cluster.local') is UP/READY (resolves again).
2024-08-27T18:57:41.401342425Z [WARNING] (8) : Server redis-master/redis1 is DOWN, reason: Layer7 timeout, info: " at step 2 of tcp-check (expect string 'master')", check duration: 1006ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
```
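To put a number on that window, this small Python sketch subtracts the UP/READY timestamp of `redis-master/redis1` from its DOWN timestamp in the logs above. The helper name `parse_ts` is mine; the timestamps are copied from the log lines.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    # The log timestamps carry nanoseconds; strptime's %f accepts at most
    # 6 fractional digits, so trim to microseconds before parsing.
    head, frac = ts.rstrip("Z").split(".")
    return datetime.strptime(f"{head}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


marked_up = parse_ts("2024-08-27T18:57:39.507633591Z")    # redis1 is UP/READY
marked_down = parse_ts("2024-08-27T18:57:41.401342425Z")  # redis1 is DOWN
window = (marked_down - marked_up).total_seconds()
print(f"replica exposed for {window:.3f}s")  # prints "replica exposed for 1.894s"
```

So in this run the replica was eligible for traffic for just under 1.9 seconds.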
While traffic is forwarded to the replica node, any client writes will result
in the following error:
```
READONLY You can't write against a read only replica.
```