RE: HA failover: Nothing we try reduces client recovery below one minute

2024-02-21 Thread Vilius Šumskas
Hi, are you able to reproduce your issue without the Kubernetes layer on the Artemis instances? I’m not sure exactly how you kill a pod, but a 40-second timeout looks very much like the default pod termination grace period (30 seconds) + 10 seconds. -- Vilius From: John Lilley Sent: Wednesday, February 21, 2024 8:45 PM To:

Re: HA failover: Nothing we try reduces client recovery below one minute

2024-02-21 Thread Karl Weller
Since your parameters didn't change the behavior, you could try tuning TCP settings (JDK and OS). I find the TCP stacks across different OSs behave differently. JDK example: -Dsun.net.client.defaultReadTimeout=2 -Dsun.net.client.defaultConnectTimeout=1 OS example (Windows): HKEY_LOC

Re: HA failover: Nothing we try reduces client recovery below one minute

2024-02-21 Thread Justin Bertram
I would recommend a couple of things to narrow down the issue... First, have a script or something on the client gather thread dumps at regular intervals (e.g. every 5 seconds) starting right before you kill the pod and continuing until the client fails over and recovers. Second, turn on TRACE lo
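For the first suggestion (thread dumps at regular intervals), here is a minimal sketch of what such a script might look like. It assumes the JDK's `jstack` tool is on the PATH wherever the client JVM runs and that you already know the client's process ID. The function name `collect_thread_dumps`, its parameters, and the output file naming are hypothetical, chosen only for illustration.

```python
import subprocess
import time
from pathlib import Path


def collect_thread_dumps(pid, out_dir, interval_s=5.0, count=24,
                         dump_cmd=("jstack",), sleep=time.sleep):
    """Run `<dump_cmd> <pid>` `count` times, `interval_s` apart.

    Each dump is written to its own timestamped file in `out_dir`.
    `dump_cmd` defaults to the JDK's jstack (assumed to be installed);
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(count):
        stamp = time.strftime("%Y%m%dT%H%M%S")
        result = subprocess.run([*dump_cmd, str(pid)],
                                capture_output=True, text=True, check=False)
        path = out / f"dump-{i:03d}-{stamp}.txt"
        # Keep stderr as well, so a failed attach is visible in the file.
        path.write_text(result.stdout + result.stderr)
        written.append(path)
        if i < count - 1:
            sleep(interval_s)
    return written
```

Start it right before killing the pod, for example `collect_thread_dumps(12345, "dumps")` where 12345 stands in for the client JVM's PID. With the defaults it takes 24 dumps five seconds apart, covering roughly two minutes; adjust `count` so the run spans the whole failover and recovery.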

HA failover: Nothing we try reduces client recovery below one minute

2024-02-21 Thread John Lilley
Greetings! We are having a devil of a time trying to reduce delay during a failover event. We’ve set our URL to (tcp://dm-activemq-live-svc:61616,tcp://dm-activemq-backup-svc:61617)?ha=true&reconnectAttempts=200&initialConnectAttempts=200&clientFailureCheckPeriod=1&connectionTTL=1&callTi