[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946909#comment-17946909
 ] 

Chuong Tran commented on ZOOKEEPER-4921:
----------------------------------------

With 3.9.2, it will retry really hard until we connect to the VPN:
{
"time" : "2025-04-23T19:42:23.638-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "WARN",
"myid" : "XXX:2181",
"message" : "Session 0x100707c2dc26281 for server XXX:2181, Closing socket 
connection. Attempting reconnect except it is a SessionExpiredException.",
"stackTrace" : "j.l.IllegalArgumentException: Unable to canonicalize address 
XXX:2181 because it's not resolvable
at o.a.z.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:78)
at o.a.z.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41)
at o.a.z.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1142)
at o.a.z.ClientCnxn$SendThread.run(ClientCnxn.java:1192)
"
}
{
"time" : "2025-04-23T19:42:25.506-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "Opening socket connection to server XXX/YYY:2181."
}
{
"time" : "2025-04-23T19:42:25.507-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "SASL config status: Will not attempt to authenticate using SASL 
(unknown error)"
}
{
"time" : "2025-04-23T19:42:25.519-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "Socket connection established, initiating session, client: 
/10.4.9.178:61916, server: XXX/YYY:2181"
}
{
"time" : "2025-04-23T19:42:25.536-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "WARN",
"myid" : "XXX:2181",
"message" : "Unable to reconnect to ZooKeeper service, session 
0x100707c2dc26281 has expired"
}
{
"time" : "2025-04-23T19:42:25.537-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "WARN",
"myid" : "XXX:2181",
"message" : "Session 0x100707c2dc26281 for server XXX/YYY:2181, Closing socket 
connection. Attempting reconnect except it is a SessionExpiredException.",
"stackTrace" : "o.a.z.ClientCnxn$SessionExpiredException: Unable to reconnect 
to ZooKeeper service, session 0x100707c2dc26281 has expired
at o.a.z.ClientCnxn$SendThread.onConnected(ClientCnxn.java:1418)
at o.a.z.ClientCnxnSocket.readConnectResult(ClientCnxnSocket.java:148)
at o.a.z.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:86)
at o.a.z.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at o.a.z.ClientCnxn$SendThread.run(ClientCnxn.java:1274)
"
}
{
"time" : "2025-04-23T19:42:25.542-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-EventThread",
"level" : "INFO",
"message" : "EventThread shut down for session: 0x100707c2dc26281"
}
{
"time" : "2025-04-23T19:42:25.543-07:00",
"logger_name" : "org.apache.zookeeper.ZooKeeper",
"thread_name" : "ZookeeperClient-watcherProcessor",
"level" : "INFO",
"message" : "Initiating client connection, connectString=XXX:2181 
sessionTimeout=4000 
watcher=com.twitter.finagle.common.zookeeper.ZooKeeperClient$3@38b9cd1e"
}
{
"time" : "2025-04-23T19:42:25.544-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxnSocket",
"thread_name" : "ZookeeperClient-watcherProcessor",
"level" : "INFO",
"message" : "jute.maxbuffer value is 1048575 Bytes"
}
{
"time" : "2025-04-23T19:42:25.545-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor",
"level" : "INFO",
"message" : "zookeeper.request.timeout value is 0. feature enabled=false"
}
{
"time" : "2025-04-23T19:42:25.546-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "Opening socket connection to server XXX/YYY:2181."
}
{
"time" : "2025-04-23T19:42:25.547-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "SASL config status: Will not attempt to authenticate using SASL 
(unknown error)"
}
Apr 23, 2025 7:42:25 PM com.twitter.finagle.common.zookeeper.ZooKeeperClient$3 
process
INFO: Zookeeper session expired. Event: WatchedEvent state:Expired type:None 
path:null zxid: -1
{
"time" : "2025-04-23T19:42:25.557-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "Socket connection established, initiating session, client: 
/10.4.9.178:61917, server: XXX/YYY:2181"
}
{
"time" : "2025-04-23T19:42:25.571-07:00",
"logger_name" : "org.apache.zookeeper.ClientCnxn",
"thread_name" : "ZookeeperClient-watcherProcessor-SendThread(XXX:2181)",
"level" : "INFO",
"myid" : "XXX:2181",
"message" : "Session establishment complete on server XXX/YYY:2181, session id 
= 0x100707c2dc26286, negotiated timeout = 4000"
}
Apr 23, 2025 7:42:25 PM 
com.twitter.finagle.common.zookeeper.Group$ActiveMembership join
INFO: Membership path is : 
/env-q/finagle/service/example-server/thrift/member_0000000003
Apr 23, 2025 7:42:25 PM 
com.twitter.finagle.common.zookeeper.Group$ActiveMembership join
INFO: Set group member ID to member_0000000003

> Zookeeper Client 3.9.3 Fails to Reconnect After Network Failures
> ----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4921
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4921
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.9.3
>            Reporter: Chuong Tran
>            Priority: Critical
>
> After upgrading the Java Zookeeper client to version 3.9.3, we observed that 
> it is not resilient to brief network disruptions, such as a short VPN blip. 
> In such cases, the client attempts to reconnect only once, and if 
> unsuccessful, the session expires.
> {quote}Apr 23, 2025 10:19:23 AM 
> com.twitter.finagle.common.zookeeper.ZooKeeperClient$3 process
> INFO: Zookeeper session expired. Event: WatchedEvent state:Expired type:None 
> path:null zxid: -1
> {quote}
> In contrast, the previous version (3.9.2) would continuously retry until the 
> network connection was restored, maintaining the session more reliably.
> I believe it's a new issue with this change: 
> https://issues.apache.org/jira/browse/ZOOKEEPER-4508
>  
> Step to repro:
>  # Open VPN.
>  # Start the application which connects to the Zookeeper server with the VPN.
>  # Disable VPN for a couple of minutes.
>  # Observe the application.
>  # Enable the VPN again.
> {quote}3.9.3:
> "message" : "Session 0x0 for server XXX, Closing socket connection. 
> Attempting reconnect except it is a SessionExpiredException or 
> SessionTimeoutException.",
>   "stackTrace" : "o.a.z.ClientCnxn$SessionTimeoutException: Client session 
> timed out, have not heard from server in 5590ms for session id 0x0
>         at o.a.z.ClientCnxn$SendThread.run(ClientCnxn.java:1253)
> {quote}
> 3.9.2: Application will be reconnected successfully.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to