[ https://issues.apache.org/jira/browse/ZOOKEEPER-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wikant updated ZOOKEEPER-4840:
------------------------------------
Description:

h2. TL;DR

After upgrading Apache Zookeeper & Apache Curator, our application is experiencing new, unexpected behaviour after events such as:
* Zookeeper server(s) being restarted
* a Zookeeper server failing & being replaced

h2. Background

The application uses Zookeeper for leader election & metadata storage. It runs on 3 hosts, each of which also runs 1 Zookeeper server/daemon.

Previously the application ran on Zookeeper version 3.5.10 & Curator version 4.3.0.

After upgrading to Zookeeper version 3.9.1 & Curator version 5.2.0, a new edge case was observed: after Zookeeper daemons are restarted or fail, the application (i.e. the Zookeeper client) enters a 15+ minute loop of repeatedly logging {{SessionExpiredException}}.

These repeated {{SessionExpiredException}} errors do not indicate a full Zookeeper client/communication outage, because DEBUG logs show that other Zookeeper sessions are communicating just fine. Unfortunately, the {{SessionExpiredException}} log lines do not include the Session ID that is encountering the exception.

h2. Symptoms

When using Zookeeper version 3.9.1 & Curator version 5.2.0, after restarting/failing some of the Zookeeper daemons:
# All 3 Zookeeper clients experience connection failures lasting a few seconds after the Zookeeper daemons are failed/restarted.
# These connection failures resolve shortly, without any action needed.
# Around 1 minute after the Zookeeper daemons are failed/restarted, all 3 Zookeeper clients start repeatedly logging {{SessionExpiredException}}.
# The {{SessionExpiredException}} is repeatedly logged for 15+ minutes. During this time there are no connectivity issues; the Zookeeper server logs show that all 3 Zookeeper servers are receiving regular traffic from the clients.
# Interestingly, each Zookeeper server receives no requests from its local Zookeeper client for the duration of the period in which {{SessionExpiredException}} is repeatedly logged, while it continues to receive regular traffic from the 2 remote Zookeeper clients.

The evidence suggests that this is a client-side issue & that the {{SessionExpiredException}} is thrown before the request is even sent to the Zookeeper server.

h2. Root Cause Theory

Based on my analysis of the available DEBUG & TRACE level logs, my current working theory is that the Zookeeper client is repeatedly trying to use a connection/session in CLOSED state, which causes the client to throw {{SessionExpiredException}} on every retry.

The first {{SessionExpiredException}} is thrown after a log pattern like the following:
{quote}2024-06-18 01:14:45,694 WARN Curator-ConnectionStateManager-0: Session timeout has elapsed while SUSPENDED. Injecting a session expiration. Elapsed ms: 60000. Adjusted session timeout ms: 60000
2024-06-18 01:14:45,694 INFO Curator-ConnectionStateManager-0: ZK getBindInformation: ip-172-31-200-85.ec2.internal:22181,ip-172-31-200-230.ec2.internal:22181,ip-172-31-200-39.ec2.internal:22181
2024-06-18 01:14:45,695 INFO Curator-ConnectionStateManager-0: injectSessionExpiration() called
2024-06-18 01:14:45,695 INFO Curator-ConnectionStateManager-0: channel is told closing
2024-06-18 01:14:45,698 INFO epollEventLoopGroup-3-1: channel is disconnected: [id: 0x86a592c8, L:/172.31.200.230:35836 ! R:ip-172-31-200-230.ec2.internal/172.31.200.230:22181]
2024-06-18 01:14:45,698 INFO epollEventLoopGroup-3-1: channel is told closing
2024-06-18 01:14:45,698 TRACE StateStore-bg-thread-SendThread(ip-172-31-200-230.ec2.internal:22181): SendThread exited loop for session: 0x1000000fc490005
2024-06-18 01:14:45,703 DEBUG StateStore-bg-thread-EventThread: Retry-able exception received
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /EMR/StateStore/ApplicationsAttempt
at org.apache.zookeeper.KeeperException.create(KeeperException.java:133)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:53)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1972)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:327)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:316)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:313)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:304)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:35)
at org.apache.curator.framework.recipes.shared.ConnectionStateAwareSharedValue.readValue(ConnectionStateAwareSharedValue.java:253)
at org.apache.curator.framework.recipes.shared.ConnectionStateAwareSharedValue.access$200(ConnectionStateAwareSharedValue.java:26)
at org.apache.curator.framework.recipes.shared.ConnectionStateAwareSharedValue$1.process(ConnectionStateAwareSharedValue.java:45)
at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:83)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:563)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:538)
{quote}
It seems the initial Zookeeper daemon restart/failure causes the Curator client to call {{injectSessionExpiration()}}, which in turn causes session {{0x1000000fc490005}} to have its SendThread exit.
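For reference, the {{injectSessionExpiration()}} implementation that Curator invokes (linked below) looks roughly like the following in Zookeeper 3.9.x; this is an abridged excerpt, so exact details may differ slightly:
{code:java}
// org.apache.zookeeper.ZooKeeperTestable (abridged excerpt)
@Override
public void injectSessionExpiration() {
    LOG.info("injectSessionExpiration() called");

    // Deliver an Expired event to watchers, then tell the EventThread to shut down.
    clientCnxn.eventThread.queueEvent(new WatchedEvent(
            Watcher.Event.EventType.None, Watcher.Event.KeeperState.Expired, null));
    clientCnxn.eventThread.queueEventOfDeath();

    // Mark the client connection CLOSED and close the channel
    // (this produces the "channel is told closing" log line above).
    clientCnxn.state = ZooKeeper.States.CLOSED;
    clientCnxn.sendThread.getClientCnxnSocket().onClosing();
}
{code}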
The {{injectSessionExpiration()}} method [puts the ClientCnxn object for the session into CLOSED state|https://github.com/apache/zookeeper/blob/39973dc5fddf778733e4f0370980629c07a16d67/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeperTestable.java#L41]. This causes [the following code path|https://github.com/apache/zookeeper/blob/39973dc5fddf778733e4f0370980629c07a16d67/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L777] to throw {{SessionExpiredException}} before the request is even sent to the Zookeeper server; a rough sketch of that path follows.
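Paraphrasing the linked {{ClientCnxn}} code (abridged; exact line numbers & details may differ): once the client state is no longer alive, every submitted request is failed locally, and a CLOSED state maps the failure to {{SESSIONEXPIRED}}:
{code:java}
// org.apache.zookeeper.ClientCnxn (abridged paraphrase)
// In queuePacket(): requests submitted while the connection is dead
// never reach the outgoing queue.
if (!state.isAlive() || closing) {
    conLossPacket(packet);   // fail the request locally
} else {
    outgoingQueue.add(packet);
}

// In conLossPacket(): a CLOSED client maps the failure to SESSIONEXPIRED.
switch (state) {
case AUTH_FAILED:
    p.replyHeader.setErr(KeeperException.Code.AUTHFAILED.intValue());
    break;
case CLOSED:
    p.replyHeader.setErr(KeeperException.Code.SESSIONEXPIRED.intValue());
    break;
default:
    p.replyHeader.setErr(KeeperException.Code.CONNECTIONLOSS.intValue());
}
{code}
This matches the observed symptom: the exception is raised client-side, with no packet ever reaching the local Zookeeper server.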
Then, 25 minutes later, the EventThread for the Zookeeper session is finally also stopped:
{quote}2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: GetDataBuilderImpl-Background - 1163211 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: GetChildrenBuilderImpl-Background - 1163211 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: CreateBuilderImpl-Background - 1163211 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: GetChildrenBuilderImpl-Background - 1003198 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: GetDataBuilderImpl-Background - 383164 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: GetChildrenBuilderImpl-Background - 383164 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: CreateBuilderImpl-Background - 383164 ms
2024-06-18 01:40:43,942 TRACE StateStore-bg-thread-EventThread: Trace: GetChildrenBuilderImpl-Background - 266157 ms
2024-06-18 01:40:43,942 INFO StateStore-bg-thread-EventThread: EventThread shut down for session: 0x1000000fc490005
{quote}
Only after the EventThread is stopped do the {{SessionExpiredException}} errors stop occurring.

This leads me to believe that the lingering EventThread for the closed session keeps the closed session in use, which produces the repeated {{SessionExpiredException}} until the EventThread is stopped. Each failure is retried by Curator's {{RetryLoop}} (visible in the stack trace above), which treats a session-expired error as retry-able; see the excerpt below.
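For context on why the same failure repeats rather than aborting: in Curator 5.x, {{RetryLoop.shouldRetry}} classifies {{SESSIONEXPIRED}} as retry-able (abridged excerpt below; details may differ slightly across versions), so each instantly-failing request is simply retried against the same CLOSED {{ClientCnxn}}:
{code:java}
// org.apache.curator.RetryLoop (abridged excerpt)
public static boolean shouldRetry(int rc) {
    return (rc == KeeperException.Code.CONNECTIONLOSS.intValue())
        || (rc == KeeperException.Code.OPERATIONTIMEOUT.intValue())
        || (rc == KeeperException.Code.SESSIONMOVED.intValue())
        || (rc == KeeperException.Code.SESSIONEXPIRED.intValue());
}
{code}
This is consistent with the "Retry-able exception received" DEBUG line in the logs above.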
In between "2024-06-18 01:14:45" & "2024-06-18 01:40:43" we see 0 requests from the local client ({{172.31.200.230}}) reaching the Zookeeper server also on {{172.31.200.230}}. However, other sessions to other Zookeeper servers are unaffected & operate without connection or session-expiry issues.
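For completeness, the client-side fail-fast behaviour should be reproducible in isolation with a sketch along these lines (hypothetical connect string & znode path; assumes a reachable Zookeeper server):
{code:java}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class InjectExpirationRepro {
    public static void main(String[] args) throws Exception {
        // Hypothetical connect string; point this at a real ensemble.
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 60000, event -> {});
        while (zk.getState() != ZooKeeper.States.CONNECTED) {
            Thread.sleep(100);
        }

        // Simulate what Curator does after the session timeout elapses while SUSPENDED.
        zk.getTestable().injectSessionExpiration();

        // Every subsequent request now fails client-side with SESSIONEXPIRED,
        // without any packet reaching the server.
        try {
            zk.getData("/some/znode", false, null);
        } catch (KeeperException.SessionExpiredException e) {
            System.out.println("Failed locally: " + e.getMessage());
        }
    }
}
{code}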
> Repeated SessionExpiredException after Zookeeper daemon restart
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4840
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4840
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.9.1
>            Reporter: Kevin Wikant
>            Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)