[ 
https://issues.apache.org/jira/browse/CURATOR-367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855051#comment-15855051
 ] 

ASF GitHub Bot commented on CURATOR-367:
----------------------------------------

Github user Randgalt commented on the issue:

    https://github.com/apache/curator/pull/197
  
    I'm -1 on introducing a new dependency. Can the PR be re-worked to not use 
Mockito?


> Curator may deliver RECONNECTED before LOST in case of session expiry
> ---------------------------------------------------------------------
>
>                 Key: CURATOR-367
>                 URL: https://issues.apache.org/jira/browse/CURATOR-367
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.11.1
>            Reporter: Zoltan Szekeres
>
> h2. Behaviour:
> We saw our code block at client.blockUntilConnected() after reconnecting on 
> session expiry.
> h2. Possible reason:
> After receiving a session expired event, ConnectionState first resets the 
> connection and then notifies the parent watchers, where the CuratorEvent is 
> created. In this case it seems the first zookeeper event thread was delayed 
> before it called the parent watchers. Meanwhile, a new zookeeper event thread 
> was created by the call to reset, and this new thread sent the SyncConnected 
> event before the first thread sent SessionExpired to the parent watchers. 
> This resulted in ConnectionStateListener instances seeing RECONNECTED before 
> LOST.
> h2. Logs:
> 2016-11-17T20:23:28.527Z [Thread-0-SendThread(]  INFO  ClientCnxn: Opening socket connection to server _
> 2016-11-17T20:23:28.535Z [Thread-0-SendThread(]  INFO  ClientCnxn: Socket connection established to _, initiating session
> 2016-11-17T20:23:28.576Z [Thread-0-SendThread(]  INFO  ClientCnxn: Unable to reconnect to ZooKeeper service, session 0xc585ba1e7b6adc2 has expired, closing socket connection
> 2016-11-17T20:23:28.576Z [Thread-0-EventThread]  WARN  ConnectionState: Session expired event received
> 2016-11-17T20:23:28.673Z [Thread-0-EventThread]  INFO  ZooKeeper: Initiating client connection, connectString=_ sessionTimeout=30000 watcher=org.apache.curator.ConnectionState@6ddf3f9d
> 2016-11-17T20:23:28.691Z [Thread-0-SendThread(]  INFO  ClientCnxn: Opening socket connection to server _
> 2016-11-17T20:23:28.693Z [Thread-0-SendThread(]  INFO  ClientCnxn: Socket connection established to _, initiating session
> 2016-11-17T20:23:28.701Z [Thread-0-SendThread(]  INFO  ClientCnxn: Session establishment complete on server _, sessionid = 0x2585ba1e69ffeca, negotiated timeout = 30000
> 2016-11-17T20:23:28.701Z [Thread-0-EventThread]  INFO  ConnectionStateManager: State change: RECONNECTED
> 2016-11-17T20:23:28.715Z [Thread-0-EventThread]  INFO  ConnectionStateManager: State change: LOST
> 2016-11-17T20:23:28.715Z [Thread-0-EventThread]  INFO  ClientCnxn: EventThread shut down
> h2. Reproduction:
> I was only able to reproduce the behaviour by adding an artificial 
> Thread.sleep to ConnectionState#process, before the parent watchers are 
> called, when the event is session expired.
> {code:title=ConnectionState#process}
> @Override
> public void process(WatchedEvent event)
> {
>       if ( LOG_EVENTS )
>       {
>               log.debug("ConnectState watcher: " + event);
>       }
>       if ( event.getType() == Watcher.Event.EventType.None )
>       {
>               boolean wasConnected = isConnected.get();
>               boolean newIsConnected = checkState(event.getState(), wasConnected);
>               if ( newIsConnected != wasConnected )
>               {
>                       isConnected.set(newIsConnected);
>                       connectionStartMs = System.currentTimeMillis();
>               }
>       }
>       if (event.getState() == KeeperState.Expired)
>       {
>               System.err.println("::> sleep in ConnectionState#process");
>               try {
>                       Thread.sleep(1000);
>               } catch (InterruptedException e) {}
>       }
>       for ( Watcher parentWatcher : parentWatchers )
>       {
>               TimeTrace timeTrace = new TimeTrace("connection-state-parent-process", tracer.get());
>               parentWatcher.process(event);
>               timeTrace.commit();
>       }
> }
> {code}
> h2. Some ideas for fix:
>  * Put the event handling and the calls to the parent watchers into one 
> synchronized block.
>  * Change the order of handling the watched event and calling the parent 
> watchers (I'm not aware of the behaviour implications of this).
>  * Move only the call to reset to the end of the method "process".
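
The first idea in the list above (one synchronized block around the event handling and the parent-watcher calls) can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions, not Curator code: the class `OrderedWatcherSketch`, its `process(String)` and `reset()` methods, and the use of plain strings for event states are all hypothetical stand-ins, and the sleep mimics the artificial delay used in the reproduction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ConnectionState; it does not use Curator or ZooKeeper.
final class OrderedWatcherSketch {
    // Events in the order the (simulated) parent watcher received them.
    private final List<String> delivered = Collections.synchronizedList(new ArrayList<>());
    private Thread newEventThread;

    // Handling the event and notifying the parent watcher share one monitor,
    // so a second event thread cannot notify until this call has returned.
    synchronized void process(String state) {
        if (state.equals("Expired")) {
            reset(); // starts a new "event thread", as a reconnect would
            try {
                Thread.sleep(200); // artificial delay before notifying, as in the reproduction
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        delivered.add(state); // stands in for parentWatcher.process(event)
    }

    // Simulates the new connection reporting SyncConnected from another thread.
    private void reset() {
        newEventThread = new Thread(() -> process("SyncConnected"));
        newEventThread.start();
    }

    // Drives one session-expiry cycle and returns the delivery order.
    List<String> run() {
        process("Expired");
        try {
            newEventThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(delivered);
    }

    public static void main(String[] args) {
        System.out.println(new OrderedWatcherSketch().run()); // prints [Expired, SyncConnected]
    }
}
```

If `synchronized` is removed from `process` in this sketch, the simulated SyncConnected would typically be delivered first, which mirrors the RECONNECTED-before-LOST ordering in the logs. Whether holding a lock while calling watchers is acceptable in Curator itself is a separate question that this simulation does not answer.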



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)