[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340172#comment-16340172
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
-------------------------------------------

GitHub user jiajunwang opened a pull request:

    https://github.com/apache/helix/pull/131

    Bump up ZOOKEEPER version to 3.4.11.

    There is a zk connection related bug (ZOOKEEPER-2775) fixed in 3.4.11. Bump 
up version to get the fix.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiajunwang/helix master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #131
    
----
commit 22cc4a4d4819b211b30094de7b3b0944cb5c033b
Author: jiajunwang <ericwang1985@...>
Date:   2018-01-25T22:04:13Z

    Bump up ZOOKEEPER version to 3.4.11.
    
    There is a zk connection related bug (ZOOKEEPER-2775) fixed in 3.4.11. Bump 
up version to get the fix.

----


> ZK Client not able to connect with Xid out of order error 
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2775
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.10, 3.5.3, 3.6.0
>            Reporter: Bhupendra Kumar Jain
>            Assignee: Mohammad Arshad
>            Priority: Critical
>             Fix For: 3.4.11, 3.5.4, 3.6.0
>
>         Attachments: ZOOKEEPER-2775-01.patch
>
>
> During Network unreachable scenario in one of the cluster, we observed Xid 
> out of order and Nothing in the queue error continously. And ZK client it 
> finally not able to connect successully to ZK server. 
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>       at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>       at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>       at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>       
> *Analysis:* 
> 1) First time Client fails to do SASL login due to network unreachable 
> problem.
> 2017-03-29 10:03:59,377 | WARN  | [main-SendThread(192.168.130.8:24002)] | 
> SASL configuration failed: javax.security.auth.login.LoginException: Network 
> is unreachable (sendto failed) Will continue connection to Zookeeper server 
> without SASL authentication, if Zookeeper server allows it. | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) 
>       Here the boolean saslLoginFailed becomes true.
> 2) After some time network connection is recovered and client is successully 
> able to login but still the boolean saslLoginFailed is not reset to false. 
> 3) Now SASL negotiation between client and server start happening and during 
> this time no user request will be sent. ( As the socket channel will be 
> closed for write till sasl negotiation complets)
> 4) Now response from server for SASL packet will be processed by the client 
> and client assumes that tunnelAuthInProgress() is finished ( method checks 
> for saslLoginFailed boolean Since the boolean is true it assumes its done.) 
> and tries to process the packet as a other packet and will result in above 
> errors. 
> *Solution:*  Reset the saslLoginFailed boolean every time before client login



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to