[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046298#comment-14046298
 ] 

Orion Hodson commented on ZOOKEEPER-1933:
-----------------------------------------

Hi Raul, I'm responsible for the change in select() and am also a relative 
neophyte in the ZooKeeper code.

From scanning the patch alone, it may not be obvious, but there is an 
is_unrecoverable() check in the same do_io() loop. It comes just after 
zookeeper_process().

Previously there was no error checking in the Win32 case for select(). The 
issue we've seen is that when the socket is closed remotely, the descriptor 
becomes bad and select() fails immediately without waiting. There is no change 
in state at that point, just a bad descriptor. Because the descriptor was never 
removed from the fd_sets, the loop would spin and burn CPU. The variable 
interest in the loop serves two purposes, and because it wasn't zeroed out on 
error, the code would go on to attempt I/O on the socket after the select() 
failure and then fail there as well. With the change here, we see sessions 
being re-established as expected.
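
For anyone reading along without the patch open, here is a rough sketch of the 
pattern described above. It is not the code from the patch: it uses POSIX 
select() rather than Winsock so it can be compiled anywhere, and the names 
sketch_wait, SKETCH_READ and SKETCH_WRITE are made up for illustration. It also 
assumes that the "two purposes" of interest are (1) the events to wait for 
going into select() and (2) the events that actually fired coming out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical flags standing in for the client's read/write interest bits. */
enum { SKETCH_READ = 1, SKETCH_WRITE = 2 };

/*
 * Wait on fd for the events requested in *events.
 * On return, *events holds only the events that actually fired; it is 0 if
 * select() failed or timed out, so the caller never processes I/O based on
 * a stale "wanted" mask.
 * Returns 0 on success (including timeout), -1 if select() itself failed.
 */
static int sketch_wait(int fd, int *events, struct timeval *tv)
{
    fd_set rfds, wfds;
    int wanted = *events;
    int rc;

    /* Rebuild the sets on every call so a dead descriptor is not carried over. */
    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    if (wanted & SKETCH_READ)
        FD_SET(fd, &rfds);
    if (wanted & SKETCH_WRITE)
        FD_SET(fd, &wfds);

    rc = select(fd + 1, &rfds, &wfds, NULL, tv);

    /* Zero the mask before deciding what fired; this is the key step. */
    *events = 0;
    if (rc < 0)
        return -1; /* e.g. EBADF for a bad descriptor; errno is left set */

    if (rc > 0) {
        if (FD_ISSET(fd, &rfds))
            *events |= SKETCH_READ;
        if (FD_ISSET(fd, &wfds))
            *events |= SKETCH_WRITE;
    }
    return 0;
}
```

A caller in this sketch would treat a -1 return as a socket error and go 
through its normal error handling instead of attempting reads or writes on the 
descriptor. On Win32 the same idea would presumably check for SOCKET_ERROR and 
use WSAGetLastError() instead of errno.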

Thanks
Orion

> Windows release build of zk client cannot connect to zk server
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1933
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1933
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.4.6
>            Reporter: Norris Lee
>            Assignee: Orion Hodson
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1933-2.patch, ZOOKEEPER-1933-3.patch, 
> ZOOKEEPER-1933.patch, ZOOKEEPER-1933.patch, ZOOKEEPER-1933.patch
>
>
> When building zookeeper in Visual Studio in debug mode, the client can 
> connect to the server without error. When building in release mode, I get a 
> continuous error message:
> {code}
> 2014-06-02 11:25:20,070:7144(0xc84):ZOO_INFO@zookeeper_init_internal@1008: 
> Initiating client connection, host=192.168.39.43:5181 sessionTimeout=30000 
> watcher=10049C90 sessionId=0 sessionPasswd=<null> context=001FC0F0 flags=0
> 2014-06-02 11:25:20,072:7144(0xc84):ZOO_DEBUG@start_threads@221: starting 
> threads...
> 2014-06-02 11:25:20,072:7144(0x1ea0):ZOO_DEBUG@do_completion@460: started 
> completion thread
> 2014-06-02 11:25:20,072:7144(0x1e08):ZOO_DEBUG@do_io@403: started IO thread
> 2014-06-02 
> 11:25:20,072:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1148: [OLD] 
> count=0 capacity=0 next=0 hasnext=0
> 2014-06-02 
> 11:25:20,072:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1151: [NEW] 
> count=1 capacity=16 next=0 hasnext=1
> 2014-06-02 
> 11:25:20,072:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1160: Using 
> next from NEW=192.168.39.43:5181
> 2014-06-02 11:25:20,072:7144(0x1e08):ZOO_DEBUG@zookeeper_interest@1992: [zk] 
> connect()
> 2014-06-02 11:25:20,158:7144(0x1e08):ZOO_ERROR@handle_socket_error_msg@1847: 
> Socket [192.168.39.43:5181] zk retcode=-4, errno=10035(Unknown error): failed 
> to send a handshake packet: Unknown error
> 2014-06-02 11:25:20,158:7144(0x1e08):ZOO_DEBUG@handle_error@1595: Previous 
> connection=[192.168.39.43:5181] delay=0
> 2014-06-02 
> 11:25:20,158:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1148: [OLD] 
> count=0 capacity=0 next=0 hasnext=0
> 2014-06-02 
> 11:25:20,158:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1151: [NEW] 
> count=1 capacity=16 next=0 hasnext=1
> 2014-06-02 
> 11:25:20,158:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1160: Using 
> next from NEW=192.168.39.43:5181
> 2014-06-02 11:25:20,158:7144(0x1e08):ZOO_DEBUG@zookeeper_interest@1992: [zk] 
> connect()
> 2014-06-02 11:25:20,159:7144(0x1e08):ZOO_ERROR@handle_socket_error_msg@1847: 
> Socket [192.168.39.43:5181] zk retcode=-4, errno=10035(Unknown error): failed 
> to send a handshake packet: Unknown error
> 2014-06-02 11:25:20,159:7144(0x1e08):ZOO_DEBUG@handle_error@1595: Previous 
> connection=[192.168.39.43:5181] delay=0
> 2014-06-02 
> 11:25:20,159:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1148: [OLD] 
> count=0 capacity=0 next=0 hasnext=0
> 2014-06-02 
> 11:25:20,159:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1151: [NEW] 
> count=1 capacity=16 next=0 hasnext=1
> 2014-06-02 
> 11:25:20,159:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1160: Using 
> next from NEW=192.168.39.43:5181
> 2014-06-02 11:25:20,159:7144(0x1e08):ZOO_DEBUG@zookeeper_interest@1992: [zk] 
> connect()
> 2014-06-02 11:25:20,159:7144(0x1e08):ZOO_ERROR@handle_socket_error_msg@1847: 
> Socket [192.168.39.43:5181] zk retcode=-4, errno=10035(Unknown error): failed 
> to send a handshake packet: Unknown error
> 2014-06-02 11:25:20,159:7144(0x1e08):ZOO_DEBUG@handle_error@1595: Previous 
> connection=[192.168.39.43:5181] delay=0
> 2014-06-02 
> 11:25:20,159:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1148: [OLD] 
> count=0 capacity=0 next=0 hasnext=0
> 2014-06-02 
> 11:25:20,159:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1151: [NEW] 
> count=1 capacity=16 next=0 hasnext=1
> 2014-06-02 
> 11:25:20,159:7144(0x1e08):ZOO_DEBUG@get_next_server_in_reconfig@1160: Using 
> next from NEW=192.168.39.43:5181
> 2014-06-02 11:25:20,159:7144(0x1e08):ZOO_DEBUG@zookeeper_interest@1992: [zk] 
> connect()
> {code}


