[ https://issues.apache.org/jira/browse/ZOOKEEPER-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Han reassigned ZOOKEEPER-2466:
--------------------------------------

    Assignee: Michael Han  (was: Flavio Junqueira)

> Client skips servers when trying to connect
> -------------------------------------------
>
>                 Key: ZOOKEEPER-2466
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2466
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>            Reporter: Flavio Junqueira
>            Assignee: Michael Han
>            Priority: Critical
>             Fix For: 3.5.3, 3.6.0
>
>
> I've been looking at {{Zookeeper_simpleSystem::testFirstServerDown}} and I
> observed the following behavior. The list of servers to connect contains two
> servers, let's call them S1 and S2. The client never connects, but the odd
> bit is the sequence of servers that the client tries to connect to:
> {noformat}
> S1
> S2
> S1
> S1
> S1
> <keeps repeating S1>
> {noformat}
> It intrigued me that S2 is only tried once and never again. Checking the
> code, here is what happens. Initially, {{zh->reconfig}} is 1, so in
> {{zoo_cycle_next_server}} we return an address from
> {{get_next_server_in_reconfig}}, which is taken from {{zh->addrs_new}} in
> this test case. The attempt to connect fails, and {{handle_error}} is invoked
> in the error handling path. {{handle_error}} actually invokes
> {{addrvec_next}} which changes the address pointer to the next server on the
> list.
> After two attempts, it decides that it has tried all servers in
> {{zoo_cycle_next_server}} and sets {{zh->reconfig}} to zero. Once
> {{zh->reconfig == 0}}, we have that each call to {{zoo_cycle_next_server}}
> moves the address pointer to the next server in {{zh->addrs}}. But, given
> that {{handle_error}} also moves the pointer to the next server, we end up
> moving the pointer ahead twice upon every failed attempt to connect, which is
> wrong.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)