[ https://issues.apache.org/jira/browse/ZOOKEEPER-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393524#comment-15393524 ]
Flavio Junqueira commented on ZOOKEEPER-2469: --------------------------------------------- Thanks [~sershe] for the patch and [~arshad.mohammad], [~eribeiro] for the reviews. I understand that this part of the code badly needs a re-write, but I'm not sure I like this current patch as is. I do see the issue reported, though. Here are a couple of comments: # I know that {{immediateRetry}} has been mentioned a few times in this jira, but it isn't clear to me why we need it. Is it necessary for correctness or is it just convenience? If it is convenience, then I'd rather not have it because the whole {{retry}}/{{initialRetry}} logic isn't really contributing to the code being cleaner. # I wanted to check that my understanding is correct. The latest patch seems to exit the while loop upon the second login exception and {{immediateRetry}} is false during the first execution iteration but not the second. Shouldn't we keep retrying but making sure that it is sleeping in each iteration rather than run in a tight loop? Another point unrelated to your changes but in the same scope. Instead of throwing {{le}} upon an InterruptedException in that loop, we should restore the interrupt status as suggested here: https://www.ibm.com/developerworks/library/j-jtp05236/ There are a couple of instances around that code, so if you don't want to do it in this patch, please I'd appreciate if you could create a new jira for it. > infinite loop in ZK re-login > ---------------------------- > > Key: ZOOKEEPER-2469 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2469 > Project: ZooKeeper > Issue Type: Bug > Reporter: Sergey Shelukhin > Assignee: Sergey Shelukhin > Attachments: ZOOKEEPER-2469.01.patch, ZOOKEEPER-2469.02.patch, > ZOOKEEPER-2469.03.patch, ZOOKEEPER-2469.patch > > > {noformat} > int retry = 1; > while (retry >= 0) { > try { > reLogin(); > break; > } catch (LoginException le) { > if (retry > 0) { > --retry; > // sleep for 10 seconds. > try { > Thread.sleep(10 * 1000); > } catch (InterruptedException e) { > LOG.error("Interrupted during login > retry after LoginException:", le); > throw le; > } > } else { > LOG.error("Could not refresh TGT for > principal: " + principal + ".", le); > } > } > } > {noformat} > will retry forever. Should return like the one above -- This message was sent by Atlassian JIRA (v6.3.4#6332)