[ https://issues.apache.org/jira/browse/HBASE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150001#comment-13150001 ]
nkeywal commented on HBASE-4764:
--------------------------------

For example, after a look at the code where the above sample blocks, the current implementation is:
{noformat}
while (!this.stopped && (notimeout || remaining > 0) && this.data == null) {
  if (notimeout) {
    wait();
    continue;
  }
  wait(remaining);
  remaining = timeout - (System.currentTimeMillis() - startTime);
}
{noformat}

This means that if the notification is sent before we actually start waiting, we will wait forever. The probability is quite low and I don't think that's the root cause here, but who knows? An implementation like this one would be safer:
{noformat}
long previousLogTime = 0;
while (!this.stopped && (notimeout || remaining > 0) && this.data == null) {
  if (System.currentTimeMillis() > previousLogTime + 1000) {
    LOG.info("Waiting for node to be available ");
    previousLogTime = System.currentTimeMillis();
  }
  wait(200);
  remaining = timeout - (System.currentTimeMillis() - startTime);
}
{noformat}

There are no notifications in getData(); I don't know if that's intentional. The recursive call to start in ZooKeeperNodeTracker#start() is not really necessary.

But to conclude, we don't even know if it's a stack like this one on the build machine. Can we put a script in the crontab? Something that would run every 10 minutes and look for dead processes? We could also hijack a maven task to run a script when it's launched and make it send a mail.

> naming errors for TestHLogUtils and SoftValueSortedMapTest
> ----------------------------------------------------------
>
>                 Key: HBASE-4764
>                 URL: https://issues.apache.org/jira/browse/HBASE-4764
>             Project: HBase
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.94.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 4764_trunk.patch
>
>
> SoftValueSortedMapTest is a test, but not a JUnit one, so I tend to think it's
> not called. I don't know if it's used.
> TestHLogUtils has a wrong name: it's not a test, but a helper.
> It confuses the script looking for the tests. It would seem better to rename it.
> Is there anything special to do to keep the history attached to this file?
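
To make the bounded-wait idea from the comment concrete, here is a minimal, self-contained sketch. It is not the ZooKeeperNodeTracker code: the class name BoundedWaitSketch, the 200 ms slice constant, and the choice to swallow the interrupt and return early are all assumptions made for illustration. The waiter rechecks its condition at least every 200 ms, so a missed notification delays it by at most one slice instead of blocking it indefinitely.

```java
// Hypothetical stand-in for a node tracker; not the HBase implementation.
class BoundedWaitSketch {
    private static final long SLICE_MS = 200; // assumed polling slice, as in the comment

    private byte[] data;     // guarded by "this"
    private boolean stopped; // guarded by "this"

    /** Publishes the data and wakes up every waiter. */
    synchronized void setData(byte[] newData) {
        this.data = newData;
        notifyAll();
    }

    /** Marks the tracker as stopped and wakes up every waiter. */
    synchronized void stop() {
        this.stopped = true;
        notifyAll();
    }

    /**
     * Waits until data is set, the tracker is stopped, or the timeout expires.
     * A timeout of 0 means "no deadline". Returns the data, or null if none.
     */
    synchronized byte[] blockUntilAvailable(long timeoutMs) {
        if (timeoutMs < 0) {
            throw new IllegalArgumentException("timeout must be >= 0");
        }
        boolean noTimeout = timeoutMs == 0;
        long startTime = System.currentTimeMillis();
        long remaining = timeoutMs;
        while (!stopped && (noTimeout || remaining > 0) && data == null) {
            // Never call wait(0) here: that would block until notified.
            long slice = noTimeout ? SLICE_MS : Math.min(SLICE_MS, remaining);
            try {
                wait(slice);
            } catch (InterruptedException e) {
                // Assumption: restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                break;
            }
            remaining = timeoutMs - (System.currentTimeMillis() - startTime);
        }
        return data;
    }

    public static void main(String[] args) {
        BoundedWaitSketch tracker = new BoundedWaitSketch();
        // A second thread publishes the data while the main thread waits.
        Thread publisher = new Thread(() -> tracker.setData(new byte[] {1}));
        publisher.start();
        byte[] result = tracker.blockUntilAvailable(5000);
        System.out.println(result == null ? "timed out" : "data available");
    }
}
```

The state is only read and written while holding the object's monitor, so the loop condition is rechecked after every wake-up, whether it was caused by notifyAll(), by the slice expiring, or by a spurious wake-up.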