[ 
https://issues.apache.org/jira/browse/HBASE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150001#comment-13150001
 ] 

nkeywal commented on HBASE-4764:
--------------------------------

For example, I had a look at the code where the sample stack above blocks. The 
current implementation is:
{noformat}
    while (!this.stopped && (notimeout || remaining > 0) && this.data == null) {
      if (notimeout) {
        wait();
        continue;
      }
      wait(remaining);
      remaining = timeout - (System.currentTimeMillis() - startTime);
    }
{noformat}

This means that if the notification is sent before we actually start waiting, 
we will wait forever. The probability is quite low and I don't think that's the 
root cause here, but who knows? An implementation like this one would be 
safer:
{noformat}
    long previousLogTime = 0;
    while (!this.stopped && (notimeout || remaining > 0) && this.data == null) {
      if (System.currentTimeMillis() > previousLogTime + 1000) {
        LOG.info("Waiting for node to be available ");
        previousLogTime = System.currentTimeMillis();
      }

      wait(200);
      remaining = timeout - (System.currentTimeMillis() - startTime);
    }
{noformat}
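
To make the idea easier to try outside HBase, here is a minimal standalone sketch of the same bounded-wait pattern. Everything in it is an assumption for illustration: the class `BoundedWaitSketch` and its methods are hypothetical stand-ins, not the real ZooKeeperNodeTracker, and a timeout of 0 is taken to mean "no timeout". Each wait is capped at 200 ms, so a missed notification delays the caller by at most one slice instead of blocking it forever.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the tracker; not the actual HBase class. */
class BoundedWaitSketch {
  private static final long MAX_SLICE_MS = 200;

  private byte[] data;      // guarded by "this"
  private boolean stopped;  // guarded by "this"

  /** Publishes the data and wakes up every waiter. */
  synchronized void setData(byte[] newData) {
    this.data = newData;
    notifyAll();
  }

  /** Stops the tracker and wakes up every waiter. */
  synchronized void stop() {
    this.stopped = true;
    notifyAll();
  }

  /**
   * Waits until data is set, the tracker is stopped, or the timeout elapses.
   * A timeout of 0 is assumed to mean "no timeout". Returns null when no data
   * became available.
   */
  synchronized byte[] blockUntilAvailable(long timeoutMs) throws InterruptedException {
    final boolean noTimeout = timeoutMs == 0;
    final long start = System.nanoTime();
    long remaining = timeoutMs;
    while (!stopped && data == null && (noTimeout || remaining > 0)) {
      // Never sleep longer than one slice, so the condition is re-checked
      // regularly even if a notification was missed. The slice is always > 0,
      // which avoids wait(0) (that would mean "wait forever").
      long slice = noTimeout ? MAX_SLICE_MS : Math.min(MAX_SLICE_MS, remaining);
      wait(slice);
      remaining = timeoutMs - TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
    return data;
  }
}
```

The sketch uses System.nanoTime() rather than System.currentTimeMillis() for the elapsed time, so a wall-clock adjustment cannot stretch or shorten the timeout. That is a design choice of the sketch, not something the current code does.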


There are no notifications in getData(); I don't know if that's intentional.

The recursive call to start in ZooKeeperNodeTracker#start() is not really 
necessary.

To conclude, though, we don't even know whether the hang on the build machine 
has a stack like this one.


Can we put a script in the crontab? Something that would run every 10 minutes 
and look for dead processes? We could also hook a script into a maven task so 
that it runs when the task is launched and sends a mail.
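
As a rough sketch of what such a periodic check could look like, the snippet below lists processes that have been running for too long. It relies on several assumptions: it needs Java 9+ for `ProcessHandle`, it takes "dead" to mean "still alive but older than some threshold", and the class name `StaleProcessFinder`, the `surefire` command-line marker and the 60 minute limit are made up for illustration. Sending the mail is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper; the marker and threshold below are assumed values. */
class StaleProcessFinder {

  /** True when a process started at 'start' has run longer than 'maxAge' as of 'now'. */
  static boolean isStale(Instant start, Instant now, Duration maxAge) {
    return Duration.between(start, now).compareTo(maxAge) > 0;
  }

  /**
   * Returns "pid commandLine" for every visible process whose command line
   * contains 'marker' and whose age exceeds 'maxAge'. Processes that do not
   * expose a command line or a start time are skipped.
   */
  static List<String> findStale(String marker, Duration maxAge) {
    Instant now = Instant.now();
    return ProcessHandle.allProcesses()
        .filter(p -> p.info().commandLine().orElse("").contains(marker))
        .filter(p -> p.info().startInstant()
            .map(start -> isStale(start, now, maxAge))
            .orElse(false))
        .map(p -> p.pid() + " " + p.info().commandLine().orElse("?"))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed values: forked test JVMs mention "surefire" and should not
    // live longer than an hour.
    findStale("surefire", Duration.ofMinutes(60)).forEach(System.out::println);
  }
}
```

A crontab entry running this every 10 minutes and piping any output to a mail command would cover the first suggestion. The exact entry depends on the build machine, so it is not shown here.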
                
> naming errors for TestHLogUtils and SoftValueSortedMapTest
> ----------------------------------------------------------
>
>                 Key: HBASE-4764
>                 URL: https://issues.apache.org/jira/browse/HBASE-4764
>             Project: HBase
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.94.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 4764_trunk.patch
>
>
> SoftValueSortedMapTest is a test, but not a junit one, so I tend to think it's 
> not called. I don't know if it's used.
> TestHLogUtils has a wrong name: it's not a test, but a helper. It confuses 
> the script looking for the tests. It would seem better to rename it. 
> Is there anything special to do to keep the history attached to this file?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
