Scheduled tasks in distributed log splitting not in sync with ZK
----------------------------------------------------------------

                 Key: HBASE-3890
                 URL: https://issues.apache.org/jira/browse/HBASE-3890
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.92.0
            Reporter: Lars George
             Fix For: 0.92.0


This is a continuation of HBASE-3889:

Note that something else must be slightly off here. Although the splitlogs 
znode is now empty, the master is still stuck here:

{noformat}
Doing distributed log split in 
hdfs://localhost:8020/hbase/.logs/10.0.0.65,60020,1305406356765  
- Waiting for distributed tasks to finish. scheduled=2 done=1 error=0   4380s

Master startup  
- Splitting logs after master startup   4388s
{noformat}

There seems to be a mismatch between what is in ZK and what the TaskBatch 
holds. In my case it could be related to the fact that the task was already in 
ZK after many faulty restarts caused by the NPE. Maybe the task was added only 
once (since tasks are keyed by path, and the path is unique on my machine), but 
the reference count was incremented twice? Now that the real task is done, the 
done counter has been increased, but it will never match the scheduled count.
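
To make the suspected mismatch concrete, here is a minimal Java sketch. It is 
not the actual HBase code: the class SplitBatchSketch and its members 
(installed, done, tasks, enqueue, markDone) are invented for illustration, and 
the path is made up. It only shows how a path-keyed map can end up with a 
single entry while the scheduled counter is incremented twice.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a task batch; all names are illustrative only.
class SplitBatchSketch {
    int installed = 0; // reported as "scheduled" in the status line
    int done = 0;
    // Keyed by log path; the value records whether the task has finished.
    final Map<String, Boolean> tasks = new HashMap<>();

    // Suspected bug: the counter is bumped even if the path is already tracked.
    void enqueue(String path) {
        tasks.putIfAbsent(path, Boolean.FALSE);
        installed++;
    }

    // Counts a path as done at most once, when its task completes.
    void markDone(String path) {
        Boolean previous = tasks.put(path, Boolean.TRUE);
        if (previous != null && !previous) {
            done++;
        }
    }

    boolean finished() {
        return done == installed;
    }

    public static void main(String[] args) {
        SplitBatchSketch batch = new SplitBatchSketch();
        String path = "/example/splitlog/task"; // made-up path
        batch.enqueue(path); // first scheduling attempt
        batch.enqueue(path); // same path again, e.g. after a faulty restart
        batch.markDone(path);
        System.out.println("scheduled=" + batch.installed
                + " done=" + batch.done
                + " finished=" + batch.finished());
        // prints: scheduled=2 done=1 finished=false
    }
}
```

Under these assumptions the waiting loop can never exit, because only one 
completion will ever arrive for the two counted tasks.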

The code could also check whether ZK is actually depleted and, if so, treat 
the remaining scheduled task as bogus? This of course only treats the symptom, 
not the root cause of this condition.
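
A rough Java sketch of that symptom-level check follows. The class 
DepletedCheckSketch and the method bogusTasks are hypothetical, and the sketch 
assumes that the wait condition compares done plus error against scheduled. 
The list of remaining task znodes is passed in as a plain parameter; in real 
code it would presumably come from a ZooKeeper getChildren call on the 
splitlog parent znode.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of HBase.
class DepletedCheckSketch {

    // Returns how many scheduled tasks can be written off as bogus:
    // tasks still counted as outstanding although ZK holds no task znodes.
    static int bogusTasks(int scheduled, int done, int error,
                          List<String> remainingZnodes) {
        int outstanding = scheduled - done - error;
        if (outstanding > 0 && remainingZnodes.isEmpty()) {
            return outstanding;
        }
        return 0; // either nothing is outstanding or ZK still has work queued
    }

    public static void main(String[] args) {
        // Values taken from the status line above; ZK is assumed to be empty.
        List<String> remaining = Collections.emptyList();
        int bogus = bogusTasks(2, 1, 0, remaining);
        System.out.println("bogus=" + bogus); // prints: bogus=1
    }
}
```

As noted, this would only unblock the master; it would not explain how the 
counters diverged in the first place.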

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
