[ 
https://issues.apache.org/jira/browse/HBASE-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034149#comment-13034149
 ] 

Prakash Khemani commented on HBASE-3890:
----------------------------------------

Given the bug you identified in HBASE-3889, this behavior is expected. The 
SplitLogManager puts up a task; a SplitLogWorker picks it up but never 
completes it because of the bug. The manager then resubmits the task, and 
another worker picks it up and likewise never completes it. The manager 
resubmits at most hbase.splitlog.max.resubmit (default = 3) times, after which 
the task hangs.
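
To make the sequence concrete, here is a minimal, self-contained sketch of the 
resubmit loop described above. It is not HBase code: the class ResubmitSketch, 
its run method, and the Result type are hypothetical names, and the real 
manager works through ZooKeeper and timeouts rather than a simple loop. The 
sketch also assumes that "at most 3 resubmits" means one initial attempt plus 
three retries.

```java
import java.util.function.BooleanSupplier;

// Hypothetical toy model of the resubmit behavior; not the SplitLogManager API.
public class ResubmitSketch {
    enum Outcome { DONE, HUNG }

    static final class Result {
        final Outcome outcome;
        final int attempts;

        Result(Outcome outcome, int attempts) {
            this.outcome = outcome;
            this.attempts = attempts;
        }
    }

    // Hands the task to a worker, then resubmits it up to maxResubmit times
    // while the worker keeps failing to complete it.
    static Result run(int maxResubmit, BooleanSupplier workerCompletes) {
        int attempts = 0;
        int resubmits = 0;
        while (true) {
            attempts++;
            if (workerCompletes.getAsBoolean()) {
                return new Result(Outcome.DONE, attempts);
            }
            if (resubmits >= maxResubmit) {
                // Resubmit budget is spent: nobody picks the task up again.
                return new Result(Outcome.HUNG, attempts);
            }
            resubmits++;
        }
    }

    public static void main(String[] args) {
        // A worker that never completes, standing in for the HBASE-3889 bug.
        Result r = run(3, () -> false);
        System.out.println(r.outcome + " after " + r.attempts + " attempts");
    }
}
```

With a worker that never completes, the task ends up HUNG once the resubmit 
budget is used up, which matches the master waiting indefinitely.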



> Scheduled tasks in distributed log splitting not in sync with ZK
> ----------------------------------------------------------------
>
>                 Key: HBASE-3890
>                 URL: https://issues.apache.org/jira/browse/HBASE-3890
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Lars George
>             Fix For: 0.92.0
>
>
> This is a continuation of HBASE-3889:
> Note that something else must be slightly off here. Although the splitlogs 
> znode is now empty, the master is still stuck here:
> {noformat}
> Doing distributed log split in 
> hdfs://localhost:8020/hbase/.logs/10.0.0.65,60020,1305406356765        
> - Waiting for distributed tasks to finish. scheduled=2 done=1 error=0   4380s
> Master startup        
> - Splitting logs after master startup   4388s
> {noformat}
> There seems to be a mismatch between what is in ZK and what the TaskBatch 
> holds. In my case it could be related to the fact that the task was already 
> in ZK after many faulty restarts because of the NPE. Maybe it was added only 
> once (since it is keyed by path, and that is unique on my machine), but the 
> reference count was incremented twice? Now that the real task is done, the 
> done counter has been increased, but it will never match the scheduled counter.
> The code could also check whether ZK is actually depleted, and in that case 
> treat the scheduled task as bogus? This of course only treats the symptom, 
> not the root cause of this condition. 
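
The hypothesis in the quoted description (one entry keyed by path, but the 
scheduled counter bumped twice) can be sketched as a toy model. Everything 
here is an assumption for illustration: BatchCountSketch and its methods are 
hypothetical, the in-memory map merely stands in for the splitlog znode, and 
the double increment is the reporter's guess, not a confirmed root cause.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of the counter mismatch; not the real TaskBatch.
public class BatchCountSketch {
    // Stand-in for the task entries under the splitlog znode, keyed by path.
    final Map<String, String> tasksByPath = new HashMap<>();
    int scheduled;
    int done;

    void enqueue(String path) {
        // Keyed by path, so a second enqueue of the same path adds no entry...
        tasksByPath.putIfAbsent(path, "pending");
        // ...but (the guessed bug) the counter is bumped on every call.
        scheduled++;
    }

    void markDone(String path) {
        if (tasksByPath.remove(path) != null) {
            done++;
        }
    }

    // Strict check: can never succeed once scheduled has been over-counted.
    boolean finished() {
        return done == scheduled;
    }

    // Symptom-level workaround suggested in the report: also stop waiting
    // when no task entries are left at all.
    boolean finishedOrDepleted() {
        return finished() || tasksByPath.isEmpty();
    }

    public static void main(String[] args) {
        BatchCountSketch batch = new BatchCountSketch();
        batch.enqueue("/hbase/splitlog/task-a"); // hypothetical path
        batch.enqueue("/hbase/splitlog/task-a"); // same task seen after a restart
        batch.markDone("/hbase/splitlog/task-a");
        System.out.println("scheduled=" + batch.scheduled + " done=" + batch.done);
        System.out.println("finished=" + batch.finished()
            + " finishedOrDepleted=" + batch.finishedOrDepleted());
    }
}
```

In this model the strict check stays false at scheduled=2 done=1, as in the 
log excerpt, while the depletion check lets the wait end. As the report notes, 
that only masks the over-count rather than fixing it.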

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
