[ 
https://issues.apache.org/jira/browse/HBASE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-7172:
---------------------------------

       Resolution: Fixed
    Fix Version/s: 0.94.4
                   0.96.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

I've committed this to trunk and 0.94. Thanks Lars and Stack for reviews.
                
> TestSplitLogManager.testVanishingTaskZNode() fails when run individually and 
> is flaky
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-7172
>                 URL: https://issues.apache.org/jira/browse/HBASE-7172
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.96.0, 0.94.4
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: hbase-7172_v1.patch, hbase-7172_v2-0.94.patch, 
> hbase-7172_v2.patch
>
>
> TestSplitLogManager.testVanishingTaskZNode fails when run individually (run 
> just that test case from Eclipse). I've also noticed that it is flaky on 
> Windows. 
> The cause is a race condition that, for some reason, occurs much less often 
> when the whole class is run.
> The sequence of events is something like this:
>  - we create 1 log file to split
>  - we call splitLogDistributed() in its own thread. 
>  - splitLogDistributed() waits in waitForSplittingCompletion(); since 
> there are no splitlogworkers, it keeps waiting.
>  - we delete the task znode from zk
>  - SplitLogManager receives the zk callback from GetDataAsyncCallback, which 
> will call setDone() and mark the task as success. 
>  - However, meanwhile the waitForSplittingCompletion() loop sees that 
> remainingInZK == 0 and returns, concurrently with the above. 
>  - on return from waitForSplittingCompletion(), splitLogDistributed() fails 
> because the znode delete callback has not completed yet. 
> This race only happens when the last task is deleted from zk, and normally 
> only the SplitLogManager deletes the task znodes, after processing them, so I 
> don't think this is a production issue.
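
The interleaving described above can be reproduced in isolation. The following is a minimal sketch only, not the SplitLogManager code and not necessarily what the attached patches do: the class, the `Batch` type, the `done` flag and the helper methods are hypothetical stand-ins, and the two latches exist purely to force the bad ordering deterministically. It assumes the callback decrements `remainingInZK` before it marks the task done, which is what the description implies.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the race; names are illustrative, not HBase APIs.
final class VanishingTaskRaceSketch {

    // Stand-in for the per-batch bookkeeping mentioned in the report.
    static final class Batch {
        final AtomicInteger remainingInZK = new AtomicInteger(1); // one task znode
        volatile boolean done = false;                            // set by the callback
    }

    // Returns the value of batch.done that the waiter observes when it returns.
    // alsoWaitForDone=false models the racy wait; true models one possible guard.
    static boolean waiterSeesDone(boolean alsoWaitForDone) {
        final Batch batch = new Batch();
        final CountDownLatch znodeGone = new CountDownLatch(1);
        final CountDownLatch waiterReturned = new CountDownLatch(1);

        // Models the ZooKeeper callback thread reacting to the deleted znode.
        Thread callback = new Thread(() -> {
            batch.remainingInZK.decrementAndGet(); // the znode is gone from zk
            znodeGone.countDown();
            if (!alsoWaitForDone) {
                // Force the bad interleaving: the waiter returns first.
                awaitQuietly(waiterReturned);
            }
            batch.done = true; // analogous to marking the task as success
        });
        callback.start();

        // Models the waiting loop in the thread that started the split.
        awaitQuietly(znodeGone);
        while (batch.remainingInZK.get() != 0 || (alsoWaitForDone && !batch.done)) {
            Thread.yield();
        }
        boolean observed = batch.done;
        waiterReturned.countDown();
        joinQuietly(callback);
        return observed;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("racy waiter saw done=" + waiterSeesDone(false));   // false
        System.out.println("guarded waiter saw done=" + waiterSeesDone(true)); // true
    }
}
```

In the racy variant the waiter returns as soon as the counter reaches zero, so it can observe the task as not yet done. Waiting on the completion flag as well closes that window in this sketch.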

