[ 
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16658270#comment-16658270
 ] 

stack commented on HBASE-21354:
-------------------------------

Makes sense [~allan163] Nice find sir. Here's hoping this addresses the weird 
issue I've seen when lots of chaos where I cannot clean up a Procedure because 
another holds a lock but the 'other' no longer exists.

nit: These kind of logs w/o adding context -- name of the file being recovered 
-- can be useless....          LOG.debug("Starting WAL Procedure Store lease 
recovery"); 

Great test.

> Procedure may be deleted improperly during master restarts resulting in 
> 'Corrupt'
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-21354
>                 URL: https://issues.apache.org/jira/browse/HBASE-21354
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.1.0, 2.0.2
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-21354.branch-2.0.001.patch, 
> HBASE-21354.branch-2.0.002.patch, HBASE-21354.branch-2.0.003.patch
>
>
> Good news! [~stack], [~Apache9], I may find the root cause of mysterious 
> ‘Corrupted procedure’ or some procedures disappeared after master 
> restarts(happens during ITBLL).
> This is because during master restarts, we load procedures from the log, and 
> builds the 'holdingCleanupTracker' according each log's tracker. We may mark 
> a procedure in the oldest log as deleted if one log doesn't contain the 
> procedure. This is Inappropriate since one log will not contain info of the 
> log if this procedure was not updated during the time. We can only delete the 
> procedure only if it is not in the global tracker, which have the whole 
> picture.
> {code}
> trackerNode = tracker.lookupClosestNode(trackerNode, procId);
>         if (trackerNode == null || !trackerNode.contains(procId) ||
>           trackerNode.isModified(procId)) {
>           // the procedure was removed or modified
>           node.delete(procId);
>         }
> {code}
> A test case(testProcedureShouldNotCleanOnLoad) shows cleanly how the 
> corruption happened in the patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to