[ https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558167#comment-13558167 ]
stack commented on HBASE-7329: ------------------------------ Patch looks great. I like the analysis that is done in the below citation. I think the below would work well as release note for this issue. https://issues.apache.org/jira/browse/HBASE-7329?focusedCommentId=13554658&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13554658 That'd be cool if you can close hbase-7011 too. From 7011, "You say the zombie RS 'should fail' when the dir is renamed under it. I haven't looked but if we done't already, sounds like we should add a test that proves it." Is there such a test in this issue or, do you know if we rename the directory before we start log splitting? Have you done much testing of this patch SS? Nit: Below looks like it should be class comment rather than internal implementation comment or do you think otherwise? + /** + * Contains the number of outstanding operations, as well as flags. + * Initially, the number of operations is 1. Each beginOp increments, and endOp decrements it. + * beginOp does not proceed when it sees the draining flag. When stop is called, it atomically + * decrements the number of operations (the initial 1) and sets the draining flag. If stop did + * the decrement to zero, that means there are no more operations outstanding, so stop is done. + * Otherwise, stop blocks, and the endOp that decrements the count to 0 unblocks it. + */ bq. Initially, the number of operations is 1. Is it right having it at 1 when we construct the class? Should we wait on first beginOp call? Nice test. It is not your fault but that is sure an ugly name on a method, getCompleteCacheFlushSequenceId. From its name you would not know what it is for. In fact, if you want to remove it it looks like you could since 'TransactionalRegion' is a facility that no longer exists and going forward if you wanted to do this kinda thing, you'd do it via a coprocessor. Is it right getting seqid before we advance memstore? We used to do it other way around. Thanks for doing this rename: - private final ConcurrentSkipListMap<byte [], Long> lastSeqWritten = + private final ConcurrentSkipListMap<byte [], Long> oldestUnflushedSeqNums = I wonder if the log of the roll should be outside of the lock...? Could be a bit sloppy but maybe it does not have to be too precise? Let me try and do another review before monday (but don't let this hold up commit I'd say...) > remove flush-related records from WAL and make locking more granular > -------------------------------------------------------------------- > > Key: HBASE-7329 > URL: https://issues.apache.org/jira/browse/HBASE-7329 > Project: HBase > Issue Type: Improvement > Components: wal > Affects Versions: 0.96.0 > Reporter: Sergey Shelukhin > Assignee: Sergey Shelukhin > Fix For: 0.96.0 > > Attachments: 7329-findbugs.diff, HBASE-7329-v0.patch, > HBASE-7329-v0.patch, HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch, > HBASE-7329-v1.patch, HBASE-7329-v2.patch, HBASE-7329-v3.patch, > HBASE-7329-v4.patch, HBASE-7329-v5.patch > > > Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush > records in WAL are not useful. If so, they should be removed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira