[ https://issues.apache.org/jira/browse/HBASE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated HBASE-4335:
---------------------------------
    Attachment: 4335-v3.txt

New patch. Breaks SplitTransaction.execute into three parts, in part to make the phases clear, in part so that a test can exercise each of the phases independently.

Also added a test. The test uses phaseI and phaseIII directly and mocks a bit with phaseII (that's the one that brings the daughters online and updates .META.).

I could validate that if I change the order back to what it was before this patch, the client would indeed reach the wrong region when querying past the split key and would (before HBASE-4334) silently return an empty result set.

Let me know what you think about this change.
TestSplitTransaction and the new TestEndToEndSplitTransaction pass.

> Splits can create temporary holes in .META. that confuse clients and
> regionservers
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-4335
>                 URL: https://issues.apache.org/jira/browse/HBASE-4335
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.4
>            Reporter: Joe Pallas
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 4335-v2.txt, 4335-v3.txt, 4335.txt
>
>
> When a SplitTransaction is performed, three updates are done to .META.:
> 1. The parent region is marked as splitting (and hence offline)
> 2. The first daughter region is added (same start key as parent)
> 3. The second daughter region is added (split key is start key)
> (later, the original parent region is deleted, but that's not important to
> this discussion)
> Steps 2 and 3 are actually done concurrently by
> SplitTransaction.DaughterOpener threads.  While the master is notified when a
> split is complete, the only visibility that clients have is whether the
> daughter regions have appeared in .META.
> If the second daughter is added to .META. first, then .META. will contain the
> (offline) parent region followed by the second daughter region.
> If the client looks up a key that is greater than (or equal to) the split, the
> client will find the second daughter region and use it.  If the key is less
> than the split key, the client will find the parent region and see that it is
> offline, triggering a retry.
> If the first daughter is added to .META. before the second daughter, there is
> a window during which .META. has a hole: the first daughter effectively hides
> the parent region (same start key), but there is no entry for the second
> daughter.  A region lookup will find the first daughter for all keys in the
> parent's range, but the first daughter does not include keys at or beyond the
> split key.
> See HBASE-4333 and HBASE-4334 for details on how this causes problems and
> suggestions for mitigating this in the client and regionserver.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
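
The window described in the issue can be illustrated with a small toy model. This is only a sketch in Python, not HBase code: `ToyMeta`, `client_lookup`, `split_window`, the region names, and the keys "a", "m", "z" are all hypothetical. It assumes that a lookup returns the entry with the greatest start key at or below the requested key, and that an entry with the same start key as the parent hides the parent.

```python
import bisect


class ToyMeta:
    """Toy stand-in for .META. (an assumption, not the real catalog layout)."""

    def __init__(self):
        self._start_keys = []  # start keys, kept sorted
        self._regions = {}     # start key -> (name, end key, online)

    def put(self, name, start, end, online=True):
        # An entry with an existing start key replaces what lookups see,
        # mimicking how the first daughter hides the parent.
        if start not in self._regions:
            bisect.insort(self._start_keys, start)
        self._regions[start] = (name, end, online)

    def locate(self, key):
        # Return the entry with the greatest start key <= key, if any.
        i = bisect.bisect_right(self._start_keys, key) - 1
        if i < 0:
            return None
        return self._regions[self._start_keys[i]]


def client_lookup(meta, key):
    """Describe what a client would conclude from a lookup of `key`."""
    entry = meta.locate(key)
    if entry is None:
        return "no region"
    name, end, online = entry
    if not online:
        return "retry"  # offline parent found, so the client retries
    if key >= end:
        return "wrong region: " + name  # key lies outside the region found
    return name


def split_window(first_daughter_added):
    """Catalog state after the parent is offlined and ONE daughter is added."""
    meta = ToyMeta()
    meta.put("parent", "a", "z", online=False)  # step 1: parent offline
    if first_daughter_added == "A":
        meta.put("daughterA", "a", "m")         # step 2 landed first
    else:
        meta.put("daughterB", "m", "z")         # step 3 landed first
    return meta


if __name__ == "__main__":
    for order in ("B", "A"):
        meta = split_window(order)
        print(order, client_lookup(meta, "c"), "|", client_lookup(meta, "q"))
```

In this model, adding the second daughter first leaves every lookup either correct or retried, while adding the first daughter first sends a lookup for "q" to a region that does not hold that key. That matches the hole described in the issue.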