[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v5.2.patch The V5.2 patch addressed [~saint@gmail.com] comments except one that I can't use a static EMPTY_HLOGKEY in function HRegion#appendNoSyncNoAppend because I need hlog key to return assigned log sequence Id. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, hbase-8763-v5.1.patch, hbase-8763-v5.2.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v5.1.patch The v5.1 is the patch which keep KeyValue untouched(v5 is using MutableLong for mvcc in KeyVavlue) because there is a concern on the possible GC pressure for the new object as it may stay in heap for a while till memstore flush. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, hbase-8763-v5.1.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v5.patch Rebase master branch and address [~saint@gmail.com] comments. {quote} + mvccNum.setValue(this.sequenceId.incrementAndGet()); Does it have to increment? Could you just get current? Would that work? {quote} Yes. get current won't work because there could be multiple changes for the same row. In that case, the later coming change would have same MVCC value which makes the ordering of KVs in Heap questionable. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v4.patch The v4 patch addressed [~saint@gmail.com] comments and clean the java doc warnings. Thanks. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v2.patch Full test suite passed locally. TestAcidGuarantees passed 10 times in one loop. [~saint@gmail.com] I've moved the sequenceId wait for assignment from ring buffer consumer to rpc handlers. Hopefully we can restore the performance. After mvcc log sequence combining, one idea I come up today is that we can introduce a client read flushed changes only mode. In this mode, a client only read changes are flushed. During recovery we can set its scanner read point to last flushed sequence id while the region is still under recovery. The total recovery time for those clients are failure detection time + region assignment time. I also attached a write up on this JIRA. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v2.patch [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: (was: hbase-8763-v2.patch) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: HBase MVCC LogSeqId Combined.pdf [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v3.patch The v3 fix a test case failure trigger QA run [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: HBase MVCC LogSeqId Combined.pdf, hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v1.patch The v1 patch runs fine in my local test env. Trigger one more QA run. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: (was: hbase-8763-v1.patch) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: (was: hbase-8763.patch) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v1.patch [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Status: Patch Available (was: Open) Trigger QA run [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763.patch Submit the patch to let QA run to see how many failures. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763.patch [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: (was: hbase-8763.patch) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-poc-v1.patch This is updated POC version. All small/medium tests including TestAtomicOperation are passed except TestExportSnapshot. I'll try to fix it soon. Thanks. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8763: - Priority: Critical (was: Major) Marking critical. Let's make a decision on this. A few other developments in the pipeline are simplified if we do this. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8736-poc.patch I attached the Proof of Concept patch and there are a few unit tests failed. The basic idea is like we have two buckets: one has all in progress writes and the other contains all committed writes. Once a write is done we assign a log sequence number(after sync) and move it to the committed bucket by bumping up the mvcc read point to be = the log sequence number. The major changes are in HRegion.java and MultiVersionConsistencyControl.java. Since Increment, Append and etcs need to wait for all previous in-progress transactions complete, I have to keep a collection MultiVersionConsistencyControl#inProgressWrites. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8763: -- Status: Patch Available (was: Open) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8763: -- Status: Open (was: Patch Available) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8763: -- Fix Version/s: (was: 0.98.0) Moving out of 0.98. Put back if you feel otherwise. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Attachments: hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-8763: - Attachment: hbase-8763_wip1.patch Attaching the patch which is a wip. Feel free to comment though. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Fix For: 0.98.0 Attachments: hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira