[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000131#comment-14000131
 ] 

stack commented on HBASE-8763:
------------------------------

[~jeffreyz]

Excellent work.  Thanks for the writeup.  Helps.

bq. After mvcc & log sequence combin....

We can do a bunch of stuff...

But I like your idea about clients being able to read at least to last flush.  
Sweet.

Some comments on the patch:

Should beginMemstoreInsert assign a write mvcc number at all?

This won't work:

+        w = mvcc.beginMemstoreInsert(this.sequenceId.incrementAndGet());

Or at least I don't think you are getting what you think you are getting.  See 
the note on sequenceid.  I think you have to use the new getNextSequenceId.  
Then you do this +          flushSeqId = getNextSequenceId(wal); at end of 
flush.  Do we need the earlier mvcc?

You do it thrice I see (There is 'lag' between adding of append to ring buffer 
and it being consumed and its edit/sequence id being updated... perhaps an edit 
is in the ring buffer, you update the sequence id thinking you are getting the 
last sequence id but subsequently, the ring buffer consumer runs...).



Misspelling: waitForPreviousTransactoinsComplete

I asked already and I think you explained but still not sure what is going on 
here:

+      mvcc.waitForPreviousTransactoinsComplete(w);
+      w = null;

Call this appendNoSyncFakedWALEdit appendNoSyncNoAppend?  It will make folks 
sit up and wonder why you would do such a thing and they'd read the javadoc.  
It is more explicit than appendNoSyncFakedWALEdit.



I'd think the below is temporary? It is convenient updating once and all 
related get the update.  Is that the only reason?  Could we iterate the KVs 
that make up the edit and set their MVCC?  The edits are not yet in the 
memstore, right?  They get their mvcc before they are added to the MemStore? 
(Always?)

-  private long mvcc = 0;  // this value is not part of a serialized KeyValue 
(not in HFiles)
+  private MutableLong mvcc; // this value is not part of a serialized KeyValue 
(not in HFiles)

This is the clone that we need to get rid of, right (smile)?

-    newKv.setMvccVersion(kv.getMvccVersion());
+    newKv.setMvccVersion(kv.getMvccVersionReference());

nit: don't need to set the memstoreRead... it is done in the declaration.

-    this.memstoreRead = this.memstoreWrite = 0;
+    memstoreRead = 0;

Set top bits rather than add a big number?: +    curSeqNum.setValue(originalVal 
+ 1000000000);

Do we need to have MemStore know about HLogKeys?  And that they have this odd 
'waiting' thing that you can do on them?

+      long newSeqNum = walKey.waitForLogSeqNumAssigned();
+      e.setWriteNumber(newSeqNum);

Could we do the wait in the WAL system before we call 
completeMemstoreInsertWithSeqNum passing in the sequence id to use?  Could the 
consumer on the ring buffer call the Memstore. completeMemstoreInsertWithSeqNum?

Hmm... maybe we should make the writeQueue a disruptor too?  Later.

+1 on moving the latch from waledit to walkey.  Nice.

You fixing this test? TestMultiParallel

Patch looks good.  Can give a closer review later.  This first pass should do 
for now.

Great stuff Jeffrey



> [BRAINSTORM] Combine MVCC and SeqId
> -----------------------------------
>
>                 Key: HBASE-8763
>                 URL: https://issues.apache.org/jira/browse/HBASE-8763
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Enis Soztutar
>            Assignee: Jeffrey Zhong
>            Priority: Critical
>         Attachments: HBase MVCC & LogSeqId Combined.pdf, 
> hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
> hbase-8763-v2.patch, hbase-8763_wip1.patch
>
>
> HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
> seqId semantics. It seems that having mvcc and the seqId complicates the 
> comparator semantics a lot in regards to flush + WAL replay + compactions + 
> delete markers and out of order puts. 
> Thinking more about it I don't think we need a MVCC write number which is 
> different than the seqId. We can keep the MVCC semantics, read point and 
> smallest read points intact, but combine mvcc write number and seqId. This 
> will allow cleaner semantics + implementation + smaller data files. 
> We can do some brainstorming for 0.98. We still have to verify that this 
> would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to