subject:"\[jira\] \[Commented\] \(HBASE\-8763\) \[BRAINSTORM\] Combine MVCC and SeqId"

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-06-06 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020272#comment-14020272
]

Hadoop QA commented on HBASE-8763:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12648663/hbase-8763-v5.2.patch
against trunk revision .
ATTACHMENT ID: 12648663

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 15 new
or modified tests.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3
warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/9708//console

This message is automatically generated.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
Attachments: HBase MVCC LogSeqId Combined.pdf,
hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch,
hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch,
hbase-8763-v5.1.patch, hbase-8763-v5.2.patch, hbase-8763-v5.patch,
hbase-8763_wip1.patch

HBASE-8701 and a lot of recent issues include good discussions about mvcc +
seqId semantics. It seems that having mvcc and the seqId complicates the
comparator semantics a lot in regards to flush + WAL replay + compactions +
delete markers and out of order puts.
Thinking more about it I don't think we need a MVCC write number which is
different than the seqId. We can keep the MVCC semantics, read point and
smallest read points intact, but combine mvcc write number and seqId. This
will allow cleaner semantics + implementation + smaller data files.
We can do some brainstorming for 0.98. We still have to verify that this
would be semantically correct, it should be so by my current understanding.

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-06-06 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020352#comment-14020352
]

stack commented on HBASE-8763:
--

All above comments are good by me.

This is good: getPreAssignedWriteNumber

+1 Please fix the javadoc on commit and file follow on issues. Please also add
a comment that explains why the Pair is needed and conditions under which we
can remove it. Also explain in comment why the MutableLong is used though it
seems it not needed.

bq. We don't have setMvccVersion in Cell interface. Do you want to create one?

Ugh. Cell Interface is read-only as it should be. Would it make sense having
another Interface, MVCCSettable with a setMvccVersion in it? This would be
server-side only? Is there ever a time when we know the mvcc constructing a
Cell such that we can shove it in on Construction other than at deserializing
or clone time?

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-06-05 Thread Jeffrey Zhong (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14019021#comment-14019021
 ] 

Jeffrey Zhong commented on HBASE-8763:
--

[~saint@gmail.com] Thanks for the review and good comments.
{quote}
Why return a Pair? We are passing in the Cell? Can't caller use passed in Cell 
and just leave addTo returning long as before (I am missing something I know... 
pardon my being slow). We are making a new object each time we add to memstore, 
the Pair.
{quote}
That's the issue on how we add cells to memestore today. Basically it does copy 
the passed in Cells and put newly created Cells into memstore. So I have to 
return those newly added Cells to update mvcc later(because JAVA doesn't 
support reference to a primitive type). In future, we shouldn't make copies 
when add cells into memstore.

{quote}
If so, a comment on the @param would help
{quote}
Sure. I'll do that.

{quote}
 create a new one each time through? Just as we have WALEdit.EMPTY_WALEDIT?
{quote}
Good point! I'll do that.

{quote}
On beginMemstoreInsert, why take a value at all
{quote}
It's used by beginMemstoreInsertWithSeqNum. One way is that I can let 
beginMemstoreInsert to call beginMemstoreInsertWithSeqNum instead. After this, 
I think it can address several following comments.

{quote}
NO_SEQUENCE_ID Should be a define in your new SequenceId interface?
A comment on wny you do 'w = null;' would be helpful in flush: e.g. Set to 
null to indicate success
Change name of memstoreKVs to be memstoreCells (be forward thinking!)
{quote}
Ok, I'll change that.

{quote}
Do we need the MutableLong here still? 
{quote}
The reason is that I want to hide the fact of bumping 1 billion number inside 
this function and keep the bumping in one place. I could define a constant for 
this purpose

{quote}
Can these be lists of Cells rather than private final transient ListKeyValue 
memstoreKVs;? You can do cell.setMvccVersion.
{quote}
We don't have setMvccVersion in Cell interface. Do you want to create one?

{quote}
Are we not passing the KVs twice? Once in WALEdits and then again in this new 
memstoreKVs argument?
{quote}
My firstly version(not published) is using KVs in WALEdits while 
HRegion#doProcessRowWithTimeout let clients to create WALEdits so it's 
impossible to merge those two lists.

{quote}
Man, the mvcc stuff should be redone w/ disruptor. Looks like ideal disruptor 
case.
{quote}
The issue lands in we have different durability modes so the disruptor way 
maybe hard. As in the earlier review thread, we could remove mvcc writeQueue 
while it needs to keep the sequence id of the last unflushed edit with sync_wal 
durability. Let the optimization be done later.










 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: HBase MVCC  LogSeqId Combined.pdf, 
 hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
 hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, 
 hbase-8763-v5.1.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-06-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018163#comment-14018163
 ] 

stack commented on HBASE-8763:
--

Looking at 5.1:

Why return a Pair?  We are passing in the Cell?  Can't caller use passed in 
Cell and just leave addTo returning long as before (I am missing something I 
know... pardon my being slow).  We are making a new object each time we add to 
memstore, the Pair.

Why would the below take a list of kvs to appendNoSyncNoAppend?  When would it 
ever make sense passing a list of kvs?  (Hmm... I think I see why -- when we 
want to just set an mvcc on the KV though we are not appending it -- is that 
right?  If so, a comment on the @param would help)

In the appendNoSyncNoAppend we make HLogKey.  We call System.currentTimeMillis. 
 This edit is never appended.  Can we NOT call System.currentTimeMillis?  Just 
pass a -1 or something instead?  Or can we make a noop HLogKey defined as a 
static in HLogKey and just use it every time rather than create a new one each 
time through?  Just as we have WALEdit.EMPTY_WALEDIT?

On beginMemstoreInsert, why take a value at all?  Wny not just return the 
WriteEntry that has special value for the write number?  If we ever try to use 
this number advancing the read point, throw exceptions?  Remove the current 
beginMemstoreInsert that does not take an entry?  I see that in your new 
method, waitForPreviousTransactionsComplete, you put something into the mvcc 
queue w/ a HLog.NO_SEQUENCE_ID and wait for this edit to go around so you can 
be sure queue is cleared.  So you have second use for special mvcc/sequenceid 
number.  Should the NO_SEQUENCE_ID be it?  and you just use it when 
beginMemstoreInsert is called setting it into the WriteEntry?  Should the 
number even come from HLog?  Could it be private to this class?
  
When we do waitForPreviousTransactionsComplete, does WriteEntry have a 
writeNumber set? Or is it NO_SEQUENCE_ID?  If it is the latter, yeah, just 
change beginMemstoreInsert to not take a param, or at least, not take this 
particular one because it is means of asking for a special behavior.  If the 
writeNumber is set, where does that happen?

NO_SEQUENCE_ID Should be a define in your new SequenceId interface?
 
A comment on wny you do 'w = null;' would be helpful in flush: e.g. Set to 
null to indicate success

Change name of memstoreKVs to be memstoreCells (be forward thinking!)

I am not clear still on why the below is ok up in HRegion#doMiniBatchMutation 
(Do we need the MutableLong here still? Why not just set the sequenceid into 
beginMemstoreInsertWithSeqNum and you are doing this when you use it   
kv.setMvccVersion(mvccNum.longValue());)

  mvccNum.setValue(this.sequenceId.incrementAndGet());

At a minimum it needs a comment explaining why (Sorry if I am being dense here).

You know why we add to memstore first before WAL?  For speed IIRC.  I should go 
research it.  This rollback stuff could be tricky.

So then here:

mvccNum.setValue(this.sequenceId.incrementAndGet());
w = mvcc.beginMemstoreInsertWithSeqNum(mvccNum);

We are getting a seqid and setting it as write number. We have not yet gone on 
the ring buffer.  Every edit is getting a write number like this?  MVCC read 
number happens only after the WAL append has happened.

Man, the mvcc stuff should be redone w/ disruptor.  Looks like ideal disruptor 
case.

StoreFlusher change no longer needed?

Can these be lists of Cells rather than   private final transient 
ListKeyValue memstoreKVs;?  You can do cell.setMvccVersion.

Why this?

  long appendNoSync(HTableDescriptor htd, HRegionInfo info, HLogKey key, 
WALEdit edits,
  AtomicLong sequenceId, boolean inMemstore, ListKeyValue memstoreKVs)

Unnecssary import in HLogSplitter?  Unnecessary change in WALEdit?





Are we not passing the KVs twice?  Once in WALEdits and then again in this new 
memstoreKVs argument?

I'm running tests now to see what this patch does for performance.  After our 
chat yesterday, yes, I see, it should not have much of an impact (especially 
looking at what you did in FSHLog).  That'd be cool.

I'm excited about this patch coming in.  Great work Mr. Zhong.





















 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: HBase MVCC  LogSeqId Combined.pdf, 
 hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
 hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, 
 hbase-8763-v5.1.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-06-04 Thread Liyin Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018165#comment-14018165
 ] 

Liyin Tang commented on HBASE-8763:
---

Hi, I am out of office since 9/1/2012 to 9/16/2012 and I cannot access to this 
email.
In urgent case, please forward your email to liyint...@gmail.com

Thanks a lot
Liyin


 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: HBase MVCC  LogSeqId Combined.pdf, 
 hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
 hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, 
 hbase-8763-v5.1.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-06-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018385#comment-14018385
 ] 

stack commented on HBASE-8763:
--

This patch seems to get us our speed back.  Good on one [~jeffreyz]

Below do basic 1, 5, 25, and 200 threads test:

Here is nopatch, the current master branch:
{code}
nopatch1.1.txt:2014-06-04 12:11:52,176 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=1, iterations=100, 
syncInterval=0 took 1224.313s 816.785ops/s
nopatch5.1.txt:2014-06-04 12:29:02,864 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=5, iterations=100, 
syncInterval=0 took 1025.163s 4877.273ops/s
nopatch25.1.txt:2014-06-04 12:53:02,973 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=25, iterations=100, 
syncInterval=0 took 1434.641s 17425.963ops/s
nopatch200.1.txt:2014-06-04 13:40:30,333 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=200, iterations=100, 
syncInterval=0 took 2841.947s 70374.289ops/s
{code}

Here is w/ patch applied:
{code}
patch1.1.txt:2014-06-04 14:37:04,973 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=1, iterations=100, 
syncInterval=0 took 1228.775s 813.819ops/s
patch5.1.txt:2014-06-04 14:53:53,623 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=5, iterations=100, 
syncInterval=0 took 1003.234s 4983.882ops/s
patch25.1.txt:2014-06-04 15:17:17,952 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=25, iterations=100, 
syncInterval=0 took 1398.927s 17870.840ops/s
patch200.1.txt:2014-06-04 15:47:36,297 INFO  [main] 
wal.HLogPerformanceEvaluation: Summary: threads=200, iterations=100, 
syncInterval=0 took 1813.013s 110313.609ops/s
{code}

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: HBase MVCC  LogSeqId Combined.pdf, 
 hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
 hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, 
 hbase-8763-v5.1.patch, hbase-8763-v5.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-30 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014466#comment-14014466
]

Hadoop QA commented on HBASE-8763:
--

{color:green}+1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12647731/hbase-8763-v5.1.patch
against trunk revision .
ATTACHMENT ID: 12647731

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 15 new
or modified tests.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/9654//console

This message is automatically generated.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-27 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010314#comment-14010314
]

stack commented on HBASE-8763:
--

Patch looks great.

On commit, remove this comment. Doesn't seem appropriate to the statement that
follows:

+// Record the mvcc for all transactions in progress.

You need to do this?

-
+ mvccNum.setValue(this.sequenceId.incrementAndGet());

I'm trying to keep it so all sequenceid increments happen inside HLog only.

You do it in a few places. Does it have to increment? Could you just get
current? Would that work?

This should almost be a method because it happens more than a few times:

+ if(walKey == null){
+// Append a faked WALEdit in order for SKIP_WAL updates to get
mvccNum assigned
+walKey = this.appendNoSyncNoAppend(this.log, mvccNum);
+ }

Rename SequenceNumberAssignor as SequenceNumber and method as getSequenceNumber?

If no one else reviews in next few days, I'll give it another go (having
trouble concentrating on this because I've looked at a few versions of this
patch). Overall I think this is excellent.

Maybe we should just commit and then tune up in new issues?

Not sure we can get our speedup back looking at this patch at the mo.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-27 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010336#comment-14010336
]

Hadoop QA commented on HBASE-8763:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12646094/hbase-8763-v4.patch
against trunk revision .
ATTACHMENT ID: 12646094

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 12 new
or modified tests.

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/9607//console

This message is automatically generated.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-27 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010637#comment-14010637
 ] 

stack commented on HBASE-8763:
--

Then, does it have to be getSequenceId?  There may be appends in flight inside 
in the disruptor and so you'll get a sequence id that may be superceded?

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: HBase MVCC  LogSeqId Combined.pdf, 
 hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
 hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763-v4.patch, 
 hbase-8763-v5.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-27 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010682#comment-14010682
]

Hadoop QA commented on HBASE-8763:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12647028/hbase-8763-v5.patch
against trunk revision .
ATTACHMENT ID: 12647028

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 12 new
or modified tests.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}. The applied patch generated 31 release
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//testReport/
Release audit warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/9610//console

This message is automatically generated.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-27 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010789#comment-14010789
]

Jeffrey Zhong commented on HBASE-8763:
--

{quote}
There may be appends in flight inside in the disruptor and so you'll get a
sequence id that may be superceded?
{quote}
Yes. The mvccNum.setValue(this.sequenceId.incrementAndGet()); here ensures the
ordering of KVs when inserted into memstore for the same row key and their mvcc
values will be replaced later with actual sequence id while the ordering is
maintained. Because appendNoSync also happens while rowlock is held so their
actual sequence Ids maintain the same ordering.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-16 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000131#comment-14000131
]

stack commented on HBASE-8763:
--

[~jeffreyz]

Excellent work. Thanks for the writeup. Helps.

bq. After mvcc log sequence combin

We can do a bunch of stuff...

But I like your idea about clients being able to read at least to last flush.
Sweet.

Some comments on the patch:

Should beginMemstoreInsert assign a write mvcc number at all?

This won't work:

+w = mvcc.beginMemstoreInsert(this.sequenceId.incrementAndGet());

Or at least I don't think you are getting what you think you are getting. See
the note on sequenceid. I think you have to use the new getNextSequenceId.
Then you do this + flushSeqId = getNextSequenceId(wal); at end of
flush. Do we need the earlier mvcc?

You do it thrice I see (There is 'lag' between adding of append to ring buffer
and it being consumed and its edit/sequence id being updated... perhaps an edit
is in the ring buffer, you update the sequence id thinking you are getting the
last sequence id but subsequently, the ring buffer consumer runs...).

Misspelling: waitForPreviousTransactoinsComplete

I asked already and I think you explained but still not sure what is going on
here:

+ mvcc.waitForPreviousTransactoinsComplete(w);
+ w = null;

Call this appendNoSyncFakedWALEdit appendNoSyncNoAppend? It will make folks
sit up and wonder why you would do such a thing and they'd read the javadoc.
It is more explicit than appendNoSyncFakedWALEdit.

I'd think the below is temporary? It is convenient updating once and all
related get the update. Is that the only reason? Could we iterate the KVs
that make up the edit and set their MVCC? The edits are not yet in the
memstore, right? They get their mvcc before they are added to the MemStore?
(Always?)

- private long mvcc = 0; // this value is not part of a serialized KeyValue
(not in HFiles)
+ private MutableLong mvcc; // this value is not part of a serialized KeyValue
(not in HFiles)

This is the clone that we need to get rid of, right (smile)?

-newKv.setMvccVersion(kv.getMvccVersion());
+newKv.setMvccVersion(kv.getMvccVersionReference());

nit: don't need to set the memstoreRead... it is done in the declaration.

-this.memstoreRead = this.memstoreWrite = 0;
+memstoreRead = 0;

Set top bits rather than add a big number?: +curSeqNum.setValue(originalVal
+ 10);

Do we need to have MemStore know about HLogKeys? And that they have this odd
'waiting' thing that you can do on them?

+ long newSeqNum = walKey.waitForLogSeqNumAssigned();
+ e.setWriteNumber(newSeqNum);

Could we do the wait in the WAL system before we call
completeMemstoreInsertWithSeqNum passing in the sequence id to use? Could the
consumer on the ring buffer call the Memstore. completeMemstoreInsertWithSeqNum?

Hmm... maybe we should make the writeQueue a disruptor too? Later.

+1 on moving the latch from waledit to walkey. Nice.

You fixing this test? TestMultiParallel

Patch looks good. Can give a closer review later. This first pass should do
for now.

Great stuff Jeffrey

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000519#comment-14000519
 ] 

stack commented on HBASE-8763:
--

+1 on removing waitQueue in new JIRA

bq. I'll pass HLog.NO_SEQUENCE_ID value then.

Yes.  I tried to make it so only the WAL system does the sequence id update so 
these this.sequenceId.incrementAndGet()  make me go yuck (smile)

+1 on MutableLong (I may ask you to make the same explanation again in the 
future so remember this bit of text so you can copy/paste it next time I ask).

Ok on not setting top bits. Add comments on how large needs to be 
(theoretically). Helps others coming along after better understand.

On MemStore knowing about WALKey, as in+  public void 
completeMemstoreInsertWithSeqNum(WriteEntry e, HLogKey walKey) throws 
IOException {, that is just butt ugly (pardon the technical term).  Doing 
'+import org.apache.hadoop.hbase.regionserver.wal.HLogKey;' is so ugly, I'd 
suggest we have an Interface that has one method in it -- 
waitForLogSeqNumAssigned or getSequenceId or getLogSeqNumAssigned (and just 
hide the fact that we are waiting) -- and pass that instead.

Good stuff Jay-Z



 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: HBase MVCC  LogSeqId Combined.pdf, 
 hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, 
 hbase-8763-v2.patch, hbase-8763-v3.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-16 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000515#comment-14000515
]

Hadoop QA commented on HBASE-8763:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12645323/hbase-8763-v3.patch
against trunk revision .
ATTACHMENT ID: 12645323

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 12 new
or modified tests.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 6
warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/9526//console

This message is automatically generated.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-16 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000581#comment-14000581
]

Jeffrey Zhong commented on HBASE-8763:
--

{quote}
I'd suggest we have an Interface that has one method in it –
waitForLogSeqNumAssigned or getSequenceId or getLogSeqNumAssigned (and just
hide the fact that we are waiting) – and pass that instead.
{quote}
Ok, I'll do that in my next patch. The QA run succeeded except java doc
warnings which will be cleared in my next patch. Thanks.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-16 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000455#comment-14000455
]

Jeffrey Zhong commented on HBASE-8763:
--

Thanks [~saint@gmail.com] for the good comments! I'll make corresponding
changes in my next patch. Please see my answers below:

{quote}
Hmm... maybe we should make the writeQueue a disruptor too? Later.
{quote}
That's a good idea. I could also remove the writeQueue even without disruptor.
Basically return the previous highestUnsyncedSequence from disruptor consumer
code where we stampRegionSequenceId in FSHLog#append() before current wal
append, inside syncOrDefer we will wait for this previous
highestUnsyncedSequence for skip_wal async_wal case. Let's do it later in a
separate JIRA.

{quote}
Should beginMemstoreInsert assign a write mvcc number at all?
{quote}
No need to assign a write number and I could pass 0 here. The main purpose for
the call is inserting a marker so that later code can wait all mvcc
transactions before marker complete. this.sequenceId.incrementAndGet() is just
a way to pass some good number without messing up the memstore read point
later by mvcc.advanceMemstore(w) call.
I'll pass HLog.NO_SEQUENCE_ID value then.

{quote}
Misspelling: waitForPreviousTransactoinsComplete
{quote}
Good catch.

{quote}
Call this appendNoSyncFakedWALEdit appendNoSyncNoAppend
{quote}
Ok. I'll make the change

{quote}
It is convenient updating once and all related get the update. Is that the only
reason?
{quote}
It's not only for convenient but for correctness. Let's say we have two updates
coming in the order of c1, c2 while c1 could have seqId=2 and c2 could have
seqId=1. As you know wal syncer syncs all available pending appends and it's
likely both wal entries for e1 and e2 are synced at same time. Therefore, the
mvcc for c1 will advance memstore read point to 2. Since MutableLong is used
here, so we know for sure c1's MVCC has been updated to 1 before c2 gets its
seqid=2 otherwise we could end up with the situation that memstore read point
has been set to 2 while c1's mvcc in menstore hasn't been updated yet depending
on the caller thread scheduling.(Unless I pass MVCC into ring buffer where we
keep the reference of all new KVs and update mvcc for them in disruptor
consumer code) .

{quote}
Set top bits rather than add a big number?: + curSeqNum.setValue(originalVal +
10);
{quote}
How about the case if the first bit is already used for a large sequence
number? In theory, I only need to bump the number to 2 * the number of rpc
handlers because the writeQueue will be blocked if current mvcc isn't complete.
The big number here is just to be safe.

{quote}
Do we need to have MemStore know about HLogKeys?
Could we do the wait in the WAL system before we call
completeMemstoreInsertWithSeqNum passing in the sequence id to use? Could the
consumer on the ring buffer call the Memstore. completeMemstoreInsertWithSeqNum?
{quote}
Memstore doesn't know hlogkey but MVCC. I can do it outside of MVCC but that
requires all places calling completeMemstoreInsertWithSeqNum need to remember
to call the waitForLogSequence. It may leave hole in the future.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-14 Thread Jeffrey Zhong (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997068#comment-13997068
 ] 

Jeffrey Zhong commented on HBASE-8763:
--

Thanks [~saint@gmail.com]! I'm migrating my patch on top of hbase-11135 now.

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, 
 hbase-8763-v1.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-13 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996821#comment-13996821
]

stack commented on HBASE-8763:
--

I committed hbase-11135 so hopefully this patch is cleaner. I opened
HBASE-11160 for the case where we can hopefully let go of append having to wait
on edit/sequence id updates (early-binding instead of late-binding).

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-11 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994785#comment-13994785
]

Jeffrey Zhong commented on HBASE-8763:
--

{quote}
Early- vs late-binding would change this patch?
{quote}
Yes, it makes the patch much easier for edits with durablity=SKIP_WAL
ASYNC_WAL situation otherwise it would do similar things of HBASE-11135.

{quote}
I'm not sure I'm clear on what could be rolled back out of memstore around
flush. Or, can there be more doc on how mvcc and sequence id are interacting
here?
{quote}
During flush, we take a region write lock(prevent all writes coming into a
region), append a marker in MVCC queue, get flush sequence Id, take a mem store
snapshot and then release the region write lock. After lock release, we wait
for the MVCC marker we appended while holding the write lock in order for all
in-flight transactions before acquiring the region write lock to
complete(either succeed or rollback). Since there is only one copy of MVCC
which is a MutableLong object referenced by all related KVs, the rollback of
KVs from mem store should have no issue.

{quote}
Is the best write up on how this is going to work going forward what is above
in this issue?
{quote}
Sure. I'll modify this patch on top of HBASE-11135 and then a write up on the
final implementation.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-10 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992516#comment-13992516
]

stack commented on HBASE-8763:
--

This changes with hbase-11109:

{code}
+ flushSeqId = this.sequenceId.incrementAndGet();
+} else {
+ // use the provided sequence Id as WAL is not being used for this
flush.
+ flushSeqId = myseqid;

{code}

... but that is fine. Let 11109 worry about it.

I'm not sure I'm clear on what could be rolled back out of memstore around
flush. Or, can there be more doc on how mvcc and sequence id are interacting
here? For those who come after us?

This should be called sequenceId: +MutableLong seqNumber = new
MutableLong();

A left shift would be better? +return beginMemstoreInsert(curSeqNum +
10);

Did a quick skim. Early- vs late-binding would change this patch?

Is the best write up on how this is going to work going forward what is above
in this issue? Thanks [~jeffreyz]

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-02 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988533#comment-13988533
]

Hadoop QA commented on HBASE-8763:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12643167/hbase-8763.patch
against trunk revision .
ATTACHMENT ID: 12643167

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 6 new
or modified tests.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 4
warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.io.TestHeapSize

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/9452//console

This message is automatically generated.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-01 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987070#comment-13987070
]

stack commented on HBASE-8763:
--

bq. ...and I reset the value of the MutaleLong in one place so all new KVs in
MemStore will see the updated sequence number...

Nice.

bq. If our WAL Sync doesn't late binding then I don't need to use MutableLong.

IIUC, if we want the region sequence id (and hence the mvcc) to reflect the
order in which edits appear in the WAL, we must do late binding. See my
natterings over in HBASE-11099.

bq. ...and reset their MVCC values in an extra loop

Tell me more. Where the KVs be 'kept'? In the WALEdit? The extra loop would
be after a sync or on append?

bq. ...our pre post co-processor copies MVCC values...

Ok. Ugly. Would be cool getting this in w/o breaking CPs in 1.0 but we should
break the API after 1.0 if forcing inefficiency.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-04-30 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986354#comment-13986354
]

stack commented on HBASE-8763:
--

(Pardon me; am excited about this one so went back to do some more study...)

So yeah, why do a MutableLong in KV rather than just a volatile? Probably same
in the end... I see you want to get a reference to a KV. You trying to tie a
KV to something else in case the mvcc gets updated?

Maybe this is it:

-newKv.setMvccVersion(kv.getMvccVersion());
+newKv.setMvccVersion(kv.getMvccVersionReference());

We are cloning. The original gets updated later? You want to make sure clone
is updated too?

Would be cool if we could get rid of this clone one day.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-04-30 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986360#comment-13986360
]

Jeffrey Zhong commented on HBASE-8763:
--

The reason using MutableLong object is that at the very beginning we don't know
the real sync sequence number(due to the late binding) so I use MutableLong
object which keeps a faked big sequence number. All new KVs and related of
this transation reference this mvcc mutablelong object. Once after the
corresponding WALEdit is synced(after SyncOrDefer call), we have the real
sequence number and I reset the value of the MutaleLong in one place so all new
KVs in MemStore will see the updated sequence number(because they keep the
reference to this MVCC(MutableLong) instance.

If our WAL Sync doesn't late binding then I don't need to use MutableLong.

[~enis] is suggesting not to use MutableLong while keeping all new KVs and
reset their MVCC values in an extra loop. This may be hard to implement because
our pre post co-processor copies MVCC values as in the code you pasted
above(where I changed to copy reference)
{code}
newKv.setMvccVersion(kv.getMvccVersionReference());
{code}

My plan is to get all tests pass and then do enhancement/refactoring that you
and [~enis] are suggesting.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-04-29 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13985080#comment-13985080
 ] 

Enis Soztutar commented on HBASE-8763:
--

Great to see progress Jeffrey. 

As talked offline, I think we do not want to allocate 2 objects on heap rather 
than one: 
{code}
-  private long mvcc = 0;  // this value is not part of a serialized KeyValue 
(not in HFiles)
+  private MutableLong mvcc;  // this value is not part of a serialized 
KeyValue (not in HFiles)
{code}
Although, changing the mvcc long after object construction may require it to be 
declared volatile. 
I think you are using the technique of double incrementing the seqId, once at 
trx start and once at end right? Did you try your append 1B to seqId for 
memstore approach? 

I think I also changed Cell.getMvcc() to Cell.getSeqId(). Maybe worth doing 
once we get the patch fully running and performant. 


 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, 
 hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-04-29 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13985127#comment-13985127
 ] 

stack commented on HBASE-8763:
--

You the man [~jeffreyz]

Spelling here waitForPreviousTransactoinsComplete

What is happening here?

 syncOrDefer(txid, durability);
+// Get LogSequenceNumber from WAL Sync
+if(walEdit.getLogKey() != null) {
+  seqNumber.setValue(walEdit.getLogKey().getLogSeqNum());
+}

The seqNumber of the last wallEdit added to the WAL and sync'd?  The seq number 
could be well beyond this in actually?  Does that matter?  Should we let you 
have access to last sync'd id out of WAL?

You import but don't use?

+import org.apache.commons.lang.mutable.MutableInt;
+import org.apache.commons.lang.mutable.MutableLong;

... maybe a few times.

Why you use the MutableLong instead of say a long or a volatile?

Will do a closer review later.  So far looks great.





 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, 
 hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-27 Thread Liyin Tang (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857630#comment-13857630
]

Liyin Tang commented on HBASE-8763:
---

I totally vote for combining the MVCC and SeqID. Furthermore, it will be even
more straightforward if the SeqID does not shared across all the Regions.
Ideally, each region shall have its own monotonously increasing seq id.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Priority: Critical
Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-27 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857661#comment-13857661
 ] 

Sergey Shelukhin commented on HBASE-8763:
-

I thought the latter was already done?

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857665#comment-13857665
 ] 

Ted Yu commented on HBASE-8763:
---

Right. See HBASE-8741

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-27 Thread Liyin Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857680#comment-13857680
 ] 

Liyin Tang commented on HBASE-8763:
---

Thanks for jira ! 
If SeqID has already been per-region basis, and we want to combine the MVCC, 
then how do we want to handle the group commit across multiple regions ? 

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-27 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857813#comment-13857813
]

Jeffrey Zhong commented on HBASE-8763:
--

I'm working with [~enis] on this though in a slow pace because the feature can
most likely be released in 0.99 or later. There are work left to be done like
upgrade handling performance evaluation.

[~liyin] Not much special handing on log group commit which you can check on
hbase-8741. Basically log sequence number won't monotonically increase in a WAL
while it is for a single region. MVCC will become a region specific value not
global in RS level so we need to maintain a map region - read point for
scanning.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-27 Thread Liyin Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857820#comment-13857820
 ] 

Liyin Tang commented on HBASE-8763:
---

[~jeffreyz], I see. Thanks for the clarification and it makes sense to me now !

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
Priority: Critical
 Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-26 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856987#comment-13856987
 ] 

Sergey Shelukhin commented on HBASE-8763:
-

Any update on this? Resolving this confusion would help greatly.

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
 Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-04 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839529#comment-13839529
]

stack commented on HBASE-8763:
--

bq. 1) Memstore insert using long.max as the initial write number

What will we do if two edits arrive with same coordinates? How will we
distingush them if both have long.max during the time it takes to sync and
converte long.max to a legit seqid?

bq. Currently, we maintain an internal queue which might defer the read point
bump up if transactions complete order is different than that of MVCC internal
write queue.

A reason to unify MVCC and WAL seqid (smile).

bq. By doing above, it's possible to remove the logics maintaining writeQueue
...

We need the writeQueue for performance reasons, right? We need to add edits in
bulk under a lock and this lock is expensive to obtain (maybe I am missing
something?)

bq. ...so it means we can remove two locking and one queue loop in write code
path.

What are the two locks J?

Otherwise, sounds great. Will look at patches...

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch

--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-04 Thread Jeffrey Zhong (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839728#comment-13839728
 ] 

Jeffrey Zhong commented on HBASE-8763:
--

{quote}
What will we do if two edits arrive with same coordinates? How will we 
distingush them if both have long.max during the time it takes to sync and 
converte long.max to a legit seqid?
{quote}
Basically the MVCC write number only needs to make sure scanner can't see them 
before a write is done. Therefore we can assign them to Long.MAX. It means all 
in-progress writes belongs to one bucket and scanner can't see them.  Once a 
write is done, we assign them the logSeqNumber in WAL appending order and then 
bump up the min read point so that all writes before current log sequence 
number are visible to scanners. In this case, client can see changes in the 
order we commit the writes.

There are two orders in today's code because we assign the write number before 
a write starts: receiving order and commit order. For example, Put1 has write 
number 1 and Put2 has write number 2 while Put2 can finish earlier than Put1 
but Put2 still need wait for Put1 to finish. This cause issues for replication 
and recovery because both replies on the order(commit order) in the WAL file.

{quote}
What are the two locks J?
{quote}

In file MultiVersionConsistencyControl, the locks guard the access to 
writeQueue. Since we don't need keep the receiving order(which have to today 
because large write number could complete earlier than smaller write number), 
we can remove the related code as you can see my proof-of-concept patch 
beginMemstoreInsertUseSeqNum  advanceMemstoreUseSeqNum. I still keep a 
collection inProgressWrites because our Increment, Append etc needs all 
in-progress done but this part can be optimized by just keeping a hashmap for 
rows which row lock are released but not wal synced yet. 

Thanks.

{code}
  public WriteEntry beginMemstoreInsert() {
synchronized (writeQueue) {
  long nextWriteNumber = ++memstoreWrite;
  WriteEntry e = new WriteEntry(nextWriteNumber);
  writeQueue.add(e);
  return e;
}
  }

  boolean advanceMemstore(WriteEntry e) {
synchronized (writeQueue) {
   ...
 while (!writeQueue.isEmpty()) {
...
 }
 }
}
{code}



 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
 Attachments: hbase-8736-poc.patch, hbase-8763_wip1.patch


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-04 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13839816#comment-13839816
]

stack commented on HBASE-8763:
--

bq. It means all in-progress writes belongs to one bucket and scanner can't
see them.

These will not be in the memstore, right? They will be held aside? We only
add to memstore after they get their seqid? Else, the second edit overwrites
the first?

bq. ...Put1 has write number 1 and Put2 has write number 2 while Put2 can
finish earlier than Put1 but Put2 still need wait for Put1 to finish. This
cause issues for replication and recovery because both replies on the
order(commit order) in the WAL file.

Yes. I like the way you call this out. Please talk this fact up going forward.

Sorry, let me go look at your code. That should help me follow along.

[BRAINSTORM] Combine MVCC and SeqId
---

--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-02 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837109#comment-13837109
]

Jeffrey Zhong commented on HBASE-8763:
--

Today I had some discussion with [~enis] and [~te...@apache.org] on this topic
and found it might be possible to handle the JIRA issue in a simpler way. Below
are the steps:

1) Memstore insert using long.max as the initial write number
2) append no sync
3) sync
4) update WriteEntry's write number to the sequence number returned from Step 2
5) CompleteMemstoreInsert. In this step, make current read point to be = the
sequence number from Step 2. The reasoning behind this is that once we sync
till the sequence number, all changes with small sequence numbers are already
synced into WAL. Therefore, we should be able to bump up read number to the
last sequence number synced.

Currently, we maintain an internal queue which might defer the read point bump
up if transactions complete order is different than that of MVCC internal write
queue.

By doing above, it's possible to remove the logics maintaining writeQueue so it
means we can remove two locking and one queue loop in write code path. Sounds
too good to be true :-). Let me try to write a quick patch and run it against
unit tests to see if the idea could fly.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Attachments: hbase-8763_wip1.patch

--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-02 Thread Enis Soztutar (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837226#comment-13837226
]

Enis Soztutar commented on HBASE-8763:
--

This may actually work with a very small amount of changes. It would be great
if you can hack up a working patch and run the unit tests. In my case, I've
seen the current set of UT's to cover most of the concurrency issues I ran into
while developing this.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Attachments: hbase-8763_wip1.patch

--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-12-02 Thread Jeffrey Zhong (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837300#comment-13837300
]

Jeffrey Zhong commented on HBASE-8763:
--

I tried a small patch. Since we support SKIP_WAL model, the MVCC.writeQueue is
still needed to main the write order because there is no wal sync operation at
all. Also there are quite a few test cases doesn't do appendNosync between
mvcc.beginMemstoreInsert and mvcc.completeMemstoreInsert so they are needed to
be adjusted. So far I didn't find block issues but still need to verify it
thoroughly.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Attachments: hbase-8763_wip1.patch

--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-09-27 Thread Himanshu Vashishtha (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13780489#comment-13780489
 ] 

Himanshu Vashishtha commented on HBASE-8763:


Hey Enis  Jeffrey, could you guys please share the approach for this issue 
which is currently under testing. Thanks.

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
 Fix For: 0.98.0


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-09-27 Thread Enis Soztutar (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13780598#comment-13780598
]

Enis Soztutar commented on HBASE-8763:
--

Definitely. I have a patch which works on 90% of the unit tests :) Was going to
create a RB, but never got to make it stable enough, perf test, handle old mvcc
numbers in files, etc.
The approach is that we get keep MVCC, but mvcc write number will be the log
sequence number. So instead of:
1. get mvcc number
2. memstore insert
3. append no sync
4. sync
we are doing:
1. append no sync
2. memstore insert
3. sync

You can check the TrxManager, which keeps track of on the fly transactions.
MVCC is an impl of TrxMngr. HLog will order the transactions by the update
lock, and create and add a trx to the Trx manager. TrxManager still ensures
that the transactions are committed (made visible) in serial order. row locking
order is also not changed, meaning, read-write transcations still have to wait
for on-flight transactions to finish, after they acquire the row locks.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Fix For: 0.98.0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-07-08 Thread Sergey Shelukhin (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702436#comment-13702436
]

Sergey Shelukhin commented on HBASE-8763:
-

I have another idea, related to this, that intends to solve problems with clock
skew also.
If user doesn't supply ts, we can also combine ts with seqId and mvcc :) That
way conflicts due to clock skew won't happen assuming seqId doesn't go back;
which it doesn't under current assumptions such as compactions and log recovery.
If user does, we can store user ts and seqId or just user ts (but then user has
to avoid conflicts, same as now).
This makes mixed scenarios more difficult though, and some people may rely in
default ts actually storing time.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Fix For: 0.98.0

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-07-08 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702546#comment-13702546
]

stack commented on HBASE-8763:
--

Can we just do the mvcc and seqid unification? It would be a nice
simplification. Only hitch I see is ensuring that we write the WAL in
sequenceid order (might require a little sort before we add to the WAL done
under lock).

On merging in ts, that looks like we'd be asking for trouble (smile). That'd
be radical.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Fix For: 0.98.0

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-06-18 Thread Enis Soztutar (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13687345#comment-13687345
]

Enis Soztutar commented on HBASE-8763:
--

I also realized that, if I am not wrong, current multi row atomic mutations is
broken with scanners + RS crashes. Since mvcc is not persisted, if a multi put
changing r1,r100 happens where mvcc = 100, the scanner with mvcc = 90 will not
see r1. Just after passing r1, the scanner might fail, and the new scanner in
the new region server will get another mvcc, but since the changes for previous
multi put has been persisted (in log recovery), the scanner will happily see
r100 mutation.

The underlying reason multi puts + scanner for a region has to see a snapshot
of the region, but mvcc is ephemeral. This can also be fixed by saving the
seqId's in hfiles, and when a region scanner is opened, the client obtains the
scanner seqId (mvcc read point) and uses this number in case of failover.

[BRAINSTORM] Combine MVCC and SeqId
---

Key: HBASE-8763
URL: https://issues.apache.org/jira/browse/HBASE-8763
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Enis Soztutar
Fix For: 0.98.0

[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2013-06-18 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13687444#comment-13687444
 ] 

Sergey Shelukhin commented on HBASE-8763:
-

+1 on idea after we have discussed it

 [BRAINSTORM] Combine MVCC and SeqId
 ---

 Key: HBASE-8763
 URL: https://issues.apache.org/jira/browse/HBASE-8763
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Enis Soztutar
 Fix For: 0.98.0


 HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
 seqId semantics. It seems that having mvcc and the seqId complicates the 
 comparator semantics a lot in regards to flush + WAL replay + compactions + 
 delete markers and out of order puts. 
 Thinking more about it I don't think we need a MVCC write number which is 
 different than the seqId. We can keep the MVCC semantics, read point and 
 smallest read points intact, but combine mvcc write number and seqId. This 
 will allow cleaner semantics + implementation + smaller data files. 
 We can do some brainstorming for 0.98. We still have to verify that this 
 would be semantically correct, it should be so by my current understanding.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

46 matches

Mail list logo