[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871212#comment-13871212
 ] 

Sergey Shelukhin commented on HBASE-10227:
--

This JIRA tries to solve it for some other issue (see Feng's comment above).
Scanners before opening the region would be possible (HBASE-10241) but they 
don't require storing MVCC for all time, just for more time. You could also do 
stuff like arbitrary file combinations for compactions with mvcc/seqId always 
stored (so you don't need the files to be in order to compare KVs).

As for storing one way to optimize would be vlong with delta, base value in 
header. One would expect most KVs in most files to have mvccs in narrow range. 

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871201#comment-13871201
 ] 

Lars Hofhansl commented on HBASE-10227:
---

I have not followed all the discussion here. The key points seem to be:
# we'll always keep the read point (never set it 0)
# that in turns means we always have to store and decode it upon read

So for each KV we'll store an extra vlong. Since we don't set it to null we'll 
likely need at least 4 bytes. In that case we should probably not store as a 
vlong, but just an 8 byte long.
For small KeyValue that would probably make a noticeable performance impact, 
but probably acceptable if needed for correctness.

What I do not understand what we're trying to solve. As described in the 
description we do not allow a scanner before splitting is finished, so this is 
not currently a problem. How would we allow opening a scanner before we open 
the region?

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871169#comment-13871169
 ] 

Sergey Shelukhin commented on HBASE-10227:
--

I think its ok to store all mvccs in files, but it might require discussion 
e.g. on dev list. 
[~lhofhansl] iirc you added the optimization to drop mvcc from files, what do 
you think?

It's better to remove unused parameters if the change goes thru, as part of the 
same patch.

Rolling upgrade may break if old versions cannot read new WALs. Why not put 
mvcc-s corresponding to KVs into PB part of WAL? It should be rather simple and 
backward/forward compatible. Easy to remove too if we move mvcc as part of KV 
itself eventually.

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-14 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870650#comment-13870650
 ] 

Feng Honghua commented on HBASE-10227:
--

thanks [~sershe] for the review
bq.Storing mvcc in store file always is an interesting option...
I agree, to recover correct mvcc for region opening, only the max mvcc in 
fileinfo of hfile is sufficient. Storing every KV's mvcc in hfile is only 
required to correct the delete semantic which relies on KV's mvcc to determine 
whether a delete occurs before/after a put, it is for 
[HBASE-8721|https://issues.apache.org/jira/browse/HBASE-8721], I can revert 
this change if strong objection, your opinion?
bq.mvcc.reinitialize(maxMemstoreTS + 1); is now called twice in the same place.
not exactly. first call is mvcc.initialize(maxMemstoreTS + 1) to *initialize* 
region mvcc using maxMemstoreTS from hfiles without regard to KVs in split hlog 
files; second call is mvcc.reinitialize(maxMemstoreTS + 1) to *re-initialize* 
region mvcc with regard to KVs in newly flushed hfiles resulted from split hlog 
files replay. and I explained the reason in above comment :-)
bq.With removal of usage performCompaction no longer needs smallestReadPoint. 
Also parameter might not be necessary in createWriterInTmp
yes, you're right, it's a bit more aggressive clean-up without impact to 
correctness, I can do it if you insist
bq.Is it possible to add MVCCs from corresponding KVs to protobuf part, rather 
than expand WALEdit format? I think the proper way is actually to make mvcc 
serialization a "first class" part of KV, there's JIRA for that; but that might 
be too much for this patch, as it would require new HFile version.
I agree, KV is the better place to serialize mvcc but require new HFile 
version. serializing it in WALEdit is more lightweight and can serve the need 
to persist mvcc in HLog files. what about keeping current change?
bq.Already, it appears that old reader will not read V_3 correctly.
sure,only back-compatibility is provided(new reader can read old HLog version, 
but not vice versa)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869784#comment-13869784
 ] 

Sergey Shelukhin commented on HBASE-10227:
--


RB would be nice.
Storing mvcc in store file always is an interesting option.
However, it becomes unnecessary for most KVs after some time under current 
HBase assumptions (that storefiles can be compared, all KVs in one SF are older 
than all KVs in the other per seqId/mvcc).
The only uses for mvcc in KV at that point is exact same key in the file, and 
scanners, but the latter need disappears after some time, see some later 
comments in HBASE-10244.

Some minor comments on the patch:
bq.  mvcc.reinitialize(maxMemstoreTS + 1); 
is now called twice in the same place.

With removal of usage performCompaction no longer needs smallestReadPoint.
Also parameter might not be necessary in createWriterInTmp

Ok this is major comment.
bq. if (versionOrLength == VERSION_3) {
Is it possible to add MVCCs from corresponding KVs to protobuf part, rather 
than expand WALEdit format?
I think the proper way is actually to make mvcc serialization a "first class" 
part of KV, there's JIRA for that; but that might be too much for this patch, 
as it would require new HFile version.
For now we can at least avoid more hard-to-maintain-compat stuff down the line.
Already, it appears that old reader will not read V_3 correctly.



> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869573#comment-13869573
 ] 

Hadoop QA commented on HBASE-10227:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12622605/HBASE-10227-trunk_v0.patch
  against trunk revision .
  ATTACHMENT ID: 12622605

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8403//console

This message is automatically generated.

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-12 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868994#comment-13868994
 ] 

Feng Honghua commented on HBASE-10227:
--

[~gustavoanatoly] : nothing at all, please don't feel sorry, I'll take care of 
it :-)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-12 Thread Gustavo Anatoly (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868986#comment-13868986
 ] 

Gustavo Anatoly commented on HBASE-10227:
-

Hi, [~fenghh].

Unfortunately I don't have, sorry. I'm busy with other Jira and not be able 
complete this task. Please steal back this issue.
Thanks for you patient wait



> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868961#comment-13868961
 ] 

Feng Honghua commented on HBASE-10227:
--

ping [~gustavoanatoly] : any update? :-)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860515#comment-13860515
 ] 

Sergey Shelukhin commented on HBASE-10227:
--

resolved that as dup of this as an earlier issue. Would be nice to see patch :)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858238#comment-13858238
 ] 

Feng Honghua commented on HBASE-10227:
--

yes, you mean [HBASE-10243|https://issues.apache.org/jira/browse/HBASE-10243]? 
...actually I've already implemented it as part of 
[HBASE-8721|https://issues.apache.org/jira/browse/HBASE-8721], and MVCC 
persistence is only the prerequisite of the fix for this issue. :-)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858236#comment-13858236
 ] 

Ted Yu commented on HBASE-10227:


MVCC persistence in WAL is a subtask of HBASE-10241 too.

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858232#comment-13858232
 ] 

Feng Honghua commented on HBASE-10227:
--

[~yuzhih...@gmail.com] : actually I changed the (un)serialization of WALEdit to 
persist mvcc, and use it to correctly recover region's mvcc during reopening, 
that change is part of jira-8721's patch :-)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858224#comment-13858224
 ] 

Feng Honghua commented on HBASE-10227:
--

thanks [~yuzhih...@gmail.com] for directing. I just took a look at 10241, seems 
it together with 8763 is a bit general, while this one is pretty specific , its 
fix can be a part of those general jiras, so I think it's ok to fix this one 
separately, opinion? :-)

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858222#comment-13858222
 ] 

Feng Honghua commented on HBASE-10227:
--

[~gustavoanatoly] : sounds good. I will also try to find time to extract the 
original fix for it from jira-8721 these days to attach here for your reference.

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Gustavo Anatoly (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858159#comment-13858159
 ] 

Gustavo Anatoly commented on HBASE-10227:
-

Thanks for reply, [~fenghh]. :)

I can schedule to start on 08/Jan/14, would be possible? This date is good? But 
if I can't start this task on 08/Jan. Please, re-assign for you [~fenghh].

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858107#comment-13858107
 ] 

Ted Yu commented on HBASE-10227:


There is considerable overlap between this JIRA and HBASE-10241

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-28 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858037#comment-13858037
 ] 

Feng Honghua commented on HBASE-10227:
--

[~gustavoanatoly] : glad to know you're also aware of this bug and show 
interest for fixing it. :-)

Actually this issue had already been fixed in my patch for JIRA-8721 (where the 
mvcc can't be set zero and need to keep across region move / regionserver 
failover / balance etc, I noticed and fixed this 'logic' bug as a part of that 
patch), since JIRA-8721 experienced several times close/reopen/close, I think 
it's not a good timing to reopen it again. but the exposing of this bug and 
providing its fix can be opened as a separate JIRA.

If you can't schedule time for this fix, maybe I can re-assign to myself and 
extract the fix for this bug from JIRA-8721's patch to here for 
discussion/review, what do you think? [~gustavoanatoly] / [~stack]

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-27 Thread Gustavo Anatoly (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857595#comment-13857595
 ] 

Gustavo Anatoly commented on HBASE-10227:
-

Hi, [~stack].

I understood, was just a careful to avoid problems with assignments.

Thanks, for your advice.

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>Assignee: Gustavo Anatoly
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857550#comment-13857550
 ] 

stack commented on HBASE-10227:
---

[~gustavoanatoly] You are an 'contributor' up in JIRA so you should be able to 
assign yourself.  I assigned this to you for now ([~fenghh] can steal it back 
if he is working on it himself).

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2013-12-27 Thread Gustavo Anatoly (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857475#comment-13857475
 ] 

Gustavo Anatoly commented on HBASE-10227:
-

Hi, [~fenghh].

I would like to work in this issue, but at this moment I'm working in another 
bug. So could I assign to me and putting in my queue, to be the next task? Is 
it possible?

Thanks.

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> 
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Feng Honghua
>
> When opening a region, all stores are examined to get the max MemstoreTS and 
> it's used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them don't increment the 
> mvcc according at all. From an overall perspective this mvcc recovering is 
> 'logically' incorrect/incomplete.
> Why currently it doesn't incur problem is because no active scanners exists 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the resulted hfiles from hlog replaying can be safely 
> set to zero. They are just treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero(say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), and 
> without any incorrect impact just because during region opening there are no 
> active scanners existing / created.
> This bug is just in 'logic' sense for the time being, but if later on we need 
> to survive mvcc in the region's whole logic lifecycle(across regionservers) 
> and never set them to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)