[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10227:
---------------------------------

    Attachment: HBASE-10227-trunk_v0.patch

The fix is as follows:
1. Persist mvcc in the HLog (in WALEdit).
2. Never set a KeyValue's mvcc to 0.
3. Always (not conditionally) include mvcc in the HFile.
4. Reinitialize the region's mvcc after replaying the split HLog files, so that 
it includes the greater values in the new stores produced by replaying/flushing 
those files. This correctly recovers the region's mvcc.

Note on step 4: replaying split HLog files needs access to mvcc, so we can't 
defer initializing mvcc until after the replay. Reinitializing it to the final, 
correct value once the replay is done is fine. An alternative fix is to add and 
use a new internalFlushcache method for replaying split HLog files that doesn't 
access mvcc. That is safe because, while split HLog files are being replayed, 
there can be no in-progress transaction/write that is not yet committed to the 
HLog: nothing is written to the HLog during the replay.
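
A rough sketch of the idea behind step 4, for illustration only. It is not 
code from the attached patch: the class and method names are hypothetical, and 
mvcc values are modeled as plain longs rather than HBase's real types. It 
assumes steps 1-3 are in place, so replayed edits still carry their original 
mvcc.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model; none of these names exist in HBase.
public class MvccRecoverySketch {

    /** Returns the largest value in mvccs, or floor if none is larger. */
    public static long maxMvcc(long floor, List<Long> mvccs) {
        long max = floor;
        for (long m : mvccs) {
            max = Math.max(max, m);
        }
        return max;
    }

    /**
     * Models region open: initialize mvcc from the existing store files,
     * replay the split HLog edits, then reinitialize mvcc so that it also
     * covers the mvcc values carried by the replayed edits (step 4).
     */
    public static long recoverRegionMvcc(List<Long> storeFileMaxMvccs,
                                         List<Long> replayedEditMvccs) {
        // Initial value: needed up front because the replay itself uses mvcc.
        long regionMvcc = maxMvcc(0L, storeFileMaxMvccs);

        // (Replay and flush of the split HLog files would happen here.)

        // Step 4: reinitialize to include the greater values from the replay.
        regionMvcc = maxMvcc(regionMvcc, replayedEditMvccs);
        return regionMvcc;
    }

    public static void main(String[] args) {
        long recovered = recoverRegionMvcc(Arrays.asList(5L, 7L),
                                           Arrays.asList(8L, 9L));
        System.out.println("recovered mvcc = " + recovered); // prints 9
    }
}
```

Without the second maxMvcc call the region would open with mvcc 7 in this 
example, even though edits with mvcc 8 and 9 were replayed.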

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10227
>                 URL: https://issues.apache.org/jira/browse/HBASE-10227
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Feng Honghua
>            Assignee: Feng Honghua
>         Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS, 
> which is used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact, the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them doesn't increment the 
> mvcc accordingly at all. From an overall perspective, this mvcc recovery is 
> 'logically' incorrect/incomplete.
> The reason this currently causes no problem is that no active scanners exist 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the hfiles resulting from hlog replay can be safely 
> set to zero. They are simply treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero (say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they were put 'later'), and 
> this has no incorrect impact only because no active scanners exist or are 
> created during region opening.
> For the time being this bug exists only in a 'logic' sense, but if later on 
> we need mvcc to survive the region's whole logical lifecycle (across 
> regionservers) and never set it to zero, this bug needs to be fixed first.
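
To make the 'earlier'/'later' point in the description concrete, here is a toy 
model that orders kvs by mvcc alone. It is only an illustration: the class, the 
labels, and the mvcc values are hypothetical, and real HBase comparison also 
involves row, column, and timestamp, which are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model; none of these names exist in HBase.
public class ZeroedMvccOrderSketch {

    /** A kv reduced to a label and its mvcc. */
    public static final class Kv {
        final String label;
        final long mvcc;

        public Kv(String label, long mvcc) {
            this.label = label;
            this.mvcc = mvcc;
        }
    }

    /** Returns the labels sorted from lowest to highest mvcc. */
    public static List<String> labelsByMvcc(List<Kv> kvs) {
        List<Kv> sorted = new ArrayList<>(kvs);
        sorted.sort(Comparator.comparingLong((Kv kv) -> kv.mvcc));
        List<String> labels = new ArrayList<>();
        for (Kv kv : sorted) {
            labels.add(kv.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Flushed before the crash: keeps its mvcc in the HFile.
        Kv flushed = new Kv("flushed", 5L);
        // Written later (say with mvcc 8), but zeroed after hlog replay.
        Kv replayedZeroed = new Kv("replayed", 0L);
        // Same later write if its mvcc is preserved, as the fix proposes.
        Kv replayedKept = new Kv("replayed", 8L);

        System.out.println(labelsByMvcc(Arrays.asList(flushed, replayedZeroed)));
        System.out.println(labelsByMvcc(Arrays.asList(flushed, replayedKept)));
    }
}
```

With the zeroed mvcc the replayed kv sorts before the flushed one although it 
was written after it; with the mvcc preserved, the order matches the order of 
the writes.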



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
