[ https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Feng Honghua updated HBASE-10227:
---------------------------------
    Attachment: HBASE-10227-trunk_v0.patch

The fix is as follows:
1. Persist mvcc in the HLog (in WALEdit).
2. Never set a KeyValue's mvcc to 0.
3. Always (not conditionally) include mvcc in the HFile.
4. Reinitialize the region's mvcc after replaying split HLog files, so that it includes the greater values in the new store files produced by replaying/flushing the split HLog files. This is what correctly recovers the region's mvcc.

Note on step 4: replaying split HLog files needs access to mvcc, so we can't defer initializing mvcc until after the replay. Reinitializing it to the final, correct value once the replay is done is fine.

An alternative fix is to add a new internalFlushcache method, used only when replaying split HLog files, that doesn't access mvcc. This is safe because during replay there can be no in-progress transaction/write that has not been committed to the HLog (nothing is written to the HLog while split HLog files are being replayed).

> When a region is opened, its mvcc isn't correctly recovered when there are
> split hlogs to replay
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10227
>                 URL: https://issues.apache.org/jira/browse/HBASE-10227
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Feng Honghua
>            Assignee: Feng Honghua
>         Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS,
> which is used as the initial mvcc for the region, and then split hlogs are
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than
> all MemstoreTS in all store files, but replaying them doesn't increment the
> mvcc accordingly at all. From an overall perspective this mvcc recovery is
> 'logically' incorrect/incomplete.
> The reason this doesn't currently cause a problem is that no active scanners
> exist and no new scanners can be created before the region opening completes,
> so the mvcc of all kvs in the hfiles resulting from hlog replay can safely be
> set to zero. They are just treated as kvs put 'earlier' than the ones in
> HFiles with mvcc greater than zero (say 'earlier' since they have mvcc less
> than the ones with non-zero mvcc, but in fact they are put 'later'), and
> this has no incorrect impact only because no active scanners exist or are
> created during region opening.
> This bug is only a 'logical' one for the time being, but if later on we need
> mvcc to survive the region's whole logical lifecycle (across regionservers)
> and never set it to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
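
The recovery order described in this issue (initialize mvcc from the store files, replay the split hlogs while keeping each edit's mvcc, then reinitialize mvcc to cover the replayed edits) can be sketched in Java roughly as below. This is only an illustration under stated assumptions, not code from the attached patch: `MvccRecoverySketch`, `RegionMvcc`, `recoverReadPoint` and their parameters are hypothetical names, and plain `long` arrays stand in for store files and replayed edits.

```java
// Hypothetical sketch: none of these names are HBase APIs.
final class MvccRecoverySketch {

    /** Stand-in for a region's mvcc read point. */
    static final class RegionMvcc {
        private long readPoint;

        /** Sets the starting read point when the region is opened. */
        void initialize(long startPoint) {
            readPoint = startPoint;
        }

        /** Moves the read point forward; it never moves backwards. */
        void advanceTo(long candidate) {
            readPoint = Math.max(readPoint, candidate);
        }

        long readPoint() {
            return readPoint;
        }
    }

    /** Returns the largest value, or 0 when there are no values. */
    static long max(long... values) {
        long result = 0;
        for (long v : values) {
            result = Math.max(result, v);
        }
        return result;
    }

    /**
     * Models opening a region.
     *
     * @param storeFileMaxMemstoreTs max MemstoreTS of each existing store file
     * @param replayedEditMvccs      mvcc of each edit replayed from split hlogs
     *                               (kept as-is, never reset to 0)
     */
    static long recoverReadPoint(long[] storeFileMaxMemstoreTs, long[] replayedEditMvccs) {
        RegionMvcc mvcc = new RegionMvcc();

        // Initial value: the max MemstoreTS across the existing store files.
        // Replay needs an initialized mvcc, so this cannot be deferred.
        mvcc.initialize(max(storeFileMaxMemstoreTs));

        // Replay keeps every edit's mvcc, so the files flushed from the replay
        // carry values greater than the initial read point.
        long maxAfterReplay = max(replayedEditMvccs);

        // Step 4: reinitialize so the read point covers the replayed edits.
        mvcc.advanceTo(maxAfterReplay);
        return mvcc.readPoint();
    }

    public static void main(String[] args) {
        long recovered = recoverReadPoint(new long[] {5, 9, 7}, new long[] {10, 11, 12});
        System.out.println("recovered mvcc = " + recovered); // prints "recovered mvcc = 12"
    }
}
```

Without the final `advanceTo` call, the region in this sketch would open with a read point of 9 even though it holds edits with mvcc up to 12, which is the 'logical' inconsistency the issue describes.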