[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870650#comment-13870650
 ] 

Feng Honghua commented on HBASE-10227:
--------------------------------------

Thanks [~sershe] for the review.
bq.Storing mvcc in store file always is an interesting option...
I agree: to recover the correct mvcc when opening a region, the max mvcc in 
the fileinfo of each hfile is sufficient. Storing every KV's mvcc in the hfile 
is only required to correct the delete semantics, which rely on each KV's mvcc 
to determine whether a delete occurred before or after a put; that is for 
[HBASE-8721|https://issues.apache.org/jira/browse/HBASE-8721]. I can revert 
this change if there is a strong objection. What is your opinion?
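A minimal sketch of why per-KV mvcc matters for the delete semantics, assuming 
a delete should mask only puts that were written before it. The class and 
method names are hypothetical and are not taken from the patch:

```java
/** Hypothetical illustration; not HBase code. */
final class DeleteMaskSketch {
    /**
     * A delete masks a put only if the put's timestamp is covered by the
     * delete AND the put was written earlier (smaller mvcc) than the delete.
     */
    static boolean masks(long deleteTs, long deleteMvcc, long putTs, long putMvcc) {
        return putTs <= deleteTs && putMvcc < deleteMvcc;
    }
}
```

If every KV's mvcc were reset to zero, the second condition could no longer 
tell a put written after the delete from one written before it.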
bq.mvcc.reinitialize(maxMemstoreTS + 1); is now called twice in the same place.
Not exactly. The first call is mvcc.initialize(maxMemstoreTS + 1), which 
*initializes* the region mvcc using maxMemstoreTS from the hfiles, without 
regard to the KVs in split hlog files. The second call is 
mvcc.reinitialize(maxMemstoreTS + 1), which *re-initializes* the region mvcc, 
taking into account the KVs in the hfiles newly flushed as a result of 
replaying the split hlog files. I explained the reason in the comment above :-)
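The two calls can be sketched as follows. This is a simplified stand-in, not 
the patch itself: the class, the readPoint field, and the list parameters are 
hypothetical, and only the initialize/reinitialize ordering mirrors the 
description above.

```java
import java.util.List;

/** Hypothetical stand-in for the region's mvcc object; not the HBase class. */
final class MvccRecoverySketch {
    private long readPoint;

    void initialize(long startPoint) { readPoint = startPoint; }

    void reinitialize(long startPoint) { readPoint = startPoint; }

    /** Largest value in the list, or floor if no value exceeds it. */
    static long maxOf(List<Long> values, long floor) {
        long max = floor;
        for (long v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    /** Returns the region's read point after opening. */
    static long openRegion(List<Long> hfileMaxMemstoreTs, List<Long> replayedEditMvccs) {
        MvccRecoverySketch mvcc = new MvccRecoverySketch();

        // First call: only the existing hfiles are known at this point.
        long maxMemstoreTS = maxOf(hfileMaxMemstoreTs, -1L);
        mvcc.initialize(maxMemstoreTS + 1);

        // Replaying split hlogs flushes new hfiles whose KVs keep their mvcc,
        // so the maximum may have grown.
        maxMemstoreTS = maxOf(replayedEditMvccs, maxMemstoreTS);

        // Second call: account for the newly flushed hfiles.
        mvcc.reinitialize(maxMemstoreTS + 1);
        return mvcc.readPoint;
    }
}
```

For example, with hfile maxima 3 and 5 and replayed edits carrying mvcc 6 and 
7, the first call sets the read point to 6 and the second moves it to 8.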
bq.With removal of usage performCompaction no longer needs smallestReadPoint. 
Also parameter might not be necessary in createWriterInTmp
Yes, you're right. It's a slightly more aggressive clean-up with no impact on 
correctness; I can do it if you insist.
bq.Is it possible to add MVCCs from corresponding KVs to protobuf part, rather 
than expand WALEdit format? I think the proper way is actually to make mvcc 
serialization a "first class" part of KV, there's JIRA for that; but that might 
be too much for this patch, as it would require new HFile version.
I agree, KV is the better place to serialize mvcc, but that requires a new 
HFile version. Serializing it in WALEdit is more lightweight and serves the 
need to persist mvcc in HLog files. What about keeping the current change?
bq.Already, it appears that old reader will not read V_3 correctly.
Sure, only backward compatibility is provided (the new reader can read the old 
HLog version, but not vice versa).
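A rough sketch of this one-way compatibility, assuming a version-gated reader. 
The version constants, the edit layout (an int payload followed by an optional 
long mvcc), and the choice to reject newer files outright are all illustrative 
assumptions, not the actual WALEdit format:

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of a version-gated reader; not the real format. */
final class WalVersionSketch {
    static final int V2 = 2; // assumed: edits carry no mvcc
    static final int V3 = 3; // assumed: edits carry a trailing mvcc

    /** Returns the mvcc of one encoded edit, or 0 if the file predates mvcc. */
    static long readMvcc(int readerVersion, int fileVersion, ByteBuffer edit) {
        if (fileVersion > readerVersion) {
            // An old reader does not know the newer layout.
            throw new IllegalStateException("unsupported version " + fileVersion);
        }
        edit.getInt(); // skip the placeholder payload
        return fileVersion >= V3 ? edit.getLong() : 0L;
    }
}
```

A V3 reader handles both layouts, falling back to mvcc 0 for V2 files, while a 
V2 reader given a V3 file has no way to interpret the extra field.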

> When a region is opened, its mvcc isn't correctly recovered when there are 
> split hlogs to replay
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10227
>                 URL: https://issues.apache.org/jira/browse/HBASE-10227
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Feng Honghua
>            Assignee: Feng Honghua
>         Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS, 
> which is used as the initial mvcc for the region, and then split hlogs are 
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than 
> all MemstoreTS in all store files, but replaying them doesn't increment the 
> mvcc accordingly at all. From an overall perspective this mvcc recovery is 
> 'logically' incorrect/incomplete.
> The reason this currently causes no problem is that no active scanners exist 
> and no new scanners can be created before the region opening completes, so 
> the mvcc of all kvs in the hfiles resulting from hlog replay can safely be 
> set to zero. They are simply treated as kvs put 'earlier' than the ones in 
> HFiles with mvcc greater than zero (say 'earlier' since they have mvcc less 
> than the ones with non-zero mvcc, but in fact they are put 'later'), 
> without any incorrect impact, precisely because no scanners are active or 
> created during region opening.
> This bug is only a 'logical' one for the time being, but if later on we need 
> mvcc to survive across the region's whole logical lifecycle (across 
> regionservers) and never be set to zero, this bug needs to be fixed first.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
