[ 
https://issues.apache.org/jira/browse/HBASE-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950155#comment-15950155
 ] 

Duo Zhang commented on HBASE-17633:
-----------------------------------

Finally I found that if we keep the min sequence id in memstore then we do not 
need SequenceIdAccounting anymore (of course highestSequenceIds is still 
needed, but it is not worth having a separate class for it)... The WAL could 
go to the memstore to find out whether it can safely purge an old WAL file.

Will give it a try soon.
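
To make the purge check above concrete, here is a minimal sketch in Java. It 
assumes an old WAL file is safe to purge only when, for every region, the 
highest sequence id written to that file is lower than the minimum sequence id 
still held in that region's memstore. The class WalPurgeCheck, the method 
canPurge, and the two maps are hypothetical names for illustration, not 
HBase's actual API.

```java
import java.util.Map;

// Hypothetical sketch: names and signatures are illustrative, not HBase's API.
final class WalPurgeCheck {
  /** Assumed stand-in meaning "this region has no unflushed edits in memstore". */
  static final long NO_UNFLUSHED = Long.MAX_VALUE;

  /**
   * @param highestSeqIdsInWal  region name -> highest sequence id written to the old WAL file
   * @param minSeqIdsInMemstore region name -> minimum sequence id still held in memstore
   * @return true if no edit in the WAL file is still unflushed
   */
  static boolean canPurge(Map<String, Long> highestSeqIdsInWal,
                          Map<String, Long> minSeqIdsInMemstore) {
    for (Map.Entry<String, Long> e : highestSeqIdsInWal.entrySet()) {
      long minInMemstore = minSeqIdsInMemstore.getOrDefault(e.getKey(), NO_UNFLUSHED);
      // An edit whose sequence id is >= the memstore minimum may not be on disk yet.
      if (e.getValue() >= minInMemstore) {
        return false;
      }
    }
    return true;
  }
}
```

With this shape the WAL only needs the per-file highest sequence ids; 
everything else is read from the memstore at purge time.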

> Update unflushed sequence id in SequenceIdAccounting after flush with the 
> minimum sequence id in memstore
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17633
>                 URL: https://issues.apache.org/jira/browse/HBASE-17633
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 2.0.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17633.patch, HBASE-17633-v1.patch
>
>
> Currently the tracking of unflushed sequence ids is done by 
> SequenceIdAccounting, and it is a little tricky when dealing with flush. We 
> have to remove the mapping for the given stores of a region from 
> lowestUnflushedSequenceIds, so that we have space to store the new lowest 
> unflushed sequence id after flush. But we still need to keep the old sequence 
> ids in another map, as we still need these values when reporting to master 
> to prevent data loss (think of the scenario where we report the new lowest 
> unflushed sequence id to master and then crash before actually flushing the 
> data to disk).
> And when reviewing HBASE-17407, I found that for CompactingMemStore, we have 
> to record the minimum sequence id in memstore. We could just update the 
> mappings in SequenceIdAccounting using these values after flush. This means 
> we do not need to update the lowest unflushed sequence id in 
> SequenceIdAccounting, and also do not need to make space for the new lowest 
> unflushed when startCacheFlush, and also do not need the extra map to store 
> the old mappings.
> This could simplify our logic a lot. But this is a fundamental change so I 
> need some time to implement it, especially for modifying tests... And I also 
> need some time to check whether I have missed something.
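
The bookkeeping described in the issue can be sketched roughly as follows. 
This is a simplified, single-threaded illustration keyed by store name only; 
FlushAccountingSketch and all of its members are hypothetical and do not 
mirror the real SequenceIdAccounting class. The first group of methods models 
the current approach (move the old value aside at flush start), and 
updateAfterFlush models the proposed one (overwrite with the memstore minimum 
once the flush is done).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, single-threaded sketch; not the real SequenceIdAccounting.
final class FlushAccountingSketch {
  /** Assumed marker for "nothing unflushed". */
  static final long NONE = -1L;

  final Map<String, Long> lowestUnflushed = new HashMap<>();
  /** The "extra map" that keeps old values while a flush is in flight. */
  final Map<String, Long> flushing = new HashMap<>();

  /** Records the first (lowest) unflushed sequence id seen for a store. */
  void append(String store, long seqId) {
    lowestUnflushed.putIfAbsent(store, seqId);
  }

  // ---- Current approach ----

  /** Makes space for the new lowest id by moving the old one aside. */
  void startCacheFlush(String store) {
    Long old = lowestUnflushed.remove(store);
    if (old != null) {
      flushing.put(store, old);
    }
  }

  /** The data is on disk, so the old value is no longer needed. */
  void completeCacheFlush(String store) {
    flushing.remove(store);
  }

  /** Value safe to report to master: the old id wins while a flush is in flight. */
  long lowestToReport(String store) {
    Long inFlight = flushing.get(store);
    if (inFlight != null) {
      return inFlight;
    }
    return lowestUnflushed.getOrDefault(store, NONE);
  }

  // ---- Proposed approach ----

  /** After the flush, overwrite the mapping with the memstore's minimum id. */
  void updateAfterFlush(String store, long minSeqIdInMemstore) {
    if (minSeqIdInMemstore == NONE) {
      lowestUnflushed.remove(store);
    } else {
      lowestUnflushed.put(store, minSeqIdInMemstore);
    }
  }
}
```

In the proposed path the old value simply stays in lowestUnflushed until the 
flush has finished, so a crash in the middle of a flush never exposes a 
too-new sequence id and the second map is not needed.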



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
