[ 
https://issues.apache.org/jira/browse/HBASE-26791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503641#comment-17503641
 ] 

Wellington Chevreuil commented on HBASE-26791:
----------------------------------------------

Whilst the proposed solutions would handle the pitfalls of the file-based SFT 
implementation, isn't the broader issue here the fact that RS1 doesn't abort 
immediately upon the loss of its ZK lock? Shouldn't we rather ensure an RS 
abort is triggered and all ongoing operations (including any hstore flushes) 
are interrupted right away?

> Memstore flush fencing issue for SFT
> ------------------------------------
>
>                 Key: HBASE-26791
>                 URL: https://issues.apache.org/jira/browse/HBASE-26791
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 3.0.0-alpha-3
>            Reporter: Szabolcs Bukros
>            Priority: Major
>
> The scenario is the following:
>  # rs1 is flushing file to S3 for region1
>  # rs1 loses ZK lock
>  # region1 gets assigned to rs2
>  # rs2 opens region1
>  # rs1 completes flush and updates sft file for region1
>  # rs2 has a different “version” of the sft file for region1
> The flush should fail at the end, but the SFT file gets overwritten before 
> that, resulting in potential data loss.
>  
> Potential solutions include:
>  * Adding a timestamp to the tracker file names. Together with creating a 
> new tracker file when an rs opens the region, this would allow us to list 
> the available tracker files before an update and compare the found 
> timestamps to the one stored in memory, to verify that the store still owns 
> the latest tracker file
>  * Using the existing timestamp in the tracker file content. This would also 
> require us to create a new tracker file when a new rs opens the region, but 
> instead of listing the available tracker files, we could try to load and 
> de-serialize the last tracker file and compare the timestamp found in it to 
> the one stored in memory.
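
To make the second proposal concrete, here is a minimal in-memory sketch of 
the timestamp comparison, under stated assumptions. All names in it 
(TrackerFenceSketch, TrackerFile, Store, openRegion, commitFlush, STORAGE) are 
hypothetical and are not HBase APIs; a plain map stands in for the file 
system. It replays the scenario from the description: rs1 opens region1, rs2 
opens it later and writes a newer tracker file, and rs1's late flush is 
rejected because the timestamp on storage no longer matches the one rs1 holds 
in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch; none of these names are HBase APIs. */
public class TrackerFenceSketch {

  /** Stand-in for the tracker file content: a timestamp plus the store file list. */
  static final class TrackerFile {
    final long timestamp;
    final List<String> storeFiles;

    TrackerFile(long timestamp, List<String> storeFiles) {
      this.timestamp = timestamp;
      this.storeFiles = storeFiles;
    }
  }

  /** Stand-in for the file system: region name -> latest tracker file. */
  static final Map<String, TrackerFile> STORAGE = new HashMap<>();

  /** A store as seen by one region server; remembers the timestamp it wrote on open. */
  static final class Store {
    final String region;
    final long ownedTimestamp;

    Store(String region, long ownedTimestamp) {
      this.region = region;
      this.ownedTimestamp = ownedTimestamp;
    }

    /** Records a flushed file, but only if the tracker on storage is still ours. */
    void commitFlush(String newFile) {
      TrackerFile latest = STORAGE.get(region);
      if (latest.timestamp != ownedTimestamp) {
        throw new IllegalStateException(
            "tracker for " + region + " was replaced by another owner");
      }
      List<String> files = new ArrayList<>(latest.storeFiles);
      files.add(newFile);
      STORAGE.put(region, new TrackerFile(ownedTimestamp, files));
    }
  }

  /** Opening a region writes a new tracker file that carries over the known store files. */
  static Store openRegion(String region, long now) {
    TrackerFile previous = STORAGE.get(region);
    List<String> files =
        previous == null ? new ArrayList<>() : new ArrayList<>(previous.storeFiles);
    STORAGE.put(region, new TrackerFile(now, files));
    return new Store(region, now);
  }

  public static void main(String[] args) {
    Store rs1 = openRegion("region1", 100L); // rs1 owns region1
    Store rs2 = openRegion("region1", 200L); // region1 reassigned to rs2
    try {
      rs1.commitFlush("stale-flush");        // rs1 finishes its flush late
      System.out.println("stale flush accepted");
    } catch (IllegalStateException e) {
      System.out.println("stale flush rejected: " + e.getMessage());
    }
    rs2.commitFlush("fresh-flush");          // the current owner can still update
    System.out.println(STORAGE.get("region1").storeFiles);
  }
}
```

Note that the sketch runs in a single thread, so nothing can happen between 
the check and the write in commitFlush. On a real file system or object store 
another writer could replace the tracker file in that gap, and the sketch does 
not address that.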



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
