[ 
https://issues.apache.org/jira/browse/HBASE-26969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534360#comment-17534360
 ] 

Szabolcs Bukros commented on HBASE-26969:
-----------------------------------------

I would like to start by stating that this issue grew bigger than just removing 
the renames and exposed multiple issues in the MOB-SFT interaction.

I have uploaded a draft PR containing my changes. I intend to use it as a 
reference to show the issues when it comes to using MOB on FileBased SFT.

My main problem was that while MOB files were already tracked in the hfile 
metadata, the "single source of truth" is widely distributed and not easily 
available.

Both the WriterCreationTracker and the StoreFileTracker hold RS-local data, and 
the MOB cleaner needs both to work reliably when FileBased SFT is used. Exposing 
this data and allowing the Master to request it from the RSes, collect it and run 
the cleaner based on it, while technically possible, looked less than 
optimal. It would result in a single cluster-wide spike that we should try to 
avoid, and considering the delay that certain RSes could have (uneven load, GC 
pauses, etc.) the data could already be outdated by the time the collection is 
done. So instead I tried to move the cleaner to the RSes. This solution also 
had its drawbacks.

MOB file names contain the encoded name of the region that created them, so the 
RS hosting that specific region can check its hfiles for references and can 
clean the file up if it does not find any. The problem comes with merge/split 
parent regions. When the parent region is archived, the new region's hfiles will 
still hold references to the old MOB files, but now the only way to tell whether 
the old MOB file is still referenced is to check every single hfile in every 
store belonging to the same column family, as the old cleaner did, because we 
cannot tell from its name where it could be referenced from. So while I 
moved the MOB cleaner to the RS level and reduced its scope to only clean up 
MOB files belonging to regions hosted by that RS, I had to leave a "global" MOB 
cleaner running on the Master to deal with MOB files created by archived regions 
but potentially still being referenced. And I think this is very ugly.
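
As a rough illustration of the RS-level scoping described above (not taken 
from the draft PR), the per-RS decision could be sketched as follows. All class 
and method names here are hypothetical, and the sketch assumes the encoded 
region name is the suffix after the last '_' in the MOB file name.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these names are HBase APIs.
final class RsMobCleanerSketch {

  // Assumption: the encoded region name is the suffix after the last '_'.
  static String creatingRegion(String mobFileName) {
    int sep = mobFileName.lastIndexOf('_');
    return sep < 0 ? "" : mobFileName.substring(sep + 1);
  }

  /**
   * Returns the MOB files this RS may delete: files created by a region it
   * hosts, not referenced by that RS's hfiles, and not currently being written.
   * Files created by any other region are skipped, which is why a global
   * cleaner is still needed for files created by archived parent regions.
   */
  static List<String> deletableOnThisRs(Collection<String> mobFiles,
      Set<String> regionsHostedHere,
      Set<String> referencedByLocalHFiles,
      Set<String> filesBeingWritten) {
    List<String> deletable = new ArrayList<>();
    for (String file : mobFiles) {
      if (!regionsHostedHere.contains(creatingRegion(file))) {
        continue; // created elsewhere: out of scope for this RS
      }
      if (referencedByLocalHFiles.contains(file) || filesBeingWritten.contains(file)) {
        continue; // still referenced, or a writer has not finished it yet
      }
      deletable.add(file);
    }
    return deletable;
  }
}
```

The skipped files are exactly the ones the local check cannot decide, since 
their names say nothing about which surviving regions may still reference them.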

This whole process could have been significantly simpler if we had 
tracker files in MOB stores, but then we would have TWO competing sources of 
truth: the tracker files and the hfile metadata.

HBASE-27017 is a related issue where the snapshot code tries to get the active 
MOB files based on the configured SFT, but since MOB stores do not have tracker 
files it returns an empty list. If the store had tracker files it would work. 
Without a tracker file we either include every MOB file in the dir (garbage 
included) or scan every single hfile's metadata for MOB references.
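
To make the two fallbacks concrete, here is a minimal sketch with hypothetical 
names and plain collections standing in for the directory listing and the 
per-hfile MOB references. It is not the snapshot code, only an illustration of 
the trade-off.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; none of these names are HBase APIs.
final class ActiveMobFilesSketch {

  // Option 1: treat everything in the MOB dir as active (cheap, includes garbage).
  static Set<String> everythingInDir(Set<String> mobDirListing) {
    return new TreeSet<>(mobDirListing);
  }

  // Option 2: union of the MOB references found in every hfile's metadata
  // (precise, but requires reading the metadata of every hfile).
  static Set<String> referencedInMetadata(Map<String, Set<String>> mobRefsByHFile) {
    Set<String> active = new TreeSet<>();
    for (Set<String> refs : mobRefsByHFile.values()) {
      active.addAll(refs);
    }
    return active;
  }
}
```

The difference between the two results is the garbage that the first option 
would carry into a snapshot.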

What I'm trying to say is that while I think my solution would work and solve 
the immediate issues, I would much prefer to have a centralized, easily 
available list of active MOB files and to build a solution on top of that.

[~apurtell], [~zhangduo], [~elserj], [~wchevreuil] What do you think?

> Eliminate MOB renames when SFT is enabled
> -----------------------------------------
>
>                 Key: HBASE-26969
>                 URL: https://issues.apache.org/jira/browse/HBASE-26969
>             Project: HBase
>          Issue Type: Task
>          Components: mob
>    Affects Versions: 2.5.0, 3.0.0-alpha-3
>            Reporter: Szabolcs Bukros
>            Assignee: Szabolcs Bukros
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-alpha-3
>
>
> MOB file compaction and flush still rely on renames even when SFT is 
> enabled.
> My proposed changes are:
>  * when requireWritingToTmpDirFirst is false during mob flush/compact, instead 
> of using the temp writer we should create a different writer using a 
> StoreFileWriterCreationTracker that writes directly to the mob store folder
>  * these StoreFileWriterCreationTrackers should be stored in the MobStore. 
> This would require us to extend MobStore with a createWriter and a 
> finalizeWriter method to handle this
>  * refactor MobFileCleanerChore to run on the RS instead of on Master to allow 
> access to the StoreFileWriterCreationTrackers to make sure the currently 
> written files are not cleaned up



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
