[ 
https://issues.apache.org/jira/browse/HBASE-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak-Lon (Stephen) Wu resolved HBASE-27495.
------------------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> Improve HFileLinkCleaner to validate back reference links ahead the next 
> traverse 
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-27495
>                 URL: https://issues.apache.org/jira/browse/HBASE-27495
>             Project: HBase
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.5.2
>            Reporter: Tak-Lon (Stephen) Wu
>            Assignee: Tak-Lon (Stephen) Wu
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.2
>
>
> We found a race in the CleanerChore related to back reference links. When 
> the HFileLinkCleaner runs for a file, it makes one of two decisions depending 
> on the file type:
>  - HFiles: the cleaner only checks whether the .links-<> directory is 
> present and contains files. 
>  - Back reference links: the cleaner checks whether the forward link is 
> still available in the data directory.
> The order in which the cleaner checks these two kinds of files matters. When 
> the back reference is checked first, the cleaner can remove both the 
> reference and the HFile from the archive. When the HFile is checked first, 
> only the back reference is removed, and the HFile is deleted in the next 
> iteration of the CleanerChore. That can be very slow when the list of files 
> is huge, for example when an object store is used.
> The goal of this task is to improve the traversal of archived HFiles by 
> reusing the list of found back reference files and immediately applying the 
> back reference link checks.
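
The ordering problem described above can be illustrated with a small in-memory
sketch. This is not the HBase implementation: the class LinkCleanerSketch, the
Archive model, and all method names are hypothetical, and a back reference is
treated as stale simply when its name is missing from a set of live forward
links. It only contrasts the old HFile rule (keep the file while the .links-<>
directory has any entry) with the proposed rule (validate each listed back
reference right away).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; none of these names come from HBase.
final class LinkCleanerSketch {

    static final class Archive {
        // archived HFile name -> names of the back references in its .links-<> directory
        final Map<String, Set<String>> backRefs = new HashMap<>();
        // forward links that still exist in the data directory (simplification:
        // a back reference and its forward link share the same name)
        final Set<String> liveForwardLinks = new HashSet<>();
    }

    // Old HFile rule: deletable only if the back reference directory has no entries.
    static boolean hfileDeletableOld(Archive a, String hfile) {
        return a.backRefs.getOrDefault(hfile, Collections.emptySet()).isEmpty();
    }

    // Back reference rule: deletable once its forward link is gone from the data directory.
    static boolean backRefDeletable(Archive a, String backRef) {
        return !a.liveForwardLinks.contains(backRef);
    }

    // Proposed HFile rule: reuse the listed back references and validate each one now.
    static boolean hfileDeletableNew(Archive a, String hfile) {
        for (String ref : a.backRefs.getOrDefault(hfile, Collections.emptySet())) {
            if (!backRefDeletable(a, ref)) {
                return false; // at least one forward link is still alive
            }
        }
        return true;
    }

    // One cleaner pass that visits the HFile first and its back references second.
    // Returns true if the HFile was deleted during this pass.
    static boolean runPass(Archive a, String hfile, boolean validateBackRefs) {
        boolean deleteHFile = validateBackRefs
            ? hfileDeletableNew(a, hfile)
            : hfileDeletableOld(a, hfile);
        Set<String> refs = a.backRefs.getOrDefault(hfile, new HashSet<>());
        refs.removeIf(ref -> backRefDeletable(a, ref)); // stale back references go away
        if (deleteHFile) {
            a.backRefs.remove(hfile);
        }
        return deleteHFile;
    }

    public static void main(String[] args) {
        Archive oldRule = new Archive();
        oldRule.backRefs.put("hfile1", new HashSet<>(Arrays.asList("ref1")));
        System.out.println("old rule, pass 1: " + runPass(oldRule, "hfile1", false));
        System.out.println("old rule, pass 2: " + runPass(oldRule, "hfile1", false));

        Archive newRule = new Archive();
        newRule.backRefs.put("hfile1", new HashSet<>(Arrays.asList("ref1")));
        System.out.println("new rule, pass 1: " + runPass(newRule, "hfile1", true));
    }
}
```

In this model the old rule needs two passes to delete an HFile whose only back
reference is stale, while the validating rule deletes it in the first pass. An
HFile with a live forward link is kept under both rules.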



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
