Tak-Lon (Stephen) Wu created HBASE-27495:
--------------------------------------------

             Summary: Improve HFileLinkCleaner to validate back reference links 
ahead the next traverse 
                 Key: HBASE-27495
                 URL: https://issues.apache.org/jira/browse/HBASE-27495
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 2.6.0, 3.0.0-alpha-4, 2.5.2
            Reporter: Tak-Lon (Stephen) Wu
            Assignee: Tak-Lon (Stephen) Wu


We found a race in the CleanerChore related to back reference links. When the 
HFileLinkCleaner runs for a file, it makes one of two decisions depending on 
the file type:
 - HFiles: the cleaner only checks whether the .links-<> directory is present 
and contains files. 
 - Back reference links: the cleaner checks whether the forward link is still 
available in the data directory.

The order in which the cleaner checks these two files matters. When the back 
reference is checked first, the cleaner can remove both the reference and the 
HFile from the archive in the same run. When it checks the HFile first, only 
the back reference is removed. In that case the HFile is deleted only in the 
next iteration of the CleanerChore, which can be very slow when the list of 
files is huge, for example when using an object store.
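
The order dependence described above can be illustrated with a small 
standalone simulation. This is only a sketch and does not use any HBase 
classes: the class name CleanerOrderSketch, the method names, and the 
in-memory sets standing in for the archive and data directories are all 
hypothetical. It assumes a single archived HFile with one stale back reference 
whose forward link no longer exists.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical simulation; none of these names come from HBase itself.
public class CleanerOrderSketch {

    /** Runs one simulated cleaner pass and reports whether the archived HFile survives it. */
    static boolean hfileSurvivesOnePass(boolean backReferenceFirst) {
        // Back references stored under the HFile's .links-<> directory in the archive.
        Set<String> backRefs = new HashSet<>();
        backRefs.add("backref-1");
        // Forward links still present in the data directory (none, so the back reference is stale).
        Set<String> forwardLinks = new HashSet<>();

        boolean hfilePresent;
        if (backReferenceFirst) {
            // The stale back reference goes first, so the links directory is empty
            // by the time the HFile is checked and the HFile can go in the same pass.
            cleanBackRefs(backRefs, forwardLinks);
            hfilePresent = !backRefs.isEmpty();
        } else {
            // The HFile is checked while the links directory still has a file, so it is kept.
            hfilePresent = !backRefs.isEmpty();
            cleanBackRefs(backRefs, forwardLinks);
        }
        return hfilePresent;
    }

    /** Removes every back reference whose forward link is gone from the data directory. */
    static void cleanBackRefs(Set<String> backRefs, Set<String> forwardLinks) {
        backRefs.removeIf(ref -> !forwardLinks.contains(ref));
    }

    public static void main(String[] args) {
        System.out.println("back reference first, HFile survives: " + hfileSurvivesOnePass(true));
        System.out.println("HFile first, HFile survives: " + hfileSurvivesOnePass(false));
    }
}
```

In this sketch the HFile is removed in the same pass only when the back 
reference is visited first; when the HFile is visited first it survives until 
a later pass, which matches the delay described in this issue.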



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
