[ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559489#comment-13559489
 ] 

Jonathan Hsieh commented on HBASE-7643:
---------------------------------------

I misspoke -- Matteo pointed out to me that #3 isn't a problem due to compactions 
but more likely due to splits (compactions only create new files, while splits create 
new dirs, and the parent dir is the likely deletion candidate).  The high-level 
point still stands -- if a compaction happens while the cleaner deletes the 
directory, the rename attempt can fail.  
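The failing rename can be reproduced on a plain local filesystem. A minimal sketch, with `java.nio.file` standing in for HDFS's FileSystem API and all names illustrative (this is not HBase code): the archiver creates the archive dir, the cleaner deletes it while empty, and the subsequent rename into it fails.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;

public class ArchiveRaceSketch {
    // Simulates the archiver's mkdir + rename pair; when cleanerDeletesDir is
    // true, the (empty) archive dir is removed between the two calls, as the
    // HFileCleaner chore could do. Returns whether the rename succeeded.
    static boolean tryArchive(boolean cleanerDeletesDir) {
        try {
            Path root = Files.createTempDirectory("race");
            Path hfile = Files.createFile(root.resolve("hfile1"));
            Path archiveDir = root.resolve("archive");

            Files.createDirectories(archiveDir);   // archiver: ensure dir exists
            if (cleanerDeletesDir) {
                Files.delete(archiveDir);          // cleaner: removes empty dir
            }
            try {
                Files.move(hfile, archiveDir.resolve("hfile1"));
                return true;
            } catch (IOException e) {
                return false;                      // parent dir is gone: rename fails
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);     // setup failure, not part of the race
        }
    }

    public static void main(String[] args) {
        System.out.println(tryArchive(false));     // no race: true
        System.out.println(tryArchive(true));      // cleaner raced us: false
    }
}
```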
                
> HFileArchiver.resolveAndArchive() race condition and snapshot data loss
> -----------------------------------------------------------------------
>
>                 Key: HBASE-7643
>                 URL: https://issues.apache.org/jira/browse/HBASE-7643
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: hbase-6055, 0.96.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Blocker
>             Fix For: 0.96.0, 0.94.5
>
>
>  * The master has an hfile cleaner thread (that is responsible for cleaning 
> the /hbase/.archive dir)
>  ** /hbase/.archive/table/region/family/hfile
>  ** if the table/region/family directory is empty, the cleaner removes it
>  * The master can archive files (from another thread, e.g. DeleteTableHandler)
>  * The region can archive files (from another server/process, e.g. compaction)
> The simplified file archiving code looks like this:
> {code}
> HFileArchiver.resolveAndArchive(...) {
>   // ensure that the archive dir exists
>   fs.mkdir(archiveDir);
>   // move the file to the archive
>   success = fs.rename(originalPath/fileName, archiveDir/fileName)
>   // if the rename failed, delete the file without archiving
>   if (!success) fs.delete(originalPath/fileName);
> }
> {code}
> Since there's no synchronization between HFileArchiver.resolveAndArchive() 
> and the cleaner run (different process, thread, ...), you can end up moving 
> something into a directory that doesn't exist.
> {code}
> fs.mkdir(archiveDir);
> // HFileCleaner chore starts at this point
> // and the archiveDirectory that we just ensured to be present gets removed.
> // The rename at this point will fail since the parent directory is missing.
> success = fs.rename(originalPath/fileName, archiveDir/fileName)
> {code}
> The bad thing about deleting the file without archiving it is that if you 
> have a snapshot, or a cloned table, that relies on that file being present, 
> you're losing data.
> Possible solutions
>  * Create a ZooKeeper lock to notify the master ("Hey, I'm archiving 
> something, wait a bit")
>  * Add an RS -> Master call to let the master remove files and avoid this 
> kind of situation
>  * Avoid removing empty directories from the archive if the table exists or 
> is not disabled
>  * Add a try/retry around the fs.rename
> The last one, the easiest one, looks like:
> {code}
> for (int i = 0; i < retries; ++i) {
>   // ensure the archive directory is present
>   fs.mkdir(archiveDir);
>   // ----> possible race <-----
>   // try to archive file
>   success = fs.rename(originalPath/fileName, archiveDir/fileName);
>   if (success) break;
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
