[ 
https://issues.apache.org/jira/browse/HBASE-18398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117689#comment-16117689
 ] 

Ashu Pachauri commented on HBASE-18398:
---------------------------------------

Attaching patch for master. The patch adds another region operation called 
SNAPSHOT. Since the unit of a snapshot operation (the unit of manifest that's 
written to the filesystem) is a region, the operation takes locks for all 
stores of the region being processed. The locks taken are:
1. Region-level read lock, to avoid state changes to the region.
2. Archive lock for all stores, to prevent movement of compacted files to the 
archive while the snapshot operation is in progress.
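
To make the locking order described above concrete, here is a minimal, 
self-contained sketch. It is not the code in the attached patch: the Region 
and Store classes, the archiveLock field and runUnderSnapshotLocks are 
hypothetical stand-ins built on java.util.concurrent locks, and only the 
order of acquisition (region read lock first, then every store's archive 
lock) is taken from the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-ins for illustration; these are not HBase classes.
final class SnapshotLockSketch {

    /** A store with a lock guarding the move of compacted files to the archive. */
    static final class Store {
        final String name;
        final ReentrantLock archiveLock = new ReentrantLock();

        Store(String name) {
            this.name = name;
        }
    }

    /** A region with a region-level read/write lock and its stores. */
    static final class Region {
        final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
        final List<Store> stores = new ArrayList<>();
    }

    /** Runs writeManifest while holding the region read lock and all archive locks. */
    static void runUnderSnapshotLocks(Region region, Runnable writeManifest) {
        Lock readLock = region.regionLock.readLock();
        readLock.lock(); // 1. region-level read lock: no region state changes
        List<Store> locked = new ArrayList<>();
        try {
            for (Store store : region.stores) {
                store.archiveLock.lock(); // 2. archive lock: no files moved to archive
                locked.add(store);
            }
            writeManifest.run();
        } finally {
            // Release in reverse order of acquisition.
            for (int i = locked.size() - 1; i >= 0; i--) {
                locked.get(i).archiveLock.unlock();
            }
            readLock.unlock();
        }
    }
}
```

Because the archive locks are held for the whole manifest write, an archiver 
that takes the same per-store lock would have to wait until the snapshot of 
that region is finished.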

> Snapshot operation fails with FileNotFoundException
> ---------------------------------------------------
>
>                 Key: HBASE-18398
>                 URL: https://issues.apache.org/jira/browse/HBASE-18398
>             Project: HBase
>          Issue Type: Sub-task
>          Components: snapshots
>            Reporter: Ashu Pachauri
>            Assignee: Ashu Pachauri
>             Fix For: 1.3.2
>
>         Attachments: HBASE-18398.master.001.patch
>
>
> Taking a snapshot fails with FileNotFoundException. Sequence of events:
>     * FlushSnapshotSubprocedure.RegionSnapshotTask takes a region-level read 
> lock.
>     * Call to HRegion#addRegionToSnapshot.
>     * Call to SnapshotManifest#addRegion. This gets the current list of store 
> files.
>     * RACE → File is marked as compacted away and HFileArchiver moves the 
> file to archive under store-level lock.
>     * SnapshotManifest#addRegion visits the stale list of store files one by 
> one. It makes a file.getStatus() call to get the length of each file. Since 
> the file object still points to the original file, file.getStatus() fails 
> with FileNotFoundException.
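
The race in the description can be imitated outside HBase. The sketch below 
is only an analogy under stated assumptions: it uses java.nio.file on the 
local filesystem instead of the Hadoop FileSystem API, the class and method 
names are hypothetical, and the failure therefore surfaces as 
NoSuchFileException rather than FileNotFoundException. It lists a directory, 
moves a file away, and then reads the file length through the stale listing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration; not HBase code.
final class StaleListingSketch {

    /** Returns true if reading a length through the stale listing fails. */
    static boolean staleListingFails() {
        try {
            Path storeDir = Files.createTempDirectory("store");
            Path archiveDir = Files.createTempDirectory("archive");
            Path hfile = Files.write(storeDir.resolve("hfile-1"), new byte[] {1, 2, 3});

            // Step 1: capture the current list of store files.
            List<Path> listing = new ArrayList<>();
            try (Stream<Path> files = Files.list(storeDir)) {
                files.forEach(listing::add);
            }

            // Step 2 (the race): the file is moved to the archive.
            Files.move(hfile, archiveDir.resolve("hfile-1"));

            // Step 3: visit the stale list and ask for each file's length.
            try {
                for (Path path : listing) {
                    Files.size(path);
                }
                return false;
            } catch (NoSuchFileException e) {
                return true; // the listed path no longer exists
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```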



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
