[ 
https://issues.apache.org/jira/browse/HDFS-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632101#comment-13632101
 ] 

Aaron T. Myers commented on HDFS-4529:
--------------------------------------

That sounds good to me as well. Thanks Nicholas. 
                
> Decide the semantic of concat with snapshots
> --------------------------------------------
>
>                 Key: HDFS-4529
>                 URL: https://issues.apache.org/jira/browse/HDFS-4529
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>
> The use case of concat is for copying large files across clusters using the 
> following steps.
> - Step 1: The blocks of a file in the source cluster are copied in parallel 
> to transient files in the destination cluster.
> - Step 2: Then the transient files in the destination cluster are 
> concatenated in order to obtain the original file.
> If a snapshot is taken in the destination cluster before Step 2, some 
> transient files may be captured in the snapshot.  Then what should happen?  
> The following are some alternatives:
> * (1) fail concat and keep the transient files in the snapshots;
> * (2) allow concat and keep the transient files in the snapshots;
> * (3) allow concat but remove the transient files from all snapshots.
> None of the solutions above is perfect.  Here are their drawbacks:
> For (1) and (2), the transient files will remain in the system until the 
> snapshots are deleted.  This is inefficient for the system since the files 
> are known to be transient.  (1) may force users to create files under 
> some non-snapshottable tmp directory in the first place.  However, this 
> complicates user applications, and existing applications may need to 
> be updated for the new policy.  Also, a non-snapshottable directory may not 
> exist since an admin may set the system root directory to be snapshottable.  
> For (3), the problem is that it seems to break the read-only snapshot 
> contract: some files that appear in a snapshot may disappear later on.
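
The three alternatives above can be illustrated with a small in-memory
sketch. This is only a toy model, not HDFS code: ToyNamespace,
ConcatRejected, and the policy argument are hypothetical names invented
for illustration, and a snapshot is assumed to be a frozen copy of the
file table. Policy numbers match alternatives (1) to (3).

```python
from dataclasses import dataclass, field


class ConcatRejected(Exception):
    """Hypothetical error for policy 1: a source file is in a snapshot."""


@dataclass
class ToyNamespace:
    # Current contents: file name -> ordered list of block ids.
    files: dict = field(default_factory=dict)
    # Snapshot name -> copy of `files` as it was when the snapshot was taken.
    snapshots: dict = field(default_factory=dict)

    def create(self, name, blocks):
        self.files[name] = list(blocks)

    def take_snapshot(self, snap):
        self.snapshots[snap] = {n: list(b) for n, b in self.files.items()}

    def concat(self, target, sources, policy):
        if policy not in (1, 2, 3):
            raise ValueError("policy must be 1, 2 or 3")
        # Source files that some snapshot has captured.
        captured = [s for s in sources
                    if any(s in snap for snap in self.snapshots.values())]
        if captured and policy == 1:
            # (1) fail concat; nothing is modified.
            raise ConcatRejected(f"sources captured in a snapshot: {captured}")
        if policy == 3:
            # (3) drop the transient files from every snapshot.
            for snap in self.snapshots.values():
                for s in captured:
                    snap.pop(s, None)
        # (2) and (3): append the source blocks to the target, in order,
        # and remove the sources from the current namespace.
        for s in sources:
            self.files[target].extend(self.files.pop(s))
```

Under policy 2 the snapshot still lists the transient files after concat,
so their blocks stay around until the snapshot is deleted. Under policy 3
the snapshot's contents change after it was taken, which is the read-only
contract concern.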

