[ 
https://issues.apache.org/jira/browse/HDFS-9763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138088#comment-15138088
 ] 

Colin Patrick McCabe edited comment on HDFS-9763 at 2/9/16 12:10 AM:
---------------------------------------------------------------------

Hmm.  Hive's directory merge operation doesn't seem like something that belongs 
inside HDFS.
* It can be implemented reasonably well outside the filesystem.
* Implementing an O(N) operation inside HDFS will either cause high latencies 
on the NameNode, or require tricky code that periodically drops the lock in the 
middle of a single operation.
* Other filesystems and storage systems for Hadoop will have trouble 
implementing merge, or may not be able to implement it at all (S3, for 
example), since they can't atomically move a bunch of entries.
* Hive may want to change how it handles directory merges, and it won't be able 
to do that easily if the code is merged into HDFS.

It seems like the TOCTOU in Hive can be avoided by using the {{rename}} variant 
that doesn't perform overwrites.  If we get an exception saying the file already 
exists, we simply choose another name and try again.  In the usual case, with no 
collision, this also cuts the number of operations to one per file, rather than 
two.
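
A rough sketch of that retry loop, to make the idea concrete.  Everything named 
here is hypothetical: {{NoOverwriteFs}} and {{InMemoryFs}} are stand-ins I made 
up so the example runs without a cluster, and the {{_copy_N}} suffix is just an 
illustrative naming scheme.  Against real HDFS, the {{rename}} call would 
presumably be something like {{FileContext#rename}} invoked without 
{{Options.Rename.OVERWRITE}}; treat that mapping as an assumption, not as 
tested code.

```java
import java.nio.file.FileAlreadyExistsException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class MergeSketch {
    /** Hypothetical stand-in for a filesystem whose rename fails if the target exists. */
    interface NoOverwriteFs {
        void rename(String src, String dst) throws FileAlreadyExistsException;
    }

    /** In-memory stand-in; Set.add plays the role of the atomic "create only if absent" check. */
    static final class InMemoryFs implements NoOverwriteFs {
        final Set<String> paths = ConcurrentHashMap.newKeySet();

        @Override
        public void rename(String src, String dst) throws FileAlreadyExistsException {
            if (!paths.contains(src)) {
                throw new IllegalStateException("no such source: " + src);
            }
            // add() returns false when dst is already present: no overwrite.
            if (!paths.add(dst)) {
                throw new FileAlreadyExistsException(dst);
            }
            paths.remove(src);
        }
    }

    /**
     * Moves src into dstDir with one rename per attempt and no separate
     * exists() check. On a collision, picks another name and tries again.
     * Returns the destination path that was finally used.
     */
    static String moveWithRetry(NoOverwriteFs fs, String src, String dstDir,
                                String name, int maxAttempts)
            throws FileAlreadyExistsException {
        String candidate = name;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String dst = dstDir + "/" + candidate;
            try {
                fs.rename(src, dst);   // usual case: succeeds on the first call
                return dst;
            } catch (FileAlreadyExistsException e) {
                candidate = name + "_copy_" + attempt;   // illustrative naming only
            }
        }
        throw new FileAlreadyExistsException(dstDir + "/" + name, null,
                "no free name after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws Exception {
        InMemoryFs fs = new InMemoryFs();
        fs.paths.add("/dir1/part-0");
        fs.paths.add("/dir2/part-0");   // forces one collision
        System.out.println(moveWithRetry(fs, "/dir1/part-0", "/dir2", "part-0", 10));
        // prints /dir2/part-0_copy_1
    }
}
```

The point of the sketch is that the existence check and the move are a single 
call, so there is no window between "check" and "use".  That only holds if the 
underlying rename itself is atomic with respect to the target check, which is 
what the no-overwrite {{rename}} is expected to give us on the NameNode.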


was (Author: cmccabe):
Hmm.  Hive's directory merge operation doesn't seem like something that belongs 
inside HDFS.
* It can be implemented reasonably well outside the filesystem.
* Implementing an O(N) operation inside HDFS will cause high latencies on the 
NameNode, or tricky code that needs to periodically drop the lock while 
performing a single operation.
* Other filesystems and storage systems for Hadoop will have trouble 
implementing merge, or may not be able to implement it at all (like s3, etc.) 
since they don't have the ability to atomically move a bunch of entries.
* Hive may want to change how it handles directory merges, and it won't be able 
to do that easily if the code was merged into HDFS.

It seems like the TOCTOU in Hive can be avoided by using the {{rename}} variant 
that doesn't perform overwrites.  This also cuts the number of operations to be 
one per file, rather than 2 per file.

> Add merge api
> -------------
>
>                 Key: HDFS-9763
>                 URL: https://issues.apache.org/jira/browse/HDFS-9763
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Ashutosh Chauhan
>            Assignee: Xiaobing Zhou
>
> It would be good to add a merge(Path dir1, Path dir2, ... ) API to HDFS. 
> The semantics would be to move all files under dir1 to dir2, renaming 
> files in case of collisions.
> In the absence of this API, Hive[1] has to check for a collision for each 
> file, then come up with a unique name and try again, and so on. This is 
> inefficient in multiple ways:
> 1) It generates a huge number of calls on the NN (at least 2 * the number of 
> source files in dir1).
> 2) It suffers from a TOCTOU[2] bug for the client-picked name in case of 
> collision.
> 3) The whole operation is not atomic.
> A merge API as outlined above would be immensely useful for Hive and 
> potentially for other HDFS users.
> [1] 
> https://github.com/apache/hive/blob/release-2.0.0-rc1/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2576
> [2] https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
