[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270805#comment-13270805 ]
Liyin Tang commented on HDFS-3370:
----------------------------------

bq. I agree that ds quota doesn't need to be changed when there are links in the same directory. I'm referring to the case of hardlinks across directories, i.e. /dir/dir2/file and /dir/dir3/hardlink. If dir2 and dir3 have separate ds quotas, then dir3 has to absorb the ds quota when the original file is removed from dir2. What if there is a /dir/dir4/hardlink2? Does dir3 or dir4 absorb the ds quota? What if neither has the necessary quota available?

Based on the same example you commented on: when linking /dir/dir2/file to /dir/dir3/hardlink, the operation increases the dsquota usage of dir3 but not of /dir, because dir3 is NOT a common ancestor while /dir is. If dir3 doesn't have enough dsquota available, the operation throws a quota exception. Likewise, creating /dir/dir4/hardlink2 charges the dsquota of dir4 as well. So the point is that each directory absorbs the dsquota only at link-creation time and releases it at link-deletion time.

From my understanding, the basic semantics of a hardlink is to allow users to create multiple logical files referencing the same set of blocks/bytes on disk, so that users can set different file-level attributes (such as owner, permission, and modification time) for each linked file. Since the linked files share the same set of blocks, block-level settings should be shared. It may be a little confusing to decide whether the replication factor in HDFS is a file-level or a block-level attribute. If we agree that the replication factor is a block-level attribute, then we should pay the overhead (wait time) when increasing it, just as when increasing the replication factor of a regular file, and the setReplication operation is supposed to fail if it would break the dsquota.
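To make the accounting concrete, here is a minimal sketch of the charge/release semantics described above: a hardlink charges the file's bytes against the dsquota of the link's own parent directory (dir3, dir4, ...) at creation time, throws if the quota would be exceeded, and releases the bytes at deletion time. This is illustrative only; the class and method names (QuotaModel, link, unlink) are hypothetical and not part of the HDFS API.

```java
import java.util.HashMap;
import java.util.Map;

class QuotaExceededException extends RuntimeException {
    QuotaExceededException(String dir) { super("dsquota exceeded in " + dir); }
}

class QuotaModel {
    private final Map<String, Long> dsQuota = new HashMap<>(); // per-directory limit
    private final Map<String, Long> dsUsed  = new HashMap<>(); // per-directory usage

    void setQuota(String dir, long bytes) { dsQuota.put(dir, bytes); }

    // Link creation: charge the file's bytes to the link's parent directory.
    // The common ancestor is not charged again, since it already accounts
    // for the shared blocks through the original file.
    void link(String dir, long fileBytes) {
        long used = dsUsed.getOrDefault(dir, 0L) + fileBytes;
        if (used > dsQuota.getOrDefault(dir, Long.MAX_VALUE)) {
            throw new QuotaExceededException(dir);
        }
        dsUsed.put(dir, used);
    }

    // Link deletion: release the bytes from that directory only.
    void unlink(String dir, long fileBytes) {
        dsUsed.put(dir, dsUsed.getOrDefault(dir, 0L) - fileBytes);
    }

    long used(String dir) { return dsUsed.getOrDefault(dir, 0L); }
}
```

Under this model, a link into dir3 that does not fit in dir3's dsquota fails immediately with a quota exception, independent of how much quota /dir or dir4 has, matching the "absorb on create, release on delete" rule above.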
> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLinks.pdf
>
>
> We'd like to add a new hardlink feature to HDFS that allows hardlinked files
> to share data without copying. Initially we will support hardlinking only
> closed files, but it could be extended to unclosed files as well.
> Among the many potential use cases of this feature, the following two are
> primarily used at Facebook:
> 1. It provides a lightweight way for applications like HBase to create a
> snapshot;
> 2. It allows an application like Hive to move a table to a different
> directory without breaking currently running Hive queries.