[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270805#comment-13270805 ]

Liyin Tang commented on HDFS-3370:
----------------------------------

bq. I agree that ds quota doesn't need to be changed when there are links in 
the same directory. I'm referring to the case of hardlinks across directories. 
Ie. /dir/dir2/file and /dir/dir3/hardlink. If dir2 and dir3 have separate ds 
quotas, then dir3 has to absorb the ds quota when the original file is removed 
from dir2. What if there is a /dir/dir4/hardlink2? Does dir3 or dir4 absorb the 
ds quota? What if neither has the necessary quota available?

Based on the same example you described: when linking /dir/dir2/file to 
/dir/dir3/hardlink, the link increases the dsquota usage of dir3 but not of 
/dir, because dir3 is NOT a common ancestor while /dir is. If dir3 does not 
have enough dsquota, the operation throws a quota exception. Likewise, if a 
/dir/dir4/hardlink2 is created, dir4 absorbs the dsquota as well. So the point 
is that each directory absorbs the dsquota at link-creation time and releases 
it at link-deletion time.
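The accounting described above could be sketched as follows. This is plain Python, not HDFS code; the Dir class, its fields, and the function names are illustrative assumptions, kept only to show which directories absorb and release dsquota:

```python
class QuotaExceededError(Exception):
    pass

class Dir:
    """Illustrative directory with a disk-space (ds) quota."""
    def __init__(self, name, ds_quota=float("inf")):
        self.name = name
        self.ds_quota = ds_quota   # configured space quota
        self.ds_used = 0           # space currently charged

def create_hardlink(src_ancestors, dst_ancestors, nbytes):
    # Only the target's ancestors that are NOT common with the source
    # absorb the quota; common ancestors (e.g. /dir in the example) were
    # already charged when the original file was written.
    to_charge = [d for d in dst_ancestors if d not in src_ancestors]
    for d in to_charge:                       # check first: all-or-nothing
        if d.ds_used + nbytes > d.ds_quota:
            raise QuotaExceededError(d.name)  # link creation fails
    for d in to_charge:
        d.ds_used += nbytes

def delete_hardlink(link_ancestors, surviving_ancestors, nbytes):
    # On deletion, release quota only from ancestors not shared with a
    # surviving link or with the original file.
    for d in link_ancestors:
        if d not in surviving_ancestors:
            d.ds_used -= nbytes
```

With /dir/dir2/file of 80 bytes already charged to /dir and dir2, linking into dir3 charges only dir3; linking into a dir4 whose quota is too small raises the exception before anything is charged.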


From my understanding, the basic semantics of a hardlink is to allow users to 
create multiple logical files referencing the same set of blocks/bytes on 
disk, so a user can set different file-level attributes, such as owner, 
permission, and modification time, for each linked file. 
Since the linked files share the same set of blocks, block-level settings 
must be shared. 
It may be a little confusing whether the replication factor in HDFS is a 
file-level or a block-level attribute. 
If we agree that the replication factor is a block-level attribute, then we 
must pay the overhead (wait time) when increasing it, just as when increasing 
the replication factor of a regular file, and the setReplication operation is 
supposed to fail if it would break the dsquota.
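The rule argued here, that raising a shared, block-level replication factor must be checked against the dsquota of every directory covering a linked file, could be sketched like this (plain Python, not HDFS code; the dict layout and function name are illustrative assumptions):

```python
def set_replication(file_size, old_repl, new_repl, dir_quotas):
    """dir_quotas: {dir_name: [ds_used, ds_quota]} for every directory
    on any linked file's path. Mutates dir_quotas on success."""
    # Replication is block-level, so the space delta applies to all links.
    delta = file_size * (new_repl - old_repl)
    if delta > 0:
        # Check every affected directory before changing anything,
        # so the operation fails atomically if any quota would break.
        for name, (used, limit) in dir_quotas.items():
            if used + delta > limit:
                raise ValueError("dsquota exceeded in " + name)
    for name in dir_quotas:
        dir_quotas[name][0] += delta
    return dir_quotas
```

For a 50-byte file raised from replication 3 to 5, each covering directory is charged an extra 100 bytes; if any one of them cannot absorb that, the whole operation fails.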

                
> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLinks.pdf
>
>
> We'd like to add a new hardlink feature to HDFS that allows hardlinked files 
> to share data without copying. Currently we will support hardlinking only 
> closed files, but it could be extended to unclosed files as well.
> Among the many potential use cases of the feature, the following two are 
> primarily used at Facebook:
> 1. It provides a lightweight way for applications like HBase to create a 
> snapshot.
> 2. It allows an application like Hive to move a table to a different 
> directory without breaking currently running Hive queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
