[ 
https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270780#comment-13270780
 ] 

Daryn Sharp commented on HDFS-3370:
-----------------------------------

I'm glad you find my questions helpful!

bq. For example, "ln /root/dir1/file1 /root/dir1/file2" : there is no need to 
increase the ds quota usage when creating the link file: file2.  Also "rm 
/root/dir1/file1" : there is no need to decrease the ds quota usage when 
removing the original source file: file1.

I agree that ds quota doesn't need to change when the links are in the same 
directory.  I'm referring to the case of hardlinks across directories, e.g. 
/dir/dir2/file and /dir/dir3/hardlink.  If dir2 and dir3 have separate ds 
quotas, then dir3 has to absorb the ds quota when the original file is removed 
from dir2.  What if there is also a /dir/dir4/hardlink2?  Does dir3 or dir4 
absorb the ds quota?  What if neither has the necessary quota available?
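
To make the question concrete, here is a toy model of the scenario above.  It 
is only a sketch, not HDFS code: QuotaDir, candidates_to_absorb, and the sizes 
are all made up for illustration, and it assumes the ds charge is file size 
times replication.  It does not pick a policy; it only lists which of the 
remaining link directories could take the charge at all.

```python
from dataclasses import dataclass


@dataclass
class QuotaDir:
    """Hypothetical directory with a diskspace (ds) quota, in bytes."""
    name: str
    ds_quota: int
    ds_used: int = 0

    def headroom(self) -> int:
        return self.ds_quota - self.ds_used


def candidates_to_absorb(charge, remaining_link_dirs):
    """Names of directories that still hold a link and have room for `charge`."""
    return [d.name for d in remaining_link_dirs if d.headroom() >= charge]


MIB = 1024 ** 2
# /dir/dir2/file is the original; dir3 and dir4 each hold a hard link to it.
charge = 128 * MIB * 3  # assumed: 128 MiB file, replication factor 3

dir2 = QuotaDir("dir2", ds_quota=1024 * MIB, ds_used=charge)
dir3 = QuotaDir("dir3", ds_quota=1024 * MIB, ds_used=900 * MIB)
dir4 = QuotaDir("dir4", ds_quota=1024 * MIB, ds_used=100 * MIB)

# Removing the original releases dir2's charge; some other directory must
# now account for the blocks that are still referenced.
dir2.ds_used -= charge
print(candidates_to_absorb(charge, [dir3, dir4]))  # ['dir4']
```

With these numbers only dir4 has room.  If dir4 were as full as dir3, the list 
would be empty, which is exactly the case I'm asking about.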

bq.  Currently, at least for V1, we shall support hardlinking only for closed 
files and won't support append operations against linked files, but it could 
be extended in the future.

A reasonable approach, but it may lead to user confusion.  It almost begs for 
an immutable flag (i.e. chattr +i/-i) to prevent inadvertent hard linking to 
files intended to be mutable.

Nonetheless, I'd suggest exploring the difficulties reconciling the current 
design of the namesystem/block management with your design.  It may help avoid 
boxing ourselves into a corner with limited hard link support.

bq.  From my understanding, the setReplication is just a memory footprint 
update and the name node will increase actual replication in the background.

Yes, but the FsShell setrep command actively monitors the files and does not 
exit until the replication factor is what the user requested, as determined 
by the number of hosts per block.  Another consideration is that ds quota is 
based on a multiple of the replication factor.  So who is allowed to change 
the replication factor, given that increasing it may impact a different user's 
quota?
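
A quick arithmetic sketch of the quota concern.  The helper names and numbers 
are hypothetical, and it assumes one possible accounting in which every 
directory holding a link is charged file size times replication; the design 
doc does not say this is how it would work.

```python
MIB = 1024 ** 2


def ds_charge(file_size, replication):
    """Assumed ds accounting: every replica counts against the quota."""
    return file_size * replication


def dirs_over_quota(file_size, old_rep, new_rep, headroom_by_dir):
    """Directories whose remaining ds headroom cannot cover the change."""
    delta = ds_charge(file_size, new_rep) - ds_charge(file_size, old_rep)
    return sorted(name for name, headroom in headroom_by_dir.items()
                  if headroom < delta)


# One user raises replication from 3 to 5 on a 128 MiB file that is linked
# from dir2 and dir3, which may belong to different users.
headroom = {"dir2": 300 * MIB, "dir3": 100 * MIB}
print(dirs_over_quota(128 * MIB, 3, 5, headroom))  # ['dir3']
```

The change costs an extra 256 MiB, so under this accounting a setrep issued 
through dir2 would push dir3 over its quota.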
                
> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLinks.pdf
>
>
> We'd like to add a new feature, hardlink, to HDFS that allows hardlinked 
> files to share data without copying. Currently we will support hardlinking 
> only closed files, but it could be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are 
> primarily used at Facebook:
> 1. This provides a lightweight way for applications like HBase to create a 
> snapshot;
> 2. This also allows an application like Hive to move a table to a different 
> directory without breaking currently running Hive queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
