[ 
https://issues.apache.org/jira/browse/HADOOP-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519971#comment-14519971
 ] 

Sanjay Radia commented on HADOOP-9984:
--------------------------------------

bq. The problem with dereferencing all symlinks in listStatus is that it's 
disastrously inefficient

# In the proposal, listStatus2 is the new API that replaces listStatus.
# All our libraries need to be changed to use listStatus2 (see item 3 in the 
proposal).
# Customers who have old code that calls the old listStatus and cannot convert 
that code immediately can disable symlinks, not use symlinks, or use symlinks 
sparingly. In practice I don't think there will be dirs with even tens of 
symlinks (but symlink2 addresses the problem going forward).

bq.  isSymlink is broken for dangling symlinks, FileSystem#rename is broken for 
symlinks, the behavior of symlinks in globStatus is controversial, distCp 
doesn't support it, ...
These are fixable. I think this jira itself was attempting to fix some of these 
when we ran into the design flaw of the original listStatus.

bq.  cross-filesystem symlinks ...
As I pointed out, this needs to be discussed. Let me make a separate comment 
that summarizes the cross-namespace issues that have been presented in the 
various comments in this and other jiras.


> FileSystem#globStatus and FileSystem#listStatus should resolve symlinks by 
> default
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-9984
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9984
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs
>    Affects Versions: 2.1.0-beta
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>         Attachments: HADOOP-9984.001.patch, HADOOP-9984.003.patch, 
> HADOOP-9984.005.patch, HADOOP-9984.007.patch, HADOOP-9984.009.patch, 
> HADOOP-9984.010.patch, HADOOP-9984.011.patch, HADOOP-9984.012.patch, 
> HADOOP-9984.013.patch, HADOOP-9984.014.patch, HADOOP-9984.015.patch
>
>
> During the process of adding symlink support to FileSystem, we realized that 
> many existing HDFS clients would be broken by listStatus and globStatus 
> returning symlinks.  One example is applications that assume that 
> !FileStatus#isFile implies that the inode is a directory.  As we discussed in 
> HADOOP-9972 and HADOOP-9912, we should default these APIs to returning 
> resolved paths.
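
To make the client breakage described in the issue concrete, here is a minimal sketch of the "not a file, so it must be a directory" assumption. It does not use the Hadoop API: the Kind enum and both helper methods are hypothetical stand-ins for what a client might derive from a FileStatus, chosen only to keep the example self-contained.

```java
import java.util.List;

public class EntryKindSketch {
    // Hypothetical stand-in for the kinds of entry a listing may return
    // once symlinks are not resolved.
    public enum Kind { FILE, DIRECTORY, SYMLINK }

    // Legacy client assumption: anything that is not a file is a directory.
    public static boolean legacyIsDirectory(Kind k) {
        return k != Kind.FILE;
    }

    // Symlink-aware check: only a real directory counts as one.
    public static boolean isDirectory(Kind k) {
        return k == Kind.DIRECTORY;
    }

    // Counts the entries for which the legacy assumption gives the wrong answer.
    public static long countMisclassified(List<Kind> listing) {
        return listing.stream()
                .filter(k -> legacyIsDirectory(k) != isDirectory(k))
                .count();
    }

    public static void main(String[] args) {
        List<Kind> listing = List.of(Kind.FILE, Kind.DIRECTORY, Kind.SYMLINK);
        System.out.println("misclassified entries: " + countMisclassified(listing));
    }
}
```

In this model only the symlink entry is misclassified, which is the case that disappears if the listing returns resolved paths by default.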



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
