[ https://issues.apache.org/jira/browse/HADOOP-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793442#action_12793442 ]
Eli Collins commented on HADOOP-6427:
-------------------------------------

See the symlink behavior I posted to HDFS-245:
https://issues.apache.org/jira/browse/HDFS-245?focusedCommentId=12791197&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12791197

Absolute links are resolved using the file system of the link's parent. So hdfs://host1/foo -> /bar is resolved using the hdfs://host1 file system because that's foo's parent. hdfs://host1/dir/foo -> /bar would also use the hdfs://host1 file system because that's where dir resides, *unless* dir is a symlink to eg hdfs://host2/dir, in which case /bar would be resolved on host2. Ie you fully resolve symlinks in the path leading up to the link to determine what the parent is. This is what I understand "symlinks are resolved relative to their source" to mean.

I think resolving according to the parent is most intuitive since it means the link always resolves to the same location regardless of the URI used to access the link. This is similar to the behavior you described for links that are "volume root relative" (not clear what "NN's root" refers to in your description in the presence of links across NNs).

The partially qualified syntax (a scheme but no host, eg hdfs:///foo) indicates that the link is resolved using the client's default file system. This is the same behavior you described for links "relative to the client root". The current syntax is a little goofy since the scheme is ignored: using hdfs:///foo to say "use the client's default file system" has nothing to do with "hdfs", since the client's default file system may be s3 or file etc. Replacing <some scheme>:/// with a special character (eg %), so you'd have %/dir/foo, is perhaps less confusing.
Another nice thing about this syntax (as opposed to using a partially qualified path) to indicate resolution using the client's default file system is that it sidesteps the fact that Hadoop doesn't support fully qualified paths with the "file" scheme (eg file://localhost/foo is an error, and so partially qualified paths with a file scheme are currently special cased not to be considered relative to the client's default file system).

In NFS, symlinks are opaque to (not interpreted by) the server and are always resolved on the client. Similarly, in the current patch symlinks are stored raw (ie the target is not modified) on the namenode and interpreted on the client. However, this doesn't preclude a future optimization that resolves relative and absolute ("volume relative") links on the namenode when the links don't span file systems, since in that case the final resolution is the same.

The path resolution *is different* from NFS since not all paths are resolved using the client's slash. I think HDFS semantics should differ here since HDFS uses URIs instead of Unix paths. For example, if a user with an HDFS file context creates the link /data/latest -> /2009/10 on host1 and another user with a *local* file context then accesses hdfs://host1/data/latest, it would seem confusing if they got a FileNotFoundException because the directory /2009/10 does not exist on their local file system.

Does this make sense? I think the above mostly jibes with what you've posted in the jira earlier.

> Add Path isQualified
> --------------------
>
> Key: HADOOP-6427
> URL: https://issues.apache.org/jira/browse/HADOOP-6427
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Eli Collins
> Assignee: Eli Collins
> Attachments: hadoop-6427-1.patch
>
>
> The Path class has a method to make a path qualified but not to query if the
> path is qualified. This is needed for HADOOP-64221. In addition this patch
> adds tests to TestPath that cover the file scheme.
> Note that "fully qualified" applies to domain names not URIs so this function and its
> tests also serve to define what we mean by a fully qualified path.
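
To make the terminology in the comment concrete, here is a minimal sketch that sorts a link target into the four shapes discussed: fully qualified (scheme and host), partially qualified (scheme but no host), absolute path, and relative path. It is built on java.net.URI rather than Hadoop's Path, and the class name LinkTargetSketch, the Kind enum, and the methods classify and isFullyQualified are all hypothetical; it is not the code in hadoop-6427-1.patch and may not match how Path parses edge cases.

```java
import java.net.URI;

// Illustrative only: uses java.net.URI, not org.apache.hadoop.fs.Path.
// Every name in this class is an assumption made for the example.
final class LinkTargetSketch {

    enum Kind {
        FULLY_QUALIFIED,      // scheme and authority, eg hdfs://host1/foo
        PARTIALLY_QUALIFIED,  // scheme but no authority, eg hdfs:///foo
        ABSOLUTE_PATH,        // no scheme, leading slash, eg /bar
        RELATIVE_PATH         // no scheme, no leading slash, eg bar/baz
    }

    // Classify a link target string by how much of a URI it carries.
    static Kind classify(String target) {
        URI uri = URI.create(target);
        if (uri.getScheme() != null) {
            return uri.getAuthority() != null
                    ? Kind.FULLY_QUALIFIED
                    : Kind.PARTIALLY_QUALIFIED;
        }
        String path = uri.getPath();
        return (path != null && path.startsWith("/"))
                ? Kind.ABSOLUTE_PATH
                : Kind.RELATIVE_PATH;
    }

    // One possible reading of "is this path qualified": scheme and authority both present.
    static boolean isFullyQualified(String target) {
        return classify(target) == Kind.FULLY_QUALIFIED;
    }

    public static void main(String[] args) {
        String[] targets = {"hdfs://host1/foo", "hdfs:///foo", "/bar", "bar/baz"};
        for (String t : targets) {
            System.out.println(t + " : " + classify(t));
        }
    }
}
```

Under the rules described in the comment, an ABSOLUTE_PATH target would be resolved using the file system of the link's (fully resolved) parent, and a PARTIALLY_QUALIFIED target using the client's default file system, with its scheme ignored.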