[ 
https://issues.apache.org/jira/browse/HDFS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-245:
-----------------------------

    Attachment: designdocv2.txt

While writing tests I noticed that the current API doesn't match POSIX 
semantics that closely. For example, if a path refers to a symlink then the 
symlink is not resolved: {{getFileStatus}} returns the FileStatus of the link 
rather than of what it points to (ie it behaves like {{lstat}} rather than 
{{stat}}), and likewise for {{open}}, {{setReplication}}, etc. While some APIs 
should act on the symlink itself (eg {{rename}}, {{delete}}), others need 
symlinks fully resolved. The design doc should specify the intended behavior 
of the FileContext API wrt symlinks. I attached an updated version and pasted 
the relevant section below. What do people think?


h2. FileContext APIs

This section specifies the behavior of the FileContext API when links are 
present in paths. The intent is to match POSIX semantics. For most functions, 
if symlinks are supported, all links leading up to the target of a path should 
automatically be resolved. Some functions will not resolve any links in a given 
path. Some functions will, if given a path that refers to a symlink, operate on 
the target of the symlink, while others will operate on the symlink itself. For 
example, {{setReplication}} and {{getFileBlockLocations}} act on the symlink 
target while {{delete}} and {{rename}} act on the symlink itself. 
Behavior is specified both for filesystems that do and do not support symlinks. 
To support symlink-aware utilities the FileContext API requires some new 
interfaces (eg equivalent to {{lstat}}) to indicate whether a path refers to a 
symlink.
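
As a rough illustration of the {{stat}}/{{lstat}} distinction referred to above, the sketch below uses the JDK's {{java.nio.file}} API on a local filesystem rather than FileContext. The class name {{StatVsLstat}} and its two helper methods are hypothetical names chosen for this example, and it assumes a local filesystem that permits creating symlinks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class StatVsLstat {
    /** stat-like: follows a trailing symlink and describes its target. */
    static BasicFileAttributes stat(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class);
    }

    /** lstat-like: describes the symlink itself, without following it. */
    static BasicFileAttributes lstat(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class,
                LinkOption.NOFOLLOW_LINKS);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-demo");
        Path file = Files.write(dir.resolve("file"), "hello".getBytes());
        Path link = Files.createSymbolicLink(dir.resolve("link"), file);

        System.out.println(stat(link).isSymbolicLink());  // false: target is a regular file
        System.out.println(stat(link).size());            // 5: size of the target
        System.out.println(lstat(link).isSymbolicLink()); // true: the link itself

        // Deleting the path removes the link (like unlink), not the target.
        Files.delete(link);
        System.out.println(Files.exists(file));           // true
    }
}
```

By analogy, {{stat}} here plays the role this section gives to {{getFileStatus}}, and {{lstat}} the role of the new {{getLinkFileStatus}}.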

- {{create}}, {{mkdir}} -- the path should not refer to a symlink since the 
path must not currently exist.

- {{delete}}, {{deleteOnExit}} -- if path refers to a symlink then the symlink 
is removed (like {{unlink}}).

- {{open}} -- if the given path refers to a symlink then the path is fully 
resolved.

- {{set|getWorkingDirectory}} -- if the given path refers to a symlink then the 
symlink is fully resolved when setting the working directory, ie if the working 
directory is changed to {{/link1/link2}} then subsequent queries of the working 
directory should return whatever {{link2}} points to.

- {{setReplication}} -- if the given path refers to a symlink then the path is 
fully resolved.

- {{setPermission}} -- if the given path refers to a symlink then the path is 
fully resolved (like {{chmod}}). Symlink access is determined by permissions of 
the target of the symlink.

- {{setOwner}} -- if the given path refers to a symlink then the path is fully 
resolved (like {{chown}}). We could add an {{lchown}} equivalent in the future.

- {{setTimes}} -- if the given path refers to a symlink then the path is fully 
resolved. Symlinks do not have access times.

- {{get|setFileChecksum}} -- if the given path refers to a symlink then the 
path is fully resolved, ie there are no checksums associated with symlinks.

- {{getFileStatus}} -- if the given path refers to a symlink then the path is 
fully resolved, ie returns the FileStatus of the file or directory the symlink 
points to.

- *new* {{getLinkFileStatus}} -- like {{lstat}}, if the given path refers to a 
symlink then the FileStatus of the symlink is returned, otherwise the results 
are the same as if {{getFileStatus}} was called. If symlink support is not 
enabled or the underlying filesystem does not support symlinks then the 
results are the same as if {{getFileStatus}} was called.

- {{isDirectory}}, {{isFile}} -- if the given path refers to a symlink then the 
path is fully resolved, ie if the symlink points to a directory then 
{{isDirectory}} returns true.

- *new* {{isLink}} -- returns true if the given path refers to a symlink. If 
symlink support is not enabled or the underlying filesystem does not support 
symlinks then {{isLink}} returns false.

- {{listStatus}} -- if the given path refers to a symlink then the path is 
fully resolved, ie the result is equivalent to calling {{listStatus}} with the 
target of the symlink.

- {{getFileBlockLocations}} -- if the given path refers to a symlink then the 
path is fully resolved, ie symlinks are not associated with blocks.

- {{getFsStatus}} -- if the given path refers to a symlink then the path is 
fully resolved, ie the FsStatus of the target of the symlink is returned.

- {{getLinkTarget}} -- only the first symlink in the given path is resolved. If 
symlink support is not enabled or the underlying filesystem does not support 
symlinks then an IOException is thrown.

- {{resolve}} -- all symlinks in the given path are resolved. If symlink 
support is not enabled or the underlying filesystem does not support symlinks 
then no symlinks are resolved.

- {{createSymlink(oldpath, newpath)}}
   -- newpath should not refer to a symlink since the path must not currently 
exist.
   -- _No symlinks are resolved in oldpath_. For example, if {{/link1}} points 
to {{/dir}}, then {{createSymlink("/link1/file", "/link1/link2")}} points 
{{link2}} to {{/link1/file}} (not {{/dir/file}}). The path {{/link1/link2}} 
then resolves as follows: {{/link1/link2}} -> {{/dir/link2}} -> 
{{/link1/file}} -> {{/dir/file}}.
   -- If symlink support is not enabled or the underlying filesystem does not 
support symlinks then an IOException is thrown.

- {{rename(oldpath, newpath)}}
   -- if oldpath refers to a symlink, the symlink is renamed (POSIX).
   -- if newpath refers to a symlink, the symlink is overwritten (POSIX), 
provided the OVERWRITE option is passed.
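
The resolution rules for {{createSymlink}}, {{getLinkTarget}} and {{resolve}} can be sketched with a small in-memory model. This is not the FileContext implementation: {{SymlinkModel}} and its methods are hypothetical, only absolute link targets are handled, regular files and directories are not tracked, and the hop limit of 32 is arbitrary. Reading "only the first symlink is resolved" as "substitute the leftmost link and stop" is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class SymlinkModel {
    private static final int MAX_HOPS = 32; // arbitrary loop guard (assumption)

    // Canonical location of each link -> target string, stored verbatim.
    private final Map<String, String> links = new HashMap<>();

    /** Stores target unresolved; only the parent of linkPath is resolved. */
    void createSymlink(String target, String linkPath) {
        int slash = linkPath.lastIndexOf('/');
        String parent = linkPath.substring(0, slash);
        String name = linkPath.substring(slash + 1);
        String key = (parent.isEmpty() ? "" : resolve(parent)) + "/" + name;
        if (links.containsKey(key)) {
            throw new IllegalStateException(key + " already exists");
        }
        links.put(key, target);
    }

    /** Substitutes the leftmost symlink in path, or returns path unchanged. */
    String getLinkTarget(String path) {
        String[] parts = path.split("/"); // "/a/b" -> ["", "a", "b"]
        StringBuilder prefix = new StringBuilder();
        for (int i = 1; i < parts.length; i++) {
            prefix.append('/').append(parts[i]);
            String target = links.get(prefix.toString());
            if (target != null) {
                // Replace the link with its target and keep the remainder.
                StringBuilder out = new StringBuilder(target);
                for (int j = i + 1; j < parts.length; j++) {
                    out.append('/').append(parts[j]);
                }
                return out.toString();
            }
        }
        return path;
    }

    /** Repeats getLinkTarget until no symlink remains in the path. */
    String resolve(String path) {
        for (int hops = 0; hops <= MAX_HOPS; hops++) {
            String next = getLinkTarget(path);
            if (next.equals(path)) {
                return path;
            }
            path = next;
        }
        throw new IllegalStateException("too many levels of symbolic links");
    }

    public static void main(String[] args) {
        SymlinkModel fs = new SymlinkModel();
        fs.createSymlink("/dir", "/link1");
        fs.createSymlink("/link1/file", "/link1/link2"); // stored at /dir/link2
        System.out.println(fs.getLinkTarget("/link1/link2")); // /dir/link2
        System.out.println(fs.resolve("/link1/link2"));       // /dir/file
    }
}
```

Because the target string is stored verbatim, {{link2}} keeps pointing through {{/link1}}: if {{/link1}} were later repointed, {{/link1/link2}} would resolve through the new location.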

> Create symbolic links in HDFS
> -----------------------------
>
>                 Key: HDFS-245
>                 URL: https://issues.apache.org/jira/browse/HDFS-245
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: 4044_20081030spi.java, designdocv1.txt, designdocv2.txt, 
> HADOOP-4044-strawman.patch, symlink-0.20.0.patch, symLink1.patch, 
> symLink1.patch, symLink11.patch, symLink12.patch, symLink13.patch, 
> symLink14.patch, symLink15.txt, symLink15.txt, symlink16-common.patch, 
> symlink16-hdfs.patch, symlink16-mr.patch, symlink17-common.txt, 
> symlink17-hdfs.txt, symlink18-common.txt, symlink19-common.txt, 
> symlink19-common.txt, symlink19-hdfs.txt, symLink4.patch, symLink5.patch, 
> symLink6.patch, symLink8.patch, symLink9.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file 
> that contains a reference to another file or directory in the form of an 
> absolute or relative path and that affects pathname resolution. Programs 
> which read or write to files named by a symbolic link will behave as if 
> operating directly on the target file. However, archiving utilities can 
> handle symbolic links specially and manipulate them directly.
