[ 
https://issues.apache.org/jira/browse/HDFS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-245:
-----------------------------

    Attachment: designdocv1.txt

Uploaded a first draft of a design doc (followed the template in HADOOP-5587).  
Content follows. Lemme know where I missed the boat.   Thanks, Eli


h1. Symlinks Design Doc

h2. Problem definition

HDFS path resolution has the following limitations:

* Files and directories are only accessible via a single path.

* An HDFS namespace may not span multiple file systems.

Symbolic links address these limitations by providing an additional level of 
indirection when resolving paths in an HDFS file system. A symbolic link is a 
special type of file that contains a path to another file or directory. Paths 
may be relative or absolute. Relative paths (eg {{../user}}) provide an 
alternate path to a single file or directory in the file system. An absolute 
path may be relative to the current file system (eg {{/user}}) or specify a URL 
(eg {{hdfs://localhost:8020/foo}}) which allows the link to point to any file 
or directory irrespective of the source and destination file systems.
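
As a rough illustration of the three kinds of link target described above, the sketch below classifies a target string using {{java.net.URI}}. The class and enum names ({{LinkTargets}}, {{Kind}}) are made up for this example and are not part of any existing Hadoop API; it also assumes a target counts as fully qualified exactly when it carries a URI scheme.

```java
import java.net.URI;

/** Hypothetical helper for illustration only; not an existing Hadoop class. */
final class LinkTargets {
    enum Kind { RELATIVE, ABSOLUTE_SAME_FS, FULLY_QUALIFIED }

    /** Classifies a symlink target by whether it has a scheme and a leading slash. */
    static Kind classify(String target) {
        URI uri = URI.create(target);
        if (uri.getScheme() != null) {
            // eg hdfs://localhost:8020/foo or har:///data/20090922.har
            return Kind.FULLY_QUALIFIED;
        }
        // No scheme: the target is interpreted within the current file system.
        return uri.getPath().startsWith("/") ? Kind.ABSOLUTE_SAME_FS : Kind.RELATIVE;
    }
}
```

Only targets in the last category can take path resolution out of the file system that holds the link.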

Allowing multiple paths to resolve to the same file or directory, and HDFS 
namespaces to span multiple file systems, makes it easier to access files and 
manage their underlying storage.


h2. Use Cases

If an application requires that data be available at a particular path, a 
symlink may be used in lieu of copying the data from its current location. For 
example, a user may want to create a symlink {{/data/latest}} that points to an 
existing directory so that the latest data is accessible via both its current 
name and an alias, eg:

{{$ hadoop fs -ln /data/20090922 /data/latest}}

The user may eventually want to archive this data so that it's accessible but 
stored more efficiently. They could create an archive of the files in the 
20090922 directory and make the original path a symlink to the HAR, eg:

{{$ hadoop fs -ln har:///data/20090922.har /data/20090922}}

They could also move the directory to another file system that is perhaps 
lightly loaded or less expensive and make the existing directory a symlink, eg:

{{$ hadoop fs -ln hdfs://archive-host/data/20090922 /data/20090922}}

The archival file system could also be accessible via an alternative protocol 
(eg FTP). In both cases the original data has moved but remains accessible by 
its original path.

This technique can be used generally to balance storage by transparently making 
a namespace span multiple file systems. For example, if a particular subtree of 
a namespace outgrows the capabilities of the file system it resides on (eg 
Namenode performance, number of files, etc) it can be moved to a new file 
system and linked into its current path.

A symbolic link is also a useful primitive that could be used to implement 
atomic rename within a file system by atomically rewriting a symbolic link, or 
to rename a file across partitions (if in the future a file system's metadata is 
partitioned across multiple hosts). See HADOOP-6240 for more info.


h2. Interaction with the Current System

The user may interact with symbolic links via the shell or indirectly via 
applications (eg libhdfs clients like fuse mounts).

Symbolic links are transparent to most operations though some may want to 
handle links specially. In general, the behavior should match POSIX where 
appropriate. Note that linking across file systems is somewhat equivalent to 
creating a symlink across mount points in POSIX.

Path resolution:
* Some commands operate on the link directly (eg stat, rm) if the link is the 
target.

* Some commands (eg mv) should operate on the link target if a trailing slash 
is used (eg if {{/bar}} is a link that points to a directory, {{mv /foo /bar}} 
renames {{/foo}} to {{/bar}} while {{mv /foo /bar/}} moves {{/foo}} into the 
directory pointed to by {{/bar}}).

* Symbolic links in archive URIs should fully resolve.

* Some APIs should operate on the link target (eg setting access and 
modification times).

*Permissions*: access control properties of links are ignored; checks are 
always performed against the link target (if it resides in an HDFS file 
system). The ch* operations should operate directly on the target.

*Some utilities need to be link-aware*:
* distcp should not follow links by default. 
* fsck should only look at each file once, and could optionally report dangling 
links.
* Symbolic links in HARs should be followed (so that a symlink to a HAR 
preserves the original path resolution behavior).

Symbolic links exist independently of their targets, and may point to 
non-existing files or directories.


h2. Requirements

Clients send the entire path to the Namenode for resolution. The path may 
contain multiple links:

* If all links in the path are relative to the current file system then the 
Namenode transparently (to the client) resolves the path.

* If the Namenode finds a link in the path that points outside the file system, 
it must provide API(s) that (a) report to the client that it cannot resolve the 
path (due to a link that points outside the file system) and (b) return the 
target of the link and the remainder of the path that still needs to be 
resolved.
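
To make the two cases above concrete, here is a minimal sketch of Namenode-side resolution over a toy in-memory link table. Everything in it ({{ToyNamespace}}, {{Result}}, {{MAX_HOPS}}) is hypothetical and only illustrates the shape of the result: either a fully resolved local path, or the external link target plus the remainder of the path. The hop limit on local links is an assumption of this sketch, not something the requirements above specify.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for Namenode path resolution; every name here is hypothetical. */
final class ToyNamespace {
    static final int MAX_HOPS = 32; // assumed guard against cycles among local links

    /** Either a resolved local path, or an external link target plus the unresolved remainder. */
    static final class Result {
        final String localPath;  // set when resolution finished inside this file system
        final String linkTarget; // set when a link pointing outside was found
        final String remainder;  // path components after that link ("" if none)

        Result(String localPath, String linkTarget, String remainder) {
            this.localPath = localPath;
            this.linkTarget = linkTarget;
            this.remainder = remainder;
        }

        boolean isExternal() { return linkTarget != null; }
    }

    private final Map<String, String> links = new HashMap<>(); // link path -> target

    void addLink(String linkPath, String target) { links.put(linkPath, target); }

    /** Walks an absolute path component by component, following local links as they appear. */
    Result resolve(String path) {
        String pending = path;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            String[] parts = pending.split("/");
            String prefix = "";
            boolean followed = false;
            for (int i = 0; i < parts.length && !followed; i++) {
                if (parts[i].isEmpty()) continue; // leading or doubled slash
                String parent = prefix;
                prefix = prefix + "/" + parts[i];
                String target = links.get(prefix);
                if (target == null) continue;     // ordinary file or directory
                String rest = String.join("/", Arrays.copyOfRange(parts, i + 1, parts.length));
                if (URI.create(target).getScheme() != null) {
                    // Link leaves this file system: hand target and remainder back to the client.
                    return new Result(null, target, rest);
                }
                // Local link: resolve the target against the link's parent directory and restart.
                String resolved = URI.create(parent + "/").resolve(target).getPath();
                pending = rest.isEmpty() ? resolved : resolved + "/" + rest;
                followed = true;
            }
            if (!followed) {
                return new Result(prefix.isEmpty() ? "/" : prefix, null, "");
            }
        }
        throw new IllegalStateException("too many local links while resolving " + path);
    }
}
```

With {{/data/latest}} linked to {{20090922}}, resolving {{/data/latest/part-0}} stays local; if {{/data/20090922}} is in turn linked to {{hdfs://archive-host/data/20090922}}, the same call reports that target with the remainder {{part-0}}.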

Symbolic links should be largely invisible to users of the client. However, 
symbolic links may introduce cycles into path resolution. For example, a link 
may point to another URI (on another file system) which points back to the 
link. Loops should be avoided by having the client limit the number of links it 
will traverse, and report to its user that the operation was not successful.

Symbolic links should not introduce significant overhead in the common case of 
resolving paths without links. Resolving symbolic links may be a frequent 
operation. If links are being used to transparently span a namespace across 
multiple file systems then the "root" Namenode may do little work aside from 
resolving link paths. Therefore, link resolution should have reasonable 
performance overhead and limited side-effects on both the client and Namenode.


h2. Design

One approach is to have the client first stat the file to see if it is a 
symbolic link and resolve the path. Once the path is fully resolved (this may 
require contacting additional file systems) the client performs the desired 
operation. This introduces an additional RPC in the common case (when links are 
not present) so an optimization is to optimistically perform the operation 
first. If it was unsuccessful due to the presence of an external link the 
Namenode notifies the client, using an exception, which is caught at the 
FileSystem level (since the link refers to another file system). The client 
then makes an additional call to the Namenode to resolve the link.* Once the 
path is fully resolved the operation can be re-tried. Note that if the 
operation contained multiple paths each may contain links which must be fully 
resolved before the operation can complete. This may require additional round 
trips. The call to resolve a link may fail if the link has been deleted, or is 
no longer a link, etc. In this case the resulting exception is passed upwards 
as if the original operation failed. As stated above, link resolution at the 
FileSystem level will perform a limited number of link resolutions before 
notifying the client of the failure.
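
The optimistic flow described above might look roughly like the following on the client side. This is only a sketch under stated assumptions: {{ToyFs}} stands in for a Namenode, {{ExternalLinkException}} and {{resolveLink}} are invented names for the notification and the follow-up call, and {{MAX_LINKS}} is an arbitrary limit chosen for the example.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Client-side sketch of "try first, resolve on failure"; all names are hypothetical. */
final class LinkRetrySketch {
    static final int MAX_LINKS = 8; // assumed limit on link traversals per operation

    /** Signals that the path crosses a link leaving this file system. */
    static final class ExternalLinkException extends IOException {
        ExternalLinkException(String path) { super("external link in " + path); }
    }

    /** Toy file system standing in for one Namenode. */
    static final class ToyFs {
        final Map<String, String> files = new HashMap<>(); // path -> contents
        final Map<String, String> links = new HashMap<>(); // link path -> target URI

        /** The "desired operation": fails if the path runs through an external link. */
        String read(String path) throws IOException {
            if (findLink(path) != null) throw new ExternalLinkException(path);
            String data = files.get(path);
            if (data == null) throw new FileNotFoundException(path);
            return data;
        }

        /** The follow-up call: returns the link target with the unresolved remainder appended. */
        String resolveLink(String path) throws IOException {
            String link = findLink(path);
            if (link == null) throw new IOException("not a link: " + path); // eg deleted meanwhile
            return links.get(link) + path.substring(link.length());
        }

        private String findLink(String path) {
            for (String link : links.keySet()) {
                if (path.equals(link) || path.startsWith(link + "/")) return link;
            }
            return null;
        }
    }

    /** Performs the read, following external links across file systems up to MAX_LINKS times. */
    static String read(Map<String, ToyFs> fsByAuthority, URI start) throws IOException {
        URI current = start;
        for (int followed = 0; followed <= MAX_LINKS; followed++) {
            ToyFs fs = fsByAuthority.get(current.getAuthority());
            if (fs == null) throw new IOException("no file system for " + current);
            try {
                return fs.read(current.getPath()); // optimistic: no extra call when no links
            } catch (ExternalLinkException e) {
                // Extra round trip to learn where the path continues, then retry there.
                current = URI.create(fs.resolveLink(current.getPath()));
            }
        }
        throw new IOException("too many links while resolving " + start);
    }
}
```

A failure from {{resolveLink}} propagates to the caller unchanged, matching the behavior described above for links that are deleted between the two calls.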

The Namenode's INodeFile class needs to maintain additional metadata to indicate 
whether a file is a symbolic link (or an additional class could be introduced).
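
For the "additional class" option, a sketch could look like the following. These types are invented for illustration and are not the actual Namenode inode classes; the point is only that a link inode stores its target verbatim and never checks that the target exists.

```java
/** Hypothetical inode types for illustration; not the actual Namenode classes. */
abstract class ToyINode {
    final String name;

    ToyINode(String name) { this.name = name; }

    /** Path resolution can branch on this without inspecting the concrete class. */
    boolean isSymlink() { return false; }
}

final class ToyINodeFile extends ToyINode {
    ToyINodeFile(String name) { super(name); }
}

final class ToyINodeSymlink extends ToyINode {
    final String target; // stored as given; may be relative, absolute, or a URI, and may dangle

    ToyINodeSymlink(String name, String target) {
        super(name);
        this.target = target;
    }

    @Override
    boolean isSymlink() { return true; }
}
```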

\* NB: It is possible to eliminate this additional RPC by piggybacking the 
link resolution on the notification.


h2. Future work

This design should cover the necessary use cases; however, some of the above 
features (like enhancing archives) may be deferred.

Related future work is implementing atomic rename within a file system by 
atomically re-writing a symbolic link.

> Create symbolic links in HDFS
> -----------------------------
>
>                 Key: HDFS-245
>                 URL: https://issues.apache.org/jira/browse/HDFS-245
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: 4044_20081030spi.java, designdocv1.txt, 
> HADOOP-4044-strawman.patch, symlink-0.20.0.patch, symLink1.patch, 
> symLink1.patch, symLink11.patch, symLink12.patch, symLink13.patch, 
> symLink14.patch, symLink15.txt, symLink15.txt, symlink16-common.patch, 
> symlink16-hdfs.patch, symlink16-mr.patch, symLink4.patch, symLink5.patch, 
> symLink6.patch, symLink8.patch, symLink9.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file 
> that contains a reference to another file or directory in the form of an 
> absolute or relative path and that affects pathname resolution. Programs 
> which read or write to files named by a symbolic link will behave as if 
> operating directly on the target file. However, archiving utilities can 
> handle symbolic links specially and manipulate them directly.

