Expose HDFS as a WebDAV store
-----------------------------

                 Key: HADOOP-496
                 URL: http://issues.apache.org/jira/browse/HADOOP-496
             Project: Hadoop
          Issue Type: New Feature
          Components: dfs
            Reporter: Michel Tourn


WebDAV stands for Web-based Distributed Authoring and Versioning. It is a set 
of extensions to the HTTP protocol that lets users collaboratively edit and 
manage files on a remote web server. It is often considered a replacement for 
NFS or Samba.
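
To make "extensions to the HTTP protocol" concrete: a WebDAV directory listing 
is a PROPFIND request, and the reply is a 207 Multi-Status XML document. The 
Python sketch below only illustrates that wire format as described in RFC 2518. 
The paths under /user/demo and the helper name parse_multistatus are made up 
for this example and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # XML namespace used by the core WebDAV elements

# Body of a PROPFIND request asking for two standard properties.
# A client would send it with a "Depth: 1" header to list one directory level.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<D:propfind xmlns:D="DAV:">'
    "<D:prop><D:getcontentlength/><D:resourcetype/></D:prop>"
    "</D:propfind>"
)

# Hand-written example of a 207 Multi-Status reply (hypothetical paths).
SAMPLE_MULTISTATUS = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/user/demo/</D:href>
    <D:propstat>
      <D:prop><D:resourcetype><D:collection/></D:resourcetype></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/user/demo/part-00000</D:href>
    <D:propstat>
      <D:prop>
        <D:getcontentlength>1024</D:getcontentlength>
        <D:resourcetype/>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>
"""


def parse_multistatus(xml_text):
    """Return a list of (href, is_directory, size_or_None) tuples."""
    entries = []
    root = ET.fromstring(xml_text)
    for resp in root.findall(DAV + "response"):
        href = resp.findtext(DAV + "href")
        # A directory ("collection") carries <D:collection/> in resourcetype.
        is_dir = resp.find(".//" + DAV + "collection") is not None
        length = resp.findtext(".//" + DAV + "getcontentlength")
        entries.append((href, is_dir, int(length) if length else None))
    return entries


if __name__ == "__main__":
    for href, is_dir, size in parse_multistatus(SAMPLE_MULTISTATUS):
        print("dir " if is_dir else "file", href, size)
```

A mounted client such as davfs2 does essentially this parsing on the user's 
behalf and presents the result as an ordinary directory listing.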

HDFS (Hadoop Distributed File System) needs a friendly file system interface. 
DFSShell commands are unfamiliar to most users; a mountable network drive 
would be more convenient. A friendly interface to HDFS would serve both casual 
browsing of data and bulk import/export. 

A FUSE provider for HDFS is already available 
( http://issues.apache.org/jira/browse/HADOOP-17 ), but it has had scalability 
problems. WebDAV is a popular alternative. 

The typical licensing terms for WebDAV tools are also attractive. 
The Linux client tools are GPL, but Hadoop would not redistribute them anyway. 
More importantly, the Java tools and server components come from Apache 
projects under the Apache license, which allows tighter integration with the 
HDFS code base.

There are some interesting Apache projects that support WebDAV, but they are 
probably too heavyweight for the needs of Hadoop:
Tomcat servlet: 
http://tomcat.apache.org/tomcat-4.1-doc/catalina/docs/api/org/apache/catalina/servlets/WebdavServlet.html
Slide:          http://jakarta.apache.org/slide/

Because WebDAV is HTTP-based and "backwards-compatible" with web browser 
clients, the WebDAV server could even be piggy-backed on the existing Web UI 
ports of the Hadoop name node / data nodes, hosted as (Jetty) servlets. 
This minimizes server code bloat and avoids additional network traffic 
between HDFS and the WebDAV server.
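
The piggy-backing idea can be sketched in a few lines: one HTTP port keeps 
answering GET for browsers and additionally answers the WebDAV methods 
OPTIONS and PROPFIND. This is a minimal illustration using Python's standard 
http.server, not the proposed Jetty servlet. The class name 
WebUiWithDavHandler, the path /user/demo/ and the canned PROPFIND reply are 
assumptions made for the example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from xml.sax.saxutils import escape


class WebUiWithDavHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: one port serves both the web page and WebDAV."""

    def do_GET(self):
        # Plain browsers keep working: GET returns the ordinary HTML page.
        self._reply(200, "text/html", b"<html><body>status page</body></html>")

    def do_OPTIONS(self):
        # WebDAV clients probe with OPTIONS and look for the DAV header.
        # "1" is compliance class 1, i.e. no locking support.
        self._reply(200, None, b"",
                    {"DAV": "1", "Allow": "OPTIONS, GET, PROPFIND"})

    def do_PROPFIND(self):
        # Drain the request body, if any, before answering.
        self.rfile.read(int(self.headers.get("Content-Length") or 0))
        # A real server would query the file system; this reply is canned.
        body = (
            '<D:multistatus xmlns:D="DAV:"><D:response>'
            "<D:href>%s</D:href><D:propstat>"
            "<D:prop><D:resourcetype><D:collection/></D:resourcetype></D:prop>"
            "<D:status>HTTP/1.1 200 OK</D:status>"
            "</D:propstat></D:response></D:multistatus>" % escape(self.path)
        ).encode("utf-8")
        self._reply(207, 'text/xml; charset="utf-8"', body)

    def _reply(self, status, content_type, body, extra_headers=None):
        self.send_response(status)
        if content_type:
            self.send_header("Content-Type", content_type)
        for name, value in (extra_headers or {}).items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def demo():
    """Start the server on a free local port and try all three methods."""
    server = HTTPServer(("127.0.0.1", 0), WebUiWithDavHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "OPTIONS", "PROPFIND"):
            conn = HTTPConnection("127.0.0.1", server.server_port)
            conn.request(method, "/user/demo/")
            resp = conn.getresponse()
            results[method] = (resp.status, resp.getheader("DAV"), resp.read())
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for method, (status, dav, _body) in demo().items():
        print(method, status, dav)
```

Because the handler lives in the same process as the web UI, no extra hop is 
needed between the file system and the WebDAV endpoint, which is the point 
made above about avoiding additional network traffic.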

General Clients (read-only):
Any web browser

Linux Clients: 
Mountable GPL davfs2  http://dav.sourceforge.net/
FTP-like  GPL Cadaver http://www.webdav.org/cadaver/

Server Protocol compliance tests:
http://www.webdav.org/neon/litmus/
A goal is for HDFS to pass this test suite (minus support for Properties).

Pure Java clients:
DAV Explorer (Apache license) http://www.ics.uci.edu/~webdav/

WebDAV also makes it convenient to add advanced features in an incremental 
fashion: file locking, access control lists, hard links, symbolic links. 
New WebDAV standards continue to be accepted, and the available WebDAV clients 
support them to varying degrees.
core                    http://www.webdav.org/specs/rfc2518.html
ACLs                    http://www.webdav.org/specs/rfc3744.html
redirects "soft links"  http://greenbytes.de/tech/webdav/rfc4437.html
BIND "hard links"       http://www.webdav.org/bind/
quota                   http://tools.ietf.org/html/rfc4331



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
