[ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121354#comment-13121354
 ] 

Sanjay Radia commented on HDFS-2178:
------------------------------------

Some thoughts on webhdfs & hoop:
* webhdfs is a significant rewrite of hoop - indeed as much as hoop was a 
rewrite of hdfsproxy. This was due to a number of reasons.
** Fundamental differences between a proxy (hoop) and a built-in file system. 
Examples are support for delegation tokens, support for trusted proxies (like 
oozie), and the need to redirect to a DN. These three do not make sense for 
a proxy (Hoop or hdfsproxy).
** Code cleanup and integration into servlets of NN and DN. These changes can 
apply to hoop and we can share the code.
** Parameter and return type cleanup (e.g. the root element issue); these also 
apply to hoop.

As we move forward we need to consider two things.
# What subset of webhdfs API makes sense for a proxy?
# Alternatively, shall we think of the proxy as a *pure* proxy, that is, should 
it merely redirect requests to the webhdfs API? Indeed, it could then be a proxy 
for *any* built-in REST API of the entire Hadoop system (not just HDFS). We can 
add features like user name and authentication mapping to the proxy. 
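
To make the "pure proxy" option concrete, here is a rough sketch of a proxy that only maps the caller's user name and then redirects to the webhdfs API. Everything in it is an assumption for illustration: the function name, the user map, the NameNode address, the use of HTTP 307, and the URL layout (path prefix plus op and user.name query parameters) are not taken from the hoop or webhdfs code.

```python
from urllib.parse import quote, urlencode

# Hypothetical mapping from external (proxy-side) identities to Hadoop user names.
USER_MAP = {"alice@EXAMPLE.COM": "alice"}

# Hypothetical NameNode HTTP address and webhdfs path prefix.
WEBHDFS_BASE = "http://namenode.example.com:50070/webhdfs/v1"


def proxy_redirect(path, op, external_user):
    """Return (status, headers) that send the caller on to the webhdfs API."""
    hadoop_user = USER_MAP.get(external_user)
    if hadoop_user is None:
        # Unknown users are rejected at the proxy and never reach HDFS.
        return 403, {}
    query = urlencode({"op": op, "user.name": hadoop_user})
    # The proxy handles no file data itself; it only points at the built-in API.
    location = f"{WEBHDFS_BASE}{quote(path)}?{query}"
    return 307, {"Location": location}
```

Because the proxy only rewrites the target URL, the same shape would work for any other built-in REST API by changing the base address; a real proxy would also need to carry authentication across, which this sketch leaves out.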
                
> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-2178
>                 URL: https://issues.apache.org/jira/browse/HDFS-2178
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.23.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 0.23.0
>
>         Attachments: HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (e.g. curl 
> and wget), HTTP libraries from different programming languages (e.g. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).


        
