[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-17 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152503#comment-13152503
 ] 

Eli Collins commented on HDFS-2316:
---

Nicholas / Tucu - see HADOOP-6585 for the rationale.

> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.20.205.1, 0.23.1
>
> Attachments: WebHdfsAPI20111020.pdf, WebHdfsAPI2003.pdf, 
> WebHdfsAPI2011.pdf, test-webhdfs, test-webhdfs-0.20s
>
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a 
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose to have webhdfs for providing a complete FileSystem 
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for 
> the tasks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-17 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152394#comment-13152394
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

@Alejandro, thanks for checking it.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-16 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151713#comment-13151713
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Well, then we are good: for a symlink, isDir is always false.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-16 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151696#comment-13151696
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

INodeSymlink.isDirectory() always returns false, so how could we have 
DIRECTORY_SYMLINK?  Do you mean for some FileSystem other than HDFS?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-16 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151693#comment-13151693
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

No, no, the options would be:

# enum {FILE, DIRECTORY, FILE_SYMLINK, DIRECTORY_SYMLINK}
# boolean isDir, boolean isSymlink

I'd prefer #2, but I'm good either way.
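
The two options can be compared side by side with a minimal Java sketch. All names here (PathTypeSketch, PathType, Flags, toEnum) are hypothetical and not taken from the WebHDFS code; the sketch only illustrates that the two boolean flags of option 2 carry the same information as the four enum values of option 1.

```java
// Hypothetical names throughout; none of these types come from the WebHDFS code.
final class PathTypeSketch {
    // Option 1: a single enum value per entry.
    enum PathType { FILE, DIRECTORY, FILE_SYMLINK, DIRECTORY_SYMLINK }

    // Option 2: two independent boolean flags.
    static final class Flags {
        final boolean isDir;
        final boolean isSymlink;

        Flags(boolean isDir, boolean isSymlink) {
            this.isDir = isDir;
            this.isSymlink = isSymlink;
        }
    }

    // Maps the two flags onto the enum: every flag combination has exactly one enum value.
    static PathType toEnum(Flags f) {
        if (f.isSymlink) {
            return f.isDir ? PathType.DIRECTORY_SYMLINK : PathType.FILE_SYMLINK;
        }
        return f.isDir ? PathType.DIRECTORY : PathType.FILE;
    }
}
```

If a symlink never reports itself as a directory, the DIRECTORY_SYMLINK value is never produced, which is one argument for the simpler two-flag form.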





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-16 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151688#comment-13151688
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

For symlinks, should it be isDir==false and isFile==false?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148932#comment-13148932
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Let's call the current encoding "urlString" which is the name used in the 
Hadoop code.  Okay?

You are right that the documentation must be clear.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148929#comment-13148929
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Regarding #2 and the token encoding: the field should then be named something 
like 'encodedToken', and whatever the encoding is, it should be clearly 
documented if clients are to use the token as anything other than an opaque value.

Regarding #2, 'Path' is fine.

Thanks.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148925#comment-13148925
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Alejandro,

We are getting very close.  All except #2 are documentation discussions.  Let's 
continue the documentation-related discussion in HDFS-2552.  I will respond to 
your other points there.

For #2, the reason for having "urlString" is to allow changing the encoding/token 
format later on, e.g.
{noformat}
{"Token": { "newEncoding" :  } }
{noformat}
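
As a rough illustration of why a named field helps, here is a minimal Java sketch of a client that decides how to read the token from the field name inside the "Token" object. The class and method names are hypothetical, the parsed JSON object is modelled as a plain Map, and "newEncoding" is only the placeholder name used in this comment, not a real format.

```java
import java.util.Map;

// Hypothetical client-side helper; not part of Hadoop.
final class TokenFieldSketch {
    // Returns the name of the encoding a client would need to understand,
    // based on which field appears inside the "Token" object.
    static String encodingOf(Map<String, String> tokenObject) {
        if (tokenObject.containsKey("urlString")) {
            return "urlString";
        }
        if (tokenObject.containsKey("newEncoding")) { // placeholder name from the comment above
            return "newEncoding";
        }
        throw new IllegalArgumentException("unknown token encoding: " + tokenObject.keySet());
    }
}
```

With a bare string value there would be nothing to dispatch on, so a later format change could not be detected by the client.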

For path, it seems using a string is just fine.  But if you disagree, I have no 
problem changing the JSON response for GETHOMEDIRECTORY to
{noformat}
{"string": }
{noformat}
Which one do you prefer, {"string": } or {"Path": }?

BTW, thank you so much for all the comments.  They are all very helpful!





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148913#comment-13148913
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Nicholas,

On #1, sorry, I don't agree; you are referring to HTTP/1.1 headers. We are talking 
about URLs here: except for scheme and host, everything else should be case 
sensitive. Milind has also stated his opinion on this, in favor of case 
sensitivity. Again, this is by spec; the implementation could still be lenient.

On #2, if we go for types, then GETHOMEDIRECTORY uses { "Path" : "" }; 
using similar reasoning, GETDELEGATIONTOKEN should return { "Token" : "" 
} (you are skipping the field name of the structure type, as the token seems to 
be treated as an opaque value).

On #3, "symlink" is fine.

On #4, then we should state that the 'permission' parameter is a String whose 
valid values are 3-digit octal numbers, i.e. '[0-7][0-7][0-7]'.

On #5, if the Token is to be decoded/parsed by a client in any way, how to do 
the decoding should be stated, independent of Hadoop.
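
The constraint proposed under #4 can be checked mechanically. A minimal Java sketch, assuming the pattern '[0-7][0-7][0-7]' exactly as written above; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical validator for the 'permission' parameter; not part of Hadoop.
final class PermissionParamSketch {
    // Exactly three octal digits, as proposed in the comment.
    private static final Pattern OCTAL3 = Pattern.compile("[0-7][0-7][0-7]");

    static boolean isValid(String permission) {
        // matches() requires the whole string to match, so "0755" or "75" are rejected.
        return permission != null && OCTAL3.matcher(permission).matches();
    }
}
```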

Thanks.









[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148883#comment-13148883
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Hi Alejandro,

> 1. Case insensitivity of the param names/values.

Let's follow HTTP/1.1.  They have stated it in the spec.

> 2. Inconsistency on the JSON responses names:

They are indeed consistent: If the return type is a JSON primitive type 
(string, boolean or number), the format is 
{noformat}
{"type": }
{noformat}

If the return type is a class but not an array, then the format is

{noformat}
{
  "Class":
  {
    "FieldA": ,
    "FieldB": ,
    ...
  }
}
{noformat}

If the return type is an array, then the format is

{noformat}
{
  "Classes":
  {
    "Class":
    [
      {
        "FieldA": ,
        "FieldB": ,
        ...
      },
      {
        "FieldA": ,
        "FieldB": ,
        ...
      },
      ...
    ]
  }
}
{noformat}
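
The three rules above can be sketched as small Java helpers that wrap a result in the corresponding root object. This is only an illustration: the class and method names are hypothetical, the JSON structure is modelled with plain Maps and Lists rather than a JSON library, and nothing here is taken from the actual webhdfs implementation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helpers that mirror the three response shapes; not Hadoop code.
final class ResponseShapeSketch {
    // Rule 1: primitive result -> {"<type>": value}
    static Map<String, Object> primitive(String typeName, Object value) {
        return Collections.singletonMap(typeName, value);
    }

    // Rule 2: single object -> {"<Class>": {fields}}
    static Map<String, Object> object(String className, Map<String, Object> fields) {
        return Collections.singletonMap(className, fields);
    }

    // Rule 3: array -> {"<Classes>": {"<Class>": [ {fields}, ... ]}}
    static Map<String, Object> array(String pluralName, String className,
                                     List<Map<String, Object>> items) {
        return Collections.singletonMap(pluralName,
                Collections.singletonMap(className, items));
    }
}
```

In every case the response has exactly one root key, which is what lets a client treat all operations uniformly.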

BTW, there is a typo in your comment: There is "Token:" in the 
GETDELEGATIONTOKEN return type, i.e.
{noformat}
 {"Token": { "urlString" : "" } }
{noformat}

Okay, I just found out that GETHOMEDIRECTORY does not follow the above rules.  "Path" 
should be "string".  But it seems that "Path" makes more sense, what do you 
think?

I did use function names in an earlier implementation but changed them to 
type/class names, since it seems that return types should not be associated with 
function names.  Just like in Java, you would not put the method name in the 
returned object.

> 3. FileStatus does not have symlinkPath for symlinks.

Yes, symlink is optional.  I have not included it in the doc.  BTW, how about 
calling it "symlink" instead of "symlinkPath"?

> 4. FileStatus permission is a String but the permission parameter is a short

If we use short, then the octal 777 will be shown as 511, which seems very 
confusing.  So, string is used for representing octal.  In the url, it is 
actually a string as everything is a string.
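
The 777 versus 511 point is plain base conversion and can be reproduced with the standard Java library. The class and method names below are hypothetical; only Short.parseShort and Integer.toOctalString are real JDK calls.

```java
// Hypothetical helper illustrating octal-string vs. numeric permission values.
final class OctalPermissionSketch {
    // Parses an octal permission string such as "777" into its numeric value.
    static short parseOctal(String s) {
        return Short.parseShort(s, 8);
    }

    // Renders a numeric permission value back as an octal string (no zero padding).
    static String toOctal(short permission) {
        return Integer.toOctalString(permission);
    }
}
```

Here parseOctal("777") yields 511, and printing that number in decimal is exactly the confusing form the comment wants to avoid.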

> 5. The encoding of the delegation token, both as parameter and as response is 
> not defined.

It uses Hadoop delegation token encoding, i.e. Token.encodeToUrlString() and 
Token.decodeFromUrlString(..).

> 6. The final doc that gets checked in should not include authors, ...

Agree.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148775#comment-13148775
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Nicholas,

Thanks for updating the spec. Getting there. A few follow up 
comments/open-issues:

*1. Case insensitivity of the param names/values.*

Current opinions favor case sensitivity in the spec. IMO the implementation 
could be case insensitive, but the spec should be case sensitive: the 
client/server components should produce output as per the spec, but they could 
be lax in what they accept. In other words, Postel's Law.

*2. Inconsistency on the JSON responses names:*

* MAKEDIRS/RENAME/DELETE/SETREPLICATION returns: { "boolean" :  }
* GETHOMEDIRECTORY returns: { "path" : "" }
* GETDELEGATIONTOKEN returns: { "urlString" : "" }
* RENEWDELEGATIONTOKEN returns: { "long" :  }
* GETFILESTATUS returns: { "fileStatus" :  }
* LISTSTATUS returns: { "fileStatuses" :  }

Sometimes the names are basic types, sometimes structure names, and sometimes 
functional names ('urlString'). 

Because structure names give a sense of functional names, I'd suggest we use 
functional names for everything. Then it would be:

* MAKEDIRS/RENAME/DELETE/SETREPLICATION returns: { "mkdirs/rename/.." : 
 }
* GETHOMEDIRECTORY returns: { "homeDir" : "" }
* GETDELEGATIONTOKEN returns: { "delegationToken" : "" }
* RENEWDELEGATIONTOKEN returns: { "delegationTokenRenewal" :  }
* GETFILESTATUS returns: { "fileStatus" :  }
* LISTSTATUS returns: { "fileStatuses" :  }

*3. FileStatus does not have symlinkPath for symlinks.*

symlinkPath should be an optional value (and the same for pathSuffix when the 
status is for a symlink)

*4. FileStatus permission is a String but the permission parameter is a short*

Both should be short (octal).

*5. The encoding of the delegation token, both as parameter and as response is 
not defined.*

The encoding should be the same in both cases (I assume, from the code, you are 
using hex. If so, wouldn't Base64 be a more common encoding to use?)

*6. The final doc that gets checked in should not include authors, same as we 
don't use @author tag in the code.*
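
To make the hex versus Base64 question in item 5 concrete, here is a minimal Java sketch that encodes the same bytes both ways. It is not the Hadoop token encoding: the class and method names are hypothetical, the input bytes are arbitrary, and the URL-safe, unpadded Base64 variant is an assumption chosen because the value travels in a URL.

```java
import java.util.Base64;

// Hypothetical comparison of two text encodings for opaque token bytes.
final class TokenEncodingSketch {
    // Lower-case hexadecimal: two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // URL-safe Base64 without padding: roughly four characters per three bytes.
    static String toBase64Url(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

Both outputs are safe to place in a query string; Base64 is simply shorter for the same input.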





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148770#comment-13148770
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Closing this since all tasks for 205.1 are done.  There is one remaining issue 
(HDFS-2545) for 0.23 though.  Please feel free to create JIRAs if you find any 
bugs.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-11 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148716#comment-13148716
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Hi Milind, as mentioned earlier, either case-sensitive or case-insensitive is 
fine for experienced users.  No?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-09 Thread Milind Bhandarkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146898#comment-13146898
 ] 

Milind Bhandarkar commented on HDFS-2316:
-

Before I get tired of the case-sensitivity arguments, let me ask who you 
are designing the system for. I suppose it is for folks like me, who have 
used the URL scheme for more than 18 years now. So, here is my take: anything 
after host:port/ is case sensitive. (Because after host:port/, I know that 
it refers to a file system, or a "resource" that ultimately refers to a file 
system.) So, please stop arguing, and design it for curmudgeons like me. Even 
without reading glasses, I can recognize the difference between capital and 
small letters. Thank you!





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-08 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146646#comment-13146646
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Regarding #6, suppose webhdfs and hoop share the same FileSystem scheme.  I 
think "http" as a FileSystem scheme is not an option.  Otherwise, we cannot 
easily tell whether "http://host:port/path/to/file" is an HTTP URL or a 
FileSystem URI.
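
The ambiguity can be illustrated with java.net.URI: with a dedicated scheme, a single check on the scheme separates FileSystem URIs from ordinary HTTP URLs. This is a sketch under assumptions: the scheme name "webhdfs", the class and method names, and the host and port in the usage note are illustrative and not decided in this comment.

```java
import java.net.URI;

// Hypothetical check; the "webhdfs" scheme name is an assumption for illustration.
final class SchemeSketch {
    // True only when the URI carries the dedicated FileSystem scheme.
    static boolean isWebHdfsUri(String s) {
        // URI schemes are case insensitive, hence equalsIgnoreCase.
        return "webhdfs".equalsIgnoreCase(URI.create(s).getScheme());
    }
}
```

For example, isWebHdfsUri("webhdfs://host:50070/path/to/file") is true while isWebHdfsUri("http://host:50070/path/to/file") is false; if both used "http", no such test would be possible.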





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-08 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146497#comment-13146497
 ] 

Arpit Gupta commented on HDFS-2316:
---

@Alejandro
{quote}
Again, I mean in a 'general way'. Having a syntax that is convenient for 
parsing using a specific library doesn't seem the right approach.
{quote}

I am not sure why the approach I suggested is not a general way. The current 
response we send allows users to create a DOM object from the JSON response. If 
the root object were not present, the user would have to write 
specific code for different API calls and add the root object when needed. Thus 
I think what we have right now allows for the general way rather than specific 
solutions for different API calls.

The benefit of having a response that can be converted to valid XML is that, in 
the future, if we want to support XML responses, no schema change is needed 
between XML and JSON. 

Also, clients that are using Java can use the Java XPath libraries to parse the 
data. I am not sure if JSON has something as strong as XPath that one can use.

Here you can see an example where a service offers both JSON and XML responses:

YQL call to get weather info 
xml -> http://goo.gl/i2Gii
json -> http://goo.gl/osChW

So I believe our JSON response should return a root object.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146055#comment-13146055
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Nicholas, I guess we are stalled here and it is a matter of personal 
preference. It would be good to hear from others here. But at this point, I'd 
be fine with either approach; I want to get this going. It should be 100% clear 
in the documentation what is case sensitive and what is not, parameter 
name/value one by one.

Still open are #5 and #6.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146044#comment-13146044
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

In HTTP/1.1 and in programming languages that are case insensitive, names are 
case insensitive but string values are case sensitive.  Our case is the same: 
parameter names are case insensitive and string values (including paths) are 
case sensitive.
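
A minimal sketch of that rule, assuming a hypothetical helper `parse_params` and a made-up host and port (neither is part of webhdfs): parameter names are folded to lower case while values are left exactly as sent.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_params(url):
    """Return query parameters with lower-cased names and untouched values.

    Hypothetical helper for illustration only; not the webhdfs implementation.
    """
    query = urlsplit(url).query
    # Names are folded to lower case; values (user names, paths) keep their case.
    return {name.lower(): value for name, value in parse_qsl(query)}


# The host "nn" and port 50070 are placeholders.
params = parse_params(
    "http://nn:50070/webhdfs/v1/Data/File.txt?Op=GETFILECHECKSUM&doAs=Nicholas"
)
print(params)  # {'op': 'GETFILECHECKSUM', 'doas': 'Nicholas'}
```

With this approach "Op", "op" and "OP" all select the same parameter, while the value "Nicholas" is never rewritten.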

Case insensitive approaches are designed for inexperienced users.  For 
experienced users, either way is fine as long as the documentation is clear.  
SQL, HTTP and other case insensitive examples like BASIC are all designed for 
inexperienced users in order to cover a wider audience.  Some may claim that 
SQL and HTTP are a mess, but no one can deny that they are the most popular 
standards.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146026#comment-13146026
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Nicholas, my concern is not about case sensitive versus case insensitive, but 
about consistency. In your suggested approach:

1. path is case sensitive
2. query parameter names are case insensitive
3. some query parameter values are case sensitive (destination, owner, group, 
user.name, doAs)
4. some query parameter values are case insensitive (override)

Note that in #3 we don't have an option, as the corresponding underlying 
entities are case sensitive.

And I don't recall now whether 'op' is case sensitive or not in your proposal.

My take is that if we are consistent, this will be easier for users. Since we 
cannot go all case insensitive, I'm suggesting all case sensitive.










[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146018#comment-13146018
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Hi Alejandro, HTTP header field names are case insensitive.
{quote}
... Each header field consists of a name followed by a colon (":") and the 
field value. Field names are case-insensitive. ...
{quote}
For more details, see [HTTP/1.1 Section 
4.2|http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2].  

So the query parameter names should also be case insensitive.  Okay?
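
For illustration, here is how a header container that follows the quoted rule behaves, using Python's standard `email.message.Message`. The quoted section covers header fields only; applying the same rule to query parameter names is the proposal being discussed, not something this sketch demonstrates.

```python
from email.message import Message

headers = Message()
headers["Content-Type"] = "application/json"

# Lookup ignores the case of the field name...
print(headers["content-type"])   # application/json
print(headers["CONTENT-TYPE"])   # application/json

# ...but the field value is returned exactly as it was stored.
print(headers["content-type"] == "APPLICATION/JSON")  # False
```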





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145967#comment-13145967
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

@Arpit, 

I'm trying to understand -in a general way- how the additional levels 
indicating a class-name (container-name) simplify the creation of XML. I've 
tried using json-lib to convert JSON to XML, but it does not achieve the 
desired results. Furthermore, with json-lib it seems easier not to have the 
class-name (container-name).

Again, I mean in a 'general way'. Having a syntax that is convenient for 
parsing with a specific library doesn't seem the right approach. 









[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145939#comment-13145939
 ] 

Arpit Gupta commented on HDFS-2316:
---

{quote}
Regarding #5, assuming that that is the case (that clients would do the 
conversion), would you please tell me what kind of libraries would help to do 
such conversion by having the root/array names as being proposed? And how this 
conversion is made simple by having those elements?
{quote}

One can use http://www.json.org/java/index.html to convert JSON into an XML 
string and then create a Java DOM object. If the root element is not present, 
one will get an exception when building the DOM object.
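
The same effect can be sketched in Python. This is not the json.org Java library; `to_xml` and `json_to_xml` are naive hypothetical converters written only for this illustration, and the "length" field is made up. A JSON object with a single root key converts to a well-formed document, while one without it yields several top-level elements and the DOM parser rejects it.

```python
import json
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape


def to_xml(value, tag):
    """Naive JSON-to-XML: dict keys become tags, list items repeat the tag."""
    if isinstance(value, dict):
        inner = "".join(to_xml(v, k) for k, v in value.items())
        return f"<{tag}>{inner}</{tag}>"
    if isinstance(value, list):
        return "".join(to_xml(item, tag) for item in value)
    return f"<{tag}>{escape(str(value))}</{tag}>"


def json_to_xml(text):
    """Convert the top-level JSON object without adding a wrapper element."""
    obj = json.loads(text)
    return "".join(to_xml(v, k) for k, v in obj.items())


def parses_as_xml(xml_text):
    """Return True if the text can be parsed into a DOM document."""
    try:
        parseString(xml_text)
        return True
    except ExpatError:
        return False


with_root = '{"HdfsFileStatus": {"type": "FILE", "length": 24}}'
without_root = '{"type": "FILE", "length": 24}'

print(parses_as_xml(json_to_xml(with_root)))     # True
print(parses_as_xml(json_to_xml(without_root)))  # False
```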





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145925#comment-13145925
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Regarding #1, I'm not sure SQL is a good example of this. SQL is a mess: it 
depends on the SQL vendor and how the SQL DB is configured.

Regarding #4, OK

Regarding #5, assuming that that is the case (that clients would do the 
conversion), would you please tell me what kind of libraries would help to do 
such conversion by having the root/array names as being proposed? And how this 
conversion is made simple by having those elements?

Regarding #6, let me rephrase my previous comment, if we have full 
interoperability between webhdfs and hoop then I don't see a need for having 2 
client implementations of the 'http' filesystem.







[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145893#comment-13145893
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> Regarding #3, ... if we make the the param names case insensitive we should 
> make the param values case insensitive as well. ...

In case insensitive programming languages, identifier names are often case 
insensitive but string values are case sensitive.  SQL is an example.

> Regarding #4, ...

Yes, it requires the knowledge of the request.  Even if absolute paths are 
provided, it requires the knowledge of the request to know which NameNode it is 
referring to.  We simply cannot put everything in the response.

> Regarding #5, If we are going to support XML, we can easily add the root 
> elements to XML. ...

But then the JSON schema and the XML schema will be different.

It does make sense to first generate JSON on the server side and then convert 
it to XML on the client side, since the XML payload is heavy.

> Regarding #6, Regardless if Hoop as a FileSystem implementation, ...

Even if they share the same FileSystem scheme, users have to change their 
configuration to the corresponding implementation, since webhdfs and hoop do 
not share the same FileSystem implementation.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145842#comment-13145842
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Regarding #3, IMO having parts of the URL being case sensitive (the path & the 
param values) and part of the URL being case insensitive (the param names) is 
an issue. We cannot make the path case insensitive because file names are not. 
Because of that I'm suggesting all case sensitive. But if we make the param 
names case insensitive we should make the param values case insensitive as 
well. We just have to make sure we don't modify the case of parameter values, 
as in certain cases (a rename or a changeOwner) doing so may cause undesirable 
results.

Regarding #4, pathSuffix is good. Still, this means that a FileStatus response 
requires knowledge of the requested URL to be able to know which file we are 
talking about.

Regarding #5, if we are going to support XML, we can easily add the root 
elements to XML. Adding a couple of nested levels because of XML conversion 
does not seem right. Furthermore, I would assume that for performance reasons, 
when generating XML we'll generate it directly via a Provider; we won't 
generate JSON and then convert it to XML.

Regarding #6, regardless of whether Hoop has a FileSystem implementation, they 
should be interoperable. This would mean that a distcp would work without 
changes even if the infrastructure setup changes from hoop to webhdfs or vice 
versa.

Thanks.

Alejandro





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-07 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145812#comment-13145812
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> Regarding #3, having params being case sensitive it does not mean they have 
> to be all lowercase. ...

This is a problem with case sensitive names: people have different 
preferences.  Among the people I have talked to so far, if the names are case 
sensitive, some prefer lowercase and some prefer camelCase.  However, if the 
names are case insensitive, everyone is fine (they won't ask for case 
sensitivity).

> Regarding #4, Having 'name' and 'localname' is not clear when you'll have one 
> or the other. ...

I recall that full path had been sent in RPC through FileStatus but there were 
some issues and then HdfsFileStatus with "localName" was added.  I think we 
should not make the same mistake again.  Does "pathSuffix" sound good to you?

> Regarding #5, My issue here is that having an extra nested level for a 
> possible conversion to XML. ...

We should support XML in the near future since some users may prefer XML over 
JSON.

> Regarding #6, I disagree, ...

I see your point.  Are you going to use WebHdfsFileSystem in HDFS for hoop 
rather than adding another FileSystem implementation?
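
To make the "pathSuffix" idea under #4 concrete, here is a minimal sketch of how a client might rebuild the full path. The helper `full_path` is hypothetical, and treating an empty suffix as "the requested path itself" is an assumption based on the earlier remark that the local name is empty when the full path is given.

```python
import posixpath


def full_path(request_path, path_suffix):
    """Rebuild the absolute path of a status entry from the request path.

    Hypothetical client-side helper; not part of the webhdfs API.
    """
    # Assumption: an empty suffix means the status describes the requested path.
    if not path_suffix:
        return request_path
    return posixpath.join(request_path, path_suffix)


print(full_path("/user/nicholas", "data.txt"))   # /user/nicholas/data.txt
print(full_path("/user/nicholas/data.txt", ""))  # /user/nicholas/data.txt
```

This is the trade-off raised in the discussion: the response stays small, but the client needs the request path to locate each entry.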






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-05 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144764#comment-13144764
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

@Arpit,

Now I got how you are proposing the json payload for filestatuses. You are 
correct, the overhead is minimal.

@Nicholas,

Regarding #1, 'type' sounds good.

Regarding #2, ok.

Regarding #3, having params be case sensitive does not mean they have to be 
all lowercase. Hoop originally used case sensitive parameters in camelCase, 
thus the 'doas' parameter was 'doAs'. How about going back to that for all 
parameter names and values? For the 'op' values it means they mimic the 
FileSystem method names (that was also the initial motivation in Hoop).

Regarding #4, with both 'name' and 'localname' it is not clear when you'll 
have one or the other. If you have the full path, the FileStatus is 
self-contained and you don't need to know the requested URL to know the file 
location in the filesystem, but the payloads of filestatuses are bigger. Having 
'localname' is the other way around: you need to know the requested URL to know 
the file location in the file system, but the payloads of filestatuses will be 
smaller. IMO we should choose one. I prefer the full path because it makes the 
filestatus self-contained. Regarding the size of the payload, I wouldn't worry 
much about it, as we are always talking about the contents of a single 
directory and we are using a verbose syntax after all. And you could use 
compression in the server responses.

Regarding #5, my issue here is having an extra nested level for a possible 
conversion to XML. Is this a user requirement? If not, I'd prefer to keep it 
without the class name.

Hoop can proxy any filesystem implementation. Because of this the HTTP REST API 
should be restricted to the FileSystem public API; without exposing 
implementation specifics.

Regarding #6, I disagree. The whole point of this discussion about having a 
single HTTP REST API between Hoop and WebHDFS is to achieve interoperability 
between implementations and make it transparent to users.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144367#comment-13144367
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

1. Agree, symlink could be optional.  BTW, the "isDir" and "isSymlink" are 
replaced with "type", which is an enum {FILE, DIRECTORY, SYMLINK}

2. Sure.  Let's change it in a separate JIRA then.

3. Path is a parameter value, therefore case sensitive.

I think case sensitivity causes more confusion:

Q: Why is this not working?  http://nn:port/webhdfs/v1/path?Op=GETFILECHECKSUM
A: You must use lower case: "Op" should be "op".

Q: Why is this not working?  
http://nn:port/webhdfs/v1/path?op=GETFILECHECKSUM&does=nicholas
A: It is a typo: "does" should be "doas".
Q: But "doas" looks more like a typo than "does".  I wish I could use "doAs".

How about op values and boolean values?  Do you also think that they should be 
case sensitive?

4. "name" sounds like a file/directory name.  "localName" is an empty string if 
the full path is given.  How about calling it "pathSuffix"?

5. As Arpit mentioned, the payload for liststatus won't be increased 
significantly.  We only need two more words per request (instead of one more 
word per status).

Is Hoop going to support file systems other than HDFS?

It is common to convert JSON to/from XML.  We should make the conversion 
trivial.

6. Webhdfs and hoop should not share the same scheme since they require 
different implementations.  Webhdfs should use "webhdfs://".  For hoop, I 
suggest not using "http://" as a file system scheme.
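
Items 1 and 4 above can be sketched together: a "type" enum in place of the "isDir"/"isSymlink" flags, an optional symlink element, and a "pathSuffix" field. The class `PathType` and the function `file_status` are hypothetical names for illustration; the exact field set is defined by the API document attached to this JIRA, not by this sketch.

```python
import json
from enum import Enum


class PathType(Enum):
    """The three values mentioned for the "type" field."""
    FILE = "FILE"
    DIRECTORY = "DIRECTORY"
    SYMLINK = "SYMLINK"


def file_status(path_suffix, path_type, symlink=None):
    """Build a status dict; hypothetical helper, field set is illustrative."""
    status = {"pathSuffix": path_suffix, "type": path_type.value}
    # The symlink target is optional and emitted only for symlinks.
    if path_type is PathType.SYMLINK and symlink is not None:
        status["symlink"] = symlink
    return status


print(json.dumps(file_status("data.txt", PathType.FILE)))
# {"pathSuffix": "data.txt", "type": "FILE"}
print(json.dumps(file_status("latest", PathType.SYMLINK, "/user/nicholas/v2")))
# {"pathSuffix": "latest", "type": "SYMLINK", "symlink": "/user/nicholas/v2"}
```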






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143558#comment-13143558
 ] 

Arpit Gupta commented on HDFS-2316:
---

{quote}
5. Regarding filestatus, delete, rename, mkdirs, setreplication payloads and 
root element being a classname. JSON does not require a root element, a JSON 
response can be an list of key/value pairs (JSON object). I'd prefer to keep it 
like that. Specially for filestatus when doing a liststatus operation, else 
they payload will increase significantly in size. Another issue with the name 
of the class is that it should be an public class, not an implementation one 
(currently is using 'HdfsFileStatus').

You mention that the root element class is added because of XML requiring a 
root element. We are not spec-ing XML here. So I don't see this as a 
requirement. And if somebody is doing JSON to XML they should account for that 
in the transcoding.
{quote}

I don't think the size increases significantly by adding the root element. 
For the liststatus call in particular, the response is the following if the 
root element is there:

{code}
{"HdfsFileStatuses":{"HdfsFileStatus":[]}}
{code}


or if the root is not there

{code}
{"HdfsFileStatus":[]}
{code}


I do not think this adds too much size to the response. 
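
A quick way to check this claim is to serialize both shapes and compare lengths. The sketch below uses the two layouts shown above; the per-entry fields ("pathSuffix", "type") and the helper `listing_sizes` are placeholders for illustration only.

```python
import json


def listing_sizes(count):
    """Return (length with root, length without root) for `count` statuses."""
    # Placeholder entries; real statuses carry more fields.
    statuses = [{"pathSuffix": f"file{i}", "type": "FILE"} for i in range(count)]
    without_root = json.dumps({"HdfsFileStatus": statuses})
    with_root = json.dumps({"HdfsFileStatuses": {"HdfsFileStatus": statuses}})
    return len(with_root), len(without_root)


for count in (0, 10, 1000):
    with_root, without_root = listing_sizes(count)
    # The difference does not depend on the number of entries.
    print(count, with_root - without_root)
```

The overhead of the root element is a fixed number of characters per response, whatever the size of the directory listing.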







[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143518#comment-13143518
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

bq. #2, the reason for using 'user.name' is that hadoop pseudo uses the 
'user.name' system property.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143509#comment-13143509
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Thanks for the updated PDF with the API, looks good.

Following are the remaining issues:

1. Regarding FileStatus containing symlink & isSymlink elements. Got it, they 
should. It would be enough to have symlink as an optional element, thus 
reducing the size of the response.

2. Regarding using 'username' parameter instead of 'user.name'. This comes from 
hadoop-auth (Alfredo), it should be changed there not here.

3. Regarding querystring parameters/values being case sensitive or not. IMO, as 
the path is case sensitive, the querystring should be as well, so as not to 
create confusion for developers/users.

4. Regarding filestatus containing localname instead of the full path to make 
the payload smaller; it makes sense. But shouldn't it just be called 'name'?

5. Regarding filestatus, delete, rename, mkdirs, setreplication payloads and 
the root element being a classname. JSON does not require a root element; a 
JSON response can be a list of key/value pairs (a JSON object). I'd prefer to 
keep it like that, especially for filestatus when doing a liststatus operation, 
else the payload will increase significantly in size. Another issue with the 
name of the class is that it should be a public class, not an implementation 
one (currently it is using 'HdfsFileStatus').

You mention that the root element class is added because of XML requiring a 
root element. We are not spec-ing XML here. So I don't see this as a 
requirement. And if somebody is doing JSON to XML they should account for that 
in the transcoding.

6. Regarding the scheme to use, "webhdfs://" or "http://". We are doing HTTP; 
this is why, IMO, we should use "http://". For example, when using curl you'll 
use "http://", not "webhdfs://"; it will be less confusing to developers.
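
To illustrate the relationship between the two schemes in item 6, the sketch below maps a "webhdfs://" file system URI to the HTTP URL one would pass to curl. The "/webhdfs/v1" prefix and the "op" parameter follow the URLs quoted elsewhere in this thread; the function `to_http_url`, the host, and the port are assumptions, not the actual WebHdfsFileSystem code.

```python
from urllib.parse import urlencode, urlsplit


def to_http_url(fs_uri, op, **params):
    """Map a webhdfs:// URI to the HTTP URL a tool such as curl would call.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(fs_uri)
    if parts.scheme != "webhdfs":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    query = urlencode({"op": op, **params})
    return f"http://{parts.netloc}/webhdfs/v1{parts.path}?{query}"


# The host "nn" and port 50070 are placeholders.
print(to_http_url("webhdfs://nn:50070/user/nicholas/data.txt", "GETFILECHECKSUM"))
# http://nn:50070/webhdfs/v1/user/nicholas/data.txt?op=GETFILECHECKSUM
```

Either way the wire protocol is HTTP; the scheme only decides which FileSystem implementation a Hadoop client selects.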






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143460#comment-13143460
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

* The user.name would have to be changed in hadoop-auth, it is independent of 
webhdfs/hoop.

Let me recap all outstanding issues in a follow-up comment.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143459#comment-13143459
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> FileStatus JSON payload has elements that are not part of the FileStatus 
> interface. The WebhdfsFileSystem client expects those elements and fails if 
> they are not present. These elements are: localName, isSymlink, symlink. 
> These elements are not later used and they are lost when creating a 
> FileStatus in WebhdfsFileSystem. Either those elements should not be in JSON 
> payload (my preference) or they should not be required by the 
> WebhdfsFileSystem.

localName is for reducing the response size.  It does not include the path 
prefix.  Otherwise, the same path prefix has to be sent for each status.  It 
becomes a problem if the number of statuses is huge.
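
As a rough illustration of how a client could recover full paths from 
localName, here is a small Python sketch. The function name and the sample 
paths are hypothetical; the only assumption taken from the comment above is 
that each status carries a name without the path prefix.

```python
import posixpath


def full_paths(parent, local_names):
    # Rebuild each full path from the directory that was listed plus the
    # localName returned for each status. HDFS paths use '/' separators,
    # so posixpath is used regardless of the client platform.
    return [posixpath.join(parent, name) for name in local_names]
```

The prefix is sent once (it is the path the client asked for), so nothing is 
lost by omitting it from every status.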

symlink is in 0.23.  It is a bug that it is not used.

> delete, rename, mkdirs, setReplication JSON responses use 'boolean' as 
> element name, they should use the operation name as it is more descriptive.

Similar to other responses, they need a root element.  The key of the root 
element is the type/class.  Then, the client can determine how to parse the 
JSON object by checking the key.

> FileChecksum JSON serialization is using the classname in the JSON payload, 
> it should not, ...

The classname is the root element.  It is required by other formats such as XML.
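
A sketch of the client-side dispatch described above, where the key of the 
root element selects the parser. This is only an illustration in Python: the 
keys 'boolean' and 'HdfsFileStatus' are the ones mentioned in this thread, 
while the parser functions and the table are hypothetical.

```python
import json


def parse_boolean(value):
    # Result of delete/rename/mkdirs/setReplication style operations.
    return bool(value)


def parse_file_status(value):
    # Hypothetical: keep the status fields as a plain dictionary.
    return dict(value)


# Hypothetical table mapping root element keys to parsers.
PARSERS = {"boolean": parse_boolean, "HdfsFileStatus": parse_file_status}


def parse_response(text):
    obj = json.loads(text)
    if not isinstance(obj, dict) or len(obj) != 1:
        raise ValueError("expected a JSON object with exactly one root key")
    (key, value), = obj.items()
    if key not in PARSERS:
        raise ValueError(f"unknown root element: {key}")
    return key, PARSERS[key](value)
```

Without a root element, the client would instead have to know the expected 
type from the operation it issued.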





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143452#comment-13143452
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> gethomedir is required as you don't know where the FS impl creates the home 
> dir.

Okay, I can add it to webhdfs.

> The doAs parameter name was chosen to mimic the Java API, real.user is kind 
> of confusing, do you mean the proxy user or the doAs user. Plus when doing 
> Kerberos you don't use user.name. IMO doAs is a easier not to get confused.

I am fine with using doAs instead of realUser.

BTW, hadoop-auth uses "user.name".  The dot in the middle is different from 
other naming conventions.  How about changing it to username?

> The GET_BLOCK_LOCATIONS private API, how are you differentiating private and 
> public APIs?

We will document it and state that it is a private unstable API.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-03 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143418#comment-13143418
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Any update on the open issues? I'd like to get them taken care of so that I 
can finalize HDFS-2178 accordingly.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-02 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142440#comment-13142440
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

scheme & authority are case insensitive by definition. This is well known and 
expected.

However, path & query string are not. Regarding your last example, that is JIRA 
functionality. And it illustrates my point: the fact that 'browse' is case 
sensitive and 'hdfs' is not will be confusing.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-02 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142433#comment-13142433
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

- https://issues.apache.org/jira/browse/hdFS-2316 - also works





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-02 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142431#comment-13142431
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

@Alejandro

> On #3, it will be confusing to users that part of the URL is case sensitive 
> (the path) and part it is not (the query-string). Given that HDFS is case 
> sensitive, making the path case insensitive is not an option. Thus, I'm 
> suggesting to make it all case sensitive and there will be no confusion there.

We cannot make it all case sensitive, e.g. scheme and authority are case 
insensitive.  For example,

- hTTps://issues.apache.org/jira/browse/HDFS-2316 - works
- https://iSSues.apache.org/jira/browse/HDFS-2316 - works
- https://issues.apache.org/jira/BRowse/HDFS-2316 - does not work
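
The distinction in these examples can be sketched with Python's standard URL 
parser, which lowercases the scheme and exposes a lowercased host name while 
leaving the path and query string as given. The helper name `normalize` is 
hypothetical, and the sketch drops any user-info part of the authority for 
simplicity.

```python
from urllib.parse import urlsplit


def normalize(url):
    parts = urlsplit(url)
    # urlsplit already lowercases the scheme; .hostname is the lowercased host.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Path and query are left untouched because they are case sensitive.
    return parts._replace(netloc=host).geturl()
```

With this helper, the first two URLs above normalize to the same string, 
while the third keeps its 'BRowse' path segment and therefore still differs.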





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-01 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141820#comment-13141820
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

@Nicholas,

* gethomedir is required as you don't know where the FS impl creates the home 
dir.
* The doAs parameter name was chosen to mimic the Java API; real.user is kind 
of confusing: do you mean the proxy user or the doAs user? Plus when doing 
Kerberos you don't use user.name. IMO doAs is less likely to cause confusion.
* The GET_BLOCK_LOCATIONS private API, how are you differentiating private and 
public APIs?
* Oozie could use delegation token or it could use doAs.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-01 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141727#comment-13141727
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

@Nathan

> ":" and "http://:" seem to be used 
> interchangeably. We should be consistent where possible.

You are right.  I should use : only.

> Why doesn't "curl -i -L "http://:/webhdfs/" just work? Do 
> we really need to specify op=OPEN for this very simple, common case?

The op parameter does not have a default value.  I think it may be confusing if 
we have a default - If we forgot to add op parameter, then it becomes a totally 
different operation.
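
A minimal Python sketch of the "no default op" behaviour described above: the 
request is rejected when op is absent instead of silently falling back to some 
operation. The function name and the error type are assumptions made for 
illustration, not the webhdfs implementation.

```python
from urllib.parse import parse_qs


def get_op(query_string):
    # parse_qs maps each parameter name to a list of values.
    values = parse_qs(query_string).get("op")
    if not values:
        # No default: a missing op is an error, not an implicit OPEN.
        raise ValueError("missing required 'op' parameter")
    return values[0]
```

For example, `get_op("op=OPEN&offset=0")` returns `"OPEN"`, while 
`get_op("offset=0")` raises an error.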

> I believe "http://:" should be "http://:" in 
> append.

Good catch!

> Need format of responses spelled out.
> It would be nice if we could document the possible error responses as well.

Will post an updated doc with JSON responses and error responses soon.

> Since a single datanode will be performing the write of a potentially large 
> file, does that mean that file will have an entire copy on that node (due to 
> block placement strategies)? That doesn't seem desirable..

It is probably the case.  We may change the block placement strategies as an 
improvement later on.

> Is a SHORT sufficient for buffersize?

It should be INT.

> Do we need a renewlease? How will very slow writers be handled?

A slow writer sends data to one of the datanodes using HTTP.  That datanode 
uses a DFSClient to write data.  The DFSClient will renew the lease for the 
writer.

> Once I have file block locations, can I go directly to those datanodes to 
> retrieve rather than using content_range and always following a redirect?

Yes.  Clients could get block locations, construct the URLs themselves and then 
talk to the datanodes directly.  We should have an API to support this.  E.g. 
it would be better for GETFILEBLOCKLOCATIONS to return a list of URLs directly.

GETFILEBLOCKLOCATIONS returns a LocatedBlocks structure which is not easy to 
use.  I am changing GETFILEBLOCKLOCATIONS to GET_BLOCK_LOCATIONS, a private API.

> Do we need flush/sync?

Since the client is using HTTP, there is no way for them to call hflush.  Let's 
leave this as a future improvement.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-01 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141709#comment-13141709
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

@Alejandro,

> GET GETHOMEDIRECTORY operation is missing.

Do we really need it?  DistributedFileSystem implements it by simply creating a 
path locally.

> The GETFILEBLOCKLOCATIONS, GETDELEGATIONTOKEN, RENEWDELEGATIONTOKEN, 
> CANCELDELEGATIONTOKEN operations seem to be the ones that don't make sense 
> (at the moment) in a proxy scenario. We should make those operations as 
> optional.

I agree that GETFILEBLOCKLOCATIONS should be a private API.  Let's rename it to 
GET_BLOCK_LOCATIONS.

For the delegation token ops, they seem to make sense in a proxy scenario.  
E.g. Oozie needs them with proxy.

> The 'doas' query parameter is missing, this is required to enable proxyuser 
> functionality.

In RPC, we use "realUser" for the user submitting the call and "user" for the 
effective user, e.g. if Oozie performs an operation as "nicholas", then 
realUser is "oozie" and user is "nicholas".  How about we have something 
similar, say real.user and user.name?
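
The real user / effective user split can be sketched as follows in Python, 
using the 'doas' parameter name from the quoted comment; the same logic would 
apply to a real.user/user.name pair. The function name is hypothetical, and 
the sketch deliberately omits the check that the authenticated user is allowed 
to act as a proxy.

```python
from urllib.parse import parse_qs


def resolve_users(authenticated_user, query_string):
    # Returns (real_user, effective_user).
    # real_user is whoever authenticated the request (e.g. "oozie");
    # effective_user is the user the operation is performed as.
    doas = parse_qs(query_string).get("doas", [None])[0]
    if doas is None:
        return authenticated_user, authenticated_user
    return authenticated_user, doas
```

With the Oozie example above, `resolve_users("oozie", "op=OPEN&doas=nicholas")` 
yields `("oozie", "nicholas")`.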

> The 'user.name' query parameter is optional as this is used only in the case 
> of pseudo authentication, in the case of other authentication mechanism the 
> username will be taken for the authentication credentials.

Agree.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-01 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141095#comment-13141095
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

* FileStatus also seems to be using the classname (HdfsFileStatus) in the JSON 
payload.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-01 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141076#comment-13141076
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Another thing I've noticed is that WebhdfsFileSystem is still using 
100-Continue logic.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-31 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140953#comment-13140953
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

I've been running tests to validate HTTP REST API compatibility between webhdfs 
and hoop. The following are the issues I've found.


* FileStatus JSON payload has elements that are not part of the FileStatus 
interface. The WebhdfsFileSystem client expects those elements and fails if 
they are not present. These elements are: localName, isSymlink, symlink. These 
elements are not later used and they are lost when creating a FileStatus in 
WebhdfsFileSystem. Either those elements should not be in JSON payload (my 
preference) or they should not be required by the WebhdfsFileSystem.

* delete, rename, mkdirs, setReplication JSON responses use 'boolean' as 
element name, they should use the operation name as it is more descriptive.

* FileChecksum JSON serialization is using the classname in the JSON payload, 
it should not, it should be something like:

{code}
{
  "algorithm" : "foo",
  "bytes" : "hexabytes",
  "length" : 1000
}
{code}

* Same as FileChecksum JSON for ContentSummary JSON.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-31 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140300#comment-13140300
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Current open issues:

* 1. case insensitivity or lowercase of the query string
* 2. rename-'destination'=param/status-responses include or not the prefix 
(webhdfs/v1)
* 3. scheme to use, webhdfs:// or http://
* 4. proxy user support via 'doas=' query string parameter
* 5. API spec, JSON response payloads and response error codes

For #1 and #2 Sanjay suggested 'lowercase' and 'include not'. Are we OK with 
that?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-28 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139006#comment-13139006
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

On #3, it will be confusing to users that part of the URL is case sensitive 
(the path) and part is not (the query-string). Given that HDFS is case 
sensitive, making the path case insensitive is not an option. Thus, I'm 
suggesting to make it all case sensitive and there will be no confusion there.

On #5, it is not overloading the scheme; we are doing "http://", which is why I 
say we should use "http://". When using curl you'll use http://.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-28 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138726#comment-13138726
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> 1. versioning

Added v1 to the prefix, i.e. the url is 
http://nn:port/webhdfs/v1/path/to/file?op=...

> 2. ranges

Since it is optional in the http spec, let's use offset and length query 
parameters.
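
Putting the two answers above together, a URL for a partial read might be 
built like this. This is a hedged Python sketch: the /webhdfs/v1 prefix and 
the op, offset and length parameters come from this thread, while the function 
name, the host "nn" and the port 50070 are placeholders chosen for 
illustration.

```python
from urllib.parse import quote, urlencode


def open_url(namenode, port, path, offset=None, length=None):
    # op has no default, so it is always sent explicitly.
    params = [("op", "OPEN")]
    # offset and length replace the optional HTTP range mechanism.
    if offset is not None:
        params.append(("offset", offset))
    if length is not None:
        params.append(("length", length))
    # quote() percent-encodes the path but keeps '/' separators.
    return f"http://{namenode}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"
```

For instance, `open_url("nn", 50070, "/path/to/file", offset=1024, length=512)` 
gives `http://nn:50070/webhdfs/v1/path/to/file?op=OPEN&offset=1024&length=512`.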

> 3. params case sensitivity

Won't case insensitive parameter names be more user friendly?

> 4. permissions format

Let's use octal.  Handling sticky bit with ls output style is tricky.
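
To show why the ls output style is tricky for the sticky bit, here is a small 
Python sketch that converts an octal permission string to ls style with the 
standard library. The helper name is hypothetical, and a directory is assumed 
only because `stat.filemode` needs a file type.

```python
import stat


def ls_style(octal_text):
    # Parse the octal string, e.g. "1777" -> 0o1777.
    mode = int(octal_text, 8)
    # stat.filemode expects a file type as well; a directory is assumed here.
    return stat.filemode(stat.S_IFDIR | mode)
```

In ls style the sticky bit shares a character position with the "others 
execute" bit ('t' when both are set, 'T' when only sticky is set), so a parser 
has to handle extra letters; the octal form simply uses a fourth digit.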

> 5. protocol scheme to use

Do you mean the FileSystem scheme?  Webhdfs is using "webhdfs://".  I think 
"http://" is not suitable to be overloaded as a FileSystem scheme.

> 6. how to get the create/append handle ...

Commented on HDFS-2178.

Sorry that I am still updating the doc.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-28 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138509#comment-13138509
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Any follow-up on the open issues here?

1. versioning
2. ranges
3. params case sensitivity
4. permissions format
5. protocol scheme to use
6. how to get the create/append handle 
(https://issues.apache.org/jira/browse/HDFS-2178?focusedCommentId=13133189&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13133189)

Also, an updated version of the docs with the returned JSON payload 
definitions and error codes is still pending.

Thanks.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-24 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134134#comment-13134134
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

* Regarding ranges: it is more about ease of use and having the URL fully 
describe the resource fragment being fetched.

* Regarding versioning: IMO it seems cleaner to do that at the prefix level. 
Given a prefix, a user will know what version of the API the server side 
supports. Also, from the implementation perspective, using a prefix in JAX-RS 
instead of a parameter makes it easy to have different driver classes, thus 
providing a clean separation and coexistence of implementations in the same 
server.
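
The prefix-based versioning idea can be sketched as follows, in Python rather than JAX-RS. This is only an illustration under stated assumptions: the /webhdfs/v1/ form is the one suggested elsewhere in this thread, while the handler names and the dispatch table are hypothetical.

```python
import re

# Hypothetical handlers, one per API version; the names are illustrative only.
def handle_v1(path, query):
    return ("v1", path, query)

def handle_v2(path, query):
    return ("v2", path, query)

HANDLERS = {"v1": handle_v1, "v2": handle_v2}

# Assumed URL layout: /webhdfs/<version>/<hdfs path>
_PREFIX = re.compile(r"^/webhdfs/(v\d+)(/.*)?$")

def dispatch(url_path, query=""):
    """Pick the handler from the version prefix and pass on the HDFS path."""
    m = _PREFIX.match(url_path)
    if m is None:
        raise ValueError("not a versioned webhdfs path: %r" % url_path)
    version, hdfs_path = m.group(1), m.group(2) or "/"
    handler = HANDLERS.get(version)
    if handler is None:
        raise LookupError("unsupported API version: %s" % version)
    return handler(hdfs_path, query)
```

Because the version is part of the path, each version can be served by its own handler without the handlers having to inspect a query parameter.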





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-23 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133797#comment-13133797
 ] 

Sanjay Radia commented on HDFS-2316:


Versioning: We were going with a previous suggestion to add a version 
parameter when we go to the next version.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-23 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133793#comment-13133793
 ] 

Sanjay Radia commented on HDFS-2316:


> ... embedding byte-ranges in the URL itself.
This was the implementation a few days ago. It was changed to use the content 
range header, which is fairly standard and will likely allow other tools to 
work seamlessly.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-22 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133525#comment-13133525
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

It seems that the HTTP spec does not force us to use the Range header; see 
[14.35.2 Range Retrieval 
Requests|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35]
{quote}
A server MAY ignore the Range header. However, HTTP/1.1 origin servers and 
intermediate caches ought to support byte ranges when possible, since Range 
supports efficient recovery from partially failed transfers, and supports 
efficient partial retrieval of large entities. 
{quote}





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-22 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133471#comment-13133471
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Regarding case sensitivity, given that HDFS URIs are case sensitive, being case 
insensitive in the query string would be confusing to users. Furthermore, 
being case insensitive in only part of the query string would be even more 
confusing. I'd propose we are case sensitive and all params and values we 
define are lowercase.

Regarding permissions, you are correct, Hoop uses ls output style, which is 
absolute. While uncommon, I found it more intuitive.  I'll make sure Hoop 
handles octal as well.

Regarding file ranges, it is a matter of convenience for users. I don't think 
it will cause problems, because there are no libraries that handle webhdfs/hoop 
URL operations (and use HTTP ranges); users will have to code the construction 
of these URLs themselves and will then use what we provide. Again, I see big 
value if ALL webhdfs/hoop operations are 100% self-contained in the URL.







[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133237#comment-13133237
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

*Thanks everyone for looking at the webhdfs API.*  I will update the doc 
accordingly.  Sorry that the JSON types and error responses are missing.  Below 
are some quick responses:

> 4. case sensitivity - make the parameters lower case rather than have the 
> filter convert them since pathname and the user name should not be converted.

Only the parameter names are case insensitive.  The parameter values are case 
sensitive except for op values and boolean values.
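
The rule just described can be sketched in Python as follows. This is a rough illustration, not the actual filter: normalizing op values to upper case is an assumption, and the `overwrite` parameter in the usage note is a made-up example of a boolean parameter.

```python
from urllib.parse import parse_qsl

def normalize_query(query):
    """Lower-case parameter names; fold case only for op and boolean values."""
    params = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        name = name.lower()              # parameter names are case insensitive
        if name == "op":
            value = value.upper()        # assumption: canonical op form is upper case
        elif value.lower() in ("true", "false"):
            # Simplification: a real filter would decide this per parameter
            # type, so that e.g. a user literally named "True" is left alone.
            value = value.lower()
        params[name] = value             # all other values stay case sensitive
    return params
```

For example, `normalize_query("OP=open&User.Name=Alice&Overwrite=TRUE")` keeps the user name `Alice` untouched while folding the parameter names, the op value, and the boolean.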

> Permission masks are currently octal in webhdfs and symbolic in hoop. IMO, it 
> should make sense to support both.

I thought about adding chmod-style symbolic permissions.  However, that does 
not make sense in webhdfs SETPERMISSION since it only sets absolute permissions 
(e.g. "u=rwx,go=rx", which equals 755).  Relative permissions (e.g. "go-w") 
won't work.  Hoop uses ls output style for setting permissions (e.g. 
rwxr-xr-x).  This seems uncommon. 
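
To make the three notations concrete, here is a small Python sketch that converts both the ls output style and the absolute chmod style to octal. The function names are hypothetical and are not part of webhdfs or Hoop.

```python
def ls_to_octal(perm):
    """Convert an ls-style string such as 'rwxr-xr-x' to octal digits ('755')."""
    if len(perm) != 9:
        raise ValueError("expected 9 characters, got %r" % perm)
    digits = []
    for i in range(0, 9, 3):
        value = 0
        for ch, expected, bit in zip(perm[i:i + 3], "rwx", (4, 2, 1)):
            if ch == expected:
                value += bit
            elif ch != "-":
                raise ValueError("unexpected character %r in %r" % (ch, perm))
        digits.append(str(value))
    return "".join(digits)

def absolute_symbolic_to_octal(spec):
    """Convert an absolute chmod-style spec such as 'u=rwx,go=rx' to '755'.

    Only the '=' form is handled. Relative clauses such as 'go-w' are
    rejected because they depend on the file's current mode.
    """
    bits = {"u": 0, "g": 0, "o": 0}
    for clause in spec.split(","):
        who, sep, perms = clause.partition("=")
        if not sep or not who or set(who) - set("ugo") or set(perms) - set("rwx"):
            raise ValueError("not an absolute clause: %r" % clause)
        value = sum({"r": 4, "w": 2, "x": 1}[p] for p in set(perms))
        for w in who:
            bits[w] = value
    return "%d%d%d" % (bits["u"], bits["g"], bits["o"])
```

Both absolute forms map to the same octal value, which is why supporting them alongside octal is mostly a parsing question, whereas a relative clause cannot be resolved without first reading the existing mode.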

> File ranges, webhdfs uses HTTP 'content-ranges' header, ...

I had implemented OPEN with offset and length parameters but decided to change 
it since it was not following the HTTP spec.  Won't it cause problems if 
webhdfs does not follow the HTTP spec?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Milind Bhandarkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133175#comment-13133175
 ] 

Milind Bhandarkar commented on HDFS-2316:
-

Guys, is there documentation for the webhdfs APIs that I can read somewhere? 
(Good advice for producing human-readable documentation for web services can be 
found here: 
http://answers.oreilly.com/topic/1390-how-to-document-restful-web-services/).

+1 to Nathan's suggestion for versioning the API.

+1 to Alejandro's suggestion for embedding byte-ranges in the URL itself.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Milind Bhandarkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133173#comment-13133173
 ] 

Milind Bhandarkar commented on HDFS-2316:
-

@Thejas, webhdfs:// is the scheme recognized by FileSystem.get in Hadoop. (Same 
thing as hftp://, which uses http protocol, but hftp is the file system impl.)
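
As a rough Python illustration of that distinction: the webhdfs:// URI names the file system, and a client maps it to a plain HTTP URL when it talks to the server. The /webhdfs prefix and the op= parameter come from this thread; the function name, host, and port are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit

def webhdfs_to_http(uri, op):
    """Map webhdfs://host:port/path to the HTTP URL a client would request."""
    parts = urlsplit(uri)
    if parts.scheme != "webhdfs":
        raise ValueError("expected a webhdfs:// URI, got %r" % uri)
    # The /webhdfs prefix exists only at the HTTP layer, not in the FS URI.
    path = "/webhdfs" + (parts.path or "/")
    return urlunsplit(("http", parts.netloc, path, "op=" + op, ""))
```

For instance, `webhdfs_to_http("webhdfs://namenode.example:50070/user/foo", "GETFILESTATUS")` yields an http:// URL on the same host and port.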





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Thejas M Nair (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133134#comment-13133134
 ] 

Thejas M Nair commented on HDFS-2316:
-

bq. for scheme - i don't think we should use a preexisting scheme name. Nfs 
community has used webnfs as the scheme for accessing nfs over http.
The plan is to support calls over HTTP, so I think it is better to keep that 
clear for the users. Are there any plans to support non-HTTP operations? If 
not, I don't see any benefit in having a 'webhdfs' scheme.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133129#comment-13133129
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Thanks Sanjay. A couple of follow-up issues in the current API:

* Permission masks are currently octal in webhdfs and symbolic in hoop. IMO, it 
would make sense to support both.

* File ranges: webhdfs uses the HTTP 'content-ranges' header, while hoop uses 2 
query string params, offset= & len=. In webhdfs, for every other type of 
request the URL itself fully describes what is being requested. Because 
webhdfs uses the HTTP 'content-ranges' header, a URL is not sufficient to 
specify the desired range. With the Hoop approach, a URL self-contains the 
desired range, making it easier to use from libraries and scripts.
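
The two ways of expressing a range can be put side by side in a short Python sketch. It assumes the standard HTTP/1.1 Range request header syntax (bytes=first-last, inclusive) for the header approach and the offset= and len= parameter names mentioned above for the URL approach; the host, port, and file path in the usage note are invented.

```python
from urllib.parse import urlencode, quote

def range_header(offset, length):
    """Header approach: HTTP/1.1 byte ranges are inclusive, bytes=<first>-<last>."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length > 0")
    return {"Range": "bytes=%d-%d" % (offset, offset + length - 1)}

def url_with_range(base, hdfs_path, offset, length):
    """URL approach: the range travels in the query string, so the URL is self-contained."""
    query = urlencode({"op": "OPEN", "offset": offset, "len": length})
    return "%s/webhdfs%s?%s" % (base, quote(hdfs_path), query)
```

With the header approach a caller needs both the URL and `range_header(100, 50)`; with the URL approach, `url_with_range("http://namenode.example:50070", "/user/foo/data.txt", 100, 50)` alone describes the request and can be pasted into a script or passed to any HTTP client.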








[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Nathan Roberts (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133103#comment-13133103
 ] 

Nathan Roberts commented on HDFS-2316:
--

Is there a mechanism for versioning this API? Seems like we should probably 
have one. e.g. /webhdfs/1/ or /webhdfs/v1/





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133090#comment-13133090
 ] 

Sanjay Radia commented on HDFS-2316:


Alejandro raised the following 4 issues for discussion:
# rename - should the target path contain the /webhdfs prefix, since a client 
app will want to simply take the target path and use it as part of a read 
operation?
# should getStatus return the paths with the /webhdfs prefix?
# Why is the scheme of the webhdfs file system "webhdfs:" and not "http:"?
# case sensitivity - make the parameters lower case rather than have the filter 
convert them, since the pathname and the user name should not be converted.

My initial thoughts are: 
* for 1 and 2: the rename target path and the paths in the filestatus should 
NOT contain /webhdfs since /webhdfs is not really part of the 
parameters but a part of the REST API's "headers".
* for scheme - I don't think we should use a preexisting scheme name. The NFS 
community has used webnfs as the scheme for accessing NFS over HTTP.
* I am fine with lower case parameters.
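
The position on items 1 and 2 amounts to treating /webhdfs as part of the transport only. A minimal Python sketch of that separation, with hypothetical helper names:

```python
PREFIX = "/webhdfs"  # URL prefix discussed in this thread

def to_url_path(hdfs_path):
    """HDFS path -> URL path; the prefix is added only at the HTTP layer."""
    if not hdfs_path.startswith("/"):
        raise ValueError("expected an absolute HDFS path: %r" % hdfs_path)
    return PREFIX + hdfs_path

def to_hdfs_path(url_path):
    """URL path -> HDFS path; the prefix never reaches parameters or responses."""
    if url_path == PREFIX:
        return "/"
    if not url_path.startswith(PREFIX + "/"):
        raise ValueError("missing %s prefix: %r" % (PREFIX, url_path))
    return url_path[len(PREFIX):]
```

Under this scheme a rename target or a path returned in a file status is always a plain HDFS path such as /user/foo, and a client that wants to read it afterwards applies `to_url_path` itself.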





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Nathan Roberts (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132732#comment-13132732
 ] 

Nathan Roberts commented on HDFS-2316:
--

Hi Nicholas, some quick comments from first read:
* ":" and "http://:" seem to be used 
interchangeably. We should be consistent where possible.
* Why doesn't "curl -i -L "http://:/webhdfs/" just work? Do 
we really need to specify op=OPEN for this very simple, common case?
* I believe "http://:" should be "http://:" in 
append.
* Need format of responses spelled out.
* It would be nice if we could document the possible error responses as well.
* Since a single datanode will be performing the write of a potentially large 
file, does that mean that the file will have an entire copy on that node (due 
to block placement strategies)? That doesn't seem desirable.
* Is a SHORT sufficient for buffersize?
* Do we need a renewlease? How will very slow writers be handled?
* Once I have file block locations, can I go directly to those datanodes to 
retrieve rather than using content_range and always following a redirect?
* Do we need flush/sync?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-21 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132692#comment-13132692
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

---
@Nicholas,

Thanks for the API document. In general it looks OK. A few comments:

* GET GETHOMEDIRECTORY operation is missing.

* The GETFILEBLOCKLOCATIONS, GETDELEGATIONTOKEN, RENEWDELEGATIONTOKEN, 
CANCELDELEGATIONTOKEN operations seem to be the ones that don't make sense (at 
the moment) in a proxy scenario. We should make those operations optional.

* The 'doas' query parameter is missing; it is required to enable proxyuser 
functionality.

* The 'user.name' query parameter is optional, as it is used only in the case 
of pseudo authentication; with other authentication mechanisms the 
username will be taken from the authentication credentials.

* The document does not define any of the JSON responses, error codes, or 
JSON error messages. I assume you are taking the JSON responses from the doc 
posted in HDFS-2178. Still, this has to be augmented for checksum and 
content-summary responses.

* IMO we need operations to get create and append handles; the reason is in my 
response to Sanjay in HDFS-2178 
(https://issues.apache.org/jira/browse/HDFS-2178?focusedCommentId=13132691&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13132691)

* The webhdfs prefix should be optional/configurable and it should be provided 
by server on a 'filsystem.get' operation.







[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132322#comment-13132322
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> Here is the WebHdfs API.

I mean the attached file WebHdfsAPI20111020.pdf.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-27 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115926#comment-13115926
 ] 

Sanjay Radia commented on HDFS-2316:


BTW Amazon web services encourages the use of 100-continue for PUT and POST.
http://docs.amazonwebservices.com/AmazonS3/latest/API/





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-27 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115875#comment-13115875
 ] 

Sanjay Radia commented on HDFS-2316:


Alejandro has raised the issue of 100-continue not working with some HTTP 
client libraries.
Curl supports it, and httpclient from Apache seems to support it. If there 
is at least one Java library that supports it, then it seems an unnecessary 
API complication to split the create into two APIs.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-20 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108907#comment-13108907
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

I've just uploaded to HDFS-2178 a PDF with the proposed HTTP API.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102086#comment-13102086
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

bq. 'op=' parameter

Hoop audit logs write the HTTP method, thus you have POST URL.

IMO it should be GETFILESTATUS (the value of the params is case insensitive) 
and it should match the FileSystem method name.
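
A minimal sketch of the case-insensitive 'op' handling described above, 
assuming a query parameter literally named "op". The set of operation names, 
the host and the port are illustrative placeholders, not taken from the patch 
or from Hoop.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical subset of operations, named after FileSystem methods.
KNOWN_OPS = {"GETFILESTATUS", "OPEN", "SETOWNER", "MKDIRS"}


def parse_op(url):
    """Return the canonical (upper-case) value of the 'op' query parameter."""
    values = parse_qs(urlsplit(url).query).get("op")
    if not values:
        raise ValueError("missing 'op' parameter")
    op = values[0].upper()  # parameter values are treated as case insensitive
    if op not in KNOWN_OPS:
        raise ValueError("unknown operation: " + values[0])
    return op


# Both spellings map to the same operation name.
print(parse_op("http://nn.example.com:50070/webhdfs/user/foo?op=getFileStatus"))
print(parse_op("http://nn.example.com:50070/webhdfs/user/foo?op=GETFILESTATUS"))
```

Normalizing once at the edge keeps audit-log output and dispatch code on a 
single spelling, whichever form the client sends.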





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101955#comment-13101955
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

I am revising my patch.  Do you think that GETFILESTATUS is hard to parse?  Is 
GET_FILE_STATUS better?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101954#comment-13101954
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

>Using 'getOp', 'putOp' and 'postOp' is redundant as the HTTP method is 
> GET/PUT/POST already

But it will make the operation clear in the URL and show up in the log 
messages.  I think it will help other developers.

>Using 'op=open' for reading a file makes sense
>(generalizing on the prev bullet item) we should use the name of the 
> method (case insensitive) for the operation names,
>thus it should be 'setowner'

Sounds good.

>Permissions: we should support both octal and symbolic.

Octal is concise.  Do we really need symbolic?
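
For reference, the two notations carry the same information for the basic 
permission bits. The following is a minimal sketch of the conversion, not 
code from either implementation; the function name is hypothetical and 
special bits such as sticky or setuid are deliberately not handled.

```python
def symbolic_to_octal(symbolic):
    """Convert 'rwxr-xr-x' (or '-rwxr-xr-x' with a type character) to '755'."""
    if len(symbolic) == 10:
        bits = symbolic[1:]  # drop the leading file-type character
    elif len(symbolic) == 9:
        bits = symbolic
    else:
        raise ValueError("expected 9 or 10 characters: " + repr(symbolic))
    digits = []
    for start in range(0, 9, 3):  # user, group, other
        value = 0
        for char, flag, weight in zip(bits[start:start + 3], "rwx", (4, 2, 1)):
            if char == flag:
                value += weight
            elif char != "-":
                raise ValueError("unexpected character: " + repr(char))
        digits.append(str(value))
    return "".join(digits)


print(symbolic_to_octal("-rwxr-xr-x"))  # 755
print(symbolic_to_octal("rw-r--r--"))   # 644
```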





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101951#comment-13101951
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> How to handle create/append operations is still open.

Have you seen [my 
comment|https://issues.apache.org/jira/browse/HDFS-2316?focusedCommentId=13101012&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13101012]?





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101428#comment-13101428
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

@Sanjay:

* As agreed with Nicholas, create/mkdir should be PUT (idempotent), and append 
should be POST (not idempotent)
* Parameter names
** Using 'getOp', 'putOp' and 'postOp' is redundant as the HTTP method is 
GET/PUT/POST already
** Using 'op=open' for reading a file makes sense
** (generalizing on the prev bullet item) we should use the name of the method 
(case insensitive) for the operation names, thus it should be 'setowner'
** Default values: Hoop uses the JAR default values; it could be modified to 
use the defaults of hdfs-site.xml in the HOOP config dir
** Permissions: we should support both octal and symbolic.

How to handle create/append operations is still open.
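
The method convention in the bullets above can be sketched as a small lookup 
table. This is an illustration only: the table and function names are 
hypothetical, and mapping 'open' to GET is an assumption (the thread only 
fixes PUT for create/mkdir and POST for append).

```python
# Hypothetical table: idempotent operations use PUT, append uses POST.
EXPECTED_METHOD = {
    "CREATE": "PUT",   # repeating the request yields the same result
    "MKDIRS": "PUT",
    "APPEND": "POST",  # repeating the request appends the data twice
    "OPEN": "GET",     # assumption: reads are plain GETs
}


def method_matches(http_method, op):
    """Return True if the HTTP method is the one expected for the operation."""
    expected = EXPECTED_METHOD.get(op.upper())
    if expected is None:
        raise ValueError("unknown operation: " + op)
    return http_method.upper() == expected


print(method_matches("PUT", "mkdirs"))  # True
print(method_matches("PUT", "append"))  # False
```

With a single 'op' parameter plus this kind of check, the HTTP method and the 
operation name stay independent in the URL while mismatches are still caught.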





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101355#comment-13101355
 ] 

Sanjay Radia commented on HDFS-2316:


Differences between the patch and Hoop:
* Hoop uses PUT for append and POST for create and mkdirs.  Nicholas uses PUT 
for all three operations
* Parameter names:
** Hoop uses op for all operations while Nicholas uses getOp, putOp, etc.
** Hoop uses "data" for reading a file while Nicholas uses "open".
** Hoop uses setowner while Nicholas uses SET_OWNER.
* Default values: Nicholas follows default values used in HDFS but Hoop does 
not.
* Permission: Nicholas uses octal but Hoop uses -rwxrwxrwx.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101354#comment-13101354
 ] 

Sanjay Radia commented on HDFS-2316:


Once we publish the spec folks will start building tools around it. These tools 
will not work on clusters with different configuration. Further, we have too 
many configuration knobs.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101311#comment-13101311
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

Would it be OK if this prefix were optional/configurable, along with the 
ability to run the HTTP HDFS access in the DN on a different port (with the 
default being the /webhdfs prefix on the same port as the rest of the HTTP 
services)? 





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-09 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101305#comment-13101305
 ] 

Sanjay Radia commented on HDFS-2316:


> If we use a webhdfs://HOST:PORT from the FS client impl and internally we 
> replace it with
> http://HOST:PORT/webhdfs, I'd live with that one. After all (as Todd pointed 
> out offline) it is not fully REST.
That is exactly what Nicholas has implemented, i.e. with curl you use the URL 
http://host:port/webhdfs.

hdfs://host:port/path and webhdfs://hostx:portx/path are consistent with 
respect to the path.
While it would be nice to be consistent for http, we cannot because of other 
services using the same port.
So I think we have an agreement on this.
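
A minimal sketch of the client-side rewrite agreed on above, assuming the 
fixed /webhdfs prefix. The function name, host and port are hypothetical; the 
actual patch may structure this differently.

```python
from urllib.parse import urlsplit, urlunsplit

PREFIX = "/webhdfs"  # fixed prefix discussed in this thread


def to_http_url(fs_uri):
    """Map webhdfs://HOST:PORT/path to http://HOST:PORT/webhdfs/path."""
    parts = urlsplit(fs_uri)
    if parts.scheme != "webhdfs":
        raise ValueError("not a webhdfs URI: " + fs_uri)
    # Only the scheme changes and the prefix is added; host, port, path and
    # query are carried over unchanged.
    return urlunsplit(("http", parts.netloc, PREFIX + parts.path,
                       parts.query, parts.fragment))


print(to_http_url("webhdfs://nn.example.com:50070/user/foo"))
# http://nn.example.com:50070/webhdfs/user/foo
```

The FileSystem path (/user/foo) is therefore identical under hdfs:// and 
webhdfs://; only the HTTP form seen by curl carries the prefix.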






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101012#comment-13101012
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> But the write/append handle thingy, any other option?

We probably should use the "Expect: 100-continue" HTTP header.  See 8.2.3 Use 
of the 100 (Continue) Status in 
http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2
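
To illustrate the idea, a client sends only the request head with the Expect 
header and waits for the server's first status line before sending any file 
data. The sketch below shows that decision logic only. The function names, 
the URL and the use of a 3xx redirect to the datanode are assumptions for 
illustration, not part of the proposal text.

```python
def build_request_head(method, path, host, content_length):
    """Build an HTTP/1.1 request head that asks for a 100 (Continue) reply."""
    lines = [
        "{} {} HTTP/1.1".format(method, path),
        "Host: {}".format(host),
        "Content-Length: {}".format(content_length),
        "Expect: 100-continue",
        "",  # blank line terminates the head; no body is sent yet
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def next_action(status_line):
    """Decide what to do after the server's first status line."""
    code = int(status_line.split()[1])
    if code == 100:
        return "send-body"        # server is ready for the payload
    if 300 <= code < 400:
        return "follow-redirect"  # payload was never sent to this server
    return "abort"


head = build_request_head("PUT", "/webhdfs/user/foo?op=create",
                          "nn.example.com:50070", 1024)
print(head.decode("ascii"))
print(next_action("HTTP/1.1 307 Temporary Redirect"))  # follow-redirect
```

The point of the header is that a redirected write does not push the initial 
part of the payload to the first server.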





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100993#comment-13100993
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

If we use a webhdfs://HOST:PORT from the FS client impl and internally we 
replace it with http://HOST:PORT/webhdfs, I'd live with that one. After all 
(as Todd pointed out offline) it is not fully REST.

But the write/append handle thingy, any other option?






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100978#comment-13100978
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

@Todd, well the problem here is that we are overloading the use of a PORT for a 
functionality that requires the whole domain of the namespace (in HDFS we don't 
have special dirs like /dev). I'd be OK with a different port then.

@Sanjay, the indirection for writes would mean that we are moving away from 
the REST protocol, which is a well understood and known way of interacting with 
resources. I think there is big value in having a full REST API like Hoop has 
today.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100965#comment-13100965
 ] 

Sanjay Radia commented on HDFS-2316:


Like hftp, operations are processed at the NN if they involve no data 
transfer. If the operation involves data transfer (r or w) then the request is 
redirected to the DN. This allows bandwidth scaling and load distribution.
Alejandro has pointed out that when you redirect a put or post operation, the 
initial part of the payload has been sent to the NN. I believe this is true. 
Hence for writes we could consider a two request mode - getTheWrite handle 
using a get and then do a put or post to the data node.
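
A minimal sketch of the NN-side routing described above, under stated 
assumptions: the set of data-transfer operations, the 307 status code, the 
/webhdfs prefix on the datanode and the pick_datanode callback are all 
hypothetical placeholders.

```python
# Assumed set of operations that transfer file data (read or write).
DATA_OPS = {"OPEN", "CREATE", "APPEND"}


def route(op, path, pick_datanode):
    """Return (status, location) as the NN would answer a request.

    pick_datanode is a hypothetical callback that maps a file path to the
    "host:port" of a suitable datanode.
    """
    op = op.upper()
    if op in DATA_OPS:
        # Data transfer: redirect so the bytes flow to or from a datanode.
        location = "http://{}/webhdfs{}?op={}".format(pick_datanode(path), path, op)
        return 307, location
    # Metadata-only operation: answered by the NN itself.
    return 200, None


print(route("open", "/user/foo", lambda p: "dn1.example.com:50075"))
print(route("getfilestatus", "/user/foo", lambda p: "dn1.example.com:50075"))
```

Because every read or write lands on a datanode, aggregate bandwidth grows 
with the number of datanodes rather than being limited by the NN.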





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100960#comment-13100960
 ] 

Sanjay Radia commented on HDFS-2316:


Operations like to minimize the number of ports. So we would use the same port.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100934#comment-13100934
 ] 

Todd Lipcon commented on HDFS-2316:
---

Yea, that seems crazy. I'm for option #1 or for opening yet-another-port.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100926#comment-13100926
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

@Todd, we are using the same port.  If we implement (2), you are right that we 
will have such a problem.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100917#comment-13100917
 ] 

Todd Lipcon commented on HDFS-2316:
---

Maybe I'm not following completely: is the idea that webhdfs would run on the 
same port as the NN web UI?

It seems crazy to me that http://nn:50030/jmx would be the JMX servlet but 
http://nn:50030/jmx?opGet=read would be a file at path /jmx... hopefully I'm 
misunderstanding :)





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100891#comment-13100891
 ] 

Alejandro Abdelnur commented on HDFS-2316:
--

IMO, the nice thing about #2 is that the file path of *HDFS:* and *HTTP:* 
URIs will be exactly the same, and in the case of using the NN/DN deployment of 
HOOP it will even be the same host. 

In addition, it is intuitive without any caveat: a given path will just work by 
replacing the SCHEME://HOST:PORT part of it. 

Finally, and IMO this is very important from the usability perspective, user 
applications that are designed to take the URI of the FS as a parameter and 
operate via HDFS: or HTTP: will otherwise be difficult to code. Hadoop's 
*Path(String parent, String child)* uses *URI.resolve(...)*, which uses a 
well defined logic to resolve URIs based on other URIs [ 
http://download.oracle.com/javase/6/docs/api/java/net/URI.html#resolve(java.net.URI)
 ]. If we use a prefix for HTTP URIs then it will become difficult and error 
prone to compose HDFS: URIs from HTTP: URIs and vice versa. (And I believe the 
same is true for libraries in other languages.)

Finally, I have not seen HDFS files under */data* as a common practice, thus 
the name collision won't be that common.
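
The resolution concern above can be illustrated with Python's urljoin, used 
here only as a stand-in for Java's URI.resolve since both follow the standard 
URI reference-resolution rules. The host, port and /webhdfs prefix are 
placeholders.

```python
from urllib.parse import urljoin

# Hypothetical base URI that carries a fixed prefix.
BASE_WITH_PREFIX = "http://nn.example.com:50070/webhdfs/"


def resolve(base, reference):
    """Resolve a path reference against a base URI."""
    return urljoin(base, reference)


# An absolute path replaces the whole base path, so the prefix is lost.
print(resolve(BASE_WITH_PREFIX, "/user/foo"))
# http://nn.example.com:50070/user/foo

# Only a relative reference keeps the prefix.
print(resolve(BASE_WITH_PREFIX, "user/foo"))
# http://nn.example.com:50070/webhdfs/user/foo
```

An application that resolves absolute HDFS paths against a prefixed HTTP base 
would silently drop the prefix, which is the composition problem raised 
above.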






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100862#comment-13100862
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

2) may be confusing since the same http url path represents two different file 
system paths.  E.g.
- http://namenode:port/data/foo/bar?... means reading /foo/bar
- http://namenode:port/data/foo/bar?opGet=read&... means reading /data/foo/bar






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100816#comment-13100816
 ] 

Sanjay Radia commented on HDFS-2316:


One issue: we have to coexist with hftp, which uses prefixes like /data. Two 
proposals are on the table:
# (by Nicholas) use a different fixed prefix like /webhdfs/path since it does 
not conflict with /data.
# (by Alejandro) always use an operation in the url so that one can send /data 
with an operation to webhdfs and /data without an operation to hftp. 

1) is simpler to implement since urls within the context of /webhdfs can be 
sent to the webhdfs servlet. 2) has a nicer url since the path is the pathname 
being referenced. 
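
The two proposals can be compared with a small dispatch sketch. It is an 
illustration only: the function names are hypothetical, the operation 
parameter is assumed to be named "op", and hftp is reduced to its /data 
prefix.

```python
from urllib.parse import urlsplit, parse_qs

WEBHDFS_PREFIX = "/webhdfs"
HFTP_PREFIX = "/data"


def _strip(path, prefix):
    """Return the path below the prefix, or None if the prefix does not match."""
    if path == prefix or path.startswith(prefix + "/"):
        return path[len(prefix):] or "/"
    return None


def dispatch_by_prefix(url):
    """Proposal 1: the URL prefix alone selects the servlet."""
    path = urlsplit(url).path
    for servlet, prefix in (("webhdfs", WEBHDFS_PREFIX), ("hftp", HFTP_PREFIX)):
        fs_path = _strip(path, prefix)
        if fs_path is not None:
            return servlet, fs_path
    return "other", path


def dispatch_by_operation(url):
    """Proposal 2: an operation parameter selects webhdfs; the path is unprefixed."""
    parts = urlsplit(url)
    if "op" in parse_qs(parts.query):
        return "webhdfs", parts.path
    fs_path = _strip(parts.path, HFTP_PREFIX)
    if fs_path is not None:
        return "hftp", fs_path
    return "other", parts.path


url = "http://nn.example.com:50070/data/foo/bar"
print(dispatch_by_operation(url))               # ('hftp', '/foo/bar')
print(dispatch_by_operation(url + "?op=open"))  # ('webhdfs', '/data/foo/bar')
```

Under proposal 2 the same URL path names two different files depending on the 
query string, while under proposal 1 the servlet is known from the path 
alone.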





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100797#comment-13100797
 ] 

Eli Collins commented on HDFS-2316:
---

I definitely agree with these distinct use cases, I'm just wondering if we need 
to have two separate FileSystem over HTTP implementations vs one client that 
may or may not use a proxy server (there's no reason http or FileSystem clients 
need to care whether they're being proxied). Sounds like we were duplicating 
code w/o understanding what could be shared. You and Alejandro have looked at 
the specifics more than I have so I trust your judgement.






[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-08 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100120#comment-13100120
 ] 

Sanjay Radia commented on HDFS-2316:


HDFS-2178 (HOOP) is an HDFS proxy using the http protocol - a replacement for 
hdfs proxy v2, but providing rw access. It runs as separate daemons 
*typically* on an array of servers sitting next to an HDFS cluster (like HDFS 
proxy v2).

HDFS-2316 is http rw access that replaces hftp but is *built into* the hdfs 
system and provides bandwidth scaling by redirecting from the NN to the 
datanode that contains the block. It will use spnego and delegation tokens. It 
does not require a notion of trust. 
HDFS-2178 (HOOP) runs as separate daemons that are trusted by hdfs. 
HDFS-2178 (HOOP) can provide additional features like bandwidth management and 
user authentication mapping.

There is an overlap but a need for both.

> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose webhdfs to provide a complete FileSystem
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for
> the tasks.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-07 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099690#comment-13099690
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

Hi Eli, webhdfs is replacing hftp but Hoop is replacing HDFS Proxy.  For more 
detailed discussion, please see HDFS-2284.

> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose webhdfs to provide a complete FileSystem
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for
> the tasks.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-09-07 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099682#comment-13099682
 ] 

Eli Collins commented on HDFS-2316:
---

Why are you duplicating HDFS-2178? Hoop already provides a full read-write
FileSystem interface to HDFS that goes over HTTP.


> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose webhdfs to provide a complete FileSystem
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for
> the tasks.
