[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423771#comment-13423771 ]

xiongwen commented on HDFS-2631:

Hello Jaimin,

Where can I download HDFS-2631.patch? Thanks for your attention!

I plan to test the I/O performance of HDFS with filebench, covering sequential read, random read, sequential write, and random write. I also think fuse-webhdfs may be better than fuse-dfs.

Rewrite fuse-dfs to use the webhdfs protocol

Key: HDFS-2631
URL: https://issues.apache.org/jira/browse/HDFS-2631
Project: Hadoop HDFS
Issue Type: Improvement
Components: webhdfs
Reporter: Eli Collins
Assignee: Jaimin D Jetly
Attachments: HDFS-2631.1.patch, HDFS-2631.patch

We should port the implementation of fuse-dfs to use the webhdfs protocol. This has a number of benefits:
* Compatibility - allows a single fuse client to work across server versions
* Works with both WebHDFS and Hoop since they are protocol compatible
* Removes the overhead related to libhdfs (forking a jvm)
* Makes it easier to support features like security

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
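
For readers unfamiliar with the protocol named in the issue description, the following is a minimal sketch of how a FUSE client might address WebHDFS over HTTP. It is not taken from the attached patches. The helper `webhdfs_url`, the `FUSE_TO_WEBHDFS` table, the host name, and the port are assumptions made for illustration; only the `/webhdfs/v1` path prefix and the operation names come from the WebHDFS REST API.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL (hypothetical helper, not from the patch).

    WebHDFS exposes each filesystem call as an HTTP request of the form
    http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>&<param>=<value>
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Illustrative mapping from FUSE callbacks to (HTTP method, WebHDFS op).
# The pairing is an assumption about how a port could be organized.
FUSE_TO_WEBHDFS = {
    "getattr": ("GET", "GETFILESTATUS"),
    "readdir": ("GET", "LISTSTATUS"),
    "read": ("GET", "OPEN"),
    "mkdir": ("PUT", "MKDIRS"),
    "rename": ("PUT", "RENAME"),
    "unlink": ("DELETE", "DELETE"),
}

if __name__ == "__main__":
    method, op = FUSE_TO_WEBHDFS["read"]
    # Host and port below are placeholders.
    print(method, webhdfs_url("namenode.example.com", 50070,
                              "/user/alice/data.txt", op,
                              offset=0, length=4096))
```

Because every call is a plain HTTP request, a client built this way needs no JVM, which is the overhead the third bullet in the description refers to.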
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408228#comment-13408228 ]

Colin Patrick McCabe commented on HDFS-2631:

Hi Jaimin,

It's great that you're working on this.

I think it would be best if you kept the existing libhdfs API. That way, users can easily switch back and forth between the JNI-based libhdfs and your webhdfs-based libhdfs. If you do not do this, all applications will have to be rewritten, which may limit the number of people who can use your work.

In a similar vein, I think you should avoid changing fuse-dfs in this patch (it would definitely make it a lot smaller). And if you implement the existing API, then obviously there's no reason to modify FUSE at all.

Finally, we're using CMake now, so you should update your patch to make use of that. CMake is very straightforward. Let me know if you have any questions or if you want to see an example CMakeLists.txt.
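
The point about keeping one API can be illustrated with a small sketch. Everything here is hypothetical: `HdfsClient`, `ToyClient`, and `count_entries` are invented names, and the toy backend only stands in for a real JNI-based or webhdfs-based implementation. The method names loosely mirror `hdfsExists` and `hdfsListDirectory` from libhdfs, but the real library is a C API, so this shows the design idea rather than the actual interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class HdfsClient(ABC):
    """Hypothetical interface standing in for the libhdfs function set."""

    @abstractmethod
    def exists(self, path: str) -> bool:
        """Return True if the path exists."""

    @abstractmethod
    def list_directory(self, path: str) -> List[str]:
        """Return the names of the entries directly under path."""


class ToyClient(HdfsClient):
    """In-memory stand-in for a real backend (JNI or WebHDFS)."""

    def __init__(self, backend_name: str, tree: Dict[str, List[str]]):
        self.backend_name = backend_name
        self._tree = tree

    def exists(self, path: str) -> bool:
        return path in self._tree

    def list_directory(self, path: str) -> List[str]:
        return sorted(self._tree[path])


def count_entries(client: HdfsClient, path: str) -> int:
    """Application code: it depends only on the shared interface."""
    if not client.exists(path):
        return 0
    return len(client.list_directory(path))


# Two backends behind the same interface; the caller does not change.
TREE = {"/": ["user"], "/user": ["alice", "bob"]}
jni_like = ToyClient("jni", TREE)
webhdfs_like = ToyClient("webhdfs", TREE)

if __name__ == "__main__":
    for client in (jni_like, webhdfs_like):
        print(client.backend_name, count_entries(client, "/user"))
```

An application written like `count_entries` keeps working when the backend is swapped, which is why a second, incompatible API would force existing libhdfs applications to be rewritten.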
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408375#comment-13408375 ]

Eli Collins commented on HDFS-2631:

Agree with Colin's suggestions. A WebHDFS-based implementation of libhdfs would be useful beyond fuse-dfs.
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408387#comment-13408387 ]

Suresh Srinivas commented on HDFS-2631:

Colin, I have already commented to this effect on HDFS-2656.
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408414#comment-13408414 ]

Jaimin D Jetly commented on HDFS-2631:

Hi Colin,

This patch does not replace or alter fuse-dfs (which uses the JNI-based libhdfs), and it does not use the existing libhdfs API. The implementation in the patch uses its own API, based on the libcurl and Jansson libraries.

As for your last suggestion, I will certainly look into CMake.
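
To make the libcurl-plus-Jansson approach more concrete: the HTTP layer fetches a JSON document from WebHDFS, and the JSON layer turns it into the attributes FUSE expects. The sketch below shows only the second half, in Python rather than C, and is not the patch's code. The function names `to_stat` and `parse_liststatus` are hypothetical, and the sample response is made up, though its field names follow the `FileStatuses` schema of the WebHDFS LISTSTATUS operation.

```python
import json
import stat


def to_stat(status):
    """Convert one WebHDFS FileStatus object into stat-like fields.

    Hypothetical helper: anything that is not a DIRECTORY is treated
    as a regular file here, which ignores the SYMLINK type.
    """
    kind = stat.S_IFDIR if status["type"] == "DIRECTORY" else stat.S_IFREG
    return {
        # WebHDFS reports permissions as an octal string such as "644".
        "st_mode": kind | int(status["permission"], 8),
        "st_size": status["length"],
        # WebHDFS timestamps are in milliseconds since the epoch.
        "st_mtime": status["modificationTime"] // 1000,
    }


def parse_liststatus(body):
    """Map each entry name in a LISTSTATUS response to its stat fields."""
    entries = json.loads(body)["FileStatuses"]["FileStatus"]
    return {entry["pathSuffix"]: to_stat(entry) for entry in entries}


# Made-up response body in the shape WebHDFS returns for op=LISTSTATUS.
SAMPLE = json.dumps({"FileStatuses": {"FileStatus": [
    {"pathSuffix": "a.patch", "type": "FILE", "length": 24930,
     "permission": "644", "modificationTime": 1320171722771},
    {"pathSuffix": "logs", "type": "DIRECTORY", "length": 0,
     "permission": "755", "modificationTime": 1320895981256},
]}})

if __name__ == "__main__":
    for name, attrs in parse_liststatus(SAMPLE).items():
        print(name, oct(attrs["st_mode"]), attrs["st_size"])
```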
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408424#comment-13408424 ]

Colin Patrick McCabe commented on HDFS-2631:

Hi Jaimin,

I don't think we want to copy and paste all the fuse code to another directory just because we're relying on a different backend library. That would really increase the maintenance burden, since we'd be fixing the same bugs in two places, etc.

As Suresh said (both here and in HDFS-2656), we really do want to keep that existing API. fuse_dfs isn't the only libhdfs application out there!

Let me know if there's anything I can do to help, whatever that may be. It would be really nice to have the option of running without a JVM in libhdfs...
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255350#comment-13255350 ]

Todd Lipcon commented on HDFS-2631:

That seems reasonable. I think it's a given that we need to keep the original libhdfs for performance. Having a libhdfs-alike that goes over HTTP seems reasonable enough, but it is not always preferable. To speak to each of the original points:

bq. Compatibility - allows a single fuse client to work across server versions

We need to address compatibility for clients in general. Our Java client (and hence libhdfs) needs this just as much as fuse.

bq. Works with both WebHDFS and Hoop since they are protocol compatible

I guess this is an advantage, but given that libhdfs already wraps arbitrary Hadoop filesystems, we already have this capability.

bq. Removes the overhead related to libhdfs (forking a jvm)

fuse is a long-running client, so the fork overhead seems minimal. Recent improvements in libhdfs have also cut out most of the copying overhead.

bq. Makes it easier to support features like security

Perhaps, but libhdfs needs security anyway, so I don't think it buys us much.

Rewrite fuse-dfs to use the webhdfs protocol

Key: HDFS-2631
URL: https://issues.apache.org/jira/browse/HDFS-2631
Project: Hadoop HDFS
Issue Type: Improvement
Components: contrib/fuse-dfs
Reporter: Eli Collins
Assignee: Jaimin D Jetly
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255255#comment-13255255 ]

Todd Lipcon commented on HDFS-2631:

I'm a little confused: why is this a good idea? Seems like it's likely to end up much slower than the current implementation. I'd prefer it as another option, rather than a rewrite.
[jira] [Commented] (HDFS-2631) Rewrite fuse-dfs to use the webhdfs protocol
[ https://issues.apache.org/jira/browse/HDFS-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164975#comment-13164975 ]

Tsz Wo (Nicholas), SZE commented on HDFS-2631:

Eli, this is a great idea!

Rewrite fuse-dfs to use the webhdfs protocol

Key: HDFS-2631
URL: https://issues.apache.org/jira/browse/HDFS-2631
Project: Hadoop HDFS
Issue Type: Improvement
Components: contrib/fuse-dfs
Reporter: Eli Collins