[jira] [Commented] (HDFS-1213) Implement an Apache Commons VFS Driver for HDFS
[ https://issues.apache.org/jira/browse/HDFS-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268685#comment-14268685 ]

Dave Marion commented on HDFS-1213:
---

FWIW, the current HDFS provider in Commons VFS is read-only.

Implement an Apache Commons VFS Driver for HDFS
---

Key: HDFS-1213
URL: https://issues.apache.org/jira/browse/HDFS-1213
Project: Hadoop HDFS
Issue Type: New Feature
Components: hdfs-client
Reporter: Michael D'Amour
Attachments: HADOOP-HDFS-Apache-VFS.patch, pentaho-hdfs-vfs-TRUNK-SNAPSHOT-sources.tar.gz, pentaho-hdfs-vfs-TRUNK-SNAPSHOT.jar

We have an open source ETL tool (Kettle) which uses VFS for many input/output steps/jobs. We would like to be able to read/write HDFS from Kettle using VFS.

I haven't been able to find anything out there other than "it would be nice."

I had some time a few weeks ago to begin writing a VFS driver for HDFS, and we (Pentaho) would like to be able to contribute this driver. I believe it supports all the major file/folder operations, and I have written unit tests for all of these operations. The code is currently checked into an open Pentaho SVN repository under the Apache 2.0 license. There are some current limitations, such as a lack of authentication (Kerberos), which appears to be coming in 0.22.0. The driver does support username/password, but I just can't use them yet.

I will be attaching the code for the driver once the case is created. The project does not modify existing hadoop/hdfs source.

Our JIRA case can be found at http://jira.pentaho.com/browse/PDI-4146

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1213) Implement an Apache Commons VFS Driver for HDFS
[ https://issues.apache.org/jira/browse/HDFS-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774520#comment-13774520 ]

Jakub Narloch commented on HDFS-1213:
---

Hi,

What is the status of this issue? Is there any chance it will be released someday? We could really use this level of abstraction, and I think the general idea is great.
[jira] [Commented] (HDFS-1213) Implement an Apache Commons VFS Driver for HDFS
[ https://issues.apache.org/jira/browse/HDFS-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774553#comment-13774553 ]

Matt Casters commented on HDFS-1213:
---

Jakub, until the VFS driver can be incorporated into the HDFS or VFS Apache projects, Pentaho maintains a separate source tree. It's over here:

http://source.pentaho.org/viewvc/svnroot/pentaho-commons/pentaho-hdfs-vfs/trunk/
[jira] [Commented] (HDFS-1213) Implement an Apache Commons VFS Driver for HDFS
[ https://issues.apache.org/jira/browse/HDFS-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402927#comment-13402927 ]

DuncanMaster commented on HDFS-1213:
---

Hi, I'm Duncan, and I'm approaching VFS for the first time. I've tried a simple program that uses these vfs-hdfs libraries, but I don't understand what to do to make them work.

I have a simple project with the vfs2 and pentaho-hdfs-vfs-1.0 libraries added to the project's libraries. What do I have to do to make them work? If I do nothing, it gives me this error:

org.apache.commons.vfs2.FileSystemException: Badly formed URI hdfs://localhost:9000/user/giuseppe/input/timestamp.png
    at org.apache.commons.vfs2.provider.url.UrlFileProvider.findFile(UrlFileProvider.java:91)
    [...]
java.net.MalformedURLException: unknown protocol: hdfs

Thanks in advance :)
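A note on the error quoted above: the trace passes through UrlFileProvider, which suggests that no provider was registered for the hdfs scheme and VFS fell back to its generic URL provider, which hands the string to java.net.URL. The JDK ships no handler for hdfs, hence the MalformedURLException. The sketch below reproduces only that last step with the plain JDK (no VFS or Hadoop on the classpath). The class and method names (UnknownSchemeDemo, tryParse) are made up for illustration and are not part of any library.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class: shows how java.net.URL treats a scheme
// for which the JDK has no URLStreamHandler registered.
class UnknownSchemeDemo {

    /** Returns the exception message if the URL is rejected, or null if it parses. */
    @SuppressWarnings("deprecation") // the URL(String) constructor is deprecated on recent JDKs
    static String tryParse(String uri) {
        try {
            new URL(uri);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The URI from the report: rejected, because "hdfs" has no handler in the JDK.
        System.out.println(tryParse("hdfs://localhost:9000/user/giuseppe/input/timestamp.png"));
        // A scheme the JDK knows about parses without error, so this prints null.
        System.out.println(tryParse("http://localhost:9000/"));
    }
}
```

If that reading of the trace is right, the likely fix is to register the driver's FileProvider for the hdfs scheme before resolving files, for example with DefaultFileSystemManager.addProvider("hdfs", ...) or through a META-INF/vfs-providers.xml entry on the classpath. The provider class name depends on the jar in use, so it is best taken from the driver's own sources.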