Hi,

We (Jingxin Feng, Xing Lin, and I) have been working on a FileSystem implementation that allows Hadoop to use an NFSv3 storage server as a filesystem. It leverages code from the hadoop-nfs project for all of the request/response handling. We would like your help to add it as part of the Hadoop tools (similar to hadoop-aws and hadoop-azure).
In more detail, the Hadoop NFS Connector allows Apache Hadoop (2.2+) and Apache Spark (1.2+) to use an NFSv3 storage server as a storage endpoint. The connector can run in two modes: (1) secondary filesystem, where Hadoop/Spark uses HDFS as its primary storage and NFS as a secondary storage endpoint, and (2) primary filesystem, where Hadoop/Spark runs entirely on an NFSv3 storage server.

The code is written so that existing applications do not have to change. All one has to do is copy the connector jar into the lib/ directory of Hadoop/Spark and then modify core-site.xml to provide the necessary details. The current version can be seen at: https://github.com/NetApp/NetApp-Hadoop-NFS-Connector

This is my first time contributing to the Hadoop codebase. It would be great if someone on the Hadoop team could guide us through the process. I'm willing to make the necessary changes to integrate the code. What are the next steps? Should I create a JIRA entry?

Thanks,
Gokul
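To give a rough idea of the secondary-filesystem setup, a core-site.xml fragment might look something like the sketch below. Note that everything other than fs.defaultFS here is an illustrative assumption on my part (the fs.nfs.* property names, the nfs:// scheme, and the class name), not the connector's confirmed configuration keys; the repository README has the authoritative details:

```xml
<!-- Sketch only: all property names below except fs.defaultFS are
     hypothetical placeholders, not confirmed connector keys. -->
<configuration>
  <!-- In secondary mode, HDFS stays the default (primary) filesystem -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:8020</value>
  </property>
  <!-- Register a FileSystem implementation for an nfs:// scheme
       (class name is illustrative) -->
  <property>
    <name>fs.nfs.impl</name>
    <value>org.apache.hadoop.fs.nfs.NFSv3FileSystem</value>
  </property>
</configuration>
```

With a mapping like this in place, jobs could address NFS paths with an nfs://server/export/... style URI alongside their usual hdfs:// paths, without any application code changes.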