The following graphic outlines the architecture for HDFS: http://hadoop.apache.org/core/docs/current/images/hdfsarchitecture.gif
If one is to write a client that adds data into HDFS, it needs to add it through the DataNode. From the graphic, I understand that the client doesn't communicate with the NameNode, only with the DataNode.

In the examples I've seen and in my own experiments, I set the HDFS URL as a configuration parameter and connect to it before I create a file. Is this the incorrect way to create files in HDFS?

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Point the client at the HDFS URL
    Configuration config = new Configuration();
    config.set("fs.default.name", "hdfs://foo.bar.com:9000/");

    String path = "/tmp/i/am/a/path/to/a/file.name";
    Path hdfsPath = new Path(path);

    // Create the file (overwrite = false) and write to it
    FileSystem fileSystem = FileSystem.get(config);
    FSDataOutputStream os = fileSystem.create(hdfsPath, false);
    os.write("something".getBytes());
    os.close();

Should the client be connecting to a DataNode to create the file, as indicated in the graphic above?

If connecting to a DataNode is possible and suggested, where can I find more details about this process?

Thanks in advance,
-sasha

--
Sasha Dolgy
sasha.do...@gmail.com