Sasha, 

If the namenode is unavailable, you cannot communicate with Hadoop at all.
It is the single point of failure: once it is down, the system is unusable.
The secondary namenode is not a failover substitute for the namenode; the
name is misleading.  Its purpose is simply to checkpoint the namenode's
metadata so you can recover from a namenode failure that has corrupted it.
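
There is nothing to fail over to in client code, so the most you can do is
detect that the namenode is unreachable and fail (or retry) gracefully.  A
minimal sketch of such a check (the class name, URI, and 0.20-era API below
are my own placeholders, not something Hadoop prescribes):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class NameNodeProbe {
        // Returns true only if the namenode answers a trivial metadata RPC.
        public static boolean isNameNodeUp(String fsUri) {
            Configuration conf = new Configuration();
            conf.set("fs.default.name", fsUri); // e.g. "hdfs://foo.bar.com:9000/"
            try {
                FileSystem fs = FileSystem.get(conf);
                fs.exists(new Path("/")); // forces an RPC to the namenode
                return true;
            } catch (IOException e) {
                return false; // namenode down, unreachable, or misconfigured
            }
        }
    }

A load balancer cannot help here either, because the secondary namenode does
not speak the namenode's client protocol; directing traffic to it would not
work.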

Bill 


-----Original Message-----
From: Sasha Dolgy [mailto:sdo...@gmail.com] 
Sent: Monday, May 18, 2009 9:34 AM
To: core-user@hadoop.apache.org
Subject: Re: proper method for writing files to hdfs

Hi Bill,

Thanks for that.  If the NameNode is unavailable, how do we find the
secondary NameNode?  Is there a way to deal with this in the code, or
should a load balancer of some sort sit above both and only direct
traffic to the NameNode if it's listening?

-sd

On Mon, May 18, 2009 at 2:09 PM, Bill Habermaas <b...@habermaas.us> wrote:
> Sasha,
>
> Connecting to the namenode is the proper way to establish the HDFS
> connection.  Afterwards, the Hadoop client code invoked by your own code
> goes directly to the datanodes.  There is no reason for you to communicate
> directly with a datanode, nor is there any way for you to know where the
> datanodes are located.  That is all handled silently, under the covers, by
> the Hadoop client code.
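>
> The client library is also what resolves block-to-datanode mappings.  You
> can see this for yourself by asking the filesystem where a file's blocks
> live (a quick sketch, untested; the URI and path are just placeholders):
>
>    import org.apache.hadoop.conf.Configuration;
>    import org.apache.hadoop.fs.BlockLocation;
>    import org.apache.hadoop.fs.FileStatus;
>    import org.apache.hadoop.fs.FileSystem;
>    import org.apache.hadoop.fs.Path;
>
>    public class ShowBlockLocations {
>        public static void main(String[] args) throws Exception {
>            Configuration conf = new Configuration();
>            conf.set("fs.default.name", "hdfs://foo.bar.com:9000/");
>            FileSystem fs = FileSystem.get(conf);
>            Path file = new Path("/tmp/i/am/a/path/to/a/file.name");
>            FileStatus status = fs.getFileStatus(file);
>            // The namenode reports, through the client library, which
>            // datanodes hold each block of the file.
>            BlockLocation[] blocks =
>                fs.getFileBlockLocations(status, 0, status.getLen());
>            for (BlockLocation block : blocks) {
>                for (String host : block.getHosts()) {
>                    System.out.println("replica on datanode: " + host);
>                }
>            }
>        }
>    }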
>
> Bill
>
> -----Original Message-----
> From: sdo...@gmail.com [mailto:sdo...@gmail.com] On Behalf Of Sasha Dolgy
> Sent: Sunday, May 17, 2009 10:55 AM
> To: core-user@hadoop.apache.org
> Subject: proper method for writing files to hdfs
>
> The following graphic outlines the architecture for HDFS:
> http://hadoop.apache.org/core/docs/current/images/hdfsarchitecture.gif
>
> If one is to write a client that adds data into HDFS, it needs to add it
> through the Data Node.  From the graphic, I understand that the client
> doesn't communicate with the NameNode, only with the Data Node.
>
> In the examples I've seen, and in the experimenting I've been doing, I set
> the HDFS URL as a configuration parameter before I create a file.  Is this
> the incorrect way to create files in HDFS?
>
>    Configuration config = new Configuration();
>    config.set("fs.default.name","hdfs://foo.bar.com:9000/");
>    String path = "/tmp/i/am/a/path/to/a/file.name";
>    Path hdfsPath = new Path(path);
>    FileSystem fileSystem = FileSystem.get(config);
>    FSDataOutputStream os = fileSystem.create(hdfsPath, false);
>    os.write("something".getBytes());
>    os.close();
>
> Should the client be connecting to a data node to create the file as
> indicated in the graphic above?
>
> If connecting to a data node is possible and suggested, where can I find
> more details about this process?
>
> Thanks in advance,
> -sasha
>
> --
> Sasha Dolgy
> sasha.do...@gmail.com
>
>
>



-- 
Sasha Dolgy
sasha.do...@gmail.com

