Re: Question about writing HDFS files

2013-05-16 Thread Harsh J
That is not true. HDFS writes are not staged to a local disk before being written to the DataNodes. The old architecture docs suggest that writes get staged to a local disk, but that's no longer true; see https://issues.apache.org/jira/browse/HDFS-1454. Also worth noting that a

Re: Question about writing HDFS files

2013-05-16 Thread Rahul Bhattacharjee
Hi Harsh,

I think what John meant by writing to local disk is writing first to the same data node that initiated the write call. John can further clarify.

On Fri, May 17, 2013 at 4:23 AM, Harsh J wrote:
> That is not true. HDFS writes are not staged to a local disk first
> before being w

Re: Question about writing HDFS files

2013-05-16 Thread Harsh J
Thanks for the clarification, Rahul. In that case, the reading is correct (and an HDFS client behaves the same in and out of MR; it's not really related to MR at all). A "client outside" would write to a random set of datanodes, across at least two racks for 3 replicas if rack awareness is
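To make the placement behaviour discussed in this thread concrete, here is a minimal Python sketch of how three replicas might be chosen for one block. It is a simulation, not Hadoop code: the function name `choose_replica_nodes`, the `topology` dictionary, and the node and rack names are all hypothetical. It assumes rack awareness is configured, a replication factor of 3, at least two racks, and the commonly described default policy (first replica on the writer's own datanode if it is one, otherwise a random node; second replica on a different rack; third replica on the same rack as the second).

```python
import random


def choose_replica_nodes(topology, client_node=None, rng=None):
    """Pick three datanodes for one block (illustrative simulation only).

    topology: dict mapping a rack name to its list of datanode names
              (a hypothetical structure, not a Hadoop API).
    client_node: name of the node the client runs on, or None when the
                 client is outside the cluster.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}

    # First replica: the writer's own node if it is a datanode,
    # otherwise a randomly chosen datanode.
    if client_node in rack_of:
        first = client_node
    else:
        first = rng.choice(sorted(rack_of))

    # Second replica: any node on a different rack than the first.
    remote = [n for n in sorted(rack_of) if rack_of[n] != rack_of[first]]
    second = rng.choice(remote)

    # Third replica: another node on the second replica's rack; if that
    # rack has only one node, fall back to any node not yet chosen.
    same_rack = [n for n in topology[rack_of[second]] if n != second]
    others = [n for n in sorted(rack_of) if n not in (first, second)]
    third = rng.choice(same_rack or others)

    return [first, second, third]


topology = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
# A client running on dn1 gets its first replica on dn1.
print(choose_replica_nodes(topology, client_node="dn1"))
# A client outside the cluster gets a random first node.
print(choose_replica_nodes(topology))
```

In both calls the three replicas end up spread over two racks; only the choice of the first node differs between the in-cluster and the outside client.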

RE: Question about writing HDFS files

2013-05-17 Thread John Lilley
to:ha...@cloudera.com]
Sent: Thursday, May 16, 2013 11:12 PM
To:
Subject: Re: Question about writing HDFS files

Thanks for the clarification, Rahul. In that case, the reading is correct (and an HDFS client behaves the same in and out of MR; it's not really related to MR at all). A "clie

Re: Question about writing HDFS files

2013-05-17 Thread J. Rottinghuis
out from IP
> or hostname?
>
>
> -Original Message-
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Thursday, May 16, 2013 11:12 PM
> To:
> Subject: Re: Question about writing HDFS files
>
> Thanks for the clarification Rahul. In that case, then the readi