Right, sorry for the ambiguity, I was talking about HDFS writes only.

So my application doesn't need to do anything to signal whether it is writing 
from inside or outside of the Hadoop cluster? HDFS figures that out from the IP 
address or hostname?


-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com] 
Sent: Thursday, May 16, 2013 11:12 PM
To: <user@hadoop.apache.org>
Subject: Re: Question about writing HDFS files

Thanks for the clarification, Rahul. In that case, the reading is correct 
(and an HDFS client behaves the same in and out of MR; it's not really 
related to MR at all).

A "client outside" would write to a random set of DataNodes, across at least two 
racks for 3 replicas if rack awareness is turned on.
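
A rough sketch of the placement behaviour described above, for illustration 
only. It is a simplified model, not Hadoop code: the function name 
choose_replicas, the topology dict, and the node and rack names are all 
hypothetical, and it assumes 3 replicas, rack awareness turned on, and at least 
two racks. A writer that is itself a DataNode gets the first replica locally; 
any other client gets a random first node.

```python
import random


def choose_replicas(topology, writer=None, rng=random):
    """Pick three DataNodes for one block (simplified, hypothetical model).

    topology: dict mapping rack name -> list of DataNode names.
    writer:   name of the node issuing the write, or None (or any name not
              in the topology) when the client is outside the cluster.
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    all_nodes = sorted(rack_of)

    # First replica: the writer's own DataNode if it is one, else a random node.
    first = writer if writer in rack_of else rng.choice(all_nodes)

    # Second replica: a node on a different rack from the first.
    remote = [n for n in all_nodes if rack_of[n] != rack_of[first]]
    second = rng.choice(remote)

    # Third replica: another node on the second replica's rack, falling back
    # to any unused node if that rack has only one DataNode.
    same_rack = [n for n in topology[rack_of[second]] if n != second]
    fallback = [n for n in all_nodes if n not in (first, second)]
    third = rng.choice(same_rack or fallback)

    return [first, second, third]
```

Calling choose_replicas(topology, writer="dn1") models a write issued on a 
worker node, and writer=None models a client outside the cluster. Either way 
the three replicas end up spread over two racks in this model.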

On Fri, May 17, 2013 at 8:17 AM, Rahul Bhattacharjee <rahul.rec....@gmail.com> 
wrote:
> Hi Harsh,
>
> I think what John meant by writing to local disk is writing to the 
> same data node first which has initiated the write call.
>
> John can further clarify.
>
>
> On Fri, May 17, 2013 at 4:23 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>> That is not true. HDFS writes are not staged to a local disk first 
>> before being written onto the DataNodes. The old architecture docs 
>> seem to suggest that the writes get staged to a local disk, but that's 
>> not true anymore; see https://issues.apache.org/jira/browse/HDFS-1454.
>>
>> Also worth noting that an HDFS client behaves the same way in almost 
>> all contexts, whether it's invoked from an MR framework or directly 
>> from the shell.
>>
>> On Fri, May 17, 2013 at 3:38 AM, John Lilley 
>> <john.lil...@redpoint.net>
>> wrote:
>> > I seem to recall reading that when a MapReduce task writes a file, 
>> > the blocks of the file are always written to local disk, and 
>> > replicated to other nodes.  If this is true, is this also true for 
>> > non-MR applications writing to HDFS from Hadoop worker nodes?  What 
>> > about clients outside of the cluster doing a file load?
>> >
>> > Thanks
>> >
>> > John
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



--
Harsh J
