That is not true. HDFS writes are not staged to a local disk before
being written to the DataNodes. The old architecture docs seem to
suggest that writes get staged to a local disk, but that's not true
anymore; see https://issues.apache.org/jira/browse/HDFS-1454.
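
To make the difference concrete, here is a rough toy model of a streamed
write (plain Python, not Hadoop code). The class and function names are
made up for illustration, and the 4-byte packet size is only there to keep
the example small (the real client default is 64 KB). The point it tries to
show is that the writer hands each packet straight to the first node in the
pipeline, which stores it and forwards it on, with no local staging file in
between.

```python
# Toy model only: this is not Hadoop code. All names are hypothetical.
PACKET_SIZE = 4  # bytes; tiny on purpose, real HDFS packets default to 64 KB


class ToyDataNode:
    """Stores each packet it receives, then forwards it downstream."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.stored = bytearray()

    def receive(self, packet):
        self.stored.extend(packet)
        if self.downstream is not None:
            self.downstream.receive(packet)


def build_pipeline(names):
    """Chain the nodes so the first forwards to the second, and so on."""
    head = None
    nodes = []
    for name in reversed(names):
        head = ToyDataNode(name, downstream=head)
        nodes.append(head)
    nodes.reverse()
    return nodes


def stream_write(data, pipeline_head):
    """Send data packet by packet; nothing is buffered to a local file."""
    sent = 0
    for offset in range(0, len(data), PACKET_SIZE):
        pipeline_head.receive(data[offset:offset + PACKET_SIZE])
        sent += 1
    return sent


nodes = build_pipeline(["dn1", "dn2", "dn3"])
packets = stream_write(b"hello hdfs", nodes[0])
```

After the call, every node in the toy pipeline holds the full payload, and
the writer never touched its own disk.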

Also worth noting that an HDFS client behaves the same way in almost
all contexts, whether it's invoked from an MR framework or directly
from the shell.

On Fri, May 17, 2013 at 3:38 AM, John Lilley <john.lil...@redpoint.net> wrote:
> I seem to recall reading that when a MapReduce task writes a file, the
> blocks of the file are always written to local disk, and replicated to other
> nodes.  If this is true, is this also true for non-MR applications writing
> to HDFS from Hadoop worker nodes?  What about clients outside of the cluster
> doing a file load?
>
> Thanks
>
> John
>
>



--
Harsh J
