Ranjith,

MapReduce and HDFS are two different things. MapReduce uses HDFS (and can use any other FS as well) as the storage layer for its work, but HDFS does not use MapReduce.
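To make the separation concrete, here is a toy Java sketch (not Hadoop code) of what a plain client-side copy amounts to: one thread reads the file, cuts it into fixed-size blocks, and hands each block to a DataNode. Everything in it is an assumption made for illustration: the class name BlockCopySketch, the in-memory lists standing in for DataNodes, the tiny block size, and the round-robin placement. The real client asks the NameNode where each block should go and streams the bytes over sockets.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these names are hypothetical and not part of the Hadoop API.
class BlockCopySketch {

    // Reads the stream on the calling thread in chunks of blockSize bytes and
    // appends each chunk to one of numDataNodes in-memory lists (round-robin).
    static List<List<byte[]>> copy(InputStream in, int blockSize, int numDataNodes)
            throws IOException {
        List<List<byte[]>> dataNodes = new ArrayList<>();
        for (int i = 0; i < numDataNodes; i++) {
            dataNodes.add(new ArrayList<>());
        }
        byte[] buf = new byte[blockSize];
        int blockIndex = 0;
        while (true) {
            // Fill one block, tolerating short reads from the stream.
            int filled = 0;
            while (filled < blockSize) {
                int n = in.read(buf, filled, blockSize - filled);
                if (n < 0) {
                    break; // end of stream
                }
                filled += n;
            }
            if (filled == 0) {
                break; // nothing left to copy
            }
            // Assumed placement policy: round-robin across the fake DataNodes.
            dataNodes.get(blockIndex % numDataNodes).add(Arrays.copyOf(buf, filled));
            blockIndex++;
            if (filled < blockSize) {
                break; // last, partial block
            }
        }
        return dataNodes;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        List<List<byte[]>> nodes = copy(new ByteArrayInputStream(file), 4, 2);
        for (int i = 0; i < nodes.size(); i++) {
            System.out.println("DataNode " + i + " holds " + nodes.get(i).size() + " block(s)");
        }
    }
}
```

No job is submitted and nothing runs in parallel in this sketch. That is the contrast with DistCp, where the copy work is spread across map tasks.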
A simple HDFS transfer goes directly over the network. Yes, it's just a block-by-block copy/write to/from the relevant DataNodes, done over network sockets at each end.

On Tue, May 22, 2012 at 8:58 AM, Ranjith <ranjith.raghuna...@gmail.com> wrote:
> Thanks Harsh. So when it connects directly to the data nodes it does not fire
> off any mappers. So how does it get the data over? Is it just a block by
> block copy?
>
> Thanks,
> Ranjith
>
> On May 21, 2012, at 9:22 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Ranjith,
>>
>> Are you speaking of DistCp?
>> http://hadoop.apache.org/common/docs/current/distcp.html
>>
>> An 'fs -copyFromLocal' otherwise just runs as a single program that
>> connects to your DFS nodes and writes data from a single client
>> thread, and is not distributed on its own.
>>
>> On Tue, May 22, 2012 at 6:48 AM, Ranjith <ranjith.raghuna...@gmail.com>
>> wrote:
>>>
>>> I have always wondered about this and am not sure what is going on. When I
>>> fire a map reduce job to copy data over in a distributed fashion I would
>>> expect to see mappers executing the copy. What happens with a copy command
>>> from Hadoop fs?
>>>
>>> Thanks,
>>> Ranjith
>>
>>
>>
>> --
>> Harsh J

--
Harsh J