Re: CopyFromLocal

Harsh J Mon, 21 May 2012 21:10:19 -0700

Ranjith,

MapReduce and HDFS are two different things. MapReduce uses HDFS (and
can use any other FS as well) to do some efficient work, but HDFS does
not use MapReduce.


A simple HDFS transfer goes directly over the network. Yes, it's just a
block-by-block copy/write to/from the relevant DataNodes, done over
network sockets at each end.
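
To make the block-by-block idea concrete, here is a minimal Python sketch.
It is not the HDFS client: the function names (copy_block_by_block,
read_back), the in-memory dicts standing in for DataNodes, and the
round-robin placement are all assumptions made for illustration. In real
HDFS the NameNode chooses the target DataNodes and the bytes travel over
sockets. The sketch only shows a single client reading a file sequentially
and handing off one block at a time, with no mappers involved.

```python
import io

# Tiny block size so the example is easy to follow; real HDFS blocks
# are tens of megabytes or larger.
BLOCK_SIZE = 4


def copy_block_by_block(src, datanodes, block_size=BLOCK_SIZE):
    """Read src sequentially and hand each block to one stand-in DataNode.

    src is any binary file-like object. datanodes maps a node name to a
    dict of {block_index: bytes}. Returns a list of (block_index, node_name)
    placements in write order.
    """
    placements = []
    names = sorted(datanodes)
    index = 0
    while True:
        block = src.read(block_size)
        if not block:
            break  # end of the local file
        # Round-robin placement is an assumption of this sketch only.
        target = names[index % len(names)]
        datanodes[target][index] = block
        placements.append((index, target))
        index += 1
    return placements


def read_back(datanodes, placements):
    """Reassemble the file by fetching blocks in their original order."""
    return b"".join(datanodes[name][i] for i, name in placements)


if __name__ == "__main__":
    nodes = {"dn1": {}, "dn2": {}}
    layout = copy_block_by_block(io.BytesIO(b"hello world"), nodes)
    print(layout)
    print(read_back(nodes, layout))
```

The loop runs in one thread on the client, which is why a plain copy does
not scale the way a distributed copy job does.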

On Tue, May 22, 2012 at 8:58 AM, Ranjith <ranjith.raghuna...@gmail.com> wrote:
> Thanks, Harsh. So when it connects directly to the data nodes it does not fire 
> off any mappers. So how does it get the data over? Is it just a block by 
> block copy?
>
> Thanks,
> Ranjith
>
> On May 21, 2012, at 9:22 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Ranjith,
>>
>> Are you speaking of DistCp?
>> http://hadoop.apache.org/common/docs/current/distcp.html
>>
>> An 'fs -copyFromLocal' otherwise just runs as a single program that
>> connects to your DFS nodes and writes data from a single client
>> thread, and is not distributed on its own.
>>
>> On Tue, May 22, 2012 at 6:48 AM, Ranjith <ranjith.raghuna...@gmail.com> wrote:
>>>
>>> I have always wondered about this and am not sure how it works. When I 
>>> fire a MapReduce job to copy data over in a distributed fashion, I would 
>>> expect to see mappers executing the copy. What happens with a copy command 
>>> from hadoop fs?
>>>
>>> Thanks,
>>> Ranjith
>>
>>
>>
>> --
>> Harsh J



-- 
Harsh J
