Zheng Shao created HADOOP-13975:
---
Summary: Allow DistCp to use MultiThreadedMapper
Key: HADOOP-13975
URL: https://issues.apache.org/jira/browse/HADOOP-13975
Project: Hadoop Common
Issue Type
ithread mapper to
>> decrease the contention of input reading and output?
>>
>
>
--
View this message in context:
http://old.nabble.com/MultithreadedMapper-tp34213805p34219011.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.
nd output?
>
But I found that synchronization is needed for record reading (reading
the input key and value) and for result output.
I use Spring Batch for that. It has I/O buffering built in, and it is very easy to
use and well documented.
On Thu, Jul 26, 2012 at 7:42 AM, Robert Evans wrote:
> About the only time that
> MultithreadedMapper makes a lot of sense is if there is a lot of
> computation associated with each key/value pair.
Or if the mapper does a lot of i/o to some external resource, e.g., a
web crawler.
Doug
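
To make the trade-off discussed above concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API. The class name SharedReaderSketch and the process() helper are hypothetical, and the uppercase conversion only stands in for expensive per-record work such as heavy computation or a call to an external resource. Several threads share one input and one output, and both are guarded by a lock, which is the synchronization described earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedReaderSketch {

    // Hypothetical stand-in for per-record work (computation or external I/O).
    static String process(String record) {
        return record.toUpperCase();
    }

    // Runs `threads` workers over one shared input and one shared output.
    static List<String> run(List<String> input, int threads) {
        final Iterator<String> reader = input.iterator();
        final List<String> output = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                while (true) {
                    String record;
                    // Reading is serialized: only one thread advances the reader.
                    synchronized (reader) {
                        if (!reader.hasNext()) {
                            return;
                        }
                        record = reader.next();
                    }
                    // Only this step runs in parallel across threads.
                    String result = process(record);
                    // Writing is serialized as well.
                    synchronized (output) {
                        output.add(result);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            input.add("record-" + i);
        }
        // Output order is not deterministic; only the count is.
        System.out.println(run(input, 4).size());
    }
}
```

If process() is cheap, the threads spend most of their time waiting on the two locks and the extra threads buy little. If it is slow, the locked sections become a small fraction of the total and the parallel step dominates.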
7 AM, "kenyh" wrote:
>
>Multithreaded MapReduce introduces multithreaded execution in the map task. In
>Hadoop 1.0.2, MultithreadedMapper implements multithreaded execution in the mapper
>function. But I found that synchronization is needed for record reading (read
>the input Key a