By the way, how do I know whether my map task is single-threaded (i.e., a
single thread executing for each record)? And how can I change it to be
multi-threaded?

Thank you,
Maha

On Mar 12, 2011, at 9:11 PM, Harsh J wrote:

> Hello,
> 
> On Sat, Mar 12, 2011 at 3:51 PM, Jun Young Kim <juneng...@gmail.com> wrote:
>> hi,
>> 
>> Is a single thread allocated to a single output file when a job is trying to
>> write multiple output files?
> 
> At the lower levels, a data streaming thread is indeed run for every
> OutputStream created for writing on the DFS.
> 
> The map task is generally single-threaded unless you multi-thread the
> calls yourself (in which case the record writers are still accessed in a
> synchronized fashion).
> 
>> If the number of output files is 10,000, does Hadoop try to create a thread
>> for each output file?
> 
> Yes, there should be 10,000 threads 'started' for streaming writes
> (though not all of them are really working at the same time, given how
> tasks access their record writers).
> 
> Please correct me if I'm wrong.
> 
> -- 
> Harsh J
> www.harshj.com
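
To illustrate the point above about multi-threading the map calls while
the record writer is still accessed in a synchronized fashion, here is a
minimal sketch in plain Java. It uses no Hadoop classes: SharedWriter,
map and runMapTask are hypothetical names invented for this example.
Hadoop itself ships
org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper (configured via
setMapperClass and setNumberOfThreads), which is probably the place to
start in a real job; the code below only shows the general pattern and is
not how Hadoop implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MultithreadedMapSketch {

    /** Hypothetical stand-in for a task's record writer; not a Hadoop class. */
    static final class SharedWriter {
        private final List<String> out = new ArrayList<>();

        // Only one thread can be inside write() at a time, so the map
        // threads take turns even though they run concurrently.
        synchronized void write(String key, int value) {
            out.add(key + "\t" + value);
        }

        // Returns a copy so callers never see the list while it is changing.
        synchronized List<String> snapshot() {
            return new ArrayList<>(out);
        }
    }

    /** Hypothetical map function: emits (record, length of record). */
    static void map(String record, SharedWriter writer) {
        writer.write(record, record.length());
    }

    /** Runs map() for every record on a fixed pool of numThreads threads. */
    static List<String> runMapTask(List<String> records, int numThreads) {
        SharedWriter writer = new SharedWriter();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (String record : records) {
            pool.execute(() -> map(record, writer));
        }
        pool.shutdown();
        try {
            // Wait for all submitted map calls to finish before reading output.
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("map threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for map threads", e);
        }
        return writer.snapshot();
    }

    public static void main(String[] args) {
        List<String> out = runMapTask(Arrays.asList("alpha", "beta", "gamma"), 2);
        System.out.println(out.size() + " records written");
    }
}
```

With numThreads set to 1 this behaves like the default single-threaded
case. With more threads the output order is no longer guaranteed, which
is one reason to multi-thread only when each map call is slow (for
example, when it waits on an external service).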
