In this case the output of the map tasks goes directly to the distributed 
file system, to the path set by FileOutputFormat.setOutputPath(JobConf, 
Path)<https://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/mapred/FileOutputFormat.html#setOutputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)>.
Also, the framework doesn't sort the map outputs before writing them out to HDFS.
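
To make the naming difference concrete, here is a small toy model in plain Java. It is not Hadoop code: the class MapOnlyOutputNaming and the method partFileName are hypothetical names invented for this sketch, and the five-digit zero padding is an assumption based on the part-m-xx / part-r-xx names mentioned in the question below. It only illustrates the rule that, with zero reduce tasks, the final output files carry the map marker because the map tasks write them directly.

```java
import java.util.Locale;

// Toy model only; hypothetical names, not part of the Hadoop API.
class MapOnlyOutputNaming {

    // Returns the assumed final output file name for a given partition.
    // With zero reduce tasks the map task itself writes the final output,
    // so the file carries the 'm' marker; otherwise the reducer writes it
    // and the file carries the 'r' marker.
    static String partFileName(int numReduceTasks, int partition) {
        char taskType = (numReduceTasks == 0) ? 'm' : 'r';
        // Assumption: partition numbers are zero-padded to five digits.
        return String.format(Locale.ROOT, "part-%c-%05d", taskType, partition);
    }

    public static void main(String[] args) {
        // Map-only job (setNumReduceTasks(0)): output comes from the mapper.
        System.out.println(partFileName(0, 0));
        // Job with two reducers: output comes from the reducers.
        System.out.println(partFileName(2, 1));
    }
}
```

In a real job the only change needed to get the map-only behaviour is the setNumReduceTasks(0) call already discussed in this thread; the sketch above just mirrors the resulting file names.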


From: 萝卜丝炒饭 [mailto:1427357...@qq.com]
Sent: Friday, February 03, 2017 12:05 PM
To: ☼ R Nair (रविशंकर नायर); dev; user; user
Subject: Re: No Reducer scenarios

Hi Nair,
Do you know which class it is, please? I tried to find it but failed. I know 
NewDirectOutputCollector is used to write temporary files.

---Original---
From: "☼ R Nair (रविशंकर 
नायर)"<ravishankar.n...@gmail.com<mailto:ravishankar.n...@gmail.com>>
Date: 2017/1/30 13:32:04
To: 
"dev"<d...@spark.apache.org<mailto:d...@spark.apache.org>>;"user"<user@hadoop.apache.org<mailto:user@hadoop.apache.org>>;"user"<u...@spark.apache.org<mailto:u...@spark.apache.org>>;
Subject: No Reducer scenarios

Dear all,


1) When we don't set the reducer class in the driver program, IdentityReducer is 
invoked.

2) When we call setNumReduceTasks(0), no reducer is invoked, not even 
IdentityReducer.

Now, in the second scenario, we observed that the output files are named in the 
part-m-xx format (instead of part-r-xx), which shows that they are map output. 
But we know that map output is always written to the intermediate local file 
system. So which class is responsible for taking these intermediate map outputs 
from the local file system and writing them to HDFS? Does this class perform 
this write only when setNumReduceTasks is set to zero?

Best, Ravion
