Was the command-line output ever really intended to be *that* verbose? I can 
see how it would be useful, but considering how easy it is to interact with the 
web-ui, I can't see why much effort should be put into enhancing it. Even if you 
don't want to see all of the details the web-ui has to offer, it doesn't take 
long to learn how to skim it and get a 10x more accurate reading of your job 
progress.

Matt

-----Original Message-----
From: Arun C Murthy [mailto:a...@hortonworks.com] 
Sent: Sunday, September 18, 2011 11:27 PM
To: common-user@hadoop.apache.org
Subject: Re: phases of Hadoop Jobs

Agreed.

At least, I believe the new web-ui for MRv2 is (or will be soon) more verbose 
about this.

On Sep 18, 2011, at 9:23 PM, Kai Voigt wrote:

> Hi,
> 
> These 0-33-66-100% phases are really confusing to beginners. We see that in 
> our training classes. The output should be more verbose, for example by 
> breaking down the phases into separate progress numbers.
> 
> Does that make sense?
> 
> Am 19.09.2011 um 06:17 schrieb Arun C Murthy:
> 
>> Nan,
>> 
>> The 'phase' is implicitly understood by the 'progress' (value) made by the 
>> map/reduce tasks (see o.a.h.mapred.TaskStatus.Phase).
>> 
>> For example:
>> Reduce: 
>> 0-33% -> Shuffle
>> 34-66% -> Sort (actually, just 'merge', there is no sort in the reduce since 
>> all map-outputs are sorted)
>> 67-100% -> Reduce
>> 
>> With 0.23 onwards the Map has phases too:
>> 0-90% -> Map
>> 91-100% -> Final Sort/merge
>> 
>> Now, about starting reduces early: this is done to ensure shuffle can 
>> proceed for completed maps while the rest of the maps run, thereby pipelining 
>> shuffle and map completion. There is a 'reduce slowstart' feature to control 
>> this. By default, reduces aren't started until 5% of maps are complete. 
>> Users can set this higher.
>> 
>> Arun
>> 
>> On Sep 18, 2011, at 7:24 PM, Nan Zhu wrote:
>> 
>>> Hi, all
>>> 
>>> Recently I was hit by a question: "how is a hadoop job divided into 2
>>> phases?"
>>> 
>>> In textbooks, we are told that mapreduce jobs are divided into 2 phases,
>>> map and reduce, and that reduce is further divided into 3 stages:
>>> shuffle, sort, and reduce. But in the hadoop code I had never thought about
>>> this question, and I didn't see any member variables in the JobInProgress
>>> class that indicate this information.
>>> 
>>> According to my understanding of the hadoop source code, the reduce
>>> tasks are not necessarily held back until all mappers are finished. On the
>>> contrary, we can see reduce tasks in the shuffle stage while there are
>>> mappers which are still running.
>>> So how can I tell which phase the job is in?
>>> 
>>> Thanks
>>> -- 
>>> Nan Zhu
>>> School of Electronic, Information and Electrical Engineering,229
>>> Shanghai Jiao Tong University
>>> 800,Dongchuan Road,Shanghai,China
>>> E-Mail: zhunans...@gmail.com
>> 
>> 
> 
> -- 
> Kai Voigt
> k...@123.org
> 
> 
> 
> 
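
To make the ranges in Arun's quoted reply concrete, here is a minimal Python
sketch. It is not Hadoop code: the function names (reduce_phase, map_phase,
reduces_may_start) are hypothetical and exist only for illustration. The
thresholds are the approximate ones quoted above (thirds for reduce tasks, 90%
for map tasks from 0.23 onwards, a 5% slowstart default), and the exact
handling of boundary values is an assumption.

```python
def _check_percent(progress_pct):
    """Reject progress values outside 0..100."""
    if not 0 <= progress_pct <= 100:
        raise ValueError("progress must be between 0 and 100")


def reduce_phase(progress_pct):
    """Guess a reduce task's phase from its progress percentage.

    Ranges follow the quoted message: 0-33 shuffle, 34-66 merge
    (reported as 'sort'), 67-100 reduce. Boundary handling is assumed.
    """
    _check_percent(progress_pct)
    if progress_pct <= 33:
        return "shuffle"
    if progress_pct <= 66:
        return "merge"
    return "reduce"


def map_phase(progress_pct):
    """Guess a map task's phase (0.23 onwards, per the quoted message)."""
    _check_percent(progress_pct)
    if progress_pct <= 90:
        return "map"
    return "sort/merge"


def reduces_may_start(completed_maps, total_maps, slowstart=0.05):
    """Illustrate the 'reduce slowstart' rule described in the thread.

    Assumption: reduces become eligible once the completed fraction of
    maps reaches the slowstart fraction (5% by default in the thread).
    """
    if total_maps <= 0:
        raise ValueError("total_maps must be positive")
    return completed_maps / total_maps >= slowstart
```

For example, reduce_phase(50) returns "merge", and with 100 maps in total
reduces_may_start(4, 100) is False while reduces_may_start(10, 100) is True.
This also shows why a job can have reduces in the shuffle range while maps are
still running: the slowstart check only needs a fraction of the maps to finish.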

