[ https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035010#comment-13035010 ]

Richard Ding commented on PIG-2029:
-----------------------------------

Currently, Pig prints zero (0) when the max/min/avg map/reduce times cannot
be obtained from Hadoop through the Hadoop client API. This is misleading,
since a zero reads as a measured value. I propose that we change those
values to 'n/a', as follows:

{code}
Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
job_201104272229_434232 2       10      354     220     287     168     149     163     IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P       DISTINCT,MULTI_QUERY
job_201104272229_434319 2       0       9       3       6       0       0       0       UNION5  MULTI_QUERY,MAP_ONLY    /user/rding/verifypigstats2-UNION5,
job_201104272229_434320 2       10      n/a     n/a     n/a     n/a     n/a     n/a     CNJOIN3,GNJOIN3,sampleNJOIN3    GROUP_BY,COMBINER
job_201104272229_434321 1       10      5       5       5       23      9       17      CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
job_201104272229_434322 2       10      n/a     n/a     n/a     n/a     n/a     n/a     CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
job_201104272229_434323 2       10      n/a     n/a     n/a     n/a     n/a     n/a     CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
job_201104272229_434331 2       1       n/a     n/a     n/a     n/a     n/a     n/a     ONJOIN15        SAMPLER
job_201104272229_434332 2       1       n/a     n/a     n/a     n/a     n/a     n/a     ONJOIN3 SAMPLER
job_201104272229_434333 1       1       2       2       2       13      13      13      ONJOIN25        SAMPLER
job_201104272229_434334 1       1       1       1       1       12      12      12      ONJOIN19        SAMPLER
job_201104272229_434342 1       10      2       2       2       16      8       11      ONJOIN25        ORDER_BY,COMBINER
{code}
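
A minimal sketch of the formatting rule proposed above, not the actual patch. It assumes an unavailable time is carried as a negative sentinel; the names JobStatsRow, UNAVAILABLE, formatTime and formatRow are hypothetical and are not part of Pig's PigStats API.

```java
import java.util.StringJoiner;

// Hypothetical helper, not Pig's real PigStats/JobStats code.
final class JobStatsRow {
    // Assumed sentinel meaning "Hadoop did not return this time".
    static final long UNAVAILABLE = -1L;

    // Render one time column: a real measurement (including 0) is printed
    // as a number, an unavailable one as "n/a".
    static String formatTime(long seconds) {
        return seconds < 0 ? "n/a" : Long.toString(seconds);
    }

    // Build one tab-separated row in the column order of the report:
    // JobId, Maps, Reduces, six time columns, Alias, Feature, Outputs.
    static String formatRow(String jobId, int maps, int reduces, long[] times,
                            String alias, String feature, String outputs) {
        StringJoiner row = new StringJoiner("\t");
        row.add(jobId);
        row.add(Integer.toString(maps));
        row.add(Integer.toString(reduces));
        for (long t : times) {
            row.add(formatTime(t));
        }
        row.add(alias);
        row.add(feature);
        row.add(outputs);
        return row.toString();
    }

    public static void main(String[] args) {
        long[] unknown = {UNAVAILABLE, UNAVAILABLE, UNAVAILABLE,
                          UNAVAILABLE, UNAVAILABLE, UNAVAILABLE};
        long[] known = {5, 5, 5, 23, 9, 17};
        System.out.println(formatRow("job_201104272229_434320", 2, 10, unknown,
                "CNJOIN3,GNJOIN3,sampleNJOIN3", "GROUP_BY,COMBINER", ""));
        System.out.println(formatRow("job_201104272229_434321", 1, 10, known,
                "CNJOIN25,GNJOIN25,sampleNJOIN25", "GROUP_BY,COMBINER", ""));
    }
}
```

Keeping the sentinel separate from 0 means a map-only job can still report 0 for its reduce times, as job_201104272229_434319 does in the sample output, while a failed lookup is shown as 'n/a'.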

> Inconsistency in Pig Stats reports 
> -----------------------------------
>
>                 Key: PIG-2029
>                 URL: https://issues.apache.org/jira/browse/PIG-2029
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Viraj Bhat
>            Assignee: Richard Ding
>             Fix For: 0.10
>
>
> I have a Pig script that reports varying stats for the same M/R job (same
> inputs). Sometimes PigStats reports all the stats (Maps, Reduces,
> MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime, and
> AvgReduceTime) for the M/R job as 0; sometimes it reports them correctly.
> Enclosed are the stderr logs for 2 runs. In Run 1,
> job_201103091134_556600 has 0 in all the columns, whereas in Run 2 the
> Hadoop job job_201104272229_75693 has some valid values.
> The actual JobTracker page shows that these stats are not empty. This
> points to a bug in the interaction of the PigStats module with the
> JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201103091134_556458       160     100     552     191     368     1257    371     392     IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P       DISTINCT,MULTI_QUERY
> job_201103091134_556600       0       0       0       0       0       0       0       0       UNION5      MULTI_QUERY,MAP_ONLY        /user/viraj/dir,,
> job_201103091134_556601       7       100     17      8       14      200     15      27      CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
> job_201103091134_556602       0       0       0       0       0       0       0       0       CNJOIN3,GNJOIN3,sampleNJOIN3    GROUP_BY,COMBINER
> job_201103091134_556603       0       0       0       0       0       0       0       0       CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
> job_201103091134_556604       2       100     13      7       10      34      13      31      CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
> job_201103091134_556644       0       0       0       0       0       0       0       0       ONJOIN15        SAMPLER
> job_201103091134_556645       0       0       0       0       0       0       0       0       ONJOIN25        SAMPLER
> job_201103091134_556646       0       0       0       0       0       0       0       0       ONJOIN3 SAMPLER
> job_201103091134_556654       0       0       0       0       0       0       0       0       ONJOIN19        SAMPLER
> job_201103091134_556662       0       0       0       0       0       0       0       0       ONJOIN19        ORDER_BY,COMBINER
> ..
> {quote}
> Run 2:
> {quote}
> Job Stats (time in seconds):
> JobId Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201104272229_75503        159     100     484     192     353     396     308     321     IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P       DISTINCT,MULTI_QUERY
> job_201104272229_75693        18      0       31      14      24      0       0                   UNION5         MULTI_QUERY,MAP_ONLY /user/viraj/dir,
> job_201104272229_75694        7       100     34      13      22      46      20      25      CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
> job_201104272229_75695        125     100     19      11      15      32      18      26      CNJOIN3,GNJOIN3,sampleNJOIN3    GROUP_BY,COMBINER
> job_201104272229_75698        1       100     12      12      12      13      9       11      CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
> job_201104272229_75702        2       100     21      5       13      35      22      26      CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
> job_201104272229_75724        1       1       4       4       4       11      11      11      ONJOIN15        SAMPLER
> job_201104272229_75725        0       0       0       0       0       0       0                   ONJOIN25    SAMPLER
> job_201104272229_75726        6       1       8       6       8       24      24      24      ONJOIN3 SAMPLER
> job_201104272229_75729        0       0       0       0       0       0       0                   ONJOIN19    SAMPLER
> job_201104272229_75752        1       100     5       5       5       12      9       11      ONJOIN19        ORDER_BY,COMBINER
> ..
> {quote}
> Viraj

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
