Inconsistency in Pig Stats reports 
-----------------------------------

                 Key: PIG-2029
                 URL: https://issues.apache.org/jira/browse/PIG-2029
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.1, 0.9.0
            Reporter: Viraj Bhat
             Fix For: 0.8.1, 0.9.0


I have a Pig script that reports different stats across runs for the same M/R 
job with the same inputs. Sometimes PigStats reports every stat for the job 
(Maps, Reduces, MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, 
MinReduceTime and AvgReduceTime) as 0; other times it reports them correctly.

Enclosed are the stderr logs for two runs. In Run 1, job_201103091134_556600 
(alias UNION5) shows 0 in every column, whereas in Run 2 the corresponding 
Hadoop job, job_201104272229_75693, shows valid values. 

The JobTracker page for these jobs shows non-zero values, which points to a bug 
in how the PigStats module interacts with the JobTracker.

Run 1:
{quote}
Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
job_201103091134_556458 160     100     552     191     368     1257    371     392     IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P       DISTINCT,MULTI_QUERY
job_201103091134_556600 0       0       0       0       0       0       0       0       UNION5      MULTI_QUERY,MAP_ONLY        /user/viraj/dir,,
job_201103091134_556601 7       100     17      8       14      200     15      27      CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
job_201103091134_556602 0       0       0       0       0       0       0       0       CNJOIN3,GNJOIN3,sampleNJOIN3    GROUP_BY,COMBINER
job_201103091134_556603 0       0       0       0       0       0       0       0       CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
job_201103091134_556604 2       100     13      7       10      34      13      31      CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
job_201103091134_556644 0       0       0       0       0       0       0       0       ONJOIN15        SAMPLER
job_201103091134_556645 0       0       0       0       0       0       0       0       ONJOIN25        SAMPLER
job_201103091134_556646 0       0       0       0       0       0       0       0       ONJOIN3 SAMPLER
job_201103091134_556654 0       0       0       0       0       0       0       0       ONJOIN19        SAMPLER
job_201103091134_556662 0       0       0       0       0       0       0       0       ONJOIN19        ORDER_BY,COMBINER
..
{quote}


Run 2:
{quote}

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime      MinMapTIme      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
job_201104272229_75503  159     100     484     192     353     396     308     321     IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P       DISTINCT,MULTI_QUERY
job_201104272229_75693  18      0       31      14      24      0       0                   UNION5         MULTI_QUERY,MAP_ONLY /user/viraj/dir,
job_201104272229_75694  7       100     34      13      22      46      20      25      CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
job_201104272229_75695  125     100     19      11      15      32      18      26      CNJOIN3,GNJOIN3,sampleNJOIN3    GROUP_BY,COMBINER
job_201104272229_75698  1       100     12      12      12      13      9       11      CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER
job_201104272229_75702  2       100     21      5       13      35      22      26      CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER
job_201104272229_75724  1       1       4       4       4       11      11      11      ONJOIN15        SAMPLER
job_201104272229_75725  0       0       0       0       0       0       0                   ONJOIN25    SAMPLER
job_201104272229_75726  6       1       8       6       8       24      24      24      ONJOIN3 SAMPLER
job_201104272229_75729  0       0       0       0       0       0       0                   ONJOIN19    SAMPLER
job_201104272229_75752  1       100     5       5       5       12      9       11      ONJOIN19        ORDER_BY,COMBINER
..
{quote}
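
To make the affected jobs easier to spot in logs like the two runs above, here 
is a minimal Python sketch that scans "Job Stats" text and lists the jobs whose 
numeric columns are all 0. It is not part of Pig; the names parse_job_stats, 
all_zero_jobs and SAMPLE are hypothetical, and the parser assumes only the 
layout seen in this report: a job_* token followed by up to eight integer 
columns, possibly wrapped across lines and possibly missing the last column. A 
row of zeros is only a candidate; it still has to be checked against the 
JobTracker page.

```python
import re

# Job ids in the quoted logs look like job_201103091134_556600.
JOB_ID = re.compile(r"^job_\d+_\d+$")

# Maps, Reduces, Max/Min/Avg map time, Max/Min/Avg reduce time.
MAX_STAT_COLUMNS = 8


def parse_job_stats(text):
    """Map each job id to the list of integer columns that follow it.

    Works on whitespace-separated tokens, so it does not matter whether a
    row was wrapped onto several lines by the mail client.
    """
    tokens = text.split()
    stats = {}
    i = 0
    while i < len(tokens):
        if not JOB_ID.match(tokens[i]):
            i += 1
            continue
        job_id = tokens[i]
        values = []
        j = i + 1
        # Collect integers until the first non-numeric token (the alias).
        while (j < len(tokens) and len(values) < MAX_STAT_COLUMNS
               and tokens[j].isdigit()):
            values.append(int(tokens[j]))
            j += 1
        stats[job_id] = values
        i = j
    return stats


def all_zero_jobs(stats):
    """Return job ids whose reported numeric columns are all 0."""
    return [job for job, values in stats.items()
            if values and all(v == 0 for v in values)]


# Two rows copied from Run 1, wrapped as in the original mail.
SAMPLE = """
job_201103091134_556600 0       0       0       0       0       0       0
0       UNION5      MULTI_QUERY,MAP_ONLY        /user/viraj/dir,,
job_201103091134_556601 7       100     17      8       14      200     15
27      CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER
"""

if __name__ == "__main__":
    for job in all_zero_jobs(parse_job_stats(SAMPLE)):
        print(job)
```

Working on tokens rather than lines is a deliberate choice here: it tolerates 
the wrapped rows and the blank AvgReduceTime cells seen in Run 2.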

Viraj
