[ 
https://issues.apache.org/jira/browse/PIG-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1829:
------------------------------

    Attachment: PIG-1829.patch

The patch polls for completed jobs while the batch is still executing, so that job stats are collected as soon as each job finishes rather than after the whole batch completes.
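
To illustrate the approach, here is a minimal, self-contained sketch of collecting stats as soon as each job completes. It is not the code in the attached patch: ReportSource is a hypothetical stand-in for the JobClient task-report calls, and EagerStatsCollector, JobStats and poll() are names made up for this example.

```java
import java.util.*;

// Sketch only: the names below are hypothetical, not taken from the Pig patch.
class EagerStatsCollector {

    /** Hypothetical stand-in for JobClient.getMapTaskReports(jobId). */
    interface ReportSource {
        // Task durations in ms; an empty list models a job whose
        // reports are no longer available from the jobtracker.
        List<Long> taskDurations(String jobId);
    }

    /** Immutable summary of one job's task durations. */
    static final class JobStats {
        final int tasks;
        final long min;
        final long max;
        final double avg;

        JobStats(List<Long> durations) {
            long lo = Long.MAX_VALUE, hi = Long.MIN_VALUE, sum = 0;
            for (long d : durations) {
                lo = Math.min(lo, d);
                hi = Math.max(hi, d);
                sum += d;
            }
            tasks = durations.size();
            min = lo;
            max = hi;
            avg = (double) sum / tasks;
        }
    }

    private final ReportSource source;
    private final Map<String, JobStats> collected = new LinkedHashMap<>();

    EagerStatsCollector(ReportSource source) {
        this.source = source;
    }

    /** Call on every progress check with the ids of jobs known to be complete. */
    void poll(Collection<String> completedJobIds) {
        for (String id : completedJobIds) {
            if (collected.containsKey(id)) {
                continue; // already captured while the reports were available
            }
            List<Long> durations = source.taskDurations(id);
            if (!durations.isEmpty()) {
                collected.put(id, new JobStats(durations));
            }
            // Empty list: record nothing rather than misleading 0 values.
        }
    }

    /** Returns the captured stats, or null if none were available. */
    JobStats get(String jobId) {
        return collected.get(jobId);
    }
}
```

Calling poll() on every progress check, rather than once after the batch, means a job whose reports later disappear from the jobtracker still has its stats cached, and an empty report list is never recorded as 0 values.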

The output of test-patch:

{code}
     [exec] -1 overall.
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{code}

No new tests are included because this change is hard to test with miniCluster.

> "0" value seen in PigStat's map/reduce runtime, even when the job is 
> successful
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1829
>                 URL: https://issues.apache.org/jira/browse/PIG-1829
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Richard Ding
>             Fix For: 0.9.0
>
>         Attachments: PIG-1829.patch
>
>
> The Pig runtime calls JobClient.getMapTaskReports(jobId) and 
> JobClient.getReduceTaskReports(jobId) to get statistics about the number of 
> maps/reducers, as well as the max/min/avg time of these tasks. But from time 
> to time, these calls return empty lists. When that happens, Pig reports 0 
> values for the stats. 
> The jobtracker keeps the stats information only for a limited duration based 
> on the configuration parameters mapred.jobtracker.completeuserjobs.maximum 
> and mapred.job.tracker.retiredjobs.cache.size. Since Pig collects the stats 
> after all jobs have finished running, it is possible that the stats for the 
> initial jobs are no longer available. To improve the chances of getting the 
> stats, they should be collected as soon as each job is over. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
