[ 
https://issues.apache.org/jira/browse/PIG-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988342#action_12988342
 ] 

Richard Ding commented on PIG-1829:
-----------------------------------

The current implementation collects job stats only after each batch of jobs 
(i.e. jobs that can be run in parallel) has finished. A short-term fix could 
be to poll JobControl for finished jobs instead of waiting for all the jobs in 
the batch to complete. This would shorten the time window between a job 
finishing and its stats being queried in cases where the running times of the 
jobs in a batch vary.
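
A minimal sketch of the polling idea, for illustration only. It does not use 
the Hadoop JobControl API: TrackedJob, StatsCollector, SimulatedJob and 
IncrementalStatsPoller are hypothetical names made up for this example, and 
the 10 ms polling interval is arbitrary. The point is only that stats are 
collected for each job on the first polling pass after it finishes, rather 
than after the whole group of parallel jobs has completed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a job handle; NOT the Hadoop JobControl API. */
interface TrackedJob {
    String id();
    boolean isFinished();
}

/** Hypothetical callback that fetches and stores the stats of one job. */
interface StatsCollector {
    void collect(TrackedJob job);
}

/** Simulated job whose completion is flipped by hand, for illustration only. */
final class SimulatedJob implements TrackedJob {
    private final String id;
    private volatile boolean finished;

    SimulatedJob(String id) { this.id = id; }
    void finish() { finished = true; }
    @Override public String id() { return id; }
    @Override public boolean isFinished() { return finished; }
}

final class IncrementalStatsPoller {
    // Ids of jobs already handled, so each job's stats are queried only once.
    private final Set<String> collected = new HashSet<>();

    /** One polling pass; returns how many jobs were newly collected. */
    int pollOnce(List<? extends TrackedJob> batch, StatsCollector collector) {
        int newlyCollected = 0;
        for (TrackedJob job : batch) {
            if (job.isFinished() && collected.add(job.id())) {
                collector.collect(job);
                newlyCollected++;
            }
        }
        return newlyCollected;
    }

    /** True once every job in the batch has had its stats collected. */
    boolean allCollected(List<? extends TrackedJob> batch) {
        for (TrackedJob job : batch) {
            if (!collected.contains(job.id())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        SimulatedJob fast = new SimulatedJob("job_fast");
        SimulatedJob slow = new SimulatedJob("job_slow");
        List<SimulatedJob> batch = Arrays.asList(fast, slow);
        List<String> order = new ArrayList<>();
        IncrementalStatsPoller poller = new IncrementalStatsPoller();

        fast.finish();                 // the fast job completes first
        while (!poller.allCollected(batch)) {
            poller.pollOnce(batch, job -> order.add(job.id()));
            slow.finish();             // the slow job completes one pass later
            Thread.sleep(10);          // polling interval, arbitrary here
        }
        System.out.println(order);     // order in which stats were collected
    }
}
```

In this sketch the fast job's stats are collected on the first pass, while the 
slow job is still running. A real change would replace the simulated job with 
whatever JobControl reports as finished.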

> "0" value seen in PigStat's map/reduce runtime, even when the job is 
> successful
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1829
>                 URL: https://issues.apache.org/jira/browse/PIG-1829
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>             Fix For: 0.9.0
>
>
> The Pig runtime calls JobClient.getMapTaskReports(jobId) and 
> JobClient.getReduceTaskReports(jobId) to get statistics about the number of 
> maps/reducers, as well as the max/min/avg time of these tasks. But from time 
> to time, these calls return empty lists. When that happens, Pig reports 0 
> values for the stats. 
> The jobtracker keeps the stats information only for a limited duration based 
> on the configuration parameters mapred.jobtracker.completeuserjobs.maximum 
> and mapred.job.tracker.retiredjobs.cache.size. Since Pig collects the stats 
> after jobs have finished running, it is possible that the stats for the 
> initial jobs are no longer available. To have a better chance of getting the 
> stats, they should be collected as soon as each job is over. 
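
To illustrate the failure mode described in the issue, here is a small sketch 
of computing max/min/avg task time while keeping "no reports available" 
distinct from a real runtime of 0. TaskTiming and TaskRuntimeSummary are 
hypothetical names for this example; TaskTiming merely stands in for the task 
reports returned by the JobClient calls above and is not Pig's or Hadoop's 
actual code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Optional;

/** Hypothetical stand-in for a task report: start and finish times in ms. */
final class TaskTiming {
    final long startMillis;
    final long finishMillis;

    TaskTiming(long startMillis, long finishMillis) {
        this.startMillis = startMillis;
        this.finishMillis = finishMillis;
    }

    long durationMillis() {
        return finishMillis - startMillis;
    }
}

final class TaskRuntimeSummary {
    /**
     * Returns min/max/avg/count of the task durations, or Optional.empty()
     * when no reports are available, so a caller can report "unknown"
     * instead of a misleading 0.
     */
    static Optional<LongSummaryStatistics> summarize(List<TaskTiming> reports) {
        if (reports == null || reports.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(reports.stream()
                .mapToLong(TaskTiming::durationMillis)
                .summaryStatistics());
    }

    public static void main(String[] args) {
        List<TaskTiming> reports = Arrays.asList(
                new TaskTiming(0, 10), new TaskTiming(5, 35));
        System.out.println(summarize(reports).isPresent());
        System.out.println(summarize(Collections.<TaskTiming>emptyList()).isPresent());
    }
}
```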

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
