[ 
https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760121#action_12760121
 ] 

Ashutosh Chauhan commented on PIG-948:
--------------------------------------

@Daniel

bq. Also I notice in many cases we cannot get first job id correctly (job id is 
null in this case). If I change sleepTime (MapReduceLauncher.java:100) from 500 
to 1000 (ms), things look fine. Does anyone else also see that? 

The reason is that JobControlCompiler compiles a set of inter-dependent MR jobs 
and generates a job-control object, which is then submitted asynchronously to 
Hadoop for execution. Since we don't block on that thread, it's possible that 
the job ids are not yet assigned when we ask for them. Setting the sleep time to 
a higher value such as 1000 ms should be sufficient in most cases. Note that 
increasing this sleep time doesn't affect execution in any way, since we are 
sleeping in a thread that only does reporting. Another approach, fool-proof 
though more complicated, is to sleep for a shorter duration, check whether the 
ids are assigned, and if not, sleep again in a while loop until they are.
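
To make the second approach concrete, here is a minimal sketch of such a 
polling loop. It is not the code in MapReduceLauncher or in either patch: the 
class JobIdPoller, the method waitForJobId, and the Supplier standing in for 
"ask the job for its id" are all hypothetical names chosen for illustration. 
The timeout is also an assumption, added so the reporting thread cannot spin 
forever if an id never shows up.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper, not part of Pig: polls for an id that is assigned asynchronously.
class JobIdPoller {

    /**
     * Checks idSource every pollMillis until it returns a non-null id
     * or timeoutMillis has elapsed. Returns null on timeout.
     */
    static String waitForJobId(Supplier<String> idSource, long pollMillis, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            String id = idSource.get();
            if (id != null) {
                return id;                      // id has been assigned
            }
            if (System.nanoTime() - deadline >= 0) {
                return null;                    // give up, caller decides what to report
            }
            Thread.sleep(pollMillis);           // short sleep, then check again
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger checks = new AtomicInteger();
        // Stand-in for a job whose id is still null for the first three checks.
        Supplier<String> fakeJob = () -> checks.incrementAndGet() > 3 ? "job_0001" : null;

        String id = waitForJobId(fakeJob, 10, 1000);
        System.out.println("job id: " + id + " after " + checks.get() + " checks");
    }
}
```

Since the sleep happens only in the reporting thread, a short poll interval 
costs little, and the loop returns as soon as the id is available instead of 
always waiting a fixed 1000 ms.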

> [Usability] Relating pig script with MR jobs
> --------------------------------------------
>
>                 Key: PIG-948
>                 URL: https://issues.apache.org/jira/browse/PIG-948
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.4.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: pig-948-2.patch, pig-948.patch
>
>
> Currently it's hard to relate a pig script to specific MR jobs. 
> In a loaded cluster with multiple simultaneous job submissions, it's not easy 
> to figure out which specific MR jobs were launched for a given pig script. If 
> Pig can provide this info, it will be useful for debugging and monitoring the 
> jobs resulting from a pig script.
> At the very least, Pig should be able to provide the user with the following 
> information:
> 1) Job id of the launched job.
> 2) Complete web URL of the jobtracker running this job. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
