[ https://issues.apache.org/jira/browse/HADOOP-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637015#action_12637015 ]

Joydeep Sen Sarma commented on HADOOP-4296:
-------------------------------------------

We definitely care about the status of completed jobs, and I think most 
installations would, since at least some jobs are always launched 
programmatically by callers that check the return status.

Does the job status store need to scan DFS even when the job status is 
available in memory? Falling back to the persistent store only when the data 
is missing from memory would seem like a good strategy. Another question is 
whether job counters are available from the persisted job status.
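
To make the fallback idea concrete, here is a minimal sketch. It is not taken 
from the JobTracker code: the class JobStatusLookup, its methods, and the 
Function standing in for the DFS-backed store are all hypothetical names, and 
statuses are plain strings for brevity. It only shows the lookup order being 
suggested: memory first, persistent store only on a miss.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class JobStatusLookup {
    // Statuses of jobs that are still held in memory.
    private final Map<String, String> inMemory = new ConcurrentHashMap<>();
    // Stand-in for the DFS-backed store; returns null if nothing is persisted.
    private final Function<String, String> persistentStore;
    // Counts how often the slow path (persistent store) is taken.
    private final AtomicInteger persistentReads = new AtomicInteger();

    JobStatusLookup(Function<String, String> persistentStore) {
        this.persistentStore = persistentStore;
    }

    void putInMemory(String jobId, String status) {
        inMemory.put(jobId, status);
    }

    // Simulates a job being retired from memory.
    void retire(String jobId) {
        inMemory.remove(jobId);
    }

    // Memory first; consult the persistent store only on a miss.
    Optional<String> getStatus(String jobId) {
        String status = inMemory.get(jobId);
        if (status != null) {
            return Optional.of(status);
        }
        persistentReads.incrementAndGet();
        return Optional.ofNullable(persistentStore.apply(jobId));
    }

    int persistentReads() {
        return persistentReads.get();
    }
}
```

With this ordering, a client polling a job that is still in memory never 
touches DFS, and a retired job is still answered as long as its status was 
persisted. A caller gets an empty result only when both lookups miss.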

> Spasm of JobClient failures on successful jobs every once in a while
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4296
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4296
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.1
>            Reporter: Joydeep Sen Sarma
>            Assignee: dhruba borthakur
>            Priority: Critical
>         Attachments: 4296_jt_delayretire.patch
>
>
> At very busy times we get a wave of job client failures all at the same 
> time. The failures come when the job is about to complete. When we look at 
> the job history files, the jobs are actually complete. Here's the stack:
> 08/09/27 02:18:00 INFO mapred.JobClient:  map 100% reduce 98%
> 08/09/27 02:18:41 INFO mapred.JobClient:  map 100% reduce 99% 
> java.lang.NullPointerException
>       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:993)
>       at com.facebook.hive.common.columnSetLoader.main(columnSetLoader.java:535)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
