[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104836#comment-13104836
 ] 

Devaraj K commented on MAPREDUCE-2925:
--------------------------------------

Thanks, Arun, for the review and the suggestion.

There are a few problems around this.

1. If the RM doesn't return an application report, the client throws a 
NullPointerException. 
        This can be handled by redirecting to the History Server, as it may 
still be aware of the application.

        
2. After redirecting to the History Server, if the History Server doesn't have 
information about the application (or fails to return it for some other 
reason), the client goes into an infinite loop and keeps printing the message. 

        I have faced a similar problem. The RM returns the application report 
with the status as success, and the client then redirects to the History 
Server. The History Server is not able to find the application info, so it 
throws an exception. That gets wrapped in an InvocationTargetException, and the 
client retries infinitely.

3. If anything other than 'YarnRemoteException' or 
'InvocationTargetException' is thrown, the client also retries infinitely. This 
needs to break at some point.

Here we need to differentiate remote-end exceptions from connection failures to 
the RM/AM/HS. If it is a remote-end exception, it can be reported directly. If 
it is a connection failure, the retry can happen in the RPC layer, and the 
failure can be reported once the retries are exhausted.
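
As a rough illustration of the distinction above, here is a minimal sketch. It 
is not the actual ClientServiceDelegate code or the attached patch. The names 
BoundedRetry, RemoteEndException and maxAttempts are hypothetical, and a plain 
IOException is assumed to stand in for a connection failure. Remote-end errors 
are reported immediately, connection failures are retried a bounded number of 
times, and any other exception propagates instead of looping.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class BoundedRetry {

    /**
     * Hypothetical stand-in for an error returned by the remote end,
     * e.g. the History Server reporting that it does not know the job.
     */
    public static class RemoteEndException extends Exception {
        public RemoteEndException(String message) {
            super(message);
        }
    }

    /**
     * Runs the call, retrying only on connection failures (modelled here
     * as IOException) and giving up after maxAttempts attempts.
     */
    public static <T> T invoke(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (RemoteEndException e) {
                // The remote end answered with an error: report it directly.
                throw e;
            } catch (IOException e) {
                // Assumed to be a connection failure: remember it and retry.
                lastFailure = e;
            }
            // Any other exception is not caught and ends the loop at once.
        }
        throw new IOException("Giving up after " + maxAttempts + " attempts", lastFailure);
    }
}
```

With this shape, a History Server lookup that keeps failing would surface as a 
single error to the caller of job -status rather than as an endless stream of 
"Will retry.." messages.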

Please provide your suggestions.


> job -status <JOB_ID> is giving continuously info message for completed jobs 
> on the console
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2925
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2925
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>             Fix For: 0.23.0, 0.24.0
>
>         Attachments: MAPREDUCE-2925.patch
>
>
> The message below is printed continuously on the console.
> {code:xml}
> 11/09/02 16:00:00 INFO mapred.ClientServiceDelegate: Failed to contact AM for 
> job job_1314955256658_0009  Will retry..
> 11/09/02 16:00:00 INFO mapred.ClientServiceDelegate: Application state is 
> completed. Redirecting to job history server null
> 11/09/02 16:00:00 INFO mapred.ClientServiceDelegate: Failed to contact AM for 
> job job_1314955256658_0009  Will retry..
> 11/09/02 16:00:00 INFO mapred.ClientServiceDelegate: Application state is 
> completed. Redirecting to job history server null
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
