[ https://issues.apache.org/jira/browse/TEZ-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740168#comment-14740168 ]

TezQA commented on TEZ-2724:
----------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12755294/TEZ-2724-2.patch
  against master revision b288be7.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests:
                   org.apache.tez.test.TestFaultTolerance

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1110//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1110//console

This message is automatically generated.

> Tez Client keeps on showing old status when application is finished but RM is shutdown
> --------------------------------------------------------------------------------------
>
>                 Key: TEZ-2724
>                 URL: https://issues.apache.org/jira/browse/TEZ-2724
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.4
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-2724-1.patch, TEZ-2724-2.patch, amrecovery_mutlipleamrestart.txt
>
>
> From the logs, it seems the IPC retry interval is set to 20 seconds and the
> IPC max retries to 45. This means that the client will keep retrying the RPC
> connection for a total of 900 (20*45) seconds. During this period, the
> application may already have completed and an RM restart may be triggered, as
> noted in the JIRA description. I think RM recovery is not enabled, so even
> after the new RM has started, the original application info is lost. As a
> result, the client can never get the correct application report and keeps
> showing the old status forever.
> {code}
> 15/05/07 19:13:43 INFO ipc.Client: Retrying connect to server: maint22-tez12/100.79.80.19:52822. Already tried 26 time(s); maxRetries=45
> Deleted /user/hadoopqa/Input1
> RUNNING: call D:\hdp\hadoop-2.6.0.2.2.6.0-2782\bin\hdfs.cmd dfs -ls /user/hadoopqa/Input2
> RUNNING: call D:\hdp\hadoop-2.6.0.2.2.6.0-2782\bin\hdfs.cmd dfs  -rm -r -skipTrash /user/hadoopqa/Input2
> 15/05/07 19:14:03 INFO ipc.Client: Retrying connect to server: maint22-tez12/100.79.80.19:52822. Already tried 27 time(s); maxRetries=45
> {code}
> Configuration to reproduce this issue:
> * disable generic application history (yarn.timeline-service.generic-application-history.enabled)
> * disable RM recovery (yarn.resourcemanager.recovery.enabled)
> * increase the IPC retry interval and max retries (ipc.client.connect.retry.interval & ipc.client.connect.max.retries)
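
The retry arithmetic in the description can be checked with a small sketch. It is only an illustration: the function names are hypothetical and are not part of Tez or Hadoop, the interval and retry count are the values quoted from the logs, and the model assumes each attempt costs exactly one retry interval (it ignores any per-attempt connect timeout).

```python
from datetime import datetime

# Values quoted in the issue description: 20 s between attempts, 45 attempts.
RETRY_INTERVAL_S = 20
MAX_RETRIES = 45


def retry_window_seconds(interval_s: int, max_retries: int) -> int:
    """Approximate total time the client keeps retrying one connection."""
    return interval_s * max_retries


def remaining_seconds(already_tried: int, interval_s: int, max_retries: int) -> int:
    """Approximate retry time left after `already_tried` failed attempts."""
    return max(max_retries - already_tried, 0) * interval_s


# Timestamps of the two "Retrying connect to server" lines in the log excerpt.
LOG_TIME_FORMAT = "%y/%m/%d %H:%M:%S"
first = datetime.strptime("15/05/07 19:13:43", LOG_TIME_FORMAT)
second = datetime.strptime("15/05/07 19:14:03", LOG_TIME_FORMAT)
observed_interval_s = int((second - first).total_seconds())

print(retry_window_seconds(RETRY_INTERVAL_S, MAX_RETRIES))   # total window
print(observed_interval_s)                                   # gap between log lines
print(remaining_seconds(27, RETRY_INTERVAL_S, MAX_RETRIES))  # after try 27
```

Under these assumptions the gap between the two logged attempts matches the 20-second interval, and a client that has already tried 27 times would still spend several more minutes retrying before it gives up.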



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
