[ 
https://issues.apache.org/jira/browse/AMBARI-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219937#comment-15219937
 ] 

Hadoop QA commented on AMBARI-15389:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12796288/AMBARI-15389.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-web.

Test results: 
https://builds.apache.org/job/Ambari-trunk-test-patch/6123//testReport/
Console output: 
https://builds.apache.org/job/Ambari-trunk-test-patch/6123//console

This message is automatically generated.

> Intermittent YARN service check failures during and post EU
> -----------------------------------------------------------
>
>                 Key: AMBARI-15389
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15389
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.2.2
>            Reporter: Dmitry Lysnichenko
>            Assignee: Antonenko Alexander
>             Fix For: 2.2.2
>
>         Attachments: AMBARI-15389.patch, AMBARI-15389.patch, 
> AMBARI-15389_2.2.patch
>
>
> Build # - Ambari 2.2.1.1 - #63
> Observed this issue in a couple of recent EU runs, where the YARN service 
> check reported a failure:
> a. In one test, the EU ran from HDP 2.3.4.0 to 2.4.0.0 and the YARN service 
> check failed during the EU itself; retrying the operation led to the service 
> check succeeding.
> b. In another test, the YARN service check failed when it was run after the 
> EU; when I ran it again afterwards, it succeeded.
> It looks like some corner-case condition triggers this issue.
> {code}
> stderr:   /var/lib/ambari-agent/data/errors-822.txt
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 142, in <module>
>     ServiceCheck().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 104, in service_check
>     user=params.smokeuser,
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of '/usr/bin/kinit -kt 
> /etc/security/keytabs/smokeuser.headless.keytab ambari...@example.com; yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls 
> -num_containers 1 -jar 
> /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar'
>  returned 2. ######## Hortonworks #############
> This is MOTD message, added for testing in qe infra
> 16/03/03 02:33:51 INFO impl.TimelineClientImpl: Timeline service address: 
> http://host:8188/ws/v1/timeline/
> 16/03/03 02:33:51 INFO distributedshell.Client: Initializing Client
> 16/03/03 02:33:51 INFO distributedshell.Client: Running Client
> 16/03/03 02:33:51 INFO client.RMProxy: Connecting to ResourceManager at 
> host-9-5.test/127.0.0.254:8050
> 16/03/03 02:33:53 INFO distributedshell.Client: Got Cluster metric info from 
> ASM, numNodeManagers=3
> 16/03/03 02:33:53 INFO distributedshell.Client: Got Cluster node info from ASM
> 16/03/03 02:33:53 INFO distributedshell.Client: Got node report from ASM for, 
> nodeId=host:25454, nodeAddresshost:8042, nodeRackName/default-rack, 
> nodeNumContainers1
> 16/03/03 02:33:53 INFO distributedshell.Client: Got node report from ASM for, 
> nodeId=host-9-5.test:25454, nodeAddresshost-9-5.test:8042, 
> nodeRackName/default-rack, nodeNumContainers0
> 16/03/03 02:33:53 INFO distributedshell.Client: Got node report from ASM for, 
> nodeId=host-9-1.test:25454, nodeAddresshost-9-1.test:8042, 
> nodeRackName/default-rack, nodeNumContainers0
> 16/03/03 02:33:53 INFO distributedshell.Client: Queue info, 
> queueName=default, queueCurrentCapacity=0.083333336, queueMaxCapacity=1.0, 
> queueApplicationCount=0, queueChildQueueCount=0
> 16/03/03 02:33:53 INFO distributedshell.Client: User ACL Info for Queue, 
> queueName=root, userAcl=SUBMIT_APPLICATIONS
> 16/03/03 02:33:53 INFO distributedshell.Client: User ACL Info for Queue, 
> queueName=default, userAcl=SUBMIT_APPLICATIONS
> 16/03/03 02:33:53 INFO distributedshell.Client: Max mem capabililty of 
> resources in this cluster 10240
> 16/03/03 02:33:53 INFO distributedshell.Client: Max virtual cores capabililty 
> of resources in this cluster 1
> 16/03/03 02:33:53 INFO distributedshell.Client: Copy App Master jar from 
> local filesystem and add to local environment
> 16/03/03 02:33:53 INFO distributedshell.Client: Set the environment for the 
> application master
> 16/03/03 02:33:53 INFO distributedshell.Client: Setting up app master command
> 16/03/03 02:33:53 INFO distributedshell.Client: Completed setting up app 
> master command {{JAVA_HOME}}/bin/java -Xmx10m 
> org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster 
> --container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 
> 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr
> 16/03/03 02:33:53 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 290 for ambari-qa on 127.0.0.235:8020
> 16/03/03 02:33:53 INFO distributedshell.Client: Got dt for 
> hdfs://host-9-1.test:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 
> 127.0.0.235:8020, Ident: (HDFS_DELEGATION_TOKEN token 290 for ambari-qa)
> 16/03/03 02:33:53 INFO distributedshell.Client: Submitting application to ASM
> 16/03/03 02:33:54 INFO impl.YarnClientImpl: Submitted application 
> application_1456970141888_0011
> 16/03/03 02:33:55 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:33:56 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:33:57 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:33:58 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:33:59 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:00 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:01 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:02 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:03 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
> appStartTime=1456972434150, yarnAppState=ACCEPTED, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:04 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=host-9-1/127.0.0.235, appQueue=default, 
> appMasterRpcPort=-1, appStartTime=1456972434150, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:05 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=host-9-1/127.0.0.235, appQueue=default, 
> appMasterRpcPort=-1, appStartTime=1456972434150, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:06 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=host-9-1/127.0.0.235, appQueue=default, 
> appMasterRpcPort=-1, appStartTime=1456972434150, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:07 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=host-9-1/127.0.0.235, appQueue=default, 
> appMasterRpcPort=-1, appStartTime=1456972434150, yarnAppState=RUNNING, 
> distributedFinalState=UNDEFINED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:08 INFO distributedshell.Client: Got application report from 
> ASM for, appId=11, clientToAMToken=Token { kind: YARN_CLIENT_TOKEN, service:  
> }, appDiagnostics=, appMasterHost=host-9-1/127.0.0.235, appQueue=default, 
> appMasterRpcPort=-1, appStartTime=1456972434150, yarnAppState=FINISHED, 
> distributedFinalState=FAILED, 
> appTrackingUrl=http://host-9-5.test:8088/proxy/application_1456970141888_0011/,
>  appUser=ambari-qa
> 16/03/03 02:34:08 INFO distributedshell.Client: Application did finished 
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring 
> loop
> 16/03/03 02:34:08 ERROR distributedshell.Client: Application failed to 
> complete successfully
> stdout:   /var/lib/ambari-agent/data/output-822.txt
> 2016-03-03 02:33:47,974 - Using hadoop conf dir: 
> /usr/hdp/current/hadoop-client/conf
> 2016-03-03 02:33:48,013 - Using hadoop conf dir: 
> /usr/hdp/current/hadoop-client/conf
> 2016-03-03 02:33:48,018 - checked_call['/usr/bin/kinit -kt 
> /etc/security/keytabs/smokeuser.headless.keytab ambari...@example.com; yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls 
> -num_containers 1 -jar 
> /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar']
>  {'path': '/usr/sbin:/sbin:/usr/local/bin:/bin:/usr/bin', 'user': 'ambari-qa'}
> {code}
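
The quoted client log shows the application moving from ACCEPTED to RUNNING and then finishing with distributedFinalState=FAILED. For triage, those transitions can be pulled out of a long log automatically. The following is only a minimal sketch, assuming the log is available as plain text; the function name `state_transitions` and the `sample` string are hypothetical and not part of Ambari or YARN.

```python
import re

# Matches the two state fields as they appear in the distributedshell.Client
# "Got application report" lines. The optional "> " allows for the quote
# prefix that appears when the line is wrapped in a quoted email.
STATE_RE = re.compile(
    r"yarnAppState=(\w+),\s*(?:>\s*)?distributedFinalState=(\w+)"
)


def state_transitions(log_text):
    """Return the distinct (yarnAppState, distributedFinalState) pairs, in order."""
    transitions = []
    for pair in STATE_RE.findall(log_text):
        # Record a pair only when it differs from the previous one,
        # so repeated polling lines collapse into a single entry.
        if not transitions or transitions[-1] != pair:
            transitions.append(pair)
    return transitions


# Hypothetical excerpt shaped like the wrapped log lines in this report.
sample = (
    "appStartTime=1456972434150, yarnAppState=ACCEPTED, \n"
    "> distributedFinalState=UNDEFINED, \n"
    "appStartTime=1456972434150, yarnAppState=ACCEPTED, \n"
    "> distributedFinalState=UNDEFINED, \n"
    "appStartTime=1456972434150, yarnAppState=RUNNING, \n"
    "> distributedFinalState=UNDEFINED, \n"
    "appStartTime=1456972434150, yarnAppState=FINISHED, \n"
    "> distributedFinalState=FAILED, \n"
)

for yarn_state, final_state in state_transitions(sample):
    print(yarn_state, final_state)
```

Collapsing repeated polling lines keeps the output short enough to compare a failing run against a successful retry at a glance.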



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
