[ https://issues.apache.org/jira/browse/AMBARI-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sumit Mohanty resolved AMBARI-19930.
------------------------------------
Resolution: Fixed

Committed to branch-2.5

> The service check status was set to TIMEOUT even if service check was failed
> ----------------------------------------------------------------------------
>
> Key: AMBARI-19930
> URL: https://issues.apache.org/jira/browse/AMBARI-19930
> Project: Ambari
> Issue Type: Bug
> Reporter: Yesha Vora
> Assignee: Myroslav Papirkovskyi
>
> Steps to reproduce:
> * Install a cluster with Hadoop, Tez, HBase, Hive, Spark
> * Enable Wire encryption
> * Run Tez service check
> Here, agent.service.check.task.timeout is set to 600 sec. The Tez application
> was started in the background. The service check then looks for the SUCCESS
> file for only a couple of minutes. In this particular instance, the
> application took 5 minutes to run, so the check for the SUCCESS file on HDFS
> failed.
> In this scenario, the status of the service check should be Failed instead of
> Timeout.
> {code}
> stderr: /var/lib/ambari-agent/data/errors-370.txt
> stdout: /var/lib/ambari-agent/data/output-370.txt
> 2017-02-08 03:55:55,017 -
> HdfsResource['/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz'] {'security_enabled':
> True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab':
> '/etc/security/keytabs/hdfs.headless.keytab', 'source':
> '/usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz', 'dfs_type': '', 'default_fs':
> 'hdfs://host:8020', 'replace_existing_files': False,
> 'hdfs_resource_ignore_file':
> '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ...,
> 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'h...@example.com',
> 'user': 'hdfs', 'owner': 'hdfs', 'group': 'hadoop', 'hadoop_conf_dir':
> '/usr/hdp/current/hadoop-client/conf', 'type': 'file', 'action':
> ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse',
> u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0444}
> 2017-02-08 03:55:55,017 - Execute['/usr/bin/kinit -kt
> /etc/security/keytabs/hdfs.headless.keytab h...@example.com'] {'user': 'hdfs'}
> 2017-02-08 03:55:55,096 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c
> 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET --negotiate -u : -k
> '"'"'https://host:50470/webhdfs/v1/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz?op=GETFILESTATUS&user.name=hdfs'"'"'
> 1>/tmp/tmpoIadeN 2>/tmp/tmp6nFiLj''] {'logoutput': None, 'quiet': False}
> 2017-02-08 03:55:55,292 - call returned (0, '')
> 2017-02-08 03:55:55,293 - DFS file /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz is
> identical to /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz, skipping the copying
> 2017-02-08 03:55:55,293 - Will attempt to copy tez tarball from
> /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz to DFS at
> /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz.
> 2017-02-08 03:55:55,293 - HdfsResource[None] {'security_enabled': True,
> 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab':
> '/etc/security/keytabs/hdfs.headless.keytab', 'dfs_type': '', 'default_fs':
> 'hdfs://host:8020', 'hdfs_resource_ignore_file':
> '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ...,
> 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'h...@example.com',
> 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir':
> '/usr/hdp/current/hadoop-client/conf', 'immutable_paths':
> [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp']}
> 2017-02-08 03:55:55,294 - Execute['/usr/bin/kinit -kt
> /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-...@example.com;']
> {'user': 'ambari-qa'}
> 2017-02-08 03:55:55,389 - ExecuteHadoop['jar
> /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount
> /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'try_sleep': 5,
> 'tries': 3, 'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'user':
> 'ambari-qa', 'conf_dir': '/usr/hdp/current/hadoop-client/conf'}
> 2017-02-08 03:55:55,390 - Execute['hadoop --config
> /usr/hdp/current/hadoop-client/conf jar
> /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount
> /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'logoutput': None,
> 'try_sleep': 5, 'environment': {}, 'tries': 3, 'user': 'ambari-qa', 'path':
> ['/usr/hdp/current/hadoop-client/bin']}
> {code}
> {code}
> Requests: {
>   aborted_task_count: 0,
>   cluster_name: "cl1",
>   completed_task_count: 1,
>   create_time: 1486526151743,
>   end_time: 1486526463038,
>   exclusive: false,
>   failed_task_count: 0,
>   id: 29,
>   inputs: "{}",
>   operation_level: null,
>   progress_percent: 100,
>   queued_task_count: 0,
>   request_context: "WE API TEZ Service Check",
>   request_schedule: null,
>   request_status: "TIMEDOUT",
>   resource_filters: [
>     {
>       service_name: "TEZ"
>     }
>   ],
>   start_time: 1486526151751,
>   task_count: 1,
>   timed_out_task_count: 1,
>   type: "COMMAND"
> },
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
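
As a rough cross-check of the numbers quoted in the issue, the sketch below computes how long the request ran from its start_time and end_time and compares that with the 600 sec agent.service.check.task.timeout. The function names and the classification rule are hypothetical, written only to illustrate the behaviour the reporter expects; this is not Ambari's actual code, and the "COMPLETED" and "FAILED" strings are assumed status labels.

```python
# Values copied from the request JSON and the issue description.
START_TIME_MS = 1486526151751   # start_time
END_TIME_MS = 1486526463038     # end_time
TASK_TIMEOUT_S = 600            # agent.service.check.task.timeout


def elapsed_seconds(start_ms, end_ms):
    """Convert a pair of epoch-millisecond timestamps to elapsed seconds."""
    return (end_ms - start_ms) / 1000.0


def expected_status(elapsed_s, timeout_s, check_succeeded):
    """Hypothetical rule: only report a timeout if the timeout was reached."""
    if elapsed_s >= timeout_s:
        return "TIMEDOUT"
    return "COMPLETED" if check_succeeded else "FAILED"


elapsed = elapsed_seconds(START_TIME_MS, END_TIME_MS)
print(round(elapsed, 3))  # 311.287
print(expected_status(elapsed, TASK_TIMEOUT_S, check_succeeded=False))  # FAILED
```

Under these assumptions the request finished after about 311 seconds, well short of the 600 second timeout, which is consistent with the reporter's view that the check failed rather than timed out.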