[ https://issues.apache.org/jira/browse/METRON-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Allen updated METRON-894:
------------------------------
    Summary: Ambari "Restart Metron Parsers" Fails If Any Parser Not Running  (was: Ambari "Restart Metron Parsers" Fails If YAF Parser Not Running)

> Ambari "Restart Metron Parsers" Fails If Any Parser Not Running
> ---------------------------------------------------------------
>
> Key: METRON-894
> URL: https://issues.apache.org/jira/browse/METRON-894
> Project: Metron
> Issue Type: Bug
> Affects Versions: 0.3.1
> Reporter: Nick Allen
> Priority: Minor
>
> The "Restart Metron Parsers" action failed in Ambari. The "stop" portion of the "restart" failed because the YAF topology was not running, and that failure aborted the whole action. A topology that is already stopped should not be treated as an error condition.
> I was able to work around this by using a "start" operation instead of a "restart".
> {code}
> stderr: /var/lib/ambari-agent/data/errors-966.txt
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_master.py", line 93, in <module>
>     ParserMaster().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_master.py", line 81, in restart
>     commands.restart_parser_topologies(env)
>   File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_commands.py", line 146, in restart_parser_topologies
>     self.stop_parser_topologies()
>   File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_commands.py", line 141, in stop_parser_topologies
>     Execute(stop_cmd, user=self.__params.metron_user)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 273, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 293, in _call
>     raise ExecutionFailed(err_msg, code, out, err)
> resource_management.core.exceptions.ExecutionFailed: Execution of 'storm kill yaf' returned 1. Running: /usr/jdk64/jdk1.8.0_77/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.5.3.0-37/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.5.3.0-37/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.5.3.0-37/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/storm-rename-hack-1.0.1.2.5.3.0-37.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-api-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-core-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/asm-5.0.3.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.5.3.0-37/storm/lib/slf4j-api-1.7.7.jar:/usr/hdp/2.5.3.0-37/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.5.3.0-37/storm/lib/zookeeper.jar:/usr/hdp/2.5.3.0-37/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.5.3.0-37/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.5.3.0-37/storm/lib/storm-core-1.0.1.2.5.3.0-37.jar:/usr/hdp/2.5.3.0-37/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/objenesis-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.5.3.0-37/storm/extlib-daemon/ranger-storm-plugin-shim-0.6.0.2.5.3.0-37.jar:/usr/hdp/2.5.3.0-37/storm/extlib-daemon/ojdbc6.jar:/usr/hdp/2.5.3.0-37/storm/extlib-daemon/ranger-plugin-classloader-0.6.0.2.5.3.0-37.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.5.3.0-37/storm/bin org.apache.storm.command.kill_topology yaf
> Exception in thread "main" NotAliveException(msg:yaf is not alive)
>     at org.apache.storm.generated.Nimbus$killTopologyWithOpts_result$killTopologyWithOpts_resultStandardScheme.read(Nimbus.java:10748)
>     at org.apache.storm.generated.Nimbus$killTopologyWithOpts_result$killTopologyWithOpts_resultStandardScheme.read(Nimbus.java:10734)
>     at org.apache.storm.generated.Nimbus$killTopologyWithOpts_result.read(Nimbus.java:10676)
>     at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
>     at org.apache.storm.generated.Nimbus$Client.recv_killTopologyWithOpts(Nimbus.java:383)
>     at org.apache.storm.generated.Nimbus$Client.killTopologyWithOpts(Nimbus.java:369)
>     at org.apache.storm.command.kill_topology$_main.doInvoke(kill_topology.clj:27)
>     at clojure.lang.RestFn.applyTo(RestFn.java:137)
>     at org.apache.storm.command.kill_topology.main(Unknown Source)
> stdout: /var/lib/ambari-agent/data/output-966.txt
> 2017-04-26 18:21:46,880 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.3.0-37
> 2017-04-26 18:21:46,882 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
> 2017-04-26 18:21:46,884 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2017-04-26 18:21:46,921 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
> 2017-04-26 18:21:46,922 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2017-04-26 18:21:46,960 - checked_call returned (0, '')
> 2017-04-26 18:21:46,962 - Ensuring that hadoop has the correct symlink structure
> 2017-04-26 18:21:46,962 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2017-04-26 18:21:47,150 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.3.0-37
> 2017-04-26 18:21:47,152 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
> 2017-04-26 18:21:47,155 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2017-04-26 18:21:47,193 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
> 2017-04-26 18:21:47,194 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2017-04-26 18:21:47,232 - checked_call returned (0, '')
> 2017-04-26 18:21:47,233 - Ensuring that hadoop has the correct symlink structure
> 2017-04-26 18:21:47,233 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2017-04-26 18:21:47,235 - Group['metron'] {}
> 2017-04-26 18:21:47,238 - Group['livy'] {}
> 2017-04-26 18:21:47,238 - Group['elasticsearch'] {}
> 2017-04-26 18:21:47,238 - Group['spark'] {}
> 2017-04-26 18:21:47,239 - Group['zeppelin'] {}
> 2017-04-26 18:21:47,239 - Group['hadoop'] {}
> 2017-04-26 18:21:47,239 - Group['kibana'] {}
> 2017-04-26 18:21:47,240 - Group['users'] {}
> 2017-04-26 18:21:47,240 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,242 - User['storm'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,243 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,244 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,245 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
> 2017-04-26 18:21:47,246 - User['zeppelin'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,247 - User['metron'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,248 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,248 - User['elasticsearch'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,249 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,250 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
> 2017-04-26 18:21:47,251 - User['kafka'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,252 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,253 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,254 - User['kibana'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,255 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,256 - User['hbase'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,257 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
> 2017-04-26 18:21:47,258 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2017-04-26 18:21:47,261 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
> 2017-04-26 18:21:47,269 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
> 2017-04-26 18:21:47,270 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'create_parents': True, 'mode': 0775, 'cd_access': 'a'}
> 2017-04-26 18:21:47,272 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2017-04-26 18:21:47,274 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
> 2017-04-26 18:21:47,281 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
> 2017-04-26 18:21:47,282 - Group['hdfs'] {}
> 2017-04-26 18:21:47,283 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': [u'hadoop', u'hdfs']}
> 2017-04-26 18:21:47,284 - FS Type:
> 2017-04-26 18:21:47,284 - Directory['/etc/hadoop'] {'mode': 0755}
> 2017-04-26 18:21:47,308 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
> 2017-04-26 18:21:47,310 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
> 2017-04-26 18:21:47,330 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
> 2017-04-26 18:21:47,341 - Skipping Execute[('setenforce', '0')] due to not_if
> 2017-04-26 18:21:47,342 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
> 2017-04-26 18:21:47,346 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
> 2017-04-26 18:21:47,346 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
> 2017-04-26 18:21:47,354 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
> 2017-04-26 18:21:47,357 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
> 2017-04-26 18:21:47,358 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
> 2017-04-26 18:21:47,377 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs', 'group': 'hadoop'}
> 2017-04-26 18:21:47,378 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
> 2017-04-26 18:21:47,379 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
> 2017-04-26 18:21:47,386 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}
> 2017-04-26 18:21:47,391 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
> 2017-04-26 18:21:47,682 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.3.0-37
> 2017-04-26 18:21:47,684 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
> 2017-04-26 18:21:47,687 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2017-04-26 18:21:47,726 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
> 2017-04-26 18:21:47,727 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2017-04-26 18:21:47,766 - checked_call returned (0, '')
> 2017-04-26 18:21:47,767 - Ensuring that hadoop has the correct symlink structure
> 2017-04-26 18:21:47,767 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2017-04-26 18:21:47,771 - Create Metron Local Config Directory
> 2017-04-26 18:21:47,771 - Configure Metron global.json
> 2017-04-26 18:21:47,771 - Directory['/usr/metron/0.4.0/config/zookeeper'] {'owner': 'metron', 'group': 'metron', 'mode': 0755}
> 2017-04-26 18:21:47,781 - File['/usr/metron/0.4.0/config/zookeeper/global.json'] {'content': InlineTemplate(...), 'owner': 'metron'}
> 2017-04-26 18:21:47,786 - File['/usr/metron/0.4.0/config/zookeeper/../elasticsearch.properties'] {'content': InlineTemplate(...), 'owner': 'metron'}
> 2017-04-26 18:21:47,787 - Loading config into ZooKeeper
> 2017-04-26 18:21:47,787 - Execute['/usr/metron/0.4.0/bin/zk_load_configs.sh --mode PUSH -i /usr/metron/0.4.0/config/zookeeper -z y113.l42scl.hortonworks.com:2181,y114.l42scl.hortonworks.com:2181,y115.l42scl.hortonworks.com:2181'] {'path': [u'/usr/jdk64/jdk1.8.0_77/bin']}
> 2017-04-26 18:21:49,396 - Calling security setup
> 2017-04-26 18:21:49,397 - Restarting the parser topologies
> 2017-04-26 18:21:49,397 - Stopping parsers
> 2017-04-26 18:21:49,397 - Stopping bro
> 2017-04-26 18:21:49,397 - Execute['storm kill bro'] {'user': 'metron'}
> 2017-04-26 18:21:55,400 - Stopping snort
> 2017-04-26 18:21:55,401 - Execute['storm kill snort'] {'user': 'metron'}
> 2017-04-26 18:22:01,016 - Stopping yaf
> 2017-04-26 18:22:01,017 - Execute['storm kill yaf'] {'user': 'metron'}
> Command failed after 1 tries
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
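
The report argues that a parser topology which is already down should not fail the "stop" step. A minimal sketch of that idea follows. It is not the change made for METRON-894 and it does not use Ambari's resource_management API. The function names (run_command, stop_parser_topologies) and the injectable "run" parameter are hypothetical, chosen only for illustration. The sketch assumes that "storm kill" prints "NotAliveException" when the topology is not running, as the trace above shows, and it is written for Python 3 even though the agent in the trace runs Python 2.6.

```python
import subprocess


def run_command(argv):
    """Run a command; return (exit_code, combined stdout and stderr text)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def stop_parser_topologies(parsers, run=run_command):
    """Kill each parser topology, tolerating topologies that are not running.

    Returns the names of parsers that were already stopped. Any other
    non-zero exit still raises, so genuine failures are not hidden.
    """
    already_stopped = []
    for parser in parsers:
        code, output = run(["storm", "kill", parser])
        if code == 0:
            continue
        # Assumption: Storm reports a missing topology with this exception
        # name, as seen in the stderr captured in this issue.
        if "NotAliveException" in output:
            already_stopped.append(parser)
            continue
        raise RuntimeError(
            "'storm kill %s' returned %d: %s" % (parser, code, output)
        )
    return already_stopped
```

Passing the command runner as a parameter keeps the sketch testable without a Storm cluster: a fake runner can return the same "NotAliveException" text for yaf that the log above shows, and the call should then report yaf as already stopped instead of raising.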