[ https://issues.apache.org/jira/browse/AMBARI-18684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Hurley updated AMBARI-18684:
-------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Webhcat server start failed during EU with BindException
> --------------------------------------------------------
>
>                 Key: AMBARI-18684
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18684
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.2.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.5.0
>
>         Attachments: AMBARI-18684.patch
>
>
> WebHCat may fail to restart during an upgrade due to the following exception:
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/webhcat_server.py", line 155, in <module>
>     WebHCatServer().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
>     self.start(env, upgrade_type=upgrade_type)
>   File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/webhcat_server.py", line 42, in start
>     webhcat_service(action='start', upgrade_type=upgrade_type)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/webhcat_service.py", line 54, in webhcat_service
>     environment = environ)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'cd /var/run/webhcat ; /usr/hdp/current/hive-webhcat/sbin/webhcat_server.sh start' returned 1.
> {code}
> {noformat}
> WARN | 17 Oct 2016 12:53:02,999 | org.eclipse.jetty.util.component.AbstractLifeCycle | FAILED org.eclipse.jetty.server.Server@19a639d8: java.net.BindException: Address already in use
> java.net.BindException: Address already in use
>       at sun.nio.ch.Net.bind0(Native Method)
>       at sun.nio.ch.Net.bind(Net.java:444)
>       at sun.nio.ch.Net.bind(Net.java:436)
>       at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
> {noformat}
> The problem seems to be caused by the failure of WebHCat to stop before being upgraded. Code was added in AMBARI-12695 to address the issues with WebHCat not stopping; however, it doesn't look correct.
> - Return Code 0 (prevents the kill -9 from running due to {{not_if}})
> --
> {code}
> ! (ls /var/run/webhcat/webhcat.pid >/dev/null 2>&1 && ps -p `/var/lib/ambari-agent/ambari-sudo.sh su hcat -l -s /bin/bash -c 'cat /var/run/webhcat/webhcat.pid'` >/dev/null 2>&1) || ( sleep 10 && ! (ls /var/run/webhcat/webhcat.pid >/dev/null 2>&1 && ps -p `ambari-sudo.sh su hcat -l -s /bin/bash -c 'cat /var/run/webhcat/webhcat.pid'` >/dev/null 2>&1) )
> {code}
> - Return Code 0 (prevents Fail from being raised)
> --
> {code}
> ! (ls /var/run/webhcat/webhcat.pid >/dev/null 2>&1 && ps -p `/var/lib/ambari-agent/ambari-sudo.sh su hcat -l -s /bin/bash -c 'cat /var/run/webhcat/webhcat.pid'` >/dev/null 2>&1)
> {code}
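
Both quoted checks decide whether WebHCat has stopped purely from the PID file: they exit 0 unless the file exists and the PID it names is alive. The following is a minimal Python 3 sketch of that logic for a POSIX host, not the Ambari code. The names `process_alive` and `looks_stopped` are hypothetical, and the missing-PID-file scenario it demonstrates is only one possible way to get return code 0 while the port is still bound; the ticket does not confirm which case occurred.

```python
import os
import tempfile


def process_alive(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def looks_stopped(pid_file):
    """Hypothetical mirror of: ! (ls PID_FILE && ps -p `cat PID_FILE`).

    A True result corresponds to shell return code 0.
    """
    if not os.path.isfile(pid_file):
        # No PID file: the check passes without inspecting any process,
        # so a server that is still running would go unnoticed.
        return True
    with open(pid_file) as handle:
        content = handle.read().strip()
    if not content.isdigit():
        # `ps -p` fails on an empty or malformed PID, so the negation passes.
        return True
    return not process_alive(int(content))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pid_file = os.path.join(tmp, "webhcat.pid")
        # PID file absent: reported as stopped.
        print(looks_stopped(pid_file))
        # PID file names a live process (this interpreter): not stopped.
        with open(pid_file, "w") as handle:
            handle.write(str(os.getpid()))
        print(looks_stopped(pid_file))
```

Under these assumptions, a check that returns 0 says only that the PID file no longer points at a live process. It says nothing about whether the listening port has been released, which is what the BindException depends on.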