[ https://issues.apache.org/jira/browse/AMBARI-23894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk updated AMBARI-23894:
-------------------------------------
    Status: Patch Available  (was: Open)

> ZooKeepers Show As Down After EU to HDP 3.0 But They Are Not
> ------------------------------------------------------------
>
>                 Key: AMBARI-23894
>                 URL: https://issues.apache.org/jira/browse/AMBARI-23894
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>            Priority: Major
>             Fix For: 2.7.0
>
>         Attachments: AMBARI-23894.patch
>
>
> STR:
> * Perform an EU from HDP 2.6 to HDP 3.0
> After, 2 of my 3 ZKs are shown as being down. However, they are actually alive
> on my boxes:
>
>
>
>     [root@c7402 ~]$ ps aux | grep [z]oo.cfg
>     zookeep+ 22463 0.2 2.8 3064236 53728 ? Sl 20:41 0:01
>     /usr/jdk64/jdk1.8.0_144/bin/java -Dzookeeper.log.dir=/var/log/zookeeper
>     -Dzookeeper.log.file=zookeeper-zookeeper-server-c7402.ambari.apache.org.log
>     -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp
>     /usr/hdp/current/zookeeper-server/bin/../build/classes:/usr/hdp/current/zookeeper-server/bin/../build/lib/*.jar:/usr/hdp/current/zookeeper-server/bin/../lib/xercesMinimal-1.9.6.2.jar:/usr/hdp/current/zookeeper-server/bin/../lib/wagon-provider-api-2.4.jar:/usr/hdp/current/zookeeper-server/bin/../lib/wagon-http-shared4-2.4.jar:/usr/hdp/current/zookeeper-server/bin/../lib/wagon-http-shared-1.0-beta-6.jar:/usr/hdp/current/zookeeper-server/bin/../lib/wagon-http-lightweight-1.0-beta-6.jar:/usr/hdp/current/zookeeper-server/bin/../lib/wagon-http-2.4.jar:/usr/hdp/current/zookeeper-server/bin/../lib/wagon-file-1.0-beta-6.jar:/usr/hdp/current/zookeeper-server/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/plexus-utils-3.0.8.jar:/usr/hdp/current/zookeeper-server/bin/../lib/plexus-interpolation-1.11.jar:/usr/hdp/current/zookeeper-server/bin/../lib/plexus-container-default-1.0-alpha-9-stable-1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/current/zookeeper-server/bin/../lib/nekohtml-1.9.6.2.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-settings-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-repository-metadata-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-project-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-profile-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-plugin-registry-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-model-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-error-diagnostics-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-artifact-manager-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-artifact-2.2.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/maven-ant-tasks-2.1.3.jar:/usr/hdp/current/zookeeper-server/bin/../lib/log4j-1.2.16.jar:/usr/hdp/current/zookeeper-server/bin/../lib/jsoup-1.7.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/jline-0.9.94.jar:/usr/hdp/current/zookeeper-server/bin/../lib/commons-logging-1.1.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/commons-io-2.2.jar:/usr/hdp/current/zookeeper-server/bin/../lib/commons-codec-1.6.jar:/usr/hdp/current/zookeeper-server/bin/../lib/classworlds-1.1-alpha-2.jar:/usr/hdp/current/zookeeper-server/bin/../lib/backport-util-concurrent-3.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/ant-launcher-1.8.0.jar:/usr/hdp/current/zookeeper-server/bin/../lib/ant-1.8.0.jar:/usr/hdp/current/zookeeper-server/bin/../zookeeper-3.4.6.3.0.0.0-1250.jar:/usr/hdp/current/zookeeper-server/bin/../src/java/lib/*.jar:/usr/hdp/current/zookeeper-server/conf::/usr/share/zookeeper/*:/usr/share/zookeeper/*
>     -Xmx1024m -Dcom.sun.management.jmxremote
>     -Dcom.sun.management.jmxremote.local.only=false
>     org.apache.zookeeper.server.quorum.QuorumPeerMain
>     /usr/hdp/current/zookeeper-server/conf/zoo.cfg
>
>     [root@c7402 ~]$ telnet localhost 2181
>     Trying ::1...
>     Connected to localhost.
>     Escape character is '^]'.
>     ^CConnection closed by foreign host.
>
> But you can see that we clearly think it's down on c7402:
>
>
>
>     {
>       "href" : "http://localhost:8080/api/v1/clusters/c1/hosts/c7402.ambari.apache.org/host_components/ZOOKEEPER_SERVER",
>       "HostRoles" : {
>         "cluster_name" : "c1",
>         "component_name" : "ZOOKEEPER_SERVER",
>         "desired_repository_version" : "3.0.0.0-1250",
>         "desired_stack_id" : "HDP-3.0",
>         "desired_state" : "STARTED",
>         "display_name" : "ZooKeeper Server",
>         "host_name" : "c7402.ambari.apache.org",
>         "maintenance_state" : "OFF",
>         "public_host_name" : "c7402.ambari.apache.org",
>         "reload_configs" : false,
>         "service_name" : "ZOOKEEPER",
>         "stale_configs" : false,
>         "state" : "INSTALLED",
>         "upgrade_state" : "NONE",
>         "version" : "3.0.0.0-1250",
>         "actual_configs" : { }
>       },
>       "host" : {
>         "href" : "http://localhost:8080/api/v1/clusters/c1/hosts/c7402.ambari.apache.org"
>       },
>       "component" : [
>         {
>           "href" : "http://localhost:8080/api/v1/clusters/c1/services/ZOOKEEPER/components/ZOOKEEPER_SERVER",
>           "ServiceComponentInfo" : {
>             "cluster_name" : "c1",
>             "component_name" : "ZOOKEEPER_SERVER",
>             "service_name" : "ZOOKEEPER"
>           }
>         }
>       ],
>       "processes" : [ ]
>     }
>
> The PID file looks correct:
>
>
>
>     [root@c7402 zookeeper]$ cat /var/run/zookeeper/zookeeper_server.pid
>     22463
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)