[ https://issues.apache.org/jira/browse/METRON-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Sirota updated METRON-261:
--------------------------------
    Priority: Minor  (was: Major)

> Storm Supervisors Fail to Start
> -------------------------------
>
>                 Key: METRON-261
>                 URL: https://issues.apache.org/jira/browse/METRON-261
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Nick Allen
>            Priority: Minor
>             Fix For: 0.2.1BETA
>
>
> After deployment completes, the Storm Supervisors often fail to start 
> correctly.  This prevents any data from being ingested until the Supervisors 
> are started manually.
> It appears that the Supervisors fail to connect to Zookeeper, then time 
> out and die.  Zookeeper may simply not be ready in time.  It is not clear 
> whether this is something we can fix or whether it is an Ambari issue.
> The Supervisor log shows:
> 2016-06-25 12:48:16.448 o.a.s.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_40]
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_40]
>         at org.apache.storm.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
> 2016-06-25 12:48:17.154 o.a.s.c.ConnectionState [ERROR] Connection timed out for connection string (ec2-52-41-178-50.us-west-2.compute.amazonaws.com:2181) and timeout (15000) / elapsed (15053)
> org.apache.storm.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
>         at org.apache.storm.curator.ConnectionState.checkTimeouts(ConnectionState.java:195) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.ConnectionState.getZooKeeper(ConnectionState.java:87) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:487) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:226) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:215) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.pathInForegroundStandard(ExistsBuilderImpl.java:212) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:205) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:168) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:39) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.zookeeper$exists_node_QMARK_$fn__3211.invoke(zookeeper.clj:107) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.zookeeper$exists_node_QMARK_.invoke(zookeeper.clj:104) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.zookeeper$mkdirs.invoke(zookeeper.clj:120) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.cluster$mk_distributed_cluster_state.doInvoke(cluster.clj:60) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.RestFn.invoke(RestFn.java:486) [clojure-1.6.0.jar:?]
>         at backtype.storm.cluster$mk_storm_cluster_state.doInvoke(cluster.clj:314) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.RestFn.invoke(RestFn.java:439) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor$supervisor_data.invoke(supervisor.clj:296) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.daemon.supervisor$fn__8449$exec_fn__3614__auto____8450.invoke(supervisor.clj:504) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.AFn.applyToHelper(AFn.java:160) [clojure-1.6.0.jar:?]
>         at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.6.0.jar:?]
>         at clojure.core$apply.invoke(core.clj:624) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor$fn__8449$mk_supervisor__8476.doInvoke(supervisor.clj:500) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.RestFn.invoke(RestFn.java:436) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor$_launch.invoke(supervisor.clj:792) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.daemon.supervisor$_main.invoke(supervisor.clj:822) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.AFn.applyToHelper(AFn.java:152) [clojure-1.6.0.jar:?]
>         at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor.main(Unknown Source) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
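
If the cause really is that Zookeeper is not yet accepting connections when the Supervisors start (the "Connection refused" in the log is consistent with that, but it is not confirmed), one possible workaround is to block until the Zookeeper client port is reachable before launching the Supervisors. The following is only a minimal sketch under that assumption. The function name wait_for_port and its parameters are hypothetical and are not part of Storm, Ambari, or Metron; it checks only that a TCP connection can be opened, not that Zookeeper is fully ready to serve requests.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    succeeds before `timeout` seconds have elapsed. Hypothetical helper
    for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "Connection refused", timeouts, and resolution errors.
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(interval, remaining))
```

A deployment step could call, for example, wait_for_port(zk_host, 2181, timeout=120) and start the Supervisors only when it returns True, where zk_host stands for the Zookeeper host from the connection string in the log above and 2181 is the port shown there.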



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)