Hi Igor.
Try to dump threads and look on it. I think you can find problem in dump.

Look at lan, cpu and ram load. Maybe you CPU overloaded or ram and lan.

2016-03-27 21:26 GMT+03:00 Igor Kuzmenko <f1she...@gmail.com>:

> Hello, I'm using Hortonworks Data Platform v2.3.4 with included storm 0.10.
> After deploying my topology using "*storm jar*" command it takes about 3
> min to distribute code. The network is 10Gb/s, topology jar is about 150MB,
> cluster has 2 nodes with supervisors on them, so I assume that this process
> shouldn't take much time. Actualy, it wasn't with Hortonworks platform
> 2.3.0, I can't understand what happend since than. Heres change log
> <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/content/patch_storm.html>
>  of
> HDP, hope you can help.
>
> Here's logs.
> Nimbus download jar pretty fast:
> *2016-02-25 17:10:13.040 b.s.d.nimbus [INFO] Uploading file from client to
> /opt/hadoop/storm/nimbus/inbox/stormjar-7352b097-8829-4268-a81f-29d820c0f311.jar*
> *2016-02-25 17:10:14.915 b.s.d.nimbus [INFO] Finished uploading file from
> client:
> /opt/hadoop/storm/nimbus/inbox/stormjar-7352b097-8829-4268-a81f-29d820c0f311.jar*
> 2016-02-25 17:10:14.928 b.s.d.nimbus [INFO] [req 22] Access from:
> principal: op:submitTopology
> 2016-02-25 17:10:14.933 b.s.d.nimbus [INFO] Received topology submission
> for CdrPerformanceTestTopology-phoenix-bolt-4 with conf
> {"topology.max.task.parallelism" nil, "hbase.conf" {"hbase.rootdir" "hdfs://
> sorm-master02.msk.mts.ru:8020/apps/hbase/data"},
> "topology.submitter.principal" "", "topology.acker.executors" nil,
> "topology.workers" 2, "topology.message.timeout.secs" 30, "topology.debug"
> true, "topology.max.spout.pending" 10000, "storm.zookeeper.superACL" nil,
> "topology.users" (), "topology.submitter.user" "", "topology.kryo.register"
> {"org.apache.avro.util.Utf8" nil,
> "ru.mts.sorm.schemes.avro.cdr.CbossCdrRecord" nil},
> "topology.kryo.decorators" (), "storm.id"
> "CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414", "topology.name"
> "CdrPerformanceTestTopology-phoenix-bolt-4"}
> 2016-02-25 17:10:14.936 b.s.d.nimbus [INFO] nimbus file
> location:/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
> 2016-02-25 17:10:15.160 b.s.c.LocalFileSystemCodeDistributor [INFO]
> Created meta file
> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/storm-code-distributor.meta
> upload successful.
> 2016-02-25 17:10:15.164 b.s.zookeeper [INFO] Node already in queue for
> leader lock.
> 2016-02-25 17:10:15.165 b.s.d.nimbus [INFO] desired replication count of 1
> not achieved but we have hit the max wait time 60 so moving on with
> replication count = 1
> 2016-02-25 17:10:15.169 b.s.d.nimbus [INFO] Activating
> CdrPerformanceTestTopology-phoenix-bolt-4:
> CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
> 2016-02-25 17:10:15.186 b.s.s.EvenScheduler [INFO] Available slots:
> (["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6701]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6701])
> 2016-02-25 17:10:15.189 b.s.d.nimbus [INFO] Setting new assignment for
> topology id CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414:
> #backtype.storm.daemon.common.Assignment{:master-code-dir
> "/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414",
> :node->host {"edef7b4d-2358-41c6-8a03-a7638d0f68c6" "
> sorm-data06.msk.mts.ru", "0a6e4de5-dd83-4307-b7e6-1db60c4c0898" "
> sorm-data07.msk.mts.ru"}, :executor->node+port {[8 8]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [2 2]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [7 7]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [3 3]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [1 1]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [6 6]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [9 9]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [11 11]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [5 5]
> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [10 10]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [4 4]
> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700]}, :executor->start-time-secs
> {[8 8] 1456409415, [2 2] 1456409415, [7 7] 1456409415, [3 3] 1456409415, [1
> 1] 1456409415, [6 6] 1456409415, [9 9] 1456409415, [11 11] 1456409415, [5
> 5] 1456409415, [10 10] 1456409415, [4 4] 1456409415}}
>
> But supervisor stuck somewhere:
> *2016-02-25 17:10:15.198 b.s.d.supervisor [INFO] Downloading code for
> storm id* CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414 from
> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
> 2016-02-25 17:10:15.199 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
> 2016-02-25 17:10:15.205 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
> 2016-02-25 17:10:17.405 b.s.c.LocalFileSystemCodeDistributor [INFO]
> Attempting to download meta file
> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormjar.jar
> from remote sorm-master02.msk.mts.ru:6627
> 2016-02-25 *17:10:17.406* b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries
> [5]
> 2016-02-25 *17:13:29.002* b.s.c.LocalFileSystemCodeDistributor [INFO]
> Attempting to download meta file
> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormconf.ser
> from remote sorm-master02.msk.mts.ru:6627
> 2016-02-25 17:13:29.003 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
> 2016-02-25 17:13:31.148 b.s.c.LocalFileSystemCodeDistributor [INFO]
> Attempting to download meta file
> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormcode.ser
> from remote sorm-master02.msk.mts.ru:6627
> 2016-02-25 17:13:31.148 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
> *2016-02-25 17:13:34.124 b.s.d.supervisor [INFO] Finished downloading code
> for storm id* CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414 from
> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
>
>
>


-- 
С уважением Дудин Андрей

Reply via email to