The UI shows a complete latency of 47 seconds, which is really high. Is there a screen that shows the performance/capacity of each bolt?
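[The per-bolt breakdown Tom asks for is on the same topology page: each bolt row includes a "capacity" figure, which the UI derives from the bolt's executed count and average execute latency over the metrics window. A minimal sketch of that calculation; the numbers below are hypothetical, not taken from this topology.]

```python
# Sketch of how the Storm UI's per-bolt "capacity" number is derived:
# the fraction of the window the bolt's executors spent inside execute().
# Values approaching 1.0 mean the bolt is saturated and needs more
# parallelism.
def bolt_capacity(executed, execute_latency_ms, window_secs=600):
    return (executed * execute_latency_ms) / (window_secs * 1000.0)

# Hypothetical example: 66,000 tuples at 5 ms each over a 10-minute window
print(bolt_capacity(66000, 5.0))  # 0.55 -> bolt busy 55% of the time
```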
--Tom

On Mon, Sep 22, 2014 at 9:27 PM, Kushan Maskey <kushan.mas...@mmillerassociates.com> wrote:

> Below is my topology configuration and the topology status based on that
> configuration. Can anyone help me optimize Storm to process the 20 million
> records faster?
>
> Topology stats
>
> Window       Emitted  Transferred  Complete latency (ms)  Acked  Failed
> 10m 0s       111420   111420       47305.488              286    1000
> 3h 0m 0s     111420   111420       47305.488              286    1000
> 1d 0h 0m 0s  111420   111420       47305.488              286    1000
> All time     111420   111420       47305.488              286    1000
>
> (per-window links:
> <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=600>
> <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=10800>
> <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=86400>
> <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=:all-time>)
>
> Topology Configuration
>
> Key                                           Value
> dev.zookeeper.path                            /tmp/dev-storm-zookeeper
> drpc.childopts                                -Xmx768m
> drpc.invocations.port                         3773
> drpc.port                                     3772
> drpc.queue.size                               128
> drpc.request.timeout.secs                     600
> drpc.worker.threads                           64
> java.library.path                             /usr/local/lib:/opt/local/lib:/usr/lib
> logviewer.appender.name                       A1
> logviewer.childopts                           -Xmx128m
> logviewer.port                                8000
> nimbus.childopts                              -Xmx1024m
> nimbus.cleanup.inbox.freq.secs                600
> nimbus.file.copy.expiration.secs              600
> nimbus.host                                   mystormserver
> nimbus.inbox.jar.expiration.secs              3600
> nimbus.monitor.freq.secs                      10
> nimbus.reassign                               true
> nimbus.supervisor.timeout.secs                60
> nimbus.task.launch.secs                       120
> nimbus.task.timeout.secs                      30
> nimbus.thrift.max_buffer_size                 1048576
> nimbus.thrift.port                            6627
> nimbus.topology.validator                     backtype.storm.nimbus.DefaultTopologyValidator
> storm.cluster.mode                            distributed
> storm.config.properties                       [object Object]
> storm.id                                      CEXPStormTopology-1-1411442050
> storm.local.dir                               /data/disk00/storm/localdir
> storm.local.mode.zmq                          false
> storm.messaging.netty.buffer_size             5242880
> storm.messaging.netty.client_worker_threads   1
> storm.messaging.netty.flush.check.interval.ms 10
> storm.messaging.netty.max_retries             30
> storm.messaging.netty.max_wait_ms             1000
> storm.messaging.netty.min_wait_ms             100
> storm.messaging.netty.server_worker_threads   1
> storm.messaging.netty.transfer.batch.size     262144
> storm.messaging.transport                     backtype.storm.messaging.netty.Context
> storm.thrift.transport                        backtype.storm.security.auth.SimpleTransportPlugin
> storm.zookeeper.connection.timeout            15000
> storm.zookeeper.port                          2181
> storm.zookeeper.retry.interval                1000
> storm.zookeeper.retry.intervalceiling.millis  30000
> storm.zookeeper.retry.times                   5
> storm.zookeeper.root                          /storm
> storm.zookeeper.servers                       mystormserver
> storm.zookeeper.session.timeout               20000
> supervisor.childopts                          -Xmx256m
> supervisor.enable                             true
> supervisor.heartbeat.frequency.secs           5
> supervisor.monitor.frequency.secs             3
> supervisor.slots.ports                        6700-6728 (29 slots)
> supervisor.worker.start.timeout.secs          120
> supervisor.worker.timeout.secs                30
> task.heartbeat.frequency.secs                 3
> task.refresh.poll.secs                        10
> topology.acker.executors                      1000
> topology.builtin.metrics.bucket.size.secs     60
> topology.debug                                true
> topology.disruptor.wait.strategy              com.lmax.disruptor.BlockingWaitStrategy
> topology.enable.message.timeouts              true
> topology.error.throttle.interval.secs         10
> topology.executor.receive.buffer.size         65536
> topology.executor.send.buffer.size            65536
> topology.fall.back.on.java.serialization      true
> topology.kryo.decorators                      (empty)
> topology.kryo.factory                         backtype.storm.serialization.DefaultKryoFactory
> topology.kryo.register                        [object Object]
> topology.max.error.report.per.interval        5
> topology.max.spout.pending                    5000
> topology.max.task.parallelism                 100
> topology.message.timeout.secs                 60
> topology.multilang.serializer                 backtype.storm.multilang.JsonSerializer
> topology.name                                 CEXPStormTopology
> topology.receiver.buffer.size                 8
> topology.skip.missing.kryo.registrations      false
> topology.sleep.spout.wait.strategy.time.ms    1
> topology.spout.wait.strategy                  backtype.storm.spout.SleepSpoutWaitStrategy
> topology.state.synchronization.timeout.secs   60
> topology.stats.sample.rate                    0.05
> topology.tasks                                (empty)
> topology.tick.tuple.freq.secs                 (empty)
> topology.transfer.buffer.size                 32
> topology.trident.batch.emit.interval.millis   500
> topology.tuple.serializer                     backtype.storm.serialization.types.ListDelegateSerializer
> topology.worker.childopts                     (empty)
> topology.worker.receiver.thread.count         1
> topology.worker.shared.thread.pool.size       4
> topology.workers                              20
> transactional.zookeeper.port                  (empty)
> transactional.zookeeper.root                  /transactional
> transactional.zookeeper.servers               (empty)
> ui.childopts                                  -Xmx768m
> ui.port                                       8080
> worker.childopts                              -Xmx1024m
> worker.heartbeat.frequency.secs               1
> zmq.hwm                                       0
> zmq.linger.millis                             5000
> zmq.threads                                   1
>
> --
> Kushan Maskey
> 817.403.7500
>
> On Mon, Sep 22, 2014 at 9:25 PM, Kushan Maskey <kushan.mas...@mmillerassociates.com> wrote:
>
>> Here is my storm config:
>>
>> storm.config.setMaxTaskParallelism=4
>> storm.config.setNumWorkers=20
>> storm.config.setMaxSpoutPending=5000
>> storm.config.numAckers=1000
>>
>> I am guessing I need to increase maxTaskParallelism further. If that is
>> the case, how much would you suggest? Any help will be highly appreciated.
>>
>> Thanks.
>>
>> --
>> Kushan Maskey
>> 817.403.7500
>>
>> On Mon, Sep 22, 2014 at 9:20 PM, Michael Rose <mich...@fullcontact.com> wrote:
>>
>>> Storm is not your bottleneck. Check your Storm code to 1) ensure you're
>>> parallelizing your writes and 2) batching writes to your external
>>> resources if possible. Some quick napkin math shows you only doing 110
>>> writes/s, which seems awfully low.
>>>
>>> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
>>> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
>>> mich...@fullcontact.com
>>>
>>> On Mon, Sep 22, 2014 at 8:05 PM, Kushan Maskey <kushan.mas...@mmillerassociates.com> wrote:
>>>
>>>> I am trying to load 20 M records into a Cassandra database through
>>>> Kafka-Storm.
>>>> I am able to post all the data into Kafka in 5 minutes, but reading it
>>>> from Storm and inserting it into Cassandra, Couch, and Solr is very
>>>> slow: it has been running for the past 5 hours and has loaded only 2
>>>> million records so far.
>>>>
>>>> How do I make Storm perform faster? Because at this pace it will take
>>>> a couple of days to load all the data.
>>>>
>>>> --
>>>> Kushan Maskey
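[Michael's two suggestions, parallelizing the writers and batching the writes, target exactly the numbers reported above: 2,000,000 records in 5 hours is 2,000,000 / 18,000 s ≈ 111 writes/s, consistent with one synchronous write per tuple. A minimal, language-agnostic sketch of the batching side; `BatchingBolt` and `write_batch` are illustrative names, not Storm or Cassandra APIs.]

```python
# Sketch of the batching Michael describes: buffer incoming tuples and
# flush them to the external store in groups, instead of issuing one
# synchronous write per tuple. `write_batch` stands in for whatever bulk
# API the store offers (e.g. a Cassandra unlogged batch or a Solr bulk add).
BATCH_SIZE = 200

class BatchingBolt:
    def __init__(self, write_batch):
        self.write_batch = write_batch  # hypothetical bulk-write callable
        self.buffer = []

    def execute(self, tuple_values):
        self.buffer.append(tuple_values)
        if len(self.buffer) >= BATCH_SIZE:
            self.flush()

    def flush(self):
        # In a real bolt, ack the buffered tuples only after the bulk
        # write succeeds, so a failed flush gets replayed by the spout.
        if self.buffer:
            self.write_batch(self.buffer)
            self.buffer = []
```

[In a real topology the bolt would also flush on a tick tuple so a partially filled buffer is not held indefinitely, and the writing bolt would run with several executors so batches go out in parallel.]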