That latency is what is making Storm read and process the data so slowly, and I am not sure what is causing it.
--
Kushan Maskey
817.403.7500

On Mon, Sep 22, 2014 at 10:30 PM, Tom Brown wrote:

> The screen shows a complete latency of 47 seconds. That is really high.
> Is there a screen that shows the performance/capacity of each bolt?
>
> --Tom
>
> On Mon, Sep 22, 2014 at 9:27 PM, Kushan Maskey wrote:
>
>> Below are my topology configuration and the topology stats based on
>> that configuration. Can anyone help me optimize Storm to process the
>> 20 million records faster?
>>
>> Topology stats
>>
>> Window        Emitted  Transferred  Complete latency (ms)  Acked/Failed
>> 10m 0s        111420   111420       47305.488              2861000
>>   <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=600>
>> 3h 0m 0s      111420   111420       47305.488              2861000
>>   <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=10800>
>> 1d 0h 0m 0s   111420   111420       47305.488              2861000
>>   <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=86400>
>> All time      111420   111420       47305.488              2861000
>>   <http://nmcxstrmd001:8080/topology.html?id=CEXPStormTopology-1-1411442050&window=:all-time>
>>
>> Topology Configuration
>>
>> Key                                             Value
>> dev.zookeeper.path                              /tmp/dev-storm-zookeeper
>> drpc.childopts                                  -Xmx768m
>> drpc.invocations.port                           3773
>> drpc.port                                       3772
>> drpc.queue.size                                 128
>> drpc.request.timeout.secs                       600
>> drpc.worker.threads                             64
>> java.library.path                               /usr/local/lib:/opt/local/lib:/usr/lib
>> logviewer.appender.name                         A1
>> logviewer.childopts                             -Xmx128m
>> logviewer.port                                  8000
>> nimbus.childopts                                -Xmx1024m
>> nimbus.cleanup.inbox.freq.secs                  600
>> nimbus.file.copy.expiration.secs                600
>> nimbus.host                                     mystormserver
>> nimbus.inbox.jar.expiration.secs                3600
>> nimbus.monitor.freq.secs                        10
>> nimbus.reassign                                 true
>> nimbus.supervisor.timeout.secs                  60
>> nimbus.task.launch.secs                         120
>> nimbus.task.timeout.secs                        30
>> nimbus.thrift.max_buffer_size                   1048576
>> nimbus.thrift.port                              6627
>> nimbus.topology.validator                       backtype.storm.nimbus.DefaultTopologyValidator
>> storm.cluster.mode                              distributed
>> storm.config.properties                         [object Object]
>> storm.id                                        CEXPStormTopology-1-1411442050
>> storm.local.dir                                 /data/disk00/storm/localdir
>> storm.local.mode.zmq                            false
>> storm.messaging.netty.buffer_size               5242880
>> storm.messaging.netty.client_worker_threads     1
>> storm.messaging.netty.flush.check.interval.ms   10
>> storm.messaging.netty.max_retries               30
>> storm.messaging.netty.max_wait_ms               1000
>> storm.messaging.netty.min_wait_ms               100
>> storm.messaging.netty.server_worker_threads     1
>> storm.messaging.netty.transfer.batch.size       262144
>> storm.messaging.transport                       backtype.storm.messaging.netty.Context
>> storm.thrift.transport                          backtype.storm.security.auth.SimpleTransportPlugin
>> storm.zookeeper.connection.timeout              15000
>> storm.zookeeper.port                            2181
>> storm.zookeeper.retry.interval                  1000
>> storm.zookeeper.retry.intervalceiling.millis    30000
>> storm.zookeeper.retry.times                     5
>> storm.zookeeper.root                            /storm
>> storm.zookeeper.servers                         mystormserver
>> storm.zookeeper.session.timeout                 20000
>> supervisor.childopts                            -Xmx256m
>> supervisor.enable                               true
>> supervisor.heartbeat.frequency.secs             5
>> supervisor.monitor.frequency.secs               3
>> supervisor.slots.ports                          6700,6701,6702,6703,6704,6705,6706,6707,6708,6709,6710,6711,6712,6713,6714,6715,6716,6717,6718,6719,6720,6721,6722,6723,6724,6725,6726,6727,6728
>> supervisor.worker.start.timeout.secs            120
>> supervisor.worker.timeout.secs                  30
>> task.heartbeat.frequency.secs                   3
>> task.refresh.poll.secs                          10
>> topology.acker.executors                        1000
>> topology.builtin.metrics.bucket.size.secs       60
>> topology.debug                                  true
>> topology.disruptor.wait.strategy                com.lmax.disruptor.BlockingWaitStrategy
>> topology.enable.message.timeouts                true
>> topology.error.throttle.interval.secs           10
>> topology.executor.receive.buffer.size           65536
>> topology.executor.send.buffer.size              65536
>> topology.fall.back.on.java.serialization        true
>> topology.kryo.decorators
>> topology.kryo.factory                           backtype.storm.serialization.DefaultKryoFactory
>> topology.kryo.register                          [object Object]
>> topology.max.error.report.per.interval          5
>> topology.max.spout.pending                      5000
>> topology.max.task.parallelism                   100
>> topology.message.timeout.secs                   60
>> topology.multilang.serializer                   backtype.storm.multilang.JsonSerializer
>> topology.name                                   CEXPStormTopology
>> topology.receiver.buffer.size                   8
>> topology.skip.missing.kryo.registrations        false
>> topology.sleep.spout.wait.strategy.time.ms      1
>> topology.spout.wait.strategy                    backtype.storm.spout.SleepSpoutWaitStrategy
>> topology.state.synchronization.timeout.secs     60
>> topology.stats.sample.rate                      0.05
>> topology.tasks
>> topology.tick.tuple.freq.secs
>> topology.transfer.buffer.size                   32
>> topology.trident.batch.emit.interval.millis     500
>> topology.tuple.serializer                       backtype.storm.serialization.types.ListDelegateSerializer
>> topology.worker.childopts
>> topology.worker.receiver.thread.count           1
>> topology.worker.shared.thread.pool.size         4
>> topology.workers                                20
>> transactional.zookeeper.port
>> transactional.zookeeper.root                    /transactional
>> transactional.zookeeper.servers
>> ui.childopts                                    -Xmx768m
>> ui.port                                         8080
>> worker.childopts                                -Xmx1024m
>> worker.heartbeat.frequency.secs                 1
>> zmq.hwm                                         0
>> zmq.linger.millis                               5000
>> zmq.threads                                     1
>>
>> --
>> Kushan Maskey
>> 817.403.7500
>>
>> On Mon, Sep 22, 2014 at 9:25 PM, Kushan Maskey wrote:
>>
>>> Here is my storm config.
>>>
>>> storm.config.setMaxTaskParallelism=4
>>> storm.config.setNumWorkers=20
>>> storm.config.setMaxSpoutPending=5000
>>> storm.config.numAckers=1000
>>>
>>> I am guessing I need to increase maxTaskParallelism. If that is the
>>> case, how much would you suggest? Any help will be highly
>>> appreciated.
>>>
>>> Thanks.
>>>
>>> --
>>> Kushan Maskey
>>> 817.403.7500
>>>
>>> On Mon, Sep 22, 2014 at 9:20 PM, Michael Rose wrote:
>>>
>>>> Storm is not your bottleneck. Check your Storm code to ensure
>>>> 1) you're parallelizing your writes and 2) you're batching writes
>>>> to your external resources if possible. Some quick napkin math
>>>> shows you only doing 110 writes/s, which seems awfully low.
>>>>
>>>> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
>>>> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
>>>>
>>>> On Mon, Sep 22, 2014 at 8:05 PM, Kushan Maskey wrote:
>>>>
>>>>> I am trying to load 20 million records into a Cassandra database
>>>>> through Kafka-Storm. I am able to post all the data into Kafka in
>>>>> 5 minutes, but reading it from Storm and inserting it into
>>>>> Cassandra, Couch and Solr is very slow. It has been running for
>>>>> the past 5 hours and has loaded only 2 million records so far.
>>>>>
>>>>> How do I make Storm perform faster? At this pace it will take a
>>>>> couple of days to load all the data.
>>>>>
>>>>> --
>>>>> Kushan Maskey
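The napkin math in the thread can be reproduced with a small sketch. It uses only figures quoted above: 2 million records loaded in 5 hours, 20 million records in total, a max spout pending of 5000, and a complete latency of about 47 seconds. The class and method names are hypothetical. The last estimate applies Little's law and assumes a single spout task whose pending limit stays full the whole time, which the thread does not confirm.

```java
// Back-of-the-envelope throughput estimates from the figures in the thread.
// NapkinMath and its method names are illustrative, not part of Storm.
final class NapkinMath {

    /** Average records per second over an elapsed period given in hours. */
    static double recordsPerSecond(long records, double hours) {
        return records / (hours * 3600.0);
    }

    /** Hours needed to process the given number of records at a steady rate. */
    static double hoursToProcess(long records, double ratePerSecond) {
        return records / ratePerSecond / 3600.0;
    }

    /**
     * Little's law: sustained throughput = items in flight / time in flight.
     * ASSUMPTION: one spout task, with the pending limit always saturated.
     */
    static double littlesLawThroughput(long inFlight, double latencySeconds) {
        return inFlight / latencySeconds;
    }

    public static void main(String[] args) {
        // 2 million records in 5 hours, as reported in the original post.
        double observed = recordsPerSecond(2_000_000L, 5.0);
        // Time for the full 20 million records at that observed rate.
        double totalHours = hoursToProcess(20_000_000L, observed);
        // Max spout pending of 5000 and a complete latency of about 47 s.
        double bound = littlesLawThroughput(5_000L, 47.0);

        System.out.printf("observed rate: %.1f records/s%n", observed);
        System.out.printf("time for 20M:  %.1f hours%n", totalHours);
        System.out.printf("pending/latency estimate: %.1f tuples/s%n", bound);
    }
}
```

The observed rate works out to roughly 111 records/s, which matches the "110 writes/s" estimate, and 20 million records at that rate need about 50 hours, in line with "a couple of days". If the single-spout-task assumption holds, the pending limit divided by the complete latency lands near the same rate, which would be consistent with the per-tuple latency of the external writes, rather than Storm itself, limiting throughput.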
