Can you show some code? 200 seconds for 15K puts sounds like you're not batching.
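
To be concrete about what I mean by batching: buffer the writes and hand them to HBase in groups instead of issuing one put per tuple. Below is a minimal sketch of the idea, not your code. `PutBatcher` and the batch size of 1000 are made up for illustration, and the `sink` callback stands in for a bulk write such as `HTable.put(List<Put>)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: collects items and passes them to a sink in groups.
// In an HBase bolt the sink would wrap a bulk write (assumption: something
// like HTable.put(List<Put>)); here it is just a callback.
class PutBatcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer;

    PutBatcher(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    // Buffer one item; write the whole buffer out once it is full.
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Write out whatever is buffered. Does nothing if the buffer is empty.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Pass a copy so the sink can keep the list after the buffer is cleared.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    public static void main(String[] args) {
        List<Integer> flushSizes = new ArrayList<>();
        PutBatcher<Integer> batcher =
                new PutBatcher<>(1000, batch -> flushSizes.add(batch.size()));
        for (int i = 0; i < 15000; i++) {
            batcher.add(i);
        }
        batcher.flush(); // push out any remainder
        System.out.println(flushSizes.size() + " bulk writes instead of 15000 single puts");
    }
}
```

The sketch is not thread-safe; it assumes one batcher per bolt task. In a real bolt you would probably also want to flush on a timer and ack the tuples only after their batch has been written, so that a failed write gets replayed.
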

On Fri, Jul 11, 2014 at 12:47 PM, Chen Wang <chen.apache.s...@gmail.com> wrote:

> Typo in my previous email:
> The emit method in the query bolt takes about 200 (instead of 20) seconds.
>
> On Fri, Jul 11, 2014 at 11:58 AM, Chen Wang <chen.apache.s...@gmail.com> wrote:
>
>> Hi guys,
>>
>> I have a Storm topology with a single-threaded bolt that queries a large
>> amount of data (from Elasticsearch) and emits to an HBase bolt (10 threads),
>> which does some filtering and then emits to an Avro bolt (10 threads). The
>> Avro bolt simply emits the tuple to an Avro client; the tuples are received
>> by two Flume nodes and then sunk into HDFS. I am testing in local mode.
>>
>> In the query bolt, I am getting around 15000 entries in a batch. The query
>> itself takes about 4 seconds; however, the emit method in the query bolt
>> takes about 20 seconds. Does this mean that the downstream bolts (the HBase
>> bolt and the Avro bolt) cannot keep up with the query bolt?
>>
>> How can I tune my topology to make this process as fast as possible? I
>> tried increasing the HBase bolt to 20 threads, but it does not seem to help.
>>
>> I use shuffleGrouping from the query bolt to the HBase bolt, and from the
>> HBase bolt to the Avro bolt.
>>
>> Thanks for any advice.
>> Chen