Unsubscribe
On Aug 9, 2017, at 8:40 AM, "Hannum, Daniel" <daniel_han...@premierinc.com> wrote:
>I think the problem is that capacity of 3.5. That indicates that there's a backlog on that bolt: the actual time spent processing in the bolt is small, but the total time spent (including wait time) is large. Scale the bolts up, scale the spout down, or make the bolt faster.
>
>From: "fanxi...@travelsky.com" <fanxi...@travelsky.com>
>Reply-To: "user@storm.apache.org" <user@storm.apache.org>
>Date: Tuesday, August 8, 2017 at 11:02 PM
>To: user <user@storm.apache.org>, libo19 <lib...@asiainfo.com>, kabhwan <kabh...@gmail.com>
>Subject: Re: RE: How to Improve Storm Application's Throughput
>
>Hi LiBo, Jungtaek:
>
>Yes, Storm tuning depends on the situation. Thank you for your kind advice.
>The following is one of my situations; any hints from you will be appreciated. The Storm version is 1.0.0.
>The topology has just one spout, which reads messages from Kafka, and one bolt, which parses each message and puts it into HBase.
>[cid:image001.png@01D310E7.19BFB400]
>As you can see from the picture above, the Execute latency of the bolt is small (0.5 ms), but the Complete latency is much larger (4365 ms), which slows down the throughput of the topology.
>Which part consumes so much additional time? The transfer between the spout and the bolt? Or the ack part? I tried to increase the parallelism of the component, but it did not help.
>Is there a tool to analyze where the time goes in general? It would be great to know of one.
>
>There is another thing to explain in the picture above. The Capacity seems to be as high as 1.617, but there are 64 bolts, and the Capacity of most of them is low, as the picture below shows.
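>[cid:image002.png@01D310E7.19BFB400]
>[cid:image003.png@01D310E7.19BFB400]
>So another puzzle is that the Execute latency and the Executed count are both about equal, but the Capacity turns out to be very different. Any hints?
>
>The following is another topology.
>[cid:image004.png@01D310E7.19BFB400]
>Here the history_Put bolt has both a high Capacity and a larger Execute latency. Would that be what leads to the Complete latency of 56964 ms?
>
>Thank you all for your time.
>
>________________________________
>Joshua
>
>From: 李波 <lib...@asiainfo.com>
>Sent: 2017-08-07 16:56
>To: user@storm.apache.org
>Cc: jiangyc_cui...@si-tech.com.cn; 'zhangxl_bds' <zhangxl_...@si-tech.com.cn>
>Subject: RE: How to Improve Storm Application's Throughput
>
>Hello!
>
>Troubleshooting Storm performance takes repeated experimentation until you arrive at values that work in practice. My own troubleshooting process is below; I hope it helps:
>
>1. Kafka PartitionNumber and KafkaSpout's parallelism
>First, determine whether the lag comes from the KafkaSpout being unable to consume fast enough, or from downstream bolts with limited processing capacity becoming congested and backing up the upstream KafkaSpout.
>If the KafkaSpout is the limit, increase the number of Kafka partitions and set the KafkaSpout's parallelism equal to the partition count, raising both step by step until the throughput requirement is met.
>
>2. Bolt's business logic and Bolt's parallelism
>First, check whether the hardware resources are exhausted.
>Next, find out which bolt is causing the congestion. Storm's dynamic topology graph helps here: the redder a component, the worse the congestion. Also look at the Capacity value: a bolt with Capacity above 1 will be congested to some degree.
>Then optimize the processing logic according to your own algorithm to improve efficiency, or increase the bolt's parallelism to raise its processing capacity (provided the hardware resources have not reached their limit).
>
>From your use case, it looks like the later bolts access an external data store to persist the final results. Check whether reads and writes there have high latency and whether the target system is under heavy load.
>
>In most cases the cause is insufficient bolt processing capacity. You need to find the pressure point, optimize the business logic, and adjust the bolt parallelism at the same time.
>Also check whether the bolt's logic is compute-intensive or dependent on external I/O; for compute-intensive bolts the only option is to add workers.
>
>3. Config.TOPOLOGY_BACKPRESSURE_ENABLE
>Is the backpressure mechanism, Config.TOPOLOGY_BACKPRESSURE_ENABLE, turned on? (I believe it only exists from Storm 1.0.0 onward.)
>
>4. Netty
>In addition, to raise Storm's throughput at the underlying transport layer, you can modify the netty settings in storm.yaml to increase netty's send batch size.
>
>5. Executor's throughput params
>// Net io set
>config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 1024 * 16); // default is 1024
>config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 1024 * 16); // batched; default is 1024
>config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 1024 * 16); // individual tuples; default is 1024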
>config.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 200);
>
>________________________________
>李波 13813887096 lib...@asiainfo.com
>北京亚信智慧数据科技有限公司
>AsiaInfo is my home; its growth depends on all of us.
>
>From: 王 纯超 [mailto:wangchunc...@outlook.com]
>Sent: August 7, 2017 10:58
>To: user <user@storm.apache.org>
>Cc: 姜艳春 jiangyc_cui...@si-tech.com.cn <jiangyc_cui...@si-tech.com.cn>; zhangxl_bds <zhangxl_...@si-tech.com.cn>
>Subject: How to Improve Storm Application's Throughput
>
>Hi,
>
>I am now considering how to improve a Storm application's throughput, because I find that the KafkaSpout's consumption speed is slower than the producing speed, and the lag keeps growing. Below are the bolt statistics. I tried moving the tuple projection and filtering logic forward into a custom scheme, with the intention of reducing network traffic. However, from what I observed, things went contrary to my wishes. Am I going the wrong way? Are there any principles for tuning Storm applications? Or could anyone give some suggestions for this specific case?
>[cid:image005.jpg@01D310E7.19BFB400]
>________________________________
>wangchunc...@outlook.com
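
On the Capacity figure discussed in this thread: a minimal sketch of how that number can be read, assuming the definition given in the Storm UI tooltip, i.e. (number executed * average execute latency) / measurement time, over the 10-minute window. The class name `CapacitySketch`, the method `capacity`, and the sample numbers are hypothetical, chosen only for illustration; they are not taken from the screenshots in this thread.

```java
public class CapacitySketch {

    /**
     * Capacity under the assumed Storm UI definition:
     * (tuples executed * average execute latency in ms) / window length in ms.
     */
    public static double capacity(long executed, double executeLatencyMs, long windowMs) {
        if (windowMs <= 0) {
            throw new IllegalArgumentException("windowMs must be positive");
        }
        return executed * executeLatencyMs / windowMs;
    }

    public static void main(String[] args) {
        long windowMs = 10 * 60 * 1000L; // 10-minute window, as in the UI's "last 10m"

        // Hypothetical executor: 1,200,000 tuples at 0.5 ms each in 10 minutes.
        double c = capacity(1_200_000L, 0.5, windowMs);
        System.out.println("capacity = " + c); // prints "capacity = 1.0"
    }
}
```

Read this way, a value near 1.0 suggests the executor spent essentially the whole window inside execute(), which is the situation where the advice above applies: raise the bolt's parallelism, slow the spout, or make the bolt faster.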