For at-most-once semantics you need to disable ACKing by setting the acker
count to 0. Backpressure (BP) is not the mechanism for that.
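A rough sketch of the acker setting, assuming a plain map stands in for Storm's
Config object (which is itself a map of these keys). The key
"topology.acker.executors" is the one Storm reads; with the real Config class
the equivalent call would be conf.setNumAckers(0). The class and method names
below are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class AtMostOnceConfig {
    // Key behind Storm's Config.TOPOLOGY_ACKER_EXECUTORS.
    static final String ACKER_EXECUTORS = "topology.acker.executors";

    // Builds a topology config map with acking turned off (hypothetical helper).
    static Map<String, Object> atMostOnce() {
        Map<String, Object> conf = new HashMap<>();
        // Zero acker executors: tuples are not tracked, so nothing is replayed.
        conf.put(ACKER_EXECUTORS, 0);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(atMostOnce());
    }
}
```

This only turns off tuple tracking; it does not touch backpressure.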
IMO you need BP for what you are trying to do. Dropping messages under BP will
lead to a situation where each component runs at a speed that is unrelated to
the rest of the topology and this will skew your CPU consumption measurements.
Some of the CPU will get wasted in processing messages that get discarded.
Also, this leads to a very degenerate case of at-most-once where there is
heavy data loss.
On Thursday, April 12, 2018, 7:40:30 AM PDT, ravi kiran puttaswamy
<[email protected]> wrote:
Thanks, Roshan, for the detailed answer.
This is for a research project, which assumes a topology configured with
at-most-once semantics. Loss of tuples is tolerable. We want to study the
relationship between maximum throughput and the amount of processing power
(mainly CPU) allocated to the topology. We want to ensure that the topology is
using as much of the CPU provided to it as possible. With backpressure, the
input is getting throttled. (I am additionally using cgroups to control the
amount of CPU allocated to each component, so Storm 2.0 is my only option.)
Is it possible to configure Storm to simulate at-most-once semantics, where the
topology is not throttled by backpressure?
Thanks again for your time,
Warm regards,
ravi
From: Roshan Naik <[email protected]>
Sent: Thursday, April 12, 2018 1:09 PM
To: [email protected]
Subject: Re: Disable backpressure in Storm 2.0
Short answer: It is not possible to disable backpressure in 2.0 and you don't
want to do it either.
Long Answer: That setting applies to the 1.0 backpressure subsystem (zookeeper
based) which is layered on top of the messaging system. Storm 2.0 has a new
messaging subsystem and a very lightweight backpressure model that is tightly
integrated into the messaging subsystem (with no ZK or other external
dependencies).
WRT why you don't want to disable it... If you disable backpressure & allow
upstream components to blindly pump out messages, then you are left with two
options to handle a BP situation. To hold the excess messages, you could allow
the internal queues to keep growing in an unbounded fashion, in which case the
worker process will die with an OOM exception relatively quickly. The other
option is to keep the queues bounded and drop messages once queues are full.
Both options are bad options.
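To make the second option concrete, here is a toy simulation, not Storm's
actual messaging code: a bounded queue fed by a producer that is faster than
the consumer, with messages dropped once the queue is full. All names and
numbers are hypothetical, and each loop iteration stands in for one unit of
time.

```java
import java.util.concurrent.ArrayBlockingQueue;

class DropOnFullDemo {
    // Returns {delivered, dropped} after running the toy producer/consumer loop.
    static long[] run(int capacity, int producedPerTick, int consumedPerTick, int ticks) {
        ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        long delivered = 0;
        long dropped = 0;
        for (int t = 0; t < ticks; t++) {
            for (int i = 0; i < producedPerTick; i++) {
                // offer() returns false instead of blocking when the queue is full.
                if (!queue.offer(i)) {
                    dropped++;
                }
            }
            for (int i = 0; i < consumedPerTick; i++) {
                // poll() returns null when the queue is empty.
                if (queue.poll() != null) {
                    delivered++;
                }
            }
        }
        return new long[] {delivered, dropped};
    }

    public static void main(String[] args) {
        long[] r = run(100, 10, 4, 1000);
        System.out.println("delivered=" + r[0] + " dropped=" + r[1]);
    }
}
```

Once the queue fills, every tick drops the difference between the two rates, and
whatever upstream CPU went into producing those messages is wasted.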
In Storm 1.x you could disable the BP system (default) and fall back on
topology.max.spout.pending as an alternative BP model (if ACKing is enabled).
If both ACKing and BP were disabled, it was easy to crash the workers.
Any reason you are looking to disable BP?
-roshan
On Wednesday, April 11, 2018, 7:42:29 PM PDT, ravi kiran puttaswamy
<[email protected]> wrote:
Hello all,
I am configuring a Storm topology to execute with at-most-once semantics
without backpressure.
Can you let me know how to disable backpressure in Storm 2.0? The
documentation says the earlier config "topology.backpressure.enable" will be
deprecated and removed from Storm 2.0.
Thanks and regards,
ravi