Hello all,
I recently upgraded my Storm cluster from 0.9.5 to 1.0.0, and then to 1.0.1.
In 1.0.0, if a certain bolt runs slowly and tuples queue up, the automatic
backpressure throttles the spout. But after that bolt has consumed and executed
all the queued tuples, the spout does not go back to its normal rate.
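
To make the behaviour I expect concrete, here is a toy Python model of the
throttle as I understand it. It is not Storm's actual code: the class and
function names are made up for illustration, and the 0.9 / 0.4 watermarks are
what I believe the defaults of backpressure.disruptor.high.watermark and
backpressure.disruptor.low.watermark to be.

```python
class WorkerFlag:
    """Toy model of one worker's backpressure flag (hypothetical, not Storm code)."""

    def __init__(self, high=0.9, low=0.4):
        self.high = high   # assumed high watermark (fraction of queue capacity)
        self.low = low     # assumed low watermark
        self.raised = False

    def update(self, fill_ratio):
        # Raise the flag when the queue is nearly full, clear it once it drains.
        if fill_ratio >= self.high:
            self.raised = True
        elif fill_ratio <= self.low:
            self.raised = False
        return self.raised


def spout_throttled(flags):
    # The spout should be throttled while any worker still has its flag raised.
    return any(f.raised for f in flags)


slow_bolt_worker = WorkerFlag()
other_worker = WorkerFlag()

slow_bolt_worker.update(0.95)   # the slow bolt's queue fills up
during_backlog = spout_throttled([slow_bolt_worker, other_worker])

slow_bolt_worker.update(0.0)    # the bolt has drained its queue
after_drain = spout_throttled([slow_bolt_worker, other_worker])

print(during_backlog, after_drain)  # prints: True False
```

What I actually observe looks as if the flag stayed raised after the drain, so
the spout remains throttled.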
The reason is that one backpressure-related node in ZooKeeper (under
STORM_ROOT/backpressure/worker-id/) is not being deleted.
Once I delete it manually, the topology runs well again.
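
For anyone who wants to check for the same symptom, below is a rough sketch of
the manual workaround as a script. It assumes a ZooKeeper client object with
get_children(path) and delete(path) methods, which is what kazoo's KazooClient
provides; the helper name, the FakeZk stand-in, the "/storm" root, and the
"stale-node" child are all placeholders I made up so that the example runs
without a cluster.

```python
def find_backpressure_nodes(zk, backpressure_path, delete=False):
    """List (and optionally delete) the child znodes under backpressure_path.

    zk is any client exposing get_children(path) and delete(path), for example
    a started kazoo KazooClient (assumption: kazoo is installed).
    """
    found = []
    for child in zk.get_children(backpressure_path):
        full_path = backpressure_path.rstrip("/") + "/" + child
        found.append(full_path)
        if delete:
            zk.delete(full_path)
    return found


class FakeZk:
    """In-memory stand-in so the sketch runs without a ZooKeeper server."""

    def __init__(self, tree):
        self.tree = tree  # maps parent path -> list of child names

    def get_children(self, path):
        return list(self.tree.get(path, []))

    def delete(self, path):
        parent, _, name = path.rpartition("/")
        self.tree[parent].remove(name)


# Placeholder path and node name; substitute the real ones from your cluster.
path = "/storm/backpressure/worker-id"
zk = FakeZk({path: ["stale-node"]})

listed = find_backpressure_nodes(zk, path)                 # dry run: list only
removed = find_backpressure_nodes(zk, path, delete=True)   # actually delete
remaining = zk.get_children(path)
print(listed, removed, remaining)
```

Against a real cluster I would run it with delete=False first and only delete
a node once the bolt's queue has clearly drained.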
This problem happens very frequently in 1.0.0. It is less severe in 1.0.1, but
it still happens.
I thought STORM-1696/1731 had solved the problem, and in that hope I upgraded
the Storm cluster again from 1.0.0 to 1.0.1, but it didn't work out as I
expected.
Admittedly, this may be due in part to a bad combination of the configured
topology.max.spout.pending and topology.sleep.spout.wait.strategy.time.ms, but
automatic backpressure is supposed to handle this kind of situation regardless
of those settings, if I understand correctly.
Any comment would be a big help, and I would like to ask whether any of you are
seeing the same problem.