I guess tuples are waiting to be read by your Python bolt.
200 ms per tuple is a lot of processing time. Your upstream bolt/spout might
have emitted thousands of tuples by then, and they have nowhere to go.
Have you measured how many tuples were emitted per sec by your spout?
Add a timestamp to each tuple at emit time and see how long it waits before
the bolt reads it.
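One way to measure that wait (a minimal sketch, not Storm's multilang API; `stamp` and `transit_ms` are hypothetical helpers): append an emit-time timestamp to each tuple's values and compute the tuple's age when the Python bolt finally reads it:

```python
import time

def stamp(values):
    """Append an emit-time timestamp to a tuple's values before emitting."""
    return list(values) + [time.time()]

def transit_ms(values):
    """Age of a stamped tuple in milliseconds, measured at the bolt."""
    return (time.time() - values[-1]) * 1000.0

# Emit side: emit stamp(["some", "payload"]) instead of the raw values.
# Bolt side: log transit_ms(tup_values) before doing the real work; if the
# number keeps growing, tuples are piling up in front of the Python process.
```

If the transit time climbs while execute latency stays flat at ~200 ms, the backlog is in the queue, not in your processing code.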
Thanks Srikanth. I played with 1-2 tasks per executor; it helps a bit, but
not by much. I need a fields grouping, so parallelism alone won't solve my
problem. I profiled the Python bolt, and it takes on the order of 200 ms per
tuple, which is in line with the execute latency. But the process latency is
in the 20s of minutes.
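A back-of-envelope on those two numbers (treating the profiled 200 ms as the per-tuple service time, Little's-law style) suggests the gap is almost entirely queueing in front of each Python process:

```python
# If each tuple needs ~200 ms of Python processing but the observed process
# latency (time from handoff until ack) is ~20 minutes, tuples spend nearly
# all of that time waiting, not executing.
service_time_s = 0.200            # per-tuple execute time from profiling
process_latency_s = 20 * 60.0     # observed process latency, ~20 minutes

# Wait time ~= (tuples ahead in the queue) * (service time), so the queue
# in front of each Python process is roughly:
queued_tuples = process_latency_s / service_time_s
print(round(queued_tuples))  # -> 6000
```

That is thousands of tuples waiting per process, which matches the suggestion that the upstream components are emitting far faster than the Python bolt can drain.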
Why have you configured 32 tasks on 36 executors? Set the number of tasks to
at least 36.
Looks like your Python bolt takes some time to process a tuple. You may
need to tune that or give it more threads.
If you are not maxing out on resource usage, set the number of tasks to 72
and see if that helps.
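Before raising the counts, it may help to sanity-check whether the bolt can keep up at all. A rough capacity estimate, assuming the figures from this thread (36 parallel bolt instances, ~200 ms per tuple):

```python
# Rough throughput ceiling for the Python bolt stage.
executors = 36            # parallel bolt instances actually running
service_time_s = 0.200    # per-tuple processing time

capacity_per_executor = 1.0 / service_time_s        # ~5 tuples/s each
total_capacity = executors * capacity_per_executor  # ~180 tuples/s overall
print(total_capacity)  # -> 180.0

# If the spout emits faster than this, queues grow without bound and
# process latency climbs, however the tasks are distributed.
```

Comparing this ceiling against the measured spout emit rate shows whether more tasks can help or whether the per-tuple cost itself has to come down.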
Srikanth
My Storm topology has Python bolts using multilang support.
Kafka_spout(java) -> format_filter_bolt -> process_bolt
My Storm cluster has three 32-core EC2 instances. format_filter_bolt has 1
executor and 1 task; process_bolt has 36 executors and 32 tasks. I have max
spout pending = 250.
I observe tha