Hi experts, I ran a simple experiment to understand how to tune a Storm topology for production, but I was totally puzzled by the results for complete latency (the average time a tuple tree takes to finish).
I used a simple bolt that does nothing but sleep for a period of time and then ack the input tuple, which comes from a Kafka spout. For simplicity, I limited the worker count to 1 (the acker, spout, and bolt counts are all 1 too), to make sure the spout, bolt, and acker all sit in the same worker.

First I set the sleep time to 0 (no sleep) and max spout pending to 300; the complete latency was around 10ms. All seemed good and reasonable so far. After I changed max spout pending to 1, however, the latency suddenly spiked to 900ms.

I then changed the sleep time to 1ms. With max spout pending at 300, the latency also spiked, to around 600ms, and with max spout pending at 1 it hovered just over 900ms.

It seems as if Storm internally does some tuple batching and waiting, based on both size and time, when moving tuples from one queue to another. Is this expected? Or is it related to disruptor queue handling? Where should I look if I want to customize this behavior? I am not familiar with Clojure yet, so any pointers are much appreciated!

Thanks a lot,
Fang
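
For reference, the single-worker setup described above could be expressed roughly as the following Java sketch. It assumes the standard Storm topology config keys topology.workers, topology.acker.executors, and topology.max.spout.pending, and it uses a plain java.util.Map instead of Storm's Config class so that it runs without Storm on the classpath. The class and method names (ExperimentConfig, build) are hypothetical and are not taken from the actual experiment code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the topology settings used in the experiment.
public class ExperimentConfig {

    // maxSpoutPending is the only setting varied between runs (300 or 1).
    static Map<String, Object> build(int maxSpoutPending) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.workers", 1);          // single worker process
        conf.put("topology.acker.executors", 1);  // single acker
        conf.put("topology.max.spout.pending", maxSpoutPending);
        return conf;
    }

    public static void main(String[] args) {
        // Print the two variants that were compared.
        for (int pending : new int[] {300, 1}) {
            System.out.println(build(pending));
        }
    }
}
```

A map like this would be passed as the configuration when submitting the topology. The sleep time (0 or 1ms) is not part of it, since the sleep happens inside the bolt.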