I am playing with Apache Storm for a real-time image processing application
which requires ultra-low latency. In the topology definition, a single
spout emits a raw image (5 MB) every 1 s and a few bolts process
them. The processing latency of each bolt is acceptable, and the overall
computing delay is around 150 ms.

*However, I find that the message-passing delay between workers on
different nodes is really high. The total of these delays across the 5
successive bolts is around 200 ms.* To calculate this delay, I subtract all
the task latencies from the end-to-end latency. Moreover, I implemented a
timer bolt; the other processing bolts register with this timer bolt to
record a timestamp just before starting the real processing. By comparing
the timestamps of the bolts, I find the delay between each pair of bolts is
as high as I previously noticed.
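The delta computation behind this measurement is straightforward. Here is a
hedged, self-contained sketch of it (the class and method names are my own,
not part of the Storm API): each bolt records a wall-clock timestamp right
before processing, and the gaps between consecutive bolts are computed
offline.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical offline helper mirroring the timer-bolt idea: each bolt
// records System.currentTimeMillis() just before it starts processing,
// and we compute the gap between consecutive bolts afterwards.
public class BoltDelayReport {

    // startTimestamps: bolt id -> start-of-processing timestamp (ms),
    // in topology order. Returns "upstream->downstream" -> gap in ms.
    public static Map<String, Long> interBoltDelays(Map<String, Long> startTimestamps) {
        Map<String, Long> delays = new LinkedHashMap<>();
        String prevBolt = null;
        long prevTs = 0L;
        for (Map.Entry<String, Long> e : startTimestamps.entrySet()) {
            if (prevBolt != null) {
                // gap = downstream bolt's start time minus upstream bolt's
                delays.put(prevBolt + "->" + e.getKey(), e.getValue() - prevTs);
            }
            prevBolt = e.getKey();
            prevTs = e.getValue();
        }
        return delays;
    }

    public static void main(String[] args) {
        Map<String, Long> ts = new LinkedHashMap<>();
        ts.put("bolt1", 1000L);
        ts.put("bolt2", 1060L);
        ts.put("bolt3", 1100L);
        System.out.println(interBoltDelays(ts)); // {bolt1->bolt2=60, bolt2->bolt3=40}
    }
}
```

Note that this relies on the workers' clocks being synchronized (e.g. via
NTP) closely enough relative to the tens-of-milliseconds gaps being measured.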

To analyze the source of this extra delay, I first lowered the sending rate
to one image per second, so there should be no queuing delay caused by high
computing overheads. Also, the Storm UI shows that no bolt is running at
high CPU utilization.

Then I checked the network delay. I am using a 1 Gbps testbed and measured
both RTT and bandwidth. At 1 Gbps, the wire time for a 5 MB image is only
about 40 ms, so the network alone should not produce a delay this high.
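For reference, the arithmetic behind this sanity check (my own back-of-the-
envelope calculation, not Storm code) ignores protocol overhead,
serialization, and queuing, and just asks how long the raw bytes occupy the
link:

```java
// Back-of-the-envelope wire time: payload bits divided by link rate.
public class WireTime {

    // payloadBytes on a link of linkBitsPerSec -> transfer time in ms.
    public static double transferMillis(double payloadBytes, double linkBitsPerSec) {
        double bits = payloadBytes * 8.0;
        return bits / linkBitsPerSec * 1000.0;
    }

    public static void main(String[] args) {
        // 5 MB image over a 1 Gbps link
        double ms = transferMillis(5_000_000.0, 1_000_000_000.0);
        System.out.printf("5 MB over 1 Gbps ~ %.0f ms%n", ms); // ~40 ms
    }
}
```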

Finally, I am thinking about buffer delay. I found that each executor thread
maintains its own send buffer and transfers the data to the worker's send
buffer, and I am not sure how long it takes before the receiving bolt can
see the message. As suggested by the community, I increased the
sender/receiver buffer sizes to 16384 and set STORM_NETTY_MESSAGE_BATCH_SIZE
to 32768. However, it did not help.
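In case it helps others reproduce: these are the overrides I described, as a
storm.yaml / topology-config fragment. The key names are the ones I believe
Storm 0.9.x uses; please double-check them against your version.

```
# topology executor ring-buffer sizes (tuples, must be a power of 2)
topology.executor.send.buffer.size: 16384
topology.executor.receive.buffer.size: 16384
# the key that Config.STORM_NETTY_MESSAGE_BATCH_SIZE maps to, as far as I can tell
storm.messaging.netty.transfer.batch.size: 32768
```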

*My question is: how can I remove or reduce the messaging overhead between
bolts (across workers)?* Is it possible to synchronize the communication
between bolts so that the receiver gets the messages immediately, without
any delay?
