Hi,

I recently ran into a really strange problem. The Storm cluster has 3 machines, and the topology is structured like this: Kafka Spout A -> Bolt B -> Bolt C. I ack every tuple in every bolt, even when an exception may be thrown inside the bolt: in each bolt's execute method I catch all exceptions and then ack the tuple at the end.

Here is the strange part. I log what happens in the spout. On one machine all tuples are acked by the spout, but on the other 2 machines almost all tuples fail, and after 60 seconds the tuples are replayed again and again. By "almost" I mean that at the beginning all tuples fail on the other 2 machines, and after a while a small number of tuples are acked on those 2 machines.

The tuples are definitely failing because of timeouts, but I really don't know why they time out. According to the logs I've printed, I'm sure every tuple is acked at the end of the execute method in every bolt.

So I want to know why some of the tuples fail on those 2 machines. Is there anything I can do to find out what's wrong with the topology or the Storm cluster?

Thanks a lot, and I hope to hear from you.
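
For reference, here is a minimal sketch of the ack pattern described above, assuming the ack sits in a finally block. It is not the actual topology code: `AckCollector`, `RecordingCollector`, and `AlwaysAckBolt` are hypothetical stand-ins written in plain Java rather than Storm classes, and `process` is only a placeholder for the real bolt logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a collector that receives acks; not a Storm class.
interface AckCollector {
    void ack(String tupleId);
}

// Records acked tuple ids so the behaviour can be inspected.
class RecordingCollector implements AckCollector {
    final List<String> acked = new ArrayList<>();

    @Override
    public void ack(String tupleId) {
        acked.add(tupleId);
    }
}

// Simplified bolt: processing may throw, but the tuple is acked once in finally.
class AlwaysAckBolt {
    private final AckCollector collector;

    AlwaysAckBolt(AckCollector collector) {
        this.collector = collector;
    }

    void execute(String tupleId, String payload) {
        try {
            process(payload);
        } catch (Exception e) {
            // Log and swallow, so a bad record never prevents the ack below.
            System.err.println("processing failed for " + tupleId + ": " + e.getMessage());
        } finally {
            collector.ack(tupleId);
        }
    }

    // Hypothetical processing step that rejects empty payloads.
    private void process(String payload) {
        if (payload.isEmpty()) {
            throw new IllegalArgumentException("empty payload");
        }
    }
}

class AckPatternSketch {
    public static void main(String[] args) {
        RecordingCollector collector = new RecordingCollector();
        AlwaysAckBolt bolt = new AlwaysAckBolt(collector);
        bolt.execute("t1", "ok");
        bolt.execute("t2", ""); // process throws here, but the tuple is still acked
        System.out.println(collector.acked);
    }
}
```

In this sketch both tuples end up in `collector.acked`, including the one whose processing threw, which matches what the bolt logs show.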
