We have a topology that is experiencing a massive number of spout
failures without corresponding bolt failures. We have been interpreting
these as tuple timeouts, but we seem to be getting more failures than we
understand to be possible from timeouts alone.

Our topology uses a Kafka spout and is configured with:

    topology.message.timeout.secs = 300
    topology.max.spout.pending = 2500

Based on these settings, I would expect the topology to experience a
maximum of 2500 tuple timeouts per 300 seconds, or about 5,000 in 10
minutes. But the Storm UI shows that after running for about 10 minutes,
the topology has about 50K spout failures and zero bolt failures.
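
To make the arithmetic behind that expectation explicit, here is a rough sketch. It assumes that each pending tuple can time out at most once per timeout window and that there is a single spout task. Since topology.max.spout.pending applies per spout task, more tasks would raise the bound. The function name and the spout_tasks parameter are illustrative only and are not part of Storm.

```python
def max_expected_timeouts(max_spout_pending, timeout_secs, elapsed_secs, spout_tasks=1):
    """Upper bound on tuple timeouts under the assumptions stated above."""
    # Number of complete timeout windows that fit in the elapsed time.
    windows = elapsed_secs // timeout_secs
    # Each window can expire at most max_spout_pending tuples per spout task.
    return max_spout_pending * spout_tasks * windows


# Settings from this topology, after roughly 10 minutes of running.
bound = max_expected_timeouts(max_spout_pending=2500, timeout_secs=300, elapsed_secs=600)
observed = 50_000  # approximate spout failure count shown in the Storm UI

print(bound)             # 5000
print(observed / bound)  # 10.0
```

Under these assumptions the observed count is about ten times the expected ceiling, which is the gap I am trying to explain.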

Am I misunderstanding something that would allow more tuples to time out,
or is there another source of spout failures?

Thanks in advance,
Kevin Peek
