The spout is a KafkaSpout and I only have one spout task.
The reason I set the maxSpoutPending value so high was that in the topology,
each tuple processed in a bolt tends to create more tuples. So, although the
KafkaSpout only receives one message, it results in thousands of tuples
downstream.
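For context, the spout is wired up roughly like this (the Zookeeper host, topic,
and ids are placeholders; package names are from the pre-1.0 storm-kafka module):

    import backtype.storm.topology.TopologyBuilder;
    import storm.kafka.BrokerHosts;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.ZkHosts;

    BrokerHosts hosts = new ZkHosts("zkhost:2181");
    SpoutConfig spoutConfig = new SpoutConfig(hosts, "my-topic", "/kafka-spout", "my-consumer-id");
    TopologyBuilder builder = new TopologyBuilder();
    // parallelism hint of 1 -> a single spout task
    builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1);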
On Sunday, April 17, 2016 6:56 AM, Kevin Conaway
<[email protected]> wrote:
What type of spout is it? How many spout tasks do you have?
maxSpoutPending seems pretty high, so it's possible the tuples could be timing
out in the queue, and if the spout isn't reliable, or if acking is disabled, they
will be discarded.
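These are the two settings involved, as a minimal sketch with illustrative values:

    import backtype.storm.Config;

    Config conf = new Config();
    conf.setNumAckers(0);            // acking disabled: emitted tuples are never tracked
    conf.setMessageTimeoutSecs(30);  // tuples pending longer than this are timed out

A timed-out tuple is only replayed if the spout is reliable; otherwise it is
simply dropped.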
On Friday, April 15, 2016, <[email protected]> wrote:
I've placed two logs in the bolts to verify that tuples are missing: one log
right before the tuple is emitted and another at the beginning of the execute
method for the downstream bolt. These logs should contain the same statements;
however, the downstream bolt is missing close to 1,000 of the 21,000 tuples it
should be receiving.
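The logging looks roughly like this (LOG is an slf4j logger; field names are
illustrative):

    // upstream bolt, immediately before the emit
    LOG.info("emitting id={}", id);
    collector.emit(input, new Values(id, payload));

    // downstream bolt, first statement of execute()
    @Override
    public void execute(Tuple tuple) {
        LOG.info("received id={}", tuple.getStringByField("id"));
        // ... rest of the processing
    }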
On Friday, April 15, 2016 12:56 PM, Kevin Conaway
<[email protected]> wrote:
How are you verifying that the tuples are failing? If you're looking at the Storm
UI for exact counts you may be misled. Storm samples tuples at a configurable
rate (defaulting to 0.05) and extrapolates the metrics shown in the UI from that
sample. For dev or testing purposes you can set _topology.stats.sample.rate_ to 1
in storm.yaml, which will cause Storm to compute stats based on every tuple.
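You can also override it per topology when you submit it, e.g.:

    import backtype.storm.Config;

    Config conf = new Config();
    // equivalent to topology.stats.sample.rate: 1.0 in storm.yaml (dev/test only)
    conf.put(Config.TOPOLOGY_STATS_SAMPLE_RATE, 1.0d);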
On Fri, Apr 15, 2016 at 12:33 PM, <[email protected]> wrote:
Hi all,
I've recently run into a problem where my topology seems to be losing tuples
after some continuous processing. That is, the number of tuples emitted from one
bolt doesn't equal the number of tuples ack'ed by the downstream bolt. It's
also not reporting any tuples as having failed, I ack immediately in each
execute method, and there seem to be no errors in the logs. Due to the nature
of the topology, one bolt tends to emit about 10 tuples for each tuple that it
receives, so the topology itself gets backed up relatively quickly. I've read
in other articles that this can result in a memory leak, which might be the
cause of my lost tuples.
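For reference, each execute method follows roughly this shape (simplified;
process() is a placeholder for the actual work):

    @Override
    public void execute(Tuple input) {
        List<String> results = process(input);    // fans out ~10 tuples per input
        for (String r : results) {
            // anchored emit; an unanchored collector.emit(new Values(r))
            // would leave the children untracked if they were dropped
            collector.emit(input, new Values(r));
        }
        collector.ack(input);                     // ack immediately
    }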
My question is: which configuration properties of the topology could I change to
potentially resolve this problem? I currently have executor.send.buffer and
executor.receive.buffer set to 16384, maxSpoutPending at 500000, and the tuple
timeout at 300000, which I thought would help, but I still haven't seen any
improvement. Or is there something else that might be causing this problem?
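Assuming those names map to the standard Storm options, the current settings
look like:

    Config conf = new Config();
    conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);    // must be a power of two
    conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
    conf.setMaxSpoutPending(500000);     // topology.max.spout.pending
    conf.setMessageTimeoutSecs(300000);  // topology.message.timeout.secs (value is in seconds)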
Thanks
--
Kevin Conaway
http://www.linkedin.com/pub/kevin-conaway/7/107/580/
https://github.com/kevinconaway