Best way of scaling with a single spout

Javier Gonzalez Sat, 09 May 2015 13:59:23 -0700

Hi,

I'm designing an application that will have a single source of data from
AMPS (a high-speed pub-sub system like Kafka). The issue we're facing is
that the spout is much faster than the bolts, and I believe farming the
processing out to different nodes is hurting our performance. Previously we
had several consumers on a queue-like producer, so each spout would likely
transfer to the "nearest" bolts. With the pub-sub model we can't just
consume blindly off the source, because every consumer would receive the
same messages and we would process duplicates.


Any ideas on how to approach this? One idea we're toying with is using more
than one consumer, each with a filter, so that we can ensure there are no
duplicate reads. If any of you have other suggestions, I would be
grateful :)
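
A minimal sketch of the filter idea, assuming every message carries a
stable key (an order id, say) and that each consumer knows its own index and
the total number of consumers. The class name PartitionFilter and the key
format are hypothetical, and nothing here uses the AMPS or Storm APIs. Each
consumer keeps only the keys whose hash falls in its slot, so every message
has exactly one owner:

```java
public class PartitionFilter {
    private final int consumerIndex;  // this consumer's slot, 0-based
    private final int consumerCount;  // total number of consumers

    public PartitionFilter(int consumerIndex, int consumerCount) {
        if (consumerCount <= 0 || consumerIndex < 0 || consumerIndex >= consumerCount) {
            throw new IllegalArgumentException("index must be in [0, count)");
        }
        this.consumerIndex = consumerIndex;
        this.consumerCount = consumerCount;
    }

    // True if this consumer owns the message. floorMod keeps the result
    // non-negative even when hashCode() is negative.
    public boolean accepts(String messageKey) {
        return Math.floorMod(messageKey.hashCode(), consumerCount) == consumerIndex;
    }

    public static void main(String[] args) {
        int consumers = 3;
        String[] keys = {"order-1", "order-2", "order-3", "order-4"}; // made-up keys
        for (String key : keys) {
            for (int i = 0; i < consumers; i++) {
                if (new PartitionFilter(i, consumers).accepts(key)) {
                    System.out.println(key + " -> consumer " + i);
                }
            }
        }
    }
}
```

Applied on the client side like this, each consumer still receives the full
stream and only discards what it does not own, so the saving is in
downstream processing, not in network traffic. Whether an equivalent
condition can be pushed into the subscription itself depends on what the
source's filter language supports, which this sketch does not assume.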

best regards,

-- 
Javier González Nicolini
