Not that I know of. Storm doesn't really have a mechanism for letting one
particular tuple take an unusually long time without also affecting the time
allotted to every other bolt and tuple. You could crank the tuple timeout
(topology.message.timeout.secs) way up, but it applies topology-wide, so other
slow tuples could then linger far longer than you want instead of failing and
being replayed.
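
For what it's worth, bumping that timeout is just a topology config setting;
something like the sketch below (the 7200-second value and topology name are
made up for illustration):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class LongTimeoutTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // ... setSpout()/setBolt() wiring goes here ...

        Config conf = new Config();
        // Raise the tuple timeout for the WHOLE topology, e.g. to 2 hours.
        // Every tuple gets this much time, not just the slow one.
        conf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 7200);

        StormSubmitter.submitTopology("long-timeout-topology", conf,
                builder.createTopology());
    }
}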

If you REALLY want to use Storm for this long processing of a single piece
of data, and supposing it's just one bolt that does the long processing, you
could do something like this:

1. Write the data to disk, HDFS, HBase, an RDBMS, etc.
2. Write a new topology based on a signal spout (ZooKeeper signals, e.g. the
storm-signals contrib project)
3. Give your new topology a ridiculously high amount of time for processing
a single tuple
4. Have your current topology use SignalClient to post a ZooKeeper message
to the new one when the last tuple is ready to be processed (rough sketch
below)
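
Going from memory on the storm-signals API, the sending side would look
roughly like this; the package names, ZooKeeper connect string, signal name,
and the isLastTuple() check are all assumptions/placeholders, so verify them
against the version you pull in:

import java.util.Map;

import backtype.storm.contrib.signals.client.SignalClient; // storm-signals (assumed package)
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class LastTupleSignalBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        // ... normal per-tuple work (writing the data out, step 1) goes here ...

        if (isLastTuple(tuple)) { // your own end-of-sequence check, not a Storm API
            // Hypothetical ZK connect string and signal name.
            SignalClient sc = new SignalClient("zkhost:2181", "export-signal");
            try {
                sc.start();
                sc.send("start-export".getBytes()); // picked up by the new topology's signal spout
                sc.close();
            } catch (Exception e) {
                collector.reportError(e);
            }
        }
        collector.ack(tuple);
    }

    private boolean isLastTuple(Tuple tuple) {
        // Placeholder: replace with however you mark the last tuple of a sequence.
        return "end-of-batch".equals(tuple.getSourceStreamId());
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // nothing emitted downstream in this sketch
    }
}

On the receiving end, the new topology's spout would extend the project's
BaseSignalSpout and override onSignal(byte[]) to emit the tuple that kicks
off the long-running export/upload bolt.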

On Tue, Jun 2, 2015 at 8:04 AM, Subrat Basnet <sub...@myktm.com> wrote:

>  Hi there,
>
> Is it normal to have long-running bolts once in a while? When I say
> long-running, I’m talking about a bolt that takes a few hours to process a
> tuple.
>
> I need to export data, push notifications, and upload files when I reach
> the LAST tuple of a sequence of tuples. This does not happen on every
> tuple.
>
> I am worried that this will make my whole topology hang. My instinct is to
> give this particular bolt higher parallelism, so that other threads are
> available to process while one is hung up.
>
> Please advise what would be the best way to achieve this.
>
> Thanks!
> Subrat
>
> --
> Subrat Basnet
>
>
