"Any problem in computer science can be solved with another layer of
indirection." (David Wheeler)

You could set up the spout to read its list of sources from somewhere, add
to that list when a request comes in, then remove completed sources from
that list in your sink.  Or you could use a separate app to read from the
sources on request and pile that data into a queue like Kestrel for a Storm
spout to read from.
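A rough sketch of that queue-fed pattern, using plain java.util.concurrent
in place of Kestrel and the real Storm spout API (all class and field names
here are made up for illustration): the front-end app drops requests into a
shared queue, and the spout's nextTuple() polls it, tagging each emitted
tuple with a task id so the sink can tell one user's records from another's.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch, not the actual Storm API: requests carry a task id
// so downstream results can be grouped back per task in the sink.
public class QueueFedSpoutSketch {
    // A request pairs a task id with its data source location.
    record SourceRequest(String taskId, String sourceUrl) {}

    // Stands in for Kestrel: the front-end app piles requests in here.
    static final BlockingQueue<SourceRequest> requests = new LinkedBlockingQueue<>();

    // What the spout's nextTuple() would do: poll the queue and emit a
    // tuple tagged with the originating task's id.
    static String nextTuple() throws InterruptedException {
        SourceRequest req = requests.take();
        return req.taskId() + ":" + req.sourceUrl();
    }

    public static void main(String[] args) throws InterruptedException {
        requests.put(new SourceRequest("task-1", "http://example.com/data"));
        System.out.println(nextTuple());
    }
}
```

Because every tuple carries its task id from the moment it leaves the spout,
the topology can run forever and still accumulate results per task in the
sink, rather than needing a fresh topology per request.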




On Wed, Mar 19, 2014 at 8:21 AM, Eugene Dzhurinsky <jdeve...@gmail.com> wrote:

> Hello!
>
> I'm evaluating Storm for the project, which involves processing of many
> distinct small tasks in the following way:
>
> - a user supplies some data source
>
> - spout is attached to the source and produces chunks of data to the
> topology
>
> - bolts process the chunks of data and transform them somehow (in general
> reducing the number of chunks, so the number of records in the sink is
> much smaller than the number of records out of the spout)
>
> - when all records are processed, the results are accumulated and sent
> back to the user.
>
> As far as I understand, a topology is supposed to be kept running forever,
> so I don't really see an easy way to distinguish the records of one task
> from the records of another. Should a new topology be started for each new
> task of a user?
>
> Thank you in advance! The links to any appropriate articles are very
> welcome :)
>
> --
> Eugene N Dzhurinsky
>
