You're right. For that I would have to manage the queue and all the complexities of timeouts. If Storm is not the right place to do this, then what is?
On Tue, Sep 20, 2016 at 8:25 PM, Ambud Sharma <asharma52...@gmail.com> wrote:

> The correct way is to perform time window aggregation using bucketing.
>
> Use the timestamp on your event computed from various stages and send it
> to a single bolt where the aggregation happens. You only emit from this
> bolt once you receive results from both parts.
>
> It's like creating a barrier or the join phase of a fork join pool.
>
> That said, the more important question is: is Storm the right place to do
> this? When you perform time window aggregation you are susceptible to tuple
> timeouts and have to also deal with making sure your aggregation is
> idempotent.
>
> On Sep 20, 2016 7:49 AM, "Harsh Choudhary" <shry.ha...@gmail.com> wrote:
>
>> But how would that solve the syncing problem?
>>
>> On Tue, Sep 20, 2016 at 8:12 PM, Alberto São Marcos <
>> alberto....@gmail.com> wrote:
>>
>>> I would dump the *Bolt-A* results in a shared-data-store/queue and have
>>> a separate workflow with another spout and Bolt-B draining from there
>>>
>>> On Tue, Sep 20, 2016 at 9:20 AM, Harsh Choudhary <shry.ha...@gmail.com>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> I am thinking of doing the following.
>>>>
>>>> Spout subscribed to Kafka and gets JSONs. Spout emits the JSONs as
>>>> individual tuples.
>>>>
>>>> Bolt-A has subscribed to the spout. Bolt-A creates multiple JSONs from
>>>> a JSON and emits them as multiple streams.
>>>>
>>>> Bolt-B receives these streams and does the computation on them.
>>>>
>>>> I need to make a cumulative result from all the multiple JSONs (which
>>>> emerged from a single JSON) in a Bolt. But a bolt static instance
>>>> variable is only shared between tasks per worker. How do I achieve this
>>>> syncing process?
>>>>
>>>>                   --->
>>>> Spout ---> Bolt-A ---> Bolt-B ---> Final result
>>>>                   --->
>>>>
>>>> The final result is per JSON which was read from Kafka.
>>>>
>>>> Or is there any other way to achieve this better?
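For reference, the barrier/join idea described in the quoted reply could be sketched roughly as below. This is plain Java with no Storm dependency, and it is only a sketch under assumptions: the class name `JoinBuffer`, the `partIndex` argument, and the use of a sum as the "cumulative result" are all hypothetical and not from the thread. Parts are stored by index, so a replayed tuple overwrites its earlier copy instead of being counted twice, which is one way to keep the aggregation idempotent.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: joins the partial results that belong to one source JSON. */
class JoinBuffer {
    /** Parts received so far for one source message, plus when the first one arrived. */
    private static final class Pending {
        final long firstSeenMs;
        final Map<Integer, Long> parts = new HashMap<>();
        Pending(long firstSeenMs) { this.firstSeenMs = firstSeenMs; }
    }

    private final int expectedParts;   // assumed known up front (e.g. the fan-out of Bolt-A)
    private final long timeoutMs;      // how long to wait for missing parts
    private final Map<String, Pending> pending = new HashMap<>();

    JoinBuffer(int expectedParts, long timeoutMs) {
        this.expectedParts = expectedParts;
        this.timeoutMs = timeoutMs;
    }

    /**
     * Records one partial result. Returns the combined result (here: a sum)
     * once every expected part has arrived, otherwise an empty Optional.
     */
    Optional<Long> add(String messageId, int partIndex, long value, long nowMs) {
        Pending p = pending.computeIfAbsent(messageId, k -> new Pending(nowMs));
        p.parts.put(partIndex, value);  // a replayed part overwrites, never double counts
        if (p.parts.size() < expectedParts) {
            return Optional.empty();
        }
        pending.remove(messageId);      // barrier reached: release state and emit
        long sum = 0;
        for (long v : p.parts.values()) {
            sum += v;
        }
        return Optional.of(sum);
    }

    /** Drops joins that have waited at least timeoutMs; returns how many were dropped. */
    int evictExpired(long nowMs) {
        int dropped = 0;
        Iterator<Pending> it = pending.values().iterator();
        while (it.hasNext()) {
            if (nowMs - it.next().firstSeenMs >= timeoutMs) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }
}
```

In a topology, something like this would presumably sit inside the single aggregating bolt, with a fields grouping on the source message id so that all parts of one JSON reach the same task, and with `evictExpired` called periodically (for example on tick tuples) using a timeout shorter than the topology's tuple timeout. The state is in memory only, so it is lost if the worker restarts.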