Alternative is to use a control message on a separate stream that goes to all bolt tasks using all grouping. On May 8, 2016 3:20 PM, "Matthias J. Sax" <mj...@apache.org> wrote:
> To synchronize this, use an additional "shut down bolt" that used > parallelism of one. "shut down bolt" must be notified by all parallel > DbBolts after they performed the flush. If all notifications are > received, there are not in-flight message and thus "shut down bolt" can > kill the topology safely. > > -Matthias > > > > On 05/08/2016 07:27 PM, Spico Florin wrote: > > hi! > > there is this solution of sending a poison pill message from the > > spout. on bolt wil receiv your poison pill and will kill topology via > > storm storm nimbus API. one potentential issue whith this approach is > > that due to your topology structure regarding the parralelism of your > > bolts nd the time required by themto excute their bussineess logic, is > > that the poison pill to be swallowed by the one bolt responsilble for > > killing the topology, before all the other messages that are in-flight > > to be processed. the conseuence is that you cannot be sure that all the > > messagess sent by the spout were processed. also sharing the total > > number of sent messages between the excutors in order to shutdown when > > all messages were processed coul be error prone since tuple can be > > processed many times (depending on your guaranteee message processing) > > or they could be failed. > > i coul not find a solution for this. storm is intended to run > > forunbounded data. > > i hope that thrse help, > > regard, > > florin > > > > > > On Sunday, May 8, 2016, Matthias J. Sax <mj...@apache.org > > <mailto:mj...@apache.org>> wrote: > > > > You can get the number of bolt instances from TopologyContext that is > > provided in Bolt.prepare() > > > > Furthermore, you could put a loop into your topology, ie, a bolt > reads > > it's own output; if you broadcast (ie, allGrouping) this > > feedback-loop-stream you can let bolt instances talk to each other. > > > > builder.setBolt("DbBolt", new MyDBBolt()) > > .shuffleGrouping("spout") > > .allGrouping("flush-stream", "DbBolt"); > > > > where "flush-stream" is a second output stream of MyDBBolt() sending > a > > notification tuple after it received the end-of-stream from spout; > > furthermore, if a bolt received the signal via "flush-stream" from > > **all** parallel bolt instances, it can flush to DB. > > > > Or something like this... Be creative! :) > > > > > > -Matthias > > > > > > On 05/08/2016 02:26 PM, Navin Ipe wrote: > > > @Matthias: I agree about the batch processor, but my superior took > the > > > decision to use Storm, and he visualizes more complexity later for > > which > > > he needs Storm. > > > I had considered the "end of stream" tuple earlier (my idea was to > > emit > > > 10 consecutive nulls), but then the question was how do I know how > > many > > > bolt instances have been created, and how do I notify all the > bolts? > > > Because it's only after the last bolt finishes writing to DB, that > I > > > have to shut down the topology. > > > > > > @Jason: Thanks. I had seen storm signals earlier (I think from one > of > > > your replies to someone else) and I had a look at the code too, > > but am a > > > bit wary because it's no longer being maintained and because of the > > > issues: https://github.com/ptgoetz/storm-signals/issues > > > > > > On Sun, May 8, 2016 at 5:40 AM, Jason Kusar <ja...@kusar.net > > <javascript:;> > > > <mailto:ja...@kusar.net <javascript:;>>> wrote: > > > > > > You might want to check out Storm Signals. > > > https://github.com/ptgoetz/storm-signals > > > > > > It might give you what you're looking for. > > > > > > > > > On Sat, May 7, 2016, 11:59 AM Matthias J. Sax > > <mj...@apache.org <javascript:;> > > > <mailto:mj...@apache.org <javascript:;>>> wrote: > > > > > > As you mentioned already: Storm is designed to run > topologies > > > forever ;) > > > If you have finite data, why do you not use a batch > > processor??? > > > > > > As a workaround, you can embed "control messages" in your > > stream > > > (or use > > > an additional stream for them). > > > > > > If you want a topology to shut down itself, you could use > > > > > > `NimbusClient.getConfiguredClient(conf).getClient().killTopology(name);` > > > in your spout/bolt code. > > > > > > Something like: > > > - Spout emit all tuples > > > - Spout emit special "end of stream" control tuple > > > - Bolt1 processes everything > > > - Bolt1 forward "end of stream" control tuple (when it > > received it) > > > - Bolt2 processed everything > > > - Bolt2 receives "end of stream" control tuple => flush > to DB > > > => kill > > > topology > > > > > > But I guess, this is kinda weird pattern. > > > > > > -Matthias > > > > > > On 05/05/2016 06:13 AM, Navin Ipe wrote: > > > > Hi, > > > > > > > > I know Storm is designed to run forever. I also know > about > > > Trident's > > > > technique of aggregation. But shouldn't Storm have a way > to > > > let bolts > > > > know that a certain bunch of processing has been > completed? > > > > > > > > Consider this topology: > > > > Spout------>Bolt-A------>Bolt-B > > > > | |--->Bolt-B > > > > | \--->Bolt-B > > > > |--->Bolt-A------>Bolt-B > > > > | |--->Bolt-B > > > > | \--->Bolt-B > > > > \--->Bolt-A------>Bolt-B > > > > |--->Bolt-B > > > > \--->Bolt-B > > > > > > > > * From Bolt-A to Bolt-B, it is a FieldsGrouping. > > > > * Spout emits only a few tuples and then stops > emitting. > > > > * Bolt A takes those tuples and generates millions of > > tuples. > > > > > > > > > > > > *Bolt-B accumulates tuples that Bolt A sends and needs > > to know > > > when > > > > Spout finished emitting. Only then can Bolt-B start > > writing to > > > SQL.* > > > > > > > > *Questions:* > > > > 1. How can all Bolts B be notified that it is time to > > write to > > > SQL? > > > > 2. After all Bolts B have written to SQL, how to know > > that all > > > Bolts B > > > > have completed writing? > > > > 3. How to stop the topology? I know of > > > > localCluster.killTopology("HelloStorm"), but shouldn't > there > > > be a way to > > > > do it from the Bolt? > > > > > > > > -- > > > > Regards, > > > > Navin > > > > > > > > > > > > > > > -- > > > Regards, > > > Navin > > > >