I ran into this issue a few weeks ago in a bolt, using Futures in Scala. The acks I was doing in new threads never got back (or at least not all of them did). The solution I ended up with was to use a thread-safe queue and then flush the acks out from a tick tuple, as described earlier in this thread. It definitely works, but I don't know if there is a better way.
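
A minimal sketch of that queue-and-flush pattern, in plain Java with no Storm dependency so that it runs on its own. Every name here (AckQueueSketch, execute, onTick, the acked list standing in for the collector) is hypothetical and chosen for illustration; it is not the code from any real topology, only the shape of the idea: worker threads touch nothing but a thread-safe queue, and the single executor thread drains it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a bolt. None of these names are Storm APIs.
class AckQueueSketch {
    // The only state that worker threads are allowed to touch.
    private final ConcurrentLinkedQueue<String> pendingAcks = new ConcurrentLinkedQueue<>();

    // Stand-in for the collector: only the executor thread reads or writes it.
    private final List<String> acked = new ArrayList<>();

    // Called on the executor thread for each incoming tuple (assumed shape).
    CompletableFuture<Void> execute(String tupleId) {
        return CompletableFuture.runAsync(() -> {
            // ... slow asynchronous work would happen here ...
            // The worker never acks directly; it only records the finished id.
            pendingAcks.add(tupleId);
        });
    }

    // Called on the executor thread when a tick arrives (assumed shape).
    // Drains everything finished so far and returns how many acks were flushed.
    int onTick() {
        int flushed = 0;
        String id;
        while ((id = pendingAcks.poll()) != null) {
            acked.add(id); // in a real bolt this is where the collector would ack
            flushed++;
        }
        return flushed;
    }

    List<String> acked() {
        return acked;
    }

    public static void main(String[] args) {
        AckQueueSketch bolt = new AckQueueSketch();
        CompletableFuture<Void> first = bolt.execute("t1");
        CompletableFuture<Void> second = bolt.execute("t2");
        // Wait for the workers only so that the demo is deterministic.
        CompletableFuture.allOf(first, second).join();
        System.out.println("flushed on tick: " + bolt.onTick());
    }
}
```

In an actual topology the tick would come from Storm's tick tuples rather than a direct call, and the queue would probably hold the tuples themselves rather than string ids. The point of the design is only that acking stays on the thread that owns the collector.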
My solution is documented here:
https://scalalala.wordpress.com/2016/04/09/async-stormy-weather/

On Fri, Apr 29, 2016 at 6:12 AM, Stephen Powis <spo...@salesforce.com> wrote:

> You're probably right, if it's an expensive operation to package your data
> into a formatted tuple, it may make more sense for your spout to emit
> something simple, and have a downstream bolt package it up.
>
> In the situation I was describing our spout is executing a SQL statement
> to gather rows that should be emitted as tuples, so the "processing time"
> of the spout is more around how fast or slow that query statement ends up
> being, and less about converting them to tuples -- we're actually querying
> against somewhere around 100 different databases to find the data. Doing
> that in a single thread with the other spouts seemed not ideal, so that's
> why we kicked it off to separate threads.
>
> On Fri, Apr 29, 2016 at 8:53 AM, Hart, James W. <jwh...@seic.com> wrote:
>
>> I’m working on a topology that will be similar to this application so I
>> was thinking about this yesterday.
>>
>> I’m thinking that if there is any significant work to do on messages in
>> making them into tuples, shouldn’t the message be emitted and the work be
>> in a bolt? I don’t think that bolt execute functions have the same
>> limitations as spout nextTuple functions. Now with that said, bolt
>> executes should not be long running computations either, but can be longer
>> than the spout's nextTuple function.
>>
>> *From:* Stephen Powis [mailto:spo...@salesforce.com]
>> *Sent:* Thursday, April 28, 2016 11:59 AM
>> *To:* user@storm.apache.org
>> *Subject:* Re: thread safe output collector
>>
>> So the Spout documentation (assuming it's correct...) here (
>> http://storm.apache.org/releases/current/Concepts.html#spouts) mentions
>> this:
>>
>> "The main method on spouts is nextTuple. nextTuple either emits a new
>> tuple into the topology or simply returns if there are no new tuples to
>> emit. *It is imperative that **nextTuple** does not block for any spout
>> implementation, because Storm calls all the spout methods on the same
>> thread.*"
>>
>> When developing a custom spout we interpreted it to mean that any "real
>> work" done by a spout should be done in a separate thread, and decided on
>> the following pattern which seems somewhat relevant to what you are trying
>> to do in your bolts.
>>
>> On Spout prepare, we create a concurrent/thread safe queue. We then
>> create a new Thread passing it a reference to our thread safe queue. This
>> thread handles finding new data that needs to be emitted. When that thread
>> finds data, it adds it to the shared queue. When the spout's nextTuple()
>> method is called, it looks for data on the shared queue and emits it.
>>
>> I imagine doing async processing in a bolt using one or more threads
>> could work with a similar pattern. On prepare you set up your thread(s)
>> with references to a shared queue. The bolt passes work to be completed to
>> the thread(s), the thread(s) communicate back to the bolt the result via a
>> shared queue. Add in the concept of tick tuples to ensure your bolt checks
>> for completed work on a regular basis?
>>
>> Is there a better way to do this?
>>
>> On Thu, Apr 28, 2016 at 11:22 AM, Julien Nioche <
>> lists.digitalpeb...@gmail.com> wrote:
>>
>> Thanks for the clarification
>>
>> On 28 April 2016 at 15:12, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>>
>> The documentation is wrong. See:
>>
>> https://issues.apache.org/jira/browse/STORM-841
>>
>> At some point it looks like the change made there got reverted. I will
>> reopen it to make sure the documentation is corrected.
>>
>> OutputCollector is NOT thread-safe.
>>
>> -Taylor
>>
>> On Apr 28, 2016, at 9:06 AM, Stephen Powis <spo...@salesforce.com> wrote:
>>
>> "Its perfectly fine to launch new threads in bolts that do processing
>> asynchronously. OutputCollector
>> <http://storm.apache.org/releases/current/javadocs/org/apache/storm/task/OutputCollector.html>
>> is thread-safe and can be called at any time."
>>
>> From the docs for 0.9.6:
>> http://storm.apache.org/releases/0.9.6/Concepts.html#bolts
>>
>> On Thu, Apr 28, 2016 at 9:03 AM, P. Taylor Goetz <ptgo...@gmail.com>
>> wrote:
>>
>> IIRC there was discussion about making it thread safe, but I don't
>> believe it was implemented.
>>
>> -Taylor
>>
>> On Apr 28, 2016, at 3:52 AM, Julien Nioche <lists.digitalpeb...@gmail.com>
>> wrote:
>>
>> Hi Stephen
>>
>> I asked the same question in February but did not get a reply
>>
>> https://mail-archives.apache.org/mod_mbox/storm-user/201602.mbox/%3cca+-fm0urpf3fuerozywpzmxu-kdbgf-zj3wbyr8evsaqjc6...@mail.gmail.com%3E
>>
>> Anyone who could confirm this?
>>
>> Thanks
>>
>> On 27 April 2016 at 14:05, Steven Lewis <steven.le...@walmart.com> wrote:
>>
>> I have conflicting information, and have not checked personally but has
>> the output collector finally been made thread safe for emitting in version
>> 1.0 or 0.10? I know it was a huge problem in 0.9.5 when trying to do
>> threading in a bolt for async future calls and emitting once it returns.
>>
>> This email and any files transmitted with it are confidential and
>> intended solely for the individual or entity to whom they are addressed. If
>> you have received this email in error destroy it immediately. *** Walmart
>> Confidential ***
>>
>> --
>> *Open Source Solutions for Text Engineering*
>>
>> http://www.digitalpebble.com
>> http://digitalpebble.blogspot.com/
>> #digitalpebble <http://twitter.com/digitalpebble>