Yes, my processing logic depends on the task id, so the behavior of different bolt instances is similar but not exactly the same. This is also why I want some instances to be non-critical, so that they do not affect the ack procedure.
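To make the idea concrete, here is a rough sketch of the ack bookkeeping as I understand it: each anchored copy of a tuple gets its own random 64-bit id, and the acker XORs these ids together until the value returns to zero. This is only a toy model in Python, not Storm's actual code. The names ToyAcker and all_grouping_emit, and the notion of a "critical" task set, are made up for illustration.

```python
import random


class ToyAcker:
    """Toy model of XOR-based ack tracking for one root tuple (not Storm code)."""

    def __init__(self):
        self.ack_val = 0      # XOR of every edge id registered or acked so far
        self.started = False  # becomes True once at least one copy is anchored

    def emit_anchored(self):
        """Register one anchored copy and return its (hypothetical) edge id."""
        edge_id = random.getrandbits(64) or 1  # keep the id non-zero
        self.ack_val ^= edge_id
        self.started = True
        return edge_id

    def ack(self, edge_id):
        """Acking XORs the same id again, cancelling it out of ack_val."""
        self.ack_val ^= edge_id

    def fully_processed(self):
        return self.started and self.ack_val == 0


def all_grouping_emit(acker, task_ids, critical):
    """Send one copy to every task; only copies to critical tasks are anchored.

    Returns {task_id: edge_id}, with None for unanchored (non-critical) copies.
    """
    return {t: (acker.emit_anchored() if t in critical else None)
            for t in task_ids}


acker = ToyAcker()
edges = all_grouping_emit(acker, task_ids=[0, 1, 2, 3], critical={0, 1})

acker.ack(edges[0])
print(acker.fully_processed())  # False: critical task 1 has not acked yet
acker.ack(edges[1])
print(acker.fully_processed())  # True: tasks 2 and 3 never entered the tree
```

In this model the copies sent to tasks 2 and 3 behave like unanchored tuples: whether they succeed or fail has no effect on the root tuple being marked as processed.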
I would like to explore the possibility of modifying the ack logic so that tuples emitted to non-critical tasks are not anchored. I will report any progress I make on this matter.

Best Regards.

On 20 December 2016 at 11:32, Ambud Sharma <asharma52...@gmail.com> wrote:

> Storm is a framework built on replays, fundamentally replays are the way
> guaranteed event processing is accomplished. Typically all Bolt Instances
> in a given registered bolt should be running the same code, unless you are
> doing some logic based on task ids. This implies that behavior of bolt
> instances should be similar as well unless experiencing a hardware failure.
>
> If I am understanding your use case you can either duplicate the data
> outside storm (like write it to separate kafka topics) and have independent
> spouts pick it up while keeping everything in 1 topology.
>
> Grouping however is applied to one stream, you can have more than one
> streams to have a logical separation as well.
>
> I am still unsure about why would you get partial failures unless it's
> frequent supervisor failure, maybe you can provide more details about your
> use case.
>
> Lastly ALL groups are usually used for update delivery where
> acknowledgements should matter, however if you can get away with using
> unanchored tuples then that is also an alternative.
>
>
> On Dec 19, 2016 4:17 PM, "Xunyun Liu" <xunyun...@gmail.com> wrote:
>
> Thank you for your answer, Ambud. My use case is that only some of the
> bolt instances are critical that I need them responding to the signal
> through proper acknowledgment. However, the rest of them are non-critical
> which are preferably not to interfere with the normal ack process, much like
> receiving an unanchored tuple. Is there any way that I can achieve this?
>
> On 20 December 2016 at 11:01, Ambud Sharma <asharma52...@gmail.com> wrote:
>
>> Forgot to answer your specific question. Storm message id is internal and
>> will be different so you will see a duplicate tuple with a different id.
>>
>> On Dec 19, 2016 3:59 PM, "Ambud Sharma" <asharma52...@gmail.com> wrote:
>>
>>> Yes that is correct. All downstream tuples must be processed for the
>>> root tuple to be acknowledged.
>>>
>>> Type of grouping does not change the acking behavior.
>>>
>>> On Dec 19, 2016 3:53 PM, "Xunyun Liu" <xunyun...@gmail.com> wrote:
>>>
>>>> Hi there,
>>>>
>>>> As some grouping methods allow sending multiple copies of emitted data
>>>> to downstream bolt instances, I was wondering what will happen if any one
>>>> of them is not able to ack the tuple due to failures. The intrinsic
>>>> question is that, when the all grouping method is used, whether the
>>>> recipients are receiving exactly the same tuple or just duplications with
>>>> different tuple IDs. In the latter case, I believe the tuple tree is
>>>> expanded with regard to the number of parallelisms in downstream and each
>>>> task has to invoke ack() for the root tuple to be fully processed.
>>>>
>>>> Any idea is much appreciated.
>>>>
>>>>
>>>> --
>>>> Best Regards.
>>>> ======================================================
>>>> Xunyun Liu
>>>>
>>>>
>>>>
>
>
> --
> Best Regards.
> ======================================================
> Xunyun Liu
> The Cloud Computing and Distributed Systems (CLOUDS) Laboratory,
> The University of Melbourne
>
>

--
Best Regards.
======================================================
Xunyun Liu
The Cloud Computing and Distributed Systems (CLOUDS) Laboratory,
The University of Melbourne