Re: ack in downstream when using all grouping method

Ambud Sharma Mon, 19 Dec 2016 16:33:23 -0800

Storm is a framework built on replays; fundamentally, replays are how
guaranteed event processing is accomplished. Typically, all bolt instances
of a given registered bolt run the same code, unless you are doing some
logic based on task ids. This implies that the behavior of the bolt
instances should be similar as well, unless one of them is experiencing a
hardware failure.

If I understand your use case, you can duplicate the data outside Storm
(for example, write it to separate Kafka topics) and have independent
spouts pick it up, while keeping everything in one topology.

Grouping, however, is applied per stream, so you can also define more than
one stream to get a logical separation.

I am still unsure why you would get partial failures, unless it's frequent
supervisor failure. Maybe you can provide more details about your use
case.

Lastly, ALL grouping is usually used for update delivery, where
acknowledgements should matter. However, if you can get away with using
unanchored tuples, then that is also an alternative.
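
To make the acking behavior discussed in this thread concrete, here is a
minimal sketch in plain Java. It is a toy model of XOR-based ack tracking,
not Storm's actual classes: AckTreeSketch, emitRoot, emitToAll, ack and
isFullyProcessed are hypothetical names invented for illustration, and the
single-bit ids are an assumption made to keep the run deterministic (Storm
itself uses random 64-bit ids). It shows that with anchored copies every
downstream task has to ack before the root tuple completes, while
unanchored copies are not tracked at all.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of XOR-based ack tracking; these are NOT Storm's real classes. */
public class AckTreeSketch {
    // Per root tuple: XOR of the ids of all tuples still awaiting an ack.
    private final Map<Long, Long> pending = new HashMap<>();
    private int nextBit = 0;

    // Assumption: single-bit ids, so the XOR of distinct ids is never zero.
    private long freshId() {
        if (nextBit >= 64) {
            throw new IllegalStateException("sketch supports at most 64 tuples");
        }
        return 1L << nextBit++;
    }

    /** The spout emits a root tuple, which is itself awaiting an ack. */
    public long emitRoot() {
        long rootId = freshId();
        pending.put(rootId, rootId);
        return rootId;
    }

    /** One copy per downstream task, each with its own id (all grouping). */
    public List<Long> emitToAll(long rootId, int tasks, boolean anchored) {
        List<Long> copies = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            long id = freshId();
            copies.add(id);
            if (anchored) {
                // Anchored copies join the tuple tree of the root.
                pending.merge(rootId, id, (a, b) -> a ^ b);
            }
        }
        return copies;
    }

    /** Acking a tracked tuple XORs its id out of the root's pending value. */
    public void ack(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    /** The root is done once every tracked id has been XORed out again. */
    public boolean isFullyProcessed(long rootId) {
        return pending.get(rootId) == 0L;
    }

    public static void main(String[] args) {
        AckTreeSketch acker = new AckTreeSketch();

        long root = acker.emitRoot();
        List<Long> copies = acker.emitToAll(root, 3, true); // anchored, 3 tasks
        acker.ack(root, root);            // the emitting bolt acks its input
        acker.ack(root, copies.get(0));
        acker.ack(root, copies.get(1));
        System.out.println(acker.isFullyProcessed(root));   // false
        acker.ack(root, copies.get(2));
        System.out.println(acker.isFullyProcessed(root));   // true

        long root2 = acker.emitRoot();
        acker.emitToAll(root2, 3, false); // unanchored copies are not tracked
        acker.ack(root2, root2);
        System.out.println(acker.isFullyProcessed(root2));  // true
    }
}
```

In this model the unanchored copies are never reported to the tracker, so
a failure in one of those tasks cannot hold back or replay the root tuple.
That is the trade-off behind the unanchored alternative.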

On Dec 19, 2016 4:17 PM, "Xunyun Liu" <xunyun...@gmail.com> wrote:

Thank you for your answer, Ambud. My use case is that only some of the bolt
instances are critical, and I need those to respond to the signal through
proper acknowledgment. The rest are non-critical and should preferably not
interfere with the normal ack process, much like receiving an unanchored
tuple. Is there any way that I can achieve this?

On 20 December 2016 at 11:01, Ambud Sharma <asharma52...@gmail.com> wrote:

> Forgot to answer your specific question. The Storm message id is internal
> and will be different, so you will see a duplicate tuple with a different id.
>
> On Dec 19, 2016 3:59 PM, "Ambud Sharma" <asharma52...@gmail.com> wrote:
>
>> Yes that is correct. All downstream tuples must be processed for the root
>> tuple to be acknowledged.
>>
>> The type of grouping does not change the acking behavior.
>>
>> On Dec 19, 2016 3:53 PM, "Xunyun Liu" <xunyun...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> As some grouping methods allow sending multiple copies of emitted data
>>> to downstream bolt instances, I was wondering what will happen if any one
>>> of them is not able to ack the tuple due to failures. The underlying
>>> question is whether, when the all grouping method is used, the
>>> recipients receive exactly the same tuple or just duplicates with
>>> different tuple IDs. In the latter case, I believe the tuple tree is
>>> expanded according to the parallelism of the downstream bolt, and each
>>> task has to invoke ack() for the root tuple to be fully processed.
>>>
>>> Any idea is much appreciated.
>>>
>>>
>>> --
>>> Best Regards.
>>> ======================================================
>>> Xunyun Liu
>>> 
>>>
>>>

-- 
Best Regards.
======================================================
Xunyun Liu
The Cloud Computing and Distributed Systems (CLOUDS) Laboratory,
The University of Melbourne
