Re: ack in downstream when using all grouping method

Xunyun Liu Mon, 19 Dec 2016 18:02:28 -0800


Yes, my processing logic depends on the task ID, so the behavior of
different bolt instances is similar but not exactly the same. This is also
why I want some instances to be non-critical, so that they do not affect
the ack procedure.


I would like to explore the possibility of modifying the ack logic so that
tuples emitted to non-critical tasks are not anchored. I will report any
progress I have on this matter.
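
To make the idea concrete, here is a minimal sketch of what I have in mind.
It is not Storm code: it is a toy model of an XOR-style ack ledger for one
root tuple, and every name in it (AckLedgerSketch, rootAcked, criticalTasks,
failedTask) is hypothetical. It assumes that tasks 0..criticalTasks-1 are
the critical ones and uses one distinct bit per task as a stand-in for the
random 64-bit tuple IDs. Copies sent to critical tasks are anchored, copies
sent to the other tasks are not, so only the critical tasks can hold up the
root tuple.

```java
// Toy model of an XOR-based ack ledger for a single root tuple.
// NOT Storm code; all names here are hypothetical and for illustration only.
class AckLedgerSketch {
    // XOR of every anchored tuple id that was emitted or acked so far.
    private long ackVal = 0L;

    // Register an anchored copy: it must be acked before the tree completes.
    void anchor(long tupleId) { ackVal ^= tupleId; }

    // Ack a copy: XOR-ing the same id a second time cancels it out.
    void ack(long tupleId) { ackVal ^= tupleId; }

    // The root counts as fully processed once every anchored id has cancelled.
    boolean fullyProcessed() { return ackVal == 0L; }

    // Send one copy to each of `tasks` downstream tasks. Copies to tasks
    // 0..criticalTasks-1 are anchored; the rest are unanchored and never
    // touch the ledger. `failedTask` never acks (-1 means no failure).
    static boolean rootAcked(int tasks, int criticalTasks, int failedTask) {
        if (tasks < 0 || tasks > 63) {
            throw new IllegalArgumentException("tasks must be in 0..63");
        }
        AckLedgerSketch ledger = new AckLedgerSketch();
        int anchored = Math.min(tasks, criticalTasks);

        // Emit phase: each anchored copy gets its own id (one bit per task,
        // standing in for a random 64-bit id).
        for (int task = 0; task < anchored; task++) {
            ledger.anchor(1L << task);
        }
        // Ack phase: every anchored task acks its copy unless it failed.
        for (int task = 0; task < anchored; task++) {
            if (task != failedTask) {
                ledger.ack(1L << task);
            }
        }
        return ledger.fullyProcessed();
    }

    public static void main(String[] args) {
        System.out.println(rootAcked(4, 4, -1)); // all anchored, all ack
        System.out.println(rootAcked(4, 4, 2));  // all anchored, task 2 fails
        System.out.println(rootAcked(4, 2, 3));  // non-critical task 3 fails
        System.out.println(rootAcked(4, 2, 1));  // critical task 1 fails
    }
}
```

With these inputs the four calls return true, false, true and false: a
failure in a non-critical task no longer blocks the root tuple, while a
failure in a critical task still does.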

Best Regards.



On 20 December 2016 at 11:32, Ambud Sharma <[email protected]> wrote:

> Storm is a framework built on replays; fundamentally, replays are how
> guaranteed event processing is accomplished. Typically, all bolt instances
> of a given registered bolt should be running the same code, unless you are
> doing some logic based on task IDs. This implies that the behavior of bolt
> instances should be similar as well, unless one is experiencing a hardware
> failure.
>
> If I am understanding your use case, you can duplicate the data
> outside Storm (for example, write it to separate Kafka topics) and have
> independent spouts pick it up while keeping everything in one topology.
>
> A grouping, however, is applied to one stream, so you can also use more
> than one stream to get a logical separation.
>
> I am still unsure why you would get partial failures unless there are
> frequent supervisor failures; maybe you can provide more details about your
> use case.
>
> Lastly, ALL grouping is usually used for update delivery, where
> acknowledgements should matter; however, if you can get away with using
> unanchored tuples, that is also an alternative.
>
>
> On Dec 19, 2016 4:17 PM, "Xunyun Liu" <[email protected]> wrote:
>
> Thank you for your answer, Ambud. My use case is that only some of the
> bolt instances are critical, and I need those to respond to the signal
> through proper acknowledgment. The rest are non-critical and should
> preferably not interfere with the normal ack process, much like tasks
> receiving an unanchored tuple. Is there any way that I can achieve this?
>
> On 20 December 2016 at 11:01, Ambud Sharma <[email protected]> wrote:
>
>> Forgot to answer your specific question. Storm message id is internal and
>> will be different so you will see a duplicate tuple with a different id.
>>
>> On Dec 19, 2016 3:59 PM, "Ambud Sharma" <[email protected]> wrote:
>>
>>> Yes that is correct. All downstream tuples must be processed for the
>>> root tuple to be acknowledged.
>>>
>>> Type of grouping does not change the acking behavior.
>>>
>>> On Dec 19, 2016 3:53 PM, "Xunyun Liu" <[email protected]> wrote:
>>>
>>>> Hi there,
>>>>
>>>> As some grouping methods send multiple copies of the emitted data
>>>> to downstream bolt instances, I was wondering what happens if any one
>>>> of them is not able to ack the tuple due to a failure. The underlying
>>>> question is whether, when the all grouping method is used, the
>>>> recipients receive exactly the same tuple or just duplicates with
>>>> different tuple IDs. In the latter case, I believe the tuple tree is
>>>> expanded according to the downstream parallelism, and each
>>>> task has to invoke ack() for the root tuple to be fully processed.
>>>>
>>>> Any idea is much appreciated.
>>>>
>>>>
>>>> --
>>>> Best Regards.
>>>> ======================================================
>>>> Xunyun Liu
>>>> 
>>>>
>>>>
>
>
> --
> Best Regards.
> ======================================================
> Xunyun Liu
> The Cloud Computing and Distributed Systems (CLOUDS) Laboratory,
> The University of Melbourne
>
>
>


-- 
Best Regards.
======================================================
Xunyun Liu
The Cloud Computing and Distributed Systems (CLOUDS) Laboratory,
The University of Melbourne
