>> If you fail to start the function, you immediately break people's
> functions when they upgrade to this version. How about notifying them once
> via logger (WARN)?


I tend to fail. Although this breaks the current logic. but the current 
implementation can be considered is a bug.

> It will flood their logs if they used it wrong. Maybe write to log once?


Agree, I changed PIP.

Thanks,
Baodi Shi

> 2022年5月31日 23:5720,Asaf Mesika <asaf.mes...@gmail.com> 写道:
> 
> Hi Baodi,
> 
> Regarding
> 
>> 
>>   1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
>>   be true. If the validation fails, let the function fail to start (This
>>   temporarily resolves the configuration ambiguity). When autoAck is
>>   subsequently removed, the message will be acked immediately after receiving
>>   the message.
>> 
>> 
>> If you fail to start the function, you immediately break people's
> functions when they upgrade to this version. How about notifying them once
> via logger (WARN)?
> 
> Regarding
> 
>> 
>>   1.
>> 
>> 
>>   When user call record.ack() in function, just ProcessingGuarantees ==
>>   MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>   call record.ack() is invalid(print warn log).
>> 
>> It will flood their logs if they used it wrong. Maybe write to log once?
> 
> On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolong...@icloud.com.invalid>
> wrote:
> 
>> Hi, Asaf.
>> 
>> Thanks review.
>> 
>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>> understand it is achieved using transactions, thus the consumption of
>> that
>>> message and production of any messages, as a result, are considered one
>>> atomic unit - either message acknowledged and messages produced or
>> neither.
>> 
>> 
>> Not using transactions now, I understand: EFFECTIVELY_ONCE = ATLEAST_ONCE
>> + Message Deduplication.
>> 
>> @Neng Lu @Rui Fu Can help make sure?
>> 
>>>> I would issue a WARN when reading configuring the function (thus emitted
>>> once) when the user actively configured autoAck=false and warn them that
>>> this configuration is deprecated and they should switch to the MANUAL
>>> ProcessingGuarantee configuration option.
>> 
>> 
>> Added to API Change(2)
>> 
>>>> suggest you clarify what existing behavior remains for backward
>>> compatibility with the appropriate comment that this is deprecated and
>> will
>>> be removed.
>> 
>> Yes, I have rewritten it, please see Implementation(1)
>> 
>>> 5. Regarding Test Plan
>>> * I would add: Validate the test of autoAck=false still works (you
>> haven't
>>> broken anything)
>>> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>> 
>> 
>> Nice, I added to PIP.
>> 
>> 
>> Thanks,
>> Baodi Shi
>> 
>>> 2022年5月30日 22:0011,Asaf Mesika <asaf.mes...@gmail.com> 写道:
>>> 
>>> Thanks for applying the fixes.
>>> 
>>> 1. Regarding
>>> 
>>>> 
>>>>  - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>>>  finished execution. Depends on pulsar deduplication, and provides
>>>>  end-to-end effectively once processing.
>>>> 
>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>> understand it is achieved using transactions, thus the consumption of
>> that
>>> message and production of any messages, as a result, are considered one
>>> atomic unit - either message acknowledged and messages produced or
>> neither.
>>> 
>>> 2. Regarding
>>> 
>>>> 
>>>>  1. Indication of autoAck is deprecated, and not use it in the code.
>>>>  (and also Function.proto)
>>>> 
>>>> * I would issue a WARN when reading configuring the function (thus
>> emitted
>>> once) when the user actively configured autoAck=false and warn them that
>>> this configuration is deprecated and they should switch to the MANUAL
>>> ProcessingGuarantee configuration option.
>>> 
>>> 3. Regarding
>>> 
>>>> 
>>>>  1. When the delivery semantic is ATMOST_ONCE, the message will be
>>>>  acked immediately after receiving the message, no longer affected by
>> the
>>>>  autoAck configuration.
>>>> 
>>>> I suggest you clarify what existing behavior remains for backward
>>> compatibility with the appropriate comment that this is deprecated and
>> will
>>> be removed.
>>> 
>>> 4. Regarding
>>> 
>>>> 
>>>>  1.
>>>> 
>>>>  When user call record.ack() in function, just ProcessingGuarantees ==
>>>>  MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>>>  call record.ack() is invalid(print warn log).
>>>> 
>>>> That might blast WARN messages to the user. Perhaps save the fact that
>> you
>>> have printed a WARN message once and only print when the variable is not
>>> set?
>>> 
>>> 5. Regarding Test Plan
>>> * I would add: Validate the test of autoAck=false still works (you
>> haven't
>>> broken anything)
>>> * I would add: Validate existing ProcessingGuarantee test for AtMostOnce,
>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>> 
>>> 
>>> 
>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com
>> .invalid>
>>> wrote:
>>> 
>>>> Hi, Mesika.
>>>> 
>>>> Thanks review.
>>>> 
>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>> NONE.
>>>> As
>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>> that
>>>>>> the function executor calls acknowledge, in specific timing:
>>>> 
>>>> 
>>>> Added, Refer to the latest pip.
>>>> https://github.com/apache/pulsar/issues/15560
>>>> 
>>>>>> 3. Removing autoAck from Function Config breaks backward
>> compatibility,
>>>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>>>> release handle breaking change?
>>>> 
>>>> As suggested by @neng, They will be marked as deprecated first and
>> clearly
>>>> stated in the documentation. Remove it after 2~3 release.
>>>> 
>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>> understand how you use that enum value *inside* the class/
>>>> 
>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
>> PIP
>>>> API Changes(3)
>>>> 
>>>> Thanks,
>>>> Baodi Shi
>>>> 
>>>>> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道:
>>>>> 
>>>>> Hi Baodi,
>>>>> 
>>>>> Nice work. Put some suggestions below, ptal.
>>>>> 
>>>>> 1. API changes should also contain the changes of `Function.proto`,
>>>> including new `ProcessingGuarantees` option and `autoAck`.
>>>>> 2. Please be sure the other language runtimes (like Python, Golang) do
>>>> support similar `record.ack()` function from the context, if no, it
>> might
>>>> have some API changes for different runtime we well.
>>>>> 
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Rui Fu
>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道:
>>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>>>> - Added --> add
>>>>>> 
>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>> NONE.
>>>> As
>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>> that
>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>>>> immediately acknowledged and only then the function is executed, thus
>>>>>> guaranteeing it will not run more than once
>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>>>> finished
>>>>>> execution, thus it will be run at least once.
>>>>>> - `MANUAL` - Signals to the user that it is up to them to acknowledge
>>>> the
>>>>>> message, inside the function.
>>>>>> 
>>>>>> I think if you couple that change with adding the explanation I wrote
>>>>>> above to the documentation it will become crystal clear (hopefully)
>>>> what is
>>>>>> a Processing Guarantee exactly and what each value signifies.
>>>>>> 
>>>>>> 3. Removing autoAck from Function Config breaks backward
>> compatibility,
>>>>>> thus shouldn't this be strongly reflected in the PIP - how does Pulsar
>>>>>> release handle breaking change?
>>>>>> 
>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>> understand how you use that enum value *inside* the class/
>>>>>> 
>>>>>> 
>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote:
>>>>>> 
>>>>>>> Some suggestions:
>>>>>>> 
>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>>>> code.
>>>>>>> And documented clearly it's deprecated for the following 2~3 release.
>>>> And
>>>>>>> then delete it.
>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>>>> defaults
>>>>>>> to ATLEAST_ONCE.
>>>>>>> 3. Need to let users know the behavior when they call `record.ack()`
>>>> inside
>>>>>>> the function implementation.
>>>>>>> 
>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com
>>>> .invalid>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Pulsar community,
>>>>>>>> 
>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for Function
>>>> add
>>>>>>>> NONE delivery semantics
>>>>>>>> 
>>>>>>>> Let me know what you think.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Baodi Shi
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ## Motivation
>>>>>>>> 
>>>>>>>> Currently Function supports three delivery semantics, and also
>>>> provides
>>>>>>>> autoAck to control whether to automatically ack.
>>>>>>>> Because autoAck affects the delivery semantics of Function, it can
>> be
>>>>>>>> confusing for users to understand the relationship between these two
>>>>>>>> parameters.
>>>>>>>> 
>>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
>> and
>>>>>>>> `autoAck == false`, then the framework will not help the user to ack
>>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>>>> 
>>>>>>>> The delivery semantics provided by Function should be clear. When
>> the
>>>>>>> user
>>>>>>>> sets the guarantees, the framework should ensure point-to-point
>>>> semantic
>>>>>>>> processing and cannot be affected by other parameters.
>>>>>>>> 
>>>>>>>> ## Goal
>>>>>>>> 
>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>>>> 
>>>>>>>> The original intention of `autoAck` semantics is that users want to
>>>>>>>> control the timing of ack by themselves. When autoAck == false, the
>>>>>>>> processing semantics provided by the framework should be invalid.
>>>> Then we
>>>>>>>> can add `NONE` processing semantics to replace the autoAck == false
>>>>>>>> scenario.
>>>>>>>> 
>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>>>> framework
>>>>>>>> does not help the user to do any ack operations, and the ack is left
>>>> to
>>>>>>> the
>>>>>>>> user to handle. In other cases, the framework guarantees processing
>>>>>>>> semantics.
>>>>>>>> 
>>>>>>>> ## API Changes
>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>>>> ``` java
>>>>>>>> public enum ProcessingGuarantees {
>>>>>>>> ATLEAST_ONCE,
>>>>>>>> ATMOST_ONCE,
>>>>>>>> EFFECTIVELY_ONCE,
>>>>>>>> NONE
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>>>> ``` java
>>>>>>>> public class FunctionConfig {
>>>>>>>> - private Boolean autoAck;
>>>>>>>> }
>>>>>>>> ```
>>>>>>>> 
>>>>>>>> ## Implementation
>>>>>>>> 
>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees !=
>>>> NONE`
>>>>>>>> can be ack.
>>>>>>>> 
>>>>>>>> <
>>>>>>>> 
>>>>>>> 
>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will be
>>>> acked
>>>>>>>> immediately after receiving the message, no longer affected by the
>>>>>>> autoAck
>>>>>>>> configuration.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>> 
>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>>>> 
>>>>>>>> 3. When user call `record.ack()` in function, just
>>>> `ProcessingGuarantees
>>>>>>>> == NONE` can be work.
>>>>>>>> 
>>>>>>>> ## Plan test
>>>>>>>> The main test and assert is that when ProcessingGuarantees == NONE,
>>>> the
>>>>>>>> function framework will not do any ack operations for the user.
>>>>>>>> 
>>>>>>>> ## Compatibility
>>>>>>>> 1. This change will invalidate the user's setting of autoAck, which
>>>>>>> should
>>>>>>>> be explained in the documentation and provide parameter verification
>>>> to
>>>>>>>> remind the user.
>>>>>>>> 2. Runtimes of other languages need to maintain consistent
>> processing
>>>>>>>> logic (python, go).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Neng
>>>>>>> 
>>>> 
>>>> 
>> 
>> 

Reply via email to