Hi, I found some problems with `FunctionWindows` when I implemented this pip, 
and I added it to PIP: Implementation[4].

After I submit the first PR, you can refer to it.

Thanks,
Baodi Shi

> On Jun 2, 2022, at 18:4232, 石宝迪 <wudixiaolong...@icloud.com.INVALID> wrote:
> 
>> Ok. I would add in the Compatability change another section with bold or
>> capital letters to highlight you're creating a breaking change. It should
>> be reflected in the release notes somehow - don't know the process for that.
> 
> Ok, I added to `Incompatible case`. PTAL.
> 
> 
> Thanks,
> Baodi Shi
> 
>> 2022年6月2日 18:0404,Asaf Mesika <asaf.mes...@gmail.com> 写道:
>> 
>>> 
>>> I tend to fail. Although this breaks the current logic. but the current
>>> implementation can be considered is a bug.
>> 
>> Ok. I would add in the Compatability change another section with bold or
>> capital letters to highlight you're creating a breaking change. It should
>> be reflected in the release notes somehow - don't know the process for that.
>> 
>> On Tue, May 31, 2022 at 7:16 PM 石宝迪 <wudixiaolong...@icloud.com.invalid>
>> wrote:
>> 
>>>>> If you fail to start the function, you immediately break people's
>>>> functions when they upgrade to this version. How about notifying them
>>> once
>>>> via logger (WARN)?
>>> 
>>> 
>>> I tend to fail. Although this breaks the current logic. but the current
>>> implementation can be considered is a bug.
>>> 
>>>> It will flood their logs if they used it wrong. Maybe write to log once?
>>> 
>>> 
>>> Agree, I changed PIP.
>>> 
>>> Thanks,
>>> Baodi Shi
>>> 
>>>> 2022年5月31日 23:5720,Asaf Mesika <asaf.mes...@gmail.com> 写道:
>>>> 
>>>> Hi Baodi,
>>>> 
>>>> Regarding
>>>> 
>>>>> 
>>>>> 1. When the delivery semantic is ATMOST_ONCE, add verify autoAck must
>>>>> be true. If the validation fails, let the function fail to start (This
>>>>> temporarily resolves the configuration ambiguity). When autoAck is
>>>>> subsequently removed, the message will be acked immediately after
>>> receiving
>>>>> the message.
>>>>> 
>>>>> 
>>>>> If you fail to start the function, you immediately break people's
>>>> functions when they upgrade to this version. How about notifying them
>>> once
>>>> via logger (WARN)?
>>>> 
>>>> Regarding
>>>> 
>>>>> 
>>>>> 1.
>>>>> 
>>>>> 
>>>>> When user call record.ack() in function, just ProcessingGuarantees ==
>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL, user
>>>>> call record.ack() is invalid(print warn log).
>>>>> 
>>>>> It will flood their logs if they used it wrong. Maybe write to log once?
>>>> 
>>>> On Tue, May 31, 2022 at 12:24 PM Baozi <wudixiaolong...@icloud.com
>>> .invalid>
>>>> wrote:
>>>> 
>>>>> Hi, Asaf.
>>>>> 
>>>>> Thanks review.
>>>>> 
>>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>>>>> understand it is achieved using transactions, thus the consumption of
>>>>> that
>>>>>> message and production of any messages, as a result, are considered one
>>>>>> atomic unit - either message acknowledged and messages produced or
>>>>> neither.
>>>>> 
>>>>> 
>>>>> Not using transactions now, I understand: EFFECTIVELY_ONCE =
>>> ATLEAST_ONCE
>>>>> + Message Deduplication.
>>>>> 
>>>>> @Neng Lu @Rui Fu Can help make sure?
>>>>> 
>>>>>>> I would issue a WARN when reading configuring the function (thus
>>> emitted
>>>>>> once) when the user actively configured autoAck=false and warn them
>>> that
>>>>>> this configuration is deprecated and they should switch to the MANUAL
>>>>>> ProcessingGuarantee configuration option.
>>>>> 
>>>>> 
>>>>> Added to API Change(2)
>>>>> 
>>>>>>> suggest you clarify what existing behavior remains for backward
>>>>>> compatibility with the appropriate comment that this is deprecated and
>>>>> will
>>>>>> be removed.
>>>>> 
>>>>> Yes, I have rewritten it, please see Implementation(1)
>>>>> 
>>>>>> 5. Regarding Test Plan
>>>>>> * I would add: Validate the test of autoAck=false still works (you
>>>>> haven't
>>>>>> broken anything)
>>>>>> * I would add: Validate existing ProcessingGuarantee test for
>>> AtMostOnce,
>>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>>>> 
>>>>> 
>>>>> Nice, I added to PIP.
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Baodi Shi
>>>>> 
>>>>>> 2022年5月30日 22:0011,Asaf Mesika <asaf.mes...@gmail.com> 写道:
>>>>>> 
>>>>>> Thanks for applying the fixes.
>>>>>> 
>>>>>> 1. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> - EFFECTIVELY_ONCE: The message is acknowledged *after* the function
>>>>>>> finished execution. Depends on pulsar deduplication, and provides
>>>>>>> end-to-end effectively once processing.
>>>>>>> 
>>>>>>> I'm not entirely sure that is accurate. The Effectively-Once as I
>>>>>> understand it is achieved using transactions, thus the consumption of
>>>>> that
>>>>>> message and production of any messages, as a result, are considered one
>>>>>> atomic unit - either message acknowledged and messages produced or
>>>>> neither.
>>>>>> 
>>>>>> 2. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> 1. Indication of autoAck is deprecated, and not use it in the code.
>>>>>>> (and also Function.proto)
>>>>>>> 
>>>>>>> * I would issue a WARN when reading configuring the function (thus
>>>>> emitted
>>>>>> once) when the user actively configured autoAck=false and warn them
>>> that
>>>>>> this configuration is deprecated and they should switch to the MANUAL
>>>>>> ProcessingGuarantee configuration option.
>>>>>> 
>>>>>> 3. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> 1. When the delivery semantic is ATMOST_ONCE, the message will be
>>>>>>> acked immediately after receiving the message, no longer affected by
>>>>> the
>>>>>>> autoAck configuration.
>>>>>>> 
>>>>>>> I suggest you clarify what existing behavior remains for backward
>>>>>> compatibility with the appropriate comment that this is deprecated and
>>>>> will
>>>>>> be removed.
>>>>>> 
>>>>>> 4. Regarding
>>>>>> 
>>>>>>> 
>>>>>>> 1.
>>>>>>> 
>>>>>>> When user call record.ack() in function, just ProcessingGuarantees ==
>>>>>>> MANUAL can be work. In turn, when ProcessingGuarantees != MANUAL,
>>> user
>>>>>>> call record.ack() is invalid(print warn log).
>>>>>>> 
>>>>>>> That might blast WARN messages to the user. Perhaps save the fact that
>>>>> you
>>>>>> have printed a WARN message once and only print when the variable is
>>> not
>>>>>> set?
>>>>>> 
>>>>>> 5. Regarding Test Plan
>>>>>> * I would add: Validate the test of autoAck=false still works (you
>>>>> haven't
>>>>>> broken anything)
>>>>>> * I would add: Validate existing ProcessingGuarantee test for
>>> AtMostOnce,
>>>>>> AtLeastOnce, ExactlyOnce still works (when autoAck=true)
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, May 30, 2022 at 4:09 PM Baozi <wudixiaolong...@icloud.com
>>>>> .invalid>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi, Mesika.
>>>>>>> 
>>>>>>> Thanks review.
>>>>>>> 
>>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>>>>> NONE.
>>>>>>> As
>>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>>>>> that
>>>>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>>> 
>>>>>>> 
>>>>>>> Added, Refer to the latest pip.
>>>>>>> https://github.com/apache/pulsar/issues/15560
>>>>>>> 
>>>>>>>>> 3. Removing autoAck from Function Config breaks backward
>>>>> compatibility,
>>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
>>> Pulsar
>>>>>>>>> release handle breaking change?
>>>>>>> 
>>>>>>> As suggested by @neng, They will be marked as deprecated first and
>>>>> clearly
>>>>>>> stated in the documentation. Remove it after 2~3 release.
>>>>>>> 
>>>>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>>>>> understand how you use that enum value *inside* the class/
>>>>>>> 
>>>>>>> I changed PIP, add new PulsarSinkManualProcessor. Refer to the latest
>>>>> PIP
>>>>>>> API Changes(3)
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Baodi Shi
>>>>>>> 
>>>>>>>> 2022年5月30日 12:5128,Rui Fu <r...@apache.org> 写道:
>>>>>>>> 
>>>>>>>> Hi Baodi,
>>>>>>>> 
>>>>>>>> Nice work. Put some suggestions below, ptal.
>>>>>>>> 
>>>>>>>> 1. API changes should also contain the changes of `Function.proto`,
>>>>>>> including new `ProcessingGuarantees` option and `autoAck`.
>>>>>>>> 2. Please be sure the other language runtimes (like Python, Golang)
>>> do
>>>>>>> support similar `record.ack()` function from the context, if no, it
>>>>> might
>>>>>>> have some API changes for different runtime we well.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Rui Fu
>>>>>>>> 在 2022年5月29日 +0800 22:18,Asaf Mesika <asaf.mes...@gmail.com>,写道:
>>>>>>>>> 1. "Added NONE delivery semantics and delete autoAck config."
>>>>>>>>> - Added --> add
>>>>>>>>> 
>>>>>>>>> 2. I suggest calling it `MANUAL` `ProcessingGuarantee` instead of
>>>>> NONE.
>>>>>>> As
>>>>>>>>> you carefully explained, ProcessingGuarantee comes does to the fact
>>>>> that
>>>>>>>>> the function executor calls acknowledge, in specific timing:
>>>>>>>>> - `AT_MOST_ONCE` - When the message is read by the client, it is
>>>>>>>>> immediately acknowledged and only then the function is executed,
>>> thus
>>>>>>>>> guaranteeing it will not run more than once
>>>>>>>>> - `AT_LEAST_ONCE` - Message is acknowledged *after* the function
>>>>>>> finished
>>>>>>>>> execution, thus it will be run at least once.
>>>>>>>>> - `MANUAL` - Signals to the user that it is up to them to
>>> acknowledge
>>>>>>> the
>>>>>>>>> message, inside the function.
>>>>>>>>> 
>>>>>>>>> I think if you couple that change with adding the explanation I
>>> wrote
>>>>>>>>> above to the documentation it will become crystal clear (hopefully)
>>>>>>> what is
>>>>>>>>> a Processing Guarantee exactly and what each value signifies.
>>>>>>>>> 
>>>>>>>>> 3. Removing autoAck from Function Config breaks backward
>>>>> compatibility,
>>>>>>>>> thus shouldn't this be strongly reflected in the PIP - how does
>>> Pulsar
>>>>>>>>> release handle breaking change?
>>>>>>>>> 
>>>>>>>>> 4. Regarding Implementation (1), isn't the class itself
>>>>>>>>> PulsarSinkAtLeastOnceProcessor encodes what to do? I'm not sure I
>>>>>>>>> understand how you use that enum value *inside* the class/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, May 27, 2022 at 1:08 AM Neng Lu <freen...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Some suggestions:
>>>>>>>>>> 
>>>>>>>>>> 1. Instead of deleting the `autoAck`, keep it but not use it in the
>>>>>>> code.
>>>>>>>>>> And documented clearly it's deprecated for the following 2~3
>>> release.
>>>>>>> And
>>>>>>>>>> then delete it.
>>>>>>>>>> 2. For `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, if `NONE` is configured, it
>>>>>>> defaults
>>>>>>>>>> to ATLEAST_ONCE.
>>>>>>>>>> 3. Need to let users know the behavior when they call
>>> `record.ack()`
>>>>>>> inside
>>>>>>>>>> the function implementation.
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 12, 2022 at 1:52 AM Baozi <wudixiaolong...@icloud.com
>>>>>>> .invalid>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Pulsar community,
>>>>>>>>>>> 
>>>>>>>>>>> I open a https://github.com/apache/pulsar/issues/15560 for
>>> Function
>>>>>>> add
>>>>>>>>>>> NONE delivery semantics
>>>>>>>>>>> 
>>>>>>>>>>> Let me know what you think.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Baodi Shi
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> ## Motivation
>>>>>>>>>>> 
>>>>>>>>>>> Currently Function supports three delivery semantics, and also
>>>>>>> provides
>>>>>>>>>>> autoAck to control whether to automatically ack.
>>>>>>>>>>> Because autoAck affects the delivery semantics of Function, it can
>>>>> be
>>>>>>>>>>> confusing for users to understand the relationship between these
>>> two
>>>>>>>>>>> parameters.
>>>>>>>>>>> 
>>>>>>>>>>> For example, when the user configures `Guarantees == ATMOST_ONCE`
>>>>> and
>>>>>>>>>>> `autoAck == false`, then the framework will not help the user to
>>> ack
>>>>>>>>>>> messages, and the processing semantics may become `ATLEAST_ONCE`.
>>>>>>>>>>> 
>>>>>>>>>>> The delivery semantics provided by Function should be clear. When
>>>>> the
>>>>>>>>>> user
>>>>>>>>>>> sets the guarantees, the framework should ensure point-to-point
>>>>>>> semantic
>>>>>>>>>>> processing and cannot be affected by other parameters.
>>>>>>>>>>> 
>>>>>>>>>>> ## Goal
>>>>>>>>>>> 
>>>>>>>>>>> Added `NONE` delivery semantics and delete `autoAck` config.
>>>>>>>>>>> 
>>>>>>>>>>> The original intention of `autoAck` semantics is that users want
>>> to
>>>>>>>>>>> control the timing of ack by themselves. When autoAck == false,
>>> the
>>>>>>>>>>> processing semantics provided by the framework should be invalid.
>>>>>>> Then we
>>>>>>>>>>> can add `NONE` processing semantics to replace the autoAck ==
>>> false
>>>>>>>>>>> scenario.
>>>>>>>>>>> 
>>>>>>>>>>> When the user configuration `ProcessingGuarantees == NONE`, the
>>>>>>> framework
>>>>>>>>>>> does not help the user to do any ack operations, and the ack is
>>> left
>>>>>>> to
>>>>>>>>>> the
>>>>>>>>>>> user to handle. In other cases, the framework guarantees
>>> processing
>>>>>>>>>>> semantics.
>>>>>>>>>>> 
>>>>>>>>>>> ## API Changes
>>>>>>>>>>> 1. Add `NONE` type to ProcessingGuarantees
>>>>>>>>>>> ``` java
>>>>>>>>>>> public enum ProcessingGuarantees {
>>>>>>>>>>> ATLEAST_ONCE,
>>>>>>>>>>> ATMOST_ONCE,
>>>>>>>>>>> EFFECTIVELY_ONCE,
>>>>>>>>>>> NONE
>>>>>>>>>>> }
>>>>>>>>>>> ```
>>>>>>>>>>> 
>>>>>>>>>>> 2. Delete autoAck config in FunctionConfig
>>>>>>>>>>> ``` java
>>>>>>>>>>> public class FunctionConfig {
>>>>>>>>>>> - private Boolean autoAck;
>>>>>>>>>>> }
>>>>>>>>>>> ```
>>>>>>>>>>> 
>>>>>>>>>>> ## Implementation
>>>>>>>>>>> 
>>>>>>>>>>> 1. In `PulsarSinkAtLeastOnceProcessor` and
>>>>>>>>>>> `PulsarSinkEffectivelyOnceProcessor`, when `ProcessingGuarantees
>>> !=
>>>>>>> NONE`
>>>>>>>>>>> can be ack.
>>>>>>>>>>> 
>>>>>>>>>>> <
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/sink/PulsarSink.java#L274-L276
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 2. When the delivery semantic is `ATMOST_ONCE`, the message will
>>> be
>>>>>>> acked
>>>>>>>>>>> immediately after receiving the message, no longer affected by the
>>>>>>>>>> autoAck
>>>>>>>>>>> configuration.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://github.com/apache/pulsar/blob/c49a977de4b0b525ec80e5070bc90eddcc7cddad/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L271-L276
>>>>>>>>>>> 
>>>>>>>>>>> 3. When user call `record.ack()` in function, just
>>>>>>> `ProcessingGuarantees
>>>>>>>>>>> == NONE` can be work.
>>>>>>>>>>> 
>>>>>>>>>>> ## Plan test
>>>>>>>>>>> The main test and assert is that when ProcessingGuarantees ==
>>> NONE,
>>>>>>> the
>>>>>>>>>>> function framework will not do any ack operations for the user.
>>>>>>>>>>> 
>>>>>>>>>>> ## Compatibility
>>>>>>>>>>> 1. This change will invalidate the user's setting of autoAck,
>>> which
>>>>>>>>>> should
>>>>>>>>>>> be explained in the documentation and provide parameter
>>> verification
>>>>>>> to
>>>>>>>>>>> remind the user.
>>>>>>>>>>> 2. Runtimes of other languages need to maintain consistent
>>>>> processing
>>>>>>>>>>> logic (python, go).
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Best Regards,
>>>>>>>>>> Neng
>>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
> 

Reply via email to