-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

There is one more change to this KIP for the upgrade path of Kafka
Streams applications:

We cannot detect broker versions reliable, and thus, we need users to
manually opt-in to the feature. Thus, we need to add a third value for
configuration parameter `processing.guarantee` that we call
`exactly_once_beta` -- specifying this config will enable the producer
per thread design.

I updated the KIP accordingly. Please let us know if there are any
concerns.


- -Matthias

On 2/11/20 4:44 PM, Guozhang Wang wrote:
> Boyang,
>
> Thanks for the update. This change makes sense to me.
>
> Guozhang
>
> On Tue, Feb 11, 2020 at 11:37 AM Boyang Chen
> <reluctanthero...@gmail.com> wrote:
>
>> Hey there,
>>
>> we are adding a small change to the KIP-447 public API. The
>> default value of
>> `transaction.abort.timed.out.transaction.cleanup.interval.ms`
>> shall be changed from 1 minute to 10 seconds. The goal here is to
>> trigger the expired transaction more frequently in order to
>> reduce the consumer pending offset fetch wait time.
>>
>> Let me know if you have further questions, thanks!
>>
>>
>> On Wed, Jan 8, 2020 at 3:44 PM Boyang Chen
>> <reluctanthero...@gmail.com> wrote:
>>
>>> Thanks Guozhang for another review! I have addressed all the
>>> javadoc changes for PendingTransactionException in the KIP.
>>> For
>> FENCED_INSTANCE_ID
>>> the only thrown place would be on the new send offsets API,
>>> which is also addressed.
>>>
>>> Thanks Matthias for the vote! As we have 3 binding votes
>>> (Guozhang,
>> Jason,
>>> and Matthias), the KIP is officially accepted and prepared to
>>> ship in
>> 2.5.
>>>
>>> Still feel free to put more thoughts on either discussion or
>>> voting
>> thread
>>> to refine the KIP!
>>>
>>>
>>> On Wed, Jan 8, 2020 at 3:15 PM Matthias J. Sax
>>> <matth...@confluent.io> wrote:
>>>
>>>> I just re-read the KIP. Overall I am +1 as well.
>>>>
>>>
>>>> Some minor comments (also apply to the Google design doc):
>>>>
>>>> 1) As 2.4 was release, references should be updated to 2.5.
>>>>
>>>> Addressed
>>>
>>>>
>>>>
>>>>> 2) About the upgrade path, the KIP says:
>>>>
>>>> 2a)
>>>>
>>>>> Broker must be upgraded to 2.4 first. This means the
>>>> `inter.broker.protocol.version` (IBP) has to be set to the
>>>> latest. Any produce request with higher version will
>>>> automatically get fenced
>> because
>>>> of no support.
>>>>
>>>> From my understanding, this is not correct? After a broker is
>>>> updated to the new binaries, it should accept new requests,
>>>> even if IBP was not bumped yet?
>>>>
>>>> Your understanding was correct, after some offline discussion
>>>> we should
>>> not worry about IBP in this case.
>>>
>>>> 2b)
>>>>
>>>> About the two rolling bounces for KS apps and the statement
>>>>
>>>>> one should never allow task producer and thread producer
>>>>> under the
>> same
>>>> application group
>>>>
>>>> In the second rolling bounce, we might actually mix both (ie,
>>>> per-task and per-thread producers) but this is fine as
>>>> explained in the KIP. The only case we cannot allow is, old
>>>> per-task producers (without consumer generation fencing) to
>>>> be mixed with per-thread producers (that rely solely on
>>>> consumer generation fencing).
>>>>
>>>> Does this sound correct?
>>>>
>>>> Correct, that's the purpose of doing 2 rolling bounce, where
>>>> the first
>>> one is to guarantee everyone's opt-in for generation fencing.
>>>
>>>>
>>>> 3) We should also document how users can use KS 2.5
>>>> applications against older brokers -- for this case, we need
>>>> to stay on per-task producers and cannot use the new fencing
>>>> mechanism. Currently, the KIP only describe a single way how
>>>> users could make this work: by setting (and keeping)
>>>> UPGRADE_FROM config to 2.4 (what might not be an ideal
>>>> solution and might also not be clear by itself that people
>>>> would need to do
>> this)?
>>>>
>>>>
>>>> Yes exactly, at the moment we are actively working on a plan
>>>> to acquire
>>> broker's IBP during stream start-up and initialize based off
>>> that information, so that user doesn't need to keep
>>> UPGRADE_FROM simply for working with
>> old
>>> brokers.
>>>
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>> On 9/18/19 4:41 PM, Boyang Chen wrote:
>>>>> Bump this thread to see if someone could also review!
>>>>>
>>>>> On Mon, Sep 9, 2019 at 5:00 PM Boyang Chen <
>> reluctanthero...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you Jason! Addressed the comments.
>>>>>>
>>>>>> Thank you Guozhang for explaining. I will document the
>>>>>> timeout
>> setting
>>>>>> reasoning in the design doc.
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 9, 2019 at 1:49 PM Guozhang Wang
>>>>>> <wangg...@gmail.com>
>>>> wrote:
>>>>>>
>>>>>>> On Fri, Sep 6, 2019 at 6:33 PM Boyang Chen <
>>>> reluctanthero...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Guozhang, I have polished the design doc to
>>>>>>>> make it sync
>> with
>>>>>>>> current KIP. As for overriding default timeout
>>>>>>>> values, I guess it's
>>>>>>> already
>>>>>>>> stated in the KIP to set txn timeout to 10s, are you
>>>>>>>> suggesting we
>>>>>>> should
>>>>>>>> also put down this recommendation on the KIP for
>>>>>>>> non-stream EOS
>>>> users?
>>>>>>>>
>>>>>>>> My comment is not for changing any produce / consumer
>>>>>>>> default
>> config
>>>>>>> values, but for the Streams configs, to make sure that
>>>>>>> our overridden config values respect the above rules.
>>>>>>> That is, we check
>>>> the
>>>>>>> actual value used in the config if they are ever
>>>>>>> overridden by
>> users,
>>>> and
>>>>>>> if the above were not true we can log a warning that it
>>>>>>> may be risky
>>>> to
>>>>>>> encounter some unnecessary rebalances.
>>>>>>>
>>>>>>> Again, this is not something we need to include in the
>>>>>>> KIP since it
>>>> is not
>>>>>>> part of public APIs, just to emphasize that the
>>>>>>> internal
>>>> implementation
>>>>>>> can have some safety guard like this.
>>>>>>>
>>>>>>> Guozhang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Boyang
>>>>>>>>
>>>>>>>> On Thu, Sep 5, 2019 at 8:43 PM Guozhang Wang
>>>>>>>> <wangg...@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Boyang,
>>>>>>>>>
>>>>>>>>> Just realized one thing about timeout
>>>>>>>>> configurations that we
>> should
>>>>>>>>> consider including in this KIP as well:
>>>>>>>>>
>>>>>>>>> 1) In Producer we have: max.block.ms (default value
>>>>>>>>> 60sec), request.timeout (30sec),
>>>>>>>>> delivery.timeout.ms (120sec), transaction.timeout
>> (60sec)
>>>>>>>>> 2) In Consumer we have: session.timeout (10sec),
>>>>>>>>> request.timeout
>>>>>>> (30sec),
>>>>>>>>> default.api.timeout.ms (60sec).
>>>>>>>>>
>>>>>>>>> Within a transaction (i.e. after we've beginTxn),
>>>>>>>>> we could
>>>> potentially
>>>>>>>> call
>>>>>>>>> consumer blocking APIs that depend on
>>>>>>>>> default.api.timeout.ms, and
>>>>>>> call
>>>>>>>>> producer blocking APIs that depend on max.block.ms.
>>>>>>>>> Also, if the
>>>>>>> user is
>>>>>>>>> following a consumer->producer pattern, then it
>>>>>>>>> could be kicked
>> and
>>>>>>>> fenced
>>>>>>>>> either by txn or by consumer group session.
>>>>>>>>>
>>>>>>>>> So we want to make sure that in the caller, e.g.
>>>>>>>>> Kafka Streams:
>>>>>>>>>
>>>>>>>>> 1) transaction.timeout < max.block.ms 2)
>>>>>>>>> transaction.timeout < delivery.timeout.ms 3)
>>>>>>>>> transaction.timeout < default.api.timeout.ms 4)
>>>>>>>>> transaction.timeout ~= default.api.timeout.ms (I
>>>>>>>>> think this is
>>>>>>>> already
>>>>>>>>> mentioned in the KIP, just wanted to bring this up
>>>>>>>>> again)
>>>>>>>>>
>>>>>>>>> We do not need to override the default since not
>>>>>>>>> every users are
>>>>>>>> following
>>>>>>>>> the consumer -> producer pattern, but in cases like
>>>>>>>>> Streams where
>> it
>>>>>>> is
>>>>>>>>> indeed the case, we should override the default
>>>>>>>>> values to obey the
>>>>>>> above
>>>>>>>>> rules.
>>>>>>>>>
>>>>>>>>> Guozhang
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Sep 5, 2019 at 5:47 PM Guozhang Wang
>>>>>>>>> <wangg...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Boyang, I'm +1 on the KIP.
>>>>>>>>>>
>>>>>>>>>> Could you also update the detailed design doc
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>> https://docs.google.com/document/d/1LhzHGeX7_Lay4xvrEXxfciuDWATjpUXQh
rEIkph9qRE/edit
>>>>>>>>>
>>
which
>>>>>>>>>> seems a bit out-dated with the latest proposal?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Guozhang
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 4, 2019 at 10:45 AM Boyang Chen <
>>>>>>>> reluctanthero...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey all,
>>>>>>>>>>>
>>>>>>>>>>> I would like to start the vote for KIP-447 <
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-447%3A+Producer
+scalability+for+exactly+once+semantics
>>>>>>>>>>>>
>>
.
>>>>>>>>>>> This is a very important step to improve Kafka
>>>>>>>>>>> Streams
>> scalability
>>>>>>> in
>>>>>>>>>>> exactly-once semantics, by avoiding linearly
>>>>>>>>>>> increasing number
>> of
>>>>>>>>>>> producers with topic partition increases.
>>>>>>>>>>>
>>>>>>>>>>> Thanks, Boyang
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -- -- Guozhang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- -- Guozhang
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- -- Guozhang
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>
>
>
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEI8mthP+5zxXZZdDSO4miYXKq/OgFAl5hZ5cACgkQO4miYXKq
/Ojfrg//euYJ93oZkeS7+tquvaXB58NFJOMgaMyHLhnD5Oqcz06nRmb/g+PSNsZ6
3gyGAqwY5qh9v5DwruFKapDaBFekyoUaCUD+8t3FMh+erKM+kqfhediqSpbm3E3J
K7g9En0QkziiGazjLljgw2w/iM2nnj7rI/S9Hww9zx1bwdaq5W7K8ypzhvfm0ZvB
GlRn9UnstCLqH0k2gB+kVVF666Z0X82Th4uc2wR2hPVnV8Fbv5iFYwuatfiUSDg2
7IB2ehHvF+7UfSoeg/YgTiiKcj9rzTDbjpmmgg/wZlCic+DxufIrBZmfNW12AXQd
l/sjxkt5mL6qjfPcDKXxdVTatyLzjyYNPtm4o1bTO9Aj60SBIwEx3uNLgPvzkXxU
RQHI0POtn4NsboJAUf/1BmTowPLYlAjEbfZN5d9J0LNPm4Vw7mbV9XZbSbLt/lN0
dSovKTY33sKkuzXoJDN9LGKXCpToMgxkgZeVq71RMlol+qrWDYHhAn7p9ZCt/gLO
ZEv9T5hRCf/AJg/NHk+1wcs8zv3ho5ofasb/d4HjZxVQGIu8CIgMTep2rt5hir3z
CYX5jQ4O1p5U1s0T+v0JOOeRyclDYjG/GrySR22ffz+8CmJ9dhh5nChZdqPo3p/i
jN7iAOA3y/pbkTCS+D7gHsaa1xBXfpXxVDsgxArRxi5axt54jws=
=zWHL
-----END PGP SIGNATURE-----

Reply via email to