Re: [VOTE] KIP-890: Transactions Server Side Defense

2023-03-17 Thread Justine Olshan
Hey all. I've updated the KIP to include a configuration change. This will
allow users to disable the verification step. This will be for
performance-conscious customers who can accept the risk of hanging
transactions in order to achieve latency goals.
This is because the second AddPartition call may introduce slightly slower
produce requests. I do plan to optimize and make this as fast as possible,
but I suspect some folks will want this as an option.
Of course, the best solution will be to update the clients. But until that
is available, we offer the configuration. Let me know if there are any
questions.

Thanks,
Justine

On Thu, Feb 23, 2023 at 10:55 AM Justine Olshan 
wrote:

> Yup -- those are the main changes!
>
> On Thu, Feb 23, 2023 at 9:44 AM Guozhang Wang 
> wrote:
>
>> Thanks Justine. I checked the diff between the two versions on wiki,
>> seems the major changes are:
>>
>> 1) Move the `verifyOnly` field of the request into each transaction
>> and hence we no longer have any top-level primitive fields.
>> 2) Add a top-level `errorCode` field in the response.
>>
>> Is that summary right?
>>
>>
>> Guozhang
>>
>> On Wed, Feb 22, 2023 at 4:51 PM Justine Olshan
>>  wrote:
>> >
>> > Hey all,
>> >
>> > I've updated the KIP to slightly change some of the request and response
>> > specs for AddPartitionsToTxn. Nothing huge, but some points came up
>> during
>> > PR review.
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-890%3A+Transactions+Server-Side+Defense
>> >
>> > Thanks,
>> > Justine
>> >
>> > On Fri, Feb 3, 2023 at 8:40 AM Justine Olshan 
>> wrote:
>> >
>> > > Thanks everyone! I'm going to close the vote.
>> > > The KIP is accepted with five binding votes from Jason, Guozhang,
>> > > Matthias, David (and me), and two non-binding votes from Colt and
>> Artem.
>> > >
>> > > Thanks again,
>> > > Justine
>> > >
>> > > On Thu, Feb 2, 2023 at 11:41 PM David Jacot
>> 
>> > > wrote:
>> > >
>> > >> Thanks for the KIP, Justine. +1 (binding)
>> > >>
>> > >> On Fri, Feb 3, 2023 at 1:36 AM Matthias J. Sax 
>> wrote:
>> > >>
>> > >> > Thanks for the KIP!
>> > >> >
>> > >> > +1 (binding)
>> > >> >
>> > >> >
>> > >> > On 2/2/23 4:18 PM, Artem Livshits wrote:
>> > >> > > (non-binding) +1.  Looking forward to the implementation and
>> fixing
>> > >> the
>> > >> > > issues that we've got.
>> > >> > >
>> > >> > > -Artem
>> > >> > >
>> > >> > > On Mon, Jan 23, 2023 at 2:25 PM Guozhang Wang <
>> > >> > guozhang.wang...@gmail.com>
>> > >> > > wrote:
>> > >> > >
>> > >> > >> Thanks Justine, I have no further comments on the KIP. +1.
>> > >> > >>
>> > >> > >> On Tue, Jan 17, 2023 at 10:34 AM Jason Gustafson
>> > >> > >>  wrote:
>> > >> > >>>
>> > >> > >>> +1. Thanks Justine!
>> > >> > >>>
>> > >> > >>> -Jason
>> > >> > >>>
>> > >> > >>> On Tue, Jan 10, 2023 at 3:46 PM Colt McNealy <c...@littlehorse.io>
>> > >> > >>> wrote:
>> > >> > >>>
>> > >> > >>>> (non-binding) +1. Thank you for the KIP, Justine! I've read it; it
>> > >> > >>>> makes sense to me and I am excited for the implementation.
>> > >> > >>>>
>> > >> > >>>> Colt McNealy
>> > >> > >>>> *Founder, LittleHorse.io*
>> > >> > >>>>
>> > >> > >>>>
>> > >> > >>>> On Tue, Jan 10, 2023 at 10:46 AM Justine Olshan
>> > >> > >>>> wrote:
>> > >> > >>>>
>> > >> > > Hi everyone,
>> > >> > >
>> > >> > > I would like to start a vote on KIP-890 which aims to
>> prevent some
>> > >> > >> of the
>> > >> > > common causes of hanging transactions and make other general
>> > >> > >> improvements
>> > >> > > to transactions in Kafka.
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > 
>> > >> > >>
>> > >> >
>> > >>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-890%3A+Transactions+Server-Side+Defense
>> > >> > >
>> > >> > > Please take a look if you haven't already and vote!
>> > >> > >
>> > >> > > Justine
>> > >> > >
>> > >> > 
>> > >> > >>
>> > >> > >
>> > >> >
>> > >>
>> > >
>>
>


Re: [DISCUSS] KIP-914 Join Processor Semantics for Versioned Stores

2023-03-17 Thread Guozhang Wang
Thanks Matthias / Victoria, both bullet points make sense to me.

On Thu, Mar 16, 2023 at 10:39 AM Victoria Xia
 wrote:
>
> Thanks for your comments, Matthias!
>
> > For stream-table joins, I think we need to elaborate that a `get(k, ts)`
> call now might return `null` if the history retention of the store is too
> short.
>
> Great callout -- I agree we should definitely clarify this in the KIP and
> mention it in the eventual docs as well.
>
> When a call to `get(k, ts)` returns null, there's not really a good way to
> distinguish whether it's because the timestamp is outside of the store's
> history retention or if it's because there's actually no record version for
> the key at the specified timestamp. Determining this from the processor
> would require (1) exposing the store's history retention to the processor,
> and (2) reconciling the fact that state stores today (including the new
> versioned store implementation) track their own observed stream time
> separate from processor time.
>
> In light of this, I think your proposal to treat a null from `get(k, ts)`
> due to history retention having been exceeded the same as we'd treat any
> other null makes sense, and is also our only viable option right now. I'll
> call this out in the docs so users are aware that their choice of history
> retention has this implication.
>
> > For left-table-table joins, there seems to be no special impact, but it
> should be called out, too. The lookup itself does not go into the history
> of the table so no change here (as we don't have the "query older than
> history case")
>
> Yup, we're on the same page. Using a versioned store for table-table joins
> results in the semantic change that the join result will include the
> latest-by-timestamp record rather than the latest-by-offset record, but no
> timestamped lookups (i.e., `get(key, ts)` calls) are used in the process so
> there is no concern about history retention having elapsed and affecting
> join results. (The only implication of history retention for this use case
> is indirect, since history retention doubles as grace period for the store.
> Because grace period is per store instance, which has task-level
> granularity, that means if grace period is set too low then the latest
> record for one key could be dropped from the store if another key has
> already advanced the store's observed stream time past the grace period by
> the time that this record is seen.)
>
> I will update the KIP with these additional notes.
>
> Thanks,
> Victoria
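As a rough illustration of the semantics agreed on above, here is a minimal, self-contained Java sketch. It does not use the Kafka Streams API: `VersionedJoinSketch`, its methods, and the string-concatenating joiner are all hypothetical stand-ins, and the retention check is a simplification of how a real versioned store decides that a timestamp is too old. It shows `get(k, ts)` returning null both when no version exists and when the timestamp falls outside history retention, and how inner and left stream-table joins treat that null the same way in either case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a versioned table store plus a stream-table joiner.
final class VersionedJoinSketch {
    private final long historyRetentionMs;
    // The store tracks its own observed stream time, separate from the processor.
    private long observedStreamTime = 0L;
    private final Map<String, TreeMap<Long, String>> versions = new HashMap<>();

    VersionedJoinSketch(long historyRetentionMs) {
        this.historyRetentionMs = historyRetentionMs;
    }

    void put(String key, String value, long timestamp) {
        observedStreamTime = Math.max(observedStreamTime, timestamp);
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
    }

    // Returns the version valid at `timestamp`, or null. The caller cannot tell
    // "outside history retention" apart from "no version for this key".
    String get(String key, long timestamp) {
        if (timestamp < observedStreamTime - historyRetentionMs) {
            return null; // the query is older than the available history
        }
        TreeMap<Long, String> byTime = versions.get(key);
        if (byTime == null) {
            return null; // no record version for this key at all
        }
        Map.Entry<Long, String> entry = byTime.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    // Inner join: a null lookup drops the stream record (modelled as null).
    String innerJoin(String key, String streamValue, long timestamp) {
        String tableValue = get(key, timestamp);
        return tableValue == null ? null : streamValue + "+" + tableValue;
    }

    // Left join: the stream record is always emitted, with a null table side.
    String leftJoin(String key, String streamValue, long timestamp) {
        return streamValue + "+" + get(key, timestamp);
    }

    public static void main(String[] args) {
        VersionedJoinSketch table = new VersionedJoinSketch(10L);
        table.put("k", "v1", 100L);
        table.put("other", "x", 200L); // advances the store's stream time to 200
        System.out.println(table.innerJoin("k", "s", 195L)); // s+v1
        System.out.println(table.innerJoin("k", "s", 150L)); // null (record dropped)
        System.out.println(table.leftJoin("k", "s", 150L));  // s+null
    }
}
```

Note that the lookup at timestamp 150 fails only because a different key advanced the store's stream time, which is the same per-store effect the grace-period remark above describes.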
>
> On Wed, Mar 15, 2023 at 7:16 PM Matthias J. Sax  wrote:
>
> > Thanks for the KIP! Great to see a first step towards using the new
> > versioned stores!
> >
> > I think the described tradeoffs make sense, and I like taking a pragmatic
> > step in the right direction while avoiding boiling the ocean. Thus, I
> > agree with the proposed solution.
> >
> > One minor thing that I believe just needs clarification in the KIP (it does
> > not seem to be a change to the KIP itself):
> >
> > For stream-table joins, I think we need to elaborate that a `get(k, ts)`
> > call now might return `null` if the history retention of the store is
> > too short. For inner-joins it would result in no output record (ie,
> > stream input record is dropped). Would be good to have it mentioned in
> > the KIP explicitly.
> >
> > We should also discuss how left-joins should work for this case. I think
> > it's ok (better) to include the stream record in the result if the
> > lookup returns `null` -- either because no key exists in the existing
> > history for the provided timestamp, or (the actual case in question)
> > because we query older than available history. If you agree, can we add
> > this to the KIP?
> >
> > For left-table-table joins, there seems to be no special impact, but it
> > should be called out, too. The lookup itself does not go into the
> > history of the table so no change here (as we don't have the "query
> > older than history case") -- and for out-of-order records, we just
> > "drop" them anyway, so no change for left-joins either I believe.
> >
> >
> > -Matthias
> >
> >
> >
> > On 3/15/23 2:00 PM, Guozhang Wang wrote:
> > > Sounds good to me. Thanks!
> > >
> > > On Wed, Mar 15, 2023 at 12:07 PM Victoria Xia
> > >  wrote:
> > >>
> > >> Thanks for kicking off the discussion, John and Guozhang!
> > >>
> > >>> Just one thing that might be out of scope: if users want to enable the
> > >> versioned table feature across the topology, should we allow them to do
> > it
> > >> via a single config rather than changing the materialized object at each
> > >> place?
> > >>
> > >> Yes, I think this would be a great usability improvement and am in
> > favor of
> > >> introducing such a config. As long as the config defaults to using
> > >> unversioned stores (which makes sense anyway), there will be no
> > >> compatibility concerns with introducing the config in a future release.
> > >> It's out of scope for this particular KIP as a result, but can
> > h

Re: [DISCUSS] KIP-914 Join Processor Semantics for Versioned Stores

2023-03-17 Thread Victoria Xia
Hi everyone,

As mentioned in the KIP, I would like to include these changes together
with KIP-889 in the 3.5 release to avoid any potential compatibility
concerns. Given that the current discussion is nearing convergence and the
KIP deadline for 3.5 is five days away (March 22), I will initiate a vote
at the end of the day.

Happy to continue discussing even after the vote has started. Thanks for
all the great suggestions so far!

Best,
Victoria

On Thu, Mar 16, 2023 at 10:38 AM Victoria Xia 
wrote:

> Thanks for your comments, Matthias!
>
> > For stream-table joins, I think we need to elaborate that a `get(k, ts)`
> call now might return `null` if the history retention of the store is too
> short.
>
> Great callout -- I agree we should definitely clarify this in the KIP and
> mention it in the eventual docs as well.
>
> When a call to `get(k, ts)` returns null, there's not really a good way to
> distinguish whether it's because the timestamp is outside of the store's
> history retention or if it's because there's actually no record version for
> the key at the specified timestamp. Determining this from the processor
> would require (1) exposing the store's history retention to the processor,
> and (2) reconciling the fact that state stores today (including the new
> versioned store implementation) track their own observed stream time
> separate from processor time.
>
> In light of this, I think your proposal to treat a null from `get(k, ts)`
> due to history retention having been exceeded the same as we'd treat any
> other null makes sense, and is also our only viable option right now. I'll
> call this out in the docs so users are aware that their choice of history
> retention has this implication.
>
> > For left-table-table joins, there seems to be no special impact, but it
> should be called out, too. The lookup itself does not go into the history
> of the table so no change here (as we don't have the "query older than
> history case")
>
> Yup, we're on the same page. Using a versioned store for table-table joins
> results in the semantic change that the join result will include the
> latest-by-timestamp record rather than the latest-by-offset record, but no
> timestamped lookups (i.e., `get(key, ts)` calls) are used in the process so
> there is no concern about history retention having elapsed and affecting
> join results. (The only implication of history retention for this use case
> is indirect, since history retention doubles as grace period for the store.
> Because grace period is per store instance, which has task-level
> granularity, that means if grace period is set too low then the latest
> record for one key could be dropped from the store if another key has
> already advanced the store's observed stream time past the grace period by
> the time that this record is seen.)
>
> I will update the KIP with these additional notes.
>
> Thanks,
> Victoria
>
> On Wed, Mar 15, 2023 at 7:16 PM Matthias J. Sax  wrote:
>
>> Thanks for the KIP! Great to see a first step towards using the new
>> versioned stores!
>>
>> I think the described tradeoffs make sense, and I like taking a pragmatic
>> step in the right direction while avoiding boiling the ocean. Thus, I
>> agree with the proposed solution.
>>
>> One minor thing that I believe just needs clarification in the KIP (it does
>> not seem to be a change to the KIP itself):
>>
>> For stream-table joins, I think we need to elaborate that a `get(k, ts)`
>> call now might return `null` if the history retention of the store is
>> too short. For inner-joins it would result in no output record (ie,
>> stream input record is dropped). Would be good to have it mentioned in
>> the KIP explicitly.
>>
>> We should also discuss how left-joins should work for this case. I think
>> it's ok (better) to include the stream record in the result if the
>> lookup returns `null` -- either because no key exists in the existing
>> history for the provided timestamp, or (the actual case in question)
>> because we query older than available history. If you agree, can we add
>> this to the KIP?
>>
>> For left-table-table joins, there seems to be no special impact, but it
>> should be called out, too. The lookup itself does not go into the
>> history of the table so no change here (as we don't have the "query
>> older than history case") -- and for out-of-order records, we just
>> "drop" them anyway, so no change for left-joins either I believe.
>>
>>
>> -Matthias
>>
>>
>>
>> On 3/15/23 2:00 PM, Guozhang Wang wrote:
>> > Sounds good to me. Thanks!
>> >
>> > On Wed, Mar 15, 2023 at 12:07 PM Victoria Xia
>> >  wrote:
>> >>
>> >> Thanks for kicking off the discussion, John and Guozhang!
>> >>
>> >>> Just one thing that might be out of scope: if users want to enable the
>> >> versioned table feature across the topology, should we allow them to
>> do it
>> >> via a single config rather than changing the materialized object at
>> each
>> >> place?
>> >>
>> >> Yes, I think this would be a great usabi

Re: [DISCUSS] Apache Kafka 3.5.0 release

2023-03-17 Thread Christo Lolov
Hello!

What would you suggest as the best way to get more eyes on KIP-902, as I would
like it to be included in 3.5.0?

Best,
Christo

> On 16 Mar 2023, at 10:33, Mickael Maison  wrote:
> 
> Hi,
> 
> This is a reminder that KIP freeze is less than a week away (22 Mar).
> For a KIP to be considered for this release, it must be voted and
> accepted by that date.
> 
> Feature freeze will be 3 weeks after this, so if you want KIPs or
> other significant changes in the release, please get them ready soon.
> 
> Thanks,
> Mickael
> 
>> On Tue, Feb 14, 2023 at 10:44 PM Ismael Juma  wrote:
>> 
>> Thanks!
>> 
>> Ismael
>> 
>> On Tue, Feb 14, 2023 at 1:07 PM Mickael Maison 
>> wrote:
>> 
>>> Hi Ismael,
>>> 
>>> Good call. I shifted all dates by 2 weeks and moved them to Wednesdays.
>>> 
>>> Thanks,
>>> Mickael
>>> 
>>> On Tue, Feb 14, 2023 at 6:01 PM Ismael Juma  wrote:
 
>>>> Thanks Mickael. A couple of notes:
>>>>
>>>> 1. We typically choose a Wednesday for the various freeze dates - there are
>>>> often 1-2 day slips and it's better if that doesn't require people
>>>> working through the weekend.
>>>> 2. Looks like we're over a month later compared to the equivalent release
>>>> last year (
>>>> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.2.0). I
>>>> understand that some of it is due to 3.4.0 slipping, but I wonder if we
>>>> could perhaps aim for the KIP freeze to be one or two weeks earlier.
>>>>
>>>> Ismael
>>>>
>>>> On Tue, Feb 14, 2023 at 8:00 AM Mickael Maison
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I've created a release plan for 3.5.0 in the wiki:
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.5.0
>>>>>
>>>>> Current dates are:
>>>>> 1) KIP Freeze: 07 Apr 2023
>>>>> 2) Feature Freeze: 27 Apr 2023
>>>>> 3) Code Freeze: 11 May 2023
>>>>>
>>>>> Please take a look at the plan. Let me know if there are other KIPs
>>>>> targeting 3.5.0.
>>>>> Also if you are the author of one of the KIPs that's missing a status
>>>>> (or the status is incorrect) please update it and let me know.
>>>>>
>>>>> Thanks,
>>>>> Mickael
>>>>>
>>>>>
>>>>> On Thu, Feb 9, 2023 at 9:23 AM Bruno Cadonna wrote:
>>>>>>
>>>>>> Thanks, Mickael!
>>>>>>
>>>>>> Best,
>>>>>> Bruno
>>>>>>
>>>>>> On 09.02.23 03:15, Luke Chen wrote:
>>>>>>> Hi Mickael,
>>>>>>> Thanks for volunteering!
>>>>>>>
>>>>>>> Luke
>>>>>>>
>>>>>>> On Thu, Feb 9, 2023 at 6:23 AM Chris Egerton <fearthecel...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for volunteering, Mickael!
>>>>>>>>
>>>>>>>> On Wed, Feb 8, 2023 at 1:12 PM José Armando García Sancio
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for volunteering Mickael.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -José


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1686

2023-03-17 Thread Apache Jenkins Server
See 




Re: [Permission for thachle jira account]

2023-03-17 Thread Bill Bejeck
Peter,

You're all set now.
Thanks for your interest in Apache Kafka.

Regards,
Bill

On Fri, Mar 17, 2023 at 10:13 AM Thach Le  wrote:

> Hello team,
> I just created an account for the Apache Kafka Jira.
> Please grant my account permission so I can assign Jira tickets to myself.
> My account: thachle
>
> Regards,
> Peter
>
> *Skype: thachlp*
> *Phone: (+84) 35 5381 053*
>


Re: Failing ZkMigrationIntegrationTest

2023-03-17 Thread Chia-Ping Tsai
It seems to me that KAFKA-14811 can fix the failing test :)

> On Mar 17, 2023, at 10:16 PM, Ismael Juma wrote:
> 
> Thanks David. It may be worth temporarily disabling it unless we expect the
> fix to be available today.
> 
> Ismael
> 
>> On Fri, Mar 17, 2023 at 6:58 AM David Arthur
>>  wrote:
>> 
>> Hey folks, I've noticed that ZkMigrationIntegrationTest#testDualWrite has
>> been failing a lot lately on trunk and PR builds. Earlier this week I
>> tracked down the root cause and have been working on a fix (KAFKA-14805).
>> Basically, what's happening is an AllocateProducerIdRequest is being
>> handled by the controller before the ZK migration kicks off.
>> 
>> Just wanted to let everyone know that a fix is in progress.
>> 
>> Cheers,
>> David
>> 


Re: [VOTE] KIP-581: Value of optional null field which has default value

2023-03-17 Thread Tom Bentley
+1 (binding), thanks again Mickael.

On Fri, 17 Mar 2023 at 14:56, Chris Egerton  wrote:

> +1 (binding), thanks Mickael!
>
> On Fri, Mar 17, 2023, 10:11 Mickael Maison 
> wrote:
>
> > Hi all,
> >
> > I'd like to start a vote on KIP-581 to allow keeping null values
> > instead of always replacing with the default values:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
> >
> > Thanks,
> > Mickael
> >
>


Re:[VOTE] KIP-911: Add source tag to MirrorSourceConnector metrics

2023-03-17 Thread Hector Geraldino (BLOOMBERG/ 919 3RD A)
+1 (non-binding)

From: dev@kafka.apache.org At: 03/15/23 07:08:33 UTC-4:00
To: dev@kafka.apache.org
Subject: [VOTE] KIP-911: Add source tag to MirrorSourceConnector metrics

Hi,

I'd like to start the vote on KIP-911 to add the source cluster alias
as a tag on the MirrorSourceConnector metrics:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-911%3A+Add+source+tag+to+MirrorSourceConnector+metrics

Thanks,
Mickael




Re: [VOTE] KIP-581: Value of optional null field which has default value

2023-03-17 Thread Chris Egerton
+1 (binding), thanks Mickael!

On Fri, Mar 17, 2023, 10:11 Mickael Maison  wrote:

> Hi all,
>
> I'd like to start a vote on KIP-581 to allow keeping null values
> instead of always replacing with the default values:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
>
> Thanks,
> Mickael
>


Re: Failing ZkMigrationIntegrationTest

2023-03-17 Thread Ismael Juma
Thanks David. It may be worth temporarily disabling it unless we expect the
fix to be available today.

Ismael

On Fri, Mar 17, 2023 at 6:58 AM David Arthur
 wrote:

> Hey folks, I've noticed that ZkMigrationIntegrationTest#testDualWrite has
> been failing a lot lately on trunk and PR builds. Earlier this week I
> tracked down the root cause and have been working on a fix (KAFKA-14805).
> Basically, what's happening is an AllocateProducerIdRequest is being
> handled by the controller before the ZK migration kicks off.
>
> Just wanted to let everyone know that a fix is in progress.
>
> Cheers,
> David
>


[jira] [Resolved] (KAFKA-13900) Support Java 9 direct ByteBuffer Checksum methods

2023-03-17 Thread Ismael Juma (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-13900.
-
Resolution: Fixed

> Support Java 9 direct ByteBuffer Checksum methods
> -
>
> Key: KAFKA-13900
> URL: https://issues.apache.org/jira/browse/KAFKA-13900
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Affects Versions: 3.1.1
>Reporter: Francesco Nigro
>Priority: Minor
>  Labels: performance, performance-benchmark
> Fix For: 3.4.0
>
>
> Java 9 added a new Checksum method that can make use of ByteBuffer(s)
> (see [Java 9's 
> Checksum::update|https://docs.oracle.com/javase/9/docs/api/java/util/zip/Checksum.html#update-java.nio.ByteBuffer-]):
>  Kafka already provides specific support for Java 9's CRC32C, so it makes
> sense for it to use the most optimized version for direct
> ByteBuffers as well (read-only or not), instead of performing a byte-by-byte
> computation.
>  
> I'm aware that currently the client's Buffer pools aren't using direct 
> ByteBuffer, but having full support for it can open the door to future 
> interesting optimizations on it.
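To make the difference concrete, here is a minimal Java sketch (Java 9 or later) comparing the bulk `Checksum.update(ByteBuffer)` call with a byte-by-byte loop over a direct buffer. The class and method names (`DirectBufferChecksum`, `bulk`, `perByte`) are hypothetical and this is not Kafka's own checksum code; it only shows that both paths yield the same CRC32C value while the bulk call hands the whole buffer to the JDK in one step.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical helper, for illustration only.
final class DirectBufferChecksum {

    // Bulk path: Checksum.update(ByteBuffer) consumes the buffer's remaining bytes.
    static long bulk(ByteBuffer buffer) {
        Checksum checksum = new CRC32C();
        // duplicate() shares the content but leaves the caller's position untouched
        checksum.update(buffer.duplicate());
        return checksum.getValue();
    }

    // Byte-by-byte path: one update(int) call per byte.
    static long perByte(ByteBuffer buffer) {
        Checksum checksum = new CRC32C();
        ByteBuffer view = buffer.duplicate();
        while (view.hasRemaining()) {
            checksum.update(view.get());
        }
        return checksum.getValue();
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
        direct.put(payload);
        direct.flip();

        // Both paths agree, including for a read-only view of the direct buffer.
        System.out.println(bulk(direct) == perByte(direct));
        System.out.println(bulk(direct.asReadOnlyBuffer()) == perByte(direct));
    }
}
```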



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[Permission for thachle jira account]

2023-03-17 Thread Thach Le
Hello team,
I just created an account for the Apache Kafka Jira.
Please grant my account permission so I can assign Jira tickets to myself.
My account: thachle

Regards,
Peter

*Skype: thachlp*
*Phone: (+84) 35 5381 053*


[VOTE] KIP-581: Value of optional null field which has default value

2023-03-17 Thread Mickael Maison
Hi all,

I'd like to start a vote on KIP-581 to allow keeping null values
instead of always replacing with the default values:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value

Thanks,
Mickael
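For readers skimming the vote, the behaviour change can be sketched in a few lines of plain Java. This is not the Kafka Connect `JsonConverter` code: `NullDefaultSketch` and `convert` are hypothetical, the property name `replace.null.with.default` is the one settled on in the discussion thread, and the default of `true` (keeping today's behaviour) is an assumption made here for illustration.

```java
import java.util.Map;

// Hypothetical illustration of the proposed switch; not the real converter.
final class NullDefaultSketch {
    static final String REPLACE_NULL_WITH_DEFAULT = "replace.null.with.default";

    // Returns the value to emit for an optional field that has a default value.
    static Object convert(Object value, Object defaultValue, Map<String, ?> config) {
        Object raw = config.get(REPLACE_NULL_WITH_DEFAULT);
        // Assumed default: true, i.e. null is replaced by the default as before.
        boolean replace = raw == null || Boolean.parseBoolean(raw.toString());
        if (value == null && replace) {
            return defaultValue;
        }
        return value; // either a real value, or a null that is kept as null
    }

    public static void main(String[] args) {
        System.out.println(convert(null, "fallback", Map.of()));
        System.out.println(convert(null, "fallback", Map.of(REPLACE_NULL_WITH_DEFAULT, "false")));
    }
}
```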


Failing ZkMigrationIntegrationTest

2023-03-17 Thread David Arthur
Hey folks, I've noticed that ZkMigrationIntegrationTest#testDualWrite has
been failing a lot lately on trunk and PR builds. Earlier this week I
tracked down the root cause and have been working on a fix (KAFKA-14805).
Basically, what's happening is an AllocateProducerIdRequest is being
handled by the controller before the ZK migration kicks off.

Just wanted to let everyone know that a fix is in progress.

Cheers,
David


Re: [Discuss] KIP-581: Value of optional null field which has default value

2023-03-17 Thread Chris Egerton
LGTM

On Fri, Mar 17, 2023, 08:26 Luke Chen  wrote:

> Thanks Mickael and Cheng!
> This KIP LGTM!
>
> Thanks.
> Luke
>
> On Fri, Mar 17, 2023 at 7:27 PM Mickael Maison 
> wrote:
>
> > Thanks Tom and Chris for your feedback!
> >
> > I agree one configuration is enough. I've renamed it to
> > 'replace.null.with.default'.
> > Since you both seemed happy with the KIP overall, I'll start a vote later
> > today.
> >
> > Thanks,
> > Mickael
> >
> > On Thu, Mar 16, 2023 at 7:08 PM Chris Egerton 
> > wrote:
> > >
> > > Hi Mickael,
> > >
> > > Tom got this part perfect so I'm just going to copy+paste: Thanks to
> > Cheng
> > > for the KIP and to you for picking it up.
> > >
> > > I'm wondering why we need separate properties for serialization vs.
> > > deserialization? If the converter is being used by the Kafka Connect
> > > runtime, a given instance of it will only ever be responsible for one
> or
> > > the other (in other words, sink connectors' converters will only ever
> be
> > > used for deserialization and source connectors' converters will only
> ever
> > > be used for serialization). It seems like a single "use.optional.null"
> > (or
> > > "map.null.to.default") property would be simpler and easier to use,
> but I
> > > might be missing something about why we'd want to add this kind of
> > > granularity.
> > >
> > > I'm slightly in favor of the alternative name that Tom has proposed. I
> > > think highlighting that this property deals with how to handle default
> > > values is important, possibly more important than the fact that it
> deals
> > > with null field values. I'm a little hesitant to use "map" since that
> may
> > > be harder to remember and at first glance it might seem like it deals
> > with
> > > the map schema type. Maybe "replace.null.with.default.value"? (For the
> > > record, I don't consider any of this worthy of blocking the KIP, so
> don't
> > > feel the need to appease me on this front before starting a vote.)
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Thu, Mar 16, 2023 at 11:38 AM Tom Bentley 
> > wrote:
> > >
> > > > Hi Mickael,
> > > >
> > > > Thanks to Cheng for the KIP and to you for picking it up.
> > > >
> > > > My only comment (feel free to ignore) is about the names of the
> > configs.
> > > > Personally I don't think I'd correctly guess what
> > > > "serialize.use.optional.null" meant. Something like
> > > > "serialize.map.null.to.default" is much clearer to me, for the cost
> > of one
> > > > extra token.
> > > >
> > > > Otherwise LGTM.
> > > >
> > > > Thanks,
> > > >
> > > > Tom
> > > >
> > > > On Wed, 8 Mar 2023 at 15:55, Mickael Maison <
> mickael.mai...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > This KIP has been stale for a long time. Since it would be a
> useful
> > > > > feature, I pinged Cheng about a month ago asking if he was planning
> > to
> > > > > work on it again. I've not received a reply, so I've allowed myself
> > to
> > > > > update the KIP (hopefully preserving the initial requirements) and
> > > > > would like to restart a discussion.
> > > > >
> > > > > The DISCUSS thread was split in two, you can find the other part in
> > > > > https://lists.apache.org/thread/dc56k17zptzvbyc7gtscovzgzwf6yn9p
> > > > >
> > > > > Let me know if you have any feedback.
> > > > >
> > > > > Thanks,
> > > > > Mickael
> > > > >
> > > > > On Tue, Apr 14, 2020 at 8:28 PM Christopher Egerton <
> > chr...@confluent.io
> > > > >
> > > > > wrote:
> > > > > >
> > > > > > Hi Cheng,
> > > > > >
> > > > > > Thanks for the KIP! I really appreciate the care that was taken
> to
> > > > ensure
> > > > > > backwards compatibility for existing users, and the minimal
> > changes to
> > > > > > public interface that are suggested to address this.
> > > > > >
> > > > > > I have two quick requests for clarification:
> > > > > >
> > > > > > 1) Where is the proposed "accept.optional.null" property going to
> > > > apply?
> > > > > > It's hinted that it'll take effect on the JSON converter but not
> > > > actually
> > > > > > called out anywhere.
> > > > > >
> > > > > > 2) Assuming this takes effect on the JSON converter, is the
> intent
> > to
> > > > > alter
> > > > > > the semantics for both serialization and deserialization? The
> code
> > > > > snippet
> > > > > > from the JSON converter that's included in the KIP comes from the
> > > > > > "convertToJson" method, which is used for serialization. However,
> > based
> > > > > on
> > > > > >
> > > > >
> > > >
> >
> https://github.com/apache/kafka/blob/ea47a885b1fe47dfb87c1dc86db1b0e7eb8a273c/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L712-L713
> > > > > > it
> > > > > > looks like the converter also inserts the default value for
> > > > > > optional-but-null data during deserialization.
> > > > > >
> > > > > > Thanks again for the KIP!
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > > On Wed, Mar 18, 2020 at 12:00 AM Cheng Pan <379377...@qq.c

Re: [Discuss] KIP-581: Value of optional null field which has default value

2023-03-17 Thread Luke Chen
Thanks Mickael and Cheng!
This KIP LGTM!

Thanks.
Luke

On Fri, Mar 17, 2023 at 7:27 PM Mickael Maison 
wrote:

> Thanks Tom and Chris for your feedback!
>
> I agree one configuration is enough. I've renamed it to
> 'replace.null.with.default'.
> Since you both seemed happy with the KIP overall, I'll start a vote later
> today.
>
> Thanks,
> Mickael
>
> On Thu, Mar 16, 2023 at 7:08 PM Chris Egerton 
> wrote:
> >
> > Hi Mickael,
> >
> > Tom got this part perfect so I'm just going to copy+paste: Thanks to
> Cheng
> > for the KIP and to you for picking it up.
> >
> > I'm wondering why we need separate properties for serialization vs.
> > deserialization? If the converter is being used by the Kafka Connect
> > runtime, a given instance of it will only ever be responsible for one or
> > the other (in other words, sink connectors' converters will only ever be
> > used for deserialization and source connectors' converters will only ever
> > be used for serialization). It seems like a single "use.optional.null"
> (or
> > "map.null.to.default") property would be simpler and easier to use, but I
> > might be missing something about why we'd want to add this kind of
> > granularity.
> >
> > I'm slightly in favor of the alternative name that Tom has proposed. I
> > think highlighting that this property deals with how to handle default
> > values is important, possibly more important than the fact that it deals
> > with null field values. I'm a little hesitant to use "map" since that may
> > be harder to remember and at first glance it might seem like it deals
> with
> > the map schema type. Maybe "replace.null.with.default.value"? (For the
> > record, I don't consider any of this worthy of blocking the KIP, so don't
> > feel the need to appease me on this front before starting a vote.)
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Mar 16, 2023 at 11:38 AM Tom Bentley 
> wrote:
> >
> > > Hi Mickael,
> > >
> > > Thanks to Cheng for the KIP and to you for picking it up.
> > >
> > > My only comment (feel free to ignore) is about the names of the
> configs.
> > > Personally I don't think I'd correctly guess what
> > > "serialize.use.optional.null" meant. Something like
> > > "serialize.map.null.to.default" is much clearer to me, for the cost
> of one
> > > extra token.
> > >
> > > Otherwise LGTM.
> > >
> > > Thanks,
> > >
> > > Tom
> > >
> > > On Wed, 8 Mar 2023 at 15:55, Mickael Maison 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > > This KIP has been stale for a long time. Since it would be a useful
> > > > feature, I pinged Cheng about a month ago asking if he was planning
> to
> > > > work on it again. I've not received a reply, so I've allowed myself
> to
> > > > update the KIP (hopefully preserving the initial requirements) and
> > > > would like to restart a discussion.
> > > >
> > > > The DISCUSS thread was split in two, you can find the other part in
> > > > https://lists.apache.org/thread/dc56k17zptzvbyc7gtscovzgzwf6yn9p
> > > >
> > > > Let me know if you have any feedback.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > > On Tue, Apr 14, 2020 at 8:28 PM Christopher Egerton <
> chr...@confluent.io
> > > >
> > > > wrote:
> > > > >
> > > > > Hi Cheng,
> > > > >
> > > > > Thanks for the KIP! I really appreciate the care that was taken to
> > > ensure
> > > > > backwards compatibility for existing users, and the minimal
> changes to
> > > > > public interface that are suggested to address this.
> > > > >
> > > > > I have two quick requests for clarification:
> > > > >
> > > > > 1) Where is the proposed "accept.optional.null" property going to
> > > apply?
> > > > > It's hinted that it'll take effect on the JSON converter but not
> > > actually
> > > > > called out anywhere.
> > > > >
> > > > > 2) Assuming this takes effect on the JSON converter, is the intent
> to
> > > > alter
> > > > > the semantics for both serialization and deserialization? The code
> > > > snippet
> > > > > from the JSON converter that's included in the KIP comes from the
> > > > > "convertToJson" method, which is used for serialization. However,
> based
> > > > on
> > > > >
> > > >
> > >
> https://github.com/apache/kafka/blob/ea47a885b1fe47dfb87c1dc86db1b0e7eb8a273c/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L712-L713
> > > > > it
> > > > > looks like the converter also inserts the default value for
> > > > > optional-but-null data during deserialization.
> > > > >
> > > > > Thanks again for the KIP!
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Wed, Mar 18, 2020 at 12:00 AM Cheng Pan <379377...@qq.com>
> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to use this thread to discuss KIP-581: Value of optional
> > > null
> > > > > > field which has default value, please see detail at:
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
> > > > > >
> > > > > >
> > > > 

[jira] [Created] (KAFKA-14818) Support non default file systems

2023-03-17 Thread Christian (Jira)
Christian created KAFKA-14818:
-

 Summary: Support non default file systems
 Key: KAFKA-14818
 URL: https://issues.apache.org/jira/browse/KAFKA-14818
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 3.3.2
Reporter: Christian


In our Java project we use a truststore and a keystore that are not part of 
our application but are downloaded from an external source. We run in a 
Docker container without file system write access.

We can download the keystore and truststore and store them in an 
in-memory file system (jimfs).

When the DefaultSslEngineFactory tries to load these files, it uses 

 
{code:java}
try (InputStream in = Files.newInputStream(Paths.get(path))) {{code}
The same goes for lastModifiedMs().

 

If Paths.get(path) were replaced with Paths.get(URI.create(path)), the code 
would work not only with the default file system but also with other file 
systems such as jimfs.
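
A minimal sketch of the idea, not the actual DefaultSslEngineFactory code: the 
class StoreLocations and its methods resolve and readStore are hypothetical 
names used only for illustration. One caveat worth noting is that 
Paths.get(URI) rejects a URI without a scheme, so a plain path such as 
/etc/keystore.jks would likely need a fallback to Paths.get(String). The 
sketch assumes that fallback is acceptable, and it uses 
InputStream.readAllBytes(), which requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class StoreLocations {

    /**
     * Hypothetical helper: treats the location as a URI when it carries a
     * scheme (for example "file:" or a custom file system provider's scheme),
     * and as a path on the default file system otherwise.
     */
    static Path resolve(String location) {
        try {
            URI uri = new URI(location);
            String scheme = uri.getScheme();
            // Require more than one character so that "C:/certs/ks.jks" is
            // not mistaken for a URI with the scheme "C".
            if (scheme != null && scheme.length() > 1) {
                return Paths.get(uri);
            }
        } catch (URISyntaxException e) {
            // Not a valid URI (spaces, backslashes, ...): use it as a plain path.
        }
        return Paths.get(location);
    }

    /** Reads the whole store, mirroring the Files.newInputStream call above. */
    static byte[] readStore(String location) throws IOException {
        try (InputStream in = Files.newInputStream(resolve(location))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("keystore", ".jks");
        try {
            Files.write(tmp, new byte[] {1, 2, 3});
            // Both the plain path and the file: URI resolve to the same file.
            System.out.println(readStore(tmp.toString()).length);
            System.out.println(readStore(tmp.toUri().toString()).length);
        } finally {
            Files.delete(tmp);
        }
    }
}
```

For a non-default file system, the provider for the URI's scheme has to be 
installed and the file system already created; otherwise Paths.get(URI) 
throws FileSystemNotFoundException.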



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [Discuss] KIP-581: Value of optional null field which has default value

2023-03-17 Thread Mickael Maison
Thanks Tom and Chris for your feedback!

I agree one configuration is enough. I've renamed it to
'replace.null.with.default'.
Since you both seemed happy with the KIP overall, I'll start a vote later today.
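
To make the semantics under discussion concrete, here is a minimal sketch of 
what a single 'replace.null.with.default' flag might control. It does not use 
the real JsonConverter or Connect schema classes: NullDefaultSketch, 
FieldSchema and convert are hypothetical stand-ins. It also assumes that 
setting the flag to true keeps the existing replace-with-default behaviour, 
which is an assumption drawn from the backward-compatibility remarks in this 
thread.

```java
final class NullDefaultSketch {

    /** Hypothetical stand-in for a Connect field schema. */
    static final class FieldSchema {
        final boolean optional;
        final Object defaultValue;

        FieldSchema(boolean optional, Object defaultValue) {
            this.optional = optional;
            this.defaultValue = defaultValue;
        }
    }

    /**
     * Returns the value to write for a field.
     * With the flag on, a null value is replaced by the schema default.
     * With the flag off, a null value of an optional field stays null.
     */
    static Object convert(FieldSchema schema, Object value, boolean replaceNullWithDefault) {
        if (value != null) {
            return value;
        }
        if (schema.defaultValue != null && replaceNullWithDefault) {
            return schema.defaultValue;
        }
        if (schema.optional) {
            return null;
        }
        throw new IllegalArgumentException(
                "null value for a required field that cannot use a default");
    }

    public static void main(String[] args) {
        FieldSchema field = new FieldSchema(true, "n/a");
        System.out.println(convert(field, null, true));   // prints "n/a"
        System.out.println(convert(field, null, false));  // prints "null"
        System.out.println(convert(field, "x", false));   // prints "x"
    }
}
```

Because the same rule applies whether the converter is serializing or 
deserializing, one flag covers both directions in this sketch.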

Thanks,
Mickael

On Thu, Mar 16, 2023 at 7:08 PM Chris Egerton  wrote:
>
> Hi Mickael,
>
> Tom got this part perfect so I'm just going to copy+paste: Thanks to Cheng
> for the KIP and to you for picking it up.
>
> I'm wondering why we need separate properties for serialization vs.
> deserialization? If the converter is being used by the Kafka Connect
> runtime, a given instance of it will only ever be responsible for one or
> the other (in other words, sink connectors' converters will only ever be
> used for deserialization and source connectors' converters will only ever
> be used for serialization). It seems like a single "use.optional.null" (or
> "map.null.to.default") property would be simpler and easier to use, but I
> might be missing something about why we'd want to add this kind of
> granularity.
>
> I'm slightly in favor of the alternative name that Tom has proposed. I
> think highlighting that this property deals with how to handle default
> values is important, possibly more important than the fact that it deals
> with null field values. I'm a little hesitant to use "map" since that may
> be harder to remember and at first glance it might seem like it deals with
> the map schema type. Maybe "replace.null.with.default.value"? (For the
> record, I don't consider any of this worthy of blocking the KIP, so don't
> feel the need to appease me on this front before starting a vote.)
>
> Cheers,
>
> Chris
>
> On Thu, Mar 16, 2023 at 11:38 AM Tom Bentley  wrote:
>
> > Hi Mickael,
> >
> > Thanks to Cheng for the KIP and to you for picking it up.
> >
> > My only comment (feel free to ignore) is about the names of the configs.
> > Personally I don't think I'd correctly guess what
> > "serialize.use.optional.null" meant. Something like
> > "serialize.map.null.to.default" is much clearer to me, for the cost of one
> > extra token.
> >
> > Otherwise LGTM.
> >
> > Thanks,
> >
> > Tom
> >
> > On Wed, 8 Mar 2023 at 15:55, Mickael Maison 
> > wrote:
> >
> > > Hi,
> > >
> > > This KIP has been stale for a long time. Since it would be a useful
> > > feature, I pinged Cheng about a month ago asking if he was planning to
> > > work on it again. I've not received a reply, so I've allowed myself to
> > > update the KIP (hopefully preserving the initial requirements) and
> > > would like to restart a discussion.
> > >
> > > The DISCUSS thread was split in two, you can find the other part in
> > > https://lists.apache.org/thread/dc56k17zptzvbyc7gtscovzgzwf6yn9p
> > >
> > > Let me know if you have any feedback.
> > >
> > > Thanks,
> > > Mickael
> > >
> > > On Tue, Apr 14, 2020 at 8:28 PM Christopher Egerton
> > > wrote:
> > > >
> > > > Hi Cheng,
> > > >
> > > > Thanks for the KIP! I really appreciate the care that was taken to
> > > > ensure backwards compatibility for existing users, and the minimal
> > > > changes to public interface that are suggested to address this.
> > > >
> > > > I have two quick requests for clarification:
> > > >
> > > > 1) Where is the proposed "accept.optional.null" property going to
> > > > apply? It's hinted that it'll take effect on the JSON converter but
> > > > not actually called out anywhere.
> > > >
> > > > 2) Assuming this takes effect on the JSON converter, is the intent to
> > > > alter the semantics for both serialization and deserialization? The
> > > > code snippet from the JSON converter that's included in the KIP comes
> > > > from the "convertToJson" method, which is used for serialization.
> > > > However, based on
> > > > https://github.com/apache/kafka/blob/ea47a885b1fe47dfb87c1dc86db1b0e7eb8a273c/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L712-L713
> > > > it looks like the converter also inserts the default value for
> > > > optional-but-null data during deserialization.
> > > >
> > > > Thanks again for the KIP!
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Wed, Mar 18, 2020 at 12:00 AM Cheng Pan <379377...@qq.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > > I'd like to use this thread to discuss KIP-581: Value of optional
> > > > > > null field which has default value. Please see the details at:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
> > > > > >
> > > > > >
> > > > > > There is some previous discussion at:
> > > > > > https://github.com/apache/kafka/pull/7112
> > > > > >
> > > > > >
> > > > > > I'm new to Apache projects, so please let me know if I did anything
> > > > > > wrong.
> > > > >
> > > > >
> > > > > Best regards,
> > > > > Cheng Pan
> > >
> > >
> >


[jira] [Created] (KAFKA-14817) LogCleaner mark some partitions of __consumer_offsets as uncleanable

2023-03-17 Thread ZhenChun Pan (Jira)
ZhenChun Pan created KAFKA-14817:


 Summary: LogCleaner mark some partitions of __consumer_offsets as 
uncleanable
 Key: KAFKA-14817
 URL: https://issues.apache.org/jira/browse/KAFKA-14817
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.8.1
Reporter: ZhenChun Pan


We found that some partitions of the __consumer_offsets topic can no longer 
apply retention to their logs and take up a lot of disk space. We then found 
that these partitions had been marked as uncleanable in log-cleaner.log. The 
logs are below:

[2023-03-17 17:53:46,655] INFO Starting the log cleaner (kafka.log.LogCleaner)
[2023-03-17 17:53:46,770] INFO [kafka-log-cleaner-thread-0]: Starting 
(kafka.log.LogCleaner)
[2023-03-17 17:53:46,841] INFO Cleaner 0: Beginning cleaning of log 
__consumer_offsets-24. (kafka.log.LogCleaner)
[2023-03-17 17:53:46,841] INFO Cleaner 0: Building offset map for 
__consumer_offsets-24... (kafka.log.LogCleaner)
[2023-03-17 17:53:47,013] INFO Cleaner 0: Building offset map for log 
__consumer_offsets-24 for 5 segments in offset range [0, 2360519). 
(kafka.log.LogCleaner)
[2023-03-17 17:53:47,394] INFO Cleaner 0: Growing cleaner I/O buffers from 
262144 bytes to 524288 bytes. (kafka.log.LogCleaner)
[2023-03-17 17:53:47,395] INFO Cleaner 0: Growing cleaner I/O buffers from 
524288 bytes to 1048576 bytes. (kafka.log.LogCleaner)
[2023-03-17 17:53:47,396] INFO Cleaner 0: Growing cleaner I/O buffers from 
1048576 bytes to 2097152 bytes. (kafka.log.LogCleaner)
[2023-03-17 17:53:47,401] INFO Cleaner 0: Growing cleaner I/O buffers from 
2097152 bytes to 4194304 bytes. (kafka.log.LogCleaner)
[2023-03-17 17:53:47,409] INFO Cleaner 0: Growing cleaner I/O buffers from 
4194304 bytes to 8388608 bytes. (kafka.log.LogCleaner)
[2023-03-17 17:53:47,434] INFO Cleaner 0: Growing cleaner I/O buffers from 
8388608 bytes to 10485772 bytes. (kafka.log.LogCleaner)
[2023-03-17 17:53:47,465] WARN [kafka-log-cleaner-thread-0]: Unexpected 
exception thrown when cleaning log 
Log(dir=/opt/kafka-service/data/__consumer_offsets-24, 
topic=__consumer_offsets, partition=24, highWatermark=0, lastStableOffset=0, 
logStartOffset=0, logEndOffset=2759760). Marking its partition 
(__consumer_offsets-24) as uncleanable (kafka.log.LogCleaner)
kafka.log.LogCleaningException: Batch size 223 < buffer size 10485772, but not 
processed for log segment 
/opt/kafka-service/data/__consumer_offsets-24/.log at 
position 31457091
        at 
kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:356)
        at 
kafka.log.LogCleaner$CleanerThread.tryCleanFilthiestLog(LogCleaner.scala:332)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:321)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
Caused by: java.lang.IllegalStateException: Batch size 223 < buffer size 
10485772, but not processed for log segment 
/opt/kafka-service/data/__consumer_offsets-24/.log at 
position 31457091
        at kafka.log.Cleaner.growBuffersOrFail(LogCleaner.scala:745)
        at kafka.log.Cleaner.buildOffsetMapForSegment(LogCleaner.scala:983)
        at kafka.log.Cleaner.$anonfun$buildOffsetMap$5(LogCleaner.scala:908)
        at 
kafka.log.Cleaner.$anonfun$buildOffsetMap$5$adapted(LogCleaner.scala:904)
        at 
scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985)
        at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984)
        at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:904)
        at kafka.log.Cleaner.doClean(LogCleaner.scala:523)
        at kafka.log.Cleaner.clean(LogCleaner.scala:511)
        at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:380)
        at 
kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:352)
        ... 3 more
[2023-03-17 17:54:02,477] INFO Cleaner 0: Beginning cleaning of log 
__consumer_offsets-33. (kafka.log.LogCleaner)
[2023-03-17 17:54:02,477] INFO Cleaner 0: Building offset map for 
__consumer_offsets-33... (kafka.log.LogCleaner)
[2023-03-17 17:54:02,511] INFO Cleaner 0: Building offset map for log 
__consumer_offsets-33 for 2361 segments in offset range [2946663171, 
7343698202). (kafka.log.LogCleaner)
[2023-03-17 17:54:03,235] WARN [kafka-log-cleaner-thread-0]: Unexpected 
exception thrown when cleaning log 
Log(dir=/opt/kafka-service/data/__consumer_offsets-33, 
topic=__consumer_offsets, partition=33, highWatermark=7344703164, 
lastStableOffset=7344703164, logStartOffset=0, logEndOffset=7344703164). 
Marking its partition (__consumer_offsets-33) as uncl