Re: [DISCUSS] KIP-1033: Add Kafka Streams exception handler for exceptions occurring during processing

2024-05-10 Thread Sophie Blee-Goldman
Whoops, just noticed there is already a voting thread for this. Hard to
keep track with all the KIPs going on right now!

In that case I'll just wait for the SerializationExceptionOrigin thing to
be added and then I'll vote. Should definitely be able to make 3.8 in this
case :D

On Fri, May 10, 2024 at 10:10 PM Sophie Blee-Goldman 
wrote:

> Sounds like we're more or less in agreement here. I think the KIP just
> needs one small update still, which is the SerializationExceptionOrigin
> enum.
>
> As discussed there are a few options for this and we're all happy to defer
> to the preference of the KIP authors, but if we keep the KIP as-is with the
> two separate methods in the ProductionExceptionHandler, then imo it makes
> the most sense to add the SerializationExceptionOrigin enum to the
> ProductionExceptionHandler interface and then add an "origin" parameter to
> the new  #handleSerializationException method. However you decide to do it,
> I'm personally happy to vote on this KIP once the KIP is updated.
>
>  Just FYI the 3.8 KIP freeze is this upcoming Wednesday, so if you guys
> would like to target 3.8 for this feature, just make sure to update the KIP
> and kick off a [VOTE] thread by EOD Monday so that you can close the vote
> by EOD Wednesday (since it has to be open for 72 hours).
>
> Thanks again for this sorely needed feature!
>
> On Fri, May 10, 2024 at 10:06 AM Bill Bejeck  wrote:
>
>> Great KIP discussion so far by everyone.
>> At this point, I'm in agreement with the direction and current state of
>> the
>> KIP.
>>
>> As for having two separate callbacks for the ProductionExceptionHandler,
>> I'm somewhat split in that I agree with points raised by Sophie and
>> Matthias with my final
>> position being to maintain both callbacks.  IMHO, while there are several
>> things that could go wrong with producing a message, it seems that
>> serialization exceptions would be the most common, although I don't have
>> any data to back that opinion up.  But having said that, should the KIP
>> authors decide otherwise, I would be fine with that approach as well.
>>
>> I'm at the point where I'm comfortable voting for this KIP.
>>
>> Thanks,
>> Bill
>>
>> On Thu, May 9, 2024 at 4:28 PM Sophie Blee-Goldman
>> wrote:
>>
>> > The type safety issue is definitely not solved by having two separate
>> > callbacks. I just think it gets a bit worse by mashing them into one
>> > method. At least in the plain #handle method you can be sure that the type
>> > is ProducerRecord<byte[], byte[]> and in #handleSerializationException the
>> > type is some POJO.
>> >
>> > And in theory you can just embed the mapping of sink topics to type/Serde
>> > based on your topology. Or let's say your output record keys & values are
>> > all Strings, and you want to print the String representation in your
>> > handler, rather than the bytes.
>> > Having a separate callback means knowing you can simply print the
>> > ProducerRecord's key/value in the #handleSerializationException method,
>> > whereas you would have to use a StringDeserializer to convert the key/value
>> > to its String form to print it in the #handle method.
>> >
>> > Again, I just feel this will be more straightforward and easy for users
>> to
>> > use correctly, but am satisfied either way. I'll shut up now and wait
>> for
>> > the KIP authors to make a call on this one way or another, and then I'm
>> > happy to cast my vote
>> >
>> > On Thu, May 9, 2024 at 10:15 AM Matthias J. Sax 
>> wrote:
>> >
>> > > Thanks Sophie! Makes it much clearer where you are coming from.
>> > >
>> > > About the type unsafety: isn't this also an issue for the
>> > > `handleSerializationException` case, because the handler is used for all
>> > > sink topics, and thus key/value types are not really known w/o taking the
>> > > sink topic into account? -- So I am not sure if having two handler
>> > > methods really helps much with regard to type safety?
>> > >
>> > > Just want to make this small comment for completeness. Let's hear what
>> > > others think. Given that we both don't have a strong opinion but just
>> a
>> > > personal preference, we should be able to come to a conclusion quickly
>> > > and get this KIP approved for 3.8 :)
>> > >
>> > >
>> > > -Matthias
>> > >
>> > > On 5/8/24 3:12 PM, Sophie Blee-Goldman wrote:
>> > > > Well I definitely don't feel super strongly about it, and more
>> > > importantly,
>> > > > I'm not a user. So I will happily defer to the preference of anyone
>> who
>> > > > will actually be using this feature. But I'll explain my reasoning:
>> > > >
>> > > > There *is* a relevant distinction between these two callbacks -- because
>> > > > the passed-in record will have a different type depending on whether it
>> > > > was a serialization exception or something else. Even if we combined them
>> > > > into a single #handle method, users will still end up implementing two
>> > > > distinct branches depending on whether it was a serialization exception or
>> > > > not, since that determines the type of the ProducerRecord passed in.
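
For illustration, a minimal sketch of the two-callback handler shape discussed
in this thread. ErrorHandlerContext, SerializationExceptionOrigin, and the
origin parameter are the KIP's proposed additions, not released API, so this is
an approximation of the proposal rather than a definitive implementation:

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingProductionExceptionHandler implements ProductionExceptionHandler {

    private static final Logger log =
        LoggerFactory.getLogger(LoggingProductionExceptionHandler.class);

    @Override
    public ProductionExceptionHandlerResponse handle(
            final ErrorHandlerContext context,
            final ProducerRecord<byte[], byte[]> record,  // already serialized
            final Exception exception) {
        // Key/value are raw bytes here; printing them as Strings would require
        // knowing the sink topic's Serdes -- the type-safety concern above.
        log.error("Failed to produce record to topic {}", record.topic(), exception);
        return ProductionExceptionHandlerResponse.FAIL;
    }

    @Override
    public ProductionExceptionHandlerResponse handleSerializationException(
            final ErrorHandlerContext context,
            final ProducerRecord record,                  // raw key/value POJOs
            final Exception exception,
            final SerializationExceptionOrigin origin) {  // KEY or VALUE
        // Key/value are still the original objects, so toString() is usable.
        log.error("Failed to serialize {} of record {}", origin, record, exception);
        return ProductionExceptionHandlerResponse.FAIL;
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}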

Re: [DISCUSS] KIP-1033: Add Kafka Streams exception handler for exceptions occurring during processing

2024-05-10 Thread Sophie Blee-Goldman
Sounds like we're more or less in agreement here. I think the KIP just
needs one small update still, which is the SerializationExceptionOrigin
enum.

As discussed there are a few options for this and we're all happy to defer
to the preference of the KIP authors, but if we keep the KIP as-is with the
two separate methods in the ProductionExceptionHandler, then imo it makes
the most sense to add the SerializationExceptionOrigin enum to the
ProductionExceptionHandler interface and then add an "origin" parameter to
the new  #handleSerializationException method. However you decide to do it,
I'm personally happy to vote on this KIP once the KIP is updated.

 Just FYI the 3.8 KIP freeze is this upcoming Wednesday, so if you guys
would like to target 3.8 for this feature, just make sure to update the KIP
and kick off a [VOTE] thread by EOD Monday so that you can close the vote
by EOD Wednesday (since it has to be open for 72 hours).

Thanks again for this sorely needed feature!

On Fri, May 10, 2024 at 10:06 AM Bill Bejeck  wrote:

> Great KIP discussion so far by everyone.
> At this point, I'm in agreement with the direction and current state of the
> KIP.
>
> As for having two separate callbacks for the ProductionExceptionHandler,
> I'm somewhat split in that I agree with points raised by Sophie and
> Matthias with my final
> position being to maintain both callbacks.  IMHO, while there are several
> things that could go wrong with producing a message, it seems that
> serialization exceptions would be the most common, although I don't have
> any data to back that opinion up.  But having said that, should the KIP
> authors decide otherwise, I would be fine with that approach as well.
>
> I'm at the point where I'm comfortable voting for this KIP.
>
> Thanks,
> Bill
>
> On Thu, May 9, 2024 at 4:28 PM Sophie Blee-Goldman 
> wrote:
>
> > The type safety issue is definitely not solved by having two separate
> > callbacks. I just think it gets a bit worse by mashing them into one
> > method. At least in the plain #handle method you can be sure that the type
> > is ProducerRecord<byte[], byte[]> and in #handleSerializationException the
> > type is some POJO.
> >
> > And in theory you can just embed the mapping of sink topics to type/Serde
> > based on your topology. Or let's say your output record keys & values are
> > all Strings, and you want to print the String representation in your
> > handler, rather than the bytes.
> > Having a separate callback means knowing you can simply print the
> > ProducerRecord's key/value in the #handleSerializationException method,
> > whereas you would have to use a StringDeserializer to convert the key/value
> > to its String form to print it in the #handle method.
> >
> > Again, I just feel this will be more straightforward and easy for users
> to
> > use correctly, but am satisfied either way. I'll shut up now and wait for
> > the KIP authors to make a call on this one way or another, and then I'm
> > happy to cast my vote
> >
> > On Thu, May 9, 2024 at 10:15 AM Matthias J. Sax 
> wrote:
> >
> > > Thanks Sophie! Makes it much clearer where you are coming from.
> > >
> > > About the type unsafety: isn't this also an issue for the
> > > `handleSerializationException` case, because the handler is used for all
> > > sink topics, and thus key/value types are not really known w/o taking the
> > > sink topic into account? -- So I am not sure if having two handler
> > > methods really helps much with regard to type safety?
> > >
> > > Just want to make this small comment for completeness. Let's hear what
> > > others think. Given that we both don't have a strong opinion but just a
> > > personal preference, we should be able to come to a conclusion quickly
> > > and get this KIP approved for 3.8 :)
> > >
> > >
> > > -Matthias
> > >
> > > On 5/8/24 3:12 PM, Sophie Blee-Goldman wrote:
> > > > Well I definitely don't feel super strongly about it, and more
> > > importantly,
> > > > I'm not a user. So I will happily defer to the preference of anyone
> who
> > > > will actually be using this feature. But I'll explain my reasoning:
> > > >
> > > > There *is* a relevant distinction between these two callbacks --
> > because
> > > > the passed-in record will have a different type depending on whether
> it
> > > was
> > > > a serialization exception or something else. Even if we combined them
> > > into
> > > > a single #handle method, users will still end up implementing two
> > > distinct
> > > > branches depending on whether it was a serialization exception or
> not,
> > > > since that determines the type of the ProducerRecord passed in.
> > > >
> > > > Not to mention they'll need to cast it to a ProducerRecord<byte[],
> > > > byte[]> when we could have just passed it in as this type via a dedicated
> > > > callback. And note that because of the generics, they can't do an
> > > > instanceof check to make sure that the record type is
> > > > ProducerRecord<byte[], byte[]> and will have to suppress the "unchecked
> > > > cast" warning.
> > 

[jira] [Created] (KAFKA-16704) Fix flaky kafka.log.remote.RemoteIndexCacheTest#testIndexFileAlreadyExistOnDiskButNotInCache

2024-05-10 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16704:
--

 Summary: Fix flaky  
kafka.log.remote.RemoteIndexCacheTest#testIndexFileAlreadyExistOnDiskButNotInCache
 Key: KAFKA-16704
 URL: https://issues.apache.org/jira/browse/KAFKA-16704
 Project: Kafka
  Issue Type: Test
Reporter: Chia-Ping Tsai


java.io.UncheckedIOException: java.nio.file.NoSuchFileException: 
/tmp/kafka-RemoteIndexCacheTest3690189103734187552/R68MBnutRfmqJY66XXFoOA:foo-0/remote-log-index-cache/2147584984_Ma8JCqucS7mqKIHfSSDeow.txnindex.deleted
 at 
java.base/java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:87)
 at java.base/java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:103)
 at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132)
 at 
java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
 at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
 at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
 at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
 at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
 at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
 at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
 at 
kafka.log.remote.RemoteIndexCacheTest.renameRemoteCacheIndexFileFromDisk$1(RemoteIndexCacheTest.scala:832)
 at 
kafka.log.remote.RemoteIndexCacheTest.testIndexFileAlreadyExistOnDiskButNotInCache(RemoteIndexCacheTest.scala:851)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
 at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2892

2024-05-10 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16703) SocketServer leaks ServerSocketChannel when port is already in-use

2024-05-10 Thread Greg Harris (Jira)
Greg Harris created KAFKA-16703:
---

 Summary: SocketServer leaks ServerSocketChannel when port is 
already in-use
 Key: KAFKA-16703
 URL: https://issues.apache.org/jira/browse/KAFKA-16703
 Project: Kafka
  Issue Type: Test
Reporter: Greg Harris
Assignee: Greg Harris


The SocketServer#openServerSocket method creates a serverSocket, and then 
attempts to bind it to the selected port. If the port is already in-use, an 
exception is propagated and the serverSocket is never closed.

This causes KafkaServerTest#testListenerPortAlreadyInUse to leak sockets.
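
A minimal sketch of the fix pattern (illustrative only, not the actual
SocketServer patch, which is in Scala): close the channel before rethrowing
when bind fails.

{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

final class ServerSocketSketch {
    // If bind() throws (e.g. the port is already in use), close the channel
    // before propagating the exception so the ServerSocketChannel isn't leaked.
    static ServerSocketChannel openServerSocket(String host, int port) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        try {
            channel.socket().bind(new InetSocketAddress(host, port));
            return channel;
        } catch (IOException e) {
            channel.close();
            throw e;
        }
    }
}
{code}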



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16702) KafkaLog4jAppenderTest leaks producer instances

2024-05-10 Thread Greg Harris (Jira)
Greg Harris created KAFKA-16702:
---

 Summary: KafkaLog4jAppenderTest leaks producer instances
 Key: KAFKA-16702
 URL: https://issues.apache.org/jira/browse/KAFKA-16702
 Project: Kafka
  Issue Type: Test
Affects Versions: 3.8.0
Reporter: Greg Harris
Assignee: Greg Harris


The KafkaLog4jAppenderTest has the method getLog4jConfigWithRealProducer which 
naturally creates a real producer. This appender is never cleaned up within the 
test, so the producer is leaked.
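
A sketch of the kind of cleanup the ticket calls for (illustrative; the actual
fix may differ), assuming the log4j 1.x API the appender is built on:

{code:java}
import org.apache.log4j.LogManager;
import org.junit.jupiter.api.AfterEach;

class KafkaLog4jAppenderTestCleanupSketch {
    @AfterEach
    void tearDown() {
        // Shutting down log4j closes all registered appenders; closing the
        // KafkaLog4jAppender in turn closes its underlying KafkaProducer.
        LogManager.shutdown();
    }
}
{code}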



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16513) Allow WriteTxnMarkers API with Alter Cluster Permission

2024-05-10 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-16513.

Resolution: Fixed

> Allow WriteTxnMarkers API with Alter Cluster Permission
> ---
>
> Key: KAFKA-16513
> URL: https://issues.apache.org/jira/browse/KAFKA-16513
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Reporter: Nikhil Ramakrishnan
>Assignee: Siddharth Yagnik
>Priority: Minor
>  Labels: KIP-1037
> Fix For: 3.8.0
>
>
> We should allow WriteTxnMarkers API with Alter Cluster Permission because it
> can be invoked externally by a Kafka AdminClient. Such usage is more aligned
> with the Alter permission on the Cluster resource, which includes other 
> administrative actions invoked from the Kafka AdminClient.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16701) Some SocketServerTest buffered close tests flaky failing locally

2024-05-10 Thread Greg Harris (Jira)
Greg Harris created KAFKA-16701:
---

 Summary: Some SocketServerTest buffered close tests flaky failing 
locally
 Key: KAFKA-16701
 URL: https://issues.apache.org/jira/browse/KAFKA-16701
 Project: Kafka
  Issue Type: Test
  Components: core, unit tests
Affects Versions: 3.7.0, 3.6.0, 3.5.0
Reporter: Greg Harris


These tests are failing for me in a local development environment, but they
don't appear to be flaky or failing in CI. They only appear to fail on JDK >=
17. I'm using an M1 Mac, so it is possible that either the Mac's linear port
allocation or a native implementation is a factor.

closingChannelSendFailure()

 
{noformat}
java.lang.AssertionError: receiveRequest timed out
at 
kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
at 
kafka.network.SocketServerTest.makeChannelWithBufferedRequestsAndCloseRemote(SocketServerTest.scala:690)
at 
kafka.network.SocketServerTest.$anonfun$verifySendFailureAfterRemoteClose$1(SocketServerTest.scala:1434)
at 
kafka.network.SocketServerTest.verifySendFailureAfterRemoteClose(SocketServerTest.scala:1430)
at 
kafka.network.SocketServerTest.closingChannelSendFailure(SocketServerTest.scala:1425){noformat}
closingChannelWithBufferedReceivesFailedSend()

 
{noformat}
java.lang.AssertionError: receiveRequest timed out
at 
kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
at 
kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
at 
kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
at 
kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
at 
kafka.network.SocketServerTest.closingChannelWithBufferedReceivesFailedSend(SocketServerTest.scala:1520){noformat}
closingChannelWithCompleteAndIncompleteBufferedReceives()
{noformat}
java.lang.AssertionError: receiveRequest timed out
at kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
at kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
at kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
at kafka.network.SocketServerTest.closingChannelWithCompleteAndIncompleteBufferedReceives(SocketServerTest.scala:1511)
{noformat}
remoteCloseWithBufferedReceives()
{noformat}
java.lang.AssertionError: receiveRequest timed out
at 
kafka.network.SocketServerTest.receiveRequest(SocketServerTest.scala:148)
at 
kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$6(SocketServerTest.scala:1591)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
at 
kafka.network.SocketServerTest.$anonfun$verifyRemoteCloseWithBufferedReceives$1(SocketServerTest.scala:1590)
at 
kafka.network.SocketServerTest.verifyRemoteCloseWithBufferedReceives(SocketServerTest.scala:1553)
at 
kafka.network.SocketServerTest.remoteCloseWithBufferedReceives(SocketServerTest.scala:1453){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16700) Kafka Streams: possible message loss on KTable-KTable FK Left Join

2024-05-10 Thread Jira
Karsten Stöckmann created KAFKA-16700:
-

 Summary: Kafka Streams: possible message loss on KTable-KTable FK 
Left Join
 Key: KAFKA-16700
 URL: https://issues.apache.org/jira/browse/KAFKA-16700
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.7.0
 Environment: Apache Kafka 3.7.0 cluster in KRaft mode, 3 brokers and 3 
controllers, on managed Kubernetes 1.28.7, operated by Strimzi Kafka Operators
Reporter: Karsten Stöckmann


We are experiencing significant, yet intermittent / non-deterministic / 
unexplainable message loss on a Kafka Streams topology while performing a 
*KTable-KTable* {*}FK Left Join{*}.

Assume the following snippet:
{code:java}
streamsBuilder
.table(
folderTopicName,
Consumed.with(
folderKeySerde,
folderSerde))
.leftJoin(
agencies, // KTable
Folder::agencyIdValue,
AggregateFolder::new,
TableJoined.as("folder-to-agency"),
Materialized
.as("folder-to-agency-materialized")
.withKeySerde(folderKeySerde)
.withValueSerde(aggregateFolderSerde))
.leftJoin(
documents,
{code}
The setup is as follows:

A Debezium Connector for PostgreSQL streams database changes into various Kafka 
topics. A series of Quarkus Kafka Streams applications then performs 
aggregation operations on those topics to create index documents later to be 
sent into an OpenSearch system.

When firing up the Kafka Streams infrastructure to work on initially populated 
Kafka Topics (i.e. a snapshot of all relevant table data has been streamed to 
Kafka), the above shown KTable-KTable FK Left Join seems to produce message 
loss on the first of a series of FK Left Joins; the right hand 
{{KTable}} is consumed from an aggregated topic fed 
from another Kafka Streams topology / application.

On a (heavily reduced) test data set of 6828 messages in the 
{{folderTopicName}} Topic, we observe the following results:
 * {{{}folder-to-agency-subscription-registration{}}}: *6967* messages
 * {{{}folder-to-agency-subscription-response{}}}: *3048* messages
 * {{{}folder-to-agency-subscription-store-changelog{}}}: *6359* messages
 * {{{}folder-to-agency-materialized-changelog{}}}: *4644* messages.

Given the nature of a (FK) Left Join, I'd expect every message from the
left-hand topic to produce an aggregate even if no matching message is found
in the right-hand topic.
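
To make that expectation concrete, here is a sketch of the join in isolation.
The types follow the snippet above; "folders" is assumed to be the KTable
built from folderTopicName:

{code:java}
// FK left join semantics: a result is expected for every Folder, even when no
// Agency matches the extracted agencyId -- in that case the joiner is invoked
// with a null right-hand value.
KTable<FolderId, AggregateFolder> joined = folders.leftJoin(
    agencies,                                  // right-hand KTable<AgencyId, Agency>
    Folder::agencyIdValue,                     // foreign-key extractor
    (folder, agency) -> new AggregateFolder(folder, agency));  // agency may be null
{code}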

Message loss unpredictably varies across tests and seems not to be bound to 
specific keys or messages.

As it seems, this can only be observed when initially firing up the Streams 
infrastructure to process the message 'backlog' that had been snapshotted by 
Debezium. A manual snapshot triggered later (i.e. Streams applications already 
running) seems not to show this behaviour. Additionally, as of yet we observed 
this kind of message loss only when running multiple replicas of the affected 
application. When carrying out the tests with only one replica, everything 
seems to work as expected. We've tried to leverage 
{{group.initial.rebalance.delay.ms}} in order to rule out possible rebalancing 
issues, but to no avail.

Our Kafka configuration:
{code:yaml}
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
default.replication.factor: 3
min.insync.replicas: 2
message.max.bytes: "20971520"
{code}
Our Kafka Streams application configuration:
{code:yaml}
kafka-streams.num.stream.threads: 5
kafka-streams.num.standby.replicas: 1
kafka-streams.auto.offset.reset: earliest
kafka-streams.cache.max.bytes.buffering: "20971520"
kafka-streams.commit.interval.ms: 100
kafka-streams.fetch.max.bytes: "10485760"
kafka-streams.max.request.size: "10485760"
kafka-streams.max.partition.fetch.bytes: "10485760"
kafka-streams.metadata.max.age.ms: 30
kafka-streams.statestore.cache.max.bytes: "20971520"
kafka-streams.topology.optimization: all
kafka-streams.processing.guarantee: exactly_once_v2

# Kafka Streams Intermediate Topics
kafka-streams.topic.compression.type: lz4
kafka-streams.topic.segment.ms: "43200000" # 12h
kafka-streams.topic.max.compaction.lag.ms: "86400000" # 24h
kafka-streams.topic.delete.retention.ms: "86400000" # 24h

kafka-streams.producer.max.request.size: "20971520" # 20MiB
kafka-streams.producer.transaction.timeout.ms: 100 # Should match 
commit.interval.ms, set close to 100ms for exactly_once_v2

kafka-streams.consumer.group.instance.id: ${HOSTNAME}
kafka-streams.consumer.heartbeat.interval.ms: 100
kafka-streams.consumer.session.timeout.ms: 45000
{code}
All input (and aggregate) topics feature 15 partitions and share this 
configuration:
{code:yaml}

Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2891

2024-05-10 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-1033: Add Kafka Streams exception handler for exceptions occurring during processing

2024-05-10 Thread Bill Bejeck
Great KIP discussion so far by everyone.
At this point, I'm in agreement with the direction and current state of the
KIP.

As for having two separate callbacks for the ProductionExceptionHandler,
I'm somewhat split in that I agree with points raised by Sophie and
Matthias with my final
position being to maintain both callbacks.  IMHO, while there are several
things that could go wrong with producing a message, it seems that
serialization exceptions would be the most common, although I don't have
any data to back that opinion up.  But having said that, should the KIP
authors decide otherwise, I would be fine with that approach as well.

I'm at the point where I'm comfortable voting for this KIP.

Thanks,
Bill

On Thu, May 9, 2024 at 4:28 PM Sophie Blee-Goldman 
wrote:

> The type safety issue is definitely not solved by having two separate
> callbacks. I just think it gets a bit worse by mashing them into one
> method. At least in the plain #handle method you can be sure that the type
> is ProducerRecord<byte[], byte[]> and in #handleSerializationException the
> type is some POJO.
>
> And in theory you can just embed the mapping of sink topics to type/Serde
> based on your topology. Or let's say your output record keys & values are
> all Strings, and you want to print the String representation in your
> handler, rather than the bytes.
> Having a separate callback means knowing you can simply print the
> ProducerRecord's key/value in the #handleSerializationException method,
> whereas you would have to use a StringDeserializer to convert the key/value
> to its String form to print it in the #handle method.
>
> Again, I just feel this will be more straightforward and easy for users to
> use correctly, but am satisfied either way. I'll shut up now and wait for
> the KIP authors to make a call on this one way or another, and then I'm
> happy to cast my vote
>
> On Thu, May 9, 2024 at 10:15 AM Matthias J. Sax  wrote:
>
> > Thanks Sophie! Makes it much clearer where you are coming from.
> >
> > About the type unsafety: isn't this also an issue for the
> > `handleSerializationException` case, because the handler is used for all
> > sink topics, and thus key/value types are not really known w/o taking the
> > sink topic into account? -- So I am not sure if having two handler
> > methods really helps much with regard to type safety?
> >
> > Just want to make this small comment for completeness. Let's hear what
> > others think. Given that we both don't have a strong opinion but just a
> > personal preference, we should be able to come to a conclusion quickly
> > and get this KIP approved for 3.8 :)
> >
> >
> > -Matthias
> >
> > On 5/8/24 3:12 PM, Sophie Blee-Goldman wrote:
> > > Well I definitely don't feel super strongly about it, and more
> > importantly,
> > > I'm not a user. So I will happily defer to the preference of anyone who
> > > will actually be using this feature. But I'll explain my reasoning:
> > >
> > > There *is* a relevant distinction between these two callbacks --
> because
> > > the passed-in record will have a different type depending on whether it
> > was
> > > a serialization exception or something else. Even if we combined them
> > into
> > > a single #handle method, users will still end up implementing two
> > distinct
> > > branches depending on whether it was a serialization exception or not,
> > > since that determines the type of the ProducerRecord passed in.
> > >
> > > Not to mention they'll need to cast it to a ProducerRecord<byte[], byte[]>
> > > when we could have just passed it in as this type via a dedicated callback.
> > > And note that because of the generics, they can't do an instanceof check to
> > > make sure that the record type is ProducerRecord<byte[], byte[]> and will
> > > have to suppress the "unchecked cast" warning.
> > >
> > > So if we combined the two callbacks, their handler will look something
> > > like this:
> > >
> > > @SuppressWarnings("unchecked")
> > > public ProductionExceptionHandlerResponse handle(final ErrorHandlerContext context,
> > >                                                  final ProducerRecord record,
> > >                                                  final Exception exception) {
> > >     if (exception instanceof SerializationException) {
> > >         if (exception.origin().equals(KEY)) {
> > >             log.error("Failed to serialize key", exception);
> > >         } else {
> > >             log.error("Failed to serialize value", exception);
> > >         }
> > >     } else {
> > >         final ProducerRecord<byte[], byte[]> serializedRecord =
> > >             (ProducerRecord<byte[], byte[]>) record;
> > >         log.error("Failed to produce record with serialized key={} and serialized value={}",
> > >             serializedRecord.key(), serializedRecord.value());
> > >     }
> > >     return ProductionExceptionHandlerResponse.FAIL;
> > > }
> > >
> > > That seems like the most basic case, and it still has distinct logic
> > > even if they ultimately handle exceptions the same way. And looking forward
> > > to KIP-1034: Dead-letter queues, it seems all the more likely that the
> > > actual handling response might be different depending on whether it's a
> > > 

[jira] [Created] (KAFKA-16699) Have Streams treat InvalidPidMappingException like a ProducerFencedException

2024-05-10 Thread Walker Carlson (Jira)
Walker Carlson created KAFKA-16699:
--

 Summary: Have Streams treat InvalidPidMappingException like a 
ProducerFencedException
 Key: KAFKA-16699
 URL: https://issues.apache.org/jira/browse/KAFKA-16699
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Walker Carlson






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16698) Fix flaky kafka.network.DynamicConnectionQuotaTest#testDynamicIpConnectionRateQuota

2024-05-10 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16698:
--

 Summary: Fix flaky 
kafka.network.DynamicConnectionQuotaTest#testDynamicIpConnectionRateQuota
 Key: KAFKA-16698
 URL: https://issues.apache.org/jira/browse/KAFKA-16698
 Project: Kafka
  Issue Type: Test
Reporter: Chia-Ping Tsai


{code:java}
org.opentest4j.AssertionFailedError: Timed out waiting for connection rate 
update to propagate  at 
app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)   at 
app//org.junit.jupiter.api.Assertions.fail(Assertions.java:138)  at 
app//kafka.network.DynamicConnectionQuotaTest.updateIpConnectionRate(DynamicConnectionQuotaTest.scala:279)
   at 
app//kafka.network.DynamicConnectionQuotaTest.testDynamicIpConnectionRateQuota(DynamicConnectionQuotaTest.scala:255)
 at 
java.base@21.0.2/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
 at java.base@21.0.2/java.lang.reflect.Method.invoke(Method.java:580)at 
app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)
  at 
app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
   at 
app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
 at 
app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
at 
app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
  at 
app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:94)
   at 
app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
at 
app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16697) Fix flaky kafka.api.SaslGssapiSslEndToEndAuthorizationTest#testProduceConsumeWithWildcardAcls

2024-05-10 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16697:
--

 Summary: Fix flaky 
kafka.api.SaslGssapiSslEndToEndAuthorizationTest#testProduceConsumeWithWildcardAcls
 Key: KAFKA-16697
 URL: https://issues.apache.org/jira/browse/KAFKA-16697
 Project: Kafka
  Issue Type: Test
Reporter: Chia-Ping Tsai


{code:java}
org.opentest4j.AssertionFailedError: Should have been zero expired connections 
killed: 1(total=0.0) ==> expected: <0> but was: <1>  at 
app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
   at 
app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
   at 
app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)  at 
app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)  at 
app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:664)  at 
app//kafka.api.EndToEndAuthorizationTest.$anonfun$confirmReauthenticationMetrics$1(EndToEndAuthorizationTest.scala:202)
  at 
app//kafka.api.EndToEndAuthorizationTest.$anonfun$confirmReauthenticationMetrics$1$adapted(EndToEndAuthorizationTest.scala:200)
  at app//scala.collection.IterableOnceOps.foreach(IterableOnce.scala:576)  
  at app//scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:574) 
  at app//scala.collection.AbstractIterable.foreach(Iterable.scala:933)   
at 
app//kafka.api.EndToEndAuthorizationTest.confirmReauthenticationMetrics(EndToEndAuthorizationTest.scala:200)
 at 
app//kafka.api.EndToEndAuthorizationTest.testProduceConsumeWithWildcardAcls(EndToEndAuthorizationTest.scala:236)
 at 
java.base@17.0.7/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)at 
java.base@17.0.7/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  at 
java.base@17.0.7/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.base@17.0.7/java.lang.reflect.Method.invoke(Method.java:568)
at 
app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)
  at 
app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
   at 
app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChai
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-10 Thread Greg Harris
Hey Snehashis,

I'm glad to hear you're still interested in this KIP!
I'm happy to let you drive this, and I apologize for increasing the
scope of work so drastically. To make up for that, I'll volunteer to
be the primary PR reviewer to help get this done quickly once the KIP
is approved.
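
As a sketch of what a versioned plugin spec could look like under the updated
KIP (property names and range syntax here are illustrative, not final):

# Illustrative only -- not the final KIP-891 property names or syntax.
name=my-sink
connector.class=org.example.MySinkConnector
connector.version=[2.0.0,3.0.0)        # a version range: any 2.x plugin
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.version=3.7.0          # an exact version pin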

Thanks,
Greg


On Fri, May 10, 2024 at 3:51 AM Snehashis  wrote:
>
> Hi Greg,
>
> Thanks for the follow up to my original KIP; I am in favour of the
> additions made to expand its scope. The addition of range versions
> specifically makes a lot of sense.
>
> Apologies if I have not publicly worked on this KIP for a long time. The
> original work was done when the move to service loading was in discussion
> and I wanted to loop back to this only after that work was completed. Post
> its conclusion, I have not been able to take this up due to other
> priorities. If it's okay with you, I would still like to get this
> implemented myself, including the additional scope.
>
> Thanks and regards
> Snehashis
>
> On Fri, May 10, 2024 at 12:45 AM Greg Harris 
> wrote:
>
> > Hi all,
> >
> > I'd like to reboot the discussion on KIP-891:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins
> >
> > I've made some changes, most notably:
> >
> > 1. Specifying versions for all plugins in Connector configs
> > (converters, header converters, transforms, and predicates) not just
> > connectors & tasks
> > 2. Specifying a range of versions instead of an exact match
> > 3. New metrics to observe what versions are in-use
> >
> > Thanks to Snehashis for the original KIP idea!
> >
> > Thanks,
> > Greg
> >
> > On Tue, Jan 2, 2024 at 11:49 AM Greg Harris  wrote:
> > >
> > > Hi Snehashis,
> > >
> > > Thank you for the KIP! This is something I've wanted for a long time.
> > >
> > > I know the discussion has gone cold, are you still interested in
> > > pursuing this feature? I'll make time to review the KIP if you are
> > > still accepting comments.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Tue, Nov 22, 2022 at 12:29 PM Snehashis 
> > wrote:
> > > >
> > > > Thanks for the points Sagar.
> > > >
> > > > > 1) Should we update the GET /connectors endpoint to include the
> > version of
> > > > > the plugin that is running? It could be useful to figure out the
> > version
> > > > of
> > > > > the plugin or I am assuming it gets returned by the expand=info call?
> > > >
> > > > I think this is good to have and a possible future enhancement. The
> > > > version info will be present in the config of the connector if the user
> > > > has specified the version. Otherwise it is the latest version, which the
> > > > user can find out from the connector-plugin endpoint. The information
> > > > could be introduced to the response of the GET /connectors endpoint
> > > > itself; however, the most ideal way of doing this would be to get the
> > > > currently running instance of the connector and read the version directly
> > > > from there. This is slightly tricky, as the connector could be running on
> > > > a different node. One way to do this would be to persist the version
> > > > information in the status backing store during instantiation of the
> > > > connector. It requires some more thought, and since the version is part
> > > > of the configs if provided (and evident otherwise), I have not included
> > > > it in this KIP.
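
A sketch of how a running version could eventually surface via expand=info
(hypothetical shape for the added "version" field; explicitly out of scope
for this KIP):

{
  "my-sink": {
    "info": {
      "name": "my-sink",
      "config": { "connector.class": "org.example.MySinkConnector" },
      "version": "2.1.0"
    }
  }
}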
> > > >
> > > > > 2) I am not aware of this and hence asking, can 2 connectors with
> > > > different
> > > > > versions have the same name? Does the plugin isolation allow this?
> > This
> > > > > could have a bearing when using the lifecycle endpoints for
> > connectors
> > > > like
> > > > > DELETE etc.
> > > >
> > > > All connectors in a cluster need to have unique connector names,
> > > > regardless of what version of the plugin the connector is running
> > > > underneath. This is something enforced by the Connect runtime itself. All
> > > > Connect CRUD operations are keyed on the connector name, so there will not
> > > > be an issue.
> > > >
> > > > Regards
> > > > Snehashis
> > > >
> > > > On Tue, Nov 22, 2022 at 3:16 PM Sagar 
> > wrote:
> > > >
> > > > > Hey Snehashis,
> > > > >
> > > > > Thanks for the KIP. It looks like a very useful feature. Couple of
> > > > > small-ish points, let me know what you think:
> > > > >
> > > > > 1) Should we update the GET /connectors endpoint to include the
> > version of
> > > > > the plugin that is running? It could be useful to figure out the
> > version of
> > > > > the plugin or I am assuming it gets returned by the expand=info call?
> > > > > 2) I am not aware of this and hence asking, can 2 connectors with
> > different
> > > > > versions have the same name? Does the plugin isolation allow this?
> > This
> > > > > could have a bearing when using the lifecycle endpoints for
> > connectors like
> > > > > DELETE etc.
> > > > >
> > > > > Thanks!
> > > > > Sagar.
> > > > >
> > > > >
> > > > > On Tue, Nov 22, 2022 at 

[jira] [Resolved] (KAFKA-16688) SystemTimer leaks resources on close

2024-05-10 Thread Igor Soarez (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Soarez resolved KAFKA-16688.
-
Resolution: Fixed

> SystemTimer leaks resources on close
> 
>
> Key: KAFKA-16688
> URL: https://issues.apache.org/jira/browse/KAFKA-16688
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.8.0
>Reporter: Gaurav Narula
>Assignee: Gaurav Narula
>Priority: Major
>
> We observe some thread leaks with the thread name {{executor-client-metrics}}.
> This may happen because {{SystemTimer}} doesn't attempt to shut down its
> executor service properly.
> Refer: 
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-15885/1/tests
>  and tests with {{initializationError}} in them for stacktrace
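
A minimal sketch of the missing cleanup (illustrative, not the actual patch),
assuming the timer holds its executor in a taskExecutor field:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

final class TimerShutdownSketch implements AutoCloseable {
    private final ExecutorService taskExecutor;

    TimerShutdownSketch(final ExecutorService taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    @Override
    public void close() {
        taskExecutor.shutdown();                  // stop accepting new tasks
        try {
            if (!taskExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
                taskExecutor.shutdownNow();       // cancel anything still running
            }
        } catch (final InterruptedException e) {
            taskExecutor.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
{code}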



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
 Done

On Fri, May 10, 2024, 10:55 Mario Fiore Vitale  wrote:

> Thanks a lot! I have just a minor comment: should we also update the title
> to be more generic, since it's now also related to other SMTs?
>
> On Fri, May 10, 2024 at 4:44 PM Chris Egerton 
> wrote:
>
> > I've finished updating the KIP; @Mario, please let me know what you
> think.
> >
> > On Fri, May 10, 2024 at 10:26 AM Chris Egerton  wrote:
> >
> > > I can do it :)
> > >
> > > On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale <
> mvit...@redhat.com>
> > > wrote:
> > >
> > >> Yes, I agree. Unfortunately due to the issue of the creation of a new
> > >> account for the WIKI, I asked Mickael Maison to create the KIP for me.
> > >>
> > >> I'll ask him to update the KIP. Do you have other alternatives?
> > >>
> > >> Thanks,
> > >> Mario.
> > >>
> > >> On Fri, May 10, 2024 at 3:40 PM Chris Egerton  >
> > >> wrote:
> > >>
> > >> > Yes, I think we should just do one KIP for all the SMTs. You don't
> > have
> > >> to
> > >> > implement everything all at once or by yourself, but I don't see why
> > we
> > >> > should require one or more follow-up KIPs to apply the exact same
> > >> changes
> > >> > to the SMTs we missed the first time.
> > >> >
> > >> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale <
> > mvit...@redhat.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Chris,
> > >> > >
> > >> > > Thanks for the survey. Do you think I need to update the KIP to
> put
> > >> all
> > >> > of
> > >> > > these?
> > >> > >
> > >> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton
> >  > >> >
> > >> > > wrote:
> > >> > >
> > >> > > > After doing a brief survey of the SMTs that ship with Connect,
> it
> > >> seems
> > >> > > > like these would also benefit:
> > >> > > >
> > >> > > > - HeaderFrom, which populates record headers with subfields of
> > >> > > keys/values
> > >> > > > [1]
> > >> > > > - Cast, which performs type transformation on subfields of
> > >> keys/values
> > >> > > [2]
> > >> > > > - SetSchemaMetadata, which (when the record key/value is a
> struct)
> > >> > copies
> > >> > > > fields from the input struct to the output struct (which uses a
> > >> > different
> > >> > > > schema) [3]
> > >> > > > - TimestampConverter, which does similar input/output field
> > copying
> > >> to
> > >> > > > SetSchemaMetadata [4]
> > >> > > > - ReplaceField, which does similar input/output field copying to
> > >> > > > SetSchemaMetadata and TimestampConverter
> > >> > > >
> > >> > > > [1] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > >> > > > [2] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > >> > > > [3] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > >> > > > [4] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > >> > > > [5] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> > >> > > >
> > >> > > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale <
> > >> mvit...@redhat.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi Chris,
> > >> > > > >
> > >> > > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > >> > > > > Yes, also that one can be affected.
> > >> > > > >
> > >> > > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton
> > >>  > >> > >
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Wait, just one more thing--are there any other SMTs that
> could
> > >> > > benefit
> > >> > > > > from
> > >> > > > > > this? Wouldn't ValueToKey [1] be applicable as well, for
> > >> example?
> > >> > > > > >
> > >> > > > > > [1] -
> > >> > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > >> > > > > >
> > >> > > > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton <
> > chr...@aiven.io>
> > >> > > wrote:
> > >> > > > > >
> > >> > > > > > > Hi Mario,
> > >> > > > > > >
> > >> > > > > > > I think we could have something like `copy` and
> > >> > > `copyWithoutDefaults`
> > >> > > > > to
> > >> > > > > > > get around that, but 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Mario Fiore Vitale
Thanks a lot! I have just a minor comment: should we also update the title
to be more generic, since it's now also related to other SMTs?

On Fri, May 10, 2024 at 4:44 PM Chris Egerton 
wrote:

> I've finished updating the KIP; @Mario, please let me know what you think.
>
> On Fri, May 10, 2024 at 10:26 AM Chris Egerton  wrote:
>
> > I can do it :)
> >
> > On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale 
> > wrote:
> >
> >> Yes, I agree. Unfortunately due to the issue of the creation of a new
> >> account for the WIKI, I asked Mickael Maison to create the KIP for me.
> >>
> >> I'll ask him to update the KIP. Do you have other alternatives?
> >>
> >> Thanks,
> >> Mario.
> >>
> >> On Fri, May 10, 2024 at 3:40 PM Chris Egerton 
> >> wrote:
> >>
> >> > Yes, I think we should just do one KIP for all the SMTs. You don't
> have
> >> to
> >> > implement everything all at once or by yourself, but I don't see why
> we
> >> > should require one or more follow-up KIPs to apply the exact same
> >> changes
> >> > to the SMTs we missed the first time.
> >> >
> >> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale <
> mvit...@redhat.com>
> >> > wrote:
> >> >
> >> > > Hi Chris,
> >> > >
> >> > > Thanks for the survey. Do you think I need to update the KIP to put
> >> all
> >> > of
> >> > > these?
> >> > >
> >> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton
>  >> >
> >> > > wrote:
> >> > >
> >> > > > After doing a brief survey of the SMTs that ship with Connect, it
> >> seems
> >> > > > like these would also benefit:
> >> > > >
> >> > > > - HeaderFrom, which populates record headers with subfields of
> >> > > keys/values
> >> > > > [1]
> >> > > > - Cast, which performs type transformation on subfields of
> >> keys/values
> >> > > [2]
> >> > > > - SetSchemaMetadata, which (when the record key/value is a struct)
> >> > copies
> >> > > > fields from the input struct to the output struct (which uses a
> >> > different
> >> > > > schema) [3]
> >> > > > - TimestampConverter, which does similar input/output field
> copying
> >> to
> >> > > > SetSchemaMetadata [4]
> >> > > > - ReplaceField, which does similar input/output field copying to
> >> > > > SetSchemaMetadata and TimestampConverter
> >> > > >
> >> > > > [1] -
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> >> > > > [2] -
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> >> > > > [3] -
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> >> > > > [4] -
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> >> > > > [5] -
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> >> > > >
> >> > > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale <
> >> mvit...@redhat.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Hi Chris,
> >> > > > >
> >> > > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> >> > > > > Yes, also that one can be affected.
> >> > > > >
> >> > > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton
> >>  >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Wait, just one more thing--are there any other SMTs that could
> >> > > benefit
> >> > > > > from
> >> > > > > > this? Wouldn't ValueToKey [1] be applicable as well, for
> >> example?
> >> > > > > >
> >> > > > > > [1] -
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> >> > > > > >
> >> > > > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton <
> chr...@aiven.io>
> >> > > wrote:
> >> > > > > >
> >> > > > > > > Hi Mario,
> >> > > > > > >
> >> > > > > > > I think we could have something like `copy` and
> >> > > `copyWithoutDefaults`
> >> > > > > to
> >> > > > > > > get around that, but now that you bring up compatibility, I
> >> think
> >> > > > it's
> >> > > > > > best
> >> > > > > > > to hold off on this. I'm forced to recall that anything we
> >> add to
> >> > > the
> >> > > > > > > Connect API may be used by plugin developers who write for
> the
> >> > > > bleeding
> >> > > > > > > edge of the Connect runtime, but deployed by users who are
> >> > running
> >> > > on
> >> > > > > > > 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
I've finished updating the KIP; @Mario, please let me know what you think.

On Fri, May 10, 2024 at 10:26 AM Chris Egerton  wrote:

> I can do it :)
>
> On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale 
> wrote:
>
>> Yes, I agree. Unfortunately due to the issue of the creation of a new
>> account for the WIKI, I asked Mickael Maison to create the KIP for me.
>>
>> I'll ask him to update the KIP. Do you have other alternatives?
>>
>> Thanks,
>> Mario.
>>
>> On Fri, May 10, 2024 at 3:40 PM Chris Egerton 
>> wrote:
>>
>> > Yes, I think we should just do one KIP for all the SMTs. You don't have
>> to
>> > implement everything all at once or by yourself, but I don't see why we
>> > should require one or more follow-up KIPs to apply the exact same
>> changes
>> > to the SMTs we missed the first time.
>> >
>> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
>> > wrote:
>> >
>> > > Hi Chris,
>> > >
>> > > Thanks for the survey. Do you think I need to update the KIP to put
>> all
>> > of
>> > > these?
>> > >
>> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton > >
>> > > wrote:
>> > >
>> > > > After doing a brief survey of the SMTs that ship with Connect, it
>> seems
>> > > > like these would also benefit:
>> > > >
>> > > > - HeaderFrom, which populates record headers with subfields of
>> > > keys/values
>> > > > [1]
>> > > > - Cast, which performs type transformation on subfields of
>> keys/values
>> > > [2]
>> > > > - SetSchemaMetadata, which (when the record key/value is a struct)
>> > copies
>> > > > fields from the input struct to the output struct (which uses a
>> > different
>> > > > schema) [3]
>> > > > - TimestampConverter, which does similar input/output field copying
>> to
>> > > > SetSchemaMetadata [4]
>> > > > - ReplaceField, which does similar input/output field copying to
>> > > > SetSchemaMetadata and TimestampConverter
>> > > >
>> > > > [1] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
>> > > > [2] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
>> > > > [3] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
>> > > > [4] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
>> > > > [5] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
>> > > >
>> > > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale <
>> mvit...@redhat.com>
>> > > > wrote:
>> > > >
>> > > > > Hi Chris,
>> > > > >
>> > > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
>> > > > > Yes, also that one can be affected.
>> > > > >
>> > > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Wait, just one more thing--are there any other SMTs that could
>> > > benefit
>> > > > > from
>> > > > > > this? Wouldn't ValueToKey [1] be applicable as well, for
>> example?
>> > > > > >
>> > > > > > [1] -
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
>> > > > > >
>> > > > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton 
>> > > wrote:
>> > > > > >
>> > > > > > > Hi Mario,
>> > > > > > >
>> > > > > > > I think we could have something like `copy` and
>> > > `copyWithoutDefaults`
>> > > > > to
>> > > > > > > get around that, but now that you bring up compatibility, I
>> think
>> > > > it's
>> > > > > > best
>> > > > > > > to hold off on this. I'm forced to recall that anything we
>> add to
>> > > the
>> > > > > > > Connect API may be used by plugin developers who write for the
>> > > > bleeding
>> > > > > > > edge of the Connect runtime, but deployed by users who are
>> > running
>> > > on
>> > > > > > > (possibly much) older versions. In that scenario, any use of
>> new
>> > > > Struct
>> > > > > > > methods would cause issues at runtime caused by compatibility
>> > > clashes
>> > > > > > > between the newer API that the plugin was written for, and the
>> > > older
>> > > > > API
>> > > > > > > that's provided by the runtime it's running on.
>> > > > > > >
>> > > > > > > Anyway, thanks for humoring me. The KIP looks good to me 
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > >
>> > > > > > > Chris
>> > > > > > >
>> > > > > > > On 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
I can do it :)

On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale 
wrote:

> Yes, I agree. Unfortunately due to the issue of the creation of a new
> account for the WIKI, I asked Mickael Maison to create the KIP for me.
>
> I'll ask him to update the KIP. Do you have other alternatives?
>
> Thanks,
> Mario.
>
> On Fri, May 10, 2024 at 3:40 PM Chris Egerton 
> wrote:
>
> > Yes, I think we should just do one KIP for all the SMTs. You don't have
> to
> > implement everything all at once or by yourself, but I don't see why we
> > should require one or more follow-up KIPs to apply the exact same changes
> > to the SMTs we missed the first time.
> >
> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
> > wrote:
> >
> > > Hi Chris,
> > >
> > > Thanks for the survey. Do you think I need to update the KIP to put all
> > of
> > > these?
> > >
> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton 
> > > wrote:
> > >
> > > > After doing a brief survey of the SMTs that ship with Connect, it
> seems
> > > > like these would also benefit:
> > > >
> > > > - HeaderFrom, which populates record headers with subfields of
> > > keys/values
> > > > [1]
> > > > - Cast, which performs type transformation on subfields of
> keys/values
> > > [2]
> > > > - SetSchemaMetadata, which (when the record key/value is a struct)
> > copies
> > > > fields from the input struct to the output struct (which uses a
> > different
> > > > schema) [3]
> > > > - TimestampConverter, which does similar input/output field copying
> to
> > > > SetSchemaMetadata [4]
> > > > - ReplaceField, which does similar input/output field copying to
> > > > SetSchemaMetadata and TimestampConverter
> > > >
> > > > [1] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > > > [2] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > > > [3] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > > > [4] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > > > [5] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> > > >
> > > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale <
> mvit...@redhat.com>
> > > > wrote:
> > > >
> > > > > Hi Chris,
> > > > >
> > > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > > > > Yes, also that one can be affected.
> > > > >
> > > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton
>  > >
> > > > > wrote:
> > > > >
> > > > > > Wait, just one more thing--are there any other SMTs that could
> > > > > > benefit from this? Wouldn't ValueToKey [1] be applicable as well,
> > > > > > for example?
> > > > > >
> > > > > > [1] -
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > > > > >
> > > > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton 
> > > wrote:
> > > > > >
> > > > > > > Hi Mario,
> > > > > > >
> > > > > > > I think we could have something like `copy` and
> > > > > > > `copyWithoutDefaults` to get around that, but now that you bring
> > > > > > > up compatibility, I think it's best to hold off on this. I'm
> > > > > > > forced to recall that anything we add to the Connect API may be
> > > > > > > used by plugin developers who write for the bleeding edge of the
> > > > > > > Connect runtime, but deployed by users who are running on
> > > > > > > (possibly much) older versions. In that scenario, any use of new
> > > > > > > Struct methods would cause issues at runtime caused by
> > > > > > > compatibility clashes between the newer API that the plugin was
> > > > > > > written for, and the older API that's provided by the runtime
> > > > > > > it's running on.
> > > > > > >
> > > > > > > Anyway, thanks for humoring me. The KIP looks good to me 
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Chris
> > > > > > >
> > > > > > > On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale <
> > > > mvit...@redhat.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi Chris,
> > > > > > >>
> > > > > > >> Thanks for reviewing this.
> > > > > > >>
> > > > > > >> > It seems like the pattern of "copy the contents of this
> 

Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-05-10 Thread Chris Egerton
Hi Krish and Prabha,

Thanks for your replies. I still have some follow-up questions:

1) I read https://docs.docker.com/trusted-content/official-images/ and
didn't find anything on that page or immediately around it that explains
what compliance requirements might be satisfied by a DOI that couldn't be
satisfied by the existing apache/kafka image. Can you elaborate? Feel free
to provide another link, but please also quote the relevant sections from
it (as StackOverflow likes to say, links can grow stale over time).

2) It would be great to see a brief summary of the differences in these
images included in the KIP, in order to try to gauge how this would look to
our users.

3) What I suggested last time was not a separate apache/apache-docker
repository, but a repository controlled entirely by Docker. The DOI docs
[1] state that "While it's preferable to have upstream software authors
maintaining their Docker Official Images, this isn't a strict requirement",
which I take to mean that it's not required for an Apache Kafka DOI to live
under the apache organization on GitHub. It also seems like there's
precedent for this: images for MySQL [2] and PHP [3] already exist under
the control of Docker. The reason I think this is worth considering is that
Docker can arbitrarily change the eligibility requirements for their
official images at any time, and it doesn't seem like there's a clear
process in the KIP for judging how we should respond to these changes (in
fact, it seems like the idea in the KIP is that we should make any change
required with no further vetting beyond possibly a pull request on
apache/kafka, which would require approval by a committer). By hosting the
DOI definitions ourselves (either in apache/kafka, or in a theoretical
apache/docker-kafka repository), we take responsibility for the image, even
if the owner on Docker Hub is Docker, not Apache. If the code lives
elsewhere, then (as long as basic trademark and possibly security
guidelines are respected) Apache doesn't have to concern itself at all with
the image and the maintainers are free to make whatever changes they want
to it in order to meet Docker's requirements.

[1] - https://docs.docker.com/trusted-content/official-images/contributing/,
second paragraph under "Contributing to Docker Official Images"
[2] - https://github.com/docker-library/mysql
[3] - https://github.com/docker-library/php

Cheers,

Chris

On Fri, May 10, 2024 at 7:46 AM Krish Vora  wrote:

> Hey Chris,
>
> We have responded to the initial round of queries. And as you mentioned in
> your comment, you may have more questions related to this KIP. Please let
> us know if you have any.
> We want to start the voting for this KIP soon, as we intend to include it
> in the 3.8.0 release.
>
> Thanks.
>
> Regards,
>
> Krish.
>
> On Wed, May 8, 2024 at 6:19 PM Krish Vora  wrote:
>
> > Hi Chris. Thanks for the questions.
> >
> > 3. Would a separate Docker-owned repository be out of the question? I'm
> >> guessing there are some trademark issues that might get in the way, but
> >> it's worth exploring since the entire purpose of this KIP seems to be to
> >> provide images that are vetted and designed by Docker more than by the
> >> Apache Kafka contributors/committers/PMC.
> >
> >
> >
> >- The process for introducing a Docker Official Image involves
> >   - Hosting the Dockerfile in the Apache Kafka repository and
> >   - Providing the path to this Dockerfile to Docker Hub in Docker
> >   Hub’s own repo
> >   <
> https://github.com/docker-library/official-images/tree/master/library>
> >   .
> >- This ensures that any updates to the Dockerfile in the AK repository
> >are directly applicable to the docker official images available on
> Docker
> >Hub.
> >
> >
> >- We also did not find any added advantage to create a separate
> >repository named apache-docker within the Apache GitHub organization.
> >
> > Thanks,
> > Krish.
> >
> > On Wed, May 8, 2024 at 6:05 PM Prabha Manepalli
> >  wrote:
> >
> >> Hi Chris,  I would like to add more context to this KIP's motivation.
> >> Vedarth and Krish, please weigh in with your inputs.
> >>
> >> In the motivation section it's stated that "Several other Apache
> projects,
> >> > like Flink, Spark, Solr, have already released Docker Official Images,
> >> with
> >> > download figures ranging from 50 million to over 1 billion. These
> >> numbers
> >> > highlight the significant demand among users." But then immediately
> >> > afterwards, we learn that "Also the Docker Official Images are always
> >> the
> >> > top 1 search result, irrespective of the number of downloads."
> Wouldn't
> >> a
> >> > high number of downloads for an image naturally follow from being the
> >> top
> >> > search result? It seems like we can't necessarily assume that Docker
> >> > Official Images are inherently more desirable for users based solely
> on
> >> > download statistics.
> >> >
> >>
> >> *My thoughts: *Unlike the 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Mario Fiore Vitale
Yes, I agree. Unfortunately, due to an issue with creating a new wiki
account, I asked Mickael Maison to create the KIP for me.

I'll ask him to update the KIP. Do you have other alternatives?

Thanks,
Mario.

On Fri, May 10, 2024 at 3:40 PM Chris Egerton 
wrote:

> Yes, I think we should just do one KIP for all the SMTs. You don't have to
> implement everything all at once or by yourself, but I don't see why we
> should require one or more follow-up KIPs to apply the exact same changes
> to the SMTs we missed the first time.
>
> On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
> wrote:
>
> > Hi Chris,
> >
> > Thanks for the survey. Do you think I need to update the KIP to put all
> of
> > these?
> >
> > On Thu, May 9, 2024 at 4:14 PM Chris Egerton 
> > wrote:
> >
> > > After doing a brief survey of the SMTs that ship with Connect, it seems
> > > like these would also benefit:
> > >
> > > - HeaderFrom, which populates record headers with subfields of
> > keys/values
> > > [1]
> > > - Cast, which performs type transformation on subfields of keys/values
> > [2]
> > > - SetSchemaMetadata, which (when the record key/value is a struct)
> copies
> > > fields from the input struct to the output struct (which uses a
> different
> > > schema) [3]
> > > - TimestampConverter, which does similar input/output field copying to
> > > SetSchemaMetadata [4]
> > > - ReplaceField, which does similar input/output field copying to
> > > SetSchemaMetadata and TimestampConverter
> > >
> > > [1] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > > [2] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > > [3] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > > [4] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > > [5] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> > >
> > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale 
> > > wrote:
> > >
> > > > Hi Chris,
> > > >
> > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > > > Yes, also that one can be affected.
> > > >
> > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton  >
> > > > wrote:
> > > >
> > > > > Wait, just one more thing--are there any other SMTs that could
> > > > > benefit from this? Wouldn't ValueToKey [1] be applicable as well,
> > > > > for example?
> > > > >
> > > > > [1] -
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > > > >
> > > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton 
> > wrote:
> > > > >
> > > > > > Hi Mario,
> > > > > >
> > > > > > I think we could have something like `copy` and
> > > > > > `copyWithoutDefaults` to get around that, but now that you bring
> > > > > > up compatibility, I think it's best to hold off on this. I'm
> > > > > > forced to recall that anything we add to the Connect API may be
> > > > > > used by plugin developers who write for the bleeding edge of the
> > > > > > Connect runtime, but deployed by users who are running on
> > > > > > (possibly much) older versions. In that scenario, any use of new
> > > > > > Struct methods would cause issues at runtime caused by
> > > > > > compatibility clashes between the newer API that the plugin was
> > > > > > written for, and the older API that's provided by the runtime
> > > > > > it's running on.
> > > > > >
> > > > > > Anyway, thanks for humoring me. The KIP looks good to me 
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > > On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale <
> > > mvit...@redhat.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > >> Hi Chris,
> > > > > >>
> > > > > >> Thanks for reviewing this.
> > > > > >>
> > > > > >> > It seems like the pattern of "copy the contents of this Struct
> > > > > >> > into another one for the purpose of mutation" could be fairly
> > > > > >> > common in user code bases in addition to the core Connect SMTs.
> > > > > >> > Do you think there's a way to simplify this with, e.g., a
> > > > > >> > Struct.putAll(Struct destination) or Struct.copy(Schema
> > > > > >> > destinationSchema) method?
> > > > > >>
> > > > > >> The only concern 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
Yes, I think we should just do one KIP for all the SMTs. You don't have to
implement everything all at once or by yourself, but I don't see why we
should require one or more follow-up KIPs to apply the exact same changes
to the SMTs we missed the first time.

On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
wrote:

> Hi Chris,
>
> Thanks for the survey. Do you think I need to update the KIP to put all of
> these?
>
> On Thu, May 9, 2024 at 4:14 PM Chris Egerton 
> wrote:
>
> > After doing a brief survey of the SMTs that ship with Connect, it seems
> > like these would also benefit:
> >
> > - HeaderFrom, which populates record headers with subfields of
> keys/values
> > [1]
> > - Cast, which performs type transformation on subfields of keys/values
> [2]
> > - SetSchemaMetadata, which (when the record key/value is a struct) copies
> > fields from the input struct to the output struct (which uses a different
> > schema) [3]
> > - TimestampConverter, which does similar input/output field copying to
> > SetSchemaMetadata [4]
> > - ReplaceField, which does similar input/output field copying to
> > SetSchemaMetadata and TimestampConverter
> >
> > [1] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > [2] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > [3] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > [4] -
> >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > [5] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> >
> > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale 
> > wrote:
> >
> > > Hi Chris,
> > >
> > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > > Yes, also that one can be affected.
> > >
> > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton 
> > > wrote:
> > >
> > > > Wait, just one more thing--are there any other SMTs that could
> > > > benefit from this? Wouldn't ValueToKey [1] be applicable as well,
> > > > for example?
> > > >
> > > > [1] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > > >
> > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton 
> wrote:
> > > >
> > > > > Hi Mario,
> > > > >
> > > > > I think we could have something like `copy` and
> > > > > `copyWithoutDefaults` to get around that, but now that you bring
> > > > > up compatibility, I think it's best to hold off on this. I'm
> > > > > forced to recall that anything we add to the Connect API may be
> > > > > used by plugin developers who write for the bleeding edge of the
> > > > > Connect runtime, but deployed by users who are running on
> > > > > (possibly much) older versions. In that scenario, any use of new
> > > > > Struct methods would cause issues at runtime caused by
> > > > > compatibility clashes between the newer API that the plugin was
> > > > > written for, and the older API that's provided by the runtime
> > > > > it's running on.
> > > > >
> > > > > Anyway, thanks for humoring me. The KIP looks good to me 
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale <
> > mvit...@redhat.com
> > > >
> > > > > wrote:
> > > > >
> > > > >> Hi Chris,
> > > > >>
> > > > >> Thanks for reviewing this.
> > > > >>
> > > > >> > It seems like the pattern of "copy the contents of this Struct
> > > > >> > into another one for the purpose of mutation" could be fairly
> > > > >> > common in user code bases in addition to the core Connect SMTs.
> > > > >> > Do you think there's a way to simplify this with, e.g., a
> > > > >> > Struct.putAll(Struct destination) or Struct.copy(Schema
> > > > >> > destinationSchema) method?
> > > > >>
> > > > >> The only concern that I see is backward compatibility. Suppose
> > > > >> that you are not using the JsonConverter but another converter
> > > > >> that doesn't support 'replace.null.with.default': when you use the
> > > > >> current 'InsertField' SMT, the null values will be replaced by
> > > > >> default values. If we replace the "copy" logic with a method in
> > > > >> the Struct, we remove this behavior.
> > > > >>
> > > > >> Isn't it?
> > > > >>
> > > > >> Mario.
> > > > >>
> > > > >> On Wed, May 8, 2024 at 2:14 PM Chris Egerton
> >  > 
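
To make the Struct-copying discussion in this thread concrete, here is a
hedged Java sketch. Struct::get and Struct::getWithoutDefault are the
existing Connect API methods referenced above; the two copy helpers below
only illustrate the proposed (and ultimately shelved) convenience methods,
and are not real API.

```java
import org.apache.kafka.connect.data.Field;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;

public class StructCopySketch {

    // The copy pattern the SMTs use today: Struct::get substitutes the
    // field's default value whenever the stored value is null.
    static Struct copyWithDefaults(Struct source, Schema targetSchema) {
        Struct target = new Struct(targetSchema);
        for (Field field : source.schema().fields()) {
            target.put(field.name(), source.get(field)); // null becomes the default
        }
        return target;
    }

    // The null-preserving variant discussed in the thread, reading with
    // Struct::getWithoutDefault instead.
    static Struct copyWithoutDefaults(Struct source, Schema targetSchema) {
        Struct target = new Struct(targetSchema);
        for (Field field : source.schema().fields()) {
            target.put(field.name(), source.getWithoutDefault(field.name()));
        }
        return target;
    }

    public static void main(String[] args) {
        Schema schema = SchemaBuilder.struct()
                .field("name", SchemaBuilder.string().optional().defaultValue("unknown").build())
                .build();
        Struct source = new Struct(schema); // "name" deliberately left null

        System.out.println(copyWithDefaults(source, schema).getWithoutDefault("name"));    // unknown
        System.out.println(copyWithoutDefaults(source, schema).getWithoutDefault("name")); // null
    }
}
```

This is exactly the backward-compatibility trap described above: swapping one
copy strategy for the other silently changes what downstream converters see
for null fields.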

Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-05-10 Thread Krish Vora
Hey Chris,

We have responded to the initial round of queries. And as you mentioned in
your comment, you may have more questions related to this KIP. Please let
us know if you have any.
We want to start the voting for this KIP soon, as we intend to include it
in the 3.8.0 release.

Thanks.

Regards,

Krish.

On Wed, May 8, 2024 at 6:19 PM Krish Vora  wrote:

> Hi Chris. Thanks for the questions.
>
> 3. Would a separate Docker-owned repository be out of the question? I'm
>> guessing there are some trademark issues that might get in the way, but
>> it's worth exploring since the entire purpose of this KIP seems to be to
>> provide images that are vetted and designed by Docker more than by the
>> Apache Kafka contributors/committers/PMC.
>
>
>
>- The process for introducing a Docker Official Image involves
>   - Hosting the Dockerfile in the Apache Kafka repository and
>   - Providing the path to this Dockerfile to Docker Hub in Docker
>   Hub’s own repo
>   <https://github.com/docker-library/official-images/tree/master/library>
>   .
>- This ensures that any updates to the Dockerfile in the AK repository
>are directly applicable to the docker official images available on Docker
>Hub.
>
>
>- We also did not find any added advantage to create a separate
>repository named apache-docker within the Apache GitHub organization.
>
> Thanks,
> Krish.
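
For reference, the "Docker Hub's own repo" step above refers to the library
manifest format used by docker-library/official-images. A hypothetical
library/kafka entry might look like the following; every value below is
illustrative, not taken from the KIP:

```
Maintainers: Apache Kafka Community (@apache)
GitRepo: https://github.com/apache/kafka.git
GitCommit: 2a5efe4a334611fc7c15f76b322482bad0942db0

Tags: 3.7.0, latest
Architectures: amd64, arm64v8
Directory: docker/jvm
```

Docker's official-images tooling rebuilds and republishes the image whenever
this entry is updated to point at a new commit and directory in apache/kafka.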
>
> On Wed, May 8, 2024 at 6:05 PM Prabha Manepalli
>  wrote:
>
>> Hi Chris,  I would like to add more context to this KIP's motivation.
>> Vedarth and Krish, please weigh in with your inputs.
>>
>> In the motivation section it's stated that "Several other Apache projects,
>> > like Flink, Spark, Solr, have already released Docker Official Images,
>> with
>> > download figures ranging from 50 million to over 1 billion. These
>> numbers
>> > highlight the significant demand among users." But then immediately
>> > afterwards, we learn that "Also the Docker Official Images are always
>> the
>> > top 1 search result, irrespective of the number of downloads." Wouldn't
>> a
>> > high number of downloads for an image naturally follow from being the
>> top
>> > search result? It seems like we can't necessarily assume that Docker
>> > Official Images are inherently more desirable for users based solely on
>> > download statistics.
>> >
>>
>> *My thoughts: *Unlike the Sponsored OSS image, the Docker Official image
>> is
>> more desirable for workloads that have stringent compliance requirements.
>> More details on why official images are more trusted are documented here:
>> https://docs.docker.com/trusted-content/official-images/. The Docker
>> Official image would also help an absolutely new Kafka beginner who might
>> not know about Apache or the concept of Sponsored images. We want to make
>> it easier for Kafka beginners to discover the Kafka image through
>> DockerHub.
>>
>>
>> Can you elaborate on the value that these new images would add from a
>> > user's perspective? I'm hesitant to introduce another image, since it
>> adds
>> > to the cognitive burden of people who will inevitably have to answer the
>> > question of "What are the differences between all of the available
>> images
>> > and which one is best for my use case?"
>> >
>>
>>
>> *My thoughts: *This is a valid concern to address. The response to the
>> above question addresses the value-add this new Docker Official image
>> would
>> provide. I also agree we need a clear distinction between each of these
>> images to be well documented. We plan to update the AK website with
>> details
>> on how, why, and when a developer would want to use each of these
>> particular images(KIP-974,975,1028).
>>
>> Thanks,
>> Prabha.
>>
>>
>>
>>
>>
>> On Tue, Apr 30, 2024 at 9:41 PM Chris Egerton 
>> wrote:
>>
>> > Hi Vedarth and Krish,
>> >
>> > Thanks for the KIP! I have to admit I'm a little skeptical; hopefully
>> you
>> > can help me understand the need for these additional images.
>> >
>> > 1) In the motivation section it's stated that "Several other Apache
>> > projects, like Flink, Spark, Solr, have already released Docker Official
>> > Images, with download figures ranging from 50 million to over 1 billion.
>> > These numbers highlight the significant demand among users." But then
>> > immediately afterwards, we learn that "Also the Docker Official Images
>> are
>> > always the top 1 search result, irrespective of the number of
>> downloads."
>> > Wouldn't a high number of downloads for an image naturally follow from
>> > being the top search result? It seems like we can't necessarily assume
>> that
>> > Docker Official Images are inherently more desirable for users based
>> solely
>> > on download statistics.
>> >
>> > 2) Can you elaborate on the value that these new images would add from a
>> > user's perspective? I'm hesitant to introduce another image, since it
>> adds
>> > to the cognitive burden of people who will inevitably have to answer the
>> > question of "What are 

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-10 Thread Snehashis
Hi Greg,

Thanks for the follow-up to my original KIP. I am in favour of the
additions made to expand its scope; the addition of range versions in
particular makes a lot of sense.

Apologies for not having publicly worked on this KIP for a long time. The
original work was done when the move to service loading was under
discussion, and I wanted to loop back to this only after that work was
completed. Since its conclusion, I have not been able to take this up due to
other priorities. If it's okay with you, I would still like to get this
implemented myself, including the additional scope.

Thanks and regards
Snehashis
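
As a concrete illustration of the range-version idea above, here is a hedged
sketch of what a versioned connector config might look like. The ".version"
property names and the Maven-style range syntax are assumptions for
illustration; the KIP defines the actual contract.

```java
import java.util.Map;

public class VersionedConnectorConfigSketch {
    public static void main(String[] args) {
        Map<String, String> config = Map.of(
                "name", "my-sink",
                "connector.class", "com.example.MySinkConnector",  // hypothetical plugin
                "connector.version", "[2.0.0,3.0.0)",              // range: newest 2.x on the worker
                "value.converter", "org.apache.kafka.connect.json.JsonConverter",
                "value.converter.version", "3.8.0",                // exact version
                "transforms", "flatten",
                "transforms.flatten.type", "org.apache.kafka.connect.transforms.Flatten$Value",
                "transforms.flatten.version", "[3.6.0,)"           // open-ended lower bound
        );
        config.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```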

On Fri, May 10, 2024 at 12:45 AM Greg Harris 
wrote:

> Hi all,
>
> I'd like to reboot the discussion on KIP-891:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-891%3A+Running+multiple+versions+of+Connector+plugins
>
> I've made some changes, most notably:
>
> 1. Specifying versions for all plugins in Connector configs
> (converters, header converters, transforms, and predicates) not just
> connectors & tasks
> 2. Specifying a range of versions instead of an exact match
> 3. New metrics to observe what versions are in-use
>
> Thanks to Snehashis for the original KIP idea!
>
> Thanks,
> Greg
>
> On Tue, Jan 2, 2024 at 11:49 AM Greg Harris  wrote:
> >
> > Hi Snehashis,
> >
> > Thank you for the KIP! This is something I've wanted for a long time.
> >
> > I know the discussion has gone cold, are you still interested in
> > pursuing this feature? I'll make time to review the KIP if you are
> > still accepting comments.
> >
> > Thanks,
> > Greg
> >
> > On Tue, Nov 22, 2022 at 12:29 PM Snehashis 
> wrote:
> > >
> > > Thanks for the points Sagar.
> > >
> > > > 1) Should we update the GET /connectors endpoint to include the
> version of
> > > > the plugin that is running? It could be useful to figure out the
> version
> > > of
> > > > the plugin or I am assuming it gets returned by the expand=info call?
> > >
> > > I think this is good to have and a possible future enhancement. The
> version
> > > info will be present in the config of the connector if the user has
> > > specified the version. Otherwise it is the latest version which the
> user
> > > can find out from the connector-plugin endpoint. The information can be
> > > introduced to the response of the GET /connectors endpoint itself,
> however
> > > the most ideal way of doing this would be to get the currently running
> > > instance of the connector and get the version directly from there.
> This is
> > > slightly tricky as the connector could be running in a different node.
> > > One way to do this would be to persist the version information in the
> > > status backing store during instantiation of the connector. It requires
> > > some more thought and since the version is part of the configs if
> provided
> > > and evident otherwise, I have not included it in this KIP.
> > >
> > > > 2) I am not aware of this and hence asking, can 2 connectors with
> > > different
> > > > versions have the same name? Does the plugin isolation allow this?
> This
> > > > could have a bearing when using the lifecycle endpoints for
> connectors
> > > like
> > > > DELETE etc.
> > >
> > > All connectors in a cluster need to have unique connector names
> > > regardless of what version of the plugin the connector is running
> > > underneath. This is something enforced by the connect runtime itself.
> All
> > > connect CRUD operations are keyed on the connector name so there will
> not
> > > be an issue.
> > >
> > > Regards
> > > Snehashis
> > >
> > > On Tue, Nov 22, 2022 at 3:16 PM Sagar 
> wrote:
> > >
> > > > Hey Snehashsih,
> > > >
> > > > Thanks for the KIP. It looks like a very useful feature. Couple of
> > > > small-ish points, let me know what you think:
> > > >
> > > > 1) Should we update the GET /connectors endpoint to include the
> version of
> > > > the plugin that is running? It could be useful to figure out the
> version of
> > > > the plugin or I am assuming it gets returned by the expand=info call?
> > > > 2) I am not aware of this and hence asking, can 2 connectors with
> different
> > > > versions have the same name? Does the plugin isolation allow this?
> This
> > > > could have a bearing when using the lifecycle endpoints for
> connectors like
> > > > DELETE etc.
> > > >
> > > > Thanks!
> > > > Sagar.
> > > >
> > > >
> > > > On Tue, Nov 22, 2022 at 2:10 PM Ashwin  >
> > > > wrote:
> > > >
> > > > > Hi Snehasis,
> > > > >
> > > > > > IIUC (please correct me if I am wrong here), what you highlighted
> > > > above,
> > > > > is
> > > > > a versioning scheme for a connector config for the same connector
> (and
> > > > not
> > > > > different versions of a connector plugin).
> > > > >
> > > > > Sorry for not being more precise in my wording -  I meant
> registering
> > > > > versions of schema for connector config.
> > > > >
> > > > > Let's take the example of a fictional connector which uses a
> fictional
> > > > AWS
> > > > > service.
> > > > >
> > 

Re: [DISCUSS] KIP-950: Tiered Storage Disablement

2024-05-10 Thread Luke Chen
Hi Christo,

> 1. I am not certain I follow the question. From DISABLED you can only go to
> ENABLED regardless of whether your cluster is backed by Zookeeper or KRaft.
> Am I misunderstanding your point?

Yes, you're right.

> 4. I was thinking that if there is a mismatch we will just fail accepting
> the request for disablement. This should be the same in both Zookeeper and
> KRaft. Or am I misunderstanding your question?

OK, sounds good.

> 6. I think my current train of thought is that there will be unlimited
> retries until all brokers respond in a similar way to how deletion of a
> topic works today in ZK. In the meantime the state will continue to be
> DISABLING. Do you have a better suggestion?

I don't think infinite retries are a good idea, since if a broker is down
forever, this request will never complete.
You mentioned that the existing topic deletion uses a similar pattern; how
does it handle this issue?

Thanks.
Luke
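
For illustration, the bounded-retry alternative could look roughly like the
sketch below. All names (stopReplicasAcked, the state constants) and the
backoff policy are hypothetical; the point is only the control flow: retry a
few times, then park the topic in a terminal "failed" state for the operator
instead of spinning forever.

```java
public class BoundedDisablementRetry {
    enum State { DISABLING, DISABLED, DISABLE_FAILED }

    public static void main(String[] args) throws InterruptedException {
        State state = State.DISABLING;
        final int maxAttempts = 5;
        for (int attempt = 1; attempt <= maxAttempts && state == State.DISABLING; attempt++) {
            if (stopReplicasAcked("my-topic")) { // stub for the StopReplicas round-trip
                state = State.DISABLED;
            } else {
                Thread.sleep(1_000L * attempt);  // linear backoff between rounds
            }
        }
        if (state == State.DISABLING) {
            state = State.DISABLE_FAILED;        // surface the failure instead of retrying forever
        }
        System.out.println("Final state: " + state);
    }

    // Stub: the real controller would track per-broker StopReplicas responses.
    private static boolean stopReplicasAcked(String topic) {
        return true;
    }
}
```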

On Thu, May 9, 2024 at 9:21 PM Christo Lolov  wrote:

> Heya!
>
> re: Luke
>
> 1. I am not certain I follow the question. From DISABLED you can only go to
> ENABLED regardless of whether your cluster is backed by Zookeeper or KRaft.
> Am I misunderstanding your point?
>
> 2. Apologies, this was a leftover from previous versions. I have updated
> the Zookeeper section. The steps ought to be: controller receives change,
> commits necessary data to Zookeeper, enqueues disablement and starts
> sending StopReplicas request to brokers; brokers receive StopReplicas and
> propagate them all the way to RemoteLogManager#stopPartitions which takes
> care of the rest.
>
> 3. Correct, it should say DISABLED - this should now be corrected.
>
> 4. I was thinking that if there is a mismatch we will just fail accepting
> the request for disablement. This should be the same in both Zookeeper and
> KRaft. Or am I misunderstanding your question?
>
> 5. Yeah. I am now doing a second pass on all diagrams and will update them
> by the end of the day!
>
> 6. I think my current train of thought is that there will be unlimited
> retries until all brokers respond in a similar way to how deletion of a
> topic works today in ZK. In the meantime the state will continue to be
> DISABLING. Do you have a better suggestion?
>
> re: Kamal
>
> Yep, I will update all diagrams
>
> I am not certain I follow the reasoning for making retain and delete the
> same. Deletion when the policy is retain happens asynchronously due to
> expiration. I think that deletion when the policy is delete ought to (at
> least for the initial implementation) happen synchronously. Should people
> run into timeout problems we can always then have a follow-up KIP where we
> make it asynchronous.
>
> Best,
> Christo
>
> On Tue, 7 May 2024 at 10:04, Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Hi Christo,
> >
> > Thanks for the update!
> >
> > For both the policies "retain" and "delete", can we maintain the same
> > approach to delete the segments async?
> >
> > > If the disablement policy is set to delete, the Log start offset (LSO)
> > > is updated to match the Local Log Start Offset and the remote log is
> > > deleted by calling the RemoteStorageManager#deleteLogSegmentData().
> >
> > In the KIP, it's mentioned that when the disable policy is set to
> > "delete", the remote-log-segments will be deleted in-sync. The
> > stopPartition call might get timed out when the number of remote log
> > segments to delete is huge. We can further extend the same approach for
> > the topic deletion requests.
> >
> > Also, Could you please update the state diagram about the transitions? It
> > is not clear when to transit from
> > DISABLING to DISABLED state?
> >
> > --
> > Kamal
> >
> > On Mon, May 6, 2024 at 6:55 PM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Ignore the above message. Got the answers after reading the state
> > > transition section.
> > >
> > > > If the disablement policy is delete, tasks scheduled for the
> > > topic-partitions in the RemoteDataExpirationThreadPool will also be
> > > canceled.
> > >
> > > We are deleting the segments synchronously. Should we delete them
> > > asynchronously? The same approach can be extended to topic deletion
> > > requests.
> > >
> > > > 6. In ZK mode, what will the controller do if the "stopReplicas"
> > > > responses are not received from all brokers? Reverting the changes?
> > >
> > > Since we are deleting the segments synchronously. This case can be
> bound
> > > to happen when the number of remote log segments to
> > > delete is huge.
> > >
> > >
> > > On Mon, May 6, 2024, 18:12 Kamal Chandraprakash <
> > > kamal.chandraprak...@gmail.com> wrote:
> > >
> > >> Hi Christo,
> > >>
> > >> Thanks for the update!
> > >>
> > >> 1. In the ZK mode, how will the transition from DISABLING to DISABLED
> > >> state happen?
> > >> For the "retain" policy, until we delete all the remote-log segments,
> > the
> > >> state will be
> > >> DISABLING and the deletion can 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Mario Fiore Vitale
Hi Chris,

Thanks for the survey. Do you think I need to update the KIP to put all of
these?

On Thu, May 9, 2024 at 4:14 PM Chris Egerton 
wrote:

> After doing a brief survey of the SMTs that ship with Connect, it seems
> like these would also benefit:
>
> - HeaderFrom, which populates record headers with subfields of keys/values
> [1]
> - Cast, which performs type transformation on subfields of keys/values [2]
> - SetSchemaMetadata, which (when the record key/value is a struct) copies
> fields from the input struct to the output struct (which uses a different
> schema) [3]
> - TimestampConverter, which does similar input/output field copying to
> SetSchemaMetadata [4]
> - ReplaceField, which does similar input/output field copying to
> SetSchemaMetadata and TimestampConverter
>
> [1] -
>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> [2] -
>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> [3] -
>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> [4] -
>
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> [5] -
>
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
>
> On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale 
> wrote:
>
> > Hi Chris,
> >
> > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > Yes, also that one can be affected.
> >
> > On Wed, May 8, 2024 at 5:59 PM Chris Egerton 
> > wrote:
> >
> > > Wait, just one more thing--are there any other SMTs that could benefit
> > > from this? Wouldn't ValueToKey [1] be applicable as well, for example?
> > >
> > > [1] -
> > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > >
> > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton  wrote:
> > >
> > > > Hi Mario,
> > > >
> > > > I think we could have something like `copy` and `copyWithoutDefaults`
> > > > to get around that, but now that you bring up compatibility, I think
> > > > it's best to hold off on this. I'm forced to recall that anything we
> > > > add to the Connect API may be used by plugin developers who write for
> > > > the bleeding edge of the Connect runtime, but deployed by users who
> > > > are running on (possibly much) older versions. In that scenario, any
> > > > use of new Struct methods would cause issues at runtime caused by
> > > > compatibility clashes between the newer API that the plugin was
> > > > written for, and the older API that's provided by the runtime it's
> > > > running on.
> > > >
> > > > Anyway, thanks for humoring me. The KIP looks good to me 
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale <
> mvit...@redhat.com
> > >
> > > > wrote:
> > > >
> > > >> Hi Chris,
> > > >>
> > > >> Thanks for reviewing this.
> > > >>
> > > >> > It seems like the pattern of "copy the contents of this Struct
> > > >> > into another one for the purpose of mutation" could be fairly
> > > >> > common in user code bases in addition to the core Connect SMTs. Do
> > > >> > you think there's a way to simplify this with, e.g., a
> > > >> > Struct.putAll(Struct destination) or Struct.copy(Schema
> > > >> > destinationSchema) method?
> > > >>
> > > >> The only concern that I see is backward compatibility. Suppose that
> > > >> you are not using the JsonConverter but another converter that
> > > >> doesn't support 'replace.null.with.default': when you use the
> > > >> current 'InsertField' SMT, the null values will be replaced by
> > > >> default values. If we replace the "copy" logic with a method in the
> > > >> Struct, we remove this behavior.
> > > >>
> > > >> Isn't it?
> > > >>
> > > >> Mario.
> > > >>
> > > >> On Wed, May 8, 2024 at 2:14 PM Chris Egerton
>  > >
> > > >> wrote:
> > > >>
> > > >> > Hi Mario,
> > > >> >
> > > >> > Thanks for the KIP! Looks good overall. One quick thought--it
> > > >> > wasn't immediately obvious to me why we had to touch on
> > > >> > InsertField for this. It looks like the reason is that we use
> > > >> > Struct::get [1] to create a clone of the input value [2], instead
> > > >> > of Struct::getWithoutDefault [3].
> > > >> >
> > > >> > It seems like the pattern of "copy the contents of this Struct
> > > >> > into another one for the purpose of mutation" could be fairly
> > > >> > common in user
> 
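
For context on the 'replace.null.with.default' name used in this thread: it
is an existing JsonConverter option (added by KIP-581). Below is a hedged
sketch of a connector config that sets it at the converter level and,
hypothetically, at the transform level as discussed for KIP-1040; the
transform-level property name is an assumption.

```java
import java.util.Map;

public class NullHandlingConfigSketch {
    public static void main(String[] args) {
        Map<String, String> props = Map.of(
                // Existing converter-level option (KIP-581): emit real nulls
                // instead of schema defaults when converting.
                "value.converter", "org.apache.kafka.connect.json.JsonConverter",
                "value.converter.replace.null.with.default", "false",
                // Hypothetical transform-level counterpart discussed for KIP-1040.
                "transforms", "insertTs",
                "transforms.insertTs.type", "org.apache.kafka.connect.transforms.InsertField$Value",
                "transforms.insertTs.timestamp.field", "ts",
                "transforms.insertTs.replace.null.with.default", "false"
        );
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```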

[PR] Add Skillsoft to "Powered By" [kafka-site]

2024-05-10 Thread via GitHub


brandon-powers opened a new pull request, #601:
URL: https://github.com/apache/kafka-site/pull/601

This PR adds [Skillsoft](https://www.skillsoft.com/) to the _**Powered By**_
page with a brief usage description & logo.

We've been utilizing Apache Kafka in production for quite a while now, along
with Kafka Connect for multiple use cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (KAFKA-16445) PATCH method for connector configuration

2024-05-10 Thread Ivan Yurchenko (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Yurchenko resolved KAFKA-16445.

Resolution: Fixed

> PATCH method for connector configuration
> 
>
> Key: KAFKA-16445
> URL: https://issues.apache.org/jira/browse/KAFKA-16445
> Project: Kafka
>  Issue Type: Improvement
>  Components: connect
>Reporter: Ivan Yurchenko
>Assignee: Ivan Yurchenko
>Priority: Minor
>
> As  [KIP-477: Add PATCH method for connector config in Connect REST 
> API|https://cwiki.apache.org/confluence/display/KAFKA/KIP-477%3A+Add+PATCH+method+for+connector+config+in+Connect+REST+API]
>  suggests, we should introduce the PATCH method for connector configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
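
For readers unfamiliar with KIP-477: it adds a PATCH variant of the connector
config endpoint that updates only the submitted keys, leaving the rest of the
configuration untouched. A hedged sketch using Java's built-in HTTP client;
the worker URL, connector name, and payload are illustrative:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PatchConnectorConfig {
    public static void main(String[] args) throws Exception {
        String patch = "{\"tasks.max\": \"4\"}"; // only the keys being changed
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors/my-sink/config"))
                .header("Content-Type", "application/json")
                .method("PATCH", HttpRequest.BodyPublishers.ofString(patch))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```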


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2889

2024-05-10 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-16660) reduce the check interval to speedup DelegationTokenRequestsTest

2024-05-10 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16660.

Fix Version/s: 3.8.0
   Resolution: Fixed

> reduce the check interval to speedup DelegationTokenRequestsTest
> 
>
> Key: KAFKA-16660
> URL: https://issues.apache.org/jira/browse/KAFKA-16660
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Kuan Po Tseng
>Priority: Minor
> Fix For: 3.8.0
>
>
> the check interval is 1 minute 
> (https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/DelegationTokenRequestsTest.scala#L49),
>  and `DelegationTokenRequestsTest` waits 2 minutes before running the check 
> (https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/DelegationTokenRequestsTest.scala#L159)
>  ...
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
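
For context, the speedup presumably amounts to lowering the broker's
delegation-token expiry check interval in the test's broker configuration so
the test no longer idles for minutes. A hedged sketch of the relevant
property; the real change lives in the Scala test linked above:

```java
import java.util.Properties;

public class FastTokenExpiryTestConfig {
    public static void main(String[] args) {
        Properties brokerProps = new Properties();
        // Standard broker config; value shrunk for tests (the test previously
        // used a 1-minute interval).
        brokerProps.setProperty("delegation.token.expiry.check.interval.ms", "1000");
        System.out.println(brokerProps);
    }
}
```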