[jira] [Reopened] (KAFKA-16935) Automatically wait for cluster startup in embedded Connect integration tests

2024-06-12 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton reopened KAFKA-16935:
---

> Automatically wait for cluster startup in embedded Connect integration tests
> 
>
> Key: KAFKA-16935
> URL: https://issues.apache.org/jira/browse/KAFKA-16935
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 3.8.0
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.8.0
>
>
> It's a common idiom in our integration tests to [start an embedded Kafka and 
> Connect 
> cluster|https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedConnect.java#L120-L135]
>  and then immediately afterwards [wait for each worker in the Connect cluster 
> to complete 
> startup|https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/ConnectAssertions.java#L62-L92].
>  Separating these two actions into separate steps makes our tests lengthier 
> and can even lead to bugs and flakiness if the second step is accidentally 
> omitted (see [https://github.com/apache/kafka/pull/16286] for one example).
> Instead, we should default to automatically awaiting the complete startup of 
> every worker in an embedded Connect cluster when {{EmbeddedConnect::start}} 
> is invoked, and require callers to opt out if they do not want to 
> automatically wait for startup to complete when invoking that method.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16935) Automatically wait for cluster startup in embedded Connect integration tests

2024-06-12 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16935.
---
Resolution: Fixed

> Automatically wait for cluster startup in embedded Connect integration tests
> 
>
> Key: KAFKA-16935
> URL: https://issues.apache.org/jira/browse/KAFKA-16935
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 3.8.0
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.8.0
>
>
> It's a common idiom in our integration tests to [start an embedded Kafka and 
> Connect 
> cluster|https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedConnect.java#L120-L135]
>  and then immediately afterwards [wait for each worker in the Connect cluster 
> to complete 
> startup|https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/ConnectAssertions.java#L62-L92].
>  Separating these two actions into separate steps makes our tests lengthier 
> and can even lead to bugs and flakiness if the second step is accidentally 
> omitted (see [https://github.com/apache/kafka/pull/16286] for one example).
> Instead, we should default to automatically awaiting the complete startup of 
> every worker in an embedded Connect cluster when {{EmbeddedConnect::start}} 
> is invoked, and require callers to opt out if they do not want to 
> automatically wait for startup to complete when invoking that method.





[jira] [Created] (KAFKA-16943) Synchronously verify Connect worker startup failure in InternalTopicsIntegrationTest

2024-06-12 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-16943:
-

 Summary: Synchronously verify Connect worker startup failure in 
InternalTopicsIntegrationTest
 Key: KAFKA-16943
 URL: https://issues.apache.org/jira/browse/KAFKA-16943
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Reporter: Chris Egerton


Created after PR discussion 
[here|https://github.com/apache/kafka/pull/16288#discussion_r1636615220].

In some of our integration tests, we want to verify that a Connect worker 
cannot start under poor conditions (such as when its internal topics do not yet 
exist and it is configured to create them with a higher replication factor than 
the number of available brokers, or when its internal topics already exist but 
they do not have the compaction cleanup policy).

This is currently not possible, and presents a possible gap in testing 
coverage, especially for the test cases 
{{testFailToCreateInternalTopicsWithMoreReplicasThanBrokers}} and 
{{testFailToStartWhenInternalTopicsAreNotCompacted}}. It'd be nice if we 
could have some way of synchronously awaiting the completion or failure of 
worker startup in our integration tests in order to guarantee that worker 
startup fails under sufficiently adverse conditions.
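One way the desired "await the completion or failure of worker startup" could look in a test harness is sketched below. The names (WorkerHandle, awaitStartupOutcome) are hypothetical and not part of the actual Kafka test utilities; this only illustrates the synchronization pattern being asked for.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical handle for one Connect worker under test. Startup is
// modeled as a future that completes normally on success and
// exceptionally on failure, so tests can block on either outcome.
public class WorkerHandle {
    private final CompletableFuture<Void> startupOutcome = new CompletableFuture<>();

    public void markStarted() {
        startupOutcome.complete(null);
    }

    public void markFailed(Throwable cause) {
        startupOutcome.completeExceptionally(cause);
    }

    // Blocks until startup succeeds or fails; returns true on success,
    // false on failure, and throws TimeoutException if neither happens.
    public boolean awaitStartupOutcome(long timeoutMs)
            throws InterruptedException, TimeoutException {
        try {
            startupOutcome.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (ExecutionException e) {
            return false;
        }
    }
}
```

With something like this, a test such as testFailToCreateInternalTopicsWithMoreReplicasThanBrokers could assert that awaitStartupOutcome returns false, rather than only inferring failure indirectly.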





Re: [DISCUSS] KIP-1017: A health check endpoint for Kafka Connect

2024-06-11 Thread Chris Egerton
Hi Viktor,

Thanks for your comments!

1) I'm realizing now that there's some implicit behavior in the KIP that I
should spell out explicitly. I was thinking that the timeout of 10 seconds
that I mentioned in the "Public Interfaces" section would have to elapse
before any 5xx responses were issued. So, if someone pinged the health
endpoint while the worker was starting up, the worker would take up to 10
seconds to try to complete startup before giving up and responding to the
request with a 503 status. This would obviate the need for a Retry-After
header because it would apply a natural rate limiting to these requests
during worker startup. Does this seem reasonable? We could diverge in
behavior and only apply the 10 second timeout to the 500 case (i.e., worker
has completed startup but is not live for other reasons), at which point a
Retry-After header for the 503 case (worker still starting up) would make
sense, but I can't think of any benefits to this approach. Thoughts?
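A rough model of the behavior described in (1) — wait up to the timeout for startup to finish before answering, so that pollers are naturally rate-limited without a Retry-After header. This is an illustrative sketch, not the actual Connect REST implementation; the class and method names are invented.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: the health endpoint holds the request open for up to the
// configured timeout while the worker finishes starting, answering
// 200 if startup completes in time and 503 otherwise.
public class HealthCheckSketch {
    private final CountDownLatch startupComplete = new CountDownLatch(1);

    public void markStartupComplete() {
        startupComplete.countDown();
    }

    // Returns an HTTP-style status code: 200 if the worker finished
    // startup within the timeout, 503 if it is still starting up.
    public int healthStatus(long timeoutMs) throws InterruptedException {
        return startupComplete.await(timeoutMs, TimeUnit.MILLISECONDS) ? 200 : 503;
    }
}
```

Because the request itself blocks for up to the timeout, a client polling in a tight loop is slowed to roughly one request per timeout window during startup, which is the rate limiting described above.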

2) As of KAFKA-15563 [1], we have some great infrastructure in place to
identify worker actions that might be good candidates for the contents of
this "worker events" topic. But I don't think conflating the retrieval of
these events with the health endpoint is a good idea--IMO it should be a
separate endpoint and the health endpoint should stay lightweight and
simple. I'm also not sure it's necessary to expose the contents of this
kind of topic via the REST API at all; we could instruct users to consume
directly from the topic if they'd like to know the history of the worker.
Overall it seems like a decent idea and I'd be happy to review a KIP for
it, but like you mention, it seems like a pretty drastic change in scope
and I don't think it needs to be included in this proposal.

[1] - https://issues.apache.org/jira/browse/KAFKA-15563

Cheers,

Chris

On Tue, Jun 11, 2024 at 11:42 AM Viktor Somogyi-Vass
 wrote:

> Hi Chris,
>
> I also have 2 other comments:
>
> 1. One more thing I came across is that should we provide the Retry-After
> header in the response in case of 503 response? Although I'm not sure how
> many clients honor this, we could add it just in case some does and if you
> also find it useful. (We could default it to retry.backoff.ms.)
>
> 2. Adding to Adrian's comments, storing timestamped worker statuses in an
> internal topic and then reading them from there would add valuable info
> about what the worker does. For instance GET /health?startTime=45345323346
> could fetch events from the given timestamp additionally to your proposed
> behavior. Also, if the rest server isn't available, it would serve in
> itself as a log about the workers' behavior. I understand if you think it's
> such a scope change that it should be an improvement KIP, but would like to
> gauge what you think about this.
>
> Regards,
> Viktor
>
> On Tue, Jun 11, 2024 at 4:34 PM Chris Egerton 
> wrote:
>
> > Hi Adrian,
> >
> > Thanks for your comments/questions! The responses to them are related so
> > I'll try to address both at once.
> >
> > The most recent update I made to the KIP should help provide insight into
> > what's going wrong if a non-200 response is returned. I don't plan on
> > adding any structured data such as error codes or something like a
> "phase"
> > field with values like READING_CONFIG_TOPIC quite yet, but there is room
> > for us to add human-readable information on the causes of failure in the
> > "message" field (see KAFKA-15563 [1] and its PR [2] for an example of
> what
> > kind of information we might provide to users). Part of the problem is
> that
> > while I've heard plenty of (justified!) complaints about the Kafka
> Connect
> > REST API becoming unavailable and the difficulties users face with
> > debugging their workers when that happens, I still don't feel we have a
> > strong-enough grasp on the common causes for this scenario to commit to a
> > response format that could be more machine-readable, and it can be
> > surprisingly difficult to get to a root cause in some cases.
> >
> > I'm anticipating that users will rely on the endpoint primarily for two
> > things:
> > 1) Ensuring that a worker has completed startup successfully during a
> > rolling upgrade (if you don't get a 200 after long enough, abort the
> > upgrade, check the error message, and start investigating)
> > 2) Ensuring that a worker remains healthy after it has joined the cluster
> > (if you don't get a 200 for a sustained period of time, check the error
> > message, and then decide whether to restart the process or issue a page)
> >
> > It's primarily designed to be easy to incorporate with automated tooling
> > that has support for REST-based process health monitoring

[jira] [Created] (KAFKA-16935) Automatically wait for cluster startup in embedded Connect integration tests

2024-06-11 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-16935:
-

 Summary: Automatically wait for cluster startup in embedded 
Connect integration tests
 Key: KAFKA-16935
 URL: https://issues.apache.org/jira/browse/KAFKA-16935
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 3.8.0
Reporter: Chris Egerton
Assignee: Chris Egerton


It's a common idiom in our integration tests to [start an embedded Kafka and 
Connect 
cluster|https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/EmbeddedConnect.java#L120-L135]
 and then immediately afterwards [wait for each worker in the Connect cluster 
to complete 
startup|https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/connect/runtime/src/test/java/org/apache/kafka/connect/util/clusters/ConnectAssertions.java#L62-L92].
 Separating these two actions into separate steps makes our tests lengthier and 
can even lead to bugs and flakiness if the second step is accidentally omitted 
(see [https://github.com/apache/kafka/pull/16286] for one example).

Instead, we should default to automatically awaiting the complete startup of 
every worker in an embedded Connect cluster when {{EmbeddedConnect::start}} is 
invoked, and require callers to opt out if they do not want to automatically 
wait for startup to complete when invoking that method.
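A minimal sketch of the proposed shape, assuming a hypothetical opt-out method; the real EmbeddedConnect API and method names may differ:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative stand-in for EmbeddedConnect: start() awaits worker
// startup by default, while startAsync() preserves the old behavior
// for tests that want to opt out (e.g. to assert on startup failure).
public class EmbeddedConnectSketch {
    private final int numWorkers;
    private final CountDownLatch workersUp;

    public EmbeddedConnectSketch(int numWorkers) {
        this.numWorkers = numWorkers;
        this.workersUp = new CountDownLatch(numWorkers);
    }

    // Simulates each worker completing startup on its own thread.
    private void launchWorkers() {
        for (int i = 0; i < numWorkers; i++) {
            new Thread(workersUp::countDown).start();
        }
    }

    // Proposed default: block until every worker reports started.
    public void start() throws InterruptedException {
        launchWorkers();
        if (!workersUp.await(60, TimeUnit.SECONDS)) {
            throw new AssertionError("workers did not start in time");
        }
    }

    // Opt-out: launch the cluster without waiting, as start() does today.
    public void startAsync() {
        launchWorkers();
    }

    public boolean allWorkersStarted() {
        return workersUp.getCount() == 0;
    }
}
```

With this default, a test that calls start() can no longer accidentally skip the wait-for-startup assertion step.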





Re: [DISCUSS] KIP-1017: A health check endpoint for Kafka Connect

2024-06-11 Thread Chris Egerton
Hi Adrian,

Thanks for your comments/questions! The responses to them are related so
I'll try to address both at once.

The most recent update I made to the KIP should help provide insight into
what's going wrong if a non-200 response is returned. I don't plan on
adding any structured data such as error codes or something like a "phase"
field with values like READING_CONFIG_TOPIC quite yet, but there is room
for us to add human-readable information on the causes of failure in the
"message" field (see KAFKA-15563 [1] and its PR [2] for an example of what
kind of information we might provide to users). Part of the problem is that
while I've heard plenty of (justified!) complaints about the Kafka Connect
REST API becoming unavailable and the difficulties users face with
debugging their workers when that happens, I still don't feel we have a
strong-enough grasp on the common causes for this scenario to commit to a
response format that could be more machine-readable, and it can be
surprisingly difficult to get to a root cause in some cases.

I'm anticipating that users will rely on the endpoint primarily for two
things:
1) Ensuring that a worker has completed startup successfully during a
rolling upgrade (if you don't get a 200 after long enough, abort the
upgrade, check the error message, and start investigating)
2) Ensuring that a worker remains healthy after it has joined the cluster
(if you don't get a 200 for a sustained period of time, check the error
message, and then decide whether to restart the process or issue a page)
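Usage (1) above — gating a rolling upgrade on the new endpoint — can be sketched as a retry loop. The status source is injected here so the retry logic is shown without tying it to a specific HTTP client; in practice it would be an HTTP GET against the health endpoint. StartupGate and awaitHealthy are invented names.

```java
import java.util.function.IntSupplier;

// Sketch of upgrade tooling: poll the worker's health endpoint until
// it returns 200, or give up after a bounded number of attempts and
// let the operator abort the rollout and investigate.
public class StartupGate {
    // Returns true if a 200 was seen before maxAttempts ran out.
    public static boolean awaitHealthy(IntSupplier statusCheck,
                                       int maxAttempts,
                                       long pollIntervalMs) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (statusCheck.getAsInt() == 200) {
                return true;
            }
            Thread.sleep(pollIntervalMs);
        }
        return false;
    }
}
```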

It's primarily designed to be easy to incorporate with automated tooling
that has support for REST-based process health monitoring, while also
providing some human-readable information (when possible) if the worker
isn't healthy. This human-readable information should hopefully help people
gauge how to respond to non-200 responses, and we can try to improve
wording and granularity over time based on user feedback. You and other
users may develop automated responses based on the content of the error
messages, but beware that the wording may change across releases.

Does that seem reasonable for V1 of this feature? I can definitely see room
for expansion of the response format in the future, but would like to hold
off on that for now.

[1] - https://issues.apache.org/jira/browse/KAFKA-15563
[2] - https://github.com/apache/kafka/pull/14562

Cheers,

Chris

On Tue, Jun 11, 2024 at 3:37 AM Adrian Preston  wrote:

> Hi Chris,
>
> Good KIP – I think it will be very helpful in alerting and automating the
> resolution of common Connect problems.
>
> I have a couple of questions / suggestions:
>
> 1. What are you planning on documenting as guidance for using this new
> endpoint? My guess would be that if Connect doesn’t return a status of 200
> after some period I would either page someone, or restart the process? But
> I’m missing the nuance of distinguishing between readiness and liveness, is
> this for maintaining overall availability when rolling out updates to
> several Connect processes?
>
> 2. Would you consider providing a way to discover details about exactly
> which condition (or conditions) is/are failing? Perhaps by providing a
> richer JSON response? Something like this would help us adopt the health
> check, as we could start by paging someone for all failures, then over time
> (as we gained confidence) transition more of the conditions over to being
> handled by automation.
>
> Regards,
> - Adrian
>
>
> From: Chris Egerton 
> Date: Monday, 10 June 2024 at 15:26
> To: dev@kafka.apache.org 
> Subject: [EXTERNAL] Re: [DISCUSS] KIP-1017: A health check endpoint for
> Kafka Connect
> Hi all,
>
> Thanks for the positive feedback!
>
> I've made one small addition to the KIP since there's been a change to our
> REST timeout error messages that's worth incorporating here. Quoting the
> added section directly:
>
> > Note that the HTTP status codes and "status" fields in the JSON response
> will match the exact examples above. However, the "message" field may be
> augmented to include, among other things, more information about the
> operation(s) the worker could be blocked on (such as was added in REST
> timeout error messages in KAFKA-15563 [1]).
>
> Assuming this still looks okay to everyone, I'll kick off a vote thread
> sometime this week or the next.
>
> [1] - https://issues.apache.org/jira/browse/KAFKA-15563
>
> Cheers,
>
> Chris
>
> On Fri, Jun 7, 2024 at 12:01 PM Andrew Schofield <
> andrew_schofi...@live.com>
> wrote:
>
> > Hi Chris,
> > This KIP looks good to me. I particularly like the explanation of how the
> > result will specifically
> > check the worker health in ways that have previously caused trouble.
> >
> > Thanks

[jira] [Resolved] (KAFKA-9228) Reconfigured converters and clients may not be propagated to connector tasks

2024-06-10 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-9228.
--
Fix Version/s: 3.9.0
   Resolution: Fixed

> Reconfigured converters and clients may not be propagated to connector tasks
> 
>
> Key: KAFKA-9228
> URL: https://issues.apache.org/jira/browse/KAFKA-9228
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 2.3.0, 2.4.0, 2.3.1, 2.3.2
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.9.0
>
>
> If an existing connector is reconfigured but the only changes are to its 
> converters and/or Kafka clients (enabled as of 
> [KIP-458|https://cwiki.apache.org/confluence/display/KAFKA/KIP-458%3A+Connector+Client+Config+Override+Policy]),
>  the changes will not propagate to its tasks unless the connector also 
> generates task configs that differ from the existing task configs. Even after 
> this point, if the connector tasks are reconfigured, they will still not pick 
> up on the new converter and/or Kafka client configs.
> This is because the {{DistributedHerder}} only writes new task configurations 
> to the connect config topic [if the connector-provided task configs differ 
> from the task configs already in the config 
> topic|https://github.com/apache/kafka/blob/e499c960e4f9cfc462f1a05a110d79ffa1c5b322/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L1285-L1332],
>  and neither of those contain converter or Kafka client configs.
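A simplified model of the bug described above (not the actual DistributedHerder code): new task configs are only written when the connector-generated task configs differ from what is already in the config topic, and connector-level converter/client overrides are not part of those task configs, so a change to only the overrides looks like "no change".

```java
import java.util.List;
import java.util.Map;

// Illustrative comparison: equality is checked only on the task
// configs themselves, so reconfigured converters or Kafka clients
// at the connector level never make the two sides differ.
public class TaskConfigComparison {
    public static boolean shouldWriteTaskConfigs(
            List<Map<String, String>> existing,
            List<Map<String, String>> generated) {
        return !existing.equals(generated);
    }
}
```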





Re: [DISCUSS] KIP-1017: A health check endpoint for Kafka Connect

2024-06-10 Thread Chris Egerton
Hi all,

Thanks for the positive feedback!

I've made one small addition to the KIP since there's been a change to our
REST timeout error messages that's worth incorporating here. Quoting the
added section directly:

> Note that the HTTP status codes and "status" fields in the JSON response
will match the exact examples above. However, the "message" field may be
augmented to include, among other things, more information about the
operation(s) the worker could be blocked on (such as was added in REST
timeout error messages in KAFKA-15563 [1]).

Assuming this still looks okay to everyone, I'll kick off a vote thread
sometime this week or the next.

[1] - https://issues.apache.org/jira/browse/KAFKA-15563

Cheers,

Chris

On Fri, Jun 7, 2024 at 12:01 PM Andrew Schofield 
wrote:

> Hi Chris,
> This KIP looks good to me. I particularly like the explanation of how the
> result will specifically
> check the worker health in ways that have previously caused trouble.
>
> Thanks,
> Andrew
>
> > On 7 Jun 2024, at 16:18, Mickael Maison 
> wrote:
> >
> > Hi Chris,
> >
> > Happy Friday! The KIP looks good to me. +1
> >
> > Thanks,
> > Mickael
> >
> > On Fri, Jan 26, 2024 at 8:41 PM Chris Egerton 
> wrote:
> >>
> >> Hi all,
> >>
> >> Happy Friday! I'd like to kick off discussion for KIP-1017, which (as
> the
> >> title suggests) proposes adding a health check endpoint for Kafka
> Connect:
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1017%3A+Health+check+endpoint+for+Kafka+Connect
> >>
> >> This is one of the longest-standing issues with Kafka Connect and I'm
> >> hoping we can finally put it in the ground soon. Looking forward to
> hearing
> >> people's thoughts!
> >>
> >> Cheers,
> >>
> >> Chris
>
>


Re: [VOTE] KIP-877: Mechanism for plugins and connectors to register metrics

2024-06-10 Thread Chris Egerton
+1 (binding), thanks Mickael!

On Mon, Jun 10, 2024, 04:24 Mickael Maison  wrote:

> Hi,
>
> Following the feedback in the DISCUSS thread, I made significant
> changes to the proposal. So I'd like to restart a vote for KIP-877:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-877%3A+Mechanism+for+plugins+and+connectors+to+register+metrics
>
> Thanks,
> Mickael
>
> On Thu, Jan 25, 2024 at 2:59 AM Tom Bentley  wrote:
> >
> > Hi Mickael,
> >
> > You'll have seen that I left some comments on the discussion thread, but
> > they're minor enough that I'm happy to vote +1 here.
> >
> > Thanks,
> >
> > Tom
> >
> > On Thu, 11 Jan 2024 at 06:14, Mickael Maison 
> > wrote:
> >
> > > Bumping this thread since I've not seen any feedback.
> > >
> > > Thanks,
> > > Mickael
> > >
> > > On Tue, Dec 19, 2023 at 10:03 AM Mickael Maison
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > I'd like to start a vote on KIP-877:
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-877%3A+Mechanism+for+plugins+and+connectors+to+register+metrics
> > > >
> > > > Let me know if you have any feedback.
> > > >
> > > > Thanks,
> > > > Mickael
> > >
> > >
>


Re: Request permission to Create KIP

2024-06-06 Thread Chris Egerton
Hi Eric,

I don't think you need to log into Pony Mail to start a discussion thread.
You can just send an email to the dev list with [DISCUSS] in the title and
that should be sufficient.

Cheers,

Chris

On Thu, Jun 6, 2024 at 12:01 PM Eric Lu  wrote:

> Hi,
>
> Thanks for setting my account up. I am currently writing a KIP. However, I
> am experiencing some issues with creating a discussion thread on Apache
> Pony mail. I am unable to login to Apache Pony Mail with my credentials:
> ericlu95. Who should I contact to deal with this problem? What should I do?
>
> Best regards,
>
> Eric
>
> On Wed, Jun 5, 2024 at 11:51 AM Justine Olshan
> 
> wrote:
>
> > Hi Eric,
> >
> > You should be all set to create a KIP. Let me know if you have any
> issues!
> >
> > Justine
> >
> > On Wed, Jun 5, 2024 at 7:18 AM Erick Morty 
> > wrote:
> >
> > > Hi,
> > >
> > > I would like to request permissions to access my apache account to
> > create a
> > > KIP. I have joined my account information below:
> > > JIRA: ericlu95
> > > Confluence: ericlu95
> > >
> > >
> > > Thank you very much,
> > >
> > > Eric
> > >
> >
>


[jira] [Resolved] (KAFKA-16838) Kafka Connect loads old tasks from removed connectors

2024-06-04 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16838.
---
Fix Version/s: 3.9.0
   Resolution: Fixed

> Kafka Connect loads old tasks from removed connectors
> -
>
> Key: KAFKA-16838
> URL: https://issues.apache.org/jira/browse/KAFKA-16838
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 3.5.1, 3.6.1, 3.8.0
>Reporter: Sergey Ivanov
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.9.0
>
>
> Hello,
> When creating connector we faced an error from one of our ConfigProviders 
> about not existing resource, but we didn't try to set that resource as config 
> value:
> {code:java}
> [2024-05-24T12:08:24.362][ERROR][request_id= ][tenant_id= 
> ][thread=DistributedHerder-connect-1-1][class=org.apache.kafka.connect.runtime.distributed.DistributedHerder][method=lambda$reconfigureConnectorTasksWithExponentialBackoffRetries$44]
>  [Worker clientId=connect-1, groupId=streaming-service_streaming_service] 
> Failed to reconfigure connector's tasks (local-file-sink), retrying after 
> backoff.
> org.apache.kafka.common.config.ConfigException: Could not read properties 
> from file /opt/kafka/provider.properties
>  at 
> org.apache.kafka.common.config.provider.FileConfigProvider.get(FileConfigProvider.java:98)
>  at 
> org.apache.kafka.common.config.ConfigTransformer.transform(ConfigTransformer.java:103)
>  at 
> org.apache.kafka.connect.runtime.WorkerConfigTransformer.transform(WorkerConfigTransformer.java:58)
>  at 
> org.apache.kafka.connect.storage.ClusterConfigState.taskConfig(ClusterConfigState.java:181)
>  at 
> org.apache.kafka.connect.runtime.AbstractHerder.taskConfigsChanged(AbstractHerder.java:804)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.publishConnectorTaskConfigs(DistributedHerder.java:2089)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.reconfigureConnector(DistributedHerder.java:2082)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.reconfigureConnectorTasksWithExponentialBackoffRetries(DistributedHerder.java:2025)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$null$42(DistributedHerder.java:2038)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.runRequest(DistributedHerder.java:2232)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:470)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:371)
>  at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  at java.base/java.lang.Thread.run(Thread.java:840)
>  {code}
> It looked like there already was a connector with the same name and same 
> config, +but it wasn't.+
> After investigation we found out that a few months ago on that cloud there 
> was a connector with the same name and another value for the config provider. 
> Then it was removed, but for some reason when we tried to create a connector 
> with the same name months later, AbstractHerder tried to update tasks from 
> our previous connector.
> As an example I used FileConfigProvider, but actually any ConfigProvider is 
> acceptable which could raise exception if something wrong with config (like 
> result doesn't exist).
> We continued our investigation and found the issue 
> https://issues.apache.org/jira/browse/KAFKA-7745 that says Connect doesn't 
> send tombstone messages for *commit* and *task* records in the config topic of 
> Kafka Connect. As we remember, the config topic is `compact`, *which means 
> commit and task records are always stored* (months, even years after connector 
> removal) while tombstones for connector messages are cleaned with the 
> {{delete.retention.ms}} property. That impacts further connector creations 
> with the same name.
> We didn't investigate the reasons in ConfigClusterStore or how to avoid that 
> issue, because we would {+}like to ask{+}: probably it's better to fix 
> KAFKA-7745 and send tombstones for commit and task messages as Connect does 
> for connector and target messages?
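The compaction behavior at the heart of this report can be modeled in a few lines. A compacted topic keeps the latest record per key, and a record with a null value (a tombstone) eventually removes the key entirely. The key names below only loosely follow the connector/task/commit grouping and are not the exact Connect config-topic key format.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a compacted topic after cleaning has run: the latest
// non-null value per key survives, and a null value (tombstone)
// deletes the key.
public class CompactedTopicModel {
    private final Map<String, String> latestByKey = new HashMap<>();

    public void send(String key, String value) {
        if (value == null) {
            latestByKey.remove(key); // tombstone: key disappears after compaction
        } else {
            latestByKey.put(key, value);
        }
    }

    public boolean contains(String key) {
        return latestByKey.containsKey(key);
    }
}
```

Because only the connector record gets a tombstone today, the stale task and commit records linger and can resurface when a connector with the same name is created later, which is the failure mode described above.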
> In the common way the TC looks like:
>  # Create connector with config provider to 

[jira] [Resolved] (KAFKA-16837) Kafka Connect fails on update connector for incorrect previous Config Provider tasks

2024-06-04 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16837.
---
Fix Version/s: 3.9.0
   Resolution: Fixed

> Kafka Connect fails on update connector for incorrect previous Config 
> Provider tasks
> 
>
> Key: KAFKA-16837
> URL: https://issues.apache.org/jira/browse/KAFKA-16837
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 3.5.1, 3.6.1, 3.8.0
>Reporter: Sergey Ivanov
>Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.9.0
>
> Attachments: kafka_connect_config.png
>
>
> Hello,
> We faced an issue when is not possible to update Connector config if the 
> *previous* task contains ConfigProvider's value with incorrect value that 
> leads to ConfigException.
> I can provide simple Test Case to reproduce it with FileConfigProvider, but 
> actually any ConfigProvider is acceptable that could raise exception if 
> something wrong with config (like resource doesn't exist).
> *Prerequisites:*
> Kafka Connect instance with config providers:
>  
> {code:java}
> config.providers=file
> config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider{code}
>  
> 1. Create Kafka topic "test"
> 2. On the Kafka Connect instance create the file 
> "/opt/kafka/provider.properties" with content
> {code:java}
> topics=test
> {code}
> 3. Create simple FileSink connector:
> {code:java}
> PUT /connectors/local-file-sink/config
> {
>   "connector.class": "FileStreamSink",
>   "tasks.max": "1",
>   "file": "/opt/kafka/test.sink.txt",
>   "topics": "${file:/opt/kafka/provider.properties:topics}"
> }
> {code}
> 4. Checks that everything works fine:
> {code:java}
> GET /connectors?expand=info&expand=status
> ...
> "status": {
>   "name": "local-file-sink",
>   "connector": {
> "state": "RUNNING",
> "worker_id": "10.10.10.10:8083"
>   },
>   "tasks": [
> {
>   "id": 0,
>   "state": "RUNNING",
>   "worker_id": "10.10.10.10:8083"
> }
>   ],
>   "type": "sink"
> }
>   }
> }
> {code}
> Looks fine.
> 5. Renames the file to "/opt/kafka/provider2.properties".
> 6. Update connector with new correct file name:
> {code:java}
> PUT /connectors/local-file-sink/config
> {
>   "connector.class": "FileStreamSink",
>   "tasks.max": "1",
>   "file": "/opt/kafka/test.sink.txt",
>   "topics": "${file:/opt/kafka/provider2.properties:topics}"
> }
> {code}
> Update {*}succeed{*}, got 200. 
> 7. Checks that everything works fine:
> {code:java}
> {
>   "local-file-sink": {
> "info": {
>   "name": "local-file-sink",
>   "config": {
> "connector.class": "FileStreamSink",
> "file": "/opt/kafka/test.sink.txt",
> "tasks.max": "1",
> "topics": "${file:/opt/kafka/provider2.properties:topics}",
> "name": "local-file-sink"
>   },
>   "tasks": [
> {
>   "connector": "local-file-sink",
>   "task": 0
> }
>   ],
>   "type": "sink"
> },
> "status": {
>   "name": "local-file-sink",
>   "connector": {
> "state": "RUNNING",
> "worker_id": "10.10.10.10:8083"
>   },
>   "tasks": [
> {
>   "id": 0,
>   "state": "FAILED",
>   "worker_id": "10.10.10.10:8083",
>   "trace": "org.apache.kafka.common.errors.InvalidTopicException: 
> Invalid topics: [${file:/opt/kafka/provider.properties:topics}]"
> }
>   ],
>   "type": "sink"
> }
>   }
> }
> {code}
> Config has been updated, but the new task has not been created.

Re: [DISCUSS] KIP-877: Mechanism for plugins and connectors to register metrics

2024-05-31 Thread Chris Egerton
> > > > > >>>>> On 01/25/2024 9:56 AM +08 Tom Bentley 
> wrote:
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> Hi Mickael,
> > > > > >>>>>
> > > > > >>>>> Thanks for the KIP! I can tell a lot of thought went into
> this. I have a
> > > > > >>>>> few comments, but they're all pretty trivial and aimed at
> making the
> > > > > >>>>> correct use of this API clearer to implementors.
> > > > > >>>>>
> > > > > >>>>> 1. Configurable and Reconfigurable both use a verb in the
> imperative mood
> > > > > >>>>> for their method name. Monitorable doesn't, which initially
> seemed a bit
> > > > > >>>>> inconsistent to me, but I think your intention is to allow
> plugins to
> > > > > >>>>> merely retain a reference to the PluginMetrics, and allow
> registering
> > > > > >>>>> metrics at any later point? If that's the case you could add
> something
> > > > > >>>> like
> > > > > >>>>> "Plugins can register and unregister metrics using the given
> > > > > >>>> PluginMetrics
> > > > > >>>>> at any point in their lifecycle prior to their close method
> being called"
> > > > > >>>>> to the javadoc to make clear how this can be used.
> > > > > >>>>> 2. I assume PluginMetrics will be thread-safe? We should
> document that as
> > > > > >>>>> part of the contract.
> > > > > >>>>> 3. I don't think IAE is quite right for duplicate metrics.
> In this case
> > > > > >>>> the
> > > > > >>>>> arguments themselves are fine, it's the current state of the
> > > > > >>>> PluginMetrics
> > > > > >>>>> which causes the problem. If the earlier point about plugins
> being
> > > > > >>>> allowed
> > > > > >>>>> to register and unregister metrics at any point is correct
> then this
> > > > > >>>>> exception could be thrown after configuration time. That
> being the case I
> > > > > >>>>> think a new exception type might be clearer.
> > > > > >>>>> 4. You define some semantics for PluginMetrics.close(): It
> might be a
> > > > > >>>> good
> > > > > >>>>> idea to override the inherited method and add that as
> javadoc.
> > > > > >>>>> 5. You say "It will be the responsibility of the plugin that
> creates
> > > > > >>>>> metrics to call close() of the PluginMetrics instance they
> were given to
> > > > > >>>>> remove their metrics." But you don't provide any guidance to
> users about
> > > > > >>>>> when they need to do this. I guess that they should be doing
> this in
> > > > > >>>> their
> > > > > >>>>> plugin's close method (and that's why you're only adding
> Monitorable to
> > > > > >>>>> plugins which implement Closeable and AutoCloseable), but if
> that's t

[jira] [Resolved] (KAFKA-16844) ByteArrayConverter can't convert ByteBuffer

2024-05-30 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16844.
---
Fix Version/s: 3.8.0
   Resolution: Fixed

> ByteArrayConverter can't convert ByteBuffer
> ---
>
> Key: KAFKA-16844
> URL: https://issues.apache.org/jira/browse/KAFKA-16844
> Project: Kafka
>  Issue Type: Improvement
>  Components: connect
>Reporter: Fan Yang
>Assignee: Fan Yang
>Priority: Minor
> Fix For: 3.8.0
>
>
> In the current schema design, the schema type Bytes corresponds to two 
> kinds of classes: byte[] and ByteBuffer. However, the current 
> ByteArrayConverter can only convert byte[]. My suggestion is to add 
> ByteBuffer support to the existing ByteArrayConverter.
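A hedged sketch of the suggested change — the helper name and shape below are hypothetical illustrations, not the actual ByteArrayConverter code:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ByteBufferSupportSketch {

    // Hypothetical helper illustrating the suggested fix: accept both
    // byte[] and ByteBuffer wherever schema type Bytes is expected.
    static byte[] toBytes(Object value) {
        if (value == null) {
            return null;
        } else if (value instanceof byte[]) {
            return (byte[]) value;
        } else if (value instanceof ByteBuffer) {
            // Copy out the readable bytes without disturbing the
            // caller's buffer position.
            ByteBuffer buffer = ((ByteBuffer) value).slice();
            byte[] out = new byte[buffer.remaining()];
            buffer.get(out);
            return out;
        }
        throw new IllegalArgumentException(
                "Bytes value must be a byte[] or ByteBuffer, got " + value.getClass());
    }

    public static void main(String[] args) {
        byte[] raw = {1, 2, 3};
        if (!Arrays.equals(toBytes(raw), raw)) throw new AssertionError();
        if (!Arrays.equals(toBytes(ByteBuffer.wrap(raw)), raw)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Copying via `slice()` keeps the converter side-effect free with respect to the caller's buffer position.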



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-29 Thread Chris Egerton
Hi Greg,

First, an apology! I mistakenly assumed that each plugin appeared only once
in the responses from GET /connector-plugins?connectorsOnly=false. Thank
you for correcting me and pointing out that all versions of each plugin
appear in that response, which does indeed satisfy my desire for users to
discover this information in at most two REST requests (and in fact, does
it in only one)!

And secondly, with the revelation about recommenders, I agree that it's
best to leave the "version" property out of the lists of properties
returned from the GET /connector-plugins/<plugin>/config endpoint.

With those two points settled, I think the only unresolved item is the
small change to version parsing added to the KIP (where raw version numbers
are treated as an exact match, instead of a best-effort match with a
fallback on the default version). If the KIP is updated with that then I'd
be ready to vote on it.

Cheers,

Chris

On Wed, May 29, 2024 at 12:00 PM Greg Harris 
wrote:

> Hey Chris,
>
> Thanks for your thoughts.
>
> > Won't it still only expose the
> > latest version for each plugin, instead of the range of versions
> available?
>
> Here is a snippet of the current output of the GET
> /connector-plugins?connectorsOnly=false endpoint, after I installed two
> versions of the debezium PostgresConnector:
>
>   {
> "class": "io.debezium.connector.postgresql.PostgresConnector",
> "type": "source",
> "version": "2.0.1.Final"
>   },
>   {
> "class": "io.debezium.connector.postgresql.PostgresConnector",
> "type": "source",
> "version": "2.6.1.Final"
>   },
>
> I think this satisfies your requirement to learn about all plugins and all
> versions in two or fewer REST calls.
>
> I tried to get an example of the output of `/config` by hardcoding the
> Recommender, and realized that Recommenders aren't executed on the
> `/config` endpoint at all: only during validation, when a configuration is
> actually present.
> And this led me to discover that the `/config` endpoint returns a
> List<ConfigKeyInfo>, and ConfigKeyInfo does not contain a recommendedValues
> field. The ConfigValue field is the object which contains
> recommendedValues, and it is only generated during validation.
> I think it's out of scope to start calling recommenders on empty configs
> that might throw exceptions, changing the existing REST entities, or
> changing the core ConfigDef implementation.
> Someone could add this functionality later, I don't think it's necessary
> here.
>
> Then the question is: should "version" without recommenders appear in
> non-connector plugins? I think I'd rather be consistent with "predicate"
> and "negate" on release, and let a later improvement add them.
>
> Thanks,
> Greg
>
> On Wed, May 29, 2024 at 8:06 AM Chris Egerton 
> wrote:
>
> > Hi Greg,
> >
> > I'm confused about the behavior for GET
> > /connector-plugins?connectorsOnly=false. Won't it still only expose the
> > latest version for each plugin, instead of the range of versions
> available?
> >
> > I'm hoping we can provide a flow where people need at most two REST calls
> > to discover 1) the complete set of plugins available on the worker (which
> > is already possible with the endpoint under discussion) and 2) the set of
> > versions available for a specific plugin on the worker (which doesn't
> > appear to be possible, at least for some plugin types). This wouldn't
> > require any out-of-band knowledge and would be valuable for connector
> users
> > (who may want to, for example, know what their options are when
> considering
> > a plugin upgrade) and cluster administrators (who could use it as a
> sanity
> > check for the setup of their Kafka Connect clusters without having to
> pore
> > over log files).
> >
> > As far as modifying the content of the GET
> > /connector-plugins/<plugin>/config endpoint to include a "version"
> property
> > goes, I think your point about returning that property from requests to
> > that endpoint that include a version query parameter is salient, but it
> > also unfortunately applies to all types of plugin. I don't think it
> should
> > completely disqualify that option. I also wasn't imagining adding
> anything
> > besides that single property to that endpoint, so no new "predicate"
> > property for SMTs, no new "negate" property for predicates, and no new
> > "type" property for either. I'm not necessarily opposed to adding those,
> > but it can be done without being pulled into the sco

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-29 Thread Chris Egerton
uld you want to
> handle the converters?
> And when I say the configs are "not part of the plugin config itself"
> I mean that saying that GET
> /connector-plugins/Flatten$Key/config?version=3.8.0 has a "version"
> config that must be "3.8.0" is a little bit nonsense, as the version
> is already specified.
>
> > IMO
> > it's worth including this information somewhere directly accessible
> without
> > having to provide a full connector config. FWIW I'd be fine with GET
> > /connector-plugins/<plugin>/versions as a first-class endpoint
>
> You don't have to provide a configuration to call GET
> /connector-plugins?connectorsOnly=false , is that endpoint not close
> enough to what you have in mind? See also the Rejected Alternative
> "Adding new REST API endpoints"
>
> If you're calling /connector-plugins/<plugin>/config, you know the
> name of a plugin right? That either comes from out-of-band knowledge,
> validating a connector config, or calling GET
> /connector-plugins?connectorsOnly=false.
> * If you have out-of-band knowledge of plugin classes, perhaps you
> have out-of-band knowledge of versions too.
> * If you've just validated a connector config, there should be an
> accompanying "version" field there with an accurate default value and
> recommenders.
> * If you've called GET /connector-plugins?connectorsOnly=false, that
> endpoint includes version information.
>
> Thanks,
> Greg
>
> On Wed, May 22, 2024 at 11:05 AM Chris Egerton 
> wrote:
> >
> > Hi Greg,
> >
> > Hope you had a nice weekend! Gonna try to keep things concise.
> >
> > Concluded points:
> >
> > RE version recommenders, I agree it's likely that programmatic UIs will
> > already be able to handle dynamic configuration definitions, and the
> detail
> > about SMTs is a great point. I still anticipate some awkwardness with
> > connector versions, though: if the latest version supports some new
> > properties, then a user switches to an earlier version, a UI may respond
> by
> > wiping values for these properties. I guess we can bite the bullet,
> though.
> >
> > RE double-dinging during preflight validation for invalid versions, I
> like
> > the analogy with login credentials. I'm convinced that the proposal in
> the
> > KIP is best 
> >
> > Continued points:
> >
> > RE failing on worker startup, sorry, I should be clearer: there is no
> _new_
> > justification for it that doesn't also apply to existing behavior. We
> > shouldn't diverge from existing behavior solely for this new case. An
> > alternative could be to change existing behavior to fail fast on any
> > invalid default converter configuration instead of just for invalid
> > versions, but I'd vote to just stick to existing behavior and not
> > complicate things, especially since no other part of the KIP requires
> this
> > change.
> >
> > RE exposing the version property in the
> /connector-plugins/<plugin>/config
> > endpoint, the behavior is inconsistent across plugin types. Hitting the
> > endpoint for the FileStreamSinkConnector on version 3.7.0 yields a
> response
> > that includes, among other things, the "topics", "topics.regex", and
> > "errors.tolerance" properties. I see that we don't do this everywhere
> (the
> > examples you cite for SMT and converter properties are accurate), but IMO
> > it's worth including this information somewhere directly accessible
> without
> > having to provide a full connector config. FWIW I'd be fine with GET
> > /connector-plugins/<plugin>/versions as a first-class endpoint either
> > instead of or in addition to adding recommended values for all plugin
> > versions.
> >
> > Thanks for your continued work on this KIP, and with the progress we're
> > making I'm optimistic about its chances of appearing in 4.0.0.
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, May 15, 2024 at 1:22 PM Greg Harris  >
> > wrote:
> >
> > > Hey Chris,
> > >
> > > Thanks for your quick follow up.
> > >
> > > > But this risk is already present with
> > > > existing error cases, and I don't see anything that justifies
> changing
> > > > existing behavior with an invalid converter class, or diverging from
> it
> > > in
> > > > the case of invalid converter versions.
> > >
> > > The justification is to fail-fast, and prevent REST API users from
> > > receiving errors from bad configs that they didn't write, or maybe
> > > don't even know apply to them.

Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-22 Thread Chris Egerton
r) you
> will then see all of the transformations configurations.
> I think this means that UI developers should have already developed
> the infrastructure for handling dynamic recommenders, or else they've
> had this bug since KIP-66 in 2017. It may require some manual
> attention to roll-out support for the ".version" properties
> specifically, but I don't think that should prevent us from providing
> this information in a natural place.
>
> > but will it be possible to configure
> > a connector to use two instances of the same transform or predicate, but
> > with different versions for each?
>
> Yes, and this is included in the example in the KIP,
> `transforms.flatten-latest` and `transforms.flatten-old`. This will
> work in the natural way, with the two instances having a different
> version if configured for that. If a user wants to pin the same
> version of a plugin for every instance, they will need to provide the
> version config multiple times and keep them in-sync themselves.
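Concretely, pinning two instances of the same SMT to different versions might look like the following connector-config sketch. The aliases match the KIP's `transforms.flatten-latest` / `transforms.flatten-old` example, but the exact class name and version values here are illustrative assumptions:

```properties
# Two instances of the same Flatten SMT, each pinned independently.
# Keeping both on one version means editing both .version properties.
transforms=flatten-latest,flatten-old
transforms.flatten-latest.type=org.apache.kafka.connect.transforms.Flatten$Value
transforms.flatten-latest.version=3.8.0
transforms.flatten-old.type=org.apache.kafka.connect.transforms.Flatten$Value
transforms.flatten-old.version=3.7.0
```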
>
> > I would have expected the error
> > to only be attributed to the version property, and for the class property
> > to be reported as valid.
>
> I considered this, and then went back and changed it. To quote someone
> else, this is to catch "misspelling cat as dog". For example a user
> types DogConverter and meant to type CatConverter, and then types a
> version which is valid for CatConverter, conceptually the error is in
> the class name and not the version.
> That's a very contrived scenario, but I think similar arguments are
> used for attributing validation errors to endpoints/urls/hostnames
> when the associated credentials are unable to log into the remote
> system. Did the user provide the correct testing credentials, but
> accidentally typed the production endpoint? Even if the production
> endpoint looks valid (it's a real hostname that is reachable) the
> conceptual error is still in the hostname and should have the error
> attributed to it to draw the user's attention.
> If that's not convincing, I think the alternative of only attributing
> version errors to the version property is also acceptable.
>
> Thanks,
> Greg
>
> On Wed, May 15, 2024 at 8:06 AM Chris Egerton 
> wrote:
> >
> > Hi Greg,
> >
> > Thanks for your responses! Continuations of existing discussions:
> >
> > Regarding crashing the worker on startup--yes, there is also a risk to
> > allowing it to join the cluster. But this risk is already present with
> > existing error cases, and I don't see anything that justifies changing
> > existing behavior with an invalid converter class, or diverging from it
> in
> > the case of invalid converter versions. I think we should keep this
> simple
> > and not do anything different for worker startup than we do already.
> >
> > As far as REST API vs. metrics go--I think you're right that the original
> > version metrics were added as a "monitoring" detail. However, this was
> back
> > when plugin versions were managed solely by cluster administrators. With
> > this KIP, connector users will be able to manage plugin versions, and CLI
> > and programmatic UI developers will want to develop their own tooling
> > layers. I think focusing on the REST API as the primary interface for
> this
> > KIP would be best for these users.
> >
> > (All that said, I don't object to the metrics that are proposed in the
> KIP;
> > I just think they make more sense in addition to new REST API
> > functionality, as opposed to instead of it.)
> >
> > Regarding the GET /connector-plugins/<plugin>/config endpoint, I was
> > thinking about the response for non-connector plugins, e.g.,
> > GET /connector-plugins/RegexRouter/config. Would a "version" property
> > appear with recommended values?
> >
> >
> > And new thoughts:
> >
> > 1) Regarding the recommended values for "connector.version", this might
> be
> > confusing since there could be differences between the ConfigDefs for the
> > latest version of the connector and prior versions. It also makes the
> flow
> > a bit awkward for programmatic UI developers: if a user changes the
> > connector version in, e.g., a dropdown menu, then the UI either has to
> > re-fetch the ConfigDef for the new version, or risk operating on stale
> > information. I'm starting to doubt that exposing the range of available
> > versions via recommended values is the best way to proceed, instead of a
> > more explicit approach like GET /connector-plugins/<plugin>/versions, or
> > the "Adding new REST API endpoints" rejected alternative.
> >
> > 2

Re: [VOTE] KIP-1040: Improve handling of nullable values in InsertField, ExtractField, and other transformations

2024-05-20 Thread Chris Egerton
Thanks for the KIP! +1 (binding)

On Mon, May 20, 2024 at 4:22 AM Mario Fiore Vitale 
wrote:

> Hi everyone,
>
> I'd like to call a vote on KIP-1040 which aims to improve handling of
> nullable values in InsertField, ExtractField, and other transformations
>
> KIP -
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=303794677
>
> Discussion thread -
> https://lists.apache.org/thread/ggqqqjbg6ccpz8g6ztyj7oxr80q5184n
>
> Thanks and regards,
> Mario
>


[jira] [Resolved] (KAFKA-16603) Data loss when kafka connect sending data to Kafka

2024-05-20 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16603.
---
Resolution: Not A Bug

> Data loss when kafka connect sending data to Kafka
> --
>
> Key: KAFKA-16603
> URL: https://issues.apache.org/jira/browse/KAFKA-16603
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 3.3.1
>Reporter: Anil Dasari
>Priority: Major
>
> We are experiencing a data loss when Kafka Source connector is failed to send 
> data to Kafka topic and offset topic. 
> Kafka cluster and Kafka connect details:
>  # Kafka connect version i.e client : Confluent community version 7.3.1 i.e 
> Kafka 3.3.1
>  # Kafka version: 0.11.0 (server)
>  # Cluster size : 3 brokers
>  # Number of partitions in all topics = 3
>  # Replication factor = 3
>  # Min ISR set 2
>  # Uses no transformations in Kafka connector
>  # Use default error tolerance i.e None.
> Our connector checkpoints the offsets info received in 
> SourceTask#commitRecord and resume the data process from the persisted 
> checkpoint.
> The data loss is noticed when broker is unresponsive for few mins due to high 
> load and kafka connector was restarted. Also, Kafka connector graceful 
> shutdown failed.
> Logs:
>  
> {code:java}
> [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Discovered group 
> coordinator 10.75.100.176:31000 (id: 2147483647 rack: null)
> Apr 22, 2024 @ 15:56:16.152 [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Group coordinator 
> 10.75.100.176:31000 (id: 2147483647 rack: null) is unavailable or invalid due 
> to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be 
> attempted.
> Apr 22, 2024 @ 15:56:16.153 [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Requesting disconnect from 
> last known coordinator 10.75.100.176:31000 (id: 2147483647 rack: null)
> Apr 22, 2024 @ 15:56:16.514 [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Node 0 disconnected.
> Apr 22, 2024 @ 15:56:16.708 [Producer 
> clientId=connector-producer-d094a5d7bbb046b99d62398cb84d648c-0] Node 0 
> disconnected.
> Apr 22, 2024 @ 15:56:16.710 [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Node 2147483647 
> disconnected.
> Apr 22, 2024 @ 15:56:16.731 [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Group coordinator 
> 10.75.100.176:31000 (id: 2147483647 rack: null) is unavailable or invalid due 
> to cause: coordinator unavailable. isDisconnected: true. Rediscovery will be 
> attempted.
> Apr 22, 2024 @ 15:56:19.103 == Trying to sleep while stop == (** custom log 
> **)
> Apr 22, 2024 @ 15:56:19.755 [Worker clientId=connect-1, 
> groupId=pg-group-adf06ea08abb4394ad4f2787481fee17] Broker coordinator was 
> unreachable for 3000ms. Revoking previous assignment Assignment{error=0, 
> leader='connect-1-8f41a1d2-6cc9-4956-9be3-1fbae9c6d305', 
> leaderUrl='http://10.75.100.46:8083/', offset=4, 
> connectorIds=[d094a5d7bbb046b99d62398cb84d648c], 
> taskIds=[d094a5d7bbb046b99d62398cb84d648c-0], revokedConnectorIds=[], 
> revokedTaskIds=[], delay=0} to avoid running tasks while not being a member 
> the group
> Apr 22, 2024 @ 15:56:19.866 Stopping connector 
> d094a5d7bbb046b99d62398cb84d648c
> Apr 22, 2024 @ 15:56:19.874 Stopping task d094a5d7bbb046b99d62398cb84d648c-0
> Apr 22, 2024 @ 15:56:19.880 Scheduled shutdown for 
> WorkerConnector{id=d094a5d7bbb046b99d62398cb84d648c}
> Apr 22, 2024 @ 15:56:24.105 Connector 'd094a5d7bbb046b99d62398cb84d648c' 
> failed to properly shut down, has become unresponsive, and may be consuming 
> external resources. Correct the configuration for this connector or remove 
> the connector. After fixing the connector, it may be necessary to restart 
> this worker to release any consumed resources.
> Apr 22, 2024 @ 15:56:24.110 [Producer 
> clientId=connector-producer-d094a5d7bbb046b99d62398cb84d648c-0] Closing the 
> Kafka producer with timeoutMillis = 0 ms.
> Apr 22, 2024 @ 15:56:24.110 [Producer 
> clientId=connector-producer-d094a5d7bbb046b99d62398cb84d648c-0] Proceeding to 
> force close the producer since pending requests could not be completed within 
> timeout 0 ms.
> Apr 22, 2024 @ 15:56:24.112 [Producer 
> clientId=connector-producer-d094a5d7bbb046b99d62398cb84d648c-0] Beginning 
> shutdown of Kafka producer I/O thread, sending remaining records.

[jira] [Resolved] (KAFKA-16656) Using a custom replication.policy.separator with DefaultReplicationPolicy

2024-05-20 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16656.
---
Resolution: Not A Bug

> Using a custom replication.policy.separator with DefaultReplicationPolicy
> -
>
> Key: KAFKA-16656
> URL: https://issues.apache.org/jira/browse/KAFKA-16656
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 3.5.1
>Reporter: Lenin Joseph
>Priority: Major
>
> Hi,
> In the case of bidirectional replication using mm2, when we tried using a 
> custom replication.policy.separator( ex: "-") with DefaultReplicationPolicy , 
> we see cyclic replication of topics. Could you confirm whether it's mandatory 
> to use a CustomReplicationPolicy whenever we want to use a separator other 
> than a "." ?
> Regards, 
> Lenin



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Apache Kafka 3.8.0 release

2024-05-15 Thread Chris Egerton
The KIP isn't officially accepted until its vote thread closes after being
open for at least three days. KIP-1028 won't make that deadline.

On Wed, May 15, 2024, 12:36 Ismael Juma  wrote:

> The KIP freeze is just about having the KIP accepted. Not sure why we would
> need an exception for that.
>
> Ismael
>
> On Wed, May 15, 2024 at 9:20 AM Chris Egerton 
> wrote:
>
> > FWIW I think that the low blast radius for KIP-1028 should allow it to
> > proceed without adhering to the usual KIP and feature freeze dates. Code
> > freeze is probably still worth respecting, at least if changes are
> > required to the docker/jvm/Dockerfile. But I defer to Josep's judgement
> as
> > the release manager.
> >
> > On Wed, May 15, 2024, 06:59 Vedarth Sharma 
> > wrote:
> >
> > > Hey Josep!
> > >
> > > The KIP 1028 has received the required votes. Voting thread:-
> > > https://lists.apache.org/thread/cdq4wfv5v1gpqlxnf46ycwtcwk5wos4q
> > > But we are keeping the vote open for 72 hours as per the process.
> > >
> > > I would like to request you to please consider it for the 3.8.0
> release.
> > >
> > > Thanks and regards,
> > > Vedarth
> > >
> > >
> > > On Wed, May 15, 2024 at 1:14 PM Josep Prat  >
> > > wrote:
> > >
> > > > Hi Kafka developers!
> > > >
> > > > Today is the KIP freeze deadline. All KIPs should be accepted by EOD
> > > today.
> > > > Tomorrow morning (CEST timezone) I'll start summarizing all KIPs that
> > > have
> > > > been approved. Please any KIP approved after tomorrow should be
> adopted
> > > in
> > > > a future release version, not 3.8.
> > > >
> > > > Other relevant upcoming deadlines:
> > > > - Feature freeze is on May 29th
> > > > - Code freeze is June 12th
> > > >
> > > > Best,
> > > >
> > > > On Fri, May 3, 2024 at 3:59 PM Josep Prat 
> wrote:
> > > >
> > > > > Hi Kafka developers!
> > > > > I just wanted to remind you all of the upcoming relevant dates for
> > > Kafka
> > > > > 3.8.0:
> > > > > - KIP freeze is on May 15th (this is in a little less than 2 weeks)
> > > > > - Feature freeze is on May 29th (this is in a little more than 25
> > > days).
> > > > >
> > > > > If there is a KIP you really want to have in the 3.8 series, now is
> > the
> > > > > time to make the last push. Once the deadline for KIP freeze is
> over
> > > I'll
> > > > > update the release plan with the final list of KIPs accepted and
> that
> > > may
> > > > > make it to the release.
> > > > >
> > > > > Best!
> > > > >
> > > > > On Wed, Mar 6, 2024 at 10:40 AM Josep Prat 
> > > wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> Thanks for your support. I updated the skeleton release plan
> created
> > > by
> > > > >> Colin. You can find it here:
> > > > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.8.0
> > > > >>
> > > > >> Our last release stumbled upon some problems while releasing and
> was
> > > > >> delayed by several weeks, so I won't try to shave some weeks from
> > our
> > > > plan
> > > > >> for 3.8.0 (we might end up having delays again). Please raise your
> > > > concerns
> > > > >> if you don't agree with the proposed dates.
> > > > >>
> > > > >> The current proposal on dates are:
> > > > >>
> > > > >>- KIP Freeze: *15th May *(Wednesday)
> > > > >>   - A KIP must be accepted by this date in order to be
> > considered
> > > > >>   for this release. Note, any KIP that may not be implemented
> > in a
> > > > week, or
> > > > >>   that might destabilize the release, should be deferred.
> > > > >>- Feature Freeze: *29th May *(Wednesday)
> > > > >>   - *major features merged & working on stabilisation, minor
> > > > >>   features have PR, release branch cut; anything not in this
> > state
> > > > will be
> > > > >>   automatically moved to the next release in JIRA*
>

Re: [DISCUSS] Apache Kafka 3.8.0 release

2024-05-15 Thread Chris Egerton
;> KIP-1019: Expose method to determine Metric Measurability
> > >>
> > >> Please review the plan and provide any additional information or
> updates
> > >> regarding KIPs that target this release version (3.8).
> > >> If you have authored any KIPs that have an inaccurate status in the
> > list,
> > >> or are not in the list and should be, or are in the list and should
> not
> > be
> > >> - please share it in this thread so that I can keep the document
> > accurate
> > >> and up to date.
> > >>
> > >> Looking forward to your feedback.
> > >>
> > >> Best,
> > >>
> > >> On Wed, Feb 28, 2024 at 10:07 AM Satish Duggana <
> > satish.dugg...@gmail.com>
> > >> wrote:
> > >>
> > >>> Thanks Josep, +1.
> > >>>
> > >>> On Tue, 27 Feb 2024 at 17:29, Divij Vaidya 
> > >>> wrote:
> > >>> >
> > >>> > Thank you for volunteering Josep. +1 from me.
> > >>> >
> > >>> > --
> > >>> > Divij Vaidya
> > >>> >
> > >>> >
> > >>> >
> > >>> > On Tue, Feb 27, 2024 at 9:35 AM Bruno Cadonna 
> > >>> wrote:
> > >>> >
> > >>> > > Thanks Josep!
> > >>> > >
> > >>> > > +1
> > >>> > >
> > >>> > > Best,
> > >>> > > Bruno
> > >>> > >
> > >>> > > On 2/26/24 9:53 PM, Chris Egerton wrote:
> > >>> > > > Thanks Josep, I'm +1 as well.
> > >>> > > >
> > >>> > > > On Mon, Feb 26, 2024 at 12:32 PM Justine Olshan
> > >>> > > >  wrote:
> > >>> > > >
> > >>> > > >> Thanks Joesp. +1 from me.
> > >>> > > >>
> > >>> > > >> On Mon, Feb 26, 2024 at 3:37 AM Josep Prat
> > >>>  > >>> > > >
> > >>> > > >> wrote:
> > >>> > > >>
> > >>> > > >>> Hi all,
> > >>> > > >>>
> > >>> > > >>> I'd like to volunteer as release manager for the Apache Kafka
> > >>> 3.8.0
> > >>> > > >>> release.
> > >>> > > >>> If there are no objections, I'll start building a release
> plan
> > >>> (or
> > >>> > > >> adapting
> > >>> > > >>> the one Colin made some weeks ago) in the wiki in the next
> > days.
> > >>> > > >>>
> > >>> > > >>> Thank you.
> > >>> > > >>>
> > >>> > > >>> --
> > >>> > > >>> [image: Aiven] <https://www.aiven.io>
> > >>> > > >>>
> > >>> > > >>> *Josep Prat*
> > >>> > > >>> Open Source Engineering Director, *Aiven*
> > >>> > > >>> josep.p...@aiven.io   |   +491715557497
> > >>> > > >>> aiven.io <https://www.aiven.io>   |   <
> > >>> > > >> https://www.facebook.com/aivencloud
> > >>> > > >>>>
> > >>> > > >>><https://www.linkedin.com/company/aiven/>   <
> > >>> > > >>> https://twitter.com/aiven_io>
> > >>> > > >>> *Aiven Deutschland GmbH*
> > >>> > > >>> Alexanderufer 3-7, 10117 Berlin
> > >>> > > >>> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > >>> > > >>> Amtsgericht Charlottenburg, HRB 209739 B
> > >>> > > >>>
> > >>> > > >>
> > >>> > > >
> > >>> > >
> > >>>
> > >>
> > >>
> > >> --
> > >> [image: Aiven] <https://www.aiven.io>
> > >>
> > >> *Josep Prat*
> > >> Open Source Engineering Director, *Aiven*
> > >> josep.p...@aiven.io   |   +491715557497
> > >> aiven.io <https://www.aiven.io>   |
> > >> <https://www.facebook.com/aivencloud>
> > >> <https://www.linkedin.com/company/aiven/>
> > >> <https://twitter.com/aiven_io>
> > >> *Aiven Deutschland GmbH*
> > >> Alexanderufer 3-7, 10117 Berlin
> > >> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > >> Amtsgericht Charlottenburg, HRB 209739 B
> > >>
> > >
> > >
> > > --
> > > [image: Aiven] <https://www.aiven.io>
> > >
> > > *Josep Prat*
> > > Open Source Engineering Director, *Aiven*
> > > josep.p...@aiven.io   |   +491715557497
> > > aiven.io <https://www.aiven.io>   |
> > > <https://www.facebook.com/aivencloud>
> > > <https://www.linkedin.com/company/aiven/>   <
> > https://twitter.com/aiven_io>
> > > *Aiven Deutschland GmbH*
> > > Alexanderufer 3-7, 10117 Berlin
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > Amtsgericht Charlottenburg, HRB 209739 B
> > >
> >
> >
> > --
> > [image: Aiven] <https://www.aiven.io>
> >
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io <https://www.aiven.io>   |   <
> https://www.facebook.com/aivencloud
> > >
> >   <https://www.linkedin.com/company/aiven/>   <
> > https://twitter.com/aiven_io>
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
>


Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-15 Thread Chris Egerton
> There is room in the API to add recommenders for "key.converter",
> "value.converter", and "header.converter", but not for transforms and
> predicates, as they include aliases that depend on an actual
> configuration. We could explicitly say we're going to do that, or do
> whatever is convenient during the implementation phase, or leave it
> open to be improved later.
> There will not be any recommenders for ".version" properties in the
> `/config` endpoint, because those recommenders are dynamic and depend
> on an actual configuration.
>
> 5) There are two relevant lines in the KIP: "If a .version property
> contains a hard requirement, select the latest installed version which
> satisfies the requirement." and "This configuration is re-evaluated
> each time the connector or task are assigned to a new worker". I would
> call this "eager" upgrade behavior, rather than a "sticky" or "lazy"
> upgrade behavior.
>
> 6) Updated!
>
> Thanks,
> Greg
>
> On Tue, May 14, 2024 at 9:14 AM Chris Egerton 
> wrote:
> >
> > Hi all,
> >
> > Thanks Greg for updating the KIP, and thanks Snehashis for starting the
> > work on this originally.
> >
> > The motivation section makes a pretty convincing case for this kind of
> > feature. My thoughts are mostly about specific details:
> >
> > 1) I like the support for version ranges (the example demonstrating how
> to
> > avoid KAFKA-10574 with the header converter was particularly
> entertaining),
> > but the configuration syntax for the most basic use case of specifying a
> > single desired version is pretty counterintuitive. People may get bitten
> or
> > at least frustrated if they put connector.version=3.8.0 in a config but
> > then version 3.7.5 ends up running. I'd like it if we could either
> > intentionally deviate from Maven ranges when a bare version is present,
> or
> > separate things out into two properties: foo.version would be the single
> > accepted version for the foo plugin, and foo.version.range would use
> Maven
> > range syntax. Open to other options too, just providing a couple to get
> the
> > ball rolling.
> >
> > 2) Although the current behavior for a worker with an invalid
> > key/value/header converter class specified in its config file is a little
> > strange (I was surprised to learn that it wouldn't fail on startup), I
> > don't see a good reason to deviate from this when an invalid version is
> > specified. Failing startup is drastic and has the potential to disrupt
> the
> > availability of connectors that would otherwise be able to run healthily
> > because they were explicitly configured to use valid converters instead
> of
> > the worker defaults.
> >
> > 3) Why are metrics utilized to report information about plugin versions
> > utilized by connectors at runtime instead of publishing this info in the
> > REST API? I saw that this was mentioned as a rejected alternative, but I
> > didn't get a sense of why. It seems like the REST API would be easier to
> > access and more intuitive for most users than new metrics.
> >
> > 4) In the "Validation" section it's stated that "Users can use these
> > recommenders to discover the valid plugin classes and versions, without
> > requiring an earlier call to GET
> /connector-plugins?connectorsOnly=false."
> > I really like the creativity and simplicity of reusing the recommender
> > mechanism to expose available versions for plugins via the REST API. I'm
> > unclear on whether or not it'll be possible to see this information via
> the
> > GET /connector-plugins/{plugin}/config endpoint, though. It'd be great if
> > this were supported, since we learned in KIP-769 [1] that people really
> > want to be able to see configuration options for connectors and their
> > plugins via some kind of GET endpoint without having to provide a
> complete
> > connector config for validation.
> >
> > 5) In the Maven version range docs, it's stated that "Maven picks the
> > highest version of each project that satisfies all the hard requirements
> of
> > the dependencies on that project." I'm guessing this behavior will be
> > retained for Connect; i.e., the highest-possible version of each plugin
> > that satisfies the user-specified version constraints will be run? (An
> > alternative approach could be to have some kind of "sticky" logic that
> only
> > restarts connectors/tasks when their currently-used version becomes
> > incompatible with the configured constraints.)

Re: [VOTE] KIP-1028: Docker Official Image for Apache Kafka

2024-05-14 Thread Chris Egerton
+1 (binding), thanks for the KIP!

On Tue, May 14, 2024, 12:13 Vedarth Sharma  wrote:

> Hi everyone,
>
> I'd like to call a vote on KIP-1028 which aims to introduce a JVM based
> Docker Official Image (DOI) for Apache Kafka.
>
> KIP -
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1028%3A+Docker+Official+Image+for+Apache+Kafka
>
> Discussion thread -
> https://lists.apache.org/thread/6vvwx173jcbgj6vqoq6bo8c0k0ntym0w
>
> Thanks and regards,
> Vedarth
>


Re: [DISCUSS] KIP-891: Running multiple versions of Connector plugins

2024-05-14 Thread Chris Egerton
Hi all,

Thanks Greg for updating the KIP, and thanks Snehashis for starting the
work on this originally.

The motivation section makes a pretty convincing case for this kind of
feature. My thoughts are mostly about specific details:

1) I like the support for version ranges (the example demonstrating how to
avoid KAFKA-10574 with the header converter was particularly entertaining),
but the configuration syntax for the most basic use case of specifying a
single desired version is pretty counterintuitive. People may get bitten or
at least frustrated if they put connector.version=3.8.0 in a config but
then version 3.7.5 ends up running. I'd like it if we could either
intentionally deviate from Maven ranges when a bare version is present, or
separate things out into two properties: foo.version would be the single
accepted version for the foo plugin, and foo.version.range would use Maven
range syntax. Open to other options too, just providing a couple to get the
ball rolling.
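(To make the surprise concrete, here's a toy sketch of the resolution semantics being described — class, method, and version data are all invented for illustration, this is not Connect code:)

```java
import java.util.Comparator;
import java.util.List;

public class VersionResolutionSketch {
    // Under Maven range semantics, a bare version like "3.8.0" is merely a *soft*
    // requirement: if it isn't installed, the latest installed version is chosen
    // anyway. "[3.8.0]" is a hard requirement pinning exactly that version.
    static String resolve(List<String> installed, String requested) {
        String latest = installed.stream().max(Comparator.naturalOrder()).orElseThrow();
        if (requested.startsWith("[") && requested.endsWith("]")) {
            String pinned = requested.substring(1, requested.length() - 1);
            // hard requirement: satisfied exactly, or not at all
            return installed.contains(pinned) ? pinned : null;
        }
        // soft requirement: a preference, silently ignored when unavailable
        return installed.contains(requested) ? requested : latest;
    }

    public static void main(String[] args) {
        List<String> installed = List.of("3.7.5");
        System.out.println(resolve(installed, "3.8.0"));   // 3.7.5 -- the surprising fallback
        System.out.println(resolve(installed, "[3.8.0]")); // null -- fails loudly instead
    }
}
```
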

2) Although the current behavior for a worker with an invalid
key/value/header converter class specified in its config file is a little
strange (I was surprised to learn that it wouldn't fail on startup), I
don't see a good reason to deviate from this when an invalid version is
specified. Failing startup is drastic and has the potential to disrupt the
availability of connectors that would otherwise be able to run healthily
because they were explicitly configured to use valid converters instead of
the worker defaults.

3) Why are metrics utilized to report information about plugin versions
utilized by connectors at runtime instead of publishing this info in the
REST API? I saw that this was mentioned as a rejected alternative, but I
didn't get a sense of why. It seems like the REST API would be easier to
access and more intuitive for most users than new metrics.

4) In the "Validation" section it's stated that "Users can use these
recommenders to discover the valid plugin classes and versions, without
requiring an earlier call to GET /connector-plugins?connectorsOnly=false."
I really like the creativity and simplicity of reusing the recommender
mechanism to expose available versions for plugins via the REST API. I'm
unclear on whether or not it'll be possible to see this information via the
GET /connector-plugins/{plugin}/config endpoint, though. It'd be great if
this were supported, since we learned in KIP-769 [1] that people really
want to be able to see configuration options for connectors and their
plugins via some kind of GET endpoint without having to provide a complete
connector config for validation.
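(For readers picturing how a version recommender could work: here's a standalone sketch that mimics the shape of Connect's ConfigDef.Recommender without depending on it — the property name, converter class, and installed-version data are invented:)

```java
import java.util.List;
import java.util.Map;

public class VersionRecommenderSketch {
    // Mirrors the shape of org.apache.kafka.common.config.ConfigDef.Recommender;
    // this is a sketch only, not the real interface.
    interface Recommender {
        List<Object> validValues(String name, Map<String, Object> parsedConfig);
        boolean visible(String name, Map<String, Object> parsedConfig);
    }

    // Invented data standing in for whatever versions a worker has installed.
    static final Map<String, List<Object>> INSTALLED = Map.of(
            "org.apache.kafka.connect.json.JsonConverter", List.of("3.7.0", "3.8.0"));

    // Recommends versions for a hypothetical "value.converter.version" property,
    // based on whichever converter class the submitted config currently names --
    // this is why such recommenders are dynamic and depend on an actual config.
    static final Recommender VERSION_RECOMMENDER = new Recommender() {
        @Override public List<Object> validValues(String name, Map<String, Object> config) {
            Object clazz = config.get("value.converter");
            return INSTALLED.getOrDefault(String.valueOf(clazz), List.of());
        }
        @Override public boolean visible(String name, Map<String, Object> config) {
            // only meaningful once a converter class has been chosen
            return config.get("value.converter") != null;
        }
    };

    public static void main(String[] args) {
        Map<String, Object> config =
                Map.of("value.converter", "org.apache.kafka.connect.json.JsonConverter");
        System.out.println(VERSION_RECOMMENDER.validValues("value.converter.version", config));
    }
}
```
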

5) In the Maven version range docs, it's stated that "Maven picks the
highest version of each project that satisfies all the hard requirements of
the dependencies on that project." I'm guessing this behavior will be
retained for Connect; i.e., the highest-possible version of each plugin
that satisfies the user-specified version constraints will be run? (An
alternative approach could be to have some kind of "sticky" logic that only
restarts connectors/tasks when their currently-used version becomes
incompatible with the configured constraints.)

6) (Nit) It'd be nice to add a link to the TestPlugins class or somewhere
in its neighborhood to the testing plan; unfamiliar readers probably won't
get much out of what's there right now.

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions

Cheers,

Chris

On Mon, May 13, 2024 at 2:01 PM Snehashis  wrote:

> Hi Greg,
>
> That is much appreciated. No complaints on the additional scope, I will
> make some time out to work on this once we have approval.
>
> Thanks
> Snehashis
>
> On Fri, May 10, 2024 at 9:28 PM Greg Harris 
> wrote:
>
> > Hey Snehashis,
> >
> > I'm glad to hear you're still interested in this KIP!
> > I'm happy to let you drive this, and I apologize for increasing the
> > scope of work so drastically. To make up for that, I'll volunteer to
> > be the primary PR reviewer to help get this done quickly once the KIP
> > is approved.
> >
> > Thanks,
> > Greg
> >
> >
> > On Fri, May 10, 2024 at 3:51 AM Snehashis 
> > wrote:
> > >
> > > Hi Greg,
> > >
> > > Thanks for the follow up to my original KIP, I am in favour of the
> > > additions made to expand its scope, the addition of range versions
> > > specifically make a lot of sense.
> > >
> > > Apologies if I have not publicly worked on this KIP for a long time.
> The
> > > original work was done when the move to service loading was in
> discussion
> > > and I wanted to loop back to this only after that work was completed.
> > Post
> > > its conclusion, I have not been able to take this up due to other
> > > priorities. If it's okay with you, I would still like to get this
> > > implemented myself, including the additional scope.
> > >
> > > Thanks and regards
> > > Snehashis
> > >
> > > On Fri, May 10, 2024 at 12:45 AM Greg Harris
> > 
> 

Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-05-14 Thread Chris Egerton
Hi Vedarth,

This looks great, thank you for helping me thoroughly understand the
motivation and benefits for the KIP. Looks good to me.

Regarding public interface for the images--everything in the "Public
Interface" section in KIP-975 does qualify as public interface for the
images IMO, but I don't think it's comprehensive. If we were asked to, for
example, change the port in the EXPOSE directive in our Dockerfile, that
would probably qualify as a change to public interface too. With the strict
language in the latest draft of this KIP that ensures that any functional
changes to our Docker images go through another follow-up KIP, we should be
fine without having to identify a comprehensive list of everything that
constitutes public interface for our images.

Cheers, and thanks again for the KIP,

Chris

On Mon, May 13, 2024 at 3:07 PM Vedarth Sharma 
wrote:

> Hey Chris,
>
> Once we provide the definitions to docker, they should take care of
> everything from there. They mentioned here
> <
> https://github.com/docker-library/official-images?tab=readme-ov-file#library-definition-files
> >
> that
> the image will be rebuilt when the base image is updated. Hence active
> rebuilds won't require any changes from our side.
> If we are packaging something which may contain a CVE, like some jar, then
> the onus will be on us to patch it, but it will be up to us whether we
> consider the threat severe enough to fix and when we want to provide the
> fixed version. Having Docker Official Image will not impact the frequency
> of our releases. It will be the Apache Kafka community's call on when a
> release goes and Docker Official Image will be released accordingly as per
> the KIP. source
> <https://github.com/docker-library/faq?tab=readme-ov-file#image-building>
>
> As mentioned in Docker's documentation as well "In essence we strive to
> heed upstream's recommendations on how they intend for their software to be
> consumed." source
> <
> https://github.com/docker-library/official-images?tab=readme-ov-file#what-are-official-images
> >
> Docker Official Image will rely on upstream's recommendation for
> functionality. But I do agree that since Docker's stance on this might
> change in future it makes sense to put a safeguard that will not allow any
> functionality changes get incorporated as part of the vetting process. I
> have updated the KIP to reflect the same.
>
> KIP-975 has a well defined public interface based on how configs can be
> supplied and how it can be used. I am not sure if we put that label on it
> during discussions. I am happy to have a separate email thread on it to
> iron things out.
>
> I hope this addresses all of your concerns!
>
> Thanks and regards,
> Vedarth
>
> On Mon, May 13, 2024 at 10:55 PM Chris Egerton 
> wrote:
>
> > Thanks both for your responses! Friendly reminder: again, better to
> provide
> > a quote instead of just a link :)
> >
> > I've seen a bit about image rebuilding to handle CVEs but I'm a little
> > unclear on how this would work in practice, and I couldn't find any
> > concrete details in any of the links. Does Docker do this automatically
> for
> > DOIs? Or will the onus be on us to put out patched images? Would this
> lead
> > to us putting out images more quickly than we put out standard releases?
> As
> > a plus, it does look like DOIs get the benefit of Docker Scout [1] for
> > free, which is nice, but it's still unclear who'd be doing the rest of
> the
> > work on that front.
> >
> > As far as this point from Vedarth goes:
> >
> > > By incorporating the source code of the Docker Official Image into our
> > > AK ecosystem, we gain control over its functionality, ensuring
> alignment
> > > with the OSS Docker image. This ensures a seamless experience for users
> > who
> > > may need to transition between these images.
> >
> > This captures my concern with the KIP pretty well. If there's any
> > significant divergence in behavior (not just build methodology) between
> the
> > apache/kafka image and what Docker requires for a Kafka DOI, how are we
> > going to vet these changes moving forward? Under the "Post Release
> Process
> > - if Dockerhub folks suggest changes to the Dockerfiles:" header, this
> KIP
> > proposes that we port all suggested changes for the DOI to
> > the docker/jvm/Dockerfile image, but this seems a bit too permissive. As
> an
> > alternative, we could state that all build-related changes can be done
> with
> > a PR on the apache/kafka GitHub repo (which will require approval from a
> > single committer), but any functional changes will require a KIP.

Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-05-14 Thread Chris Egerton
Hi Alieh,

Thank you for all the updates! One final question--how will the retry
timeout for unknown topic partition errors be implemented? I think it would
be best if this could be done with an implementation of the error handler,
but I don't see a way to track the necessary information with the
current ProducerExceptionHandler interface.
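(To sketch what per-record timeout tracking would even look like — assuming, hypothetically, the handler were invoked with the same record identity and a timestamp on every retry; all names here are invented, not the KIP's API:)

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RetryTimeoutSketch {
    enum Response { RETRY, FAIL }

    // Hypothetical handler: remembers the first failure time per record so that
    // RETRY turns into FAIL once timeoutMs has elapsed. This only works if the
    // handler sees the same record on every retry -- which is exactly the
    // information the current interface doesn't obviously provide.
    static class TimeoutTrackingHandler {
        private final long timeoutMs;
        private final Map<Object, Long> firstFailureMs = new ConcurrentHashMap<>();

        TimeoutTrackingHandler(long timeoutMs) { this.timeoutMs = timeoutMs; }

        Response handleUnknownTopicOrPartition(Object record, long nowMs) {
            long start = firstFailureMs.computeIfAbsent(record, r -> nowMs);
            if (nowMs - start < timeoutMs) {
                return Response.RETRY;
            }
            firstFailureMs.remove(record); // give up on this record; drop its state
            return Response.FAIL;
        }
    }

    public static void main(String[] args) {
        TimeoutTrackingHandler handler = new TimeoutTrackingHandler(60_000L);
        Object record = new Object();
        System.out.println(handler.handleUnknownTopicOrPartition(record, 0L));      // RETRY
        System.out.println(handler.handleUnknownTopicOrPartition(record, 30_000L)); // RETRY
        System.out.println(handler.handleUnknownTopicOrPartition(record, 61_000L)); // FAIL
    }
}
```
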

Cheers,

Chris

On Tue, May 14, 2024 at 9:10 AM Alieh Saeedi 
wrote:

> Thanks Andrew. Done :)
>
> @Chris: I changed the config parameter type from boolean to integer, which
> defines the timeout for retrying. I thought reusing `max.block.ms` was not
> reasonable as you mentioned.
>
> So if the KIP looks good, let's skip to the good part ;-) VOTING :)
>
> Bests,
> Alieh
>
>
>
>
>
> On Tue, May 14, 2024 at 12:26 PM Andrew Schofield <
> andrew_schofi...@live.com>
> wrote:
>
> > Hi Alieh,
> > Just one final comment.
> >
> > [AJS5] Existing classes use Retriable, not Retryable. For example:
> >
> >
> https://kafka.apache.org/21/javadoc/org/apache/kafka/common/errors/RetriableException.html
> >
> > I suggest RetriableResponse and NonRetriableResponse.
> >
> > Thanks,
> > Andrew
> >
> > > On 13 May 2024, at 23:17, Alieh Saeedi 
> > wrote:
> > >
> > > Hi all,
> > >
> > >
> > > Thanks for all the valid points you listed.
> > >
> > >
> > > KIP updates and addressing concerns:
> > >
> > >
> > > 1) The KIP now suggests two Response types: `RetryableResponse` and
> > > `NonRetryableResponse`
> > >
> > >
> > > 2) `custom.exception.handler` is changed to
> > `custom.exception.handler.class`
> > >
> > >
> > > 3) The KIP clarifies that `In the case of an implemented handler for
> the
> > > specified exception, the handler takes precedence.`
> > >
> > >
> > > 4)  There is now a `default` implementation for both handle() methods.
> > >
> > >
> > > 5)  @Chris: for `UnknownTopicOrPartition`, the default is already
> > retrying
> > > for 60s. (In fact, the default value of `max.block.ms`). If the
> handler
> > > instructs to FAIL or SWALLOW, there will be no retry, and if the
> handler
> > > instructs to RETRY, that will be the default behavior, which follows
> the
> > > values in already existing config parameters such as `max.block.ms`.
> > Does
> > > that make sense?
> > >
> > >
> > > Hope the changes and explanations are convincing :)
> > >
> > >
> > > Cheers,
> > >
> > > Alieh
> > >
> > > On Mon, May 13, 2024 at 6:40 PM Justine Olshan
> > 
> > > wrote:
> > >
> > >> Oh I see. The type isn't the error type but a newly defined type for
> the
> > >> response. Makes sense and works for me.
> > >>
> > >> Justine
> > >>
> > >> On Mon, May 13, 2024 at 9:13 AM Chris Egerton <
> fearthecel...@gmail.com>
> > >> wrote:
> > >>
> > >>> If we have dedicated methods for each kind of exception
> > >>> (handleRecordTooLarge, handleUnknownTopicOrPartition, etc.), doesn't
> > that
> > >>> provide sufficient constraint? I'm not suggesting we eliminate these
> > >>> methods, just that we change their return types to something more
> > >> flexible.
> > >>>
> > >>> On Mon, May 13, 2024, 12:07 Justine Olshan
> >  > >>>
> > >>> wrote:
> > >>>
> > >>>> I'm not sure I agree with the Retriable and NonRetriableResponse
> > >> comment.
> > >>>> This doesn't limit the blast radius or enforce certain errors are
> > used.
> > >>>> I think we might disagree on how controlled these interfaces can
> be...
> > >>>>
> > >>>> Justine
> > >>>>
> > >>>> On Mon, May 13, 2024 at 8:40 AM Chris Egerton
>  > >>>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Alieh,
> > >>>>>
> > >>>>> Thanks for the updates! I just have a few more thoughts:
> > >>>>>
> > >>>>> - I don't think a boolean property is sufficient to dictate retries
> > >> for
> > >>>>> unknown topic partitions, though. These errors can occur if a topic
> > >> has
> > >>>>> just been created, which can occur if, for example, automatic topic
> > >>>>> creation is enabled for a multi-task connector.

Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-05-13 Thread Chris Egerton
- Functionally, both Docker images will remain identical.
> >- The variance lies primarily in the methodologies of building and
> >validation, as outlined in the updated KIP.
> >
> >
> > 3) What I suggested last time was not a separate apache/apache-docker
> > > repository, but a repository controlled entirely by Docker. The DOI
> docs
> > > [1] state that "While it's preferable to have upstream software authors
> > > maintaining their Docker Official Images, this isn't a strict
> > requirement",
> > > which I take to mean that it's not required for an Apache Kafka DOI to
> > live
> > > under the apache organization on GitHub. It also seems like there's
> > > precedent for this: images for MySQL [2] and PHP [3] already exist
> under
> > > the control of Docker. The reason I think this is worth considering is
> > that
> > > Docker can arbitrarily change the eligibility requirements for their
> > > official images at any time, and it doesn't seem like there's a clear
> > > process in the KIP for judging how we should respond to these changes
> (in
> > > fact, it seems like the idea in the KIP is that we should make any
> change
> > > required with no further vetting beyond possibly a pull request on
> > > apache/kafka, which would require approval by a committer). By hosting
> > the
> > > DOI definitions ourselves (either in apache/kafka, or in a theoretical
> > > apache/docker-kafka repository), we take responsibility for the image,
> > even
> > > if the owner on Docker Hub is Docker, not Apache. If the code lives
> > > elsewhere, then (as long as basic trademark and possibly security
> > > guidelines are respected) Apache doesn't have to concern itself at all
> > with
> > > the image and the maintainers are free to make whatever changes they
> want
> > > to it in order to meet Docker's requirements.
> >
> >
> >- By incorporating the source code of the Docker Official Image into
> our
> >AK ecosystem, we gain control over its functionality, ensuring
> alignment
> >with the OSS Docker image. This ensures a seamless experience for
> users
> > who
> >may need to transition between these images.
> >- Maintaining both images within the same community facilitates ease
> of
> >management and fosters a singular source of truth.
> >- While Apache may not retain ownership of the hosted Docker Official
> >Image, we are, in essence, providing Docker with a foundation that
> > aligns
> >with their established guidelines as well as remains consistent with
> OSS
> >Docker Image apache/kafka.
> >- Any future alterations to the functionality can be seamlessly
> >propagated across both the OSS and Official Docker Images.
> >
> > I think these reasons must be why a lot of the Apache projects choose to
> > host the docker definitions themselves. The responsibility of owning the
> > definitions comes with benefits as well that we should also consider.
> >
> > Let us know if you have any further questions!
> >
> > Thanks and regards,
> > Vedarth
> >
> > On Fri, May 10, 2024 at 7:56 PM Chris Egerton 
> > wrote:
> >
> > > Hi Krish and Prabha,
> > >
> > > Thanks for your replies. I still have some follow-up questions:
> > >
> > > 1) I read https://docs.docker.com/trusted-content/official-images/ and
> > > didn't find anything on that page or immediately around it that
> explains
> > > what compliance requirements might be satisfied by a DOI that couldn't
> be
> > > satisfied by the existing apache/kafka image. Can you elaborate? Feel
> > free
> > > to provide another link, but please also quote the relevant sections
> from
> > > it (as StackOverflow likes to say, links can grow stale over time).
> > >
> > > 2) It would be great to see a brief summary of the differences in these
> > > images included in the KIP, in order to try to gauge how this would
> look
> > to
> > > our users.
> > >
> > > 3) What I suggested last time was not a separate apache/apache-docker
> > > repository, but a repository controlled entirely by Docker. The DOI
> docs
> > > [1] state that "While it's preferable to have upstream software authors
> > > maintaining their Docker Official Images, this isn't a strict
> > requirement",
> > > which I take to mean that it's not required for an Apache Kafka DOI to
> > live
> > > under the apache organization on GitHub.

Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-05-13 Thread Chris Egerton
If we have dedicated methods for each kind of exception
(handleRecordTooLarge, handleUnknownTopicOrPartition, etc.), doesn't that
provide sufficient constraint? I'm not suggesting we eliminate these
methods, just that we change their return types to something more flexible.
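(A rough sketch of that compromise — dedicated per-exception methods retained, return types shared rather than one enum per method; all names are hypothetical, not the KIP's final API:)

```java
public class HandlerApiSketch {
    // Hypothetical shared response types, reusable across handle methods:
    enum RetriableResponse { FAIL, SWALLOW, RETRY }
    enum NonRetriableResponse { FAIL, SWALLOW } // no RETRY: retrying cannot succeed

    interface ProducerExceptionHandler extends java.io.Closeable {
        // Dedicated per-exception methods still constrain which errors are
        // handleable, so adding new cases means changing the public API (a KIP)...
        default NonRetriableResponse handleRecordTooLarge(Exception e) {
            return NonRetriableResponse.FAIL;
        }
        default RetriableResponse handleUnknownTopicOrPartition(Exception e) {
            return RetriableResponse.RETRY;
        }
        @Override default void close() {}
    }

    public static void main(String[] args) {
        // ...while the shared response types avoid a new enum for every method.
        ProducerExceptionHandler swallowOversized = new ProducerExceptionHandler() {
            @Override public NonRetriableResponse handleRecordTooLarge(Exception e) {
                return NonRetriableResponse.SWALLOW;
            }
        };
        System.out.println(swallowOversized.handleRecordTooLarge(new RuntimeException()));
        System.out.println(swallowOversized.handleUnknownTopicOrPartition(new RuntimeException()));
    }
}
```
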

On Mon, May 13, 2024, 12:07 Justine Olshan 
wrote:

> I'm not sure I agree with the Retriable and NonRetriableResponse comment.
> This doesn't limit the blast radius or enforce certain errors are used.
> I think we might disagree on how controlled these interfaces can be...
>
> Justine
>
> On Mon, May 13, 2024 at 8:40 AM Chris Egerton 
> wrote:
>
> > Hi Alieh,
> >
> > Thanks for the updates! I just have a few more thoughts:
> >
> > - I don't think a boolean property is sufficient to dictate retries for
> > unknown topic partitions, though. These errors can occur if a topic has
> > just been created, which can occur if, for example, automatic topic
> > creation is enabled for a multi-task connector. This is why I proposed a
> > timeout instead of a boolean (and see my previous email for why reducing
> > max.block.ms for a producer is not a viable alternative). If it helps,
> one
> > way to reproduce this yourself is to add the line
> > `fooProps.put(TASKS_MAX_CONFIG, "10");` to the integration test here:
> >
> >
> https://github.com/apache/kafka/blob/5439914c32fa00d634efa7219699f1bc21add839/connect/runtime/src/test/java/org/apache/kafka/connect/integration/SourceConnectorsIntegrationTest.java#L134
> > and then check the logs afterward for messages like "Error while fetching
> > metadata with correlation id  :
> {foo-topic=UNKNOWN_TOPIC_OR_PARTITION}".
> >
> > - I also don't think we need custom XxxResponse enums for every possible
> > method; it seems like this will lead to a lot of duplication and
> cognitive
> > overhead if we want to expand the error handler in the future. Something
> > more flexible like RetriableResponse and NonRetriableResponse could
> > suffice.
> >
> > - Finally, the KIP still doesn't state how the handler will or won't take
> > precedence over existing retry properties. If I set `retries` or `
> > delivery.timeout.ms` or `max.block.ms` to low values, will that cause
> > retries to cease even if my custom handler would otherwise keep returning
> > RETRY for an error?
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, May 13, 2024 at 11:02 AM Andrew Schofield <
> > andrew_schofi...@live.com>
> > wrote:
> >
> > > Hi Alieh,
> > > Just a few more comments on the KIP. It is looking much less risky now
> > the
> > > scope
> > > is tighter.
> > >
> > > [AJS1] It would be nice to have default implementations of the handle
> > > methods
> > > so an implementor would not need to implement both themselves.
> > >
> > > [AJS2] Producer configurations which are class names usually end in
> > > “.class”.
> > > I suggest “custom.exception.handler.class”.
> > >
> > > [AJS3] If I implemented a handler, and I set a non-default value for
> one
> > > of the
> > > new configuations, what happens? I would expect that the handler takes
> > > precedence. I wasn’t quite clear what “the control will follow the
> > handler
> > > instructions” meant.
> > >
> > > [AJS4] Because you now have an enum for the
> > > RecordTooLargeExceptionResponse,
> > > I don’t think you need to state in the comment for
> > > ProducerExceptionHandler that
> > > RETRY will be interpreted as FAIL.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > > On 13 May 2024, at 14:53, Alieh Saeedi  >
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > >
> > > > Thanks for the very interesting discussion during my PTO.
> > > >
> > > >
> > > > KIP updates and addressing concerns:
> > > >
> > > >
> > > > 1) Two handle() methods are defined in ProducerExceptionHandler for
> the
> > > two
> > > > exceptions with different input parameters so that we have
> > > > handle(RecordTooLargeException e, ProducerRecord record) and
> > > > handle(UnknownTopicOrPartitionException e, ProducerRecord record)
> > > >
> > > >
> > > > 2) The ProducerExceptionHandler extends `Closeable` as well.
> > > >
> > > >
> > > > 3) The KIP suggests having two more configuration parameters w

Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-05-13 Thread Chris Egerton
ter chance that whoever tries to add new cases pauses
> >> and thinks a bit more.
> >> 3. As Justine pointed out, having different method will be a forcing
> >> function to go through a KIP rather than smuggle new cases through
> >> implementation.
> >> 4. Sort of a consequence of the previous 3 -- all those things reduce
> the
> >> chance of someone writing the code that works with 2 errors and then
> when
> >> more errors are added in the future will suddenly incorrectly ignore new
> >> errors (the example I gave in the previous email).
> >>
> >>> [AL2 cont.] Similar to AL1, I see such a handler to some extend as
> >> business logic. If a user puts a bad filter condition in their KS app,
> and
> >> drops messages
> >>
> >> I agree that there is always a chance to get a bug and lose messages,
> but
> >> there are generally separation of concerns that has different risk
> profile:
> >> the filtering logic may be more rigorously tested and rarely changed
> (say
> >> an application developer does it), but setting the topics to produce
> may be
> >> done via configuration (e.g. a user of the application does it) and it's
> >> generally an expectation that users would get an error when
> configuration
> >> is incorrect.
> >>
> >> What could be worse is that UnknownTopicOrPartitionException can be an
> >> intermittent error, i.e. with a generally correct configuration, there
> >> could be metadata propagation problem on the cluster and then a random
> set
> >> of records could get lost.
> >>
> >>> [AL3] Maybe I misunderstand what you are saying, but to me, checking
> the
> >> size of the record upfront is exactly what the KIP proposes? No?
> >>
> >> It achieves the same result but solves it differently, my proposal:
> >>
> >> 1. Application checks the validity of a record (maybe via a new
> >> validateRecord method) before producing it, and can just exclude it or
> >> return an error to the user.
> >> 2. Application produces the record -- at this point there are no records
> >> that could return record too large, they were either skipped at step 1
> or
> >> we didn't get here because step 1 failed.
> >>
> >> Vs. KIP's proposal
> >>
> >> 1. Application produces the record.
> >> 2. Application gets a callback.
> >> 3. Application returns the action on how to proceed.
> >>
> >> The advantage of the former is the clarity of semantics -- the record is
> >> invalid (property of the record, not a function of server state or
> server
> >> configuration) and we can clearly know that it is the record that is bad
> >> and can never succeed.
> >>
> >> The KIP-proposed way actually has a very tricky point: it actually
> handles
> >> a subset of record-too-large exceptions.  The broker can return
> >> record-too-large and reject the whole batch (but we don't want to ignore
> >> those because then we can skip random records that just happened to be
> in
> >> the same batch), in some sense we use the same error for 2 different
> >> conditions and understanding that requires pretty deep understanding of
> >> Kafka internals.
> >>
> >> -Artem
> >>
> >>
> >> On Wed, May 8, 2024 at 9:47 AM Justine Olshan
>  >>>
> >> wrote:
> >>
> >>> My concern with respect to it being fragile: the code that ensures the
> >>> error type is internal to the producer. Someone may see it and say, I
> >> want
> >>> to add such and such error. This looks like internal code, so I don't
> >> need
> >>> a KIP, and then they can change it to whatever they want thinking it is
> >>> within the typical kafka improvement protocol.
> >>>
> >>> Relying on an internal change to enforce an external API is fragile in
> my
> >>> opinion. That's why I sort of agreed with Artem with enforcing the
> error
> >> in
> >>> the method signature -- part of the public API.
> >>>
> >>> Chris's comments on requiring more information to handler again makes
> me
> >>> wonder if we are solving a problem of lack of information at the
> >>> application level with a more powerful solution than we need. (Ie, if
> we
> >>> had more information, could the application close and restart the
> >>> transaction rather than having to drop records) But 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
 Done

On Fri, May 10, 2024, 10:55 Mario Fiore Vitale  wrote:

> Thanks a lot! I have just a minor comment, should we also update the title
> to be more generic since now it's also related to other SMTs?
>
> On Fri, May 10, 2024 at 4:44 PM Chris Egerton 
> wrote:
>
> > I've finished updating the KIP; @Mario, please let me know what you
> think.
> >
> > On Fri, May 10, 2024 at 10:26 AM Chris Egerton  wrote:
> >
> > > I can do it :)
> > >
> > > On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale <
> mvit...@redhat.com>
> > > wrote:
> > >
> > >> Yes, I agree. Unfortunately due to the issue of the creation of a new
> > >> account for the WIKI, I asked Mickael Maison to create the KIP for me.
> > >>
> > >> I'll ask him to update the KIP. Do you have other alternatives?
> > >>
> > >> Thanks,
> > >> Mario.
> > >>
> > >> On Fri, May 10, 2024 at 3:40 PM Chris Egerton  >
> > >> wrote:
> > >>
> > >> > Yes, I think we should just do one KIP for all the SMTs. You don't
> > have
> > >> to
> > >> > implement everything all at once or by yourself, but I don't see why
> > we
> > >> > should require one or more follow-up KIPs to apply the exact same
> > >> changes
> > >> > to the SMTs we missed the first time.
> > >> >
> > >> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale <
> > mvit...@redhat.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Chris,
> > >> > >
> > >> > > Thanks for the survey. Do you think I need to update the KIP to
> put
> > >> all
> > >> > of
> > >> > > these?
> > >> > >
> > >> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton
> >  > >> >
> > >> > > wrote:
> > >> > >
> > >> > > > After doing a brief survey of the SMTs that ship with Connect,
> it
> > >> seems
> > >> > > > like these would also benefit:
> > >> > > >
> > >> > > > - HeaderFrom, which populates record headers with subfields of
> > >> > > keys/values
> > >> > > > [1]
> > >> > > > - Cast, which performs type transformation on subfields of
> > >> keys/values
> > >> > > [2]
> > >> > > > - SetSchemaMetadata, which (when the record key/value is a
> struct)
> > >> > copies
> > >> > > > fields from the input struct to the output struct (which uses a
> > >> > different
> > >> > > > schema) [3]
> > >> > > > - TimestampConverter, which does similar input/output field
> > copying
> > >> to
> > >> > > > SetSchemaMetadata [4]
> > >> > > > - ReplaceField, which does similar input/output field copying to
> > >> > > > SetSchemaMetadata and TimestampConverter
> > >> > > >
> > >> > > > [1] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > >> > > > [2] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > >> > > > [3] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > >> > > > [4] -
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > >> > > > [5] -
> > >> &

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
I've finished updating the KIP; @Mario, please let me know what you think.

On Fri, May 10, 2024 at 10:26 AM Chris Egerton  wrote:

> I can do it :)
>
> On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale 
> wrote:
>
>> Yes, I agree. Unfortunately due to the issue of the creation of a new
>> account for the WIKI, I asked Mickael Maison to create the KIP for me.
>>
>> I'll ask him to update the KIP. Do you have other alternatives?
>>
>> Thanks,
>> Mario.
>>
>> On Fri, May 10, 2024 at 3:40 PM Chris Egerton 
>> wrote:
>>
>> > Yes, I think we should just do one KIP for all the SMTs. You don't have
>> to
>> > implement everything all at once or by yourself, but I don't see why we
>> > should require one or more follow-up KIPs to apply the exact same
>> changes
>> > to the SMTs we missed the first time.
>> >
>> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
>> > wrote:
>> >
>> > > Hi Chris,
>> > >
>> > > Thanks for the survey. Do you think I need to update the KIP to put
>> all
>> > of
>> > > these?
>> > >
>> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton > >
>> > > wrote:
>> > >
>> > > > After doing a brief survey of the SMTs that ship with Connect, it
>> seems
>> > > > like these would also benefit:
>> > > >
>> > > > - HeaderFrom, which populates record headers with subfields of
>> > > keys/values
>> > > > [1]
>> > > > - Cast, which performs type transformation on subfields of
>> keys/values
>> > > [2]
>> > > > - SetSchemaMetadata, which (when the record key/value is a struct)
>> > copies
>> > > > fields from the input struct to the output struct (which uses a
>> > different
>> > > > schema) [3]
>> > > > - TimestampConverter, which does similar input/output field copying
>> to
>> > > > SetSchemaMetadata [4]
>> > > > - ReplaceField, which does similar input/output field copying to
>> > > > SetSchemaMetadata and TimestampConverter
>> > > >
>> > > > [1] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
>> > > > [2] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
>> > > > [3] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
>> > > > [4] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
>> > > > [5] -
>> > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
>> > > >
>> > > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale <
>> mvit...@redhat.com>
>> > > > wrote:
>> > > >
>> > > > > Hi Chris,
>> > > > >
>> > > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
>> > > > > Yes, also that one can be affected.
>> > > > >
>> > > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Wait, just one more thing--are there any other SMTs that could
>> > > benefit
>> > > > > from
>> > > > > > this? Wouldn't ValueToKey [1] be applicable as well, for
>> example?
>> > > > > >
>> > > > > > [1] -
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
I can do it :)

On Fri, May 10, 2024 at 10:02 AM Mario Fiore Vitale 
wrote:

> Yes, I agree. Unfortunately due to the issue of the creation of a new
> account for the WIKI, I asked Mickael Maison to create the KIP for me.
>
> I'll ask him to update the KIP. Do you have other alternatives?
>
> Thanks,
> Mario.
>
> On Fri, May 10, 2024 at 3:40 PM Chris Egerton 
> wrote:
>
> > Yes, I think we should just do one KIP for all the SMTs. You don't have
> to
> > implement everything all at once or by yourself, but I don't see why we
> > should require one or more follow-up KIPs to apply the exact same changes
> > to the SMTs we missed the first time.
> >
> > On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
> > wrote:
> >
> > > Hi Chris,
> > >
> > > Thanks for the survey. Do you think I need to update the KIP to put all
> > of
> > > these?
> > >
> > > On Thu, May 9, 2024 at 4:14 PM Chris Egerton 
> > > wrote:
> > >
> > > > After doing a brief survey of the SMTs that ship with Connect, it
> seems
> > > > like these would also benefit:
> > > >
> > > > - HeaderFrom, which populates record headers with subfields of
> > > keys/values
> > > > [1]
> > > > - Cast, which performs type transformation on subfields of
> keys/values
> > > [2]
> > > > - SetSchemaMetadata, which (when the record key/value is a struct)
> > copies
> > > > fields from the input struct to the output struct (which uses a
> > different
> > > > schema) [3]
> > > > - TimestampConverter, which does similar input/output field copying
> to
> > > > SetSchemaMetadata [4]
> > > > - ReplaceField, which does similar input/output field copying to
> > > > SetSchemaMetadata and TimestampConverter
> > > >
> > > > [1] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > > > [2] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > > > [3] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > > > [4] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > > > [5] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> > > >
> > > > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale <
> mvit...@redhat.com>
> > > > wrote:
> > > >
> > > > > Hi Chris,
> > > > >
> > > > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > > > > Yes, also that one can be affected.
> > > > >
> > > > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton
>  > >
> > > > > wrote:
> > > > >
> > > > > > Wait, just one more thing--are there any other SMTs that could
> > > benefit
> > > > > from
> > > > > > this? Wouldn't ValueToKey [1] be applicable as well, for example?
> > > > > >
> > > > > > [1] -
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > > > > >
> > > > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton 
> > > wrote:
> > > > > >
> > > > > > > Hi Mario,
> > > > > > >
> > > > > > > I think we could have something like `copy` and
> > > `copyWithoutDefaults`
> > > > > to
> > > > > > > get around that, but now t

Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-05-10 Thread Chris Egerton
the
> >> > top 1 search result, irrespective of the number of downloads."
> Wouldn't
> >> a
> >> > high number of downloads for an image naturally follow from being the
> >> top
> >> > search result? It seems like we can't necessarily assume that Docker
> >> > Official Images are inherently more desirable for users based solely
> on
> >> > download statistics.
> >> >
> >>
> >> *My thoughts: *Unlike the Sponsored OSS image, the Docker Official image
> >> is
> >> more desirable for workloads that have stringent compliance
> requirements.
> >> More details on why official images are more trusted are documented here
> >> <https://docs.docker.com/trusted-content/official-images/>. The Docker
> >> Official image would also help an absolutely new Kafka beginner who
> might
> >> not know about Apache or the concept of Sponsored images. We want to
> make
> >> it easier for Kafka beginners to discover the Kafka image through
> >> DockerHub.
> >>
> >>
> >> Can you elaborate on the value that these new images would add from a
> >> > user's perspective? I'm hesitant to introduce another image, since it
> >> adds
> >> > to the cognitive burden of people who will inevitably have to answer
> the
> >> > question of "What are the differences between all of the available
> >> images
> >> > and which one is best for my use case?"
> >> >
> >>
> >>
> >> *My thoughts: *This is a valid concern to address. The response to the
> >> above question addresses the value-add this new Docker Official image
> >> would
> >> provide. I also agree we need a clear distinction between each of these
> >> images to be well documented. We plan to update the AK website with
> >> details
> >> on how, why, and when a developer would want to use each of these
> >> particular images(KIP-974,975,1028).
> >>
> >> Thanks,
> >> Prabha.
> >>
> >>
> >>
> >>
> >>
> >> On Tue, Apr 30, 2024 at 9:41 PM Chris Egerton 
> >> wrote:
> >>
> >> > Hi Vedarth and Krish,
> >> >
> >> > Thanks for the KIP! I have to admit I'm a little skeptical; hopefully
> >> you
> >> > can help me understand the need for these additional images.
> >> >
> >> > 1) In the motivation section it's stated that "Several other Apache
> >> > projects, like Flink, Spark, Solr, have already released Docker
> Official
> >> > Images, with download figures ranging from 50 million to over 1
> billion.
> >> > These numbers highlight the significant demand among users." But then
> >> > immediately afterwards, we learn that "Also the Docker Official Images
> >> are
> >> > always the top 1 search result, irrespective of the number of
> >> downloads."
> >> > Wouldn't a high number of downloads for an image naturally follow from
> >> > being the top search result? It seems like we can't necessarily assume
> >> that
> >> > Docker Official Images are inherently more desirable for users based
> >> solely
> >> > on download statistics.
> >> >
> >> > 2) Can you elaborate on the value that these new images would add
> from a
> >> > user's perspective? I'm hesitant to introduce another image, since it
> >> adds
> >> > to the cognitive burden of people who will inevitably have to answer
> the
> >> > question of "What are the differences between all of the available
> >> images
> >> > and which one is best for my use case?"
> >> >
> >> > 3) Would a separate Docker-owned repository be out of the question?
> I'm
> >> > guessing there are some trademark issues that might get in the way,
> but
> >> > it's worth exploring since the entire purpose of this KIP seems to be
> to
> >> > provide images that are vetted and designed by Docker more than by the
> >> > Apache Kafka contributors/committers/PMC.
> >> >
> >> > I may have more questions later but wanted to get this initial round
> out
> >> > now without trying to list everything first.
> >> >
> >> > Looking forward to your thoughts!
> >> >
> >> > Cheers,
> >> >
> >> > Chris
> >> >
> >> > On Mon, Apr 22, 2024 a

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-10 Thread Chris Egerton
Yes, I think we should just do one KIP for all the SMTs. You don't have to
implement everything all at once or by yourself, but I don't see why we
should require one or more follow-up KIPs to apply the exact same changes
to the SMTs we missed the first time.

On Fri, May 10, 2024 at 5:20 AM Mario Fiore Vitale 
wrote:

> Hi Chris,
>
> Thanks for the survey. Do you think I need to update the KIP to put all of
> these?
>
> On Thu, May 9, 2024 at 4:14 PM Chris Egerton 
> wrote:
>
> > After doing a brief survey of the SMTs that ship with Connect, it seems
> > like these would also benefit:
> >
> > - HeaderFrom, which populates record headers with subfields of
> keys/values
> > [1]
> > - Cast, which performs type transformation on subfields of keys/values
> [2]
> > - SetSchemaMetadata, which (when the record key/value is a struct) copies
> > fields from the input struct to the output struct (which uses a different
> > schema) [3]
> > - TimestampConverter, which does similar input/output field copying to
> > SetSchemaMetadata [4]
> > - ReplaceField, which does similar input/output field copying to
> > SetSchemaMetadata and TimestampConverter
> >
> > [1] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
> > [2] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
> > [3] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
> > [4] -
> >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
> > [5] -
> >
> >
> https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
> >
> > On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale 
> > wrote:
> >
> > > Hi Chris,
> > >
> > > > Wouldn't ValueToKey [1] be applicable as well, for example?
> > > Yes, also that one can be affected.
> > >
> > > On Wed, May 8, 2024 at 5:59 PM Chris Egerton 
> > > wrote:
> > >
> > > > Wait, just one more thing--are there any other SMTs that could
> benefit
> > > from
> > > > this? Wouldn't ValueToKey [1] be applicable as well, for example?
> > > >
> > > > [1] -
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> > > >
> > > > On Wed, May 8, 2024 at 11:46 AM Chris Egerton 
> wrote:
> > > >
> > > > > Hi Mario,
> > > > >
> > > > > I think we could have something like `copy` and
> `copyWithoutDefaults`
> > > to
> > > > > get around that, but now that you bring up compatibility, I think
> > it's
> > > > best
> > > > > to hold off on this. I'm forced to recall that anything we add to
> the
> > > > > Connect API may be used by plugin developers who write for the
> > bleeding
> > > > > edge of the Connect runtime, but deployed by users who are running
> on
> > > > > (possibly much) older versions. In that scenario, any use of new
> > Struct
> > > > > methods would cause issues at runtime caused by compatibility
> clashes
> > > > > between the newer API that the plugin was written for, and the
> older
> > > API
> > > > > that's provided by the runtime it's running on.
> > > > >
> > > > > Anyway, thanks for humoring me. The KIP looks good to me 
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale <
> > mvit...@redhat.com
> > > >
> > > > > wrote:
> > > > >
> > > > >> Hi Chris,
> > > > >>
> > > > >> Thanks for reviewing this.
> > > > >>
> > > > >> > It seems like the pattern of "copy 

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-09 Thread Chris Egerton
After doing a brief survey of the SMTs that ship with Connect, it seems
like these would also benefit:

- HeaderFrom, which populates record headers with subfields of keys/values
[1]
- Cast, which performs type transformation on subfields of keys/values [2]
- SetSchemaMetadata, which (when the record key/value is a struct) copies
fields from the input struct to the output struct (which uses a different
schema) [3]
- TimestampConverter, which does similar input/output field copying to
SetSchemaMetadata [4]
- ReplaceField, which does similar input/output field copying to
SetSchemaMetadata and TimestampConverter

[1] -
https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/HeaderFrom.java#L143
[2] -
https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java#L178
[3] -
https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/SetSchemaMetadata.java#L174
[4] -
https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/TimestampConverter.java#L420
[5] -
https://github.com/apache/kafka/blob/f4fdaa702a2e718bdb44b9c5fec254f32a33f0d8/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ReplaceField.java#L183
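
[Editor's note: the field-copying pattern shared by the SMTs surveyed above is what makes them sensitive to default substitution. The sketch below is a simplified stand-in, not the actual Connect API: `MiniStruct` only models the relevant contract of Connect's `Struct`, where `get()` substitutes the schema default for a null field while `getWithoutDefault()` returns the raw value.]

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Connect's Struct (illustrative, not the real API):
// get() substitutes the schema default when a field's value is null,
// getWithoutDefault() returns the raw (possibly null) value.
class MiniStruct {
    private final Map<String, Object> values = new HashMap<>();
    private final Map<String, Object> defaults = new HashMap<>();

    MiniStruct withDefault(String field, Object def) { defaults.put(field, def); return this; }
    MiniStruct put(String field, Object value) { values.put(field, value); return this; }

    Object get(String field) {
        Object v = values.get(field);
        return v != null ? v : defaults.get(field); // default substitution
    }

    Object getWithoutDefault(String field) {
        return values.get(field); // preserves an explicit null
    }
}

public class CopySemantics {
    public static void main(String[] args) {
        MiniStruct source = new MiniStruct()
                .withDefault("status", "UNKNOWN")
                .put("status", null); // explicit null with a schema default

        // Copying a struct field-by-field with get() silently replaces the
        // null with "UNKNOWN"; getWithoutDefault() keeps the null intact.
        System.out.println(source.get("status"));               // UNKNOWN
        System.out.println(source.getWithoutDefault("status")); // null
    }
}
```

This is the distinction KIP-1040 turns on: any SMT whose copy loop reads fields with the default-substituting accessor will quietly rewrite explicit nulls in its output.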

On Thu, May 9, 2024 at 3:28 AM Mario Fiore Vitale 
wrote:

> Hi Chris,
>
> > Wouldn't ValueToKey [1] be applicable as well, for example?
> Yes, also that one can be affected.
>
> On Wed, May 8, 2024 at 5:59 PM Chris Egerton 
> wrote:
>
> > Wait, just one more thing--are there any other SMTs that could benefit
> from
> > this? Wouldn't ValueToKey [1] be applicable as well, for example?
> >
> > [1] -
> >
> >
> https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106
> >
> > On Wed, May 8, 2024 at 11:46 AM Chris Egerton  wrote:
> >
> > > Hi Mario,
> > >
> > > I think we could have something like `copy` and `copyWithoutDefaults`
> to
> > > get around that, but now that you bring up compatibility, I think it's
> > best
> > > to hold off on this. I'm forced to recall that anything we add to the
> > > Connect API may be used by plugin developers who write for the bleeding
> > > edge of the Connect runtime, but deployed by users who are running on
> > > (possibly much) older versions. In that scenario, any use of new Struct
> > > methods would cause issues at runtime caused by compatibility clashes
> > > between the newer API that the plugin was written for, and the older
> API
> > > that's provided by the runtime it's running on.
> > >
> > > Anyway, thanks for humoring me. The KIP looks good to me 
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale  >
> > > wrote:
> > >
> > >> Hi Chris,
> > >>
> > >> Thanks for reviewing this.
> > >>
> > >> > It seems like the pattern of "copy the contents of this Struct into
> > >> another
> > >> one for the purpose of mutation" could be fairly common in user code
> > bases
> > >> in addition to the core Connect SMTs. Do you think there's a way to
> > >> simplify this with, e.g., a Struct.putAll(Struct destination) or
> > >> Struct.copy(Schema destinationSchema) method?
> > >>
> > >> The only concern that I see is backward compatibility. Suppose that
> you
> > >> are
>> not using the JsonConverter but another converter that doesn't support
>> 'replace.null.with.default'; when you use the current 'InsertField' SMT
>> the null values will be replaced by default values. If we replace the
> > >> the null values will be replace by default values. If we replace the
> > >> "copy"
> > >> logic with a method in the Struct we remove this behavior.
> > >>
> > >> Isn't it?
> > >>
> > >> Mario.
> > >>
> > >> On Wed, May 8, 2024 at 2:14 PM Chris Egerton  >
> > >> wrote:
> > >>
> > >> > Hi Mario,
> > >> >
> > >> > Thanks for the KIP! Looks good overall. One quick thought--it wasn't
> > >> > immediately obvious to me why we had to touch on InsertField for
> this.
> > >> It
> > >> > looks like the reason is t

Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-08 Thread Chris Egerton
Wait, just one more thing--are there any other SMTs that could benefit from
this? Wouldn't ValueToKey [1] be applicable as well, for example?

[1] -
https://github.com/apache/kafka/blob/2a5efe4a334611fc7c15f76b322482bad0942db0/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ValueToKey.java#L106

On Wed, May 8, 2024 at 11:46 AM Chris Egerton  wrote:

> Hi Mario,
>
> I think we could have something like `copy` and `copyWithoutDefaults` to
> get around that, but now that you bring up compatibility, I think it's best
> to hold off on this. I'm forced to recall that anything we add to the
> Connect API may be used by plugin developers who write for the bleeding
> edge of the Connect runtime, but deployed by users who are running on
> (possibly much) older versions. In that scenario, any use of new Struct
> methods would cause issues at runtime caused by compatibility clashes
> between the newer API that the plugin was written for, and the older API
> that's provided by the runtime it's running on.
>
> Anyway, thanks for humoring me. The KIP looks good to me 
>
> Cheers,
>
> Chris
>
> On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale 
> wrote:
>
>> Hi Chris,
>>
>> Thanks for reviewing this.
>>
>> > It seems like the pattern of "copy the contents of this Struct into
>> another
>> one for the purpose of mutation" could be fairly common in user code bases
>> in addition to the core Connect SMTs. Do you think there's a way to
>> simplify this with, e.g., a Struct.putAll(Struct destination) or
>> Struct.copy(Schema destinationSchema) method?
>>
>> The only concern that I see is backward compatibility. Suppose that you
>> are
>> not using the JsonConverter but another converter that doesn't support
>> 'replace.null.with.default'; when you use the current 'InsertField' SMT
>> the null values will be replaced by default values. If we replace the
>> "copy"
>> logic with a method in the Struct we remove this behavior.
>>
>> Isn't it?
>>
>> Mario.
>>
>> On Wed, May 8, 2024 at 2:14 PM Chris Egerton 
>> wrote:
>>
>> > Hi Mario,
>> >
>> > Thanks for the KIP! Looks good overall. One quick thought--it wasn't
>> > immediately obvious to me why we had to touch on InsertField for this.
>> It
>> > looks like the reason is that we use Struct::get [1] to create a clone
>> of
>> > the input value [2], instead of Struct::getWithoutDefault [3].
>> >
>> > It seems like the pattern of "copy the contents of this Struct into
>> another
>> > one for the purpose of mutation" could be fairly common in user code
>> bases
>> > in addition to the core Connect SMTs. Do you think there's a way to
>> > simplify this with, e.g., a Struct.putAll(Struct destination) or
>> > Struct.copy(Schema destinationSchema) method?
>> >
>> > [1] -
>> >
>> >
>> https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/api/src/main/java/org/apache/kafka/connect/data/Struct.java#L78-L91
>> > [2] -
>> >
>> >
>> https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/InsertField.java#L179-L183
>> > [3] -
>> >
>> >
>> https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/api/src/main/java/org/apache/kafka/connect/data/Struct.java#L93-L101
>> >
>> > Cheers,
>> >
>> > Chris
>> >
>> > On Wed, May 8, 2024 at 3:40 AM Mario Fiore Vitale 
>> > wrote:
>> >
>> > > Hi All,
>> > >
>> > > I have created (through Mickael Maison's account since there was an
>> issue
>> > > creating a new one for me) KIP-1040[1] to improve handling of nullable
>> > > values in InsertField/ExtractField transformations. This is required
>> > > after the introduction of KIP-581[2], which added the configuration
>> > > property *replace.null.with.default* to *JsonConverter* to choose
>> > > whether to replace null fields that have a default value with that
>> > > default.
>> > >
>> > > [1]
>> > >
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=303794677
>> > > [2]
>> > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
>> > >
>> > > Feedback and suggestions are welcome.
>> > >
>> > > Regards,
>> > > Mario.
>> > > --
>> > >
>> > > Mario Fiore Vitale
>> > >
>> > > Senior Software Engineer
>> > >
>> > > Red Hat <https://www.redhat.com/>
>> > > <https://www.redhat.com/>
>> > >
>> >
>>
>>
>> --
>>
>> Mario Fiore Vitale
>>
>> Senior Software Engineer
>>
>> Red Hat <https://www.redhat.com/>
>> <https://www.redhat.com/>
>>
>


Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-08 Thread Chris Egerton
Hi Mario,

I think we could have something like `copy` and `copyWithoutDefaults` to
get around that, but now that you bring up compatibility, I think it's best
to hold off on this. I'm forced to recall that anything we add to the
Connect API may be used by plugin developers who write for the bleeding
edge of the Connect runtime, but deployed by users who are running on
(possibly much) older versions. In that scenario, any use of new Struct
methods would cause issues at runtime caused by compatibility clashes
between the newer API that the plugin was written for, and the older API
that's provided by the runtime it's running on.
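
[Editor's note: the version-skew hazard described above — a plugin compiled against a newer Connect API failing with `NoSuchMethodError` on an older runtime — can be probed for defensively. The sketch below is illustrative only; `runtimeSupports` is a hypothetical helper, and `String::isBlank` (Java 11+) merely stands in for a new API method such as the proposed `copyWithoutDefaults`.]

```java
// Hedged sketch: reflectively check at startup whether the runtime's copy of
// an API class actually provides a method the plugin was compiled against,
// instead of discovering a NoSuchMethodError mid-task.
public class ApiProbe {
    static boolean runtimeSupports(Class<?> apiClass, String method, Class<?>... paramTypes) {
        try {
            apiClass.getMethod(method, paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // String::isBlank exists only on Java 11+, standing in here for a
        // hypothetical newer Connect API method like copyWithoutDefaults().
        boolean hasNewMethod = runtimeSupports(String.class, "isBlank");
        System.out.println(hasNewMethod
                ? "new API available, safe to use"
                : "older runtime, fall back to manual field copy");
    }
}
```

A plugin that takes the fallback branch can keep its own field-copy loop for older runtimes, which is the compatibility cost Chris weighs above.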

Anyway, thanks for humoring me. The KIP looks good to me 

Cheers,

Chris

On Wed, May 8, 2024 at 10:50 AM Mario Fiore Vitale 
wrote:

> Hi Chris,
>
> Thanks for reviewing this.
>
> > It seems like the pattern of "copy the contents of this Struct into
> another
> one for the purpose of mutation" could be fairly common in user code bases
> in addition to the core Connect SMTs. Do you think there's a way to
> simplify this with, e.g., a Struct.putAll(Struct destination) or
> Struct.copy(Schema destinationSchema) method?
>
> The only concern that I see is backward compatibility. Suppose that you are
> not using the JsonConverter but another converter that doesn't support
> 'replace.null.with.default'; when you use the current 'InsertField' SMT
> the null values will be replaced by default values. If we replace the "copy"
> logic with a method in the Struct we remove this behavior.
>
> Isn't it?
>
> Mario.
>
> On Wed, May 8, 2024 at 2:14 PM Chris Egerton 
> wrote:
>
> > Hi Mario,
> >
> > Thanks for the KIP! Looks good overall. One quick thought--it wasn't
> > immediately obvious to me why we had to touch on InsertField for this. It
> > looks like the reason is that we use Struct::get [1] to create a clone of
> > the input value [2], instead of Struct::getWithoutDefault [3].
> >
> > It seems like the pattern of "copy the contents of this Struct into
> another
> > one for the purpose of mutation" could be fairly common in user code
> bases
> > in addition to the core Connect SMTs. Do you think there's a way to
> > simplify this with, e.g., a Struct.putAll(Struct destination) or
> > Struct.copy(Schema destinationSchema) method?
> >
> > [1] -
> >
> >
> https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/api/src/main/java/org/apache/kafka/connect/data/Struct.java#L78-L91
> > [2] -
> >
> >
> https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/InsertField.java#L179-L183
> > [3] -
> >
> >
> https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/api/src/main/java/org/apache/kafka/connect/data/Struct.java#L93-L101
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, May 8, 2024 at 3:40 AM Mario Fiore Vitale 
> > wrote:
> >
> > > Hi All,
> > >
> > > I have created (through Mickael Maison's account since there was an
> issue
> > > creating a new one for me) KIP-1040[1] to improve handling of nullable
> > > values in InsertField/ExtractField transformations. This is required
> > > after the introduction of KIP-581[2], which added the configuration
> > > property *replace.null.with.default* to *JsonConverter* to choose
> > > whether to replace null fields that have a default value with that
> > > default.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=303794677
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
> > >
> > > Feedback and suggestions are welcome.
> > >
> > > Regards,
> > > Mario.
> > > --
> > >
> > > Mario Fiore Vitale
> > >
> > > Senior Software Engineer
> > >
> > > Red Hat <https://www.redhat.com/>
> > > <https://www.redhat.com/>
> > >
> >
>
>
> --
>
> Mario Fiore Vitale
>
> Senior Software Engineer
>
> Red Hat <https://www.redhat.com/>
> <https://www.redhat.com/>
>
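
[Editor's note: for context, the KIP-581 property discussed throughout this thread is configured per converter. A minimal worker/connector config sketch — property names as stated in KIP-581, everything else illustrative:]

```properties
# Use JsonConverter for record values; with replace.null.with.default=false,
# an explicit null is emitted as null instead of the field's schema default
# (the KIP-581 behavior that KIP-1040 extends to the SMTs).
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
value.converter.replace.null.with.default=false
```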


Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-05-08 Thread Chris Egerton
> > only be able to SWALLOW a RecordTooLargeException that happens because the
> > producer cannot produce the record (if the broker rejects the batch, the
> > error won't get to the handler, because we cannot know which other
> records
> > get ignored).  In this case, why not just check the locally configured
> max
> > record size upfront and not produce the record in the first place?
> Maybe
> > we can expose a validation function from the producer that could validate
> > the records locally, so we don't need to produce the record in order to
> > know that it's invalid.
> >
> > -Artem
> >
> > On Tue, May 7, 2024 at 2:07 PM Justine Olshan
> 
> > wrote:
> >
> >> Alieh and Chris,
> >>
> >> Thanks for clarifying 1) but I saw the motivation. I guess I just didn't
> >> understand how that would be ensured on the producer side. I saw the
> sample
> >> code -- is it just an if statement checking for the error before the
> >> handler is invoked? That seems a bit fragile.
> >>
> >> Can you clarify what you mean by `since the code does not reach the KS
> >> interface and breaks somewhere in producer.` If we surfaced this error
> to
> >> the application in a better way would that also be a solution to the
> issue?
> >>
> >> Justine
> >>
> >> On Tue, May 7, 2024 at 1:55 PM Alieh Saeedi
> 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>>
> >>> Thank you, Chris and Justine, for the feedback.
> >>>
> >>>
> >>> @Chris
> >>>
> >>> 1) Flexibility: it has two meanings. The first meaning is the one you
> >>> mentioned. We are going to cover more exceptions in the future, but as
> >>> Justine mentioned, we must be very conservative about adding more
> >>> exceptions. Additionally, flexibility mainly means that the user is
> able
> >> to
> >>> develop their own code. As mentioned in the motivation section and the
> >>> examples, sometimes the user decides on dropping a record based on the
> >>> topic, for example.
> >>>
> >>>
> >>> 2) Defining two separate methods for retriable and non-retriable
> >>> exceptions: although the idea is brilliant, the user may still make a
> >>> mistake by implementing the wrong method and see unexpected
> >> behaviour.
> >>> For example, he may implement handleRetriable() for
> >> RecordTooLargeException
> >>> and define SWALLOW for the exception, but in practice, he sees no
> change
> >> in
> >>> default behaviour since he implemented the wrong method. I think we can
> >>> never reduce the user’s mistakes to 0.
> >>>
> >>>
> >>>
> >>> 3) Default implementation for Handler: the default behaviour is already
> >>> preserved with NO need of implementing any handler or setting the
> >>> corresponding config parameter `custom.exception.handler`. What you
> mean
> >> is
> >>> actually having a second default, which requires having both interface
> >> and
> >>> config parameters. About UnknownTopicOrPartitionException: the producer
> >>> already offers the config parameter `max.block.ms` which determines
> the
> >>> duration of retrying. The main purpose of the user who needs the
> handler
> >> is
> >>> to get the root cause of TimeoutException and handle it in the way he
> >>> intends. The KIP explains the necessity of it for KS users.
> >>>
> >>>
> >>> 4) Naming issue: By SWALLOW, we meant actually swallow the error, while
> >>> SKIP means skip the record, I think. If it makes sense for more people, I
> >> can
> >>> change it to SKIP.
> >>>
> >>>
> >>> @Justine
> >>>
> >>> 1) was addressed by Chris.
> >>>
> >>> 2 and 3) The problem is exactly what you mentioned. Currently, there is
> >> no
> >>> way to handle these issues application-side. Even KS users who
> implement
> >> KS
> >>> ProductionExceptionHandler are not able to handle the exceptions as
> they
> >>> intend since the code does not reach the KS interface and breaks
> >> somewhere
> >>> in producer.
> >>>
> >>> Cheers,
> >>> Alieh
> >>>
> >>> On Tue, May 7, 2024 at 8:43 PM Chris Egerton 
> >>> wrote:

[jira] [Resolved] (KAFKA-16108) Backport fix for KAFKA-16093 to 3.7

2024-05-08 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16108.
---
Resolution: Done

> Backport fix for KAFKA-16093 to 3.7
> ---
>
> Key: KAFKA-16108
> URL: https://issues.apache.org/jira/browse/KAFKA-16108
> Project: Kafka
>  Issue Type: Improvement
>  Components: connect
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Blocker
> Fix For: 3.7.1
>
>
> A fix for KAFKA-16093 is present on the branches trunk (the version for which 
> is currently 3.8.0-SNAPSHOT) and 3.6. We are in code freeze for the 3.7.0 
> release, and this issue is not a blocker, so it cannot be backported right 
> now.
> We should backport the fix once 3.7.0 has been released and before 3.7.1 is 
> released.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: KIP-1040: Improve handling of nullable values in InsertField/ExtractField transformations

2024-05-08 Thread Chris Egerton
Hi Mario,

Thanks for the KIP! Looks good overall. One quick thought--it wasn't
immediately obvious to me why we had to touch on InsertField for this. It
looks like the reason is that we use Struct::get [1] to create a clone of
the input value [2], instead of Struct::getWithoutDefault [3].

It seems like the pattern of "copy the contents of this Struct into another
one for the purpose of mutation" could be fairly common in user code bases
in addition to the core Connect SMTs. Do you think there's a way to
simplify this with, e.g., a Struct.putAll(Struct destination) or
Struct.copy(Schema destinationSchema) method?

[1] -
https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/api/src/main/java/org/apache/kafka/connect/data/Struct.java#L78-L91
[2] -
https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/InsertField.java#L179-L183
[3] -
https://github.com/apache/kafka/blob/f74f596bc7d35fcea97edcd83244e5d6aee308d2/connect/api/src/main/java/org/apache/kafka/connect/data/Struct.java#L93-L101
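
To make the idea concrete, here is a minimal sketch of such a copy helper, using a simplified stand-in for Struct. The MiniStruct class, the putAll name, and the defaults map are all hypothetical illustrations, not the Connect API:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for Connect's Struct (hypothetical, for illustration only):
// a map of raw values plus per-field schema defaults.
class MiniStruct {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, Object> defaults;

    MiniStruct(Map<String, Object> defaults) { this.defaults = defaults; }

    MiniStruct put(String field, Object value) { values.put(field, value); return this; }

    // Mirrors Struct::get -- substitutes the schema default when the raw value is null.
    Object get(String field) {
        Object v = values.get(field);
        return v != null ? v : defaults.get(field);
    }

    // Mirrors Struct::getWithoutDefault -- returns the raw (possibly null) value.
    Object getWithoutDefault(String field) { return values.get(field); }

    // The kind of helper floated above: copies raw values into the destination so an
    // explicit null is preserved instead of being replaced by the default.
    MiniStruct putAll(MiniStruct destination) {
        for (String field : values.keySet()) {
            destination.put(field, getWithoutDefault(field));
        }
        return destination;
    }
}

public class StructCopyDemo {
    public static void main(String[] args) {
        Map<String, Object> defaults = new HashMap<>();
        defaults.put("status", "UNKNOWN");

        MiniStruct source = new MiniStruct(defaults)
                .put("id", 42)
                .put("status", null);  // caller explicitly set null

        // Copying field-by-field via get() silently turns the null into the default:
        MiniStruct viaGet = new MiniStruct(defaults);
        viaGet.put("status", source.get("status"));

        // Copying via the helper keeps the explicit null:
        MiniStruct viaHelper = source.putAll(new MiniStruct(defaults));

        System.out.println(viaGet.getWithoutDefault("status"));    // UNKNOWN
        System.out.println(viaHelper.getWithoutDefault("status")); // null
    }
}
```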

Cheers,

Chris

On Wed, May 8, 2024 at 3:40 AM Mario Fiore Vitale 
wrote:

> Hi All,
>
> I have created (through Mickael Maison's account since there was an issue
> creating a new one for me) KIP-1040[1] to improve handling of nullable
> values in InsertField/ExtractField transformations, this is required after
> the introduction of KIP-581[2] that introduced the configuration property
> *replace.null.with.default* to *JsonConverter* to choose whether to replace
> fields that have a default value and that are null with their default value.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=303794677
> [2]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
>
> Feedback and suggestions are welcome.
>
> Regards,
> Mario.
> --
>
> Mario Fiore Vitale
>
> Senior Software Engineer
>
> Red Hat 
> 
>


Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-05-07 Thread Chris Egerton
Hi Justine,

The method signatures for the interface are indeed open-ended, but the KIP
states that its uses will be limited. See the motivation section:

> We believe that the user should be able to develop custom exception
handlers for managing producer exceptions. On the other hand, this will be
an expert-level API, and using that may result in strange behaviour in the
system, making it hard to find the root cause. Therefore, the custom
handler is currently limited to handling RecordTooLargeException and
UnknownTopicOrPartitionException.

Cheers,

Chris


On Tue, May 7, 2024, 14:37 Justine Olshan 
wrote:

> Hi Alieh,
>
> I was out for KSB and then was also sick. :(
>
> To your point 1) Chris, I don't think it is limited to two specific
> scenarios, since the interface accepts a generic Exception e and can be
> implemented to check if that e is an instanceof any exception. I didn't see
> anywhere that specific errors are enforced. I'm a bit concerned about this
> actually. I'm concerned about the opened-endedness and the contract we have
> with transactions. We are allowing the client to make decisions that are
> somewhat invisible to the server. As an aside, can we build in log messages
> when the handler decides to skip etc a message. I'm really concerned about
> messages being silently dropped.
>
> I do think Chris's point 2) about retriable vs non retriable errors is
> fair. I'm a bit concerned about skipping a unknown topic or partition
> exception too early, as there are cases where it can be transient.
>
> I'm still a little bit wary of allowing dropping records as part of EOS
> generally as in many cases, these errors signify an issue with the original
> data. I understand that streams and connect/mirror maker may have reasons
> they want to progress past these messages, but wondering if there is a way
> that can be done application-side. I'm willing to accept this sort of
> proposal if we can make it clear that this sort of thing is happening and
> we limit the blast radius for what we can do.
>
> Justine
>
> On Tue, May 7, 2024 at 9:55 AM Chris Egerton 
> wrote:
>
> > Hi Alieh,
> >
> > Sorry for the delay, I've been out sick. I still have some thoughts that
> > I'd like to see addressed before voting.
> >
> > 1) If flexibility is the motivation for a pluggable interface, why are we
> > only limiting the uses for this interface to two very specific scenarios?
> > Why not also allow, e.g., authorization errors to be handled as well
> > (allowing users to drop records destined for some off-limits topics, or
> > retry for a limited duration in case there's a delay in the propagation
> of
> > ACL updates)? It'd be nice to see some analysis of other errors that
> could
> > be handled with this new API, both to avoid the follow-up work of another
> > KIP to address them in the future, and to make sure that we're not
> painting
> > ourselves into a corner with the current API in a way that would make
> > future modifications difficult.
> >
> > 2) Something feels a bit off with how retriable vs. non-retriable errors
> > are handled with the interface. Why not introduce two separate methods to
> > handle each case separately? That way there's no ambiguity or implicit
> > behavior when, e.g., attempting to retry on a RecordTooLargeException.
> This
> > could be something like `NonRetriableResponse handle(ProducerRecord,
> > Exception)` and `RetriableResponse handleRetriable(ProducerRecord,
> > Exception)`, though the exact names and shape can obviously be toyed
> with a
> > bit.
> >
> > 3) Although the flexibility of a pluggable interface may benefit some
> > users' custom producer applications and Kafka Streams applications, it
> > comes at significant deployment cost for other low-/no-code environments,
> > including but not limited to Kafka Connect and MirrorMaker 2. Can we add
> a
> > default implementation of the exception handler that allows for some
> simple
> > behavior to be tweaked via configuration property? Two things that would
> be
> > nice to have would be A) an upper bound on the retry time for
> > unknown-topic-partition exceptions and B) an option to drop records that
> > are large enough to trigger a record-too-large exception.
> >
> > 4) I'd still prefer to see "SKIP" or "DROP" instead of the proposed
> > "SWALLOW" option, which IMO is opaque and non-obvious, especially when
> > trying to guess the behavior for retriable errors.
> >
> > Cheers,
> >
> > Chris
> >
> > On Fri, May 3, 2024 at 11:23 AM Alieh Saeedi
>  > >
> > wrote:
> >
> > >

[jira] [Resolved] (KAFKA-15018) Potential tombstone offsets corruption for exactly-once source connectors

2024-05-07 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-15018.
---
Fix Version/s: 3.8.0
   Resolution: Fixed

> Potential tombstone offsets corruption for exactly-once source connectors
> -
>
> Key: KAFKA-15018
> URL: https://issues.apache.org/jira/browse/KAFKA-15018
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 3.3.0, 3.4.0, 3.3.1, 3.3.2, 3.5.0, 3.4.1
>    Reporter: Chris Egerton
>Assignee: Sagar Rao
>Priority: Major
> Fix For: 3.8.0
>
>
> When exactly-once support is enabled for source connectors, source offsets 
> can potentially be written to two different offsets topics: a topic specific 
> to the connector, and the global offsets topic (which was used for all 
> connectors prior to KIP-618 / version 3.3.0).
> Precedence is given to offsets in the per-connector offsets topic, but if 
> none are found for a given partition, then the global offsets topic is used 
> as a fallback.
> When committing offsets, a transaction is used to ensure that source records 
> and source offsets are written to the Kafka cluster targeted by the source 
> connector. This transaction only includes the connector-specific offsets 
> topic. Writes to the global offsets topic take place after writes to the 
> connector-specific offsets topic have completed successfully, and if they 
> fail, a warning message is logged, but no other action is taken.
> Normally, this ensures that, for offsets committed by exactly-once-supported 
> source connectors, the per-connector offsets topic is at least as up-to-date 
> as the global offsets topic, and sometimes even ahead.
> However, for tombstone offsets, we lose that guarantee. If a tombstone offset 
> is successfully written to the per-connector offsets topic, but cannot be 
> written to the global offsets topic, then the global offsets topic will still 
> contain that source offset, but the per-connector topic will not. Due to the 
> fallback-on-global logic used by the worker, if a task requests offsets for 
> one of the tombstoned partitions, the worker will provide it with the offsets 
> present in the global offsets topic, instead of indicating to the task that 
> no offsets can be found.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-05-07 Thread Chris Egerton
tions, but to be more conservative and avoid
> > >>> future
> > >>>> issues, we decided to be limited to a short list of exceptions. I
> > >>> included
> > >>>> *RecordTooLargeException* and *UnknownTopicOrPartitionException*. Open
> > >> to
> > >>>> suggestion for adding some more ;-)
> > >>>>
> > >>>> KIP Updates:
> > >>>> - clarified the way that the user should configure the Producer to
> use
> > >>> the
> > >>>> custom handler. I think adding a producer config property is the
> > >> cleanest
> > >>>> one.
> > >>>> - changed the *ClientExceptionHandler* to *ProducerExceptionHandler*
> > to
> > >>> be
> > >>>> closer to what we are changing.
> > >>>> - added the ProducerRecord as the input parameter of the handle()
> > >> method
> > >>> as
> > >>>> well.
> > >>>> - increased the response types to 3 to have fail and two types of
> > >>> continue.
> > >>>> - The default behaviour is having no custom handler, having the
> > >>>> corresponding config parameter set to null. Therefore, the KIP
> > provides
> > >>> no
> > >>>> default implementation of the interface.
> > >>>> - We follow the interface solution as described in the
> > >>>> Rejected Alternatives section.
> > >>>>
> > >>>>
> > >>>> Cheers,
> > >>>> Alieh
> > >>>>
> > >>>>
> > >>>> On Thu, Apr 18, 2024 at 8:11 PM Matthias J. Sax 
> > >>> wrote:
> > >>>>
> > >>>>> Thanks for the KIP Alieh! It addresses an important case for error
> > >>>>> handling.
> > >>>>>
> > >>>>> I agree that using this handler would be an expert API, as
> mentioned
> > >> by
> > >>>>> a few people. But I don't think it would be a reason to not add it.
> > >> It's
> > >>>>> always a tricky tradeoff what to expose to users and to avoid foot
> > >> guns,
> > >>>>> but we added similar handlers to Kafka Streams, and have good
> > >> experience
> > >>>>> with it. Hence, I understand, but don't share the concern raised.
> > >>>>>
> > >>>>> I also agree that there is some responsibility by the user to
> > >> understand
> > >>>>> how such a handler should be implemented to not drop data by
> > accident.
> > >>>>> But it seem unavoidable and acceptable.
> > >>>>>
> > >>>>> While I understand that a "simpler / reduced" API (eg via configs)
> > >> might
> > >>>>> also work, I personally prefer a full handler. Configs have the
> same
> > >>>>> issue that they could be miss-used potentially leading to
> incorrectly
> > >>>>> dropped data, but at the same time are less flexible (and thus
> maybe
> > >>>>> ever harder to use correctly...?). Base on my experience, there is
> > >> also
> > >>>>> often weird corner case for which it make sense to also drop
> records
> > >> for
> > >>>>> other exceptions, and a full handler has the advantage of full
> > >>>>> flexibility and "absolute power!".
> > >>>>>
> > >>>>> To be fair: I don't know the exact code paths of the producer in
> > >>>>> details, so please keep me honest. But my understanding is, that
> the
> > >> KIP
> > >>>>> aims to allow users to react to internal exception, and decide to
> > keep
> > >>>>> retrying internally, swallow the error and drop the record, or
> raise
> > >> the
> > >>>>> error?
> > >>>>>
> > >>>>> Maybe the KIP would need to be a little bit more precises what
> error
> > >> we
> > >>>>> want to cover -- I don't think this list must be exhaustive, as we
> > can
> > >>>>> always do follow up KIP to also apply the handler to other errors
> to
> > >>>>> expand the scope of the handler. The KIP does mention examples, but
> > it

[jira] [Resolved] (KAFKA-13329) Connect does not perform preflight validation for per-connector key and value converters

2024-05-07 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-13329.
---
Fix Version/s: 3.8.0
   Resolution: Fixed

> Connect does not perform preflight validation for per-connector key and value 
> converters
> 
>
> Key: KAFKA-13329
> URL: https://issues.apache.org/jira/browse/KAFKA-13329
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.8.0
>
>
> Users may specify a key and/or value converter class for their connector 
> directly in the configuration for that connector. If this occurs, no 
> preflight validation is performed to ensure that the specified converter is 
> valid.
> Unfortunately, the [Converter 
> interface|https://github.com/apache/kafka/blob/4eb386f6e060e12e1940c0d780987e3a7c438d74/connect/api/src/main/java/org/apache/kafka/connect/storage/Converter.java]
>  does not require converters to expose a {{ConfigDef}} (unlike the 
> [HeaderConverter 
> interface|https://github.com/apache/kafka/blob/4eb386f6e060e12e1940c0d780987e3a7c438d74/connect/api/src/main/java/org/apache/kafka/connect/storage/HeaderConverter.java#L48-L52],
>  which does have that requirement), so it's unlikely that the configuration 
> properties of the converter itself can be validated.
> However, we can and should still validate that the converter class exists, 
> can be instantiated (i.e., has a public, no-args constructor and is a 
> concrete, non-abstract class), and implements the {{Converter}} interface.
> *EDIT:* Since this ticket was originally filed, a {{Converter::config}} 
> method was added in 
> [KIP-769|https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions].
>  We can now utilize that config definition during preflight validation for 
> connectors.
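
A rough sketch of the kind of preflight check described above, using plain reflection with a stand-in Converter interface. All names here are hypothetical illustrations, not Connect's actual validation code:

```java
import java.lang.reflect.Modifier;

// Hypothetical preflight check: the class must exist, implement the expected
// interface, be concrete, and have a usable no-args constructor.
public class PreflightCheckDemo {
    public interface Converter { }                                    // stand-in interface
    public static class JsonLikeConverter implements Converter { }    // valid example
    public abstract static class AbstractConverter implements Converter { }

    public static String validate(String className, Class<?> expected) {
        try {
            Class<?> cls = Class.forName(className);
            if (!expected.isAssignableFrom(cls))
                return "does not implement " + expected.getSimpleName();
            if (Modifier.isAbstract(cls.getModifiers()))
                return "is abstract";
            cls.getDeclaredConstructor().newInstance();  // no-args constructor check
            return "ok";
        } catch (ClassNotFoundException e) {
            return "class not found";
        } catch (ReflectiveOperationException e) {
            return "cannot be instantiated";
        }
    }

    public static void main(String[] args) {
        System.out.println(validate(JsonLikeConverter.class.getName(), Converter.class));  // ok
        System.out.println(validate("no.such.Clazz", Converter.class));                    // class not found
        System.out.println(validate(AbstractConverter.class.getName(), Converter.class));  // is abstract
    }
}
```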



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-13328) Connect does not perform preflight validation for per-connector header converters

2024-05-07 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-13328.
---
Fix Version/s: 3.8.0
   Resolution: Fixed

> Connect does not perform preflight validation for per-connector header 
> converters
> -
>
> Key: KAFKA-13328
> URL: https://issues.apache.org/jira/browse/KAFKA-13328
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.8.0
>
>
> Users may specify a header converter class for their connector directly in 
> the configuration for that connector. If this occurs, no preflight validation 
> is performed to ensure that the specified converter is valid.
> {{HeaderConverter}} implementations are required to provide a valid 
> {{ConfigDef}} to the Connect framework via 
> [HeaderConverter::config|https://github.com/apache/kafka/blob/4eb386f6e060e12e1940c0d780987e3a7c438d74/connect/api/src/main/java/org/apache/kafka/connect/storage/HeaderConverter.java#L48-L52],
>  but this object isn't actually leveraged anywhere by Connect.
> Connect should make use of this config object during preflight validation for 
> connectors to fail faster when their header converters are misconfigured.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: lambda not returning?

2024-04-30 Thread Chris Egerton
Hi John,

Do you have logs demonstrating that the connector was able to start up
successfully? If not, it's possible that instead of the callback being
dropped, it was properly retained but never invoked because the connector
was blocked.

Cheers,

Chris

On Tue, Apr 30, 2024 at 1:57 PM John Hawkins
 wrote:

> Hi folks,
>
> I’ve been working with the standalone kafka for quite a while now and been
> debugging a problem I’m having when deploying new connectors (using the PUT
> method but it happens in the POST as well).
>
>
>
> I get a 500 timeout whenever I try to create a new connector. It looks to
> me like the lambda callback is being called but disappears into a
> black hole. So, the original caller – ConnectorsResources.java never gets
> notified and so times out.
>
> The herder never calls back…
>
>
>
> herder.putConnectorConfig(connector, connectorConfig, *true*, cb);
>
>
>
> or rather the callback does get #onComplete() called in the doTransition
> method of the connectorworker but the callback disappears – it never seems
> to work.
>
>
>
> I’m left wondering if the nested lambda calling is losing the memory
> somehow and this could be a classloading issue somewhere that is somehow
> losing the context of the original lambda?
>
>
>
> Has this been seen before?
>
>
>
> I’ve seen a few posts (not in Kafka) where lambda calls fail to call back
> if there are still threads running in the call itself. From what I can see
> there are plenty of threads still running in the call
>
> I’m using a Java 21 Microsoft OpenJDK JRE, but it also fails with IBM's version.
>
>
>
> Our project is dependent on using the Kafka standalone so I need to get
> this to work.
>
>
>
> Thanks for your thoughts,
>
> John.
>
>
>
>
>
> John Hawkins
>
> *Senior Architect*
>
>
>
>
>
>
>
>
>
>
>
> *john.hawk...@coliance.co * | www.coliance.co
>
> p: +44 (0) 7879 992532
>
> Follow Coliance:  LinkedIn
>   | YouTube
>   |  Facebook
>   |  Instagram 
>
>
>
>
>


Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-04-30 Thread Chris Egerton
Hi Vedarth and Krish,

Thanks for the KIP! I have to admit I'm a little skeptical; hopefully you
can help me understand the need for these additional images.

1) In the motivation section it's stated that "Several other Apache
projects, like Flink, Spark, Solr, have already released Docker Official
Images, with download figures ranging from 50 million to over 1 billion.
These numbers highlight the significant demand among users." But then
immediately afterwards, we learn that "Also the Docker Official Images are
always the top 1 search result, irrespective of the number of downloads."
Wouldn't a high number of downloads for an image naturally follow from
being the top search result? It seems like we can't necessarily assume that
Docker Official Images are inherently more desirable for users based solely
on download statistics.

2) Can you elaborate on the value that these new images would add from a
user's perspective? I'm hesitant to introduce another image, since it adds
to the cognitive burden of people who will inevitably have to answer the
question of "What are the differences between all of the available images
and which one is best for my use case?"

3) Would a separate Docker-owned repository be out of the question? I'm
guessing there are some trademark issues that might get in the way, but
it's worth exploring since the entire purpose of this KIP seems to be to
provide images that are vetted and designed by Docker more than by the
Apache Kafka contributors/committers/PMC.

I may have more questions later but wanted to get this initial round out
now without trying to list everything first.

Looking forward to your thoughts!

Cheers,

Chris

On Mon, Apr 22, 2024 at 2:14 PM Vedarth Sharma 
wrote:

> Hey folks,
>
> Thanks a lot for reviewing the KIP and providing feedback.
> The discussion thread seems resolved and KIP has been updated accordingly.
> We will be starting the voting thread for this KIP in the next few days.
> Please take a look at the KIP and let us know if any further discussion
> is needed.
>
> Thanks and regards,
> Vedarth
>
> On Fri, Apr 19, 2024 at 1:33 PM Manikumar 
> wrote:
>
> > Thanks Krish. KIP looks good to me.
> >
> > On Wed, Apr 17, 2024 at 1:38 PM Krish Vora 
> wrote:
> > >
> > > Hi Manikumar,
> > >
> > > Thanks for the comments.
> > >
> > > Maybe as part of the release process, RM can create a JIRA for this
> > > > task. This can be taken by RM or any comitter or any contributor
> (with
> > > > some help from commiters to run "Docker Image Preparation via GitHub
> > > > Actions:"
> > >
> > > This sounds like a good idea. This step would be beneficial. By
> creating
> > a
> > > JIRA ticket, it will also serve as a reminder to complete the
> > post-release
> > > steps for the Docker official images. Have updated the KIP with this
> > step.
> > >
> > > Is this using GitHub Actions workflow? or manual testing?
> > >
> > > This will be done by a Github Actions workflow, which will test the
> > static
> > > Docker Official Image assets for a specific release version.
> > >
> > > Is it mandatory for RM/comitters to raise the PR to Docker Hub’s
> > > > official images repository (or) can it be done by any contributor.
> > >
> > > I believe that it can be done by any contributor (ref: This link
> > >  >
> > > quotes "*Anyone can provide feedback, contribute code, suggest process
> > > changes, or even propose a new Official Image.*")
> > >
> > > Also I was thinking, once the KIP gets voted, we should try to release
> > > > kafka:3.7.0 (or 3.7.1) Docker Official image. This will help us to
> > > > validate the process and allow us to fix any changes suggested by
> > > > Dockerhub before the 3.8.0 release.
> > >
> > > This sounds like a great idea. This KIP proposes release of DOI as a
> > > post-release process, which can be done anytime post release. Since
> 3.7.0
> > > is already released, we can perform these steps for that release too.
> By
> > > the time the KIP gets implemented, if 3.7.1 is released, we could do
> > these
> > > steps for 3.7.1, instead of 3.7.0. This would allow us to make changes
> to
> > > the Dockerfiles and other assets based on feedback from Docker Hub
> before
> > > the release of version 3.8.0.
> > >
> > > Thanks,
> > > Krish.
> > >
> > > On Mon, Apr 15, 2024 at 12:59 PM Manikumar 
> > > wrote:
> > >
> > > > Hi Krish,
> > > >
> > > > Thanks for the updated KIP. a few comments below.
> > > >
> > > > > "These actions can be carried out by the RM or any contributor post
> > the
> > > > release process."
> > > > Maybe as part of the release process, RM can create a JIRA for this
> > > > task. This can be taken by RM or any comitter or any contributor
> (with
> > > > some help from commiters to run "Docker Image Preparation via GitHub
> > > > Actions:"
> > > >
> > > > > "Perform Docker build tests to ensure image integrity"
> > > > Is this using GitHub Actions workflow? or manual testing?

Re: [DISCUSS] KIP-1039: Disable automatic topic creation for MirrorMaker2 consumers

2024-04-30 Thread Chris Egerton
Hi Aaron,

Thanks for publishing this KIP after the feedback on your PR. I'm generally
in favor but have a few questions:

1) Is the consumer mentioned in the KIP the one constructed by the
MirrorSourceConnector for polling replicated records from the source
cluster? If yes, are there any other consumers for which it would also make
sense to disable automatic topic creation?

2) Is broker-side automatic topic creation enabled intentionally in your
source cluster? I know that it is by default but in my experience it's
highly unusual to see it enabled in prod.

3) In that same vein, can you add disabling automatic topic creation
broker-side (setting "auto.create.topics.enable" to "false" in your broker
config(s)) as another rejected alternative? This is useful because it
provides a workaround for users running existing versions of MM2 that don't
have the changes proposed by your KIP.

4) Can you also provide the exact configuration changes that would be
required for the currently-mentioned rejected alternative ("An alternative
solution is to disable this option manually,"), which presumably refers to
altering the configuration of MM2 to manually force the changes to its
consumer(s) that this KIP proposes be applied by default? Again, this is
useful as documentation of a workaround for users running existing versions
of MM2.
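
For reference, the workarounds in points 3 and 4 presumably amount to a couple of configuration lines; the exact MM2 consumer-override prefix below is an assumption, not taken from the KIP:

```properties
# Hypothetical MM2 properties sketch: override the source-side consumer so it
# does not auto-create topics (the exact "<cluster>.consumer." prefix is assumed).
source.consumer.allow.auto.create.topics=false

# Broker-side alternative (point 3): in the source cluster's server.properties,
# disable automatic topic creation entirely.
auto.create.topics.enable=false
```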

Thanks again for the KIP!

Cheers,

Chris

On Fri, Apr 19, 2024 at 6:33 AM aaron ai  wrote:

> Hi Omnia,
> Thanks for your feedback!
>
> Yes, another approach could be to disable this option manually. IMHO, it's
> hard to envision a scenario where enabling it would be necessary. BTW, I've
> already added this part into the KIP.
>
> On Fri, Apr 19, 2024 at 6:18 PM Omnia Ibrahim 
> wrote:
>
> > Hi Aaron,
> > You mentioned that there are no public interface changes; however, changing
> > the default value of a config should be considered as a public change.
> You
> > can check other KIP where we changed the default config value for a
> > reference.
> >
> > Can you please list the impact of changing the topic-creation behaviour,
> > as well as any rejected alternatives? For example, couldn't customers
> > disable allow.auto.create.topics manually as a workaround?
> >
> > Thanks
> > Omnia
> >
> > > On 19 Apr 2024, at 10:37, aaron ai  wrote:
> > >
> > > Hi all,
> > >
> > > Here is the KIP-1039: Disable automatic topic creation for MirrorMaker2
> > > consumers
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1039%3A+Disable+automatic+topic+creation+for+MirrorMaker2+consumers
> > >>
> > >
> > > Looking forward to your feedback!
> > >
> > > Best regards,
> > > Aaron
> >
> >
>


Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

2024-04-17 Thread Chris Egerton
Hi Alieh,

Thanks for the KIP! The issue with writing to non-existent topics is
particularly frustrating for users of Kafka Connect and has been the source
of a handful of Jira tickets over the years. My thoughts:

1. An additional detail we can add to the motivation (or possibly rejected
alternatives) section is that this kind of custom retry logic can't be
implemented by hand by, e.g., setting retries to 0 in the producer config
and handling exceptions at the application level. Or rather, it can, but 1)
it's a bit painful to have to reimplement at every call-site for
Producer::send (and any code that awaits the returned Future) and 2) it's
impossible to do this without losing idempotency on retries.

2. That said, I wonder if a pluggable interface is really the right call
here. Documenting the interactions of a producer with
a ClientExceptionHandler instance will be tricky, and implementing them
will also be a fair amount of work. I believe that there needs to be some
more granularity for how writes to non-existent topics (or really,
UNKNOWN_TOPIC_OR_PARTITION and related errors from the broker) are handled,
but I'm torn between keeping it simple with maybe one or two new producer
config properties, or a full-blown pluggable interface. If there are more
cases that would benefit from a pluggable interface, it would be nice to
identify these and add them to the KIP to strengthen the motivation. Right
now, I'm not sure the two that are mentioned in the motivation are
sufficient.

3. Alternatively, a possible compromise is for this KIP to introduce new
properties that dictate how to handle unknown-topic-partition and
record-too-large errors, with the thinking that if we introduce a pluggable
interface later on, these properties will be recognized by the default
implementation of that interface but could be completely ignored or
replaced by alternative implementations.

4. (Nit) You can remove the "This page is meant as a template for writing a
KIP..." part from the KIP. It's not a template anymore :)

5. If we do go the pluggable interface route, wouldn't we want to add the
possibility for retry logic? The simplest version of this could be to add a
RETRY value to the ClientExceptionHandlerResponse enum.

6. I think "SKIP" or "DROP" might be clearer instead of "CONTINUE" for
the ClientExceptionHandlerResponse enum, since they cause records to be
dropped.
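
To make points 5 and 6 concrete, here is a hypothetical, compilable sketch; every name in it (the enum, the interface, the example policy, the local exception stand-ins) is illustrative and not taken from the KIP:

```java
// Hypothetical sketch only: the real interface, if adopted, may differ.
public class ProducerExceptionHandlerSketch {

    // Local stand-ins for the broker errors discussed in the thread.
    static class RecordTooLargeException extends RuntimeException { }
    static class UnknownTopicOrPartitionException extends RuntimeException { }

    // Per points 5 and 6 above: a RETRY option, and SKIP instead of CONTINUE.
    enum HandlerResponse { FAIL, SKIP, RETRY }

    interface ClientExceptionHandler {
        HandlerResponse handle(Exception error);
    }

    // Example policy: drop oversized records, retry writes to topics that may
    // not exist yet (e.g. topic creation still in flight), fail otherwise.
    static class ExampleHandler implements ClientExceptionHandler {
        @Override
        public HandlerResponse handle(Exception error) {
            if (error instanceof RecordTooLargeException)
                return HandlerResponse.SKIP;
            if (error instanceof UnknownTopicOrPartitionException)
                return HandlerResponse.RETRY;
            return HandlerResponse.FAIL;
        }
    }

    public static void main(String[] args) {
        ClientExceptionHandler handler = new ExampleHandler();
        System.out.println(handler.handle(new RecordTooLargeException()));
        System.out.println(handler.handle(new UnknownTopicOrPartitionException()));
        System.out.println(handler.handle(new RuntimeException("other")));
    }
}
```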

Cheers,

Chris

On Wed, Apr 17, 2024 at 12:25 PM Justine Olshan
 wrote:

> Hey Alieh,
>
> I echo what Omnia says, I'm not sure I understand the implications of the
> change and I think more detail is needed.
>
> This comment also confused me a bit:
> * {@code ClientExceptionHandler} that continues the transaction even if a
> record is too large.
> * Otherwise, it makes the transaction to fail.
>
> Relatedly, I've been working with some folks on a KIP for transactions
> errors and how they are handled. Specifically for the
> RecordTooLargeException (and a few other errors), we want to give a new
> error category for this error that allows the application to choose how it
> is handled. Maybe this KIP is something that you are looking for? Stay
> tuned :)
>
> Justine
>
>
>
>
>
> On Wed, Apr 17, 2024 at 8:03 AM Omnia Ibrahim 
> wrote:
>
> > Hi Alieh,
> > Thanks for the KIP! I have a couple of comments:
> > - You mentioned in the KIP motivation,
> > > Another example for which a production exception handler could be
> useful
> > is if a user tries to write into a non-existing topic, which returns a
> > retryable error code; with infinite retries, the producer would hang
> > retrying forever. A handler could help to break the infinite retry loop.
> >
> > How can the handler differentiate between something temporary, where it
> > should keep retrying, and something permanent, like forgetting to create
> > the topic? Temporary causes here could be:
> >  - the producer getting deployed before topic creation finishes (especially
> >    if topic creation is handled via IaC)
> >  - temporarily offline partitions
> >  - leadership changes
> > Isn't this putting the producer at risk of dropping records
> > unintentionally?
> > - Can you elaborate more on what is written in the compatibility /
> > migration plan section, please, by explaining in a bit more detail what
> > the changing behaviour is and how it will impact clients who are upgrading?
> > - In the proposed changes, can you elaborate in the KIP on where in the
> > producer lifecycle ClientExceptionHandler and TransactionExceptionHandler
> > will get triggered, and how the producer will be configured to point to a
> > custom implementation?
> >
> > Thanks
> > Omnia
> >
> > > On 17 Apr 2024, at 13:13, Alieh Saeedi 
> > wrote:
> > >
> > > Hi all,
> > > Here is the KIP-1038: Add Custom Error Handler to Producer.
> > > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1038%3A+Add+Custom+Error+Handler+to+Producer
> > >
> > > I look forward to your feedback!
> > >
> > > Cheers,
> > > Alieh
> >
> >
>


Re: [VOTE] KIP-899: Allow producer and consumer clients to rebootstrap

2024-04-15 Thread Chris Egerton
Hi Ivan,

Thanks for the KIP. After the recent changes, this LGTM. +1 (binding)

Cheers,

Chris

On Wed, Aug 2, 2023 at 12:15 AM Ivan Yurchenko 
wrote:

> Hello,
>
> The discussion [1] for KIP-899 [2] has been open for quite some time. I'd
> like to put the KIP up for a vote.
>
> Best,
> Ivan
>
> [1] https://lists.apache.org/thread/m0ncbmfxs5m87sszby2jbmtjx2bdpcdl
> [2]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-899%3A+Allow+producer+and+consumer+clients+to+rebootstrap
>
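
For readers skimming the archive: the feature being voted on boils down to an opt-in client configuration. A minimal sketch follows; the property name `metadata.recovery.strategy` is taken from the KIP text, so confirm it against the released client before relying on it:

```java
import java.util.Properties;

public class RebootstrapConfigSketch {
    public static void main(String[] args) {
        // Plain producer/consumer config; broker addresses are illustrative.
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        // Opt in to re-bootstrapping per KIP-899 (the KIP's default is "none",
        // i.e. current behavior).
        props.put("metadata.recovery.strategy", "rebootstrap");
        System.out.println(props.getProperty("metadata.recovery.strategy"));
    }
}
```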


[ANNOUNCE] New Kafka PMC Member: Greg Harris

2024-04-13 Thread Chris Egerton
Hi all,

Greg Harris has been a Kafka committer since July 2023. He has remained
very active and instructive in the community since becoming a committer.
It's my pleasure to announce that Greg is now a member of Kafka PMC.

Congratulations, Greg!

Chris, on behalf of the Apache Kafka PMC


Re: [DISCUSS] KIP-899: Allow clients to rebootstrap

2024-04-12 Thread Chris Egerton
Thanks Ivan! LGTM

On Fri, Apr 12, 2024, 13:38 Ivan Yurchenko  wrote:

> Hi Chris and all,
>
> Thank you for your feedback. Your proposals seem good to me. I made these
> changes to the KIP; please have a look at the diff [1]
>
> Best,
> Ivan
>
> [1]
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=240881396=14=12
>
> On Thu, Apr 11, 2024, at 10:49, Chris Egerton wrote:
> > Hi Ivan,
> >
> > I agree with Andrew that we can save cluster ID checking for later. This
> > feature is opt-in and if necessary we can add a note to users about only
> > enabling it if they can be certain that the same cluster will always be
> > resolved by the bootstrap servers. This would apply regardless of whether
> > we did client ID checking anyways.
> >
> > Thanks for exploring a variety of options and ironing out the details on
> > this KIP. I think this is acceptable as-is but have a couple of final
> > suggestions we might consider:
> >
> > 1. Although the definition of an unavailable broker is useful ("A broker
> is
> > unavailable when the client doesn't have an established connection with
> it
> > and cannot establish a connection (e.g. due to the reconnect backoff)"),
> I
> > think this is a little too restrictive. It's useful to note this as an
> > example of what we may consider an unavailable broker, but if we leave
> some
> > more wiggle room, it could save us the trouble of a follow-up KIP when
> > tweaking behavior in the future. For example, to reduce discovery time
> for
> > a migrated Kafka cluster, it could be nice to re-bootstrap after a
> > connection attempt has failed for every currently-known broker with no
> > successful attempts in between, instead of waiting for the reconnection
> > backoff interval to kick in. Again, I don't think this needs to happen
> with
> > the initial implementation of the KIP, I just want us to be able to
> explore
> > options like this in the future.
> >
> > 2. In a similar vein, I think we can leave more room in our definition of
> > re-bootstrapping. Instead of "During the rebootstrap process, the client
> > forgets the brokers it knows about and falls back on the bootstrap
> brokers
> > (i.e. provided by bootstrap.servers provided by the client configuration)
> > as if it had just been initialized.", we could say something like "During
> > the rebootstrap process, the client attempts to re-contact the bootstrap
> > servers (i.e. ...) that it contacted during initialization." This could
> be
> > useful if we want to add the bootstrap servers to the previously-known
> list
> > of brokers instead of completely discarding the previously-known set.
> This
> > too can be left out of the initial implementation and just give us a bit
> > more room for future options.
> >
> > Cheers,
> >
> > Chris
> >
> > On Tue, Apr 9, 2024 at 11:51 AM Andrew Schofield <
> andrew_schofi...@live.com>
> > wrote:
> >
> > > Hi Ivan,
> > > I think you have to go one way or the other with the cluster ID, so I
> > > think removing that from this KIP might
> > > be the best. I think there’s another KIP waiting to be written for
> > > ensuring consistency of clusters, but
> > > I think that wouldn’t conflict at all with this one.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > > On 9 Apr 2024, at 19:11, Ivan Yurchenko  wrote:
> > > >
> > > > Hi Andrew and all,
> > > >
> > > > I looked deeper into the code [1] and it seems the Metadata class is
> OK
> > > with cluster ID changing. So I'm thinking that the rebootstrapping
> > shouldn't introduce a new failure mode here. And I should remove the
> > mention of these cluster ID checks from the KIP.
> > > >
> > > > Best,
> > > > Ivan
> > > >
> > > > [1]
> > >
> https://github.com/apache/kafka/blob/ff90f78c700c582f9800013faad827c36b45ceb7/clients/src/main/java/org/apache/kafka/clients/Metadata.java#L355
> > > >
> > > > On Tue, Apr 9, 2024, at 09:28, Andrew Schofield wrote:
> > > >> Hi Ivan,
> > > >> Thanks for the KIP. I can see situations in which this would be
> > > helpful. I have one question.
> > > >>
> > > >> The KIP says the client checks the cluster ID when it re-bootstraps
> and
> > > that it will fail if the
> > > >> cluster ID doesn’t match the previously known one. How does it fail?

Re: [VOTE] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-12 Thread Chris Egerton
+1 (binding), thanks Omnia!

On Fri, Apr 12, 2024, 03:46 Mickael Maison  wrote:

> Hi Omnia,
>
> +1 (binding), thanks for the KIP!
>
> Mickael
>
> On Fri, Apr 12, 2024 at 9:01 AM Omnia Ibrahim 
> wrote:
> >
> > Hi everyone, I would like to start a voting thread for KIP-1031: Control
> offset translation in MirrorSourceConnector
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1031%3A+Control+offset+translation+in+MirrorSourceConnector
> >
> > For comments or feedback please check the discussion thread here
> https://lists.apache.org/thread/ym6zr0wrhglft5c000x9c8ych098s7h6
> >
> > Thanks
> > Omnia
> >
>


Re: [DISCUSS] KIP-899: Allow clients to rebootstrap

2024-04-11 Thread Chris Egerton
> >>>> I've made the following changes to the KIP based on the comments:
> >>>>
> >>>> 1. Reduced the scope to producer and consumer clients only.
> >>>> 2. Added more details to the description of the rebootstrap process.
> >>>> 3. Documented the role of low values of reconnect.backoff.max.ms in
> >>>> preventing rebootstrapping.
> >>>> 4. Some wording changes.
> >>>>
> >>>> You can see the changes in the history [1]
> >>>>
> >>>> I'm planning to put the KIP to a vote in some days if there are no new
> >>>> comments.
> >>>>
> >>>> Thank you!
> >>>>
> >>>> Ivan
> >>>>
> >>>> [1]
> >>>>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=240881396=9=5
> >>>>
> >>>> On Tue, 30 May 2023 at 08:23, Ivan Yurchenko 
> >>>> wrote:
> >>>>
> >>>>> Hi Chris and all,
> >>>>>
> >>>>>> I believe the logic you've linked is only applicable for the
> producer and
> >>>>>> consumer clients; the admin client does something different (see
> [1]).
> >>>>>
> >>>>> I see, thank you for the pointer. It seems the admin client is fairly
> >>>>> different from the producer and consumer. Probably it makes sense to
> reduce
> >>>>> the scope of the KIP to the producer and consumer clients only.
> >>>>>
> >>>>>> it'd be nice to have a definition of when re-bootstrapping
> >>>>>> would occur that doesn't rely on internal implementation details.
> What
> >>>>>> user-visible phenomena can we identify that would lead to a
> >>>>>> re-bootstrapping?
> >>>>>
> >>>>> Let's put it this way: "Re-bootstrapping means that the client
> forgets
> >>>>> about nodes it knows about and falls back on the bootstrap nodes as
> if it
> >>>>> had just been initialized. Re-bootstrapping happens when, during a
> metadata
> >>>>> update (which may be scheduled by `metadata.max.age.ms` or caused by
> >>>>> certain error responses like NOT_LEADER_OR_FOLLOWER,
> REPLICA_NOT_AVAILABLE,
> >>>>> etc.), the client doesn't have a node with an established connection
> or
> >>>>> establishable connection."
> >>>>> Does this sound good?
> >>>>>
> >>>>>> I also believe that if someone has "
> >>>>>> reconnect.backoff.max.ms" set to a low-enough value,
> >>>>>> NetworkClient::leastLoadedNode may never return null. In that case,
> >>>>>> shouldn't we still attempt a re-bootstrap at some point (if the
> user has
> >>>>>> enabled this feature)?
> >>>>>
> >>>>> Yes, you're right. Particularly `canConnect` here [1] can always be
> >>>>> returning `true` if `reconnect.backoff.max.ms` is low enough.
> >>>>> It seems pretty difficult to find a good criterion for when
> re-bootstrapping
> >>>>> should be forced in this case, so it'd be difficult to configure and
> reason
> >>>>> about. I think it's worth mentioning in the KIP and later in the
> >>>>> documentation, but we should not try to do anything special here.
> >>>>>
> >>>>>> Would it make sense to re-bootstrap only after "
> >>>>>> metadata.max.age.ms" has elapsed since the last metadata update,
> and
> >>>>> when
> >>>>>> at least one request has been made to contact each known server and
> been
> >>>>>> met with failure?
> >>>>>
> >>>>> The first condition is satisfied by the check in the beginning of
> >>>>> `maybeUpdate` [2].
> >>>>> It seems to me, the second one is also satisfied by
> `leastLoadedNode`.
> >>>>> Admittedly, it's more relaxed than you propose: it tracks
> unavailability of
> >>>>> nodes that was detected by all types of requests, not only by
> metadata
> >>>>> requests.
> >>>>> What do you think, would this be enough?
> >>>>>
> >>>>> [1]
> >>>>>
> https://github.com/apache/kafka/blob/c9a42c85e2c903329b3550181d230527e90e3646/clients/src/m

Re: [VOTE] KIP-477: Add PATCH method for connector config in Connect REST API

2024-04-08 Thread Chris Egerton
Thanks Ivan! +1 (binding) from me.

On Mon, Apr 8, 2024, 06:59 Ivan Yurchenko  wrote:

> Hello!
>
> I'd like to put the subj KIP[1] to a vote. Thank you.
>
> Best regards,
> Ivan
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-477%3A+Add+PATCH+method+for+connector+config+in+Connect+REST+API
>


Re: [DISCUSS] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-07 Thread Chris Egerton
Hi Omnia,

Ah, good catch. I think failing to start the checkpoint connector if offset
syncs are disabled is fine. We'd probably want to do that via the
Connector::validate [1] method in order to be able to catch invalid configs
during preflight validation, but it's not necessary to get that specific in
the KIP (especially since we may add other checks as well).

[1] -
https://kafka.apache.org/37/javadoc/org/apache/kafka/connect/connector/Connector.html#validate(java.util.Map)

Cheers,

Chris
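
A standalone sketch of the preflight check described above, in plain Java outside the real Connector::validate machinery. The config names come from this thread, and the defaults mirror those Omnia states (checkpoints enabled by default, group offset sync disabled by default); treat the error message and method shape as illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class CheckpointPreflightSketch {

    // Returns an error message for an invalid combination, or null if valid.
    static String checkOffsetSyncConfig(Map<String, String> config) {
        boolean syncsEnabled = Boolean.parseBoolean(
                config.getOrDefault("emit.offset-syncs.enabled", "true"));
        boolean checkpointsEnabled = Boolean.parseBoolean(
                config.getOrDefault("emit.checkpoints.enabled", "true"));
        boolean groupOffsetsEnabled = Boolean.parseBoolean(
                config.getOrDefault("sync.group.offsets.enabled", "false"));
        if (!syncsEnabled && (checkpointsEnabled || groupOffsetsEnabled))
            return "emit.offset-syncs.enabled=false conflicts with "
                    + "emit.checkpoints.enabled/sync.group.offsets.enabled";
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("emit.offset-syncs.enabled", "false");
        // emit.checkpoints.enabled defaults to true, so this combination
        // should be rejected during preflight validation.
        System.out.println(
                checkOffsetSyncConfig(config) != null ? "invalid" : "valid");
    }
}
```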

On Thu, Apr 4, 2024 at 8:07 PM Omnia Ibrahim 
wrote:

> Thanks Chris for the feedback
> > 1. It'd be nice to mention that increasing the max offset lag to INT_MAX
> > could work as a partial workaround for users on existing versions (though
> > of course this wouldn't prevent creation of the syncs topic).
> I updated the KIP
>
> > 2. Will it be illegal to disable offset syncs if other features that rely
> > on offset syncs are explicitly enabled in the connector config? If
> they're
> > not explicitly enabled then it should probably be fine to silently
> disable
> > them, but I'd be interested in your thoughts.
> The rest of the features that rely on this are controlled by
> emit.checkpoints.enabled (enabled by default) and
> sync.group.offsets.enabled (disabled by default), which are part of the
> MirrorCheckpointConnector config, not MirrorSourceConnector. I was thinking
> that MirrorCheckpointConnector should fail to start if
> emit.offset-syncs.enabled is disabled while emit.checkpoints.enabled and/or
> sync.group.offsets.enabled are enabled, as there is no point in creating
> this connector if the main part is disabled. WDYT?
>
> Thanks
> Omnia
>
> > On 3 Apr 2024, at 12:45, Chris Egerton  wrote:
> >
> > Hi Omnia,
> >
> > Thanks for the KIP! Two small things come to mind:
> >
> > 1. It'd be nice to mention that increasing the max offset lag to INT_MAX
> > could work as a partial workaround for users on existing versions (though
> > of course this wouldn't prevent creation of the syncs topic).
> >
> > 2. Will it be illegal to disable offset syncs if other features that rely
> > on offset syncs are explicitly enabled in the connector config? If
> they're
> > not explicitly enabled then it should probably be fine to silently
> disable
> > them, but I'd be interested in your thoughts.
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, Apr 3, 2024, 20:41 Luke Chen  wrote:
> >
> >> Hi Omnia,
> >>
> >> Thanks for the KIP!
> >> It LGTM!
> >> But I'm not an expert of MM2, it would be good to see if there is any
> other
> >> comment from MM2 experts.
> >>
> >> Thanks.
> >> Luke
> >>
> >> On Thu, Mar 14, 2024 at 6:08 PM Omnia Ibrahim 
> >> wrote:
> >>
> >>> Hi everyone, I would like to start a discussion thread for KIP-1031:
> >>> Control offset translation in MirrorSourceConnector
> >>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1031%3A+Control+offset+translation+in+MirrorSourceConnector
> >>>
> >>> Thanks
> >>> Omnia
> >>>
> >>
>
>


Re: [DISCUSS] KIP-1031: Control offset translation in MirrorSourceConnector

2024-04-03 Thread Chris Egerton
Hi Omnia,

Thanks for the KIP! Two small things come to mind:

1. It'd be nice to mention that increasing the max offset lag to INT_MAX
could work as a partial workaround for users on existing versions (though
of course this wouldn't prevent creation of the syncs topic).

2. Will it be illegal to disable offset syncs if other features that rely
on offset syncs are explicitly enabled in the connector config? If they're
not explicitly enabled then it should probably be fine to silently disable
them, but I'd be interested in your thoughts.

Cheers,

Chris

On Wed, Apr 3, 2024, 20:41 Luke Chen  wrote:

> Hi Omnia,
>
> Thanks for the KIP!
> It LGTM!
> But I'm not an expert of MM2, it would be good to see if there is any other
> comment from MM2 experts.
>
> Thanks.
> Luke
>
> On Thu, Mar 14, 2024 at 6:08 PM Omnia Ibrahim 
> wrote:
>
> > Hi everyone, I would like to start a discussion thread for KIP-1031:
> > Control offset translation in MirrorSourceConnector
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1031%3A+Control+offset+translation+in+MirrorSourceConnector
> >
> > Thanks
> > Omnia
> >
>


Re: [DISCUSS] KIP-477: Add PATCH method for connector config in Connect REST API

2024-03-28 Thread Chris Egerton
 No further comments from me.

On Thu, Mar 28, 2024, 19:36 Ivan Yurchenko  wrote:

> Hello Chris,
>
> Thanks for your feedback. I created the jira and also updated the
> description a bit mentioning that other similar race scenarios exist and
> the KIP is not trying to solve them.
>
> Best,
> Ivan
>
> On Wed, Mar 27, 2024, at 17:08, Chris Egerton wrote:
> > Hi Ivan,
> >
> > Thanks for the updates. LGTM!
> >
> > RE atomicity: I think it should be possible and not _too_ invasive to
> > detect and handle these kinds of races by tracking the offset in the
> config
> > topic for connector configs and aborting an operation if that offset
> > changes between when the request was initiated and when the write to the
> > config topic will take place, but since the same kind of issue is also
> > possible with other connector operations (concurrent configuration PUTs,
> > for example) due to how validation is split out into a separate thread, I
> > agree that it's not worth blocking the KIP on fixing this.
> >
> > One final nit: Can you update the Jira ticket link in the KIP?
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, Mar 27, 2024 at 2:56 PM Ivan Yurchenko  wrote:
> >
> > > Hi,
> > >
> > > I updated the KIP with the two following changes:
> > > 1. Using `null` values as tombstone value for removing existing fields
> > > from configuration.
> > > 2. Added a note about the lack of 100% atomicity, which seems very
> > > difficult to achieve practically.
> > >
> > > Ivan
> > >
> > >
> > > On Tue, Mar 26, 2024, at 14:45, Ivan Yurchenko wrote:
> > > > Speaking of the Chris' comment
> > > >
> > > > > One thought that comes to mind is that sometimes it may be useful
> to
> > > > > explicitly remove properties from a connector configuration. We
> might
> > > > > permit this by allowing users to specify null (the JSON literal,
> not a
> > > > > string containing the characters "null") as the value for
> to-be-removed
> > > > > properties.
> > > >
> > > > This actually makes sense. AFAIU, `null` cannot be in the connector
> > > config since https://github.com/apache/kafka/pull/11333, so using it
> as a
> > > tombstone value is a good idea. I can update the KIP.
> > > >
> > > > Ivan
> > > >
> > > >
> > > > On Tue, Mar 26, 2024, at 14:19, Ivan Yurchenko wrote:
> > > > > Hi all,
> > > > >
> > > > > This KIP is a bit old now :) but I think its context hasn't changed
> > > much since then and the KIP is still valid. I would like to finally
> bring
> > > it to some conclusion.
> > > > >
> > > > > Best,
> > > > > Ivan
> > > > >
> > > > > On 2021/07/12 14:49:47 Chris Egerton wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > Know it's been a while for this KIP but in my personal experience
> > > the value
> > > > > > of a PATCH method in the REST API has actually increased over
> time
> > > and I'd
> > > > > > love to have this option for quick, stop-the-bleeding remediation
> > > efforts.
> > > > > >
> > > > > > One thought that comes to mind is that sometimes it may be
> useful to
> > > > > > explicitly remove properties from a connector configuration. We
> might
> > > > > > permit this by allowing users to specify null (the JSON literal,
> not
> > > a
> > > > > > string containing the characters "null") as the value for
> > > to-be-removed
> > > > > > properties.
> > > > > >
> > > > > > I'd love to see this change if you're still interested in driving
> > > it, Ivan.
> > > > > > Hopefully we can give it the attention it deserves in the
> upcoming
> > > months!
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > > On Fri, Jun 28, 2019 at 4:56 AM Ivan Yurchenko 
> > > > > > wrote:
> > > > > >
> > > > > > > Thank you for your feedback Ryanne!
> > > > > > > These are all surely valid concerns and PATCH isn't really
> > > necessa

Re: [DISCUSS] KIP-477: Add PATCH method for connector config in Connect REST API

2024-03-27 Thread Chris Egerton
Hi Ivan,

Thanks for the updates. LGTM!

RE atomicity: I think it should be possible and not _too_ invasive to
detect and handle these kinds of races by tracking the offset in the config
topic for connector configs and aborting an operation if that offset
changes between when the request was initiated and when the write to the
config topic will take place, but since the same kind of issue is also
possible with other connector operations (concurrent configuration PUTs,
for example) due to how validation is split out into a separate thread, I
agree that it's not worth blocking the KIP on fixing this.

One final nit: Can you update the Jira ticket link in the KIP?

Cheers,

Chris
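
To illustrate the tombstone semantics discussed in this thread, a request against `PATCH /connectors/{connector}/config` might carry a body like the following (the property values are made up): the first two entries set or update properties, while the JSON `null` removes the "transforms" property from the existing configuration.

```json
{
  "tasks.max": "4",
  "topics": "events,logs",
  "transforms": null
}
```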

On Wed, Mar 27, 2024 at 2:56 PM Ivan Yurchenko  wrote:

> Hi,
>
> I updated the KIP with the two following changes:
> 1. Using `null` values as tombstone value for removing existing fields
> from configuration.
> 2. Added a note about the lack of 100% atomicity, which seems very
> difficult to achieve practically.
>
> Ivan
>
>
> On Tue, Mar 26, 2024, at 14:45, Ivan Yurchenko wrote:
> > Speaking of the Chris' comment
> >
> > > One thought that comes to mind is that sometimes it may be useful to
> > > explicitly remove properties from a connector configuration. We might
> > > permit this by allowing users to specify null (the JSON literal, not a
> > > string containing the characters "null") as the value for to-be-removed
> > > properties.
> >
> > This actually makes sense. AFAIU, `null` cannot be in the connector
> config since https://github.com/apache/kafka/pull/11333, so using it as a
> tombstone value is a good idea. I can update the KIP.
> >
> > Ivan
> >
> >
> > On Tue, Mar 26, 2024, at 14:19, Ivan Yurchenko wrote:
> > > Hi all,
> > >
> > > This KIP is a bit old now :) but I think its context hasn't changed
> much since then and the KIP is still valid. I would like to finally bring
> it to some conclusion.
> > >
> > > Best,
> > > Ivan
> > >
> > > On 2021/07/12 14:49:47 Chris Egerton wrote:
> > > > Hi all,
> > > >
> > > > Know it's been a while for this KIP but in my personal experience
> the value
> > > > of a PATCH method in the REST API has actually increased over time
> and I'd
> > > > love to have this option for quick, stop-the-bleeding remediation
> efforts.
> > > >
> > > > One thought that comes to mind is that sometimes it may be useful to
> > > > explicitly remove properties from a connector configuration. We might
> > > > permit this by allowing users to specify null (the JSON literal, not
> a
> > > > string containing the characters "null") as the value for
> to-be-removed
> > > > properties.
> > > >
> > > > I'd love to see this change if you're still interested in driving
> it, Ivan.
> > > > Hopefully we can give it the attention it deserves in the upcoming
> months!
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Fri, Jun 28, 2019 at 4:56 AM Ivan Yurchenko 
> > > > wrote:
> > > >
> > > > > Thank you for your feedback Ryanne!
> > > > > These are all surely valid concerns and PATCH isn't really
> necessary or
> > > > > suitable for normal production configuration management. However,
> there are
> > > > > cases where quick patching of the configuration is useful, such as
> hot
> > > > > fixes of production or in development.
> > > > >
> > > > > Overall, the change itself is really tiny and if the cost-benefit
> balance
> > > > > is positive, I'd still like to drive it further.
> > > > >
> > > > > Ivan
> > > > >
> > > > > On Wed, 26 Jun 2019 at 17:45, Ryanne Dolan 
> wrote:
> > > > >
> > > > > > Ivan, I looked at adding PATCH a while ago as well. I decided
> not to
> > > > > pursue
> > > > > > the idea for a few reasons:
> > > > > >
> > > > > > 1) PATCH is still racy. For example, if you want to add a topic
> to the
> > > > > > "topics" property, you still need to read, modify, and write the
> existing
> > > > > > value. To handle this, you'd need to support atomic sub-document
> > > > > > operations, which I don't see happening.
> > > > > >
> > > > > > 2) A common patter

[jira] [Resolved] (KAFKA-16423) Ignoring offset partition key with an unexpected format for the first element in the partition key list. Expected type: java.lang.String, actual type: null"

2024-03-26 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16423.
---
Fix Version/s: (was: 3.6.2)
   (was: 3.8.0)
   (was: 3.7.1)
   Resolution: Duplicate

> Ignoring offset partition key with an unexpected format for the first element 
> in the partition key list. Expected type: java.lang.String, actual type: null"
> 
>
> Key: KAFKA-16423
> URL: https://issues.apache.org/jira/browse/KAFKA-16423
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 3.5.0, 3.6.0, 3.5.1, 3.5.2, 3.7.0, 3.6.1, 3.8.0
>Reporter: johndoe
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16423) Ignoring offset partition key with an unexpected format for the first element in the partition key list. Expected type: java.lang.String, actual type: null"

2024-03-26 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-16423.
---
Resolution: Duplicate

> Ignoring offset partition key with an unexpected format for the first element 
> in the partition key list. Expected type: java.lang.String, actual type: null"
> 
>
> Key: KAFKA-16423
> URL: https://issues.apache.org/jira/browse/KAFKA-16423
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 3.5.0, 3.6.0, 3.5.1, 3.5.2, 3.7.0, 3.6.1, 3.8.0
>Reporter: johndoe
>Assignee: johndoe
>Priority: Minor
> Fix For: 3.6.2, 3.8.0, 3.7.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-16423) Ignoring offset partition key with an unexpected format for the first element in the partition key list. Expected type: java.lang.String, actual type: null"

2024-03-26 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton reopened KAFKA-16423:
---
  Assignee: (was: johndoe)

> Ignoring offset partition key with an unexpected format for the first element 
> in the partition key list. Expected type: java.lang.String, actual type: null"
> 
>
> Key: KAFKA-16423
> URL: https://issues.apache.org/jira/browse/KAFKA-16423
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Affects Versions: 3.5.0, 3.6.0, 3.5.1, 3.5.2, 3.7.0, 3.6.1, 3.8.0
>Reporter: johndoe
>Priority: Minor
> Fix For: 3.6.2, 3.8.0, 3.7.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16392) Spurious log warnings: "Ignoring offset partition key with an unexpected format for the second element in the partition key list. Expected type: java.util.Map, actual ty

2024-03-20 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-16392:
-

 Summary: Spurious log warnings: "Ignoring offset partition key 
with an unexpected format for the second element in the partition key list. 
Expected type: java.util.Map, actual type: null"
 Key: KAFKA-16392
 URL: https://issues.apache.org/jira/browse/KAFKA-16392
 Project: Kafka
  Issue Type: Bug
  Components: connect
Affects Versions: 3.6.1, 3.7.0, 3.5.2, 3.5.1, 3.6.0, 3.5.0, 3.8.0
Reporter: Chris Egerton
Assignee: Chris Egerton


Some source connectors choose not to specify source offsets with the records 
they emit (or rather, to provide null partitions/offsets). When these 
partitions are parsed by a Kafka Connect worker, this currently leads to a 
spurious warning log message.
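
The fix amounts to treating a null partition as valid rather than warning. A standalone sketch of a null-tolerant check (not the actual worker code; the key shape — a two-element list of connector name and partition map — is inferred from the warning messages quoted in these tickets):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class OffsetKeyParseSketch {
    // Returns true if the deserialized offset partition key looks well-formed.
    // A null second element (the connector provided no source partition) is
    // accepted silently instead of triggering a spurious warning.
    static boolean isWellFormedKey(List<?> key) {
        if (key == null || key.size() != 2)
            return false;
        if (!(key.get(0) instanceof String))
            return false;
        Object partition = key.get(1);
        return partition == null || partition instanceof Map;
    }

    public static void main(String[] args) {
        // A connector that emits records with null source partitions.
        System.out.println(isWellFormedKey(Arrays.asList("my-connector", null)));
    }
}
```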



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-995: Allow users to specify initial offsets while creating connectors

2024-03-06 Thread Chris Egerton
Hi Yash,

Thanks for the follow-up, I like the benefits it's yielded. I too think
"offsets_status" would be a better name for the response field.
@Ashwin--thoughts?

Cheers,

Chris
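
For concreteness, a connector-create request under this KIP might look roughly like the following. The connector name, config, and partition/offset structure are all illustrative (the offset shape mirrors the existing `GET /connectors/{connector}/offsets` format), and the response field name is still being discussed between "initial_offsets_response" and "offsets_status":

```json
{
  "name": "example-source",
  "config": {
    "connector.class": "org.example.ExampleSourceConnector",
    "tasks.max": "1"
  },
  "initial_offsets": [
    {
      "partition": { "file": "/tmp/example.txt" },
      "offset": { "position": 100 }
    }
  ]
}
```

On success, the response would then carry the created connector's info plus the extra field describing how the initial offsets were handled.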


On Wed, Mar 6, 2024, 03:08 Ashwin  wrote:

> Thanks Yash,
>
> Yes, I think we can use @JsonInclude(JsonInclude.Include.NON_NULL) to
> exclude “initial_offsets_response” from the create response if offsets are
> not specified.
>
> I’ll close the voting this week, if there are no further comments.
>
> Thanks for voting, everyone!
>
>
> Ashwin
>
> On Tue, Mar 5, 2024 at 11:20 PM Yash Mayya  wrote:
>
> > Hi Chris,
> >
> > I followed up with Ashwin offline and I believe he wanted to take a
> closer
> > look at the `ConnectorInfoWithInitialOffsetsResponse` stuff he mentioned
> in
> > the previous email and whether or not that'll be required (alternatively
> > using some Jackson JSON tricks). However, that's an implementation detail
> > and shouldn't hold up the KIP. Bikeshedding a little on the
> > "initial_offsets_response" field - I'm wondering if something like
> > "offsets_status" might be more appropriate, what do you think? I don't
> > think the current name is terrible though, so I'm +1 (binding) if
> everyone
> > else agrees that it's suitable.
> >
> > Thanks,
> > Yash
> >
> > On Tue, Mar 5, 2024 at 9:51 PM Chris Egerton 
> > wrote:
> >
> > > Hi all,
> > >
> > > Wanted to bump this and see if it looks good enough for a third vote.
> > Yash,
> > > any thoughts?
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Mon, Jan 29, 2024 at 2:55 AM Ashwin 
> > > wrote:
> > >
> > > > Thanks for reviewing this KIP,  Yash.
> > > >
> > > > Could you please elaborate on the cleanup steps? For instance, if we
> > > > > encounter an error after wiping existing offsets but before writing
> > the
> > > > new
> > > > > offsets, there's not really any good way to "revert" the wiped
> > offsets.
> > > > > It's definitely extremely unlikely that a user would expect the
> > > previous
> > > > > offsets for a connector to still be present (by creating a new
> > > connector
> > > > > with the same name but without initial offsets for instance) after
> > > such a
> > > > > failed operation, but it would still be good to call this out
> > > > explicitly. I
> > > > > presume that we'd want to wipe the newly written initial offsets if
> > we
> > > > fail
> > > > > while writing the connector's config however?
> > > >
> > > >
> > > > Agree - I have clarified the cleanup here -
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors#KIP995:Allowuserstospecifyinitialoffsetswhilecreatingconnectors-ProposedChanges
> > > > .
> > > >
> > > > The `PATCH /connectors/{connector}/offsets` and `DELETE
> > > > > /connectors/{connector}/offsets` endpoints have two possible
> success
> > > > > messages in the response depending on whether or not the connector
> > > plugin
> > > > > has implemented the `alterOffsets` connector method. Since we're
> > > > proposing
> > > > > to utilize the same offset validation during connector creation if
> > > > initial
> > > > > offsets are specified, I think it would be valuable to surface
> > similar
> > > > > information to users here as well
> > > >
> > > >
> > > > Thanks for pointing this out. I have updated the response to include
> a
> > > new
> > > > field “initial_offsets_response” which will contain the response
> based
> > on
> > > > whether the connector implements alterOffsets or not. This also means
> > > that
> > > > if initial_offsets is set in the ConnectorCreate request, we will
> > return
> > > a
> > > > new REST entity (ConnectorInfoWithInitialOffsetsResponse ?) which
> will
> > > be a
> > > > child class of ConnectorInfo.
> > > >
> > > > (
> > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/entities/ConnectorInfo.java#L28-L28
>

Re: [VOTE] KIP-995: Allow users to specify initial offsets while creating connectors

2024-03-05 Thread Chris Egerton
Hi all,

Wanted to bump this and see if it looks good enough for a third vote. Yash,
any thoughts?

Cheers,

Chris

On Mon, Jan 29, 2024 at 2:55 AM Ashwin  wrote:

> Thanks for reviewing this KIP,  Yash.
>
> Could you please elaborate on the cleanup steps? For instance, if we
> > encounter an error after wiping existing offsets but before writing the
> new
> > offsets, there's not really any good way to "revert" the wiped offsets.
> > It's definitely extremely unlikely that a user would expect the previous
> > offsets for a connector to still be present (by creating a new connector
> > with the same name but without initial offsets for instance) after such a
> > failed operation, but it would still be good to call this out
> explicitly. I
> > presume that we'd want to wipe the newly written initial offsets if we
> fail
> > while writing the connector's config however?
>
>
> Agree - I have clarified the cleanup here -
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors#KIP995:Allowuserstospecifyinitialoffsetswhilecreatingconnectors-ProposedChanges
> .
>
> The `PATCH /connectors/{connector}/offsets` and `DELETE
> > /connectors/{connector}/offsets` endpoints have two possible success
> > messages in the response depending on whether or not the connector plugin
> > has implemented the `alterOffsets` connector method. Since we're
> proposing
> > to utilize the same offset validation during connector creation if
> initial
> > offsets are specified, I think it would be valuable to surface similar
> > information to users here as well
>
>
> Thanks for pointing this out. I have updated the response to include a new
> field “initial_offsets_response” which will contain the response based on
> whether the connector implements alterOffsets or not. This also means that
> if initial_offsets is set in the ConnectorCreate request, we will return a
> new REST entity (ConnectorInfoWithInitialOffsetsResponse ?) which will be a
> child class of ConnectorInfo.
>
> (
>
> https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/entities/ConnectorInfo.java#L28-L28
> )
>
> Thanks,
> Ashwin
>
> On Wed, Jan 17, 2024 at 4:48 PM Yash Mayya  wrote:
>
> > Hi Ashwin,
> >
> > Thanks for the KIP.
> >
> > > If Connect runtime encounters an error in any of these steps,
> > > it will cleanup (if required) and return an error response
> >
> > Could you please elaborate on the cleanup steps? For instance, if we
> > encounter an error after wiping existing offsets but before writing the
> new
> > offsets, there's not really any good way to "revert" the wiped offsets.
> > It's definitely extremely unlikely that a user would expect the previous
> > offsets for a connector to still be present (by creating a new connector
> > with the same name but without initial offsets for instance) after such a
> > failed operation, but it would still be good to call this out
> explicitly. I
> > presume that we'd want to wipe the newly written initial offsets if we
> fail
> > while writing the connector's config however?
> >
> > > Validate the offset using the same checks performed while
> > > altering connector offsets (PATCH /$connector/offsets ) as
> > > specified in KIP-875
> >
> > The `PATCH /connectors/{connector}/offsets` and `DELETE
> > /connectors/{connector}/offsets` endpoints have two possible success
> > messages in the response depending on whether or not the connector plugin
> > has implemented the `alterOffsets` connector method. Since we're
> proposing
> > to utilize the same offset validation during connector creation if
> initial
> > offsets are specified, I think it would be valuable to surface similar
> > information to users here as well. Thoughts?
> >
> > Thanks,
> > Yash
> >
> > On Wed, Jan 17, 2024 at 3:31 PM Ashwin 
> > wrote:
> >
> > > Hi All ,
> > >
> > > Can I please get one more binding vote, so that the KIP is approved ?
> > > Thanks for the votes Chris and Mickael !
> > >
> > >
> > > - Ashwin
> > >
> > >
> > > On Thu, Jan 11, 2024 at 3:55 PM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Ashwin,
> > > >
> > > > +1 (binding), thanks for the KIP
> > > >
> > > > Mickael
> > > >
> > > > On Tue, Jan 9, 2024 at 4:54 PM Chris Egerton  >
> > > > wrote:
> > > > >
> > > > > Thanks for the KIP! +1 (binding)
> > > > >
> > > > > On Mon, Jan 8, 2024 at 9:35 AM Ashwin  >
> > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I would like to start  a vote on KIP-995.
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors
> > > > > >
> > > > > > Discussion thread -
> > > > > > https://lists.apache.org/thread/msorbr63scglf4484yq764v7klsj7c4j
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Ashwin
> > > > > >
> > > >
> > >
> >
>


Re: Shortened URLs for KIPs?

2024-02-28 Thread Chris Egerton
Hi Kirk,

Interesting proposal! I gave it a shot with one of my own prior KIPs and
was able to generate https://s.apache.org/kip-618 for it.

It looks like uppercase characters aren't permitted for URL IDs (even
though the regex listed in that text box does claim to allow them).

I can't commit to doing this for every KIP in perpetuity, but I wouldn't
mind giving it a shot on at least a trial basis unless any of my colleagues
have objections.

Best,

Chris

On Wed, Feb 28, 2024 at 5:45 PM Kirk True  wrote:

> I just found https://s.apache.org/, which is an Apache shortened URL
> service. That might provide the needed infrastructure, but it requires a
> login, so someone (a committer?) would need to create one for each KIP :(
>
> > On Feb 28, 2024, at 2:40 PM, Kirk True  wrote:
> >
> > Hi all,
> >
> > Is it possible to set up shortened URLs for KIPs? So instead of, say:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
> >
> > We could refer to it as:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-966
> >
> > Or maybe even go so far as to have something like:
> >
> > https://kafka.apache.org/kips/KIP-966
> >
> > I know the wiki has a way to generate a short URL (e.g.
> https://cwiki.apache.org/confluence/x/mpOzDw), but, IMO, it’s so opaque
> as to be nearly worthless.
> >
> > Pros:
> >
> > 1. Succinct: great for written documentation
> > 2. Discoverability: it’s predictable and easy to find
> > 3. Robust: the URL doesn’t break when the KIP title changes
> >
> > Cons:
> >
> > 1. Time
> > 2. Money
> > 3. Perpetual maintenance: requires 100% commitment indefinitely
> >
> > I know the list of cons is probably much more than I realize. At this
> point I’m just wondering if it’s even a desired mechanism.
> >
> > Thoughts?
> >
> > Thanks,
> > Kirk
>
>


Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-27 Thread Chris Egerton
Thanks for running this release, Stanislav! And thanks to all the
contributors who helped implement all the bug fixes and new features we got
to put out this time around.

On Tue, Feb 27, 2024, 13:03 Stanislav Kozlovski <
stanislavkozlov...@apache.org> wrote:

> The Apache Kafka community is pleased to announce the release of
> Apache Kafka 3.7.0
>
> This is a minor release that includes new features, fixes, and
> improvements from 296 JIRAs
>
> An overview of the release and its notable changes can be found in the
> release blog post:
> https://kafka.apache.org/blog#apache_kafka_370_release_announcement
>
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
>
> You can download the source and binary release (Scala 2.12, 2.13) from:
> https://kafka.apache.org/downloads#3.7.0
>
>
> ---
>
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
>
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming the
> input streams to output streams.
>
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
>
> ** Building real-time streaming applications that transform or react
> to the streams of data.
>
>
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
>
> A big thank you to the following 146 contributors to this release!
> (Please report an unintended omission)
>
> Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
> Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
> Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
> J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
> Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
> Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
> Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
> Chauhan, Qichao Chu, Matthias J. Sax, Rajini Sivaram, Renaldo Baur
> Filho, Ritika Reddy, Robert Wagner, Rohan, Ron Dagostino, Roon, runom,
> Ruslan Krivoshein, rykovsi, Sagar Rao, Said Boudjelda, Satish Duggana,
> shuoer86, Stanislav Kozlovski, Taher Ghaleb, Tang Yunzi, TapDang,
> Taras Ledkov, tkuramoto33, Tyler Bertrand, vamossagar12, Vedarth
> Sharma, Viktor Somogyi-Vass, Vincent Jiang, Walker Carlson,
> Wuzhengyu97, Xavier Léauté, Xiaobing Fang, yangy, Ritika Reddy,
> Yanming Zhou, Yash Mayya, yuyli, zhaohaidao, Zihao Lin, Ziming Deng
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kafka.apache.org/
>
> Thank you!
>
>
> Regards,
>
> Stanislav Kozlovski
> Release Manager for Apache Kafka 3.7.0
>


Re: [DISCUSS] Apache Kafka 3.8.0 release

2024-02-26 Thread Chris Egerton
Thanks Josep, I'm +1 as well.

On Mon, Feb 26, 2024 at 12:32 PM Justine Olshan
 wrote:

> Thanks Joesp. +1 from me.
>
> On Mon, Feb 26, 2024 at 3:37 AM Josep Prat 
> wrote:
>
> > Hi all,
> >
> > I'd like to volunteer as release manager for the Apache Kafka 3.8.0
> > release.
> > If there are no objections, I'll start building a release plan (or
> adapting
> > the one Colin made some weeks ago) in the wiki in the next days.
> >
> > Thank you.
> >
> > --
> >
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io    |   <
> https://www.facebook.com/aivencloud
> > >
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
>


Re: Github build queue

2024-02-09 Thread Chris Egerton
+1, would love this.

On Fri, Feb 9, 2024, 10:04 David Arthur  wrote:

> Hey folks,
>
> I recently learned about Github's Merge Queue feature, and I think it could
> help us out.
>
> Essentially, when you hit the Merge button on a PR, it will add the PR to a
> queue and let you run a CI job before merging. Just something simple like
> compile + static analysis would probably save us from a lot of headaches on
> trunk.
>
> I can think of two situations this would help us avoid:
> * Two valid PRs are merged near one another, but they create a code
> breakage (rare)
> * A quick little "fixup" commit on a PR actually breaks something (less
> rare)
>
> Looking at our Github stats, we are averaging under 40 commits per week.
> Assuming those primarily come in on weekdays, that's 8 commits per day. If
> we just run "gradlew check -x tests" for the merge queue job, I don't think
> we'd get backlogged.
>
> Thoughts?
> David
>
>
>
>
> --
> David Arthur
>


[jira] [Resolved] (KAFKA-15575) Prevent Connectors from exceeding tasks.max configuration

2024-02-01 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-15575.
---
Fix Version/s: 3.8.0
   Resolution: Fixed

> Prevent Connectors from exceeding tasks.max configuration
> -
>
> Key: KAFKA-15575
> URL: https://issues.apache.org/jira/browse/KAFKA-15575
> Project: Kafka
>  Issue Type: Task
>  Components: connect
>Reporter: Greg Harris
>    Assignee: Chris Egerton
>Priority: Minor
>  Labels: kip
> Fix For: 3.8.0
>
>
> The Connector::taskConfigs(int maxTasks) function is used by Connectors to 
> enumerate task configurations. It takes an argument which comes from the 
> tasks.max connector config. This is the Javadoc for that method:
> {noformat}
> /**
>  * Returns a set of configurations for Tasks based on the current 
> configuration,
>  * producing at most {@code maxTasks} configurations.
>  *
>  * @param maxTasks maximum number of configurations to generate
>  * @return configurations for Tasks
>  */
> public abstract List<Map<String, String>> taskConfigs(int maxTasks);
> {noformat}
> This includes the constraint that the number of tasks is at most maxTasks, 
> but this constraint is not enforced by the framework.
>  
> To enforce this constraint, we could begin dropping configs that exceed the 
> limit, and log a warning. For sink connectors this should harmlessly 
> rebalance the consumer subscriptions onto the remaining tasks. For source 
> connectors that distribute their work via task configs, this may result in an 
> interruption in data transfer.
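A minimal sketch of the enforcement idea described above — a hypothetical helper, not the actual Connect framework code — which truncates the connector-supplied task configs to tasks.max and warns about the excess:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TaskConfigEnforcement {

    // Drop any task configs beyond maxTasks, logging a warning so the
    // connector author can see the tasks.max contract was violated.
    public static List<Map<String, String>> enforceTasksMax(
            List<Map<String, String>> taskConfigs, int maxTasks) {
        if (taskConfigs.size() <= maxTasks)
            return taskConfigs;
        System.err.println("Connector generated " + taskConfigs.size()
                + " task configs, but tasks.max is " + maxTasks
                + "; dropping the excess configs");
        return new ArrayList<>(taskConfigs.subList(0, maxTasks));
    }

    public static void main(String[] args) {
        List<Map<String, String>> generated = new ArrayList<>();
        for (int i = 0; i < 5; i++)
            generated.add(Map.of("task.id", String.valueOf(i)));
        // tasks.max = 3: only the first three configs survive
        System.out.println(enforceTasksMax(generated, 3).size()); // prints 3
    }
}
```

As the ticket notes, for sink connectors the dropped tasks simply rebalance consumer subscriptions onto the survivors, while source connectors that partition work via task configs may see a gap in data transfer.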



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] KIP-1017: A health check endpoint for Kafka Connect

2024-01-26 Thread Chris Egerton
Hi all,

Happy Friday! I'd like to kick off discussion for KIP-1017, which (as the
title suggests) proposes adding a health check endpoint for Kafka Connect:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1017%3A+Health+check+endpoint+for+Kafka+Connect

This is one of the longest-standing issues with Kafka Connect and I'm
hoping we can finally put it in the ground soon. Looking forward to hearing
people's thoughts!

Cheers,

Chris


Re: [VOTE] KIP-1011: Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh

2024-01-24 Thread Chris Egerton
Thanks Ziming! +1 (binding)

On Wed, Jan 24, 2024, 03:23 ziming deng  wrote:

> Thank you for reminding this, David,
>
> I have mentioned this in the [Compatibility] section as a following work.
>
> --,
> Best,
> Ziming
>
> > On Jan 23, 2024, at 15:17, David Jacot 
> wrote:
> >
> > Hi Chris, Ziming,
> >
> > Thanks for the clarification. I am glad that it does not impact the tool.
> > It may be worth adding a note about it in the KIP to avoid the same
> > question in the future.
> >
> > Otherwise, I am +1 (binding). Thanks for driving this!
> >
> > Best,
> > David
> >
> > On Tue, Jan 23, 2024 at 6:07 AM ziming deng 
> > wrote:
> >
> >> Hello David,
> >>
> >> Thanks for reminding this, as Chirs explained, the tools I’m trying to
> >> update only support set/delete configs, and I’m just make a way for
> >> append/subtract configs in the future, so this would not be affected by
> >> KAFKA-10140, and it would be a little overkill to support
> append/subtract
> >> configs or solve KAFKA-10140 here, so let’s leave it right now, I'm
> happy
> >> to pick it after finishing this KIP.
> >>
> >> --,
> >> Ziming
> >>
> >>> On Jan 22, 2024, at 18:23, David Jacot 
> >> wrote:
> >>>
> >>> Hi Ziming,
> >>>
> >>> Thanks for driving this. I wanted to bring KAFKA-10140
> >>>  to your attention.
> >> It
> >>> looks like the incremental API does not work for configuring plugins. I
> >>> think that we need to cover this in the KIP.
> >>>
> >>> Best,
> >>> David
> >>>
> >>> On Mon, Jan 22, 2024 at 10:13 AM Andrew Schofield <
> >>> andrew_schofield_j...@outlook.com> wrote:
> >>>
>  +1 (non-binding)
> 
>  Thanks,
>  Andrew
> 
> > On 22 Jan 2024, at 07:29, Federico Valeri 
> >> wrote:
> >
> > +1 (non binding)
> >
> > Thanks.
> >
> > On Mon, Jan 22, 2024 at 7:03 AM Luke Chen  wrote:
> >>
> >> Hi Ziming,
> >>
> >> +1(binding) from me.
> >>
> >> Thanks.
> >> Luke
> >>
> >> On Mon, Jan 22, 2024 at 11:50 AM Kamal Chandraprakash <
> >> kamal.chandraprak...@gmail.com> wrote:
> >>
> >>> +1 (non-binding)
> >>>
> >>> On Mon, Jan 22, 2024 at 8:34 AM ziming deng <
> >> dengziming1...@gmail.com>
> >>> wrote:
> >>>
>  Hello everyone,
>  I'd like to initiate a vote for KIP-1011.
>  This KIP is about replacing alterConfigs with
> >> incrementalAlterConfigs
>  when updating broker configs using kafka-configs.sh, this is
> similar
>  to
>  what we have done in KIP-894.
> 
>  KIP link:
>  KIP-1011: Use incrementalAlterConfigs when updating broker configs
> >> by
>  kafka-configs.sh - Apache Kafka - Apache Software Foundation
>  <
> 
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> >
>  cwiki.apache.org
>  <
> 
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> >
>  <
> 
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> >
>  <
> 
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> >
> 
>  Discussion thread:
> 
> 
>  lists.apache.org
>   >
>   >
>   >
> 
> 
>  --,
>  Best,
>  Ziming
> 
> 
> 
> >>
> >>
>
>


Re: [DISCUSS] KIP-1016 Make MM2 heartbeats topic name configurable

2024-01-22 Thread Chris Egerton
Hi Berci,

Thanks for the KIP!

IMO we don't need the "default." prefix for the new property, and it
deviates a bit from the precedent set by properties like
"replication.policy.internal.topic.separator.enabled". I think we can just
call it "replication.policy.heartbeats.topic", or if we really want to be
precise, "replication.policy.heartbeats.topic.name".

Regarding multiple source->target pairs, won't we get support for this for
free if we add the new property to the DefaultReplicationPolicy class? IIRC
it's already possible to configure replication policies on a
per-replication-flow basis with that syntax, I don't see why this wouldn't
be the case for the new property.
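If the property is added to DefaultReplicationPolicy, a configuration fragment might look like the following. Note that the property name shown here is only the one floated in this thread, not a released config:

```
# Global form (hypothetical property name):
replication.policy.heartbeats.topic.name = mm2-heartbeats

# Per-flow form, following the source->target prefix convention
# already used for other replication-policy settings:
A->B.replication.policy.heartbeats.topic.name = a-to-b-heartbeats
```

As noted above, this only takes effect when DefaultReplicationPolicy (or a subclass honoring the property) is in use.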

I'm also a little hazy on the motivation for the change. Just out of
curiosity, what exactly is meant by "the "heartbeats" topics of other
systems" in the Jira ticket's description? Are we trying to better
accommodate cases where other harder-to-configure systems (like a picky
source connector, for example) create and use a "heartbeats" topic, or are
we trying to enable multiple MM2 heartbeat connectors to target the same
Kafka cluster? I can understand the former as a niche but possible scenario
and one that we can make room for, but the latter is a bit harder to
justify short of, e.g., fine-tuning the heartbeat emission interval based
on the eventual target of the replication flow that will be reading from
the heartbeats topic.

I don't raise the above to cast doubt on the KIP, really I'm just curious
about how people are using MM2.

Cheers,

Chris

On Thu, Jan 18, 2024 at 6:11 AM Kondrát Bertalan  wrote:

> Hi Viktor,
>
> Let me address your points one by one.
>
>1. The current implementation does not support the source->target pair
>based configuration, it is global.
>2. Yes, I introduced that property both in the client and in the
>connectors
>3. This is a great idea, I am going to do that, and also I tried to
>construct the property name in a way that makes this clear for the
> users: '
>default.replication.policy.heartbeats.topic.name'
>4. Yeah, that was my impression too.
>
> Thanks,
> Berci
>
> On Wed, Jan 17, 2024 at 4:51 PM Viktor Somogyi-Vass
>  wrote:
>
> > Hi Bertalan,
> >
> > Thanks for creating this KIP.
> > A couple of observations/questions:
> > 1. If I have multiple source->target pairs, can I set this property per
> > cluster by prefixing with "source->target" as many other configs or is it
> > global?
> > 2. The replication policy must be set in MirrorClient as well. Is your
> > change applicable to both MirrorClient and the connectors as well?
> > 3. It might be worth pointing out (both in the docs and the KIP) that if
> > the user overrides the replication policy to any other than
> > DefaultReplicationPolicy, then this config has no effect.
> > 4. With regards to integration tests, I tend to lean towards that we
> don't
> > need them if we can cover this well with unit tests and mocking.
> >
> > Thanks,
> > Viktor
> >
> > On Wed, Jan 17, 2024 at 12:23 AM Ryanne Dolan 
> > wrote:
> >
> > > Makes sense to me, +1.
> > >
> > > On Tue, Jan 16, 2024 at 5:04 PM Kondrát Bertalan 
> > > wrote:
> > >
> > >> Hey Team,
> > >>
> > >> I would like to start a discussion thread about the *KIP-1016 Make MM2
> > >> heartbeats topic name configurable
> > >> <
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1016+Make+MM2+heartbeats+topic+name+configurable
> > >> >*
> > >> .
> > >>
> > >> This KIP aims to make the default heartbeat topic name (`heartbeats`)
> in
> > >> the DefaultReplicationPolicy configurable via a property.
> > >> Since this is my first KIP and the change is small, I implemented it
> in
> > >> advance so I can include the PR
> > >>  as well.
> > >>
> > >> I appreciate all your feedbacks and comments.
> > >>
> > >> Special thanks to Viktor Somogyi-Vass 
> and
> > >> Daniel
> > >> Urban  for the original idea and help.
> > >> Thank you,
> > >> Berci
> > >>
> > >> --
> > >> *Bertalan Kondrat* | Founder, SWE
> > >> servy.hu 
> > >>
> > >>
> > >>
> > >> 
> > >> --
> > >>
> > >
> >
>
>
> --
> *Bertalan Kondrat* | Founder
> t. +36(70) 413-4801
> servy.hu 
>
>
> --
>


Re: [VOTE] KIP-1011: Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh

2024-01-22 Thread Chris Egerton
Hi David,

I've only briefly skimmed KAFKA-10140 but it seems that it may only apply
to append/subtract operations on list-type properties. If my understanding
is correct then this shouldn't be a problem for the KIP since we only use
the set/delete operations in the kafka-configs.sh script. If the scope of
the issue extends beyond those operations, then I agree that changes are
warranted to the KIP.
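For illustration, here is a toy model — plain Java, not the real Admin API — of why the set/delete operations used by kafka-configs.sh are unaffected: an incremental call touches only the named keys, while the legacy full-replace call drops anything not resubmitted, such as sensitive values the client cannot read back.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigSemantics {
    enum OpType { SET, DELETE }

    // Legacy alterConfigs semantics: the submitted map *replaces* the
    // whole dynamic config, so any key not resubmitted is lost.
    static Map<String, String> alterConfigs(Map<String, String> current,
                                            Map<String, String> submitted) {
        return new HashMap<>(submitted);
    }

    // incrementalAlterConfigs semantics: only the named keys change.
    static Map<String, String> incrementalAlterConfigs(
            Map<String, String> current,
            Map<String, OpType> ops,
            Map<String, String> values) {
        Map<String, String> result = new HashMap<>(current);
        for (Map.Entry<String, OpType> op : ops.entrySet()) {
            if (op.getValue() == OpType.SET)
                result.put(op.getKey(), values.get(op.getKey()));
            else
                result.remove(op.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> broker = new HashMap<>();
        broker.put("ssl.key.password", "secret"); // sensitive, not readable back
        broker.put("log.cleaner.threads", "2");

        // Full-replace update that only mentions log.cleaner.threads:
        Map<String, String> afterLegacy =
                alterConfigs(broker, Map.of("log.cleaner.threads", "4"));
        System.out.println(afterLegacy.containsKey("ssl.key.password")); // false

        // Incremental SET of the same key keeps the sensitive value:
        Map<String, String> afterIncremental = incrementalAlterConfigs(
                broker, Map.of("log.cleaner.threads", OpType.SET),
                Map.of("log.cleaner.threads", "4"));
        System.out.println(afterIncremental.get("ssl.key.password")); // secret
    }
}
```

The append/subtract operations affected by KAFKA-10140 would be additional OpType values on top of this model, which is exactly why the KIP can leave them out of scope.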

Cheers,

Chris

On Mon, Jan 22, 2024 at 5:23 AM David Jacot 
wrote:

> Hi Ziming,
>
> Thanks for driving this. I wanted to bring KAFKA-10140
>  to your attention. It
> looks like the incremental API does not work for configuring plugins. I
> think that we need to cover this in the KIP.
>
> Best,
> David
>
> On Mon, Jan 22, 2024 at 10:13 AM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > +1 (non-binding)
> >
> > Thanks,
> > Andrew
> >
> > > On 22 Jan 2024, at 07:29, Federico Valeri 
> wrote:
> > >
> > > +1 (non binding)
> > >
> > > Thanks.
> > >
> > > On Mon, Jan 22, 2024 at 7:03 AM Luke Chen  wrote:
> > >>
> > >> Hi Ziming,
> > >>
> > >> +1(binding) from me.
> > >>
> > >> Thanks.
> > >> Luke
> > >>
> > >> On Mon, Jan 22, 2024 at 11:50 AM Kamal Chandraprakash <
> > >> kamal.chandraprak...@gmail.com> wrote:
> > >>
> > >>> +1 (non-binding)
> > >>>
> > >>> On Mon, Jan 22, 2024 at 8:34 AM ziming deng <
> dengziming1...@gmail.com>
> > >>> wrote:
> > >>>
> >  Hello everyone,
> >  I'd like to initiate a vote for KIP-1011.
> >  This KIP is about replacing alterConfigs with
> incrementalAlterConfigs
> >  when updating broker configs using kafka-configs.sh, this is similar
> > to
> >  what we have done in KIP-894.
> > 
> >  KIP link:
> >  KIP-1011: Use incrementalAlterConfigs when updating broker configs
> by
> >  kafka-configs.sh - Apache Kafka - Apache Software Foundation
> >  <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> > >
> >  cwiki.apache.org
> >  <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> > >
> >  <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> > >
> >  <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> > >
> > 
> >  Discussion thread:
> > 
> > 
> >  lists.apache.org
> >  
> >  
> >  
> > 
> > 
> >  --,
> >  Best,
> >  Ziming
> >
> >
> >
>


Re: DISCUSS KIP-1011: Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh

2024-01-18 Thread Chris Egerton
Thanks Ziming, LGTM!

On Mon, Jan 15, 2024 at 12:00 AM ziming deng 
wrote:

> Hello Luke,
>
> thank you for finding this error, I have rectified it, and I will start a
> vote process soon.
>
> --
> Best,
> Ziming
>
>
> > On Jan 12, 2024, at 16:32, Luke Chen  wrote:
> >
> > Hi Ziming,
> >
> > Thanks for the KIP!
> > LGTM!
> > Using incremental by default and falling back automatically if it's not
> > supported is a good idea!
> >
> > One minor comment:
> > 1. so I'm inclined to move it to incrementalAlterConfigs  and "provide a
> > flag" to still use alterConfigs  for new client to interact with old
> > servers.
> > I don't think we will "provide any flag" after the discussion. We should
> > remove it.
> >
> > Thanks.
> > Luke
> >
> > On Fri, Jan 12, 2024 at 12:29 PM ziming deng  wrote:
> >
> >> Thank you for your clarification, Chris,
> >>
> >> I have spent some time to review KIP-894 and I think it's automatic way
> is
> >> better and bring no side effect, and I will also adopt this way here.
> >> As you mentioned, the changes in semantics is minor, the most important
> >> reason for this change is fixing a bug caused by sensitive configs.
> >>
> >>
> >>> We
> >>> don't appear to support appending/subtracting from list properties via
> >> the
> >>> CLI for any other entity type right now,
> >> You are right about this, I tried and found that we can’t subtract or
> >> append configs, I will change the KIP to "making way for
> >> appending/subtracting list properties"
> >>
> >> --
> >> Best,
> >> Ziming
> >>
> >>> On Jan 6, 2024, at 01:34, Chris Egerton 
> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> Can we clarify any changes in the user-facing semantics for the CLI
> tool
> >>> that would come about as a result of this KIP? I think the debate over
> >> the
> >>> necessity of an opt-in flag, or waiting for 4.0.0, ultimately boils
> down
> >> to
> >>> this.
> >>>
> >>> My understanding is that the only changes in semantics are fairly minor
> >>> (semantic versioning pun intended):
> >>>
> >>> - Existing sensitive broker properties no longer have to be explicitly
> >>> specified on the command line if they're not being changed
> >>> - A small race condition is fixed where the broker config is updated
> by a
> >>> separate operation in between when the CLI reads the existing broker
> >> config
> >>> and writes the new broker config
> >>> - Usage of a new broker API that has been supported since version
> 2.3.0,
> >>> but which does not require any new ACLs and does not act any
> differently
> >>> apart from the two small changes noted above
> >>>
> >>> If this is correct, then I'm inclined to agree with Ismael's suggestion
> >> of
> >>> starting with incrementalAlterConfigs, and falling back on alterConfigs
> >> if
> >>> the former is not supported by the broker, and do not believe it's
> >>> necessary to wait for 4.0.0 or provide opt-in or opt-out flags to
> release
> >>> this change. This would also be similar to changes we made to
> >> MirrorMaker 2
> >>> in KIP-894 [1], where the default behavior for syncing topic configs is
> >> now
> >>> to start with incrementalAlterConfigs and fall back on alterConfigs if
> >> it's
> >>> not supported.
> >>>
> >>> If there are other, more significant changes to the user-facing
> semantics
> >>> for the CLI, then these should be called out here and in the KIP, and
> we
> >>> might consider a more cautious approach.
> >>>
> >>> [1] -
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-894%3A+Use+incrementalAlterConfigs+API+for+syncing+topic+configurations
> >>>
> >>>
> >>> Also, regarding this part of the KIP:
> >>>
> >>>> incrementalAlterConfigs is more convenient especially for updating
> >>> configs of list data type, such as
> >> "leader.replication.throttled.replicas"
> >>>
> >>> While this is true for the Java admin client and the corresponding
> broker
> >>> APIs, i
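
The start-with-incremental, fall-back-to-legacy approach described earlier in this message (and used by MirrorMaker 2 per KIP-894) can be sketched roughly as follows. This is a hypothetical stand-in, not the real Java AdminClient API: the two callables and the error type are illustrative placeholders.

```python
# Hypothetical sketch of the fallback pattern discussed above: prefer the
# newer incremental API, and fall back to the legacy API only when the
# broker reports it is unsupported. Both "API" callables are stand-ins,
# NOT the real AdminClient methods.

class UnsupportedVersionError(Exception):
    """Stand-in for the error an old broker returns for an unknown API."""

def alter_configs_with_fallback(incremental_alter, legacy_alter, updates):
    """Apply config updates, preferring the incremental code path."""
    try:
        return incremental_alter(updates)
    except UnsupportedVersionError:
        # Brokers that predate the incremental API (pre-2.3.0) take the
        # legacy read-modify-write path instead.
        return legacy_alter(updates)
```

With this shape, behavior on old brokers is unchanged, while new brokers get the race-free incremental path.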

Re: [VOTE] 3.7.0 RC2

2024-01-12 Thread Chris Egerton
Thanks, Kirk!

@Stanislav--do you believe that this warrants a new RC?

On Fri, Jan 12, 2024, 19:08 Kirk True  wrote:

> Hi Chris/Stanislav,
>
> I'm working on the 'Unable to find FetchSessionHandler' log problem
> (KAFKA-16029) and have put out a draft PR (
> https://github.com/apache/kafka/pull/15186). I will use the quickstart
> approach as a second means to reproduce/verify while I wait for the PR's
> Jenkins job to finish.
>
> Thanks,
> Kirk
>
> On Fri, Jan 12, 2024, at 11:31 AM, Chris Egerton wrote:
> > Hi Stanislav,
> >
> >
> > Thanks for running this release!
> >
> > To verify, I:
> > - Built from source using Java 11 with both:
> > - - the 3.7.0-rc2 tag on GitHub
> > - - the kafka-3.7.0-src.tgz artifact from
> > https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
> > - Checked signatures and checksums
> > - Ran the quickstart using both:
> > - - The kafka_2.13-3.7.0.tgz artifact from
> > https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/ with Java
> 11
> > and Scala 2.13 in KRaft mode
> > - - Our shiny new broker Docker image, apache/kafka:3.7.0-rc2
> > - Ran all unit tests
> > - Ran all integration tests for Connect and MM2
> >
> >
> > I found two minor areas for concern:
> >
> > 1. (Possibly a blocker)
> > When running the quickstart, I noticed this ERROR-level log message being
> > emitted frequently (though not every time) when I killed my console consumer
> > via ctrl-C:
> >
> > > [2024-01-12 11:00:31,088] ERROR [Consumer clientId=console-consumer,
> > groupId=console-consumer-74388] Unable to find FetchSessionHandler for
> node
> > 1. Ignoring fetch response
> > (org.apache.kafka.clients.consumer.internals.AbstractFetch)
> >
> > I see that this error message is already reported in
> > https://issues.apache.org/jira/browse/KAFKA-16029. I think we should
> > prioritize fixing it for this release. I know it's probably benign but
> it's
> > really not a good look for us when basic operations log error messages,
> and
> > it may give new users some headaches.
> >
> >
> > 2. (Probably not a blocker)
> > The following unit tests failed the first time around, and all of them
> > passed the second time I ran them:
> >
> > - (clients)
> ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
> > - (clients) SelectorTest.testConnectionsByClientMetric()
> > - (clients) Tls13SelectorTest.testConnectionsByClientMetric()
> > - (connect) TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound (I
> > thought I fixed this one!)
> > - (core) ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2]
> >
> >
> > Thanks again for your work on this release, and congratulations to Kafka
> > Streams for having zero flaky unit tests during my highly-experimental
> > single laptop run!
> >
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski
> >  wrote:
> >
> > > Hello Kafka users, developers, and client-developers,
> > >
> > > This is the first candidate for release of Apache Kafka 3.7.0.
> > >
> > > Note it's named "RC2" because I had a few "failed" RCs that I had
> > > cut/uploaded but ultimately had to scrap prior to announcing due to new
> > > blockers arriving before I could even announce them.
> > >
> > > Further - I haven't yet been able to set up the system tests
> successfully.
> > > And the integration/unit tests do have a few failures that I have to
> spend
> > > time triaging. I would appreciate any help in case anyone notices any
> tests
> > > failing that they're subject matter experts in. Expect me to follow
> up in
> > > a day or two with more detailed analysis.
> > >
> > > Major changes include:
> > > - Early Access to KIP-848 - the next generation of the consumer
> rebalance
> > > protocol
> > > - KIP-858: Adding JBOD support to KRaft
> > > - KIP-714: Observability into Client metrics via a standardized
> interface
> > >
> > > Check more information in the WIP blog post:
> > > https://github.com/apache/kafka-site/pull/578
> > >
> > > Release notes for the 3.7.0 release:
> > >
> > >
> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html
> > >
> > > *** Please download, test and vote by Thursday, January 18, 9am PT ***
> > >
> > > Usually these deadlines tend to be 2-3 days, but du

Re: [VOTE] 3.7.0 RC2

2024-01-12 Thread Chris Egerton
Hi Stanislav,


Thanks for running this release!

To verify, I:
- Built from source using Java 11 with both:
- - the 3.7.0-rc2 tag on GitHub
- - the kafka-3.7.0-src.tgz artifact from
https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
- Checked signatures and checksums
- Ran the quickstart using both:
- - The kafka_2.13-3.7.0.tgz artifact from
https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/ with Java 11
and Scala 2.13 in KRaft mode
- - Our shiny new broker Docker image, apache/kafka:3.7.0-rc2
- Ran all unit tests
- Ran all integration tests for Connect and MM2
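
For reference, the checksum part of a verification like the one above boils down to comparing a locally computed digest against the published one; a minimal sketch (the artifact contents below are illustrative, and PGP signature verification is a separate step):

```python
# Minimal sketch of a release-checksum check: compute a local SHA-512
# digest of the downloaded bytes and compare it with the published hex
# digest. Normalizes case/whitespace, since published digests vary.
import hashlib

def sha512_matches(data: bytes, published_hex: str) -> bool:
    """Return True if the published digest matches the data's SHA-512."""
    return hashlib.sha512(data).hexdigest() == published_hex.strip().lower()
```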


I found two minor areas for concern:

1. (Possibly a blocker)
When running the quickstart, I noticed this ERROR-level log message being
emitted frequently (though not every time) when I killed my console consumer
via ctrl-C:

> [2024-01-12 11:00:31,088] ERROR [Consumer clientId=console-consumer,
groupId=console-consumer-74388] Unable to find FetchSessionHandler for node
1. Ignoring fetch response
(org.apache.kafka.clients.consumer.internals.AbstractFetch)

I see that this error message is already reported in
https://issues.apache.org/jira/browse/KAFKA-16029. I think we should
prioritize fixing it for this release. I know it's probably benign but it's
really not a good look for us when basic operations log error messages, and
it may give new users some headaches.


2. (Probably not a blocker)
The following unit tests failed the first time around, and all of them
passed the second time I ran them:

- (clients) ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
- (clients) SelectorTest.testConnectionsByClientMetric()
- (clients) Tls13SelectorTest.testConnectionsByClientMetric()
- (connect) TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound (I
thought I fixed this one!)
- (core) ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2]


Thanks again for your work on this release, and congratulations to Kafka
Streams for having zero flaky unit tests during my highly-experimental
single laptop run!


Cheers,

Chris

On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski
 wrote:

> Hello Kafka users, developers, and client-developers,
>
> This is the first candidate for release of Apache Kafka 3.7.0.
>
> Note it's named "RC2" because I had a few "failed" RCs that I had
> cut/uploaded but ultimately had to scrap prior to announcing due to new
> blockers arriving before I could even announce them.
>
> Further - I haven't yet been able to set up the system tests successfully.
> And the integration/unit tests do have a few failures that I have to spend
> time triaging. I would appreciate any help in case anyone notices any tests
> failing that they're subject matter experts in. Expect me to follow up in
> a day or two with more detailed analysis.
>
> Major changes include:
> - Early Access to KIP-848 - the next generation of the consumer rebalance
> protocol
> - KIP-858: Adding JBOD support to KRaft
> - KIP-714: Observability into Client metrics via a standardized interface
>
> Check more information in the WIP blog post:
> https://github.com/apache/kafka-site/pull/578
>
> Release notes for the 3.7.0 release:
>
> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, January 18, 9am PT ***
>
> Usually these deadlines tend to be 2-3 days, but due to this being the
> first RC and the tests not having run yet, I am giving it a bit more time.
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>
> * Docker release artifact to be voted upon:
> apache/kafka:3.7.0-rc2
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/
>
> * Tag to be voted upon (off 3.7 branch) is the 3.7.0 tag:
> https://github.com/apache/kafka/releases/tag/3.7.0-rc2
>
> * Documentation:
> https://kafka.apache.org/37/documentation.html
>
> * Protocol:
> https://kafka.apache.org/37/protocol.html
>
> * Successful Jenkins builds for the 3.7 branch:
> Unit/integration tests:
> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/
> There are failing tests here. I have to follow up with triaging some of
> the failures and figuring out if they're actual problems or simply flakes.
>
> System tests: https://jenkins.confluent.io/job/system-test-kafka/job/3.7/
>
> No successful system test runs yet. I am working on getting the job to run.
>
> * Successful Docker Image Github Actions Pipeline for 3.7 branch:
> Attached are the scan_report and report_jvm output files from the Docker
> Build run:
> https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673
>
> And the final docker image build job - Docker Build Test Pipeline:
> 

Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread Chris Egerton
Hi François,

Is it an official policy of the ASF that projects provide a listing of
commercial support options for themselves? I understand that other projects
have chosen to provide one, but this doesn't necessarily imply that all
projects should do the same, and I can't say I find this point very
convincing as a rebuttal to some of the good-faith concerns raised by the
PMC and members of the community so far. However, if there's an official
ASF stance on this topic, then I acknowledge that Apache Kafka should align
with it.

Best,

Chris


On Thu, Jan 11, 2024, 14:50 fpapon  wrote:

> Hi Justine,
>
> I'm not sure I see the difference between "happy users" and vendors
> that advertise their products in the company list on the
> "Powered by" page.
>
> Btw, the initial purpose of my proposal was to help users find support
> for production issues rather than searching in Google.
>
> I don't think this is a bad thing because this is something that already
> exists in many ASF projects, like:
>
> https://hop.apache.org/community/commercial/
> https://struts.apache.org/commercial-support.html
> https://directory.apache.org/commercial-support.html
> https://tomee.apache.org/commercial-support.html
> https://plc4x.apache.org/users/commercial-support.html
> https://camel.apache.org/community/support/
> https://openmeetings.apache.org/commercial-support.html
> https://guacamole.apache.org/support/
>
> https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support
> https://activemq.apache.org/support
> https://karaf.apache.org/community.html
> https://netbeans.apache.org/front/main/help/commercial-support/
> https://royale.apache.org/royale-commercial-support/
>
> https://karaf.apache.org/community.html
>
> As I understand it, for now the channels for users to find production
> support are:
>
> - The mailing list (u...@kafka.apache.org / dev@kafka.apache.org)
>
> - The official #kafka ASF Slack channel (maybe we can add it to the
> website, because I didn't find it there =>
> https://kafka.apache.org/contact)
>
> - Searching Google for commercial support
>
> I can update my PR to mention only the 3 points above for the "get
> support" page if people think that having a support page makes sense.
>
> regards,
>
> François
>
> On 11/01/2024 19:34, Justine Olshan wrote:
> > I think there is a difference between the "Powered by" page and a page
> for
> > vendors to advertise their products and services.
> >
> > The idea is that the companies on that page are "powered by" Kafka. They
> > serve as examples of happy users of Kafka.
> > I don't think it is meant only as a place just for those companies to
> > advertise.
> >
> > I'm a little confused by
> >
> >> In this case, I'm ok to say that the commercial support section in the
> > "Get support" page is not needed, as we can use this page.
> >
> > If you plan to submit for this page, please include a description on how
> > your company uses Kafka.
> >
> > I'm happy to hear other folks' opinions on this page as well.
> >
> > Thanks,
> > Justine
> >
> >
> >
> > On Thu, Jan 11, 2024 at 8:57 AM fpapon  wrote:
> >
> >> Hi,
> >>
> >> About the vendors list and neutrality, what is the policy of the
> >> "Powered by" page?
> >>
> >> https://kafka.apache.org/powered-by
> >>
> >> We can see companies with logos; some are talking about their product
> >> (Agoora), some are offering services (Instaclustr, Aiven), and we can
> >> also see some that just put their logo and a link to their website
> >> without any explanation (GoldmanSachs).
> >>
> >> So as I understand and after reading the text in the footer of this
> >> page, every company can add themselves by submitting a PR, right?
> >>
> >> "Want to appear on this page?
> >> Submit a pull request or send a quick description of your organization
> >> and usage to the mailing list and we'll add you."
> >>
> >> In this case, I'm ok to say that the commercial support section in the
> >> "Get support" page is not needed, as we can use this page.
> >>
> >> regards,
> >>
> >> François
> >>
> >>
> >> On 10/01/2024 19:03, Kenneth Eversole wrote:
> >>> I agree with Divij here and, to be more pointed, I worry that if we go
> >> down
> >>> the path of adding vendors to a list it comes off as supporting their
> >>> product, not to mention could be a huge security risk for novice
> users. I
> >>> would rather this be a callout to other purely open source tooling,
> such
> >> as
> >> Cruise Control.
> >>>
> >>> Divij brings up a good question:
> >>> 1. What value does the addition of this page bring to the users of
> Apache
> >>> Kafka?
> >>>
> >>> I think the community would be better served by having a more
> >> synchronous
> >>> line of communication such as Slack/Discord and we call that out here.
> It
> >>> would be more inline with other major open source projects.
> >>>
> >>> ---
> >>> Kenneth Eversole
> >>>
> >>> On Wed, Jan 10, 2024 at 10:30 AM Divij Vaidya  >
> >>> wrote:
> >>>
>  I don't see a need for this. What 

Re: [VOTE] KIP-1004: Enforce tasks.max property in Kafka Connect

2024-01-11 Thread Chris Egerton
Hi all,

The vote for KIP-1004 passes with the following +1 votes and no +0 or -1
votes:

- Hector Geraldino
- Mickael Maison (binding)
- Greg Harris (binding)
- Yash Mayya (binding)
- Federico Valeri

With regards to the open discussion about whether to remove the deprecated
tasks.max.enforce property in 4.0.0 or later, I've tweaked the KIP to
clearly state that it may take place in 4.0.0 but may also be delayed. A
deprecated property does not require a KIP for removal, so we have some
wiggle room should the discussion continue, especially if people feel
strongly that we should push to remove it in time for 4.0.0.

Thanks all for your votes and discussion!

Cheers,

Chris
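
For context, the tasks.max enforcement that KIP-1004 describes can be sketched roughly as follows. This is an illustrative stand-in, not the actual Connect runtime code; the function name and shapes are hypothetical, with the fail-vs-warn split keyed on the deprecated tasks.max.enforce escape hatch as described in the KIP.

```python
# Rough, hypothetical sketch of KIP-1004 semantics: a connector that
# generates more task configs than tasks.max fails when enforcement is
# on; with the deprecated tasks.max.enforce=false escape hatch, the
# excess is only warned about and all tasks still run.
import logging

logger = logging.getLogger("connect.sketch")

def accept_task_configs(task_configs, tasks_max, enforce=True):
    """Validate a connector's generated task configs against tasks.max."""
    if len(task_configs) <= tasks_max:
        return task_configs
    if enforce:
        raise ValueError(
            f"Connector generated {len(task_configs)} task configs, "
            f"exceeding tasks.max={tasks_max}"
        )
    # Legacy (pre-enforcement) behavior, kept behind tasks.max.enforce=false
    logger.warning(
        "tasks.max=%d exceeded (%d task configs); running anyway because "
        "tasks.max.enforce=false", tasks_max, len(task_configs)
    )
    return task_configs
```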

On Fri, Jan 5, 2024 at 3:45 PM Greg Harris 
wrote:

> Hey Chris,
>
> Thanks for keeping KIP-987 in-mind.
>
> The current design of KIP-987 doesn't take tasks.max.enforce into
> account, but I think it may be possible to only allow the protocol
> upgrade when tasks.max.enforce is true if we were to try to enforce
> it. It may also be reasonable to just have a warning about it appended
> to the documentation string for tasks.max.enforce.
> I am fine with either keeping or removing it in 4.0, leaning towards
> keeping it, for the same reasons you listed above.
>
> Thanks!
> Greg
>
> On Fri, Jan 5, 2024 at 9:40 AM Chris Egerton 
> wrote:
> >
> > Hi Yash,
> >
> > Thanks for raising the possibility of a more aggressive removal schedule
> > for the tasks.max.enforce property now that it seems a 3.8.x branch is
> > likely--I was wondering if someone would bring that up!
> >
> > I think I'd prefer to err on the side of caution and give users more time
> > to adjust, since some may skip 3.8.x and upgrade to 4.0.x, 4.1.x, etc.
> > directly instead. It seems like the maintenance cost will be fairly low,
> > and with the option to programmatically require it to be set to true in
> > order to work with other features we may want to develop in the future,
> it
> > shouldn't block any progress in the meantime. Thoughts? I'd also be
> curious
> > what Greg Harris thinks about this, given that it seems relevant to
> KIP-987
> > [1].
> >
> > [1] -
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-987%3A+Connect+Static+Assignments
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Jan 4, 2024 at 4:45 AM Federico Valeri 
> wrote:
> >
> > > Thanks! This will finally reconcile Javadoc and implementation.
> > >
> > > +1 (non binding)
> > >
> > > On Thu, Jan 4, 2024 at 6:49 AM Yash Mayya 
> wrote:
> > > >
> > > > Hi Chris,
> > > >
> > > > +1 (binding), thanks for the KIP.
> > > >
> > > > Based on discussion in other threads, it looks like the community is
> > > > aligned with having a 3.8 release before the 4.0 release so we
> should be
> > > > able to remove the 'tasks.max.enforce' connector property in 4.0
> (we'd
> > > > discussed potentially having to live with this property until 5.0 in
> this
> > > > KIP's discussion thread). Once we have confirmation of a 3.8 release,
> > > will
> > > > this KIP be updated to reflect the exact AK versions where the
> deprecated
> > > > property will be introduced and removed?
> > > >
> > > > Thanks,
> > > > Yash
> > > >
> > > > On Wed, Jan 3, 2024 at 11:37 PM Greg Harris
>  > > >
> > > > wrote:
> > > >
> > > > > Hey Chris,
> > > > >
> > > > > Thanks for the KIP! I think the aggressive default and deprecation
> > > > > schedule is the right choice for this change.
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Wed, Jan 3, 2024 at 9:01 AM Mickael Maison <
> > > mickael.mai...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > Hi Chris,
> > > > > >
> > > > > > +1 (binding), thanks for the KIP
> > > > > >
> > > > > > Mickael
> > > > > >
> > > > > > On Tue, Jan 2, 2024 at 8:55 PM Hector Geraldino (BLOOMBERG/ 919
> 3RD
> > > A)
> > > > > >  wrote:
> > > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > Thanks Chris!
> > > > > > >
> > > > > > > From: dev@kafka.apache.org At: 01/02/24 11:49:18 UTC-5:00To:
> > > > > de

[jira] [Created] (KAFKA-16108) Backport fix for KAFKA-16093 to 3.7

2024-01-10 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-16108:
-

 Summary: Backport fix for KAFKA-16093 to 3.7
 Key: KAFKA-16108
 URL: https://issues.apache.org/jira/browse/KAFKA-16108
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Reporter: Chris Egerton
Assignee: Chris Egerton
 Fix For: 3.7.1


A fix for KAFKA-16093 is present on the branches trunk (the version for which 
is currently 3.8.0-SNAPSHOT) and 3.6. We are in code freeze for the 3.7.0 
release, and this issue is not a blocker, so it cannot be backported right now.

We should backport the fix once 3.7.0 has been released and before 3.7.1 is 
released.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-995: Allow users to specify initial offsets while creating connectors

2024-01-09 Thread Chris Egerton
Thanks for the KIP! +1 (binding)

On Mon, Jan 8, 2024 at 9:35 AM Ashwin  wrote:

> Hi All,
>
> I would like to start  a vote on KIP-995.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors
>
> Discussion thread -
> https://lists.apache.org/thread/msorbr63scglf4484yq764v7klsj7c4j
>
> Thanks!
>
> Ashwin
>


Re: [VOTE] KIP-1012: The need for a Kafka 3.8.x release

2024-01-08 Thread Chris Egerton
Hi Colin,

The idea isn't to hold up 4.0.0 any longer; in fact, it's the opposite. If
things slip a bit with 3.8.0 and we take, for example, an extra month to
deliver it (or even to cut the branch), I don't think this should
necessarily translate to an extra month of latency between now and 4.0.0,
given exactly what you mention about the major changes we plan to include
in 4.0.0 (which consist more of dropping support for existing things than
adding support for new things).

If we want to avoid confusion, we could say something like "no later than 3
to 4 months after the 3.8 branch is created". Frankly though, I think it's
unnecessary to specify an exact timeline for 4.0 in this KIP, since nothing
in the proposal actually diverges from the usual time-based release plan we
follow. The only necessary part seems to be to state that 4.0 will directly
follow 3.8 (as opposed to 3.9, 3.10, etc.). But perhaps I'm missing
something?

Cheers,

Chris

On Mon, Jan 8, 2024 at 2:38 PM Colin McCabe  wrote:

> On Mon, Jan 8, 2024, at 09:05, Chris Egerton wrote:
> > Hi Josep,
> >
> > Thanks for the KIP! +1 (binding).
> >
> > One small nit: I don't think we necessarily have to hold ourselves to
> > releasing 4.0.0 "3 to 4 months after 3.8 branch is created" (quoting the
> > timeline section of the KIP). IMO it's fine to leave some wiggle room for
> > the 4.0.0 release without codifying a timeline in this KIP. Maybe
> something
> > like "some time after 3.8 branch is created" would be sufficient?
> Anyways,
> > not a huge thing, I'm sure we'll all give 4.0.0 the flexibility it needs
> > with the understanding that this KIP is more focused on 3.8.0 than 4.0.0.
> >
>
> Hmm... I don't see any obstacles in the path of releasing 4.0 after the
> traditional 4 months of development. Keep in mind, we're removing things
> from the code (the ability to support JDK8, ZooKeeper mode, etc.), not
> adding things. We already support JDK11 so saying that it's the minimum is
> a very quick change. Similarly, we already support KRaft so saying that
> it's the only mode should be a pretty quick change.
>
> Also, we added a note that "the timeline is very rough" to KIP-833 and it
> caused all kinds of confusion. So overall I'd prefer to leave the language
> about 4.0 unchanged.
>
> best,
> Colin
>
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, Jan 8, 2024 at 11:41 AM Greg Harris  >
> > wrote:
> >
> >> Thanks Josep for leading the KIP and building consensus on 3.8!
> >>
> >> +1 (binding)
> >>
> >> Greg
> >>
> >> On Sun, Jan 7, 2024 at 11:45 PM Josep Prat  >
> >> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > Thanks for your comments,
> >> > I reworded some parts of the KIP to express that:
> >> > - The KIP is to agree that we need at least one more minor version in
> the
> >> > 3.x series
> >> > - Explicitly saying that the list of KIPs is not exhaustive and that
> if
> >> > some are not done, we would need another minor version
> >> > - Which are the KIPs/Features the community identified that should be
> >> > present in a 3.x version so they can safely migrate to a potential 4.0
> >> > version
> >> > - The timeline of 4.0 (starting after branching, not after release)
> >> > - Wording is now focusing more on the need to have at least another
> minor
> >> > release in 3.x to enable and ease migration to a potential 4.0 version
> >> >
> >> > I always mention potential in terms of 4.0 as we don't know what will
> be
> >> in
> >> > there yet, and this KIP's scope is not meant to define this.
> >> >
> >> > Best,
> >> >
> >> > On Fri, Jan 5, 2024 at 10:46 PM Ismael Juma 
> wrote:
> >> >
> >> > > I agree with Colin. Also, 4.0 would be started after 3.8 is branched
> >> (not
> >> > > after 3.8.0 is released). The rest looks good.
> >> > >
> >> > > Thanks!
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Fri, Jan 5, 2024 at 11:27 PM Colin McCabe 
> >> wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > Thanks for calling the vote, Josep.
> >> > > >
> >> > > > I re-checked this, and saw something that we missed updating. One
> >> thing
> >> > > we
> >> > > > talked about earlier is that KIP-966 is act

[jira] [Created] (KAFKA-16093) Spurious warnings logged to stderr about empty path annotations and providers not implementing provider interfaces

2024-01-08 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-16093:
-

 Summary: Spurious warnings logged to stderr about empty path 
annotations and providers not implementing provider interfaces
 Key: KAFKA-16093
 URL: https://issues.apache.org/jira/browse/KAFKA-16093
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Affects Versions: 3.7.0, 3.8.0
Reporter: Chris Egerton
Assignee: Chris Egerton


Some warnings get logged to stderr on Connect startup. For example:
{quote}Jan 08, 2024 1:48:18 PM org.glassfish.jersey.internal.inject.Providers 
checkProviderRuntime

WARNING: A provider 
org.apache.kafka.connect.runtime.rest.resources.RootResource registered in 
SERVER runtime does not implement any provider interfaces applicable in the 
SERVER runtime. Due to constraint configuration problems the provider 
org.apache.kafka.connect.runtime.rest.resources.RootResource will be ignored. 

Jan 08, 2024 1:48:18 PM org.glassfish.jersey.internal.inject.Providers 
checkProviderRuntime

WARNING: A provider 
org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource registered 
in SERVER runtime does not implement any provider interfaces applicable in the 
SERVER runtime. Due to constraint configuration problems the provider 
org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource will be 
ignored. 

Jan 08, 2024 1:48:18 PM org.glassfish.jersey.internal.inject.Providers 
checkProviderRuntime

WARNING: A provider 
org.apache.kafka.connect.runtime.rest.resources.InternalConnectResource 
registered in SERVER runtime does not implement any provider interfaces 
applicable in the SERVER runtime. Due to constraint configuration problems the 
provider 
org.apache.kafka.connect.runtime.rest.resources.InternalConnectResource will be 
ignored. 

Jan 08, 2024 1:48:18 PM org.glassfish.jersey.internal.inject.Providers 
checkProviderRuntime

WARNING: A provider 
org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource 
registered in SERVER runtime does not implement any provider interfaces 
applicable in the SERVER runtime. Due to constraint configuration problems the 
provider 
org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource will 
be ignored. 

Jan 08, 2024 1:48:18 PM org.glassfish.jersey.internal.inject.Providers 
checkProviderRuntime

WARNING: A provider 
org.apache.kafka.connect.runtime.rest.resources.LoggingResource registered in 
SERVER runtime does not implement any provider interfaces applicable in the 
SERVER runtime. Due to constraint configuration problems the provider 
org.apache.kafka.connect.runtime.rest.resources.LoggingResource will be 
ignored. 

Jan 08, 2024 1:48:19 PM org.glassfish.jersey.internal.Errors logErrors

WARNING: The following warnings have been detected: WARNING: The (sub)resource 
method listLoggers in 
org.apache.kafka.connect.runtime.rest.resources.LoggingResource contains empty 
path annotation.

WARNING: The (sub)resource method listConnectors in 
org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains 
empty path annotation.

WARNING: The (sub)resource method createConnector in 
org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains 
empty path annotation.

WARNING: The (sub)resource method listConnectorPlugins in 
org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource 
contains empty path annotation.

WARNING: The (sub)resource method serverInfo in 
org.apache.kafka.connect.runtime.rest.resources.RootResource contains empty 
path annotation.
{quote}
These are benign, but can confuse and even frighten new users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1012: The need for a Kafka 3.8.x release

2024-01-08 Thread Chris Egerton
Hi Josep,

Thanks for the KIP! +1 (binding).

One small nit: I don't think we necessarily have to hold ourselves to
releasing 4.0.0 "3 to 4 months after 3.8 branch is created" (quoting the
timeline section of the KIP). IMO it's fine to leave some wiggle room for
the 4.0.0 release without codifying a timeline in this KIP. Maybe something
like "some time after 3.8 branch is created" would be sufficient? Anyways,
not a huge thing, I'm sure we'll all give 4.0.0 the flexibility it needs
with the understanding that this KIP is more focused on 3.8.0 than 4.0.0.

Cheers,

Chris

On Mon, Jan 8, 2024 at 11:41 AM Greg Harris 
wrote:

> Thanks Josep for leading the KIP and building consensus on 3.8!
>
> +1 (binding)
>
> Greg
>
> On Sun, Jan 7, 2024 at 11:45 PM Josep Prat 
> wrote:
> >
> > Hi all,
> >
> > Thanks for your comments,
> > I reworded some parts of the KIP to express that:
> > - The KIP is to agree that we need at least one more minor version in the
> > 3.x series
> > - Explicitly saying that the list of KIPs is not exhaustive and that if
> > some are not done, we would need another minor version
> > - Which are the KIPs/Features the community identified that should be
> > present in a 3.x version so they can safely migrate to a potential 4.0
> > version
> > - The timeline of 4.0 (starting after branching, not after release)
> > - Wording is now focusing more on the need to have at least another minor
> > release in 3.x to enable and ease migration to a potential 4.0 version
> >
> > I always mention potential in terms of 4.0 as we don't know what will be
> in
> > there yet, and this KIP's scope is not meant to define this.
> >
> > Best,
> >
> > On Fri, Jan 5, 2024 at 10:46 PM Ismael Juma  wrote:
> >
> > > I agree with Colin. Also, 4.0 would be started after 3.8 is branched
> (not
> > > after 3.8.0 is released). The rest looks good.
> > >
> > > Thanks!
> > >
> > > Ismael
> > >
> > > On Fri, Jan 5, 2024 at 11:27 PM Colin McCabe 
> wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for calling the vote, Josep.
> > > >
> > > > I re-checked this, and saw something that we missed updating. One
> thing
> > > we
> > > > talked about earlier is that KIP-966 is actually not required for
> 3.8.
> > > What
> > > > is required is some way of enabling unclean leader election by
> default in
> > > > KRaft. (Which could be KIP-966, or something else). Please revise
> this.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > On Fri, Jan 5, 2024, at 02:50, Anton Agestam wrote:
> > > > > +1 from me.
> > > > >
> > > > > Den fre 5 jan. 2024 kl 10:33 skrev Josep Prat
> > > > :
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> I'd like to start a vote on KIP-1012:
> > > > >>
> > > > >>
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1012%3A+The+need+for+a+Kafka+3.8.x+release
> > > > >>
> > > > >> Discussion thread is here:
> > > > >> https://lists.apache.org/thread/kvdp2gmq5gd9txkvxh5vk3z2n55b04s5
> > > > >>
> > > > >> Thanks!
> > > > >>
> > > > >> ---
> > > > >> Josep Prat
> > > > >> Open Source Engineering Director, aivenjosep.p...@aiven.io   |
> > > > >> +491715557497 | aiven.io
> > > > >> Aiven Deutschland GmbH
> > > > >> Alexanderufer 3-7, 10117 Berlin
> > > > >> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > >> Amtsgericht Charlottenburg, HRB 209739 B
> > > > >>
> > > >
> > >
> >
> >
> > --
> > [image: Aiven] 
> >
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
>


Re: [VOTE] KIP-1004: Enforce tasks.max property in Kafka Connect

2024-01-05 Thread Chris Egerton
Hi Yash,

Thanks for raising the possibility of a more aggressive removal schedule
for the tasks.max.enforce property now that it seems a 3.8.x branch is
likely--I was wondering if someone would bring that up!

I think I'd prefer to err on the side of caution and give users more time
to adjust, since some may skip 3.8.x and upgrade to 4.0.x, 4.1.x, etc.
directly instead. It seems like the maintenance cost will be fairly low,
and with the option to programmatically require it to be set to true in
order to work with other features we may want to develop in the future, it
shouldn't block any progress in the meantime. Thoughts? I'd also be curious
what Greg Harris thinks about this, given that it seems relevant to KIP-987
[1].

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-987%3A+Connect+Static+Assignments
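
As a sketch, an opted-out connector config under this proposal might look like the following (the connector name and class are placeholders, not from the KIP):

```properties
# Hypothetical connector config illustrating the KIP-1004 escape hatch
name=example-source
connector.class=com.example.ExampleSourceConnector
tasks.max=4
# Deprecated opt-out: keep the old permissive behavior for connectors that
# generate more tasks than tasks.max, until removal in a later major release.
tasks.max.enforce=false
```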

Cheers,

Chris

On Thu, Jan 4, 2024 at 4:45 AM Federico Valeri  wrote:

> Thanks! This will finally reconcile Javadoc and implementation.
>
> +1 (non binding)
>
> On Thu, Jan 4, 2024 at 6:49 AM Yash Mayya  wrote:
> >
> > Hi Chris,
> >
> > +1 (binding), thanks for the KIP.
> >
> > Based on discussion in other threads, it looks like the community is
> > aligned with having a 3.8 release before the 4.0 release so we should be
> > able to remove the 'tasks.max.enforce' connector property in 4.0 (we'd
> > discussed potentially having to live with this property until 5.0 in this
> > KIP's discussion thread). Once we have confirmation of a 3.8 release,
> will
> > this KIP be updated to reflect the exact AK versions where the deprecated
> > property will be introduced and removed?
> >
> > Thanks,
> > Yash
> >
> > On Wed, Jan 3, 2024 at 11:37 PM Greg Harris  >
> > wrote:
> >
> > > Hey Chris,
> > >
> > > Thanks for the KIP! I think the aggressive default and deprecation
> > > schedule is the right choice for this change.
> > >
> > > +1 (binding)
> > >
> > > On Wed, Jan 3, 2024 at 9:01 AM Mickael Maison <
> mickael.mai...@gmail.com>
> > > wrote:
> > > >
> > > > Hi Chris,
> > > >
> > > > +1 (binding), thanks for the KIP
> > > >
> > > > Mickael
> > > >
> > > > On Tue, Jan 2, 2024 at 8:55 PM Hector Geraldino (BLOOMBERG/ 919 3RD
> A)
> > > >  wrote:
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks Chris!
> > > > >
> > > > > From: dev@kafka.apache.org At: 01/02/24 11:49:18 UTC-5:00To:
> > > dev@kafka.apache.org
> > > > > Subject: Re: [VOTE] KIP-1004: Enforce tasks.max property in Kafka
> > > Connect
> > > > >
> > > > > Hi all,
> > > > >
> > > > > Happy New Year! Wanted to give this a bump now that the holidays
> are
> > > over
> > > > > for a lot of us. Looking forward to people's thoughts!
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Mon, Dec 4, 2023 at 10:36 AM Chris Egerton 
> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to call for a vote on KIP-1004, which adds enforcement
> for
> > > the
> > > > > > tasks.max connector property in Kafka Connect.
> > > > > >
> > > > > > The KIP:
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1004%3A+Enforce+tasks.max+property+in+Kafka+Connect
> > > > > >
> > > > > > The discussion thread:
> > > > > > https://lists.apache.org/thread/scx75cjwm19jyt19wxky41q9smf5nx6d
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Chris
> > > > > >
> > > > >
> > > > >
> > >
>


Re: DISCUSS KIP-1011: Use incrementalAlterConfigs when updating broker configs by kafka-configs.sh

2024-01-05 Thread Chris Egerton
Hi all,

Can we clarify any changes in the user-facing semantics for the CLI tool
that would come about as a result of this KIP? I think the debate over the
necessity of an opt-in flag, or waiting for 4.0.0, ultimately boils down to
this.

My understanding is that the only changes in semantics are fairly minor
(semantic versioning pun intended):

- Existing sensitive broker properties no longer have to be explicitly
specified on the command line if they're not being changed
- A small race condition is fixed where the broker config is updated by a
separate operation in between when the CLI reads the existing broker config
and writes the new broker config
- Usage of a new broker API that has been supported since version 2.3.0,
but which does not require any new ACLs and does not act any differently
apart from the two small changes noted above

If this is correct, then I'm inclined to agree with Ismael's suggestion of
starting with incrementalAlterConfigs, and falling back on alterConfigs if
the former is not supported by the broker, and do not believe it's
necessary to wait for 4.0.0 or provide opt-in or opt-out flags to release
this change. This would also be similar to changes we made to MirrorMaker 2
in KIP-894 [1], where the default behavior for syncing topic configs is now
to start with incrementalAlterConfigs and fall back on alterConfigs if it's
not supported.

If there are other, more significant changes to the user-facing semantics
for the CLI, then these should be called out here and in the KIP, and we
might consider a more cautious approach.

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-894%3A+Use+incrementalAlterConfigs+API+for+syncing+topic+configurations
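
The fallback pattern described above might look like the following sketch. The AdminApi interface and UnsupportedVersionException here are simplified stand-ins for illustration, not the real org.apache.kafka.clients.admin classes:

```java
import java.util.Map;

// Sketch of "start with incrementalAlterConfigs, fall back on alterConfigs
// if the broker does not support it" (the KIP-894-style behavior).
public class AlterConfigsFallback {

    // Stand-in for the error an old (pre-2.3.0) broker would produce.
    static class UnsupportedVersionException extends RuntimeException { }

    // Stand-in for the relevant slice of the admin client surface.
    interface AdminApi {
        void incrementalAlterConfigs(Map<String, String> changes);
        void alterConfigs(Map<String, String> fullConfig);
    }

    /** Returns which API ended up being used: "incremental" or "legacy". */
    public static String alterWithFallback(AdminApi admin,
                                           Map<String, String> changes,
                                           Map<String, String> fullConfig) {
        try {
            // Preferred path: only the changed keys are sent, so sensitive
            // properties that are not being changed never have to be
            // re-specified by the caller.
            admin.incrementalAlterConfigs(changes);
            return "incremental";
        } catch (UnsupportedVersionException e) {
            // Old broker: fall back to the read-modify-write style call,
            // which requires the full config map.
            admin.alterConfigs(fullConfig);
            return "legacy";
        }
    }

    /** Stub admin that behaves like a broker older than 2.3.0. */
    public static AdminApi legacyOnlyStub() {
        return new AdminApi() {
            @Override public void incrementalAlterConfigs(Map<String, String> c) {
                throw new UnsupportedVersionException();
            }
            @Override public void alterConfigs(Map<String, String> c) { }
        };
    }

    /** Stub admin that supports the incremental API. */
    public static AdminApi modernStub() {
        return new AdminApi() {
            @Override public void incrementalAlterConfigs(Map<String, String> c) { }
            @Override public void alterConfigs(Map<String, String> c) { }
        };
    }
}
```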


Also, regarding this part of the KIP:

> incrementalAlterConfigs is more convenient especially for updating
configs of list data type, such as "leader.replication.throttled.replicas"

While this is true for the Java admin client and the corresponding broker
APIs, it doesn't appear to be relevant to the kafka-configs.sh CLI tool. We
don't appear to support appending/subtracting from list properties via the
CLI for any other entity type right now, and there's nothing in the KIP
that leads me to believe we'd be adding it for broker configs.

Cheers,

Chris

On Thu, Jan 4, 2024 at 10:12 PM ziming deng 
wrote:

> Hi Ismael,
> I added this automatic approach to “Rejected alternatives” out of concern
> that we need to unify the semantics between alterConfigs and
> incrementalAlterConfigs, so I chose to give this choice to users.
>
> After reviewing the code and doing some tests I found that they
> follow a similar approach; I think the simplest way is to let the
> client choose the best method heuristically.
>
> Thank you for pointing out this, I will change the KIP later.
>
> Best,
> Ziming
>
> > On Jan 4, 2024, at 17:28, Ismael Juma  wrote:
> >
> > Hi Ziming,
> >
> > Why is the flag required at all? Can we use incremental and fallback
> automatically if it's not supported by the broker? At this point, the vast
> majority of clusters should support it.
> >
> > Ismael
> >
> > On Mon, Dec 18, 2023 at 7:58 PM ziming deng  > wrote:
> >>
> >> Hello, I want to start a discussion on KIP-1011, to make the broker
> config change path unified with that of user/topic/client-metrics and avoid
> some bugs.
> >>
> >> Here is the link:
> >>
> >> KIP-1011: Use incrementalAlterConfigs when updating broker configs by
> >> kafka-configs.sh - Apache Kafka - Apache Software Foundation
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1011%3A+Use+incrementalAlterConfigs+when+updating+broker+configs+by+kafka-configs.sh
> >>
> >> Best,
> >> Ziming.
>
>


Re: [VOTE] KIP-1004: Enforce tasks.max property in Kafka Connect

2024-01-02 Thread Chris Egerton
Hi all,

Happy New Year! Wanted to give this a bump now that the holidays are over
for a lot of us. Looking forward to people's thoughts!

Cheers,

Chris

On Mon, Dec 4, 2023 at 10:36 AM Chris Egerton  wrote:

> Hi all,
>
> I'd like to call for a vote on KIP-1004, which adds enforcement for the
> tasks.max connector property in Kafka Connect.
>
> The KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1004%3A+Enforce+tasks.max+property+in+Kafka+Connect
>
> The discussion thread:
> https://lists.apache.org/thread/scx75cjwm19jyt19wxky41q9smf5nx6d
>
> Cheers,
>
> Chris
>


Re: [VOTE] KIP-993: Allow restricting files accessed by File and Directory ConfigProviders

2023-12-13 Thread Chris Egerton
Hi Tina,

Thanks for the KIP! +1 (binding)

Cheers,

Chris

On Wed, Dec 13, 2023 at 5:50 AM Gantigmaa Selenge 
wrote:

> Hi
>
> I'd like to call for a vote on KIP-993, which allows restricting files
> accessed by File and Directory ConfigProviders when using Connect.
>
> The KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-993%3A+Allow+restricting+files+accessed+by+File+and+Directory+ConfigProviders
>
> The discussion thread:
> https://lists.apache.org/thread/7931thc3m58wbnvswxv4osy4nn3qkb3j
>
> Thank you
> Tina
>


[DISCUSS] Kafka Connect source task interruption semantics

2023-12-12 Thread Chris Egerton
Hi all,

I'd like to solicit input from users and maintainers on a problem we've
been dealing with for source task cleanup logic.

If you'd like to pore over some Jira history, here's the primary link:
https://issues.apache.org/jira/browse/KAFKA-15090

To summarize, we accidentally introduced a breaking change for Kafka
Connect in https://github.com/apache/kafka/pull/9669. Before that change,
the SourceTask::stop method [1] would be invoked on a separate thread from
the one that did the actual data processing for the task (polling the task
for records, transforming and converting those records, then sending them
to Kafka). After that change, we began invoking SourceTask::stop on the
same thread that handled data processing for the task. This had the effect
that tasks which blocked indefinitely in the SourceTask::poll method [2]
with the expectation that they could stop blocking when SourceTask::stop
was invoked would no longer be capable of graceful shutdown, and may even
hang forever.

This breaking change was introduced in the 3.0.0 release, a little over two
years ago. Since then, source connectors may have been modified to adapt to
the change in behavior by the Connect framework. As a result, we are
hesitant to go back to the prior logic of invoking SourceTask::stop on a
separate thread (see the linked Jira ticket for more detail on this front).

In https://github.com/apache/kafka/pull/14316, I proposed that we begin
interrupting the data processing thread for the source task after it had
exhausted its graceful shutdown timeout (i.e., when the Kafka Connect
runtime decides to cancel [3], [4], [5] the task). I believe this change is
fairly non-controversial--once a task has failed to shut down gracefully,
the runtime can and should do whatever it wants to force a shutdown,
graceful or otherwise.

With all that context out of the way, the question I'd like to ask is: do
we believe it's also appropriate to interrupt the data processing thread
when the task is scheduled for shutdown [6], [7]? This interruption would
ideally be followed up by a graceful shutdown of the task, which may
require the Kafka Connect runtime to handle a potential
InterruptedException from SourceTask::poll. Other exceptions (such as a
wrapped InterruptedException) would be impossible to handle gracefully, and
may lead to spurious error messages in the logs and failed final offset
commits for connectors that do not work well with this new behavior.

Finally, one important note: in the official documentation for
SourceTask::poll, we do already state that this method should not block for
too long:

> If no data is currently available, this method should block but return
control to the caller regularly (by returning null) in order for the task
to transition to the PAUSED state if requested to do so.
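
As a concrete illustration of that guidance, a bounded-wait poll() might be structured like the sketch below. PollSketch is a self-contained stand-in, not the real connect-api SourceTask; the same shape applies to a task reading from its own record queue:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative poll() that blocks while no data is available but returns
// control to the caller regularly by returning null, instead of blocking
// indefinitely inside poll().
public class PollSketch {
    private final BlockingQueue<String> records = new LinkedBlockingQueue<>();

    public void enqueue(String record) { records.add(record); }

    /** Returns a batch, or null if nothing arrived within the bounded wait. */
    public List<String> poll() {
        String first;
        try {
            // Bounded wait: the task thread gets a chance on every loop
            // iteration to observe a PAUSED/stop request.
            first = records.poll(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            // Graceful reaction to a cancellation interrupt: restore the
            // flag and yield, rather than wrapping the exception, so the
            // framework can finish shutdown and final offset commits.
            Thread.currentThread().interrupt();
            return null;
        }
        if (first == null) {
            return null; // no data yet: yield control to the framework
        }
        List<String> batch = new ArrayList<>();
        batch.add(first);
        records.drainTo(batch); // grab whatever else is already buffered
        return batch;
    }
}
```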

Looking forward to everyone's thoughts on this tricky issue!

Cheers,

Chris

[1] -
https://kafka.apache.org/36/javadoc/org/apache/kafka/connect/source/SourceTask.html#stop()
[2] -
https://kafka.apache.org/36/javadoc/org/apache/kafka/connect/source/SourceTask.html#poll()
[3] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L1037
[4] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L129-L136
[5] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractWorkerSourceTask.java#L284-L297
[6] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L1014
[7] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L112-L127


Re: KIP-993: Allow restricting files accessed by File and Directory ConfigProviders

2023-12-12 Thread Chris Egerton
Thanks Tina! LGTM as long as we take care to document that recursive access
to directories will be granted when we release this feature.

On Tue, Dec 12, 2023 at 8:16 AM Gantigmaa Selenge 
wrote:

> Hi Chris,
>
> Thank you for the feedback.
>
>
> 1.  Addressed
>
>
> 2. I have updated the type to be List. The configure() function is more
> likely to process the value as String and convert to List using the comma
> separation but I think it makes sense to specify it as List, as that is the
> final type.
>
>
> 3. Yes, it's mentioned under the Public Interfaces section but I also added
> another sentence to make it clearer.
>
>
> 4. Yes, I have just tested this to confirm and it looks like "/" gives
> access to the entire file system.
>
>
> Thanks.
> Regards,
> Tina
>
>
>
>
> On Thu, Dec 7, 2023 at 2:58 PM Chris Egerton 
> wrote:
>
> > Hi Tina,
> >
> > Thanks for the KIP! Looks good overall. A few minor thoughts:
> >
> > 1. We can remove the "This page is meant as a template for writing a KIP"
> > section from the beginning.
> >
> > 2. The type of the allowed.paths property is string in the KIP, but the
> > description mentions it'll contain multiple comma-separated paths.
> > Shouldn't it be described as a list? Or are we calling it a string in
> order
> > to allow for escape syntax for directories that may contain the delimiter
> > character (e.g., ',')?
> >
> > 3. I'm guessing the answer is yes but I want to make sure--will users be
> > allowed to specify files in the allowed.paths property?
> >
> > 4. Again, guessing the answer is yes but to make sure--if a directory is
> > specified in the allowed.paths property, will all files (nested or
> > otherwise) be accessible by the config provider? E.g., if I set
> > allowed.paths to "/", then everything on the entire file system would be
> > accessible, instead of just the files directly inside the root directory.
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Dec 7, 2023 at 9:33 AM Gantigmaa Selenge 
> > wrote:
> >
> > > Thank you Mickael.
> > >
> > > I'm going to leave the discussion thread open for a couple more days
> and
> > if
> > > there are no further comments, I would like to start the vote for this
> > KIP.
> > >
> > > Thanks.
> > > Regards,
> > > Tina
> > >
> > > On Wed, Dec 6, 2023 at 10:06 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm not aware of any other mechanisms to explore the filesystem. If
> > > > you have ideas, please reach out to the security list.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > > On Tue, Dec 5, 2023 at 1:05 PM Gantigmaa Selenge <
> gsele...@redhat.com>
> > > > wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > >
> > > > > Apologies for the very delayed response. Thank you both for the
> > > feedback.
> > > > >
> > > > >
> > > > > > For clarity it might make sense to mention this feature will be
> > > useful
> > > > >
> > > > > when using a ConfigProvider with Kafka Connect as providers are set
> > in
> > > > >
> > > > > the runtime and can then be used by connectors. This feature has no
> > > > >
> > > > > use when using a ConfigProvider in server.properties or in clients.
> > > > >
> > > > >
> > > > > I have updated the KIP to address this suggestion. Please let me
> know
> > > if
> > > > > it's not clear enough.
> > > > >
> > > > >
> > > > > > When trying to use a path not allowed, you propose returning an
> > > error.
> > > > >
> > > > > With Connect does that mean the connector will be failed? The
> > > > >
> > > > > EnvVarConfigProvider returns empty string in case a user tries to
> > > > >
> > > > > access an environment variable not allowed. I wonder if we should
> > > > >
> > > > > follow the same pattern so the behavior is "consistent" across all
> > > > >
> > > > > built-in providers.
> > > > >
> > > > >
> > > > > I agree 

Re: [DISCUSS] KIP-995: Allow users to specify initial offsets while creating connectors

2023-12-12 Thread Chris Egerton
Hi Ashwin,

LGTM! One small adjustment I'd suggest but we don't have to block on--it
may be clearer to put "Wipe all existing offsets for the connector" in
between steps 2 and 3 of the proposed changes section.

Cheers,

Chris

On Mon, Dec 11, 2023 at 11:24 PM Ashwin 
wrote:

> Thanks for pointing this out Chris.
>
> I have updated the KIP with the correct sequence of steps.
>
> Thanks,
> Ashwin
>
> On Wed, Dec 6, 2023 at 11:48 PM Chris Egerton 
> wrote:
>
> > Hi Ashwin,
> >
> > Regarding point 4--I think this is still a bit unwise. When workers pick
> up
> > a new connector config from the config topic, they participate in a
> > rebalance. It may be safe to write offsets during that rebalance, but in
> > the name of simplicity, do you think we can write the offsets for the
> > connector before its config? The sequence of steps would be something
> like
> > this:
> >
> > 1. Validate offsets and connector config (can be done in any order)
> > 2. Write offsets
> > 3. Write connector config (with whatever initial state is specified in
> the
> > request, or the default if none is specified)
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, Dec 6, 2023 at 9:13 AM Ashwin 
> > wrote:
> >
> > > Hello Chris,
> > >
> > > Thanks for the quick and detailed review. Please see my responses below
> > >
> > > High-level thoughts:
> > >
> > > 1. I had not thought of this till now, thanks for pointing it out. I
> > > would lean towards the second option of cleaning previous offsets as
> > > it will result in fewer surprises for the user.
> > >
> > > 2 and 3. I agree and have updated the wiki to that effect. I just
> > > wanted to use the connector name as a mutex - we can handle the race
> > > in other ways.
> > >
> > > 4. Yes, I meant submitting config to the config topic. Have updated
> > > the wiki to make it clearer.
> > >
> > >
> > > Nits:
> > >
> > > Thanks for pointing these - I have made the necessary changes.
> > >
> > >
> > > Thanks again,
> > >
> > > Ashwin
> > >
> > >
> > > On Mon, Dec 4, 2023 at 9:05 PM Chris Egerton 
> > > wrote:
> > >
> > > > Hi Ashwin,
> > > >
> > > > Thanks for the KIP! This would be a nice simplification to the
> process
> > > for
> > > > migrating connectors enabled by KIP-980, and would also add global
> > > support
> > > > for a feature I've seen implemented by hand in at least a couple
> > > connectors
> > > > over the years.
> > > >
> > > >
> > > > High-level thoughts:
> > > >
> > > > 1. If there are leftover offsets from a previous connector, what will
> > the
> > > > impact on those offsets (if any) be if a new connector is created
> with
> > > the
> > > > same name with initial offsets specified? I can think of at least two
> > > > options: we leave those offsets as-are but allow any of the initial
> > > offsets
> > > > in the new connector request to overwrite them, or we automatically
> > wipe
> > > > all existing offsets for the connector first before writing its
> initial
> > > > offsets and then creating it. I have a slight preference for the
> first
> > > > because it's simpler to implement and aligns with existing precedent
> > for
> > > > offset handling where we never wipe them unless explicitly requested
> by
> > > the
> > > > user or connector, but it could be argued that the second is less
> > likely
> > > to
> > > > generate footguns for users. Interested in your thoughts!
> > > >
> > > > 2. IMO preflight validation (i.e., the "Validate initial_offset
> before
> > > > creating the connector in STOPPED state rejected alternative) is a
> > > > must-have for this kind of feature. I acknowledge the possible race
> > > > condition (and I don't think it's worth it to track in-flight
> connector
> > > > creation requests in the herder in order to prevent this race, since
> > > > ever-blocking Connector operations would be cumbersome to deal with),
> > and
> > > > the extra implementation effort. But I don't think either of those
> tip
> > > the
> > > > scales far enough to override the benefit of ensuring the submitted
> > > offsets

[jira] [Resolved] (KAFKA-15563) Provide informative error messages when Connect REST requests time out

2023-12-11 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-15563.
---
Fix Version/s: 3.7.0
   Resolution: Fixed

> Provide informative error messages when Connect REST requests time out
> --
>
> Key: KAFKA-15563
> URL: https://issues.apache.org/jira/browse/KAFKA-15563
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>    Reporter: Chris Egerton
>    Assignee: Chris Egerton
>Priority: Major
> Fix For: 3.7.0
>
>
> The Kafka Connect REST API has a hardcoded timeout of 90 seconds. If any 
> operations take longer than that, a 500 error response is returned with the 
> message "Request timed out" (see 
> [here|https://github.com/apache/kafka/blob/7e1c453af9533aba8c19da2d08ce6595c1441fc0/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/HerderRequestHandler.java#L70]).
> This can be a source of frustration for users, who want to understand what is 
> causing the request to time out. This can be specific to the request (for 
> example, a connector's [custom multi-property validation 
> logic|https://kafka.apache.org/35/javadoc/org/apache/kafka/connect/connector/Connector.html#validate(java.util.Map)]
>  is taking too long), or applicable to any request that goes through the 
> herder's tick thread (for which there are a variety of possible causes).
> We can give users better, immediate insight into what is causing requests to 
> time out by including information about the last possibly-blocking operation 
> the worker performed while servicing the request (or attempting to enter a 
> state where all preconditions necessary to service the request have been 
> satisfied), and when the worker began that operation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15988) Kafka Connect OffsetsApiIntegrationTest takes too long

2023-12-07 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-15988:
-

 Summary: Kafka Connect OffsetsApiIntegrationTest takes too long
 Key: KAFKA-15988
 URL: https://issues.apache.org/jira/browse/KAFKA-15988
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Chris Egerton
Assignee: Chris Egerton


The [OffsetsApiIntegrationTest 
suite|https://github.com/apache/kafka/blob/c515bf51f820f26ff6be6b0fde03b47b69a10b00/connect/runtime/src/test/java/org/apache/kafka/connect/integration/OffsetsApiIntegrationTest.java]
 currently contains 27 test cases. Each test case begins by creating embedded 
Kafka and Kafka Connect clusters, which is fairly resource-intensive and 
time-consuming.

If possible, we should reuse those embedded clusters across test cases.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: KIP-993: Allow restricting files accessed by File and Directory ConfigProviders

2023-12-07 Thread Chris Egerton
Hi Tina,

Thanks for the KIP! Looks good overall. A few minor thoughts:

1. We can remove the "This page is meant as a template for writing a KIP"
section from the beginning.

2. The type of the allowed.paths property is string in the KIP, but the
description mentions it'll contain multiple comma-separated paths.
Shouldn't it be described as a list? Or are we calling it a string in order
to allow for escape syntax for directories that may contain the delimiter
character (e.g., ',')?

3. I'm guessing the answer is yes but I want to make sure--will users be
allowed to specify files in the allowed.paths property?

4. Again, guessing the answer is yes but to make sure--if a directory is
specified in the allowed.paths property, will all files (nested or
otherwise) be accessible by the config provider? E.g., if I set
allowed.paths to "/", then everything on the entire file system would be
accessible, instead of just the files directly inside the root directory.

Cheers,

Chris

On Thu, Dec 7, 2023 at 9:33 AM Gantigmaa Selenge 
wrote:

> Thank you Mickael.
>
> I'm going to leave the discussion thread open for a couple more days and if
> there are no further comments, I would like to start the vote for this KIP.
>
> Thanks.
> Regards,
> Tina
>
> On Wed, Dec 6, 2023 at 10:06 AM Mickael Maison 
> wrote:
>
> > Hi,
> >
> > I'm not aware of any other mechanisms to explore the filesystem. If
> > you have ideas, please reach out to the security list.
> >
> > Thanks,
> > Mickael
> >
> > On Tue, Dec 5, 2023 at 1:05 PM Gantigmaa Selenge 
> > wrote:
> > >
> > > Hi everyone,
> > >
> > >
> > > Apologies for the very delayed response. Thank you both for the
> feedback.
> > >
> > >
> > > > For clarity it might make sense to mention this feature will be
> useful
> > >
> > > when using a ConfigProvider with Kafka Connect as providers are set in
> > >
> > > the runtime and can then be used by connectors. This feature has no
> > >
> > > use when using a ConfigProvider in server.properties or in clients.
> > >
> > >
> > > I have updated the KIP to address this suggestion. Please let me know
> if
> > > it's not clear enough.
> > >
> > >
> > > > When trying to use a path not allowed, you propose returning an
> error.
> > >
> > > With Connect does that mean the connector will be failed? The
> > >
> > > EnvVarConfigProvider returns empty string in case a user tries to
> > >
> > > access an environment variable not allowed. I wonder if we should
> > >
> > > follow the same pattern so the behavior is "consistent" across all
> > >
> > > built-in providers.
> > >
> > >
> > > I agree with this, it makes sense to have consistent behaviour across
> all
> > > the providers. I made this update.
> > >
> > >
> > > > 1. In the past Connect removed the FileStream connectors in order to
> > >
> > > prevent a REST API attacker from accessing the filesystem. Is this the
> > >
> > > only remaining attack vector for reading the file system? Meaning, if
> > >
> > > this feature is configured and all custom plugins are audited for
> > >
> > > filesystem accesses, would someone with access to the REST API be
> > >
> > > unable to access arbitrary files on disk?
> > >
> > >
> > > Once this feature is configured, it will stop someone from accessing
> the
> > > file system via config providers.
> > >
> > > However, I’m not sure whether there are other ways users can access
> file
> > > systems via REST API.
> > >
> > >
> > > Mickael, perhaps you have some thoughts on this?
> > >
> > >
> > > > 2. Could you explain how this feature would prevent a path traversal
> > >
> > > attack, and how we will verify that such attacks are not feasible?
> > >
> > >
> > > The intention is to generate File objects based on the String value
> > > provided for allowed.paths and the String path passed to the get()
> > function.
> > >
> > > This would allow validation of path inclusion within the specified
> > allowed
> > > paths using their corresponding Path objects, rather than doing String
> > > comparisons.
> > >
> > > This hopefully will mitigate the risk of path traversal. The
> > implementation
> > > should include unit tests to verify this.
> > >
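
The containment check described in that reply might be sketched as follows (illustrative names only; the point is comparing java.nio.file.Path objects after normalization rather than doing String comparisons):

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch of an allowed-paths check that is robust to "../" traversal.
public class AllowedPaths {
    private final List<Path> allowed = new ArrayList<>();

    public AllowedPaths(List<String> allowedRoots) {
        for (String root : allowedRoots) {
            // Normalize once so "." and ".." segments in the configured
            // roots cannot skew later comparisons.
            allowed.add(Path.of(root).toAbsolutePath().normalize());
        }
    }

    /** True if the requested file lies under one of the allowed roots. */
    public boolean isAllowed(String requested) {
        // Path#startsWith compares whole name elements, which defeats both
        // "../" traversal and partial-prefix matches such as "/etc/secrets2"
        // against an allowed root of "/etc/secrets".
        Path candidate = Path.of(requested).toAbsolutePath().normalize();
        for (Path root : allowed) {
            if (candidate.startsWith(root)) {
                return true;
            }
        }
        return false;
    }
}
```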
> > >
> > > > 3. This applies a single "allowed paths" to a whole worker, but I've
> > >
> > > seen situations where preventing one connector from accessing
> > >
> > > another's secrets may also be desirable. Is there any way to extend
> > >
> > > this feature now or in the future to make that possible?
> > >
> > >
> > > One approach could be creating multiple providers, each assigned a
> unique
> > > name and specific allowed.paths configuration. Users would then be
> > assigned
> > > a provider name, granting them appropriate access on the file system to
> > > load variables for their connectors. However, during provider
> > > configuration, administrators would have to anticipate and specify the
> > > files and directories users may require access to.
> > >
> > >
> > > Regards,
> > >
> > > Tina
> > >
> > > On Wed, Nov 8, 2023 

Re: [DISCUSS] KIP-995: Allow users to specify initial offsets while creating connectors

2023-12-06 Thread Chris Egerton
Hi Ashwin,

Regarding point 4--I think this is still a bit unwise. When workers pick up
a new connector config from the config topic, they participate in a
rebalance. It may be safe to write offsets during that rebalance, but in
the name of simplicity, do you think we can write the offsets for the
connector before its config? The sequence of steps would be something like
this:

1. Validate offsets and connector config (can be done in any order)
2. Write offsets
3. Write connector config (with whatever initial state is specified in the
request, or the default if none is specified)
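
That ordering could be sketched like this (the method names are hypothetical stand-ins, not the actual herder code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the validate -> write offsets -> write config sequence above.
public class CreateWithOffsets {
    public final List<String> log = new ArrayList<>();

    public boolean validateConfig(Map<String, String> config)  { log.add("validate-config");  return true; }
    public boolean validateOffsets(Map<String, String> offsets) { log.add("validate-offsets"); return true; }
    public void writeOffsets(Map<String, String> offsets) { log.add("write-offsets"); }
    public void writeConfig(Map<String, String> config)   { log.add("write-config"); }

    /** Returns false (and writes nothing) if either preflight check fails. */
    public boolean create(Map<String, String> config, Map<String, String> offsets) {
        // 1. Preflight validation of both pieces before any write, so a
        //    rejected request leaves no partial state behind.
        if (!validateConfig(config) || !validateOffsets(offsets)) {
            return false;
        }
        // 2. Offsets land first...
        writeOffsets(offsets);
        // 3. ...then the config write triggers the rebalance that actually
        //    brings up the (possibly STOPPED) connector.
        writeConfig(config);
        return true;
    }
}
```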

Cheers,

Chris

On Wed, Dec 6, 2023 at 9:13 AM Ashwin  wrote:

> Hello Chris,
>
> Thanks for the quick and detailed review. Please see my responses below
>
> High-level thoughts:
>
> 1. I had not thought of this till now, thanks for pointing it out. I
> would lean towards the second option of cleaning previous offsets as
> it will result in fewer surprises for the user.
>
> 2 and 3. I agree and have updated the wiki to that effect. I just
> wanted to use the connector name as a mutex - we can handle the race
> in other ways.
>
> 4. Yes, I meant submitting config to the config topic. Have updated
> the wiki to make it clearer.
>
>
> Nits:
>
> Thanks for pointing these - I have made the necessary changes.
>
>
> Thanks again,
>
> Ashwin
>
>
> On Mon, Dec 4, 2023 at 9:05 PM Chris Egerton 
> wrote:
>
> > Hi Ashwin,
> >
> > Thanks for the KIP! This would be a nice simplification to the process
> for
> > migrating connectors enabled by KIP-980, and would also add global
> support
> > for a feature I've seen implemented by hand in at least a couple
> connectors
> > over the years.
> >
> >
> > High-level thoughts:
> >
> > 1. If there are leftover offsets from a previous connector, what will the
> > impact on those offsets (if any) be if a new connector is created with
> the
> > same name with initial offsets specified? I can think of at least two
> > options: we leave those offsets as-are but allow any of the initial
> offsets
> > in the new connector request to overwrite them, or we automatically wipe
> > all existing offsets for the connector first before writing its initial
> > offsets and then creating it. I have a slight preference for the first
> > because it's simpler to implement and aligns with existing precedent for
> > offset handling where we never wipe them unless explicitly requested by
> the
> > user or connector, but it could be argued that the second is less likely
> to
> > generate footguns for users. Interested in your thoughts!
> >
> > 2. IMO preflight validation (i.e., the "Validate initial_offset before
> > creating the connector in STOPPED state" rejected alternative) is a
> > must-have for this kind of feature. I acknowledge the possible race
> > condition (and I don't think it's worth it to track in-flight connector
> > creation requests in the herder in order to prevent this race, since
> > ever-blocking Connector operations would be cumbersome to deal with), and
> > the extra implementation effort. But I don't think either of those tip
> the
> > scales far enough to override the benefit of ensuring the submitted
> offsets
> > are valid before creating the connector.
> >
> > 3. On the topic of preflight validation--I also think it's important that
> > we validate both the connector config and the initial offsets before
> either
> > creating the connector or storing the initial offsets. I don't think this
> > point requires any changes to the KIP that aren't already proposed with
> > point 2. above, but wanted to see if we could adopt this as a primary
> goal
> > for the design of the feature and keep it in mind with future changes.
> >
> > 4. In the proposed changes section, the first step is "Create a connector
> > in STOPPED state", and the second step is "Validate the offset...". What
> > exactly is entailed by "Create a connector"? Submit the config to the
> > config topic (presumably after a preflight validation of the config)?
> > Participate in the ensuing rebalance? I'm a little hesitant to start
> > performing rebalances inside REST requests, wondering if we can find a
> > lighter-weight way to implement this.
> >
> >
> > Nits:
> >
> > 5. You can remove the italicized "This page is meant as a template for
> > writing a KIP" section after the table of contents.
> >
> > 6. Can you file a JIRA ticket for this change and add a link to it in the
> > KIP under the status section?
> >
> >

Re: [DISCUSS] KIP-967: Support custom SSL configuration for Kafka Connect RestServer

2023-12-04 Thread Chris Egerton
Hi Taras,

Regarding slimming down the interface: IMO, we should do this right the
first time, and that includes not requiring unnecessary methods from users.
I think BaseSslEngineFactory is good enough as a superinterface.


Regarding the parsing logic: I think the KIP needs to be more explicit. We
should add something like this to the proposed changes section:

"If any properties are present in the worker config with a prefix of
"listeners.https.", then only properties with that prefix will be passed to
the SSL engine factory. Otherwise, all top-level worker properties will be
passed to the SSL engine factory. Note that this differs slightly from
existing logic in that the set of properties (prefixed or otherwise) will
not be filtered based on a predefined set of keys; this will enable custom
SSL engine factories to define and accept custom properties."
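A minimal sketch of that selection logic might look like the following. This assumes the `listeners.https.` prefix is stripped before the properties are handed to the factory, as with other prefixed configs in Connect; the class and method names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class SslConfigSelection {
    static final String PREFIX = "listeners.https.";

    // If any worker property carries the listeners.https. prefix, pass only those
    // properties (with the prefix stripped); otherwise fall back to all top-level
    // worker properties. Note: there is no filtering against a predefined set of
    // SSL keys, so custom factories can define and accept their own properties.
    static Map<String, Object> sslEngineFactoryConfigs(Map<String, Object> workerProps) {
        Map<String, Object> prefixed = new HashMap<>();
        for (Map.Entry<String, Object> e : workerProps.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                prefixed.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return prefixed.isEmpty() ? new HashMap<>(workerProps) : prefixed;
    }

    public static void main(String[] args) {
        Map<String, Object> worker = Map.of(
                "listeners.https.ssl.keystore.location", "/tmp/keystore.jks",
                "ssl.keystore.location", "/tmp/ignored.jks",
                "bootstrap.servers", "localhost:9092");
        System.out.println(sslEngineFactoryConfigs(worker));
        // prints {ssl.keystore.location=/tmp/keystore.jks}
    }
}
```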

I also took a quick look at the prototype (I usually try not to do this
since we vote on KIP documents, not PRs). I don't think we should populate
default values for SSL-related properties before sending properties to the
SSL engine factory, since it may confuse users who have written custom SSL
engine factories to see that properties not specified in their Connect
worker config are being passed to their factory. Instead, the default SSL
engine factory used by Connect can perform this logic, and we can let other
custom factories be responsible for their own default values.
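The suggested division of responsibility could be sketched like this. It is not Connect's actual default factory; the `DEFAULTS` map and the method shape are assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class DefaultSslEngineFactorySketch {
    // Defaults applied only inside the default factory, never injected into the
    // property map handed to custom factories.
    static final Map<String, Object> DEFAULTS = Map.of(
            "ssl.protocol", "TLSv1.3",
            "ssl.keystore.type", "JKS");

    static Map<String, Object> effectiveConfigs(Map<String, Object> provided) {
        Map<String, Object> merged = new HashMap<>(DEFAULTS);
        merged.putAll(provided); // user-provided values win over defaults
        return merged;
    }

    public static void main(String[] args) {
        // The worker only specified ssl.protocol; the default factory fills in the rest
        // internally, so custom factories never see values the user didn't configure.
        System.out.println(effectiveConfigs(Map.of("ssl.protocol", "TLSv1.2")));
    }
}
```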


Cheers,

Chris

On Wed, Nov 29, 2023 at 8:36 AM Taras Ledkov  wrote:

> Hi team,
>
> Ping for review / vote for KIP-967 [1].
> Voting thread is here [2]
>
> [1].
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-967%3A+Support+custom+SSL+configuration+for+Kafka+Connect+RestServer
> [2]. https://lists.apache.org/thread/wc4v5v3pynl15g1q547m2croqsqgmzpw
> [3]. https://github.com/apache/kafka/pull/14203
>
> --
> With best regards,
> Taras Ledkov
>


Re: [DISCUSS] KIP-1004: Enforce tasks.max property in Kafka Connect

2023-12-04 Thread Chris Egerton
Hi all,

It seems like there are no objections to this KIP, so I've kicked off a
vote thread:
https://lists.apache.org/thread/dgq332o5j25rwddbvfydf7ttrclldw17

Cheers,

Chris

On Fri, Nov 24, 2023 at 10:39 PM Chris Egerton 
wrote:

> Hi Yash,
>
> Thanks for taking a look! Yeah, it looks like we'll be forced to hold onto
> the property until 5.0 if we can't introduce it at least one minor release
> before 4.0. I don't think this is the worst thing. Although it'd be nice to
> have the property completely removed when designing features like KIP-987,
> if necessary, we can also declare any new features incompatible with
> connectors that have opted out of enforcement of the tasks.max property
> (and likely enforce this restraint programmatically via preflight
> validation, failing connectors/tasks, log messages, etc.).
>
> Cheers,
>
> Chris
>
> On Wed, Nov 22, 2023 at 10:04 PM Yash Mayya  wrote:
>
> > Hi Chris,
> >
> > Thanks for the well written and comprehensive KIP! Given that we're
> already
> > past the KIP freeze deadline for 3.7.0 (
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.7.0)
> and
> > there may not be a 3.8.0 release before the 4.0.0 release, would we then
> be
> > forced to punt the removal of "tasks.max.enforce" to a future 5.0.0
> > release? I don't have any other comments, and the proposed changes make
> > sense to me.
> >
> > Thanks,
> > Yash
> >
> > On Mon, Nov 20, 2023 at 10:50 PM Chris Egerton 
> > wrote:
> >
> > > Hi Hector,
> > >
> > > Thanks for taking a look! I think the key difference between the
> proposed
> > > behavior and the rejected alternative is that the set of tasks that
> will
> > be
> > > running with the former is still a complete set of tasks, whereas the
> set
> > > of tasks in the latter is a subset of tasks. Also noteworthy but
> slightly
> > > less important: the problem will be more visible to users with the
> former
> > > (the connector will still be marked FAILED) than with the latter.
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Tue, Nov 21, 2023, 00:53 Hector Geraldino (BLOOMBERG/ 919 3RD A) <
> > > hgerald...@bloomberg.net> wrote:
> > >
> > > > Thanks for the KIP Chris, adding this check makes total sense.
> > > >
> > > > I do have one question. The second paragraph in the Public Interfaces
> > > > section states:
> > > >
> > > > "If the connector generated excessive tasks after being reconfigured,
> > > then
> > > > any existing tasks for the connector will be allowed to continue
> > running,
> > > > unless that existing set of tasks also exceeds the tasks.max
> property."
> > > >
> > > > Would not failing the connector land us in the second scenario of
> > > > 'Rejected Alternatives'?
> > > >
> > > > From: dev@kafka.apache.org At: 11/11/23 00:27:44 UTC-5:00To:
> > > > dev@kafka.apache.org
> > > > Subject: [DISCUSS] KIP-1004: Enforce tasks.max property in Kafka
> > Connect
> > > >
> > > > Hi all,
> > > >
> > > > I'd like to open up KIP-1004 for discussion:
> > > >
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1004%3A+Enforce+tasks.max+property+in+Kafka+Connect
> > > >
> > > > As a brief summary: this KIP proposes that the Kafka Connect runtime
> > > start
> > > > failing connectors that generate a greater number of tasks than the
> > > > tasks.max property, with an optional emergency override that can be
> > used
> > > to
> > > > continue running these (probably-buggy) connectors if absolutely
> > > necessary.
> > > >
> > > > I'll be taking time off most of the next three weeks, so response
> > latency
> > > > may be a bit higher than usual, but I wanted to kick off the
> discussion
> > > in
> > > > case we can land this in time for the upcoming 3.7.0 release.
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > >
> > > >
> > >
> >
>
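For reference, the enforcement discussed above boils down to a check along these lines. This is a hedged sketch, not the actual runtime code; the names and the exception type are illustrative:

```java
import java.util.List;
import java.util.Map;

public class TasksMaxEnforcementSketch {
    // Sketch of the proposed check: reject (and ultimately fail) a connector that
    // generates more task configs than tasks.max allows, unless the emergency
    // override (enforce=false) is enabled for probably-buggy connectors.
    static List<Map<String, String>> checkTaskConfigs(
            List<Map<String, String>> taskConfigs, int tasksMax, boolean enforce) {
        if (enforce && taskConfigs.size() > tasksMax) {
            throw new IllegalStateException(
                    "Connector generated " + taskConfigs.size()
                    + " task configs, which exceeds tasks.max=" + tasksMax);
        }
        return taskConfigs;
    }

    public static void main(String[] args) {
        List<Map<String, String>> tasks =
                List.of(Map.of("t", "0"), Map.of("t", "1"), Map.of("t", "2"));
        // With the override, the excessive task set is still allowed through whole:
        System.out.println(checkTaskConfigs(tasks, 2, false).size()); // prints 3
        try {
            checkTaskConfigs(tasks, 2, true);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```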


[VOTE] KIP-1004: Enforce tasks.max property in Kafka Connect

2023-12-04 Thread Chris Egerton
Hi all,

I'd like to call for a vote on KIP-1004, which adds enforcement for the
tasks.max connector property in Kafka Connect.

The KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1004%3A+Enforce+tasks.max+property+in+Kafka+Connect

The discussion thread:
https://lists.apache.org/thread/scx75cjwm19jyt19wxky41q9smf5nx6d

Cheers,

Chris


Re: [DISCUSS] KIP-995: Allow users to specify initial offsets while creating connectors

2023-12-04 Thread Chris Egerton
Oh, one more thing--can we add the KIP to the list of KIPs?
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals#KafkaImprovementProposals-KIPsunderdiscussion

On Mon, Dec 4, 2023 at 10:33 AM Chris Egerton  wrote:

> Hi Ashwin,
>
> Thanks for the KIP! This would be a nice simplification to the process for
> migrating connectors enabled by KIP-980, and would also add global support
> for a feature I've seen implemented by hand in at least a couple connectors
> over the years.
>
>
> High-level thoughts:
>
> 1. If there are leftover offsets from a previous connector, what will the
> impact on those offsets (if any) be if a new connector is created with the
> same name with initial offsets specified? I can think of at least two
> options: we leave those offsets as-is but allow any of the initial offsets
> in the new connector request to overwrite them, or we automatically wipe
> all existing offsets for the connector first before writing its initial
> offsets and then creating it. I have a slight preference for the first
> because it's simpler to implement and aligns with existing precedent for
> offset handling where we never wipe them unless explicitly requested by the
> user or connector, but it could be argued that the second is less likely to
> generate footguns for users. Interested in your thoughts!
>
> 2. IMO preflight validation (i.e., the "Validate initial_offset before
> creating the connector in STOPPED state" rejected alternative) is a
> must-have for this kind of feature. I acknowledge the possible race
> condition (and I don't think it's worth it to track in-flight connector
> creation requests in the herder in order to prevent this race, since
> ever-blocking Connector operations would be cumbersome to deal with), and
> the extra implementation effort. But I don't think either of those tip the
> scales far enough to override the benefit of ensuring the submitted offsets
> are valid before creating the connector.
>
> 3. On the topic of preflight validation--I also think it's important that
> we validate both the connector config and the initial offsets before either
> creating the connector or storing the initial offsets. I don't think this
> point requires any changes to the KIP that aren't already proposed with
> point 2. above, but wanted to see if we could adopt this as a primary goal
> for the design of the feature and keep it in mind with future changes.
>
> 4. In the proposed changes section, the first step is "Create a connector
> in STOPPED state", and the second step is "Validate the offset...". What
> exactly is entailed by "Create a connector"? Submit the config to the
> config topic (presumably after a preflight validation of the config)?
> Participate in the ensuing rebalance? I'm a little hesitant to start
> performing rebalances inside REST requests, wondering if we can find a
> lighter-weight way to implement this.
>
>
> Nits:
>
> 5. You can remove the italicized "This page is meant as a template for
> writing a KIP" section after the table of contents.
>
> 6. Can you file a JIRA ticket for this change and add a link to it in the
> KIP under the status section?
>
> 7. What do you think about changing the name for the new field from
> "initial_offset" (singular) to "initial_offsets" (plural)? This is
> especially relevant for connectors that read from multiple source
> partitions, like MM2 and the various JDBC source connectors out there.
>
> 8. IMO it's not necessary to include this section: "Please note that sink
> and source connectors have different schemas for offset." While it's
> technically true that the fields of the objects inside the "partition" and
> "offset" fields will likely differ between sink and source connectors,
> they'll also likely differ in the exact same way across different sink
> connectors. I think it's enough to link to the relevant section in KIP-875
> on the details for the format.
>
> 9. Instead of "Connector-defined source partition" and "Connector-defined
> source offset" in the comments for the sample connector creation body
> (which aren't strictly accurate for sink connectors), can we say something
> like "Source partition" and "Desired initial offset"?
>
> 10. In the compatibility section the KIP states that "This new feature
> will use the current OffsetStorageWriter." IMO we should refrain from
> referring to internal API class names in KIPs when possible, since those
> class names may change and users may also mistakenly assume that they're
> part of the public API. Can we say something like "This new feature will
> reuse existi
