[DISCUSS] Conventional commits

2023-06-07 Thread Ismael Juma
Hi,

A number of open source projects have adopted the conventional commits
specification. What do people think about using this for Apache Kafka?

https://www.conventionalcommits.org/en/v1.0.0/

Thanks,
Ismael
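
For context, the specification structures a commit message as `<type>[optional scope]: <description>`, with an optional body and footers, and flags breaking changes with `!` or a `BREAKING CHANGE:` footer. A few hypothetical Kafka-flavoured subjects, purely as illustration (mine, not from this thread):

```
feat(streams): add a gauge for suppression buffer size
fix(consumer): avoid NPE when the group id is empty
feat(admin)!: remove the deprecated --zookeeper flag
```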


Re: [VOTE] KIP-938: Add more metrics for measuring KRaft performance

2023-06-07 Thread Luke Chen
Hi Colin,

Thanks for the response.
I have no more comments.
+1 (binding)

Luke

On Thu, Jun 8, 2023 at 6:02 AM Colin McCabe  wrote:
>
> Hi Luke,
>
> Thanks for the review and the suggestion.
>
> I think we will add more "handling time" metrics later, but for now I don't 
> want to make this KIP any bigger than it is already...
>
> best,
> Colin
>
>
> On Wed, Jun 7, 2023, at 03:12, Luke Chen wrote:
> > Hi Colin,
> >
> > One comment:
> > Should we add a metric to record the snapshot handling time?
> > Since we know snapshot loading might take a long time if the snapshot is huge,
> > we might want to know how long it takes to process. WDYT?
> >
> > Whether or not you think we need it, the KIP LGTM.
> > +1 from me.
> >
> >
> > Thank you.
> > Luke
> >
> > On Wed, Jun 7, 2023 at 1:33 PM Colin McCabe  wrote:
> >>
> >> Hi all,
> >>
> >> I added two new metrics to the list:
> >>
> >> * LatestSnapshotGeneratedBytes
> >> * LatestSnapshotGeneratedAgeMs
> >>
> >> These will help monitor the periodic snapshot generation process.
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Tue, Jun 6, 2023, at 22:21, Colin McCabe wrote:
> >> > Hi Divij,
> >> >
> >> > Yes, I am referring to the feature level. I changed the description of
> >> > CurrentMetadataVersion to reference the feature level specifically.
> >> >
> >> > best,
> >> > Colin
> >> >
> >> >
> >> > On Tue, Jun 6, 2023, at 05:56, Divij Vaidya wrote:
> >> >> "Each metadata version has a corresponding integer in the
> >> >> MetadataVersion.java file."
> >> >>
> >> >> Please correct me if I'm wrong, but are you referring to "featureLevel"
> >> >> in
> >> >> the enum at
> >> >> https://github.com/apache/kafka/blob/trunk/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L45
> >> >> ? If yes, can we please update the description of the metric to make it
> >> >> easier for the users to understand this? For example, we can say,
> >> >> "Represents the current metadata version as an integer value. See
> >> >> MetadataVersion (hyperlink) for a mapping between string and integer
> >> >> formats of metadata version".
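
As a rough illustration of the mapping Divij describes, here is a minimal sketch assuming the featureLevel()/fromFeatureLevel()/version() methods of org.apache.kafka.server.common.MetadataVersion behave as they do on current trunk:

```
// Sketch only: maps between the integer a gauge like CurrentMetadataVersion would
// report and the human-readable metadata version string.
import org.apache.kafka.server.common.MetadataVersion;

public class MetadataVersionMetricExample {
    public static void main(String[] args) {
        // The integer value exposed by the metric:
        short level = MetadataVersion.IBP_3_5_IV2.featureLevel();
        // Mapping that integer back to the version string, e.g. "3.5-IV2":
        String versionString = MetadataVersion.fromFeatureLevel(level).version();
        System.out.println(level + " -> " + versionString);
    }
}
```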
> >> >>
> >> >> --
> >> >> Divij Vaidya
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Jun 6, 2023 at 1:51 PM Ron Dagostino  wrote:
> >> >>
> >> >>> Thanks again for the KIP, Colin.  +1 (binding).
> >> >>>
> >> >>> Ron
> >> >>>
> >> >>> > On Jun 6, 2023, at 7:02 AM, Igor Soarez 
> >> >>> wrote:
> >> >>> >
> >> >>> > Thanks for the KIP.
> >> >>> >
> >> >>> > Seems straightforward, LGTM.
> >> >>> > Non binding +1.
> >> >>> >
> >> >>> > --
> >> >>> > Igor
> >> >>> >
> >> >>>


[jira] [Created] (KAFKA-15074) offset out of range for partition xxx, resetting offset

2023-06-07 Thread YaYun Wang (Jira)
YaYun Wang created KAFKA-15074:
--

 Summary: offset out of range for partition xxx, resetting offset
 Key: KAFKA-15074
 URL: https://issues.apache.org/jira/browse/KAFKA-15074
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Affects Versions: 3.3.2
Reporter: YaYun Wang


I got "Fetch position FetchPosition{offset=42574305, 
offsetEpoch=Optional[2214], 
currentLeader=LeaderAndEpoch{leader=Optional[host:port (id: 2 rack: 
cn-north-1d)], epoch=2214}} is out of range for partition 
vcc.hdmap.tile.delivery-4, resetting offset"

when I consume from Kafka through @KafkaListener.


--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Regarding Old PRs

2023-06-07 Thread David Arthur
I filed KAFKA-15073 for this. Here is a patch
https://github.com/apache/kafka/pull/13827. This simply adds a "stale"
label to PRs with no activity in the last 90 days. I figure that's a good
starting point.

As for developer workflow, the "stale" action is quite flexible in how it
finds candidate PRs to mark as stale. For example, we can exclude PRs that
have an Assignee, or a particular set of labels. Docs are here
https://github.com/actions/stale
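
For reference, a rough sketch of what such a workflow file could look like; this is my illustration of the actions/stale options being discussed (day counts, labels, and exemptions are assumptions), not the contents of the actual PR:

```
# .github/workflows/stale.yml -- illustrative sketch only
name: "Label stale PRs"
on:
  schedule:
    - cron: "0 3 * * *"     # run once a day

permissions:
  pull-requests: write

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v8
        with:
          days-before-issue-stale: -1   # only look at PRs, not issues
          days-before-pr-stale: 90      # no activity in the last 90 days
          days-before-pr-close: -1      # only label; leave closing to a human
          stale-pr-label: "stale"
          exempt-pr-labels: "needs-review"
          stale-pr-message: "This PR has had no activity in 90 days and has been marked stale."
```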

-David


On Wed, Jun 7, 2023 at 2:36 PM Josep Prat 
wrote:

> Thanks David!
>
> ———
> Josep Prat
>
> Aiven Deutschland GmbH
>
> Alexanderufer 3-7, 10117 Berlin
>
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>
> m: +491715557497
>
> w: aiven.io
>
> e: josep.p...@aiven.io
>
> On Wed, Jun 7, 2023, 20:28 David Arthur 
> wrote:
>
> > Hey all, I started poking around at Github actions on my fork.
> >
> > https://github.com/mumrah/kafka/actions
> >
> > I'll post a PR if I get it working and we can discuss what kind of
> settings
> > we want (or if we want it all)
> >
> > -David
> >
> > On Tue, Jun 6, 2023 at 1:18 PM Chris Egerton 
> > wrote:
> >
> > > Hi Josep,
> > >
> > > Thanks for bringing this up! Will try to keep things brief.
> > >
> > > I'm generally in favor of this initiative. A couple ideas that I really
> > > liked: requiring a component label (producer, consumer, connect,
> streams,
> > > etc.) before closing, and disabling auto-close (i.e., automatically
> > tagging
> > > PRs as stale, but leaving it to a human being to actually close them).
> > >
> > > We might replace the "stale" label with a "close-by-" label so
> that
> > > it becomes even easier for us to find the PRs that are ready to be
> closed
> > > (as opposed to the ones that have just been labeled as stale without
> > giving
> > > the contributor enough time to respond).
> > >
> > > I've also gone ahead and closed some of my stale PRs. Others I've
> > > downgraded to draft to signal that I'd like to continue to pursue them,
> > but
> > > have to iron out merge conflicts first. For the last ones, I'll ping
> for
> > > review.
> > >
> > > One question that came to mind--do we want to distinguish between
> regular
> > > and draft PRs? I'm guessing not, since they still add up to the total
> PR
> > > count against the project, but since they do also implicitly signal
> that
> > > they're not intended for review (yet) it may be friendlier to leave
> them
> > > up.
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Tue, Jun 6, 2023 at 10:18 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Josep,
> > > >
> > > > Thanks for looking into this. This is clearly one aspect where we
> need
> > > > to improve.
> > > >
> > > > We had a similar initiative last year
> > > > (https://lists.apache.org/thread/66yj9m6tcyz8zqb3lqlbnr386bqwsopt)
> and
> > > > we closed many PRs. Unfortunately we did not follow up with a process
> > > > or automation and we are back to the same situation.
> > > >
> > > > Manually reviewing all these PRs is a huge task, so I think we should
> > > > at least partially automate it. I'm not sure if we should manually
> > > > review the oldest PRs (pre 2020). There's surely many interesting
> > > > things but I wonder if we should instead focus on the more recent
> ones
> > > > as they have a higher chance of 1) still making sense, 2) getting
> > > > updates from their authors, 3) needing less rebasing. If something
> has
> > > > been broken since 2016 but we never bothered to fix the PR it means
> it
> > > > can't be anything critical!
> > > >
> > > > Finally as Colin mentioned, it looks like a non negligible chunk of
> > > > stale PRs comes from committers and regular contributors. So I'd
> > > > suggest we each try to clean our own backlog too.
> > > >
> > > > I wonder if we also need to do something in JIRA. Querying for
> > > > unresolved tickets returns over 4000 items. Considering we're not
> > > > quite at KAFKA-15000 yet, that's a lot.
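
The query Mickael refers to is roughly the following JQL (my approximation, not a quote from the thread):

```
project = KAFKA AND resolution = Unresolved ORDER BY created DESC
```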
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > >
> > > > On Tue, Jun 6, 2023 at 11:35 AM Josep Prat
>  > >
> > > > wrote:
> > > > >
> > > > > Hi Devs,
> > > > > I would say we can split the problem in 2.
> > > > >
> > > > > Waiting for Author feedback:
> > > > > We could have a bot that would ping authors for the cases where we
> > have
> > > > PRs
> > > > > that are stalled and have either:
> > > > > - Merge conflict
> > > > > - Unaddressed reviews
> > > > >
> > > > > Waiting for reviewers:
> > > > > For the PRs where we have no reviewers and there are no conflicts,
> I
> > > > think
> > > > > we would need some human interaction to determine modules (maybe
> this
> > > can
> > > > > be automated) and ping people who could review.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Best,
> > > > >
> > > > > On Tue, Jun 6, 2023 at 11:30 AM Josep Prat 
> > > wrote:
> > > > >
> > > > > > Hi Nikolay,
> > > > > >
> > > > > > With a bot it wi

[jira] [Created] (KAFKA-15073) Automation for old/inactive PRs

2023-06-07 Thread David Arthur (Jira)
David Arthur created KAFKA-15073:


 Summary: Automation for old/inactive PRs
 Key: KAFKA-15073
 URL: https://issues.apache.org/jira/browse/KAFKA-15073
 Project: Kafka
  Issue Type: Improvement
  Components: build
Reporter: David Arthur


Following on from a discussion on the mailing list, it would be nice to 
automatically triage inactive PRs. There are currently over 1,000 open PRs, and 
most likely a majority of these will never be merged and should be closed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-935: Extend AlterConfigPolicy with existing configurations

2023-06-07 Thread Jorge Esteban Quilcate Otoya
Thanks, Colin.

I took a closer look at how configs are passed to the policy when config
deletions are requested, and either null (KRaft) or empty values
(ZkAdminManager) are passed:
- ZkAdminManager passes empty values:
  - Config Entries definition:
https://github.com/apache/kafka/blob/513e1c641d63c5e15144f9fcdafa1b56c5e5ba09/core/src/main/scala/kafka/server/ZkAdminManager.scala#L503
  - and passed to policy without changes:
https://github.com/apache/kafka/blob/513e1c641d63c5e15144f9fcdafa1b56c5e5ba09/core/src/main/scala/kafka/server/ZkAdminManager.scala#L495
- Similar with ConfigurationControlManager (KRaft) passes null values:
  - Config entries added regardless of value==null:
https://github.com/apache/kafka/blob/513e1c641d63c5e15144f9fcdafa1b56c5e5ba09/metadata/src/main/java/org/apache/kafka/controller/ConfigurationControlManager.java#L281
  - And passed to policy:
https://github.com/apache/kafka/blob/513e1c641d63c5e15144f9fcdafa1b56c5e5ba09/metadata/src/main/java/org/apache/kafka/controller/ConfigurationControlManager.java#L295

So, instead of passing (requested configs + requested configs to delete +
existing configs), the new metadata type includes (requested configs -- which
already cover deleted configs -- + _resulting_ configs), so users can evaluate
the whole (resulting) config map, similar to CreateTopicPolicy, and check only
the requested configs if needed (as with the current metadata).

I've also added a rejected alternative to consider the scenario of
piggybacking on the existing map and just including the resulting config
instead, though this would break compatibility with existing
implementations.

Thanks,
Jorge.


On Wed, 7 Jun 2023 at 08:38, Colin McCabe  wrote:

> On Tue, Jun 6, 2023, at 06:57, Jorge Esteban Quilcate Otoya wrote:
> > Thanks Colin.
> >
> >> I would suggest renaming the "configs" parameter to "proposedConfigs,"
> in
> > both the new and old RequestMetadata constructors, to make things
> clearer.
> > This would be a binary and API-compatible change in Java.
> >
> > Sure, fully agree. KIP is updated with this suggestion.
> >
> >> We should also clarify that if configurations are being proposed for
> > deletion, they won't appear in proposedConfigs.
> >
> > Great catch. Wasn't aware of this, but I think it's valuable for the API
> to
> > also include the list of configurations to be deleted.
> > For this, I have extended the `RequestMetadata` type with a list of
> > proposed configs to delete:
> >
>
> Hi Jorge,
>
> Thanks for the reply.
>
> Rather than having a separate list of proposedConfigsToDelete, it seems
> like we could have an accessor function that calculates this when needed.
> After all, it's completely determined by existingConfigs and
> proposedConfigs. And some plugins will want the list and some won't (or
> will want to do a slightly different analysis)
>
> regards,
> Colin
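
A minimal sketch of the accessor Colin is suggesting, under the assumption that deletions simply do not appear in proposedConfigs (my illustration, not part of the KIP text):

```
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch only: derive the proposed deletions from the two maps already carried
// by RequestMetadata, instead of storing a third field.
class RequestMetadataSketch {
    private final Map<String, String> proposedConfigs;
    private final Map<String, String> existingConfigs;

    RequestMetadataSketch(Map<String, String> proposedConfigs,
                          Map<String, String> existingConfigs) {
        this.proposedConfigs = proposedConfigs;
        this.existingConfigs = existingConfigs;
    }

    // Assumption: a config that exists today but is absent from the proposal is
    // being deleted (per the note that deletions do not appear in proposedConfigs).
    List<String> proposedConfigsToDelete() {
        return existingConfigs.keySet().stream()
                .filter(key -> !proposedConfigs.containsKey(key))
                .collect(Collectors.toList());
    }
}
```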
>
>
> > ```
> > class RequestMetadata {
> >
> > private final ConfigResource resource;
> > private final Map<String, String> proposedConfigs;
> > private final List<String> proposedConfigsToDelete;
> > private final Map<String, String> existingConfigs;
> > ```
> >
> > Looking forward to your feedback.
> >
> > Cheers,
> > Jorge.
> >
> > On Fri, 2 Jun 2023 at 22:42, Colin McCabe  wrote:
> >
> >> Hi Jorge,
> >>
> >> This is a good KIP which addresses a real gap we have today.
> >>
> >> I would suggest renaming the "configs" parameter to "proposedConfigs,"
> in
> >> both the new and old RequestMetadata constructors, to make things
> clearer.
> >> This would be a binary and API-compatible change in Java. We should also
> >> clarify that if configurations are being proposed for deletion, they
> won't
> >> appear in proposedConfigs.
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Tue, May 23, 2023, at 03:03, Christo Lolov wrote:
> >> > Hello!
> >> >
> >> > This proposal will address problems with configuration dependencies
> >> which I
> >> > run into very frequently, so I am fully supporting the development of
> >> this
> >> > feature!
> >> >
> >> > Best,
> >> > Christo
> >> >
> >> > On Mon, 22 May 2023 at 17:18, Jorge Esteban Quilcate Otoya <
> >> > quilcate.jo...@gmail.com> wrote:
> >> >
> >> >> Hi everyone,
> >> >>
> >> >> I'd like to start a discussion for KIP-935 <
> >> >>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-935%3A+Extend+AlterConfigPolicy+with+existing+configurations
> >> >> >
> >> >> which proposes extending AlterConfigPolicy with existing configurations
> >> >> to enable more complex policies.
> >> >>
> >> >> There have been related KIPs in the past that haven't been accepted
> and
> >> >> seem retired/abandoned as outlined in the motivation.
> >> >> The scope of this KIP is intentionally more specific, to capture most of the
> >> >> benefits from previous discussions; and if previous KIPs are resurrected,
> >> >> it should still be possible to do so even if this one is adopted.
> >> >>
> >> >> Looking forward to your feedback!
> >> >>
> >> >> Thanks,
> >> >> Jorge.
> >> >>
> >>
>

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1908

2023-06-07 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-938: Add more metrics for measuring KRaft performance

2023-06-07 Thread Colin McCabe
Hi Luke,

Thanks for the review and the suggestion.

I think we will add more "handling time" metrics later, but for now I don't 
want to make this KIP any bigger than it is already...

best,
Colin


On Wed, Jun 7, 2023, at 03:12, Luke Chen wrote:
> Hi Colin,
>
> One comment:
> Should we add a metric to record the snapshot handling time?
> Since we know snapshot loading might take a long time if the snapshot is huge,
> we might want to know how long it takes to process. WDYT?
>
> Whether or not you think we need it, the KIP LGTM.
> +1 from me.
>
>
> Thank you.
> Luke
>
> On Wed, Jun 7, 2023 at 1:33 PM Colin McCabe  wrote:
>>
>> Hi all,
>>
>> I added two new metrics to the list:
>>
>> * LatestSnapshotGeneratedBytes
>> * LatestSnapshotGeneratedAgeMs
>>
>> These will help monitor the periodic snapshot generation process.
>>
>> best,
>> Colin
>>
>>
>> On Tue, Jun 6, 2023, at 22:21, Colin McCabe wrote:
>> > Hi Divij,
>> >
>> > Yes, I am referring to the feature level. I changed the description of
>> > CurrentMetadataVersion to reference the feature level specifically.
>> >
>> > best,
>> > Colin
>> >
>> >
>> > On Tue, Jun 6, 2023, at 05:56, Divij Vaidya wrote:
>> >> "Each metadata version has a corresponding integer in the
>> >> MetadataVersion.java file."
>> >>
>> >> Please correct me if I'm wrong, but are you referring to "featureLevel"
>> >> in
>> >> the enum at
>> >> https://github.com/apache/kafka/blob/trunk/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L45
>> >> ? If yes, can we please update the description of the metric to make it
>> >> easier for the users to understand this? For example, we can say,
>> >> "Represents the current metadata version as an integer value. See
>> >> MetadataVersion (hyperlink) for a mapping between string and integer
>> >> formats of metadata version".
>> >>
>> >> --
>> >> Divij Vaidya
>> >>
>> >>
>> >>
>> >> On Tue, Jun 6, 2023 at 1:51 PM Ron Dagostino  wrote:
>> >>
>> >>> Thanks again for the KIP, Colin.  +1 (binding).
>> >>>
>> >>> Ron
>> >>>
>> >>> > On Jun 6, 2023, at 7:02 AM, Igor Soarez 
>> >>> wrote:
>> >>> >
>> >>> > Thanks for the KIP.
>> >>> >
>> >>> > Seems straightforward, LGTM.
>> >>> > Non binding +1.
>> >>> >
>> >>> > --
>> >>> > Igor
>> >>> >
>> >>>


Re: [DISCUSS] KIP-937 Improve Message Timestamp Validation

2023-06-07 Thread Beyene, Mehari
> Although it's more verbose, splitting the configuration into explicit ‘past’ 
> and ‘future’ would provide the appropriate tradeoff between constraint and 
> flexibility, right?

+1 






Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1907

2023-06-07 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-15072) Flaky test MirrorConnectorsIntegrationExactlyOnceTest.testReplicationWithEmptyPartition

2023-06-07 Thread Josep Prat (Jira)
Josep Prat created KAFKA-15072:
--

 Summary: Flaky test 
MirrorConnectorsIntegrationExactlyOnceTest.testReplicationWithEmptyPartition
 Key: KAFKA-15072
 URL: https://issues.apache.org/jira/browse/KAFKA-15072
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Affects Versions: 3.5.0
Reporter: Josep Prat


Test 
MirrorConnectorsIntegrationExactlyOnceTest.testReplicationWithEmptyPartition 
became flaky again, but it's a different error this time.

Occurrence: 
[https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13824/1/testReport/junit/org.apache.kafka.connect.mirror.integration/MirrorConnectorsIntegrationExactlyOnceTest/Build___JDK_17_and_Scala_2_13___testReplicationWithEmptyPartition__/]

 
h3. Error Message
{code:java}
java.lang.AssertionError: Connector MirrorHeartbeatConnector tasks did not 
start in time on cluster: backup-connect-cluster{code}
h3. Stacktrace
{code:java}
java.lang.AssertionError: Connector MirrorHeartbeatConnector tasks did not 
start in time on cluster: backup-connect-cluster at 
org.apache.kafka.connect.util.clusters.EmbeddedConnectClusterAssertions.assertConnectorAndAtLeastNumTasksAreRunning(EmbeddedConnectClusterAssertions.java:301)
 at 
org.apache.kafka.connect.mirror.integration.MirrorConnectorsIntegrationBaseTest.waitUntilMirrorMakerIsRunning(MirrorConnectorsIntegrationBaseTest.java:912)
 at 
org.apache.kafka.connect.mirror.integration.MirrorConnectorsIntegrationBaseTest.testReplicationWithEmptyPartition(MirrorConnectorsIntegrationBaseTest.java:415)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:568) at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
 at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
 at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
 at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
 at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
 at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
 at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
 at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
 at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
 at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
 at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
 at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 
at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
 at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
 at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeT

Re: [DISCUSS] KIP-937 Improve Message Timestamp Validation

2023-06-07 Thread Beyene, Mehari
Luke, thank you for the suggestion of introducing the two configurations. I 
have discussed this option internally with Divij, and your suggestion does have 
its merits. I will update the KIP with your suggestions and will revert back in 
a day or two.

Kirk, thank you for participating in the KIP discussion. I am unable to think 
of a reliable way to validate this on the client side with high confidence, as 
the frame of reference for validation is the Broker's timestamp. I am open to 
hearing any suggestions for client-side validation.

Thanks,
Mehari


>On 6/7/23, 10:26 AM, "Kirk True" <k...@kirktrue.pro> wrote:


Hi Mehari,


Thanks for the KIP and keeping it up-to-date with the discussions here!


Question:


1. Is it possible to check for invalid timestamps in the client? Suppose we 
were to develop a means to determine with high confidence that the user had 
provided a timestamp in nanoseconds vs. milliseconds, is it worth the trouble 
to either reject or output some debug logging? Of course, we don’t have the 
log.message.timestamp.difference.max.ms configuration on the client, and the 
client may be out of sync enough to cause false positives.


Thanks,
Kirk


> On Jun 6, 2023, at 5:42 AM, Divij Vaidya  wrote:
>
> Hi Luke
>
> Thank you for your participation in reviewing this KIP.
>
> #1 Updated the KIP with correct configuration names and hyperlinks.
>
> #2 Yes, the semantics change from a perspective that the difference is
> always in the past (or at most 1 hour into the future). Updated the
> compatibility section to represent the same. Will update again after #3 is
> resolved.
>
> #3 I'd propose to introduce 2 new configs, one is "
> log.message.timestamp.before.max.ms", and the other one is "
> log.message.timestamp.after.max.ms".
> This is a great idea because with your proposal, 1\ we can ensure backward
> compatibility (hence safety of this breaking change) in 3.x by keeping the
> defaults of these configurations in line with what "
> log.message.timestamp.difference.max.ms" provides today 2\ and when 4.x has
> modified logic to perform segment rotation based on append time which will
> always be available, we can still re-use these configurations to reject
> messages based on validation of create timestamps. Mehari, thoughts?
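
To make the proposed semantics concrete, a rough sketch of the broker-side check the two configs would drive; this is my illustration of the idea, not code from the KIP:

```
// Sketch only: reject a record whose create timestamp is too far in the past or
// too far in the future, relative to the broker's wall clock at append time.
class TimestampValidationSketch {
    static boolean inAllowedRange(long recordTimestampMs,
                                  long brokerNowMs,
                                  long beforeMaxMs,   // log.message.timestamp.before.max.ms
                                  long afterMaxMs) {  // log.message.timestamp.after.max.ms
        long behind = brokerNowMs - recordTimestampMs; // positive when the record is in the past
        long ahead = recordTimestampMs - brokerNowMs;  // positive when the record is in the future
        return behind <= beforeMaxMs && ahead <= afterMaxMs;
    }
}
```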
>
> --
> Divij Vaidya
>
>
>
> On Tue, Jun 6, 2023 at 11:43 AM Luke Chen  wrote:
>
>> Hi Beyene,
>>
>> Thanks for the KIP.
>>
>> Questions:
>> 1. We don't have "*max.message.time.difference.ms*" config, I think you're
>> referring to "log.message.timestamp.difference.max.ms"?
>> 2. After this KIP, the semantics of "
>> log.message.timestamp.difference.max.ms"
>> will be changed, right?
>> We should also mention it in this KIP, maybe compatibility section?
>> And I think the description of "log.message.timestamp.difference.max.ms"
>> will also need to be updated.
>> 3. Personally I like to see the "TIME_DRIFT_TOLERANCE" to be exposed as a
>> config, since after this change, the root issue is still not completed
>> resolved.
>> After this KIP, the 1 hour later of timestamp can still be appended
>> successfully, which might still be an issue for some applications.
>> I'd propose to introduce 2 new configs, one is "
>> log.message.timestamp.before.max.ms", and the other one is "
>> log.message.timestamp.after.max.ms".
>> And then we deprecate "log.message.timestamp.difference.max.ms". WDYT?
>>
>> Thank you.
>> Luke
>>
>>> On Tue, Jun 6, 2023 at 8:02 AM Beyene, Mehari 
>> wrote:
>>
>>> Hey Justine and Divij,
>>>
>>> Thank you for the recommendations.
>>> I've made the updates to the KIP and added a new section called "Future
>>> Work: Update Message Format to Include Both Client Timestamp and
>> LogAppend
>>> Timestamp."
>>>
>>> Please take a look when get some time and let me know if there's anything
>>> else you'd like me to address.
>>>
>>> Thanks,
>>> Mehari
>>>
>>> On 6/5/23, 10:16 AM, "Divij Vaidya" <divijvaidy...@gmail.com> wrote:
>>>
>>>
>>>
>>> Hey Justine
>>>
>>>
>>> Thank you for bringing this up. We had a discussion earlier in this [1]
>>> thread and concluded that bumping up the message version is a very
>>> expensive operation. Hence, we want to bundle together a bunch of
>>> impactful changes that we will perform on the message version and change
>> it
>>> in v4.0. We are currently capturing the ideas here [2]. The idea

[jira] [Resolved] (KAFKA-14539) Simplify StreamsMetadataState by replacing the Cluster metadata with partition info map

2023-06-07 Thread Bill Bejeck (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck resolved KAFKA-14539.
-
Resolution: Fixed

> Simplify StreamsMetadataState by replacing the Cluster metadata with 
> partition info map
> ---
>
> Key: KAFKA-14539
> URL: https://issues.apache.org/jira/browse/KAFKA-14539
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: A. Sophie Blee-Goldman
>Assignee: Danica Fine
>Priority: Major
>
> We can clean up the StreamsMetadataState class a bit by removing the 
> #onChange invocation that currently occurs within 
> StreamsPartitionAssignor#assign, which then lets us remove the `Cluster` 
> parameter in that callback. Instead of building a fake Cluster object from 
> the map of partition info when we invoke #onChange inside the 
> StreamsPartitionAssignor#onAssignment method, we can just directly pass in 
> the  `Map` and replace the usage of `Cluster` 
> everywhere in StreamsMetadataState
> (I believe the current system is a historical artifact from when we used to 
> require passing in a {{Cluster}} for the default partitioning strategy, which 
> the StreamMetadataState needs to compute the partition for a key. At some 
> point in the past we provided a better way to get the default partition, so 
> we no longer need a {{Cluster}} parameter/field at all)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15071) Flaky test kafka.admin.LeaderElectionCommandTest.testPreferredReplicaElection for Type=ZK, MetadataVersion=3.5-IV2, Security=PLAINTEXT

2023-06-07 Thread Josep Prat (Jira)
Josep Prat created KAFKA-15071:
--

 Summary: Flaky test 
kafka.admin.LeaderElectionCommandTest.testPreferredReplicaElection for Type=ZK, 
MetadataVersion=3.5-IV2, Security=PLAINTEXT
 Key: KAFKA-15071
 URL: https://issues.apache.org/jira/browse/KAFKA-15071
 Project: Kafka
  Issue Type: Bug
Reporter: Josep Prat


Test kafka.admin.LeaderElectionCommandTest.testPreferredReplicaElection became 
flaky again, but it is failing for a different reason. In this case it might be a 
missing cleanup.

The values of the parameters are Type=ZK, MetadataVersion=3.5-IV2, 
Security=PLAINTEXT

Related to https://issues.apache.org/jira/browse/KAFKA-13737

Occurred: 
https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13824/1/testReport/junit/kafka.admin/LeaderElectionCommandTest/Build___JDK_8_and_Scala_2_123__Type_ZK__Name_testPreferredReplicaElection__MetadataVersion_3_5_IV2__Security_PLAINTEXT/
h3. Error Message
{code:java}
org.apache.kafka.common.errors.TopicExistsException: Topic '__consumer_offsets' 
already exists.{code}
h3. Stacktrace
{code:java}
org.apache.kafka.common.errors.TopicExistsException: Topic '__consumer_offsets' 
already exists.{code}
{{ }}
h3. Standard Output
{code:java}
Successfully completed leader election (UNCLEAN) for partitions unclean-topic-0 
[2023-06-07 14:42:33,845] ERROR [QuorumController id=3000] writeNoOpRecord: 
unable to start processing because of RejectedExecutionException. Reason: The 
event queue is shutting down (org.apache.kafka.controller.QuorumController:467) 
[2023-06-07 14:42:42,699] WARN [AdminClient clientId=adminclient-65] Connection 
to node -2 (localhost/127.0.0.1:35103) could not be established. Broker may not 
be available. (org.apache.kafka.clients.NetworkClient:814) Successfully 
completed leader election (UNCLEAN) for partitions unclean-topic-0 [2023-06-07 
14:42:44,416] ERROR [QuorumController id=0] writeNoOpRecord: unable to start 
processing because of RejectedExecutionException. Reason: The event queue is 
shutting down (org.apache.kafka.controller.QuorumController:467) [2023-06-07 
14:42:44,716] WARN maxCnxns is not configured, using default value 0. 
(org.apache.zookeeper.server.ServerCnxnFactory:309) [2023-06-07 14:42:44,765] 
WARN No meta.properties file under dir 
/tmp/kafka-2117748934951771120/meta.properties 
(kafka.server.BrokerMetadataCheckpoint:70) [2023-06-07 14:42:44,986] WARN No 
meta.properties file under dir /tmp/kafka-5133306871105583937/meta.properties 
(kafka.server.BrokerMetadataCheckpoint:70) [2023-06-07 14:42:45,214] WARN No 
meta.properties file under dir /tmp/kafka-8449809620400833553/meta.properties 
(kafka.server.BrokerMetadataCheckpoint:70) [2023-06-07 14:42:45,634] WARN 
[ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Received UNKNOWN_TOPIC_ID 
from the leader for partition __consumer_offsets-0. This error may be returned 
transiently when the partition is being created or deleted, but it is not 
expected to persist. (kafka.server.ReplicaFetcherThread:70) [2023-06-07 
14:42:45,634] WARN [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] 
Received UNKNOWN_TOPIC_ID from the leader for partition __consumer_offsets-4. 
This error may be returned transiently when the partition is being created or 
deleted, but it is not expected to persist. 
(kafka.server.ReplicaFetcherThread:70) [2023-06-07 14:42:45,872] WARN 
[ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Received UNKNOWN_TOPIC_ID 
from the leader for partition __consumer_offsets-1. This error may be returned 
transiently when the partition is being created or deleted, but it is not 
expected to persist. (kafka.server.ReplicaFetcherThread:70) [2023-06-07 
14:42:46,010] WARN [ReplicaFetcher replicaId=0, leaderId=2, fetcherId=0] Error 
in response for fetch request (type=FetchRequest, replicaId=0, maxWait=500, 
minBytes=1, maxBytes=10485760, 
fetchData={__consumer_offsets-3=PartitionData(topicId=vAlEsYVbTFClcpnVRp3AOw, 
fetchOffset=0, logStartOffset=0, maxBytes=1048576, 
currentLeaderEpoch=Optional[0], lastFetchedEpoch=Optional.empty)}, 
isolationLevel=READ_UNCOMMITTED, removed=, replaced=, 
metadata=(sessionId=INVALID, epoch=INITIAL), rackId=) 
(kafka.server.ReplicaFetcherThread:72) java.io.IOException: Connection to 2 was 
disconnected before the response was read at 
org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:99)
 at 
kafka.server.BrokerBlockingSender.sendRequest(BrokerBlockingSender.scala:113) 
at kafka.server.RemoteLeaderEndPoint.fetch(RemoteLeaderEndPoint.scala:79) at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:316)
 at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:130)
 at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:129)
 at scala.Option.foreach(Option.scala:407) at 
kafka.server.AbstractFetc

[jira] [Created] (KAFKA-15070) Flaky test kafka.log.LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic for codec zstd

2023-06-07 Thread Josep Prat (Jira)
Josep Prat created KAFKA-15070:
--

 Summary: Flaky test 
kafka.log.LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic
 for codec zstd
 Key: KAFKA-15070
 URL: https://issues.apache.org/jira/browse/KAFKA-15070
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 3.5.0
Reporter: Josep Prat


Flaky tests with the following traces and output:
h3. Error Message

org.opentest4j.AssertionFailedError: Timed out waiting for deletion of old 
segments
h3. Stacktrace

org.opentest4j.AssertionFailedError: Timed out waiting for deletion of old 
segments at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38) 
at org.junit.jupiter.api.Assertions.fail(Assertions.java:135) at 
kafka.log.LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic(LogCleanerParameterizedIntegrationTest.scala:123)

...

 
h3. Standard Output

[2023-06-07 16:03:59,974] WARN [LocalLog partition=log-0, 
dir=/tmp/kafka-6339499869249617477] Record format version has been downgraded 
from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 16:04:01,691] WARN [LocalLog 
partition=log-0, dir=/tmp/kafka-6391328203703920459] Record format version has 
been downgraded from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 
16:04:02,661] WARN [LocalLog partition=log-0, 
dir=/tmp/kafka-7107559685120209313] Record format version has been downgraded 
from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 16:04:04,449] WARN [LocalLog 
partition=log-0, dir=/tmp/kafka-2334095685379242376] Record format version has 
been downgraded from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 
16:04:12,059] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-4306370019245327987] Could not find offset index file 
corresponding to log file 
/tmp/kafka-4306370019245327987/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:04:21,424] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-8549848301585177643] Could not find offset index file 
corresponding to log file 
/tmp/kafka-8549848301585177643/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:04:42,679] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-8308685679443421785] Could not find offset index file 
corresponding to log file 
/tmp/kafka-8308685679443421785/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:04:50,435] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-2686097435338562303] Could not find offset index file 
corresponding to log file 
/tmp/kafka-2686097435338562303/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:07:16,263] WARN [LocalLog partition=log-0, 
dir=/tmp/kafka-5435804108212698551] Record format version has been downgraded 
from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 16:07:35,193] WARN [LocalLog 
partition=log-0, dir=/tmp/kafka-4310277229895025994] Record format version has 
been downgraded from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 
16:07:55,323] WARN [LocalLog partition=log-0, 
dir=/tmp/kafka-3364951894697258113] Record format version has been downgraded 
from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 16:08:16,286] WARN [LocalLog 
partition=log-0, dir=/tmp/kafka-3161518940405121110] Record format version has 
been downgraded from V2 to V0. (kafka.log.LocalLog:70) [2023-06-07 
16:35:03,765] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-2385863108707929062] Could not find offset index file 
corresponding to log file 
/tmp/kafka-2385863108707929062/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:35:06,406] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-5380450050465409057] Could not find offset index file 
corresponding to log file 
/tmp/kafka-5380450050465409057/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:35:09,061] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-7510941634638265317] Could not find offset index file 
corresponding to log file 
/tmp/kafka-7510941634638265317/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:35:11,593] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-7423113520781905391] Could not find offset index file 
corresponding to log file 
/tmp/kafka-7423113520781905391/log-0/0300.log, recovering 
segment and rebuilding index files... (kafka.log.LogLoader:74) [2023-06-07 
16:35:14,159] ERROR [LogLoader partition=log-0, 
dir=/tmp/kafka-2120426496175304835] Could not find offset index file 
corresponding to log file 
/tmp/kafka-212042649617530483

Re: [DISCUSS] Regarding Old PRs

2023-06-07 Thread Josep Prat
Thanks David!

———
Josep Prat

Aiven Deutschland GmbH

Alexanderufer 3-7, 10117 Berlin

Amtsgericht Charlottenburg, HRB 209739 B

Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen

m: +491715557497

w: aiven.io

e: josep.p...@aiven.io

On Wed, Jun 7, 2023, 20:28 David Arthur 
wrote:

> Hey all, I started poking around at Github actions on my fork.
>
> https://github.com/mumrah/kafka/actions
>
> I'll post a PR if I get it working and we can discuss what kind of settings
> we want (or if we want it all)
>
> -David
>
> On Tue, Jun 6, 2023 at 1:18 PM Chris Egerton 
> wrote:
>
> > Hi Josep,
> >
> > Thanks for bringing this up! Will try to keep things brief.
> >
> > I'm generally in favor of this initiative. A couple ideas that I really
> > liked: requiring a component label (producer, consumer, connect, streams,
> > etc.) before closing, and disabling auto-close (i.e., automatically
> tagging
> > PRs as stale, but leaving it to a human being to actually close them).
> >
> > We might replace the "stale" label with a "close-by-" label so that
> > it becomes even easier for us to find the PRs that are ready to be closed
> > (as opposed to the ones that have just been labeled as stale without
> giving
> > the contributor enough time to respond).
> >
> > I've also gone ahead and closed some of my stale PRs. Others I've
> > downgraded to draft to signal that I'd like to continue to pursue them,
> but
> > have to iron out merge conflicts first. For the last ones, I'll ping for
> > review.
> >
> > One question that came to mind--do we want to distinguish between regular
> > and draft PRs? I'm guessing not, since they still add up to the total PR
> > count against the project, but since they do also implicitly signal that
> > they're not intended for review (yet) it may be friendlier to leave them
> > up.
> >
> > Cheers,
> >
> > Chris
> >
> > On Tue, Jun 6, 2023 at 10:18 AM Mickael Maison  >
> > wrote:
> >
> > > Hi Josep,
> > >
> > > Thanks for looking into this. This is clearly one aspect where we need
> > > to improve.
> > >
> > > We had a similar initiative last year
> > > (https://lists.apache.org/thread/66yj9m6tcyz8zqb3lqlbnr386bqwsopt) and
> > > we closed many PRs. Unfortunately we did not follow up with a process
> > > or automation and we are back to the same situation.
> > >
> > > Manually reviewing all these PRs is a huge task, so I think we should
> > > at least partially automate it. I'm not sure if we should manually
> > > review the oldest PRs (pre 2020). There's surely many interesting
> > > things but I wonder if we should instead focus on the more recent ones
> > > as they have a higher chance of 1) still making sense, 2) getting
> > > updates from their authors, 3) needing less rebasing. If something has
> > > been broken since 2016 but we never bothered to fix the PR it means it
> > > can't be anything critical!
> > >
> > > Finally as Colin mentioned, it looks like a non negligible chunk of
> > > stale PRs comes from committers and regular contributors. So I'd
> > > suggest we each try to clean our own backlog too.
> > >
> > > I wonder if we also need to do something in JIRA. Querying for
> > > unresolved tickets returns over 4000 items. Considering we're not
> > > quite at KAFKA-15000 yet, that's a lot.
> > >
> > > Thanks,
> > > Mickael
> > >
> > >
> > > On Tue, Jun 6, 2023 at 11:35 AM Josep Prat  >
> > > wrote:
> > > >
> > > > Hi Devs,
> > > > I would say we can split the problem in 2.
> > > >
> > > > Waiting for Author feedback:
> > > > We could have a bot that would ping authors for the cases where we
> have
> > > PRs
> > > > that are stalled and have either:
> > > > - Merge conflict
> > > > - Unaddressed reviews
> > > >
> > > > Waiting for reviewers:
> > > > For the PRs where we have no reviewers and there are no conflicts, I
> > > think
> > > > we would need some human interaction to determine modules (maybe this
> > can
> > > > be automated) and ping people who could review.
> > > >
> > > > What do you think?
> > > >
> > > > Best,
> > > >
> > > > On Tue, Jun 6, 2023 at 11:30 AM Josep Prat 
> > wrote:
> > > >
> > > > > Hi Nikolay,
> > > > >
> > > > > With a bot it will be complicated to determine what to do when the
> PR
> > > > > author is waiting for a reviewer. If a person goes over them, they can
> > > > > check if they are waiting for reviews, tag the PR accordingly, and
> > > > > maybe ping a maintainer.
> > > > > If you look at my last email I described a flow (but AFAIU it will
> > work
> > > > > only if a human executes it) where the situation you point out
> would
> > be
> > > > > covered.
> > > > >
> > > > > ———
> > > > > Josep Prat
> > > > >
> > > > > Aiven Deutschland GmbH
> > > > >
> > > > > Alexanderufer 3-7, 10117 Berlin
> > > > >
> > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > >
> > > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > >
> > > > > m: +491715557497
> > > > >
> > > > > w: aiven.io
> > > > >
> > > > > e: josep.p...@aiven.io
> > >

Re: [DISCUSS] Regarding Old PRs

2023-06-07 Thread David Arthur
Hey all, I started poking around at Github actions on my fork.

https://github.com/mumrah/kafka/actions

I'll post a PR if I get it working and we can discuss what kind of settings
we want (or if we want it all)

-David

On Tue, Jun 6, 2023 at 1:18 PM Chris Egerton 
wrote:

> Hi Josep,
>
> Thanks for bringing this up! Will try to keep things brief.
>
> I'm generally in favor of this initiative. A couple ideas that I really
> liked: requiring a component label (producer, consumer, connect, streams,
> etc.) before closing, and disabling auto-close (i.e., automatically tagging
> PRs as stale, but leaving it to a human being to actually close them).
>
> We might replace the "stale" label with a "close-by-" label so that
> it becomes even easier for us to find the PRs that are ready to be closed
> (as opposed to the ones that have just been labeled as stale without giving
> the contributor enough time to respond).
>
> I've also gone ahead and closed some of my stale PRs. Others I've
> downgraded to draft to signal that I'd like to continue to pursue them, but
> have to iron out merge conflicts first. For the last ones, I'll ping for
> review.
>
> One question that came to mind--do we want to distinguish between regular
> and draft PRs? I'm guessing not, since they still add up to the total PR
> count against the project, but since they do also implicitly signal that
> they're not intended for review (yet) it may be friendlier to leave them
> up.
>
> Cheers,
>
> Chris
>
> On Tue, Jun 6, 2023 at 10:18 AM Mickael Maison 
> wrote:
>
> > Hi Josep,
> >
> > Thanks for looking into this. This is clearly one aspect where we need
> > to improve.
> >
> > We had a similar initiative last year
> > (https://lists.apache.org/thread/66yj9m6tcyz8zqb3lqlbnr386bqwsopt) and
> > we closed many PRs. Unfortunately we did not follow up with a process
> > or automation and we are back to the same situation.
> >
> > Manually reviewing all these PRs is a huge task, so I think we should
> > at least partially automate it. I'm not sure if we should manually
> > review the oldest PRs (pre 2020). There's surely many interesting
> > things but I wonder if we should instead focus on the more recent ones
> > as they have a higher chance of 1) still making sense, 2) getting
> > updates from their authors, 3) needing less rebasing. If something has
> > been broken since 2016 but we never bothered to fix the PR it means it
> > can't be anything critical!
> >
> > Finally as Colin mentioned, it looks like a non negligible chunk of
> > stale PRs comes from committers and regular contributors. So I'd
> > suggest we each try to clean our own backlog too.
> >
> > I wonder if we also need to do something in JIRA. Querying for
> > unresolved tickets returns over 4000 items. Considering we're not
> > quite at KAFKA-15000 yet, that's a lot.
> >
> > Thanks,
> > Mickael
> >
> >
> > On Tue, Jun 6, 2023 at 11:35 AM Josep Prat 
> > wrote:
> > >
> > > Hi Devs,
> > > I would say we can split the problem in 2.
> > >
> > > Waiting for Author feedback:
> > > We could have a bot that would ping authors for the cases where we have
> > PRs
> > > that are stalled and have either:
> > > - Merge conflict
> > > - Unaddressed reviews
> > >
> > > Waiting for reviewers:
> > > For the PRs where we have no reviewers and there are no conflicts, I
> > think
> > > we would need some human interaction to determine modules (maybe this
> can
> > > be automated) and ping people who could review.
> > >
> > > What do you think?
> > >
> > > Best,
> > >
> > > On Tue, Jun 6, 2023 at 11:30 AM Josep Prat 
> wrote:
> > >
> > > > Hi Nikolay,
> > > >
> > > > With a bot it will be complicated to determine what to do when the PR
> > > > author is waiting for a reviewer. If a person goes over them, they can
> > > > check if they are waiting for reviews, tag the PR accordingly, and
> > > > maybe ping a maintainer.
> > > > If you look at my last email I described a flow (but AFAIU it will
> work
> > > > only if a human executes it) where the situation you point out would
> be
> > > > covered.
> > > >
> > > > ———
> > > > Josep Prat
> > > >
> > > > Aiven Deutschland GmbH
> > > >
> > > > Alexanderufer 3-7, 10117 Berlin
> > > >
> > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > >
> > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > >
> > > > m: +491715557497
> > > >
> > > > w: aiven.io
> > > >
> > > > e: josep.p...@aiven.io
> > > >
> > > > On Tue, Jun 6, 2023, 11:20 Николай Ижиков 
> wrote:
> > > >
> > > >> Hello.
> > > >>
> > > >> What is actions of contributor if no feedback given? [1], [2]
> > > >>
> > > >> [1] https://github.com/apache/kafka/pull/13278
> > > >> [2] https://github.com/apache/kafka/pull/13247
> > > >>
> > > >> > On Jun 2, 2023, at 23:38, David Arthur  wrote:
> > > >> >
> > > >> > I think this is a great idea. If we don’t want the auto-close
> > > >> > functionality, we can set it to -1
> > > >> >
> > > >> > I realize this isn’t a vote, but I’m +1 on this

Re: oauthbearer client -- override ssl.property (unable to do so)

2023-06-07 Thread Kirk True
Hi Neil,

> On Jun 7, 2023, at 10:35 AM, Neil Buesing  wrote:
> 
> The code "AccessTokenRetrieverFactory" uses the "jaasConfig"'s for the
> properties used for building the socket factory.
> 
> Shouldn't "jou.createSSLSockerFactor()" use the kafka configs for sasl/ssl
> overrides?
> 
> I am looking to do "oauthbearer.ssl.protocol=TLSv1.2" -- but no luck - I
> have tried many variations and in looking at the code I *think* it is
> trying to use JAAS properties not the kafka configurations for this.
> 
> I am confused by this so I don't want to open a ticket if I am
> mis-understanding how overrides should work for oauthbearer
> 

I think you’re right. I’m the guilty party here, and can’t remember why I wrote 
it without taking the readily available configuration into consideration. 
Looking at the JaasOptionsUtils.createSSLSocketFactory method, it ends up 
creating a Config object anyway, just to get access to the defaults. I’m as 
perplexed as you.

Please do file a Jira if you’d be so willing.

Thanks,
Kirk

> Thanks,
> 
> Neil
> 
> 
> 
> 
>JaasOptionsUtils jou = new JaasOptionsUtils(jaasConfig);
>String clientId = jou.validateString(CLIENT_ID_CONFIG);
>String clientSecret = jou.validateString(CLIENT_SECRET_CONFIG);
>String scope = jou.validateString(SCOPE_CONFIG, false);
> 
>SSLSocketFactory sslSocketFactory = null;
> 
>if (jou.shouldCreateSSLSocketFactory(tokenEndpointUrl))
>sslSocketFactory = jou.createSSLSocketFactory();



Re: [DISCUSS] KIP-917: Additional custom metadata for remote log segment

2023-06-07 Thread Ivan Yurchenko
Hi Satish,

Thank you for your feedback.

I've nothing against going from Map to byte[].
Serialization should not be a problem for RSM implementations: `Struct`,
`Schema` and other useful serde classes are distributed as a part of the
kafka-clients library.

It is also a good idea to add a size-limiting setting, something like
`remote.log.metadata.custom.metadata.max.size`. A sensible default may be
10 KB, which is enough to hold a struct with ten ASCII strings of almost
1,000 characters each.

If a piece of custom metadata exceeds the limit, the execution of
RLMTask.copyLogSegmentsToRemote should be interrupted with an error message.
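
As a rough sketch of that guard (the method name and exception type here are my assumptions, not from the KIP):

```
// Sketch only: enforce the proposed remote.log.metadata.custom.metadata.max.size
// limit on the bytes returned by the RSM, and fail the copy task if it is exceeded.
class CustomMetadataSizeGuardSketch {
    static void validateSize(byte[] customMetadata, int maxSizeBytes) {
        if (customMetadata != null && customMetadata.length > maxSizeBytes) {
            throw new IllegalStateException("Custom metadata size " + customMetadata.length
                    + " exceeds remote.log.metadata.custom.metadata.max.size=" + maxSizeBytes
                    + "; interrupting the segment copy");
        }
    }
}
```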

Does this sound good?
If so, I'll update the KIP accordingly. And I think it may be time for the
vote after that.

Best,
Ivan



On Sat, 3 Jun 2023 at 17:13, Satish Duggana 
wrote:

> Hi Ivan,
> Thanks for the KIP.
>
> The motivation of the KIP looks reasonable to me. It requires a way
> for RSM providers to add custom metadata with the existing
> RemoteLogSegmentMetadata. I remember we wanted to introduce a very
> similar change in the earlier proposals called
> RemoteLogMetadataContext. But we dropped that as we did not feel a
> strong need at that time and we wanted to revisit it if needed. But I
> see there is a clear need for this kind of custom metadata to keep
> with RemoteLogSegmentMetadata.
>
> It is better to introduce a new class for this custom metadata in
> RemoteLogSegmentMetadata like below for any changes in the future.
> RemoteLogSegmentMetadata will have this as an optional value.
>
> public class RemoteLogSegmentMetadata {
> ...
> public static class CustomMetadata {
>  private final byte[] value;
> ...
> }
> ...
> }
>
> This is completely opaque and it is the RSM implementation provider's
> responsibility in serializing and deserializing the bytes. We can
> introduce a property to guard the size with a configurable property
> with a default value to avoid any unwanted large size values.
>
> Thanks,
> Satish.
>
> On Tue, 30 May 2023 at 10:59, Ivan Yurchenko 
> wrote:
> >
> > Hi all,
> >
> > I want to bring this to a conclusion (positive or negative), so if there
> > are no more questions in a couple of days, I'll put the KIP to the vote.
> >
> > Best,
> > Ivan
> >
> >
> > On Fri, 5 May 2023 at 18:42, Ivan Yurchenko 
> > wrote:
> >
> > > Hi Alexandre,
> > >
> > > > combining custom
> > > > metadata with rlmMetadata increases coupling between Kafka and the
> > > > plugin.
> > >
> > > This is true. However, (if I understand your concern correctly,)
> > > rlmMetadata in the current form may be independent from RSM plugins,
> but
> > > data they point to are accessible only via the particular plugin (the
> one
> > > that wrote the data or a compatible one). It seems, this coupling
> already
> > > exists, but it is implicit. To make my point more concrete, imagine an
> S3
> > > RSM which maps RemoteLogSegmentMetadata objects to S3 object keys. This
> > > mapping logic is a part of the RSM plugin and without it the metadata
> is
> > > useless. I think it will not get worse if (to follow the example) the
> > > plugin makes the said S3 object keys explicit by adding them to the
> > > metadata. From the high level point of view, moving the custom
> metadata to
> > > a separate topic doesn't change the picture: it's still the plugin that
> > > binds the standard and custom metadata together.
> > >
> > >
> > > > For instance, the custom metadata may need to be modified
> > > > outside of Kafka, but the rlmMetadata would still be cached on
> brokers
> > > > independently of any update of custom metadata. Since both types of
> > > > metadata are authored by different systems, and are cached in
> > > > different layers, this may become a problem, or make plugin migration
> > > > more difficult. What do you think?
> > >
> > > This is indeed a problem. I think a solution to this would be to
> clearly
> > > state that metadata being modified outside of Kafka is out of scope and
> > > instruct the plugin authors that custom metadata could be provided only
> > > reactively from the copyLogSegmentData method and must remain immutable
> > > after that. Does it make sense?
> > >
> > >
> > > > Yes, you are right that the suggested alternative is to let the
> plugin
> > > > store its own metadata separately with a solution chosen by the admin
> > > > or plugin provider. For instance, it could be using a dedicated topic
> > > > if chosen to, or relying on an external key-value store.
> > >
> > > I see. Yes, this option always exists and doesn't even require a KIP.
> The
> > > biggest drawback I see is that a plugin will need to reimplement the
> > > consumer/producer + caching mechanics that will exist on the broker
> side
> > > for the standard remote metadata. I'd like to avoid this and this KIP
> is
> > > the best solution I see.
> > >
> > > Best,
> > > Ivan
> > >
> > >
> > >
> > > On Tue, 18 Apr 2023 at 13:02, Alexandre Dupriez <
> > > alexandre.dupr...@gmail.com> wrote:
> > >
> > >> Hi Ivan,

oauthbearer client -- override ssl.property (unable to do so)

2023-06-07 Thread Neil Buesing
The code "AccessTokenRetrieverFactory" uses the "jaasConfig"'s for the
properties used for building the socket factory.

Shouldn't "jou.createSSLSockerFactor()" use the kafka configs for sasl/ssl
overrides?

I am looking to do "oauthbearer.ssl.protocol=TLSv1.2" -- but no luck - I
have tried many variations and in looking at the code I *think* it is
trying to use JAAS properties not the kafka configurations for this.

I am confused by this so I don't want to open a ticket if I am
mis-understanding how overrides should work for oauthbearer

Thanks,

Neil




JaasOptionsUtils jou = new JaasOptionsUtils(jaasConfig);
String clientId = jou.validateString(CLIENT_ID_CONFIG);
String clientSecret = jou.validateString(CLIENT_SECRET_CONFIG);
String scope = jou.validateString(SCOPE_CONFIG, false);

SSLSocketFactory sslSocketFactory = null;

if (jou.shouldCreateSSLSocketFactory(tokenEndpointUrl))
sslSocketFactory = jou.createSSLSocketFactory();


Re: [DISCUSS] KIP-937 Improve Message Timestamp Validation

2023-06-07 Thread Kirk True
Hi Mehari,

Thanks for the KIP and keeping it up-to-date with the discussions here!

Question:

1. Is it possible to check for invalid timestamps in the client? Suppose we 
were to develop a means to determine with high confidence that the user had 
provided a timestamp in nanoseconds vs. milliseconds; would it be worth the trouble 
to either reject or output some debug logging? Of course, we don’t have the 
log.message.timestamp.difference.max.ms configuration on the client, and the 
client may be out of sync enough to cause false positives.

Thanks,
Kirk

> On Jun 6, 2023, at 5:42 AM, Divij Vaidya  wrote:
> 
> Hi Luke
> 
> Thank you for your participation in reviewing this KIP.
> 
> #1 Updated the KIP with correct configuration names and hyperlinks.
> 
> #2 Yes, the semantics change from a perspective that the difference is
> always in the past (or at most 1 hour into the future). Updated the
> compatibility section to represent the same. Will update again after #3 is
> resolved.
> 
> #3 I'd propose to introduce 2 new configs, one is "
> log.message.timestamp.before.max.ms", and the other one is "
> log.message.timestamp.after.max.ms".
> This is a great idea because with your proposal, 1\ we can ensure backward
> compatibility (hence safety of this breaking change) in 3.x by keeping the
> defaults of these configurations in line with what "
> log.message.timestamp.difference.max.ms" provides today 2\ and when 4.x has
> modified logic to perform segment rotation based on append time which will
> always be available, we can still re-use these configurations to reject
> messages based on validation of create timestamps. Mehari, thoughts?
> 
> --
> Divij Vaidya
> 
> 
> 
> On Tue, Jun 6, 2023 at 11:43 AM Luke Chen  wrote:
> 
>> Hi Beyene,
>> 
>> Thanks for the KIP.
>> 
>> Questions:
>> 1. We don't have "*max.message.time.difference.ms
>> *" config, I think you're referring
>> to "log.message.timestamp.difference.max.ms"?
>> 2. After this KIP, the semantics of "
>> log.message.timestamp.difference.max.ms"
>> will be changed, right?
>> We should also mention it in this KIP, maybe compatibility section?
>> And I think the description of "log.message.timestamp.difference.max.ms"
>> will also need to be updated.
>> 3. Personally I like to see the "TIME_DRIFT_TOLERANCE" to be exposed as a
>> config, since after this change, the root issue is still not completed
>> resolved.
>> After this KIP, the 1 hour later of timestamp can still be appended
>> successfully, which might still be an issue for some applications.
>> I'd propose to introduce 2 new configs, one is "
>> log.message.timestamp.before.max.ms", and the other one is "
>> log.message.timestamp.after.max.ms".
>> And then we deprecate "log.message.timestamp.difference.max.ms". WDYT?
>> 
>> Thank you.
>> Luke
>> 
>> On Tue, Jun 6, 2023 at 8:02 AM Beyene, Mehari 
>> wrote:
>> 
>>> Hey Justine and Divij,
>>> 
>>> Thank you for the recommendations.
>>> I've made the updates to the KIP and added a new section called "Future
>>> Work: Update Message Format to Include Both Client Timestamp and
>> LogAppend
>>> Timestamp."
>>> 
>>> Please take a look when you get some time and let me know if there's anything
>>> else you'd like me to address.
>>> 
>>> Thanks,
>>> Mehari
>>> 
>>> On 6/5/23, 10:16 AM, "Divij Vaidya" >> divijvaidy...@gmail.com>> wrote:
>>> 
>>> 
>>> CAUTION: This email originated from outside of the organization. Do not
>>> click links or open attachments unless you can confirm the sender and
>> know
>>> the content is safe.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Hey Justine
>>> 
>>> 
>>> Thank you for bringing this up. We had a discussion earlier in this [1]
>>> thread and concluded that bumping up the message version is a very
>>> expensive operation. Hence, we want to bundle together a bunch of
>>> impactful changes that we will perform on the message version and change
>> it
>>> in v4.0. We are currently capturing the ideas here [2]. The idea of
>> always
>>> having a log append time is captured in point 4 in the above wiki of
>> ideas.
>>> 
>>> 
>>> As you suggested, we will add a new section called "future work" and add
>>> the idea of two timestamps (& why not do it now) over there. Meanwhile,
>>> does the above explanation answer your question on why not to do it right
>>> now?
>>> 
>>> 
>>> [1] https://lists.apache.org/thread/rxnps10t4vrsor46cx6xdj6t03qqxosh <
>>> https://lists.apache.org/thread/rxnps10t4vrsor46cx6xdj6t03qqxosh>
>>> [2]
>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/ideas+for+kafka+message+format+v.3
>>> <
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/ideas+for+kafka+message+format+v.3
 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Divij Vaidya
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Jun 5, 2023 at 6:42 PM Justine Olshan >> lid>
>>> wrote:
>>> 
>>> 
 Hey Mehari,
 Thanks for adding that section. I think one other thing folks have
 considered is including two timestamps in the message format -- one for the client side timestamp and one for the server side.

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1906

2023-06-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 575654 lines...]
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testMixedPipeline() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testGetDataExistingZNode() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testGetDataExistingZNode() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testDeleteExistingZNode() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testDeleteExistingZNode() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testSessionExpiry() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testSessionExpiry() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testSetDataNonExistentZNode() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testSetDataNonExistentZNode() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testConnectionViaNettyClient() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testConnectionViaNettyClient() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testDeleteNonExistentZNode() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testDeleteNonExistentZNode() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testExistsExistingZNode() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testExistsExistingZNode() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testZooKeeperStateChangeRateMetrics() 
STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testZooKeeperStateChangeRateMetrics() 
PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testZNodeChangeHandlerForDeletion() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testZNodeChangeHandlerForDeletion() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testGetAclNonExistentZNode() STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testGetAclNonExistentZNode() PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testStateChangeHandlerForAuthFailure() 
STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > ZooKeeperClientTest > testStateChangeHandlerForAuthFailure() 
PASSED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > AllocateProducerIdsRequestTest > 
testAllocateProducersIdSentToController() > [1] Type=Raft-Isolated, 
Name=testAllocateProducersIdSentToController, MetadataVersion=3.6-IV0, 
Security=PLAINTEXT STARTED
[2023-06-07T17:09:44.913Z] 
[2023-06-07T17:09:44.913Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 178 > AllocateProducerIdsRequestTest > 
testAllocateProducersIdSentToController() > [1] Type=Raft-Isolated, 
Name=testAllocateProducersIdSentToController, MetadataVersion=3.6-IV0, 
Security=

Re: [DISCUSS] KIP-937 Improve Message Timestamp Validation

2023-06-07 Thread Kirk True
Hi Divij/all,

> On Jun 6, 2023, at 5:42 AM, Divij Vaidya  wrote:
> 
> Hi Luke
> 
> Thank you for your participation in reviewing this KIP.
> 
> #1 Updated the KIP with correct configuration names and hyperlinks.
> 
> #2 Yes, the semantics change from a perspective that the difference is
> always in the past (or at most 1 hour into the future). Updated the
> compatibility section to represent the same. Will update again after #3 is
> resolved.
> 
> #3 I'd propose to introduce 2 new configs, one is "
> log.message.timestamp.before.max.ms", and the other one is "
> log.message.timestamp.after.max.ms".
> This is a great idea because with your proposal, 1\ we can ensure backward
> compatibility (hence safety of this breaking change) in 3.x by keeping the
> defaults of these configurations in line with what "
> log.message.timestamp.difference.max.ms" provides today 2\ and when 4.x has
> modified logic to perform segment rotation based on append time which will
> always be available, we can still re-use these configurations to reject
> messages based on validation of create timestamps. Mehari, thoughts?

Although it's more verbose, splitting the configuration into explicit ‘past’ 
and ‘future’ would provide the appropriate tradeoff between constraint and 
flexibility, right?

Thanks,
Kirk
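
To illustrate the split being proposed, here is a small sketch of how broker-side CreateTime validation could look with separate past and future bounds. The config names are the ones proposed in this thread; the defaults below are placeholders, not agreed values.

// Sketch of CreateTime validation with separate past/future bounds (placeholder defaults).
public class TimestampValidationSketch {

    // log.message.timestamp.before.max.ms -- placeholder default of 24 hours
    static final long BEFORE_MAX_MS = 24L * 60 * 60 * 1000;
    // log.message.timestamp.after.max.ms -- placeholder default of 1 hour
    static final long AFTER_MAX_MS = 60L * 60 * 1000;

    static boolean isValid(long recordTimestampMs, long nowMs) {
        boolean tooFarInPast = nowMs - recordTimestampMs > BEFORE_MAX_MS;
        boolean tooFarInFuture = recordTimestampMs - nowMs > AFTER_MAX_MS;
        // A record outside either bound would be rejected, much like the current
        // log.message.timestamp.difference.max.ms check rejects large differences.
        return !tooFarInPast && !tooFarInFuture;
    }
}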

> --
> Divij Vaidya
> 
> 
> 
> On Tue, Jun 6, 2023 at 11:43 AM Luke Chen  wrote:
> 
>> Hi Beyene,
>> 
>> Thanks for the KIP.
>> 
>> Questions:
>> 1. We don't have "*max.message.time.difference.ms
>> *" config, I think you're referring
>> to "log.message.timestamp.difference.max.ms"?
>> 2. After this KIP, the semantics of "
>> log.message.timestamp.difference.max.ms"
>> will be changed, right?
>> We should also mention it in this KIP, maybe compatibility section?
>> And I think the description of "log.message.timestamp.difference.max.ms"
>> will also need to be updated.
>> 3. Personally I like to see the "TIME_DRIFT_TOLERANCE" to be exposed as a
>> config, since after this change, the root issue is still not completed
>> resolved.
>> After this KIP, the 1 hour later of timestamp can still be appended
>> successfully, which might still be an issue for some applications.
>> I'd propose to introduce 2 new configs, one is "
>> log.message.timestamp.before.max.ms", and the other one is "
>> log.message.timestamp.after.max.ms".
>> And then we deprecate "log.message.timestamp.difference.max.ms". WDYT?
>> 
>> Thank you.
>> Luke
>> 
>> On Tue, Jun 6, 2023 at 8:02 AM Beyene, Mehari 
>> wrote:
>> 
>>> Hey Justine and Divij,
>>> 
>>> Thank you for the recommendations.
>>> I've made the updates to the KIP and added a new section called "Future
>>> Work: Update Message Format to Include Both Client Timestamp and
>> LogAppend
>>> Timestamp."
>>> 
>>> Please take a look when you get some time and let me know if there's anything
>>> else you'd like me to address.
>>> 
>>> Thanks,
>>> Mehari
>>> 
>>> On 6/5/23, 10:16 AM, "Divij Vaidya" >> divijvaidy...@gmail.com>> wrote:
>>> 
>>> 
>>> CAUTION: This email originated from outside of the organization. Do not
>>> click links or open attachments unless you can confirm the sender and
>> know
>>> the content is safe.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Hey Justine
>>> 
>>> 
>>> Thank you for bringing this up. We had a discussion earlier in this [1]
>>> thread and concluded that bumping up the message version is a very
>>> expensive operation. Hence, we want to bundle together a bunch of
>>> impactful changes that we will perform on the message version and change
>> it
>>> in v4.0. We are currently capturing the ideas here [2]. The idea of
>> always
>>> having a log append time is captured in point 4 in the above wiki of
>> ideas.
>>> 
>>> 
>>> As you suggested, we will add a new section called "future work" and add
>>> the idea of two timestamps (& why not do it now) over there. Meanwhile,
>>> does the above explanation answer your question on why not to do it right
>>> now?
>>> 
>>> 
>>> [1] https://lists.apache.org/thread/rxnps10t4vrsor46cx6xdj6t03qqxosh <
>>> https://lists.apache.org/thread/rxnps10t4vrsor46cx6xdj6t03qqxosh>
>>> [2]
>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/ideas+for+kafka+message+format+v.3
>>> <
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/ideas+for+kafka+message+format+v.3
 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Divij Vaidya
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Jun 5, 2023 at 6:42 PM Justine Olshan >> lid>
>>> wrote:
>>> 
>>> 
 Hey Mehari,
 Thanks for adding that section. I think one other thing folks have
 considered is including two timestamps in the message format -- one for
>>> the
 client side timestamp and one for the server side. Of course, this
>> would
 require a bump to the message format, and that hasn't happened in a
>>> while.
 Could we include some information on this approach and why we aren't
 pursuing it? I think message format bumps

Re: [VOTE] 3.5.0 RC1

2023-06-07 Thread John Roesler
Thanks for running this release, Mickael!

I've verified:
* the signature
* that I can compile the project
* that I can run the tests. I saw one flaky test failure, but I don't think it 
should block us. Reported as 
https://issues.apache.org/jira/browse/KAFKA-13531?focusedCommentId=17730190
* the Kafka, Consumer, and Streams quickstarts with ZK and KRaft

I'm +1 (binding)

Thanks,
-John

On Wed, Jun 7, 2023, at 06:16, Josep Prat wrote:
> Hi MIckael,
>
> Apparently you did it in this PR already :) :
> https://github.com/apache/kafka/pull/13749 (this PR among other things
> removes classgraph).
>
> Without being a lawyer, I think I agree with you, as stating that we depend on
> something we don't would be less problematic than the other way around.
>
> Best,
>
> On Wed, Jun 7, 2023 at 12:14 PM Mickael Maison 
> wrote:
>
>> Hi Josep,
>>
>> Thanks for spotting this. If not already done, can you open a
>> ticket/PR to fix this on trunk? It looks like the last couple of
>> releases already had that issue. Since we're including a license for a
>> dependency we don't ship, I think we can consider this non blocking.
>> The other way around (shipping a dependency without its license) would
>> be blocking.
>>
>> Thanks,
>> Mickael
>>
>> On Tue, Jun 6, 2023 at 10:10 PM Jakub Scholz  wrote:
>> >
>> > +1 (non-binding) ... I used the staged binaries with Scala 2.13 and
>> staged
>> > artifacts to run my tests. All seems to work fine.
>> >
>> > Thanks for running the release Mickael!
>> >
>> > Jakub
>> >
>> > On Mon, Jun 5, 2023 at 3:39 PM Mickael Maison 
>> wrote:
>> >
>> > > Hello Kafka users, developers and client-developers,
>> > >
>> > > This is the second candidate for release of Apache Kafka 3.5.0. Some
>> > > of the major features include:
>> > > - KIP-710: Full support for distributed mode in dedicated MirrorMaker
>> > > 2.0 clusters
>> > > - KIP-881: Rack-aware Partition Assignment for Kafka Consumers
>> > > - KIP-887: Add ConfigProvider to make use of environment variables
>> > > - KIP-889: Versioned State Stores
>> > > - KIP-894: Use incrementalAlterConfig for syncing topic configurations
>> > > - KIP-900: KRaft kafka-storage.sh API additions to support SCRAM for
>> > > Kafka Brokers
>> > >
>> > > Release notes for the 3.5.0 release:
>> > > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/RELEASE_NOTES.html
>> > >
>> > > *** Please download, test and vote by Friday June 9, 5pm PT
>> > >
>> > > Kafka's KEYS file containing PGP keys we use to sign the release:
>> > > https://kafka.apache.org/KEYS
>> > >
>> > > * Release artifacts to be voted upon (source and binary):
>> > > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/
>> > >
>> > > * Maven artifacts to be voted upon:
>> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
>> > >
>> > > * Javadoc:
>> > > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/javadoc/
>> > >
>> > > * Tag to be voted upon (off 3.5 branch) is the 3.5.0 tag:
>> > > https://github.com/apache/kafka/releases/tag/3.5.0-rc1
>> > >
>> > > * Documentation:
>> > > https://kafka.apache.org/35/documentation.html
>> > >
>> > > * Protocol:
>> > > https://kafka.apache.org/35/protocol.html
>> > >
>> > > * Successful Jenkins builds for the 3.5 branch:
>> > > Unit/integration tests: I'm struggling to get all tests to pass in the
>> > > same build. I'll run a few more builds to ensure each test passes at
>> > > least once in the CI. All tests passed locally.
>> > > System tests: The build is still running, I'll send an update once I
>> > > have the results.
>> > >
>> > > Thanks,
>> > > Mickael
>> > >
>>
>
>
> -- 
> [image: Aiven] 
>
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B


Re: [DISCUSS] KIP-940: Broker extension point for validating record contents at produce time

2023-06-07 Thread Николай Ижиков
Hello.

As the author of one of the related KIPs, I’m +1 for this change.
A long-awaited feature.

> On 7 Jun 2023, at 19:02, Edoardo Comar  wrote:
> 
> Dear all,
> Adrian and I would like to start a discussion thread on
> 
> KIP-940: Broker extension point for validating record contents at produce time
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-940%3A+Broker+extension+point+for+validating+record+contents+at+produce+time
> 
> This KIP proposes a new broker-side extension point (a “record validation 
> policy”) that can be used to reject records published by a misconfigured 
> client.
> Though general, it is aimed at the common, best-practice use case of defining 
> Kafka message formats with schemas maintained in a schema registry.
> 
> Please post your feedback, thanks !
> 
> Edoardo & Adrian
> 
> Unless otherwise stated above:
> 
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU



[jira] [Created] (KAFKA-15069) Refactor scanning hierarchy out of DelegatingClassLoader

2023-06-07 Thread Greg Harris (Jira)
Greg Harris created KAFKA-15069:
---

 Summary: Refactor scanning hierarchy out of DelegatingClassLoader
 Key: KAFKA-15069
 URL: https://issues.apache.org/jira/browse/KAFKA-15069
 Project: Kafka
  Issue Type: Task
  Components: KafkaConnect
Reporter: Greg Harris
Assignee: Greg Harris


The DelegatingClassLoader is involved in both scanning and using the results of 
scanning to process classloading.
Instead, the scanning should take place outside of the DelegatingClassLoader, 
and results of scanning be passed back into the DelegatingClassLoader for 
classloading functionality.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] KIP-940: Broker extension point for validating record contents at produce time

2023-06-07 Thread Edoardo Comar
Dear all,
Adrian and I would like to start a discussion thread on

KIP-940: Broker extension point for validating record contents at produce time

https://cwiki.apache.org/confluence/display/KAFKA/KIP-940%3A+Broker+extension+point+for+validating+record+contents+at+produce+time

This KIP proposes a new broker-side extension point (a “record validation 
policy”) that can be used to reject records published by a misconfigured client.
Though general, it is aimed at the common, best-practice use case of defining 
Kafka message formats with schemas maintained in a schema registry.

Please post your feedback, thanks !

Edoardo & Adrian

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
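
As a purely illustrative sketch of the kind of extension point described above (not the interface defined in KIP-940, which may well differ), a broker-side record validation plugin could look roughly like the following; a schema-registry-oriented implementation might, for example, check that the record value starts with the registry's magic byte and a known schema id.

import java.util.Map;

// Purely illustrative: this is NOT the interface defined in KIP-940, just a sketch of
// the kind of broker-side extension point the proposal describes.
public interface RecordValidationPolicySketch extends AutoCloseable {

    /** Called once with the policy's configuration. */
    void configure(Map<String, ?> configs);

    /**
     * Called for each produced record before it is appended to the log.
     * Throwing rejects the record and fails the produce request for the client.
     */
    void validate(String topic, byte[] key, byte[] value) throws RecordRejectedException;

    class RecordRejectedException extends RuntimeException {
        public RecordRejectedException(String message) {
            super(message);
        }
    }
}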


Re: [DISCUSS] KIP-932: Queues for Kafka

2023-06-07 Thread Andrew Schofield
Hi Daniel,
True, I see your point. It’s analogous to a KafkaConsumer fetching
uncommitted records but not delivering them to the application.

Thanks,
Andrew

> On 7 Jun 2023, at 16:38, Dániel Urbán  wrote:
>
> Hi Andrew,
>
> I think the "pending" state could be the solution for reading beyond the
> LSO. Pending could indicate that a message is not yet available for
> consumption (so they won't be offered for consumers), but with transactions
> ending, they can become "available". With a pending state, records wouldn't
> "disappear", they would simply not show up until they become available on
> commit, or archived on abort.
>
> Regardless, I understand that this might be some extra, unwanted
> complexity, I just thought that with the message ordering guarantee gone,
> it would be a cool feature for share-groups. I've seen use-cases where the
> LSO being blocked for an extended period of time caused huge lag for
> traditional read_committed consumers, which could be completely avoided by
> share-groups.
>
> Thanks,
> Daniel
>
> Andrew Schofield  wrote on Wed, 7 Jun 2023, at 17:28:
>
>> Hi Daniel,
>> Kind of. I don’t want a transaction abort to cause disappearance of
>> records which are already in-flight. A “pending” state doesn’t seem
>> helpful for read_committed. There’s no such disappearance problem
>> for read_uncommitted.
>>
>> Thanks,
>> Andrew
>>
>>> On 7 Jun 2023, at 16:19, Dániel Urbán  wrote:
>>>
>>> Hi Andrew,
>>>
>>> I agree with having a single isolation.level for the whole group, it
>> makes
>>> sense.
>>> As for:
>>> "b) The default isolation level for a share group is read_committed, in
>>> which case
>>> the SPSO and SPEO cannot move past the LSO."
>>>
>>> With this limitation (SPEO not moving beyond LSO), are you trying to
>> avoid
>>> handling the complexity of some kind of a "pending" state for the
>>> uncommitted in-flight messages?
>>>
>>> Thanks,
>>> Daniel
>>>
>>> Andrew Schofield  wrote on Wed, 7 Jun 2023, at 16:52:
>>>
 HI Daniel,
 I’ve been thinking about this question and I think this area is a bit
 tricky.

 If there are some consumers in a share group with isolation level
 read_uncommitted
 and other consumers with read_committed, they have different
>> expectations
 with
 regards to which messages are visible when EOS comes into the picture.
 It seems to me that this is not necessarily a good thing.

 One option would be to support just read_committed in KIP-932. This
>> means
 it is unambiguous which records are in-flight, because they’re only
 committed
 ones.

 Another option would be to have the entire share group have an isolation
 level,
 which again gives an unambiguous set of in-flight records but without
>> the
 restriction of permitting just read_committed behaviour.

 So, my preference is for the following:
 a) A share group has an isolation level that applies to all consumers in
 the group.
 b) The default isolation level for a share group is read_committed, in
 which case
 the SPSO and SPEO cannot move past the LSO.
 c) For a share group with read_uncommitted isolation level, the SPSO and
 SPEO
 can move past the LSO.
 d) The kafka_configs.sh tool or the AdminClient can be used to set a
 non-default
 value for the isolation level for a share group. The value is applied
>> when
 the first
 member joins.

 Thanks,
 Andrew

> On 2 Jun 2023, at 10:02, Dániel Urbán  wrote:
>
> Hi Andrew,
> Thank you for the clarification. One follow-up to read_committed mode:
> Taking the change in message ordering guarantees into account, does
>> this
> mean that in queues, share-group consumers will be able to consume
> committed records AFTER the LSO?
> Thanks,
> Daniel
>
> Andrew Schofield  wrote on Fri, 2 Jun 2023, at 10:39:
>
>> Hi Daniel,
>> Thanks for your questions.
>>
>> 1) Yes, read_committed fetch will still be possible.
>>
>> 2) You weren’t wrong that this is a broad question :)
>>
>> Broadly speaking, I can see two ways of managing the in-flight
>> records:
>> the share-partition leader does it, or the share-group coordinator
>> does
 it.
>> I want to choose what works best, and I happen to have started with
 trying
>> the share-partition leader doing it. This is just a whiteboard
>> exercise
 at
>> the
>> moment, looking at the potential protocol flows and how well it all
 hangs
>> together. When I have something coherent and understandable and worth
>> reviewing, I’ll update the KIP with a proposal.
>>
>> I think it’s probably worth doing a similar exercise for the
>> share-group
>> coordinator way too. There are bound to be pros and cons, and I don’t
>> really
>> mind which way prevails.
>>
>> If the share-group coordinator does it, I already have experience of
>> efficient storage of in-flight record state in a way that scales and is
>> space-efficient.

Re: [DISCUSS] Adding non-committers as Github collaborators

2023-06-07 Thread John Roesler
Hello again, all,

FYI, I've just opened a request for clarification on the ability to trigger 
builds: https://issues.apache.org/jira/browse/INFRA-24673

Thanks,
-John

On Tue, Jun 6, 2023, at 19:11, Hao Li wrote:
> Thanks John for looking into this!
>
> Hao
>
> On Tue, Jun 6, 2023 at 8:32 AM John Roesler  wrote:
>
>> Hello all,
>>
>> I put in a ticket with Apache Infra to re-send these invites, and they
>> told me I should just remove the usernames in one commit and then re-add
>> them in a subsequent commit to trigger the invites again.
>>
>> I will go ahead and do this for the users who requested it on this thread
>> (Greg and Andrew, as well as for Victoria who asked me about it
>> separately). If there is anyone else who needs a re-send, please let us
>> know.
>>
>> I'm sorry for the confusion, Hao. The docs claimed we could add 20 users,
>> but when I actually checked in the file, I got an automated notification
>> that we could actually only have 10.
>>
>> As for triggering the build, I don't believe you'll be able to log in to
>> Jenkins, but you should be able to say "retest this please" in a PR comment
>> to trigger it. Apparently, that doesn't work anymore, though. I'll file an
>> Infra ticket for it.
>>
>> Thanks all,
>> John
>>
>> On Fri, Jun 2, 2023, at 18:46, Hao Li wrote:
>> > Hi Luke,
>> >
>> > Sorry for the late reply. Can you also add me to the whitelist? I believe
>> > I'm supposed to be there as well. Matthias and John can vouch for me :)
>> My
>> > github ID is "lihaosky".
>> >
>> > Thanks,
>> > Hao
>> >
>> > On Fri, Jun 2, 2023 at 4:43 PM Greg Harris > >
>> > wrote:
>> >
>> >> Luke,
>> >>
>> >> I see that the PR has been merged, but I don't believe it re-sent the
>> >> invitation.
>> >>
>> >> Thanks
>> >> Greg
>> >>
>> >>
>> >> On Wed, May 31, 2023 at 6:46 PM Luke Chen  wrote:
>> >> >
>> >> > Hi Greg and Andrew,
>> >> >
>> >> > Sorry, I don't know how to re-sent the invitation.
>> >> > It looks like it is auto sent after the .asf.yaml updated.
>> >> > Since updating collaborator list is part of release process based on
>> the
>> >> doc
>> >> > , I just created a new list
>> and
>> >> > opened a PR:
>> >> > https://github.com/apache/kafka/pull/13790
>> >> >
>> >> > Hope that after this PR merged, you'll get a new invitation.
>> >> >
>> >> > Thanks.
>> >> > Luke
>> >> >
>> >> > On Thu, Jun 1, 2023 at 5:27 AM Andrew Grant
>> > >> >
>> >> > wrote:
>> >> >
>> >> > > Hi all,
>> >> > > Like Greg I also received an invitation to collaborate but was too
>> >> slow to
>> >> > > accept the invite :( I'm also wondering if there's a way to resend
>> the
>> >> > > invite? I'm andymg3 on GitHub.
>> >> > > Thanks,
>> >> > > Andrew
>> >> > >
>> >> > > On Tue, May 30, 2023 at 12:12 PM Greg Harris
>> >> > >> > > >
>> >> > > wrote:
>> >> > >
>> >> > > > Hey all,
>> >> > > >
>> >> > > > I received an invitation to collaborate on apache/kafka, but let
>> the
>> >> > > > invitation expire after 7 days.
>> >> > > > Is there a workflow for refreshing the invite, or is an admin
>> able to
>> >> > > > manually re-invite me?
>> >> > > > I'm gharris1727 on github.
>> >> > > >
>> >> > > > Thanks!
>> >> > > > Greg
>> >> > > >
>> >> > > > On Wed, May 24, 2023 at 9:32 AM Justine Olshan
>> >> > > >  wrote:
>> >> > > > >
>> >> > > > > Hey Yash,
>> >> > > > > I'm not sure how it used to be for sure, but I do remember we
>> used
>> >> to
>> >> > > > have
>> >> > > > > a different build system. I wonder if this used to work with the
>> >> old
>> >> > > > build
>> >> > > > > system and not any more.
>> >> > > > > I'd be curious if other projects have something similar and how
>> it
>> >> > > works.
>> >> > > > >
>> >> > > > > Thanks,
>> >> > > > > Justine
>> >> > > > >
>> >> > > > > On Wed, May 24, 2023 at 9:22 AM Yash Mayya <
>> yash.ma...@gmail.com>
>> >> > > wrote:
>> >> > > > >
>> >> > > > > > Hi Justine,
>> >> > > > > >
>> >> > > > > > Thanks for the response. Non-committers don't have Apache
>> >> accounts;
>> >> > > > are you
>> >> > > > > > suggesting that there wasn't a need to sign in earlier and a
>> >> change
>> >> > > in
>> >> > > > this
>> >> > > > > > policy is restricting collaborators from triggering Jenkins
>> >> builds?
>> >> > > > > >
>> >> > > > > > Thanks,
>> >> > > > > > Yash
>> >> > > > > >
>> >> > > > > > On Wed, May 24, 2023 at 9:30 PM Justine Olshan
>> >> > > > > > 
>> >> > > > > > wrote:
>> >> > > > > >
>> >> > > > > > > Yash,
>> >> > > > > > >
>> >> > > > > > > When I rebuild, I go to the CloudBees CI page and I have to
>> >> log in
>> >> > > > with
>> >> > > > > > my
>> >> > > > > > > apache account.
>> >> > > > > > > Not sure if the change in the build system or the need to
>> sign
>> >> in
>> >> > > is
>> >> > > > part
>> >> > > > > > > of the problem.
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > > On Wed, May 24, 2023 at 4:54 AM Federico Valeri <
>> >> > > > fedeval...@gmail.com>
>> >> > > > > > > wrote:
>> >> > > > > > >
>> >> > > > > > > > +

Re: [DISCUSS] KIP-932: Queues for Kafka

2023-06-07 Thread Dániel Urbán
Hi Andrew,

I think the "pending" state could be the solution for reading beyond the
LSO. Pending could indicate that a message is not yet available for
consumption (so they won't be offered for consumers), but with transactions
ending, they can become "available". With a pending state, records wouldn't
"disappear", they would simply not show up until they become available on
commit, or archived on abort.

Regardless, I understand that this might be some extra, unwanted
complexity, I just thought that with the message ordering guarantee gone,
it would be a cool feature for share-groups. I've seen use-cases where the
LSO being blocked for an extended period of time caused huge lag for
traditional read_committed consumers, which could be completely avoided by
share-groups.

Thanks,
Daniel

Andrew Schofield  wrote on Wed, 7 Jun 2023, at 17:28:

> Hi Daniel,
> Kind of. I don’t want a transaction abort to cause disappearance of
> records which are already in-flight. A “pending” state doesn’t seem
> helpful for read_committed. There’s no such disappearance problem
> for read_uncommitted.
>
> Thanks,
> Andrew
>
> > On 7 Jun 2023, at 16:19, Dániel Urbán  wrote:
> >
> > Hi Andrew,
> >
> > I agree with having a single isolation.level for the whole group, it
> makes
> > sense.
> > As for:
> > "b) The default isolation level for a share group is read_committed, in
> > which case
> > the SPSO and SPEO cannot move past the LSO."
> >
> > With this limitation (SPEO not moving beyond LSO), are you trying to
> avoid
> > handling the complexity of some kind of a "pending" state for the
> > uncommitted in-flight messages?
> >
> > Thanks,
> > Daniel
> >
> > Andrew Schofield  wrote on Wed, 7 Jun 2023, at 16:52:
> >
> >> HI Daniel,
> >> I’ve been thinking about this question and I think this area is a bit
> >> tricky.
> >>
> >> If there are some consumers in a share group with isolation level
> >> read_uncommitted
> >> and other consumers with read_committed, they have different
> expectations
> >> with
> >> regards to which messages are visible when EOS comes into the picture.
> >> It seems to me that this is not necessarily a good thing.
> >>
> >> One option would be to support just read_committed in KIP-932. This
> means
> >> it is unambiguous which records are in-flight, because they’re only
> >> committed
> >> ones.
> >>
> >> Another option would be to have the entire share group have an isolation
> >> level,
> >> which again gives an unambiguous set of in-flight records but without
> the
> >> restriction of permitting just read_committed behaviour.
> >>
> >> So, my preference is for the following:
> >> a) A share group has an isolation level that applies to all consumers in
> >> the group.
> >> b) The default isolation level for a share group is read_committed, in
> >> which case
> >> the SPSO and SPEO cannot move past the LSO.
> >> c) For a share group with read_uncommitted isolation level, the SPSO and
> >> SPEO
> >> can move past the LSO.
> >> d) The kafka_configs.sh tool or the AdminClient can be used to set a
> >> non-default
> >> value for the isolation level for a share group. The value is applied
> when
> >> the first
> >> member joins.
> >>
> >> Thanks,
> >> Andrew
> >>
> >>> On 2 Jun 2023, at 10:02, Dániel Urbán  wrote:
> >>>
> >>> Hi Andrew,
> >>> Thank you for the clarification. One follow-up to read_committed mode:
> >>> Taking the change in message ordering guarantees into account, does
> this
> >>> mean that in queues, share-group consumers will be able to consume
> >>> committed records AFTER the LSO?
> >>> Thanks,
> >>> Daniel
> >>>
> >>> Andrew Schofield  wrote on Fri, 2 Jun 2023, at 10:39:
> >>>
>  Hi Daniel,
>  Thanks for your questions.
> 
>  1) Yes, read_committed fetch will still be possible.
> 
>  2) You weren’t wrong that this is a broad question :)
> 
>  Broadly speaking, I can see two ways of managing the in-flight
> records:
>  the share-partition leader does it, or the share-group coordinator
> does
> >> it.
>  I want to choose what works best, and I happen to have started with
> >> trying
>  the share-partition leader doing it. This is just a whiteboard
> exercise
> >> at
>  the
>  moment, looking at the potential protocol flows and how well it all
> >> hangs
>  together. When I have something coherent and understandable and worth
>  reviewing, I’ll update the KIP with a proposal.
> 
>  I think it’s probably worth doing a similar exercise for the
> share-group
>  coordinator way too. There are bound to be pros and cons, and I don’t
>  really
>  mind which way prevails.
> 
>  If the share-group coordinator does it, I already have experience of
>  efficient
>  storage of in-flight record state in a way that scales and is
>  space-efficient.
>  If the share-partition leader does it, storage of in-flight state is a
> >> bit
>  more
>  tricky.
> >

Re: [DISCUSS] KIP-932: Queues for Kafka

2023-06-07 Thread Andrew Schofield
Hi Daniel,
Kind of. I don’t want a transaction abort to cause disappearance of
records which are already in-flight. A “pending” state doesn’t seem
helpful for read_committed. There’s no such disappearance problem
for read_uncommitted.

Thanks,
Andrew

> On 7 Jun 2023, at 16:19, Dániel Urbán  wrote:
>
> Hi Andrew,
>
> I agree with having a single isolation.level for the whole group, it makes
> sense.
> As for:
> "b) The default isolation level for a share group is read_committed, in
> which case
> the SPSO and SPEO cannot move past the LSO."
>
> With this limitation (SPEO not moving beyond LSO), are you trying to avoid
> handling the complexity of some kind of a "pending" state for the
> uncommitted in-flight messages?
>
> Thanks,
> Daniel
>
> Andrew Schofield  wrote on Wed, 7 Jun 2023, at 16:52:
>
>> HI Daniel,
>> I’ve been thinking about this question and I think this area is a bit
>> tricky.
>>
>> If there are some consumers in a share group with isolation level
>> read_uncommitted
>> and other consumers with read_committed, they have different expectations
>> with
>> regards to which messages are visible when EOS comes into the picture.
>> It seems to me that this is not necessarily a good thing.
>>
>> One option would be to support just read_committed in KIP-932. This means
>> it is unambiguous which records are in-flight, because they’re only
>> committed
>> ones.
>>
>> Another option would be to have the entire share group have an isolation
>> level,
>> which again gives an unambiguous set of in-flight records but without the
>> restriction of permitting just read_committed behaviour.
>>
>> So, my preference is for the following:
>> a) A share group has an isolation level that applies to all consumers in
>> the group.
>> b) The default isolation level for a share group is read_committed, in
>> which case
>> the SPSO and SPEO cannot move past the LSO.
>> c) For a share group with read_uncommitted isolation level, the SPSO and
>> SPEO
>> can move past the LSO.
>> d) The kafka_configs.sh tool or the AdminClient can be used to set a
>> non-default
>> value for the isolation level for a share group. The value is applied when
>> the first
>> member joins.
>>
>> Thanks,
>> Andrew
>>
>>> On 2 Jun 2023, at 10:02, Dániel Urbán  wrote:
>>>
>>> Hi Andrew,
>>> Thank you for the clarification. One follow-up to read_committed mode:
>>> Taking the change in message ordering guarantees into account, does this
>>> mean that in queues, share-group consumers will be able to consume
>>> committed records AFTER the LSO?
>>> Thanks,
>>> Daniel
>>>
>>> Andrew Schofield  wrote on Fri, 2 Jun 2023, at 10:39:
>>>
 Hi Daniel,
 Thanks for your questions.

 1) Yes, read_committed fetch will still be possible.

 2) You weren’t wrong that this is a broad question :)

 Broadly speaking, I can see two ways of managing the in-flight records:
 the share-partition leader does it, or the share-group coordinator does
>> it.
 I want to choose what works best, and I happen to have started with
>> trying
 the share-partition leader doing it. This is just a whiteboard exercise
>> at
 the
 moment, looking at the potential protocol flows and how well it all
>> hangs
 together. When I have something coherent and understandable and worth
 reviewing, I’ll update the KIP with a proposal.

 I think it’s probably worth doing a similar exercise for the share-group
 coordinator way too. There are bound to be pros and cons, and I don’t
 really
 mind which way prevails.

 If the share-group coordinator does it, I already have experience of
 efficient
 storage of in-flight record state in a way that scales and is
 space-efficient.
 If the share-partition leader does it, storage of in-flight state is a
>> bit
 more
 tricky.

 I think it’s worth thinking ahead to how EOS will work and also another
 couple of enhancements (key-based ordering and acquisition lock
 extension) so it’s somewhat future-proof.

 Thanks,
 Andrew

> On 1 Jun 2023, at 11:51, Dániel Urbán  wrote:
>
> Hi Andrew,
>
> Thank you for the KIP, exciting work you are doing :)
> I have 2 questions:
> 1. I understand that EOS won't be supported for share-groups (yet), but
> read_committed fetch will still be possible, correct?
>
> 2. I have a very broad question about the proposed solution: why not
>> let
> the share-group coordinator manage the states of the in-flight records?
> I'm asking this because it seems to me that using the same pattern as
>> the
> existing group coordinator would
> a, solve the durability of the message state storage (same method as
>> the
> one used by the current group coordinator)
>
> b, pave the way for EOS with share-groups (same method as the one used
>> by
> the current group coordinator)
>
> c, allow follower-fetching

Re: [DISCUSS] KIP-932: Queues for Kafka

2023-06-07 Thread Dániel Urbán
Hi Andrew,

I agree with having a single isolation.level for the whole group, it makes
sense.
As for:
"b) The default isolation level for a share group is read_committed, in
which case
the SPSO and SPEO cannot move past the LSO."

With this limitation (SPEO not moving beyond LSO), are you trying to avoid
handling the complexity of some kind of a "pending" state for the
uncommitted in-flight messages?

Thanks,
Daniel

Andrew Schofield  wrote on Wed, 7 Jun 2023, at 16:52:

> HI Daniel,
> I’ve been thinking about this question and I think this area is a bit
> tricky.
>
> If there are some consumers in a share group with isolation level
> read_uncommitted
> and other consumers with read_committed, they have different expectations
> with
> regards to which messages are visible when EOS comes into the picture.
> It seems to me that this is not necessarily a good thing.
>
> One option would be to support just read_committed in KIP-932. This means
> it is unambiguous which records are in-flight, because they’re only
> committed
> ones.
>
> Another option would be to have the entire share group have an isolation
> level,
> which again gives an unambiguous set of in-flight records but without the
> restriction of permitting just read_committed behaviour.
>
> So, my preference is for the following:
> a) A share group has an isolation level that applies to all consumers in
> the group.
> b) The default isolation level for a share group is read_committed, in
> which case
> the SPSO and SPEO cannot move past the LSO.
> c) For a share group with read_uncommitted isolation level, the SPSO and
> SPEO
> can move past the LSO.
> d) The kafka_configs.sh tool or the AdminClient can be used to set a
> non-default
> value for the isolation level for a share group. The value is applied when
> the first
> member joins.
>
> Thanks,
> Andrew
>
> > On 2 Jun 2023, at 10:02, Dániel Urbán  wrote:
> >
> > Hi Andrew,
> > Thank you for the clarification. One follow-up to read_committed mode:
> > Taking the change in message ordering guarantees into account, does this
> > mean that in queues, share-group consumers will be able to consume
> > committed records AFTER the LSO?
> > Thanks,
> > Daniel
> >
> > Andrew Schofield  wrote on Fri, 2 Jun 2023, at 10:39:
> >
> >> Hi Daniel,
> >> Thanks for your questions.
> >>
> >> 1) Yes, read_committed fetch will still be possible.
> >>
> >> 2) You weren’t wrong that this is a broad question :)
> >>
> >> Broadly speaking, I can see two ways of managing the in-flight records:
> >> the share-partition leader does it, or the share-group coordinator does
> it.
> >> I want to choose what works best, and I happen to have started with
> trying
> >> the share-partition leader doing it. This is just a whiteboard exercise
> at
> >> the
> >> moment, looking at the potential protocol flows and how well it all
> hangs
> >> together. When I have something coherent and understandable and worth
> >> reviewing, I’ll update the KIP with a proposal.
> >>
> >> I think it’s probably worth doing a similar exercise for the share-group
> >> coordinator way too. There are bound to be pros and cons, and I don’t
> >> really
> >> mind which way prevails.
> >>
> >> If the share-group coordinator does it, I already have experience of
> >> efficient
> >> storage of in-flight record state in a way that scales and is
> >> space-efficient.
> >> If the share-partition leader does it, storage of in-flight state is a
> bit
> >> more
> >> tricky.
> >>
> >> I think it’s worth thinking ahead to how EOS will work and also another
> >> couple of enhancements (key-based ordering and acquisition lock
> >> extension) so it’s somewhat future-proof.
> >>
> >> Thanks,
> >> Andrew
> >>
> >>> On 1 Jun 2023, at 11:51, Dániel Urbán  wrote:
> >>>
> >>> Hi Andrew,
> >>>
> >>> Thank you for the KIP, exciting work you are doing :)
> >>> I have 2 questions:
> >>> 1. I understand that EOS won't be supported for share-groups (yet), but
> >>> read_committed fetch will still be possible, correct?
> >>>
> >>> 2. I have a very broad question about the proposed solution: why not
> let
> >>> the share-group coordinator manage the states of the in-flight records?
> >>> I'm asking this because it seems to me that using the same pattern as
> the
> >>> existing group coordinator would
> >>> a, solve the durability of the message state storage (same method as
> the
> >>> one used by the current group coordinator)
> >>>
> >>> b, pave the way for EOS with share-groups (same method as the one used
> by
> >>> the current group coordinator)
> >>>
> >>> c, allow follower-fetching
> >>> I saw your point about this: "FFF gives freedom to fetch records from a
> >>> nearby broker, but it does not also give the ability to commit offsets
> >> to a
> >>> nearby broker"
> >>> But does it matter if message acknowledgement is not "local"?
> Supposedly,
> >>> fetching is the actual hard work which benefits from follower fetching,
> >> not
> >>> the group related requests.

[GitHub] [kafka-site] C0urante opened a new pull request, #520: KAFKA-15051: add missing GET plugin/config endpoint

2023-06-07 Thread via GitHub


C0urante opened a new pull request, #520:
URL: https://github.com/apache/kafka-site/pull/520

   Ports the changes from https://github.com/apache/kafka/pull/13803 back 
through 3.2, the version that originally added this endpoint.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (KAFKA-10337) Wait for pending async commits in commitSync() even if no offsets are specified

2023-06-07 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-10337.
-
Fix Version/s: 3.6.0
   Resolution: Fixed

> Wait for pending async commits in commitSync() even if no offsets are 
> specified
> ---
>
> Key: KAFKA-10337
> URL: https://issues.apache.org/jira/browse/KAFKA-10337
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Tom Lee
>Priority: Major
> Fix For: 3.6.0
>
>
> The JavaDoc for commitSync() states the following:
> {quote}Note that asynchronous offset commits sent previously with the
> {@link #commitAsync(OffsetCommitCallback)}
>  (or similar) are guaranteed to have their callbacks invoked prior to 
> completion of this method.
> {quote}
> But should we happen to call the method with an empty offset map
> (i.e. commitSync(Collections.emptyMap())) the callbacks for any incomplete 
> async commits will not be invoked because of an early return in 
> ConsumerCoordinator.commitOffsetsSync() when the input map is empty.
> If users are doing manual offset commits and relying on commitSync as a 
> barrier for in-flight async commits prior to a rebalance, this could be an 
> important (though somewhat implementation-dependent) detail.
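
For illustration, a minimal sketch of the usage pattern described above (illustrative only, not code from the Kafka code base), where commitAsync() is used on the normal path and commitSync() with an empty map is relied on purely as a barrier:

import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Sketch of the pattern described in this ticket (illustrative, not from the Kafka code base).
public class CommitBarrierSketch {

    static void pollAndCommit(KafkaConsumer<String, String> consumer) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        records.forEach(rec -> {
            // application-level processing of each record would happen here
        });
        consumer.commitAsync((offsets, exception) -> {
            if (exception != null) {
                // a failed async commit would be logged/handled here
            }
        });
        // Intended as a barrier: per the JavaDoc, this should not return before the
        // commitAsync callback above has been invoked. With the early return in
        // ConsumerCoordinator.commitOffsetsSync() for an empty map, it did not wait.
        consumer.commitSync(Collections.emptyMap());
    }
}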



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Partial CI builds - Reducing flakiness with fewer tests

2023-06-07 Thread Chris Egerton
Hi Greg,

I can see the point about enabling partial runs as a temporary measure to
fight flakiness, and it does carry some merit. In that case, though, we
should have an idea of what the desired end state is once we've stopped
relying on any temporary measures. Do you think we should aim to disable
merges without a full suite of passing CI runs (allowing for administrative
override in an emergency)? If so, what would the path be from our current
state to there? What can we do to ensure that we don't get stuck relying on
a once-temporary aid that becomes effectively permanent?

With partial builds, we also need to be careful to make sure to correctly
handle cross-module dependencies. A tweak to broker or client logic may
only affect files in one module and pass all tests for that module, but
have far-reaching consequences for Streams, Connect, and MM2. We probably
want to build awareness of this dependency graph into any partial CI logic
we add, but if we do opt for that, then this change would
disproportionately benefit downstream modules (Streams, Connect, MM2), and
have little to no benefit for upstream ones (clients and at least some core
modules).

With regards to faster iteration times--I agree that it would be nice if
our CI builds didn't take 2-3 hours, but people should already be making
sure that tests are running locally before they push changes (or, if they
really want, they can run tests locally after pushing changes). And if
rapid iteration is necessary, it's always (or at least for the foreseeable
future) going to be faster to run whatever specific tests or build tasks
you need to run locally, instead of pushing to GitHub and waiting for
Jenkins to check for you.

Finally, since there are a number of existing flaky tests on trunk, what
would the strategy be for handling those? Do we try to get to a green state
on a per-module basis (possibly with awareness of downstream modules) as
quickly as possible, and then selectively enable partial builds once we
feel confident that flakiness has been addressed?

Cheers,

Chris

On Wed, Jun 7, 2023 at 5:09 AM Gaurav Narula  wrote:

> Hey Greg,
>
> Thanks for sharing this idea!
>
> The idea of building and testing a relevant subset of code certainly seems
> interesting.
>
> Perhaps this is a good fit for Bazel [1] where
> target-determinator [2] can be used to to find a subset of targets that
> have changed between two commits.
>
> Even without [2], Bazel builds can benefit immensely from distributing
> builds
> to a set of remote nodes [3] with support for caching previously built
> targets [4].
>
> We've seen a few other ASF projects adopt Bazel as well:
>
> * https://github.com/apache/rocketmq
> * https://github.com/apache/brpc
> * https://github.com/apache/trafficserver
> * https://github.com/apache/ws-axiom
>
> I wonder how the Kafka community feels about experimenting with Bazel and
> exploring if it helps us offer faster build times without compromising on
> the
> correctness of the targets that need to be built and tested?
>
> Thanks,
> Gaurav
>
> [1]: https://bazel.build
> [2]: https://github.com/bazel-contrib/target-determinator
> [3]: https://bazel.build/remote/rbe
> [4]: https://bazel.build/remote/caching
>
> On 2023/06/05 17:47:07 Greg Harris wrote:
> > Hey all,
> >
> > I've been working on test flakiness recently, and I've been trying to
> > come up with ways to tackle the issue top-down as well as bottom-up,
> > and I'm interested to hear your thoughts on an idea.
> >
> > In addition to the current full-suite runs, can we in parallel trigger
> > a smaller test run which has only a relevant subset of tests? For
> > example, if someone is working on one sub-module, the CI would only
> > run tests in that module.
> >
> > I think this would be more likely to pass than the full suite due to
> > the fewer tests failing probabilistically, and would improve the
> > signal-to-noise ratio of the summary pass/fail marker on GitHub. This
> > should also be shorter to execute than the full suite, allowing for
> > faster cycle-time than the current full suite encourages.
> >
> > This would also strengthen the incentive for contributors specializing
> > in a module to de-flake tests, as they are rewarded with a tangible
> > improvement within their area of the project. Currently, even the
> > modules with the most reliable tests receive consistent CI failures
> > from other less reliable modules.
> >
> > I believe this is possible, even if there isn't an off-the-shelf
> > solution for it. We can learn of the changed files via a git diff, map
> > that to modules containing those files, and then execute the tests
> > just for those modules with gradle. GitHub also permits showing
> > multiple "checks" so that we can emit both the full-suite and partial
> > test results.
> >
> > Thanks,
> > Greg
> >


Re: [DISCUSS] KIP-932: Queues for Kafka

2023-06-07 Thread Andrew Schofield
Hi Daniel,
I’ve been thinking about this question and I think this area is a bit tricky.

If there are some consumers in a share group with isolation level 
read_uncommitted
and other consumers with read_committed, they have different expectations with
regards to which messages are visible when EOS comes into the picture.
It seems to me that this is not necessarily a good thing.

One option would be to support just read_committed in KIP-932. This means
it is unambiguous which records are in-flight, because they’re only committed
ones.

Another option would be to have the entire share group have an isolation level,
which again gives an unambiguous set of in-flight records but without the
restriction of permitting just read_committed behaviour.

So, my preference is for the following:
a) A share group has an isolation level that applies to all consumers in the 
group.
b) The default isolation level for a share group is read_committed, in which 
case
the SPSO and SPEO cannot move past the LSO.
c) For a share group with read_uncommitted isolation level, the SPSO and SPEO
can move past the LSO.
d) The kafka_configs.sh tool or the AdminClient can be used to set a non-default
value for the isolation level for a share group. The value is applied when the 
first
member joins.

Thanks,
Andrew

> On 2 Jun 2023, at 10:02, Dániel Urbán  wrote:
>
> Hi Andrew,
> Thank you for the clarification. One follow-up to read_committed mode:
> Taking the change in message ordering guarantees into account, does this
> mean that in queues, share-group consumers will be able to consume
> committed records AFTER the LSO?
> Thanks,
> Daniel
>
> Andrew Schofield  wrote on Fri, 2 Jun 2023, at 10:39:
>
>> Hi Daniel,
>> Thanks for your questions.
>>
>> 1) Yes, read_committed fetch will still be possible.
>>
>> 2) You weren’t wrong that this is a broad question :)
>>
>> Broadly speaking, I can see two ways of managing the in-flight records:
>> the share-partition leader does it, or the share-group coordinator does it.
>> I want to choose what works best, and I happen to have started with trying
>> the share-partition leader doing it. This is just a whiteboard exercise at
>> the
>> moment, looking at the potential protocol flows and how well it all hangs
>> together. When I have something coherent and understandable and worth
>> reviewing, I’ll update the KIP with a proposal.
>>
>> I think it’s probably worth doing a similar exercise for the share-group
>> coordinator way too. There are bound to be pros and cons, and I don’t
>> really
>> mind which way prevails.
>>
>> If the share-group coordinator does it, I already have experience of
>> efficient
>> storage of in-flight record state in a way that scales and is
>> space-efficient.
>> If the share-partition leader does it, storage of in-flight state is a bit
>> more
>> tricky.
>>
>> I think it’s worth thinking ahead to how EOS will work and also another
>> couple of enhancements (key-based ordering and acquisition lock
>> extension) so it’s somewhat future-proof.
>>
>> Thanks,
>> Andrew
>>
>>> On 1 Jun 2023, at 11:51, Dániel Urbán  wrote:
>>>
>>> Hi Andrew,
>>>
>>> Thank you for the KIP, exciting work you are doing :)
>>> I have 2 questions:
>>> 1. I understand that EOS won't be supported for share-groups (yet), but
>>> read_committed fetch will still be possible, correct?
>>>
>>> 2. I have a very broad question about the proposed solution: why not let
>>> the share-group coordinator manage the states of the in-flight records?
>>> I'm asking this because it seems to me that using the same pattern as the
>>> existing group coordinator would
>>> a, solve the durability of the message state storage (same method as the
>>> one used by the current group coordinator)
>>>
>>> b, pave the way for EOS with share-groups (same method as the one used by
>>> the current group coordinator)
>>>
>>> c, allow follower-fetching
>>> I saw your point about this: "FFF gives freedom to fetch records from a
>>> nearby broker, but it does not also give the ability to commit offsets
>> to a
>>> nearby broker"
>>> But does it matter if message acknowledgement is not "local"? Supposedly,
>>> fetching is the actual hard work which benefits from follower fetching,
>> not
>>> the group related requests.
>>>
>>> The only problem I see with the share-group coordinator managing the
>>> in-flight message state is that the coordinator is not aware of the exact
>>> available offsets of a partition, nor how the messages are batched. For
>>> this problem, maybe the share group coordinator could use some form of
>>> "logical" addresses, such as "the next 2 batches after offset X", or
>> "after
>>> offset X, skip 2 batches, fetch next 2". Acknowledgements always contain
>>> the exact offset, but for the "unknown" sections of a partition, these
>>> logical addresses would be used. The coordinator could keep track of
>>> message states with a mix of offsets and these batch based addresses. The
>>> partition leader could su

Re: [ANNOUNCE] Apache Kafka 3.4.1

2023-06-07 Thread Bill Bejeck
Thanks for running the release Luke!

On Wed, Jun 7, 2023 at 4:29 AM Tom Bentley  wrote:

> Thanks Luke!
>
> On Wed, 7 Jun 2023 at 09:11, Mickael Maison 
> wrote:
>
> > Thanks for running the release!
> >
> > On Wed, Jun 7, 2023 at 9:11 AM Bruno Cadonna  wrote:
> > >
> > > Thanks Luke!
> > >
> > > On 07.06.23 07:55, Federico Valeri wrote:
> > > > Thanks Luke!
> > > >
> > > > On Wed, Jun 7, 2023 at 5:56 AM Kamal Chandraprakash
> > > >  wrote:
> > > >>
> > > >> Thanks Luke for running this release!
> > > >>
> > > >> On Wed, Jun 7, 2023 at 8:08 AM Chia-Ping Tsai 
> > wrote:
> > > >>
> > > >>> Thank Luke for this hard work!!!
> > > >>>
> > >  On 2023-06-07 at 10:35 AM, Chris Egerton  wrote:
> > > 
> > >  Thanks for running this release, Luke!
> > > 
> > >  On Tue, Jun 6, 2023, 22:31 Luke Chen  wrote:
> > > 
> > > > The Apache Kafka community is pleased to announce the release for
> > > > Apache Kafka 3.4.1.
> > > >
> > > > This is a bug fix release and it includes fixes and improvements
> > from
> > > > 58 JIRAs, including a few critical bugs:
> > > > - core
> > > > KAFKA-14644 Process should stop after failure in raft IO thread
> > > > KAFKA-14946 KRaft controller node shutting down while renouncing
> > > >>> leadership
> > > > KAFKA-14887 ZK session timeout can cause broker to shutdown
> > > > - client
> > > > KAFKA-14639 Kafka CooperativeStickyAssignor revokes/assigns
> > partition
> > > > in one rebalance cycle
> > > > - connect
> > > > KAFKA-12558 MM2 may not sync partition offsets correctly
> > > > KAFKA-14666 MM2 should translate consumer group offsets behind
> > > >>> replication
> > > > flow
> > > > - stream
> > > > KAFKA-14172 bug: State stores lose state when tasks are
> reassigned
> > under
> > > > EOS
> > > >
> > > > All of the changes in this release can be found in the release
> > notes:
> > > >
> > > > https://www.apache.org/dist/kafka/3.4.1/RELEASE_NOTES.html
> > > >
> > > > You can download the source and binary release (Scala 2.12 and
> > Scala
> > > >>> 2.13)
> > > > from:
> > > >
> > > > https://kafka.apache.org/downloads#3.4.1
> > > >
> > > >
> > > >
> > > >>>
> >
> ---
> > > >
> > > > Apache Kafka is a distributed streaming platform with four core
> > APIs:
> > > >
> > > > ** The Producer API allows an application to publish a stream
> > records
> > > > to one or more Kafka topics.
> > > >
> > > > ** The Consumer API allows an application to subscribe to one or
> > more
> > > > topics and process the stream of records produced to them.
> > > >
> > > > ** The Streams API allows an application to act as a stream
> > processor,
> > > > consuming an input stream from one or more topics and producing
> an
> > > > output stream to one or more output topics, effectively
> > transforming
> > > > the input streams to output streams.
> > > >
> > > > ** The Connector API allows building and running reusable
> > producers or
> > > > consumers that connect Kafka topics to existing applications or
> > data
> > > > systems. For example, a connector to a relational database might
> > > > capture every change to a table.
> > > >
> > > >
> > > > With these APIs, Kafka can be used for two broad classes of
> > application:
> > > >
> > > > ** Building real-time streaming data pipelines that reliably get
> > data
> > > > between systems or applications.
> > > >
> > > > ** Building real-time streaming applications that transform or
> > react
> > > > to the streams of data.
> > > >
> > > > Apache Kafka is in use at large and small companies worldwide,
> > > > including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> > > > Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> > > > Zalando, among others.
> > > >
> > > > A big thank you for the following 32 contributors to this
> release!
> > > >
> > > > atu-sharm, Chia-Ping Tsai, Chris Egerton, Colin Patrick McCabe,
> > > > csolidum, David Arthur, David Jacot, Divij Vaidya, egyedt,
> > > > emilnkrastev, Eric Haag, Greg Harris, Guozhang Wang, Hector
> > Geraldino,
> > > > hudeqi, Jason Gustafson, Jeff Kim, Jorge Esteban Quilcate Otoya,
> > José
> > > > Armando García Sancio, Lucia Cerchie, Luke Chen, Manikumar Reddy,
> > > > Matthias J. Sax, Mickael Maison, Philip Nee, Purshotam Chauhan,
> > Rajini
> > > > Sivaram, Ron Dagostino, Terry, Victoria Xia, Viktor Somogyi-Vass,
> > Yash
> > > > Mayya
> > > >
> > > > We welcome your help and feedback. For more information on how to
> > > > report problems, and to get involved, visit the project website
> at
> > > > https://kafka.apache.org/
> > > >
> > > >
> > > > Thank you!
>

Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-06-07 Thread Christo Lolov
Hey Colin,

I tried the following setup:

* Create 3 EC2 machines.
* EC2 machine named A acts as a KRaft Controller.
* EC2 machine named B acts as a KRaft Broker. (The only configurations
different to the default values: log.retention.ms=3,
log.segment.bytes=1048576, log.retention.check.interval.ms=3,
leader.imbalance.check.interval.seconds=30)
* EC2 machine named C acts as a Producer.
* I attached 1 GB EBS volume to the EC2 machine B (Broker) and configured
the log.dirs to point to it.
* I filled 995 MB of that EBS volume using fallocate.
* I created a topic with 6 partitions and a replication factor of 1.
* From the Producer machine I used `~/kafka/bin/kafka-producer-perf-test.sh
--producer.config ~/kafka/config/client.properties --topic batman
--record-size 524288 --throughput 5 --num-records 150`. The disk on EC2
machine B filled up and the broker shut down. I stopped the producer.
* I stopped the controller on EC2 machine A. I started the controller to
both be a controller and a broker (I need this because I cannot communicate
directly with a controller -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-919%3A+Allow+AdminClient+to+Talk+Directly+with+the+KRaft+Controller+Quorum
).
* I deleted the topic to which I had been writing by using kafka-topics.sh .
* I started the broker on EC2 machine B and it failed due to no space left
on disk during its recovery process. The topic was not deleted from the
disk.
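
For reference, a minimal sketch of the disk-fill and topic-creation steps above
(the mount path and broker address are placeholders, not the exact values I used):

  # Fill most of the 1 GB EBS volume backing log.dirs (path is an assumption)
  fallocate -l 995M /mnt/kafka-logs/filler
  # Create the test topic with 6 partitions and replication factor 1
  ~/kafka/bin/kafka-topics.sh --bootstrap-server <broker-b>:9092 --create \
    --topic batman --partitions 6 --replication-factor 1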

As such, I am not convinced that KRaft addresses the problem of deleting
topics on startup if there is no space left on the disk - is there
something wrong with my setup, or something you disagree with? I think this will
continue to be the case even when JBOD + KRaft is implemented.

Let me know your thoughts!

Best,
Christo

On Mon, 5 Jun 2023 at 11:03, Christo Lolov  wrote:

> Hey Colin,
>
> Thanks for the review!
>
> I am also skeptical that much space can be reclaimed via compaction as
> detailed in the limitations section of the KIP.
>
> In my head there are two ways to get out of the saturated state -
> configure more aggressive retention and delete topics. I wasn't aware that
> KRaft deletes topics marked for deletion on startup if the disks occupied
> by those partitions are full - I will check it out, thank you for the
> information! On the retention side, I believe there is still a benefit in
> keeping the broker up and responsive - in my experience, people first try
> to reduce the data they have and only when that also does not work they are
> okay with sacrificing all of the data.
>
> Let me know your thoughts!
>
> Best,
> Christo
>
> On Fri, 2 Jun 2023 at 20:09, Colin McCabe  wrote:
>
>> Hi Christo,
>>
>> We're not adding new stuff to ZK at this point (it's deprecated), so it
>> would be good to drop that from the design.
>>
>> With regard to the "saturated" state: I'm skeptical that compaction could
>> really move the needle much in terms of freeing up space -- in most
>> workloads I've seen, it wouldn't. Compaction also requires free space to
>> function as well.
>>
>> So the main benefit of the "saturated" state seems to be enabling deletion
>> on full disks. But KRaft mode already has most of that benefit. Full disks
>> (or, indeed, downed brokers) don't block deletion on KRaft. If you delete a
>> topic and then bounce the broker that had the disk full, it will delete the
>> topic directory on startup as part of its snapshot load process.
>>
>> So I'm not sure if we really need this. Maybe we should re-evaluate once
>> we have JBOD + KRaft.
>>
>> best,
>> Colin
>>
>>
>> On Mon, May 22, 2023, at 02:23, Christo Lolov wrote:
>> > Hello all!
>> >
>> > I would like to start a discussion on KIP-928: Making Kafka resilient to
>> > log directories becoming full which can be found at
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-928%3A+Making+Kafka+resilient+to+log+directories+becoming+full
>> > .
>> >
>> > In summary, I frequently run into problems where Kafka becomes
>> unresponsive
>> > when the disks backing its log directories become full. Such
>> > unresponsiveness generally requires intervention outside of Kafka. I
>> have
>> > found it to be significantly nicer of an experience when Kafka maintains
>> > control plane operations and allows you to free up space.
>> >
>> > I am interested in your thoughts and any suggestions for improving the
>> > proposal!
>> >
>> > Best,
>> > Christo
>>
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1905

2023-06-07 Thread Apache Jenkins Server
See 




[DISCUSS] KIP-842: Add richer group offset reset mechanisms

2023-06-07 Thread hudeqi
Is anyone still paying attention to this KIP? :)
bump this thread.

Best,
hudeqi


> -Original Message-
> From: hudeqi <16120...@bjtu.edu.cn>
> Sent: 2023-03-26 17:42:31 (Sunday)
> To: dev@kafka.apache.org
> Cc: 
> Subject: Re: Re: Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms
> 


[jira] [Created] (KAFKA-15068) Incorrect replication Latency may be calculated when the timestamp of the record is of type CREATE_TIME

2023-06-07 Thread hudeqi (Jira)
hudeqi created KAFKA-15068:
--

 Summary: Incorrect replication Latency may be calculated when the 
timestamp of the record is of type CREATE_TIME
 Key: KAFKA-15068
 URL: https://issues.apache.org/jira/browse/KAFKA-15068
 Project: Kafka
  Issue Type: Improvement
  Components: mirrormaker
Reporter: hudeqi
Assignee: hudeqi






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-916: MM2 distributed mode flow log context

2023-06-07 Thread Dániel Urbán
Hi Chris,

Thank you for your comments! I updated the KIP. I still need to add the
example before/after log lines, will do that soon, but I addressed all the
other points.
1. Added more details on thread renaming under Public Interfaces, removed
pseudo code.
2. Removed the stale header - originally, client.id related changes were
part of the KIP, and I failed to remove all the leftovers of that version.
3. Threads listed under Public Interfaces with current/proposed names.
4. Added a comment in the connect-log4j.properties, similar to the one
introduced in KIP-449. We don't have a dedicated MM2 log4j config, not sure
if we should introduce it here.
5. Clarified testing section - I think thread names should not be tested
(they never were), but testing will focus on the new MDC context value.
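
For illustration, the connect-log4j.properties comment (point 4 above) could look
roughly like the sketch below. The `flow` MDC key name is only an assumption for
this example; `connector.context` is the existing key introduced by KIP-449:

  # Uncomment to prefix each log line with the MM2 replication flow taken from the MDC
  # (the %X{flow} conversion reads the value that MM2 puts into the MDC):
  #log4j.appender.stdout.layout.ConversionPattern=[%d] %p %X{flow}%X{connector.context}%m (%c:%L)%n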

Thanks,
Daniel

Chris Egerton  ezt írta (időpont: 2023. jún. 5.,
H, 16:46):

> Hi Daniel,
>
> Thanks for the updates! A few more thoughts:
>
> 1. The "Proposed Changes" section seems a bit like a pseudocode
> implementation of the KIP. We don't really need this level of detail; most
> of the time, we're just looking for implementation details that don't
> directly affect the user-facing changes proposed in the "Public Interfaces"
> section but are worth mentioning because they (1) demonstrate how the
> user-facing changes are made possible, (2) indirectly affect user-facing
> behavior, or (3) go into more detail (like providing examples) about the
> user-facing behavior. For this KIP, I think examples of user-facing
> behavior (like before/after of thread names and log messages) and possibly
> an official description of the scope of the changes (which threads are
> going to be renamed and/or include the new MDC key, and which aren't?) are
> all that we'd really need in this section; everything else is fairly clear
> IMO. FWIW, the reason we want to discourage going into too much detail with
> KIPs is that it can quickly devolve into code review over mailing list,
> which can hold KIPs up for longer than necessary when the core design
> changes they contain are already basically accepted by everyone.
>
> 2. The "MM2 distributed mode client.id and log change" header seems like
> it
> may no longer be accurate; the contents do not mention any changes to
> client IDs. I might be missing something though; please correct me if I am.
>
> 3. Can you provide some before/after examples of what thread names and log
> messages will look like? I'm wondering about the thread that the
> distributed herder runs on, threads for connectors and tasks, and threads
> for polling internal topics (which we currently manage with the
> KafkaBasedLog class). It's fine if some of these are unchanged, I just want
> to better understand the scope of the proposed changes and get an idea of
> how they may appear to users.
>
> 4. There's no mention of changes to the default Log4j config files that we
> ship. Is this intentional? I feel like we need some way for users to easily
> discover this feature; if we're not going to touch our default Log4j config
> files, is there another way that we can expect users to find out about the
> new MDC key?
>
> 5. RE the "Test Plan" section: can you provide a little more detail of
> which cases we'll be covering with the proposed unit tests? Will we be
> verifying that the MDC context is set in various places? If so, which
> places? And the same with thread names? (There doesn't have to be a ton of
> detail, but a little more than "unit tests" would be nice 😄)
>
> Cheers,
>
> Chris
>
> On Mon, Jun 5, 2023 at 9:45 AM Dániel Urbán  wrote:
>
> > I updated the KIP accordingly. Tried to come up with more generic terms
> in
> > the Connect code instead of referring to "flow" anywhere.
> > Daniel
> >
> > Dániel Urbán  ezt írta (időpont: 2023. jún. 5.,
> H,
> > 14:49):
> >
> > > Hi Chris,
> > >
> > > Thank you for your comments!
> > >
> > > I agree that the toString based logging is not ideal, and I believe all
> > > occurrences are within a proper logging context, so they can be
> ignored.
> > > If thread names can be changed unconditionally, I agree, using a new
> MDC
> > > key is the ideal solution.
> > >
> > > Will update the KIP accordingly.
> > >
> > > Thanks,
> > > Daniel
> > >
> > > Chris Egerton  ezt írta (időpont: 2023. jún.
> > 2.,
> > > P, 22:23):
> > >
> > >> Hi Daniel,
> > >>
> > >> Are there any cases of Object::toString being used by internal classes
> > for
> > >> logging context that can't be covered by MDC contexts? For example,
> > >> anything logged by WorkerSinkTask (or any WorkerTask subclass) already
> > has
> > >> the MDC set for the task [1]. Since the Object::toString-prefixed
> style
> > of
> > >> logging is a bit obsolete after the introduction of MDC contexts it'd
> > feel
> > >> a little strange to continue trying to accommodate it, especially if
> the
> > >> changes from this KIP are going to be opt-in regardless.
> > >>
> > >> As far as thread names go: unlike log statements, I don't believe
> 

Re: [VOTE] 3.5.0 RC1

2023-06-07 Thread Josep Prat
Hi Mickael,

Apparently you did it in this PR already :) :
https://github.com/apache/kafka/pull/13749 (this PR, among other things,
removes classgraph).

Without being a lawyer, I think I agree with you: stating that we depend on
something we don't would be less problematic than the other way around.

Best,

On Wed, Jun 7, 2023 at 12:14 PM Mickael Maison 
wrote:

> Hi Josep,
>
> Thanks for spotting this. If not already done, can you open a
> ticket/PR to fix this on trunk? It looks like the last couple of
> releases already had that issue. Since we're including a license for a
> dependency we don't ship, I think we can consider this non blocking.
> The other way around (shipping a dependency without its license) would
> be blocking.
>
> Thanks,
> Mickael
>
> On Tue, Jun 6, 2023 at 10:10 PM Jakub Scholz  wrote:
> >
> > +1 (non-binding) ... I used the staged binaries with Scala 2.13 and
> staged
> > artifacts to run my tests. All seems to work fine.
> >
> > Thanks for running the release Mickael!
> >
> > Jakub
> >
> > On Mon, Jun 5, 2023 at 3:39 PM Mickael Maison 
> wrote:
> >
> > > Hello Kafka users, developers and client-developers,
> > >
> > > This is the second candidate for release of Apache Kafka 3.5.0. Some
> > > of the major features include:
> > > - KIP-710: Full support for distributed mode in dedicated MirrorMaker
> > > 2.0 clusters
> > > - KIP-881: Rack-aware Partition Assignment for Kafka Consumers
> > > - KIP-887: Add ConfigProvider to make use of environment variables
> > > - KIP-889: Versioned State Stores
> > > - KIP-894: Use incrementalAlterConfig for syncing topic configurations
> > > - KIP-900: KRaft kafka-storage.sh API additions to support SCRAM for
> > > Kafka Brokers
> > >
> > > Release notes for the 3.5.0 release:
> > > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/RELEASE_NOTES.html
> > >
> > > *** Please download, test and vote by Friday June 9, 5pm PT
> > >
> > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > https://kafka.apache.org/KEYS
> > >
> > > * Release artifacts to be voted upon (source and binary):
> > > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/
> > >
> > > * Maven artifacts to be voted upon:
> > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >
> > > * Javadoc:
> > > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/javadoc/
> > >
> > > * Tag to be voted upon (off 3.5 branch) is the 3.5.0 tag:
> > > https://github.com/apache/kafka/releases/tag/3.5.0-rc1
> > >
> > > * Documentation:
> > > https://kafka.apache.org/35/documentation.html
> > >
> > > * Protocol:
> > > https://kafka.apache.org/35/protocol.html
> > >
> > > * Successful Jenkins builds for the 3.5 branch:
> > > Unit/integration tests: I'm struggling to get all tests to pass in the
> > > same build. I'll run a few more builds to ensure each test pass at
> > > least once in the CI. All tests passed locally.
> > > System tests: The build is still running, I'll send an update once I
> > > have the results.
> > >
> > > Thanks,
> > > Mickael
> > >
>


-- 
[image: Aiven] 

*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io    |   
     
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: [VOTE] KIP-938: Add more metrics for measuring KRaft performance

2023-06-07 Thread Divij Vaidya
"Yes, I am referring to the feature level. I changed the description of
CurrentMetadataVersion to reference the feature level specifically."

Thanks Colin. I have reviewed the KIP after the latest changes (including
addition of two new metrics). It looks good to me.

+1 (non-binding)

--
Divij Vaidya



On Wed, Jun 7, 2023 at 12:12 PM Luke Chen  wrote:

> Hi Colin,
>
> One comment:
> Should we add a metric to record the snapshot handling time?
> Since we know the snapshot loading might take long if the size is huge.
> We might want to know how much time it is processed. WDYT?
>
> No matter you think we need it or not, the KIP LGTM.
> +1 from me.
>
>
> Thank you.
> Luke
>
> On Wed, Jun 7, 2023 at 1:33 PM Colin McCabe  wrote:
> >
> > Hi all,
> >
> > I added two new metrics to the list:
> >
> > * LatestSnapshotGeneratedBytes
> > * LatestSnapshotGeneratedAgeMs
> >
> > These will help monitor the period snapshot generation process.
> >
> > best,
> > Colin
> >
> >
> > On Tue, Jun 6, 2023, at 22:21, Colin McCabe wrote:
> > > Hi Divij,
> > >
> > > Yes, I am referring to the feature level. I changed the description of
> > > CurrentMetadataVersion to reference the feature level specifically.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Tue, Jun 6, 2023, at 05:56, Divij Vaidya wrote:
> > >> "Each metadata version has a corresponding integer in the
> > >> MetadataVersion.java file."
> > >>
> > >> Please correct me if I'm wrong, but are you referring to
> "featureLevel"
> > >> in
> > >> the enum at
> > >>
> https://github.com/apache/kafka/blob/trunk/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L45
> > >> ? Is yes, can we please update the description of the metric to make
> it
> > >> easier for the users to understand this? For example, we can say,
> > >> "Represents the current metadata version as an integer value. See
> > >> MetadataVersion (hyperlink) for a mapping between string and integer
> > >> formats of metadata version".
> > >>
> > >> --
> > >> Divij Vaidya
> > >>
> > >>
> > >>
> > >> On Tue, Jun 6, 2023 at 1:51 PM Ron Dagostino 
> wrote:
> > >>
> > >>> Thanks again for the KIP, Colin.  +1 (binding).
> > >>>
> > >>> Ron
> > >>>
> > >>> > On Jun 6, 2023, at 7:02 AM, Igor Soarez 
> > >>> wrote:
> > >>> >
> > >>> > Thanks for the KIP.
> > >>> >
> > >>> > Seems straightforward, LGTM.
> > >>> > Non binding +1.
> > >>> >
> > >>> > --
> > >>> > Igor
> > >>> >
> > >>>
>


Re: [VOTE] 3.5.0 RC1

2023-06-07 Thread Mickael Maison
Hi Josep,

Thanks for spotting this. If not already done, can you open a
ticket/PR to fix this on trunk? It looks like the last couple of
releases already had that issue. Since we're including a license for a
dependency we don't ship, I think we can consider this non blocking.
The other way around (shipping a dependency without its license) would
be blocking.

Thanks,
Mickael

On Tue, Jun 6, 2023 at 10:10 PM Jakub Scholz  wrote:
>
> +1 (non-binding) ... I used the staged binaries with Scala 2.13 and staged
> artifacts to run my tests. All seems to work fine.
>
> Thanks for running the release Mickael!
>
> Jakub
>
> On Mon, Jun 5, 2023 at 3:39 PM Mickael Maison  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the second candidate for release of Apache Kafka 3.5.0. Some
> > of the major features include:
> > - KIP-710: Full support for distributed mode in dedicated MirrorMaker
> > 2.0 clusters
> > - KIP-881: Rack-aware Partition Assignment for Kafka Consumers
> > - KIP-887: Add ConfigProvider to make use of environment variables
> > - KIP-889: Versioned State Stores
> > - KIP-894: Use incrementalAlterConfig for syncing topic configurations
> > - KIP-900: KRaft kafka-storage.sh API additions to support SCRAM for
> > Kafka Brokers
> >
> > Release notes for the 3.5.0 release:
> > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Friday June 9, 5pm PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > https://home.apache.org/~mimaison/kafka-3.5.0-rc1/javadoc/
> >
> > * Tag to be voted upon (off 3.5 branch) is the 3.5.0 tag:
> > https://github.com/apache/kafka/releases/tag/3.5.0-rc1
> >
> > * Documentation:
> > https://kafka.apache.org/35/documentation.html
> >
> > * Protocol:
> > https://kafka.apache.org/35/protocol.html
> >
> > * Successful Jenkins builds for the 3.5 branch:
> > Unit/integration tests: I'm struggling to get all tests to pass in the
> > same build. I'll run a few more builds to ensure each test pass at
> > least once in the CI. All tests passed locally.
> > System tests: The build is still running, I'll send an update once I
> > have the results.
> >
> > Thanks,
> > Mickael
> >


Re: [VOTE] KIP-938: Add more metrics for measuring KRaft performance

2023-06-07 Thread Luke Chen
Hi Colin,

One comment:
Should we add a metric to record the snapshot handling time?
Since we know the snapshot loading might take long if the size is huge,
we might want to know how much time it takes to process. WDYT?

Whether you think we need it or not, the KIP LGTM.
+1 from me.


Thank you.
Luke

On Wed, Jun 7, 2023 at 1:33 PM Colin McCabe  wrote:
>
> Hi all,
>
> I added two new metrics to the list:
>
> * LatestSnapshotGeneratedBytes
> * LatestSnapshotGeneratedAgeMs
>
> These will help monitor the period snapshot generation process.
>
> best,
> Colin
>
>
> On Tue, Jun 6, 2023, at 22:21, Colin McCabe wrote:
> > Hi Divij,
> >
> > Yes, I am referring to the feature level. I changed the description of
> > CurrentMetadataVersion to reference the feature level specifically.
> >
> > best,
> > Colin
> >
> >
> > On Tue, Jun 6, 2023, at 05:56, Divij Vaidya wrote:
> >> "Each metadata version has a corresponding integer in the
> >> MetadataVersion.java file."
> >>
> >> Please correct me if I'm wrong, but are you referring to "featureLevel"
> >> in
> >> the enum at
> >> https://github.com/apache/kafka/blob/trunk/server-common/src/main/java/org/apache/kafka/server/common/MetadataVersion.java#L45
> >> ? Is yes, can we please update the description of the metric to make it
> >> easier for the users to understand this? For example, we can say,
> >> "Represents the current metadata version as an integer value. See
> >> MetadataVersion (hyperlink) for a mapping between string and integer
> >> formats of metadata version".
> >>
> >> --
> >> Divij Vaidya
> >>
> >>
> >>
> >> On Tue, Jun 6, 2023 at 1:51 PM Ron Dagostino  wrote:
> >>
> >>> Thanks again for the KIP, Colin.  +1 (binding).
> >>>
> >>> Ron
> >>>
> >>> > On Jun 6, 2023, at 7:02 AM, Igor Soarez 
> >>> wrote:
> >>> >
> >>> > Thanks for the KIP.
> >>> >
> >>> > Seems straightforward, LGTM.
> >>> > Non binding +1.
> >>> >
> >>> > --
> >>> > Igor
> >>> >
> >>>


[jira] [Created] (KAFKA-15067) kafka SSL support with different ssl providers

2023-06-07 Thread Aldan Brito (Jira)
Aldan Brito created KAFKA-15067:
---

 Summary: kafka SSL support with different ssl providers
 Key: KAFKA-15067
 URL: https://issues.apache.org/jira/browse/KAFKA-15067
 Project: Kafka
  Issue Type: Test
  Components: security
Reporter: Aldan Brito


kafka SSL support with different ssl providers.

Configuring different ssl providers, e.g. Netty ssl providers.

There is no documentation nor example tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


RE: [DISCUSS] Partial CI builds - Reducing flakiness with fewer tests

2023-06-07 Thread Gaurav Narula
Hey Greg,

Thanks for sharing this idea!

The idea of building and testing a relevant subset of code certainly seems 
interesting.

Perhaps this is a good fit for Bazel [1], where
target-determinator [2] can be used to find a subset of targets that have
changed between two commits.

Even without [2], Bazel builds can benefit immensely from distributing builds
to a set of remote nodes [3] with support for caching previously built
targets [4].

We've seen a few other ASF projects adopt Bazel as well:

* https://github.com/apache/rocketmq
* https://github.com/apache/brpc
* https://github.com/apache/trafficserver
* https://github.com/apache/ws-axiom

I wonder how the Kafka community feels about experimenting with Bazel and
exploring if it helps us offer faster build times without compromising on the
correctness of the targets that need to be built and tested?

Thanks,
Gaurav

[1]: https://bazel.build
[2]: https://github.com/bazel-contrib/target-determinator
[3]: https://bazel.build/remote/rbe
[4]: https://bazel.build/remote/caching
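
For comparison with the Bazel route, here is a rough shell sketch of the
git-diff-to-Gradle mapping Greg describes in the quoted message below. It assumes
top-level directories map one-to-one to Gradle subprojects and ignores cross-module
dependencies, so it is only an approximation:

  # Derive changed top-level modules from a git diff and run only their tests.
  base=${1:-origin/trunk}
  modules=$(git diff --name-only "$base"...HEAD | cut -d/ -f1 | sort -u)
  tasks=""
  for m in $modules; do tasks="$tasks :$m:test"; done
  [ -n "$tasks" ] && ./gradlew $tasks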

On 2023/06/05 17:47:07 Greg Harris wrote:
> Hey all,
> 
> I've been working on test flakiness recently, and I've been trying to
> come up with ways to tackle the issue top-down as well as bottom-up,
> and I'm interested to hear your thoughts on an idea.
> 
> In addition to the current full-suite runs, can we in parallel trigger
> a smaller test run which has only a relevant subset of tests? For
> example, if someone is working on one sub-module, the CI would only
> run tests in that module.
> 
> I think this would be more likely to pass than the full suite due to
> the fewer tests failing probabilistically, and would improve the
> signal-to-noise ratio of the summary pass/fail marker on GitHub. This
> should also be shorter to execute than the full suite, allowing for
> faster cycle-time than the current full suite encourages.
> 
> This would also strengthen the incentive for contributors specializing
> in a module to de-flake tests, as they are rewarded with a tangible
> improvement within their area of the project. Currently, even the
> modules with the most reliable tests receive consistent CI failures
> from other less reliable modules.
> 
> I believe this is possible, even if there isn't an off-the-shelf
> solution for it. We can learn of the changed files via a git diff, map
> that to modules containing those files, and then execute the tests
> just for those modules with gradle. GitHub also permits showing
> multiple "checks" so that we can emit both the full-suite and partial
> test results.
> 
> Thanks,
> Greg
>

[jira] [Created] (KAFKA-15066) passing listener name config into TopicBasedRemoteLogMetadataManagerConfig

2023-06-07 Thread Luke Chen (Jira)
Luke Chen created KAFKA-15066:
-

 Summary: passing listener name config into 
TopicBasedRemoteLogMetadataManagerConfig
 Key: KAFKA-15066
 URL: https://issues.apache.org/jira/browse/KAFKA-15066
 Project: Kafka
  Issue Type: Sub-task
Reporter: Luke Chen
Assignee: Luke Chen


The `remote.log.metadata.manager.listener.name` config isn't passed to
TopicBasedRemoteLogMetadataManagerConfig correctly.
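
For context, the broker settings involved look roughly like this (the listener value
below is illustrative, and the class name should be double-checked against the docs):

  remote.log.storage.system.enable=true
  remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
  remote.log.metadata.manager.listener.name=PLAINTEXT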



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Apache Kafka 3.4.1

2023-06-07 Thread Tom Bentley
Thanks Luke!

On Wed, 7 Jun 2023 at 09:11, Mickael Maison 
wrote:

> Thanks for running the release!
>
> On Wed, Jun 7, 2023 at 9:11 AM Bruno Cadonna  wrote:
> >
> > Thanks Luke!
> >
> > On 07.06.23 07:55, Federico Valeri wrote:
> > > Thanks Luke!
> > >
> > > On Wed, Jun 7, 2023 at 5:56 AM Kamal Chandraprakash
> > >  wrote:
> > >>
> > >> Thanks Luke for running this release!
> > >>
> > >> On Wed, Jun 7, 2023 at 8:08 AM Chia-Ping Tsai 
> wrote:
> > >>
> > >>> Thank Luke for this hard work!!!
> > >>>
> >  On 2023-06-07 at 10:35 AM, Chris Egerton  wrote:
> > 
> >  Thanks for running this release, Luke!
> > 
> >  On Tue, Jun 6, 2023, 22:31 Luke Chen  wrote:
> > 
> > > The Apache Kafka community is pleased to announce the release for
> > > Apache Kafka 3.4.1.
> > >
> > > This is a bug fix release and it includes fixes and improvements
> from
> > > 58 JIRAs, including a few critical bugs:
> > > - core
> > > KAFKA-14644 Process should stop after failure in raft IO thread
> > > KAFKA-14946 KRaft controller node shutting down while renouncing
> > >>> leadership
> > > KAFKA-14887 ZK session timeout can cause broker to shutdown
> > > - client
> > > KAFKA-14639 Kafka CooperativeStickyAssignor revokes/assigns
> partition
> > > in one rebalance cycle
> > > - connect
> > > KAFKA-12558 MM2 may not sync partition offsets correctly
> > > KAFKA-14666 MM2 should translate consumer group offsets behind
> > >>> replication
> > > flow
> > > - stream
> > > KAFKA-14172 bug: State stores lose state when tasks are reassigned
> under
> > > EOS
> > >
> > > All of the changes in this release can be found in the release
> notes:
> > >
> > > https://www.apache.org/dist/kafka/3.4.1/RELEASE_NOTES.html
> > >
> > > You can download the source and binary release (Scala 2.12 and
> Scala
> > >>> 2.13)
> > > from:
> > >
> > > https://kafka.apache.org/downloads#3.4.1
> > >
> > >
> > >
> > >>>
> ---
> > >
> > > Apache Kafka is a distributed streaming platform with four core
> APIs:
> > >
> > > ** The Producer API allows an application to publish a stream
> records
> > > to one or more Kafka topics.
> > >
> > > ** The Consumer API allows an application to subscribe to one or
> more
> > > topics and process the stream of records produced to them.
> > >
> > > ** The Streams API allows an application to act as a stream
> processor,
> > > consuming an input stream from one or more topics and producing an
> > > output stream to one or more output topics, effectively
> transforming
> > > the input streams to output streams.
> > >
> > > ** The Connector API allows building and running reusable
> producers or
> > > consumers that connect Kafka topics to existing applications or
> data
> > > systems. For example, a connector to a relational database might
> > > capture every change to a table.
> > >
> > >
> > > With these APIs, Kafka can be used for two broad classes of
> application:
> > >
> > > ** Building real-time streaming data pipelines that reliably get
> data
> > > between systems or applications.
> > >
> > > ** Building real-time streaming applications that transform or
> react
> > > to the streams of data.
> > >
> > > Apache Kafka is in use at large and small companies worldwide,
> > > including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> > > Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> > > Zalando, among others.
> > >
> > > A big thank you for the following 32 contributors to this release!
> > >
> > > atu-sharm, Chia-Ping Tsai, Chris Egerton, Colin Patrick McCabe,
> > > csolidum, David Arthur, David Jacot, Divij Vaidya, egyedt,
> > > emilnkrastev, Eric Haag, Greg Harris, Guozhang Wang, Hector
> Geraldino,
> > > hudeqi, Jason Gustafson, Jeff Kim, Jorge Esteban Quilcate Otoya,
> José
> > > Armando García Sancio, Lucia Cerchie, Luke Chen, Manikumar Reddy,
> > > Matthias J. Sax, Mickael Maison, Philip Nee, Purshotam Chauhan,
> Rajini
> > > Sivaram, Ron Dagostino, Terry, Victoria Xia, Viktor Somogyi-Vass,
> Yash
> > > Mayya
> > >
> > > We welcome your help and feedback. For more information on how to
> > > report problems, and to get involved, visit the project website at
> > > https://kafka.apache.org/
> > >
> > >
> > > Thank you!
> > >
> > > Regards,
> > > Luke
> > >
> > >>>
> > >>>
>
>


[jira] [Created] (KAFKA-15065) ApiVersionRequest is not properly handled in Sasl ControllerServer

2023-06-07 Thread Deng Ziming (Jira)
Deng Ziming created KAFKA-15065:
---

 Summary: ApiVersionRequest is not properly handled in Sasl 
ControllerServer
 Key: KAFKA-15065
 URL: https://issues.apache.org/jira/browse/KAFKA-15065
 Project: Kafka
  Issue Type: Improvement
Reporter: Deng Ziming


In KAFKA-14291 we added finalizedFeatures to ApiVersionsResponse and also changed the
`apiVersionResponse` method to throw an exception:
{code:java}
override def apiVersionResponse(requestThrottleMs: Int): ApiVersionsResponse = {
  throw new UnsupportedOperationException("This method is not supported in SimpleApiVersionManager, use apiVersionResponse(throttleTimeMs, finalizedFeatures, epoch) instead")
}
{code}
but this method is used in SocketServer:
{code:java}
private[network] val selector = createSelector(
  ChannelBuilders.serverChannelBuilder(
    listenerName,
    listenerName == config.interBrokerListenerName,
    securityProtocol,
    config,
    credentialProvider.credentialCache,
    credentialProvider.tokenCache,
    time,
    logContext,
    () => apiVersionManager.apiVersionResponse(throttleTimeMs = 0)
  )
)
{code}
 

This method is invoked in `SaslServerAuthenticator.authenticate`, so the thrown
exception will stop the authentication process.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Apache Kafka 3.4.1

2023-06-07 Thread Mickael Maison
Thanks for running the release!

On Wed, Jun 7, 2023 at 9:11 AM Bruno Cadonna  wrote:
>
> Thanks Luke!
>
> On 07.06.23 07:55, Federico Valeri wrote:
> > Thanks Luke!
> >
> > On Wed, Jun 7, 2023 at 5:56 AM Kamal Chandraprakash
> >  wrote:
> >>
> >> Thanks Luke for running this release!
> >>
> >> On Wed, Jun 7, 2023 at 8:08 AM Chia-Ping Tsai  wrote:
> >>
> >>> Thank Luke for this hard work!!!
> >>>
>  On 2023-06-07 at 10:35 AM, Chris Egerton  wrote:
> 
>  Thanks for running this release, Luke!
> 
>  On Tue, Jun 6, 2023, 22:31 Luke Chen  wrote:
> 
> > The Apache Kafka community is pleased to announce the release for
> > Apache Kafka 3.4.1.
> >
> > This is a bug fix release and it includes fixes and improvements from
> > 58 JIRAs, including a few critical bugs:
> > - core
> > KAFKA-14644 Process should stop after failure in raft IO thread
> > KAFKA-14946 KRaft controller node shutting down while renouncing
> >>> leadership
> > KAFKA-14887 ZK session timeout can cause broker to shutdown
> > - client
> > KAFKA-14639 Kafka CooperativeStickyAssignor revokes/assigns partition
> > in one rebalance cycle
> > - connect
> > KAFKA-12558 MM2 may not sync partition offsets correctly
> > KAFKA-14666 MM2 should translate consumer group offsets behind
> >>> replication
> > flow
> > - stream
> > KAFKA-14172 bug: State stores lose state when tasks are reassigned under
> > EOS
> >
> > All of the changes in this release can be found in the release notes:
> >
> > https://www.apache.org/dist/kafka/3.4.1/RELEASE_NOTES.html
> >
> > You can download the source and binary release (Scala 2.12 and Scala
> >>> 2.13)
> > from:
> >
> > https://kafka.apache.org/downloads#3.4.1
> >
> >
> >
> >>> ---
> >
> > Apache Kafka is a distributed streaming platform with four core APIs:
> >
> > ** The Producer API allows an application to publish a stream of records
> > to one or more Kafka topics.
> >
> > ** The Consumer API allows an application to subscribe to one or more
> > topics and process the stream of records produced to them.
> >
> > ** The Streams API allows an application to act as a stream processor,
> > consuming an input stream from one or more topics and producing an
> > output stream to one or more output topics, effectively transforming
> > the input streams to output streams.
> >
> > ** The Connector API allows building and running reusable producers or
> > consumers that connect Kafka topics to existing applications or data
> > systems. For example, a connector to a relational database might
> > capture every change to a table.
> >
> >
> > With these APIs, Kafka can be used for two broad classes of application:
> >
> > ** Building real-time streaming data pipelines that reliably get data
> > between systems or applications.
> >
> > ** Building real-time streaming applications that transform or react
> > to the streams of data.
> >
> > Apache Kafka is in use at large and small companies worldwide,
> > including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> > Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> > Zalando, among others.
> >
> > A big thank you for the following 32 contributors to this release!
> >
> > atu-sharm, Chia-Ping Tsai, Chris Egerton, Colin Patrick McCabe,
> > csolidum, David Arthur, David Jacot, Divij Vaidya, egyedt,
> > emilnkrastev, Eric Haag, Greg Harris, Guozhang Wang, Hector Geraldino,
> > hudeqi, Jason Gustafson, Jeff Kim, Jorge Esteban Quilcate Otoya, José
> > Armando García Sancio, Lucia Cerchie, Luke Chen, Manikumar Reddy,
> > Matthias J. Sax, Mickael Maison, Philip Nee, Purshotam Chauhan, Rajini
> > Sivaram, Ron Dagostino, Terry, Victoria Xia, Viktor Somogyi-Vass, Yash
> > Mayya
> >
> > We welcome your help and feedback. For more information on how to
> > report problems, and to get involved, visit the project website at
> > https://kafka.apache.org/
> >
> >
> > Thank you!
> >
> > Regards,
> > Luke
> >
> >>>
> >>>


Re: [ANNOUNCE] Apache Kafka 3.4.1

2023-06-07 Thread Bruno Cadonna

Thanks Luke!

On 07.06.23 07:55, Federico Valeri wrote:

Thanks Luke!

On Wed, Jun 7, 2023 at 5:56 AM Kamal Chandraprakash
 wrote:


Thanks Luke for running this release!

On Wed, Jun 7, 2023 at 8:08 AM Chia-Ping Tsai  wrote:


Thank Luke for this hard work!!!


On 2023-06-07 at 10:35 AM, Chris Egerton  wrote:

Thanks for running this release, Luke!

On Tue, Jun 6, 2023, 22:31 Luke Chen  wrote:


The Apache Kafka community is pleased to announce the release for
Apache Kafka 3.4.1.

This is a bug fix release and it includes fixes and improvements from
58 JIRAs, including a few critical bugs:
- core
KAFKA-14644 Process should stop after failure in raft IO thread
KAFKA-14946 KRaft controller node shutting down while renouncing

leadership

KAFKA-14887 ZK session timeout can cause broker to shutdown
- client
KAFKA-14639 Kafka CooperativeStickyAssignor revokes/assigns partition
in one rebalance cycle
- connect
KAFKA-12558 MM2 may not sync partition offsets correctly
KAFKA-14666 MM2 should translate consumer group offsets behind

replication

flow
- stream
KAFKA-14172 bug: State stores lose state when tasks are reassigned under
EOS

All of the changes in this release can be found in the release notes:

https://www.apache.org/dist/kafka/3.4.1/RELEASE_NOTES.html

You can download the source and binary release (Scala 2.12 and Scala

2.13)

from:

https://kafka.apache.org/downloads#3.4.1




---


Apache Kafka is a distributed streaming platform with four core APIs:

** The Producer API allows an application to publish a stream of records
to one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming
the input streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react
to the streams of data.

Apache Kafka is in use at large and small companies worldwide,
including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
Zalando, among others.

A big thank you for the following 32 contributors to this release!

atu-sharm, Chia-Ping Tsai, Chris Egerton, Colin Patrick McCabe,
csolidum, David Arthur, David Jacot, Divij Vaidya, egyedt,
emilnkrastev, Eric Haag, Greg Harris, Guozhang Wang, Hector Geraldino,
hudeqi, Jason Gustafson, Jeff Kim, Jorge Esteban Quilcate Otoya, José
Armando García Sancio, Lucia Cerchie, Luke Chen, Manikumar Reddy,
Matthias J. Sax, Mickael Maison, Philip Nee, Purshotam Chauhan, Rajini
Sivaram, Ron Dagostino, Terry, Victoria Xia, Viktor Somogyi-Vass, Yash
Mayya

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/


Thank you!

Regards,
Luke