Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2184

2023-09-08 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 410258 lines...]
Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldDrainPendingTasksToCreate() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAddTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAddTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldRemoveTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldRemoveTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldUnassignTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldUnassignTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNo

Re: [VOTE] KIP-714: Client metrics and observability

2023-09-08 Thread Andrew Schofield
Bumping the voting thread for KIP-714.

So far, we have:
Non-binding +2 (Milind and Kirk), non-binding -1 (Ryanne)

Thanks,
Andrew

> On 4 Aug 2023, at 09:45, Andrew Schofield  wrote:
> 
> Hi,
> After almost 2 1/2 years in the making, I would like to call a vote for 
> KIP-714 
> (https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability).
> 
> This KIP aims to improve monitoring and troubleshooting of client performance 
> by enabling clients to push metrics to brokers.
> 
> I’d like to thank everyone that participated in the discussion, especially 
> the librdkafka team since one of the aims of the KIP is to enable any client 
> to participate, not just the Apache Kafka project’s Java clients.
> 
> Thanks,
> Andrew




[DISCUSS] KIP-974 Docker Image for GraalVM based Native Kafka Broker

2023-09-08 Thread Krishna Agarwal
Hi,
I want to submit a KIP to deliver an experimental Apache Kafka docker image.
The proposed docker image can launch brokers with sub-second startup time
and minimal memory footprint by leveraging a GraalVM based native Kafka
binary.

KIP-974: Docker Image for GraalVM based Native Kafka Broker


Regards,
Krishna


[DISCUSS] KIP-975 Docker Image for Apache Kafka

2023-09-08 Thread Krishna Agarwal
Hi,
Apache Kafka does not have an official docker image currently.
I want to submit a KIP to publish a docker image for Apache Kafka.

KIP-975: Docker Image for Apache Kafka


Regards,
Krishna


Re: [DISCUSS] KIP-973 Expose per topic replication rate metrics

2023-09-08 Thread Nelson B.
Hi all,

I just wanted to bump up this discussion thread.

Thanks

On Thu, Aug 31, 2023 at 2:05 AM Nelson Bighetti 
wrote:

> Relatively minor change that fixes a mismatch between documentation and
> implementation.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-973%3A+Expose+per+topic+replication+rate+metrics
>


[jira] [Created] (KAFKA-15446) Upgrading from 2.0 to 2.8, with replica out of sync exceeding 12 hours

2023-09-08 Thread Jira
许胜斌 created KAFKA-15446:
---

 Summary: Upgrading from 2.0 to 2.8, with replica out of sync 
exceeding 12 hours
 Key: KAFKA-15446
 URL: https://issues.apache.org/jira/browse/KAFKA-15446
 Project: Kafka
  Issue Type: Bug
  Components: replication
Affects Versions: 2.8.2
 Environment: centos7, java8
Reporter: 许胜斌
 Attachments: image-2023-09-08-16-37-12-364.png

!image-2023-09-08-16-37-12-364.png!
There are three brokers in the cluster. When the leader of the partition is 
node 0, it cannot be synchronized to nodes 1 and 2. This problem has lasted for 
more than ten hours, and the log.dir of the corresponding partition on nodes 1 
and 2 has not been updated for a long time, indicating that data replication 
has stopped.

However, when the leader of the partition is node 1 or node 2, it can be 
synchronized to other nodes.

The error log is:

[2023-09-08 16:35:05,238] WARN [ReplicaFetcher replicaId=2, leaderId=0, 
fetcherId=0] Reset fetch offset for partition msg_for_dispatche-0 from 
3636534258 to current leader's start offset 14558984559 
(kafka.server.ReplicaFetcherThread)
[2023-09-08 16:35:05,238] INFO The cleaning for partition msg_for_dispatche-0 
is aborted and paused (kafka.log.LogManager)
[2023-09-08 16:35:05,238] INFO [Log partition=msg_for_dispatche-0, 
dir=/usr/local/kafka/kafka-logs] Deleting segments as part of log truncation: 
LogSegment(baseOffset=3636534258, size=0, lastModifiedTime=1694162105000, 
largestRecordTimestamp=None) (kafka.log.Log)
[2023-09-08 16:35:05,241] INFO [Log partition=msg_for_dispatche-0, 
dir=/usr/local/kafka/kafka-logs] Loading producer state till offset 14558984559 
with message format version 2 (kafka.log.Log)
[2023-09-08 16:35:05,241] INFO Cleaning for partition msg_for_dispatche-0 is 
resumed (kafka.log.LogManager)
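The WARN line above shows the follower resetting its fetch offset because it fell below the leader's log start offset. As a simplified, hypothetical sketch of that follower-side rule (not the actual ReplicaFetcherThread code):

```python
def next_fetch_offset(fetch_offset, leader_start_offset, leader_end_offset):
    # Simplified illustration of follower fetch-offset handling; the real
    # logic lives in kafka.server.ReplicaFetcherThread and is more involved.
    if fetch_offset < leader_start_offset:
        # Follower is behind the leader's log start (e.g. after retention
        # deleted old segments): truncate and restart from the leader's start.
        return leader_start_offset
    if fetch_offset > leader_end_offset:
        # Follower is ahead of the leader's log end: truncate back to it.
        return leader_end_offset
    return fetch_offset
```

With the offsets from the log above, the follower at 3636534258 would jump to the leader's start offset 14558984559, matching the WARN message.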




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Apache Kafka 3.6.0 release

2023-09-08 Thread Ivan Yurchenko
Hi Satish and all,

I wonder if https://issues.apache.org/jira/browse/KAFKA-14993 should be 
included in the 3.6 release plan. I'm thinking that, when implemented, it would 
be a small but real change to the RSM contract: throwing an exception instead 
of returning an empty InputStream. Maybe it should be included right away to 
save a migration later? What do you think?

Best,
Ivan
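As a rough illustration of the contract difference Ivan describes (function and store names here are hypothetical, not the actual RemoteStorageManager API):

```python
import io

class ResourceNotFoundError(Exception):
    """Hypothetical stand-in for a 'segment not found' exception."""

def fetch_segment_old(store, segment_id):
    # Old-style contract: a missing segment yields an empty stream, which
    # the caller cannot distinguish from a genuinely empty segment.
    data = store.get(segment_id)
    return io.BytesIO(data if data is not None else b"")

def fetch_segment_new(store, segment_id):
    # Proposed-style contract: a missing segment raises, so callers can
    # react (retry, surface an error) instead of silently reading nothing.
    data = store.get(segment_id)
    if data is None:
        raise ResourceNotFoundError(segment_id)
    return io.BytesIO(data)
```

The second form is what makes the change a contract change for existing RSM plugin implementations, hence the question about including it before 3.6.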

On Fri, Sep 8, 2023, at 02:52, Satish Duggana wrote:
> Hi Jose,
> Thanks for looking into this issue and resolving it with a quick fix.
> 
> ~Satish.
> 
> On Thu, 7 Sept 2023 at 21:40, José Armando García Sancio
>  wrote:
> >
> > Hi Satish,
> >
> > On Wed, Sep 6, 2023 at 4:58 PM Satish Duggana  
> > wrote:
> > >
> > > Hi Greg,
> > > It seems https://issues.apache.org/jira/browse/KAFKA-14273 has been
> > > there in 3.5.x too.
> >
> > I also agree that it should be a blocker for 3.6.0. It should have
> > been a blocker for those previous releases. I didn't fix it because,
> > unfortunately, I wasn't aware of the issue and jira.
> > I'll create a PR with a fix in case the original author doesn't respond in 
> > time.
> >
> > Satish, do you agree?
> >
> > Thanks!
> > --
> > -José
> 


Re: [DISCUSS] KIP-974 Docker Image for GraalVM based Native Kafka Broker

2023-09-08 Thread Federico Valeri
Hi Krishna, thanks for opening this discussion.

I see you created two separate KIPs (974 and 975), but there are some
common points (build system and test plan).

Currently, the Docker image used for system tests is only supported in
that limited scope, so the maintenance burden is minimal. Providing
official Kafka images would be much more complicated. Have you
considered how the image rebuild process would work if a high-severity
CVE comes out for a non-Kafka image dependency? In that case, there
would be no corresponding Kafka release.

Br
Fede

On Fri, Sep 8, 2023 at 9:17 AM Krishna Agarwal
 wrote:
>
> Hi,
> I want to submit a KIP to deliver an experimental Apache Kafka docker image.
> The proposed docker image can launch brokers with sub-second startup time
> and minimal memory footprint by leveraging a GraalVM based native Kafka
> binary.
>
> KIP-974: Docker Image for GraalVM based Native Kafka Broker
> 
>
> Regards,
> Krishna


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2185

2023-09-08 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-15435) KRaft migration record counts in log message are incorrect

2023-09-08 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-15435.
--
Resolution: Fixed

> KRaft migration record counts in log message are incorrect
> --
>
> Key: KAFKA-15435
> URL: https://issues.apache.org/jira/browse/KAFKA-15435
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft
>Affects Versions: 3.6.0
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Blocker
> Fix For: 3.6.0
>
>
> The counting logic in MigrationManifest is incorrect and produces invalid 
> output. This information is critical for users wanting to validate the result 
> of a migration.
>  
> {code}
> Completed migration of metadata from ZooKeeper to KRaft. 7117 records were 
> generated in 54253 ms across 1629 batches. The record types were 
> {TOPIC_RECORD=2, CONFIG_RECORD=2, PARTITION_RECORD=2, 
> ACCESS_CONTROL_ENTRY_RECORD=2, PRODUCER_IDS_RECORD=1}. 
> {code}
> Due to the logic bug, the counts will never exceed 2.
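The MigrationManifest code itself is not shown in this thread; purely as an illustration, a counting bug with this symptom (counts never exceeding 2) can arise when a merge step stores a constant instead of accumulating:

```python
from collections import defaultdict

# Illustrative only -- not the actual MigrationManifest code.
records = ["TOPIC_RECORD"] * 5 + ["CONFIG_RECORD"] * 3

# Buggy pattern: on a repeated key, "merge" with a constant rather than
# adding, so every record type's count saturates at 2.
buggy = {}
for r in records:
    buggy[r] = 1 if r not in buggy else 2

# Correct pattern: increment once per occurrence.
counts = defaultdict(int)
for r in records:
    counts[r] += 1
```

With the buggy pattern, both record types report 2 regardless of how many records were actually generated, which matches the symptom in the example output above.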





Re: Apache Kafka 3.6.0 release

2023-09-08 Thread David Arthur
Quick update on my two blockers: KAFKA-15435 is merged to trunk and
cherry-picked to 3.6. I have a PR open for KAFKA-15441 and will hopefully
get it merged today.

-David

On Fri, Sep 8, 2023 at 5:26 AM Ivan Yurchenko  wrote:

> Hi Satish and all,
>
> I wonder if https://issues.apache.org/jira/browse/KAFKA-14993 should be
> included in the 3.6 release plan. I'm thinking that when implemented, it
> would be a small, but still a change in the RSM contract: throw an
> exception instead of returning an empty InputStream. Maybe it should be
> included right away to save the migration later? What do you think?
>
> Best,
> Ivan
>
> On Fri, Sep 8, 2023, at 02:52, Satish Duggana wrote:
> > Hi Jose,
> > Thanks for looking into this issue and resolving it with a quick fix.
> >
> > ~Satish.
> >
> > On Thu, 7 Sept 2023 at 21:40, José Armando García Sancio
> >  wrote:
> > >
> > > Hi Satish,
> > >
> > > On Wed, Sep 6, 2023 at 4:58 PM Satish Duggana <
> satish.dugg...@gmail.com> wrote:
> > > >
> > > > Hi Greg,
> > > > It seems https://issues.apache.org/jira/browse/KAFKA-14273 has been
> > > > there in 3.5.x too.
> > >
> > > I also agree that it should be a blocker for 3.6.0. It should have
> > > been a blocker for those previous releases. I didn't fix it because,
> > > unfortunately, I wasn't aware of the issue and jira.
> > > I'll create a PR with a fix in case the original author doesn't
> respond in time.
> > >
> > > Satish, do you agree?
> > >
> > > Thanks!
> > > --
> > > -José
> >
>


-- 
-David


Re: Apache Kafka 3.6.0 release

2023-09-08 Thread Ismael Juma
Hi Satish,

Do you have a sense of when we'll publish RC0?

Thanks,
Ismael

On Fri, Sep 8, 2023 at 6:27 AM David Arthur
 wrote:

> Quick update on my two blockers: KAFKA-15435 is merged to trunk and
> cherry-picked to 3.6. I have a PR open for KAFKA-15441 and will hopefully
> get it merged today.
>
> -David
>
> On Fri, Sep 8, 2023 at 5:26 AM Ivan Yurchenko  wrote:
>
> > Hi Satish and all,
> >
> > I wonder if https://issues.apache.org/jira/browse/KAFKA-14993 should be
> > included in the 3.6 release plan. I'm thinking that when implemented, it
> > would be a small, but still a change in the RSM contract: throw an
> > exception instead of returning an empty InputStream. Maybe it should be
> > included right away to save the migration later? What do you think?
> >
> > Best,
> > Ivan
> >
> > On Fri, Sep 8, 2023, at 02:52, Satish Duggana wrote:
> > > Hi Jose,
> > > Thanks for looking into this issue and resolving it with a quick fix.
> > >
> > > ~Satish.
> > >
> > > On Thu, 7 Sept 2023 at 21:40, José Armando García Sancio
> > >  wrote:
> > > >
> > > > Hi Satish,
> > > >
> > > > On Wed, Sep 6, 2023 at 4:58 PM Satish Duggana <
> > satish.dugg...@gmail.com> wrote:
> > > > >
> > > > > Hi Greg,
> > > > > It seems https://issues.apache.org/jira/browse/KAFKA-14273 has
> been
> > > > > there in 3.5.x too.
> > > >
> > > > I also agree that it should be a blocker for 3.6.0. It should have
> > > > been a blocker for those previous releases. I didn't fix it because,
> > > > unfortunately, I wasn't aware of the issue and jira.
> > > > I'll create a PR with a fix in case the original author doesn't
> > respond in time.
> > > >
> > > > Satish, do you agree?
> > > >
> > > > Thanks!
> > > > --
> > > > -José
> > >
> >
>
>
> --
> -David
>


Re: [VOTE] KIP-970: Deprecate and remove Connect's redundant task configurations endpoint

2023-09-08 Thread Mickael Maison
Hi Yash,

+1 (binding)
Thanks for the KIP!

Mickael

On Wed, Sep 6, 2023 at 6:05 PM Greg Harris  wrote:
>
> Hey Yash,
>
> +1(binding)
>
> Thanks for the KIP!
> Greg
>
> On Wed, Sep 6, 2023 at 6:59 AM Yash Mayya  wrote:
> >
> > Hi all,
> >
> > I just wanted to bump up this vote thread. Thanks to everyone who's voted
> > so far - we have 1 binding +1 vote and 3 non-binding +1 votes so far.
> >
> > Thanks,
> > Yash
> >
> > On Wed, Aug 30, 2023 at 11:14 PM Sagar  wrote:
> >
> > > +1 (non - binding).
> > >
> > > Thanks !
> > > Sagar.
> > >
> > > On Wed, 30 Aug 2023 at 11:09 PM, Chris Egerton 
> > > wrote:
> > >
> > > > +1 (binding), thanks Yash!
> > > >
> > > > On Wed, Aug 30, 2023 at 1:34 PM Andrew Schofield <
> > > > andrew_schofield_j...@outlook.com> wrote:
> > > >
> > > > > Thanks for the KIP. Looks good to me.
> > > > >
> > > > > +1 (non-binding).
> > > > >
> > > > > Andrew
> > > > >
> > > > > > On 30 Aug 2023, at 18:07, Hector Geraldino (BLOOMBERG/ 919 3RD A) <
> > > > > hgerald...@bloomberg.net> wrote:
> > > > > >
> > > > > > This makes sense to me, +1 (non-binding)
> > > > > >
> > > > > > From: dev@kafka.apache.org At: 08/30/23 02:58:59 UTC-4:00To:
> > > > > dev@kafka.apache.org
> > > > > > Subject: [VOTE] KIP-970: Deprecate and remove Connect's redundant
> > > task
> > > > > configurations endpoint
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This is the vote thread for KIP-970 which proposes deprecating (in
> > > the
> > > > > > Apache Kafka 3.7 release) and eventually removing (in the next major
> > > > > Apache
> > > > > > Kafka release - 4.0) Connect's redundant task configurations
> > > endpoint.
> > > > > >
> > > > > > KIP -
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-970%3A+Deprecate+and+remov
> > > > > > e+Connect%27s+redundant+task+configurations+endpoint
> > > > > >
> > > > > > Discussion thread -
> > > > > > https://lists.apache.org/thread/997qg9oz58kho3c19mdrjodv0n98plvj
> > > > > >
> > > > > > Thanks,
> > > > > > Yash
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2186

2023-09-08 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #32

2023-09-08 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 407602 lines...]

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAddTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAddTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldRemoveTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldRemoveTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldUnassignTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldUnassignTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsForUnassignedTasks() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsForUnassignedTasks() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManage

[REVIEW REQUEST] Move ReassignPartitionsUnitTest to java

2023-09-08 Thread Николай Ижиков
Hello.

I’m working on [1].
The goal of the ticket is to rewrite `ReassignPartitionCommand` in Java.

The PR that moves the whole command is pretty big, so it makes sense to split it.
I prepared the PR [2] that moves a single test (ReassignPartitionsUnitTest) to 
Java.

It is smaller and simpler (it touches only 4 files):

To review - https://github.com/apache/kafka/pull/14355
Big PR  - https://github.com/apache/kafka/pull/13247

Please, review.

[1] https://issues.apache.org/jira/browse/KAFKA-14595
[2] https://github.com/apache/kafka/pull/14355

Re: [VOTE] KIP-970: Deprecate and remove Connect's redundant task configurations endpoint

2023-09-08 Thread Yash Mayya
Hi all,

KIP-970 has been accepted with three binding +1 votes from Chris, Greg and
Mickael (and three non-binding +1 votes from Hector, Andrew and Sagar).
Thanks all.

Cheers,
Yash

On Fri, Sep 8, 2023 at 7:38 PM Mickael Maison 
wrote:

> Hi Yash,
>
> +1 (binding)
> Thanks for the KIP!
>
> Mickael
>
> On Wed, Sep 6, 2023 at 6:05 PM Greg Harris 
> wrote:
> >
> > Hey Yash,
> >
> > +1(binding)
> >
> > Thanks for the KIP!
> > Greg
> >
> > On Wed, Sep 6, 2023 at 6:59 AM Yash Mayya  wrote:
> > >
> > > Hi all,
> > >
> > > I just wanted to bump up this vote thread. Thanks to everyone who's
> voted
> > > so far - we have 1 binding +1 vote and 3 non-binding +1 votes so far.
> > >
> > > Thanks,
> > > Yash
> > >
> > > On Wed, Aug 30, 2023 at 11:14 PM Sagar 
> wrote:
> > >
> > > > +1 (non - binding).
> > > >
> > > > Thanks !
> > > > Sagar.
> > > >
> > > > On Wed, 30 Aug 2023 at 11:09 PM, Chris Egerton
> 
> > > > wrote:
> > > >
> > > > > +1 (binding), thanks Yash!
> > > > >
> > > > > On Wed, Aug 30, 2023 at 1:34 PM Andrew Schofield <
> > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > >
> > > > > > Thanks for the KIP. Looks good to me.
> > > > > >
> > > > > > +1 (non-binding).
> > > > > >
> > > > > > Andrew
> > > > > >
> > > > > > > On 30 Aug 2023, at 18:07, Hector Geraldino (BLOOMBERG/ 919 3RD
> A) <
> > > > > > hgerald...@bloomberg.net> wrote:
> > > > > > >
> > > > > > > This makes sense to me, +1 (non-binding)
> > > > > > >
> > > > > > > From: dev@kafka.apache.org At: 08/30/23 02:58:59 UTC-4:00To:
> > > > > > dev@kafka.apache.org
> > > > > > > Subject: [VOTE] KIP-970: Deprecate and remove Connect's
> redundant
> > > > task
> > > > > > configurations endpoint
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > This is the vote thread for KIP-970 which proposes deprecating
> (in
> > > > the
> > > > > > > Apache Kafka 3.7 release) and eventually removing (in the next
> major
> > > > > > Apache
> > > > > > > Kafka release - 4.0) Connect's redundant task configurations
> > > > endpoint.
> > > > > > >
> > > > > > > KIP -
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-970%3A+Deprecate+and+remov
> > > > > > > e+Connect%27s+redundant+task+configurations+endpoint
> > > > > > >
> > > > > > > Discussion thread -
> > > > > > >
> https://lists.apache.org/thread/997qg9oz58kho3c19mdrjodv0n98plvj
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yash
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
>


Re: [VOTE] KIP-714: Client metrics and observability

2023-09-08 Thread Philip Nee
Hey Andrew -

+1 but I don't have a binding vote!

It took me a while to go through the KIP. Here are some of my notes from
reading it:

*Metrics*
- Should we care about the client's leader epoch? There is a case where the
user recreates the topic, but the consumer thinks it is still the same
topic and therefore, attempts to start from an offset that doesn't exist.
KIP-848 addresses this issue, but I can still see some potential benefits
from knowing the client's epoch information.
- I assume poll idle is similar to poll interval: I needed to read the
description a few times.
- I don't have a clear use case in mind for the commit latency, but I do
think sometimes people lack clarity about how much progress was tracked by
the auto-commit.  Would tracking auto-commit-related metrics be useful? I
was thinking: the last offset committed or the actual cadence in ms.
- Are there cases when we need to increase the cadence of telemetry data
push? i.e. variable interval.
- Thanks for implementing the randomized initial metric push; I think it is
really important.
- Is there a potential use case for tracking the number of active
partitions? The consumer can pause partitions via API, during revocation,
or during offset reset for the stream.
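On the randomized initial metric push mentioned above: a common way to avoid a thundering herd of telemetry pushes is to jitter the first push. A minimal sketch (the jitter window here is an assumption for illustration; see KIP-714 for the actual specification):

```python
import random

def first_push_delay_ms(push_interval_ms: int) -> float:
    # Spread the first telemetry push over [0.5x, 1.5x] of the configured
    # interval so a fleet of restarting clients doesn't push in lockstep.
    # (The exact window is illustrative, not taken from the KIP.)
    return random.uniform(0.5, 1.5) * push_interval_ms

delay = first_push_delay_ms(300_000)  # e.g. a 5-minute push interval
```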

*Connections*:
- The KIP states that the client will keep using the same connection until it
is disconnected. I wonder if that could cause congestion on an already busy
channel, leading to connection timeouts and subsequent disconnection.

Thanks,
P

On Fri, Sep 8, 2023 at 4:15 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Bumping the voting thread for KIP-714.
>
> So far, we have:
> Non-binding +2 (Milind and Kirk), non-binding -1 (Ryanne)
>
> Thanks,
> Andrew
>
> > On 4 Aug 2023, at 09:45, Andrew Schofield 
> wrote:
> >
> > Hi,
> > After almost 2 1/2 years in the making, I would like to call a vote for
> KIP-714 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
> ).
> >
> > This KIP aims to improve monitoring and troubleshooting of client
> performance by enabling clients to push metrics to brokers.
> >
> > I’d like to thank everyone that participated in the discussion,
> especially the librdkafka team since one of the aims of the KIP is to
> enable any client to participate, not just the Apache Kafka project’s Java
> clients.
> >
> > Thanks,
> > Andrew
>
>
>


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-09-08 Thread Calvin Liu
Hi Jun
Thanks for the comments!

10. Updated

11. It is mentioned in the ELR invariants section. It is mainly to save
metadata space.

12. Good catch, let me update the graph.

13. The thing we did not change is the original HWM advance requirement
“replicate to maximum ISR”. The proposal adds another requirement of
“committed ISR size should be larger than min ISR.” Let me update it.

14. In Proactive mode, the controller will perform unclean recovery.

15. Good advice, I will update to “In Balance mode, all the LastKnownELR
members have replied, plus the replicas replied within the timeout.”

16. There can be a race between broker unfencing and the log query, so we do
the best we can.

17. Will update. It is not downgradable once the Unclean Recovery is
enabled.

18. We don’t pluralize ISR as ISRs, so I think we don’t have to add an “s” to
LastKnownELR.

19. Good catch. I will rename it to PreviousBrokerEpoch. If the broker does
not find any previous epoch, it can set it to -1.

20. The issuer should be admin clients or brokers. The controller will
serve the request because brokers do not know the ELR info.

21. Yes, it uses the index to map the partitions and the desired leaders.

22. The admin clients may not know the topic id. So give it an option to
use topic name.

23. I will add the unclean option. DESIGNATION refers to electing the provided
replica as the leader.

24. The minimalReplicas is a field in the json file. See the example of
--path-to-json-file

25. Maybe call it manual_operation_required_partition_count? A partition can
be leaderless while waiting for an ELR member to come online, not necessarily
requiring operators to check.

26. It should be incremented when a partition triggers an unclean recovery and
decremented when the partition gets a new leader.

27. The wording was not accurate. Min ISR only affects HWM advancement and
whether to reject an acks=all request.

28. Good advice. Will update.

29. Yes, it does maintain the last ISR member. The last ISR member is usually
the last known leader, and the reason for keeping it is leader election.

30. Will update.
Thanks again!
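Point 13 above concerns when the high watermark may advance under the proposal. A rough sketch of the combined rule (the names and the exact comparison are assumptions; the KIP text is authoritative):

```python
def can_advance_hwm(acked_replicas, maximal_isr, committed_isr, min_isr):
    # Illustrative sketch of the two HWM conditions discussed in the thread,
    # not actual Kafka code.
    # Existing requirement: the offset is replicated to every member of the
    # maximal ISR.
    replicated_to_maximal = maximal_isr <= acked_replicas  # subset check
    # Additional requirement in the proposal: the committed ISR must be at
    # least min.insync.replicas in size.
    isr_large_enough = len(committed_isr) >= min_isr
    return replicated_to_maximal and isr_large_enough
```

For example, with min ISR 3, an offset acked by only two replicas of a shrunken ISR would no longer advance the HWM, which is the behavior discussed in the T1 example above.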

On Thu, Sep 7, 2023 at 11:45 AM Jun Rao  wrote:

> Hi, Calvin,
>
> Thanks for the KIP. A few comments below.
>
> 10. "The High Watermark forwarding still requires a quorum within the ISR."
> Should it say that it requires the full ISR?
>
> 11. "remove the duplicate member in both ISR and ELR from ELR." Hmm, not
> sure that I follow since the KIP says that ELR doesn't overlap with ISR.
>
> 12. T1: In this case, min ISR is 3. Why would HWM advance to 2 with only 2
> members in ISR?
>
> 13. The KIP says "Note that, if maximal ISR > ISR, the message should be
> replicated to the maximal ISR before covering the message under HWM. The
> proposal does not change this behavior." and "Currently, we would advance
> HWM because it replicated to 2 brokers (the ones in Maximal ISR), but in
> the new protocol we wait until the controller updates ISR=[0,2] to avoid
> advancing HWM beyond what ELR=[1] has." They seem a bit inconsistent since
> one says no change and the other describes a change.
>
> 14. "If there are no ELR members. If the
> unclean.recovery.strategy=balanced, the controller will do the unclean
> recovery. Otherwise, unclean.recovery.strategy=Manual, the controller will
> not attempt to elect a leader. Waiting for the user operations." What
> happens with unclean.recovery.strategy=Proactive?
>
> 15. "In Balance mode, all the LastKnownELR members have replied." In
> Proactive, we wait for all replicas within a fixed amount of time. Balance
> should do the same since it's designed to preserve more data, right?
>
> 16. "The URM will query all the replicas including the fenced replicas."
> Why include the fenced replicas? Could a fenced replica be elected as the
> leader?
>
> 17. Once unclean.recovery.strategy is enabled, new metadata records could
> be written to the metadata log. At that point, is the broker downgradable?
> It would be useful to document that.
>
> 18. Since LastKnownELR can have more than 1 member, should it be
> LastKnownELRs?
>
> 19. BrokerRegistration.BrokerEpoch: "The broker's assigned epoch or the
> epoch before a clean shutdown." How do we tell whether the value is for the
> current or the previous epoch? Does it matter?
>
> 20. DescribeTopicRequest: Who issues that request? Who can serve that
> request? Is it only the controller or any broker?
>
> 21. DesiredLeaders: Does the ordering matter?
>
> 22. GetReplicaLogInfo only uses topicId while DescribeTopicRequest uses
> both topicId and name. Should they be consistent?
>
> 23. --election-type: The description mentions unclean, but that option
> doesn't exist. Also, could we describe what DESIGNATION means?
>
> 24. kafka-leader-election.sh has minimalReplicas, but ElectLeadersRequest
> doesn't seem to have a corresponding field?
>
> 25. kafka.replication.paused_partitions_count: paused doesn't seem to match
> the meaning of the metric. Should this be leaderless_partitions_count?

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-09-08 Thread Calvin Liu
Hi Artem
Thanks so much for the comments!

1. Yes, you are right: when the leader gets fenced, it is put into the
ELR. Unclean recovery can only be triggered if the mode is Proactive.
Let me clarify the trigger requirements in the KIP.
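A rough sketch of the controller's decision when a partition loses its leader, assuming the semantics discussed in this thread. The function name, return strings, and strategy spellings are illustrative only.

```python
def next_action(isr, elr, strategy):
    """Hypothetical controller decision for a leaderless partition."""
    if isr:
        return "elect-from-isr"      # clean election, always preferred
    if elr:
        return "elect-from-elr"      # ELR members have complete logs
    if strategy in ("proactive", "balanced"):
        return "unclean-recovery"    # controller-driven unclean recovery
    return "wait-for-operator"       # strategy == "manual"
```

The key point is that unclean recovery is a last resort: it is considered only when both the ISR and the ELR are empty.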

2. Good point; the controller should wait for all the LastKnownELR
members to be unfenced before triggering the recovery.
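As a small sketch of that check, under the assumption that the controller tracks fenced brokers as a set; the function name is illustrative, not actual controller code.

```python
def can_start_balanced_recovery(last_known_elr, fenced_brokers):
    # In Balanced mode, wait until every LastKnownELR member is unfenced
    # before querying replicas and electing a leader.
    return all(b not in fenced_brokers for b in last_known_elr)
```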

3. Let me rewrite this part. The URM should have access to the
ReplicationControlManager, which stores the partition registrations; it
can then check the replicas and the LastKnownELR. But those are
implementation details.

Thanks!


On Thu, Sep 7, 2023 at 9:07 PM Artem Livshits
 wrote:

> Hi Calvin,
>
> Thanks for the KIP.  The new ELR protocol looks good to me.  I have some
> questions about unclean recovery, specifically in "balanced" mode:
>
> 1. The KIP mentions that the controller would trigger unclean recovery when
> the leader is fenced, but my understanding is that when a leader is fenced,
> it would get into ELR.  Would it be more precise to say that an unclean
> leader election is triggered when the last member of ELR gets unfenced and
> registers with unclean shutdown?
> 2. For balanced mode, we need replies from at least LastKnownELR, in which
> case, does it make sense to start unclean recovery if some of the
> LastKnownELR are fenced?
> 3. "The URM takes the partition info to initiate an unclean recovery task
> ..." the parameters are topic-partition and replica ids -- what are those?
> Would those be just the whole replica assignment or just LastKnownELR?
>
> -Artem
>
> On Thu, Aug 10, 2023 at 3:47 PM Calvin Liu 
> wrote:
>
> > Hi everyone,
> > I'd like to discuss a series of enhancements to the replication protocol.
> >
> > A partition replica can experience local data loss in unclean shutdown
> > scenarios where unflushed data in the OS page cache is lost - such as an
> > availability zone power outage or a server error. The Kafka replication
> > protocol is designed to handle these situations by removing such replicas
> > from the ISR and only re-adding them once they have caught up and
> > therefore recovered any lost data. This prevents replicas that lost an
> > arbitrary log suffix, which included committed data, from being elected
> > leader.
> > However, there is a "last replica standing" state which, when combined
> > with a data loss unclean shutdown event, can turn a local data loss
> > scenario into a global data loss scenario, i.e., committed data can be
> > removed from all replicas. When the last replica in the ISR experiences
> > an unclean shutdown and loses committed data, it will be reelected
> > leader after starting up again, causing rejoining followers to truncate
> > their logs and thereby removing the last copies of the committed records
> > which the leader lost initially.
> >
> > The new KIP maximizes the protection, providing MinISR-1 tolerance to
> > data loss unclean shutdown events.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
> >
>


[jira] [Resolved] (KAFKA-15441) Broker sessions can time out during ZK migration

2023-09-08 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-15441.
--
Resolution: Fixed

> Broker sessions can time out during ZK migration
> 
>
> Key: KAFKA-15441
> URL: https://issues.apache.org/jira/browse/KAFKA-15441
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.6.0
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Blocker
> Fix For: 3.6.0
>
>
> When a ZK to KRaft migration takes more than a few seconds to complete, the 
> sessions between the ZK brokers and the KRaft controller will expire. This 
> appears to be due to the heartbeat events being blocked in the purgatory on 
> the controller.
> The side effect of this expiration is that after the metadata is migrated, 
> the KRaft controller will immediately fence all of the brokers and remove 
> them from ISRs. This leads to a mass leadership change that can cause large 
> latency spikes on the brokers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2187

2023-09-08 Thread Apache Jenkins Server
See 




Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.6 #33

2023-09-08 Thread Apache Jenkins Server
See