Hi Stanislav,

I merged https://github.com/apache/kafka/pull/15308 in trunk. I let
you cherry-pick it to 3.7.

I think fixing the absolute show stoppers and calling JBOD support in
KRaft early access in 3.7.0 is probably the right call. Even without
the bugs we found, there's still quite a few JBOD follow up work to do
(KAFKA-16061) + system tests and documentation updates.

Thanks,
Mickael

On Fri, Feb 2, 2024 at 4:49 PM Stanislav Kozlovski
<stanis...@confluent.io.invalid> wrote:
>
> Thanks for the work everybody. Providing a status update at the end of the
> week:
>
> - docs change explaining migration
> <https://github.com/apache/kafka/pull/15193> was merged
> - the blocker KAFKA-16162 <https://github.com/apache/kafka/pull/15270> was
> merged
> - the blocker KAFKA-14616 <https://github.com/apache/kafka/pull/15230> was
> merged
> - a small blocker problem with the shadow jar plugin
> <https://github.com/apache/kafka/pull/15308>
> - the blockers KAFKALESS-16157 & KAFKALESS-16195 aren't merged
> - the good-to-have KAFKA-16082 isn't merged
>
> I think we should prioritize merging KAFKALESS-16195 and *call JBOD EA*. I
> question whether we may find more blocker bugs in the next RC.
> The release is late by approximately a month so far, so I do want to scope
> down aggressively to meet the time-based goal.
>
> Best,
> Stanislav
>
> On Mon, Jan 29, 2024 at 5:46 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> wrote:
>
> > Hi Stan and Gaurav,
> > Just to clarify some points mentioned here before
> >  KAFKA-14616: I raised a year ago so it's not related to JBOD work. It is
> > rather a blocker bug for KRAFT in general. The PR from Colin should fix
> > this. Am not sure if it is a blocker for 3.7 per-say as it was a major bug
> > since 3.3 and got missed from all other releases.
> >
> > Regarding the JBOD's work:
> > KAFKA-16082:  Is not a blocker for 3.7 instead it's nice fix. The pr
> > https://github.com/apache/kafka/pull/15136 is quite a small one and was
> > approved by Proven and I but it is waiting for a committer's approval.
> > KAFKA-16162: This is a blocker for 3.7.  Same it’s a small pr
> > https://github.com/apache/kafka/pull/15270 and it is approved Proven and
> > I and the PR is waiting for committer's approval.
> > KAFKA-16157: This is a blocker for 3.7. There is one small suggestion for
> > the pr https://github.com/apache/kafka/pull/15263 but I don't think any
> > of the current feedback is blocking the pr from getting approved. Assuming
> > we get a committer's approval on it.
> > KAFKA-16195:  Same it's a blocker but it has approval from Proven and I
> > and we are waiting for committer's approval on the pr
> > https://github.com/apache/kafka/pull/15262.
> >
> > If we can’t get a committer approval for KAFKA-16162, KAFKA-16157 and
> > KAFKA-16195  in time for 3.7 then we can mark JBOD as early release
> > assuming we merge at least KAFKA-16195.
> >
> > Regards,
> > Omnia
> >
> > > On 26 Jan 2024, at 15:39, ka...@gnarula.com wrote:
> > >
> > > Apologies, I duplicated KAFKA-16157 twice in my previous message. I
> > intended to mention KAFKA-16195
> > > with the PR at https://github.com/apache/kafka/pull/15262 as the second
> > JIRA.
> > >
> > > Thanks,
> > > Gaurav
> > >
> > >> On 26 Jan 2024, at 15:34, ka...@gnarula.com wrote:
> > >>
> > >> Hi Stan,
> > >>
> > >> I wanted to share some updates about the bugs you shared earlier.
> > >>
> > >> - KAFKA-14616: I've reviewed and tested the PR from Colin and have
> > observed
> > >> the fix works as intended.
> > >> - KAFKA-16162: I reviewed Proven's PR and found some gaps in the
> > proposed fix. I've
> > >> therefore raised https://github.com/apache/kafka/pull/15270 following
> > a discussion with Luke in JIRA.
> > >> - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm
> > awaiting
> > >> feedback/reviews at https://github.com/apache/kafka/pull/15136
> > >>
> > >> In addition to the above, there are 2 JIRAs I'd like to bring
> > everyone's attention to:
> > >>
> > >> - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a
> > blocker. I've raised
> > >> https://github.com/apache/kafka/pull/15263 and am awaiting reviews on
> > it.
> > >> - KAFKA-16157: I raised this yesterday and have addressed feedback from
> > Luke. This should
> > >> hopefully get merged soon.
> > >>
> > >> Regards,
> > >> Gaurav
> > >>
> > >>
> > >>> On 24 Jan 2024, at 11:51, ka...@gnarula.com wrote:
> > >>>
> > >>> Hi Stanislav,
> > >>>
> > >>> Thanks for bringing these JIRAs/PRs up.
> > >>>
> > >>> I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week
> > and I hope to have some feedback
> > >>> by Friday. I gather the latter JIRA is marked as a WIP by Proven and
> > he's away. I'll try to build on his work in the meantime.
> > >>>
> > >>> As for KAFKA-16082, we haven't been able to deduce a data loss
> > scenario. There's a PR open
> > >>> by me for promoting an abandoned future replica with approvals from
> > Omnia and Proven,
> > >>> so I'd appreciate a committer reviewing it.
> > >>>
> > >>> Regards,
> > >>> Gaurav
> > >>>
> > >>> On 23 Jan 2024, at 20:17, Stanislav Kozlovski 
> > >>> <stanis...@confluent.io.INVALID>
> > wrote:
> > >>>>
> > >>>> Hey all, I figured I'd give an update about what known blockers we
> > have
> > >>>> right now:
> > >>>>
> > >>>> - KAFKA-16101: KRaft migration rollback documentation is incorrect -
> > >>>> https://github.com/apache/kafka/pull/15193; This need not block RC
> > >>>> creation, but we need the docs updated so that people can test
> > properly
> > >>>> - KAFKA-14616: Topic recreation with offline broker causes permanent
> > URPs -
> > >>>> https://github.com/apache/kafka/pull/15230 ; I am of the
> > understanding that
> > >>>> this is blocking JBOD for 3.7
> > >>>> - KAFKA-16162: New created topics are unavailable after upgrading to
> > 3.7 -
> > >>>> a strict blocker with an open PR
> > https://github.com/apache/kafka/pull/15232
> > >>>> - although I understand Proveen is out of office
> > >>>> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition -
> > I am
> > >>>> hearing mixed opinions on whether this is a blocker (
> > >>>> https://github.com/apache/kafka/pull/15136)
> > >>>>
> > >>>> Given that there are 3 JBOD blocker bugs, and I am not confident they
> > will
> > >>>> all be merged this week - I am on the edge of voting to revert JBOD
> > from
> > >>>> this release, or mark it early access.
> > >>>>
> > >>>> By all accounts, it seems that if we keep with JBOD the release will
> > have
> > >>>> to spill into February, which is a month extra from the time-based
> > release
> > >>>> plan we had of start of January.
> > >>>>
> > >>>> Can I ask others for an opinion?
> > >>>>
> > >>>> Best,
> > >>>> Stan
> > >>>>
> > >>>> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen <show...@gmail.com> wrote:
> > >>>>
> > >>>>> Hi all,
> > >>>>>
> > >>>>> I think I've found another blocker issue: KAFKA-16162
> > >>>>> <https://issues.apache.org/jira/browse/KAFKA-16162> .
> > >>>>> The impact is after upgrading to 3.7.0, any new created
> > topics/partitions
> > >>>>> will be unavailable.
> > >>>>> I've put my findings in the JIRA.
> > >>>>>
> > >>>>> Thanks.
> > >>>>> Luke
> > >>>>>
> > >>>>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax <mj...@apache.org>
> > wrote:
> > >>>>>
> > >>>>>> Stan, thanks for driving this all forward! Excellent job.
> > >>>>>>
> > >>>>>> About
> > >>>>>>
> > >>>>>>> StreamsStandbyTask -
> > https://issues.apache.org/jira/browse/KAFKA-16141
> > >>>>>>> StreamsUpgradeTest -
> > https://issues.apache.org/jira/browse/KAFKA-16139
> > >>>>>>
> > >>>>>> For `StreamsUpgradeTest` it was a test setup issue and should be
> > fixed
> > >>>>>> now in trunk and 3.7 (and actually also in 3.6...)
> > >>>>>>
> > >>>>>> For `StreamsStandbyTask` the failing test exposes a regression bug,
> > so
> > >>>>>> it's a blocker. I updated the ticket accordingly. We already have an
> > >>>>>> open PR that reverts the code introducing the regression.
> > >>>>>>
> > >>>>>>
> > >>>>>> -Matthias
> > >>>>>>
> > >>>>>> On 1/17/24 9:44 AM, Proven Provenzano wrote:
> > >>>>>>> We have another blocking issue for the RC :
> > >>>>>>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is
> > similar
> > >>>>>> to
> > >>>>>>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue
> > >>>>> however
> > >>>>>>> can lead to the new topic having partitions that a producer cannot
> > >>>>> write
> > >>>>>> to.
> > >>>>>>>
> > >>>>>>> --Proven
> > >>>>>>>
> > >>>>>>> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano <
> > >>>>>> pprovenz...@confluent.io>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> I have a PR https://github.com/apache/kafka/pull/15197 for
> > >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16131 that is
> > building
> > >>>>> now.
> > >>>>>>>> --Proven
> > >>>>>>>>
> > >>>>>>>> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz <ja...@scholz.cz>
> > wrote:
> > >>>>>>>>
> > >>>>>>>>> *> Hi Jakub,> > Thanks for trying the RC. I think what you found
> > is a
> > >>>>>>>>> blocker bug because it *
> > >>>>>>>>> *> will generate huge amount of logspam. I guess we didn't find
> > it in
> > >>>>>>>>> junit
> > >>>>>>>>> tests *
> > >>>>>>>>> *> since logspam doesn't fail the automated tests. But certainly
> > it's
> > >>>>>> not
> > >>>>>>>>> suitable *
> > >>>>>>>>> *> for production. Did you file a JIRA yet?*
> > >>>>>>>>>
> > >>>>>>>>> Hi Colin,
> > >>>>>>>>>
> > >>>>>>>>> I opened https://issues.apache.org/jira/browse/KAFKA-16131.
> > >>>>>>>>>
> > >>>>>>>>> Thanks & Regards
> > >>>>>>>>> Jakub
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe <cmcc...@apache.org
> > >
> > >>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi Stanislav,
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for making the first RC. The fact that it's titled RC2 is
> > >>>>>> messing
> > >>>>>>>>>> with my mind a bit. I hope this doesn't make people think that
> > we're
> > >>>>>>>>>> farther along than we are, heh.
> > >>>>>>>>>>
> > >>>>>>>>>> On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote:
> > >>>>>>>>>>> *> Nice catch! It does seem like we should have gated this
> > behind
> > >>>>> the
> > >>>>>>>>>>> metadata> version as KIP-858 implies. Is the cluster configured
> > >>>>> with
> > >>>>>>>>>>> multiple log> dirs? What is the impact of the error messages?*
> > >>>>>>>>>>>
> > >>>>>>>>>>> I did not observe any obvious impact. I was able to send and
> > >>>>> receive
> > >>>>>>>>>>> messages as normally. But to be honest, I have no idea what
> > else
> > >>>>>>>>>>> this might impact, so I did not try anything special.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I think everyone upgrading an existing KRaft cluster will go
> > >>>>> through
> > >>>>>>>>> this
> > >>>>>>>>>>> stage (running Kafka 3.7 with an older metadata version for at
> > >>>>> least
> > >>>>>> a
> > >>>>>>>>>>> while). So even if it is just a logged exception without any
> > other
> > >>>>>>>>>> impact I
> > >>>>>>>>>>> wonder if it might scare users from upgrading. But I leave it
> > to
> > >>>>>>>>> others
> > >>>>>>>>>> to
> > >>>>>>>>>>> decide if this is a blocker or not.
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Hi Jakub,
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for trying the RC. I think what you found is a blocker
> > bug
> > >>>>>>>>> because
> > >>>>>>>>>> it will generate huge amount of logspam. I guess we didn't find
> > it
> > >>>>> in
> > >>>>>>>>> junit
> > >>>>>>>>>> tests since logspam doesn't fail the automated tests. But
> > certainly
> > >>>>>> it's
> > >>>>>>>>>> not suitable for production. Did you file a JIRA yet?
> > >>>>>>>>>>
> > >>>>>>>>>>> On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski
> > >>>>>>>>>>> <stanis...@confluent.io.invalid> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hey Luke,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This is an interesting problem. Given the fact that the KIP
> > for
> > >>>>>>>>> having a
> > >>>>>>>>>>>> 3.8 release passed, I think it weights the scale towards not
> > >>>>> calling
> > >>>>>>>>>> this a
> > >>>>>>>>>>>> blocker and expecting it to be solved in 3.7.1.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> It is unfortunate that it would not seem safe to migrate to
> > KRaft
> > >>>>> in
> > >>>>>>>>>> 3.7.0
> > >>>>>>>>>>>> (given the inability to rollback safely), but if that's true
> > - the
> > >>>>>>>>> same
> > >>>>>>>>>>>> case would apply for 3.6.0. So in any case users w\ould be
> > >>>>> expected
> > >>>>>>>>> to
> > >>>>>>>>>> use a
> > >>>>>>>>>>>> patch release for this.
> > >>>>>>>>>>
> > >>>>>>>>>> Hi Luke,
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for testing rollback. I think this is a case where the
> > >>>>>>>>>> documentation is wrong. The intention was to for the steps to
> > >>>>>> basically
> > >>>>>>>>> be:
> > >>>>>>>>>>
> > >>>>>>>>>> 1. roll all the brokers into zk mode, but with migration enabled
> > >>>>>>>>>> 2. take down the kraft quorum
> > >>>>>>>>>> 3. rmr /controller, allowing a hybrid broker to take over.
> > >>>>>>>>>> 4. roll all the brokers into zk mode without migration enabled
> > (if
> > >>>>>>>>> desired)
> > >>>>>>>>>>
> > >>>>>>>>>> With these steps, there isn't really unavailability since a ZK
> > >>>>>>>>> controller
> > >>>>>>>>>> can be elected quickly after the kraft quorum is gone.
> > >>>>>>>>>>
> > >>>>>>>>>>>> Further, since we will have a 3.8 release - it is
> > >>>>>>>>>>>> likely we will ultimately recommend users upgrade from that
> > >>>>> version
> > >>>>>>>>>> given
> > >>>>>>>>>>>> its aim is to have strategic KRaft feature parity with ZK.
> > >>>>>>>>>>>> That being said, I am not 100% on this. Let me know whether
> > you
> > >>>>>> think
> > >>>>>>>>>> this
> > >>>>>>>>>>>> should block the release, Luke. I am also tagging Colin and
> > David
> > >>>>> to
> > >>>>>>>>>> weigh
> > >>>>>>>>>>>> in with their opinions, as they worked on the migration logic.
> > >>>>>>>>>>
> > >>>>>>>>>> The rollback docs are new in 3.7 so the fact that they're wrong
> > is a
> > >>>>>>>>> clear
> > >>>>>>>>>> blocker, I think. But easy to fix, I believe. I will create a
> > PR.
> > >>>>>>>>>>
> > >>>>>>>>>> best,
> > >>>>>>>>>> Colin
> > >>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Hey Kirk and Chris,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Unless I'm missing something - KAFKALESS-16029 is simply a
> > bad log
> > >>>>>>>>> due
> > >>>>>>>>>> to
> > >>>>>>>>>>>> improper closing. And the PR description implies this has been
> > >>>>>>>>> present
> > >>>>>>>>>>>> since 3.5. While annoying, I don't see a strong reason for
> > this to
> > >>>>>>>>> block
> > >>>>>>>>>>>> the release.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Hey Jakub,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Nice catch! It does seem like we should have gated this
> > behind the
> > >>>>>>>>>> metadata
> > >>>>>>>>>>>> version as KIP-858 implies. Is the cluster configured with
> > >>>>> multiple
> > >>>>>>>>> log
> > >>>>>>>>>>>> dirs? What is the impact of the error messages?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Tagging Igor (the author of the KIP) to weigh in.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> Stanislav
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Sat, Jan 13, 2024 at 7:22 PM Jakub Scholz <ja...@scholz.cz
> > >
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I was trying the RC2 and run into the following issue ...
> > when I
> > >>>>>>>>> run
> > >>>>>>>>>>>>> 3.7.0-RC2 KRaft cluster with metadata version set to 3.6-IV2
> > >>>>>>>>> metadata
> > >>>>>>>>>>>>> version, I seem to be getting repeated errors like this in
> > the
> > >>>>>>>>>> controller
> > >>>>>>>>>>>>> logs:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> 2024-01-13 16:58:01,197 INFO [QuorumController id=0]
> > >>>>>>>>>>>> assignReplicasToDirs:
> > >>>>>>>>>>>>> event failed with UnsupportedVersionException in 15
> > microseconds.
> > >>>>>>>>>>>>> (org.apache.kafka.controller.QuorumController)
> > >>>>>>>>>>>>> [quorum-controller-0-event-handler]
> > >>>>>>>>>>>>> 2024-01-13 16:58:01,197 ERROR [ControllerApis nodeId=0]
> > >>>>> Unexpected
> > >>>>>>>>>> error
> > >>>>>>>>>>>>> handling request
> > RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
> > >>>>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14,
> > headerVersion=2)
> > >>>>> --
> > >>>>>>>>>>>>> AssignReplicasToDirsRequestData(brokerId=1000, brokerEpoch=5,
> > >>>>>>>>>>>>> directories=[DirectoryData(id=w_uxN7pwQ6eXSMrOKceYIQ,
> > >>>>>>>>>>>>> topics=[TopicData(topicId=bvAKLSwmR7iJoKv2yZgygQ,
> > >>>>>>>>>>>>> partitions=[PartitionData(partitionIndex=2),
> > >>>>>>>>>>>>> PartitionData(partitionIndex=1)]),
> > >>>>>>>>>>>>> TopicData(topicId=uNe7f5VrQgO0zST6yH1jDQ,
> > >>>>>>>>>>>>> partitions=[PartitionData(partitionIndex=0)])])]) with
> > context
> > >>>>>>>>>>>>>
> > >>>>> RequestContext(header=RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
> > >>>>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14,
> > headerVersion=2),
> > >>>>>>>>>>>>> connectionId='172.16.14.219:9090-172.16.14.217:53590-7',
> > >>>>>>>>>> clientAddress=/
> > >>>>>>>>>>>>> 172.16.14.217,
> > principal=User:CN=my-cluster-kafka,O=io.strimzi,
> > >>>>>>>>>>>>> listenerName=ListenerName(CONTROLPLANE-9090),
> > >>>>> securityProtocol=SSL,
> > >>>>>>>>>>>>>
> > >>>>> clientInformation=ClientInformation(softwareName=apache-kafka-java,
> > >>>>>>>>>>>>> softwareVersion=3.7.0), fromPrivilegedListener=false,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@71004ad2
> > >>>>>>>>>>>>> ])
> > >>>>>>>>>>>>> (kafka.server.ControllerApis)
> > [quorum-controller-0-event-handler]
> > >>>>>>>>>>>>> java.util.concurrent.CompletionException:
> > >>>>>>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException:
> > >>>>>>>>> Directory
> > >>>>>>>>>>>>> assignment is not supported yet.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
> > >>>>>>>>>>>>> at
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
> > >>>>>>>>>>>>> at java.base/java.lang.Thread.run(Thread.java:840)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Caused by:
> > >>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException:
> > >>>>>>>>>>>>> Directory assignment is not supported yet.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Is that expected? I guess with the metadata version set to
> > >>>>>>>>> 3.6-IV2, it
> > >>>>>>>>>>>>> makes sense that the request is not supported. But shouldn't
> > then
> > >>>>>>>>> the
> > >>>>>>>>>>>>> request not be sent at all by the brokers? (I did not opened
> > a
> > >>>>> JIRA
> > >>>>>>>>>> for
> > >>>>>>>>>>>> it,
> > >>>>>>>>>>>>> but I can open one if you agree this is not expected)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks & Regards
> > >>>>>>>>>>>>> Jakub
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:03 AM Luke Chen <show...@gmail.com
> > >
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi Stanislav,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I commented in the "Apache Kafka 3.7.0 Release" thread, but
> > >>>>> maybe
> > >>>>>>>>>> you
> > >>>>>>>>>>>>>> missed it.
> > >>>>>>>>>>>>>> cross-posting here:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> There is a bug KAFKA-16101
> > >>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-16101>
> > reporting
> > >>>>>>>>> that
> > >>>>>>>>>>>>> "Kafka
> > >>>>>>>>>>>>>> cluster will be unavailable during KRaft migration
> > rollback".
> > >>>>>>>>>>>>>> The impact for this issue is that if brokers try to
> > rollback to
> > >>>>>>>>> ZK
> > >>>>>>>>>> mode
> > >>>>>>>>>>>>>> during KRaft migration process, there will be a period of
> > time
> > >>>>>>>>> the
> > >>>>>>>>>>>>> cluster
> > >>>>>>>>>>>>>> is unavailable.
> > >>>>>>>>>>>>>> Since ZK migrating to KRaft feature is a production ready
> > >>>>>>>>> feature, I
> > >>>>>>>>>>>>> think
> > >>>>>>>>>>>>>> this should be addressed soon.
> > >>>>>>>>>>>>>> Do you think this is a blocker for v3.7.0?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks.
> > >>>>>>>>>>>>>> Luke
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:36 AM Chris Egerton <
> > >>>>>>>>>> fearthecel...@gmail.com
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks, Kirk!
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> @Stanislav--do you believe that this warrants a new RC?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Fri, Jan 12, 2024, 19:08 Kirk True <k...@kirktrue.pro>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Hi Chris/Stanislav,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I'm working on the 'Unable to find FetchSessionHandler'
> > log
> > >>>>>>>>>> problem
> > >>>>>>>>>>>>>>>> (KAFKA-16029) and have put out a draft PR (
> > >>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/15186). I will use
> > the
> > >>>>>>>>>>>>> quickstart
> > >>>>>>>>>>>>>>>> approach as a second means to reproduce/verify while I
> > wait
> > >>>>>>>>> for
> > >>>>>>>>>> the
> > >>>>>>>>>>>>>> PR's
> > >>>>>>>>>>>>>>>> Jenkins job to finish.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>> Kirk
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Fri, Jan 12, 2024, at 11:31 AM, Chris Egerton wrote:
> > >>>>>>>>>>>>>>>>> Hi Stanislav,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks for running this release!
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> To verify, I:
> > >>>>>>>>>>>>>>>>> - Built from source using Java 11 with both:
> > >>>>>>>>>>>>>>>>> - - the 3.7.0-rc2 tag on GitHub
> > >>>>>>>>>>>>>>>>> - - the kafka-3.7.0-src.tgz artifact from
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
> > >>>>>>>>>>>>>>>>> - Checked signatures and checksums
> > >>>>>>>>>>>>>>>>> - Ran the quickstart using both:
> > >>>>>>>>>>>>>>>>> - - The kafka_2.13-3.7.0.tgz artifact from
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
> > >>>>>>>>>>>> with
> > >>>>>>>>>>>>>> Java
> > >>>>>>>>>>>>>>>> 11
> > >>>>>>>>>>>>>>>>> and Scala 13 in KRaft mode
> > >>>>>>>>>>>>>>>>> - - Our shiny new broker Docker image,
> > >>>>>>>>> apache/kafka:3.7.0-rc2
> > >>>>>>>>>>>>>>>>> - Ran all unit tests
> > >>>>>>>>>>>>>>>>> - Ran all integration tests for Connect and MM2
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> I found two minor areas for concern:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 1. (Possibly a blocker)
> > >>>>>>>>>>>>>>>>> When running the quickstart, I noticed this ERROR-level
> > log
> > >>>>>>>>>>>> message
> > >>>>>>>>>>>>>>> being
> > >>>>>>>>>>>>>>>>> emitted frequently (not not every time) when I killed my
> > >>>>>>>>>> console
> > >>>>>>>>>>>>>>> consumer
> > >>>>>>>>>>>>>>>>> via ctrl-C:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> [2024-01-12 11:00:31,088] ERROR [Consumer
> > >>>>>>>>>>>>>> clientId=console-consumer,
> > >>>>>>>>>>>>>>>>> groupId=console-consumer-74388] Unable to find
> > >>>>>>>>>>>> FetchSessionHandler
> > >>>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>> node
> > >>>>>>>>>>>>>>>>> 1. Ignoring fetch response
> > >>>>>>>>>>>>>>>>>
> > (org.apache.kafka.clients.consumer.internals.AbstractFetch)
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> I see that this error message is already reported in
> > >>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16029. I
> > >>>>>>>>> think we
> > >>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>>> prioritize fixing it for this release. I know it's
> > probably
> > >>>>>>>>>>>> benign
> > >>>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>> it's
> > >>>>>>>>>>>>>>>>> really not a good look for us when basic operations log
> > >>>>>>>>> error
> > >>>>>>>>>>>>>> messages,
> > >>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> it may give new users some headaches.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> 2. (Probably not a blocker)
> > >>>>>>>>>>>>>>>>> The following unit tests failed the first time around,
> > and
> > >>>>>>>>>> all of
> > >>>>>>>>>>>>>> them
> > >>>>>>>>>>>>>>>>> passed the second time I ran them:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> - (clients)
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>> ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
> > >>>>>>>>>>>>>>>>> - (clients) SelectorTest.testConnectionsByClientMetric()
> > >>>>>>>>>>>>>>>>> - (clients)
> > >>>>>>>>> Tls13SelectorTest.testConnectionsByClientMetric()
> > >>>>>>>>>>>>>>>>> - (connect)
> > >>>>>>>>>>>>>> TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound
> > >>>>>>>>>>>>>>> (I
> > >>>>>>>>>>>>>>>>> thought I fixed this one! 🤬🤬)
> > >>>>>>>>>>>>>>>>> - (core)
> > >>>>>>>>>> ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2]
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks again for your work on this release, and
> > >>>>>>>>>> congratulations
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>>>> Kafka
> > >>>>>>>>>>>>>>>>> Streams for having zero flaky unit tests during my
> > >>>>>>>>>>>>>> highly-experimental
> > >>>>>>>>>>>>>>>>> single laptop run!
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Chris
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski
> > >>>>>>>>>>>>>>>>> <stanis...@confluent.io.invalid> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Hello Kafka users, developers, and client-developers,
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> This is the first candidate for release of Apache Kafka
> > >>>>>>>>>> 3.7.0.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Note it's named "RC2" because I had a few "failed" RCs
> > >>>>>>>>> that
> > >>>>>>>>>> I
> > >>>>>>>>>>>> had
> > >>>>>>>>>>>>>>>>>> cut/uploaded but ultimately had to scrap prior to
> > >>>>>>>>> announcing
> > >>>>>>>>>>>> due
> > >>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>> new
> > >>>>>>>>>>>>>>>>>> blockers arriving before I could even announce them.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Further - I haven't yet been able to set up the system
> > >>>>>>>>> tests
> > >>>>>>>>>>>>>>>> successfully.
> > >>>>>>>>>>>>>>>>>> And the integration/unit tests do have a few failures
> > >>>>>>>>> that I
> > >>>>>>>>>>>> have
> > >>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>> spend
> > >>>>>>>>>>>>>>>>>> time triaging. I would appreciate any help in case
> > anyone
> > >>>>>>>>>>>> notices
> > >>>>>>>>>>>>>> any
> > >>>>>>>>>>>>>>>> tests
> > >>>>>>>>>>>>>>>>>> failing that they're subject matters experts in. Expect
> > >>>>>>>>> me
> > >>>>>>>>>> to
> > >>>>>>>>>>>>>> follow
> > >>>>>>>>>>>>>>>> up in
> > >>>>>>>>>>>>>>>>>> a day or two with more detailed analysis.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Major changes include:
> > >>>>>>>>>>>>>>>>>> - Early Access to KIP-848 - the next generation of the
> > >>>>>>>>>> consumer
> > >>>>>>>>>>>>>>>> rebalance
> > >>>>>>>>>>>>>>>>>> protocol
> > >>>>>>>>>>>>>>>>>> - KIP-858: Adding JBOD support to KRaft
> > >>>>>>>>>>>>>>>>>> - KIP-714: Observability into Client metrics via a
> > >>>>>>>>>> standardized
> > >>>>>>>>>>>>>>>> interface
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Check more information in the WIP blog post:
> > >>>>>>>>>>>>>>>>>> https://github.com/apache/kafka-site/pull/578
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Release notes for the 3.7.0 release:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> *** Please download, test and vote by Thursday, January
> > >>>>>>>>> 18,
> > >>>>>>>>>> 9am
> > >>>>>>>>>>>>> PT
> > >>>>>>>>>>>>>>> ***
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Usually these deadlines tend to be 2-3 days, but due to
> > >>>>>>>>> this
> > >>>>>>>>>>>>> being
> > >>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>> first RC and the tests not having ran yet, I am giving
> > >>>>>>>>> it a
> > >>>>>>>>>> bit
> > >>>>>>>>>>>>>> more
> > >>>>>>>>>>>>>>>> time.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Kafka's KEYS file containing PGP keys we use to sign the
> > >>>>>>>>>>>> release:
> > >>>>>>>>>>>>>>>>>> https://kafka.apache.org/KEYS
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Release artifacts to be voted upon (source and
> > binary):
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Docker release artifact to be voted upon:
> > >>>>>>>>>>>>>>>>>> apache/kafka:3.7.0-rc2
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Maven artifacts to be voted upon:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Javadoc:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Tag to be voted upon (off 3.7 branch) is the 3.7.0
> > tag:
> > >>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/releases/tag/3.7.0-rc2
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Documentation:
> > >>>>>>>>>>>>>>>>>> https://kafka.apache.org/37/documentation.html
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Protocol:
> > >>>>>>>>>>>>>>>>>> https://kafka.apache.org/37/protocol.html
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Successful Jenkins builds for the 3.7 branch:
> > >>>>>>>>>>>>>>>>>> Unit/integration tests:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/
> > >>>>>>>>>>>>>>>>>> There are failing tests here. I have to follow up with
> > >>>>>>>>>> triaging
> > >>>>>>>>>>>>>> some
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>> the failures and figuring out if they're actual problems
> > >>>>>>>>> or
> > >>>>>>>>>>>>> simply
> > >>>>>>>>>>>>>>>> flakes.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> System tests:
> > >>>>>>>>>>>>>>>>
> > https://jenkins.confluent.io/job/system-test-kafka/job/3.7/
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> No successful system test runs yet. I am working on
> > >>>>>>>>> getting
> > >>>>>>>>>> the
> > >>>>>>>>>>>>> job
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>> run.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> * Successful Docker Image Github Actions Pipeline for
> > 3.7
> > >>>>>>>>>>>> branch:
> > >>>>>>>>>>>>>>>>>> Attached are the scan_report and report_jvm output files
> > >>>>>>>>>> from
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>> Docker
> > >>>>>>>>>>>>>>>>>> Build run:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> And the final docker image build job - Docker Build Test
> > >>>>>>>>>>>>> Pipeline:
> > >>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/actions/runs/7486178277
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The image is apache/kafka:3.7.0-rc2 -
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>
> > >>>>>
> > https://hub.docker.com/layers/apache/kafka/3.7.0-rc2/images/sha256-5b4707c08170d39549fbb6e2a3dbb83936a50f987c0c097f23cb26b4c210c226?context=explore
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> /**************************************
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>> Stanislav Kozlovski
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> --
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> Stanislav
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Best,
> > >>>> Stanislav
> > >>>
> > >>
> > >
> >
> >
>
> --
> Best,
> Stanislav

Reply via email to