Re: [DISCUSS] KIP-1022 Formatting and Updating Features

2024-03-15 Thread José Armando García Sancio
Hi Justine,

Thanks for the update. Some comments below.

On Wed, Mar 13, 2024 at 10:53 AM Justine Olshan
 wrote:
> 4. Include an API to list the features for a given metadata version

I am not opposed to designing and implementing this. I am just
wondering whether it is strictly required.

Would having auto-generated documentation address the issue of not
knowing which feature versions are associated with a given release
version?

Does it need to be a Kafka API (RPC)? Or can this be implemented
strictly in the tool? The latest tool, which already needs to map a
release version to feature versions, can print this mapping. Is this
what you mean by API?

> 5. I'm going back and forth on whether we should support the
> --release-version flag still. If we do, we need to include validation so we
> only upgrade on upgrade.

I am not sure how this is different from supporting multiple --feature
flags. The user can run an upgrade command where some of the features
specified are greater than what the cluster has finalized and some of
the features specified are less than what the cluster has finalized.

In other words, the KRaft controller and kafka-storage tool are going
to have to implement this validation even if you don't implement
--release-version in the tools.
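
For illustration, the kind of check I mean is roughly the following (a plain
sketch with made-up names, not the actual controller or tool code):

import java.util.Map;

public class FeatureUpdateCheck {
    // An "upgrade" request must not lower any finalized feature level;
    // lowering a level should require an explicit downgrade command.
    public static void validateUpgrade(Map<String, Short> finalized,
                                       Map<String, Short> requested) {
        for (Map.Entry<String, Short> update : requested.entrySet()) {
            short current = finalized.getOrDefault(update.getKey(), (short) 0);
            if (update.getValue() < current) {
                throw new IllegalArgumentException("Feature " + update.getKey()
                    + " would move from level " + current + " to "
                    + update.getValue() + "; use an explicit downgrade instead.");
            }
        }
    }
}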
Thanks,
-- 
-José


Re: [DISCUSS] KIP-1022 Formatting and Updating Features

2024-03-15 Thread Colin McCabe
Hi Justine,

Thanks for driving this forward. Comments below.

On Wed, Mar 13, 2024, at 10:51, Justine Olshan wrote:
> Hey folks -- let me summarize my understanding
>
> Artem requested we change the name of transaction version (tv) to
> transaction protocol version (tpv). If I do that, I will also probably
> change to group coordinator protocol version (gcpv).
>
>
> Folks were concerned about upgrades/downgrades with different feature
> versions when using the feature tool and --release version. I think we can
> address some concerns here:
>
>- don't allow a command that moves a version in the wrong direction
>(when upgrade actually downgrades a feature for example)
>- allow a command to describe all the feature versions for a given
>metadata version
>
> Note that Colin expressed some concern with having the --release-version
> flag at all. I wonder if we can reach a compromise with having the upgrade
> command have a "latest" command only.
>
>
> Jun proposed this "latest" functionality can be the default when no version
> or feature is specified.
>
> Jose requested that we make --release-version and --feature mutually
> exclusive for both the storage tool and the features tool. We should also
> specify each feature in the storage tool as a separate flag.
>
> Finally, Jun and Jose requested deprecation of the --metadata flag, but
> Colin felt we should keep it. I mentioned the --feature flag does the same
> but no reply yet.
>
> * So here are the updates I propose:*
> 1. Update the feature names as Artem described -- transaction protocol
> version and group coordinator protocol version
>
> 2. Rename --features in the storage tool to --feature. In the case where we
> use this flag, we must repeat the flag for the number of features to set
> all of them.
>

+1
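
For example, a format invocation with the repeated flag would presumably look
something like this (flag spelling as proposed in the KIP; feature names and
levels purely illustrative):

bin/kafka-storage.sh format -t <cluster-id> -c config/kraft/server.properties \
    --feature metadata.version=17 \
    --feature transaction.protocol.version=2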

> 3. No argument behavior on the storage and upgrade tools is to set all the
> features to the latest (stable) version
>

+1

> 4. Include an API to list the features for a given metadata version
>

At the risk of expanding the scope... if we have an API to list feature levels, 
we should have a brief description of what they do. Something like:

13   3.6-IV1   add metadata transactions

Maybe a simple way to do this would just be to output the full list and let 
people grep for what they want. You could also add filters like 
--release-version, etc.
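
Hypothetically, something like (exact subcommand and flag names would be up to
the KIP):

bin/kafka-features.sh describe --release-version 3.6-IV1 | grep transaction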

The big question is would you store this data on the servers or in the tool? 
Right now the tool is a bit "smart" in the sense that it can translate between 
an MV string and an MV level. Arguably it should be dumber, though, and rely on 
the server for these things.

Anyway your idea of putting the info on the server side is probably for the 
best.
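
(For context, the "smart" translation is essentially a lookup like the one
below; the class is purely illustrative and the single pairing just reuses the
sample line above, not the real MetadataVersion code.)

import java.util.Map;

public class MetadataVersionLookup {
    // Example pairing taken from the sample output above; illustrative only.
    private static final Map<String, Short> LEVELS = Map.of("3.6-IV1", (short) 13);

    public static short levelFor(String releaseString) {
        Short level = LEVELS.get(releaseString);
        if (level == null) {
            throw new IllegalArgumentException("Unknown metadata version: " + releaseString);
        }
        return level;
    }
}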

> 5. I'm going back and forth on whether we should support the
> --release-version flag still. If we do, we need to include validation so we
> only upgrade on upgrade.

I'd rather not, but I'm curious what others think. I really feel like the 
use-case is not there (compared with the more manual, but safer process 
described above).

best,
Colin

>
> Let me know if folks have any issues with the updates I proposed or think
> there is something I missed.
>
> Justine
>
> On Fri, Mar 8, 2024 at 11:59 AM Artem Livshits
>  wrote:
>
>> Hi Justine,
>>
>> >  Are you suggesting it should be called "transaction protocol version" or
>> "TPV"? I don't mind that, but just wanted to clarify if we want to include
>> protocol or if simply "transaction version" is enough.
>>
>> My understanding is that "metadata version" is the version of metadata
>> records, which is fairly straightforward.  "Transaction version" may be
>> ambiguous.
>>
>> -Artem
>>
>> On Thu, Feb 29, 2024 at 3:39 PM Justine Olshan
>> 
>> wrote:
>>
>> > Hey folks,
>> >
>> > Thanks for the discussion. Let me try to cover everyone's comments.
>> >
>> > Artem --
>> > I can add the examples you mentioned. As for naming, right now the
>> feature
>> > is called "transaction version" or "TV". Are you suggesting it should be
>> > called "transaction protocol version" or "TPV"? I don't mind that, but
>> just
>> > wanted to clarify if we want to include protocol or if simply
>> "transaction
>> > version" is enough.
>> >
>> > Jun --
>> >
>> > 10. *With more features, would each of those be controlled by a separate
>> > feature or multiple features. For example, is the new transaction record
>> > format controlled only by MV with TV having a dependency on MV or is it
>> > controlled by both MV and TV.*
>> >
>> >
>> > I think this will need to be decided on a case by case basis. There
>> should
>> > be a mechanism to set dependencies among features.
>> > For transaction version specifically, I have no metadata version
>> > dependencies besides requiring 3.3 to write the feature records and use
>> the
>> > feature tools. I would suspect all new features would have this
>> > requirement.
>> >
>> >
>> > 11. *Basically, if **--release-version is not used, the command 

Re: Incremental build for scala tests

2024-03-15 Thread Pavel Pozdeev

Looks like this behaviour has changed after the upgrade from Gradle 8.5 to 8.6 in 
https://github.com/apache/kafka/pull/15404. I left a comment there.
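
(If anyone wants to reproduce this: Gradle's standard --info flag prints why a
task was considered out-of-date, e.g.

./gradlew core:compileTestScala --info | grep -i "not up-to-date"

which should show whether the whole test source set is being invalidated. That
is plain Gradle behaviour, nothing Kafka-specific.)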
 
  
>Wednesday, March 13, 2024 6:36 PM UTC from Pavel Pozdeev
> 
>
>There seems to be an open issue in Gradle for it:
>https://github.com/gradle/gradle/issues/20854
> 
> 
> 
>>Monday, March 4, 2024 9:44 PM UTC from Pavel Pozdeev
>> 
>>
>>Hi team,
>>I'm a new member who just joined the community, trying to add KRaft support to 
>>some existing unit tests.
>>I noticed that when I make any change to a unit test, e.g. 
>>"kafka.admin.AclCommandTest.scala", and then run:
>> 
>>"./gradlew core:compileTestScala"
>> 
>>the entire folder "core/build/classes/scala/test" is cleared, and Gradle 
>>re-compiles ALL tests. It takes quite a long time.
>>Looks like this issue affects only tests. If I change main Scala code, e.g. 
>>"kafka.admin.AclCommand.scala", and then run:
>> 
>>"./gradlew core:compileScala"
>> 
>>only a single file is re-compiled, and it takes only a few seconds.
>>Has anybody noticed the same issue before?
>> 
>>Best,
>>Pavel Pozdeev
 

Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2724

2024-03-15 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16380) Rename the shallowOffsetOfMaxTimestamp in LogSegment

2024-03-15 Thread Johnny Hsu (Jira)
Johnny Hsu created KAFKA-16380:
--

 Summary: Rename the shallowOffsetOfMaxTimestamp in LogSegment
 Key: KAFKA-16380
 URL: https://issues.apache.org/jira/browse/KAFKA-16380
 Project: Kafka
  Issue Type: Bug
Reporter: Johnny Hsu
Assignee: Johnny Hsu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16379) add metric to measure time spent in purgatory

2024-03-15 Thread Jeff Kim (Jira)
Jeff Kim created KAFKA-16379:


 Summary: add metric to measure time spent in purgatory
 Key: KAFKA-16379
 URL: https://issues.apache.org/jira/browse/KAFKA-16379
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jeff Kim
Assignee: Jeff Kim






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.7 #111

2024-03-15 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 447811 lines...]
[2024-03-15T11:06:41.016Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLagMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLagMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft+kip848.groupProtocol=classic PASSED
[2024-03-15T11:06:41.016Z] 
[2024-03-15T11:06:41.016Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLagMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLagMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft+kip848.groupProtocol=consumer STARTED
[2024-03-15T11:06:49.142Z] 
kafka.api.PlaintextConsumerTest.testPerPartitionLagMetricsCleanUpWithSubscribe(String,
 String)[4] failed, log available in 
/home/jenkins/workspace/Kafka_kafka_3.7/core/build/reports/testOutput/kafka.api.PlaintextConsumerTest.testPerPartitionLagMetricsCleanUpWithSubscribe(String,
 String)[4].test.stdout
[2024-03-15T11:06:49.142Z] 
[2024-03-15T11:06:49.142Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLagMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLagMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft+kip848.groupProtocol=consumer FAILED
[2024-03-15T11:06:49.142Z] org.opentest4j.AssertionFailedError: should be 
assigned once ==> expected: <1> but was: <0>
[2024-03-15T11:06:49.142Z] at 
app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
[2024-03-15T11:06:49.142Z] at 
app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
[2024-03-15T11:06:49.142Z] at 
app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
[2024-03-15T11:06:49.142Z] at 
app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
[2024-03-15T11:06:49.142Z] at 
app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:563)
[2024-03-15T11:06:49.142Z] at 
app//kafka.api.PlaintextConsumerTest.testPerPartitionLagMetricsCleanUpWithSubscribe(PlaintextConsumerTest.scala:1653)
[2024-03-15T11:06:49.142Z] 
[2024-03-15T11:06:49.142Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=zk.groupProtocol=classic STARTED
[2024-03-15T11:06:54.251Z] 
[2024-03-15T11:06:54.251Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=zk.groupProtocol=classic PASSED
[2024-03-15T11:06:54.251Z] 
[2024-03-15T11:06:54.251Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft.groupProtocol=classic STARTED
[2024-03-15T11:06:59.580Z] 
[2024-03-15T11:06:59.581Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft.groupProtocol=classic PASSED
[2024-03-15T11:06:59.581Z] 
[2024-03-15T11:06:59.581Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft+kip848.groupProtocol=classic STARTED
[2024-03-15T11:07:05.010Z] 
[2024-03-15T11:07:05.010Z] Gradle Test Run :core:test > Gradle Test Executor 
112 > TransactionsBounceTest > testWithGroupMetadata() PASSED
[2024-03-15T11:07:06.655Z] 
[2024-03-15T11:07:06.655Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft+kip848.groupProtocol=classic PASSED
[2024-03-15T11:07:06.655Z] 
[2024-03-15T11:07:06.655Z] Gradle Test Run :core:test > Gradle Test Executor 
113 > PlaintextConsumerTest > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, String) > 
testPerPartitionLeadMetricsCleanUpWithSubscribe(String, 
String).quorum=kraft+kip848.groupProtocol=consumer STARTED
[2024-03-15T11:07:13.564Z] 
kafka.api.PlaintextConsumerTest.testPerPartitionLeadMetricsCleanUpWithSubscribe(String,
 String)[4] failed, log available in 
/home/jenkins/workspace/Kafka_kafka_3.7/core/build/reports/testOutput/kafka.api.PlaintextConsumerTest.testPerPartitionLeadMetricsCleanUpWithSubscribe(String,
 

Re: [DISCUSS] KIP-1026: Handling producer snapshot when upgrading from < v2.8.0 for Tiered Storage

2024-03-15 Thread Arpit Goyal
Hi All,
I was looking at one of the plugin implementations, and indeed it is expected
to encounter a compilation error.
[image: Screenshot 2024-03-15 at 2.55.49 PM.png]

To overcome the backward compatibility issue, we have two options:

   1. Introduce a new field instead of modifying the existing one, and mark
   the existing field as deprecated. Plugins would gradually transition to
   the new field as issues arise from a missing producer snapshot file (see
   the sketch after this list).
   2. As the interface is in an evolving phase, go ahead with the change and
   guide the upgrade process through the release documentation.
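
A rough sketch of option 1 (class placement and the new method name here are
my illustration only; the quoted snippets further down show just the getter):

import java.nio.file.Path;
import java.util.Optional;

public class LogSegmentData {
    // May be null for segments migrated from brokers older than 2.8.
    private final Path producerSnapshotIndex;

    public LogSegmentData(Path producerSnapshotIndex) {
        this.producerSnapshotIndex = producerSnapshotIndex;
    }

    /**
     * @return Producer snapshot file until this segment.
     * @deprecated prefer {@link #producerSnapshot()}, which makes absence explicit.
     */
    @Deprecated
    public Path producerSnapshotIndex() {
        return producerSnapshotIndex;
    }

    /**
     * @return the producer snapshot file for this segment, if one exists.
     */
    public Optional<Path> producerSnapshot() {
        return Optional.ofNullable(producerSnapshotIndex);
    }
}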


@Luke Chen
IMO the behaviour described in your question can also occur with the
transaction index. In both cases there are scenarios where the files may
or may not exist depending on the use case. This highlights the need for a
distinct process to validate which files should exist for a given segment
based on the topic configuration.
After this KIP, even if the producer snapshot file is mistakenly deleted,
we would refrain from copying it, and the same holds true for the
transaction index.

@Kamal Chandraprakash  Creating empty
files may suffice for this use case, but it would lead to unnecessary file
transfers to remote storage even when they are not required, and would impact
performance.


Thanks and Regards
Arpit Goyal
8861094754


On Fri, Mar 15, 2024 at 2:23 PM Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Hi Arpit,
>
> Thanks for the KIP!
>
> There is an open ticket [1] to generate the empty producer snapshot for
> segments which lacks one.
> Tiered storage is supported from IBP 2.8-IV1 which mandates that all the
> segments should have
> the producer snapshots. If we make the producer snapshot optional, then we
> have to handle all
> the edge cases as Luke mentioned.
>
> Shall we propose to generate the empty producer snapshots instead?
>
> [1]: https://issues.apache.org/jira/browse/KAFKA-15195
>
> On Fri, Mar 15, 2024 at 2:16 PM Luke Chen  wrote:
>
> > Hi Arpit,
> >
> > Thanks for the KIP!
> >
> > I agree with Greg that we should make it clear about backward
> > Compatibility.
> > Since you don't have permission to edit the KIP, you could think about it
> > and write in the email thread directly.
> >
> > Questions:
> > 1. Could you explain what will happen if one partition created after
> v2.8,
> > which should upload the producer snapshot file, but somehow it didn't
> > upload this file to remote storage (ex: the file is deleted accidentally
> by
> > user). Before this KIP, we'll throw exception when uploading the segment
> > file. But what will happen after this KIP?
> >
> >
> > Thanks.
> > Luke
> >
> > On Fri, Mar 15, 2024 at 3:56 AM Greg Harris  >
> > wrote:
> >
> > > Hi Arpit,
> > >
> > > Thanks for the clarification. Replying here without updating the KIP
> > > is fine for now.
> > >
> > > I disagree with your evaluation of the backwards compatibility. If you
> > > change the return type of a method, that breaks both source and binary
> > > compatibility.
> > > After upgrading, plugin implementations using this method would face
> > > compilation errors. Implementations that were compiled against the old
> > > interface will not be able to be loaded when the new interface is
> > > present.
> > > I see that the interface is marked Evolving which permits breaking
> > > compatibility at minor releases, but that doesn't change the
> > > compatibility of the change itself.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Thu, Mar 14, 2024 at 8:55 AM Arpit Goyal 
> > > wrote:
> > > >
> > > > Hi Greg,
> > > > I do not have access to update the KIP , Divij is helping me to do
> it.
> > > > Meanwhile let me update your queries here.
> > > >
> > > > Backward compatibility:
> > > > These changes will not impact the existing functionality as the
> > existing
> > > > behaviour always expects producer snapshot files to be present for a
> > > given
> > > > segment. Making Producer Snapshot file optional helps to cover both
> the
> > > > scenario i.e. both existing  and non existing of the producer
> snapshot
> > > file.
> > > >
> > > > The getter of producer snapshot file  would also be changed as
> > described
> > > > below:
> > > >
> > > > Current
> > > >
> > > > /**
> > > >  * @return Producer snapshot file until this segment.
> > > >  */
> > > > public Path producerSnapshotIndex() {
> > > > return producerSnapshotIndex;
> > > > }
> > > >
> > > >
> > > > Proposed
> > > >
> > > > /**
> > > >  * @return Producer snapshot file until this segment.
> > > >  */
> > > > public Optional<Path> producerSnapshotIndex() {
> > > > return producerSnapshotIndex;
> > > > }
> > > >
> > > >
> > > > Thanks and Regards
> > > > Arpit Goyal
> > > > 8861094754
> > > >
> > > >
> > > > On Wed, Mar 13, 2024 at 9:25 PM Greg 

Re: [DISCUSS] KIP-1026: Handling producer snapshot when upgrading from < v2.8.0 for Tiered Storage

2024-03-15 Thread Kamal Chandraprakash
Hi Arpit,

Thanks for the KIP!

There is an open ticket [1] to generate the empty producer snapshot for
segments which lacks one.
Tiered storage is supported from IBP 2.8-IV1 which mandates that all the
segments should have
the producer snapshots. If we make the producer snapshot optional, then we
have to handle all
the edge cases as Luke mentioned.

Shall we propose to generate the empty producer snapshots instead?

[1]: https://issues.apache.org/jira/browse/KAFKA-15195

On Fri, Mar 15, 2024 at 2:16 PM Luke Chen  wrote:

> Hi Arpit,
>
> Thanks for the KIP!
>
> I agree with Greg that we should make it clear about backward
> Compatibility.
> Since you don't have permission to edit the KIP, you could think about it
> and write in the email thread directly.
>
> Questions:
> 1. Could you explain what will happen if one partition created after v2.8,
> which should upload the producer snapshot file, but somehow it didn't
> upload this file to remote storage (ex: the file is deleted accidentally by
> user). Before this KIP, we'll throw exception when uploading the segment
> file. But what will happen after this KIP?
>
>
> Thanks.
> Luke
>
> On Fri, Mar 15, 2024 at 3:56 AM Greg Harris 
> wrote:
>
> > Hi Arpit,
> >
> > Thanks for the clarification. Replying here without updating the KIP
> > is fine for now.
> >
> > I disagree with your evaluation of the backwards compatibility. If you
> > change the return type of a method, that breaks both source and binary
> > compatibility.
> > After upgrading, plugin implementations using this method would face
> > compilation errors. Implementations that were compiled against the old
> > interface will not be able to be loaded when the new interface is
> > present.
> > I see that the interface is marked Evolving which permits breaking
> > compatibility at minor releases, but that doesn't change the
> > compatibility of the change itself.
> >
> > Thanks,
> > Greg
> >
> > On Thu, Mar 14, 2024 at 8:55 AM Arpit Goyal 
> > wrote:
> > >
> > > Hi Greg,
> > > I do not have access to update the KIP , Divij is helping me to do it.
> > > Meanwhile let me update your queries here.
> > >
> > > Backward compatibility:
> > > These changes will not impact the existing functionality as the
> existing
> > > behaviour always expects producer snapshot files to be present for a
> > given
> > > segment. Making Producer Snapshot file optional helps to cover both the
> > > scenario i.e. both existing  and non existing of the producer snapshot
> > file.
> > >
> > > The getter of producer snapshot file  would also be changed as
> described
> > > below:
> > >
> > > Current
> > >
> > > /**
> > >  * @return Producer snapshot file until this segment.
> > >  */
> > > public Path producerSnapshotIndex() {
> > > return producerSnapshotIndex;
> > > }
> > >
> > >
> > > Proposed
> > >
> > > /**
> > >  * @return Producer snapshot file until this segment.
> > >  */
> > > public Optional<Path> producerSnapshotIndex() {
> > > return producerSnapshotIndex;
> > > }
> > >
> > >
> > > Thanks and Regards
> > > Arpit Goyal
> > > 8861094754
> > >
> > >
> > > On Wed, Mar 13, 2024 at 9:25 PM Greg Harris
>  > >
> > > wrote:
> > >
> > > > Hi Arpit,
> > > >
> > > > Thanks for the KIP!
> > > >
> > > > I am not familiar with the necessity of producer snapshots, but your
> > > > explanation sounds like this should be made optional.
> > > >
> > > > Can you expand the KIP to include the changes that need to be made to
> > > > the constructor and getter, and explain more about backwards
> > > > compatibility? From the description I can't tell if this change is
> > > > backwards-compatible or not.
> > > >
> > > > Thanks,
> > > > Greg
> > > >
> > > > On Wed, Mar 13, 2024 at 6:48 AM Arpit Goyal <
> goyal.arpit...@gmail.com>
> > > > wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I just wanted to bump up this thread.
> > > > >
> > > > > The KIP introduces a really small change  and it would not take
> much
> > of
> > > > the
> > > > > time reviewing it.  This change would enable kafka users to use
> > tiered
> > > > > storage features seamlessly  for the topics migrated  from < 2.8
> > version
> > > > > which currently failed with NullPointerException.
> > > > >
> > > > > I am waiting for this KIP to get approved and then start working on
> > it.
> > > > >
> > > > > On Mon, Mar 11, 2024, 14:26 Arpit Goyal 
> > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > > Just a Reminder, KIP-1026  is open for discussion.
> > > > > > Thanks and Regards
> > > > > > Arpit Goyal
> > > > > > 8861094754
> > > > > >
> > > > > >
> > > > > > On Sat, Mar 9, 2024 at 9:27 AM Arpit Goyal <
> > goyal.arpit...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi All,
> > > > > >>
> > > > > >> I have created KIP-1026 for handling producerSnapshot empty
> > scenarios
> > > > > >> when the topic is upgraded from the kafka  < 2.8 version.
> > > > > >>
> > > > > >>
> > > > > >>
> > > >
> >
> 

Re: [DISCUSS] KIP-1026: Handling producer snapshot when upgrading from < v2.8.0 for Tiered Storage

2024-03-15 Thread Luke Chen
Hi Arpit,

Thanks for the KIP!

I agree with Greg that we should make it clear about backward Compatibility.
Since you don't have permission to edit the KIP, you could think about it
and write in the email thread directly.

Questions:
1. Could you explain what will happen if one partition created after v2.8,
which should upload the producer snapshot file, but somehow it didn't
upload this file to remote storage (ex: the file is deleted accidentally by
user). Before this KIP, we'll throw exception when uploading the segment
file. But what will happen after this KIP?


Thanks.
Luke

On Fri, Mar 15, 2024 at 3:56 AM Greg Harris 
wrote:

> Hi Arpit,
>
> Thanks for the clarification. Replying here without updating the KIP
> is fine for now.
>
> I disagree with your evaluation of the backwards compatibility. If you
> change the return type of a method, that breaks both source and binary
> compatibility.
> After upgrading, plugin implementations using this method would face
> compilation errors. Implementations that were compiled against the old
> interface will not be able to be loaded when the new interface is
> present.
> I see that the interface is marked Evolving which permits breaking
> compatibility at minor releases, but that doesn't change the
> compatibility of the change itself.
>
> Thanks,
> Greg
>
> On Thu, Mar 14, 2024 at 8:55 AM Arpit Goyal 
> wrote:
> >
> > Hi Greg,
> > I do not have access to update the KIP , Divij is helping me to do it.
> > Meanwhile let me update your queries here.
> >
> > Backward compatibility:
> > These changes will not impact the existing functionality as the existing
> > behaviour always expects producer snapshot files to be present for a
> given
> > segment. Making Producer Snapshot file optional helps to cover both the
> > scenario i.e. both existing  and non existing of the producer snapshot
> file.
> >
> > The getter of producer snapshot file  would also be changed as described
> > below:
> >
> > Current
> >
> > /**
> >  * @return Producer snapshot file until this segment.
> >  */
> > public Path producerSnapshotIndex() {
> > return producerSnapshotIndex;
> > }
> >
> >
> > Proposed
> >
> > /**
> >  * @return Producer snapshot file until this segment.
> >  */
> > public Optional<Path> producerSnapshotIndex() {
> > return producerSnapshotIndex;
> > }
> >
> >
> > Thanks and Regards
> > Arpit Goyal
> > 8861094754
> >
> >
> > On Wed, Mar 13, 2024 at 9:25 PM Greg Harris  >
> > wrote:
> >
> > > Hi Arpit,
> > >
> > > Thanks for the KIP!
> > >
> > > I am not familiar with the necessity of producer snapshots, but your
> > > explanation sounds like this should be made optional.
> > >
> > > Can you expand the KIP to include the changes that need to be made to
> > > the constructor and getter, and explain more about backwards
> > > compatibility? From the description I can't tell if this change is
> > > backwards-compatible or not.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Wed, Mar 13, 2024 at 6:48 AM Arpit Goyal 
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I just wanted to bump up this thread.
> > > >
> > > > The KIP introduces a really small change  and it would not take much
> of
> > > the
> > > > time reviewing it.  This change would enable kafka users to use
> tiered
> > > > storage features seamlessly  for the topics migrated  from < 2.8
> version
> > > > which currently failed with NullPointerException.
> > > >
> > > > I am waiting for this KIP to get approved and then start working on
> it.
> > > >
> > > > On Mon, Mar 11, 2024, 14:26 Arpit Goyal 
> > > wrote:
> > > >
> > > > > Hi All,
> > > > > Just a Reminder, KIP-1026  is open for discussion.
> > > > > Thanks and Regards
> > > > > Arpit Goyal
> > > > > 8861094754
> > > > >
> > > > >
> > > > > On Sat, Mar 9, 2024 at 9:27 AM Arpit Goyal <
> goyal.arpit...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> Hi All,
> > > > >>
> > > > >> I have created KIP-1026 for handling producerSnapshot empty
> scenarios
> > > > >> when the topic is upgraded from the kafka  < 2.8 version.
> > > > >>
> > > > >>
> > > > >>
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1026%3A+Handling+producer+snapshot+when+upgrading+from+%3C+v2.8.0+for+Tiered+Storage
> > > > >>
> > > > >> Feedback and suggestions are welcome.
> > > > >>
> > > > >> Thanks and Regards
> > > > >> Arpit Goyal
> > > > >> 8861094754
> > > > >>
> > > > >
> > >
>


Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.6 #154

2024-03-15 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #110

2024-03-15 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-15490) Invalid path provided to the log failure channel upon I/O error when writing broker metadata checkpoint

2024-03-15 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-15490.
---
Fix Version/s: 3.6.2
   Resolution: Fixed

> Invalid path provided to the log failure channel upon I/O error when writing 
> broker metadata checkpoint
> ---
>
> Key: KAFKA-15490
> URL: https://issues.apache.org/jira/browse/KAFKA-15490
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.4.0, 3.4.1, 3.5.1, 3.6.1
>Reporter: Alexandre Dupriez
>Assignee: Divij Vaidya
>Priority: Minor
> Fix For: 3.6.2
>
>
> There is a small bug/typo in the handling of I/O error when writing broker 
> metadata checkpoint in {{KafkaServer}}. The path provided to the log dir 
> failure channel is the full path of the checkpoint file whereas only the log 
> directory is expected 
> ([source|https://github.com/apache/kafka/blob/3.4/core/src/main/scala/kafka/server/KafkaServer.scala#L958C8-L961C8]).
> {code:java}
> case e: IOException =>
>val dirPath = checkpoint.file.getAbsolutePath
>logDirFailureChannel.maybeAddOfflineLogDir(dirPath, s"Error while writing 
> meta.properties to $dirPath", e){code}
> As a result, after an {{IOException}} is captured and enqueued in the log dir 
> failure channel (the placeholder below is to be replaced with the actual path 
> of the log directory):
> {code:java}
> [2023-09-22 17:07:32,052] ERROR Error while writing meta.properties to 
> /meta.properties (kafka.server.LogDirFailureChannel) 
> java.io.IOException{code}
> The log dir failure handler cannot lookup the log directory:
> {code:java}
> [2023-09-22 17:07:32,053] ERROR [LogDirFailureHandler]: Error due to 
> (kafka.server.ReplicaManager$LogDirFailureHandler) 
> org.apache.kafka.common.errors.LogDirNotFoundException: Log dir 
> /meta.properties is not found in the config.{code}
> An immediate fix for this is to use the {{logDir}} provided to the 
> checkpointing method instead of the path of the metadata file.
> For brokers with only one log directory, this bug prevents 
> the broker from shutting down as expected.
> The {{LogDirNotFoundException}} then kills the log dir failure handler 
> thread, subsequent {{IOException}}s are not handled, and the broker never 
> stops.
> {code:java}
> [2024-02-27 02:13:13,564] INFO [LogDirFailureHandler]: Stopped 
> (kafka.server.ReplicaManager$LogDirFailureHandler){code}
> Another consideration here is whether the {{LogDirNotFoundException}} should 
> terminate the log dir failure handler thread.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)