Re: A query on log truncation.

2023-01-15 Thread Luke Chen
Hi Vinoth,

I'm wondering what use case or pain point you're trying to resolve.
As you said, the client will be notified that the data was not successfully
sent or propagated and can handle the error, so why should we keep the
uncommitted records?
Could you elaborate more on the motivation?

Thank you.
Luke

On Mon, Jan 16, 2023 at 12:33 PM Vinoth  wrote:

> I was reading about Kafka, the way leader election works, log
> truncation, etc. One thing that struck me was how records which were
> written to the log but not committed (they have not propagated
> successfully to all of the in-sync replicas and the high watermark has not
> advanced, so they are not committed) are truncated by the replication
> reconciliation logic. If they are not committed they will not be
> available to consumers, since reads only go up to the high
> watermark. The producer client will also be notified, or will eventually
> know, that the message has not successfully propagated, and it can be
> handled through application logic. It seems straightforward in this case.
>
> KIP-405
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage
> >
> talks about tiered storage and Kafka being an important part of, and an
> entry point for, data infrastructure. Elsewhere I have read that Kafka
> also serves as a way of replaying data to restore state or to view data.
> KIP-320
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation
> >
> mentions that users wanting higher availability opt for unclean leader
> election.
>
> Would it be fair to assume that users might be interested in a feature, or
> at least one that can be enabled by the user, where a write to Kafka (even
> with an acks=0 configuration or unclean leader election) will remain written
> until the cleanup or delete configuration takes effect?
>
> If this is a valid use case, I am thinking of suggesting a KIP around
> picking up the data that is to be truncated, at the time of truncation, and
> replaying it as if it came through a fresh produce request. That is,
> truncation would not remove the data from Kafka; it would instead be
> placed at a different offset.
>
> Regards,
> Vinoth
>


A query on log truncation.

2023-01-15 Thread Vinoth
I was reading about Kafka, the way leader election works, log
truncation, etc. One thing that struck me was how records which were
written to the log but not committed (they have not propagated
successfully to all of the in-sync replicas and the high watermark has not
advanced, so they are not committed) are truncated by the replication
reconciliation logic. If they are not committed they will not be
available to consumers, since reads only go up to the high
watermark. The producer client will also be notified, or will eventually
know, that the message has not successfully propagated, and it can be
handled through application logic. It seems straightforward in this case.
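
For example, a minimal producer-side sketch of that handling might look like
the following (the topic name, acks setting, and error handling here are
purely illustrative):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class UncommittedWriteExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // acks=all: the send is acknowledged only once the record is replicated
        // to the in-sync replicas and the high watermark can advance.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "key", "value"), (metadata, exception) -> {
                if (exception != null) {
                    // The record was not confirmed as committed; the application
                    // decides how to handle it (retry, park it, alert, ...).
                    System.err.println("Send was not committed: " + exception);
                } else {
                    System.out.println("Committed at offset " + metadata.offset());
                }
            });
        }
    }
}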

KIP-405

talks about tiered storage and Kafka being an important part of, and an
entry point for, data infrastructure. Elsewhere I have read that Kafka
also serves as a way of replaying data to restore state or to view data.
KIP-320

mentions that users wanting higher availability opt for unclean leader
election.

Would it be fair to assume that users might be interested in a feature, or
at least one that can be enabled by the user, where a write to Kafka (even
with an acks=0 configuration or unclean leader election) will remain written
until the cleanup or delete configuration takes effect?

If this is a valid use case, I am thinking of suggesting a KIP around
picking up the data that is to be truncated, at the time of truncation, and
replaying it as if it came through a fresh produce request. That is,
truncation would not remove the data from Kafka; it would instead be
placed at a different offset.

Regards,
Vinoth


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1517

2023-01-15 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-5841) Open old index files with read-only permission

2023-01-15 Thread Ismael Juma (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5841.

Resolution: Not A Problem

We have a constructor parameter for this, so I don't think we need this method.

> Open old index files with read-only permission
> --
>
> Key: KAFKA-5841
> URL: https://issues.apache.org/jira/browse/KAFKA-5841
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: huxihx
>Priority: Major
>
> Since old index files do not change, we may as well drop the write permission 
> needed when opening them. From the doc comments in {{OffsetIndex}}, it sounds 
> like we may have had this implemented at one point:
> {code}
>  * Index files can be opened in two ways: either as an empty, mutable index 
> that allows appends or
>  * an immutable read-only index file that has previously been populated. The 
> makeReadOnly method will turn a mutable file into an 
>  * immutable one and truncate off any extra bytes. This is done when the 
> index file is rolled over.
> {code}
> So we should either support this or (if there is good reason not to) update 
> the comment.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-894: Use incrementalAlterConfigs API for syncing topic configurations

2023-01-15 Thread Ismael Juma
Hi Tina,

See below.

On Wed, Jan 11, 2023 at 3:03 AM Gantigmaa Selenge 
wrote:

> I do like this idea; however, when it's set to required, I wasn't sure how
> MirrorMaker should behave. It's probably not a great experience if
> MirrorMaker starts crashing at some point after it's already running, due to
> an incompatible broker version.
>

This would only happen if the user explicitly requests the strict required
("no fallback") mode. There are many reasons why one may want this: say
you want to be sure that your system is not susceptible to the
"alterConfigs" problems, or you want to write a test that fails if
"alterConfigs" is used.


> If the incrementalAlterConfigs request fails because the target cluster is
> running an older version, then we could log a WARN message that says
> something like "The config to use the incrementalAlterConfigs API for syncing
> topic configurations has been set to true, however the target cluster is running
> an incompatible version, therefore the legacy alterConfigs API is being used".
> This way MirrorMaker never has to stop working, and the user is made aware of
> what's being used.


Logging statements are not great as the sole mechanism (although useful as
a complementary one) since one cannot easily test against them and they're
often missed alongside all the other logging statements.


> In this case, we also would not need 3 separate values
> for the config; instead we would use the original true or false values:
> - true -> use the incrementalAlterConfigs API, but if it's unavailable,
> fall back to the legacy API
> - false -> keep using the legacy API
>
> Set the flag to true by default and remove the config in 4.0.
>

But this means you have different behavior depending on the
supported versions forever. Even though the config values are simpler, it's
harder to reason about the behavior.

> One suggestion: I'm not sure how concerned you are about people's ability
> to migrate, but if you want to make it as smooth as possible, you could add
> one more step. In the 4.0 release, while removing
> `use.incremental.alter.configs`, you can also add
> `use.legacy.alter.configs` to give users a path to continue using the old
> behavior even after the default changes.
>
> If we implement the fallback logic as Ismael suggested above, I think we
> would not need this extra flag later anymore.
>
> Please let me know what you think. Then I can go ahead and update the KIP.


IncrementalAlterConfigs has been around since Apache Kafka 2.3, released in
June 2019. By the time Apache Kafka 4.0 is released, it will have been at
least 4.5 years. I think it's reasonable to set it as the new baseline and
not maintain the fallback code. The key benefit is having behavior that is
easy to reason about.

Ismael


Re: [DISCUSS] KIP-894: Use incrementalAlterConfigs API for syncing topic configurations

2023-01-15 Thread Federico Valeri
On Wed, Jan 11, 2023 at 12:03 PM Gantigmaa Selenge  wrote:
>
> Thank you both for the feedback.
>
> > Have we considered using incremental alter configs by
> default and fallback to the legacy one if the former is unavailable?
> Initially, having a flag with just on and off switches seemed simpler, and it
> gives users control and awareness of what's being used. However, now I
> think using incremental alter configs by default and, if it is unavailable,
> falling back to the legacy API is a good idea.
>
> > The config could have 3 possible values: requested, required, never. The
> default would be requested.
>
> I do like this idea; however, when it's set to required, I wasn't sure how
> MirrorMaker should behave. It's probably not a great experience if
> MirrorMaker starts crashing at some point after it's already running, due to
> an incompatible broker version.
>
> If the incrementalAlterConfigs request fails because the target cluster is
> running an older version, then we could log a WARN message that says
> something like "The config to use the incrementalAlterConfigs API for syncing
> topic configurations has been set to true, however the target cluster is running
> an incompatible version, therefore the legacy alterConfigs API is being used".
> This way MirrorMaker never has to stop working, and the user is made aware of
> what's being used. In this case, we also would not need 3 separate values
> for the config; instead we would use the original true or false values:
> - true -> use the incrementalAlterConfigs API, but if it's unavailable,
> fall back to the legacy API
> - false -> keep using the legacy API
>
> Set the flag to true by default and remove the config in 4.0.

Hi Tina, I'm in favor of this variant. Maybe the warning message could
just be "The target cluster  is not compatible with the
IncrementalAlterConfigs API, falling back to AlterConfigs API".

Thanks.

>
> > One suggestion: I'm not sure how concerned you are about people's ability
> to migrate, but if you want to make it as smooth as possible, you could add
> one more step. In the 4.0 release, while removing
> `use.incremental.alter.configs`, you can also add
> `use.legacy.alter.configs` to give users a path to continue using the old
> behavior even after the default changes.
>
> If we implement the fallback logic as Ismael suggested above, I think we
> would not need this extra flag later anymore.
>
> Please let me know what you think. Then I can go ahead and update the KIP.
>
> Regards,
> Tina
>
> On Sat, Jan 7, 2023 at 7:45 PM Ismael Juma  wrote:
>
> > Hi,
> >
> > Thanks for the KIP. Have we considered using incremental alter configs by
> > default and fallback to the legacy one if the former is unavailable?
> >
> > The config could have 3 possible values: requested, required, never. The
> > default would be requested. In 4.0, this config would be removed and it
> > would effectively be required.
> >
> > Ismael
> >
> > On Sat, Jan 7, 2023, 10:03 AM John Roesler  wrote:
> >
> > > Hi Tina,
> > >
> > > Thanks for the KIP!
> > >
> > > I hope someone with prior MM or Kafka config experience is able to chime
> > > in here; I have neither.
> > >
> > > I took a look at your KIP, and it makes sense to me. I also think your
> > > migration plan is a good one.
> > >
> > > One suggestion: I'm not sure how concerned you are about people's ability
> > > to migrate, but if you want to make it as smooth as possible, you could
> > add
> > > one more step. In the 4.0 release, while removing
> > > `use.incremental.alter.configs`, you can also add
> > > `use.legacy.alter.configs` to give users a path to continue using the old
> > > behavior even after the default changes.
> > >
> > > Clearly, this will prolong the deprecation period, with implications on
> > > code maintenance, so there is some downside. But generally, I've found
> > > going above and beyond to support smooth upgrades for users to be well
> > > worth it in the long run.
> > >
> > > Thanks again,
> > > -John
> > >
> > >
> > > On Fri, Jan 6, 2023, at 05:49, Gantigmaa Selenge wrote:
> > > > Hi everyone,
> > > >
> > > > I would like to start a discussion on the MirrorMaker update that
> > > > proposes
> > > > replacing the deprecated alterConfigs API with the
> > > > incrementalAlterConfigs
> > > > API for syncing topic configurations. Please take a look at the
> > proposal
> > > > here:
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-894%3A+Use+incrementalAlterConfigs+API+for+syncing+topic+configurations
> > > >
> > > >
> > > > Regards,
> > > > Tina
> > >
> >


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1516

2023-01-15 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 527060 lines...]
[2023-01-15T10:11:54.442Z] 
[2023-01-15T10:11:54.442Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > GetOffsetShellTest > testInternalExcluded() STARTED
[2023-01-15T10:11:58.022Z] 
[2023-01-15T10:11:58.022Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > GetOffsetShellTest > testInternalExcluded() PASSED
[2023-01-15T10:11:58.022Z] 
[2023-01-15T10:11:58.022Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > GetOffsetShellTest > 
testTopicPartitionsNotFoundForNonMatchingTopicPartitionPattern() STARTED
[2023-01-15T10:11:59.778Z] 
[2023-01-15T10:11:59.778Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > GetOffsetShellTest > 
testTopicPartitionsNotFoundForNonMatchingTopicPartitionPattern() PASSED
[2023-01-15T10:11:59.778Z] 
[2023-01-15T10:11:59.778Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > GetOffsetShellTest > 
testTopicPartitionsNotFoundForExcludedInternalTopic() STARTED
[2023-01-15T10:12:02.581Z] 
[2023-01-15T10:12:02.581Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > GetOffsetShellTest > 
testTopicPartitionsNotFoundForExcludedInternalTopic() PASSED
[2023-01-15T10:12:02.581Z] 
[2023-01-15T10:12:02.581Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > testCommaSeparatedRegex(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommaSeparatedRegex(String)[1] 
STARTED
[2023-01-15T10:12:03.517Z] 
[2023-01-15T10:12:03.518Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > testCommaSeparatedRegex(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommaSeparatedRegex(String)[1] PASSED
[2023-01-15T10:12:03.518Z] 
[2023-01-15T10:12:03.518Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > testCommaSeparatedRegex(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommaSeparatedRegex(String)[2] 
STARTED
[2023-01-15T10:12:06.149Z] 
[2023-01-15T10:12:06.149Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > testCommaSeparatedRegex(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommaSeparatedRegex(String)[2] PASSED
[2023-01-15T10:12:06.149Z] 
[2023-01-15T10:12:06.149Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsRemoveNonExistentTopics(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsRemoveNonExistentTopics(String)[1]
 STARTED
[2023-01-15T10:12:08.781Z] 
[2023-01-15T10:12:08.781Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsRemoveNonExistentTopics(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsRemoveNonExistentTopics(String)[1]
 PASSED
[2023-01-15T10:12:08.781Z] 
[2023-01-15T10:12:08.781Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsRemoveNonExistentTopics(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsRemoveNonExistentTopics(String)[2]
 STARTED
[2023-01-15T10:12:12.363Z] 
[2023-01-15T10:12:12.363Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsRemoveNonExistentTopics(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsRemoveNonExistentTopics(String)[2]
 PASSED
[2023-01-15T10:12:12.363Z] 
[2023-01-15T10:12:12.363Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsThrowTimeoutException(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsThrowTimeoutException(String)[1]
 STARTED
[2023-01-15T10:12:12.363Z] 
[2023-01-15T10:12:12.363Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsThrowTimeoutException(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsThrowTimeoutException(String)[1]
 PASSED
[2023-01-15T10:12:12.363Z] 
[2023-01-15T10:12:12.363Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsThrowTimeoutException(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsThrowTimeoutException(String)[2]
 STARTED
[2023-01-15T10:12:14.118Z] 
[2023-01-15T10:12:14.118Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > MirrorMakerIntegrationTest > 
testCommitOffsetsThrowTimeoutException(String) > 
kafka.tools.MirrorMakerIntegrationTest.testCommitOffsetsThrowTimeoutException(String)[2]
 PASSED
[2023-01-15T10:12:14.118Z] 
[2023-01-15T10:12:14.118Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 171 > Replicati