Re: [DISCUSS] KIP-963: Upload and delete lag metrics in Tiered Storage

2023-11-16 Thread Satish Duggana
Thanks Christo for your reply.

101 and 102: We have a conclusion on them.

103. I am not strongly opinionated on this. I am fine if it is helpful
for your scenarios.

104. It seems you want to compare this metric with the number of
segments that are copied. Do you have such a metric now?

Kamal and Luke,
I agree some of the metrics are needed outside of the RSM layer in the
remote fetch path. Can we take up those fine-grained remote fetch flow
sequence metrics separately later?

Thanks,
Satish.

On Tue, 14 Nov 2023 at 22:07, Christo Lolov  wrote:
>
> Heya everyone,
>
> Apologies for the delay in my response and thank you very much for all your
> comments! I will start answering in reverse:
>
> *re: Satish*
>
> 101. I am happy to scope down this KIP and start off by emitting those
> metrics on a topic level. I had a preference to emit them on a partition
> level because I have run into situations where data wasn't evenly spread
> across partitions and not having that granularity made it harder to
> discover.
>
> 102. Fair enough, others have expressed the same preference. I will scope
> down the KIP to only bytes-based and segment-based metrics.
>
> 103. I agree that we could do this, but I personally prefer this to be a
> metric. At the very least a newcomer might not know to look for the log
> line, while most metric-collection systems allow you to explore the whole
> namespace. For example, I really dislike that while log loading happens
> Kafka emits log lines of "X/Y logs loaded" rather than just showing the
> progress via a metric. If you are strongly against this, however, I am
> happy to scope down on this as well.
>
> 104. Ideally we have only one metadata record in remote storage for every
> segment of the correct lineage. Due to leadership changes, however, this is
> not always the case. I envisioned that exposing such a metric would show
> whether there are problems with too many metadata records not being part of
> the correct lineage of a log.
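>
> To make the lag metrics concrete, for one partition I picture roughly the
> following (an illustrative sketch only; the accessors are hypothetical,
> not existing Kafka APIs):
>
>     // Copy lag in bytes: local segments whose base offset is above the last
>     // offset successfully copied to remote storage.
>     long remoteCopyLagBytes = 0L;
>     for (LogSegment segment : log.segments()) {
>         if (segment.baseOffset() > lastCopiedOffset(topicPartition))
>             remoteCopyLagBytes += segment.size();
>     }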
>
> *re: Luke*
>
> 1. I am a bit conflicted on this one. As discussed earlier with Jorge, in
> my head such metrics are better left to plugin implementations. If you and
> Kamal feel strongly about this being included I will add it to the KIP.
>
> 2. After running tiered storage in production for a while I ran into
> problems where a partition-level metric would have allowed me to zero in on
> the problem sooner. I tried balancing this with not exposing everything on
> a partition level so as not to explode the cardinality too much (point 101
> from Satish). I haven't run into a situation where knowing the
> RemoteLogSizeComputationTime on a partition level helped me, but this
> doesn't mean there isn't one.
>
> 3. I was thinking that the metric can be emitted while those records are
> being read, i.e., if it takes a long time then it will just gradually
> increase as we read. What do you think?
>
> *re: Jorge*
>
> 3.5. Sure, I will aim to add my thoughts to the KIP
>
> 4. Let me check and come back to you on this one. I have a vague memory
> this wasn't as easy to calculate, but if it is, I will include
> RemoteDeleteBytesPerSec as well.
>
> 7. Yes, I believe this is a better explanation than the one I have in the
> KIP, so I will aim to update it with yours. Thank you! LocalDeleteLag
> makes sense to me as well; I will aim to include it.
>
> *re: Kamal*
>
> 1. I can add this to the KIP, but similar to what I have mentioned earlier,
> I believe these are better left to plugin implementations, no?
>
> 2. Yeah, this makes sense.
>
> Best,
> Christo
>
> On Fri, 10 Nov 2023 at 09:33, Satish Duggana 
> wrote:
>
> > Thanks Christo for the KIP and the interesting discussion.
> >
> > 101. Adding metrics at partition level may increase the cardinality of
> > these metrics. We should be cautious of that and see whether they are
> > really needed. RLM-related operations are generally not affected by
> > specific partition(s); problems are mostly caused by remote storage or
> > broker-level issues.
> >
> > 102. I am not sure whether the records metric adds much value when we
> > have other bytes- and segment-related metrics available. If needed,
> > records-level information can be derived once we have segments/bytes
> > metrics.
> >
> > 103. Regarding RemoteLogSizeComputationTime, we can add logs for
> > debugging purposes to print the required duration while computing size,
> > instead of generating a metric. If there is any degradation in remote
> > log size computation, it will have an effect on the RLM task, leading to
> > remote log copy and delete lags.
> >
> > RLMM and RSM implementations can always add more metrics for
> > observability based on the respective implementations.
> >
> > 104. What is the purpose of RemoteLogMetadataCount as a metric?
> >
> > Thanks,
> > Satish.
> >
> > On Fri, 10 Nov 2023 at 04:10, Jorge Esteban Quilcate Otoya
> >  wrote:
> > >
> > > Hi Christo,
> > >
> > > I'd like to add another suggestion:
> > >
> > > 7. Adding on TS lag formulas, my understanding is that

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-16 Thread Bruno Cadonna

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even 
say that currently we do not need to keep backwards compatibility among 
IQv2 versions, since we marked the API "Evolving" (do we only mean code 
compatibility here, or also behavioral compatibility?). I propose that we 
try not to limit ourselves with backwards compatibility for an API that we 
explicitly marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite 
focused on what the state store provides. I see that for range queries, 
we have methods on the state store interface that specify the order, but 
that should be kind of orthogonal to the IQv2 query type. Let's assume 
somebody in the future adds a state store implementation that is not 
order based. To account for use cases where the order does not matter, 
this person might also add a method to the state store interface that 
does not guarantee any order. However, our range query type is specified 
to guarantee order by default. So we need to add something like 
withNoOrder() to the query type to allow the use cases that do not 
need order and get better performance in IQ. That does not look very 
nice to me. Having the no-order-guaranteed option does not cost us 
anything and it keeps the IQv2 interface flexible. I assume we want to 
drop the Evolving annotation at some point.

Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for 
single-key single-ts lookup; it seems better to return the proper 
timestamp. The timestamp is already in the store and it's cheap to 
extract it and add to the result, and it might be valuable information 
for the user. Not sure though if we should deprecate the existing 
constructor, because for "latest" it's convenient to have?



60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it Bruno). 
It's kinda tricky.


Historically, it seems that IQv1, i.e., the `ReadOnlyXxx` interfaces 
provide a clear contract that `range()` is ascending and 
`reverseRange()` is descending.


For `RangeQuery`, the question is whether we implicitly inherited this 
contract. Our conclusion in the KIP-985 discussion was that we did inherit 
it. If this holds true, changing the contract would be a breaking change 
(which might still be acceptable, given that the interface is annotated 
as unstable, and that IQv2 is not widely adopted yet). I am happy to go 
with the 3-option contract, but just want to ensure we all agree it's 
the right way to go, and we are potentially willing to pay the price of 
backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still 
useful semantic? 


I don't think we necessarily need it, but as pointed out by Lucas, all 
existing queries provide it. Overall, my main point is really about not 
implementing something "random", but defining a proper binding contract 
that allows users to reason about it.


In general, I agree that weaker semantics might be sufficient, but I am 
not sure if we can implement anything weaker in a reasonable way? Happy 
to be convinced otherwise. (I have some examples, which I will omit for 
now, as I hope we can actually go with snapshot semantics.)


The RocksDB Snapshot idea from Lucas sounds very promising. I was not 
aware that we could do this with RocksDB (otherwise I might have 
suggested it on the PR right away). I guess the only open question 
remaining would be whether we can provide the same guarantees for a future 
in-memory implementation of VersionedStores? It sounds possible to do, 
but we should have some level of confidence about it?
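
For reference, the snapshot mechanism in question is used roughly like this
(an illustrative sketch against the RocksDB Java API, not the actual store
implementation):

    import org.rocksdb.*;

    RocksDB.loadLibrary();
    try (Options options = new Options().setCreateIfMissing(true);
         RocksDB db = RocksDB.open(options, "/tmp/versioned-store")) {
        Snapshot snapshot = db.getSnapshot();   // point-in-time view of the DB
        try (ReadOptions readOptions = new ReadOptions().setSnapshot(snapshot)) {
            // Every get() against readOptions sees the state as of
            // getSnapshot(), regardless of concurrent writes -- the
            // consistency we are after for a multi-timestamp key query.
            byte[] value = db.get(readOptions, "some-key".getBytes());
        } finally {
            db.releaseSnapshot(snapshot);   // must be released, like an iterator
        }
    }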



-Matthias

On 11/14/23 6:33 AM, Lucas Brutschy wrote:

Hi Alieh,

I agree with Bruno that a weaker guarantee could be useful already,
and it's anyway possible to strengthen the guarantees later on. On the
other hand, it would be nice to provide a consistent view across all
segments if it doesn't come with major downsides, because until now IQ
does provide a consistent view (via iterators), and this would be the
first feature that diverges from that guarantee.

I think a consistent view could be achieved relatively easily by creating a
snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot), which will
ensure a consistent view on a single DB, and using
`ReadOptions.setSnapshot` for the gets. Since versioned state store
segments seem to be backed by a single RocksDB instance (that was
unclear to me during our earlier discussion), a single snapshot should
be enough to guarantee a consistent view - we just need to make sure
to clean up the snapshots after use, similar to iterators. If we
instead need a consistent view across multiple RocksDB instances, we
may have to acquire a write lock on all of those (could use the
current object monitors of our `RocksDB` interfa

Re: [DISCUSS] KIP-997 Support fetch(fromKey, toKey, from, to) to WindowRangeQuery and unify WindowKeyQuery and WindowRangeQuery

2023-11-16 Thread Bruno Cadonna

Hi Hanyu,

Thanks for the KIP!

1)
Could you please mark the pieces that you want to add to the API in the 
code listing in the KIP? You can add a comment like "// newly added" or 
similar. That would make reading the KIP a bit easier because one does 
not need to compare your code with the code in the current codebase.


2)
Could you -- as a side cleanup -- also change the getters to not use the 
get-prefix anymore, please? That was apparently an oversight when those 
methods were added. Although the API is marked as Evolving, I think we 
should still deprecate the getX() methods, since it does not cost us 
anything.


3)
I propose to make the API a bit more fluent. For example, something like

WindowRangeQuery.withKey(key).fromTime(t1).toTime(t2)

and

WindowRangeQuery.withAllKeys().fromTime(t1).toTime(t2)

and

WindowRangeQuery.withKeyRange(key1, key2).fromTime(t1).toTime(t2)

and maybe, in addition to the above, also add the option to start 
with the time range


WindowRangeQuery.withWindowStartRange(t1, t2).fromKey(key1).toKey(key2)
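
For illustration, a query in the proposed fluent style would then be issued
through IQv2 like this (fromTime/toTime are the proposed additions, not
existing API; the store name and types are made up):

    WindowRangeQuery<String, Long> query =
        WindowRangeQuery.<String, Long>withKey("k1")
            .fromTime(Instant.parse("2023-11-01T00:00:00Z"))
            .toTime(Instant.parse("2023-11-02T00:00:00Z"));

    StateQueryResult<KeyValueIterator<Windowed<String>, Long>> result =
        kafkaStreams.query(StateQueryRequest.inStore("counts-store").withQuery(query));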


4)
Could you also add some usage examples? Alieh did quite a nice job 
regarding usage examples in KIP-986.



Best,
Bruno

On 11/8/23 8:02 PM, Hanyu (Peter) Zheng wrote:

Hello everyone,

I would like to start the discussion for KIP-997: Support fetch(fromKey,
toKey, from, to) to WindowRangeQuery and unify WindowKeyQuery and
WindowRangeQuery
The KIP can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-997%3A++Support+fetch%28fromKey%2C+toKey%2C+from%2C+to%29+to+WindowRangeQuery+and+unify+WindowKeyQuery+and+WindowRangeQuery

Any suggestions are more than welcome.

Many thanks,
Hanyu

On Wed, Nov 8, 2023 at 10:38 AM Hanyu (Peter) Zheng 
wrote:



https://cwiki.apache.org/confluence/display/KAFKA/KIP-997%3A++Support+fetch%28fromKey%2C+toKey%2C+from%2C+to%29+to+WindowRangeQuery+and+unify+WindowKeyQuery+and+WindowRangeQuery

--

Hanyu (Peter) Zheng he/him/his
Software Engineer Intern
+1 (213) 431-7193







[jira] [Resolved] (KAFKA-15755) LeaveGroupResponse v0-v2 should handle no members

2023-11-16 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-15755.
-
Fix Version/s: 3.4.2
   3.5.2
   3.7.0
   3.6.1
   Resolution: Fixed

> LeaveGroupResponse v0-v2 should handle no members
> -
>
> Key: KAFKA-15755
> URL: https://issues.apache.org/jira/browse/KAFKA-15755
> Project: Kafka
>  Issue Type: Bug
>Reporter: Robert Wagner
>Assignee: Robert Wagner
>Priority: Major
> Fix For: 3.4.2, 3.5.2, 3.7.0, 3.6.1
>
>
> When Sarama and Librdkafka consumer clients issue LeaveGroup requests, they 
> use an older protocol version < 3 which did not include a `members` field.
> Since our upgrade to Kafka broker 3.4.1 we have started seeing these broker 
> exceptions:
> {code}
> [2023-10-24 01:17:17,214] ERROR [KafkaApi-28598] Unexpected error handling 
> request RequestHeader(apiKey=LEAVE_GROUP, apiVersion=1, clientId=REDACTED, 
> correlationId=116775, headerVersion=1) -- 
> LeaveGroupRequestData(groupId=REDACTED, 
> memberId='REDACTED-73967453-93c4-4f3f-bcef-32c1f280350f', members=[]) with 
> context RequestContext(header=RequestHeader(apiKey=LEAVE_GROUP, apiVersion=1, 
> clientId=REDACTED, correlationId=116775, headerVersion=1), 
> connectionId='REDACTED', clientAddress=/REDACTED, principal=REDACTED, 
> listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, 
> clientInformation=ClientInformation(softwareName=confluent-kafka-python, 
> softwareVersion=1.7.0-rdkafka-1.7.0), fromPrivilegedListener=false, 
> principalSerde=Optional[REDACTED]) (kafka.server.KafkaApis)
> java.util.concurrent.CompletionException: 
> org.apache.kafka.common.errors.UnsupportedVersionException: LeaveGroup 
> response version 1 can only contain one member, got 0 members.
>   at 
> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
>   at 
> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:936)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:950)
>   at 
> java.base/java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2340)
>   at kafka.server.KafkaApis.handleLeaveGroupRequest(KafkaApis.scala:1796)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:196)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> Caused by: org.apache.kafka.common.errors.UnsupportedVersionException: 
> LeaveGroup response version 1 can only contain one member, got 0 members. 
> {code}
>  
> KIP-848 introduced a check in LeaveGroupResponse that the members field must 
> have exactly 1 element.  In some error cases, the members field has 0 
> elements - which would still be a valid response for v0-v2 messages, but this 
> exception was being thrown.
> Instead of throwing an exception in this case, continue with the 
> LeaveGroupResponse, since the members field is not included in v0-v2 responses 
> anyway.
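>
> For illustration, the relaxed check could look something like this (a sketch,
> not the exact patch):
> {code:java}
> // v0-v2 never serialize the members field, so an empty list is representable
> // there; only reject responses that cannot be encoded on the wire.
> if (version >= 1 && version <= 2 && members.size() > 1) {
>     throw new UnsupportedVersionException("LeaveGroup response version " + version
>         + " can only contain one member, got " + members.size() + " members.");
> }
> {code}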



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15552) Duplicate Producer ID blocks during ZK migration

2023-11-16 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-15552.
---
Resolution: Fixed

> Duplicate Producer ID blocks during ZK migration
> 
>
> Key: KAFKA-15552
> URL: https://issues.apache.org/jira/browse/KAFKA-15552
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.4.0, 3.5.0, 3.4.1, 3.6.0, 3.5.1
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Critical
> Fix For: 3.5.2, 3.6.1
>
>
> When migrating producer ID blocks from ZK to KRaft, we are taking the current 
> producer ID block from ZK and writing its "firstProducerId" into the 
> producer IDs KRaft record. However, in KRaft we store the _next_ producer ID 
> block in the log rather than storing the current block like ZK does. The end 
> result is that the first block given to a caller of AllocateProducerIds is a 
> duplicate of the last block allocated in ZK mode.
>  
> This can result in duplicate producer IDs being given to transactional or 
> idempotent producers. In the case of transactional producers, this can cause 
> long term problems since the producer IDs are persisted and reused for a long 
> time.
> The time between the last producer ID block being allocated by the ZK 
> controller and all the brokers being restarted following the metadata 
> migration is when this bug is possible.
>  
> Symptoms of this bug will include ReplicaManager OutOfOrderSequenceException 
> and possibly some producer epoch validation errors. To see if a cluster is 
> affected by this bug, search for the offending producer ID and see if it is 
> being used by more than one producer.
>  
> For example, the following error was observed
> {code}
> Out of order sequence number for producer 376000 at offset 381338 in 
> partition REDACTED: 0 (incoming seq. number), 21 (current end sequence 
> number) 
> {code}
> Then searching for "376000" on 
> org.apache.kafka.clients.producer.internals.TransactionManager logs, two 
> brokers both show the same producer ID being provisioned
> {code}
> Broker 0 [Producer clientId=REDACTED-0] ProducerId set to 376000 with epoch 1
> Broker 5 [Producer clientId=REDACTED-1] ProducerId set to 376000 with epoch 1
> {code}
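>
> Illustratively, the fix is to persist the block *after* the one ZK handed out
> last (a sketch; the accessor names are hypothetical):
> {code:java}
> // ZK stores the current block; KRaft must store the next block to allocate.
> long nextProducerId = zkBlock.firstProducerId() + zkBlock.size();
> {code}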



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Should we continue to merge without a green build? No!

2023-11-16 Thread Igor Soarez
Hi all,

I think at least one of those is my fault, apologies.
I'll try to make sure all my tests are passing from now on.

It doesn't help that GitHub always shows that the tests have failed,
even when they have not. I suspect this is because Jenkins always
marks the builds as unstable, even when all tests pass, because
the "Archive JUnit-formatted test results" step seems to persistently
fail with "[Checks API] No suitable checks publisher found.".
e.g. 
https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14770/1/pipeline/

Can we get rid of that persistent failure and actually mark successful test 
runs as green?

--
Igor


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2387

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 415933 lines...]

[Gradle test output elided: :connect:runtime:test (SinkUtilsTest, TopicAdminTest), all listed tests PASSED]

[jira] [Created] (KAFKA-15840) Correct initialization of ConsumerGroupHeartbeat by client

2023-11-16 Thread Andrew Schofield (Jira)
Andrew Schofield created KAFKA-15840:


 Summary: Correct initialization of ConsumerGroupHeartbeat by client
 Key: KAFKA-15840
 URL: https://issues.apache.org/jira/browse/KAFKA-15840
 Project: Kafka
  Issue Type: Sub-task
  Components: clients
Reporter: Andrew Schofield
Assignee: Andrew Schofield


The new consumer using the KIP-848 protocol currently leaves the 
TopicPartitions set to null for the ConsumerGroupHeartbeat request, even when 
the MemberEpoch is zero. This violates the KIP which expects the list to be 
empty (but not null).
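
A sketch of the expected behaviour (illustrative; the accessor names follow the
generated request-data style and are assumptions, not the actual client code):

{code:java}
// Per KIP-848: on the joining heartbeat (member epoch 0) the TopicPartitions
// field must be an empty list, never null.
if (memberEpoch == 0 && data.topicPartitions() == null) {
    data.setTopicPartitions(Collections.emptyList());
}
{code}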



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.5 #98

2023-11-16 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #114

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 307712 lines...]

[Gradle test output elided: :core:test (KafkaZkClientTest), all listed tests PASSED]

Re: [VOTE] KIP-997: Partition-Level Throughput Metrics

2023-11-16 Thread Kamal Chandraprakash
+1 (non-binding). Thanks for the KIP!

On Thu, Nov 16, 2023 at 9:00 AM Satish Duggana 
wrote:

> Thanks Qichao for the KIP.
>
> +1 (binding)
>
> ~Satish.
>
> On Thu, 16 Nov 2023 at 02:20, Jorge Esteban Quilcate Otoya
>  wrote:
> >
> > Qichao, thanks again for leading this proposal!
> >
> > +1 (non-binding)
> >
> > Cheers,
> > Jorge.
> >
> > On Wed, 15 Nov 2023 at 19:17, Divij Vaidya 
> wrote:
> >
> > > +1 (binding)
> > >
> > > I was involved in the discussion thread for this KIP and support it in
> its
> > > current form.
> > >
> > > --
> > > Divij Vaidya
> > >
> > >
> > >
> > > On Wed, Nov 15, 2023 at 10:55 AM Qichao Chu 
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to call a vote for KIP-977: Partition-Level Throughput
> Metrics.
> > > >
> > > > Please take a look here:
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-977%3A+Partition-Level+Throughput+Metrics
> > > >
> > > > Best,
> > > > Qichao Chu
> > > >
> > >
>


[jira] [Created] (KAFKA-15841) Add Support for Topic-Level Partitioning in Kafka Connect

2023-11-16 Thread Henrique Mota (Jira)
Henrique Mota created KAFKA-15841:
-

 Summary: Add Support for Topic-Level Partitioning in Kafka Connect
 Key: KAFKA-15841
 URL: https://issues.apache.org/jira/browse/KAFKA-15841
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Reporter: Henrique Mota


In our organization, we utilize JDBC sink connectors to consume data from 
various topics, where each topic is dedicated to a specific tenant with a 
single partition. Recently, we developed a custom sink based on the standard 
JDBC sink, enabling us to pause consumption of a topic when encountering 
problematic records.

However, we face limitations within Kafka Connect, as it doesn't allow for 
appropriate partitioning of topics among workers. We attempted a workaround by 
breaking down the topics list within the 'topics' parameter. Unfortunately, 
Kafka Connect overrides this parameter after invoking the {{taskConfigs(int 
maxTasks)}} method from the {{org.apache.kafka.connect.connector.Connector}} 
class.
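
For context, the workaround lived in the connector's task-config generation,
roughly like the sketch below (illustrative; the config handling is simplified):

{code:java}
@Override
public List<Map<String, String>> taskConfigs(int maxTasks) {
    List<String> topics = Arrays.asList(connectorConfig.get("topics").split(","));
    int tasks = Math.min(maxTasks, topics.size());
    List<Map<String, String>> configs = new ArrayList<>();
    for (int i = 0; i < tasks; i++) {
        Map<String, String> taskConfig = new HashMap<>(connectorConfig);
        // Give each task a disjoint slice of the topics. This is the value that
        // Kafka Connect currently overwrites after taskConfigs() returns.
        taskConfig.put("topics", String.join(",",
            topics.subList(i * topics.size() / tasks, (i + 1) * topics.size() / tasks)));
        configs.add(taskConfig);
    }
    return configs;
}
{code}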

We request the addition of support in Kafka Connect to enable the partitioning 
of topics among workers without requiring a fork. This enhancement would 
facilitate better load distribution and allow for more flexible configurations, 
particularly in scenarios where topics are dedicated to different tenants.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.4 #172

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 538281 lines...]
[Gradle test output elided: :core:integrationTest (KafkaZkClientTest), all listed tests PASSED]

Re: [VOTE] KIP-1000: List Client Metrics Configuration Resources

2023-11-16 Thread Doğuşcan Namal
Thanks for the brief KIP Andrew. Having discussed the details in KIP-714, I
see this is a natural follow up to that.

+1(non-binding)

On Wed, 15 Nov 2023 at 15:23, Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi,
> I’d like to start the voting for KIP-1000: List Client Metrics
> Configuration Resources.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1000%3A+List+Client+Metrics+Configuration+Resources
>
> Thanks,
> Andrew


Re: [VOTE] KIP-1000: List Client Metrics Configuration Resources

2023-11-16 Thread Apoorv Mittal
Thanks a lot for writing the KIP Andrew. This is much required to list all
configured client metrics resources.

I have one minor question related to the zkBroker listener in the new RPC.
As the client-metrics resource is not supported in ZooKeeper mode,
shouldn't we disallow ListClientMetricsResourcesRequest for
ZooKeeper in the ApiVersions request itself?

+1(non-binding)

Regards,
Apoorv Mittal
+44 7721681581


On Wed, Nov 15, 2023 at 4:58 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi,
> I’d like to start the voting for KIP-1000: List Client Metrics
> Configuration Resources.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1000%3A+List+Client+Metrics+Configuration+Resources
>
> Thanks,
> Andrew


[jira] [Resolved] (KAFKA-15633) Bug: Generated Persistent Directory IDs are overwritten on startup.

2023-11-16 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-15633.
---
Resolution: Fixed

> Bug: Generated Persistent Directory IDs are overwritten on startup.
> ---
>
> Key: KAFKA-15633
> URL: https://issues.apache.org/jira/browse/KAFKA-15633
> Project: Kafka
>  Issue Type: Sub-task
>  Components: jbod
>Affects Versions: 3.7.0
>Reporter: Proven Provenzano
>Assignee: Proven Provenzano
>Priority: Major
> Fix For: 3.7.0
>
>
> The code to generate the persistent directory IDs and add them to the 
> meta.properties file works, but later in the startup process the file is 
> overwritten with the original data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15842) Correct handling of KafkaConsumer.committed for new consumer

2023-11-16 Thread Andrew Schofield (Jira)
Andrew Schofield created KAFKA-15842:


 Summary: Correct handling of KafkaConsumer.committed for new 
consumer
 Key: KAFKA-15842
 URL: https://issues.apache.org/jira/browse/KAFKA-15842
 Project: Kafka
  Issue Type: Sub-task
  Components: clients
Reporter: Andrew Schofield
Assignee: Andrew Schofield


KafkaConsumer.committed throws TimeoutException when there is no response. The 
new consumer currently returns null. Change the new consumer to behave like 
the old consumer.
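
For illustration, the behaviour this ticket targets (a sketch, not the actual
client change):

{code:java}
// Old consumer semantics, which the new consumer should match: time out with
// an exception rather than returning null.
try {
    consumer.committed(Set.of(topicPartition), Duration.ofSeconds(5));
} catch (org.apache.kafka.common.errors.TimeoutException e) {
    // expected when no response arrives within the timeout
}
{code}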



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2388

2023-11-16 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-997: Partition-Level Throughput Metrics

2023-11-16 Thread Matthias J. Sax

This is KIP-977, right? Not KIP-997, as the subject says.

Guess we won't be able to fix this now. Hope it does not cause confusion 
down the line...



-Matthias

On 11/16/23 4:43 AM, Kamal Chandraprakash wrote:

+1 (non-binding). Thanks for the KIP!

On Thu, Nov 16, 2023 at 9:00 AM Satish Duggana 
wrote:


Thanks Qichao for the KIP.

+1 (binding)

~Satish.

On Thu, 16 Nov 2023 at 02:20, Jorge Esteban Quilcate Otoya
 wrote:


Qichao, thanks again for leading this proposal!

+1 (non-binding)

Cheers,
Jorge.

On Wed, 15 Nov 2023 at 19:17, Divij Vaidya 
wrote:

+1 (binding)

I was involved in the discussion thread for this KIP and support it in its
current form.

--
Divij Vaidya


On Wed, Nov 15, 2023 at 10:55 AM Qichao Chu 
wrote:

Hi all,

I'd like to call a vote for KIP-977: Partition-Level Throughput Metrics.

Please take a look here:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-977%3A+Partition-Level+Throughput+Metrics

Best,
Qichao Chu









Re: [VOTE] KIP-1000: List Client Metrics Configuration Resources

2023-11-16 Thread Andrew Schofield
Hi Apoorv,
Thanks for your vote.

Initially, I put support for zkBroker in order to be able to control the error 
response in this case.
I have validated the error handling for this RPC on a ZK cluster in which the 
RPC is not supported,
and the error is entirely understandable. Consequently, I have removed 
`zkBroker` for this new RPC.

Thanks,
Andrew

> On 16 Nov 2023, at 13:51, Apoorv Mittal  wrote:
>
> Thanks a lot for writing the KIP Andrew. This is much required to list all
> configured client metrics resources.
>
> I have one minor question related to the zkBroker listener in the new RPC.
> As the client-metrics resource is not supported in ZooKeeper mode,
> shouldn't we disallow ListClientMetricsResourcesRequest for
> ZooKeeper in the ApiVersions request itself?
>
> +1(non-binding)
>
> Regards,
> Apoorv Mittal
> +44 7721681581
>
>
> On Wed, Nov 15, 2023 at 4:58 PM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
>> Hi,
>> I’d like to start the voting for KIP-1000: List Client Metrics
>> Configuration Resources.
>>
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1000%3A+List+Client+Metrics+Configuration+Resources
>>
>> Thanks,
>> Andrew



[jira] [Created] (KAFKA-15843) Review consumer onPartitionsAssigned called with empty partitions

2023-11-16 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15843:
--

 Summary: Review consumer onPartitionsAssigned called with empty 
partitions
 Key: KAFKA-15843
 URL: https://issues.apache.org/jira/browse/KAFKA-15843
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans


The legacy coordinator triggers onPartitionsAssigned with an empty assignment 
(which is not the case when triggering onPartitionsRevoked or Lost), and the 
new consumer implementation maintains the same principle. We should review 
this to fully understand whether it is really needed to call 
onPartitionsAssigned with an empty assignment (or whether it should behave 
consistently with onPartitionsRevoked/Lost).
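
For reference, the asymmetry as seen from an application (illustrative):

{code:java}
consumer.subscribe(List.of("input-topic"), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // may currently be invoked with an empty collection
    }
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // not invoked with an empty collection
    }
});
{code}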



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1000: List Client Metrics Configuration Resources

2023-11-16 Thread Jun Rao
Hi, Andrew,

Thanks for the KIP. Just one comment.

Should we extend ConfigCommand or add a new tool to list client metrics?

Thanks,

Jun

On Tue, Nov 7, 2023 at 9:42 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi,
> I would like to start discussion of a small KIP which fills a gap in the
> administration of client metrics configuration.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1000%3A+List+Client+Metrics+Configuration+Resources
>
> Thanks,
> Andrew


[jira] [Created] (KAFKA-15844) Broker does not re-register

2023-11-16 Thread Jira
José Armando García Sancio created KAFKA-15844:
--

 Summary: Broker does not re-register
 Key: KAFKA-15844
 URL: https://issues.apache.org/jira/browse/KAFKA-15844
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: José Armando García Sancio


We experienced a case where a Kafka broker lost its connection to the ZK cluster 
and was not able to recreate the registration znode. Only after the broker was 
restarted did the registration znode get created.

My impression is that the following code is not correct. It marks the ZK 
client as connected right after creating the ZooKeeper client; it doesn't wait 
for the session state to be marked as connected.
{code:java}
private def reinitialize(): Unit = {
  // Initialization callbacks are invoked outside of the lock to avoid deadlock
  // potential since their completion may require additional Zookeeper requests,
  // which will block to acquire the initialization lock
  stateChangeHandlers.values.foreach(callBeforeInitializingSession _)

  inWriteLock(initializationLock) {
    if (!connectionState.isAlive) {
      zooKeeper.close()
      info(s"Initializing a new session to $connectString.")
      // retry forever until ZooKeeper can be instantiated
      var connected = false
      while (!connected) {
        try {
          zooKeeper = new ZooKeeper(connectString, sessionTimeoutMs, ZooKeeperClientWatcher, clientConfig)
          connected = true
        } catch {
          case e: Exception =>
            info("Error when recreating ZooKeeper, retrying after a short sleep", e)
            Thread.sleep(RetryBackoffMs)
        }
      }
    }
  }

  stateChangeHandlers.values.foreach(callAfterInitializingSession _)
}
{code}
During broker startup or construction of the {{{}ZooKeeperClient{}}}, it blocks 
waiting for the connection state to be marked as connected.
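
For comparison, the construction path effectively does the equivalent of the
following before declaring the client usable (an illustrative Java sketch
against the plain ZooKeeper API, not Kafka's actual code):

{code:java}
CountDownLatch connected = new CountDownLatch(1);
ZooKeeper zk = new ZooKeeper(connectString, sessionTimeoutMs, event -> {
    if (event.getState() == Watcher.Event.KeeperState.SyncConnected)
        connected.countDown();   // session actually established
});
// reinitialize() skips a wait like this after recreating the handle:
if (!connected.await(connectionTimeoutMs, TimeUnit.MILLISECONDS))
    throw new TimeoutException("Timed out waiting for ZooKeeper connection");
{code}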

The controller sees the broker go offline:
{code:java}
INFO [Controller id=32] Newly added brokers: , deleted brokers: 37, bounced 
brokers: , all live brokers: ...{code}
ZK session is lost in broker 37:
{code:java}
[Broker=37] WARN Client session timed out, have not heard from server in 3026ms 
for sessionid 0x504b9c08b5e0025
...
INFO [ZooKeeperClient ACL authorizer] Session expired.
...
INFO [ZooKeeperClient ACL authorizer] Initializing a new session to ...
...
[Broker=37] INFO Session establishment complete on server ..., sessionid = 
0x604dd0ad7180045, negotiated timeout = 18000{code}
Unfortunately, we never see the broker recreate the broker registration znode. 
We never see the following line in the logs:
{code:java}
Creating $path (is it secure? $isSecure){code}
My best guess is that some of the Kafka threads (for example the controller 
threads) are blocked on the ZK client. Unfortunately, I don't have a thread dump 
of the process at the time of the issue.

Restarting broker 37 resolved the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1000: List Client Metrics Configuration Resources

2023-11-16 Thread Jason Gustafson
Hey Andrew,

Thanks for the KIP. Just clarifying a couple small details.

1. I assume any broker can handle this API, so admin clients will choose a
node randomly?
2. Does the controller need to support this API? If not, we can drop
"controller" from "listeners."

Thanks,
Jason

On Thu, Nov 16, 2023 at 10:00 AM Jun Rao  wrote:

> Hi, Andrew,
>
> Thanks for the KIP. Just one comment.
>
> Should we extend ConfigCommand or add a new tool to list client metrics?
>
> Thanks,
>
> Jun
>
> On Tue, Nov 7, 2023 at 9:42 AM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > Hi,
> > I would like to start discussion of a small KIP which fills a gap in the
> > administration of client metrics configuration.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1000%3A+List+Client+Metrics+Configuration+Resources
> >
> > Thanks,
> > Andrew
>


Re: [DISCUSS] KIP-1000: List Client Metrics Configuration Resources

2023-11-16 Thread Andrew Schofield
Hi Jun,
KIP-714 includes `kafka-client-metrics.sh` which provides an easier way to work 
with client metrics config
than the general-purpose `kafka-configs.sh`. So, this new RPC will actually be 
used in the
`kafka-client-metrics.sh` tool.

Thanks,
Andrew

> On 16 Nov 2023, at 18:00, Jun Rao  wrote:
>
> Hi, Andrew,
>
> Thanks for the KIP. Just one comment.
>
> Should we extend ConfigCommand or add a new tool to list client metrics?
>
> Thanks,
>
> Jun
>
> On Tue, Nov 7, 2023 at 9:42 AM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
>> Hi,
>> I would like to start discussion of a small KIP which fills a gap in the
>> administration of client metrics configuration.
>>
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1000%3A+List+Client+Metrics+Configuration+Resources
>>
>> Thanks,
>> Andrew



Re: [VOTE] KIP-979 Allow independently stop KRaft processes

2023-11-16 Thread Hailey Ni
Hey Justine,

Thank you very much for the review.
I've updated the KIP to add a log line stating that when both flags are
given, node-id will take precedence.

Thanks,
Hailey

On Wed, Nov 15, 2023 at 3:37 PM Justine Olshan 
wrote:

> Hey Hailey,
>
> Thanks for the KIP.
> I wonder if it would be better to either not allow both flags or, if we
> choose to have node-id take precedence, at least have a log line stating such.
>
> Otherwise the KIP makes sense to me.
>
> Justine
>
> On Tue, Nov 14, 2023 at 10:17 AM Colin McCabe  wrote:
>
> > Thanks, Hailey.
> >
> > +1 (binding)
> >
> > Colin
> >
> > On Mon, Nov 13, 2023, at 15:13, Hailey Ni wrote:
> > > Hi Colin,
> > >
> > > Thank you for your review. I removed the "absolute path need to be
> > > provided" line from the KIP, and will modify the code to get the
> absolute
> > > path to the config files using some bash in the kafka-server-start
> file.
> > > For your second question, I've added a line in the KIP: "If both
> > parameters
> > > are provided, the value for node-id parameter will take precedence,
> i.e,
> > > the process with node id specified will be killed, no matter what's the
> > > process role provided."
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > Hailey
> > >
> > > On Thu, Nov 9, 2023 at 4:03 PM Colin McCabe 
> wrote:
> > >
> > >> Hi Hailey,
> > >>
> > >> Thanks for the KIP.
> > >>
> > >> It feels clunky to have to pass an absolute path to the configuration
> > file
> > >> when starting the broker or controller. I think we should consider one
> > of
> > >> two alternate options:
> > >>
> > >> 1. Use JMXtool to examine the running kafka.Kafka processes.
> > >> I believe ID is available from kafka.server, type=app-info,id=1
> > (replace 1
> > >> with the actual ID)
> > >>
> > >> Role can be deduced by the presence or absence of
> > >> kafka.server,type=KafkaServer,name=BrokerState for brokers, or
> > >> kafka.server,type=ControllerServer,name=ClusterId for controllers.
> > >>
> > >> 2. Alternately, we could inject the ID and role into the command line
> in
> > >> kafka-server-start.sh. Basically add -Dkafka.node.id=1,
> > >> -Dkafka.node.roles=broker. This would be helpful to people just
> > examining
> > >> the output of ps.
> > >>
> > >> Finally, you state that either command-line option can be given. What
> > >> happens if both are given?
> > >>
> > >> best,
> > >> Colin
> > >>
> > >>
> > >> On Mon, Oct 23, 2023, at 12:20, Hailey Ni wrote:
> > >> > Hi Ron,
> > >> >
> > >> > I've added the "Rejected Alternatives" section in the KIP. Thanks
> for
> > the
> > >> > comments and +1 vote!
> > >> >
> > >> > Thanks,
> > >> > Hailey
> > >> >
> > >> > On Mon, Oct 23, 2023 at 6:33 AM Ron Dagostino 
> > wrote:
> > >> >
> > >> >> Hi Hailey.  I'm +1 (binding), but could you add a "Rejected
> > >> >> Alternatives" section to the KIP and mention the
> "--required-config "
> > >> >> option that we decided against and the reason why we made the
> > decision
> > >> >> to reject it?  There were some other small things (dash instead of
> > dot
> > >> >> in the parameter names, --node-id instead of --broker-id), but
> > >> >> cosmetic things like this don't warrant a mention, so I think
> there's
> > >> >> just the one thing to document.
> > >> >>
> > >> >> Thanks for the KIP, and thanks for adjusting it along the way as
> the
> > >> >> discussion moved forward.
> > >> >>
> > >> >> Ron
> > >> >>
> > >> >>
> > >> >> Ron
> > >> >>
> > >> >> On Mon, Oct 23, 2023 at 4:00 AM Federico Valeri <
> > fedeval...@gmail.com>
> > >> >> wrote:
> > >> >> >
> > >> >> > +1 (non binding)
> > >> >> >
> > >> >> > Thanks.
> > >> >> >
> > >> >> > On Mon, Oct 23, 2023 at 9:48 AM Kamal Chandraprakash
> > >> >> >  wrote:
> > >> >> > >
> > >> >> > > +1 (non-binding). Thanks for the KIP!
> > >> >> > >
> > >> >> > > On Mon, Oct 23, 2023, 12:55 Hailey Ni  >
> > >> >> wrote:
> > >> >> > >
> > >> >> > > > Hi all,
> > >> >> > > >
> > >> >> > > > I'd like to call a vote on KIP-979 that will allow users to
> > >> >> independently
> > >> >> > > > stop KRaft processes.
> > >> >> > > >
> > >> >> > > >
> > >> >> > > >
> > >> >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-979%3A+Allow+independently+stop+KRaft+processes
> > >> >> > > >
> > >> >> > > > Thanks,
> > >> >> > > > Hailey
> > >> >> > > >
> > >> >>
> > >>
> >
>
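
For reference, the JMX lookup Colin sketches in option 1 might look roughly
like this (illustrative only; it assumes remote JMX is enabled on the broker
at port 9999):

    JMXServiceURL url = new JMXServiceURL(
        "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
    try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
        MBeanServerConnection mbs = connector.getMBeanServerConnection();
        // app-info carries the node id; the BrokerState / ClusterId MBeans
        // distinguish broker from controller roles.
        for (ObjectName name : mbs.queryNames(
                new ObjectName("kafka.server:type=app-info,id=*"), null)) {
            System.out.println("node id = " + name.getKeyProperty("id"));
        }
    }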


Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-16 Thread Alieh Saeedi
Hi all,
Thank you for the feedback and the interesting solutions you suggested.

The KIP and the corresponding PR are updated as follows:

   - Snapshot semantics is implemented in the `get(key, fromTime, toTime)`
   method to guarantee consistency.
   - The `ValueIterator` interface is replaced with
   `VersionedRecordIterator<V> extends Iterator<VersionedRecord<V>>`.
   Consequently, the `MultiVersionedKeyQuery` class is modified to
   `MultiVersionedKeyQuery<K, V> implements Query<VersionedRecordIterator<V>>`.
   - The former `get(key, ts)` method is updated so that it returns records
   with their `validTo` timestamps as well. I still kept the old constructor
   of the `VersionedRecord` class with only two parameters. The `validTo`
   field is initialized to `MAX` by default, but I think the default value for
   the `validTo` field must be `PUT_RETURN_CODE_VALID_TO_UNDEFINED`, which
   is defined in the `VersionedKeyValueStore` interface and is actually
   `-1`. Am I right?


Open question:

   - It seems like we do not have a conclusion about the default ordering
   issue.


Cheers,
Alieh

On Thu, Nov 16, 2023 at 9:56 AM Bruno Cadonna  wrote:

> Hi,
>
> 80)
> We do not keep backwards compatibility with IQv1, right? I would even
> say that currently we do not need to keep backwards compatibility among
> IQv2 versions since we marked the API "Evolving" (do we only mean code
> compatibility here, or also behavioral compatibility?). I propose that we
> try not to limit ourselves with backwards compatibility for an API that we
> explicitly marked as evolving.
> I re-read the discussion on KIP-985. In that discussion, we were quite
> focused on what the state store provides. I see that for range queries,
> we have methods on the state store interface that specify the order, but
> that should be kind of orthogonal to the IQv2 query type. Let's assume
> somebody in the future adds a state store implementation that is not
> order based. To account for use cases where the order does not matter,
> this person might also add a method to the state store interface that
> does not guarantee any order. However, our range query type is specified
> to guarantee order by default. So we need to add something like
> withNoOrder() to the query type to allow the use cases that do not
> need order and get better performance in IQ. That does not look very
> nice to me. Having the no-order-guaranteed option does not cost us
> anything and it keeps the IQv2 interface flexible. I assume we want to
> drop the Evolving annotation at some point.
> Sorry for not having brought this up in the discussion about KIP-985.
>
> Best,
> Bruno
>
>
>
>
>
> On 11/15/23 6:56 AM, Matthias J. Sax wrote:
> > Just catching up on this one.
> >
> >
> > 50) I am also in favor of setting `validTo` in VersionedRecord for
> > single-key single-ts lookup; it seems better to return the proper
> > timestamp. The timestamp is already in the store and it's cheap to
> > extract it and add to the result, and it might be valuable information
> > for the user. Not sure if we should deprecate the existing
> > constructor though, because for "latest" it's convenient to have?
> >
> >
> > 60) Yes, I meant `VersionedRecord`. Sorry for the mixup.
> >
> >
> > 80) We did discuss this question on KIP-985 (maybe you missed it Bruno).
> > It's kinda tricky.
> >
> > Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
> > provide a clear contract that `range()` is ascending and
> > `reverseRange()` is descending.
> >
> > For `RangeQuery`, the question is, if we did implicitly inherit this
> > contract? Our conclusion on KIP-985 discussion was, that we did inherit
> > it. If this holds true, changing the contract would be a breaking change
> > (what might still be acceptable, given that the interface is annotated
> > as unstable, and that IQv2 is not widely adopted yet). I am happy to go
> > with the 3-option contract, but just want to ensure we all agree it's
> > the right way to go, and we are potentially willing to pay the price of
> > backward incompatibility.
> >
> >
> >
> >> Do we need a snapshot semantic or can we specify a weaker but still
> >> useful semantic?
> >
> > I don't think we necessarily need it, but as pointed out by Lucas, all
> > existing queries provide it. Overall, my main point is really about not
> > implementing something "random", but defining a proper binding contract
> > that allows users to reason about it.
> >
> > In general, I agree that weaker semantics might be sufficient, but I am
> > not sure if we can implement anything weaker in a reasonable way? Happy
> > to be convinced otherwise. (I have some example, that I will omit for
> > now, as I hope we can actually go with snapshot semantics.)
> >
> > The RocksDB Snapshot idea from Lucas sounds very promising. I was not
> > aware that we could do this with RocksDB (otherwise I might have
> > suggested it on the PR right away). I guess the only open question
> > remaining would be, if we can provide the same guarantees for a future
> > in-memory implementation for VersionedStores?

[jira] [Resolved] (KAFKA-15481) Concurrency bug in RemoteIndexCache leads to IOException

2023-11-16 Thread Divij Vaidya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Divij Vaidya resolved KAFKA-15481.
--
Resolution: Fixed

> Concurrency bug in RemoteIndexCache leads to IOException
> 
>
> Key: KAFKA-15481
> URL: https://issues.apache.org/jira/browse/KAFKA-15481
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.6.0
>Reporter: Divij Vaidya
>Assignee: Jeel Jotaniya
>Priority: Major
> Fix For: 3.7.0, 3.6.1
>
>
> RemoteIndexCache has a concurrency bug which leads to an IOException while 
> fetching data from the remote tier.
> The events below are in timeline order:
> Thread 1 (cache thread): invalidates the entry; the removalListener is invoked 
> asynchronously, so the files have not yet been renamed with the "deleted" suffix.
> Thread 2 (fetch thread): tries to find the entry in the cache, doesn't find it 
> because it has been removed by Thread 1, fetches the entry from S3, and writes 
> it to the existing file (using replace-existing).
> Thread 1: the async removalListener is invoked, acquires a lock on the old 
> entry (which has been removed from the cache), renames the file with the 
> "deleted" suffix, and starts deleting it.
> Thread 2: tries to create the in-memory/mmapped index, but doesn't find the 
> file and hence creates a new file of size 2GB in the AbstractIndex constructor. 
> The JVM returns an error as it won't allow creation of a 2GB random-access file.
> *Potential Fix*
> Use EvictionListener instead of RemovalListener in Caffeine cache as per the 
> documentation:
> {quote} When the operation must be performed synchronously with eviction, use 
> {{Caffeine.evictionListener(RemovalListener)}} instead. This listener will 
> only be notified when {{RemovalCause.wasEvicted()}} is true. For an explicit 
> removal, {{Cache.asMap()}} offers compute methods that are performed 
> atomically.{quote}
> This will ensure that removal from the cache and marking the file with the 
> delete suffix are done synchronously, so the above race condition will not occur.
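
For illustration, a minimal sketch of the proposed pattern, assuming a Caffeine 
cache keyed by segment id; the Entry type and markForCleanup method are 
illustrative stand-ins, not the actual RemoteIndexCache internals:

{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;

public class EvictionListenerSketch {
    static class Entry {
        void markForCleanup() { /* rename index files with the ".deleted" suffix */ }
    }

    public static void main(String[] args) {
        Cache<Long, Entry> cache = Caffeine.newBuilder()
                .maximumSize(1024)
                // Runs synchronously with the eviction, so the files are renamed
                // before any other thread can observe the cache miss.
                .evictionListener((Long id, Entry entry, RemovalCause cause) -> {
                    if (entry != null) entry.markForCleanup();
                })
                .build();

        cache.put(42L, new Entry());
        // For explicit removals, Cache.asMap() compute methods are atomic:
        cache.asMap().computeIfPresent(42L, (id, entry) -> {
            entry.markForCleanup();
            return null; // returning null removes the mapping atomically
        });
    }
}
{code}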



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15845) Add Junit5 test extension which detects leaked Kafka clients and servers

2023-11-16 Thread Greg Harris (Jira)
Greg Harris created KAFKA-15845:
---

 Summary: Add Junit5 test extension which detects leaked Kafka 
clients and servers
 Key: KAFKA-15845
 URL: https://issues.apache.org/jira/browse/KAFKA-15845
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Greg Harris
Assignee: Greg Harris


We have many tests which accidentally leak Kafka clients and servers. This 
contributes to test flakiness and build instability.

We should use a test extension to make it easier to find these leaked clients 
and servers, and force test-implementors to resolve their resource leaks prior 
to merge.
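
A minimal sketch of one possible shape for such an extension, scanning live 
threads for well-known client thread-name markers after each test; the 
thread-name patterns and detection strategy here are assumptions, not the 
implementation this ticket will land:

{code:java}
import java.util.Set;
import org.junit.jupiter.api.extension.AfterEachCallback;
import org.junit.jupiter.api.extension.ExtensionContext;

public class LeakedClientExtension implements AfterEachCallback {
    // Substrings that appear in the names of client background threads.
    private static final Set<String> MARKERS = Set.of(
            "kafka-producer-network-thread",
            "kafka-coordinator-heartbeat-thread",
            "kafka-admin-client-thread");

    @Override
    public void afterEach(ExtensionContext context) {
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            for (String marker : MARKERS) {
                if (thread.isAlive() && thread.getName().contains(marker)) {
                    throw new AssertionError("Leaked client thread after test "
                            + context.getDisplayName() + ": " + thread.getName());
                }
            }
        }
    }
}
{code}

A test class would opt in with @ExtendWith(LeakedClientExtension.class).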



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Request permissions to contribute to Apache Kafka

2023-11-16 Thread shang xinli
I would like to contribute to Apache Kafka. Could you please review and approve 
the request for permission to contribute to Apache Kafka with the following 
Wiki & Jira ID?

Wiki ID: shangxinli
Jira ID: shangxinli


Xinli Shang


[jira] [Created] (KAFKA-15846) Review consumer leave group request

2023-11-16 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15846:
--

 Summary: Review consumer leave group request
 Key: KAFKA-15846
 URL: https://issues.apache.org/jira/browse/KAFKA-15846
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans


The new consumer sends out a leave group request with a best-effort approach. 
It transitions to LEAVING to indicate to the HB manager that the request must 
be sent, but it does not do any response handling or retrying. After the first 
HB manager poll iteration while on LEAVING, the consumer transitions into 
UNSUBSCRIBE (regardless of whether the request was actually sent out, e.g. due 
to the coordinator not being known). Review whether this is good enough as an 
effort to send the request.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #115

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 309249 lines...]
[Gradle test log elided: RocksDBBlockCacheMetricsTest and RocksDBMetricsRecorderTest runs in :streams:test; all listed tests PASSED]

[jira] [Created] (KAFKA-15847) Allow to resolve client metadata for specific topics

2023-11-16 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15847:
--

 Summary: Allow to resolve client metadata for specific topics
 Key: KAFKA-15847
 URL: https://issues.apache.org/jira/browse/KAFKA-15847
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Lianet Magrans


Currently, metadata updates requested through the Metadata object request 
metadata for all topics. Consider allowing the partial updates that are 
already expressed as an intention in the Metadata class but are not fully 
supported.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15848) Consumer API timeout inconsistent between LegacyKafkaConsumer and AsyncKafkaConsumer

2023-11-16 Thread Kirk True (Jira)
Kirk True created KAFKA-15848:
-

 Summary: Consumer API timeout inconsistent between 
LegacyKafkaConsumer and AsyncKafkaConsumer
 Key: KAFKA-15848
 URL: https://issues.apache.org/jira/browse/KAFKA-15848
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Reporter: Kirk True






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #116

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 308710 lines...]

[Gradle test log elided: KafkaZkClientTest runs in :core:test; all listed tests PASSED]

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-16 Thread Matthias J. Sax

Thanks, Alieh.

Overall SGTM. About `validTo` -- wondering if we should make it an 
`Optional` and set it to `empty()` by default?


I am totally ok with going with the 3-way option about ordering using 
default "undefined". For this KIP (as it's all net new) nothing really 
changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.


Btw: so far we focused on key-ordering, but I believe the same "ordering 
undefined by default" would apply to time-ordering, too? This might 
affect KIP-997, too.
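
To make the `Optional` idea concrete, a minimal sketch of a 
VersionedRecord-like value class where `validTo` defaults to `empty()`; the 
shape is illustrative only, not the final API:

    import java.util.Objects;
    import java.util.Optional;

    public final class VersionedRecordSketch<V> {
        private final V value;
        private final long timestamp;
        private final Optional<Long> validTo;

        // "latest" record: validTo is still open, so it defaults to empty()
        public VersionedRecordSketch(V value, long timestamp) {
            this(value, timestamp, Optional.empty());
        }

        public VersionedRecordSketch(V value, long timestamp, Optional<Long> validTo) {
            this.value = Objects.requireNonNull(value);
            this.timestamp = timestamp;
            this.validTo = Objects.requireNonNull(validTo);
        }

        public Optional<Long> validTo() {
            return validTo;
        }
    }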



-Matthias

On 11/16/23 12:51 AM, Bruno Cadonna wrote:

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even 
say that currently we do not need to keep backwards compatibility among 
IQv2 versions since we marked the API "Evolving" (do we only mean code 
compatibility here or also behavioral compatibility?). I propose that we 
try not to limit ourselves with backwards compatibility for an API that we 
explicitly marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite 
focused on what the state store provides. I see that for range queries, 
we have methods on the state store interface that specify the order, but 
that should be kind of orthogonal to the IQv2 query type. Let's assume 
somebody in the future adds a state store implementation that is not 
order based. To account for use cases where the order does not matter, 
this person might also add a method to the state store interface that 
does not guarantee any order. However, our range query type is specified 
to guarantee order by default. So we need to add something like 
withNoOrder() to the query type to allow the use cases that do not 
need order and get better performance in IQ. That does not look very 
nice to me. Having the no-order-guaranteed option does not cost us 
anything and it keeps the IQv2 interface flexible. I assume we want to 
drop the Evolving annotation at some point.

Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for 
single-key single-ts lookup; it seems better to return the proper 
timestamp. The timestamp is already in the store and it's cheap to 
extract it and add to the result, and it might be valuable information 
for the user. Not sure if we should deprecate the existing 
constructor though, because for "latest" it's convenient to have?



60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it 
Bruno). It's kinda tricky.


Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces 
provide a clear contract that `range()` is ascending and 
`reverseRange()` is descending.


For `RangeQuery`, the question is, if we did implicitly inherit this 
contract? Our conclusion on KIP-985 discussion was, that we did 
inherit it. If this holds true, changing the contract would be a 
breaking change (what might still be acceptable, given that the 
interface is annotated as unstable, and that IQv2 is not widely 
adopted yet). I am happy to go with the 3-option contract, but just 
want to ensure we all agree it's the right way to go, and we are 
potentially willing to pay the price of backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still 
useful semantic? 


I don't think we necessarily need it, but as pointed out by Lucas, all 
existing queries provide it. Overall, my main point is really about 
not implementing something "random", but defining a proper binding 
contract that allows users to reason about it.


In general, I agree that weaker semantics might be sufficient, but I am 
not sure if we can implement anything weaker in a reasonable way? 
Happy to be convinced otherwise. (I have some example, that I will 
omit for now, as I hope we can actually go with snapshot semantics.)


The RocksDB Snapshot idea from Lucas sounds very promising. I was not 
aware that we could do this with RocksDB (otherwise I might have 
suggested it on the PR right away). I guess the only open question 
remaining would be, if we can provide the same guarantees for a future 
in-memory implementation for VersionedStores? It sounds possible to 
do, but we should have some level of confidence about it?



-Matthias

On 11/14/23 6:33 AM, Lucas Brutschy wrote:

Hi Alieh,

I agree with Bruno that a weaker guarantee could be useful already,
and it's anyway possible to strengthen the guarantees later on. On the
other hand, it would be nice to provide a consistent view across all
segments if it doesn't come with major downsides, because until now IQ
does provide a consistent view (via iterators), and this would be the
first feature that diverges from that guarantee.

I think a consistent view could be achieved relatively easily by creating a
snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot) that will
ens
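
For readers unfamiliar with the RocksDB feature referenced here, a minimal 
sketch of reading through a snapshot with the RocksJava API; the store path 
and data are illustrative, while the snapshot calls are the standard API:

    import org.rocksdb.Options;
    import org.rocksdb.ReadOptions;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksIterator;
    import org.rocksdb.Snapshot;

    public class SnapshotReadSketch {
        public static void main(String[] args) throws Exception {
            RocksDB.loadLibrary();
            try (Options options = new Options().setCreateIfMissing(true);
                 RocksDB db = RocksDB.open(options, "/tmp/snapshot-demo")) {
                db.put("k".getBytes(), "v1".getBytes());
                Snapshot snapshot = db.getSnapshot();
                try (ReadOptions readOptions = new ReadOptions().setSnapshot(snapshot)) {
                    db.put("k".getBytes(), "v2".getBytes()); // written after the snapshot
                    try (RocksIterator it = db.newIterator(readOptions)) {
                        for (it.seekToFirst(); it.isValid(); it.next()) {
                            // Prints "k -> v1": reads are pinned to the snapshot.
                            System.out.println(new String(it.key()) + " -> " + new String(it.value()));
                        }
                    }
                } finally {
                    db.releaseSnapshot(snapshot);
                }
            }
        }
    }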

Re: [DISCUSS] KIP-997 Support fetch(fromKey, toKey, from, to) to WindowRangeQuery and unify WindowKeyQuery and WindowRangeQuery

2023-11-16 Thread Matthias J. Sax

Thanks for the KIP.

Given how `WindowRangeQuery` works right now, it's really time to 
improve it.



1) Agree. It's not clear what will be added right now. I think we should 
deprecate the existing `getKey()` w/o an actual replacement? `getFromKey` and 
`getToKey` should actually be `lowerKeyBound()` and `upperKeyBound()` to 
align with KIP-969?


Also wondering if we should deprecate existing `withKey()` and 
`withWindowStartRange`? `withKey` only works for SessionStores and 
implements a single-key full-time-range query. Similarly, 
`withWindowStartRange` only works for WindowedStore and implements an 
all-key time-range query. Thus, both are rather special and it seems the 
aim of this KIP is to generalize `WindowRangeQuery` to arbitrary 
key-range/time-range queries?


This raises one question about time-range semantics, given that we query 
windows with different semantics.


 - The current `WindowStore` semantics used for 
`WindowRangeQuery#withWindowStartRange` consider only the window 
start time, i.e., the window-start time must fall into the query 
time-range to be returned.


 - In contrast, `SessionStore` time ranges based on `findSession` use 
earliest-session-end-time and latest-session-end-time and thus implement 
a "window-bounds / search-time-range overlap query".


Is there any concern about semantic differences? It would also be 
possible to use the same semantics for both query types, and maybe even 
let the user pick which semantics they want (letting users choose might 
actually be the best thing to do)? -- We can also do this incrementally, 
and limit the scope of this KIP (or keep the full KIP scope but 
implement it incrementally only).


Btw: I think we should not add any ordering at this point, and 
explicitly state that no ordering is guaranteed whatsoever at this point.




2) Agreed. We should deprecate `getFromTime` and `getToTime` and add new 
methods `fromTime` and `toTime`.




3) About the API. If we move forward with general key-range/time-range 
queries, I agree that a more modular approach might be nice. Not sure right 
now what the best approach would be for this? Looking into KIP-969, we might 
want to have:


 - static withKeyRange
 - static withLowerKeyBound
 - static withUpperKeyBound
 - static withAllKeys (KIP-969 actually uses `allKeys` ?)
 - fromTime
 - toTime

with the default time range being "all / unbounded"?



10: you mentioned that `WindowKeyQuery` functionality can be covered by 
`WindowRangeQuery`. I agree. For this case, it seems we want to 
deprecate `WindowKeyQuery` entirely?




-Matthias

On 11/16/23 1:19 AM, Bruno Cadonna wrote:

Hi Hanyu,

Thanks for the KIP!

1)
Could you please mark the pieces that you want to add to the API in the 
code listing in the KIP? You can add a comment like "// newly added" or 
similar. That would make reading the KIP a bit easier because one does 
not need to compare your code with the code in the current codebase.


2)
Could you -- as a side cleanup -- also change the getters to not use the 
get-prefix anymore, please? That was apparently an oversight when those 
methods were added. Although the API is marked as Evolving, I think we 
should still deprecate the getX() methods, since it does not cost us 
anything.


3)
I propose to make the API a bit more fluent. For example, something like

WindowRangeQuery.withKey(key).fromTime(t1).toTime(t2)

and

WindowRangeQuery.withAllKeys().fromTime(t1).toTime(t2)

and

WindowRangeQuery.withKeyRange(key1, key2).fromTime(t1).toTime(t2)

and maybe, in addition to the above, also add the option to start 
with the time range


WindowRangeQuery.withWindowStartRange(t1, t2).fromKey(key1).toKey(key2)
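
To make the shape concrete, a sketch of how such a fluent query might be 
built and run through the existing IQv2 entry point; the 
withKeyRange/fromTime/toTime chain is the proposal under discussion and does 
not compile against current Kafka Streams:

    // hypothetical fluent construction, per this discussion
    WindowRangeQuery<String, Long> query =
        WindowRangeQuery.withKeyRange("a", "f")
                        .fromTime(Instant.ofEpochMilli(0))
                        .toTime(Instant.now());

    // executed via the existing IQv2 entry point
    StateQueryRequest<KeyValueIterator<Windowed<String>, Long>> request =
        StateQueryRequest.inStore("my-window-store").withQuery(query);
    StateQueryResult<KeyValueIterator<Windowed<String>, Long>> result =
        kafkaStreams.query(request);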


4)
Could you also add some usage examples? Alieh did quite a nice job 
regarding usage examples in KIP-986.



Best,
Bruno

On 11/8/23 8:02 PM, Hanyu (Peter) Zheng wrote:

Hello everyone,

I would like to start the discussion for KIP-997: Support fetch(fromKey,
toKey, from, to) to WindowRangeQuery and unify WindowKeyQuery and
WindowRangeQuery
The KIP can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-997%3A++Support+fetch%28fromKey%2C+toKey%2C+from%2C+to%29+to+WindowRangeQuery+and+unify+WindowKeyQuery+and+WindowRangeQuery

Any suggestions are more than welcome.

Many thanks,
Hanyu

On Wed, Nov 8, 2023 at 10:38 AM Hanyu (Peter) Zheng 
wrote:



https://cwiki.apache.org/confluence/display/KAFKA/KIP-997%3A++Support+fetch%28fromKey%2C+toKey%2C+from%2C+to%29+to+WindowRangeQuery+and+unify+WindowKeyQuery+and+WindowRangeQuery

--

Hanyu (Peter) Zheng he/him/his
Software Engineer Intern
+1 (213) 431-7193


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2389

2023-11-16 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-15849) Fix ListGroups API when runtime partition size is zero

2023-11-16 Thread Dongnuo Lyu (Jira)
Dongnuo Lyu created KAFKA-15849:
---

 Summary: Fix ListGroups API when runtime partition size is zero
 Key: KAFKA-15849
 URL: https://issues.apache.org/jira/browse/KAFKA-15849
 Project: Kafka
  Issue Type: Sub-task
Reporter: Dongnuo Lyu
Assignee: Dongnuo Lyu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.6 #117

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 309159 lines...]
[Gradle test log elided: RocksDBBlockCacheMetricsTest and RocksDBMetricsRecorderTest runs in :streams:test; all listed tests PASSED]

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2390

2023-11-16 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 325483 lines...]

[Gradle test log elided: DefaultTaskManagerTest and RocksDBBlockCacheMetricsTest runs in :streams:test; all listed tests PASSED]

Re: [DISCUSS] KIP-994: Minor Enhancements to ListTransactions and DescribeTransactions APIs

2023-11-16 Thread Artem Livshits
Hi Raman,

I see that you've updated the KIP.  The content looks good to me.

A couple nits on the format:
- can you highlight which fields are new in the message?
- can you add your original proposal of using a tagged field in
ListTransactionsRequest to the list of rejected alternatives?

-Artem

On Tue, Nov 7, 2023 at 11:04 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Artem,
> I think you make a very good point. This also looks to me like it deserves
> a version bump for the request.
>
> Andrew
>
> > On 8 Nov 2023, at 04:12, Artem Livshits 
> wrote:
> >
> > Hi Raman,
> >
> > Thank you for the KIP.  I think using the tagged field
> > in DescribeTransactionsResponse should be good -- if either the client or
> > the server doesn't support it, it's not printed, which is reasonable
> > behavior.
> >
> > For the ListTransactionsRequest, though, I think using the tagged field
> > could lead to a subtle compatibility issue if a new client is used with an
> > old server: the client could specify the DurationFilter, but the old server
> > would ignore it and list all transactions instead, which could be
> > misleading or potentially even dangerous if the results are used in a
> > script for some automation.  I think a more desirable behavior would be to
> > fail if the server doesn't support the new filter, which we should be able
> > to achieve if we bump the version of the ListTransactionsRequest and add
> > DurationFilter as a regular field.
> >
> > -Artem
> >
> > On Tue, Nov 7, 2023 at 2:20 AM Raman Verma 
> wrote:
> >
> >> I would like to start a discussion on KIP-994
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-994%3A+Minor+Enhancements+to+ListTransactions+and+DescribeTransactions+APIs
> >>
>
>


[jira] [Created] (KAFKA-15850) Improve KafkaMetrics APIs to expose Value Provider information

2023-11-16 Thread Apoorv Mittal (Jira)
Apoorv Mittal created KAFKA-15850:
-

 Summary: Improve KafkaMetrics APIs to expose Value Provider 
information
 Key: KAFKA-15850
 URL: https://issues.apache.org/jira/browse/KAFKA-15850
 Project: Kafka
  Issue Type: Sub-task
Reporter: Apoorv Mittal
Assignee: Apoorv Mittal


Today KafkaMetric does not expose the metricValueProvider information, which 
makes it hard to determine the provider type. In the KIP-714 implementation we 
used reflection to access the information, but we would like to propose a 
proper way of exposing it.

Discussion thread: 
https://github.com/apache/kafka/pull/14620#discussion_r1374059672
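
For illustration, the kind of reflection workaround this refers to; the 
private field name `metricValueProvider` is taken from the current KafkaMetric 
source but should be treated as an assumption, since it is not public API:

{code:java}
import java.lang.reflect.Field;
import org.apache.kafka.common.metrics.KafkaMetric;

public class MetricProviderIntrospection {
    // Fragile by design: it reaches into a private field, which is exactly
    // why a public accessor is being proposed.
    public static Object valueProviderOf(KafkaMetric metric) throws ReflectiveOperationException {
        Field field = KafkaMetric.class.getDeclaredField("metricValueProvider");
        field.setAccessible(true);
        return field.get(metric);
    }
}
{code}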



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15851) broker under replicated due to error java.nio.BufferOverflowException

2023-11-16 Thread wangliucheng (Jira)
wangliucheng created KAFKA-15851:


 Summary: broker under replicated due to error 
java.nio.BufferOverflowException
 Key: KAFKA-15851
 URL: https://issues.apache.org/jira/browse/KAFKA-15851
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.3.2
 Environment: Kafka Version: 3.3.2

Deployment mode: zookeeper

Reporter: wangliucheng
 Attachments: p1.png, server.log

In my Kafka cluster, Kafka was updated from version 2.0 to 3.3.2.

*First*, startup failed because the same directory was configured.

The error is as follows:

 
{code:java}
[2023-11-16 10:04:09,952] ERROR (main kafka.Kafka$ 159) Exiting Kafka due to 
fatal exception during startup.
java.lang.IllegalStateException: Duplicate log directories for 
skydas_sc_tdevirsec-12 are found in both 
/data01/kafka/log/skydas_sc_tdevirsec-12 and 
/data07/kafka/log/skydas_sc_tdevirsec-12. It is likely because log directory 
failure happened while broker was replacing current replica with future 
replica. Recover broker from this failure by manually deleting one of the two 
directories for this partition. It is recommended to delete the partition in 
the log directory that is known to have failed recently.
        at kafka.log.LogManager.loadLog(LogManager.scala:305)
        at kafka.log.LogManager.$anonfun$loadLogs$14(LogManager.scala:403)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2023-11-16 10:04:09,953] INFO (kafka-shutdown-hook kafka.server.KafkaServer 
66) [KafkaServer id=1434] shutting down {code}
 

 

*Second*, after removing /data07/kafka/log from log.dirs and starting Kafka, 
it also reported an error:

 
{code:java}
[2023-11-16 10:13:10,713] INFO (ReplicaFetcherThread-3-1008 
kafka.log.UnifiedLog 66) [UnifiedLog partition=ty_udp_full-60, 
dir=/data04/kafka/log] Rolling new log segment (log_size = 
755780551/1073741824}, offset_index_size = 2621440/2621440, time_index_size = 
1747626/1747626, inactive_time_ms = 2970196/60480).
[2023-11-16 10:13:10,714] ERROR (ReplicaFetcherThread-3-1008 
kafka.server.ReplicaFetcherThread 76) [ReplicaFetcher replicaId=1434, 
leaderId=1008, fetcherId=3] Unexpected error occurred while processing data for 
partition ty_udp_full-60 at offset 2693467479
java.nio.BufferOverflowException
        at java.nio.Buffer.nextPutIndex(Buffer.java:555)
        at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:794)
        at kafka.log.TimeIndex.$anonfun$maybeAppend$1(TimeIndex.scala:135)
        at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:114)
        at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:510)
        at kafka.log.LocalLog.$anonfun$roll$9(LocalLog.scala:529)
        at kafka.log.LocalLog.$anonfun$roll$9$adapted(LocalLog.scala:529)
        at scala.Option.foreach(Option.scala:437)
        at kafka.log.LocalLog.$anonfun$roll$2(LocalLog.scala:529)
        at kafka.log.LocalLog.roll(LocalLog.scala:786)
        at kafka.log.UnifiedLog.roll(UnifiedLog.scala:1537)
        at kafka.log.UnifiedLog.maybeRoll(UnifiedLog.scala:1523)
        at kafka.log.UnifiedLog.append(UnifiedLog.scala:919)
        at kafka.log.UnifiedLog.appendAsFollower(UnifiedLog.scala:778)
        at 
kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:1121)
        at 
kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:1128)
        at 
kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:121)
        at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:336)
        at scala.Option.foreach(Option.scala:437)
        at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:325)
        at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:324)
        at 
kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
        at 
scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
        at 
scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
        at 
scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
        at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:324)
        at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:124)
        at 
kafka.server.AbstractFetcherThread.$anonfun$may

Re: Request permissions to contribute to Apache Kafka

2023-11-16 Thread Josep Prat
Hi there!

Thanks for your interest in Apache Kafka. Your accounts are all set, let me
know if you have any questions.

Best,

On Thu, Nov 16, 2023 at 10:46 PM shang xinli  wrote:

> I would like to contribute to Apache Kafka. Could you please review and
> approve the request for permission to contribute to Apache Kafka with the
> following Wiki & Jira ID?
>
> Wiki ID: shangxinli
> Jira ID: shangxinli
>
>
> Xinli Shang
>


-- 

*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B