[jira] [Updated] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft

2024-11-02 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17925:
--
Labels: integration-test  (was: )

> Convert Kafka Client integration tests to use KRaft
> ---
>
> Key: KAFKA-17925
> URL: https://issues.apache.org/jira/browse/KAFKA-17925
> Project: Kafka
>  Issue Type: Task
>  Components: clients
>Affects Versions: 4.0.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Blocker
>  Labels: integration-test
>
> Update quota, truncation, and client compatibility tests to use KRaft. Tests 
> that do not inject a metadata_quorum argument default to using ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft

2024-11-02 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17925:
--
Fix Version/s: 4.0.0

> Convert Kafka Client integration tests to use KRaft
> ---
>
> Key: KAFKA-17925
> URL: https://issues.apache.org/jira/browse/KAFKA-17925
> Project: Kafka
>  Issue Type: Task
>  Components: clients
>Affects Versions: 4.0.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Blocker
>  Labels: integration-test
> Fix For: 4.0.0
>
>
> Update quota, truncation, and client compatibility tests to use KRaft. Tests 
> that do not inject a metadata_quorum argument default to using ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft

2024-11-02 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17925:
--
Component/s: (was: system tests)

> Convert Kafka Client integration tests to use KRaft
> ---
>
> Key: KAFKA-17925
> URL: https://issues.apache.org/jira/browse/KAFKA-17925
> Project: Kafka
>  Issue Type: Task
>  Components: clients
>Affects Versions: 4.0.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Blocker
>
> Update quota, truncation, and client compatibility tests to use KRaft. Tests 
> that do not inject a metadata_quorum argument default to using ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft

2024-11-02 Thread Kirk True (Jira)
Kirk True created KAFKA-17925:
-

 Summary: Convert Kafka Client integration tests to use KRaft
 Key: KAFKA-17925
 URL: https://issues.apache.org/jira/browse/KAFKA-17925
 Project: Kafka
  Issue Type: Task
  Components: clients, system tests
Affects Versions: 4.0.0
Reporter: Kirk True
Assignee: Kirk True


Update quota, truncation, and client compatibility tests to use KRaft. Tests 
that do not inject a metadata_quorum argument default to using ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17915) Convert Kafka Client system tests to use KRaft

2024-11-02 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-17915:
-

Assignee: Kirk True

> Convert Kafka Client system tests to use KRaft
> --
>
> Key: KAFKA-17915
> URL: https://issues.apache.org/jira/browse/KAFKA-17915
> Project: Kafka
>  Issue Type: Task
>  Components: clients, system tests
>Affects Versions: 4.0.0
>Reporter: Kevin Wu
>Assignee: Kirk True
>Priority: Blocker
>
> Update quota, truncation, and client compatibility tests to use KRaft. Tests 
> that do not inject a metadata_quorum argument default to using ZooKeeper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16985) Ensure consumer attempts to send leave request on close even if interrupted

2024-11-01 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16985:
--
Summary: Ensure consumer attempts to send leave request on close even if 
interrupted  (was: Ensure consumer sends leave request on close even if 
interrupted)

> Ensure consumer attempts to send leave request on close even if interrupted
> ---
>
> Key: KAFKA-16985
> URL: https://issues.apache.org/jira/browse/KAFKA-16985
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.8.0
>Reporter: Lianet Magrans
>Assignee: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> While running some stress tests we found that consumers were not leaving the 
> group if interrupted while closing (leading to members eventually getting 
> fenced after the session expired). On close, the consumer generates an 
> Unsubscribe event to be handled in the background, but we noticed that the 
> network thread failed with the interruption, seemingly not sending the unsent 
> requests. We should review this to ensure that a member does a clean leave, 
> notifying the coordinator with a leave HB, even if in a fire-and-forget mode 
> in the case of interruption (and validate the legacy consumer behaviour in 
> this scenario). (Still under investigation; I'll update with more info as I 
> discover it.)
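The scenario above can be exercised with a small driver; a minimal sketch, assuming a reachable broker at localhost:9092, a topic named example-topic, and a group id example-group (all placeholders):

{code:java}
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class InterruptedCloseExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");           // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("example-topic")); // placeholder topic
        consumer.poll(Duration.ofSeconds(1)); // join the group

        // Close on a separate thread and interrupt it mid-close, mimicking the
        // stress-test interruption; the member should still attempt the leave HB.
        Thread closer = new Thread(() -> consumer.close(Duration.ofSeconds(30)));
        closer.start();
        closer.interrupt();
        closer.join();
    }
}
{code}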



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17821) the set of configs displayed by `logAll` could be invalid

2024-10-30 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17821:
--
Component/s: clients
 consumer

> the set of configs displayed by `logAll` could be invalid
> -
>
> Key: KAFKA-17821
> URL: https://issues.apache.org/jira/browse/KAFKA-17821
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: 黃竣陽
>Priority: Major
>
> see https://github.com/apache/kafka/pull/16899#discussion_r1743632489



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-14517) Implement regex subscriptions

2024-10-25 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-14517:
--
Fix Version/s: 4.0.0

> Implement regex subscriptions
> -
>
> Key: KAFKA-14517
> URL: https://issues.apache.org/jira/browse/KAFKA-14517
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848-preview
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17861) Serialize with ByteBuffer instead of byte[]

2024-10-25 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17861:
--
Labels: kip-required  (was: )

> Serialize with ByteBuffer instead of byte[]
> ---
>
> Key: KAFKA-17861
> URL: https://issues.apache.org/jira/browse/KAFKA-17861
> Project: Kafka
>  Issue Type: Wish
>  Components: clients, producer 
>Affects Versions: 3.3.2
>Reporter: Sarah Hennenkamp
>Priority: Minor
>  Labels: kip-required
>
> This is a request to consider changing the return value of the 
> [Serializer|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/serialization/Serializer.java]
>  from a byte[] to a ByteBuffer. 
>  
> Understandably folks may balk at this since it's a large lift. However, we've 
> noticed a good chunk of memory allocation in our application comes from the 
> [KafkaProducer 
> serializing|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1045]
>  the key and value pair. Using ByteBuffer could allow this to be off-heap and 
> save on garbage collection time.
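For illustration, a minimal sketch of what a ByteBuffer-returning serializer could look like; the BufferSerializer name and shape are hypothetical, not part of the Kafka API or any KIP:

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical variant of org.apache.kafka.common.serialization.Serializer
// that returns a ByteBuffer, letting implementations hand back pooled or
// direct (off-heap) buffers instead of allocating a fresh byte[] per record.
public interface BufferSerializer<T> {
    ByteBuffer serialize(String topic, T data);

    // Example implementation for strings that writes into a direct buffer.
    static BufferSerializer<String> directStringSerializer() {
        return (topic, data) -> {
            byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
            ByteBuffer buffer = ByteBuffer.allocateDirect(bytes.length);
            buffer.put(bytes);
            buffer.flip();
            return buffer;
        };
    }
}
{code}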



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17862) [buffer pool] corruption during buffer reuse from the pool

2024-10-25 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17862:
--
Component/s: clients
 producer 

> [buffer pool] corruption during buffer reuse from the pool
> --
>
> Key: KAFKA-17862
> URL: https://issues.apache.org/jira/browse/KAFKA-17862
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, core, producer 
>Affects Versions: 3.7.1
>Reporter: Bharath Vissapragada
>Priority: Major
> Attachments: client-config.txt
>
>
> We noticed malformed batches from the Kafka Java client + Redpanda under 
> certain conditions that caused excessive client retries and we narrowed it 
> down to a client bug related to corruption of buffers reused from the buffer 
> pool. We were able to reproduce it with Kafka brokers too, so we are fairly 
> certain the bug is on the client.
> (Attached the full client config, fwiw)
> We narrowed it down to a race condition between produce requests and failed 
> batch expiration. If the network flush of a produce request races with the 
> expiration, the produce batch that the request uses is corrupted, so a 
> malformed batch is sent to the broker.
> The expiration is triggered by a timeout 
> [https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L392C13-L392C22]
> that eventually deallocates the batch
> [https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L773]
> adding it back to the buffer pool
> [https://github.com/apache/kafka/blob/661bed242e8d7269f134ea2f6a24272ce9b720e9/clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java#L1054]
> By then the buffer is probably all zeroed out, or a competing producer 
> requests a new append that reuses this freed-up buffer and starts writing to 
> it, corrupting its contents.
> If there is a racing network flush of a produce batch backed by this buffer, 
> a corrupt batch is sent to the broker, resulting in a CRC mismatch. 
> This issue can be easily reproduced in a simulated environment that triggers 
> frequent timeouts (e.g., lower timeouts) combined with a producer with 
> high-ish throughput that causes longer queues (hence higher chances of 
> expiration) and frequent buffer reuse from the pool (a deadly combination :))
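A self-contained sketch of the hazard described above (not Kafka's actual buffer-pool code): a sender flushes a batch backed by a pooled buffer while an expirer returns that buffer to the pool and a competing append overwrites it:

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Simulated buffer-pool reuse race: the "sender" reads the batch's buffer
// while the "expirer" has already returned it to the pool and a second
// append is writing new data into the same backing buffer.
public class BufferReuseRace {
    public static void main(String[] args) throws Exception {
        BlockingQueue<ByteBuffer> pool = new ArrayBlockingQueue<>(1);

        ByteBuffer batch = ByteBuffer.allocate(16);
        batch.put("batch-payload-1!".getBytes(StandardCharsets.UTF_8));
        batch.flip();

        Thread sender = new Thread(() -> {
            // Simulated slow network flush: read the batch byte by byte.
            byte[] wire = new byte[batch.limit()];
            for (int i = 0; i < wire.length; i++) {
                wire[i] = batch.get(i); // absolute get; position unchanged
                try { Thread.sleep(1); } catch (InterruptedException ignored) { }
            }
            // May print a mix of payload 1 and 2: a "corrupt" batch on the wire.
            System.out.println("sent: " + new String(wire, StandardCharsets.UTF_8));
        });

        Thread expirer = new Thread(() -> {
            pool.offer(batch);               // "deallocate": return buffer to pool
            ByteBuffer reused = pool.poll(); // a competing append grabs it...
            if (reused != null) {
                reused.clear();
                // ...and overwrites it mid-send.
                reused.put("batch-payload-2!".getBytes(StandardCharsets.UTF_8));
            }
        });

        sender.start();
        expirer.start();
        sender.join();
        expirer.join();
    }
}
{code}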



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol

2024-10-24 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892638#comment-17892638
 ] 

Kirk True commented on KAFKA-15284:
---

[~yangpoan]—sorry, I am working on this but forgot to update the status. Thanks!

> Implement GroupProtocolResolver to dynamically determine consumer group 
> protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the 
> {{{}GroupProtocolResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fall-back path in case the client is not eligible to use the 
> protocol
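A hedged sketch of the resolution logic the ticket describes; the method shape and the config lookup are illustrative, not the actual clients code:

{code:java}
import java.util.Locale;
import java.util.Map;

// Illustrative only: the real ConsumerDelegate selection lives in the Kafka
// clients module; the names below mirror the ticket, not the implementation.
public class GroupProtocolResolverSketch {
    enum GroupProtocol { CLASSIC, CONSUMER }

    // Decide which delegate to build based on the group.protocol config;
    // a real resolver might also consult broker API versions (see tasks above).
    static String resolve(Map<String, Object> config) {
        String protocol = String.valueOf(config.getOrDefault("group.protocol", "classic"));
        GroupProtocol gp = GroupProtocol.valueOf(protocol.toUpperCase(Locale.ROOT));
        switch (gp) {
            case CONSUMER:
                return "AsyncKafkaConsumerDelegate";  // new KIP-848 protocol
            case CLASSIC:
            default:
                return "LegacyKafkaConsumerDelegate"; // fall-back path
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of("group.protocol", "consumer")));
        System.out.println(resolve(Map.of())); // defaults to the legacy delegate
    }
}
{code}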



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol

2024-10-24 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-15284:
-

Assignee: Kirk True

> Implement GroupProtocolResolver to dynamically determine consumer group 
> protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the 
> {{{}GroupProtocolResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fall-back path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17861) Serialize with ByteBuffer instead of byte[]

2024-10-24 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17861:
--
Component/s: clients

> Serialize with ByteBuffer instead of byte[]
> ---
>
> Key: KAFKA-17861
> URL: https://issues.apache.org/jira/browse/KAFKA-17861
> Project: Kafka
>  Issue Type: Wish
>  Components: clients, producer 
>Affects Versions: 3.3.2
>Reporter: Sarah Hennenkamp
>Priority: Minor
>
> This is a request to consider changing the return value of the 
> [Serializer|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/serialization/Serializer.java]
>  from a byte[] to a ByteBuffer. 
>  
> Understandably folks may balk at this since it's a large lift. However, we've 
> noticed a good chunk of memory allocation in our application comes from the 
> [KafkaProducer 
> serializing|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1045]
>  the key and value pair. Using ByteBuffer could allow this to be off-heap and 
> save on garbage collection time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17686) AsyncKafkaConsumer.offsetsForTimes() fails with NullPointerException

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17686:
--
Reviewer: Chia-Ping Tsai

> AsyncKafkaConsumer.offsetsForTimes() fails with NullPointerException
> 
>
> Key: KAFKA-17686
> URL: https://issues.apache.org/jira/browse/KAFKA-17686
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Critical
>  Labels: consumer-threading-refactor, integration-tests
> Fix For: 4.0.0
>
>
> Error when running the integration test:
> {noformat}
> Gradle Test Run :core:integrationTest > Gradle Test Executor 10 > 
> PlaintextAdminIntegrationTest > testOffsetsForTimesAfterDeleteRecords(String) 
> > "testOffsetsForTimesAfterDeleteRecords(String).quorum=kraft" FAILED
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.kafka.clients.consumer.internals.OffsetAndTimestampInternal.buildOffsetAndTimestamp()"
>  because the return value of "java.util.Map$Entry.getValue()" is null
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.lambda$offsetsForTimes$4(AsyncKafkaConsumer.java:1082)
> at 
> java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180)
> at 
> java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
> at 
> java.base/java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1858)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
> at 
> java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
>  
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.offsetsForTimes(AsyncKafkaConsumer.java:1080)
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.offsetsForTimes(AsyncKafkaConsumer.java:1043)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.offsetsForTimes(KafkaConsumer.java:1560)
> at 
> kafka.api.PlaintextAdminIntegrationTest.testOffsetsForTimesAfterDeleteRecords(PlaintextAdminIntegrationTest.scala:1535)
>  
> {noformat}
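The trace shows Collectors.toMap's accumulator rejecting a null map value. A minimal standalone reproduction of that pattern, plus a null-tolerant alternative using HashMap:

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapNullValueExample {
    public static void main(String[] args) {
        Map<String, Long> lookups = new HashMap<>();
        lookups.put("topic-0", 42L);
        lookups.put("topic-1", null); // timestamp with no matching offset

        try {
            // Collectors.toMap throws NullPointerException on null values
            // (via uniqKeysMapAccumulator), matching the stack trace above.
            lookups.entrySet().stream()
                   .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
        } catch (NullPointerException e) {
            System.out.println("toMap rejected a null value");
        }

        // Null-tolerant alternative: HashMap itself permits null values.
        Map<String, Long> result = new HashMap<>();
        lookups.forEach(result::put);
        System.out.println(result); // {topic-0=42, topic-1=null}
    }
}
{code}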



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15284:
--
Description: 
At client initialization, we need to determine which of the 
{{ConsumerDelegate}} implementations to use:
 # {{LegacyKafkaConsumerDelegate}}
 # {{AsyncKafkaConsumerDelegate}}

There are conditions defined by KIP-848 that determine client eligibility to 
use the new protocol. This will be modeled by the {{{}GroupProtocolResolver{}}}.

Known tasks:
 * Determine at what point in the {{Consumer}} initialization the network 
communication should happen
 * Determine what RPCs to invoke in order to determine eligibility (API 
versions, IBP version, etc.)
 * Implement the network client lifecycle (startup, communication, shutdown, 
etc.)
 * Determine the fall-back path in case the client is not eligible to use the 
protocol

  was:
At client initialization, we need to determine which of the 
{{ConsumerDelegate}} implementations to use:
 # {{LegacyKafkaConsumerDelegate}}
 # {{AsyncKafkaConsumerDelegate}}

There are conditions defined by KIP-848 that determine client eligibility to 
use the new protocol. This will be modeled by the {{{}GroupProtocolResolver{}}}.

Known tasks:
 * Determine at what point in the {{Consumer}} initialization the network 
communication should happen
 * Determine what RPCs to invoke in order to determine eligibility (API 
versions, IBP version, etc.)
 * Implement the network client lifecycle (startup, communication, shutdown, 
etc.)
 * Determine the fallback path in case the client is not eligible to use the 
protocol


> Implement GroupProtocolResolver to dynamically determine consumer group 
> protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the 
> {{{}GroupProtocolResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fall-back path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15284:
--
Summary: Implement GroupProtocolResolver to dynamically determine consumer 
group protocol  (was: Implement GroupProtocolResolver to determine consumer 
group protocol)

> Implement GroupProtocolResolver to dynamically determine consumer group 
> protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the 
> {{{}GroupProtocolResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fallback path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15284:
--
Priority: Critical  (was: Major)

> Implement GroupProtocolResolver to determine consumer group protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the 
> {{{}GroupProtocolResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fallback path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-15284) Implement ConsumerGroupProtocolVersionResolver to determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-15284:
-

Assignee: (was: Kirk True)

> Implement ConsumerGroupProtocolVersionResolver to determine consumer group 
> protocol
> ---
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the—deep 
> breath—{{{}ConsumerGroupProtocolVersionResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fallback path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15284:
--
Description: 
At client initialization, we need to determine which of the 
{{ConsumerDelegate}} implementations to use:
 # {{LegacyKafkaConsumerDelegate}}
 # {{AsyncKafkaConsumerDelegate}}

There are conditions defined by KIP-848 that determine client eligibility to 
use the new protocol. This will be modeled by the {{{}GroupProtocolResolver{}}}.

Known tasks:
 * Determine at what point in the {{Consumer}} initialization the network 
communication should happen
 * Determine what RPCs to invoke in order to determine eligibility (API 
versions, IBP version, etc.)
 * Implement the network client lifecycle (startup, communication, shutdown, 
etc.)
 * Determine the fallback path in case the client is not eligible to use the 
protocol

  was:
At client initialization, we need to determine which of the 
{{ConsumerDelegate}} implementations to use:
 # {{LegacyKafkaConsumerDelegate}}
 # {{AsyncKafkaConsumerDelegate}}

There are conditions defined by KIP-848 that determine client eligibility to 
use the new protocol. This will be modeled by the—deep 
breath—{{{}ConsumerGroupProtocolVersionResolver{}}}.

Known tasks:
 * Determine at what point in the {{Consumer}} initialization the network 
communication should happen
 * Determine what RPCs to invoke in order to determine eligibility (API 
versions, IBP version, etc.)
 * Implement the network client lifecycle (startup, communication, shutdown, 
etc.)
 * Determine the fallback path in case the client is not eligible to use the 
protocol


> Implement GroupProtocolResolver to determine consumer group protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the 
> {{{}GroupProtocolResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fallback path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15284:
--
Summary: Implement GroupProtocolResolver to determine consumer group 
protocol  (was: Implement ConsumerGroupProtocolVersionResolver to determine 
consumer group protocol)

> Implement GroupProtocolResolver to determine consumer group protocol
> 
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the—deep 
> breath—{{{}ConsumerGroupProtocolVersionResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fallback path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-15284) Implement ConsumerGroupProtocolVersionResolver to determine consumer group protocol

2024-10-22 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reopened KAFKA-15284:
---
  Assignee: Kirk True

> Implement ConsumerGroupProtocolVersionResolver to determine consumer group 
> protocol
> ---
>
> Key: KAFKA-15284
> URL: https://issues.apache.org/jira/browse/KAFKA-15284
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> At client initialization, we need to determine which of the 
> {{ConsumerDelegate}} implementations to use:
>  # {{LegacyKafkaConsumerDelegate}}
>  # {{AsyncKafkaConsumerDelegate}}
> There are conditions defined by KIP-848 that determine client eligibility to 
> use the new protocol. This will be modeled by the—deep 
> breath—{{{}ConsumerGroupProtocolVersionResolver{}}}.
> Known tasks:
>  * Determine at what point in the {{Consumer}} initialization the network 
> communication should happen
>  * Determine what RPCs to invoke in order to determine eligibility (API 
> versions, IBP version, etc.)
>  * Implement the network client lifecycle (startup, communication, shutdown, 
> etc.)
>  * Determine the fallback path in case the client is not eligible to use the 
> protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17826) Consumer#offsetsForTimes should not return null value

2024-10-21 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891620#comment-17891620
 ] 

Kirk True commented on KAFKA-17826:
---

I agree with this change. Since the documentation in {{KafkaConsumer}} mentions 
returning {{{}null{}}}, can we simply update the documentation along with the 
code change, or do we need a KIP to make the change?

> Consumer#offsetsForTimes should not return null value
> -
>
> Key: KAFKA-17826
> URL: https://issues.apache.org/jira/browse/KAFKA-17826
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> https://github.com/apache/kafka/pull/17353#discussion_r1795032707
> The map returned by Consumer#offsetsForTimes can have null values if a 
> specific timestamp is not mapped to any offset. That is an anti-pattern for 
> the following reasons:
> 1. most Java 11+ collections can't accept null values
> 2. the other similar methods in the consumer do not do that
> 3. the admin client does not do that
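Until the return-value contract changes, callers can filter the nulls defensively; a small sketch (the helper name and the 30-second timeout are illustrative):

{code:java}
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class OffsetsForTimesNullHandling {
    // Returns only the partitions whose timestamp resolved to an offset,
    // dropping the null entries the current API may contain.
    static Map<TopicPartition, OffsetAndTimestamp> resolvedOffsets(
            Consumer<?, ?> consumer, Map<TopicPartition, Long> timestamps) {
        Map<TopicPartition, OffsetAndTimestamp> raw =
                consumer.offsetsForTimes(timestamps, Duration.ofSeconds(30));
        Map<TopicPartition, OffsetAndTimestamp> resolved = new HashMap<>();
        raw.forEach((tp, oat) -> {
            if (oat != null) { // null = no offset at/after the timestamp
                resolved.put(tp, oat);
            }
        });
        return resolved;
    }
}
{code}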



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17826) Consumer#offsetsForTimes should not return null value

2024-10-21 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17826:
--
Component/s: clients
 consumer

> Consumer#offsetsForTimes should not return null value
> -
>
> Key: KAFKA-17826
> URL: https://issues.apache.org/jira/browse/KAFKA-17826
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> https://github.com/apache/kafka/pull/17353#discussion_r1795032707
> The map returned by Consumer#offsetsForTimes can have null values if a 
> specific timestamp is not mapped to any offset. That is an anti-pattern for 
> the following reasons:
> 1. most Java 11+ collections can't accept null values
> 2. the other similar methods in the consumer do not do that
> 3. the admin client does not do that



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17843) Add integration tests to validate clients close when closed just after instantiation

2024-10-21 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17843:
--
Component/s: clients
 consumer

> Add integration tests to validate clients close when closed just after 
> instantiation
> 
>
> Key: KAFKA-17843
> URL: https://issues.apache.org/jira/browse/KAFKA-17843
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Apoorv Mittal
>Assignee: Apoorv Mittal
>Priority: Major
>
> We have seen an issue with the Kafka consumer which triggers a thread wait 
> when the consumer is closed just after instantiation. Though the 
> [issue|https://issues.apache.org/jira/browse/KAFKA-17731] has been fixed, we 
> should write test cases to avoid any such issue in the future.
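A sketch of the kind of test case proposed above, assuming JUnit 5 and a placeholder broker address:

{code:java}
import static org.junit.jupiter.api.Assertions.assertTimeoutPreemptively;

import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.junit.jupiter.api.Test;

class CloseAfterInstantiationTest {
    @Test
    void consumerClosesPromptlyWhenClosedRightAfterInstantiation() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "close-test");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // Close immediately after instantiation; close must not hang on
        // internal waits (e.g. the telemetry push described in KAFKA-17731).
        assertTimeoutPreemptively(Duration.ofSeconds(10), () -> {
            KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
            consumer.close();
        });
    }
}
{code}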



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17843) Add integration tests to validate clients close when closed just after instantiation

2024-10-21 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17843:
--
Labels: integration-test  (was: )

> Add integration tests to validate clients close when closed just after 
> instantiation
> 
>
> Key: KAFKA-17843
> URL: https://issues.apache.org/jira/browse/KAFKA-17843
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Apoorv Mittal
>Assignee: Apoorv Mittal
>Priority: Major
>  Labels: integration-test
>
> We have seen an issue with the Kafka consumer which triggers a thread wait 
> when the consumer is closed just after instantiation. Though the 
> [issue|https://issues.apache.org/jira/browse/KAFKA-17731] has been fixed, we 
> should write test cases to avoid any such issue in the future.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17731) Kafka consumer client sometimes elapses wait time for terminating telemetry push

2024-10-21 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17731:
--
Component/s: consumer

> Kafka consumer client sometimes elapses wait time for terminating telemetry 
> push
> 
>
> Key: KAFKA-17731
> URL: https://issues.apache.org/jira/browse/KAFKA-17731
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.7.0, 3.8.0, 3.7.1
>Reporter: Apoorv Mittal
>Assignee: Apoorv Mittal
>Priority: Blocker
> Fix For: 3.9.0, 3.7.2, 3.8.1
>
>
> ClientTelemetryReporter waits up to the timeout to send the terminating 
> telemetry push, but sometimes the wait elapses. The condition occurs 
> intermittently, which can affect the closing time of the consumer.
>  
> {*}The issue can only be reproduced when the consumer is closed just after 
> creation, i.e. a Kafka consumer is instantiated and immediately closed{*}. 
> When the consumer is closed instantly, the worker thread goes into the 
> {{timed_waiting}} state and expects the last telemetry push request, which 
> would normally be completed by the background thread's poll. But as the 
> consumer is instantly closed, the heartbeat thread can't send the telemetry 
> request, which makes the consumer's close wait for the full timeout.
>  
> Debug logs:
>  
> {code:java}
> [2024-10-08 18:58:48,223] DEBUG 
> (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter)
>  :224 - Thread - 17 - Initiate close of ClientTelemetryReporter
> [2024-10-08 18:58:48,223] DEBUG 
> (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter)
>  :610 - Thread - 17 - initiate close for client telemetry, check if terminal 
> push required. Timeout 3 ms.
> [2024-10-08 18:58:48,223] DEBUG 
> (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter)
>  :834 - Thread - 17 - Setting telemetry state from PUSH_NEEDED to 
> TERMINATING_PUSH_NEEDED
> [2024-10-08 18:58:48,223] INFO 
> (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter)
>  :632 - Thread - 17 - About to wait 3 ms. for terminal telemetry push to 
> be submitted
> [2024-10-08 18:59:18,729] INFO 
> (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter)
>  :634 - Thread - 17 - Wait for terminal telemetry push to be submitted has 
> elapsed; may not have actually sent request 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17648) AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException

2024-10-21 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17648:
--
Priority: Major  (was: Minor)

> AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException
> --
>
> Key: KAFKA-17648
> URL: https://issues.apache.org/jira/browse/KAFKA-17648
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: PoAn Yang
>Assignee: PoAn Yang
>Priority: Major
> Fix For: 4.0.0
>
>
> Followup for [https://github.com/apache/kafka/pull/17244].
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager

2024-10-18 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17780:
--
Fix Version/s: 4.0.0

> Heartbeat interval is not configured in the heartbeatrequest manager
> 
>
> Key: KAFKA-17780
> URL: https://issues.apache.org/jira/browse/KAFKA-17780
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, kip
>Reporter: Arpit Goyal
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> In the AbstractHeartbeatRequestManager, I observed we are not setting the 
> right heartbeat request interval. Is this intentional?
> [~lianetm] [~kirktrue]
> {code:java}
> long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);
> long retryBackoffMaxMs = 
> config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG);
> this.heartbeatRequestState = new HeartbeatRequestState(logContext, 
> time, 0, retryBackoffMs,
> retryBackoffMaxMs, maxPollIntervalMs);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17823) Refactor GroupRebalanceConfig to remove unused configuration

2024-10-17 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17823:
--
Description: 
The GroupRebalanceConfig includes configuration that isn't always relevant to 
all its users, causing some callers to implement kludges to avoid errors.

Given that this is a public class, we'll need a KIP to change anything 
substantially :(

  was:The GroupRebalanceConfig includes configuration that isn't always 
relevant to all its users, causing some callers to implement kludges to avoid 
errors.


> Refactor GroupRebalanceConfig to remove unused configuration
> 
>
> Key: KAFKA-17823
> URL: https://issues.apache.org/jira/browse/KAFKA-17823
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, config, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
>
> The GroupRebalanceConfig includes configuration that isn't always relevant to 
> all its users, causing some callers to implement kludges to avoid errors.
> Given that this is a public class, we'll need a KIP to change anything 
> substantially :(



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17823) Refactor GroupRebalanceConfig to remove unused configuration

2024-10-17 Thread Kirk True (Jira)
Kirk True created KAFKA-17823:
-

 Summary: Refactor GroupRebalanceConfig to remove unused 
configuration
 Key: KAFKA-17823
 URL: https://issues.apache.org/jira/browse/KAFKA-17823
 Project: Kafka
  Issue Type: Improvement
  Components: clients, config, consumer
Affects Versions: 3.9.0
Reporter: Kirk True
Assignee: Kirk True


The GroupRebalanceConfig includes configuration that isn't always relevant to 
all its users, causing some callers to implement kludges to avoid errors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager

2024-10-17 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890533#comment-17890533
 ] 

Kirk True commented on KAFKA-17780:
---

From my read of the current code in {{{}AbstractHeartbeatRequestManager{}}}’s 
constructor, we set the interval to 0 initially. On the first (and subsequent) 
heartbeat responses, we call {{onResponse()}} and update the interval, if 
needed. This is because, as you mentioned, the heartbeat interval is now a 
server-side configuration.

Does that explanation seem correct, or no?
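A simplified model of that flow (invented names; not the actual HeartbeatRequestState code): the interval starts at 0 so the first heartbeat fires immediately, then adopts the value returned by the broker:

{code:java}
// Sketch of the behaviour described above: the first heartbeat is sent at
// once (interval 0), and the interval is replaced by whatever the broker
// returns in the heartbeat response (a server-side config under KIP-848).
public class HeartbeatIntervalSketch {
    private long heartbeatIntervalMs = 0; // 0 => send the first heartbeat immediately

    void onResponse(long serverHeartbeatIntervalMs) {
        if (this.heartbeatIntervalMs != serverHeartbeatIntervalMs) {
            this.heartbeatIntervalMs = serverHeartbeatIntervalMs;
        }
    }

    long nextHeartbeatDelayMs() {
        return heartbeatIntervalMs;
    }

    public static void main(String[] args) {
        HeartbeatIntervalSketch state = new HeartbeatIntervalSketch();
        System.out.println(state.nextHeartbeatDelayMs()); // 0: first HB is immediate
        state.onResponse(3000);                           // broker says 3 s
        System.out.println(state.nextHeartbeatDelayMs()); // 3000 thereafter
    }
}
{code}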

> Heartbeat interval is not configured in the heartbeatrequest manager
> 
>
> Key: KAFKA-17780
> URL: https://issues.apache.org/jira/browse/KAFKA-17780
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, kip
>Reporter: Arpit Goyal
>Priority: Critical
>  Labels: kip-848-client-support
>
> In the AbstractHeartbeatRequestManager, I observed we are not setting the 
> right heartbeat request interval. Is this intentional?
> [~lianetm] [~kirktrue]
> {code:java}
> long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);
> long retryBackoffMaxMs = 
> config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG);
> this.heartbeatRequestState = new HeartbeatRequestState(logContext, 
> time, 0, retryBackoffMs,
> retryBackoffMaxMs, maxPollIntervalMs);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager

2024-10-17 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17780:
--
Component/s: clients

> Heartbeat interval is not configured in the heartbeatrequest manager
> 
>
> Key: KAFKA-17780
> URL: https://issues.apache.org/jira/browse/KAFKA-17780
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, kip
>Reporter: Arpit Goyal
>Priority: Critical
>
> In the AbstractHeartbeatRequestManager, I observed we are not setting the 
> right heartbeat request interval. Is this intentional?
> [~lianetm] [~kirktrue]
> {code:java}
> long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);
> long retryBackoffMaxMs = 
> config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG);
> this.heartbeatRequestState = new HeartbeatRequestState(logContext, 
> time, 0, retryBackoffMs,
> retryBackoffMaxMs, maxPollIntervalMs);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager

2024-10-17 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17780:
--
Labels: kip-848-client-support  (was: )

> Heartbeat interval is not configured in the heartbeatrequest manager
> 
>
> Key: KAFKA-17780
> URL: https://issues.apache.org/jira/browse/KAFKA-17780
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, kip
>Reporter: Arpit Goyal
>Priority: Critical
>  Labels: kip-848-client-support
>
> In the AbstractHeartbeatRequestManager, I observed we are not setting the 
> right heartbeat request interval. Is this intentional?
> [~lianetm] [~kirktrue]
> {code:java}
> long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);
> long retryBackoffMaxMs = 
> config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG);
> this.heartbeatRequestState = new HeartbeatRequestState(logContext, 
> time, 0, retryBackoffMs,
> retryBackoffMaxMs, maxPollIntervalMs);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-16444) Run KIP-848 unit tests under code coverage

2024-10-16 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reopened KAFKA-16444:
---

> Run KIP-848 unit tests under code coverage
> --
>
> Key: KAFKA-16444
> URL: https://issues.apache.org/jira/browse/KAFKA-16444
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer, unit tests
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Priority: Minor
>  Labels: consumer-threading-refactor, kip-848-client-support
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17338) ConsumerConfig should prevent using partition assignors with CONSUMER group protocol

2024-10-15 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17338:
--
Priority: Major  (was: Critical)

> ConsumerConfig should prevent using partition assignors with CONSUMER group 
> protocol
> 
>
> Key: KAFKA-17338
> URL: https://issues.apache.org/jira/browse/KAFKA-17338
> Project: Kafka
>  Issue Type: Task
>  Components: clients, config, consumer
>Reporter: Kirk True
>Assignee: 黃竣陽
>Priority: Major
>  Labels: consumer-threading-refactor, kip-848-client-support
> Fix For: 4.0.0
>
>
> {{ConsumerConfig}} should be updated to include additional validation in 
> {{postProcessParsedConfig()}}—when {{group.protocol}} is set to {{CONSUMER}}, 
> the value for {{partition.assignment.strategy}} must be either null or empty. 
> Otherwise a {{ConfigException}} should be thrown.
> This is somewhat of the inverse case of KAFKA-15773.
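A hedged sketch of the proposed validation, written standalone rather than as the actual ConsumerConfig.postProcessParsedConfig() code:

{code:java}
import java.util.List;
import java.util.Map;

import org.apache.kafka.common.config.ConfigException;

public class GroupProtocolValidationSketch {
    // When group.protocol=consumer, partition.assignment.strategy must be
    // null or empty; client-side assignors apply only to the classic protocol.
    static void validate(Map<String, Object> parsedConfig) {
        Object protocol = parsedConfig.get("group.protocol");
        if (!"consumer".equalsIgnoreCase(String.valueOf(protocol))) {
            return;
        }
        // In the real config the parsed value is a list of assignor classes.
        Object assignors = parsedConfig.get("partition.assignment.strategy");
        if (assignors instanceof List && !((List<?>) assignors).isEmpty()) {
            throw new ConfigException(
                "partition.assignment.strategy must be null or empty when "
                + "group.protocol=consumer, but got " + assignors);
        }
    }
}
{code}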



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15402) Performance regression on close consumer after upgrading to 3.5.0

2024-10-15 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889752#comment-17889752
 ] 

Kirk True commented on KAFKA-15402:
---

[~bdelbosc]—sorry for the delay on this. All hands are pulling on the KIP-848 
client release for AK 4.0. I am hopeful we can get to this too.

A configuration option for skipping the close is probably out of the question 
given it would require a KIP.

> Performance regression on close consumer after upgrading to 3.5.0
> -
>
> Key: KAFKA-15402
> URL: https://issues.apache.org/jira/browse/KAFKA-15402
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.5.0, 3.6.0, 3.5.1
>Reporter: Benoit Delbosc
>Priority: Major
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
> Attachments: image-2023-08-24-18-51-21-720.png, 
> image-2023-08-24-18-51-57-435.png, image-2023-08-25-10-50-28-079.png
>
>
> Hi,
> After upgrading to Kafka client version 3.5.0, we have observed a significant 
> increase in the duration of our Java unit tests. These unit tests heavily 
> rely on the Kafka Admin, Producer, and Consumer API.
> When using Kafka server version 3.4.1, the duration of the unit tests 
> increased from 8 seconds (with Kafka client 3.4.1) to 18 seconds (with Kafka 
> client 3.5.0).
> Upgrading the Kafka server to 3.5.1 shows similar results.
> I have come across the issue KAFKA-15178, which could be the culprit. I will 
> attempt to test the proposed patch.
> In the meantime, if you have any ideas that could help identify and address 
> the regression, please let me know.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15402) Performance regression on close consumer after upgrading to 3.5.0

2024-10-15 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15402:
--
Priority: Major  (was: Critical)

> Performance regression on close consumer after upgrading to 3.5.0
> -
>
> Key: KAFKA-15402
> URL: https://issues.apache.org/jira/browse/KAFKA-15402
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.5.0, 3.6.0, 3.5.1
>Reporter: Benoit Delbosc
>Priority: Major
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
> Attachments: image-2023-08-24-18-51-21-720.png, 
> image-2023-08-24-18-51-57-435.png, image-2023-08-25-10-50-28-079.png
>
>
> Hi,
> After upgrading to Kafka client version 3.5.0, we have observed a significant 
> increase in the duration of our Java unit tests. These unit tests heavily 
> rely on the Kafka Admin, Producer, and Consumer API.
> When using Kafka server version 3.4.1, the duration of the unit tests 
> increased from 8 seconds (with Kafka client 3.4.1) to 18 seconds (with Kafka 
> client 3.5.0).
> Upgrading the Kafka server to 3.5.1 shows similar results.
> I have come across the issue KAFKA-15178, which could be the culprit. I will 
> attempt to test the proposed patch.
> In the meantime, if you have any ideas that could help identify and address 
> the regression, please let me know.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17456) Make sure FindCoordinatorResponse get created before consumer

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17456:
--
Labels: flaky-test kip-848-client-support  (was: kip-848-client-support)

> Make sure FindCoordinatorResponse get created before consumer
> -
>
> Key: KAFKA-17456
> URL: https://issues.apache.org/jira/browse/KAFKA-17456
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: TengYao Chi
>Priority: Major
>  Labels: flaky-test, kip-848-client-support
>
> The incorrect order could lead to flaky tests (see KAFKA-17092 and 
> KAFKA-17395). It would be nice to fix all of them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16143) New JMX metrics for AsyncKafkaConsumer

2024-10-14 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889365#comment-17889365
 ] 

Kirk True commented on KAFKA-16143:
---

[~yangpoan]—would you kindly update the status on this to Patch Available? 
Thanks!

> New JMX metrics for AsyncKafkaConsumer
> --
>
> Key: KAFKA-16143
> URL: https://issues.apache.org/jira/browse/KAFKA-16143
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer, metrics
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Assignee: PoAn Yang
>Priority: Major
>  Labels: consumer-threading-refactor, kip-848-client-support, 
> metrics, needs-kip
> Fix For: 4.0.0
>
>
> This task is to consider what _new_ metrics we need from the KIP-848 protocol 
> that aren't already exposed by the current set of metrics. This will require 
> a KIP to introduce the new metrics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17456) Make sure FindCoordinatorResponse get created before consumer

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17456:
--
Labels: kip-848-client-support  (was: )

> Make sure FindCoordinatorResponse get created before consumer
> -
>
> Key: KAFKA-17456
> URL: https://issues.apache.org/jira/browse/KAFKA-17456
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: TengYao Chi
>Priority: Major
>  Labels: kip-848-client-support
>
> The incorrect order could lead to flaky tests (see KAFKA-17092 and 
> KAFKA-17395). It would be nice to fix all of them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17724) Clients - resolve race condition for SubscriptionState in share group consumer

2024-10-14 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889363#comment-17889363
 ] 

Kirk True commented on KAFKA-17724:
---

[~schofielaj]—would you kindly update the status of this to "Patch Available"? 
Thanks!

> Clients - resolve race condition for SubscriptionState in share group consumer
> --
>
> Key: KAFKA-17724
> URL: https://issues.apache.org/jira/browse/KAFKA-17724
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Affects Versions: 4.0.0
>Reporter: Andrew Schofield
>Assignee: Andrew Schofield
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> The SubscriptionState object is not thread-safe. Currently, there are a 
> handful of accesses in the application thread in ShareConsumerImpl which 
> ought to be moved into the background thread, thus eliminating the 
> possibility of race conditions.
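An illustrative pattern for that fix: hand mutations to the single background thread through an event queue instead of touching shared state from the application thread (names invented; not the actual ShareConsumerImpl internals):

{code:java}
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: confine all SubscriptionState-like mutations to one background
// thread by passing events through a queue, removing application-thread races.
public class BackgroundConfinementSketch {
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    private Set<String> subscription = Set.of(); // touched only by the background thread

    // Called from the application thread: enqueue instead of mutating directly.
    public void subscribe(Set<String> topics) {
        events.add(() -> subscription = topics);
    }

    // Background thread loop: the single reader/writer of the state.
    public void runBackgroundLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            events.take().run();
            System.out.println("subscription is now " + subscription);
        }
    }
}
{code}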



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17724) Clients - resolve race condition for SubscriptionState in share group consumer

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17724:
--
Component/s: consumer

> Clients - resolve race condition for SubscriptionState in share group consumer
> --
>
> Key: KAFKA-17724
> URL: https://issues.apache.org/jira/browse/KAFKA-17724
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Affects Versions: 4.0.0
>Reporter: Andrew Schofield
>Assignee: Andrew Schofield
>Priority: Major
> Fix For: 4.0.0
>
>
> The SubscriptionState object is not thread-safe. Currently, there are a 
> handful of accesses in the application thread in ShareConsumerImpl which 
> ought to be moved into the background thread, thus eliminating the 
> possibility of race conditions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17724) Clients - resolve race condition for SubscriptionState in share group consumer

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17724:
--
Labels: kip-848-client-support  (was: )

> Clients - resolve race condition for SubscriptionState in share group consumer
> --
>
> Key: KAFKA-17724
> URL: https://issues.apache.org/jira/browse/KAFKA-17724
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Affects Versions: 4.0.0
>Reporter: Andrew Schofield
>Assignee: Andrew Schofield
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> The SubscriptionState object is not thread-safe. Currently, there are a 
> handful of accesses in the application thread in ShareConsumerImpl which 
> ought to be moved into the background thread, thus eliminating the 
> possibility of race conditions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17363) Flaky test: kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, String).quorum=kraft+kip848.groupProtocol=consumer

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17363:
--
Labels: flaky-test integration-tests kip-848-client-support  (was: 
integration-tests kip-848-client-support)

> Flaky test: 
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, 
> String).quorum=kraft+kip848.groupProtocol=consumer
> --
>
> Key: KAFKA-17363
> URL: https://issues.apache.org/jira/browse/KAFKA-17363
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Apoorv Mittal
>Priority: Major
>  Labels: flaky-test, integration-tests, kip-848-client-support
> Fix For: 4.0.0
>
>
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, 
> String).quorum=kraft+kip848.groupProtocol=consumer
> [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-16890/3/testReport/kafka.api/PlaintextConsumerCommitTest/Build___JDK_8_and_Scala_2_12___testAutoCommitOnRebalance_String__String__quorum_kraft_kip848_groupProtocol_consumer/]
> {code:java}
> org.opentest4j.AssertionFailedError: Topic [topic2] metadata not propagated 
> after 6 ms
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
>   at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
>   at 
> kafka.utils.TestUtils$.waitForAllPartitionsMetadata(TestUtils.scala:931)
>   at kafka.utils.TestUtils$.createTopicWithAdmin(TestUtils.scala:474)
>   at 
> kafka.integration.KafkaServerTestHarness.$anonfun$createTopic$1(KafkaServerTestHarness.scala:193)
>   at scala.util.Using$.resource(Using.scala:269)
>   at 
> kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:185)
>   at 
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(PlaintextConsumerCommitTest.scala:223)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>   at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
>   at 
> java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>   at 
> java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.

[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17769:
--
Labels: flaky-test integration-test kip-848-client-support  (was: 
integration-test kip-848-client-support)

> Fix flaky 
> PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
> ---
>
> Key: KAFKA-17769
> URL: https://issues.apache.org/jira/browse/KAFKA-17769
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Yu-Lin Chen
>Assignee: Yu-Lin Chen
>Priority: Major
>  Labels: flaky-test, integration-test, kip-848-client-support
> Fix For: 4.0.0
>
>
> 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report 
> Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D])
> This issue can be reproduced locally within 50 loops.
>  
> ([Oct 4 2024 at 10:35:49 
> CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]):
> {code:java}
> org.apache.kafka.common.KafkaException: Failed to close kafka consumer
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249)
>  
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204)
>  
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249)
>  
> at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619)   
> at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617)  
> at scala.collection.AbstractIterable.foreach(Iterable.scala:935)  
> at 
> kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249)   
>  
> at java.lang.reflect.Method.invoke(Method.java:566)   
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)   
>  
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)  
>  
> at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
>  
> at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> 
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)  
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) 
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>  
> at java.util.stream.AbstractPipeline.copyInto

[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17681:
--
Labels: flaky-test kip-848-client-support  (was: kip-848-client-support)

> Fix unstable consumer_test.py#test_fencing_static_consumer
> --
>
> Key: KAFKA-17681
> URL: https://issues.apache.org/jira/browse/KAFKA-17681
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, system tests
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: flaky-test, kip-848-client-support
> Fix For: 4.0.0
>
>
> {code:java}
> AssertionError('Static consumers attempt to join with instance id in use 
> should not cause a rebalance.')
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 186, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 246, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 
> 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line 
> 359, in test_fencing_static_consumer
> assert num_rebalances == consumer.num_rebalances(), "Static consumers 
> attempt to join with instance id in use should not cause a rebalance. before: 
> " + str(num_rebalances) + " after: " + str(consumer.num_rebalances())
> AssertionError: Static consumers attempt to join with instance id in use 
> should not cause a rebalance.
> {code}
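
For context, the test exercises static group membership. A minimal sketch of 
the client configuration involved, with placeholder ids and addresses rather 
than the test's actual values:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StaticMemberSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "fencing-test-group");
        // Static membership: a joiner that reuses an in-use instance id should
        // be fenced instead of triggering a rebalance of the whole group.
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "consumer-instance-1");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // subscribe/poll as usual; omitted here.
        }
    }
}
{code}

With static membership, a joiner that reuses an in-use group.instance.id 
should be fenced rather than trigger a fresh rebalance, which is exactly the 
invariant the assertion above checks.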



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17623) Flaky testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17623:
--
Labels: consumer-threading-refactor flaky-test  (was: 
consumer-threading-refactor)

> Flaky 
> testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback
> 
>
> Key: KAFKA-17623
> URL: https://issues.apache.org/jira/browse/KAFKA-17623
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: PoAn Yang
>Priority: Major
>  Labels: consumer-threading-refactor, flaky-test
> Fix For: 4.0.0
>
>
> Flaky for the new consumer, failing with:
> org.apache.kafka.common.KafkaException: User rebalance callback throws an 
> error at 
> app//org.apache.kafka.clients.consumer.internals.ConsumerUtils.maybeWrapAsKafkaException(ConsumerUtils.java:259)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.invokeRebalanceCallbacks(AsyncKafkaConsumer.java:1867)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:195)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:181)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.processBackgroundEvents(AsyncKafkaConsumer.java:1758)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.updateAssignmentMetadataIfNeeded(AsyncKafkaConsumer.java:1618)
> ...
> Caused by: java.lang.IllegalStateException: No current assignment for 
> partition topic-0 at 
> org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:378)
>  at 
> org.apache.kafka.clients.consumer.internals.SubscriptionState.seekUnvalidated(SubscriptionState.java:395)
>  at 
> org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:425)
>  at 
> org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:147)
>  at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.processApplicationEvents(ConsumerNetworkThread.java:171)
>  
> Flaky behaviour:
>  
> https://ge.apache.org/scans/tests?search.buildOutcome=failure&search.names=Git%20branch&search.rootProjectNames=kafka&search.startTimeMax=172740959&search.startTimeMin=172248480&search.timeZoneId=America%2FToronto&search.values=trunk&tests.container=integration.kafka.api.PlaintextConsumerCallbackTest&tests.test=testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback(String%2C%20String)%5B3%5D
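
For context, the failing test drives a seek and pause from inside the 
rebalance callback. A sketch of that usage shape follows, built only on the 
public consumer API; it illustrates the scenario, not the test code itself:

{code:java}
import java.util.Collection;
import java.util.List;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

class SeekAndPauseListener implements ConsumerRebalanceListener {
    private final Consumer<?, ?> consumer;

    SeekAndPauseListener(Consumer<?, ?> consumer) {
        this.consumer = consumer;
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        for (TopicPartition tp : partitions) {
            consumer.seek(tp, 0L);       // reposition the newly assigned partition
            consumer.pause(List.of(tp)); // and hold off fetching from it
        }
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // nothing to do for this sketch
    }
}
{code}

A test or application would install it via consumer.subscribe(List.of("topic"), 
new SeekAndPauseListener(consumer)); the flakiness above shows up when the seek 
issued here races the background thread's view of the assignment.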



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17769:
--
Fix Version/s: 4.0.0

> Fix flaky 
> PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
> ---
>
> Key: KAFKA-17769
> URL: https://issues.apache.org/jira/browse/KAFKA-17769
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Yu-Lin Chen
>Assignee: Yu-Lin Chen
>Priority: Major
>  Labels: integration-test, kip-848-client-support
> Fix For: 4.0.0
>
>
> 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report 
> Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D])
> This issue can be reproduced locally within 50 loops.
>  
> ([Oct 4 2024 at 10:35:49 
> CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]):
> {code:java}
> org.apache.kafka.common.KafkaException: Failed to close kafka consumer
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249)
>  
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204)
>  
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249)
>  
> at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619)   
> at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617)  
> at scala.collection.AbstractIterable.foreach(Iterable.scala:935)  
> at 
> kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249)   
>  
> at java.lang.reflect.Method.invoke(Method.java:566)   
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)   
>  
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)  
>  
> at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
>  
> at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> 
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)  
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) 
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>  
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(Abstrac

[jira] [Updated] (KAFKA-17777) Flaky KafkaConsumerTest testReturnRecordsDuringRebalance

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17777:
--
Fix Version/s: 4.0.0

> Flaky KafkaConsumerTest testReturnRecordsDuringRebalance
> 
>
> Key: KAFKA-17777
> URL: https://issues.apache.org/jira/browse/KAFKA-17777
> Project: Kafka
>  Issue Type: Test
>  Components: clients, consumer, unit tests
>Reporter: Lianet Magrans
>Assignee: TengYao Chi
>Priority: Major
>  Labels: flaky-test, kip-848-client-support
> Fix For: 4.0.0
>
>
> Top flaky consumer test at the moment (after several recent fixes addressing 
> consumer test flakiness): 
> [https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FNew_York&tests.sortField=FLAKY]
>  
> It's been flaky in trunk for a while:
> [https://ge.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FToronto&tests.container=org.apache.kafka.clients.consumer.KafkaConsumerTest&tests.sortField=FLAKY&tests.test=testReturnRecordsDuringRebalance(GroupProtocol)%5B1%5D]
>  
>  
> Fails with: org.opentest4j.AssertionFailedError: Condition not met within 
> timeout 15000. Does not complete rebalance in time ==> expected: <true> but 
> was: <false>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17732) System test coverage for new consumer positions not getting behind committed offsets

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17732:
--
Fix Version/s: 4.0.0

> System test coverage for new consumer positions not getting behind committed 
> offsets
> 
>
> Key: KAFKA-17732
> URL: https://issues.apache.org/jira/browse/KAFKA-17732
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer, system tests
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
>
> We recently caught a bug where the new consumer positions would get behind the 
> committed offsets (fixed with 
> https://issues.apache.org/jira/browse/KAFKA-17674). It was only discovered 
> while adding a new system test (test_consumer_rolling_migration, not yet in 
> trunk; the PR should be available soon under 
> https://issues.apache.org/jira/browse/KAFKA-17272).
> We should check why this failure wasn't caught by existing tests, and add a 
> test for it (a test to ensure positions are updated properly, one that would 
> fail without KAFKA-17674 and pass with it). 
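
A sketch of the invariant such a test would assert, using only the public 
consumer API (the helper class and method names are made up for illustration):

{code:java}
import java.util.Map;
import java.util.Set;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

final class PositionInvariant {
    // Assert that no assigned partition's position is behind its committed
    // offset; KAFKA-17674 was a violation of exactly this invariant.
    static void assertPositionsNotBehindCommitted(Consumer<?, ?> consumer) {
        Set<TopicPartition> assignment = consumer.assignment();
        Map<TopicPartition, OffsetAndMetadata> committed = consumer.committed(assignment);
        for (TopicPartition tp : assignment) {
            OffsetAndMetadata om = committed.get(tp);
            if (om == null) continue; // nothing committed for this partition yet
            long position = consumer.position(tp);
            if (position < om.offset()) {
                throw new AssertionError("Position " + position + " is behind committed offset "
                        + om.offset() + " for " + tp);
            }
        }
    }
}
{code}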



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17769:
--
Labels: integration-test  (was: )

> Fix flaky 
> PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
> ---
>
> Key: KAFKA-17769
> URL: https://issues.apache.org/jira/browse/KAFKA-17769
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Yu-Lin Chen
>Assignee: Yu-Lin Chen
>Priority: Major
>  Labels: integration-test
>
> 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report 
> Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D])
> This issue can be reproduced locally within 50 loops.
>  
> ([Oct 4 2024 at 10:35:49 
> CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]):
> {code:java}
> org.apache.kafka.common.KafkaException: Failed to close kafka consumer
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249)
>  
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204)
>  
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249)
>  
> at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619)   
> at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617)  
> at scala.collection.AbstractIterable.foreach(Iterable.scala:935)  
> at 
> kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249)   
>  
> at java.lang.reflect.Method.invoke(Method.java:566)   
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)   
>  
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)  
>  
> at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
>  
> at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> 
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)  
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) 
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>  
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)  
>  
> at 
> java

[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17769:
--
Labels: integration-test kip-848-client-support  (was: integration-test)

> Fix flaky 
> PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
> ---
>
> Key: KAFKA-17769
> URL: https://issues.apache.org/jira/browse/KAFKA-17769
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Yu-Lin Chen
>Assignee: Yu-Lin Chen
>Priority: Major
>  Labels: integration-test, kip-848-client-support
>
> 4 flaky out of 110 trunk builds in past 2 weeks. ([Report 
> Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D])
> This issue can be reproduced in my local within 50 loops.
>  
> ([Oct 4 2024 at 10:35:49 
> CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]):
> {code:java}
> org.apache.kafka.common.KafkaException: Failed to close kafka consumer
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249)
>  
> at 
> org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204)
>  
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249)
>  
> at 
> kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249)
>  
> at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619)   
> at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617)  
> at scala.collection.AbstractIterable.foreach(Iterable.scala:935)  
> at 
> kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249)   
>  
> at java.lang.reflect.Method.invoke(Method.java:566)   
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)   
>  
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> 
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)  
>  
> at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
>  
> at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> 
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)  
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) 
> at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)  
> at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
>  
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)  
> at 
> java.util.stream.AbstractPipeline.wr

[jira] [Updated] (KAFKA-17777) Flaky KafkaConsumerTest testReturnRecordsDuringRebalance

2024-10-14 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17777:
--
Labels: flaky-test kip-848-client-support  (was: flaky-test)

> Flaky KafkaConsumerTest testReturnRecordsDuringRebalance
> 
>
> Key: KAFKA-17777
> URL: https://issues.apache.org/jira/browse/KAFKA-17777
> Project: Kafka
>  Issue Type: Test
>  Components: clients, consumer, unit tests
>Reporter: Lianet Magrans
>Assignee: TengYao Chi
>Priority: Major
>  Labels: flaky-test, kip-848-client-support
>
> Top flaky consumer test at the moment (after several recent fixes addressing 
> consumer test flakiness): 
> [https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FNew_York&tests.sortField=FLAKY]
>  
> It's been flaky in trunk for a while:
> [https://ge.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FToronto&tests.container=org.apache.kafka.clients.consumer.KafkaConsumerTest&tests.sortField=FLAKY&tests.test=testReturnRecordsDuringRebalance(GroupProtocol)%5B1%5D]
>  
>  
> Fails with: org.opentest4j.AssertionFailedError: Condition not met within 
> timeout 15000. Does not complete rebalance in time ==> expected: <true> but 
> was: <false>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17732) System test coverage for new consumer positions not getting behind committed offsets

2024-10-08 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17732:
--
Component/s: clients

> System test coverage for new consumer positions not getting behind committed 
> offsets
> 
>
> Key: KAFKA-17732
> URL: https://issues.apache.org/jira/browse/KAFKA-17732
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer, system tests
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor
>
> We recently caught a bug where the new consumer positions would get behind the 
> committed offsets (fixed with 
> https://issues.apache.org/jira/browse/KAFKA-17674). It was only discovered 
> while adding a new system test (test_consumer_rolling_migration, not yet in 
> trunk; the PR should be available soon under 
> https://issues.apache.org/jira/browse/KAFKA-17272).
> We should check why this failure wasn't caught by existing tests, and add a 
> test for it (a test to ensure positions are updated properly, one that would 
> fail without KAFKA-17674 and pass with it). 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17337) ConsumerConfig should default to CONSUMER for group.protocol

2024-10-08 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887700#comment-17887700
 ] 

Kirk True commented on KAFKA-17337:
---

When switching to this as the default, we found a few errors, which are 
linked. We still need to complete the testing investigation to track down the 
remaining errors.

> ConsumerConfig should default to CONSUMER for group.protocol
> 
>
> Key: KAFKA-17337
> URL: https://issues.apache.org/jira/browse/KAFKA-17337
> Project: Kafka
>  Issue Type: Task
>  Components: clients, config, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Blocker
>  Labels: consumer-threading-refactor, kip-848-client-support
> Fix For: 4.0.0
>
>
> {{ConsumerConfig}}’s default value for {{group.protocol}} should be changed 
> from {{CLASSIC}} to {{CONSUMER}} for 4.0.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17536) Ensure clear error message when "new" consumer used with incompatible cluster

2024-10-07 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887486#comment-17887486
 ] 

Kirk True commented on KAFKA-17536:
---

Correct. Feel free to grab it :)

> Ensure clear error message when "new" consumer used with incompatible cluster
> -
>
> Key: KAFKA-17536
> URL: https://issues.apache.org/jira/browse/KAFKA-17536
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> In Kafka 4.0, by default the consumer uses the updated consumer group 
> protocol defined in KIP-848. When the consumer is used with a cluster that 
> does not support (or is not configured to use) the new protocol, the consumer 
> will get an unfriendly error about unavailable APIs. Since this error could 
> be the user's first impression when attempting to upgrade to 4.0, we need to 
> make sure that the error is very clear about the remediation steps (set the 
> group.protocol to CLASSIC on the client or upgrade and enable the new 
> protocol on the cluster).
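
A minimal sketch of the client-side remediation mentioned above; the 
broker-side property named in the comment is an assumption, not confirmed by 
this ticket:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class ClassicProtocolFallback {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Client-side remediation: stay on the classic group protocol until
        // the cluster supports (and enables) the KIP-848 protocol.
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "classic");
        // Server-side alternative (assumed broker property name): include
        // "consumer" in group.coordinator.rebalance.protocols on the brokers.
    }
}
{code}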



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17536) Ensure clear error message when "new" consumer used with incompatible cluster

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-17536:
-

Assignee: (was: Kirk True)

> Ensure clear error message when "new" consumer used with incompatible cluster
> -
>
> Key: KAFKA-17536
> URL: https://issues.apache.org/jira/browse/KAFKA-17536
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> In Kafka 4.0, by default the consumer uses the updated consumer group 
> protocol defined in KIP-848. When the consumer is used with a cluster that 
> does not support (or is not configured to use) the new protocol, the consumer 
> will get an unfriendly error about unavailable APIs. Since this error could 
> be the user's first impression when attempting to upgrade to 4.0, we need to 
> make sure that the error is very clear about the remediation steps (set the 
> group.protocol to CLASSIC on the client or upgrade and enable the new 
> protocol on the cluster).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-14672) Producer queue time does not reflect batches expired in the accumulator

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-14672:
--
Component/s: clients
 metrics
 producer 

> Producer queue time does not reflect batches expired in the accumulator
> ---
>
> Key: KAFKA-14672
> URL: https://issues.apache.org/jira/browse/KAFKA-14672
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, metrics, producer 
>Reporter: Jason Gustafson
>Assignee: Kirk True
>Priority: Major
>
> The producer exposes two metrics for the time a record has spent in the 
> accumulator waiting to be drained:
>  * {{record-queue-time-avg}}
>  * {{record-queue-time-max}}
> The metric is only updated when a batch is ready to send to a broker. It is 
> also possible for a batch to be expired before it can be sent, but in this 
> case, the metric is not updated. This seems surprising and makes the queue 
> time misleading. The only metric I could find that does reflect batch 
> expirations in the accumulator is the generic {{{}record-error-rate{}}}. It 
> would make sense to let the queue-time metrics record the time spent in the 
> queue regardless of the outcome of the record send attempt.
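
For reference, a small sketch of how a client could read the two metrics in 
question today (the helper class is illustrative, not part of the producer 
API); until this issue is addressed, batches that expire in the accumulator 
will not be reflected in these values:

{code:java}
import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

final class QueueTimeMetrics {
    // Print the producer's queue-time metrics. Note: per this issue, time spent
    // by batches that expire in the accumulator is currently NOT counted here.
    static void printQueueTime(KafkaProducer<?, ?> producer) {
        for (Map.Entry<MetricName, ? extends Metric> e : producer.metrics().entrySet()) {
            MetricName name = e.getKey();
            if ("producer-metrics".equals(name.group())
                    && ("record-queue-time-avg".equals(name.name())
                        || "record-queue-time-max".equals(name.name()))) {
                System.out.println(name.name() + " = " + e.getValue().metricValue());
            }
        }
    }
}
{code}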



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-14567) Kafka Streams crashes after ProducerFencedException

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-14567:
--
Component/s: clients

> Kafka Streams crashes after ProducerFencedException
> ---
>
> Key: KAFKA-14567
> URL: https://issues.apache.org/jira/browse/KAFKA-14567
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer , streams
>Affects Versions: 3.7.0
>Reporter: Matthias J. Sax
>Assignee: Kirk True
>Priority: Blocker
>  Labels: eos, transactions
> Fix For: 4.0.0
>
>
> Running a Kafka Streams application with EOS-v2.
> We first see a `ProducerFencedException`. After the fencing, the fenced 
> thread crashed resulting in a non-recoverable error:
> {quote}[2022-12-22 13:49:13,423] ERROR [i-0c291188ec2ae17a0-StreamThread-3] 
> stream-thread [i-0c291188ec2ae17a0-StreamThread-3] Failed to process stream 
> task 1_2 due to the following error: 
> (org.apache.kafka.streams.processor.internals.TaskExecutor)
> org.apache.kafka.streams.errors.StreamsException: Exception caught in 
> process. taskId=1_2, processor=KSTREAM-SOURCE-05, 
> topic=node-name-repartition, partition=2, offset=539776276, 
> stacktrace=java.lang.IllegalStateException: TransactionalId 
> stream-soak-test-72b6e57c-c2f5-489d-ab9f-fdbb215d2c86-3: Invalid transition 
> attempted from state FATAL_ERROR to state ABORTABLE_ERROR
> at 
> org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:974)
> at 
> org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:394)
> at 
> org.apache.kafka.clients.producer.internals.TransactionManager.maybeTransitionToErrorState(TransactionManager.java:620)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1079)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:959)
> at 
> org.apache.kafka.streams.processor.internals.StreamsProducer.send(StreamsProducer.java:257)
> at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:207)
> at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:162)
> at 
> org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:85)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228)
> at 
> org.apache.kafka.streams.kstream.internals.KStreamKTableJoinProcessor.process(KStreamKTableJoinProcessor.java:88)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:157)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228)
> at 
> org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:84)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.lambda$doProcess$1(StreamTask.java:791)
> at 
> org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:867)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.doProcess(StreamTask.java:791)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:722)
> at 
> org.apache.kafka.streams.processor.internals.TaskExecutor.processTask(TaskExecutor.java:95)
> at 
> org.apache.kafka.streams.processor.internals.TaskExecutor.process(TaskExecutor.java:76)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:1645)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:788)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:607)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:569)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:748)
> at 
> org.apache.kafka.streams.processor.internals.TaskExecutor.processTask(TaskExecutor.java:95)
> at 
> org.apach

[jira] [Resolved] (KAFKA-16444) Run KIP-848 unit tests under code coverage

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True resolved KAFKA-16444.
---
Resolution: Won't Do

> Run KIP-848 unit tests under code coverage
> --
>
> Key: KAFKA-16444
> URL: https://issues.apache.org/jira/browse/KAFKA-16444
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer, unit tests
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Priority: Minor
>  Labels: consumer-threading-refactor, kip-848-client-support
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16444) Run KIP-848 unit tests under code coverage

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-16444:
-

Assignee: (was: Kirk True)

> Run KIP-848 unit tests under code coverage
> --
>
> Key: KAFKA-16444
> URL: https://issues.apache.org/jira/browse/KAFKA-16444
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer, unit tests
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Priority: Minor
>  Labels: consumer-threading-refactor, kip-848-client-support
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16138) QuotaTest system test fails consistently in 3.7

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16138:
--
Fix Version/s: (was: 4.0.0)

> QuotaTest system test fails consistently in 3.7
> ---
>
> Key: KAFKA-16138
> URL: https://issues.apache.org/jira/browse/KAFKA-16138
> Project: Kafka
>  Issue Type: Test
>  Components: clients, consumer, system tests
>Affects Versions: 3.7.0, 3.8.0
>Reporter: Stanislav Kozlovski
>Assignee: Kirk True
>Priority: Major
>
> As mentioned in 
> [https://hackmd.io/@hOneAGCrSmKSpL8VF-1HWQ/HyRgRJmta#kafkatesttestsclientquota_testQuotaTesttest_quotaArguments-%E2%80%9Coverride_quota%E2%80%9D-true-%E2%80%9Cquota_type%E2%80%9D-%E2%80%9Cuser%E2%80%9D,]
>  the test fails consistently:
> {code:java}
> ValueError('max() arg is an empty sequence')
> Traceback (most recent call last):
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py",
>  line 184, in _do_run
> data = self.run_test()
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py",
>  line 262, in run_test
> return self.test_context.function(self.test)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/mark/_mark.py",
>  line 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py",
>  line 169, in test_quota
> success, msg = self.validate(self.kafka, producer, consumer)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py",
>  line 197, in validate
> metric.value for k, metrics in producer.metrics(group='producer-metrics', 
> name='outgoing-byte-rate', client_id=producer.client_id) for metric in metrics
> ValueError: max() arg is an empty sequence {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16138) QuotaTest system test fails consistently in 3.7

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-16138:
-

Assignee: (was: Kirk True)

> QuotaTest system test fails consistently in 3.7
> ---
>
> Key: KAFKA-16138
> URL: https://issues.apache.org/jira/browse/KAFKA-16138
> Project: Kafka
>  Issue Type: Test
>  Components: clients, consumer, system tests
>Affects Versions: 3.7.0, 3.8.0
>Reporter: Stanislav Kozlovski
>Priority: Major
>
> As mentioned in 
> [https://hackmd.io/@hOneAGCrSmKSpL8VF-1HWQ/HyRgRJmta#kafkatesttestsclientquota_testQuotaTesttest_quotaArguments-%E2%80%9Coverride_quota%E2%80%9D-true-%E2%80%9Cquota_type%E2%80%9D-%E2%80%9Cuser%E2%80%9D,]
>  the test fails consistently:
> {code:java}
> ValueError('max() arg is an empty sequence')
> Traceback (most recent call last):
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py",
>  line 184, in _do_run
> data = self.run_test()
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py",
>  line 262, in run_test
> return self.test_context.function(self.test)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/mark/_mark.py",
>  line 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py",
>  line 169, in test_quota
> success, msg = self.validate(self.kafka, producer, consumer)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py",
>  line 197, in validate
> metric.value for k, metrics in producer.metrics(group='producer-metrics', 
> name='outgoing-byte-rate', client_id=producer.client_id) for metric in metrics
> ValueError: max() arg is an empty sequence {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17363) Flaky test: kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, String).quorum=kraft+kip848.groupProtocol=consumer

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17363:
--
Fix Version/s: 4.0.0

> Flaky test: 
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, 
> String).quorum=kraft+kip848.groupProtocol=consumer
> --
>
> Key: KAFKA-17363
> URL: https://issues.apache.org/jira/browse/KAFKA-17363
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Apoorv Mittal
>Priority: Major
>  Labels: integration-tests
> Fix For: 4.0.0
>
>
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, 
> String).quorum=kraft+kip848.groupProtocol=consumer
> [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-16890/3/testReport/kafka.api/PlaintextConsumerCommitTest/Build___JDK_8_and_Scala_2_12___testAutoCommitOnRebalance_String__String__quorum_kraft_kip848_groupProtocol_consumer/]
> {code:java}
> org.opentest4j.AssertionFailedError: Topic [topic2] metadata not propagated 
> after 6 ms
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
>   at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
>   at 
> kafka.utils.TestUtils$.waitForAllPartitionsMetadata(TestUtils.scala:931)
>   at kafka.utils.TestUtils$.createTopicWithAdmin(TestUtils.scala:474)
>   at 
> kafka.integration.KafkaServerTestHarness.$anonfun$createTopic$1(KafkaServerTestHarness.scala:193)
>   at scala.util.Using$.resource(Using.scala:269)
>   at 
> kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:185)
>   at 
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(PlaintextConsumerCommitTest.scala:223)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>   at 
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
>   at 
> java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>   at 
> java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   

[jira] [Updated] (KAFKA-17363) Flaky test: kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, String).quorum=kraft+kip848.groupProtocol=consumer

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17363:
--
Labels: integration-tests kip-848-client-support  (was: integration-tests)

> Flaky test: 
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, 
> String).quorum=kraft+kip848.groupProtocol=consumer
> --
>
> Key: KAFKA-17363
> URL: https://issues.apache.org/jira/browse/KAFKA-17363
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Apoorv Mittal
>Priority: Major
>  Labels: integration-tests, kip-848-client-support
> Fix For: 4.0.0
>
>
> kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, 
> String).quorum=kraft+kip848.groupProtocol=consumer
> [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-16890/3/testReport/kafka.api/PlaintextConsumerCommitTest/Build___JDK_8_and_Scala_2_12___testAutoCommitOnRebalance_String__String__quorum_kraft_kip848_groupProtocol_consumer/]
> {code:java}
> org.opentest4j.AssertionFailedError: Topic [topic2] metadata not propagated 
> after 6 ms
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
>   at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
>   at kafka.utils.TestUtils$.waitForAllPartitionsMetadata(TestUtils.scala:931)
>   at kafka.utils.TestUtils$.createTopicWithAdmin(TestUtils.scala:474)
>   at kafka.integration.KafkaServerTestHarness.$anonfun$createTopic$1(KafkaServerTestHarness.scala:193)
>   at scala.util.Using$.resource(Using.scala:269)
>   at kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:185)
>   at kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(PlaintextConsumerCommitTest.scala:223)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>   at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
>   at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>   at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>  

[jira] [Updated] (KAFKA-17623) Flaky testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17623:
--
Fix Version/s: 4.0.0

> Flaky 
> testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback
> 
>
> Key: KAFKA-17623
> URL: https://issues.apache.org/jira/browse/KAFKA-17623
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: PoAn Yang
>Priority: Major
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
>
> Flaky for the new consumer, failing with:
> org.apache.kafka.common.KafkaException: User rebalance callback throws an 
> error at 
> app//org.apache.kafka.clients.consumer.internals.ConsumerUtils.maybeWrapAsKafkaException(ConsumerUtils.java:259)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.invokeRebalanceCallbacks(AsyncKafkaConsumer.java:1867)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:195)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:181)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.processBackgroundEvents(AsyncKafkaConsumer.java:1758)
>  at 
> app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.updateAssignmentMetadataIfNeeded(AsyncKafkaConsumer.java:1618)
> ...
> Caused by: java.lang.IllegalStateException: No current assignment for 
> partition topic-0 at 
> org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:378)
>  at 
> org.apache.kafka.clients.consumer.internals.SubscriptionState.seekUnvalidated(SubscriptionState.java:395)
>  at 
> org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:425)
>  at 
> org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:147)
>  at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.processApplicationEvents(ConsumerNetworkThread.java:171)
>  
> Flaky behaviour:
>  
> https://ge.apache.org/scans/tests?search.buildOutcome=failure&search.names=Git%20branch&search.rootProjectNames=kafka&search.startTimeMax=172740959&search.startTimeMin=172248480&search.timeZoneId=America%2FToronto&search.values=trunk&tests.container=integration.kafka.api.PlaintextConsumerCallbackTest&tests.test=testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback(String%2C%20String)%5B3%5D
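
For context, a minimal sketch of the pattern this test exercises (standard
ConsumerRebalanceListener API; the topic name and offset are assumptions):
seeking and pausing newly assigned partitions from the onPartitionsAssigned
callback, which is where the "No current assignment" race surfaces.

{code:java}
// Sketch of the pattern under test: seek to a position and pause each newly
// assigned partition inside the rebalance callback.
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekPauseOnAssign {
    public static void run(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(List.of("topic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                for (TopicPartition tp : partitions) {
                    consumer.seek(tp, 0L);       // throws IllegalStateException if the
                    consumer.pause(List.of(tp)); // partition is no longer assigned
                }
            }

            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // no-op
            }
        });
        consumer.poll(Duration.ofMillis(100));
    }
}
{code}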



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17648) AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17648:
--
Fix Version/s: 4.0.0

> AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException
> --
>
> Key: KAFKA-17648
> URL: https://issues.apache.org/jira/browse/KAFKA-17648
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: PoAn Yang
>Assignee: PoAn Yang
>Priority: Minor
> Fix For: 4.0.0
>
>
> Followup for [https://github.com/apache/kafka/pull/17244].
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15561) Client support for new SubscriptionPattern based subscription

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15561:
--
Priority: Blocker  (was: Critical)

> Client support for new SubscriptionPattern based subscription
> -
>
> Key: KAFKA-15561
> URL: https://issues.apache.org/jira/browse/KAFKA-15561
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Phuc Hong Tran
>Priority: Blocker
>  Labels: kip-848-client-support, regex
> Fix For: 4.0.0
>
>
> The new consumer should support subscribing with the new SubscriptionPattern 
> introduced in the new consumer group protocol. When subscribing with this 
> regex, the client should provide the regex in the HB request in the 
> SubscribedTopicRegex field, delegating the resolution to the server.
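
As an illustration only, a sketch of what the client-side usage could look
like once this lands (the SubscriptionPattern type is the one named above;
group.protocol=consumer and the demo topic regex are assumptions). The regex
itself is evaluated broker side.

{code:java}
// Sketch, not the final API: subscribe with a broker-side RE2J pattern.
// The client only ships the regex in the HB request (SubscribedTopicRegex).
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.SubscriptionPattern;
import org.apache.kafka.common.serialization.StringDeserializer;

public class RegexSubscribe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "regex-demo");
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");
        try (KafkaConsumer<String, String> consumer =
                 new KafkaConsumer<>(props, new StringDeserializer(), new StringDeserializer())) {
            consumer.subscribe(new SubscriptionPattern("orders-.*"));
            consumer.poll(Duration.ofMillis(100));
        }
    }
}
{code}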



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17696) New consumer background operations unaware of metadata errors

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17696:
--
Priority: Blocker  (was: Critical)

> New consumer background operations unaware of metadata errors
> -
>
> Key: KAFKA-17696
> URL: https://issues.apache.org/jira/browse/KAFKA-17696
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: 黃竣陽
>Priority: Blocker
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
>
> When a metadata error happens (i.e. an unauthorized topic), the network 
> layer is the one to detect it, and it just propagates it to the app thread 
> via an ErrorEvent.
> [https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153]
> That allows API calls that run processBackgroundEvents to throw the error in 
> the app thread (e.g. poll, unsubscribe and close, which are the only API 
> calls that currently run processBackgroundEvents).
> This means that all other API calls, which do not run processBackgroundEvents, 
> will never learn about errors like unauthorized topics. Moreover, it means 
> that the background operations are not notified/aborted when a metadata 
> (auth) error happens. For example, a call to position blocks waiting for 
> updateFetchPositions 
> ([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586]) 
> and will leave a pendingOffsetFetchEvent waiting to complete, even when the 
> background thread already got an unauthorized exception (but only passed it 
> to the app thread via an ErrorEvent).
> I wonder if we should ensure that metadata errors are not only propagated to 
> the app thread via ErrorEvents, but also ensure that we notify all request 
> managers in the background (so that they can decide whether to 
> completeExceptionally their outstanding events). For example, 
> OffsetsRequestManager.onMetadataError should completeExceptionally the 
> pendingOffsetFetchEvent (just a first thought; there could be other 
> approaches, but note that calling processBackgroundEvents in API calls like 
> position will not do, because we would block first on the 
> CheckAndUpdatePositions event, and processBackgroundEvents would only happen 
> after that)
>  
> This behaviour can be reproduced with the integration test 
> AuthorizerIntegrationTest.testOffsetFetchWithNoAccess with the new consumer 
> enabled (discovered with [https://github.com/apache/kafka/pull/17107] )
>  
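
A first-thought sketch of the notification idea above (pendingOffsetFetchEvent
is the field named in the description; the method shape and surrounding types
are hypothetical, not the actual internal API):

{code:java}
// Hypothetical sketch: let the request manager fail its outstanding work when
// the background thread sees a metadata error, instead of only emitting an
// ErrorEvent to the app thread.
void onMetadataError(Exception metadataError) {
    if (pendingOffsetFetchEvent != null) {
        // Unblocks callers (e.g. position()) waiting on updateFetchPositions.
        pendingOffsetFetchEvent.future().completeExceptionally(metadataError);
        pendingOffsetFetchEvent = null;
    }
}
{code}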



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17337) ConsumerConfig should default to CONSUMER for group.protocol

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17337:
--
Priority: Blocker  (was: Critical)

> ConsumerConfig should default to CONSUMER for group.protocol
> 
>
> Key: KAFKA-17337
> URL: https://issues.apache.org/jira/browse/KAFKA-17337
> Project: Kafka
>  Issue Type: Task
>  Components: clients, config, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Blocker
>  Labels: consumer-threading-refactor, kip-848-client-support
> Fix For: 4.0.0
>
>
> {{ConsumerConfig}}’s default value for {{group.protocol}} should be changed 
> from {{CLASSIC}} to {{CONSUMER}} for 4.0.0.
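
For applications that need to keep today's behaviour across the upgrade, a
small sketch of pinning the protocol explicitly (config names from
ConsumerConfig; the bootstrap server and group id are placeholders):

{code:java}
// Sketch: opt out of the new default by pinning group.protocol explicitly.
// After the 4.0.0 change described above, omitting this selects CONSUMER.
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.GroupProtocol;

public class PinGroupProtocol {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo");
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, GroupProtocol.CLASSIC.name());
        return props;
    }
}
{code}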



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17593) Async regex resolution

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17593:
--
Fix Version/s: 4.0.0

> Async regex resolution
> --
>
> Key: KAFKA-17593
> URL: https://issues.apache.org/jira/browse/KAFKA-17593
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
> Fix For: 4.0.0
>
>
> Async RE2J regex evaluation, triggered from the HB as needed and performed 
> in a separate component for group regex management. It should end up 
> persisting the resolved regexes for the group. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15954:
--
Fix Version/s: (was: 4.0.0)

> Review minimal effort approach on consumer last heartbeat on unsubscribe
> 
>
> Key: KAFKA-15954
> URL: https://issues.apache.org/jira/browse/KAFKA-15954
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Minor
>  Labels: kip-848-client-support, reconciliation
>
> Currently the legacy and new consumers follow a minimal-effort approach when 
> sending a leave group (legacy) or last heartbeat request (new consumer). The 
> request is sent without waiting for/handling any response. This behaviour 
> applies when the consumer is being closed or when it unsubscribes.
> For the case when the consumer is being closed (which is a "terminal" 
> state), it probably makes sense to just follow a minimal-effort approach for 
> "properly" leaving the group (no retry logic). But for the case of 
> unsubscribe, we could consider whether it is valuable to put a little more 
> effort into making sure that the last heartbeat is sent and received by the 
> broker (e.g. what if the coordinator is not known/available when sending the 
> last HB). Note that unsubscribe could be a temporary state, where the 
> consumer might want to re-join the group at any time. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe

2024-10-07 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887395#comment-17887395
 ] 

Kirk True commented on KAFKA-15954:
---

This is marked as post-4.0 on our internal priority list, so clearing the fix 
version.

> Review minimal effort approach on consumer last heartbeat on unsubscribe
> 
>
> Key: KAFKA-15954
> URL: https://issues.apache.org/jira/browse/KAFKA-15954
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Minor
>  Labels: kip-848-client-support, reconciliation
> Fix For: 4.0.0
>
>
> Currently the legacy and new consumers follow a minimal-effort approach when 
> sending a leave group (legacy) or last heartbeat request (new consumer). The 
> request is sent without waiting for/handling any response. This behaviour 
> applies when the consumer is being closed or when it unsubscribes.
> For the case when the consumer is being closed (which is a "terminal" 
> state), it probably makes sense to just follow a minimal-effort approach for 
> "properly" leaving the group (no retry logic). But for the case of 
> unsubscribe, we could consider whether it is valuable to put a little more 
> effort into making sure that the last heartbeat is sent and received by the 
> broker (e.g. what if the coordinator is not known/available when sending the 
> last HB). Note that unsubscribe could be a temporary state, where the 
> consumer might want to re-join the group at any time. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15847) Consider partial metadata requests for client reconciliation

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15847:
--
Fix Version/s: (was: 4.0.0)

> Consider partial metadata requests for client reconciliation
> 
>
> Key: KAFKA-15847
> URL: https://issues.apache.org/jira/browse/KAFKA-15847
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor, reconciliation
>
> The new consumer implementing the KIP-848 protocol needs to resolve metadata 
> for the topics received in the assignment. It does so by relying on the 
> centralized metadata object. Currently, metadata updates requested through 
> the metadata object request metadata for all topics. Consider allowing the 
> partial updates that are already expressed as an intention in the Metadata 
> class but not fully supported (investigate the background in case there were 
> some specifics that led to this intention not being fully implemented). 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15847) Consider partial metadata requests for client reconciliation

2024-10-07 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887394#comment-17887394
 ] 

Kirk True commented on KAFKA-15847:
---

On our internal priority list, this is marked as post-4.0, so clearing the 
fix version.

> Consider partial metadata requests for client reconciliation
> 
>
> Key: KAFKA-15847
> URL: https://issues.apache.org/jira/browse/KAFKA-15847
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor, reconciliation
>
> The new consumer implementing the KIP-848 protocol needs to resolve metadata 
> for the topics received in the assignment. It does so by relying on the 
> centralized metadata object. Currently, metadata updates requested through 
> the metadata object request metadata for all topics. Consider allowing the 
> partial updates that are already expressed as an intention in the Metadata 
> class but not fully supported (investigate the background in case there were 
> some specifics that led to this intention not being fully implemented). 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15588) Purge the unsent offset commits/fetches when the member is fenced/failed

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15588:
--
Fix Version/s: (was: 4.0.0)

> Purge the unsent offset commits/fetches when the member is fenced/failed
> 
>
> Key: KAFKA-15588
> URL: https://issues.apache.org/jira/browse/KAFKA-15588
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Philip Nee
>Assignee: TengYao Chi
>Priority: Major
>  Labels: kip-848-client-support, reconciliation
>
> When the member is fenced/failed, we should purge the inflight offset commits 
> and fetches.  HeartbeatRequestManager should be able to handle this
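
A hypothetical sketch of the purge (the collection name, hook name, and
exception choice are assumptions for illustration, not the actual
CommitRequestManager code):

{code:java}
// Hypothetical sketch: when the member gets fenced/failed, fail and drop any
// offset commits/fetches that were never sent, so callers are not left hanging.
void onFenced() {
    unsentOffsetRequests.forEach(request ->
        request.future().completeExceptionally(new CommitFailedException(
            "Request cannot complete because the member has been fenced")));
    unsentOffsetRequests.clear();
}
{code}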



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15553) Review consumer positions update

2024-10-07 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887391#comment-17887391
 ] 

Kirk True commented on KAFKA-15553:
---

It looks like our internal prioritization marked this as post-4.0.0, so moving 
it out.

> Review consumer positions update
> 
>
> Key: KAFKA-15553
> URL: https://issues.apache.org/jira/browse/KAFKA-15553
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Philip Nee
>Assignee: Philip Nee
>Priority: Minor
>  Labels: consumer-threading-refactor, position
> Fix For: 4.0.0
>
>
> From the existing comment: If there are any partitions which do not have a 
> valid position and are not awaiting reset, then we need to fetch committed 
> offsets.
> In the async consumer: I wonder if it would make sense to refresh the 
> position on the event loop continuously.
> The logic to refresh offsets in the poll loop is quite fragile and works 
> largely by side-effects of the code that it calls. For example, the behaviour 
> of the "cached" value is really not that straightforward and simply reading 
> the cached value is not sufficient to start consuming data in all cases.
> This area needs a bit of a refactor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15553) Review consumer positions update

2024-10-07 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15553:
--
Fix Version/s: (was: 4.0.0)

> Review consumer positions update
> 
>
> Key: KAFKA-15553
> URL: https://issues.apache.org/jira/browse/KAFKA-15553
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Philip Nee
>Assignee: Philip Nee
>Priority: Minor
>  Labels: consumer-threading-refactor, position
>
> From the existing comment: If there are any partitions which do not have a 
> valid position and are not awaiting reset, then we need to fetch committed 
> offsets.
> In the async consumer: I wonder if it would make sense to refresh the 
> position on the event loop continuously.
> The logic to refresh offsets in the poll loop is quite fragile and works 
> largely by side-effects of the code that it calls. For example, the behaviour 
> of the "cached" value is really not that straightforward and simply reading 
> the cached value is not sufficient to start consuming data in all cases.
> This area needs a bit of a refactor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16460) New consumer times out consuming records in multiple consumer_test.py system tests

2024-10-07 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887388#comment-17887388
 ] 

Kirk True commented on KAFKA-16460:
---

[~yangpoan]—yes, you are free to take it :) Thanks!

> New consumer times out consuming records in multiple consumer_test.py system 
> tests
> --
>
> Key: KAFKA-16460
> URL: https://issues.apache.org/jira/browse/KAFKA-16460
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, system tests
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support, system-tests
> Fix For: 4.0.0
>
>
> The {{consumer_test.py}} system test fails with the following errors:
> {quote}
> * Timed out waiting for consumption
> {quote}
> Affected tests:
> * {{test_broker_failure}}
> * {{test_consumer_bounce}}
> * {{test_static_consumer_bounce}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17681:
--
Component/s: system tests

> Fix unstable consumer_test.py#test_fencing_static_consumer
> --
>
> Key: KAFKA-17681
> URL: https://issues.apache.org/jira/browse/KAFKA-17681
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, system tests
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> {code:java}
> AssertionError('Static consumers attempt to join with instance id in use 
> should not cause a rebalance.')
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 186, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 246, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 
> 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line 
> 359, in test_fencing_static_consumer
> assert num_rebalances == consumer.num_rebalances(), "Static consumers 
> attempt to join with instance id in use should not cause a rebalance. before: 
> " + str(num_rebalances) + " after: " + str(consumer.num_rebalances())
> AssertionError: Static consumers attempt to join with instance id in use 
> should not cause a rebalance.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16460) New consumer times out consuming records in multiple consumer_test.py system tests

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-16460:
-

Assignee: (was: Kirk True)

> New consumer times out consuming records in multiple consumer_test.py system 
> tests
> --
>
> Key: KAFKA-16460
> URL: https://issues.apache.org/jira/browse/KAFKA-16460
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, system tests
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support, system-tests
> Fix For: 4.0.0
>
>
> The {{consumer_test.py}} system test fails with the following errors:
> {quote}
> * Timed out waiting for consumption
> {quote}
> Affected tests:
> * {{test_broker_failure}}
> * {{test_consumer_bounce}}
> * {{test_static_consumer_bounce}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-15402) Performance regression on close consumer after upgrading to 3.5.0

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reassigned KAFKA-15402:
-

Assignee: (was: Kirk True)

> Performance regression on close consumer after upgrading to 3.5.0
> -
>
> Key: KAFKA-15402
> URL: https://issues.apache.org/jira/browse/KAFKA-15402
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.5.0, 3.6.0, 3.5.1
>Reporter: Benoit Delbosc
>Priority: Critical
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
> Attachments: image-2023-08-24-18-51-21-720.png, 
> image-2023-08-24-18-51-57-435.png, image-2023-08-25-10-50-28-079.png
>
>
> Hi,
> After upgrading to Kafka client version 3.5.0, we have observed a significant 
> increase in the duration of our Java unit tests. These unit tests heavily 
> rely on the Kafka Admin, Producer, and Consumer API.
> When using Kafka server version 3.4.1, the duration of the unit tests 
> increased from 8 seconds (with Kafka client 3.4.1) to 18 seconds (with Kafka 
> client 3.5.0).
> Upgrading the Kafka server to 3.5.1 shows similar results.
> I have come across the issue KAFKA-15178, which could be the culprit. I will 
> attempt to test the proposed patch.
> In the meantime, if you have any ideas that could help identify and address 
> the regression, please let me know.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15553) Review consumer positions update

2024-10-06 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887251#comment-17887251
 ] 

Kirk True commented on KAFKA-15553:
---

[~pnee]—do you plan on addressing this for 4.0.0?

> Review consumer positions update
> 
>
> Key: KAFKA-15553
> URL: https://issues.apache.org/jira/browse/KAFKA-15553
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Philip Nee
>Assignee: Philip Nee
>Priority: Minor
>  Labels: consumer-threading-refactor, position
> Fix For: 4.0.0
>
>
> From the existing comment: If there are any partitions which do not have a 
> valid position and are not awaiting reset, then we need to fetch committed 
> offsets.
> In the async consumer: I wonder if it would make sense to refresh the 
> position on the event loop continuously.
> The logic to refresh offsets in the poll loop is quite fragile and works 
> largely by side-effects of the code that it calls. For example, the behaviour 
> of the "cached" value is really not that straightforward and simply reading 
> the cached value is not sufficient to start consuming data in all cases.
> This area needs a bit of a refactor.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16816) Remove unneeded FencedInstanceId support on commit path for new consumer

2024-10-06 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887249#comment-17887249
 ] 

Kirk True commented on KAFKA-16816:
---

[~lianetm]—Is this something we plan to get into 4.0.0?

> Remove unneeded FencedInstanceId support on commit path for new consumer
> 
>
> Key: KAFKA-16816
> URL: https://issues.apache.org/jira/browse/KAFKA-16816
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Minor
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> The new consumer contains logic related to handling FencedInstanceId 
> exception received as a response to an OffsetCommit request (on the 
> [consumer|https://github.com/apache/kafka/blob/028e7a06dcdca7d4dbeae83f2fce0a4120cc2753/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L776]
>  and [commit 
> manager|https://github.com/apache/kafka/blob/028e7a06dcdca7d4dbeae83f2fce0a4120cc2753/clients/src/main/java/org/apache/kafka/clients/consumer/internals/CommitRequestManager.java#L715]),
>  but with the new group protocol, we will never get that error on a commit 
> response. We should remove the code that expects the FencedInstanceId on the 
> commit response, and also clean up the other related usages that we added to 
> propagate the FencedInstanceId exception on the poll, commitSync and 
> commitAsync API. Note that throwing that exception is part of the contract of 
> the poll, commitSync and commitAsync APIs of the KafkaConsumer, but it 
> changes with the new protocol. We should update the java doc for the new 
> AsyncKafkaConsumer to reflect the change.  
>  
> With the new protocol, if a consumer tries to commit offsets, there could be 
> 2 cases:
>  # empty group -> commit succeeds; fencing an instance id would never happen 
> because the group is empty
>  # non-empty group -> commit fails with UnknownMemberId, indicating that the 
> member is not known to the group. The consumer needs to join the non-empty 
> group in order to commit offsets to it. To complete the story, the moment the 
> consumer attempts to join, it will receive an UnreleasedInstanceId error on 
> the HB response, indicating it is using a groupInstanceId that is already in 
> use.
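
For illustration, a sketch of the application-visible contract this cleanup
touches (exception and behaviour per the description above; the helper method
is an assumption):

{code:java}
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.errors.FencedInstanceIdException;

public class CommitWithFencing {
    // Sketch: under the classic protocol a fenced static member could see
    // FencedInstanceIdException from commitSync; with the new protocol the
    // failure surfaces as UnknownMemberId (per the two cases above), so this
    // catch block becomes dead code for group.protocol=consumer.
    static void commitOrShutDown(Consumer<String, String> consumer) {
        try {
            consumer.commitSync();
        } catch (FencedInstanceIdException fenced) {
            consumer.close(); // classic protocol: another member owns our group.instance.id
        }
    }
}
{code}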



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe

2024-10-06 Thread Kirk True (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887250#comment-17887250
 ] 

Kirk True commented on KAFKA-15954:
---

[~lianetm]—Same here—is this going into 4.0.0?

> Review minimal effort approach on consumer last heartbeat on unsubscribe
> 
>
> Key: KAFKA-15954
> URL: https://issues.apache.org/jira/browse/KAFKA-15954
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Minor
>  Labels: kip-848-client-support, reconciliation
> Fix For: 4.0.0
>
>
> Currently the legacy and new consumers follow a minimal-effort approach when 
> sending a leave group (legacy) or last heartbeat request (new consumer). The 
> request is sent without waiting for/handling any response. This behaviour 
> applies when the consumer is being closed or when it unsubscribes.
> For the case when the consumer is being closed (which is a "terminal" 
> state), it probably makes sense to just follow a minimal-effort approach for 
> "properly" leaving the group (no retry logic). But for the case of 
> unsubscribe, we could consider whether it is valuable to put a little more 
> effort into making sure that the last heartbeat is sent and received by the 
> broker (e.g. what if the coordinator is not known/available when sending the 
> last HB). Note that unsubscribe could be a temporary state, where the 
> consumer might want to re-join the group at any time. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17337) ConsumerConfig should default to CONSUMER for group.protocol

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17337:
--
Priority: Critical  (was: Blocker)

> ConsumerConfig should default to CONSUMER for group.protocol
> 
>
> Key: KAFKA-17337
> URL: https://issues.apache.org/jira/browse/KAFKA-17337
> Project: Kafka
>  Issue Type: Task
>  Components: clients, config, consumer
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Critical
>  Labels: consumer-threading-refactor, kip-848-client-support
> Fix For: 4.0.0
>
>
> {{ConsumerConfig}}’s default value for {{group.protocol}} should be changed 
> from {{CLASSIC}} to {{CONSUMER}} for 4.0.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16460) New consumer times out consuming records in multiple consumer_test.py system tests

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16460:
--
Priority: Critical  (was: Blocker)

> New consumer times out consuming records in multiple consumer_test.py system 
> tests
> --
>
> Key: KAFKA-16460
> URL: https://issues.apache.org/jira/browse/KAFKA-16460
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, system tests
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Critical
>  Labels: kip-848-client-support, system-tests
> Fix For: 4.0.0
>
>
> The {{consumer_test.py}} system test fails with the following errors:
> {quote}
> * Timed out waiting for consumption
> {quote}
> Affected tests:
> * {{test_broker_failure}}
> * {{test_consumer_bounce}}
> * {{test_static_consumer_bounce}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15561) Client support for new SubscriptionPattern based subscription

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15561:
--
Priority: Critical  (was: Blocker)

> Client support for new SubscriptionPattern based subscription
> -
>
> Key: KAFKA-15561
> URL: https://issues.apache.org/jira/browse/KAFKA-15561
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Phuc Hong Tran
>Priority: Critical
>  Labels: kip-848-client-support, regex
> Fix For: 4.0.0
>
>
> The new consumer should support subscribing with the new SubscriptionPattern 
> introduced in the new consumer group protocol. When subscribing with this 
> regex, the client should provide the regex in the HB request in the 
> SubscribedTopicRegex field, delegating the resolution to the server.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17696) New consumer background operations unaware of metadata errors

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17696:
--
Priority: Critical  (was: Blocker)

> New consumer background operations unaware of metadata errors
> -
>
> Key: KAFKA-17696
> URL: https://issues.apache.org/jira/browse/KAFKA-17696
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: 黃竣陽
>Priority: Critical
>  Labels: consumer-threading-refactor
> Fix For: 4.0.0
>
>
> When a metadata error happens (i.e. an unauthorized topic), the network 
> layer is the one to detect it, and it just propagates it to the app thread 
> via an ErrorEvent.
> [https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153]
> That allows API calls that run processBackgroundEvents to throw the error in 
> the app thread (e.g. poll, unsubscribe and close, which are the only API 
> calls that currently run processBackgroundEvents).
> This means that all other API calls, which do not run processBackgroundEvents, 
> will never learn about errors like unauthorized topics. Moreover, it means 
> that the background operations are not notified/aborted when a metadata 
> (auth) error happens. For example, a call to position blocks waiting for 
> updateFetchPositions 
> ([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586]) 
> and will leave a pendingOffsetFetchEvent waiting to complete, even when the 
> background thread already got an unauthorized exception (but only passed it 
> to the app thread via an ErrorEvent).
> I wonder if we should ensure that metadata errors are not only propagated to 
> the app thread via ErrorEvents, but also ensure that we notify all request 
> managers in the background (so that they can decide whether to 
> completeExceptionally their outstanding events). For example, 
> OffsetsRequestManager.onMetadataError should completeExceptionally the 
> pendingOffsetFetchEvent (just a first thought; there could be other 
> approaches, but note that calling processBackgroundEvents in API calls like 
> position will not do, because we would block first on the 
> CheckAndUpdatePositions event, and processBackgroundEvents would only happen 
> after that)
>  
> This behaviour can be reproduced with the integration test 
> AuthorizerIntegrationTest.testOffsetFetchWithNoAccess with the new consumer 
> enabled (discovered with [https://github.com/apache/kafka/pull/17107] )
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16954) Move consumer leave operations on close to background thread

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16954:
--
Labels: kip-848-client-support  (was: )

> Move consumer leave operations on close to background thread
> 
>
> Key: KAFKA-16954
> URL: https://issues.apache.org/jira/browse/KAFKA-16954
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Blocker
>  Labels: kip-848-client-support
> Fix For: 3.8.0
>
>
> When a consumer unsubscribes, the app thread simply triggers an Unsubscribe 
> event that will take care of it all in the background thread: release 
> assignment (callbacks), clear assigned partitions, and send leave group HB.
> On the contrary, when a consumer is closed, these actions happen in both 
> threads:
>  * release assignment -> in the app thread by directly running the callbacks
>  * clear assignment -> in app thread by updating the subscriptionState
>  * send leave group HB -> in the background thread via an event LeaveOnClose 
> This situation could lead to race conditions, mainly because of the close 
> updating the subscription state in the app thread, when other operations in 
> the background could be already running based on it. Ex. 
>  * unsubscribe in app thread (triggers background UnsubscribeEvent to revoke 
> and leave)
>  * unsubscribe fails (ex. interrupted, leaving operation running in the 
> background thread to revoke partitions and leave)
>  * consumer close (will revoke and clear assignment in the app thread)
>  *  UnsubscribeEvent in the background may fail by trying to revoke 
> partitions that it does not own anymore - _No current assignment for 
> partition ..._
> A basic check has been added to the background thread revocation to avoid the 
> race condition, ensuring that we only revoke partitions we own, but still we 
> should avoid the root cause, which is updating the assignment on the app 
> thread. We should consider having the close operation as a single 
> LeaveOnClose event handled in the background. That event already takes care 
> of revoking the partitions and clearing the assignment in the background, so no 
> need to take care of it in the app thread. We should only ensure that we 
> processBackgroundEvents until the LeaveOnClose completes (to allow for 
> callbacks to run in the app thread)
>  
> Trying to understand the current approach, I imagine the initial motivation 
> to have the callbacks (and assignment cleared) in the app thread was to 
> avoid the back-and-forth: app thread close -> background thread leave event 
> -> app thread to run callback -> background thread to clear assignment and 
> send HB. But updating the assignment on the app thread ends up being 
> problematic, as it mainly happens in the background so it opens up the door 
> for race conditions on the subscription state. 
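
A rough sketch of the single-event close flow proposed above (the event and
handler names follow the description; the signatures are hypothetical, not the
actual internal API):

{code:java}
// Hypothetical sketch: close() only enqueues one background event and then
// keeps servicing callbacks until that event completes, so the assignment is
// never mutated from the app thread.
void close(Timer closeTimer) {
    LeaveOnCloseEvent leave = new LeaveOnCloseEvent();
    applicationEventHandler.add(leave);                   // background: revoke, clear assignment, leave HB
    processBackgroundEvents(leave.future(), closeTimer);  // app thread: keep running callbacks
}
{code}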



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16984) New consumer should not complete leave operation until it gets a response

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16984:
--
Labels: kip-848 kip-848-client-support  (was: kip-848)

> New consumer should not complete leave operation until it gets a response
> -
>
> Key: KAFKA-16984
> URL: https://issues.apache.org/jira/browse/KAFKA-16984
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 3.8.0
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.9.0
>
>
> When the new consumer attempts to leave a group, it sends a leave group 
> request in a fire-and-forget mode, so, as soon as the request is generated, 
> it will:
> 1. transition to UNSUBSCRIBED
> 2. complete the leaveGroup operation future 
> This task focuses on point 2, which has the undesired side-effect that 
> whatever might have been waiting for the leave in order to do something else 
> will carry on (ex. consumer close), leading to the responses to disconnected 
> clients we've seen when running stress tests.
> When leaving a group while closing a consumer, the member sends the leave 
> request and moves on to the next operation, which is closing the network 
> thread, so we end up with a disconnected client receiving responses from the 
> server. We should send the leave group heartbeat and transition to 
> UNSUBSCRIBED, but only 
> complete the leave operation when we get a response for it, which is a much 
> more accurate confirmation that the consumer left the group and can move on 
> with other operations.
> Note that the legacy consumer does wait for a leave response before closing 
> down the coordinator (see 
> [AbstractCoordinator|https://github.com/apache/kafka/blob/25230b538841a5e7256b1b51725361dd59435901/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L1135-L1140]),
>  and we are looking to have the same behaviour in the new consumer. 
> Note that with this task we'll only focus on changing the behaviour for the 
> leave operation completion (point 2 above) to tidy up the close flow. We are 
> not changing the transition to UNSUBSCRIBED, as it would require further 
> consideration if ever needed. 
>  
> This is also a building block for future improvements around error handling 
> for the leave request, which we don't have at the moment (related Jira linked)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16937) Consider inlineing Time#waitObject to ProducerMetadata#awaitUpdate

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16937:
--
Component/s: clients
 producer 

> Consider inlineing Time#waitObject to ProducerMetadata#awaitUpdate
> --
>
> Key: KAFKA-16937
> URL: https://issues.apache.org/jira/browse/KAFKA-16937
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Minor
>
> Time#waitObject is implemented with a while-loop and is used by 
> `ProducerMetadata` only. Hence, this jira can include the following changes:
> 1. move `Time#waitObject` into `ProducerMetadata#awaitUpdate`
> 2. ProducerMetadata#awaitUpdate can throw an "exact" TimeoutException [0]
> [0] 
> https://github.com/apache/kafka/blob/23fe71d579f84d59ebfe6d5a29e688315cec1285/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1176
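
A sketch of what the inlined method could look like (the body is illustrative
only; it assumes the metadata object exposes updateVersion() and a Time
instance, and the exact exception message is a placeholder):

{code:java}
// Illustrative inlining sketch: replace the generic Time#waitObject helper
// with a deadline loop inside ProducerMetadata#awaitUpdate that can throw a
// TimeoutException naming the actual wait duration.
public synchronized void awaitUpdate(final int lastVersion, final long timeoutMs)
        throws InterruptedException {
    long deadlineMs = time.milliseconds() + timeoutMs;
    while (updateVersion() <= lastVersion) {
        long remainingMs = deadlineMs - time.milliseconds();
        if (remainingMs <= 0)
            throw new TimeoutException("Metadata was not updated after " + timeoutMs + " ms.");
        wait(remainingMs);
    }
}
{code}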



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17279) Handle retriable errors from offset fetches in ConsumerCoordinator

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17279:
--
Component/s: clients

> Handle retriable errors from offset fetches in ConsumerCoordinator
> --
>
> Key: KAFKA-17279
> URL: https://issues.apache.org/jira/browse/KAFKA-17279
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Sean Quah
>Assignee: Sean Quah
>Priority: Minor
> Fix For: 3.9.0
>
>
> Currently {{ConsumerCoordinator}}'s {{OffsetFetchResponseHandler}} only 
> retries on {{COORDINATOR_LOAD_IN_PROGRESS}} and {{NOT_COORDINATOR}} errors.
> The error handling should be expanded to retry on all retriable errors.
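
A simplified sketch of the broadened check (the helper is an assumption; the
real logic lives in ConsumerCoordinator's OffsetFetchResponseHandler):

{code:java}
// Simplified sketch: retry on any retriable error instead of special-casing
// COORDINATOR_LOAD_IN_PROGRESS / NOT_COORDINATOR only.
import org.apache.kafka.common.errors.RetriableException;
import org.apache.kafka.common.protocol.Errors;

public class OffsetFetchRetryCheck {
    public static boolean shouldRetry(Errors error) {
        // Errors#exception returns the ApiException for this error code; all
        // retriable codes map to subclasses of RetriableException.
        return error.exception() instanceof RetriableException;
    }
}
{code}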



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17478) Wrong configuration of metric.reporters lead to NPE in KafkaProducer constructor

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17478:
--
Component/s: producer 

> Wrong configuration of metric.reporters lead to NPE in KafkaProducer 
> constructor
> 
>
> Key: KAFKA-17478
> URL: https://issues.apache.org/jira/browse/KAFKA-17478
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 3.7.0, 3.8.0, 3.7.1
>Reporter: Fred Rouleau
>Assignee: Fred Rouleau
>Priority: Minor
> Fix For: 4.0.0
>
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> If the metric.reporters property contains an invalid class, the 
> KafkaProducer constructor fails with a non-explicit NPE:
> {code:java}
> Exception in thread "main" java.lang.NullPointerException: Cannot invoke 
> "java.util.Optional.ifPresent(java.util.function.Consumer)" because 
> "this.clientTelemetryReporter" is null
>     at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1424)
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:472)
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:295)
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:322)
>     at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:307)
>     at org.frouleau.kafka.clients.Produce.sendAvroSpecific(Produce.java:89)
>     at org.frouleau.kafka.clients.Produce.main(Produce.java:63){code}
> This behavior was introduced by KAFKA-15901 implementing KIP-714.
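
A minimal repro sketch of the misconfiguration (the reporter class name is a
deliberately bogus placeholder, and the bootstrap server is an assumption):

{code:java}
// Sketch: an invalid metric.reporters entry makes the constructor throw, and
// on affected versions the close() path in the catch block then NPEs on the
// never-initialised clientTelemetryReporter instead of reporting the real
// configuration problem.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class BadReporterRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG, "com.example.DoesNotExist"); // bogus class
        new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());
    }
}
{code}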



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17681:
--
Labels: kip-848-client-support  (was: )

> Fix unstable consumer_test.py#test_fencing_static_consumer
> --
>
> Key: KAFKA-17681
> URL: https://issues.apache.org/jira/browse/KAFKA-17681
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> {code:java}
> AssertionError('Static consumers attempt to join with instance id in use 
> should not cause a rebalance.')
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 186, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 246, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 
> 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line 
> 359, in test_fencing_static_consumer
> assert num_rebalances == consumer.num_rebalances(), "Static consumers 
> attempt to join with instance id in use should not cause a rebalance. before: 
> " + str(num_rebalances) + " after: " + str(consumer.num_rebalances())
> AssertionError: Static consumers attempt to join with instance id in use 
> should not cause a rebalance.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17681:
--
Component/s: clients
 consumer

> Fix unstable consumer_test.py#test_fencing_static_consumer
> --
>
> Key: KAFKA-17681
> URL: https://issues.apache.org/jira/browse/KAFKA-17681
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>
> {code:java}
> AssertionError('Static consumers attempt to join with instance id in use 
> should not cause a rebalance.')
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 186, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 246, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 
> 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line 
> 359, in test_fencing_static_consumer
> assert num_rebalances == consumer.num_rebalances(), "Static consumers 
> attempt to join with instance id in use should not cause a rebalance. before: 
> " + str(num_rebalances) + " after: " + str(consumer.num_rebalances())
> AssertionError: Static consumers attempt to join with instance id in use 
> should not cause a rebalance.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-17681:
--
Fix Version/s: 4.0.0

> Fix unstable consumer_test.py#test_fencing_static_consumer
> --
>
> Key: KAFKA-17681
> URL: https://issues.apache.org/jira/browse/KAFKA-17681
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
> Fix For: 4.0.0
>
>
> {code:java}
> AssertionError('Static consumers attempt to join with instance id in use 
> should not cause a rebalance.')
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 186, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 246, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 
> 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line 
> 359, in test_fencing_static_consumer
> assert num_rebalances == consumer.num_rebalances(), "Static consumers 
> attempt to join with instance id in use should not cause a rebalance. before: 
> " + str(num_rebalances) + " after: " + str(consumer.num_rebalances())
> AssertionError: Static consumers attempt to join with instance id in use 
> should not cause a rebalance.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16142) Update metrics documentation for errors

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16142:
--
Fix Version/s: (was: 4.0.0)

> Update metrics documentation for errors
> ---
>
> Key: KAFKA-16142
> URL: https://issues.apache.org/jira/browse/KAFKA-16142
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer, documentation, metrics
>Affects Versions: 3.7.0
>Reporter: Kirk True
>Assignee: Philip Nee
>Priority: Minor
>  Labels: consumer-threading-refactor, metrics
>
> We need to identify the “errors” that exist in the current JMX documentation 
> and resolve them. Per [~pnee] there are errors on the JMX web page, which he 
> will identify and resolve.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15556) Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, and tryConnect

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15556:
--
Fix Version/s: (was: 4.0.0)

> Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, 
> and tryConnect
> -
>
> Key: KAFKA-15556
> URL: https://issues.apache.org/jira/browse/KAFKA-15556
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Phuc Hong Tran
>Priority: Minor
>  Labels: consumer-threading-refactor
>
> The "new consumer" (i.e. {{{}PrototypeAsyncConsumer{}}}) was designed to 
> handle networking details in a more centralized way. However, in order to 
> reuse code between the existing {{KafkaConsumer}} and the new 
> {{{}PrototypeAsyncConsumer{}}}, that design goal was "relaxed" when the 
> {{NetworkClientDelegate}} capitulated and -stole- copied three methods from 
> {{ConsumerNetworkClient}} related to detecting node status:
>  # {{isUnavailable}}
>  # {{maybeThrowAuthFailure}}
>  # {{tryConnect}}
> Unfortunately, these have found their way into the {{FetchRequestManager}} 
> and {{OffsetsRequestManager}} implementations. We should review if we can 
> clean up—or even remove—this leaky abstraction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15867) Should ConsumerNetworkThread wrap the exception and notify the polling thread?

2024-10-06 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-15867:
--
Fix Version/s: (was: 4.0.0)

> Should ConsumerNetworkThread wrap the exception and notify the polling thread?
> --
>
> Key: KAFKA-15867
> URL: https://issues.apache.org/jira/browse/KAFKA-15867
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Philip Nee
>Assignee: Phuc Hong Tran
>Priority: Minor
>  Labels: consumer-threading-refactor, events
>
> The ConsumerNetworkThread runs a tight loop infinitely.  However, when 
> encountering an unexpected exception, it logs an error and continues.
>  
> I think this might not be ideal, because the user can run blind for a long 
> time before discovering there's something wrong with the code; so I believe 
> we should propagate the throwable back to the polling thread. 
>  
> cc [~lucasbru] 
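
A hypothetical sketch of the proposal (ErrorEvent and the background event
queue are named after the existing background-to-app-thread channel; the loop
shape and names are illustrative):

{code:java}
// Hypothetical sketch: instead of only logging, hand the throwable to the
// polling thread via the background event queue so the next poll() rethrows.
while (running) {
    try {
        runOnce();
    } catch (Throwable t) {
        log.error("Unexpected error in consumer network thread", t);
        backgroundEventQueue.add(new ErrorEvent(t)); // surfaced on poll()
    }
}
{code}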



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

