[jira] [Updated] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft
[ https://issues.apache.org/jira/browse/KAFKA-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17925: -- Labels: integration-test (was: ) > Convert Kafka Client integration tests to use KRaft > --- > > Key: KAFKA-17925 > URL: https://issues.apache.org/jira/browse/KAFKA-17925 > Project: Kafka > Issue Type: Task > Components: clients >Affects Versions: 4.0.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Blocker > Labels: integration-test > > Update quota, truncation, and client compatibility tests to use KRaft. Tests > that do not inject a metadata_quorum argument default to using > ZooKeeper. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft
[ https://issues.apache.org/jira/browse/KAFKA-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17925: -- Fix Version/s: 4.0.0 > Convert Kafka Client integration tests to use KRaft > --- > > Key: KAFKA-17925 > URL: https://issues.apache.org/jira/browse/KAFKA-17925 > Project: Kafka > Issue Type: Task > Components: clients >Affects Versions: 4.0.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Blocker > Labels: integration-test > Fix For: 4.0.0 > > > Update quota, truncation, and client compatibility tests to use KRaft. Tests > that do not inject a metadata_quorum argument default to using > ZooKeeper. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft
[ https://issues.apache.org/jira/browse/KAFKA-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17925: -- Component/s: (was: system tests) > Convert Kafka Client integration tests to use KRaft > --- > > Key: KAFKA-17925 > URL: https://issues.apache.org/jira/browse/KAFKA-17925 > Project: Kafka > Issue Type: Task > Components: clients >Affects Versions: 4.0.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Blocker > > Update quota, truncation, and client compatibility tests to use KRaft. Tests > that do not inject a metadata_quorum argument default to using > ZooKeeper. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17925) Convert Kafka Client integration tests to use KRaft
Kirk True created KAFKA-17925: - Summary: Convert Kafka Client integration tests to use KRaft Key: KAFKA-17925 URL: https://issues.apache.org/jira/browse/KAFKA-17925 Project: Kafka Issue Type: Task Components: clients, system tests Affects Versions: 4.0.0 Reporter: Kirk True Assignee: Kirk True Update quota, truncation, and client compatibility tests to use KRaft. Tests that do not inject a metadata_quorum argument default to using ZooKeeper. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-17915) Convert Kafka Client system tests to use KRaft
[ https://issues.apache.org/jira/browse/KAFKA-17915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-17915: - Assignee: Kirk True > Convert Kafka Client system tests to use KRaft > -- > > Key: KAFKA-17915 > URL: https://issues.apache.org/jira/browse/KAFKA-17915 > Project: Kafka > Issue Type: Task > Components: clients, system tests >Affects Versions: 4.0.0 >Reporter: Kevin Wu >Assignee: Kirk True >Priority: Blocker > > Update quota, truncation, and client compatibility tests to use KRaft. Tests > that do not inject a metadata_quorum argument default to using > ZooKeeper. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16985) Ensure consumer attempts to send leave request on close even if interrupted
[ https://issues.apache.org/jira/browse/KAFKA-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16985: -- Summary: Ensure consumer attempts to send leave request on close even if interrupted (was: Ensure consumer sends leave request on close even if interrupted) > Ensure consumer attempts to send leave request on close even if interrupted > --- > > Key: KAFKA-16985 > URL: https://issues.apache.org/jira/browse/KAFKA-16985 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.8.0 >Reporter: Lianet Magrans >Assignee: Kirk True >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > While running some stress tests we found that consumers were not leaving the > group if interrupted while closing (leading to members eventually getting > fenced after the session expired). On close, the consumer generates an > Unsubscribe event to be handled in the background, but we noticed that the > network thread failed with the interruption, seemingly not sending the unsent > requests. We should review this to ensure that a member does a clean leave, > notifying the coordinator with a leave HB, even if in a fire-and-forget mode > in the case of interruption (and validate the legacy consumer behaviour in > this scenario). (Still under investigation; I'll add more info as I > discover it.) -- This message was sent by Atlassian Jira (v8.20.10#820010)
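The clean-leave-on-interrupt behavior described above can be sketched minimally as follows. This is an illustration only, not the actual consumer code; InterruptSafeClose and sendLeaveGroupBestEffort are hypothetical names.

{code:java}
import java.time.Duration;

// Minimal sketch of a close() that preserves the caller's interrupt status
// while still attempting a fire-and-forget leave-group request. All names
// here are hypothetical; this is not the actual AsyncKafkaConsumer code.
public class InterruptSafeClose {

    public void close(Duration timeout) {
        boolean interrupted = Thread.interrupted(); // clear and remember the flag
        try {
            sendLeaveGroupBestEffort(timeout);      // best-effort leave heartbeat
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore interrupt status
            }
        }
    }

    private void sendLeaveGroupBestEffort(Duration timeout) {
        // In the real consumer this would enqueue an Unsubscribe event for the
        // background thread and flush unsent requests before the timeout expires.
    }
}
{code}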
[jira] [Updated] (KAFKA-17821) the set of configs displayed by `logAll` could be invalid
[ https://issues.apache.org/jira/browse/KAFKA-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17821: -- Component/s: clients consumer > the set of configs displayed by `logAll` could be invalid > - > > Key: KAFKA-17821 > URL: https://issues.apache.org/jira/browse/KAFKA-17821 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: 黃竣陽 >Priority: Major > > see https://github.com/apache/kafka/pull/16899#discussion_r1743632489 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14517) Implement regex subscriptions
[ https://issues.apache.org/jira/browse/KAFKA-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-14517: -- Fix Version/s: 4.0.0 > Implement regex subscriptions > - > > Key: KAFKA-14517 > URL: https://issues.apache.org/jira/browse/KAFKA-14517 > Project: Kafka > Issue Type: Sub-task >Reporter: David Jacot >Assignee: Lianet Magrans >Priority: Major > Labels: kip-848-preview > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17861) Serialize with ByteBuffer instead of byte[]
[ https://issues.apache.org/jira/browse/KAFKA-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17861: -- Labels: kip-required (was: ) > Serialize with ByteBuffer instead of byte[] > --- > > Key: KAFKA-17861 > URL: https://issues.apache.org/jira/browse/KAFKA-17861 > Project: Kafka > Issue Type: Wish > Components: clients, producer >Affects Versions: 3.3.2 >Reporter: Sarah Hennenkamp >Priority: Minor > Labels: kip-required > > This is a request to consider changing the return value of the > [Serializer|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/serialization/Serializer.java] > from a byte[] to a ByteBuffer. > > Understandably folks may balk at this since it's a large lift. However, we've > noticed a good chunk of memory allocation in our application comes from the > [KafkaProducer > serializing|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1045] > the key and value pair. Using ByteBuffer could allow this to be off-heap and > save on garbage collection time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
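For illustration, a ByteBuffer-returning serializer might look like the sketch below. Kafka's actual Serializer interface returns byte[]; the ByteBufferSerializer interface shown here does not exist in the client and is only an assumption about what the ticket proposes.

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical alternative to org.apache.kafka.common.serialization.Serializer
// that returns a ByteBuffer instead of byte[]. Nothing below exists in the
// client today.
interface ByteBufferSerializer<T> {
    ByteBuffer serialize(String topic, T data);
}

class StringByteBufferSerializer implements ByteBufferSerializer<String> {
    @Override
    public ByteBuffer serialize(String topic, String data) {
        byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
        // A direct (off-heap) buffer is what could reduce GC pressure.
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes.length);
        buffer.put(bytes);
        buffer.flip();
        return buffer;
    }
}
{code}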
[jira] [Updated] (KAFKA-17862) [buffer pool] corruption during buffer reuse from the pool
[ https://issues.apache.org/jira/browse/KAFKA-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17862: -- Component/s: clients producer > [buffer pool] corruption during buffer reuse from the pool > -- > > Key: KAFKA-17862 > URL: https://issues.apache.org/jira/browse/KAFKA-17862 > Project: Kafka > Issue Type: Bug > Components: clients, core, producer >Affects Versions: 3.7.1 >Reporter: Bharath Vissapragada >Priority: Major > Attachments: client-config.txt > > > We noticed malformed batches from the Kafka Java client + Redpanda under > certain conditions that caused excessive client retries, and we narrowed it > down to a client bug related to corruption of buffers reused from the buffer > pool. We were able to reproduce it with Kafka brokers too, so we are fairly > certain the bug is on the client. > (Attached the full client config, fwiw) > We narrowed it down to a race condition between produce requests and failed > batch expiration. If the network flush of a produce request races with the > expiration, the produce batch that the request uses is corrupted, so a > malformed batch is sent to the broker. > The expiration is triggered by a timeout > [https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L392C13-L392C22] > that eventually deallocates the batch > [https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L773] > adding it back to the buffer pool > [https://github.com/apache/kafka/blob/661bed242e8d7269f134ea2f6a24272ce9b720e9/clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java#L1054] > At that point the buffer is probably zeroed out, or a competing producer > requests a new append that reuses this freed-up buffer and starts writing to > it, corrupting its contents. > If there is a racing network flush of a produce batch backed by this buffer, a > corrupt batch is sent to the broker, resulting in a CRC mismatch. > This issue can be easily reproduced in a simulated environment that triggers > frequent timeouts (e.g., lower timeouts) and then uses a producer with high-ish > throughput that can cause longer queues (hence higher chances of expiration) > and frequent buffer reuse from the pool (deadly combination :)) -- This message was sent by Atlassian Jira (v8.20.10#820010)
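The hazard described above can be demonstrated in isolation with a toy pool. The sketch below is single-threaded for determinism (the real bug is a race between the sender and expiration paths) and uses only hypothetical names.

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.zip.CRC32;

// Toy demonstration: a buffer freed back to a pool while a send is still
// pending gets reused by another append, so the pending send reads corrupt
// bytes and the broker sees a CRC mismatch.
public class BufferReuseDemo {
    public static void main(String[] args) {
        ArrayDeque<ByteBuffer> pool = new ArrayDeque<>();

        ByteBuffer batch = ByteBuffer.allocate(16);
        batch.put("record-1".getBytes());
        batch.flip();
        long crcAtEnqueue = crc(batch);   // checksum when the batch was enqueued

        pool.push(batch);                 // batch "expired": freed to the pool

        ByteBuffer reused = pool.pop();   // a competing append grabs the buffer...
        reused.clear();
        reused.put("record-2".getBytes());
        reused.flip();                    // ...and overwrites its contents

        long crcAtSend = crc(batch);      // the pending send now reads reused bytes
        System.out.println("CRC mismatch: " + (crcAtEnqueue != crcAtSend)); // true
    }

    static long crc(ByteBuffer buffer) {
        CRC32 crc32 = new CRC32();
        crc32.update(buffer.duplicate()); // duplicate() leaves position untouched
        return crc32.getValue();
    }
}
{code}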
[jira] [Commented] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892638#comment-17892638 ] Kirk True commented on KAFKA-15284: --- [~yangpoan]—sorry, I am working on this but forgot to update the status. Thanks! > Implement GroupProtocolResolver to dynamically determine consumer group > protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Assignee: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the > {{{}GroupProtocolResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fall-back path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
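A minimal sketch of the resolver shape described in the ticket follows. None of these types is real client code; eligibility is reduced to a single boolean for illustration.

{code:java}
// Hypothetical shape of the GroupProtocolResolver described above; in reality
// eligibility would be derived from RPCs (ApiVersions, IBP version, etc.)
// issued during Consumer initialization.
interface ConsumerDelegate { }

class LegacyKafkaConsumerDelegate implements ConsumerDelegate { }

class AsyncKafkaConsumerDelegate implements ConsumerDelegate { }

class GroupProtocolResolver {
    // Choose the delegate, falling back to the legacy implementation when the
    // cluster is not eligible for the new KIP-848 protocol.
    ConsumerDelegate resolve(boolean eligibleForNewProtocol) {
        return eligibleForNewProtocol
                ? new AsyncKafkaConsumerDelegate()
                : new LegacyKafkaConsumerDelegate();
    }
}
{code}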
[jira] [Assigned] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-15284: - Assignee: Kirk True > Implement GroupProtocolResolver to dynamically determine consumer group > protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Assignee: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the > {{{}GroupProtocolResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fall-back path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17861) Serialize with ByteBuffer instead of byte[]
[ https://issues.apache.org/jira/browse/KAFKA-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17861: -- Component/s: clients > Serialize with ByteBuffer instead of byte[] > --- > > Key: KAFKA-17861 > URL: https://issues.apache.org/jira/browse/KAFKA-17861 > Project: Kafka > Issue Type: Wish > Components: clients, producer >Affects Versions: 3.3.2 >Reporter: Sarah Hennenkamp >Priority: Minor > > This is a request to consider changing the return value of the > [Serializer|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/serialization/Serializer.java] > from a byte[] to a ByteBuffer. > > Understandably folks may balk at this since it's a large lift. However, we've > noticed a good chunk of memory allocation in our application comes from the > [KafkaProducer > serializing|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1045] > the key and value pair. Using ByteBuffer could allow this to be off-heap and > save on garbage collection time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17686) AsyncKafkaConsumer.offsetsForTimes() fails with NullPointerException
[ https://issues.apache.org/jira/browse/KAFKA-17686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17686: -- Reviewer: Chia-Ping Tsai > AsyncKafkaConsumer.offsetsForTimes() fails with NullPointerException > > > Key: KAFKA-17686 > URL: https://issues.apache.org/jira/browse/KAFKA-17686 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Critical > Labels: consumer-threading-refactor, integration-tests > Fix For: 4.0.0 > > > Error when running the integration test: > {noformat} > Gradle Test Run :core:integrationTest > Gradle Test Executor 10 > > PlaintextAdminIntegrationTest > testOffsetsForTimesAfterDeleteRecords(String) > > "testOffsetsForTimesAfterDeleteRecords(String).quorum=kraft" FAILED > java.lang.NullPointerException: Cannot invoke > "org.apache.kafka.clients.consumer.internals.OffsetAndTimestampInternal.buildOffsetAndTimestamp()" > because the return value of "java.util.Map$Entry.getValue()" is null > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.lambda$offsetsForTimes$4(AsyncKafkaConsumer.java:1082) > at > java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) > at > java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) > at > java.base/java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1858) > at > java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) > at > java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) > at > java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) > > at > java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.offsetsForTimes(AsyncKafkaConsumer.java:1080) > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.offsetsForTimes(AsyncKafkaConsumer.java:1043) > at > org.apache.kafka.clients.consumer.KafkaConsumer.offsetsForTimes(KafkaConsumer.java:1560) > at > kafka.api.PlaintextAdminIntegrationTest.testOffsetsForTimesAfterDeleteRecords(PlaintextAdminIntegrationTest.scala:1535) > > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
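The NullPointerException above comes from a Collectors.toMap-style accumulator, which rejects null map values. Below is a simplified, null-tolerant way to build such a result map; the types are stand-ins for the real OffsetAndTimestamp machinery, not the actual fix.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Simplified illustration: building a result map when some lookups yielded no
// offset (null). A stream with Collectors.toMap() throws NullPointerException
// on null values; a plain loop can carry them through safely.
public class NullTolerantMapping {
    public static void main(String[] args) {
        Map<String, Long> internal = new HashMap<>();
        internal.put("topic-0", 42L);
        internal.put("topic-1", null);   // no offset found for this timestamp

        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, Long> entry : internal.entrySet()) {
            Long offset = entry.getValue();
            // Only convert when a value is present; toMap() would NPE here.
            result.put(entry.getKey(), offset == null ? null : "offset=" + offset);
        }
        System.out.println(result);      // e.g. {topic-0=offset=42, topic-1=null}
    }
}
{code}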
[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15284: -- Description: At client initialization, we need to determine which of the {{ConsumerDelegate}} implementations to use: # {{LegacyKafkaConsumerDelegate}} # {{AsyncKafkaConsumerDelegate}} There are conditions defined by KIP-848 that determine client eligibility to use the new protocol. This will be modeled by the {{{}GroupProtocolResolver{}}}. Known tasks: * Determine at what point in the {{Consumer}} initialization the network communication should happen * Determine what RPCs to invoke in order to determine eligibility (API versions, IBP version, etc.) * Implement the network client lifecycle (startup, communication, shutdown, etc.) * Determine the fall-back path in case the client is not eligible to use the protocol was: At client initialization, we need to determine which of the {{ConsumerDelegate}} implementations to use: # {{LegacyKafkaConsumerDelegate}} # {{AsyncKafkaConsumerDelegate}} There are conditions defined by KIP-848 that determine client eligibility to use the new protocol. This will be modeled by the {{{}GroupProtocolResolver{}}}. Known tasks: * Determine at what point in the {{Consumer}} initialization the network communication should happen * Determine what RPCs to invoke in order to determine eligibility (API versions, IBP version, etc.) * Implement the network client lifecycle (startup, communication, shutdown, etc.) * Determine the fallback path in case the client is not eligible to use the protocol > Implement GroupProtocolResolver to dynamically determine consumer group > protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the > {{{}GroupProtocolResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fall-back path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to dynamically determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15284: -- Summary: Implement GroupProtocolResolver to dynamically determine consumer group protocol (was: Implement GroupProtocolResolver to determine consumer group protocol) > Implement GroupProtocolResolver to dynamically determine consumer group > protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the > {{{}GroupProtocolResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fallback path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15284: -- Priority: Critical (was: Major) > Implement GroupProtocolResolver to determine consumer group protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the > {{{}GroupProtocolResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fallback path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-15284) Implement ConsumerGroupProtocolVersionResolver to determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-15284: - Assignee: (was: Kirk True) > Implement ConsumerGroupProtocolVersionResolver to determine consumer group > protocol > --- > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the—deep > breath—{{{}ConsumerGroupProtocolVersionResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fallback path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15284: -- Description: At client initialization, we need to determine which of the {{ConsumerDelegate}} implementations to use: # {{LegacyKafkaConsumerDelegate}} # {{AsyncKafkaConsumerDelegate}} There are conditions defined by KIP-848 that determine client eligibility to use the new protocol. This will be modeled by the {{{}GroupProtocolResolver{}}}. Known tasks: * Determine at what point in the {{Consumer}} initialization the network communication should happen * Determine what RPCs to invoke in order to determine eligibility (API versions, IBP version, etc.) * Implement the network client lifecycle (startup, communication, shutdown, etc.) * Determine the fallback path in case the client is not eligible to use the protocol was: At client initialization, we need to determine which of the {{ConsumerDelegate}} implementations to use: # {{LegacyKafkaConsumerDelegate}} # {{AsyncKafkaConsumerDelegate}} There are conditions defined by KIP-848 that determine client eligibility to use the new protocol. This will be modeled by the—deep breath—{{{}ConsumerGroupProtocolVersionResolver{}}}. Known tasks: * Determine at what point in the {{Consumer}} initialization the network communication should happen * Determine what RPCs to invoke in order to determine eligibility (API versions, IBP version, etc.) * Implement the network client lifecycle (startup, communication, shutdown, etc.) * Determine the fallback path in case the client is not eligible to use the protocol > Implement GroupProtocolResolver to determine consumer group protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the > {{{}GroupProtocolResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fallback path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15284) Implement GroupProtocolResolver to determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15284: -- Summary: Implement GroupProtocolResolver to determine consumer group protocol (was: Implement ConsumerGroupProtocolVersionResolver to determine consumer group protocol) > Implement GroupProtocolResolver to determine consumer group protocol > > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the—deep > breath—{{{}ConsumerGroupProtocolVersionResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fallback path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (KAFKA-15284) Implement ConsumerGroupProtocolVersionResolver to determine consumer group protocol
[ https://issues.apache.org/jira/browse/KAFKA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reopened KAFKA-15284: --- Assignee: Kirk True > Implement ConsumerGroupProtocolVersionResolver to determine consumer group > protocol > --- > > Key: KAFKA-15284 > URL: https://issues.apache.org/jira/browse/KAFKA-15284 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Kirk True >Assignee: Kirk True >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > At client initialization, we need to determine which of the > {{ConsumerDelegate}} implementations to use: > # {{LegacyKafkaConsumerDelegate}} > # {{AsyncKafkaConsumerDelegate}} > There are conditions defined by KIP-848 that determine client eligibility to > use the new protocol. This will be modeled by the—deep > breath—{{{}ConsumerGroupProtocolVersionResolver{}}}. > Known tasks: > * Determine at what point in the {{Consumer}} initialization the network > communication should happen > * Determine what RPCs to invoke in order to determine eligibility (API > versions, IBP version, etc.) > * Implement the network client lifecycle (startup, communication, shutdown, > etc.) > * Determine the fallback path in case the client is not eligible to use the > protocol -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-17826) Consumer#offsetsForTimes should not return null value
[ https://issues.apache.org/jira/browse/KAFKA-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891620#comment-17891620 ] Kirk True commented on KAFKA-17826: --- I agree with this change. Since the documentation in {{KafkaConsumer}} mentions returning {{{}null{}}}, can we simply update the documentation along with the code change, or do we need a KIP to make the change? > Consumer#offsetsForTimes should not return null value > - > > Key: KAFKA-17826 > URL: https://issues.apache.org/jira/browse/KAFKA-17826 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Minor > > https://github.com/apache/kafka/pull/17353#discussion_r1795032707 > The map returned by Consumer#offsetsForTimes can have null values if the > specified timestamp is not mapped to any offset. That is an anti-pattern for the > following reasons: > 1. most Java 11+ collections can't accept null values > 2. other similar methods in the consumer do not do that > 3. the admin client does not do that -- This message was sent by Atlassian Jira (v8.20.10#820010)
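Point 1 above is easy to verify: the newer collection factories reject null values outright, as this small sketch shows.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Demonstrates why null map values are hostile to modern Java collections:
// HashMap tolerates them, but Map.copyOf (Java 10+) throws on them.
public class NullValueDemo {
    public static void main(String[] args) {
        Map<String, Long> withNull = new HashMap<>();
        withNull.put("topic-0", null);   // legal in HashMap

        try {
            Map.copyOf(withNull);        // rejects null keys and values
        } catch (NullPointerException e) {
            System.out.println("Map.copyOf rejects null values");
        }
    }
}
{code}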
[jira] [Updated] (KAFKA-17826) Consumer#offsetsForTimes should not return null value
[ https://issues.apache.org/jira/browse/KAFKA-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17826: -- Component/s: clients consumer > Consumer#offsetsForTimes should not return null value > - > > Key: KAFKA-17826 > URL: https://issues.apache.org/jira/browse/KAFKA-17826 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Minor > > https://github.com/apache/kafka/pull/17353#discussion_r1795032707 > The map returned by Consumer#offsetsForTimes can have null values if the > specified timestamp is not mapped to any offset. That is an anti-pattern for the > following reasons: > 1. most Java 11+ collections can't accept null values > 2. other similar methods in the consumer do not do that > 3. the admin client does not do that -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17843) Add integration tests to validate clients close when closed just after instantiation
[ https://issues.apache.org/jira/browse/KAFKA-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17843: -- Component/s: clients consumer > Add integration tests to validate clients close when closed just after > instantiation > > > Key: KAFKA-17843 > URL: https://issues.apache.org/jira/browse/KAFKA-17843 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Apoorv Mittal >Assignee: Apoorv Mittal >Priority: Major > > We have seen an issue with the Kafka consumer which triggers a thread wait when > the consumer is closed just after instantiation. Though the > [issue|https://issues.apache.org/jira/browse/KAFKA-17731] has now been fixed, > we should write test cases to avoid any such issue in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17843) Add integration tests to validate clients close when closed just after instantiation
[ https://issues.apache.org/jira/browse/KAFKA-17843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17843: -- Labels: integration-test (was: ) > Add integration tests to validate clients close when closed just after > instantiation > > > Key: KAFKA-17843 > URL: https://issues.apache.org/jira/browse/KAFKA-17843 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Apoorv Mittal >Assignee: Apoorv Mittal >Priority: Major > Labels: integration-test > > We have seen an issue with the Kafka consumer which triggers a thread wait when > the consumer is closed just after instantiation. Though the > [issue|https://issues.apache.org/jira/browse/KAFKA-17731] has now been fixed, > we should write test cases to avoid any such issue in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17731) Kafka consumer client sometimes elapses wait time for terminating telemetry push
[ https://issues.apache.org/jira/browse/KAFKA-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17731: -- Component/s: consumer > Kafka consumer client sometimes elapses wait time for terminating telemetry > push > > > Key: KAFKA-17731 > URL: https://issues.apache.org/jira/browse/KAFKA-17731 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.7.0, 3.8.0, 3.7.1 >Reporter: Apoorv Mittal >Assignee: Apoorv Mittal >Priority: Blocker > Fix For: 3.9.0, 3.7.2, 3.8.1 > > > ClientTelemetryReporter awaits the timeout to send the terminating telemetry > push, but sometimes the wait elapses. The condition occurs intermittently, > which can affect the closing time of the consumer. > > {*}The issue can only be reproduced when the consumer is closed just after > creation, i.e. the Kafka consumer is instantiated and immediately closed{*}. When the consumer > is closed immediately, the worker thread enters the {{timed_waiting}} state > and expects the last telemetry push request, which gets completed by the background > thread's poll. But as the consumer is closed immediately, the heartbeat thread > can't send the telemetry request, which makes the consumer's close wait for the > timeout. > > Debug logs: > > {code:java} > [2024-10-08 18:58:48,223] DEBUG > (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter) > :224 - Thread - 17 - Initiate close of ClientTelemetryReporter > [2024-10-08 18:58:48,223] DEBUG > (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter) > :610 - Thread - 17 - initiate close for client telemetry, check if terminal > push required. Timeout 3 ms. > [2024-10-08 18:58:48,223] DEBUG > (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter) > :834 - Thread - 17 - Setting telemetry state from PUSH_NEEDED to > TERMINATING_PUSH_NEEDED > [2024-10-08 18:58:48,223] INFO > (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter) > :632 - Thread - 17 - About to wait 3 ms. for terminal telemetry push to > be submitted > [2024-10-08 18:59:18,729] INFO > (org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter) > :634 - Thread - 17 - Wait for terminal telemetry push to be submitted has > elapsed; may not have actually sent request > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
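The close-time wait described above follows a latch-with-timeout pattern. A simplified sketch, under the assumption that only the background thread's poll trips the latch (all names hypothetical):

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Simplified sketch: close() blocks on a latch that only the background
// thread's poll trips. If the consumer is closed before that thread ever
// polls, the full timeout elapses, matching the log lines above.
public class TerminalPushWaitDemo {
    private final CountDownLatch terminalPushSubmitted = new CountDownLatch(1);

    // Called from the background thread when the last telemetry push is submitted.
    void onTerminalPushSubmitted() {
        terminalPushSubmitted.countDown();
    }

    void close(long timeoutMs) throws InterruptedException {
        if (!terminalPushSubmitted.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            System.out.println("Wait for terminal telemetry push to be submitted "
                    + "has elapsed; may not have actually sent request");
        }
    }
}
{code}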
[jira] [Updated] (KAFKA-17648) AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException
[ https://issues.apache.org/jira/browse/KAFKA-17648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17648: -- Priority: Major (was: Minor) > AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException > -- > > Key: KAFKA-17648 > URL: https://issues.apache.org/jira/browse/KAFKA-17648 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: PoAn Yang >Assignee: PoAn Yang >Priority: Major > Fix For: 4.0.0 > > > Followup for [https://github.com/apache/kafka/pull/17244]. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager
[ https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17780: -- Fix Version/s: 4.0.0 > Heartbeat interval is not configured in the heartbeatrequest manager > > > Key: KAFKA-17780 > URL: https://issues.apache.org/jira/browse/KAFKA-17780 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, kip >Reporter: Arpit Goyal >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > In the AbstractHeartBeatRequestManager, I observed we are not setting the > right heartbeat request interval. Is this intentional? > [~lianetm] [~kirktrue] > {code:java} > long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG); > long retryBackoffMaxMs = > config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG); > this.heartbeatRequestState = new HeartbeatRequestState(logContext, > time, 0, retryBackoffMs, > retryBackoffMaxMs, maxPollIntervalMs); > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17823) Refactor GroupRebalanceConfig to remove unused configuration
[ https://issues.apache.org/jira/browse/KAFKA-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17823: -- Description: The GroupRebalanceConfig includes configuration that isn't always relevant to all its users, causing some callers to implement kludges to avoid errors. Given that this is a public class, we'll need a KIP to change anything substantially :( was:The GroupRebalanceConfig includes configuration that isn't always relevant to all its users, causing some callers to implement kludges to avoid errors. > Refactor GroupRebalanceConfig to remove unused configuration > > > Key: KAFKA-17823 > URL: https://issues.apache.org/jira/browse/KAFKA-17823 > Project: Kafka > Issue Type: Improvement > Components: clients, config, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Major > Labels: kip-848-client-support > > The GroupRebalanceConfig includes configuration that isn't always relevant to > all its users, causing some callers to implement kludges to avoid errors. > Given that this is a public class, we'll need a KIP to change anything > substantially :( -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-17823) Refactor GroupRebalanceConfig to remove unused configuration
Kirk True created KAFKA-17823: - Summary: Refactor GroupRebalanceConfig to remove unused configuration Key: KAFKA-17823 URL: https://issues.apache.org/jira/browse/KAFKA-17823 Project: Kafka Issue Type: Improvement Components: clients, config, consumer Affects Versions: 3.9.0 Reporter: Kirk True Assignee: Kirk True The GroupRebalanceConfig includes configuration that isn't always relevant to all its users, causing some callers to implement kludges to avoid errors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager
[ https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890533#comment-17890533 ] Kirk True commented on KAFKA-17780: --- From my read of the current code in {{{}AbstractHeartbeatRequestManager{}}}’s constructor, we set the interval to 0 initially. On the first (and subsequent) heartbeat responses, we call {{onResponse()}} and update the interval, if needed. This is because, as you mentioned, the heartbeat interval is now a server-side configuration. Does that explanation seem correct, or no? > Heartbeat interval is not configured in the heartbeatrequest manager > > > Key: KAFKA-17780 > URL: https://issues.apache.org/jira/browse/KAFKA-17780 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, kip >Reporter: Arpit Goyal >Priority: Critical > Labels: kip-848-client-support > > In the AbstractHeartBeatRequestManager, I observed we are not setting the > right heartbeat request interval. Is this intentional? > [~lianetm] [~kirktrue] > {code:java} > long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG); > long retryBackoffMaxMs = > config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG); > this.heartbeatRequestState = new HeartbeatRequestState(logContext, > time, 0, retryBackoffMs, > retryBackoffMaxMs, maxPollIntervalMs); > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
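The behavior described in the comment above reduces to a small pattern, sketched below under the assumption that each heartbeat response carries the broker-chosen interval. This is a simplification, not the real request manager.

{code:java}
// Simplified sketch: the request interval starts at 0 so the first heartbeat
// is sent immediately, then is replaced by the broker-provided value carried
// on each heartbeat response (the interval is a server-side config under
// KIP-848).
public class HeartbeatIntervalTracker {
    private long heartbeatIntervalMs = 0;   // 0 => send the first heartbeat ASAP

    // Invoked for every heartbeat response.
    void onResponse(long serverIntervalMs) {
        if (serverIntervalMs != heartbeatIntervalMs) {
            heartbeatIntervalMs = serverIntervalMs;
        }
    }

    long nextHeartbeatDelayMs() {
        return heartbeatIntervalMs;
    }
}
{code}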
[jira] [Updated] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager
[ https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17780: -- Component/s: clients > Heartbeat interval is not configured in the heartbeatrequest manager > > > Key: KAFKA-17780 > URL: https://issues.apache.org/jira/browse/KAFKA-17780 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, kip >Reporter: Arpit Goyal >Priority: Critical > > In the AbstractHeartBeatRequestManager, I observed we are not setting the > right heartbeat request interval. Is this intentional? > [~lianetm] [~kirktrue] > {code:java} > long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG); > long retryBackoffMaxMs = > config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG); > this.heartbeatRequestState = new HeartbeatRequestState(logContext, > time, 0, retryBackoffMs, > retryBackoffMaxMs, maxPollIntervalMs); > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17780) Heartbeat interval is not configured in the heartbeatrequest manager
[ https://issues.apache.org/jira/browse/KAFKA-17780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17780: -- Labels: kip-848-client-support (was: ) > Heartbeat interval is not configured in the heartbeatrequest manager > > > Key: KAFKA-17780 > URL: https://issues.apache.org/jira/browse/KAFKA-17780 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, kip >Reporter: Arpit Goyal >Priority: Critical > Labels: kip-848-client-support > > In the AbstractHeartBeatRequestManager, I observed we are not setting the > right heartbeat request interval. Is this intentional? > [~lianetm] [~kirktrue] > {code:java} > long retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG); > long retryBackoffMaxMs = > config.getLong(ConsumerConfig.RETRY_BACKOFF_MAX_MS_CONFIG); > this.heartbeatRequestState = new HeartbeatRequestState(logContext, > time, 0, retryBackoffMs, > retryBackoffMaxMs, maxPollIntervalMs); > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (KAFKA-16444) Run KIP-848 unit tests under code coverage
[ https://issues.apache.org/jira/browse/KAFKA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reopened KAFKA-16444: --- > Run KIP-848 unit tests under code coverage > -- > > Key: KAFKA-16444 > URL: https://issues.apache.org/jira/browse/KAFKA-16444 > Project: Kafka > Issue Type: Task > Components: clients, consumer, unit tests >Affects Versions: 3.7.0 >Reporter: Kirk True >Priority: Minor > Labels: consumer-threading-refactor, kip-848-client-support > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17338) ConsumerConfig should prevent using partition assignors with CONSUMER group protocol
[ https://issues.apache.org/jira/browse/KAFKA-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17338: -- Priority: Major (was: Critical) > ConsumerConfig should prevent using partition assignors with CONSUMER group > protocol > > > Key: KAFKA-17338 > URL: https://issues.apache.org/jira/browse/KAFKA-17338 > Project: Kafka > Issue Type: Task > Components: clients, config, consumer >Reporter: Kirk True >Assignee: 黃竣陽 >Priority: Major > Labels: consumer-threading-refactor, kip-848-client-support > Fix For: 4.0.0 > > > {{ConsumerConfig}} should be updated to include additional validation in > {{postProcessParsedConfig()}}—when {{group.protocol}} is set to {{CONSUMER}}, > the value for {{partition.assignment.strategy}} must be either null or empty. > Otherwise a {{ConfigException}} should be thrown. > This is somewhat of the inverse case of KAFKA-15773. -- This message was sent by Atlassian Jira (v8.20.10#820010)
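The proposed validation is mechanical enough to sketch. The method below is a stand-in for the real ConsumerConfig.postProcessParsedConfig() and throws IllegalArgumentException where the client would throw ConfigException; the class and method names are hypothetical.

{code:java}
import java.util.List;

// Sketch of the validation the ticket proposes, not the actual ConsumerConfig
// code: with group.protocol=CONSUMER, an explicitly configured
// partition.assignment.strategy should be rejected.
public class GroupProtocolValidation {
    static void validate(String groupProtocol, List<String> assignors) {
        if ("CONSUMER".equalsIgnoreCase(groupProtocol)
                && assignors != null && !assignors.isEmpty()) {
            // The client would throw org.apache.kafka.common.config.ConfigException.
            throw new IllegalArgumentException(
                    "partition.assignment.strategy must be null or empty when "
                            + "group.protocol=CONSUMER");
        }
    }
}
{code}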
[jira] [Commented] (KAFKA-15402) Performance regression on close consumer after upgrading to 3.5.0
[ https://issues.apache.org/jira/browse/KAFKA-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889752#comment-17889752 ] Kirk True commented on KAFKA-15402: --- [~bdelbosc]—sorry for the delay on this. All hands are pulling on the KIP-848 client release for AK 4.0. I am hopeful we can get to this too. A configuration option for skipping the close is probably out of the question given it would require a KIP. > Performance regression on close consumer after upgrading to 3.5.0 > - > > Key: KAFKA-15402 > URL: https://issues.apache.org/jira/browse/KAFKA-15402 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.5.0, 3.6.0, 3.5.1 >Reporter: Benoit Delbosc >Priority: Major > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > Attachments: image-2023-08-24-18-51-21-720.png, > image-2023-08-24-18-51-57-435.png, image-2023-08-25-10-50-28-079.png > > > Hi, > After upgrading to Kafka client version 3.5.0, we have observed a significant > increase in the duration of our Java unit tests. These unit tests heavily > rely on the Kafka Admin, Producer, and Consumer API. > When using Kafka server version 3.4.1, the duration of the unit tests > increased from 8 seconds (with Kafka client 3.4.1) to 18 seconds (with Kafka > client 3.5.0). > Upgrading the Kafka server to 3.5.1 shows similar results. > I have come across the issue KAFKA-15178, which could be the culprit. I will > attempt to test the proposed patch. > In the meantime, if you have any ideas that could help identify and address > the regression, please let me know. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15402) Performance regression on close consumer after upgrading to 3.5.0
[ https://issues.apache.org/jira/browse/KAFKA-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15402: -- Priority: Major (was: Critical) > Performance regression on close consumer after upgrading to 3.5.0 > - > > Key: KAFKA-15402 > URL: https://issues.apache.org/jira/browse/KAFKA-15402 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.5.0, 3.6.0, 3.5.1 >Reporter: Benoit Delbosc >Priority: Major > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > Attachments: image-2023-08-24-18-51-21-720.png, > image-2023-08-24-18-51-57-435.png, image-2023-08-25-10-50-28-079.png > > > Hi, > After upgrading to Kafka client version 3.5.0, we have observed a significant > increase in the duration of our Java unit tests. These unit tests heavily > rely on the Kafka Admin, Producer, and Consumer API. > When using Kafka server version 3.4.1, the duration of the unit tests > increased from 8 seconds (with Kafka client 3.4.1) to 18 seconds (with Kafka > client 3.5.0). > Upgrading the Kafka server to 3.5.1 shows similar results. > I have come across the issue KAFKA-15178, which could be the culprit. I will > attempt to test the proposed patch. > In the meantime, if you have any ideas that could help identify and address > the regression, please let me know. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17456) Make sure FindCoordinatorResponse get created before consumer
[ https://issues.apache.org/jira/browse/KAFKA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17456: -- Labels: flaky-test kip-848-client-support (was: kip-848-client-support) > Make sure FindCoordinatorResponse get created before consumer > - > > Key: KAFKA-17456 > URL: https://issues.apache.org/jira/browse/KAFKA-17456 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: TengYao Chi >Priority: Major > Labels: flaky-test, kip-848-client-support > > The incorrect order could lead to flaky tests (see KAFKA-17092 and KAFKA-17395). It > would be nice to fix all of them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16143) New JMX metrics for AsyncKafkaConsumer
[ https://issues.apache.org/jira/browse/KAFKA-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889365#comment-17889365 ] Kirk True commented on KAFKA-16143: --- [~yangpoan]—would you kindly update the status on this to Patch Available? Thanks! > New JMX metrics for AsyncKafkaConsumer > -- > > Key: KAFKA-16143 > URL: https://issues.apache.org/jira/browse/KAFKA-16143 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer, metrics >Affects Versions: 3.7.0 >Reporter: Kirk True >Assignee: PoAn Yang >Priority: Major > Labels: consumer-threading-refactor, kip-848-client-support, > metrics, needs-kip > Fix For: 4.0.0 > > > This task is to consider what _new_ metrics we need from the KIP-848 protocol > that aren't already exposed by the current set of metrics. This will require > a KIP to introduce the new metrics. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17456) Make sure FindCoordinatorResponse get created before consumer
[ https://issues.apache.org/jira/browse/KAFKA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17456: -- Labels: kip-848-client-support (was: ) > Make sure FindCoordinatorResponse get created before consumer > - > > Key: KAFKA-17456 > URL: https://issues.apache.org/jira/browse/KAFKA-17456 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: TengYao Chi >Priority: Major > Labels: kip-848-client-support > > The incorrect order could lead to flaky tests (see KAFKA-17092 and KAFKA-17395). It > would be nice to fix all of them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-17724) Clients - resolve race condition for SubscriptionState in share group consumer
[ https://issues.apache.org/jira/browse/KAFKA-17724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889363#comment-17889363 ] Kirk True commented on KAFKA-17724: --- [~schofielaj]—would you kindly update the status of this to "Patch Available"? Thanks! > Clients - resolve race condition for SubscriptionState in share group consumer > -- > > Key: KAFKA-17724 > URL: https://issues.apache.org/jira/browse/KAFKA-17724 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Affects Versions: 4.0.0 >Reporter: Andrew Schofield >Assignee: Andrew Schofield >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > The SubscriptionState object is not thread-safe. Currently, there are a > handful of accesses in the application thread in ShareConsumerImpl which > ought to be moved into the background thread, thus eliminating the > possibility of race conditions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
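The fix direction described above is the classic thread-confinement pattern, shown generically below. This is not ShareConsumerImpl code; the state is reduced to a string for illustration.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Generic thread-confinement sketch: rather than touching non-thread-safe
// state from the application thread, enqueue an event and let the single
// background thread perform the access and complete a future.
public class ThreadConfinementDemo {
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();

    // Application thread: request a snapshot instead of reading state directly.
    CompletableFuture<String> subscriptionSnapshot() {
        CompletableFuture<String> future = new CompletableFuture<>();
        events.add(() -> future.complete(readStateOnBackgroundThread()));
        return future;
    }

    // Background thread: the only place the state is ever touched.
    void runBackgroundLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            events.take().run();
        }
    }

    private String readStateOnBackgroundThread() {
        return "subscribed-topics";
    }
}
{code}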
[jira] [Updated] (KAFKA-17724) Clients - resolve race condition for SubscriptionState in share group consumer
[ https://issues.apache.org/jira/browse/KAFKA-17724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17724: -- Component/s: consumer > Clients - resolve race condition for SubscriptionState in share group consumer > -- > > Key: KAFKA-17724 > URL: https://issues.apache.org/jira/browse/KAFKA-17724 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Affects Versions: 4.0.0 >Reporter: Andrew Schofield >Assignee: Andrew Schofield >Priority: Major > Fix For: 4.0.0 > > > The SubscriptionState object is not thread-safe. Currently, there are a > handful of accesses in the application thread in ShareConsumerImpl which > ought to be moved into the background thread, thus eliminating the > possibility of race conditions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17724) Clients - resolve race condition for SubscriptionState in share group consumer
[ https://issues.apache.org/jira/browse/KAFKA-17724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17724: -- Labels: kip-848-client-support (was: ) > Clients - resolve race condition for SubscriptionState in share group consumer > -- > > Key: KAFKA-17724 > URL: https://issues.apache.org/jira/browse/KAFKA-17724 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Affects Versions: 4.0.0 >Reporter: Andrew Schofield >Assignee: Andrew Schofield >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > The SubscriptionState object is not thread-safe. Currently, there are a > handful of accesses in the application thread in ShareConsumerImpl which > ought to be moved into the background thread, thus eliminating the > possibility of race conditions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17363) Flaky test: kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, String).quorum=kraft+kip848.groupProtocol=consumer
[ https://issues.apache.org/jira/browse/KAFKA-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17363: -- Labels: flaky-test integration-tests kip-848-client-support (was: integration-tests kip-848-client-support) > Flaky test: > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, > String).quorum=kraft+kip848.groupProtocol=consumer > -- > > Key: KAFKA-17363 > URL: https://issues.apache.org/jira/browse/KAFKA-17363 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Apoorv Mittal >Priority: Major > Labels: flaky-test, integration-tests, kip-848-client-support > Fix For: 4.0.0 > > > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, > String).quorum=kraft+kip848.groupProtocol=consumer > [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-16890/3/testReport/kafka.api/PlaintextConsumerCommitTest/Build___JDK_8_and_Scala_2_12___testAutoCommitOnRebalance_String__String__quorum_kraft_kip848_groupProtocol_consumer/] > {code:java} > org.opentest4j.AssertionFailedError: Topic [topic2] metadata not propagated > after 6 ms > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38) > at org.junit.jupiter.api.Assertions.fail(Assertions.java:138) > at > kafka.utils.TestUtils$.waitForAllPartitionsMetadata(TestUtils.scala:931) > at kafka.utils.TestUtils$.createTopicWithAdmin(TestUtils.scala:474) > at > kafka.integration.KafkaServerTestHarness.$anonfun$createTopic$1(KafkaServerTestHarness.scala:193) > at scala.util.Using$.resource(Using.scala:269) > at > kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:185) > at > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(PlaintextConsumerCommitTest.scala:223) > at java.lang.reflect.Method.invoke(Method.java:498) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
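For context on the failing step above: the harness is waiting for newly created topic metadata to propagate. A rough equivalent of that wait, sketched against the public Admin API rather than the internal Scala TestUtils (bootstrap address, topic name, timeout, and the helper itself are illustrative assumptions):

{code:java}
import java.time.Duration;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class TopicMetadataWait {
    // Hypothetical helper: polls describeTopics until the topic is visible or the deadline passes.
    static boolean waitForTopicMetadata(Admin admin, String topic, Duration timeout) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeout.toMillis();
        while (System.currentTimeMillis() < deadline) {
            try {
                // Succeeds only once the topic's metadata is available from the broker.
                admin.describeTopics(Set.of(topic)).allTopicNames().get();
                return true;
            } catch (Exception e) {
                Thread.sleep(100); // retry until metadata propagates
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            boolean propagated = waitForTopicMetadata(admin, "topic2", Duration.ofSeconds(60));
            System.out.println("metadata propagated: " + propagated);
        }
    }
}
{code}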
[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17769: -- Labels: flaky-test integration-test kip-848-client-support (was: integration-test kip-848-client-support) > Fix flaky > PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe > --- > > Key: KAFKA-17769 > URL: https://issues.apache.org/jira/browse/KAFKA-17769 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Yu-Lin Chen >Assignee: Yu-Lin Chen >Priority: Major > Labels: flaky-test, integration-test, kip-848-client-support > Fix For: 4.0.0 > > > 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report > Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D]) > This issue can be reproduced locally within 50 loops. > > ([Oct 4 2024 at 10:35:49 > CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]): > {code:java} > org.apache.kafka.common.KafkaException: Failed to close kafka consumer > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249) > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204) > at > org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249) > at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) > at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) > at scala.collection.AbstractIterable.foreach(Iterable.scala:935) > at > kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249) > at java.lang.reflect.Method.invoke(Method.java:566) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
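For readers unfamiliar with the test, the scenario it covers is approximately: subscribe to a syntactically invalid topic name, observe InvalidTopicException from poll(), then verify unsubscribe() clears the state so close() succeeds (the step that fails above). A minimal sketch; the topic name, addresses, and timings are illustrative:

{code:java}
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.InvalidTopicException;
import org.apache.kafka.common.serialization.StringDeserializer;

public class InvalidTopicUnsubscribe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("topic abc")); // the space makes the topic name invalid
        try {
            consumer.poll(Duration.ofMillis(500));
        } catch (InvalidTopicException expected) {
            // The metadata layer rejects the invalid topic name.
        }
        consumer.unsubscribe(); // should clear the subscription and the error state
        consumer.close();       // the flaky failure above happened at this step
    }
}
{code}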
[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer
[ https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17681: -- Labels: flaky-test kip-848-client-support (was: kip-848-client-support) > Fix unstable consumer_test.py#test_fencing_static_consumer > -- > > Key: KAFKA-17681 > URL: https://issues.apache.org/jira/browse/KAFKA-17681 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, system tests >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Major > Labels: flaky-test, kip-848-client-support > Fix For: 4.0.0 > > > {code:java} > AssertionError('Static consumers attempt to join with instance id in use > should not cause a rebalance.') > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line > 359, in test_fencing_static_consumer > assert num_rebalances == consumer.num_rebalances(), "Static consumers > attempt to join with instance id in use should not cause a rebalance. before: > " + str(num_rebalances) + " after: " + str(consumer.num_rebalances()) > AssertionError: Static consumers attempt to join with instance id in use > should not cause a rebalance. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
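The test above exercises static group membership: a member that rejoins with an in-use group.instance.id should fence the old instance without rebalancing the rest of the group. A minimal sketch of the client configuration involved (address and ids are placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StaticMemberConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "static-group");            // placeholder
        // Static membership: rejoining with the same instance id fences the old
        // member instead of forcing a full group rebalance.
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "instance-1");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // A second consumer created with the same group.instance.id would
            // fence this one; the group's partition assignment stays unchanged,
            // which is the invariant the failing assertion checks.
        }
    }
}
{code}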
[jira] [Updated] (KAFKA-17623) Flaky testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback
[ https://issues.apache.org/jira/browse/KAFKA-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17623: -- Labels: consumer-threading-refactor flaky-test (was: consumer-threading-refactor) > Flaky > testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback > > > Key: KAFKA-17623 > URL: https://issues.apache.org/jira/browse/KAFKA-17623 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: PoAn Yang >Priority: Major > Labels: consumer-threading-refactor, flaky-test > Fix For: 4.0.0 > > > Flaky for the new consumer, failing with: > org.apache.kafka.common.KafkaException: User rebalance callback throws an > error at > app//org.apache.kafka.clients.consumer.internals.ConsumerUtils.maybeWrapAsKafkaException(ConsumerUtils.java:259) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.invokeRebalanceCallbacks(AsyncKafkaConsumer.java:1867) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:195) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:181) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.processBackgroundEvents(AsyncKafkaConsumer.java:1758) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.updateAssignmentMetadataIfNeeded(AsyncKafkaConsumer.java:1618) > ... > Caused by: java.lang.IllegalStateException: No current assignment for > partition topic-0 at > org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:378) > at > org.apache.kafka.clients.consumer.internals.SubscriptionState.seekUnvalidated(SubscriptionState.java:395) > at > org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:425) > at > org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:147) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.processApplicationEvents(ConsumerNetworkThread.java:171) > > Flaky behaviour: > > https://ge.apache.org/scans/tests?search.buildOutcome=failure&search.names=Git%20branch&search.rootProjectNames=kafka&search.startTimeMax=172740959&search.startTimeMin=172248480&search.timeZoneId=America%2FToronto&search.values=trunk&tests.container=integration.kafka.api.PlaintextConsumerCallbackTest&tests.test=testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback(String%2C%20String)%5B3%5D -- This message was sent by Atlassian Jira (v8.20.10#820010)
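The failing test seeks and pauses a newly assigned partition from inside onPartitionsAssigned. A minimal sketch of that pattern against the public consumer API (topic, offset, and connection settings are illustrative):

{code:java}
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SeekPauseOnAssign {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "callback-group");          // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("topic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) { }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Seek and pause inside the callback; the flaky failure above is an
                // IllegalStateException ("No current assignment for partition")
                // raised when this seek races with the assignment state.
                for (TopicPartition tp : partitions) {
                    consumer.seek(tp, 0L); // illustrative offset
                }
                consumer.pause(partitions);
            }
        });
        consumer.poll(Duration.ofSeconds(1)); // drives the rebalance and the callback
        consumer.close();
    }
}
{code}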
[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17769: -- Fix Version/s: 4.0.0 > Fix flaky > PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe > --- > > Key: KAFKA-17769 > URL: https://issues.apache.org/jira/browse/KAFKA-17769 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Yu-Lin Chen >Assignee: Yu-Lin Chen >Priority: Major > Labels: integration-test, kip-848-client-support > Fix For: 4.0.0 > > > 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report > Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D]) > This issue can be reproduced locally within 50 loops. > > ([Oct 4 2024 at 10:35:49 > CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]): > {code:java} > org.apache.kafka.common.KafkaException: Failed to close kafka consumer > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249) > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204) > at > org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249) > at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) > at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) > at scala.collection.AbstractIterable.foreach(Iterable.scala:935) > at > kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249) > at java.lang.reflect.Method.invoke(Method.java:566) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17777) Flaky KafkaConsumerTest testReturnRecordsDuringRebalance
[ https://issues.apache.org/jira/browse/KAFKA-17777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17777: -- Fix Version/s: 4.0.0 > Flaky KafkaConsumerTest testReturnRecordsDuringRebalance > > > Key: KAFKA-17777 > URL: https://issues.apache.org/jira/browse/KAFKA-17777 > Project: Kafka > Issue Type: Test > Components: clients, consumer, unit tests >Reporter: Lianet Magrans >Assignee: TengYao Chi >Priority: Major > Labels: flaky-test, kip-848-client-support > Fix For: 4.0.0 > > > Top flaky consumer test at the moment (after several recent fixes addressing > consumer test flakiness): > [https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FNew_York&tests.sortField=FLAKY] > > It's been flaky in trunk for a while: > [https://ge.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FToronto&tests.container=org.apache.kafka.clients.consumer.KafkaConsumerTest&tests.sortField=FLAKY&tests.test=testReturnRecordsDuringRebalance(GroupProtocol)%5B1%5D] > > > Fails with: org.opentest4j.AssertionFailedError: Condition not met within > timeout 15000. Does not complete rebalance in time ==> expected: <true> but > was: <false> -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17732) System test coverage for new consumer positions not getting behind committed offsets
[ https://issues.apache.org/jira/browse/KAFKA-17732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17732: -- Fix Version/s: 4.0.0 > System test coverage for new consumer positions not getting behind committed > offsets > > > Key: KAFKA-17732 > URL: https://issues.apache.org/jira/browse/KAFKA-17732 > Project: Kafka > Issue Type: Task > Components: clients, consumer, system tests >Reporter: Lianet Magrans >Priority: Major > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > > We recently caught a bug where the new consumer positions would get behind the > committed offsets (fixed with > https://issues.apache.org/jira/browse/KAFKA-17674). It was only discovered > while adding a new system test (test_consumer_rolling_migration, not yet in > trunk; the PR should be available soon under > https://issues.apache.org/jira/browse/KAFKA-17272). > We should check why this failure wasn't caught by existing tests, and add a > test for it (a test ensuring positions are updated properly, which would fail > without KAFKA-17674 and pass with it). -- This message was sent by Atlassian Jira (v8.20.10#820010)
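The invariant at stake (a consumer's position should never fall behind its committed offset) can be checked through the public API. A minimal sketch of such a check; topic, partition, and connection settings are placeholders:

{code:java}
import java.time.Duration;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PositionVsCommitted {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "position-check");          // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        TopicPartition tp = new TopicPartition("topic", 0); // placeholder
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Set.of(tp));
            consumer.poll(Duration.ofMillis(500)); // lets the consumer resolve positions
            Map<TopicPartition, OffsetAndMetadata> committed = consumer.committed(Set.of(tp));
            OffsetAndMetadata offset = committed.get(tp);
            long position = consumer.position(tp);
            // The bug fixed by KAFKA-17674 allowed position < committed offset.
            if (offset != null && position < offset.offset()) {
                throw new IllegalStateException("position " + position
                        + " is behind committed offset " + offset.offset());
            }
        }
    }
}
{code}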
[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17769: -- Labels: integration-test (was: ) > Fix flaky > PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe > --- > > Key: KAFKA-17769 > URL: https://issues.apache.org/jira/browse/KAFKA-17769 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Yu-Lin Chen >Assignee: Yu-Lin Chen >Priority: Major > Labels: integration-test > > 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report > Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D]) > This issue can be reproduced locally within 50 loops. > > ([Oct 4 2024 at 10:35:49 > CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]): > {code:java} > org.apache.kafka.common.KafkaException: Failed to close kafka consumer > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249) > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204) > at > org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249) > at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) > at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) > at scala.collection.AbstractIterable.foreach(Iterable.scala:935) > at > kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249) > at java.lang.reflect.Method.invoke(Method.java:566) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17769) Fix flaky PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-17769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17769: -- Labels: integration-test kip-848-client-support (was: integration-test) > Fix flaky > PlaintextConsumerSubscriptionTest.testSubscribeInvalidTopicCanUnsubscribe > --- > > Key: KAFKA-17769 > URL: https://issues.apache.org/jira/browse/KAFKA-17769 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Yu-Lin Chen >Assignee: Yu-Lin Chen >Priority: Major > Labels: integration-test, kip-848-client-support > > 4 flaky runs out of 110 trunk builds in the past 2 weeks. ([Report > Link|https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.startTimeMax=1728584869905&search.startTimeMin=172615680&search.tags=trunk&search.timeZoneId=Asia%2FTaipei&tests.container=kafka.api.PlaintextConsumerSubscriptionTest&tests.test=testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D]) > This issue can be reproduced locally within 50 loops. > > ([Oct 4 2024 at 10:35:49 > CST|https://ge.apache.org/s/o4ir4xtitsu52/tests/task/:core:test/details/kafka.api.PlaintextConsumerSubscriptionTest/testSubscribeInvalidTopicCanUnsubscribe(String%2C%20String)%5B3%5D?top-execution=1]): > {code:java} > org.apache.kafka.common.KafkaException: Failed to close kafka consumer > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1249) > at > org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.close(AsyncKafkaConsumer.java:1204) > at > org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.java:1718) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3(IntegrationTestHarness.scala:249) > at > kafka.api.IntegrationTestHarness.$anonfun$tearDown$3$adapted(IntegrationTestHarness.scala:249) > at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) > at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) > at scala.collection.AbstractIterable.foreach(Iterable.scala:935) > at > kafka.api.IntegrationTestHarness.tearDown(IntegrationTestHarness.scala:249) > at java.lang.reflect.Method.invoke(Method.java:566) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17777) Flaky KafkaConsumerTest testReturnRecordsDuringRebalance
[ https://issues.apache.org/jira/browse/KAFKA-17777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17777: -- Labels: flaky-test kip-848-client-support (was: flaky-test) > Flaky KafkaConsumerTest testReturnRecordsDuringRebalance > > > Key: KAFKA-17777 > URL: https://issues.apache.org/jira/browse/KAFKA-17777 > Project: Kafka > Issue Type: Test > Components: clients, consumer, unit tests >Reporter: Lianet Magrans >Assignee: TengYao Chi >Priority: Major > Labels: flaky-test, kip-848-client-support > > Top flaky consumer test at the moment (after several recent fixes addressing > consumer test flakiness): > [https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FNew_York&tests.sortField=FLAKY] > > It's been flaky in trunk for a while: > [https://ge.apache.org/scans/tests?search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FToronto&tests.container=org.apache.kafka.clients.consumer.KafkaConsumerTest&tests.sortField=FLAKY&tests.test=testReturnRecordsDuringRebalance(GroupProtocol)%5B1%5D] > > > Fails with: org.opentest4j.AssertionFailedError: Condition not met within > timeout 15000. Does not complete rebalance in time ==> expected: <true> but > was: <false> -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17732) System test coverage for new consumer positions not getting behind committed offsets
[ https://issues.apache.org/jira/browse/KAFKA-17732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17732: -- Component/s: clients > System test coverage for new consumer positions not getting behind committed > offsets > > > Key: KAFKA-17732 > URL: https://issues.apache.org/jira/browse/KAFKA-17732 > Project: Kafka > Issue Type: Task > Components: clients, consumer, system tests >Reporter: Lianet Magrans >Priority: Major > Labels: consumer-threading-refactor > > We recently caught a bug where the new consumer positions would get behind the > committed offsets (fixed with > https://issues.apache.org/jira/browse/KAFKA-17674). It was only discovered > while adding a new system test (test_consumer_rolling_migration, not yet in > trunk; the PR should be available soon under > https://issues.apache.org/jira/browse/KAFKA-17272). > We should check why this failure wasn't caught by existing tests, and add a > test for it (a test ensuring positions are updated properly, which would fail > without KAFKA-17674 and pass with it). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-17337) ConsumerConfig should default to CONSUMER for group.protocol
[ https://issues.apache.org/jira/browse/KAFKA-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887700#comment-17887700 ] Kirk True commented on KAFKA-17337: --- When switching to this as the default, we found a few errors, which are linked. We still need to complete the test investigation to identify any remaining errors. > ConsumerConfig should default to CONSUMER for group.protocol > > > Key: KAFKA-17337 > URL: https://issues.apache.org/jira/browse/KAFKA-17337 > Project: Kafka > Issue Type: Task > Components: clients, config, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Blocker > Labels: consumer-threading-refactor, kip-848-client-support > Fix For: 4.0.0 > > > {{ConsumerConfig}}’s default value for {{group.protocol}} should be changed > from {{CLASSIC}} to {{CONSUMER}} for 4.0.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
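For reference, the setting in question is the consumer's group.protocol configuration. A minimal sketch showing the new default value and the CLASSIC fallback, which is also the remediation suggested in KAFKA-17536 below (connection settings are placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupProtocolConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // KAFKA-17337 changes the default from "classic" to "consumer" (KIP-848).
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");
        // To fall back to the old protocol (e.g. against a cluster without
        // KIP-848 enabled), pin it explicitly instead:
        // props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "classic");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // subscribe/poll as usual
        }
    }
}
{code}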
[jira] [Commented] (KAFKA-17536) Ensure clear error message when "new" consumer used with incompatible cluster
[ https://issues.apache.org/jira/browse/KAFKA-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887486#comment-17887486 ] Kirk True commented on KAFKA-17536: --- Correct. Feel free to grab it :) > Ensure clear error message when "new" consumer used with incompatible cluster > - > > Key: KAFKA-17536 > URL: https://issues.apache.org/jira/browse/KAFKA-17536 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > In Kafka 4.0, by default the consumer uses the updated consumer group > protocol defined in KIP-848. When the consumer is used with a cluster that > does not support (or is not configured to use) the new protocol, the consumer > will get an unfriendly error about unavailable APIs. Since this error could > be the user's first impression when attempting to upgrade to 4.0, we need to > make sure that the error is very clear about the remediation steps (set the > group.protocol to CLASSIC on the client or upgrade and enable the new > protocol on the cluster). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-17536) Ensure clear error message when "new" consumer used with incompatible cluster
[ https://issues.apache.org/jira/browse/KAFKA-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-17536: - Assignee: (was: Kirk True) > Ensure clear error message when "new" consumer used with incompatible cluster > - > > Key: KAFKA-17536 > URL: https://issues.apache.org/jira/browse/KAFKA-17536 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support > Fix For: 4.0.0 > > > In Kafka 4.0, by default the consumer uses the updated consumer group > protocol defined in KIP-848. When the consumer is used with a cluster that > does not support (or is not configured to use) the new protocol, the consumer > will get an unfriendly error about unavailable APIs. Since this error could > be the user's first impression when attempting to upgrade to 4.0, we need to > make sure that the error is very clear about the remediation steps (set the > group.protocol to CLASSIC on the client or upgrade and enable the new > protocol on the cluster). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14672) Producer queue time does not reflect batches expired in the accumulator
[ https://issues.apache.org/jira/browse/KAFKA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-14672: -- Component/s: clients metrics producer > Producer queue time does not reflect batches expired in the accumulator > --- > > Key: KAFKA-14672 > URL: https://issues.apache.org/jira/browse/KAFKA-14672 > Project: Kafka > Issue Type: Bug > Components: clients, metrics, producer >Reporter: Jason Gustafson >Assignee: Kirk True >Priority: Major > > The producer exposes two metrics for the time a record has spent in the > accumulator waiting to be drained: > * {{record-queue-time-avg}} > * {{record-queue-time-max}} > The metric is only updated when a batch is ready to send to a broker. It is > also possible for a batch to be expired before it can be sent, but in this > case, the metric is not updated. This seems surprising and makes the queue > time misleading. The only metric I could find that does reflect batch > expirations in the accumulator is the generic {{record-error-rate}}. It > would make sense to let the queue-time metrics record the time spent in the > queue regardless of the outcome of the record send attempt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
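The two metrics named above can be read from a live producer via its metrics() map. A minimal sketch of that lookup (connection settings are placeholders):

{code:java}
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.serialization.StringSerializer;

public class QueueTimeMetrics {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // record-queue-time-avg/max live in the "producer-metrics" group; per
            // this issue they are only updated for batches that are drained, not
            // for batches that expire in the accumulator.
            for (Map.Entry<MetricName, ? extends Metric> entry : producer.metrics().entrySet()) {
                MetricName name = entry.getKey();
                if (name.group().equals("producer-metrics")
                        && (name.name().equals("record-queue-time-avg")
                         || name.name().equals("record-queue-time-max"))) {
                    System.out.println(name.name() + " = " + entry.getValue().metricValue());
                }
            }
        }
    }
}
{code}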
[jira] [Updated] (KAFKA-14567) Kafka Streams crashes after ProducerFencedException
[ https://issues.apache.org/jira/browse/KAFKA-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-14567: -- Component/s: clients > Kafka Streams crashes after ProducerFencedException > --- > > Key: KAFKA-14567 > URL: https://issues.apache.org/jira/browse/KAFKA-14567 > Project: Kafka > Issue Type: Bug > Components: clients, producer, streams >Affects Versions: 3.7.0 >Reporter: Matthias J. Sax >Assignee: Kirk True >Priority: Blocker > Labels: eos, transactions > Fix For: 4.0.0 > > > Running a Kafka Streams application with EOS-v2. > We first see a `ProducerFencedException`. After the fencing, the fenced > thread crashes, resulting in a non-recoverable error: > {quote}[2022-12-22 13:49:13,423] ERROR [i-0c291188ec2ae17a0-StreamThread-3] > stream-thread [i-0c291188ec2ae17a0-StreamThread-3] Failed to process stream > task 1_2 due to the following error: > (org.apache.kafka.streams.processor.internals.TaskExecutor) > org.apache.kafka.streams.errors.StreamsException: Exception caught in > process. taskId=1_2, processor=KSTREAM-SOURCE-05, > topic=node-name-repartition, partition=2, offset=539776276, > stacktrace=java.lang.IllegalStateException: TransactionalId > stream-soak-test-72b6e57c-c2f5-489d-ab9f-fdbb215d2c86-3: Invalid transition > attempted from state FATAL_ERROR to state ABORTABLE_ERROR > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:974) > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:394) > at > org.apache.kafka.clients.producer.internals.TransactionManager.maybeTransitionToErrorState(TransactionManager.java:620) > at > org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1079) > at > org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:959) > at > org.apache.kafka.streams.processor.internals.StreamsProducer.send(StreamsProducer.java:257) > ... (remaining Kafka Streams processor and stream-thread frames elided) {quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
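For context, EOS-v2 is enabled through a single Kafka Streams setting; under it, a fenced producer is an expected event during rebalances and should lead to the thread rejoining, not crashing. A minimal configuration sketch (application id and address are placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class EosV2Config {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-app");        // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        // EOS-v2: one transactional producer per stream thread. A
        // ProducerFencedException on that producer means another instance took
        // over the work; the issue above is that this instead escalated to a
        // fatal "Invalid transition" error and killed the thread.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        // new KafkaStreams(topology, props) would be created here.
    }
}
{code}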
[jira] [Resolved] (KAFKA-16444) Run KIP-848 unit tests under code coverage
[ https://issues.apache.org/jira/browse/KAFKA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True resolved KAFKA-16444. --- Resolution: Won't Do > Run KIP-848 unit tests under code coverage > -- > > Key: KAFKA-16444 > URL: https://issues.apache.org/jira/browse/KAFKA-16444 > Project: Kafka > Issue Type: Task > Components: clients, consumer, unit tests >Affects Versions: 3.7.0 >Reporter: Kirk True >Priority: Minor > Labels: consumer-threading-refactor, kip-848-client-support > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-16444) Run KIP-848 unit tests under code coverage
[ https://issues.apache.org/jira/browse/KAFKA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-16444: - Assignee: (was: Kirk True) > Run KIP-848 unit tests under code coverage > -- > > Key: KAFKA-16444 > URL: https://issues.apache.org/jira/browse/KAFKA-16444 > Project: Kafka > Issue Type: Task > Components: clients, consumer, unit tests >Affects Versions: 3.7.0 >Reporter: Kirk True >Priority: Minor > Labels: consumer-threading-refactor, kip-848-client-support > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16138) QuotaTest system test fails consistently in 3.7
[ https://issues.apache.org/jira/browse/KAFKA-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16138: -- Fix Version/s: (was: 4.0.0) > QuotaTest system test fails consistently in 3.7 > --- > > Key: KAFKA-16138 > URL: https://issues.apache.org/jira/browse/KAFKA-16138 > Project: Kafka > Issue Type: Test > Components: clients, consumer, system tests >Affects Versions: 3.7.0, 3.8.0 >Reporter: Stanislav Kozlovski >Assignee: Kirk True >Priority: Major > > As mentioned in > [https://hackmd.io/@hOneAGCrSmKSpL8VF-1HWQ/HyRgRJmta#kafkatesttestsclientquota_testQuotaTesttest_quotaArguments-%E2%80%9Coverride_quota%E2%80%9D-true-%E2%80%9Cquota_type%E2%80%9D-%E2%80%9Cuser%E2%80%9D,] > the test fails consistently: > {code:java} > ValueError('max() arg is an empty sequence') > Traceback (most recent call last): > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py", > line 184, in _do_run > data = self.run_test() > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py", > line 262, in run_test > return self.test_context.function(self.test) > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/mark/_mark.py", > line 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", > line 169, in test_quota > success, msg = self.validate(self.kafka, producer, consumer) > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", > line 197, in validate > metric.value for k, metrics in producer.metrics(group='producer-metrics', > name='outgoing-byte-rate', client_id=producer.client_id) for metric in metrics > ValueError: max() arg is an empty sequence {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-16138) QuotaTest system test fails consistently in 3.7
[ https://issues.apache.org/jira/browse/KAFKA-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-16138: - Assignee: (was: Kirk True) > QuotaTest system test fails consistently in 3.7 > --- > > Key: KAFKA-16138 > URL: https://issues.apache.org/jira/browse/KAFKA-16138 > Project: Kafka > Issue Type: Test > Components: clients, consumer, system tests >Affects Versions: 3.7.0, 3.8.0 >Reporter: Stanislav Kozlovski >Priority: Major > > As mentioned in > [https://hackmd.io/@hOneAGCrSmKSpL8VF-1HWQ/HyRgRJmta#kafkatesttestsclientquota_testQuotaTesttest_quotaArguments-%E2%80%9Coverride_quota%E2%80%9D-true-%E2%80%9Cquota_type%E2%80%9D-%E2%80%9Cuser%E2%80%9D,] > the test fails consistently: > {code:java} > ValueError('max() arg is an empty sequence') > Traceback (most recent call last): > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py", > line 184, in _do_run > data = self.run_test() > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py", > line 262, in run_test > return self.test_context.function(self.test) > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/mark/_mark.py", > line 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", > line 169, in test_quota > success, msg = self.validate(self.kafka, producer, consumer) > File > "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/client/quota_test.py", > line 197, in validate > metric.value for k, metrics in producer.metrics(group='producer-metrics', > name='outgoing-byte-rate', client_id=producer.client_id) for metric in metrics > ValueError: max() arg is an empty sequence {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17363) Flaky test: kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, String).quorum=kraft+kip848.groupProtocol=consumer
[ https://issues.apache.org/jira/browse/KAFKA-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17363: -- Fix Version/s: 4.0.0 > Flaky test: > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, > String).quorum=kraft+kip848.groupProtocol=consumer > -- > > Key: KAFKA-17363 > URL: https://issues.apache.org/jira/browse/KAFKA-17363 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Apoorv Mittal >Priority: Major > Labels: integration-tests > Fix For: 4.0.0 > > > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, > String).quorum=kraft+kip848.groupProtocol=consumer > [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-16890/3/testReport/kafka.api/PlaintextConsumerCommitTest/Build___JDK_8_and_Scala_2_12___testAutoCommitOnRebalance_String__String__quorum_kraft_kip848_groupProtocol_consumer/] > {code:java} > org.opentest4j.AssertionFailedError: Topic [topic2] metadata not propagated > after 6 ms > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38) > at org.junit.jupiter.api.Assertions.fail(Assertions.java:138) > at > kafka.utils.TestUtils$.waitForAllPartitionsMetadata(TestUtils.scala:931) > at kafka.utils.TestUtils$.createTopicWithAdmin(TestUtils.scala:474) > at > kafka.integration.KafkaServerTestHarness.$anonfun$createTopic$1(KafkaServerTestHarness.scala:193) > at scala.util.Using$.resource(Using.scala:269) > at > kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:185) > at > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(PlaintextConsumerCommitTest.scala:223) > at java.lang.reflect.Method.invoke(Method.java:498) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17363) Flaky test: kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, String).quorum=kraft+kip848.groupProtocol=consumer
[ https://issues.apache.org/jira/browse/KAFKA-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17363: -- Labels: integration-tests kip-848-client-support (was: integration-tests) > Flaky test: > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, > String).quorum=kraft+kip848.groupProtocol=consumer > -- > > Key: KAFKA-17363 > URL: https://issues.apache.org/jira/browse/KAFKA-17363 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Apoorv Mittal >Priority: Major > Labels: integration-tests, kip-848-client-support > Fix For: 4.0.0 > > > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(String, > String).quorum=kraft+kip848.groupProtocol=consumer > [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-16890/3/testReport/kafka.api/PlaintextConsumerCommitTest/Build___JDK_8_and_Scala_2_12___testAutoCommitOnRebalance_String__String__quorum_kraft_kip848_groupProtocol_consumer/] > {code:java} > org.opentest4j.AssertionFailedError: Topic [topic2] metadata not propagated > after 6 ms > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38) > at org.junit.jupiter.api.Assertions.fail(Assertions.java:138) > at > kafka.utils.TestUtils$.waitForAllPartitionsMetadata(TestUtils.scala:931) > at kafka.utils.TestUtils$.createTopicWithAdmin(TestUtils.scala:474) > at > kafka.integration.KafkaServerTestHarness.$anonfun$createTopic$1(KafkaServerTestHarness.scala:193) > at scala.util.Using$.resource(Using.scala:269) > at > kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:185) > at > kafka.api.PlaintextConsumerCommitTest.testAutoCommitOnRebalance(PlaintextConsumerCommitTest.scala:223) > at java.lang.reflect.Method.invoke(Method.java:498) > ... (remaining java.util.stream pipeline frames elided) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17623) Flaky testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback
[ https://issues.apache.org/jira/browse/KAFKA-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17623: -- Fix Version/s: 4.0.0 > Flaky > testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback > > > Key: KAFKA-17623 > URL: https://issues.apache.org/jira/browse/KAFKA-17623 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: PoAn Yang >Priority: Major > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > > Flaky for the new consumer, failing with: > org.apache.kafka.common.KafkaException: User rebalance callback throws an > error at > app//org.apache.kafka.clients.consumer.internals.ConsumerUtils.maybeWrapAsKafkaException(ConsumerUtils.java:259) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.invokeRebalanceCallbacks(AsyncKafkaConsumer.java:1867) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:195) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:181) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.processBackgroundEvents(AsyncKafkaConsumer.java:1758) > at > app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.updateAssignmentMetadataIfNeeded(AsyncKafkaConsumer.java:1618) > ... > Caused by: java.lang.IllegalStateException: No current assignment for > partition topic-0 at > org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:378) > at > org.apache.kafka.clients.consumer.internals.SubscriptionState.seekUnvalidated(SubscriptionState.java:395) > at > org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:425) > at > org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:147) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.processApplicationEvents(ConsumerNetworkThread.java:171) > > Flaky behaviour: > > https://ge.apache.org/scans/tests?search.buildOutcome=failure&search.names=Git%20branch&search.rootProjectNames=kafka&search.startTimeMax=172740959&search.startTimeMin=172248480&search.timeZoneId=America%2FToronto&search.values=trunk&tests.container=integration.kafka.api.PlaintextConsumerCallbackTest&tests.test=testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback(String%2C%20String)%5B3%5D -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17648) AsyncKafkaConsumer#unsubscribe swallow TopicAuthorizationException
[ https://issues.apache.org/jira/browse/KAFKA-17648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17648: -- Fix Version/s: 4.0.0 > AsyncKafkaConsumer#unsubscribe swallows TopicAuthorizationException > -- > > Key: KAFKA-17648 > URL: https://issues.apache.org/jira/browse/KAFKA-17648 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: PoAn Yang >Assignee: PoAn Yang >Priority: Minor > Fix For: 4.0.0 > > > Follow-up for [https://github.com/apache/kafka/pull/17244]. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15561) Client support for new SubscriptionPattern based subscription
[ https://issues.apache.org/jira/browse/KAFKA-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15561: -- Priority: Blocker (was: Critical) > Client support for new SubscriptionPattern based subscription > - > > Key: KAFKA-15561 > URL: https://issues.apache.org/jira/browse/KAFKA-15561 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Phuc Hong Tran >Priority: Blocker > Labels: kip-848-client-support, regex > Fix For: 4.0.0 > > > New consumer should support subscribe with the new SubscriptionPattern > introduced in the new consumer group protocol. When subscribing with this > regex, the client should provide the regex in the HB request on the > SubscribedTopicRegex field, delegating the resolution to the server. -- This message was sent by Atlassian Jira (v8.20.10#820010)
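For readers following this ticket, a minimal sketch of what subscribing via the new server-side regex could look like, assuming the KIP-848 SubscriptionPattern API and the "consumer" group protocol described above; the topic pattern, group name, and bootstrap address are illustrative only.

{code:java}
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.SubscriptionPattern;

public class Re2jSubscribeSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "regex-group");
        // Server-side regex resolution requires the new (KIP-848) protocol.
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // The RE2J pattern travels in the heartbeat's SubscribedTopicRegex
            // field; the broker, not the client, resolves the matching topics.
            consumer.subscribe(new SubscriptionPattern("orders-.*"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.println(r.topic() + ": " + r.value()));
        }
    }
}
{code}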
[jira] [Updated] (KAFKA-17696) New consumer background operations unaware of metadata errors
[ https://issues.apache.org/jira/browse/KAFKA-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17696: -- Priority: Blocker (was: Critical) > New consumer background operations unaware of metadata errors > - > > Key: KAFKA-17696 > URL: https://issues.apache.org/jira/browse/KAFKA-17696 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: 黃竣陽 >Priority: Blocker > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > > When a metadata error happens (i.e. an unauthorized topic), the network layer is > the one to detect it, and it just propagates it to the app thread via an > ErrorEvent. > [https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153] > That allows API calls that call processBackgroundEvents to throw the error in the > app thread (ex. poll, unsubscribe and close, which are the only API calls > that currently call processBackgroundEvents). > This means that all other API calls that do not call processBackgroundEvents will > never know about errors like unauthorized topics. Moreover, it really means > that the background operations are not notified/aborted when a metadata error > happens (auth error). Ex. a call to position blocks waiting for > updateFetchPositions > ([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586]) > and will leave a > pendingOffsetFetchEvent waiting to complete, even when the background already > got an Unauthorized exception (but only passed it to the app thread via an > ErrorEvent). > I wonder if we should ensure that metadata errors are not only propagated to > the app thread via ErrorEvents, but also ensure that we notify all request > managers in the background (so that they can decide whether to completeExceptionally > their outstanding events). Ex. OffsetsRequestManager.onMetadataError should > completeExceptionally the pendingOffsetFetchEvent (just a first thought; there > could be other approaches, but note that calling processBackgroundEvents in > API calls like position will not do, because we would block first on the > CheckAndUpdatePositions, and processBackgroundEvents would only happen > after the CheckAndUpdatePositions). > > This behaviour can be reproduced with the integration test > AuthorizerIntegrationTest.testOffsetFetchWithNoAccess with the new consumer > enabled (discovered with [https://github.com/apache/kafka/pull/17107]). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
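A rough, hypothetical sketch of the fix direction described in KAFKA-17696 above: fan metadata errors out to the background request managers in addition to the app-thread ErrorEvent. The listener interface and field names are invented for illustration and are not the actual Kafka internals.

{code:java}
import java.util.concurrent.CompletableFuture;

// Invented listener interface; the real internals may differ.
interface MetadataErrorListener {
    void onMetadataError(Exception error);
}

final class OffsetsRequestManagerSketch implements MetadataErrorListener {
    // Stand-in for the pendingOffsetFetchEvent mentioned in the ticket.
    private CompletableFuture<Void> pendingOffsetFetchEvent;

    @Override
    public void onMetadataError(Exception error) {
        // Abort the outstanding fetch so an app thread blocked in position()
        // wakes up, instead of waiting while the error sits in the event queue.
        if (pendingOffsetFetchEvent != null && !pendingOffsetFetchEvent.isDone()) {
            pendingOffsetFetchEvent.completeExceptionally(error);
            pendingOffsetFetchEvent = null;
        }
    }
}
{code}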
[jira] [Updated] (KAFKA-17337) ConsumerConfig should default to CONSUMER for group.protocol
[ https://issues.apache.org/jira/browse/KAFKA-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17337: -- Priority: Blocker (was: Critical) > ConsumerConfig should default to CONSUMER for group.protocol > > > Key: KAFKA-17337 > URL: https://issues.apache.org/jira/browse/KAFKA-17337 > Project: Kafka > Issue Type: Task > Components: clients, config, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Blocker > Labels: consumer-threading-refactor, kip-848-client-support > Fix For: 4.0.0 > > > {{ConsumerConfig}}’s default value for {{group.protocol}} should be changed > from {{CLASSIC}} to {{CONSUMER}} for 4.0.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
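Until the proposed default flip lands, opting into the new protocol is an explicit setting. A minimal example of the configuration discussed in KAFKA-17337, using the existing ConsumerConfig key; the bootstrap address and group name are illustrative.

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class GroupProtocolConfigExample {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
        // Defaults to "classic" today; this ticket proposes making
        // "consumer" (the KIP-848 protocol) the default in 4.0.0.
        props.put(ConsumerConfig.GROUP_PROTOCOL_CONFIG, "consumer");
        return props;
    }
}
{code}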
[jira] [Updated] (KAFKA-17593) Async regex resolution
[ https://issues.apache.org/jira/browse/KAFKA-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17593: -- Fix Version/s: 4.0.0 > Async regex resolution > -- > > Key: KAFKA-17593 > URL: https://issues.apache.org/jira/browse/KAFKA-17593 > Project: Kafka > Issue Type: Sub-task >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Major > Fix For: 4.0.0 > > > Async RE2J regex evaluation, triggered from the HB as needed and performed in a separate > component for group regex management. It should end up persisting the > resolved regexes for the group. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15954: -- Fix Version/s: (was: 4.0.0) > Review minimal effort approach on consumer last heartbeat on unsubscribe > > > Key: KAFKA-15954 > URL: https://issues.apache.org/jira/browse/KAFKA-15954 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Minor > Labels: kip-848-client-support, reconciliation > > Currently the legacy and new consumers follow a minimal effort approach when > sending a leave group (legacy) or last heartbeat request (new consumer). The > request is sent without waiting for/handling any response. This behaviour applies > when the consumer is being closed or when it unsubscribes. > For the case when the consumer is being closed (which is a "terminal" > state), it probably makes sense to just follow a minimal effort approach for > "properly" leaving the group (no retry logic). But for the case of > unsubscribe, we could consider whether it is valuable to put a little more effort > into making sure that the last heartbeat is sent and received by the broker > (ex. what if the coordinator is not known/available when sending the last HB). Note > that unsubscribe could be a temporary state, where the consumer might want to > re-join the group at any time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887395#comment-17887395 ] Kirk True commented on KAFKA-15954: --- This is marked as post-4.0 on our internal priority list, so clearing the fix version. > Review minimal effort approach on consumer last heartbeat on unsubscribe > > > Key: KAFKA-15954 > URL: https://issues.apache.org/jira/browse/KAFKA-15954 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Minor > Labels: kip-848-client-support, reconciliation > Fix For: 4.0.0 > > > Currently the legacy and new consumers follow a minimal effort approach when > sending a leave group (legacy) or last heartbeat request (new consumer). The > request is sent without waiting for/handling any response. This behaviour applies > when the consumer is being closed or when it unsubscribes. > For the case when the consumer is being closed (which is a "terminal" > state), it probably makes sense to just follow a minimal effort approach for > "properly" leaving the group (no retry logic). But for the case of > unsubscribe, we could consider whether it is valuable to put a little more effort > into making sure that the last heartbeat is sent and received by the broker > (ex. what if the coordinator is not known/available when sending the last HB). Note > that unsubscribe could be a temporary state, where the consumer might want to > re-join the group at any time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15847) Consider partial metadata requests for client reconciliation
[ https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15847: -- Fix Version/s: (was: 4.0.0) > Consider partial metadata requests for client reconciliation > > > Key: KAFKA-15847 > URL: https://issues.apache.org/jira/browse/KAFKA-15847 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Major > Labels: consumer-threading-refactor, reconciliation > > The new consumer implementing the KIP-848 protocol needs to resolve metadata for the > topics received in the assignment. It does so by relying on the centralized > metadata object. Currently, metadata updates requested through the metadata > object request metadata for all topics. Consider allowing the partial > updates that are already expressed as an intention in the Metadata class but > not fully supported (investigate the background in case there were some specifics > that led to this intention not being fully implemented). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15847) Consider partial metadata requests for client reconciliation
[ https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887394#comment-17887394 ] Kirk True commented on KAFKA-15847: --- On our internal priority list, this is marked as post-4.0, so clearing the fix version. > Consider partial metadata requests for client reconciliation > > > Key: KAFKA-15847 > URL: https://issues.apache.org/jira/browse/KAFKA-15847 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Major > Labels: consumer-threading-refactor, reconciliation > > The new consumer implementing the KIP-848 protocol needs to resolve metadata for the > topics received in the assignment. It does so by relying on the centralized > metadata object. Currently, metadata updates requested through the metadata > object request metadata for all topics. Consider allowing the partial > updates that are already expressed as an intention in the Metadata class but > not fully supported (investigate the background in case there were some specifics > that led to this intention not being fully implemented). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15588) Purge the unsent offset commits/fetches when the member is fenced/failed
[ https://issues.apache.org/jira/browse/KAFKA-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15588: -- Fix Version/s: (was: 4.0.0) > Purge the unsent offset commits/fetches when the member is fenced/failed > > > Key: KAFKA-15588 > URL: https://issues.apache.org/jira/browse/KAFKA-15588 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Philip Nee >Assignee: TengYao Chi >Priority: Major > Labels: kip-848-client-support, reconciliation > > When the member is fenced/failed, we should purge the inflight offset commits > and fetches. HeartbeatRequestManager should be able to handle this -- This message was sent by Atlassian Jira (v8.20.10#820010)
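A hypothetical sketch of the purge behaviour proposed in KAFKA-15588 above. The class, queue, and method names are illustrative assumptions, not the actual CommitRequestManager internals, and the choice of exception is only one plausible option.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.CompletableFuture;
import org.apache.kafka.clients.consumer.CommitFailedException;

// Invented names; a sketch only, not the real request manager.
final class CommitPurgeSketch {
    private final Deque<CompletableFuture<Void>> unsentCommits = new ArrayDeque<>();

    // Would be invoked when the heartbeat path learns the member was fenced/failed.
    void onMemberFenced() {
        // Drain queued requests instead of sending them; a fenced member's
        // commits would be rejected by the broker anyway.
        CompletableFuture<Void> pending;
        while ((pending = unsentCommits.poll()) != null) {
            pending.completeExceptionally(
                    new CommitFailedException("Member fenced before commit was sent"));
        }
    }
}
{code}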
[jira] [Commented] (KAFKA-15553) Review consumer positions update
[ https://issues.apache.org/jira/browse/KAFKA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887391#comment-17887391 ] Kirk True commented on KAFKA-15553: --- It looks like our internal prioritization marked this as post-4.0.0, so moving it out. > Review consumer positions update > > > Key: KAFKA-15553 > URL: https://issues.apache.org/jira/browse/KAFKA-15553 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Minor > Labels: consumer-threading-refactor, position > Fix For: 4.0.0 > > > From the existing comment: If there are any partitions which do not have a > valid position and are not awaiting reset, then we need to fetch committed > offsets. > In the async consumer: I wonder if it would make sense to refresh the > position on the event loop continuously. > The logic to refresh offsets in the poll loop is quite fragile and works > largely by side-effects of the code that it calls. For example, the behaviour > of the "cached" value is really not that straightforward and simply reading > the cached value is not sufficient to start consuming data in all cases. > This area needs a bit of a refactor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15553) Review consumer positions update
[ https://issues.apache.org/jira/browse/KAFKA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15553: -- Fix Version/s: (was: 4.0.0) > Review consumer positions update > > > Key: KAFKA-15553 > URL: https://issues.apache.org/jira/browse/KAFKA-15553 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Minor > Labels: consumer-threading-refactor, position > > From the existing comment: If there are any partitions which do not have a > valid position and are not awaiting reset, then we need to fetch committed > offsets. > In the async consumer: I wonder if it would make sense to refresh the > position on the event loop continuously. > The logic to refresh offsets in the poll loop is quite fragile and works > largely by side-effects of the code that it calls. For example, the behaviour > of the "cached" value is really not that straightforward and simply reading > the cached value is not sufficient to start consuming data in all cases. > This area needs a bit of a refactor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16460) New consumer times out consuming records in multiple consumer_test.py system tests
[ https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887388#comment-17887388 ] Kirk True commented on KAFKA-16460: --- [~yangpoan]—yes, you are free to take it :) Thanks! > New consumer times out consuming records in multiple consumer_test.py system > tests > -- > > Key: KAFKA-16460 > URL: https://issues.apache.org/jira/browse/KAFKA-16460 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, system tests >Affects Versions: 3.7.0 >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support, system-tests > Fix For: 4.0.0 > > > The {{consumer_test.py}} system test fails with the following errors: > {quote} > * Timed out waiting for consumption > {quote} > Affected tests: > * {{test_broker_failure}} > * {{test_consumer_bounce}} > * {{test_static_consumer_bounce}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer
[ https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17681: -- Component/s: system tests > Fix unstable consumer_test.py#test_fencing_static_consumer > -- > > Key: KAFKA-17681 > URL: https://issues.apache.org/jira/browse/KAFKA-17681 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, system tests >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > {code:java} > AssertionError('Static consumers attempt to join with instance id in use > should not cause a rebalance.') > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line > 359, in test_fencing_static_consumer > assert num_rebalances == consumer.num_rebalances(), "Static consumers > attempt to join with instance id in use should not cause a rebalance. before: > " + str(num_rebalances) + " after: " + str(consumer.num_rebalances()) > AssertionError: Static consumers attempt to join with instance id in use > should not cause a rebalance. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-16460) New consumer times out consuming records in multiple consumer_test.py system tests
[ https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-16460: - Assignee: (was: Kirk True) > New consumer times out consuming records in multiple consumer_test.py system > tests > -- > > Key: KAFKA-16460 > URL: https://issues.apache.org/jira/browse/KAFKA-16460 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, system tests >Affects Versions: 3.7.0 >Reporter: Kirk True >Priority: Critical > Labels: kip-848-client-support, system-tests > Fix For: 4.0.0 > > > The {{consumer_test.py}} system test fails with the following errors: > {quote} > * Timed out waiting for consumption > {quote} > Affected tests: > * {{test_broker_failure}} > * {{test_consumer_bounce}} > * {{test_static_consumer_bounce}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-15402) Performance regression on close consumer after upgrading to 3.5.0
[ https://issues.apache.org/jira/browse/KAFKA-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True reassigned KAFKA-15402: - Assignee: (was: Kirk True) > Performance regression on close consumer after upgrading to 3.5.0 > - > > Key: KAFKA-15402 > URL: https://issues.apache.org/jira/browse/KAFKA-15402 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.5.0, 3.6.0, 3.5.1 >Reporter: Benoit Delbosc >Priority: Critical > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > Attachments: image-2023-08-24-18-51-21-720.png, > image-2023-08-24-18-51-57-435.png, image-2023-08-25-10-50-28-079.png > > > Hi, > After upgrading to Kafka client version 3.5.0, we have observed a significant > increase in the duration of our Java unit tests. These unit tests heavily > rely on the Kafka Admin, Producer, and Consumer API. > When using Kafka server version 3.4.1, the duration of the unit tests > increased from 8 seconds (with Kafka client 3.4.1) to 18 seconds (with Kafka > client 3.5.0). > Upgrading the Kafka server to 3.5.1 shows similar results. > I have come across the issue KAFKA-15178, which could be the culprit. I will > attempt to test the proposed patch. > In the meantime, if you have any ideas that could help identify and address > the regression, please let me know. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15553) Review consumer positions update
[ https://issues.apache.org/jira/browse/KAFKA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887251#comment-17887251 ] Kirk True commented on KAFKA-15553: --- [~pnee]—do you plan on addressing this for 4.0.0? > Review consumer positions update > > > Key: KAFKA-15553 > URL: https://issues.apache.org/jira/browse/KAFKA-15553 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Minor > Labels: consumer-threading-refactor, position > Fix For: 4.0.0 > > > From the existing comment: If there are any partitions which do not have a > valid position and are not awaiting reset, then we need to fetch committed > offsets. > In the async consumer: I wonder if it would make sense to refresh the > position on the event loop continuously. > The logic to refresh offsets in the poll loop is quite fragile and works > largely by side-effects of the code that it calls. For example, the behaviour > of the "cached" value is really not that straightforward and simply reading > the cached value is not sufficient to start consuming data in all cases. > This area needs a bit of a refactor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16816) Remove unneeded FencedInstanceId support on commit path for new consumer
[ https://issues.apache.org/jira/browse/KAFKA-16816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887249#comment-17887249 ] Kirk True commented on KAFKA-16816: --- [~lianetm]—Is this something we plan to get into 4.0.0? > Remove unneeded FencedInstanceId support on commit path for new consumer > > > Key: KAFKA-16816 > URL: https://issues.apache.org/jira/browse/KAFKA-16816 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Minor > Labels: kip-848-client-support > Fix For: 4.0.0 > > > The new consumer contains logic related to handling the FencedInstanceId > exception received as a response to an OffsetCommit request (on the > [consumer|https://github.com/apache/kafka/blob/028e7a06dcdca7d4dbeae83f2fce0a4120cc2753/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L776] > and [commit > manager|https://github.com/apache/kafka/blob/028e7a06dcdca7d4dbeae83f2fce0a4120cc2753/clients/src/main/java/org/apache/kafka/clients/consumer/internals/CommitRequestManager.java#L715]), > but with the new group protocol, we will never get that error on a commit > response. We should remove the code that expects the FencedInstanceId on the > commit response, and also clean up the other related usages that we added to > propagate the FencedInstanceId exception on the poll, commitSync and > commitAsync APIs. Note that throwing that exception is part of the contract of > the poll, commitSync and commitAsync APIs of the KafkaConsumer, but it > changes with the new protocol. We should update the Javadoc for the new > AsyncKafkaConsumer to reflect the change. > > With the new protocol, if a consumer tries to commit offsets, there could be 2 > cases: > # empty group -> commit succeeds; fencing an instance id would never happen > because the group is empty > # non-empty group -> commit fails with UnknownMemberId, indicating that the > member is not known to the group. The consumer needs to join the non-empty > group in order to commit offsets to it. To complete the story, the moment the > consumer attempts to join, it will receive an UnreleasedInstanceId error on > the HB response, indicating it is using a groupInstanceId that is already in use. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe
[ https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887250#comment-17887250 ] Kirk True commented on KAFKA-15954: --- [~lianetm]—Same here—is this going into 4.0.0? > Review minimal effort approach on consumer last heartbeat on unsubscribe > > > Key: KAFKA-15954 > URL: https://issues.apache.org/jira/browse/KAFKA-15954 > Project: Kafka > Issue Type: Sub-task > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Minor > Labels: kip-848-client-support, reconciliation > Fix For: 4.0.0 > > > Currently the legacy and new consumers follow a minimal effort approach when > sending a leave group (legacy) or last heartbeat request (new consumer). The > request is sent without waiting for/handling any response. This behaviour applies > when the consumer is being closed or when it unsubscribes. > For the case when the consumer is being closed (which is a "terminal" > state), it probably makes sense to just follow a minimal effort approach for > "properly" leaving the group (no retry logic). But for the case of > unsubscribe, we could consider whether it is valuable to put a little more effort > into making sure that the last heartbeat is sent and received by the broker > (ex. what if the coordinator is not known/available when sending the last HB). Note > that unsubscribe could be a temporary state, where the consumer might want to > re-join the group at any time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17337) ConsumerConfig should default to CONSUMER for group.protocol
[ https://issues.apache.org/jira/browse/KAFKA-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17337: -- Priority: Critical (was: Blocker) > ConsumerConfig should default to CONSUMER for group.protocol > > > Key: KAFKA-17337 > URL: https://issues.apache.org/jira/browse/KAFKA-17337 > Project: Kafka > Issue Type: Task > Components: clients, config, consumer >Affects Versions: 3.9.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Critical > Labels: consumer-threading-refactor, kip-848-client-support > Fix For: 4.0.0 > > > {{ConsumerConfig}}’s default value for {{group.protocol}} should be changed > from {{CLASSIC}} to {{CONSUMER}} for 4.0.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16460) New consumer times out consuming records in multiple consumer_test.py system tests
[ https://issues.apache.org/jira/browse/KAFKA-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16460: -- Priority: Critical (was: Blocker) > New consumer times out consuming records in multiple consumer_test.py system > tests > -- > > Key: KAFKA-16460 > URL: https://issues.apache.org/jira/browse/KAFKA-16460 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, system tests >Affects Versions: 3.7.0 >Reporter: Kirk True >Assignee: Kirk True >Priority: Critical > Labels: kip-848-client-support, system-tests > Fix For: 4.0.0 > > > The {{consumer_test.py}} system test fails with the following errors: > {quote} > * Timed out waiting for consumption > {quote} > Affected tests: > * {{test_broker_failure}} > * {{test_consumer_bounce}} > * {{test_static_consumer_bounce}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15561) Client support for new SubscriptionPattern based subscription
[ https://issues.apache.org/jira/browse/KAFKA-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15561: -- Priority: Critical (was: Blocker) > Client support for new SubscriptionPattern based subscription > - > > Key: KAFKA-15561 > URL: https://issues.apache.org/jira/browse/KAFKA-15561 > Project: Kafka > Issue Type: New Feature > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Phuc Hong Tran >Priority: Critical > Labels: kip-848-client-support, regex > Fix For: 4.0.0 > > > New consumer should support subscribe with the new SubscriptionPattern > introduced in the new consumer group protocol. When subscribing with this > regex, the client should provide the regex in the HB request on the > SubscribedTopicRegex field, delegating the resolution to the server. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17696) New consumer background operations unaware of metadata errors
[ https://issues.apache.org/jira/browse/KAFKA-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17696: -- Priority: Critical (was: Blocker) > New consumer background operations unaware of metadata errors > - > > Key: KAFKA-17696 > URL: https://issues.apache.org/jira/browse/KAFKA-17696 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: 黃竣陽 >Priority: Critical > Labels: consumer-threading-refactor > Fix For: 4.0.0 > > > When a metadata error happens (i.e. an unauthorized topic), the network layer is > the one to detect it, and it just propagates it to the app thread via an > ErrorEvent. > [https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153] > That allows API calls that call processBackgroundEvents to throw the error in the > app thread (ex. poll, unsubscribe and close, which are the only API calls > that currently call processBackgroundEvents). > This means that all other API calls that do not call processBackgroundEvents will > never know about errors like unauthorized topics. Moreover, it really means > that the background operations are not notified/aborted when a metadata error > happens (auth error). Ex. a call to position blocks waiting for > updateFetchPositions > ([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586]) > and will leave a > pendingOffsetFetchEvent waiting to complete, even when the background already > got an Unauthorized exception (but only passed it to the app thread via an > ErrorEvent). > I wonder if we should ensure that metadata errors are not only propagated to > the app thread via ErrorEvents, but also ensure that we notify all request > managers in the background (so that they can decide whether to completeExceptionally > their outstanding events). Ex. OffsetsRequestManager.onMetadataError should > completeExceptionally the pendingOffsetFetchEvent (just a first thought; there > could be other approaches, but note that calling processBackgroundEvents in > API calls like position will not do, because we would block first on the > CheckAndUpdatePositions, and processBackgroundEvents would only happen > after the CheckAndUpdatePositions). > > This behaviour can be reproduced with the integration test > AuthorizerIntegrationTest.testOffsetFetchWithNoAccess with the new consumer > enabled (discovered with [https://github.com/apache/kafka/pull/17107]). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16954) Move consumer leave operations on close to background thread
[ https://issues.apache.org/jira/browse/KAFKA-16954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16954: -- Labels: kip-848-client-support (was: ) > Move consumer leave operations on close to background thread > > > Key: KAFKA-16954 > URL: https://issues.apache.org/jira/browse/KAFKA-16954 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Blocker > Labels: kip-848-client-support > Fix For: 3.8.0 > > > When a consumer unsubscribes, the app thread simply triggers an Unsubscribe > event that will take care of it all in the background thread: release > assignment (callbacks), clear assigned partitions, and send leave group HB. > In contrast, when a consumer is closed, these actions happen in both > threads: > * release assignment -> in the app thread by directly running the callbacks > * clear assignment -> in the app thread by updating the subscriptionState > * send leave group HB -> in the background thread via a LeaveOnClose event > This situation could lead to race conditions, mainly because the close > updates the subscription state in the app thread while other operations in > the background could already be running based on it. Ex. > * unsubscribe in app thread (triggers background UnsubscribeEvent to revoke > and leave) > * unsubscribe fails (ex. interrupted, leaving the operation running in the > background thread to revoke partitions and leave) > * consumer close (will revoke and clear assignment in the app thread) > * UnsubscribeEvent in the background may fail by trying to revoke > partitions that it does not own anymore - _No current assignment for > partition ..._ > A basic check has been added to the background thread revocation to avoid the > race condition, ensuring that we only revoke partitions we own, but still we > should avoid the root cause, which is updating the assignment on the app > thread. We should consider having the close operation as a single > LeaveOnClose event handled in the background. That event already takes care > of revoking the partitions and clearing the assignment in the background, so there is no > need to take care of it in the app thread. We should only ensure that we call > processBackgroundEvents until the LeaveOnClose completes (to allow > callbacks to run in the app thread). > > Trying to understand the current approach, I imagine the initial motivation > to have the callbacks (and assignment cleared) in the app thread was to > avoid the back-and-forth: app thread close -> background thread leave event > -> app thread to run callback -> background thread to clear assignment and > send HB. But updating the assignment on the app thread ends up being > problematic, as it mainly happens in the background, so it opens the door > for race conditions on the subscription state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
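A hypothetical sketch of the close flow proposed in KAFKA-16954 above: close() only enqueues a leave event and then services background events until it completes, so rebalance callbacks still run on the app thread while revoke and leave happen in the background. All names here are illustrative stand-ins for the real internals.

{code:java}
import java.time.Duration;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

final class CloseFlowSketch {
    private final BlockingQueue<Object> applicationEventQueue = new LinkedBlockingQueue<>();

    void close(Duration timeout) throws InterruptedException {
        CompletableFuture<Void> leaveDone = new CompletableFuture<>();
        applicationEventQueue.add(leaveDone); // stand-in for a LeaveOnClose event
        long deadline = System.currentTimeMillis() + timeout.toMillis();
        // Keep servicing background events so rebalance callbacks can still
        // run on the app thread while the background handles revoke + leave.
        while (!leaveDone.isDone() && System.currentTimeMillis() < deadline) {
            processBackgroundEvents();
            Thread.sleep(10);
        }
    }

    private void processBackgroundEvents() {
        // Drain background events (e.g., "invoke rebalance listener") here.
    }
}
{code}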
[jira] [Updated] (KAFKA-16984) New consumer should not complete leave operation until it gets a response
[ https://issues.apache.org/jira/browse/KAFKA-16984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16984: -- Labels: kip-848 kip-848-client-support (was: kip-848) > New consumer should not complete leave operation until it gets a response > - > > Key: KAFKA-16984 > URL: https://issues.apache.org/jira/browse/KAFKA-16984 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.8.0 >Reporter: Lianet Magrans >Assignee: Lianet Magrans >Priority: Major > Labels: kip-848, kip-848-client-support > Fix For: 3.9.0 > > > When the new consumer attempts to leave a group, it sends a leave group > request in a fire-and-forget mode, so, as soon as the request is generated, > it will: > 1. transition to UNSUBSCRIBED > 2. complete the leaveGroup operation future > This task focuses on point 2, which has the undesired side-effect that whatever > might have been waiting for the leave to do something else will carry on > (ex. consumer close, leading to the responses to disconnected clients we've seen > when running stress tests). > When leaving a group while closing a consumer, the member sends the leave > request and moves on to the next operation, which is closing the network thread, > so we end up with a disconnected client receiving responses from the server. We > should send the leave group heartbeat and transition to UNSUBSCRIBED, but only > complete the leave operation when we get a response for it, which is a much > more accurate confirmation that the consumer left the group and can move on > with other operations. > Note that the legacy consumer does wait for a leave response before closing > down the coordinator (see > [AbstractCoordinator|https://github.com/apache/kafka/blob/25230b538841a5e7256b1b51725361dd59435901/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L1135-L1140]), > and we are looking to have the same behaviour in the new consumer. > Note that with this task we'll only focus on changing the behaviour for the > leave operation completion (point 2 above) to tidy up the close flow. We are > not changing the transition to UNSUBSCRIBED, as it would require further > consideration if ever needed. > > This is also a building block for future improvements around error handling > for the leave request, which we don't have at the moment (related Jira linked). -- This message was sent by Atlassian Jira (v8.20.10#820010)
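A hypothetical sketch of the completion change proposed in KAFKA-16984 above: the leave future is completed from the heartbeat response handler rather than at send time, mirroring the legacy AbstractCoordinator behaviour. Names are illustrative, not the actual consumer internals.

{code:java}
import java.util.concurrent.CompletableFuture;

final class LeaveGroupSketch {
    private final CompletableFuture<Void> leaveOperation = new CompletableFuture<>();

    void sendLeaveHeartbeat() {
        transitionToUnsubscribed();
        // Deliberately NOT completing leaveOperation here: fire-and-forget
        // completion is exactly what the ticket wants to remove.
    }

    // Invoked by the network layer when the leave heartbeat response arrives.
    void onLeaveHeartbeatResponse(Exception error) {
        // Only now do we know the broker saw the leave, so close() can proceed.
        if (error == null) {
            leaveOperation.complete(null);
        } else {
            leaveOperation.completeExceptionally(error);
        }
    }

    private void transitionToUnsubscribed() { /* state transition elided */ }
}
{code}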
[jira] [Updated] (KAFKA-16937) Consider inlining Time#waitObject into ProducerMetadata#awaitUpdate
[ https://issues.apache.org/jira/browse/KAFKA-16937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16937: -- Component/s: clients producer > Consider inlining Time#waitObject into ProducerMetadata#awaitUpdate > -- > > Key: KAFKA-16937 > URL: https://issues.apache.org/jira/browse/KAFKA-16937 > Project: Kafka > Issue Type: Improvement > Components: clients, producer >Reporter: Chia-Ping Tsai >Assignee: PoAn Yang >Priority: Minor > > Time#waitObject is implemented by a while-loop and is used by > `ProducerMetadata` only. Hence, this Jira can include the following changes: > 1. move `Time#waitObject` to `ProducerMetadata#awaitUpdate` > 2. ProducerMetadata#awaitUpdate can throw an "exact" TimeoutException [0] > [0] > https://github.com/apache/kafka/blob/23fe71d579f84d59ebfe6d5a29e688315cec1285/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1176 -- This message was sent by Atlassian Jira (v8.20.10#820010)
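A rough sketch of the refactor suggested in KAFKA-16937 above, under the assumption that Time#waitObject is a deadline-bounded wait() loop: inlining it lets awaitUpdate throw a TimeoutException with an exact, metadata-specific message. This is not the actual ProducerMetadata code.

{code:java}
import org.apache.kafka.common.errors.TimeoutException;

// Sketch only; the real ProducerMetadata differs.
class MetadataWaitSketch {
    private int updateVersion = 0;

    synchronized void awaitUpdate(int lastVersion, long timeoutMs) throws InterruptedException {
        long deadlineMs = System.currentTimeMillis() + timeoutMs;
        while (updateVersion <= lastVersion) {
            long remainingMs = deadlineMs - System.currentTimeMillis();
            if (remainingMs <= 0) {
                // The "exact" message the ticket asks for: it names the
                // metadata wait instead of a generic Time#waitObject timeout.
                throw new TimeoutException("Topic metadata was not updated within " + timeoutMs + " ms.");
            }
            wait(remainingMs); // released by update() below
        }
    }

    synchronized void update() {
        updateVersion++;
        notifyAll(); // wake any awaitUpdate() callers
    }
}
{code}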
[jira] [Updated] (KAFKA-17279) Handle retriable errors from offset fetches in ConsumerCoordinator
[ https://issues.apache.org/jira/browse/KAFKA-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17279: -- Component/s: clients > Handle retriable errors from offset fetches in ConsumerCoordinator > -- > > Key: KAFKA-17279 > URL: https://issues.apache.org/jira/browse/KAFKA-17279 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Sean Quah >Assignee: Sean Quah >Priority: Minor > Fix For: 3.9.0 > > > Currently {{ConsumerCoordinator}}'s {{OffsetFetchResponseHandler}} only > retries on {{COORDINATOR_LOAD_IN_PROGRESS}} and {{NOT_COORDINATOR}} errors. > The error handling should be expanded to retry on all retriable errors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17478) Wrong configuration of metric.reporters leads to NPE in KafkaProducer constructor
[ https://issues.apache.org/jira/browse/KAFKA-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17478: -- Component/s: producer > Wrong configuration of metric.reporters leads to NPE in KafkaProducer > constructor > > > Key: KAFKA-17478 > URL: https://issues.apache.org/jira/browse/KAFKA-17478 > Project: Kafka > Issue Type: Bug > Components: clients, producer >Affects Versions: 3.7.0, 3.8.0, 3.7.1 >Reporter: Fred Rouleau >Assignee: Fred Rouleau >Priority: Minor > Fix For: 4.0.0 > > Original Estimate: 10m > Remaining Estimate: 10m > > If the metric.reporters property contains some invalid class, the > KafkaProducer constructor fails with a non-explicit NPE: > {code:java} > Exception in thread "main" java.lang.NullPointerException: Cannot invoke > "java.util.Optional.ifPresent(java.util.function.Consumer)" because > "this.clientTelemetryReporter" is null > at > org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1424) > at > org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:472) > at > org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:295) > at > org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:322) > at > org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:307) > at org.frouleau.kafka.clients.Produce.sendAvroSpecific(Produce.java:89) > at org.frouleau.kafka.clients.Produce.main(Produce.java:63){code} > This behavior was introduced by KAFKA-15901 implementing KIP-714. -- This message was sent by Atlassian Jira (v8.20.10#820010)
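A minimal reproduction of the misconfiguration described in KAFKA-17478, assuming the behaviour of the affected versions; the reporter class name is deliberately bogus, and the bootstrap address is illustrative.

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

public class MetricReporterNpeRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // Deliberately invalid reporter class: construction fails, and the
        // cleanup path then NPEs on the uninitialized clientTelemetryReporter.
        props.put(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG, "com.example.NoSuchReporter");

        // On affected versions (3.7.0, 3.7.1, 3.8.0) this throws the NPE
        // quoted in the ticket instead of a clear ConfigException.
        new KafkaProducer<String, String>(props);
    }
}
{code}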
[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer
[ https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17681: -- Labels: kip-848-client-support (was: ) > Fix unstable consumer_test.py#test_fencing_static_consumer > -- > > Key: KAFKA-17681 > URL: https://issues.apache.org/jira/browse/KAFKA-17681 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Major > Labels: kip-848-client-support > Fix For: 4.0.0 > > > {code:java} > AssertionError('Static consumers attempt to join with instance id in use > should not cause a rebalance.') > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line > 359, in test_fencing_static_consumer > assert num_rebalances == consumer.num_rebalances(), "Static consumers > attempt to join with instance id in use should not cause a rebalance. before: > " + str(num_rebalances) + " after: " + str(consumer.num_rebalances()) > AssertionError: Static consumers attempt to join with instance id in use > should not cause a rebalance. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer
[ https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17681: -- Component/s: clients consumer > Fix unstable consumer_test.py#test_fencing_static_consumer > -- > > Key: KAFKA-17681 > URL: https://issues.apache.org/jira/browse/KAFKA-17681 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Major > > {code:java} > AssertionError('Static consumers attempt to join with instance id in use > should not cause a rebalance.') > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line > 359, in test_fencing_static_consumer > assert num_rebalances == consumer.num_rebalances(), "Static consumers > attempt to join with instance id in use should not cause a rebalance. before: > " + str(num_rebalances) + " after: " + str(consumer.num_rebalances()) > AssertionError: Static consumers attempt to join with instance id in use > should not cause a rebalance. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer
[ https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-17681: -- Fix Version/s: 4.0.0 > Fix unstable consumer_test.py#test_fencing_static_consumer > -- > > Key: KAFKA-17681 > URL: https://issues.apache.org/jira/browse/KAFKA-17681 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Major > Fix For: 4.0.0 > > > {code:java} > AssertionError('Static consumers attempt to join with instance id in use > should not cause a rebalance.') > Traceback (most recent call last): > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 186, in _do_run > data = self.run_test() > File > "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", > line 246, in run_test > return self.test_context.function(self.test) > File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line > 433, in wrapper > return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) > File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line > 359, in test_fencing_static_consumer > assert num_rebalances == consumer.num_rebalances(), "Static consumers > attempt to join with instance id in use should not cause a rebalance. before: > " + str(num_rebalances) + " after: " + str(consumer.num_rebalances()) > AssertionError: Static consumers attempt to join with instance id in use > should not cause a rebalance. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16142) Update metrics documentation for errors
[ https://issues.apache.org/jira/browse/KAFKA-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-16142: -- Fix Version/s: (was: 4.0.0) > Update metrics documentation for errors > --- > > Key: KAFKA-16142 > URL: https://issues.apache.org/jira/browse/KAFKA-16142 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer, documentation, metrics >Affects Versions: 3.7.0 >Reporter: Kirk True >Assignee: Philip Nee >Priority: Minor > Labels: consumer-threading-refactor, metrics > > We need to identify the “errors” that exist in the current JMX documentation > and resolve them. Per [~pnee] there are errors on the JMX web page, which he > will identify and resolve. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15556) Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, and tryConnect
[ https://issues.apache.org/jira/browse/KAFKA-15556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15556: -- Fix Version/s: (was: 4.0.0) > Remove NetworkClientDelegate methods isUnavailable, maybeThrowAuthFailure, > and tryConnect > - > > Key: KAFKA-15556 > URL: https://issues.apache.org/jira/browse/KAFKA-15556 > Project: Kafka > Issue Type: Improvement > Components: clients, consumer >Reporter: Kirk True >Assignee: Phuc Hong Tran >Priority: Minor > Labels: consumer-threading-refactor > > The "new consumer" (i.e. {{{}PrototypeAsyncConsumer{}}}) was designed to > handle networking details in a more centralized way. However, in order to > reuse code between the existing {{KafkaConsumer}} and the new > {{{}PrototypeAsyncConsumer{}}}, that design goal was "relaxed" when the > {{NetworkClientDelegate}} capitulated and -stole- copied three methods from > {{ConsumerNetworkClient}} related to detecting node status: > # {{isUnavailable}} > # {{maybeThrowAuthFailure}} > # {{tryConnect}} > Unfortunately, these have found their way into the {{FetchRequestManager}} > and {{OffsetsRequestManager}} implementations. We should review if we can > clean up—or even remove—this leaky abstraction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15867) Should ConsumerNetworkThread wrap the exception and notify the polling thread?
[ https://issues.apache.org/jira/browse/KAFKA-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirk True updated KAFKA-15867: -- Fix Version/s: (was: 4.0.0) > Should ConsumerNetworkThread wrap the exception and notify the polling thread? > -- > > Key: KAFKA-15867 > URL: https://issues.apache.org/jira/browse/KAFKA-15867 > Project: Kafka > Issue Type: Task > Components: clients, consumer >Reporter: Philip Nee >Assignee: Phuc Hong Tran >Priority: Minor > Labels: consumer-threading-refactor, events > > The ConsumerNetworkThread runs a tight loop infinitely. However, when > encountering an unexpected exception, it logs an error and continues. > > I think this might not be ideal because users can run blind for a long time > before discovering there's something wrong with the code, so I believe we > should propagate the throwable back to the polling thread. > > cc [~lucasbru] -- This message was sent by Atlassian Jira (v8.20.10#820010)
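A hypothetical sketch of the propagation suggested in KAFKA-15867 above: the background thread records the first fatal throwable, and the polling thread rethrows it at its next entry point. The class and method names are illustrative, not the actual ConsumerNetworkThread internals.

{code:java}
import java.util.concurrent.atomic.AtomicReference;

final class BackgroundErrorPropagationSketch {
    private final AtomicReference<RuntimeException> firstError = new AtomicReference<>();

    // Called from the background thread's run loop on an unexpected exception.
    void recordBackgroundError(RuntimeException e) {
        firstError.compareAndSet(null, e); // keep only the first failure
    }

    // Called at the top of poll() and other app-thread entry points.
    void maybeThrowBackgroundError() {
        RuntimeException e = firstError.get();
        if (e != null) {
            throw e;
        }
    }
}
{code}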