[ 
https://issues.apache.org/jira/browse/KAFKA-19888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot updated KAFKA-19888:
--------------------------------
    Priority: Blocker  (was: Critical)

> Coordinator histogram negative values causing persistent write timeouts and 
> consumer instability in Kafka 4.1.0
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-19888
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19888
>             Project: Kafka
>          Issue Type: Bug
>          Components: group-coordinator
>    Affects Versions: 4.1.0
>         Environment: Kafka cluster (KRaft) with 9 brokers
> RHEL-based OS (AlmaLinux 9.6) 
> Running on VMware vSphere
> NTP service active on VMs
> VMware Tools had "Synchronize time periodically" enabled
>            Reporter: Eliel Gomes
>            Priority: Blocker
>             Fix For: 4.2.0
>
>
> h2. Summary
> In Kafka 4.1.0, a broker entered a corrupted state where the Coordinator 
> began producing {*}negative histogram values{*}, causing persistent write 
> timeouts to {{{}__consumer_offsets{}}}, continuous group coordinator 
> instability, and repeated consumer disconnections.
> The issue did *not* self-heal after time synchronization stabilized and only 
> cleared after restarting the affected broker.
> This suggests that the Coordinator histogram/timer subsystem becomes invalid 
> after certain time irregularities and does not recover automatically.
>  
> ----
> h2. *Description*
> We observed repeated Coordinator failures on a single broker in a production 
> Kafka 4.1.0 cluster. The internal histogram used by the Coordinator runtime 
> began to record {*}negative values{*}, which should be impossible under 
> correct operation.
> Once this occurs:
>  * Coordinator write events repeatedly time out
>  * Consumers are disconnected and reconnected frequently
>  * Group coordination is unstable
>  * Offset commits fail on {{__consumer_offsets}}
>  * Lag appears to grow ("fake" lag) because offset commits keep failing
>  * The broker remains in a broken state even after the system clock stabilizes
>  * Only restarting the broker clears the condition
>  * After the restart, consumer groups immediately select the same broker 
> again as coordinator, and the problem eventually reappears
> This behavior suggests state corruption inside the Coordinator's 
> timer/histogram logic, where the {{GroupCoordinator}} component is 
> susceptible to a fatal {{ArrayIndexOutOfBoundsException}} when attempting to 
> record a negative value in an internal histogram.
> This bug appears to be triggered by a non-monotonic clock adjustment (i.e., 
> the system clock moving backward).
> When this exception occurs, the {{group-coordinator-event-processor}} thread 
> fails while processing internal events such as {{HighWatermarkUpdate}} and 
> {{FlushBatch}}.
> These failures in turn cause offset *commit failures* on the *consumers*.
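> A minimal sketch of the suspected mechanism (assuming, as a simplification, 
> that the event queue time is derived from wall-clock timestamps; the class 
> below is illustrative, not the actual {{CoordinatorRuntime}} code, but it 
> reproduces the exact exception seen in the broker logs):
> {code:java}
> import java.util.concurrent.TimeUnit;
> import org.HdrHistogram.Recorder;
>
> public class NegativeDeltaSketch {
>     public static void main(String[] args) {
>         // Histogram with 3 significant digits, as is typical for latency metrics.
>         Recorder queueTimeHistogram = new Recorder(3);
>
>         // Event is enqueued while the wall clock reads T.
>         long enqueueTimeMs = System.currentTimeMillis();
>
>         // Simulate the clock stepping backward by two seconds (e.g. an external
>         // time sync such as VMware Tools periodic synchronization adjusting the
>         // guest clock) before the event is dequeued.
>         long dequeueTimeMs = enqueueTimeMs - TimeUnit.SECONDS.toMillis(2);
>
>         long queueTimeMs = dequeueTimeMs - enqueueTimeMs; // negative delta
>
>         // Throws java.lang.ArrayIndexOutOfBoundsException:
>         // "Histogram recorded value cannot be negative."
>         queueTimeHistogram.recordValue(queueTimeMs);
>     }
> }
> {code}
> This sketch alone does not explain why the condition persisted after the 
> clock stabilized, which is why the report suspects longer-lived timer or 
> histogram state inside the coordinator.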
>  
> ----
> h2. *Actual Behavior*
>  * Coordinator internal histograms observe *negative values*
>  * The Coordinator *fails to write to {{__consumer_offsets}}*
>  * Consumers experience recurring instability
>  * Kafka does not self-recover even when system time becomes stable
>  * Only a broker restart resolves the condition
>  
> ----
> h2. *Expected behavior*
> Kafka should:
>  * Reject or guard against negative deltas in histogram/timer calculations 
> (one possible guard is sketched after this list)
>  * Prevent a corrupted Coordinator state from persisting
>  * Avoid selecting a malfunctioning broker as group coordinator
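> One possible shape for such a guard, sketched here only to illustrate the 
> first point above (the method name and clamping policy are assumptions, not 
> the proposed fix; measuring the delta with a monotonic clock such as 
> {{System.nanoTime()}} would avoid the negative value in the first place):
> {code:java}
> import org.HdrHistogram.Recorder;
>
> public class GuardedQueueTimeHistogram {
>     private final Recorder recorder = new Recorder(3);
>
>     // Records an event queue time while tolerating non-monotonic clock behavior:
>     // negative deltas (e.g. after a backward clock step) are clamped to zero
>     // instead of reaching HdrHistogram, which would otherwise throw
>     // ArrayIndexOutOfBoundsException and fail the event.
>     public void recordQueueTime(long queueTimeMs) {
>         recorder.recordValue(Math.max(0L, queueTimeMs));
>     }
> }
> {code}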
>  
> ----
> h2. *Impact*
>  * Critical production consumer instability
>  * Continuous rebalances
>  * Offset commit delays
>  ** "Fake lags" caused by failed offset commits
>  * Persistent coordinator malfunction
>  * Requires broker restarts to recover
>  * High operational risk
> ----
> h2. *LOGS*
> *Server-Side:* The {{GroupCoordinator}} broker log is flooded with the 
> following exception, indicating the event processor thread is failing:
>  
> *Kafka Log - 2025-11-12 at 14:56*
> {code:java}
> [2025-11-12 14:56:38,166] ERROR [group-coordinator-event-processor-0]: Failed 
> to run event InternalEvent(name=HighWatermarkUpdate) due to: Histogram 
> recorded value cannot be negative.. 
> (org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread)
> java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be 
> negative.
> [2025-11-12 14:56:38,167] ERROR [GroupCoordinator id=5] Execution of 
> HighWatermarkUpdate failed due to Histogram recorded value cannot be 
> negative.. (org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
> java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be 
> negative.
> ...
> [2025-11-12 14:58:38,170] ERROR [group-coordinator-event-processor-2]: Failed 
> to run event InternalEvent(name=FlushBatch) due to: Histogram recorded value 
> cannot be negative.. 
> (org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread)
> java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be 
> negative.
> [2025-11-12 14:58:38,170] ERROR [GroupCoordinator id=5] Execution of 
> FlushBatch failed due to Histogram recorded value cannot be negative.. 
> (org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
> java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be 
> negative.{code}
>  
> *Kafka Log - 2025-11-13 09:15*
> {code:java}
> [2025-11-13 09:15:39,777] ERROR [group-coordinator-event-processor-0]: Failed 
> to run event InternalEvent(name=HighWatermarkUpdate) due to: Histogram 
> recorded value cannot be negative.. 
> (org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread)
> java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be 
> negative.
>         at 
> org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:2399)
>  ~[HdrHistogram-2.2.2.jar:?]
>         at 
> org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:559)
>  ~[HdrHistogram-2.2.2.jar:?]
>         at 
> org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:467) 
> ~[HdrHistogram-2.2.2.jar:?]
>         at org.HdrHistogram.Recorder.recordValue(Recorder.java:136) 
> ~[HdrHistogram-2.2.2.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.HdrHistogram.record(HdrHistogram.java:100)
>  ~[kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.KafkaMetricHistogram.record(KafkaMetricHistogram.java:128)
>  ~[kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.common.metrics.Sensor.recordInternal(Sensor.java:236) 
> ~[kafka-clients-4.1.0.jar:?]
>         at org.apache.kafka.common.metrics.Sensor.record(Sensor.java:197) 
> ~[kafka-clients-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.CoordinatorRuntimeMetricsImpl.recordEventQueueTime(CoordinatorRuntimeMetricsImpl.java:284)
>  ~[kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:146)
>  [kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:179)
>  [kafka-coordinator-common-4.1.0.jar:?]
> [2025-11-13 09:15:39,777] ERROR [GroupCoordinator id=5] Execution of 
> HighWatermarkUpdate failed due to Histogram recorded value cannot be 
> negative.. (org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
> java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be 
> negative.
>         at 
> org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:2399)
>  ~[HdrHistogram-2.2.2.jar:?]
>         at 
> org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:559)
>  ~[HdrHistogram-2.2.2.jar:?]
>         at 
> org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:467) 
> ~[HdrHistogram-2.2.2.jar:?]
>         at org.HdrHistogram.Recorder.recordValue(Recorder.java:136) 
> ~[HdrHistogram-2.2.2.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.HdrHistogram.record(HdrHistogram.java:100)
>  ~[kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.KafkaMetricHistogram.record(KafkaMetricHistogram.java:128)
>  ~[kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.common.metrics.Sensor.recordInternal(Sensor.java:236) 
> ~[kafka-clients-4.1.0.jar:?]
>         at org.apache.kafka.common.metrics.Sensor.record(Sensor.java:197) 
> ~[kafka-clients-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.CoordinatorRuntimeMetricsImpl.recordEventQueueTime(CoordinatorRuntimeMetricsImpl.java:284)
>  ~[kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:146)
>  [kafka-coordinator-common-4.1.0.jar:?]
>         at 
> org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:179)
>  [kafka-coordinator-common-4.1.0.jar:?] {code}
>  
> *Client-Side:* Consumers associated with this (failed) {{GroupCoordinator}} 
> begin to fail their offset commits. The client logs show a continuous loop 
> where the consumer:
>  * Tries to commit and fails with {{This is not the correct coordinator.}}
>  * Receives an error response {{{}NOT_COORDINATOR{}}}.
>  * Disconnects and attempts rediscovery.
>  * Discovers the _exact same_ (broken) broker as the coordinator and repeats 
> the cycle (a sketch for confirming which broker a group currently resolves 
> to via the Admin API follows below).
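> For diagnosis, the coordinator a group currently resolves to can also be 
> checked out-of-band with the Admin API (a minimal sketch; the bootstrap 
> address and group id are placeholders):
> {code:java}
> import java.util.List;
> import java.util.Properties;
> import org.apache.kafka.clients.admin.Admin;
> import org.apache.kafka.clients.admin.AdminClientConfig;
> import org.apache.kafka.clients.admin.ConsumerGroupDescription;
>
> public class CoordinatorCheck {
>     public static void main(String[] args) throws Exception {
>         Properties props = new Properties();
>         props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
>
>         try (Admin admin = Admin.create(props)) {
>             ConsumerGroupDescription group = admin
>                     .describeConsumerGroups(List.of("CONSUMER_GROUP_LOW_PRIORITY")) // placeholder
>                     .all().get()
>                     .get("CONSUMER_GROUP_LOW_PRIORITY");
>
>             // coordinator() reports the broker currently acting as group coordinator.
>             System.out.println("Group coordinator: " + group.coordinator());
>         }
>     }
> }
> {code}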
>  
> *Consumer Log - 2025-11-12 at 14:58*
> {code:java}
> 2025-11-13 14:58:46.730 WARN  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Offset commit failed on partition 
> SEND_SMS_LOW_PRIORITY-31 at offset 2291423: The coordinator is not available.
> 2025-11-13 14:58:46.730 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Group coordinator X.X.X.X:9092 (id: 
> 2147483642 rack: null) is unavailable or invalid due to cause: error response 
> COORDINATOR_NOT_AVAILABLE. isDisconnected: false. Rediscovery will be 
> attempted.
> 2025-11-12 14:58:46.388 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer 
> clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Requesting disconnect from last known 
> coordinator 10.0.01:9092 (id: 2147483642 rack: null)
> 2025-11-12 14:58:46.395 INFO [o.apache.kafka.clients.NetworkClient] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Client requested disconnect from node 
> 2147483642
> 2025-11-12 14:58:46.396 INFO [o.apache.kafka.clients.NetworkClient] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Cancelled in-flight OFFSET_COMMIT 
> request with correlation id 3436277 due to node 2147483642 being disconnected 
> (elapsed time since creation: 12ms, elapsed time since send: 12ms, throttle 
> time: 0ms, request timeout: 30000ms)
> 2025-11-12 14:58:46.677 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Discovered group coordinator 
> X.X.X.X:9092 (id: 2147483642 rack: null) {code}
>  
> *Consumer Log - 2025-11-13 at 09:15*
> {code:java}
> 2025-11-13 09:15:46.730 WARN  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Offset commit failed on partition 
> SEND_SMS_LOW_PRIORITY-31 at offset 2291423: The coordinator is not available.
> 2025-11-13 09:15:46.730 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Group coordinator X.X.X.X:9092 (id: 
> 2147483642 rack: null) is unavailable or invalid due to cause: error response 
> COORDINATOR_NOT_AVAILABLE. isDisconnected: false. Rediscovery will be 
> attempted.
> 2025-11-13 09:15:46.730 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Requesting disconnect from last known 
> coordinator X.X.X.X:9092 (id: 2147483642 rack: null)
> 2025-11-13 09:15:46.732 INFO  [o.apache.kafka.clients.NetworkClient] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Client requested disconnect from node 
> 2147483642
> 2025-11-13 09:15:46.949 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Discovered group coordinator 
> X.X.X.X:9092 (id: 2147483642 rack: null)
> 2025-11-13 09:15:51.958 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Group coordinator X.X.X.X:9092 (id: 
> 2147483642 rack: null) is unavailable or invalid due to cause: error response 
> COORDINATOR_NOT_AVAILABLE. isDisconnected: false. Rediscovery will be 
> attempted.
> 2025-11-13 09:15:51.958 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Requesting disconnect from last known 
> coordinator X.X.X.X:9092 (id: 2147483642 rack: null)
> 2025-11-13 09:15:51.958 INFO  [o.apache.kafka.clients.NetworkClient] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Client requested disconnect from node 
> 2147483642
> 2025-11-13 09:15:51.958 INFO  [o.apache.kafka.clients.NetworkClient] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Cancelled in-flight OFFSET_COMMIT 
> request with correlation id 3557487 due to node 2147483642 being disconnected 
> (elapsed time since creation: 4908ms, elapsed time since send: 4908ms, 
> throttle time: 0ms, request timeout: 30000ms)
> 2025-11-13 09:15:52.086 INFO  [SendAggregateTemplateRoute] [] - Grouped 2 
> messages.
> 2025-11-13 09:15:52.132 INFO  [o.a.k.c.c.i.ConsumerCoordinator] [] - 
> [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, 
> groupId=CONSUMER_GROUP_LOW_PRIORITY] Discovered group coordinator 
> X.X.X.X:9092 (id: 2147483642 rack: null) {code}
>  
> ----
>  
> *Workaround (Temporary Fix):*
>  # Restart the affected broker to force a new {{GroupCoordinator}} election 
> for the affected consumer groups.
>  # After the broker restarts, consumers will once again discover the same 
> broker as group coordinator, so the problem can eventually reappear (the 
> partition-to-coordinator mapping that explains this is sketched below).
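> The re-selection is expected: the group coordinator is the leader of the 
> {{__consumer_offsets}} partition the group id hashes to, and a broker restart 
> does not change that assignment, so once leadership returns to the broker it 
> becomes the coordinator again. A minimal sketch for checking which partition 
> and leader that is (the group id, bootstrap address, and partition count are 
> placeholders; the partition count must match the broker's 
> {{offsets.topic.num.partitions}}):
> {code:java}
> import java.util.List;
> import java.util.Properties;
> import org.apache.kafka.clients.admin.Admin;
> import org.apache.kafka.clients.admin.AdminClientConfig;
> import org.apache.kafka.clients.admin.TopicDescription;
> import org.apache.kafka.common.TopicPartitionInfo;
>
> public class CoordinatorPartitionCheck {
>     public static void main(String[] args) throws Exception {
>         String groupId = "CONSUMER_GROUP_LOW_PRIORITY"; // placeholder
>         int offsetsTopicPartitions = 50;                // must match offsets.topic.num.partitions
>
>         // The mapping Kafka uses: abs(groupId.hashCode()) % partition count.
>         int partition = (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitions;
>
>         Properties props = new Properties();
>         props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
>
>         try (Admin admin = Admin.create(props)) {
>             TopicDescription topic = admin.describeTopics(List.of("__consumer_offsets"))
>                     .allTopicNames().get()
>                     .get("__consumer_offsets");
>             TopicPartitionInfo info = topic.partitions().stream()
>                     .filter(p -> p.partition() == partition)
>                     .findFirst().orElseThrow();
>             System.out.printf("Group %s -> __consumer_offsets-%d, leader = %s%n",
>                     groupId, partition, info.leader());
>         }
>     }
> }
> {code}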
>  
> ----
> h2. *NTP*
> Both the Kafka cluster brokers and the consumer clients run with the NTP 
> service active:
> {code:java}
> timedatectl
>                Local time: Fri 2025-11-14 11:07:49 -03
>            Universal time: Fri 2025-11-14 14:07:49 UTC
>                  RTC time: Fri 2025-11-14 14:07:49
>                 Time zone: America/Sao_Paulo (-03, -0300)
> System clock synchronized: yes
>               NTP service: active
>           RTC in local TZ: no {code}
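> Because VMware Tools periodic time synchronization was enabled alongside NTP 
> on these VMs, the guest wall clock may have been stepped backward even though 
> {{timedatectl}} reports a healthy state. A small self-contained sketch for 
> spotting backward steps on a host (the one-second poll interval and 500 ms 
> threshold are arbitrary illustrative choices):
> {code:java}
> // Compares wall-clock progress against the monotonic clock and reports
> // whenever the wall clock appears to have been stepped backward.
> public class ClockStepDetector {
>     public static void main(String[] args) throws InterruptedException {
>         long lastWallMs = System.currentTimeMillis();
>         long lastMonoNs = System.nanoTime();
>         while (true) {
>             Thread.sleep(1000);
>             long wallMs = System.currentTimeMillis();
>             long monoNs = System.nanoTime();
>             long wallDeltaMs = wallMs - lastWallMs;
>             long monoDeltaMs = (monoNs - lastMonoNs) / 1_000_000;
>             // The monotonic clock cannot go backward, so a wall-clock delta that
>             // is much smaller than the monotonic delta indicates a backward step
>             // (e.g. by NTP or by VMware Tools time synchronization).
>             if (monoDeltaMs - wallDeltaMs > 500) {
>                 System.out.printf("Wall clock stepped backward by ~%d ms%n",
>                         monoDeltaMs - wallDeltaMs);
>             }
>             lastWallMs = wallMs;
>             lastMonoNs = monoNs;
>         }
>     }
> }
> {code}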
>  
> ----
> h2. *Request for Analysis / Questions*
>  # Does Kafka 4.1.0 require strictly monotonic time for Coordinator 
> histogram/timer logic?
>  # Is histogram corruption (negative values) a known issue in 4.1.X?
>  # Should Kafka validate histogram deltas to prevent negative values?
>  # Is this fixed in newer Kafka releases?
>  # Should CoordinatorRuntime reset its internal state when encountering time 
> irregularities?
>  # Is there a recommended workaround to avoid Coordinator entering this 
> permanent error state?
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
