[ 
https://issues.apache.org/jira/browse/KAFKA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783114#comment-17783114
 ] 

Matthias J. Sax edited comment on KAFKA-15792 at 11/6/23 5:04 PM:
------------------------------------------------------------------

Could this be related to 
[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390]? 
However, we can't see any metrics that would confirm this.


was (Author: JIRAUSER300456):
Can it be related to 
[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390?]
 However we can't see any metrics that can prove this.

> Kafka Streams stuck partition fixed after restarting the process
> ----------------------------------------------------------------
>
>                 Key: KAFKA-15792
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15792
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>    Affects Versions: 3.1.2
>            Reporter: Patrick Pang
>            Priority: Major
>
> Our Kafka Streams process is often slow in processing a particular 
> partition on a specific instance. No data skew is detected, i.e. partitions 
> are mostly uniformly distributed. The symptom is a huge lag on a specific 
> partition. We also notice that network out is higher on the problematic 
> process than on a normal process, often about 3x.
> After restarting the process, the lag drains within 5 minutes of startup. 
> This hints at an internal processing issue in our Streams application rather 
> than a cluster problem or a poison message. 
> Are there any metrics you suggest we look at, or is this a known issue? 
> Regularly bouncing the application doesn't look like a proper fix for 
> production systems.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
