[ 
https://issues.apache.org/jira/browse/KAFKA-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joao updated KAFKA-5318:
------------------------
    Description: 
I rely on "org.apache.kafka.streams.KafkaStreams#state" to know if my current 
stream instance is properly running. If it becomes unhealthy my provisioning 
system (Kubernetes) automatically restarts/replaces the instance.

One such instance encountered the bug tracked at 
https://issues.apache.org/jira/browse/KAFKA-5167.

The issue is that for the whole time my instance was affected by the linked 
bug, the stream state was reported as healthy when in fact it was not. My 
instance did not recover automatically from the LockException, and I only 
noticed something was wrong because I monitor the stream delay, which rose to 
abnormal values.

This ultimately means that the Kafka Streams state is unreliable for 
determining whether an instance is actually running as intended.
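
A minimal sketch of a stricter health check, assuming the application can 
already measure its own stream delay: report healthy only when the state looks 
fine AND the delay is within a threshold. Everything here is hypothetical 
(StreamsHealthCheck, ReportedState, maxDelayMillis); ReportedState is a 
simplified stand-in for the value returned by KafkaStreams#state, so the 
sketch compiles without the Kafka libraries.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical liveness check; none of these names come from Kafka itself. */
final class StreamsHealthCheck {

    /** Simplified stand-in for the value reported by KafkaStreams#state. */
    enum ReportedState { RUNNING, REBALANCING, NOT_RUNNING }

    private final Supplier<ReportedState> state;   // e.g. wraps the real state call
    private final LongSupplier delayMillis;        // application-measured stream delay
    private final long maxDelayMillis;             // assumed threshold, tune per workload

    StreamsHealthCheck(Supplier<ReportedState> state,
                       LongSupplier delayMillis,
                       long maxDelayMillis) {
        this.state = state;
        this.delayMillis = delayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Healthy only if the state is RUNNING and the delay is within bounds. */
    boolean isHealthy() {
        boolean stateOk = state.get() == ReportedState.RUNNING;
        boolean delayOk = delayMillis.getAsLong() <= maxDelayMillis;
        return stateOk && delayOk;
    }
}
```

With a check like this, an instance stuck as described above would fail the 
delay condition even while the reported state still looks healthy, so the 
provisioning system could restart it. Whether REBALANCING should also count as 
healthy is a design choice left open here.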

I was told that some improvements are in the works, such as KIP-130: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP+130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API

  was:
I rely on "org.apache.kafka.streams.KafkaStreams#state" to know if my current 
stream instance is properly running. If it becomes unhealthy my provisioning 
system (Kubernetes) automatically restarts/replaces the instance.

One of such instances encountered bug 
https://issues.apache.org/jira/browse/KAFKA-5167.

The issue is that during the whole time my instance was affected by the linked 
bug, the stream state was considered healthy when in fact it was not. My 
instance did not recover automatically from the LockException and I happened to 
notice something was wrong because I monitor the stream delay, which went into 
abnormal values.

This ultimately means that the kafka stream state is unreliable at describing 
if an instance is actually running as intended.

There are some improvements in the works from what I was told, such as 
https://cwiki.apache.org/confluence/display/KAFKA/KIP+130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API


> Streams state may be misleading
> -------------------------------
>
>                 Key: KAFKA-5318
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5318
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.1
>            Reporter: Joao
>
> I rely on "org.apache.kafka.streams.KafkaStreams#state" to know if my current 
> stream instance is properly running. If it becomes unhealthy my provisioning 
> system (Kubernetes) automatically restarts/replaces the instance.
> One such instance encountered the bug tracked at 
> https://issues.apache.org/jira/browse/KAFKA-5167.
> The issue is that for the whole time my instance was affected by the 
> linked bug, the stream state was reported as healthy when in fact it was not. 
> My instance did not recover automatically from the LockException, and I 
> only noticed something was wrong because I monitor the stream delay, 
> which rose to abnormal values.
> This ultimately means that the Kafka Streams state is unreliable for 
> determining whether an instance is actually running as intended.
> I was told that some improvements are in the works, such as KIP-130: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP+130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
