[
https://issues.apache.org/jira/browse/KAFKA-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joao updated KAFKA-5318:
------------------------
Description:
I rely on "org.apache.kafka.streams.KafkaStreams#state" to know whether my
current stream instance is running properly. If it becomes unhealthy, my
provisioning system (Kubernetes) automatically restarts/replaces the instance.
One such instance encountered bug
https://issues.apache.org/jira/browse/KAFKA-5167.
The issue is that for the whole time my instance was affected by the linked
bug, the stream state reported it as healthy when in fact it was not. My
instance did not recover automatically from the LockException, and I only
noticed something was wrong because I monitor the stream delay, which rose to
abnormal values.
This ultimately means that the Kafka Streams state does not reliably describe
whether an instance is actually running as intended.
From what I was told, some improvements are in the works, such as
https://cwiki.apache.org/confluence/display/KAFKA/KIP+130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API
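To illustrate the kind of check that would have caught this, here is a minimal
sketch in Java that treats an instance as healthy only if the reported state is
running AND the stream delay is below a threshold. The class
StreamHealthCheck, its two suppliers, and the threshold are hypothetical names
used for illustration and are not part of Kafka. In a real application the
first supplier would presumably wrap KafkaStreams#state, and the second would
come from whatever delay metric is already being monitored.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/**
 * Hypothetical liveness check (not a Kafka API): combines the reported
 * stream state with an independently measured delay.
 */
final class StreamHealthCheck {
    // Assumption: backed by something like () -> streams.state().isRunning()
    private final BooleanSupplier stateIsRunning;
    // Assumption: backed by the application's own stream-delay metric, in ms
    private final LongSupplier delayMillis;
    // Largest delay still considered healthy, in ms
    private final long maxDelayMillis;

    StreamHealthCheck(BooleanSupplier stateIsRunning,
                      LongSupplier delayMillis,
                      long maxDelayMillis) {
        this.stateIsRunning = stateIsRunning;
        this.delayMillis = delayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Healthy only if the state says "running" and the delay is acceptable. */
    boolean isHealthy() {
        return stateIsRunning.getAsBoolean()
                && delayMillis.getAsLong() <= maxDelayMillis;
    }
}
```

With a check like this, an instance whose state still reports running but
whose delay keeps growing (as happened here) would be flagged as unhealthy and
replaced by the provisioning system.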
> Streams state may be misleading
> -------------------------------
>
> Key: KAFKA-5318
> URL: https://issues.apache.org/jira/browse/KAFKA-5318
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.1
> Reporter: Joao