[ 
https://issues.apache.org/jira/browse/KAFKA-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-6520:
-----------------------------------
    Description: 
KIP WIP: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-457%3A+Add+DISCONNECTED+status+to+Kafka+Streams]


When you execute the following scenario, the application always stays in the 
RUNNING state:
  
 1) Start Kafka.
 2) Start the app; it connects to Kafka and starts processing.
 3) Kill Kafka (stop the Docker container).
 4) The application doesn't give any indication that it's no longer 
connected (the Stream State is still RUNNING, and the uncaught exception 
handler isn't invoked).
  
  
 It would be useful if the Stream State had a DISCONNECTED status.
  
 See 
[this|https://groups.google.com/forum/#!topic/confluent-platform/nQh2ohgdrIQ] 
for a discussion from the Google user forum.  This is a link to a related issue.

-------------------------

Update: there are some discussions on the PR itself that lead me to think 
that a more general solution should be at the ClusterConnectionStates level 
rather than at the Streams or even Consumer level. One proposal would be:
 * Add a new metric named `failedConnection` in SelectorMetrics, recorded in 
the `connect()` and `pollSelectionKeys()` functions upon catching the 
IOException / RuntimeException that indicates the connection was disconnected.
 * Users of Consumer / Streams can then monitor this metric, which normally 
stays close to zero because disconnects are transient; if it is spiking, the 
brokers are consistently unavailable, indicating the disconnected state.
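
As a rough illustration of how a user might act on such a metric, here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration: the class `FailedConnectionMonitor`, its threshold, and its window size are hypothetical, and the `failedConnection` metric itself is only proposed above. The idea is to ignore isolated transient disconnects and report a problem only when several consecutive samples are all above a threshold.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper (not part of Kafka): flags a sustained spike in a
 * sampled failed-connection rate, ignoring isolated transient disconnects.
 */
final class FailedConnectionMonitor {
    private final double threshold;  // a sample above this value counts as "failing"
    private final int windowSize;    // number of consecutive failing samples required
    private final Deque<Double> window = new ArrayDeque<>();

    FailedConnectionMonitor(double threshold, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.threshold = threshold;
        this.windowSize = windowSize;
    }

    /** Records one sample and returns true if the brokers look unavailable. */
    boolean record(double failedConnectionRate) {
        window.addLast(failedConnectionRate);
        if (window.size() > windowSize) {
            window.removeFirst();  // keep only the most recent windowSize samples
        }
        return isDisconnected();
    }

    /** True only when the window is full and every sample exceeds the threshold. */
    boolean isDisconnected() {
        if (window.size() < windowSize) {
            return false;
        }
        for (double sample : window) {
            if (sample <= threshold) {
                return false;
            }
        }
        return true;
    }
}
```

In an application, the samples would presumably be read periodically from the client's metrics map (for example via `KafkaConsumer#metrics()` or `KafkaStreams#metrics()`) once a metric like the one proposed here exists; the sampling interval and threshold would need tuning per deployment.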

[~Yohan123] WDYT?

  was:
When you execute the following scenario, the application always stays in the 
RUNNING state:
  
 1) Start Kafka.
 2) Start the app; it connects to Kafka and starts processing.
 3) Kill Kafka (stop the Docker container).
 4) The application doesn't give any indication that it's no longer 
connected (the Stream State is still RUNNING, and the uncaught exception 
handler isn't invoked).
  
  
 It would be useful if the Stream State had a DISCONNECTED status.
  
See 
[this|https://groups.google.com/forum/#!topic/confluent-platform/nQh2ohgdrIQ] 
for a discussion from the Google user forum.  
[This|https://issues.apache.org/jira/browse/KAFKA-4564] is a link to a related 
issue.

-------------------------

Update: there are some discussions on the PR itself that lead me to think 
that a more general solution should be at the ClusterConnectionStates level 
rather than at the Streams or even Consumer level. One proposal would be:

* Add a new metric named `failedConnection` in SelectorMetrics, recorded in 
the `connect()` and `pollSelectionKeys()` functions upon catching the 
IOException / RuntimeException that indicates the connection was disconnected.
* Users of Consumer / Streams can then monitor this metric, which normally 
stays close to zero because disconnects are transient; if it is spiking, the 
brokers are consistently unavailable, indicating the disconnected state.

[~Yohan123] WDYT?



> When a Kafka Stream can't communicate with the server, its Status stays 
> RUNNING
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-6520
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6520
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Michael Kohout
>            Priority: Major
>              Labels: newbie, user-experience
>
> KIP WIP: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-457%3A+Add+DISCONNECTED+status+to+Kafka+Streams]
> When you execute the following scenario, the application always stays in the 
> RUNNING state:
>   
>  1) Start Kafka.
>  2) Start the app; it connects to Kafka and starts processing.
>  3) Kill Kafka (stop the Docker container).
>  4) The application doesn't give any indication that it's no longer 
> connected (the Stream State is still RUNNING, and the uncaught exception 
> handler isn't invoked).
>   
>   
>  It would be useful if the Stream State had a DISCONNECTED status.
>   
>  See 
> [this|https://groups.google.com/forum/#!topic/confluent-platform/nQh2ohgdrIQ] 
> for a discussion from the Google user forum.  This is a link to a related 
> issue.
> -------------------------
> Update: there are some discussions on the PR itself that lead me to think 
> that a more general solution should be at the ClusterConnectionStates level 
> rather than at the Streams or even Consumer level. One proposal would be:
>  * Add a new metric named `failedConnection` in SelectorMetrics, recorded in 
> the `connect()` and `pollSelectionKeys()` functions upon catching the 
> IOException / RuntimeException that indicates the connection was disconnected.
>  * Users of Consumer / Streams can then monitor this metric, which normally 
> stays close to zero because disconnects are transient; if it is spiking, the 
> brokers are consistently unavailable, indicating the disconnected state.
> [~Yohan123] WDYT?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
