[jira] [Updated] (KAFKA-13388) Kafka Producer nodes stuck in CHECKING_API_VERSIONS

2021-12-10 Thread David Mao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mao updated KAFKA-13388:
--
Priority: Critical  (was: Minor)

> Kafka Producer nodes stuck in CHECKING_API_VERSIONS
> ---
>
> Key: KAFKA-13388
> URL: https://issues.apache.org/jira/browse/KAFKA-13388
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: David Hoffman
>Priority: Critical
> Attachments: Screen Shot 2021-10-25 at 10.28.48 AM.png, 
> image-2021-10-21-13-42-06-528.png
>
>
> I have been seeing expired batch errors in my app.
> {code:java}
> org.apache.kafka.common.errors.TimeoutException: Expiring 51 record(s) for 
> xxx-17:120002 ms has passed since batch creation
> {code}
> I would have expected a request timeout or connection timeout to have been 
> logged as well, but I could not find any other associated errors. 
> I added some instrumentation to my app and traced this down to broker 
> connections hanging in CHECKING_API_VERSIONS state. -It appears there is no 
> effective timeout for Kafka Producer broker connections in 
> CHECKING_API_VERSIONS state.-
> In the code I see that after the NetworkClient connects to a broker node, it 
> makes a request to check API versions; when it receives the response, it marks 
> the node as ready. -I am seeing that sometimes a reply is not received for 
> the check API versions request, and the connection just hangs in 
> CHECKING_API_VERSIONS state until it is disposed, I assume after the idle 
> connection timeout.-
> Update: not actually sure what causes the connection to get stuck in 
> CHECKING_API_VERSIONS.
> -I am guessing the connection setup timeout should still be in play for this, 
> but it is not.- 
>  -There is a connectingNodes set that is consulted when checking timeouts and 
> the node is removed- 
>  -when ClusterConnectionStates.checkingApiVersions(String id) is called to 
> transition the node into CHECKING_API_VERSIONS-
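
For readers following the timeouts mentioned in the report, here is a minimal sketch that collects the producer settings involved. The configuration keys are standard Kafka client settings; the class name, the helper methods, and the values are assumptions made for illustration, not taken from the reporter's application. The 120000 ms value is chosen only because it matches the roughly 120 s in the exception above.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: gathers the timeout-related
// producer settings that the report refers to. Values are illustrative.
public class ProducerTimeoutSketch {

    /** Returns producer properties containing only timeout-related settings. */
    public static Properties timeoutProperties() {
        Properties props = new Properties();
        // Total time allowed for a record after send(); batches older than
        // this are expired with a TimeoutException.
        props.setProperty("delivery.timeout.ms", "120000");
        // Time the client waits for a response to an in-flight request.
        props.setProperty("request.timeout.ms", "30000");
        // Time allowed for the socket connection to be established.
        props.setProperty("socket.connection.setup.timeout.ms", "10000");
        props.setProperty("socket.connection.setup.timeout.max.ms", "30000");
        // Idle connections are closed after this many milliseconds.
        props.setProperty("connections.max.idle.ms", "540000");
        return props;
    }

    /** True if a batch would expire before an idle connection is closed. */
    public static boolean batchExpiresBeforeIdleClose(Properties props) {
        long delivery = Long.parseLong(props.getProperty("delivery.timeout.ms"));
        long idle = Long.parseLong(props.getProperty("connections.max.idle.ms"));
        return delivery < idle;
    }

    public static void main(String[] args) {
        Properties props = timeoutProperties();
        props.forEach((key, value) -> System.out.println(key + " = " + value));
        System.out.println("batch expires before idle close: "
                + batchExpiresBeforeIdleClose(props));
    }
}
```

With values like these, a connection that is only cleaned up by the idle timeout would outlive the delivery timeout, which would be consistent with seeing batch expiry as the only logged error. That reading rests on the reporter's original (later struck-through) guess and is not a confirmed diagnosis.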



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13388) Kafka Producer nodes stuck in CHECKING_API_VERSIONS

2021-10-25 Thread David Hoffman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Hoffman updated KAFKA-13388:
--
Attachment: Screen Shot 2021-10-25 at 10.28.48 AM.png

> Kafka Producer nodes stuck in CHECKING_API_VERSIONS
> ---
>
> Key: KAFKA-13388
> URL: https://issues.apache.org/jira/browse/KAFKA-13388
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: David Hoffman
>Priority: Minor
> Attachments: Screen Shot 2021-10-25 at 10.28.48 AM.png, 
> image-2021-10-21-13-42-06-528.png





[jira] [Updated] (KAFKA-13388) Kafka Producer nodes stuck in CHECKING_API_VERSIONS

2021-10-22 Thread David Hoffman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Hoffman updated KAFKA-13388:
--
Priority: Minor  (was: Major)

> Kafka Producer nodes stuck in CHECKING_API_VERSIONS
> ---
>
> Key: KAFKA-13388
> URL: https://issues.apache.org/jira/browse/KAFKA-13388
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: David Hoffman
>Priority: Minor
> Attachments: image-2021-10-21-13-42-06-528.png





[jira] [Updated] (KAFKA-13388) Kafka Producer nodes stuck in CHECKING_API_VERSIONS

2021-10-22 Thread David Hoffman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Hoffman updated KAFKA-13388:
--
Description: 
I have been seeing expired batch errors in my app.
{code:java}
org.apache.kafka.common.errors.TimeoutException: Expiring 51 record(s) for 
xxx-17:120002 ms has passed since batch creation
{code}
I would have expected a request timeout or connection timeout to have been 
logged as well, but I could not find any other associated errors. 

I added some instrumentation to my app and traced this down to broker 
connections hanging in CHECKING_API_VERSIONS state. It appears there is no 
effective timeout for Kafka Producer broker connections in 
CHECKING_API_VERSIONS state.

In the code I see that after the NetworkClient connects to a broker node, it 
makes a request to check API versions; when it receives the response, it marks 
the node as ready. -I am seeing that sometimes a reply is not received for the 
check API versions request, and the connection just hangs in 
CHECKING_API_VERSIONS state until it is disposed, I assume after the idle 
connection timeout.-

Update: not actually sure what causes the connection to get stuck in 
CHECKING_API_VERSIONS.

-I am guessing the connection setup timeout should still be in play for this, 
but it is not.- 
 -There is a connectingNodes set that is consulted when checking timeouts and 
the node is removed- 
 -when ClusterConnectionStates.checkingApiVersions(String id) is called to 
transition the node into CHECKING_API_VERSIONS-

  was:
I have been seeing expired batch errors in my app.
{code:java}
org.apache.kafka.common.errors.TimeoutException: Expiring 51 record(s) for 
xxx-17:120002 ms has passed since batch creation
{code}
I would have expected a request timeout or connection timeout to have been 
logged as well, but I could not find any other associated errors. 

I added some instrumentation to my app and traced this down to broker 
connections hanging in CHECKING_API_VERSIONS state. It appears there is no 
effective timeout for Kafka Producer broker connections in 
CHECKING_API_VERSIONS state.

In the code I see that after the NetworkClient connects to a broker node, it 
makes a request to check API versions; when it receives the response, it marks 
the node as ready. I am seeing that sometimes a reply is not received for the 
check API versions request, and the connection just hangs in 
CHECKING_API_VERSIONS state until it is disposed, I assume after the idle 
connection timeout.

I am guessing the connection setup timeout should still be in play for this, 
but it is not. 
There is a connectingNodes set that is consulted when checking timeouts and the 
node is removed 
when ClusterConnectionStates.checkingApiVersions(String id) is called to 
transition the node into CHECKING_API_VERSIONS


> Kafka Producer nodes stuck in CHECKING_API_VERSIONS
> ---
>
> Key: KAFKA-13388
> URL: https://issues.apache.org/jira/browse/KAFKA-13388
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: David Hoffman
>Priority: Major
> Attachments: image-2021-10-21-13-42-06-528.png





[jira] [Updated] (KAFKA-13388) Kafka Producer nodes stuck in CHECKING_API_VERSIONS

2021-10-22 Thread David Hoffman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Hoffman updated KAFKA-13388:
--
Description: 
I have been seeing expired batch errors in my app.
{code:java}
org.apache.kafka.common.errors.TimeoutException: Expiring 51 record(s) for 
xxx-17:120002 ms has passed since batch creation
{code}
I would have expected a request timeout or connection timeout to have been 
logged as well, but I could not find any other associated errors. 

I added some instrumentation to my app and traced this down to broker 
connections hanging in CHECKING_API_VERSIONS state. -It appears there is no 
effective timeout for Kafka Producer broker connections in 
CHECKING_API_VERSIONS state.-

In the code I see that after the NetworkClient connects to a broker node, it 
makes a request to check API versions; when it receives the response, it marks 
the node as ready. -I am seeing that sometimes a reply is not received for the 
check API versions request, and the connection just hangs in 
CHECKING_API_VERSIONS state until it is disposed, I assume after the idle 
connection timeout.-

Update: not actually sure what causes the connection to get stuck in 
CHECKING_API_VERSIONS.

-I am guessing the connection setup timeout should still be in play for this, 
but it is not.- 
 -There is a connectingNodes set that is consulted when checking timeouts and 
the node is removed- 
 -when ClusterConnectionStates.checkingApiVersions(String id) is called to 
transition the node into CHECKING_API_VERSIONS-

  was:
I have been seeing expired batch errors in my app.
{code:java}
org.apache.kafka.common.errors.TimeoutException: Expiring 51 record(s) for 
xxx-17:120002 ms has passed since batch creation
{code}
I would have expected a request timeout or connection timeout to have been 
logged as well, but I could not find any other associated errors. 

I added some instrumentation to my app and traced this down to broker 
connections hanging in CHECKING_API_VERSIONS state. It appears there is no 
effective timeout for Kafka Producer broker connections in 
CHECKING_API_VERSIONS state.

In the code I see that after the NetworkClient connects to a broker node, it 
makes a request to check API versions; when it receives the response, it marks 
the node as ready. -I am seeing that sometimes a reply is not received for the 
check API versions request, and the connection just hangs in 
CHECKING_API_VERSIONS state until it is disposed, I assume after the idle 
connection timeout.-

Update: not actually sure what causes the connection to get stuck in 
CHECKING_API_VERSIONS.

-I am guessing the connection setup timeout should still be in play for this, 
but it is not.- 
 -There is a connectingNodes set that is consulted when checking timeouts and 
the node is removed- 
 -when ClusterConnectionStates.checkingApiVersions(String id) is called to 
transition the node into CHECKING_API_VERSIONS-


> Kafka Producer nodes stuck in CHECKING_API_VERSIONS
> ---
>
> Key: KAFKA-13388
> URL: https://issues.apache.org/jira/browse/KAFKA-13388
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: David Hoffman
>Priority: Major
> Attachments: image-2021-10-21-13-42-06-528.png





[jira] [Updated] (KAFKA-13388) Kafka Producer nodes stuck in CHECKING_API_VERSIONS

2021-10-22 Thread David Hoffman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Hoffman updated KAFKA-13388:
--
Summary: Kafka Producer nodes stuck in CHECKING_API_VERSIONS  (was: Kafka 
Producer has no timeout for nodes stuck in CHECKING_API_VERSIONS)

> Kafka Producer nodes stuck in CHECKING_API_VERSIONS
> ---
>
> Key: KAFKA-13388
> URL: https://issues.apache.org/jira/browse/KAFKA-13388
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: David Hoffman
>Priority: Major
> Attachments: image-2021-10-21-13-42-06-528.png


