[GitHub] [kafka] KarboniteKream commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

2023-05-24 Thread via GitHub


KarboniteKream commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1562213887

   Thank you, I'll check today!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] KarboniteKream commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

2023-05-24 Thread via GitHub


KarboniteKream commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1560737054

   Correct, the issue happens after shutting down and restarting the leader.
   
   Here's the example configuration:
   - [`1.properties`](https://gist.github.com/KarboniteKream/6ec0378f9b057f9e15467177e58539c6)
   - [`2.properties`](https://gist.github.com/KarboniteKream/4bd198db1db77e7b43c6bb37cecd9b68)
   
   Reproduction:
   1. Initialize the storage:
   - `./bin/kafka-storage.sh format --config 1.properties --cluster-id 9N8QxiJfRoe1rPZ1vpjd2w`
   - `./bin/kafka-storage.sh format --config 2.properties --cluster-id 9N8QxiJfRoe1rPZ1vpjd2w`
   2. Start both controllers:
   - `./bin/kafka-server-start.sh 1.properties`
   - `./bin/kafka-server-start.sh 2.properties`
   3. Wait for the quorum to be established, then stop the leader with Ctrl-C.
   4. Start the stopped controller again.
   
   After a few moments, the restarted controller will start logging the following:
   ```
   [2023-05-24 18:00:02,980] WARN [RaftManager id=3001] Received error UNKNOWN_SERVER_ERROR from node 3002 when making an ApiVersionsRequest with correlation id 4. Disconnecting. (org.apache.kafka.clients.NetworkClient)
   [2023-05-24 18:00:03,030] INFO [MetadataLoader id=3001] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
   ```
   
   On the other controller (new leader), you'll see the following logs in `logs/controller.log`:
   ```
   [2023-05-24 18:00:02,898] WARN [QuorumController id=3002] getFinalizedFeatures: failed with unknown server exception RuntimeException in 271 us.  The controller is already in standby mode. (org.apache.kafka.controller.QuorumController)
   java.lang.RuntimeException: No in-memory snapshot for epoch 9. Snapshot epochs are: 
       at org.apache.kafka.timeline.SnapshotRegistry.getSnapshot(SnapshotRegistry.java:173)
       at org.apache.kafka.timeline.SnapshotRegistry.iterator(SnapshotRegistry.java:131)
       at org.apache.kafka.timeline.TimelineObject.get(TimelineObject.java:69)
       at org.apache.kafka.controller.FeatureControlManager.finalizedFeatures(FeatureControlManager.java:303)
       at org.apache.kafka.controller.QuorumController.lambda$finalizedFeatures$16(QuorumController.java:2016)
       at org.apache.kafka.controller.QuorumController$ControllerReadEvent.run(QuorumController.java:546)
       at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127)
       at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
       at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
       at java.base/java.lang.Thread.run(Thread.java:829)
   ```
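
The trace shows a read being served at a specific epoch for which the registry holds no snapshot. As a rough illustration of that failure mode only, here is a toy model. It is not Kafka's `SnapshotRegistry`: `ToySnapshotRegistry`, `createSnapshot`, `reset`, and `lookup` are hypothetical names, and the only thing copied from the logs is the wording of the exception message. It makes no claim about why the epoch is missing in the real controller.

```java
import java.util.TreeMap;
import java.util.stream.Collectors;

public class SnapshotLookupSketch {

    /** Hypothetical stand-in for an epoch-keyed snapshot registry (not Kafka code). */
    static final class ToySnapshotRegistry {
        // Snapshots ordered by epoch; the value is just a label in this sketch.
        private final TreeMap<Long, String> snapshots = new TreeMap<>();

        void createSnapshot(long epoch) {
            snapshots.put(epoch, "snapshot@" + epoch);
        }

        /** Drops every in-memory snapshot. */
        void reset() {
            snapshots.clear();
        }

        /** Returns the snapshot for exactly this epoch, or throws if there is none. */
        String getSnapshot(long epoch) {
            String snapshot = snapshots.get(epoch);
            if (snapshot == null) {
                String known = snapshots.keySet().stream()
                        .map(String::valueOf)
                        .collect(Collectors.joining(", "));
                throw new RuntimeException("No in-memory snapshot for epoch " + epoch
                        + ". Snapshot epochs are: " + known);
            }
            return snapshot;
        }
    }

    /** Looks up a snapshot and returns the exception message instead of propagating it. */
    static String lookup(ToySnapshotRegistry registry, long epoch) {
        try {
            return registry.getSnapshot(epoch);
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        ToySnapshotRegistry registry = new ToySnapshotRegistry();
        registry.createSnapshot(61900L);

        // An epoch that has a snapshot resolves normally.
        System.out.println(lookup(registry, 61900L));
        // An epoch without a snapshot fails and lists the epochs that do exist.
        System.out.println(lookup(registry, 84310L));

        // With no snapshots at all, the list of known epochs is empty.
        registry.reset();
        System.out.println(lookup(registry, 9L));
    }
}
```

In this model, a lookup for an epoch that was never snapshotted (or whose snapshot was dropped) is the only way to get the message, which is why the list of known epochs printed after "Snapshot epochs are:" is useful when reading the real logs.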





[GitHub] [kafka] KarboniteKream commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

2023-05-23 Thread via GitHub


KarboniteKream commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1559681643

   This PR seems to have introduced a regression (confirmed using bisect). In a simple setup of two controllers using `config/kraft/controller.properties`, after the leader is shut down and restarted, `ApiVersionsRequest` fails with `UNKNOWN_SERVER_ERROR`.
   
   The other controller sees the following exception:
   ```
   [2023-05-19 15:50:18,834] WARN [QuorumController id=0] getFinalizedFeatures: failed with unknown server exception RuntimeException in 28 us.  The controller is already in standby mode. (org.apache.kafka.controller.QuorumController)
   java.lang.RuntimeException: No in-memory snapshot for epoch 84310. Snapshot epochs are: 61900
       at org.apache.kafka.timeline.SnapshotRegistry.getSnapshot(SnapshotRegistry.java:173)
       at org.apache.kafka.timeline.SnapshotRegistry.iterator(SnapshotRegistry.java:131)
       at org.apache.kafka.timeline.TimelineObject.get(TimelineObject.java:69)
       at org.apache.kafka.controller.FeatureControlManager.finalizedFeatures(FeatureControlManager.java:303)
       at org.apache.kafka.controller.QuorumController.lambda$finalizedFeatures$16(QuorumController.java:2016)
       at org.apache.kafka.controller.QuorumController$ControllerReadEvent.run(QuorumController.java:546)
       at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127)
       at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
       at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
       at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   
   A similar issue was reported in [KAFKA-14996](https://issues.apache.org/jira/browse/KAFKA-14996). I'll report this in Jira once my account is approved.

