[ https://issues.apache.org/jira/browse/KAFKA-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17585291#comment-17585291 ]
Sagar Rao commented on KAFKA-14000:
-----------------------------------

I think one approach to reduce the above error could be to have the status topic use `LogAppendTime` instead of the default `CreateTime`. This way we can use the timestamps and generations to sort the messages before processing them. Since tombstone records don't have a generation id, we will have to rely on the timestamp for them when sorting. Using `LogAppendTime` allows the messages to be in the right order in the same partition even if they are written by different producers. There could still be edge cases, but it provides stronger guarantees than the current approach.

Also note that with `LogAppendTime` two messages can have the same timestamp. This should be fine as long as they aren't tombstone records, since we can then sort using the generation id. Even with tombstone records, the chance of two different tombstone records for the same key at the same timestamp should be remote.

> Kafka-connect standby server shows empty tasks list
> ---------------------------------------------------
>
>                 Key: KAFKA-14000
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14000
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.6.0
>            Reporter: Xinyu Zou
>            Assignee: Sagar Rao
>            Priority: Major
>         Attachments: kafka-connect-trace.log
>
>
> I'm using Kafka-connect in distributed mode. There are two servers, one
> active and one standby. The standby server sometimes shows an empty tasks
> list in the status REST API response.
> curl host:8443/connectors/name1/status
> {code:java}
> {
>   "connector": {
>     "state": "RUNNING",
>     "worker_id": "1.2.3.4:10443"
>   },
>   "name": "name1",
>   "tasks": [],
>   "type": "source"
> }
> {code}
> I enabled TRACE logging and checked. As required, the connect-status topic
> is set to cleanup.policy=compact. But messages in the topic are not
> compacted immediately; they are compacted at a specific interval. So
> usually there is more than one message with the same key. E.g. when
> kafka-connect is launched there's no connector running, and then we start a
> new connector. Then there will be two messages in the connect-status topic:
> status-task-name1 : state=RUNNING, workerId='10.251.170.166:10443', generation=100
> status-task-name1 : _<empty>_
> Please check the log file [^kafka-connect-trace.log]. We can see that the
> task status was eventually removed. But the empty status was actually not
> the newest message in the connect-status topic.
>
> When reading statuses from the connect-status topic, it doesn't sort
> messages by generation.
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java]
> So I think this could be improved. We can either sort the messages after
> poll or compare the generation value before we choose the correct status
> message.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
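
For illustration only, here is a rough sketch of one way to read the ordering proposed in the comment: sort by timestamp first, and use the generation id only to break ties between records with the same timestamp. `StatusOrderingSketch`, `StatusRecord`, `compareOldestFirst` and `latest` are hypothetical names invented for this sketch and are not Kafka Connect classes. Placing a tombstone before a non-tombstone on a timestamp tie is an arbitrary assumption, since the comment does not say how that case should be resolved.

```java
import java.util.ArrayList;
import java.util.List;

public class StatusOrderingSketch {

    /** Hypothetical stand-in for one record read from the status topic. */
    public static final class StatusRecord {
        public final String key;
        public final long timestamp;      // assumed to be broker-assigned (LogAppendTime)
        public final Integer generation;  // null for a tombstone
        public final String state;        // null for a tombstone

        public StatusRecord(String key, long timestamp, Integer generation, String state) {
            this.key = key;
            this.timestamp = timestamp;
            this.generation = generation;
            this.state = state;
        }

        public boolean isTombstone() {
            return generation == null;
        }
    }

    /** Orders records oldest first: by timestamp, then by generation on a tie. */
    public static int compareOldestFirst(StatusRecord a, StatusRecord b) {
        int byTime = Long.compare(a.timestamp, b.timestamp);
        if (byTime != 0) {
            return byTime;
        }
        if (a.isTombstone() || b.isTombstone()) {
            // Assumption: on a timestamp tie, a tombstone sorts before a
            // non-tombstone; two tombstones compare as equal.
            return Boolean.compare(b.isTombstone(), a.isTombstone());
        }
        return Integer.compare(a.generation, b.generation);
    }

    /** Returns the newest record for a single key, or null if there are none. */
    public static StatusRecord latest(List<StatusRecord> recordsForOneKey) {
        if (recordsForOneKey.isEmpty()) {
            return null;
        }
        List<StatusRecord> sorted = new ArrayList<>(recordsForOneKey);
        sorted.sort(StatusOrderingSketch::compareOldestFirst);
        return sorted.get(sorted.size() - 1);
    }
}
```

With this ordering, the scenario from the description (an empty status that is processed last but is not actually the newest message) resolves to the RUNNING record, provided the broker-assigned timestamps reflect the true write order.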