[ 
https://issues.apache.org/jira/browse/KAFKA-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582972#comment-17582972
 ] 

Sagar Rao commented on KAFKA-14000:
-----------------------------------

I took a look at this today. According to the trace logs, the messages all 
belong to the same task, so all of them would be sent to the same partition. 
However, the log entries are out of order: older generations are read after 
newer ones. I also see that the tasks were moved from one worker to another. 
So, were there task failures or rebalances happening? The connector status is 
pushed to the status topic asynchronously, and deletions are not handled as a 
safe put. 

I think if tasks are moved around, they would be assigned to different 
producers, and the ordering guarantees, even within a single partition, could 
be broken. One approach could be to sort by generation id for the same task. 
The problem is that delete messages carry no generation, because their Kafka 
record values are null. So we won't know at what point the delete happened. 
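
To illustrate the tombstone problem, here is a minimal sketch in plain Java 
with no Kafka dependency. StatusRecord and resolve are hypothetical names made 
up for this example, not Kafka Connect code, and a null generation is my 
stand-in for a delete message. It assumes we keep the highest generation per 
key and apply deletes in arrival order, since they have nothing to compare.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StatusByGeneration {

    // Hypothetical stand-in for a record read from the status topic.
    // A null generation models a delete (tombstone) whose value is null.
    public static final class StatusRecord {
        public final String key;
        public final String state;
        public final Integer generation;

        public StatusRecord(String key, String state, Integer generation) {
            this.key = key;
            this.state = state;
            this.generation = generation;
        }

        public boolean isTombstone() {
            return generation == null;
        }
    }

    // Keeps, per key, the status with the highest generation seen so far.
    // Tombstones are applied in arrival order because they carry no generation.
    public static Map<String, StatusRecord> resolve(List<StatusRecord> records) {
        Map<String, StatusRecord> latest = new HashMap<>();
        for (StatusRecord r : records) {
            if (r.isTombstone()) {
                latest.remove(r.key);
                continue;
            }
            StatusRecord current = latest.get(r.key);
            if (current == null || r.generation >= current.generation) {
                latest.put(r.key, r);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        StatusRecord running = new StatusRecord("status-task-name1", "RUNNING", 100);
        StatusRecord delete = new StatusRecord("status-task-name1", null, null);

        // The same two messages in both arrival orders.
        System.out.println(resolve(Arrays.asList(running, delete)).keySet());
        System.out.println(resolve(Arrays.asList(delete, running)).keySet());
    }
}
```

Comparing generations is enough to discard a stale status that arrives after a 
newer one, but the two calls in main give different results for the same pair 
of messages, which is the ambiguity with deletes described above.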

 

 

> Kafka-connect standby server shows empty tasks list
> ---------------------------------------------------
>
>                 Key: KAFKA-14000
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14000
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.6.0
>            Reporter: Xinyu Zou
>            Assignee: Sagar Rao
>            Priority: Major
>         Attachments: kafka-connect-trace.log
>
>
> I'm using Kafka Connect in distributed mode with two servers: one active 
> and one standby. The standby server sometimes shows an empty task list in 
> the status REST API response.
> curl host:8443/connectors/name1/status
> {code:java}
> {
>     "connector": {
>         "state": "RUNNING",
>         "worker_id": "1.2.3.4:10443"
>     },
>     "name": "name1",
>     "tasks": [],
>     "type": "source"
> } {code}
> I enabled TRACE logging and checked. As required, the connect-status topic 
> is set to cleanup.policy=compact. But messages in the topic are not 
> compacted immediately; compaction runs at intervals. So there is usually 
> more than one message with the same key. E.g. when Kafka Connect is 
> launched, no connector is running. Then we start a new connector, and there 
> will be two messages in the connect-status topic:
> status-task-name1 : state=RUNNING, workerId='10.251.170.166:10443', 
> generation=100
> status-task-name1 : _<empty>_
> Please check the log file [^kafka-connect-trace.log]. We can see that the 
> task status ended up removed, even though the empty status was not the 
> newest message in the connect-status topic.
>  
> When reading statuses from the connect-status topic, Kafka Connect doesn't 
> sort messages by generation.
> [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java]
> So I think this could be improved. We can either sort the messages after 
> poll or compare the generation values before choosing the correct status 
> message.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
