[ https://issues.apache.org/jira/browse/KAFKA-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903046#comment-14903046 ]
Joel Koshy edited comment on KAFKA-1911 at 9/22/15 5:34 PM:
------------------------------------------------------------

The original motivation in this ticket was to avoid a high-latency request tying up request handlers. However, while thinking through some nuances of delete topic, I think delete topic would also benefit from this. Since stop-replica-requests can take a while to finish, delete topic can also take a while (apart from failure cases such as a replica being down).

I think the easiest way to fix this would be to just rename the partition directory from <topic>-<partId> to something like <topic>-<partId>-deleted-<seqNo> and asynchronously delete that. The <seqNo> is probably needed if a user were to delete and recreate multiple times in rapid succession for whatever reason.


> Log deletion on stopping replicas should be async
> -------------------------------------------------
>
>                 Key: KAFKA-1911
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1911
>             Project: Kafka
>          Issue Type: Bug
>          Components: log, replication
>            Reporter: Joel Koshy
>            Assignee: Geoff Anderson
>              Labels: newbie++
>
> If a StopReplicaRequest sets delete=true then we do a file.delete on the file
> message sets. I was under the impression that this is fast but it does not
> seem to be the case.
> On a partition reassignment in our cluster the local time for stop replica
> took nearly 30 seconds.
> {noformat}
> Completed request:Name: StopReplicaRequest; Version: 0; CorrelationId: 467;
> ClientId: ; DeletePartitions: true; ControllerId: 1212; ControllerEpoch:
> 53 from
> client/...:45964;totalTime:29191,requestQueueTime:1,localTime:29190,remoteTime:0,responseQueueTime:0,sendTime:0
> {noformat}
> This ties up one API thread for the duration of the request.
> Specifically in our case, the queue times for other requests also went up and
> producers to the partition that was just deleted on the old leader took a
> while to refresh their metadata (see KAFKA-1303) and eventually ran out of
> retries on some messages leading to data loss.
> I think the log deletion in this case should be fully asynchronous although
> we need to handle the case when a broker may respond immediately to the
> stop-replica-request but then go down after deleting only some of the log
> segments.
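
To make the rename-then-delete idea from the comment concrete, here is a rough sketch in Python (not Kafka code, and not a proposed patch). It assumes partition directories are named <topic>-<partId> under a single log root and uses the <topic>-<partId>-deleted-<seqNo> layout suggested above. The function names (next_seq_no, mark_for_deletion, delete_async, sweep_pending_deletes) are made up for illustration, and the sketch assumes only one thread at a time marks directories for deletion. The startup sweep is one possible way to handle the case where a broker goes down after deleting only some of the segments.

```python
import os
import re
import shutil
import threading

# Hypothetical naming scheme: <topic>-<partId>-deleted-<seqNo>
_DELETED_RE = re.compile(r"^(?P<topic>.+)-(?P<part>\d+)-deleted-(?P<seq>\d+)$")


def next_seq_no(log_root, topic, part_id):
    """Return a seqNo one higher than any pending deletion of this partition."""
    used = set()
    for name in os.listdir(log_root):
        m = _DELETED_RE.match(name)
        if m and m.group("topic") == topic and int(m.group("part")) == part_id:
            used.add(int(m.group("seq")))
    return max(used, default=-1) + 1


def mark_for_deletion(log_root, topic, part_id):
    """Rename the partition directory; the only work done on the request thread."""
    src = os.path.join(log_root, f"{topic}-{part_id}")
    seq = next_seq_no(log_root, topic, part_id)
    dst = os.path.join(log_root, f"{topic}-{part_id}-deleted-{seq}")
    os.rename(src, dst)
    return dst


def delete_async(path):
    """Remove an already-renamed directory on a background thread."""
    t = threading.Thread(
        target=shutil.rmtree,
        args=(path,),
        kwargs={"ignore_errors": True},
        daemon=True,
    )
    t.start()
    return t


def sweep_pending_deletes(log_root):
    """On startup, resume deletions that a crash left half finished."""
    threads = []
    for name in sorted(os.listdir(log_root)):
        if _DELETED_RE.match(name):
            threads.append(delete_async(os.path.join(log_root, name)))
    return threads
```

In this sketch the request handler would call mark_for_deletion, hand the returned path to delete_async, and respond right away. A directory that still carries the -deleted- marker after a restart is picked up by sweep_pending_deletes. One caveat: a topic whose own name ends in something like -0-deleted would be ambiguous under this naming, so a real implementation would have to rule that out.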