[jira] [Commented] (KAFKA-7322) race between compaction thread and retention thread when changing topic cleanup policy
[ https://issues.apache.org/jira/browse/KAFKA-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596927#comment-16596927 ] ASF GitHub Bot commented on KAFKA-7322: --- xiowu0 opened a new pull request #5591: KAFKA-7322: fix race between log cleaner thread and log retention thread URL: https://github.com/apache/kafka/pull/5591 This is to address issue described in KAFKA-7322. To fix race condition between log retention and log cleaner when dynamically switching topic cleanup policy, existing log cleaner inprogress map is used to prevent more than one thread working on the same topic partition. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > race between compaction thread and retention thread when changing topic > cleanup policy > -- > > Key: KAFKA-7322 > URL: https://issues.apache.org/jira/browse/KAFKA-7322 > Project: Kafka > Issue Type: Bug > Components: log >Reporter: xiongqi wu >Assignee: xiongqi wu >Priority: Major > > The deletion thread will grab the log.lock when it tries to rename log > segment and schedule for actual deletion. > The compaction thread only grabs the log.lock when it tries to replace the > original segments with the cleaned segment. The compaction thread doesn't > grab the log when it reads records from the original segments to build > offsetmap and new segments. As a result, if both deletion and compaction > threads work on the same log partition. We have a race condition. > This race happens when the topic cleanup policy is updated on the fly. > One case to hit this race condition: > 1: topic clean up policy is "compact" initially > 2: log cleaner (compaction) thread picks up the partition for compaction and > still in progress > 3: the topic clean up policy has been updated to "deletion" > 4: retention thread pick up the topic partition and delete some old segments. > 5: log cleaner thread reads from the deleted log and raise an IO exception. > > The proposed solution is to use "inprogress" map that cleaner manager has to > protect such a race. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7322) race between compaction thread and retention thread when changing topic cleanup policy
[ https://issues.apache.org/jira/browse/KAFKA-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589213#comment-16589213 ] Dhruvil Shah commented on KAFKA-7322: - I think that makes sense, thanks for the confirmation. > race between compaction thread and retention thread when changing topic > cleanup policy > -- > > Key: KAFKA-7322 > URL: https://issues.apache.org/jira/browse/KAFKA-7322 > Project: Kafka > Issue Type: Bug > Components: log >Reporter: xiongqi wu >Assignee: xiongqi wu >Priority: Major > > The deletion thread will grab the log.lock when it tries to rename log > segment and schedule for actual deletion. > The compaction thread only grabs the log.lock when it tries to replace the > original segments with the cleaned segment. The compaction thread doesn't > grab the log when it reads records from the original segments to build > offsetmap and new segments. As a result, if both deletion and compaction > threads work on the same log partition. We have a race condition. > This race happens when the topic cleanup policy is updated on the fly. > One case to hit this race condition: > 1: topic clean up policy is "compact" initially > 2: log cleaner (compaction) thread picks up the partition for compaction and > still in progress > 3: the topic clean up policy has been updated to "deletion" > 4: retention thread pick up the topic partition and delete some old segments. > 5: log cleaner thread reads from the deleted log and raise an IO exception. > > The proposed solution is to use "inprogress" map that cleaner manager has to > protect such a race. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7322) race between compaction thread and retention thread when changing topic cleanup policy
[ https://issues.apache.org/jira/browse/KAFKA-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589199#comment-16589199 ] xiongqi wu commented on KAFKA-7322: --- [~dhruvilshah] I believe this race condition is different from the race in KAFKA-7278 . This race is caused by compaction reading from the logsement while retention thread delete the log segment. Ideally, we should not let compaction and retention threads work on the same partition. > race between compaction thread and retention thread when changing topic > cleanup policy > -- > > Key: KAFKA-7322 > URL: https://issues.apache.org/jira/browse/KAFKA-7322 > Project: Kafka > Issue Type: Bug > Components: log >Reporter: xiongqi wu >Assignee: xiongqi wu >Priority: Major > > The deletion thread will grab the log.lock when it tries to rename log > segment and schedule for actual deletion. > The compaction thread only grabs the log.lock when it tries to replace the > original segments with the cleaned segment. The compaction thread doesn't > grab the log when it reads records from the original segments to build > offsetmap and new segments. As a result, if both deletion and compaction > threads work on the same log partition. We have a race condition. > This race happens when the topic cleanup policy is updated on the fly. > One case to hit this race condition: > 1: topic clean up policy is "compact" initially > 2: log cleaner (compaction) thread picks up the partition for compaction and > still in progress > 3: the topic clean up policy has been updated to "deletion" > 4: retention thread pick up the topic partition and delete some old segments. > 5: log cleaner thread reads from the deleted log and raise an IO exception. > > The proposed solution is to use "inprogress" map that cleaner manager has to > protect such a race. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7322) race between compaction thread and retention thread when changing topic cleanup policy
[ https://issues.apache.org/jira/browse/KAFKA-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588450#comment-16588450 ] Dhruvil Shah commented on KAFKA-7322: - Does this issue still exist after KAFKA-7278 was fixed? > race between compaction thread and retention thread when changing topic > cleanup policy > -- > > Key: KAFKA-7322 > URL: https://issues.apache.org/jira/browse/KAFKA-7322 > Project: Kafka > Issue Type: Bug > Components: log >Reporter: xiongqi wu >Assignee: xiongqi wu >Priority: Major > > The deletion thread will grab the log.lock when it tries to rename log > segment and schedule for actual deletion. > The compaction thread only grabs the log.lock when it tries to replace the > original segments with the cleaned segment. The compaction thread doesn't > grab the log when it reads records from the original segments to build > offsetmap and new segments. As a result, if both deletion and compaction > threads work on the same log partition. We have a race condition. > This race happens when the topic cleanup policy is updated on the fly. > One case to hit this race condition: > 1: topic clean up policy is "compact" initially > 2: log cleaner (compaction) thread picks up the partition for compaction and > still in progress > 3: the topic clean up policy has been updated to "deletion" > 4: retention thread pick up the topic partition and delete some old segments. > 5: log cleaner thread reads from the deleted log and raise an IO exception. > > The proposed solution is to use "inprogress" map that cleaner manager has to > protect such a race. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)