[ https://issues.apache.org/jira/browse/KAFKA-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajini Sivaram resolved KAFKA-9632.
-----------------------------------
    Fix Version/s: 2.6.0
         Reviewer: Manikumar
       Resolution: Fixed

> Transient test failure: PartitionLockTest.testAppendReplicaFetchWithUpdateIsr
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-9632
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9632
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.5.0
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.6.0
>
>
> When running this test with _numRecordsPerProducer=500_, the test fails
> intermittently. The test uses MockTime and runs concurrent log operations.
> This can cause issues when attempting to roll a segment since Log and
> MockScheduler don't work well together. MockScheduler currently runs tasks
> while holding the MockScheduler lock. This can cause a deadlock if a thread
> attempts to schedule a task while holding a lock which is also acquired
> within a scheduled task.
>
> The issue in this test occurs when these two operations happen concurrently:
>
> 1) LogManager.cleanupLogs is a scheduled task that acquires Log lock. When
> run with MockScheduler, the thread holds MockScheduler lock and then attempts
> to acquire Log lock.
>
> 2) Partition.appendLogsToLeader holds Log lock and attempts to acquire
> MockScheduler lock in order to schedule a roll().
>
> Since locking order is reversed in 1) and 2), this causes a deadlock.
>
> The test itself can be easily fixed by avoiding roll() in the test. But it
> will be good to fix MockScheduler to enable it to be used in this case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
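
The lock-order inversion described in 1) and 2) can be illustrated with a
small, self-contained Java sketch. It is not Kafka code: SketchScheduler,
logLock and runScenario are hypothetical names, and running tasks outside the
scheduler lock is only one possible way to avoid the inversion, not
necessarily the change that resolved this issue. The latches force the
interleaving from the description: the append thread holds the log lock while
the scheduled cleanup task is already running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderSketch {

    /** Hypothetical scheduler; a stand-in, not Kafka's MockScheduler. */
    static final class SketchScheduler {
        private final List<Runnable> pending = new ArrayList<>();

        synchronized void schedule(Runnable task) {
            pending.add(task);
        }

        /** Collects pending tasks under the scheduler lock, then runs them after releasing it. */
        void tick() {
            List<Runnable> due;
            synchronized (this) {
                due = new ArrayList<>(pending);
                pending.clear();
            }
            // If this loop ran inside the synchronized block, the scenario below would deadlock.
            for (Runnable task : due) {
                task.run();
            }
        }
    }

    /** Returns true if both threads finish, i.e. the forced interleaving did not deadlock. */
    static boolean runScenario() {
        SketchScheduler scheduler = new SketchScheduler();
        ReentrantLock logLock = new ReentrantLock(); // stands in for the Log lock
        CountDownLatch appendHoldsLog = new CountDownLatch(1);
        CountDownLatch cleanupStarted = new CountDownLatch(1);

        // Operation 1: a scheduled cleanup task that needs the log lock.
        scheduler.schedule(() -> {
            cleanupStarted.countDown();
            logLock.lock();
            try {
                // cleanup work would go here
            } finally {
                logLock.unlock();
            }
        });

        // Operation 2: an append that holds the log lock and schedules a roll.
        Thread append = new Thread(() -> {
            logLock.lock();
            try {
                appendHoldsLog.countDown();
                cleanupStarted.await();           // cleanup is now running
                scheduler.schedule(() -> { });    // needs the scheduler lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                logLock.unlock();
            }
        });

        // Advances the scheduler only once the append thread holds the log lock.
        Thread ticker = new Thread(() -> {
            try {
                appendHoldsLog.await();
                scheduler.tick();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        append.setDaemon(true);
        ticker.setDaemon(true);
        append.start();
        ticker.start();
        try {
            append.join(5000);
            ticker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !append.isAlive() && !ticker.isAlive();
    }
}
```

Calling LockOrderSketch.runScenario() exercises the interleaving. With tick()
written as above, the append thread can take the scheduler lock while the
cleanup task waits for the log lock, so no thread holds one lock while
waiting for the other in the reverse order.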