[ 
https://issues.apache.org/jira/browse/KAFKA-9632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-9632.
-----------------------------------
    Fix Version/s: 2.6.0
         Reviewer: Manikumar
       Resolution: Fixed

> Transient test failure: PartitionLockTest.testAppendReplicaFetchWithUpdateIsr
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-9632
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9632
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.5.0
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.6.0
>
>
> When this test is run with _numRecordsPerProducer=500_, it fails
> intermittently. The test uses MockTime and runs concurrent log operations.
> This can cause problems when a segment is rolled, because Log and
> MockScheduler do not work well together: MockScheduler currently runs tasks
> while holding the MockScheduler lock. This can deadlock if a thread
> attempts to schedule a task while holding a lock that is also acquired
> within a scheduled task.
> The failure in this test occurs when these two operations happen
> concurrently:
> 1) LogManager.cleanupLogs is a scheduled task that acquires the Log lock.
> When run with MockScheduler, the thread holds the MockScheduler lock and
> then attempts to acquire the Log lock.
> 2) Partition.appendLogsToLeader holds the Log lock and attempts to acquire
> the MockScheduler lock in order to schedule a roll().
> Since the lock order is reversed between 1) and 2), this causes a deadlock.
> The test itself can easily be fixed by avoiding roll() in the test, but it
> would be good to fix MockScheduler so that it can be used in this case.
>  
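
For illustration only, here is a minimal Java sketch of one way a mock
scheduler could avoid the lock-order inversion described above: due tasks are
removed from the queue while the scheduler lock is held, but each task body
runs after that lock has been released. The class SketchScheduler and its
methods schedule() and advance() are hypothetical names invented for this
sketch. They are not Kafka's MockScheduler API, and this is not necessarily
how the issue was fixed.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch of a mock scheduler; not Kafka's MockScheduler. */
final class SketchScheduler {
    /** A task ordered by due time, then by insertion order. */
    private static final class Task implements Comparable<Task> {
        final long dueMs;
        final long seq;
        final Runnable body;

        Task(long dueMs, long seq, Runnable body) {
            this.dueMs = dueMs;
            this.seq = seq;
            this.body = body;
        }

        @Override
        public int compareTo(Task other) {
            int byDue = Long.compare(dueMs, other.dueMs);
            return byDue != 0 ? byDue : Long.compare(seq, other.seq);
        }
    }

    private final PriorityQueue<Task> tasks = new PriorityQueue<>();
    private long nowMs = 0;   // mock clock, guarded by the scheduler lock
    private long nextSeq = 0; // tie-breaker, guarded by the scheduler lock

    /** Queues a task; the scheduler lock is held only for the insert. */
    synchronized void schedule(Runnable body, long delayMs) {
        tasks.add(new Task(nowMs + delayMs, nextSeq++, body));
    }

    /** Advances the mock clock and runs every task that is now due. */
    void advance(long ms) {
        synchronized (this) {
            nowMs += ms;
        }
        Task task;
        while ((task = pollDue()) != null) {
            // The scheduler lock is NOT held here, so the task may take
            // other locks (for example a log lock) without inverting the
            // order used by threads that call schedule().
            task.body.run();
        }
    }

    /** Removes and returns the next due task, or null if none is due. */
    private synchronized Task pollDue() {
        Task head = tasks.peek();
        return (head != null && head.dueMs <= nowMs) ? tasks.poll() : null;
    }
}
```

With this shape, a thread that holds another lock while calling schedule()
waits only for a short queue operation, never for a running task. The
trade-off is that a task may observe a clock value that a concurrent
advance() has already moved past, which a mock scheduler used in tests may
or may not tolerate.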



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
