[ https://issues.apache.org/jira/browse/KAFKA-9750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066574#comment-17066574 ]

Chia-Ping Tsai commented on KAFKA-9750:
---------------------------------------

{code:java}
      // bump the epoch from 0 to 1 in order to trigger a fenced error
      replicaManager.becomeLeaderOrFollower(0, leaderAndIsrRequest(1), (_, _) => ())
      TestUtils.waitUntilTrue(() => replicaManager.replicaAlterLogDirsManager.fetcherThreadMap.values.forall(_.partitionCount() == 0),
        s"the partition=$topicPartition should be removed from pending state")
{code}

The root cause is a race condition. If the epoch in ReplicaAlterLogDirsThread 
is increased, the partition is added to the end instead of being removed. This 
PR includes the following changes.
1. Controls the lock of ReplicaAlterLogDirsThread so that the fenced error 
happens almost always.
2. Waits for the thread to complete.
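
To illustrate the two changes, here is a minimal, self-contained sketch of the 
idea. It is not the code from the PR: FencedRaceSketch, waitUntil, pending and 
leaderEpoch are hypothetical stand-ins for the real test, TestUtils.waitUntilTrue 
and the state of ReplicaAlterLogDirsThread. The test holds the lock while it 
bumps the epoch, so the worker can only observe the fenced state, and then it 
waits for the worker to finish before asserting.

{code:scala}
import java.util.concurrent.atomic.AtomicInteger
import java.util.concurrent.locks.ReentrantLock
import scala.collection.concurrent.TrieMap

object FencedRaceSketch {
  // Hypothetical stand-in for TestUtils.waitUntilTrue: poll until the
  // condition holds, or fail with msg once the timeout has passed.
  def waitUntil(condition: () => Boolean, msg: => String,
                timeoutMs: Long = 5000L, pauseMs: Long = 10L): Unit = {
    val deadline = System.currentTimeMillis() + timeoutMs
    while (!condition()) {
      if (System.currentTimeMillis() > deadline) throw new AssertionError(msg)
      Thread.sleep(pauseMs)
    }
  }

  // Returns the number of partitions still pending at the end (expected: 0).
  def demo(): Int = {
    val partition = "test-topic-0"
    val lock = new ReentrantLock()
    val leaderEpoch = new AtomicInteger(0)
    // partition -> epoch the worker would fetch with (assumed model)
    val pending = TrieMap(partition -> 0)

    val worker = new Thread(new Runnable {
      override def run(): Unit = {
        lock.lock()
        try {
          // A fetch epoch older than the leader epoch is the fenced case:
          // the partition is dropped from the pending state.
          if (pending.get(partition).exists(_ < leaderEpoch.get())) {
            pending.remove(partition)
          }
        } finally lock.unlock()
      }
    })

    // Change 1: hold the lock while bumping the epoch, so the worker
    // cannot run its check before the epoch has changed.
    lock.lock()
    try {
      worker.start()
      leaderEpoch.set(1)
    } finally lock.unlock()

    // Change 2: wait for the worker to complete before asserting.
    worker.join(5000L)
    waitUntil(() => pending.isEmpty,
      s"the partition=$partition should be removed from pending state")
    pending.size
  }
}
{code}

Calling FencedRaceSketch.demo() should return 0 under these assumptions. 
Without the lock around the epoch bump, the worker could run its check while 
the epoch is still 0 and leave the partition pending, which is the kind of 
ordering that makes such a test flaky.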

> Flaky test kafka.server.ReplicaManagerTest.testFencedErrorCausedByBecomeLeader
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-9750
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9750
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>            Reporter: Bob Barrett
>            Assignee: Chia-Ping Tsai
>            Priority: Major
>              Labels: flaky-test
>
> When running tests locally, I've seen that 1-2% of the time, 
> testFencedErrorCausedByBecomeLeader fails with
> {code:java}
> org.scalatest.exceptions.TestFailedException: the partition=test-topic-0 should be removed from pending state
>  at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)
>  at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)
>  at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389)
>  at org.scalatest.Assertions.fail(Assertions.scala:1091)
>  at org.scalatest.Assertions.fail$(Assertions.scala:1087)
>  at org.scalatest.Assertions$.fail(Assertions.scala:1389)
>  at kafka.server.ReplicaManagerTest.testFencedErrorCausedByBecomeLeader(ReplicaManagerTest.scala:248)
>  at jdk.internal.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>  at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>  at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>  at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>  at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>  at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:40)
>  at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)