[jira] [Commented] (KAFKA-15897) Flaky Test: testWrongIncarnationId() – kafka.server.ControllerRegistrationManagerTest

2024-04-29 Thread Johnny Hsu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841998#comment-17841998
 ] 

Johnny Hsu commented on KAFKA-15897:


previously I thought that the poll should be fine, since the 
`rpcStats(manager)` always returns the current status of the manager, which 
means that no matter whether there are race conditions or not, if the 
prepareResponseFrom was executed before the second poll, it should not fail. 
However, actually there is another event queue thread which append the request. 
Thus, if there are race conditions between the first and second poll, this 
could lead to the second poll failing, because the request could be taken away 
by the first poll.

Thanks [~chia7712] for the through discussion, I am willing to fix this :)

> Flaky Test: testWrongIncarnationId() – 
> kafka.server.ControllerRegistrationManagerTest
> -
>
> Key: KAFKA-15897
> URL: https://issues.apache.org/jira/browse/KAFKA-15897
> Project: Kafka
>  Issue Type: Test
>Reporter: Apoorv Mittal
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: flaky-test
>
> Build run: 
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14699/21/tests/
>  
> {code:java}
> org.opentest4j.AssertionFailedError: expected: <(false,1,0)> but was: 
> <(true,0,0)>Stacktraceorg.opentest4j.AssertionFailedError: expected: 
> <(false,1,0)> but was: <(true,0,0)>at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>at 
> app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)  
> at 
> app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)  
> at 
> app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)  
> at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)   
>   at 
> app//kafka.server.ControllerRegistrationManagerTest.$anonfun$testWrongIncarnationId$3(ControllerRegistrationManagerTest.scala:228)
>at 
> app//org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:379)
>  at 
> app//kafka.server.ControllerRegistrationManagerTest.testWrongIncarnationId(ControllerRegistrationManagerTest.scala:226)
>   at 
> java.base@17.0.7/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)at 
> java.base@17.0.7/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base@17.0.7/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base@17.0.7/java.lang.reflect.Method.invoke(Method.java:568)
> at 
> app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)
>   at 
> app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>  at 
> app//org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>  at 
> app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> at 
> app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
>   at 
> app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
>at 
> app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> at 
> app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>  at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> at 
> app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
>   at 
> app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
>   at 
> app//org.junit.jupiter.engine.descriptor.TestMethodTe

[jira] [Commented] (KAFKA-15897) Flaky Test: testWrongIncarnationId() – kafka.server.ControllerRegistrationManagerTest

2024-04-28 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841597#comment-17841597
 ] 

Chia-Ping Tsai commented on KAFKA-15897:


It seems the root cause is that `context.mockChannelManager.poll()` 
([https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/ControllerRegistrationManagerTest.scala#L221)]
  is executed after event queue thread adds the `ControllerRegistrationRequest` 
([https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ControllerRegistrationManager.scala#L233).]
 The request is consumed without no response as we don't set response 
([https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/ControllerRegistrationManagerTest.scala#L226)].
 Hence, the following test 
([https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/ControllerRegistrationManagerTest.scala#L230)]
 gets failed as the request is gone.

I feel the first poll 
([https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/ControllerRegistrationManagerTest.scala#L221)]
 is unnecessary since the check is waiting for event queue thread to send 
controller registration request. That does NOT need the poll, and removing the 
poll can prevent above race condition.

 

 

> Flaky Test: testWrongIncarnationId() – 
> kafka.server.ControllerRegistrationManagerTest
> -
>
> Key: KAFKA-15897
> URL: https://issues.apache.org/jira/browse/KAFKA-15897
> Project: Kafka
>  Issue Type: Test
>Reporter: Apoorv Mittal
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: flaky-test
>
> Build run: 
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14699/21/tests/
>  
> {code:java}
> org.opentest4j.AssertionFailedError: expected: <(false,1,0)> but was: 
> <(true,0,0)>Stacktraceorg.opentest4j.AssertionFailedError: expected: 
> <(false,1,0)> but was: <(true,0,0)>at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>at 
> app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)  
> at 
> app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)  
> at 
> app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)  
> at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)   
>   at 
> app//kafka.server.ControllerRegistrationManagerTest.$anonfun$testWrongIncarnationId$3(ControllerRegistrationManagerTest.scala:228)
>at 
> app//org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:379)
>  at 
> app//kafka.server.ControllerRegistrationManagerTest.testWrongIncarnationId(ControllerRegistrationManagerTest.scala:226)
>   at 
> java.base@17.0.7/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)at 
> java.base@17.0.7/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base@17.0.7/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base@17.0.7/java.lang.reflect.Method.invoke(Method.java:568)
> at 
> app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728)
>   at 
> app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>  at 
> app//org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>  at 
> app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> at 
> app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
>   at 
> app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
>at 
> app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> at 
> app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>  at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> at 
> app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>at 
> app//org.junit.jupiter.e