[ 
https://issues.apache.org/jira/browse/IGNITE-26271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-26271:
---------------------------------
    Description: 
{noformat}
org.opentest4j.AssertionFailedError: Row comparison failed within the timeout. ==> expected: <true> but was: <false>
  at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
  at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
  at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
  at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
  at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
  at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.assertValueOnSpecificNode(ItDisasterRecoveryManagerTest.java:498)
  at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.assertValueOnSpecificNodes(ItDisasterRecoveryManagerTest.java:488)
  at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.testRestartTablePartitionsWithCleanUpConcurrentRebalance(ItDisasterRecoveryManagerTest.java:436)
  at [email protected]/java.lang.reflect.Method.invoke(Method.java:568)
  at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1511)
  at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1511)
{noformat}

Test failed on a private TC

In this ticket we investigated a bug in how we calculate alive data nodes for
disaster recovery. The problem is that the current solution does not take into
account that some nodes from the assignments could be learner nodes, so we need
to add information about learners to the {{localPartitionStates}} response.
This change requires adding a new field to {{LocalPartitionStateMessage}}.
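
To illustrate why the learner information matters, here is a minimal sketch. It is not the actual Ignite code: {{LocalState}}, its fields, and both helper methods are hypothetical stand-ins invented for this example, and how the real fix consumes the new field is not stated in this ticket. The sketch only shows that, without a learner flag in the reported state, a healthy learner is indistinguishable from a healthy peer.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AliveDataNodesSketch {
    /** Hypothetical, simplified per-node partition state; NOT the real LocalPartitionStateMessage. */
    static final class LocalState {
        final String nodeName;
        final boolean healthy;
        final boolean learner; // assumed shape of the newly added information

        LocalState(String nodeName, boolean healthy, boolean learner) {
            this.nodeName = nodeName;
            this.healthy = healthy;
            this.learner = learner;
        }
    }

    /** Learner-unaware calculation: every healthy node counts as alive. */
    static Set<String> aliveNodesIgnoringRole(List<LocalState> states) {
        Set<String> alive = new TreeSet<>();
        for (LocalState s : states) {
            if (s.healthy) {
                alive.add(s.nodeName);
            }
        }
        return alive;
    }

    /** Learner-aware calculation: only healthy nodes that are not learners are counted. */
    static Set<String> aliveNonLearnerNodes(List<LocalState> states) {
        Set<String> alive = new TreeSet<>();
        for (LocalState s : states) {
            if (s.healthy && !s.learner) {
                alive.add(s.nodeName);
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        List<LocalState> states = List.of(
                new LocalState("node1", true, false),   // healthy peer
                new LocalState("node2", true, true),    // healthy learner
                new LocalState("node3", false, false)); // unhealthy peer

        System.out.println(aliveNodesIgnoringRole(states)); // [node1, node2]
        System.out.println(aliveNonLearnerNodes(states));   // [node1]
    }
}
```

In this toy example the two calculations disagree only on {{node2}}, the learner, which is the kind of discrepancy the description above refers to.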

The original test fails for a different reason: we see a reordering of
{{changePeersAndLearnersAsync}} calls and an outdated
{{Receive ChangePeersAndLearnersAsyncRequest}}, which leads to
{{{drmt_trtpwcucr_3347> can't do preVote as it is not in conf}} and subsequent
{{Unsuccessful election rounds}}, while the test expects all nodes to be in the
configuration and tries to read rows from them.

  was:
{noformat}
org.opentest4j.AssertionFailedError: Row comparison failed within the timeout. ==> expected: <true> but was: <false>
  at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
  at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
  at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
  at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
  at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
  at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.assertValueOnSpecificNode(ItDisasterRecoveryManagerTest.java:498)
  at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.assertValueOnSpecificNodes(ItDisasterRecoveryManagerTest.java:488)
  at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.testRestartTablePartitionsWithCleanUpConcurrentRebalance(ItDisasterRecoveryManagerTest.java:436)
  at [email protected]/java.lang.reflect.Method.invoke(Method.java:568)
  at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1511)
  at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1511)
{noformat}

Test failed on a private TC

In this ticket we investigated a bug in how we calculate alive data nodes for
disaster recovery. The problem is that the current solution does not take into
account that some nodes from the assignments could be learner nodes, so we need
to add information about learners to the {{localPartitionStates}} response.
This change requires adding a new field to {{LocalPartitionStateMessage}}.

The original test fails for a different reason: we see a reordering of
{{changePeersAndLearnersAsync}} calls and an outdated
{{Receive ChangePeersAndLearnersAsyncRequest}}, which leads to
{{{drmt_trtpwcucr_3347> can't do preVote as it is not in conf}} and subsequent
{{Unsuccessful election rounds}}.


> Test ItDisasterRecoveryManagerTest#testRestartTablePartitionsWithCleanUpConcurrentRebalance is flaky
> ----------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-26271
>                 URL: https://issues.apache.org/jira/browse/IGNITE-26271
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mirza Aliev
>            Assignee: Mirza Aliev
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain, ignite-3
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> org.opentest4j.AssertionFailedError: Row comparison failed within the timeout. ==> expected: <true> but was: <false>
>   at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>   at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>   at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>   at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214)
>   at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.assertValueOnSpecificNode(ItDisasterRecoveryManagerTest.java:498)
>   at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.assertValueOnSpecificNodes(ItDisasterRecoveryManagerTest.java:488)
>   at app//org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.testRestartTablePartitionsWithCleanUpConcurrentRebalance(ItDisasterRecoveryManagerTest.java:436)
>   at [email protected]/java.lang.reflect.Method.invoke(Method.java:568)
>   at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1511)
>   at [email protected]/java.util.ArrayList.forEach(ArrayList.java:1511)
> {noformat}
> Test failed on a private TC
> In this ticket we investigated a bug in how we calculate alive data nodes
> for disaster recovery. The problem is that the current solution does not
> take into account that some nodes from the assignments could be learner
> nodes, so we need to add information about learners to the
> {{localPartitionStates}} response. This change requires adding a new field
> to {{LocalPartitionStateMessage}}.
> The original test fails for a different reason: we see a reordering of
> {{changePeersAndLearnersAsync}} calls and an outdated
> {{Receive ChangePeersAndLearnersAsyncRequest}}, which leads to
> {{{drmt_trtpwcucr_3347> can't do preVote as it is not in conf}} and
> subsequent {{Unsuccessful election rounds}}, while the test expects all
> nodes to be in the configuration and tries to read rows from them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
