[ https://issues.apache.org/jira/browse/CASSANDRA-18816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770022#comment-17770022 ]

Andres de la Peña commented on CASSANDRA-18816:
-----------------------------------------------

The new {{ConcurrentIrWithPreviewFuzzTest}} introduced by this patch is flaky, failing in ~6% of runs on both 5.0 and trunk:

* https://app.circleci.com/pipelines/github/adelapena/cassandra/3222/workflows/ecfca708-f183-429e-80e5-b2bfea8d25a0/jobs/80292/tests
* https://app.circleci.com/pipelines/github/adelapena/cassandra/3221/workflows/bb777ac0-6263-4d6e-aa54-35d6928e1e9b/jobs/80294

{code}
junit.framework.AssertionFailedError: Property error detected:
Seed = 3695691971125975155
Examples = 2
Pure = false
Error: property test did not complete within PT1M
Values:

        at accord.utils.Property$Common.checkWithTimeout(Property.java:115)
        at accord.utils.Property$SingleBuilder.check(Property.java:223)
        at accord.utils.Property$ForBuilder.check(Property.java:124)
        at org.apache.cassandra.repair.ConcurrentIrWithPreviewFuzzTest.concurrentIrWithPreview(ConcurrentIrWithPreviewFuzzTest.java:46)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{code}
I don't see any repeated runs in the CI results above. Were they run?

I have opened CASSANDRA-18890 to deal with it.

> Add support for repair coordinator to retry messages that timeout
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-18816
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18816
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 5.0-alpha2
>
>          Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Now that CASSANDRA-15399 is in, most of the repair messages have a state that 
> they can check against to make message delivery idempotent, allowing the 
> coordinator to retry such messages; a few of the most critical messages to 
> retry are: PREPARE_MSG, VALIDATION_REQ, VALIDATION_RSP, SYNC_REQ, and 
> SYNC_RSP.
> With this I propose making the coordinator able to retry these key messages 
> to try and make repair more resilient to ephemeral issues.
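
The idea in the description above (a receiver that checks its recorded state so that a re-delivered message is harmless, which in turn lets the coordinator retry after a timeout) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the {{Participant}}, {{LossyLink}} and {{sendWithRetry}} names are hypothetical, a {{null}} reply stands in for a timed-out message, and real repair messaging is asynchronous and considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class IdempotentRetrySketch {
    /** Hypothetical participant: records the reply per request id so a re-delivered request is not re-executed. */
    static final class Participant {
        private final Map<String, String> replies = new HashMap<>();
        int executions = 0; // how many times the work was actually performed

        String handle(String requestId) {
            String cached = replies.get(requestId);
            if (cached != null) {
                return cached; // duplicate delivery: answer from recorded state
            }
            executions++;
            String reply = "ok:" + requestId;
            replies.put(requestId, reply);
            return reply;
        }
    }

    /** Hypothetical lossy link: the participant processes the request, but the first few replies are lost. */
    static final class LossyLink implements Function<String, String> {
        private final Participant participant;
        private int remainingDrops;

        LossyLink(Participant participant, int drops) {
            this.participant = participant;
            this.remainingDrops = drops;
        }

        @Override
        public String apply(String requestId) {
            String reply = participant.handle(requestId);
            if (remainingDrops > 0) {
                remainingDrops--;
                return null; // null models a timeout seen by the coordinator
            }
            return reply;
        }
    }

    /** Hypothetical coordinator-side helper: resend the same request until a reply arrives or attempts run out. */
    static String sendWithRetry(Function<String, String> link, String requestId, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String reply = link.apply(requestId);
            if (reply != null) {
                return reply;
            }
        }
        throw new IllegalStateException("no reply for " + requestId + " after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        Participant participant = new Participant();
        LossyLink link = new LossyLink(participant, 2); // first two replies are lost
        String reply = sendWithRetry(link, "validation-1", 5);
        System.out.println(reply + " executions=" + participant.executions);
    }
}
```

In this sketch the request is delivered three times but the work runs once, which is the property that makes retrying safe. Without the state check on the receiving side, each retry would repeat the work.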



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
