[ 
https://issues.apache.org/jira/browse/CASSANDRA-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040506#comment-17040506
 ] 

David Capwell commented on CASSANDRA-15553:
-------------------------------------------

I took a look and had to dig further into IR messaging. Here is what I see:

IR messaging follows a fire-and-forget pattern, so any ephemeral issue can lead 
to messages not being seen (tests show this in CASSANDRA-15564, and it has been 
reported as an issue with current repair in CASSANDRA-15566).  This patch relies 
on the FINALIZE_COMMIT_MSG being seen on the coordinator of the IR preview 
repair in order to detect the conflict, but the message is handled 
asynchronously: the participants may see it while validation is running, while 
the coordinator sees it only after it has received all validations (so the 
session is already complete). In that case you have the same issue as reported 
by this JIRA.

This patch also effectively prevents preview and IR from running on the same 
range, since the preview will fail with a conflict*. So IR should not be 
scheduled while preview is running, and preview should not be scheduled while 
IR is running (else we waste resources on validation). In practice, whatever 
is scheduling the repairs will have to be enhanced to handle this, which adds 
more complexity for operators.

I actually wonder if we can remove this restriction.  It looks to me like 
repairedAt is system time (i.e., it could drift, roll backwards, etc.), but we 
could keep track of the largest one and make sure this counter is monotonic.  
With a data structure of 

* largest contiguous commit (long)
* inFlight (array of long)

we (the coordinator) could make sure that we always produce a repairedAt larger 
than any we know of, which lets preview take a snapshot of the state at the 
start of coordination. With this snapshot, we filter for sstables that are 
repaired and have repairedAt <= the snapshotted largest contiguous commit; this 
should give preview repair effectively snapshot isolation (assuming compaction 
also maintains repairedAt).
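
To make the idea concrete, here is a minimal single-process sketch of that bookkeeping. It is not taken from the patch or from Cassandra: the class name RepairedAtTracker, its methods, and the extra committedAhead set (used to remember commits that finished out of order) are all hypothetical, and it ignores how this state would be shared between coordinators.

```java
import java.util.TreeSet;

// Hypothetical sketch, not Cassandra code: tracks repairedAt values so that
// preview repair can read a stable "largest contiguous commit" snapshot.
final class RepairedAtTracker {
    // Largest repairedAt such that no smaller-or-equal value is still in flight.
    private long largestContiguousCommit = 0L;
    // Largest repairedAt ever handed out; used to keep the counter monotonic.
    private long lastIssued = 0L;
    // repairedAt values of IR sessions that have started but not finished.
    private final TreeSet<Long> inFlight = new TreeSet<>();
    // Assumption: commits that completed while an older session was in flight.
    private final TreeSet<Long> committedAhead = new TreeSet<>();

    // Start an IR session. The result is always larger than any value issued
    // before, even if the wall clock drifted or rolled backwards.
    synchronized long begin(long wallClockMillis) {
        long next = Math.max(wallClockMillis, lastIssued + 1);
        lastIssued = next;
        inFlight.add(next);
        return next;
    }

    // Finalize an IR session identified by its repairedAt.
    synchronized void commit(long repairedAt) {
        if (!inFlight.remove(repairedAt))
            throw new IllegalStateException("unknown session " + repairedAt);
        committedAhead.add(repairedAt);
        advance();
    }

    // A failed session no longer blocks later commits from becoming visible.
    synchronized void abort(long repairedAt) {
        inFlight.remove(repairedAt);
        advance();
    }

    // Move the contiguous mark up to the last commit below the oldest
    // in-flight session (or through all commits if nothing is in flight).
    private void advance() {
        long bound = inFlight.isEmpty() ? Long.MAX_VALUE : inFlight.first();
        while (!committedAhead.isEmpty() && committedAhead.first() < bound)
            largestContiguousCommit = committedAhead.pollFirst();
    }

    // Preview takes this value once, at the start of coordination.
    synchronized long snapshot() {
        return largestContiguousCommit;
    }

    // Filter applied by preview: repaired and repairedAt <= snapshot.
    static boolean visibleToPreview(boolean repaired, long repairedAt, long snapshot) {
        return repaired && repairedAt <= snapshot;
    }
}
```

With this shape, a preview that took its snapshot before an IR session finalized simply does not see that session's sstables on any node, rather than having to fail with a conflict.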

* In CASSANDRA-15564 I show that preview doesn't properly check session 
failures; run [this 
test|https://github.com/apache/cassandra/pull/446/files#diff-af4a07a2b44695f510dddb0c102e1953R28]
 and [this 
one|https://github.com/apache/cassandra/pull/446/files#diff-ca9f3b43ad8ff955d6ddd2ef4d2b6904R28]
 without the change in the JIRA to see it.  Your tests behave differently 
because you don't use nodetool and instead monitor notifications directly.

> Preview repair should include sstables from finalized incremental repair 
> sessions
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15553
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15553
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> When running a preview repair we currently grab all repaired sstables, 
> problem is that we depend on compaction to move the sstables from pending to 
> repaired so we might have different data marked repaired on different nodes. 
> Including any sstables from finalized incremental repair sessions as repaired 
> will solve this.
> Another problem is that validations don't start at exactly the same time on 
> different nodes, so if an incremental repair finishes while the preview 
> repair is running we might also validate the wrong repaired set. We should 
> fail the preview repair if an intersecting incremental repair finishes during 
> the preview repair.



