[ 
https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006545#comment-16006545
 ] 

Sergio Bossa commented on CASSANDRA-8272:
-----------------------------------------

bq. my main point is that on principle we should be careful to look at the 
whole solution before comparing it to alternatives and deciding which one we 
"stick with". I've seen simple solutions get pretty messy once you fix all edge 
cases to the point that it wasn't the best solution anymore.

Of course. I did give the tombstones solution some thought of my own (as I'm 
sure [~adelapena] did as well), and I found it considerably more complex than 
the current one, with little or no gain in return. It is also, something I 
didn't mention before, not really complete for indexes covering multiple 
columns, or if we ever want to support multiple indexes per row: in such 
cases, mixing tombstones and valid column values for all combinations would 
easily turn into a mess IMHO, while actually returning the row and 
post-filtering it later is IMHO cleaner and less error prone. To be noted, we 
could still "skim" the row when we detect it's related to a stale entry and 
only keep the index-related columns (and easily add a merging step in the 
future for the multiple-indexes case): this would buy us the performance 
optimization you mentioned above, but I see it as slightly error prone and I'd 
rather go with a functionally complete solution first.
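
To make the "return the row and post-filter" idea concrete, here is a minimal 
sketch of the coordinator side, using the scenario from the issue description. 
All class and method names ({{CoordinatorPostFilter}}, {{Row}}, {{merge}}, 
{{postFilter}}) are hypothetical and not taken from the Cassandra code base; 
reconciliation is reduced to "newest timestamp wins" for a single column.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types exist in Cassandra.
final class CoordinatorPostFilter {
    /** One replica's view of a row: key, indexed value, write timestamp. */
    static final class Row {
        final int key;
        final String value;
        final long timestamp;

        Row(int key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Reconcile replica responses per key: the newest timestamp wins. */
    static Map<Integer, Row> merge(List<List<Row>> responses) {
        Map<Integer, Row> merged = new HashMap<>();
        for (List<Row> response : responses) {
            for (Row row : response) {
                Row current = merged.get(row.key);
                if (current == null || row.timestamp > current.timestamp) {
                    merged.put(row.key, row);
                }
            }
        }
        return merged;
    }

    /** Re-apply the index expression (v = expected) after reconciliation. */
    static List<Row> postFilter(Map<Integer, Row> merged, String expected) {
        List<Row> result = new ArrayList<>();
        for (Row row : merged.values()) {
            if (expected.equals(row.value)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // SELECT * FROM test WHERE v = 'foo' at QUORUM, answered by B and C.
        // B has applied the update; its lookup for 'foo' hits a stale index
        // entry, so it returns the current row instead of skipping it.
        List<Row> fromB = List.of(new Row(0, "bar", 2L));
        // C has not applied the update yet and still returns the old row.
        List<Row> fromC = List.of(new Row(0, "foo", 1L));

        Map<Integer, Row> merged = merge(List.of(fromB, fromC));
        List<Row> result = postFilter(merged, "foo");
        System.out.println("rows returned: " + result.size());
    }
}
```

In this sketch the newer {{'bar'}} row from B shadows the stale {{'foo'}} row 
from C during the merge, and the post-filter then drops it, so nothing is 
returned to the client.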

bq. It's in particular not true that fixing this bug will be "invalidated when 
filtering is applied"

I disagree here: if filtering is applied on top of index results, you'll still 
get wrong results, which is confusing to me (as a user). I understand filtering 
is also orthogonal, so what about fixing filtering (that is, moving to 
coordinator-side filtering) only when indexes are present?

bq. That [fixing other index implementations] I agree is something we should 
consider. Though tbh, I have doubts we can have a solution that is completely 
index agnostic. 

Of course. But we can still provide some API (e.g. the {{isSatisfiedBy()}} you 
mentioned) they can leverage. And if we do this kind of work on the 
SASI-enabled branches, we'll have two different index implementations to 
validate our API against.

bq. One thing that hasn't been mentioned is that the fix has an impact on 
upgrades. Namely, in a mixed cluster, some replicas will start to return invalid 
results and if the coordinator isn't upgraded yet, it won't filter those, which 
means we'll return invalid entries.

Excellent point! And definitely something to avoid.

bq. That does mean we should consider starting to filter entries on index 
queries coordinator-side in 3.0/3.11 (even though we never return them), and 
only do the replica-side parts in 4.0, with a fat warning that you need to only 
upgrade to 4.0 from a 3.X version that has the coordinator-side fix.

Mmmhhhh ... clunky. And error prone, as the 3.X code would probably be 
untestable. Couldn't the replica detect the coordinator version and return 
results accordingly?
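
Roughly what I have in mind, as a sketch only: the version constant, the 
{{respond}} method and the way the replica learns the coordinator version are 
all assumptions made for illustration, not existing Cassandra APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and version numbers are hypothetical.
final class VersionAwareIndexRead {
    /** Hypothetical messaging version from which coordinators post-filter. */
    static final int FIRST_FILTERING_VERSION = 12;

    static final class Row {
        final int key;
        final String value;

        Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Rows the replica sends back for "v = expected". The candidates are the
     * current rows behind the index hits, so some may no longer match.
     */
    static List<Row> respond(List<Row> candidates, String expected, int coordinatorVersion) {
        boolean coordinatorFilters = coordinatorVersion >= FIRST_FILTERING_VERSION;
        List<Row> response = new ArrayList<>();
        for (Row row : candidates) {
            // Upgraded coordinators also get non-matching rows, so they can
            // shadow stale data from other replicas. Old coordinators would
            // pass those rows to the client, so they only get real matches.
            if (coordinatorFilters || expected.equals(row.value)) {
                response.add(row);
            }
        }
        return response;
    }

    public static void main(String[] args) {
        List<Row> candidates = List.of(new Row(0, "bar"), new Row(1, "foo"));
        System.out.println(respond(candidates, "foo", 11).size()); // old coordinator
        System.out.println(respond(candidates, "foo", 12).size()); // upgraded coordinator
    }
}
```

With an old coordinator the replica behaves as it does today, so a mixed 
cluster never returns invalid entries to the client; the stale-read fix only 
takes effect once the coordinator is upgraded.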

bq. Worth noting that this doesn't entirely fly for index using custom indexes: 
we'd need to have them implement the CustomExpression#isSatisfiedBy method in 
3.X in that scheme since we need it for the coordinator-side filtering as well, 
but making that method abstract in 3.X is, as said above, a breaking change.

I'm not sure I get why you _have to_ make that abstract: I think it's fine to 
leave it as it is and warn users they'll have to override it on upgrade if they 
want consistent results. And for those implementations that can't implement it, 
we should maybe add an {{isConsistent}} predicate to disable "consistent 
filtering" altogether.
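
As a sketch of that shape (the classes below are stand-ins, not Cassandra's 
actual {{CustomExpression}}; the accept-everything default and the 
{{isConsistent}} and {{keep}} names are assumptions made for illustration):

```java
import java.util.Map;

// Illustrative only: stand-ins for the real index expression classes.
final class ConsistentFilteringSketch {
    static class CustomExpression {
        final String column;
        final String value;

        CustomExpression(String column, String value) {
            this.column = column;
            this.value = value;
        }

        /** Non-abstract default (assumed): accept every row. */
        boolean isSatisfiedBy(Map<String, String> row) {
            return true;
        }

        /** Hypothetical opt-in: true only if isSatisfiedBy is a real check. */
        boolean isConsistent() {
            return false;
        }
    }

    /** An implementation that has been updated to support post-filtering. */
    static final class EqualsExpression extends CustomExpression {
        EqualsExpression(String column, String value) {
            super(column, value);
        }

        @Override
        boolean isSatisfiedBy(Map<String, String> row) {
            return value.equals(row.get(column));
        }

        @Override
        boolean isConsistent() {
            return true;
        }
    }

    /** Coordinator-side check: only re-filter when the index opted in. */
    static boolean keep(CustomExpression expression, Map<String, String> row) {
        return !expression.isConsistent() || expression.isSatisfiedBy(row);
    }

    public static void main(String[] args) {
        Map<String, String> row = Map.of("v", "bar");
        System.out.println(keep(new CustomExpression("v", "foo"), row)); // legacy index
        System.out.println(keep(new EqualsExpression("v", "foo"), row)); // updated index
    }
}
```

Existing custom indexes keep compiling and behave as before (no consistent 
filtering), while implementations that override both methods get their stale 
rows filtered out.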

> 2ndary indexes can return stale data
> ------------------------------------
>
>                 Key: CASSANDRA-8272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 3.0.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica 
> to return a stale result and that result will be sent back to the user, 
> potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are 
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before 
> having applied the update, then the now stale result will be returned (since 
> C will return it and A or B will return nothing).
> A potential solution would be that when we read a tombstone in the index (and 
> provided we make the index inherit the gcGrace of its parent CF), instead of 
> skipping that tombstone, we'd insert in the result a corresponding range 
> tombstone.  



