[ https://issues.apache.org/jira/browse/CASSANDRA-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007964#comment-16007964 ]

Sylvain Lebresne commented on CASSANDRA-8273:
---------------------------------------------

bq. Obviously, moving the filtering to the coordinator would remove that 
problem, but doing so would, on top of not being trivial to implement, have 
serious performance impact since we can't know in advance how much data will be 
filtered and we may have to redo queries to replicas multiple times. 

That comment (from the description) is pretty old and isn't entirely accurate 
anymore so I want to amend it and expand on it.

While it's obviously still true that moving filtering coordinator-side has 
performance impacts, it's now kind of trivial to do post-CASSANDRA-8099.

Basically, I believe we just need to move the {{RowFilter#filter}} call that is 
currently in {{ReadCommand#executeLocally()}} to 
post-coordinator-reconciliation. Concretely, to the 
{{postReconciliationProcessing()}} method that {{PartitionRangeReadCommand}} 
already has, which we would generalize to all {{ReadCommand}}s (that is, also 
add it to {{SinglePartitionReadCommand}}).
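To make the correctness point concrete, here is a toy model of the scenario from the ticket description. The types and method names below are illustrative stand-ins, not actual Cassandra classes: filtering before reconciliation lets the one stale copy through, while filtering after reconciliation does not.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilterPlacement {
    // One replica's copy of row k=0: its write timestamp and the v2 value.
    record RowVersion(long timestamp, int v2) {}

    static final Predicate<RowVersion> FILTER = r -> r.v2() == 1; // WHERE v2 = 1

    // A and B have applied the UPDATE (v2 = 2); C is stale (v2 = 1).
    static final List<RowVersion> A = List.of(new RowVersion(2, 2));
    static final List<RowVersion> B = List.of(new RowVersion(2, 2));
    static final List<RowVersion> C = List.of(new RowVersion(1, 1));

    // Today: each replica filters before answering. A and B filter their
    // fresh copy out, so only C's stale row reaches the coordinator, which
    // has nothing newer to reconcile it against: the stale row is returned.
    static List<RowVersion> replicaSideFilter() {
        return Stream.of(A, B, C)
                .flatMap(replica -> replica.stream().filter(FILTER))
                .collect(Collectors.toList());
    }

    // Proposed: replicas answer unfiltered, the coordinator reconciles
    // (last-write-wins on timestamp) and only then applies the filter.
    static List<RowVersion> coordinatorSideFilter() {
        RowVersion reconciled = Stream.of(A, B, C)
                .flatMap(List::stream)
                .max(Comparator.comparingLong(RowVersion::timestamp))
                .orElseThrow();
        return Stream.of(reconciled).filter(FILTER).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("replica-side:     " + replicaSideFilter());     // stale row leaks
        System.out.println("coordinator-side: " + coordinatorSideFilter()); // correctly empty
    }
}
```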

In particular, while it's still true that we'll have to redo queries when 
filtering makes us fall short on a first try, the "short read protection" from 
{{DataResolver}} actually handles this for us reasonably nicely.

Of course, there are the performance concerns, which concretely come in 2 
flavors:
# we'll transfer everything that gets filtered out from the replica to the 
coordinator, which we don't do today.
# as a consequence, and as mentioned above, we'll (usually) have to do multiple 
coordinator<->replica round trips to get a given count of final rows, where a 
single one suffices today.
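The 2nd flavor can be sketched as a simple fetch loop (hypothetical code, not the actual coordinator logic): since the coordinator can't predict how many reconciled rows the filter will discard, it keeps going back to the replica until enough rows survive.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.IntStream;

public class MultiRoundFetch {
    // Hypothetical coordinator loop: keep asking the replica for pages
    // until `limit` rows survive the filter.
    static List<Integer> fetchFiltered(Iterator<Integer> replica,
                                       Predicate<Integer> rowFilter,
                                       int limit, int pageSize) {
        List<Integer> result = new ArrayList<>();
        while (result.size() < limit && replica.hasNext()) {
            // One coordinator<->replica round trip: pull one page...
            for (int i = 0; i < pageSize && replica.hasNext(); i++) {
                Integer row = replica.next();
                // ...and filter it coordinator-side; rows dropped here
                // still crossed the wire (flavor #1).
                if (rowFilter.test(row) && result.size() < limit)
                    result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 100 rows on the replica, only 1 in 10 matches the filter:
        // getting 5 final rows takes 5 round trips of 10 rows each.
        Iterator<Integer> replica = IntStream.range(0, 100).iterator();
        System.out.println(fetchFiltered(replica, r -> r % 10 == 0, 5, 10));
    }
}
```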

I do want to note the following though:
* For CL.ONE, and as noted by Robert above, this is not really a big deal. 
There is actually no impact at all if you use a token-aware client. If you 
don't, then we could theoretically push the filtering to the replica in that 
specific case, but honestly, if you care about performance, you should be using 
token-awareness, so I'm not convinced it's even worth adding any complexity for 
this (at the very least not for a v1: we don't currently ship the CL with 
queries to replicas, and while I'm sure we'll want to change that for other 
reasons at some point, I don't think we should bother here).
* For higher CLs, it's definitely a bigger impact, but here's the thing: if you 
use a higher CL, that implies that you actually care about and _rely on_ CL 
guarantees, so I think no amount of performance matters if we don't fulfill 
those guarantees, and not fixing a known correctness issue because it impacts 
performance is imo backwards.

I'll also note that while the 2nd flavor will certainly have an impact, the 
short-read protection from {{DataResolver}} is actually not too stupid about 
this: it will "regulate" its 2nd query based on how much was filtered out of 
the 1st one, to limit the impact somewhat. Not awesome, but better than nothing.
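Roughly, that regulation amounts to something like the following (the arithmetic here is illustrative only, not {{DataResolver}}'s actual formula): scale the next request by the survival rate observed on the previous page.

```java
public class ShortReadRegulation {
    // Illustrative sketch: size the next page so that, at the survival
    // rate observed on the previous page, we expect to cover the shortfall.
    static int nextPageSize(int received, int survived, int stillNeeded) {
        if (survived == 0)
            return received * 2;  // nothing survived the filter: probe bigger
        // Enough raw rows to cover what's still missing at the observed
        // rate (integer math, rounded up).
        return (stillNeeded * received + survived - 1) / survived;
    }

    public static void main(String[] args) {
        // First page: 100 rows fetched, only 10 survived, 90 still needed,
        // so ask for ~900 raw rows next time instead of blindly retrying.
        System.out.println(nextPageSize(100, 10, 90));
    }
}
```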

Anyway, I'm personally in favor of fixing this by moving filtering 
coordinator-side, as while this has a performance impact, we shouldn't be fast 
at the expense of correctness. Besides, I have no clue how to fix this 
replica-side, and no one has offered a proper option for that in ~3 years. 
Let's make things correct now, and _then_ we can think about how to optimize.

I also do want to remind for context that {{ALLOW FILTERING}} is something we 
strongly advertise as not-a-great-idea for anything performance sensitive in 
the first place, so that's imo all the more reason to not agonize over 
performance too much and favor correctness first and foremost.

> Allow filtering queries can return stale data
> ---------------------------------------------
>
>                 Key: CASSANDRA-8273
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8273
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>
> Data filtering is done replica side. That means that a single replica with 
> stale data may make the whole query return that stale data.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v1 text, v2 int);
> CREATE INDEX ON test(v1);
> INSERT INTO test(k, v1, v2) VALUES (0, 'foo', 1);
> {noformat}
> with every replica up to date. Now, suppose that the following queries are 
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v2 = 2 WHERE k = 0;
> SELECT * FROM test WHERE v1 = 'foo' AND v2 = 1;
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before 
> having applied the update, the now-stale result will be returned. Let's 
> note that this is a problem related to filtering, not to 2ndary indexes.
> This issue shares similarities with CASSANDRA-8272 but contrary to that 
> issue, I'm not sure how to fix it. Obviously, moving the filtering to the 
> coordinator would remove that problem, but doing so would, on top of not 
> being trivial to implement, have serious performance impact since we can't 
> know in advance how much data will be filtered and we may have to redo 
> queries to replicas multiple times.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
