[ https://issues.apache.org/jira/browse/CASSANDRA-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Caleb Rackliffe updated CASSANDRA-20639:
----------------------------------------
Change Category: Performance
Complexity: Normal
Fix Version/s: 5.0.x, 5.x
Status: Open (was: Triage Needed)
> Replica filtering protection can trigger short-read protection too
> aggressively when the LIMIT is less than the number of results in a partition
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-20639
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20639
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Consistency/Coordination, Feature/SAI
> Reporter: Caleb Rackliffe
> Assignee: Caleb Rackliffe
> Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> {{ReplicaFilteringProtection#queryProtectedPartitions()}} provides
> "completed" partitions to the {{DataResolver}} in two steps. First, it
> consumes the initial merged query results from the replicas, via a
> {{PartitionIterator}} which is short-read protected. As it does this, it
> consumes all matches in a partition. This forces the row data through RFP's
> merge listener, which catalogs the places where replicas are "silent" and marks
> them for completion. Second, {{PartitionBuilder}} uses this information to
> complete the partition with data from the replicas that provided ambiguous
> results.
> The problem here is in the first step. When the total number of matches in a
> large partition is a large multiple of the LIMIT, consuming all the matches
> in the partition triggers a flurry of short-read protection reads to any
> replicas that actually provided enough results to hit the limit. This problem
> is somewhat mitigated by CASSANDRA-20566 if we can use strict filtering and
> therefore {{SinglePartitionReadCommand}}, where digest matches bypass RFP
> altogether. (This would be especially likely with small limits and reasonably
> repaired data.)
> Here's a short test that should hit all of this. (Just put a breakpoint in
> {{hasNext()}} inside {{queryProtectedPartitions()}}, and then another in
> {{ShortReadPartitionsProtection#executeReadCommand()}}, to see SRP reads being
> sent.)
> {noformat}
> @Test
> public void testShortReadNoSRP()
> {
>     CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.short_read_no_srp (k int, c int, a int, b int, PRIMARY KEY (k, c)) WITH read_repair = 'NONE'"));
>     CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(a) USING 'sai'"));
>     CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(b) USING 'sai'"));
>     SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);
>
>     // Write one matching row locally on node 1 only
>     CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.short_read_no_srp(k, c, a) VALUES (0, 2, 1) USING TIMESTAMP 5"));
>
>     // Query through the coordinator at ALL with a page size of 1
>     String select = withKeyspace("SELECT * FROM %s.short_read_no_srp WHERE k = 0 AND a = 1");
>     Iterator<Object[]> initialRows = CLUSTER.coordinator(1).executeWithPaging(select, ConsistencyLevel.ALL, 1);
>     assertRows(initialRows, row(0, 2, 1, null));
> }
> {noformat}
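
To make the "large multiple of the LIMIT" concern concrete, here is a rough back-of-the-envelope sketch. It is a toy model only, not Cassandra's actual short-read protection logic: the class name, the method, and the {{followUpBatch}} parameter are all hypothetical, and it assumes a single replica that initially returns at most LIMIT rows and then serves a fixed-size batch per follow-up read. It simply counts how many follow-up reads are needed before every match in the partition has been consumed.

```java
public class SrpAmplificationSketch
{
    /**
     * Toy model (hypothetical, not Cassandra code): counts the follow-up reads one
     * replica would receive if the coordinator keeps asking for more rows until it
     * has consumed every match in the partition.
     *
     * @param matchesInPartition total matching rows the replica holds for the partition
     * @param limit              the query LIMIT, which caps the initial response
     * @param followUpBatch      assumed number of rows requested per follow-up read
     */
    static int followUpReads(int matchesInPartition, int limit, int followUpBatch)
    {
        if (matchesInPartition < 0 || limit <= 0 || followUpBatch <= 0)
            throw new IllegalArgumentException("matches must be >= 0; limit and followUpBatch must be > 0");

        // The initial response is capped at the limit.
        int received = Math.min(matchesInPartition, limit);
        int reads = 0;

        // Each extra read fetches at most one batch of the remaining matches.
        while (received < matchesInPartition)
        {
            received += Math.min(followUpBatch, matchesInPartition - received);
            reads++;
        }
        return reads;
    }

    public static void main(String[] args)
    {
        // 1000 matches, LIMIT 10, 10 rows per follow-up read
        System.out.println(followUpReads(1000, 10, 10)); // prints 99
        // Fewer matches than the limit: nothing left to fetch
        System.out.println(followUpReads(5, 10, 10));    // prints 0
    }
}
```

Under these assumptions the number of extra reads grows linearly with the ratio of matches to LIMIT, which is the "flurry" described in the issue. A replica that returns fewer rows than the LIMIT needs no follow-up at all in this model.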
--
This message was sent by Atlassian Jira
(v8.20.10#820010)