[ 
https://issues.apache.org/jira/browse/CASSANDRA-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-18515:
------------------------------------------
    Source Control Link: https://github.com/apache/cassandra/pull/2463  (was: 
https://github.com/maedhroz/cassandra/pull/13)

> Optimize Initial Concurrency Selection for Range Read Algorithm During SAI 
> Queries
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18515
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18515
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/2i Index
>            Reporter: Mike Adamson
>            Assignee: Mike Adamson
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The range read algorithm relies on the Index API’s notion of estimated result 
> rows to decide how many replicas to contact in parallel during its first 
> round of requests. The more results expected from a replica for a token 
> range, the fewer replicas the range read will initially try to contact. Like 
> SASI, SAI floors that estimate to a huge negative number to make sure it’s 
> selected over other indexes, and this floors the concurrency factor to 1. The 
> actual formula looks like this:
> {code:java}
> // resultsPerRange, from SAI, is a giant negative number
> concurrencyFactor = Math.max(1, Math.min(ranges.rangeCount(), (int) 
> Math.ceil(command.limits().count() / resultsPerRange)));
> {code}
> Although that concurrency factor is updated as actual results stream in, only 
> sending a single range request to a single replica in every case for SAI is 
> not ideal. For example, assume I have a 3 node cluster and a keyspace at 
> RF=1, with 10 rows spread across the 3 nodes, without vnodes. Issuing a query 
> that matches all 10 rows with a LIMIT of 10 will make 2 or 3 serial range 
> requests from the coordinator, one to each of the 3 nodes.
> This can be fixed by allowing indexes to bypass the initial concurrency 
> calculation allowing SAI queries to contact the entire ring in a single round 
> of queries, or at worst the minimum number of rounds as bounded by the 
> existing statutory maximum ranges per round.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to