[ https://issues.apache.org/jira/browse/CASSANDRA-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Adamson updated CASSANDRA-18515: ------------------------------------- Test and Documentation Plan: The latest circle-ci test run is here: https://app.circleci.com/pipelines/github/mike-tr-adamson/cassandra/154/workflows/b22cf67c-1e71-4c93-bdc8-c2c1ab6c3773 Status: Patch Available (was: In Progress) > Optimize Initial Concurrency Selection for Range Read Algorithm During SAI > Queries > ---------------------------------------------------------------------------------- > > Key: CASSANDRA-18515 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18515 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index > Reporter: Mike Adamson > Assignee: Mike Adamson > Priority: Normal > > The range read algorithm relies on the Index API’s notion of estimated result > rows to decide how many replicas to contact in parallel during its first > round of requests. The more results expected from a replica for a token > range, the fewer replicas the range read will initially try to contact. Like > SASI, SAI floors that estimate to a huge negative number to make sure it’s > selected over other indexes, and this floors the concurrency factor to 1. The > actual formula looks like this: > {code:java} > // resultsPerRange, from SAI, is a giant negative number > concurrencyFactor = Math.max(1, Math.min(ranges.rangeCount(), (int) > Math.ceil(command.limits().count() / resultsPerRange))); > {code} > Although that concurrency factor is updated as actual results stream in, only > sending a single range request to a single replica in every case for SAI is > not ideal. For example, assume I have a 3 node cluster and a keyspace at > RF=1, with 10 rows spread across the 3 nodes, without vnodes. Issuing a query > that matches all 10 rows with a LIMIT of 10 will make 2 or 3 serial range > requests from the coordinator, one to each of the 3 nodes. > This can be fixed by allowing indexes to bypass the initial concurrency > calculation allowing SAI queries to contact the entire ring in a single round > of queries, or at worst the minimum number of rounds as bounded by the > existing statutory maximum ranges per round. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org