[ 
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784290#comment-13784290
 ] 

Jonathan Ellis commented on CASSANDRA-1337:
-------------------------------------------

Pushed some super minor cleanup to 
https://github.com/jbellis/cassandra/commits/1337.

The question left in my mind is whether we want to shoot for exactly enough 
concurrent requests, on average. That would imply that roughly half the time 
we need to do an extra round. ISTM we probably want to give ourselves a 
margin of error.
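
To make the trade-off concrete, here is a minimal sketch in Python. It is not the code in the patch: the function name `concurrency_factor`, its parameters, and the `margin` multiplier are all hypothetical, and it assumes we already have an estimate of matching rows per range. With `margin=1.0` we aim for exactly enough requests on average; a larger value over-provisions the first round to make an extra round less likely.

```python
import math


def concurrency_factor(limit, est_rows_per_range, total_ranges, margin=1.0):
    """Return how many ranges to query in the first round.

    Hypothetical helper for illustration only; none of these names come
    from the Cassandra code base.
    """
    if est_rows_per_range <= 0:
        # Assumption: with no matching rows expected per range, every
        # range has to be queried anyway, so ask for all of them at once.
        return total_ranges
    # Ranges needed to satisfy `limit` on average, scaled by the margin.
    needed = math.ceil(limit * margin / est_rows_per_range)
    # Always query at least one range and never more than exist.
    return max(1, min(total_ranges, needed))


if __name__ == "__main__":
    print(concurrency_factor(100, 10, 50))              # exactly enough on average
    print(concurrency_factor(100, 10, 50, margin=1.5))  # with a 50% margin
```

The choice of margin trades wasted subqueries in the common case against the latency of a second round when the estimate turns out to be too optimistic.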

> parallelize fetching rows for low-cardinality indexes
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.1
>
>         Attachments: 0001-Concurrent-range-and-2ary-index-subqueries.patch, 
> 1137-bugfix.patch, 1337.patch, 1337-v4.patch, 
> ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt,
>  CASSANDRA-1337.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, we read the indexed rows from the first node (in partitioner 
> order); if that does not have enough matching rows, we read the rows from the 
> next, and so forth.
> We should use the statistics from CASSANDRA-1155 to query multiple nodes in 
> parallel, such that we have a high chance of getting enough rows w/o having 
> to do another round of queries (but, if our estimate is incorrect, we do need 
> to loop and do more rounds until we have enough data or we have fetched from 
> each node).
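
The loop described above can be sketched roughly as follows. This is an illustrative Python simulation, not Cassandra's implementation: `fetch_rows`, `ranges`, and `batch_size` are hypothetical names, each inner list stands in for the matching rows held by one node, and the subqueries inside a round are run sequentially here where a real coordinator would issue them in parallel.

```python
def fetch_rows(ranges, limit, batch_size):
    """Collect up to `limit` rows, querying `batch_size` nodes per round.

    `ranges` is a list of lists in partitioner order; each inner list is a
    stand-in for one node's matching rows (an assumption for illustration).
    Returns the rows found and the number of rounds it took.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    rows = []
    rounds = 0
    next_range = 0
    # Keep going until we have enough data or have fetched from each node.
    while len(rows) < limit and next_range < len(ranges):
        batch = ranges[next_range:next_range + batch_size]
        # One round: results are merged in partitioner order.
        for node_rows in batch:
            rows.extend(node_rows)
        next_range += len(batch)
        rounds += 1
    return rows[:limit], rounds


if __name__ == "__main__":
    nodes = [[1, 2], [3], [], [4, 5, 6]]
    print(fetch_rows(nodes, limit=5, batch_size=1))  # one node per round
    print(fetch_rows(nodes, limit=5, batch_size=2))  # two nodes per round
```

Setting `batch_size=1` reproduces the current one-node-at-a-time behavior, while a larger value reduces the number of rounds at the cost of possibly fetching rows that are then discarded.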



--
This message was sent by Atlassian JIRA
(v6.1#6144)
