[jira] [Commented] (CASSANDRA-6976) Determining replicas to query is very slow with large numbers of nodes or vnodes

Ariel Weisberg (JIRA) Wed, 05 Nov 2014 15:43:57 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199384#comment-14199384
 ]


Ariel Weisberg commented on CASSANDRA-6976:
-------------------------------------------

The ticket was originally based on a performance issue with a relatively small 
number of tokens (15 * 256), which I was unable to reproduce once things were 
warmed up. I did reproduce the 100 millisecond number for the first handful of 
invocations.

I think that for the kinds of queries we are talking about, the performance of 
getRestrictedRange isn't going to be an issue, because it's a small slice of 
the query pie compared to getRangeSlice. Testing with CCM and 3 nodes, the 
time spent in getRangeSlice was 10x the time spent in getRestrictedRange, and 
it also seemed to grow linearly with the number of tokens.

The catch is that I think generating the list of ranges lazily is only a real 
win in the cases where we aren't going to materialize the entire list anyway. 
The only time I think that happens is when there is a limit. If we are going 
to iterate the entire list, it could be faster to materialize it up front.

I think getRangeSlice might not be that slow if there is a limit in play 
because it will bail out earlier, but I haven't tested it. That might make 
getRestrictedRange look more expensive with large numbers of tokens.

Constructing the ring iterator is cheap; it's just assembling an object. Making 
generation of the list lazy looks doable to me. The list is accessed by index, 
but the accesses always move forward.
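
To illustrate what lazy generation could look like, here is a minimal sketch. 
It is not the Cassandra code: the class name LazyRestrictedRanges, the take 
helper, and the use of plain long tokens are all assumptions made for 
illustration, and it only handles a non-wrapping range (left, right]. 
Sub-ranges are produced one at a time, moving forward through the sorted 
tokens, so a caller with a limit can stop early.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical sketch: lazily splits a non-wrapping query range (left, right]
 * at each ring token that falls strictly inside it. Each sub-range is
 * returned as a two-element array {leftExclusive, rightInclusive}.
 */
final class LazyRestrictedRanges implements Iterator<long[]> {
    private final long[] sortedTokens; // ring tokens, assumed ascending and distinct
    private final long right;          // inclusive end of the query range
    private long left;                 // exclusive start of the next sub-range
    private int nextToken;             // index of the next candidate split token
    private boolean done;

    LazyRestrictedRanges(long[] sortedTokens, long left, long right) {
        if (left >= right) {
            throw new IllegalArgumentException("wrapping or empty ranges are not handled in this sketch");
        }
        this.sortedTokens = sortedTokens;
        this.left = left;
        this.right = right;
        // Locate the first token strictly greater than 'left'.
        int idx = Arrays.binarySearch(sortedTokens, left);
        this.nextToken = idx >= 0 ? idx + 1 : -idx - 1;
    }

    @Override
    public boolean hasNext() {
        return !done;
    }

    @Override
    public long[] next() {
        if (done) {
            throw new NoSuchElementException();
        }
        if (nextToken < sortedTokens.length && sortedTokens[nextToken] < right) {
            // A token lies strictly inside the remaining range: split there.
            long split = sortedTokens[nextToken++];
            long[] range = {left, split};
            left = split;
            return range;
        }
        // No further split point: emit the final piece and finish.
        done = true;
        return new long[] {left, right};
    }

    /** Pulls at most 'limit' sub-ranges, leaving the rest ungenerated. */
    static List<long[]> take(Iterator<long[]> ranges, int limit) {
        List<long[]> out = new ArrayList<>();
        while (out.size() < limit && ranges.hasNext()) {
            out.add(ranges.next());
        }
        return out;
    }

    public static void main(String[] args) {
        long[] tokens = {10L, 20L, 30L, 40L};
        for (long[] r : take(new LazyRestrictedRanges(tokens, 5L, 35L), 2)) {
            System.out.println("(" + r[0] + ", " + r[1] + "]");
        }
    }
}
```

With a limit, only the sub-ranges actually consumed are ever built. When the 
whole list is going to be iterated anyway, the per-call overhead of an 
iterator like this may well cost more than materializing everything up front.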

> Determining replicas to query is very slow with large numbers of nodes or 
> vnodes
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6976
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6976
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 2.1.2
>
>         Attachments: GetRestrictedRanges.java, jmh_output.txt, 
> jmh_output_murmur3.txt, make_jmh_work.patch
>
>
> As described in CASSANDRA-6906, this can be ~100ms for a relatively small 
> cluster with vnodes, which is longer than it will spend in transit on the 
> network. This should be much faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
