Thanks, Robert, for your input, but that sounds a little crazy to me. The
physical node is still the same, so why can't it do a single indexed scan over
all the token ranges (vnodes), contiguous or not, held by that
physical node? I suspect that it needs to respect token order for "some
reason", hence the multiple scans.

It would be great if you or someone else could help clarify the questions
below (in the context of the trace output):
>>

When the coordinator fires an indexed scan request to node 192.168.51.22, why
doesn't it ask that node to check all of its (at least primary) ranges for
the queried data at once? Also, internally that node should be able to
do a single scan through all of the ranges it holds, shouldn't it?
(e.g. [min(-9223372036854775808), max(-9193352069377957523)] and
(max(-9136021049555745100), max(-8959555493872108621)], etc.)

It seems it needs to query data in token order. So the first range,
[min(-9223372036854775808), max(-9193352069377957523)], is on 192.168.51.22.
But the next range, (max(-9193352069377957523), max(-9136021049555745100)],
is on 192.168.51.25, so it fires a query there. Then the next range,
(max(-9136021049555745100), max(-8959555493872108621)], is again on
192.168.51.22. Btw, I'm not too sure about the min/max vs. max/max notation in
the trace output.
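
To make the question concrete, here is a toy Python model of the two
strategies. It is only a sketch of my understanding, not Cassandra's actual
code: the function names are made up, and the ranges and owners are simply
copied from the trace snippet further down. It counts how many requests the
coordinator would send if it walks the ranges in token order versus if it
grouped the ranges by owning node first.

```python
from itertools import groupby

# (start_token, end_token, owner), copied from the trace output.
RANGES = [
    (-9223372036854775808, -9193352069377957523, "192.168.51.22"),
    (-9193352069377957523, -9136021049555745100, "192.168.51.25"),
    (-9136021049555745100, -8959555493872108621, "192.168.51.22"),
    (-8959555493872108621, -8929774302283364912, "192.168.51.25"),
]


def requests_in_token_order(ranges):
    """Hypothetical model: walk ranges by start token and send one request
    per run of consecutive ranges that share an owner."""
    ordered = sorted(ranges, key=lambda r: r[0])
    return [
        (owner, [(start, end) for start, end, _ in run])
        for owner, run in groupby(ordered, key=lambda r: r[2])
    ]


def requests_grouped_by_node(ranges):
    """Hypothetical alternative: one request per node covering all of the
    ranges it owns, ignoring token order."""
    grouped = {}
    for start, end, owner in ranges:
        grouped.setdefault(owner, []).append((start, end))
    return list(grouped.items())


token_order = requests_in_token_order(RANGES)
by_node = requests_grouped_by_node(RANGES)
print(len(token_order), "requests when walking in token order")
print(len(by_node), "requests when grouping by node")
```

Because the owners alternate between 192.168.51.22 and 192.168.51.25 in this
slice of the ring, walking in token order never finds two adjacent ranges on
the same node, which matches the four separate scans in the trace.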

I found the comment below in
https://issues.apache.org/jira/browse/CASSANDRA-4858.
"The problem is that we have to scan the nodes in token order so we dont
break the existing API's, if we do so then we are sending a lot more
requests and waiting for the response than the number of nodes. "
I don't understand the restriction, though: "don't break the existing API's".

Without vnodes, it queries a particular node only once. Btw, I understand that
in the worst case a secondary index query sometimes has to scan all the
nodes in the cluster (empty table or high-cardinality index?), but I
don't understand why vnodes make it scan the *same node* multiple times.
I see this behavior even when RF is 1.

>> Snippet from output1.txt attached earlier:
Executing indexed scan for [min(-9223372036854775808),
max(-9193352069377957523)] | 23:11:30,992 | 192.168.51.22 |
Executing indexed scan for (max(-9193352069377957523),
max(-9136021049555745100)] | 23:11:30,998 | 192.168.51.25 |
Executing indexed scan for (max(-9136021049555745100),
max(-8959555493872108621)] | 23:11:30,999 | 192.168.51.22 |
Executing indexed scan for (max(-8959555493872108621),
max(-8929774302283364912)] | 23:11:31,000 | 192.168.51.25 |


On Fri, Sep 19, 2014 at 2:54 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Sep 19, 2014 at 2:19 PM, DuyHai Doan <doanduy...@gmail.com> wrote:
>
>>  But does it implies that with vnodes, there are actually "extra work" to
>> do for scanning indices ?
>>
>
> Vnodes are just nodes, so they have all the
> problems-associated-with-many-nodes one would get with 256x as many nodes.
>
> =Rob
>
>
