Re: horizontal query scaling issues follow on

Diane Griffith Thu, 17 Jul 2014 17:17:23 -0700

So I stripped out the information about the number-of-clients experiments.
It is unclear whether I can only show horizontal scaling by also spawning
many client requests all working at once, so I left that out to distill
what our original attempt at showing horizontal scaling was.

I did tests comparing 1, 2, 10, 20, 50, and 100 clients, all spawned and
querying at once.  Performance on 2 nodes starts to degrade from 10 clients
on.  I saw similar behavior on 4 nodes but haven't done the official runs on
that yet.

When I tried to grab the list of tokens assigned and populate it in the
cassandra.yaml I never got it right.

I basically ran the command, and it output 256 comma-separated tokens on
each node.  So I tried taking that string and setting it as the value of
initial_token, but the node wouldn't start up.

Not sure if I maybe had a carriage return in there and that was the problem.

And if I do that do I need to do more than comment out num_tokens?
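
To rule out the carriage-return theory, something like the following could
normalize the string before it goes into cassandra.yaml.  This is only a
minimal sketch: clean_token_list is a hypothetical helper name, and it
assumes the tokens are plain integers (as with Murmur3Partitioner) and that
the input is just the token text dumped by the command.

```python
def clean_token_list(raw: str) -> str:
    """Collapse raw token output into a single comma-separated line.

    Hypothetical helper: assumes every token is a plain (possibly
    negative) integer and that the input holds nothing but tokens.
    """
    # Treat commas as separators, then split on any whitespace.
    # str.split() with no arguments also splits on "\r" and "\n",
    # so stray carriage returns and line breaks are dropped here.
    parts = raw.replace(",", " ").split()

    # Fail loudly if something that is not an integer slipped in.
    for part in parts:
        int(part)  # raises ValueError on a malformed token

    return ",".join(parts)


if __name__ == "__main__":
    # Made-up tokens, only to show the shape of the input and output.
    sample = "-9223372036854775808,\r\n 42,100\r\n"
    print("initial_token: " + clean_token_list(sample))
```

The printed line has no embedded whitespace or line breaks, so if the node
still refuses to start with it, the carriage return was probably not the
cause.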

Thanks,
Diane

On Thu, Jul 17, 2014 at 6:58 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith <dfgriff...@gmail.com>
> wrote:
>
>> So do partitions equate to tokens/vnodes?
>>
>
> A partition is what used to be called a "row".
>
> Each individual token in the token ring can contain a partition, which you
> request using the token as the key.
>
> A "token range" is the space between two tokens.
>
>
>> If so we had configured all cluster nodes/vms with num_tokens: 256
>> instead of setting initial_token and assigning ranges.  I am still not
>> getting why, in Cassandra 2.0, I would assign my own ranges via
>> initial_token, and this was based on the documentation and even this blog
>> item <http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2>
>> that made it seem right for us to always configure our cluster vms with
>> num_tokens: 256 in the cassandra.yaml file.
>>
>
> If you are using vnodes and don't want to try to figure out what ideally
> random token ranges for them are, you should generally:
>
> 1) start the node with num_tokens set to a value greater than 1
> 2) once successfully bootstrapped, dump all node tokens with:
>
> nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,
>
> 3) put list from 2) in initial_token list in cassandra.yaml
> 4) (optional) restart and verify that your node has the tokens you expect
>
>> So given that I was assuming the partitions were such that it wasn't a
>> problem.  Is that an incorrect assumption and something to dig into more?
>>
>
> How many client threads do you have? Your OP suggested a low number, which
> will not give good results in terms of throughput.
>
> =Rob
>
>
