On 20/03/2017 02:35, S G wrote:
2)
https://docs.datastax.com/en/developer/java-driver/3.1/manual/statements/prepared/
tells me to avoid preparing select queries if I expect a change of
columns in my table down the road.
The problem is also related to select *, which is considered bad practice with most databases...

I did some more testing to see if my client machines were the bottleneck.
For a 6-node Cassandra cluster (each VM having 8 cores), I got 26,000
reads/sec for all of the following:
1) Client nodes:1, Threads: 60
2) Client nodes:3, Threads: 180
3) Client nodes:5, Threads: 300
4) Client nodes:10, Threads: 600
5) Client nodes:20, Threads: 1200

So adding more client nodes, or more threads on those client nodes, has
no effect.
I suspect Cassandra is simply not allowing me to go any further.
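
One way to read these numbers is through Little's Law (requests in flight = throughput x mean latency). The sketch below is only an estimate, and it assumes each client thread issues one blocking read at a time and is never idle, which the thread does not confirm. Under that assumption, a flat 26,000 reads/sec means the per-read latency must grow in proportion to the thread count. The function name is illustrative.

```python
# Little's Law: in-flight requests = throughput * mean latency.
# Assumption (not stated in the thread): every client thread issues one
# blocking read at a time and is never idle.

THROUGHPUT = 26_000  # reads/sec, as reported for every configuration


def implied_latency_ms(threads: int, throughput: float = THROUGHPUT) -> float:
    """Mean per-read latency (ms) implied by a fixed throughput."""
    return threads / throughput * 1000.0


# Thread counts taken from the five test configurations above.
for threads in (60, 180, 300, 600, 1200):
    print(f"{threads:>5} threads -> ~{implied_latency_ms(threads):.1f} ms per read")
```

If the latencies measured on the clients rise roughly like this as threads are added, the requests are probably queuing somewhere on the server side rather than on the clients.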
> Primary keys for my schema are:
>      PRIMARY KEY((name, phone), age)
> name: text
> phone: int
> age: int

Yes, with such a PK the data must be spread over the whole cluster (also taking the partitioner into account), so it is strange that the throughput doesn't scale.
I guess you have also verified that you select data randomly?

Maybe you could have a look at the system traces to see the query plan for some requests. If you are on a test cluster you can truncate the tables first (truncate system_traces.sessions; and truncate system_traces.events;), run a test, then run
select * from system_traces.events where session_id = xxxx
xxxx being one of the sessions you pick in system_traces.sessions.

Check whether you are always hitting the same nodes.
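
As a rough sketch of that check: once you have pulled the source column out of system_traces.events (or the coordinator column out of system_traces.sessions), you can tally how the trace rows are spread across nodes. The helper name and the sample addresses below are made up for illustration; how you fetch the rows is up to you.

```python
from collections import Counter


def node_share(sources):
    """Return {node: fraction of trace rows} for a list of node addresses."""
    counts = Counter(sources)
    total = sum(counts.values())
    return {node: n / total for node, n in counts.items()}


# Hypothetical sample: `source` values copied from a few trace event rows.
sample = ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1"]

for node, share in sorted(node_share(sample).items()):
    print(f"{node}: {share:.0%}")
```

With randomly chosen partition keys on a 6-node cluster you would expect the shares to be roughly even; a heavily skewed result would suggest the reads keep landing on the same replicas.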


--
best,
Alain
