[ 
https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210491#comment-17210491
 ] 

Paulo Motta edited comment on CASSANDRA-13701 at 10/8/20, 10:31 PM:
--------------------------------------------------------------------

I was able to improve the runtime of vnode dtests by around 50% on my local machine 
by [making CCM start nodes in 
parallel|https://github.com/pauloricardomg/ccm/commit/3b21db1a46b596c2b4850c076e035b5251d7dc39]
 with a new flag {{-Dcassandra.init.wait_for_live_members}}.

[This 
flag|https://github.com/pauloricardomg/cassandra/commit/d03956b088e0f408ade607c55182619d593c8519]
 makes the node wait until a specified number of nodes is live *and* part of 
the ring before proceeding with bootstrap. This ensures the processes are 
started in parallel but tokens are assigned sequentially. So the first node is 
started with {{-Dcassandra.init.wait_for_live_members=0}}, the second node with 
{{-Dcassandra.init.wait_for_live_members=1}}, the third node with 
{{-Dcassandra.init.wait_for_live_members=2}} and so on.
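
As a rough illustration of the numbering scheme above, the per-node flags could 
be generated as follows. This is a minimal Python sketch, not the code from the 
linked CCM commit; the helper name {{wait_for_live_members_args}} is 
hypothetical, and only the flag name itself comes from the description above.

```python
def wait_for_live_members_args(node_count):
    """Return one JVM flag per node, in start order.

    Hypothetical helper: the node at position i (0-based) is told to wait
    until i members are live and part of the ring before it bootstraps.
    """
    return [
        f"-Dcassandra.init.wait_for_live_members={i}"
        for i in range(node_count)
    ]


# Show the flag each node of a three-node cluster would be started with.
for position, flag in enumerate(wait_for_live_members_args(3), start=1):
    print(f"node{position}: {flag}")
```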

It's a bit hacky, but it seems to improve runtimes significantly since we can 
parallelize a big chunk of the startup time. I'm running this on a very slow 
machine, so we might get nicer improvements on better CI machines.
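
To make the "parallel start, sequential token assignment" idea concrete, here 
is a toy model using threads. It is only a sketch under simplifying 
assumptions: the function {{simulate_cluster_start}} and the in-memory 
{{ring}} list are made up for illustration and do not reflect how Cassandra or 
CCM implement the wait.

```python
import threading


def simulate_cluster_start(node_count):
    """Toy model: every node starts at once, but joins the ring in order."""
    ring = []                      # records the order in which nodes joined
    cond = threading.Condition()   # guards `ring` and signals new members

    def node(wait_for_live_members):
        # Startup work (JVM boot, config loading, ...) would run here,
        # concurrently across all nodes.
        with cond:
            # Block until enough members are already in the ring.
            cond.wait_for(lambda: len(ring) >= wait_for_live_members)
            ring.append(wait_for_live_members)
            cond.notify_all()

    threads = [
        threading.Thread(target=node, args=(i,)) for i in range(node_count)
    ]
    # Start in reverse order to show that start order does not matter.
    for thread in reversed(threads):
        thread.start()
    for thread in threads:
        thread.join()
    return ring


print(simulate_cluster_start(4))
```

Each simulated node can only join once exactly the nodes before it have 
joined, so the join order is fixed even though all threads run concurrently.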

The good news is that in the non-vnode case the tokens are assigned manually 
via CCM, so nodes don't need to start sequentially and the runtimes for that 
case are unchanged.

[~e.dimitrova] would you (or someone with CI access) mind re-running the tests 
above with the branches below to see how the runtimes look with this change?
 * [cassandra|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-13701]
 * 
[dtest|https://github.com/pauloricardomg/cassandra-dtest/tree/CASSANDRA-13701]
 * [ccm|https://github.com/pauloricardomg/ccm/tree/CASSANDRA-13701]

(cc [~mck] since this is related to CASSANDRA-16079)


was (Author: pauloricardomg):
I was able to improve runtime of a few vnode dtests by around 50% by [making 
CCM start nodes in 
parallel|https://github.com/pauloricardomg/ccm/commit/3b21db1a46b596c2b4850c076e035b5251d7dc39]
 with a new flag {{-Dcassandra.init.wait_for_live_members}}.


> Lower default num_tokens
> ------------------------
>
>                 Key: CASSANDRA-13701
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13701
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Chris Lohfink
>            Assignee: Alexander Dejanovski
>            Priority: Low
>             Fix For: 4.0-alpha
>
>
> For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not 
> necessary. It is very expensive for operations processes and scanning. It's 
> come up a lot, and it's pretty standard and known now within the community 
> to always reduce num_tokens. We should just lower the defaults.


