[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-10-24 Thread venky (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959267#comment-16959267
 ] 

venky commented on CASSANDRA-15260:
---

Thanks [~mck] 

> Add `allocate_tokens_for_dc_rf` yaml option for token allocation
> 
>
> Key: CASSANDRA-15260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15260
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Similar to DSE's option: {{allocate_tokens_for_local_replication_factor}}
> Currently the 
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm]
>  requires a defined keyspace and a replication factor specified in the current
> datacenter.
> This is problematic in a number of ways. The real keyspace cannot be used
> when adding new datacenters as, in practice, all of a new datacenter's nodes
> need to be up and running before it has the capacity to have data replicated
> into it. Adding new datacenters (or lift-and-shifting a cluster via datacenter
> migration) therefore has to be done using a dummy keyspace that duplicates the
> replication strategy and factor of the real keyspace. This gets even more
> difficult come version 4.0, as the replication factor cannot even be defined
> in new datacenters before those datacenters are up and running.
> These issues are removed by avoiding the keyspace definition and lookup, and
> presuming the replication strategy is by datacenter, i.e.
> NetworkTopologyStrategy (NTS). This can be done with the use of an
> {{allocate_tokens_for_dc_rf}} option.
> It may also be worth considering whether {{allocate_tokens_for_dc_rf=3}}
> should become the default, as this is the replication factor for the vast
> majority of datacenters in production. I suspect this would be a good
> improvement over the existing randomly generated tokens algorithm.
> Initial patch is available in 
> [https://github.com/thelastpickle/cassandra/commit/fc4865b0399570e58f11215565ba17dc4a53da97]
> The patch does not remove the existing {{allocate_tokens_for_keyspace}} 
> option, as that provides the codebase for handling different replication 
> strategies.
>  
> fyi [~blambov] [~jay.zhuang] [~chovatia.jayd...@gmail.com] [~alokamvenki] 
> [~alexchueshev]
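
As a rough illustration of the issue description above, the two approaches might look like this in `cassandra.yaml`. This is a sketch only: `dummy_ks` is a hypothetical keyspace name, and `allocate_tokens_for_dc_rf` is the option name as proposed in the description (the comments on this ticket discuss renaming it to match DSE's option).

```yaml
# cassandra.yaml fragment (illustrative sketch, not taken from the patch)

# Existing approach: the allocator looks up the replication settings of a
# keyspace. "dummy_ks" is a hypothetical placeholder keyspace that mirrors
# the replication strategy and factor of the real keyspace.
# allocate_tokens_for_keyspace: dummy_ks

# Proposed approach: state the replication factor of the local datacenter
# directly, so that no keyspace definition or lookup is needed.
allocate_tokens_for_dc_rf: 3
```

The intent, as described in the ticket, is that with the second form no keyspace has to exist before the nodes of a new datacenter start up.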



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-07 Thread Branimir Lambov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902109#comment-16902109
 ] 

Branimir Lambov commented on CASSANDRA-15260:
-

I will review this in a couple of days.

Meanwhile, for consistency's sake I would change the name of the option to 
match DSE's as it is doing exactly the same thing.
{quote}It may also be of value considering whether 
{{allocate_tokens_for_dc_rf=3}} becomes the default?
{quote}
Something like this with a default vnode count of 16 or 8 makes a lot of sense 
if the configuration is indeed that (a replication factor of 3). However, I 
worry that the algorithm will generate tokens that are very unsuitable for 
other configurations, or when the rack and DC information is not properly set 
up, which probably happens a lot during initial contact with Cassandra. We may 
be doing more harm than good compared with e.g. 256-vnode random choice, which 
is why I personally prefer not to change the default. I'm happy to hear other 
opinions on this, though.

 







[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-07 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902243#comment-16902243
 ] 

mck commented on CASSANDRA-15260:
-

{quote}Meanwhile, for consistency's sake I would change the name of the option 
to match DSE's as it is doing exactly the same thing.{quote}

No objection. I will fix it. The naming is a bit clumsy either way imho, but 
nothing better comes to mind, and indeed it makes sense to re-use DSE's 
terminology for an identical feature.

 

{quote}We may be doing more damage than good over e.g. 256-vnode random choice, 
…{quote}

Makes sense and is fine by me. It helps to just have these concerns, and the 
trade-off, stated somewhere. 

 







[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-09 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904095#comment-16904095
 ] 

mck commented on CASSANDRA-15260:
-

[~blambov], in the context of this thread 
https://lists.apache.org/thread.html/56435ee11852ea842443d462500277eebe76743e6657e0cfdd7d67df@%3Cdev.cassandra.apache.org%3E
would you agree with a default of `allocate_tokens_for_local_rf=3` if 
`num_tokens=16` also became the default?
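
The combination of defaults being asked about could be sketched as the following `cassandra.yaml` fragment. It assumes the renamed option carries DSE's full name, `allocate_tokens_for_local_replication_factor`, which the comment above abbreviates as `allocate_tokens_for_local_rf`. Neither value is confirmed as a default anywhere in this thread.

```yaml
# cassandra.yaml fragment (hypothetical defaults under discussion)

# Fewer vnodes per node than the 256-vnode random choice mentioned earlier
# in this thread.
num_tokens: 16

# Assumed full option name after the rename to match DSE's option;
# 3 is the replication factor of the local datacenter.
allocate_tokens_for_local_replication_factor: 3
```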







[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-12 Thread Branimir Lambov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905041#comment-16905041
 ] 

Branimir Lambov commented on CASSANDRA-15260:
-

The code looks good to me. As mentioned, it would be good to change the name to 
match DSE's.

 

On making algorithmic allocation the default, let's continue the discussion on 
CASSANDRA-13701 once this is committed.







[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-12 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905336#comment-16905336
 ] 

mck commented on CASSANDRA-15260:
-

Thanks [~blambov]. The rename is done.


||branch||circleci||asf jenkins testall||
|[CASSANDRA-15260|https://github.com/thelastpickle/cassandra/commit/4513af58a532b91ab4449161a79e70f78b7ebcfc]|[circleci|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Ftrunk__allocate_tokens_for_dc_rf]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43/]|

I've opened the ticket, and will transition it to 'Submit Patch' after I get 
some unit tests in.



