[jira] [Commented] (CASSANDRA-9231) Support Routing Key as part of Partition Key

Jeremiah Jordan (JIRA) Tue, 09 Feb 2016 08:27:48 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139147#comment-15139147
 ]


Jeremiah Jordan commented on CASSANDRA-9231:
--------------------------------------------

I think we probably have other issues besides CASSANDRA-9754 to solve before 
multi-GB partitions are viable. Won't there still be operational issues around 
repairing and compacting them?

> Support Routing Key as part of Partition Key
> --------------------------------------------
>
>                 Key: CASSANDRA-9231
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9231
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Matthias Broecheler
>
> Provide support for sub-dividing the partition key into a routing key and a 
> non-routing key component. Currently, all columns that make up the partition 
> key of the primary key are also routing keys, i.e. they determine which nodes 
> store the data. This proposal would give the data modeler the ability to 
> designate only a subset of the columns that comprise the partition key to be 
> routing keys. The non-routing key columns of the partition key identify the 
> partition but are not used to determine where to store the data.
> Consider the following example table definition:
> CREATE TABLE foo (
>   a int,
>   b int,
>   c int,
>   d int,
>   PRIMARY KEY (([a], b), c));
> (a,b) is the partition key, c is the clustering key, and d is just a column. 
> In addition, the square brackets identify the routing key as column a. This 
> means that only the value of column a is used to determine the node for data 
> placement (i.e. only the value of column a is murmur3 hashed to compute the 
> token). In addition, column b is needed to identify the partition but does 
> not influence the placement.
> This has the benefit that all rows with the same routing key (but potentially 
> different non-routing key columns of the partition key) are stored on the 
> same node and that knowledge of such co-locality can be exploited by 
> applications built on top of Cassandra.
> Currently, the only way to achieve co-locality is within a partition. 
> However, this approach has the limitations that: a) there are theoretical and 
> (more importantly) practical limitations on the size of a partition and b) 
> rows within a partition are ordered and an index is built to exploit such 
> ordering. For large partitions that overhead is significant if ordering isn't 
> needed.
> In other words, routing keys afford a simple means to achieve scalable 
> node-level co-locality without ordering while clustering keys afford 
> page-level co-locality with ordering. As such, they address different 
> co-locality needs giving the data modeler the flexibility to choose what is 
> needed for their application.
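
To make the placement rule in the quoted description concrete, here is a 
minimal sketch of routing on column a alone while still identifying the 
partition by (a, b). It is an illustration under stated assumptions, not how 
Cassandra implements anything: MD5 stands in for Murmur3 (which is not in the 
Python standard library), the three node names and their evenly spaced tokens 
are made up, replication is ignored, and the helpers token, build_ring, owner 
and place are hypothetical names.

```python
import hashlib
from bisect import bisect_left

MIN_TOKEN = -2**63   # lower bound of a signed 64-bit token range
RING_SIZE = 2**64


def token(routing_values):
    """Hash only the routing-key columns to a signed 64-bit token.

    Assumption: MD5 stands in for Murmur3 purely for illustration.
    """
    data = ":".join(str(v) for v in routing_values).encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big", signed=True)


def build_ring(node_names):
    """Assign evenly spaced tokens to hypothetical nodes, sorted by token."""
    step = RING_SIZE // len(node_names)
    return sorted((MIN_TOKEN + i * step, name) for i, name in enumerate(node_names))


def owner(ring, tok):
    """Return the node whose token is the first one >= tok, wrapping around."""
    tokens = [t for t, _ in ring]
    return ring[bisect_left(tokens, tok) % len(ring)][1]


def place(ring, a, b):
    """For PRIMARY KEY (([a], b), c): route on a alone, identify by (a, b)."""
    return owner(ring, token((a,))), (a, b)


ring = build_ring(["node1", "node2", "node3"])
node_x, partition_x = place(ring, 1, 10)
node_y, partition_y = place(ring, 1, 20)
print(node_x == node_y)            # same routing key, so same node
print(partition_x != partition_y)  # still two distinct partitions
```

Because b never reaches the hash, every (a, b) pair that shares a value of a 
lands on the same node, which is the co-locality the proposal describes. The 
partitions themselves stay separate, so none of them has to grow large.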



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
