[ https://issues.apache.org/jira/browse/CASSANDRA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139147#comment-15139147 ]
Jeremiah Jordan commented on CASSANDRA-9231:
--------------------------------------------

I think we probably have other issues to solve besides CASSANDRA-9754 for multi-GB partitions to be viable. Are you not still going to have operational issues around repairing and compacting them?

> Support Routing Key as part of Partition Key
> --------------------------------------------
>
>                 Key: CASSANDRA-9231
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9231
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Matthias Broecheler
>
> Provide support for sub-dividing the partition key into a routing key and a
> non-routing key component. Currently, all columns that make up the partition
> key of the primary key are also routing keys, i.e. they determine which nodes
> store the data. This proposal would give the data modeler the ability to
> designate only a subset of the columns that comprise the partition key to be
> routing keys. The non-routing key columns of the partition key identify the
> partition but are not used to determine where to store the data.
> Consider the following example table definition:
> CREATE TABLE foo (
>   a int,
>   b int,
>   c int,
>   d int,
>   PRIMARY KEY (([a], b), c )
> );
> (a,b) is the partition key, c is the clustering key, and d is just a column.
> In addition, the square brackets identify the routing key as column a. This
> means that only the value of column a is used to determine the node for data
> placement (i.e. only the value of column a is murmur3 hashed to compute the
> token). Column b is still needed to identify the partition but does
> not influence the placement.
> This has the benefit that all rows with the same routing key (but potentially
> different non-routing key columns of the partition key) are stored on the
> same node, and that knowledge of such co-locality can be exploited by
> applications built on top of Cassandra.
> Currently, the only way to achieve co-locality is within a partition.
> However, this approach has the limitations that: a) there are theoretical and
> (more importantly) practical limitations on the size of a partition and b)
> rows within a partition are ordered and an index is built to exploit such
> ordering. For large partitions that overhead is significant if ordering isn't
> needed.
> In other words, routing keys afford a simple means to achieve scalable
> node-level co-locality without ordering while clustering keys afford
> page-level co-locality with ordering. As such, they address different
> co-locality needs, giving the data modeler the flexibility to choose what is
> needed for their application.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
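To make the placement rule in the quoted proposal concrete, here is a minimal Python sketch of how the token might be derived under today's behaviour versus the proposed routing key. It is only an illustration under stated assumptions: the four-node ring, the helper names (token, owner, place_current, place_proposed) and the use of MD5 as a stand-in for Murmur3 are all hypothetical and do not reflect Cassandra's actual partitioner code.

```python
import hashlib
from bisect import bisect_left


def token(values):
    """Hash a tuple of column values to a signed 64-bit token.

    Assumption: MD5 is used here only as a stand-in for Murmur3.
    """
    data = "\x00".join(str(v) for v in values).encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big", signed=True)


# Hypothetical 4-node ring; each node owns the token range ending at its token.
RING = sorted((token(("node", i)), f"node{i}") for i in range(4))
RING_TOKENS = [t for t, _ in RING]


def owner(tok):
    """Return the node owning tok: the first ring token >= tok, wrapping around."""
    idx = bisect_left(RING_TOKENS, tok) % len(RING)
    return RING[idx][1]


def place_current(a, b):
    """Current behaviour: every partition key column (a, b) feeds the token."""
    return owner(token((a, b)))


def place_proposed(a, b):
    """Proposed behaviour: only routing key column a feeds the token.

    Column b still identifies the partition but does not affect placement.
    """
    return owner(token((a,)))


if __name__ == "__main__":
    # Partitions (a=7, b=0..9) always share a node under the proposal;
    # under the current scheme their placement depends on b as well.
    print("proposed:", {place_proposed(7, b) for b in range(10)})
    print("current: ", {place_current(7, b) for b in range(10)})
```

With this sketch, every partition that shares a value of a resolves to the same owner under place_proposed regardless of b, which is the node-level co-locality the description asks for.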