[ https://issues.apache.org/jira/browse/GORA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204309#comment-14204309 ]
Lewis John McGibbney commented on GORA-267:
-------------------------------------------

Up-to-date code is available here:
https://github.com/lewismc/gora/compare/GORA-267?expand=1

This is not finished, and no testing has been undertaken. It has just been
brought up to date with master; however, it will not compile. This is just an
FYI.

> Cassandra composite primary key support
> ---------------------------------------
>
>                 Key: GORA-267
>                 URL: https://issues.apache.org/jira/browse/GORA-267
>             Project: Apache Gora
>          Issue Type: Improvement
>          Components: gora-cassandra
>            Reporter: c.zirp...@seeburger.de
>            Assignee: Lewis John McGibbney
>              Labels: features
>             Fix For: 0.6
>
>         Attachments: gora-267.diff
>
>
> The extension makes it possible to define primary keys that are represented
> by Avro classes. A mapping specifies how fields of the key class are mapped
> to the components of composite partition keys and composite column names.
> This gives users more control over the distribution of data into Cassandra
> database structures. It is now possible to store data in wide rows with
> custom indexes that allow for fast range scans on a single node. There is
> also no longer any need for an order-preserving partitioner, which is likely
> to compromise data distribution in the Cassandra cluster.
>
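To make the wide-row idea in the quoted description concrete, here is a minimal in-memory sketch. It is not taken from the GORA-267 patch and uses no Gora or Cassandra API. The class `WideRowSketch`, the key fields `sensorId` (partition part) and `timestamp` (cluster part), and the `:` separator are all hypothetical. A real composite comparator compares component by component, whereas this toy simply concatenates strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the layout described in the issue; NOT Gora or Cassandra code.
class WideRowSketch {
    // partition key -> (composite column name -> value); the inner TreeMap
    // keeps column names in lexical order, like columns inside one wide row.
    private final Map<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Hypothetical key: sensorId is the partition part, timestamp the cluster part.
    public void put(String sensorId, String timestamp, String field, String value) {
        // The field name is appended as the last component of the column name.
        String column = timestamp + ":" + field;
        rows.computeIfAbsent(sensorId, k -> new TreeMap<>()).put(column, value);
    }

    // Range scan inside a single partition: from <= column name < to.
    // Assumes from <= to; TreeMap.subMap throws IllegalArgumentException otherwise.
    public List<String> scan(String sensorId, String from, String to) {
        List<String> out = new ArrayList<>();
        TreeMap<String, String> row = rows.get(sensorId);
        if (row == null) {
            return out; // unknown partition: nothing to scan
        }
        for (Map.Entry<String, String> e : row.subMap(from, true, to, false).entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        WideRowSketch store = new WideRowSketch();
        store.put("sensor-a", "2014-11-02", "temp", "6");
        store.put("sensor-a", "2014-11-01", "temp", "5");
        store.put("sensor-b", "2014-11-01", "temp", "9");
        // Only the "sensor-a" partition is touched by this scan.
        System.out.println(store.scan("sensor-a", "2014-11-01", "2014-11-03"));
    }
}
```

Entities sharing a partition part land in the same sorted map, so a time-range scan touches only that one row. That is the property that removes the need for an order-preserving partitioner in this simplified picture.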
> In essence, composite primary keys with identical partition parts will be
> written in the same Cassandra row (which is essentially a partition). Within
> the same row, entities are stored in lexical order by their cluster key
> components. Avro field names are appended as the last component of the
> composite column name. The current implementation does not substitute super
> columns. Thus, complex Avro fields are still mapped to super columns. Super
> column families use the same composite primary keys as simple column
> families. As Gora always fully loads nested complex types, the use of super
> column families is not really a problem. Yet, super columns could be
> substituted by another level of column name components below the field
> qualifiers in future work. It would also be possible to rethink the
> decomposition of complex nested types beyond the first level.
>
> The implementation uses the concept of Gora partitionQueries in order to
> decompose row scanning queries into a set of queries that each operate on a
> single row. However, such a decomposition is not always possible, and real
> range scans are limited to wide rows (partitions).
>
> The implementation is fully backward compatible. Simple key classes can still
> be used, and row scans are still possible with an order-preserving
> partitioner. All current JUnit tests pass. Furthermore, I have added an
> example and some unit tests to demonstrate the use of composite primary keys
> for time series data.
>
> As mentioned earlier, we are happy to share this extension. I've created a
> jira issue for it (GORA-267) and will provide the implementation on GitHub
> (https://github.com/zirpins/gora/tree/GORA-267).
>
> Regards,
> Christian

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)