[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626760#comment-14626760 ]
Tupshin Harper commented on CASSANDRA-6477:
-------------------------------------------

I find myself disagreeing with the hard requirement that all rows in the table must show up in the materialized views. While it would be nice, I believe that clearly documenting the limitation and providing a couple of reasonable choices is far preferable to handing users enough rope to hang themselves.

My suggestion:
* Create a formal notion of NOT NULL columns in the schema that can be applied to a table, irrespective of any MV usage.
* Columns that are NOT NULL would have the exact same restrictions as PK columns, namely that they need to be included in all inserts and updates (with the possible exception of LWT updates).
* Document (and warn in cqlsh) the fact that if you create a MV with a PK using a nullable column from the table, then rows where that column is null will not be in the view.

It seems to me that this is far less dangerous (and in many ways less surprising) than automatically creating a hotspot in the MV because lots of data with NULLs gets added.

Now with 8099 supporting NULLs for clustering columns, this might only apply to columns that would be a partition key in the MV, and that seems appealing. But I can't talk myself into liking inserting nulls into a MV partition key.

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.0 beta 1
>
>         Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the
> index across the cluster is a Good Thing. However, for high-cardinality
> data, local indexes require querying most nodes in the cluster even if only a
> handful of rows is returned.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
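
To make the trade-off in the comment above concrete, here is a minimal Python sketch of the two behaviours being compared. It is a toy in-memory model, not Cassandra code: the names `BASE_ROWS`, `validate_write`, and `build_view`, and the `email` column, are all hypothetical and chosen only for illustration.

```python
# Toy model only: hypothetical names, not Cassandra's implementation.
BASE_ROWS = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": None},
]


def validate_write(row, not_null_columns):
    """Reject a write that omits or nulls a NOT NULL column (the proposed rule)."""
    for col in not_null_columns:
        if row.get(col) is None:
            raise ValueError(f"column {col!r} is NOT NULL and must be set")


def build_view(rows, view_key, skip_nulls):
    """Group base-row ids by the value of the view's partition key column.

    skip_nulls=True  models the suggestion: rows with a null key are left out.
    skip_nulls=False models the alternative: all null-keyed rows share one
    partition, which is the hotspot the comment warns about.
    """
    view = {}
    for row in rows:
        key = row[view_key]
        if key is None and skip_nulls:
            continue
        view.setdefault(key, []).append(row["id"])
    return view


if __name__ == "__main__":
    print(build_view(BASE_ROWS, "email", skip_nulls=True))
    print(build_view(BASE_ROWS, "email", skip_nulls=False))
```

With `skip_nulls=False`, every base row lacking an `email` lands under the single `None` key, so that one partition grows with the number of null rows. Declaring the column NOT NULL and running `validate_write` on each insert would rule out such rows at write time.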