[ https://issues.apache.org/jira/browse/CASSANDRA-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kurt Greaves updated CASSANDRA-13857: ------------------------------------- Summary: Allow MV with only partition key (was: Allow MV with same PK but a different filtering) > Allow MV with only partition key > -------------------------------- > > Key: CASSANDRA-13857 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13857 > Project: Cassandra > Issue Type: Bug > Reporter: Kurt Greaves > > We currently disallow creation of a view that has the exact same primary key > as the base where no clustering keys are present, however a potential use > case would be a view where part of the PK is filtered so as to have a subset > of data in the view which is faster for range queries. We actually currently > allow this, but only if you have a clustering key defined. If you only have a > partitioning key it's not possible. > From the mailing list, the below example works: > {code:java} > CREATE TABLE users ( > site_id int, > user_id text, > n int, > data set<frozen<text>>, > PRIMARY KEY ((site_id, user_id), n)); > user data is updated and read by PK and sometimes I have to fetch all user > for some specific site_id. It appeared that full scan by > token(site_id,user_id) filtered by WHERE site_id = <some id> works much > slower than unfiltered full scan on > CREATE MATERIALIZED VIEW users_1 AS > SELECT site_id, user_id, n, data > FROM users > WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL > PRIMARY KEY ((site_id, user_id), n); > {code} > However the following does not: > {code:java} > CREATE TABLE users ( > site_id int, > user_id text, > data set<text>, > PRIMARY KEY ((site_id, user_id))); > CREATE MATERIALIZED VIEW users_1 AS > SELECT site_id, user_id, data > FROM users > WHERE site_id = 1 AND user_id IS NOT NULL > PRIMARY KEY ((site_id, user_id)); > InvalidRequest: Error from server: code=2200 [Invalid query] message="No > columns are defined for Materialized View other than primary key" > {code} > This is because if the clustering key is empty we assume they've only defined > the primary key in the partition key and we haven't accounted for this use > case. > On that note, we also don't allow the following narrowing of the partition > key: > {code} > CREATE TABLE kurt.base ( > id int, > uid text, > data text, > PRIMARY KEY (id, uid) > ) > CREATE MATERIALIZED VIEW kurt.mv2 AS SELECT * from kurt.base where id IS NOT > NULL and uid='1' PRIMARY KEY ((id, uid)); > InvalidRequest: Error from server: code=2200 [Invalid query] message="No > columns are defined for Materialized View other than primary key" > {code} > But we do allow the following, which works because there is still a > clustering key, despite not changing the PK. > {code} > CREATE MATERIALIZED VIEW kurt.mv2 AS SELECT * from kurt.base where id IS NOT > NULL and uid='1' PRIMARY KEY (id, uid); > {code} > And we also allow the following, which is a narrowing of the partition key as > above, but with an extra clustering key. > {code} > create table kurt.base3 (id int, uid int, clus1 int, clus2 int, data text, > PRIMARY KEY ((id, uid), clus1, clus2)); > CREATE MATERIALIZED VIEW kurt.mv4 AS SELECT * from kurt.base3 where id IS NOT > NULL and uid IS NOT NULL and clus1 IS NOT NULL AND clus2 IS NOT NULL PRIMARY > KEY ((id, uid, clus1), clus2); > {code} > I _think_ supporting these cases is trivial and mostly already handled in the > underlying MV write path, so we might be able to get away with just a simple > change of [this > condition|https://github.com/apache/cassandra/blob/83822d12d87dcb3aaad2b1e670e57ebef4ab1c36/src/java/org/apache/cassandra/cql3/statements/CreateViewStatement.java#L291]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org