[ https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616743#comment-14616743 ]
Brian Hess commented on CASSANDRA-9415: ---------------------------------------- Wouldn't the preparing of the statement handle that? > Implicit use of Materialized Views on SELECT > -------------------------------------------- > > Key: CASSANDRA-9415 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9415 > Project: Cassandra > Issue Type: Improvement > Reporter: Brian Hess > Labels: ponies > > CASSANDRA-6477 introduces Materialized Views. This greatly simplifies the > write path for the best-practice of "query tables". But it does not simplify > the read path as much as our users want/need. > We suggest to folks to create multiple copies of their base table optimized > for certain queries - hence "query table". For example, we may have a USER > table with two type of queries: lookup by userid and lookup by email address. > We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL. Both > would have the exact same schema, with the same PRIMARY KEY columns, but > different PARTITION KEY - the first would be USERID and the second would be > EMAIL. > One complicating thing with this approach is that the application now needs > to know that when it INSERT/UPDATE/DELETEs from the base table it needs to > INSERT/UPDATE/DELETE from all of the query tables as well. CASSANDRA-6477 > covers this nicely. > However, the other side of the coin is that the application needs to know > which query table to leverage based on the selection criteria. Using the > example above, if the query has a predicate such as "WHERE userid = 'bhess'", > then USERS_BY_USERID is the better table to use. Similarly, when the > predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is > appropriate. > On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" > to the collection of tables. You do operations just on the base table. It > is very attractive for the SELECT side as well. It would be very good to > allow an application to simply do "SELECT * FROM users WHERE userid = > 'bhess'" and have that query implicitly leverage the USERS_BY_USERID > materialized view. > For additional use cases, especially analytics use cases like in Spark, this > allows the Spark code to simply push down the query without having to know > about all of the MVs that have been set up. The system will route the query > appropriately. And if additional MVs are necessary to make a query run > better/faster, then those MVs can be set up and Spark will implicitly > leverage them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)