[ 
https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616743#comment-14616743
 ] 

 Brian Hess commented on CASSANDRA-9415:
----------------------------------------

Wouldn't the preparing of the statement handle that?

> Implicit use of Materialized Views on SELECT
> --------------------------------------------
>
>                 Key: CASSANDRA-9415
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9415
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter:  Brian Hess
>              Labels: ponies
>
> CASSANDRA-6477 introduces Materialized Views.  This greatly simplifies the 
> write path for the best-practice of "query tables".  But it does not simplify 
> the read path as much as our users want/need.
> We suggest to folks to create multiple copies of their base table optimized 
> for certain queries - hence "query table".  For example, we may have a USER 
> table with two type of queries: lookup by userid and lookup by email address. 
>  We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL.  Both 
> would have the exact same schema, with the same PRIMARY KEY columns, but 
> different PARTITION KEY - the first would be USERID and the second would be 
> EMAIL.
> One complicating thing with this approach is that the application now needs 
> to know that when it INSERT/UPDATE/DELETEs from the base table it needs to 
> INSERT/UPDATE/DELETE from all of the query tables as well.  CASSANDRA-6477 
> covers this nicely.
> However, the other side of the coin is that the application needs to know 
> which query table to leverage based on the selection criteria.  Using the 
> example above, if the query has a predicate such as "WHERE userid = 'bhess'", 
> then USERS_BY_USERID is the better table to use.  Similarly, when the 
> predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is 
> appropriate.
> On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" 
> to the collection of tables.  You do operations just on the base table.  It 
> is very attractive for the SELECT side as well.  It would be very good to 
> allow an application to simply do "SELECT * FROM users WHERE userid = 
> 'bhess'" and have that query implicitly leverage the USERS_BY_USERID 
> materialized view.
> For additional use cases, especially analytics use cases like in Spark, this 
> allows the Spark code to simply push down the query without having to know 
> about all of the MVs that have been set up.  The system will route the query 
> appropriately.  And if additional MVs are necessary to make a query run 
> better/faster, then those MVs can be set up and Spark will implicitly 
> leverage them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to