[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541814#comment-14541814
 ] 

Jack Krupansky commented on CASSANDRA-6477:
-------------------------------------------

When exactly would population of the MV occur? What refresh options would
initially be supported? By default, would population/refresh begin as soon as
the MV is created, or would an explicit command be required to start it?
Earlier I linked to the Oracle doc on MV, so a comparison to Oracle's refresh
options would be useful, especially for users migrating from Oracle. Where
would the refresh state be stored (on each node of the base table?), and how
can a user monitor it?

PostgreSQL doesn't seem to have as many options:
http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html

With RF>1, which of the replicas containing a given token would push an update
to the MV? All of them? Presumably the push can be token-aware, so that each
push goes only to the RF nodes selected by the PK of the MV row being inserted.
Would a consistency level be warranted for the push? Would there be hints as
well? And what about repair of an MV if the rate of updates to the base table
overwhelms the update bandwidth of the (many) MVs for the base table?
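
To make the token-aware question concrete, here is a minimal sketch, not
Cassandra's actual replication code. It assumes one token per node and a
simple clockwise walk around the ring; MD5 stands in for the partitioner's
hash, and Ring, token_for and replicas_for_token are hypothetical names used
only for illustration.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Map a partition key to a 64-bit token (MD5 is an illustrative stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: one token per node, replicas chosen by a clockwise walk."""

    def __init__(self, node_tokens):
        # node_tokens maps node name -> token; keep entries sorted by token.
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def replicas_for_token(self, token, rf):
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self._tokens, token)
        count = min(rf, len(self._entries))
        size = len(self._entries)
        return [self._entries[(start + i) % size][1] for i in range(count)]

    def replicas(self, partition_key, rf):
        return self.replicas_for_token(token_for(partition_key), rf)


# Four nodes spaced evenly over the 64-bit token space (assumed layout).
ring = Ring({name: i * 2**62 for i, name in enumerate(["n1", "n2", "n3", "n4"])})

# The base row and the MV row have different partition keys, so in general
# they hash to different tokens and may be owned by different replica sets.
base_replicas = ring.replicas("user-id:42", rf=3)
view_replicas = ring.replicas("email:jack@example.com", rf=3)
```

In this model the open question is which of base_replicas sends the update to
view_replicas: all of them, or a single designated one.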

Any thoughts on throttling the flow of updates from other nodes so that
population of an MV does not overwhelm or interfere with normal cluster
operation? What would be a reasonable default, how could it be overridden, and
what would be best-practice advice for a maximum?
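
One common way to throttle such a flow is a token bucket. The sketch below
is only an illustration of that technique under assumed parameters, not a
proposal for the actual implementation; TokenBucket, rate_per_sec and burst
are hypothetical names, and the numbers are placeholders rather than
recommended defaults.

```python
import time


class TokenBucket:
    """Allow at most `rate_per_sec` units per second, with bursts up to `burst`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self._tokens = burst          # start with a full bucket
        self._clock = clock
        self._last = clock()

    def try_acquire(self, cost=1.0):
        # Refill according to elapsed time, capped at the bucket capacity.
        now = self._clock()
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                  # caller should delay or queue the update


# Hypothetical limit: 1000 MV updates per second, bursts of up to 200.
mv_build_limiter = TokenBucket(rate_per_sec=1000, burst=200)
if mv_build_limiter.try_acquire():
    pass  # send one MV population update here
```

A default and a maximum would then map onto rate_per_sec, with the override
being whatever mechanism lets an operator change that value.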

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.x
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
