[jira] [Comment Edited] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

Jonathan Shook (JIRA) Fri, 17 Jul 2015 14:59:33 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631983#comment-14631983
 ]


Jonathan Shook edited comment on CASSANDRA-6477 at 7/17/15 9:58 PM:
--------------------------------------------------------------------

If we look at this from the perspective of a typical developer who simply wants 
query tables to be easier to manage, then the basic requirements are pretty 
simple: Emulate current practice. That isn't to say that we shouldn't dig 
deeper in terms of what could make sense in different contexts, but the 
usage pattern that it is meant to simplify is pretty basic:

* Logged batches are not commonly used to wrap a primary table with its query 
tables during writes. The failure modes of these are usually well understood, 
meaning that it is clear what the implications are for a failed write in nearly 
every case.
* The same CL is generally used for all related tables.
* Savvy users will issue these writes asynchronously, with the same CL for all 
of these operations.
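
To make the pattern above concrete, here is a minimal sketch of it. It makes 
no assumption about any particular driver: FakeSession, write_denormalized and 
the table names are hypothetical stand-ins, not a real Cassandra API. It only 
shows the shape of current practice: one asynchronous write per table, no 
logged batch, the same CL everywhere, and per-table failures reported back to 
the caller.

```python
import asyncio


class FakeSession:
    """Hypothetical in-memory stand-in for a driver session (not a real API)."""

    def __init__(self, failing_tables=()):
        self.failing_tables = set(failing_tables)
        self.written = []  # (table, row, consistency_level) per successful write

    async def write(self, table, row, consistency_level):
        await asyncio.sleep(0)  # yield control, standing in for network I/O
        if table in self.failing_tables:
            raise RuntimeError(f"write to {table} failed")
        self.written.append((table, row, consistency_level))


async def write_denormalized(session, row, tables, consistency_level="QUORUM"):
    """Write one row to the base table and every query table concurrently.

    No logged batch is used: each write succeeds or fails on its own, and
    every write uses the same consistency level. Returns a dict mapping
    table name -> exception for the writes that failed.
    """
    results = await asyncio.gather(
        *(session.write(table, row, consistency_level) for table in tables),
        return_exceptions=True,
    )
    return {t: r for t, r in zip(tables, results) if isinstance(r, Exception)}


def demo():
    # One query table is made to fail so the per-table failure mode is visible.
    session = FakeSession(failing_tables={"users_by_email"})
    tables = ["users", "users_by_email", "users_by_country"]
    row = {"id": 1, "email": "a@example.com", "country": "NZ"}
    failures = asyncio.run(write_denormalized(session, row, tables))
    return session, failures
```

In this sketch the caller sees exactly which table writes failed and can retry 
just those, which is the well-understood failure mode the first bullet refers 
to.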

So effectively, I would expect the very basic form of this feature to look much 
like it would in practice already, except that it requires much less effort on 
the end user to maintain. I would like for us to consider that wherever the 
implementation varies from this, there may be lots of potential for 
surprise. I really think we need to be following the principle of least 
surprise here as a start. It is almost certain that MV will be adopted quickly 
in places that have a need for it because they are essentially doing this 
manually at present. If users have to micro-manage the settings in 
order to even get close to the current result (performance, availability 
assumptions, ...), then we should change the defaults.

It doesn't really seem necessary that we force the coordinator node to be a 
replica. This is orthogonal to the base problem, and topology-aware clients 
already provide controls for it. It also potentially adds another hop, which I 
do have concerns about with respect to the above.



> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.0 beta 1
>
>         Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
