The new Materialized View feature is just an automated way of creating and
maintaining what people used to call a "query table", which is the
traditional Cassandra data modeling technique for performing queries on on
than the primary key for a table - you store the same columns in different
tables using different columns for the primary key.

One also needs to be careful to include all columns of the original primary
key in each MV primary key - in addition to whatever column(s) are to be
used for indexing in each MV (so that Cassandra can find the old row when
it needs to update the MV when the base table row changes, such as on a
deletion.)

But before creating MVs, you first need to answer questions about how the
app needs to query the data. Even with MV, conceptualizing queries needs to
precede data modeling.

For example, what is the cardinality of favorites vs. non-favorites, does
the app even need to query by favorates, as opposed to querying all
contacts and retrieving is_favorite as simply a non-key column value,
whether favorites need to be retrieved separately from non-favorites, the
frequency and latency requirements for query by favorite status, etc. Once
these questions are answered, decisions can be made about data modeling.

-- Jack Krupansky

On Mon, Jan 11, 2016 at 5:13 AM, Carlos Alonso <i...@mrcalonso.com> wrote:

> I have never used Materialized Views so maybe this suggestion is not
> possible, but in this case, wouldn't it make sense to define the
> materialized view as
>
> is_favourite IS TRUE
> instead of
> is_favourite IS NOT NULL?
>
> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>
> On 10 January 2016 at 09:59, DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> Try this
>>
>> CREATE TABLE communication.user_contact_list (
>>   user_id uuid,
>>   contact_id uuid,
>>   contact_name text,
>>   created_at timeuuid,
>>   is_favorite boolean,
>>   favorite_at timestamp,
>>   PRIMARY KEY (user_id, contact_name, contact_id)
>> );
>>
>> CREATE MATERIALIZED VIEW communication.user_favorite_contact_list
>> AS SELECT * FROM communication.user_contact_list
>> WHERE user_id IS NOT NULL AND contact_name IS NOT NULL
>> AND contact_id IS NOT NULL AND is_favorite IS NOT NULL
>> PRIMARY KEY(user_id, is_favorite, contact_name, contact_id)
>>
>> If the flag is_favorite is not updated very often the write perf hit due
>> to materialized view is acceptable.
>>
>> On Sat, Jan 9, 2016 at 11:57 PM, Isaac P. <i...@hotmail.com> wrote:
>>
>>> Jack/ Michael,
>>>
>>> Thanks for answering.
>>>
>>> How big?: Less then one hundred contacts by user is the normal.
>>>
>>> Update requirements: The UPDATE requirements are all around  each user
>>> “favoriting/unfavoriting” the contacts . Deleting is not very frequent.
>>>
>>> Does that mean that in C* 3.02 , for this use case to work, the contact
>>> name  must be part of a  composite partition key in order to allow sorting
>>> by contact_name like this ? :
>>>
>>> CREATE TABLE communication.user_contact_list (
>>> user_id uuid,
>>> contact_name text,
>>> is_favorite boolean,
>>> contact_id uuid,
>>> created_at timeuuid,
>>> favorite_at timestamp,
>>> PRIMARY KEY ((user_id, contact_name), is_favorite)
>>> )  WITH CLUSTERING ORDER BY (contact_name ASC);
>>>
>>> Query: Select * from user_contact_list where user_id = :userid and
>>> is_favorite = true order by contact_name asc;
>>>
>>> Looks like each contact as a row/clustering key will be the way to go.
>>>
>>> Thanks
>>>
>>> IPVP
>>>
>>>
>>> From: Laing, Michael <michael.la...@nytimes.com>
>>> <michael.la...@nytimes.com>
>>> Reply: user@cassandra.apache.org <user@cassandra.apache.org>>
>>> <user@cassandra.apache.org>
>>> Date: January 9, 2016 at 11:51:27 AM
>>> To: user@cassandra.apache.org <user@cassandra.apache.org>>
>>> <user@cassandra.apache.org>
>>> Subject:  Re: Modeling contact list, plain table or List
>>>
>>> Note that in C* 3.02 the second query is invalid:
>>>
>>> cqlsh> Select * from communication.user_contact_list where user_id =
>>> 98f50f00-b6d5-11e5-afec-6003089bf572 and is_favorite = true order
>>> by contact_name asc;
>>>
>>> *InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column
>>> "is_favorite" cannot be restricted as preceding column "contact_name" is
>>> not restricted"*
>>>
>>> On Fri, Jan 8, 2016 at 6:50 PM, Jack Krupansky <jack.krupan...@gmail.com
>>> > wrote:
>>>
>>>> How big is each contact list expected to be? Dozens? Hundreds?
>>>> Thousands? If just dozens, a simple list column would seem sufficient. If
>>>> thousands, the row (not partition) would get kind of bloated.
>>>>
>>>> What requirements do you have for updating? If updating contacts and
>>>> lots of contacts, I think I'd prefer each contact as a row/clustering key.
>>>> Nice to be able to do selective queries to return slices of the clustering
>>>> key values, which is not so easy if they are all just a single list column.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Fri, Jan 8, 2016 at 6:31 PM, Isaac P. <i...@hotmail.com> wrote:
>>>>
>>>>> Hi everyone
>>>>>
>>>>> What would perform better while modeling a simple user contact list
>>>>>  that will be used mainly to select the recipients for/from/to messages ?
>>>>>
>>>>> a) Individual rows to each (user, contact) pair so a select would
>>>>> fetch all the rows  to retrieve all the contacts from a given user.
>>>>>
>>>>> or
>>>>>
>>>>> b) A single row for each user containing the List<Contact>  UDT.
>>>>>
>>>>> Aside of the basic CRUD, the queries will be the following ones:
>>>>>
>>>>> Select * from user_contact_list where user_id = :userid order by
>>>>> contact_name asc
>>>>>
>>>>> Select * from user_contact_list where user_id = :userid and
>>>>> is_favorite = true order by contact_name asc
>>>>>
>>>>> After reading this
>>>>> https://docs.datastax.com/en/cql/3.0/cql/ddl/ddl_compound_keys_c.html
>>>>>  the table is looking like this:
>>>>>
>>>>> CREATE TABLE communication.user_contact_list (
>>>>> user_id uuid,
>>>>> contact_id uuid,
>>>>> contact_name text,
>>>>> created_at timeuuid,
>>>>> is_favorite boolean,
>>>>> favorite_at timestamp,
>>>>> PRIMARY KEY (user_id, contact_name, is_favorite)
>>>>> );
>>>>>
>>>>> Any guidance will be appreciated.
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>> IPVP
>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to