> Also, do you think putting the '-Dmv_enable_coordinator_batchlog=true'
> parameter in cassandra.yaml will solve or reduce the issue to some extent?

It should improve the eventual consistency of MV. Only enable it if there
is enough CPU/IO capacity in the cluster.

Do you mind creating a JIRA describing the workload/queries and how they
end up in an inconsistent state, if you can reproduce it?

On Wed, 29 Jul 2020 at 20:49, Jasonstack Zhao Yang <jasonstack.z...@gmail.com> wrote:

> > The cluster started to crash when some partitions in MV crossed 1 GB
> > size at a few nodes, whereas in other nodes it is less than 50 MB.
> > Should we be worried about this?
>
> That depends on your MV partition key design.
>
> The memory pressure of wide partitions is improved in 3.x (CASSANDRA-11206
> <https://issues.apache.org/jira/browse/CASSANDRA-11206>).
>
> On Tue, 28 Jul 2020 at 14:24, Saijal Chauhan <saijal.chau...@goevive.com> wrote:
>
>> > do you run "nodetool repair" on both base and view regularly?
>>
>> Yes, we run a full repair on our entire cluster every weekend, which
>> includes the keyspaces with the base table and materialized views.
>> But still, there are a ton of discrepancies between our base table and
>> materialized view.
>>
>> Also, do you think putting the '-Dmv_enable_coordinator_batchlog=true'
>> parameter in cassandra.yaml will solve or reduce the issue to some extent?
>>
>> We came across a JIRA issue
>> <https://issues.apache.org/jira/browse/CASSANDRA-15918> and this blog
>> <https://medium.com/engineered-publicis-sapient/making-the-right-choices-in-cassandra-with-critical-configurations-and-data-size-speed-d358989d3437>,
>> which mention cluster instability while creating and deleting MVs:
>>
>> > The cluster started to crash when some partitions in MV crossed 1 GB
>> > size at a few nodes, whereas in other nodes it is less than 50 MB.
>>
>> Should we be worried about this?
>>
>> On Mon, Jul 27, 2020 at 10:18 PM Jasonstack Zhao Yang <jasonstack.z...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> > We are facing data inconsistency issues between base tables and
>>> > materialized views.
>>>
>>> Do you run "nodetool repair" on both base and view regularly?
>>>
>>> > What are all the possible scenarios that we should be watching out
>>> > for in a production environment?
>>>
>>> More CPU/IO/GC for populating views.
>>>
>>> > Could there be any downtime in the Cassandra cluster while creating
>>> > or deleting these materialized views?
>>>
>>> No, but be careful about the latency/throughput impact on the regular
>>> workload.
>>>
>>> On Tue, 28 Jul 2020 at 00:02, Saijal Chauhan <saijal.chau...@goevive.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are using Cassandra 3.0.13.
>>>> We have the following datacenters:
>>>>
>>>> - DC1 with 7 Cassandra nodes with RF:3
>>>> - DC2 with 2 Cassandra nodes with RF:2
>>>> - DC3 with 2 Cassandra nodes with RF:2
>>>>
>>>> We are facing data inconsistency issues between base tables and
>>>> materialized views. The only solution to this problem seems to be
>>>> creating new materialized views and dropping the old ones.
>>>>
>>>> We are planning to recreate 4 materialized views, 2 of which belong
>>>> to the same base table. The size of each base table is 4 to 5 GB.
>>>>
>>>> What are all the possible scenarios that we should be watching out
>>>> for in a production environment?
>>>> Could there be any downtime in the Cassandra cluster while creating
>>>> or deleting these materialized views?
>>>>
>>>> Thank you.
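On recreating the views themselves: dropping and recreating a materialized
view triggers a full rebuild of the view from the base table, which is
where the extra CPU/IO/GC mentioned in the thread comes from. A hedged
sketch of the DDL, with hypothetical keyspace, table, and column names:

```sql
-- Sketch only: all names (my_keyspace, users, user_by_email, email,
-- user_id) are hypothetical. Dropping and recreating the view triggers
-- a full rebuild from the base table on every replica.

DROP MATERIALIZED VIEW IF EXISTS my_keyspace.user_by_email;

CREATE MATERIALIZED VIEW my_keyspace.user_by_email AS
    SELECT * FROM my_keyspace.users
    WHERE email IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (email, user_id);
```

Creating the new view before dropping the old one keeps reads working
during the rebuild; rebuild progress can usually be checked in the
system_distributed.view_build_status table, and a full repair afterwards
helps catch rows missed while the view was repopulating.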
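A note on where the flag actually goes: the '-D' prefix marks
'-Dmv_enable_coordinator_batchlog=true' as a JVM system property, so on a
typical Cassandra 3.x package layout it is passed to the JVM via
conf/jvm.options or conf/cassandra-env.sh rather than set inside
cassandra.yaml. A minimal sketch, assuming the default configuration file
layout:

```shell
# Minimal sketch: enabling the MV coordinator batchlog on one node.
# Assumes a standard Cassandra 3.x install; '-D...' options are JVM
# system properties, so they go to the JVM, not into cassandra.yaml.

# Option 1: add a line to conf/jvm.options (one option per line):
#   -Dmv_enable_coordinator_batchlog=true

# Option 2: append to JVM_OPTS in conf/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -Dmv_enable_coordinator_batchlog=true"

echo "$JVM_OPTS"
```

Each node needs a restart for the property to take effect, so a rolling
restart across the cluster applies it without cluster-wide downtime.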