> Also, do you think putting the '-Dmv_enable_coordinator_batchlog=true'
> parameter in cassandra.yaml will solve or reduce the issue to some extent?

It should improve the eventual consistency of MV. Only enable it if there
is enough CPU/IO capacity in the cluster.

Do you mind creating a JIRA describing the workload/queries and how they
end up in an inconsistent state, if you can reproduce it?

On Wed, 29 Jul 2020 at 20:49, Jasonstack Zhao Yang <jasonstack.z...@gmail.com> wrote:

> > The cluster started to crash when some partitions in MV crossed 1 GB
> > size at a few nodes, whereas in other nodes it is less than 50 MB.
> > Should we be worried about this?
>
> That depends on your MV partition key design.
>
> The memory pressure of wide partitions is improved in 3.x (CASSANDRA-11206
> <https://issues.apache.org/jira/browse/CASSANDRA-11206>).
>
> On Tue, 28 Jul 2020 at 14:24, Saijal Chauhan <saijal.chau...@goevive.com> wrote:
>
>> > do you run "nodetool repair" on both base and view regularly?
>>
>> Yes, we run a full repair on our entire cluster every weekend, which
>> includes the keyspaces with the base table and materialized views.
>> But still, there are a ton of discrepancies between our base table and
>> materialized view.
>>
>> Also, do you think putting the '-Dmv_enable_coordinator_batchlog=true'
>> parameter in cassandra.yaml will solve or reduce the issue to some extent?
>>
>> We came across a JIRA issue
>> <https://issues.apache.org/jira/browse/CASSANDRA-15918> and this blog
>> <https://medium.com/engineered-publicis-sapient/making-the-right-choices-in-cassandra-with-critical-configurations-and-data-size-speed-d358989d3437>,
>> which mention cluster instability while creating and deleting MVs:
>>
>> > The cluster started to crash when some partitions in MV crossed 1 GB
>> > size at a few nodes, whereas in other nodes it is less than 50 MB.
>>
>> Should we be worried about this?
>>
>> On Mon, Jul 27, 2020 at 10:18 PM Jasonstack Zhao Yang <jasonstack.z...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> > We are facing data inconsistency issues between base tables and
>>> > materialized views.
>>>
>>> Do you run "nodetool repair" on both base and view regularly?
>>>
>>> > What are all the possible scenarios that we should be watching out
>>> > for in a production environment?
>>>
>>> More CPU/IO/GC for populating views.
>>>
>>> > Could there be any downtime in the Cassandra cluster while creating
>>> > or deleting these materialized views?
>>>
>>> No, but be careful about the latency/throughput impact on the regular
>>> workload.
>>>
>>> On Tue, 28 Jul 2020 at 00:02, Saijal Chauhan <saijal.chau...@goevive.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are using Cassandra 3.0.13.
>>>> We have the following datacenters:
>>>>
>>>> - DC1 with 7 Cassandra nodes with RF:3
>>>> - DC2 with 2 Cassandra nodes with RF:2
>>>> - DC3 with 2 Cassandra nodes with RF:2
>>>>
>>>> We are facing data inconsistency issues between base tables and
>>>> materialized views. The only solution to this problem seems to be
>>>> creating new materialized views and dropping the old ones.
>>>>
>>>> We are planning to recreate 4 materialized views, 2 of which belong
>>>> to the same base table. The size of each base table is 4 to 5 GB.
>>>>
>>>> What are all the possible scenarios that we should be watching out
>>>> for in a production environment?
>>>> Could there be any downtime in the Cassandra cluster while creating
>>>> or deleting these materialized views?
>>>>
>>>> Thank you.
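On recreating the views themselves: dropping and recreating a materialized
view triggers a full rebuild of the view from the base table, which is
where the extra CPU/IO/GC mentioned in the thread comes from. A hedged
sketch of the DDL, with hypothetical keyspace, table, and column names:

```sql
-- Sketch only: all names (my_keyspace, users, user_by_email, email,
-- user_id) are hypothetical. Dropping and recreating the view triggers
-- a full rebuild from the base table on every replica.

DROP MATERIALIZED VIEW IF EXISTS my_keyspace.user_by_email;

CREATE MATERIALIZED VIEW my_keyspace.user_by_email AS
    SELECT * FROM my_keyspace.users
    WHERE email IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (email, user_id);
```

Creating the new view before dropping the old one keeps reads working
during the rebuild; rebuild progress can usually be checked in the
system_distributed.view_build_status table, and a full repair afterwards
helps catch rows missed while the view was repopulating.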
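A note on where the flag actually goes: the '-D' prefix marks
'-Dmv_enable_coordinator_batchlog=true' as a JVM system property, so on a
typical Cassandra 3.x package layout it is passed to the JVM via
conf/jvm.options or conf/cassandra-env.sh rather than set inside
cassandra.yaml. A minimal sketch, assuming the default configuration file
layout:

```shell
# Minimal sketch: enabling the MV coordinator batchlog on one node.
# Assumes a standard Cassandra 3.x install; '-D...' options are JVM
# system properties, so they go to the JVM, not into cassandra.yaml.

# Option 1: add a line to conf/jvm.options (one option per line):
#   -Dmv_enable_coordinator_batchlog=true

# Option 2: append to JVM_OPTS in conf/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -Dmv_enable_coordinator_batchlog=true"

echo "$JVM_OPTS"
```

Each node needs a restart for the property to take effect, so a rolling
restart across the cluster applies it without cluster-wide downtime.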