Re: Load discrepancy between old vs new nodes of Cassandra
Also, recently we have been observing a lot of repair log entries like this one:

> INFO [RepairJobTask:3] 2020-08-12 12:07:46,325 SyncTask.java:73 - [repair #aa-bbb-c-dd-] Endpoints /2.2.2.2 and /3.3.3.3 have 9146 range(s) out of sync for table_a

Could this somehow be related?

On Tue, Aug 18, 2020 at 7:26 AM Erick Ramirez wrote:
> I would start by checking the replication settings on all your keyspaces.
> There's a chance that you have keyspaces not replicated to DC3. FWIW it
> would have to be an application keyspace (vs system keyspaces) because of
> the size. Cheers!

--
Saijal Chauhan
Infrastructure Systems Engineer
Evive
goevive.com
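For what it's worth, a quick way to see how far out of sync each table is would be to aggregate those SyncTask lines per table. A minimal sketch, assuming the log format matches the entry above (`out_of_sync_totals` is a hypothetical helper, not part of Cassandra):

```python
import re
from collections import defaultdict

# Matches SyncTask lines of the form:
#   ... Endpoints /2.2.2.2 and /3.3.3.3 have 9146 range(s) out of sync for table_a
PATTERN = re.compile(
    r"Endpoints (\S+) and (\S+) have (\d+) range\(s\) out of sync for (\w+)"
)

def out_of_sync_totals(log_lines):
    """Sum out-of-sync range counts per table across all SyncTask lines."""
    totals = defaultdict(int)
    for line in log_lines:
        m = PATTERN.search(line)
        if m:
            totals[m.group(4)] += int(m.group(3))
    return dict(totals)

lines = [
    "INFO [RepairJobTask:3] 2020-08-12 12:07:46,325 SyncTask.java:73 - "
    "[repair #aa-bbb-c-dd-] Endpoints /2.2.2.2 and /3.3.3.3 have 9146 "
    "range(s) out of sync for table_a",
]
print(out_of_sync_totals(lines))  # {'table_a': 9146}
```

Run against the full system.log of a repair session, this gives a per-table picture of how much data repair is streaming, which helps judge whether the out-of-sync ranges are concentrated in one keyspace.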
Re: Load discrepancy between old vs new nodes of Cassandra
Also, we are observing a drop of a few GB in the load of our Cassandra cluster every time we restart a particular node. Could this be because stale data is being removed on restart? If not, what could be the other possible reasons?

On Tue, Aug 18, 2020 at 7:26 AM Erick Ramirez wrote:
> I would start by checking the replication settings on all your keyspaces.
> There's a chance that you have keyspaces not replicated to DC3. FWIW it
> would have to be an application keyspace (vs system keyspaces) because of
> the size. Cheers!

--
Saijal Chauhan
Infrastructure Systems Engineer
Evive
goevive.com
Load discrepancy between old vs new nodes of Cassandra
Hi,

We are using Cassandra 3.0.13.
We have the following datacenters:

- DC1 with 7 Cassandra nodes and RF:3 (2 years old)
- DC2 with 1 Cassandra node and RF:1 (4 years old)
- DC3 with 2 Cassandra nodes and RF:2 (one month old)

On DC2 and DC3, each node owns 100% of the data. The seed nodes used while setting up the new datacenter DC3 were:

- 1 DC2 node
- 1 DC1 node

We are planning to remove the 4-year-old datacenter (DC2). We are seeing a discrepancy (around ~250 GB) in the load of the nodes in DC2 and DC3 via the "nodetool status" command:

Datacenter: DC2

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID  Rack
UN  2.2.2.2  747.57 GB  256     100.0%            efgh     RAC1

Datacenter: DC3

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns (effective)  Host ID  Rack
UN  3.3.3.3  504.57 GB  256     100.0%            ijkl     RAC1
UN  4.4.4.4  502.17 GB  256     100.0%            mnop     RAC1

What could be the possible reasons for the ~250 GB data discrepancy? Also, we run a repair every weekend.

Thank you!
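To put a precise number on the discrepancy, the Load column of "nodetool status" can be parsed and compared node by node. A minimal sketch, assuming all loads are reported in GB as in the output above (a fuller parser would also handle KB/MB/TB units):

```python
import re

def parse_loads(nodetool_status_output):
    """Extract {address: load_in_gb} from `nodetool status` output.

    Only matches Up/Normal (UN) nodes whose load is printed in GB.
    """
    pattern = re.compile(r"^UN\s+(\S+)\s+([\d.]+)\s+GB", re.MULTILINE)
    return {addr: float(gb) for addr, gb in pattern.findall(nodetool_status_output)}

status = """\
UN  2.2.2.2  747.57 GB  256  100.0%  efgh  RAC1
UN  3.3.3.3  504.57 GB  256  100.0%  ijkl  RAC1
"""
loads = parse_loads(status)
print(round(loads["2.2.2.2"] - loads["3.3.3.3"], 2))  # 243.0
```

Since both DC2 and DC3 nodes claim 100% ownership, their loads should converge after repair and compaction; a persistent gap of this size usually points at a keyspace replicated to one DC but not the other, or at un-compacted/tombstoned data on the older node.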
Re: Recreating materialized views in cassandra
> do you run "nodetool repair" on both base and view regularly?

Yes, we run a full repair on our entire cluster every weekend, which includes the keyspaces with the base tables and materialized views. But still, there are a ton of discrepancies between our base tables and materialized views.

Also, do you think putting the '-Dmv_enable_coordinator_batchlog=true' parameter in cassandra.yaml will solve or reduce the issue to some extent?

We came across a Jira issue <https://issues.apache.org/jira/browse/CASSANDRA-15918> and this blog <https://medium.com/engineered-publicis-sapient/making-the-right-choices-in-cassandra-with-critical-configurations-and-data-size-speed-d358989d3437> which mention cluster instability while creating and deleting MVs:

> The cluster started to crash when some partitions in MV crossed 1 GB size
> at few nodes, whereas in other nodes it is less than 50 MB.

Should we be worried about this?

On Mon, Jul 27, 2020 at 10:18 PM Jasonstack Zhao Yang <jasonstack.z...@gmail.com> wrote:

> Hi,
>
> > We are facing data inconsistency issues between base tables and
> > materialized views.
>
> do you run "nodetool repair" on both base and view regularly?
>
> > What are all the possible scenarios that we should be watching out for
> > in a production environment?
>
> more cpu/io/gc for populating views.
>
> > Could there be any downtime in the Cassandra cluster while creating or
> > deleting these materialized views?
>
> no, but be careful about the latency/throughput impact on the regular
> workload.
>
> On Tue, 28 Jul 2020 at 00:02, Saijal Chauhan wrote:
>
>> Hi,
>>
>> We are using Cassandra 3.0.13
>> We have the following datacenters:
>>
>> - DC1 with 7 Cassandra nodes with RF:3
>> - DC2 with 2 Cassandra nodes with RF:2
>> - DC3 with 2 Cassandra nodes with RF:2
>>
>> We are facing data inconsistency issues between base tables and
>> materialized views. The only solution to this problem seems to be
>> creating new materialized views and dropping the old ones.
>>
>> We are planning to recreate 4 materialized views, 2 belonging to the
>> same base table. The size of each base table is around 4 to 5 GB.
>>
>> What are all the possible scenarios that we should be watching out for
>> in a production environment?
>> Could there be any downtime in the Cassandra cluster while creating or
>> deleting these materialized views?
>>
>> Thank you.
Recreating materialized views in cassandra
Hi,

We are using Cassandra 3.0.13.
We have the following datacenters:

- DC1 with 7 Cassandra nodes with RF:3
- DC2 with 2 Cassandra nodes with RF:2
- DC3 with 2 Cassandra nodes with RF:2

We are facing data inconsistency issues between base tables and materialized views. The only solution to this problem seems to be creating new materialized views and dropping the old ones.

We are planning to recreate 4 materialized views, 2 of them belonging to the same base table. The size of each base table is around 4 to 5 GB.

What are all the possible scenarios that we should be watching out for in a production environment?
Could there be any downtime in the Cassandra cluster while creating or deleting these materialized views?

Thank you.
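For reference, recreating a view is a DROP followed by a CREATE; the keyspace, table, and column names below are hypothetical, purely to illustrate the shape of the statements. The CREATE triggers a background build of the view from the base table (with the CPU/IO/GC cost mentioned above), and build progress can typically be checked in the system.built_views table:

```sql
-- Hypothetical names for illustration; the base table is untouched.
DROP MATERIALIZED VIEW IF EXISTS ks.user_by_email;

-- Recreating the view kicks off a background rebuild from ks.users.
-- Every base-table primary key column must appear in the view's
-- primary key, and each view key column needs an IS NOT NULL filter.
CREATE MATERIALIZED VIEW ks.user_by_email AS
    SELECT * FROM ks.users
    WHERE email IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (email, id);
```

Note there is a window between the DROP and the completion of the rebuild during which reads against the view return incomplete data, so clients querying the view should be prepared for that, or the new view should be created under a different name and swapped in once built.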