Are you using Cassandra Reaper?

On Thu, Oct 24, 2019, 12:31 PM Ben Mills <b...@bitbrew.com> wrote:
> Greetings,
>
> Inherited a small Cassandra cluster with some repair issues and need some
> advice on recommended next steps. Apologies in advance for a long email.
>
> Issue:
>
> Intermittent repair failures on two non-system keyspaces:
>
> - platform_users
> - platform_management
>
> Repair Type:
>
> Full, parallel repairs are run on each of the three nodes every five days.
>
> Repair command output for a typical failure:
>
> [2019-10-18 00:22:09,109] Starting repair command #46, repairing keyspace
> platform_users with repair options (parallelism: parallel, primary range:
> false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters:
> [], hosts: [], # of ranges: 12)
> [2019-10-18 00:22:09,242] Repair session
> 5282be70-f13d-11e9-9b4e-7f6db768ba9a for range
> [(-1890954128429545684,2847510199483651721],
> (8249813014782655320,-8746483007209345011],
> (4299912178579297893,6811748355903297393],
> (-8746483007209345011,-8628999431140554276],
> (-5865769407232506956,-4746990901966533744],
> (-4470950459111056725,-1890954128429545684],
> (4001531392883953257,4299912178579297893],
> (6811748355903297393,6878104809564599690],
> (6878104809564599690,8249813014782655320],
> (-4746990901966533744,-4470950459111056725],
> (-8628999431140554276,-5865769407232506956],
> (2847510199483651721,4001531392883953257]] failed with error [repair
> #5282be70-f13d-11e9-9b4e-7f6db768ba9a on platform_users/access_tokens_v2,
> [(-1890954128429545684,2847510199483651721],
> (8249813014782655320,-8746483007209345011],
> (4299912178579297893,6811748355903297393],
> (-8746483007209345011,-8628999431140554276],
> (-5865769407232506956,-4746990901966533744],
> (-4470950459111056725,-1890954128429545684],
> (4001531392883953257,4299912178579297893],
> (6811748355903297393,6878104809564599690],
> (6878104809564599690,8249813014782655320],
> (-4746990901966533744,-4470950459111056725],
> (-8628999431140554276,-5865769407232506956],
> (2847510199483651721,4001531392883953257]]] Validation failed in /10.x.x.x
> (progress: 26%)
> [2019-10-18 00:22:09,246] Some repair failed
> [2019-10-18 00:22:09,248] Repair command #46 finished in 0 seconds
>
> Additional Notes:
>
> Repairs encounter the above failures more often than not. Sometimes the
> failure occurs on one node only, though occasionally on two. Sometimes it
> hits just one of the two keyspaces, sometimes both. Apparently the previous
> repair schedule for this cluster included incremental repairs (a script
> alternated between incremental and full repairs). After reading this TLP
> article:
>
> https://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html
>
> the repair script was replaced with cassandra-reaper (v1.4.0), which was
> run with its default configs. Reaper ran fine, but it only obscured the
> ongoing issues (it did not resolve them) and complicated the debugging
> process, and so was removed. The current repair schedule is as described
> above under Repair Type.
>
> Attempts at Resolution:
>
> (1) nodetool scrub was attempted on the offending keyspaces/tables to no
> effect.
>
> (2) sstablescrub has not been attempted due to the current design of the
> Docker image that runs Cassandra in each Kubernetes pod - i.e. there is no
> way to stop the server to run this offline utility without killing the
> only PID running in the container.
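
On (2): one workaround, if Cassandra is the only process in the image, is to
temporarily override the container command so the pod comes up idle, then run
the offline scrub by hand. A rough sketch, assuming a StatefulSet named
"cassandra" whose first container runs Cassandra as PID 1 (names are
illustrative; adjust to the real manifests):

    # 1. Override the container command so the pod starts idle instead
    #    of launching Cassandra as PID 1.
    kubectl patch statefulset cassandra --type=json -p '[
      {"op": "add",
       "path": "/spec/template/spec/containers/0/command",
       "value": ["sleep", "infinity"]}
    ]'

    # 2. Recreate the pod so it picks up the patched template, then run
    #    the offline scrub against the affected table.
    kubectl delete pod cassandra-0
    kubectl exec -it cassandra-0 -- sstablescrub platform_users access_tokens_v2

    # 3. Remove the override and roll the pod again so Cassandra restarts.
    kubectl patch statefulset cassandra --type=json -p '[
      {"op": "remove",
       "path": "/spec/template/spec/containers/0/command"}
    ]'
    kubectl delete pod cassandra-0

Readiness checks will fail while the pod idles, which is expected for the
duration of the maintenance window; with RF=3 the other two replicas keep
serving in the meantime.
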
> Related Error:
>
> Not sure if this is related, though sometimes, when either:
>
> (a) running nodetool snapshot, or
> (b) rolling a pod that runs a Cassandra node, which calls nodetool drain
> prior to shutdown,
>
> the following error is thrown:
>
> -- StackTrace --
> java.lang.RuntimeException: Last written key
> DecoratedKey(10df3ba1-6eb2-4c8e-bddd-c0c7af586bda,
> 10df3ba16eb24c8ebdddc0c7af586bda) >= current key
> DecoratedKey(00000000-0000-0000-0000-000000000000,
> 17343121887f480c9ba87c0e32206b74) writing into
> /cassandra_data/data/platform_management/device_by_tenant_v2-e91529202ccf11e7ab96d5693708c583/.device_by_tenant_tags_idx/mb-45-big-Data.db
>         at org.apache.cassandra.io.sstable.format.big.BigTableWriter.beforeAppend(BigTableWriter.java:114)
>         at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:153)
>         at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.append(SimpleSSTableMultiWriter.java:48)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:441)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.call(Memtable.java:477)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.call(Memtable.java:363)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>
> Here are some details on the environment and configs in the event that
> something is relevant.
>
> Environment: Kubernetes
> Environment Config: StatefulSet of 3 replicas
> Storage: Persistent Volumes
> Storage Class: SSD
> Node OS: Container-Optimized OS
> Container OS: Ubuntu 16.04.3 LTS
>
> Version: Cassandra 3.7
> Data Centers: 1
> Racks: 3 (one per zone)
> Nodes: 3
> Tokens: 4
> Replication Factor: 3
> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
> Compaction Strategy: STCS (all tables)
> Read/Write Requirements: Blend of both
> Data Load: <1GB per node
> gc_grace_seconds: default (10 days - all tables)
>
> Memory: 4Gi per node
> CPU: 3.5 per node (3500m)
>
> Java Version: 1.8.0_144
>
> Heap Settings:
>
> -XX:+UnlockExperimentalVMOptions
> -XX:+UseCGroupMemoryLimitForHeap
> -XX:MaxRAMFraction=2
>
> GC Settings: (CMS)
>
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=30000
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
>
> Any ideas are much appreciated.
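
One more observation: the RuntimeException above is thrown while flushing a
secondary index (the path ends in .device_by_tenant_tags_idx/mb-45-big-Data.db),
not the base table, so the index data may be what's corrupt. Assuming the
index name matches that directory suffix (worth confirming with DESCRIBE TABLE
device_by_tenant_v2 in cqlsh), a rebuild is a cheap thing to try:

    # Index name assumed from the on-disk directory suffix
    # ".device_by_tenant_tags_idx"; note some versions expect the
    # qualified form <table>.<index> instead.
    nodetool rebuild_index platform_management device_by_tenant_v2 device_by_tenant_tags_idx

If the rebuild doesn't clear it, dropping and recreating the index in cqlsh
forces the index files to be rewritten from the base table.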