Hi,

We've made the mistake of letting our nodes get too large; they now hold about 3 TB each. We ran out of the free space needed for compactions to complete, and because we're on 1.0.7, enabling compression to get out of the mess wasn't feasible. We tried adding another node, but we think this may have put too much pressure on the existing nodes it was streaming from, so we backed it out.
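(For what it's worth, the compression route we ruled out would have looked something like the following, written from memory, with MyKS/MyCF standing in for our keyspace and column families:

    # cassandra-cli
    update column family MyCF with
      compression_options = {sstable_compression:SnappyCompressor, chunk_length_kb:64};

    # then rewrite existing SSTables so they actually get compressed
    nodetool -h <node> scrub MyKS MyCF

As we understand it, existing data only gets compressed when the SSTables are rewritten, and that rewrite needs free disk we no longer have.)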
So we decided to drop RF from 3 to 2 to relieve the disk pressure and started building a secondary cluster with lots of 1 TB nodes. We ran repair -pr on each node, but it's failing with a JVM OOM on one node while another node is streaming from it for the final repair (exact commands in the P.S. below).

Does anyone know what we can tune to get the cluster stable enough to put it in a multi-DC setup with the secondary cluster? Do we actually need to wait for these RF3->RF2 repairs to stabilize, or could we point it at the secondary cluster now without risking data loss?

We've raised the heap on the two problematic nodes to 20 GB, up from the already-too-high 12 GB, but we're still hitting OOMs. I'd seen in other threads that throttling compaction might help, so we're trying the following:

    in_memory_compaction_limit_in_mb: 32              (down from 64)
    compaction_throughput_mb_per_sec: 8               (down from 16)
    concurrent_compactors: 2                          (the nodes have 24 cores)
    flush_largest_memtables_at: 0.45                  (down from 0.50)
    stream_throughput_outbound_megabits_per_sec: 300  (down from 400)
    reduce_cache_sizes_at: 0.5                        (down from 0.6)
    reduce_cache_capacity_to: 0.35                    (down from 0.4)
    -XX:CMSInitiatingOccupancyFraction=30

Here's the log from the most recent repair failure: http://paste.ubuntu.com/5843017/
The OOM starts at line 13401.

Thanks for whatever insight you can provide.
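P.S. In case the exact steps matter, the RF drop and per-node repairs were along these lines (MyKS is a stand-in for our keyspace, and the cli syntax is from memory, so forgive any slips):

    # cassandra-cli, run once against the cluster
    update keyspace MyKS with strategy_options = {replication_factor:2};

    # then, one node at a time
    nodetool -h <node> repair -pr MyKS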
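The heap and GC changes live in cassandra-env.sh on the two problematic nodes, roughly:

    MAX_HEAP_SIZE="20G"
    # edited the stock CMSInitiatingOccupancyFraction line (the shipped default is 75)
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=30"

We left HEAP_NEWSIZE and the rest of the GC flags at their defaults.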
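And the eventual multi-DC plan, once things are stable, is to switch the keyspace to NetworkTopologyStrategy, something like this (DC names are placeholders):

    update keyspace MyKS
      with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
      and strategy_options = {DC1:2, DC2:3};

If that plan is itself part of the problem, happy to be told otherwise.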