I would definitely check the IO stats then. If you see latency going over 20ms, you need to solve that problem.
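
One rough way to run that check on a Linux node is sketched below. It is an illustration under stated assumptions, not a tool from this thread. It assumes the standard /proc/diskstats layout and approximates latency as milliseconds spent on I/O divided by I/Os completed between two samples, which is similar in spirit to the await column in iostat. The function names and the 5-second interval are made up for the example; only the 20ms threshold comes from the message above.

```python
import time

THRESHOLD_MS = 20.0  # the 20ms figure mentioned above


def parse_diskstats(text):
    """Return {device: (ios_completed, ms_spent)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 11:
            continue  # skip blank or unexpected lines
        # f[3]/f[7]: reads/writes completed; f[6]/f[10]: ms spent reading/writing
        stats[f[2]] = (int(f[3]) + int(f[7]), int(f[6]) + int(f[10]))
    return stats


def avg_latency_ms(before, after):
    """Average ms per completed I/O for each device between two snapshots."""
    result = {}
    for dev, (ios1, ms1) in after.items():
        if dev not in before:
            continue
        ios0, ms0 = before[dev]
        d_ios = ios1 - ios0
        if d_ios > 0:  # idle devices have no meaningful latency
            result[dev] = (ms1 - ms0) / d_ios
    return result


def slow_devices(before, after, threshold_ms=THRESHOLD_MS):
    """Devices whose average latency exceeded the threshold."""
    latencies = avg_latency_ms(before, after)
    return {dev: lat for dev, lat in latencies.items() if lat > threshold_ms}


def sample(interval_s=5.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart and report slow devices (Linux only)."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return slow_devices(before, after)
```

Calling sample() on a node while hints are replaying returns a dict of device names to average latency in ms for any device over the threshold. An empty dict means nothing crossed 20ms during that interval.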

Patrick

On Tue, Jan 28, 2020 at 12:01 PM Surbhi Gupta <surbhi.gupt...@gmail.com> wrote:

> We have also noticed a lot of MutationStage pending .
>
> On Tue, 28 Jan 2020 at 11:06, Richard Andersen <rich...@andersenfamily.us>
> wrote:
>
>> I am in agreement with Patrick, this is a typical symptom of saturated
>> IO. Are there a high of drops and/or pending compactions?
>>
>> Get Outlook for Android <https://aka.ms/ghei36>
>> ------------------------------
>> *From:* Patrick McFadin <pmcfa...@gmail.com>
>> *Sent:* Tuesday, January 28, 2020 11:25:49 AM
>> *To:* user@cassandra.apache.org <user@cassandra.apache.org>
>> *Subject:* Re: How to read content of hints file and apply them manually?
>>
>> Just to add in here. Any time I see any hints on a cluster, that's like
>> seeing smoke. If you can't explain it, you have a fire somewhere and it's
>> not going to get any better.
>>
>> By the few messages I've seen, I would start by looking at your IO
>> subsystem on your nodes. Do you have enough throughput to write and read at
>> the same time? These are exactly the symptoms I see when running Cassandra
>> on a SAN or NAS.
>>
>> Patrick
>>
>> On Mon, Jan 27, 2020 at 8:17 PM Surbhi Gupta <surbhi.gupt...@gmail.com>
>> wrote:
>>
>> We tried to tune sethintedhandoffthrottlekb to 100 , 1024 , 10240 but
>> nothing helped .
>> Our hints related parameters are as below, if you don't find any
>> parameter below then it is not set in our environment and should be of the
>> default value.
>>
>> max_hint_window_in_ms: 10800000 # 3 hours
>>
>> hinted_handoff_enabled: true
>>
>> hinted_handoff_throttle_in_kb: 100
>>
>> max_hints_delivery_threads: 8
>>
>> hints_directory: /var/lib/cassandra/hints
>>
>> hints_flush_period_in_ms: 10000
>>
>> max_hints_file_size_in_mb: 128
>>
>> On Mon, 27 Jan 2020 at 18:34, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>> The high cpu is probably the hints getting replayed slamming the write
>> path
>>
>> Slowing it down with the hint throttle may help
>>
>> It’s not instant.
>>
>> On Jan 27, 2020, at 6:05 PM, Erick Ramirez <flightc...@gmail.com> wrote:
>>
>> Increase the max_hint_window_in_ms setting in cassandra.yaml to more than
>> 3 hours, perhaps 6 hours. If the issue still persists networking may need
>> to be tested for bandwidth issues.
>>
>> Just a note of warning about bumping up the hint window without
>> understanding the pros and cons. Be aware that doubling it means:
>>
>> - you'll end up doubling the size of stored hints in
>>   the hints_directory
>> - there'll be twice as much hints to replay when node(s) come back
>>   online
>>
>> There's always 2 sides to fiddling with the knobs in C*. Cheers!
>>