Well, paging still happens due to mmapped file I/O; however, whilst that could easily cause a slow login, it would struggle to cause a 315s GC pause.
A slow network should also never cause this, though: the network threads are simply caught by any safepoint on returning to the VM, so they don't delay GC.

On 27 February 2014 17:01, Frank Ng <fnt...@gmail.com> wrote:

We have swap disabled. Can death by paging still happen?

On Thu, Feb 27, 2014 at 11:32 AM, Benedict Elliott Smith <belliottsm...@datastax.com> wrote:

That sounds a lot like death by paging.

On 27 February 2014 16:29, Frank Ng <fnt...@gmail.com> wrote:

I just caught that a node was down based on running nodetool status on a different node. I tried to ssh into the downed node at that time and it was very slow logging on. Looking at the gc.log file, there was a ParNew that only took 0.09 secs, yet the overall application threads stop time is 315 seconds (over 5 minutes). Our cluster is handling a lot of read requests.

If there were network hiccups, would that cause a delay in the Cassandra process when it tries to get to a safepoint? I assume Cassandra has threads running with lots of network activity and maybe taking a long time to reach a safepoint.

thanks,
Frank

On Fri, Feb 21, 2014 at 4:24 AM, Joel Samuelsson <samuelsson.j...@gmail.com> wrote:

What happens if a ParNew is triggered while CMS is running? Will it wait for the CMS to finish? If so, that would be the explanation of our long ParNew above.

Regards,
Joel

2014-02-20 16:29 GMT+01:00 Joel Samuelsson <samuelsson.j...@gmail.com>:

Hi Frank,

We got a (quite) long GC pause today on 2.0.5:

  INFO [ScheduledTasks:1] 2014-02-20 13:51:14,528 GCInspector.java (line 116) GC for ParNew: 1627 ms for 1 collections, 425562984 used; max is 4253024256
  INFO [ScheduledTasks:1] 2014-02-20 13:51:14,542 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 3703 ms for 2 collections, 434394920 used; max is 4253024256

Unfortunately it's a production cluster, so I have no additional GC logging enabled. This may be an indication that upgrading is not the (complete) solution.

Regards,
Joel

2014-02-17 13:41 GMT+01:00 Benedict Elliott Smith <belliottsm...@datastax.com>:

Hi Ondrej,

It's possible you were hit by the problems in this thread before, but it looks potentially like you may have other issues. Of course it may be that on G1 you have one issue and CMS another, but 27s is extreme even for G1, so it seems unlikely. If you're hitting these pause times in CMS and you get some more output from the safepoint tracing, please do contribute, as I would love to get to the bottom of that. However, is it possible you're experiencing paging activity? Have you made certain the VM memory is locked (and preferably that paging is entirely disabled, as the bloom filters and other memory won't be locked, although that shouldn't cause pauses during GC)?

Note that mmapped file accesses and other native work shouldn't in any way inhibit GC activity or other safepoint pause times, unless there's a bug in the VM. These threads will simply enter a safepoint as they return to the VM execution context, and are considered safe for the duration they are outside.
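For reference, the safepoint tracing mentioned above comes down to a handful of HotSpot flags. A minimal sketch of how they might be added through cassandra-env.sh follows; the placement in cassandra-env.sh and the exact flag set are assumptions rather than something specified in this thread, and flag availability varies by JVM version:

    # HotSpot (Java 7 era) pause and safepoint tracing
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"    # logs "Total time for which application threads were stopped: ..."
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationConcurrentTime" # time the application actually ran between pauses
    JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"         # per-safepoint breakdown of reaching the safepoint vs. the VM operation
    JVM_OPTS="$JVM_OPTS -XX:PrintSafepointStatisticsCount=1"   # print after every safepoint instead of batching

The safepoint statistics are what produce lines like the G1IncCollectionPause line quoted later in this thread, which separate the time spent bringing threads to the safepoint from the time spent inside the VM operation itself.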
On 17 February 2014 12:30, Ondřej Černoš <cern...@gmail.com> wrote:

Hi,

we tried to switch to G1 because we observed this behaviour on CMS too (a 27 second pause in G1 is quite a good reason not to use it). Pauses with CMS were not easily traceable - the JVM stopped even without a stop-the-world pause scheduled (defragmentation, remarking). We thought the go-to-safepoint waiting time might have been involved (we saw waiting for safepoint resolution) - especially because access to mmapped files is not preemptive, afaik - but that doesn't explain tens of seconds of waiting time; even slow IO should read our sstables into memory in much less time. We switched to G1 out of desperation - and to try different code paths - not that we thought it was a great idea. So I think we were hit by the problem discussed in this thread, just the G1 report wasn't very clear, sorry.

regards,
ondrej

On Mon, Feb 17, 2014 at 11:45 AM, Benedict Elliott Smith <belliottsm...@datastax.com> wrote:

Ondrej,

It seems like your issue is much less difficult to diagnose: your collection times are long. At least, the pause you printed the time for is all attributable to the G1 pause.

Note that G1 has not generally performed well with Cassandra in our testing. There are a number of changes going in soon that may change that, but for the time being it is advisable to stick with CMS. With tuning you can no doubt bring your pauses down considerably.

On 17 February 2014 10:17, Ondřej Černoš <cern...@gmail.com> wrote:

Hi all,

we are seeing the same kind of long pauses in Cassandra. We tried to switch from CMS to G1 without positive result. The stress test is read heavy: 2 datacenters, 6 nodes, 400 reqs/sec on one datacenter. We see spikes in latency at the 99.99th percentile and higher, caused by threads being stopped in the JVM.

The GC in G1 looks like this:
{Heap before GC invocations=4073 (full 1):
 garbage-first heap   total 8388608K, used 3602914K [0x00000005f5c00000, 0x00000007f5c00000, 0x00000007f5c00000)
  region size 4096K, 142 young (581632K), 11 survivors (45056K)
 compacting perm gen  total 28672K, used 27428K [0x00000007f5c00000, 0x00000007f7800000, 0x0000000800000000)
   the space 28672K, 95% used [0x00000007f5c00000, 0x00000007f76c9108, 0x00000007f76c9200, 0x00000007f7800000)
No shared spaces configured.
2014-02-17T04:44:16.385+0100: 222346.218: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 37748736 bytes, new threshold 15 (max 15)
- age   1:   17213632 bytes,   17213632 total
- age   2:   19391208 bytes,   36604840 total
, 0.1664300 secs]
   [Parallel Time: 163.9 ms, GC Workers: 2]
      [GC Worker Start (ms): Min: 222346218.3, Avg: 222346218.3, Max: 222346218.3, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 6.0, Avg: 6.9, Max: 7.7, Diff: 1.7, Sum: 13.7]
      [Update RS (ms): Min: 20.4, Avg: 21.3, Max: 22.1, Diff: 1.7, Sum: 42.6]
         [Processed Buffers: Min: 49, Avg: 60.0, Max: 71, Diff: 22, Sum: 120]
      [Scan RS (ms): Min: 23.2, Avg: 23.2, Max: 23.3, Diff: 0.1, Sum: 46.5]
      [Object Copy (ms): Min: 112.3, Avg: 112.3, Max: 112.4, Diff: 0.1, Sum: 224.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 163.8, Avg: 163.8, Max: 163.8, Diff: 0.0, Sum: 327.6]
      [GC Worker End (ms): Min: 222346382.1, Avg: 222346382.1, Max: 222346382.1, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 2.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.4 ms]
   [Eden: 524.0M(524.0M)->0.0B(476.0M) Survivors: 44.0M->68.0M Heap: 3518.5M(8192.0M)->3018.5M(8192.0M)]
Heap after GC invocations=4074 (full 1):
 garbage-first heap   total 8388608K, used 3090914K [0x00000005f5c00000, 0x00000007f5c00000, 0x00000007f5c00000)
  region size 4096K, 17 young (69632K), 17 survivors (69632K)
 compacting perm gen  total 28672K, used 27428K [0x00000007f5c00000, 0x00000007f7800000, 0x0000000800000000)
   the space 28672K, 95% used [0x00000007f5c00000, 0x00000007f76c9108, 0x00000007f76c9200, 0x00000007f7800000)
No shared spaces configured.
}
 [Times: user=0.35 sys=0.00, real=27.58 secs]
222346.219: G1IncCollectionPause   [ 111  0  0 ]   [ 0  0  0  0  27586 ]  0

And the total time for which application threads were stopped is 27.58 seconds.

CMS behaves in a similar manner. We thought it might be GC waiting for mmapped files being read from disk (the thread cannot reach a safepoint during this operation), but that doesn't explain the huge time.

We'll try jHiccup to see if it provides any additional information. The test was done on a mixed aws/openstack environment, openjdk 1.7.0_45, cassandra 1.2.11. Upgrading to 2.0.x is not an option for us.

regards,

ondrej cernos
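A side note on the log above: user=0.35 sys=0.00 against real=27.58 secs means the GC workers burned almost no CPU during those 27 seconds, which usually points at the process not being scheduled at all (paging, or a starved guest in a virtualised environment) rather than at the collector doing 27 seconds of work. A rough sketch of the OS-side checks behind the paging and memory-locking advice earlier in the thread; these are standard Linux commands and file locations, not steps prescribed anywhere in this thread:

    # confirm swap is off and stays off
    swapon -s                                   # should list no swap devices
    sudo swapoff -a
    echo "vm.swappiness = 0" | sudo tee -a /etc/sysctl.conf

    # allow the Cassandra user to mlock the heap (JNA mlockall needs this);
    # add to /etc/security/limits.conf:
    cassandra  -  memlock  unlimited

    # watch for swap-in/out (si/so) and, on a guest, steal time (st)
    vmstat 5

After a restart it is worth checking system.log for the JNA mlockall message at startup; if locking failed, the Java heap itself can still be paged out.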
On Fri, Feb 14, 2014 at 8:53 PM, Frank Ng <fnt...@gmail.com> wrote:

Sorry, I have not had a chance to file a JIRA ticket. We have not been able to resolve the issue. But since Joel mentioned that upgrading to Cassandra 2.0.X solved it for them, we may need to upgrade. We are currently on Java 1.7 and Cassandra 1.2.8.

On Thu, Feb 13, 2014 at 12:40 PM, Keith Wright <kwri...@nanigans.com> wrote:

You're running 2.0.* in production? May I ask what C* version and OS? Any hardware details would be appreciated as well. Thx!

From: Joel Samuelsson <samuelsson.j...@gmail.com>
Date: Thursday, February 13, 2014 at 11:39 AM
Subject: Re: Intermittent long application pauses on nodes

We have had similar issues and upgrading C* to 2.0.x and Java to 1.7 seems to have helped our issues.

2014-02-13 Keith Wright <kwri...@nanigans.com>:

Frank did you ever file a ticket for this issue or find the root cause? I believe we are seeing the same issues when attempting to bootstrap.

Thanks

From: Robert Coli <rc...@eventbrite.com>
Date: Monday, February 3, 2014 at 6:10 PM
Subject: Re: Intermittent long application pauses on nodes

On Mon, Feb 3, 2014 at 8:52 AM, Benedict Elliott Smith <belliottsm...@datastax.com> wrote:

> It's possible that this is a JVM issue, but if so there may be some remedial action we can take anyway. There are some more flags we should add, but we can discuss that once you open a ticket. If you could include the strange JMX error as well, that might be helpful.

It would be appreciated if you could inform this thread of the JIRA ticket number, for the benefit of the community and google searchers. :)

=Rob