Thanks Peter and Aaron.

Right now I have so much logging that the CMS log lines get flushed
out (somehow they do not appear in system.log, only on stdout). I'll
keep an eye on the correlation with ParNew as I get more logging.

Yang

On Sun, Sep 25, 2011 at 3:59 AM, Peter Schuller
<peter.schul...@infidyne.com> wrote:
>> I see the following in my GC log
>>
>> 1910.513: [GC [1 CMS-initial-mark: 2598619K(26214400K)]
>> 13749939K(49807360K), 6.0696680 secs] [Times: user=6.10 sys=0.00,
>> real=6.07 secs]
>>
>> So there is a stop-the-world period of 6 seconds. Does this sound
>> bad, or is 6 seconds OK, and should we expect the built-in
>> fault tolerance of Cassandra to handle this?
>
> initial-mark pauses are stop-the-world, so a 6 second initial-mark
> would have paused the node for those 6 seconds.
>
> The initial mark is essentially marking roots for old-gen; that should
> include thread stacks and such, but will also include younger
> generations. You might read [1] which talks a bit about it; a
> recommendation there is to make sure that initial marks happen right
> after a young-gen collection, and they advise increasing heap size
> sufficiently to allow an initial mark to trigger (I suppose by
> heuristics) after the young gen collection, prior to the CMS trigger.
> It makes sense, especially given that initial-mark is single-threaded,
> to try to do that (and leave the young-gen smaller, collected by the
> parallel collector). However I'm not entirely clear on what VM options
> are required for this. I had a brief look at the code but it wasn't
> obvious at cursory glance under what circumstances an initial mark is
> triggered right after young-gen vs. not. In your case you clearly have
> enough heap.
>
> Can you correlate with ParNew collections and see if the initial mark
> pauses seem to happen immediately after a ParNew, or somewhere in
> between, in the cases where they take this long?
>
> Also, as a mitigation: What's your young generation size? One way to
> mitigate the problem, if it is indeed the young gen marking that is
> taking time, is to decrease the size of the young generation to leave
> less work for initial marking. Normally the young gen is sized based
> on expected pause times given parallel ParNew collections, but if a
> non-parallel initial-mark is having to do marking of the same contents
> the pause time could be higher (hence the discussion above).
>
> Also, is each initial mark this long, or is that something that
> happens once in a while?
>
> As for Cassandra dealing with it: It is definitely not a good thing to
> have 6 second pauses. Even with all other nodes up, it takes time for
> the dynamic snitch to realize what's going on and you will tend to see
> a subset of requests to the cluster get 'stuck' in circumstances like
> that. Also, if you're e.g. doing QUORUM at RF=3, if a node is down for
> legitimate reasons, another node having a 6 second pause will by
> necessity cause high latency for requests during that period.
>
> [1] http://answerpot.com/showthread.php?1558705-CMS+initial+mark+pauses
>
>
> --
> / Peter Schuller (@scode on twitter)
>
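
The correlation Peter asks about (do the long initial-mark pauses come
right after a ParNew, or somewhere in between?) can be checked with a
small script over the GC log. The following is only a minimal sketch: it
assumes each GC event is on a single line, that lines start with an
uptime stamp as in the excerpt above, and that ParNew lines contain
"[ParNew". The function name initial_mark_gaps is made up for this
example. Note that the gap is measured from the start stamp of the
preceding ParNew, not from its end.

```python
import re

# Assumed line shapes (not verified against every JVM version):
#   1910.513: [GC [1 CMS-initial-mark: ...] ..., 6.0696680 secs] ...
#   1909.900: [GC 1909.900: [ParNew: ..., 0.0100000 secs] ...
INITIAL_MARK = re.compile(
    r"^(?P<ts>\d+\.\d+): \[GC \[1 CMS-initial-mark: .*?, (?P<secs>\d+\.\d+) secs\]"
)
PARNEW = re.compile(r"^(?P<ts>\d+\.\d+): \[GC .*?\[ParNew")


def initial_mark_gaps(lines):
    """Return (timestamp, pause_secs, secs_since_last_parnew) per initial mark.

    secs_since_last_parnew is None if no ParNew line was seen earlier.
    """
    last_parnew = None  # start stamp of the most recent ParNew line
    results = []
    for line in lines:
        m = PARNEW.match(line)
        if m:
            last_parnew = float(m.group("ts"))
            continue
        m = INITIAL_MARK.match(line)
        if m:
            ts = float(m.group("ts"))
            pause = float(m.group("secs"))
            gap = None if last_parnew is None else ts - last_parnew
            results.append((ts, pause, gap))
    return results
```

Feeding it the lines of the GC log (for example, open(path) on wherever
the stdout output was captured) gives one tuple per initial mark. Long
pauses paired with large gaps would suggest the initial mark ran with a
mostly full young generation.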
