Sorry to jump on this late. GC is one of my favorite topics. A while ago I wrote a blog post about C* GC tuning and documented several issues that I had experienced. It seems to have helped some people in the past, so I am sharing it here:
http://aryanet.com/blog/cassandra-garbage-collector-tuning

On Thu, Feb 12, 2015 at 11:08 AM, Jiri Horky <ho...@avast.com> wrote:

> Number of cores: 2x6 cores x 2 (HT).
>
> I do agree with you that the hardware is certainly overestimated for just one Cassandra, but we got a very good price since we ordered several 10s of the same nodes for a different project. That's why we use it for multiple Cassandra instances.
>
> Jirka H.
>
> On 02/12/2015 04:18 PM, Eric Stevens wrote:
>
> > each node has 256G of memory, 24x1T drives, 2x Xeon CPU
>
> I don't have first-hand experience running Cassandra on such massive hardware, but it strikes me that these machines are dramatically oversized to be good candidates for Cassandra (though I wonder how many cores are in those CPUs; I'm guessing closer to 18 than 2 based on the other hardware).
>
> A larger cluster of smaller hardware would be a much better shape for Cassandra. Or several clusters of smaller hardware, since you're running multiple instances on this hardware - best practice is one instance per host no matter the hardware size.
>
> On Thu, Feb 12, 2015 at 12:36 AM, Jiri Horky <ho...@avast.com> wrote:
>
>> Hi Chris,
>>
>> On 02/09/2015 04:22 PM, Chris Lohfink wrote:
>>
>> - number of tombstones - how can I reliably find it out?
>> https://github.com/spotify/cassandra-opstools
>> https://github.com/cloudian/support-tools
>>
>> thanks.
>>
>> If you're not getting much compression it may be worth trying to disable it; it may contribute, but it's very unlikely to be the cause of the GC pressure itself.
>>
>> 7000 sstables but STCS? Sounds like compactions couldn't keep up. Do you have a lot of pending compactions (nodetool)? You may want to increase your compaction throughput (nodetool) to see if you can catch up a little; it causes a lot of heap overhead to do reads with that many sstables. You may even need to take more drastic measures if it can't catch back up.
>>
>> I am sorry, I was wrong. We actually do use LCS (the switch was done recently). There are almost no pending compactions. We have increased the sstable size to 768M, so it should help as well.
>>
>> May also be good to check `nodetool cfstats` for very wide partitions.
>>
>> There are basically none, this is fine.
>>
>> It seems that the problem really comes from having so much data in so many sstables, so the org.apache.cassandra.io.compress.CompressedRandomAccessReader classes consume more memory than 0.75*HEAP_SIZE, which triggers the CMS over and over.
>>
>> We have turned off the compression and so far, the situation seems to be fine.
>>
>> Cheers
>> Jirka H.
>>
>> There's a good chance that if you're under load and have over an 8 GB heap, your GCs could use tuning. The bigger the nodes, the more manual tweaking they will require to get the most out of them. https://issues.apache.org/jira/browse/CASSANDRA-8150 also has some ideas.
>>
>> Chris
>>
>> On Mon, Feb 9, 2015 at 2:00 AM, Jiri Horky <ho...@avast.com> wrote:
>>
>>> Hi all,
>>>
>>> thank you all for the info.
>>>
>>> To answer the questions:
>>> - we have 2 DCs with 5 nodes in each; each node has 256G of memory, 24x1T drives, 2x Xeon CPU - there are multiple Cassandra instances running for different projects. The node itself is powerful enough.
>>> - there are 2 keyspaces, one with 3 replicas per DC, one with 1 replica per DC (because of the amount of data and because it serves more or less like a cache)
>>> - there are about 4k/s Request-response, 3k/s Read and 2k/s Mutation requests - the numbers are a sum over all nodes
>>> - we use STCS (LCS would be quite IO-heavy for this amount of data)
>>> - number of tombstones - how can I reliably find it out?
>>> - the biggest CF (3.6T per node) has 7000 sstables
>>>
>>> Now, I understand that the best practice for Cassandra is to run "with the minimum size of heap which is enough", which in this case we thought is about 12G - there is always 8G consumed by the SSTable readers. Also, I thought that a high number of tombstones creates pressure in the new space (which can then cause pressure in the old space as well), but this is not what we are seeing. We see continuous GC activity in the Old generation only.
>>>
>>> Also, I noticed that the biggest CF has a compression factor of 0.99, which basically means that the data come compressed already. Do you think that turning off the compression should help with memory consumption?
>>>
>>> Also, I think that tuning CMSInitiatingOccupancyFraction=75 might help here, as it seems that 8G is something that Cassandra needs for bookkeeping this amount of data and that this was slightly above the 75% limit, which triggered the CMS again and again.
>>>
>>> I will definitely have a look at the presentation.
>>>
>>> Regards
>>> Jiri Horky
>>>
>>> On 02/08/2015 10:32 PM, Mark Reddy wrote:
>>>
>>> Hey Jiri,
>>>
>>> While I don't have any experience running 4TB nodes (yet), I would recommend taking a look at a presentation by Aaron Morton on large nodes: http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/ to see if you can glean anything from that.
>>>
>>> I would note that at the start of his talk he mentions that in version 1.2 we can now talk about nodes around 1 - 3 TB in size, so if you are storing anything more than that you are getting into very specialised use cases.
>>>
>>> If you could provide us with some more information about your cluster setup (no. of CFs, read/write patterns, do you delete / update often, etc.) that may help in getting you to a better place.
>>>
>>> Regards,
>>> Mark
>>>
>>> On 8 February 2015 at 21:10, Kevin Burton <bur...@spinn3r.com> wrote:
>>>
>>>> Do you have a lot of individual tables? Or lots of small compactions?
>>>>
>>>> I think the general consensus is that (at least for Cassandra), 8GB heaps are ideal.
>>>>
>>>> If you have lots of small tables it's a known anti-pattern (I believe) because the Cassandra internals could do a better job of handling the in-memory metadata representation.
>>>>
>>>> I think this has been improved in 2.0 and 2.1, though, so the fact that you're on 1.2.18 could exacerbate the issue. You might want to consider an upgrade (though that has its own issues as well).
>>>>
>>>> On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky <ho...@avast.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> we are seeing quite high GC pressure (in the old space, with the CMS GC algorithm) on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory (2G for the new space). The node runs fine for a couple of days, then the GC activity starts to rise and reaches about 15% of the C* activity, which causes dropped messages and other problems.
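(For reference, the heap and CMS settings being discussed live in conf/cassandra-env.sh. A minimal sketch with the numbers from this thread follows; it is illustrative only - the JVM_OPTS lines simply mirror the stock 1.2-era script - so double-check against your own copy of the file.)

    # conf/cassandra-env.sh -- sketch only, values taken from this thread
    MAX_HEAP_SIZE="12G"      # total heap mentioned above
    HEAP_NEWSIZE="2G"        # new-generation size mentioned above

    # CMS options as they appear in the stock 1.2-era cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
    JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
    JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

Note that raising CMSInitiatingOccupancyFraction only buys headroom: if the steady-state live set (the ~8G of readers mentioned below) sits near the threshold, CMS will still run back to back.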
>>>>> Taking a look at a heap dump, there is about 8G used by SSTableReader classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.
>>>>>
>>>>> Is this something expected and we have just reached the limit of how much data a single Cassandra instance can handle, or is it possible to tune it better?
>>>>>
>>>>> Regards
>>>>> Jiri Horky
>>>>
>>>> --
>>>> Founder/CEO Spinn3r.com
>>>> Location: San Francisco, CA
>>>> blog: http://burtonator.wordpress.com
>>>> ... or check out my Google+ profile <https://plus.google.com/102718274791889610666/posts>
>>>> <http://spinn3r.com>

--
Cheers,
-Arya
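P.S. For anyone finding this thread in the archives, here is a rough sketch of the checks and changes discussed above as shell commands. The nodetool subcommand names are from the 1.2/2.0-era tooling, "my_ks"/"my_cf" are placeholders, and the empty sstable_compression value is the pre-3.0 CQL way of disabling compression - verify against your own version before running anything.

    # Is compaction keeping up? (pending tasks)
    nodetool compactionstats

    # Temporarily raise the compaction throughput cap (MB/s) to let it catch up
    nodetool setcompactionthroughput 64

    # Per-CF stats: sstable count, SSTable Compression Ratio, compacted row max size (wide partitions)
    nodetool cfstats

    # Partition size / column count distribution for a single CF
    nodetool cfhistograms my_ks my_cf

    # If the data is effectively incompressible (ratio ~0.99), disable sstable compression (pre-3.0 syntax)
    echo "ALTER TABLE my_ks.my_cf WITH compression = {'sstable_compression': ''};" | cqlsh

Disabling compression only removes the CompressedRandomAccessReader overhead for newly written sstables; existing ones keep their format until they are recompacted or rewritten (e.g. via nodetool upgradesstables).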