Hi Sergio,

Thanks for this and sorry for the slow reply.

We are indeed still running Java 8, so that's very helpful.

This Cassandra cluster has been running reliably in Kubernetes for several
years, and while we've had some repair-related issues, they are not related
to container orchestration or the cloud environment. We don't use operators
and have simply built the needed Kubernetes configs (YAML manifests) to
handle deployment of new Docker images (when needed), and so forth. We have:

(1) ConfigMap - Cassandra environment variables
(2) ConfigMap - Prometheus configs for this JMX exporter
<https://github.com/prometheus/jmx_exporter>, which is built into the image
and runs as a Java agent
(3) PodDisruptionBudget - with minAvailable: 2 as the important setting
(4) Service - a headless service (clusterIP: None) which specifies the
ports for cql, jmx, prometheus, and intra-node traffic (see the sketch
after this list)
(5) StatefulSet - 3 replicas, ports, health checks, resources, etc - as you
would expect
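
If it helps, here's a rough sketch of what (3) and (4) look like for us -
the names, labels, and metrics port below are illustrative rather than
pasted from our actual manifests:

apiVersion: v1
kind: Service
metadata:
  name: cassandra                  # illustrative name
spec:
  clusterIP: None                  # headless - gives each pod a stable DNS entry
  selector:
    app: cassandra
  ports:
    - { name: cql, port: 9042 }
    - { name: intra-node, port: 7000 }
    - { name: jmx, port: 7199 }
    - { name: prometheus, port: 7070 }   # whatever port the jmx_exporter agent listens on
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: cassandra
spec:
  minAvailable: 2                  # never voluntarily evict below 2 of the 3 replicas
  selector:
    matchLabels:
      app: cassandra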

We store data on persistent volumes using an SSD storage class, and use: an
updateStrategy of OnDelete, some affinity rules to ensure an even spread of
pods across our zones, Prometheus annotations for scraping the metrics
port, a nodeSelector and tolerations to ensure the Cassandra pods run in
their dedicated node pool, and a preStop hook that runs nodetool drain to
help with graceful shutdown when a pod is rolled.
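
And a trimmed sketch of the relevant parts of the StatefulSet spec - again,
the node pool label, taint, image tag, metrics port, and exact anti-affinity
form are illustrative assumptions rather than a copy of our manifest:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra                 # the headless Service above
  replicas: 3
  updateStrategy:
    type: OnDelete                       # pods only roll when we delete them
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "7070"       # illustrative metrics port
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: cassandra-pool   # illustrative pool name
      tolerations:
        - key: dedicated                 # illustrative taint on the dedicated pool
          value: cassandra
          effect: NoSchedule
      affinity:
        podAntiAffinity:                 # spread the 3 pods across the 3 zones
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: cassandra
              topologyKey: failure-domain.beta.kubernetes.io/zone
      containers:
        - name: cassandra
          image: cassandra:3.11.5        # illustrative tag - we build our own image
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "nodetool drain"]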

I'm guessing your installation is much larger than ours, so an operator
may be a good way to go. For our needs the above has been very reliable,
as has GCP in general.

We are currently updating our backup/restore implementation to provide
better granularity when restoring a specific keyspace, and we're also
exploring Velero <https://github.com/vmware-tanzu/velero> for disaster
recovery.

Hope this helps.


On Fri, Nov 1, 2019 at 5:34 PM Sergio <lapostadiser...@gmail.com> wrote:

> Hi Ben,
>
> Well, I had a similar question, and Jon Haddad preferred ParNew + CMS
> over G1GC for Java 8.
> https://lists.apache.org/thread.html/283547619b1dcdcddb80947a45e2178158394e317f3092b8959ba879@%3Cuser.cassandra.apache.org%3E
> It depends on your JVM; in any case, I would test it against your
> workload.
>
> What's your experience of running Cassandra in k8s? Are you using the
> Cassandra Kubernetes Operator?
>
> How do you monitor it, and how do you perform backups for disaster
> recovery?
>
>
> Best,
>
> Sergio
>
> Il giorno ven 1 nov 2019 alle ore 14:14 Ben Mills <b...@bitbrew.com> ha
> scritto:
>
>> Thanks Sergio - that's good advice and we have this built into the plan.
>> Have you heard a solid/consistent recommendation as to the minimum
>> amount of heap memory required for G1GC?
>>
>> On Fri, Nov 1, 2019 at 5:11 PM Sergio <lapostadiser...@gmail.com> wrote:
>>
>>> In any case I would test any configuration with tlp-stress or the
>>> cassandra-stress tool.
>>>
>>> Sergio
>>>
>>> On Fri, Nov 1, 2019, 12:31 PM Ben Mills <b...@bitbrew.com> wrote:
>>>
>>>> Greetings,
>>>>
>>>> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and considering
>>>> a change to the GC config.
>>>>
>>>> What is the minimum amount of memory that needs to be allocated to heap
>>>> space when using G1GC?
>>>>
>>>> For GC, we currently use CMS. Along with the version upgrade, we'll be
>>>> running the stateful set of Cassandra pods on new machine types in a new
>>>> node pool with 12Gi memory per node. Not a lot of memory but an
>>>> improvement. We may be able to go up to 16Gi memory per node. We'd like to
>>>> continue using these heap settings:
>>>>
>>>> -XX:+UnlockExperimentalVMOptions
>>>> -XX:+UseCGroupMemoryLimitForHeap
>>>> -XX:MaxRAMFraction=2
>>>>
>>>> which (if 12Gi per node) would provide 6Gi memory for heap (i.e. half
>>>> of total available).
>>>>
>>>> Here are some details on the environment and configs in the event that
>>>> something is relevant.
>>>>
>>>> Environment: Kubernetes
>>>> Environment Config: Stateful set of 3 replicas
>>>> Storage: Persistent Volumes
>>>> Storage Class: SSD
>>>> Node OS: Container-Optimized OS
>>>> Container OS: Ubuntu 16.04.3 LTS
>>>> Data Centers: 1
>>>> Racks: 3 (one per zone)
>>>> Nodes: 3
>>>> Tokens: 4
>>>> Replication Factor: 3
>>>> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
>>>> Compaction Strategy: STCS (all tables)
>>>> Read/Write Requirements: Blend of both
>>>> Data Load: <1GB per node
>>>> gc_grace_seconds: default (10 days - all tables)
>>>>
>>>> GC Settings: (CMS)
>>>>
>>>> -XX:+UseParNewGC
>>>> -XX:+UseConcMarkSweepGC
>>>> -XX:+CMSParallelRemarkEnabled
>>>> -XX:SurvivorRatio=8
>>>> -XX:MaxTenuringThreshold=1
>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>> -XX:CMSWaitDuration=30000
>>>> -XX:+CMSParallelInitialMarkEnabled
>>>> -XX:+CMSEdenChunksRecordAlways
>>>>
>>>> Any ideas are much appreciated.
>>>>
>>>
