Hi (yet again) Sergio,

Finally, note that we use this sidecar
<https://github.com/Stackdriver/stackdriver-prometheus-sidecar> for
shipping metrics to Stackdriver. It runs as a second container within our
Prometheus stateful set.
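If it's useful, the sidecar's container spec looks roughly like this (the
image tag, project, location and paths below are placeholders, not our
actual values - see the sidecar's README for the full flag list):

  # Second container in the Prometheus StatefulSet (values illustrative)
  - name: stackdriver-sidecar
    image: gcr.io/stackdriver-prometheus/stackdriver-prometheus-sidecar:0.7.3
    args:
    - --stackdriver.project-id=my-gcp-project        # placeholder project
    - --stackdriver.kubernetes.location=us-east1     # placeholder region
    - --stackdriver.kubernetes.cluster-name=my-cluster
    # The sidecar tails Prometheus's write-ahead log, so this must point
    # at the WAL inside the Prometheus data volume.
    - --prometheus.wal-directory=/data/wal
    volumeMounts:
    - name: prometheus-data
      mountPath: /data

The key design point is that the sidecar shares the Prometheus data volume
and follows the WAL, rather than re-scraping targets itself.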
On Mon, Nov 4, 2019 at 8:46 AM Ben Mills <b...@bitbrew.com> wrote:

> Hi (again) Sergio,
>
> I forgot to note that along with Prometheus, we use Grafana (with
> Prometheus as its data source) as well as Stackdriver for monitoring.
>
> As Stackdriver is still developing (it does not yet have all the
> features we need), we tend to use it for the basics: monitoring and
> alerting on memory, CPU and disk (PV) thresholds. More specifically, the
> Prometheus JMX exporter (noted above) scrapes all the MBeans inside
> Cassandra and exports them in the Prometheus data model. Its ConfigMap
> filters (allows) our metrics of interest, and those metrics are sent to
> our Grafana instances and to Stackdriver. We use Grafana for more
> advanced metric configs that provide deeper insight into Cassandra -
> e.g. read/write latencies and so forth. For memory utilization, we
> monitor both at the pod level in Stackdriver (to avoid having a
> Cassandra pod OOMKilled by the kubelet) and inside the JVM (heap space).
>
> Hope this helps.
>
> On Mon, Nov 4, 2019 at 8:26 AM Ben Mills <b...@bitbrew.com> wrote:
>
>> Hi Sergio,
>>
>> Thanks for this and sorry for the slow reply.
>>
>> We are indeed still running Java 8, so this is very helpful.
>>
>> This Cassandra cluster has been running reliably in Kubernetes for
>> several years, and while we've had some repair-related issues, they are
>> not related to container orchestration or the cloud environment. We
>> don't use operators and have simply built the needed Kubernetes configs
>> (YAML manifests) to handle deployment of new Docker images (when
>> needed), and so forth. We have:
>>
>> (1) ConfigMap - Cassandra environment variables
>> (2) ConfigMap - Prometheus configs for this JMX exporter
>> <https://github.com/prometheus/jmx_exporter>, which is built into the
>> image and runs as a Java agent
>> (3) PodDisruptionBudget - with minAvailable: 2 as the important setting
>> (4) Service - a headless service (clusterIP: None) which specifies the
>> ports for cql, jmx, prometheus and intra-node
>> (5) StatefulSet - 3 replicas, ports, health checks, resources, etc - as
>> you would expect
>>
>> We store data on persistent volumes using an SSD storage class, and
>> use: an updateStrategy of OnDelete, some affinity rules to ensure an
>> even spread of pods across our zones, Prometheus annotations for
>> scraping the metrics port, a nodeSelector and tolerations to ensure the
>> Cassandra pods run in their dedicated node pool, and a preStop hook
>> that runs nodetool drain to help with graceful shutdown when a pod is
>> rolled.
>>
>> I'm guessing your installation is much larger than ours, so operators
>> may be a good way to go. For our needs the above has been very
>> reliable, as has GCP in general.
>>
>> We are currently updating our backup/restore implementation to provide
>> better granularity with respect to restoring a specific keyspace, and
>> we are also exploring Velero <https://github.com/vmware-tanzu/velero>
>> for DR.
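>>
>> In case it's handy, the exporter config in (2) has roughly this shape
>> (the whitelist entry, pattern and metric name here are illustrative -
>> our actual allow-list is longer):
>>
>>   lowercaseOutputName: true
>>   whitelistObjectNames:
>>   - "org.apache.cassandra.metrics:type=ClientRequest,*"
>>   rules:
>>   - pattern: "org.apache.cassandra.metrics<type=(ClientRequest), scope=(Read|Write), name=(Latency)><>(Count)"
>>     name: cassandra_clientrequest_$2_$3_$1
>>     type: COUNTER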
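>>
>> And the StatefulSet bits mentioned above look roughly like this (the
>> labels and metrics port are placeholders, not our exact values):
>>
>>   spec:
>>     replicas: 3
>>     updateStrategy:
>>       type: OnDelete                   # pods roll only when we delete them
>>     template:
>>       metadata:
>>         annotations:
>>           prometheus.io/scrape: "true"
>>           prometheus.io/port: "7070"   # JMX exporter port (placeholder)
>>       spec:
>>         nodeSelector:
>>           pool: cassandra              # placeholder node-pool label
>>         containers:
>>         - name: cassandra
>>           lifecycle:
>>             preStop:
>>               exec:
>>                 # Drain the node for a graceful shutdown before the
>>                 # pod is stopped.
>>                 command: ["/bin/sh", "-c", "nodetool drain"]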
>>
>> Hope this helps.
>>
>> On Fri, Nov 1, 2019 at 5:34 PM Sergio <lapostadiser...@gmail.com> wrote:
>>
>>> Hi Ben,
>>>
>>> Well, I had a similar question, and Jon Haddad preferred ParNew + CMS
>>> over G1GC for Java 8:
>>> https://lists.apache.org/thread.html/283547619b1dcdcddb80947a45e2178158394e317f3092b8959ba879@%3Cuser.cassandra.apache.org%3E
>>> It depends on your JVM, and in any case I would test it based on your
>>> workload.
>>>
>>> What's your experience of running Cassandra in k8s? Are you using the
>>> Cassandra Kubernetes Operator?
>>>
>>> How do you monitor it, and how do you perform backup and disaster
>>> recovery?
>>>
>>> Best,
>>>
>>> Sergio
>>>
>>> Il giorno ven 1 nov 2019 alle ore 14:14 Ben Mills <b...@bitbrew.com>
>>> ha scritto:
>>>
>>>> Thanks Sergio - that's good advice, and we have this built into the
>>>> plan.
>>>> Have you heard a solid/consistent recommendation as to the amount of
>>>> heap memory G1GC requires?
>>>>
>>>> On Fri, Nov 1, 2019 at 5:11 PM Sergio <lapostadiser...@gmail.com>
>>>> wrote:
>>>>
>>>>> In any case, I would test any configuration with tlp-stress or the
>>>>> Cassandra stress tool.
>>>>>
>>>>> Sergio
>>>>>
>>>>> On Fri, Nov 1, 2019, 12:31 PM Ben Mills <b...@bitbrew.com> wrote:
>>>>>
>>>>>> Greetings,
>>>>>>
>>>>>> We are planning a Cassandra upgrade from 3.7 to 3.11.5 and
>>>>>> considering a change to the GC config.
>>>>>>
>>>>>> What is the minimum amount of memory that needs to be allocated to
>>>>>> heap space when using G1GC?
>>>>>>
>>>>>> For GC, we currently use CMS. Along with the version upgrade, we'll
>>>>>> be running the stateful set of Cassandra pods on new machine types
>>>>>> in a new node pool with 12Gi memory per node. Not a lot of memory,
>>>>>> but an improvement. We may be able to go up to 16Gi memory per
>>>>>> node. We'd like to continue using these heap settings:
>>>>>>
>>>>>> -XX:+UnlockExperimentalVMOptions
>>>>>> -XX:+UseCGroupMemoryLimitForHeap
>>>>>> -XX:MaxRAMFraction=2
>>>>>>
>>>>>> which (at 12Gi per node) would provide 6Gi of memory for heap
>>>>>> (i.e. half of total available).
>>>>>>
>>>>>> Here are some details on the environment and configs in the event
>>>>>> that something is relevant.
>>>>>>
>>>>>> Environment: Kubernetes
>>>>>> Environment Config: Stateful set of 3 replicas
>>>>>> Storage: Persistent Volumes
>>>>>> Storage Class: SSD
>>>>>> Node OS: Container-Optimized OS
>>>>>> Container OS: Ubuntu 16.04.3 LTS
>>>>>> Data Centers: 1
>>>>>> Racks: 3 (one per zone)
>>>>>> Nodes: 3
>>>>>> Tokens: 4
>>>>>> Replication Factor: 3
>>>>>> Replication Strategy: NetworkTopologyStrategy (all keyspaces)
>>>>>> Compaction Strategy: STCS (all tables)
>>>>>> Read/Write Requirements: Blend of both
>>>>>> Data Load: <1GB per node
>>>>>> gc_grace_seconds: default (10 days - all tables)
>>>>>>
>>>>>> GC Settings: (CMS)
>>>>>>
>>>>>> -XX:+UseParNewGC
>>>>>> -XX:+UseConcMarkSweepGC
>>>>>> -XX:+CMSParallelRemarkEnabled
>>>>>> -XX:SurvivorRatio=8
>>>>>> -XX:MaxTenuringThreshold=1
>>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>>>> -XX:CMSWaitDuration=30000
>>>>>> -XX:+CMSParallelInitialMarkEnabled
>>>>>> -XX:+CMSEdenChunksRecordAlways
>>>>>>
>>>>>> Any ideas are much appreciated.
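>>>>>>
>>>>>> For what it's worth, if we do switch, the G1 settings we'd start
>>>>>> from are roughly the ones in the commented-out G1 section of
>>>>>> Cassandra's jvm.options (illustrative - not yet tested on our
>>>>>> workload):
>>>>>>
>>>>>> -XX:+UseG1GC
>>>>>> -XX:G1RSetUpdatingPauseTimePercent=5
>>>>>> -XX:MaxGCPauseMillis=500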