Hi,

Before introducing Mesos we were mainly using Graphite / Grafana. Ideally we would like to have per-container metrics as an easy way to tell whether a problem affects a single container, a subset of containers, or all of them.
Unfortunately, Graphite is far from ideal for that. Putting the container identifier into the metric name has many drawbacks, such as creating tons of new metrics with every release on Marathon (new containers = new identifiers).

We have investigated InfluxDB so far, but the project doesn't seem mature enough yet: components like the Telegraf statsd input (https://github.com/influxdata/telegraf/blob/master/plugins/inputs/statsd/README.md#influx-statsd) still have major blockers. Its README says: "COMING SOON: there will be a way to specify multiple fields."

What do you use to monitor your Mesos clusters and, for example, to detect that some containers are having issues?

--
BR,
Michał Łowicki