TellApart also has a rather active fork of Diamond (they're working to merge it back upstream soonish) that you can take a look at: <https://github.com/tellapart/Diamond>. They use it to monitor both Apache Mesos and Apache Aurora.
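For a rough idea of what a collector like that does, here is a minimal sketch (not the fork's actual code). It assumes you have already fetched the flat JSON object that Mesos serves at /metrics/snapshot, and that you report to Graphite using its plaintext format ("path value timestamp"). The function name, the prefix, and the sample values are made up for illustration.

```python
import json
import time


def to_graphite_lines(snapshot, prefix, timestamp):
    """Convert a flat {metric_name: number} mapping into Graphite plaintext lines."""
    lines = []
    for name, value in sorted(snapshot.items()):
        # Mesos metric names use '/' as a separator; Graphite paths use dots.
        path = prefix + "." + name.replace("/", ".")
        lines.append("%s %s %d" % (path, value, timestamp))
    return lines


# Hypothetical sample of a /metrics/snapshot response (values are invented).
SAMPLE = json.loads('{"master/cpus_total": 8.0, "master/tasks_running": 3.0}')

if __name__ == "__main__":
    # "mesos.host1" is an assumed prefix; pick whatever fits your Graphite tree.
    for line in to_graphite_lines(SAMPLE, "mesos.host1", int(time.time())):
        print(line)
```

A real agent would fetch the snapshot over HTTP on a timer and write the lines to your Graphite (or StatsD) endpoint. Diamond already handles that scheduling and shipping, which is why a custom collector there is usually the shorter path.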
Twitter has an internal monitoring system, and we have an agent, installed via RPM/Puppet on each host, that scrapes the metrics pages and pushes data to our time series database. If you wanted to set up an agent through Aurora itself, you'd need support for running one task per machine <https://issues.apache.org/jira/browse/AURORA-1075> (which would be cool, but could lead to a circular dependency: if Aurora or Mesos went down, nothing would launch your monitoring agents). I'd recommend getting your monitoring agents onto your hosts with the same system you use for deploying Mesos.

On Tue, Jan 19, 2016 at 12:17 PM, Tomek Janiszewski <jani...@gmail.com> wrote:

> Hi
>
> In our setup we are using Diamond with default system collectors and one
> custom collector (based on
> https://github.com/python-diamond/Diamond/pull/106 but with some
> improvements). Some other solutions were presented at MesosCon:
> https://www.youtube.com/watch?v=yLkc17HFEb8
> https://www.youtube.com/watch?v=zlgAT_xFNzU
>
> Tomek
>
> On Tue, Jan 19, 2016 at 21:04, Michał Łowicki <mlowi...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I've read Mesos Observability Metrics
>> <http://mesos.apache.org/documentation/latest/monitoring/> which gives
>> nice overview of cluster's health. What about other parameters like I/O
>> usage (disk, network), number of processes etc. Maybe there are some tools
>> or their configurations dedicated for Mesos? (we're mostly using Diamond
>> and StatsD which reports to Graphite). How to launch such tools -
>> separately from Mesos or launch as a part of long-running tasks?
>>
>> --
>> BR,
>> Michał Łowicki
>>
>