On Tue, Feb 21, 2017 at 1:06 PM Francesco Romani <from...@redhat.com> wrote:
> Hello everyone,
>
> In the last weeks I've been submitting PRs to collectd upstream, to
> bring the virt plugin up to date with Vdsm and oVirt needs.
>
> Previously, the collectd virt plugin reported only a subset of the
> metrics oVirt uses.
>
> In current collectd master, the collectd virt plugin provides all the
> data Vdsm (thus Engine) needs. This means that it is now possible for
> Vdsm or Engine to query collectd, not Vdsm/libvirt, and have the same
> data.

Do we wish to ship the unixsock collectd plugin? I'm not sure we do these
days (4.1). We can do that later, of course, when we ship this.
Y.

> There are only two caveats:
>
> 1. It is yet to be seen which version of collectd will ship all those
>    enhancements.
>
> 2. collectd *intentionally* reports metrics as rates, not as absolute
>    values as Vdsm does. This may be an issue in the presence of
>    restarts/data loss in the link between collectd and the metrics
>    store.
>
> Please keep reading for more details:
>
> How to get the code?
> --------------------
>
> This is somewhat tricky until we get an official release. If you are
> familiar with the RPM build process, it is easy to build a custom
> package from a snapshot of collectd master
> (https://github.com/collectd/collectd) and a recent 5.7.1 RPM (like
> https://koji.fedoraproject.org/koji/buildinfo?buildID=835669).
>
> How to configure it?
> --------------------
>
> Most things work out of the box. A Vdsm patch currently in progress
> ships the recommended configuration:
> https://gerrit.ovirt.org/#/c/71176/6/static/etc/collectd.d/virt.conf
>
> The meaning of the configuration options is documented in man 5
> collectd.conf.
>
> What does it look like?
> -----------------------
>
> Let me post one "screenshot" :)
>
> $ collectdctl listval | grep a0
> a0/virt/disk_octets-hdc
> a0/virt/disk_octets-vda
> a0/virt/disk_ops-hdc
> a0/virt/disk_ops-vda
> a0/virt/disk_time-hdc
> a0/virt/disk_time-vda
> a0/virt/if_dropped-vnet0
> a0/virt/if_errors-vnet0
> a0/virt/if_octets-vnet0
> a0/virt/if_packets-vnet0
> a0/virt/memory-actual_balloon
> a0/virt/memory-rss
> a0/virt/memory-total
> a0/virt/ps_cputime
> a0/virt/total_requests-flush-hdc
> a0/virt/total_requests-flush-vda
> a0/virt/total_time_in_ms-flush-hdc
> a0/virt/total_time_in_ms-flush-vda
> a0/virt/virt_cpu_total
> a0/virt/virt_vcpu-0
> a0/virt/virt_vcpu-1
>
> How to consume the data?
> ------------------------
>
> Among the ways to query collectd, the two most popular (and the best
> fit for the oVirt use case) are perhaps the network protocol
> (https://collectd.org/wiki/index.php/Binary_protocol)
> and the plain text protocol
> (https://collectd.org/wiki/index.php/Plain_text_protocol). The first
> could be used by Engine to get the data directly, or to consolidate the
> metrics in one database (e.g. to run any kind of query, for historical
> series...).
> The latter will be used by Vdsm to keep reporting the metrics (again
> https://gerrit.ovirt.org/#/c/71176/6).
>
> Please note that the performance of the plain text protocol is known to
> be lower than that of the binary protocol.
>
> What about the unresponsive hosts?
> ----------------------------------
>
> We know from experience that hosts may become unresponsive, and this can
> disrupt monitoring. However, we do want to keep monitoring the
> responsive hosts, so that one rogue host does not make us lose all the
> monitoring data.
> To cope with this need, the virt plugin gained support for a "partition
> tag". With this, we can group VMs together using one arbitrary tag. This
> is completely transparent to collectd, and also completely optional.
> oVirt can use this tag to group VMs per storage domain, or however it
> sees fit, trying to minimize the disruption should one host become
> unresponsive.
>
> Read the full docs here:
> https://github.com/collectd/collectd/commit/999efc28d8e2e96bc15f535254d412a79755ca4f
>
> What about the collectd-ovirt plugin?
> -------------------------------------
>
> Some time ago I implemented an out-of-tree collectd plugin leveraging
> the libvirt bulk stats: https://github.com/fromanirh/collectd-ovirt
> This plugin is meant to be a modern, drop-in replacement for the
> existing virt plugin.
> The development of that out-of-tree plugin is now halted, because we
> have everything we need in the upstream collectd plugin.
>
> Future work
> -----------
>
> We believe we have reached feature parity, so we are looking at
> bugfixes/performance tuning in the near term. I'll be happy to
> provide more patches/PRs about that.
>
> Thanks and bests,
>
> --
> Francesco Romani
> Red Hat Engineering Virtualization R & D
> IRC: fromani
>
> _______________________________________________
> Devel mailing list
> Devel@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
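
For anyone who wants to try the plain text protocol mentioned in the quoted mail, here is a minimal Python sketch. It is not taken from the Vdsm patch. It assumes the unixsock plugin is enabled and that the socket lives at /var/run/collectd-unixsock (an assumed path; check the SocketFile option in your collectd.conf). The helper names parse_status, parse_getval and getval are made up for illustration. It only relies on the documented reply shape: a status line with a count (negative on error), followed by that many "name=value" lines.

```python
import socket

# Assumption: the real path depends on the unixsock plugin's SocketFile option.
SOCKET_PATH = "/var/run/collectd-unixsock"


def parse_status(line):
    """Split a status line such as '2 Values found' into (count, message)."""
    count, _, message = line.strip().partition(" ")
    return int(count), message


def parse_getval(lines):
    """Parse a GETVAL reply (status line plus 'name=value' lines) into a dict."""
    count, message = parse_status(lines[0])
    if count < 0:
        # A negative status means the daemon rejected the command.
        raise RuntimeError("collectd error: " + message)
    values = {}
    for line in lines[1:1 + count]:
        name, _, raw = line.strip().partition("=")
        values[name] = float(raw)
    return values


def getval(identifier, path=SOCKET_PATH):
    """Send one GETVAL command over the unix socket and return the parsed reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        with sock.makefile("rw", encoding="ascii", newline="\n") as stream:
            stream.write('GETVAL "%s"\n' % identifier)
            stream.flush()
            status = stream.readline()
            count, _ = parse_status(status)
            # Read exactly as many value lines as the status line announced.
            body = [stream.readline() for _ in range(max(count, 0))]
    return parse_getval([status] + body)
```

On a host with a running daemon, something like getval("a0/virt/disk_octets-hdc") should return one entry per data source of that value list. The parsing is kept separate from the socket handling so it can be checked without a running collectd.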