вт, 9 мар. 2021 г. в 15:35, Michal Privoznik <mpriv...@redhat.com>:
> On 2/19/21 9:08 PM, Aleksei Zakharov wrote: > > This patch adds delay time (steal time inside guest) to libvirt > > domain per-vcpu stats. Delay time is an important performance metric. > > It is a consequence of the overloaded CPU. Knowledge of the delay > > time of a virtual machine helps to understand if it is affected and > > estimate the impact. > > > > As a result, it is possible to react exactly when needed and > > rebalance the load between hosts. This is used by cloud providers > > to provide quality of service, especially when the CPU is > > oversubscribed. > > > > It's more convenient to work with this metric in a context of a > > libvirt domain. Any monitoring software may use this information. > > > > Signed-off-by: Aleksei Zakharov <zaha...@selectel.ru> > > --- > > docs/manpages/virsh.rst | 4 ++++ > > src/libvirt-domain.c | 4 ++++ > > src/qemu/qemu_driver.c | 44 +++++++++++++++++++++++++++++++++++++++-- > > 3 files changed, 50 insertions(+), 2 deletions(-) > > > > diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst > > index e3afa48f7b..a2b83ddfaa 100644 > > --- a/docs/manpages/virsh.rst > > +++ b/docs/manpages/virsh.rst > > @@ -2295,6 +2295,10 @@ When selecting the *--state* group the following > fields are returned: > > * ``vcpu.<num>.halted`` - virtual CPU <num> is halted: yes or > > no (may indicate the processor is idle or even disabled, > > depending on the architecture) > > +* ``vcpu.<num>.delay`` - time the vCPU <num> thread was enqueued by the > > + host scheduler, but was waiting in the queue instead of running. > > + Exposed to the VM as a steal time. > > + > > > > > > *--interface* returns: > > diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c > > index 4af0166872..e54d11e513 100644 > > --- a/src/libvirt-domain.c > > +++ b/src/libvirt-domain.c > > @@ -11693,6 +11693,10 @@ virConnectGetDomainCapabilities(virConnectPtr > conn, > > * "vcpu.<num>.halted" - virtual CPU <num> is halted, may indicate > the > > * processor is idle or even disabled, > depending > > * on the architecture) > > + * "vcpu.<num>.delay" - time the vCPU <num> thread was enqueued by > the > > + * host scheduler, but was waiting in the queue > > + * instead of running. Exposed to the VM as a > steal > > + * time. > > * > > * VIR_DOMAIN_STATS_INTERFACE: > > * Return network interface statistics (from domain point of view). > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > > index b9bbdf8d48..9ec3c8fce7 100644 > > --- a/src/qemu/qemu_driver.c > > +++ b/src/qemu/qemu_driver.c > > @@ -1332,6 +1332,34 @@ static char > *qemuConnectGetCapabilities(virConnectPtr conn) { > > return virCapabilitiesFormatXML(caps); > > } > > > > +static int > > +qemuGetSchedstatDelay(unsigned long long *cpudelay, > > + pid_t pid, pid_t tid) > > +{ > > I know you took inspiration in existing code, but our standards have > moved since that code was written, a bit. > > > + g_autofree char *proc = NULL; > > + unsigned long long oncpu = 0; > > + g_autofree FILE *schedstat = NULL; > > + > > + if (tid) > > + proc = g_strdup_printf("/proc/%d/task/%d/schedstat", (int)pid, > (int)tid); > > + else > > + proc = g_strdup_printf("/proc/%d/schedstat", (int)pid); > > + > > + schedstat = fopen(proc, "r"); > > + if (!schedstat) { > > + virReportError(VIR_ERR_INTERNAL_ERROR, > > + _("Unable to open schedstat file at '%s'"), > > + proc); > > I'd expect this to be an error. Except for the case when the file > doesn't exist. > > > + } > > + if (fscanf(schedstat, "%llu %llu", &oncpu, cpudelay) < 2) { > > + virReportError(VIR_ERR_INTERNAL_ERROR, > > + _("Unable to parse schedstat info at '%s'"), > > + proc); > > > And this has to be error always. > > > + } > > + > > + VIR_FORCE_FCLOSE(schedstat); > > + return 0; > > I'm fixing this function a bit. > > > +} > > The rest looks okay. I'm pushing it. > > Reviewed-by: Michal Privoznik <mpriv...@redhat.com> > > Congratulations on your first libvirt contribution! > > Michal > > Awesome, thank you! -- Best Regards, Aleksei Zakharov