Roman Mohr has posted comments on this change.

Change subject: host stats: Collect stats from online cpu cores only
......................................................................


Patch Set 4:

(2 comments)

https://gerrit.ovirt.org/#/c/46269/4//COMMIT_MSG
Commit Message:

Line 12: When vdsm is already running, it is enough to do something like
Line 13: 
Line 14:   echo 0 > /sys/devices/system/cpu/cpu2/online
Line 15: 
Line 16: to break getVdsStats.
> To make your change more clear, please show a fraction of the stats vdsm re
For now I have added this to https://bugzilla.redhat.com/show_bug.cgi?id=1264003.
Explaining why things behave the way they do turned out to be more complicated.
Line 17: 
Line 18: Change-Id: Ia9c247f9138e02a9230a0849a04cb2e1705e7fac


https://gerrit.ovirt.org/#/c/46269/4/vdsm/virt/sampling.py
File vdsm/virt/sampling.py:

Line 692:         cpu_cores = numa_node['cpus']
Line 693:         for cpu_core in cpu_cores:
Line 694:             # Do not try to collect cpu core data when no samples are present
Line 695:             if (not last_sample.cpuCores.getCoreSample(cpu_core) or not
Line 696:                     first_sample.cpuCores.getCoreSample(cpu_core)):
> By returning nothing, you mean there is no entry for this cpu in the return
It turned out that it depends on libvirtd whether we see the exception or
simply get no report for the offline cpu.

When there are three cpus (0,1,2) and cpu 1 is offline when libvirtd is 
started, we only show sampling data for cpu 0 and 2. When the cpu is disabled 
while libvirtd is running, we run into the exception.

So maybe we should fix not only compute_cpu_usage but also getNumaTopology(),
so that it always behaves the same way, no matter when the cpu went offline.

Or we can leave getNumaTopology() as it is and use _get_online_cpus instead,
which really just reports online cpus.
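
To illustrate the second option, here is a minimal sketch of filtering the
per-node cpu lists against the set of online cpus. It assumes the online cpus
come from a kernel cpulist string such as the content of
/sys/devices/system/cpu/online (for example "0,2"). The helper names
parse_cpu_list and online_cpus_per_node and the shape of the topology dict are
hypothetical, they are not vdsm's actual _get_online_cpus or getNumaTopology().

```python
def parse_cpu_list(text):
    """Parse a kernel cpulist string such as '0,2-3' into a sorted list."""
    cpus = set()
    for part in text.strip().split(','):
        if not part:
            continue
        if '-' in part:
            # A range like '2-3' includes both ends.
            first, last = part.split('-')
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def online_cpus_per_node(numa_topology, online_cpus):
    """Keep only the online cpus in each node's 'cpus' list.

    numa_topology is assumed (hypothetically) to look like
    {'0': {'cpus': [0, 1, 2]}}.
    """
    online = set(online_cpus)
    return dict(
        (node_index, [cpu for cpu in node['cpus'] if cpu in online])
        for node_index, node in numa_topology.items()
    )
```

With cpus 0,1,2 on node 0 and cpu 1 offline, online_cpus_per_node would only
keep cpus 0 and 2, no matter when the cpu went offline.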

I still don't know whether we should report nothing at all for an offline cpu.
But if we skip the entry, I think we should also add cpuCores, cpuThreads and
cpuSockets along with onlineCpus to getVdsStats, because they also change when
a cpu goes offline.

The exact behaviour is documented in a document attached to the bug report.
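
As a rough sketch of the "skip the entry" behaviour discussed here, the
following simplified stand-in drops every core that lacks a sample at either
end of the interval. The samples are modelled as plain dicts keyed by cpu
index, and the function name collect_core_stats and the 'userDelta' key are
hypothetical. The real code uses getCoreSample() and compute_cpu_usage().

```python
def collect_core_stats(cpu_cores, first_sample, last_sample):
    """Return per-core counter deltas, skipping cores without samples.

    first_sample and last_sample are hypothetical dicts mapping a cpu
    index to its counters, e.g. {0: {'user': 100}}. A core that was
    offline at sampling time is simply missing from the dict.
    """
    core_stats = {}
    for cpu_core in cpu_cores:
        first = first_sample.get(cpu_core)
        last = last_sample.get(cpu_core)
        # Do not report a core when either sample is missing.
        if not first or not last:
            continue
        core_stats[str(cpu_core)] = {
            'userDelta': last['user'] - first['user'],
        }
    return core_stats
```

If cpu 1 went offline between the two samples, the result only has entries
for cpus 0 and 2, so a consumer cannot tell a skipped core from a missing one
unless the online cpus are reported separately.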
Line 697:                 continue
Line 698:             core_stat = {
Line 699:                 'nodeIndex': int(node_index),
Line 700:                 'cpuUser': compute_cpu_usage(cpu_core, 'user'),


-- 
To view, visit https://gerrit.ovirt.org/46269
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9c247f9138e02a9230a0849a04cb2e1705e7fac
Gerrit-PatchSet: 4
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Roman Mohr <[email protected]>
Gerrit-Reviewer: Francesco Romani <[email protected]>
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Nir Soffer <[email protected]>
Gerrit-Reviewer: Omer Frenkel <[email protected]>
Gerrit-Reviewer: Roman Mohr <[email protected]>
Gerrit-Reviewer: Roy Golan <[email protected]>
Gerrit-Reviewer: [email protected]
Gerrit-HasComments: Yes
_______________________________________________
vdsm-patches mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
