Roman Mohr has posted comments on this change.

Change subject: host stats: Collect stats from online cpu cores only
......................................................................

Patch Set 4:

(2 comments)

https://gerrit.ovirt.org/#/c/46269/4//COMMIT_MSG
Commit Message:

Line 12: When vdsm is already running, it is enough to do something like
Line 13:
Line 14:     echo 0 > /sys/devices/system/cpu/cpu2/online
Line 15:
Line 16: to break getVdsStats.
> To make your change more clear, please show a fraction of the stats vdsm re

For now I have added this to https://bugzilla.redhat.com/show_bug.cgi?id=1264003. It turns out that the reasons for the current behaviour are more complicated.

Line 17:
Line 18: Change-Id: Ia9c247f9138e02a9230a0849a04cb2e1705e7fac

https://gerrit.ovirt.org/#/c/46269/4/vdsm/virt/sampling.py
File vdsm/virt/sampling.py:

Line 692: cpu_cores = numa_node['cpus']
Line 693: for cpu_core in cpu_cores:
Line 694:     # Do not try to collect cpu core data when no samples are present
Line 695:     if (not last_sample.cpuCores.getCoreSample(cpu_core) or not
Line 696:             first_sample.cpuCores.getCoreSample(cpu_core)):
> By returning nothing, you mean there is no entry for this cpu in the return

It turned out that it depends on libvirtd whether we see the exception or simply get no report for the offline cpu. When there are three cpus (0, 1, 2) and cpu 1 is offline when libvirtd is started, we only show sampling data for cpus 0 and 2. When the cpu is disabled while libvirtd is running, we run into the exception.

So maybe we should fix not only compute_cpu_usage but also getNumaTopology(), so that it always behaves the same way, no matter when the cpu went offline. Or we can leave getNumaTopology() as it is and use _get_online_cpus instead, which really reports only online cpus. But I still don't know if we should report nothing at all.
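
To illustrate the second option (report only cores that are online and have samples), here is a minimal sketch, not the code in the patch. It assumes the kernel's cpu list format as found in /sys/devices/system/cpu/online (for example "0,2" or "0-3"). The names parse_cpu_list and cores_to_report are hypothetical helpers, not vdsm functions, and the samples are modelled as plain dicts keyed by cpu index instead of the real cpuCores objects.

```python
def parse_cpu_list(text):
    """Parse a kernel cpu list such as "0,2-3" into a sorted list of ints.

    Hypothetical helper; it only assumes the comma/range syntax used by
    /sys/devices/system/cpu/online.
    """
    cpus = set()
    for part in text.strip().split(','):
        if not part:
            continue  # empty input or trailing comma
        if '-' in part:
            first, last = part.split('-')
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cores_to_report(numa_cpus, online_cpus, first_sample, last_sample):
    """Return the cores of a NUMA node that are safe to compute stats for.

    A core is kept only if it is online and present in both samples, so
    the result does not depend on when the core went offline.
    first_sample/last_sample are dicts (an assumption for this sketch).
    """
    online = set(online_cpus)
    return [cpu for cpu in numa_cpus
            if cpu in online
            and first_sample.get(cpu)
            and last_sample.get(cpu)]


# Three cpus (0, 1, 2); cpu 1 was taken offline between the two samples.
first = {0: {'user': 10}, 1: {'user': 12}, 2: {'user': 11}}
last = {0: {'user': 20}, 2: {'user': 19}}
print(cores_to_report([0, 1, 2], parse_cpu_list("0,2"), first, last))
# prints [0, 2]
```

With this kind of filter the offline core is simply missing from the per-core stats, which is the "report nothing" variant discussed here.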

But I think that when we skip the entry, we should also add cpuCores, cpuThreads and cpuSockets, along with onlineCpus, to getVdsStats, because they also change when the cpu goes offline. The exact behaviour is documented in a document attached to the bug report.

Line 697:         continue
Line 698:     core_stat = {
Line 699:         'nodeIndex': int(node_index),
Line 700:         'cpuUser': compute_cpu_usage(cpu_core, 'user'),

-- 
To view, visit https://gerrit.ovirt.org/46269
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9c247f9138e02a9230a0849a04cb2e1705e7fac
Gerrit-PatchSet: 4
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Roman Mohr <[email protected]>
Gerrit-Reviewer: Francesco Romani <[email protected]>
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Nir Soffer <[email protected]>
Gerrit-Reviewer: Omer Frenkel <[email protected]>
Gerrit-Reviewer: Roman Mohr <[email protected]>
Gerrit-Reviewer: Roy Golan <[email protected]>
Gerrit-Reviewer: [email protected]
Gerrit-HasComments: Yes
_______________________________________________
vdsm-patches mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
