This series introduces x86 Cache Monitoring Technology (CMT) to libvirt
through the kernel resource control (resctrl) interface. CMT is an
Intel(R) x86 CPU feature that belongs to Resource Director Technology
(RDT). It reports the occupancy of the last level cache, which is shared
by all CPU cores.
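For orientation, the kernel exposes the monitoring information under the
resctrl filesystem in info/L3_MON (num_rmids, max_threshold_occupancy and
mon_features, the same file names used by the test data in this series).
The sketch below is a Python illustration only, not libvirt code: the
function name read_l3_mon_info is hypothetical, and the default mount
point /sys/fs/resctrl is an assumption about the host setup.

```python
import os


def read_l3_mon_info(resctrl_root="/sys/fs/resctrl"):
    """Read the L3 monitoring info files below a resctrl mount point.

    Returns None when info/L3_MON is absent (no monitoring support or
    resctrl not mounted). The function name and the default mount point
    are assumptions made for this illustration.
    """
    mon_dir = os.path.join(resctrl_root, "info", "L3_MON")
    if not os.path.isdir(mon_dir):
        return None

    def read(name):
        # Each info file holds a short text value.
        with open(os.path.join(mon_dir, name)) as f:
            return f.read().strip()

    return {
        # Number of hardware RMIDs available for monitoring groups.
        "num_rmids": int(read("num_rmids")),
        # Occupancy threshold below which a freed RMID can be reused.
        "max_threshold_occupancy": int(read("max_threshold_occupancy")),
        # One supported monitoring feature per line.
        "mon_features": read("mon_features").split(),
    }


if __name__ == "__main__":
    print(read_l3_mon_info())
```

Passing a different root (for example a copy of a test data directory)
lets the same function run on a machine without RDT hardware.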

In the v1 series we set out to introduce CMT for libvirt, including
reporting the host capability and creating CMT groups. Reporting the host
capability is a well self-contained step, so this series covers only that
step. As an extension of v1, the MBM capability is also introduced.
Creating CMT groups is not covered by these patches and will follow in
subsequent patches.

We have had several discussions about enabling CMT; please refer to the
following links for the RFCs.

RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html

1. Why is CMT necessary for libvirt?

The perf events 'CMT, MBML, MBMT' have been phased out since Linux kernel
commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the perf based cmt
and mbm support in libvirt will not work with the latest Linux kernel.
These patches add the CMT feature to libvirt through the kernel resctrlfs
interface.

2. Interfaces for CMT from the high level.

CMT, CAT, MBM and MBA are orthogonal features; each can work
independently.

If 'CMT' is enabled on the host, a 'cache monitor' is introduced for the
cache, whose role is to monitor the last level cache utilization of the
target system process. The cache monitor capabilities are shown under the
<cache> element.

For 'MBM', a monitor named memory bandwidth monitor is introduced, whose
role is to monitor memory bandwidth utilization. Its capability
information block is located under the <memory_bandwidth> element.

2.1 Query the host capability of CMT.

The 'monitor' element represents the host capabilities of CMT. The
attributes involved are:

- 'maxMonitors': the maximum number of monitoring groups that can be
  created, which is limited by the number of hardware 'RMID's.
- 'reuseThreshold': an adjustable value that affects when the resources
  used by a monitor are finally reused. After a monitor is removed, the
  kernel may not immediately release all the hardware resources that the
  monitor used if the cache occupancy value associated with the 'removed'
  monitor is above this threshold. Once the cache occupancy falls below
  this threshold, the underlying hardware resource is reclaimed and put
  back into the resource pool for reuse.
- 'llc_occupancy': a feature of CMT, reporting the last level cache
  occupancy information.
- 'mbm_total_bytes': a feature of MBM, reporting the total memory
  bandwidth utilization, in bytes, including local memory and, on
  multi-node systems, remote memory.
- 'mbm_local_bytes': a feature of MBM, reporting only the local memory
  bandwidth utilization.

 # virsh capabilities
 ...
     <cache>
       <bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'>
         <control granularity='768' min='1536' unit='KiB' type='both' maxAllocs='4'/>
       </bank>
       <bank id='1' level='3' type='both' size='15' unit='MiB' cpus='6-11'>
         <control granularity='768' min='1536' unit='KiB' type='both' maxAllocs='4'/>
       </bank>
+      <monitor level='3' reuseThreshold='270336' maxMonitors='176'>
+        <feature name='llc_occupancy'/>
+      </monitor>
     </cache>
     <memory_bandwidth>
       <node id='0' cpus='0-5'>
         <control granularity='10' min='10' maxAllocs='4'/>
       </node>
       <node id='1' cpus='6-11'>
         <control granularity='10' min='10' maxAllocs='4'/>
       </node>
+      <monitor maxMonitors='176'>
+        <feature name='mbm_total_bytes'/>
+        <feature name='mbm_local_bytes'/>
+      </monitor>
     </memory_bandwidth>
 ...
   </host>

Changes since v2:
- Addressed John Ferlan's review.
- Typo fixed.
- Removed VIR_ENUM_DECL(virMonitor);

Changes since v1:
- Introduced MBM capability.
- Capability layout changed
    * Moved <monitor> from cache <bank> to <cache>
    * Renamed <Threshold> to <reuseThreshold>
- Documentation for 'reuseThreshold' changed.
- Introduced API virResctrlInfoGetMonitorPrefix
- Added more tests, covering standalone CMT and a fake new feature.
- Creating the CMT resource control group will be a subsequent job.
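
As an illustration of how a management application might consume the new
<monitor> elements from the 'virsh capabilities' output shown earlier,
here is a minimal Python sketch. It is not part of this series: the
function name parse_monitors is hypothetical, and the sample document is
trimmed from the output above, so it only covers the attributes listed in
this cover letter.

```python
import xml.etree.ElementTree as ET

# Trimmed from the capabilities output quoted in this cover letter.
SAMPLE = """
<host>
  <cache>
    <bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'/>
    <monitor level='3' reuseThreshold='270336' maxMonitors='176'>
      <feature name='llc_occupancy'/>
    </monitor>
  </cache>
  <memory_bandwidth>
    <node id='0' cpus='0-5'/>
    <monitor maxMonitors='176'>
      <feature name='mbm_total_bytes'/>
      <feature name='mbm_local_bytes'/>
    </monitor>
  </memory_bandwidth>
</host>
"""


def parse_monitors(caps_xml):
    """Collect the <monitor> capability under <cache> and <memory_bandwidth>.

    Hypothetical helper: returns a dict keyed by the parent element name.
    A parent without a <monitor> child is simply left out.
    """
    root = ET.fromstring(caps_xml)
    result = {}
    for parent in ("cache", "memory_bandwidth"):
        mon = root.find(f".//{parent}/monitor")
        if mon is None:
            continue
        info = {
            "maxMonitors": int(mon.get("maxMonitors")),
            "features": [f.get("name") for f in mon.findall("feature")],
        }
        # 'level' and 'reuseThreshold' appear only on the cache monitor
        # in the example output, so treat them as optional.
        for attr in ("level", "reuseThreshold"):
            if mon.get(attr) is not None:
                info[attr] = int(mon.get(attr))
        result[parent] = info
    return result


if __name__ == "__main__":
    for name, info in parse_monitors(SAMPLE).items():
        print(name, info)
```

The lookup uses a descendant search so the same helper works whether it
is given the <host> fragment or a full capabilities document.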

Wang Huaqiang (4):
  util: Introduce monitor capability interface
  conf: Refactor cache bank capability structure
  conf: Refactor memory bandwidth capability structure
  conf: Introduce RDT monitor host capability

 docs/schemas/capability.rng                        |  37 +++-
 src/conf/capabilities.c                            | 122 ++++++++---
 src/conf/capabilities.h                            |  24 ++-
 src/libvirt_private.syms                           |   2 +
 src/util/virresctrl.c                              | 236 +++++++++++++++++++++
 src/util/virresctrl.h                              |  40 ++++
 .../resctrl/info/L3_MON/max_threshold_occupancy    |   1 +
 .../resctrl/info/L3_MON/mon_features               |   1 +
 .../resctrl/info/L3_MON/num_rmids                  |   1 +
 .../linux-resctrl-cmt/resctrl/manualres/cpus       |   1 +
 .../linux-resctrl-cmt/resctrl/manualres/schemata   |   1 +
 .../linux-resctrl-cmt/resctrl/manualres/tasks      |   0
 .../linux-resctrl-cmt/resctrl/schemata             |   1 +
 tests/vircaps2xmldata/linux-resctrl-cmt/system     |   1 +
 .../resctrl/info/L3/cbm_mask                       |   1 +
 .../resctrl/info/L3/min_cbm_bits                   |   1 +
 .../resctrl/info/L3/num_closids                    |   1 +
 .../resctrl/info/L3_MON/max_threshold_occupancy    |   1 +
 .../resctrl/info/L3_MON/mon_features               |  10 +
 .../resctrl/info/L3_MON/num_rmids                  |   1 +
 .../resctrl/info/MB/bandwidth_gran                 |   1 +
 .../resctrl/info/MB/min_bandwidth                  |   1 +
 .../resctrl/info/MB/num_closids                    |   1 +
 .../resctrl/manualres/cpus                         |   1 +
 .../resctrl/manualres/schemata                     |   1 +
 .../resctrl/manualres/tasks                        |   0
 .../linux-resctrl-fake-feature/resctrl/schemata    |   1 +
 .../linux-resctrl-fake-feature/system              |   1 +
 .../resctrl/info/L3_MON/max_threshold_occupancy    |   1 +
 .../linux-resctrl/resctrl/info/L3_MON/mon_features |   3 +
 .../linux-resctrl/resctrl/info/L3_MON/num_rmids    |   1 +
 .../vircaps2xmldata/vircaps-x86_64-resctrl-cmt.xml |  53 +++++
 .../vircaps-x86_64-resctrl-fake-feature.xml        |  73 +++++++
 tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml   |   7 +
 tests/vircaps2xmltest.c                            |   2 +
 35 files changed, 594 insertions(+), 36 deletions(-)
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/info/L3_MON/max_threshold_occupancy
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/info/L3_MON/mon_features
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/info/L3_MON/num_rmids
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/manualres/cpus
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/manualres/schemata
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/manualres/tasks
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-cmt/resctrl/schemata
 create mode 120000 tests/vircaps2xmldata/linux-resctrl-cmt/system
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/L3/cbm_mask
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/L3/min_cbm_bits
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/L3/num_closids
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/L3_MON/max_threshold_occupancy
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/L3_MON/mon_features
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/L3_MON/num_rmids
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/MB/bandwidth_gran
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/MB/min_bandwidth
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/info/MB/num_closids
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/manualres/cpus
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/manualres/schemata
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/manualres/tasks
 create mode 100644 tests/vircaps2xmldata/linux-resctrl-fake-feature/resctrl/schemata
 create mode 120000 tests/vircaps2xmldata/linux-resctrl-fake-feature/system
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/max_threshold_occupancy
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/mon_features
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/num_rmids
 create mode 100644 tests/vircaps2xmldata/vircaps-x86_64-resctrl-cmt.xml
 create mode 100644 tests/vircaps2xmldata/vircaps-x86_64-resctrl-fake-feature.xml

-- 
2.7.4