Hi Eric,

On 2/15/24 17:13, Eric Auger wrote:
Hi Shaoqin,

On 2/1/24 09:51, Shaoqin Huang wrote:
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example which shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
         -accel kvm \
         -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).

And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

    Performance counter stats for 'sleep 1':

               1.22 msec task-clock                       #    0.001 CPUs 
utilized
                  1      context-switches                 #  820.695 /sec
                  0      cpu-migrations                   #    0.000 /sec
                 55      page-faults                      #   45.138 K/sec
    <not supported>      cycles
            1128954      instructions
             227031      branches                         #  186.323 M/sec
               8686      branch-misses                    #    3.83% of all 
branches

        1.002492480 seconds time elapsed

        0.001752000 seconds user
        0.000000000 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events do still work.

Reviewed-by: Sebastian Ott <seb...@redhat.com>
Signed-off-by: Shaoqin Huang <shahu...@redhat.com>
---
v5->v6:
   - Commit message improvement.
   - Remove some unused code.
   - Collect Reviewed-by, thanks Sebastian.
   - Use g_auto(Gstrv) to replace the gchar **.          [Eric]

v4->v5:
   - Change the kvm-pmu-filter as a -cpu sub-option.     [Eric]
   - Comment tweak.                                      [Gavin]
   - Rebase to the latest branch.

v3->v4:
   - Fix the wrong check for pmu_filter_init.            [Sebastian]
   - Fix multiple alignment issue.                       [Gavin]
   - Report error by warn_report() instead of error_report(), and don't use
   abort() since the PMU Event Filter is an add-on and best-effort feature.
                                                         [Gavin]
   - Add several missing {  } for single line of code.   [Gavin]
   - Use the g_strsplit() to replace strtok().           [Gavin]

v2->v3:
   - Improve commits message, use kernel doc wording, add more explaination on
     filter example, fix some typo error.                [Eric]
   - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
   - Add more precise error message report.              [Eric]
   - In options doc, add pmu-filter rely on KVM_ARM_VCPU_PMU_V3_FILTER support 
in
     KVM.                                                [Eric]

v1->v2:
   - Add more description for allow and deny meaning in
     commit message.                                     [Sebastian]
   - Small improvement.                                  [Sebastian]

  docs/system/arm/cpu-features.rst | 23 ++++++++++
  target/arm/cpu.h                 |  3 ++
  target/arm/kvm.c                 | 76 ++++++++++++++++++++++++++++++++
  3 files changed, 102 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..26e306cc83 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
    the guest scheduler behavior and/or be exposed to the guest
    userspace.
+``kvm-pmu-filter``
+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format now. For example:
nit: I would remove " now"

Will remove it.

+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a are
+  also allowed except the event 0x30 which is denied, and all the other
+  events are denied.
+
  TCG VCPU Features
  =================
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index d3477b1601..2d860c227d 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -948,6 +948,9 @@ struct ArchCPU {
/* KVM steal time */
      OnOffAuto kvm_steal_time;
+
+    /* KVM PMU Filter */
+    char *kvm_pmu_filter;
  #endif /* CONFIG_KVM */
/* Uniprocessor system with MP extensions */
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 81813030a5..3f16f96f7e 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -496,6 +496,22 @@ static void kvm_steal_time_set(Object *obj, bool value, 
Error **errp)
      ARM_CPU(obj)->kvm_steal_time = value ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
  }
+static char *kvm_pmu_filter_get(Object *obj, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    return g_strdup(cpu->kvm_pmu_filter);
+}
+
+static void kvm_pmu_filter_set(Object *obj, const char *pmu_filter,
+                               Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    g_free(cpu->kvm_pmu_filter);
+    cpu->kvm_pmu_filter = g_strdup(pmu_filter);
+}
+
  /* KVM VCPU properties should be prefixed with "kvm-". */
  void kvm_arm_add_vcpu_properties(ARMCPU *cpu)
  {
@@ -517,6 +533,12 @@ void kvm_arm_add_vcpu_properties(ARMCPU *cpu)
                               kvm_steal_time_set);
      object_property_set_description(obj, "kvm-steal-time",
                                      "Set off to disable KVM steal time.");
+
+    object_property_add_str(obj, "kvm-pmu-filter", kvm_pmu_filter_get,
+                            kvm_pmu_filter_set);
+    object_property_set_description(obj, "kvm-pmu-filter",
+                                    "PMU Event Filtering description for "
+                                    "guest PMU. (default: NULL, disabled)");
  }
bool kvm_arm_pmu_supported(void)
@@ -1706,6 +1728,58 @@ static bool kvm_arm_set_device_attr(ARMCPU *cpu, struct 
kvm_device_attr *attr,
      return true;
  }
+static void kvm_arm_pmu_filter_init(ARMCPU *cpu)
+{
+    static bool pmu_filter_init;
+    struct kvm_pmu_event_filter filter;
+    struct kvm_device_attr attr = {
+        .group      = KVM_ARM_VCPU_PMU_V3_CTRL,
+        .attr       = KVM_ARM_VCPU_PMU_V3_FILTER,
+        .addr       = (uint64_t)&filter,
+    };
+    int i;
+    g_auto(GStrv) event_filters;
+
+    if (!cpu->kvm_pmu_filter) {
+        return;
+    }
+    if (kvm_vcpu_ioctl(CPU(cpu), KVM_HAS_DEVICE_ATTR, &attr)) {
+        warn_report("The KVM doesn't support the PMU Event Filter!");
+        return;
+    }
+
+    /*
+     * The filter only needs to be initialized through one vcpu ioctl and it
+     * will affect all other vcpu in the vm.
+     */
+    if (pmu_filter_init) {
+        return;
+    } else {
+        pmu_filter_init = true;
+    }
+
+    event_filters = g_strsplit(cpu->kvm_pmu_filter, ";", -1);
+    for (i = 0; event_filters[i]; i++) {
+        unsigned short start = 0, end = 0;
+        char act;
+
+        sscanf(event_filters[i], "%c:%hx-%hx", &act, &start, &end);
I would check the returned value of sscanf. In most of the existing
calls in qemu it is done that way.

Yep. It's better to check the return value of sscanf.


"returns the number of fields that were successfully converted and
assigned. The return value does not include fields that were read but
not assigned."
+        if ((act != 'A' && act != 'D') || (!start && !end)) {
the 2nd check then could be replaced by start > end to check we have a
valid range.

Yes. If we check the return value of sscanf, we can replace the second check to start > end, and it will help to detect the overflow.

Thanks,
Shaoqin

+            warn_report("Skipping invalid PMU filter %s", event_filters[i]);
+            continue;
+        }
+
+        filter.base_event = start;
+        filter.nevents = end - start + 1;
the above check would help here I think. Also handle the case where it
could overflow?
+        filter.action = (act == 'A') ? KVM_PMU_EVENT_ALLOW :
+                                       KVM_PMU_EVENT_DENY;
+
+        if (!kvm_arm_set_device_attr(cpu, &attr, "PMU_V3_FILTER")) {
+            break;
+        }
+    }
+}
+
  void kvm_arm_pmu_init(ARMCPU *cpu)
  {
      struct kvm_device_attr attr = {
@@ -1716,6 +1790,8 @@ void kvm_arm_pmu_init(ARMCPU *cpu)
      if (!cpu->has_pmu) {
          return;
      }
+
+    kvm_arm_pmu_filter_init(cpu);
      if (!kvm_arm_set_device_attr(cpu, &attr, "PMU")) {
          error_report("failed to init PMU");
          abort();

base-commit: bd2e12310b18b51aefbf834e6d54989fd175976f
Thanks

Eric


--
Shaoqin


Reply via email to