Re: [PATCH v9] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-05-27 Thread Shaoqin Huang

Hi Zhao,

Thanks for your proposal. If you are willing to take over the PMU filter
enabling work, please go ahead. I won't update this series anymore because
of the QAPI restriction. I would really appreciate it if you could
implement that.


Thanks,
Shaoqin

On 5/13/24 14:52, Zhao Liu wrote:

Hi Daniel,


Please describe it in terms of a QAPI definition, as that's what we're
striving for with all QEMU public interfaces. Once the QAPI design is
agreed, then the -object mapping is trivial, as -object's JSON format
supports arbitrary QAPI structures.


Thank you for your guidance!

I rethought and modified my previous proposal:

Let me show the command examples first:
   * Add a single event:
 (x86) -object kvm-pmu-event,id=e0,action=allow,format=x86-default,\
   select=0x3c,umask=0x00
 (arm or general) -object kvm-pmu-event,id=e1,action=deny,\
  format=raw,code=0x01
  
   * Add a counter bitmap:

 (x86) -object kvm-pmu-counter,id=cnt,action=allow,type=x86-fixed,\
   bitmap=0x
  
   * Add an event list (must use JSON syntax):

(x86) -object 
'{"qom-type":"kvm-pmu-event-list","id":"filter0","action":"allow","format":"x86-default","events":[{"select":0x3c,"umask":0x00},{"select":0x2e,"umask":0x4f}]}'
(arm) -object 
'{"qom-type":"kvm-pmu-event-list","id":"filter1","action":"allow","format":"raw","events":[{"code":0x01},{"code":0x02}]}'
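For illustration only, the JSON form of the event-list object could be generated by a small script instead of being typed by hand. This is a minimal sketch under stated assumptions: the helper name event_list_object is hypothetical, the property names simply follow the proposal in this mail (they are not an existing QEMU interface), and the event values are emitted as decimal integers because strict JSON has no hexadecimal literals.

```python
import json


def event_list_object(obj_id, action, fmt, events):
    """Return the JSON text for a proposed kvm-pmu-event-list object.

    Hypothetical helper: the property names mirror the proposal in this
    thread, not an interface that exists in QEMU today.
    """
    return json.dumps(
        {
            "qom-type": "kvm-pmu-event-list",
            "id": obj_id,
            "action": action,
            "format": fmt,
            "events": events,
        },
        separators=(",", ":"),
    )


# Values are written in hex here for readability; json.dumps emits decimal.
x86_arg = event_list_object(
    "filter0", "allow", "x86-default",
    [{"select": 0x3C, "umask": 0x00}, {"select": 0x2E, "umask": 0x4F}],
)
arm_arg = event_list_object(
    "filter1", "allow", "raw", [{"code": 0x01}, {"code": 0x02}]
)

print("-object '%s'" % x86_arg)
print("-object '%s'" % arm_arg)
```

Generating the argument this way keeps the quoting and the key/value separators consistent, which is easy to get wrong when writing the JSON inline on a command line.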


The specific JSON definitions are as follows (IIUC, this is "in terms of
a QAPI definition", right? ;-)):
* Define PMU event and counter bitmap with JSON format:
   - basic filter action:

   { 'enum': 'KVMPMUFilterAction',
 'prefix': 'KVM_PMU_FILTER_ACTION',
 'data': ['deny', 'allow' ] }

   - PMU counter:

   { 'enum': 'KVMPMUCounterType',
 'prefix': 'KVM_PMU_COUNTER_TYPE',
 'data': [ 'x86-fixed' ] }

   { 'struct': 'KVMPMUX86FixedCounter',
 'data': { 'bitmap': 'uint32' } }

   - PMU events (total 3 formats):

   # 3 encoding formats: "raw" is compatible with Shaoqin's ARM format as
   # well as the x86 raw format, and could support other architectures in
   # the future.
   { 'enum': 'KVMPMUEventEncodeFmt',
 'prefix': 'KVM_PMU_EVENT_ENCODE_FMT',
 'data': ['raw', 'x86-default', 'x86-masked-entry' ] }

   # A general format.
   { 'struct': 'KVMPMURawEvent',
 'data': { 'code': 'uint64' } }

   # x86-specific
   { 'struct': 'KVMPMUX86DefaultEvent',
 'data': { 'select': 'uint16',
   'umask': 'uint16' } }

   # another x86 specific
   { 'struct': 'KVMPMUX86MaskedEntry',
 'data': { 'select': 'uint16',
   'match': 'uint8',
   'mask': 'uint8',
   'exclude': 'bool' } }

   # And their list wrappers:
   { 'struct': 'KVMPMURawEventList',
 'data': { 'events': ['KVMPMURawEvent'] } }

   { 'struct': 'KVMPMUX86DefaultEventList',
 'data': { 'events': ['KVMPMUX86DefaultEvent'] } }

   { 'struct': 'KVMPMUX86MaskedEntryList',
 'data': { 'events': ['KVMPMUX86MaskedEntry'] } }


Based on the above basic structs, we could provide 3 more new qom-types:
   - 'kvm-pmu-counter': 'KVMPMUFilterCounter'

   # This is a single object option to configure PMU counter
   # bitmap filter.
   { 'union': 'KVMPMUFilterCounter',
 'base': { 'action': 'KVMPMUFilterAction',
   'type': 'KVMPMUCounterType' },
 'discriminator': 'type',
 'data': { 'x86-fixed': 'KVMPMUX86FixedCounter' } }


   - 'kvm-pmu-event': 'KVMPMUFilterEvent'

   # This option is used to configure a single PMU event for
   # PMU filter.
   { 'union': 'KVMPMUFilterEvent',
 'base': { 'action': 'KVMPMUFilterAction',
   'format': 'KVMPMUEventEncodeFmt' },
 'discriminator': 'format',
 'data': { 'raw': 'KVMPMURawEvent',
   'x86-default': 'KVMPMUX86DefaultEvent',
   'x86-masked-entry': 'KVMPMUX86MaskedEntry' } }


   - 'kvm-pmu-event-list': 'KVMPMUFilterEventList'

   # Used to configure multiple events.
   { 'union': 'KVMPMUFilterEventList',
 'base': { 'action': 'KVMPMUFilterAction',
   'format': 'KVMPMUEventEncodeFmt' },
 'discriminator': 'format',
 'data': { 'raw': 'KVMPMURawEventList',
   'x86-default': 'KVMPMUX86DefaultEventList',
   'x86-masked-entry': 'KVMPMUX86MaskedEntryList' } }


Compared to Shaoqin's original format, kvm-pmu-event-list cannot express
a contiguous range of events (such as the earlier 0x00-0x30), so the
user must now list the events one by one.
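To make the cost of losing range syntax concrete, a range from the original format can be expanded mechanically into per-event entries. The sketch below is an assumption about how a management script might do that; expand_range is a hypothetical name, and the "code" key is taken from the "raw" event format proposed above.

```python
def expand_range(start, end):
    """Expand an inclusive event range into raw-format event entries."""
    if start > end:
        raise ValueError("start must not exceed end")
    return [{"code": code} for code in range(start, end + 1)]


# The range 0x00-0x30 from the original syntax becomes 49 separate entries.
events = expand_range(0x00, 0x30)
print(len(events))
print(events[0], events[-1])
```

The resulting list could be passed as the "events" member of a kvm-pmu-event-list object, at the price of a much longer command line.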

What do you think about the above 3 new commands?

Thanks and Best Regards,
Zhao



--
Shaoqin




Re: [PATCH v9] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-05-07 Thread Shaoqin Huang

Hi Daniel,

On 4/16/24 01:29, Daniel P. Berrangé wrote:

On Mon, Apr 08, 2024 at 10:49:40PM -0400, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

   # qemu-system-aarch64 \
 -accel kvm \
 -cpu host,kvm-pmu-filter="D:0x11-0x11"


I'm still against implementing this one-off custom parsed syntax
for kvm-pmu-filter values. Once this syntax exists, we're locked
into back-compatibility for multiple releases, and it will make
a conversion to QAPI/JSON harder.


Thanks for taking the time to review my patch. Spending more time on the
QAPI work goes beyond what I originally set out to do and takes me away
from supporting the PMU filter itself.


So I have decided not to update this patch for now, and to wait until I
have time to look into QAPI or until the -cpu option has been converted
to QAPI format.


Thanks,
Shaoqin



With regards,
Daniel


--
Shaoqin




Re: [PATCH v9] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-04-09 Thread Shaoqin Huang

Hi Thomas,

On 4/9/24 13:33, Thomas Huth wrote:

+    assert_has_feature(qts, "host", "kvm-pmu-filter");


So you assert here that the feature is available ...


  assert_has_feature(qts, "host", "kvm-steal-time");
  assert_has_feature(qts, "host", "sve");
  resp = do_query_no_props(qts, "host");
+    kvm_supports_pmu_filter = resp_get_feature_str(resp, "kvm-pmu-filter");
  kvm_supports_steal_time = resp_get_feature(resp, "kvm-steal-time");
  kvm_supports_sve = resp_get_feature(resp, "sve");
  vls = resp_get_sve_vls(resp);
  qobject_unref(resp);
+    if (kvm_supports_pmu_filter) {

... why do you then need to check for its availability here again?
I either don't understand this part of the code, or you could drop the 
kvm_supports_pmu_filter variable and simply always execute the code below.


Thanks for your review. I did so because all the other features, like
"kvm-steal-time", check their availability again. I don't know the
original reason for that; I just followed the existing pattern.


Do you think we should delete all of these checks?

Thanks,
Shaoqin



  Thomas



--
Shaoqin




[PATCH v9] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-04-08 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

  # qemu-system-aarch64 \
        -accel kvm \
        -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).
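The policy rule can be modelled in a few lines. The following is only a sketch of the semantics as described in this commit message and in the documentation text of the patch, not the QEMU or KVM implementation: parse_filter and is_allowed are hypothetical names, and the assumption that later ranges override earlier ones when they overlap is inferred from the documented example.

```python
def parse_filter(spec):
    """Parse "A:0x11-0x11;D:0x30-0x30" into (action, start, end) tuples."""
    ranges = []
    for item in spec.split(";"):
        action, span = item.split(":")
        if action not in ("A", "D"):
            raise ValueError("unknown action: %r" % action)
        start, end = (int(x, 16) for x in span.split("-"))
        if start > end:
            raise ValueError("bad range: %r" % item)
        ranges.append((action, start, end))
    return ranges


def is_allowed(ranges, event):
    """Apply the ranges in order; the first action sets the default."""
    if not ranges:
        return True  # no filter: every event is exposed
    # First action DENY => global ALLOW; first action ALLOW => global DENY.
    allowed = ranges[0][0] == "D"
    for action, start, end in ranges:
        if start <= event <= end:
            allowed = action == "A"  # assumed: later ranges override earlier
    return allowed


deny_cycles = parse_filter("D:0x11-0x11")
print(is_allowed(deny_cycles, 0x11))  # False: CPU_CYCLES is filtered out
print(is_allowed(deny_cycles, 0x08))  # True: everything else stays visible
```

With an allow-first filter the default flips, so any event not covered by an explicit "A" range is hidden from the guest.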

Then, in the guest, use perf to count cycles:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

              1.22 msec task-clock                #    0.001 CPUs utilized
                 1      context-switches          #  820.695 /sec
                 0      cpu-migrations            #    0.000 /sec
                55      page-faults               #   45.138 K/sec
                        cycles
           1128954      instructions
            227031      branches                  #  186.323 M/sec
              8686      branch-misses             #    3.83% of all branches

       1.002492480 seconds time elapsed

       0.001752000 seconds user
       0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Signed-off-by: Shaoqin Huang 
---
v8->v9:
  - Replace warn_report() with error_setg() in some places.
  - Merge the check conditions to make the code cleaner.
  - Tried to use the QAPI format for the PMU filter property, but it could
    not be used since the -cpu option doesn't support JSON format yet.

v7->v8:
  - Add qtest for kvm-pmu-filter.
  - Do the kvm-pmu-filter syntax checking up-front in the kvm_pmu_filter_set()
    function and store the filter information there. kvm_pmu_filter_get()
    then reconstitutes it.

v6->v7:
  - Check return value of sscanf.
  - Improve the check condition.

v5->v6:
  - Commit message improvement.
  - Remove some unused code.
  - Collect Reviewed-by, thanks Sebastian.
  - Use g_auto(GStrv) to replace the gchar **.  [Eric]

v4->v5:
  - Change the kvm-pmu-filter as a -cpu sub-option.  [Eric]
  - Comment tweak.  [Gavin]
  - Rebase to the latest branch.

v3->v4:
  - Fix the wrong check for pmu_filter_init.  [Sebastian]
  - Fix multiple alignment issues.  [Gavin]
  - Report error by warn_report() instead of error_report(), and don't use
    abort() since the PMU Event Filter is an add-on and best-effort feature.
    [Gavin]
  - Add several missing {  } for single line of code.  [Gavin]
  - Use g_strsplit() to replace strtok().  [Gavin]

v2->v3:
  - Improve commit message, use kernel doc wording, add more explanation on
    filter example, fix some typos.  [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak.  [Eric]
  - Add more precise error message report.  [Eric]
  - In options doc, add that pmu-filter relies on KVM_ARM_VCPU_PMU_V3_FILTER
    support in KVM.  [Eric]

v1->v2:
  - Add more description for allow and deny meaning in
    commit message.  [Sebastian]
  - Small improvement.  [Sebastian]
---
 docs/system/arm/cpu-features.rst |  23 +++
 target/arm/arm-qmp-cmds.c        |   2 +-
 target/arm/cpu.h                 |   3 +
 target/arm/kvm.c                 | 112 +++
 tests/qtest/arm-cpu-features.c   |  51 ++
 5 files changed, 190 insertions(+), 1 deletion(-)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..f3930f34b3 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
   the guest scheduler behavior and/or be exposed to the guest
   userspace.
 
+``kvm-pmu-filter``
+  By default kvm-pmu-filter is disabled. This means that by default all PMU
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy (global ALLOW if the first action is DENY, global DENY

Re: [PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-04-08 Thread Shaoqin Huang

Hi Eric,

On 3/19/24 23:23, Eric Auger wrote:

+    if (kvm_supports_pmu_filter) {
+        assert_set_feature_str(qts, "host", "kvm-pmu-filter", "");
+        assert_set_feature_str(qts, "host", "kvm-pmu-filter",
+                               "A:0x11-0x11");
+        assert_set_feature_str(qts, "host", "kvm-pmu-filter",
+                               "D:0x11-0x11");
+        assert_set_feature_str(qts, "host", "kvm-pmu-filter",
+                               "A:0x11-0x11;A:0x12-0x20");
+        assert_set_feature_str(qts, "host", "kvm-pmu-filter",
+                               "D:0x11-0x11;A:0x12-0x20;D:0x12-0x15");

Just to double check: this sets the filter and checks that the filter is
applied, is that correct?
I see you set some ranges of events. Are you sure those events are
supported by the host PMU and won't cause a failure when setting the PMU
filter?


What I test here is whether the PMU filter parser that I wrote in the
kvm_pmu_filter_set/get functions is correct. I don't test anything on the
KVM side, such as whether a PMU event is supported by the host.
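A parser-only test of this kind can be pictured as a set/get round trip. The sketch below is a Python model, not the C code from the patch: filter_set and filter_get are hypothetical stand-ins for kvm_pmu_filter_set() and kvm_pmu_filter_get(), the regular expression encodes the documented "{A,D}:start-end" format with hexadecimal bounds, and accepting the empty string is an assumption based on the first qtest call quoted above.

```python
import re

# One "{A,D}:start-end" item with hexadecimal bounds.
ITEM_RE = re.compile(r"([AD]):(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)")


def filter_set(value):
    """Validate the whole string up front and return the parsed ranges."""
    if value == "":
        return []  # assumed: an empty string means no filter
    ranges = []
    for item in value.split(";"):
        match = ITEM_RE.fullmatch(item)
        if match is None:
            raise ValueError("invalid filter item: %r" % item)
        start = int(match.group(2), 16)
        end = int(match.group(3), 16)
        if start > end:
            raise ValueError("start exceeds end in %r" % item)
        ranges.append((match.group(1), start, end))
    return ranges


def filter_get(ranges):
    """Reconstitute the property string from the stored ranges."""
    return ";".join("%s:0x%x-0x%x" % r for r in ranges)


# Round trip the strings used by the qtest; no KVM access is involved.
for value in ("", "A:0x11-0x11", "D:0x11-0x11;A:0x12-0x20;D:0x12-0x15"):
    assert filter_get(filter_set(value)) == value
```

Nothing here consults the host PMU, which matches the scope described in the reply: only the syntax handling is exercised.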


Thanks,
Shaoqin



Thanks

Eric


--
Shaoqin




Re: [PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-04-08 Thread Shaoqin Huang

Hi Kevin,

On 4/2/24 21:01, Kevin Wolf wrote:

Maybe I'm wrong, so I want to double check: does the -cpu option
support JSON format nowadays?

As far as I can see, -cpu doesn't support JSON yet. But even if it did,
your command line would be invalid because the 'host,' part isn't JSON.



Thanks for answering my question. I guess I should keep the current
implementation for now and convert the property in the future, when
the -cpu option supports JSON format.


Thanks,
Shaoqin


If the -cpu option doesn't support JSON format, how can I use QAPI
for the kvm-pmu-filter property?

This would probably mean QAPIfying all CPUs first, which sounds like a
major effort.


--
Shaoqin




Re: [PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-03-28 Thread Shaoqin Huang

Hi Daniel,

On 3/25/24 16:55, Daniel P. Berrangé wrote:

On Mon, Mar 25, 2024 at 01:35:58PM +0800, Shaoqin Huang wrote:

Hi Daniel,

Thanks for your review. I saw your comments on v7.

I have some questions about what you said about QAPI. Do you want me to
convert the current design to QAPI parsing, like
IOThreadVirtQueueMapping? And do we need to add a new JSON definition in
the qapi/ directory?


I have defined the QAPI for kvm-pmu-filter like below:

+##
+# @FilterAction:
+#
+# The Filter Action
+#
+# @a: Allow
+#
+# @d: Disallow
+#
+# Since: 9.0
+##
+{ 'enum': 'FilterAction',
+  'data': [ 'a', 'd' ] }
+
+##
+# @SingleFilter:
+#
+# Lazy
+#
+# @action: the action
+#
+# @start: the start
+#
+# @end: the end
+#
+# Since: 9.0
+##
+
+{ 'struct': 'SingleFilter',
+ 'data': { 'action': 'FilterAction', 'start': 'int', 'end': 'int' } }
+
+##
+# @KVMPMUFilter:
+#
+# Lazy
+#
+# @filter: the filter
+#
+# Since: 9.0
+##
+
+{ 'struct': 'KVMPMUFilter',
+  'data': { 'filter': ['SingleFilter'] }}

And I guess I can use it by adding code like below:

--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -1206,3 +1206,35 @@ const PropertyInfo qdev_prop_iothread_vq_mapping_list = {
     .set = set_iothread_vq_mapping_list,
     .release = release_iothread_vq_mapping_list,
 };
+
+/* --- kvm-pmu-filter ---*/
+
+static void get_kvm_pmu_filter(Object *obj, Visitor *v,
+                               const char *name, void *opaque, Error **errp)
+{
+    KVMPMUFilter **prop_ptr = object_field_prop_ptr(obj, opaque);
+
+    visit_type_KVMPMUFilter(v, name, prop_ptr, errp);
+}
+
+static void set_kvm_pmu_filter(Object *obj, Visitor *v,
+                               const char *name, void *opaque, Error **errp)
+{
+    KVMPMUFilter **prop_ptr = object_field_prop_ptr(obj, opaque);
+    KVMPMUFilter *list;
+
+    printf("running the %s\n", __func__);
+    if (!visit_type_KVMPMUFilter(v, name, &list, errp)) {
+        return;
+    }
+
+    printf("The name is %s\n", name);
+    *prop_ptr = list;
+}
+
+const PropertyInfo qdev_prop_kvm_pmu_filter = {
+    .name = "KVMPMUFilter",
+    .description = "der der",
+    .get = get_kvm_pmu_filter,
+    .set = set_kvm_pmu_filter,
+};

+#define DEFINE_PROP_KVM_PMU_FILTER(_name, _state, _field) \
+DEFINE_PROP(_name, _state, _field, qdev_prop_kvm_pmu_filter, \
+KVMPMUFilter *)

--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2439,6 +2441,7 @@ static Property arm_cpu_properties[] = {
 mp_affinity, ARM64_AFFINITY_INVALID),
 DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
+DEFINE_PROP_KVM_PMU_FILTER("kvm-pmu-filter", ARMCPU, kvm_pmu_filter),
 DEFINE_PROP_END_OF_LIST()
 };

And I guess I can use the new json format input like below:

qemu-system-aarch64 \
-cpu host, '{"filter": [{"action": "a", "start": 0x10, "end": "0x11"}]}'

But it doesn't work, apparently because the -cpu option doesn't
support JSON-format parameters.
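Independently of whether -cpu accepts JSON, the payload in the attempt above would not parse as strict JSON: JSON has no hexadecimal literals, and the draft schema declares 'start' and 'end' as 'int' rather than strings. The sketch below demonstrates this with Python's json module, which is only a stand-in; no claim is made here about how QEMU's own JSON parser treats such input.

```python
import json

# The payload as written in the attempted command line.
attempted = '{"filter": [{"action": "a", "start": 0x10, "end": "0x11"}]}'
try:
    json.loads(attempted)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False
print(parsed_ok)  # False: 0x10 is not a valid JSON number

# The same filter with integer bounds, matching the 'int' members of
# the SingleFilter struct in the draft schema.
well_formed = json.dumps(
    {"filter": [{"action": "a", "start": 0x10, "end": 0x11}]}
)
print(well_formed)
```

Any future JSON-based interface for this property would therefore need event numbers written in decimal, or a schema that takes strings and parses the hexadecimal form itself.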


Maybe I'm wrong, so I want to double check: does the -cpu option
support JSON format nowadays?


If the -cpu option doesn't support JSON format, how can I use QAPI
for the kvm-pmu-filter property?


Thanks,
Shaoqin



Yes, you would define a type in the qapi dir similar to how it is
done for IOThreadVirtQueueMapping, and then you can use that
in the property setter method.


With regards,
Daniel


--
Shaoqin




Re: [PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-03-24 Thread Shaoqin Huang

Hi Daniel,

Thanks for your review. I saw your comments on v7.

I have some questions about what you said about QAPI. Do you want me to
convert the current design to QAPI parsing, like
IOThreadVirtQueueMapping? And do we need to add a new JSON definition in
the qapi/ directory?


Thanks,
Shaoqin

On 3/22/24 22:53, Daniel P. Berrangé wrote:

On Tue, Mar 12, 2024 at 03:48:49AM -0400, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

   # qemu-system-aarch64 \
 -accel kvm \
 -cpu host,kvm-pmu-filter="D:0x11-0x11"


I mistakenly sent some comments on the design of this syntax to the
older v7 (despite this v8 already existing). So, to link up the threads:

  https://lists.nongnu.org/archive/html/qemu-devel/2024-03/msg04703.html

With regards,
Daniel


--
Shaoqin




[PATCH v8] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-03-12 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

  # qemu-system-aarch64 \
        -accel kvm \
        -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).

Then, in the guest, use perf to count cycles:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

              1.22 msec task-clock                #    0.001 CPUs utilized
                 1      context-switches          #  820.695 /sec
                 0      cpu-migrations            #    0.000 /sec
                55      page-faults               #   45.138 K/sec
                        cycles
           1128954      instructions
            227031      branches                  #  186.323 M/sec
              8686      branch-misses             #    3.83% of all branches

       1.002492480 seconds time elapsed

       0.001752000 seconds user
       0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Signed-off-by: Shaoqin Huang 
---
v7->v8:
  - Add qtest for kvm-pmu-filter.
  - Do the kvm-pmu-filter syntax checking up-front in the kvm_pmu_filter_set()
    function and store the filter information there. kvm_pmu_filter_get()
    then reconstitutes it.

v6->v7:
  - Check return value of sscanf.
  - Improve the check condition.

v5->v6:
  - Commit message improvement.
  - Remove some unused code.
  - Collect Reviewed-by, thanks Sebastian.
  - Use g_auto(GStrv) to replace the gchar **.  [Eric]

v4->v5:
  - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
  - Comment tweak.  [Gavin]
  - Rebase to the latest branch.

v3->v4:
  - Fix the wrong check for pmu_filter_init.[Sebastian]
  - Fix multiple alignment issue.   [Gavin]
  - Report error by warn_report() instead of error_report(), and don't use
  abort() since the PMU Event Filter is an add-on and best-effort feature.
[Gavin]
  - Add several missing {  } for single line of code.   [Gavin]
  - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
  - Improve commit message, use kernel doc wording, add more explanation on
    filter example, fix some typos.  [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
  - Add more precise error message report.  [Eric]
  - In options doc, add pmu-filter rely on KVM_ARM_VCPU_PMU_V3_FILTER support in
KVM.[Eric]

v1->v2:
  - Add more description for allow and deny meaning in 
commit message. [Sebastian]
  - Small improvement.  [Sebastian]
---
 docs/system/arm/cpu-features.rst |  23 +++
 target/arm/arm-qmp-cmds.c        |   2 +-
 target/arm/cpu.h                 |   3 +
 target/arm/kvm.c                 | 115 +++
 tests/qtest/arm-cpu-features.c   |  51 ++
 5 files changed, 193 insertions(+), 1 deletion(-)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..f3930f34b3 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
   the guest scheduler behavior and/or be exposed to the guest
   userspace.
 
+``kvm-pmu-filter``
+  By default kvm-pmu-filter is disabled. This means that by default all PMU
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy (global ALLOW if the first action is DENY, global DENY
+  if the first action is ALLOW). The start and end only support hexadecimal
+  format. For example:
+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0

Re: [PATCH v7] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-02-28 Thread Shaoqin Huang

Hi Peter,

On 2/22/24 22:28, Peter Maydell wrote:

On Wed, 21 Feb 2024 at 06:34, Shaoqin Huang  wrote:


The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

   # qemu-system-aarch64 \
         -accel kvm \
         -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).

Then, in the guest, use perf to count cycles:

   # perf stat sleep 1

    Performance counter stats for 'sleep 1':

               1.22 msec task-clock                #    0.001 CPUs utilized
                  1      context-switches          #  820.695 /sec
                  0      cpu-migrations            #    0.000 /sec
                 55      page-faults               #   45.138 K/sec
                         cycles
            1128954      instructions
             227031      branches                  #  186.323 M/sec
               8686      branch-misses             #    3.83% of all branches

        1.002492480 seconds time elapsed

        0.001752000 seconds user
        0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Reviewed-by: Sebastian Ott 
Signed-off-by: Shaoqin Huang 
---
v6->v7:
   - Check return value of sscanf.
   - Improve the check condition.

v5->v6:
   - Commit message improvement.
   - Remove some unused code.
   - Collect Reviewed-by, thanks Sebastian.
   - Use g_auto(GStrv) to replace the gchar **.  [Eric]

v4->v5:
   - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
   - Comment tweak.  [Gavin]
   - Rebase to the latest branch.

v3->v4:
   - Fix the wrong check for pmu_filter_init.[Sebastian]
   - Fix multiple alignment issue.   [Gavin]
   - Report error by warn_report() instead of error_report(), and don't use
   abort() since the PMU Event Filter is an add-on and best-effort feature.
 [Gavin]
   - Add several missing {  } for single line of code.   [Gavin]
   - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
   - Improve commit message, use kernel doc wording, add more explanation on
     filter example, fix some typos.  [Eric]
   - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
   - Add more precise error message report.  [Eric]
   - In options doc, add pmu-filter rely on KVM_ARM_VCPU_PMU_V3_FILTER support 
in
 KVM.[Eric]

v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

  docs/system/arm/cpu-features.rst | 23 +
  target/arm/cpu.h |  3 ++
  target/arm/kvm.c | 80 
  3 files changed, 106 insertions(+)


The new syntax for the filter property seems quite complicated.
I think it would be worth testing it with a new test in
tests/qtest/arm-cpu-features.c.


I tried to add a test in tests/qtest/arm-cpu-features.c, but I found
that all the other CPU features are bool properties.


When I use 'query-cpu-model-expansion' to query the CPU features,
kvm-pmu-filter is not shown in the returned results, as below.


{'execute': 'query-cpu-model-expansion', 'arguments': {'type': 'full', 
'model': { 'name': 'host'}}}{"return": {}}


{"return": {"model": {"name": "host", "props": {"sve768": false, 
"sve128": false, "sve1024": false, "sve1280": false, "sve896": false, 
"sve256": false, "sve1536": false, "sve1792": false, "sve384": false, 
"sve": false, "sve2048": false, "pauth": false, "kvm-no-adjvtime": 
false, "sve512": false, "aarch64": true, "pmu": true, "sve1920": false, 
"sve1152": false, "kvm-steal-time": true, "sve640": false, "sve1408": 
false, "sve1664": false


I'm not sure whether that is because `query-cpu-model-expansion` only
returns features that are bool. Since kvm-pmu-filter is a str, it won't
be recogniz

[PATCH v7] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-02-20 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

  # qemu-system-aarch64 \
        -accel kvm \
        -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).

Then, in the guest, use perf to count cycles:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

              1.22 msec task-clock                #    0.001 CPUs utilized
                 1      context-switches          #  820.695 /sec
                 0      cpu-migrations            #    0.000 /sec
                55      page-faults               #   45.138 K/sec
                        cycles
           1128954      instructions
            227031      branches                  #  186.323 M/sec
              8686      branch-misses             #    3.83% of all branches

       1.002492480 seconds time elapsed

       0.001752000 seconds user
       0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Reviewed-by: Sebastian Ott 
Signed-off-by: Shaoqin Huang 
---
v6->v7:
  - Check return value of sscanf.
  - Improve the check condition.

v5->v6:
  - Commit message improvement.
  - Remove some unused code.
  - Collect Reviewed-by, thanks Sebastian.
  - Use g_auto(GStrv) to replace the gchar **.  [Eric]

v4->v5:
  - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
  - Comment tweak.  [Gavin]
  - Rebase to the latest branch.

v3->v4:
  - Fix the wrong check for pmu_filter_init.[Sebastian]
  - Fix multiple alignment issue.   [Gavin]
  - Report error by warn_report() instead of error_report(), and don't use
  abort() since the PMU Event Filter is an add-on and best-effort feature.
[Gavin]
  - Add several missing {  } for single line of code.   [Gavin]
  - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
  - Improve commit message, use kernel doc wording, add more explanation on
    filter example, fix some typos.  [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
  - Add more precise error message report.  [Eric]
  - In options doc, add pmu-filter rely on KVM_ARM_VCPU_PMU_V3_FILTER support in
KVM.[Eric]

v1->v2:
  - Add more description for allow and deny meaning in 
commit message. [Sebastian]
  - Small improvement.  [Sebastian]

 docs/system/arm/cpu-features.rst | 23 +
 target/arm/cpu.h |  3 ++
 target/arm/kvm.c | 80 
 3 files changed, 106 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..7c8f6a60ef 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
   the guest scheduler behavior and/or be exposed to the guest
   userspace.
 
+``kvm-pmu-filter``
+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format. For example:
+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a are
+  also allowed except the event 0x30 which is denied, and all the other
+  events are denied.
+
 TCG VCPU Features
 =================
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 63f31e0d98..f7f2431755 100644
--- a/target/arm/cpu.h
+++ b/target/arm

Re: [PATCH v6] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-02-20 Thread Shaoqin Huang

Hi Eric,

On 2/15/24 17:13, Eric Auger wrote:

Hi Shaoqin,

On 2/1/24 09:51, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example that shows how to use PMU Event Filtering. When
launching a guest with KVM, add the following to the command line:

   # qemu-system-aarch64 \
         -accel kvm \
         -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).

Then, in the guest, use perf to count cycles:

   # perf stat sleep 1

    Performance counter stats for 'sleep 1':

               1.22 msec task-clock                #    0.001 CPUs utilized
                  1      context-switches          #  820.695 /sec
                  0      cpu-migrations            #    0.000 /sec
                 55      page-faults               #   45.138 K/sec
                         cycles
            1128954      instructions
             227031      branches                  #  186.323 M/sec
               8686      branch-misses             #    3.83% of all branches

        1.002492480 seconds time elapsed

        0.001752000 seconds user
        0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Reviewed-by: Sebastian Ott 
Signed-off-by: Shaoqin Huang 
---
v5->v6:
   - Commit message improvement.
   - Remove some unused code.
   - Collect Reviewed-by, thanks Sebastian.
   - Use g_auto(GStrv) to replace the gchar **.  [Eric]

v4->v5:
   - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
   - Comment tweak.  [Gavin]
   - Rebase to the latest branch.

v3->v4:
   - Fix the wrong check for pmu_filter_init.[Sebastian]
   - Fix multiple alignment issue.   [Gavin]
   - Report error by warn_report() instead of error_report(), and don't use
   abort() since the PMU Event Filter is an add-on and best-effort feature.
 [Gavin]
   - Add several missing {  } for single line of code.   [Gavin]
   - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
   - Improve commit message, use kernel doc wording, add more explanation of
     the filter example, fix some typos.                                [Eric]
   - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
   - Add more precise error message reporting.                         [Eric]
   - In the options doc, note that pmu-filter relies on
     KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

  docs/system/arm/cpu-features.rst | 23 ++
  target/arm/cpu.h |  3 ++
  target/arm/kvm.c | 76 
  3 files changed, 102 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..26e306cc83 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
the guest scheduler behavior and/or be exposed to the guest
userspace.
  
+``kvm-pmu-filter``

+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format now. For example:

nit: I would remove " now"


Will remove it.


+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a are
+  also allowed except the event 0x30 which is denied, and all the other
+  events are denied.
+
  TCG VCPU Features
  

[PATCH v6] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-02-01 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example which shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

  # qemu-system-aarch64 \
-accel kvm \
-cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).
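
As a rough illustration of what one such range means at the KVM level, here is a minimal sketch of turning "D:0x11-0x11" into the (base_event, nevents, action) triple that KVM_ARM_VCPU_PMU_V3_FILTER consumes. The struct and constants below are local stand-ins modelled on the arm64 UAPI definition (real code would include <linux/kvm.h> instead), and pmu_range_to_filter() is a hypothetical helper, not a function from this patch:

```c
#include <stdint.h>

/* Local stand-in for the arm64 KVM UAPI struct; an assumption made so the
 * sketch builds without kernel headers. */
struct pmu_event_filter {
    uint16_t base_event;   /* first event number of the range */
    uint16_t nevents;      /* number of events covered by the range */
    uint8_t  action;       /* allow or deny, see the enum below */
    uint8_t  pad[3];
};

/* Stand-ins for KVM_PMU_EVENT_ALLOW / KVM_PMU_EVENT_DENY. */
enum { PMU_EVENT_ALLOW = 0, PMU_EVENT_DENY = 1 };

/* Hypothetical helper: convert one parsed "{A,D}:start-end" range into a
 * filter entry. Returns 0 on success, -1 if the range is not representable. */
static int pmu_range_to_filter(char action, unsigned int start,
                               unsigned int end,
                               struct pmu_event_filter *out)
{
    if ((action != 'A' && action != 'D') || end < start ||
        end > UINT16_MAX || end - start + 1 > UINT16_MAX) {
        return -1;
    }

    out->base_event = (uint16_t)start;
    out->nevents = (uint16_t)(end - start + 1);   /* range is inclusive */
    out->action = (action == 'A') ? PMU_EVENT_ALLOW : PMU_EVENT_DENY;
    out->pad[0] = out->pad[1] = out->pad[2] = 0;
    return 0;
}
```

For "D:0x11-0x11" this yields base_event 0x11, nevents 1 and a deny action, which is why only the cycle counter disappears in the guest.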

And then in guest, use the perf to count the cycle:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

   1.002492480 seconds time elapsed

   0.001752000 seconds user
   0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events do still work.

Reviewed-by: Sebastian Ott 
Signed-off-by: Shaoqin Huang 
---
v5->v6:
  - Commit message improvement.
  - Remove some unused code.
  - Collect Reviewed-by, thanks Sebastian.
  - Use g_auto(GStrv) to replace the gchar **.  [Eric]

v4->v5:
  - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
  - Comment tweak.  [Gavin]
  - Rebase to the latest branch.

v3->v4:
  - Fix the wrong check for pmu_filter_init.[Sebastian]
  - Fix multiple alignment issue.   [Gavin]
  - Report error by warn_report() instead of error_report(), and don't use
  abort() since the PMU Event Filter is an add-on and best-effort feature.
[Gavin]
  - Add several missing {  } for single line of code.   [Gavin]
  - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
  - Improve commit message, use kernel doc wording, add more explanation of
    the filter example, fix some typos.                                [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
  - Add more precise error message reporting.                         [Eric]
  - In the options doc, note that pmu-filter relies on
    KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
  - Add more description for allow and deny meaning in 
commit message. [Sebastian]
  - Small improvement.  [Sebastian]

 docs/system/arm/cpu-features.rst | 23 ++
 target/arm/cpu.h |  3 ++
 target/arm/kvm.c | 76 
 3 files changed, 102 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..26e306cc83 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
   the guest scheduler behavior and/or be exposed to the guest
   userspace.
 
+``kvm-pmu-filter``
+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format now. For example:
+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a are
+  also allowed except the event 0x30 which is denied, and all the other
+  events are denied.
+
 TCG VCPU Features
 =================
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index d3477b1601..2d860c227d 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -948,6 +948,9 @@ struct ArchCPU {
 
 /* KVM steal time */
   

Re: [PATCH v5] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-01-25 Thread Shaoqin Huang

Hi Eric,

On 1/17/24 20:59, Eric Auger wrote:

Hi  Shaoqin,

On 1/15/24 09:01, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).


do not hesitate to cc qemu-...@nongnu.org for ARM specific topics.



Here is an example shows how to use the PMU Event Filtering, when

which shows

we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
 -accel kvm \
 -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
disables the filtering of the cycle counter (event 0x11 being CPU_CYCLES).

Actually it filters it ;-) It would rather say this filters out the
cycle counter. But I am not a native speaker either ;-)


And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

1.002492480 seconds time elapsed

0.001752000 seconds user
0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

do still work


Signed-off-by: Shaoqin Huang 
---
v4->v5:
   - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
   - Comment tweak.  [Gavin]
   - Rebase to the latest branch.

v3->v4:
   - Fix the wrong check for pmu_filter_init.[Sebastian]
   - Fix multiple alignment issue.   [Gavin]
   - Report error by warn_report() instead of error_report(), and don't use
   abort() since the PMU Event Filter is an add-on and best-effort feature.
 [Gavin]
   - Add several missing {  } for single line of code.   [Gavin]
   - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
   - Improve commit message, use kernel doc wording, add more explanation of
     the filter example, fix some typos.                                [Eric]
   - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
   - Add more precise error message reporting.                         [Eric]
   - In the options doc, note that pmu-filter relies on
     KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

  docs/system/arm/cpu-features.rst | 23 ++
  include/sysemu/kvm_int.h |  1 +
  target/arm/cpu.h |  3 ++
  target/arm/kvm.c | 78 
  4 files changed, 105 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..44a797c50e 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
the guest scheduler behavior and/or be exposed to the guest
userspace.
  
+``kvm-pmu-filter``

+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format now. For example:
+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is

s/is/are

+  also allowed except the event 0x30 is denied, and all the other events

0x30 is/0x30 which is

+  are disall

Re: [PATCH v5] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-01-18 Thread Shaoqin Huang

Hi Eric,

On 1/17/24 20:59, Eric Auger wrote:

Hi  Shaoqin,

On 1/15/24 09:01, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).


do not hesitate to cc qemu-...@nongnu.org for ARM specific topics.



Here is an example shows how to use the PMU Event Filtering, when

which shows

we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
 -accel kvm \
 -cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
disables the filtering of the cycle counter (event 0x11 being CPU_CYCLES).

Actually it filters it ;-) It would rather say this filters out the
cycle counter. But I am not a native speaker either ;-)


And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

1.002492480 seconds time elapsed

0.001752000 seconds user
0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

do still work


Signed-off-by: Shaoqin Huang 
---
v4->v5:
   - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
   - Comment tweak.  [Gavin]
   - Rebase to the latest branch.

v3->v4:
   - Fix the wrong check for pmu_filter_init.[Sebastian]
   - Fix multiple alignment issue.   [Gavin]
   - Report error by warn_report() instead of error_report(), and don't use
   abort() since the PMU Event Filter is an add-on and best-effort feature.
 [Gavin]
   - Add several missing {  } for single line of code.   [Gavin]
   - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
   - Improve commit message, use kernel doc wording, add more explanation of
     the filter example, fix some typos.                                [Eric]
   - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
   - Add more precise error message reporting.                         [Eric]
   - In the options doc, note that pmu-filter relies on
     KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

  docs/system/arm/cpu-features.rst | 23 ++
  include/sysemu/kvm_int.h |  1 +
  target/arm/cpu.h |  3 ++
  target/arm/kvm.c | 78 
  4 files changed, 105 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..44a797c50e 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
the guest scheduler behavior and/or be exposed to the guest
userspace.
  
+``kvm-pmu-filter``

+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format now. For example:
+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is

s/is/are

+  also allowed except the event 0x30 is denied, and all the other events

0x30 is/0x30 which is

+  are disall

[PATCH v5] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2024-01-15 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`kvm-pmu-filter` as -cpu sub-option to set the PMU Event Filtering.
Without the filter, all PMU events are exposed from host to guest by
default. The usage of the new sub-option can be found from the updated
document (docs/system/arm/cpu-features.rst).

Here is an example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

  # qemu-system-aarch64 \
-accel kvm \
-cpu host,kvm-pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
disables the filtering of the cycle counter (event 0x11 being CPU_CYCLES).

And then in guest, use the perf to count the cycle:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

   1.002492480 seconds time elapsed

   0.001752000 seconds user
   0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

Signed-off-by: Shaoqin Huang 
---
v4->v5:
  - Change the kvm-pmu-filter as a -cpu sub-option. [Eric]
  - Comment tweak.  [Gavin]
  - Rebase to the latest branch.

v3->v4:
  - Fix the wrong check for pmu_filter_init.[Sebastian]
  - Fix multiple alignment issue.   [Gavin]
  - Report error by warn_report() instead of error_report(), and don't use
  abort() since the PMU Event Filter is an add-on and best-effort feature.
[Gavin]
  - Add several missing {  } for single line of code.   [Gavin]
  - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
  - Improve commit message, use kernel doc wording, add more explanation of
    the filter example, fix some typos.                                [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
  - Add more precise error message reporting.                         [Eric]
  - In the options doc, note that pmu-filter relies on
    KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
  - Add more description for allow and deny meaning in 
commit message. [Sebastian]
  - Small improvement.  [Sebastian]

 docs/system/arm/cpu-features.rst | 23 ++
 include/sysemu/kvm_int.h |  1 +
 target/arm/cpu.h |  3 ++
 target/arm/kvm.c | 78 
 4 files changed, 105 insertions(+)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index a5fb929243..44a797c50e 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -204,6 +204,29 @@ the list of KVM VCPU features and their descriptions.
   the guest scheduler behavior and/or be exposed to the guest
   userspace.
 
+``kvm-pmu-filter``
+  By default kvm-pmu-filter is disabled. This means that by default all pmu
+  events will be exposed to guest.
+
+  KVM implements PMU Event Filtering to prevent a guest from being able to
+  sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER
+  attribute supported in KVM. It has the following format:
+
+  kvm-pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+  The A means "allow" and D means "deny", start is the first event of the
+  range and the end is the last one. The first registered range defines
+  the global policy(global ALLOW if the first @action is DENY, global DENY
+  if the first @action is ALLOW). The start and end only support hexadecimal
+  format now. For example:
+
+  kvm-pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"
+
+  Since the first action is allow, we have a global deny policy. It
+  will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
+  also allowed except the event 0x30 is denied, and all the other events
+  are disallowed.
+
 TCG VCPU Features
 =================
 
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
 uint32_t xen_caps;
 uint16_t xen_gnttab_max_frames;
 uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
 };
 
 void kvm_memory_listener_registe

[PATCH v4] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-12-07 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.
Without the filter, KVM exposes all events from the host to the
guest by default.

The `pmu-filter` has such format:

  pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first registered range defines
the global policy(global ALLOW if the first @action is DENY, global DENY
if the first @action is ALLOW). The start and end only support hex
format now. For example:

  pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"

Since the first action is allow, we have a global deny policy. It
will allow event 0x11 (the cycle counter); events 0x23 to 0x3a are
also allowed, except event 0x30 which is denied, and all the other
events are denied.
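
The semantics described above can be sketched in plain C. This is only a model for reasoning about a filter string, not the parsing code from this patch (which uses g_strsplit()); pmu_filter_allows() and has_hex_prefix() are hypothetical names, and later ranges are assumed to override earlier ones in the order given:

```c
#include <stdbool.h>
#include <stdlib.h>

/* The format only accepts hexadecimal, so require a 0x/0X prefix. */
static bool has_hex_prefix(const char *p)
{
    return p[0] == '0' && (p[1] == 'x' || p[1] == 'X');
}

/* Hypothetical helper: returns true if `event` stays visible to the guest
 * under `spec`. On a malformed spec, sets *err and returns false. */
static bool pmu_filter_allows(const char *spec, unsigned long event, bool *err)
{
    bool first = true;
    bool allowed = true;          /* no filter: every event is exposed */
    const char *p = spec;

    *err = false;
    while (*p) {
        char action = *p;
        char *endp;
        unsigned long start, end;

        if ((action != 'A' && action != 'D') || p[1] != ':' ||
            !has_hex_prefix(p + 2)) {
            *err = true;
            return false;
        }
        start = strtoul(p + 2, &endp, 16);
        if (*endp != '-' || !has_hex_prefix(endp + 1)) {
            *err = true;
            return false;
        }
        p = endp + 1;
        end = strtoul(p, &endp, 16);
        if (end < start || (*endp != ';' && *endp != '\0')) {
            *err = true;
            return false;
        }

        if (first) {
            /* The first range fixes the global policy: the opposite of
             * its own action. */
            allowed = (action == 'D');
            first = false;
        }
        if (event >= start && event <= end) {
            allowed = (action == 'A');   /* later ranges override */
        }
        p = (*endp == ';') ? endp + 1 : endp;
    }
    return allowed;
}
```

With the example string, 0x11 and 0x23 come out as allowed while 0x30 and any event outside the listed ranges (say 0x40) come out as denied, matching the description.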

Here is a real example which shows how to use the PMU Event Filtering. When
we launch a guest using KVM, add this to the command line:

  # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
filters out the cycle counter (event 0x11 being CPU_CYCLES).

And then in guest, use the perf to count the cycle:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

   1.002492480 seconds time elapsed

   0.001752000 seconds user
   0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events do still work.

Signed-off-by: Shaoqin Huang 
---
v3->v4:
  - Fix the wrong check for pmu_filter_init.[Sebastian]
  - Fix multiple alignment issue.   [Gavin]
  - Report error by warn_report() instead of error_report(), and don't use
  abort() since the PMU Event Filter is an add-on and best-effort feature.
[Gavin]
  - Add several missing {  } for single line of code.   [Gavin]
  - Use the g_strsplit() to replace strtok().   [Gavin]

v2->v3:
  - Improve commit message, use kernel doc wording, add more explanation of
    the filter example, fix some typos.                                [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
  - Add more precise error message reporting.                         [Eric]
  - In the options doc, note that pmu-filter relies on
    KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
  - Add more description for allow and deny meaning in 
commit message. [Sebastian]
  - Small improvement.  [Sebastian]

v2: https://lore.kernel.org/all/20231117060838.39723-1-shahu...@redhat.com/
v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
 include/sysemu/kvm_int.h |  1 +
 qemu-options.hx  | 21 +++
 target/arm/kvm.c | 23 
 target/arm/kvm64.c   | 75 
 4 files changed, 120 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
 uint32_t xen_caps;
 uint16_t xen_gnttab_max_frames;
 uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
 };
 
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..054865ba0d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
 "tb-size=n (TCG translation block cache size)\n"
 "dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
 "eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;{A,D}:start-end...] (KVM PMU Event Filter, default no filter. ARM only)\n"
 "notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
 "thread=single|multi (enable multi-threaded TCG)\n", 
Q

Re: [PATCH v3] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-12-06 Thread Shaoqin Huang

Hi Gavin,

On 12/1/23 13:37, Gavin Shan wrote:

Hi Shaoqin,

On 11/29/23 14:08, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provide the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.
Without the filter, the KVM will expose all events from the host to
guest by default.

The `pmu-filter` has such format:

   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first registered range defines
the global policy(global ALLOW if the first @action is DENY, global DENY
if the first @action is ALLOW). The start and end only support hex
format now. For example:

   pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"

Since the first action is allow, we have a global deny policy. It
will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
also allowed except the event 0x30 is denied, and all the other events
are disallowed.

Here is an real example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
disables the filtering of the cycle counter (event 0x11 being CPU_CYCLES).

And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

    Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

    1.002492480 seconds time elapsed

    0.001752000 seconds user
    0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

Signed-off-by: Shaoqin Huang 
---
v2->v3:
   - Improve commit message, use kernel doc wording, add more explanation of
     the filter example, fix some typos.                                [Eric]
   - Add g_free() in kvm_arch_set_pmu_filter() to prevent memory leak. [Eric]
   - Add more precise error message reporting.                         [Eric]
   - In the options doc, note that pmu-filter relies on
     KVM_ARM_VCPU_PMU_V3_FILTER support in KVM.                         [Eric]

v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

v2: https://lore.kernel.org/all/20231117060838.39723-1-shahu...@redhat.com/
v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/

---
  include/sysemu/kvm_int.h |  1 +
  qemu-options.hx  | 21 +
  target/arm/kvm.c | 23 ++
  target/arm/kvm64.c   | 68 
  4 files changed, 113 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
  uint32_t xen_caps;
  uint16_t xen_gnttab_max_frames;
  uint16_t xen_evtchn_max_pirq;
+    char *kvm_pmu_filter;
  };
  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..8b721d6668 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
  "    tb-size=n (TCG translation block cache size)\n"
  "    dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
  "    eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
+    "    pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, default no filter. ARM only)\n"

   ^^^

Potential alignment issue, or the email isn't shown for me correctly.
Besides, why not follow the pattern in the commit log, which is nicer
than what's of being:

pmu-filter={A,D}:start-end[;...]

to

pmu-filter="{A,D}:start-end[;{A,D}:start-end...]



Ok. I can replace it with the better format.

  "    notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
  "    thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)


Re: [PATCH v3] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-12-06 Thread Shaoqin Huang




On 12/1/23 00:55, Sebastian Ott wrote:

On Tue, 28 Nov 2023, Shaoqin Huang wrote:

+static void kvm_arm_pmu_filter_init(CPUState *cs)
+{
+    static bool pmu_filter_init = false;
+    struct kvm_pmu_event_filter filter;
+    struct kvm_device_attr attr = {
+    .group  = KVM_ARM_VCPU_PMU_V3_CTRL,
+    .attr   = KVM_ARM_VCPU_PMU_V3_FILTER,
+    .addr   = (uint64_t)&filter,
+    };
+    KVMState *kvm_state = cs->kvm_state;
+    char *tmp;
+    char *str, act;
+
+    if (!kvm_state->kvm_pmu_filter)
+    return;
+
+    if (kvm_vcpu_ioctl(cs, KVM_HAS_DEVICE_ATTR, &attr)) {
+    error_report("The kernel doesn't support the pmu event filter!\n");

+    abort();
+    }
+
+    /* The filter only needs to be initialized for 1 vcpu. */
+    if (!pmu_filter_init)
+    pmu_filter_init = true;


Imho this is missing an else to bail out. Or the shorter version

 if (pmu_filter_init)
     return;

 pmu_filter_init = true;



Yes. This is what I want to do. Thanks for fixing it.


which could also move above the other tests.

Sebastian
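
For reference, the early-return guard suggested above can be sketched in isolation as follows. This is a stand-alone illustration under assumptions, not the function from the patch: pmu_filter_init_once() and the filter_installs counter are hypothetical names, and the counter exists only to make the run-once behaviour observable:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustration only: counts how many times the guarded body has run. */
static int filter_installs;

/* Hypothetical stand-in for the per-vCPU init hook. It is called once per
 * vCPU, but the filter itself only needs to be installed once. */
static void pmu_filter_init_once(const char *filter_spec)
{
    static bool pmu_filter_init = false;

    if (filter_spec == NULL) {
        return;                 /* no filter requested, nothing to do */
    }
    if (pmu_filter_init) {
        return;                 /* already done by an earlier vCPU */
    }
    pmu_filter_init = true;

    /* In real code, the parsing and KVM_SET_DEVICE_ATTR calls would go
     * here. */
    filter_installs++;
}
```

Placing the guard ahead of the other checks, as suggested, also avoids repeating the KVM_HAS_DEVICE_ATTR probe for every vCPU.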



--
Shaoqin




[PATCH v3] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-28 Thread Shaoqin Huang
The KVM_ARM_VCPU_PMU_V3_FILTER provide the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.
Without the filter, the KVM will expose all events from the host to
guest by default.

The `pmu-filter` has such format:

  pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first registered range defines
the global policy(global ALLOW if the first @action is DENY, global DENY
if the first @action is ALLOW). The start and end only support hex
format now. For example:

  pmu-filter="A:0x11-0x11;A:0x23-0x3a;D:0x30-0x30"

Since the first action is allow, we have a global deny policy. It
will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
also allowed except the event 0x30 is denied, and all the other events
are disallowed.

Here is an real example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

  # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

Since the first action is deny, we have a global allow policy. This
disables the filtering of the cycle counter (event 0x11 being CPU_CYCLES).

And then in guest, use the perf to count the cycle:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

          1.22 msec task-clock        #    0.001 CPUs utilized
             1      context-switches  #  820.695 /sec
             0      cpu-migrations    #    0.000 /sec
            55      page-faults       #   45.138 K/sec
                    cycles
       1128954      instructions
        227031      branches          #  186.323 M/sec
          8686      branch-misses     #    3.83% of all branches

   1.002492480 seconds time elapsed

   0.001752000 seconds user
   0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

Signed-off-by: Shaoqin Huang 
---
v2->v3:
  - Improve the commit message: use the kernel doc wording, add more
    explanation of the filter example, fix some typos. [Eric]
  - Add g_free() in kvm_arch_set_pmu_filter() to prevent a memory leak. [Eric]
  - Report more precise error messages. [Eric]
  - In the options doc, note that pmu-filter relies on
    KVM_ARM_VCPU_PMU_V3_FILTER support in KVM. [Eric]

v1->v2:
  - Add more description of the allow and deny meaning in the
    commit message. [Sebastian]
  - Small improvements. [Sebastian]

v2: https://lore.kernel.org/all/20231117060838.39723-1-shahu...@redhat.com/
v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
 include/sysemu/kvm_int.h |  1 +
 qemu-options.hx  | 21 +
 target/arm/kvm.c | 23 ++
 target/arm/kvm64.c   | 68 
 4 files changed, 113 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
 uint32_t xen_caps;
 uint16_t xen_gnttab_max_frames;
 uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
 };
 
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..8b721d6668 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
 "tb-size=n (TCG translation block cache size)\n"
 "dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
 "eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, default no filter. ARM only)\n"
 "notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
 "thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
 SRST
@@ -259,6 +260,26 @@ SRST
 impact on the memory. By default, this feature is disabled
 (eager-split-size=0).
 
+``pmu-filter={A,D}:start-end[;...]``
+   KVM implements pmu event filtering to prevent a guest from being able to
+   sample certain events. It depends on the KVM_ARM_VCPU_PMU_V3_FILTER attr
+   supported in KVM. It has the following format:
+
+   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+   The A

Re: [PATCH v2] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-27 Thread Shaoqin Huang

Hi Eric,

On 11/25/23 02:24, Eric Auger wrote:

Hi,

On 11/17/23 07:08, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provide the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.

The `pmu-filter` has such format:

   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first filter action defines if the whole
event list is an allow or deny list, if the first filter action is "allow", all
other events are denied except start-end; if the first filter action is "deny",
all other events are allowed except start-end. For example:

   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"

This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
also allowed except the event 0x30 is denied, and all the other events
are disallowed.

Here is an real example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

Performance counter stats for 'sleep 1':

   1.22 msec task-clock   #0.001 CPUs 
utilized
  1  context-switches #  820.695 /sec
  0  cpu-migrations   #0.000 /sec
 55  page-faults  #   45.138 K/sec
  cycles
1128954  instructions
 227031  branches #  186.323 M/sec
   8686  branch-misses#3.83% of all 
branches

1.002492480 seconds time elapsed

0.001752000 seconds user
0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

Signed-off-by: Shaoqin Huang 
---
v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
  include/sysemu/kvm_int.h |  1 +
  qemu-options.hx  | 16 +
  target/arm/kvm.c | 22 +
  target/arm/kvm64.c   | 51 
  4 files changed, 90 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
  uint32_t xen_caps;
  uint16_t xen_gnttab_max_frames;
  uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
  };
  
  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,

diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..dd3518092c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
  "tb-size=n (TCG translation block cache size)\n"
  "dirty-ring-size=n (KVM dirty ring GFN count, default 
0)\n"
  "eager-split-size=n (KVM Eager Page Split chunk size, default 
0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, 
default no filter. ARM only)\n"
  "notify-vmexit=run|internal-error|disable,notify-window=n 
(enable notify VM exit and set notify window, x86 only)\n"
  "thread=single|multi (enable multi-threaded TCG)\n", 
QEMU_ARCH_ALL)
  SRST
@@ -259,6 +260,21 @@ SRST
  impact on the memory. By default, this feature is disabled
  (eager-split-size=0).
  
+``pmu-filter={A,D}:start-end[;...]``

+KVM implements pmu event filtering to prevent a guest from being able 
to
+   sample certain events. It has the following format:
+
+   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+   The A means "allow" and D means "deny", start if the first event of the
+   range and the end is the last one. For example:
+
+   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"
+
+   This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
+   also allowed except the event 0x30 is denied, and all the other events
+   are disallowed.
+
  ``notify-vmexit=run|internal-error|disable,notify-window=n``
  Enables or disables notify VM exit support on x86 host and specify
  the corresponding notify window to trigger the VM exit if enabled.
diff --git a/target/arm/kvm.c b/targe

Re: [PATCH v2] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-27 Thread Shaoqin Huang

Hi Eric,

On 11/24/23 18:40, Eric Auger wrote:

Hi Shaoqin,

On 11/17/23 07:08, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provide the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.

You could remind the reader of the default policy without a filter (i.e.
expose all events from the host).


Yes. I will add a description of the default policy.



The `pmu-filter` has such format:

   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first filter action defines if the whole
event list is an allow or deny list, if the first filter action is "allow", all
other events are denied except start-end; if the first filter action is "deny",
all other events are allowed except start-end. For example:


I prefer the kernel doc wording
The first registered range defines the global policy (global ALLOW if
the first @action is DENY, global DENY if the first @action is ALLOW).


I can replace this with the kernel doc description in the next version.



   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"

Shouldn't the "," be replaced by a ";"?


Yes. Thanks for catching this. It should be ";".




I would add: since the first action is allow, we have a global deny policy.


That makes the example clearer. I will add it.



This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
also allowed except the event 0x30 is denied, and all the other events
are disallowed.

Here is an real example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

Since the first filter action is deny, we have a global allow policy.
this disables the filtering of the cycle counter (event 0x11 being
CPU_CYCLES)

kernel doc says that the ranges should match the PMU arch (10 bits on
ARMv8.0, 16 bits from ARMv8.1 onwards). How do you handle that?


Currently I think I can rely on SET_DEVICE_ATTR: when setting
KVM_ARM_VCPU_PMU_V3_FILTER, check errno and, if it equals EINVAL,
report to the user that the filter is invalid.


Or, alternatively, detect the ARM version and do more checks in
userspace?


Do you have any good suggestions on how to handle the two different event
spaces in QEMU?
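
To make the userspace option concrete, here is a minimal sketch of such a
check. It is not taken from the patch: the helper name
pmu_filter_range_fits and the event_bits parameter are hypothetical, and how
QEMU would learn the width of the event space (10 or 16 bits) is left open:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check that the inclusive range [start, end] fits an event space that is
 * @event_bits wide (10 bits on ARMv8.0, 16 bits from ARMv8.1 onwards,
 * according to the kernel doc mentioned in the review).
 */
static bool pmu_filter_range_fits(uint32_t start, uint32_t end,
                                  unsigned int event_bits)
{
    uint32_t nr_events;

    /* Only widths up to 16 bits make sense here; reject anything else. */
    if (event_bits == 0 || event_bits > 16) {
        return false;
    }

    nr_events = 1u << event_bits;
    return start <= end && end < nr_events;
}
```

Such a check would only give an earlier and more specific error message; the
EINVAL returned by the kernel would still have to be handled.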




And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

Performance counter stats for 'sleep 1':

   1.22 msec task-clock   #0.001 CPUs 
utilized
  1  context-switches #  820.695 /sec
  0  cpu-migrations   #0.000 /sec
 55  page-faults  #   45.138 K/sec
  cycles
1128954  instructions
 227031  branches #  186.323 M/sec
   8686  branch-misses#3.83% of all 
branches

1.002492480 seconds time elapsed

0.001752000 seconds user
0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.


perf list should work as well


It works. Should I post its output here?



Signed-off-by: Shaoqin Huang 
---
v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
  include/sysemu/kvm_int.h |  1 +
  qemu-options.hx  | 16 +
  target/arm/kvm.c | 22 +
  target/arm/kvm64.c   | 51 
  4 files changed, 90 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
  uint32_t xen_caps;
  uint16_t xen_gnttab_max_frames;
  uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
  };
  
  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,

diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..dd3518092c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
  "tb-size=n (TCG translation block cache size)\n"
  "dirty-ring-size=n (KVM dirty ring GFN count, default 
0)\n"
  "eager-split-size=n (KVM Eager Page Split chunk size, default 
0, disabled. ARM only)\n"
+"pmu-filter=

Re: [PATCH V7 8/8] docs/specs/acpi_hw_reduced_hotplug: Add the CPU Hotplug Event Bit

2023-11-21 Thread Shaoqin Huang




On 11/14/23 04:12, Salil Mehta via wrote:

GED interface is used by many hotplug events like memory hotplug, NVDIMM hotplug
and non-hotplug events like system power down event. Each of these can be
selected using a bit in the 32 bit GED IO interface. A bit has been reserved for
the CPU hotplug event.

Signed-off-by: Salil Mehta 

Reviewed-by: Shaoqin Huang 

---
  docs/specs/acpi_hw_reduced_hotplug.rst | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/specs/acpi_hw_reduced_hotplug.rst b/docs/specs/acpi_hw_reduced_hotplug.rst
index 0bd3f9399f..3acd6fcd8b 100644
--- a/docs/specs/acpi_hw_reduced_hotplug.rst
+++ b/docs/specs/acpi_hw_reduced_hotplug.rst
@@ -64,7 +64,8 @@ GED IO interface (4 byte access)
 0: Memory hotplug event
 1: System power down event
 2: NVDIMM hotplug event
-3-31: Reserved
+   3: CPU hotplug event
+4-31: Reserved
  
  **write_access:**
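
As an aside, the bit assignment documented in this hunk can be illustrated
with a small sketch. The helper names below are invented for this note and do
not exist in QEMU; only the bit numbers come from the table above:

```c
#include <stdint.h>

/* Map a bit number of the 32-bit GED selector to the documented event. */
static const char *ged_event_name(unsigned int bit)
{
    switch (bit) {
    case 0:
        return "Memory hotplug event";
    case 1:
        return "System power down event";
    case 2:
        return "NVDIMM hotplug event";
    case 3:
        return "CPU hotplug event";
    default:
        return "Reserved";   /* bits 4-31 after this patch */
    }
}

/* Count how many documented (non-reserved) events are set in a selector. */
static unsigned int ged_known_events(uint32_t sel)
{
    unsigned int bit, count = 0;

    for (bit = 0; bit < 4; bit++) {
        if (sel & (1u << bit)) {
            count++;
        }
    }
    return count;
}
```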
  


--
Shaoqin




[PATCH v2] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-16 Thread Shaoqin Huang
KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as an -accel sub-option to set the PMU event filter.

The `pmu-filter` option has the following format:

  pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

"A" means "allow" and "D" means "deny"; start is the first event of the
range and end is the last one. The first filter action defines whether the
whole event list is an allow list or a deny list: if the first filter action
is "allow", all events outside start-end are denied; if it is "deny", all
events outside start-end are allowed. For example:

  pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"

This allows event 0x11 (the cycle counter) and events 0x23 to 0x3a, except
for event 0x30, which is denied. All other events are denied.

Here is a real example showing how to use PMU event filtering. When
launching a guest with KVM, add the following to the command line:

  # qemu-system-aarch64 \
        -accel kvm,pmu-filter="D:0x11-0x11"

Then, in the guest, use perf to count cycles:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

              1.22 msec task-clock       #    0.001 CPUs utilized
                 1      context-switches #  820.695 /sec
                 0      cpu-migrations   #    0.000 /sec
                55      page-faults      #   45.138 K/sec
                        cycles
           1128954      instructions
            227031      branches         #  186.323 M/sec
              8686      branch-misses    #    3.83% of all branches

       1.002492480 seconds time elapsed

       0.001752000 seconds user
       0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Signed-off-by: Shaoqin Huang 
---
v1->v2:
  - Add more description of the allow and deny meaning in the
    commit message. [Sebastian]
  - Small improvements. [Sebastian]

v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
 include/sysemu/kvm_int.h |  1 +
 qemu-options.hx  | 16 +
 target/arm/kvm.c | 22 +
 target/arm/kvm64.c   | 51 
 4 files changed, 90 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
 uint32_t xen_caps;
 uint16_t xen_gnttab_max_frames;
 uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
 };
 
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..dd3518092c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
 "tb-size=n (TCG translation block cache size)\n"
 "dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
 "eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, default no filter. ARM only)\n"
 "notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
 "thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
 SRST
@@ -259,6 +260,21 @@ SRST
 impact on the memory. By default, this feature is disabled
 (eager-split-size=0).
 
+``pmu-filter={A,D}:start-end[;...]``
+   KVM implements pmu event filtering to prevent a guest from being able to
+   sample certain events. It has the following format:
+
+   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+   The A means "allow" and D means "deny", start if the first event of the
+   range and the end is the last one. For example:
+
+   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"
+
+   This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
+   also allowed except the event 0x30 is denied, and all the other events
+   are disallowed.
+
 ``notify-vmexit=run|internal-error|disable,notify-window=n``
 Enables or disables notify VM exit support on x86 host and specify
 the corresponding notify window to trigger the VM exit if enabled.
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7903e2ddde..74796de055 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1108,6 +1108,21 @@ static void kvm_arch_set_eag

Re: [PATCH v1] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-16 Thread Shaoqin Huang

Hi Sebastian,

On 11/15/23 20:17, Sebastian Ott wrote:

Hi,

On Mon, 13 Nov 2023, Shaoqin Huang wrote:

+    ``pmu-filter={A,D}:start-end[;...]``
+    KVM implements pmu event filtering to prevent a guest from 
being able to

+    sample certain events. It has the following format:
+
+    pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+    The A means "allow" and D means "deny", start if the first event 
of the

   ^
   is



Thanks for pointing it out.


Also it should be stated that the first filter action defines if the whole
list is an allow or a deny list.


+static void kvm_arm_pmu_filter_init(CPUState *cs)
+{
+    struct kvm_pmu_event_filter filter;
+    struct kvm_device_attr attr = {
+    .group  = KVM_ARM_VCPU_PMU_V3_CTRL,
+    .attr   = KVM_ARM_VCPU_PMU_V3_FILTER,
+    };
+    KVMState *kvm_state = cs->kvm_state;
+    char *tmp;
+    char *str, act;
+
+    if (!kvm_state->kvm_pmu_filter)
+    return;
+
+    tmp = g_strdup(kvm_state->kvm_pmu_filter);
+
+    for (str = strtok(tmp, ";"); str != NULL; str = strtok(NULL, ";")) {
+    unsigned short start = 0, end = 0;
+
+    sscanf(str, "%c:%hx-%hx", &act, &start, &end);
+    if ((act != 'A' && act != 'D') || (!start && !end)) {
+    error_report("skipping invalid filter %s\n", str);
+    continue;
+    }
+
+    filter = (struct kvm_pmu_event_filter) {
+    .base_event = start,
+    .nevents    = end - start + 1,
+    .action = act == 'A' ? KVM_PMU_EVENT_ALLOW :
+   KVM_PMU_EVENT_DENY,
+    };
+
+    attr.addr = (uint64_t)&filter;


That could move to the initialization of attr (the address of filter
doesn't change).



It looks better. Will change it.


+    if (!kvm_arm_set_device_attr(cs, &attr, "PMU Event Filter")) {
+    error_report("Failed to init PMU Event Filter\n");
+    abort();
+    }
+    }
+
+    g_free(tmp);
+}
+
void kvm_arm_pmu_init(CPUState *cs)
{
    struct kvm_device_attr attr = {
    .group = KVM_ARM_VCPU_PMU_V3_CTRL,
    .attr = KVM_ARM_VCPU_PMU_V3_INIT,
    };
+    static bool pmu_filter_init = false;

    if (!ARM_CPU(cs)->has_pmu) {
    return;
    }
+    if (!pmu_filter_init) {
+    kvm_arm_pmu_filter_init(cs);
+    pmu_filter_init = true;


pmu_filter_init could move inside kvm_arm_pmu_filter_init() - maybe
together with a comment that this only needs to be called for 1 vcpu.


Good idea. Will do that.
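
For reference, moving the guard into the init function could look roughly
like the sketch below. This is only an illustration of the run-once pattern
discussed here: pmu_filter_init_once and filter_setup_calls are placeholder
names, and the real function would take the CPUState and program the filter
instead of bumping a counter:

```c
#include <stdbool.h>

/* Counts how often the setup body really ran; for demonstration only. */
static int filter_setup_calls;

/* Stand-in for kvm_arm_pmu_filter_init(); the body is hypothetical. */
static void pmu_filter_init_once(void)
{
    static bool done;

    /* Per the review comment, this only needs to be called for one vCPU. */
    if (done) {
        return;
    }
    done = true;

    filter_setup_calls++;
}
```

Note that a plain static flag like this is not thread-safe on its own; the
sketch assumes the callers are already serialized.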

Thanks,
Shaoqin



Thanks,
Sebastian






[PATCH v1] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-13 Thread Shaoqin Huang
KVM_ARM_VCPU_PMU_V3_FILTER provides the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as an -accel sub-option to set the PMU event filter.

The `pmu-filter` option has the following format:

  pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

"A" means "allow" and "D" means "deny"; start is the first event of the
range and end is the last one. For example:

  pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"

This allows event 0x11 (the cycle counter) and events 0x23 to 0x3a, except
for event 0x30, which is denied. All other events are denied.
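
As a rough illustration of the format (not the code in this patch), one
"{A,D}:start-end" token could be parsed and validated as follows. The
function name parse_pmu_filter_token is made up for this sketch, and the
strictness (rejecting trailing characters and start > end) is an assumption
about what a robust parser might want:

```c
#include <stdio.h>

/*
 * Parse one "A:start-end" or "D:start-end" token with hexadecimal values.
 * Returns 0 on success and -1 on malformed input. Values that do not fit
 * in 16 bits are not detected by this sketch.
 */
static int parse_pmu_filter_token(const char *s, char *action,
                                  unsigned short *start, unsigned short *end)
{
    char act;
    unsigned short lo, hi;
    int consumed = 0;

    /* All three fields must convert; %n records how much was consumed. */
    if (sscanf(s, "%c:%hx-%hx%n", &act, &lo, &hi, &consumed) != 3) {
        return -1;
    }
    /* Reject trailing characters after the range. */
    if (s[consumed] != '\0') {
        return -1;
    }
    if (act != 'A' && act != 'D') {
        return -1;
    }
    if (lo > hi) {
        return -1;
    }

    *action = act;
    *start = lo;
    *end = hi;
    return 0;
}
```

A caller would split the option string on ";" first and pass each token to
this helper.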

Here is a real example showing how to use PMU event filtering. When
launching a guest with KVM, add the following to the command line:

  # qemu-system-aarch64 \
        -accel kvm,pmu-filter="D:0x11-0x11"

Then, in the guest, use perf to count cycles:

  # perf stat sleep 1

   Performance counter stats for 'sleep 1':

              1.22 msec task-clock       #    0.001 CPUs utilized
                 1      context-switches #  820.695 /sec
                 0      cpu-migrations   #    0.000 /sec
                55      page-faults      #   45.138 K/sec
                        cycles
           1128954      instructions
            227031      branches         #  186.323 M/sec
              8686      branch-misses    #    3.83% of all branches

       1.002492480 seconds time elapsed

       0.001752000 seconds user
       0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other PMU events still work.

Signed-off-by: Shaoqin Huang 
---
 include/sysemu/kvm_int.h |  1 +
 qemu-options.hx  | 16 +
 target/arm/kvm.c | 22 ++
 target/arm/kvm64.c   | 49 
 4 files changed, 88 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
 uint32_t xen_caps;
 uint16_t xen_gnttab_max_frames;
 uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
 };
 
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..dd3518092c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
 "tb-size=n (TCG translation block cache size)\n"
 "dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
 "eager-split-size=n (KVM Eager Page Split chunk size, default 0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, default no filter. ARM only)\n"
 "notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
 "thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
 SRST
@@ -259,6 +260,21 @@ SRST
 impact on the memory. By default, this feature is disabled
 (eager-split-size=0).
 
+``pmu-filter={A,D}:start-end[;...]``
+   KVM implements pmu event filtering to prevent a guest from being able to
+   sample certain events. It has the following format:
+
+   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+   The A means "allow" and D means "deny", start if the first event of the
+   range and the end is the last one. For example:
+
+   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"
+
+   This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
+   also allowed except the event 0x30 is denied, and all the other events
+   are disallowed.
+
 ``notify-vmexit=run|internal-error|disable,notify-window=n``
 Enables or disables notify VM exit support on x86 host and specify
 the corresponding notify window to trigger the VM exit if enabled.
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7903e2ddde..74796de055 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1108,6 +1108,21 @@ static void kvm_arch_set_eager_split_size(Object *obj, Visitor *v,
 s->kvm_eager_split_size = value;
 }
 
+static char *kvm_arch_get_pmu_filter(Object *obj, Error **errp)
+{
+KVMState *s = KVM_STATE(obj);
+
+return g_strdup(s->kvm_pmu_filter);
+}
+
+static void kvm_arch_set_pmu_filter(Object *obj, const char *pmu_filter,
+Error **errp)
+{
+KVMState *s = KVM_STATE(obj);
+
+s->kvm_pmu_filter = g_strdup(pmu_filter);
+}
+
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
 object_class_property_add(

Re: [PATCH V6 0/9] Add architecture agnostic code to support vCPU Hotplug

2023-10-19 Thread Shaoqin Huang




On 10/19/23 17:34, Salil Mehta wrote:

Hi Shaoqin,


From: Shaoqin Huang 
Sent: Thursday, October 19, 2023 10:05 AM
To: Salil Mehta ; qemu-devel@nongnu.org; qemu-
a...@nongnu.org
Cc: m...@kernel.org; jean-phili...@linaro.org; Jonathan Cameron
; lpieral...@kernel.org;
peter.mayd...@linaro.org; richard.hender...@linaro.org;
imamm...@redhat.com; andrew.jo...@linux.dev; da...@redhat.com;
phi...@linaro.org; eric.au...@redhat.com; oliver.up...@linux.dev;
pbonz...@redhat.com; m...@redhat.com; w...@kernel.org; gs...@redhat.com;
raf...@kernel.org; alex.ben...@linaro.org; li...@armlinux.org.uk;
dar...@os.amperecomputing.com; il...@os.amperecomputing.com;
vis...@os.amperecomputing.com; karl.heub...@oracle.com;
miguel.l...@oracle.com; salil.me...@opnsrc.net; zhukeqian
; wangxiongfeng (C) ;
wangyanan (Y) ; jiakern...@gmail.com;
maob...@loongson.cn; lixiang...@loongson.cn; Linuxarm 
Subject: Re: [PATCH V6 0/9] Add architecture agnostic code to support vCPU
Hotplug



On 10/13/23 18:51, Salil Mehta via wrote:

Virtual CPU hotplug support is being added across various architectures[1][3].
This series adds various code bits common across all architectures:

1. vCPU creation and Parking code refactor [Patch 1]
2. Update ACPI GED framework to support vCPU Hotplug [Patch 4,6,7]
3. ACPI CPUs AML code change [Patch 5]
4. Helper functions to support unrealization of CPU objects [Patch 8,9]
5. Misc [Patch 2,3]


Repository:

[*] https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2.common.v6


Revision History:

Patch-set  V5 -> V6
1. Addressed Gavin Shan's comments
 - Fixed the assert() ranges of address spaces
 - Rebased the patch-set to latest changes in the qemu.git
 - Added Reviewed-by tags for patches {8,9}
2. Addressed Jonathan Cameron's comments
 - Updated commit-log for [Patch V5 1/9] with mention of trace events
 - Added Reviewed-by tags for patches {1,5}
3. Added Tested-by tags from Xianglai Li
4. Fixed checkpatch.pl error "Qemu -> QEMU" in [Patch V5 1/9]
Link: https://lore.kernel.org/qemu-devel/20231011194355.15628-1-salil.me...@huawei.com/


Patch-set  V4 -> V5
1. Addressed Gavin Shan's comments
 - Fixed the trace events print string for kvm_{create,get,park,destroy}_vcpu
 - Added Reviewed-by tag for patch {1}
2. Added Shaoqin Huang's Reviewed-by tags for Patches {2,3}
3. Added Tested-by Tag from Vishnu Pajjuri to the patch-set
4. Dropped the ARM specific [Patch V4 10/10]
Link: https://lore.kernel.org/qemu-devel/20231009203601.17584-1-salil.me...@huawei.com/


Patch-set  V3 -> V4
1. Addressed David Hildenbrand's comments
 - Fixed the wrong doc comment of kvm_park_vcpu API prototype
 - Added Reviewed-by tags for patches {2,4}
Link: https://lore.kernel.org/qemu-devel/20231009112812.10612-1-salil.me...@huawei.com/


Patch-set  V2 -> V3
1. Addressed Jonathan Cameron's comments
 - Fixed 'vcpu-id' type wrongly changed from 'unsigned long' to 'integer'
 - Removed unnecessary use of variable 'vcpu_id' in kvm_park_vcpu
 - Updated [Patch V2 3/10] commit-log with details of ACPI_CPU_SCAN_METHOD macro
 - Updated [Patch V2 5/10] commit-log with details of conditional event handler method
 - Added Reviewed-by tags for patches {2,3,4,6,7}
2. Addressed Gavin Shan's comments
 - Removed unnecessary use of variable 'vcpu_id' in kvm_park_vcpu
 - Fixed return value in kvm_get_vcpu from -1 to -ENOENT
 - Reset the value of 'gdb_num_g_regs' in gdb_unregister_coprocessor_all
 - Fixed the kvm_{create,park}_vcpu prototypes docs
 - Added Reviewed-by tags for patches {2,3,4,5,6,7,9,10}
3. Addressed one earlier missed comment by Alex Bennée in RFC V1
 - Added traces instead of DPRINTF in the newly added and some existing functions
Link: https://lore.kernel.org/qemu-devel/20230930001933.2660-1-salil.me...@huawei.com/


Patch-set V1 -> V2
1. Addressed Alex Bennée's comments
 - Refactored the kvm_create_vcpu logic to get rid of goto
 - Added the docs for kvm_{create,park}_vcpu prototypes
 - Split the gdbstub and AddressSpace destruction change into separate patches
 - Added Reviewed-by tags for patches {2,10}
Link: https://lore.kernel.org/qemu-devel/20230929124304.13672-1-salil.me...@huawei.com/


References:

[1] https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.me...@huawei.com/
[2] https://lore.kernel.org/all/20230913163823.7880-1-james.mo...@arm.com/
[3] https://lore.kernel.org/qemu-devel/cover.1695697701.git.lixiang...@loongson.cn/



Salil Mehta (9):
accel/kvm: Extract common KVM vCPU {creation,parking} code
hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
hw/acpi: Add ACPI CPU hotplug init stub
hw/acpi: Init GED framework with CPU hotplug events
hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
hw/acpi: Update GED _EVT method AML with CPU scan
hw/acpi: Update ACPI GED framework to support vCPU Hot

Re: [PATCH V6 0/9] Add architecture agnostic code to support vCPU Hotplug

2023-10-19 Thread Shaoqin Huang
r you effort to update it so 
actively. No issues being found by simply testing and several daily use.


Reviewed-by: Shaoqin Huang 

Thanks,
Shaoqin

--
Shaoqin




Re: [PATCH V5 4/9] hw/acpi: Init GED framework with CPU hotplug events

2023-10-15 Thread Shaoqin Huang




On 10/12/23 03:43, Salil Mehta via wrote:

ACPI GED(as described in the ACPI 6.2 spec) can be used to generate ACPI events
when OSPM/guest receives an interrupt listed in the _CRS object of GED. OSPM
then maps or demultiplexes the event by evaluating _EVT method.

This change adds the support of CPU hotplug event initialization in the
existing GED framework.

Co-developed-by: Keqian Zhu 
Signed-off-by: Keqian Zhu 
Signed-off-by: Salil Mehta 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 
Reviewed-by: David Hildenbrand 
Tested-by: Vishnu Pajjuri 

Reviewed-by: Shaoqin Huang 

---
  hw/acpi/generic_event_device.c | 8 
  include/hw/acpi/generic_event_device.h | 5 +
  2 files changed, 13 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index a3d31631fe..d2fa1d0e4a 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -25,6 +25,7 @@ static const uint32_t ged_supported_events[] = {
  ACPI_GED_MEM_HOTPLUG_EVT,
  ACPI_GED_PWR_DOWN_EVT,
  ACPI_GED_NVDIMM_HOTPLUG_EVT,
+ACPI_GED_CPU_HOTPLUG_EVT,
  };
  
  /*

@@ -400,6 +401,13 @@ static void acpi_ged_initfn(Object *obj)
  memory_region_init_io(&ged_st->regs, obj, &ged_regs_ops, ged_st,
TYPE_ACPI_GED "-regs", ACPI_GED_REG_COUNT);
  sysbus_init_mmio(sbd, &ged_st->regs);
+
+s->cpuhp.device = OBJECT(s);
+memory_region_init(&s->container_cpuhp, OBJECT(dev), "cpuhp container",
+   ACPI_CPU_HOTPLUG_REG_LEN);
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container_cpuhp);
+cpu_hotplug_hw_init(&s->container_cpuhp, OBJECT(dev),
+&s->cpuhp_state, 0);
  }
  
  static void acpi_ged_class_init(ObjectClass *class, void *data)

diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index d831bbd889..d0a5a43abf 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -60,6 +60,7 @@
  #define HW_ACPI_GENERIC_EVENT_DEVICE_H
  
  #include "hw/sysbus.h"

+#include "hw/acpi/cpu_hotplug.h"
  #include "hw/acpi/memory_hotplug.h"
  #include "hw/acpi/ghes.h"
  #include "qom/object.h"
@@ -97,6 +98,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
  #define ACPI_GED_MEM_HOTPLUG_EVT   0x1
  #define ACPI_GED_PWR_DOWN_EVT  0x2
  #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
+#define ACPI_GED_CPU_HOTPLUG_EVT0x8
  
  typedef struct GEDState {

  MemoryRegion evt;
@@ -108,6 +110,9 @@ struct AcpiGedState {
  SysBusDevice parent_obj;
  MemHotplugState memhp_state;
  MemoryRegion container_memhp;
+CPUHotplugState cpuhp_state;
+MemoryRegion container_cpuhp;
+AcpiCpuHotplug cpuhp;
  GEDState ged_state;
  uint32_t ged_event_bitmap;
  qemu_irq irq;


--
Shaoqin




Re: [PATCH V5 1/9] accel/kvm: Extract common KVM vCPU {creation, parking} code

2023-10-15 Thread Shaoqin Huang




On 10/12/23 03:43, Salil Mehta via wrote:

KVM vCPU creation is done once during the initialization of the VM when Qemu
thread is spawned. This is common to all the architectures.

Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
corresponding KVM vCPU object in the Host KVM is not destroyed and its
representative KVM vCPU object/context in Qemu is parked.

Refactor common logic so that some APIs could be reused by vCPU Hotplug code.

Signed-off-by: Salil Mehta 
Reviewed-by: Gavin Shan 
Tested-by: Vishnu Pajjuri 

Reviewed-by: Shaoqin Huang 
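
For readers following along, the parking idea described in the commit message
can be sketched in isolation as below. This is a simplified stand-in and not
the QEMU implementation: it uses a plain singly linked list instead of QLIST,
and the names park_vcpu and get_parked_vcpu are placeholders:

```c
#include <errno.h>
#include <stdlib.h>

/* One parked vCPU: its id and the KVM fd that is kept alive for reuse. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

static struct parked_vcpu *parked_list;

/* Remember the fd of an unplugged vCPU. Returns 0, or -ENOMEM on failure. */
static int park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = calloc(1, sizeof(*p));

    if (!p) {
        return -ENOMEM;
    }
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked_list;
    parked_list = p;
    return 0;
}

/* Take a parked fd back out of the list, or return -ENOENT if not parked. */
static int get_parked_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -ENOENT;
}
```

On hotplug, a caller would first try get_parked_vcpu() and only create a new
vCPU when it returns a negative value, which mirrors the flow of
kvm_create_vcpu() in the diff below.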

---
  accel/kvm/kvm-all.c| 64 --
  accel/kvm/trace-events |  4 +++
  include/sysemu/kvm.h   | 16 +++
  3 files changed, 69 insertions(+), 15 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index ff1578bb32..0dcaa15276 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -137,6 +137,7 @@ static QemuMutex kml_slots_lock;
  #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
  
  static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);

+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
  
  static inline void kvm_resample_fd_remove(int gsi)

  {
@@ -320,14 +321,53 @@ err:
  return ret;
  }
  
+void kvm_park_vcpu(CPUState *cpu)

+{
+struct KVMParkedVcpu *vcpu;
+
+trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+vcpu = g_malloc0(sizeof(*vcpu));
+vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+vcpu->kvm_fd = cpu->kvm_fd;
+QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+}
+
+int kvm_create_vcpu(CPUState *cpu)
+{
+unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
+KVMState *s = kvm_state;
+int kvm_fd;
+
+trace_kvm_create_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+/* check if the KVM vCPU already exist but is parked */
+kvm_fd = kvm_get_vcpu(s, vcpu_id);
+if (kvm_fd < 0) {
+/* vCPU not parked: create a new KVM vCPU */
+kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
+if (kvm_fd < 0) {
+error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id);
+return kvm_fd;
+}
+}
+
+cpu->kvm_fd = kvm_fd;
+cpu->kvm_state = s;
+cpu->vcpu_dirty = true;
+cpu->dirty_pages = 0;
+cpu->throttle_us_per_full = 0;
+
+return 0;
+}
+
  static int do_kvm_destroy_vcpu(CPUState *cpu)
  {
  KVMState *s = kvm_state;
  long mmap_size;
-struct KVMParkedVcpu *vcpu = NULL;
  int ret = 0;
  
-DPRINTF("kvm_destroy_vcpu\n");

+trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
  
  ret = kvm_arch_destroy_vcpu(cpu);

  if (ret < 0) {
@@ -353,10 +393,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
  }
  }
  
-vcpu = g_malloc0(sizeof(*vcpu));

-vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
-vcpu->kvm_fd = cpu->kvm_fd;
-QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+kvm_park_vcpu(cpu);
  err:
  return ret;
  }
@@ -377,6 +414,8 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
  if (cpu->vcpu_id == vcpu_id) {
  int kvm_fd;
  
+trace_kvm_get_vcpu(vcpu_id);

+
  QLIST_REMOVE(cpu, node);
  kvm_fd = cpu->kvm_fd;
  g_free(cpu);
@@ -384,7 +423,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
  }
  }
  
-return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);

+return -ENOENT;
  }
  
  int kvm_init_vcpu(CPUState *cpu, Error **errp)

@@ -395,19 +434,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
  
  trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
  
-ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));

+ret = kvm_create_vcpu(cpu);
  if (ret < 0) {
-error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed 
(%lu)",
+error_setg_errno(errp, -ret,
+ "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
   kvm_arch_vcpu_id(cpu));
  goto err;
  }
  
-cpu->kvm_fd = ret;

-cpu->kvm_state = s;
-cpu->vcpu_dirty = true;
-cpu->dirty_pages = 0;
-cpu->throttle_us_per_full = 0;
-
  mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
  if (mmap_size < 0) {
  ret = mmap_size;
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index 399aaeb0ec..cdd0c95c09 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 
0x%x, arg %p"
  kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" 
PRIu64 " from KVM: %s"
  kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" 
PRIu64 " to KVM: %s
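
The park-then-reuse flow that this patch factors out can be sketched in plain C. This is only an illustration under stated assumptions: the names park_vcpu(), get_parked_vcpu() and create_vcpu() are hypothetical stand-ins for kvm_park_vcpu(), kvm_get_vcpu() and kvm_create_vcpu(), and a fake fd counter replaces the real KVM_CREATE_VCPU ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's KVMParkedVcpu list entry. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

static struct parked_vcpu *parked_head;
static int next_fake_fd = 100;  /* assumption: replaces the KVM_CREATE_VCPU ioctl */

/* Remember the fd of an unplugged vCPU so that it can be reused later. */
static int park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p;

    assert(fd >= 0);
    p = calloc(1, sizeof(*p));
    if (!p) {
        return -ENOMEM;
    }
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked_head;
    parked_head = p;
    return 0;
}

/* Take a parked fd off the list, or return -ENOENT if none matches. */
static int get_parked_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_head; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -ENOENT;
}

/* Reuse a parked fd when one exists; otherwise "create" a new vCPU fd. */
static int create_vcpu(unsigned long vcpu_id)
{
    int fd = get_parked_vcpu(vcpu_id);

    if (fd < 0) {
        fd = next_fake_fd++;
    }
    return fd;
}
```

The point of the split is that the lookup no longer creates anything as a side effect: it reports -ENOENT, and the caller decides whether to create a new vCPU.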

Re: [PATCH V4 07/10] hw/acpi: Update ACPI GED framework to support vCPU Hotplug

2023-10-10 Thread Shaoqin Huang




On 10/10/23 04:35, Salil Mehta via wrote:

ACPI GED shall be used to notify the guest kernel about any CPU hot-(un)plug
events. Therefore, the existing ACPI GED framework inside QEMU needs to be
enhanced to support CPU hotplug state and events.

Co-developed-by: Keqian Zhu 
Signed-off-by: Keqian Zhu 
Signed-off-by: Salil Mehta 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 

Reviewed-by: Shaoqin Huang 

---
  hw/acpi/generic_event_device.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 62d504d231..0d5f0140e5 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -12,6 +12,7 @@
  #include "qemu/osdep.h"
  #include "qapi/error.h"
  #include "hw/acpi/acpi.h"
+#include "hw/acpi/cpu.h"
  #include "hw/acpi/generic_event_device.h"
  #include "hw/irq.h"
  #include "hw/mem/pc-dimm.h"
@@ -239,6 +240,8 @@ static void acpi_ged_device_plug_cb(HotplugHandler 
*hotplug_dev,
  } else {
  acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev, errp);
  }
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_plug_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
  } else {
  error_setg(errp, "virt: device plug request for unsupported device"
 " type: %s", object_get_typename(OBJECT(dev)));
@@ -253,6 +256,8 @@ static void acpi_ged_unplug_request_cb(HotplugHandler 
*hotplug_dev,
  if ((object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) &&
     !(object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)))) {
  acpi_memory_unplug_request_cb(hotplug_dev, &s->memhp_state, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_unplug_request_cb(hotplug_dev, &s->cpuhp_state, dev, errp);
  } else {
  error_setg(errp, "acpi: device unplug request for unsupported device"
 " type: %s", object_get_typename(OBJECT(dev)));
@@ -266,6 +271,8 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
  
  if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {

  acpi_memory_unplug_cb(&s->memhp_state, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+acpi_cpu_unplug_cb(&s->cpuhp_state, dev, errp);
  } else {
  error_setg(errp, "acpi: device unplug for unsupported device"
 " type: %s", object_get_typename(OBJECT(dev)));
@@ -277,6 +284,7 @@ static void acpi_ged_ospm_status(AcpiDeviceIf *adev, 
ACPIOSTInfoList ***list)
  AcpiGedState *s = ACPI_GED(adev);
  
  acpi_memory_ospm_status(&s->memhp_state, list);

+acpi_cpu_ospm_status(&s->cpuhp_state, list);
  }
  
  static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)

@@ -291,6 +299,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, 
AcpiEventStatusBits ev)
  sel = ACPI_GED_PWR_DOWN_EVT;
  } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
  sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
+} else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
+sel = ACPI_GED_CPU_HOTPLUG_EVT;
  } else {
  /* Unknown event. Return without generating interrupt. */
  warn_report("GED: Unsupported event %d. No irq injected", ev);


--
Shaoqin
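
For readers following the GED changes, here is a minimal sketch of how an event status flag could be turned into a GED selector bit and accumulated in a pending bitmap. The selector values are the ones from the generic_event_device.h hunk; the EV_* flags, their values, and the function names are hypothetical and do not match QEMU's AcpiEventStatusBits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Selector bits as defined in the generic_event_device.h hunk. */
#define GED_MEM_HOTPLUG_EVT    0x1
#define GED_PWR_DOWN_EVT       0x2
#define GED_NVDIMM_HOTPLUG_EVT 0x4
#define GED_CPU_HOTPLUG_EVT    0x8

/* Hypothetical event status flags; the values are illustrative only. */
enum ev_status {
    EV_MEM_HOTPLUG    = 1 << 0,
    EV_PWR_DOWN       = 1 << 1,
    EV_NVDIMM_HOTPLUG = 1 << 2,
    EV_CPU_HOTPLUG    = 1 << 3,
};

/* Map one event status flag to a selector bit; 0 means "unsupported". */
static uint32_t ged_select(unsigned int ev)
{
    if (ev & EV_MEM_HOTPLUG) {
        return GED_MEM_HOTPLUG_EVT;
    } else if (ev & EV_PWR_DOWN) {
        return GED_PWR_DOWN_EVT;
    } else if (ev & EV_NVDIMM_HOTPLUG) {
        return GED_NVDIMM_HOTPLUG_EVT;
    } else if (ev & EV_CPU_HOTPLUG) {
        return GED_CPU_HOTPLUG_EVT;
    }
    return 0;
}

/*
 * Record the event in the pending bitmap. A device model would raise
 * its interrupt after this step; unknown events are dropped without
 * touching the bitmap.
 */
static int ged_post_event(uint32_t *bitmap, unsigned int ev)
{
    uint32_t sel = ged_select(ev);

    assert(bitmap != NULL);
    if (!sel) {
        return -1;
    }
    *bitmap |= sel;
    return 0;
}
```

Because each event owns one bit, several pending events can coexist in the bitmap until the guest reads and clears them.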




Re: [PATCH V4 03/10] hw/acpi: Add ACPI CPU hotplug init stub

2023-10-10 Thread Shaoqin Huang




On 10/10/23 04:35, Salil Mehta via wrote:

ACPI CPU hotplug related initialization should only happen if ACPI_CPU_HOTPLUG
support has been enabled for a particular architecture. Add a
cpu_hotplug_hw_init() stub to avoid a compilation break.

Signed-off-by: Salil Mehta 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 

Reviewed-by: Shaoqin Huang 

---
  hw/acpi/acpi-cpu-hotplug-stub.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/hw/acpi/acpi-cpu-hotplug-stub.c b/hw/acpi/acpi-cpu-hotplug-stub.c
index 3fc4b14c26..c6c61bb9cd 100644
--- a/hw/acpi/acpi-cpu-hotplug-stub.c
+++ b/hw/acpi/acpi-cpu-hotplug-stub.c
@@ -19,6 +19,12 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, 
Object *owner,
  return;
  }
  
+void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,

+ CPUHotplugState *state, hwaddr base_addr)
+{
+return;
+}
+
  void acpi_cpu_ospm_status(CPUHotplugState *cpu_st, ACPIOSTInfoList ***list)
  {
  return;


--
Shaoqin




Re: [PATCH V4 02/10] hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file

2023-10-10 Thread Shaoqin Huang




On 10/10/23 04:35, Salil Mehta via wrote:

CPU ctrl-dev MMIO region length could be used in ACPI GED and various other
architecture specific places. Move ACPI_CPU_HOTPLUG_REG_LEN macro to more
appropriate common header file.

Signed-off-by: Salil Mehta 
Reviewed-by: Alex Bennée 
Reviewed-by: Jonathan Cameron 
Reviewed-by: Gavin Shan 
Reviewed-by: David Hildenbrand 

Reviewed-by: Shaoqin Huang 

---
  hw/acpi/cpu.c | 2 +-
  include/hw/acpi/cpu_hotplug.h | 2 ++
  2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 19c154d78f..45defdc0e2 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -1,12 +1,12 @@
  #include "qemu/osdep.h"
  #include "migration/vmstate.h"
  #include "hw/acpi/cpu.h"
+#include "hw/acpi/cpu_hotplug.h"
  #include "qapi/error.h"
  #include "qapi/qapi-events-acpi.h"
  #include "trace.h"
  #include "sysemu/numa.h"
  
-#define ACPI_CPU_HOTPLUG_REG_LEN 12

  #define ACPI_CPU_SELECTOR_OFFSET_WR 0
  #define ACPI_CPU_FLAGS_OFFSET_RW 4
  #define ACPI_CPU_CMD_OFFSET_WR 5
diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
index 3b932a..48b291e45e 100644
--- a/include/hw/acpi/cpu_hotplug.h
+++ b/include/hw/acpi/cpu_hotplug.h
@@ -19,6 +19,8 @@
  #include "hw/hotplug.h"
  #include "hw/acpi/cpu.h"
  
+#define ACPI_CPU_HOTPLUG_REG_LEN 12

+
  typedef struct AcpiCpuHotplug {
  Object *device;
  MemoryRegion io;


--
Shaoqin




Re: [PATCH RFC V2 03/37] hw/arm/virt: Move setting of common CPU properties in a function

2023-10-10 Thread Shaoqin Huang




On 9/26/23 18:04, Salil Mehta via wrote:

Factor out CPU properties code common for {hot,cold}-plugged CPUs. This allows
code reuse.

Signed-off-by: Salil Mehta 
---
  hw/arm/virt.c | 220 ++
  include/hw/arm/virt.h |   4 +
  2 files changed, 140 insertions(+), 84 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 57fe97c242..0eb6bf5a18 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2018,16 +2018,130 @@ static void virt_cpu_post_init(VirtMachineState *vms, 
MemoryRegion *sysmem)
  }
  }
  
+static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,

+Error **errp)
+{


Hi Salil,

This patch seems to break the code: virt_cpu_set_properties() is defined 
but not used in this patch, so the original code in machvirt_init() no 
longer works.


We should call this function from machvirt_init().


+MachineState *ms = MACHINE(qdev_get_machine());
+VirtMachineState *vms = VIRT_MACHINE(ms);
+Error *local_err = NULL;
+VirtMachineClass *vmc;
+
+vmc = VIRT_MACHINE_GET_CLASS(ms);
+
+/* now, set the cpu object property values */
+numa_cpu_pre_plug(cpu_slot, DEVICE(cpuobj), &local_err);
+if (local_err) {
+goto out;
+}
+
+object_property_set_int(cpuobj, "mp-affinity", cpu_slot->arch_id, NULL);
+
+if (!vms->secure) {
+object_property_set_bool(cpuobj, "has_el3", false, NULL);
+}
+
+if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
+object_property_set_bool(cpuobj, "has_el2", false, NULL);
+}
+
+if (vmc->kvm_no_adjvtime &&
+object_property_find(cpuobj, "kvm-no-adjvtime")) {
+object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
+}
+
+if (vmc->no_kvm_steal_time &&
+object_property_find(cpuobj, "kvm-steal-time")) {
+object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
+}
+
+if (vmc->no_pmu && object_property_find(cpuobj, "pmu")) {
+object_property_set_bool(cpuobj, "pmu", false, NULL);
+}
+
+if (vmc->no_tcg_lpa2 && object_property_find(cpuobj, "lpa2")) {
+object_property_set_bool(cpuobj, "lpa2", false, NULL);
+}
+
+if (object_property_find(cpuobj, "reset-cbar")) {
+object_property_set_int(cpuobj, "reset-cbar",
+vms->memmap[VIRT_CPUPERIPHS].base,
+&local_err);
+if (local_err) {
+goto out;
+}
+}
+
+/* link already initialized {secure,tag}-memory regions to this cpu */
+object_property_set_link(cpuobj, "memory", OBJECT(vms->sysmem), &local_err);
+if (local_err) {
+goto out;
+}
+
+if (vms->secure) {
+object_property_set_link(cpuobj, "secure-memory",
+ OBJECT(vms->secure_sysmem), &local_err);
+if (local_err) {
+goto out;
+}
+}
+
+if (vms->mte) {
+if (!object_property_find(cpuobj, "tag-memory")) {
+error_setg(&local_err, "MTE requested, but not supported "
+   "by the guest CPU");
+if (local_err) {
+goto out;
+}
+}
+
+object_property_set_link(cpuobj, "tag-memory", OBJECT(vms->tag_sysmem),
+ &local_err);
+if (local_err) {
+goto out;
+}
+
+if (vms->secure) {
+object_property_set_link(cpuobj, "secure-tag-memory",
+ OBJECT(vms->secure_tag_sysmem),
+ &local_err);
+if (local_err) {
+goto out;
+}
+}
+}
+
+/*
+ * RFC: Question: this must only be called for the hotplugged cpus. For the
+ * cold booted secondary cpus this is being taken care in arm_load_kernel()
+ * in boot.c. Perhaps we should remove that code now?
+ */
+if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
+object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
+NULL);
+
+/* Secondary CPUs start in PSCI powered-down state */
+if (CPU(cpuobj)->cpu_index > 0) {
+object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
+}
+}


Besides, if this patch only factors out the code, we could move the 
psci_conduit check to a later patch and keep this patch clean.


Thanks,
Shaoqin


+
+out:
+if (local_err) {
+error_propagate(errp, local_err);
+}
+return;
+}
+
  static void machvirt_init(MachineState *machine)
  {
  VirtMachineState *vms = VIRT_MACHINE(machine);
  VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(machine);
  MachineClass *mc = MACHINE_GET_CLASS(machine);
  const CPUArchIdList *possible_cpus;
-MemoryRegion *sysmem = get_system_memory();
+MemoryRegion *secure_tag_sysmem = NULL;
  MemoryRegion *secure_sysmem 
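
The "local error, goto out, propagate once" shape used by virt_cpu_set_properties() can be shown in isolation. This is a minimal sketch with a hypothetical Err type and helper names; QEMU's real Error API (error_setg(), error_propagate()) is richer and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal error object; QEMU's Error carries more state. */
typedef struct {
    const char *msg;
} Err;

static Err err_storage;  /* one static slot keeps the sketch allocation-free */

static void err_set(Err **errp, const char *msg)
{
    if (errp) {
        err_storage.msg = msg;
        *errp = &err_storage;
    }
}

/* Stand-in for a property setter that can fail. */
static void set_link(int have_target, Err **errp)
{
    if (!have_target) {
        err_set(errp, "link target missing");
    }
}

/*
 * Each step reports into local_err; the first failure jumps to the
 * single exit, where the error is handed to the caller exactly once.
 * 'steps' counts how many setters completed, for illustration.
 */
static void set_properties(int have_memory, int have_tag_memory,
                           int *steps, Err **errp)
{
    Err *local_err = NULL;

    assert(steps != NULL);
    *steps = 0;

    set_link(have_memory, &local_err);
    if (local_err) {
        goto out;
    }
    (*steps)++;

    set_link(have_tag_memory, &local_err);
    if (local_err) {
        goto out;
    }
    (*steps)++;

out:
    if (local_err && errp) {
        *errp = local_err;
    }
}
```

Keeping a single exit label means later steps never run after a failure, and the caller sees only the first error.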

Re: [PATCH v1 0/5] target/arm: Handle psci calls in userspace

2023-06-26 Thread Shaoqin Huang

Hi Salil,

On 6/26/23 21:42, Salil Mehta wrote:

From: Shaoqin Huang 
Sent: Monday, June 26, 2023 7:49 AM
To: qemu-devel@nongnu.org; qemu-...@nongnu.org
Cc: oliver.up...@linux.dev; Salil Mehta ;
james.mo...@arm.com; gs...@redhat.com; Shaoqin Huang ;
Cornelia Huck ; k...@vger.kernel.org; Michael S. Tsirkin
; Paolo Bonzini ; Peter Maydell

Subject: [PATCH v1 0/5] target/arm: Handle psci calls in userspace

The userspace SMCCC call filtering[1] provides the ability to forward SMCCC
calls to userspace. vCPU hotplug[2] would be the first legitimate use case for
handling PSCI calls in userspace, so that vCPU hotplug can deny the PSCI CPU_ON
call if the vCPU is not currently present.

This series tries to enable userspace SMCCC call filtering so that SMCCC calls
can be handled in userspace. The first SMCCC calls enabled are the PSCI calls:
the newly added 'user-smccc' option enables handling PSCI calls in userspace.

qemu-system-aarch64 -machine virt,user-smccc=on

This series reuses QEMU's existing PSCI handling, so the handling process is
very simple. However, when PSCI is handled in userspace under KVM, the vCPU
reset process needs care; the details are in patch 05.


This change is intended for VCPU Hotplug, and it duplicates code we are
already working on. Unless this change is also intended for some other
feature, I would request that you defer it.


Thanks for sharing the information. I don't intend to get this series 
merged; I posted it to discuss vCPU Hotplug, since I'm also following 
that work.


Just curious, what is your plan for posting a new version of VCPU Hotplug 
based on the userspace SMCCC filtering?


Thanks,
Shaoqin




Thanks
Salil



--
Shaoqin




[PATCH v1 5/5] arm/kvm: add support for userspace psci calls handling

2023-06-26 Thread Shaoqin Huang
Use the SMCCC filter to start forwarding PSCI calls to userspace, where QEMU
needs to handle them. In QEMU, reuse the PSCI handler that is used for TCG.
When doing so, we need to take care of the vCPU reset process, which resets
the vCPU registers and grabs all vCPU locks when resetting the GICv3.

So when resetting a vCPU, we need to mark it as dirty to force it to sync its
registers to KVM, and when resetting the GICv3, we need to pause all vCPUs so
that all vCPU locks can be grabbed. With that, the vCPU boots successfully
when the PSCI CPU_ON call is handled.

Signed-off-by: Shaoqin Huang 
---
 hw/intc/arm_gicv3_kvm.c | 10 +
 target/arm/kvm.c| 94 -
 2 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 72ad916d3d..e42898c1d6 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -24,6 +24,7 @@
 #include "hw/intc/arm_gicv3_common.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
+#include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
 #include "kvm_arm.h"
@@ -695,10 +696,19 @@ static void arm_gicv3_icc_reset(CPUARMState *env, const 
ARMCPRegInfo *ri)
 return;
 }
 
+/*
+ * When handling psci call in userspace like cpu hotplug, this shall be 
called
+ * when other vcpus might be running. Host kernel KVM to handle device
+ * access of IOCTLs KVM_{GET|SET}_DEVICE_ATTR might fail due to inability 
to
+ * grab vcpu locks for all the vcpus. Hence, we need to pause all vcpus to
+ * facilitate locking within host.
+ */
+pause_all_vcpus();
 /* Initialize to actual HW supported configuration */
 kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
   KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
   &c->icc_ctlr_el1[GICV3_NS], false, &error_abort);
+resume_all_vcpus();
 
 c->icc_ctlr_el1[GICV3_S] = c->icc_ctlr_el1[GICV3_NS];
 }
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 579c6edd49..d2857a8499 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -10,6 +10,7 @@
 
 #include "qemu/osdep.h"
 #include 
+#include 
 #include 
 #include 
 
@@ -251,7 +252,29 @@ int kvm_arm_get_max_vm_ipa_size(MachineState *ms, bool 
*fixed_ipa)
 
 static int kvm_arm_init_smccc_filter(KVMState *s)
 {
+unsigned int i;
 int ret = 0;
+struct kvm_smccc_filter filter_ranges[] = {
+{
+.base   = KVM_PSCI_FN_BASE,
+.nr_functions   = 4,
+.action = KVM_SMCCC_FILTER_DENY,
+},
+{
+.base   = PSCI_0_2_FN_BASE,
+.nr_functions   = 0x20,
+.action = KVM_SMCCC_FILTER_FWD_TO_USER,
+},
+{
+.base   = PSCI_0_2_FN64_BASE,
+.nr_functions   = 0x20,
+.action = KVM_SMCCC_FILTER_FWD_TO_USER,
+},
+};
+struct kvm_device_attr attr = {
+.group = KVM_ARM_VM_SMCCC_CTRL,
+.attr = KVM_ARM_VM_SMCCC_FILTER,
+};
 
 if (kvm_vm_check_attr(s, KVM_ARM_VM_SMCCC_CTRL, KVM_ARM_VM_SMCCC_FILTER)) {
 error_report("ARM SMCCC filter not supported");
@@ -259,6 +282,16 @@ static int kvm_arm_init_smccc_filter(KVMState *s)
 goto out;
 }
 
+for (i = 0; i < ARRAY_SIZE(filter_ranges); i++) {
+attr.addr = (uint64_t)&filter_ranges[i];
+
+ret = kvm_vm_ioctl(s, KVM_SET_DEVICE_ATTR, &attr);
+if (ret < 0) {
+error_report("KVM_SET_DEVICE_ATTR failed when SMCCC init");
+goto out;
+}
+}
+
 out:
 return ret;
 }
@@ -654,6 +687,14 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
  * for the same reason we do so in kvm_arch_get_registers().
  */
 write_list_to_cpustate(cpu);
+
+/*
+ * When enabled userspace psci call handling, qemu will reset the vcpu if
+ * it's PSCI CPU_ON call. Since this will reset the vcpu register and
+ * power_state, we should sync these state to kvm, so manually set the
+ * vcpu_dirty to force the qemu to put register to kvm.
+ */
+CPU(cpu)->vcpu_dirty = true;
 }
 
 /*
@@ -932,6 +973,51 @@ static int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t 
esr_iss,
 return -1;
 }
 
+static int kvm_arm_handle_psci(CPUState *cs, struct kvm_run *run)
+{
+if (run->hypercall.flags & KVM_HYPERCALL_EXIT_SMC) {
+cs->exception_index = EXCP_SMC;
+} else {
+cs->exception_index = EXCP_HVC;
+}
+
+qemu_mutex_lock_iothread();
+arm_cpu_do_interrupt(cs);
+qemu_mutex_unlock_iothread();
+
+/*
+ * We need to exit the run loop to have the chance to execute the
+ * qemu_wait_io_event() which will execute the psci function which queued 
in
+ * the cpu work queue.
+ */
+return EXCP_INTERRUPT;
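
To make the effect of the three filter ranges in kvm_arm_init_smccc_filter() concrete, here is a small sketch that looks up which action applies to a given function ID. It only models the range matching; the action names and filter_lookup() are hypothetical, the base values are the ones from the Linux UAPI headers as far as I know (please check them against your tree), and it assumes that unfiltered IDs keep the default in-kernel handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the three actions used by the patch. */
enum filter_action { ACT_HANDLE, ACT_DENY, ACT_FWD_TO_USER };

struct filter_range {
    uint32_t base;
    uint32_t nr_functions;
    enum filter_action action;
};

/* The same three ranges as in the patch. */
static const struct filter_range ranges[] = {
    { 0x95c1ba5eu, 4,    ACT_DENY },         /* KVM_PSCI_FN_BASE (PSCI v0.1) */
    { 0x84000000u, 0x20, ACT_FWD_TO_USER },  /* PSCI_0_2_FN_BASE */
    { 0xc4000000u, 0x20, ACT_FWD_TO_USER },  /* PSCI_0_2_FN64_BASE */
};

/* Return the action for func_id; IDs outside every range are handled in-kernel. */
static enum filter_action filter_lookup(uint32_t func_id)
{
    size_t i;

    for (i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
        assert(ranges[i].nr_functions > 0);
        /* Unsigned subtraction wraps, so IDs below base fail the compare. */
        if (func_id - ranges[i].base < ranges[i].nr_functions) {
            return ranges[i].action;
        }
    }
    return ACT_HANDLE;
}
```

With these ranges, the 64-bit PSCI CPU_ON call (0xc4000003) is forwarded to userspace, the legacy PSCI v0.1 IDs are denied, and other SMCCC calls are left untouched.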

[PATCH v1 3/5] target/arm: make psci call can be used by kvm

2023-06-26 Thread Shaoqin Huang
Currently the PSCI call handler can only be used when tcg_enabled(). We want
to reuse it when kvm_enabled(); a subsequent patch that enables PSCI handling
in userspace will rely on this.

Signed-off-by: Shaoqin Huang 
---
 target/arm/helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index d4bee43bd0..58063a92a6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11020,7 +11020,8 @@ void arm_cpu_do_interrupt(CPUState *cs)
   env->exception.syndrome);
 }
 
-if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) {
+if ((tcg_enabled() || kvm_enabled()) &&
+ arm_is_psci_call(cpu, cs->exception_index)) {
 arm_handle_psci_call(cpu);
 qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n");
 return;
-- 
2.39.1




[PATCH v1 2/5] linux-headers: Import arm-smccc.h from Linux v6.4-rc7

2023-06-26 Thread Shaoqin Huang
Copy in the SMCCC definitions from the kernel, which will be used to
implement SMCCC handling in userspace.

Signed-off-by: Shaoqin Huang 
---
 linux-headers/linux/arm-smccc.h | 240 
 1 file changed, 240 insertions(+)
 create mode 100644 linux-headers/linux/arm-smccc.h

diff --git a/linux-headers/linux/arm-smccc.h b/linux-headers/linux/arm-smccc.h
new file mode 100644
index 00..3663c31ba5
--- /dev/null
+++ b/linux-headers/linux/arm-smccc.h
@@ -0,0 +1,240 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2015, Linaro Limited
+ */
+#ifndef __LINUX_ARM_SMCCC_H
+#define __LINUX_ARM_SMCCC_H
+
+#include 
+
+/*
+ * This file provides common defines for ARM SMC Calling Convention as
+ * specified in
+ * https://developer.arm.com/docs/den0028/latest
+ *
+ * This code is up-to-date with version DEN 0028 C
+ */
+
+#define ARM_SMCCC_STD_CALL _AC(0,U)
+#define ARM_SMCCC_FAST_CALL _AC(1,U)
+#define ARM_SMCCC_TYPE_SHIFT   31
+
+#define ARM_SMCCC_SMC_32   0
+#define ARM_SMCCC_SMC_64   1
+#define ARM_SMCCC_CALL_CONV_SHIFT  30
+
+#define ARM_SMCCC_OWNER_MASK   0x3F
+#define ARM_SMCCC_OWNER_SHIFT  24
+
+#define ARM_SMCCC_FUNC_MASK0x
+
+#define ARM_SMCCC_IS_FAST_CALL(smc_val)\
+   ((smc_val) & (ARM_SMCCC_FAST_CALL << ARM_SMCCC_TYPE_SHIFT))
+#define ARM_SMCCC_IS_64(smc_val) \
+   ((smc_val) & (ARM_SMCCC_SMC_64 << ARM_SMCCC_CALL_CONV_SHIFT))
+#define ARM_SMCCC_FUNC_NUM(smc_val)((smc_val) & ARM_SMCCC_FUNC_MASK)
+#define ARM_SMCCC_OWNER_NUM(smc_val) \
+   (((smc_val) >> ARM_SMCCC_OWNER_SHIFT) & ARM_SMCCC_OWNER_MASK)
+
+#define ARM_SMCCC_CALL_VAL(type, calling_convention, owner, func_num) \
+   (((type) << ARM_SMCCC_TYPE_SHIFT) | \
+   ((calling_convention) << ARM_SMCCC_CALL_CONV_SHIFT) | \
+   (((owner) & ARM_SMCCC_OWNER_MASK) << ARM_SMCCC_OWNER_SHIFT) | \
+   ((func_num) & ARM_SMCCC_FUNC_MASK))
+
+#define ARM_SMCCC_OWNER_ARCH   0
+#define ARM_SMCCC_OWNER_CPU 1
+#define ARM_SMCCC_OWNER_SIP 2
+#define ARM_SMCCC_OWNER_OEM 3
+#define ARM_SMCCC_OWNER_STANDARD   4
+#define ARM_SMCCC_OWNER_STANDARD_HYP   5
+#define ARM_SMCCC_OWNER_VENDOR_HYP 6
+#define ARM_SMCCC_OWNER_TRUSTED_APP 48
+#define ARM_SMCCC_OWNER_TRUSTED_APP_END 49
+#define ARM_SMCCC_OWNER_TRUSTED_OS 50
+#define ARM_SMCCC_OWNER_TRUSTED_OS_END 63
+
+#define ARM_SMCCC_FUNC_QUERY_CALL_UID  0xff01
+
+#define ARM_SMCCC_QUIRK_NONE   0
+#define ARM_SMCCC_QUIRK_QCOM_A6 1 /* Save/restore register a6 */
+
+#define ARM_SMCCC_VERSION_1_0  0x1
+#define ARM_SMCCC_VERSION_1_1  0x10001
+#define ARM_SMCCC_VERSION_1_2  0x10002
+#define ARM_SMCCC_VERSION_1_3  0x10003
+
+#define ARM_SMCCC_1_3_SVE_HINT 0x1
+
+#define ARM_SMCCC_VERSION_FUNC_ID  \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  0, 0)
+
+#define ARM_SMCCC_ARCH_FEATURES_FUNC_ID
\
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  0, 1)
+
+#define ARM_SMCCC_ARCH_SOC_ID  \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  0, 2)
+
+#define ARM_SMCCC_ARCH_WORKAROUND_1\
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  0, 0x8000)
+
+#define ARM_SMCCC_ARCH_WORKAROUND_2\
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  0, 0x7fff)
+
+#define ARM_SMCCC_ARCH_WORKAROUND_3\
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  0, 0x3fff)
+
+#define ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID  \
+   ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+  ARM_SMCCC_SMC_32,\
+  ARM_SMCCC_OWNER_VENDOR_HYP,  \
+  ARM_SMCCC_FUNC_QUERY_CALL_UID)
+
+/* KVM UID value: 28b46fb6-2ec5-11e9-a9ca-4b564d003a74 */
+#define ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0 0xb66fb428U
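
As a reading aid for the ARM_SMCCC_CALL_VAL() family of macros, here is a short sketch that builds and decodes a function ID with the same bit layout. The function names are hypothetical, and the 16-bit function mask (0xFFFF) is taken from the SMCCC function ID layout rather than from the text of this mail.

```c
#include <assert.h>
#include <stdint.h>

#define TYPE_SHIFT      31      /* 1 = fast call, 0 = standard (yielding) call */
#define CALL_CONV_SHIFT 30      /* 1 = SMC64, 0 = SMC32 */
#define OWNER_SHIFT     24
#define OWNER_MASK      0x3Fu
#define FUNC_MASK       0xFFFFu

/* Compose a function ID from its four fields. */
static uint32_t smccc_call_val(uint32_t type, uint32_t conv,
                               uint32_t owner, uint32_t func)
{
    assert(type <= 1 && conv <= 1);
    return (type << TYPE_SHIFT) |
           (conv << CALL_CONV_SHIFT) |
           ((owner & OWNER_MASK) << OWNER_SHIFT) |
           (func & FUNC_MASK);
}

/* Field accessors, mirroring the *_NUM and IS_* macros. */
static uint32_t smccc_owner(uint32_t id)
{
    return (id >> OWNER_SHIFT) & OWNER_MASK;
}

static uint32_t smccc_func(uint32_t id)
{
    return id & FUNC_MASK;
}

static int smccc_is_fast(uint32_t id)
{
    return (id >> TYPE_SHIFT) & 1u;
}

static int smccc_is_64(uint32_t id)
{
    return (id >> CALL_CONV_SHIFT) & 1u;
}
```

For example, a fast SMC64 call owned by the standard service range (owner 4) with function number 3 encodes to 0xC4000003, which is the 64-bit PSCI CPU_ON function ID.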

[PATCH v1 1/5] linux-headers: Update to v6.4-rc7

2023-06-26 Thread Shaoqin Huang
Update to commit 45a3e24f65e9 ("Linux 6.4-rc7").

Signed-off-by: Shaoqin Huang 
---
 include/standard-headers/linux/const.h|  2 +-
 include/standard-headers/linux/virtio_blk.h   | 18 +++
 .../standard-headers/linux/virtio_config.h|  6 +++
 include/standard-headers/linux/virtio_net.h   |  1 +
 linux-headers/asm-arm64/kvm.h | 33 
 linux-headers/asm-riscv/kvm.h | 53 ++-
 linux-headers/asm-riscv/unistd.h  |  9 
 linux-headers/asm-s390/unistd_32.h|  1 +
 linux-headers/asm-s390/unistd_64.h|  1 +
 linux-headers/asm-x86/kvm.h   |  3 ++
 linux-headers/linux/const.h   |  2 +-
 linux-headers/linux/kvm.h | 12 +++--
 linux-headers/linux/psp-sev.h |  7 +++
 linux-headers/linux/userfaultfd.h | 17 +-
 14 files changed, 149 insertions(+), 16 deletions(-)

diff --git a/include/standard-headers/linux/const.h 
b/include/standard-headers/linux/const.h
index 5e48987251..1eb84b5087 100644
--- a/include/standard-headers/linux/const.h
+++ b/include/standard-headers/linux/const.h
@@ -28,7 +28,7 @@
 #define _BITUL(x)  (_UL(1) << (x))
 #define _BITULL(x) (_ULL(1) << (x))
 
-#define __ALIGN_KERNEL(x, a)   __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 
1)
+#define __ALIGN_KERNEL(x, a)   __ALIGN_KERNEL_MASK(x, 
(__typeof__(x))(a) - 1)
 #define __ALIGN_KERNEL_MASK(x, mask)   (((x) + (mask)) & ~(mask))
 
 #define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
diff --git a/include/standard-headers/linux/virtio_blk.h 
b/include/standard-headers/linux/virtio_blk.h
index 7155b1a470..d7be3cf5e4 100644
--- a/include/standard-headers/linux/virtio_blk.h
+++ b/include/standard-headers/linux/virtio_blk.h
@@ -138,11 +138,11 @@ struct virtio_blk_config {
 
/* Zoned block device characteristics (if VIRTIO_BLK_F_ZONED) */
struct virtio_blk_zoned_characteristics {
-   uint32_t zone_sectors;
-   uint32_t max_open_zones;
-   uint32_t max_active_zones;
-   uint32_t max_append_sectors;
-   uint32_t write_granularity;
+   __virtio32 zone_sectors;
+   __virtio32 max_open_zones;
+   __virtio32 max_active_zones;
+   __virtio32 max_append_sectors;
+   __virtio32 write_granularity;
uint8_t model;
uint8_t unused2[3];
} zoned;
@@ -239,11 +239,11 @@ struct virtio_blk_outhdr {
  */
 struct virtio_blk_zone_descriptor {
/* Zone capacity */
-   uint64_t z_cap;
+   __virtio64 z_cap;
/* The starting sector of the zone */
-   uint64_t z_start;
+   __virtio64 z_start;
/* Zone write pointer position in sectors */
-   uint64_t z_wp;
+   __virtio64 z_wp;
/* Zone type */
uint8_t z_type;
/* Zone state */
@@ -252,7 +252,7 @@ struct virtio_blk_zone_descriptor {
 };
 
 struct virtio_blk_zone_report {
-   uint64_t nr_zones;
+   __virtio64 nr_zones;
uint8_t reserved[56];
struct virtio_blk_zone_descriptor zones[];
 };
diff --git a/include/standard-headers/linux/virtio_config.h 
b/include/standard-headers/linux/virtio_config.h
index 965ee6ae23..8a7d0dc8b0 100644
--- a/include/standard-headers/linux/virtio_config.h
+++ b/include/standard-headers/linux/virtio_config.h
@@ -97,6 +97,12 @@
  */
 #define VIRTIO_F_SR_IOV 37
 
+/*
+ * This feature indicates that the driver passes extra data (besides
+ * identifying the virtqueue) in its device notifications.
+ */
+#define VIRTIO_F_NOTIFICATION_DATA 38
+
 /*
  * This feature indicates that the driver can reset a queue individually.
  */
diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
index c0e797067a..2325485f2c 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -61,6 +61,7 @@
 #define VIRTIO_NET_F_GUEST_USO6  55  /* Guest can handle USOv6 in. */
 #define VIRTIO_NET_F_HOST_USO  56  /* Host can handle USO in. */
 #define VIRTIO_NET_F_HASH_REPORT  57   /* Supports hash report */
+#define VIRTIO_NET_F_GUEST_HDRLEN  59  /* Guest provides the exact hdr_len value. */
 #define VIRTIO_NET_F_RSS  60  /* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT  61  /* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY  62  /* Act as standby for another device
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index d7e7bb885e..38e5957526 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -198,6 +198,15 @@ struct kvm_arm_copy_mte_tags {
 	__u64 reserved[2];
 };
 
+/*
+ * Counter/Timer offset structure. Describe the virtual/physical offset.
+ * To be used with KVM_ARM_SET_COUNTER_OFFSET.
+ */
+str

[PATCH v1 4/5] arm/kvm: add skeleton implementation for userspace SMCCC call handling

2023-06-26 Thread Shaoqin Huang
SMCCC call filtering provides the ability to forward SMCCC calls to
userspace, so add a new option `user-smccc` to enable handling SMCCC
calls in userspace. The default value is off.

Also add the skeleton implementation for userspace SMCCC call
initialization and handling.
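
For readers unfamiliar with the dispatch shape, here is a minimal
standalone sketch of the idea, not the code from this patch. It assumes
the SMCCC function ID layout (owner in bits 29:24, owner 4 for standard
services such as PSCI). All demo_* names and the DEMO_* macros are
hypothetical; the patch itself uses ARM_SMCCC_OWNER_NUM() and
struct arm_smccc_res from the imported header.

```c
#include <assert.h>
#include <stdint.h>

/* Owner field of an SMCCC function ID: bits 29:24. */
#define DEMO_SMCCC_OWNER_SHIFT     24
#define DEMO_SMCCC_OWNER_MASK      UINT32_C(0x3f)
#define DEMO_SMCCC_OWNER_STANDARD  4      /* standard services, e.g. PSCI */
#define DEMO_SMCCC_NOT_SUPPORTED   (-1)

static_assert(DEMO_SMCCC_OWNER_STANDARD <= DEMO_SMCCC_OWNER_MASK,
              "owner number must fit in the 6-bit owner field");

/* Hypothetical result holder standing in for struct arm_smccc_res. */
struct demo_smccc_res {
    int64_t a0;
};

/* Extract the owning entity number from a function ID. */
static inline uint32_t demo_smccc_owner(uint32_t fn)
{
    return (fn >> DEMO_SMCCC_OWNER_SHIFT) & DEMO_SMCCC_OWNER_MASK;
}

/* Hypothetical standard-service handler; it simply reports success. */
static int64_t demo_handle_standard(uint32_t fn)
{
    (void)fn;
    return 0;
}

/* Dispatch by owner; unknown owners keep the NOT_SUPPORTED default. */
static struct demo_smccc_res demo_handle_hypercall(uint32_t fn)
{
    struct demo_smccc_res res = { .a0 = DEMO_SMCCC_NOT_SUPPORTED };

    switch (demo_smccc_owner(fn)) {
    case DEMO_SMCCC_OWNER_STANDARD:
        res.a0 = demo_handle_standard(fn);
        break;
    default:
        break;
    }
    return res;
}
```

Starting from a NOT_SUPPORTED default means any call that no handler
claims is answered safely, which is the same shape the skeleton in this
patch has before any owner case is added.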

Signed-off-by: Shaoqin Huang 
---
 docs/system/arm/virt.rst |  4 +++
 hw/arm/virt.c| 21 
 include/hw/arm/virt.h|  1 +
 target/arm/kvm.c | 54 
 4 files changed, 80 insertions(+)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index 1cab33f02e..ff43d52f04 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -155,6 +155,10 @@ dtb-randomness
   DTB to be non-deterministic. It would be the responsibility of
   the firmware to come up with a seed and pass it on if it wants to.
 
+user-smccc
+  Set ``on``/``off`` to enable/disable handling smccc call in userspace
+  instead of kernel.
+
 dtb-kaslr-seed
   A deprecated synonym for dtb-randomness.
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9b9f7d9c68..767720321c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -42,6 +42,7 @@
 #include "hw/vfio/vfio-amd-xgbe.h"
 #include "hw/display/ramfb.h"
 #include "net/net.h"
+#include "qom/object.h"
 #include "sysemu/device_tree.h"
 #include "sysemu/numa.h"
 #include "sysemu/runstate.h"
@@ -2511,6 +2512,19 @@ static void virt_set_oem_table_id(Object *obj, const char *value,
     strncpy(vms->oem_table_id, value, 8);
 }
 
+static bool virt_get_user_smccc(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->user_smccc;
+}
+
+static void virt_set_user_smccc(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->user_smccc = value;
+}
 
 bool virt_is_acpi_enabled(VirtMachineState *vms)
 {
@@ -3155,6 +3169,13 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
                                           "in ACPI table header."
                                           "The string may be up to 8 bytes in size");
 
+    object_class_property_add_bool(oc, "user-smccc",
+                                   virt_get_user_smccc,
+                                   virt_set_user_smccc);
+    object_class_property_set_description(oc, "user-smccc",
+                                          "Set on/off to enable/disable "
+                                          "handling smccc call in userspace");
+
 }
 
 static void virt_instance_init(Object *obj)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index e1ddbea96b..4f1bc12680 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -160,6 +160,7 @@ struct VirtMachineState {
     bool ras;
     bool mte;
     bool dtb_randomness;
+    bool user_smccc;
     OnOffAuto acpi;
     VirtGICType gic_version;
     VirtIOMMUType iommu;
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 84da49332c..579c6edd49 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -9,6 +9,8 @@
  */
 
 #include "qemu/osdep.h"
+#include 
+#include 
 #include 
 
 #include 
@@ -247,6 +249,20 @@ int kvm_arm_get_max_vm_ipa_size(MachineState *ms, bool *fixed_ipa)
     return ret > 0 ? ret : 40;
 }
 
+static int kvm_arm_init_smccc_filter(KVMState *s)
+{
+    int ret = 0;
+
+    if (kvm_vm_check_attr(s, KVM_ARM_VM_SMCCC_CTRL, KVM_ARM_VM_SMCCC_FILTER)) {
+        error_report("ARM SMCCC filter not supported");
+        ret = -EINVAL;
+        goto out;
+    }
+
+out:
+    return ret;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     int ret = 0;
@@ -282,6 +298,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 
     kvm_arm_init_debug(s);
 
+    if (ret == 0 && object_property_get_bool(OBJECT(ms), "user-smccc", NULL)) {
+        ret = kvm_arm_init_smccc_filter(s);
+    }
+
     return ret;
 }
 
@@ -912,6 +932,37 @@ static int kvm_arm_handle_dabt_nisv(CPUState *cs, uint64_t esr_iss,
     return -1;
 }
 
+static void kvm_arm_smccc_return_result(CPUState *cs, struct arm_smccc_res *res)
+{
+    ARMCPU *cpu = ARM_CPU(cs);
+    CPUARMState *env = &cpu->env;
+
+    env->xregs[0] = res->a0;
+    env->xregs[1] = res->a1;
+    env->xregs[2] = res->a2;
+    env->xregs[3] = res->a3;
+}
+
+static int kvm_arm_handle_hypercall(CPUState *cs, struct kvm_run *run)
+{
+    uint32_t fn = run->hypercall.nr;
+    struct arm_smccc_res res = {
+        .a0 = SMCCC_RET_NOT_SUPPORTED,
+    };
+    int ret = 0;
+
+    kvm_cpu_synchronize_state(cs);
+
+    switch (ARM_SMCCC_OWNER_NUM(fn)) {
+    default:
+        break;
+    }
+
+    kvm_arm_smccc_return_result(cs, &res);
+
+    return ret;
+}
+
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 {
     int ret = 0;
@@ -927,6 +978,9 

[PATCH v1 0/5] target/arm: Handle psci calls in userspace

2023-06-26 Thread Shaoqin Huang
The userspace SMCCC call filtering[1] provides the ability to forward SMCCC
calls to userspace. vCPU hotplug[2] would be the first legitimate use case
for handling PSCI calls in userspace: it lets vCPU hotplug deny the PSCI
CPU_ON call if the vCPU is not present.

This series enables userspace SMCCC call filtering so that SMCCC calls can be
handled in userspace. The first SMCCC calls enabled are PSCI calls: the newly
added 'user-smccc' option enables handling PSCI calls in userspace.

qemu-system-aarch64 -machine virt,user-smccc=on

This series reuses QEMU's existing PSCI handling implementation, so the
handling process is very simple. However, when PSCI is handled in userspace
under KVM, the vCPU reset process needs care; the details are in patch 5.
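
To illustrate the hotplug use case, here is a minimal standalone sketch of
a userspace policy that denies CPU_ON for a vCPU that is not present. It is
not code from this series. The function ID and return codes follow the PSCI
specification; demo_psci_policy(), the presence table, and treating the
target as a plain index rather than an MPIDR value are all assumptions made
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PSCI function ID and return codes, per the Arm PSCI specification. */
#define PSCI_0_2_FN64_CPU_ON     UINT32_C(0xC4000003)
#define PSCI_RET_SUCCESS         0
#define PSCI_RET_NOT_SUPPORTED   (-1)
#define PSCI_RET_INVALID_PARAMS  (-2)
#define PSCI_RET_DENIED          (-3)

/*
 * Hypothetical policy: pick the PSCI return value for a call, given a
 * table recording which vCPU slots are currently present (plugged).
 * target_cpu is a plain slot index here, not a real MPIDR value.
 */
static int32_t demo_psci_policy(uint32_t fn, uint64_t target_cpu,
                                const bool *present, size_t nr_cpus)
{
    assert(present != NULL);

    if (fn != PSCI_0_2_FN64_CPU_ON) {
        return PSCI_RET_NOT_SUPPORTED;   /* only CPU_ON is modelled */
    }
    if (target_cpu >= nr_cpus) {
        return PSCI_RET_INVALID_PARAMS;  /* no such vCPU slot */
    }
    if (!present[target_cpu]) {
        return PSCI_RET_DENIED;          /* slot exists but is unplugged */
    }
    return PSCI_RET_SUCCESS;             /* caller would power on the vCPU */
}
```

With a presence table of { true, false, true }, a CPU_ON request for slot 1
is answered with PSCI_RET_DENIED, while slot 2 is allowed to proceed.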

[1] lore.kernel.org/20230404154050.2270077-1-oliver.up...@linux.dev
[2] lore.kernel.org/20230203135043.409192-1-james.mo...@arm.com

Shaoqin Huang (5):
  linux-headers: Update to v6.4-rc7
  linux-headers: Import arm-smccc.h from Linux v6.4-rc7
  target/arm: make psci call can be used by kvm
  arm/kvm: add skeleton implementation for userspace SMCCC call handling
  arm/kvm: add support for userspace psci calls handling

 docs/system/arm/virt.rst  |   4 +
 hw/arm/virt.c |  21 ++
 hw/intc/arm_gicv3_kvm.c   |  10 +
 include/hw/arm/virt.h |   1 +
 include/standard-headers/linux/const.h|   2 +-
 include/standard-headers/linux/virtio_blk.h   |  18 +-
 .../standard-headers/linux/virtio_config.h|   6 +
 include/standard-headers/linux/virtio_net.h   |   1 +
 linux-headers/asm-arm64/kvm.h |  33 +++
 linux-headers/asm-riscv/kvm.h |  53 +++-
 linux-headers/asm-riscv/unistd.h  |   9 +
 linux-headers/asm-s390/unistd_32.h|   1 +
 linux-headers/asm-s390/unistd_64.h|   1 +
 linux-headers/asm-x86/kvm.h   |   3 +
 linux-headers/linux/arm-smccc.h   | 240 ++
 linux-headers/linux/const.h   |   2 +-
 linux-headers/linux/kvm.h |  12 +-
 linux-headers/linux/psp-sev.h |   7 +
 linux-headers/linux/userfaultfd.h |  17 +-
 target/arm/helper.c   |   3 +-
 target/arm/kvm.c  | 146 +++
 21 files changed, 573 insertions(+), 17 deletions(-)
 create mode 100644 linux-headers/linux/arm-smccc.h

base-commit: e3660cc1e3cb136af50c0eaaeac27943c2438d1d
-- 
2.39.1




[PATCH v2] hw: Fix format for comments

2023-06-19 Thread Shaoqin Huang
Simply fix the #vcpus_count to @vcpus_count in CPUArchId comments. While
at it, reorder the parameters in the comments to match the order in
which they are defined in CPUArchId.

Reviewed-by: Igor Mammedov 
Signed-off-by: Shaoqin Huang 
---
 include/hw/boards.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index a385010909..e0497c2314 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -101,10 +101,10 @@ MemoryRegion *machine_consume_memdev(MachineState *machine,
 /**
  * CPUArchId:
  * @arch_id - architecture-dependent CPU ID of present or possible CPU
+ * @vcpus_count - number of threads provided by @cpu object
+ * @props - CPU object properties, initialized by board
  * @cpu - pointer to corresponding CPU object if it's present on NULL otherwise
  * @type - QOM class name of possible @cpu object
- * @props - CPU object properties, initialized by board
- * #vcpus_count - number of threads provided by @cpu object
  */
 typedef struct CPUArchId {
     uint64_t arch_id;
-- 
2.39.1




Re: [PATCH] machine: do not crash if default RAM backend name has been stollen

2023-05-23 Thread Shaoqin Huang

With the patch, QEMU exits normally instead of aborting.

On 5/22/23 21:17, Igor Mammedov wrote:

QEMU aborts when the default RAM backend should be used (i.e. no
explicit '-machine memory-backend=' is specified) but the user
has created an object whose 'id' equals the default RAM backend
name used by the board.

  $QEMU -machine pc \
-object memory-backend-ram,id=pc.ram,size=4294967296

  Actual results:
  QEMU 7.2.0 monitor - type 'help' for more information
  (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
  qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
  Aborted (core dumped)

Instead of abort, check for the conflicting 'id' and exit with
an error, suggesting how to remedy the issue.

Signed-off-by: Igor Mammedov 
CC: th...@redhat.com

Reviewed-by: Shaoqin Huang 

---
  hw/core/machine.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 07f763eb2e..1000406211 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1338,6 +1338,14 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
         }
     } else if (machine_class->default_ram_id && machine->ram_size &&
                numa_uses_legacy_mem()) {
+        if (object_property_find(object_get_objects_root(),
+                                 machine_class->default_ram_id)) {
+            error_setg(errp, "object name '%s' is reserved for the default"
+                " RAM backend, it can't be used for any other purposes."
+                " Change the object's 'id' to something else",
+                machine_class->default_ram_id);
+            return;
+        }
         if (!create_default_memdev(current_machine, mem_path, errp)) {
             return;
         }


--
Shaoqin




[PATCH] hw: Fix format for comments

2023-05-15 Thread Shaoqin Huang
Simply fix the #vcpus_count to @vcpus_count in CPUArchId comments. While
we are here, reorder the parameters in the comments to match the order
in which they are defined in CPUArchId.

CC: Igor Mammedov 
Signed-off-by: Shaoqin Huang 
---
 include/hw/boards.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index f4117fdb9a..cefa3d5897 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -101,10 +101,10 @@ MemoryRegion *machine_consume_memdev(MachineState *machine,
 /**
  * CPUArchId:
  * @arch_id - architecture-dependent CPU ID of present or possible CPU
+ * @vcpus_count - number of threads provided by @cpu object
+ * @props - CPU object properties, initialized by board
  * @cpu - pointer to corresponding CPU object if it's present on NULL otherwise
  * @type - QOM class name of possible @cpu object
- * @props - CPU object properties, initialized by board
- * #vcpus_count - number of threads provided by @cpu object
  */
 typedef struct CPUArchId {
     uint64_t arch_id;
-- 
2.39.1




[PATCH] hw: Fix format for comments

2023-05-05 Thread Shaoqin Huang
Simply fix the #vcpus_count to @vcpus_count in CPUArchId comments. While
we are here, reorder the parameters in the comments to match the order
in which they are defined in CPUArchId.

Signed-off-by: Shaoqin Huang 
---
 include/hw/boards.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index f4117fdb9a..cefa3d5897 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -101,10 +101,10 @@ MemoryRegion *machine_consume_memdev(MachineState *machine,
 /**
  * CPUArchId:
  * @arch_id - architecture-dependent CPU ID of present or possible CPU
+ * @vcpus_count - number of threads provided by @cpu object
+ * @props - CPU object properties, initialized by board
  * @cpu - pointer to corresponding CPU object if it's present on NULL otherwise
  * @type - QOM class name of possible @cpu object
- * @props - CPU object properties, initialized by board
- * #vcpus_count - number of threads provided by @cpu object
  */
 typedef struct CPUArchId {
     uint64_t arch_id;
-- 
2.39.1