Re: [PATCH 15/19] qapi/parser: demote QAPIExpression to Dict[str, Any]

2024-01-09 Thread Markus Armbruster
John Snow  writes:

> On Thu, Nov 23, 2023 at 9:12 AM Markus Armbruster  wrote:
>>
>> John Snow  writes:
>>
>> > Dict[str, object] is a stricter type, but with the way that code is
>> > currently arranged, it is infeasible to enforce this strictness.
>> >
>> > In particular, although expr.py's entire raison d'être is normalization
>> > and type-checking of QAPI Expressions, that type information is not
>> > "remembered" in any meaningful way by mypy because each individual
>> > expression is not downcast to a specific expression type that holds all
>> > the details of each expression's unique form.
>> >
>> > As a result, all of the code in schema.py that deals with actually
>> > creating type-safe specialized structures has no guarantee (myopically)
>> > that the data it is being passed is correct.
>> >
>> > There are two ways to solve this:
>> >
>> > (1) Re-assert that the incoming data is in the shape we expect it to be, or
>> > (2) Disable type checking for this data.
>> >
>> > (1) is appealing to my sense of strictness, but I gotta concede that it
>> > is asinine to re-check the shape of a QAPIExpression in schema.py when
>> > expr.py has just completed that work at length. The duplication of code
>> > and the nightmare thought of needing to update both locations if and
>> > when we change the shape of these structures makes me extremely
>> > reluctant to go down this route.
>> >
>> > (2) allows us the chance to miss updating types in the case that types
>> > are updated in expr.py, but it *is* an awful lot simpler and,
>> > importantly, gets us closer to type checking schema.py *at
>> > all*. Something is better than nothing, I'd argue.
>> >
>> > So, do the simpler dumber thing and worry about future strictness
>> > improvements later.
>>
>> Yes.
>
> (You were right, again.)

Occasionally happens ;)

>> While Dict[str, object] is stricter than Dict[str, Any], both are miles
>> away from the actual, recursive type.
>>
>> > Signed-off-by: John Snow 
>> > ---
>> >  scripts/qapi/parser.py | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>> > index bf31018aef0..b7f08cf36f2 100644
>> > --- a/scripts/qapi/parser.py
>> > +++ b/scripts/qapi/parser.py
>> > @@ -19,6 +19,7 @@
>> >  import re
>> >  from typing import (
>> >  TYPE_CHECKING,
>> > +Any,
>> >  Dict,
>> >  List,
>> >  Mapping,
>> > @@ -43,7 +44,7 @@
>> >  _ExprValue = Union[List[object], Dict[str, object], str, bool]
>> >
>> >
>> > -class QAPIExpression(Dict[str, object]):
>> > +class QAPIExpression(Dict[str, Any]):
>> >  # pylint: disable=too-few-public-methods
>> >  def __init__(self,
>> >   data: Mapping[str, object],
>>
>> There are several occurrences of Dict[str, object] elsewhere.  Would your
>> argument for dumbing down QAPIExpression apply to (some of) them, too?
>
> When and if they piss me off, sure. I'm just wary of making the types
> too permissive because it can obscure typing errors; by using Any, you
> really disable any further checks and might lead to false confidence
> in the static checker. I still have a weird grudge against Any and
> would like to fully eliminate it from any statically checked Python
> code, but it's just not always feasible and I have to admit that "good
> enough" is good enough. Doesn't have me running to lessen the
> strictness in areas that didn't cause me pain, though...
>
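To make the object-versus-Any trade-off concrete, here is a minimal sketch.
The aliases and function names are made up for illustration and are not
taken from scripts/qapi; the point is only how mypy treats the two value
types.

```python
from typing import Any, Dict

# Hypothetical stand-ins for QAPIExpression; not the real QAPI classes.
StrictExpr = Dict[str, object]
LooseExpr = Dict[str, Any]


def name_len_strict(expr: StrictExpr) -> int:
    name = expr['name']
    # 'name' is typed as object, so mypy insists on narrowing before use.
    if not isinstance(name, str):
        raise TypeError("'name' must be a string")
    return len(name)


def name_len_loose(expr: LooseExpr) -> int:
    # expr['name'] is Any: mypy accepts this without any check, and a
    # wrong shape only shows up at run time.
    return len(expr['name'])
```

With Dict[str, object] every consumer has to re-assert the shape; with
Dict[str, Any] the checker stays silent and trusts that the earlier
normalization did its job.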
>> Skimming them, I found this in introspect.py:
>>
>> # These types are based on structures defined in QEMU's schema, so we
>> # lack precise types for them here. Python 3.6 does not offer
>> # TypedDict constructs, so they are broadly typed here as simple
>> # Python Dicts.
>> SchemaInfo = Dict[str, object]
>> SchemaInfoEnumMember = Dict[str, object]
>> SchemaInfoObject = Dict[str, object]
>> SchemaInfoObjectVariant = Dict[str, object]
>> SchemaInfoObjectMember = Dict[str, object]
>> SchemaInfoCommand = Dict[str, object]
>>
>> Can we do better now we have 3.8?
>
> A little bit, but it involves reproducing these types -- which are
> ultimately meant to represent QAPI types defined in introspect.json --
> with "redundant" type info. i.e. I have to reproduce the existing type
> definitions in Python-ese, and then we have the maintenance burden of
> making sure they match.
>
> Maybe too much work to come up with a crazy dynamic definition thing
> where we take the QAPI definition and build Python types from them ...
> without some pretty interesting work to avoid the Ouroboros that'd
> result. introspection.py wants static types based on types defined
> dynamically by the schema definition; but we are not guaranteed to
> have a suitable schema with these types at all. I'm not sure how to
> express this kind of dependency without some interesting re-work. This
> is a rare circumstance of the QAPI generator relying on the contents
> of the Schema to provide static type assistance.
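For reference, a hand-written TypedDict version of two of these aliases
could look roughly like the sketch below. The member lists are assumptions
for illustration only; they would have to be checked against, and kept in
sync with, qapi/introspect.json by hand, which is exactly the maintenance
burden described above.

```python
from typing import List, TypedDict  # TypedDict needs Python >= 3.8


# Illustrative only: the member lists below are assumptions and would
# have to be kept in sync with qapi/introspect.json by hand.
class SchemaInfoEnumMember(TypedDict, total=False):
    name: str
    features: List[str]


# Keys such as 'meta-type' are not valid identifiers, so the functional
# syntax is needed for them.
SchemaInfo = TypedDict('SchemaInfo', {
    'name': str,
    'meta-type': str,
    'features': List[str],
}, total=False)


def make_enum_member(name: str) -> SchemaInfoEnumMember:
    # mypy now rejects unknown keys or wrongly typed values here.
    return {'name': name}
```

At run time these are still plain dicts, so nothing changes for the
generated output; only the static checker sees the extra structure.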

Re: [PATCH 13/19] qapi/schema: fix typing for QAPISchemaVariants.tag_member

2024-01-09 Thread Markus Armbruster
John Snow  writes:

> On Wed, Nov 22, 2023 at 11:02 AM John Snow  wrote:
>>
>> On Wed, Nov 22, 2023 at 9:05 AM Markus Armbruster  wrote:
>> >
>> > John Snow  writes:
>> >
>> > > There are two related changes here:
>> > >
>> > > (1) We need to perform type narrowing for resolving the type of
>> > > tag_member during check(), and
>> > >
>> > > (2) tag_member is a delayed initialization field, but we can hide it
>> > > behind a property that raises an Exception if it's called too
>> > > early. This simplifies the typing in quite a few places and avoids
>> > > needing to assert that the "tag_member is not None" at a dozen
>> > > callsites, which can be confusing and suggest the wrong thing to a
>> > > drive-by contributor.
>> > >
>> > > Signed-off-by: John Snow 
>> >
>> > Without looking closely: review of PATCH 10 applies, doesn't it?
>> >
>>
>> Yep!
>
> Hm, actually, maybe not quite as cleanly.
>
> The problem is we *are* initializing that field immediately with
> whatever we were passed in during __init__, which means the field is
> indeed Optional. Later, during check(), we happen to eliminate that
> usage of None.

You're right.

QAPISchemaVariants.__init__() takes @tag_name and @tag_member.  Exactly
one of them must be None.  When creating a union's QAPISchemaVariants,
it's tag_member, and when creating an alternate's, it's tag_name.

Why?

A union's tag is an ordinary member selected by name via
'discriminator': TAG_NAME.  We can't resolve the name at this time,
because it may be buried arbitrarily deep in the base type chain.

An alternate's tag is an implicitly created "member" of type 'QType'.
"Member" in scare-quotes, because it is special: it exists in C, but not on
the wire, and not in introspection.

Historical note: simple unions also had an implicitly created tag member,
and its type was the implicit enum type enumerating the branches.

So _def_union_type() passes TAG_NAME to .__init__(), and
_def_alternate_type() creates and passes the implicit tag member.
Hardly elegant, but it works.

> To remove the use of the @property trick here, we could:
>
> ... declare the field, then only initialize it if we were passed a
> non-None value. But then check() would need to rely on something like
> hasattr to check if it was set or not, which is maybe an unfortunate
> code smell.
> So I think you'd still wind up needing a ._tag_member field which is
> Optional and always gets set during __init__, then setting a proper
> .tag_member field during check().
>
> Or I could just leave this one as-is. Or something else. I think the
> dirt has to get swept somewhere, because we don't *always* have enough
> information to fully initialize it at __init__ time, it's a
> conditional delayed initialization, unlike the others which are
> unconditionally delayed.

Yes.

Here's a possible "something else":

1. Drop .__init__() parameter @tag_member, and leave
.tag_member unset there.

2. Set .tag_member in .check(): if .tag_name, look up that member (no
change).  Else, it's an alternate; create the alternate's implicit tag
member.

Drawback: before, we create AST in just one place, namely
QAPISchema._def_exprs().  Now we also create some in .check().

Here's another "something else":

1. Fuse .__init__() parameters @tag_member and @tag_name.  The type
becomes Union.  Store for .check().

2. Set .tag_member in .check(): if we stored a name, look up that
member, else we must have stored an implicit member, so use that.

3. We currently check "is this a union?" with "if self._tag_name".  That
needs adjustment.

Feels a bit awkward to me.

We can also do nothing, as you said.  We don't *have* to express
".check() resolves unresolved tag member" in the type system.  We can
just live with .tag_member remaining Optional.

Differently awkward, I guess.

Thoughts?
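For comparison, the @property approach from the patch can be sketched as
follows. This is a simplified model with hypothetical class names (Member,
Variants), not the actual QAPISchemaVariants code: the Optional value is
kept in a private field, and the public accessor has a non-Optional type.

```python
from typing import Dict, Optional


class Member:
    """Hypothetical stand-in for a schema member."""
    def __init__(self, name: str):
        self.name = name


class Variants:
    """Sketch of conditional delayed initialization; not QAPISchemaVariants."""
    def __init__(self, tag_name: Optional[str],
                 tag_member: Optional[Member]):
        # Exactly one of the two must be None.
        assert (tag_name is None) != (tag_member is None)
        self._tag_name = tag_name
        self._tag_member = tag_member

    @property
    def tag_member(self) -> Member:
        # Non-Optional return type: callers need no "is not None" asserts.
        if self._tag_member is None:
            raise RuntimeError("tag_member accessed before check()")
        return self._tag_member

    def check(self, members: Dict[str, Member]) -> None:
        if self._tag_name is not None:
            # Union: resolve the tag by name now that members are known.
            self._tag_member = members[self._tag_name]
```

For an alternate the property is usable right after construction; for a
union it raises until check() has resolved the name.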




Re: [PATCH v8 00/10] Introduce model for IBM's FSI

2024-01-09 Thread Cédric Le Goater

Hello Ninad,

Here are comments on the file organization and configs.

On 11/29/23 00:56, Ninad Palsule wrote:

Hello,

Please review the patch-set version 8.
I have incorporated review comments from Cedric.
   - Fixed checkpatch failures.
   - Fixed commit messages.
   - Fixed LBUS memory map size.

Ninad Palsule (10):
   hw/fsi: Introduce IBM's Local bus
   hw/fsi: Introduce IBM's FSI Bus
   hw/fsi: Introduce IBM's cfam,fsi-slave,scratchpad
   hw/fsi: IBM's On-chip Peripheral Bus
   hw/fsi: Introduce IBM's FSI master
   hw/fsi: Aspeed APB2OPB interface
   hw/arm: Hook up FSI module in AST2600
   hw/fsi: Added qtest
   hw/fsi: Added FSI documentation
   hw/fsi: Update MAINTAINER list

  MAINTAINERS |   8 +
  docs/specs/fsi.rst  | 138 ++
  docs/specs/index.rst|   1 +
  meson.build |   1 +
  hw/fsi/trace.h  |   1 +
  include/hw/arm/aspeed_soc.h |   4 +
  include/hw/fsi/aspeed-apb2opb.h |  34 


aspeed-apb2opb is a HW logic bridging the FSI world and Aspeed. It
doesn't belong to the FSI subsystem. Since we don't have a directory
for platform specific devices, I think the model should go under hw/misc/.



  include/hw/fsi/cfam.h   |  45 +


scratchpad is the only lbus device and it is quite generic, we could
move it to lbus files. It would be nice to implement more than one
reg.

 

  include/hw/fsi/fsi-master.h |  32 
  include/hw/fsi/fsi-slave.h  |  29 +++
  include/hw/fsi/fsi.h|  24 +++


I would move the definitions and implementation of the fsi bus and
the fsi slave under the fsi.h and fsi.c files



  include/hw/fsi/lbus.h   |  40 
  include/hw/fsi/opb.h|  25 +++


opb is quite minimal now and I think it could be hidden under
aspeed-apb2opb.


  hw/arm/aspeed_ast2600.c |  19 ++
  hw/fsi/aspeed-apb2opb.c | 316 
  hw/fsi/cfam.c   | 261 ++
  hw/fsi/fsi-master.c | 165 +
  hw/fsi/fsi-slave.c  |  78 
  hw/fsi/fsi.c|  22 +++
  hw/fsi/lbus.c   |  51 ++
  hw/fsi/opb.c|  36 
  tests/qtest/aspeed-fsi-test.c   | 205 +
  hw/Kconfig  |   1 +
  hw/arm/Kconfig  |   1 +
  hw/fsi/Kconfig  |  21 +++


One CONFIG_FSI option and one CONFIG_FSI_APB2OPB should be enough.
CONFIG_FSI_APB2OPB should select FSI and depend on CONFIG_ASPEED_SOC.


Thanks,

C.





  hw/fsi/meson.build  |   5 +
  hw/fsi/trace-events |  13 ++
  hw/meson.build  |   1 +
  tests/qtest/meson.build |   1 +
  29 files changed, 1578 insertions(+)
  create mode 100644 docs/specs/fsi.rst
  create mode 100644 hw/fsi/trace.h
  create mode 100644 include/hw/fsi/aspeed-apb2opb.h
  create mode 100644 include/hw/fsi/cfam.h
  create mode 100644 include/hw/fsi/fsi-master.h
  create mode 100644 include/hw/fsi/fsi-slave.h
  create mode 100644 include/hw/fsi/fsi.h
  create mode 100644 include/hw/fsi/lbus.h
  create mode 100644 include/hw/fsi/opb.h
  create mode 100644 hw/fsi/aspeed-apb2opb.c
  create mode 100644 hw/fsi/cfam.c
  create mode 100644 hw/fsi/fsi-master.c
  create mode 100644 hw/fsi/fsi-slave.c
  create mode 100644 hw/fsi/fsi.c
  create mode 100644 hw/fsi/lbus.c
  create mode 100644 hw/fsi/opb.c
  create mode 100644 tests/qtest/aspeed-fsi-test.c
  create mode 100644 hw/fsi/Kconfig
  create mode 100644 hw/fsi/meson.build
  create mode 100644 hw/fsi/trace-events






Re: [PATCH V1 2/3] migration: notifier error reporting

2024-01-09 Thread Peter Xu
On Wed, Dec 13, 2023 at 10:11:32AM -0800, Steve Sistare wrote:
> After calling notifiers, check if an error has been reported via
> migrate_set_error, and halt the migration.
> 
> None of the notifiers call migrate_set_error at this time, so no
> functional change.
> 
> Signed-off-by: Steve Sistare 
> ---
>  include/migration/misc.h |  2 +-
>  migration/migration.c| 26 ++
>  2 files changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 901d117..231d7e4 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -65,7 +65,7 @@ MigMode migrate_mode_of(MigrationState *);
>  void migration_add_notifier(Notifier *notify,
>  void (*func)(Notifier *notifier, void *data));
>  void migration_remove_notifier(Notifier *notify);
> -void migration_call_notifiers(MigrationState *s);
> +int migration_call_notifiers(MigrationState *s);
>  bool migration_in_setup(MigrationState *);
>  bool migration_has_finished(MigrationState *);
>  bool migration_has_failed(MigrationState *);
> diff --git a/migration/migration.c b/migration/migration.c
> index d5bfe70..29a9a92 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1280,6 +1280,8 @@ void migrate_set_state(int *state, int old_state, int new_state)
>  
>  static void migrate_fd_cleanup(MigrationState *s)
>  {
> +bool already_failed;
> +
>  qemu_bh_delete(s->cleanup_bh);
>  s->cleanup_bh = NULL;
>  
> @@ -1327,11 +1329,20 @@ static void migrate_fd_cleanup(MigrationState *s)
>MIGRATION_STATUS_CANCELLED);
>  }
>  
> +already_failed = migration_has_failed(s);
> +if (migration_call_notifiers(s)) {
> +if (!already_failed) {
> +migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +/* Notify again to recover from this late failure. */
> +migration_call_notifiers(s);
> +}
> +}
> +
>  if (s->error) {
>  /* It is used on info migrate.  We can't free it */
>  error_report_err(error_copy(s->error));
>  }
> -migration_call_notifiers(s);
> +
>  block_cleanup_parameters();
>  yank_unregister_instance(MIGRATION_YANK_INSTANCE);
>  }
> @@ -1450,9 +1461,10 @@ void migration_remove_notifier(Notifier *notify)
>  }
>  }
>  
> -void migration_call_notifiers(MigrationState *s)
> +int migration_call_notifiers(MigrationState *s)
>  {
>  notifier_list_notify(&migration_state_notifiers, s);
> +return (s->error != NULL);

Exporting more migration_*() functions is pretty ugly to me..

Would it be better to pass an "Error **errp" into each notifier?  That may
need an open-coded notifier_list_notify() that breaks the loop if "*errp"
is set.

And the notifier API currently only supports one arg..  maybe we should
implement the notifiers ourselves, ideally passing in "(int state, Error
**errp)" instead of "(MigrationState *s)".

Ideally, with that, MigrationState* shouldn't be visible outside migration/.
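The control flow suggested here can be modelled in a few lines. The sketch
below is a language-neutral model written in Python with hypothetical names
(MigNotifier, MigNotifierList); in the C code the returned message would be
an Error set through errp.

```python
from typing import Callable, List, Optional

# Hypothetical model: a notifier gets only the state and returns an
# error message (standing in for setting *errp) or None on success.
MigNotifier = Callable[[str], Optional[str]]


class MigNotifierList:
    def __init__(self) -> None:
        self._notifiers: List[MigNotifier] = []

    def add(self, func: MigNotifier) -> None:
        self._notifiers.append(func)

    def notify(self, state: str) -> Optional[str]:
        # Open-coded loop: stop at the first notifier that reports an error.
        for func in self._notifiers:
            err = func(state)
            if err is not None:
                return err
        return None
```

The caller then decides what to do with the error, instead of reading it
back from the migration state after the fact.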

Thanks,

>  }
>  
>  bool migration_in_setup(MigrationState *s)
> @@ -2520,7 +2532,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>   * spice needs to trigger a transition now
>   */
>  ms->postcopy_after_devices = true;
> -migration_call_notifiers(ms);
> +if (migration_call_notifiers(ms)) {
> +goto fail;
> +}
>  
>  migration_downtime_end(ms);
>  
> @@ -3589,7 +3603,11 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>  rate_limit = migrate_max_bandwidth();
>  
>  /* Notify before starting migration thread */
> -migration_call_notifiers(s);
> +if (migration_call_notifiers(s)) {
> +migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +migrate_fd_cleanup(s);
> +return;
> +}
>  }
>  
>  migration_rate_set(rate_limit);
> -- 
> 1.8.3.1
> 

-- 
Peter Xu




[PULL 1/2] util/fifo8: Allow fifo8_pop_buf() to not populate popped length

2024-01-09 Thread Mark Cave-Ayland
From: Philippe Mathieu-Daudé 

There might be cases where we know the number of bytes we can
pop from the FIFO, or we simply don't care how many bytes are
returned. Allow fifo8_pop_buf() to take a NULL numptr.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Francisco Iglesias 
Reviewed-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Tested-by: Alex Bennée 
Message-Id: <20231109192814.95977-2-phi...@linaro.org>
Signed-off-by: Mark Cave-Ayland 
---
 include/qemu/fifo8.h | 10 +-
 util/fifo8.c | 12 
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/include/qemu/fifo8.h b/include/qemu/fifo8.h
index 16be02f361..d0d02bc73d 100644
--- a/include/qemu/fifo8.h
+++ b/include/qemu/fifo8.h
@@ -71,7 +71,7 @@ uint8_t fifo8_pop(Fifo8 *fifo);
  * fifo8_pop_buf:
  * @fifo: FIFO to pop from
  * @max: maximum number of bytes to pop
- * @num: actual number of returned bytes
+ * @numptr: pointer filled with number of bytes returned (can be NULL)
  *
  * Pop a number of elements from the FIFO up to a maximum of max. The buffer
  * containing the popped data is returned. This buffer points directly into
@@ -82,16 +82,16 @@ uint8_t fifo8_pop(Fifo8 *fifo);
  * around in the ring buffer; in this case only a contiguous part of the data
  * is returned.
  *
- * The number of valid bytes returned is populated in *num; will always return
- * at least 1 byte. max must not be 0 or greater than the number of bytes in
- * the FIFO.
+ * The number of valid bytes returned is populated in *numptr; will always
+ * return at least 1 byte. max must not be 0 or greater than the number of
+ * bytes in the FIFO.
  *
  * Clients are responsible for checking the availability of requested data
  * using fifo8_num_used().
  *
  * Returns: A pointer to popped data.
  */
-const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *num);
+const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr);
 
 /**
  * fifo8_reset:
diff --git a/util/fifo8.c b/util/fifo8.c
index de8fd0f1c5..2eeed56e80 100644
--- a/util/fifo8.c
+++ b/util/fifo8.c
@@ -66,16 +66,20 @@ uint8_t fifo8_pop(Fifo8 *fifo)
 return ret;
 }
 
-const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *num)
+const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
 {
 uint8_t *ret;
+uint32_t num;
 
 assert(max > 0 && max <= fifo->num);
-*num = MIN(fifo->capacity - fifo->head, max);
+num = MIN(fifo->capacity - fifo->head, max);
 ret = &fifo->data[fifo->head];
-fifo->head += *num;
+fifo->head += num;
 fifo->head %= fifo->capacity;
-fifo->num -= *num;
+fifo->num -= num;
+if (numptr) {
+*numptr = num;
+}
 return ret;
 }
 
-- 
2.39.2




[PULL 2/2] util/fifo8: Introduce fifo8_peek_buf()

2024-01-09 Thread Mark Cave-Ayland
From: Philippe Mathieu-Daudé 

To be able to peek at FIFO content without popping it,
introduce the fifo8_peek_buf() method by factoring
common content from fifo8_pop_buf().

Reviewed-by: Francisco Iglesias 
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Tested-by: Alex Bennée 
Message-Id: <20231109192814.95977-3-phi...@linaro.org>
Signed-off-by: Mark Cave-Ayland 
---
 include/qemu/fifo8.h | 27 +++
 util/fifo8.c | 22 ++
 2 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/include/qemu/fifo8.h b/include/qemu/fifo8.h
index d0d02bc73d..c6295c6ff0 100644
--- a/include/qemu/fifo8.h
+++ b/include/qemu/fifo8.h
@@ -93,6 +93,33 @@ uint8_t fifo8_pop(Fifo8 *fifo);
  */
 const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr);
 
+/**
+ * fifo8_peek_buf: read upto max bytes from the fifo
+ * @fifo: FIFO to read from
+ * @max: maximum number of bytes to peek
+ * @numptr: pointer filled with number of bytes returned (can be NULL)
+ *
+ * Peek into a number of elements from the FIFO up to a maximum of max.
+ * The buffer containing the data peeked into is returned. This buffer points
+ * directly into the FIFO backing store. Since data is invalidated once any
+ * of the fifo8_* APIs are called on the FIFO, it is the caller responsibility
+ * to access it before doing further API calls.
+ *
+ * The function may return fewer bytes than requested when the data wraps
+ * around in the ring buffer; in this case only a contiguous part of the data
+ * is returned.
+ *
+ * The number of valid bytes returned is populated in *numptr; will always
+ * return at least 1 byte. max must not be 0 or greater than the number of
+ * bytes in the FIFO.
+ *
+ * Clients are responsible for checking the availability of requested data
+ * using fifo8_num_used().
+ *
+ * Returns: A pointer to peekable data.
+ */
+const uint8_t *fifo8_peek_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr);
+
 /**
  * fifo8_reset:
  * @fifo: FIFO to reset
diff --git a/util/fifo8.c b/util/fifo8.c
index 2eeed56e80..4e01b532d9 100644
--- a/util/fifo8.c
+++ b/util/fifo8.c
@@ -66,7 +66,8 @@ uint8_t fifo8_pop(Fifo8 *fifo)
 return ret;
 }
 
-const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
+static const uint8_t *fifo8_peekpop_buf(Fifo8 *fifo, uint32_t max,
+uint32_t *numptr, bool do_pop)
 {
 uint8_t *ret;
 uint32_t num;
@@ -74,15 +75,28 @@ const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
 assert(max > 0 && max <= fifo->num);
 num = MIN(fifo->capacity - fifo->head, max);
 ret = &fifo->data[fifo->head];
-fifo->head += num;
-fifo->head %= fifo->capacity;
-fifo->num -= num;
+
+if (do_pop) {
+fifo->head += num;
+fifo->head %= fifo->capacity;
+fifo->num -= num;
+}
 if (numptr) {
 *numptr = num;
 }
 return ret;
 }
 
+const uint8_t *fifo8_peek_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
+{
+return fifo8_peekpop_buf(fifo, max, numptr, false);
+}
+
+const uint8_t *fifo8_pop_buf(Fifo8 *fifo, uint32_t max, uint32_t *numptr)
+{
+return fifo8_peekpop_buf(fifo, max, numptr, true);
+}
+
 void fifo8_reset(Fifo8 *fifo)
 {
 fifo->num = 0;
-- 
2.39.2
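The peek/pop behaviour described in the header comment, in particular the
"only a contiguous part is returned when the data wraps" rule, can be tried
out with a small model. This is a toy re-implementation in Python with a
hypothetical class name (Fifo8Model), not QEMU code: it returns a copy of
the bytes instead of a pointer into the backing store, and the length of
the result plays the role of *numptr.

```python
class Fifo8Model:
    """Toy model of the ring-buffer logic in the patch; not QEMU code."""
    def __init__(self, capacity: int) -> None:
        self.data = bytearray(capacity)
        self.capacity = capacity
        self.head = 0
        self.num = 0

    def push(self, byte: int) -> None:
        assert self.num < self.capacity
        self.data[(self.head + self.num) % self.capacity] = byte
        self.num += 1

    def _peekpop_buf(self, max_bytes: int, do_pop: bool) -> bytes:
        assert 0 < max_bytes <= self.num
        # Only the contiguous part up to the end of the backing store.
        num = min(self.capacity - self.head, max_bytes)
        ret = bytes(self.data[self.head:self.head + num])
        if do_pop:
            self.head = (self.head + num) % self.capacity
            self.num -= num
        return ret

    def peek_buf(self, max_bytes: int) -> bytes:
        return self._peekpop_buf(max_bytes, do_pop=False)

    def pop_buf(self, max_bytes: int) -> bytes:
        return self._peekpop_buf(max_bytes, do_pop=True)
```

A caller that needs all the requested bytes across the wrap point has to
call the function a second time, as with the C API.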




[PULL 0/2] qemu-sparc queue 20240110

2024-01-09 Thread Mark Cave-Ayland
The following changes since commit 9468484fe904ab4691de6d9c34616667f377ceac:

  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into 
staging (2024-01-09 10:32:23 +)

are available in the Git repository at:

  https://github.com/mcayland/qemu.git tags/qemu-sparc-20240110

for you to fetch changes up to 995d8348eb3d8ddf24882ed384a5c50eaf3aeae9:

  util/fifo8: Introduce fifo8_peek_buf() (2024-01-10 06:58:50 +)


qemu-sparc queue
- introduce fifo8_peek_buf() function


Philippe Mathieu-Daudé (2):
  util/fifo8: Allow fifo8_pop_buf() to not populate popped length
  util/fifo8: Introduce fifo8_peek_buf()

 include/qemu/fifo8.h | 37 -
 util/fifo8.c | 28 +++-
 2 files changed, 55 insertions(+), 10 deletions(-)



Re: [PATCH V1 1/3] migration: check mode in notifiers

2024-01-09 Thread Peter Xu
On Wed, Dec 13, 2023 at 10:11:31AM -0800, Steve Sistare wrote:
> The existing notifiers should only apply to normal mode.
> 
> No functional change.

Instead of adding such a check in every notifier, why not make CPR a separate
list of notifiers?  Just like the blocker lists.

Aside from this patch: I just started to look at this "notifier" code, and I
really don't think we should pass MigrationState* into the notifiers.
IIUC we only need the "state" as an enum.  Then, with two separate
registrations, the device code knows the migration mode.

What do you think?
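A rough model of the per-mode registration idea is sketched below. All
names are hypothetical, and modes and states are plain strings here purely
for illustration; the real code would use the C enums.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical names throughout; mode and state are plain strings here.
Notifier = Callable[[str], None]
_notifiers: DefaultDict[str, List[Notifier]] = defaultdict(list)


def add_notifier(mode: str, func: Notifier) -> None:
    # Registering per mode means the callback needs no mode check.
    _notifiers[mode].append(func)


def call_notifiers(mode: str, state: str) -> None:
    # Only the list for the active mode is walked.
    for func in _notifiers[mode]:
        func(state)
```

The callback receives only the state, and the mode is implied by which list
it was registered on.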

-- 
Peter Xu




Re: [PATCH-for-8.2 v4 00/10] hw/char/pl011: Implement TX (async) FIFO to avoid blocking the main loop

2024-01-09 Thread Mark Cave-Ayland

On 05/01/2024 07:50, Mark Cave-Ayland wrote:


On 09/11/2023 19:28, Philippe Mathieu-Daudé wrote:


Missing review: #10

Hi,

This series adds support for an (async) FIFO on the transmit path
of the PL011 UART.

Since v3:
- Document migration bits (Alex, Richard)
- Just check FIFO is not empty in pl011_xmit_fifo_state_needed (rth)
- In pl011_xmit check TX enabled first, and ignore < 8-bit TX (rth)

Since v2:
- Added R-b tags
- Addressed Richard comments on migration

Since v1:
- Restrict pl011_ops[] impl access_size,
- Do not check transmitter is enabled (Peter),
- Addressed Alex's review comments,
- Simplified migration trying to care about backward compat,
   but still unsure...

Philippe Mathieu-Daudé (10):
   util/fifo8: Allow fifo8_pop_buf() to not populate popped length
   util/fifo8: Introduce fifo8_peek_buf()
   hw/char/pl011: Split RX/TX path of pl011_reset_fifo()
   hw/char/pl011: Extract pl011_write_txdata() from pl011_write()
   hw/char/pl011: Extract pl011_read_rxdata() from pl011_read()
   hw/char/pl011: Warn when using disabled transmitter
   hw/char/pl011: Check if receiver is enabled
   hw/char/pl011: Rename RX FIFO methods
   hw/char/pl011: Add transmit FIFO to PL011State
   hw/char/pl011: Implement TX FIFO

  include/hw/char/pl011.h |   2 +
  include/qemu/fifo8.h    |  37 ++-
  hw/char/pl011.c | 239 +---
  util/fifo8.c    |  28 -
  hw/char/trace-events    |   8 +-
  5 files changed, 263 insertions(+), 51 deletions(-)


Hi Phil,

Happy New Year! Are there plans to queue this series for 9.0 soon? I'm particularly 
interested in the first 2 patches as I've made use of the new fifo8_peek_buf() 
function as part of my latest ESP updates.


I've spoken to Phil, and as patches 1 and 2 implementing fifo8_peek_buf() have R-B 
tags he is happy for me to take them separately via my qemu-sparc branch. I'll send a 
PR with those patches shortly.



ATB,

Mark.




Re: hw: nvme: Separate 'serial' property for VFs

2024-01-09 Thread Klaus Jensen
On Jan  9 11:29, Minwoo Im wrote:
> Currently, when a VF is created, it uses the 'params' object of the PF
> as it is. In other words, the 'params.serial' string memory area is
> also shared. In this situation, if the VF is removed from the system,
> the PF's 'params.serial' object is released with object_finalize()
> followed by object_property_del_all() which release the memory for
> 'serial' property. If that happens, the next VF created will inherit
> a serial from a corrupted memory area.
> 
> If this happens, an error will occur when comparing subsys->serial and
> n->params.serial in the nvme_subsys_register_ctrl() function.
> 
> Cc: qemu-sta...@nongnu.org
> Fixes: 44c2c09488db ("hw/nvme: Add support for SR-IOV")
> Signed-off-by: Minwoo Im 

Thanks Minwoo! Queued on nvme-next.

Reviewed-by: Klaus Jensen 




Re: [RFC v2 0/7] Add persistence to NVMe ZNS emulation

2024-01-09 Thread Klaus Jensen
On Nov 27 16:56, Sam Li wrote:
> ZNS emulation follows NVMe ZNS spec but the state of namespace
> zones does not persist across restarts of QEMU. This patch makes the
> metadata of ZNS emulation persistent by using new block layer APIs and
> the qcow2 img as backing file. It is the second part after the patches
> - adding full zoned storage emulation to qcow2 driver.
> https://patchwork.kernel.org/project/qemu-devel/cover/20231127043703.49489-1-faithilike...@gmail.com/
> 
> The metadata of ZNS emulation divides into two parts, zone metadata and
> zone descriptor extension data. The zone metadata is composed of zone
> states, zone type, wp and zone attributes. The zone information can be
> stored in a uint64_t wp to save space and ease access. The structure of
> wp of each zone is as follows:
> |(4)| zone type (1)| zone attr (8)| wp (51) ||
> 
> The zone descriptor extension data is relatively small compared to the
> overall size, therefore we adopt the option of storing the zded of all
> zones in an array regardless of whether the valid bit is set.
> 
> Creating a zns format qcow2 image file adds one more option zd_extension_size
> to zoned device configurations.
> 
> To attach this file as emulated zns drive in the command line of QEMU, use:
>   -drive file=${znsimg},id=nvmezns0,format=qcow2,if=none \
>   -device nvme-ns,drive=nvmezns0,bus=nvme0,nsid=1,uuid=xxx \
> 
> Sorry, send this one more time due to network problems.
> 
> v1->v2:
> - split [v1 2/5] patch to three (doc, config, block layer API)
> - adapt qcow2 v6
> 
> Sam Li (7):
>   docs/qcow2: add zd_extension_size option to the zoned format feature
>   qcow2: add zd_extension configurations to zoned metadata
>   hw/nvme: use blk_get_*() to access zone info in the block layer
>   hw/nvme: add blk_get_zone_extension to access zd_extensions
>   hw/nvme: make the metadata of ZNS emulation persistent
>   hw/nvme: refactor zone append write using block layer APIs
>   hw/nvme: make ZDED persistent
> 
>  block/block-backend.c |   88 ++
>  block/qcow2.c |  119 ++-
>  block/qcow2.h |2 +
>  docs/interop/qcow2.txt|3 +
>  hw/nvme/ctrl.c| 1247 -
>  hw/nvme/ns.c  |  162 +---
>  hw/nvme/nvme.h|   95 +--
>  include/block/block-common.h  |9 +
>  include/block/block_int-common.h  |8 +
>  include/sysemu/block-backend-io.h |   11 +
>  include/sysemu/dma.h  |3 +
>  qapi/block-core.json  |4 +
>  system/dma-helpers.c  |   17 +
>  13 files changed, 647 insertions(+), 1121 deletions(-)
> 
> -- 
> 2.40.1
> 

Hi Sam,

This is awesome. For the hw/nvme parts,

Acked-by: Klaus Jensen 

I'll give it a proper R-b when you drop the RFC status.
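The 64-bit per-zone word from the cover letter (4 + 1 + 8 + 51 bits) can be
illustrated with a small packing sketch. The bit order (most significant
field first) and the reading of the unnamed 4-bit field as the zone state
are assumptions made for this example; the series itself is the reference
for the real layout.

```python
# Assumed layout, most significant bits first (the bit order and the
# meaning of the unnamed 4-bit field are guesses for illustration):
# | state (4) | zone type (1) | zone attr (8) | wp (51) |
WP_BITS, ATTR_BITS, TYPE_BITS, STATE_BITS = 51, 8, 1, 4
ATTR_SHIFT = WP_BITS
TYPE_SHIFT = ATTR_SHIFT + ATTR_BITS
STATE_SHIFT = TYPE_SHIFT + TYPE_BITS


def pack_zone(state: int, ztype: int, attr: int, wp: int) -> int:
    # Each field must fit into its bit budget.
    assert 0 <= state < (1 << STATE_BITS) and 0 <= ztype < (1 << TYPE_BITS)
    assert 0 <= attr < (1 << ATTR_BITS) and 0 <= wp < (1 << WP_BITS)
    return ((state << STATE_SHIFT) | (ztype << TYPE_SHIFT)
            | (attr << ATTR_SHIFT) | wp)


def unpack_zone(word: int):
    # Returns (state, zone type, zone attributes, write pointer).
    return (word >> STATE_SHIFT,
            (word >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1),
            (word >> ATTR_SHIFT) & ((1 << ATTR_BITS) - 1),
            word & ((1 << WP_BITS) - 1))
```

The four fields add up to exactly 64 bits, so one uint64_t per zone is
enough to hold them all.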




Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Peter Xu
On Wed, Jan 10, 2024 at 07:03:06AM +0100, Markus Armbruster wrote:
> Peter Xu  writes:
> 
> > On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote:
> >> Hi Fabiano,
> >> 
> >> On 9/1/24 21:21, Fabiano Rosas wrote:
> >> > Cédric Le Goater  writes:
> >> > 
> >> > > On 1/9/24 18:40, Fabiano Rosas wrote:
> >> > > > Cédric Le Goater  writes:
> >> > > > 
> >> > > > > On 1/3/24 20:53, Fabiano Rosas wrote:
> >> > > > > > Philippe Mathieu-Daudé  writes:
> >> > > > > > 
> >> > > > > > > +Peter/Fabiano
> >> > > > > > > 
> >> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote:
> >> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
> >> > > > > > > > > Hi Cédric,
> >> > > > > > > > > 
> >> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote:
> >> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
> >> > > > > > > > > > > Hi,
> >> > > > > > > > > > > 
> >> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores belong to the
> >> > > > > > > > > > > cluster container, not to the board/soc layer. This series moves
> >> > > > > > > > > > > the creation of vCPUs to the MPCore private container.
> >> > > > > > > > > > > 
> >> > > > > > > > > > > Doing so we consolidate the QOM model, moving common code in a
> >> > > > > > > > > > > central place (abstract MPCore parent).
> >> > > > > > > > > > 
> >> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of the machine
> >> > > > > > > > > > and some fixups are then required to maintain migration compatibility.
> >> > > > > > > > > > This can become a real headache for KVM machines like virt for which
> >> > > > > > > > > > migration compatibility is a feature, less for emulated ones.
> >> > > > > > > > > 
> >> > > > > > > > > All changes are either moving properties (which are not migrated)
> >> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
> >> > > > > > > > > is still migrated elsewhere). So I don't see any obvious migration
> >> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan :>
> >> > > > > > 
> >> > > > > > FWIW, I didn't spot anything problematic either.
> >> > > > > > 
> >> > > > > > I've run this through my migration compatibility series [1] and it
> >> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M
> >> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't
> >> > > > > > think we even support migration of anything non-KVM on arm.
> >> > > > > 
> >> > > > > it happens we do.
> >> > > > > 
> >> > > > 
> >> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something 
> >> > > > like
> >> > > > non-KVM-capable cpus, as in 32-bit. Nevermind.
> >> > > 
> >> > > Theoretically, we should be able to migrate to a TCG guest. Well, this
> >> > > worked in the past for PPC. When I was doing more KVM related changes,
> >> > > this was very useful for dev. Also, some machines are partially 
> >> > > emulated.
> >> > > Anyhow I agree this is not a strong requirement and we often break it.
> >> > > Let's focus on KVM only.
> >> > > 
> >> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
> >> > > > > 
> >> > > > > yes it depends on the QOM hierarchy and virt seems immune to the 
> >> > > > > changes.
> >> > > > > Good.
> >> > > > > 
> >> > > > > However, changing the QOM topology clearly breaks migration compat,
> >> > > > 
> >> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed
> >> > > > already, do you have a pointer to one of those cases were we broke
> >> > > > migration
> >> > > 
> >> > > Regarding pseries, migration compat broke because of 5bc8d26de20c
> >> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
> >> > > is similar to the changes proposed by this series, it impacts the QOM
> >> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
> >> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which
> >> > > is quite an headache and this turned out to raise another problem some
> >> > > months ago ... :/ That's why I sent [1] to prepare removal of old
> >> > > machines and workarounds becoming a burden.
> >> > 
> >> > This feels like something that could be handled by the vmstate code
> >> > somehow. The state is there, just under a different path.
> >> 
> >> What, the QOM path is used in migration? ...
> >
> > Hopefully not..
> >
> >> 
> >> See recent discussions on "QOM path stability":
> >> https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/
> >> https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/
> >> 

[PATCH] target/i386/sev: Add an option to allow SEV not to pin memory

2024-01-09 Thread Zheyun Shen
Currently, SEV pins the guest's memory to avoid swapping or
moving ciphertext, but this inhibits memory ballooning.

With memory ballooning, only the guest's free pages are relocated
during balloon inflation and deflation, so any resulting difference
in plaintext does not matter to the guest.

Memory ballooning is a useful memory overcommitment technique that
can be used in CVMs based on SEV and SEV-ES, so userspace tools can
provide an option to allow SEV not to pin memory.

Signed-off-by: Zheyun Shen 
---
 qapi/qom.json |  7 ++-
 qemu-options.hx   |  5 -
 target/i386/sev.c | 39 ++-
 3 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 95516ba..c23397c 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -882,6 +882,10 @@
 # @kernel-hashes: if true, add hashes of kernel/initrd/cmdline to a
 # designated guest firmware page for measured boot with -kernel
 # (default: false) (since 6.2)
+#
+# @pin-memory: if true, sev initialization will pin guest's
+# memory by registering to kvm, and disable ram block discard.
+# (default: true)
 #
 # Since: 2.12
 ##
@@ -893,7 +897,8 @@
 '*handle': 'uint32',
 '*cbitpos': 'uint32',
 'reduced-phys-bits': 'uint32',
-'*kernel-hashes': 'bool' } }
+'*kernel-hashes': 'bool',
+'*pin-memory': 'bool' } }
 
 ##
 # @ThreadContextProperties:
diff --git a/qemu-options.hx b/qemu-options.hx
index b66570a..1add214 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -5668,7 +5668,7 @@ SRST
  -object secret,id=sec0,keyid=secmaster0,format=base64,\\
  data=$SECRET,iv=$(pin_memory) {
+return;
+}
 int r;
 struct kvm_enc_region range;
 ram_addr_t offset;
@@ -256,6 +260,9 @@ static void
 sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
   size_t max_size)
 {
+if (!sev_guest->pin_memory) {
+return;
+}
 int r;
 struct kvm_enc_region range;
 ram_addr_t offset;
@@ -353,6 +360,20 @@ static void sev_guest_set_kernel_hashes(Object *obj, bool value, Error **errp)
 sev->kernel_hashes = value;
 }
 
+static bool sev_guest_get_pin_memory(Object *obj, Error **errp)
+{
+SevGuestState *sev = SEV_GUEST(obj);
+
+return sev->pin_memory;
+}
+
+static void sev_guest_set_pin_memory(Object *obj, bool value, Error **errp)
+{
+SevGuestState *sev = SEV_GUEST(obj);
+
+sev->pin_memory = value;
+}
+
 static void
 sev_guest_class_init(ObjectClass *oc, void *data)
 {
@@ -376,6 +397,11 @@ sev_guest_class_init(ObjectClass *oc, void *data)
sev_guest_set_kernel_hashes);
 object_class_property_set_description(oc, "kernel-hashes",
 "add kernel hashes to guest firmware for measured Linux boot");
+object_class_property_add_bool(oc, "pin-memory",
+   sev_guest_get_pin_memory,
+   sev_guest_set_pin_memory);
+object_class_property_set_description(oc, "pin-memory",
+"pin guest memory at initialization");
 }
 
 static void
@@ -383,6 +409,7 @@ sev_guest_instance_init(Object *obj)
 {
 SevGuestState *sev = SEV_GUEST(obj);
 
+sev->pin_memory = true;
 sev->sev_device = g_strdup(DEFAULT_SEV_DEVICE);
 sev->policy = DEFAULT_GUEST_POLICY;
 object_property_add_uint32_ptr(obj, "policy", &sev->policy,
@@ -920,11 +947,13 @@ int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 return 0;
 }
 
-ret = ram_block_discard_disable(true);
-if (ret) {
-error_report("%s: cannot disable RAM discard", __func__);
-return -1;
-}
+if (sev->pin_memory) {
+ret = ram_block_discard_disable(true);
+if (ret) {
+error_report("%s: cannot disable RAM discard", __func__);
+return -1;
+}
+}
 
 sev_guest = sev;
 sev->state = SEV_STATE_UNINIT;
--
2.34.1
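
For readers who want to see the effect of the new flag in isolation, here
is a minimal sketch in C. It is not code from target/i386/sev.c: ToySev and
the toy_* functions are hypothetical stand-ins, and the counters only model
"memory would have been registered with KVM" and "RAM discard would have
been disabled". The only thing taken from the patch is the shape of the
control flow: both RAM block notifiers return early, and discard is left
enabled, when pin_memory is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SevGuestState, reduced to what the sketch needs. */
typedef struct {
    bool pin_memory;        /* mirrors the new "pin-memory" property */
    size_t pinned_bytes;    /* bytes that would have been registered with KVM */
    bool discard_disabled;  /* whether RAM discard would have been disabled */
} ToySev;

/* Models the init path: RAM discard is only disabled when pinning is on. */
static void toy_sev_init(ToySev *sev, bool pin_memory)
{
    assert(sev != NULL);
    sev->pin_memory = pin_memory;
    sev->pinned_bytes = 0;
    sev->discard_disabled = pin_memory;
}

/* Models the "RAM block added" notifier: a no-op when pinning is off. */
static void toy_ram_block_added(ToySev *sev, size_t size)
{
    if (!sev->pin_memory) {
        return;
    }
    sev->pinned_bytes += size;
}

/* Models the "RAM block removed" notifier: a no-op when pinning is off. */
static void toy_ram_block_removed(ToySev *sev, size_t size)
{
    if (!sev->pin_memory) {
        return;
    }
    sev->pinned_bytes -= size;
}
```

With pin_memory set to false, adding and removing blocks leaves pinned_bytes
at zero and discard_disabled false, which is the state ballooning needs.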



Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Markus Armbruster
Peter Xu  writes:

> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote:
>> Hi Fabiano,
>> 
>> On 9/1/24 21:21, Fabiano Rosas wrote:
>> > Cédric Le Goater  writes:
>> > 
>> > > On 1/9/24 18:40, Fabiano Rosas wrote:
>> > > > Cédric Le Goater  writes:
>> > > > 
>> > > > > On 1/3/24 20:53, Fabiano Rosas wrote:
>> > > > > > Philippe Mathieu-Daudé  writes:
>> > > > > > 
>> > > > > > > +Peter/Fabiano
>> > > > > > > 
>> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote:
>> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
>> > > > > > > > > Hi Cédric,
>> > > > > > > > > 
>> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote:
>> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
>> > > > > > > > > > > Hi,
>> > > > > > > > > > > 
>> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores belong 
>> > > > > > > > > > > the the
>> > > > > > > > > > > cluster container, not to the board/soc layer. This 
>> > > > > > > > > > > series move
>> > > > > > > > > > > the creation of vCPUs to the MPCore private container.
>> > > > > > > > > > > 
>> > > > > > > > > > > Doing so we consolidate the QOM model, moving common 
>> > > > > > > > > > > code in a
>> > > > > > > > > > > central place (abstract MPCore parent).
>> > > > > > > > > > 
>> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of 
>> > > > > > > > > > the machine
>> > > > > > > > > > and some fixups are then required to maintain migration 
>> > > > > > > > > > compatibility.
>> > > > > > > > > > This can become a real headache for KVM machines like virt 
>> > > > > > > > > > for which
>> > > > > > > > > > migration compatibility is a feature, less for emulated 
>> > > > > > > > > > ones.
>> > > > > > > > > 
>> > > > > > > > > All changes are either moving properties (which are not 
>> > > > > > > > > migrated)
>> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of ARMCPU, 
>> > > > > > > > > which
>> > > > > > > > > is still migrated elsewhere). So I don't see any obvious 
>> > > > > > > > > migration
>> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan :>
>> > > > > > 
>> > > > > > FWIW, I didn't spot anything problematic either.
>> > > > > > 
>> > > > > > I've ran this through my migration compatibility series [1] and it
>> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M
>> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I 
>> > > > > > don't
>> > > > > > think we even support migration of anything non-KVM on arm.
>> > > > > 
>> > > > > it happens we do.
>> > > > > 
>> > > > 
>> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something like
>> > > > non-KVM-capable cpus, as in 32-bit. Nevermind.
>> > > 
>> > > Theoretically, we should be able to migrate to a TCG guest. Well, this
>> > > worked in the past for PPC. When I was doing more KVM related changes,
>> > > this was very useful for dev. Also, some machines are partially emulated.
>> > > Anyhow I agree this is not a strong requirement and we often break it.
>> > > Let's focus on KVM only.
>> > > 
>> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
>> > > > > 
>> > > > > yes it depends on the QOM hierarchy and virt seems immune to the 
>> > > > > changes.
>> > > > > Good.
>> > > > > 
>> > > > > However, changing the QOM topology clearly breaks migration compat,
>> > > > 
>> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed
>> > > > already, do you have a pointer to one of those cases were we broke
>> > > > migration
>> > > 
>> > > Regarding pseries, migration compat broke because of 5bc8d26de20c
>> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
>> > > is similar to the changes proposed by this series, it impacts the QOM
>> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
>> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which
>> > > is quite an headache and this turned out to raise another problem some
>> > > months ago ... :/ That's why I sent [1] to prepare removal of old
>> > > machines and workarounds becoming a burden.
>> > 
>> > This feels like something that could be handled by the vmstate code
>> > somehow. The state is there, just under a different path.
>> 
>> What, the QOM path is used in migration? ...
>
> Hopefully not..
>
>> 
>> See recent discussions on "QOM path stability":
>> https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/
>> https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/
>> https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/
>
> If I read it right, the commit 46f7afa37096 example is pretty special in that
> the QOM path change reflected more than a change in the hierarchy itself: it
> also changed the existence of objects.

Let's see whether I got this...

We removed some useless objects, moved the useful ones to another home.
The move changed their QOM path.


Re: [PATCH v2 01/14] target/arm/cpu: Simplify checking A64_MTE bit in FEATURE_ID register

2024-01-09 Thread Richard Henderson

On 1/10/24 05:09, Philippe Mathieu-Daudé wrote:

cpu_isar_feature(aa64_mte, cpu) is testing an AArch64-only ID
register. The ARM_FEATURE_AARCH64 check is redundant.

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/arm/cpu.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 1c8b787482..c828b333c9 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1690,8 +1690,7 @@ void arm_cpu_post_init(Object *obj)
  }
  
  #ifndef CONFIG_USER_ONLY

-if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64) &&
-cpu_isar_feature(aa64_mte, cpu)) {
+if (cpu_isar_feature(aa64_mte, cpu)) {
  object_property_add_link(obj, "tag-memory",
   TYPE_MEMORY_REGION,
   (Object **)&cpu->tag_memory,


It is not redundant.

If !AARCH64, then the isar registers tested by aa64_mte are invalid.


r~



Re: [PATCH v2 1/3] linux-user: Allow gdbstub to ignore page protection

2024-01-09 Thread Richard Henderson

On 1/10/24 10:05, Ilya Leoshkevich wrote:

gdbserver ignores page protection by virtue of using /proc/$pid/mem.
Teach qemu gdbstub to do this too. This will not work if /proc is not
mounted; accept this limitation.

One alternative is to temporarily grant the missing PROT_* bit, but
this is inherently racy. Another alternative is self-debugging with
ptrace(POKE), which will break if QEMU itself is being debugged - a
much more severe limitation.

Signed-off-by: Ilya Leoshkevich 


Reviewed-by: Richard Henderson 


r~
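
As an illustration of the /proc/$pid/mem technique the commit message
refers to, here is a minimal, Linux-only sketch in C. It is not the gdbstub
code from the patch: poke_via_proc_mem() and demo_poke_readonly_page() are
hypothetical names, and the sketch writes to the current process through
/proc/self/mem. It assumes /proc is mounted and that the kernel permits
such writes; if either is not the case the helpers report failure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes at addr in the current process through /proc/self/mem,
 * regardless of the page protection of the mapping.
 * Returns 0 on success, -1 on failure (for example if /proc is not mounted).
 */
static int poke_via_proc_mem(void *addr, const void *buf, size_t len)
{
    int fd = open("/proc/self/mem", O_RDWR);
    ssize_t n;

    if (fd < 0) {
        return -1;
    }
    n = pwrite(fd, buf, len, (off_t)(uintptr_t)addr);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

/*
 * Map one read-only anonymous page, poke a byte into it, and return the
 * byte read back, or -1 if the mapping or the poke failed.
 */
static int demo_poke_readonly_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char value = 0x42;
    unsigned char *p;
    int result = -1;

    assert(page > 0);
    p = mmap(NULL, (size_t)page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    if (poke_via_proc_mem(p, &value, sizeof(value)) == 0) {
        /* volatile read so the compiler does not assume the page is zero */
        result = *(volatile unsigned char *)p;
    }
    munmap(p, (size_t)page);
    return result;
}
```

A direct store to that page would fault, while the write through
/proc/self/mem is expected to succeed. The -1 path corresponds to the
limitation the commit message accepts when /proc is unavailable.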



Re: [PATCH] net/vmnet: Pad short Ethernet frames

2024-01-09 Thread William Hooper
On Mon, Jan 8, 2024 at 7:36 AM Philippe Mathieu-Daudé  wrote:
> Don't we want to initialize min_pktsz here ...
>
>min_pktsz = sizeof(min_pkt);
>
> > +if (eth_pad_short_frame(min_pkt, &min_pktsz, pkt, pktsz)) {
>
> ... because eth_pad_short_frame() updates it?

Thanks for the review.

The results would be the same, since eth_pad_short_frame() sets
min_pktsz, if at all, to ETH_ZLEN, the same value as the initializer.

I have no objection to re-initializing min_pktsz for each packet,
however, if only to reduce the risk of a bug being introduced if this
behavior of eth_pad_short_frame() were ever to be changed.

Would you like me to post a revised patch?
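
To make the two options concrete, here is a minimal sketch in C of the
per-packet re-initialization being discussed. pad_short_frame() is a
hypothetical stand-in that only mimics the behaviour described above (it
sets the output length, if at all, to the minimum frame length); it is not
QEMU's eth_pad_short_frame(), and TOY_MIN_FRAME_LEN stands in for ETH_ZLEN
with an assumed value of 60.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MIN_FRAME_LEN 60 /* assumed stand-in for ETH_ZLEN */

/*
 * Hypothetical helper: if the frame is shorter than TOY_MIN_FRAME_LEN, copy
 * it into padded[], zero-fill the tail, set *padded_len to
 * TOY_MIN_FRAME_LEN and return true. Otherwise leave the outputs untouched
 * and return false. On entry *padded_len must hold the buffer capacity.
 */
static bool pad_short_frame(uint8_t *padded, size_t *padded_len,
                            const uint8_t *pkt, size_t pkt_len)
{
    assert(*padded_len >= TOY_MIN_FRAME_LEN);
    if (pkt_len >= TOY_MIN_FRAME_LEN) {
        return false;
    }
    memcpy(padded, pkt, pkt_len);
    memset(padded + pkt_len, 0, TOY_MIN_FRAME_LEN - pkt_len);
    *padded_len = TOY_MIN_FRAME_LEN;
    return true;
}

/* Returns the length that would be handed to the receive path. */
static size_t deliver_len(const uint8_t *pkt, size_t pkt_len)
{
    uint8_t min_pkt[TOY_MIN_FRAME_LEN];
    size_t min_pktsz = sizeof(min_pkt); /* re-initialized for every packet */

    if (pad_short_frame(min_pkt, &min_pktsz, pkt, pkt_len)) {
        return min_pktsz;
    }
    return pkt_len;
}
```

With the helper as written, initializing min_pktsz once outside the loop
would give the same results. Resetting it per packet only matters if the
helper were ever changed to write back a different length.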



qcow2-rs v0.1 and rublk-qcow2

2024-01-09 Thread Ming Lei
Hello,

qcow2-rs[1] is a pure Rust library for reading and writing qcow2 images. It is
based on rsd's[2] internal qcow2 implementation, but with many changes. So far
it provides:

- read/write support for data files, backing files and compressed images

- a block-device-like interface: the minimum read/write unit is aligned with
the block size of the image, so that direct I/O can be supported

- L2 tables and refcount blocks are loaded and stored in slices; the minimum
slice size is the block size, and the maximum is the cluster size

- built on Rust async/await: low-level I/O handling is abstracted behind async
traits, so multiple low-level I/O engines can be supported. So far it has been
verified on tokio-uring[3], raw Linux synchronous I/O syscalls, and
io-uring[4] with the smol[5] runtime

Thanks to async/.await, all I/O (including metadata I/O) is handled
asynchronously, yet the code reads just like synchronous code. This keeps the
library's design and implementation clean, and makes it easy to add new
features and to optimize further on top of the current code base.

rublk-qcow2[6] wires qcow2-rs, libublk-rs[7], smol (LocalExecutor) and io-uring
together, and provides a block device interface for qcow2 images in 500 LoC.

Inside the rublk-qcow2 async implementation, each io-uring future is mapped to
a (waker, result) pair in a HashMap keyed by the unique cqe.user_data. This
simple approach works; it may slow things down a bit, but performance is still
not bad. In a simple 'fio/t/io_uring $DEV' test, the IOPS of rublk-qcow2 is 20%
better than vdpa-virtio-blk with the same settings
(cache.direct=on,aio=io_uring) when reading from a fully allocated image in my
test VM.
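
The user_data-keyed mapping can be sketched as follows. This is not code
from rublk-qcow2, which is written in Rust and uses a HashMap: the sketch is
in C, swaps the HashMap for a small fixed-size probing table, and every name
in it (ToyMap, toy_register, toy_complete, toy_take) is hypothetical. It only
illustrates the flow: register a key before submission, record the result
and call the waker when the completion arrives, then let the woken task
collect the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 64 /* toy capacity; must be a power of two */

typedef void (*toy_waker_fn)(void *opaque);

typedef struct {
    bool used;           /* slot holds a pending or completed request */
    bool done;           /* the completion has arrived */
    uint64_t user_data;  /* key: unique per in-flight request */
    int result;          /* completion result, valid once done is set */
    toy_waker_fn waker;  /* called when the completion arrives */
    void *opaque;
} ToySlot;

typedef struct {
    ToySlot slots[TOY_SLOTS];
} ToyMap;

/* Probe the whole table, so freed slots never hide a live key. */
static ToySlot *toy_lookup(ToyMap *m, uint64_t key)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        ToySlot *s = &m->slots[(key + i) & (TOY_SLOTS - 1)];
        if (s->used && s->user_data == key) {
            return s;
        }
    }
    return NULL;
}

/* Register a request before submitting it. Fails if full or key in use. */
static bool toy_register(ToyMap *m, uint64_t key,
                         toy_waker_fn waker, void *opaque)
{
    assert(m != NULL);
    if (toy_lookup(m, key)) {
        return false;
    }
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        ToySlot *s = &m->slots[(key + i) & (TOY_SLOTS - 1)];
        if (!s->used) {
            *s = (ToySlot){ .used = true, .user_data = key,
                            .waker = waker, .opaque = opaque };
            return true;
        }
    }
    return false;
}

/* Called from the completion loop with a completion's user_data and result. */
static bool toy_complete(ToyMap *m, uint64_t key, int result)
{
    ToySlot *s = toy_lookup(m, key);

    if (!s || s->done) {
        return false;
    }
    s->result = result;
    s->done = true;
    if (s->waker) {
        s->waker(s->opaque);
    }
    return true;
}

/* Called when the woken task polls again: fetch result and free the slot. */
static bool toy_take(ToyMap *m, uint64_t key, int *result)
{
    ToySlot *s = toy_lookup(m, key);

    if (!s || !s->done) {
        return false;
    }
    *result = s->result;
    s->used = false;
    return true;
}

/* Example waker: counts how often it was called. */
static void toy_count_wakeups(void *opaque)
{
    ++*(int *)opaque;
}
```

Every completion costs one lookup by key, which is the overhead the
paragraph above alludes to. A design that stores a pointer to the request
state directly in user_data would avoid the lookup, at the cost of more
careful lifetime handling.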

The initial motivation was to support rublk-qcow2, but I couldn't find any
Rust qcow2 library with read/write support, a simple interface and efficient
AIO support, so it eventually evolved into a generic qcow2 library. Many
qcow2 test cases have been added. The project also includes a utility
that can dump qcow2 metadata, show metadata-related statistics of the image,
check image metadata integrity and host cluster leaks, format a qcow2 image,
read and write, ...

Any comments are welcome!



[1] qcow2-rs
https://github.com/ublk-org/qcow2-rs

[2] rsd
https://gitlab.com/hreitz/rsd/-/tree/main/src/node/qcow2?ref_type=heads

[3] tokio-uring
https://docs.rs/tokio-uring

[4] io-uring
https://docs.rs/io-uring

[5] smol
https://docs.rs/smol

[6] rublk-qcow2
https://github.com/ublk-org/rublk

[7] libublk-rs
https://github.com/ublk-org/libublk-rs



Thanks, 
Ming



Re: [PATCH v3 4/4] [NOT FOR MERGE] tests/qtest/migration: Adapt tests to use older QEMUs

2024-01-09 Thread Peter Xu
On Tue, Jan 09, 2024 at 11:46:32AM -0300, Fabiano Rosas wrote:
> Hm, it would be better to avoid the extra maintenance task at the start
> of every release, no? It also blocks us from doing n-2 even
> experimentally.

See my other reply, on whether we can use "n-1" for migration-test.  If
that can work for us, then IIUC we can avoid both "since:" and any
similar flag, and we don't need to unmask tests after each release either.
All old tests should always "just work" with a new qemu binary.

One drawback I can think of is that new tests (even if applicable to old qemu
binaries) will only start to take effect in cross-binary testing from the
next release, but that's not so bad I assume.

Since the QTEST_QEMU_BINARY_SRC|DST function is already merged in 8.2, I
think we can already start kicking them and enable them for 9.0 if it works.

-- 
Peter Xu




[PATCH 2/2] target/riscv: Raise an exception when sdtrig is turned off

2024-01-09 Thread Himanshu Chauhan
When sdtrig is turned off by the "sdtrig=false" option, raise
an illegal instruction exception on any read/write to
sdtrig CSRs.

Signed-off-by: Himanshu Chauhan 
---
 target/riscv/csr.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index c50a33397c..b9ca016ef2 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -3854,6 +3854,10 @@ static RISCVException write_pmpaddr(CPURISCVState *env, int csrno,
 static RISCVException read_tselect(CPURISCVState *env, int csrno,
target_ulong *val)
 {
+if (!riscv_cpu_cfg(env)->ext_sdtrig) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
 *val = tselect_csr_read(env);
 return RISCV_EXCP_NONE;
 }
@@ -3861,6 +3865,10 @@ static RISCVException read_tselect(CPURISCVState *env, int csrno,
 static RISCVException write_tselect(CPURISCVState *env, int csrno,
 target_ulong val)
 {
+if (!riscv_cpu_cfg(env)->ext_sdtrig) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
 tselect_csr_write(env, val);
 return RISCV_EXCP_NONE;
 }
@@ -3868,6 +3876,10 @@ static RISCVException write_tselect(CPURISCVState *env, int csrno,
 static RISCVException read_tdata(CPURISCVState *env, int csrno,
  target_ulong *val)
 {
+if (!riscv_cpu_cfg(env)->ext_sdtrig) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
 /* return 0 in tdata1 to end the trigger enumeration */
 if (env->trigger_cur >= RV_MAX_TRIGGERS && csrno == CSR_TDATA1) {
 *val = 0;
@@ -3885,6 +3897,10 @@ static RISCVException read_tdata(CPURISCVState *env, int csrno,
 static RISCVException write_tdata(CPURISCVState *env, int csrno,
   target_ulong val)
 {
+if (!riscv_cpu_cfg(env)->ext_sdtrig) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
 if (!tdata_available(env, csrno - CSR_TDATA1)) {
 return RISCV_EXCP_ILLEGAL_INST;
 }
@@ -3896,6 +3912,10 @@ static RISCVException write_tdata(CPURISCVState *env, int csrno,
 static RISCVException read_tinfo(CPURISCVState *env, int csrno,
  target_ulong *val)
 {
+if (!riscv_cpu_cfg(env)->ext_sdtrig) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
 *val = tinfo_csr_read(env);
 return RISCV_EXCP_NONE;
 }
-- 
2.34.1




[PATCH 1/2] target/riscv: Export sdtrig as an extension and ISA string

2024-01-09 Thread Himanshu Chauhan
This patch exposes the debug trigger (sdtrig) capability
as an extension and exports it in the ISA string. The sdtrig
extension may or may not be implemented in a system. The
-cpu rv64,sdtrig=
option can be used to dynamically turn the sdtrig extension
on or off.

Signed-off-by: Himanshu Chauhan 
---
 target/riscv/cpu.c | 2 ++
 target/riscv/cpu_cfg.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b07a76ef6b..aaa2d4ff1d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -143,6 +143,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zvkt, PRIV_VERSION_1_12_0, ext_zvkt),
 ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
 ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
+ISA_EXT_DATA_ENTRY(sdtrig, PRIV_VERSION_1_12_0, ext_sdtrig),
 ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
 ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
 ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
@@ -1306,6 +1307,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 MULTI_EXT_CFG_BOOL("zve64d", ext_zve64d, false),
 MULTI_EXT_CFG_BOOL("sstc", ext_sstc, true),
 
+MULTI_EXT_CFG_BOOL("sdtrig", ext_sdtrig, true),
 MULTI_EXT_CFG_BOOL("smepmp", ext_smepmp, false),
 MULTI_EXT_CFG_BOOL("smstateen", ext_smstateen, false),
 MULTI_EXT_CFG_BOOL("svadu", ext_svadu, true),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index f4605fb190..3d3acc7f90 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -113,6 +113,7 @@ struct RISCVCPUConfig {
 bool ext_ssaia;
 bool ext_sscofpmf;
 bool ext_smepmp;
+bool ext_sdtrig;
 bool rvv_ta_all_1s;
 bool rvv_ma_all_1s;
 
-- 
2.34.1




[PATCH 0/2] Export debug triggers as an extension

2024-01-09 Thread Himanshu Chauhan
Not all CPUs implement the debug trigger (sdtrig) extension,
so its presence should be dynamically detectable.
This series exports the debug triggers as an extension which
can be turned on or off by the sdtrig= option. It is
turned on by default.

"sdtrig" is concatenated to ISA string when it is enabled.
Like so:
rv64imafdch_zicbom_*_sdtrig_*_sstc_svadu


Himanshu Chauhan (2):
  target/riscv: Export sdtrig as an extension and ISA string
  target/riscv: Raise an exception when sdtrig is turned off

 target/riscv/cpu.c |  2 ++
 target/riscv/cpu_cfg.h |  1 +
 target/riscv/csr.c | 20 
 3 files changed, 23 insertions(+)

-- 
2.34.1
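
The ISA string behaviour described in the cover letter can be sketched in C
like this. It is a toy model with hypothetical names (ToyExt,
toy_isa_string), not the code QEMU uses to build the string, and the
extension list in the usage note is shortened for the example. It shows only
the rule stated above: an extension name is appended, separated by an
underscore, when the extension is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: extension name plus whether it is enabled. */
typedef struct {
    const char *name;
    bool enabled;
} ToyExt;

/*
 * Build "<base>_<ext>_<ext>..." from the enabled entries, in table order.
 * Output is truncated (but always NUL-terminated) if out is too small.
 */
static void toy_isa_string(char *out, size_t out_len, const char *base,
                           const ToyExt *exts, size_t n)
{
    assert(out_len > 0);
    snprintf(out, out_len, "%s", base);
    for (size_t i = 0; i < n; i++) {
        size_t used;

        if (!exts[i].enabled) {
            continue;
        }
        used = strlen(out);
        snprintf(out + used, out_len - used, "_%s", exts[i].name);
    }
}
```

Given the base "rv64imafdch" and the entries zicbom, sdtrig and sstc all
enabled, the result is "rv64imafdch_zicbom_sdtrig_sstc". Disabling the
sdtrig entry drops it from the string, which is what makes the extension
detectable from the ISA string alone.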




Re: [PATCH v3 3/4] ci: Add a migration compatibility test job

2024-01-09 Thread Peter Xu
On Tue, Jan 09, 2024 at 10:00:17AM -0300, Fabiano Rosas wrote:
> > Can we opt-out those broken tests using either your "since:" thing or
> > anything similar?
> 
> If it's something migration related, then yes. But there might be other
> types of breakages that have nothing to do with migration. Our tests are
> not resilient enough (nor should they be) to detect when QEMU aborted for
> other reasons. Think about the -audio issue: the old QEMU would just say
> "there's no -audio option, abort" and that's a test failure of course.

I'm wondering whether we can more or less remedy that by running
migration-test under the build-previous directory for cross-binary tests.
We don't necessarily need to cross-test anything new happening anyway.

IOW, we use both old QEMU / migration-test for "n-1", and we only use "n"
for the new QEMU binary?

-- 
Peter Xu




Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Peter Xu
On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote:
> Hi Fabiano,
> 
> On 9/1/24 21:21, Fabiano Rosas wrote:
> > Cédric Le Goater  writes:
> > 
> > > On 1/9/24 18:40, Fabiano Rosas wrote:
> > > > Cédric Le Goater  writes:
> > > > 
> > > > > On 1/3/24 20:53, Fabiano Rosas wrote:
> > > > > > Philippe Mathieu-Daudé  writes:
> > > > > > 
> > > > > > > +Peter/Fabiano
> > > > > > > 
> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote:
> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
> > > > > > > > > Hi Cédric,
> > > > > > > > > 
> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote:
> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores belong 
> > > > > > > > > > > the the
> > > > > > > > > > > cluster container, not to the board/soc layer. This 
> > > > > > > > > > > series move
> > > > > > > > > > > the creation of vCPUs to the MPCore private container.
> > > > > > > > > > > 
> > > > > > > > > > > Doing so we consolidate the QOM model, moving common code 
> > > > > > > > > > > in a
> > > > > > > > > > > central place (abstract MPCore parent).
> > > > > > > > > > 
> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of 
> > > > > > > > > > the machine
> > > > > > > > > > and some fixups are then required to maintain migration 
> > > > > > > > > > compatibility.
> > > > > > > > > > This can become a real headache for KVM machines like virt 
> > > > > > > > > > for which
> > > > > > > > > > migration compatibility is a feature, less for emulated 
> > > > > > > > > > ones.
> > > > > > > > > 
> > > > > > > > > All changes are either moving properties (which are not 
> > > > > > > > > migrated)
> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of ARMCPU, 
> > > > > > > > > which
> > > > > > > > > is still migrated elsewhere). So I don't see any obvious 
> > > > > > > > > migration
> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan :>
> > > > > > 
> > > > > > FWIW, I didn't spot anything problematic either.
> > > > > > 
> > > > > > I've ran this through my migration compatibility series [1] and it
> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M
> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I 
> > > > > > don't
> > > > > > think we even support migration of anything non-KVM on arm.
> > > > > 
> > > > > it happens we do.
> > > > > 
> > > > 
> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something like
> > > > non-KVM-capable cpus, as in 32-bit. Nevermind.
> > > 
> > > Theoretically, we should be able to migrate to a TCG guest. Well, this
> > > worked in the past for PPC. When I was doing more KVM related changes,
> > > this was very useful for dev. Also, some machines are partially emulated.
> > > Anyhow I agree this is not a strong requirement and we often break it.
> > > Let's focus on KVM only.
> > > 
> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
> > > > > 
> > > > > yes it depends on the QOM hierarchy and virt seems immune to the 
> > > > > changes.
> > > > > Good.
> > > > > 
> > > > > However, changing the QOM topology clearly breaks migration compat,
> > > > 
> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed
> > > > already, do you have a pointer to one of those cases were we broke
> > > > migration
> > > 
> > > Regarding pseries, migration compat broke because of 5bc8d26de20c
> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
> > > is similar to the changes proposed by this series, it impacts the QOM
> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which
> > > is quite an headache and this turned out to raise another problem some
> > > months ago ... :/ That's why I sent [1] to prepare removal of old
> > > machines and workarounds becoming a burden.
> > 
> > This feels like something that could be handled by the vmstate code
> > somehow. The state is there, just under a different path.
> 
> What, the QOM path is used in migration? ...

Hopefully not..

> 
> See recent discussions on "QOM path stability":
> https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/
> https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/
> https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/

If I read it right, the commit 46f7afa37096 example is pretty special in that
the QOM path change reflected more than a change in the hierarchy itself: it
also changed the existence of objects.

> 
> > No one wants
> > to be policing QOM hierarchy changes in every single series that shows
> > up on the list.
> > 
> > Anyway, thanks for the pointers. I'll study that code a bit more, maybe
> > I can come up with some way to handle these cases.
> > 
> > Hopefully 

Re: [PATCH v2 0/4] hw/loongarch/virt: Set iocsr address space per-board rather percpu

2024-01-09 Thread gaosong

在 2023/12/15 下午6:03, Bibo Mao 写道:

On LoongArch systems, there is an iocsr address space similar to the system
I/O address space on x86. Each CPU currently has its own separate iocsr
address space; with this series, the iocsr address space becomes per-board,
and MemTxAttrs.requester_id is used to differentiate CPU cores.

---
Changes in v2:
   1. Add num-cpu property for extioi interrupt controller
   2. Add post_load support for extioi vmstate to calculate sw_ipmap/sw_coremap 
info
---
Bibo Mao (4):
   hw/intc/loongarch_ipi: Use MemTxAttrs interface for ipi ops
   hw/loongarch/virt: Set iocsr address space per-board rather than
 percpu
   hw/intc/loongarch_extioi: Add dynamic cpu number support
   hw/intc/loongarch_extioi: Add vmstate post_load support

  hw/intc/loongarch_extioi.c | 230 ++---
  hw/intc/loongarch_ipi.c| 191 +++-
  hw/loongarch/virt.c|  94 
  include/hw/intc/loongarch_extioi.h |  12 +-
  include/hw/intc/loongarch_ipi.h|   3 +-
  include/hw/loongarch/virt.h|   3 +
  target/loongarch/cpu.c |  48 --
  target/loongarch/cpu.h |   4 +-
  target/loongarch/iocsr_helper.c|  16 +-
  9 files changed, 358 insertions(+), 243 deletions(-)


Applied to loongarch-next.

Thanks.
Song Gao




Re: [PATCH v4 0/9] Add loongarch kvm accel support

2024-01-09 Thread gaosong

在 2024/1/5 下午3:57, Tianrui Zhao 写道:

The Linux headers in this series are synchronized from Linux kernel
v6.7.0-rc8, and the LoongArch KVM part of this series is based on
those header files. The Linux kernel has added LoongArch KVM support
in its master branch.

This series adds LoongArch KVM support, mainly implementing
some interfaces used by KVM, such as kvm_arch_get/set_regs,
kvm_arch_handle_exit, kvm_loongarch_set_interrupt, etc.

Currently, we are able to boot LoongArch KVM Linux guests.
In a LoongArch VM, MMIO devices and iocsr devices such as APIC, IPI
and PCI devices are emulated in user space, while other
hardware such as the MMU, timer and CSRs is emulated in the kernel.

The running environment of LoongArch virt machine:
1. Get the Linux KVM environment of LoongArch in Linux mainline.
make ARCH=loongarch CROSS_COMPILE=loongarch64-unknown-linux-gnu- loongson3_defconfig
make ARCH=loongarch CROSS_COMPILE=loongarch64-unknown-linux-gnu-
2. Get the qemu source: https://github.com/loongson/qemu
git checkout kvm-loongarch
./configure --target-list="loongarch64-softmmu"  --enable-kvm
make
3. Get uefi bios of LoongArch virt machine:
Link: https://github.com/tianocore/edk2-platforms/tree/master/Platform/Loongson/LoongArchQemuPkg#readme
4. Also you can access the binary files we have already built:
https://github.com/yangxiaojuan-loongson/qemu-binary

The command to boot loongarch virt machine:
$ qemu-system-loongarch64 -machine virt -m 4G -cpu la464 \
-smp 1 -bios QEMU_EFI.fd -kernel vmlinuz.efi -initrd ramdisk \
-serial stdio   -monitor telnet:localhost:4495,server,nowait \
-append "root=/dev/ram rdinit=/sbin/init console=ttyS0,115200" \
--nographic

Changes for v4:
1. Synchronize linux headers from linux v6.7.0-rc8.
2. Move kvm.c and kvm_loongarch.h into target/loongarch/kvm/
directory.
3. Add "#ifndef CONFIG_USER_ONLY" before loongarch_cpu_do_interrupt
to fix compiling issue.
4. Remove "#ifdef CONFIG_TCG" before "#include "exec/cpu_ldst.h""
in fpu_helper.c, as it has been changed in other patches.

Changes for v3:
1. Synchronize linux headers from linux v6.7.0-rc7.
2. Fix a compile error when configuring with enable-kvm and disable-tcg
at the same time.

Changes for v2:
1. Synchronize linux headers from linux v6.7.0-rc6.
2. Remove the stub function: kvm_loongarch_set_interrupt, as kvm_enabled
3. Move the kvm function such as kvm_arch_reset_vcpu from cpu.h to
loongarch_kvm.h, and supplement "#include " in loongarch_kvm.h.

Changes for v1:
1. Synchronize the LoongArch KVM headers from the Linux kernel,
as the LoongArch KVM patch series has been accepted by the Linux kernel.
2. Remove the KVM_GET/SET_ONE_UREG64 macro in target/loongarch, and
use the common interface kvm_get/set_one_reg to replace it.
3. Resolve the compiling errors when LoongArch is built by other archs.

Tianrui Zhao (9):
   linux-headers: Synchronize linux headers from linux v6.7.0-rc8
   target/loongarch: Define some kvm_arch interfaces
   target/loongarch: Supplement vcpu env initial when vcpu reset
   target/loongarch: Implement kvm get/set registers
   target/loongarch: Implement kvm_arch_init function
   target/loongarch: Implement kvm_arch_init_vcpu
   target/loongarch: Implement kvm_arch_handle_exit
   target/loongarch: Implement set vcpu intr for kvm
   target/loongarch: Add loongarch kvm into meson build

  include/standard-headers/drm/drm_fourcc.h |   2 +
  include/standard-headers/linux/fuse.h |  10 +-
  include/standard-headers/linux/pci_regs.h |  24 +-
  include/standard-headers/linux/vhost_types.h  |   7 +
  .../standard-headers/linux/virtio_config.h|   5 +
  include/standard-headers/linux/virtio_pci.h   |  11 +
  linux-headers/asm-arm64/kvm.h |  32 +
  linux-headers/asm-generic/unistd.h|  14 +-
  linux-headers/asm-loongarch/bitsperlong.h |   1 +
  linux-headers/asm-loongarch/kvm.h | 108 +++
  linux-headers/asm-loongarch/mman.h|   1 +
  linux-headers/asm-loongarch/unistd.h  |   5 +
  linux-headers/asm-mips/unistd_n32.h   |   4 +
  linux-headers/asm-mips/unistd_n64.h   |   4 +
  linux-headers/asm-mips/unistd_o32.h   |   4 +
  linux-headers/asm-powerpc/unistd_32.h |   4 +
  linux-headers/asm-powerpc/unistd_64.h |   4 +
  linux-headers/asm-riscv/kvm.h |  12 +
  linux-headers/asm-s390/unistd_32.h|   4 +
  linux-headers/asm-s390/unistd_64.h|   4 +
  linux-headers/asm-x86/unistd_32.h |   4 +
  linux-headers/asm-x86/unistd_64.h |   3 +
  linux-headers/asm-x86/unistd_x32.h|   3 +
  linux-headers/linux/iommufd.h | 180 +++-
  linux-headers/linux/kvm.h |  11 +
  linux-headers/linux/psp-sev.h |   1 +
  linux-headers/linux/stddef.h  |   9 +-
  linux-headers/linux/userfaultfd.h |   9 +-
  linux-headers/linux/vfio.h  

Re: [PATCH 00/10] docs/migration: Reorganize migration documentations

2024-01-09 Thread Peter Xu
On Tue, Jan 09, 2024 at 02:21:26PM +0100, Cédric Le Goater wrote:
> 
> > A few things I'd like to mention alongside, because it's documentation
> > relevant too, and I'd like to collect if there's any comment.
> > 
> > I just mostly rewrote two wiki pages completely:
> > 
> >https://wiki.qemu.org/ToDo/LiveMigration
> > >https://wiki.qemu.org/Features/Migration
> > > 
> > I merged all the TODO items from Features/Migration into the ToDo page,
> > while kept the 2nd page mostly clean, just to route to other places.
> > 
> > I had a plan to make:
> > 
> >https://qemu.org/docs/master
> > 
> > > The sole place for migration documentation (i.e. the QEMU repo as the
> > > source of truth for migration docs, as it's periodically built there),
> > > making all the other places point to that, as I already did in the wiki
> > > page.  Meanwhile I kept all the TODOs on the wiki page (not
> > > Features/Migration, but ToDo/LiveMigration).
> > > 
> > > Fabiano / anyone: feel free to add / update / correct any entries there
> > > where applicable.  Also if there's any thoughts on above feel free to let
> > > me know too.
> 
> The Wiki has some limited value, the changelog for instance, but the rest
> is a bag of orphan and obsolete pages doomed to bit-rot since it is slowly
> being replaced by the in-tree documentation.
> 
> The info in the Features/Migration page is redundant with what we have
> in-tree, a part from the CREDITS. The TODO list could be some file under :
> 
>   https://qemu.org/docs/master/devel/migration
> 
> It would be easier to find and it would keep the Wiki to a strict minimum.

Thanks for the suggestions.  I agree that we should minimize the wiki use,
especially on docs.  It'll be nice if we use a sole source of truth for the
docs, always accessible via qemu.org/docs; that also makes it easier for us
to ask for docs together with the patches when new features are merged.

I see that most of the ToDos for the other parts of QEMU still use wiki
pages, even though they're indeed mostly outdated, just like the migration
ToDo before I updated it.

IMHO one thing that the wiki serves well for ToDo is that it allows easy
& frequent updates on the projects, without requiring a review
process like most of the patches being posted on the list.  The wiki page
still maintains a diff, and IMHO that may not even be required, as a
history record of a ToDo list may not help much in most cases.

The other issue regarding ToDo is that some of the ToDo ideas (or the
details someone frequently adds to an ongoing item)
may not be mature enough to be mentioned in official documentation.  So even
if some can be put together with the qemu repo, there may
always be some that are not suitable, and we will still need some place
for those.  I still don't know what's the ideal way to do this.

Thanks,

-- 
Peter Xu




Re: [PATCH v4 9/9] target/loongarch: Add loongarch kvm into meson build

2024-01-09 Thread gaosong

On 2024/1/5 at 3:58 PM, Tianrui Zhao wrote:

Add kvm.c into meson.build to compile it when kvm
is configed. Meanwhile in meson.build, we set the
kvm_targets to loongarch64-softmmu when the cpu is
loongarch. And fix the compiling error when config
is enable-kvm,disable-tcg.

Signed-off-by: Tianrui Zhao 
Signed-off-by: xianglai li 
Reviewed-by: Richard Henderson 
---
  meson.build  | 2 ++
  target/loongarch/kvm/meson.build | 1 +
  target/loongarch/meson.build | 1 +
  3 files changed, 4 insertions(+)
  create mode 100644 target/loongarch/kvm/meson.build


Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/meson.build b/meson.build
index 445f2b7c2b..0c62b4156d 100644
--- a/meson.build
+++ b/meson.build
@@ -114,6 +114,8 @@ elif cpu in ['riscv32']
kvm_targets = ['riscv32-softmmu']
  elif cpu in ['riscv64']
kvm_targets = ['riscv64-softmmu']
+elif cpu in ['loongarch64']
+  kvm_targets = ['loongarch64-softmmu']
  else
kvm_targets = []
  endif
diff --git a/target/loongarch/kvm/meson.build b/target/loongarch/kvm/meson.build
new file mode 100644
index 00..2266de6ca9
--- /dev/null
+++ b/target/loongarch/kvm/meson.build
@@ -0,0 +1 @@
+loongarch_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'))
diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build
index 18e8191e2b..7f86caf373 100644
--- a/target/loongarch/meson.build
+++ b/target/loongarch/meson.build
@@ -31,3 +31,4 @@ loongarch_ss.add_all(when: 'CONFIG_TCG', if_true: [loongarch_tcg_ss])
  
  target_arch += {'loongarch': loongarch_ss}

  target_system_arch += {'loongarch': loongarch_system_ss}
+subdir('kvm')





Re: [PATCH 05/10] docs/migration: Split "Debugging" and "Firmware"

2024-01-09 Thread Peter Xu
On Tue, Jan 09, 2024 at 02:03:04PM -0300, Fabiano Rosas wrote:
> pet...@redhat.com writes:
> 
> > From: Peter Xu 
> >
> > Move the two sections into a separate file called "best-practises.rst".
> 
> s/practises/practices/

Will fix, thanks.

-- 
Peter Xu




RE: [PATCH v3 06/70] kvm: Introduce support for memory_attributes

2024-01-09 Thread Wang, Wei W
On Wednesday, January 10, 2024 12:32 AM, Li, Xiaoyao wrote:
> On 1/9/2024 10:53 PM, Wang, Wei W wrote:
> > On Tuesday, January 9, 2024 1:47 PM, Li, Xiaoyao wrote:
> >> On 12/21/2023 9:47 PM, Wang, Wei W wrote:
> >>> On Thursday, December 21, 2023 7:54 PM, Li, Xiaoyao wrote:
>  On 12/21/2023 6:36 PM, Wang, Wei W wrote:
> > > No need to specifically check for KVM_MEMORY_ATTRIBUTE_PRIVATE there.
> > I'm suggesting below:
> >
> > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > > index 2d9a2455de..63ba74b221 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -1375,6 +1375,11 @@ static int
> kvm_set_memory_attributes(hwaddr
>  start, hwaddr size, uint64_t attr)
> > struct kvm_memory_attributes attrs;
> > int r;
> >
> > +if ((attr & kvm_supported_memory_attributes) != attr) {
> > +error_report("KVM doesn't support memory attr %lx\n", attr);
> > +return -EINVAL;
> > +}
> 
>  In the case of setting a range of memory to shared while KVM
>  doesn't support private memory, the above check doesn't work and the
>  following IOCTL fails.
> >>>
> >>> SHARED attribute uses the value 0, which indicates it's always supported, no?
> >>> For the implementation, can you find on the KVM side where the ioctl
> >>> would fail in that case?
> >>
> >> I'm worried about a future case where KVM supports memory attributes
> >> other than shared/private. For example, KVM supports RWX bits
> >> (bits 0 - 2) but not the shared/private bit.
> >
> > What's the exact issue?
> > +#define KVM_MEMORY_ATTRIBUTE_READ   (1ULL << 2)
> > +#define KVM_MEMORY_ATTRIBUTE_WRITE (1ULL << 1)
> > +#define KVM_MEMORY_ATTRIBUTE_EXE  (1ULL << 0)
> >
> > They are checked via
> > "if ((attr & kvm_supported_memory_attributes) != attr)" shared above
> > in kvm_set_memory_attributes.
> > In the case you described, kvm_supported_memory_attributes will be 0x7.
> > Anything unexpected?
> 
> Sorry, I was thinking of the wrong case.
> 
> It doesn't work in the case where KVM doesn't support memory_attribute, e.g.,
> an old KVM. In this case, 'kvm_supported_memory_attributes' is 0, and 'attr'
> is 0 as well.

How is this different in your existing implementation?

The official way of defining a feature is to take a bit (as was done for the
first feature defined, *_PRIVATE). Using 0 as an attr is a bit magic, and it
passes all the "&"-based checks. But using it for *_SHARED looks fine to me,
as semantically memory can always be shared,
and the ioctl will return -ENOTTY anyway in the case you mentioned.
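
For readers following the argument, the mask test under discussion can be
modelled in a few lines of Python. This is only a sketch and not QEMU code:
the helper name is_supported and the bit position chosen for PRIVATE are
assumptions made for illustration. It shows why an attr of 0 (shared) passes
any "&"-based check, including against an empty supported mask.

```python
# Illustrative model of the "(attr & supported) != attr" check; not QEMU code.
PRIVATE = 1 << 3   # assumed bit position for the private attribute
SHARED = 0         # shared is represented by "no bits set"


def is_supported(attr: int, supported: int) -> bool:
    """Return True when every bit set in attr is also set in supported."""
    return (attr & supported) == attr


# A KVM that reports no attributes at all (for example an old kernel):
old_kvm = 0
# A KVM that reports only the three RWX bits from the example above:
rwx_only = 0x7

print(is_supported(SHARED, old_kvm))    # shared passes the mask check
print(is_supported(PRIVATE, old_kvm))   # private is rejected up front
print(is_supported(PRIVATE, rwx_only))  # private is rejected here too
```

So the mask check alone cannot tell "shared on a KVM without the feature"
apart from "shared on a KVM with the feature"; in the old-KVM case it is the
ioctl itself that reports the missing support.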


RE: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=

2024-01-09 Thread Zhang, Chen


> -Original Message-
> From: Michael Tokarev 
> Sent: Tuesday, January 9, 2024 1:44 PM
> To: Zhang, Chen ; qemu-devel@nongnu.org
> Cc: qemu-triv...@nongnu.org; Li Zhijian 
> Subject: Re: [PATCH trivial] colo: examples: remove mentions of script= and
> (wrong) downscript=
> 
> 09.01.2024 05:08, Zhang, Chen :
> >
> >
> >> -Original Message-
> >> From: Michael Tokarev 
> >> Sent: Sunday, January 7, 2024 7:25 PM
> >> To: qemu-devel@nongnu.org
> >> Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang,
> >> Chen ; Li Zhijian 
> >> Subject: [PATCH trivial] colo: examples: remove mentions of script=
> >> and
> >> (wrong) downscript=
> >>
> >> There's no need to repeat script=/etc/qemu-ifup in examples, as it is
> >> already in there.  More, all examples uses incorrect "down script="
> >> (which should be "downscript=").
> >
> > Yes, good catch.
> > Reviewed-by: Zhang Chen 
> >
> >> ---
> >> I'm not sure we need so many identical examples, and why it uses
> >> vnet=off, - it looks like vnet= should also be dropped.
> >
> > > Do you mean the "vnet_hdr_support" in docs?
> 
> Nope, it was a thinko on my part, I mean vhost=off parameter - which is right
> next to script=.
> Why is vhost explicitly disabled here, when it isn't even enabled by default?
> 

Because the QEMU net filter can't support vhost.
Vhost puts virtio emulation code into the kernel, taking QEMU userspace out of
the picture.
So the filter can't get at the network data.

> And do we really need that many examples like this, maybe it's a good idea to
> remove half of them and refer to the other place instead?

Yes, nice to see optimized documentation.

Thanks
Chen

> 
> /mjt


Re: [PATCH 13/19] qapi/schema: fix typing for QAPISchemaVariants.tag_member

2024-01-09 Thread John Snow
On Wed, Nov 22, 2023 at 11:02 AM John Snow  wrote:
>
> On Wed, Nov 22, 2023 at 9:05 AM Markus Armbruster  wrote:
> >
> > John Snow  writes:
> >
> > > There are two related changes here:
> > >
> > > (1) We need to perform type narrowing for resolving the type of
> > > tag_member during check(), and
> > >
> > > (2) tag_member is a delayed initialization field, but we can hide it
> > > behind a property that raises an Exception if it's called too
> > > early. This simplifies the typing in quite a few places and avoids
> > > needing to assert that the "tag_member is not None" at a dozen
> > > callsites, which can be confusing and suggest the wrong thing to a
> > > drive-by contributor.
> > >
> > > Signed-off-by: John Snow 
> >
> > Without looking closely: review of PATCH 10 applies, doesn't it?
> >
>
> Yep!

Hm, actually, maybe not quite as cleanly.

The problem is we *are* initializing that field immediately with
whatever we were passed in during __init__, which means the field is
indeed Optional. Later, during check(), we happen to eliminate that
usage of None.

To remove the use of the @property trick here, we could:

... declare the field, then only initialize it if we were passed a
non-None value. But then check() would need to rely on something like
hasattr to check if it was set or not, which is maybe an unfortunate
code smell.
So I think you'd still wind up needing a ._tag_member field which is
Optional and always gets set during __init__, then setting a proper
.tag_member field during check().

Or I could just leave this one as-is. Or something else. I think the
dirt has to get swept somewhere, because we don't *always* have enough
information to fully initialize it at __init__ time, it's a
conditional delayed initialization, unlike the others which are
unconditionally delayed.
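
To make the trade-off concrete, here is a minimal sketch of the "Optional
private field plus property" arrangement described above. The class names
TagMember and Variants are hypothetical stand-ins, not the real schema.py
classes, and the body of check() is a placeholder for the real lookup.

```python
from typing import Optional


class TagMember:
    """Hypothetical stand-in for the real member class."""

    def __init__(self, name: str) -> None:
        self.name = name


class Variants:
    """Hypothetical stand-in for QAPISchemaVariants."""

    def __init__(self, tag_name: Optional[str],
                 tag_member: Optional[TagMember]) -> None:
        self._tag_name = tag_name
        # Always assigned in __init__, possibly to None, so the private
        # field is honestly Optional.
        self._tag_member = tag_member

    @property
    def tag_member(self) -> TagMember:
        # Callers see a non-Optional type; access before check() fails loudly.
        if self._tag_member is None:
            raise RuntimeError("tag_member accessed before check()")
        return self._tag_member

    def check(self) -> None:
        # Resolve the conditionally delayed case. Real code would look the
        # member up in the schema instead of constructing it here.
        if self._tag_member is None:
            assert self._tag_name is not None
            self._tag_member = TagMember(self._tag_name)
```

With this shape the dozen call sites can use .tag_member without asserting
"is not None", at the cost of one runtime check inside the property.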

--js




RE: [PATCH v2] target/riscv: Implement optional CSR mcontext of debug Sdtrig extension

2024-01-09 Thread 張哲嘉
> -Original Message-
> From: Daniel Henrique Barboza 
> Sent: Wednesday, January 10, 2024 6:04 AM
> To: Alvin Che-Chia Chang(張哲嘉) ;
> qemu-ri...@nongnu.org; qemu-devel@nongnu.org
> Cc: alistair.fran...@wdc.com; bin.m...@windriver.com;
> liwei1...@gmail.com; zhiwei_...@linux.alibaba.com
> Subject: Re: [PATCH v2] target/riscv: Implement optional CSR mcontext of
> debug Sdtrig extension
> 
> 
> On 12/19/23 09:32, Alvin Chang wrote:
> > The debug Sdtrig extension defines a CSR "mcontext". This commit
> > implements its predicate and read/write operations into CSR table.
> > Its value is reset as 0 when the trigger module is reset.
> >
> > Signed-off-by: Alvin Chang 
> > ---
> 
> The patch per se LGTM:
> 
> Reviewed-by: Daniel Henrique Barboza 

Thank you!!

> 
> 
> But I have a question: shouldn't we just go ahead and add the 'sdtrig'
> extension?
> We have a handful of its CSRs already. Adding the extension would also add
> 'sdtrig' in riscv,isa, allowing software to be aware of its existence in QEMU.

I agree with you. I can prepare another PR for adding the 'sdtrig' extension.
BTW, currently we have "riscv_cpu_cfg(env)->debug" to control those trigger
module CSRs.
Maybe we can just remove "debug" and use "sdtrig" instead?

Alvin

> 
> 
> Thanks,
> 
> Daniel
> 
> 
> 
> > Changes from v1: Remove dedicated cfg, always implement mcontext.
> >
> >   target/riscv/cpu.h  |  1 +
> >   target/riscv/cpu_bits.h |  7 +++
> >   target/riscv/csr.c  | 36 +++-
> >   target/riscv/debug.c|  2 ++
> >   4 files changed, 41 insertions(+), 5 deletions(-)
> >
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index d74b361..e117641 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -345,6 +345,7 @@ struct CPUArchState {
> >   target_ulong tdata1[RV_MAX_TRIGGERS];
> >   target_ulong tdata2[RV_MAX_TRIGGERS];
> >   target_ulong tdata3[RV_MAX_TRIGGERS];
> > +target_ulong mcontext;
> >   struct CPUBreakpoint *cpu_breakpoint[RV_MAX_TRIGGERS];
> >   struct CPUWatchpoint *cpu_watchpoint[RV_MAX_TRIGGERS];
> >   QEMUTimer *itrigger_timer[RV_MAX_TRIGGERS];
> > diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> > index ebd7917..3296648 100644
> > --- a/target/riscv/cpu_bits.h
> > +++ b/target/riscv/cpu_bits.h
> > @@ -361,6 +361,7 @@
> >   #define CSR_TDATA2  0x7a2
> >   #define CSR_TDATA3  0x7a3
> >   #define CSR_TINFO   0x7a4
> > +#define CSR_MCONTEXT0x7a8
> >
> >   /* Debug Mode Registers */
> >   #define CSR_DCSR0x7b0
> > @@ -905,4 +906,10 @@ typedef enum RISCVException {
> >   /* JVT CSR bits */
> >   #define JVT_MODE   0x3F
> >   #define JVT_BASE   (~0x3F)
> > +
> > +/* Debug Sdtrig CSR masks */
> > +#define MCONTEXT32 0x003F
> > +#define MCONTEXT64 0x1FFFULL
> > +#define MCONTEXT32_HCONTEXT 0x007F
> > +#define MCONTEXT64_HCONTEXT 0x3FFFULL
> >   #endif
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index fde7ce1..ff1e128 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -3900,6 +3900,31 @@ static RISCVException read_tinfo(CPURISCVState *env, int csrno,
> >   return RISCV_EXCP_NONE;
> >   }
> >
> > +static RISCVException read_mcontext(CPURISCVState *env, int csrno,
> > +target_ulong *val) {
> > +*val = env->mcontext;
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> > +static RISCVException write_mcontext(CPURISCVState *env, int csrno,
> > + target_ulong val) {
> > +bool rv32 = riscv_cpu_mxl(env) == MXL_RV32 ? true : false;
> > +int32_t mask;
> > +
> > +if (riscv_has_ext(env, RVH)) {
> > +/* Spec suggest 7-bit for RV32 and 14-bit for RV64 w/ H extension */
> > +mask = rv32 ? MCONTEXT32_HCONTEXT : MCONTEXT64_HCONTEXT;
> > +} else {
> > +/* Spec suggest 6-bit for RV32 and 13-bit for RV64 w/o H extension */
> > +mask = rv32 ? MCONTEXT32 : MCONTEXT64;
> > +}
> > +
> > +env->mcontext = val & mask;
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> >   /*
> >* Functions to access Pointer Masking feature registers
> >* We have to check if current priv lvl could modify
> > @@ -4794,11 +4819,12 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
> >   [CSR_PMPADDR15] =  { "pmpaddr15", pmp, read_pmpaddr, write_pmpaddr },
> >
> >   /* Debug CSRs */
> > -[CSR_TSELECT]   =  { "tselect", debug, read_tselect, write_tselect },
> > -[CSR_TDATA1]    =  { "tdata1",  debug, read_tdata,   write_tdata   },
> > -[CSR_TDATA2]    =  { "tdata2",  debug, read_tdata,   write_tdata   },
> > -[CSR_TDATA3]    =  { "tdata3",  debug, read_tdata,   write_tdata   },
> > -[CSR_TINFO]     =  { "tinfo",   debug, read_tinfo,   write_ignore  },
> > +[CSR_TSELECT]   =  
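
The mask selection in the quoted write_mcontext() can be summarised with a
small Python model. This is a sketch for illustration only: the function
names are hypothetical, and the widths (6/13 bits without the H extension,
7/14 bits with it) are taken from the comments in the quoted patch.

```python
def mcontext_mask(rv32: bool, has_h: bool) -> int:
    """Return the writable-bit mask for mcontext, per the quoted comments."""
    if has_h:
        width = 7 if rv32 else 14
    else:
        width = 6 if rv32 else 13
    return (1 << width) - 1


def write_mcontext(val: int, rv32: bool, has_h: bool) -> int:
    """Model of the write path: bits outside the mask are dropped."""
    return val & mcontext_mask(rv32, has_h)


for rv32 in (True, False):
    for has_h in (False, True):
        print(rv32, has_h, hex(mcontext_mask(rv32, has_h)))
```

The four masks this produces are 0x3f, 0x7f, 0x1fff and 0x3fff, matching the
MCONTEXT32, MCONTEXT32_HCONTEXT, MCONTEXT64 and MCONTEXT64_HCONTEXT values in
the patch.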

Re: [PATCH 14/19] qapi/schema: assert QAPISchemaVariants are QAPISchemaObjectType

2024-01-09 Thread John Snow
On Thu, Nov 23, 2023 at 8:51 AM Markus Armbruster  wrote:
>
> John Snow  writes:
>
> > I'm actually not too sure about this one, it seems to hold up at runtime
> > but instead of lying and coming up with an elaborate ruse as a commit
> > message I'm just going to admit I just cribbed my own notes from the
> > last time I typed schema.py and I no longer remember why or if this is
> > correct.
> >
> > Cool!
> >
> > With more seriousness, variants are only guaranteed to house a
> > QAPISchemaType as per the definition of QAPISchemaObjectTypeMember but
> > the only classes/types that have a check_clash method are descendents of
> > QAPISchemaMember and the QAPISchemaVariants class itself.
> >
> > Signed-off-by: John Snow 
> > ---
> >  scripts/qapi/schema.py | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> > index 476b19aed61..ce5b01b3182 100644
> > --- a/scripts/qapi/schema.py
> > +++ b/scripts/qapi/schema.py
> > @@ -717,6 +717,7 @@ def check_clash(self, info, seen):
> >  for v in self.variants:
> >  # Reset seen map for each variant, since qapi names from one
> >  # branch do not affect another branch
> > > +assert isinstance(v.type, QAPISchemaObjectType)  # I think, anyway?
> >  v.type.check_clash(info, dict(seen))
>
> Have a look at .check() right above:
>
>def check(
>self, schema: QAPISchema, seen: Dict[str, QAPISchemaMember]
>) -> None:
>[...]
>if not self.variants:
>raise QAPISemError(self.info, "union has no branches")
>for v in self.variants:
>v.check(schema)
># Union names must match enum values; alternate names are
># checked separately. Use 'seen' to tell the two apart.
>if seen:
>if v.name not in self.tag_member.type.member_names():
>raise QAPISemError(
>self.info,
>"branch '%s' is not a value of %s"
>% (v.name, self.tag_member.type.describe()))
> --->   if not isinstance(v.type, QAPISchemaObjectType):
> --->   raise QAPISemError(
>self.info,
>"%s cannot use %s"
>% (v.describe(self.info), v.type.describe()))
>v.type.check(schema)
>
> Since .check() runs before .check_clash(), your assertion holds.
>
> Clearer now?
>

OK, I think this just needs a better commit message and comment, then.
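
For the record, the relationship Markus describes (a user-facing error in
check(), then an assertion in check_clash() that only restates the invariant)
looks roughly like this. The classes below are hypothetical miniatures, not
the real schema.py types.

```python
from typing import Dict, List


class SchemaType:
    """Hypothetical stand-in for QAPISchemaType."""


class ObjectType(SchemaType):
    """Hypothetical stand-in for QAPISchemaObjectType."""

    def check_clash(self, seen: Dict[str, object]) -> None:
        seen["visited"] = True


class Variant:
    def __init__(self, type_: SchemaType) -> None:
        self.type = type_


def check(variants: List[Variant]) -> None:
    # User-facing validation: reject anything that is not an object type.
    for v in variants:
        if not isinstance(v.type, ObjectType):
            raise ValueError("variant cannot use a non-object type")


def check_clash(variants: List[Variant], seen: Dict[str, object]) -> None:
    for v in variants:
        # check() runs first and already rejected other types, so this
        # assert documents the invariant and lets mypy narrow v.type.
        assert isinstance(v.type, ObjectType)
        v.type.check_clash(dict(seen))
```

A comment along the lines of "guaranteed by check()" next to the assertion
would probably serve a drive-by reader better than "I think, anyway?".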

--js




Re: [PATCH 15/19] qapi/parser: demote QAPIExpression to Dict[str, Any]

2024-01-09 Thread John Snow
On Thu, Nov 23, 2023 at 9:12 AM Markus Armbruster  wrote:
>
> John Snow  writes:
>
> > Dict[str, object] is a stricter type, but with the way that code is
> > currently arranged, it is infeasible to enforce this strictness.
> >
> > In particular, although expr.py's entire raison d'être is normalization
> > and type-checking of QAPI Expressions, that type information is not
> > "remembered" in any meaningful way by mypy because each individual
> > expression is not downcast to a specific expression type that holds all
> > the details of each expression's unique form.
> >
> > As a result, all of the code in schema.py that deals with actually
> > creating type-safe specialized structures has no guarantee (myopically)
> > that the data it is being passed is correct.
> >
> > There are two ways to solve this:
> >
> > (1) Re-assert that the incoming data is in the shape we expect it to be, or
> > (2) Disable type checking for this data.
> >
> > (1) is appealing to my sense of strictness, but I gotta concede that it
> > is asinine to re-check the shape of a QAPIExpression in schema.py when
> > expr.py has just completed that work at length. The duplication of code
> > and the nightmare thought of needing to update both locations if and
> > when we change the shape of these structures makes me extremely
> > reluctant to go down this route.
> >
> > (2) allows us the chance to miss updating types in the case that types
> > are updated in expr.py, but it *is* an awful lot simpler and,
> > importantly, gets us closer to type checking schema.py *at
> > all*. Something is better than nothing, I'd argue.
> >
> > So, do the simpler dumber thing and worry about future strictness
> > improvements later.
>
> Yes.
>

(You were right, again.)

> While Dict[str, object] is stricter than Dict[str, Any], both are miles
> away from the actual, recursive type.
>
> > Signed-off-by: John Snow 
> > ---
> >  scripts/qapi/parser.py | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> > index bf31018aef0..b7f08cf36f2 100644
> > --- a/scripts/qapi/parser.py
> > +++ b/scripts/qapi/parser.py
> > @@ -19,6 +19,7 @@
> >  import re
> >  from typing import (
> >  TYPE_CHECKING,
> > +Any,
> >  Dict,
> >  List,
> >  Mapping,
> > @@ -43,7 +44,7 @@
> >  _ExprValue = Union[List[object], Dict[str, object], str, bool]
> >
> >
> > -class QAPIExpression(Dict[str, object]):
> > +class QAPIExpression(Dict[str, Any]):
> >  # pylint: disable=too-few-public-methods
> >  def __init__(self,
> >   data: Mapping[str, object],
>
> There are several occurrences of Dict[str, object] elsewhere.  Would your
> argument for dumbing down QAPIExpression apply to (some of) them, too?

When and if they piss me off, sure. I'm just wary of making the types
too permissive because it can obscure typing errors; using Any
really disables any further checks and might lead to false confidence
in the static checker. I still have a weird grudge against Any and
would like to fully eliminate it from any statically checked Python
code, but it's just not always feasible and I have to admit that "good
enough" is good enough. Doesn't have me running to lessen the
strictness in areas that didn't cause me pain, though...
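
A small illustration of the difference, for anyone skimming the thread. The
function names are made up for this example. With object, mypy insists on
narrowing before the value is used; with Any, the same code type-checks
without narrowing and a wrong value only shows up at runtime.

```python
from typing import Any, Dict


def name_len_object(expr: Dict[str, object]) -> int:
    name = expr["name"]
    # With object, mypy rejects len(name) until the type is narrowed.
    assert isinstance(name, str)
    return len(name)


def name_len_any(expr: Dict[str, Any]) -> int:
    # With Any, mypy accepts this as-is; nothing is verified statically.
    return len(expr["name"])


print(name_len_object({"name": "foo"}))
print(name_len_any({"name": "foo"}))
```

Both behave the same on well-formed input; the cost of Any is that the
checker stays silent when the input is not what we assumed.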

> Skimming them, I found this in introspect.py:
>
> # These types are based on structures defined in QEMU's schema, so we
> # lack precise types for them here. Python 3.6 does not offer
> # TypedDict constructs, so they are broadly typed here as simple
> # Python Dicts.
> SchemaInfo = Dict[str, object]
> SchemaInfoEnumMember = Dict[str, object]
> SchemaInfoObject = Dict[str, object]
> SchemaInfoObjectVariant = Dict[str, object]
> SchemaInfoObjectMember = Dict[str, object]
> SchemaInfoCommand = Dict[str, object]
>
> Can we do better now we have 3.8?

A little bit, but it involves reproducing these types -- which are
ultimately meant to represent QAPI types defined in introspect.json --
with "redundant" type info. i.e. I have to reproduce the existing type
definitions in Python-ese, and then we have the maintenance burden of
making sure they match.
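
As a sketch of what that redundant definition might look like for one of the
aliases, assuming the enum member has a mandatory 'name' and an optional
'features' list (that field layout is my assumption here and would have to be
kept in sync with introspect.json by hand):

```python
from typing import List, TypedDict


class _SchemaInfoEnumMemberRequired(TypedDict):
    # Keys that must always be present.
    name: str


class SchemaInfoEnumMember(_SchemaInfoEnumMemberRequired, total=False):
    # Keys that may be omitted.
    features: List[str]


member: SchemaInfoEnumMember = {"name": "on"}
print(member)
```

At runtime these are still plain dicts, so the generator code would not need
to change; only mypy sees the extra structure.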

Maybe too much work to come up with a crazy dynamic definition thing
where we take the QAPI definition and build Python types from them ...
without some pretty interesting work to avoid the Ouroboros that'd
result. introspect.py wants static types based on types defined
dynamically by the schema definition; but we are not guaranteed to
have a suitable schema with these types at all. I'm not sure how to
express this kind of dependency without some interesting re-work. This
is a rare circumstance of the QAPI generator relying on the contents
of the Schema to provide static type assistance.

Now, I COULD do it statically, since I don't expect these types to
change much, but I'm wary of how quickly it might get out of hand
trying to achieve better type specificity.


Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-09 Thread Hao Xiang
On Tue, Jan 9, 2024 at 2:13 PM Gregory Price  wrote:
>
> On Tue, Jan 09, 2024 at 01:27:28PM -0800, Hao Xiang wrote:
> > On Tue, Jan 9, 2024 at 11:58 AM Gregory Price
> >  wrote:
> > >
> > > If you drop this line:
> > >
> > > -numa node,memdev=vmem0,nodeid=1
> >
> > We tried this as well and it works after going through the cxlcli
> > process and created the devdax device. The problem is that without the
> > "nodeid=1" configuration, we cannot connect with the explicit per numa
> > node latency/bandwidth configuration "-numa hmat-lb". I glanced at the
> > code in hw/numa.c, parse_numa_hmat_lb() looks like the one passing the
> > lb information to VM's hmat.
> >
>
> Yeah, this is what Jonathan was saying - right now there isn't a good
> way (in QEMU) to pass the hmat/cdat stuff down through the device.
> Needs to be plumbed out.
>
> In the meantime: You should just straight up drop the cxl device from
> your QEMU config.  It doesn't actually get you anything.
>
> > From what I understand so far, the guest kernel will dynamically
> > create a numa node after a cxl devdax device is created. That means we
> > don't know the numa node until after VM boot. 2. QEMU can only
> > statically parse the lb information to the VM at boot time. How do we
> > connect these two things?
>
> during boot, the kernel discovers all the memory regions exposed to
> bios. In this qemu configuration you have defined:
>
> region 0: CPU + DRAM node
> region 1: DRAM only node
> region 2: CXL Fixed Memory Window (the last line of the cxl stuff)
>
> The kernel reads this information on boot and reserves 1 numa node for
> each of these regions.
>
> The kernel then automatically brings up regions 0 and 1 in nodes 0 and 1
> respectively.
>
> Node2 sits dormant until you go through the cxl-cli startup sequence.
>
>
> What you're asking for is for the QEMU team to plumb hmat/cdat
> information down through the type3 device.  I *think* that is presently
> possible with a custom CDAT file - but Jonathan probably has more
> details on that.  You'll have to go digging for answers on that one.

I think this is exactly what I was looking for. When we started with
the idea of having an explicit CXL memory backend, we wanted to
1) Bind a virtual CXL device to an actual CXL memory node on host.
2) Pass the latency/bandwidth information from the CXL backend into
the virtual CXL device.
I didn't have a concrete idea of how to do 2)
With the discussion here, I learned that the information is passed
from CDAT. Just looked into the virtual CXL code and found that
ct3_build_cdat_entries_for_mr() is the function that builds this
information. But the latency and bandwidth there are currently
hard-coded. I think it makes sense to have an explicit CXL memory
backend where QEMU can query the CXL memory attributes from the host
and pass that information from the CXL backend into the virtual CXL
type-3 device.

>
>
> Now - even if you did that - the current state of the cxl-type3 device
> is *not what you want* because your memory accesses will be routed
> through the read/write functions in the emulated device.
>
> What Jonathan and I discussed on the other thread is how you might go
> about slimming this down to allow pass-through of the memory without the
> need for all the fluff.  This is a non-trivial refactor of the existing
> device, so i would not expect that any time soon.
>
> At the end of the day, quickest way to get-there-from-here is to just
> drop the cxl related lines from your QEMU config, and keep everything
> else.

Agreed. We need the kernel to be capable of reading the memory
attributes from HMAT and establishing the correct memory tier for
system-DRAM (on a CPUless numa node). Currently system-DRAM is assumed
to always be fast tier.

>
> >
> > Assuming that the same issue applies to a physical server with CXL.
> > Were you able to see a host kernel getting the correct lb information
> > for a CXL devdax device?
> >
>
> Yes, if you bring up a CXL device via cxl-cli on real hardware, the
> subsequent numa node ends up in the "lower tier" of the memory-tiering
> infrastructure.
>
> ~Gregory



[PATCH 0/2] target/s390x: Fix LAE setting a wrong access register

2024-01-09 Thread Ilya Leoshkevich
Hi,

Ido has noticed that LAE sets a wrong access register and proposed a
fix. This series fixes the issue and adds a test.

Best regards,
Ilya

Ilya Leoshkevich (2):
  target/s390x: Fix LAE setting a wrong access register
  tests/tcg/s390x: Test LOAD ADDRESS EXTENDED

 target/s390x/tcg/translate.c|  3 ++-
 tests/tcg/s390x/Makefile.target |  1 +
 tests/tcg/s390x/lae.c   | 25 +
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/s390x/lae.c

-- 
2.43.0




[PATCH 2/2] tests/tcg/s390x: Test LOAD ADDRESS EXTENDED

2024-01-09 Thread Ilya Leoshkevich
Add a small test to prevent regressions. Userspace runs in primary
mode, so LAE should always set the access register to 0.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/s390x/Makefile.target |  1 +
 tests/tcg/s390x/lae.c   | 25 +
 2 files changed, 26 insertions(+)
 create mode 100644 tests/tcg/s390x/lae.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 0e670f3f8b9..30994dcf9c2 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -44,6 +44,7 @@ TESTS+=clgebr
 TESTS+=clc
 TESTS+=laalg
 TESTS+=add-logical-with-carry
+TESTS+=lae
 
 cdsg: CFLAGS+=-pthread
 cdsg: LDFLAGS+=-pthread
diff --git a/tests/tcg/s390x/lae.c b/tests/tcg/s390x/lae.c
new file mode 100644
index 000..661e95f9978
--- /dev/null
+++ b/tests/tcg/s390x/lae.c
@@ -0,0 +1,25 @@
+/*
+ * Test the LOAD ADDRESS EXTENDED instruction.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <assert.h>
+#include <stdlib.h>
+
+int main(void)
+{
+unsigned long long ar = -1, b2 = 100000, r, x2 = 500;
+int tmp;
+
+asm("ear %[tmp],%[r]\n"
+"lae %[r],42(%[x2],%[b2])\n"
+"ear %[ar],%[r]\n"
+"sar %[r],%[tmp]"
+: [tmp] "=&r" (tmp), [r] "=&r" (r), [ar] "+r" (ar)
+: [b2] "r" (b2), [x2] "r" (x2)
+: "memory");
+assert(ar == 0xffffffff00000000ULL);
+assert(r == 100542);
+
+return EXIT_SUCCESS;
+}
-- 
2.43.0
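
The values the test asserts follow from simple arithmetic, modelled here in
Python as a sanity check. The helper names are hypothetical. The base value
of 100000 is inferred from the asserted result (100542 = 42 + 500 + base),
and the access-register expectation assumes that EAR replaces only the low
32 bits of the 64-bit destination register.

```python
MASK64 = (1 << 64) - 1


def lae_address(displacement: int, index: int, base: int) -> int:
    """Model the effective address: displacement + index + base."""
    return (displacement + index + base) & MASK64


def ear(reg64: int, access_reg32: int) -> int:
    """Model EXTRACT ACCESS: replace only the low 32 bits of the register."""
    return (reg64 & 0xFFFFFFFF00000000) | (access_reg32 & 0xFFFFFFFF)


print(lae_address(42, 500, 100000))
print(hex(ear(MASK64, 0)))
```

The first value is the address the test expects in r; the second is what ar
holds when it starts as -1 and the access register was set to 0.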




[PATCH 1/2] target/s390x: Fix LAE setting a wrong access register

2024-01-09 Thread Ilya Leoshkevich
LAE should set the access register corresponding to the first operand;
instead, it always modifies access register 1.

Co-developed-by: Ido Plat 
Cc: qemu-sta...@nongnu.org
Fixes: a1c7610a6879 ("target-s390x: implement LAY and LAEY instructions")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/translate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 62ab2be8b12..8df00b7df9f 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -3221,6 +3221,7 @@ static DisasJumpType op_mov2e(DisasContext *s, DisasOps *o)
 {
 int b2 = get_field(s, b2);
 TCGv ar1 = tcg_temp_new_i64();
+int r1 = get_field(s, r1);
 
 o->out = o->in2;
 o->in2 = NULL;
@@ -3244,7 +3245,7 @@ static DisasJumpType op_mov2e(DisasContext *s, DisasOps *o)
 break;
 }
 
-tcg_gen_st32_i64(ar1, tcg_env, offsetof(CPUS390XState, aregs[1]));
+tcg_gen_st32_i64(ar1, tcg_env, offsetof(CPUS390XState, aregs[r1]));
 return DISAS_NEXT;
 }
 
-- 
2.43.0




[PATCH v2 2/3] tests/tcg: Factor out gdbstub test functions

2024-01-09 Thread Ilya Leoshkevich
Both the report() function and the initial gdbstub test sequence
are copy-pasted into ~10 files with slight modifications. This
indicates that they are indeed generic, so factor them out. While
at it, add a few newlines to make the formatting closer to PEP-8.
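
For the shared module to be importable, GDB's embedded Python has to find
test_gdbstub on its import path; run-test.py does this by appending its
own directory to PYTHONPATH. A minimal sketch of that step follows. The
helper name extend_pythonpath is hypothetical, and filtering out empty
entries is a choice of this sketch rather than what the patch does.

```python
import os


def extend_pythonpath(env, extra_dir):
    """Return a copy of env with extra_dir appended to PYTHONPATH."""
    new_env = dict(env)
    # Splitting an unset or empty PYTHONPATH yields [""]; drop such
    # entries so the joined result has no leading separator.
    entries = [p for p in new_env.get("PYTHONPATH", "").split(os.pathsep) if p]
    entries.append(extra_dir)
    new_env["PYTHONPATH"] = os.pathsep.join(entries)
    return new_env


# The directory names here are placeholders.
print(extend_pythonpath({"PYTHONPATH": "/a"}, "/tests/guest-debug")["PYTHONPATH"])
```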

Signed-off-by: Ilya Leoshkevich 
---
 tests/guest-debug/run-test.py |  7 ++-
 tests/guest-debug/test_gdbstub.py | 56 +++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +--
 tests/tcg/aarch64/gdbstub/test-sve.py | 33 +--
 tests/tcg/multiarch/gdbstub/interrupt.py  | 47 ++--
 tests/tcg/multiarch/gdbstub/memory.py | 41 +-
 tests/tcg/multiarch/gdbstub/registers.py  | 41 ++
 tests/tcg/multiarch/gdbstub/sha1.py   | 40 ++---
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +---
 .../gdbstub/test-thread-breakpoint.py | 37 +---
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py   | 39 +
 13 files changed, 96 insertions(+), 397 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py

diff --git a/tests/guest-debug/run-test.py b/tests/guest-debug/run-test.py
index b13b27d4b19..368ff8a8903 100755
--- a/tests/guest-debug/run-test.py
+++ b/tests/guest-debug/run-test.py
@@ -97,7 +97,12 @@ def log(output, msg):
 sleep(1)
 log(output, "GDB CMD: %s" % (gdb_cmd))
 
-result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr)
+gdb_env = dict(os.environ)
+gdb_pythonpath = gdb_env.get("PYTHONPATH", "").split(os.pathsep)
+gdb_pythonpath.append(os.path.dirname(os.path.realpath(__file__)))
+gdb_env["PYTHONPATH"] = os.pathsep.join(gdb_pythonpath)
+result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr,
+ env=gdb_env)
 
 # A result of greater than 128 indicates a fatal signal (likely a
 # crash due to gdb internal failure). That's a problem for GDB and
diff --git a/tests/guest-debug/test_gdbstub.py b/tests/guest-debug/test_gdbstub.py
new file mode 100644
index 000..1bc4ed131f4
--- /dev/null
+++ b/tests/guest-debug/test_gdbstub.py
@@ -0,0 +1,56 @@
+"""Helper functions for gdbstub testing
+
+"""
+from __future__ import print_function
+import gdb
+import sys
+
+fail_count = 0
+
+
+def report(cond, msg):
+"""Report success/fail of a test"""
+if cond:
+print("PASS: {}".format(msg))
+else:
+print("FAIL: {}".format(msg))
+global fail_count
+fail_count += 1
+
+
+def main(test, expected_arch=None):
+"""Run a test function
+
+This runs as the script it sourced (via -x, via run-test.py)."""
+try:
+inferior = gdb.selected_inferior()
+arch = inferior.architecture()
+print("ATTACHED: {}".format(arch))
+if expected_arch is not None:
+report(arch.name() == expected_arch,
+   "connected to {}".format(expected_arch))
+except (gdb.error, AttributeError):
+print("SKIP: not connected")
+exit(0)
+
+if gdb.parse_and_eval("$pc") == 0:
+print("SKIP: PC not set")
+exit(0)
+
+try:
+test()
+except:
+print("GDB Exception: {}".format(sys.exc_info()[0]))
+global fail_count
+fail_count += 1
+import code
+code.InteractiveConsole(locals=globals()).interact()
+raise
+
+try:
+gdb.execute("kill")
+except gdb.error:
+pass
+
+print("All tests complete: {} failures".format(fail_count))
+exit(fail_count)
diff --git a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
index ee8d467e59d..a78a3a2514d 100644
--- a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
+++ b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
@@ -8,19 +8,10 @@
 #
 
 import gdb
-import sys
+from test_gdbstub import main, report
 
 initial_vlen = 0
-failcount = 0
 
-def report(cond, msg):
-"Report success/fail of test"
-if cond:
-print ("PASS: %s" % (msg))
-else:
-print ("FAIL: %s" % (msg))
-global failcount
-failcount += 1
 
 class TestBreakpoint(gdb.Breakpoint):
 def __init__(self, sym_name="__sve_ld_done"):
@@ -64,26 +55,5 @@ def run_test():
 
 gdb.execute("c")
 
-#
-# This runs as the script it sourced (via -x, via run-test.py)
-#
-try:
-inferior = gdb.selected_inferior()
-arch = inferior.architecture()
-report(arch.name() == "aarch64", "connected to aarch64")
-except (gdb.error, AttributeError):
-print("SKIPPING (not connected)", file=sys.stderr)
-exit(0)
-
-try:
-# Run the actual tests
-run_test()
-except:
-print ("GDB Exception: %s" % (sys.exc_info()[0]))
-failcount += 1
-import code
-code.InteractiveConsole(locals=globals()).interact()
-

[PATCH v2 0/3] linux-user: Allow gdbstub to ignore page protection

2024-01-09 Thread Ilya Leoshkevich
v1 -> v2: Use /proc/self/mem as a fallback. Handle TB invalidation
  (Richard).
  Test cross-page accesses.

RFC: https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02044.html
RFC -> v1: Use /proc/self/mem and accept that this will not work
   without /proc.
   Factor out a couple functions for gdbstub testing.
   Add a test.

Hi,

I've noticed that gdbstub behaves differently from gdbserver in that it
doesn't allow reading non-readable pages. This series improves the
situation by using the same mechanism as gdbserver: /proc/self/mem.

Best regards,
Ilya

Ilya Leoshkevich (3):
  linux-user: Allow gdbstub to ignore page protection
  tests/tcg: Factor out gdbstub test functions
  tests/tcg: Add the PROT_NONE gdbstub test

 cpu-target.c  | 76 +++
 tests/guest-debug/run-test.py |  7 +-
 tests/guest-debug/test_gdbstub.py | 56 ++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +
 tests/tcg/aarch64/gdbstub/test-sve.py | 33 +---
 tests/tcg/multiarch/Makefile.target   |  9 ++-
 tests/tcg/multiarch/gdbstub/interrupt.py  | 47 ++--
 tests/tcg/multiarch/gdbstub/memory.py | 41 +-
 tests/tcg/multiarch/gdbstub/prot-none.py  | 22 ++
 tests/tcg/multiarch/gdbstub/registers.py  | 41 ++
 tests/tcg/multiarch/gdbstub/sha1.py   | 40 ++
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +-
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +
 .../gdbstub/test-thread-breakpoint.py | 37 +
 tests/tcg/multiarch/prot-none.c   | 40 ++
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py   | 39 +-
 17 files changed, 227 insertions(+), 413 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

-- 
2.43.0




[PATCH v2 1/3] linux-user: Allow gdbstub to ignore page protection

2024-01-09 Thread Ilya Leoshkevich
gdbserver ignores page protection by virtue of using /proc/$pid/mem.
Teach qemu gdbstub to do this too. This will not work if /proc is not
mounted; accept this limitation.

One alternative is to temporarily grant the missing PROT_* bit, but
this is inherently racy. Another alternative is self-debugging with
ptrace(POKE), which will break if QEMU itself is being debugged - a
much more severe limitation.
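
The mechanism itself can be tried outside QEMU. The following is a
Linux-only sketch, not the patch's code: the function name
read_prot_none_page is made up, and it returns None if /proc/self/mem
cannot be opened or read (for example when /proc is not mounted).

```python
import ctypes
import mmap
import os

PROT_NONE = 0  # value of PROT_NONE on Linux; the mmap module does not export it


def read_prot_none_page():
    """Store b"42" in a page, make it PROT_NONE, read it via /proc/self/mem."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mprotect.restype = ctypes.c_int

    size = mmap.PAGESIZE
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE)
    buf[:2] = b"42"
    view = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(view)
    if libc.mprotect(addr, size, PROT_NONE) != 0:
        raise OSError(ctypes.get_errno(), "mprotect(PROT_NONE) failed")
    try:
        fd = os.open("/proc/self/mem", os.O_RDONLY)
        try:
            # A direct load from addr would fault here; pread() on
            # /proc/self/mem is not subject to the page protection.
            return os.pread(fd, 2, addr)
        finally:
            os.close(fd)
    except OSError:
        return None
    finally:
        # Restore access before releasing the mapping.
        libc.mprotect(addr, size, mmap.PROT_READ | mmap.PROT_WRITE)
        del view
        buf.close()


print(read_prot_none_page())
```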

Signed-off-by: Ilya Leoshkevich 
---
 cpu-target.c | 76 +---
 1 file changed, 61 insertions(+), 15 deletions(-)

diff --git a/cpu-target.c b/cpu-target.c
index 5eecd7ea2d7..723f6af5fba 100644
--- a/cpu-target.c
+++ b/cpu-target.c
@@ -406,6 +406,9 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 vaddr l, page;
 void * p;
 uint8_t *buf = ptr;
+ssize_t written;
+int ret = -1;
+int fd = -1;
 
 while (len > 0) {
 page = addr & TARGET_PAGE_MASK;
@@ -413,30 +416,73 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 if (l > len)
 l = len;
 flags = page_get_flags(page);
-if (!(flags & PAGE_VALID))
-return -1;
+if (!(flags & PAGE_VALID)) {
+goto out_close;
+}
 if (is_write) {
-if (!(flags & PAGE_WRITE))
-return -1;
-/* XXX: this code should not depend on lock_user */
-if (!(p = lock_user(VERIFY_WRITE, addr, l, 0)))
-return -1;
-memcpy(p, buf, l);
-unlock_user(p, addr, l);
-} else {
-if (!(flags & PAGE_READ))
-return -1;
+if (flags & PAGE_WRITE) {
+/* XXX: this code should not depend on lock_user */
+p = lock_user(VERIFY_WRITE, addr, l, 0);
+if (!p) {
+goto out_close;
+}
+memcpy(p, buf, l);
+unlock_user(p, addr, l);
+} else {
+/* Bypass the host page protection using ptrace. */
+if (fd == -1) {
+fd = open("/proc/self/mem", O_WRONLY);
+if (fd == -1) {
+goto out;
+}
+}
+/*
+ * If there is a TranslationBlock and we weren't bypassing the
+ * host page protection, the memcpy() above would SEGV,
+ * ultimately leading to page_unprotect(). So invalidate the
+ * translations manually. Both invalidation and pwrite() must
+ * be under mmap_lock() in order to prevent the creation of
+ * another TranslationBlock in between.
+ */
+mmap_lock();
+tb_invalidate_phys_range(addr, addr + l - 1);
+written = pwrite(fd, buf, l, (off_t)g2h_untagged(addr));
+mmap_unlock();
+if (written != l) {
+goto out_close;
+}
+}
+} else if (flags & PAGE_READ) {
 /* XXX: this code should not depend on lock_user */
-if (!(p = lock_user(VERIFY_READ, addr, l, 1)))
-return -1;
+p = lock_user(VERIFY_READ, addr, l, 1);
+if (!p) {
+goto out_close;
+}
 memcpy(buf, p, l);
 unlock_user(p, addr, 0);
+} else {
+/* Bypass the host page protection using ptrace. */
+if (fd == -1) {
+fd = open("/proc/self/mem", O_RDONLY);
+if (fd == -1) {
+goto out;
+}
+}
+if (pread(fd, buf, l, (off_t)g2h_untagged(addr)) != l) {
+goto out_close;
+}
 }
 len -= l;
 buf += l;
 addr += l;
 }
-return 0;
+ret = 0;
+out_close:
+if (fd != -1) {
+close(fd);
+}
+out:
+return ret;
 }
 #endif
 
-- 
2.43.0




[PATCH v2 3/3] tests/tcg: Add the PROT_NONE gdbstub test

2024-01-09 Thread Ilya Leoshkevich
Make sure that qemu gdbstub, like gdbserver, allows reading from and
writing to PROT_NONE pages.
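
The test places its string at the last byte of the first page, so "42"
plus its terminator straddles a page boundary. cpu_memory_rw_debug()
processes such an access one page at a time; the splitting can be
sketched as below. The function name split_by_page and the 4096-byte
default are assumptions of this sketch (the test itself queries the
page size with sysconf()).

```python
def split_by_page(addr, length, page_size=4096):
    """Split [addr, addr + length) into chunks that never cross a page."""
    chunks = []
    page_mask = ~(page_size - 1)
    while length > 0:
        page = addr & page_mask
        # Bytes left in the current page, clamped to what is still needed.
        chunk = min((page + page_size) - addr, length)
        chunks.append((addr, chunk))
        addr += chunk
        length -= chunk
    return chunks


# Three bytes starting at the last byte of page 0.
print(split_by_page(4095, 3))
```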

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/multiarch/Makefile.target  |  9 +-
 tests/tcg/multiarch/gdbstub/prot-none.py | 22 +
 tests/tcg/multiarch/prot-none.c  | 40 
 3 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

diff --git a/tests/tcg/multiarch/Makefile.target b/tests/tcg/multiarch/Makefile.target
index d31ba8d6ae4..315a2e13588 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -101,13 +101,20 @@ run-gdbstub-registers: sha512
--bin $< --test $(MULTIARCH_SRC)/gdbstub/registers.py, \
checking register enumeration)
 
+run-gdbstub-prot-none: prot-none
+   $(call run-test, $@, env PROT_NONE_PY=1 $(GDB_SCRIPT) \
+   --gdb $(GDB) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin $< --test $(MULTIARCH_SRC)/gdbstub/prot-none.py, \
+   accessing PROT_NONE memory)
+
 else
 run-gdbstub-%:
$(call skip-test, "gdbstub test $*", "need working gdb with $(patsubst -%,,$(TARGET_NAME)) support")
 endif
 EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
  run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \
- run-gdbstub-registers
+ run-gdbstub-registers run-gdbstub-prot-none
 
 # ARM Compatible Semi Hosting Tests
 #
diff --git a/tests/tcg/multiarch/gdbstub/prot-none.py b/tests/tcg/multiarch/gdbstub/prot-none.py
new file mode 100644
index 000..f1f1dd82cbe
--- /dev/null
+++ b/tests/tcg/multiarch/gdbstub/prot-none.py
@@ -0,0 +1,22 @@
+"""Test that GDB can access PROT_NONE pages.
+
+This runs as a sourced script (via -x, via run-test.py).
+
+SPDX-License-Identifier: GPL-2.0-or-later
+"""
+from test_gdbstub import main, report
+
+
+def run_test():
+"""Run through the tests one by one"""
+gdb.Breakpoint("break_here")
+gdb.execute("continue")
+val = gdb.parse_and_eval("*(char[2] *)q").string()
+report(val == "42", "{} == 42".format(val))
+gdb.execute("set *(char[3] *)q = \"24\"")
+gdb.execute("continue")
+exitcode = int(gdb.parse_and_eval("$_exitcode"))
+report(exitcode == 0, "{} == 0".format(exitcode))
+
+
+main(run_test)
diff --git a/tests/tcg/multiarch/prot-none.c b/tests/tcg/multiarch/prot-none.c
new file mode 100644
index 000..dc56aadb3c5
--- /dev/null
+++ b/tests/tcg/multiarch/prot-none.c
@@ -0,0 +1,40 @@
+/*
+ * Test that GDB can access PROT_NONE pages.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+void break_here(void *q)
+{
+}
+
+int main(void)
+{
+long pagesize = sysconf(_SC_PAGESIZE);
+void *p, *q;
+int err;
+
+p = mmap(NULL, pagesize * 2, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+assert(p != MAP_FAILED);
+q = p + pagesize - 1;
+strcpy(q, "42");
+
+err = mprotect(p, pagesize * 2, PROT_NONE);
+assert(err == 0);
+
+break_here(q);
+
+err = mprotect(p, pagesize * 2, PROT_READ);
+assert(err == 0);
+if (getenv("PROT_NONE_PY")) {
+assert(strcmp(q, "24") == 0);
+}
+
+return EXIT_SUCCESS;
+}
-- 
2.43.0




Re: [PATCH] hw/i386/pc_piix: Make piix_intx_routing_notifier_xen() more device independent

2024-01-09 Thread Bernhard Beschow



On 9 January 2024 08:51:37 UTC, David Woodhouse wrote:
>On Mon, 2024-01-08 at 00:16 +0100, Bernhard Beschow wrote:
>> This is a follow-up on commit 89965db43cce "hw/isa/piix3: Avoid Xen-specific
>> variant of piix3_write_config()" which introduced
>> piix_intx_routing_notifier_xen(). This function is implemented in board code
>> but accesses the PCI configuration space of the PIIX ISA function to
>> determine the PCI interrupt routes. Avoid this by reusing
>> pci_device_route_intx_to_irq() which makes piix_intx_routing_notifier_xen()
>> more device-agnostic.
>> 
>> One remaining improvement would be making piix_intx_routing_notifier_xen()
>> agnostic towards the number of PCI interrupt routes and move it to xen-hvm.
>> This might be useful for possible Q35 Xen efforts but remains a future
>> exercise for now.
>> 
>> Signed-off-by: Bernhard Beschow 
>
>I'm still moderately unhappy that all this code is written with the
>apparent assumption that there is only *one* IRQ# which is the target
>for a given INTx, when in fact it gets routed to that pin# on the
>legacy PIC and a potentially *different* pin# on the I/O APIC.

Would TYPE_SPLIT_IRQ help in any way?

>
>But you aren't making that worse, so
>
>Reviewed-by: David Woodhouse 

Thanks!



[PATCH v9 02/10] hw/fsi: Introduce IBM's FSI Bus

2024-01-09 Thread Ninad Palsule
This is part of the patchset that introduces the FSI bus.

The FSI bus is a simple bus to which the FSI master is attached.

[ clg: - removed include/hw/fsi/engine-scratchpad.h and
 hw/fsi/engine-scratchpad.c
   - dropped FSI_SCRATCHPAD
   - included FSIBus definition
   - dropped hw/fsi/trace-events changes ]

Signed-off-by: Andrew Jeffery 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
 include/hw/fsi/fsi.h | 19 +++
 hw/fsi/fsi.c | 22 ++
 hw/fsi/Kconfig   |  3 +++
 hw/fsi/meson.build   |  1 +
 4 files changed, 45 insertions(+)
 create mode 100644 include/hw/fsi/fsi.h
 create mode 100644 hw/fsi/fsi.c

diff --git a/include/hw/fsi/fsi.h b/include/hw/fsi/fsi.h
new file mode 100644
index 00..a75e3e5bdc
--- /dev/null
+++ b/include/hw/fsi/fsi.h
@@ -0,0 +1,19 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface
+ */
+#ifndef FSI_FSI_H
+#define FSI_FSI_H
+
+#include "hw/qdev-core.h"
+
+#define TYPE_FSI_BUS "fsi.bus"
+OBJECT_DECLARE_SIMPLE_TYPE(FSIBus, FSI_BUS)
+
+typedef struct FSIBus {
+BusState bus;
+} FSIBus;
+
+#endif
diff --git a/hw/fsi/fsi.c b/hw/fsi/fsi.c
new file mode 100644
index 00..8dca472bc3
--- /dev/null
+++ b/hw/fsi/fsi.c
@@ -0,0 +1,22 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface
+ */
+#include "qemu/osdep.h"
+
+#include "hw/fsi/fsi.h"
+
+static const TypeInfo fsi_bus_info = {
+.name = TYPE_FSI_BUS,
+.parent = TYPE_BUS,
+.instance_size = sizeof(FSIBus),
+};
+
+static void fsi_bus_register_types(void)
+{
+type_register_static(&fsi_bus_info);
+}
+
+type_init(fsi_bus_register_types);
diff --git a/hw/fsi/Kconfig b/hw/fsi/Kconfig
index e650c660f0..f4869c209f 100644
--- a/hw/fsi/Kconfig
+++ b/hw/fsi/Kconfig
@@ -1,2 +1,5 @@
+config FSI
+bool
+
 config FSI_LBUS
 bool
diff --git a/hw/fsi/meson.build b/hw/fsi/meson.build
index 4074d3a7d2..487fb31cbc 100644
--- a/hw/fsi/meson.build
+++ b/hw/fsi/meson.build
@@ -1 +1,2 @@
 system_ss.add(when: 'CONFIG_FSI_LBUS', if_true: files('lbus.c'))
+system_ss.add(when: 'CONFIG_FSI', if_true: files('fsi.c'))
-- 
2.39.2




[PATCH v9 01/10] hw/fsi: Introduce IBM's Local bus

2024-01-09 Thread Ninad Palsule
This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The LBUS is modelled to maintain mapped memory for the devices. The
memory is mapped after CFAM config, peek table and FSI slave registers.
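
As a quick check of the sizes involved: lbus_init() in this patch sizes
the bus memory region as FSI_LBUS_MEM_REGION_SIZE minus
FSI_LBUSDEV_IOMEM_START. The sketch below just redoes that arithmetic;
the constants are copied from include/hw/fsi/lbus.h and the function
name lbus_window_size is hypothetical.

```python
MiB = 1024 * 1024

FSI_LBUS_MEM_REGION_SIZE = 1 * MiB
FSI_LBUSDEV_IOMEM_START = 0xC00  # 3K used by CFAM config etc.


def lbus_window_size(region_size=FSI_LBUS_MEM_REGION_SIZE,
                     iomem_start=FSI_LBUSDEV_IOMEM_START):
    """Bytes left for LBUS devices after the reserved 3K at the start."""
    return region_size - iomem_start


print(hex(lbus_window_size()))
```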

[ clg: - removed lbus_add_device() bc unused
   - removed lbus_create_device() bc used only once
   - removed "address" property
   - updated meson.build to build fsi dir
   - included an empty hw/fsi/trace-events ]

Signed-off-by: Andrew Jeffery 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
v9:
  - Changed LBUS memory region to 1MB.
---
 meson.build   |  1 +
 hw/fsi/trace.h|  1 +
 include/hw/fsi/lbus.h | 41 ++
 hw/fsi/lbus.c | 51 +++
 hw/Kconfig|  1 +
 hw/fsi/Kconfig|  2 ++
 hw/fsi/meson.build|  1 +
 hw/fsi/trace-events   |  1 +
 hw/meson.build|  1 +
 9 files changed, 100 insertions(+)
 create mode 100644 hw/fsi/trace.h
 create mode 100644 include/hw/fsi/lbus.h
 create mode 100644 hw/fsi/lbus.c
 create mode 100644 hw/fsi/Kconfig
 create mode 100644 hw/fsi/meson.build
 create mode 100644 hw/fsi/trace-events

diff --git a/meson.build b/meson.build
index 371edafae6..498d08b866 100644
--- a/meson.build
+++ b/meson.build
@@ -3273,6 +3273,7 @@ if have_system
 'hw/char',
 'hw/display',
 'hw/dma',
+'hw/fsi',
 'hw/hyperv',
 'hw/i2c',
 'hw/i386',
diff --git a/hw/fsi/trace.h b/hw/fsi/trace.h
new file mode 100644
index 00..ee67c7fb04
--- /dev/null
+++ b/hw/fsi/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-hw_fsi.h"
diff --git a/include/hw/fsi/lbus.h b/include/hw/fsi/lbus.h
new file mode 100644
index 00..44d74d8b6a
--- /dev/null
+++ b/include/hw/fsi/lbus.h
@@ -0,0 +1,41 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus and connected device structures.
+ */
+#ifndef FSI_LBUS_H
+#define FSI_LBUS_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+#include "qemu/units.h"
+
+#define TYPE_FSI_LBUS_DEVICE "fsi.lbus.device"
+OBJECT_DECLARE_TYPE(FSILBusDevice, FSILBusDeviceClass, FSI_LBUS_DEVICE)
+
+#define FSI_LBUS_MEM_REGION_SIZE  (1 * MiB)
+#define FSI_LBUSDEV_IOMEM_START   0xc00 /* 3K used by CFAM config etc */
+
+typedef struct FSILBusDevice {
+DeviceState parent;
+
+MemoryRegion iomem;
+} FSILBusDevice;
+
+typedef struct FSILBusDeviceClass {
+DeviceClass parent;
+
+uint32_t config;
+} FSILBusDeviceClass;
+
+#define TYPE_FSI_LBUS "fsi.lbus"
+OBJECT_DECLARE_SIMPLE_TYPE(FSILBus, FSI_LBUS)
+
+typedef struct FSILBus {
+BusState bus;
+
+MemoryRegion mr;
+} FSILBus;
+
+#endif /* FSI_LBUS_H */
diff --git a/hw/fsi/lbus.c b/hw/fsi/lbus.c
new file mode 100644
index 00..84c46a00d7
--- /dev/null
+++ b/hw/fsi/lbus.c
@@ -0,0 +1,51 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus where FSI slaves are connected
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/fsi/lbus.h"
+
+#include "hw/qdev-properties.h"
+
+static void lbus_init(Object *o)
+{
+FSILBus *lbus = FSI_LBUS(o);
+
+memory_region_init(&lbus->mr, OBJECT(lbus), TYPE_FSI_LBUS,
+   FSI_LBUS_MEM_REGION_SIZE - FSI_LBUSDEV_IOMEM_START);
+}
+
+static const TypeInfo lbus_info = {
+.name = TYPE_FSI_LBUS,
+.parent = TYPE_BUS,
+.instance_init = lbus_init,
+.instance_size = sizeof(FSILBus),
+};
+
+static void lbus_device_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->bus_type = TYPE_FSI_LBUS;
+}
+
+static const TypeInfo lbus_device_type_info = {
+.name = TYPE_FSI_LBUS_DEVICE,
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(FSILBusDevice),
+.abstract = true,
+.class_init = lbus_device_class_init,
+.class_size = sizeof(FSILBusDeviceClass),
+};
+
+static void lbus_register_types(void)
+{
+type_register_static(&lbus_info);
+type_register_static(&lbus_device_type_info);
+}
+
+type_init(lbus_register_types);
diff --git a/hw/Kconfig b/hw/Kconfig
index 9ca7b38c31..2c00936c28 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -9,6 +9,7 @@ source core/Kconfig
 source cxl/Kconfig
 source display/Kconfig
 source dma/Kconfig
+source fsi/Kconfig
 source gpio/Kconfig
 source hyperv/Kconfig
 source i2c/Kconfig
diff --git a/hw/fsi/Kconfig b/hw/fsi/Kconfig
new file mode 100644
index 00..e650c660f0
--- /dev/null
+++ b/hw/fsi/Kconfig
@@ -0,0 +1,2 @@
+config FSI_LBUS
+bool
diff --git a/hw/fsi/meson.build b/hw/fsi/meson.build
new file mode 100644
index 00..4074d3a7d2
--- /dev/null
+++ b/hw/fsi/meson.build
@@ -0,0 +1 @@
+system_ss.add(when: 'CONFIG_FSI_LBUS', if_true: files('lbus.c'))
diff --git a/hw/fsi/trace-events b/hw/fsi/trace-events
new file mode 100644
index 00..8b13789179
--- /dev/null
+++ b/hw/fsi/trace-events
@@ -0,0 +1 @@
+
diff --git a/hw/meson.build b/hw/meson.build

[PATCH v9 00/10] Introduce model for IBM's FSI

2024-01-09 Thread Ninad Palsule
Hello,

Please review the patch-set version 9.
I have incorporated review comments from Cedric.

Ninad Palsule (10):
  hw/fsi: Introduce IBM's Local bus
  hw/fsi: Introduce IBM's FSI Bus
  hw/fsi: Introduce IBM's cfam,fsi-slave,scratchpad
  hw/fsi: IBM's On-chip Peripheral Bus
  hw/fsi: Introduce IBM's FSI master
  hw/fsi: Aspeed APB2OPB interface
  hw/arm: Hook up FSI module in AST2600
  hw/fsi: Added qtest
  hw/fsi: Added FSI documentation
  hw/fsi: Update MAINTAINER list

 MAINTAINERS |   8 +
 docs/specs/fsi.rst  | 138 ++
 docs/specs/index.rst|   1 +
 meson.build |   1 +
 hw/fsi/trace.h  |   1 +
 include/hw/arm/aspeed_soc.h |   4 +
 include/hw/fsi/aspeed-apb2opb.h |  34 
 include/hw/fsi/cfam.h   |  46 +
 include/hw/fsi/fsi-master.h |  32 
 include/hw/fsi/fsi-slave.h  |  27 +++
 include/hw/fsi/fsi.h|  24 +++
 include/hw/fsi/lbus.h   |  41 +
 include/hw/fsi/opb.h|  27 +++
 hw/arm/aspeed_ast2600.c |  19 ++
 hw/fsi/aspeed-apb2opb.c | 314 
 hw/fsi/cfam.c   | 253 +
 hw/fsi/fsi-master.c | 173 ++
 hw/fsi/fsi-slave.c  | 101 ++
 hw/fsi/fsi.c|  22 +++
 hw/fsi/lbus.c   |  51 ++
 hw/fsi/opb.c|  36 
 tests/qtest/aspeed-fsi-test.c   | 205 +
 hw/Kconfig  |   1 +
 hw/arm/Kconfig  |   1 +
 hw/fsi/Kconfig  |  21 +++
 hw/fsi/meson.build  |   5 +
 hw/fsi/trace-events |  13 ++
 hw/meson.build  |   1 +
 tests/qtest/meson.build |   1 +
 29 files changed, 1601 insertions(+)
 create mode 100644 docs/specs/fsi.rst
 create mode 100644 hw/fsi/trace.h
 create mode 100644 include/hw/fsi/aspeed-apb2opb.h
 create mode 100644 include/hw/fsi/cfam.h
 create mode 100644 include/hw/fsi/fsi-master.h
 create mode 100644 include/hw/fsi/fsi-slave.h
 create mode 100644 include/hw/fsi/fsi.h
 create mode 100644 include/hw/fsi/lbus.h
 create mode 100644 include/hw/fsi/opb.h
 create mode 100644 hw/fsi/aspeed-apb2opb.c
 create mode 100644 hw/fsi/cfam.c
 create mode 100644 hw/fsi/fsi-master.c
 create mode 100644 hw/fsi/fsi-slave.c
 create mode 100644 hw/fsi/fsi.c
 create mode 100644 hw/fsi/lbus.c
 create mode 100644 hw/fsi/opb.c
 create mode 100644 tests/qtest/aspeed-fsi-test.c
 create mode 100644 hw/fsi/Kconfig
 create mode 100644 hw/fsi/meson.build
 create mode 100644 hw/fsi/trace-events

-- 
2.39.2




[PATCH v9 03/10] hw/fsi: Introduce IBM's cfam,fsi-slave,scratchpad

2024-01-09 Thread Ninad Palsule
This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The Common FRU Access Macro (CFAM) is an address space containing
various "engines" that drive accesses on buses internal and external
to the POWER chip. Examples include the SBEFIFO and I2C masters. The
engines hang off an internal Local Bus (LBUS), which is described
by the CFAM configuration block.

The FSI slave: The slave is the terminal point of the FSI bus for
FSI symbols addressed to it. Slaves can be cascaded off of one
another. The slave's configuration registers appear in the address
space of the CFAM to which it is attached.

The scratchpad provides a set of non-functional registers. Firmware
is free to use them; the hardware provides no special management
support for them. The scratchpad registers can be read or written from
the LBUS slave. The scratchpad is managed under the FSI CFAM state.
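
The config values in this patch rely on BE_BIT(), which fsi.h defines
as BIT(31 - (x)), i.e. bit 0 is the most significant bit of a 32-bit
word. A small sketch of that numbering follows; the Python names are
illustrative only, and the scratchpad type value is the one defined in
hw/fsi/cfam.c.

```python
def be_bit(x, width=32):
    """Mask for bit x when bit 0 is the most significant bit of the word."""
    if not 0 <= x < width:
        raise ValueError("bit index out of range")
    return 1 << (width - 1 - x)


# Engine type for the scratchpad, as (0x06 << 4) in cfam.c.
ENGINE_CONFIG_TYPE_SCRATCHPAD = 0x06 << 4

print(hex(be_bit(0)), hex(be_bit(31)), hex(ENGINE_CONFIG_TYPE_SCRATCHPAD))
```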

[ clg: - moved object FSIScratchPad under FSICFAMState
   - moved FSIScratchPad code under cfam.c
   - introduced fsi_cfam_instance_init()
   - reworked fsi_cfam_realize() ]

Signed-off-by: Andrew Jeffery 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
v9:
  - Added more registers to scratchpad
  - Removed unnecessary address space
  - Removed unnecessary header file
  - Defined macros for config values.
  - Cleaned up cfam config read.
---
 include/hw/fsi/cfam.h  |  46 +++
 include/hw/fsi/fsi-slave.h |  27 
 include/hw/fsi/fsi.h   |   5 +
 hw/fsi/cfam.c  | 253 +
 hw/fsi/fsi-slave.c | 101 +++
 hw/fsi/Kconfig |   8 ++
 hw/fsi/meson.build |   3 +-
 hw/fsi/trace-events|  10 +-
 8 files changed, 451 insertions(+), 2 deletions(-)
 create mode 100644 include/hw/fsi/cfam.h
 create mode 100644 include/hw/fsi/fsi-slave.h
 create mode 100644 hw/fsi/cfam.c
 create mode 100644 hw/fsi/fsi-slave.c

diff --git a/include/hw/fsi/cfam.h b/include/hw/fsi/cfam.h
new file mode 100644
index 00..147bc13156
--- /dev/null
+++ b/include/hw/fsi/cfam.h
@@ -0,0 +1,46 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Common FRU Access Macro
+ */
+#ifndef FSI_CFAM_H
+#define FSI_CFAM_H
+
+#include "exec/memory.h"
+
+#include "hw/fsi/fsi-slave.h"
+#include "hw/fsi/lbus.h"
+
+
+#define TYPE_FSI_SCRATCHPAD "fsi.scratchpad"
+#define SCRATCHPAD(obj) OBJECT_CHECK(FSIScratchPad, (obj), TYPE_FSI_SCRATCHPAD)
+
+#define FSI_SCRATCHPAD_NR_REGS 4
+
+typedef struct FSIScratchPad {
+FSILBusDevice parent;
+
+uint32_t reg[FSI_SCRATCHPAD_NR_REGS];
+} FSIScratchPad;
+
+#define TYPE_FSI_CFAM "cfam"
+#define FSI_CFAM(obj) OBJECT_CHECK(FSICFAMState, (obj), TYPE_FSI_CFAM)
+
+/* P9-ism */
+#define CFAM_CONFIG_NR_REGS 0x28
+
+typedef struct FSICFAMState {
+/* < private > */
+FSISlaveState parent;
+
+/* CFAM config address space */
+MemoryRegion config_iomem;
+
+MemoryRegion mr;
+
+FSILBus lbus;
+FSIScratchPad scratchpad;
+} FSICFAMState;
+
+#endif /* FSI_CFAM_H */
diff --git a/include/hw/fsi/fsi-slave.h b/include/hw/fsi/fsi-slave.h
new file mode 100644
index 00..6fc15a15a0
--- /dev/null
+++ b/include/hw/fsi/fsi-slave.h
@@ -0,0 +1,27 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface slave
+ */
+#ifndef FSI_FSI_SLAVE_H
+#define FSI_FSI_SLAVE_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#include "hw/fsi/lbus.h"
+
+#define TYPE_FSI_SLAVE "fsi.slave"
+OBJECT_DECLARE_SIMPLE_TYPE(FSISlaveState, FSI_SLAVE)
+
+#define FSI_SLAVE_CONTROL_NR_REGS ((0x40 >> 2) + 1)
+
+typedef struct FSISlaveState {
+DeviceState parent;
+
+MemoryRegion iomem;
+uint32_t regs[FSI_SLAVE_CONTROL_NR_REGS];
+} FSISlaveState;
+
+#endif /* FSI_FSI_H */
diff --git a/include/hw/fsi/fsi.h b/include/hw/fsi/fsi.h
index a75e3e5bdc..af39f9b4ad 100644
--- a/include/hw/fsi/fsi.h
+++ b/include/hw/fsi/fsi.h
@@ -8,6 +8,11 @@
 #define FSI_FSI_H
 
 #include "hw/qdev-core.h"
+#include "qemu/bitops.h"
+
+/* Bitwise operations at the word level. */
+#define BE_BIT(x)   BIT(31 - (x))
+#define BE_GENMASK(hb, lb)  MAKE_64BIT_MASK((lb), ((hb) - (lb) + 1))
 
 #define TYPE_FSI_BUS "fsi.bus"
 OBJECT_DECLARE_SIMPLE_TYPE(FSIBus, FSI_BUS)
diff --git a/hw/fsi/cfam.c b/hw/fsi/cfam.c
new file mode 100644
index 00..2ad7087102
--- /dev/null
+++ b/hw/fsi/cfam.c
@@ -0,0 +1,253 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Common FRU Access Macro
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+
+#include "qapi/error.h"
+#include "trace.h"
+
+#include "hw/fsi/cfam.h"
+#include "hw/fsi/fsi.h"
+
+#include "hw/qdev-properties.h"
+
+#define ENGINE_CONFIG_NEXT            BE_BIT(0)
+#define ENGINE_CONFIG_TYPE_PEEK       (0x02 << 4)
+#define ENGINE_CONFIG_TYPE_FSI        (0x03 << 4)
+#define ENGINE_CONFIG_TYPE_SCRATCHPAD (0x06 << 4)
+
+/* 

[PATCH v9 09/10] hw/fsi: Added FSI documentation

2024-01-09 Thread Ninad Palsule
Documentation for IBM FSI model.

Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
 docs/specs/fsi.rst   | 138 +++
 docs/specs/index.rst |   1 +
 2 files changed, 139 insertions(+)
 create mode 100644 docs/specs/fsi.rst

diff --git a/docs/specs/fsi.rst b/docs/specs/fsi.rst
new file mode 100644
index 00..05a6b6347a
--- /dev/null
+++ b/docs/specs/fsi.rst
@@ -0,0 +1,138 @@
+==
+IBM's Flexible Service Interface (FSI)
+==
+
+The QEMU FSI emulation implements hardware interfaces between ASPEED SOC, FSI
+master/slave and the end engine.
+
+FSI is a point-to-point two wire interface which is capable of supporting
+distances of up to 4 meters. FSI interfaces have been used successfully for
+many years in IBM servers to attach IBM Flexible Support Processors (FSP) to
+CPUs and IBM ASICs.
+
+FSI allows a service processor access to the internal buses of a host POWER
+processor to perform configuration or debugging. FSI has long existed in POWER
+processors and so comes with some baggage, including how it has been integrated
+into the ASPEED SoC.
+
+Working backwards from the POWER processor, the fundamental pieces of interest
+for the implementation are: (see the `FSI specification`_ for more details)
+
+1. The Common FRU Access Macro (CFAM), an address space containing various
+   "engines" that drive accesses on buses internal and external to the POWER
+   chip. Examples include the SBEFIFO and I2C masters. The engines hang off of
+   an internal Local Bus (LBUS) which is described by the CFAM configuration
+   block.
+
+2. The FSI slave: The slave is the terminal point of the FSI bus for FSI
+   symbols addressed to it. Slaves can be cascaded off of one another. The
+   slave's configuration registers appear in address space of the CFAM to
+   which it is attached.
+
+3. The FSI master: A controller in the platform service processor (e.g. BMC)
+   driving CFAM engine accesses into the POWER chip. At the hardware level
+   FSI is a bit-based protocol supporting synchronous and DMA-driven accesses
+   of engines in a CFAM.
+
+4. The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in POWER
+   processors. This now makes an appearance in the ASPEED SoC due to tight
+   integration of the FSI master IP with the OPB, mainly the existence of an
+   MMIO-mapping of the CFAM address straight onto a sub-region of the OPB
+   address space.
+
+5. An APB-to-OPB bridge enabling access to the OPB from the ARM core in the
+   AST2600. Hardware limitations prevent the OPB from being directly mapped
+   into APB, so all accesses are indirect through the bridge.
+
+The LBUS is modelled to maintain the qdev bus hierarchy and to take advantage
+of the object model to automatically generate the CFAM configuration block.
+The configuration block presents engines in the order they are attached to the
+CFAM's LBUS. Engine implementations should subclass the LBusDevice and set the
+'config' member of LBusDeviceClass to match the engine's type.
+
+CFAM designs offer a lot of flexibility, for instance it is possible for a
+CFAM to be simultaneously driven from multiple FSI links. The modeling is not
+so complete; it's assumed that each CFAM is attached to a single FSI slave (as
+a consequence the CFAM subclasses the FSI slave).
+
+As for FSI, its symbols and wire-protocol are not modelled at all. This is not
+necessary to get FSI off the ground thanks to the mapping of the CFAM address
+space onto the OPB address space - the models follow this directly and map the
+CFAM memory region into the OPB's memory region.
+
+QEMU files related to FSI interface:
+ - ``hw/fsi/aspeed-apb2opb.c``
+ - ``include/hw/fsi/aspeed-apb2opb.h``
+ - ``hw/fsi/opb.c``
+ - ``include/hw/fsi/opb.h``
+ - ``hw/fsi/fsi.c``
+ - ``include/hw/fsi/fsi.h``
+ - ``hw/fsi/fsi-master.c``
+ - ``include/hw/fsi/fsi-master.h``
+ - ``hw/fsi/fsi-slave.c``
+ - ``include/hw/fsi/fsi-slave.h``
+ - ``hw/fsi/cfam.c``
+ - ``include/hw/fsi/cfam.h``
+ - ``hw/fsi/engine-scratchpad.c``
+ - ``include/hw/fsi/engine-scratchpad.h``
+ - ``include/hw/fsi/lbus.h``
+
+The following command starts the rainier machine with the built-in FSI model.
+There are no model-specific arguments.
+
+.. code-block:: console
+
+  qemu-system-arm -M rainier-bmc -nographic \
+  -kernel fitImage-linux.bin \
+  -dtb aspeed-bmc-ibm-rainier.dtb \
+  -initrd obmc-phosphor-initramfs.rootfs.cpio.xz \
+  -drive file=obmc-phosphor-image.rootfs.wic.qcow2,if=sd,index=2 \
+  -append "rootwait console=ttyS4,115200n8 root=PARTLABEL=rofs-a"
+
+The implementation appears as follows in the QEMU device tree:
+
+.. code-block:: console
+
+  (qemu) info qtree
+  bus: main-system-bus
+    type System
+    ...
+    dev: aspeed.apb2opb, id ""
+      gpio-out "sysbus-irq" 1
+      mmio 1e79b000/1000
+      bus: opb.1
+        type opb
+        dev: fsi.master, id ""
+  

[PATCH v9 05/10] hw/fsi: Introduce IBM's FSI master

2024-01-09 Thread Ninad Palsule
This is a part of patchset where IBM's Flexible Service Interface is
introduced.

This commit models the FSI master, which is a bus controller. The CFAM hangs
off the FSI master.

The FSI master: A controller in the platform service processor (e.g.
BMC) driving CFAM engine accesses into the POWER chip. At the
hardware level FSI is a bit-based protocol supporting synchronous and
DMA-driven accesses of engines in a CFAM.
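
As a rough illustration of the register model in the diff below, here is a
standalone sketch (not the patch's code) of how the port-enable register
MENP0 and its set/clear aliases MSENP0 and MCENP0 behave. The DemoMaster
type and the demo_* helpers are hypothetical names, and the "store the value"
default case is an assumption of this sketch; only the offsets, the TO_REG()
conversion and the set/clear semantics are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TO_REG(x)   ((x) >> 2)            /* byte offset -> 32-bit register index */
#define NR_REGS     (TO_REG(0x2e0) + 1)   /* same sizing rule as FSI_MASTER_NR_REGS */

#define REG_MENP0   TO_REG(0x010)         /* port enable, written directly */
#define REG_MSENP0  TO_REG(0x018)         /* write-1-to-set alias of MENP0 */
#define REG_MCENP0  TO_REG(0x020)         /* write-1-to-clear alias of MENP0 */

/* Hypothetical stand-in for FSIMasterState: just the register file. */
typedef struct {
    uint32_t regs[NR_REGS];
} DemoMaster;

/* Returns 0 on success, -1 when the offset is outside the register file. */
static int demo_master_write(DemoMaster *m, uint32_t addr, uint32_t data)
{
    uint32_t reg = TO_REG(addr);

    assert(m != NULL);
    if (reg >= NR_REGS) {
        return -1;
    }

    switch (reg) {
    case REG_MSENP0:
        m->regs[REG_MENP0] |= data;       /* set the requested enable bits */
        break;
    case REG_MCENP0:
        m->regs[REG_MENP0] &= ~data;      /* clear the requested enable bits */
        break;
    default:
        m->regs[reg] = data;              /* assumption: plain storage */
        break;
    }
    return 0;
}

/* Out-of-bounds reads return 0, as in the patch's read handler. */
static uint32_t demo_master_read(const DemoMaster *m, uint32_t addr)
{
    uint32_t reg = TO_REG(addr);

    assert(m != NULL);
    return reg < NR_REGS ? m->regs[reg] : 0;
}
```

With this sketch, writing 0x0f to offset 0x010, then 0xf0 to 0x018, then 0x0f
to 0x020 leaves MENP0 holding 0xf0, because the aliases only ever modify the
MENP0 slot.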

[ clg: - move FSICFAMState object under FSIMasterState
   - introduced fsi_master_init()
   - reworked fsi_master_realize()
   - dropped FSIBus definition ]

Signed-off-by: Andrew Jeffery 
Reviewed-by: Joel Stanley 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
v9:
  - Initialized registers.
  - Fixed the address check.
---
 include/hw/fsi/fsi-master.h |  32 +++
 hw/fsi/fsi-master.c | 173 
 hw/fsi/meson.build  |   2 +-
 hw/fsi/trace-events |   2 +
 4 files changed, 208 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/fsi/fsi-master.h
 create mode 100644 hw/fsi/fsi-master.c

diff --git a/include/hw/fsi/fsi-master.h b/include/hw/fsi/fsi-master.h
new file mode 100644
index 00..3830869877
--- /dev/null
+++ b/include/hw/fsi/fsi-master.h
@@ -0,0 +1,32 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2019 IBM Corp.
+ *
+ * IBM Flexible Service Interface Master
+ */
+#ifndef FSI_FSI_MASTER_H
+#define FSI_FSI_MASTER_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+#include "hw/fsi/fsi.h"
+#include "hw/fsi/cfam.h"
+
+#define TYPE_FSI_MASTER "fsi.master"
+OBJECT_DECLARE_SIMPLE_TYPE(FSIMasterState, FSI_MASTER)
+
+#define FSI_MASTER_NR_REGS ((0x2e0 >> 2) + 1)
+
+typedef struct FSIMasterState {
+    DeviceState parent;
+    MemoryRegion iomem;
+    MemoryRegion opb2fsi;
+
+    FSIBus bus;
+
+    uint32_t regs[FSI_MASTER_NR_REGS];
+    FSICFAMState cfam;
+} FSIMasterState;
+
+
+#endif /* FSI_FSI_MASTER_H */
diff --git a/hw/fsi/fsi-master.c b/hw/fsi/fsi-master.c
new file mode 100644
index 00..939de5927f
--- /dev/null
+++ b/hw/fsi/fsi-master.c
@@ -0,0 +1,173 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface master
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "trace.h"
+
+#include "hw/fsi/fsi-master.h"
+
+#define TYPE_OP_BUS "opb"
+
+#define TO_REG(x)   ((x) >> 2)
+
+#define FSI_MENP0   TO_REG(0x010)
+#define FSI_MENP32  TO_REG(0x014)
+#define FSI_MSENP0  TO_REG(0x018)
+#define FSI_MLEVP0  TO_REG(0x018)
+#define FSI_MSENP32 TO_REG(0x01c)
+#define FSI_MLEVP32 TO_REG(0x01c)
+#define FSI_MCENP0  TO_REG(0x020)
+#define FSI_MREFP0  TO_REG(0x020)
+#define FSI_MCENP32 TO_REG(0x024)
+#define FSI_MREFP32 TO_REG(0x024)
+
+#define FSI_MVER    TO_REG(0x074)
+#define FSI_MRESP0  TO_REG(0x0d0)
+
+#define FSI_MRESB0  TO_REG(0x1d0)
+#define   FSI_MRESB0_RESET_GENERAL  BE_BIT(0)
+#define   FSI_MRESB0_RESET_ERROR    BE_BIT(1)
+
+static uint64_t fsi_master_read(void *opaque, hwaddr addr, unsigned size)
+{
+    FSIMasterState *s = FSI_MASTER(opaque);
+    int reg = TO_REG(addr);
+
+    trace_fsi_master_read(addr, size);
+
+    if (reg >= FSI_MASTER_NR_REGS) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Out of bounds read: 0x%"HWADDR_PRIx" for %u\n",
+                      __func__, addr, size);
+        return 0;
+    }
+
+    return s->regs[reg];
+}
+
+static void fsi_master_write(void *opaque, hwaddr addr, uint64_t data,
+                             unsigned size)
+{
+    FSIMasterState *s = FSI_MASTER(opaque);
+    int reg = TO_REG(addr);
+
+    trace_fsi_master_write(addr, size, data);
+
+    if (reg >= FSI_MASTER_NR_REGS) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Out of bounds write: 0x%"HWADDR_PRIx" for %u\n",
+                      __func__, addr, size);
+        return;
+    }
+
+    switch (reg) {
+    case FSI_MENP0:
+        s->regs[FSI_MENP0] = data;
+        break;
+    case FSI_MENP32:
+        s->regs[FSI_MENP32] = data;
+        break;
+    case FSI_MSENP0:
+        s->regs[FSI_MENP0] |= data;
+        break;
+    case FSI_MSENP32:
+        s->regs[FSI_MENP32] |= data;
+        break;
+    case FSI_MCENP0:
+        s->regs[FSI_MENP0] &= ~data;
+        break;
+    case FSI_MCENP32:
+        s->regs[FSI_MENP32] &= ~data;
+        break;
+    case FSI_MRESP0:
+        /* Perform necessary resets; leave register 0 to indicate no errors */
+        break;
+    case FSI_MRESB0:
+if (data 

[PATCH v9 10/10] hw/fsi: Update MAINTAINER list

2024-01-09 Thread Ninad Palsule
Added maintainer for IBM FSI model

Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 00ec1f7eca..79f97a3fb9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3569,6 +3569,14 @@ F: tests/qtest/adm1272-test.c
 F: tests/qtest/max34451-test.c
 F: tests/qtest/isl_pmbus_vr-test.c
 
+FSI
+M: Ninad Palsule 
+S: Maintained
+F: hw/fsi/*
+F: include/hw/fsi/*
+F: docs/specs/fsi.rst
+F: tests/qtest/fsi-test.c
+
 Firmware schema specifications
 M: Philippe Mathieu-Daudé 
 R: Daniel P. Berrange 
-- 
2.39.2




[PATCH v9 08/10] hw/fsi: Added qtest

2024-01-09 Thread Ninad Palsule
Added basic qtests for FSI model.

Acked-by: Thomas Huth 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
 tests/qtest/aspeed-fsi-test.c | 205 ++
 tests/qtest/meson.build   |   1 +
 2 files changed, 206 insertions(+)
 create mode 100644 tests/qtest/aspeed-fsi-test.c

diff --git a/tests/qtest/aspeed-fsi-test.c b/tests/qtest/aspeed-fsi-test.c
new file mode 100644
index 00..b3020dd821
--- /dev/null
+++ b/tests/qtest/aspeed-fsi-test.c
@@ -0,0 +1,205 @@
+/*
+ * QTest testcases for IBM's Flexible Service Interface (FSI)
+ *
+ * Copyright (c) 2023 IBM Corporation
+ *
+ * Authors:
+ *   Ninad Palsule 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+
+#include "qemu/module.h"
+#include "libqtest-single.h"
+
+/* Registers from ast2600 specifications */
+#define ASPEED_FSI_ENGINER_TRIGGER   0x04
+#define ASPEED_FSI_OPB0_BUS_SELECT   0x10
+#define ASPEED_FSI_OPB1_BUS_SELECT   0x28
+#define ASPEED_FSI_OPB0_RW_DIRECTION 0x14
+#define ASPEED_FSI_OPB1_RW_DIRECTION 0x2c
+#define ASPEED_FSI_OPB0_XFER_SIZE    0x18
+#define ASPEED_FSI_OPB1_XFER_SIZE    0x30
+#define ASPEED_FSI_OPB0_BUS_ADDR     0x1c
+#define ASPEED_FSI_OPB1_BUS_ADDR     0x34
+#define ASPEED_FSI_INTRRUPT_CLEAR    0x40
+#define ASPEED_FSI_INTRRUPT_STATUS   0x48
+#define ASPEED_FSI_OPB0_BUS_STATUS   0x80
+#define ASPEED_FSI_OPB1_BUS_STATUS   0x8c
+#define ASPEED_FSI_OPB0_READ_DATA    0x84
+#define ASPEED_FSI_OPB1_READ_DATA    0x90
+
+/*
+ * FSI Base addresses from the ast2600 specifications.
+ */
+#define AST2600_OPB_FSI0_BASE_ADDR 0x1e79b000
+#define AST2600_OPB_FSI1_BASE_ADDR 0x1e79b100
+
+static uint32_t aspeed_fsi_base_addr;
+
+static uint32_t aspeed_fsi_readl(QTestState *s, uint32_t reg)
+{
+    return qtest_readl(s, aspeed_fsi_base_addr + reg);
+}
+
+static void aspeed_fsi_writel(QTestState *s, uint32_t reg, uint32_t val)
+{
+    qtest_writel(s, aspeed_fsi_base_addr + reg, val);
+}
+
+/* Setup base address and select register */
+static void test_fsi_setup(QTestState *s, uint32_t base_addr)
+{
+    uint32_t curval;
+
+    aspeed_fsi_base_addr = base_addr;
+
+    /* Set the base select register */
+    if (base_addr == AST2600_OPB_FSI0_BASE_ADDR) {
+        /* Unselect FSI1 */
+        aspeed_fsi_writel(s, ASPEED_FSI_OPB1_BUS_SELECT, 0x0);
+        curval = aspeed_fsi_readl(s, ASPEED_FSI_OPB1_BUS_SELECT);
+        g_assert_cmpuint(curval, ==, 0x0);
+
+        /* Select FSI0 */
+        aspeed_fsi_writel(s, ASPEED_FSI_OPB0_BUS_SELECT, 0x1);
+        curval = aspeed_fsi_readl(s, ASPEED_FSI_OPB0_BUS_SELECT);
+        g_assert_cmpuint(curval, ==, 0x1);
+    } else if (base_addr == AST2600_OPB_FSI1_BASE_ADDR) {
+        /* Unselect FSI0 */
+        aspeed_fsi_writel(s, ASPEED_FSI_OPB0_BUS_SELECT, 0x0);
+        curval = aspeed_fsi_readl(s, ASPEED_FSI_OPB0_BUS_SELECT);
+        g_assert_cmpuint(curval, ==, 0x0);
+
+        /* Select FSI1 */
+        aspeed_fsi_writel(s, ASPEED_FSI_OPB1_BUS_SELECT, 0x1);
+        curval = aspeed_fsi_readl(s, ASPEED_FSI_OPB1_BUS_SELECT);
+        g_assert_cmpuint(curval, ==, 0x1);
+    } else {
+        g_assert_not_reached();
+    }
+}
+
+static void test_fsi_reg_change(QTestState *s, uint32_t reg, uint32_t newval)
+{
+    uint32_t base;
+    uint32_t curval;
+
+    base = aspeed_fsi_readl(s, reg);
+    aspeed_fsi_writel(s, reg, newval);
+    curval = aspeed_fsi_readl(s, reg);
+    g_assert_cmpuint(curval, ==, newval);
+    aspeed_fsi_writel(s, reg, base);
+    curval = aspeed_fsi_readl(s, reg);
+    g_assert_cmpuint(curval, ==, base);
+}
+
+static void test_fsi0_master_regs(const void *data)
+{
+    QTestState *s = (QTestState *)data;
+
+    test_fsi_setup(s, AST2600_OPB_FSI0_BASE_ADDR);
+
+    test_fsi_reg_change(s, ASPEED_FSI_OPB0_RW_DIRECTION, 0xF3F4F514);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB0_XFER_SIZE, 0xF3F4F518);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB0_BUS_ADDR, 0xF3F4F51c);
+    test_fsi_reg_change(s, ASPEED_FSI_INTRRUPT_CLEAR, 0xF3F4F540);
+    test_fsi_reg_change(s, ASPEED_FSI_INTRRUPT_STATUS, 0xF3F4F548);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB0_BUS_STATUS, 0xF3F4F580);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB0_READ_DATA, 0xF3F4F584);
+}
+
+static void test_fsi1_master_regs(const void *data)
+{
+    QTestState *s = (QTestState *)data;
+
+    test_fsi_setup(s, AST2600_OPB_FSI1_BASE_ADDR);
+
+    test_fsi_reg_change(s, ASPEED_FSI_OPB1_RW_DIRECTION, 0xF3F4F514);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB1_XFER_SIZE, 0xF3F4F518);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB1_BUS_ADDR, 0xF3F4F51c);
+    test_fsi_reg_change(s, ASPEED_FSI_INTRRUPT_CLEAR, 0xF3F4F540);
+    test_fsi_reg_change(s, ASPEED_FSI_INTRRUPT_STATUS, 0xF3F4F548);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB1_BUS_STATUS, 0xF3F4F580);
+    test_fsi_reg_change(s, ASPEED_FSI_OPB1_READ_DATA, 0xF3F4F584);
+}
+
+static void test_fsi0_getcfam_addr0(const void 

[PATCH v9 04/10] hw/fsi: IBM's On-chip Peripheral Bus

2024-01-09 Thread Ninad Palsule
This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
POWER processors. This now makes an appearance in the ASPEED SoC due
to tight integration of the FSI master IP with the OPB, mainly the
existence of an MMIO-mapping of the CFAM address straight onto a
sub-region of the OPB address space.

[ clg: - removed FSIMasterState object and fsi_opb_realize()
   - simplified OPBus ]

Signed-off-by: Andrew Jeffery 
Reviewed-by: Joel Stanley 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
v9:
  - Given a name to the opb memory region.
---
 include/hw/fsi/opb.h | 27 +++
 hw/fsi/opb.c | 36 
 hw/fsi/Kconfig   |  4 
 hw/fsi/meson.build   |  1 +
 4 files changed, 68 insertions(+)
 create mode 100644 include/hw/fsi/opb.h
 create mode 100644 hw/fsi/opb.c

diff --git a/include/hw/fsi/opb.h b/include/hw/fsi/opb.h
new file mode 100644
index 00..7a98f4b253
--- /dev/null
+++ b/include/hw/fsi/opb.h
@@ -0,0 +1,27 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM On-Chip Peripheral Bus
+ */
+#ifndef FSI_OPB_H
+#define FSI_OPB_H
+
+#include "exec/memory.h"
+#include "hw/fsi/fsi-master.h"
+
+#define TYPE_FSI_OPB "fsi.opb"
+
+#define TYPE_OP_BUS "opb"
+OBJECT_DECLARE_SIMPLE_TYPE(OPBus, OP_BUS)
+
+typedef struct OPBus {
+    /*< private >*/
+    BusState bus;
+
+    /*< public >*/
+    MemoryRegion mr;
+    AddressSpace as;
+} OPBus;
+
+#endif /* FSI_OPB_H */
diff --git a/hw/fsi/opb.c b/hw/fsi/opb.c
new file mode 100644
index 00..ec1bf57fee
--- /dev/null
+++ b/hw/fsi/opb.c
@@ -0,0 +1,36 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM On-chip Peripheral Bus
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qemu/log.h"
+
+#include "hw/fsi/opb.h"
+
+static void fsi_opb_init(Object *o)
+{
+    OPBus *opb = OP_BUS(o);
+
+    memory_region_init_io(&opb->mr, OBJECT(opb), NULL, opb,
+                          TYPE_FSI_OPB, UINT32_MAX);
+    address_space_init(&opb->as, &opb->mr, TYPE_FSI_OPB);
+}
+
+static const TypeInfo opb_info = {
+    .name = TYPE_OP_BUS,
+    .parent = TYPE_BUS,
+    .instance_init = fsi_opb_init,
+    .instance_size = sizeof(OPBus),
+};
+
+static void fsi_opb_register_types(void)
+{
+    type_register_static(&opb_info);
+}
+
+type_init(fsi_opb_register_types);
diff --git a/hw/fsi/Kconfig b/hw/fsi/Kconfig
index de1594a335..9755baa8cc 100644
--- a/hw/fsi/Kconfig
+++ b/hw/fsi/Kconfig
@@ -1,3 +1,7 @@
+config FSI_OPB
+    bool
+    select FSI_CFAM
+
 config FSI_CFAM
     bool
     select FSI
diff --git a/hw/fsi/meson.build b/hw/fsi/meson.build
index cafd009c6d..ba92881370 100644
--- a/hw/fsi/meson.build
+++ b/hw/fsi/meson.build
@@ -1,3 +1,4 @@
 system_ss.add(when: 'CONFIG_FSI_LBUS', if_true: files('lbus.c'))
 system_ss.add(when: 'CONFIG_FSI_CFAM', if_true: files('cfam.c'))
 system_ss.add(when: 'CONFIG_FSI', if_true: files('fsi.c','fsi-slave.c'))
+system_ss.add(when: 'CONFIG_FSI_OPB', if_true: files('opb.c'))
-- 
2.39.2




[PATCH v9 06/10] hw/fsi: Aspeed APB2OPB interface

2024-01-09 Thread Ninad Palsule
This is a part of patchset where IBM's Flexible Service Interface is
introduced.

An APB-to-OPB bridge enabling access to the OPB from the ARM core in
the AST2600. Hardware limitations prevent the OPB from being directly
mapped into APB, so all accesses are indirect through the bridge.
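
To make "indirect through the bridge" concrete, here is a toy, standalone
model of one access sequence: program direction and address, write the
trigger, then collect the result from the read-data register. The register
offsets are the ones defined in the diff below. Everything else (the
DemoBridge type, the demo_* helpers, the 64-word backing store, and the
decision to ignore the transfer-size register) is a hypothetical
simplification for illustration, not the behaviour of the patch or of the
hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TO_REG(x)              ((x) >> 2)
#define BR_TRIGGER             TO_REG(0x04)
#define BR_OPB0_MODE           TO_REG(0x14)
#define BR_MODE_RD             (1u << 0)
#define BR_OPB0_ADDR           TO_REG(0x1c)
#define BR_OPB0_WRITE_DATA     TO_REG(0x20)
#define BR_IRQ_STS             TO_REG(0x48)
#define BR_IRQ_OPB0_TX_ACK     (1u << 16)
#define BR_OPB0_READ_DATA      TO_REG(0x84)
#define BR_NR_REGS             (TO_REG(0xe8) + 1)
#define OPB_WORDS              64u

typedef struct {
    uint32_t regs[BR_NR_REGS];  /* bridge registers, visible on the APB side */
    uint32_t opb[OPB_WORDS];    /* toy stand-in for the OPB address space */
} DemoBridge;

static uint32_t demo_bridge_read(const DemoBridge *b, uint32_t off)
{
    assert(b != NULL && TO_REG(off) < BR_NR_REGS);
    return b->regs[TO_REG(off)];
}

static void demo_bridge_write(DemoBridge *b, uint32_t off, uint32_t val)
{
    assert(b != NULL && TO_REG(off) < BR_NR_REGS);
    b->regs[TO_REG(off)] = val;

    /* Writing 1 to the trigger runs the transfer that was programmed. */
    if (TO_REG(off) == BR_TRIGGER && (val & 1)) {
        /* Toy addressing: word index wraps inside the small backing store. */
        uint32_t word = TO_REG(b->regs[BR_OPB0_ADDR]) % OPB_WORDS;

        if (b->regs[BR_OPB0_MODE] & BR_MODE_RD) {
            b->regs[BR_OPB0_READ_DATA] = b->opb[word];
        } else {
            b->opb[word] = b->regs[BR_OPB0_WRITE_DATA];
        }
        b->regs[BR_IRQ_STS] |= BR_IRQ_OPB0_TX_ACK;
    }
}

/* Indirect 32-bit write to the OPB through the bridge registers. */
static void demo_opb_writel(DemoBridge *b, uint32_t addr, uint32_t val)
{
    demo_bridge_write(b, 0x14, 0);          /* direction: write */
    demo_bridge_write(b, 0x1c, addr);
    demo_bridge_write(b, 0x20, val);
    demo_bridge_write(b, 0x04, 1);          /* trigger */
}

/* Indirect 32-bit read from the OPB through the bridge registers. */
static uint32_t demo_opb_readl(DemoBridge *b, uint32_t addr)
{
    demo_bridge_write(b, 0x14, BR_MODE_RD); /* direction: read */
    demo_bridge_write(b, 0x1c, addr);
    demo_bridge_write(b, 0x04, 1);          /* trigger */
    return demo_bridge_read(b, 0x84);
}
```

The point of the sketch is only the shape of the sequence: the CPU never
dereferences an OPB address itself, it always goes through the address, data
and trigger registers.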

[ clg: - moved FSIMasterState under AspeedAPB2OPBState
   - modified fsi_opb_fsi_master_address() and
 fsi_opb_opb2fsi_address()
   - introduced fsi_aspeed_apb2opb_init()
   - reworked fsi_aspeed_apb2opb_realize() ]

Signed-off-by: Andrew Jeffery 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
---
v9:
  - Removed unused parameters from function.
  - Used qdev_realize() instead of qdev_realize_and_unref()
---
 include/hw/fsi/aspeed-apb2opb.h |  34 
 hw/fsi/aspeed-apb2opb.c | 314 
 hw/arm/Kconfig  |   1 +
 hw/fsi/Kconfig  |   4 +
 hw/fsi/meson.build  |   1 +
 hw/fsi/trace-events |   2 +
 6 files changed, 356 insertions(+)
 create mode 100644 include/hw/fsi/aspeed-apb2opb.h
 create mode 100644 hw/fsi/aspeed-apb2opb.c

diff --git a/include/hw/fsi/aspeed-apb2opb.h b/include/hw/fsi/aspeed-apb2opb.h
new file mode 100644
index 00..c51fbeda9f
--- /dev/null
+++ b/include/hw/fsi/aspeed-apb2opb.h
@@ -0,0 +1,34 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * ASPEED APB2OPB Bridge
+ */
+#ifndef FSI_ASPEED_APB2OPB_H
+#define FSI_ASPEED_APB2OPB_H
+
+#include "hw/sysbus.h"
+#include "hw/fsi/opb.h"
+
+#define TYPE_ASPEED_APB2OPB "aspeed.apb2opb"
+OBJECT_DECLARE_SIMPLE_TYPE(AspeedAPB2OPBState, ASPEED_APB2OPB)
+
+#define ASPEED_APB2OPB_NR_REGS ((0xe8 >> 2) + 1)
+
+#define ASPEED_FSI_NUM 2
+
+typedef struct AspeedAPB2OPBState {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
+    MemoryRegion iomem;
+
+    uint32_t regs[ASPEED_APB2OPB_NR_REGS];
+    qemu_irq irq;
+
+    OPBus opb[ASPEED_FSI_NUM];
+    FSIMasterState fsi[ASPEED_FSI_NUM];
+} AspeedAPB2OPBState;
+
+#endif /* FSI_ASPEED_APB2OPB_H */
diff --git a/hw/fsi/aspeed-apb2opb.c b/hw/fsi/aspeed-apb2opb.c
new file mode 100644
index 00..5beb9a772c
--- /dev/null
+++ b/hw/fsi/aspeed-apb2opb.c
@@ -0,0 +1,314 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * ASPEED APB-OPB FSI interface
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qom/object.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#include "hw/fsi/aspeed-apb2opb.h"
+#include "hw/qdev-core.h"
+
+#define TO_REG(x) ((x) >> 2)
+
+#define APB2OPB_VERSION                TO_REG(0x00)
+#define APB2OPB_TRIGGER                TO_REG(0x04)
+
+#define APB2OPB_CONTROL                TO_REG(0x08)
+#define   APB2OPB_CONTROL_OFF          BE_GENMASK(31, 13)
+
+#define APB2OPB_OPB2FSI                TO_REG(0x0c)
+#define   APB2OPB_OPB2FSI_OFF          BE_GENMASK(31, 22)
+
+#define APB2OPB_OPB0_SEL               TO_REG(0x10)
+#define APB2OPB_OPB1_SEL               TO_REG(0x28)
+#define   APB2OPB_OPB_SEL_EN           BIT(0)
+
+#define APB2OPB_OPB0_MODE              TO_REG(0x14)
+#define APB2OPB_OPB1_MODE              TO_REG(0x2c)
+#define   APB2OPB_OPB_MODE_RD          BIT(0)
+
+#define APB2OPB_OPB0_XFER              TO_REG(0x18)
+#define APB2OPB_OPB1_XFER              TO_REG(0x30)
+#define   APB2OPB_OPB_XFER_FULL        BIT(1)
+#define   APB2OPB_OPB_XFER_HALF        BIT(0)
+
+#define APB2OPB_OPB0_ADDR              TO_REG(0x1c)
+#define APB2OPB_OPB0_WRITE_DATA        TO_REG(0x20)
+
+#define APB2OPB_OPB1_ADDR              TO_REG(0x34)
+#define APB2OPB_OPB1_WRITE_DATA        TO_REG(0x38)
+
+#define APB2OPB_IRQ_STS                TO_REG(0x48)
+#define   APB2OPB_IRQ_STS_OPB1_TX_ACK  BIT(17)
+#define   APB2OPB_IRQ_STS_OPB0_TX_ACK  BIT(16)
+
+#define APB2OPB_OPB0_WRITE_WORD_ENDIAN TO_REG(0x4c)
+#define   APB2OPB_OPB0_WRITE_WORD_ENDIAN_BE 0x0011101b
+#define APB2OPB_OPB0_WRITE_BYTE_ENDIAN TO_REG(0x50)
+#define   APB2OPB_OPB0_WRITE_BYTE_ENDIAN_BE 0x0c330f3f
+#define APB2OPB_OPB1_WRITE_WORD_ENDIAN TO_REG(0x54)
+#define APB2OPB_OPB1_WRITE_BYTE_ENDIAN TO_REG(0x58)
+#define APB2OPB_OPB0_READ_BYTE_ENDIAN  TO_REG(0x5c)
+#define APB2OPB_OPB1_READ_BYTE_ENDIAN  TO_REG(0x60)
+#define   APB2OPB_OPB0_READ_WORD_ENDIAN_BE  0x00030b1b
+
+#define APB2OPB_OPB0_READ_DATA         TO_REG(0x84)
+#define APB2OPB_OPB1_READ_DATA         TO_REG(0x90)
+
+/*
+ * The following magic values come from the AST2600 data sheet.
+ * The register values are defined under section "FSI controller"
+ * as initial values.
+ */
+static const uint32_t aspeed_apb2opb_reset[ASPEED_APB2OPB_NR_REGS] = {
+     [APB2OPB_VERSION]                = 0x00a1,
+     [APB2OPB_OPB0_WRITE_WORD_ENDIAN] = 0x0044eee4,
+     [APB2OPB_OPB0_WRITE_BYTE_ENDIAN] = 0x0055aaff,
+     [APB2OPB_OPB1_WRITE_WORD_ENDIAN] = 0x00117717,
+ 

[PATCH v9 07/10] hw/arm: Hook up FSI module in AST2600

2024-01-09 Thread Ninad Palsule
This patchset introduces IBM's Flexible Service Interface (FSI).

Time for some fun with inter-processor buses. FSI allows a service
processor access to the internal buses of a host POWER processor to
perform configuration or debugging.

FSI has long existed in POWER processors and so comes with some baggage,
including how it has been integrated into the ASPEED SoC.

Working backwards from the POWER processor, the fundamental pieces of
interest for the implementation are:

1. The Common FRU Access Macro (CFAM), an address space containing
   various "engines" that drive accesses on buses internal and external
   to the POWER chip. Examples include the SBEFIFO and I2C masters. The
   engines hang off of an internal Local Bus (LBUS) which is described
   by the CFAM configuration block.

2. The FSI slave: The slave is the terminal point of the FSI bus for
   FSI symbols addressed to it. Slaves can be cascaded off of one
   another. The slave's configuration registers appear in address space
   of the CFAM to which it is attached.

3. The FSI master: A controller in the platform service processor (e.g.
   BMC) driving CFAM engine accesses into the POWER chip. At the
   hardware level FSI is a bit-based protocol supporting synchronous and
   DMA-driven accesses of engines in a CFAM.

4. The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
   POWER processors. This now makes an appearance in the ASPEED SoC due
   to tight integration of the FSI master IP with the OPB, mainly the
   existence of an MMIO-mapping of the CFAM address straight onto a
   sub-region of the OPB address space.

5. An APB-to-OPB bridge enabling access to the OPB from the ARM core in
   the AST2600. Hardware limitations prevent the OPB from being directly
   mapped into APB, so all accesses are indirect through the bridge.

The implementation appears as follows in the QEMU device tree:

(qemu) info qtree
bus: main-system-bus
  type System
  ...
  dev: aspeed.apb2opb, id ""
    gpio-out "sysbus-irq" 1
    mmio 1e79b000/1000
    bus: opb.1
      type opb
      dev: fsi.master, id ""
        bus: fsi.bus.1
          type fsi.bus
          dev: cfam.config, id ""
          dev: cfam, id ""
            bus: fsi.lbus.1
              type lbus
              dev: scratchpad, id ""
                address = 0 (0x0)
    bus: opb.0
      type opb
      dev: fsi.master, id ""
        bus: fsi.bus.0
          type fsi.bus
          dev: cfam.config, id ""
          dev: cfam, id ""
            bus: fsi.lbus.0
              type lbus
              dev: scratchpad, id ""
                address = 0 (0x0)

The LBUS is modelled to maintain the qdev bus hierarchy and to take
advantage of the object model to automatically generate the CFAM
configuration block. The configuration block presents engines in the
order they are attached to the CFAM's LBUS. Engine implementations
should subclass the LBusDevice and set the 'config' member of
LBusDeviceClass to match the engine's type.
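
A minimal standalone sketch of that idea, assuming (as in the reviewed v8
code) that engine entries start at config offset 0xc and that the table ends
with a zero word: the configuration block is not stored anywhere, it is
derived on each read from the engines in attachment order. DemoEngine and
demo_config_read() are hypothetical names; the fixed entries below 0xc are
left out and simply read as 0 here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical engine descriptor; 'config' plays the role of the
 * LBusDeviceClass 'config' member mentioned above. */
typedef struct {
    const char *name;
    uint32_t config;
} DemoEngine;

#define DEMO_ENGINES_BASE 0xcu   /* first engine entry in the config table */

/*
 * Return the config word at byte offset 'addr'. Engines appear in the order
 * they were attached; the slot after the last engine reads as 0, which
 * terminates the table.
 */
static uint32_t demo_config_read(const DemoEngine *engines, size_t n,
                                 uint32_t addr)
{
    size_t slot;

    assert(engines != NULL || n == 0);

    if (addr < DEMO_ENGINES_BASE || (addr & 3)) {
        return 0;                /* fixed entries and unaligned reads: omitted */
    }

    slot = (addr - DEMO_ENGINES_BASE) / 4;
    return slot < n ? engines[slot].config : 0;
}
```

Attaching a new engine then only means appending one descriptor; the table
offsets of the existing engines do not change.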

CFAM designs offer a lot of flexibility; for instance, it is possible for
a CFAM to be simultaneously driven from multiple FSI links. The modeling
is not so complete; it's assumed that each CFAM is attached to a single
FSI slave (as a consequence the CFAM subclasses the FSI slave).

As for FSI, its symbols and wire-protocol are not modelled at all. This
is not necessary to get FSI off the ground thanks to the mapping of the
CFAM address space onto the OPB address space - the models follow this
directly and map the CFAM memory region into the OPB's memory region.
Future work includes supporting more advanced accesses that drive the
FSI master directly rather than indirectly via the CFAM mapping, which
will require implementing the FSI state machine and methods for each of
the FSI symbols on the slave. Further down the track we can also look at
supporting the bitbanged SoftFSI drivers in Linux by extending the FSI
slave model to resolve sequences of GPIO IRQs into FSI symbols, and
calling the associated symbol method on the slave to map the access onto
the CFAM.

Testing:
Tested by reading cfam config address 0 on rainier machine type.

root@p10bmc:~# pdbg -a getcfam 0x0
p0: 0x0 = 0xc0022d15

Signed-off-by: Andrew Jeffery 
Signed-off-by: Cédric Le Goater 
Signed-off-by: Ninad Palsule 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Cédric Le Goater 
---
 include/hw/arm/aspeed_soc.h |  4 
 hw/arm/aspeed_ast2600.c | 19 +++
 2 files changed, 23 insertions(+)

diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h
index cb832bc1ee..e452108260 100644
--- a/include/hw/arm/aspeed_soc.h
+++ b/include/hw/arm/aspeed_soc.h
@@ -36,6 +36,7 @@
 #include "hw/misc/aspeed_lpc.h"
 #include "hw/misc/unimp.h"
 #include "hw/misc/aspeed_peci.h"
+#include "hw/fsi/aspeed-apb2opb.h"
 #include 

Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-09 Thread Gregory Price
On Tue, Jan 09, 2024 at 01:27:28PM -0800, Hao Xiang wrote:
> On Tue, Jan 9, 2024 at 11:58 AM Gregory Price
>  wrote:
> >
> > If you drop this line:
> >
> > -numa node,memdev=vmem0,nodeid=1
> 
> We tried this as well and it works after going through the cxlcli
> process and created the devdax device. The problem is that without the
> "nodeid=1" configuration, we cannot connect with the explicit per numa
> node latency/bandwidth configuration "-numa hmat-lb". I glanced at the
> code in hw/numa.c, parse_numa_hmat_lb() looks like the one passing the
> lb information to VM's hmat.
>

Yeah, this is what Jonathan was saying - right now there isn't a good
way (in QEMU) to pass the hmat/cdat stuff down through the device.
Needs to be plumbed out.

In the meantime: You should just straight up drop the cxl device from
your QEMU config.  It doesn't actually get you anything.

> From what I understand so far, the guest kernel will dynamically
> create a numa node after a cxl devdax device is created. That means we
> don't know the numa node until after VM boot. 2. QEMU can only
> statically parse the lb information to the VM at boot time. How do we
> connect these two things?

During boot, the kernel discovers all the memory regions exposed to
BIOS. In this QEMU configuration you have defined:

region 0: CPU + DRAM node
region 1: DRAM only node
region 2: CXL Fixed Memory Window (the last line of the cxl stuff)

The kernel reads this information on boot and reserves 1 numa node for
each of these regions.

The kernel then automatically brings up regions 0 and 1 in nodes 0 and 1
respectively.

Node2 sits dormant until you go through the cxl-cli startup sequence.


What you're asking for is for the QEMU team to plumb hmat/cdat
information down through the type3 device.  I *think* that is presently
possible with a custom CDAT file - but Jonathan probably has more
details on that.  You'll have to go digging for answers on that one.


Now - even if you did that - the current state of the cxl-type3 device
is *not what you want* because your memory accesses will be routed
through the read/write functions in the emulated device.

What Jonathan and I discussed on the other thread is how you might go
about slimming this down to allow pass-through of the memory without the
need for all the fluff.  This is a non-trivial refactor of the existing
device, so I would not expect that any time soon.

At the end of the day, the quickest way to get-there-from-here is to just
drop the cxl related lines from your QEMU config, and keep everything
else.

> 
> Assuming that the same issue applies to a physical server with CXL.
> Were you able to see a host kernel getting the correct lb information
> for a CXL devdax device?
> 

Yes, if you bring up a CXL device via cxl-cli on real hardware, the
subsequent numa node ends up in the "lower tier" of the memory-tiering
infrastructure.

~Gregory



Re: [PATCH v4 00/11] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions

2024-01-09 Thread Bernhard Beschow



On 8 January 2024 22:12:12 UTC, Mark Cave-Ayland wrote:
>On 08/01/2024 20:07, Bernhard Beschow wrote:
>
>> On 7 January 2024 14:13:44 UTC, Mark Cave-Ayland wrote:
>>> On 06/01/2024 21:05, Bernhard Beschow wrote:
>>> 
 This series implements relocation of the SuperI/O functions of the VIA south
 bridges which resolves some FIXME's. It is part of my via-apollo-pro-133t
 branch [1] which is an extension of bringing the VIA south bridges to the PC
 machine [2]. This branch is able to run some real-world X86 BIOSes in the hope
 that it allows us to form a better understanding of the real vt82c686b devices.
 Implementing relocation and toggling of the SuperI/O functions is one step to
 make these BIOSes run without error messages, so here we go.
 
 The series is structured as follows: Patches 1-3 prepare the TYPE_ISA_FDC,
 TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL to relocate and toggle (enable/disable)
 themselves without breaking encapsulation of their respective device states.
 This is achieved by moving the MemoryRegions and PortioLists from the device
 states into the encapsulating ISA devices since they will be relocated and
 toggled.
 
 Inspired by the memory API patches 4-6 add two convenience functions to the
 portio_list API to toggle and relocate portio lists. Patch 5 is a preparation
 for that which removes some redundancies which otherwise had to be dealt with
 during relocation.
 
 Patches 7-9 implement toggling and relocation for types TYPE_ISA_FDC,
 TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL. Patch 10 prepares the pegasos2 machine
 which would end up with all SuperI/O functions disabled if no -bios argument is
 given. Patch 11 finally implements the main feature which now relies on
 firmware to configure the SuperI/O functions accordingly (except for pegasos2).
 
 v4:
 * Drop incomplete SuperI/O vmstate handling (Zoltan)
 
 v3:
 * Rework various commit messages (Zoltan)
 * Drop patch "hw/char/serial: Free struct SerialState from MemoryRegion"
 (Zoltan)
 * Generalize wording in migration.rst to include portio_list API (Zoltan)
 
 v2:
 * Improve commit messages (Zoltan)
 * Split pegasos2 from vt82c686 patch (Zoltan)
 * Avoid poking into device internals (Zoltan)
 
 Testing done:
 * `make check`
 * `make check-avocado`
 * Run MorphOS on pegasos2 with and without pegasos2.rom
 * Run Linux on amigaone
 * Run real-world BIOSes on via-apollo-pro-133t branch
 * Start rescue-yl on fuloong2e
 
 [1] https://github.com/shentok/qemu/tree/via-apollo-pro-133t
 [2] https://github.com/shentok/qemu/tree/pc-via
 
 Bernhard Beschow (11):
 hw/block/fdc-isa: Move portio_list from FDCtrl to FDCtrlISABus
 hw/block/fdc-sysbus: Move iomem from FDCtrl to FDCtrlSysBus
 hw/char/parallel: Move portio_list from ParallelState to
   ISAParallelState
 exec/ioport: Resolve redundant .base attribute in struct
   MemoryRegionPortio
 exec/ioport: Add portio_list_set_address()
 exec/ioport: Add portio_list_set_enabled()
 hw/block/fdc-isa: Implement relocation and enabling/disabling for
   TYPE_ISA_FDC
 hw/char/serial-isa: Implement relocation and enabling/disabling for
   TYPE_ISA_SERIAL
 hw/char/parallel-isa: Implement relocation and enabling/disabling for
   TYPE_ISA_PARALLEL
 hw/ppc/pegasos2: Let pegasos2 machine configure SuperI/O functions
 hw/isa/vt82c686: Implement relocation and toggling of SuperI/O
   functions
 
docs/devel/migration.rst   |  6 ++--
hw/block/fdc-internal.h|  4 ---
include/exec/ioport.h  |  4 ++-
include/hw/block/fdc.h |  3 ++
include/hw/char/parallel-isa.h |  5 +++
include/hw/char/parallel.h |  2 --
include/hw/char/serial.h   |  2 ++
hw/block/fdc-isa.c | 18 +-
hw/block/fdc-sysbus.c  |  6 ++--
hw/char/parallel-isa.c | 14 
hw/char/parallel.c |  2 +-
hw/char/serial-isa.c   | 14 
hw/isa/vt82c686.c  | 66 --
hw/ppc/pegasos2.c  | 15 
system/ioport.c| 41 +
15 files changed, 172 insertions(+), 30 deletions(-)
>>> 
>>> I think this series generally looks good: the only thing I think it's worth 
>>> checking is whether portio lists are considered exclusive to ISA devices or 
>>> not? (Paolo?).
>> 
>> The modifications preserve the current design, so how is this question 
>> related to this series?
>
>I was thinking about patches 1 and 3 where the portio_list 

Re: [PATCH v8 03/10] hw/fsi: Introduce IBM's cfam,fsi-slave,scratchpad

2024-01-09 Thread Ninad Palsule

Hello Cedric,



+
+#define TYPE_FSI_SCRATCHPAD "fsi.scratchpad"
+#define SCRATCHPAD(obj) OBJECT_CHECK(FSIScratchPad, (obj), TYPE_FSI_SCRATCHPAD)

+
+typedef struct FSIScratchPad {
+    FSILBusDevice parent;
+
+    uint32_t reg;
+} FSIScratchPad;


We could extend to 4 regs possibly.

OK, Added 4 registers.



+
+#define TYPE_FSI_CFAM "cfam"
+#define FSI_CFAM(obj) OBJECT_CHECK(FSICFAMState, (obj), TYPE_FSI_CFAM)
+
+/* P9-ism */
+#define CFAM_CONFIG_NR_REGS 0x28
+
+typedef struct FSICFAMState {
+    /* < private > */
+    FSISlaveState parent;
+
+    /* CFAM config address space */
+    MemoryRegion config_iomem;
+
+    MemoryRegion mr;
+    AddressSpace as;


The address space is not used. Please remove.

Removed address space.




+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#include "hw/fsi/lbus.h"
+
+#include 


Not needed. Please remove.

Removed the header file.



+
+static uint64_t fsi_cfam_config_read(void *opaque, hwaddr addr,
+                                     unsigned size)
+{
+    FSICFAMState *cfam = FSI_CFAM(opaque);
+    BusChild *kid;
+    int i;
+
+    trace_fsi_cfam_config_read(addr, size);
+
+    switch (addr) {
+    case 0x00:
+    return CFAM_CONFIG_CHIP_ID_P9;
+    case 0x04:
+    return ENGINE_CONFIG_NEXT   |   /* valid */
+   0x0001   |   /* slots */
+   0x1000   |   /* version */
+   ENGINE_CONFIG_TYPE_PEEK  |   /* type */
+   0x000c;  /* crc */
+    case 0x08:
+    return ENGINE_CONFIG_NEXT   |   /* valid */
+   0x0001   |   /* slots */
+   0x5000   |   /* version */
+   ENGINE_CONFIG_TYPE_FSI   |   /* type */
+   0x000a;  /* crc */


Please introduce a macro to build these register values.

Added macros



+    break;
+    default:
+    /* The config table contains different engines from 0xc 
onwards. */

+    i = 0xc;
+    QTAILQ_FOREACH(kid, &cfam->lbus.bus.children, sibling) {
+    if (i == addr) {
+    DeviceState *ds = kid->child;
+    FSILBusDevice *dev = FSI_LBUS_DEVICE(ds);
+    return FSI_LBUS_DEVICE_GET_CLASS(dev)->config;
+    }
+    i += size;
+    }
+
+    if (i == addr) {
+    return 0;
+    }


If I understand correctly, the register 0xC contains some static config
value for the first device engine, the scratchpad device mapped at 0xC00,
and following registers would do the same for other devices if they were
modelled.

This is certainly hardwired in HW, so I would simplify to:

case 0xC:
    return ldc->config
default:
    /* log not implemented */

And extend the list when more devices are modeled.

Simplified as per your suggestion.



+    /*
+ * As per FSI specification, This is a magic value at 
address 0 of
+ * given FSI port. This causes FSI master to send BREAK 
command for

+ * initialization and recovery.
+ */
+    return CFAM_CONFIG_CHIP_ID_BREAK;


This looks weird. I don't understand to which offset this value belongs.

Yes, Removed it for now. We are handling break command in the config write.



+    }
+}
+
+static void fsi_cfam_config_write(void *opaque, hwaddr addr, 
uint64_t data,

+  unsigned size)
+{
+    FSICFAMState *cfam = FSI_CFAM(opaque);
+
+    trace_fsi_cfam_config_write(addr, size, data);
+
+    switch (TO_REG(addr)) {
+    case CFAM_CONFIG_CHIP_ID:
+    case CFAM_CONFIG_CHIP_ID + 4:


Couldn't we introduce a proper define for this reg ? and can we write to
the config space ? This break command seems to be sent to the FSI master,
according to Linux. Why is it handled in the CFAM config space ?

Added new PEEK_STATUS register. The BREAK command is sent by the FSI master
to the FSI slave, and the FSI slave is embedded in the CFAM, hence we are
handling it here.



+    if (data == CFAM_CONFIG_CHIP_ID_BREAK) {
+    bus_cold_reset(BUS(&cfam->lbus));
+    }
+    break;


alignment is not good.

Fixed the alignment.




+static void fsi_cfam_realize(DeviceState *dev, Error **errp)
+{
+    FSICFAMState *cfam = FSI_CFAM(dev);
+    FSISlaveState *slave = FSI_SLAVE(dev);
+
+    /* Each slave has a 2MiB address space */
+    memory_region_init_io(>mr, OBJECT(cfam), 
_cfam_unimplemented_ops,

+  cfam, TYPE_FSI_CFAM, 2 * 1024 * 1024);


2 * MiB

Now using MiB.



+
+    /* Add scratchpad engine */
+    if (!qdev_realize_and_unref(DEVICE(&cfam->scratchpad), BUS(&cfam->lbus),


cfam->scratchpad is not allocated. We should use qdev_realize instead.

Fixed it.




+    /* TODO: clarify scratchpad mapping */


You can remove the TODO now. All Local bus devices are mapped at offset
0xc00.

Removed it.




+static void fsi_scratchpad_reset(DeviceState *dev)
+{
+    FSIScratchPad *s = SCRATCHPAD(dev);
+
+    s->reg = 0;


Just one reg ! Too easy :) let's have a few 

Re: [PATCH v2] target/riscv: Implement optional CSR mcontext of debug Sdtrig extension

2024-01-09 Thread Daniel Henrique Barboza



On 12/19/23 09:32, Alvin Chang wrote:

The debug Sdtrig extension defines a CSR "mcontext". This commit
implements its predicate and read/write operations into CSR table.
Its value is reset as 0 when the trigger module is reset.

Signed-off-by: Alvin Chang 
---


The patch per se LGTM:

Reviewed-by: Daniel Henrique Barboza 


But I have a question: shouldn't we just go ahead and add the 'sdtrig' extension?
We have a handful of its CSRs already. Adding the extension would also add 'sdtrig'
in riscv,isa, allowing software to be aware of its existence in QEMU.


Thanks,

Daniel




Changes from v1: Remove dedicated cfg, always implement mcontext.

  target/riscv/cpu.h  |  1 +
  target/riscv/cpu_bits.h |  7 +++
  target/riscv/csr.c  | 36 +++-
  target/riscv/debug.c|  2 ++
  4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d74b361..e117641 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -345,6 +345,7 @@ struct CPUArchState {
  target_ulong tdata1[RV_MAX_TRIGGERS];
  target_ulong tdata2[RV_MAX_TRIGGERS];
  target_ulong tdata3[RV_MAX_TRIGGERS];
+target_ulong mcontext;
  struct CPUBreakpoint *cpu_breakpoint[RV_MAX_TRIGGERS];
  struct CPUWatchpoint *cpu_watchpoint[RV_MAX_TRIGGERS];
  QEMUTimer *itrigger_timer[RV_MAX_TRIGGERS];
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index ebd7917..3296648 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -361,6 +361,7 @@
  #define CSR_TDATA2  0x7a2
  #define CSR_TDATA3  0x7a3
  #define CSR_TINFO   0x7a4
+#define CSR_MCONTEXT0x7a8
  
  /* Debug Mode Registers */

  #define CSR_DCSR0x7b0
@@ -905,4 +906,10 @@ typedef enum RISCVException {
  /* JVT CSR bits */
  #define JVT_MODE   0x3F
  #define JVT_BASE   (~0x3F)
+
+/* Debug Sdtrig CSR masks */
+#define MCONTEXT32 0x003F
+#define MCONTEXT64 0x1FFFULL
+#define MCONTEXT32_HCONTEXT0x007F
+#define MCONTEXT64_HCONTEXT0x3FFFULL
  #endif
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index fde7ce1..ff1e128 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -3900,6 +3900,31 @@ static RISCVException read_tinfo(CPURISCVState *env, int 
csrno,
  return RISCV_EXCP_NONE;
  }
  
+static RISCVException read_mcontext(CPURISCVState *env, int csrno,

+target_ulong *val)
+{
+*val = env->mcontext;
+return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mcontext(CPURISCVState *env, int csrno,
+ target_ulong val)
+{
+bool rv32 = riscv_cpu_mxl(env) == MXL_RV32 ? true : false;
+int32_t mask;
+
+if (riscv_has_ext(env, RVH)) {
+/* Spec suggest 7-bit for RV32 and 14-bit for RV64 w/ H extension */
+mask = rv32 ? MCONTEXT32_HCONTEXT : MCONTEXT64_HCONTEXT;
+} else {
+/* Spec suggest 6-bit for RV32 and 13-bit for RV64 w/o H extension */
+mask = rv32 ? MCONTEXT32 : MCONTEXT64;
+}
+
+env->mcontext = val & mask;
+return RISCV_EXCP_NONE;
+}
+
  /*
   * Functions to access Pointer Masking feature registers
   * We have to check if current priv lvl could modify
@@ -4794,11 +4819,12 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
  [CSR_PMPADDR15] =  { "pmpaddr15", pmp, read_pmpaddr, write_pmpaddr },
  
  /* Debug CSRs */

-[CSR_TSELECT]   =  { "tselect", debug, read_tselect, write_tselect },
-[CSR_TDATA1]=  { "tdata1",  debug, read_tdata,   write_tdata   },
-[CSR_TDATA2]=  { "tdata2",  debug, read_tdata,   write_tdata   },
-[CSR_TDATA3]=  { "tdata3",  debug, read_tdata,   write_tdata   },
-[CSR_TINFO] =  { "tinfo",   debug, read_tinfo,   write_ignore  },
+[CSR_TSELECT]   =  { "tselect",  debug, read_tselect,  write_tselect  },
+[CSR_TDATA1]=  { "tdata1",   debug, read_tdata,write_tdata},
+[CSR_TDATA2]=  { "tdata2",   debug, read_tdata,write_tdata},
+[CSR_TDATA3]=  { "tdata3",   debug, read_tdata,write_tdata},
+[CSR_TINFO] =  { "tinfo",debug, read_tinfo,write_ignore   },
+[CSR_MCONTEXT]  =  { "mcontext", debug, read_mcontext, write_mcontext },
  
  /* User Pointer Masking */

  [CSR_UMTE]={ "umte",pointer_masking, read_umte,  write_umte },
diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index 4945d1a..e30d99c 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -940,4 +940,6 @@ void riscv_trigger_reset_hold(CPURISCVState *env)
  env->cpu_watchpoint[i] = NULL;
  timer_del(env->itrigger_timer[i]);
  }
+
+env->mcontext = 0;
  }
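
The mask selection in write_mcontext() above can be tried outside QEMU with a small standalone sketch. The SKETCH_* constants and the two function names are made up for this example; the bit widths (6/13 without the H extension, 7/14 with it) are the ones named in the patch comments, and nothing here touches the real CPURISCVState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Widths as stated in the patch comments; names are hypothetical. */
#define SKETCH_MCONTEXT32          0x3Fu      /* 6 bits, RV32 without H  */
#define SKETCH_MCONTEXT64          0x1FFFu    /* 13 bits, RV64 without H */
#define SKETCH_MCONTEXT32_HCONTEXT 0x7Fu      /* 7 bits, RV32 with H     */
#define SKETCH_MCONTEXT64_HCONTEXT 0x3FFFu    /* 14 bits, RV64 with H    */

/* Pick the writable-bit mask from XLEN and presence of the H extension. */
static uint64_t mcontext_mask_sketch(bool rv32, bool has_h)
{
    if (has_h) {
        return rv32 ? SKETCH_MCONTEXT32_HCONTEXT : SKETCH_MCONTEXT64_HCONTEXT;
    }
    return rv32 ? SKETCH_MCONTEXT32 : SKETCH_MCONTEXT64;
}

/* A write keeps only the implemented bits; all other bits are dropped. */
static uint64_t mcontext_write_sketch(uint64_t val, bool rv32, bool has_h)
{
    uint64_t masked = val & mcontext_mask_sketch(rv32, has_h);

    /* No configuration implements more than 14 bits. */
    assert((masked & ~UINT64_C(0x3FFF)) == 0);
    return masked;
}
```

Writing all-ones and reading the result back is a quick way to see which bits a given configuration implements.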




Re: Re: [PATCH v2 0/2] linux-user: two fixes to coredump generation

2024-01-09 Thread Thomas Weißschuh
On 2024-01-10 08:33:11+1100, Richard Henderson wrote:
> On 1/8/24 01:01, Thomas Weißschuh wrote:
> > Signed-off-by: Thomas Weißschuh 
> > ---
> > Changes in v2:
> > - Rebase on 8.2 master
> > - Resend after closed tree and holidays
> > - Link to v1: 
> > https://lore.kernel.org/r/20231115-qemu-user-dumpable-v1-0-edbe7f0fb...@t-8ch.de
> > 
> > ---
> > Thomas Weißschuh (2):
> >linux-user/elfload: test return value of getrlimit
> >linux-user/elfload: check PR_GET_DUMPABLE before creating coredump
> > 
> >   linux-user/elfload.c | 8 ++--
> >   1 file changed, 6 insertions(+), 2 deletions(-)
> > ---
> > base-commit: 0c1eccd368af8805ec0fb11e6cf25d0684d37328
> > change-id: 20231115-qemu-user-dumpable-d499c0396103
> > 
> > Best regards,
> 
> Both patches look good for correctness, but both have style issues: need
> braces on those if statements.
> 
> With that fixed,
> Reviewed-by: Richard Henderson 

Thanks,

I added the braces for the next revision, which I'll send after waiting
for some more feedback.



Re: [PATCH v2 4/9] target/hppa: Fix PDC address translation on PA2.0 with PSW.W=0

2024-01-09 Thread Richard Henderson

On 1/10/24 08:06, Helge Deller wrote:
What evidence?  So far, all I can see is for your seabios button, which doesn't run on 
physical hardware.


You are wrong on this.
My Seabios just mimics the real hardware. And the hardware has such a button
which is reported back by the PDC firmware.
Here is what the Linux kernel reports on *physical* hardware:
64-bit kernel -> powersw: Soft power switch at 0xfff0f0400804 enabled.
32-bit kernel -> powersw: Soft power switch at 0xf0400804 enabled
Just look at the old dmesg from another user (with Linux kernel 2.6.16):
http://ftp.parisc-linux.org/dmesg/dmesg_C3700.txt
(search for "power" in that log).


Ok, fair enough.  I just wish HP had been more accurate in their diagrams.  :-)


r~



Re: [PATCH v2 1/2] nubus-device: round Declaration ROM memory region address to qemu_target_page_size()

2024-01-09 Thread Mark Cave-Ayland

On 08/01/2024 23:06, Philippe Mathieu-Daudé wrote:


On 8/1/24 20:20, Mark Cave-Ayland wrote:

Declaration ROM binary images can be any arbitrary size, however if a host ROM
memory region is not aligned to qemu_target_page_size() then we fail the
"assert(!(iotlb & ~TARGET_PAGE_MASK))" check in tlb_set_page_full().

Ensure that the host ROM memory region is aligned to qemu_target_page_size()
and adjust the offset at which the Declaration ROM image is loaded, since Nubus
ROM images are unusual in that they are aligned to the end of the slot address
space.

Signed-off-by: Mark Cave-Ayland 
---
  hw/nubus/nubus-device.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/nubus/nubus-device.c b/hw/nubus/nubus-device.c
index 49008e4938..e4f824d58b 100644
--- a/hw/nubus/nubus-device.c
+++ b/hw/nubus/nubus-device.c
@@ -10,6 +10,7 @@
  #include "qemu/osdep.h"
  #include "qemu/datadir.h"
+#include "exec/target_page.h"
  #include "hw/irq.h"
  #include "hw/loader.h"
  #include "hw/nubus/nubus.h"
@@ -30,7 +31,7 @@ static void nubus_device_realize(DeviceState *dev, Error 
**errp)
  NubusDevice *nd = NUBUS_DEVICE(dev);
  char *name, *path;
  hwaddr slot_offset;
-    int64_t size;
+    int64_t size, align_size;


Both are 'size_t'.


I had a look at include/hw/loader.h, and the function signature for get_image_size() 
returns int64_t. Does it not make sense to keep int64_t here and use uintptr_t for 
the pointer arithmetic as below so that everything matches?



  int ret;
  /* Super */
@@ -76,16 +77,23 @@ static void nubus_device_realize(DeviceState *dev, Error 
**errp)
  }
  name = g_strdup_printf("nubus-slot-%x-declaration-rom", nd->slot);
-    memory_region_init_rom(&nd->decl_rom, OBJECT(dev), name, size,
+
+    /*
+ * Ensure ROM memory region is aligned to target page size regardless
+ * of the size of the Declaration ROM image
+ */
+    align_size = ROUND_UP(size, qemu_target_page_size());
+    memory_region_init_rom(&nd->decl_rom, OBJECT(dev), name, align_size,
                            &error_abort);
-    ret = load_image_mr(path, &nd->decl_rom);
+    ret = load_image_size(path, memory_region_get_ram_ptr(&nd->decl_rom) +
+    (uintptr_t)align_size - size, size);


memory_region_get_ram_ptr() returns a 'void *' so this looks dubious.
Maybe use a local variable to ease offset calculation?

   char *rombase = memory_region_get_ram_ptr(&nd->decl_rom);
   ret = load_image_size(path, rombase + align_size - size, size);

Otherwise KISS but ugly:

   ret = load_image_size(path,
     (void *)((uintptr_t)memory_region_get_ram_ptr(&nd->decl_rom)
  + align_size - size), size);


I prefer the first approach, but with uint8_t instead of char since it clarifies that 
it is a pointer to an arbitrary set of bytes as opposed to a string. Does that seem 
reasonable?
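
To make the offset arithmetic concrete, here is a standalone sketch of the uint8_t variant: the region size is the ROM size rounded up to the page size, and the image is copied so that it ends exactly at the end of the region. SKETCH_ROUND_UP and place_rom_at_end_sketch are hypothetical names, memcpy() stands in for load_image_size(), and the page size is a plain parameter rather than qemu_target_page_size().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round n up to a multiple of d; d must be a power of two. */
#define SKETCH_ROUND_UP(n, d) (((n) + (d) - 1) & ~((uint64_t)(d) - 1))

/*
 * Copy a ROM image of rom_size bytes so that it ends at the end of a
 * region of SKETCH_ROUND_UP(rom_size, page_size) bytes. Returns the
 * offset of the image inside the region.
 */
static size_t place_rom_at_end_sketch(uint8_t *region, size_t region_size,
                                      const uint8_t *rom, size_t rom_size,
                                      size_t page_size)
{
    size_t align_size = SKETCH_ROUND_UP(rom_size, page_size);
    uint8_t *rombase = region;  /* byte pointer, so arithmetic is defined */

    assert(align_size <= region_size);
    memcpy(rombase + align_size - rom_size, rom, rom_size);
    return align_size - rom_size;
}
```

With a byte pointer the expression rombase + align_size - size needs no cast, which is the point of the first approach.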



  g_free(path);
  g_free(name);
  if (ret < 0) {
  error_setg(errp, "could not load romfile \"%s\"", nd->romfile);
  return;
  }
-    memory_region_add_subregion(&nd->slot_mem, NUBUS_SLOT_SIZE - size,
+    memory_region_add_subregion(&nd->slot_mem, NUBUS_SLOT_SIZE - align_size,
                                 &nd->decl_rom);
  }
  }



ATB,

Mark.




Re: [PATCH 1/3] linux-user: Allow gdbstub to ignore page protection

2024-01-09 Thread Richard Henderson

On 1/10/24 06:39, Ilya Leoshkevich wrote:

On Wed, 2024-01-10 at 04:42 +1100, Richard Henderson wrote:

On 1/9/24 10:34, Ilya Leoshkevich wrote:

gdbserver ignores page protection by virtue of using /proc/$pid/mem.
Teach qemu gdbstub to do this too. This will not work if /proc is not
mounted; accept this limitation.

One alternative is to temporarily grant the missing PROT_* bit, but
this is inherently racy. Another alternative is self-debugging with
ptrace(POKE), which will break if QEMU itself is being debugged - a
much more severe limitation.

Signed-off-by: Ilya Leoshkevich 
---
   cpu-target.c | 55 ++-
-
   1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/cpu-target.c b/cpu-target.c
index 5eecd7ea2d7..69e97f78980 100644
--- a/cpu-target.c
+++ b/cpu-target.c
@@ -406,6 +406,15 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr
addr,
   vaddr l, page;
   void * p;
   uint8_t *buf = ptr;
+    int ret = -1;
+    int mem_fd;
+
+    /*
+ * Try ptrace first. If /proc is not mounted or if there is a
different
+ * problem, fall back to the manual page access. Note that,
unlike ptrace,
+ * it will not be able to ignore the protection bits.
+ */
+    mem_fd = open("/proc/self/mem", is_write ? O_WRONLY :
O_RDONLY);


Surely this is the unlikely fallback, and you don't need to open
unless the page is
otherwise inaccessible.


Ok, I can move this under (flags & PAGE_*) checks.


I see no handling for writes to pages that contain TranslationBlocks.


Sorry, I completely missed that. I'm currently experimenting with the
following:

/*
 * If there is a TranslationBlock and we weren't bypassing host
 * page protection, the memcpy() above would SEGV, ultimately
 * leading to page_unprotect(). So invalidate the translations
 * manually. Both invalidation and pwrite() must be under
 * mmap_lock() in order to prevent the creation of another
 * TranslationBlock in between.
 */
mmap_lock();
tb_invalidate_phys_page(page);
written = pwrite(fd, buf, l, (off_t)g2h_untagged(addr));


I would use here tb_invalidate_phys_range(addr, addr + l - 1),
but otherwise, it looks good.


r~


mmap_unlock();

Does that look okay?

[...]
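
For readers unfamiliar with the /proc/self/mem trick discussed above, a minimal Linux-only sketch follows. It writes into the caller's own address space through /proc/self/mem, which does not honour the page protection bits, and reports failure when /proc is not available. The name poke_self_sketch is invented here; the sketch deliberately leaves out everything QEMU-specific (mmap_lock(), TranslationBlock invalidation, g2h_untagged()).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf to addr in our own address space via
 * /proc/self/mem. Returns 0 on success and -1 if /proc is unavailable
 * or the write is short or fails.
 */
static int poke_self_sketch(void *addr, const void *buf, size_t len)
{
    int fd;
    ssize_t written;

    assert(addr != NULL && buf != NULL);

    fd = open("/proc/self/mem", O_WRONLY);
    if (fd < 0) {
        return -1;
    }
    /* The file offset is simply the virtual address. */
    written = pwrite(fd, buf, len, (off_t)(uintptr_t)addr);
    close(fd);

    return written == (ssize_t)len ? 0 : -1;
}
```

A caller would typically try a normal memcpy() first when the page flags allow it and only fall back to this path for pages that are otherwise inaccessible, as suggested in the review.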





Re: [PATCH v2 2/2] hw/pflash: implement update buffer for block writes

2024-01-09 Thread Richard Henderson

On 1/8/24 23:53, Philippe Mathieu-Daudé wrote:

@@ -818,6 +867,9 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
**errp)
  pfl->cmd = 0x00;
  pfl->status = 0x80; /* WSM ready */
  pflash_cfi01_fill_cfi_table(pfl);
+
+pfl->blk_bytes = g_malloc(pfl->writeblock_size);


Do you need an unrealize to free?
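
A minimal standalone sketch of the pairing being asked about: whatever realize allocates, unrealize releases. FlashSketch and both function names are hypothetical, and plain malloc()/free() stand in for g_malloc()/g_free(); the assert on the allocation mimics g_malloc() aborting on failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the device state; only the fields of interest are kept. */
typedef struct FlashSketch {
    size_t writeblock_size;
    uint8_t *blk_bytes;
} FlashSketch;

/* Allocate the update buffer when the device is realized. */
static void flash_sketch_realize(FlashSketch *f)
{
    assert(f != NULL && f->writeblock_size > 0);
    f->blk_bytes = malloc(f->writeblock_size);
    assert(f->blk_bytes != NULL);
}

/* Mirror of realize: release the buffer and clear the stale pointer. */
static void flash_sketch_unrealize(FlashSketch *f)
{
    assert(f != NULL);
    free(f->blk_bytes);
    f->blk_bytes = NULL;
}
```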


r~



Re: [PATCH v2 0/2] linux-user: two fixes to coredump generation

2024-01-09 Thread Richard Henderson

On 1/8/24 01:01, Thomas Weißschuh wrote:

Signed-off-by: Thomas Weißschuh 
---
Changes in v2:
- Rebase on 8.2 master
- Resend after closed tree and holidays
- Link to v1: 
https://lore.kernel.org/r/20231115-qemu-user-dumpable-v1-0-edbe7f0fb...@t-8ch.de

---
Thomas Weißschuh (2):
   linux-user/elfload: test return value of getrlimit
   linux-user/elfload: check PR_GET_DUMPABLE before creating coredump

  linux-user/elfload.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)
---
base-commit: 0c1eccd368af8805ec0fb11e6cf25d0684d37328
change-id: 20231115-qemu-user-dumpable-d499c0396103

Best regards,


Both patches look good for correctness, but both have style issues: need braces on those 
if statements.


With that fixed,
Reviewed-by: Richard Henderson 


r~



Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-09 Thread Hao Xiang
On Tue, Jan 9, 2024 at 11:58 AM Gregory Price
 wrote:
>
> On Tue, Jan 09, 2024 at 11:33:04AM -0800, Hao Xiang wrote:
> > On Mon, Jan 8, 2024 at 5:13 PM Gregory Price  
> > wrote:
> >
> > Sounds like the technical details are explained on the other thread.
> > From what I understand now, if we don't go through a complex CXL
> > setup, it wouldn't go through the emulation path.
> >
> > Here is our exact setup. Guest runs Linux kernel 6.6rc2
> >
> > taskset --cpu-list 0-47,96-143 \
> > numactl -N 0 -m 0 ${QEMU} \
> > -M q35,cxl=on,hmat=on \
> > -m 64G \
> > -smp 8,sockets=1,cores=8,threads=1 \
> > -object memory-backend-ram,id=ram0,size=45G \
> > -numa node,memdev=ram0,cpus=0-7,nodeid=0 \
> > -msg timestamp=on -L /usr/share/seabios \
> > -enable-kvm \
> > -object 
> > memory-backend-ram,id=vmem0,size=19G,host-nodes=${HOST_CXL_NODE},policy=bind
> > \
> > -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> > -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
> > -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
> > -numa node,memdev=vmem0,nodeid=1 \
> > -M 
> > cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=19G,cxl-fmw.0.interleave-granularity=8k
>
> :] you did what i thought you did
>
> -numa node,memdev=vmem0,nodeid=1
>
> """
> Another possibility: You mapped this memory-backend into another numa
> node explicitly and never onlined the memory via cxlcli.  I've done
> this, and it works, but it's a "hidden feature" that probably should
> not exist / be supported.
> """
>
> You're mapping vmem0 into an explicit numa node *and* into the type3
> device.  You don't need to do both - and technically this shouldn't be
> allowed.
>
> With this configuration, you can go through the cxl-cli setup process
> for the CXL device, and you'll find that you can create *another* node
> (node 2 in this case) that maps to the same memory you mapped to node1.
>
>
> You can drop the cxl devices objects in here and the memory will still
> come up the way you want it to.
>
> If you drop this line:
>
> -numa node,memdev=vmem0,nodeid=1

We tried this as well and it works after going through the cxlcli
process and creating the devdax device. The problem is that without the
"nodeid=1" configuration, we cannot connect with the explicit per numa
node latency/bandwidth configuration "-numa hmat-lb". I glanced at the
code in hw/numa.c, parse_numa_hmat_lb() looks like the one passing the
lb information to VM's hmat.

From what I understand so far:
1. The guest kernel will dynamically create a numa node after a cxl devdax
device is created. That means we don't know the numa node until after VM boot.
2. QEMU can only statically pass the lb information to the VM at boot time.
How do we connect these two things?

I assume the same issue applies to a physical server with CXL.
Were you able to see a host kernel getting the correct lb information
for a CXL devdax device?

>
> You have to use the CXL driver to instantiate the dax device and the
> numa node, and at *that* point you will see the read/write functions
> being called.
>
> ~Gregory



Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Philippe Mathieu-Daudé

Hi Fabiano,

On 9/1/24 21:21, Fabiano Rosas wrote:

Cédric Le Goater  writes:


On 1/9/24 18:40, Fabiano Rosas wrote:

Cédric Le Goater  writes:


On 1/3/24 20:53, Fabiano Rosas wrote:

Philippe Mathieu-Daudé  writes:


+Peter/Fabiano

On 2/1/24 17:41, Cédric Le Goater wrote:

On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:

Hi Cédric,

On 2/1/24 15:55, Cédric Le Goater wrote:

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Hi,

When a MPCore cluster is used, the Cortex-A cores belong to the
cluster container, not to the board/soc layer. This series moves
the creation of vCPUs to the MPCore private container.

Doing so we consolidate the QOM model, moving common code in a
central place (abstract MPCore parent).


Changing the QOM hierarchy has an impact on the state of the machine
and some fixups are then required to maintain migration compatibility.
This can become a real headache for KVM machines like virt for which
migration compatibility is a feature, less for emulated ones.


All changes are either moving properties (which are not migrated)
or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
is still migrated elsewhere). So I don't see any obvious migration
problem, but I might be missing something, so I Cc'ed Juan :>


FWIW, I didn't spot anything problematic either.

I've run this through my migration compatibility series [1] and it
doesn't regress aarch64 migration from/to 8.2. The tests use '-M
virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't
think we even support migration of anything non-KVM on arm.


it happens we do.



Oh, sorry, I didn't mean TCG here. Probably meant to say something like
non-KVM-capable cpus, as in 32-bit. Nevermind.


Theoretically, we should be able to migrate to a TCG guest. Well, this
worked in the past for PPC. When I was doing more KVM related changes,
this was very useful for dev. Also, some machines are partially emulated.
Anyhow I agree this is not a strong requirement and we often break it.
Let's focus on KVM only.


1- https://gitlab.com/farosas/qemu/-/jobs/5853599533


yes it depends on the QOM hierarchy and virt seems immune to the changes.
Good.

However, changing the QOM topology clearly breaks migration compat,


Well, "clearly" is relative =) You've mentioned pseries and aspeed
already, do you have a pointer to one of those cases where we broke
migration


Regarding pseries, migration compat broke because of 5bc8d26de20c
("spapr: allocate the ICPState object from under sPAPRCPUCore") which
is similar to the changes proposed by this series, it impacts the QOM
hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
("spapr: fix migration of ICPState objects from/to older QEMU") which
is quite a headache and this turned out to raise another problem some
months ago ... :/ That's why I sent [1] to prepare removal of old
machines and workarounds becoming a burden.


This feels like something that could be handled by the vmstate code
somehow. The state is there, just under a different path.


What, the QOM path is used in migration? ...

See recent discussions on "QOM path stability":
https://lore.kernel.org/qemu-devel/zzfyvlmcxbcia...@redhat.com/
https://lore.kernel.org/qemu-devel/87jzojbxt7@pond.sub.org/
https://lore.kernel.org/qemu-devel/87v883by34@pond.sub.org/


No one wants
to be policing QOM hierarchy changes in every single series that shows
up on the list.

Anyway, thanks for the pointers. I'll study that code a bit more, maybe
I can come up with some way to handle these cases.

Hopefully between the analyze-migration test and the compat tests we'll
catch the next bug of this kind before it gets merged.







Re: [PATCH v3 3/3] hw/virtio: rename virtio dmabuf API

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 13:56, Albert Esteve wrote:

Functions in the virtio-dmabuf module
start with 'virtio_*', which is too
generic and may not correctly identify
them as part of the virtio dmabuf API.

Rename all functions to 'virtio_dmabuf_*'
instead to avoid confusion.

Signed-off-by: Albert Esteve 
Acked-by: Stefan Hajnoczi 
---
  hw/display/virtio-dmabuf.c| 14 
  hw/virtio/vhost-user.c| 14 
  include/hw/virtio/virtio-dmabuf.h | 33 +-
  tests/unit/test-virtio-dmabuf.c   | 58 +++
  4 files changed, 60 insertions(+), 59 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH] hw/timer: fix systick trace message

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 19:45, Samuel Tardieu wrote:

Signed-off-by: Samuel Tardieu 
Fixes: ff68dacbc786 ("armv7m: Split systick out from NVIC")
---
  hw/timer/trace-events | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 22:07, Philippe Mathieu-Daudé wrote:

Hi Cédric,

On 9/1/24 19:06, Cédric Le Goater wrote:

On 1/9/24 18:40, Fabiano Rosas wrote:

Cédric Le Goater  writes:


On 1/3/24 20:53, Fabiano Rosas wrote:

Philippe Mathieu-Daudé  writes:


+Peter/Fabiano

On 2/1/24 17:41, Cédric Le Goater wrote:

On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:

Hi Cédric,

On 2/1/24 15:55, Cédric Le Goater wrote:

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Hi,

When a MPCore cluster is used, the Cortex-A cores belong to the
cluster container, not to the board/soc layer. This series moves
the creation of vCPUs to the MPCore private container.

Doing so we consolidate the QOM model, moving common code in a
central place (abstract MPCore parent).


Changing the QOM hierarchy has an impact on the state of the machine
and some fixups are then required to maintain migration compatibility.
This can become a real headache for KVM machines like virt for which
migration compatibility is a feature, less for emulated ones.


All changes are either moving properties (which are not migrated)
or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
is still migrated elsewhere). So I don't see any obvious migration
problem, but I might be missing something, so I Cc'ed Juan :>


FWIW, I didn't spot anything problematic either.

I've run this through my migration compatibility series [1] and it
doesn't regress aarch64 migration from/to 8.2. The tests use '-M
virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't
think we even support migration of anything non-KVM on arm.


it happens we do.



Oh, sorry, I didn't mean TCG here. Probably meant to say something like
non-KVM-capable cpus, as in 32-bit. Nevermind.


Theoretically, we should be able to migrate to a TCG guest. Well, this
worked in the past for PPC. When I was doing more KVM related changes,
this was very useful for dev. Also, some machines are partially emulated.
Anyhow I agree this is not a strong requirement and we often break it.
Let's focus on KVM only.


No no, we want the same for TCG.


1- https://gitlab.com/farosas/qemu/-/jobs/5853599533


yes it depends on the QOM hierarchy and virt seems immune to the changes.
Good.

However, changing the QOM topology clearly breaks migration compat,


Well, "clearly" is relative =) You've mentioned pseries and aspeed
already, do you have a pointer to one of those cases where we broke
migration 


Regarding pseries, migration compat broke because of 5bc8d26de20c
("spapr: allocate the ICPState object from under sPAPRCPUCore") which
is similar to the changes proposed by this series, it impacts the QOM
hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
("spapr: fix migration of ICPState objects from/to older QEMU") which
is quite a headache and this turned out to raise another problem some
months ago ... :/ That's why I sent [1] to prepare removal of old
machines and workarounds becoming a burden.

Regarding aspeed, this series breaks compat.


Can you write down the steps to reproduce please? I'll debug it.


Also, have you figured out (by bisecting) which patch starts to break it?


We need to understand this.


Not that we care much
but this caught my attention because of my past experience on pseries.
Same kind of QOM change which could impact other machines, like virt.
Since you checked that migration compat is preserved on virt, we should
be fine.

Thanks,

C.

[1] 
https://lore.kernel.org/qemu-devel/20231214181723.1520854-1-...@kaod.org/









Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Philippe Mathieu-Daudé

Hi Cédric,

On 9/1/24 19:06, Cédric Le Goater wrote:

On 1/9/24 18:40, Fabiano Rosas wrote:

Cédric Le Goater  writes:


On 1/3/24 20:53, Fabiano Rosas wrote:

Philippe Mathieu-Daudé  writes:


+Peter/Fabiano

On 2/1/24 17:41, Cédric Le Goater wrote:

On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:

Hi Cédric,

On 2/1/24 15:55, Cédric Le Goater wrote:

On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:

Hi,

When a MPCore cluster is used, the Cortex-A cores belong to the
cluster container, not to the board/soc layer. This series moves
the creation of vCPUs to the MPCore private container.

Doing so we consolidate the QOM model, moving common code in a
central place (abstract MPCore parent).


Changing the QOM hierarchy has an impact on the state of the machine
and some fixups are then required to maintain migration compatibility.
This can become a real headache for KVM machines like virt for which
migration compatibility is a feature, less for emulated ones.


All changes are either moving properties (which are not migrated)
or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
is still migrated elsewhere). So I don't see any obvious migration
problem, but I might be missing something, so I Cc'ed Juan :>


FWIW, I didn't spot anything problematic either.

I've run this through my migration compatibility series [1] and it
doesn't regress aarch64 migration from/to 8.2. The tests use '-M
virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't
think we even support migration of anything non-KVM on arm.


it happens we do.



Oh, sorry, I didn't mean TCG here. Probably meant to say something like
non-KVM-capable cpus, as in 32-bit. Nevermind.


Theoretically, we should be able to migrate to a TCG guest. Well, this
worked in the past for PPC. When I was doing more KVM related changes,
this was very useful for dev. Also, some machines are partially emulated.
Anyhow I agree this is not a strong requirement and we often break it.
Let's focus on KVM only.


No no, we want the same for TCG.


1- https://gitlab.com/farosas/qemu/-/jobs/5853599533


yes it depends on the QOM hierarchy and virt seems immune to the changes.
Good.

However, changing the QOM topology clearly breaks migration compat,


Well, "clearly" is relative =) You've mentioned pseries and aspeed
already, do you have a pointer to one of those cases where we broke
migration 


Regarding pseries, migration compat broke because of 5bc8d26de20c
("spapr: allocate the ICPState object from under sPAPRCPUCore") which
is similar to the changes proposed by this series, it impacts the QOM
hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
("spapr: fix migration of ICPState objects from/to older QEMU") which
is quite a headache and this turned out to raise another problem some
months ago ... :/ That's why I sent [1] to prepare removal of old
machines and workarounds becoming a burden.

Regarding aspeed, this series breaks compat.


Can you write down the steps to reproduce please? I'll debug it.
We need to understand this.


Not that we care much
but this caught my attention because of my past experience on pseries.
Same kind of QOM change which could impact other machines, like virt.
Since you checked that migration compat is preserved on virt, we should
be fine.

Thanks,

C.

[1] 
https://lore.kernel.org/qemu-devel/20231214181723.1520854-1-...@kaod.org/







Re: [PATCH v2 4/9] target/hppa: Fix PDC address translation on PA2.0 with PSW.W=0

2024-01-09 Thread Helge Deller

On 1/9/24 17:18, Richard Henderson wrote:

On 1/9/24 22:22, Helge Deller wrote:

On 1/9/24 10:14, Richard Henderson wrote:

On 1/8/24 00:22, del...@kernel.org wrote:

From: Helge Deller 

Fix the address translation for PDC space on PA2.0 if PSW.W=0.
Basically, for any address in the 32-bit PDC range from 0xf000 to
0xf100 keep the lower 32-bits and just set the upper 32-bits to
0xfff0.

This mapping fixes the emulated power button in PDC space for 32- and
64-bit machines and is how the physical C3700 machine seems to map
PDC.

Signed-off-by: Helge Deller 
---
  target/hppa/mem_helper.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/mem_helper.c b/target/hppa/mem_helper.c
index 08abd1a9f9..011b192406 100644
--- a/target/hppa/mem_helper.c
+++ b/target/hppa/mem_helper.c
@@ -56,7 +56,7 @@ hwaddr hppa_abs_to_phys_pa2_w0(vaddr addr)
  addr = (int32_t)addr;
  } else {
  /* PDC address space */
-    addr &= MAKE_64BIT_MASK(0, 24);
+    addr = (uint32_t)addr;
  addr |= -1ull << (TARGET_PHYS_ADDR_SPACE_BITS - 4);
  }
  return addr;


I believe this to be incorrect, as it contradicts Figures H-10 and H-11.


Yes, but that seems to be how it's really implemented on physical hardware.
We have seen other figures as well, which didn't reflect the real world either.
IMHO we can revert if it really turns out to be wrong and when we
get a better solution.


What evidence?  So far, all I can see is for your seabios button, which doesn't 
run on physical hardware.


You are wrong on this.
My Seabios just mimics the real hardware. And the hardware has such a button
which is reported back by the PDC firmware.
Here is what the Linux kernel reports on *physical* hardware:
64-bit kernel -> powersw: Soft power switch at 0xfff0f0400804 enabled.
32-bit kernel -> powersw: Soft power switch at 0xf0400804 enabled
Just look at the old dmesg from another user (with Linux kernel 2.6.16):
http://ftp.parisc-linux.org/dmesg/dmesg_C3700.txt
(search for "power" in that log).

As you can see, even the real C3700 reports the power button inside the
firmware region. And on 64-bit the higher 32-bits are at 0xfff0.
That's exactly what I do with this patch.
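
To make the difference between the old and the new masking concrete, here is a
minimal standalone sketch (not the QEMU code itself). The helper names and the
`phys_bits` parameter are hypothetical; `phys_bits` stands in for
TARGET_PHYS_ADDR_SPACE_BITS, and a value of 40 is only an assumption for
illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Old behaviour: keep only the low 24 bits of the PDC address, then set
 * the top nibble (and everything above it) of the physical address.
 * phys_bits is a hypothetical stand-in for TARGET_PHYS_ADDR_SPACE_BITS.
 */
static uint64_t pdc_map_old(uint64_t addr, unsigned phys_bits)
{
    assert(phys_bits > 4 && phys_bits <= 64);
    addr &= (1ull << 24) - 1;          /* same as MAKE_64BIT_MASK(0, 24) */
    addr |= -1ull << (phys_bits - 4);
    return addr;
}

/*
 * New behaviour from the patch: keep the full low 32 bits instead, so the
 * 32-bit PDC address stays visible in the lower half of the result.
 */
static uint64_t pdc_map_new(uint64_t addr, unsigned phys_bits)
{
    assert(phys_bits > 4 && phys_bits <= 64);
    addr = (uint32_t)addr;
    addr |= -1ull << (phys_bits - 4);
    return addr;
}
```

Under that assumption, calling both helpers on the 32-bit power switch address
0xf0400804 shows that only the new variant preserves the 0xf04... part in the
lower 32 bits; the old one truncates it to 0x400804.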


In any case, there is a comment just above pointing to the spec, which you are 
now deviating from.  You need to expand that comment to say why and how.


Ok.

Helge



Re: [PATCH v3 3/4] ci: Add a migration compatibility test job

2024-01-09 Thread Fabiano Rosas
Cédric Le Goater  writes:

> On 1/5/24 19:04, Fabiano Rosas wrote:
>> The migration tests have support for being passed two QEMU binaries to
>> test migration compatibility.
>> 
>> Add a CI job that builds the latest release of QEMU and another job
>> that uses that version plus an already present build of the current
>> version and run the migration tests with the two, both as source and
>> destination. I.e.:
>> 
>>   old QEMU (n-1) -> current QEMU (development tree)
>>   current QEMU (development tree) -> old QEMU (n-1)
>> 
>> The purpose of this CI job is to ensure the code we're about to merge
>> will not cause a migration compatibility problem when migrating the
>> next release (which will contain that code) to/from the previous
>> release.
>> 
>> I'm leaving the jobs as manual for now because using an older QEMU in
>> tests could hit bugs that were already fixed in the current
>> development tree and we need to handle those case-by-case.
>> 
>> Note: for user forks, the version tags need to be pushed to gitlab
>> otherwise it won't be able to checkout a different version.
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>>   .gitlab-ci.d/buildtest.yml | 53 ++
>>   1 file changed, 53 insertions(+)
>> 
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index 91663946de..81163a3f6a 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -167,6 +167,59 @@ build-system-centos:
>> x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
>>   MAKE_CHECK_ARGS: check-build
>>   
>> +build-previous-qemu:
>> +  extends: .native_build_job_template
>> +  artifacts:
>> +when: on_success
>> +expire_in: 2 days
>> +paths:
>> +  - build-previous
>> +exclude:
>> +  - build-previous/**/*.p
>> +  - build-previous/**/*.a.p
>> +  - build-previous/**/*.fa.p
>> +  - build-previous/**/*.c.o
>> +  - build-previous/**/*.c.o.d
>> +  - build-previous/**/*.fa
>> +  needs:
>> +job: amd64-opensuse-leap-container
>> +  variables:
>> +QEMU_JOB_OPTIONAL: 1
>> +IMAGE: opensuse-leap
>> +TARGETS: x86_64-softmmu aarch64-softmmu
>> +  before_script:
>> +- export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' 
>> VERSION)"
>> +- git checkout $QEMU_PREV_VERSION
>> +  after_script:
>> +- mv build build-previous
>> +
>> +.migration-compat-common:
>> +  extends: .common_test_job_template
>> +  needs:
>> +- job: build-previous-qemu
>> +- job: build-system-opensuse
>> +  allow_failure: true
>> +  variables:
>> +QEMU_JOB_OPTIONAL: 1
>> +IMAGE: opensuse-leap
>> +MAKE_CHECK_ARGS: check-build
>> +  script:
>> +- cd build
>> +- QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-${TARGET}
>> +  QTEST_QEMU_BINARY=./qemu-system-${TARGET} 
>> ./tests/qtest/migration-test
>> +- QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-${TARGET}
>> +  QTEST_QEMU_BINARY=./qemu-system-${TARGET} 
>> ./tests/qtest/migration-test
>> +
>> +migration-compat-aarch64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +TARGET: aarch64
>> +
>> +migration-compat-x86_64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +TARGET: x86_64
>
>
> What about the other archs, s390x and ppc? Do you lack the resources
> or are there any problems to address?

Currently s390x and ppc are only tested on KVM, which means they are not
tested at all unless someone runs migration-test on a custom runner. The
same is true for this test.

The TCG tests have been disabled:
/*
 * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
 * is touchy due to race conditions on dirty bits (especially on PPC for
 * some reason)
 */

/*
 * Similar to ppc64, s390x seems to be touchy with TCG, so disable it
 * there until the problems are resolved
 */

It would be great if we could figure out what these issues are and fix
them so we can at least test with TCG like we do for aarch64.

Doing a TCG run of migration-test with both archs (one binary only, not
this series):

- ppc survived one run, taking 6 minutes longer than x86/aarch64.
- s390x survived one run, taking 40s less than x86/aarch64.

I'll leave them enabled on my machine and do some runs here and there,
see if I spot something. If not, we can consider re-enabling them once
we figure out why ppc takes so long.



Re: [PATCH v8 06/10] hw/fsi: Aspeed APB2OPB interface

2024-01-09 Thread Ninad Palsule

Hello Cedric,



+    for (i = 0; i < ASPEED_FSI_NUM; i++) {
+    if (!qdev_realize_and_unref(DEVICE(>fsi[i]), 
BUS(>opb[i]),



s->fsi[i] is not allocated. We should use qdev_realize instead.


I am not sure I understood this. FSIMasterState fsi[ASPEED_FSI_NUM]; 
is inside structure AspeedAPB2OPBState so it must be allocated, right?


See the documentation :

https://www.qemu.org/docs/master/devel/qdev-api.html#c.qdev_realize_and_unref


Fixed it. Thanks for the review.

Regards,

Ninad




Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv

2024-01-09 Thread Fabiano Rosas
Cédric Le Goater  writes:

> On 1/9/24 18:40, Fabiano Rosas wrote:
>> Cédric Le Goater  writes:
>> 
>>> On 1/3/24 20:53, Fabiano Rosas wrote:
 Philippe Mathieu-Daudé  writes:

> +Peter/Fabiano
>
> On 2/1/24 17:41, Cédric Le Goater wrote:
>> On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
>>> Hi Cédric,
>>>
>>> On 2/1/24 15:55, Cédric Le Goater wrote:
 On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
> Hi,
>
> When a MPCore cluster is used, the Cortex-A cores belong to the
> cluster container, not to the board/soc layer. This series moves
> the creation of vCPUs to the MPCore private container.
>
> Doing so we consolidate the QOM model, moving common code in a
> central place (abstract MPCore parent).

 Changing the QOM hierarchy has an impact on the state of the machine
 and some fixups are then required to maintain migration compatibility.
 This can become a real headache for KVM machines like virt for which
 migration compatibility is a feature, less for emulated ones.
>>>
>>> All changes are either moving properties (which are not migrated)
>>> or moving non-migrated QOM members (i.e. pointers of ARMCPU, which
>>> is still migrated elsewhere). So I don't see any obvious migration
>>> problem, but I might be missing something, so I Cc'ed Juan :>

 FWIW, I didn't spot anything problematic either.

 I've run this through my migration compatibility series [1] and it
 doesn't regress aarch64 migration from/to 8.2. The tests use '-M
 virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. I don't
 think we even support migration of anything non-KVM on arm.
>>>
>>> it happens we do.
>>>
>> 
>> Oh, sorry, I didn't mean TCG here. Probably meant to say something like
>> non-KVM-capable cpus, as in 32-bit. Nevermind.
>
> Theoretically, we should be able to migrate to a TCG guest. Well, this
> worked in the past for PPC. When I was doing more KVM related changes,
> this was very useful for dev. Also, some machines are partially emulated.
> Anyhow I agree this is not a strong requirement and we often break it.
> Let's focus on KVM only.
>
 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
>>>
>>> yes it depends on the QOM hierarchy and virt seems immune to the changes.
>>> Good.
>>>
>>> However, changing the QOM topology clearly breaks migration compat,
>> 
>> Well, "clearly" is relative =) You've mentioned pseries and aspeed
>> already, do you have a pointer to one of those cases where we broke
>> migration 
>
> Regarding pseries, migration compat broke because of 5bc8d26de20c
> ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
> is similar to the changes proposed by this series, it impacts the QOM
> hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
> ("spapr: fix migration of ICPState objects from/to older QEMU") which
> is quite a headache and this turned out to raise another problem some
> months ago ... :/ That's why I sent [1] to prepare removal of old
> machines and workarounds becoming a burden.

This feels like something that could be handled by the vmstate code
somehow. The state is there, just under a different path. No one wants
to be policing QOM hierarchy changes in every single series that shows
up on the list.

Anyway, thanks for the pointers. I'll study that code a bit more, maybe
I can come up with some way to handle these cases.

Hopefully between the analyze-migration test and the compat tests we'll
catch the next bug of this kind before it gets merged.





Re: [PATCH v8 05/10] hw/fsi: Introduce IBM's FSI master

2024-01-09 Thread Ninad Palsule

Hello Cedric,


+static uint64_t fsi_master_read(void *opaque, hwaddr addr, unsigned 
size)

+{
+    FSIMasterState *s = FSI_MASTER(opaque);
+
+    trace_fsi_master_read(addr, size);
+
+    if (addr + size > sizeof(s->regs)) {


See comment on patch 3

I fixed it.



+    qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: Out of bounds read: 0x%"HWADDR_PRIx" for 
%u\n",

+  __func__, addr, size);
+    return 0;
+    }
+
+    return s->regs[TO_REG(addr)];
+}
+
+static void fsi_master_write(void *opaque, hwaddr addr, uint64_t data,
+ unsigned size)
+{
+    FSIMasterState *s = FSI_MASTER(opaque);
+
+    trace_fsi_master_write(addr, size, data);
+
+    if (addr + size > sizeof(s->regs)) {

I fixed it.


+
+    /* address ? */
+    memory_region_add_subregion(>opb2fsi, 0, >cfam.mr);
+}
+
+static void fsi_master_reset(DeviceState *dev)
+{
+    FSIMasterState *s = FSI_MASTER(dev);


Don't we want to set all values to some default ?

I now initialize all other registers to 0, as the FSI spec expects them to
be zero except MVER and MLEVP0. I don't have a reset value for MLEVP0 on the
ast2600. It is related to hot-plug detection, so I am setting it to 0 for now.


Thanks for the review.

Regards,

Ninad




Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-09 Thread Gregory Price
On Tue, Jan 09, 2024 at 11:33:04AM -0800, Hao Xiang wrote:
> On Mon, Jan 8, 2024 at 5:13 PM Gregory Price  
> wrote:
> 
> Sounds like the technical details are explained on the other thread.
> From what I understand now, if we don't go through a complex CXL
> setup, it wouldn't go through the emulation path.
> 
> Here is our exact setup. Guest runs Linux kernel 6.6rc2
> 
> taskset --cpu-list 0-47,96-143 \
> numactl -N 0 -m 0 ${QEMU} \
> -M q35,cxl=on,hmat=on \
> -m 64G \
> -smp 8,sockets=1,cores=8,threads=1 \
> -object memory-backend-ram,id=ram0,size=45G \
> -numa node,memdev=ram0,cpus=0-7,nodeid=0 \
> -msg timestamp=on -L /usr/share/seabios \
> -enable-kvm \
> -object 
> memory-backend-ram,id=vmem0,size=19G,host-nodes=${HOST_CXL_NODE},policy=bind
> \
> -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
> -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
> -numa node,memdev=vmem0,nodeid=1 \
> -M 
> cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=19G,cxl-fmw.0.interleave-granularity=8k

:] you did what i thought you did

-numa node,memdev=vmem0,nodeid=1

"""
Another possibility: You mapped this memory-backend into another numa
node explicitly and never onlined the memory via cxlcli.  I've done
this, and it works, but it's a "hidden feature" that probably should
not exist / be supported.
"""

You're mapping vmem0 into an explicit numa node *and* into the type3
device.  You don't need to do both - and technically this shouldn't be
allowed.

With this configuration, you can go through the cxl-cli setup process
for the CXL device, and you'll find that you can create *another* node
(node 2 in this case) that maps to the same memory you mapped to node 1.


You can drop the CXL device objects here and the memory will still
come up the way you want it to.

If you drop this line:

-numa node,memdev=vmem0,nodeid=1

You have to use the CXL driver to instantiate the dax device and the
numa node, and at *that* point you will see the read/write functions
being called.

~Gregory



[PATCH v4 2/3] hw/arm: Connect STM32L4x5 SYSCFG to STM32L4x5 SoC

2024-01-09 Thread Inès Varhol
The SYSCFG input GPIOs aren't connected yet. Once the STM32L4x5 GPIO
device is implemented, its output GPIOs will be connected to the
SYSCFG input GPIOs.

Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Arnaud Minier 
Signed-off-by: Inès Varhol 
---
 hw/arm/Kconfig |  1 +
 hw/arm/stm32l4x5_soc.c | 21 -
 include/hw/arm/stm32l4x5_soc.h |  2 ++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 8c8488a70a..bb4693bfbb 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -459,6 +459,7 @@ config STM32L4X5_SOC
 bool
 select ARM_V7M
 select OR_IRQ
+select STM32L4X5_SYSCFG
 select STM32L4X5_EXTI
 
 config XLNX_ZYNQMP_ARM
diff --git a/hw/arm/stm32l4x5_soc.c b/hw/arm/stm32l4x5_soc.c
index fe46b7c6c0..431f982caf 100644
--- a/hw/arm/stm32l4x5_soc.c
+++ b/hw/arm/stm32l4x5_soc.c
@@ -37,6 +37,7 @@
 #define SRAM2_SIZE (32 * KiB)
 
 #define EXTI_ADDR 0x40010400
+#define SYSCFG_ADDR 0x4001
 
 #define NUM_EXTI_IRQ 40
 /* Match exti line connections with their CPU IRQ number */
@@ -80,6 +81,7 @@ static void stm32l4x5_soc_initfn(Object *obj)
 Stm32l4x5SocState *s = STM32L4X5_SOC(obj);
 
 object_initialize_child(obj, "exti", >exti, TYPE_STM32L4X5_EXTI);
+object_initialize_child(obj, "syscfg", >syscfg, TYPE_STM32L4X5_SYSCFG);
 
 s->sysclk = qdev_init_clock_in(DEVICE(s), "sysclk", NULL, NULL, 0);
 s->refclk = qdev_init_clock_in(DEVICE(s), "refclk", NULL, NULL, 0);
@@ -154,6 +156,19 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, 
Error **errp)
 return;
 }
 
+/* System configuration controller */
+busdev = SYS_BUS_DEVICE(>syscfg);
+if (!sysbus_realize(busdev, errp)) {
+return;
+}
+sysbus_mmio_map(busdev, 0, SYSCFG_ADDR);
+/*
+ * TODO: when the GPIO device is implemented, connect it
+ * to SYSCFG using `qdev_connect_gpio_out`, NUM_GPIOS and
+ * GPIO_NUM_PINS.
+ */
+
+/* EXTI device */
 busdev = SYS_BUS_DEVICE(>exti);
 if (!sysbus_realize(busdev, errp)) {
 return;
@@ -163,6 +178,11 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, 
Error **errp)
 sysbus_connect_irq(busdev, i, qdev_get_gpio_in(armv7m, exti_irq[i]));
 }
 
+for (unsigned i = 0; i < 16; i++) {
+qdev_connect_gpio_out(DEVICE(>syscfg), i,
+  qdev_get_gpio_in(DEVICE(>exti), i));
+}
+
 /* APB1 BUS */
 create_unimplemented_device("TIM2",  0x4000, 0x400);
 create_unimplemented_device("TIM3",  0x4400, 0x400);
@@ -200,7 +220,6 @@ static void stm32l4x5_soc_realize(DeviceState *dev_soc, 
Error **errp)
 /* RESERVED:0x40009800, 0x6800 */
 
 /* APB2 BUS */
-create_unimplemented_device("SYSCFG",0x4001, 0x30);
 create_unimplemented_device("VREFBUF",   0x40010030, 0x1D0);
 create_unimplemented_device("COMP",  0x40010200, 0x200);
 /* RESERVED:0x40010800, 0x1400 */
diff --git a/include/hw/arm/stm32l4x5_soc.h b/include/hw/arm/stm32l4x5_soc.h
index f7305568dc..baf70410b5 100644
--- a/include/hw/arm/stm32l4x5_soc.h
+++ b/include/hw/arm/stm32l4x5_soc.h
@@ -26,6 +26,7 @@
 
 #include "exec/memory.h"
 #include "hw/arm/armv7m.h"
+#include "hw/misc/stm32l4x5_syscfg.h"
 #include "hw/misc/stm32l4x5_exti.h"
 #include "qom/object.h"
 
@@ -41,6 +42,7 @@ struct Stm32l4x5SocState {
 ARMv7MState armv7m;
 
 Stm32l4x5ExtiState exti;
+Stm32l4x5SyscfgState syscfg;
 
 MemoryRegion sram1;
 MemoryRegion sram2;
-- 
2.43.0




[PATCH v4 0/3] Add device STM32L4x5 SYSCFG

2024-01-09 Thread Inès Varhol
This patch adds a new STM32L4x5 SYSCFG device and is part
of a series implementing the STM32L4x5 with a few peripherals.

Changes from v3 to v4:
- swapping commit 2 (add tests) and commit 3 (connect syscfg to SoC)
so that the tests pass in the commit they're added
- in `stm32l4x5_syscfg-test.c`: instead of declaring intermediate
variables, using `syscfg_readl` directly in `g_assert_cmpuint`
so that QEMU coding style is respected
- in `stm32l4x5_syscfg-test.c`: the tests are now independent
from the EXTI device (the reads in EXTI registers were unnecessary)
- in `stm32l4x5_syscfg-test.c` : using a helper function
`syscfg_set_irq()` to help readability
- in `stm32l4x5_soc.c` : reducing scope of `i` used in for loops
- in `stm32l4x5_soc.c` : removing useless variable `dev`
- in `stm32l4x5_syscfg.c`: add macro `NUM_LINES_PER_EXTICR_REG`,
correct some coding style issues

Changes from v2 to v3:
- updating the B-L475E-IOT01A machine's documentation file
- using `GPIO_NUM_PINS` instead of 16 in `stm32l4x5_syscfg_init`
- correcting the formatting of multiline indents
- renaming a trace function (`trace_stm32l4x5_syscfg_forward_exti`
instead of `trace_stm32l4x5_syscfg_pulse_exti`)

Changes from v1 to v2:
- explain in 3rd commit why SYSCFG input GPIOs aren't connected and add
a TODO comment in stm32l4x5_soc.c
- use macros `NUM_GPIOS` and `GPIO_NUM_PINS` in
`stm32l4x5_syscfg_set_irq`
- rename STM32L4XX to STM32L4X5, Stm32l4xx to Stm32l4x5
(the SYSCFG implementation is only valid for STM32L4x5 and STM32L4x6
but not for STM32L41xx/42xx/43xx/44xx)
- refactor `STM32L4x5SyscfgState` to `Stm32l4x5SyscfgState` to be
consistent with other peripherals

Based-on: 20240109160658.311932-1-ines.var...@telecom-paris.fr
([PATCH v8 0/3] Add device STM32L4x5 EXTI)

Signed-off-by: Arnaud Minier 
Signed-off-by: Inès Varhol 

Inès Varhol (3):
  hw/misc: Implement STM32L4x5 SYSCFG
  hw/arm: Connect STM32L4x5 SYSCFG to STM32L4x5 SoC
  tests/qtest: Add STM32L4x5 SYSCFG QTest testcase

 docs/system/arm/b-l475e-iot01a.rst  |   2 +-
 hw/arm/Kconfig  |   1 +
 hw/arm/stm32l4x5_soc.c  |  21 +-
 hw/misc/Kconfig |   3 +
 hw/misc/meson.build |   1 +
 hw/misc/stm32l4x5_syscfg.c  | 266 ++
 hw/misc/trace-events|   6 +
 include/hw/arm/stm32l4x5_soc.h  |   2 +
 include/hw/misc/stm32l4x5_syscfg.h  |  54 +
 tests/qtest/meson.build |   3 +-
 tests/qtest/stm32l4x5_syscfg-test.c | 331 
 11 files changed, 687 insertions(+), 3 deletions(-)
 create mode 100644 hw/misc/stm32l4x5_syscfg.c
 create mode 100644 include/hw/misc/stm32l4x5_syscfg.h
 create mode 100644 tests/qtest/stm32l4x5_syscfg-test.c

-- 
2.43.0




[PATCH v4 3/3] tests/qtest: Add STM32L4x5 SYSCFG QTest testcase

2024-01-09 Thread Inès Varhol
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Alistair Francis 
Signed-off-by: Arnaud Minier 
Signed-off-by: Inès Varhol 
---
 tests/qtest/meson.build |   3 +-
 tests/qtest/stm32l4x5_syscfg-test.c | 331 
 2 files changed, 333 insertions(+), 1 deletion(-)
 create mode 100644 tests/qtest/stm32l4x5_syscfg-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index d890b6f333..a926af92f6 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -196,7 +196,8 @@ qtests_aspeed = \
'aspeed_gpio-test']
 
 qtests_stm32l4x5 = \
-  ['stm32l4x5_exti-test']
+  ['stm32l4x5_exti-test',
+   'stm32l4x5_syscfg-test']
 
 qtests_arm = \
   (config_all_devices.has_key('CONFIG_MPS2') ? ['sse-timer-test'] : []) + \
diff --git a/tests/qtest/stm32l4x5_syscfg-test.c 
b/tests/qtest/stm32l4x5_syscfg-test.c
new file mode 100644
index 00..ed4801798d
--- /dev/null
+++ b/tests/qtest/stm32l4x5_syscfg-test.c
@@ -0,0 +1,331 @@
+/*
+ * QTest testcase for STM32L4x5_SYSCFG
+ *
+ * Copyright (c) 2023 Arnaud Minier 
+ * Copyright (c) 2023 Inès Varhol 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+
+#define SYSCFG_BASE_ADDR 0x4001
+#define SYSCFG_MEMRMP 0x00
+#define SYSCFG_CFGR1 0x04
+#define SYSCFG_EXTICR1 0x08
+#define SYSCFG_EXTICR2 0x0C
+#define SYSCFG_EXTICR3 0x10
+#define SYSCFG_EXTICR4 0x14
+#define SYSCFG_SCSR 0x18
+#define SYSCFG_CFGR2 0x1C
+#define SYSCFG_SWPR 0x20
+#define SYSCFG_SKR 0x24
+#define SYSCFG_SWPR2 0x28
+#define INVALID_ADDR 0x2C
+
+static void syscfg_writel(unsigned int offset, uint32_t value)
+{
+writel(SYSCFG_BASE_ADDR + offset, value);
+}
+
+static uint32_t syscfg_readl(unsigned int offset)
+{
+return readl(SYSCFG_BASE_ADDR + offset);
+}
+
+static void syscfg_set_irq(int num, int level)
+{
+   qtest_set_irq_in(global_qtest, "/machine/soc/syscfg",
+NULL, num, level);
+}
+
+static void system_reset(void)
+{
+QDict *response;
+response = qtest_qmp(global_qtest, "{'execute': 'system_reset'}");
+g_assert(qdict_haskey(response, "return"));
+qobject_unref(response);
+}
+
+static void test_reset(void)
+{
+/*
+ * Test that registers are initialized at the correct values
+ */
+g_assert_cmpuint(syscfg_readl(SYSCFG_MEMRMP), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_CFGR1), ==, 0x7C01);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR1), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR2), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR3), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR4), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_SCSR), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_CFGR2), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_SWPR), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_SKR), ==, 0x);
+
+g_assert_cmpuint(syscfg_readl(SYSCFG_SWPR2), ==, 0x);
+}
+
+static void test_reserved_bits(void)
+{
+/*
+ * Test that reserved bits stay at reset value
+ * (which is 0 for all of them) by writing '1'
+ * in all reserved bits (keeping reset value for
+ * other bits) and checking that the
+ * register is still at reset value
+ */
+syscfg_writel(SYSCFG_MEMRMP, 0xFEF8);
+g_assert_cmpuint(syscfg_readl(SYSCFG_MEMRMP), ==, 0x);
+
+syscfg_writel(SYSCFG_CFGR1, 0x7F00FEFF);
+g_assert_cmpuint(syscfg_readl(SYSCFG_CFGR1), ==, 0x7C01);
+
+syscfg_writel(SYSCFG_EXTICR1, 0x);
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR1), ==, 0x);
+
+syscfg_writel(SYSCFG_EXTICR2, 0x);
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR2), ==, 0x);
+
+syscfg_writel(SYSCFG_EXTICR3, 0x);
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR3), ==, 0x);
+
+syscfg_writel(SYSCFG_EXTICR4, 0x);
+g_assert_cmpuint(syscfg_readl(SYSCFG_EXTICR4), ==, 0x);
+
+syscfg_writel(SYSCFG_SKR, 0xFF00);
+g_assert_cmpuint(syscfg_readl(SYSCFG_SKR), ==, 0x);
+}
+
+static void test_set_and_clear(void)
+{
+/*
+ * Test that regular bits can be set and cleared
+ */
+syscfg_writel(SYSCFG_MEMRMP, 0x0107);
+g_assert_cmpuint(syscfg_readl(SYSCFG_MEMRMP), ==, 0x0107);
+syscfg_writel(SYSCFG_MEMRMP, 0x);
+g_assert_cmpuint(syscfg_readl(SYSCFG_MEMRMP), ==, 0x);
+
+/* cfgr1 bit 0 is clear only so we keep it set */
+syscfg_writel(SYSCFG_CFGR1, 0xFCFF0101);
+g_assert_cmpuint(syscfg_readl(SYSCFG_CFGR1), ==, 0xFCFF0101);
+syscfg_writel(SYSCFG_CFGR1, 0x0001);
+g_assert_cmpuint(syscfg_readl(SYSCFG_CFGR1), ==, 0x0001);
+
+syscfg_writel(SYSCFG_EXTICR1, 0x);
+

[PATCH v4 1/3] hw/misc: Implement STM32L4x5 SYSCFG

2024-01-09 Thread Inès Varhol
Acked-by: Alistair Francis 
Signed-off-by: Arnaud Minier 
Signed-off-by: Inès Varhol 
---
 docs/system/arm/b-l475e-iot01a.rst |   2 +-
 hw/misc/Kconfig|   3 +
 hw/misc/meson.build|   1 +
 hw/misc/stm32l4x5_syscfg.c | 266 +
 hw/misc/trace-events   |   6 +
 include/hw/misc/stm32l4x5_syscfg.h |  54 ++
 6 files changed, 331 insertions(+), 1 deletion(-)
 create mode 100644 hw/misc/stm32l4x5_syscfg.c
 create mode 100644 include/hw/misc/stm32l4x5_syscfg.h

diff --git a/docs/system/arm/b-l475e-iot01a.rst 
b/docs/system/arm/b-l475e-iot01a.rst
index 72f256ace7..1a021b306a 100644
--- a/docs/system/arm/b-l475e-iot01a.rst
+++ b/docs/system/arm/b-l475e-iot01a.rst
@@ -16,6 +16,7 @@ Currently B-L475E-IOT01A machine's only supports the 
following devices:
 
 - Cortex-M4F based STM32L4x5 SoC
 - STM32L4x5 EXTI (Extended interrupts and events controller)
+- STM32L4x5 SYSCFG (System configuration controller)
 
 Missing devices
 """
@@ -24,7 +25,6 @@ The B-L475E-IOT01A does *not* support the following devices:
 
 - Reset and clock control (RCC)
 - Serial ports (UART)
-- System configuration controller (SYSCFG)
 - General-purpose I/Os (GPIO)
 - Analog to Digital Converter (ADC)
 - SPI controller
diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index 3efe3dc2cc..4fc6b29b43 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -90,6 +90,9 @@ config STM32F4XX_EXTI
 config STM32L4X5_EXTI
 bool
 
+config STM32L4X5_SYSCFG
+bool
+
 config MIPS_ITU
 bool
 
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 16db6e228d..2ca2ce4b62 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -111,6 +111,7 @@ system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: 
files('stm32f2xx_syscfg.
 system_ss.add(when: 'CONFIG_STM32F4XX_SYSCFG', if_true: 
files('stm32f4xx_syscfg.c'))
 system_ss.add(when: 'CONFIG_STM32F4XX_EXTI', if_true: 
files('stm32f4xx_exti.c'))
 system_ss.add(when: 'CONFIG_STM32L4X5_EXTI', if_true: 
files('stm32l4x5_exti.c'))
+system_ss.add(when: 'CONFIG_STM32L4X5_SYSCFG', if_true: 
files('stm32l4x5_syscfg.c'))
 system_ss.add(when: 'CONFIG_MPS2_FPGAIO', if_true: files('mps2-fpgaio.c'))
 system_ss.add(when: 'CONFIG_MPS2_SCC', if_true: files('mps2-scc.c'))
 
diff --git a/hw/misc/stm32l4x5_syscfg.c b/hw/misc/stm32l4x5_syscfg.c
new file mode 100644
index 00..fd68cb800b
--- /dev/null
+++ b/hw/misc/stm32l4x5_syscfg.c
@@ -0,0 +1,266 @@
+/*
+ * STM32L4x5 SYSCFG (System Configuration Controller)
+ *
+ * Copyright (c) 2023 Arnaud Minier 
+ * Copyright (c) 2023 Inès Varhol 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * This work is based on the stm32f4xx_syscfg by Alistair Francis.
+ * Original code is licensed under the MIT License:
+ *
+ * Copyright (c) 2014 Alistair Francis 
+ */
+
+/*
+ * The reference used is the STMicroElectronics RM0351 Reference manual
+ * for STM32L4x5 and STM32L4x6 advanced Arm ® -based 32-bit MCUs.
+ * 
https://www.st.com/en/microcontrollers-microprocessors/stm32l4x5/documentation.html
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "trace.h"
+#include "hw/irq.h"
+#include "migration/vmstate.h"
+#include "hw/misc/stm32l4x5_syscfg.h"
+
+#define SYSCFG_MEMRMP 0x00
+#define SYSCFG_CFGR1 0x04
+#define SYSCFG_EXTICR1 0x08
+#define SYSCFG_EXTICR2 0x0C
+#define SYSCFG_EXTICR3 0x10
+#define SYSCFG_EXTICR4 0x14
+#define SYSCFG_SCSR 0x18
+#define SYSCFG_CFGR2 0x1C
+#define SYSCFG_SWPR 0x20
+#define SYSCFG_SKR 0x24
+#define SYSCFG_SWPR2 0x28
+
+/* __0001_0111 */
+#define ACTIVABLE_BITS_MEMRP 0x0107
+
+/* 1100__0001_ */
+#define ACTIVABLE_BITS_CFGR1 0xFCFF0100
+/* ___0001 */
+#define FIREWALL_DISABLE_CFGR1 0x0001
+
+/* ___ */
+#define ACTIVABLE_BITS_EXTICR 0x
+
+/* ___0011 */
+/* #define ACTIVABLE_BITS_SCSR 0x0003 */
+
+/* ___ */
+#define ECC_LOCK_CFGR2 0x000F
+/* __0001_ */
+#define SRAM2_PARITY_ERROR_FLAG_CFGR2 0x0100
+
+/* ___ */
+#define ACTIVABLE_BITS_SKR 0x00FF
+
+#define NUM_LINES_PER_EXTICR_REG 4
+
+static void stm32l4x5_syscfg_hold_reset(Object *obj)
+{
+Stm32l4x5SyscfgState *s = STM32L4X5_SYSCFG(obj);
+
+s->memrmp = 0x;
+s->cfgr1 = 0x7C01;
+s->exticr[0] = 0x;
+s->exticr[1] = 0x;
+s->exticr[2] = 0x;
+s->exticr[3] = 0x;
+s->scsr = 0x;
+s->cfgr2 = 0x;
+s->swpr = 0x;
+s->skr = 0x;
+s->swpr2 = 0x;
+}
+
+static void stm32l4x5_syscfg_set_irq(void *opaque, int irq, int level)
+{
+Stm32l4x5SyscfgState *s = opaque;
+const uint8_t gpio = irq / 

Re: [PATCH 1/3] linux-user: Allow gdbstub to ignore page protection

2024-01-09 Thread Ilya Leoshkevich
On Wed, 2024-01-10 at 04:42 +1100, Richard Henderson wrote:
> On 1/9/24 10:34, Ilya Leoshkevich wrote:
> > gdbserver ignores page protection by virtue of using
> > /proc/$pid/mem.
> > Teach qemu gdbstub to do this too. This will not work if /proc is
> > not
> > mounted; accept this limitation.
> > 
> > One alternative is to temporarily grant the missing PROT_* bit, but
> > this is inherently racy. Another alternative is self-debugging with
> > ptrace(POKE), which will break if QEMU itself is being debugged - a
> > much more severe limitation.
> > 
> > Signed-off-by: Ilya Leoshkevich 
> > ---
> >   cpu-target.c | 55 ++-
> > -
> >   1 file changed, 40 insertions(+), 15 deletions(-)
> > 
> > diff --git a/cpu-target.c b/cpu-target.c
> > index 5eecd7ea2d7..69e97f78980 100644
> > --- a/cpu-target.c
> > +++ b/cpu-target.c
> > @@ -406,6 +406,15 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr
> > addr,
> >   vaddr l, page;
> >   void * p;
> >   uint8_t *buf = ptr;
> > +    int ret = -1;
> > +    int mem_fd;
> > +
> > +    /*
> > + * Try ptrace first. If /proc is not mounted or if there is a
> > different
> > + * problem, fall back to the manual page access. Note that,
> > unlike ptrace,
> > + * it will not be able to ignore the protection bits.
> > + */
> > +    mem_fd = open("/proc/self/mem", is_write ? O_WRONLY :
> > O_RDONLY);
> 
> Surely this is the unlikely fallback, and you don't need to open
> unless the page is 
> otherwise inaccessible.

Ok, I can move this under (flags & PAGE_*) checks.

> I see no handling for writes to pages that contain TranslationBlocks.

Sorry, I completely missed that. I'm currently experimenting with the
following:

/*
 * If there is a TranslationBlock and we weren't bypassing
host
 * page protection, the memcpy() above would SEGV, ultimately
 * leading to page_unprotect(). So invalidate the translations
 * manually. Both invalidation and pwrite() must be under
 * mmap_lock() in order to prevent the creation of another
 * TranslationBlock in between.
 */
mmap_lock();
tb_invalidate_phys_page(page);
written = pwrite(fd, buf, l, (off_t)g2h_untagged(addr));
mmap_unlock();

Does that look okay?
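
For readers unfamiliar with the /proc/self/mem trick, here is a minimal
standalone sketch of the idea, outside of QEMU. The function names are
hypothetical, and the demo assumes Linux with /proc mounted and a kernel that
still allows forced writes through /proc/self/mem; when that is not the case
it simply reports that the write did not happen:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from src to dst in our own address space by going
 * through /proc/self/mem, which does not honour the page protection bits.
 * Returns the number of bytes written, or -1 on failure.
 */
static ssize_t poke_via_proc_mem(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    int fd = open("/proc/self/mem", O_WRONLY);
    if (fd < 0) {
        return -1;              /* e.g. /proc is not mounted */
    }
    ssize_t written = pwrite(fd, src, len, (off_t)(uintptr_t)dst);
    close(fd);
    return written;
}

/*
 * Demo: map a read-only page and try to modify it anyway.
 * Returns 1 if the write landed, 0 if /proc/self/mem was unavailable or
 * refused the write, and -1 if the page content does not match.
 */
static int demo_readonly_poke(void)
{
    static const char msg[] = "poked";
    char *page = mmap(NULL, 4096, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return 0;
    }

    int result = 0;
    ssize_t n = poke_via_proc_mem(page, msg, sizeof(msg));
    if (n == (ssize_t)sizeof(msg)) {
        result = memcmp(page, msg, sizeof(msg)) == 0 ? 1 : -1;
    }
    munmap(page, 4096);
    return result;
}
```

A plain memcpy() into that page would fault, which is the case the patch is
trying to cover for the gdbstub. The sketch deliberately leaves out the
TranslationBlock invalidation and locking discussed above.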

[...]



Re: [PATCH v6 1/2] qom: new object to associate device to numa node

2024-01-09 Thread Jason Gunthorpe
On Tue, Jan 09, 2024 at 11:36:03AM -0800, Dan Williams wrote:
> Jason Gunthorpe wrote:
> > On Tue, Jan 09, 2024 at 06:02:03PM +0100, David Hildenbrand wrote:
> > > > Given that, an alternative proposal that I think would work
> > > > for you would be to add a 'placeholder' memory node definition
> > > > in SRAT (so allow 0 size explicitly - might need a new SRAT
> > > > entry to avoid backwards compat issues).
> > > 
> > > Putting all the PCI/GI/... complexity aside, I'll just raise again that for
> > > virtio-mem something simple like that might be helpful as well, IIUC.
> > > 
> > >   -numa node,nodeid=2 \
> > >   ...
> > >   -device virtio-mem-pci,node=2,... \
> > > 
> > > All we need is the OS to prepare for an empty node that will get populated
> > > with memory later.
> > 
> > That is all this is doing too, the NUMA relationship of the actual
> > memory is described already by the PCI device since it is a BAR on the
> > device.
> > 
> > The only purpose is to get the empty nodes into Linux :(
> > 
> > > So if that's what a "placeholder" node definition in srat could achieve as
> > > well, even without all of the other acpi-generic-initiator stuff, that would
> > > be great.
> > 
> > Seems like there are two quite similar use cases.. virtio-mem is going
> > to be calling the same family of kernel API I suspect :)
> 
> It seems sad that we, as an industry, went through all of this trouble
> to define a dynamically enumerable CXL device model only to turn around
> and require static ACPI tables to tell us how to enumerate it.
> 
> A similar problem exists on the memory target side and the approach
> taken there was to have Linux statically reserve at least enough numa
> node numbers for all the platform CXL memory ranges (defined in the
> ACPI.CEDT.CFMWS), but with the promise to come back and broach the
> dynamic node creation problem "if the need arises".
> 
> This initiator-node enumeration case seems like that occasion where the
> need has arisen to get Linux out of the mode of needing to declare all
> possible numa nodes early in boot. Allow for nodes to be discoverable
> post NUMA-init.
> 
> One strawman scheme that comes to mind is instead of "add nodes early" in
> boot, "delete unused nodes late" in boot after the device topology has
> been enumerated. Otherwise, requiring static ACPI tables to further
> enumerate an industry-standard dynamically enumerated bus seems to be
> going in the wrong direction.

Fully agree, and I think this will get increasingly painful as we go
down the CXL road.

Jason



Re: [PATCH v6 1/2] qom: new object to associate device to numa node

2024-01-09 Thread Dan Williams
Jason Gunthorpe wrote:
> On Tue, Jan 09, 2024 at 06:02:03PM +0100, David Hildenbrand wrote:
> > > Given that, an alternative proposal that I think would work
> > > for you would be to add a 'placeholder' memory node definition
> > > in SRAT (so allow 0 size explicitly - might need a new SRAT
> > > entry to avoid backwards compat issues).
> > 
> > Putting all the PCI/GI/... complexity aside, I'll just raise again that for
> > virtio-mem something simple like that might be helpful as well, IIUC.
> > 
> > -numa node,nodeid=2 \
> > ...
> > -device virtio-mem-pci,node=2,... \
> > 
> > All we need is the OS to prepare for an empty node that will get populated
> > with memory later.
> 
> That is all this is doing too, the NUMA relationship of the actual
> memory is described already by the PCI device since it is a BAR on the
> device.
> 
> The only purpose is to get the empty nodes into Linux :(
> 
> > So if that's what a "placeholder" node definition in srat could achieve as
> > well, even without all of the other acpi-generic-initiator stuff, that would
> > be great.
> 
> Seems like there are two quite similar use cases.. virtio-mem is going
> to be calling the same family of kernel API I suspect :)

It seems sad that we, as an industry, went through all of this trouble
to define a dynamically enumerable CXL device model only to turn around
and require static ACPI tables to tell us how to enumerate it.

A similar problem exists on the memory target side and the approach
taken there was to have Linux statically reserve at least enough numa
node numbers for all the platform CXL memory ranges (defined in the
ACPI.CEDT.CFMWS), but with the promise to come back and broach the
dynamic node creation problem "if the need arises".

This initiator-node enumeration case seems like that occasion where the
need has arisen to get Linux out of the mode of needing to declare all
possible numa nodes early in boot. Allow for nodes to be discoverable
post NUMA-init.

One strawman scheme that comes to mind is instead of "add nodes early" in
boot, "delete unused nodes late" in boot after the device topology has
been enumerated. Otherwise, requiring static ACPI tables to further
enumerate an industry-standard dynamically enumerated bus seems to be
going in the wrong direction.



Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-09 Thread Hao Xiang
On Mon, Jan 8, 2024 at 5:13 PM Gregory Price  wrote:
>
> On Mon, Jan 08, 2024 at 05:05:38PM -0800, Hao Xiang wrote:
> > On Mon, Jan 8, 2024 at 2:47 PM Hao Xiang  wrote:
> > >
> > > On Mon, Jan 8, 2024 at 9:15 AM Gregory Price  wrote:
> > > >
> > > > On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote:
> > > > > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price  wrote:
> > > > > >
> > > > > > For a variety of performance reasons, this will not work the way you
> > > > > > want it to.  You are essentially telling QEMU to map the vmem0 into a
> > > > > > virtual cxl device, and now any memory accesses to that memory region
> > > > > > will end up going through the cxl-type3 device logic - which is an IO
> > > > > > path from the perspective of QEMU.
> > > > >
> > > > > I didn't understand exactly how the virtual cxl-type3 device works. I
> > > > > thought it would go with the same "guest virtual address ->  guest
> > > > > physical address -> host physical address" translation totally done by
> > > > > CPU. But if it is going through an emulation path handled by virtual
> > > > > cxl-type3, I agree the performance would be bad. Do you know why
> > > > > accessing memory on a virtual cxl-type3 device can't go with the
> > > > > nested page table translation?
> > > > >
> > > >
> > > > Because a byte-access on CXL memory can have checks on it that must be
> > > > emulated by the virtual device, and because there are caching
> > > > implications that have to be emulated as well.
> > >
> > > Interesting. Now that I see the cxl_type3_read/cxl_type3_write. If the
> > > CXL memory data path goes through them, the performance would be
> > > pretty problematic. We have actually run Intel's Memory Latency
> > > Checker benchmark from inside a guest VM with both system-DRAM and
> > > virtual CXL-type3 configured. The idle latency on the virtual CXL
> > > memory is 2X of system DRAM, which is on-par with the benchmark
> > > running from a physical host. I need to debug this more to understand
> > > why the latency is actually much better than I would expect now.
> >
> > So we double checked on benchmark testing. What we see is that running
> > Intel Memory Latency Checker from a guest VM with virtual CXL memory
> > VS from a physical host with CXL1.1 memory expander has the same
> > latency.
> >
> > From guest VM: local socket system-DRAM latency is 117.0ns, local
> > socket CXL-DRAM latency is 269.4ns
> > From physical host: local socket system-DRAM latency is 113.6ns ,
> > local socket CXL-DRAM latency is 267.5ns
> >
> > I also set debugger breakpoints on cxl_type3_read/cxl_type3_write
> > while running the benchmark testing but those two functions are not
> > ever hit. We used the virtual CXL configuration while launching QEMU
> > but the CXL memory is present as a separate NUMA node and we are not
> > creating devdax devices. Does that make any difference?
> >
>
> Could you possibly share your full QEMU configuration and what OS/kernel
> you are running inside the guest?

Sounds like the technical details are explained on the other thread.
From what I understand now, if we don't go through a complex CXL
setup, it wouldn't go through the emulation path.

Here is our exact setup. Guest runs Linux kernel 6.6rc2

taskset --cpu-list 0-47,96-143 \
numactl -N 0 -m 0 ${QEMU} \
-M q35,cxl=on,hmat=on \
-m 64G \
-smp 8,sockets=1,cores=8,threads=1 \
-object memory-backend-ram,id=ram0,size=45G \
-numa node,memdev=ram0,cpus=0-7,nodeid=0 \
-msg timestamp=on -L /usr/share/seabios \
-enable-kvm \
-object memory-backend-ram,id=vmem0,size=19G,host-nodes=${HOST_CXL_NODE},policy=bind \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
-device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
-device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
-numa node,memdev=vmem0,nodeid=1 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=19G,cxl-fmw.0.interleave-granularity=8k \
-numa dist,src=0,dst=0,val=10 \
-numa dist,src=0,dst=1,val=14 \
-numa dist,src=1,dst=0,val=14 \
-numa dist,src=1,dst=1,val=10 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=read-latency,latency=91 \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=read-latency,latency=100 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=write-latency,latency=91 \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=write-latency,latency=100 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=read-bandwidth,bandwidth=262100M \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=read-bandwidth,bandwidth=3M \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=write-bandwidth,bandwidth=176100M \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=write-bandwidth,bandwidth=3M \
-drive file="${DISK_IMG}",format=qcow2 \
-device pci-bridge,chassis_nr=3,id=pci.3,bus=pcie.0,addr=0xd \
-netdev 

Re: [PATCH v8 01/10] hw/fsi: Introduce IBM's Local bus

2024-01-09 Thread Ninad Palsule

Hello Cedric,

On 12/12/23 08:46, Cédric Le Goater wrote:

On 11/29/23 00:56, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The LBUS is modelled to maintain mapped memory for the devices. The
memory is mapped after CFAM config, peek table and FSI slave registers.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
[ clg: - removed lbus_add_device() bc unused
    - removed lbus_create_device() bc used only once
    - removed "address" property
    - updated meson.build to build fsi dir
    - included an empty hw/fsi/trace-events ]
Signed-off-by: Cédric Le Goater 
---
  meson.build   |  1 +
  hw/fsi/trace.h    |  1 +
  include/hw/fsi/lbus.h | 40 +
  hw/fsi/lbus.c | 51 +++
  hw/Kconfig    |  1 +
  hw/fsi/Kconfig    |  2 ++
  hw/fsi/meson.build    |  1 +
  hw/fsi/trace-events   |  1 +
  hw/meson.build    |  1 +
  9 files changed, 99 insertions(+)
  create mode 100644 hw/fsi/trace.h
  create mode 100644 include/hw/fsi/lbus.h
  create mode 100644 hw/fsi/lbus.c
  create mode 100644 hw/fsi/Kconfig
  create mode 100644 hw/fsi/meson.build
  create mode 100644 hw/fsi/trace-events

diff --git a/meson.build b/meson.build
index ec01f8b138..b6556efd51 100644
--- a/meson.build
+++ b/meson.build
@@ -3298,6 +3298,7 @@ if have_system
  'hw/char',
  'hw/display',
  'hw/dma',
+    'hw/fsi',
  'hw/hyperv',
  'hw/i2c',
  'hw/i386',
diff --git a/hw/fsi/trace.h b/hw/fsi/trace.h
new file mode 100644
index 00..ee67c7fb04
--- /dev/null
+++ b/hw/fsi/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-hw_fsi.h"
diff --git a/include/hw/fsi/lbus.h b/include/hw/fsi/lbus.h
new file mode 100644
index 00..a58e33d061
--- /dev/null
+++ b/include/hw/fsi/lbus.h
@@ -0,0 +1,40 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus and connected device structures.
+ */
+#ifndef FSI_LBUS_H
+#define FSI_LBUS_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#define TYPE_FSI_LBUS_DEVICE "fsi.lbus.device"
+OBJECT_DECLARE_TYPE(FSILBusDevice, FSILBusDeviceClass, FSI_LBUS_DEVICE)
+
+#define FSI_LBUS_MEM_REGION_SIZE  (2 * 1024 * 1024)
+#define FSI_LBUSDEV_IOMEM_START   0xc00 /* 3K used by CFAM config etc */


I don't think sizing the local bus MMIO region exactly to the size of
the CFAM MMIO region is necessary. The upper LBUS/CFAM addresses might
not even be backed by device registers.

I would simplify with :

#define FSI_LBUS_MEM_REGION_SIZE  (1 * MiB)

and forget about the offset.


OK, I changed it to 1 MiB.

Thanks for the review.

Regards,

Ninad





[PATCH] block/blklogwrites: Fix a bug when logging "write zeroes" operations.

2024-01-09 Thread megari
From: Ari Sundholm 

There is a bug in the blklogwrites driver pertaining to logging "write
zeroes" operations, causing log corruption. This can be easily observed
by setting detect-zeroes to something other than "off" for the driver.

The issue is caused by a concurrency bug pertaining to the fact that
"write zeroes" operations have to be logged in two parts: first the log
entry metadata, then the zeroed-out region. While the log entry
metadata is being written by bdrv_co_pwritev(), another operation may
begin in the meanwhile and modify the state of the blklogwrites driver.
This is as intended by the coroutine-driven I/O model in QEMU, of
course.

Unfortunately, this specific scenario is mishandled. A short example:
1. Initially, in the current operation (#1), the current log sector
number in the driver state is only incremented by the number of sectors
taken by the log entry metadata, after which the log entry metadata is
written. The current operation yields.
2. Another operation (#2) may start while the log entry metadata is
being written. It uses the current log position as the start offset for
its log entry. This is in the sector right after the operation #1 log
entry metadata, which is bad!
3. After bdrv_co_pwritev() returns (#1), the current log sector
number is reread from the driver state in order to find out the start
offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the
offset will be the sector right after the (misplaced) operation #2 log
entry, which means that the zeroed-out region begins at the wrong
offset.
4. As a result of the above, the log is corrupt.

Fix this by only reading the driver metadata once, computing the
offsets and sizes in one go (including the optional zeroed-out region)
and setting the log sector number to the appropriate value for the next
operation in line.

Signed-off-by: Ari Sundholm 
Cc: qemu-sta...@nongnu.org
---
 block/blklogwrites.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 7207b2e757..ba717dab4d 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -328,22 +328,39 @@ static void coroutine_fn GRAPH_RDLOCK
 blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
 {
 BDRVBlkLogWritesState *s = lr->bs->opaque;
-uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;

-s->nr_entries++;
-s->cur_log_sector +=
-ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
+/*
+ * Determine the offsets and sizes of different parts of the entry, and
+ * update the state of the driver.
+ *
+ * This needs to be done in one go, before any actual I/O is done, as the
+ * log entry may have to be written in two parts, and the state of the
+ * driver may be modified by other driver operations while waiting for the
+ * I/O to complete.
+ */
+const uint64_t entry_start_sector = s->cur_log_sector;
+const uint64_t entry_offset = entry_start_sector << s->sectorbits;
+const uint64_t qiov_aligned_size = ROUND_UP(lr->qiov->size, s->sectorsize);
+const uint64_t entry_aligned_size = qiov_aligned_size +
+ROUND_UP(lr->zero_size, s->sectorsize);
+const uint64_t entry_nr_sectors = entry_aligned_size >> s->sectorbits;

-lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
+s->nr_entries++;
+s->cur_log_sector += entry_nr_sectors;
+
+/*
+ * Write the log entry. Note that if this is a "write zeroes" operation,
+ * only the entry header is written here, with the zeroing being done
+ * separately below.
+ */
+lr->log_ret = bdrv_co_pwritev(s->log_file, entry_offset, lr->qiov->size,
   lr->qiov, 0);

 /* Logging for the "write zeroes" operation */
 if (lr->log_ret == 0 && lr->zero_size) {
-cur_log_offset = s->cur_log_sector << s->sectorbits;
-s->cur_log_sector +=
-ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
+const uint64_t zeroes_offset = entry_offset + qiov_aligned_size;

-lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
+lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, zeroes_offset,
 lr->zero_size, 0);
 }

--
2.43.0
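
The "compute everything once, before any I/O" idea from the commit message above can be sketched in isolation. This is a simplified model, not the driver code: struct log_state, struct entry_layout and reserve_entry() are hypothetical names, and the actual writes are left out. The point it tries to show is that both offsets are derived from a single read of the log position, so a second operation that starts while the first is still waiting for I/O cannot land inside the first entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant driver state. */
struct log_state {
    uint64_t cur_log_sector;
    uint64_t nr_entries;
    unsigned sectorbits;
};

/* Where the two parts of one log entry go. */
struct entry_layout {
    uint64_t entry_offset;   /* start of the entry metadata */
    uint64_t zeroes_offset;  /* start of the optional zeroed-out region */
};

static uint64_t round_up(uint64_t n, uint64_t align)
{
    return (n + align - 1) / align * align;
}

/*
 * Reserve space for a whole entry (metadata plus optional zeroed region)
 * in one step and advance the log position past all of it.
 */
static struct entry_layout reserve_entry(struct log_state *s,
                                         uint64_t header_size,
                                         uint64_t zero_size)
{
    assert(s->sectorbits < 32);

    const uint64_t sectorsize = UINT64_C(1) << s->sectorbits;
    const uint64_t header_aligned = round_up(header_size, sectorsize);
    const uint64_t total = header_aligned + round_up(zero_size, sectorsize);
    struct entry_layout layout;

    layout.entry_offset = s->cur_log_sector << s->sectorbits;
    layout.zeroes_offset = layout.entry_offset + header_aligned;

    /* State is updated before any I/O would be issued. */
    s->nr_entries++;
    s->cur_log_sector += total >> s->sectorbits;
    return layout;
}
```

With 512-byte sectors and the log position at sector 1, reserving an entry with a 64-byte header and 4096 zeroed bytes gives entry_offset 512 and zeroes_offset 1024, and the next reservation starts at byte 5120, directly after the zeroed region.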




[PATCH] hw/timer: fix systick trace message

2024-01-09 Thread Samuel Tardieu
Signed-off-by: Samuel Tardieu 
Fixes: ff68dacbc786 ("armv7m: Split systick out from NVIC")
---
 hw/timer/trace-events | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/timer/trace-events b/hw/timer/trace-events
index 3eccef83858..8145e18e3da 100644
--- a/hw/timer/trace-events
+++ b/hw/timer/trace-events
@@ -35,7 +35,7 @@ aspeed_timer_read(uint64_t offset, unsigned size, uint64_t value) "From 0x%" PRI
 
 # armv7m_systick.c
 systick_reload(void) "systick reload"
-systick_timer_tick(void) "systick reload"
+systick_timer_tick(void) "systick tick"
 systick_read(uint64_t addr, uint32_t value, unsigned size) "systick read addr 0x%" PRIx64 " data 0x%" PRIx32 " size %u"
 systick_write(uint64_t addr, uint32_t value, unsigned size) "systick write addr 0x%" PRIx64 " data 0x%" PRIx32 " size %u"
 
-- 
2.42.0




Re: [PATCH v1 1/2] oslib-posix: refactor memory prealloc threads

2024-01-09 Thread Mark Kanda




On 1/9/24 8:25 AM, David Hildenbrand wrote:

On 09.01.24 15:15, Daniel P. Berrangé wrote:

On Tue, Jan 09, 2024 at 03:02:00PM +0100, David Hildenbrand wrote:

On 08.01.24 19:40, Mark Kanda wrote:

On 1/8/24 9:40 AM, David Hildenbrand wrote:

On 08.01.24 16:10, Mark Kanda wrote:

Refactor the memory prealloc threads support:
- Make memset context a global qlist
- Move the memset thread join/cleanup code to a separate routine

This is functionally equivalent and facilitates multiple memset contexts
(used in a subsequent patch).

Signed-off-by: Mark Kanda 
---
    util/oslib-posix.c | 104 
+

    1 file changed, 68 insertions(+), 36 deletions(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index e86fd64e09..293297ac6c 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -63,11 +63,15 @@
      struct MemsetThread;
    +static QLIST_HEAD(, MemsetContext) memset_contexts =
+    QLIST_HEAD_INITIALIZER(memset_contexts);
+
    typedef struct MemsetContext {
    bool all_threads_created;
    bool any_thread_failed;
    struct MemsetThread *threads;
    int num_threads;
+    QLIST_ENTRY(MemsetContext) next;
    } MemsetContext;
      struct MemsetThread {
@@ -81,7 +85,7 @@ struct MemsetThread {
    typedef struct MemsetThread MemsetThread;
      /* used by sigbus_handler() */
-static MemsetContext *sigbus_memset_context;
+static bool sigbus_memset_context;
    struct sigaction sigbus_oldact;
    static QemuMutex sigbus_mutex;
    @@ -295,13 +299,16 @@ static void sigbus_handler(int signal)
    #endif /* CONFIG_LINUX */
    {
    int i;
+    MemsetContext *context;
      if (sigbus_memset_context) {
-    for (i = 0; i < sigbus_memset_context->num_threads; i++) {
-    MemsetThread *thread = &sigbus_memset_context->threads[i];
+    QLIST_FOREACH(context, &memset_contexts, next) {
+    for (i = 0; i < context->num_threads; i++) {
+    MemsetThread *thread = &context->threads[i];
    -    if (qemu_thread_is_self(&thread->pgthread)) {
-    siglongjmp(thread->env, 1);
+    if (qemu_thread_is_self(&thread->pgthread)) {
+    siglongjmp(thread->env, 1);
+    }
    }
    }
    }
@@ -417,14 +424,15 @@ static int touch_all_pages(char *area, size_t
hpagesize, size_t numpages,
   bool use_madv_populate_write)
    {
    static gsize initialized = 0;
-    MemsetContext context = {
-    .num_threads = get_memset_num_threads(hpagesize, numpages,
max_threads),
-    };
+    MemsetContext *context = g_malloc0(sizeof(MemsetContext));
    size_t numpages_per_thread, leftover;
    void *(*touch_fn)(void *);
-    int ret = 0, i = 0;
+    int i = 0;
    char *addr = area;
    +    context->num_threads =
+    get_memset_num_threads(hpagesize, numpages, max_threads);
+
    if (g_once_init_enter(&initialized)) {
    qemu_mutex_init(&page_mutex);
    qemu_cond_init(&page_cond);
@@ -433,7 +441,7 @@ static int touch_all_pages(char *area, size_t
hpagesize, size_t numpages,
      if (use_madv_populate_write) {
    /* Avoid creating a single thread for MADV_POPULATE_WRITE */
-    if (context.num_threads == 1) {
+    if (context->num_threads == 1) {
    if (qemu_madvise(area, hpagesize * numpages,
QEMU_MADV_POPULATE_WRITE)) {
    return -errno;
@@ -445,49 +453,74 @@ static int touch_all_pages(char *area, size_t
hpagesize, size_t numpages,
    touch_fn = do_touch_pages;
    }
    -    context.threads = g_new0(MemsetThread, 
context.num_threads);

-    numpages_per_thread = numpages / context.num_threads;
-    leftover = numpages % context.num_threads;
-    for (i = 0; i < context.num_threads; i++) {
-    context.threads[i].addr = addr;
-    context.threads[i].numpages = numpages_per_thread + (i <
leftover);
-    context.threads[i].hpagesize = hpagesize;
-    context.threads[i].context = &context;
+    context->threads = g_new0(MemsetThread, context->num_threads);
+    numpages_per_thread = numpages / context->num_threads;
+    leftover = numpages % context->num_threads;
+    for (i = 0; i < context->num_threads; i++) {
+    context->threads[i].addr = addr;
+    context->threads[i].numpages = numpages_per_thread + (i <
leftover);
+    context->threads[i].hpagesize = hpagesize;
+    context->threads[i].context = context;
    if (tc) {
-    thread_context_create_thread(tc,
&context.threads[i].pgthread,
+    thread_context_create_thread(tc,
&context->threads[i].pgthread,
"touch_pages",
- touch_fn,
&context.threads[i],
+ touch_fn,
&context->threads[i],
QEMU_THREAD_JOINABLE);
    } else {
- qemu_thread_create(&context.threads[i].pgthread, "touch_pages",
-   touch_fn, &context.threads[i],
+ qemu_thread_create(&context->threads[i].pgthread, "touch_pages",
+   touch_fn, &context->threads[i],

Re: [PATCH v4 2/5] target/riscv: Add cycle & instret privilege mode filtering properties

2024-01-09 Thread Daniel Henrique Barboza




On 1/8/24 21:25, Atish Patra wrote:

From: Kaiwen Xue 

This adds the properties for ISA extension smcntrpmf. Patches
implementing it will follow.

Signed-off-by: Atish Patra 
Signed-off-by: Kaiwen Xue 
---


Reviewed-by: Daniel Henrique Barboza 


  target/riscv/cpu.c | 2 ++
  target/riscv/cpu_cfg.h | 1 +
  2 files changed, 3 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 83c7c0cf07be..501ae560ec29 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -144,6 +144,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
  ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
  ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
  ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
+ISA_EXT_DATA_ENTRY(smcntrpmf, PRIV_VERSION_1_12_0, ext_smcntrpmf),
  ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
  ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
  ISA_EXT_DATA_ENTRY(ssaia, PRIV_VERSION_1_12_0, ext_ssaia),
@@ -1296,6 +1297,7 @@ const char *riscv_get_misa_ext_description(uint32_t bit)
  const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
  /* Defaults for standard extensions */
  MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false),
+MULTI_EXT_CFG_BOOL("smcntrpmf", ext_smcntrpmf, false),
  MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
  MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true),
  MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index f4605fb190b9..00c34fdd3209 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -72,6 +72,7 @@ struct RISCVCPUConfig {
  bool ext_zihpm;
  bool ext_smstateen;
  bool ext_sstc;
+bool ext_smcntrpmf;
  bool ext_svadu;
  bool ext_svinval;
  bool ext_svnapot;




Re: [PATCH v2 10/14] hw/arm: Prefer arm_feature(EL2) over object_property_find(has_el2)

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 19:09, Philippe Mathieu-Daudé wrote:

The "has_el2" property is added to ARMCPU when the
ARM_FEATURE_EL2 feature is available. Rather than
checking whether the QOM property is present, directly
check the feature.

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/arm/vexpress.c  | 3 ++-
  hw/arm/virt.c  | 2 +-
  hw/cpu/a15mpcore.c | 6 --
  3 files changed, 7 insertions(+), 4 deletions(-)




diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
index cebfe142cf..1fa079b3b8 100644
--- a/hw/cpu/a15mpcore.c
+++ b/hw/cpu/a15mpcore.c
@@ -73,9 +73,11 @@ static void a15mp_priv_realize(DeviceState *dev, Error **errp)
  qdev_prop_set_bit(gicdev, "has-security-extensions", true);
  }
  /* Similarly for virtualization support */
-has_el2 = object_property_find(cpuobj, "has_el2") &&
+has_el2 = arm_feature(cpu_env(cpu), ARM_FEATURE_EL2);
+if (has_el2) {
  object_property_get_bool(cpuobj, "has_el2", &error_abort);


The following is missing and should be squashed on top:

-- >8 --
 if (has_el2) {
-object_property_get_bool(cpuobj, "has_el2", &error_abort);
-qdev_prop_set_bit(gicdev, "has-virtualization-extensions", true);
+qdev_prop_set_bit(gicdev, "has-virtualization-extensions",
+  object_property_get_bool(cpuobj, "has_el2",
+   &error_abort));
 }
---


-qdev_prop_set_bit(gicdev, "has-virtualization-extensions", has_el2);
+qdev_prop_set_bit(gicdev, "has-virtualization-extensions", true);
+}
  }
  
  if (!sysbus_realize(SYS_BUS_DEVICE(&s->gic), errp)) {





Re: [PATCH v2 09/14] hw/arm: Prefer arm_feature(EL3) over object_property_find(has_el3)

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 19:09, Philippe Mathieu-Daudé wrote:

The "has_el3" property is added to ARMCPU when the
ARM_FEATURE_EL3 feature is available. Rather than
checking whether the QOM property is present, directly
check the feature.

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/arm/exynos4210.c   |  4 ++--
  hw/arm/integratorcp.c |  5 ++---
  hw/arm/realview.c |  2 +-
  hw/arm/versatilepb.c  |  5 ++---
  hw/arm/xilinx_zynq.c  |  2 +-
  hw/cpu/a15mpcore.c| 11 +++
  hw/cpu/a9mpcore.c |  6 +++---
  7 files changed, 18 insertions(+), 17 deletions(-)




diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
index bfd8aa5644..cebfe142cf 100644
--- a/hw/cpu/a15mpcore.c
+++ b/hw/cpu/a15mpcore.c
@@ -53,7 +53,6 @@ static void a15mp_priv_realize(DeviceState *dev, Error **errp)
  DeviceState *gicdev;
  SysBusDevice *busdev;
  int i;
-bool has_el3;
  bool has_el2 = false;
  Object *cpuobj;
  
@@ -62,13 +61,17 @@ static void a15mp_priv_realize(DeviceState *dev, Error **errp)

  qdev_prop_set_uint32(gicdev, "num-irq", s->num_irq);
  
  if (!kvm_irqchip_in_kernel()) {

+CPUState *cpu;
+
  /* Make the GIC's TZ support match the CPUs. We assume that
   * either all the CPUs have TZ, or none do.
   */
-cpuobj = OBJECT(qemu_get_cpu(0));
-has_el3 = object_property_find(cpuobj, "has_el3") &&
+cpu = qemu_get_cpu(0);
+cpuobj = OBJECT(cpu);
+if (arm_feature(cpu_env(cpu), ARM_FEATURE_EL3)) {
  object_property_get_bool(cpuobj, "has_el3", &error_abort);


This requires the same change as a9mp_priv_realize(), so squashing:

-- >8 --
 if (arm_feature(cpu_env(cpu), ARM_FEATURE_EL3)) {
-object_property_get_bool(cpuobj, "has_el3", &error_abort);
-qdev_prop_set_bit(gicdev, "has-security-extensions", true);
+qdev_prop_set_bit(gicdev, "has-security-extensions",
+  object_property_get_bool(cpuobj, "has_el3",
+   &error_abort));
 }
---


-qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
+qdev_prop_set_bit(gicdev, "has-security-extensions", true);
+}
  /* Similarly for virtualization support */
  has_el2 = object_property_find(cpuobj, "has_el2") &&
  object_property_get_bool(cpuobj, "has_el2", &error_abort);




Re: [PATCH v2 09/14] hw/arm: Prefer arm_feature(EL3) over object_property_find(has_el3)

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 19:13, Philippe Mathieu-Daudé wrote:

On 9/1/24 19:09, Philippe Mathieu-Daudé wrote:

The "has_el3" property is added to ARMCPU when the
ARM_FEATURE_EL3 feature is available. Rather than
checking whether the QOM property is present, directly
check the feature.

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/arm/exynos4210.c   |  4 ++--
  hw/arm/integratorcp.c |  5 ++---
  hw/arm/realview.c |  2 +-
  hw/arm/versatilepb.c  |  5 ++---
  hw/arm/xilinx_zynq.c  |  2 +-
  hw/cpu/a15mpcore.c    | 11 +++
  hw/cpu/a9mpcore.c |  6 +++---
  7 files changed, 18 insertions(+), 17 deletions(-)




diff --git a/hw/cpu/a9mpcore.c b/hw/cpu/a9mpcore.c
index d03f57e579..9355e8443b 100644
--- a/hw/cpu/a9mpcore.c
+++ b/hw/cpu/a9mpcore.c
@@ -52,7 +52,6 @@ static void a9mp_priv_realize(DeviceState *dev, Error **errp)

  SysBusDevice *scubusdev, *gicbusdev, *gtimerbusdev, *mptimerbusdev,
   *wdtbusdev;
  int i;
-    bool has_el3;
  CPUState *cpu0;
  Object *cpuobj;
@@ -81,9 +80,10 @@ static void a9mp_priv_realize(DeviceState *dev, Error **errp)

  /* Make the GIC's TZ support match the CPUs. We assume that
   * either all the CPUs have TZ, or none do.
   */
-    has_el3 = object_property_find(cpuobj, "has_el3") &&
+    if (arm_feature(cpu_env(cpu0), ARM_FEATURE_EL3)) {
  object_property_get_bool(cpuobj, "has_el3", &error_abort);


Oops, something is wrong here...


This should be:

-- >8 --
@@ -84,3 +83,5 @@ static void a9mp_priv_realize(DeviceState *dev, Error **errp)
-has_el3 = object_property_find(cpuobj, "has_el3") &&
-object_property_get_bool(cpuobj, "has_el3", &error_abort);
-qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
+if (arm_feature(cpu_env(cpu0), ARM_FEATURE_EL3)) {
+qdev_prop_set_bit(gicdev, "has-security-extensions",
+  object_property_get_bool(cpuobj, "has_el3",
+   &error_abort));
+}
---


-    qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
+    qdev_prop_set_bit(gicdev, "has-security-extensions", true);
+    }
  if (!sysbus_realize(SYS_BUS_DEVICE(&s->gic), errp)) {
  return;







[PATCH] string-output-visitor: Fix (pseudo) struct handling

2024-01-09 Thread Kevin Wolf
Commit ff32bb53 tried to get minimal struct support into the string
output visitor by just making it return "<omitted>". Unfortunately, it
forgot that the caller will still make more visitor calls for the
content of the struct.

If the struct is contained in a list, such as IOThreadVirtQueueMapping,
in the better case its fields show up as separate list entries. In the
worse case, it contains another list, and the string output visitor
doesn't support nested lists and asserts that this doesn't happen. So as
soon as the optional "vqs" field in IOThreadVirtQueueMapping is
specified, we get a crash.

This can be reproduced with the following command line:

  echo "info qtree" | ./qemu-system-x86_64 \
-object iothread,id=t0 \
-blockdev null-co,node-name=disk \
-device '{"driver": "virtio-blk-pci", "drive": "disk",
  "iothread-vq-mapping": [{"iothread": "t0", "vqs": [0]}]}' \
-monitor stdio

Fix the problem by counting the nesting level of structs and ignoring
any visitor calls for values (apart from start/end_struct) while we're
not on the top level.
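
The nesting-counter approach can be illustrated with a small standalone model. This is only a sketch under stated assumptions: struct toy_visitor and the toy_* functions are hypothetical and unrelated to the real Visitor API, and the "(struct)" placeholder text is invented for the example. Scalar visits are ignored while the counter is non-zero, and a single placeholder is emitted when the outermost struct ends.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy visitor: collects a comma-separated output string. */
struct toy_visitor {
    unsigned int struct_nesting;
    char out[64];
};

static void toy_append(struct toy_visitor *v, const char *s)
{
    size_t used = strlen(v->out);

    snprintf(v->out + used, sizeof(v->out) - used, "%s%s",
             used ? "," : "", s);
}

static void toy_start_struct(struct toy_visitor *v)
{
    v->struct_nesting++;
}

static void toy_end_struct(struct toy_visitor *v)
{
    assert(v->struct_nesting > 0);
    if (--v->struct_nesting) {
        return;             /* still inside an outer struct */
    }
    toy_append(v, "(struct)");  /* one placeholder per top-level struct */
}

static void toy_visit_int(struct toy_visitor *v, long value)
{
    char buf[32];

    if (v->struct_nesting) {
        return;             /* struct members are not printed */
    }
    snprintf(buf, sizeof(buf), "%ld", value);
    toy_append(v, buf);
}
```

Visiting 1, then a struct that contains 2, an inner struct with 3, and 4, then 5 leaves "1,(struct),5" in the output buffer: the members never appear as separate entries, however deeply they are nested.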

Fixes: ff32bb53476539d352653f4ed56372dced73a388
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2069
Reported-by: Aihua Liang 
Signed-off-by: Kevin Wolf 
---
 qapi/string-output-visitor.c | 46 
 1 file changed, 46 insertions(+)

diff --git a/qapi/string-output-visitor.c b/qapi/string-output-visitor.c
index f0c1dea89e..5115536b15 100644
--- a/qapi/string-output-visitor.c
+++ b/qapi/string-output-visitor.c
@@ -65,6 +65,7 @@ struct StringOutputVisitor
 } range_start, range_end;
 GList *ranges;
 void *list; /* Only needed for sanity checking the caller */
+unsigned int struct_nesting;
 };
 
 static StringOutputVisitor *to_sov(Visitor *v)
@@ -144,6 +145,10 @@ static bool print_type_int64(Visitor *v, const char *name, int64_t *obj,
 StringOutputVisitor *sov = to_sov(v);
 GList *l;
 
+if (sov->struct_nesting) {
+return true;
+}
+
 switch (sov->list_mode) {
 case LM_NONE:
 string_output_append(sov, *obj);
@@ -231,6 +236,10 @@ static bool print_type_size(Visitor *v, const char *name, 
uint64_t *obj,
 uint64_t val;
 char *out, *psize;
 
+if (sov->struct_nesting) {
+return true;
+}
+
 if (!sov->human) {
 out = g_strdup_printf("%"PRIu64, *obj);
 string_output_set(sov, out);
@@ -250,6 +259,11 @@ static bool print_type_bool(Visitor *v, const char *name, 
bool *obj,
 Error **errp)
 {
 StringOutputVisitor *sov = to_sov(v);
+
+if (sov->struct_nesting) {
+return true;
+}
+
 string_output_set(sov, g_strdup(*obj ? "true" : "false"));
 return true;
 }
@@ -260,6 +274,10 @@ static bool print_type_str(Visitor *v, const char *name, 
char **obj,
 StringOutputVisitor *sov = to_sov(v);
 char *out;
 
+if (sov->struct_nesting) {
+return true;
+}
+
 if (sov->human) {
 out = *obj ? g_strdup_printf("\"%s\"", *obj) : g_strdup("<null>");
 } else {
@@ -273,6 +291,11 @@ static bool print_type_number(Visitor *v, const char 
*name, double *obj,
   Error **errp)
 {
 StringOutputVisitor *sov = to_sov(v);
+
+if (sov->struct_nesting) {
+return true;
+}
+
 string_output_set(sov, g_strdup_printf("%.17g", *obj));
 return true;
 }
@@ -283,6 +306,10 @@ static bool print_type_null(Visitor *v, const char *name, 
QNull **obj,
 StringOutputVisitor *sov = to_sov(v);
 char *out;
 
+if (sov->struct_nesting) {
+return true;
+}
+
 if (sov->human) {
 out = g_strdup("<null>");
 } else {
@@ -295,6 +322,9 @@ static bool print_type_null(Visitor *v, const char *name, 
QNull **obj,
 static bool start_struct(Visitor *v, const char *name, void **obj,
  size_t size, Error **errp)
 {
+StringOutputVisitor *sov = to_sov(v);
+
+sov->struct_nesting++;
 return true;
 }
 
@@ -302,6 +332,10 @@ static void end_struct(Visitor *v, void **obj)
 {
 StringOutputVisitor *sov = to_sov(v);
 
+if (--sov->struct_nesting) {
+return;
+}
+
 /* TODO actually print struct fields */
 string_output_set(sov, g_strdup("<omitted>"));
 }
@@ -312,6 +346,10 @@ start_list(Visitor *v, const char *name, GenericList 
**list, size_t size,
 {
 StringOutputVisitor *sov = to_sov(v);
 
+if (sov->struct_nesting) {
+return true;
+}
+
 /* we can't traverse a list in a list */
 assert(sov->list_mode == LM_NONE);
 /* We don't support visits without a list */
@@ -329,6 +367,10 @@ static GenericList *next_list(Visitor *v, GenericList 
*tail, size_t size)
 StringOutputVisitor *sov = to_sov(v);
 GenericList *ret = tail->next;
 
+if (sov->struct_nesting) {
+return ret;
+}
+
 if (ret && !ret->next) {
 sov->list_mode = LM_END;
 }
@@ -339,6 +381,10 @@ static void end_list(Visitor *v, void **obj)
 {
 

Re: [PATCH v3 3/4] ci: Add a migration compatibility test job

2024-01-09 Thread Cédric Le Goater

On 1/5/24 19:04, Fabiano Rosas wrote:

The migration tests have support for being passed two QEMU binaries to
test migration compatibility.

Add a CI job that builds the latest release of QEMU and another job
that uses that version plus an already present build of the current
version and runs the migration tests with the two, both as source and
destination. I.e.:

  old QEMU (n-1) -> current QEMU (development tree)
  current QEMU (development tree) -> old QEMU (n-1)

The purpose of this CI job is to ensure the code we're about to merge
will not cause a migration compatibility problem when migrating the
next release (which will contain that code) to/from the previous
release.

I'm leaving the jobs as manual for now because using an older QEMU in
tests could hit bugs that were already fixed in the current
development tree and we need to handle those case-by-case.

Note: for user forks, the version tags need to be pushed to GitLab,
otherwise the job won't be able to check out a different version.
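
For reference, the tag that before_script checks out is derived from the
VERSION file by the sed expression in the patch below. A rough C equivalent,
assuming VERSION holds a plain dotted version such as 8.2.50 (the helper name
sketch_prev_tag is made up for this illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn "MAJOR.MINOR.MICRO" into "vMAJOR.MINOR.0".
 * Returns 0 on success, -1 if there is no '.' or the buffer is too small.
 */
static int sketch_prev_tag(const char *version, char *out, size_t outlen)
{
    const char *last_dot = strrchr(version, '.');
    int n;

    if (!last_dot) {
        return -1;
    }
    /* Copy everything before the last dot, then append ".0". */
    n = snprintf(out, outlen, "v%.*s.0", (int)(last_dot - version), version);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a development-tree version of 8.2.50 this yields v8.2.0, which is the
tag a fork needs to have pushed.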

Signed-off-by: Fabiano Rosas 
---
  .gitlab-ci.d/buildtest.yml | 53 ++
  1 file changed, 53 insertions(+)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 91663946de..81163a3f6a 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -167,6 +167,59 @@ build-system-centos:
x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
  MAKE_CHECK_ARGS: check-build
  
+build-previous-qemu:

+  extends: .native_build_job_template
+  artifacts:
+when: on_success
+expire_in: 2 days
+paths:
+  - build-previous
+exclude:
+  - build-previous/**/*.p
+  - build-previous/**/*.a.p
+  - build-previous/**/*.fa.p
+  - build-previous/**/*.c.o
+  - build-previous/**/*.c.o.d
+  - build-previous/**/*.fa
+  needs:
+job: amd64-opensuse-leap-container
+  variables:
+QEMU_JOB_OPTIONAL: 1
+IMAGE: opensuse-leap
+TARGETS: x86_64-softmmu aarch64-softmmu
+  before_script:
+- export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
+- git checkout $QEMU_PREV_VERSION
+  after_script:
+- mv build build-previous
+
+.migration-compat-common:
+  extends: .common_test_job_template
+  needs:
+- job: build-previous-qemu
+- job: build-system-opensuse
+  allow_failure: true
+  variables:
+QEMU_JOB_OPTIONAL: 1
+IMAGE: opensuse-leap
+MAKE_CHECK_ARGS: check-build
+  script:
+- cd build
+- QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-${TARGET}
+  QTEST_QEMU_BINARY=./qemu-system-${TARGET} 
./tests/qtest/migration-test
+- QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-${TARGET}
+  QTEST_QEMU_BINARY=./qemu-system-${TARGET} 
./tests/qtest/migration-test
+
+migration-compat-aarch64:
+  extends: .migration-compat-common
+  variables:
+TARGET: aarch64
+
+migration-compat-x86_64:
+  extends: .migration-compat-common
+  variables:
+TARGET: x86_64



What about the other archs, s390x and ppc? Do you lack the resources,
or are there problems to address?

Thanks,

C.




Re: [PATCH v2 09/14] hw/arm: Prefer arm_feature(EL3) over object_property_find(has_el3)

2024-01-09 Thread Philippe Mathieu-Daudé

On 9/1/24 19:09, Philippe Mathieu-Daudé wrote:

The "has_el3" property is added to ARMCPU when the
ARM_FEATURE_EL3 feature is available. Rather than
checking whether the QOM property is present, directly
check the feature.

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/arm/exynos4210.c   |  4 ++--
  hw/arm/integratorcp.c |  5 ++---
  hw/arm/realview.c |  2 +-
  hw/arm/versatilepb.c  |  5 ++---
  hw/arm/xilinx_zynq.c  |  2 +-
  hw/cpu/a15mpcore.c| 11 +++
  hw/cpu/a9mpcore.c |  6 +++---
  7 files changed, 18 insertions(+), 17 deletions(-)




diff --git a/hw/cpu/a9mpcore.c b/hw/cpu/a9mpcore.c
index d03f57e579..9355e8443b 100644
--- a/hw/cpu/a9mpcore.c
+++ b/hw/cpu/a9mpcore.c
@@ -52,7 +52,6 @@ static void a9mp_priv_realize(DeviceState *dev, Error **errp)
  SysBusDevice *scubusdev, *gicbusdev, *gtimerbusdev, *mptimerbusdev,
   *wdtbusdev;
  int i;
-bool has_el3;
  CPUState *cpu0;
  Object *cpuobj;
  
@@ -81,9 +80,10 @@ static void a9mp_priv_realize(DeviceState *dev, Error **errp)

  /* Make the GIC's TZ support match the CPUs. We assume that
   * either all the CPUs have TZ, or none do.
   */
-has_el3 = object_property_find(cpuobj, "has_el3") &&
+if (arm_feature(cpu_env(cpu0), ARM_FEATURE_EL3)) {
  object_property_get_bool(cpuobj, "has_el3", &error_abort);


Oops, something is wrong here...


-qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
+qdev_prop_set_bit(gicdev, "has-security-extensions", true);
+}
  
  if (!sysbus_realize(SYS_BUS_DEVICE(&s->gic), errp)) {

  return;





[PATCH v2 10/14] hw/arm: Prefer arm_feature(EL2) over object_property_find(has_el2)

2024-01-09 Thread Philippe Mathieu-Daudé
The "has_el2" property is added to ARMCPU when the
ARM_FEATURE_EL2 feature is available. Rather than
checking whether the QOM property is present, directly
check the feature.
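
As a rough illustration of what "directly check the feature" amounts to,
here is a sketch that assumes the features are kept as a bitmask in the CPU
state. SketchEnv, the SKETCH_FEATURE_* numbers and both helpers are stand-ins
for this example, not the definitions from target/arm:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature numbers; the real enum lives in target/arm. */
enum SketchFeature {
    SKETCH_FEATURE_EL2 = 0,
    SKETCH_FEATURE_EL3 = 1,
};

/* Hypothetical stand-in for the CPU state. */
typedef struct SketchEnv {
    uint64_t features;   /* one bit per feature */
} SketchEnv;

/* Test a single feature bit; no property lookup by name is involved. */
static bool sketch_feature(const SketchEnv *env, int feature)
{
    return (env->features & (1ULL << feature)) != 0;
}

static void sketch_set_feature(SketchEnv *env, int feature)
{
    env->features |= 1ULL << feature;
}
```

The check is a plain bit test on the CPU state, so it does not depend on
whether a QOM property with a matching name happens to be registered.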

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/vexpress.c  | 3 ++-
 hw/arm/virt.c  | 2 +-
 hw/cpu/a15mpcore.c | 6 --
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index fd981f4c33..753a645c05 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -218,12 +218,13 @@ static void init_cpus(MachineState *ms, const char 
*cpu_type,
 /* Create the actual CPUs */
 for (n = 0; n < smp_cpus; n++) {
 Object *cpuobj = object_new(cpu_type);
+ARMCPU *cpu = ARM_CPU(cpuobj);
 
 if (!secure) {
 object_property_set_bool(cpuobj, "has_el3", false, NULL);
 }
 if (!virt) {
-if (object_property_find(cpuobj, "has_el2")) {
+if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
 object_property_set_bool(cpuobj, "has_el2", false, NULL);
 }
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2793121cb4..35eb01a3dc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2146,7 +2146,7 @@ static void machvirt_init(MachineState *machine)
 object_property_set_bool(cpuobj, "has_el3", false, NULL);
 }
 
-if (!vms->virt && object_property_find(cpuobj, "has_el2")) {
+if (!vms->virt &&  arm_feature(cpu_env(cs), ARM_FEATURE_EL2)) {
 object_property_set_bool(cpuobj, "has_el2", false, NULL);
 }
 
diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
index cebfe142cf..1fa079b3b8 100644
--- a/hw/cpu/a15mpcore.c
+++ b/hw/cpu/a15mpcore.c
@@ -73,9 +73,11 @@ static void a15mp_priv_realize(DeviceState *dev, Error 
**errp)
 qdev_prop_set_bit(gicdev, "has-security-extensions", true);
 }
 /* Similarly for virtualization support */
-has_el2 = object_property_find(cpuobj, "has_el2") &&
+has_el2 = arm_feature(cpu_env(cpu), ARM_FEATURE_EL2);
+if (has_el2) {
 object_property_get_bool(cpuobj, "has_el2", &error_abort);
-qdev_prop_set_bit(gicdev, "has-virtualization-extensions", has_el2);
+qdev_prop_set_bit(gicdev, "has-virtualization-extensions", true);
+}
 }
 
 if (!sysbus_realize(SYS_BUS_DEVICE(&s->gic), errp)) {
-- 
2.41.0




[PATCH v2 14/14] hw/arm: Prefer arm_feature(GENERIC_TMR) over 'kvm-no-adjvtime' property

2024-01-09 Thread Philippe Mathieu-Daudé
First, the "kvm-no-adjvtime" and "kvm-steal-time" properties are
only available when KVM is available, so guard this block within
a 'kvm_enabled()' check. Since the "kvm-steal-time" property
is always available under KVM, directly set it.

Then, the "kvm-no-adjvtime" property is added to ARMCPU when
the ARM_FEATURE_GENERIC_TIMER feature is available. Rather than
checking whether the QOM property is present, directly check
the feature.

Finally, since we are sure the properties are available, we can
use &error_abort instead of NULL error. Replace:

  object_property_set_bool(..., PROPERTY, ..., &error_abort);

by:

  qdev_prop_set_bit(..., PROPERTY, ...);

which is a one-to-one replacement.
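
To make the equivalence concrete, here is a self-contained sketch of the
pattern, taking the statement above at face value: a generic setter that
reports errors to its caller, and a convenience wrapper that treats any
failure as a programming error. SketchObj, sketch_set_bool() and
sketch_prop_set_bit() are invented for this example and only mimic the shape
of the QEMU calls:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object with a single boolean property. */
typedef struct SketchObj {
    bool steal_time;
} SketchObj;

/* Generic setter: reports an unknown property via *err (NULL = ignore). */
static bool sketch_set_bool(SketchObj *obj, const char *name, bool value,
                            const char **err)
{
    if (strcmp(name, "kvm-steal-time") != 0) {
        if (err) {
            *err = "property not found";
        }
        return false;
    }
    obj->steal_time = value;
    return true;
}

/* Convenience wrapper: the property must exist, so any failure aborts. */
static void sketch_prop_set_bit(SketchObj *obj, const char *name, bool value)
{
    const char *err = NULL;

    if (!sketch_set_bool(obj, name, value, &err)) {
        fprintf(stderr, "unexpected error: %s\n", err);
        abort();
    }
}
```

Passing NULL to the generic setter silently ignores a missing property,
which is why the old code needed the object_property_find() guard; the
wrapper form is only appropriate once the property is known to exist.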

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/virt.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2ce4a18d73..6ac8fb19d2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2150,14 +2150,13 @@ static void machvirt_init(MachineState *machine)
 object_property_set_bool(cpuobj, "has_el2", false, NULL);
 }
 
-if (vmc->kvm_no_adjvtime &&
-object_property_find(cpuobj, "kvm-no-adjvtime")) {
-object_property_set_bool(cpuobj, "kvm-no-adjvtime", true, NULL);
-}
-
-if (vmc->no_kvm_steal_time &&
-object_property_find(cpuobj, "kvm-steal-time")) {
-object_property_set_bool(cpuobj, "kvm-steal-time", false, NULL);
+if (kvm_enabled()) {
+if (arm_feature(cpu_env(cs), ARM_FEATURE_GENERIC_TIMER)) {
+qdev_prop_set_bit(DEVICE(cs), "kvm-no-adjvtime",
+  vmc->kvm_no_adjvtime);
+}
+qdev_prop_set_bit(DEVICE(cs), "kvm-steal-time",
+  !vmc->no_kvm_steal_time);
 }
 
 if (arm_feature(cpu_env(cs), ARM_FEATURE_PMU) && vmc->no_pmu) {
-- 
2.41.0




[PATCH v2 09/14] hw/arm: Prefer arm_feature(EL3) over object_property_find(has_el3)

2024-01-09 Thread Philippe Mathieu-Daudé
The "has_el3" property is added to ARMCPU when the
ARM_FEATURE_EL3 feature is available. Rather than
checking whether the QOM property is present, directly
check the feature.

Suggested-by: Markus Armbruster 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/exynos4210.c   |  4 ++--
 hw/arm/integratorcp.c |  5 ++---
 hw/arm/realview.c |  2 +-
 hw/arm/versatilepb.c  |  5 ++---
 hw/arm/xilinx_zynq.c  |  2 +-
 hw/cpu/a15mpcore.c| 11 +++
 hw/cpu/a9mpcore.c |  6 +++---
 7 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/hw/arm/exynos4210.c b/hw/arm/exynos4210.c
index de39fb0ece..5efaa538cd 100644
--- a/hw/arm/exynos4210.c
+++ b/hw/arm/exynos4210.c
@@ -554,14 +554,14 @@ static void exynos4210_realize(DeviceState *socdev, Error 
**errp)
 for (n = 0; n < EXYNOS4210_NCPUS; n++) {
 Object *cpuobj = object_new(ARM_CPU_TYPE_NAME("cortex-a9"));
 
+s->cpu[n] = ARM_CPU(cpuobj);
 /* By default A9 CPUs have EL3 enabled.  This board does not currently
  * support EL3 so the CPU EL3 property is disabled before realization.
  */
-if (object_property_find(cpuobj, "has_el3")) {
+if (arm_feature(&s->cpu[n]->env, ARM_FEATURE_EL3)) {
 object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
 }
 
-s->cpu[n] = ARM_CPU(cpuobj);
 object_property_set_int(cpuobj, "mp-affinity",
 exynos4210_calc_affinity(n), &error_abort);
 object_property_set_int(cpuobj, "reset-cbar",
diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
index 1830e1d785..7685527eb2 100644
--- a/hw/arm/integratorcp.c
+++ b/hw/arm/integratorcp.c
@@ -596,19 +596,18 @@ static void integratorcp_init(MachineState *machine)
 int i;
 
 cpuobj = object_new(machine->cpu_type);
+cpu = ARM_CPU(cpuobj);
 
 /* By default ARM1176 CPUs have EL3 enabled.  This board does not
  * currently support EL3 so the CPU EL3 property is disabled before
  * realization.
  */
-if (object_property_find(cpuobj, "has_el3")) {
+if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
 object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
 }
 
 qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
 
-cpu = ARM_CPU(cpuobj);
-
 /* ??? On a real system the first 1Mb is mapped as SSRAM or boot flash.  */
 /* ??? RAM should repeat to fill physical memory space.  */
 /* SDRAM at address zero*/
diff --git a/hw/arm/realview.c b/hw/arm/realview.c
index 132217b2ed..433fe72ced 100644
--- a/hw/arm/realview.c
+++ b/hw/arm/realview.c
@@ -123,7 +123,7 @@ static void realview_init(MachineState *machine,
  * does not currently support EL3 so the CPU EL3 property is disabled
  * before realization.
  */
-if (object_property_find(cpuobj, "has_el3")) {
+if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
 object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
 }
 
diff --git a/hw/arm/versatilepb.c b/hw/arm/versatilepb.c
index 4b2257787b..1969bb4608 100644
--- a/hw/arm/versatilepb.c
+++ b/hw/arm/versatilepb.c
@@ -208,19 +208,18 @@ static void versatile_init(MachineState *machine, int 
board_id)
 }
 
 cpuobj = object_new(machine->cpu_type);
+cpu = ARM_CPU(cpuobj);
 
 /* By default ARM1176 CPUs have EL3 enabled.  This board does not
  * currently support EL3 so the CPU EL3 property is disabled before
  * realization.
  */
-if (object_property_find(cpuobj, "has_el3")) {
+if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
 object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
 }
 
 qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
 
-cpu = ARM_CPU(cpuobj);
-
 /* ??? RAM should repeat to fill physical memory space.  */
 /* SDRAM at address zero.  */
 memory_region_add_subregion(sysmem, 0, machine->ram);
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index dbb9793aa1..33e57dceef 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -198,7 +198,7 @@ static void zynq_init(MachineState *machine)
  * currently support EL3 so the CPU EL3 property is disabled before
  * realization.
  */
-if (object_property_find(OBJECT(cpu), "has_el3")) {
+if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
 object_property_set_bool(OBJECT(cpu), "has_el3", false, &error_fatal);
 }
 
diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
index bfd8aa5644..cebfe142cf 100644
--- a/hw/cpu/a15mpcore.c
+++ b/hw/cpu/a15mpcore.c
@@ -53,7 +53,6 @@ static void a15mp_priv_realize(DeviceState *dev, Error **errp)
 DeviceState *gicdev;
 SysBusDevice *busdev;
 int i;
-bool has_el3;
 bool has_el2 = false;
 Object *cpuobj;
 
@@ -62,13 +61,17 @@ static void a15mp_priv_realize(DeviceState *dev, Error 
**errp)
 qdev_prop_set_uint32(gicdev, "num-irq", s->num_irq);
 
 if (!kvm_irqchip_in_kernel()) {
+CPUState *cpu;
+
 /* Make the GIC's TZ support match the CPUs. 

[PATCH v2 05/14] hw/arm/armv7m: Always set 'init-nsvtor' property for Cortex-M CPUs

2024-01-09 Thread Philippe Mathieu-Daudé
All CPUs implementing ARM_FEATURE_M have the 'init-nsvtor' property.
Since setting the property cannot fail, replace

   object_property_set_uint(..., "init-nsvtor", ..., &error_abort);

by:

   qdev_prop_set_uint32(..., "init-nsvtor", ...);

which is a one-to-one replacement.

Suggested-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/armv7m.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index b752049add..530729f42e 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -309,6 +309,8 @@ static void armv7m_realize(DeviceState *dev, Error **errp)
 object_property_set_link(OBJECT(s->cpu), "memory", OBJECT(&s->container),
  &error_abort);
 qdev_prop_set_bit(cpudev, "start-powered-off", s->start_powered_off);
+qdev_prop_set_uint32(cpudev, "init-nsvtor", s->init_nsvtor);
+
 if (object_property_find(OBJECT(s->cpu), "idau")) {
 object_property_set_link(OBJECT(s->cpu), "idau", s->idau,
  &error_abort);
@@ -319,12 +321,6 @@ static void armv7m_realize(DeviceState *dev, Error **errp)
 return;
 }
 }
-if (object_property_find(OBJECT(s->cpu), "init-nsvtor")) {
-if (!object_property_set_uint(OBJECT(s->cpu), "init-nsvtor",
-  s->init_nsvtor, errp)) {
-return;
-}
-}
 if (object_property_find(OBJECT(s->cpu), "vfp")) {
 if (!object_property_set_bool(OBJECT(s->cpu), "vfp", s->vfp, errp)) {
 return;
-- 
2.41.0



