Re: [PATCH v5 20/24] replay: simple auto-snapshot mode for record

2024-04-04 Thread Pavel Dovgalyuk

On 18.03.2024 18:46, Nicholas Piggin wrote:

record makes an initial snapshot when the machine is created, to enable
reverse-debugging. Often the issue being debugged appears near the end of
the trace, so it is important for performance to keep snapshots close to
the end.

This implements a periodic snapshot mode that keeps a rolling set of
recent snapshots. This could be done by the debugger or other program
that talks QMP, but for setting up simple scenarios and tests, this is
more convenient.

Signed-off-by: Nicholas Piggin 
---
  docs/system/replay.rst   |  5 
  include/sysemu/replay.h  | 11 
  replay/replay-snapshot.c | 57 
  replay/replay.c  | 27 +--
  system/vl.c  |  9 +++
  qemu-options.hx  |  9 +--
  6 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/docs/system/replay.rst b/docs/system/replay.rst
index ca7c17c63d..1ae8614475 100644
--- a/docs/system/replay.rst
+++ b/docs/system/replay.rst
@@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the 
command line for this:
  ``empty.qcow2`` drive is not connected to any virtual block device and is used
  for VM snapshots only.
  
+``rrsnapmode`` can be used to select just an initial snapshot or periodic
+snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots
+to maintain, and ``rrsnaptime`` the amount of run time in seconds between
+periodic snapshots.
+
  .. _network-label:
  
  Network devices

diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 8102fa54f0..92fa82842b 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -48,6 +48,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint;
  
  typedef struct ReplayNetState ReplayNetState;
  
+enum ReplaySnapshotMode {
+    REPLAY_SNAPSHOT_MODE_INITIAL,
+    REPLAY_SNAPSHOT_MODE_PERIODIC,
+};
+typedef enum ReplaySnapshotMode ReplaySnapshotMode;
+
+extern ReplaySnapshotMode replay_snapshot_mode;
+
+extern uint64_t replay_snapshot_periodic_delay;
+extern int replay_snapshot_periodic_nr_keep;


Please put the internal variables and enum into the replay-internal.h


+
  /* Name of the initial VM snapshot */
  extern char *replay_snapshot;
  
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index ccb4d89dda..762555feaa 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -70,6 +70,53 @@ void replay_vmstate_register(void)
  vmstate_register(NULL, 0, &vmstate_replay, &replay_state);
  }
  
+static QEMUTimer *replay_snapshot_timer;
+static int replay_snapshot_count;
+
+static void replay_snapshot_timer_cb(void *opaque)
+{
+    Error *err = NULL;
+    char *name;
+
+    if (!replay_can_snapshot()) {
+        /* Try again soon */
+        timer_mod(replay_snapshot_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+                  replay_snapshot_periodic_delay / 10);
+        return;
+    }
+
+    name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count);
+    if (!save_snapshot(name,
+                       true, NULL, false, NULL, &err)) {
+        error_report_err(err);
+        error_report("Could not create periodic snapshot "
+                     "for icount record, disabling");
+        g_free(name);
+        return;
+    }
+    g_free(name);
+    replay_snapshot_count++;
+
+    if (replay_snapshot_periodic_nr_keep >= 1 &&
+        replay_snapshot_count > replay_snapshot_periodic_nr_keep) {
+        int del_nr;
+
+        del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1;
+        name = g_strdup_printf("%s-%d", replay_snapshot, del_nr);
+        if (!delete_snapshot(name, false, NULL, &err)) {
+            error_report_err(err);
+            error_report("Could not delete periodic snapshot "
+                         "for icount record");
+        }
+        g_free(name);
+    }
+
+    timer_mod(replay_snapshot_timer,
+              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+              replay_snapshot_periodic_delay);
+}
+
  void replay_vmstate_init(void)
  {
  Error *err = NULL;
@@ -82,6 +129,16 @@ void replay_vmstate_init(void)
  error_report("Could not create snapshot for icount record");
  exit(1);
  }
+
+    if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) {
+        replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
+                                             replay_snapshot_timer_cb,
+                                             NULL);
+        timer_mod(replay_snapshot_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
+                  replay_snapshot_periodic_delay);
+    }
+
  } else if (replay_mode == REPLAY_MODE_PLAY) {
  if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) {
  error_report_err(err);
diff --git a/replay/re

Re: [PATCH v5 10/24] virtio-net: Use replay_schedule_bh_event for bhs that affect machine state

2024-04-04 Thread Pavel Dovgalyuk

Reviewed-by: Pavel Dovgalyuk 

On 18.03.2024 18:46, Nicholas Piggin wrote:

The regular qemu_bh_schedule() calls result in non-deterministic
execution of the bh in record-replay mode, which causes replay failure.

Signed-off-by: Nicholas Piggin 
---
  hw/net/virtio-net.c | 11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 9959f1932b..6ac737f2cf 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -40,6 +40,7 @@
  #include "migration/misc.h"
  #include "standard-headers/linux/ethtool.h"
  #include "sysemu/sysemu.h"
+#include "sysemu/replay.h"
  #include "trace.h"
  #include "monitor/qdev.h"
  #include "monitor/monitor.h"
@@ -416,7 +417,7 @@ static void virtio_net_set_status(struct VirtIODevice 
*vdev, uint8_t status)
  timer_mod(q->tx_timer,
 qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 
n->tx_timeout);
  } else {
-qemu_bh_schedule(q->tx_bh);
+replay_bh_schedule_event(q->tx_bh);
  }
  } else {
  if (q->tx_timer) {
@@ -2724,7 +2725,7 @@ static void virtio_net_tx_complete(NetClientState *nc, 
ssize_t len)
   */
  virtio_queue_set_notification(q->tx_vq, 0);
  if (q->tx_bh) {
-qemu_bh_schedule(q->tx_bh);
+replay_bh_schedule_event(q->tx_bh);
  } else {
  timer_mod(q->tx_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + n->tx_timeout);
@@ -2879,7 +2880,7 @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, 
VirtQueue *vq)
  return;
  }
  virtio_queue_set_notification(vq, 0);
-qemu_bh_schedule(q->tx_bh);
+replay_bh_schedule_event(q->tx_bh);
  }
  
  static void virtio_net_tx_timer(void *opaque)

@@ -2962,7 +2963,7 @@ static void virtio_net_tx_bh(void *opaque)
  /* If we flush a full burst of packets, assume there are
   * more coming and immediately reschedule */
  if (ret >= n->tx_burst) {
-qemu_bh_schedule(q->tx_bh);
+replay_bh_schedule_event(q->tx_bh);
  q->tx_waiting = 1;
  return;
  }
@@ -2976,7 +2977,7 @@ static void virtio_net_tx_bh(void *opaque)
  return;
  } else if (ret > 0) {
  virtio_queue_set_notification(q->tx_vq, 0);
-qemu_bh_schedule(q->tx_bh);
+replay_bh_schedule_event(q->tx_bh);
  q->tx_waiting = 1;
  }
  }





Re: [RFC PATCH-for-9.1] qapi: Do not generate commands/events/introspect code for user emulation

2024-04-04 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> User emulation requires the QAPI types. Due to the command
> line processing, some visitor code is also used. The rest
> is irrelevant (no QMP socket).
>
> Add an option to the qapi-gen script to allow generating
> the minimum when only user emulation is being built.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC: Quick PoC for Markus. It is useful for user-only builds.
> ---
>  qapi/meson.build |  6 +-
>  scripts/qapi/main.py | 16 +++-
>  2 files changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/qapi/meson.build b/qapi/meson.build
> index 375d564277..5e02621145 100644
> --- a/qapi/meson.build
> +++ b/qapi/meson.build
> @@ -115,10 +115,14 @@ foreach module : qapi_all_modules
>endif
>  endforeach
>  
> +qapi_gen_cmd = [ qapi_gen, '-o', 'qapi', '-b', '@INPUT0@' ]
> +if not (have_system or have_tools)
> +  qapi_gen_cmd += [ '--types-only' ]
> +endif
>  qapi_files = custom_target('shared QAPI source files',
>output: qapi_util_outputs + qapi_specific_outputs + qapi_nonmodule_outputs,
>input: [ files('qapi-schema.json') ],
> -  command: [ qapi_gen, '-o', 'qapi', '-b', '@INPUT0@' ],
> +  command: qapi_gen_cmd,
>depend_files: [ qapi_inputs, qapi_gen_depends ])
>  
>  # Now go through all the outputs and add them to the right sourceset.
> diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
> index 316736b6a2..925af5841b 100644
> --- a/scripts/qapi/main.py
> +++ b/scripts/qapi/main.py
> @@ -33,7 +33,8 @@ def generate(schema_file: str,
>   prefix: str,
>   unmask: bool = False,
>   builtins: bool = False,
> - gen_tracing: bool = False) -> None:
> + gen_tracing: bool = False,
> + gen_types_only: bool = False) -> None:
>  """
>  Generate C code for the given schema into the target directory.
>  
> @@ -50,9 +51,10 @@ def generate(schema_file: str,
>  schema = QAPISchema(schema_file)
>  gen_types(schema, output_dir, prefix, builtins)
>  gen_visit(schema, output_dir, prefix, builtins)
> -gen_commands(schema, output_dir, prefix, gen_tracing)
> -gen_events(schema, output_dir, prefix)
> -gen_introspect(schema, output_dir, prefix, unmask)
> +if not gen_types_only:
> +gen_commands(schema, output_dir, prefix, gen_tracing)
> +gen_events(schema, output_dir, prefix)
> +gen_introspect(schema, output_dir, prefix, unmask)

This is the behavior change, everything else is plumbing.  You suppress
generation of source code for commands, events, and introspection, i.e.

qapi-commands*.[ch]
qapi-init-commands.[ch]
qapi-events*[ch]
qapi-introspect.[ch]

and the associated .trace-events.

But none of these .c get compiled for a user-only build.

So, all we save is a bit of build time and disk space: less than 0.1s on
my machine, ~1.6MiB in ~220 files.  My linux-user-only build tree clocks
in at 317MiB in ~4900 files, a full build takes me around 30s (real
time, -j 14 with ccache), so we're talking about 0.5% in disk space and
0.3% in build time.

Moreover, the patch needs work:

FAILED: 
tests/unit/test-qobject-input-visitor.p/test-qobject-input-visitor.c.o 
cc [...] -c ../tests/unit/test-qobject-input-visitor.c
../tests/unit/test-qobject-input-visitor.c:27:10: fatal error: 
qapi/qapi-introspect.h: No such file or directory
   27 | #include "qapi/qapi-introspect.h"
  |  ^~~~
FAILED: libqemuutil.a.p/stubs_monitor-core.c.o 
cc [...] -c ../stubs/monitor-core.c
../stubs/monitor-core.c:3:10: fatal error: qapi/qapi-emit-events.h: No such 
file or directory
3 | #include "qapi/qapi-emit-events.h"
  |  ^

I don't think it's worth the bother.

>  
>  
>  def main() -> int:
> @@ -75,6 +77,9 @@ def main() -> int:
>  parser.add_argument('-u', '--unmask-non-abi-names', action='store_true',
>  dest='unmask',
>  help="expose non-ABI names in introspection")
> +parser.add_argument('-t', '--types-only', action='store_true',
> +dest='gen_types_only',
> +help="Only generate QAPI types")
>  
>  # Option --suppress-tracing exists so we can avoid solving build system
>  # problems.  TODO Drop it when we no longer need it.
> @@ -96,7 +101,8 @@ def main() -> int:
>   prefix=args.prefix,
>   unmask=args.unmask,
>   builtins=args.builtins,
> - gen_tracing=not args.suppress_tracing)
> + gen_tracing=not args.suppress_tracing,
> + gen_types_only=args.gen_types_only)
>  except QAPIError as err:
>  print(err, file=sys.stderr)
>  return 1




RE: [PATCH v2] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Wang, Wei W
On Friday, April 5, 2024 11:41 AM, Wang, Wei W wrote:
> 
> Before loading the guest states, ensure that the preempt channel has been
> ready to use, as some of the states (e.g. via virtio_load) might trigger page
> faults that will be handled through the preempt channel. So yield to the main
> thread in the case that the channel create event hasn't been dispatched.
> 
> Originally-by: Lei Wang 
> Link: https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-
> f7e041ced...@intel.com/T/
> Suggested-by: Peter Xu 
> Signed-off-by: Lei Wang 
> Signed-off-by: Wei Wang 
> ---
>  migration/savevm.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c index
> 388d7af7cd..63f9991a8a 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2342,6 +2342,23 @@ static int
> loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> 
>  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> 
> +    /*
> +     * Before loading the guest states, ensure that the preempt channel has
> +     * been ready to use, as some of the states (e.g. via virtio_load) might
> +     * trigger page faults that will be handled through the preempt channel.
> +     * So yield to the main thread in the case that the channel create event
> +     * hasn't been dispatched.
> +     */
> +    do {
> +        if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> +            mis->postcopy_qemufile_dst) {
> +            break;
> +        }
> +
> +        aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
> +        qemu_coroutine_yield();
> +    } while (1);
> +
>  ret = qemu_loadvm_state_main(packf, mis);
>  trace_loadvm_handle_cmd_packaged_main(ret);
>  qemu_fclose(packf);
> --
> 2.27.0

The main change from v1 is dropping the wait on the sem.
The fix is still applied in loadvm_handle_cmd_packaged(), as the sem issue
(a possible double wait) is no longer present.



[PATCH v2] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Wei Wang
Before loading the guest states, ensure that the preempt channel has been
ready to use, as some of the states (e.g. via virtio_load) might trigger
page faults that will be handled through the preempt channel. So yield to
the main thread in the case that the channel create event hasn't been
dispatched.

Originally-by: Lei Wang 
Link: 
https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced...@intel.com/T/
Suggested-by: Peter Xu 
Signed-off-by: Lei Wang 
Signed-off-by: Wei Wang 
---
 migration/savevm.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/migration/savevm.c b/migration/savevm.c
index 388d7af7cd..63f9991a8a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2342,6 +2342,23 @@ static int 
loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
 
 QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
 
+    /*
+     * Before loading the guest states, ensure that the preempt channel has
+     * been ready to use, as some of the states (e.g. via virtio_load) might
+     * trigger page faults that will be handled through the preempt channel.
+     * So yield to the main thread in the case that the channel create event
+     * hasn't been dispatched.
+     */
+    do {
+        if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
+            mis->postcopy_qemufile_dst) {
+            break;
+        }
+
+        aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
+        qemu_coroutine_yield();
+    } while (1);
+
 ret = qemu_loadvm_state_main(packf, mis);
 trace_loadvm_handle_cmd_packaged_main(ret);
 qemu_fclose(packf);
-- 
2.27.0




RE: [PATCH v1] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Wang, Wei W
On Friday, April 5, 2024 10:33 AM, Peter Xu wrote:
> On Fri, Apr 05, 2024 at 01:38:31AM +, Wang, Wei W wrote:
> > On Friday, April 5, 2024 4:57 AM, Peter Xu wrote:
> > > On Fri, Apr 05, 2024 at 12:48:15AM +0800, Wang, Lei wrote:
> > > > On 4/5/2024 0:25, Wang, Wei W wrote:> On Thursday, April 4, 2024
> > > > 10:12 PM, Peter Xu wrote:
> > > > >> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> > > > >>> Before loading the guest states, ensure that the preempt
> > > > >>> channel has been ready to use, as some of the states (e.g. via
> > > > >>> virtio_load) might trigger page faults that will be handled
> > > > >>> through the
> > > preempt channel.
> > > > >>> So yield to the main thread in the case that the channel
> > > > >>> create event has been dispatched.
> > > > >>>
> > > > >>> Originally-by: Lei Wang 
> > > > >>> Link:
> > > > >>> https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced2
> > > > >>> 49@i
> > > > >>> ntel
> > > > >>> .com/T/
> > > > >>> Suggested-by: Peter Xu 
> > > > >>> Signed-off-by: Lei Wang 
> > > > >>> Signed-off-by: Wei Wang 
> > > > >>> ---
> > > > >>>  migration/savevm.c | 17 +
> > > > >>>  1 file changed, 17 insertions(+)
> > > > >>>
> > > > >>> diff --git a/migration/savevm.c b/migration/savevm.c index
> > > > >>> 388d7af7cd..fbc9f2bdd4 100644
> > > > >>> --- a/migration/savevm.c
> > > > >>> +++ b/migration/savevm.c
> > > > >>> @@ -2342,6 +2342,23 @@ static int
> > > > >>> loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> > > > >>>
> > > > >>>  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> > > > >>>
> > > > >>> +/*
> > > > >>> + * Before loading the guest states, ensure that the
> > > > >>> + preempt channel
> > > has
> > > > >>> + * been ready to use, as some of the states (e.g. via 
> > > > >>> virtio_load)
> might
> > > > >>> + * trigger page faults that will be handled through the
> > > > >>> + preempt
> > > channel.
> > > > >>> + * So yield to the main thread in the case that the
> > > > >>> + channel create
> > > event
> > > > >>> + * has been dispatched.
> > > > >>> + */
> > > > >>> +do {
> > > > >>> +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> > > > >>> +mis->postcopy_qemufile_dst) {
> > > > >>> +break;
> > > > >>> +}
> > > > >>> +
> > > > >>> +aio_co_schedule(qemu_get_current_aio_context(),
> > > > >> qemu_coroutine_self());
> > > > >>> +qemu_coroutine_yield();
> > > > >>> +} while
> > > > >>> + (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done,
> > > > >>> + 1));
> > > > >>
> > > > >> I think we need s/!// here, so the same mistake I made?  I
> > > > >> think we need to rework the retval of qemu_sem_timedwait() at some
> point later..
> > > > >
> > > > > No. qemu_sem_timedwait returns false when timeout, which means
> > > > > sem
> > > isn’t posted yet.
> > > > > So it needs to go back to the loop. (the patch was tested)
> > > >
> > > > When timeout, qemu_sem_timedwait() will return -1. I think the
> > > > patch test passed may because you will always have at least one
> > > > yield (the first yield in the do ...while ...) when
> loadvm_handle_cmd_packaged()?
> > >
> > > My guess is that here the kick will work and qemu_sem_timedwait()
> > > later will ETIMEOUT -> qemu_sem_timedwait() returns -1, then the loop
> just broke.
> > > That aio schedule should make sure anyway that the file is ready;
> > > the preempt thread must run before this to not hang that thread.
> >
> > Yes, misread of the return value. It still worked because the loop
> > broke at the "if (mis->postcopy_qemufile_dst)" check.
> >
> > Even below will work:
> > do {
> > if (mis->postcopy_qemufile_dst) {
> > break;
> >  }
> > ...
> > } while (1);
> >
> > I still don’t see the value of using postcopy_qemufile_dst_done sem in
> > the code though. It simply blocks the main thread from creating the
> > preempt channel for 1ms (regardless of whether the sem has been posted
> > or not; we add it for the case it is not posted and need to go back to
> > the loop).
> 
> I think it used to only wait() in the preempt thread, so that is needed.
> It's also needed when postcopy is interrupted and need a recover, see
> loadvm_postcopy_handle_resume(), in that case it's the postcopy ram load
> thread that waits for it rather than the main thread or preempt thread.
> 
> Indeed if we move channel creation out of the preempt thread then it seems
> we don't need the sem in this path.  However the other path will still need 
> it,
> then when the new channel created (postcopy_preempt_new_channel) we'll
> need to identify a "switch to postcopy" case or "postcopy recovery" case, only
> post the sem when the former.  I think it might complicate the code, I'll 
> think
> again tomorrow after a sleep so my brain will work better, but I doubt this is
> what we want to do at rc3.

Yes, it's a bit rushed (no need

Re: [PATCH v1] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Peter Xu
On Fri, Apr 05, 2024 at 01:38:31AM +, Wang, Wei W wrote:
> On Friday, April 5, 2024 4:57 AM, Peter Xu wrote:
> > On Fri, Apr 05, 2024 at 12:48:15AM +0800, Wang, Lei wrote:
> > > On 4/5/2024 0:25, Wang, Wei W wrote:> On Thursday, April 4, 2024 10:12
> > > PM, Peter Xu wrote:
> > > >> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> > > >>> Before loading the guest states, ensure that the preempt channel
> > > >>> has been ready to use, as some of the states (e.g. via
> > > >>> virtio_load) might trigger page faults that will be handled through 
> > > >>> the
> > preempt channel.
> > > >>> So yield to the main thread in the case that the channel create
> > > >>> event has been dispatched.
> > > >>>
> > > >>> Originally-by: Lei Wang 
> > > >>> Link:
> > > >>> https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@i
> > > >>> ntel
> > > >>> .com/T/
> > > >>> Suggested-by: Peter Xu 
> > > >>> Signed-off-by: Lei Wang 
> > > >>> Signed-off-by: Wei Wang 
> > > >>> ---
> > > >>>  migration/savevm.c | 17 +
> > > >>>  1 file changed, 17 insertions(+)
> > > >>>
> > > >>> diff --git a/migration/savevm.c b/migration/savevm.c index
> > > >>> 388d7af7cd..fbc9f2bdd4 100644
> > > >>> --- a/migration/savevm.c
> > > >>> +++ b/migration/savevm.c
> > > >>> @@ -2342,6 +2342,23 @@ static int
> > > >>> loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> > > >>>
> > > >>>  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> > > >>>
> > > >>> +/*
> > > >>> + * Before loading the guest states, ensure that the preempt 
> > > >>> channel
> > has
> > > >>> + * been ready to use, as some of the states (e.g. via 
> > > >>> virtio_load) might
> > > >>> + * trigger page faults that will be handled through the preempt
> > channel.
> > > >>> + * So yield to the main thread in the case that the channel 
> > > >>> create
> > event
> > > >>> + * has been dispatched.
> > > >>> + */
> > > >>> +do {
> > > >>> +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> > > >>> +mis->postcopy_qemufile_dst) {
> > > >>> +break;
> > > >>> +}
> > > >>> +
> > > >>> +aio_co_schedule(qemu_get_current_aio_context(),
> > > >> qemu_coroutine_self());
> > > >>> +qemu_coroutine_yield();
> > > >>> +} while
> > > >>> + (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done,
> > > >>> + 1));
> > > >>
> > > >> I think we need s/!// here, so the same mistake I made?  I think we
> > > >> need to rework the retval of qemu_sem_timedwait() at some point later..
> > > >
> > > > No. qemu_sem_timedwait returns false when timeout, which means sem
> > isn’t posted yet.
> > > > So it needs to go back to the loop. (the patch was tested)
> > >
> > > When timeout, qemu_sem_timedwait() will return -1. I think the patch
> > > test passed may because you will always have at least one yield (the
> > > first yield in the do ...while ...) when loadvm_handle_cmd_packaged()?
> > 
> > My guess is that here the kick will work and qemu_sem_timedwait() later will
> > ETIMEOUT -> qemu_sem_timedwait() returns -1, then the loop just broke.
> > That aio schedule should make sure anyway that the file is ready; the 
> > preempt
> > thread must run before this to not hang that thread.
> 
> Yes, misread of the return value. It still worked because the loop broke at
> the "if (mis->postcopy_qemufile_dst)" check.
> 
> Even below will work:
> do {
> if (mis->postcopy_qemufile_dst) {
> break;
>  }
> ...
> } while (1);
> 
> I still don’t see the value of using postcopy_qemufile_dst_done sem in
> the code though. It simply blocks the main thread from creating the
> preempt channel for 1ms (regardless of whether the sem has been posted
> or not; we add it for the case it is not posted and need to go back to
> the loop).

I think it used to only wait() in the preempt thread, so that is needed.
It's also needed when postcopy is interrupted and need a recover, see
loadvm_postcopy_handle_resume(), in that case it's the postcopy ram load
thread that waits for it rather than the main thread or preempt thread.

Indeed if we move channel creation out of the preempt thread then it seems
we don't need the sem in this path.  However the other path will still need
it; then when the new channel is created (postcopy_preempt_new_channel) we'll
need to identify a "switch to postcopy" case or "postcopy recovery" case,
only post the sem when the former.  I think it might complicate the code,
I'll think again tomorrow after a sleep so my brain will work better, but I
doubt this is what we want to do at rc3.

If you feel comfortable, please feel free to send a version that you think
is the most correct so far (if you prefer no timedwait it's fine), and make
sure the test works the best on your side.  Then I'll smoke it a bit during
weekends. Please always keep that in mind if that will be for rc3 it should
be non-intrusive change, 

RE: [PATCH v1] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Wang, Wei W
On Friday, April 5, 2024 4:57 AM, Peter Xu wrote:
> On Fri, Apr 05, 2024 at 12:48:15AM +0800, Wang, Lei wrote:
> > On 4/5/2024 0:25, Wang, Wei W wrote:> On Thursday, April 4, 2024 10:12
> > PM, Peter Xu wrote:
> > >> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> > >>> Before loading the guest states, ensure that the preempt channel
> > >>> has been ready to use, as some of the states (e.g. via
> > >>> virtio_load) might trigger page faults that will be handled through the
> preempt channel.
> > >>> So yield to the main thread in the case that the channel create
> > >>> event has been dispatched.
> > >>>
> > >>> Originally-by: Lei Wang 
> > >>> Link:
> > >>> https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@i
> > >>> ntel
> > >>> .com/T/
> > >>> Suggested-by: Peter Xu 
> > >>> Signed-off-by: Lei Wang 
> > >>> Signed-off-by: Wei Wang 
> > >>> ---
> > >>>  migration/savevm.c | 17 +
> > >>>  1 file changed, 17 insertions(+)
> > >>>
> > >>> diff --git a/migration/savevm.c b/migration/savevm.c index
> > >>> 388d7af7cd..fbc9f2bdd4 100644
> > >>> --- a/migration/savevm.c
> > >>> +++ b/migration/savevm.c
> > >>> @@ -2342,6 +2342,23 @@ static int
> > >>> loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> > >>>
> > >>>  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> > >>>
> > >>> +/*
> > >>> + * Before loading the guest states, ensure that the preempt channel
> has
> > >>> + * been ready to use, as some of the states (e.g. via virtio_load) 
> > >>> might
> > >>> + * trigger page faults that will be handled through the preempt
> channel.
> > >>> + * So yield to the main thread in the case that the channel create
> event
> > >>> + * has been dispatched.
> > >>> + */
> > >>> +do {
> > >>> +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> > >>> +mis->postcopy_qemufile_dst) {
> > >>> +break;
> > >>> +}
> > >>> +
> > >>> +aio_co_schedule(qemu_get_current_aio_context(),
> > >> qemu_coroutine_self());
> > >>> +qemu_coroutine_yield();
> > >>> +} while
> > >>> + (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done,
> > >>> + 1));
> > >>
> > >> I think we need s/!// here, so the same mistake I made?  I think we
> > >> need to rework the retval of qemu_sem_timedwait() at some point later..
> > >
> > > No. qemu_sem_timedwait returns false when timeout, which means sem
> isn’t posted yet.
> > > So it needs to go back to the loop. (the patch was tested)
> >
> > When timeout, qemu_sem_timedwait() will return -1. I think the patch
> > test passed may because you will always have at least one yield (the
> > first yield in the do ...while ...) when loadvm_handle_cmd_packaged()?
> 
> My guess is that here the kick will work and qemu_sem_timedwait() later will
> ETIMEOUT -> qemu_sem_timedwait() returns -1, then the loop just broke.
> That aio schedule should make sure anyway that the file is ready; the preempt
> thread must run before this to not hang that thread.

Yes, misread of the return value. It still worked because the loop broke at
the "if (mis->postcopy_qemufile_dst)" check.

Even below will work:
do {
if (mis->postcopy_qemufile_dst) {
break;
 }
...
} while (1);

I still don’t see the value of using postcopy_qemufile_dst_done sem in
the code though. It simply blocks the main thread from creating the
preempt channel for 1ms (regardless of whether the sem has been posted
or not; we add it for the case it is not posted and need to go back to
the loop).


Re: [PATCH v2] riscv: thead: Add th.sxstatus CSR emulation

2024-04-04 Thread LIU Zhiwei



On 2024/3/29 20:04, Christoph Müllner wrote:

The th.sxstatus CSR can be used to identify available custom extension
on T-Head CPUs. The CSR is documented here:
   https://github.com/T-head-Semi/thead-extension-spec/pull/46

An important property of this patch is, that the th.sxstatus MAEE field
is not set (indicating that XTheadMaee is not available).
XTheadMaee is a memory attribute extension (similar to Svpbmt) which is
implemented in many T-Head CPUs (C906, C910, etc.) and utilizes bits
in PTEs that are marked as reserved. QEMU maintainers prefer to not
implement XTheadMaee, so we need to give kernels a mechanism to identify
if XTheadMaee is available in a system or not. And this patch introduces
this mechanism in QEMU in a way that's compatible with real HW
(i.e., probing the th.sxstatus.MAEE bit).

Further context can be found on the list:
https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00775.html

Signed-off-by: Christoph Müllner 
---
  target/riscv/cpu.c   |  1 +
  target/riscv/cpu.h   |  3 ++
  target/riscv/meson.build |  1 +
  target/riscv/th_csr.c| 78 
  4 files changed, 83 insertions(+)
  create mode 100644 target/riscv/th_csr.c

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 36e3e5fdaf..b82ba95ae6 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -545,6 +545,7 @@ static void rv64_thead_c906_cpu_init(Object *obj)
  cpu->cfg.mvendorid = THEAD_VENDOR_ID;
  #ifndef CONFIG_USER_ONLY
  set_satp_mode_max_supported(cpu, VM_1_10_SV39);
+th_register_custom_csrs(cpu);
  #endif
  
  /* inherited from parent obj via riscv_cpu_init() */

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 3b1a02b944..c9f8f06751 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -824,4 +824,7 @@ void riscv_cpu_register_gdb_regs_for_features(CPUState *cs);
  uint8_t satp_mode_max_from_map(uint32_t map);
  const char *satp_mode_str(uint8_t satp_mode, bool is_32_bit);
  
+/* Implemented in th_csr.c */

+void th_register_custom_csrs(RISCVCPU *cpu);
+
  #endif /* RISCV_CPU_H */
diff --git a/target/riscv/meson.build b/target/riscv/meson.build
index a5e0734e7f..a4bd61e52a 100644
--- a/target/riscv/meson.build
+++ b/target/riscv/meson.build
@@ -33,6 +33,7 @@ riscv_system_ss.add(files(
'monitor.c',
'machine.c',
'pmu.c',
+  'th_csr.c',
'time_helper.c',
'riscv-qmp-cmds.c',
  ))
diff --git a/target/riscv/th_csr.c b/target/riscv/th_csr.c
new file mode 100644
index 00..66d260cabd
--- /dev/null
+++ b/target/riscv/th_csr.c
@@ -0,0 +1,78 @@
+/*
+ * T-Head-specific CSRs.
+ *
+ * Copyright (c) 2024 VRULL GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "cpu_vendorid.h"
+
+#define CSR_TH_SXSTATUS 0x5c0
+
+/* TH_SXSTATUS bits */
+#define TH_SXSTATUS_UCME        BIT(16)
+#define TH_SXSTATUS_MAEE        BIT(21)
+#define TH_SXSTATUS_THEADISAEE  BIT(22)
+
+typedef struct {
+    int csrno;
+    int (*insertion_test)(RISCVCPU *cpu);
+    riscv_csr_operations csr_ops;
+} riscv_csr;
+
+static RISCVException s_mode_csr(CPURISCVState *env, int csrno)
+{
+    if (env->debugger)
+        return RISCV_EXCP_NONE;
+
+    if (env->priv >= PRV_S)
+        return RISCV_EXCP_NONE;

This will be checked by riscv_csrrw_check.

+
+return RISCV_EXCP_ILLEGAL_INST;
+}

Instead, reuse the smode() check in csr.c, which tests riscv_has_ext(env, RVS).

+
+static int test_thead_mvendorid(RISCVCPU *cpu)
+{
+if (cpu->cfg.mvendorid != THEAD_VENDOR_ID)
+return -1;
+return 0;
+}
+
+static RISCVException read_th_sxstatus(CPURISCVState *env, int csrno,
+   target_ulong *val)
+{
+/* We don't set MAEE here, because QEMU does not implement MAEE. */
+*val = TH_SXSTATUS_UCME | TH_SXSTATUS_THEADISAEE;
+return RISCV_EXCP_NONE;
+}
+
+static riscv_csr th_csr_list[] = {
+{
+.csrno = CSR_TH_SXSTATUS,
+.insertion_test = test_thead_mvendorid,
+.csr_ops = { "th.sxstatus", s_mode_csr, read_th_sxstatus }
+}
+};
+
+void th_register_custom_csrs(RISCVCPU *cpu)
+{
+for (size_t i = 0; i < ARRAY_SIZE(th_csr_list); i++) {
+int csrno = th_csr_list[i].csrno;
+riscv_csr_operations *csr_ops = &th_csr_list[i].csr_ops;
+if (!th_csr_list[i].insertion_test(cpu))
+riscv_set_csr_ops(csrno, csr_ops);
+}
+}



[PATCH v11 2/2] memory tier: create CPUless memory tiers after obtaining HMAT info

2024-04-04 Thread Ho-Ren (Jack) Chuang
The current implementation treats emulated memory devices, such as
CXL 1.1 Type 3 memory, as normal DRAM when they are exposed as normal memory
(E820_TYPE_RAM). However, these emulated devices have different
characteristics from traditional DRAM, making it important to
distinguish them. Thus, we modify the tiered-memory initialization process
to introduce a delay specifically for CPUless NUMA nodes. This delay
ensures that memory tier initialization for these nodes is deferred
until HMAT information is obtained during the boot process. Demotion
tables are then recalculated at the end.

* late_initcall(memory_tier_late_init);
Some device drivers may have initialized memory tiers between
`memory_tier_init()` and `memory_tier_late_init()`, potentially bringing
online memory nodes and configuring memory tiers. Such nodes should be
excluded from the late init.

* Handle cases where there is no HMAT when creating memory tiers
There is a scenario where a CPUless node does not provide HMAT information.
If no HMAT is specified, it falls back to using the default DRAM tier.

* Introduce another new lock `default_dram_perf_lock` for adist calculation
In the current implementation, iterating through CPUless nodes requires
holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
trying to acquire the same lock, leading to a potential deadlock.
Therefore, we propose introducing a standalone `default_dram_perf_lock` to
protect `default_dram_perf_*`. This approach not only avoids the deadlock
but also avoids holding one coarse lock across the whole operation.
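The scope-based locking used for the new lock can be sketched in plain user-space C. This is only an illustration of what `guard(mutex)()` does for the kernel code: the lock is taken when the guard variable is initialized and released automatically when it goes out of scope. `GUARD()`, `guard_release()`, and `bump_perf_ref()` are illustrative stand-ins built on GCC/Clang's cleanup attribute, not kernel API.

```c
#include <pthread.h>

static pthread_mutex_t perf_lock = PTHREAD_MUTEX_INITIALIZER;
static int perf_ref_count;      /* stands in for default_dram_perf_* state */

/* Cleanup handler: runs when the guard variable leaves scope. */
static void guard_release(pthread_mutex_t **m)
{
    pthread_mutex_unlock(*m);
}

/* Lock now; arrange for the unlock to happen at end of scope. */
#define GUARD(m) \
    pthread_mutex_t *_guard __attribute__((cleanup(guard_release))) = \
        (pthread_mutex_lock(m), (m))

static int bump_perf_ref(void)
{
    GUARD(&perf_lock);          /* no explicit unlock, even on early return */
    return ++perf_ref_count;
}
```

Because the unlock is tied to scope exit, every return path is covered, which is what makes the `mutex_lock()`-to-`guard(mutex)()` conversion in this series safe against missed-unlock bugs.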

* Upgrade `set_node_memory_tier` to support additional cases, including
  default DRAM, late CPUless, and hot-plugged initializations.
To cover hot-plugged memory nodes, `mt_calc_adistance()` and
`mt_find_alloc_memory_type()` are moved into `set_node_memory_tier()` to
handle cases where memtype is not initialized and where HMAT information is
available.

* Introduce `default_memory_types` for those memory types that are not
  initialized by device drivers.
Because late-initialized memory and default DRAM memory both need to be
managed, a default memory type list is created to store all memory types
that are not initialized by device drivers, and to serve as a fallback.

Signed-off-by: Ho-Ren (Jack) Chuang 
Signed-off-by: Hao Xiang 
Reviewed-by: "Huang, Ying" 
---
 mm/memory-tiers.c | 94 +++
 1 file changed, 70 insertions(+), 24 deletions(-)

diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
index 516b144fd45a..6632102bd5c9 100644
--- a/mm/memory-tiers.c
+++ b/mm/memory-tiers.c
@@ -36,6 +36,11 @@ struct node_memory_type_map {
 
 static DEFINE_MUTEX(memory_tier_lock);
 static LIST_HEAD(memory_tiers);
+/*
+ * The list is used to store all memory types that are not created
+ * by a device driver.
+ */
+static LIST_HEAD(default_memory_types);
 static struct node_memory_type_map node_memory_types[MAX_NUMNODES];
 struct memory_dev_type *default_dram_type;
 
@@ -108,6 +113,8 @@ static struct demotion_nodes *node_demotion __read_mostly;
 
 static BLOCKING_NOTIFIER_HEAD(mt_adistance_algorithms);
 
+/* The lock is used to protect `default_dram_perf*` info and nid. */
+static DEFINE_MUTEX(default_dram_perf_lock);
 static bool default_dram_perf_error;
 static struct access_coordinate default_dram_perf;
 static int default_dram_perf_ref_nid = NUMA_NO_NODE;
@@ -505,7 +512,8 @@ static inline void __init_node_memory_type(int node, struct memory_dev_type *mem
 static struct memory_tier *set_node_memory_tier(int node)
 {
struct memory_tier *memtier;
-   struct memory_dev_type *memtype;
+   struct memory_dev_type *memtype = default_dram_type;
+   int adist = MEMTIER_ADISTANCE_DRAM;
pg_data_t *pgdat = NODE_DATA(node);
 
 
@@ -514,7 +522,16 @@ static struct memory_tier *set_node_memory_tier(int node)
if (!node_state(node, N_MEMORY))
return ERR_PTR(-EINVAL);
 
-   __init_node_memory_type(node, default_dram_type);
+   mt_calc_adistance(node, &adist);
+   if (!node_memory_types[node].memtype) {
+   memtype = mt_find_alloc_memory_type(adist, &default_memory_types);
+   if (IS_ERR(memtype)) {
+   memtype = default_dram_type;
+   pr_info("Failed to allocate a memory type. Fall back.\n");
+   }
+   }
+
+   __init_node_memory_type(node, memtype);
 
memtype = node_memory_types[node].memtype;
node_set(node, memtype->nodes);
@@ -652,6 +669,35 @@ void mt_put_memory_types(struct list_head *memory_types)
 }
 EXPORT_SYMBOL_GPL(mt_put_memory_types);
 
+/*
+ * This is invoked via `late_initcall()` to initialize memory tiers for
+ * CPU-less memory nodes after driver initialization, which is
+ * expected to provide `adistance` algorithms.
+ */
+static int __init memory_tier_late_init(void)
+{
+   int nid;
+
+   guard(mutex)(&memory_tier_lock);
+   for_each_node_state(nid, N_MEMORY) {
+   /*

[PATCH v11 1/2] memory tier: dax/kmem: introduce an abstract layer for finding, allocating, and putting memory types

2024-04-04 Thread Ho-Ren (Jack) Chuang
Different memory devices all need to find, allocate, and put memory
types, so this patch abstracts those common steps, making the code more
scalable and concise.
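The find-or-allocate pattern being factored out can be modeled in a few lines of user-space C. This is an illustrative sketch of what `mt_find_alloc_memory_type()` does: scan a caller-provided list for a type with a matching abstract distance, and allocate and link a new one only on a miss. The struct layout and function name here are stand-ins, not the kernel's (the kernel version uses `list_head` and returns `ERR_PTR()` on failure).

```c
#include <stdlib.h>

struct memory_dev_type {
    int adistance;                  /* abstract distance key */
    struct memory_dev_type *next;
};

static struct memory_dev_type *
find_alloc_memory_type(int adist, struct memory_dev_type **list)
{
    struct memory_dev_type *mtype;

    /* Reuse an existing type with the same abstract distance. */
    for (mtype = *list; mtype; mtype = mtype->next)
        if (mtype->adistance == adist)
            return mtype;

    /* Miss: allocate and register a new type on the caller's list. */
    mtype = calloc(1, sizeof(*mtype));
    if (!mtype)
        return NULL;                /* kernel code returns ERR_PTR() here */
    mtype->adistance = adist;
    mtype->next = *list;
    *list = mtype;
    return mtype;
}
```

Keeping the list a caller-supplied parameter is what lets both dax/kmem and the new `default_memory_types` consumer share one implementation while retaining their own lifetimes and locks.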

Signed-off-by: Ho-Ren (Jack) Chuang 
Reviewed-by: "Huang, Ying" 
---
 drivers/dax/kmem.c   | 30 --
 include/linux/memory-tiers.h | 13 +
 mm/memory-tiers.c| 29 +
 3 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 42ee360cf4e3..4fe9d040e375 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -55,36 +55,14 @@ static LIST_HEAD(kmem_memory_types);
 
 static struct memory_dev_type *kmem_find_alloc_memory_type(int adist)
 {
-   bool found = false;
-   struct memory_dev_type *mtype;
-
-   mutex_lock(&kmem_memory_type_lock);
-   list_for_each_entry(mtype, &kmem_memory_types, list) {
-   if (mtype->adistance == adist) {
-   found = true;
-   break;
-   }
-   }
-   if (!found) {
-   mtype = alloc_memory_type(adist);
-   if (!IS_ERR(mtype))
-   list_add(&mtype->list, &kmem_memory_types);
-   }
-   mutex_unlock(&kmem_memory_type_lock);
-
-   return mtype;
+   guard(mutex)(&kmem_memory_type_lock);
+   return mt_find_alloc_memory_type(adist, &kmem_memory_types);
 }
 
 static void kmem_put_memory_types(void)
 {
-   struct memory_dev_type *mtype, *mtn;
-
-   mutex_lock(&kmem_memory_type_lock);
-   list_for_each_entry_safe(mtype, mtn, &kmem_memory_types, list) {
-   list_del(&mtype->list);
-   put_memory_type(mtype);
-   }
-   mutex_unlock(&kmem_memory_type_lock);
+   guard(mutex)(&kmem_memory_type_lock);
+   mt_put_memory_types(&kmem_memory_types);
 }
 
 static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h
index 69e781900082..0d70788558f4 100644
--- a/include/linux/memory-tiers.h
+++ b/include/linux/memory-tiers.h
@@ -48,6 +48,9 @@ int mt_calc_adistance(int node, int *adist);
 int mt_set_default_dram_perf(int nid, struct access_coordinate *perf,
 const char *source);
 int mt_perf_to_adistance(struct access_coordinate *perf, int *adist);
+struct memory_dev_type *mt_find_alloc_memory_type(int adist,
+ struct list_head *memory_types);
+void mt_put_memory_types(struct list_head *memory_types);
 #ifdef CONFIG_MIGRATION
 int next_demotion_node(int node);
 void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets);
@@ -136,5 +139,15 @@ static inline int mt_perf_to_adistance(struct access_coordinate *perf, int *adis
 {
return -EIO;
 }
+
+static inline struct memory_dev_type *mt_find_alloc_memory_type(int adist,
+   struct list_head *memory_types)
+{
+   return NULL;
+}
+
+static inline void mt_put_memory_types(struct list_head *memory_types)
+{
+}
 #endif /* CONFIG_NUMA */
 #endif  /* _LINUX_MEMORY_TIERS_H */
diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
index 0537664620e5..516b144fd45a 100644
--- a/mm/memory-tiers.c
+++ b/mm/memory-tiers.c
@@ -623,6 +623,35 @@ void clear_node_memory_type(int node, struct memory_dev_type *memtype)
 }
 EXPORT_SYMBOL_GPL(clear_node_memory_type);
 
+struct memory_dev_type *mt_find_alloc_memory_type(int adist, struct list_head *memory_types)
+{
+   struct memory_dev_type *mtype;
+
+   list_for_each_entry(mtype, memory_types, list)
+   if (mtype->adistance == adist)
+   return mtype;
+
+   mtype = alloc_memory_type(adist);
+   if (IS_ERR(mtype))
+   return mtype;
+
+   list_add(&mtype->list, memory_types);
+
+   return mtype;
+}
+EXPORT_SYMBOL_GPL(mt_find_alloc_memory_type);
+
+void mt_put_memory_types(struct list_head *memory_types)
+{
+   struct memory_dev_type *mtype, *mtn;
+
+   list_for_each_entry_safe(mtype, mtn, memory_types, list) {
+   list_del(&mtype->list);
+   put_memory_type(mtype);
+   }
+}
+EXPORT_SYMBOL_GPL(mt_put_memory_types);
+
 static void dump_hmem_attrs(struct access_coordinate *coord, const char *prefix)
 {
pr_info(
-- 
Ho-Ren (Jack) Chuang




[PATCH v11 0/2] Improved Memory Tier Creation for CPUless NUMA Nodes

2024-04-04 Thread Ho-Ren (Jack) Chuang
When a memory device, such as CXL 1.1 Type 3 memory, is emulated as
normal memory (E820_TYPE_RAM), the memory device is indistinguishable from
normal DRAM in terms of memory tiering with the current implementation.
The current memory tiering assigns all detected normal memory nodes to
the same DRAM tier. As a result, normal memory devices with different
attributes cannot be assigned to the correct memory tier, making it
impossible to migrate pages between different types of memory.
https://lore.kernel.org/linux-mm/ph0pr08mb7955e9f08ccb64f23963b5c3a8...@ph0pr08mb7955.namprd08.prod.outlook.com/T/

This patchset resolves these issues automatically. It delays the
initialization of memory tiers for CPUless NUMA nodes until they obtain
HMAT information and after all devices are initialized at boot time,
eliminating the need for user intervention. If no HMAT is specified,
it falls back to using `default_dram_type`.

Example use case:
We have CXL memory on the host, and we create VMs with a new system memory
device backed by host CXL memory. We inject CXL memory performance
attributes through QEMU, and the guest now sees memory nodes with
performance attributes in HMAT. With this change, we enable the
guest kernel to construct the correct memory tiering for the memory nodes.
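The injection described above can be done with QEMU's HMAT options roughly as follows. This is an illustrative invocation based on QEMU's NUMA documentation, not the exact setup used in this series; node sizes, latencies, and bandwidths are placeholders.

```shell
# Node 0: CPUs + DRAM.  Node 1: CPUless memory node standing in for the
# CXL-backed device.  The hmat-lb entries publish latency/bandwidth so
# the guest firmware builds an HMAT the kernel can consume.
qemu-system-x86_64 -machine hmat=on -smp 2 -m 2G \
  -object memory-backend-ram,size=1G,id=m0 \
  -object memory-backend-ram,size=1G,id=m1 \
  -numa node,nodeid=0,memdev=m0,cpus=0-1 \
  -numa node,nodeid=1,memdev=m1 \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=10ns \
  -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G \
  -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=40ns \
  -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G
```

With slower attributes on node 1, the guest's late memory-tier initialization should place it in a lower tier than node 0 instead of the shared DRAM tier.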

- v11:
 Thanks to comments from Jonathan,
 * Replace `mutex_lock()` with `guard(mutex)()`
 * Reorder some modifications within the patchset
 * Rewrite the code for improved readability and fixing alignment issues
 * Pass all strict rules in checkpatch.pl
- v10:
 Thanks to Andrew's and SeongJae's comments,
 * Address kunit compilation errors
 * Resolve the bug of not returning the correct error code in
   `mt_perf_to_adistance`
 * https://lore.kernel.org/lkml/20240402001739.2521623-1-horenchu...@bytedance.com/T/#u
- v9:
 * Address corner cases in `memory_tier_late_init`. Thank Ying's comments.
 * https://lore.kernel.org/lkml/20240329053353.309557-1-horenchu...@bytedance.com/T/#u
- v8:
 * Fix email format
 * https://lore.kernel.org/lkml/20240329004815.195476-1-horenchu...@bytedance.com/T/#u
- v7:
 * Add Reviewed-by: "Huang, Ying" 
- v6:
 Thanks to Ying's comments,
 * Move `default_dram_perf_lock` to the function's beginning for clarity
 * Fix double unlocking at v5
 * https://lore.kernel.org/lkml/20240327072729.3381685-1-horenchu...@bytedance.com/T/#u
- v5:
 Thanks to Ying's comments,
 * Add comments about what is protected by `default_dram_perf_lock`
 * Fix an uninitialized pointer mtype
 * Slightly shorten the time holding `default_dram_perf_lock`
 * Fix a deadlock bug in `mt_perf_to_adistance`
 * https://lore.kernel.org/lkml/20240327041646.3258110-1-horenchu...@bytedance.com/T/#u
- v4:
 Thanks to Ying's comments,
 * Remove redundant code
 * Reorganize patches accordingly
 * https://lore.kernel.org/lkml/20240322070356.315922-1-horenchu...@bytedance.com/T/#u
- v3:
 Thanks to Ying's comments,
 * Make the newly added code independent of HMAT
 * Upgrade set_node_memory_tier to support more cases
 * Put all non-driver-initialized memory types into default_memory_types
   instead of using hmat_memory_types
 * find_alloc_memory_type -> mt_find_alloc_memory_type
 * https://lore.kernel.org/lkml/20240320061041.3246828-1-horenchu...@bytedance.com/T/#u
- v2:
 Thanks to Ying's comments,
 * Rewrite cover letter & patch description
 * Rename functions, don't use _hmat
 * Abstract common functions into find_alloc_memory_type()
 * Use the expected way to use set_node_memory_tier instead of modifying it
 * https://lore.kernel.org/lkml/20240312061729.1997111-1-horenchu...@bytedance.com/T/#u
- v1:
 * https://lore.kernel.org/lkml/20240301082248.3456086-1-horenchu...@bytedance.com/T/#u

Ho-Ren (Jack) Chuang (2):
  memory tier: dax/kmem: introduce an abstract layer for finding,
allocating, and putting memory types
  memory tier: create CPUless memory tiers after obtaining HMAT info

 drivers/dax/kmem.c   |  30 ++---
 include/linux/memory-tiers.h |  13 
 mm/memory-tiers.c| 123 ---
 3 files changed, 116 insertions(+), 50 deletions(-)

-- 
Ho-Ren (Jack) Chuang




[PATCH v2 16/21] plugins: Introduce PLUGIN_CB_MEM_REGULAR

2024-04-04 Thread Richard Henderson
Use different enumerators for vcpu_udata and vcpu_mem callbacks.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h  | 1 +
 accel/tcg/plugin-gen.c | 2 +-
 plugins/core.c | 4 ++--
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index cf9758be55..34498da717 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -67,6 +67,7 @@ union qemu_plugin_cb_sig {
 
 enum plugin_dyn_cb_type {
 PLUGIN_CB_REGULAR,
+PLUGIN_CB_MEM_REGULAR,
 PLUGIN_CB_INLINE,
 };
 
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index e77ff2a565..c545303956 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -361,7 +361,7 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 
 if (cb->rw & rw) {
 switch (cb->type) {
-case PLUGIN_CB_REGULAR:
+case PLUGIN_CB_MEM_REGULAR:
 gen_mem_cb(cb, meminfo, addr);
 break;
 case PLUGIN_CB_INLINE:
diff --git a/plugins/core.c b/plugins/core.c
index b0615f1e7f..0213513ec6 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -391,7 +391,7 @@ void plugin_register_vcpu_mem_cb(GArray **arr,
 
 struct qemu_plugin_dyn_cb *dyn_cb = plugin_get_dyn_cb(arr);
 dyn_cb->userp = udata;
-dyn_cb->type = PLUGIN_CB_REGULAR;
+dyn_cb->type = PLUGIN_CB_MEM_REGULAR;
 dyn_cb->rw = rw;
 dyn_cb->regular.f.vcpu_mem = cb;
 
@@ -547,7 +547,7 @@ void qemu_plugin_vcpu_mem_cb(CPUState *cpu, uint64_t vaddr,
 break;
 }
 switch (cb->type) {
-case PLUGIN_CB_REGULAR:
+case PLUGIN_CB_MEM_REGULAR:
 cb->regular.f.vcpu_mem(cpu->cpu_index, make_plugin_meminfo(oi, rw),
vaddr, cb->userp);
 break;
-- 
2.34.1




[PATCH v2 13/21] tcg: Remove TCG_CALL_PLUGIN

2024-04-04 Thread Richard Henderson
Since we no longer emit plugin helpers during the initial code
translation phase, we don't need to specially mark plugin helpers.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h |  2 --
 plugins/core.c| 10 --
 tcg/tcg.c |  4 +---
 3 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 8d9f6585ff..196e3b7ba1 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -353,8 +353,6 @@ typedef TCGv_ptr TCGv_env;
 #define TCG_CALL_NO_SIDE_EFFECTS0x0004
 /* Helper is G_NORETURN.  */
 #define TCG_CALL_NO_RETURN  0x0008
-/* Helper is part of Plugins.  */
-#define TCG_CALL_PLUGIN 0x0010
 
 /* convenience version of most used call flags */
 #define TCG_CALL_NO_RWG TCG_CALL_NO_READ_GLOBALS
diff --git a/plugins/core.c b/plugins/core.c
index b0a2e80874..b0615f1e7f 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -339,9 +339,8 @@ void plugin_register_dyn_cb__udata(GArray **arr,
void *udata)
 {
 static TCGHelperInfo info[3] = {
-[QEMU_PLUGIN_CB_NO_REGS].flags = TCG_CALL_NO_RWG | TCG_CALL_PLUGIN,
-[QEMU_PLUGIN_CB_R_REGS].flags = TCG_CALL_NO_WG | TCG_CALL_PLUGIN,
-[QEMU_PLUGIN_CB_RW_REGS].flags = TCG_CALL_PLUGIN,
+[QEMU_PLUGIN_CB_NO_REGS].flags = TCG_CALL_NO_RWG,
+[QEMU_PLUGIN_CB_R_REGS].flags = TCG_CALL_NO_WG,
 /*
  * Match qemu_plugin_vcpu_udata_cb_t:
  *   void (*)(uint32_t, void *)
@@ -375,9 +374,8 @@ void plugin_register_vcpu_mem_cb(GArray **arr,
 !__builtin_types_compatible_p(qemu_plugin_meminfo_t, int32_t));
 
 static TCGHelperInfo info[3] = {
-[QEMU_PLUGIN_CB_NO_REGS].flags = TCG_CALL_NO_RWG | TCG_CALL_PLUGIN,
-[QEMU_PLUGIN_CB_R_REGS].flags = TCG_CALL_NO_WG | TCG_CALL_PLUGIN,
-[QEMU_PLUGIN_CB_RW_REGS].flags = TCG_CALL_PLUGIN,
+[QEMU_PLUGIN_CB_NO_REGS].flags = TCG_CALL_NO_RWG,
+[QEMU_PLUGIN_CB_R_REGS].flags = TCG_CALL_NO_WG,
 /*
  * Match qemu_plugin_vcpu_mem_cb_t:
  *   void (*)(uint32_t, qemu_plugin_meminfo_t, uint64_t, void *)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 0bf218314b..363a065e28 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2269,9 +2269,7 @@ static void tcg_gen_callN(void *func, TCGHelperInfo *info,
 
 #ifdef CONFIG_PLUGIN
 /* Flag helpers that may affect guest state */
-if (tcg_ctx->plugin_insn &&
-!(info->flags & TCG_CALL_PLUGIN) &&
-!(info->flags & TCG_CALL_NO_SIDE_EFFECTS)) {
+if (tcg_ctx->plugin_insn && !(info->flags & TCG_CALL_NO_SIDE_EFFECTS)) {
 tcg_ctx->plugin_insn->calls_helpers = true;
 }
 #endif
-- 
2.34.1




[PATCH v2 20/21] plugins: Inline plugin_gen_empty_callback

2024-04-04 Thread Richard Henderson
Each caller can use tcg_gen_plugin_cb directly.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 19 +++
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index c0cbc26984..d914d64de0 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -60,19 +60,6 @@ enum plugin_gen_from {
 PLUGIN_GEN_AFTER_TB,
 };
 
-static void plugin_gen_empty_callback(enum plugin_gen_from from)
-{
-switch (from) {
-case PLUGIN_GEN_AFTER_INSN:
-case PLUGIN_GEN_FROM_TB:
-case PLUGIN_GEN_FROM_INSN:
-tcg_gen_plugin_cb(from);
-break;
-default:
-g_assert_not_reached();
-}
-}
-
 /* called before finishing a TB with exit_tb, goto_tb or goto_ptr */
 void plugin_gen_disable_mem_helpers(void)
 {
@@ -362,7 +349,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
 ptb->mem_only = mem_only;
 ptb->mem_helper = false;
 
-plugin_gen_empty_callback(PLUGIN_GEN_FROM_TB);
+tcg_gen_plugin_cb(PLUGIN_GEN_FROM_TB);
 }
 
 tcg_ctx->plugin_insn = NULL;
@@ -419,12 +406,12 @@ void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
 insn->haddr = ptb->haddr2 + pc - ptb->vaddr2;
 }
 
-plugin_gen_empty_callback(PLUGIN_GEN_FROM_INSN);
+tcg_gen_plugin_cb(PLUGIN_GEN_FROM_INSN);
 }
 
 void plugin_gen_insn_end(void)
 {
-plugin_gen_empty_callback(PLUGIN_GEN_AFTER_INSN);
+tcg_gen_plugin_cb(PLUGIN_GEN_AFTER_INSN);
 }
 
 /*
-- 
2.34.1




[PATCH v2 15/21] plugins: Simplify callback queues

2024-04-04 Thread Richard Henderson
We have qemu_plugin_dyn_cb.type to differentiate the various
callback types, so we do not need to keep them in separate queues.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h  | 35 ++--
 accel/tcg/plugin-gen.c | 90 ++
 plugins/api.c  | 18 +++--
 3 files changed, 65 insertions(+), 78 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index ee1c1b174a..cf9758be55 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -66,15 +66,8 @@ union qemu_plugin_cb_sig {
 };
 
 enum plugin_dyn_cb_type {
-PLUGIN_CB_INSN,
-PLUGIN_CB_MEM,
-PLUGIN_N_CB_TYPES,
-};
-
-enum plugin_dyn_cb_subtype {
 PLUGIN_CB_REGULAR,
 PLUGIN_CB_INLINE,
-PLUGIN_N_CB_SUBTYPES,
 };
 
 /*
@@ -84,7 +77,7 @@ enum plugin_dyn_cb_subtype {
  */
 struct qemu_plugin_dyn_cb {
 void *userp;
-enum plugin_dyn_cb_subtype type;
+enum plugin_dyn_cb_type type;
 /* @rw applies to mem callbacks only (both regular and inline) */
 enum qemu_plugin_mem_rw rw;
 /* fields specific to each dyn_cb type go here */
@@ -106,7 +99,8 @@ struct qemu_plugin_insn {
 GByteArray *data;
 uint64_t vaddr;
 void *haddr;
-GArray *cbs[PLUGIN_N_CB_TYPES][PLUGIN_N_CB_SUBTYPES];
+GArray *insn_cbs;
+GArray *mem_cbs;
 bool calls_helpers;
 
 /* if set, the instruction calls helpers that might access guest memory */
@@ -135,16 +129,9 @@ static inline void qemu_plugin_insn_cleanup_fn(gpointer data)
 
 static inline struct qemu_plugin_insn *qemu_plugin_insn_alloc(void)
 {
-int i, j;
 struct qemu_plugin_insn *insn = g_new0(struct qemu_plugin_insn, 1);
-insn->data = g_byte_array_sized_new(4);
 
-for (i = 0; i < PLUGIN_N_CB_TYPES; i++) {
-for (j = 0; j < PLUGIN_N_CB_SUBTYPES; j++) {
-insn->cbs[i][j] = g_array_new(false, false,
-  sizeof(struct qemu_plugin_dyn_cb));
-}
-}
+insn->data = g_byte_array_sized_new(4);
 return insn;
 }
 
@@ -161,7 +148,7 @@ struct qemu_plugin_tb {
 /* if set, the TB calls helpers that might access guest memory */
 bool mem_helper;
 
-GArray *cbs[PLUGIN_N_CB_SUBTYPES];
+GArray *cbs;
 };
 
 /**
@@ -174,22 +161,22 @@ struct qemu_plugin_insn *qemu_plugin_tb_insn_get(struct qemu_plugin_tb *tb,
  uint64_t pc)
 {
 struct qemu_plugin_insn *insn;
-int i, j;
 
 if (unlikely(tb->n == tb->insns->len)) {
 struct qemu_plugin_insn *new_insn = qemu_plugin_insn_alloc();
 g_ptr_array_add(tb->insns, new_insn);
 }
+
 insn = g_ptr_array_index(tb->insns, tb->n++);
 g_byte_array_set_size(insn->data, 0);
 insn->calls_helpers = false;
 insn->mem_helper = false;
 insn->vaddr = pc;
-
-for (i = 0; i < PLUGIN_N_CB_TYPES; i++) {
-for (j = 0; j < PLUGIN_N_CB_SUBTYPES; j++) {
-g_array_set_size(insn->cbs[i][j], 0);
-}
+if (insn->insn_cbs) {
+g_array_set_size(insn->insn_cbs, 0);
+}
+if (insn->mem_cbs) {
+g_array_set_size(insn->mem_cbs, 0);
 }
 
 return insn;
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index d9ee9bb2ec..e77ff2a565 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -83,9 +83,8 @@ void plugin_gen_disable_mem_helpers(void)
 static void gen_enable_mem_helper(struct qemu_plugin_tb *ptb,
   struct qemu_plugin_insn *insn)
 {
-GArray *cbs[2];
 GArray *arr;
-size_t n_cbs;
+size_t len;
 
 /*
  * Tracking memory accesses performed from helpers requires extra work.
@@ -104,22 +103,25 @@ static void gen_enable_mem_helper(struct qemu_plugin_tb *ptb,
 return;
 }
 
-cbs[0] = insn->cbs[PLUGIN_CB_MEM][PLUGIN_CB_REGULAR];
-cbs[1] = insn->cbs[PLUGIN_CB_MEM][PLUGIN_CB_INLINE];
-n_cbs = cbs[0]->len + cbs[1]->len;
-
-if (n_cbs == 0) {
+if (!insn->mem_cbs || !insn->mem_cbs->len) {
 insn->mem_helper = false;
 return;
 }
 insn->mem_helper = true;
 ptb->mem_helper = true;
 
+/*
+ * TODO: It seems like we should be able to use ref/unref
+ * to avoid needing to actually copy this array.
+ * Alternately, perhaps we could allocate new memory adjacent
+ * to the TranslationBlock itself, so that we do not have to
+ * actively manage the lifetime after this.
+ */
+len = insn->mem_cbs->len;
 arr = g_array_sized_new(false, false,
-sizeof(struct qemu_plugin_dyn_cb), n_cbs);
-g_array_append_vals(arr, cbs[0]->data, cbs[0]->len);
-g_array_append_vals(arr, cbs[1]->data, cbs[1]->len);
-
+sizeof(struct qemu_plugin_dyn_cb), len);
+memcpy(arr->data, insn->mem_cbs->data,
+   len * sizeof(struct qemu_plugin_dyn_cb));
 qemu_plugin_add_dyn_cb_arr(arr);
 
 tcg_gen_st_ptr(tcg_con

[PATCH v2 17/21] plugins: Replace pr_ops with a proper debug dump flag

2024-04-04 Thread Richard Henderson
The DEBUG_PLUGIN_GEN_OPS ifdef is replaced with "-d op_plugin".
The second pr_ops call can be obtained with "-d op".
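A quick way to exercise the new flag, sketched below with hypothetical plugin and binary paths (any TCG plugin works):

```shell
# Dump TCG opcodes before plugin injection ("op_plugin", added by this
# patch); add "op" to also see the post-injection stream, and use
# -dfilter to restrict logging to an address range of interest.
qemu-x86_64 -plugin ./contrib/plugins/libinsn.so \
  -d op_plugin,op -dfilter 0x400000..0x500000 ./a.out
```

Routing the dump through the normal `-d` machinery means it inherits `qemu_log_in_addr_range()` filtering for free, which the old compile-time `DEBUG_PLUGIN_GEN_OPS` ifdef could not do.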

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/qemu/log.h |  1 +
 include/tcg/tcg.h  |  1 +
 accel/tcg/plugin-gen.c | 67 +++---
 tcg/tcg.c  | 29 +-
 util/log.c |  4 +++
 5 files changed, 45 insertions(+), 57 deletions(-)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index df59bfabcd..e10e24cd4f 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -36,6 +36,7 @@ bool qemu_log_separate(void);
 #define LOG_STRACE (1 << 19)
 #define LOG_PER_THREAD (1 << 20)
 #define CPU_LOG_TB_VPU (1 << 21)
+#define LOG_TB_OP_PLUGIN   (1 << 22)
 
 /* Lock/unlock output. */
 
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 196e3b7ba1..135e36d729 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1070,5 +1070,6 @@ static inline const TCGOpcode *tcg_swap_vecop_list(const TCGOpcode *n)
 }
 
 bool tcg_can_emit_vecop_list(const TCGOpcode *, TCGType, unsigned);
+void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs);
 
 #endif /* TCG_H */
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index c545303956..49d9b07438 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -44,6 +44,7 @@
  */
 #include "qemu/osdep.h"
 #include "qemu/plugin.h"
+#include "qemu/log.h"
 #include "cpu.h"
 #include "tcg/tcg.h"
 #include "tcg/tcg-temp-internal.h"
@@ -186,66 +187,21 @@ static void gen_mem_cb(struct qemu_plugin_dyn_cb *cb,
 tcg_temp_free_i32(cpu_index);
 }
 
-/* #define DEBUG_PLUGIN_GEN_OPS */
-static void pr_ops(void)
-{
-#ifdef DEBUG_PLUGIN_GEN_OPS
-TCGOp *op;
-int i = 0;
-
-QTAILQ_FOREACH(op, &tcg_ctx->ops, link) {
-const char *name = "";
-const char *type = "";
-
-if (op->opc == INDEX_op_plugin_cb_start) {
-switch (op->args[0]) {
-case PLUGIN_GEN_FROM_TB:
-name = "tb";
-break;
-case PLUGIN_GEN_FROM_INSN:
-name = "insn";
-break;
-case PLUGIN_GEN_FROM_MEM:
-name = "mem";
-break;
-case PLUGIN_GEN_AFTER_INSN:
-name = "after insn";
-break;
-default:
-break;
-}
-switch (op->args[1]) {
-case PLUGIN_GEN_CB_UDATA:
-type = "udata";
-break;
-case PLUGIN_GEN_CB_INLINE:
-type = "inline";
-break;
-case PLUGIN_GEN_CB_MEM:
-type = "mem";
-break;
-case PLUGIN_GEN_ENABLE_MEM_HELPER:
-type = "enable mem helper";
-break;
-case PLUGIN_GEN_DISABLE_MEM_HELPER:
-type = "disable mem helper";
-break;
-default:
-break;
-}
-}
-printf("op[%2i]: %s %s %s\n", i, tcg_op_defs[op->opc].name, name, type);
-i++;
-}
-#endif
-}
-
 static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 {
 TCGOp *op, *next;
 int insn_idx = -1;
 
-pr_ops();
+if (unlikely(qemu_loglevel_mask(LOG_TB_OP_PLUGIN)
+ && qemu_log_in_addr_range(plugin_tb->vaddr))) {
+FILE *logfile = qemu_log_trylock();
+if (logfile) {
+fprintf(logfile, "OP before plugin injection:\n");
+tcg_dump_ops(tcg_ctx, logfile, false);
+fprintf(logfile, "\n");
+qemu_log_unlock(logfile);
+}
+}
 
 /*
  * While injecting code, we cannot afford to reuse any ebb temps
@@ -383,7 +339,6 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 break;
 }
 }
-pr_ops();
 }
 
 bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 363a065e28..d248c52e96 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2540,6 +2540,15 @@ static const char bswap_flag_name[][6] = {
 [TCG_BSWAP_IZ | TCG_BSWAP_OS] = "iz,os",
 };
 
+#ifdef CONFIG_PLUGIN
+static const char * const plugin_from_name[] = {
+"from-tb",
+"from-insn",
+"after-insn",
+"after-tb",
+};
+#endif
+
 static inline bool tcg_regset_single(TCGRegSet d)
 {
 return (d & (d - 1)) == 0;
@@ -2558,7 +2567,7 @@ static inline TCGReg tcg_regset_first(TCGRegSet d)
 #define ne_fprintf(...) \
 ({ int ret_ = fprintf(__VA_ARGS__); ret_ >= 0 ? ret_ : 0; })
 
-static void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
+void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
 {
 char buf[128];
 TCGOp *op;
@@ -2714,6 +2723,24 @@ static void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
 i = k = 1;
 }
 break;
+#ifdef CONFIG_PLUGIN
+c

[PATCH v2 04/21] plugins: Zero new qemu_plugin_dyn_cb entries

2024-04-04 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 plugins/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/plugins/core.c b/plugins/core.c
index 11ca20e626..4487cb7c48 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -307,7 +307,7 @@ static struct qemu_plugin_dyn_cb *plugin_get_dyn_cb(GArray **arr)
 GArray *cbs = *arr;
 
 if (!cbs) {
-cbs = g_array_sized_new(false, false,
+cbs = g_array_sized_new(false, true,
 sizeof(struct qemu_plugin_dyn_cb), 1);
 *arr = cbs;
 }
-- 
2.34.1




[PATCH v2 10/21] plugins: Use emit_before_op for PLUGIN_GEN_FROM_INSN

2024-04-04 Thread Richard Henderson
Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h  |   1 -
 accel/tcg/plugin-gen.c | 286 ++---
 plugins/api.c  |   8 +-
 3 files changed, 67 insertions(+), 228 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 793c44f1f2..ee1c1b174a 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -73,7 +73,6 @@ enum plugin_dyn_cb_type {
 
 enum plugin_dyn_cb_subtype {
 PLUGIN_CB_REGULAR,
-PLUGIN_CB_REGULAR_R,
 PLUGIN_CB_INLINE,
 PLUGIN_N_CB_SUBTYPES,
 };
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 1faa49cb8f..a3dd82df4b 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -98,30 +98,6 @@ void HELPER(plugin_vcpu_mem_cb)(unsigned int vcpu_index,
 void *userdata)
 { }
 
-static void gen_empty_udata_cb(void (*gen_helper)(TCGv_i32, TCGv_ptr))
-{
-TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
-TCGv_ptr udata = tcg_temp_ebb_new_ptr();
-
-tcg_gen_movi_ptr(udata, 0);
-tcg_gen_ld_i32(cpu_index, tcg_env,
-   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
-gen_helper(cpu_index, udata);
-
-tcg_temp_free_ptr(udata);
-tcg_temp_free_i32(cpu_index);
-}
-
-static void gen_empty_udata_cb_no_wg(void)
-{
-gen_empty_udata_cb(gen_helper_plugin_vcpu_udata_cb_no_wg);
-}
-
-static void gen_empty_udata_cb_no_rwg(void)
-{
-gen_empty_udata_cb(gen_helper_plugin_vcpu_udata_cb_no_rwg);
-}
-
 /*
  * For now we only support addi_i64.
  * When we support more ops, we can generate one empty inline cb for each.
@@ -170,51 +146,19 @@ static void gen_empty_mem_cb(TCGv_i64 addr, uint32_t info)
 tcg_temp_free_i32(cpu_index);
 }
 
-/*
- * Share the same function for enable/disable. When enabling, the NULL
- * pointer will be overwritten later.
- */
-static void gen_empty_mem_helper(void)
-{
-TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
-
-tcg_gen_movi_ptr(ptr, 0);
-tcg_gen_st_ptr(ptr, tcg_env, offsetof(CPUState, plugin_mem_cbs) -
- offsetof(ArchCPU, env));
-tcg_temp_free_ptr(ptr);
-}
-
 static void gen_plugin_cb_start(enum plugin_gen_from from,
 enum plugin_gen_cb type, unsigned wr)
 {
 tcg_gen_plugin_cb_start(from, type, wr);
 }
 
-static void gen_wrapped(enum plugin_gen_from from,
-enum plugin_gen_cb type, void (*func)(void))
-{
-gen_plugin_cb_start(from, type, 0);
-func();
-tcg_gen_plugin_cb_end();
-}
-
 static void plugin_gen_empty_callback(enum plugin_gen_from from)
 {
 switch (from) {
 case PLUGIN_GEN_AFTER_INSN:
 case PLUGIN_GEN_FROM_TB:
-tcg_gen_plugin_cb(from);
-break;
 case PLUGIN_GEN_FROM_INSN:
-/*
- * Note: plugin_gen_inject() relies on ENABLE_MEM_HELPER being
- * the first callback of an instruction
- */
-gen_wrapped(from, PLUGIN_GEN_ENABLE_MEM_HELPER,
-gen_empty_mem_helper);
-gen_wrapped(from, PLUGIN_GEN_CB_UDATA, gen_empty_udata_cb_no_rwg);
-gen_wrapped(from, PLUGIN_GEN_CB_UDATA_R, gen_empty_udata_cb_no_wg);
-gen_wrapped(from, PLUGIN_GEN_CB_INLINE, gen_empty_inline_cb);
+tcg_gen_plugin_cb(from);
 break;
 default:
 g_assert_not_reached();
@@ -368,18 +312,6 @@ static TCGOp *copy_mul_i32(TCGOp **begin_op, TCGOp *op, uint32_t v)
 return op;
 }
 
-static TCGOp *copy_st_ptr(TCGOp **begin_op, TCGOp *op)
-{
-if (UINTPTR_MAX == UINT32_MAX) {
-/* st_i32 */
-op = copy_op(begin_op, op, INDEX_op_st_i32);
-} else {
-/* st_i64 */
-op = copy_st_i64(begin_op, op);
-}
-return op;
-}
-
 static TCGOp *copy_call(TCGOp **begin_op, TCGOp *op, void *func, int *cb_idx)
 {
 TCGOp *old_op;
@@ -403,32 +335,6 @@ static TCGOp *copy_call(TCGOp **begin_op, TCGOp *op, void *func, int *cb_idx)
 return op;
 }
 
-/*
- * When we append/replace ops here we are sensitive to changing patterns of
- * TCGOps generated by the tcg_gen_FOO calls when we generated the
- * empty callbacks. This will assert very quickly in a debug build as
- * we assert the ops we are replacing are the correct ones.
- */
-static TCGOp *append_udata_cb(const struct qemu_plugin_dyn_cb *cb,
-  TCGOp *begin_op, TCGOp *op, int *cb_idx)
-{
-/* const_ptr */
-op = copy_const_ptr(&begin_op, op, cb->userp);
-
-/* copy the ld_i32, but note that we only have to copy it once */
-if (*cb_idx == -1) {
-op = copy_op(&begin_op, op, INDEX_op_ld_i32);
-} else {
-begin_op = QTAILQ_NEXT(begin_op, link);
-tcg_debug_assert(begin_op && begin_op->opc == INDEX_op_ld_i32);
-}
-
-/* call */
-op = copy_call(&begin_op, op, cb->regular.f.vcpu_udata, cb_idx);
-
-return op;
-}
-
 static TCGOp *append_inline_cb(const struct qemu_plugin_dyn_cb *cb,
   

[PATCH v2 12/21] plugins: Remove plugin helpers

2024-04-04 Thread Richard Henderson
These placeholder helpers are no longer required.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-helpers.h |  5 -
 include/exec/helper-gen-common.h   |  4 
 include/exec/helper-proto-common.h |  4 
 accel/tcg/plugin-gen.c | 20 
 4 files changed, 33 deletions(-)
 delete mode 100644 accel/tcg/plugin-helpers.h

diff --git a/accel/tcg/plugin-helpers.h b/accel/tcg/plugin-helpers.h
deleted file mode 100644
index 11796436f3..0000000000
--- a/accel/tcg/plugin-helpers.h
+++ /dev/null
@@ -1,5 +0,0 @@
-#ifdef CONFIG_PLUGIN
-DEF_HELPER_FLAGS_2(plugin_vcpu_udata_cb_no_wg, TCG_CALL_NO_WG | TCG_CALL_PLUGIN, void, i32, ptr)
-DEF_HELPER_FLAGS_2(plugin_vcpu_udata_cb_no_rwg, TCG_CALL_NO_RWG | TCG_CALL_PLUGIN, void, i32, ptr)
-DEF_HELPER_FLAGS_4(plugin_vcpu_mem_cb, TCG_CALL_NO_RWG | TCG_CALL_PLUGIN, void, i32, i32, i64, ptr)
-#endif
diff --git a/include/exec/helper-gen-common.h b/include/exec/helper-gen-common.h
index 5d6d78a625..834590dc4e 100644
--- a/include/exec/helper-gen-common.h
+++ b/include/exec/helper-gen-common.h
@@ -11,8 +11,4 @@
 #include "exec/helper-gen.h.inc"
 #undef  HELPER_H
 
-#define HELPER_H "accel/tcg/plugin-helpers.h"
-#include "exec/helper-gen.h.inc"
-#undef  HELPER_H
-
 #endif /* HELPER_GEN_COMMON_H */
diff --git a/include/exec/helper-proto-common.h b/include/exec/helper-proto-common.h
index 8b67170a22..16782ef46c 100644
--- a/include/exec/helper-proto-common.h
+++ b/include/exec/helper-proto-common.h
@@ -13,8 +13,4 @@
 #include "exec/helper-proto.h.inc"
 #undef  HELPER_H
 
-#define HELPER_H "accel/tcg/plugin-helpers.h"
-#include "exec/helper-proto.h.inc"
-#undef  HELPER_H
-
 #endif /* HELPER_PROTO_COMMON_H */
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 8f8ae156b6..fb77585ac0 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -51,11 +51,6 @@
 #include "exec/exec-all.h"
 #include "exec/plugin-gen.h"
 #include "exec/translator.h"
-#include "exec/helper-proto-common.h"
-
-#define HELPER_H  "accel/tcg/plugin-helpers.h"
-#include "exec/helper-info.c.inc"
-#undef  HELPER_H
 
 /*
  * plugin_cb_start TCG op args[]:
@@ -82,21 +77,6 @@ enum plugin_gen_cb {
 PLUGIN_GEN_N_CBS,
 };
 
-/*
- * These helpers are stubs that get dynamically switched out for calls
- * direct to the plugin if they are subscribed to.
- */
-void HELPER(plugin_vcpu_udata_cb_no_wg)(uint32_t cpu_index, void *udata)
-{ }
-
-void HELPER(plugin_vcpu_udata_cb_no_rwg)(uint32_t cpu_index, void *udata)
-{ }
-
-void HELPER(plugin_vcpu_mem_cb)(unsigned int vcpu_index,
-qemu_plugin_meminfo_t info, uint64_t vaddr,
-void *userdata)
-{ }
-
 static void plugin_gen_empty_callback(enum plugin_gen_from from)
 {
 switch (from) {
-- 
2.34.1




[PATCH v2 03/21] tcg: Pass function pointer to tcg_gen_call*

2024-04-04 Thread Richard Henderson
For normal helpers, read the function pointer from the
structure earlier.  For plugins, this will allow the
function pointer to come from elsewhere.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 21 +---
 include/exec/helper-gen.h.inc | 24 ---
 tcg/tcg.c | 45 +++
 3 files changed, 52 insertions(+), 38 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index e4c598428d..8d9f6585ff 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -852,19 +852,22 @@ typedef struct TCGTargetOpDef {
 
 bool tcg_op_supported(TCGOpcode op);
 
-void tcg_gen_call0(TCGHelperInfo *, TCGTemp *ret);
-void tcg_gen_call1(TCGHelperInfo *, TCGTemp *ret, TCGTemp *);
-void tcg_gen_call2(TCGHelperInfo *, TCGTemp *ret, TCGTemp *, TCGTemp *);
-void tcg_gen_call3(TCGHelperInfo *, TCGTemp *ret, TCGTemp *,
-                   TCGTemp *, TCGTemp *);
-void tcg_gen_call4(TCGHelperInfo *, TCGTemp *ret, TCGTemp *, TCGTemp *,
-                   TCGTemp *, TCGTemp *);
-void tcg_gen_call5(TCGHelperInfo *, TCGTemp *ret, TCGTemp *, TCGTemp *,
-                   TCGTemp *, TCGTemp *, TCGTemp *);
-void tcg_gen_call6(TCGHelperInfo *, TCGTemp *ret, TCGTemp *, TCGTemp *,
-                   TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *);
-void tcg_gen_call7(TCGHelperInfo *, TCGTemp *ret, TCGTemp *, TCGTemp *,
-                   TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *);
+void tcg_gen_call0(void *func, TCGHelperInfo *, TCGTemp *ret);
+void tcg_gen_call1(void *func, TCGHelperInfo *, TCGTemp *ret, TCGTemp *);
+void tcg_gen_call2(void *func, TCGHelperInfo *, TCGTemp *ret,
+                   TCGTemp *, TCGTemp *);
+void tcg_gen_call3(void *func, TCGHelperInfo *, TCGTemp *ret,
+                   TCGTemp *, TCGTemp *, TCGTemp *);
+void tcg_gen_call4(void *func, TCGHelperInfo *, TCGTemp *ret,
+                   TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *);
+void tcg_gen_call5(void *func, TCGHelperInfo *, TCGTemp *ret,
+                   TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *);
+void tcg_gen_call6(void *func, TCGHelperInfo *, TCGTemp *ret,
+                   TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *,
+                   TCGTemp *, TCGTemp *);
+void tcg_gen_call7(void *func, TCGHelperInfo *, TCGTemp *ret,
+                   TCGTemp *, TCGTemp *, TCGTemp *, TCGTemp *,
+                   TCGTemp *, TCGTemp *, TCGTemp *);
 
 TCGOp *tcg_emit_op(TCGOpcode opc, unsigned nargs);
 void tcg_op_remove(TCGContext *s, TCGOp *op);
diff --git a/include/exec/helper-gen.h.inc b/include/exec/helper-gen.h.inc
index c009641517..f7eb59b6c1 100644
--- a/include/exec/helper-gen.h.inc
+++ b/include/exec/helper-gen.h.inc
@@ -14,7 +14,8 @@
 extern TCGHelperInfo glue(helper_info_, name);  \
 static inline void glue(gen_helper_, name)(dh_retvar_decl0(ret))\
 {   \
-tcg_gen_call0(&glue(helper_info_, name), dh_retvar(ret));   \
+tcg_gen_call0(glue(helper_info_,name).func, \
+  &glue(helper_info_,name), dh_retvar(ret));\
 }
 
 #define DEF_HELPER_FLAGS_1(name, flags, ret, t1)\
@@ -22,7 +23,8 @@ extern TCGHelperInfo glue(helper_info_, name);              \
 static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)  \
 dh_arg_decl(t1, 1)) \
 {   \
-tcg_gen_call1(&glue(helper_info_, name), dh_retvar(ret),\
+tcg_gen_call1(glue(helper_info_,name).func, \
+  &glue(helper_info_,name), dh_retvar(ret), \
   dh_arg(t1, 1));   \
 }
 
@@ -31,7 +33,8 @@ extern TCGHelperInfo glue(helper_info_, name);              \
 static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)  \
 dh_arg_decl(t1, 1), dh_arg_decl(t2, 2)) \
 {   \
-tcg_gen_call2(&glue(helper_info_, name), dh_retvar(ret),\
+tcg_gen_call2(glue(helper_info_,name).func, \
+  &glue(helper_info_,name), dh_retvar(ret), \
   dh_arg(t1, 1), dh_arg(t2, 2));\
 }
 
@@ -40,7 +43,8 @@ extern TCGHelperInfo glue(helper_info_, name);              \
 static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)  \
 dh_arg_decl(t1, 1), dh_arg_decl(t2, 2), dh_arg_decl(t3, 3)) \
 {   \
-tcg_gen_call3(&glue(helper_info_, name), dh_retvar(ret),\
+tcg_gen_call3(glue(helper_info_,name).func, \
+  &glue(helper_info_,name), dh_retvar(ret), \
   dh_arg(t1, 1), dh_arg(t2, 2), dh_arg(t3, 3)); \
 }
 
@@ -50,7 +54,8 @@ static inline void glue(gen_helper_

[PATCH v2 21/21] plugins: Update the documentation block for plugin-gen.c

2024-04-04 Thread Richard Henderson
Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 31 ---
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index d914d64de0..3db74ae9bf 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -14,33 +14,10 @@
  * Injecting the desired instrumentation could be done with a second
  * translation pass that combined the instrumentation requests, but that
  * would be ugly and inefficient since we would decode the guest code twice.
- * Instead, during TB translation we add "empty" instrumentation calls for all
- * possible instrumentation events, and then once we collect the instrumentation
- * requests from plugins, we either "fill in" those empty events or remove them
- * if they have no requests.
- *
- * When "filling in" an event we first copy the empty callback's TCG ops. This
- * might seem unnecessary, but it is done to support an arbitrary number
- * of callbacks per event. Take for example a regular instruction callback.
- * We first generate a callback to an empty helper function. Then, if two
- * plugins register one callback each for this instruction, we make two copies
- * of the TCG ops generated for the empty callback, substituting the function
- * pointer that points to the empty helper function with the plugins' desired
- * callback functions. After that we remove the empty callback's ops.
- *
- * Note that the location in TCGOp.args[] of the pointer to a helper function
- * varies across different guest and host architectures. Instead of duplicating
- * the logic that figures this out, we rely on the fact that the empty
- * callbacks point to empty functions that are unique pointers in the program.
- * Thus, to find the right location we just have to look for a match in
- * TCGOp.args[]. This is the main reason why we first copy an empty callback's
- * TCG ops and then fill them in; regardless of whether we have one or many
- * callbacks for that event, the logic to add all of them is the same.
- *
- * When generating more than one callback per event, we make a small
- * optimization to avoid generating redundant operations. For instance, for the
- * second and all subsequent callbacks of an event, we do not need to reload the
- * CPU's index into a TCG temp, since the first callback did it already.
+ * Instead, during TB translation we add "plugin_cb" marker opcodes
+ * for all possible instrumentation events, and then once we collect the
+ * instrumentation requests from plugins, we generate code for those markers
+ * or remove them if they have no requests.
  */
 #include "qemu/osdep.h"
 #include "qemu/plugin.h"
-- 
2.34.1




[PATCH v2 19/21] plugins: Merge qemu_plugin_tb_insn_get to plugin-gen.c

2024-04-04 Thread Richard Henderson
Merge qemu_plugin_insn_alloc and qemu_plugin_tb_insn_get into
plugin_gen_insn_start, since they are used nowhere else.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h  | 39 ---
 accel/tcg/plugin-gen.c | 39 ---
 2 files changed, 32 insertions(+), 46 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 34498da717..07b1755990 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -128,14 +128,6 @@ static inline void qemu_plugin_insn_cleanup_fn(gpointer data)
 g_byte_array_free(insn->data, true);
 }
 
-static inline struct qemu_plugin_insn *qemu_plugin_insn_alloc(void)
-{
-struct qemu_plugin_insn *insn = g_new0(struct qemu_plugin_insn, 1);
-
-insn->data = g_byte_array_sized_new(4);
-return insn;
-}
-
 /* Internal context for this TranslationBlock */
 struct qemu_plugin_tb {
 GPtrArray *insns;
@@ -152,37 +144,6 @@ struct qemu_plugin_tb {
 GArray *cbs;
 };
 
-/**
- * qemu_plugin_tb_insn_get(): get next plugin record for translation.
- * @tb: the internal tb context
- * @pc: address of instruction
- */
-static inline
-struct qemu_plugin_insn *qemu_plugin_tb_insn_get(struct qemu_plugin_tb *tb,
- uint64_t pc)
-{
-struct qemu_plugin_insn *insn;
-
-if (unlikely(tb->n == tb->insns->len)) {
-struct qemu_plugin_insn *new_insn = qemu_plugin_insn_alloc();
-g_ptr_array_add(tb->insns, new_insn);
-}
-
-insn = g_ptr_array_index(tb->insns, tb->n++);
-g_byte_array_set_size(insn->data, 0);
-insn->calls_helpers = false;
-insn->mem_helper = false;
-insn->vaddr = pc;
-if (insn->insn_cbs) {
-g_array_set_size(insn->insn_cbs, 0);
-}
-if (insn->mem_cbs) {
-g_array_set_size(insn->mem_cbs, 0);
-}
-
-return insn;
-}
-
 /**
  * struct CPUPluginState - per-CPU state for plugins
  * @event_mask: plugin event bitmap. Modified only via async work.
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 5b63b93114..c0cbc26984 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -373,11 +373,34 @@ bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
 void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
 {
 struct qemu_plugin_tb *ptb = tcg_ctx->plugin_tb;
-struct qemu_plugin_insn *pinsn;
+struct qemu_plugin_insn *insn;
+size_t n = db->num_insns;
+vaddr pc;
 
-pinsn = qemu_plugin_tb_insn_get(ptb, db->pc_next);
-tcg_ctx->plugin_insn = pinsn;
-plugin_gen_empty_callback(PLUGIN_GEN_FROM_INSN);
+assert(n >= 1);
+ptb->n = n;
+if (n <= ptb->insns->len) {
+insn = g_ptr_array_index(ptb->insns, n - 1);
+g_byte_array_set_size(insn->data, 0);
+} else {
+assert(n - 1 == ptb->insns->len);
+insn = g_new0(struct qemu_plugin_insn, 1);
+insn->data = g_byte_array_sized_new(4);
+g_ptr_array_add(ptb->insns, insn);
+}
+
+tcg_ctx->plugin_insn = insn;
+insn->calls_helpers = false;
+insn->mem_helper = false;
+if (insn->insn_cbs) {
+g_array_set_size(insn->insn_cbs, 0);
+}
+if (insn->mem_cbs) {
+g_array_set_size(insn->mem_cbs, 0);
+}
+
+pc = db->pc_next;
+insn->vaddr = pc;
 
 /*
  * Detect page crossing to get the new host address.
@@ -385,16 +408,18 @@ void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
  * fetching instructions from a region not backed by RAM.
  */
 if (ptb->haddr1 == NULL) {
-pinsn->haddr = NULL;
+insn->haddr = NULL;
 } else if (is_same_page(db, db->pc_next)) {
-pinsn->haddr = ptb->haddr1 + pinsn->vaddr - ptb->vaddr;
+insn->haddr = ptb->haddr1 + pc - ptb->vaddr;
 } else {
 if (ptb->vaddr2 == -1) {
 ptb->vaddr2 = TARGET_PAGE_ALIGN(db->pc_first);
 get_page_addr_code_hostp(cpu_env(cpu), ptb->vaddr2, &ptb->haddr2);
 }
-pinsn->haddr = ptb->haddr2 + pinsn->vaddr - ptb->vaddr2;
+insn->haddr = ptb->haddr2 + pc - ptb->vaddr2;
 }
+
+plugin_gen_empty_callback(PLUGIN_GEN_FROM_INSN);
 }
 
 void plugin_gen_insn_end(void)
-- 
2.34.1




[PATCH v2 11/21] plugins: Use emit_before_op for PLUGIN_GEN_FROM_MEM

2024-04-04 Thread Richard Henderson
Introduce a new plugin_mem_cb op to hold the address temp
and meminfo computed by tcg-op-ldst.c.  Because this now
has its own opcode, we no longer need PLUGIN_GEN_FROM_MEM.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/exec/plugin-gen.h   |   4 -
 include/tcg/tcg-op-common.h |   1 +
 include/tcg/tcg-opc.h   |   1 +
 accel/tcg/plugin-gen.c  | 408 
 tcg/tcg-op-ldst.c   |   6 +-
 tcg/tcg-op.c|   5 +
 6 files changed, 54 insertions(+), 371 deletions(-)

diff --git a/include/exec/plugin-gen.h b/include/exec/plugin-gen.h
index c4552b5061..f333f33198 100644
--- a/include/exec/plugin-gen.h
+++ b/include/exec/plugin-gen.h
@@ -25,7 +25,6 @@ void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db);
 void plugin_gen_insn_end(void);
 
 void plugin_gen_disable_mem_helpers(void);
-void plugin_gen_empty_mem_callback(TCGv_i64 addr, uint32_t info);
 
 #else /* !CONFIG_PLUGIN */
 
@@ -48,9 +47,6 @@ static inline void plugin_gen_tb_end(CPUState *cpu, size_t num_insns)
 static inline void plugin_gen_disable_mem_helpers(void)
 { }
 
-static inline void plugin_gen_empty_mem_callback(TCGv_i64 addr, uint32_t info)
-{ }
-
 #endif /* CONFIG_PLUGIN */
 
 #endif /* QEMU_PLUGIN_GEN_H */
diff --git a/include/tcg/tcg-op-common.h b/include/tcg/tcg-op-common.h
index 9de5a7f280..72b80b20d0 100644
--- a/include/tcg/tcg-op-common.h
+++ b/include/tcg/tcg-op-common.h
@@ -75,6 +75,7 @@ void tcg_gen_goto_tb(unsigned idx);
 void tcg_gen_lookup_and_goto_ptr(void);
 
 void tcg_gen_plugin_cb(unsigned from);
+void tcg_gen_plugin_mem_cb(TCGv_i64 addr, unsigned meminfo);
 void tcg_gen_plugin_cb_start(unsigned from, unsigned type, unsigned wr);
 void tcg_gen_plugin_cb_end(void);
 
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index 3b7cb2bce1..be9e36e386 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -198,6 +198,7 @@ DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 DEF(goto_ptr, 0, 1, 0, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 
 DEF(plugin_cb, 0, 0, 1, TCG_OPF_NOT_PRESENT)
+DEF(plugin_mem_cb, 0, 1, 1, TCG_OPF_NOT_PRESENT)
 DEF(plugin_cb_start, 0, 0, 3, TCG_OPF_NOT_PRESENT)
 DEF(plugin_cb_end, 0, 0, 0, TCG_OPF_NOT_PRESENT)
 
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index a3dd82df4b..8f8ae156b6 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -67,7 +67,6 @@
 enum plugin_gen_from {
 PLUGIN_GEN_FROM_TB,
 PLUGIN_GEN_FROM_INSN,
-PLUGIN_GEN_FROM_MEM,
 PLUGIN_GEN_AFTER_INSN,
 PLUGIN_GEN_AFTER_TB,
 PLUGIN_GEN_N_FROMS,
@@ -98,60 +97,6 @@ void HELPER(plugin_vcpu_mem_cb)(unsigned int vcpu_index,
 void *userdata)
 { }
 
-/*
- * For now we only support addi_i64.
- * When we support more ops, we can generate one empty inline cb for each.
- */
-static void gen_empty_inline_cb(void)
-{
-TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
-TCGv_ptr cpu_index_as_ptr = tcg_temp_ebb_new_ptr();
-TCGv_i64 val = tcg_temp_ebb_new_i64();
-TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
-
-tcg_gen_ld_i32(cpu_index, tcg_env,
-   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
-/* second operand will be replaced by immediate value */
-tcg_gen_mul_i32(cpu_index, cpu_index, cpu_index);
-tcg_gen_ext_i32_ptr(cpu_index_as_ptr, cpu_index);
-
-tcg_gen_movi_ptr(ptr, 0);
-tcg_gen_add_ptr(ptr, ptr, cpu_index_as_ptr);
-tcg_gen_ld_i64(val, ptr, 0);
-/* second operand will be replaced by immediate value */
-tcg_gen_add_i64(val, val, val);
-
-tcg_gen_st_i64(val, ptr, 0);
-tcg_temp_free_ptr(ptr);
-tcg_temp_free_i64(val);
-tcg_temp_free_ptr(cpu_index_as_ptr);
-tcg_temp_free_i32(cpu_index);
-}
-
-static void gen_empty_mem_cb(TCGv_i64 addr, uint32_t info)
-{
-TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
-TCGv_i32 meminfo = tcg_temp_ebb_new_i32();
-TCGv_ptr udata = tcg_temp_ebb_new_ptr();
-
-tcg_gen_movi_i32(meminfo, info);
-tcg_gen_movi_ptr(udata, 0);
-tcg_gen_ld_i32(cpu_index, tcg_env,
-   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
-
-gen_helper_plugin_vcpu_mem_cb(cpu_index, meminfo, addr, udata);
-
-tcg_temp_free_ptr(udata);
-tcg_temp_free_i32(meminfo);
-tcg_temp_free_i32(cpu_index);
-}
-
-static void gen_plugin_cb_start(enum plugin_gen_from from,
-enum plugin_gen_cb type, unsigned wr)
-{
-tcg_gen_plugin_cb_start(from, type, wr);
-}
-
 static void plugin_gen_empty_callback(enum plugin_gen_from from)
 {
 switch (from) {
@@ -165,278 +110,6 @@ static void plugin_gen_empty_callback(enum plugin_gen_from from)
 }
 }
 
-void plugin_gen_empty_mem_callback(TCGv_i64 addr, uint32_t info)
-{
-enum qemu_plugin_mem_rw rw = get_plugin_meminfo_rw(info);
-
-gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_MEM, rw);
-gen_empty_mem_cb(addr, info);
-tcg

[PATCH v2 18/21] plugins: Split out common cb expanders

2024-04-04 Thread Richard Henderson
Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 84 +-
 1 file changed, 41 insertions(+), 43 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 49d9b07438..5b63b93114 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -187,6 +187,37 @@ static void gen_mem_cb(struct qemu_plugin_dyn_cb *cb,
 tcg_temp_free_i32(cpu_index);
 }
 
+static void inject_cb(struct qemu_plugin_dyn_cb *cb)
+
+{
+switch (cb->type) {
+case PLUGIN_CB_REGULAR:
+gen_udata_cb(cb);
+break;
+case PLUGIN_CB_INLINE:
+gen_inline_cb(cb);
+break;
+default:
+g_assert_not_reached();
+}
+}
+
+static void inject_mem_cb(struct qemu_plugin_dyn_cb *cb,
+  enum qemu_plugin_mem_rw rw,
+  qemu_plugin_meminfo_t meminfo, TCGv_i64 addr)
+{
+if (cb->rw & rw) {
+switch (cb->type) {
+case PLUGIN_CB_MEM_REGULAR:
+gen_mem_cb(cb, meminfo, addr);
+break;
+default:
+inject_cb(cb);
+break;
+}
+}
+}
+
 static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 {
 TCGOp *op, *next;
@@ -248,19 +279,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 
 cbs = plugin_tb->cbs;
 for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
-struct qemu_plugin_dyn_cb *cb =
-&g_array_index(cbs, struct qemu_plugin_dyn_cb, i);
-
-switch (cb->type) {
-case PLUGIN_CB_REGULAR:
-gen_udata_cb(cb);
-break;
-case PLUGIN_CB_INLINE:
-gen_inline_cb(cb);
-break;
-default:
-g_assert_not_reached();
-}
+inject_cb(
+&g_array_index(cbs, struct qemu_plugin_dyn_cb, i));
 }
 break;
 
@@ -271,19 +291,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 
 cbs = insn->insn_cbs;
 for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
-struct qemu_plugin_dyn_cb *cb =
-&g_array_index(cbs, struct qemu_plugin_dyn_cb, i);
-
-switch (cb->type) {
-case PLUGIN_CB_REGULAR:
-gen_udata_cb(cb);
-break;
-case PLUGIN_CB_INLINE:
-gen_inline_cb(cb);
-break;
-default:
-g_assert_not_reached();
-}
+inject_cb(
+&g_array_index(cbs, struct qemu_plugin_dyn_cb, i));
 }
 break;
 
@@ -300,33 +309,22 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 {
 TCGv_i64 addr = temp_tcgv_i64(arg_temp(op->args[0]));
 qemu_plugin_meminfo_t meminfo = op->args[1];
+enum qemu_plugin_mem_rw rw =
+(qemu_plugin_mem_is_store(meminfo)
+ ? QEMU_PLUGIN_MEM_W : QEMU_PLUGIN_MEM_R);
 struct qemu_plugin_insn *insn;
 const GArray *cbs;
-int i, n, rw;
+int i, n;
 
 assert(insn_idx >= 0);
 insn = g_ptr_array_index(plugin_tb->insns, insn_idx);
-rw = qemu_plugin_mem_is_store(meminfo) ? 2 : 1;
 
 tcg_ctx->emit_before_op = op;
 
 cbs = insn->mem_cbs;
 for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
-struct qemu_plugin_dyn_cb *cb =
-&g_array_index(cbs, struct qemu_plugin_dyn_cb, i);
-
-if (cb->rw & rw) {
-switch (cb->type) {
-case PLUGIN_CB_MEM_REGULAR:
-gen_mem_cb(cb, meminfo, addr);
-break;
-case PLUGIN_CB_INLINE:
-gen_inline_cb(cb);
-break;
-default:
-g_assert_not_reached();
-}
-}
+inject_mem_cb(&g_array_index(cbs, struct qemu_plugin_dyn_cb, i),
+  rw, meminfo, addr);
 }
 
 tcg_ctx->emit_before_op = NULL;
-- 
2.34.1




[PATCH v2 14/21] tcg: Remove INDEX_op_plugin_cb_{start,end}

2024-04-04 Thread Richard Henderson
These opcodes are no longer used.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op-common.h |  2 --
 include/tcg/tcg-opc.h   |  2 --
 accel/tcg/plugin-gen.c  | 18 --
 tcg/tcg-op.c| 10 --
 4 files changed, 32 deletions(-)

diff --git a/include/tcg/tcg-op-common.h b/include/tcg/tcg-op-common.h
index 72b80b20d0..009e2778c5 100644
--- a/include/tcg/tcg-op-common.h
+++ b/include/tcg/tcg-op-common.h
@@ -76,8 +76,6 @@ void tcg_gen_lookup_and_goto_ptr(void);
 
 void tcg_gen_plugin_cb(unsigned from);
 void tcg_gen_plugin_mem_cb(TCGv_i64 addr, unsigned meminfo);
-void tcg_gen_plugin_cb_start(unsigned from, unsigned type, unsigned wr);
-void tcg_gen_plugin_cb_end(void);
 
 /* 32 bit ops */
 
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index be9e36e386..546eb49c11 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -199,8 +199,6 @@ DEF(goto_ptr, 0, 1, 0, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 
 DEF(plugin_cb, 0, 0, 1, TCG_OPF_NOT_PRESENT)
 DEF(plugin_mem_cb, 0, 1, 1, TCG_OPF_NOT_PRESENT)
-DEF(plugin_cb_start, 0, 0, 3, TCG_OPF_NOT_PRESENT)
-DEF(plugin_cb_end, 0, 0, 0, TCG_OPF_NOT_PRESENT)
 
 /* Replicate ld/st ops for 32 and 64-bit guest addresses. */
 DEF(qemu_ld_a32_i32, 1, 1, 1,
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index fb77585ac0..d9ee9bb2ec 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -52,29 +52,11 @@
 #include "exec/plugin-gen.h"
 #include "exec/translator.h"
 
-/*
- * plugin_cb_start TCG op args[]:
- * 0: enum plugin_gen_from
- * 1: enum plugin_gen_cb
- * 2: set to 1 for mem callback that is a write, 0 otherwise.
- */
-
 enum plugin_gen_from {
 PLUGIN_GEN_FROM_TB,
 PLUGIN_GEN_FROM_INSN,
 PLUGIN_GEN_AFTER_INSN,
 PLUGIN_GEN_AFTER_TB,
-PLUGIN_GEN_N_FROMS,
-};
-
-enum plugin_gen_cb {
-PLUGIN_GEN_CB_UDATA,
-PLUGIN_GEN_CB_UDATA_R,
-PLUGIN_GEN_CB_INLINE,
-PLUGIN_GEN_CB_MEM,
-PLUGIN_GEN_ENABLE_MEM_HELPER,
-PLUGIN_GEN_DISABLE_MEM_HELPER,
-PLUGIN_GEN_N_CBS,
 };
 
 static void plugin_gen_empty_callback(enum plugin_gen_from from)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 0ae12fa49d..eff3728622 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -322,16 +322,6 @@ void tcg_gen_plugin_mem_cb(TCGv_i64 addr, unsigned meminfo)
 tcg_gen_op2(INDEX_op_plugin_mem_cb, tcgv_i64_arg(addr), meminfo);
 }
 
-void tcg_gen_plugin_cb_start(unsigned from, unsigned type, unsigned wr)
-{
-tcg_gen_op3(INDEX_op_plugin_cb_start, from, type, wr);
-}
-
-void tcg_gen_plugin_cb_end(void)
-{
-tcg_emit_op(INDEX_op_plugin_cb_end, 0);
-}
-
 /* 32 bit ops */
 
 void tcg_gen_discard_i32(TCGv_i32 arg)
-- 
2.34.1




[PATCH v2 09/21] plugins: Add PLUGIN_GEN_AFTER_TB

2024-04-04 Thread Richard Henderson
Delay test of plugin_tb->mem_helper until the inject pass.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 37 -
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index c803fe8e96..1faa49cb8f 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -69,6 +69,7 @@ enum plugin_gen_from {
 PLUGIN_GEN_FROM_INSN,
 PLUGIN_GEN_FROM_MEM,
 PLUGIN_GEN_AFTER_INSN,
+PLUGIN_GEN_AFTER_TB,
 PLUGIN_GEN_N_FROMS,
 };
 
@@ -609,20 +610,9 @@ static void inject_mem_enable_helper(struct qemu_plugin_tb *ptb,
 /* called before finishing a TB with exit_tb, goto_tb or goto_ptr */
 void plugin_gen_disable_mem_helpers(void)
 {
-/*
- * We could emit the clearing unconditionally and be done. However, this can
- * be wasteful if for instance plugins don't track memory accesses, or if
- * most TBs don't use helpers. Instead, emit the clearing iff the TB calls
- * helpers that might access guest memory.
- *
- * Note: we do not reset plugin_tb->mem_helper here; a TB might have several
- * exit points, and we want to emit the clearing from all of them.
- */
-if (!tcg_ctx->plugin_tb->mem_helper) {
-return;
+if (tcg_ctx->plugin_insn) {
+tcg_gen_plugin_cb(PLUGIN_GEN_AFTER_TB);
 }
-tcg_gen_st_ptr(tcg_constant_ptr(NULL), tcg_env,
-   offsetof(CPUState, plugin_mem_cbs) - offsetof(ArchCPU, env));
 }
 
 static void plugin_gen_insn_udata(const struct qemu_plugin_tb *ptb,
@@ -673,14 +663,11 @@ static void plugin_gen_enable_mem_helper(struct qemu_plugin_tb *ptb,
 inject_mem_enable_helper(ptb, insn, begin_op);
 }
 
-static void gen_disable_mem_helper(struct qemu_plugin_tb *ptb,
-   struct qemu_plugin_insn *insn)
+static void gen_disable_mem_helper(void)
 {
-if (insn->mem_helper) {
-tcg_gen_st_ptr(tcg_constant_ptr(0), tcg_env,
-   offsetof(CPUState, plugin_mem_cbs) -
-   offsetof(ArchCPU, env));
-}
+tcg_gen_st_ptr(tcg_constant_ptr(0), tcg_env,
+   offsetof(CPUState, plugin_mem_cbs) -
+   offsetof(ArchCPU, env));
 }
 
 static void gen_udata_cb(struct qemu_plugin_dyn_cb *cb)
@@ -806,9 +793,17 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 tcg_ctx->emit_before_op = op;
 
 switch (from) {
+case PLUGIN_GEN_AFTER_TB:
+if (plugin_tb->mem_helper) {
+gen_disable_mem_helper();
+}
+break;
+
 case PLUGIN_GEN_AFTER_INSN:
 assert(insn != NULL);
-gen_disable_mem_helper(plugin_tb, insn);
+if (insn->mem_helper) {
+gen_disable_mem_helper();
+}
 break;
 
 case PLUGIN_GEN_FROM_TB:
-- 
2.34.1




[PATCH v2 08/21] plugins: Use emit_before_op for PLUGIN_GEN_FROM_TB

2024-04-04 Thread Richard Henderson
By recording the qemu_plugin_cb_flags in the TCGHelperInfo,
we no longer need to distinguish PLUGIN_CB_REGULAR from
PLUGIN_CB_REGULAR_R, so place all TB callbacks in the same queue.

Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 96 +-
 plugins/api.c  |  6 +--
 2 files changed, 58 insertions(+), 44 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 4b02c0bfbf..c803fe8e96 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -201,6 +201,7 @@ static void plugin_gen_empty_callback(enum plugin_gen_from from)
 {
 switch (from) {
 case PLUGIN_GEN_AFTER_INSN:
+case PLUGIN_GEN_FROM_TB:
 tcg_gen_plugin_cb(from);
 break;
 case PLUGIN_GEN_FROM_INSN:
@@ -210,8 +211,6 @@ static void plugin_gen_empty_callback(enum plugin_gen_from from)
  */
 gen_wrapped(from, PLUGIN_GEN_ENABLE_MEM_HELPER,
 gen_empty_mem_helper);
-/* fall through */
-case PLUGIN_GEN_FROM_TB:
 gen_wrapped(from, PLUGIN_GEN_CB_UDATA, gen_empty_udata_cb_no_rwg);
 gen_wrapped(from, PLUGIN_GEN_CB_UDATA_R, gen_empty_udata_cb_no_wg);
 gen_wrapped(from, PLUGIN_GEN_CB_INLINE, gen_empty_inline_cb);
@@ -626,24 +625,6 @@ void plugin_gen_disable_mem_helpers(void)
 offsetof(CPUState, plugin_mem_cbs) - offsetof(ArchCPU, env));
 }
 
-static void plugin_gen_tb_udata(const struct qemu_plugin_tb *ptb,
-TCGOp *begin_op)
-{
-inject_udata_cb(ptb->cbs[PLUGIN_CB_REGULAR], begin_op);
-}
-
-static void plugin_gen_tb_udata_r(const struct qemu_plugin_tb *ptb,
-  TCGOp *begin_op)
-{
-inject_udata_cb(ptb->cbs[PLUGIN_CB_REGULAR_R], begin_op);
-}
-
-static void plugin_gen_tb_inline(const struct qemu_plugin_tb *ptb,
- TCGOp *begin_op)
-{
-inject_inline_cb(ptb->cbs[PLUGIN_CB_INLINE], begin_op, op_ok);
-}
-
 static void plugin_gen_insn_udata(const struct qemu_plugin_tb *ptb,
   TCGOp *begin_op, int insn_idx)
 {
@@ -702,6 +683,41 @@ static void gen_disable_mem_helper(struct qemu_plugin_tb *ptb,
 }
 }
 
+static void gen_udata_cb(struct qemu_plugin_dyn_cb *cb)
+{
+TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+
+tcg_gen_ld_i32(cpu_index, tcg_env,
+   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
+tcg_gen_call2(cb->regular.f.vcpu_udata, cb->regular.info, NULL,
+  tcgv_i32_temp(cpu_index),
+  tcgv_ptr_temp(tcg_constant_ptr(cb->userp)));
+tcg_temp_free_i32(cpu_index);
+}
+
+static void gen_inline_cb(struct qemu_plugin_dyn_cb *cb)
+{
+GArray *arr = cb->inline_insn.entry.score->data;
+size_t offset = cb->inline_insn.entry.offset;
+TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+TCGv_i64 val = tcg_temp_ebb_new_i64();
+TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
+
+tcg_gen_ld_i32(cpu_index, tcg_env,
+   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
+tcg_gen_muli_i32(cpu_index, cpu_index, g_array_get_element_size(arr));
+tcg_gen_ext_i32_ptr(ptr, cpu_index);
+tcg_temp_free_i32(cpu_index);
+
+tcg_gen_addi_ptr(ptr, ptr, (intptr_t)arr->data);
+tcg_gen_ld_i64(val, ptr, offset);
+tcg_gen_addi_i64(val, val, cb->inline_insn.imm);
+tcg_gen_st_i64(val, ptr, offset);
+
+tcg_temp_free_i64(val);
+tcg_temp_free_ptr(ptr);
+}
+
 /* #define DEBUG_PLUGIN_GEN_OPS */
 static void pr_ops(void)
 {
@@ -780,6 +796,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 {
 enum plugin_gen_from from = op->args[0];
 struct qemu_plugin_insn *insn = NULL;
+const GArray *cbs;
+int i, n;
 
 if (insn_idx >= 0) {
 insn = g_ptr_array_index(plugin_tb->insns, insn_idx);
@@ -792,6 +810,25 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 assert(insn != NULL);
 gen_disable_mem_helper(plugin_tb, insn);
 break;
+
+case PLUGIN_GEN_FROM_TB:
+assert(insn == NULL);
+
+cbs = plugin_tb->cbs[PLUGIN_CB_REGULAR];
+for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
+struct qemu_plugin_dyn_cb *cb =
+&g_array_index(cbs, struct qemu_plugin_dyn_cb, i);
+gen_udata_cb(cb);
+}
+
+cbs = plugin_tb->cbs[PLUGIN_CB_INLINE];
+for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
+struct qemu_plugin_dyn_cb *cb =
+&g_array_index(cbs, struct qemu_plugin_dyn_cb, i);
+gen_inline_cb(cb);
+}
+break;
+
 default:
 g_assert_not_reached();
 }
@@ -807,25 +844,6 @@ static void

[PATCH v2 07/21] plugins: Use emit_before_op for PLUGIN_GEN_AFTER_INSN

2024-04-04 Thread Richard Henderson
Introduce a new plugin_cb op and migrate one operation.
By using emit_before_op, we do not need to emit opcodes
early and modify them later -- we can simply emit the
final set of opcodes once.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op-common.h |  1 +
 include/tcg/tcg-opc.h   |  1 +
 accel/tcg/plugin-gen.c  | 74 +
 tcg/tcg-op.c|  5 +++
 4 files changed, 50 insertions(+), 31 deletions(-)

diff --git a/include/tcg/tcg-op-common.h b/include/tcg/tcg-op-common.h
index 2d932a515e..9de5a7f280 100644
--- a/include/tcg/tcg-op-common.h
+++ b/include/tcg/tcg-op-common.h
@@ -74,6 +74,7 @@ void tcg_gen_goto_tb(unsigned idx);
  */
 void tcg_gen_lookup_and_goto_ptr(void);
 
+void tcg_gen_plugin_cb(unsigned from);
 void tcg_gen_plugin_cb_start(unsigned from, unsigned type, unsigned wr);
 void tcg_gen_plugin_cb_end(void);
 
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index b80227fa1c..3b7cb2bce1 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -197,6 +197,7 @@ DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 DEF(goto_ptr, 0, 1, 0, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 
+DEF(plugin_cb, 0, 0, 1, TCG_OPF_NOT_PRESENT)
 DEF(plugin_cb_start, 0, 0, 3, TCG_OPF_NOT_PRESENT)
 DEF(plugin_cb_end, 0, 0, 0, TCG_OPF_NOT_PRESENT)
 
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 4b488943ff..4b02c0bfbf 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -201,8 +201,7 @@ static void plugin_gen_empty_callback(enum plugin_gen_from from)
 {
 switch (from) {
 case PLUGIN_GEN_AFTER_INSN:
-gen_wrapped(from, PLUGIN_GEN_DISABLE_MEM_HELPER,
-gen_empty_mem_helper);
+tcg_gen_plugin_cb(from);
 break;
 case PLUGIN_GEN_FROM_INSN:
 /*
@@ -608,16 +607,6 @@ static void inject_mem_enable_helper(struct qemu_plugin_tb *ptb,
 inject_mem_helper(begin_op, arr);
 }
 
-static void inject_mem_disable_helper(struct qemu_plugin_insn *plugin_insn,
-  TCGOp *begin_op)
-{
-if (likely(!plugin_insn->mem_helper)) {
-rm_ops(begin_op);
-return;
-}
-inject_mem_helper(begin_op, NULL);
-}
-
 /* called before finishing a TB with exit_tb, goto_tb or goto_ptr */
 void plugin_gen_disable_mem_helpers(void)
 {
@@ -703,11 +692,14 @@ static void plugin_gen_enable_mem_helper(struct qemu_plugin_tb *ptb,
 inject_mem_enable_helper(ptb, insn, begin_op);
 }
 
-static void plugin_gen_disable_mem_helper(struct qemu_plugin_tb *ptb,
-  TCGOp *begin_op, int insn_idx)
+static void gen_disable_mem_helper(struct qemu_plugin_tb *ptb,
+   struct qemu_plugin_insn *insn)
 {
-struct qemu_plugin_insn *insn = g_ptr_array_index(ptb->insns, insn_idx);
-inject_mem_disable_helper(insn, begin_op);
+if (insn->mem_helper) {
+tcg_gen_st_ptr(tcg_constant_ptr(0), tcg_env,
+   offsetof(CPUState, plugin_mem_cbs) -
+   offsetof(ArchCPU, env));
+}
 }
 
 /* #define DEBUG_PLUGIN_GEN_OPS */
@@ -766,16 +758,49 @@ static void pr_ops(void)
 
 static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 {
-TCGOp *op;
+TCGOp *op, *next;
 int insn_idx = -1;
 
 pr_ops();
 
-QTAILQ_FOREACH(op, &tcg_ctx->ops, link) {
+/*
+ * While injecting code, we cannot afford to reuse any ebb temps
+ * that might be live within the existing opcode stream.
+ * The simplest solution is to release them all and create new.
+ */
+memset(tcg_ctx->free_temps, 0, sizeof(tcg_ctx->free_temps));
+
+QTAILQ_FOREACH_SAFE(op, &tcg_ctx->ops, link, next) {
 switch (op->opc) {
 case INDEX_op_insn_start:
 insn_idx++;
 break;
+
+case INDEX_op_plugin_cb:
+{
+enum plugin_gen_from from = op->args[0];
+struct qemu_plugin_insn *insn = NULL;
+
+if (insn_idx >= 0) {
+insn = g_ptr_array_index(plugin_tb->insns, insn_idx);
+}
+
+tcg_ctx->emit_before_op = op;
+
+switch (from) {
+case PLUGIN_GEN_AFTER_INSN:
+assert(insn != NULL);
+gen_disable_mem_helper(plugin_tb, insn);
+break;
+default:
+g_assert_not_reached();
+}
+
+tcg_ctx->emit_before_op = NULL;
+tcg_op_remove(tcg_ctx, op);
+break;
+}
+
 case INDEX_op_plugin_cb_start:
 {
 enum plugin_gen_from from = op->args[0];
@@ -840,19 +865,6 @@ static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 
 break;
 }
-case PLUGIN_GEN_AFTER_INSN:
-{
-g_assert(insn_idx >= 0);
-
-switch (type) {
-case P

[PATCH v2 00/21] Rewrite plugin code generation

2024-04-04 Thread Richard Henderson
Add a (trivial) mechanism for emitting code into the middle of the
opcode sequence: tcg_ctx->emit_before_op.

Rip out all of the "empty" generation and "copy" to modify those
sequences.  Replace with regular code generation once we know what
values to place.

Changes for v2:
  * Fix TCI build failure.
  * Drop qemu_plugin_insn_cleanup_fn movement; I have another plan for that.

Patches requiring review: 7 and 8.


r~


Richard Henderson (21):
  tcg: Add TCGContext.emit_before_op
  tcg: Make tcg/helper-info.h self-contained
  tcg: Pass function pointer to tcg_gen_call*
  plugins: Zero new qemu_plugin_dyn_cb entries
  plugins: Move function pointer in qemu_plugin_dyn_cb
  plugins: Create TCGHelperInfo for all out-of-line callbacks
  plugins: Use emit_before_op for PLUGIN_GEN_AFTER_INSN
  plugins: Use emit_before_op for PLUGIN_GEN_FROM_TB
  plugins: Add PLUGIN_GEN_AFTER_TB
  plugins: Use emit_before_op for PLUGIN_GEN_FROM_INSN
  plugins: Use emit_before_op for PLUGIN_GEN_FROM_MEM
  plugins: Remove plugin helpers
  tcg: Remove TCG_CALL_PLUGIN
  tcg: Remove INDEX_op_plugin_cb_{start,end}
  plugins: Simplify callback queues
  plugins: Introduce PLUGIN_CB_MEM_REGULAR
  plugins: Replace pr_ops with a proper debug dump flag
  plugins: Split out common cb expanders
  plugins: Merge qemu_plugin_tb_insn_get to plugin-gen.c
  plugins: Inline plugin_gen_empty_callback
  plugins: Update the documentation block for plugin-gen.c

 accel/tcg/plugin-helpers.h |5 -
 include/exec/helper-gen-common.h   |4 -
 include/exec/helper-proto-common.h |4 -
 include/exec/plugin-gen.h  |4 -
 include/qemu/log.h |1 +
 include/qemu/plugin.h  |   67 +-
 include/tcg/helper-info.h  |3 +
 include/tcg/tcg-op-common.h|4 +-
 include/tcg/tcg-opc.h  |4 +-
 include/tcg/tcg.h  |   32 +-
 include/exec/helper-gen.h.inc  |   24 +-
 accel/tcg/plugin-gen.c | 1007 +++-
 plugins/api.c  |   26 +-
 plugins/core.c |   61 +-
 tcg/tcg-op-ldst.c  |6 +-
 tcg/tcg-op.c   |8 +-
 tcg/tcg.c  |   92 ++-
 tcg/tci.c  |1 +
 util/log.c |4 +
 19 files changed, 417 insertions(+), 940 deletions(-)
 delete mode 100644 accel/tcg/plugin-helpers.h

-- 
2.34.1




[PATCH v2 01/21] tcg: Add TCGContext.emit_before_op

2024-04-04 Thread Richard Henderson
Allow operations to be emitted via normal expanders
into the middle of the opcode stream.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h |  6 ++
 tcg/tcg.c | 14 --
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 451f3fec41..05a1912f8a 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -553,6 +553,12 @@ struct TCGContext {
 QTAILQ_HEAD(, TCGOp) ops, free_ops;
 QSIMPLEQ_HEAD(, TCGLabel) labels;
 
+/*
+ * When clear, new ops are added to the tail of @ops.
+ * When set, new ops are added in front of @emit_before_op.
+ */
+TCGOp *emit_before_op;
+
 /* Tells which temporary holds a given register.
It does not take into account fixed registers */
 TCGTemp *reg_to_temp[TCG_TARGET_NB_REGS];
diff --git a/tcg/tcg.c b/tcg/tcg.c
index d6670237fb..0c0bb9d169 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1521,6 +1521,7 @@ void tcg_func_start(TCGContext *s)
 
 QTAILQ_INIT(&s->ops);
 QTAILQ_INIT(&s->free_ops);
+s->emit_before_op = NULL;
 QSIMPLEQ_INIT(&s->labels);
 
 tcg_debug_assert(s->addr_type == TCG_TYPE_I32 ||
@@ -2332,7 +2333,11 @@ static void tcg_gen_callN(TCGHelperInfo *info, TCGTemp *ret, TCGTemp **args)
 op->args[pi++] = (uintptr_t)info;
 tcg_debug_assert(pi == total_args);
 
-QTAILQ_INSERT_TAIL(&tcg_ctx->ops, op, link);
+if (tcg_ctx->emit_before_op) {
+QTAILQ_INSERT_BEFORE(tcg_ctx->emit_before_op, op, link);
+} else {
+QTAILQ_INSERT_TAIL(&tcg_ctx->ops, op, link);
+}
 
 tcg_debug_assert(n_extend < ARRAY_SIZE(extend_free));
 for (i = 0; i < n_extend; ++i) {
@@ -3215,7 +3220,12 @@ static TCGOp *tcg_op_alloc(TCGOpcode opc, unsigned nargs)
 TCGOp *tcg_emit_op(TCGOpcode opc, unsigned nargs)
 {
 TCGOp *op = tcg_op_alloc(opc, nargs);
-QTAILQ_INSERT_TAIL(&tcg_ctx->ops, op, link);
+
+if (tcg_ctx->emit_before_op) {
+QTAILQ_INSERT_BEFORE(tcg_ctx->emit_before_op, op, link);
+} else {
+QTAILQ_INSERT_TAIL(&tcg_ctx->ops, op, link);
+}
 return op;
 }
 
-- 
2.34.1




[PATCH v2 05/21] plugins: Move function pointer in qemu_plugin_dyn_cb

2024-04-04 Thread Richard Henderson
The out-of-line function pointer is mutually exclusive
with inline expansion, so move it into the union.
Wrap the pointer in a structure named 'regular' to match
PLUGIN_CB_REGULAR.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h  | 4 +++-
 accel/tcg/plugin-gen.c | 4 ++--
 plugins/core.c | 8 
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 12a96cea2a..143262dca8 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -84,13 +84,15 @@ enum plugin_dyn_cb_subtype {
  * instance of a callback to be called upon the execution of a particular TB.
  */
 struct qemu_plugin_dyn_cb {
-union qemu_plugin_cb_sig f;
 void *userp;
 enum plugin_dyn_cb_subtype type;
 /* @rw applies to mem callbacks only (both regular and inline) */
 enum qemu_plugin_mem_rw rw;
 /* fields specific to each dyn_cb type go here */
 union {
+struct {
+union qemu_plugin_cb_sig f;
+} regular;
 struct {
 qemu_plugin_u64 entry;
 enum qemu_plugin_op op;
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index cd78ef94a1..4b488943ff 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -425,7 +425,7 @@ static TCGOp *append_udata_cb(const struct qemu_plugin_dyn_cb *cb,
 }
 
 /* call */
-op = copy_call(&begin_op, op, cb->f.vcpu_udata, cb_idx);
+op = copy_call(&begin_op, op, cb->regular.f.vcpu_udata, cb_idx);
 
 return op;
 }
@@ -473,7 +473,7 @@ static TCGOp *append_mem_cb(const struct qemu_plugin_dyn_cb *cb,
 
 if (type == PLUGIN_GEN_CB_MEM) {
 /* call */
-op = copy_call(&begin_op, op, cb->f.vcpu_udata, cb_idx);
+op = copy_call(&begin_op, op, cb->regular.f.vcpu_udata, cb_idx);
 }
 
 return op;
diff --git a/plugins/core.c b/plugins/core.c
index 4487cb7c48..837c373690 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -342,7 +342,7 @@ void plugin_register_dyn_cb__udata(GArray **arr,
 
 dyn_cb->userp = udata;
 /* Note flags are discarded as unused. */
-dyn_cb->f.vcpu_udata = cb;
+dyn_cb->regular.f.vcpu_udata = cb;
 dyn_cb->type = PLUGIN_CB_REGULAR;
 }
 
@@ -359,7 +359,7 @@ void plugin_register_vcpu_mem_cb(GArray **arr,
 /* Note flags are discarded as unused. */
 dyn_cb->type = PLUGIN_CB_REGULAR;
 dyn_cb->rw = rw;
-dyn_cb->f.generic = cb;
+dyn_cb->regular.f.vcpu_mem = cb;
 }
 
 /*
@@ -511,8 +511,8 @@ void qemu_plugin_vcpu_mem_cb(CPUState *cpu, uint64_t vaddr,
 }
 switch (cb->type) {
 case PLUGIN_CB_REGULAR:
-cb->f.vcpu_mem(cpu->cpu_index, make_plugin_meminfo(oi, rw),
-   vaddr, cb->userp);
+cb->regular.f.vcpu_mem(cpu->cpu_index, make_plugin_meminfo(oi, rw),
+   vaddr, cb->userp);
 break;
 case PLUGIN_CB_INLINE:
 exec_inline_op(cb, cpu->cpu_index);
-- 
2.34.1




[PATCH v2 02/21] tcg: Make tcg/helper-info.h self-contained

2024-04-04 Thread Richard Henderson
Move MAX_CALL_IARGS from tcg.h, and include tcg-target-reg-bits.h
for the define of TCG_TARGET_REG_BITS.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/tcg/helper-info.h | 3 +++
 include/tcg/tcg.h | 2 --
 tcg/tci.c | 1 +
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/tcg/helper-info.h b/include/tcg/helper-info.h
index 7c27d6164a..909fe73afa 100644
--- a/include/tcg/helper-info.h
+++ b/include/tcg/helper-info.h
@@ -12,6 +12,9 @@
 #ifdef CONFIG_TCG_INTERPRETER
 #include <ffi.h>
 #endif
+#include "tcg-target-reg-bits.h"
+
+#define MAX_CALL_IARGS  7
 
 /*
  * Describe the calling convention of a given argument type.
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 05a1912f8a..e4c598428d 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -39,8 +39,6 @@
 /* XXX: make safe guess about sizes */
 #define MAX_OP_PER_INSTR 266
 
-#define MAX_CALL_IARGS  7
-
 #define CPU_TEMP_BUF_NLONGS 128
 #define TCG_STATIC_FRAME_SIZE  (CPU_TEMP_BUF_NLONGS * sizeof(long))
 
diff --git a/tcg/tci.c b/tcg/tci.c
index 39adcb7d82..3afb223528 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -19,6 +19,7 @@
 
 #include "qemu/osdep.h"
 #include "tcg/tcg.h"
+#include "tcg/helper-info.h"
 #include "tcg/tcg-ldst.h"
 #include <ffi.h>
 
-- 
2.34.1




[PATCH v2 06/21] plugins: Create TCGHelperInfo for all out-of-line callbacks

2024-04-04 Thread Richard Henderson
TCGHelperInfo includes the ABI for every function call.

Reviewed-by: Pierrick Bouvier 
Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h |  1 +
 plugins/core.c| 51 ++-
 2 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 143262dca8..793c44f1f2 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -92,6 +92,7 @@ struct qemu_plugin_dyn_cb {
 union {
 struct {
 union qemu_plugin_cb_sig f;
+TCGHelperInfo *info;
 } regular;
 struct {
 qemu_plugin_u64 entry;
diff --git a/plugins/core.c b/plugins/core.c
index 837c373690..b0a2e80874 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -338,12 +338,26 @@ void plugin_register_dyn_cb__udata(GArray **arr,
enum qemu_plugin_cb_flags flags,
void *udata)
 {
-struct qemu_plugin_dyn_cb *dyn_cb = plugin_get_dyn_cb(arr);
+static TCGHelperInfo info[3] = {
+[QEMU_PLUGIN_CB_NO_REGS].flags = TCG_CALL_NO_RWG | TCG_CALL_PLUGIN,
+[QEMU_PLUGIN_CB_R_REGS].flags = TCG_CALL_NO_WG | TCG_CALL_PLUGIN,
+[QEMU_PLUGIN_CB_RW_REGS].flags = TCG_CALL_PLUGIN,
+/*
+ * Match qemu_plugin_vcpu_udata_cb_t:
+ *   void (*)(uint32_t, void *)
+ */
+[0 ... 2].typemask = (dh_typemask(void, 0) |
+  dh_typemask(i32, 1) |
+  dh_typemask(ptr, 2))
+};
 
+struct qemu_plugin_dyn_cb *dyn_cb = plugin_get_dyn_cb(arr);
 dyn_cb->userp = udata;
-/* Note flags are discarded as unused. */
-dyn_cb->regular.f.vcpu_udata = cb;
 dyn_cb->type = PLUGIN_CB_REGULAR;
+dyn_cb->regular.f.vcpu_udata = cb;
+
+assert((unsigned)flags < ARRAY_SIZE(info));
+dyn_cb->regular.info = &info[flags];
 }
 
 void plugin_register_vcpu_mem_cb(GArray **arr,
@@ -352,14 +366,39 @@ void plugin_register_vcpu_mem_cb(GArray **arr,
  enum qemu_plugin_mem_rw rw,
  void *udata)
 {
-struct qemu_plugin_dyn_cb *dyn_cb;
+/*
+ * Expect that the underlying type for enum qemu_plugin_meminfo_t
+ * is either int32_t or uint32_t, aka int or unsigned int.
+ */
+QEMU_BUILD_BUG_ON(
+!__builtin_types_compatible_p(qemu_plugin_meminfo_t, uint32_t) &&
+!__builtin_types_compatible_p(qemu_plugin_meminfo_t, int32_t));
 
-dyn_cb = plugin_get_dyn_cb(arr);
+static TCGHelperInfo info[3] = {
+[QEMU_PLUGIN_CB_NO_REGS].flags = TCG_CALL_NO_RWG | TCG_CALL_PLUGIN,
+[QEMU_PLUGIN_CB_R_REGS].flags = TCG_CALL_NO_WG | TCG_CALL_PLUGIN,
+[QEMU_PLUGIN_CB_RW_REGS].flags = TCG_CALL_PLUGIN,
+/*
+ * Match qemu_plugin_vcpu_mem_cb_t:
+ *   void (*)(uint32_t, qemu_plugin_meminfo_t, uint64_t, void *)
+ */
+[0 ... 2].typemask =
+(dh_typemask(void, 0) |
+ dh_typemask(i32, 1) |
+ (__builtin_types_compatible_p(qemu_plugin_meminfo_t, uint32_t)
+  ? dh_typemask(i32, 2) : dh_typemask(s32, 2)) |
+ dh_typemask(i64, 3) |
+ dh_typemask(ptr, 4))
+};
+
+struct qemu_plugin_dyn_cb *dyn_cb = plugin_get_dyn_cb(arr);
 dyn_cb->userp = udata;
-/* Note flags are discarded as unused. */
 dyn_cb->type = PLUGIN_CB_REGULAR;
 dyn_cb->rw = rw;
 dyn_cb->regular.f.vcpu_mem = cb;
+
+assert((unsigned)flags < ARRAY_SIZE(info));
+dyn_cb->regular.info = &info[flags];
 }
 
 /*
-- 
2.34.1




Re: [PATCH v2] sh4: mac.l: implement saturation arithmetic logic

2024-04-04 Thread Zack Buhman
Signed-off-by: Zack Buhman 

- Original message -
From: "Philippe Mathieu-Daudé" 
To: Peter Maydell , Zack Buhman 
Cc: qemu-devel@nongnu.org, Yoshinori Sato 
Subject: Re: [PATCH v2] sh4: mac.l: implement saturation arithmetic logic
Date: Friday, April 05, 2024 1:26 AM

Hi Zack,

Cc'ing the maintainer of this file, Yoshinori:

$ ./scripts/get_maintainer.pl -f target/sh4/op_helper.c
Yoshinori Sato  (reviewer:SH4 TCG CPUs)
(https://www.qemu.org/docs/master/devel/submitting-a-patch.html#cc-the-relevant-maintainer)

On 4/4/24 18:39, Peter Maydell wrote:
> On Thu, 4 Apr 2024 at 17:26, Zack Buhman  wrote:
>>
>> The saturation arithmetic logic in helper_macl is not correct.
>>
>> I tested and verified this behavior on a SH7091; the general pattern
>> is a code sequence such as:
>>
>>  sets
>>
>>  mov.l _mach,r2
>>  lds r2,mach
>>  mov.l _macl,r2
>>  lds r2,macl
>>
>>  mova _n,r0
>>  mov r0,r1
>>  mova _m,r0
>>  mac.l @r0+,@r1+
>>
>>  _mach: .long 0x7fff
>>  _macl: .long 0x12345678
>>  _m:.long 0x7fff
>>  _n:.long 0x7fff
>>
>> Test case 0: (no int64_t overflow)
>>given; prior to saturation mac.l:
>>  mach = 0x7fff macl = 0x12345678
>>  @r0  = 0x7fff @r1  = 0x7fff
>>
>>expected saturation mac.l result:
>>  mach = 0x7fff macl = 0x
>>
>>qemu saturation mac.l result (prior to this commit):
>>  mach = 0x7ffe macl = 0x12345678
>>
>> Test case 1: (no int64_t overflow)
>>given; prior to saturation mac.l:
>>  mach = 0x8000 macl = 0x
>>  @r0  = 0x @r1  = 0x0001
>>
>>expected saturation mac.l result:
>>  mach = 0x8000 macl = 0x
>>
>>qemu saturation mac.l result (prior to this commit):
>>  mach = 0x7fff macl = 0x
>>
>> Test case 2: (int64_t addition overflow)
>>given; prior to saturation mac.l:
>>  mach = 0x8000 macl = 0x
>>  @r0  = 0x @r1  = 0x0001
>>
>>expected saturation mac.l result:
>>  mach = 0x8000 macl = 0x
>>
>>qemu saturation mac.l result (prior to this commit):
>>  mach = 0x7fff macl = 0x
>>
>> Test case 3: (int64_t addition overflow)
>>given; prior to saturation mac.l:
>>  mach = 0x7fff macl = 0x
>>  @r0 = 0x7fff @r1 = 0x7fff
>>
>>expected saturation mac.l result:
>>  mach = 0x7fff macl = 0x
>>
>>qemu saturation mac.l result (prior to this commit):
>>  mach = 0xfffe macl = 0x0001
>>
>> All of the above also matches the description of MAC.L as documented
>> in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf
> 
> Hi. I just noticed that you didn't include a signed-off-by line
> in your commit message. We need these as they're how you say
> that you're legally OK to contribute this code to QEMU and
> you're happy for it to go into the project:
> 
> https://www.qemu.org/docs/master/devel/submitting-a-patch.html#patch-emails-must-include-a-signed-off-by-line
> has links to what exactly this means, but basically the
> requirement is that the last line of your commit message should be
> "Signed-off-by: Your Name "
> 
> In this case, if you just reply to this email with that, we
> can pick it up and fix up the commit message when we apply the
> patch.
> 
>> ---
>>   target/sh4/op_helper.c | 31 +--
>>   1 file changed, 21 insertions(+), 10 deletions(-)
>>
>> diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
>> index 4559d0d376..ee16524083 100644
>> --- a/target/sh4/op_helper.c
>> +++ b/target/sh4/op_helper.c
>> @@ -160,18 +160,29 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)
>>
>>   void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
>>   {
>> -int64_t res;
>> -
>> -res = ((uint64_t) env->mach << 32) | env->macl;
>> -res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
>> -env->mach = (res >> 32) & 0x;
>> -env->macl = res & 0x;
>> +int32_t value0 = (int32_t)arg0;
>> +int32_t value1 = (int32_t)arg1;
>> +int64_t mul = ((int64_t)value0) * ((int64_t)value1);
>> +int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
>> +int64_t result;
>> +bool overflow = sadd64_overflow(mac, mul, &result);
>> +/* Perform 48-bit saturation arithmetic if the S flag is set */
>>   if (env->sr & (1u << SR_S)) {
>> -if (res < 0)
>> -env->mach |= 0x;
>> -else
>> -env->mach &= 0x7fff;
>> +/*
>> + * The sign bit of `mac + mul` may overflow. The MAC unit on
>> + * real SH-4 hardware has equivalent carry/saturation logic:
>> + */
>> +const int64_t upper_bound =  ((1ull << 47) - 1);
>> +const int64_t lower_bound = -((1ull << 47) - 0);
>> +
>> +if (overflow) {
>> +result = (mac < 0) ? lower_bound :

Re: [PATCH-for-9.1 2/7] yank: Restrict to system emulation

2024-04-04 Thread Richard Henderson

On 4/4/24 09:47, Philippe Mathieu-Daudé wrote:

The yank feature is not used in user emulation.

Signed-off-by: Philippe Mathieu-Daudé
---
  util/meson.build | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH-for-9.1 1/7] ebpf: Restrict to system emulation

2024-04-04 Thread Richard Henderson

On 4/4/24 09:47, Philippe Mathieu-Daudé wrote:

eBPF is not used in user emulation.

Signed-off-by: Philippe Mathieu-Daudé
---
  ebpf/meson.build | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [External] Re: [PATCH v10 2/2] memory tier: create CPUless memory tiers after obtaining HMAT info

2024-04-04 Thread Ho-Ren (Jack) Chuang
Hi Jonathan,

Thank you! I will fix them and send a V11 soon.

On Thu, Apr 4, 2024 at 6:37 AM Jonathan Cameron
 wrote:
>
> 
>
> > > > @@ -858,7 +910,8 @@ static int __init memory_tier_init(void)
> > > >* For now we can have 4 faster memory tiers with smaller 
> > > > adistance
> > > >* than default DRAM tier.
> > > >*/
> > > > - default_dram_type = alloc_memory_type(MEMTIER_ADISTANCE_DRAM);
> > > > + default_dram_type = 
> > > > mt_find_alloc_memory_type(MEMTIER_ADISTANCE_DRAM,
> > > > + 
> > > > &default_memory_types);
> > >
> > > Unusual indenting.  Align with just after (
> > >
> >
> > Aligning with "(" will exceed 100 columns. Would that be acceptable?
> I think we are talking cross purposes.
>
> default_dram_type = mt_find_alloc_memory_type(MEMTIER_ADISTANCE_DRAM,
>   &default_memory_types);
>
> Is what I was suggesting.
>

Oh, now I see. Thanks!

> >
> > > >   if (IS_ERR(default_dram_type))
> > > >   panic("%s() failed to allocate default DRAM tier\n", 
> > > > __func__);
> > > >
> > > > @@ -868,6 +921,14 @@ static int __init memory_tier_init(void)
> > > >* types assigned.
> > > >*/
> > > >   for_each_node_state(node, N_MEMORY) {
> > > > + if (!node_state(node, N_CPU))
> > > > + /*
> > > > +  * Defer memory tier initialization on CPUless 
> > > > numa nodes.
> > > > +  * These will be initialized after firmware and 
> > > > devices are
> > >
> > > I think this wraps at just over 80 chars.  Seems silly to wrap so tightly 
> > > and not
> > > quite fit under 80. (this is about 83 chars.
> > >
> >
> > I can fix this.
> > I have a question. From my patch, this is <80 chars. However,
> > in an email, this is >80 chars. Does that mean we need to
> > count the number of chars in an email, not in a patch? Or if I
> > missed something? like vim configuration or?
>
> 3 tabs + 1 space + the text from * (58)
> = 24 + 1 + 58 = 83
>
> Advantage of using claws email for kernel stuff is it has a nice per character
> ruler at the top of the window.
>
> I wonder if you have a different tab indent size?  The kernel uses 8
> characters.  It might explain the few other odd indents if perhaps
> you have it at 4 in your editor?
>
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html
>

Got it. I was using tab=4. I will change to 8. Thanks!

> Jonathan
>
> >
> > > > +  * initialized.
> > > > +  */
> > > > + continue;
> > > > +
> > > >   memtier = set_node_memory_tier(node);
> > > >   if (IS_ERR(memtier))
> > > >   /*
> > >
> >
> >
>


-- 
Best regards,
Ho-Ren (Jack) Chuang
莊賀任



[PATCH for-9.0] tcg/optimize: Do not attempt to constant fold neg_vec

2024-04-04 Thread Richard Henderson
Split out the tail of fold_neg to fold_neg_no_const so that we
can avoid attempting to constant fold vector negate.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2150
Signed-off-by: Richard Henderson 
---
 tcg/optimize.c| 17 -
 tests/tcg/aarch64/test-2150.c | 12 
 tests/tcg/aarch64/Makefile.target |  2 +-
 3 files changed, 21 insertions(+), 10 deletions(-)
 create mode 100644 tests/tcg/aarch64/test-2150.c

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 275db77b42..2e9e5725a9 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1990,16 +1990,10 @@ static bool fold_nand(OptContext *ctx, TCGOp *op)
 return false;
 }
 
-static bool fold_neg(OptContext *ctx, TCGOp *op)
+static bool fold_neg_no_const(OptContext *ctx, TCGOp *op)
 {
-uint64_t z_mask;
-
-if (fold_const1(ctx, op)) {
-return true;
-}
-
 /* Set to 1 all bits to the left of the rightmost.  */
-z_mask = arg_info(op->args[1])->z_mask;
+uint64_t z_mask = arg_info(op->args[1])->z_mask;
 ctx->z_mask = -(z_mask & -z_mask);
 
 /*
@@ -2010,6 +2004,11 @@ static bool fold_neg(OptContext *ctx, TCGOp *op)
 return true;
 }
 
+static bool fold_neg(OptContext *ctx, TCGOp *op)
+{
+return fold_const1(ctx, op) || fold_neg_no_const(ctx, op);
+}
+
 static bool fold_nor(OptContext *ctx, TCGOp *op)
 {
 if (fold_const2_commutative(ctx, op) ||
@@ -2418,7 +2417,7 @@ static bool fold_sub_to_neg(OptContext *ctx, TCGOp *op)
 if (have_neg) {
 op->opc = neg_op;
 op->args[1] = op->args[2];
-return fold_neg(ctx, op);
+return fold_neg_no_const(ctx, op);
 }
 return false;
 }
diff --git a/tests/tcg/aarch64/test-2150.c b/tests/tcg/aarch64/test-2150.c
new file mode 100644
index 00..fb86c11958
--- /dev/null
+++ b/tests/tcg/aarch64/test-2150.c
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* See https://gitlab.com/qemu-project/qemu/-/issues/2150 */
+
+int main()
+{
+asm volatile(
+"movi v6.4s, #1\n"
+"movi v7.4s, #0\n"
+"sub  v6.2d, v7.2d, v6.2d\n"
+: : : "v6", "v7");
+return 0;
+}
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index 0efd565f05..70d728ae9a 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -10,7 +10,7 @@ VPATH += $(AARCH64_SRC)
 
 # Base architecture tests
 AARCH64_TESTS=fcvt pcalign-a64 lse2-fault
-AARCH64_TESTS += test-2248
+AARCH64_TESTS += test-2248 test-2150
 
 fcvt: LDFLAGS+=-lm
 
-- 
2.34.1




Re: [PATCH v1] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Peter Xu
On Fri, Apr 05, 2024 at 12:48:15AM +0800, Wang, Lei wrote:
> On 4/5/2024 0:25, Wang, Wei W wrote:
> > On Thursday, April 4, 2024 10:12 PM, Peter Xu wrote:
> >> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> >>> Before loading the guest states, ensure that the preempt channel has
> >>> been ready to use, as some of the states (e.g. via virtio_load) might
> >>> trigger page faults that will be handled through the preempt channel.
> >>> So yield to the main thread in the case that the channel create event
> >>> has been dispatched.
> >>>
> >>> Originally-by: Lei Wang 
> >>> Link:
> >>> https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel
> >>> .com/T/
> >>> Suggested-by: Peter Xu 
> >>> Signed-off-by: Lei Wang 
> >>> Signed-off-by: Wei Wang 
> >>> ---
> >>>  migration/savevm.c | 17 +
> >>>  1 file changed, 17 insertions(+)
> >>>
> >>> diff --git a/migration/savevm.c b/migration/savevm.c index
> >>> 388d7af7cd..fbc9f2bdd4 100644
> >>> --- a/migration/savevm.c
> >>> +++ b/migration/savevm.c
> >>> @@ -2342,6 +2342,23 @@ static int
> >>> loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >>>
> >>>  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> >>>
> >>> +/*
> >>> + * Before loading the guest states, ensure that the preempt channel 
> >>> has
> >>> + * been ready to use, as some of the states (e.g. via virtio_load) 
> >>> might
> >>> + * trigger page faults that will be handled through the preempt 
> >>> channel.
> >>> + * So yield to the main thread in the case that the channel create 
> >>> event
> >>> + * has been dispatched.
> >>> + */
> >>> +do {
> >>> +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> >>> +mis->postcopy_qemufile_dst) {
> >>> +break;
> >>> +}
> >>> +
> >>> +aio_co_schedule(qemu_get_current_aio_context(),
> >> qemu_coroutine_self());
> >>> +qemu_coroutine_yield();
> >>> +} while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done,
> >>> + 1));
> >>
> >> I think we need s/!// here, so the same mistake I made?  I think we need to
> >> rework the retval of qemu_sem_timedwait() at some point later..
> > 
> > No. qemu_sem_timedwait returns false when timeout, which means sem isn’t 
> > posted yet.
> > So it needs to go back to the loop. (the patch was tested)
> 
> When timeout, qemu_sem_timedwait() will return -1. I think the patch test
> passed may be because you will always have at least one yield (the first
> yield in the do ...while ...) when loadvm_handle_cmd_packaged()?

My guess is that here the kick will work and qemu_sem_timedwait() later
will ETIMEOUT -> qemu_sem_timedwait() returns -1, then the loop just broke.
That aio schedule should make sure anyway that the file is ready; the
preempt thread must run before this to not hang that thread.

I think it more or less justifies that the retval needs to be properly
defined. :( Its confusion is on top of the fact that libpthread returns
positive error codes.

Thanks,

-- 
Peter Xu




[RFC PATCH-for-9.1] qapi: Do not generate commands/events/introspect code for user emulation

2024-04-04 Thread Philippe Mathieu-Daudé
User emulation requires the QAPI types. Due to the command
line processing, some visitor code is also used. The rest
is irrelevant (no QMP socket).

Add an option to the qapi-gen script to allow generating
the minimum when only user emulation is being built.

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: Quick PoC for Markus. It is useful for user-only builds.
---
 qapi/meson.build |  6 +-
 scripts/qapi/main.py | 16 +++-
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/qapi/meson.build b/qapi/meson.build
index 375d564277..5e02621145 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -115,10 +115,14 @@ foreach module : qapi_all_modules
   endif
 endforeach
 
+qapi_gen_cmd = [ qapi_gen, '-o', 'qapi', '-b', '@INPUT0@' ]
+if not (have_system or have_tools)
+  qapi_gen_cmd += [ '--types-only' ]
+endif
 qapi_files = custom_target('shared QAPI source files',
   output: qapi_util_outputs + qapi_specific_outputs + qapi_nonmodule_outputs,
   input: [ files('qapi-schema.json') ],
-  command: [ qapi_gen, '-o', 'qapi', '-b', '@INPUT0@' ],
+  command: qapi_gen_cmd,
   depend_files: [ qapi_inputs, qapi_gen_depends ])
 
 # Now go through all the outputs and add them to the right sourceset.
diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
index 316736b6a2..925af5841b 100644
--- a/scripts/qapi/main.py
+++ b/scripts/qapi/main.py
@@ -33,7 +33,8 @@ def generate(schema_file: str,
  prefix: str,
  unmask: bool = False,
  builtins: bool = False,
- gen_tracing: bool = False) -> None:
+ gen_tracing: bool = False,
+ gen_types_only: bool = False) -> None:
 """
 Generate C code for the given schema into the target directory.
 
@@ -50,9 +51,10 @@ def generate(schema_file: str,
 schema = QAPISchema(schema_file)
 gen_types(schema, output_dir, prefix, builtins)
 gen_visit(schema, output_dir, prefix, builtins)
-gen_commands(schema, output_dir, prefix, gen_tracing)
-gen_events(schema, output_dir, prefix)
-gen_introspect(schema, output_dir, prefix, unmask)
+if not gen_types_only:
+gen_commands(schema, output_dir, prefix, gen_tracing)
+gen_events(schema, output_dir, prefix)
+gen_introspect(schema, output_dir, prefix, unmask)
 
 
 def main() -> int:
@@ -75,6 +77,9 @@ def main() -> int:
 parser.add_argument('-u', '--unmask-non-abi-names', action='store_true',
 dest='unmask',
 help="expose non-ABI names in introspection")
+parser.add_argument('-t', '--types-only', action='store_true',
+dest='gen_types_only',
+help="Only generate QAPI types")
 
 # Option --suppress-tracing exists so we can avoid solving build system
 # problems.  TODO Drop it when we no longer need it.
@@ -96,7 +101,8 @@ def main() -> int:
  prefix=args.prefix,
  unmask=args.unmask,
  builtins=args.builtins,
- gen_tracing=not args.suppress_tracing)
+ gen_tracing=not args.suppress_tracing,
+ gen_types_only=args.gen_types_only)
 except QAPIError as err:
 print(err, file=sys.stderr)
 return 1
-- 
2.41.0




[PATCH-for-9.1 2/7] yank: Restrict to system emulation

2024-04-04 Thread Philippe Mathieu-Daudé
The yank feature is not used in user emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 util/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/meson.build b/util/meson.build
index 0ef9886be0..247f55a80d 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -60,7 +60,6 @@ util_ss.add(files('stats64.c'))
 util_ss.add(files('systemd.c'))
 util_ss.add(files('transactions.c'))
 util_ss.add(files('guest-random.c'))
-util_ss.add(files('yank.c'))
 util_ss.add(files('int128.c'))
 util_ss.add(files('memalign.c'))
 util_ss.add(files('interval-tree.c'))
@@ -76,6 +75,7 @@ if have_system
   if host_os == 'linux'
 util_ss.add(files('userfaultfd.c'))
   endif
+  util_ss.add(files('yank.c'))
 endif
 
 if have_block or have_ga
-- 
2.41.0




[PATCH-for-9.1 1/7] ebpf: Restrict to system emulation

2024-04-04 Thread Philippe Mathieu-Daudé
eBPF is not used in user emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 ebpf/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ebpf/meson.build b/ebpf/meson.build
index c5bf9295a2..bff6156f51 100644
--- a/ebpf/meson.build
+++ b/ebpf/meson.build
@@ -1 +1 @@
-common_ss.add(when: libbpf, if_true: files('ebpf.c', 'ebpf_rss.c'), if_false: files('ebpf_rss-stub.c'))
+system_ss.add(when: libbpf, if_true: files('ebpf.c', 'ebpf_rss.c'), if_false: files('ebpf_rss-stub.c'))
-- 
2.41.0




[PATCH-for-9.1 5/7] hw/core: Restrict reset handlers API to system emulation

2024-04-04 Thread Philippe Mathieu-Daudé
Headers in include/sysemu/ are specific to system
emulation and should not be used in user emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/reset.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/core/reset.c b/hw/core/reset.c
index d50da7e304..167c8bf1a9 100644
--- a/hw/core/reset.c
+++ b/hw/core/reset.c
@@ -24,7 +24,9 @@
  */
 
 #include "qemu/osdep.h"
+#ifndef CONFIG_USER_ONLY
 #include "sysemu/reset.h"
+#endif
 #include "hw/resettable.h"
 #include "hw/core/resetcontainer.h"
 
@@ -43,6 +45,7 @@ static ResettableContainer *get_root_reset_container(void)
 return root_reset_container;
 }
 
+#ifndef CONFIG_USER_ONLY
 /*
  * Reason why the currently in-progress qemu_devices_reset() was called.
  * If we made at least SHUTDOWN_CAUSE_SNAPSHOT_LOAD have a corresponding
@@ -185,3 +188,4 @@ void qemu_devices_reset(ShutdownCause reason)
 /* Reset the simulation */
 resettable_reset(OBJECT(get_root_reset_container()), RESET_TYPE_COLD);
 }
+#endif
-- 
2.41.0




[PATCH-for-9.1 6/7] hw/core: Move reset.c to hwcore_ss[] source set

2024-04-04 Thread Philippe Mathieu-Daudé
reset.c contains core code used by any CPU, required
by user emulation. Move it to hwcore_ss[] where it
belongs.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/core/meson.build b/hw/core/meson.build
index e26f2e088c..1389f1b339 100644
--- a/hw/core/meson.build
+++ b/hw/core/meson.build
@@ -3,7 +3,6 @@ hwcore_ss.add(files(
   'bus.c',
   'qdev-properties.c',
   'qdev.c',
-  'reset.c',
   'resetcontainer.c',
   'resettable.c',
   'vmstate-if.c',
@@ -16,6 +15,7 @@ if have_system
   hwcore_ss.add(files(
 'hotplug.c',
 'qdev-hotplug.c',
+'reset.c',
   ))
 else
   hwcore_ss.add(files(
-- 
2.41.0




[PATCH-for-9.1 7/7] hw: Include minimal source set in user emulation build

2024-04-04 Thread Philippe Mathieu-Daudé
Only the files in hwcore_ss[] are required to link
a user emulation binary.

Have meson process the hw/ sub-directories if system
emulation is selected, otherwise directly process
hw/core/ to get hwcore_ss[], which is the only set
required by user emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 meson.build | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c9c3217ba4..68eecd1937 100644
--- a/meson.build
+++ b/meson.build
@@ -3447,8 +3447,12 @@ subdir('qom')
 subdir('authz')
 subdir('crypto')
 subdir('ui')
-subdir('hw')
 subdir('gdbstub')
+if have_system
+  subdir('hw')
+else
+  subdir('hw/core')
+endif
 
 if enable_modules
  libmodulecommon = static_library('module-common', files('module-common.c') + genh, pic: true, c_args: '-DBUILD_DSO')
-- 
2.41.0




[PATCH-for-9.1 4/7] util/qemu-config: Extract QMP commands to qemu-config-qmp.c

2024-04-04 Thread Philippe Mathieu-Daudé
QMP is irrelevant for user emulation. Extract the QMP-related
code into a separate source file, which won't be built for user
emulation binaries. This avoids pulling in pointless code.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/qemu/config-file.h |   3 +
 util/qemu-config-qmp.c | 206 +
 util/qemu-config.c | 204 +---
 util/meson.build   |   1 +
 4 files changed, 212 insertions(+), 202 deletions(-)
 create mode 100644 util/qemu-config-qmp.c

diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h
index b82a778123..8b9d6df173 100644
--- a/include/qemu/config-file.h
+++ b/include/qemu/config-file.h
@@ -8,6 +8,9 @@ QemuOptsList *qemu_find_opts(const char *group);
 QemuOptsList *qemu_find_opts_err(const char *group, Error **errp);
 QemuOpts *qemu_find_opts_singleton(const char *group);
 
+extern QemuOptsList *vm_config_groups[48];
+extern QemuOptsList *drive_config_groups[5];
+
 void qemu_add_opts(QemuOptsList *list);
 void qemu_add_drive_opts(QemuOptsList *list);
 int qemu_global_option(const char *str);
diff --git a/util/qemu-config-qmp.c b/util/qemu-config-qmp.c
new file mode 100644
index 00..24477a0e44
--- /dev/null
+++ b/util/qemu-config-qmp.c
@@ -0,0 +1,206 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-misc.h"
+#include "qapi/qmp/qlist.h"
+#include "qemu/option.h"
+#include "qemu/config-file.h"
+#include "hw/boards.h"
+
+static CommandLineParameterInfoList *query_option_descs(const QemuOptDesc *desc)
+{
+CommandLineParameterInfoList *param_list = NULL;
+CommandLineParameterInfo *info;
+int i;
+
+for (i = 0; desc[i].name != NULL; i++) {
+info = g_malloc0(sizeof(*info));
+info->name = g_strdup(desc[i].name);
+
+switch (desc[i].type) {
+case QEMU_OPT_STRING:
+info->type = COMMAND_LINE_PARAMETER_TYPE_STRING;
+break;
+case QEMU_OPT_BOOL:
+info->type = COMMAND_LINE_PARAMETER_TYPE_BOOLEAN;
+break;
+case QEMU_OPT_NUMBER:
+info->type = COMMAND_LINE_PARAMETER_TYPE_NUMBER;
+break;
+case QEMU_OPT_SIZE:
+info->type = COMMAND_LINE_PARAMETER_TYPE_SIZE;
+break;
+}
+
+info->help = g_strdup(desc[i].help);
+info->q_default = g_strdup(desc[i].def_value_str);
+
+QAPI_LIST_PREPEND(param_list, info);
+}
+
+return param_list;
+}
+
+/* remove repeated entry from the info list */
+static void cleanup_infolist(CommandLineParameterInfoList *head)
+{
+CommandLineParameterInfoList *pre_entry, *cur, *del_entry;
+
+cur = head;
+while (cur->next) {
+pre_entry = head;
+while (pre_entry != cur->next) {
+if (!strcmp(pre_entry->value->name, cur->next->value->name)) {
+del_entry = cur->next;
+cur->next = cur->next->next;
+del_entry->next = NULL;
+qapi_free_CommandLineParameterInfoList(del_entry);
+break;
+}
+pre_entry = pre_entry->next;
+}
+cur = cur->next;
+}
+}
+
+/* merge the description items of two parameter infolists */
+static void connect_infolist(CommandLineParameterInfoList *head,
+ CommandLineParameterInfoList *new)
+{
+CommandLineParameterInfoList *cur;
+
+cur = head;
+while (cur->next) {
+cur = cur->next;
+}
+cur->next = new;
+}
+
+/* access all the local QemuOptsLists for drive option */
+static CommandLineParameterInfoList *get_drive_infolist(void)
+{
+CommandLineParameterInfoList *head = NULL, *cur;
+int i;
+
+for (i = 0; drive_config_groups[i] != NULL; i++) {
+if (!head) {
+head = query_option_descs(drive_config_groups[i]->desc);
+} else {
+cur = query_option_descs(drive_config_groups[i]->desc);
+connect_infolist(head, cur);
+}
+}
+cleanup_infolist(head);
+
+return head;
+}
+
+static CommandLineParameterInfo *objprop_to_cmdline_prop(ObjectProperty *prop)
+{
+CommandLineParameterInfo *info;
+
+info = g_malloc0(sizeof(*info));
+info->name = g_strdup(prop->name);
+
+if (g_str_equal(prop->type, "bool") || g_str_equal(prop->type, "OnOffAuto")) {
+info->type = COMMAND_LINE_PARAMETER_TYPE_BOOLEAN;
+} else if (g_str_equal(prop->type, "int")) {
+info->type = COMMAND_LINE_PARAMETER_TYPE_NUMBER;
+} else if (g_str_equal(prop->type, "size")) {
+info->type = COMMAND_LINE_PARAMETER_TYPE_SIZE;
+} else {
+info->type = COMMAND_LINE_PARAMETER_TYPE_STRING;
+}
+
+if (prop->description) {
+info->help = g_strdup(prop->description);
+}
+
+return info;
+}
+
+static CommandLineParameterInfoList *query_all_machine_properties(void)
+{
+CommandLineParameterInfoList 

[PATCH-for-9.1 0/7] buildsys: Start shrinking qemu-user build process

2024-04-04 Thread Philippe Mathieu-Daudé
Hi,

While reworking include/exec/ I have to build many configs
to be sure nothing breaks. qemu-user is particularly
sensitive to changes in this directory (mostly because
all user-specific files include "qemu.h", itself including
various exec/ headers). Getting tired of this waste I had
a look at what we pointlessly build. This series is the
beginning of yet another cleanup set.

Regards,

Phil.

Philippe Mathieu-Daudé (7):
  ebpf: Restrict to system emulation
  yank: Restrict to system emulation
  monitor: Rework stubs to simplify user emulation linking
  util/qemu-config: Extract QMP commands to qemu-config-qmp.c
  hw/core: Restrict reset handlers API to system emulation
  hw/core: Move reset.c to hwcore_ss[] source set
  hw: Include minimal source set in user emulation build

 meson.build|   6 +-
 include/qemu/config-file.h |   3 +
 hw/core/reset.c|   4 +
 stubs/fdset.c  |  17 ---
 stubs/monitor-core.c   |  20 +++-
 stubs/monitor.c|   8 +-
 util/qemu-config-qmp.c | 206 +
 util/qemu-config.c | 204 +---
 ebpf/meson.build   |   2 +-
 hw/core/meson.build|   2 +-
 stubs/meson.build  |   5 +-
 util/meson.build   |   3 +-
 12 files changed, 248 insertions(+), 232 deletions(-)
 delete mode 100644 stubs/fdset.c
 create mode 100644 util/qemu-config-qmp.c

-- 
2.41.0




[PATCH-for-9.1 3/7] monitor: Rework stubs to simplify user emulation linking

2024-04-04 Thread Philippe Mathieu-Daudé
Currently the monitor stubs are scattered across three files.

Merge them into two files: a generic one (monitor-core) included
in all builds (in particular user emulation), and a less generic
one included only by tools and system emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 stubs/fdset.c| 17 -
 stubs/monitor-core.c | 20 +++-
 stubs/monitor.c  |  8 ++--
 stubs/meson.build|  5 +++--
 4 files changed, 24 insertions(+), 26 deletions(-)
 delete mode 100644 stubs/fdset.c

diff --git a/stubs/fdset.c b/stubs/fdset.c
deleted file mode 100644
index 56b3663d58..00
--- a/stubs/fdset.c
+++ /dev/null
@@ -1,17 +0,0 @@
-#include "qemu/osdep.h"
-#include "monitor/monitor.h"
-
-int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
-{
-errno = ENOSYS;
-return -1;
-}
-
-int64_t monitor_fdset_dup_fd_find(int dup_fd)
-{
-return -1;
-}
-
-void monitor_fdset_dup_fd_remove(int dupfd)
-{
-}
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index afa477aae6..72e40bcc15 100644
--- a/stubs/monitor-core.c
+++ b/stubs/monitor-core.c
@@ -1,6 +1,7 @@
+/* Monitor stub required for user emulation */
 #include "qemu/osdep.h"
 #include "monitor/monitor.h"
-#include "qapi/qapi-emit-events.h"
+#include "../monitor/monitor-internal.h"
 
 Monitor *monitor_cur(void)
 {
@@ -12,11 +13,22 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
 return NULL;
 }
 
-void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
+{
+errno = ENOSYS;
+return -1;
+}
+
+int64_t monitor_fdset_dup_fd_find(int dup_fd)
+{
+return -1;
+}
+
+void monitor_fdset_dup_fd_remove(int dupfd)
 {
 }
 
-void qapi_event_emit(QAPIEvent event, QDict *qdict)
+void monitor_fdsets_cleanup(void)
 {
 }
 
@@ -24,5 +36,3 @@ int monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
 {
 abort();
 }
-
-
diff --git a/stubs/monitor.c b/stubs/monitor.c
index 20786ac4ff..2fc4dc1493 100644
--- a/stubs/monitor.c
+++ b/stubs/monitor.c
@@ -1,7 +1,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-emit-events.h"
 #include "monitor/monitor.h"
-#include "../monitor/monitor-internal.h"
 
 int monitor_get_fd(Monitor *mon, const char *name, Error **errp)
 {
@@ -13,6 +13,10 @@ void monitor_init_hmp(Chardev *chr, bool use_readline, Error **errp)
 {
 }
 
-void monitor_fdsets_cleanup(void)
+void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
+{
+}
+
+void qapi_event_emit(QAPIEvent event, QDict *qdict)
 {
 }
diff --git a/stubs/meson.build b/stubs/meson.build
index 0bf25e6ca5..ca1bc07d30 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -10,7 +10,6 @@ stub_ss.add(files('qemu-timer-notify-cb.c'))
 stub_ss.add(files('icount.c'))
 stub_ss.add(files('dump.c'))
 stub_ss.add(files('error-printf.c'))
-stub_ss.add(files('fdset.c'))
 stub_ss.add(files('gdbstub.c'))
 stub_ss.add(files('get-vm-name.c'))
 stub_ss.add(files('graph-lock.c'))
@@ -28,7 +27,9 @@ if libaio.found()
 endif
 stub_ss.add(files('migr-blocker.c'))
 stub_ss.add(files('module-opts.c'))
-stub_ss.add(files('monitor.c'))
+if have_system or have_tools
+  stub_ss.add(files('monitor.c'))
+endif
 stub_ss.add(files('monitor-core.c'))
 stub_ss.add(files('physmem.c'))
 stub_ss.add(files('qemu-timer-notify-cb.c'))
-- 
2.41.0




[PATCH-for-9.0 1/4] hw/virtio: Introduce virtio_bh_new_guarded() helper

2024-04-04 Thread Philippe Mathieu-Daudé
Introduce virtio_bh_new_guarded(), similar to qemu_bh_new_guarded()
but using the transport's memory re-entrancy guard instead of the
device's (there can only be one virtio device per virtio bus).

Inspired-by: Gerd Hoffmann 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/virtio/virtio.h |  7 +++
 hw/virtio/virtio.c | 10 ++
 2 files changed, 17 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b3c74a1bca..12419d6355 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -22,6 +22,7 @@
 #include "standard-headers/linux/virtio_config.h"
 #include "standard-headers/linux/virtio_ring.h"
 #include "qom/object.h"
+#include "block/aio.h"
 
 /*
  * A guest should never accept this. It implies negotiation is broken
@@ -527,4 +528,10 @@ static inline bool virtio_device_disabled(VirtIODevice *vdev)
 bool virtio_legacy_allowed(VirtIODevice *vdev);
 bool virtio_legacy_check_disabled(VirtIODevice *vdev);
 
+QEMUBH *virtio_bh_new_guarded_full(VirtIODevice *vdev,
+   QEMUBHFunc *cb, void *opaque,
+   const char *name);
+#define virtio_bh_new_guarded(vdev, cb, opaque) \
+virtio_bh_new_guarded_full((vdev), (cb), (opaque), (stringify(cb)))
+
 #endif
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index fb6b4ccd83..e1735cf7fd 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -4176,3 +4176,13 @@ static void virtio_register_types(void)
 }
 
 type_init(virtio_register_types)
+
+QEMUBH *virtio_bh_new_guarded_full(VirtIODevice *vdev,
+   QEMUBHFunc *cb, void *opaque,
+   const char *name)
+{
+BusState *virtio_bus = qdev_get_parent_bus(DEVICE(vdev));
+DeviceState *transport = virtio_bus->parent;
+
+return qemu_bh_new_full(cb, opaque, name, &transport->mem_reentrancy_guard);
+}
-- 
2.41.0
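The commit message's rationale — the bus (transport) and the device must share a single re-entrancy guard, or the protection can be bypassed — can be illustrated with a self-contained toy model. This is only a sketch of the guard concept; it is not QEMU's MemReentrancyGuard API, and all names here are made up:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy re-entrancy guard: a bottom-half handler may only run when the
 * guard it was created against is not already engaged. */
struct toy_guard {
    bool engaged;
};

/* Returns true if the handler may run; false means the call was
 * blocked as re-entrant. Pair each successful call with toy_bh_leave(). */
static bool toy_bh_enter(struct toy_guard *g)
{
    if (g->engaged) {
        return false;
    }
    g->engaged = true;
    return true;
}

static void toy_bh_leave(struct toy_guard *g)
{
    g->engaged = false;
}
```

With two distinct guards, an MMIO handler holding the transport guard does not stop a device bottom half guarded by the device's own guard — which is exactly the bypass this series closes by making both sides use the transport guard.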




[PATCH-for-9.0 4/4] hw/virtio/virtio-crypto: Protect from DMA re-entrancy bugs

2024-04-04 Thread Philippe Mathieu-Daudé
Replace qemu_bh_new_guarded() by virtio_bh_new_guarded()
so the bus and device use the same guard. Otherwise the
DMA-reentrancy protection can be bypassed.

Cc: qemu-sta...@nongnu.org
Suggested-by: Alexander Bulekov 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/virtio/virtio-crypto.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index fe1313f2ad..ac1b67d1fb 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -1080,8 +1080,8 @@ static void virtio_crypto_device_realize(DeviceState *dev, Error **errp)
 vcrypto->vqs[i].dataq =
  virtio_add_queue(vdev, 1024, virtio_crypto_handle_dataq_bh);
 vcrypto->vqs[i].dataq_bh =
- qemu_bh_new_guarded(virtio_crypto_dataq_bh, &vcrypto->vqs[i],
- &dev->mem_reentrancy_guard);
+ virtio_bh_new_guarded(vdev, virtio_crypto_dataq_bh,
+   &vcrypto->vqs[i]);
 vcrypto->vqs[i].vcrypto = vcrypto;
 }
 
-- 
2.41.0




[PATCH-for-9.0 3/4] hw/char/virtio-serial-bus: Protect from DMA re-entrancy bugs

2024-04-04 Thread Philippe Mathieu-Daudé
Replace qemu_bh_new_guarded() by virtio_bh_new_guarded()
so the bus and device use the same guard. Otherwise the
DMA-reentrancy protection can be bypassed.

Cc: qemu-sta...@nongnu.org
Suggested-by: Alexander Bulekov 
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/char/virtio-serial-bus.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 016aba6374..cd0e3a11f7 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -985,8 +985,7 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-port->bh = qemu_bh_new_guarded(flush_queued_data_bh, port,
-   &dev->mem_reentrancy_guard);
+port->bh = virtio_bh_new_guarded(vdev, flush_queued_data_bh, port);
 port->elem = NULL;
 }
 
-- 
2.41.0




[PATCH-for-9.0 2/4] hw/display/virtio-gpu: Protect from DMA re-entrancy bugs

2024-04-04 Thread Philippe Mathieu-Daudé
Replace qemu_bh_new_guarded() by virtio_bh_new_guarded()
so the bus and device use the same guard. Otherwise the
DMA-reentrancy protection can be bypassed:

  $ cat << EOF | qemu-system-i386 -display none -nodefaults \
  -machine q35,accel=qtest \
  -m 512M \
  -device virtio-gpu \
  -qtest stdio
  outl 0xcf8 0x8820
  outl 0xcfc 0xe0004000
  outl 0xcf8 0x8804
  outw 0xcfc 0x06
  write 0xe0004030 0x4 0x024000e0
  write 0xe0004028 0x1 0xff
  write 0xe0004020 0x4 0x9300
  write 0xe000401c 0x1 0x01
  write 0x101 0x1 0x04
  write 0x103 0x1 0x1c
  write 0x9301c8 0x1 0x18
  write 0x105 0x1 0x1c
  write 0x107 0x1 0x1c
  write 0x109 0x1 0x1c
  write 0x10b 0x1 0x00
  write 0x10d 0x1 0x00
  write 0x10f 0x1 0x00
  write 0x111 0x1 0x00
  write 0x113 0x1 0x00
  write 0x115 0x1 0x00
  write 0x117 0x1 0x00
  write 0x119 0x1 0x00
  write 0x11b 0x1 0x00
  write 0x11d 0x1 0x00
  write 0x11f 0x1 0x00
  write 0x121 0x1 0x00
  write 0x123 0x1 0x00
  write 0x125 0x1 0x00
  write 0x127 0x1 0x00
  write 0x129 0x1 0x00
  write 0x12b 0x1 0x00
  write 0x12d 0x1 0x00
  write 0x12f 0x1 0x00
  write 0x131 0x1 0x00
  write 0x133 0x1 0x00
  write 0x135 0x1 0x00
  write 0x137 0x1 0x00
  write 0x139 0x1 0x00
  write 0xe0007003 0x1 0x00
  EOF
  ...
  =
  ==276099==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d11178 at pc 0x562cc3b736c7 bp 0x7ffed49dee60 sp 0x7ffed49dee58
  READ of size 8 at 0x60d11178 thread T0
  #0 0x562cc3b736c6 in virtio_gpu_ctrl_response hw/display/virtio-gpu.c:180:42
  #1 0x562cc3b7c40b in virtio_gpu_ctrl_response_nodata hw/display/virtio-gpu.c:192:5
  #2 0x562cc3b7c40b in virtio_gpu_simple_process_cmd hw/display/virtio-gpu.c:1015:13
  #3 0x562cc3b82873 in virtio_gpu_process_cmdq hw/display/virtio-gpu.c:1050:9
  #4 0x562cc4a85514 in aio_bh_call util/async.c:169:5
  #5 0x562cc4a85c52 in aio_bh_poll util/async.c:216:13
  #6 0x562cc4a1a79b in aio_dispatch util/aio-posix.c:423:5
  #7 0x562cc4a8a2da in aio_ctx_dispatch util/async.c:358:5
  #8 0x7f36840547a8 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x547a8)
  #9 0x562cc4a8b753 in glib_pollfds_poll util/main-loop.c:290:9
  #10 0x562cc4a8b753 in os_host_main_loop_wait util/main-loop.c:313:5
  #11 0x562cc4a8b753 in main_loop_wait util/main-loop.c:592:11
  #12 0x562cc3938186 in qemu_main_loop system/runstate.c:782:9
  #13 0x562cc43b7af5 in qemu_default_main system/main.c:37:14
  #14 0x7f3683a6c189 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  #15 0x7f3683a6c244 in __libc_start_main csu/../csu/libc-start.c:381:3
  #16 0x562cc2a58ac0 in _start (qemu-system-i386+0x231bac0)

  0x60d11178 is located 56 bytes inside of 136-byte region [0x60d11140,0x60d111c8)
  freed by thread T0 here:
  #0 0x562cc2adb662 in __interceptor_free (qemu-system-i386+0x239e662)
  #1 0x562cc3b86b21 in virtio_gpu_reset hw/display/virtio-gpu.c:1524:9
  #2 0x562cc416e20e in virtio_reset hw/virtio/virtio.c:2145:9
  #3 0x562cc37c5644 in virtio_pci_reset hw/virtio/virtio-pci.c:2249:5
  #4 0x562cc4233758 in memory_region_write_accessor system/memory.c:497:5
  #5 0x562cc4232eea in access_with_adjusted_size system/memory.c:573:18

  previously allocated by thread T0 here:
  #0 0x562cc2adb90e in malloc (qemu-system-i386+0x239e90e)
  #1 0x7f368405a678 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5a678)
  #2 0x562cc4163ffc in virtqueue_split_pop hw/virtio/virtio.c:1612:12
  #3 0x562cc4163ffc in virtqueue_pop hw/virtio/virtio.c:1783:16
  #4 0x562cc3b91a95 in virtio_gpu_handle_ctrl hw/display/virtio-gpu.c:1112:15
  #5 0x562cc4a85514 in aio_bh_call util/async.c:169:5
  #6 0x562cc4a85c52 in aio_bh_poll util/async.c:216:13
  #7 0x562cc4a1a79b in aio_dispatch util/aio-posix.c:423:5

  SUMMARY: AddressSanitizer: heap-use-after-free hw/display/virtio-gpu.c:180:42 in virtio_gpu_ctrl_response

With this change, the same reproducer triggers:

  qemu-system-i386: warning: Blocked re-entrant IO on MemoryRegion: virtio-pci-common-virtio-gpu at addr: 0x6

Cc: qemu-sta...@nongnu.org
Reported-by: Alexander Bulekov 
Reported-by: Yongkang Jia 
Reported-by: Xiao Lei 
Reported-by: Yiming Tao 
Buglink: https://bugs.launchpad.net/qemu/+bug/1888606
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/display/virtio-gpu.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 78d5a4f164..3ab94a9735 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1492,10 +1492,8 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
 
 g->ctrl_vq = virtio_get_queue(vdev, 0);
 g->cursor_vq = virtio_get_queue(vdev, 1);
-

[PATCH-for-9.0 0/4] hw/virtio: Protect from more DMA re-entrancy bugs

2024-04-04 Thread Philippe Mathieu-Daudé
Gerd suggested to use the transport guard to protect the
device from DMA re-entrancy abuses.

Philippe Mathieu-Daudé (4):
  hw/virtio: Introduce virtio_bh_new_guarded() helper
  hw/display/virtio-gpu: Protect from DMA re-entrancy bugs
  hw/char/virtio-serial-bus: Protect from DMA re-entrancy bugs
  hw/virtio/virtio-crypto: Protect from DMA re-entrancy bugs

 include/hw/virtio/virtio.h  |  7 +++
 hw/char/virtio-serial-bus.c |  3 +--
 hw/display/virtio-gpu.c |  6 ++
 hw/virtio/virtio-crypto.c   |  4 ++--
 hw/virtio/virtio.c  | 10 ++
 5 files changed, 22 insertions(+), 8 deletions(-)

-- 
2.41.0




Re: Intention to work on GSoC project

2024-04-04 Thread Sahil
Hi,

On Thursday, April 4, 2024 12:07:49 AM IST Eugenio Perez Martin wrote:
> On Wed, Apr 3, 2024 at 4:36 PM Sahil  wrote:
> [...]
> > I would like to clarify one thing in the figure "Full two-entries
> > descriptor table". The driver can only overwrite a used descriptor in the
> > descriptor ring, right?
> 
> Except for the first round, the driver can only write to used entries
> in the descriptor table. In other words, their avail and used flags
> must be equal.
> 
> > And likewise for the device?
> 
> Yes, but with avail descs. I think you got this already, but I want to
> be as complete as possible here.
> 
> > So in the figure, the driver will have to wait until descriptor[1] is
> > used before it can overwrite it?
> 
> Yes, but I think it is easier to think that both descriptor id 0 and 1
> are available already. The descriptor id must be less than virtqueue
> size.
> 
> An entry with a valid buffer and length must be invalid because of the
> descriptor id in that situation, either because it is a number > vq
> length or because it is a descriptor already available.

I didn't think of it in that way. This makes sense now.

> > Suppose the device marks descriptor[0] as used. I think the driver will
> > not be able to overwrite that descriptor entry because it has to go in
> > order and is at descriptor[1]. Is that correct?
> 
> The device must write one descriptor as used, either 0 or 1, at
> descriptors[0] as all the descriptors are available.
> 
> Now, it does not matter if the device marks as used one or the two
> descriptors: the driver must write its next available descriptor at
> descriptor[1]. This is not because descriptor[1] contains a special
> field or data, but because the driver must write the avail descriptors
> sequentially, so the device knows the address to poll or check after a
> notification.
> 
> In other words, descriptor[1] is just a buffer space from the driver
> to communicate an available descriptor to the device. It does not
> matter what it contained before the writing, as the driver must
> process that information before writing the new available descriptor.
> 
> > Is it possible for the driver
> > to go "backwards" in the descriptor ring?
> 
> Nope, under any circumstance.

Understood. Thank you for the clarification.
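As an aside for readers following the thread: the flag protocol described above ("their avail and used flags must be equal" before the driver may rewrite a slot) can be sketched in a few lines. This is a toy model of the VIRTIO 1.1 packed-ring avail/used bits, not QEMU's implementation; the names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTQ_DESC_F_AVAIL (1u << 7)
#define VIRTQ_DESC_F_USED  (1u << 15)

/* Toy packed-ring descriptor, modeled on VIRTIO 1.1 section 2.7. */
struct packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* Driver side: a slot may be (re)written when its avail and used bits
 * are equal, i.e. the device has consumed the previous entry (this is
 * also the initial all-zero state of the first round). */
static bool desc_is_writable_by_driver(const struct packed_desc *d)
{
    bool avail = d->flags & VIRTQ_DESC_F_AVAIL;
    bool used = d->flags & VIRTQ_DESC_F_USED;
    return avail == used;
}

/* Driver marks a slot available: avail = wrap counter, used = !wrap. */
static void desc_mark_avail(struct packed_desc *d, bool wrap)
{
    d->flags = (wrap ? VIRTQ_DESC_F_AVAIL : 0) |
               (wrap ? 0 : VIRTQ_DESC_F_USED);
}

/* Device marks a slot used: both bits set to its wrap counter. */
static void desc_mark_used(struct packed_desc *d, bool wrap)
{
    d->flags = wrap ? (VIRTQ_DESC_F_AVAIL | VIRTQ_DESC_F_USED) : 0;
}
```

Because the driver writes available slots strictly sequentially and the flags encode the wrap counter, the device always knows which descriptor address to check next — the point made above about never going "backwards" in the ring.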

> [...]
> > Q1.
> > In the paragraph just above Figure 6, there is the following line:
> > > the vhost kernel thread and QEMU may run in different CPU threads,
> > > so these writes must be synchronized with QEMU cleaning of the dirty
> > > bitmap, and this write must be seen strictly after the modifications of
> > > the guest memory by the QEMU thread.
> > 
> > I am not clear on the last part of the statement. The modification of
> > guest memory is being done by the vhost device and not by the QEMU
> > thread, right?
> 
> QEMU also writes to the bitmap cleaning it, so it knows the memory
> does not need to be resent.

Oh, I thought, from figure 6, the bitmap is a part of QEMU's memory but is
separate from the guest's memory.

> Feel free to ask questions about this, but you don't need to interact
> with the dirty bitmap in the project.

Understood, I won't go off on a tangent in that case.

> [...]
> > Regarding the implementation of this project, can the project be broken
> > down into two parts:
> > 1. implementing packed virtqueues in QEMU, and
> 
> Right, but let me expand on this: QEMU already supports packed
> virtqueue in an emulated device (hw/virtio/virtio.c). The missing part
> is the "driver" one, to be able to communicate with a vDPA device, at
> hw/virtio/vhost-shadow-virtqueue.c.

Got it. I'll take a look at "hw/virtio/virtio.c".

> [...]
> > My plan is to also understand how split virtqueue has been implemented
> > in QEMU. I think that'll be helpful when moving the kernel's implementation
> > to QEMU.
> 
> Sure, the split virtqueue is implemented in the same file
> vhost_shadow_virtqueue.c. If you deploy vhost_vdpa +vdpa_sim or
> vp_vdpa [1][2], you can:
> * Run QEMU with -netdev type=vhost-vdpa,x-svq=on
> * Set GDB breakpoint in interesting functions like
> vhost_handle_guest_kick and vhost_svq_flush.

I'll set up this environment as well.

Thanks,
Sahil





[PATCH] target/arm: Fix CNTPOFF_EL2 trap to missing EL3

2024-04-04 Thread Pierre-Clément Tosi
EL2 accesses to CNTPOFF_EL2 should only ever trap to EL3 if EL3 is
present, as described by the reference manual (for MRS):

  /* ... */
  elsif PSTATE.EL == EL2 then
  if Halted() && HaveEL(EL3) && /*...*/ then
  UNDEFINED;
  elsif HaveEL(EL3) && SCR_EL3.ECVEn == '0' then
  /* ... */
  else
  X[t, 64] = CNTPOFF_EL2;

However, the existing implementation of gt_cntpoff_access() always
returns CP_ACCESS_TRAP_EL3 for EL2 accesses with SCR_EL3.ECVEn unset. In
pseudo-code terminology, this corresponds to assuming that HaveEL(EL3)
is always true, which is wrong. As a result, QEMU panics in
access_check_cp_reg() when started without EL3 and running EL2 code
accessing the register (e.g. any recent KVM booting a guest).

Therefore, add the HaveEL(EL3) check to gt_cntpoff_access().

Cc: qemu-sta...@nongnu.org
Fixes: 2808d3b38a52 ("target/arm: Implement FEAT_ECV CNTPOFF_EL2 handling")
Signed-off-by: Pierre-Clément Tosi 
---
 target/arm/helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3f3a5b55d4..13ad90cac1 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3452,7 +3452,8 @@ static CPAccessResult gt_cntpoff_access(CPUARMState *env,
 const ARMCPRegInfo *ri,
 bool isread)
 {
-if (arm_current_el(env) == 2 && !(env->cp15.scr_el3 & SCR_ECVEN)) {
+if (arm_current_el(env) == 2 && arm_feature(env, ARM_FEATURE_EL3) &&
+!(env->cp15.scr_el3 & SCR_ECVEN)) {
 return CP_ACCESS_TRAP_EL3;
 }
 return CP_ACCESS_OK;
-- 
2.44.0.478.gd926399ef9-goog
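The fixed condition is small enough to model in isolation. The sketch below is illustrative only — the enum and function names are ours, not QEMU's — but it captures the HaveEL(EL3) gate the patch adds:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the corrected gt_cntpoff_access() logic:
 * an EL2 access may only trap to EL3 when EL3 actually exists. */
enum toy_access_result {
    TOY_ACCESS_OK,
    TOY_TRAP_EL3,
};

static enum toy_access_result toy_cntpoff_access(int current_el,
                                                 bool have_el3,
                                                 bool scr_ecven)
{
    if (current_el == 2 && have_el3 && !scr_ecven) {
        return TOY_TRAP_EL3;
    }
    return TOY_ACCESS_OK;
}
```

Before the fix, the have_el3 term was effectively assumed true, so a configuration without EL3 returned CP_ACCESS_TRAP_EL3 and tripped the assertion in access_check_cp_reg().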


-- 
Pierre



Re: [PULL 00/17] qemu-sparc queue 20240404

2024-04-04 Thread Peter Maydell
On Thu, 4 Apr 2024 at 15:25, Mark Cave-Ayland
 wrote:
>
> The following changes since commit 786fd793b81410fb2a28914315e2f05d2ff6733b:
>
>   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging 
> (2024-04-03 12:52:03 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/mcayland/qemu.git tags/qemu-sparc-20240404
>
> for you to fetch changes up to d7fe931818d5e9aa70d08056c43b496ce789ba64:
>
>   esp.c: remove explicit setting of DRQ within ESP state machine (2024-04-04 
> 15:17:53 +0100)
>
> 
> qemu-sparc queue
> - This contains fixes for the ESP emulation discovered by fuzzing (with 
> thanks to
>   Chuhong Yuan )


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0
for any user-visible changes.

-- PMM



Re: [PULL for-9.0 0/1] Block patches

2024-04-04 Thread Peter Maydell
On Thu, 4 Apr 2024 at 14:58, Stefan Hajnoczi  wrote:
>
> The following changes since commit 786fd793b81410fb2a28914315e2f05d2ff6733b:
>
>   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging 
> (2024-04-03 12:52:03 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to bbdf9023665f409113cb07b463732861af63fb47:
>
>   block/virtio-blk: Fix memory leak from virtio_blk_zone_report (2024-04-04 
> 09:29:42 -0400)
>
> 
> Pull request
>
> Fix a memory leak in virtio-blk zone report emulation code when the request is
> invalid.
>
> 
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0
for any user-visible changes.

-- PMM



Re: [PATCH v2] sh4: mac.l: implement saturation arithmetic logic

2024-04-04 Thread Philippe Mathieu-Daudé

Hi Zack,

Cc'ing the maintainer of this file, Yoshinori:

$ ./scripts/get_maintainer.pl -f target/sh4/op_helper.c
Yoshinori Sato  (reviewer:SH4 TCG CPUs)
(https://www.qemu.org/docs/master/devel/submitting-a-patch.html#cc-the-relevant-maintainer)

On 4/4/24 18:39, Peter Maydell wrote:

On Thu, 4 Apr 2024 at 17:26, Zack Buhman  wrote:


The saturation arithmetic logic in helper_macl is not correct.

I tested and verified this behavior on a SH7091, the general pattern
is a code sequence such as:

 sets

 mov.l _mach,r2
 lds r2,mach
 mov.l _macl,r2
 lds r2,macl

 mova _n,r0
 mov r0,r1
 mova _m,r0
 mac.l @r0+,@r1+

 _mach: .long 0x7fff
 _macl: .long 0x12345678
 _m:.long 0x7fff
 _n:.long 0x7fff

Test case 0: (no int64_t overflow)
   given; prior to saturation mac.l:
 mach = 0x7fff macl = 0x12345678
 @r0  = 0x7fff @r1  = 0x7fff

   expected saturation mac.l result:
 mach = 0x7fff macl = 0x

   qemu saturation mac.l result (prior to this commit):
 mach = 0x7ffe macl = 0x12345678

Test case 1: (no int64_t overflow)
   given; prior to saturation mac.l:
 mach = 0x8000 macl = 0x
 @r0  = 0x @r1  = 0x0001

   expected saturation mac.l result:
 mach = 0x8000 macl = 0x

   qemu saturation mac.l result (prior to this commit):
 mach = 0x7fff macl = 0x

Test case 2: (int64_t addition overflow)
   given; prior to saturation mac.l:
 mach = 0x8000 macl = 0x
 @r0  = 0x @r1  = 0x0001

   expected saturation mac.l result:
 mach = 0x8000 macl = 0x

   qemu saturation mac.l result (prior to this commit):
 mach = 0x7fff macl = 0x

Test case 3: (int64_t addition overflow)
   given; prior to saturation mac.l:
 mach = 0x7fff macl = 0x
 @r0 = 0x7fff @r1 = 0x7fff

   expected saturation mac.l result:
 mach = 0x7fff macl = 0x

   qemu saturation mac.l result (prior to this commit):
 mach = 0xfffe macl = 0x0001

All of the above also matches the description of MAC.L as documented
in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf


Hi. I just noticed that you didn't include a signed-off-by line
in your commit message. We need these as they're how you say
that you're legally OK to contribute this code to QEMU and
you're happy for it to go into the project:

https://www.qemu.org/docs/master/devel/submitting-a-patch.html#patch-emails-must-include-a-signed-off-by-line
has links to what exactly this means, but basically the
requirement is that the last line of your commit message should be
"Signed-off-by: Your Name "

In this case, if you just reply to this email with that, we
can pick it up and fix up the commit message when we apply the
patch.


---
  target/sh4/op_helper.c | 31 +--
  1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index 4559d0d376..ee16524083 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -160,18 +160,29 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)

  void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
  {
-int64_t res;
-
-res = ((uint64_t) env->mach << 32) | env->macl;
-res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
-env->mach = (res >> 32) & 0xffffffff;
-env->macl = res & 0xffffffff;
+int32_t value0 = (int32_t)arg0;
+int32_t value1 = (int32_t)arg1;
+int64_t mul = ((int64_t)value0) * ((int64_t)value1);
+int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
+int64_t result;
+bool overflow = sadd64_overflow(mac, mul, &result);
+/* Perform 48-bit saturation arithmetic if the S flag is set */
  if (env->sr & (1u << SR_S)) {
-if (res < 0)
-env->mach |= 0xffff0000;
-else
-env->mach &= 0x00007fff;
+/*
+ * The sign bit of `mac + mul` may overflow. The MAC unit on
+ * real SH-4 hardware has equivalent carry/saturation logic:
+ */
+const int64_t upper_bound =  ((1ull << 47) - 1);
+const int64_t lower_bound = -((1ull << 47) - 0);
+
+if (overflow) {
+result = (mac < 0) ? lower_bound : upper_bound;
+} else {
+result = MIN(MAX(result, lower_bound), upper_bound);
+}
  }
+env->macl = result;
+env->mach = result >> 32;
  }


I haven't checked the sh4 docs but the change looks right, so

Reviewed-by: Peter Maydell 

thanks
-- PMM






[PATCH v2] hw/virtio: Fix packed virtqueue flush used_idx

2024-04-04 Thread Wafer
If a virtio-net device has the VIRTIO_NET_F_MRG_RXBUF feature
but not the VIRTIO_RING_F_INDIRECT_DESC feature,
'VirtIONetQueue->rx_vq' will use the merge feature
to store data in multiple 'elems'.
The 'num_buffers' in the virtio header indicates how many elements are merged.
If the value of 'num_buffers' is greater than 1,
all the merged elements will be filled into the descriptor ring.
The 'idx' of the elements should be the value of 'vq->used_idx' plus 'ndescs'.

Signed-off-by: Wafer 

---
Changes in v2:
  - Clarify more in commit message;
---
 hw/virtio/virtio.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index fb6b4ccd83..cab5832cac 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -957,12 +957,20 @@ static void virtqueue_packed_flush(VirtQueue *vq, 
unsigned int count)
 return;
 }
 
+/*
+ * For an indirect element, 'ndescs' is 1.
+ * For all other elements, 'ndescs' is the number of
+ * descriptors chained by NEXT (as set in virtqueue_packed_pop).
+ * So when the 'elem' is filled into the descriptor ring,
+ * the 'idx' of this 'elem' shall be
+ * the value of 'vq->used_idx' plus the 'ndescs'.
+ */
+ndescs += vq->used_elems[0].ndescs;
 for (i = 1; i < count; i++) {
-virtqueue_packed_fill_desc(vq, &vq->used_elems[i], i, false);
+virtqueue_packed_fill_desc(vq, &vq->used_elems[i], ndescs, false);
 ndescs += vq->used_elems[i].ndescs;
 }
 virtqueue_packed_fill_desc(vq, &vq->used_elems[0], 0, true);
-ndescs += vq->used_elems[0].ndescs;
 
 vq->inuse -= ndescs;
 vq->used_idx += ndescs;
-- 
2.27.0




Re: [PATCH v1] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Wang, Lei
On 4/5/2024 0:25, Wang, Wei W wrote:
> On Thursday, April 4, 2024 10:12 PM, Peter Xu wrote:
>> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
>>> Before loading the guest states, ensure that the preempt channel is
>>> ready to use, as some of the states (e.g. via virtio_load) might
>>> trigger page faults that will be handled through the preempt channel.
>>> So yield to the main thread in the case that the channel create event
>>> has been dispatched.
>>>
>>> Originally-by: Lei Wang 
>>> Link:
>>> https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel
>>> .com/T/
>>> Suggested-by: Peter Xu 
>>> Signed-off-by: Lei Wang 
>>> Signed-off-by: Wei Wang 
>>> ---
>>>  migration/savevm.c | 17 +
>>>  1 file changed, 17 insertions(+)
>>>
>>> diff --git a/migration/savevm.c b/migration/savevm.c index
>>> 388d7af7cd..fbc9f2bdd4 100644
>>> --- a/migration/savevm.c
>>> +++ b/migration/savevm.c
>>> @@ -2342,6 +2342,23 @@ static int
>>> loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
>>>
>>>  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
>>>
>>> +/*
>>> + * Before loading the guest states, ensure that the preempt channel is
>>> + * ready to use, as some of the states (e.g. via virtio_load)
>>> might
>>> + * trigger page faults that will be handled through the preempt 
>>> channel.
>>> + * So yield to the main thread in the case that the channel create 
>>> event
>>> + * has been dispatched.
>>> + */
>>> +do {
>>> +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
>>> +mis->postcopy_qemufile_dst) {
>>> +break;
>>> +}
>>> +
>>> +aio_co_schedule(qemu_get_current_aio_context(),
>> qemu_coroutine_self());
>>> +qemu_coroutine_yield();
>>> +} while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done,
>>> + 1));
>>
>> I think we need s/!// here, so the same mistake I made?  I think we need to
>> rework the retval of qemu_sem_timedwait() at some point later..
> 
> No. qemu_sem_timedwait() returns false on timeout, which means the sem isn't
> posted yet.
> So it needs to go back to the loop. (The patch was tested.)

When it times out, qemu_sem_timedwait() will return -1. I think the patch may
have passed testing because you will always get at least one yield (the first
yield in the do ... while ...) in loadvm_handle_cmd_packaged()?

> 
>>
>> Besides, this patch kept the sem_wait() in postcopy_preempt_thread() so it
>> will wait() on this sem again.  If this qemu_sem_timedwait() accidentally
>> consumed the sem count then I think the other thread can hang forever?
> 
> I see the issue you mentioned, and it seems better to place this before the
> creation of the preempt thread. Then we probably don't need to wait on the
> sem in the preempt thread, as the channel is guaranteed to be ready when it
> runs?
> 
> Update will be:
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index eccff499cb..5a70ce4f23 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1254,6 +1254,15 @@ int postcopy_ram_incoming_setup(MigrationIncomingState 
> *mis)
>  }
> 
>  if (migrate_postcopy_preempt()) {
> +do {
> +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> +mis->postcopy_qemufile_dst) {
> +break;
> +}
> +aio_co_schedule(qemu_get_current_aio_context(), 
> qemu_coroutine_self());
> +qemu_coroutine_yield();
> +} while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));
> +
>  /*
>   * This thread needs to be created after the temp pages because
>   * it'll fetch RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately.
> @@ -1743,12 +1752,6 @@ void *postcopy_preempt_thread(void *opaque)
> 
>  qemu_sem_post(&mis->thread_sync_sem);
> 
> -/*
> - * The preempt channel is established in asynchronous way.  Wait
> - * for its completion.
> - */
> -qemu_sem_wait(&mis->postcopy_qemufile_dst_done);
> 
> 
> 
> 
> 
> 
> 



Re: [PATCH v2] sh4: mac.l: implement saturation arithmetic logic

2024-04-04 Thread Peter Maydell
On Thu, 4 Apr 2024 at 17:26, Zack Buhman  wrote:
>
> The saturation arithmetic logic in helper_macl is not correct.
>
> I tested and verified this behavior on an SH7091; the general pattern
> is a code sequence such as:
>
> sets
>
> mov.l _mach,r2
> lds r2,mach
> mov.l _macl,r2
> lds r2,macl
>
> mova _n,r0
> mov r0,r1
> mova _m,r0
> mac.l @r0+,@r1+
>
> _mach: .long 0x7fff
> _macl: .long 0x12345678
> _m:.long 0x7fff
> _n:.long 0x7fff
>
> Test case 0: (no int64_t overflow)
>   given; prior to saturation mac.l:
> mach = 0x7fff macl = 0x12345678
> @r0  = 0x7fff @r1  = 0x7fff
>
>   expected saturation mac.l result:
> mach = 0x7fff macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0x7ffe macl = 0x12345678
>
> Test case 1: (no int64_t overflow)
>   given; prior to saturation mac.l:
> mach = 0x8000 macl = 0x
> @r0  = 0x @r1  = 0x0001
>
>   expected saturation mac.l result:
> mach = 0x8000 macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0x7fff macl = 0x
>
> Test case 2: (int64_t addition overflow)
>   given; prior to saturation mac.l:
> mach = 0x8000 macl = 0x
> @r0  = 0x @r1  = 0x0001
>
>   expected saturation mac.l result:
> mach = 0x8000 macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0x7fff macl = 0x
>
> Test case 3: (int64_t addition overflow)
>   given; prior to saturation mac.l:
> mach = 0x7fff macl = 0x
> @r0 = 0x7fff @r1 = 0x7fff
>
>   expected saturation mac.l result:
> mach = 0x7fff macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0xfffe macl = 0x0001
>
> All of the above also matches the description of MAC.L as documented
> in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf

Hi. I just noticed that you didn't include a signed-off-by line
in your commit message. We need these as they're how you say
that you're legally OK to contribute this code to QEMU and
you're happy for it to go into the project:

https://www.qemu.org/docs/master/devel/submitting-a-patch.html#patch-emails-must-include-a-signed-off-by-line
has links to what exactly this means, but basically the
requirement is that the last line of your commit message should be
"Signed-off-by: Your Name "

In this case, if you just reply to this email with that, we
can pick it up and fix up the commit message when we apply the
patch.

> ---
>  target/sh4/op_helper.c | 31 +--
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
> index 4559d0d376..ee16524083 100644
> --- a/target/sh4/op_helper.c
> +++ b/target/sh4/op_helper.c
> @@ -160,18 +160,29 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)
>
>  void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
>  {
> -int64_t res;
> -
> -res = ((uint64_t) env->mach << 32) | env->macl;
> -res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
> -env->mach = (res >> 32) & 0xffffffff;
> -env->macl = res & 0xffffffff;
> +int32_t value0 = (int32_t)arg0;
> +int32_t value1 = (int32_t)arg1;
> +int64_t mul = ((int64_t)value0) * ((int64_t)value1);
> +int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
> +int64_t result;
> +bool overflow = sadd64_overflow(mac, mul, &result);
> +/* Perform 48-bit saturation arithmetic if the S flag is set */
>  if (env->sr & (1u << SR_S)) {
> -if (res < 0)
> -env->mach |= 0xffff0000;
> -else
> -env->mach &= 0x00007fff;
> +/*
> + * The sign bit of `mac + mul` may overflow. The MAC unit on
> + * real SH-4 hardware has equivalent carry/saturation logic:
> + */
> +const int64_t upper_bound =  ((1ull << 47) - 1);
> +const int64_t lower_bound = -((1ull << 47) - 0);
> +
> +if (overflow) {
> +result = (mac < 0) ? lower_bound : upper_bound;
> +} else {
> +result = MIN(MAX(result, lower_bound), upper_bound);
> +}
>  }
> +env->macl = result;
> +env->mach = result >> 32;
>  }

I haven't checked the sh4 docs but the change looks right, so

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [RFC v2 1/5] virtio: Initialize sequence variables

2024-04-04 Thread Eugenio Perez Martin
On Thu, Apr 4, 2024 at 4:42 PM Jonah Palmer  wrote:
>
>
>
> On 4/4/24 7:35 AM, Eugenio Perez Martin wrote:
> > On Wed, Apr 3, 2024 at 6:51 PM Jonah Palmer  wrote:
> >>
> >>
> >>
> >> On 4/3/24 6:18 AM, Eugenio Perez Martin wrote:
> >>> On Thu, Mar 28, 2024 at 5:22 PM Jonah Palmer  
> >>> wrote:
> 
>  Initialize sequence variables for VirtQueue and VirtQueueElement
>  structures. A VirtQueue's sequence variables are initialized when a
>  VirtQueue is being created or reset. A VirtQueueElement's sequence
>  variable is initialized when a VirtQueueElement is being initialized.
>  These variables will be used to support the VIRTIO_F_IN_ORDER feature.
> 
>  A VirtQueue's used_seq_idx represents the next expected index in a
>  sequence of VirtQueueElements to be processed (put on the used ring).
>  The next VirtQueueElement added to the used ring must match this
>  sequence number before additional elements can be safely added to the
>  used ring. It's also particularly useful for helping find the number of
>  new elements added to the used ring.
> 
>  A VirtQueue's current_seq_idx represents the current sequence index.
>  This value is essentially a counter where the value is assigned to a new
>  VirtQueueElement and then incremented. Given its uint16_t type, this
>  sequence number can be between 0 and 65,535.
> 
>  A VirtQueueElement's seq_idx represents the sequence number assigned to
>  the VirtQueueElement when it was created. This value must match with the
>  VirtQueue's used_seq_idx before the element can be put on the used ring
>  by the device.
> 
>  Signed-off-by: Jonah Palmer 
>  ---
> hw/virtio/virtio.c | 18 ++
> include/hw/virtio/virtio.h |  1 +
> 2 files changed, 19 insertions(+)
> 
>  diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>  index fb6b4ccd83..069d96df99 100644
>  --- a/hw/virtio/virtio.c
>  +++ b/hw/virtio/virtio.c
>  @@ -132,6 +132,10 @@ struct VirtQueue
> uint16_t used_idx;
> bool used_wrap_counter;
> 
>  +/* In-Order sequence indices */
>  +uint16_t used_seq_idx;
>  +uint16_t current_seq_idx;
>  +
> >>>
> >>> I'm having a hard time understanding the difference between these and
> >>> last_avail_idx and used_idx. It seems to me if we replace them
> >>> everything will work? What am I missing?
> >>>
> >>
> >> For used_seq_idx, it does work like used_idx except the difference is
> >> when their values get updated, specifically for the split VQ case.
> >>
> >> As you know, for the split VQ case, the used_idx is updated during
> >> virtqueue_split_flush. However, imagine a batch of elements coming in
> >> where virtqueue_split_fill is called multiple times before
> >> virtqueue_split_flush. We want to make sure we write these elements to
> >> the used ring in-order and we'll know its order based on used_seq_idx.
> >>
> >> Alternatively, I thought about replicating the logic for the packed VQ
> >> case (where this used_seq_idx isn't used) where we start looking at
> >> vq->used_elems[vq->used_idx] and iterate through until we find a used
> >> element, but I wasn't sure how to handle the case where elements get
> >> used (written to the used ring) and new elements get put in used_elems
> >> before the used_idx is updated. Since this search would require us to
> >> always start at index vq->used_idx.
> >>
> >> For example, say, of three elements getting filled (elem0 - elem2),
> >> elem1 and elem0 come back first (vq->used_idx = 0):
> >>
> >> elem1 - not in-order
> >> elem0 - in-order, vq->used_elems[vq->used_idx + 1] (elem1) also now
> >>   in-order, write elem0 and elem1 to used ring, mark elements as
> >>   used
> >>
> >> Then elem2 comes back, but vq->used_idx is still 0, so how do we know to
> >> ignore the used elements at vq->used_idx (elem0) and vq->used_idx + 1
> >> (elem1) and iterate to vq->used_idx + 2 (elem2)?
> >>
> >> Hmm... now that I'm thinking about it, maybe for the split VQ case we
> >> could continue looking through the vq->used_elems array until we find an
> >> unused element... but then again how would we (1) know if the element is
> >> in-order and (2) know when to stop searching?
> >>
> >
> > Ok I think I understand the problem now. It is aggravated if we add
> > chained descriptors to the mix.
> >
> > We know that the order of used descriptors must be the exact same as
> > the order they were made available, leaving out in order batching.
> > What if we fill vq->used_elems at virtqueue_pop and then virtqueue_push
> > just marks them as used somehow? Two booleans (or a flag) would do for a
> > first iteration.
> >
> > If we go with this approach I think used_elems should be renamed actually.
> >
>
> If I'm understanding correctly, I don't think adding newly created
> elements to vq->used_elems at virtqueue_pop will do much for u

[PATCH v2] sh4: mac.l: implement saturation arithmetic logic

2024-04-04 Thread Zack Buhman
The saturation arithmetic logic in helper_macl is not correct.

I tested and verified this behavior on an SH7091; the general pattern
is a code sequence such as:

sets

mov.l _mach,r2
lds r2,mach
mov.l _macl,r2
lds r2,macl

mova _n,r0
mov r0,r1
mova _m,r0
mac.l @r0+,@r1+

_mach: .long 0x7fff
_macl: .long 0x12345678
_m:.long 0x7fff
_n:.long 0x7fff

Test case 0: (no int64_t overflow)
  given; prior to saturation mac.l:
mach = 0x7fff macl = 0x12345678
@r0  = 0x7fff @r1  = 0x7fff

  expected saturation mac.l result:
mach = 0x7fff macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0x7ffe macl = 0x12345678

Test case 1: (no int64_t overflow)
  given; prior to saturation mac.l:
mach = 0x8000 macl = 0x
@r0  = 0x @r1  = 0x0001

  expected saturation mac.l result:
mach = 0x8000 macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0x7fff macl = 0x

Test case 2: (int64_t addition overflow)
  given; prior to saturation mac.l:
mach = 0x8000 macl = 0x
@r0  = 0x @r1  = 0x0001

  expected saturation mac.l result:
mach = 0x8000 macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0x7fff macl = 0x

Test case 3: (int64_t addition overflow)
  given; prior to saturation mac.l:
mach = 0x7fff macl = 0x
@r0 = 0x7fff @r1 = 0x7fff

  expected saturation mac.l result:
mach = 0x7fff macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0xfffe macl = 0x0001

All of the above also matches the description of MAC.L as documented
in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf
---
 target/sh4/op_helper.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index 4559d0d376..ee16524083 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -160,18 +160,29 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)
 
 void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
 {
-int64_t res;
-
-res = ((uint64_t) env->mach << 32) | env->macl;
-res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
-env->mach = (res >> 32) & 0xffffffff;
-env->macl = res & 0xffffffff;
+int32_t value0 = (int32_t)arg0;
+int32_t value1 = (int32_t)arg1;
+int64_t mul = ((int64_t)value0) * ((int64_t)value1);
+int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
+int64_t result;
+bool overflow = sadd64_overflow(mac, mul, &result);
+/* Perform 48-bit saturation arithmetic if the S flag is set */
 if (env->sr & (1u << SR_S)) {
-if (res < 0)
-env->mach |= 0xffff0000;
-else
-env->mach &= 0x00007fff;
+/*
+ * The sign bit of `mac + mul` may overflow. The MAC unit on
+ * real SH-4 hardware has equivalent carry/saturation logic:
+ */
+const int64_t upper_bound =  ((1ull << 47) - 1);
+const int64_t lower_bound = -((1ull << 47) - 0);
+
+if (overflow) {
+result = (mac < 0) ? lower_bound : upper_bound;
+} else {
+result = MIN(MAX(result, lower_bound), upper_bound);
+}
 }
+env->macl = result;
+env->mach = result >> 32;
 }
 
 void helper_macw(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
-- 
2.41.0




RE: [PATCH v1] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-04 Thread Wang, Wei W
On Thursday, April 4, 2024 10:12 PM, Peter Xu wrote:
> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> > Before loading the guest states, ensure that the preempt channel is
> > ready to use, as some of the states (e.g. via virtio_load) might
> > trigger page faults that will be handled through the preempt channel.
> > So yield to the main thread in the case that the channel create event
> > has been dispatched.
> >
> > Originally-by: Lei Wang 
> > Link:
> > https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel
> > .com/T/
> > Suggested-by: Peter Xu 
> > Signed-off-by: Lei Wang 
> > Signed-off-by: Wei Wang 
> > ---
> >  migration/savevm.c | 17 +
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/migration/savevm.c b/migration/savevm.c index
> > 388d7af7cd..fbc9f2bdd4 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2342,6 +2342,23 @@ static int
> > loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >
> >  QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> >
> > +/*
> > + * Before loading the guest states, ensure that the preempt channel is
> > + * ready to use, as some of the states (e.g. via virtio_load)
> > might
> > + * trigger page faults that will be handled through the preempt 
> > channel.
> > + * So yield to the main thread in the case that the channel create 
> > event
> > + * has been dispatched.
> > + */
> > +do {
> > +if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> > +mis->postcopy_qemufile_dst) {
> > +break;
> > +}
> > +
> > +aio_co_schedule(qemu_get_current_aio_context(),
> qemu_coroutine_self());
> > +qemu_coroutine_yield();
> > +} while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done,
> > + 1));
> 
> I think we need s/!// here, so the same mistake I made?  I think we need to
> rework the retval of qemu_sem_timedwait() at some point later..

No. qemu_sem_timedwait() returns false on timeout, which means the sem isn't
posted yet.
So it needs to go back to the loop. (The patch was tested.)

> 
> Besides, this patch kept the sem_wait() in postcopy_preempt_thread() so it
> will wait() on this sem again.  If this qemu_sem_timedwait() accidentally
> consumed the sem count then I think the other thread can hang forever?

I see the issue you mentioned, and it seems better to place this before the
creation of the preempt thread. Then we probably don't need to wait on the sem
in the preempt thread, as the channel is guaranteed to be ready when it runs?

Update will be:

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index eccff499cb..5a70ce4f23 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1254,6 +1254,15 @@ int postcopy_ram_incoming_setup(MigrationIncomingState 
*mis)
 }

 if (migrate_postcopy_preempt()) {
+do {
+if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
+mis->postcopy_qemufile_dst) {
+break;
+}
+aio_co_schedule(qemu_get_current_aio_context(), 
qemu_coroutine_self());
+qemu_coroutine_yield();
+} while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));
+
 /*
  * This thread needs to be created after the temp pages because
  * it'll fetch RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately.
@@ -1743,12 +1752,6 @@ void *postcopy_preempt_thread(void *opaque)

 qemu_sem_post(&mis->thread_sync_sem);

-/*
- * The preempt channel is established in asynchronous way.  Wait
- * for its completion.
- */
-qemu_sem_wait(&mis->postcopy_qemufile_dst_done);









Re: [PATCH] sh4: mac.l: implement saturation arithmetic logic

2024-04-04 Thread Peter Maydell
On Thu, 4 Apr 2024 at 16:12, Zack Buhman  wrote:
>
> The saturation arithmetic logic in helper_macl is not correct.
>
> I tested and verified this behavior on an SH7091; the general pattern
> is a code sequence such as:
>
> sets
>
> mov.l _mach,r2
> lds r2,mach
> mov.l _macl,r2
> lds r2,macl
>
> mova _n,r0
> mov r0,r1
> mova _m,r0
> mac.l @r0+,@r1+
>
> _mach: .long 0x7fff
> _macl: .long 0x12345678
> _m:.long 0x7fff
> _n:.long 0x7fff
>
> Test case 0: (no int64_t overflow)
>   given; prior to saturation mac.l:
> mach = 0x7fff macl = 0x12345678
> @r0  = 0x7fff @r1  = 0x7fff
>
>   expected saturation mac.l result:
> mach = 0x7fff macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0x7ffe macl = 0x12345678
>
> Test case 1: (no int64_t overflow)
>   given; prior to saturation mac.l:
> mach = 0x8000 macl = 0x
> @r0  = 0x @r1  = 0x0001
>
>   expected saturation mac.l result:
> mach = 0x8000 macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0x7fff macl = 0x
>
> Test case 2: (int64_t addition overflow)
>   given; prior to saturation mac.l:
> mach = 0x8000 macl = 0x
> @r0  = 0x @r1  = 0x0001
>
>   expected saturation mac.l result:
> mach = 0x8000 macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0x7fff macl = 0x
>
> Test case 3: (int64_t addition overflow)
>   given; prior to saturation mac.l:
> mach = 0x7fff macl = 0x
> @r0 = 0x7fff @r1 = 0x7fff
>
>   expected saturation mac.l result:
> mach = 0x7fff macl = 0x
>
>   qemu saturation mac.l result (prior to this commit):
> mach = 0xfffe macl = 0x0001
>
> All of the above also matches the description of MAC.L as documented
> in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf
> ---
>  target/sh4/op_helper.c | 45 --
>  1 file changed, 35 insertions(+), 10 deletions(-)
>
> diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
> index 4559d0d376..a3eb2f5281 100644
> --- a/target/sh4/op_helper.c
> +++ b/target/sh4/op_helper.c
> @@ -160,18 +160,43 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)
>
>  void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
>  {
> -int64_t res;
> -
> -res = ((uint64_t) env->mach << 32) | env->macl;
> -res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
> -env->mach = (res >> 32) & 0xffffffff;
> -env->macl = res & 0xffffffff;
> +int32_t value0 = (int32_t)arg0;
> +int32_t value1 = (int32_t)arg1;
> +int64_t mul = ((int64_t)value0) * ((int64_t)value1);
> +int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
> +int64_t result = mac + mul;
> +/* Perform 48-bit saturation arithmetic if the S flag is set */
>  if (env->sr & (1u << SR_S)) {
> -if (res < 0)
> -env->mach |= 0xffff0000;
> -else
> -env->mach &= 0x00007fff;
> +/*
> + * The following xor/and expression is necessary to detect an
> + * overflow in MSB of res; this logic is necessary because the
> + * sign bit of `mac + mul` may overflow. The MAC unit on real
> + * SH-4 hardware has carry/saturation logic that is equivalent
> + * to the following:
> + */
> +const int64_t upper_bound =  ((1ull << 47) - 1);
> +const int64_t lower_bound = -((1ull << 47) - 0);
> +
> +if (((((result ^ mac) & (result ^ mul)) >> 63) & 1) == 1) {
> +/* An overflow occurred during 64-bit addition */

This is testing whether the "int64_t result = mac + mul"
signed 64-bit arithmetic overflowed, right? That's probably
cleaner written by using the sadd64_overflow() function in
host-utils.h, which does the 64-bit add and returns a bool
to tell you whether it overflowed or not:

   if (sadd64_overflow(mac, mul, &result)) {
   result = (result < 0) ? lower_bound : upper_bound;
   } else {
   result = MIN(MAX(result, lower_bound), upper_bound);
   }



> +if (((mac >> 63) & 1) == 0) {
> +result = upper_bound;
> +} else {
> +result = lower_bound;
> +}
> +} else {
> +/* An overflow did not occur during 64-bit addition */
> +if (result > upper_bound) {
> +result = upper_bound;
> +} else if (result < lower_bound) {
> +result = lower_bound;
> +} else {
> +/* leave result unchanged */
> +}
> +}
>  }
> +env->macl = result;
> +env->mach = result >> 32;

thanks
-- PMM



[PATCH] sh4: mac.l: implement saturation arithmetic logic

2024-04-04 Thread Zack Buhman
The saturation arithmetic logic in helper_macl is not correct.

I tested and verified this behavior on an SH7091; the general pattern
is a code sequence such as:

sets

mov.l _mach,r2
lds r2,mach
mov.l _macl,r2
lds r2,macl

mova _n,r0
mov r0,r1
mova _m,r0
mac.l @r0+,@r1+

_mach: .long 0x7fff
_macl: .long 0x12345678
_m:.long 0x7fff
_n:.long 0x7fff

Test case 0: (no int64_t overflow)
  given; prior to saturation mac.l:
mach = 0x7fff macl = 0x12345678
@r0  = 0x7fff @r1  = 0x7fff

  expected saturation mac.l result:
mach = 0x7fff macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0x7ffe macl = 0x12345678

Test case 1: (no int64_t overflow)
  given; prior to saturation mac.l:
mach = 0x8000 macl = 0x
@r0  = 0x @r1  = 0x0001

  expected saturation mac.l result:
mach = 0x8000 macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0x7fff macl = 0x

Test case 2: (int64_t addition overflow)
  given; prior to saturation mac.l:
mach = 0x8000 macl = 0x
@r0  = 0x @r1  = 0x0001

  expected saturation mac.l result:
mach = 0x8000 macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0x7fff macl = 0x

Test case 3: (int64_t addition overflow)
  given; prior to saturation mac.l:
mach = 0x7fff macl = 0x
@r0 = 0x7fff @r1 = 0x7fff

  expected saturation mac.l result:
mach = 0x7fff macl = 0x

  qemu saturation mac.l result (prior to this commit):
mach = 0xfffe macl = 0x0001

All of the above also matches the description of MAC.L as documented
in cd00147165-sh-4-32-bit-cpu-core-architecture-stmicroelectronics.pdf
---
 target/sh4/op_helper.c | 45 --
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index 4559d0d376..a3eb2f5281 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -160,18 +160,43 @@ void helper_ocbi(CPUSH4State *env, uint32_t address)
 
 void helper_macl(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
 {
-int64_t res;
-
-res = ((uint64_t) env->mach << 32) | env->macl;
-res += (int64_t) (int32_t) arg0 *(int64_t) (int32_t) arg1;
-env->mach = (res >> 32) & 0xffffffff;
-env->macl = res & 0xffffffff;
+int32_t value0 = (int32_t)arg0;
+int32_t value1 = (int32_t)arg1;
+int64_t mul = ((int64_t)value0) * ((int64_t)value1);
+int64_t mac = (((uint64_t)env->mach) << 32) | env->macl;
+int64_t result = mac + mul;
+/* Perform 48-bit saturation arithmetic if the S flag is set */
 if (env->sr & (1u << SR_S)) {
-if (res < 0)
-env->mach |= 0xffff0000;
-else
-env->mach &= 0x00007fff;
+/*
+ * The following xor/and expression is necessary to detect an
+ * overflow in MSB of res; this logic is necessary because the
+ * sign bit of `mac + mul` may overflow. The MAC unit on real
+ * SH-4 hardware has carry/saturation logic that is equivalent
+ * to the following:
+ */
+const int64_t upper_bound =  ((1ull << 47) - 1);
+const int64_t lower_bound = -((1ull << 47) - 0);
+
+if (((((result ^ mac) & (result ^ mul)) >> 63) & 1) == 1) {
+/* An overflow occurred during 64-bit addition */
+if (((mac >> 63) & 1) == 0) {
+result = upper_bound;
+} else {
+result = lower_bound;
+}
+} else {
+/* An overflow did not occur during 64-bit addition */
+if (result > upper_bound) {
+result = upper_bound;
+} else if (result < lower_bound) {
+result = lower_bound;
+} else {
+/* leave result unchanged */
+}
+}
 }
+env->macl = result;
+env->mach = result >> 32;
 }
 
 void helper_macw(CPUSH4State *env, uint32_t arg0, uint32_t arg1)
-- 
2.41.0




Re: [PATCH v12 10/23] hw/arm/virt: Wire NMI and VINMI irq lines from GIC to CPU

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> Wire the new NMI and VINMI interrupt line from the GIC to each CPU.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---
> v9:
> - Rename ARM_CPU_VNMI to ARM_CPU_VINMI.
> - Update the commit message.
> v4:
> - Add Reviewed-by.
> v3:
> - Also add VNMI wire.
> ---
>  hw/arm/virt.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index a9a913aead..ef2e6c2c4d 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -821,7 +821,8 @@ static void create_gic(VirtMachineState *vms, 
> MemoryRegion *mem)
>
>  /* Wire the outputs from each CPU's generic timer and the GICv3
>   * maintenance interrupt signal to the appropriate GIC PPI inputs,
> - * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
> + * and the GIC's IRQ/FIQ/VIRQ/VFIQ/NMI/VINMI interrupt outputs to the
> + * CPU's inputs.
>   */
>  for (i = 0; i < smp_cpus; i++) {
>  DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
> @@ -865,6 +866,10 @@ static void create_gic(VirtMachineState *vms, 
> MemoryRegion *mem)
> qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
>  sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
> qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
> +sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus,
> +   qdev_get_gpio_in(cpudev, ARM_CPU_NMI));
> +sysbus_connect_irq(gicbusdev, i + 5 * smp_cpus,
> +   qdev_get_gpio_in(cpudev, ARM_CPU_VINMI));
>  }

This patch needs to go after patch 11. Otherwise at this point
in the patchseries we are trying to wire up GPIOs on the GIC
which don't exist, and QEMU will assert:

$ ./build/x86/qemu-system-aarch64 -M virt,gic-version=3
Unexpected error in object_property_find_err() at ../../qom/object.c:1366:
qemu-system-aarch64: Property 'arm-gicv3.sysbus-irq[4]' not found
Aborted (core dumped)

We also need to only connect these up if vms->gic_version
is not VIRT_GIC_VERSION_2. This is because these GPIOs don't
exist on the GICv2, and otherwise we again assert if you
try to wire them up but you're using GICv2:

$ ./build/x86/qemu-system-aarch64 -M virt,gic-version=2
Unexpected error in object_property_find_err() at ../../qom/object.c:1366:
qemu-system-aarch64: Property 'arm_gic.sysbus-irq[4]' not found
Aborted (core dumped)

thanks
-- PMM



Re: [RFC v2 1/5] virtio: Initialize sequence variables

2024-04-04 Thread Jonah Palmer




On 4/4/24 7:35 AM, Eugenio Perez Martin wrote:

On Wed, Apr 3, 2024 at 6:51 PM Jonah Palmer  wrote:




On 4/3/24 6:18 AM, Eugenio Perez Martin wrote:

On Thu, Mar 28, 2024 at 5:22 PM Jonah Palmer  wrote:


Initialize sequence variables for VirtQueue and VirtQueueElement
structures. A VirtQueue's sequence variables are initialized when a
VirtQueue is being created or reset. A VirtQueueElement's sequence
variable is initialized when a VirtQueueElement is being initialized.
These variables will be used to support the VIRTIO_F_IN_ORDER feature.

A VirtQueue's used_seq_idx represents the next expected index in a
sequence of VirtQueueElements to be processed (put on the used ring).
The next VirtQueueElement added to the used ring must match this
sequence number before additional elements can be safely added to the
used ring. It's also particularly useful for helping find the number of
new elements added to the used ring.

A VirtQueue's current_seq_idx represents the current sequence index.
This value is essentially a counter where the value is assigned to a new
VirtQueueElement and then incremented. Given its uint16_t type, this
sequence number can be between 0 and 65,535.

A VirtQueueElement's seq_idx represents the sequence number assigned to
the VirtQueueElement when it was created. This value must match with the
VirtQueue's used_seq_idx before the element can be put on the used ring
by the device.

Signed-off-by: Jonah Palmer 
---
   hw/virtio/virtio.c | 18 ++
   include/hw/virtio/virtio.h |  1 +
   2 files changed, 19 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index fb6b4ccd83..069d96df99 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -132,6 +132,10 @@ struct VirtQueue
   uint16_t used_idx;
   bool used_wrap_counter;

+/* In-Order sequence indices */
+uint16_t used_seq_idx;
+uint16_t current_seq_idx;
+


I'm having a hard time understanding the difference between these and
last_avail_idx and used_idx. It seems to me if we replace them
everything will work? What am I missing?



For used_seq_idx, it works like used_idx; the difference is
when their values get updated, specifically for the split VQ case.

As you know, for the split VQ case, the used_idx is updated during
virtqueue_split_flush. However, imagine a batch of elements coming in
where virtqueue_split_fill is called multiple times before
virtqueue_split_flush. We want to make sure we write these elements to
the used ring in-order and we'll know its order based on used_seq_idx.

Alternatively, I thought about replicating the logic for the packed VQ
case (where this used_seq_idx isn't used) where we start looking at
vq->used_elems[vq->used_idx] and iterate through until we find a used
element, but I wasn't sure how to handle the case where elements get
used (written to the used ring) and new elements get put in used_elems
before the used_idx is updated, since this search would require us to
always start at index vq->used_idx.

For example, say, of three elements getting filled (elem0 - elem2),
elem1 and elem0 come back first (vq->used_idx = 0):

elem1 - not in-order
elem0 - in-order, vq->used_elems[vq->used_idx + 1] (elem1) also now
  in-order, write elem0 and elem1 to used ring, mark elements as
  used

Then elem2 comes back, but vq->used_idx is still 0, so how do we know to
ignore the used elements at vq->used_idx (elem0) and vq->used_idx + 1
(elem1) and iterate to vq->used_idx + 2 (elem2)?

Hmm... now that I'm thinking about it, maybe for the split VQ case we
could continue looking through the vq->used_elems array until we find an
unused element... but then again how would we (1) know if the element is
in-order and (2) know when to stop searching?



Ok I think I understand the problem now. It is aggravated if we add
chained descriptors to the mix.

We know that the order of used descriptors must be the exact same as
the order they were made available, leaving out in order batching.
What if vq->used_elems at virtqueue_pop and then virtqueue_push just
marks them as used somehow? Two booleans (or flag) would do for a
first iteration.

If we go with this approach I think used_elems should be renamed actually.



If I'm understanding correctly, I don't think adding newly created 
elements to vq->used_elems at virtqueue_pop will do much for us. We 
could just keep adding processed elements to vq->used_elems at 
virtqueue_fill but instead of:


vq->used_elems[seq_idx].in_num = elem->in_num;
vq->used_elems[seq_idx].out_num = elem->out_num;

We could do:

vq->used_elems[seq_idx].in_num = 1;
vq->used_elems[seq_idx].out_num = 1;

We'd use in_num and out_num as separate flags. in_num could indicate if 
this element has been written to the used ring while out_num could 
indicate if this element has been flushed (1 for no, 0 for yes). In 
other words, when we go to write to the used ring, start at index 
vq->used_idx and iterate through the used elem

Re: [PATCH v12 00/23] target/arm: Implement FEAT_NMI and FEAT_GICv3_NMI

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> This patch set implements FEAT_NMI and FEAT_GICv3_NMI for ARMv8. These
> introduce support for a new category of interrupts in the architecture
> which we can use to provide NMI like functionality.

Looking through the Arm ARM pseudocode at places where it
handles NMI related features and bits, I noticed one corner
case we don't handle in this patchseries: illegal exception return.
In the pseudocode, AArch64.ExceptionReturn() calls
SetPSTATEFromPSR(), which treats PSTATE.ALLINT as one of the
bits which are reinstated from SPSR to PSTATE regardless of
whether this is an illegal exception return or not. For
QEMU that means we want to handle it the same way we do
PSTATE_DAIF and PSTATE_NZCV in the illegal_return exit path of
the exception_return helper:

--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -904,8 +904,8 @@ illegal_return:
  */
 env->pstate |= PSTATE_IL;
 env->pc = new_pc;
-spsr &= PSTATE_NZCV | PSTATE_DAIF;
-spsr |= pstate_read(env) & ~(PSTATE_NZCV | PSTATE_DAIF);
+spsr &= PSTATE_NZCV | PSTATE_DAIF | PSTATE_ALLINT;
+spsr |= pstate_read(env) & ~(PSTATE_NZCV | PSTATE_DAIF | PSTATE_ALLINT);
 pstate_write(env, spsr);
 if (!arm_singlestep_active(env)) {
 env->pstate &= ~PSTATE_SS;

(I haven't thought about whether this fits particularly into
any existing patch or should be a patch of its own.)

thanks
-- PMM



[PULL 07/17] esp.c: use esp_fifo_push() instead of fifo8_push()

2024-04-04 Thread Mark Cave-Ayland
There are still a few places that use fifo8_push() instead of esp_fifo_push() in
order to push a value into the FIFO. Update those places to use esp_fifo_push()
instead.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-8-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index d474268438..8d2d36d56c 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -858,7 +858,7 @@ static void esp_do_nodma(ESPState *s)
 return;
 }
 if (fifo8_is_empty(&s->fifo)) {
-fifo8_push(&s->fifo, s->async_buf[0]);
+esp_fifo_push(s, s->async_buf[0]);
 s->async_buf++;
 s->async_len--;
 s->ti_size--;
@@ -881,7 +881,7 @@ static void esp_do_nodma(ESPState *s)
 case STAT_ST:
 switch (s->rregs[ESP_CMD]) {
 case CMD_ICCS:
-fifo8_push(&s->fifo, s->status);
+esp_fifo_push(s, s->status);
 esp_set_phase(s, STAT_MI);
 
 /* Process any message in phase data */
@@ -893,7 +893,7 @@ static void esp_do_nodma(ESPState *s)
 case STAT_MI:
 switch (s->rregs[ESP_CMD]) {
 case CMD_ICCS:
-fifo8_push(&s->fifo, 0);
+esp_fifo_push(s, 0);
 
 /* Raise end of command interrupt */
 s->rregs[ESP_RINTR] |= INTR_FC;
-- 
2.39.2




[PULL 08/17] esp.c: change esp_fifo_pop_buf() to take ESPState

2024-04-04 Thread Mark Cave-Ayland
Now that all users of esp_fifo_pop_buf() operate on the main FIFO there is no
need to pass the FIFO explicitly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-9-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 8d2d36d56c..83b621ee0f 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -155,9 +155,9 @@ static uint32_t esp_fifo8_pop_buf(Fifo8 *fifo, uint8_t 
*dest, int maxlen)
 return n;
 }
 
-static uint32_t esp_fifo_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
+static uint32_t esp_fifo_pop_buf(ESPState *s, uint8_t *dest, int maxlen)
 {
-return esp_fifo8_pop_buf(fifo, dest, maxlen);
+return esp_fifo8_pop_buf(&s->fifo, dest, maxlen);
 }
 
 static uint32_t esp_get_tc(ESPState *s)
@@ -459,7 +459,7 @@ static void esp_do_dma(ESPState *s)
 s->dma_memory_read(s->dma_opaque, buf, len);
 esp_set_tc(s, esp_get_tc(s) - len);
 } else {
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 esp_raise_drq(s);
 }
@@ -515,7 +515,7 @@ static void esp_do_dma(ESPState *s)
 fifo8_push_all(&s->cmdfifo, buf, len);
 esp_set_tc(s, esp_get_tc(s) - len);
 } else {
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 esp_raise_drq(s);
@@ -549,7 +549,7 @@ static void esp_do_dma(ESPState *s)
 /* Copy FIFO data to device */
 len = MIN(s->async_len, ESP_FIFO_SZ);
 len = MIN(len, fifo8_num_used(&s->fifo));
-len = esp_fifo_pop_buf(&s->fifo, s->async_buf, len);
+len = esp_fifo_pop_buf(s, s->async_buf, len);
 esp_raise_drq(s);
 }
 
@@ -713,7 +713,7 @@ static void esp_nodma_ti_dataout(ESPState *s)
 }
 len = MIN(s->async_len, ESP_FIFO_SZ);
 len = MIN(len, fifo8_num_used(&s->fifo));
-esp_fifo_pop_buf(&s->fifo, s->async_buf, len);
+esp_fifo_pop_buf(s, s->async_buf, len);
 s->async_buf += len;
 s->async_len -= len;
 s->ti_size += len;
@@ -738,7 +738,7 @@ static void esp_do_nodma(ESPState *s)
 switch (s->rregs[ESP_CMD]) {
 case CMD_SELATN:
 /* Copy FIFO into cmdfifo */
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 
@@ -757,7 +757,7 @@ static void esp_do_nodma(ESPState *s)
 
 case CMD_SELATNS:
 /* Copy one byte from FIFO into cmdfifo */
-len = esp_fifo_pop_buf(&s->fifo, buf, 1);
+len = esp_fifo_pop_buf(s, buf, 1);
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 
@@ -774,7 +774,7 @@ static void esp_do_nodma(ESPState *s)
 
 case CMD_TI:
 /* Copy FIFO into cmdfifo */
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 
@@ -792,7 +792,7 @@ static void esp_do_nodma(ESPState *s)
 switch (s->rregs[ESP_CMD]) {
 case CMD_TI:
 /* Copy FIFO into cmdfifo */
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 
@@ -821,7 +821,7 @@ static void esp_do_nodma(ESPState *s)
 case CMD_SEL | CMD_DMA:
 case CMD_SELATN | CMD_DMA:
 /* Copy FIFO into cmdfifo */
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 
@@ -836,7 +836,7 @@ static void esp_do_nodma(ESPState *s)
 case CMD_SEL:
 case CMD_SELATN:
 /* FIFO already contain entire CDB: copy to cmdfifo and execute */
-len = esp_fifo_pop_buf(&s->fifo, buf, fifo8_num_used(&s->fifo));
+len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
  

[PULL 13/17] esp.c: move esp_set_phase() and esp_get_phase() towards the beginning of the file

2024-04-04 Thread Mark Cave-Ayland
This allows these functions to be used earlier in the file without needing a
separate forward declaration.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-14-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index d8db33b921..9e35c00927 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -79,6 +79,24 @@ static void esp_lower_drq(ESPState *s)
 }
 }
 
+static const char *esp_phase_names[8] = {
+"DATA OUT", "DATA IN", "COMMAND", "STATUS",
+"(reserved)", "(reserved)", "MESSAGE OUT", "MESSAGE IN"
+};
+
+static void esp_set_phase(ESPState *s, uint8_t phase)
+{
+s->rregs[ESP_RSTAT] &= ~7;
+s->rregs[ESP_RSTAT] |= phase;
+
+trace_esp_set_phase(esp_phase_names[phase]);
+}
+
+static uint8_t esp_get_phase(ESPState *s)
+{
+return s->rregs[ESP_RSTAT] & 7;
+}
+
 void esp_dma_enable(ESPState *s, int irq, int level)
 {
 if (level) {
@@ -200,24 +218,6 @@ static uint32_t esp_get_stc(ESPState *s)
 return dmalen;
 }
 
-static const char *esp_phase_names[8] = {
-"DATA OUT", "DATA IN", "COMMAND", "STATUS",
-"(reserved)", "(reserved)", "MESSAGE OUT", "MESSAGE IN"
-};
-
-static void esp_set_phase(ESPState *s, uint8_t phase)
-{
-s->rregs[ESP_RSTAT] &= ~7;
-s->rregs[ESP_RSTAT] |= phase;
-
-trace_esp_set_phase(esp_phase_names[phase]);
-}
-
-static uint8_t esp_get_phase(ESPState *s)
-{
-return s->rregs[ESP_RSTAT] & 7;
-}
-
 static uint8_t esp_pdma_read(ESPState *s)
 {
 uint8_t val;
-- 
2.39.2




[PULL 15/17] esp.c: update esp_fifo_{push, pop}() to call esp_update_drq()

2024-04-04 Thread Mark Cave-Ayland
This ensures that the DRQ line is always set correctly when reading/writing
single bytes to/from the FIFO.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-16-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 6fd1a12f23..4895181ec1 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -170,10 +170,11 @@ static void esp_fifo_push(ESPState *s, uint8_t val)
 {
 if (fifo8_num_used(&s->fifo) == s->fifo.capacity) {
 trace_esp_error_fifo_overrun();
-return;
+} else {
+fifo8_push(&s->fifo, val);
 }
 
-fifo8_push(&s->fifo, val);
+esp_update_drq(s);
 }
 
 static void esp_fifo_push_buf(ESPState *s, uint8_t *buf, int len)
@@ -184,11 +185,16 @@ static void esp_fifo_push_buf(ESPState *s, uint8_t *buf, 
int len)
 
 static uint8_t esp_fifo_pop(ESPState *s)
 {
+uint8_t val;
+
 if (fifo8_is_empty(&s->fifo)) {
-return 0;
+val = 0;
+} else {
+val = fifo8_pop(&s->fifo);
 }
 
-return fifo8_pop(&s->fifo);
+esp_update_drq(s);
+return val;
 }
 
 static uint32_t esp_fifo8_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
-- 
2.39.2




[PULL 17/17] esp.c: remove explicit setting of DRQ within ESP state machine

2024-04-04 Thread Mark Cave-Ayland
Now that esp_update_drq() is called for all reads/writes to the FIFO, there is
no need to manually raise and lower the DRQ signal.

Signed-off-by: Mark Cave-Ayland 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/611
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1831
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-18-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 04dfd90090..5d9b52632e 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -506,7 +506,6 @@ static void esp_dma_ti_check(ESPState *s)
 if (esp_get_tc(s) == 0 && fifo8_num_used(&s->fifo) < 2) {
 s->rregs[ESP_RINTR] |= INTR_BS;
 esp_raise_irq(s);
-esp_lower_drq(s);
 }
 }
 
@@ -526,7 +525,6 @@ static void esp_do_dma(ESPState *s)
 } else {
 len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
-esp_raise_drq(s);
 }
 
 fifo8_push_all(&s->cmdfifo, buf, len);
@@ -583,7 +581,6 @@ static void esp_do_dma(ESPState *s)
 len = esp_fifo_pop_buf(s, buf, fifo8_num_used(&s->fifo));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
-esp_raise_drq(s);
 }
 trace_esp_handle_ti_cmd(cmdlen);
 s->ti_size = 0;
@@ -615,7 +612,6 @@ static void esp_do_dma(ESPState *s)
 len = MIN(s->async_len, ESP_FIFO_SZ);
 len = MIN(len, fifo8_num_used(&s->fifo));
 len = esp_fifo_pop_buf(s, s->async_buf, len);
-esp_raise_drq(s);
 }
 
 s->async_buf += len;
@@ -667,7 +663,6 @@ static void esp_do_dma(ESPState *s)
 /* Copy device data to FIFO */
 len = MIN(len, fifo8_num_free(&s->fifo));
 esp_fifo_push_buf(s, s->async_buf, len);
-esp_raise_drq(s);
 }
 
 s->async_buf += len;
@@ -733,7 +728,6 @@ static void esp_do_dma(ESPState *s)
 if (fifo8_num_used(&s->fifo) < 2) {
 s->rregs[ESP_RINTR] |= INTR_BS;
 esp_raise_irq(s);
-esp_lower_drq(s);
 }
 break;
 }
@@ -1021,9 +1015,6 @@ void esp_command_complete(SCSIRequest *req, size_t resid)
 s->rregs[ESP_RINTR] |= INTR_BS;
 esp_raise_irq(s);
 
-/* Ensure DRQ is set correctly for TC underflow or normal completion */
-esp_dma_ti_check(s);
-
 if (s->current_req) {
 scsi_req_unref(s->current_req);
 s->current_req = NULL;
-- 
2.39.2




[PULL 10/17] esp.c: don't assert() if FIFO empty when executing non-DMA SELATNS

2024-04-04 Thread Mark Cave-Ayland
The current logic assumes that at least 1 byte is present in the FIFO when
executing a non-DMA SELATNS command, but this may not be the case if the
guest executes an invalid ESP command sequence.

Reported-by: Chuhong Yuan 
Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240324191707.623175-11-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 1aac8f5564..f3aa5364cf 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -762,7 +762,8 @@ static void esp_do_nodma(ESPState *s)
 
 case CMD_SELATNS:
 /* Copy one byte from FIFO into cmdfifo */
-len = esp_fifo_pop_buf(s, buf, 1);
+len = esp_fifo_pop_buf(s, buf,
+   MIN(fifo8_num_used(&s->fifo), 1));
 len = MIN(fifo8_num_free(&s->cmdfifo), len);
 fifo8_push_all(&s->cmdfifo, buf, len);
 
-- 
2.39.2




[PULL 03/17] esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_message_phase()

2024-04-04 Thread Mark Cave-Ayland
The aim is to restrict the esp_fifo_*() functions so that they only operate on
the hardware FIFO. When reading from cmdfifo in do_message_phase() use the
underlying esp_fifo8_pop_buf() function directly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240324191707.623175-4-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index ff51145da7..9386704a58 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -325,7 +325,7 @@ static void do_message_phase(ESPState *s)
 /* Ignore extended messages for now */
 if (s->cmdfifo_cdb_offset) {
 int len = MIN(s->cmdfifo_cdb_offset, fifo8_num_used(&s->cmdfifo));
-esp_fifo_pop_buf(&s->cmdfifo, NULL, len);
+esp_fifo8_pop_buf(&s->cmdfifo, NULL, len);
 s->cmdfifo_cdb_offset = 0;
 }
 }
-- 
2.39.2




[PULL 16/17] esp.c: ensure esp_pdma_write() always calls esp_fifo_push()

2024-04-04 Thread Mark Cave-Ayland
This ensures that esp_update_drq() is called via esp_fifo_push() whenever the
host uses PDMA to transfer data to a SCSI device.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-17-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 4895181ec1..04dfd90090 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -282,14 +282,12 @@ static void esp_pdma_write(ESPState *s, uint8_t val)
 {
 uint32_t dmalen = esp_get_tc(s);
 
-if (dmalen == 0) {
-return;
-}
-
 esp_fifo_push(s, val);
 
-dmalen--;
-esp_set_tc(s, dmalen);
+if (dmalen && s->drq_state) {
+dmalen--;
+esp_set_tc(s, dmalen);
+}
 }
 
 static int esp_select(ESPState *s)
-- 
2.39.2




[PULL 02/17] esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_command_phase()

2024-04-04 Thread Mark Cave-Ayland
The aim is to restrict the esp_fifo_*() functions so that they only operate on
the hardware FIFO. When reading from cmdfifo in do_command_phase() use the
underlying esp_fifo8_pop_buf() function directly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240324191707.623175-3-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 1b7b118a0b..ff51145da7 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -280,7 +280,7 @@ static void do_command_phase(ESPState *s)
 if (!cmdlen || !s->current_dev) {
 return;
 }
-esp_fifo_pop_buf(&s->cmdfifo, buf, cmdlen);
+esp_fifo8_pop_buf(&s->cmdfifo, buf, cmdlen);
 
 current_lun = scsi_device_find(&s->bus, 0, s->current_dev->id, s->lun);
 if (!current_lun) {
-- 
2.39.2




[PULL 14/17] esp.c: introduce esp_update_drq() and update esp_fifo_{push, pop}_buf() to use it

2024-04-04 Thread Mark Cave-Ayland
This new function sets the DRQ line correctly according to the current transfer
mode, direction and FIFO contents. Update esp_fifo_push_buf() and 
esp_fifo_pop_buf()
to use it so that DRQ is always set correctly when reading/writing multiple 
bytes
to/from the FIFO.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-15-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 48 +++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 9e35c00927..6fd1a12f23 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -124,6 +124,48 @@ void esp_request_cancelled(SCSIRequest *req)
 }
 }
 
+static void esp_update_drq(ESPState *s)
+{
+bool to_device;
+
+switch (esp_get_phase(s)) {
+case STAT_MO:
+case STAT_CD:
+case STAT_DO:
+to_device = true;
+break;
+
+case STAT_DI:
+case STAT_ST:
+case STAT_MI:
+to_device = false;
+break;
+
+default:
+return;
+}
+
+if (s->dma) {
+/* DMA request so update DRQ according to transfer direction */
+if (to_device) {
+if (fifo8_num_free(&s->fifo) < 2) {
+esp_lower_drq(s);
+} else {
+esp_raise_drq(s);
+}
+} else {
+if (fifo8_num_used(&s->fifo) < 2) {
+esp_lower_drq(s);
+} else {
+esp_raise_drq(s);
+}
+}
+} else {
+/* Not a DMA request */
+esp_lower_drq(s);
+}
+}
+
 static void esp_fifo_push(ESPState *s, uint8_t val)
 {
 if (fifo8_num_used(&s->fifo) == s->fifo.capacity) {
@@ -137,6 +179,7 @@ static void esp_fifo_push(ESPState *s, uint8_t val)
 static void esp_fifo_push_buf(ESPState *s, uint8_t *buf, int len)
 {
 fifo8_push_all(&s->fifo, buf, len);
+esp_update_drq(s);
 }
 
 static uint8_t esp_fifo_pop(ESPState *s)
@@ -180,7 +223,10 @@ static uint32_t esp_fifo8_pop_buf(Fifo8 *fifo, uint8_t 
*dest, int maxlen)
 
 static uint32_t esp_fifo_pop_buf(ESPState *s, uint8_t *dest, int maxlen)
 {
-return esp_fifo8_pop_buf(&s->fifo, dest, maxlen);
+uint32_t len = esp_fifo8_pop_buf(&s->fifo, dest, maxlen);
+
+esp_update_drq(s);
+return len;
 }
 
 static uint32_t esp_get_tc(ESPState *s)
-- 
2.39.2




[PULL 05/17] esp.c: change esp_fifo_push() to take ESPState

2024-04-04 Thread Mark Cave-Ayland
Now that all users of esp_fifo_push() operate on the main FIFO there is no need
to pass the FIFO explicitly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-6-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 5b169b3720..7e3338815b 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -106,14 +106,14 @@ void esp_request_cancelled(SCSIRequest *req)
 }
 }
 
-static void esp_fifo_push(Fifo8 *fifo, uint8_t val)
+static void esp_fifo_push(ESPState *s, uint8_t val)
 {
-if (fifo8_num_used(fifo) == fifo->capacity) {
+if (fifo8_num_used(&s->fifo) == s->fifo.capacity) {
 trace_esp_error_fifo_overrun();
 return;
 }
 
-fifo8_push(fifo, val);
+fifo8_push(&s->fifo, val);
 }
 
 static uint8_t esp_fifo_pop(Fifo8 *fifo)
@@ -229,7 +229,7 @@ static void esp_pdma_write(ESPState *s, uint8_t val)
 return;
 }
 
-esp_fifo_push(&s->fifo, val);
+esp_fifo_push(s, val);
 
 dmalen--;
 esp_set_tc(s, dmalen);
@@ -1240,7 +1240,7 @@ void esp_reg_write(ESPState *s, uint32_t saddr, uint64_t 
val)
 break;
 case ESP_FIFO:
 if (!fifo8_is_full(&s->fifo)) {
-esp_fifo_push(&s->fifo, val);
+esp_fifo_push(s, val);
 }
 esp_do_nodma(s);
 break;
-- 
2.39.2




Re: [PATCH v12 18/23] hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for
> ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit.
>
> If FEAT_GICv3_NMI is supported, ich_ap_write() should consider 
> ICV_AP1R_EL1.NMI
> bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit
> should be set or clear according to the Non-maskable property. And the RPR
> priority should also update the NMI bit according to the APR priority NMI bit.
>
> By the way, add gicv3_icv_nmiar1_read trace event.
>
> If the hpp irq is a NMI, the icv iar read should return 1022 and trap for
> NMI again
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 


> @@ -301,10 +310,11 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, 
> uint64_t lr)
>   */
>
>  prio = ich_lr_prio(lr);
> +is_nmi = lr & ICH_LR_EL2_NMI;

If you want to be able to skip the cs->gic->nmi_support check here
then you need to enforce in ich_lr_write() that the guest cannot
write a 1 to the ICH_LR_EL2_NMI bit when the GIC doesn't implement NMIs.

@@ -2833,6 +2833,10 @@ static void ich_lr_write(CPUARMState *env,
const ARMCPRegInfo *ri,
   8 - cs->vpribits, 0);
 }

+if (!cs->gic->nmi_support) {
+value &= ~ICH_LR_EL2_NMI;
+}
+
 cs->ich_lr_el2[regno] = value;
 gicv3_cpuif_virt_update(cs);
 }

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM



[PULL 11/17] esp.c: rework esp_cdb_length() into esp_cdb_ready()

2024-04-04 Thread Mark Cave-Ayland
The esp_cdb_length() function is only used as part of a calculation to determine
whether the cmdfifo contains an entire SCSI CDB. Rework esp_cdb_length() into a
new esp_cdb_ready() function which both enables us to handle the case where
scsi_cdb_length() returns -1, plus simplify the logic for its callers.

Suggested-by: Paolo Bonzini 
Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240324191707.623175-12-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index f3aa5364cf..f47abc36d6 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -425,20 +425,20 @@ static void write_response(ESPState *s)
 }
 }
 
-static int esp_cdb_length(ESPState *s)
+static bool esp_cdb_ready(ESPState *s)
 {
+int len = fifo8_num_used(&s->cmdfifo) - s->cmdfifo_cdb_offset;
 const uint8_t *pbuf;
-int cmdlen, len;
+int cdblen;
 
-cmdlen = fifo8_num_used(&s->cmdfifo);
-if (cmdlen < s->cmdfifo_cdb_offset) {
-return 0;
+if (len <= 0) {
+return false;
 }
 
-pbuf = fifo8_peek_buf(&s->cmdfifo, cmdlen, NULL);
-len = scsi_cdb_length((uint8_t *)&pbuf[s->cmdfifo_cdb_offset]);
+pbuf = fifo8_peek_buf(&s->cmdfifo, len, NULL);
+cdblen = scsi_cdb_length((uint8_t *)&pbuf[s->cmdfifo_cdb_offset]);
 
-return len;
+return cdblen < 0 ? false : (len >= cdblen);
 }
 
 static void esp_dma_ti_check(ESPState *s)
@@ -806,10 +806,9 @@ static void esp_do_nodma(ESPState *s)
 trace_esp_handle_ti_cmd(cmdlen);
 
 /* CDB may be transferred in one or more TI commands */
-if (esp_cdb_length(s) && esp_cdb_length(s) ==
-fifo8_num_used(&s->cmdfifo) - s->cmdfifo_cdb_offset) {
-/* Command has been received */
-do_cmd(s);
+if (esp_cdb_ready(s)) {
+/* Command has been received */
+do_cmd(s);
 } else {
 /*
  * If data was transferred from the FIFO then raise bus
@@ -832,10 +831,9 @@ static void esp_do_nodma(ESPState *s)
 fifo8_push_all(&s->cmdfifo, buf, len);
 
 /* Handle when DMA transfer is terminated by non-DMA FIFO write */
-if (esp_cdb_length(s) && esp_cdb_length(s) ==
-fifo8_num_used(&s->cmdfifo) - s->cmdfifo_cdb_offset) {
-/* Command has been received */
-do_cmd(s);
+if (esp_cdb_ready(s)) {
+/* Command has been received */
+do_cmd(s);
 }
 break;
 
-- 
2.39.2




[PULL 09/17] esp.c: introduce esp_fifo_push_buf() function for pushing to the FIFO

2024-04-04 Thread Mark Cave-Ayland
Instead of pushing data into the FIFO directly with fifo8_push_all(), add a new
esp_fifo_push_buf() function and use it accordingly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-10-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 83b621ee0f..1aac8f5564 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -116,6 +116,11 @@ static void esp_fifo_push(ESPState *s, uint8_t val)
 fifo8_push(&s->fifo, val);
 }
 
+static void esp_fifo_push_buf(ESPState *s, uint8_t *buf, int len)
+{
+fifo8_push_all(&s->fifo, buf, len);
+}
+
 static uint8_t esp_fifo_pop(ESPState *s)
 {
 if (fifo8_is_empty(&s->fifo)) {
@@ -601,7 +606,7 @@ static void esp_do_dma(ESPState *s)
 } else {
 /* Copy device data to FIFO */
 len = MIN(len, fifo8_num_free(&s->fifo));
-fifo8_push_all(&s->fifo, s->async_buf, len);
+esp_fifo_push_buf(s, s->async_buf, len);
 esp_raise_drq(s);
 }
 
@@ -650,7 +655,7 @@ static void esp_do_dma(ESPState *s)
 if (s->dma_memory_write) {
 s->dma_memory_write(s->dma_opaque, buf, len);
 } else {
-fifo8_push_all(&s->fifo, buf, len);
+esp_fifo_push_buf(s, buf, len);
 }
 
 esp_set_tc(s, esp_get_tc(s) - len);
@@ -685,7 +690,7 @@ static void esp_do_dma(ESPState *s)
 if (s->dma_memory_write) {
 s->dma_memory_write(s->dma_opaque, buf, len);
 } else {
-fifo8_push_all(&s->fifo, buf, len);
+esp_fifo_push_buf(s, buf, len);
 }
 
 esp_set_tc(s, esp_get_tc(s) - len);
-- 
2.39.2




[PULL 01/17] esp.c: move esp_fifo_pop_buf() internals to new esp_fifo8_pop_buf() function

2024-04-04 Thread Mark Cave-Ayland
Update esp_fifo_pop_buf() to be a simple wrapper onto the new esp_fifo8_pop_buf()
function.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240324191707.623175-2-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 590ff99744..1b7b118a0b 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -125,7 +125,7 @@ static uint8_t esp_fifo_pop(Fifo8 *fifo)
 return fifo8_pop(fifo);
 }
 
-static uint32_t esp_fifo_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
+static uint32_t esp_fifo8_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
 {
 const uint8_t *buf;
 uint32_t n, n2;
@@ -155,6 +155,11 @@ static uint32_t esp_fifo_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
 return n;
 }
 
+static uint32_t esp_fifo_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
+{
+return esp_fifo8_pop_buf(fifo, dest, maxlen);
+}
+
 static uint32_t esp_get_tc(ESPState *s)
 {
 uint32_t dmalen;
-- 
2.39.2




[PULL 12/17] esp.c: prevent cmdfifo overflow in esp_cdb_ready()

2024-04-04 Thread Mark Cave-Ayland
During normal use the cmdfifo will never wrap internally and cmdfifo_cdb_offset
will always indicate the start of the SCSI CDB. However, it is possible that a
malicious guest could issue an invalid ESP command sequence such that cmdfifo
wraps internally and cmdfifo_cdb_offset could point beyond the end of the FIFO
data buffer.

Add an extra check to fifo8_peek_buf() to ensure that if the cmdfifo has wrapped
internally then esp_cdb_ready() will exit rather than allow scsi_cdb_length() to
access data outside the cmdfifo data buffer.

Reported-by: Chuhong Yuan 
Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20240324191707.623175-13-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index f47abc36d6..d8db33b921 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -429,13 +429,23 @@ static bool esp_cdb_ready(ESPState *s)
 {
 int len = fifo8_num_used(&s->cmdfifo) - s->cmdfifo_cdb_offset;
 const uint8_t *pbuf;
+uint32_t n;
 int cdblen;
 
 if (len <= 0) {
 return false;
 }
 
-pbuf = fifo8_peek_buf(&s->cmdfifo, len, NULL);
+pbuf = fifo8_peek_buf(&s->cmdfifo, len, &n);
+if (n < len) {
+/*
+ * In normal use the cmdfifo should never wrap, but include this check
+ * to prevent a malicious guest from reading past the end of the
+ * cmdfifo data buffer below
+ */
+return false;
+}
+
 cdblen = scsi_cdb_length((uint8_t *)&pbuf[s->cmdfifo_cdb_offset]);
 
 return cdblen < 0 ? false : (len >= cdblen);
-- 
2.39.2




[PULL 00/17] qemu-sparc queue 20240404

2024-04-04 Thread Mark Cave-Ayland
The following changes since commit 786fd793b81410fb2a28914315e2f05d2ff6733b:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-04-03 12:52:03 +0100)

are available in the Git repository at:

  https://github.com/mcayland/qemu.git tags/qemu-sparc-20240404

for you to fetch changes up to d7fe931818d5e9aa70d08056c43b496ce789ba64:

  esp.c: remove explicit setting of DRQ within ESP state machine (2024-04-04 15:17:53 +0100)


qemu-sparc queue
- This contains fixes for the ESP emulation discovered by fuzzing (with thanks to Chuhong Yuan )


Mark Cave-Ayland (17):
  esp.c: move esp_fifo_pop_buf() internals to new esp_fifo8_pop_buf() function
  esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_command_phase()
  esp.c: replace esp_fifo_pop_buf() with esp_fifo8_pop_buf() in do_message_phase()
  esp.c: replace cmdfifo use of esp_fifo_pop() in do_message_phase()
  esp.c: change esp_fifo_push() to take ESPState
  esp.c: change esp_fifo_pop() to take ESPState
  esp.c: use esp_fifo_push() instead of fifo8_push()
  esp.c: change esp_fifo_pop_buf() to take ESPState
  esp.c: introduce esp_fifo_push_buf() function for pushing to the FIFO
  esp.c: don't assert() if FIFO empty when executing non-DMA SELATNS
  esp.c: rework esp_cdb_length() into esp_cdb_ready()
  esp.c: prevent cmdfifo overflow in esp_cdb_ready()
  esp.c: move esp_set_phase() and esp_get_phase() towards the beginning of the file
  esp.c: introduce esp_update_drq() and update esp_fifo_{push, pop}_buf() to use it
  esp.c: update esp_fifo_{push, pop}() to call esp_update_drq()
  esp.c: ensure esp_pdma_write() always calls esp_fifo_push()
  esp.c: remove explicit setting of DRQ within ESP state machine
 hw/scsi/esp.c | 223 +-
 1 file changed, 142 insertions(+), 81 deletions(-)



[PULL 06/17] esp.c: change esp_fifo_pop() to take ESPState

2024-04-04 Thread Mark Cave-Ayland
Now that all users of esp_fifo_pop() operate on the main FIFO there is no need
to pass the FIFO explicitly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-7-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 7e3338815b..d474268438 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -116,13 +116,13 @@ static void esp_fifo_push(ESPState *s, uint8_t val)
 fifo8_push(&s->fifo, val);
 }
 
-static uint8_t esp_fifo_pop(Fifo8 *fifo)
+static uint8_t esp_fifo_pop(ESPState *s)
 {
-if (fifo8_is_empty(fifo)) {
+if (fifo8_is_empty(&s->fifo)) {
 return 0;
 }
 
-return fifo8_pop(fifo);
+return fifo8_pop(&s->fifo);
 }
 
 static uint32_t esp_fifo8_pop_buf(Fifo8 *fifo, uint8_t *dest, int maxlen)
@@ -217,7 +217,7 @@ static uint8_t esp_pdma_read(ESPState *s)
 {
 uint8_t val;
 
-val = esp_fifo_pop(&s->fifo);
+val = esp_fifo_pop(s);
 return val;
 }
 
@@ -1184,7 +1184,7 @@ uint64_t esp_reg_read(ESPState *s, uint32_t saddr)
 
 switch (saddr) {
 case ESP_FIFO:
-s->rregs[ESP_FIFO] = esp_fifo_pop(&s->fifo);
+s->rregs[ESP_FIFO] = esp_fifo_pop(s);
 val = s->rregs[ESP_FIFO];
 break;
 case ESP_RINTR:
-- 
2.39.2




[PULL 04/17] esp.c: replace cmdfifo use of esp_fifo_pop() in do_message_phase()

2024-04-04 Thread Mark Cave-Ayland
Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Paolo Bonzini 
Message-Id: <20240324191707.623175-5-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland 
---
 hw/scsi/esp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 9386704a58..5b169b3720 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -315,7 +315,8 @@ static void do_command_phase(ESPState *s)
 static void do_message_phase(ESPState *s)
 {
 if (s->cmdfifo_cdb_offset) {
-uint8_t message = esp_fifo_pop(&s->cmdfifo);
+uint8_t message = fifo8_is_empty(&s->cmdfifo) ? 0 :
+  fifo8_pop(&s->cmdfifo);
 
 trace_esp_do_identify(message);
 s->lun = message & 7;
-- 
2.39.2




Re: [PATCH v12 17/23] hw/intc/arm_gicv3: Add NMI handling CPU interface registers

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:17, Jinjie Ruan  wrote:
>
> Add the NMIAR CPU interface registers which deal with acknowledging NMI.
>
> When introducing the NMI interrupt, there are some updates to the semantics of
> the ICC_IAR1_EL1 and ICC_HPPIR1_EL1 registers. The ICC_IAR1_EL1 register
> should return 1022 if the intid has the non-maskable property, and the
> ICC_NMIAR1_EL1 register should return 1023 if the intid does not have the
> non-maskable property. However, these changes are not necessary for the
> ICC_HPPIR1_EL1 register.
>
> And the APR and RPR have NMI bits which should be handled correctly.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 12/23] target/arm: Handle NMI in arm_cpu_do_interrupt_aarch64()

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> According to Arm GIC section 4.6.3 Interrupt superpriority, the interrupt
> with superpriority is always IRQ, never FIQ, so the NMI exception trap entry
> behaves like an IRQ. VINMI (vIRQ with superpriority) can be raised from the
> GIC or come from the HCRX_EL2.VINMI bit, while VFNMI (vFIQ with
> superpriority) comes from the HCRX_EL2.VFNMI bit.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 09/23] target/arm: Handle PSTATE.ALLINT on taking an exception

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:17, Jinjie Ruan  wrote:
>
> Set or clear PSTATE.ALLINT on taking an exception to ELx according to the
> SCTLR_ELx.SPINTMASK bit.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 


Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 08/23] target/arm: Handle IS/FS in ISR_EL1 for NMI, VINMI and VFNMI

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> Add IS and FS bit in ISR_EL1 and handle the read. With CPU_INTERRUPT_NMI or
> CPU_INTERRUPT_VINMI, both CPSR_I and ISR_IS must be set. With
> CPU_INTERRUPT_VFNMI, both CPSR_F and ISR_FS must be set.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 07/23] target/arm: Add support for NMI in arm_phys_excp_target_el()

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan via  wrote:
>
> According to Arm GIC section 4.6.3 Interrupt superpriority, the interrupt
> with superpriority is always IRQ, never FIQ, so handle NMI same as IRQ in
> arm_phys_excp_target_el().
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 04/23] target/arm: Implement ALLINT MSR (immediate)

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> Add ALLINT MSR (immediate) to decodetree, in which the CRm is 0b000x. The
> EL0 check is necessary for ALLINT, and the EL1 check is necessary when
> imm == 1. So implement it inline for EL2/3, or for EL1 with imm == 0. Avoid
> an unconditional write to pc and use raise_exception_ra to unwind.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 02/23] target/arm: Add PSTATE.ALLINT

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:17, Jinjie Ruan  wrote:
>
> When PSTATE.ALLINT is set, an IRQ or FIQ interrupt that is targeted to
> ELx, with or without superpriority, is masked.
>
> As Richard suggested, place ALLINT bit in PSTATE in env->pstate.
>
> With the change to pstate_read/write, exception entry
> and return are automatically handled.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 03/23] target/arm: Add support for FEAT_NMI, Non-maskable Interrupt

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>
> Add support for FEAT_NMI. NMI (FEAT_NMI) is a mandatory feature in
> Armv8.8-A and Armv9.3-A.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---
> v3:
> - Add Reviewed-by.
> - Adjust to before the MSR patches.
> ---
>  target/arm/internals.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/target/arm/internals.h b/target/arm/internals.h
> index dd3da211a3..516e0584bf 100644
> --- a/target/arm/internals.h
> +++ b/target/arm/internals.h
> @@ -1229,6 +1229,9 @@ static inline uint32_t aarch64_pstate_valid_mask(const ARMISARegisters *id)
>  if (isar_feature_aa64_mte(id)) {
>  valid |= PSTATE_TCO;
>  }
> +if (isar_feature_aa64_nmi(id)) {
> +valid |= PSTATE_ALLINT;
> +}

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 06/23] target/arm: Add support for Non-maskable Interrupt

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:17, Jinjie Ruan  wrote:
>
> This only implements the external delivery method via the GICv3.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 01/23] target/arm: Handle HCR_EL2 accesses for bits introduced with FEAT_NMI

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:17, Jinjie Ruan  wrote:
>
> FEAT_NMI defines another three new bits in HCRX_EL2: TALLINT, VINMI and
> VFNMI. When the feature is enabled, allow these bits to be written in
> HCRX_EL2.
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v12 05/23] target/arm: Support MSR access to ALLINT

2024-04-04 Thread Peter Maydell
On Wed, 3 Apr 2024 at 11:17, Jinjie Ruan  wrote:
>
> Support ALLINT MSR access as follows:
> mrs <Xt>, ALLINT// read allint
> msr ALLINT, <Xt>// write allint with imm
>
> Signed-off-by: Jinjie Ruan 
> Reviewed-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM


