Re: [PATCH for-6.2 31/43] target/hexagon: Implement cpu_mmu_index

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> The function is trivial for user-only, but still must be present.
> 
> Cc: Taylor Simpson 
> Signed-off-by: Richard Henderson 
> ---
>  target/hexagon/cpu.h | 9 +++++++++
>  1 file changed, 9 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 29/43] target/ppc: Use MO_128 for 16 byte atomics

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> Cc: qemu-...@nongnu.org
> Signed-off-by: Richard Henderson 
> ---
>  target/ppc/translate.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 27/43] target/arm: Use MO_128 for 16 byte atomics

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> Cc: qemu-...@nongnu.org
> Signed-off-by: Richard Henderson 
> ---
>  target/arm/helper-a64.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 24/43] accel/tcg: Pass MemOpIdx to atomic_trace_*_post

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> We will shortly use the MemOpIdx directly, but in the meantime
> re-compute the trace meminfo.
> 
> Signed-off-by: Richard Henderson 
> ---
>  accel/tcg/atomic_template.h   | 48 +--
>  accel/tcg/atomic_common.c.inc | 30 +++---
>  2 files changed, 39 insertions(+), 39 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 30/43] target/s390x: Use MO_128 for 16 byte atomics

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> Cc: qemu-s3...@nongnu.org
> Signed-off-by: Richard Henderson 
> ---
>  target/s390x/tcg/mem_helper.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 v3 03/11] machine: Set the value of cpus to match maxcpus if it's omitted

2021-07-28 Thread wangyanan (Y)

On 2021/7/29 4:22, Andrew Jones wrote:

On Wed, Jul 28, 2021 at 11:48:40AM +0800, Yanan Wang wrote:

Currently we directly calculate the omitted cpus based on the given
incomplete collection of parameters. This makes some cmdlines like:
   -smp maxcpus=16
   -smp sockets=2,maxcpus=16
   -smp sockets=2,dies=2,maxcpus=16
   -smp sockets=2,cores=4,maxcpus=16
not work. We should probably set the value of cpus to match maxcpus
if it's omitted, which will make the above configs start to work.

So the calculation logic of cpus/maxcpus after this patch will be:
When both maxcpus and cpus are omitted, maxcpus will be calculated
from the given parameters and cpus will be set equal to maxcpus.
When only one of maxcpus and cpus is given then the omitted one
will be set to its counterpart's value. Both maxcpus and cpus may
be specified, but maxcpus must be equal to or greater than cpus.

Note: the change in this patch won't affect any existing working cmdlines,
but it allows more incomplete configs to be valid.

Signed-off-by: Yanan Wang 
---
  hw/core/machine.c | 29 -
  hw/i386/pc.c  | 29 -
  qemu-options.hx   | 11 ---
  3 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 69979c93dd..958e6e7107 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -755,25 +755,28 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp)
  }
  
  /* compute missing values, prefer sockets over cores over threads */

-maxcpus = maxcpus > 0 ? maxcpus : cpus;
-
-if (cpus == 0) {
+if (cpus == 0 && maxcpus == 0) {
  sockets = sockets > 0 ? sockets : 1;
  cores = cores > 0 ? cores : 1;
  threads = threads > 0 ? threads : 1;
-cpus = sockets * cores * threads;
+} else {
  maxcpus = maxcpus > 0 ? maxcpus : cpus;
-} else if (sockets == 0) {
-cores = cores > 0 ? cores : 1;
-threads = threads > 0 ? threads : 1;
-sockets = maxcpus / (cores * threads);
-} else if (cores == 0) {
-threads = threads > 0 ? threads : 1;
-cores = maxcpus / (sockets * threads);
-} else if (threads == 0) {
-threads = maxcpus / (sockets * cores);
+
+if (sockets == 0) {
+cores = cores > 0 ? cores : 1;
+threads = threads > 0 ? threads : 1;
+sockets = maxcpus / (cores * threads);
+} else if (cores == 0) {
+threads = threads > 0 ? threads : 1;
+cores = maxcpus / (sockets * threads);
+} else if (threads == 0) {
+threads = maxcpus / (sockets * cores);
+}
  }
  
+maxcpus = maxcpus > 0 ? maxcpus : sockets * cores * threads;

+cpus = cpus > 0 ? cpus : maxcpus;
+
  if (sockets * cores * threads < cpus) {
  error_setg(errp, "cpu topology: "
 "sockets (%u) * cores (%u) * threads (%u) < "
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a9ff9ef52c..9ad7ae5254 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -725,25 +725,28 @@ static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error **err
  dies = dies > 0 ? dies : 1;
  
  /* compute missing values, prefer sockets over cores over threads */

-maxcpus = maxcpus > 0 ? maxcpus : cpus;
-
-if (cpus == 0) {
+if (cpus == 0 && maxcpus == 0) {
  sockets = sockets > 0 ? sockets : 1;
  cores = cores > 0 ? cores : 1;
  threads = threads > 0 ? threads : 1;
-cpus = sockets * dies * cores * threads;
+} else {
  maxcpus = maxcpus > 0 ? maxcpus : cpus;
-} else if (sockets == 0) {
-cores = cores > 0 ? cores : 1;
-threads = threads > 0 ? threads : 1;
-sockets = maxcpus / (dies * cores * threads);
-} else if (cores == 0) {
-threads = threads > 0 ? threads : 1;
-cores = maxcpus / (sockets * dies * threads);
-} else if (threads == 0) {
-threads = maxcpus / (sockets * dies * cores);
+
+if (sockets == 0) {
+cores = cores > 0 ? cores : 1;
+threads = threads > 0 ? threads : 1;
+sockets = maxcpus / (dies * cores * threads);
+} else if (cores == 0) {
+threads = threads > 0 ? threads : 1;
+cores = maxcpus / (sockets * dies * threads);
+} else if (threads == 0) {
+threads = maxcpus / (sockets * dies * cores);
+}
  }
  
+maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads;

+cpus = cpus > 0 ? cpus : maxcpus;
+
  if (sockets * dies * cores * threads < cpus) {
  error_setg(errp, "cpu topology: "
 "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < "
diff --git a/qemu-options.hx b/qemu-options.hx
index e077aa80bf..0912236b4b 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -210,9 +210,14 @@ SRST
  Simulate a SMP system with '\ ``n``\ ' CPUs initially pres

Re: [PATCH for-6.1? 23/43] accel/tcg: Remove double bswap for helper_atomic_sto_*_mmu

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> This crept in as either a cut-and-paste error, or rebase error.
> 
> Fixes: cfec388518d
> Signed-off-by: Richard Henderson 
> ---
>  accel/tcg/atomic_template.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
> index 4427fab6df..4230ff2957 100644
> --- a/accel/tcg/atomic_template.h
> +++ b/accel/tcg/atomic_template.h
> @@ -251,7 +251,6 @@ void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
>   PAGE_WRITE, retaddr);
>  uint16_t info = atomic_trace_st_pre(env, addr, oi);
>  
> -val = BSWAP(val);
>  val = BSWAP(val);
>  atomic16_set(haddr, val);
>  ATOMIC_MMU_CLEANUP;

Why not merge this for 6.1? Because old bug, no regression?

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 21/43] tcg: Split out MemOpIdx to exec/memopidx.h

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> Move this code from tcg/tcg.h to its own header.
> 
> Signed-off-by: Richard Henderson 
> ---
>  include/exec/memopidx.h | 55 +
>  include/tcg/tcg.h   | 39 +
>  2 files changed, 56 insertions(+), 38 deletions(-)
>  create mode 100644 include/exec/memopidx.h

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 20/43] tcg: Rename TCGMemOpIdx to MemOpIdx

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> We're about to move this out of tcg.h, so rename it
> as we did when moving MemOp.
> 
> Signed-off-by: Richard Henderson 
> ---
>  accel/tcg/atomic_template.h   | 24 +--
>  include/tcg/tcg.h | 74 -
>  accel/tcg/cputlb.c| 78 +--
>  accel/tcg/user-exec.c |  2 +-
>  target/arm/helper-a64.c   | 16 +++
>  target/arm/m_helper.c |  2 +-
>  target/i386/tcg/mem_helper.c  |  4 +-
>  target/m68k/op_helper.c   |  2 +-
>  target/mips/tcg/msa_helper.c  |  6 +--
>  target/s390x/tcg/mem_helper.c | 20 -
>  target/sparc/ldst_helper.c|  2 +-
>  tcg/optimize.c|  2 +-
>  tcg/tcg-op.c  | 12 +++---
>  tcg/tcg.c |  2 +-
>  tcg/tci.c | 14 +++
>  accel/tcg/atomic_common.c.inc |  6 +--
>  tcg/aarch64/tcg-target.c.inc  | 14 +++
>  tcg/arm/tcg-target.c.inc  | 10 ++---
>  tcg/i386/tcg-target.c.inc | 10 ++---
>  tcg/mips/tcg-target.c.inc | 12 +++---
>  tcg/ppc/tcg-target.c.inc  | 10 ++---
>  tcg/riscv/tcg-target.c.inc| 16 +++
>  tcg/s390/tcg-target.c.inc | 10 ++---
>  tcg/sparc/tcg-target.c.inc|  4 +-
>  tcg/tcg-ldst.c.inc|  2 +-
>  25 files changed, 177 insertions(+), 177 deletions(-)

Maybe mention "mechanical change using sed".

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 v3 01/11] machine: Minor refactor/cleanup for the smp parsers

2021-07-28 Thread wangyanan (Y)

On 2021/7/29 4:16, Andrew Jones wrote:

On Wed, Jul 28, 2021 at 11:48:38AM +0800, Yanan Wang wrote:

To pave the way for the functional improvement in later patches,
make some refactoring/cleanup of the smp parsers, including using a
local maxcpus instead of ms->smp.max_cpus in the calculation,
defaulting dies to 0 initially like the other members, and cleaning
up the sanity check for dies.

No functional change intended.

Signed-off-by: Yanan Wang 
---
  hw/core/machine.c | 19 +++
  hw/i386/pc.c  | 23 ++-
  2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index e1533dfc47..ffc0629854 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -747,9 +747,11 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp)
  unsigned sockets = config->has_sockets ? config->sockets : 0;
  unsigned cores   = config->has_cores ? config->cores : 0;
  unsigned threads = config->has_threads ? config->threads : 0;
+unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
  
-if (config->has_dies && config->dies != 0 && config->dies != 1) {

+if (config->has_dies && config->dies > 1) {
  error_setg(errp, "dies not supported by this machine's CPU topology");
+return;
  }
  
  /* compute missing values, prefer sockets over cores over threads */

@@ -760,8 +762,8 @@ static void smp_parse(MachineState *ms, SMPConfiguration *config, Error **errp)
  sockets = sockets > 0 ? sockets : 1;
  cpus = cores * threads * sockets;
  } else {
-ms->smp.max_cpus = config->has_maxcpus ? config->maxcpus : cpus;
-sockets = ms->smp.max_cpus / (cores * threads);
+maxcpus = maxcpus > 0 ? maxcpus : cpus;
+sockets = maxcpus / (sockets * cores);

Should be divided by (cores * threads) like before.

Absolutely... Will fix it.

Thanks,
Yanan

Thanks,
drew






Re: [PATCH for-6.2 19/43] tcg: Expand MO_SIZE to 3 bits

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> We have lacked expressive support for memory sizes larger
> than 64-bits for a while.  Fixing that requires adjustment
> to several points where we used this for array indexing,
> and two places that develop -Wswitch warnings after the change.
> 
> Signed-off-by: Richard Henderson 
> ---
>  include/exec/memop.h| 14 +-
>  target/arm/translate-a64.c  |  2 +-
>  tcg/tcg-op.c| 13 -
>  target/s390x/tcg/translate_vx.c.inc |  2 +-
>  tcg/aarch64/tcg-target.c.inc|  4 ++--
>  tcg/arm/tcg-target.c.inc|  4 ++--
>  tcg/i386/tcg-target.c.inc   |  4 ++--
>  tcg/mips/tcg-target.c.inc   |  4 ++--
>  tcg/ppc/tcg-target.c.inc|  8 
>  tcg/riscv/tcg-target.c.inc  |  4 ++--
>  tcg/s390/tcg-target.c.inc   |  4 ++--
>  tcg/sparc/tcg-target.c.inc  | 16 
>  12 files changed, 43 insertions(+), 36 deletions(-)

Nice cleanup.

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 01/43] hw/core: Make do_unaligned_access available to user-only

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> We shouldn't be ignoring SIGBUS for user-only.
> Move our existing TCGCPUOps hook out from CONFIG_SOFTMMU.
> 
> Signed-off-by: Richard Henderson 
> ---
>  include/hw/core/tcg-cpu-ops.h | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
> index eab27d0c03..513d6bfe72 100644
> --- a/include/hw/core/tcg-cpu-ops.h
> +++ b/include/hw/core/tcg-cpu-ops.h
> @@ -60,6 +60,13 @@ struct TCGCPUOps {
>  /** @debug_excp_handler: Callback for handling debug exceptions */
>  void (*debug_excp_handler)(CPUState *cpu);
>  
> +/**
> + * @do_unaligned_access: Callback for unaligned access handling
> + */
> +void (*do_unaligned_access)(CPUState *cpu, vaddr addr,
> +MMUAccessType access_type,
> +int mmu_idx, uintptr_t retaddr);

Shouldn't it be QEMU_NORETURN?



Re: [PATCH for-6.2 13/43] target/sparc: Remove DEBUG_UNALIGNED

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> The printf should have been qemu_log_mask, the parameters
> themselves no longer compile, and because this is placed
> before unwinding the PC is actively wrong.
> 
> We get better (and correct) logging on the other side of
> raising the exception, in sparc_cpu_do_interrupt.
> 
> Cc: Mark Cave-Ayland 
> Signed-off-by: Richard Henderson 
> ---
>  target/sparc/ldst_helper.c | 9 ---------
>  1 file changed, 9 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 11/43] target/sh4: Set fault address in superh_cpu_do_unaligned_access

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> We ought to have been recording the virtual address for reporting
> to the guest trap handler.
> 
> Cc: Yoshinori Sato 
> Signed-off-by: Richard Henderson 
> ---
>  target/sh4/op_helper.c | 5 +++++
>  1 file changed, 5 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 01/43] hw/core: Make do_unaligned_access available to user-only

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> We shouldn't be ignoring SIGBUS for user-only.
> Move our existing TCGCPUOps hook out from CONFIG_SOFTMMU.
> 
> Signed-off-by: Richard Henderson 
> ---
>  include/hw/core/tcg-cpu-ops.h | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH for-6.2 00/43] Unaligned accesses for user-only

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/29/21 2:46 AM, Richard Henderson wrote:
> This began with Peter wanting a cpu_ldst.h interface that can handle
> alignment info for Arm M-profile system mode, which will also compile
> for user-only without ifdefs.  This is patch 32.
> 
> Once I had that interface, I thought I might as well enforce the
> requested alignment in user-only.  There are plenty of cases where
> we ought to have been doing that for quite a while.  This took rather
> more work than I imagined to start.
> 
> So far only x86 host has been fully converted to handle unaligned
> operations in user-only mode.  I'll get to the others later.  But
> the added testcase is fairly broad, and caught lots of bugs and/or
> missing code between target/ and linux-user/.
> 
> Notes:
>   * For target/i386 we have no way to signal SIGBUS from user-only.
> In theory we could go through do_unaligned_access in system mode,
> via #AC.  But we don't even implement that control in tcg, probably
> because no one ever sets it.  The cmpxchg16b insn requires alignment,
> but raises #GP, which maps to SIGSEGV.
> 
>   * For target/s390x we have no way to signal SIGBUS from user-only.
> The atomic operations raise PGM_SPECIFICATION, which the linux
> kernel maps to SIGILL.
> 
>   * I think target/hexagon should be setting TARGET_ALIGNED_ONLY=y.
> In the meantime, all memory accesses are allowed to be unaligned.

Now I better understand what you tried to explain to me last time with
TCGCPUOps. Since Claudio was also involved, Cc'ing him (not asking
for a review, just in case he wants to follow up).



Re: [PATCH v3] hw/acpi: add an assertion check for non-null return from acpi_get_i386_pci_host

2021-07-28 Thread Ani Sinha



On Thu, 29 Jul 2021, Ani Sinha wrote:

>
>
> On Wed, 28 Jul 2021, Michael S. Tsirkin wrote:
>
> > On Mon, Jul 26, 2021 at 10:27:43PM +0530, Ani Sinha wrote:
> > > All existing code using acpi_get_i386_pci_host() checks for a non-null
> > > return value from this function call. Instead of returning early when
> > > the value returned is NULL, assert instead. Since there are only two
> > > possible host buses for i386 - q35 and i440fx, a null value return from
> > > the function does not make sense in most cases and is likely an error
> > > situation.
> >
> > add "on i386"?
> >
> > > Fixes: c0e427d6eb5fef ("hw/acpi/ich9: Enable ACPI PCI hot-plug")
> >
> > This tag seems inappropriate, this is not a bugfix.
> >
>
> Forgot to answer this. I started this patch because I saw a gap that was
> introduced with the above patch. In acpi_pcihp_disable_root_bus(), Julia's
> code did not check for null return value from acpi_get_i386_pci_host().
> See v2. Hence, I added the fixes tag. Then Igor suggested that I assert
> instead and I also thought perhaps assertion is a better idea. Hence v3. I
am now conflicted after reading your argument. We should assert only when
> a certain invariant is always respected. Otherwise we should not assert.
> If you think acpi_get_i386_pci_host() can be called from non-i386 path as
> well, maybe v2 approach is better.

Also I should point out that at this moment, only ich9 and piix4 end up
calling acpi_pcihp_disable_root_bus(). Hence, we are ok either way for
now. In the future, if other archs end of calling this function, then the
question is, do we gracefully fail by simply returning in case of null
host bridge or do we assert? In its current form, it will ungracefully
crash somewhere.




Re: [PATCH] gitlab-ci.d/custom-runners: Improve rules for the staging branch

2021-07-28 Thread Thomas Huth

On 28/07/2021 20.26, Philippe Mathieu-Daudé wrote:

On 7/28/21 7:38 PM, Thomas Huth wrote:

If maintainers are currently pushing to a branch called "staging"
in their repository, they are ending up with some stuck jobs - unless
they have an s390x CI runner machine available. That's ugly; we should
make sure that the related jobs are really only started if such a
runner is available. So let's only run these jobs if it's the
"staging" branch of the main repository of the QEMU project (where
we can be sure that the s390x runner is available), or if the user
explicitly set a S390X_RUNNER_AVAILABLE variable in their CI configs
to declare that they have such a runner available, too.

Fixes: 4799c21023 ("Jobs based on custom runners: add job definitions ...")
Signed-off-by: Thomas Huth 
---
  .gitlab-ci.d/custom-runners.yml | 40 ++++++++++++++++++++++++++++------------
  1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
index 061d3cdfed..564b94565d 100644
--- a/.gitlab-ci.d/custom-runners.yml
+++ b/.gitlab-ci.d/custom-runners.yml
@@ -24,7 +24,8 @@ ubuntu-18.04-s390x-all-linux-static:
   - ubuntu_18.04
   - s390x
   rules:
- - if: '$CI_COMMIT_BRANCH =~ /^staging/'
+ - if: '$CI_PROJECT_NAMESPACE == "qemu-project" && $CI_COMMIT_BRANCH =~ /^staging/'
+ - if: "$S390X_RUNNER_AVAILABLE"


If you base this patch on top of "docs: Document GitLab
custom CI/CD variables" that you already queued, you can
directly add a description for S390X_RUNNER_AVAILABLE in
docs/devel/ci.rst, but this can be done later too.


Good idea! But I really want to get this out of the door to finally get a 
usable gitlab-CI again, so I'll rather send a patch for this later.


 Thanks,
  Thomas




Re: [PATCH] tests: Fix migration-test build failure for sparc

2021-07-28 Thread Thomas Huth

On 28/07/2021 23.41, Peter Xu wrote:

Even if <linux/kvm.h> seems to exist for all archs on Linux, including it
with only __linux__ defined does not work yet, as it will try to include
asm/kvm.h, which can be missing for archs that do not support KVM.

To fix this (instead of any attempt to fix the Linux headers..), we can mark
the header as x86_64-only, because so far it only serves the KVM dirty ring
test.

No "Fixes" tag is needed, as the issue was introduced only very recently.

Reported-by: Richard Henderson 
Signed-off-by: Peter Xu 
---
  tests/qtest/migration-test.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 1e8b7784ef..cc5e83d98a 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -27,7 +27,8 @@
  #include "migration-helpers.h"
  #include "tests/migration/migration-test.h"
  
-#if defined(__linux__)

+/* For dirty ring test; so far only x86_64 is supported */
+#if defined(__linux__) && defined(HOST_X86_64)
  #include "linux/kvm.h"
  #endif
  
@@ -1395,7 +1396,7 @@ static void test_multifd_tcp_cancel(void)
  
  static bool kvm_dirty_ring_supported(void)

  {
-#if defined(__linux__)
+#if defined(__linux__) && defined(HOST_X86_64)
  int ret, kvm_fd = open("/dev/kvm", O_RDONLY);
  
  if (kvm_fd < 0) {




Acked-by: Thomas Huth 

Juan, Dave, if you don't mind I can take this through my testing branch - 
I'm planning to send a pull request today anyway.


 Thomas




[Bug 1891748] Re: qemu-arm-static 5.1 can't run gcc

2021-07-28 Thread Maxim Devaev
Sup?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1891748

Title:
  qemu-arm-static 5.1 can't run gcc

Status in QEMU:
  Fix Released
Status in Juju Charms Collection:
  New

Bug description:
  Issue discovered while trying to build pikvm (1)

  Long story short: when using qemu-arm-static 5.1, gcc exits with the
  message:

  Allocating guest commpage: Operation not permitted

  
  when using qemu-arm-static v5.0, gcc "works"

  Steps to reproduce will follow

  (1)  https://github.com/pikvm/pikvm/blob/master/pages/building_os.md

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1891748/+subscriptions




Re: [PATCH v3] hw/acpi: add an assertion check for non-null return from acpi_get_i386_pci_host

2021-07-28 Thread Ani Sinha



On Wed, 28 Jul 2021, Michael S. Tsirkin wrote:

> On Mon, Jul 26, 2021 at 10:27:43PM +0530, Ani Sinha wrote:
> > All existing code using acpi_get_i386_pci_host() checks for a non-null
> > return value from this function call. Instead of returning early when
> > the value returned is NULL, assert instead. Since there are only two
> > possible host buses for i386 - q35 and i440fx, a null value return from
> > the function does not make sense in most cases and is likely an error
> > situation.
>
> add "on i386"?
>
> > Fixes: c0e427d6eb5fef ("hw/acpi/ich9: Enable ACPI PCI hot-plug")
>
> This tag seems inappropriate, this is not a bugfix.
>

Forgot to answer this. I started this patch because I saw a gap that was
introduced with the above patch. In acpi_pcihp_disable_root_bus(), Julia's
code did not check for null return value from acpi_get_i386_pci_host().
See v2. Hence, I added the fixes tag. Then Igor suggested that I assert
instead and I also thought perhaps assertion is a better idea. Hence v3. I
am now conflicted after reading your argument. We should assert only when
a certain invariant is always respected. Otherwise we should not assert.
If you think acpi_get_i386_pci_host() can be called from non-i386 path as
well, maybe v2 approach is better.





Re: [PATCH V5 23/25] chardev: cpr for sockets

2021-07-28 Thread Zheng Chuan
Hi.

On 2021/7/8 1:20, Steve Sistare wrote:
> Save accepted socket fds in the environment before cprsave, and look for
> fds in the environment after cprload.  Reject cprexec if a socket enables
> the TLS or websocket option.  Allow a monitor socket by closing it on exec.
> 
> Signed-off-by: Mark Kanda 
> Signed-off-by: Steve Sistare 
> ---
>  chardev/char-socket.c | 31 +++++++++++++++++++++++++++++++
>  monitor/hmp.c |  3 +++
>  monitor/qmp.c |  3 +++
>  3 files changed, 37 insertions(+)
> 
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index d0fb545..dc9da8c 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -27,7 +27,9 @@
>  #include "io/channel-socket.h"
>  #include "io/channel-tls.h"
>  #include "io/channel-websock.h"
> +#include "qemu/env.h"
>  #include "io/net-listener.h"
> +#include "qemu/env.h"
duplicated include.

>  #include "qemu/error-report.h"
>  #include "qemu/module.h"
>  #include "qemu/option.h"
> @@ -414,6 +416,7 @@ static void tcp_chr_free_connection(Chardev *chr)
>  SocketChardev *s = SOCKET_CHARDEV(chr);
>  int i;
>  
> +unsetenv_fd(chr->label);
>  if (s->read_msgfds_num) {
>  for (i = 0; i < s->read_msgfds_num; i++) {
>  close(s->read_msgfds[i]);
> @@ -976,6 +979,10 @@ static void tcp_chr_accept(QIONetListener *listener,
> QIO_CHANNEL(cioc));
>  }
>  tcp_chr_new_client(chr, cioc);
> +
> +if (s->sioc && !chr->close_on_cpr) {
> +setenv_fd(chr->label, s->sioc->fd);
> +}
>  }
>  
>  
> @@ -1231,6 +1238,24 @@ static gboolean socket_reconnect_timeout(gpointer opaque)
>  return false;
>  }
>  
> +static void load_char_socket_fd(Chardev *chr, Error **errp)
> +{
> +SocketChardev *sockchar = SOCKET_CHARDEV(chr);
> +QIOChannelSocket *sioc;
> +int fd = getenv_fd(chr->label);
> +
> +if (fd != -1) {
> +sockchar = SOCKET_CHARDEV(chr);
> +sioc = qio_channel_socket_new_fd(fd, errp);
> +if (sioc) {
> +tcp_chr_accept(sockchar->listener, sioc, chr);
> +object_unref(OBJECT(sioc));
> +} else {
> +error_setg(errp, "error: could not restore socket for %s",
> +   chr->label);
> +}
> +}
> +}
>  
>  static int qmp_chardev_open_socket_server(Chardev *chr,
>bool is_telnet,
> @@ -1435,6 +1460,10 @@ static void qmp_chardev_open_socket(Chardev *chr,
>  }
>  s->registered_yank = true;
>  
> +if (!s->tls_creds && !s->is_websock) {
> +qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_CPR);
> +}
> +
>  /* be isn't opened until we get a connection */
>  *be_opened = false;
>  
> @@ -1450,6 +1479,8 @@ static void qmp_chardev_open_socket(Chardev *chr,
>  return;
>  }
>  }
> +
> +load_char_socket_fd(chr, errp);
>  }
>  
>  static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
> diff --git a/monitor/hmp.c b/monitor/hmp.c
> index 6c0b33a..63700b3 100644
> --- a/monitor/hmp.c
> +++ b/monitor/hmp.c
> @@ -1451,4 +1451,7 @@ void monitor_init_hmp(Chardev *chr, bool use_readline, Error **errp)
>  qemu_chr_fe_set_handlers(&mon->common.chr, monitor_can_read, 
> monitor_read,
>   monitor_event, NULL, &mon->common, NULL, true);
>  monitor_list_append(&mon->common);
> +
> +/* monitor cannot yet be preserved across cpr */
> +chr->close_on_cpr = true;
>  }
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index 092c527..21a90bf 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -535,4 +535,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
>   NULL, &mon->common, NULL, true);
>  monitor_list_append(&mon->common);
>  }
> +
> +/* Monitor cannot yet be preserved across cpr */
> +chr->close_on_cpr = true;
>  }
> 

-- 
Regards.
Chuan



[PATCH 2/3] hw/mips/boston: Allow loading elf kernel and dtb

2021-07-28 Thread Jiaxun Yang
An ELF kernel makes debugging much easier, thanks to DWARF symbols.

Signed-off-by: Jiaxun Yang 
---
 hw/mips/boston.c | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index a5746ede65..42b31a1ce4 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -20,6 +20,7 @@
 #include "qemu/osdep.h"
 #include "qemu/units.h"
 
+#include "elf.h"
 #include "hw/boards.h"
 #include "hw/char/serial.h"
 #include "hw/ide/pci.h"
@@ -546,10 +547,39 @@ static void boston_mach_init(MachineState *machine)
 exit(1);
 }
 } else if (machine->kernel_filename) {
-fit_err = load_fit(&boston_fit_loader, machine->kernel_filename, s);
-if (fit_err) {
-error_report("unable to load FIT image");
-exit(1);
+uint64_t kernel_entry, kernel_low, kernel_high, kernel_size;
+
+kernel_size = load_elf(machine->kernel_filename, NULL,
+   cpu_mips_kseg0_to_phys, NULL,
+   (uint64_t *)&kernel_entry,
+   (uint64_t *)&kernel_low, (uint64_t *)&kernel_high,
+   NULL, 0, EM_MIPS, 1, 0);
+
+if (kernel_size) {
+hwaddr dtb_paddr = QEMU_ALIGN_UP(kernel_high, 64 * KiB);
+hwaddr dtb_vaddr = cpu_mips_phys_to_kseg0(NULL, dtb_paddr);
+
+s->kernel_entry = kernel_entry;
+if (machine->dtb) {
+int dt_size;
+const void *dtb_file_data, *dtb_load_data;
+
+dtb_file_data = load_device_tree(machine->dtb, &dt_size);
+dtb_load_data = boston_fdt_filter(s, dtb_file_data, NULL, &dtb_vaddr);
+
+/* Calculate real fdt size after filter */
+dt_size = fdt_totalsize(dtb_load_data);
+rom_add_blob_fixed("dtb", dtb_load_data, dt_size, dtb_paddr);
+g_free((void *) dtb_file_data);
+g_free((void *) dtb_load_data);
+}
+} else {
+/* Try to load file as FIT */
+fit_err = load_fit(&boston_fit_loader, machine->kernel_filename, 
s);
+if (fit_err) {
+error_report("unable to load kernel image");
+exit(1);
+}
 }
 
 gen_firmware(memory_region_get_ram_ptr(flash) + 0x7c0,
-- 
2.32.0




[PATCH 3/3] hw/mips/boston: Add FDT generator

2021-07-28 Thread Jiaxun Yang
Generate an FDT on our own if no dtb argument is supplied, and avoid
introducing unused devices into the FDT when the user supplies their own dtb.

Signed-off-by: Jiaxun Yang 
---
 hw/mips/boston.c | 238 +--
 1 file changed, 228 insertions(+), 10 deletions(-)

diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index 42b31a1ce4..aaa79b9da7 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -49,6 +49,13 @@ typedef struct BostonState BostonState;
 DECLARE_INSTANCE_CHECKER(BostonState, BOSTON,
  TYPE_BOSTON)
 
+#define FDT_IRQ_TYPE_NONE   0
+#define FDT_IRQ_TYPE_LEVEL_HIGH 4
+#define FDT_GIC_SHARED  0
+#define FDT_GIC_LOCAL   1
+#define FDT_BOSTON_CLK_SYS  1
+#define FDT_BOSTON_CLK_CPU  2
+
 struct BostonState {
 SysBusDevice parent_obj;
 
@@ -435,6 +442,214 @@ xilinx_pcie_init(MemoryRegion *sys_mem, uint32_t bus_nr,
 return XILINX_PCIE_HOST(dev);
 }
 
+
+static void fdt_create_pcie(void *fdt, int gic_ph, int irq, hwaddr reg_base,
+hwaddr reg_size, hwaddr mmio_base, hwaddr mmio_size)
+{
+int i;
+char *name, *intc_name;
+uint32_t intc_ph;
+uint32_t interrupt_map[4][6];
+
+intc_ph = qemu_fdt_alloc_phandle(fdt);
+name = g_strdup_printf("/soc/pci@%lx", (long)reg_base);
+qemu_fdt_add_subnode(fdt, name);
+qemu_fdt_setprop_string(fdt, name, "compatible", "xlnx,axi-pcie-host-1.00.a");
+qemu_fdt_setprop_string(fdt, name, "device_type", "pci");
+qemu_fdt_setprop_cells(fdt, name, "reg", reg_base, reg_size);
+
+qemu_fdt_setprop_cell(fdt, name, "#address-cells", 3);
+qemu_fdt_setprop_cell(fdt, name, "#size-cells", 2);
+qemu_fdt_setprop_cell(fdt, name, "#interrupt-cells", 1);
+
+qemu_fdt_setprop_cell(fdt, name, "interrupt-parent", gic_ph);
+qemu_fdt_setprop_cells(fdt, name, "interrupts", FDT_GIC_SHARED, irq,
+FDT_IRQ_TYPE_LEVEL_HIGH);
+
+qemu_fdt_setprop_cells(fdt, name, "ranges", 0x0200, 0, mmio_base,
+mmio_base, 0, mmio_size);
+qemu_fdt_setprop_cells(fdt, name, "bus-range", 0x00, 0xff);
+
+
+
+intc_name = g_strdup_printf("%s/interrupt-controller", name);
+qemu_fdt_add_subnode(fdt, intc_name);
+qemu_fdt_setprop(fdt, intc_name, "interrupt-controller", NULL, 0);
+qemu_fdt_setprop_cell(fdt, intc_name, "#address-cells", 0);
+qemu_fdt_setprop_cell(fdt, intc_name, "#interrupt-cells", 1);
+qemu_fdt_setprop_cell(fdt, intc_name, "phandle", intc_ph);
+
+qemu_fdt_setprop_cells(fdt, name, "interrupt-map-mask", 0, 0, 0, 7);
+for (i = 0; i < 4; i++) {
+uint32_t *irqmap = interrupt_map[i];
+
+irqmap[0] = cpu_to_be32(0);
+irqmap[1] = cpu_to_be32(0);
+irqmap[2] = cpu_to_be32(0);
+irqmap[3] = cpu_to_be32(i + 1);
+irqmap[4] = cpu_to_be32(intc_ph);
+irqmap[5] = cpu_to_be32(i + 1);
+}
+qemu_fdt_setprop(fdt, name, "interrupt-map", &interrupt_map, sizeof(interrupt_map));
+
+g_free(intc_name);
+g_free(name);
+}
+
+static const void *create_fdt(BostonState *s, const MemMapEntry *memmap, int *dt_size)
+{
+void *fdt;
+int cpu;
+MachineState *mc = s->mach;
+uint32_t platreg_ph, gic_ph, clk_ph;
+char *name, *gic_name, *platreg_name, *stdout_name;
+
+fdt = create_device_tree(dt_size);
+if (!fdt) {
+error_report("create_device_tree() failed");
+exit(1);
+}
+
+platreg_ph = qemu_fdt_alloc_phandle(fdt);
+gic_ph = qemu_fdt_alloc_phandle(fdt);
+clk_ph = qemu_fdt_alloc_phandle(fdt);
+
+qemu_fdt_setprop_string(fdt, "/", "model", "img,boston");
+qemu_fdt_setprop_string(fdt, "/", "compatible", "img,boston");
+qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x1);
+qemu_fdt_setprop_cell(fdt, "/", "#address-cells", 0x1);
+
+
+qemu_fdt_add_subnode(fdt, "/cpus");
+qemu_fdt_setprop_cell(fdt, "/cpus", "#size-cells", 0x0);
+qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
+
+for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
+name = g_strdup_printf("/cpus/cpu@%d", cpu);
+qemu_fdt_add_subnode(fdt, name);
+qemu_fdt_setprop_string(fdt, name, "compatible", "img,mips");
+qemu_fdt_setprop_string(fdt, name, "status", "okay");
+qemu_fdt_setprop_cell(fdt, name, "reg", cpu);
+qemu_fdt_setprop_string(fdt, name, "device_type", "cpu");
+qemu_fdt_setprop_cells(fdt, name, "clocks", clk_ph, FDT_BOSTON_CLK_CPU);
+g_free(name);
+}
+
+qemu_fdt_add_subnode(fdt, "/soc");
+qemu_fdt_setprop(fdt, "/soc", "ranges", NULL, 0);
+qemu_fdt_setprop_string(fdt, "/soc", "compatible", "simple-bus");
+qemu_fdt_setprop_cell(fdt, "/soc", "#size-cells", 0x1);
+qemu_fdt_setprop_cell(fdt, "/soc", "#address-cells", 0x1);
+
+fdt_create_pcie(fdt, gic_ph, 2, memmap[BOSTON_PCIE0].base, memmap[BOSTON_PCIE0].size,
+memmap[BOSTON_PCIE0_MMIO].base, memmap[BOSTON_PCIE0_MMIO].size);

[PATCH 1/3] hw/mips/boston: Massage memory map information

2021-07-28 Thread Jiaxun Yang
Use a memmap array to unify the addresses of the memory map.
That allows us to reuse the address information for FDT generation.

Signed-off-by: Jiaxun Yang 
---
 hw/mips/boston.c | 95 
 1 file changed, 71 insertions(+), 24 deletions(-)

diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index 20b06865b2..a5746ede65 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -64,6 +64,44 @@ struct BostonState {
 hwaddr fdt_base;
 };
 
+enum {
+BOSTON_LOWDDR,
+BOSTON_PCIE0,
+BOSTON_PCIE1,
+BOSTON_PCIE2,
+BOSTON_PCIE2_MMIO,
+BOSTON_CM,
+BOSTON_GIC,
+BOSTON_CDMM,
+BOSTON_CPC,
+BOSTON_PLATREG,
+BOSTON_UART,
+BOSTON_LCD,
+BOSTON_FLASH,
+BOSTON_PCIE1_MMIO,
+BOSTON_PCIE0_MMIO,
+BOSTON_HIGHDDR,
+};
+
+static const MemMapEntry boston_memmap[] = {
+    [BOSTON_LOWDDR] =     {        0x0, 0x10000000 },
+    [BOSTON_PCIE0] =      { 0x10000000,  0x2000000 },
+    [BOSTON_PCIE1] =      { 0x12000000,  0x2000000 },
+    [BOSTON_PCIE2] =      { 0x14000000,  0x2000000 },
+    [BOSTON_PCIE2_MMIO] = { 0x16000000,   0x100000 },
+    [BOSTON_CM] =         { 0x16100000,    0x20000 },
+    [BOSTON_GIC] =        { 0x16120000,    0x20000 },
+    [BOSTON_CDMM] =       { 0x16140000,     0x8000 },
+    [BOSTON_CPC] =        { 0x16200000,     0x8000 },
+    [BOSTON_PLATREG] =    { 0x17ffd000,     0x1000 },
+    [BOSTON_UART] =       { 0x17ffe000,     0x1000 },
+    [BOSTON_LCD] =        { 0x17fff000,        0x8 },
+    [BOSTON_FLASH] =      { 0x18000000,  0x8000000 },
+    [BOSTON_PCIE1_MMIO] = { 0x20000000, 0x20000000 },
+    [BOSTON_PCIE0_MMIO] = { 0x40000000, 0x40000000 },
+    [BOSTON_HIGHDDR] =    { 0x80000000,        0x0 },
+};
+
 enum boston_plat_reg {
 PLAT_FPGA_BUILD = 0x00,
 PLAT_CORE_CL= 0x04,
@@ -275,24 +313,22 @@ type_init(boston_register_types)
 
 static void gen_firmware(uint32_t *p, hwaddr kernel_entry, hwaddr fdt_addr)
 {
-const uint32_t cm_base = 0x16100000;
-const uint32_t gic_base = 0x16120000;
-const uint32_t cpc_base = 0x16200000;
-
 /* Move CM GCRs */
 bl_gen_write_ulong(&p,
cpu_mips_phys_to_kseg1(NULL, GCR_BASE_ADDR + GCR_BASE_OFS),
-   cm_base);
+   boston_memmap[BOSTON_CM].base);
 
 /* Move & enable GIC GCRs */
 bl_gen_write_ulong(&p,
-   cpu_mips_phys_to_kseg1(NULL, cm_base + GCR_GIC_BASE_OFS),
-   gic_base | GCR_GIC_BASE_GICEN_MSK);
+   cpu_mips_phys_to_kseg1(NULL,
+boston_memmap[BOSTON_CM].base + GCR_GIC_BASE_OFS),
+   boston_memmap[BOSTON_GIC].base | GCR_GIC_BASE_GICEN_MSK);
 
 /* Move & enable CPC GCRs */
 bl_gen_write_ulong(&p,
-   cpu_mips_phys_to_kseg1(NULL, cm_base + GCR_CPC_BASE_OFS),
-   cpc_base | GCR_CPC_BASE_CPCEN_MSK);
+   cpu_mips_phys_to_kseg1(NULL,
+boston_memmap[BOSTON_CM].base + GCR_CPC_BASE_OFS),
+   boston_memmap[BOSTON_CPC].base | GCR_CPC_BASE_CPCEN_MSK);
 
 /*
  * Setup argument registers to follow the UHI boot protocol:
@@ -333,8 +369,9 @@ static const void *boston_fdt_filter(void *opaque, const void *fdt_orig,
 ram_low_sz = MIN(256 * MiB, machine->ram_size);
 ram_high_sz = machine->ram_size - ram_low_sz;
 qemu_fdt_setprop_sized_cells(fdt, "/memory@0", "reg",
- 1, 0x00000000, 1, ram_low_sz,
- 1, 0x90000000, 1, ram_high_sz);
+ 1, boston_memmap[BOSTON_LOWDDR].base, 1, ram_low_sz,
+ 1, boston_memmap[BOSTON_HIGHDDR].base + ram_low_sz,
+ 1, ram_high_sz);
 
 fdt = g_realloc(fdt, fdt_totalsize(fdt));
 qemu_fdt_dumpdtb(fdt, fdt_sz);
@@ -438,11 +475,13 @@ static void boston_mach_init(MachineState *machine)
 sysbus_mmio_map_overlap(SYS_BUS_DEVICE(&s->cps), 0, 0, 1);
 
 flash =  g_new(MemoryRegion, 1);
-memory_region_init_rom(flash, NULL, "boston.flash", 128 * MiB,
+memory_region_init_rom(flash, NULL, "boston.flash", boston_memmap[BOSTON_FLASH].size,
&error_fatal);
-memory_region_add_subregion_overlap(sys_mem, 0x18000000, flash, 0);
+memory_region_add_subregion_overlap(sys_mem, boston_memmap[BOSTON_FLASH].base,
+flash, 0);
 
-memory_region_add_subregion_overlap(sys_mem, 0x80000000, machine->ram, 0);
+memory_region_add_subregion_overlap(sys_mem, boston_memmap[BOSTON_HIGHDDR].base,
+machine->ram, 0);
 
 ddr_low_alias = g_new(MemoryRegion, 1);
 memory_region_init_alias(ddr_low_alias, NULL, "boston_low.ddr",
@@ -451,32 +490,40 @@ static void boston_mach_init(MachineState *machine)
 me

[PATCH 0/3] hw/mips/boston: ELF kernel support

2021-07-28 Thread Jiaxun Yang
Jiaxun Yang (3):
  hw/mips/boston: Massage memory map information
  hw/mips/boston: Allow loading elf kernel and dtb
  hw/mips/boston: Add FDT generator

 hw/mips/boston.c | 351 +++
 1 file changed, 323 insertions(+), 28 deletions(-)

-- 
2.32.0




RE: [PATCH for-6.2 31/43] target/hexagon: Implement cpu_mmu_index

2021-07-28 Thread Taylor Simpson



> -Original Message-
> From: Richard Henderson 
> Sent: Wednesday, July 28, 2021 6:47 PM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson 
> Subject: [PATCH for-6.2 31/43] target/hexagon: Implement cpu_mmu_index
> 
> The function is trivial for user-only, but still must be present.
> 
> Cc: Taylor Simpson 
> Signed-off-by: Richard Henderson 
> ---
>  target/hexagon/cpu.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
> index 2855dd3881..bde538fd5c 100644
> --- a/target/hexagon/cpu.h
> +++ b/target/hexagon/cpu.h
> @@ -144,6 +144,15 @@ static inline void cpu_get_tb_cpu_state(CPUHexagonState *env, target_ulong *pc,
>  #endif
>  }
> 
> +static inline int cpu_mmu_index(CPUHexagonState *env, bool ifetch)
> +{
> +#ifdef CONFIG_USER_ONLY
> +return MMU_USER_IDX;
> +#else
> +#error System mode not supported on Hexagon yet
> +#endif
> +}
> +

Reviewed-by: Taylor Simpson 




[PATCH for-6.2 40/43] linux-user/alpha: Remove TARGET_ALIGNED_ONLY

2021-07-28 Thread Richard Henderson
By default, the Linux kernel fixes up unaligned accesses.
Therefore, as the kernel surrogate, qemu should as well.
No fixups are done for load-locked/store-conditional, so
mark those as MO_ALIGN.

There is a syscall to disable this, and (among other things)
deliver SIGBUS, but it is essentially unused.  A survey of
open source code shows no uses of SSI_NVPAIRS except trivial
examples that show how to disable unaligned fixups.

Signed-off-by: Richard Henderson 
---
 configs/targets/alpha-linux-user.mak | 1 -
 target/alpha/translate.c | 8 
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/configs/targets/alpha-linux-user.mak b/configs/targets/alpha-linux-user.mak
index 7e62fd796a..f7d3fb4afa 100644
--- a/configs/targets/alpha-linux-user.mak
+++ b/configs/targets/alpha-linux-user.mak
@@ -1,4 +1,3 @@
 TARGET_ARCH=alpha
 TARGET_SYSTBL_ABI=common
 TARGET_SYSTBL=syscall.tbl
-TARGET_ALIGNED_ONLY=y
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index de6c0a8439..8c60e90114 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -293,14 +293,14 @@ static inline void gen_qemu_lds(TCGv t0, TCGv t1, int flags)
 
 static inline void gen_qemu_ldl_l(TCGv t0, TCGv t1, int flags)
 {
-tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LESL);
+tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LESL | MO_ALIGN);
 tcg_gen_mov_i64(cpu_lock_addr, t1);
 tcg_gen_mov_i64(cpu_lock_value, t0);
 }
 
 static inline void gen_qemu_ldq_l(TCGv t0, TCGv t1, int flags)
 {
-tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LEQ);
+tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LEQ | MO_ALIGN);
 tcg_gen_mov_i64(cpu_lock_addr, t1);
 tcg_gen_mov_i64(cpu_lock_value, t0);
 }
@@ -2840,12 +2840,12 @@ static DisasJumpType translate_one(DisasContext *ctx, uint32_t insn)
 case 0x2E:
 /* STL_C */
 ret = gen_store_conditional(ctx, ra, rb, disp16,
-ctx->mem_idx, MO_LESL);
+ctx->mem_idx, MO_LESL | MO_ALIGN);
 break;
 case 0x2F:
 /* STQ_C */
 ret = gen_store_conditional(ctx, ra, rb, disp16,
-ctx->mem_idx, MO_LEQ);
+ctx->mem_idx, MO_LEQ | MO_ALIGN);
 break;
 case 0x30:
 /* BR */
-- 
2.25.1




[PATCH for-6.2 39/43] tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h

2021-07-28 Thread Richard Henderson
These functions have been replaced by cpu_*_mmu as the
most proper interface to use from target code.

Hide these declarations from code that should not use them.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-ldst.h | 74 ++
 include/tcg/tcg.h  | 71 
 accel/tcg/cputlb.c |  1 +
 tcg/tcg.c  |  1 +
 tcg/tci.c  |  1 +
 5 files changed, 77 insertions(+), 71 deletions(-)
 create mode 100644 include/tcg/tcg-ldst.h

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
new file mode 100644
index 0000000000..8c86365611
--- /dev/null
+++ b/include/tcg/tcg-ldst.h
@@ -0,0 +1,74 @@
+/*
+ * Memory helpers that will be used by TCG generated code.
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef TCG_LDST_H
+#define TCG_LDST_H 1
+
+#ifdef CONFIG_SOFTMMU
+
+/* Value zero-extended to tcg register size.  */
+tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
+   MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
+   MemOpIdx oi, uintptr_t retaddr);
+
+/* Value sign-extended to tcg register size.  */
+tcg_target_ulong helper_ret_ldsb_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_ldsl_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+
+void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
+MemOpIdx oi, uintptr_t retaddr);
+void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+
+#endif /* CONFIG_SOFTMMU */
+#endif /* TCG_LDST_H */
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 114ad66b25..82b4abfa31 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1234,77 +1234,6 @@ uint64_t dup_const(unsigned vece, uint64_t c);
 : (

[PATCH for-6.2 35/43] target/mips: Use 8-byte memory ops for msa load/store

2021-07-28 Thread Richard Henderson
Rather than use 4-16 separate operations, use 2 operations
plus some byte reordering as necessary.

Cc: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/mips/tcg/msa_helper.c | 201 +--
 1 file changed, 71 insertions(+), 130 deletions(-)

diff --git a/target/mips/tcg/msa_helper.c b/target/mips/tcg/msa_helper.c
index a8880ce81c..e40c1b7057 100644
--- a/target/mips/tcg/msa_helper.c
+++ b/target/mips/tcg/msa_helper.c
@@ -8218,47 +8218,31 @@ void helper_msa_ffint_u_df(CPUMIPSState *env, uint32_t df, uint32_t wd,
 #define MEMOP_IDX(DF)
 #endif
 
+#ifdef TARGET_WORDS_BIGENDIAN
+static inline uint64_t bswap16x4(uint64_t x)
+{
+uint64_t m = 0x00ff00ff00ff00ffull;
+return ((x & m) << 8) | ((x >> 8) & m);
+}
+
+static inline uint64_t bswap32x2(uint64_t x)
+{
+return ror64(bswap64(x), 32);
+}
+#endif
+
 void helper_msa_ld_b(CPUMIPSState *env, uint32_t wd,
  target_ulong addr)
 {
 wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
 uintptr_t ra = GETPC();
+uint64_t d0, d1;
 
-#if !defined(HOST_WORDS_BIGENDIAN)
-pwd->b[0]  = cpu_ldub_data_ra(env, addr + (0  << DF_BYTE), ra);
-pwd->b[1]  = cpu_ldub_data_ra(env, addr + (1  << DF_BYTE), ra);
-pwd->b[2]  = cpu_ldub_data_ra(env, addr + (2  << DF_BYTE), ra);
-pwd->b[3]  = cpu_ldub_data_ra(env, addr + (3  << DF_BYTE), ra);
-pwd->b[4]  = cpu_ldub_data_ra(env, addr + (4  << DF_BYTE), ra);
-pwd->b[5]  = cpu_ldub_data_ra(env, addr + (5  << DF_BYTE), ra);
-pwd->b[6]  = cpu_ldub_data_ra(env, addr + (6  << DF_BYTE), ra);
-pwd->b[7]  = cpu_ldub_data_ra(env, addr + (7  << DF_BYTE), ra);
-pwd->b[8]  = cpu_ldub_data_ra(env, addr + (8  << DF_BYTE), ra);
-pwd->b[9]  = cpu_ldub_data_ra(env, addr + (9  << DF_BYTE), ra);
-pwd->b[10] = cpu_ldub_data_ra(env, addr + (10 << DF_BYTE), ra);
-pwd->b[11] = cpu_ldub_data_ra(env, addr + (11 << DF_BYTE), ra);
-pwd->b[12] = cpu_ldub_data_ra(env, addr + (12 << DF_BYTE), ra);
-pwd->b[13] = cpu_ldub_data_ra(env, addr + (13 << DF_BYTE), ra);
-pwd->b[14] = cpu_ldub_data_ra(env, addr + (14 << DF_BYTE), ra);
-pwd->b[15] = cpu_ldub_data_ra(env, addr + (15 << DF_BYTE), ra);
-#else
-pwd->b[0]  = cpu_ldub_data_ra(env, addr + (7  << DF_BYTE), ra);
-pwd->b[1]  = cpu_ldub_data_ra(env, addr + (6  << DF_BYTE), ra);
-pwd->b[2]  = cpu_ldub_data_ra(env, addr + (5  << DF_BYTE), ra);
-pwd->b[3]  = cpu_ldub_data_ra(env, addr + (4  << DF_BYTE), ra);
-pwd->b[4]  = cpu_ldub_data_ra(env, addr + (3  << DF_BYTE), ra);
-pwd->b[5]  = cpu_ldub_data_ra(env, addr + (2  << DF_BYTE), ra);
-pwd->b[6]  = cpu_ldub_data_ra(env, addr + (1  << DF_BYTE), ra);
-pwd->b[7]  = cpu_ldub_data_ra(env, addr + (0  << DF_BYTE), ra);
-pwd->b[8]  = cpu_ldub_data_ra(env, addr + (15 << DF_BYTE), ra);
-pwd->b[9]  = cpu_ldub_data_ra(env, addr + (14 << DF_BYTE), ra);
-pwd->b[10] = cpu_ldub_data_ra(env, addr + (13 << DF_BYTE), ra);
-pwd->b[11] = cpu_ldub_data_ra(env, addr + (12 << DF_BYTE), ra);
-pwd->b[12] = cpu_ldub_data_ra(env, addr + (11 << DF_BYTE), ra);
-pwd->b[13] = cpu_ldub_data_ra(env, addr + (10 << DF_BYTE), ra);
-pwd->b[14] = cpu_ldub_data_ra(env, addr + (9 << DF_BYTE), ra);
-pwd->b[15] = cpu_ldub_data_ra(env, addr + (8 << DF_BYTE), ra);
-#endif
+/* Load 8 bytes at a time.  Vector element ordering makes this LE.  */
+d0 = cpu_ldq_le_data_ra(env, addr + 0, ra);
+d1 = cpu_ldq_le_data_ra(env, addr + 8, ra);
+pwd->d[0] = d0;
+pwd->d[1] = d1;
 }
 
 void helper_msa_ld_h(CPUMIPSState *env, uint32_t wd,
@@ -8266,26 +8250,20 @@ void helper_msa_ld_h(CPUMIPSState *env, uint32_t wd,
 {
 wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
 uintptr_t ra = GETPC();
+uint64_t d0, d1;
 
-#if !defined(HOST_WORDS_BIGENDIAN)
-pwd->h[0] = cpu_lduw_data_ra(env, addr + (0 << DF_HALF), ra);
-pwd->h[1] = cpu_lduw_data_ra(env, addr + (1 << DF_HALF), ra);
-pwd->h[2] = cpu_lduw_data_ra(env, addr + (2 << DF_HALF), ra);
-pwd->h[3] = cpu_lduw_data_ra(env, addr + (3 << DF_HALF), ra);
-pwd->h[4] = cpu_lduw_data_ra(env, addr + (4 << DF_HALF), ra);
-pwd->h[5] = cpu_lduw_data_ra(env, addr + (5 << DF_HALF), ra);
-pwd->h[6] = cpu_lduw_data_ra(env, addr + (6 << DF_HALF), ra);
-pwd->h[7] = cpu_lduw_data_ra(env, addr + (7 << DF_HALF), ra);
-#else
-pwd->h[0] = cpu_lduw_data_ra(env, addr + (3 << DF_HALF), ra);
-pwd->h[1] = cpu_lduw_data_ra(env, addr + (2 << DF_HALF), ra);
-pwd->h[2] = cpu_lduw_data_ra(env, addr + (1 << DF_HALF), ra);
-pwd->h[3] = cpu_lduw_data_ra(env, addr + (0 << DF_HALF), ra);
-pwd->h[4] = cpu_lduw_data_ra(env, addr + (7 << DF_HALF), ra);
-pwd->h[5] = cpu_lduw_data_ra(env, addr + (6 << DF_HALF), ra);
-pwd->h[6] = cpu_lduw_data_ra(env, addr + (5 << DF_HALF), ra);
-pwd->h[7] = cpu_lduw_data_ra(env, addr + (4 << DF_HALF), ra);
+/*
+ * Load 8 bytes at a time.  Use little-endian load, then for
+ * big-endian target, we 

[PATCH for-6.2 36/43] target/s390x: Use cpu_*_mmu instead of helper_*_mmu

2021-07-28 Thread Richard Henderson
The helper_*_mmu functions were the only thing available
when this code was written.  This could have been adjusted
when we added cpu_*_mmuidx_ra, but now we can most easily
use the newest set of interfaces.

Cc: qemu-s3...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/mem_helper.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index b20a82a914..4115cadbd7 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -248,13 +248,13 @@ static void do_access_memset(CPUS390XState *env, vaddr vaddr, char *haddr,
  * page. This is especially relevant to speed up TLB_NOTDIRTY.
  */
 g_assert(size > 0);
-helper_ret_stb_mmu(env, vaddr, byte, oi, ra);
+cpu_stb_mmu(env, vaddr, byte, oi, ra);
 haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_STORE, mmu_idx);
 if (likely(haddr)) {
 memset(haddr + 1, byte, size - 1);
 } else {
 for (i = 1; i < size; i++) {
-helper_ret_stb_mmu(env, vaddr + i, byte, oi, ra);
+cpu_stb_mmu(env, vaddr + i, byte, oi, ra);
 }
 }
 }
@@ -290,7 +290,7 @@ static uint8_t do_access_get_byte(CPUS390XState *env, vaddr vaddr, char **haddr,
  * Do a single access and test if we can then get access to the
  * page. This is especially relevant to speed up TLB_NOTDIRTY.
  */
-byte = helper_ret_ldub_mmu(env, vaddr + offset, oi, ra);
+byte = cpu_ldb_mmu(env, vaddr + offset, oi, ra);
 *haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_LOAD, mmu_idx);
 return byte;
 #endif
@@ -324,7 +324,7 @@ static void do_access_set_byte(CPUS390XState *env, vaddr vaddr, char **haddr,
  * Do a single access and test if we can then get access to the
  * page. This is especially relevant to speed up TLB_NOTDIRTY.
  */
-helper_ret_stb_mmu(env, vaddr + offset, byte, oi, ra);
+cpu_stb_mmu(env, vaddr + offset, byte, oi, ra);
 *haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_STORE, mmu_idx);
 #endif
 }
-- 
2.25.1




[PATCH for-6.2 31/43] target/hexagon: Implement cpu_mmu_index

2021-07-28 Thread Richard Henderson
The function is trivial for user-only, but still must be present.

Cc: Taylor Simpson 
Signed-off-by: Richard Henderson 
---
 target/hexagon/cpu.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 2855dd3881..bde538fd5c 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -144,6 +144,15 @@ static inline void cpu_get_tb_cpu_state(CPUHexagonState *env, target_ulong *pc,
 #endif
 }
 
+static inline int cpu_mmu_index(CPUHexagonState *env, bool ifetch)
+{
+#ifdef CONFIG_USER_ONLY
+return MMU_USER_IDX;
+#else
+#error System mode not supported on Hexagon yet
+#endif
+}
+
 typedef struct CPUHexagonState CPUArchState;
 typedef HexagonCPU ArchCPU;
 
-- 
2.25.1




[PATCH for-6.2 28/43] target/i386: Use MO_128 for 16 byte atomics

2021-07-28 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/i386/tcg/mem_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c
index 0fd696f9c1..a207e624cb 100644
--- a/target/i386/tcg/mem_helper.c
+++ b/target/i386/tcg/mem_helper.c
@@ -136,7 +136,7 @@ void helper_cmpxchg16b(CPUX86State *env, target_ulong a0)
 Int128 newv = int128_make128(env->regs[R_EBX], env->regs[R_ECX]);
 
 int mem_idx = cpu_mmu_index(env, false);
-MemOpIdx oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+MemOpIdx oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx);
 Int128 oldv = cpu_atomic_cmpxchgo_le_mmu(env, a0, cmpv, newv, oi, ra);
 
 if (int128_eq(oldv, cmpv)) {
-- 
2.25.1




[PATCH for-6.2 26/43] trace: Split guest_mem_before

2021-07-28 Thread Richard Henderson
There is no point in encoding load/store within a bit of
the memory trace info operand.  Represent atomic operations
as a single read-modify-write tracepoint.  Use MemOpIdx
instead of inventing a form specifically for traces.

Signed-off-by: Richard Henderson 
---
 accel/tcg/atomic_template.h   |  1 -
 trace/mem.h   | 51 ---
 accel/tcg/cputlb.c|  7 ++---
 accel/tcg/user-exec.c | 43 ++---
 tcg/tcg-op.c  | 17 +++-
 accel/tcg/atomic_common.c.inc | 12 +++--
 trace-events  | 18 +++--
 7 files changed, 27 insertions(+), 122 deletions(-)
 delete mode 100644 trace/mem.h

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index c08d859a8a..2d917b6b1f 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -19,7 +19,6 @@
  */
 
 #include "qemu/plugin.h"
-#include "trace/mem.h"
 
 #if DATA_SIZE == 16
 # define SUFFIX o
diff --git a/trace/mem.h b/trace/mem.h
deleted file mode 100644
index 699566c661..00
--- a/trace/mem.h
+++ /dev/null
@@ -1,51 +0,0 @@
-/*
- * Helper functions for guest memory tracing
- *
- * Copyright (C) 2016 Lluís Vilanova 
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- */
-
-#ifndef TRACE__MEM_H
-#define TRACE__MEM_H
-
-#include "exec/memopidx.h"
-
-#define TRACE_MEM_SZ_SHIFT_MASK 0xf /* size shift mask */
-#define TRACE_MEM_SE (1ULL << 4)/* sign extended (y/n) */
-#define TRACE_MEM_BE (1ULL << 5)/* big endian (y/n) */
-#define TRACE_MEM_ST (1ULL << 6)/* store (y/n) */
-#define TRACE_MEM_MMU_SHIFT 8   /* mmu idx */
-
-/**
- * trace_mem_get_info:
- *
- * Return a value for the 'info' argument in guest memory access traces.
- */
-static inline uint16_t trace_mem_get_info(MemOpIdx oi, bool store)
-{
-MemOp op = get_memop(oi);
-uint32_t size_shift = op & MO_SIZE;
-bool sign_extend = op & MO_SIGN;
-bool big_endian = (op & MO_BSWAP) == MO_BE;
-uint16_t res;
-
-res = size_shift & TRACE_MEM_SZ_SHIFT_MASK;
-if (sign_extend) {
-res |= TRACE_MEM_SE;
-}
-if (big_endian) {
-res |= TRACE_MEM_BE;
-}
-if (store) {
-res |= TRACE_MEM_ST;
-}
-#ifdef CONFIG_SOFTMMU
-res |= get_mmuidx(oi) << TRACE_MEM_MMU_SHIFT;
-#endif
-
-return res;
-}
-
-#endif /* TRACE__MEM_H */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ee07457880..46140ccff3 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -34,7 +34,6 @@
 #include "qemu/atomic128.h"
 #include "exec/translate-all.h"
 #include "trace/trace-root.h"
-#include "trace/mem.h"
 #include "tb-hash.h"
 #include "internal.h"
 #ifdef CONFIG_PLUGIN
@@ -2113,10 +2112,9 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
MemOp op, FullLoadHelper *full_load)
 {
 MemOpIdx oi = make_memop_idx(op, mmu_idx);
-uint16_t meminfo = trace_mem_get_info(oi, false);
 uint64_t ret;
 
-trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
+trace_guest_ld_before_exec(env_cpu(env), addr, oi);
 
 ret = full_load(env, addr, oi, retaddr);
 
@@ -2550,9 +2548,8 @@ cpu_store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
  int mmu_idx, uintptr_t retaddr, MemOp op)
 {
 MemOpIdx oi = make_memop_idx(op, mmu_idx);
-uint16_t meminfo = trace_mem_get_info(oi, true);
 
-trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
+trace_guest_st_before_exec(env_cpu(env), addr, oi);
 
 store_helper(env, addr, val, oi, retaddr, op);
 
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 9f8f3a8031..eda577013f 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -27,7 +27,6 @@
 #include "exec/helper-proto.h"
 #include "qemu/atomic128.h"
 #include "trace/trace-root.h"
-#include "trace/mem.h"
 
 #undef EAX
 #undef ECX
@@ -865,10 +864,9 @@ static void cpu_unaligned_access(CPUState *cpu, vaddr addr,
 uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
 {
 MemOpIdx oi = make_memop_idx(MO_UB, MMU_USER_IDX);
-uint16_t meminfo = trace_mem_get_info(oi, false);
 uint32_t ret;
 
-trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+trace_guest_ld_before_exec(env_cpu(env), ptr, oi);
 ret = ldub_p(g2h(env_cpu(env), ptr));
 qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, oi, QEMU_PLUGIN_MEM_R);
 return ret;
@@ -882,10 +880,9 @@ int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr)
 uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
 {
 MemOpIdx oi = make_memop_idx(MO_BEUW, MMU_USER_IDX);
-uint16_t meminfo = trace_mem_get_info(oi, false);
 uint32_t ret;
 
-trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+trace_guest_ld_before_exec(env_cpu(env), ptr, oi);
 ret = lduw_be_p(g2h(env_cpu(env), ptr));
 qemu_plugi

[PATCH for-6.2 25/43] plugins: Reorg arguments to qemu_plugin_vcpu_mem_cb

2021-07-28 Thread Richard Henderson
Use the MemOpIdx directly, rather than the rearrangement
of the same bits currently done by the trace infrastructure.
Pass in enum qemu_plugin_mem_rw so that we are able to treat
read-modify-write operations as a single operation.

Signed-off-by: Richard Henderson 
---
 include/qemu/plugin.h | 26 --
 accel/tcg/cputlb.c|  4 ++--
 accel/tcg/plugin-gen.c|  5 ++---
 accel/tcg/user-exec.c | 28 ++--
 plugins/api.c | 19 +++
 plugins/core.c| 10 +-
 tcg/tcg-op.c  | 30 +-
 accel/tcg/atomic_common.c.inc | 13 +++--
 8 files changed, 82 insertions(+), 53 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 9a8438f683..b3172b147f 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qemu/queue.h"
 #include "qemu/option.h"
+#include "exec/memopidx.h"
 
 /*
  * Events that plugins can subscribe to.
@@ -36,6 +37,25 @@ enum qemu_plugin_event {
 struct qemu_plugin_desc;
 typedef QTAILQ_HEAD(, qemu_plugin_desc) QemuPluginList;
 
+/*
+ * Construct a qemu_plugin_meminfo_t.
+ */
+static inline qemu_plugin_meminfo_t
+make_plugin_meminfo(MemOpIdx oi, enum qemu_plugin_mem_rw rw)
+{
+return oi | (rw << 16);
+}
+
+/*
+ * Extract the memory operation direction from a qemu_plugin_meminfo_t.
+ * Other portions may be extracted via get_memop and get_mmuidx.
+ */
+static inline enum qemu_plugin_mem_rw
+get_plugin_meminfo_rw(qemu_plugin_meminfo_t i)
+{
+return i >> 16;
+}
+
 #ifdef CONFIG_PLUGIN
 extern QemuOptsList qemu_plugin_opts;
 
@@ -180,7 +200,8 @@ qemu_plugin_vcpu_syscall(CPUState *cpu, int64_t num, uint64_t a1,
  uint64_t a6, uint64_t a7, uint64_t a8);
 void qemu_plugin_vcpu_syscall_ret(CPUState *cpu, int64_t num, int64_t ret);
 
-void qemu_plugin_vcpu_mem_cb(CPUState *cpu, uint64_t vaddr, uint32_t meminfo);
+void qemu_plugin_vcpu_mem_cb(CPUState *cpu, uint64_t vaddr,
+ MemOpIdx oi, enum qemu_plugin_mem_rw rw);
 
 void qemu_plugin_flush_cb(void);
 
@@ -244,7 +265,8 @@ void qemu_plugin_vcpu_syscall_ret(CPUState *cpu, int64_t num, int64_t ret)
 { }
 
 static inline void qemu_plugin_vcpu_mem_cb(CPUState *cpu, uint64_t vaddr,
-   uint32_t meminfo)
+   MemOpIdx oi,
+   enum qemu_plugin_mem_rw rw)
 { }
 
 static inline void qemu_plugin_flush_cb(void)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 0aa6157ec4..ee07457880 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2120,7 +2120,7 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
 
 ret = full_load(env, addr, oi, retaddr);
 
-qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, meminfo);
+qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
 
 return ret;
 }
@@ -2556,7 +2556,7 @@ cpu_store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
 
 store_helper(env, addr, val, oi, retaddr, op);
 
-qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, meminfo);
+qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
 }
 
 void cpu_stb_mmuidx_ra(CPUArchState *env, target_ulong addr, uint32_t val,
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 88e25c6df9..f5fd5f279c 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -45,7 +45,6 @@
 #include "qemu/osdep.h"
 #include "tcg/tcg.h"
 #include "tcg/tcg-op.h"
-#include "trace/mem.h"
 #include "exec/exec-all.h"
 #include "exec/plugin-gen.h"
 #include "exec/translator.h"
@@ -211,9 +210,9 @@ static void gen_mem_wrapped(enum plugin_gen_cb type,
 const union mem_gen_fn *f, TCGv addr,
 uint32_t info, bool is_mem)
 {
-int wr = !!(info & TRACE_MEM_ST);
+enum qemu_plugin_mem_rw rw = get_plugin_meminfo_rw(info);
 
-gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, type, wr);
+gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, type, rw);
 if (is_mem) {
 f->mem_fn(addr, info);
 } else {
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 9f6fa729d0..9f8f3a8031 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -870,7 +870,7 @@ uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
 
 trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
 ret = ldub_p(g2h(env_cpu(env), ptr));
-qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
+qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, oi, QEMU_PLUGIN_MEM_R);
 return ret;
 }
 
@@ -887,7 +887,7 @@ uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
 
 trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
 ret = lduw_be_p(g2h(env_cpu(env), ptr));
-qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
+

Re: [PATCH RFC 04/19] vfio-user: Define type vfio_user_pci_dev_info

2021-07-28 Thread John Johnson


> On Jul 28, 2021, at 3:16 AM, Stefan Hajnoczi  wrote:
> 
> On Sun, Jul 18, 2021 at 11:27:43PM -0700, Elena Ufimtseva wrote:
>> From: John G Johnson 
>> 
>> New class for vfio-user with its class and instance
>> constructors and destructors.
>> 
>> Signed-off-by: Elena Ufimtseva 
>> Signed-off-by: John G Johnson 
>> Signed-off-by: Jagannathan Raman 
>> ---
>> hw/vfio/pci.c | 49 +
>> 1 file changed, 49 insertions(+)
>> 
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index bea95efc33..554b562769 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -42,6 +42,7 @@
>> #include "qapi/error.h"
>> #include "migration/blocker.h"
>> #include "migration/qemu-file.h"
>> +#include "hw/vfio/user.h"
>> 
>> #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug"
>> 
>> @@ -3326,3 +3327,51 @@ static void register_vfio_pci_dev_type(void)
>> }
>> 
>> type_init(register_vfio_pci_dev_type)
>> +
>> +static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
>> +{
>> +ERRP_GUARD();
>> +VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev);
>> +
>> +if (!udev->sock_name) {
>> +error_setg(errp, "No socket specified");
>> +error_append_hint(errp, "Use -device vfio-user-pci,socket=\n");
>> +return;
>> +}
>> +}
>> +
>> +static void vfio_user_instance_finalize(Object *obj)
>> +{
>> +}
>> +
>> +static Property vfio_user_pci_dev_properties[] = {
>> +DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
> 
> Please use SocketAddress so that alternative socket connection details
> can be supported without inventing custom syntax for vfio-user-pci. For
> example, file descriptor passing should be possible.
> 
> I think this requires a bit of command-line parsing work, so don't worry
> about it for now, but please add a TODO comment. When the -device
> vfio-user-pci syntax is finalized (i.e. when the code is merged and the
> device name doesn't start with the experimental x- prefix), then it
> needs to be solved.
> 

What do you want the options to look like at the endgame?  I’d
rather work backward from that than have several different flavors of
options as new socket options are added.  I did look at -chardev socket,
and it was confusing enough that I went for the simple string.



>> +DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure, false),
> 
> I'm not sure what "secure-dma" means and the "secure" variable name is
> even more inscrutable. Does this mean don't share memory so that each
> DMA access is checked individually?
> 

Yes.  Do you have another name you’d prefer? “no-shared-mem”?

JJ



>> +DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
>> +{
>> +DeviceClass *dc = DEVICE_CLASS(klass);
>> +PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
>> +
>> +device_class_set_props(dc, vfio_user_pci_dev_properties);
>> +dc->desc = "VFIO over socket PCI device assignment";
>> +pdc->realize = vfio_user_pci_realize;
>> +}
>> +
>> +static const TypeInfo vfio_user_pci_dev_info = {
>> +.name = TYPE_VFIO_USER_PCI,
>> +.parent = TYPE_VFIO_PCI_BASE,
>> +.instance_size = sizeof(VFIOUserPCIDevice),
>> +.class_init = vfio_user_pci_dev_class_init,
>> +.instance_init = vfio_instance_init,
>> +.instance_finalize = vfio_user_instance_finalize,
>> +};
>> +
>> +static void register_vfio_user_dev_type(void)
>> +{
>> +type_register_static(&vfio_user_pci_dev_info);
>> +}
>> +
>> +type_init(register_vfio_user_dev_type)
>> -- 
>> 2.25.1
>> 



[PATCH for-6.2 22/43] trace/mem: Pass MemOpIdx to trace_mem_get_info

2021-07-28 Thread Richard Henderson
We (will) often have the complete MemOpIdx handy, so use that.

Signed-off-by: Richard Henderson 
---
 trace/mem.h   | 32 +-
 accel/tcg/cputlb.c| 12 --
 accel/tcg/user-exec.c | 42 +++
 tcg/tcg-op.c  |  8 +++
 accel/tcg/atomic_common.c.inc |  6 ++---
 5 files changed, 49 insertions(+), 51 deletions(-)

diff --git a/trace/mem.h b/trace/mem.h
index 2f27e7bdf0..699566c661 100644
--- a/trace/mem.h
+++ b/trace/mem.h
@@ -10,7 +10,7 @@
 #ifndef TRACE__MEM_H
 #define TRACE__MEM_H
 
-#include "tcg/tcg.h"
+#include "exec/memopidx.h"
 
 #define TRACE_MEM_SZ_SHIFT_MASK 0xf /* size shift mask */
 #define TRACE_MEM_SE (1ULL << 4)/* sign extended (y/n) */
@@ -19,45 +19,33 @@
 #define TRACE_MEM_MMU_SHIFT 8   /* mmu idx */
 
 /**
- * trace_mem_build_info:
+ * trace_mem_get_info:
  *
  * Return a value for the 'info' argument in guest memory access traces.
  */
-static inline uint16_t trace_mem_build_info(int size_shift, bool sign_extend,
-MemOp endianness, bool store,
-unsigned int mmu_idx)
+static inline uint16_t trace_mem_get_info(MemOpIdx oi, bool store)
 {
+MemOp op = get_memop(oi);
+uint32_t size_shift = op & MO_SIZE;
+bool sign_extend = op & MO_SIGN;
+bool big_endian = (op & MO_BSWAP) == MO_BE;
 uint16_t res;
 
 res = size_shift & TRACE_MEM_SZ_SHIFT_MASK;
 if (sign_extend) {
 res |= TRACE_MEM_SE;
 }
-if (endianness == MO_BE) {
+if (big_endian) {
 res |= TRACE_MEM_BE;
 }
 if (store) {
 res |= TRACE_MEM_ST;
 }
 #ifdef CONFIG_SOFTMMU
-res |= mmu_idx << TRACE_MEM_MMU_SHIFT;
+res |= get_mmuidx(oi) << TRACE_MEM_MMU_SHIFT;
 #endif
+
 return res;
 }
 
-
-/**
- * trace_mem_get_info:
- *
- * Return a value for the 'info' argument in guest memory access traces.
- */
-static inline uint16_t trace_mem_get_info(MemOp op,
-  unsigned int mmu_idx,
-  bool store)
-{
-return trace_mem_build_info(op & MO_SIZE, !!(op & MO_SIGN),
-op & MO_BSWAP, store,
-mmu_idx);
-}
-
 #endif /* TRACE__MEM_H */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d72f65f42b..0aa6157ec4 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2112,14 +2112,12 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
int mmu_idx, uintptr_t retaddr,
MemOp op, FullLoadHelper *full_load)
 {
-uint16_t meminfo;
-MemOpIdx oi;
+MemOpIdx oi = make_memop_idx(op, mmu_idx);
+uint16_t meminfo = trace_mem_get_info(oi, false);
 uint64_t ret;
 
-meminfo = trace_mem_get_info(op, mmu_idx, false);
 trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
 
-oi = make_memop_idx(op, mmu_idx);
 ret = full_load(env, addr, oi, retaddr);
 
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, meminfo);
@@ -2551,13 +2549,11 @@ static inline void QEMU_ALWAYS_INLINE
 cpu_store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
  int mmu_idx, uintptr_t retaddr, MemOp op)
 {
-MemOpIdx oi;
-uint16_t meminfo;
+MemOpIdx oi = make_memop_idx(op, mmu_idx);
+uint16_t meminfo = trace_mem_get_info(oi, true);
 
-meminfo = trace_mem_get_info(op, mmu_idx, true);
 trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
 
-oi = make_memop_idx(op, mmu_idx);
 store_helper(env, addr, val, oi, retaddr, op);
 
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, meminfo);
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index eb672eae3a..9f6fa729d0 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -864,8 +864,9 @@ static void cpu_unaligned_access(CPUState *cpu, vaddr addr,
 
 uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
 {
+MemOpIdx oi = make_memop_idx(MO_UB, MMU_USER_IDX);
+uint16_t meminfo = trace_mem_get_info(oi, false);
 uint32_t ret;
-uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, false);
 
 trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
 ret = ldub_p(g2h(env_cpu(env), ptr));
@@ -880,8 +881,9 @@ int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr)
 
 uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
 {
+MemOpIdx oi = make_memop_idx(MO_BEUW, MMU_USER_IDX);
+uint16_t meminfo = trace_mem_get_info(oi, false);
 uint32_t ret;
-uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, false);
 
 trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
 ret = lduw_be_p(g2h(env_cpu(env), ptr));
@@ -896,8 +898,9 @@ int cpu_ldsw_be_data(CPUArchState *env, abi_ptr ptr)
 
 uint32_t cpu_ldl_be_data(CPUArchState *env, abi_ptr ptr)
 {
+MemOpIdx oi = make_memop_idx(M

[PATCH for-6.2 20/43] tcg: Rename TCGMemOpIdx to MemOpIdx

2021-07-28 Thread Richard Henderson
We're about to move this out of tcg.h, so rename it
as we did when moving MemOp.

Signed-off-by: Richard Henderson 
---
 accel/tcg/atomic_template.h   | 24 +--
 include/tcg/tcg.h | 74 -
 accel/tcg/cputlb.c| 78 +--
 accel/tcg/user-exec.c |  2 +-
 target/arm/helper-a64.c   | 16 +++
 target/arm/m_helper.c |  2 +-
 target/i386/tcg/mem_helper.c  |  4 +-
 target/m68k/op_helper.c   |  2 +-
 target/mips/tcg/msa_helper.c  |  6 +--
 target/s390x/tcg/mem_helper.c | 20 -
 target/sparc/ldst_helper.c|  2 +-
 tcg/optimize.c|  2 +-
 tcg/tcg-op.c  | 12 +++---
 tcg/tcg.c |  2 +-
 tcg/tci.c | 14 +++
 accel/tcg/atomic_common.c.inc |  6 +--
 tcg/aarch64/tcg-target.c.inc  | 14 +++
 tcg/arm/tcg-target.c.inc  | 10 ++---
 tcg/i386/tcg-target.c.inc | 10 ++---
 tcg/mips/tcg-target.c.inc | 12 +++---
 tcg/ppc/tcg-target.c.inc  | 10 ++---
 tcg/riscv/tcg-target.c.inc| 16 +++
 tcg/s390/tcg-target.c.inc | 10 ++---
 tcg/sparc/tcg-target.c.inc|  4 +-
 tcg/tcg-ldst.c.inc|  2 +-
 25 files changed, 177 insertions(+), 177 deletions(-)

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index d89af4cc1e..4427fab6df 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -72,7 +72,7 @@
 
 ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
   ABI_TYPE cmpv, ABI_TYPE newv,
-  TCGMemOpIdx oi, uintptr_t retaddr)
+  MemOpIdx oi, uintptr_t retaddr)
 {
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ | PAGE_WRITE, retaddr);
@@ -92,7 +92,7 @@ ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
 #if DATA_SIZE >= 16
 #if HAVE_ATOMIC128
 ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr,
- TCGMemOpIdx oi, uintptr_t retaddr)
+ MemOpIdx oi, uintptr_t retaddr)
 {
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ, retaddr);
@@ -106,7 +106,7 @@ ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr,
 }
 
 void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
- TCGMemOpIdx oi, uintptr_t retaddr)
+ MemOpIdx oi, uintptr_t retaddr)
 {
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_WRITE, retaddr);
@@ -119,7 +119,7 @@ void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
 #endif
 #else
 ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
-   TCGMemOpIdx oi, uintptr_t retaddr)
+   MemOpIdx oi, uintptr_t retaddr)
 {
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ | PAGE_WRITE, retaddr);
@@ -134,7 +134,7 @@ ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
 
 #define GEN_ATOMIC_HELPER(X)\
 ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
-ABI_TYPE val, TCGMemOpIdx oi, uintptr_t retaddr) \
+ABI_TYPE val, MemOpIdx oi, uintptr_t retaddr) \
 {   \
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,  \
  PAGE_READ | PAGE_WRITE, retaddr); \
@@ -167,7 +167,7 @@ GEN_ATOMIC_HELPER(xor_fetch)
  */
 #define GEN_ATOMIC_HELPER_FN(X, FN, XDATA_TYPE, RET)\
 ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
-ABI_TYPE xval, TCGMemOpIdx oi, uintptr_t retaddr) \
+ABI_TYPE xval, MemOpIdx oi, uintptr_t retaddr) \
 {   \
 XDATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE, \
   PAGE_READ | PAGE_WRITE, retaddr); \
@@ -211,7 +211,7 @@ GEN_ATOMIC_HELPER_FN(umax_fetch, MAX,  DATA_TYPE, new)
 
 ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
   ABI_TYPE cmpv, ABI_TYPE newv,
-  TCGMemOpIdx oi, uintptr_t retaddr)
+  MemOpIdx oi, uintptr_t retaddr)
 {
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ | PAGE_WRITE, retaddr);
@@ -231,7 +231,7 @@ ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, 
target_ulong addr,
 #if DATA_SIZE >= 16
 #if HAVE_ATOMIC128
 ABI_TYPE ATOMIC_N

[PATCH for-6.2 43/43] tests/tcg/multiarch: Add sigbus.c

2021-07-28 Thread Richard Henderson
A mostly generic test for unaligned access raising SIGBUS.

Signed-off-by: Richard Henderson 
---
 tests/tcg/multiarch/sigbus.c | 68 
 1 file changed, 68 insertions(+)
 create mode 100644 tests/tcg/multiarch/sigbus.c

diff --git a/tests/tcg/multiarch/sigbus.c b/tests/tcg/multiarch/sigbus.c
new file mode 100644
index 00..8134c5fd56
--- /dev/null
+++ b/tests/tcg/multiarch/sigbus.c
@@ -0,0 +1,68 @@
+#define _GNU_SOURCE 1
+
+#include <assert.h>
+#include <stdlib.h>
+#include <signal.h>
+#include <endian.h>
+
+
+unsigned long long x = 0x8877665544332211ull;
+void * volatile p = (void *)&x + 1;
+
+void sigbus(int sig, siginfo_t *info, void *uc)
+{
+assert(sig == SIGBUS);
+assert(info->si_signo == SIGBUS);
+#ifdef BUS_ADRALN
+assert(info->si_code == BUS_ADRALN);
+#endif
+assert(info->si_addr == p);
+exit(EXIT_SUCCESS);
+}
+
+int main()
+{
+struct sigaction sa = {
+.sa_sigaction = sigbus,
+.sa_flags = SA_SIGINFO
+};
+int allow_fail = 0;
+int tmp;
+
+tmp = sigaction(SIGBUS, &sa, NULL);
+assert(tmp == 0);
+
+/*
+ * Select an operation that's likely to enforce alignment.
+ * On many guests that support unaligned accesses by default,
+ * this is often an atomic operation.
+ */
+#if defined(__aarch64__)
+asm volatile("ldxr %w0,[%1]" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__alpha__)
+asm volatile("ldl_l %0,0(%1)" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__arm__)
+asm volatile("ldrex %0,[%1]" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__powerpc__)
+asm volatile("lwarx %0,0,%1" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__riscv_atomic)
+asm volatile("lr.w %0,(%1)" : "=r"(tmp) : "r"(p) : "memory");
+#else
+/* No insn known to fault unaligned -- try for a straight load. */
+allow_fail = 1;
+tmp = *(volatile int *)p;
+#endif
+
+assert(allow_fail);
+
+/*
+ * We didn't see a signal.
+ * We might as well validate the unaligned load worked.
+ */
+if (BYTE_ORDER == LITTLE_ENDIAN) {
+assert(tmp == 0x55443322);
+} else {
+assert(tmp == 0x77665544);
+}
+return EXIT_SUCCESS;
+}
-- 
2.25.1




[PATCH for-6.2 42/43] tcg/i386: Support raising sigbus for user-only

2021-07-28 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.h |   2 -
 tcg/i386/tcg-target.c.inc | 114 --
 2 files changed, 110 insertions(+), 6 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index b00a6da293..3b2c9437a0 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -232,9 +232,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP  have_movbe
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #endif
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 1e42a877fb..4abf612891 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -420,8 +420,9 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define OPC_VZEROUPPER  (0x77 | P_EXT)
 #define OPC_XCHG_ax_r32(0x90)
 
-#define OPC_GRP3_Ev(0xf7)
-#define OPC_GRP5   (0xff)
+#define OPC_GRP3_Eb (0xf6)
+#define OPC_GRP3_Ev (0xf7)
+#define OPC_GRP5(0xff)
 #define OPC_GRP14   (0x73 | P_EXT | P_DATA16)
 
 /* Group 1 opcode extensions for 0x80-0x83.
@@ -443,6 +444,7 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define SHIFT_SAR 7
 
 /* Group 3 opcode extensions for 0xf6, 0xf7.  To be used with OPC_GRP3.  */
+#define EXT3_TESTi 0
 #define EXT3_NOT   2
 #define EXT3_NEG   3
 #define EXT3_MUL   4
@@ -1604,9 +1606,9 @@ static void tcg_out_nopn(TCGContext *s, int n)
 tcg_out8(s, 0x90);
 }
 
-#if defined(CONFIG_SOFTMMU)
 #include "../tcg-ldst.c.inc"
 
+#if defined(CONFIG_SOFTMMU)
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  * int mmu_idx, uintptr_t ra)
  */
@@ -1915,7 +1917,96 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
 return true;
 }
-#elif TCG_TARGET_REG_BITS == 32
+#else
+
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
+   TCGReg addrhi, unsigned a_bits)
+{
+unsigned a_mask = (1 << a_bits) - 1;
+TCGLabelQemuLdst *label;
+
+/*
+ * We are expecting a_bits to max out at 7, so we can usually use testb.
+ * For i686, we have to use testl for %esi/%edi.
+ */
+if (a_mask <= 0xff && (TCG_TARGET_REG_BITS == 64 || addrlo < 4)) {
+tcg_out_modrm(s, OPC_GRP3_Eb, EXT3_TESTi, addrlo);
+tcg_out8(s, a_mask);
+} else {
+tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, addrlo);
+tcg_out32(s, a_mask);
+}
+
+/* jne slow_path */
+tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+
+label = new_ldst_label(s);
+label->is_ld = is_ld;
+label->addrlo_reg = addrlo;
+label->addrhi_reg = addrhi;
+label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4);
+label->label_ptr[0] = s->code_ptr;
+
+s->code_ptr += 4;
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+MMUAccessType type = l->is_ld ? MMU_DATA_LOAD : MMU_DATA_STORE;
+TCGReg retaddr;
+
+/* resolve label address */
+tcg_patch32(l->label_ptr[0], s->code_ptr - l->label_ptr[0] - 4);
+
+if (TCG_TARGET_REG_BITS == 32) {
+int ofs = 0;
+
+tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
+ofs += 4;
+
+tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
+ofs += 4;
+if (TARGET_LONG_BITS == 64) {
+tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
+ofs += 4;
+}
+
+tcg_out_sti(s, TCG_TYPE_I32, type, TCG_REG_ESP, ofs);
+
+retaddr = TCG_REG_EAX;
+tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
+tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
+} else {
+tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
+l->addrlo_reg);
+tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], type);
+
+retaddr = tcg_target_call_iarg_regs[3];
+tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
+}
+
+/*
+ * "Tail call" to the helper, with the return address back inline,
+ * just for the clarity of the debugging traceback -- the helper
+ * cannot return.
+ */
+tcg_out_push(s, retaddr);
+tcg_out_jmp(s, (const void *)helper_unaligned_mmu);
+return true;
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+#if TCG_TARGET_REG_BITS == 32
 # define x86_guest_base_seg 0
 # define x86_guest_base_index   -1
 # define x86_guest_bas

[PATCH for-6.2 41/43] tcg: Add helper_unaligned_mmu for user-only sigbus

2021-07-28 Thread Richard Henderson
To be called from tcg generated code on hosts that support
unaligned accesses natively, in response to an access that
is supposed to be aligned.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-ldst.h |  5 +
 accel/tcg/user-exec.c  | 13 ++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 8c86365611..a934bed042 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -70,5 +70,10 @@ void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
 void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
MemOpIdx oi, uintptr_t retaddr);
 
+#else
+
+void QEMU_NORETURN helper_unaligned_mmu(CPUArchState *env, target_ulong addr,
+uint32_t type, uintptr_t ra);
+
 #endif /* CONFIG_SOFTMMU */
 #endif /* TCG_LDST_H */
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index e8a82dd43f..5cbae7a7cc 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -27,6 +27,7 @@
 #include "exec/helper-proto.h"
 #include "qemu/atomic128.h"
 #include "trace/trace-root.h"
+#include "tcg/tcg-ldst.h"
 
 #undef EAX
 #undef ECX
@@ -866,9 +867,9 @@ static void validate_memop(MemOpIdx oi, MemOp expected)
 #endif
 }
 
-static void cpu_unaligned_access(CPUState *cpu, vaddr addr,
- MMUAccessType access_type,
- int mmu_idx, uintptr_t ra)
+static void QEMU_NORETURN
+cpu_unaligned_access(CPUState *cpu, vaddr addr, MMUAccessType access_type,
+ int mmu_idx, uintptr_t ra)
 {
 CPUClass *cc = CPU_GET_CLASS(cpu);
 
@@ -876,6 +877,12 @@ static void cpu_unaligned_access(CPUState *cpu, vaddr addr,
 g_assert_not_reached();
 }
 
+void helper_unaligned_mmu(CPUArchState *env, target_ulong addr,
+  uint32_t access_type, uintptr_t ra)
+{
+cpu_unaligned_access(env_cpu(env), addr, access_type, MMU_USER_IDX, ra);
+}
+
 static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr,
 MemOpIdx oi, uintptr_t ra, MMUAccessType type)
 {
-- 
2.25.1




[PATCH for-6.2 37/43] target/sparc: Use cpu_*_mmu instead of helper_*_mmu

2021-07-28 Thread Richard Henderson
The helper_*_mmu functions were the only thing available
when this code was written.  This could have been adjusted
when we added cpu_*_mmuidx_ra, but now we can most easily
use the newest set of interfaces.

Cc: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/ldst_helper.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 5c558d312a..10979404ad 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1328,27 +1328,27 @@ uint64_t helper_ld_asi(CPUSPARCState *env, target_ulong addr,
 oi = make_memop_idx(memop, idx);
 switch (size) {
 case 1:
-ret = helper_ret_ldub_mmu(env, addr, oi, GETPC());
+ret = cpu_ldb_mmu(env, addr, oi, GETPC());
 break;
 case 2:
 if (asi & 8) {
-ret = helper_le_lduw_mmu(env, addr, oi, GETPC());
+ret = cpu_ldw_le_mmu(env, addr, oi, GETPC());
 } else {
-ret = helper_be_lduw_mmu(env, addr, oi, GETPC());
+ret = cpu_ldw_be_mmu(env, addr, oi, GETPC());
 }
 break;
 case 4:
 if (asi & 8) {
-ret = helper_le_ldul_mmu(env, addr, oi, GETPC());
+ret = cpu_ldl_le_mmu(env, addr, oi, GETPC());
 } else {
-ret = helper_be_ldul_mmu(env, addr, oi, GETPC());
+ret = cpu_ldl_be_mmu(env, addr, oi, GETPC());
 }
 break;
 case 8:
 if (asi & 8) {
-ret = helper_le_ldq_mmu(env, addr, oi, GETPC());
+ret = cpu_ldq_le_mmu(env, addr, oi, GETPC());
 } else {
-ret = helper_be_ldq_mmu(env, addr, oi, GETPC());
+ret = cpu_ldq_be_mmu(env, addr, oi, GETPC());
 }
 break;
 default:
-- 
2.25.1




[PATCH for-6.2 38/43] target/arm: Use cpu_*_mmu instead of helper_*_mmu

2021-07-28 Thread Richard Henderson
The helper_*_mmu functions were the only thing available
when this code was written.  This could have been adjusted
when we added cpu_*_mmuidx_ra, but now we can most easily
use the newest set of interfaces.

Cc: qemu-...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/arm/helper-a64.c | 52 +++--
 target/arm/m_helper.c   |  6 ++---
 2 files changed, 11 insertions(+), 47 deletions(-)

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index f1a4089a4f..17c0ebebb2 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -512,37 +512,19 @@ uint64_t HELPER(paired_cmpxchg64_le)(CPUARMState *env, uint64_t addr,
 uintptr_t ra = GETPC();
 uint64_t o0, o1;
 bool success;
-
-#ifdef CONFIG_USER_ONLY
-/* ??? Enforce alignment.  */
-uint64_t *haddr = g2h(env_cpu(env), addr);
-
-set_helper_retaddr(ra);
-o0 = ldq_le_p(haddr + 0);
-o1 = ldq_le_p(haddr + 1);
-oldv = int128_make128(o0, o1);
-
-success = int128_eq(oldv, cmpv);
-if (success) {
-stq_le_p(haddr + 0, int128_getlo(newv));
-stq_le_p(haddr + 1, int128_gethi(newv));
-}
-clear_helper_retaddr();
-#else
 int mem_idx = cpu_mmu_index(env, false);
 MemOpIdx oi0 = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
 MemOpIdx oi1 = make_memop_idx(MO_LEQ, mem_idx);
 
-o0 = helper_le_ldq_mmu(env, addr + 0, oi0, ra);
-o1 = helper_le_ldq_mmu(env, addr + 8, oi1, ra);
+o0 = cpu_ldq_le_mmu(env, addr + 0, oi0, ra);
+o1 = cpu_ldq_le_mmu(env, addr + 8, oi1, ra);
 oldv = int128_make128(o0, o1);
 
 success = int128_eq(oldv, cmpv);
 if (success) {
-helper_le_stq_mmu(env, addr + 0, int128_getlo(newv), oi1, ra);
-helper_le_stq_mmu(env, addr + 8, int128_gethi(newv), oi1, ra);
+cpu_stq_le_mmu(env, addr + 0, int128_getlo(newv), oi1, ra);
+cpu_stq_le_mmu(env, addr + 8, int128_gethi(newv), oi1, ra);
 }
-#endif
 
 return !success;
 }
@@ -582,37 +564,19 @@ uint64_t HELPER(paired_cmpxchg64_be)(CPUARMState *env, 
uint64_t addr,
 uintptr_t ra = GETPC();
 uint64_t o0, o1;
 bool success;
-
-#ifdef CONFIG_USER_ONLY
-/* ??? Enforce alignment.  */
-uint64_t *haddr = g2h(env_cpu(env), addr);
-
-set_helper_retaddr(ra);
-o1 = ldq_be_p(haddr + 0);
-o0 = ldq_be_p(haddr + 1);
-oldv = int128_make128(o0, o1);
-
-success = int128_eq(oldv, cmpv);
-if (success) {
-stq_be_p(haddr + 0, int128_gethi(newv));
-stq_be_p(haddr + 1, int128_getlo(newv));
-}
-clear_helper_retaddr();
-#else
 int mem_idx = cpu_mmu_index(env, false);
 MemOpIdx oi0 = make_memop_idx(MO_BEQ | MO_ALIGN_16, mem_idx);
 MemOpIdx oi1 = make_memop_idx(MO_BEQ, mem_idx);
 
-o1 = helper_be_ldq_mmu(env, addr + 0, oi0, ra);
-o0 = helper_be_ldq_mmu(env, addr + 8, oi1, ra);
+o1 = cpu_ldq_be_mmu(env, addr + 0, oi0, ra);
+o0 = cpu_ldq_be_mmu(env, addr + 8, oi1, ra);
 oldv = int128_make128(o0, o1);
 
 success = int128_eq(oldv, cmpv);
 if (success) {
-helper_be_stq_mmu(env, addr + 0, int128_gethi(newv), oi1, ra);
-helper_be_stq_mmu(env, addr + 8, int128_getlo(newv), oi1, ra);
+cpu_stq_be_mmu(env, addr + 0, int128_gethi(newv), oi1, ra);
+cpu_stq_be_mmu(env, addr + 8, int128_getlo(newv), oi1, ra);
 }
-#endif
 
 return !success;
 }
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
index efb522dc44..b6019595f5 100644
--- a/target/arm/m_helper.c
+++ b/target/arm/m_helper.c
@@ -1947,9 +1947,9 @@ static bool do_v7m_function_return(ARMCPU *cpu)
  * do them as secure, so work out what MMU index that is.
  */
 mmu_idx = arm_v7m_mmu_idx_for_secstate(env, true);
-oi = make_memop_idx(MO_LE, arm_to_core_mmu_idx(mmu_idx));
-newpc = helper_le_ldul_mmu(env, frameptr, oi, 0);
-newpsr = helper_le_ldul_mmu(env, frameptr + 4, oi, 0);
+oi = make_memop_idx(MO_LEUL, arm_to_core_mmu_idx(mmu_idx));
+newpc = cpu_ldl_le_mmu(env, frameptr, oi, 0);
+newpsr = cpu_ldl_le_mmu(env, frameptr + 4, oi, 0);
 
 /* Consistency checks on new IPSR */
 newpsr_exc = newpsr & XPSR_EXCP;
-- 
2.25.1




[PATCH for-6.2 21/43] tcg: Split out MemOpIdx to exec/memopidx.h

2021-07-28 Thread Richard Henderson
Move this code from tcg/tcg.h to its own header.

Signed-off-by: Richard Henderson 
---
 include/exec/memopidx.h | 55 +
 include/tcg/tcg.h   | 39 +
 2 files changed, 56 insertions(+), 38 deletions(-)
 create mode 100644 include/exec/memopidx.h

diff --git a/include/exec/memopidx.h b/include/exec/memopidx.h
new file mode 100644
index 00..83bce97874
--- /dev/null
+++ b/include/exec/memopidx.h
@@ -0,0 +1,55 @@
+/*
+ * Combine the MemOp and mmu_idx parameters into a single value.
+ *
+ * Authors:
+ *  Richard Henderson 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef EXEC_MEMOPIDX_H
+#define EXEC_MEMOPIDX_H 1
+
+#include "exec/memop.h"
+
+typedef uint32_t MemOpIdx;
+
+/**
+ * make_memop_idx
+ * @op: memory operation
+ * @idx: mmu index
+ *
+ * Encode these values into a single parameter.
+ */
+static inline MemOpIdx make_memop_idx(MemOp op, unsigned idx)
+{
+#ifdef CONFIG_DEBUG_TCG
+assert(idx <= 15);
+#endif
+return (op << 4) | idx;
+}
+
+/**
+ * get_memop
+ * @oi: combined op/idx parameter
+ *
+ * Extract the memory operation from the combined value.
+ */
+static inline MemOp get_memop(MemOpIdx oi)
+{
+return oi >> 4;
+}
+
+/**
+ * get_mmuidx
+ * @oi: combined op/idx parameter
+ *
+ * Extract the mmu index from the combined value.
+ */
+static inline unsigned get_mmuidx(MemOpIdx oi)
+{
+return oi & 15;
+}
+
+#endif
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index f91ebd0743..e67ef34694 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -27,6 +27,7 @@
 
 #include "cpu.h"
 #include "exec/memop.h"
+#include "exec/memopidx.h"
 #include "qemu/bitops.h"
 #include "qemu/plugin.h"
 #include "qemu/queue.h"
@@ -1147,44 +1148,6 @@ static inline size_t tcg_current_code_size(TCGContext *s)
 return tcg_ptr_byte_diff(s->code_ptr, s->code_buf);
 }
 
-/* Combine the MemOp and mmu_idx parameters into a single value.  */
-typedef uint32_t MemOpIdx;
-
-/**
- * make_memop_idx
- * @op: memory operation
- * @idx: mmu index
- *
- * Encode these values into a single parameter.
- */
-static inline MemOpIdx make_memop_idx(MemOp op, unsigned idx)
-{
-tcg_debug_assert(idx <= 15);
-return (op << 4) | idx;
-}
-
-/**
- * get_memop
- * @oi: combined op/idx parameter
- *
- * Extract the memory operation from the combined value.
- */
-static inline MemOp get_memop(MemOpIdx oi)
-{
-return oi >> 4;
-}
-
-/**
- * get_mmuidx
- * @oi: combined op/idx parameter
- *
- * Extract the mmu index from the combined value.
- */
-static inline unsigned get_mmuidx(MemOpIdx oi)
-{
-return oi & 15;
-}
-
 /**
  * tcg_qemu_tb_exec:
  * @env: pointer to CPUArchState for the CPU
-- 
2.25.1




[PATCH for-6.2 30/43] target/s390x: Use MO_128 for 16 byte atomics

2021-07-28 Thread Richard Henderson
Cc: qemu-s3...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/mem_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index ec88f5dbb0..3782c1c098 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -1810,7 +1810,7 @@ void HELPER(cdsg_parallel)(CPUS390XState *env, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx);
 oldv = cpu_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, oi, ra);
 fail = !int128_eq(oldv, cmpv);
 
@@ -1939,7 +1939,7 @@ static uint32_t do_csst(CPUS390XState *env, uint32_t r3, uint64_t a1,
 cpu_stq_data_ra(env, a1 + 0, int128_gethi(nv), ra);
 cpu_stq_data_ra(env, a1 + 8, int128_getlo(nv), ra);
 } else if (HAVE_CMPXCHG128) {
-MemOpIdx oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+MemOpIdx oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx);
 ov = cpu_atomic_cmpxchgo_be_mmu(env, a1, cv, nv, oi, ra);
 cc = !int128_eq(ov, cv);
 } else {
-- 
2.25.1




[PATCH for-6.2 17/43] accel/tcg: Report unaligned atomics for user-only

2021-07-28 Thread Richard Henderson
Use the newly exposed do_unaligned_access hook from atomic_mmu_lookup,
which has access to complete alignment info from the TCGMemOpIdx arg.

Signed-off-by: Richard Henderson 
---
 accel/tcg/user-exec.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 90d1a2d327..dd77e90789 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -852,6 +852,16 @@ int cpu_signal_handler(int host_signum, void *pinfo,
 
 /* The softmmu versions of these helpers are in cputlb.c.  */
 
+static void cpu_unaligned_access(CPUState *cpu, vaddr addr,
+ MMUAccessType access_type,
+ int mmu_idx, uintptr_t ra)
+{
+CPUClass *cc = CPU_GET_CLASS(cpu);
+
+cc->tcg_ops->do_unaligned_access(cpu, addr, access_type, mmu_idx, ra);
+g_assert_not_reached();
+}
+
 uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
 {
 uint32_t ret;
@@ -1230,11 +1240,22 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
TCGMemOpIdx oi, int size, int prot,
uintptr_t retaddr)
 {
+MemOp mop = get_memop(oi);
+int a_bits = get_alignment_bits(mop);
+void *ret;
+
+/* Enforce guest required alignment.  */
+if (unlikely(addr & ((1 << a_bits) - 1))) {
+MMUAccessType t = prot == PAGE_READ ? MMU_DATA_LOAD : MMU_DATA_STORE;
+cpu_unaligned_access(env_cpu(env), addr, t, get_mmuidx(oi), retaddr);
+}
+
 /* Enforce qemu required alignment.  */
 if (unlikely(addr & (size - 1))) {
 cpu_loop_exit_atomic(env_cpu(env), retaddr);
 }
-void *ret = g2h(env_cpu(env), addr);
+
+ret = g2h(env_cpu(env), addr);
 set_helper_retaddr(retaddr);
 return ret;
 }
-- 
2.25.1




[PATCH for-6.2 32/43] accel/tcg: Add cpu_{ld,st}*_mmu interfaces

2021-07-28 Thread Richard Henderson
These functions are much closer to the softmmu helper
functions, in that they take the complete MemOpIdx,
and from that they may enforce required alignment.

The previous cpu_ldst.h functions did not have alignment info,
and so did not enforce it.  Retain that behavior by adding MO_UNALN
to the MemOp that we create when calling the new functions.

Signed-off-by: Richard Henderson 
---
 include/exec/cpu_ldst.h | 245 --
 accel/tcg/cputlb.c  | 392 
 accel/tcg/user-exec.c   | 390 +++
 accel/tcg/ldst_common.c.inc | 307 
 4 files changed, 679 insertions(+), 655 deletions(-)
 create mode 100644 accel/tcg/ldst_common.c.inc

diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index ce6ce82618..a4dad0772f 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -28,10 +28,12 @@
  * load:  cpu_ld{sign}{size}{end}_{mmusuffix}(env, ptr)
  *cpu_ld{sign}{size}{end}_{mmusuffix}_ra(env, ptr, retaddr)
  *cpu_ld{sign}{size}{end}_mmuidx_ra(env, ptr, mmu_idx, retaddr)
+ *cpu_ld{sign}{size}{end}_mmu(env, ptr, oi, retaddr)
  *
  * store: cpu_st{size}{end}_{mmusuffix}(env, ptr, val)
  *cpu_st{size}{end}_{mmusuffix}_ra(env, ptr, val, retaddr)
  *cpu_st{size}{end}_mmuidx_ra(env, ptr, val, mmu_idx, retaddr)
+ *cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)
  *
  * sign is:
  * (empty): for 32 and 64 bit sizes
@@ -53,10 +55,15 @@
  * The "mmuidx" suffix carries an extra mmu_idx argument that specifies
  * the index to use; the "data" and "code" suffixes take the index from
  * cpu_mmu_index().
+ *
+ * The "mmu" suffix carries the full MemOpIdx, with both mmu_idx and the
+ * MemOp including alignment requirements.  The alignment will be enforced.
  */
 #ifndef CPU_LDST_H
 #define CPU_LDST_H
 
+#include "exec/memopidx.h"
+
 #if defined(CONFIG_USER_ONLY)
 /* sparc32plus has 64bit long but 32bit space address
  * this can make bad result with g2h() and h2g()
@@ -118,12 +125,10 @@ typedef target_ulong abi_ptr;
 
 uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr);
 int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr);
-
 uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr);
 int cpu_ldsw_be_data(CPUArchState *env, abi_ptr ptr);
 uint32_t cpu_ldl_be_data(CPUArchState *env, abi_ptr ptr);
 uint64_t cpu_ldq_be_data(CPUArchState *env, abi_ptr ptr);
-
 uint32_t cpu_lduw_le_data(CPUArchState *env, abi_ptr ptr);
 int cpu_ldsw_le_data(CPUArchState *env, abi_ptr ptr);
 uint32_t cpu_ldl_le_data(CPUArchState *env, abi_ptr ptr);
@@ -131,37 +136,31 @@ uint64_t cpu_ldq_le_data(CPUArchState *env, abi_ptr ptr);
 
 uint32_t cpu_ldub_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 int cpu_ldsb_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
-
 uint32_t cpu_lduw_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 int cpu_ldsw_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 uint32_t cpu_ldl_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 uint64_t cpu_ldq_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
-
 uint32_t cpu_lduw_le_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 int cpu_ldsw_le_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 uint32_t cpu_ldl_le_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 uint64_t cpu_ldq_le_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t ra);
 
 void cpu_stb_data(CPUArchState *env, abi_ptr ptr, uint32_t val);
-
 void cpu_stw_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val);
 void cpu_stl_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val);
 void cpu_stq_be_data(CPUArchState *env, abi_ptr ptr, uint64_t val);
-
 void cpu_stw_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val);
 void cpu_stl_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val);
 void cpu_stq_le_data(CPUArchState *env, abi_ptr ptr, uint64_t val);
 
 void cpu_stb_data_ra(CPUArchState *env, abi_ptr ptr,
  uint32_t val, uintptr_t ra);
-
 void cpu_stw_be_data_ra(CPUArchState *env, abi_ptr ptr,
 uint32_t val, uintptr_t ra);
 void cpu_stl_be_data_ra(CPUArchState *env, abi_ptr ptr,
 uint32_t val, uintptr_t ra);
 void cpu_stq_be_data_ra(CPUArchState *env, abi_ptr ptr,
 uint64_t val, uintptr_t ra);
-
 void cpu_stw_le_data_ra(CPUArchState *env, abi_ptr ptr,
 uint32_t val, uintptr_t ra);
 void cpu_stl_le_data_ra(CPUArchState *env, abi_ptr ptr,
@@ -169,6 +168,71 @@ void cpu_stl_le_data_ra(CPUArchState *env, abi_ptr ptr,
 void cpu_stq_le_data_ra(CPUArchState *env, abi_ptr ptr,
 uint64_t val, uintptr_t ra);
 
+uint32_t cpu_ldub_mmuidx_ra(CPUArchState *env, abi_ptr ptr,
+int mmu_idx, uintptr_t ra);
+int cpu_ldsb_mmuidx_ra(CPUArchState *env, abi_ptr ptr,
+   int mmu_idx, uintptr_t ra);
+uint32_t cpu_lduw_be_mmu

[PATCH for-6.2 34/43] target/mips: Use cpu_*_data_ra for msa load/store

2021-07-28 Thread Richard Henderson
We should not have been using the helper_ret_* set of
functions, as they are supposed to be private to tcg.
Nor should we have been using the plain cpu_*_data set
of functions, as they do not handle unwinding properly.

Cc: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/mips/tcg/msa_helper.c | 420 +++
 1 file changed, 135 insertions(+), 285 deletions(-)

diff --git a/target/mips/tcg/msa_helper.c b/target/mips/tcg/msa_helper.c
index 167d9a591c..a8880ce81c 100644
--- a/target/mips/tcg/msa_helper.c
+++ b/target/mips/tcg/msa_helper.c
@@ -8222,79 +8222,42 @@ void helper_msa_ld_b(CPUMIPSState *env, uint32_t wd,
  target_ulong addr)
 {
 wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
-MEMOP_IDX(DF_BYTE)
-#if !defined(CONFIG_USER_ONLY)
+uintptr_t ra = GETPC();
+
 #if !defined(HOST_WORDS_BIGENDIAN)
-pwd->b[0]  = helper_ret_ldub_mmu(env, addr + (0  << DF_BYTE), oi, GETPC());
-pwd->b[1]  = helper_ret_ldub_mmu(env, addr + (1  << DF_BYTE), oi, GETPC());
-pwd->b[2]  = helper_ret_ldub_mmu(env, addr + (2  << DF_BYTE), oi, GETPC());
-pwd->b[3]  = helper_ret_ldub_mmu(env, addr + (3  << DF_BYTE), oi, GETPC());
-pwd->b[4]  = helper_ret_ldub_mmu(env, addr + (4  << DF_BYTE), oi, GETPC());
-pwd->b[5]  = helper_ret_ldub_mmu(env, addr + (5  << DF_BYTE), oi, GETPC());
-pwd->b[6]  = helper_ret_ldub_mmu(env, addr + (6  << DF_BYTE), oi, GETPC());
-pwd->b[7]  = helper_ret_ldub_mmu(env, addr + (7  << DF_BYTE), oi, GETPC());
-pwd->b[8]  = helper_ret_ldub_mmu(env, addr + (8  << DF_BYTE), oi, GETPC());
-pwd->b[9]  = helper_ret_ldub_mmu(env, addr + (9  << DF_BYTE), oi, GETPC());
-pwd->b[10] = helper_ret_ldub_mmu(env, addr + (10 << DF_BYTE), oi, GETPC());
-pwd->b[11] = helper_ret_ldub_mmu(env, addr + (11 << DF_BYTE), oi, GETPC());
-pwd->b[12] = helper_ret_ldub_mmu(env, addr + (12 << DF_BYTE), oi, GETPC());
-pwd->b[13] = helper_ret_ldub_mmu(env, addr + (13 << DF_BYTE), oi, GETPC());
-pwd->b[14] = helper_ret_ldub_mmu(env, addr + (14 << DF_BYTE), oi, GETPC());
-pwd->b[15] = helper_ret_ldub_mmu(env, addr + (15 << DF_BYTE), oi, GETPC());
+pwd->b[0]  = cpu_ldub_data_ra(env, addr + (0  << DF_BYTE), ra);
+pwd->b[1]  = cpu_ldub_data_ra(env, addr + (1  << DF_BYTE), ra);
+pwd->b[2]  = cpu_ldub_data_ra(env, addr + (2  << DF_BYTE), ra);
+pwd->b[3]  = cpu_ldub_data_ra(env, addr + (3  << DF_BYTE), ra);
+pwd->b[4]  = cpu_ldub_data_ra(env, addr + (4  << DF_BYTE), ra);
+pwd->b[5]  = cpu_ldub_data_ra(env, addr + (5  << DF_BYTE), ra);
+pwd->b[6]  = cpu_ldub_data_ra(env, addr + (6  << DF_BYTE), ra);
+pwd->b[7]  = cpu_ldub_data_ra(env, addr + (7  << DF_BYTE), ra);
+pwd->b[8]  = cpu_ldub_data_ra(env, addr + (8  << DF_BYTE), ra);
+pwd->b[9]  = cpu_ldub_data_ra(env, addr + (9  << DF_BYTE), ra);
+pwd->b[10] = cpu_ldub_data_ra(env, addr + (10 << DF_BYTE), ra);
+pwd->b[11] = cpu_ldub_data_ra(env, addr + (11 << DF_BYTE), ra);
+pwd->b[12] = cpu_ldub_data_ra(env, addr + (12 << DF_BYTE), ra);
+pwd->b[13] = cpu_ldub_data_ra(env, addr + (13 << DF_BYTE), ra);
+pwd->b[14] = cpu_ldub_data_ra(env, addr + (14 << DF_BYTE), ra);
+pwd->b[15] = cpu_ldub_data_ra(env, addr + (15 << DF_BYTE), ra);
 #else
-pwd->b[0]  = helper_ret_ldub_mmu(env, addr + (7  << DF_BYTE), oi, GETPC());
-pwd->b[1]  = helper_ret_ldub_mmu(env, addr + (6  << DF_BYTE), oi, GETPC());
-pwd->b[2]  = helper_ret_ldub_mmu(env, addr + (5  << DF_BYTE), oi, GETPC());
-pwd->b[3]  = helper_ret_ldub_mmu(env, addr + (4  << DF_BYTE), oi, GETPC());
-pwd->b[4]  = helper_ret_ldub_mmu(env, addr + (3  << DF_BYTE), oi, GETPC());
-pwd->b[5]  = helper_ret_ldub_mmu(env, addr + (2  << DF_BYTE), oi, GETPC());
-pwd->b[6]  = helper_ret_ldub_mmu(env, addr + (1  << DF_BYTE), oi, GETPC());
-pwd->b[7]  = helper_ret_ldub_mmu(env, addr + (0  << DF_BYTE), oi, GETPC());
-pwd->b[8]  = helper_ret_ldub_mmu(env, addr + (15 << DF_BYTE), oi, GETPC());
-pwd->b[9]  = helper_ret_ldub_mmu(env, addr + (14 << DF_BYTE), oi, GETPC());
-pwd->b[10] = helper_ret_ldub_mmu(env, addr + (13 << DF_BYTE), oi, GETPC());
-pwd->b[11] = helper_ret_ldub_mmu(env, addr + (12 << DF_BYTE), oi, GETPC());
-pwd->b[12] = helper_ret_ldub_mmu(env, addr + (11 << DF_BYTE), oi, GETPC());
-pwd->b[13] = helper_ret_ldub_mmu(env, addr + (10 << DF_BYTE), oi, GETPC());
-pwd->b[14] = helper_ret_ldub_mmu(env, addr + (9  << DF_BYTE), oi, GETPC());
-pwd->b[15] = helper_ret_ldub_mmu(env, addr + (8  << DF_BYTE), oi, GETPC());
-#endif
-#else
-#if !defined(HOST_WORDS_BIGENDIAN)
-pwd->b[0]  = cpu_ldub_data(env, addr + (0  << DF_BYTE));
-pwd->b[1]  = cpu_ldub_data(env, addr + (1  << DF_BYTE));
-pwd->b[2]  = cpu_ldub_data(env, addr + (2  << DF_BYTE));
-pwd->b[3]  = cpu_ldub_data(env, addr + (3  << DF_BYTE));
-pwd->b[4]  = cpu_ldub_data(env, addr + (4  << DF_BYTE));
-pwd->b[5]  = cpu_ldub_data(env, addr + (5  << D

[PATCH for-6.2 15/43] target/sparc: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 linux-user/sparc/cpu_loop.c | 11 +++
 target/sparc/cpu.c  |  2 +-
 target/sparc/ldst_helper.c  |  2 --
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 02532f198d..612e77807e 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -272,6 +272,17 @@ void cpu_loop (CPUSPARCState *env)
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 }
 break;
+case TT_UNALIGNED:
+info.si_signo = TARGET_SIGBUS;
+info.si_errno = 0;
+info.si_code = TARGET_BUS_ADRALN;
+#ifdef TARGET_SPARC64
+info._sifields._sigfault._addr = env->dmmu.sfar;
+#else
+info._sifields._sigfault._addr = env->mmuregs[4];
+#endif
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 case EXCP_DEBUG:
 info.si_signo = TARGET_SIGTRAP;
 info.si_errno = 0;
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index da6b30ec74..d33d41e837 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -865,11 +865,11 @@ static const struct TCGCPUOps sparc_tcg_ops = {
 .synchronize_from_tb = sparc_cpu_synchronize_from_tb,
 .cpu_exec_interrupt = sparc_cpu_exec_interrupt,
 .tlb_fill = sparc_cpu_tlb_fill,
+.do_unaligned_access = sparc_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = sparc_cpu_do_interrupt,
 .do_transaction_failed = sparc_cpu_do_transaction_failed,
-.do_unaligned_access = sparc_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 #endif /* CONFIG_TCG */
diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 7367b48c8b..69b812e68c 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1954,7 +1954,6 @@ void sparc_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
 }
 #endif
 
-#if !defined(CONFIG_USER_ONLY)
 void QEMU_NORETURN sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
  MMUAccessType access_type,
  int mmu_idx,
@@ -1973,4 +1972,3 @@ void QEMU_NORETURN sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 
 cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
 }
-#endif
-- 
2.25.1




[PATCH for-6.2 29/43] target/ppc: Use MO_128 for 16 byte atomics

2021-07-28 Thread Richard Henderson
Cc: qemu-...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/ppc/translate.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 171b216e17..540efa858f 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3461,10 +3461,12 @@ static void gen_std(DisasContext *ctx)
 if (HAVE_ATOMIC128) {
 TCGv_i32 oi = tcg_temp_new_i32();
 if (ctx->le_mode) {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_LEQ, ctx->mem_idx));
+tcg_gen_movi_i32(oi, make_memop_idx(MO_LE | MO_128,
+ctx->mem_idx));
 gen_helper_stq_le_parallel(cpu_env, EA, lo, hi, oi);
 } else {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_BEQ, ctx->mem_idx));
+tcg_gen_movi_i32(oi, make_memop_idx(MO_BE | MO_128,
+ctx->mem_idx));
 gen_helper_stq_be_parallel(cpu_env, EA, lo, hi, oi);
 }
 tcg_temp_free_i32(oi);
@@ -4066,11 +4068,11 @@ static void gen_lqarx(DisasContext *ctx)
 if (HAVE_ATOMIC128) {
 TCGv_i32 oi = tcg_temp_new_i32();
 if (ctx->le_mode) {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_LEQ | MO_ALIGN_16,
+tcg_gen_movi_i32(oi, make_memop_idx(MO_LE | MO_128 | MO_ALIGN,
 ctx->mem_idx));
 gen_helper_lq_le_parallel(lo, cpu_env, EA, oi);
 } else {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_BEQ | MO_ALIGN_16,
+tcg_gen_movi_i32(oi, make_memop_idx(MO_BE | MO_128 | MO_ALIGN,
 ctx->mem_idx));
 gen_helper_lq_be_parallel(lo, cpu_env, EA, oi);
 }
@@ -4121,7 +4123,7 @@ static void gen_stqcx_(DisasContext *ctx)
 
 if (tb_cflags(ctx->base.tb) & CF_PARALLEL) {
 if (HAVE_CMPXCHG128) {
-TCGv_i32 oi = tcg_const_i32(DEF_MEMOP(MO_Q) | MO_ALIGN_16);
+TCGv_i32 oi = tcg_const_i32(DEF_MEMOP(MO_128) | MO_ALIGN);
 if (ctx->le_mode) {
 gen_helper_stqcx_le_parallel(cpu_crf[0], cpu_env,
  EA, lo, hi, oi);
-- 
2.25.1




[PATCH for-6.2 19/43] tcg: Expand MO_SIZE to 3 bits

2021-07-28 Thread Richard Henderson
We have lacked expressive support for memory sizes larger
than 64 bits for a while.  Fixing that requires adjusting
several points where the old two-bit MO_SIZE was used for array
indexing, and two places that develop -Wswitch warnings after
the change.

Signed-off-by: Richard Henderson 
---
 include/exec/memop.h| 14 +-
 target/arm/translate-a64.c  |  2 +-
 tcg/tcg-op.c| 13 -
 target/s390x/tcg/translate_vx.c.inc |  2 +-
 tcg/aarch64/tcg-target.c.inc|  4 ++--
 tcg/arm/tcg-target.c.inc|  4 ++--
 tcg/i386/tcg-target.c.inc   |  4 ++--
 tcg/mips/tcg-target.c.inc   |  4 ++--
 tcg/ppc/tcg-target.c.inc|  8 
 tcg/riscv/tcg-target.c.inc  |  4 ++--
 tcg/s390/tcg-target.c.inc   |  4 ++--
 tcg/sparc/tcg-target.c.inc  | 16 
 12 files changed, 43 insertions(+), 36 deletions(-)

diff --git a/include/exec/memop.h b/include/exec/memop.h
index 529d07b02d..04264ffd6b 100644
--- a/include/exec/memop.h
+++ b/include/exec/memop.h
@@ -19,11 +19,15 @@ typedef enum MemOp {
 MO_16= 1,
 MO_32= 2,
 MO_64= 3,
-MO_SIZE  = 3,   /* Mask for the above.  */
+MO_128   = 4,
+MO_256   = 5,
+MO_512   = 6,
+MO_1024  = 7,
+MO_SIZE  = 0x07,   /* Mask for the above.  */
 
-MO_SIGN  = 4,   /* Sign-extended, otherwise zero-extended.  */
+MO_SIGN  = 0x08,   /* Sign-extended, otherwise zero-extended.  */
 
-MO_BSWAP = 8,   /* Host reverse endian.  */
+MO_BSWAP = 0x10,   /* Host reverse endian.  */
 #ifdef HOST_WORDS_BIGENDIAN
 MO_LE= MO_BSWAP,
 MO_BE= 0,
@@ -59,8 +63,8 @@ typedef enum MemOp {
  * - an alignment to a specified size, which may be more or less than
  *   the access size (MO_ALIGN_x where 'x' is a size in bytes);
  */
-MO_ASHIFT = 4,
-MO_AMASK = 7 << MO_ASHIFT,
+MO_ASHIFT = 5,
+MO_AMASK = 0x7 << MO_ASHIFT,
 #ifdef NEED_CPU_H
 #ifdef TARGET_ALIGNED_ONLY
 MO_ALIGN = 0,
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 422e2ac0c9..247c9672be 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1045,7 +1045,7 @@ static void read_vec_element(DisasContext *s, TCGv_i64 tcg_dest, int srcidx,
  int element, MemOp memop)
 {
 int vect_off = vec_reg_offset(s, srcidx, element, memop & MO_SIZE);
-switch (memop) {
+switch ((unsigned)memop) {
 case MO_8:
 tcg_gen_ld8u_i64(tcg_dest, cpu_env, vect_off);
 break;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index c754396575..e01f68f44d 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2780,10 +2780,13 @@ static inline MemOp tcg_canonicalize_memop(MemOp op, bool is64, bool st)
 }
 break;
 case MO_64:
-if (!is64) {
-tcg_abort();
+if (is64) {
+op &= ~MO_SIGN;
+break;
 }
-break;
+/* fall through */
+default:
+g_assert_not_reached();
 }
 if (st) {
 op &= ~MO_SIGN;
@@ -3095,7 +3098,7 @@ typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, TCGv,
 # define WITH_ATOMIC64(X)
 #endif
 
-static void * const table_cmpxchg[16] = {
+static void * const table_cmpxchg[(MO_SIZE | MO_BSWAP) + 1] = {
 [MO_8] = gen_helper_atomic_cmpxchgb,
 [MO_16 | MO_LE] = gen_helper_atomic_cmpxchgw_le,
 [MO_16 | MO_BE] = gen_helper_atomic_cmpxchgw_be,
@@ -3297,7 +3300,7 @@ static void do_atomic_op_i64(TCGv_i64 ret, TCGv addr, TCGv_i64 val,
 }
 
 #define GEN_ATOMIC_HELPER(NAME, OP, NEW)\
-static void * const table_##NAME[16] = {\
+static void * const table_##NAME[(MO_SIZE | MO_BSWAP) + 1] = {  \
 [MO_8] = gen_helper_atomic_##NAME##b,   \
 [MO_16 | MO_LE] = gen_helper_atomic_##NAME##w_le,   \
 [MO_16 | MO_BE] = gen_helper_atomic_##NAME##w_be,   \
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 0afa46e463..28bf5a23b6 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -67,7 +67,7 @@ static void read_vec_element_i64(TCGv_i64 dst, uint8_t reg, uint8_t enr,
 {
 const int offs = vec_reg_offset(reg, enr, memop & MO_SIZE);
 
-switch (memop) {
+switch ((unsigned)memop) {
 case ES_8:
 tcg_gen_ld8u_i64(dst, cpu_env, offs);
 break;
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 5924977b42..6f43c048a5 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1547,7 +1547,7 @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  * TCGMemOpIdx oi, uintptr_t ra)
  */
-static void * const qemu_ld_helpers[4] = {
+static void * co

[PATCH for-6.2 33/43] accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h

2021-07-28 Thread Richard Henderson
The previous placement in tcg/tcg.h was not logical.

Signed-off-by: Richard Henderson 
---
 include/exec/cpu_ldst.h   | 87 +++
 include/tcg/tcg.h | 87 ---
 target/arm/helper-a64.c   |  1 -
 target/m68k/op_helper.c   |  1 -
 target/ppc/mem_helper.c   |  1 -
 target/s390x/tcg/mem_helper.c |  1 -
 6 files changed, 87 insertions(+), 91 deletions(-)

diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index a4dad0772f..a878fd0105 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -63,6 +63,7 @@
 #define CPU_LDST_H
 
 #include "exec/memopidx.h"
+#include "qemu/int128.h"
 
 #if defined(CONFIG_USER_ONLY)
 /* sparc32plus has 64bit long but 32bit space address
@@ -233,6 +234,92 @@ void cpu_stl_le_mmu(CPUArchState *env, abi_ptr ptr, uint32_t val,
 void cpu_stq_le_mmu(CPUArchState *env, abi_ptr ptr, uint64_t val,
 MemOpIdx oi, uintptr_t ra);
 
+uint32_t cpu_atomic_cmpxchgb_mmu(CPUArchState *env, target_ulong addr,
+ uint32_t cmpv, uint32_t newv,
+ MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgw_le_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgl_le_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t cpu_atomic_cmpxchgq_le_mmu(CPUArchState *env, target_ulong addr,
+uint64_t cmpv, uint64_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgw_be_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgl_be_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t cpu_atomic_cmpxchgq_be_mmu(CPUArchState *env, target_ulong addr,
+uint64_t cmpv, uint64_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+
+#define GEN_ATOMIC_HELPER(NAME, TYPE, SUFFIX) \
+TYPE cpu_atomic_ ## NAME ## SUFFIX ## _mmu\
+(CPUArchState *env, target_ulong addr, TYPE val,  \
+ MemOpIdx oi, uintptr_t retaddr);
+
+#ifdef CONFIG_ATOMIC64
+#define GEN_ATOMIC_HELPER_ALL(NAME)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, b) \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_be)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_be)  \
+GEN_ATOMIC_HELPER(NAME, uint64_t, q_le)  \
+GEN_ATOMIC_HELPER(NAME, uint64_t, q_be)
+#else
+#define GEN_ATOMIC_HELPER_ALL(NAME)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, b) \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_be)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_be)
+#endif
+
+GEN_ATOMIC_HELPER_ALL(fetch_add)
+GEN_ATOMIC_HELPER_ALL(fetch_sub)
+GEN_ATOMIC_HELPER_ALL(fetch_and)
+GEN_ATOMIC_HELPER_ALL(fetch_or)
+GEN_ATOMIC_HELPER_ALL(fetch_xor)
+GEN_ATOMIC_HELPER_ALL(fetch_smin)
+GEN_ATOMIC_HELPER_ALL(fetch_umin)
+GEN_ATOMIC_HELPER_ALL(fetch_smax)
+GEN_ATOMIC_HELPER_ALL(fetch_umax)
+
+GEN_ATOMIC_HELPER_ALL(add_fetch)
+GEN_ATOMIC_HELPER_ALL(sub_fetch)
+GEN_ATOMIC_HELPER_ALL(and_fetch)
+GEN_ATOMIC_HELPER_ALL(or_fetch)
+GEN_ATOMIC_HELPER_ALL(xor_fetch)
+GEN_ATOMIC_HELPER_ALL(smin_fetch)
+GEN_ATOMIC_HELPER_ALL(umin_fetch)
+GEN_ATOMIC_HELPER_ALL(smax_fetch)
+GEN_ATOMIC_HELPER_ALL(umax_fetch)
+
+GEN_ATOMIC_HELPER_ALL(xchg)
+
+#undef GEN_ATOMIC_HELPER_ALL
+#undef GEN_ATOMIC_HELPER
+
+Int128 cpu_atomic_cmpxchgo_le_mmu(CPUArchState *env, target_ulong addr,
+  Int128 cmpv, Int128 newv,
+  MemOpIdx oi, uintptr_t retaddr);
+Int128 cpu_atomic_cmpxchgo_be_mmu(CPUArchState *env, target_ulong addr,
+  Int128 cmpv, Int128 newv,
+  MemOpIdx oi, uintptr_t retaddr);
+
+Int128 cpu_atomic_ldo_le_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+Int128 cpu_atomic_ldo_be_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+void cpu_atomic_sto_le_mmu(CPUArchState *env, target_ulong addr, Int128 val,
+   MemOpIdx oi, uintptr_t retaddr);
+void cpu_atomic_sto_be_mmu(CPUArchState *env, target_ulong addr, Int128 val,
+  

[PATCH for-6.2 24/43] accel/tcg: Pass MemOpIdx to atomic_trace_*_post

2021-07-28 Thread Richard Henderson
We will shortly use the MemOpIdx directly, but in the meantime
re-compute the trace meminfo.

Signed-off-by: Richard Henderson 
---
 accel/tcg/atomic_template.h   | 48 +--
 accel/tcg/atomic_common.c.inc | 30 +++---
 2 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index 4230ff2957..c08d859a8a 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -77,15 +77,15 @@ ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ | PAGE_WRITE, retaddr);
 DATA_TYPE ret;
-uint16_t info = atomic_trace_rmw_pre(env, addr, oi);
 
+atomic_trace_rmw_pre(env, addr, oi);
 #if DATA_SIZE == 16
 ret = atomic16_cmpxchg(haddr, cmpv, newv);
 #else
 ret = qatomic_cmpxchg__nocheck(haddr, cmpv, newv);
 #endif
 ATOMIC_MMU_CLEANUP;
-atomic_trace_rmw_post(env, addr, info);
+atomic_trace_rmw_post(env, addr, oi);
 return ret;
 }
 
@@ -97,11 +97,11 @@ ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr,
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ, retaddr);
 DATA_TYPE val;
-uint16_t info = atomic_trace_ld_pre(env, addr, oi);
 
+atomic_trace_ld_pre(env, addr, oi);
 val = atomic16_read(haddr);
 ATOMIC_MMU_CLEANUP;
-atomic_trace_ld_post(env, addr, info);
+atomic_trace_ld_post(env, addr, oi);
 return val;
 }
 
@@ -110,11 +110,11 @@ void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
 {
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_WRITE, retaddr);
-uint16_t info = atomic_trace_st_pre(env, addr, oi);
 
+atomic_trace_st_pre(env, addr, oi);
 atomic16_set(haddr, val);
 ATOMIC_MMU_CLEANUP;
-atomic_trace_st_post(env, addr, info);
+atomic_trace_st_post(env, addr, oi);
 }
 #endif
 #else
@@ -124,11 +124,11 @@ ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr, ABI_TYPE val,
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ | PAGE_WRITE, retaddr);
 DATA_TYPE ret;
-uint16_t info = atomic_trace_rmw_pre(env, addr, oi);
 
+atomic_trace_rmw_pre(env, addr, oi);
 ret = qatomic_xchg__nocheck(haddr, val);
 ATOMIC_MMU_CLEANUP;
-atomic_trace_rmw_post(env, addr, info);
+atomic_trace_rmw_post(env, addr, oi);
 return ret;
 }
 
@@ -139,10 +139,10 @@ ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,  \
  PAGE_READ | PAGE_WRITE, retaddr); \
 DATA_TYPE ret;  \
-uint16_t info = atomic_trace_rmw_pre(env, addr, oi);\
+atomic_trace_rmw_pre(env, addr, oi);\
 ret = qatomic_##X(haddr, val);  \
 ATOMIC_MMU_CLEANUP; \
-atomic_trace_rmw_post(env, addr, info); \
+atomic_trace_rmw_post(env, addr, oi);   \
 return ret; \
 }
 
@@ -172,7 +172,7 @@ ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
 XDATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE, \
   PAGE_READ | PAGE_WRITE, retaddr); \
 XDATA_TYPE cmp, old, new, val = xval;   \
-uint16_t info = atomic_trace_rmw_pre(env, addr, oi);\
+atomic_trace_rmw_pre(env, addr, oi);\
 smp_mb();   \
 cmp = qatomic_read__nocheck(haddr); \
 do {\
@@ -180,7 +180,7 @@ ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
 cmp = qatomic_cmpxchg__nocheck(haddr, old, new);\
 } while (cmp != old);   \
 ATOMIC_MMU_CLEANUP; \
-atomic_trace_rmw_post(env, addr, info); \
+atomic_trace_rmw_post(env, addr, oi);   \
 return RET; \
 }
 
@@ -216,15 +216,15 @@ ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
 DATA_TYPE *haddr = atomic_mmu_lookup(env, addr, oi, DATA_SIZE,
  PAGE_READ | PAGE_WRITE, retaddr);
 DATA_TYPE ret;
-uint16_t info = atomic_trace_rmw_pre(env, addr, oi);
 
+atomic_trace_

[PATCH for-6.2 16/43] target/xtensa: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/cpu.c|  2 +-
 target/xtensa/helper.c | 30 +++---
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index 58ec3a0862..41816d91f6 100644
--- a/target/xtensa/cpu.c
+++ b/target/xtensa/cpu.c
@@ -195,11 +195,11 @@ static const struct TCGCPUOps xtensa_tcg_ops = {
 .cpu_exec_interrupt = xtensa_cpu_exec_interrupt,
 .tlb_fill = xtensa_cpu_tlb_fill,
 .debug_excp_handler = xtensa_breakpoint_handler,
+.do_unaligned_access = xtensa_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = xtensa_cpu_do_interrupt,
 .do_transaction_failed = xtensa_cpu_do_transaction_failed,
-.do_unaligned_access = xtensa_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 
diff --git a/target/xtensa/helper.c b/target/xtensa/helper.c
index f18ab383fd..a5296399c5 100644
--- a/target/xtensa/helper.c
+++ b/target/xtensa/helper.c
@@ -242,6 +242,21 @@ void xtensa_cpu_list(void)
 }
 }
 
+void xtensa_cpu_do_unaligned_access(CPUState *cs,
+vaddr addr, MMUAccessType access_type,
+int mmu_idx, uintptr_t retaddr)
+{
+XtensaCPU *cpu = XTENSA_CPU(cs);
+CPUXtensaState *env = &cpu->env;
+
+assert(xtensa_option_enabled(env->config,
+ XTENSA_OPTION_UNALIGNED_EXCEPTION));
+cpu_restore_state(CPU(cpu), retaddr, true);
+HELPER(exception_cause_vaddr)(env,
+  env->pc, LOAD_STORE_ALIGNMENT_CAUSE,
+  addr);
+}
+
 #ifdef CONFIG_USER_ONLY
 
 bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
@@ -263,21 +278,6 @@ bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 
 #else /* !CONFIG_USER_ONLY */
 
-void xtensa_cpu_do_unaligned_access(CPUState *cs,
-vaddr addr, MMUAccessType access_type,
-int mmu_idx, uintptr_t retaddr)
-{
-XtensaCPU *cpu = XTENSA_CPU(cs);
-CPUXtensaState *env = &cpu->env;
-
-assert(xtensa_option_enabled(env->config,
- XTENSA_OPTION_UNALIGNED_EXCEPTION));
-cpu_restore_state(CPU(cpu), retaddr, true);
-HELPER(exception_cause_vaddr)(env,
-  env->pc, LOAD_STORE_ALIGNMENT_CAUSE,
-  addr);
-}
-
 bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
  MMUAccessType access_type, int mmu_idx,
  bool probe, uintptr_t retaddr)
-- 
2.25.1




[PATCH for-6.2 27/43] target/arm: Use MO_128 for 16 byte atomics

2021-07-28 Thread Richard Henderson
Cc: qemu-...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/arm/helper-a64.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 13d1e3f808..f06399f351 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -560,7 +560,7 @@ uint64_t HELPER(paired_cmpxchg64_le_parallel)(CPUARMState *env, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx);
 
 cmpv = int128_make128(env->exclusive_val, env->exclusive_high);
 newv = int128_make128(new_lo, new_hi);
@@ -630,7 +630,7 @@ uint64_t HELPER(paired_cmpxchg64_be_parallel)(CPUARMState *env, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_BEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_BE | MO_128 | MO_ALIGN, mem_idx);
 
 /*
  * High and low need to be switched here because this is not actually a
@@ -656,7 +656,7 @@ void HELPER(casp_le_parallel)(CPUARMState *env, uint32_t rs, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx);
 
 cmpv = int128_make128(env->xregs[rs], env->xregs[rs + 1]);
 newv = int128_make128(new_lo, new_hi);
@@ -677,7 +677,7 @@ void HELPER(casp_be_parallel)(CPUARMState *env, uint32_t rs, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx);
 
 cmpv = int128_make128(env->xregs[rs + 1], env->xregs[rs]);
 newv = int128_make128(new_lo, new_hi);
-- 
2.25.1




[PATCH for-6.2 13/43] target/sparc: Remove DEBUG_UNALIGNED

2021-07-28 Thread Richard Henderson
The printf should have been qemu_log_mask; the parameters
themselves no longer compile; and because this is placed
before unwinding, the PC is actively wrong.

We get better (and correct) logging on the other side of
raising the exception, in sparc_cpu_do_interrupt.

Cc: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/ldst_helper.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 22327d7d72..974afea041 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -27,7 +27,6 @@
 
 //#define DEBUG_MMU
 //#define DEBUG_MXCC
-//#define DEBUG_UNALIGNED
 //#define DEBUG_UNASSIGNED
 //#define DEBUG_ASI
 //#define DEBUG_CACHE_CONTROL
@@ -364,10 +363,6 @@ static void do_check_align(CPUSPARCState *env, target_ulong addr,
uint32_t align, uintptr_t ra)
 {
 if (addr & align) {
-#ifdef DEBUG_UNALIGNED
-printf("Unaligned access to 0x" TARGET_FMT_lx " from 0x" TARGET_FMT_lx
-   "\n", addr, env->pc);
-#endif
 cpu_raise_exception_ra(env, TT_UNALIGNED, ra);
 }
 }
@@ -1968,10 +1963,6 @@ void QEMU_NORETURN sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 SPARCCPU *cpu = SPARC_CPU(cs);
 CPUSPARCState *env = &cpu->env;
 
-#ifdef DEBUG_UNALIGNED
-printf("Unaligned access to 0x" TARGET_FMT_lx " from 0x" TARGET_FMT_lx
-   "\n", addr, env->pc);
-#endif
 cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
 }
 #endif
-- 
2.25.1




[PATCH for-6.2 12/43] target/sh4: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: Yoshinori Sato 
Signed-off-by: Richard Henderson 
---
 linux-user/sh4/cpu_loop.c | 8 
 target/sh4/cpu.c  | 2 +-
 target/sh4/op_helper.c| 3 ---
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/linux-user/sh4/cpu_loop.c b/linux-user/sh4/cpu_loop.c
index 222ed1c670..21d97250a8 100644
--- a/linux-user/sh4/cpu_loop.c
+++ b/linux-user/sh4/cpu_loop.c
@@ -71,6 +71,14 @@ void cpu_loop(CPUSH4State *env)
 info._sifields._sigfault._addr = env->tea;
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
+case 0xe0:
+case 0x100:
+info.si_signo = TARGET_SIGBUS;
+info.si_errno = 0;
+info.si_code = TARGET_BUS_ADRALN;
+info._sifields._sigfault._addr = env->tea;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 case EXCP_ATOMIC:
 cpu_exec_step_atomic(cs);
 arch_interrupt = false;
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
index 8326922942..b60234cd31 100644
--- a/target/sh4/cpu.c
+++ b/target/sh4/cpu.c
@@ -238,10 +238,10 @@ static const struct TCGCPUOps superh_tcg_ops = {
 .synchronize_from_tb = superh_cpu_synchronize_from_tb,
 .cpu_exec_interrupt = superh_cpu_exec_interrupt,
 .tlb_fill = superh_cpu_tlb_fill,
+.do_unaligned_access = superh_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = superh_cpu_do_interrupt,
-.do_unaligned_access = superh_cpu_do_unaligned_access,
 .io_recompile_replay_branch = superh_io_recompile_replay_branch,
 #endif /* !CONFIG_USER_ONLY */
 };
diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index d6d70c339f..b46fc1bf11 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -23,7 +23,6 @@
 #include "exec/cpu_ldst.h"
 #include "fpu/softfloat.h"
 
-#ifndef CONFIG_USER_ONLY
 
 void superh_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 MMUAccessType access_type,
@@ -46,8 +45,6 @@ void superh_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 cpu_loop_exit_restore(cs, retaddr);
 }
 
-#endif
-
 void helper_ldtlb(CPUSH4State *env)
 {
 #ifdef CONFIG_USER_ONLY
-- 
2.25.1




[PATCH for-6.2 23/43] accel/tcg: Remove double bswap for helper_atomic_sto_*_mmu

2021-07-28 Thread Richard Henderson
This crept in as either a cut-and-paste error or a rebase error.

Fixes: cfec388518d
Signed-off-by: Richard Henderson 
---
 accel/tcg/atomic_template.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/accel/tcg/atomic_template.h b/accel/tcg/atomic_template.h
index 4427fab6df..4230ff2957 100644
--- a/accel/tcg/atomic_template.h
+++ b/accel/tcg/atomic_template.h
@@ -251,7 +251,6 @@ void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr, 
ABI_TYPE val,
  PAGE_WRITE, retaddr);
 uint16_t info = atomic_trace_st_pre(env, addr, oi);
 
-val = BSWAP(val);
 val = BSWAP(val);
 atomic16_set(haddr, val);
 ATOMIC_MMU_CLEANUP;
-- 
2.25.1




[PATCH for-6.2 11/43] target/sh4: Set fault address in superh_cpu_do_unaligned_access

2021-07-28 Thread Richard Henderson
We ought to have been recording the virtual address for reporting
to the guest trap handler.

Cc: Yoshinori Sato 
Signed-off-by: Richard Henderson 
---
 target/sh4/op_helper.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index c0cbb95382..d6d70c339f 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -29,6 +29,9 @@ void superh_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 MMUAccessType access_type,
 int mmu_idx, uintptr_t retaddr)
 {
+CPUSH4State *env = cs->env_ptr;
+
+env->tea = addr;
 switch (access_type) {
 case MMU_INST_FETCH:
 case MMU_DATA_LOAD:
@@ -37,6 +40,8 @@ void superh_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 case MMU_DATA_STORE:
 cs->exception_index = 0x100;
 break;
+default:
+g_assert_not_reached();
 }
 cpu_loop_exit_restore(cs, retaddr);
 }
-- 
2.25.1




[PATCH for-6.2 14/43] target/sparc: Set fault address in sparc_cpu_do_unaligned_access

2021-07-28 Thread Richard Henderson
We ought to have been recording the virtual address for reporting
to the guest trap handler.  Mirror the SFSR FIXME from the sparc64
version of get_physical_address_data.

Cc: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/ldst_helper.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 974afea041..7367b48c8b 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1963,6 +1963,14 @@ void QEMU_NORETURN 
sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 SPARCCPU *cpu = SPARC_CPU(cs);
 CPUSPARCState *env = &cpu->env;
 
+#ifdef TARGET_SPARC64
+/* FIXME: ASI field in SFSR must be set */
+env->dmmu.sfsr = SFSR_VALID_BIT; /* Fault status register */
+env->dmmu.sfar = addr;   /* Fault address register */
+#else
+env->mmuregs[4] = addr;
+#endif
+
 cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
 }
 #endif
-- 
2.25.1




[PATCH for-6.2 09/43] target/riscv: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: qemu-ri...@nongnu.org
Signed-off-by: Richard Henderson 
---
 linux-user/riscv/cpu_loop.c | 7 +++
 target/riscv/cpu.c  | 2 +-
 target/riscv/cpu_helper.c   | 8 +++-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
index 74a9628dc9..0428140d86 100644
--- a/linux-user/riscv/cpu_loop.c
+++ b/linux-user/riscv/cpu_loop.c
@@ -92,6 +92,13 @@ void cpu_loop(CPURISCVState *env)
 sigcode = TARGET_SEGV_MAPERR;
 sigaddr = env->badaddr;
 break;
+case RISCV_EXCP_INST_ADDR_MIS:
+case RISCV_EXCP_LOAD_ADDR_MIS:
+case RISCV_EXCP_STORE_AMO_ADDR_MIS:
+signum = TARGET_SIGBUS;
+sigcode = TARGET_BUS_ADRALN;
+sigaddr = env->badaddr;
+break;
 case RISCV_EXCP_SEMIHOST:
 env->gpr[xA0] = do_common_semihosting(cs);
 env->pc += 4;
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 991a6bb760..591d17e62d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -644,11 +644,11 @@ static const struct TCGCPUOps riscv_tcg_ops = {
 .synchronize_from_tb = riscv_cpu_synchronize_from_tb,
 .cpu_exec_interrupt = riscv_cpu_exec_interrupt,
 .tlb_fill = riscv_cpu_tlb_fill,
+.do_unaligned_access = riscv_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = riscv_cpu_do_interrupt,
 .do_transaction_failed = riscv_cpu_do_transaction_failed,
-.do_unaligned_access = riscv_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 968cb8046f..a440b2834f 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -727,6 +727,7 @@ void riscv_cpu_do_transaction_failed(CPUState *cs, hwaddr 
physaddr,
 riscv_cpu_two_stage_lookup(mmu_idx);
 riscv_raise_exception(&cpu->env, cs->exception_index, retaddr);
 }
+#endif /* !CONFIG_USER_ONLY */
 
 void riscv_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
MMUAccessType access_type, int mmu_idx,
@@ -734,6 +735,7 @@ void riscv_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 {
 RISCVCPU *cpu = RISCV_CPU(cs);
 CPURISCVState *env = &cpu->env;
+
 switch (access_type) {
 case MMU_INST_FETCH:
 cs->exception_index = RISCV_EXCP_INST_ADDR_MIS;
@@ -748,11 +750,15 @@ void riscv_cpu_do_unaligned_access(CPUState *cs, vaddr 
addr,
 g_assert_not_reached();
 }
 env->badaddr = addr;
+
+#ifdef CONFIG_USER_ONLY
+cpu_loop_exit_restore(cs, retaddr);
+#else
 env->two_stage_lookup = riscv_cpu_virt_enabled(env) ||
 riscv_cpu_two_stage_lookup(mmu_idx);
 riscv_raise_exception(env, cs->exception_index, retaddr);
+#endif
 }
-#endif /* !CONFIG_USER_ONLY */
 
 bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 MMUAccessType access_type, int mmu_idx,
-- 
2.25.1




[PATCH for-6.2 04/43] target/hppa: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/hppa/cpu_loop.c | 2 +-
 target/hppa/cpu.c  | 8 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/linux-user/hppa/cpu_loop.c b/linux-user/hppa/cpu_loop.c
index 82d8183821..5ce30fec8b 100644
--- a/linux-user/hppa/cpu_loop.c
+++ b/linux-user/hppa/cpu_loop.c
@@ -161,7 +161,7 @@ void cpu_loop(CPUHPPAState *env)
 case EXCP_UNALIGN:
 info.si_signo = TARGET_SIGBUS;
 info.si_errno = 0;
-info.si_code = 0;
+info.si_code = TARGET_BUS_ADRALN;
 info._sifields._sigfault._addr = env->cr[CR_IOR];
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
index 2eace4ee12..55c0d81046 100644
--- a/target/hppa/cpu.c
+++ b/target/hppa/cpu.c
@@ -71,7 +71,6 @@ static void hppa_cpu_disas_set_info(CPUState *cs, 
disassemble_info *info)
 info->print_insn = print_insn_hppa;
 }
 
-#ifndef CONFIG_USER_ONLY
 static void hppa_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
  MMUAccessType access_type,
  int mmu_idx, uintptr_t retaddr)
@@ -80,15 +79,18 @@ static void hppa_cpu_do_unaligned_access(CPUState *cs, 
vaddr addr,
 CPUHPPAState *env = &cpu->env;
 
 cs->exception_index = EXCP_UNALIGN;
+#ifdef CONFIG_USER_ONLY
+env->cr[CR_IOR] = addr;
+#else
 if (env->psw & PSW_Q) {
 /* ??? Needs tweaking for hppa64.  */
 env->cr[CR_IOR] = addr;
 env->cr[CR_ISR] = addr >> 32;
 }
+#endif
 
 cpu_loop_exit_restore(cs, retaddr);
 }
-#endif /* CONFIG_USER_ONLY */
 
 static void hppa_cpu_realizefn(DeviceState *dev, Error **errp)
 {
@@ -146,10 +148,10 @@ static const struct TCGCPUOps hppa_tcg_ops = {
 .synchronize_from_tb = hppa_cpu_synchronize_from_tb,
 .cpu_exec_interrupt = hppa_cpu_exec_interrupt,
 .tlb_fill = hppa_cpu_tlb_fill,
+.do_unaligned_access = hppa_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = hppa_cpu_do_interrupt,
-.do_unaligned_access = hppa_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 
-- 
2.25.1




[PATCH for-6.2 18/43] accel/tcg: Drop signness in tracing in cputlb.c

2021-07-28 Thread Richard Henderson
We are already inconsistent about whether or not
MO_SIGN is set in trace_mem_get_info.  Dropping it
entirely allows some simplification.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c| 10 +++---
 accel/tcg/user-exec.c | 45 ++-
 2 files changed, 9 insertions(+), 46 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index b1e5471f94..0a1fdbefdd 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2119,7 +2119,6 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, 
abi_ptr addr,
 meminfo = trace_mem_get_info(op, mmu_idx, false);
 trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
 
-op &= ~MO_SIGN;
 oi = make_memop_idx(op, mmu_idx);
 ret = full_load(env, addr, oi, retaddr);
 
@@ -2137,8 +2136,7 @@ uint32_t cpu_ldub_mmuidx_ra(CPUArchState *env, abi_ptr 
addr,
 int cpu_ldsb_mmuidx_ra(CPUArchState *env, abi_ptr addr,
int mmu_idx, uintptr_t ra)
 {
-return (int8_t)cpu_load_helper(env, addr, mmu_idx, ra, MO_SB,
-   full_ldub_mmu);
+return (int8_t)cpu_ldub_mmuidx_ra(env, addr, mmu_idx, ra);
 }
 
 uint32_t cpu_lduw_be_mmuidx_ra(CPUArchState *env, abi_ptr addr,
@@ -2150,8 +2148,7 @@ uint32_t cpu_lduw_be_mmuidx_ra(CPUArchState *env, abi_ptr 
addr,
 int cpu_ldsw_be_mmuidx_ra(CPUArchState *env, abi_ptr addr,
   int mmu_idx, uintptr_t ra)
 {
-return (int16_t)cpu_load_helper(env, addr, mmu_idx, ra, MO_BESW,
-full_be_lduw_mmu);
+return (int16_t)cpu_lduw_be_mmuidx_ra(env, addr, mmu_idx, ra);
 }
 
 uint32_t cpu_ldl_be_mmuidx_ra(CPUArchState *env, abi_ptr addr,
@@ -2175,8 +2172,7 @@ uint32_t cpu_lduw_le_mmuidx_ra(CPUArchState *env, abi_ptr 
addr,
 int cpu_ldsw_le_mmuidx_ra(CPUArchState *env, abi_ptr addr,
   int mmu_idx, uintptr_t ra)
 {
-return (int16_t)cpu_load_helper(env, addr, mmu_idx, ra, MO_LESW,
-full_le_lduw_mmu);
+return (int16_t)cpu_lduw_le_mmuidx_ra(env, addr, mmu_idx, ra);
 }
 
 uint32_t cpu_ldl_le_mmuidx_ra(CPUArchState *env, abi_ptr addr,
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index dd77e90789..f17b75e0aa 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -875,13 +875,7 @@ uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
 
 int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr)
 {
-int ret;
-uint16_t meminfo = trace_mem_get_info(MO_SB, MMU_USER_IDX, false);
-
-trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
-ret = ldsb_p(g2h(env_cpu(env), ptr));
-qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
-return ret;
+return (int8_t)cpu_ldub_data(env, ptr);
 }
 
 uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
@@ -897,13 +891,7 @@ uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
 
 int cpu_ldsw_be_data(CPUArchState *env, abi_ptr ptr)
 {
-int ret;
-uint16_t meminfo = trace_mem_get_info(MO_BESW, MMU_USER_IDX, false);
-
-trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
-ret = ldsw_be_p(g2h(env_cpu(env), ptr));
-qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
-return ret;
+return (int16_t)cpu_lduw_be_data(env, ptr);
 }
 
 uint32_t cpu_ldl_be_data(CPUArchState *env, abi_ptr ptr)
@@ -941,13 +929,7 @@ uint32_t cpu_lduw_le_data(CPUArchState *env, abi_ptr ptr)
 
 int cpu_ldsw_le_data(CPUArchState *env, abi_ptr ptr)
 {
-int ret;
-uint16_t meminfo = trace_mem_get_info(MO_LESW, MMU_USER_IDX, false);
-
-trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
-ret = ldsw_le_p(g2h(env_cpu(env), ptr));
-qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
-return ret;
+return (int16_t)cpu_lduw_le_data(env, ptr);
 }
 
 uint32_t cpu_ldl_le_data(CPUArchState *env, abi_ptr ptr)
@@ -984,12 +966,7 @@ uint32_t cpu_ldub_data_ra(CPUArchState *env, abi_ptr ptr, 
uintptr_t retaddr)
 
 int cpu_ldsb_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t retaddr)
 {
-int ret;
-
-set_helper_retaddr(retaddr);
-ret = cpu_ldsb_data(env, ptr);
-clear_helper_retaddr();
-return ret;
+return (int8_t)cpu_ldub_data_ra(env, ptr, retaddr);
 }
 
 uint32_t cpu_lduw_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t retaddr)
@@ -1004,12 +981,7 @@ uint32_t cpu_lduw_be_data_ra(CPUArchState *env, abi_ptr 
ptr, uintptr_t retaddr)
 
 int cpu_ldsw_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t retaddr)
 {
-int ret;
-
-set_helper_retaddr(retaddr);
-ret = cpu_ldsw_be_data(env, ptr);
-clear_helper_retaddr();
-return ret;
+return (int16_t)cpu_lduw_be_data_ra(env, ptr, retaddr);
 }
 
 uint32_t cpu_ldl_be_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t retaddr)
@@ -1044,12 +1016,7 @@ uint32_t cpu_lduw_le_data_ra(CPUArchState *env, abi_ptr 
ptr, uintptr_t retaddr)
 
 int cpu_ldsw_le_data_ra(CPUArchState *env, abi_ptr ptr, uintptr_t retaddr)
 {
-int

[PATCH for-6.2 10/43] target/s390x: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: qemu-s3...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/s390x/cpu.c |  2 +-
 target/s390x/tcg/excp_helper.c | 28 +++-
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 7b7b05f1d3..9d8cfb37cd 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -267,12 +267,12 @@ static void s390_cpu_reset_full(DeviceState *dev)
 static const struct TCGCPUOps s390_tcg_ops = {
 .initialize = s390x_translate_init,
 .tlb_fill = s390_cpu_tlb_fill,
+.do_unaligned_access = s390x_cpu_do_unaligned_access,
 
 #if !defined(CONFIG_USER_ONLY)
 .cpu_exec_interrupt = s390_cpu_exec_interrupt,
 .do_interrupt = s390_cpu_do_interrupt,
 .debug_excp_handler = s390x_cpu_debug_excp_handler,
-.do_unaligned_access = s390x_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 #endif /* CONFIG_TCG */
diff --git a/target/s390x/tcg/excp_helper.c b/target/s390x/tcg/excp_helper.c
index a61917d04f..9cbe160f66 100644
--- a/target/s390x/tcg/excp_helper.c
+++ b/target/s390x/tcg/excp_helper.c
@@ -82,6 +82,21 @@ void HELPER(data_exception)(CPUS390XState *env, uint32_t dxc)
 tcg_s390_data_exception(env, dxc, GETPC());
 }
 
+/*
+ * Unaligned accesses are only diagnosed with MO_ALIGN.  At the moment,
+ * this is only for the atomic operations, for which we want to raise a
+ * specification exception.
+ */
+void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
+   MMUAccessType access_type,
+   int mmu_idx, uintptr_t retaddr)
+{
+S390CPU *cpu = S390_CPU(cs);
+CPUS390XState *env = &cpu->env;
+
+tcg_s390_program_interrupt(env, PGM_SPECIFICATION, retaddr);
+}
+
 #if defined(CONFIG_USER_ONLY)
 
 void s390_cpu_do_interrupt(CPUState *cs)
@@ -602,19 +617,6 @@ void s390x_cpu_debug_excp_handler(CPUState *cs)
 }
 }
 
-/* Unaligned accesses are only diagnosed with MO_ALIGN.  At the moment,
-   this is only for the atomic operations, for which we want to raise a
-   specification exception.  */
-void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
-   MMUAccessType access_type,
-   int mmu_idx, uintptr_t retaddr)
-{
-S390CPU *cpu = S390_CPU(cs);
-CPUS390XState *env = &cpu->env;
-
-tcg_s390_program_interrupt(env, PGM_SPECIFICATION, retaddr);
-}
-
 static void QEMU_NORETURN monitor_event(CPUS390XState *env,
 uint64_t monitor_code,
 uint8_t monitor_class, uintptr_t ra)
-- 
2.25.1




[PATCH for-6.2 08/43] target/ppc: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: qemu-...@nongnu.org
Signed-off-by: Richard Henderson 
---
 linux-user/ppc/cpu_loop.c | 2 +-
 target/ppc/cpu_init.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/ppc/cpu_loop.c b/linux-user/ppc/cpu_loop.c
index fa91ea0eed..d72d30248b 100644
--- a/linux-user/ppc/cpu_loop.c
+++ b/linux-user/ppc/cpu_loop.c
@@ -165,7 +165,7 @@ void cpu_loop(CPUPPCState *env)
 info.si_signo = TARGET_SIGBUS;
 info.si_errno = 0;
 info.si_code = TARGET_BUS_ADRALN;
-info._sifields._sigfault._addr = env->nip;
+info._sifields._sigfault._addr = env->spr[SPR_DAR];
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
 case POWERPC_EXCP_PROGRAM:  /* Program exception */
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 505a0ed6ac..84fb6bbb83 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -9014,12 +9014,12 @@ static const struct TCGCPUOps ppc_tcg_ops = {
   .initialize = ppc_translate_init,
   .cpu_exec_interrupt = ppc_cpu_exec_interrupt,
   .tlb_fill = ppc_cpu_tlb_fill,
+  .do_unaligned_access = ppc_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
   .do_interrupt = ppc_cpu_do_interrupt,
   .cpu_exec_enter = ppc_cpu_exec_enter,
   .cpu_exec_exit = ppc_cpu_exec_exit,
-  .do_unaligned_access = ppc_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 #endif /* CONFIG_TCG */
-- 
2.25.1




[PATCH for-6.2 05/43] target/microblaze: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: Edgar E. Iglesias 
Signed-off-by: Richard Henderson 
---
 target/microblaze/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 72d8f2a0da..cbec062ed7 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -367,11 +367,11 @@ static const struct TCGCPUOps mb_tcg_ops = {
 .synchronize_from_tb = mb_cpu_synchronize_from_tb,
 .cpu_exec_interrupt = mb_cpu_exec_interrupt,
 .tlb_fill = mb_cpu_tlb_fill,
+.do_unaligned_access = mb_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = mb_cpu_do_interrupt,
 .do_transaction_failed = mb_cpu_transaction_failed,
-.do_unaligned_access = mb_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 
-- 
2.25.1




[PATCH for-6.2 03/43] target/arm: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: qemu-...@nongnu.org
Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/cpu_loop.c |  4 
 linux-user/arm/cpu_loop.c | 43 +++
 target/arm/cpu.c  |  2 +-
 target/arm/cpu_tcg.c  |  2 +-
 4 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/linux-user/aarch64/cpu_loop.c b/linux-user/aarch64/cpu_loop.c
index ee72a1c20f..998831f87f 100644
--- a/linux-user/aarch64/cpu_loop.c
+++ b/linux-user/aarch64/cpu_loop.c
@@ -137,6 +137,10 @@ void cpu_loop(CPUARMState *env)
 case 0x11: /* Synchronous Tag Check Fault */
 info.si_code = TARGET_SEGV_MTESERR;
 break;
+case 0x21: /* Alignment fault */
+info.si_signo = TARGET_SIGBUS;
+info.si_code = TARGET_BUS_ADRALN;
+break;
 default:
 g_assert_not_reached();
 }
diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
index 69632d15be..da7da6a0c1 100644
--- a/linux-user/arm/cpu_loop.c
+++ b/linux-user/arm/cpu_loop.c
@@ -23,6 +23,7 @@
 #include "elf.h"
 #include "cpu_loop-common.h"
 #include "semihosting/common-semi.h"
+#include "target/arm/syndrome.h"
 
 #define get_user_code_u32(x, gaddr, env)\
 ({ abi_long __r = get_user_u32((x), (gaddr));   \
@@ -286,9 +287,8 @@ void cpu_loop(CPUARMState *env)
 {
 CPUState *cs = env_cpu(env);
 int trapnr;
-unsigned int n, insn;
+unsigned int n, insn, ec, fsc;
 target_siginfo_t info;
-uint32_t addr;
 abi_ulong ret;
 
 for(;;) {
@@ -437,15 +437,40 @@ void cpu_loop(CPUARMState *env)
 break;
 case EXCP_PREFETCH_ABORT:
 case EXCP_DATA_ABORT:
-addr = env->exception.vaddress;
-{
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-/* XXX: check env->error_code */
+info.si_signo = TARGET_SIGSEGV;
+info.si_errno = 0;
+info._sifields._sigfault._addr = env->exception.vaddress;
+/*
+ * We should only arrive here with EC in {DATAABORT, INSNABORT},
+ * and short-form FSC, which then tells us to look at the FSR.
+ * ??? arm_cpu_reset never sets TTBCR_EAE, so we always get
+ * short-form FSC.
+ */
+ec = syn_get_ec(env->exception.syndrome);
+assert(ec == EC_DATAABORT || ec == EC_INSNABORT);
+fsc = extract32(env->exception.syndrome, 0, 6);
+assert(fsc == 0x3f);
+switch (env->exception.fsr & 0x1f) {
+case 0x1: /* Alignment */
+info.si_signo = TARGET_SIGBUS;
+info.si_code = TARGET_BUS_ADRALN;
+break;
+case 0x3: /* Access flag fault, level 1 */
+case 0x6: /* Access flag fault, level 2 */
+case 0x9: /* Domain fault, level 1 */
+case 0xb: /* Domain fault, level 2 */
+case 0xd: /* Permission fault, level 1 */
+case 0xf: /* Permission fault, level 2 */
+info.si_code = TARGET_SEGV_ACCERR;
+break;
+case 0x5: /* Translation fault, level 1 */
+case 0x7: /* Translation fault, level 2 */
 info.si_code = TARGET_SEGV_MAPERR;
-info._sifields._sigfault._addr = addr;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
+default:
+g_assert_not_reached();
 }
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
 case EXCP_DEBUG:
 case EXCP_BKPT:
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 2866dd7658..de0d968d76 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1987,11 +1987,11 @@ static const struct TCGCPUOps arm_tcg_ops = {
 .cpu_exec_interrupt = arm_cpu_exec_interrupt,
 .tlb_fill = arm_cpu_tlb_fill,
 .debug_excp_handler = arm_debug_excp_handler,
+.do_unaligned_access = arm_cpu_do_unaligned_access,
 
 #if !defined(CONFIG_USER_ONLY)
 .do_interrupt = arm_cpu_do_interrupt,
 .do_transaction_failed = arm_cpu_do_transaction_failed,
-.do_unaligned_access = arm_cpu_do_unaligned_access,
 .adjust_watchpoint_address = arm_adjust_watchpoint_address,
 .debug_check_watchpoint = arm_debug_check_watchpoint,
 .debug_check_breakpoint = arm_debug_check_breakpoint,
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index ed444bf436..1b91fdc890 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -904,11 +904,11 @@ static const struct TCGCPUOps arm_v7m_tcg_ops = {
 .cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt,
 .tlb_fill = arm_cpu_tlb_fill,
 .debug_excp_handler = arm_debug_excp_handler,
+.do_unaligned_access = arm_cpu_do_unaligned_access,
 
 #if !defined(CONFIG_USER_ONLY)
 .do_interrupt = arm_v7m_cpu_do_int

[PATCH for-6.2 06/43] target/mips: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Cc: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 linux-user/mips/cpu_loop.c| 20 
 target/mips/cpu.c |  2 +-
 target/mips/tcg/op_helper.c   |  3 +--
 target/mips/tcg/user/tlb_helper.c | 23 +++
 4 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/linux-user/mips/cpu_loop.c b/linux-user/mips/cpu_loop.c
index 9d813ece4e..51f4eb65a6 100644
--- a/linux-user/mips/cpu_loop.c
+++ b/linux-user/mips/cpu_loop.c
@@ -158,12 +158,24 @@ done_syscall:
 break;
 case EXCP_TLBL:
 case EXCP_TLBS:
-case EXCP_AdEL:
-case EXCP_AdES:
 info.si_signo = TARGET_SIGSEGV;
 info.si_errno = 0;
-/* XXX: check env->error_code */
-info.si_code = TARGET_SEGV_MAPERR;
+info.si_code = (env->error_code & EXCP_TLB_NOMATCH
+? TARGET_SEGV_MAPERR : TARGET_SEGV_ACCERR);
+info._sifields._sigfault._addr = env->CP0_BadVAddr;
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
+case EXCP_AdEL:
+case EXCP_AdES:
+/*
+ * Note that on real hw AdE is also raised for access to a
+ * kernel address from user mode instead of a TLB error.
+ * For simplicity, we do not distinguish this in the user
+ * version of mips_cpu_tlb_fill so only unaligned comes here.
+ */
+info.si_signo = TARGET_SIGBUS;
+info.si_errno = 0;
+info.si_code = TARGET_BUS_ADRALN;
 info._sifields._sigfault._addr = env->CP0_BadVAddr;
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 break;
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index d426918291..a1658af910 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -541,11 +541,11 @@ static const struct TCGCPUOps mips_tcg_ops = {
 .synchronize_from_tb = mips_cpu_synchronize_from_tb,
 .cpu_exec_interrupt = mips_cpu_exec_interrupt,
 .tlb_fill = mips_cpu_tlb_fill,
+.do_unaligned_access = mips_cpu_do_unaligned_access,
 
 #if !defined(CONFIG_USER_ONLY)
 .do_interrupt = mips_cpu_do_interrupt,
 .do_transaction_failed = mips_cpu_do_transaction_failed,
-.do_unaligned_access = mips_cpu_do_unaligned_access,
 .io_recompile_replay_branch = mips_io_recompile_replay_branch,
 #endif /* !CONFIG_USER_ONLY */
 };
diff --git a/target/mips/tcg/op_helper.c b/target/mips/tcg/op_helper.c
index fafbf1faca..0b874823e4 100644
--- a/target/mips/tcg/op_helper.c
+++ b/target/mips/tcg/op_helper.c
@@ -375,8 +375,6 @@ void helper_pmon(CPUMIPSState *env, int function)
 }
 }
 
-#if !defined(CONFIG_USER_ONLY)
-
 void mips_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
   MMUAccessType access_type,
   int mmu_idx, uintptr_t retaddr)
@@ -402,6 +400,7 @@ void mips_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 do_raise_exception_err(env, excp, error_code, retaddr);
 }
 
+#if !defined(CONFIG_USER_ONLY)
 void mips_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
 vaddr addr, unsigned size,
 MMUAccessType access_type,
diff --git a/target/mips/tcg/user/tlb_helper.c 
b/target/mips/tcg/user/tlb_helper.c
index b835144b82..61a99356e9 100644
--- a/target/mips/tcg/user/tlb_helper.c
+++ b/target/mips/tcg/user/tlb_helper.c
@@ -26,24 +26,23 @@ static void raise_mmu_exception(CPUMIPSState *env, 
target_ulong address,
 MMUAccessType access_type)
 {
 CPUState *cs = env_cpu(env);
+int error_code = 0;
+int flags;
 
-env->error_code = 0;
 if (access_type == MMU_INST_FETCH) {
-env->error_code |= EXCP_INST_NOTAVAIL;
+error_code |= EXCP_INST_NOTAVAIL;
 }
 
-/* Reference to kernel address from user mode or supervisor mode */
-/* Reference to supervisor address from user mode */
-if (access_type == MMU_DATA_STORE) {
-cs->exception_index = EXCP_AdES;
-} else {
-cs->exception_index = EXCP_AdEL;
+flags = page_get_flags(address);
+if (!(flags & PAGE_VALID)) {
+error_code |= EXCP_TLB_NOMATCH;
 }
 
-/* Raise exception */
-if (!(env->hflags & MIPS_HFLAG_DM)) {
-env->CP0_BadVAddr = address;
-}
+cs->exception_index = (access_type == MMU_DATA_STORE
+   ? EXCP_TLBS : EXCP_TLBL);
+
+env->error_code = error_code;
+env->CP0_BadVAddr = address;
 }
 
 bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-- 
2.25.1




[PATCH for-6.2 07/43] target/ppc: Set fault address in ppc_cpu_do_unaligned_access

2021-07-28 Thread Richard Henderson
We ought to have been recording the virtual address for reporting
to the guest trap handler.

Cc: qemu-...@nongnu.org
Signed-off-by: Richard Henderson 
---
 target/ppc/excp_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index a79a0ed465..0b2c6de442 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -1503,6 +1503,8 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr 
vaddr,
 CPUPPCState *env = cs->env_ptr;
 uint32_t insn;
 
+env->spr[SPR_DAR] = vaddr;
+
 /* Restore state and reload the insn we executed, for filling in DSISR.  */
 cpu_restore_state(cs, retaddr, true);
 insn = cpu_ldl_code(env, env->nip);
-- 
2.25.1




[PATCH for-6.2 00/43] Unaligned accesses for user-only

2021-07-28 Thread Richard Henderson
This began with Peter wanting a cpu_ldst.h interface that can handle
alignment info for Arm M-profile system mode, which will also compile
for user-only without ifdefs.  This is patch 32.

Once I had that interface, I thought I might as well enforce the
requested alignment in user-only.  There are plenty of cases where
we ought to have been doing that for quite a while.  This took rather
more work than I imagined to start.

So far only x86 host has been fully converted to handle unaligned
operations in user-only mode.  I'll get to the others later.  But
the added testcase is fairly broad, and caught lots of bugs and/or
missing code between target/ and linux-user/.

Notes:
  * For target/i386 we have no way to signal SIGBUS from user-only.
In theory we could go through do_unaligned_access in system mode,
via #AC.  But we don't even implement that control in tcg, probably
because no one ever sets it.  The cmpxchg16b insn requires alignment,
but raises #GP, which maps to SIGSEGV.

  * For target/s390x we have no way to signal SIGBUS from user-only.
The atomic operations raise PGM_SPECIFICATION, which the linux
kernel maps to SIGILL.

  * I think target/hexagon should be setting TARGET_ALIGNED_ONLY=y.
In the meantime, all memory accesses are allowed to be unaligned.


r~


Richard Henderson (43):
  hw/core: Make do_unaligned_access available to user-only
  target/alpha: Implement do_unaligned_access for user-only
  target/arm: Implement do_unaligned_access for user-only
  target/hppa: Implement do_unaligned_access for user-only
  target/microblaze: Implement do_unaligned_access for user-only
  target/mips: Implement do_unaligned_access for user-only
  target/ppc: Set fault address in ppc_cpu_do_unaligned_access
  target/ppc: Implement do_unaligned_access for user-only
  target/riscv: Implement do_unaligned_access for user-only
  target/s390x: Implement do_unaligned_access for user-only
  target/sh4: Set fault address in superh_cpu_do_unaligned_access
  target/sh4: Implement do_unaligned_access for user-only
  target/sparc: Remove DEBUG_UNALIGNED
  target/sparc: Set fault address in sparc_cpu_do_unaligned_access
  target/sparc: Implement do_unaligned_access for user-only
  target/xtensa: Implement do_unaligned_access for user-only
  accel/tcg: Report unaligned atomics for user-only
  accel/tcg: Drop signness in tracing in cputlb.c
  tcg: Expand MO_SIZE to 3 bits
  tcg: Rename TCGMemOpIdx to MemOpIdx
  tcg: Split out MemOpIdx to exec/memopidx.h
  trace/mem: Pass MemOpIdx to trace_mem_get_info
  accel/tcg: Remove double bswap for helper_atomic_sto_*_mmu
  accel/tcg: Pass MemOpIdx to atomic_trace_*_post
  plugins: Reorg arguments to qemu_plugin_vcpu_mem_cb
  trace: Split guest_mem_before
  target/arm: Use MO_128 for 16 byte atomics
  target/i386: Use MO_128 for 16 byte atomics
  target/ppc: Use MO_128 for 16 byte atomics
  target/s390x: Use MO_128 for 16 byte atomics
  target/hexagon: Implement cpu_mmu_index
  accel/tcg: Add cpu_{ld,st}*_mmu interfaces
  accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h
  target/mips: Use cpu_*_data_ra for msa load/store
  target/mips: Use 8-byte memory ops for msa load/store
  target/s390x: Use cpu_*_mmu instead of helper_*_mmu
  target/sparc: Use cpu_*_mmu instead of helper_*_mmu
  target/arm: Use cpu_*_mmu instead of helper_*_mmu
  tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h
  linux-user/alpha: Remove TARGET_ALIGNED_ONLY
  tcg: Add helper_unaligned_mmu for user-only sigbus
  tcg/i386: Support raising sigbus for user-only
  tests/tcg/multiarch: Add sigbus.c

 configs/targets/alpha-linux-user.mak |   1 -
 accel/tcg/atomic_template.h  |  74 ++--
 include/exec/cpu_ldst.h  | 332 +-
 include/exec/memop.h |  14 +-
 include/exec/memopidx.h  |  55 +++
 include/hw/core/tcg-cpu-ops.h|  14 +-
 include/qemu/plugin.h|  26 +-
 include/tcg/tcg-ldst.h   |  79 +
 include/tcg/tcg.h| 197 +--
 target/hexagon/cpu.h |   9 +
 tcg/i386/tcg-target.h|   2 -
 trace/mem.h  |  63 
 accel/tcg/cputlb.c   | 486 +--
 accel/tcg/plugin-gen.c   |   5 +-
 accel/tcg/user-exec.c| 444 ++--
 linux-user/aarch64/cpu_loop.c|   4 +
 linux-user/arm/cpu_loop.c|  43 ++-
 linux-user/hppa/cpu_loop.c   |   2 +-
 linux-user/mips/cpu_loop.c   |  20 +-
 linux-user/ppc/cpu_loop.c|   2 +-
 linux-user/riscv/cpu_loop.c  |   7 +
 linux-user/sh4/cpu_loop.c|   8 +
 linux-user/sparc/cpu_loop.c  |  11 +
 plugins/api.c|  19 +-
 plugins/core.c   |  10 +-
 target/alpha/cpu.c   |   2 +-
 target/alpha/mem_helper.c|   8 +-
 target/alpha/translate.c |   8 +-
 target/arm/cpu.c

[PATCH for-6.2 02/43] target/alpha: Implement do_unaligned_access for user-only

2021-07-28 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/alpha/cpu.c| 2 +-
 target/alpha/mem_helper.c | 8 +++-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index 4871ad0c0a..cb7e5261bd 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -220,11 +220,11 @@ static const struct TCGCPUOps alpha_tcg_ops = {
 .initialize = alpha_translate_init,
 .cpu_exec_interrupt = alpha_cpu_exec_interrupt,
 .tlb_fill = alpha_cpu_tlb_fill,
+.do_unaligned_access = alpha_cpu_do_unaligned_access,
 
 #ifndef CONFIG_USER_ONLY
 .do_interrupt = alpha_cpu_do_interrupt,
 .do_transaction_failed = alpha_cpu_do_transaction_failed,
-.do_unaligned_access = alpha_cpu_do_unaligned_access,
 #endif /* !CONFIG_USER_ONLY */
 };
 
diff --git a/target/alpha/mem_helper.c b/target/alpha/mem_helper.c
index 75e72bc337..e3cf98b270 100644
--- a/target/alpha/mem_helper.c
+++ b/target/alpha/mem_helper.c
@@ -23,30 +23,28 @@
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
 
-/* Softmmu support */
-#ifndef CONFIG_USER_ONLY
 void alpha_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
MMUAccessType access_type,
int mmu_idx, uintptr_t retaddr)
 {
 AlphaCPU *cpu = ALPHA_CPU(cs);
 CPUAlphaState *env = &cpu->env;
-uint64_t pc;
 uint32_t insn;
 
 cpu_restore_state(cs, retaddr, true);
 
-pc = env->pc;
-insn = cpu_ldl_code(env, pc);
+insn = cpu_ldl_code(env, env->pc);
 
 env->trap_arg0 = addr;
 env->trap_arg1 = insn >> 26;/* opcode */
 env->trap_arg2 = (insn >> 21) & 31; /* dest regno */
+
 cs->exception_index = EXCP_UNALIGN;
 env->error_code = 0;
 cpu_loop_exit(cs);
 }
 
+#ifndef CONFIG_USER_ONLY
 void alpha_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
  vaddr addr, unsigned size,
  MMUAccessType access_type,
-- 
2.25.1




[PATCH for-6.2 01/43] hw/core: Make do_unaligned_access available to user-only

2021-07-28 Thread Richard Henderson
We shouldn't be ignoring SIGBUS for user-only.
Move our existing TCGCPUOps hook out from CONFIG_SOFTMMU.

Signed-off-by: Richard Henderson 
---
 include/hw/core/tcg-cpu-ops.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index eab27d0c03..513d6bfe72 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -60,6 +60,13 @@ struct TCGCPUOps {
 /** @debug_excp_handler: Callback for handling debug exceptions */
 void (*debug_excp_handler)(CPUState *cpu);
 
+/**
+ * @do_unaligned_access: Callback for unaligned access handling
+ */
+void (*do_unaligned_access)(CPUState *cpu, vaddr addr,
+MMUAccessType access_type,
+int mmu_idx, uintptr_t retaddr);
+
 #ifdef NEED_CPU_H
 #ifdef CONFIG_SOFTMMU
 /**
@@ -70,13 +77,6 @@ struct TCGCPUOps {
   unsigned size, MMUAccessType access_type,
   int mmu_idx, MemTxAttrs attrs,
   MemTxResult response, uintptr_t retaddr);
-/**
- * @do_unaligned_access: Callback for unaligned access handling
- */
-void (*do_unaligned_access)(CPUState *cpu, vaddr addr,
-MMUAccessType access_type,
-int mmu_idx, uintptr_t retaddr);
-
 /**
  * @adjust_watchpoint_address: hack for cpu_check_watchpoint used by ARM
  */
-- 
2.25.1




Re: [PATCH for-6.2 v3 11/11] machine: Move smp_prefer_sockets to struct SMPCompatProps

2021-07-28 Thread David Gibson
On Wed, Jul 28, 2021 at 11:48:48AM +0800, Yanan Wang wrote:
> Now we have a common structure SMPCompatProps used to store information
> about SMP compatibility stuff, so we can also move smp_prefer_sockets
> there for cleaner code.
> 
> No functional change intended.
> 
> Signed-off-by: Yanan Wang 

Acked-by: David Gibson 

> ---
>  hw/arm/virt.c  | 2 +-
>  hw/core/machine.c  | 2 +-
>  hw/i386/pc_piix.c  | 2 +-
>  hw/i386/pc_q35.c   | 2 +-
>  hw/ppc/spapr.c | 2 +-
>  hw/s390x/s390-virtio-ccw.c | 2 +-
>  include/hw/boards.h| 3 ++-
>  7 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 7babea40dc..ae029680da 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2797,7 +2797,7 @@ static void virt_machine_6_1_options(MachineClass *mc)
>  {
>  virt_machine_6_2_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> -mc->smp_prefer_sockets = true;
> +mc->smp_props.prefer_sockets = true;
>  }
>  DEFINE_VIRT_MACHINE(6, 1)
>  
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 8f84e38e2e..61d1f643f4 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -834,7 +834,7 @@ static void smp_parse(MachineState *ms, SMPConfiguration 
> *config, Error **errp)
>  } else {
>  maxcpus = maxcpus > 0 ? maxcpus : cpus;
>  
> -if (mc->smp_prefer_sockets) {
> +if (mc->smp_props.prefer_sockets) {
>  /* prefer sockets over cores over threads before 6.2 */
>  if (sockets == 0) {
>  cores = cores > 0 ? cores : 1;
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 9b811fc6ca..a60ebfc2c1 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -432,7 +432,7 @@ static void pc_i440fx_6_1_machine_options(MachineClass *m)
>  m->is_default = false;
>  compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len);
>  compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len);
> -m->smp_prefer_sockets = true;
> +m->smp_props.prefer_sockets = true;
>  }
>  
>  DEFINE_I440FX_MACHINE(v6_1, "pc-i440fx-6.1", NULL,
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 88efb7fde4..4b622ffb82 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -372,7 +372,7 @@ static void pc_q35_6_1_machine_options(MachineClass *m)
>  m->alias = NULL;
>  compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len);
>  compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len);
> -m->smp_prefer_sockets = true;
> +m->smp_props.prefer_sockets = true;
>  }
>  
>  DEFINE_Q35_MACHINE(v6_1, "pc-q35-6.1", NULL,
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a481fade51..efdea43c0d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4702,7 +4702,7 @@ static void 
> spapr_machine_6_1_class_options(MachineClass *mc)
>  {
>  spapr_machine_6_2_class_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> -mc->smp_prefer_sockets = true;
> +mc->smp_props.prefer_sockets = true;
>  }
>  
>  DEFINE_SPAPR_MACHINE(6_1, "6.1", false);
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index b40e647883..5bdef9b4d7 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -809,7 +809,7 @@ static void ccw_machine_6_1_class_options(MachineClass 
> *mc)
>  {
>  ccw_machine_6_2_class_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> -mc->smp_prefer_sockets = true;
> +mc->smp_props.prefer_sockets = true;
>  }
>  DEFINE_CCW_MACHINE(6_1, "6.1", false);
>  
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 72123f594d..23671a0f8f 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -110,9 +110,11 @@ typedef struct {
>  
>  /**
>   * SMPCompatProps:
> + * @prefer_sockets - whether sockets is preferred over cores in smp parsing
>   * @dies_supported - whether dies is supported by the machine
>   */
>  typedef struct {
> +bool prefer_sockets;
>  bool dies_supported;
>  } SMPCompatProps;
>  
> @@ -250,7 +252,6 @@ struct MachineClass {
>  bool nvdimm_supported;
>  bool numa_mem_supported;
>  bool auto_enable_numa;
> -bool smp_prefer_sockets;
>  SMPCompatProps smp_props;
>  const char *default_ram_id;
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH] tests: Fix migration-test build failure for sparc

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/28/21 11:41 PM, Peter Xu wrote:
> Even if <linux/kvm.h> seems to exist for all archs on Linux, including it
> with only __linux__ defined does not work yet, as it will try to include
> asm/kvm.h, which can be missing for archs that do not support KVM.
> 
> To fix this (instead of attempting to fix the Linux headers..), we can mark
> the header as x86_64-only, because so far it only serves the kvm dirty
> ring test.
> 
> No need to have "Fixes" as the issue was introduced only very recently.


Personally I find it very useful to navigate in gitk without having
to use git-blame.

Fixes: 1f546b709d6 ("tests: migration-test: Add dirty ring test")
Reviewed-by: Philippe Mathieu-Daudé 

> 
> Reported-by: Richard Henderson 
> Signed-off-by: Peter Xu 
> ---
>  tests/qtest/migration-test.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 1e8b7784ef..cc5e83d98a 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -27,7 +27,8 @@
>  #include "migration-helpers.h"
>  #include "tests/migration/migration-test.h"
>  
> -#if defined(__linux__)
> +/* For dirty ring test; so far only x86_64 is supported */
> +#if defined(__linux__) && defined(HOST_X86_64)
>  #include "linux/kvm.h"
>  #endif
>  
> @@ -1395,7 +1396,7 @@ static void test_multifd_tcp_cancel(void)
>  
>  static bool kvm_dirty_ring_supported(void)
>  {
> -#if defined(__linux__)
> +#if defined(__linux__) && defined(HOST_X86_64)
>  int ret, kvm_fd = open("/dev/kvm", O_RDONLY);
>  
>  if (kvm_fd < 0) {
> 




RE: [PATCH v2] This is a test mail

2021-07-28 Thread ishii.shuuic...@fujitsu.com
Hi Peter.

> These ones seem to have reached both qemu-devel and qemu-arm \o/
> 
> https://lists.gnu.org/archive/html/qemu-devel/2021-07/msg06355.html
> https://lists.gnu.org/archive/html/qemu-arm/2021-07/msg00391.html

Thank you for contacting us.
Since our email seems to have successfully reached the list, 
I would like to create and post v2 of the patch series 
( https://lists.nongnu.org/archive/html/qemu-arm/2021-07/msg00322.html ) 
that you commented on the other day. 

Best regards.

> -Original Message-
> From: Peter Maydell 
> Sent: Monday, July 26, 2021 7:02 PM
> To: Ishii, Shuuichirou
> Cc: qemu-arm ; QEMU Developers
> 
> Subject: Re: [PATCH v2] This is a test mail
> 
> On Mon, 26 Jul 2021 at 09:21, Shuuichirou Ishii 
> wrote:
> >
> > This is a test mail to check the behavior of my mail because it is not
> > listed in the ML of qemu-devel.
> > I may send several test mails.
> >
> > I apologize and thank you for your patience.
> 
> These ones seem to have reached both qemu-devel and qemu-arm \o/
> 
> https://lists.gnu.org/archive/html/qemu-devel/2021-07/msg06355.html
> https://lists.gnu.org/archive/html/qemu-arm/2021-07/msg00391.html
> 
> -- PMM


Re: [PATCH] tests: Fix migration-test build failure for sparc

2021-07-28 Thread Richard Henderson

On 7/28/21 11:41 AM, Peter Xu wrote:

Even if <linux/kvm.h> seems to exist for all archs on Linux, including it
with only __linux__ defined does not work yet, as it will try to include
asm/kvm.h, which can be missing for archs that do not support KVM.

To fix this (instead of attempting to fix the Linux headers..), we can mark
the header as x86_64-only, because so far it only serves the kvm dirty
ring test.

No need to have "Fixes" as the issue was introduced only very recently.


What an odd thing to say.  How do I know that without the link?
Fixes: 1f546b709d61

Anyway,
Reviewed-by: Richard Henderson 

r~



Reported-by: Richard Henderson 
Signed-off-by: Peter Xu 
---
  tests/qtest/migration-test.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 1e8b7784ef..cc5e83d98a 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -27,7 +27,8 @@
  #include "migration-helpers.h"
  #include "tests/migration/migration-test.h"
  
-#if defined(__linux__)

+/* For dirty ring test; so far only x86_64 is supported */
+#if defined(__linux__) && defined(HOST_X86_64)
  #include "linux/kvm.h"
  #endif
  
@@ -1395,7 +1396,7 @@ static void test_multifd_tcp_cancel(void)
  
  static bool kvm_dirty_ring_supported(void)

  {
-#if defined(__linux__)
+#if defined(__linux__) && defined(HOST_X86_64)
  int ret, kvm_fd = open("/dev/kvm", O_RDONLY);
  
  if (kvm_fd < 0) {







Re: [PATCH] tests: Fix migration-test build failure for sparc

2021-07-28 Thread Peter Xu
On Wed, Jul 28, 2021 at 05:41:28PM -0400, Peter Xu wrote:
> No need to have "Fixes" as the issue is just introduced very recently.

And.. This is only true if this patch can be merged in 6.1...

I should have added "for 6.1" in the subject but I forgot.  Sorry.

-- 
Peter Xu




[PATCH] tests: Fix migration-test build failure for sparc

2021-07-28 Thread Peter Xu
Even if <linux/kvm.h> seems to exist for all archs on Linux, including it
with only __linux__ defined does not work yet, as it will try to include
asm/kvm.h, which can be missing for archs that do not support KVM.

To fix this (instead of attempting to fix the Linux headers..), we can mark
the header as x86_64-only, because so far it only serves the kvm dirty
ring test.

No need to have "Fixes" as the issue was introduced only very recently.

Reported-by: Richard Henderson 
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 1e8b7784ef..cc5e83d98a 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -27,7 +27,8 @@
 #include "migration-helpers.h"
 #include "tests/migration/migration-test.h"
 
-#if defined(__linux__)
+/* For dirty ring test; so far only x86_64 is supported */
+#if defined(__linux__) && defined(HOST_X86_64)
 #include "linux/kvm.h"
 #endif
 
@@ -1395,7 +1396,7 @@ static void test_multifd_tcp_cancel(void)
 
 static bool kvm_dirty_ring_supported(void)
 {
-#if defined(__linux__)
+#if defined(__linux__) && defined(HOST_X86_64)
 int ret, kvm_fd = open("/dev/kvm", O_RDONLY);
 
 if (kvm_fd < 0) {
-- 
2.31.1




Re: [PATCH v2 2/2] tests: migration-test: Add dirty ring test

2021-07-28 Thread Peter Xu
On Wed, Jul 28, 2021 at 11:11:30AM -1000, Richard Henderson wrote:
> On 7/28/21 10:37 AM, Peter Xu wrote:
> > A quick fix attached; would that work for us?
> 
> Looks plausible, though perhaps just as easy to list the 5 platforms as just 
> the one:
> 
> #if defined(__linux__) && \
> (defined(HOST_X86_64) || \
>  defined(HOST_S390X) || \
>  ...)
> # define HAVE_KVM
> #endif

That looks good to me, especially in the long term as a way to identify whether
kvm is available; but for the short term I hope I can still use the (literally :)
simpler patch as attached, which will hopefully be more welcome as rc2+
material..

Note again that the kvm.h inclusion is so far only for the kvm dirty ring test
in migration-test, and that test is only supported on x86_64, so we won't lose
anything on the remaining 4 archs.

Thanks!

-- 
Peter Xu




Re: [PATCH v2 2/2] tests: migration-test: Add dirty ring test

2021-07-28 Thread Richard Henderson

On 7/28/21 10:37 AM, Peter Xu wrote:

A quick fix attached; would that work for us?


Looks plausible, though perhaps just as easy to list the 5 platforms as just 
the one:

#if defined(__linux__) && \
(defined(HOST_X86_64) || \
 defined(HOST_S390X) || \
 ...)
# define HAVE_KVM
#endif


r~



Re: [PATCH for-6.2 v3 11/11] machine: Move smp_prefer_sockets to struct SMPCompatProps

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:48AM +0800, Yanan Wang wrote:
> Now we have a common structure SMPCompatProps used to store information
> about SMP compatibility stuff, so we can also move smp_prefer_sockets
> there for cleaner code.
> 
> No functional change intended.
> 
> Signed-off-by: Yanan Wang 
> ---
>  hw/arm/virt.c  | 2 +-
>  hw/core/machine.c  | 2 +-
>  hw/i386/pc_piix.c  | 2 +-
>  hw/i386/pc_q35.c   | 2 +-
>  hw/ppc/spapr.c | 2 +-
>  hw/s390x/s390-virtio-ccw.c | 2 +-
>  include/hw/boards.h| 3 ++-
>  7 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 7babea40dc..ae029680da 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2797,7 +2797,7 @@ static void virt_machine_6_1_options(MachineClass *mc)
>  {
>  virt_machine_6_2_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> -mc->smp_prefer_sockets = true;
> +mc->smp_props.prefer_sockets = true;
>  }
>  DEFINE_VIRT_MACHINE(6, 1)
>  
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 8f84e38e2e..61d1f643f4 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -834,7 +834,7 @@ static void smp_parse(MachineState *ms, SMPConfiguration 
> *config, Error **errp)
>  } else {
>  maxcpus = maxcpus > 0 ? maxcpus : cpus;
>  
> -if (mc->smp_prefer_sockets) {
> +if (mc->smp_props.prefer_sockets) {
>  /* prefer sockets over cores over threads before 6.2 */
>  if (sockets == 0) {
>  cores = cores > 0 ? cores : 1;
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 9b811fc6ca..a60ebfc2c1 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -432,7 +432,7 @@ static void pc_i440fx_6_1_machine_options(MachineClass *m)
>  m->is_default = false;
>  compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len);
>  compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len);
> -m->smp_prefer_sockets = true;
> +m->smp_props.prefer_sockets = true;
>  }
>  
>  DEFINE_I440FX_MACHINE(v6_1, "pc-i440fx-6.1", NULL,
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 88efb7fde4..4b622ffb82 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -372,7 +372,7 @@ static void pc_q35_6_1_machine_options(MachineClass *m)
>  m->alias = NULL;
>  compat_props_add(m->compat_props, hw_compat_6_1, hw_compat_6_1_len);
>  compat_props_add(m->compat_props, pc_compat_6_1, pc_compat_6_1_len);
> -m->smp_prefer_sockets = true;
> +m->smp_props.prefer_sockets = true;
>  }
>  
>  DEFINE_Q35_MACHINE(v6_1, "pc-q35-6.1", NULL,
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a481fade51..efdea43c0d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4702,7 +4702,7 @@ static void 
> spapr_machine_6_1_class_options(MachineClass *mc)
>  {
>  spapr_machine_6_2_class_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> -mc->smp_prefer_sockets = true;
> +mc->smp_props.prefer_sockets = true;
>  }
>  
>  DEFINE_SPAPR_MACHINE(6_1, "6.1", false);
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index b40e647883..5bdef9b4d7 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -809,7 +809,7 @@ static void ccw_machine_6_1_class_options(MachineClass 
> *mc)
>  {
>  ccw_machine_6_2_class_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> -mc->smp_prefer_sockets = true;
> +mc->smp_props.prefer_sockets = true;
>  }
>  DEFINE_CCW_MACHINE(6_1, "6.1", false);
>  
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 72123f594d..23671a0f8f 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -110,9 +110,11 @@ typedef struct {
>  
>  /**
>   * SMPCompatProps:
> + * @prefer_sockets - whether sockets is preferred over cores in smp parsing
>   * @dies_supported - whether dies is supported by the machine
>   */
>  typedef struct {
> +bool prefer_sockets;
>  bool dies_supported;
>  } SMPCompatProps;
>  
> @@ -250,7 +252,6 @@ struct MachineClass {
>  bool nvdimm_supported;
>  bool numa_mem_supported;
>  bool auto_enable_numa;
> -bool smp_prefer_sockets;
>  SMPCompatProps smp_props;
>  const char *default_ram_id;
>  
> -- 
> 2.19.1
>

 
Reviewed-by: Andrew Jones 




Re: [PATCH for-6.2 v3 09/11] machine: Make smp_parse generic enough for all arches

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 10:38:57PM +0200, Andrew Jones wrote:
> On Wed, Jul 28, 2021 at 11:48:46AM +0800, Yanan Wang wrote:
> > @@ -248,6 +256,7 @@ struct MachineClass {
> >  bool numa_mem_supported;
> >  bool auto_enable_numa;
> >  bool smp_prefer_sockets;
> > +SMPCompatProps smp_props;
> >  const char *default_ram_id;
> >  
> >  HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> > -- 
> > 2.19.1
> >
> 
> What about putting smp_prefer_sockets in SMPCompatProps too (as
> prefer_sockets)?

Ah, I see patch 11/11 does that.

Thanks,
drew

> 
> Otherwise
> 
> Reviewed-by: Andrew Jones 




Re: [PATCH for-6.2 v3 10/11] machine: Remove smp_parse callback from MachineClass

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:47AM +0800, Yanan Wang wrote:
> Now we have a generic smp parser for all arches, and there will
> not be any other arch specific ones, so let's remove the callback
> from MachineClass and call the parser directly.
> 
> Signed-off-by: Yanan Wang 
> ---
>  hw/core/machine.c   | 3 +--
>  include/hw/boards.h | 5 -
>  2 files changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 76b6c3bc64..8f84e38e2e 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -934,7 +934,7 @@ static void machine_set_smp(Object *obj, Visitor *v, 
> const char *name,
>  goto out_free;
>  }
>  
> -mc->smp_parse(ms, config, errp);
> +smp_parse(ms, config, errp);
>  if (errp) {
>  goto out_free;
>  }
> @@ -963,7 +963,6 @@ static void machine_class_init(ObjectClass *oc, void 
> *data)
>  /* Default 128 MB as guest ram size */
>  mc->default_ram_size = 128 * MiB;
>  mc->rom_file_has_mr = true;
> -mc->smp_parse = smp_parse;
>  
>  /* numa node memory size aligned on 8MB by default.
>   * On Linux, each node's border has to be 8MB aligned
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 0631900c08..72123f594d 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -177,10 +177,6 @@ typedef struct {
>   *kvm-type may be NULL if it is not needed.
>   * @numa_mem_supported:
>   *true if '--numa node.mem' option is supported and false otherwise
> - * @smp_parse:
> - *The function pointer to hook different machine specific functions for
> - *parsing "smp-opts" from QemuOpts to MachineState::CpuTopology and more
> - *machine specific topology fields, such as smp_dies for PCMachine.
>   * @hotplug_allowed:
>   *If the hook is provided, then it'll be called for each device
>   *hotplug to check whether the device hotplug is allowed.  Return
> @@ -217,7 +213,6 @@ struct MachineClass {
>  void (*reset)(MachineState *state);
>  void (*wakeup)(MachineState *state);
>  int (*kvm_type)(MachineState *machine, const char *arg);
> -void (*smp_parse)(MachineState *ms, SMPConfiguration *config, Error 
> **errp);
>  
>  BlockInterfaceType block_default_type;
>  int units_per_default_bus;
> -- 
> 2.19.1
>

 
Reviewed-by: Andrew Jones 




Re: [PATCH for-6.2 v3 09/11] machine: Make smp_parse generic enough for all arches

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:46AM +0800, Yanan Wang wrote:
> Currently the only difference between smp_parse and pc_smp_parse
> is the support of dies parameter and the related error reporting.
> With some arch compat variables like "bool dies_supported", we can
> make smp_parse generic enough for all arches and the PC specific
> one can be removed.
> 
> Making smp_parse() generic enough can reduce code duplication and
> ease the code maintenance, and also allows extending the topology
> with more arch specific members (e.g., clusters) in the future.
> 
> Suggested-by: Andrew Jones 
> Signed-off-by: Yanan Wang 
> ---
>  hw/core/machine.c   | 96 +++--
>  hw/i386/pc.c| 83 +--
>  include/hw/boards.h |  9 +
>  3 files changed, 93 insertions(+), 95 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 9223ece3ea..76b6c3bc64 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -15,6 +15,7 @@
>  #include "qapi/qmp/qerror.h"
>  #include "sysemu/replay.h"
>  #include "qemu/units.h"
> +#include "qemu/cutils.h"
>  #include "hw/boards.h"
>  #include "hw/loader.h"
>  #include "qapi/error.h"
> @@ -744,20 +745,87 @@ void machine_set_cpu_numa_node(MachineState *machine,
>  }
>  }
>  
> +static char *cpu_topology_hierarchy(MachineState *ms)
> +{
> +MachineClass *mc = MACHINE_GET_CLASS(ms);
> +SMPCompatProps *smp_props = &mc->smp_props;
> +char topo_msg[256] = "";
> +
> +/*
> + * Topology members should be ordered from the largest to the smallest.
> + * Concept of sockets/cores/threads is supported by default and will be
> + * reported in the hierarchy. Unsupported members will not be reported.
> + */
> +g_autofree char *sockets_msg = g_strdup_printf(
> +" * sockets (%u)", ms->smp.sockets);
> +pstrcat(topo_msg, sizeof(topo_msg), sockets_msg);
> +
> +if (smp_props->dies_supported) {
> +g_autofree char *dies_msg = g_strdup_printf(
> +" * dies (%u)", ms->smp.dies);
> +pstrcat(topo_msg, sizeof(topo_msg), dies_msg);
> +}
> +
> +g_autofree char *cores_msg = g_strdup_printf(
> +" * cores (%u)", ms->smp.cores);
> +pstrcat(topo_msg, sizeof(topo_msg), cores_msg);
> +
> +g_autofree char *threads_msg = g_strdup_printf(
> +" * threads (%u)", ms->smp.threads);
> +pstrcat(topo_msg, sizeof(topo_msg), threads_msg);
> +
> +return g_strdup_printf("%s", topo_msg + 3);
> +}
> +
> +/*
> + * smp_parse - Generic function used to parse the given SMP configuration
> + *
> + * A topology parameter must be omitted or specified equal to 1 if it's
> + * not supported by the machine. Concept of sockets/cores/threads is
> + * supported by default. Unsupported members will not be reported in
> + * the cpu topology hierarchy message.
> + *
> + * For compatibility, if omitted the arch-specific members (e.g. dies)
> + * will not be computed, but will directly default to 1 instead. This
> + * logic should also apply to future introduced ones.
> + *
> + * Omitted arch-neutral members, i.e., cpus/sockets/cores/threads/maxcpus
> + * will be computed based on the provided ones. When both maxcpus and cpus
> + * are omitted, maxcpus will be computed from the given parameters and cpus
> + * will be set equal to maxcpus. When only one of maxcpus and cpus is given
> + * then the omitted one will be set to its given counterpart's value.
> + * Both maxcpus and cpus may be specified, but maxcpus must be equal to or
> + * greater than cpus.
> + *
> + * In calculation of omitted sockets/cores/threads, we prefer sockets over
> + * cores over threads before 6.2, while preferring cores over sockets over
> + * threads since 6.2.
> + */
>  static void smp_parse(MachineState *ms, SMPConfiguration *config, Error 
> **errp)
>  {
>  MachineClass *mc = MACHINE_GET_CLASS(ms);
>  unsigned cpus= config->has_cpus ? config->cpus : 0;
>  unsigned sockets = config->has_sockets ? config->sockets : 0;
> +unsigned dies= config->has_dies ? config->dies : 0;
>  unsigned cores   = config->has_cores ? config->cores : 0;
>  unsigned threads = config->has_threads ? config->threads : 0;
>  unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
>  
> -if (config->has_dies && config->dies > 1) {
> +/*
> + * A topology parameter must be omitted or specified equal to 1,
> + * if the machine's CPU topology doesn't support it.
> + */
> +if (!mc->smp_props.dies_supported && dies > 1) {
>  error_setg(errp, "dies not supported by this machine's CPU 
> topology");
>  return;
>  }
>  
> +/*
> + * If omitted the arch-specific members will not be computed,
> + * but will directly default to 1 instead.
> + */
> +dies = dies > 0 ? dies : 1;
> +
>  /* compute missing values based on the provided ones */
>  if (cpus == 0 && maxcpus == 0) {
> 

Re: [PATCH v2 2/2] tests: migration-test: Add dirty ring test

2021-07-28 Thread Peter Xu
On Wed, Jul 28, 2021 at 09:37:48AM -1000, Richard Henderson wrote:
> On 6/15/21 7:55 AM, Peter Xu wrote:
> > Add dirty ring test if kernel supports it.  Add the dirty ring parameter on
> > source should be mostly enough, but let's change the dest too to make them
> > match always.
> > 
> > Reviewed-by: Dr. David Alan Gilbert 
> > Signed-off-by: Peter Xu 
> > ---
> >   tests/qtest/migration-test.c | 58 ++--
> >   1 file changed, 55 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index d9225f58d4d..9ef6b471353 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -27,6 +27,10 @@
> >   #include "migration-helpers.h"
> >   #include "tests/migration/migration-test.h"
> > +#if defined(__linux__)
> > +#include "linux/kvm.h"
> > +#endif
> 
> This breaks the build for hosts that do not support kvm, e.g. sparc:
> 
> 
> [2/3] Compiling C object tests/qtest/migration-test.p/migration-test.c.o
> FAILED: tests/qtest/migration-test.p/migration-test.c.o
> cc -Itests/qtest/migration-test.p -Itests/qtest -I../qemu/tests/qtest -I.
> -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/glib-2.0
> -I/usr/lib/sparc64-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -pipe
> -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem
> /home/rth/qemu/qemu/linux-headers -isystem linux-headers -iquote . -iquote
> /home/rth/qemu/qemu -iquote /home/rth/qemu/qemu/include -iquote
> /home/rth/qemu/qemu/disas/libvixl -iquote /home/rth/qemu/qemu/tcg/sparc
> -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -m64 -mcpu=ultrasparc
> -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
> -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes
> -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration
> -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k
> -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs
> -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2
> -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fPIE -MD -MQ
> tests/qtest/migration-test.p/migration-test.c.o -MF
> tests/qtest/migration-test.p/migration-test.c.o.d -o
> tests/qtest/migration-test.p/migration-test.c.o -c
> ../qemu/tests/qtest/migration-test.c
> In file included from ../qemu/tests/qtest/migration-test.c:31:
> /home/rth/qemu/qemu/linux-headers/linux/kvm.h:15:10: fatal error: asm/kvm.h:
> No such file or directory
>15 | #include 
>   |  ^~~
> compilation terminated.

Hi, Richard,

Sorry for that.  It's very weird that linux/kvm.h exists for all archs yet
does not restrict the asm/kvm.h include to the 5 supported archs, so any user
app trying to include linux/kvm.h will fail to build on the rest.

(Meanwhile, all the references this test needs are KVM_CHECK_EXTENSION and
 KVM_CAP_DIRTY_LOG_RING, and both of them live in linux/kvm.h, not the asm one.)

A quick fix attached; would that work for us?

Thanks,

-- 
Peter Xu
>From 888ab46c44284738d222edc87e9fc86a49ae2f51 Mon Sep 17 00:00:00 2001
From: Peter Xu 
Date: Wed, 28 Jul 2021 16:32:00 -0400
Subject: [PATCH] tests: Fix migration-test build failure for sparc

Even if <linux/kvm.h> seems to exist for all archs on Linux, including it
with only __linux__ defined does not work yet, as it will try to include
asm/kvm.h, which can be missing for archs that do not support KVM.

To fix this (instead of attempting to fix the Linux headers..), we can mark
the header as x86_64-only, because so far it only serves the kvm dirty
ring test.

Reported-by: Richard Henderson 
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 1e8b7784ef..cc5e83d98a 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -27,7 +27,8 @@
 #include "migration-helpers.h"
 #include "tests/migration/migration-test.h"
 
-#if defined(__linux__)
+/* For dirty ring test; so far only x86_64 is supported */
+#if defined(__linux__) && defined(HOST_X86_64)
 #include "linux/kvm.h"
 #endif
 
@@ -1395,7 +1396,7 @@ static void test_multifd_tcp_cancel(void)
 
 static bool kvm_dirty_ring_supported(void)
 {
-#if defined(__linux__)
+#if defined(__linux__) && defined(HOST_X86_64)
 int ret, kvm_fd = open("/dev/kvm", O_RDONLY);
 
 if (kvm_fd < 0) {
-- 
2.31.1



Re: [PATCH for-6.2 v3 06/11] machine: Prefer cores over sockets in smp parsing since 6.2

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:43AM +0800, Yanan Wang wrote:
> In the real SMP hardware topology world, it's much more likely that
> we have high cores-per-socket counts and few sockets in total. The
> current preference of sockets over cores in smp parsing results in
> a virtual cpu topology with low cores-per-socket counts and a large
> number of sockets, which is just contrary to the real world.
> 
> Given that it is better to make the virtual cpu topology more
> reflective of the real world, and also for the sake of compatibility,
> we start to prefer cores over sockets over threads in smp parsing
> since machine type 6.2 for all arches.
> 
> In this patch, a boolean "smp_prefer_sockets" is added, and we only
> enable the old preference on older machines and enable the new one
> since type 6.2 for all arches by using the machine compat mechanism.
> 
> Acked-by: David Gibson 
> Suggested-by: Daniel P. Berrange 
> Signed-off-by: Yanan Wang 
> ---
>  hw/arm/virt.c  |  1 +
>  hw/core/machine.c  | 36 ++--
>  hw/i386/pc.c   | 36 ++--
>  hw/i386/pc_piix.c  |  1 +
>  hw/i386/pc_q35.c   |  1 +
>  hw/ppc/spapr.c |  1 +
>  hw/s390x/s390-virtio-ccw.c |  1 +
>  include/hw/boards.h|  1 +
>  qemu-options.hx|  3 ++-
>  9 files changed, 60 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 01165f7f53..7babea40dc 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2797,6 +2797,7 @@ static void virt_machine_6_1_options(MachineClass *mc)
>  {
>  virt_machine_6_2_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_6_1, hw_compat_6_1_len);
> +mc->smp_prefer_sockets = true;
>  }
>  DEFINE_VIRT_MACHINE(6, 1)
>  
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 458d9736e3..a8173a0f45 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -746,6 +746,7 @@ void machine_set_cpu_numa_node(MachineState *machine,
>  
>  static void smp_parse(MachineState *ms, SMPConfiguration *config, Error 
> **errp)
>  {
> +MachineClass *mc = MACHINE_GET_CLASS(ms);
>  unsigned cpus= config->has_cpus ? config->cpus : 0;
>  unsigned sockets = config->has_sockets ? config->sockets : 0;
>  unsigned cores   = config->has_cores ? config->cores : 0;
> @@ -757,7 +758,7 @@ static void smp_parse(MachineState *ms, SMPConfiguration 
> *config, Error **errp)
>  return;
>  }
>  
> -/* compute missing values, prefer sockets over cores over threads */
> +/* compute missing values based on the provided ones */
>  if (cpus == 0 && maxcpus == 0) {
>  sockets = sockets > 0 ? sockets : 1;
>  cores = cores > 0 ? cores : 1;
> @@ -765,15 +766,30 @@ static void smp_parse(MachineState *ms, 
> SMPConfiguration *config, Error **errp)
>  } else {
>  maxcpus = maxcpus > 0 ? maxcpus : cpus;
>  
> -if (sockets == 0) {
> -cores = cores > 0 ? cores : 1;
> -threads = threads > 0 ? threads : 1;
> -sockets = maxcpus / (cores * threads);
> -} else if (cores == 0) {
> -threads = threads > 0 ? threads : 1;
> -cores = maxcpus / (sockets * threads);
> -} else if (threads == 0) {
> -threads = maxcpus / (sockets * cores);
> +if (mc->smp_prefer_sockets) {
> +/* prefer sockets over cores over threads before 6.2 */
> +if (sockets == 0) {
> +cores = cores > 0 ? cores : 1;
> +threads = threads > 0 ? threads : 1;
> +sockets = maxcpus / (cores * threads);
> +} else if (cores == 0) {
> +threads = threads > 0 ? threads : 1;
> +cores = maxcpus / (sockets * threads);
> +} else if (threads == 0) {
> +threads = maxcpus / (sockets * cores);
> +}
> +} else {
> +/* prefer cores over sockets over threads since 6.2 */
> +if (cores == 0) {
> +sockets = sockets > 0 ? sockets : 1;
> +threads = threads > 0 ? threads : 1;
> +cores = maxcpus / (sockets * threads);
> +} else if (sockets == 0) {
> +threads = threads > 0 ? threads : 1;
> +sockets = maxcpus / (cores * threads);
> +} else if (threads == 0) {
> +threads = maxcpus / (sockets * cores);
> +}
>  }
>  }
>  
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 8c2235ac46..77ab764c5d 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -717,6 +717,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
> level)
>   */
>  static void pc_smp_parse(MachineState *ms, SMPConfiguration *config, Error 
> **errp)
>  {
> +MachineClass *mc = MACHINE_GET_CLASS(ms);
>  unsigned cpus= config->has_cpus ? config->cpus : 0;
>  

Re: [PATCH for-6.2 v3 04/11] machine: Improve the error reporting of smp parsing

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:41AM +0800, Yanan Wang wrote:
> We have two requirements for a valid SMP configuration:
> the product of "sockets * cores * threads" must represent all the
> possible cpus, i.e., max_cpus, and then must include the initially
> present cpus, i.e., smp_cpus.
> 
> So we only need to ensure 1) "sockets * cores * threads == maxcpus"
> first and then ensure 2) "maxcpus >= cpus". With a reasonable order
> of the sanity checks, we can simplify the error reporting code.
> When reporting an error we also report the exact value of each
> topology member so that users can easily see what's going on.
> 
> Signed-off-by: Yanan Wang 
> ---
>  hw/core/machine.c | 22 +-
>  hw/i386/pc.c  | 24 ++--
>  2 files changed, 19 insertions(+), 27 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 958e6e7107..e879163c3b 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -777,25 +777,21 @@ static void smp_parse(MachineState *ms, 
> SMPConfiguration *config, Error **errp)
>  maxcpus = maxcpus > 0 ? maxcpus : sockets * cores * threads;
>  cpus = cpus > 0 ? cpus : maxcpus;
>  
> -if (sockets * cores * threads < cpus) {
> -error_setg(errp, "cpu topology: "
> -   "sockets (%u) * cores (%u) * threads (%u) < "
> -   "smp_cpus (%u)",
> -   sockets, cores, threads, cpus);
> +if (sockets * cores * threads != maxcpus) {
> +error_setg(errp, "Invalid CPU topology: "
> +   "product of the hierarchy must match maxcpus: "
> +   "sockets (%u) * cores (%u) * threads (%u) "
> +   "!= maxcpus (%u)",
> +   sockets, cores, threads, maxcpus);
>  return;
>  }
>  
>  if (maxcpus < cpus) {
> -error_setg(errp, "maxcpus must be equal to or greater than smp");
> -return;
> -}
> -
> -if (sockets * cores * threads != maxcpus) {
>  error_setg(errp, "Invalid CPU topology: "
> +   "maxcpus must be equal to or greater than smp: "
> "sockets (%u) * cores (%u) * threads (%u) "
> -   "!= maxcpus (%u)",
> -   sockets, cores, threads,
> -   maxcpus);
> +   "== maxcpus (%u) < smp_cpus (%u)",
> +   sockets, cores, threads, maxcpus, cpus);
>  return;
>  }
>  
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 9ad7ae5254..3e403a7129 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -747,25 +747,21 @@ static void pc_smp_parse(MachineState *ms, 
> SMPConfiguration *config, Error **err
>  maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads;
>  cpus = cpus > 0 ? cpus : maxcpus;
>  
> -if (sockets * dies * cores * threads < cpus) {
> -error_setg(errp, "cpu topology: "
> -   "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < "
> -   "smp_cpus (%u)",
> -   sockets, dies, cores, threads, cpus);
> +if (sockets * dies * cores * threads != maxcpus) {
> +error_setg(errp, "Invalid CPU topology: "
> +   "product of the hierarchy must match maxcpus: "
> +   "sockets (%u) * dies (%u) * cores (%u) * threads (%u) "
> +   "!= maxcpus (%u)",
> +   sockets, dies, cores, threads, maxcpus);
>  return;
>  }
>  
>  if (maxcpus < cpus) {
> -error_setg(errp, "maxcpus must be equal to or greater than smp");
> -return;
> -}
> -
> -if (sockets * dies * cores * threads != maxcpus) {
> -error_setg(errp, "Invalid CPU topology deprecated: "
> +error_setg(errp, "Invalid CPU topology: "
> +   "maxcpus must be equal to or greater than smp: "
> "sockets (%u) * dies (%u) * cores (%u) * threads (%u) "
> -   "!= maxcpus (%u)",
> -   sockets, dies, cores, threads,
> -   maxcpus);
> +   "== maxcpus (%u) < smp_cpus (%u)",
> +   sockets, dies, cores, threads, maxcpus, cpus);
>  return;
>  }
>  
> -- 
> 2.19.1
>

 
Reviewed-by: Andrew Jones 




Re: [PATCH for-6.2 v3 03/11] machine: Set the value of cpus to match maxcpus if it's omitted

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:40AM +0800, Yanan Wang wrote:
> Currently we directly calculate the omitted cpus based on the given
> incomplete collection of parameters. This makes some cmdlines like:
>   -smp maxcpus=16
>   -smp sockets=2,maxcpus=16
>   -smp sockets=2,dies=2,maxcpus=16
>   -smp sockets=2,cores=4,maxcpus=16
> not work. We should probably set the value of cpus to match maxcpus
> if it's omitted, which will make the above configs start to work.
> 
> So the calculation logic of cpus/maxcpus after this patch will be:
> When both maxcpus and cpus are omitted, maxcpus will be calculated
> from the given parameters and cpus will be set equal to maxcpus.
> When only one of maxcpus and cpus is given then the omitted one
> will be set to its counterpart's value. Both maxcpus and cpus may
> be specified, but maxcpus must be equal to or greater than cpus.
> 
> Note: change in this patch won't affect any existing working cmdlines
> but allows more incomplete configs to be valid.
> 
> Signed-off-by: Yanan Wang 
> ---
>  hw/core/machine.c | 29 -
>  hw/i386/pc.c  | 29 -
>  qemu-options.hx   | 11 ---
>  3 files changed, 40 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 69979c93dd..958e6e7107 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -755,25 +755,28 @@ static void smp_parse(MachineState *ms, 
> SMPConfiguration *config, Error **errp)
>  }
>  
>  /* compute missing values, prefer sockets over cores over threads */
> -maxcpus = maxcpus > 0 ? maxcpus : cpus;
> -
> -if (cpus == 0) {
> +if (cpus == 0 && maxcpus == 0) {
>  sockets = sockets > 0 ? sockets : 1;
>  cores = cores > 0 ? cores : 1;
>  threads = threads > 0 ? threads : 1;
> -cpus = sockets * cores * threads;
> +} else {
>  maxcpus = maxcpus > 0 ? maxcpus : cpus;
> -} else if (sockets == 0) {
> -cores = cores > 0 ? cores : 1;
> -threads = threads > 0 ? threads : 1;
> -sockets = maxcpus / (cores * threads);
> -} else if (cores == 0) {
> -threads = threads > 0 ? threads : 1;
> -cores = maxcpus / (sockets * threads);
> -} else if (threads == 0) {
> -threads = maxcpus / (sockets * cores);
> +
> +if (sockets == 0) {
> +cores = cores > 0 ? cores : 1;
> +threads = threads > 0 ? threads : 1;
> +sockets = maxcpus / (cores * threads);
> +} else if (cores == 0) {
> +threads = threads > 0 ? threads : 1;
> +cores = maxcpus / (sockets * threads);
> +} else if (threads == 0) {
> +threads = maxcpus / (sockets * cores);
> +}
>  }
>  
> +maxcpus = maxcpus > 0 ? maxcpus : sockets * cores * threads;
> +cpus = cpus > 0 ? cpus : maxcpus;
> +
>  if (sockets * cores * threads < cpus) {
>  error_setg(errp, "cpu topology: "
> "sockets (%u) * cores (%u) * threads (%u) < "
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index a9ff9ef52c..9ad7ae5254 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -725,25 +725,28 @@ static void pc_smp_parse(MachineState *ms, 
> SMPConfiguration *config, Error **err
>  dies = dies > 0 ? dies : 1;
>  
>  /* compute missing values, prefer sockets over cores over threads */
> -maxcpus = maxcpus > 0 ? maxcpus : cpus;
> -
> -if (cpus == 0) {
> +if (cpus == 0 && maxcpus == 0) {
>  sockets = sockets > 0 ? sockets : 1;
>  cores = cores > 0 ? cores : 1;
>  threads = threads > 0 ? threads : 1;
> -cpus = sockets * dies * cores * threads;
> +} else {
>  maxcpus = maxcpus > 0 ? maxcpus : cpus;
> -} else if (sockets == 0) {
> -cores = cores > 0 ? cores : 1;
> -threads = threads > 0 ? threads : 1;
> -sockets = maxcpus / (dies * cores * threads);
> -} else if (cores == 0) {
> -threads = threads > 0 ? threads : 1;
> -cores = maxcpus / (sockets * dies * threads);
> -} else if (threads == 0) {
> -threads = maxcpus / (sockets * dies * cores);
> +
> +if (sockets == 0) {
> +cores = cores > 0 ? cores : 1;
> +threads = threads > 0 ? threads : 1;
> +sockets = maxcpus / (dies * cores * threads);
> +} else if (cores == 0) {
> +threads = threads > 0 ? threads : 1;
> +cores = maxcpus / (sockets * dies * threads);
> +} else if (threads == 0) {
> +threads = maxcpus / (sockets * dies * cores);
> +}
>  }
>  
> +maxcpus = maxcpus > 0 ? maxcpus : sockets * dies * cores * threads;
> +cpus = cpus > 0 ? cpus : maxcpus;
> +
>  if (sockets * dies * cores * threads < cpus) {
>  error_setg(errp, "cpu topology: "
> "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < "
> diff --git a/qemu-options.hx b/qemu-options.hx
> inde

Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager

2021-07-28 Thread Peter Xu
On Wed, Jul 28, 2021 at 09:46:09PM +0200, David Hildenbrand wrote:
> On 28.07.21 21:42, Peter Xu wrote:
> > On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:
> > > > Meanwhile, I still have no idea how much overhead the "loop" part could 
> > > > bring.
> > > > For a large virtio-mem region with frequently plugged/unplugged memory,
> > > > it seems possible to me that it could take a while..  I have no solid idea yet.
> > > 
> > > Let's do some math. Assume the worst case on a 1TiB device with a 2MiB 
> > > block
> > > size: We have 524288 blocks == bits. That's precisely a 64k bitmap in
> > > virtio-mem. In the worst case, every second bit would be clear
> > > ("discarded"). For each clear bit ("discarded"), we would have to clear 
> > > 512
> > > bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.
> > > 
> > > So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not 
> > > sure
> > > if it will really matter doing that once on every bitmap sync. I guess the
> > > bitmap syncing itself is much more expensive -- and not syncing the
> > > discarded ranges (b ) above) would make a bigger impact I guess.
> > 
> > I'm not worried about the memory size to be accessed as bitmaps; it's more
> > about the loop itself.  500K blocks/bits means the cb() worst case can be
> > called 500K/2=250k times, no matter what the hook is doing.
> > 
> > But yeah that's the worst case thing and for a 1TB chunk, I agree that can 
> > also
> > be too harsh.  It's just that if it's very easy to be done in bitmap init 
> > then
> > still worth thinking about it.
> > 
> > > 
> > > > 
> > > > The thing is I still think this extra operation during sync() can be 
> > > > ignored by
> > > > simply clear dirty log during bitmap init, then.. why not? :)
> > > 
> > > I guess clearing the dirty log (especially in KVM) might be more 
> > > expensive.
> > 
> > If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
> > fine if we send one clear ioctl to kvm, summarizing the whole bitmap to 
> > clear.
> > 
> > The other thing is imho having overhead during bitmap init is always better
> > than having that during sync(). :)
> 
> Oh, right, so you're saying, after we set the dirty bmap to all ones and
> excluded the discarded parts, setting the respective bits to 0, we simply
> issue clearing of the whole area?
> 
> For now I assumed we would have to clear per cb.

Hmm, when I replied I thought we could pass in a bitmap to ->log_clear(), but
I just remembered the memory API actually hides the bitmap interface..

Resetting the whole region works, but it'll slow down migration start; more
importantly, that'll be done with the mmu write lock held, so we will lose
most of the clear-log benefit for the initial round of migration and stall
guest page faults in the meantime...

Let's try to do that in the cb()s as you mentioned; I think that'll still be
okay, because so far the clear-log block size is much larger (1GB), so 1TB is
worst case ~1000 ioctls during bitmap init, slightly better than 250k calls
during sync(), maybe? :)

-- 
Peter Xu




Re: [PATCH for-6.2 v3 01/11] machine: Minor refactor/cleanup for the smp parsers

2021-07-28 Thread Andrew Jones
On Wed, Jul 28, 2021 at 11:48:38AM +0800, Yanan Wang wrote:
> To pave the way for the functional improvement in later patches,
> make some refactoring/cleanup of the smp parsers, including using
> local maxcpus instead of ms->smp.max_cpus in the calculation,
> defaulting dies to 0 initially like other members, and cleaning up
> the sanity check for dies.
> 
> No functional change intended.
> 
> Signed-off-by: Yanan Wang 
> ---
>  hw/core/machine.c | 19 +++
>  hw/i386/pc.c  | 23 ++-
>  2 files changed, 25 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index e1533dfc47..ffc0629854 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -747,9 +747,11 @@ static void smp_parse(MachineState *ms, SMPConfiguration 
> *config, Error **errp)
>  unsigned sockets = config->has_sockets ? config->sockets : 0;
>  unsigned cores   = config->has_cores ? config->cores : 0;
>  unsigned threads = config->has_threads ? config->threads : 0;
> +unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
>  
> -if (config->has_dies && config->dies != 0 && config->dies != 1) {
> +if (config->has_dies && config->dies > 1) {
>  error_setg(errp, "dies not supported by this machine's CPU 
> topology");
> +return;
>  }
>  
>  /* compute missing values, prefer sockets over cores over threads */
> @@ -760,8 +762,8 @@ static void smp_parse(MachineState *ms, SMPConfiguration 
> *config, Error **errp)
>  sockets = sockets > 0 ? sockets : 1;
>  cpus = cores * threads * sockets;
>  } else {
> -ms->smp.max_cpus = config->has_maxcpus ? config->maxcpus : cpus;
> -sockets = ms->smp.max_cpus / (cores * threads);
> +maxcpus = maxcpus > 0 ? maxcpus : cpus;
> +sockets = maxcpus / (sockets * cores);

Should be divided by (cores * threads) like before.

Thanks,
drew




Re: [PATCH v3 3/8] memory: Introduce memory_region_transaction_depth_{inc|dec}()

2021-07-28 Thread David Hildenbrand

On 28.07.21 20:31, Peter Xu wrote:

memory_region_transaction_{begin|commit}() could be too heavyweight when
finalizing a memory region.  E.g., we should never attempt to update the
address space topology during the finalize() of a memory region.  Provide
helpers for further use.

Signed-off-by: Peter Xu 
---
  softmmu/memory.c | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index bfedaf9c4d..725d57ec17 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1079,10 +1079,20 @@ static void address_space_update_topology(AddressSpace 
*as)
  address_space_set_flatview(as);
  }
  
+static void memory_region_transaction_depth_inc(void)

+{
+memory_region_transaction_depth++;
+}
+
+static void memory_region_transaction_depth_dec(void)
+{
+memory_region_transaction_depth--;
+}
+
  void memory_region_transaction_begin(void)
  {
  qemu_flush_coalesced_mmio_buffer();
-++memory_region_transaction_depth;
+memory_region_transaction_depth_inc();
  }
  
  void memory_region_transaction_commit(void)

@@ -1092,7 +1102,7 @@ void memory_region_transaction_commit(void)
  assert(memory_region_transaction_depth);
  assert(qemu_mutex_iothread_locked());
  
---memory_region_transaction_depth;

+memory_region_transaction_depth_dec();
  if (!memory_region_transaction_depth) {
  if (memory_region_update_pending) {
  flatviews_reset();



Reviewed-by: David Hildenbrand 

--
Thanks,

David / dhildenb




Re: About two-dimensional page translation (e.g., Intel EPT) and shadow page table in Linux QEMU/KVM

2021-07-28 Thread Sean Christopherson
On Wed, Jul 28, 2021, harry harry wrote:
> Sean, sorry for the late reply. Thanks for your careful explanations.
> 
> > For emulation of any instruction/flow that starts with a guest virtual 
> > address.
> > On Intel CPUs, that includes quite literally any "full" instruction 
> > emulation,
> > since KVM needs to translate CS:RIP to a guest physical address in order to 
> > fetch
> > the guest's code stream.  KVM can't avoid "full" emulation unless the guest 
> > is
> > heavily enlightened, e.g. to avoid string I/O, among many other things.
> 
> Do you mean the emulated MMU is needed when KVM *only* wants to
> translate GVAs to GPAs at the guest level?

Not quite, though gva_to_gpa() is the main use.  The emulated MMU is also used 
to
inject guest #PF and to load/store guest PDPTRs.

> In such cases, the hardware MMU cannot be used because the hardware MMU
> can only translate GVAs to HPAs, right?

Sort of.  The hardware MMU does translate GVA to GPA, but the GPA value is not
visible to software (unless the GPA->HPA translation faults).  That's also true
for VA to PA (and GVA to HPA).  Irrespective of virtualization, the x86 ISA
doesn't provide an instruction to retrieve the PA for a given VA.

If such an instruction did exist, and it was to be usable for a VMM to do a
GVA->GPA translation, the magic instruction would need to take all MMU params as
operands, e.g. CR0, CR3, CR4, and EFER.  When KVM is active (not the guest), the
hardware MMU is loaded with the host MMU configuration, not the guest.  In both
VMX and SVM, vCPU state is mostly ephemeral in the sense that it ceases to exist
in hardware when the vCPU exits to the host.  Some state is retained in 
hardware,
e.g. TLB and cache entries, but those are associated with select properties of
the vCPU, e.g. EPTP, CR3, etc..., not with the vCPU itself, i.e. not with the
VMCS (VMX) / VMCB (SVM).
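
A software GVA->GPA walk of the kind described here can be sketched as follows. This is a toy, hypothetical two-level, 4 KiB-page format for brevity; it is not KVM's actual walker, which handles 4/5 paging levels, permission checks, and accessed/dirty bit updates:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy sketch of a software GVA->GPA walk, the kind of work an emulated
 * MMU does for gva_to_gpa().  Hypothetical two-level, 4 KiB-page format;
 * NOT KVM's real walker. */
#define PTE_PRESENT 1ULL

typedef uint64_t (*read_gpa_fn)(uint64_t gpa);   /* guest-memory accessor */

bool gva_to_gpa(uint64_t cr3, uint64_t gva, read_gpa_fn read_gpa,
                uint64_t *gpa)
{
    uint64_t idx1 = (gva >> 21) & 511;           /* top-level index */
    uint64_t idx0 = (gva >> 12) & 511;           /* page-table index */

    uint64_t pde = read_gpa(cr3 + idx1 * 8);
    if (!(pde & PTE_PRESENT)) {
        return false;                            /* would inject guest #PF */
    }
    uint64_t pte = read_gpa((pde & ~0xfffULL) + idx0 * 8);
    if (!(pte & PTE_PRESENT)) {
        return false;
    }
    *gpa = (pte & ~0xfffULL) | (gva & 0xfff);
    return true;
}

/* Tiny fake guest "physical memory" backing the accessor above. */
static uint64_t toy_mem[0x4000 / 8];

uint64_t toy_read(uint64_t gpa) { return toy_mem[gpa / 8]; }

bool toy_setup_and_walk(uint64_t gva, uint64_t *gpa)
{
    toy_mem[0x1000 / 8] = 0x2000 | PTE_PRESENT;      /* PDE 0 -> PT @ 0x2000 */
    toy_mem[0x2000 / 8 + 3] = 0x5000 | PTE_PRESENT;  /* PTE 3 -> page 0x5000 */
    return gva_to_gpa(0x1000, gva, toy_read, gpa);   /* guest CR3 = 0x1000 */
}
```

The point of the sketch is that every step reads guest page-table entries at guest physical addresses, which is exactly what a VMM can do from software, whereas the final GVA->HPA result computed by the hardware MMU is never exposed.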



Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager

2021-07-28 Thread David Hildenbrand

On 28.07.21 21:42, Peter Xu wrote:

On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:

Meanwhile, I still have no idea how much overhead the "loop" part could bring.
For a large virtio-mem region with frequently plugged/unplugged memory,
it seems possible to me that it could take a while..  I have no solid idea yet.


Let's do some math. Assume the worst case on a 1TiB device with a 2MiB block
size: We have 524288 blocks == bits. That's precisely a 64k bitmap in
virtio-mem. In the worst case, every second bit would be clear
("discarded"). For each clear bit ("discarded"), we would have to clear 512
bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.

So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not sure
if it will really matter doing that once on every bitmap sync. I guess the
bitmap syncing itself is much more expensive -- and not syncing the
discarded ranges (b ) above) would make a bigger impact I guess.


I'm not worried about the memory size to be accessed as bitmaps; it's more
about the loop itself.  500K blocks/bits means the cb() worst case can be
called 500K/2=250k times, no matter what the hook is doing.

But yeah that's the worst case thing and for a 1TB chunk, I agree that can also
be too harsh.  It's just that if it's very easy to be done in bitmap init then
still worth thinking about it.





The thing is I still think this extra operation during sync() can be ignored by
simply clear dirty log during bitmap init, then.. why not? :)


I guess clearing the dirty log (especially in KVM) might be more expensive.


If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.

The other thing is imho having overhead during bitmap init is always better
than having that during sync(). :)


Oh, right, so you're saying, after we set the dirty bmap to all ones and 
excluded the discarded parts, setting the respective bits to 0, we 
simply issue clearing of the whole area?


For now I assumed we would have to clear per cb.


--
Thanks,

David / dhildenb




Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager

2021-07-28 Thread Peter Xu
On Wed, Jul 28, 2021 at 07:39:39PM +0200, David Hildenbrand wrote:
> > Meanwhile, I still have no idea how much overhead the "loop" part could 
> > bring.
> For a large virtio-mem region with frequently plugged/unplugged memory,
> it seems possible to me that it could take a while..  I have no solid idea yet.
> 
> Let's do some math. Assume the worst case on a 1TiB device with a 2MiB block
> size: We have 524288 blocks == bits. That's precisely a 64k bitmap in
> virtio-mem. In the worst case, every second bit would be clear
> ("discarded"). For each clear bit ("discarded"), we would have to clear 512
> bits (64 bytes) in the dirty bitmap. That's storing 32 MiB.
> 
> So scanning 64 KiB, writing 32 MiB. Certainly not perfect, but I am not sure
> if it will really matter doing that once on every bitmap sync. I guess the
> bitmap syncing itself is much more expensive -- and not syncing the
> discarded ranges (b ) above) would make a bigger impact I guess.

I'm not worried about the memory size to be accessed as bitmaps; it's more
about the loop itself.  500K blocks/bits means the cb() worst case can be
called 500K/2=250k times, no matter what the hook is doing.

But yeah that's the worst case thing and for a 1TB chunk, I agree that can also
be too harsh.  It's just that if it's very easy to be done in bitmap init then
still worth thinking about it.

> 
> > 
> > The thing is I still think this extra operation during sync() can be 
> > ignored by
> > simply clear dirty log during bitmap init, then.. why not? :)
> 
> I guess clearing the dirty log (especially in KVM) might be more expensive.

If we send one ioctl per cb that'll be expensive for sure.  I think it'll be
fine if we send one clear ioctl to kvm, summarizing the whole bitmap to clear.

The other thing is imho having overhead during bitmap init is always better
than having that during sync(). :)

> But, anyhow, we actually want b) long-term :)

Regarding the long-term plan - sorry to say, but I still keep a skeptical
view.. :)

You did mention that for 1TB of memory we only have a 32MB dirty bitmap, which
is actually not so huge.  That's why I'm not sure whether the complexity of
plan b would bring a lot of help (even before thinking about its interface).
But I could be missing something.

> 
> > 
> > Clear dirty bitmap is as simple as "reprotect the pages" functional-wise - 
> > if
> > they are unplugged memory ranges, and they shouldn't be written by the guest
> > (we still allow reads even for virtio-mem compatibility), then I don't see 
> > it
> > an issue to wr-protect it using clear dirty log when bitmap init.
> > 
> > It still makes sense to me to keep the dirty/clear bitmap in-sync, at least
> > before your plan b proposal; leaving the dirty bits set forever on unplugged
> > memory is okay but just sounds a bit weird.
> > 
> > Though my concern is only valid when virtio-mem is used, so I don't have a
> > strong opinion on it as you maintain virtio-mem. I believe you will always
> > have a better judgement than me on that. Especially, when/if Dave & Juan 
> > have
> > no problem on that. :)
> 
> I'd certainly sleep better at night if I can be 100% sure that a page not to
> be migrated will not get migrated. :)
> 
> I'll play with initial clearing and see how much of a difference it makes
> code wise. Thanks a lot for your feedback!

Thanks!

-- 
Peter Xu
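
As a cross-check, the figures traded in this thread can be written out as explicit arithmetic. The assumptions here are illustrative only: 4 KiB host pages, the 2 MiB virtio-mem block size, a 1 GiB clear-log chunk, and a 1 TiB device:

```c
#include <assert.h>

/* Cross-check of the figures in this thread.  Assumptions (illustrative
 * only): 4 KiB host pages, 2 MiB virtio-mem block size, 1 GiB clear-log
 * chunk, 1 TiB device. */
#define KiB 1024ULL
#define MiB (1024 * KiB)
#define GiB (1024 * MiB)
#define TiB (1024 * GiB)

/* Number of virtio-mem blocks (== bits in the plug bitmap). */
unsigned long long vmem_blocks(void)          { return TiB / (2 * MiB); }

/* Size of that bitmap in bytes. */
unsigned long long vmem_bitmap_bytes(void)    { return vmem_blocks() / 8; }

/* Dirty-bitmap bits (4 KiB pages) covered by one 2 MiB block. */
unsigned long long dirty_bits_per_block(void) { return (2 * MiB) / (4 * KiB); }

/* Worst case: every second block discarded -> one cb() per discarded block. */
unsigned long long worst_case_cb_calls(void)  { return vmem_blocks() / 2; }

/* Worst case clear-log ioctls if issued per 1 GiB chunk. */
unsigned long long worst_case_clear_ioctls(void) { return TiB / GiB; }
```

These reproduce the 64 KiB bitmap, the "~250k" cb() calls and the "~1000 ioctls" mentioned above (64 bytes of dirty bitmap per block follows from 512 bits per block).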




Re: [PATCH v2 2/2] tests: migration-test: Add dirty ring test

2021-07-28 Thread Richard Henderson

On 6/15/21 7:55 AM, Peter Xu wrote:

Add the dirty ring test if the kernel supports it.  Adding the dirty ring
parameter on the source should be mostly enough, but let's change the dest
too so they always match.

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Peter Xu 
---
  tests/qtest/migration-test.c | 58 ++--
  1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index d9225f58d4d..9ef6b471353 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -27,6 +27,10 @@
  #include "migration-helpers.h"
  #include "tests/migration/migration-test.h"
  
+#if defined(__linux__)

+#include "linux/kvm.h"
+#endif


This breaks the build for hosts that do not support kvm, e.g. sparc:


[2/3] Compiling C object tests/qtest/migration-test.p/migration-test.c.o
FAILED: tests/qtest/migration-test.p/migration-test.c.o
cc -Itests/qtest/migration-test.p -Itests/qtest -I../qemu/tests/qtest -I. -Iqapi -Itrace 
-Iui -Iui/shader -I/usr/include/glib-2.0 -I/usr/lib/sparc64-linux-gnu/glib-2.0/include 
-fdiagnostics-color=auto -pipe -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem 
/home/rth/qemu/qemu/linux-headers -isystem linux-headers -iquote . -iquote 
/home/rth/qemu/qemu -iquote /home/rth/qemu/qemu/include -iquote 
/home/rth/qemu/qemu/disas/libvixl -iquote /home/rth/qemu/qemu/tcg/sparc -pthread 
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -m64 -mcpu=ultrasparc -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef 
-Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv 
-Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security 
-Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels 
-Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs 
-Wno-shift-negative-value -Wno-psabi -fPIE -MD -MQ 
tests/qtest/migration-test.p/migration-test.c.o -MF 
tests/qtest/migration-test.p/migration-test.c.o.d -o 
tests/qtest/migration-test.p/migration-test.c.o -c ../qemu/tests/qtest/migration-test.c

In file included from ../qemu/tests/qtest/migration-test.c:31:
/home/rth/qemu/qemu/linux-headers/linux/kvm.h:15:10: fatal error: asm/kvm.h: No such file 
or directory

   15 | #include <asm/kvm.h>
  |  ^~~
compilation terminated.


r~



Re: [PATCH] gitlab-ci.d/custom-runners: Improve rules for the staging branch

2021-07-28 Thread Willian Rampazzo
On Wed, Jul 28, 2021 at 2:39 PM Thomas Huth  wrote:
>
> If maintainers are currently pushing to a branch called "staging"
> in their repository, they end up with some stuck jobs - unless
> they have an s390x CI runner machine available. That's ugly, we should
> make sure that the related jobs are really only started if such a
> runner is available. So let's only run these jobs if it's the
> "staging" branch of the main repository of the QEMU project (where
> we can be sure that the s390x runner is available), or if the user
> explicitly set a S390X_RUNNER_AVAILABLE variable in their CI configs
> to declare that they have such a runner available, too.
>
> Fixes: 4799c21023 ("Jobs based on custom runners: add job definitions ...")
> Signed-off-by: Thomas Huth 
> ---
>  .gitlab-ci.d/custom-runners.yml | 40 +++--
>  1 file changed, 28 insertions(+), 12 deletions(-)
>

Reviewed-by: Willian Rampazzo 




Re: About two-dimensional page translation (e.g., Intel EPT) and shadow page table in Linux QEMU/KVM

2021-07-28 Thread harry harry
Sean, sorry for the late reply. Thanks for your careful explanations.

> For emulation of any instruction/flow that starts with a guest virtual 
> address.
> On Intel CPUs, that includes quite literally any "full" instruction emulation,
> since KVM needs to translate CS:RIP to a guest physical address in order to 
> fetch
> the guest's code stream.  KVM can't avoid "full" emulation unless the guest is
> heavily enlightened, e.g. to avoid string I/O, among many other things.

Do you mean the emulated MMU is needed when KVM *only* wants to
translate GVAs to GPAs at the guest level?
In such cases, the hardware MMU cannot be used because the hardware MMU
can only translate GVAs to HPAs, right?



Re: [PATCH v2 2/8] virtio-gpu: hostmem [wip]

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/28/21 3:46 PM, Antonio Caggiano wrote:
> From: Gerd Hoffmann 
> 
> ---
>  hw/display/virtio-gpu-base.c|  4 +++
>  hw/display/virtio-gpu-pci.c | 14 +
>  hw/display/virtio-gpu.c |  1 +
>  hw/display/virtio-vga.c | 32 +++--
>  include/hw/virtio/virtio-gpu.h  |  5 
>  include/standard-headers/linux/virtio_gpu.h |  5 
>  6 files changed, 52 insertions(+), 9 deletions(-)

A WIP patch calls for an RFC series.




Re: [PATCH v3 1/8] cpus: Export queue work related fields to cpu.h

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/28/21 8:31 PM, Peter Xu wrote:
> This patch has no functional change, but prepares for moving the function
> do_run_on_cpu() into softmmu/cpus.c.  It does:
> 
>   1. Move qemu_work_item into hw/core/cpu.h.
>   2. Export queue_work_on_cpu()/qemu_work_cond.
> 
> All of them will be used by softmmu/cpus.c later.
> 
> Reviewed-by: David Hildenbrand 
> Signed-off-by: Peter Xu 
> ---
>  cpus-common.c | 11 ++-
>  include/hw/core/cpu.h | 10 +-
>  2 files changed, 11 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v2 7/8] virtio-gpu: Initialize Venus

2021-07-28 Thread Philippe Mathieu-Daudé
On 7/28/21 3:46 PM, Antonio Caggiano wrote:
> Enable VirGL unstable APIs and request Venus when initializing VirGL.
> 
> Signed-off-by: Antonio Caggiano 
> ---
>  hw/display/virtio-gpu-virgl.c | 2 +-
>  meson.build   | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)

> diff --git a/meson.build b/meson.build
> index f2e148eaf9..31b65050b7 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -483,6 +483,7 @@ if not get_option('virglrenderer').auto() or have_system
>   method: 'pkg-config',
>   required: get_option('virglrenderer'),
>   kwargs: static_kwargs)
> +  add_project_arguments('-DVIRGL_RENDERER_UNSTABLE_APIS', language : 'c')

Unstable APIs in the mainstream repository don't sound right.
What is the plan for the project to stabilize it?



