Re: [RFC PATCH] migration: reduce time of loading non-iterable vmstate

2022-12-08 Thread Chuang Xu
On 2022/12/9 12:00 AM, Peter Xu wrote:

On Thu, Dec 08, 2022 at 10:39:11PM +0800, Chuang Xu wrote:

On 2022/12/8 6:08 AM, Peter Xu wrote:

On Thu, Dec 08, 2022 at 12:07:03AM +0800, Chuang Xu wrote:

On 2022/12/6 12:28 AM, Peter Xu wrote:

Chuang,

No worry about the delay; you reply faster than I read. :)

On Mon, Dec 05, 2022 at 02:56:15PM +0800, Chuang Xu wrote:

As a start, maybe you can try to poison address_space_to_flatview() (e.g.
by checking the start_pack_mr_change flag and asserting that it is not set)
during this process, to see whether any call stack can even try to
dereference a flatview.

It's just that I haven't figured out a good way to "prove" its validity, even
though I think this is an interesting idea worth pursuing to shrink the downtime.

Thanks for your suggestions!
I used a thread-local variable to identify whether the current thread is the
migration thread (the main thread of the target qemu), and I modified the code
of qemu_coroutine_switch to make sure the thread-local variable is true only in
the process_incoming_migration_co call stack. If the target qemu detects that
start_pack_mr_change is set and address_space_to_flatview() is called from a
non-migration thread or coroutine, it will crash.
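The thread-local check described above can be sketched as follows. This is a
reconstruction from the discussion, not upstream QEMU code: the names
start_pack_mr_change, in_incoming_migration, and the check itself are
assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the thread-local sanity check described above.  The names
 * start_pack_mr_change and in_incoming_migration are assumptions
 * reconstructed from this discussion, not upstream QEMU identifiers.
 */
static __thread bool in_incoming_migration; /* set around process_incoming_migration_co */
static bool start_pack_mr_change;           /* set while commits are being delayed */

static void flatview_access_sanity_check(void)
{
    /*
     * While a delayed-commit window is open, only the migration
     * thread/coroutine may dereference a flatview.
     */
    assert(!start_pack_mr_change || in_incoming_migration);
}
```

A violation (a non-migration thread calling this while start_pack_mr_change is
set) aborts, which is exactly the crash behavior described.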

Are you using the thread-local var just to avoid the assert triggering in
the migration thread when committing memory changes?

I think _maybe_ another cleaner way to sanity check this is directly upon
the depth:

static inline FlatView *address_space_to_flatview(AddressSpace *as)
{
    /*
     * Before using any flatview, sanity check we're not during a memory
     * region transaction or the map can be invalid.  Note that this can
     * also be called during commit phase of memory transaction, but that
     * should also only happen when the depth decreases to 0 first.
     */
    assert(memory_region_transaction_depth == 0);
    return qatomic_rcu_read(&as->current_map);
}

That should also cover the safe cases of memory transaction commits during
migration.
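The depth-based idea reduces to a counter that transaction begin/commit
adjust and that every flatview lookup asserts to be zero. The toy below
mirrors memory_region_transaction_depth in spirit only; it is not the real
QEMU implementation.

```c
#include <assert.h>

/*
 * Toy model of the depth-based check above: begin/commit adjust a
 * counter, and any flatview lookup asserts no transaction is open.
 * Illustrative only; not the real QEMU memory API.
 */
static int transaction_depth;

static void transaction_begin(void)  { transaction_depth++; }
static void transaction_commit(void) { transaction_depth--; }

static void flatview_lookup(void)
{
    /* Before using any flatview, make sure no transaction is open. */
    assert(transaction_depth == 0);
}
```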


Peter, I tried this way and found that the target qemu will crash.

Here is the gdb backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x7ff2929d851a in __GI_abort () at abort.c:118
#2  0x7ff2929cfe67 in __assert_fail_base (fmt=,
assertion=assertion@entry=0x55a32578cdc0
"memory_region_transaction_depth == 0", file=file@entry=0x55a32575d9b0
"/data00/migration/qemu-5.2.0/include/exec/memory.h",
 line=line@entry=766, function=function@entry=0x55a32578d6e0
<__PRETTY_FUNCTION__.20463> "address_space_to_flatview") at
assert.c:92
#3  0x7ff2929cff12 in __GI___assert_fail
(assertion=assertion@entry=0x55a32578cdc0
"memory_region_transaction_depth == 0", file=file@entry=0x55a32575d9b0
"/data00/migration/qemu-5.2.0/include/exec/memory.h",
line=line@entry=766,
 function=function@entry=0x55a32578d6e0
<__PRETTY_FUNCTION__.20463> "address_space_to_flatview") at
assert.c:101
#4  0x55a324b2ed5e in address_space_to_flatview (as=0x55a326132580
) at
/data00/migration/qemu-5.2.0/include/exec/memory.h:766
#5  0x55a324e79559 in address_space_to_flatview (as=0x55a326132580
) at ../softmmu/memory.c:811
#6  address_space_get_flatview (as=0x55a326132580
) at ../softmmu/memory.c:805
#7  0x55a324e96474 in address_space_cache_init
(cache=cache@entry=0x55a32a4fb000, as=,
addr=addr@entry=68404985856, len=len@entry=4096, is_write=false) at
../softmmu/physmem.c:3307
#8  0x55a324ea9cba in virtio_init_region_cache
(vdev=0x55a32985d9a0, n=0) at ../hw/virtio/virtio.c:185
#9  0x55a324eaa615 in virtio_load (vdev=0x55a32985d9a0,
f=, version_id=) at
../hw/virtio/virtio.c:3203
#10 0x55a324c6ab96 in vmstate_load_state
(f=f@entry=0x55a329dc0c00, vmsd=0x55a325fc1a60 ,
opaque=0x55a32985d9a0, version_id=1) at ../migration/vmstate.c:143
#11 0x55a324cda138 in vmstate_load (f=0x55a329dc0c00,
se=0x55a329941c90) at ../migration/savevm.c:913
#12 0x55a324cdda34 in qemu_loadvm_section_start_full
(mis=0x55a3284ef9e0, f=0x55a329dc0c00) at ../migration/savevm.c:2741
#13 qemu_loadvm_state_main (f=f@entry=0x55a329dc0c00,
mis=mis@entry=0x55a3284ef9e0) at ../migration/savevm.c:2939
#14 0x55a324cdf66a in qemu_loadvm_state (f=0x55a329dc0c00) at
../migration/savevm.c:3021
#15 0x55a324d14b4e in process_incoming_migration_co
(opaque=) at ../migration/migration.c:574
#16 0x55a32501ae3b in coroutine_trampoline (i0=,
i1=) at ../util/coroutine-ucontext.c:173
#17 0x7ff2929e8000 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#18 0x7ffed80dc2a0 in ?? ()
#19 0x in ?? ()

address_space_cache_init() is the only caller of address_space_to_flatview()
I can find in the vmstate_load call stack so far. Although I think the mr used
by address_space_cache_init() won't be affected by the delay of
memory_region_transaction_commit(), we really need a mechanism to prevent
modified mrs from being used.

Maybe we can build a stale list:
If a subregion is added, add its parent to the stale list (considering that

Re: [PATCH v3 7/8] accel/tcg: Move PageDesc tree into tb-maint.c for system

2022-12-08 Thread Philippe Mathieu-Daudé

On 9/12/22 06:19, Richard Henderson wrote:

Now that PageDesc is not used for user-only, and for system
it is only used for tb maintenance, move the implementation
into tb-maint.c, appropriately ifdefed.

We have not yet eliminated all references to PageDesc for
user-only, so retain a typedef to the structure without definition.

Signed-off-by: Richard Henderson 
---
  accel/tcg/internal.h  |  49 +++---
  accel/tcg/tb-maint.c  | 130 --
  accel/tcg/translate-all.c |  95 
  3 files changed, 134 insertions(+), 140 deletions(-)




-/*
- * In system mode we want L1_MAP to be based on ram offsets,
- * while in user mode we want it to be based on virtual addresses.
- *
- * TODO: For user mode, see the caveat re host vs guest virtual
- * address spaces near GUEST_ADDR_MAX.
- */
-#if !defined(CONFIG_USER_ONLY)
-#if HOST_LONG_BITS < TARGET_PHYS_ADDR_SPACE_BITS
-# define L1_MAP_ADDR_SPACE_BITS  HOST_LONG_BITS
-#else
-# define L1_MAP_ADDR_SPACE_BITS  TARGET_PHYS_ADDR_SPACE_BITS
-#endif
-#else
-# define L1_MAP_ADDR_SPACE_BITS  MIN(HOST_LONG_BITS, TARGET_ABI_BITS)
-#endif




diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 20e86c813d..9b996bbeb2 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -127,6 +127,121 @@ static PageForEachNext foreach_tb_next(PageForEachNext tb,
  }
  
  #else

+/*
+ * In system mode we want L1_MAP to be based on ram offsets.
+ */
+#if HOST_LONG_BITS < TARGET_PHYS_ADDR_SPACE_BITS
+# define L1_MAP_ADDR_SPACE_BITS  HOST_LONG_BITS
+#else
+# define L1_MAP_ADDR_SPACE_BITS  TARGET_PHYS_ADDR_SPACE_BITS
+#endif

So you removed L1_MAP_ADDR_SPACE_BITS in this patch. If you ever respin,
I'd rather have it cleaned up in the previous patch, along with the
comment updated and the TODO removed.



Re: [PATCH v3 6/8] accel/tcg: Use interval tree for user-only page tracking

2022-12-08 Thread Philippe Mathieu-Daudé

On 9/12/22 06:19, Richard Henderson wrote:

Finish weaning user-only away from PageDesc.

Using an interval tree to track page permissions means that
we can represent very large regions efficiently.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/290
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/967
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1214
Signed-off-by: Richard Henderson 
---
  accel/tcg/internal.h   |   4 +-
  accel/tcg/tb-maint.c   |  20 +-
  accel/tcg/user-exec.c  | 615 ++---
  tests/tcg/multiarch/test-vma.c |  22 ++
  4 files changed, 451 insertions(+), 210 deletions(-)
  create mode 100644 tests/tcg/multiarch/test-vma.c




  int page_check_range(target_ulong start, target_ulong len, int flags)
  {
-PageDesc *p;
-target_ulong end;
-target_ulong addr;
-
-/*
- * This function should never be called with addresses outside the
- * guest address space.  If this assert fires, it probably indicates
- * a missing call to h2g_valid.
- */
-if (TARGET_ABI_BITS > L1_MAP_ADDR_SPACE_BITS) {
-assert(start < ((target_ulong)1 << L1_MAP_ADDR_SPACE_BITS));
-}


This removes the use of L1_MAP_ADDR_SPACE_BITS in user-only, maybe
remove the definition from "accel/tcg/internal.h"?



Re: [PATCH v3 5/8] accel/tcg: Move page_{get,set}_flags to user-exec.c

2022-12-08 Thread Philippe Mathieu-Daudé

On 9/12/22 06:19, Richard Henderson wrote:

This page tracking implementation is specific to user-only,
since the system softmmu version is in cputlb.c.  Move it
out of translate-all.c to user-exec.c.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
  accel/tcg/internal.h  |  17 ++
  accel/tcg/translate-all.c | 350 --
  accel/tcg/user-exec.c | 346 +
  3 files changed, 363 insertions(+), 350 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 7/8] target/i386/intel-pt: Define specific PT feature set for IceLake-server and Snowridge

2022-12-08 Thread Chenyi Qiang



On 12/8/2022 2:25 PM, Xiaoyao Li wrote:
> For IceLake-server, it's just the same as using the default PT
> feature set, since the default one is taken verbatim from ICX.
> 
> For Snowridge, define it according to real SNR silicon capabilities.
> 
> Signed-off-by: Xiaoyao Li 
> ---
>  target/i386/cpu.c | 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 24f3c7b06698..ef574c819671 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3458,6 +3458,14 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>  .features[FEAT_6_EAX] =
>  CPUID_6_EAX_ARAT,
>  /* Missing: Mode-based execute control (XS/XU), processor tracing, 
> TSC scaling */
> +.features[FEAT_14_0_EBX] =
> +CPUID_14_0_EBX_CR3_FILTER | CPUID_14_0_EBX_PSB |
> +CPUID_14_0_EBX_IP_FILTER | CPUID_14_0_EBX_MTC,
> +.features[FEAT_14_0_ECX] =
> +CPUID_14_0_ECX_TOPA | CPUID_14_0_ECX_MULTI_ENTRIES |
> +CPUID_14_0_ECX_SINGLE_RANGE,
> +.features[FEAT_14_1_EAX] = 0x249 << 16 | 0x2,
> +.features[FEAT_14_1_EBX] = 0x003f << 16 | 0x1fff,
>  .features[FEAT_VMX_BASIC] = MSR_VMX_BASIC_INS_OUTS |
>   MSR_VMX_BASIC_TRUE_CTLS,
>  .features[FEAT_VMX_ENTRY_CTLS] = VMX_VM_ENTRY_IA32E_MODE |
> @@ -3735,6 +3743,16 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>  CPUID_XSAVE_XGETBV1,
>  .features[FEAT_6_EAX] =
>  CPUID_6_EAX_ARAT,
> +.features[FEAT_14_0_EBX] =
> +CPUID_14_0_EBX_CR3_FILTER | CPUID_14_0_EBX_PSB |
> +CPUID_14_0_EBX_IP_FILTER | CPUID_14_0_EBX_MTC |
> +CPUID_14_0_EBX_PTWRITE | CPUID_14_0_EBX_POWER_EVENT |
> +CPUID_14_0_EBX_PSB_PMI_PRESERVATION,
> +.features[FEAT_14_0_ECX] =
> +CPUID_14_0_ECX_TOPA | CPUID_14_0_ECX_MULTI_ENTRIES |
> +CPUID_14_0_ECX_SINGLE_RANGE | CPUID_14_0_ECX_LIP,
> +.features[FEAT_14_1_EAX] = 0x249 << 16 | 0x2,
> +.features[FEAT_14_1_EBX] = 0x003f << 16 | 0x,
>  .features[FEAT_VMX_BASIC] = MSR_VMX_BASIC_INS_OUTS |
>   MSR_VMX_BASIC_TRUE_CTLS,
>  .features[FEAT_VMX_ENTRY_CTLS] = VMX_VM_ENTRY_IA32E_MODE |

Is it acceptable to add the whole feature words in the default version
of the CPU model, or do they need to be put in a versioned one (e.g.
Snowridge-v5)?




Re: [PATCH v3 2/8] accel/tcg: Rename page_flush_tb

2022-12-08 Thread Philippe Mathieu-Daudé

On 9/12/22 06:19, Richard Henderson wrote:

Rename to tb_remove_all, to remove the PageDesc "page" from the name,
and to avoid suggesting a "flush" in the icache sense.

Signed-off-by: Richard Henderson 
---
  accel/tcg/tb-maint.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: Target-dependent include path, why?

2022-12-08 Thread Philippe Mathieu-Daudé

On 9/12/22 06:24, Richard Henderson wrote:

On 12/8/22 23:12, Markus Armbruster wrote:

I stumbled over this:

 ../include/ui/qemu-pixman.h:12:10: fatal error: pixman.h: No such file or directory

    12 | #include <pixman.h>
       |          ^~~~~~~~~~

Works when included into target-dependent code.

Running make -V=1 shows we're passing a number of -I only when compiling
target-dependent code, i.e. together with -DNEED_CPU_H:

 -I/usr/include/pixman-1 -I/usr/include/capstone 
-I/usr/include/spice-server -I/usr/include/spice-1


 -I/usr/include/cacard -I/usr/include/nss3 -I/usr/include/nspr4 
-I/usr/include/PCSC


 -isystem../linux-headers -isystemlinux-headers

Why?


Because of where [pixman] is added as a dependency in meson.build.
If you want to use it somewhere new, you've got to move the dependency.


Code involving virtio becomes target-specific.




Re: [PATCH v3 6/8] target/i386/intel-pt: Enable host pass through of Intel PT

2022-12-08 Thread Chenyi Qiang



On 12/8/2022 2:25 PM, Xiaoyao Li wrote:
> commit e37a5c7fa459 ("i386: Add Intel Processor Trace feature support")
> added the support of Intel PT by making CPUID[14] of PT as fixed feature
> set (from ICX) for any CPU model on any host. This truly breaks the PT
> exposure on the Intel SPR platform because SPR has a smaller supported
> bitmap of CPUID(0x14,1):EBX[15:0] than ICX.
> 
> To fix the problem, enable pass-through of the host's PT capabilities for
> the "-cpu host/max" cases, so that they won't use the default fixed PT
> feature set of ICX but expand automatically based on get_supported_cpuid
> as reported by the host. Meanwhile, it needs to ensure that named CPU
> models still have the fixed PT feature set, so as not to break the live
> migration case of "-cpu named_cpu_model,+intel-pt".
> 
> Introduces env->use_default_intel_pt flag.
>  - True means it's old CPU model that uses fixed PT feature set of ICX.
>  - False means the named CPU model has its own PT feature set.
> 
> Besides, old CPU models keep the same behavior: they validate the PT
> feature set against the default fixed ICX PT feature set, in addition to
> validating against the host's capabilities (via get_supported_cpuid), in
> x86_cpu_filter_features().
> 
> In the future, new named CPU model, e.g., Sapphire Rapids, can define
> its own PT feature set by setting @has_specific_intel_pt_feature_set to


It seems @has_specific_intel_pt_feature_set is not introduced in this
series, so there's no need to mention the specific flag name here.

> true and defining its own FEAT_14_0_EBX, FEAT_14_0_ECX, FEAT_14_1_EAX
> and FEAT_14_1_EBX.
> 
> Signed-off-by: Xiaoyao Li 
> ---



Re: [PATCH v3 4/8] target/i386/intel-pt: print special message for INTEL_PT_ADDR_RANGES_NUM

2022-12-08 Thread Chenyi Qiang



On 12/8/2022 2:25 PM, Xiaoyao Li wrote:
> Bits[2:0] of CPUID.14H_01H:EAX stand as a whole for the number of Intel
> PT address ranges. For an unsupported value that exceeds what KVM
> reports, report it as a whole in mark_unavailable_features() as well.
> 

Maybe this patch can be put before 3/8.

> Signed-off-by: Xiaoyao Li 
> ---
>  target/i386/cpu.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 65c6f8ae771a..4d7beccc0af7 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4387,7 +4387,14 @@ static void mark_unavailable_features(X86CPU *cpu, 
> FeatureWord w, uint64_t mask,
>  return;
>  }
>  
> -for (i = 0; i < 64; ++i) {
> +if ((w == FEAT_14_1_EAX) && (mask & INTEL_PT_ADDR_RANGES_NUM_MASK)) {
> +warn_report("%s: CPUID.14H_01H:EAX [bit 2:0]", verbose_prefix);
> +i = 3;
> +} else {
> +i = 0;
> +}
> +
> +for (; i < 64; ++i) {
>  if ((1ULL << i) & mask) {
>  g_autofree char *feat_word_str = feature_word_description(f, i);
>  warn_report("%s: %s%s%s [bit %d]",



Re: [PATCH v10 5/9] KVM: Use gfn instead of hva for mmu_notifier_retry

2022-12-08 Thread Chao Peng
On Tue, Dec 06, 2022 at 03:48:50PM +, Fuad Tabba wrote:
...
 > >
> > > >  */
> > > > -   if (unlikely(kvm->mmu_invalidate_in_progress) &&
> > > > -   hva >= kvm->mmu_invalidate_range_start &&
> > > > -   hva < kvm->mmu_invalidate_range_end)
> > > > -   return 1;
> > > > +   if (unlikely(kvm->mmu_invalidate_in_progress)) {
> > > > +   /*
> > > > +* Dropping mmu_lock after bumping 
> > > > mmu_invalidate_in_progress
> > > > +* but before updating the range is a KVM bug.
> > > > +*/
> > > > +   if (WARN_ON_ONCE(kvm->mmu_invalidate_range_start == 
> > > > INVALID_GPA ||
> > > > +kvm->mmu_invalidate_range_end == 
> > > > INVALID_GPA))
> > >
> > > INVALID_GPA is an x86-specific define in
> > > arch/x86/include/asm/kvm_host.h, so this doesn't build on other
> > > architectures. The obvious fix is to move it to
> > > include/linux/kvm_host.h.
> >
> > Hmm, INVALID_GPA is defined as zero for x86; I'm not 100% confident this
> > is the correct choice for other architectures, but after searching, it has
> > not been used by other architectures, so it should be safe to make it common.

As Yu posted a patch:
https://lore.kernel.org/all/20221209023622.274715-1-yu.c.zh...@linux.intel.com/

There is a GPA_INVALID in include/linux/kvm_types.h, and I see ARM has
already been using it, so it sounds like that is exactly what I need.

Chao
> 
> With this fixed,
> 
> Reviewed-by: Fuad Tabba 
> And the necessary work to port to arm64 (on qemu/arm64):
> Tested-by: Fuad Tabba 
> 
> Cheers,
> /fuad
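The gfn-based retry check being discussed can be sketched as below. The
field names follow the quoted patch, but the struct is reduced to what the
check needs, and the INVALID_GPA/GPA_INVALID sanity warning is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/*
 * Reduced sketch of the gfn-based mmu_notifier_retry check from the
 * quoted patch; not the real KVM structure.
 */
struct kvm_toy {
    long  mmu_invalidate_in_progress;
    gfn_t mmu_invalidate_range_start;
    gfn_t mmu_invalidate_range_end;
};

static bool mmu_invalidate_retry_gfn(struct kvm_toy *kvm, gfn_t gfn)
{
    /* Retry only while an invalidation covering this gfn is in flight. */
    return kvm->mmu_invalidate_in_progress &&
           gfn >= kvm->mmu_invalidate_range_start &&
           gfn < kvm->mmu_invalidate_range_end;
}
```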



[PATCH v2 1/2] target/ppc: Implement the DEXCR and HDEXCR

2022-12-08 Thread Nicholas Miehlbradt
Define the DEXCR and HDEXCR as special purpose registers.

Each register occupies two SPR indices, one which can be read in an
unprivileged state and one which can be modified in the appropriate
privileged state; however, both indices refer to the same underlying
value.

Note that the ISA uses the abbreviation UDEXCR in two different
contexts: the userspace DEXCR, the SPR index which can be read from
userspace (implemented in this patch), and the ultravisor DEXCR, the
equivalent register for the ultravisor state (not implemented).
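The aliasing can be modeled as below: both SPR indices name a single
backing store, and the problem-state read returns only the low 32 bits
(per the v2 note). The index numbers come from the patch; the storage
model is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of the SPR aliasing described above.  Index numbers match
 * the patch; the single-variable backing store is illustrative.
 */
#define TOY_SPR_UDEXCR 0x32C
#define TOY_SPR_DEXCR  0x33C

static uint64_t dexcr_value; /* shared by both indices */

static uint64_t toy_spr_read(int sprn)
{
    if (sprn == TOY_SPR_UDEXCR) {
        /* Problem-state read: upper 32 bits are cleared. */
        return (uint32_t)dexcr_value;
    }
    return dexcr_value;
}
```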

Signed-off-by: Nicholas Miehlbradt 
---
v2: Clearing of the upper 32 bits of the DEXCR is now performed on read
from problem state rather than on write in privileged state.
---
 target/ppc/cpu.h| 19 +++
 target/ppc/cpu_init.c   | 25 +
 target/ppc/spr_common.h |  1 +
 target/ppc/translate.c  | 19 +++
 4 files changed, 64 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 81d4263a07..0ed9f2ae35 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1068,6 +1068,21 @@ struct ppc_radix_page_info {
 uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
 };
 
+/*****************************************************************************/
+/* Dynamic Execution Control Register */
+
+#define DEXCR_ASPECT(name, num)\
+FIELD(DEXCR, PNH_##name, PPC_BIT_NR(num), 1)   \
+FIELD(DEXCR, PRO_##name, PPC_BIT_NR(num + 32), 1)  \
+FIELD(HDEXCR, HNU_##name, PPC_BIT_NR(num), 1)  \
+FIELD(HDEXCR, ENF_##name, PPC_BIT_NR(num + 32), 1) \
+
+DEXCR_ASPECT(SBHE, 0)
+DEXCR_ASPECT(IDRTPB, 1)
+DEXCR_ASPECT(SRAPD, 4)
+DEXCR_ASPECT(NPHIE, 5)
+DEXCR_ASPECT(PHIE, 6)
+
 /*****************************************************************************/
 /* The whole PowerPC CPU context */
 
@@ -1674,9 +1689,11 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_BOOKE_GIVOR13 (0x1BC)
 #define SPR_BOOKE_GIVOR14 (0x1BD)
 #define SPR_TIR   (0x1BE)
+#define SPR_UHDEXCR   (0x1C7)
 #define SPR_PTCR  (0x1D0)
 #define SPR_HASHKEYR  (0x1D4)
 #define SPR_HASHPKEYR (0x1D5)
+#define SPR_HDEXCR(0x1D7)
 #define SPR_BOOKE_SPEFSCR (0x200)
 #define SPR_Exxx_BBEAR(0x201)
 #define SPR_Exxx_BBTAR(0x202)
@@ -1865,8 +1882,10 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_RCPU_L2U_RA2  (0x32A)
 #define SPR_MPC_MD_DBRAM1 (0x32A)
 #define SPR_RCPU_L2U_RA3  (0x32B)
+#define SPR_UDEXCR(0x32C)
 #define SPR_TAR   (0x32F)
 #define SPR_ASDR  (0x330)
+#define SPR_DEXCR (0x33C)
 #define SPR_IC(0x350)
 #define SPR_VTB   (0x351)
 #define SPR_MMCRC (0x353)
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index cbf0081374..6433f4fdfd 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -5727,6 +5727,30 @@ static void register_power10_hash_sprs(CPUPPCState *env)
 hashpkeyr_initial_value);
 }
 
+static void register_power10_dexcr_sprs(CPUPPCState *env)
+{
+spr_register(env, SPR_DEXCR, "DEXCR",
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_generic,
+0);
+
+spr_register(env, SPR_UDEXCR, "DEXCR",
+&spr_read_dexcr_ureg, SPR_NOACCESS,
+&spr_read_dexcr_ureg, SPR_NOACCESS,
+0);
+
+spr_register_hv(env, SPR_HDEXCR, "HDEXCR",
+SPR_NOACCESS, SPR_NOACCESS,
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_generic,
+0);
+
+spr_register(env, SPR_UHDEXCR, "HDEXCR",
+&spr_read_dexcr_ureg, SPR_NOACCESS,
+&spr_read_dexcr_ureg, SPR_NOACCESS,
+0);
+}
+
 /*
  * Initialize PMU counter overflow timers for Power8 and
  * newer Power chips when using TCG.
@@ -6402,6 +6426,7 @@ static void init_proc_POWER10(CPUPPCState *env)
 register_power8_rpr_sprs(env);
 register_power9_mmu_sprs(env);
 register_power10_hash_sprs(env);
+register_power10_dexcr_sprs(env);
 
 /* FIXME: Filter fields properly based on privilege level */
 spr_register_kvm_hv(env, SPR_PSSCR, "PSSCR", NULL, NULL, NULL, NULL,
diff --git a/target/ppc/spr_common.h b/target/ppc/spr_common.h
index b5a5bc6895..91a74cec0f 100644
--- a/target/ppc/spr_common.h
+++ b/target/ppc/spr_common.h
@@ -195,6 +195,7 @@ void spr_read_ebb_upper32(DisasContext *ctx, int gprn, int 
sprn);
 void spr_write_ebb_upper32(DisasContext *ctx, int sprn, int gprn);
 void spr_write_hmer(DisasContext *ctx, int sprn, int gprn);
 void spr_write_lpcr(DisasContext *ctx, int sprn, int gprn);
+void spr_read_dexcr_ureg(DisasContext *ctx, int sprn, int gprn);
 #endif
 
 void register_low_BATs(CPUPPCState *env);
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 19c1d17cb0..fcb1180712 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -1249,6 +1249

[PATCH v2 0/2] target/ppc: Implement Dynamic Execution Control Registers

2022-12-08 Thread Nicholas Miehlbradt
Implements the Dynamic Execution Control Register (DEXCR) and the
Hypervisor Dynamic Execution Control Register (HDEXCR) in TCG as
defined in Power ISA 3.1B. Only aspects 5 (Non-privileged hash instruction
enable) and 6 (Privileged hash instruction enable) have architectural
effects. Other aspects can be manipulated but have no effect on execution.

Adds checks to these registers in the hashst and hashchk instructions so
that they are executed as nops when not enabled.

There is currently an RFC for the kernel interface for the DEXCR on the 
Linux PPC mailing list:
https://lore.kernel.org/linuxppc-dev/20221128024458.46121-1-bg...@linux.ibm.com/

Nicholas Miehlbradt (2):
  target/ppc: Implement the DEXCR and HDEXCR
  target/ppc: Check DEXCR on hash{st, chk} instructions

 target/ppc/cpu.h | 19 +
 target/ppc/cpu_init.c| 25 +
 target/ppc/excp_helper.c | 58 +---
 target/ppc/spr_common.h  |  1 +
 target/ppc/translate.c   | 19 +
 5 files changed, 107 insertions(+), 15 deletions(-)

-- 
2.34.1




[PATCH v2 2/2] target/ppc: Check DEXCR on hash{st, chk} instructions

2022-12-08 Thread Nicholas Miehlbradt
Adds checks to the hashst and hashchk instructions to only execute if
enabled by the relevant aspect in the DEXCR and HDEXCR.

This behaviour is guarded behind TARGET_PPC64 since Power10 is
currently the only implementation which has the DEXCR.
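The privilege-dependent gating in the patch can be reduced to a small
predicate, sketched below. The MSR decoding is collapsed into an enum for
illustration; the aspect-bit names (PRO/PNH/ENF/HNU) follow the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy predicate for the gating described above: in problem state the
 * DEXCR PRO (or HDEXCR ENF) bit must be set, in privileged
 * non-hypervisor state the PNH (or ENF) bit, and in hypervisor
 * non-secure state the HNU bit.  Sketch only; not the real helper.
 */
enum priv_state { PROBLEM, PRIVILEGED, HYPERVISOR };

static bool hash_insn_enabled(enum priv_state s, bool pro, bool pnh,
                              bool enf, bool hnu)
{
    switch (s) {
    case PROBLEM:    return pro || enf;
    case PRIVILEGED: return pnh || enf;
    default:         return hnu;
    }
}
```

When the predicate is false, the instruction executes as a nop, matching
the behavior the commit message describes.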

Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Nicholas Miehlbradt 
---
 target/ppc/excp_helper.c | 58 +---
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 94adcb766b..add4d54ae7 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -2902,29 +2902,57 @@ static uint64_t hash_digest(uint64_t ra, uint64_t rb, 
uint64_t key)
 return stage1_h ^ stage1_l;
 }
 
+static void do_hash(CPUPPCState *env, target_ulong ea, target_ulong ra,
+target_ulong rb, uint64_t key, bool store)
+{
+uint64_t calculated_hash = hash_digest(ra, rb, key), loaded_hash;
+
+if (store) {
+cpu_stq_data_ra(env, ea, calculated_hash, GETPC());
+} else {
+loaded_hash = cpu_ldq_data_ra(env, ea, GETPC());
+if (loaded_hash != calculated_hash) {
+raise_exception_err_ra(env, POWERPC_EXCP_PROGRAM,
+POWERPC_EXCP_TRAP, GETPC());
+}
+}
+}
+
 #include "qemu/guest-random.h"
 
-#define HELPER_HASH(op, key, store)   \
+#ifdef TARGET_PPC64
+#define HELPER_HASH(op, key, store, dexcr_aspect) \
 void helper_##op(CPUPPCState *env, target_ulong ea, target_ulong ra,  \
  target_ulong rb) \
 { \
-uint64_t calculated_hash = hash_digest(ra, rb, key), loaded_hash; \
-  \
-if (store) {  \
-cpu_stq_data_ra(env, ea, calculated_hash, GETPC());   \
-} else {  \
-loaded_hash = cpu_ldq_data_ra(env, ea, GETPC());  \
-if (loaded_hash != calculated_hash) { \
-raise_exception_err_ra(env, POWERPC_EXCP_PROGRAM, \
-POWERPC_EXCP_TRAP, GETPC());  \
-} \
+if (env->msr & R_MSR_PR_MASK) {   \
+if (!(env->spr[SPR_DEXCR] & R_DEXCR_PRO_##dexcr_aspect##_MASK ||  \
+env->spr[SPR_HDEXCR] & R_HDEXCR_ENF_##dexcr_aspect##_MASK))   \
+return;   \
+} else if (!(env->msr & R_MSR_HV_MASK)) { \
+if (!(env->spr[SPR_DEXCR] & R_DEXCR_PNH_##dexcr_aspect##_MASK ||  \
+env->spr[SPR_HDEXCR] & R_HDEXCR_ENF_##dexcr_aspect##_MASK))   \
+return;   \
+} else if (!(env->msr & R_MSR_S_MASK)) {  \
+if (!(env->spr[SPR_HDEXCR] & R_HDEXCR_HNU_##dexcr_aspect##_MASK)) \
+return;   \
 } \
+  \
+do_hash(env, ea, ra, rb, key, store); \
+}
+#else
+#define HELPER_HASH(op, key, store, dexcr_aspect) \
+void helper_##op(CPUPPCState *env, target_ulong ea, target_ulong ra,  \
+ target_ulong rb) \
+{ \
+do_hash(env, ea, ra, rb, key, store); \
 }
+#endif /* TARGET_PPC64 */
 
-HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true)
-HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false)
-HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true)
-HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false)
+HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true, NPHIE)
+HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false, NPHIE)
+HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true, PHIE)
+HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false, PHIE)
 #endif /* CONFIG_TCG */
 
 #if !defined(CONFIG_USER_ONLY)
-- 
2.34.1




Re: [PATCH v10 8/9] KVM: Handle page fault for private memory

2022-12-08 Thread Yuan Yao
On Thu, Dec 08, 2022 at 07:23:46PM +0800, Chao Peng wrote:
> On Thu, Dec 08, 2022 at 10:29:18AM +0800, Yuan Yao wrote:
> > On Fri, Dec 02, 2022 at 02:13:46PM +0800, Chao Peng wrote:
> > > A KVM_MEM_PRIVATE memslot can include both fd-based private memory and
> > > hva-based shared memory. Architecture code (like TDX code) can tell
> > > whether the on-going fault is private or not. This patch adds a
> > > 'is_private' field to kvm_page_fault to indicate this and architecture
> > > code is expected to set it.
> > >
> > > To handle page fault for such memslot, the handling logic is different
> > > depending on whether the fault is private or shared. KVM checks if
> > > 'is_private' matches the host's view of the page (maintained in
> > > mem_attr_array).
> > >   - For a successful match, private pfn is obtained with
> > > restrictedmem_get_page() and shared pfn is obtained with existing
> > > get_user_pages().
> > >   - For a failed match, KVM causes a KVM_EXIT_MEMORY_FAULT exit to
> > > userspace. Userspace then can convert memory between private/shared
> > > in host's view and retry the fault.
> > >
> > > Co-developed-by: Yu Zhang 
> > > Signed-off-by: Yu Zhang 
> > > Signed-off-by: Chao Peng 
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c  | 63 +++--
> > >  arch/x86/kvm/mmu/mmu_internal.h | 14 +++-
> > >  arch/x86/kvm/mmu/mmutrace.h |  1 +
> > >  arch/x86/kvm/mmu/tdp_mmu.c  |  2 +-
> > >  include/linux/kvm_host.h| 30 
> > >  5 files changed, 105 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 2190fd8c95c0..b1953ebc012e 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -3058,7 +3058,7 @@ static int host_pfn_mapping_level(struct kvm *kvm, 
> > > gfn_t gfn,
> > >
> > >  int kvm_mmu_max_mapping_level(struct kvm *kvm,
> > > const struct kvm_memory_slot *slot, gfn_t gfn,
> > > -   int max_level)
> > > +   int max_level, bool is_private)
> > >  {
> > >   struct kvm_lpage_info *linfo;
> > >   int host_level;
> > > @@ -3070,6 +3070,9 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm,
> > >   break;
> > >   }
> > >
> > > + if (is_private)
> > > + return max_level;
> >
> > The lpage mixed information is already saved, so is it possible
> > to query info->disallow_lpage without caring about 'is_private'?
>
> Actually we already queried info->disallow_lpage just before this
> sentence. The check is needed because later in the function we call
> host_pfn_mapping_level(), which is specific to shared memory.

You're right. We can't get mapping level info for private page from
host_pfn_mapping_level().

>
> Thanks,
> Chao
> >
> > > +
> > >   if (max_level == PG_LEVEL_4K)
> > >   return PG_LEVEL_4K;
> > >
> > > @@ -3098,7 +3101,8 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, 
> > > struct kvm_page_fault *fault
> > >* level, which will be used to do precise, accurate accounting.
> > >*/
> > >   fault->req_level = kvm_mmu_max_mapping_level(vcpu->kvm, slot,
> > > -  fault->gfn, 
> > > fault->max_level);
> > > +  fault->gfn, 
> > > fault->max_level,
> > > +  fault->is_private);
> > >   if (fault->req_level == PG_LEVEL_4K || fault->huge_page_disallowed)
> > >   return;
> > >
> > > @@ -4178,6 +4182,49 @@ void kvm_arch_async_page_ready(struct kvm_vcpu 
> > > *vcpu, struct kvm_async_pf *work)
> > >   kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true);
> > >  }
> > >
> > > +static inline u8 order_to_level(int order)
> > > +{
> > > + BUILD_BUG_ON(KVM_MAX_HUGEPAGE_LEVEL > PG_LEVEL_1G);
> > > +
> > > + if (order >= KVM_HPAGE_GFN_SHIFT(PG_LEVEL_1G))
> > > + return PG_LEVEL_1G;
> > > +
> > > + if (order >= KVM_HPAGE_GFN_SHIFT(PG_LEVEL_2M))
> > > + return PG_LEVEL_2M;
> > > +
> > > + return PG_LEVEL_4K;
> > > +}
> > > +
> > > +static int kvm_do_memory_fault_exit(struct kvm_vcpu *vcpu,
> > > + struct kvm_page_fault *fault)
> > > +{
> > > + vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> > > + if (fault->is_private)
> > > + vcpu->run->memory.flags = KVM_MEMORY_EXIT_FLAG_PRIVATE;
> > > + else
> > > + vcpu->run->memory.flags = 0;
> > > + vcpu->run->memory.gpa = fault->gfn << PAGE_SHIFT;
> > > + vcpu->run->memory.size = PAGE_SIZE;
> > > + return RET_PF_USER;
> > > +}
> > > +
> > > +static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
> > > +struct kvm_page_fault *fault)
> > > +{
> > > + int order;
> > > + struct kvm_memory_slot *slot = fault->slot;
> > > +
> > > + if (!kvm_slot_can_be_private(slot))
> > > + return kvm_do_memory_fault_exit(vcpu, fault);
> > > +
> > > + if (kvm_restricted_mem_get_pfn(slot, fault->gfn, &fault->p

Re: [PATCH v10 6/9] KVM: Unmap existing mappings when change the memory attributes

2022-12-08 Thread Yuan Yao
On Thu, Dec 08, 2022 at 07:20:43PM +0800, Chao Peng wrote:
> On Wed, Dec 07, 2022 at 04:13:14PM +0800, Yuan Yao wrote:
> > On Fri, Dec 02, 2022 at 02:13:44PM +0800, Chao Peng wrote:
> > > Unmap the existing guest mappings when memory attribute is changed
> > > between shared and private. This is needed because shared pages and
> > > private pages are from different backends, unmapping existing ones
> > > gives a chance for page fault handler to re-populate the mappings
> > > according to the new attribute.
> > >
> > > Only architectures with private memory support need this, and a
> > > supporting architecture is expected to override the weak
> > > kvm_arch_has_private_mem().
> > >
> > > Also, during the memory attribute change and the unmapping time frame,
> > > page faults may happen in the same memory range and can cause an
> > > incorrect page state; invoke the kvm_mmu_invalidate_* helpers to make
> > > the page fault handler retry during this time frame.
> > >
> > > Signed-off-by: Chao Peng 
> > > ---
> > >  include/linux/kvm_host.h |   7 +-
> > >  virt/kvm/kvm_main.c  | 168 ++-
> > >  2 files changed, 116 insertions(+), 59 deletions(-)
> > >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index 3d69484d2704..3331c0c92838 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -255,7 +255,6 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t 
> > > cr2_or_gpa,
> > >  int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
> > >  #endif
> > >
> > > -#ifdef KVM_ARCH_WANT_MMU_NOTIFIER
> > >  struct kvm_gfn_range {
> > >   struct kvm_memory_slot *slot;
> > >   gfn_t start;
> > > @@ -264,6 +263,8 @@ struct kvm_gfn_range {
> > >   bool may_block;
> > >  };
> > >  bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
> > > +
> > > +#ifdef KVM_ARCH_WANT_MMU_NOTIFIER
> > >  bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> > >  bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> > >  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> > > @@ -785,11 +786,12 @@ struct kvm {
> > >
> > >  #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
> > >   struct mmu_notifier mmu_notifier;
> > > +#endif
> > >   unsigned long mmu_invalidate_seq;
> > >   long mmu_invalidate_in_progress;
> > >   gfn_t mmu_invalidate_range_start;
> > >   gfn_t mmu_invalidate_range_end;
> > > -#endif
> > > +
> > >   struct list_head devices;
> > >   u64 manual_dirty_log_protect;
> > >   struct dentry *debugfs_dentry;
> > > @@ -1480,6 +1482,7 @@ bool kvm_arch_dy_has_pending_interrupt(struct 
> > > kvm_vcpu *vcpu);
> > >  int kvm_arch_post_init_vm(struct kvm *kvm);
> > >  void kvm_arch_pre_destroy_vm(struct kvm *kvm);
> > >  int kvm_arch_create_vm_debugfs(struct kvm *kvm);
> > > +bool kvm_arch_has_private_mem(struct kvm *kvm);
> > >
> > >  #ifndef __KVM_HAVE_ARCH_VM_ALLOC
> > >  /*
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index ad55dfbc75d7..4e1e1e113bf0 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -520,6 +520,62 @@ void kvm_destroy_vcpus(struct kvm *kvm)
> > >  }
> > >  EXPORT_SYMBOL_GPL(kvm_destroy_vcpus);
> > >
> > > +void kvm_mmu_invalidate_begin(struct kvm *kvm)
> > > +{
> > > + /*
> > > +  * The count increase must become visible at unlock time as no
> > > +  * spte can be established without taking the mmu_lock and
> > > +  * count is also read inside the mmu_lock critical section.
> > > +  */
> > > + kvm->mmu_invalidate_in_progress++;
> > > +
> > > + if (likely(kvm->mmu_invalidate_in_progress == 1)) {
> > > + kvm->mmu_invalidate_range_start = INVALID_GPA;
> > > + kvm->mmu_invalidate_range_end = INVALID_GPA;
> > > + }
> > > +}
> > > +
> > > +void kvm_mmu_invalidate_range_add(struct kvm *kvm, gfn_t start, gfn_t 
> > > end)
> > > +{
> > > + WARN_ON_ONCE(!kvm->mmu_invalidate_in_progress);
> > > +
> > > + if (likely(kvm->mmu_invalidate_in_progress == 1)) {
> > > + kvm->mmu_invalidate_range_start = start;
> > > + kvm->mmu_invalidate_range_end = end;
> > > + } else {
> > > + /*
> > > +  * Fully tracking multiple concurrent ranges has diminishing
> > > +  * returns. Keep things simple and just find the minimal range
> > > +  * which includes the current and new ranges. As there won't be
> > > +  * enough information to subtract a range after its invalidate
> > > +  * completes, any ranges invalidated concurrently will
> > > +  * accumulate and persist until all outstanding invalidates
> > > +  * complete.
> > > +  */
> > > + kvm->mmu_invalidate_range_start =
> > > + min(kvm->mmu_invalidate_range_start, start);
> > > + kvm->mmu_invalidate_range_end =
> > > + max(kvm->mmu_invalidate_range_end, end);
> > > + }
> > > +}
> > > +
> > > +void kvm_mmu_invali
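
The min/max merging in kvm_mmu_invalidate_range_add() above can be reduced to a
small standalone C model. This is only an illustrative sketch; the struct and
function names here are invented for the example and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model (names invented): concurrent invalidations are
 * collapsed into the single minimal range covering all of them, as the
 * comment in kvm_mmu_invalidate_range_add() above describes. */
struct inval_state {
    int in_progress;            /* count of nested invalidations */
    uint64_t start, end;
};

static void range_add(struct inval_state *s, uint64_t start, uint64_t end)
{
    if (s->in_progress == 1) {
        /* First range: take it as-is. */
        s->start = start;
        s->end = end;
    } else {
        /* Later ranges: widen to the minimal covering range. */
        if (start < s->start) {
            s->start = start;
        }
        if (end > s->end) {
            s->end = end;
        }
    }
}
```

As the kernel comment notes, ranges accumulated this way persist until all
outstanding invalidations complete, since there is not enough information to
subtract a finished range back out.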

Re: Target-dependent include path, why?

2022-12-08 Thread Richard Henderson

On 12/8/22 23:12, Markus Armbruster wrote:

I stumbled over this:

 ../include/ui/qemu-pixman.h:12:10: fatal error: pixman.h: No such file or 
directory
12 | #include <pixman.h>
   |  ^~

Works when included into target-dependent code.

Running make V=1 shows we're passing a number of -I flags only when compiling
target-dependent code, i.e. together with -DNEED_CPU_H:

 -I/usr/include/pixman-1 -I/usr/include/capstone 
-I/usr/include/spice-server -I/usr/include/spice-1

 -I/usr/include/cacard -I/usr/include/nss3 -I/usr/include/nspr4 
-I/usr/include/PCSC

 -isystem../linux-headers -isystemlinux-headers

Why?


Because of where [pixman] is added as a dependency in meson.build.
If you want to use it somewhere new, you've got to move the dependency.


r~




[PATCH v3 0/8] accel/tcg: Rewrite user-only vma tracking

2022-12-08 Thread Richard Henderson
The primary motivator here are the numerous bug reports (e.g. #290)
about not being able to handle very large memory allocations.
I presume all or most of these are due to guest use of the clang
address sanitizer, which allocates a massive shadow vma.

This patch set copies the linux kernel code for interval trees,
which is what the kernel itself uses for managing vmas.  I then
purge all (real) use of PageDesc from user-only.  This is easy
for user-only because everything tricky happens under mmap_lock().

I have thought only briefly about using interval trees for system
mode too, but the locking situation there is more difficult.  So
for now that code gets moved around but not substantially changed.

The test case from #290 is added to tests/tcg/multiarch/.
Before this patch set, on my moderately beefy laptop, it takes 39s
and has an RSS of 28GB before the qemu process is killed.  After
the patch set, the test case successfully allocates 16TB and
completes in 0.013s.
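
The memory win can be sketched with back-of-the-envelope arithmetic. The sizes
below are assumptions for illustration (a 16 TiB shadow vma, 4 KiB pages, a
16-byte per-page record), not QEMU's real structure sizes: per-page tracking
scales with the size of the mapping, while one interval node (roughly 48 bytes,
assuming three pointers plus three 64-bit endpoints) covers the whole
contiguous range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for the sketch, not QEMU's real structs. */
#define VMA_BYTES   (16ULL << 40)   /* 16 TiB sanitizer shadow vma */
#define PAGE_BYTES  (4ULL << 10)    /* assumed 4 KiB pages */
#define DESC_BYTES  16ULL           /* assumed per-page descriptor size */

/* Cost of one record per page for the whole vma. */
static uint64_t per_page_overhead(void)
{
    return VMA_BYTES / PAGE_BYTES * DESC_BYTES;
}
```

With these assumed numbers the per-page scheme needs 64 GiB of metadata for a
single mapping, which is consistent with the RSS blow-up reported in #290.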


r~


Changes for v3:
  * Rename page_flush_tb to tb_remove_all (new patch 2).
  * Shuffle code in last patch, remove tb_lock for !sysemu for clang.

Changes for v2:
  * Rebase on master, 17 patches merged.
  * Structure of page_get_target_data adjusted (ajb).


Richard Henderson (8):
  util: Add interval-tree.c
  accel/tcg: Rename page_flush_tb
  accel/tcg: Use interval tree for TBs in user-only mode
  accel/tcg: Use interval tree for TARGET_PAGE_DATA_SIZE
  accel/tcg: Move page_{get,set}_flags to user-exec.c
  accel/tcg: Use interval tree for user-only page tracking
  accel/tcg: Move PageDesc tree into tb-maint.c for system
  accel/tcg: Move remainder of page locking to tb-maint.c

 accel/tcg/internal.h|  85 +--
 include/exec/exec-all.h |  43 +-
 include/exec/translate-all.h|   6 -
 include/qemu/interval-tree.h|  99 
 accel/tcg/tb-maint.c| 984 
 accel/tcg/translate-all.c   | 746 
 accel/tcg/user-exec.c   | 658 -
 tests/tcg/multiarch/test-vma.c  |  22 +
 tests/unit/test-interval-tree.c | 209 +++
 util/interval-tree.c| 882 
 tests/unit/meson.build  |   1 +
 util/meson.build|   1 +
 12 files changed, 2653 insertions(+), 1083 deletions(-)
 create mode 100644 include/qemu/interval-tree.h
 create mode 100644 tests/tcg/multiarch/test-vma.c
 create mode 100644 tests/unit/test-interval-tree.c
 create mode 100644 util/interval-tree.c

-- 
2.34.1




[PATCH v3 2/8] accel/tcg: Rename page_flush_tb

2022-12-08 Thread Richard Henderson
Rename to tb_remove_all, to remove the PageDesc "page" from the name,
and to avoid suggesting a "flush" in the icache sense.

Signed-off-by: Richard Henderson 
---
 accel/tcg/tb-maint.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 0cdb35548c..b5b90347ae 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -51,7 +51,7 @@ void tb_htable_init(void)
 }
 
 /* Set to NULL all the 'first_tb' fields in all PageDescs. */
-static void page_flush_tb_1(int level, void **lp)
+static void tb_remove_all_1(int level, void **lp)
 {
 int i;
 
@@ -70,17 +70,17 @@ static void page_flush_tb_1(int level, void **lp)
 void **pp = *lp;
 
 for (i = 0; i < V_L2_SIZE; ++i) {
-page_flush_tb_1(level - 1, pp + i);
+tb_remove_all_1(level - 1, pp + i);
 }
 }
 }
 
-static void page_flush_tb(void)
+static void tb_remove_all(void)
 {
 int i, l1_sz = v_l1_size;
 
 for (i = 0; i < l1_sz; i++) {
-page_flush_tb_1(v_l2_levels, l1_map + i);
+tb_remove_all_1(v_l2_levels, l1_map + i);
 }
 }
 
@@ -101,7 +101,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data 
tb_flush_count)
 }
 
 qht_reset_size(&tb_ctx.htable, CODE_GEN_HTABLE_SIZE);
-page_flush_tb();
+tb_remove_all();
 
 tcg_region_reset_all();
 /* XXX: flush processor icache at this point if cache flush is expensive */
-- 
2.34.1




[PATCH v3 4/8] accel/tcg: Use interval tree for TARGET_PAGE_DATA_SIZE

2022-12-08 Thread Richard Henderson
Continue weaning user-only away from PageDesc.

Use an interval tree to record target data.
Chunk the data, to minimize allocation overhead.

Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h  |  1 -
 accel/tcg/user-exec.c | 99 ---
 2 files changed, 74 insertions(+), 26 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index bf1bf62e2a..0f91ee939c 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -26,7 +26,6 @@
 typedef struct PageDesc {
 #ifdef CONFIG_USER_ONLY
 unsigned long flags;
-void *target_data;
 #else
 QemuSpin lock;
 /* list of TBs intersecting this ram page */
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index fb7d6ee9e9..42a04bdb21 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -210,47 +210,96 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState 
*env, target_ulong addr,
 return addr;
 }
 
+#ifdef TARGET_PAGE_DATA_SIZE
+/*
+ * Allocate chunks of target data together.  For the only current user,
+ * if we allocate one hunk per page, we have overhead of 40/128 or 40%.
+ * Therefore, allocate memory for 64 pages at a time for overhead < 1%.
+ */
+#define TPD_PAGES  64
+#define TBD_MASK   (TARGET_PAGE_MASK * TPD_PAGES)
+
+typedef struct TargetPageDataNode {
+IntervalTreeNode itree;
+char data[TPD_PAGES][TARGET_PAGE_DATA_SIZE] __attribute__((aligned));
+} TargetPageDataNode;
+
+static IntervalTreeRoot targetdata_root;
+
 void page_reset_target_data(target_ulong start, target_ulong end)
 {
-#ifdef TARGET_PAGE_DATA_SIZE
-target_ulong addr, len;
+IntervalTreeNode *n, *next;
+target_ulong last;
 
-/*
- * This function should never be called with addresses outside the
- * guest address space.  If this assert fires, it probably indicates
- * a missing call to h2g_valid.
- */
-assert(end - 1 <= GUEST_ADDR_MAX);
-assert(start < end);
 assert_memory_lock();
 
 start = start & TARGET_PAGE_MASK;
-end = TARGET_PAGE_ALIGN(end);
+last = TARGET_PAGE_ALIGN(end) - 1;
 
-for (addr = start, len = end - start;
- len != 0;
- len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
-PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, 1);
+for (n = interval_tree_iter_first(&targetdata_root, start, last),
+ next = n ? interval_tree_iter_next(n, start, last) : NULL;
+ n != NULL;
+ n = next,
+ next = next ? interval_tree_iter_next(n, start, last) : NULL) {
+target_ulong n_start, n_last, p_ofs, p_len;
+TargetPageDataNode *t;
 
-g_free(p->target_data);
-p->target_data = NULL;
+if (n->start >= start && n->last <= last) {
+interval_tree_remove(n, &targetdata_root);
+g_free(n);
+continue;
+}
+
+if (n->start < start) {
+n_start = start;
+p_ofs = (start - n->start) >> TARGET_PAGE_BITS;
+} else {
+n_start = n->start;
+p_ofs = 0;
+}
+n_last = MIN(last, n->last);
+p_len = (n_last + 1 - n_start) >> TARGET_PAGE_BITS;
+
+t = container_of(n, TargetPageDataNode, itree);
+memset(t->data[p_ofs], 0, p_len * TARGET_PAGE_DATA_SIZE);
 }
-#endif
 }
 
-#ifdef TARGET_PAGE_DATA_SIZE
 void *page_get_target_data(target_ulong address)
 {
-PageDesc *p = page_find(address >> TARGET_PAGE_BITS);
-void *ret = p->target_data;
+IntervalTreeNode *n;
+TargetPageDataNode *t;
+target_ulong page, region;
 
-if (!ret) {
-ret = g_malloc0(TARGET_PAGE_DATA_SIZE);
-p->target_data = ret;
+page = address & TARGET_PAGE_MASK;
+region = address & TBD_MASK;
+
+n = interval_tree_iter_first(&targetdata_root, page, page);
+if (!n) {
+/*
+ * See util/interval-tree.c re lockless lookups: no false positives
+ * but there are false negatives.  If we find nothing, retry with
+ * the mmap lock acquired.  We also need the lock for the
+ * allocation + insert.
+ */
+mmap_lock();
+n = interval_tree_iter_first(&targetdata_root, page, page);
+if (!n) {
+t = g_new0(TargetPageDataNode, 1);
+n = &t->itree;
+n->start = region;
+n->last = region | ~TBD_MASK;
+interval_tree_insert(n, &targetdata_root);
+}
+mmap_unlock();
 }
-return ret;
+
+t = container_of(n, TargetPageDataNode, itree);
+return t->data[(page - region) >> TARGET_PAGE_BITS];
 }
-#endif
+#else
+void page_reset_target_data(target_ulong start, target_ulong end) { }
+#endif /* TARGET_PAGE_DATA_SIZE */
 
 /* The softmmu versions of these helpers are in cputlb.c.  */
 
-- 
2.34.1
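
The region/index arithmetic in page_get_target_data() above can be illustrated
standalone. The page size and macro values here are assumed for the example;
note that the TBD_MASK trick relies on the page mask being a negative power of
two in two's complement, so multiplying it by TPD_PAGES widens the mask to
cover one 64-page region.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB pages; the series' 64-pages-per-node grouping. */
#define PAGE_BITS 12
#define PAGE_SIZE (1ULL << PAGE_BITS)
#define PAGE_MASK (~(PAGE_SIZE - 1))       /* 0xfff...f000 */
#define TPD_PAGES 64ULL
#define TBD_MASK  (PAGE_MASK * TPD_PAGES)  /* 0xfff...c0000: 256 KiB region */

/* Which of the 64 data slots in a TargetPageDataNode a given address
 * selects: its page offset within the containing region. */
static unsigned chunk_index(uint64_t address)
{
    uint64_t page   = address & PAGE_MASK;
    uint64_t region = address & TBD_MASK;
    return (unsigned)((page - region) >> PAGE_BITS);
}
```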




[PATCH v3 6/8] accel/tcg: Use interval tree for user-only page tracking

2022-12-08 Thread Richard Henderson
Finish weaning user-only away from PageDesc.

Using an interval tree to track page permissions means that
we can represent very large regions efficiently.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/290
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/967
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1214
Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h   |   4 +-
 accel/tcg/tb-maint.c   |  20 +-
 accel/tcg/user-exec.c  | 615 ++---
 tests/tcg/multiarch/test-vma.c |  22 ++
 4 files changed, 451 insertions(+), 210 deletions(-)
 create mode 100644 tests/tcg/multiarch/test-vma.c

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index ddd1fa6bdc..be19bdf088 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -24,9 +24,7 @@
 #endif
 
 typedef struct PageDesc {
-#ifdef CONFIG_USER_ONLY
-unsigned long flags;
-#else
+#ifndef CONFIG_USER_ONLY
 QemuSpin lock;
 /* list of TBs intersecting this ram page */
 uintptr_t first_tb;
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 8da2c64d87..20e86c813d 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -68,15 +68,23 @@ static void tb_remove_all(void)
 /* Call with mmap_lock held. */
 static void tb_record(TranslationBlock *tb, PageDesc *p1, PageDesc *p2)
 {
-/* translator_loop() must have made all TB pages non-writable */
-assert(!(p1->flags & PAGE_WRITE));
-if (p2) {
-assert(!(p2->flags & PAGE_WRITE));
-}
+target_ulong addr;
+int flags;
 
 assert_memory_lock();
-
 tb->itree.last = tb->itree.start + tb->size - 1;
+
+/* translator_loop() must have made all TB pages non-writable */
+addr = tb_page_addr0(tb);
+flags = page_get_flags(addr);
+assert(!(flags & PAGE_WRITE));
+
+addr = tb_page_addr1(tb);
+if (addr != -1) {
+flags = page_get_flags(addr);
+assert(!(flags & PAGE_WRITE));
+}
+
 interval_tree_insert(&tb->itree, &tb_root);
 }
 
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 22ef780900..a3cecda405 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -135,106 +135,61 @@ bool handle_sigsegv_accerr_write(CPUState *cpu, sigset_t 
*old_set,
 }
 }
 
-/*
- * Walks guest process memory "regions" one by one
- * and calls callback function 'fn' for each region.
- */
-struct walk_memory_regions_data {
-walk_memory_regions_fn fn;
-void *priv;
-target_ulong start;
-int prot;
-};
+typedef struct PageFlagsNode {
+IntervalTreeNode itree;
+int flags;
+} PageFlagsNode;
 
-static int walk_memory_regions_end(struct walk_memory_regions_data *data,
-   target_ulong end, int new_prot)
+static IntervalTreeRoot pageflags_root;
+
+static PageFlagsNode *pageflags_find(target_ulong start, target_long last)
 {
-if (data->start != -1u) {
-int rc = data->fn(data->priv, data->start, end, data->prot);
-if (rc != 0) {
-return rc;
-}
-}
+IntervalTreeNode *n;
 
-data->start = (new_prot ? end : -1u);
-data->prot = new_prot;
-
-return 0;
+n = interval_tree_iter_first(&pageflags_root, start, last);
+return n ? container_of(n, PageFlagsNode, itree) : NULL;
 }
 
-static int walk_memory_regions_1(struct walk_memory_regions_data *data,
- target_ulong base, int level, void **lp)
+static PageFlagsNode *pageflags_next(PageFlagsNode *p, target_ulong start,
+ target_long last)
 {
-target_ulong pa;
-int i, rc;
+IntervalTreeNode *n;
 
-if (*lp == NULL) {
-return walk_memory_regions_end(data, base, 0);
-}
-
-if (level == 0) {
-PageDesc *pd = *lp;
-
-for (i = 0; i < V_L2_SIZE; ++i) {
-int prot = pd[i].flags;
-
-pa = base | (i << TARGET_PAGE_BITS);
-if (prot != data->prot) {
-rc = walk_memory_regions_end(data, pa, prot);
-if (rc != 0) {
-return rc;
-}
-}
-}
-} else {
-void **pp = *lp;
-
-for (i = 0; i < V_L2_SIZE; ++i) {
-pa = base | ((target_ulong)i <<
-(TARGET_PAGE_BITS + V_L2_BITS * level));
-rc = walk_memory_regions_1(data, pa, level - 1, pp + i);
-if (rc != 0) {
-return rc;
-}
-}
-}
-
-return 0;
+n = interval_tree_iter_next(&p->itree, start, last);
+return n ? container_of(n, PageFlagsNode, itree) : NULL;
 }
 
 int walk_memory_regions(void *priv, walk_memory_regions_fn fn)
 {
-struct walk_memory_regions_data data;
-uintptr_t i, l1_sz = v_l1_size;
+IntervalTreeNode *n;
+int rc = 0;
 
-data.fn = fn;
-data.priv = priv;
-data.start = -1u;
-data.prot = 0;
+mmap_lock();
+for (n = interval_tree_iter_first(&pageflags_root, 0, -1);
+   

[PATCH v3 1/8] util: Add interval-tree.c

2022-12-08 Thread Richard Henderson
Copy and simplify the Linux kernel's interval_tree_generic.h,
instantiating for uint64_t.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/qemu/interval-tree.h|  99 
 tests/unit/test-interval-tree.c | 209 
 util/interval-tree.c| 882 
 tests/unit/meson.build  |   1 +
 util/meson.build|   1 +
 5 files changed, 1192 insertions(+)
 create mode 100644 include/qemu/interval-tree.h
 create mode 100644 tests/unit/test-interval-tree.c
 create mode 100644 util/interval-tree.c

diff --git a/include/qemu/interval-tree.h b/include/qemu/interval-tree.h
new file mode 100644
index 00..25006debe8
--- /dev/null
+++ b/include/qemu/interval-tree.h
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Interval trees.
+ *
+ * Derived from include/linux/interval_tree.h and its dependencies.
+ */
+
+#ifndef QEMU_INTERVAL_TREE_H
+#define QEMU_INTERVAL_TREE_H
+
+/*
+ * For now, don't expose Linux Red-Black Trees separately, but retain the
+ * separate type definitions to keep the implementation sane, and allow
+ * the possibility of disentangling them later.
+ */
+typedef struct RBNode
+{
+/* Encodes parent with color in the lsb. */
+uintptr_t rb_parent_color;
+struct RBNode *rb_right;
+struct RBNode *rb_left;
+} RBNode;
+
+typedef struct RBRoot
+{
+RBNode *rb_node;
+} RBRoot;
+
+typedef struct RBRootLeftCached {
+RBRoot rb_root;
+RBNode *rb_leftmost;
+} RBRootLeftCached;
+
+typedef struct IntervalTreeNode
+{
+RBNode rb;
+
+uint64_t start;/* Start of interval */
+uint64_t last; /* Last location _in_ interval */
+uint64_t subtree_last;
+} IntervalTreeNode;
+
+typedef RBRootLeftCached IntervalTreeRoot;
+
+/**
+ * interval_tree_is_empty
+ * @root: root of the tree.
+ *
+ * Returns true if the tree contains no nodes.
+ */
+static inline bool interval_tree_is_empty(const IntervalTreeRoot *root)
+{
+return root->rb_root.rb_node == NULL;
+}
+
+/**
+ * interval_tree_insert
+ * @node: node to insert,
+ * @root: root of the tree.
+ *
+ * Insert @node into @root, and rebalance.
+ */
+void interval_tree_insert(IntervalTreeNode *node, IntervalTreeRoot *root);
+
+/**
+ * interval_tree_remove
+ * @node: node to remove,
+ * @root: root of the tree.
+ *
+ * Remove @node from @root, and rebalance.
+ */
+void interval_tree_remove(IntervalTreeNode *node, IntervalTreeRoot *root);
+
+/**
+ * interval_tree_iter_first:
+ * @root: root of the tree,
+ * @start, @last: the inclusive interval [start, last].
+ *
+ * Locate the "first" of a set of nodes within the tree at @root
+ * that overlap the interval, where "first" is sorted by start.
+ * Returns NULL if no overlap found.
+ */
+IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
+   uint64_t start, uint64_t last);
+
+/**
+ * interval_tree_iter_next:
+ * @node: previous search result
+ * @start, @last: the inclusive interval [start, last].
+ *
+ * Locate the "next" of a set of nodes within the tree that overlap the
+ * interval; @next is the result of a previous call to
+ * interval_tree_iter_{first,next}.  Returns NULL if @next was the last
+ * node in the set.
+ */
+IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
+  uint64_t start, uint64_t last);
+
+#endif /* QEMU_INTERVAL_TREE_H */
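
A minimal reference model of the iterator contract documented above: overlap is
tested on inclusive [start, last] intervals, and results come back ordered by
start. This brute-force stand-in (with invented names) only illustrates the
semantics; the real util/interval-tree.c is an augmented red-black tree with
O(log n) queries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t start, last; } ModelNode;

/* Return the overlapping node with the smallest start, mirroring what
 * interval_tree_iter_first() promises, but in O(n). */
static ModelNode *model_iter_first(ModelNode *nodes, size_t n,
                                   uint64_t start, uint64_t last)
{
    ModelNode *best = NULL;
    for (size_t i = 0; i < n; i++) {
        /* Inclusive-interval overlap test. */
        if (nodes[i].start <= last && start <= nodes[i].last) {
            if (!best || nodes[i].start < best->start) {
                best = &nodes[i];
            }
        }
    }
    return best;
}
```

Because both endpoints are inclusive, a single point is represented as
start == last, which is exactly how the one-point test case below exercises
the tree.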
diff --git a/tests/unit/test-interval-tree.c b/tests/unit/test-interval-tree.c
new file mode 100644
index 00..119817a019
--- /dev/null
+++ b/tests/unit/test-interval-tree.c
@@ -0,0 +1,209 @@
+/*
+ * Test interval trees
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/interval-tree.h"
+
+static IntervalTreeNode nodes[20];
+static IntervalTreeRoot root;
+
+static void rand_interval(IntervalTreeNode *n, uint64_t start, uint64_t last)
+{
+gint32 s_ofs, l_ofs, l_max;
+
+if (last - start > INT32_MAX) {
+l_max = INT32_MAX;
+} else {
+l_max = last - start;
+}
+s_ofs = g_test_rand_int_range(0, l_max);
+l_ofs = g_test_rand_int_range(s_ofs, l_max);
+
+n->start = start + s_ofs;
+n->last = start + l_ofs;
+}
+
+static void test_empty(void)
+{
+g_assert(root.rb_root.rb_node == NULL);
+g_assert(root.rb_leftmost == NULL);
+g_assert(interval_tree_iter_first(&root, 0, UINT64_MAX) == NULL);
+}
+
+static void test_find_one_point(void)
+{
+/* Create a tree of a single node, which is the point [1,1]. */
+nodes[0].start = 1;
+nodes[0].last = 1;
+
+interval_tree_insert(&nodes[0], &root);
+
+g_assert(interval_tree_iter_first(&root, 0, 9) == &nodes[0]);
+g_assert(interval_tree_iter_next(&nodes[0], 0, 9) == NULL);
+g_assert(interval_tree_iter_first(&root, 0, 0) == NULL);
+g_assert(interval_tree_iter_next(&nodes[0

[PATCH v3 5/8] accel/tcg: Move page_{get,set}_flags to user-exec.c

2022-12-08 Thread Richard Henderson
This page tracking implementation is specific to user-only,
since the system softmmu version is in cputlb.c.  Move it
out of translate-all.c to user-exec.c.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h  |  17 ++
 accel/tcg/translate-all.c | 350 --
 accel/tcg/user-exec.c | 346 +
 3 files changed, 363 insertions(+), 350 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index 0f91ee939c..ddd1fa6bdc 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -33,6 +33,23 @@ typedef struct PageDesc {
 #endif
 } PageDesc;
 
+/*
+ * In system mode we want L1_MAP to be based on ram offsets,
+ * while in user mode we want it to be based on virtual addresses.
+ *
+ * TODO: For user mode, see the caveat re host vs guest virtual
+ * address spaces near GUEST_ADDR_MAX.
+ */
+#if !defined(CONFIG_USER_ONLY)
+#if HOST_LONG_BITS < TARGET_PHYS_ADDR_SPACE_BITS
+# define L1_MAP_ADDR_SPACE_BITS  HOST_LONG_BITS
+#else
+# define L1_MAP_ADDR_SPACE_BITS  TARGET_PHYS_ADDR_SPACE_BITS
+#endif
+#else
+# define L1_MAP_ADDR_SPACE_BITS  MIN(HOST_LONG_BITS, TARGET_ABI_BITS)
+#endif
+
 /* Size of the L2 (and L3, etc) page tables.  */
 #define V_L2_BITS 10
 #define V_L2_SIZE (1 << V_L2_BITS)
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index b964ea44d7..0d7596fcb8 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -109,23 +109,6 @@ struct page_collection {
 struct page_entry *max;
 };
 
-/*
- * In system mode we want L1_MAP to be based on ram offsets,
- * while in user mode we want it to be based on virtual addresses.
- *
- * TODO: For user mode, see the caveat re host vs guest virtual
- * address spaces near GUEST_ADDR_MAX.
- */
-#if !defined(CONFIG_USER_ONLY)
-#if HOST_LONG_BITS < TARGET_PHYS_ADDR_SPACE_BITS
-# define L1_MAP_ADDR_SPACE_BITS  HOST_LONG_BITS
-#else
-# define L1_MAP_ADDR_SPACE_BITS  TARGET_PHYS_ADDR_SPACE_BITS
-#endif
-#else
-# define L1_MAP_ADDR_SPACE_BITS  MIN(HOST_LONG_BITS, TARGET_ABI_BITS)
-#endif
-
 /* Make sure all possible CPU event bits fit in tb->trace_vcpu_dstate */
 QEMU_BUILD_BUG_ON(CPU_TRACE_DSTATE_MAX_EVENTS >
   sizeof_field(TranslationBlock, trace_vcpu_dstate)
@@ -1235,339 +1218,6 @@ void cpu_interrupt(CPUState *cpu, int mask)
 qatomic_set(&cpu_neg(cpu)->icount_decr.u16.high, -1);
 }
 
-/*
- * Walks guest process memory "regions" one by one
- * and calls callback function 'fn' for each region.
- */
-struct walk_memory_regions_data {
-walk_memory_regions_fn fn;
-void *priv;
-target_ulong start;
-int prot;
-};
-
-static int walk_memory_regions_end(struct walk_memory_regions_data *data,
-   target_ulong end, int new_prot)
-{
-if (data->start != -1u) {
-int rc = data->fn(data->priv, data->start, end, data->prot);
-if (rc != 0) {
-return rc;
-}
-}
-
-data->start = (new_prot ? end : -1u);
-data->prot = new_prot;
-
-return 0;
-}
-
-static int walk_memory_regions_1(struct walk_memory_regions_data *data,
- target_ulong base, int level, void **lp)
-{
-target_ulong pa;
-int i, rc;
-
-if (*lp == NULL) {
-return walk_memory_regions_end(data, base, 0);
-}
-
-if (level == 0) {
-PageDesc *pd = *lp;
-
-for (i = 0; i < V_L2_SIZE; ++i) {
-int prot = pd[i].flags;
-
-pa = base | (i << TARGET_PAGE_BITS);
-if (prot != data->prot) {
-rc = walk_memory_regions_end(data, pa, prot);
-if (rc != 0) {
-return rc;
-}
-}
-}
-} else {
-void **pp = *lp;
-
-for (i = 0; i < V_L2_SIZE; ++i) {
-pa = base | ((target_ulong)i <<
-(TARGET_PAGE_BITS + V_L2_BITS * level));
-rc = walk_memory_regions_1(data, pa, level - 1, pp + i);
-if (rc != 0) {
-return rc;
-}
-}
-}
-
-return 0;
-}
-
-int walk_memory_regions(void *priv, walk_memory_regions_fn fn)
-{
-struct walk_memory_regions_data data;
-uintptr_t i, l1_sz = v_l1_size;
-
-data.fn = fn;
-data.priv = priv;
-data.start = -1u;
-data.prot = 0;
-
-for (i = 0; i < l1_sz; i++) {
-target_ulong base = i << (v_l1_shift + TARGET_PAGE_BITS);
-int rc = walk_memory_regions_1(&data, base, v_l2_levels, l1_map + i);
-if (rc != 0) {
-return rc;
-}
-}
-
-return walk_memory_regions_end(&data, 0, 0);
-}
-
-static int dump_region(void *priv, target_ulong start,
-target_ulong end, unsigned long prot)
-{
-FILE *f = (FILE *)priv;
-
-(void) fprintf(f, TARGET_FMT_lx"-"TARGET_FMT_lx
-" "TARGET_FMT_lx" %c%c%c\n",
-start, end, end - start,
-((prot & PAGE_READ) ? 'r' : '-'),
-((prot & PA

[PATCH v3 3/8] accel/tcg: Use interval tree for TBs in user-only mode

2022-12-08 Thread Richard Henderson
Begin weaning user-only away from PageDesc.

Since, for user-only, all TB (and page) manipulation is done with
a single mutex, and there is no virtual/physical discontinuity to
split a TB across discontinuous pages, place all of the TBs into
a single IntervalTree. This makes it trivial to find all of the
TBs intersecting a range.

Retain the existing PageDesc + linked list implementation for
system mode.  Move the portion of the implementation that overlaps
the new user-only code behind the common ifdef.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h  |  16 +-
 include/exec/exec-all.h   |  43 -
 accel/tcg/tb-maint.c  | 387 ++
 accel/tcg/translate-all.c |   4 +-
 4 files changed, 279 insertions(+), 171 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index cb13bade4f..bf1bf62e2a 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -24,14 +24,13 @@
 #endif
 
 typedef struct PageDesc {
-/* list of TBs intersecting this ram page */
-uintptr_t first_tb;
 #ifdef CONFIG_USER_ONLY
 unsigned long flags;
 void *target_data;
-#endif
-#ifdef CONFIG_SOFTMMU
+#else
 QemuSpin lock;
+/* list of TBs intersecting this ram page */
+uintptr_t first_tb;
 #endif
 } PageDesc;
 
@@ -69,9 +68,6 @@ static inline PageDesc *page_find(tb_page_addr_t index)
  tb; tb = (TranslationBlock *)tb->field[n], n = (uintptr_t)tb & 1, \
  tb = (TranslationBlock *)((uintptr_t)tb & ~1))
 
-#define PAGE_FOR_EACH_TB(pagedesc, tb, n)   \
-TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
-
 #define TB_FOR_EACH_JMP(head_tb, tb, n) \
 TB_FOR_EACH_TAGGED((head_tb)->jmp_list_head, tb, n, jmp_list_next)
 
@@ -89,6 +85,12 @@ void do_assert_page_locked(const PageDesc *pd, const char 
*file, int line);
 #endif
 void page_lock(PageDesc *pd);
 void page_unlock(PageDesc *pd);
+
+/* TODO: For now, still shared with translate-all.c for system mode. */
+typedef int PageForEachNext;
+#define PAGE_FOR_EACH_TB(start, end, pagedesc, tb, n) \
+TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
+
 #endif
 #if !defined(CONFIG_USER_ONLY) && defined(CONFIG_DEBUG_TCG)
 void assert_no_pages_locked(void);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 9b7bfbf09a..25e11b0a8d 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -24,6 +24,7 @@
 #ifdef CONFIG_TCG
 #include "exec/cpu_ldst.h"
 #endif
+#include "qemu/interval-tree.h"
 
 /* allow to see translation results - the slowdown should be negligible, so we 
leave it */
 #define DEBUG_DISAS
@@ -559,11 +560,20 @@ struct TranslationBlock {
 
 struct tb_tc tc;
 
-/* first and second physical page containing code. The lower bit
-   of the pointer tells the index in page_next[].
-   The list is protected by the TB's page('s) lock(s) */
+/*
+ * Track tb_page_addr_t intervals that intersect this TB.
+ * For user-only, the virtual addresses are always contiguous,
+ * and we use a unified interval tree.  For system, we use a
+ * linked list headed in each PageDesc.  Within the list, the lsb
+ * of the previous pointer tells the index of page_next[], and the
+ * list is protected by the PageDesc lock(s).
+ */
+#ifdef CONFIG_USER_ONLY
+IntervalTreeNode itree;
+#else
 uintptr_t page_next[2];
 tb_page_addr_t page_addr[2];
+#endif
 
 /* jmp_lock placed here to fill a 4-byte hole. Its documentation is below 
*/
 QemuSpin jmp_lock;
@@ -619,24 +629,51 @@ static inline uint32_t tb_cflags(const TranslationBlock 
*tb)
 
 static inline tb_page_addr_t tb_page_addr0(const TranslationBlock *tb)
 {
+#ifdef CONFIG_USER_ONLY
+return tb->itree.start;
+#else
 return tb->page_addr[0];
+#endif
 }
 
 static inline tb_page_addr_t tb_page_addr1(const TranslationBlock *tb)
 {
+#ifdef CONFIG_USER_ONLY
+tb_page_addr_t next = tb->itree.last & TARGET_PAGE_MASK;
+return next == (tb->itree.start & TARGET_PAGE_MASK) ? -1 : next;
+#else
 return tb->page_addr[1];
+#endif
 }
 
 static inline void tb_set_page_addr0(TranslationBlock *tb,
  tb_page_addr_t addr)
 {
+#ifdef CONFIG_USER_ONLY
+tb->itree.start = addr;
+/*
+ * To begin, we record an interval of one byte.  When the translation
+ * loop encounters a second page, the interval will be extended to
+ * include the first byte of the second page, which is sufficient to
+ * allow tb_page_addr1() above to work properly.  The final corrected
+ * interval will be set by tb_page_add() from tb->size before the
+ * node is added to the interval tree.
+ */
+tb->itree.last = addr;
+#else
 tb->page_addr[0] = addr;
+#endif
 }
 
 static inline void tb_set_page_addr1(TranslationBlock *tb,
  tb_page_addr_t addr)
 {
+#ifdef CONFIG_USER_ONLY
+/* Ext
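
The page-span encoding used by tb_page_addr1() above can be sketched on its
own, with an assumed 4 KiB page size: a TB whose inclusive [start, last]
interval stays within a single page reports -1 for its second page.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL               /* assumed page size */
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* Mirror of the tb_page_addr1() logic above, over a plain
 * inclusive [start, last] interval. */
static uint64_t second_page(uint64_t start, uint64_t last)
{
    uint64_t next = last & PAGE_MASK;
    return next == (start & PAGE_MASK) ? (uint64_t)-1 : next;
}
```

This also shows why tb_set_page_addr0() initially recording a one-byte
interval is sufficient: until translation reaches a second page, start and
last fall on the same page and the function returns -1.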

[PATCH v3 7/8] accel/tcg: Move PageDesc tree into tb-maint.c for system

2022-12-08 Thread Richard Henderson
Now that PageDesc is not used for user-only, and for system
it is only used for tb maintenance, move the implementation
into tb-maint.c, appropriately ifdefed.

We have not yet eliminated all references to PageDesc for
user-only, so retain a typedef to the structure without definition.

Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h  |  49 +++---
 accel/tcg/tb-maint.c  | 130 --
 accel/tcg/translate-all.c |  95 
 3 files changed, 134 insertions(+), 140 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index be19bdf088..14b89c4ee8 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -23,51 +23,13 @@
 #define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
 #endif
 
-typedef struct PageDesc {
+typedef struct PageDesc PageDesc;
 #ifndef CONFIG_USER_ONLY
+struct PageDesc {
 QemuSpin lock;
 /* list of TBs intersecting this ram page */
 uintptr_t first_tb;
-#endif
-} PageDesc;
-
-/*
- * In system mode we want L1_MAP to be based on ram offsets,
- * while in user mode we want it to be based on virtual addresses.
- *
- * TODO: For user mode, see the caveat re host vs guest virtual
- * address spaces near GUEST_ADDR_MAX.
- */
-#if !defined(CONFIG_USER_ONLY)
-#if HOST_LONG_BITS < TARGET_PHYS_ADDR_SPACE_BITS
-# define L1_MAP_ADDR_SPACE_BITS  HOST_LONG_BITS
-#else
-# define L1_MAP_ADDR_SPACE_BITS  TARGET_PHYS_ADDR_SPACE_BITS
-#endif
-#else
-# define L1_MAP_ADDR_SPACE_BITS  MIN(HOST_LONG_BITS, TARGET_ABI_BITS)
-#endif
-
-/* Size of the L2 (and L3, etc) page tables.  */
-#define V_L2_BITS 10
-#define V_L2_SIZE (1 << V_L2_BITS)
-
-/*
- * L1 Mapping properties
- */
-extern int v_l1_size;
-extern int v_l1_shift;
-extern int v_l2_levels;
-
-/*
- * The bottom level has pointers to PageDesc, and is indexed by
- * anything from 4 to (V_L2_BITS + 3) bits, depending on target page size.
- */
-#define V_L1_MIN_BITS 4
-#define V_L1_MAX_BITS (V_L2_BITS + 3)
-#define V_L1_MAX_SIZE (1 << V_L1_MAX_BITS)
-
-extern void *l1_map[V_L1_MAX_SIZE];
+};
 
 PageDesc *page_find_alloc(tb_page_addr_t index, bool alloc);
 
@@ -76,6 +38,11 @@ static inline PageDesc *page_find(tb_page_addr_t index)
 return page_find_alloc(index, false);
 }
 
+void page_table_config_init(void);
+#else
+static inline void page_table_config_init(void) { }
+#endif
+
 /* list iterators for lists of tagged pointers in TranslationBlock */
 #define TB_FOR_EACH_TAGGED(head, tb, n, field)  \
 for (n = (head) & 1, tb = (TranslationBlock *)((head) & ~1);\
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 20e86c813d..9b996bbeb2 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -127,6 +127,121 @@ static PageForEachNext foreach_tb_next(PageForEachNext tb,
 }
 
 #else
+/*
+ * In system mode we want L1_MAP to be based on ram offsets.
+ */
+#if HOST_LONG_BITS < TARGET_PHYS_ADDR_SPACE_BITS
+# define L1_MAP_ADDR_SPACE_BITS  HOST_LONG_BITS
+#else
+# define L1_MAP_ADDR_SPACE_BITS  TARGET_PHYS_ADDR_SPACE_BITS
+#endif
+
+/* Size of the L2 (and L3, etc) page tables.  */
+#define V_L2_BITS 10
+#define V_L2_SIZE (1 << V_L2_BITS)
+
+/*
+ * L1 Mapping properties
+ */
+static int v_l1_size;
+static int v_l1_shift;
+static int v_l2_levels;
+
+/*
+ * The bottom level has pointers to PageDesc, and is indexed by
+ * anything from 4 to (V_L2_BITS + 3) bits, depending on target page size.
+ */
+#define V_L1_MIN_BITS 4
+#define V_L1_MAX_BITS (V_L2_BITS + 3)
+#define V_L1_MAX_SIZE (1 << V_L1_MAX_BITS)
+
+static void *l1_map[V_L1_MAX_SIZE];
+
+void page_table_config_init(void)
+{
+uint32_t v_l1_bits;
+
+assert(TARGET_PAGE_BITS);
+/* The bits remaining after N lower levels of page tables.  */
+v_l1_bits = (L1_MAP_ADDR_SPACE_BITS - TARGET_PAGE_BITS) % V_L2_BITS;
+if (v_l1_bits < V_L1_MIN_BITS) {
+v_l1_bits += V_L2_BITS;
+}
+
+v_l1_size = 1 << v_l1_bits;
+v_l1_shift = L1_MAP_ADDR_SPACE_BITS - TARGET_PAGE_BITS - v_l1_bits;
+v_l2_levels = v_l1_shift / V_L2_BITS - 1;
+
+assert(v_l1_bits <= V_L1_MAX_BITS);
+assert(v_l1_shift % V_L2_BITS == 0);
+assert(v_l2_levels >= 0);
+}
+
+PageDesc *page_find_alloc(tb_page_addr_t index, bool alloc)
+{
+PageDesc *pd;
+void **lp;
+int i;
+
+/* Level 1.  Always allocated.  */
+lp = l1_map + ((index >> v_l1_shift) & (v_l1_size - 1));
+
+/* Level 2..N-1.  */
+for (i = v_l2_levels; i > 0; i--) {
+void **p = qatomic_rcu_read(lp);
+
+if (p == NULL) {
+void *existing;
+
+if (!alloc) {
+return NULL;
+}
+p = g_new0(void *, V_L2_SIZE);
+existing = qatomic_cmpxchg(lp, NULL, p);
+if (unlikely(existing)) {
+g_free(p);
+p = existing;
+}
+}
+
+lp = p + ((index >> (i * V_L2_BITS)) & (V_L2_SIZE - 1));
+}
+
+pd = qatomic_rcu_read(lp);
+   

[PATCH v3 8/8] accel/tcg: Move remainder of page locking to tb-maint.c

2022-12-08 Thread Richard Henderson
The only thing that still touches PageDesc in translate-all.c
are some locking routines related to tb-maint.c which have not
yet been moved.  Do so now.

Move some code up in tb-maint.c as well, to untangle the maze
of ifdefs, and allow a sensible final ordering.

Move some declarations from exec/translate-all.h to internal.h,
as they are only used within accel/tcg/.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h |  68 ++---
 include/exec/translate-all.h |   6 -
 accel/tcg/tb-maint.c | 473 +--
 accel/tcg/translate-all.c| 301 --
 4 files changed, 411 insertions(+), 437 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index 14b89c4ee8..e1429a53ac 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -23,62 +23,28 @@
 #define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
 #endif
 
-typedef struct PageDesc PageDesc;
-#ifndef CONFIG_USER_ONLY
-struct PageDesc {
-QemuSpin lock;
-/* list of TBs intersecting this ram page */
-uintptr_t first_tb;
-};
-
-PageDesc *page_find_alloc(tb_page_addr_t index, bool alloc);
-
-static inline PageDesc *page_find(tb_page_addr_t index)
-{
-return page_find_alloc(index, false);
-}
-
-void page_table_config_init(void);
-#else
-static inline void page_table_config_init(void) { }
-#endif
-
-/* list iterators for lists of tagged pointers in TranslationBlock */
-#define TB_FOR_EACH_TAGGED(head, tb, n, field)  \
-for (n = (head) & 1, tb = (TranslationBlock *)((head) & ~1);\
- tb; tb = (TranslationBlock *)tb->field[n], n = (uintptr_t)tb & 1, \
- tb = (TranslationBlock *)((uintptr_t)tb & ~1))
-
-#define TB_FOR_EACH_JMP(head_tb, tb, n) \
-TB_FOR_EACH_TAGGED((head_tb)->jmp_list_head, tb, n, jmp_list_next)
-
-/* In user-mode page locks aren't used; mmap_lock is enough */
-#ifdef CONFIG_USER_ONLY
-#define assert_page_locked(pd) tcg_debug_assert(have_mmap_lock())
-static inline void page_lock(PageDesc *pd) { }
-static inline void page_unlock(PageDesc *pd) { }
-#else
-#ifdef CONFIG_DEBUG_TCG
-void do_assert_page_locked(const PageDesc *pd, const char *file, int line);
-#define assert_page_locked(pd) do_assert_page_locked(pd, __FILE__, __LINE__)
-#else
-#define assert_page_locked(pd)
-#endif
-void page_lock(PageDesc *pd);
-void page_unlock(PageDesc *pd);
-
-/* TODO: For now, still shared with translate-all.c for system mode. */
-typedef int PageForEachNext;
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, tb, n) \
-TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
-
-#endif
-#if !defined(CONFIG_USER_ONLY) && defined(CONFIG_DEBUG_TCG)
+#if defined(CONFIG_SOFTMMU) && defined(CONFIG_DEBUG_TCG)
 void assert_no_pages_locked(void);
 #else
 static inline void assert_no_pages_locked(void) { }
 #endif
 
+#ifdef CONFIG_USER_ONLY
+static inline void page_table_config_init(void) { }
+#else
+void page_table_config_init(void);
+#endif
+
+#ifdef CONFIG_SOFTMMU
+struct page_collection;
+void tb_invalidate_phys_page_fast(struct page_collection *pages,
+  tb_page_addr_t start, int len,
+  uintptr_t retaddr);
+struct page_collection *page_collection_lock(tb_page_addr_t start,
+ tb_page_addr_t end);
+void page_collection_unlock(struct page_collection *set);
+#endif /* CONFIG_SOFTMMU */
+
 TranslationBlock *tb_gen_code(CPUState *cpu, target_ulong pc,
   target_ulong cs_base, uint32_t flags,
   int cflags);
diff --git a/include/exec/translate-all.h b/include/exec/translate-all.h
index 3e9cb91565..88602ae8d8 100644
--- a/include/exec/translate-all.h
+++ b/include/exec/translate-all.h
@@ -23,12 +23,6 @@
 
 
 /* translate-all.c */
-struct page_collection *page_collection_lock(tb_page_addr_t start,
- tb_page_addr_t end);
-void page_collection_unlock(struct page_collection *set);
-void tb_invalidate_phys_page_fast(struct page_collection *pages,
-  tb_page_addr_t start, int len,
-  uintptr_t retaddr);
 void tb_invalidate_phys_page(tb_page_addr_t addr);
 void tb_check_watchpoint(CPUState *cpu, uintptr_t retaddr);
 
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 9b996bbeb2..0c56e81d8c 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -30,6 +30,15 @@
 #include "internal.h"
 
 
+/* List iterators for lists of tagged pointers in TranslationBlock. */
+#define TB_FOR_EACH_TAGGED(head, tb, n, field)  \
+for (n = (head) & 1, tb = (TranslationBlock *)((head) & ~1);\
+ tb; tb = (TranslationBlock *)tb->field[n], n = (uintptr_t)tb & 1, \
+ tb = (TranslationBlock *)((uintptr_t)tb & ~1))
+
+#define TB_FOR_EACH_JMP(head

Target-dependent include path, why?

2022-12-08 Thread Markus Armbruster
I stumbled over this:

../include/ui/qemu-pixman.h:12:10: fatal error: pixman.h: No such file or directory
   12 | #include <pixman.h>
      |          ^~~~~~~~~~

Works when included into target-dependent code.

Running make V=1 shows we're passing a number of -I only when compiling
target-dependent code, i.e. together with -DNEED_CPU_H:

-I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/spice-server 
-I/usr/include/spice-1

-I/usr/include/cacard -I/usr/include/nss3 -I/usr/include/nspr4 
-I/usr/include/PCSC

-isystem../linux-headers -isystemlinux-headers

Why?




Re: [PATCH RESEND v3 00/10] migration: introduce dirtylimit capability

2022-12-08 Thread Hyman

Ping?

On 2022/12/4 1:09, huang...@chinatelecom.cn wrote:

From: Hyman Huang(黄勇) 

v3(resend):
- fix the syntax error of the topic.

v3:
This version makes some modifications inspired by Peter and Markus,
as follows:
1. Do the code clean up in [PATCH v2 02/11] suggested by Markus
2. Replace the [PATCH v2 03/11] with a much simpler patch posted by
Peter to fix the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2124756
3. Fix the error path of migrate_params_check in [PATCH v2 04/11]
pointed out by Markus. Enrich the commit message to explain why
x-vcpu-dirty-limit-period is an unstable parameter.
4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11]
suggested by Peter:
a. apply blk_mig_bulk_active check before enable dirty-limit
b. drop the unhelpful check function before enable dirty-limit
c. change the migration_cancel logic, just cancel dirty-limit
   only if dirty-limit capability turned on.
d. abstract a code clean commit [PATCH v3 07/10] to adjust
   the check order before enable auto-converge
5. Change the name of observing indexes during dirty-limit live
migration to make them easier to understand. Use the
maximum throttle time of vcpus as "dirty-limit-throttle-time-per-full"
6. Fix some grammatical and spelling errors pointed out by Markus
and enrich the document about the dirty-limit live migration
observing indexes "dirty-limit-ring-full-time"
and "dirty-limit-throttle-time-per-full"
7. Change the default value of x-vcpu-dirty-limit-period to 1000ms,
which is optimal value pointed out in cover letter in that
testing environment.
8. Drop the 2 guestperf test commits [PATCH v2 10/11],
[PATCH v2 11/11] and post them with a standalone series in the
future.

Sincere thanks to Peter and Markus for the passionate, efficient,
and careful comments and suggestions.

Please review.

Yong

v2:
This version makes a few modifications compared with
version 1, as follows:
1. fix the overflow issue reported by Peter Maydell
2. add parameter check for hmp "set_vcpu_dirty_limit" command
3. fix the racing issue between dirty ring reaper thread and
Qemu main thread.
4. add migrate parameter check for x-vcpu-dirty-limit-period
and vcpu-dirty-limit.
5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit,
cancel_vcpu_dirty_limit during dirty-limit live migration when
implementing the dirty-limit convergence algo.
6. add capability check to ensure auto-converge and dirty-limit
are mutually exclusive.
7. pre-check if kvm dirty ring size is configured before setting
dirty-limit migrate parameter

A more comprehensive test was done comparing with version 1.

The following are test environment:
-
a. Host hardware info:

CPU:
Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz

CPU(s):  64
On-line CPU(s) list: 0-63
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):   2
NUMA node(s):2

NUMA node0 CPU(s):   0-15,32-47
NUMA node1 CPU(s):   16-31,48-63

Memory:
Hynix  503Gi

Interface:
Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
Speed: 1000Mb/s

b. Host software info:

OS: ctyunos release 2
Kernel: 4.19.90-2102.2.0.0066.ctl2.x86_64
Libvirt baseline version:  libvirt-6.9.0
Qemu baseline version: qemu-5.0

c. vm scale
CPU: 4
Memory: 4G
-

All the supplementary test data shown below are based on the
above test environment.

In version 1, we posted test data from UnixBench as follows:

$ taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item}

host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s
   |-+++---|
   | UnixBench test item | Normal | Dirtylimit | Auto-converge |
   |-+++---|
   | dhry2reg| 32800  | 32786  | 25292 |
   | whetstone-double| 10326  | 10315  | 9847  |
   | pipe| 15442  | 15271  | 14506 |
   | context1| 7260   | 6235   | 4514  |
   | spawn   | 3663   | 3317   | 3249  |
   | syscall | 4669   | 4667   | 3841  |
   |-+++---|

In version 2, we post supplementary test data that does not use
taskset and makes the scenario more general, as follows:

$ ./Run

per-vcpu data:
   |-+++---|
   | UnixBench test item | Normal | Dirtylimit | Auto-converge |
   |-+++---|
   | dhry2reg| 2991   | 2902   | 1722  |
   | whetstone-double| 1018   | 1006   | 627   |
   | Execl Throughput| 955   

Re: [RFC PATCH] RISC-V: Save mmu_idx using FIELD_DP32 not OR

2022-12-08 Thread LIU Zhiwei



On 2022/12/8 23:11, Christoph Muellner wrote:

From: Christoph Müllner 

Setting flags using OR might work, but is not optimal
for a couple of reasons:
* No way to grep for stores to the field MEM_IDX.
* The return value of cpu_mmu_index() is not masked
   (not a real problem as long as cpu_mmu_index() returns only valid values).
* If the offset of MEM_IDX were moved to non-0, then this code
   would not work anymore.

Let's use the FIELD_DP32() macro instead of the OR, which is already
used for most other flags.

Signed-off-by: Christoph Müllner 
---
  target/riscv/cpu_helper.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 278d163803..d68b6b351d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -80,7 +80,8 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong 
*pc,
  flags |= TB_FLAGS_MSTATUS_FS;
  flags |= TB_FLAGS_MSTATUS_VS;
  #else
-flags |= cpu_mmu_index(env, 0);
+flags = FIELD_DP32(flags, TB_FLAGS, MEM_IDX, cpu_mmu_index(env, 0));
+
  if (riscv_cpu_fp_enabled(env)) {
  flags |= env->mstatus & MSTATUS_FS;
  }


Maybe we should rename cpu_mmu_index to cpu_mem_idx and
TB_FLAGS_PRIV_MMU_MASK to TB_FLAGS_PRIV_MEM_MASK.


We can also remove TB_FLAGS_PRIV_MMU_MASK, as the position of MEM_IDX
in tb_flags may change in the future.



Otherwise, this patch looks good to me,

Reviewed-by: LIU Zhiwei 




Re: [RFC v4 3/3] hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)

2022-12-08 Thread Fan Ni
On Mon, Nov 28, 2022 at 10:01:57AM -0500, Gregory Price wrote:

> From: Gregory Price 
> 
> This commit enables each CXL Type-3 device to contain one volatile
> memory region and one persistent region.
> 
> Two new properties have been added to cxl-type3 device initialization:
> [volatile-memdev] and [persistent-memdev]
> 
> The existing [memdev] property has been deprecated and will default the
> memory region to a persistent memory region (although a user may assign
> the region to a ram or file backed region). It cannot be used in
> combination with the new [persistent-memdev] property.
> 
> Partitioning volatile memory from persistent memory is not yet supported.
> 
> Volatile memory is mapped at DPA(0x0), while Persistent memory is mapped
> at DPA(vmem->size), per CXL Spec 8.2.9.8.2.0 - Get Partition Info.
> 
> Signed-off-by: Gregory Price 
> Signed-off-by: Jonathan Cameron 
> ---
>  docs/system/devices/cxl.rst |  49 --
>  hw/cxl/cxl-mailbox-utils.c  |  22 +--
>  hw/mem/cxl_type3.c  | 292 +++-
>  include/hw/cxl/cxl_device.h |  11 +-
>  tests/qtest/cxl-test.c  |  78 --
>  5 files changed, 347 insertions(+), 105 deletions(-)
> 
> diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst
> index f25783a4ec..45639a676a 100644
> --- a/docs/system/devices/cxl.rst
> +++ b/docs/system/devices/cxl.rst
> @@ -300,7 +300,7 @@ Example topology involving a switch::
>  
>  Example command lines
>  -
> -A very simple setup with just one directly attached CXL Type 3 device::
> +A very simple setup with just one directly attached CXL Type 3 Persistent 
> Memory device::
>  
>qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 
> -cpu max \
>...
> @@ -308,7 +308,28 @@ A very simple setup with just one directly attached CXL 
> Type 3 device::
>-object 
> memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=256M \
>-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
>-device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
> -  -device 
> cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
> +  -device 
> cxl-type3,bus=root_port13,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0
>  \
> +  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
> +
> +A very simple setup with just one directly attached CXL Type 3 Volatile 
> Memory device::
> +
> +  qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 
> -cpu max \
> +  ...
> +  -object memory-backend-ram,id=vmem0,share=on,size=256M \
> +  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> +  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
> +  -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
> +  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
> +
> +The same volatile setup may optionally include an LSA region::
> +
> +  qemu-system-aarch64 -M virt,gic-version=3,cxl=on -m 4g,maxmem=8G,slots=8 
> -cpu max \
> +  ...
> +  -object memory-backend-ram,id=vmem0,share=on,size=256M \
> +  -object 
> memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa.raw,size=256M \
> +  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
> +  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
> +  -device 
> cxl-type3,bus=root_port13,volatile-memdev=vmem0,lsa=cxl-lsa0,id=cxl-vmem0 \
>-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G
>  
>  A setup suitable for 4 way interleave. Only one fixed window provided, to 
> enable 2 way
> @@ -328,13 +349,13 @@ the CXL Type3 device directly attached (no switches).::
>-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
>-device pxb-cxl,bus_nr=222,bus=pcie.0,id=cxl.2 \
>-device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
> -  -device 
> cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
> +  -device 
> cxl-type3,bus=root_port13,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0
>  \
>-device cxl-rp,port=1,bus=cxl.1,id=root_port14,chassis=0,slot=3 \
> -  -device 
> cxl-type3,bus=root_port14,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem1 \
> +  -device 
> cxl-type3,bus=root_port14,persistent-memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem1
>  \
>-device cxl-rp,port=0,bus=cxl.2,id=root_port15,chassis=0,slot=5 \
> -  -device 
> cxl-type3,bus=root_port15,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem2 \
> +  -device 
> cxl-type3,bus=root_port15,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem2
>  \
>-device cxl-rp,port=1,bus=cxl.2,id=root_port16,chassis=0,slot=6 \
> -  -device 
> cxl-type3,bus=root_port16,memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3 \
> +  -device 
> cxl-type3,bus=root_port16,persistent-memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3
>  \
>-M 
> cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.targets.1=cxl.2,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k
>  
>  An example of 4 devices below a switch suitable for 1, 2 or 4 way 
> interleave::
> @@ -354,15 +375,23 @@ An example 

Re: [PATCH] blockdev: add 'media=cdrom' argument to support usb cdrom emulated as cdrom

2022-12-08 Thread Zhipeng Lu

Thanks.

 -device usb-bot,id=bot0
 -device scsi-{cd,hd},bus=bot0.0,drive=drive0

QEMU implements virtio-scsi to emulate a SCSI controller, but if the 
virtual machine (for example, a Windows guest OS) doesn't have the 
virtio-scsi driver installed, it doesn't work.
I need this functionality: emulating a CD-ROM in the guest, with hotplug 
and unplug support, without depending on the virtio driver.

Do you have a better idea?

On 2022/12/7 16:39, Paolo Bonzini wrote:

It should be like this:

-device usb-bot,id=bot0
-device scsi-{cd,hd},bus=bot0.0,drive=drive0
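For context, a complete invocation built around those two options might look like the sketch below; the machine type, ISO path, node names, and IDs are made up for illustration.

```shell
qemu-system-x86_64 -M q35 -m 2G \
  -blockdev driver=file,filename=/tmp/test.iso,node-name=file0 \
  -blockdev driver=raw,file=file0,node-name=drive0 \
  -device qemu-xhci,id=xhci \
  -device usb-bot,id=bot0,bus=xhci.0 \
  -device scsi-cd,bus=bot0.0,drive=drive0
```

The point of this shape is that usb-bot presents the disk to the guest as a plain USB mass-storage (BOT) device, so the guest needs only its stock USB storage stack and no virtio driver.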

Libvirt has the code to generate the options for SCSI controllers, but 
usb-bot only allows one disk attached to it so it's easier to make it a 
 element.


Paolo

On Sat, Dec 3, 2022 at 13:52, Zhipeng Lu wrote:


Could you give the detail qemu cmdline about usb-bot?

On 2022/12/2 17:40, Paolo Bonzini wrote:
 > On 12/2/22 03:26, Zhipeng Lu wrote:
 >> NAME  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
 >> sda 8:0    0  100M  1 disk
 >> vda   252:0    0   10G  0 disk
 >> ├─vda1    252:1    0    1G  0 part /boot
 >> └─vda2    252:2    0    9G  0 part
 >>    ├─rhel-root 253:0    0    8G  0 lvm  /
 >>    └─rhel-swap 253:1    0    1G  0 lvm  [SWAP]
 >> lshw -short|grep cdrom -i
 >> No cdrom.
 >>
 >> My patch is to solve this problem, usb cdrom emulated as cdrom.
 >
 > This is a libvirt bug, it should use usb-bot instead of usb-storage
 > together with -blockdev.  Then it can add a scsi-cd device below
usb-bot.
 >
 > Paolo
 >
 >>
 >>
 >>> On 2022/12/1 23:35, Markus Armbruster wrote:
 >>> luzhipeng <luzhip...@cestc.cn> writes:
 >>>
  From: zhipeng Lu <luzhip...@cestc.cn>
 
  The drive interface supports media=cdrom so that the usb cdrom
  can be emulated as cdrom in qemu, but libvirt deprecated the drive
  interface, so media=cdrom is added to the blockdev interface to
  support usb cdrom emulated as cdrom
 
  Signed-off-by: zhipeng Lu <luzhip...@cestc.cn>
 >>>
 >>> What problem are you trying to solve?








[PATCH v4 12/27] tcg/s390x: Distinguish RRF-a and RRF-c formats

2022-12-08 Thread Richard Henderson
One has 3 register arguments; the other has 2 plus an m3 field.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 57 +-
 1 file changed, 32 insertions(+), 25 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 6cf07152a5..d38a602dd9 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -172,18 +172,19 @@ typedef enum S390Opcode {
 RRE_SLBGR   = 0xb989,
 RRE_XGR = 0xb982,
 
-RRF_LOCR= 0xb9f2,
-RRF_LOCGR   = 0xb9e2,
-RRF_NRK = 0xb9f4,
-RRF_NGRK= 0xb9e4,
-RRF_ORK = 0xb9f6,
-RRF_OGRK= 0xb9e6,
-RRF_SRK = 0xb9f9,
-RRF_SGRK= 0xb9e9,
-RRF_SLRK= 0xb9fb,
-RRF_SLGRK   = 0xb9eb,
-RRF_XRK = 0xb9f7,
-RRF_XGRK= 0xb9e7,
+RRFa_NRK= 0xb9f4,
+RRFa_NGRK   = 0xb9e4,
+RRFa_ORK= 0xb9f6,
+RRFa_OGRK   = 0xb9e6,
+RRFa_SRK= 0xb9f9,
+RRFa_SGRK   = 0xb9e9,
+RRFa_SLRK   = 0xb9fb,
+RRFa_SLGRK  = 0xb9eb,
+RRFa_XRK= 0xb9f7,
+RRFa_XGRK   = 0xb9e7,
+
+RRFc_LOCR   = 0xb9f2,
+RRFc_LOCGR  = 0xb9e2,
 
 RR_AR   = 0x1a,
 RR_ALR  = 0x1e,
@@ -538,8 +539,14 @@ static void tcg_out_insn_RRE(TCGContext *s, S390Opcode op,
 tcg_out32(s, (op << 16) | (r1 << 4) | r2);
 }
 
-static void tcg_out_insn_RRF(TCGContext *s, S390Opcode op,
- TCGReg r1, TCGReg r2, int m3)
+static void tcg_out_insn_RRFa(TCGContext *s, S390Opcode op,
+  TCGReg r1, TCGReg r2, TCGReg r3)
+{
+tcg_out32(s, (op << 16) | (r3 << 12) | (r1 << 4) | r2);
+}
+
+static void tcg_out_insn_RRFc(TCGContext *s, S390Opcode op,
+  TCGReg r1, TCGReg r2, int m3)
 {
 tcg_out32(s, (op << 16) | (m3 << 12) | (r1 << 4) | r2);
 }
@@ -1324,7 +1331,7 @@ static void tgen_setcond(TCGContext *s, TCGType type, 
TCGCond cond,
 /* Emit: d = 0, t = 1, d = (cc ? t : d).  */
 tcg_out_movi(s, TCG_TYPE_I64, dest, 0);
 tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, 1);
-tcg_out_insn(s, RRF, LOCGR, dest, TCG_TMP0, cc);
+tcg_out_insn(s, RRFc, LOCGR, dest, TCG_TMP0, cc);
 }
 
 static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
@@ -1335,7 +1342,7 @@ static void tgen_movcond(TCGContext *s, TCGType type, 
TCGCond c, TCGReg dest,
 if (v3const) {
 tcg_out_insn(s, RIE, LOCGHI, dest, v3, cc);
 } else {
-tcg_out_insn(s, RRF, LOCGR, dest, v3, cc);
+tcg_out_insn(s, RRFc, LOCGR, dest, v3, cc);
 }
 }
 
@@ -1356,7 +1363,7 @@ static void tgen_clz(TCGContext *s, TCGReg dest, TCGReg 
a1,
 tcg_out_mov(s, TCG_TYPE_I64, dest, a2);
 }
 /* Emit: if (one bit found) dest = r0.  */
-tcg_out_insn(s, RRF, LOCGR, dest, TCG_REG_R0, 2);
+tcg_out_insn(s, RRFc, LOCGR, dest, TCG_REG_R0, 2);
 }
 }
 
@@ -1960,7 +1967,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 } else if (a0 == a1) {
 tcg_out_insn(s, RR, SR, a0, a2);
 } else {
-tcg_out_insn(s, RRF, SRK, a0, a1, a2);
+tcg_out_insn(s, RRFa, SRK, a0, a1, a2);
 }
 break;
 
@@ -1972,7 +1979,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 } else if (a0 == a1) {
 tcg_out_insn(s, RR, NR, a0, a2);
 } else {
-tcg_out_insn(s, RRF, NRK, a0, a1, a2);
+tcg_out_insn(s, RRFa, NRK, a0, a1, a2);
 }
 break;
 case INDEX_op_or_i32:
@@ -1983,7 +1990,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 } else if (a0 == a1) {
 tcg_out_insn(s, RR, OR, a0, a2);
 } else {
-tcg_out_insn(s, RRF, ORK, a0, a1, a2);
+tcg_out_insn(s, RRFa, ORK, a0, a1, a2);
 }
 break;
 case INDEX_op_xor_i32:
@@ -1994,7 +2001,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 } else if (a0 == a1) {
 tcg_out_insn(s, RR, XR, args[0], args[2]);
 } else {
-tcg_out_insn(s, RRF, XRK, a0, a1, a2);
+tcg_out_insn(s, RRFa, XRK, a0, a1, a2);
 }
 break;
 
@@ -2220,7 +2227,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 a2 = -a2;
 goto do_addi_64;
 } else {
-tcg_out_insn(s, RRF, SGRK, a0, a1, a2);
+tcg_out_insn(s, RRFa, SGRK, a0, a1, a2);
 }
 break;
 
@@ -2230,7 +2237,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
 tgen_andi(s, TCG_TYPE_I64, args[0], args[2]);
 } else {
-tcg_out_insn(s, RRF, NGRK, a0, a1, a2);
+tcg_out_insn(s, RRFa, NGRK, a0, a1, a2);
 }
 break;
 case INDEX_op_or_i64:
@@ -2239,7 +2246,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
   

[PATCH v4 13/27] tcg/s390x: Distinguish RIE formats

2022-12-08 Thread Richard Henderson
There are multiple variations, with different fields.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 47 +-
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index d38a602dd9..a81a82c70b 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -128,16 +128,19 @@ typedef enum S390Opcode {
 RI_OILL = 0xa50b,
 RI_TMLL = 0xa701,
 
-RIE_CGIJ= 0xec7c,
-RIE_CGRJ= 0xec64,
-RIE_CIJ = 0xec7e,
-RIE_CLGRJ   = 0xec65,
-RIE_CLIJ= 0xec7f,
-RIE_CLGIJ   = 0xec7d,
-RIE_CLRJ= 0xec77,
-RIE_CRJ = 0xec76,
-RIE_LOCGHI  = 0xec46,
-RIE_RISBG   = 0xec55,
+RIEb_CGRJ= 0xec64,
+RIEb_CLGRJ   = 0xec65,
+RIEb_CLRJ= 0xec77,
+RIEb_CRJ = 0xec76,
+
+RIEc_CGIJ= 0xec7c,
+RIEc_CIJ = 0xec7e,
+RIEc_CLGIJ   = 0xec7d,
+RIEc_CLIJ= 0xec7f,
+
+RIEf_RISBG   = 0xec55,
+
+RIEg_LOCGHI  = 0xec46,
 
 RRE_AGR = 0xb908,
 RRE_ALGR= 0xb90a,
@@ -556,7 +559,7 @@ static void tcg_out_insn_RI(TCGContext *s, S390Opcode op, 
TCGReg r1, int i2)
 tcg_out32(s, (op << 16) | (r1 << 20) | (i2 & 0x));
 }
 
-static void tcg_out_insn_RIE(TCGContext *s, S390Opcode op, TCGReg r1,
+static void tcg_out_insn_RIEg(TCGContext *s, S390Opcode op, TCGReg r1,
  int i2, int m3)
 {
 tcg_out16(s, (op & 0xff00) | (r1 << 4) | m3);
@@ -985,9 +988,9 @@ static inline void tcg_out_risbg(TCGContext *s, TCGReg 
dest, TCGReg src,
  int msb, int lsb, int ofs, int z)
 {
 /* Format RIE-f */
-tcg_out16(s, (RIE_RISBG & 0xff00) | (dest << 4) | src);
+tcg_out16(s, (RIEf_RISBG & 0xff00) | (dest << 4) | src);
 tcg_out16(s, (msb << 8) | (z << 7) | lsb);
-tcg_out16(s, (ofs << 8) | (RIE_RISBG & 0xff));
+tcg_out16(s, (ofs << 8) | (RIEf_RISBG & 0xff));
 }
 
 static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
@@ -1266,7 +1269,7 @@ static void tgen_setcond(TCGContext *s, TCGType type, 
TCGCond cond,
 /* Emit: d = 0, d = (cc ? 1 : d).  */
 cc = tgen_cmp(s, type, cond, c1, c2, c2const, false);
 tcg_out_movi(s, TCG_TYPE_I64, dest, 0);
-tcg_out_insn(s, RIE, LOCGHI, dest, 1, cc);
+tcg_out_insn(s, RIEg, LOCGHI, dest, 1, cc);
 return;
 }
 
@@ -1340,7 +1343,7 @@ static void tgen_movcond(TCGContext *s, TCGType type, 
TCGCond c, TCGReg dest,
 {
 int cc = tgen_cmp(s, type, c, c1, c2, c2const, false);
 if (v3const) {
-tcg_out_insn(s, RIE, LOCGHI, dest, v3, cc);
+tcg_out_insn(s, RIEg, LOCGHI, dest, v3, cc);
 } else {
 tcg_out_insn(s, RRFc, LOCGR, dest, v3, cc);
 }
@@ -1409,6 +1412,7 @@ static void tgen_compare_branch(TCGContext *s, S390Opcode 
opc, int cc,
 TCGReg r1, TCGReg r2, TCGLabel *l)
 {
 tcg_out_reloc(s, s->code_ptr + 1, R_390_PC16DBL, l, 2);
+/* Format RIE-b */
 tcg_out16(s, (opc & 0xff00) | (r1 << 4) | r2);
 tcg_out16(s, 0);
 tcg_out16(s, cc << 12 | (opc & 0xff));
@@ -1418,6 +1422,7 @@ static void tgen_compare_imm_branch(TCGContext *s, 
S390Opcode opc, int cc,
 TCGReg r1, int i2, TCGLabel *l)
 {
 tcg_out_reloc(s, s->code_ptr + 1, R_390_PC16DBL, l, 2);
+/* Format RIE-c */
 tcg_out16(s, (opc & 0xff00) | (r1 << 4) | cc);
 tcg_out16(s, 0);
 tcg_out16(s, (i2 << 8) | (opc & 0xff));
@@ -1435,8 +1440,8 @@ static void tgen_brcond(TCGContext *s, TCGType type, 
TCGCond c,
 
 if (!c2const) {
 opc = (type == TCG_TYPE_I32
-   ? (is_unsigned ? RIE_CLRJ : RIE_CRJ)
-   : (is_unsigned ? RIE_CLGRJ : RIE_CGRJ));
+   ? (is_unsigned ? RIEb_CLRJ : RIEb_CRJ)
+   : (is_unsigned ? RIEb_CLGRJ : RIEb_CGRJ));
 tgen_compare_branch(s, opc, cc, r1, c2, l);
 return;
 }
@@ -1449,18 +1454,18 @@ static void tgen_brcond(TCGContext *s, TCGType type, 
TCGCond c,
  */
 if (type == TCG_TYPE_I32) {
 if (is_unsigned) {
-opc = RIE_CLIJ;
+opc = RIEc_CLIJ;
 in_range = (uint32_t)c2 == (uint8_t)c2;
 } else {
-opc = RIE_CIJ;
+opc = RIEc_CIJ;
 in_range = (int32_t)c2 == (int8_t)c2;
 }
 } else {
 if (is_unsigned) {
-opc = RIE_CLGIJ;
+opc = RIEc_CLGIJ;
 in_range = (uint64_t)c2 == (uint8_t)c2;
 } else {
-opc = RIE_CGIJ;
+opc = RIEc_CGIJ;
 in_range = (int64_t)c2 == (int8_t)c2;
 }
 }
-- 
2.34.1




[PATCH v4 27/27] tcg/s390x: Avoid the constant pool in tcg_out_movi

2022-12-08 Thread Richard Henderson
Load constants in no more than two insns, which turns
out to be faster than using the constant pool.

Suggested-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index b72c43e4aa..2b38fd991d 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -877,6 +877,9 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg 
dst, TCGReg src)
 static const S390Opcode li_insns[4] = {
 RI_LLILL, RI_LLILH, RI_LLIHL, RI_LLIHH
 };
+static const S390Opcode oi_insns[4] = {
+RI_OILL, RI_OILH, RI_OIHL, RI_OIHH
+};
 static const S390Opcode lif_insns[2] = {
 RIL_LLILF, RIL_LLIHF,
 };
@@ -928,9 +931,20 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 return;
 }
 
-/* Otherwise, stuff it in the constant pool.  */
-tcg_out_insn(s, RIL, LGRL, ret, 0);
-new_pool_label(s, sval, R_390_PC32DBL, s->code_ptr - 2, 2);
+/* Otherwise, load it by parts. */
+i = is_const_p16((uint32_t)uval);
+if (i >= 0) {
+tcg_out_insn_RI(s, li_insns[i], ret, uval >> (i * 16));
+} else {
+tcg_out_insn(s, RIL, LLILF, ret, uval);
+}
+uval >>= 32;
+i = is_const_p16(uval);
+if (i >= 0) {
+tcg_out_insn_RI(s, oi_insns[i + 2], ret, uval >> (i * 16));
+} else {
+tcg_out_insn(s, RIL, OIHF, ret, uval);
+}
 }
 
 /* Emit a load/store type instruction.  Inputs are:
@@ -1160,9 +1174,6 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg 
dest, uint64_t val)
 
 static void tgen_ori(TCGContext *s, TCGReg dest, uint64_t val)
 {
-static const S390Opcode oi_insns[4] = {
-RI_OILL, RI_OILH, RI_OIHL, RI_OIHH
-};
 static const S390Opcode oif_insns[2] = {
 RIL_OILF, RIL_OIHF
 };
-- 
2.34.1




[PATCH v4 03/27] tcg/s390x: Always set TCG_TARGET_HAS_direct_jump

2022-12-08 Thread Richard Henderson
Since USE_REG_TB is removed, there is no need to load the
target TB address into a register.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  2 +-
 tcg/s390x/tcg-target.c.inc | 48 +++---
 2 files changed, 10 insertions(+), 40 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 22d70d431b..645f522058 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -103,7 +103,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_mulsh_i32  0
 #define TCG_TARGET_HAS_extrl_i64_i32  0
 #define TCG_TARGET_HAS_extrh_i64_i32  0
-#define TCG_TARGET_HAS_direct_jumpHAVE_FACILITY(GEN_INST_EXT)
+#define TCG_TARGET_HAS_direct_jump1
 #define TCG_TARGET_HAS_qemu_st8_i32   0
 
 #define TCG_TARGET_HAS_div2_i64   1
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index ba4bb6a629..2cdd0d7a92 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -996,28 +996,6 @@ static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
 return false;
 }
 
-/* load data from an absolute host address */
-static void tcg_out_ld_abs(TCGContext *s, TCGType type,
-   TCGReg dest, const void *abs)
-{
-intptr_t addr = (intptr_t)abs;
-
-if (HAVE_FACILITY(GEN_INST_EXT) && !(addr & 1)) {
-ptrdiff_t disp = tcg_pcrel_diff(s, abs) >> 1;
-if (disp == (int32_t)disp) {
-if (type == TCG_TYPE_I32) {
-tcg_out_insn(s, RIL, LRL, dest, disp);
-} else {
-tcg_out_insn(s, RIL, LGRL, dest, disp);
-}
-return;
-}
-}
-
-tcg_out_movi(s, TCG_TYPE_PTR, dest, addr & ~0x);
-tcg_out_ld(s, type, dest, dest, addr & 0x);
-}
-
 static inline void tcg_out_risbg(TCGContext *s, TCGReg dest, TCGReg src,
  int msb, int lsb, int ofs, int z)
 {
@@ -2037,24 +2015,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
 case INDEX_op_goto_tb:
 a0 = args[0];
-if (s->tb_jmp_insn_offset) {
-/*
- * branch displacement must be aligned for atomic patching;
- * see if we need to add extra nop before branch
- */
-if (!QEMU_PTR_IS_ALIGNED(s->code_ptr + 1, 4)) {
-tcg_out16(s, NOP);
-}
-tcg_out16(s, RIL_BRCL | (S390_CC_ALWAYS << 4));
-s->tb_jmp_insn_offset[a0] = tcg_current_code_size(s);
-s->code_ptr += 2;
-} else {
-/* load address stored at s->tb_jmp_target_addr + a0 */
-tcg_out_ld_abs(s, TCG_TYPE_PTR, TCG_TMP0,
-   tcg_splitwx_to_rx(s->tb_jmp_target_addr + a0));
-/* and go there */
-tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_TMP0);
+/*
+ * branch displacement must be aligned for atomic patching;
+ * see if we need to add extra nop before branch
+ */
+if (!QEMU_PTR_IS_ALIGNED(s->code_ptr + 1, 4)) {
+tcg_out16(s, NOP);
 }
+tcg_out16(s, RIL_BRCL | (S390_CC_ALWAYS << 4));
+s->tb_jmp_insn_offset[a0] = tcg_current_code_size(s);
+s->code_ptr += 2;
 set_jmp_reset_offset(s, a0);
 break;
 
-- 
2.34.1




[PATCH v4 11/27] tcg/s390x: Use LARL+AGHI for odd addresses

2022-12-08 Thread Richard Henderson
Add one instead of dropping odd addresses to the constant pool.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index e4403ffabf..6cf07152a5 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -806,6 +806,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
  TCGReg ret, tcg_target_long sval)
 {
 tcg_target_ulong uval;
+ptrdiff_t pc_off;
 
 /* Try all 32-bit insns that can load it in one go.  */
 if (maybe_out_small_movi(s, type, ret, sval)) {
@@ -832,14 +833,14 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 return;
 }
 
-/* Try for PC-relative address load.  For odd addresses,
-   attempt to use an offset from the start of the TB.  */
-if ((sval & 1) == 0) {
-ptrdiff_t off = tcg_pcrel_diff(s, (void *)sval) >> 1;
-if (off == (int32_t)off) {
-tcg_out_insn(s, RIL, LARL, ret, off);
-return;
+/* Try for PC-relative address load.  For odd addresses, add one. */
+pc_off = tcg_pcrel_diff(s, (void *)sval) >> 1;
+if (pc_off == (int32_t)pc_off) {
+tcg_out_insn(s, RIL, LARL, ret, pc_off);
+if (sval & 1) {
+tcg_out_insn(s, RI, AGHI, ret, 1);
 }
+return;
 }
 
 /* Otherwise, stuff it in the constant pool.  */
-- 
2.34.1




[PATCH v4 14/27] tcg/s390x: Support MIE2 multiply single instructions

2022-12-08 Thread Richard Henderson
The MIE2 facility adds 3-operand versions of multiply.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |  1 +
 tcg/s390x/tcg-target.h |  1 +
 tcg/s390x/tcg-target.c.inc | 34 --
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 00ba727b70..33a82e3286 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -23,6 +23,7 @@ C_O1_I2(r, 0, ri)
 C_O1_I2(r, 0, rI)
 C_O1_I2(r, 0, rJ)
 C_O1_I2(r, r, ri)
+C_O1_I2(r, r, rJ)
 C_O1_I2(r, rZ, r)
 C_O1_I2(v, v, r)
 C_O1_I2(v, v, v)
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index db10a39381..1fb7b8fb1d 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -63,6 +63,7 @@ typedef enum TCGReg {
 /* Facilities that are checked at runtime. */
 
 #define FACILITY_LOAD_ON_COND253
+#define FACILITY_MISC_INSN_EXT2   58
 #define FACILITY_VECTOR   129
 #define FACILITY_VECTOR_ENH1  135
 
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index a81a82c70b..9634126ed1 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -175,6 +175,8 @@ typedef enum S390Opcode {
 RRE_SLBGR   = 0xb989,
 RRE_XGR = 0xb982,
 
+RRFa_MSRKC  = 0xb9fd,
+RRFa_MSGRKC = 0xb9ed,
 RRFa_NRK= 0xb9f4,
 RRFa_NGRK   = 0xb9e4,
 RRFa_ORK= 0xb9f6,
@@ -2015,14 +2017,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 break;
 
 case INDEX_op_mul_i32:
+a0 = args[0], a1 = args[1], a2 = (int32_t)args[2];
 if (const_args[2]) {
-if ((int32_t)args[2] == (int16_t)args[2]) {
-tcg_out_insn(s, RI, MHI, args[0], args[2]);
+tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
+if (a2 == (int16_t)a2) {
+tcg_out_insn(s, RI, MHI, a0, a2);
 } else {
-tcg_out_insn(s, RIL, MSFI, args[0], args[2]);
+tcg_out_insn(s, RIL, MSFI, a0, a2);
 }
+} else if (a0 == a1) {
+tcg_out_insn(s, RRE, MSR, a0, a2);
 } else {
-tcg_out_insn(s, RRE, MSR, args[0], args[2]);
+tcg_out_insn(s, RRFa, MSRKC, a0, a1, a2);
 }
 break;
 
@@ -2272,14 +2278,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 break;
 
 case INDEX_op_mul_i64:
+a0 = args[0], a1 = args[1], a2 = args[2];
 if (const_args[2]) {
-if (args[2] == (int16_t)args[2]) {
-tcg_out_insn(s, RI, MGHI, args[0], args[2]);
+tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
+if (a2 == (int16_t)a2) {
+tcg_out_insn(s, RI, MGHI, a0, a2);
 } else {
-tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
+tcg_out_insn(s, RIL, MSGFI, a0, a2);
 }
+} else if (a0 == a1) {
+tcg_out_insn(s, RRE, MSGR, a0, a2);
 } else {
-tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+tcg_out_insn(s, RRFa, MSGRKC, a0, a1, a2);
 }
 break;
 
@@ -2934,9 +2944,13 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 return C_O1_I2(r, r, ri);
 
 case INDEX_op_mul_i32:
-return C_O1_I2(r, 0, ri);
+return (HAVE_FACILITY(MISC_INSN_EXT2)
+? C_O1_I2(r, r, ri)
+: C_O1_I2(r, 0, ri));
 case INDEX_op_mul_i64:
-return C_O1_I2(r, 0, rJ);
+return (HAVE_FACILITY(MISC_INSN_EXT2)
+? C_O1_I2(r, r, rJ)
+: C_O1_I2(r, 0, rJ));
 
 case INDEX_op_shl_i32:
 case INDEX_op_shr_i32:
-- 
2.34.1




[PATCH v4 15/27] tcg/s390x: Support MIE2 MGRK instruction

2022-12-08 Thread Richard Henderson
The MIE2 facility adds a 3-operand signed 64x64->128 multiply.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h | 1 +
 tcg/s390x/tcg-target.h | 2 +-
 tcg/s390x/tcg-target.c.inc | 8 
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 33a82e3286..b1a89a88ba 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -31,6 +31,7 @@ C_O1_I3(v, v, v, v)
 C_O1_I4(r, r, ri, r, 0)
 C_O1_I4(r, r, ri, rI, 0)
 C_O2_I2(o, m, 0, r)
+C_O2_I2(o, m, r, r)
 C_O2_I3(o, m, 0, 1, r)
 C_O2_I4(r, r, 0, 1, rA, r)
 C_O2_I4(r, r, 0, 1, ri, r)
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 1fb7b8fb1d..03ce11a34a 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -136,7 +136,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_add2_i64   1
 #define TCG_TARGET_HAS_sub2_i64   1
 #define TCG_TARGET_HAS_mulu2_i64  1
-#define TCG_TARGET_HAS_muls2_i64  0
+#define TCG_TARGET_HAS_muls2_i64  HAVE_FACILITY(MISC_INSN_EXT2)
 #define TCG_TARGET_HAS_muluh_i64  0
 #define TCG_TARGET_HAS_mulsh_i64  0
 
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 9634126ed1..871fcb7683 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -175,6 +175,7 @@ typedef enum S390Opcode {
 RRE_SLBGR   = 0xb989,
 RRE_XGR = 0xb982,
 
+RRFa_MGRK   = 0xb9ec,
 RRFa_MSRKC  = 0xb9fd,
 RRFa_MSGRKC = 0xb9ed,
 RRFa_NRK= 0xb9f4,
@@ -2319,6 +2320,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 tcg_debug_assert(args[0] == args[1] + 1);
 tcg_out_insn(s, RRE, MLGR, args[1], args[3]);
 break;
+case INDEX_op_muls2_i64:
+tcg_debug_assert((args[1] & 1) == 0);
+tcg_debug_assert(args[0] == args[1] + 1);
+tcg_out_insn(s, RRFa, MGRK, args[1], args[2], args[3]);
+break;
 
 case INDEX_op_shl_i64:
 op = RSY_SLLG;
@@ -3009,6 +3015,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 
 case INDEX_op_mulu2_i64:
 return C_O2_I2(o, m, 0, r);
+case INDEX_op_muls2_i64:
+return C_O2_I2(o, m, r, r);
 
 case INDEX_op_add2_i32:
 case INDEX_op_sub2_i32:
-- 
2.34.1




[PATCH v4 24/27] tcg/s390x: Implement ctpop operation

2022-12-08 Thread Richard Henderson
There is an older form that produces per-byte results,
and a newer form that produces per-register results.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  4 ++--
 tcg/s390x/tcg-target.c.inc | 36 
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index dabdae1e84..68dcbc6645 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -91,7 +91,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_nor_i32HAVE_FACILITY(MISC_INSN_EXT3)
 #define TCG_TARGET_HAS_clz_i320
 #define TCG_TARGET_HAS_ctz_i320
-#define TCG_TARGET_HAS_ctpop_i32  0
+#define TCG_TARGET_HAS_ctpop_i32  1
 #define TCG_TARGET_HAS_deposit_i321
 #define TCG_TARGET_HAS_extract_i321
 #define TCG_TARGET_HAS_sextract_i32   0
@@ -128,7 +128,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_nor_i64HAVE_FACILITY(MISC_INSN_EXT3)
 #define TCG_TARGET_HAS_clz_i641
 #define TCG_TARGET_HAS_ctz_i640
-#define TCG_TARGET_HAS_ctpop_i64  0
+#define TCG_TARGET_HAS_ctpop_i64  1
 #define TCG_TARGET_HAS_deposit_i641
 #define TCG_TARGET_HAS_extract_i641
 #define TCG_TARGET_HAS_sextract_i64   0
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 8254f9f650..c0434fa2f8 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -206,6 +206,7 @@ typedef enum S390Opcode {
 
 RRFc_LOCR   = 0xb9f2,
 RRFc_LOCGR  = 0xb9e2,
+RRFc_POPCNT = 0xb9e1,
 
 RR_AR   = 0x1a,
 RR_ALR  = 0x1e,
@@ -1435,6 +1436,32 @@ static void tgen_clz(TCGContext *s, TCGReg dest, TCGReg a1,
 tgen_movcond_int(s, TCG_TYPE_I64, dest, a2, a2const, TCG_REG_R0, 8, 2);
 }
 
+static void tgen_ctpop(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
+{
+/* With MIE3, and bit 0 of m4 set, we get the complete result. */
+if (HAVE_FACILITY(MISC_INSN_EXT3)) {
+if (type == TCG_TYPE_I32) {
+tgen_ext32u(s, dest, src);
+src = dest;
+}
+tcg_out_insn(s, RRFc, POPCNT, dest, src, 8);
+return;
+}
+
+/* Without MIE3, each byte gets the count of bits for the byte. */
+tcg_out_insn(s, RRFc, POPCNT, dest, src, 0);
+
+/* Multiply to sum each byte at the top of the word. */
+if (type == TCG_TYPE_I32) {
+tcg_out_insn(s, RIL, MSFI, dest, 0x01010101);
+tcg_out_sh32(s, RS_SRL, dest, TCG_REG_NONE, 24);
+} else {
+tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, 0x0101010101010101ull);
+tcg_out_insn(s, RRE, MSGR, dest, TCG_TMP0);
+tcg_out_sh64(s, RSY_SRLG, dest, dest, TCG_REG_NONE, 56);
+}
+}
+
 static void tgen_deposit(TCGContext *s, TCGReg dest, TCGReg src,
  int ofs, int len, int z)
 {
@@ -2584,6 +2611,13 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 tgen_clz(s, args[0], args[1], args[2], const_args[2]);
 break;
 
+case INDEX_op_ctpop_i32:
+tgen_ctpop(s, TCG_TYPE_I32, args[0], args[1]);
+break;
+case INDEX_op_ctpop_i64:
+tgen_ctpop(s, TCG_TYPE_I64, args[0], args[1]);
+break;
+
 case INDEX_op_mb:
 /* The host memory model is quite strong, we simply need to
serialize the instruction stream.  */
@@ -3146,6 +3180,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_extu_i32_i64:
 case INDEX_op_extract_i32:
 case INDEX_op_extract_i64:
+case INDEX_op_ctpop_i32:
+case INDEX_op_ctpop_i64:
 return C_O1_I1(r, r);
 
 case INDEX_op_qemu_ld_i32:
-- 
2.34.1




[PATCH v4 04/27] tcg/s390x: Remove USE_LONG_BRANCHES

2022-12-08 Thread Richard Henderson
The size of a compiled TB is limited by the uint16_t used by
gen_insn_end_off[] -- there is no need for a 32-bit branch.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 9 -
 1 file changed, 9 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 2cdd0d7a92..dea889ffa1 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -33,11 +33,6 @@
 #include "../tcg-pool.c.inc"
 #include "elf.h"
 
-/* ??? The translation blocks produced by TCG are generally small enough to
-   be entirely reachable with a 16-bit displacement.  Leaving the option for
-   a 32-bit displacement here Just In Case.  */
-#define USE_LONG_BRANCHES 0
-
 #define TCG_CT_CONST_S16   0x100
 #define TCG_CT_CONST_S32   0x200
 #define TCG_CT_CONST_S33   0x400
@@ -1525,10 +1520,6 @@ static void tgen_branch(TCGContext *s, int cc, TCGLabel *l)
 {
 if (l->has_value) {
 tgen_gotoi(s, cc, l->u.value_ptr);
-} else if (USE_LONG_BRANCHES) {
-tcg_out16(s, RIL_BRCL | (cc << 4));
-tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, l, 2);
-s->code_ptr += 2;
 } else {
 tcg_out16(s, RI_BRC | (cc << 4));
 tcg_out_reloc(s, s->code_ptr, R_390_PC16DBL, l, 2);
-- 
2.34.1




[PATCH v4 16/27] tcg/s390x: Issue XILF directly for xor_i32

2022-12-08 Thread Richard Henderson
There is only one instruction that is applicable
to a 32-bit immediate xor.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 871fcb7683..fc304327fc 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -2005,7 +2005,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
 if (const_args[2]) {
 tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
-tgen_xori(s, TCG_TYPE_I32, a0, a2);
+tcg_out_insn(s, RIL, XILF, a0, a2);
 } else if (a0 == a1) {
 tcg_out_insn(s, RR, XR, args[0], args[2]);
 } else {
-- 
2.34.1




[PATCH v4 02/27] tcg/s390x: Remove TCG_REG_TB

2022-12-08 Thread Richard Henderson
This reverts 829e1376d940 ("tcg/s390: Introduce TCG_REG_TB"), and
several follow-up patches.  The primary motivation is to reduce the
less-tested code paths, pre-z10.  Secondarily, this allows the
unconditional use of TCG_TARGET_HAS_direct_jump, which might be more
important for performance than any slight increase in code size.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
v4: Do not simplify tgen_ori, tgen_xori.
---
 tcg/s390x/tcg-target.c.inc | 97 +++---
 1 file changed, 6 insertions(+), 91 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index cb00bb6999..ba4bb6a629 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -65,12 +65,6 @@
 /* A scratch register that may be be used throughout the backend.  */
 #define TCG_TMP0TCG_REG_R1
 
-/* A scratch register that holds a pointer to the beginning of the TB.
-   We don't need this when we have pc-relative loads with the general
-   instructions extension facility.  */
-#define TCG_REG_TB  TCG_REG_R12
-#define USE_REG_TB  (!HAVE_FACILITY(GEN_INST_EXT))
-
 #ifndef CONFIG_SOFTMMU
 #define TCG_GUEST_BASE_REG TCG_REG_R13
 #endif
@@ -813,8 +807,8 @@ static bool maybe_out_small_movi(TCGContext *s, TCGType type,
 }
 
 /* load a register with an immediate value */
-static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
- tcg_target_long sval, bool in_prologue)
+static void tcg_out_movi(TCGContext *s, TCGType type,
+ TCGReg ret, tcg_target_long sval)
 {
 tcg_target_ulong uval;
 
@@ -853,14 +847,6 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 tcg_out_insn(s, RIL, LARL, ret, off);
 return;
 }
-} else if (USE_REG_TB && !in_prologue) {
-ptrdiff_t off = tcg_tbrel_diff(s, (void *)sval);
-if (off == sextract64(off, 0, 20)) {
-/* This is certain to be an address within TB, and therefore
-   OFF will be negative; don't try RX_LA.  */
-tcg_out_insn(s, RXY, LAY, ret, TCG_REG_TB, TCG_REG_NONE, off);
-return;
-}
 }
 
 /* A 32-bit unsigned value can be loaded in 2 insns.  And given
@@ -876,10 +862,6 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 if (HAVE_FACILITY(GEN_INST_EXT)) {
 tcg_out_insn(s, RIL, LGRL, ret, 0);
 new_pool_label(s, sval, R_390_PC32DBL, s->code_ptr - 2, 2);
-} else if (USE_REG_TB && !in_prologue) {
-tcg_out_insn(s, RXY, LG, ret, TCG_REG_TB, TCG_REG_NONE, 0);
-new_pool_label(s, sval, R_390_20, s->code_ptr - 2,
-   tcg_tbrel_diff(s, NULL));
 } else {
 TCGReg base = ret ? ret : TCG_TMP0;
 tcg_out_insn(s, RIL, LARL, base, 0);
@@ -888,12 +870,6 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 }
 }
 
-static void tcg_out_movi(TCGContext *s, TCGType type,
- TCGReg ret, tcg_target_long sval)
-{
-tcg_out_movi_int(s, type, ret, sval, false);
-}
-
 /* Emit a load/store type instruction.  Inputs are:
DATA: The register to be loaded or stored.
BASE+OFS: The effective address.
@@ -1037,13 +1013,6 @@ static void tcg_out_ld_abs(TCGContext *s, TCGType type,
 return;
 }
 }
-if (USE_REG_TB) {
-ptrdiff_t disp = tcg_tbrel_diff(s, abs);
-if (disp == sextract64(disp, 0, 20)) {
-tcg_out_ld(s, type, dest, TCG_REG_TB, disp);
-return;
-}
-}
 
 tcg_out_movi(s, TCG_TYPE_PTR, dest, addr & ~0x);
 tcg_out_ld(s, type, dest, dest, addr & 0x);
@@ -1243,17 +1212,7 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 return;
 }
 
-/* Use the constant pool if USE_REG_TB, but not for small constants.  */
-if (USE_REG_TB) {
-if (!maybe_out_small_movi(s, type, TCG_TMP0, val)) {
-tcg_out_insn(s, RXY, NG, dest, TCG_REG_TB, TCG_REG_NONE, 0);
-new_pool_label(s, val & valid, R_390_20, s->code_ptr - 2,
-   tcg_tbrel_diff(s, NULL));
-return;
-}
-} else {
-tcg_out_movi(s, type, TCG_TMP0, val);
-}
+tcg_out_movi(s, type, TCG_TMP0, val);
 if (type == TCG_TYPE_I32) {
 tcg_out_insn(s, RR, NR, dest, TCG_TMP0);
 } else {
@@ -1297,17 +1256,12 @@ static void tgen_ori(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 }
 }
 
-/* Use the constant pool if USE_REG_TB, but not for small constants.  */
 if (maybe_out_small_movi(s, type, TCG_TMP0, val)) {
 if (type == TCG_TYPE_I32) {
 tcg_out_insn(s, RR, OR, dest, TCG_TMP0);
 } else {
 tcg_out_insn(s, RRE, OGR, dest, TCG_TMP0);
 }
-} else if (USE_REG_TB) {
-tcg_out_insn(s, RXY, OG, dest, TCG_REG_TB, TCG_REG_NONE, 0);
- 

[PATCH v4 10/27] tcg/s390x: Remove DISTINCT_OPERANDS facility check

2022-12-08 Thread Richard Henderson
The distinct-operands facility is bundled into facility 45,
along with load-on-condition, and is already checked at startup.
Remove the a0 == a1 checks for 64-bit sub, and, or, xor, as there
is no space saving from avoiding the distinct-operands insn.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  1 -
 tcg/s390x/tcg-target.c.inc | 16 ++--
 2 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index fc9ae82700..db10a39381 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -62,7 +62,6 @@ typedef enum TCGReg {
 
 /* Facilities that are checked at runtime. */
 
-#define FACILITY_DISTINCT_OPS 45
 #define FACILITY_LOAD_ON_COND253
 #define FACILITY_VECTOR   129
 #define FACILITY_VECTOR_ENH1  135
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index dd58f0cdb5..e4403ffabf 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -2218,8 +2218,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 if (const_args[2]) {
 a2 = -a2;
 goto do_addi_64;
-} else if (a0 == a1) {
-tcg_out_insn(s, RRE, SGR, a0, a2);
 } else {
 tcg_out_insn(s, RRF, SGRK, a0, a1, a2);
 }
@@ -2230,8 +2228,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 if (const_args[2]) {
 tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
 tgen_andi(s, TCG_TYPE_I64, args[0], args[2]);
-} else if (a0 == a1) {
-tcg_out_insn(s, RRE, NGR, args[0], args[2]);
 } else {
 tcg_out_insn(s, RRF, NGRK, a0, a1, a2);
 }
@@ -2241,8 +2237,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 if (const_args[2]) {
 tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
 tgen_ori(s, TCG_TYPE_I64, a0, a2);
-} else if (a0 == a1) {
-tcg_out_insn(s, RRE, OGR, a0, a2);
 } else {
 tcg_out_insn(s, RRF, OGRK, a0, a1, a2);
 }
@@ -2252,8 +2246,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 if (const_args[2]) {
 tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
 tgen_xori(s, TCG_TYPE_I64, a0, a2);
-} else if (a0 == a1) {
-tcg_out_insn(s, RRE, XGR, a0, a2);
 } else {
 tcg_out_insn(s, RRF, XGRK, a0, a1, a2);
 }
@@ -2926,9 +2918,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_or_i64:
 case INDEX_op_xor_i32:
 case INDEX_op_xor_i64:
-return (HAVE_FACILITY(DISTINCT_OPS)
-? C_O1_I2(r, r, ri)
-: C_O1_I2(r, 0, ri));
+return C_O1_I2(r, r, ri);
 
 case INDEX_op_mul_i32:
 return C_O1_I2(r, 0, ri);
@@ -2938,9 +2928,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_shl_i32:
 case INDEX_op_shr_i32:
 case INDEX_op_sar_i32:
-return (HAVE_FACILITY(DISTINCT_OPS)
-? C_O1_I2(r, r, ri)
-: C_O1_I2(r, 0, ri));
+return C_O1_I2(r, r, ri);
 
 case INDEX_op_brcond_i32:
 case INDEX_op_brcond_i64:
-- 
2.34.1




[PATCH v4 18/27] tcg/s390x: Tighten constraints for and_i64

2022-12-08 Thread Richard Henderson
Let the register allocator handle such immediates by matching
only what one insn can achieve.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |   1 +
 tcg/s390x/tcg-target-con-str.h |   2 +
 tcg/s390x/tcg-target.c.inc | 114 +
 3 files changed, 61 insertions(+), 56 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 34ae4c7743..0c4d0da8f5 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -25,6 +25,7 @@ C_O1_I2(r, 0, rJ)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rJ)
 C_O1_I2(r, r, rK)
+C_O1_I2(r, r, rNKR)
 C_O1_I2(r, rZ, r)
 C_O1_I2(v, v, r)
 C_O1_I2(v, v, v)
diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h
index 7b910d6d11..6fa64a1ed6 100644
--- a/tcg/s390x/tcg-target-con-str.h
+++ b/tcg/s390x/tcg-target-con-str.h
@@ -21,4 +21,6 @@ CONST('A', TCG_CT_CONST_S33)
 CONST('I', TCG_CT_CONST_S16)
 CONST('J', TCG_CT_CONST_S32)
 CONST('K', TCG_CT_CONST_P32)
+CONST('N', TCG_CT_CONST_INV)
+CONST('R', TCG_CT_CONST_INVRISBG)
 CONST('Z', TCG_CT_CONST_ZERO)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 2a7410ba58..21007f94ad 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -33,11 +33,13 @@
 #include "../tcg-pool.c.inc"
 #include "elf.h"
 
-#define TCG_CT_CONST_S16   0x100
-#define TCG_CT_CONST_S32   0x200
-#define TCG_CT_CONST_S33   0x400
-#define TCG_CT_CONST_ZERO  0x800
-#define TCG_CT_CONST_P32   0x1000
+#define TCG_CT_CONST_S16(1 << 8)
+#define TCG_CT_CONST_S32(1 << 9)
+#define TCG_CT_CONST_S33(1 << 10)
+#define TCG_CT_CONST_ZERO   (1 << 11)
+#define TCG_CT_CONST_P32(1 << 12)
+#define TCG_CT_CONST_INV(1 << 13)
+#define TCG_CT_CONST_INVRISBG   (1 << 14)
 
 #define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 16)
 #define ALL_VECTOR_REGS  MAKE_64BIT_MASK(32, 32)
@@ -530,6 +532,38 @@ static int is_const_p32(uint64_t val)
 return -1;
 }
 
+/*
+ * Accept bit patterns like these:
+ *  0011
+ *  1100
+ *  1..10..01..1
+ *  0..01..10..0
+ * Copied from gcc sources.
+ */
+static bool risbg_mask(uint64_t c)
+{
+uint64_t lsb;
+/* We don't change the number of transitions by inverting,
+   so make sure we start with the LSB zero.  */
+if (c & 1) {
+c = ~c;
+}
+/* Reject all zeros or all ones.  */
+if (c == 0) {
+return false;
+}
+/* Find the first transition.  */
+lsb = c & -c;
+/* Invert to look for a second transition.  */
+c = ~c;
+/* Erase the first transition.  */
+c &= -lsb;
+/* Find the second transition, if any.  */
+lsb = c & -c;
+/* Match if all the bits are 1's, or if c is zero.  */
+return c == -lsb;
+}
+
 /* Test if a constant matches the constraint. */
 static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 {
@@ -552,6 +586,9 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 return val == 0;
 }
 
+if (ct & TCG_CT_CONST_INV) {
+val = ~val;
+}
 /*
  * Note that is_const_p16 is a subset of is_const_p32,
  * so we don't need both constraints.
@@ -559,6 +596,9 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 if ((ct & TCG_CT_CONST_P32) && is_const_p32(val) >= 0) {
 return true;
 }
+if ((ct & TCG_CT_CONST_INVRISBG) && risbg_mask(~val)) {
+return true;
+}
 
 return 0;
 }
@@ -1057,36 +1097,6 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
 tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
-/* Accept bit patterns like these:
-0011
-1100
-1..10..01..1
-0..01..10..0
-   Copied from gcc sources.  */
-static inline bool risbg_mask(uint64_t c)
-{
-uint64_t lsb;
-/* We don't change the number of transitions by inverting,
-   so make sure we start with the LSB zero.  */
-if (c & 1) {
-c = ~c;
-}
-/* Reject all zeros or all ones.  */
-if (c == 0) {
-return false;
-}
-/* Find the first transition.  */
-lsb = c & -c;
-/* Invert to look for a second transition.  */
-c = ~c;
-/* Erase the first transition.  */
-c &= -lsb;
-/* Find the second transition, if any.  */
-lsb = c & -c;
-/* Match if all the bits are 1's, or if c is zero.  */
-return c == -lsb;
-}
-
 static void tgen_andi_risbg(TCGContext *s, TCGReg out, TCGReg in, uint64_t val)
 {
 int msb, lsb;
@@ -1126,34 +1136,25 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 return;
 }
 
-/* Try all 32-bit insns that can perform it in one go.  */
-for (i = 0; i < 4; i++) {
-tcg_target_ulong mask = ~(0xull << i * 16);
-if (((val | ~valid) & mask) == mask) {
-tcg_out_insn_RI(s, ni_insns[i], dest, val >> i * 16);
-return;
-}
+i = is_cons

[PATCH v4 26/27] tcg/s390x: Cleanup tcg_out_movi

2022-12-08 Thread Richard Henderson
Merge maybe_out_small_movi, as it no longer has additional users.
Use is_const_p{16,32}.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 52 --
 1 file changed, 16 insertions(+), 36 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 4d113139e5..b72c43e4aa 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -874,14 +874,19 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg dst, TCGReg src)
 return true;
 }
 
-static const S390Opcode lli_insns[4] = {
+static const S390Opcode li_insns[4] = {
 RI_LLILL, RI_LLILH, RI_LLIHL, RI_LLIHH
 };
+static const S390Opcode lif_insns[2] = {
+RIL_LLILF, RIL_LLIHF,
+};
 
-static bool maybe_out_small_movi(TCGContext *s, TCGType type,
- TCGReg ret, tcg_target_long sval)
+/* load a register with an immediate value */
+static void tcg_out_movi(TCGContext *s, TCGType type,
+ TCGReg ret, tcg_target_long sval)
 {
 tcg_target_ulong uval = sval;
+ptrdiff_t pc_off;
 int i;
 
 if (type == TCG_TYPE_I32) {
@@ -892,36 +897,13 @@ static bool maybe_out_small_movi(TCGContext *s, TCGType type,
 /* Try all 32-bit insns that can load it in one go.  */
 if (sval >= -0x8000 && sval < 0x8000) {
 tcg_out_insn(s, RI, LGHI, ret, sval);
-return true;
-}
-
-for (i = 0; i < 4; i++) {
-tcg_target_long mask = 0xull << i * 16;
-if ((uval & mask) == uval) {
-tcg_out_insn_RI(s, lli_insns[i], ret, uval >> i * 16);
-return true;
-}
-}
-
-return false;
-}
-
-/* load a register with an immediate value */
-static void tcg_out_movi(TCGContext *s, TCGType type,
- TCGReg ret, tcg_target_long sval)
-{
-tcg_target_ulong uval;
-ptrdiff_t pc_off;
-
-/* Try all 32-bit insns that can load it in one go.  */
-if (maybe_out_small_movi(s, type, ret, sval)) {
 return;
 }
 
-uval = sval;
-if (type == TCG_TYPE_I32) {
-uval = (uint32_t)sval;
-sval = (int32_t)sval;
+i = is_const_p16(uval);
+if (i >= 0) {
+tcg_out_insn_RI(s, li_insns[i], ret, uval >> (i * 16));
+return;
 }
 
 /* Try all 48-bit insns that can load it in one go.  */
@@ -929,12 +911,10 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 tcg_out_insn(s, RIL, LGFI, ret, sval);
 return;
 }
-if (uval <= 0x) {
-tcg_out_insn(s, RIL, LLILF, ret, uval);
-return;
-}
-if ((uval & 0x) == 0) {
-tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+
+i = is_const_p32(uval);
+if (i >= 0) {
+tcg_out_insn_RIL(s, lif_insns[i], ret, uval >> (i * 32));
 return;
 }
 
-- 
2.34.1




[PATCH v4 20/27] tcg/s390x: Create tgen_cmp2 to simplify movcond

2022-12-08 Thread Richard Henderson
Return both regular and inverted condition codes from tgen_cmp2.
This lets us choose after the fact which comparison we want.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index bab2d679c2..a9e3b4a9b9 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1207,10 +1207,11 @@ static void tgen_xori(TCGContext *s, TCGReg dest, 
uint64_t val)
 }
 }
 
-static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
-TCGArg c2, bool c2const, bool need_carry)
+static int tgen_cmp2(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
+ TCGArg c2, bool c2const, bool need_carry, int *inv_cc)
 {
 bool is_unsigned = is_unsigned_cond(c);
+TCGCond inv_c = tcg_invert_cond(c);
 S390Opcode op;
 
 if (c2const) {
@@ -1221,6 +1222,7 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
 } else {
 tcg_out_insn(s, RRE, LTGR, r1, r1);
 }
+*inv_cc = tcg_cond_to_ltr_cond[inv_c];
 return tcg_cond_to_ltr_cond[c];
 }
 }
@@ -1263,9 +1265,17 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
 }
 
  exit:
+*inv_cc = tcg_cond_to_s390_cond[inv_c];
 return tcg_cond_to_s390_cond[c];
 }
 
+static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
+TCGArg c2, bool c2const, bool need_carry)
+{
+int inv_cc;
+return tgen_cmp2(s, type, c, r1, c2, c2const, need_carry, &inv_cc);
+}
+
 static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
  TCGReg dest, TCGReg c1, TCGArg c2, int c2const)
 {
@@ -1348,7 +1358,10 @@ static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
  TCGReg c1, TCGArg c2, int c2const,
  TCGArg v3, int v3const)
 {
-int cc = tgen_cmp(s, type, c, c1, c2, c2const, false);
+int cc, inv_cc;
+
+cc = tgen_cmp2(s, type, c, c1, c2, c2const, false, &inv_cc);
+
 if (v3const) {
 tcg_out_insn(s, RIEg, LOCGHI, dest, v3, cc);
 } else {
-- 
2.34.1




[PATCH v4 01/27] tcg/s390x: Use register pair allocation for div and mulu2

2022-12-08 Thread Richard Henderson
Previously we hard-coded R2 and R3.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |  4 ++--
 tcg/s390x/tcg-target-con-str.h |  8 +--
 tcg/s390x/tcg-target.c.inc | 43 +-
 3 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 426dd92e51..00ba727b70 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -29,8 +29,8 @@ C_O1_I2(v, v, v)
 C_O1_I3(v, v, v, v)
 C_O1_I4(r, r, ri, r, 0)
 C_O1_I4(r, r, ri, rI, 0)
-C_O2_I2(b, a, 0, r)
-C_O2_I3(b, a, 0, 1, r)
+C_O2_I2(o, m, 0, r)
+C_O2_I3(o, m, 0, 1, r)
 C_O2_I4(r, r, 0, 1, rA, r)
 C_O2_I4(r, r, 0, 1, ri, r)
 C_O2_I4(r, r, 0, 1, r, r)
diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h
index 8bb0358ae5..76446aecae 100644
--- a/tcg/s390x/tcg-target-con-str.h
+++ b/tcg/s390x/tcg-target-con-str.h
@@ -11,13 +11,7 @@
 REGS('r', ALL_GENERAL_REGS)
 REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
 REGS('v', ALL_VECTOR_REGS)
-/*
- * A (single) even/odd pair for division.
- * TODO: Add something to the register allocator to allow
- * this kind of regno+1 pairing to be done more generally.
- */
-REGS('a', 1u << TCG_REG_R2)
-REGS('b', 1u << TCG_REG_R3)
+REGS('o', 0xaaaa) /* odd numbered general regs */
 
 /*
  * Define constraint letters for constants:
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index b9ba7b605e..cb00bb6999 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -2264,10 +2264,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 break;
 
 case INDEX_op_div2_i32:
-tcg_out_insn(s, RR, DR, TCG_REG_R2, args[4]);
+tcg_debug_assert(args[0] == args[2]);
+tcg_debug_assert(args[1] == args[3]);
+tcg_debug_assert((args[1] & 1) == 0);
+tcg_debug_assert(args[0] == args[1] + 1);
+tcg_out_insn(s, RR, DR, args[1], args[4]);
 break;
 case INDEX_op_divu2_i32:
-tcg_out_insn(s, RRE, DLR, TCG_REG_R2, args[4]);
+tcg_debug_assert(args[0] == args[2]);
+tcg_debug_assert(args[1] == args[3]);
+tcg_debug_assert((args[1] & 1) == 0);
+tcg_debug_assert(args[0] == args[1] + 1);
+tcg_out_insn(s, RRE, DLR, args[1], args[4]);
 break;
 
 case INDEX_op_shl_i32:
@@ -2521,17 +2529,30 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 break;
 
 case INDEX_op_div2_i64:
-/* ??? We get an unnecessary sign-extension of the dividend
-   into R3 with this definition, but as we do in fact always
-   produce both quotient and remainder using INDEX_op_div_i64
-   instead requires jumping through even more hoops.  */
-tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
+/*
+ * ??? We get an unnecessary sign-extension of the dividend
+ * into op0 with this definition, but as we do in fact always
+ * produce both quotient and remainder using INDEX_op_div_i64
+ * instead requires jumping through even more hoops.
+ */
+tcg_debug_assert(args[0] == args[2]);
+tcg_debug_assert(args[1] == args[3]);
+tcg_debug_assert((args[1] & 1) == 0);
+tcg_debug_assert(args[0] == args[1] + 1);
+tcg_out_insn(s, RRE, DSGR, args[1], args[4]);
 break;
 case INDEX_op_divu2_i64:
-tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
+tcg_debug_assert(args[0] == args[2]);
+tcg_debug_assert(args[1] == args[3]);
+tcg_debug_assert((args[1] & 1) == 0);
+tcg_debug_assert(args[0] == args[1] + 1);
+tcg_out_insn(s, RRE, DLGR, args[1], args[4]);
 break;
 case INDEX_op_mulu2_i64:
-tcg_out_insn(s, RRE, MLGR, TCG_REG_R2, args[3]);
+tcg_debug_assert(args[0] == args[2]);
+tcg_debug_assert((args[1] & 1) == 0);
+tcg_debug_assert(args[0] == args[1] + 1);
+tcg_out_insn(s, RRE, MLGR, args[1], args[3]);
 break;
 
 case INDEX_op_shl_i64:
@@ -3226,10 +3247,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_div2_i64:
 case INDEX_op_divu2_i32:
 case INDEX_op_divu2_i64:
-return C_O2_I3(b, a, 0, 1, r);
+return C_O2_I3(o, m, 0, 1, r);
 
 case INDEX_op_mulu2_i64:
-return C_O2_I2(b, a, 0, r);
+return C_O2_I2(o, m, 0, r);
 
 case INDEX_op_add2_i32:
 case INDEX_op_sub2_i32:
-- 
2.34.1
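As a quick cross-check of the new 'o' constraint above (an illustrative sketch, not part of the patch): folding the odd-numbered general registers R1, R3, ..., R15 into a register mask, one bit per regno as in the removed `1u << TCG_REG_R2` lines, yields 0xaaaa.

```c
#include <stdint.h>

/* Build the 'o' constraint mask: one bit per odd-numbered
   general register R1, R3, ..., R15. */
static uint16_t odd_gpr_mask(void)
{
    uint16_t mask = 0;
    for (int r = 1; r < 16; r += 2) {
        mask |= (uint16_t)(1u << r);
    }
    return mask;
}
```

The even half of each pair is then supplied by the allocator's new pairing constraint ('m', from the prerequisite series); this sketch only checks the mask arithmetic.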




[PATCH v4 25/27] tcg/s390x: Tighten constraints for 64-bit compare

2022-12-08 Thread Richard Henderson
Give 64-bit comparison second operand a signed 33-bit immediate.
This is the smallest superset of uint32_t and int32_t, as used
by CLGFI and CGFI respectively.  The rest of the 33-bit space
can be loaded into TCG_TMP0.  Drop use of the constant pool.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |  3 +++
 tcg/s390x/tcg-target.c.inc | 27 ++-
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index baf3bc9037..15f1c55103 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -13,6 +13,7 @@ C_O0_I1(r)
 C_O0_I2(L, L)
 C_O0_I2(r, r)
 C_O0_I2(r, ri)
+C_O0_I2(r, rA)
 C_O0_I2(v, r)
 C_O1_I1(r, L)
 C_O1_I1(r, r)
@@ -24,6 +25,7 @@ C_O1_I2(r, 0, rI)
 C_O1_I2(r, 0, rJ)
 C_O1_I2(r, r, r)
 C_O1_I2(r, r, ri)
+C_O1_I2(r, r, rA)
 C_O1_I2(r, r, rI)
 C_O1_I2(r, r, rJ)
 C_O1_I2(r, r, rK)
@@ -35,6 +37,7 @@ C_O1_I2(v, v, r)
 C_O1_I2(v, v, v)
 C_O1_I3(v, v, v, v)
 C_O1_I4(r, r, ri, rI, r)
+C_O1_I4(r, r, rA, rI, r)
 C_O2_I2(o, m, 0, r)
 C_O2_I2(o, m, r, r)
 C_O2_I3(o, m, 0, 1, r)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index c0434fa2f8..4d113139e5 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1249,22 +1249,20 @@ static int tgen_cmp2(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
 tcg_out_insn_RIL(s, op, r1, c2);
 goto exit;
 }
+
+/*
+ * Constraints are for a signed 33-bit operand, which is a
+ * convenient superset of this signed/unsigned test.
+ */
 if (c2 == (is_unsigned ? (TCGArg)(uint32_t)c2 : (TCGArg)(int32_t)c2)) {
 op = (is_unsigned ? RIL_CLGFI : RIL_CGFI);
 tcg_out_insn_RIL(s, op, r1, c2);
 goto exit;
 }
 
-/* Use the constant pool, but not for small constants.  */
-if (maybe_out_small_movi(s, type, TCG_TMP0, c2)) {
-c2 = TCG_TMP0;
-/* fall through to reg-reg */
-} else {
-op = (is_unsigned ? RIL_CLGRL : RIL_CGRL);
-tcg_out_insn_RIL(s, op, r1, 0);
-new_pool_label(s, c2, R_390_PC32DBL, s->code_ptr - 2, 2);
-goto exit;
-}
+/* Load everything else into a register. */
+tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, c2);
+c2 = TCG_TMP0;
 }
 
 if (type == TCG_TYPE_I32) {
@@ -3105,8 +3103,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_rotr_i32:
 case INDEX_op_rotr_i64:
 case INDEX_op_setcond_i32:
-case INDEX_op_setcond_i64:
 return C_O1_I2(r, r, ri);
+case INDEX_op_setcond_i64:
+return C_O1_I2(r, r, rA);
 
 case INDEX_op_clz_i64:
 return C_O1_I2(r, r, rI);
@@ -3154,8 +3153,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 return C_O1_I2(r, r, ri);
 
 case INDEX_op_brcond_i32:
-case INDEX_op_brcond_i64:
 return C_O0_I2(r, ri);
+case INDEX_op_brcond_i64:
+return C_O0_I2(r, rA);
 
 case INDEX_op_bswap16_i32:
 case INDEX_op_bswap16_i64:
@@ -3196,8 +3196,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 return C_O1_I2(r, rZ, r);
 
 case INDEX_op_movcond_i32:
-case INDEX_op_movcond_i64:
 return C_O1_I4(r, r, ri, rI, r);
+case INDEX_op_movcond_i64:
+return C_O1_I4(r, r, rA, rI, r);
 
 case INDEX_op_div2_i32:
 case INDEX_op_div2_i64:
-- 
2.34.1
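The "smallest superset of uint32_t and int32_t" claim can be sanity-checked with a small sketch (illustrative only; in the patch this range is the TCG_CT_CONST_S33 constraint, and values outside the directly encodable ranges go through TCG_TMP0):

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed 33-bit range: [-2^32, 2^32 - 1]. */
static bool fits_s33(int64_t v)
{
    return v >= -(1ll << 32) && v < (1ll << 32);
}

/* Directly encodable by CGFI (signed 32-bit) or CLGFI (unsigned 32-bit). */
static bool fits_cgfi_or_clgfi(int64_t v)
{
    return v == (int32_t)v || v == (int64_t)(uint32_t)v;
}
```

Every CGFI/CLGFI-encodable value fits the signed 33-bit range, while e.g. -2^32 fits the constraint but neither instruction, so it is the part loaded into TCG_TMP0.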




[PATCH v4 07/27] tcg/s390x: Check for general-instruction-extension facility at startup

2022-12-08 Thread Richard Henderson
The general-instruction-extension facility was introduced in z10,
which itself was end-of-life in 2019.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  10 ++--
 tcg/s390x/tcg-target.c.inc | 100 -
 2 files changed, 49 insertions(+), 61 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 126ba1048a..d47e8ba66a 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -57,10 +57,10 @@ typedef enum TCGReg {
 #define FACILITY_ZARCH_ACTIVE 2
 #define FACILITY_LONG_DISP    18
 #define FACILITY_EXT_IMM  21
+#define FACILITY_GEN_INST_EXT 34
 
 /* Facilities that are checked at runtime. */
 
-#define FACILITY_GEN_INST_EXT 34
 #define FACILITY_LOAD_ON_COND 45
 #define FACILITY_FAST_BCR_SER FACILITY_LOAD_ON_COND
 #define FACILITY_DISTINCT_OPS FACILITY_LOAD_ON_COND
@@ -92,8 +92,8 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_clz_i32        0
 #define TCG_TARGET_HAS_ctz_i32        0
 #define TCG_TARGET_HAS_ctpop_i32      0
-#define TCG_TARGET_HAS_deposit_i32    HAVE_FACILITY(GEN_INST_EXT)
-#define TCG_TARGET_HAS_extract_i32    HAVE_FACILITY(GEN_INST_EXT)
+#define TCG_TARGET_HAS_deposit_i32    1
+#define TCG_TARGET_HAS_extract_i32    1
 #define TCG_TARGET_HAS_sextract_i32   0
 #define TCG_TARGET_HAS_extract2_i32   0
 #define TCG_TARGET_HAS_movcond_i32    1
@@ -129,8 +129,8 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_clz_i64        1
 #define TCG_TARGET_HAS_ctz_i64        0
 #define TCG_TARGET_HAS_ctpop_i64      0
-#define TCG_TARGET_HAS_deposit_i64    HAVE_FACILITY(GEN_INST_EXT)
-#define TCG_TARGET_HAS_extract_i64    HAVE_FACILITY(GEN_INST_EXT)
+#define TCG_TARGET_HAS_deposit_i64    1
+#define TCG_TARGET_HAS_extract_i64    1
 #define TCG_TARGET_HAS_sextract_i64   0
 #define TCG_TARGET_HAS_extract2_i64   0
 #define TCG_TARGET_HAS_movcond_i64    1
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 42e161cc7e..f0b581293c 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -843,15 +843,8 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 }
 
 /* Otherwise, stuff it in the constant pool.  */
-if (HAVE_FACILITY(GEN_INST_EXT)) {
-tcg_out_insn(s, RIL, LGRL, ret, 0);
-new_pool_label(s, sval, R_390_PC32DBL, s->code_ptr - 2, 2);
-} else {
-TCGReg base = ret ? ret : TCG_TMP0;
-tcg_out_insn(s, RIL, LARL, base, 0);
-new_pool_label(s, sval, R_390_PC32DBL, s->code_ptr - 2, 2);
-tcg_out_insn(s, RXY, LG, ret, base, TCG_REG_NONE, 0);
-}
+tcg_out_insn(s, RIL, LGRL, ret, 0);
+new_pool_label(s, sval, R_390_PC32DBL, s->code_ptr - 2, 2);
 }
 
 /* Emit a load/store type instruction.  Inputs are:
@@ -1105,7 +1098,7 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 return;
 }
 }
-if (HAVE_FACILITY(GEN_INST_EXT) && risbg_mask(val)) {
+if (risbg_mask(val)) {
 tgen_andi_risbg(s, dest, dest, val);
 return;
 }
@@ -1460,48 +1453,47 @@ static void tgen_brcond(TCGContext *s, TCGType type, TCGCond c,
 TCGReg r1, TCGArg c2, int c2const, TCGLabel *l)
 {
 int cc;
+bool is_unsigned = is_unsigned_cond(c);
+bool in_range;
+S390Opcode opc;
 
-if (HAVE_FACILITY(GEN_INST_EXT)) {
-bool is_unsigned = is_unsigned_cond(c);
-bool in_range;
-S390Opcode opc;
+cc = tcg_cond_to_s390_cond[c];
 
-cc = tcg_cond_to_s390_cond[c];
+if (!c2const) {
+opc = (type == TCG_TYPE_I32
+   ? (is_unsigned ? RIE_CLRJ : RIE_CRJ)
+   : (is_unsigned ? RIE_CLGRJ : RIE_CGRJ));
+tgen_compare_branch(s, opc, cc, r1, c2, l);
+return;
+}
 
-if (!c2const) {
-opc = (type == TCG_TYPE_I32
-   ? (is_unsigned ? RIE_CLRJ : RIE_CRJ)
-   : (is_unsigned ? RIE_CLGRJ : RIE_CGRJ));
-tgen_compare_branch(s, opc, cc, r1, c2, l);
-return;
-}
-
-/* COMPARE IMMEDIATE AND BRANCH RELATIVE has an 8-bit immediate field.
-   If the immediate we've been given does not fit that range, we'll
-   fall back to separate compare and branch instructions using the
-   larger comparison range afforded by COMPARE IMMEDIATE.  */
-if (type == TCG_TYPE_I32) {
-if (is_unsigned) {
-opc = RIE_CLIJ;
-in_range = (uint32_t)c2 == (uint8_t)c2;
-} else {
-opc = RIE_CIJ;
-in_range = (int32_t)c2 == (int8_t)c2;
-}
+/*
+ * COMPARE IMMEDIATE AND BRANCH RELATIVE has an 8-bit immediate field.
+ * If the immediate we've been given does not fit that range, we'll
+ * fall back to separate compare and branch instructions using the
+ * larger comparison range afforded by COMPARE IMMEDIATE.
+ */
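The 8-bit immediate range tests used by tgen_brcond can be modelled in isolation (a sketch, not the patch code itself; the function names are made up):

```c
#include <stdbool.h>
#include <stdint.h>

/* COMPARE IMMEDIATE AND BRANCH RELATIVE (CIJ/CLIJ) carries only an
   8-bit immediate; these mirror the in_range tests for 32-bit compares. */
static bool cij_in_range_i32(int64_t c2)      /* signed compare */
{
    return (int32_t)c2 == (int8_t)c2;
}

static bool clij_in_range_i32(int64_t c2)     /* unsigned compare */
{
    return (uint32_t)c2 == (uint8_t)c2;
}
```

Out-of-range immediates fall back to the separate compare-immediate plus branch sequence, exactly as the comment above describes.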

[PATCH v4 21/27] tcg/s390x: Generalize movcond implementation

2022-12-08 Thread Richard Henderson
Generalize movcond to support pre-computed conditions, and the same
set of arguments at all times.  This will be assumed by a following
patch, which needs to reuse tgen_movcond_int.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |  3 +-
 tcg/s390x/tcg-target.c.inc | 52 ++
 2 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index b194ad7f03..8cf8ed4dff 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -33,8 +33,7 @@ C_O1_I2(r, rZ, r)
 C_O1_I2(v, v, r)
 C_O1_I2(v, v, v)
 C_O1_I3(v, v, v, v)
-C_O1_I4(r, r, ri, r, 0)
-C_O1_I4(r, r, ri, rI, 0)
+C_O1_I4(r, r, ri, rI, r)
 C_O2_I2(o, m, 0, r)
 C_O2_I2(o, m, r, r)
 C_O2_I3(o, m, 0, 1, r)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index a9e3b4a9b9..30c12052f0 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1354,19 +1354,49 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
 tcg_out_insn(s, RRFc, LOCGR, dest, TCG_TMP0, cc);
 }
 
+static void tgen_movcond_int(TCGContext *s, TCGType type, TCGReg dest,
+ TCGArg v3, int v3const, TCGReg v4,
+ int cc, int inv_cc)
+{
+TCGReg src;
+
+if (v3const) {
+if (dest == v4) {
+if (HAVE_FACILITY(LOAD_ON_COND2)) {
+/* Emit: if (cc) dest = v3. */
+tcg_out_insn(s, RIEg, LOCGHI, dest, v3, cc);
+return;
+}
+tcg_out_insn(s, RI, LGHI, TCG_TMP0, v3);
+src = TCG_TMP0;
+} else {
+/* LGR+LOCGHI is larger than LGHI+LOCGR. */
+tcg_out_insn(s, RI, LGHI, dest, v3);
+cc = inv_cc;
+src = v4;
+}
+} else {
+if (dest == v4) {
+src = v3;
+} else {
+tcg_out_mov(s, type, dest, v3);
+cc = inv_cc;
+src = v4;
+}
+}
+
+/* Emit: if (cc) dest = src. */
+tcg_out_insn(s, RRFc, LOCGR, dest, src, cc);
+}
+
 static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
  TCGReg c1, TCGArg c2, int c2const,
- TCGArg v3, int v3const)
+ TCGArg v3, int v3const, TCGReg v4)
 {
 int cc, inv_cc;
 
 cc = tgen_cmp2(s, type, c, c1, c2, c2const, false, &inv_cc);
-
-if (v3const) {
-tcg_out_insn(s, RIEg, LOCGHI, dest, v3, cc);
-} else {
-tcg_out_insn(s, RRFc, LOCGR, dest, v3, cc);
-}
+tgen_movcond_int(s, type, dest, v3, v3const, v4, cc, inv_cc);
 }
 
 static void tgen_clz(TCGContext *s, TCGReg dest, TCGReg a1,
@@ -2225,7 +2255,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 break;
 case INDEX_op_movcond_i32:
 tgen_movcond(s, TCG_TYPE_I32, args[5], args[0], args[1],
- args[2], const_args[2], args[3], const_args[3]);
+ args[2], const_args[2], args[3], const_args[3], args[4]);
 break;
 
 case INDEX_op_qemu_ld_i32:
@@ -2509,7 +2539,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 break;
 case INDEX_op_movcond_i64:
 tgen_movcond(s, TCG_TYPE_I64, args[5], args[0], args[1],
- args[2], const_args[2], args[3], const_args[3]);
+ args[2], const_args[2], args[3], const_args[3], args[4]);
 break;
 
 OP_32_64(deposit):
@@ -3114,9 +3144,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 
 case INDEX_op_movcond_i32:
 case INDEX_op_movcond_i64:
-return (HAVE_FACILITY(LOAD_ON_COND2)
-? C_O1_I4(r, r, ri, rI, 0)
-: C_O1_I4(r, r, ri, r, 0));
+return C_O1_I4(r, r, ri, rI, r);
 
 case INDEX_op_div2_i32:
 case INDEX_op_div2_i64:
-- 
2.34.1
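The fallback path of tgen_movcond_int above (no LOAD_ON_COND2) can be modelled in plain C (an illustrative sketch; `movcond_model` is a made-up name): the destination is first loaded unconditionally, then conditionally overwritten, flipping to the inverted condition when the unconditional load took v3.

```c
#include <stdint.h>

/* dest = cond ? v3 : v4, built from one unconditional move
   (tcg_out_mov / LGHI) plus one conditional move (LOCGR under
   the inverted condition). */
static uint64_t movcond_model(int cond, uint64_t v3, uint64_t v4)
{
    uint64_t dest = v3;   /* unconditional: dest = v3 */
    if (!cond) {          /* conditional move under inv_cc */
        dest = v4;
    }
    return dest;
}
```

This is why tgen_cmp2's inv_cc output exists: the conditional move may have to fire on the *opposite* of the requested condition.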




[PATCH v4 17/27] tcg/s390x: Tighten constraints for or_i64 and xor_i64

2022-12-08 Thread Richard Henderson
Drop support for sequential OR and XOR, as the serial dependency is
slower than loading the constant first.  Let the register allocator
handle such immediates by matching only what one insn can achieve.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |   1 +
 tcg/s390x/tcg-target-con-str.h |   1 +
 tcg/s390x/tcg-target.c.inc | 114 -
 3 files changed, 56 insertions(+), 60 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index b1a89a88ba..34ae4c7743 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -24,6 +24,7 @@ C_O1_I2(r, 0, rI)
 C_O1_I2(r, 0, rJ)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rJ)
+C_O1_I2(r, r, rK)
 C_O1_I2(r, rZ, r)
 C_O1_I2(v, v, r)
 C_O1_I2(v, v, v)
diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h
index 76446aecae..7b910d6d11 100644
--- a/tcg/s390x/tcg-target-con-str.h
+++ b/tcg/s390x/tcg-target-con-str.h
@@ -20,4 +20,5 @@ REGS('o', 0x) /* odd numbered general regs */
 CONST('A', TCG_CT_CONST_S33)
 CONST('I', TCG_CT_CONST_S16)
 CONST('J', TCG_CT_CONST_S32)
+CONST('K', TCG_CT_CONST_P32)
 CONST('Z', TCG_CT_CONST_ZERO)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index fc304327fc..2a7410ba58 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -37,6 +37,7 @@
 #define TCG_CT_CONST_S32   0x200
 #define TCG_CT_CONST_S33   0x400
 #define TCG_CT_CONST_ZERO  0x800
+#define TCG_CT_CONST_P32   0x1000
 
 #define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 16)
 #define ALL_VECTOR_REGS  MAKE_64BIT_MASK(32, 32)
@@ -507,6 +508,28 @@ static bool patch_reloc(tcg_insn_unit *src_rw, int type,
 return false;
 }
 
+static int is_const_p16(uint64_t val)
+{
+for (int i = 0; i < 4; ++i) {
+uint64_t mask = 0xffffull << (i * 16);
+if ((val & ~mask) == 0) {
+return i;
+}
+}
+return -1;
+}
+
+static int is_const_p32(uint64_t val)
+{
+if ((val & 0xffffffff00000000ull) == 0) {
+return 0;
+}
+if ((val & 0x00000000ffffffffull) == 0) {
+return 1;
+}
+return -1;
+}
+
 /* Test if a constant matches the constraint. */
 static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 {
@@ -529,6 +552,14 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 return val == 0;
 }
 
+/*
+ * Note that is_const_p16 is a subset of is_const_p32,
+ * so we don't need both constraints.
+ */
+if ((ct & TCG_CT_CONST_P32) && is_const_p32(val) >= 0) {
+return true;
+}
+
 return 0;
 }
 
@@ -1125,7 +1156,7 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 }
 }
 
-static void tgen_ori(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
+static void tgen_ori(TCGContext *s, TCGReg dest, uint64_t val)
 {
 static const S390Opcode oi_insns[4] = {
 RI_OILL, RI_OILH, RI_OIHL, RI_OIHH
@@ -1136,70 +1167,32 @@ static void tgen_ori(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 
 int i;
 
-/* Look for no-op.  */
-if (unlikely(val == 0)) {
+i = is_const_p16(val);
+if (i >= 0) {
+tcg_out_insn_RI(s, oi_insns[i], dest, val >> (i * 16));
 return;
 }
 
-/* Try all 32-bit insns that can perform it in one go.  */
-for (i = 0; i < 4; i++) {
-tcg_target_ulong mask = (0xffffull << i * 16);
-if ((val & mask) != 0 && (val & ~mask) == 0) {
-tcg_out_insn_RI(s, oi_insns[i], dest, val >> i * 16);
-return;
-}
+i = is_const_p32(val);
+if (i >= 0) {
+tcg_out_insn_RIL(s, oif_insns[i], dest, val >> (i * 32));
+return;
 }
 
-/* Try all 48-bit insns that can perform it in one go.  */
-for (i = 0; i < 2; i++) {
-tcg_target_ulong mask = (0xffffffffull << i * 32);
-if ((val & mask) != 0 && (val & ~mask) == 0) {
-tcg_out_insn_RIL(s, oif_insns[i], dest, val >> i * 32);
-return;
-}
-}
-
-if (maybe_out_small_movi(s, type, TCG_TMP0, val)) {
-if (type == TCG_TYPE_I32) {
-tcg_out_insn(s, RR, OR, dest, TCG_TMP0);
-} else {
-tcg_out_insn(s, RRE, OGR, dest, TCG_TMP0);
-}
-} else {
-/* Perform the OR via sequential modifications to the high and
-   low parts.  Do this via recursion to handle 16-bit vs 32-bit
-   masks in each half.  */
-tgen_ori(s, type, dest, val & 0x00000000ffffffffull);
-tgen_ori(s, type, dest, val & 0xffffffff00000000ull);
-}
+g_assert_not_reached();
 }
 
-static void tgen_xori(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
+static void tgen_xori(TCGContext *s, TCGReg dest, uint64_t val)
 {
-/* Try all 48-bit insns that can perform it in one go.  */
-if ((val & 0xffffffff00000000ull) == 0) {
+switch (is_const_p32(val)) {
+case 0:
+tcg_out_insn(s, RIL, XILF, dest, val);
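The two classification helpers introduced by this patch can be exercised standalone (a sketch mirroring the diff, under hypothetical names `p16_slice`/`p32_slice`): each returns which 16-bit (resp. 32-bit) slice of the 64-bit value is populated, or -1 if the value spans more than one slice.

```c
#include <stdint.h>

/* Which 16-bit halfword holds the value, or -1 if it spans several. */
static int p16_slice(uint64_t val)
{
    for (int i = 0; i < 4; ++i) {
        uint64_t mask = 0xffffull << (i * 16);
        if ((val & ~mask) == 0) {
            return i;
        }
    }
    return -1;
}

/* Which 32-bit word holds the value, or -1 if it spans both. */
static int p32_slice(uint64_t val)
{
    if ((val & 0xffffffff00000000ull) == 0) {
        return 0;
    }
    if ((val & 0x00000000ffffffffull) == 0) {
        return 1;
    }
    return -1;
}
```

A -1 from both is exactly the case the new constraint excludes, forcing the allocator to materialize the constant in a register first.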

[PATCH v4 19/27] tcg/s390x: Support MIE3 logical operations

2022-12-08 Thread Richard Henderson
This is andc, orc, nand, nor, eqv.
We can use nor for implementing not.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |   3 +
 tcg/s390x/tcg-target.h |  25 
 tcg/s390x/tcg-target.c.inc | 102 +
 3 files changed, 118 insertions(+), 12 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 0c4d0da8f5..b194ad7f03 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -22,9 +22,12 @@ C_O1_I1(v, vr)
 C_O1_I2(r, 0, ri)
 C_O1_I2(r, 0, rI)
 C_O1_I2(r, 0, rJ)
+C_O1_I2(r, r, r)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rJ)
 C_O1_I2(r, r, rK)
+C_O1_I2(r, r, rKR)
+C_O1_I2(r, r, rNK)
 C_O1_I2(r, r, rNKR)
 C_O1_I2(r, rZ, r)
 C_O1_I2(v, v, r)
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 03ce11a34a..dabdae1e84 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -64,6 +64,7 @@ typedef enum TCGReg {
 
 #define FACILITY_LOAD_ON_COND2    53
 #define FACILITY_MISC_INSN_EXT2   58
+#define FACILITY_MISC_INSN_EXT3   61
 #define FACILITY_VECTOR   129
 #define FACILITY_VECTOR_ENH1  135
 
@@ -81,13 +82,13 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_ext16u_i32     1
 #define TCG_TARGET_HAS_bswap16_i32    1
 #define TCG_TARGET_HAS_bswap32_i32    1
-#define TCG_TARGET_HAS_not_i32        0
+#define TCG_TARGET_HAS_not_i32        HAVE_FACILITY(MISC_INSN_EXT3)
 #define TCG_TARGET_HAS_neg_i32        1
-#define TCG_TARGET_HAS_andc_i32       0
-#define TCG_TARGET_HAS_orc_i32        0
-#define TCG_TARGET_HAS_eqv_i32        0
-#define TCG_TARGET_HAS_nand_i32       0
-#define TCG_TARGET_HAS_nor_i32        0
+#define TCG_TARGET_HAS_andc_i32       HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_orc_i32        HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_eqv_i32        HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_nand_i32       HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_nor_i32        HAVE_FACILITY(MISC_INSN_EXT3)
 #define TCG_TARGET_HAS_clz_i32        0
 #define TCG_TARGET_HAS_ctz_i32        0
 #define TCG_TARGET_HAS_ctpop_i32      0
@@ -118,13 +119,13 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_bswap16_i64    1
 #define TCG_TARGET_HAS_bswap32_i64    1
 #define TCG_TARGET_HAS_bswap64_i64    1
-#define TCG_TARGET_HAS_not_i64        0
+#define TCG_TARGET_HAS_not_i64        HAVE_FACILITY(MISC_INSN_EXT3)
 #define TCG_TARGET_HAS_neg_i64        1
-#define TCG_TARGET_HAS_andc_i64       0
-#define TCG_TARGET_HAS_orc_i64        0
-#define TCG_TARGET_HAS_eqv_i64        0
-#define TCG_TARGET_HAS_nand_i64       0
-#define TCG_TARGET_HAS_nor_i64        0
+#define TCG_TARGET_HAS_andc_i64       HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_orc_i64        HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_eqv_i64        HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_nand_i64       HAVE_FACILITY(MISC_INSN_EXT3)
+#define TCG_TARGET_HAS_nor_i64        HAVE_FACILITY(MISC_INSN_EXT3)
 #define TCG_TARGET_HAS_clz_i64        1
 #define TCG_TARGET_HAS_ctz_i64        0
 #define TCG_TARGET_HAS_ctpop_i64      0
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 21007f94ad..bab2d679c2 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -181,8 +181,18 @@ typedef enum S390Opcode {
 RRFa_MGRK   = 0xb9ec,
 RRFa_MSRKC  = 0xb9fd,
 RRFa_MSGRKC = 0xb9ed,
+RRFa_NCRK   = 0xb9f5,
+RRFa_NCGRK  = 0xb9e5,
+RRFa_NNRK   = 0xb974,
+RRFa_NNGRK  = 0xb964,
+RRFa_NORK   = 0xb976,
+RRFa_NOGRK  = 0xb966,
 RRFa_NRK= 0xb9f4,
 RRFa_NGRK   = 0xb9e4,
+RRFa_NXRK   = 0xb977,
+RRFa_NXGRK  = 0xb967,
+RRFa_OCRK   = 0xb975,
+RRFa_OCGRK  = 0xb965,
 RRFa_ORK= 0xb9f6,
 RRFa_OGRK   = 0xb9e6,
 RRFa_SRK= 0xb9f9,
@@ -2007,9 +2017,46 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 }
 break;
 
+case INDEX_op_andc_i32:
+a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
+if (const_args[2]) {
+tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
+tgen_andi(s, TCG_TYPE_I32, a0, (uint32_t)~a2);
+} else {
+tcg_out_insn(s, RRFa, NCRK, a0, a1, a2);
+}
+break;
+case INDEX_op_orc_i32:
+a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
+if (const_args[2]) {
+tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
+tgen_ori(s, a0, (uint32_t)~a2);
+} else {
+tcg_out_insn(s, RRFa, OCRK, a0, a1, a2);
+}
+break;
+case INDEX_op_eqv_i32:
+a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
+if (const_args[2]) {
+tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
+tcg_out_insn(s, RIL, XILF, a0, ~a2);
+} else {
+tcg_out_insn(s, RRFa, NXRK, a0, a1, a2);
+}
+break;
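The constant-operand path for eqv_i32 above leans on a bitwise identity worth spelling out: XILF with ~a2 computes xnor, because a1 ^ ~a2 == ~(a1 ^ a2). A minimal check (illustrative sketch, not patch code):

```c
#include <stdint.h>

/* eqv (xnor) with a constant: what "XILF a0, ~a2" computes
   after a0 has been loaded with a1. */
static uint32_t eqv32(uint32_t a1, uint32_t a2)
{
    return a1 ^ ~a2;
}
```

The orc constant path plays the same trick with tgen_ori and ~a2, since a1 | ~a2 is exactly orc.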

[PATCH v4 00/27] tcg/s390x: misc patches

2022-12-08 Thread Richard Henderson
Based-on: 20221202053958.223890-1-richard.hender...@linaro.org
("[PATCH for-8.0 v3 00/34] tcg misc patches")

Changes from v3:
  * Require z196 as minimum cpu -- 6 new patches removing checks.
  * Tighten constraints on AND, OR, XOR, CMP, trying to get the register
allocator to hoist things that can't be done in a single insn.
  * Avoid the constant pool for movi.

I believe that I have addressed all of the discussion in v3,
except perhaps for goto_tb concurrent modifications to jumps.
I'm still not quite sure what to do about that.


r~


Richard Henderson (27):
  tcg/s390x: Use register pair allocation for div and mulu2
  tcg/s390x: Remove TCG_REG_TB
  tcg/s390x: Always set TCG_TARGET_HAS_direct_jump
  tcg/s390x: Remove USE_LONG_BRANCHES
  tcg/s390x: Check for long-displacement facility at startup
  tcg/s390x: Check for extended-immediate facility at startup
  tcg/s390x: Check for general-instruction-extension facility at startup
  tcg/s390x: Check for load-on-condition facility at startup
  tcg/s390x: Remove FAST_BCR_SER facility check
  tcg/s390x: Remove DISTINCT_OPERANDS facility check
  tcg/s390x: Use LARL+AGHI for odd addresses
  tcg/s390x: Distinguish RRF-a and RRF-c formats
  tcg/s390x: Distinguish RIE formats
  tcg/s390x: Support MIE2 multiply single instructions
  tcg/s390x: Support MIE2 MGRK instruction
  tcg/s390x: Issue XILF directly for xor_i32
  tcg/s390x: Tighten constraints for or_i64 and xor_i64
  tcg/s390x: Tighten constraints for and_i64
  tcg/s390x: Support MIE3 logical operations
  tcg/s390x: Create tgen_cmp2 to simplify movcond
  tcg/s390x: Generalize movcond implementation
  tcg/s390x: Support SELGR instruction in movcond
  tcg/s390x: Use tgen_movcond_int in tgen_clz
  tcg/s390x: Implement ctpop operation
  tcg/s390x: Tighten constraints for 64-bit compare
  tcg/s390x: Cleanup tcg_out_movi
  tcg/s390x: Avoid the constant pool in tcg_out_movi

 tcg/s390x/tcg-target-con-set.h |   18 +-
 tcg/s390x/tcg-target-con-str.h |   11 +-
 tcg/s390x/tcg-target.h |   54 +-
 tcg/s390x/tcg-target.c.inc | 1251 
 4 files changed, 668 insertions(+), 666 deletions(-)

-- 
2.34.1




[PATCH v4 08/27] tcg/s390x: Check for load-on-condition facility at startup

2022-12-08 Thread Richard Henderson
The load-on-condition facility was introduced in z196,
which itself was end-of-life in 2021.  In addition, z196 is the
minimum CPU supported by our set of supported operating systems:
RHEL 7 (z196), SLES 12 (z196) and Ubuntu 16.04 (zEC12).

Check for facility number 45, which will be the consolidated check
for several facilities.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  6 ++--
 tcg/s390x/tcg-target.c.inc | 72 +-
 2 files changed, 27 insertions(+), 51 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index d47e8ba66a..31d5510d2d 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -58,12 +58,12 @@ typedef enum TCGReg {
 #define FACILITY_LONG_DISP    18
 #define FACILITY_EXT_IMM  21
 #define FACILITY_GEN_INST_EXT 34
+#define FACILITY_45   45
 
 /* Facilities that are checked at runtime. */
 
-#define FACILITY_LOAD_ON_COND 45
-#define FACILITY_FAST_BCR_SER FACILITY_LOAD_ON_COND
-#define FACILITY_DISTINCT_OPS FACILITY_LOAD_ON_COND
+#define FACILITY_FAST_BCR_SER 45
+#define FACILITY_DISTINCT_OPS 45
 #define FACILITY_LOAD_ON_COND2    53
 #define FACILITY_VECTOR   129
 #define FACILITY_VECTOR_ENH1  135
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index f0b581293c..29a64ad0fe 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1252,7 +1252,6 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
  TCGReg dest, TCGReg c1, TCGArg c2, int c2const)
 {
 int cc;
-bool have_loc;
 
 /* With LOC2, we can always emit the minimum 3 insns.  */
 if (HAVE_FACILITY(LOAD_ON_COND2)) {
@@ -1263,9 +1262,6 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
 return;
 }
 
-have_loc = HAVE_FACILITY(LOAD_ON_COND);
-
-/* For HAVE_LOC, only the paths through GTU/GT/LEU/LE are smaller.  */
  restart:
 switch (cond) {
 case TCG_COND_NE:
@@ -1310,59 +1306,35 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
 case TCG_COND_LT:
 case TCG_COND_GE:
 /* Swap operands so that we can use LEU/GTU/GT/LE.  */
-if (c2const) {
-if (have_loc) {
-break;
-}
-tcg_out_movi(s, type, TCG_TMP0, c2);
-c2 = c1;
-c2const = 0;
-c1 = TCG_TMP0;
-} else {
+if (!c2const) {
 TCGReg t = c1;
 c1 = c2;
 c2 = t;
+cond = tcg_swap_cond(cond);
+goto restart;
 }
-cond = tcg_swap_cond(cond);
-goto restart;
+break;
 
 default:
 g_assert_not_reached();
 }
 
 cc = tgen_cmp(s, type, cond, c1, c2, c2const, false);
-if (have_loc) {
-/* Emit: d = 0, t = 1, d = (cc ? t : d).  */
-tcg_out_movi(s, TCG_TYPE_I64, dest, 0);
-tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, 1);
-tcg_out_insn(s, RRF, LOCGR, dest, TCG_TMP0, cc);
-} else {
-/* Emit: d = 1; if (cc) goto over; d = 0; over:  */
-tcg_out_movi(s, type, dest, 1);
-tcg_out_insn(s, RI, BRC, cc, (4 + 4) >> 1);
-tcg_out_movi(s, type, dest, 0);
-}
+/* Emit: d = 0, t = 1, d = (cc ? t : d).  */
+tcg_out_movi(s, TCG_TYPE_I64, dest, 0);
+tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, 1);
+tcg_out_insn(s, RRF, LOCGR, dest, TCG_TMP0, cc);
 }
 
 static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
  TCGReg c1, TCGArg c2, int c2const,
  TCGArg v3, int v3const)
 {
-int cc;
-if (HAVE_FACILITY(LOAD_ON_COND)) {
-cc = tgen_cmp(s, type, c, c1, c2, c2const, false);
-if (v3const) {
-tcg_out_insn(s, RIE, LOCGHI, dest, v3, cc);
-} else {
-tcg_out_insn(s, RRF, LOCGR, dest, v3, cc);
-}
+int cc = tgen_cmp(s, type, c, c1, c2, c2const, false);
+if (v3const) {
+tcg_out_insn(s, RIE, LOCGHI, dest, v3, cc);
 } else {
-c = tcg_invert_cond(c);
-cc = tgen_cmp(s, type, c, c1, c2, c2const, false);
-
-/* Emit: if (cc) goto over; dest = r3; over:  */
-tcg_out_insn(s, RI, BRC, cc, (4 + 4) >> 1);
-tcg_out_insn(s, RRE, LGR, dest, v3);
+tcg_out_insn(s, RRF, LOCGR, dest, v3, cc);
 }
 }
 
@@ -1382,14 +1354,8 @@ static void tgen_clz(TCGContext *s, TCGReg dest, TCGReg a1,
 } else {
 tcg_out_mov(s, TCG_TYPE_I64, dest, a2);
 }
-if (HAVE_FACILITY(LOAD_ON_COND)) {
-/* Emit: if (one bit found) dest = r0.  */
-tcg_out_insn(s, RRF, LOCGR, dest, TCG_REG_R0, 2);
-} else {
-/* Emit: if (no one bit found) goto over; dest = r0; over:  */
-tcg_out_insn(s, RI, BRC, 8, (4 + 4) >> 1);
-   
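The now-unconditional LOCGR tail of tgen_setcond in this patch can be modelled as straight-line C (illustrative only; `setcond_model` is a made-up name):

```c
#include <stdint.h>

/* d = 0; t = 1; d = (cc ? t : d) -- the three-insn LOCGR sequence. */
static int setcond_model(int cc)
{
    int dest = 0;   /* tcg_out_movi(dest, 0) */
    int tmp = 1;    /* tcg_out_movi(TCG_TMP0, 1) */
    if (cc) {       /* LOCGR dest, TCG_TMP0, cc */
        dest = tmp;
    }
    return dest;
}
```

Because the conditional load never traps or branches, this replaces the old BRC-based skip-over sequence with a branch-free one.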

[PATCH v4 22/27] tcg/s390x: Support SELGR instruction in movcond

2022-12-08 Thread Richard Henderson
The new select instruction provides two separate register inputs,
whereas the old load-on-condition instruction overlaps one of the
register inputs with the destination.

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.c.inc | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 30c12052f0..ab1fb45cc2 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -202,6 +202,8 @@ typedef enum S390Opcode {
 RRFa_XRK= 0xb9f7,
 RRFa_XGRK   = 0xb9e7,
 
+RRFam_SELGR = 0xb9e3,
+
 RRFc_LOCR   = 0xb9f2,
 RRFc_LOCGR  = 0xb9e2,
 
@@ -626,12 +628,20 @@ static void tcg_out_insn_RRE(TCGContext *s, S390Opcode op,
 tcg_out32(s, (op << 16) | (r1 << 4) | r2);
 }
 
+/* RRF-a without the m4 field */
 static void tcg_out_insn_RRFa(TCGContext *s, S390Opcode op,
   TCGReg r1, TCGReg r2, TCGReg r3)
 {
 tcg_out32(s, (op << 16) | (r3 << 12) | (r1 << 4) | r2);
 }
 
+/* RRF-a with the m4 field */
+static void tcg_out_insn_RRFam(TCGContext *s, S390Opcode op,
+   TCGReg r1, TCGReg r2, TCGReg r3, int m4)
+{
+tcg_out32(s, (op << 16) | (r3 << 12) | (m4 << 8) | (r1 << 4) | r2);
+}
+
 static void tcg_out_insn_RRFc(TCGContext *s, S390Opcode op,
   TCGReg r1, TCGReg r2, int m3)
 {
@@ -1376,6 +1386,11 @@ static void tgen_movcond_int(TCGContext *s, TCGType type, TCGReg dest,
 src = v4;
 }
 } else {
+if (HAVE_FACILITY(MISC_INSN_EXT3)) {
+/* Emit: dest = cc ? v3 : v4. */
+tcg_out_insn(s, RRFam, SELGR, dest, v3, v4, cc);
+return;
+}
 if (dest == v4) {
 src = v3;
 } else {
-- 
2.34.1
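The field packing used by tcg_out_insn_RRFam above can be checked against a hand-assembled SELGR encoding (a sketch; the expected instruction word is derived from the 0xb9e3 opcode in the table and the shifts in the emitter):

```c
#include <stdint.h>

/* RRF-a with the m4 field, per tcg_out_insn_RRFam:
   (op << 16) | (r3 << 12) | (m4 << 8) | (r1 << 4) | r2 */
static uint32_t rrf_am(uint32_t op, uint32_t r1, uint32_t r2,
                       uint32_t r3, uint32_t m4)
{
    return (op << 16) | (r3 << 12) | (m4 << 8) | (r1 << 4) | r2;
}
```

For example, SELGR %r2,%r3,%r4 with condition mask 8 packs to 0xb9e34823.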




[PATCH v4 06/27] tcg/s390x: Check for extended-immediate facility at startup

2022-12-08 Thread Richard Henderson
The extended-immediate facility was introduced in z9-109,
which itself was end-of-life in 2017.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |   4 +-
 tcg/s390x/tcg-target.c.inc | 231 +++--
 2 files changed, 72 insertions(+), 163 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 7f230ed243..126ba1048a 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -56,10 +56,10 @@ typedef enum TCGReg {
 
 #define FACILITY_ZARCH_ACTIVE 2
 #define FACILITY_LONG_DISP  18
+#define FACILITY_EXT_IMM  21
 
 /* Facilities that are checked at runtime. */
 
-#define FACILITY_EXT_IMM  21
 #define FACILITY_GEN_INST_EXT 34
 #define FACILITY_LOAD_ON_COND 45
 #define FACILITY_FAST_BCR_SER FACILITY_LOAD_ON_COND
@@ -126,7 +126,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_eqv_i64    0
 #define TCG_TARGET_HAS_nand_i64   0
 #define TCG_TARGET_HAS_nor_i64    0
-#define TCG_TARGET_HAS_clz_i64    HAVE_FACILITY(EXT_IMM)
+#define TCG_TARGET_HAS_clz_i64    1
 #define TCG_TARGET_HAS_ctz_i64    0
 #define TCG_TARGET_HAS_ctpop_i64  0
 #define TCG_TARGET_HAS_deposit_i64  HAVE_FACILITY(GEN_INST_EXT)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 1fcefba7ba..42e161cc7e 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -819,19 +819,17 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 }
 
 /* Try all 48-bit insns that can load it in one go.  */
-if (HAVE_FACILITY(EXT_IMM)) {
-if (sval == (int32_t)sval) {
-tcg_out_insn(s, RIL, LGFI, ret, sval);
-return;
-}
-if (uval <= 0xffffffff) {
-tcg_out_insn(s, RIL, LLILF, ret, uval);
-return;
-}
-if ((uval & 0xffffffff) == 0) {
-tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
-return;
-}
+if (sval == (int32_t)sval) {
+tcg_out_insn(s, RIL, LGFI, ret, sval);
+return;
+}
+if (uval <= 0xffffffff) {
+tcg_out_insn(s, RIL, LLILF, ret, uval);
+return;
+}
+if ((uval & 0xffffffff) == 0) {
+tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+return;
 }
 
 /* Try for PC-relative address load.  For odd addresses,
@@ -844,15 +842,6 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 }
 }
 
-/* A 32-bit unsigned value can be loaded in 2 insns.  And given
-   that LLILL, LLIHL, LLILF above did not succeed, we know that
-   both insns are required.  */
-if (uval <= 0xffffffff) {
-tcg_out_insn(s, RI, LLILL, ret, uval);
-tcg_out_insn(s, RI, IILH, ret, uval >> 16);
-return;
-}
-
 /* Otherwise, stuff it in the constant pool.  */
 if (HAVE_FACILITY(GEN_INST_EXT)) {
 tcg_out_insn(s, RIL, LGRL, ret, 0);
@@ -1002,82 +991,22 @@ static inline void tcg_out_risbg(TCGContext *s, TCGReg dest, TCGReg src,
 
 static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-if (HAVE_FACILITY(EXT_IMM)) {
-tcg_out_insn(s, RRE, LGBR, dest, src);
-return;
-}
-
-if (type == TCG_TYPE_I32) {
-if (dest == src) {
-tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 24);
-} else {
-tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 24);
-}
-tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 24);
-} else {
-tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 56);
-tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 56);
-}
+tcg_out_insn(s, RRE, LGBR, dest, src);
 }
 
 static void tgen_ext8u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-if (HAVE_FACILITY(EXT_IMM)) {
-tcg_out_insn(s, RRE, LLGCR, dest, src);
-return;
-}
-
-if (dest == src) {
-tcg_out_movi(s, type, TCG_TMP0, 0xff);
-src = TCG_TMP0;
-} else {
-tcg_out_movi(s, type, dest, 0xff);
-}
-if (type == TCG_TYPE_I32) {
-tcg_out_insn(s, RR, NR, dest, src);
-} else {
-tcg_out_insn(s, RRE, NGR, dest, src);
-}
+tcg_out_insn(s, RRE, LLGCR, dest, src);
 }
 
 static void tgen_ext16s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-if (HAVE_FACILITY(EXT_IMM)) {
-tcg_out_insn(s, RRE, LGHR, dest, src);
-return;
-}
-
-if (type == TCG_TYPE_I32) {
-if (dest == src) {
-tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 16);
-} else {
-tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 16);
-}
-tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 16);
-} else {
-tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 48);
-tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 48);
-}
+tcg_out_insn(s, RRE, LGHR, dest, src);
 }
 
 static void tgen_ext16u(TCGContext 

[PATCH v4 23/27] tcg/s390x: Use tgen_movcond_int in tgen_clz

2022-12-08 Thread Richard Henderson
Reuse code from movcond to conditionally copy a2 to dest,
based on the condition codes produced by FLOGR.
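As background for the reuse, here is a hedged C sketch of what the emitted FLOGR + movcond sequence computes (not the emitted machine code): FLOGR leaves the leading-zero count in R0 and signals via the condition code whether any one bit was found, and the movcond selects between R0 and the fallback `a2`:

```c
#include <stdint.h>

/* Sketch of clz_i64 semantics: return the leading-zero count of a1, or
 * the fallback a2 when a1 is zero.  The two return paths correspond to
 * the branch masks noted in the patch (2 = one bit found, 8 = none). */
static uint64_t clz64_with_fallback(uint64_t a1, uint64_t a2)
{
    if (a1 == 0) {
        return a2;              /* "no one bit found" path (mask 8) */
    }
    uint64_t n = 0;             /* leading-zero count, as FLOGR produces in R0 */
    for (uint64_t bit = UINT64_C(1) << 63; !(a1 & bit); bit >>= 1) {
        n++;
    }
    return n;                   /* "one bit found" path (mask 2) */
}
```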

Reviewed-by: Ilya Leoshkevich 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |  1 +
 tcg/s390x/tcg-target.c.inc | 20 +++-
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 8cf8ed4dff..baf3bc9037 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -24,6 +24,7 @@ C_O1_I2(r, 0, rI)
 C_O1_I2(r, 0, rJ)
 C_O1_I2(r, r, r)
 C_O1_I2(r, r, ri)
+C_O1_I2(r, r, rI)
 C_O1_I2(r, r, rJ)
 C_O1_I2(r, r, rK)
 C_O1_I2(r, r, rKR)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index ab1fb45cc2..8254f9f650 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1424,15 +1424,15 @@ static void tgen_clz(TCGContext *s, TCGReg dest, TCGReg a1,
 
 if (a2const && a2 == 64) {
 tcg_out_mov(s, TCG_TYPE_I64, dest, TCG_REG_R0);
-} else {
-if (a2const) {
-tcg_out_movi(s, TCG_TYPE_I64, dest, a2);
-} else {
-tcg_out_mov(s, TCG_TYPE_I64, dest, a2);
-}
-/* Emit: if (one bit found) dest = r0.  */
-tcg_out_insn(s, RRFc, LOCGR, dest, TCG_REG_R0, 2);
+return;
 }
+
+/*
+ * Conditions from FLOGR are:
+ *   2 -> one bit found
+ *   8 -> no one bit found
+ */
+tgen_movcond_int(s, TCG_TYPE_I64, dest, a2, a2const, TCG_REG_R0, 8, 2);
 }
 
 static void tgen_deposit(TCGContext *s, TCGReg dest, TCGReg src,
@@ -3070,11 +3070,13 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_rotl_i64:
 case INDEX_op_rotr_i32:
 case INDEX_op_rotr_i64:
-case INDEX_op_clz_i64:
 case INDEX_op_setcond_i32:
 case INDEX_op_setcond_i64:
 return C_O1_I2(r, r, ri);
 
+case INDEX_op_clz_i64:
+return C_O1_I2(r, r, rI);
+
 case INDEX_op_sub_i32:
 case INDEX_op_sub_i64:
 case INDEX_op_and_i32:
-- 
2.34.1




[PATCH v4 09/27] tcg/s390x: Remove FAST_BCR_SER facility check

2022-12-08 Thread Richard Henderson
The fast-bcr-serialization facility is bundled into facility 45,
along with load-on-condition.  We are checking this at startup.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h | 1 -
 tcg/s390x/tcg-target.c.inc | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 31d5510d2d..fc9ae82700 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -62,7 +62,6 @@ typedef enum TCGReg {
 
 /* Facilities that are checked at runtime. */
 
-#define FACILITY_FAST_BCR_SER 45
 #define FACILITY_DISTINCT_OPS 45
 #define FACILITY_LOAD_ON_COND253
 #define FACILITY_VECTOR   129
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 29a64ad0fe..dd58f0cdb5 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -2431,7 +2431,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 /* The host memory model is quite strong, we simply need to
serialize the instruction stream.  */
 if (args[0] & TCG_MO_ST_LD) {
-tcg_out_insn(s, RR, BCR, HAVE_FACILITY(FAST_BCR_SER) ? 14 : 15, 0);
+/* fast-bcr-serialization facility (45) is present */
+tcg_out_insn(s, RR, BCR, 14, 0);
 }
 break;
 
-- 
2.34.1




[PATCH v4 05/27] tcg/s390x: Check for long-displacement facility at startup

2022-12-08 Thread Richard Henderson
We are already assuming the existence of long-displacement, but were
not being explicit about it.  This has been present since z990.

Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  6 --
 tcg/s390x/tcg-target.c.inc | 15 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 645f522058..7f230ed243 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -52,11 +52,13 @@ typedef enum TCGReg {
 
 #define TCG_TARGET_NB_REGS 64
 
-/* A list of relevant facilities used by this translator.  Some of these
-   are required for proper operation, and these are checked at startup.  */
+/* Facilities required for proper operation; checked at startup. */
 
 #define FACILITY_ZARCH_ACTIVE 2
 #define FACILITY_LONG_DISP  18
+
+/* Facilities that are checked at runtime. */
+
 #define FACILITY_EXT_IMM  21
 #define FACILITY_GEN_INST_EXT 34
 #define FACILITY_LOAD_ON_COND 45
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index dea889ffa1..1fcefba7ba 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -3211,6 +3211,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 static void query_s390_facilities(void)
 {
 unsigned long hwcap = qemu_getauxval(AT_HWCAP);
+const char *which;
 
 /* Is STORE FACILITY LIST EXTENDED available?  Honestly, I believe this
is present on all 64-bit systems, but let's check for it anyway.  */
@@ -3232,6 +3233,20 @@ static void query_s390_facilities(void)
 if (!(hwcap & HWCAP_S390_VXRS)) {
 s390_facilities[2] = 0;
 }
+
+/*
+ * Check for all required facilities.
+ * ZARCH_ACTIVE is done via preprocessor check for 64-bit.
+ */
+if (!HAVE_FACILITY(LONG_DISP)) {
+which = "long-displacement";
+goto fail;
+}
+return;
+
+ fail:
+error_report("%s: missing required facility %s", __func__, which);
+exit(EXIT_FAILURE);
 }
 
 static void tcg_target_init(TCGContext *s)
-- 
2.34.1




Re: [RFC v4 3/3] hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)

2022-12-08 Thread Gregory Price
On Thu, Dec 08, 2022 at 10:55:58PM +, Fan Ni wrote:
> On Mon, Nov 28, 2022 at 10:01:57AM -0500, Gregory Price wrote:
> >  
> > -if (cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER) {
> > +if ((cxl_dstate->vmem_size < CXL_CAPACITY_MULTIPLIER) ||
> > +(cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER)) {
> >  return CXL_MBOX_INTERNAL_ERROR;
> >  }
> For a cxl configuration with only pmem or vmem, vmem_size or pmem_size
> can be 0 and fail the check? 
> 

While nonsensical, I believe it's technically supported.  The prior
implementation likewise allowed pmem_size to be 0 in these checks.

> >  
> > +error_cleanup:
> > +int i;
> > +for (i = 0; i < cur_ent; i++) {
> > +g_free(table[i]);
> > +}
> > +return rc;
> >  }
> 
> I hit an error when compiling with gcc version 9.4.0
> (Ubuntu 9.4.0-1ubuntu1~20.04.1), maybe moving the declaration of `i` to
> the following loop.
> 
> 
> ../hw/mem/cxl_type3.c:211:5: error: a label can only be part of a statement
> and a declaration is not a statement
>   211 |     int i;
>       |     ^~~

Moved the declaration to the top of the function with the rest of the
declarations. Good catch.
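For reference, a minimal sketch of the fixed shape (function and variable names are illustrative, not the actual cxl_type3.c code): before C23, a label must be followed by a statement, and a declaration is not a statement, so the declaration is hoisted above the label:

```c
#include <stdlib.h>

/* Hoisting "int i;" to the top avoids the pre-C23 error
 * "a label can only be part of a statement". */
static int cleanup_demo(void **table, int n)
{
    int i;          /* declared with the other declarations, not after the label */
    int rc = 0;

    if (n < 0) {
        rc = -1;
        goto error_cleanup;
    }
    return rc;

error_cleanup:
    for (i = 0; i < n; i++) {
        free(table[i]);
    }
    return rc;
}
```

The alternative would be a null statement (`error_cleanup: ;`) directly after the label, but moving the declaration matches the style used elsewhere in QEMU.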




Re: [PATCH v2 14/16] hw/intc: sifive_plic: Change "priority-base" to start from interrupt source 0

2022-12-08 Thread Wilfred Mallawa
On Wed, 2022-12-07 at 18:03 +0800, Bin Meng wrote:
> At present the SiFive PLIC model "priority-base" expects interrupt
> priority register base starting from source 1 instead of source 0,
> that's why on most platforms "priority-base" is set to 0x04 except
> 'opentitan' machine. 'opentitan' should have set "priority-base"
> to 0x04 too.
> 
> Note the irq number calculation in sifive_plic_{read,write} is
> correct as the codes make up for the irq number by adding 1.
> 
> Let's simply update "priority-base" to start from interrupt source
> 0 and add a comment to make it crystal clear.
> 
> Signed-off-by: Bin Meng 
> Reviewed-by: Alistair Francis 
Reviewed-by: Wilfred Mallawa 

Wilfred
> ---
> 
> (no changes since v1)
> 
>  include/hw/riscv/microchip_pfsoc.h | 2 +-
>  include/hw/riscv/shakti_c.h    | 2 +-
>  include/hw/riscv/sifive_e.h    | 2 +-
>  include/hw/riscv/sifive_u.h    | 2 +-
>  include/hw/riscv/virt.h    | 2 +-
>  hw/intc/sifive_plic.c  | 5 +++--
>  6 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/include/hw/riscv/microchip_pfsoc.h
> b/include/hw/riscv/microchip_pfsoc.h
> index 577efad0c4..e65ffeb02d 100644
> --- a/include/hw/riscv/microchip_pfsoc.h
> +++ b/include/hw/riscv/microchip_pfsoc.h
> @@ -155,7 +155,7 @@ enum {
>  
>  #define MICROCHIP_PFSOC_PLIC_NUM_SOURCES    187
>  #define MICROCHIP_PFSOC_PLIC_NUM_PRIORITIES 7
> -#define MICROCHIP_PFSOC_PLIC_PRIORITY_BASE  0x04
> +#define MICROCHIP_PFSOC_PLIC_PRIORITY_BASE  0x00
>  #define MICROCHIP_PFSOC_PLIC_PENDING_BASE   0x1000
>  #define MICROCHIP_PFSOC_PLIC_ENABLE_BASE    0x2000
>  #define MICROCHIP_PFSOC_PLIC_ENABLE_STRIDE  0x80
> diff --git a/include/hw/riscv/shakti_c.h
> b/include/hw/riscv/shakti_c.h
> index daf0aae13f..539fe1156d 100644
> --- a/include/hw/riscv/shakti_c.h
> +++ b/include/hw/riscv/shakti_c.h
> @@ -65,7 +65,7 @@ enum {
>  #define SHAKTI_C_PLIC_NUM_SOURCES 28
>  /* Excluding Priority 0 */
>  #define SHAKTI_C_PLIC_NUM_PRIORITIES 2
> -#define SHAKTI_C_PLIC_PRIORITY_BASE 0x04
> +#define SHAKTI_C_PLIC_PRIORITY_BASE 0x00
>  #define SHAKTI_C_PLIC_PENDING_BASE 0x1000
>  #define SHAKTI_C_PLIC_ENABLE_BASE 0x2000
>  #define SHAKTI_C_PLIC_ENABLE_STRIDE 0x80
> diff --git a/include/hw/riscv/sifive_e.h
> b/include/hw/riscv/sifive_e.h
> index 9e58247fd8..b824a79e2d 100644
> --- a/include/hw/riscv/sifive_e.h
> +++ b/include/hw/riscv/sifive_e.h
> @@ -89,7 +89,7 @@ enum {
>   */
>  #define SIFIVE_E_PLIC_NUM_SOURCES 53
>  #define SIFIVE_E_PLIC_NUM_PRIORITIES 7
> -#define SIFIVE_E_PLIC_PRIORITY_BASE 0x04
> +#define SIFIVE_E_PLIC_PRIORITY_BASE 0x00
>  #define SIFIVE_E_PLIC_PENDING_BASE 0x1000
>  #define SIFIVE_E_PLIC_ENABLE_BASE 0x2000
>  #define SIFIVE_E_PLIC_ENABLE_STRIDE 0x80
> diff --git a/include/hw/riscv/sifive_u.h
> b/include/hw/riscv/sifive_u.h
> index 8f63a183c4..e680d61ece 100644
> --- a/include/hw/riscv/sifive_u.h
> +++ b/include/hw/riscv/sifive_u.h
> @@ -158,7 +158,7 @@ enum {
>  
>  #define SIFIVE_U_PLIC_NUM_SOURCES 54
>  #define SIFIVE_U_PLIC_NUM_PRIORITIES 7
> -#define SIFIVE_U_PLIC_PRIORITY_BASE 0x04
> +#define SIFIVE_U_PLIC_PRIORITY_BASE 0x00
>  #define SIFIVE_U_PLIC_PENDING_BASE 0x1000
>  #define SIFIVE_U_PLIC_ENABLE_BASE 0x2000
>  #define SIFIVE_U_PLIC_ENABLE_STRIDE 0x80
> diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
> index e1ce0048af..3407c9e8dd 100644
> --- a/include/hw/riscv/virt.h
> +++ b/include/hw/riscv/virt.h
> @@ -98,7 +98,7 @@ enum {
>  #define VIRT_IRQCHIP_MAX_GUESTS_BITS 3
>  #define VIRT_IRQCHIP_MAX_GUESTS ((1U <<
> VIRT_IRQCHIP_MAX_GUESTS_BITS) - 1U)
>  
> -#define VIRT_PLIC_PRIORITY_BASE 0x04
> +#define VIRT_PLIC_PRIORITY_BASE 0x00
>  #define VIRT_PLIC_PENDING_BASE 0x1000
>  #define VIRT_PLIC_ENABLE_BASE 0x2000
>  #define VIRT_PLIC_ENABLE_STRIDE 0x80
> diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c
> index 1edeb1e1ed..1a792cc3f5 100644
> --- a/hw/intc/sifive_plic.c
> +++ b/hw/intc/sifive_plic.c
> @@ -140,7 +140,7 @@ static uint64_t sifive_plic_read(void *opaque,
> hwaddr addr, unsigned size)
>  SiFivePLICState *plic = opaque;
>  
>  if (addr_between(addr, plic->priority_base, plic->num_sources <<
> 2)) {
> -    uint32_t irq = ((addr - plic->priority_base) >> 2) + 1;
> +    uint32_t irq = (addr - plic->priority_base) >> 2;
>  
>  return plic->source_priority[irq];
>  } else if (addr_between(addr, plic->pending_base, plic-
> >num_sources >> 3)) {
> @@ -187,7 +187,7 @@ static void sifive_plic_write(void *opaque,
> hwaddr addr, uint64_t value,
>  SiFivePLICState *plic = opaque;
>  
>  if (addr_between(addr, plic->priority_base, plic->num_sources <<
> 2)) {
> -    uint32_t irq = ((addr - plic->priority_base) >> 2) + 1;
> +    uint32_t irq = (addr - plic->priority_base) >> 2;
>  
>  if (((plic->num_priorities + 1) & plic->num_priorities) ==
> 0) {
>  /*
> @@ -428,6 +428,7 @@ static Property sifive_plic_properties[] = {
>  /* number of interrupt sources
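The quoted change can be summarized in a small C sketch (simplified from the hunks above): the `+ 1` compensation in the index calculation disappears because "priority-base" now points at source 0 instead of source 1, so both formulas yield the same irq for the same register address.

```c
#include <stdint.h>

/* Before: priority_base = 0x04 (register of source 1). */
static uint32_t irq_old(uint64_t addr, uint64_t priority_base)
{
    return (uint32_t)((addr - priority_base) >> 2) + 1;
}

/* After: priority_base = 0x00 (register of source 0). */
static uint32_t irq_new(uint64_t addr, uint64_t priority_base)
{
    return (uint32_t)((addr - priority_base) >> 2);
}
```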

Re: [PATCH v2 01/16] hw/riscv: Select MSI_NONBROKEN in SIFIVE_PLIC

2022-12-08 Thread Wilfred Mallawa
On Wed, 2022-12-07 at 18:03 +0800, Bin Meng wrote:
> hw/pci/Kconfig says MSI_NONBROKEN should be selected by interrupt
> controllers regardless of how MSI is implemented. msi_nonbroken is
> initialized to true in sifive_plic_realize().
> 
> Let SIFIVE_PLIC select MSI_NONBROKEN and drop the selection from
> RISC-V machines.
> 
> Signed-off-by: Bin Meng 
> Reviewed-by: Alistair Francis 
Reviewed-by: Wilfred Mallawa 

Wilfred
> ---
> 
> (no changes since v1)
> 
>  hw/intc/Kconfig  | 1 +
>  hw/riscv/Kconfig | 5 -
>  2 files changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
> index ecd2883ceb..1d4573e803 100644
> --- a/hw/intc/Kconfig
> +++ b/hw/intc/Kconfig
> @@ -78,6 +78,7 @@ config RISCV_IMSIC
>  
>  config SIFIVE_PLIC
>  bool
> +    select MSI_NONBROKEN
>  
>  config GOLDFISH_PIC
>  bool
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index 79ff61c464..167dc4cca6 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -11,7 +11,6 @@ config MICROCHIP_PFSOC
>  select MCHP_PFSOC_IOSCB
>  select MCHP_PFSOC_MMUART
>  select MCHP_PFSOC_SYSREG
> -    select MSI_NONBROKEN
>  select RISCV_ACLINT
>  select SIFIVE_PDMA
>  select SIFIVE_PLIC
> @@ -37,7 +36,6 @@ config RISCV_VIRT
>  imply TPM_TIS_SYSBUS
>  select RISCV_NUMA
>  select GOLDFISH_RTC
> -    select MSI_NONBROKEN
>  select PCI
>  select PCI_EXPRESS_GENERIC_BRIDGE
>  select PFLASH_CFI01
> @@ -53,7 +51,6 @@ config RISCV_VIRT
>  
>  config SIFIVE_E
>  bool
> -    select MSI_NONBROKEN
>  select RISCV_ACLINT
>  select SIFIVE_GPIO
>  select SIFIVE_PLIC
> @@ -64,7 +61,6 @@ config SIFIVE_E
>  config SIFIVE_U
>  bool
>  select CADENCE
> -    select MSI_NONBROKEN
>  select RISCV_ACLINT
>  select SIFIVE_GPIO
>  select SIFIVE_PDMA
> @@ -82,6 +78,5 @@ config SPIKE
>  bool
>  select RISCV_NUMA
>  select HTIF
> -    select MSI_NONBROKEN
>  select RISCV_ACLINT
>  select SIFIVE_PLIC



Re: [PATCH v2 04/16] hw/riscv: Sort machines Kconfig options in alphabetical order

2022-12-08 Thread Wilfred Mallawa
On Wed, 2022-12-07 at 18:03 +0800, Bin Meng wrote:
> SHAKTI_C machine Kconfig option was inserted out of order. Fix it.
> 
> Signed-off-by: Bin Meng 
> Reviewed-by: Alistair Francis 
Reviewed-by: Wilfred Mallawa 

Wilfred
> ---
> 
> (no changes since v1)
> 
>  hw/riscv/Kconfig | 16 +---
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> index 1e4b58024f..4550b3b938 100644
> --- a/hw/riscv/Kconfig
> +++ b/hw/riscv/Kconfig
> @@ -4,6 +4,8 @@ config RISCV_NUMA
>  config IBEX
>  bool
>  
> +# RISC-V machines in alphabetical order
> +
>  config MICROCHIP_PFSOC
>  bool
>  select CADENCE_SDHCI
> @@ -22,13 +24,6 @@ config OPENTITAN
>  select SIFIVE_PLIC
>  select UNIMP
>  
> -config SHAKTI_C
> -    bool
> -    select UNIMP
> -    select SHAKTI_UART
> -    select RISCV_ACLINT
> -    select SIFIVE_PLIC
> -
>  config RISCV_VIRT
>  bool
>  imply PCI_DEVICES
> @@ -50,6 +45,13 @@ config RISCV_VIRT
>  select FW_CFG_DMA
>  select PLATFORM_BUS
>  
> +config SHAKTI_C
> +    bool
> +    select RISCV_ACLINT
> +    select SHAKTI_UART
> +    select SIFIVE_PLIC
> +    select UNIMP
> +
>  config SIFIVE_E
>  bool
>  select RISCV_ACLINT



Re: [PATCH 1/2] target/riscv: Simplify helper_sret() a little bit

2022-12-08 Thread Wilfred Mallawa
On Wed, 2022-12-07 at 17:00 +0800, Bin Meng wrote:
> There are 2 paths in helper_sret() and the same mstatus update codes
> are replicated. Extract the common parts to simplify it a little bit.
> 
> Signed-off-by: Bin Meng 
Reviewed-by: Wilfred Mallawa 

Wilfred
> ---
> 
>  target/riscv/op_helper.c | 20 ++--
>  1 file changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
> index d7af7f056b..a047d38152 100644
> --- a/target/riscv/op_helper.c
> +++ b/target/riscv/op_helper.c
> @@ -149,21 +149,21 @@ target_ulong helper_sret(CPURISCVState *env)
>  }
>  
>  mstatus = env->mstatus;
> +    prev_priv = get_field(mstatus, MSTATUS_SPP);
> +    mstatus = set_field(mstatus, MSTATUS_SIE,
> +    get_field(mstatus, MSTATUS_SPIE));
> +    mstatus = set_field(mstatus, MSTATUS_SPIE, 1);
> +    mstatus = set_field(mstatus, MSTATUS_SPP, PRV_U);
> +    env->mstatus = mstatus;
>  
>  if (riscv_has_ext(env, RVH) && !riscv_cpu_virt_enabled(env)) {
>  /* We support Hypervisor extensions and virtulisation is
> disabled */
>  target_ulong hstatus = env->hstatus;
>  
> -    prev_priv = get_field(mstatus, MSTATUS_SPP);
>  prev_virt = get_field(hstatus, HSTATUS_SPV);
>  
>  hstatus = set_field(hstatus, HSTATUS_SPV, 0);
> -    mstatus = set_field(mstatus, MSTATUS_SPP, 0);
> -    mstatus = set_field(mstatus, SSTATUS_SIE,
> -    get_field(mstatus, SSTATUS_SPIE));
> -    mstatus = set_field(mstatus, SSTATUS_SPIE, 1);
>  
> -    env->mstatus = mstatus;
>  env->hstatus = hstatus;
>  
>  if (prev_virt) {
> @@ -171,14 +171,6 @@ target_ulong helper_sret(CPURISCVState *env)
>  }
>  
>  riscv_cpu_set_virt_enabled(env, prev_virt);
> -    } else {
> -    prev_priv = get_field(mstatus, MSTATUS_SPP);
> -
> -    mstatus = set_field(mstatus, MSTATUS_SIE,
> -    get_field(mstatus, MSTATUS_SPIE));
> -    mstatus = set_field(mstatus, MSTATUS_SPIE, 1);
> -    mstatus = set_field(mstatus, MSTATUS_SPP, PRV_U);
> -    env->mstatus = mstatus;
>  }
>  
>  riscv_cpu_set_mode(env, prev_priv);



Re: CVMSEG Emulation

2022-12-08 Thread Jiaxun Yang


Hi,

This address range is located in KSEG3… It doesn't seem to be a good location
for a userspace program.

I think you have two options to make target_mmap work. The first would be
raising TARGET_VIRT_ADDR_SPACE_BITS to 64 bits. That may break some userspace
applications that store pointer tags in the higher bits.

The second would be to mask the CVMSEG base with TARGET_VIRT_ADDR_SPACE_BITS
before mmap. As the higher VM address bits are dropped when addressing the
guest VM, that should provide similar behaviour. You will have multiple
aliases for CVMSEG in memory, though, and the application will be able to
access CVMSEG with the bits above TARGET_VIRT_ADDR_SPACE_BITS set to any
value. I don't know if that will break anything; AFAIK normal applications
won't use this range.
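The second option can be sketched in a few lines of C. This is only an illustration of the masking idea, not QEMU code; the bit width below is a stand-in, since the real TARGET_VIRT_ADDR_SPACE_BITS value is target-specific:

```c
#include <stdint.h>

/* Illustrative guest address-space width (the real value is per-target). */
#define DEMO_VIRT_ADDR_SPACE_BITS 40

/* Drop the high VM address bits before handing the address to mmap,
 * mirroring how bits above the address-space width are ignored when
 * addressing guest memory. */
static uint64_t mask_guest_addr(uint64_t addr)
{
    uint64_t mask = (UINT64_C(1) << DEMO_VIRT_ADDR_SPACE_BITS) - 1;
    return addr & mask;
}
```

All 64-bit aliases that differ only in the masked-off high bits map to the same guest page, which is the aliasing caveat mentioned above.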

Thanks
- Jiaxun 


> 2022年12月8日 15:08,Christopher Wrogg  写道:
> 
> In userspace emulation how do I make a set of addresses always valid and 
> initialized to 0 even though the process does not map it in? In particular I 
> want to map the CVMSEG for Cavium qemu-mips64 and qemu-mipsn32. The addresses 
> would be 0xFFFFFFFFFFFF8000 - 0xFFFFFFFFFFFFBFFF. I've looked at target_mmap 
> but it can't handle addresses that large. The lack of an emulated mmu for 64 
> bit guests is going to be a problem.




Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-12-08 Thread Guenter Roeck

On 12/8/22 12:28, Keith Busch wrote:

When the request times out, the kernel should be printing the command ID. What 
does that say? The driver thinks the 0 is invalid, so I'm just curious what 
value it's expecting.



After some time I see the following.

...
[   88.071197] nvme nvme0: invalid id 0 completed on queue 1
[   88.071514] could not locate request for tag 0x0
[   88.071802] nvme nvme0: invalid id 0 completed on queue 1
[   88.072135] could not locate request for tag 0x0
[   88.072426] nvme nvme0: invalid id 0 completed on queue 1
[   88.072720] could not locate request for tag 0x0
[   88.073007] nvme nvme0: invalid id 0 completed on queue 1
[   88.073343] nvme nvme0: request 0x50 genctr mismatch (got 0x0 expected 0x1)
[   88.073774] nvme nvme0: invalid id 80 completed on queue 1
[   88.074127] nvme nvme0: request 0x4f genctr mismatch (got 0x0 expected 0x1)
[   88.074556] nvme nvme0: invalid id 79 completed on queue 1
[   88.074903] nvme nvme0: request 0x4e genctr mismatch (got 0x0 expected 0x1)
[   88.075318] nvme nvme0: invalid id 78 completed on queue 1
[   88.075803] nvme nvme0: request 0x45 genctr mismatch (got 0x0 expected 0x1)
[   88.076239] nvme nvme0: invalid id 69 completed on queue 1
[   88.076585] nvme nvme0: request 0x46 genctr mismatch (got 0x0 expected 0x1)
[   88.076990] nvme nvme0: invalid id 70 completed on queue 1
[   88.077314] nvme nvme0: request 0x47 genctr mismatch (got 0x0 expected 0x1)
[   88.077744] nvme nvme0: invalid id 71 completed on queue 1
[   88.078064] nvme nvme0: request 0x48 genctr mismatch (got 0x0 expected 0x1)
[   88.078465] nvme nvme0: invalid id 72 completed on queue 1
[   88.078792] nvme nvme0: request 0x49 genctr mismatch (got 0x0 expected 0x1)
[   88.079190] nvme nvme0: invalid id 73 completed on queue 1
[   88.079522] nvme nvme0: request 0x4a genctr mismatch (got 0x0 expected 0x1)
[   88.079918] nvme nvme0: invalid id 74 completed on queue 1
[   88.080243] nvme nvme0: request 0x4b genctr mismatch (got 0x0 expected 0x1)
[   88.080630] nvme nvme0: invalid id 75 completed on queue 1
[   88.080963] nvme nvme0: request 0x4c genctr mismatch (got 0x0 expected 0x1)
[   88.081361] nvme nvme0: invalid id 76 completed on queue 1
[   88.081687] nvme nvme0: request 0x4d genctr mismatch (got 0x0 expected 0x1)
[   88.082081] nvme nvme0: invalid id 77 completed on queue 1
[   89.061345] irq 9: nobody cared (try booting with the "irqpoll" option)
[   89.061794] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G N 
6.1.0-rc8+ #1
[   89.062220] Call Trace:
[   89.062383] [<00eb7518>] __report_bad_irq+0x38/0xb4
[   89.062685] [<004e6e58>] note_interrupt+0x318/0x380
[   89.063000] [<004e2f00>] handle_irq_event+0x80/0xc0
[   89.063296] [<004e7e10>] handle_fasteoi_irq+0x90/0x220
[   89.063631] [<004e18e8>] generic_handle_irq+0x28/0x40
[   89.063946] [<00ede2ec>] handler_irq+0xac/0x100
[   89.064255] [<004274b0>] sys_call_table+0x760/0x970
[   89.064578] [<00eb2ee0>] ffs+0x0/0x18
[   89.064848] [<0042be0c>] do_softirq_own_stack+0x2c/0x40
[   89.065195] [<0046fd50>] __irq_exit_rcu+0xf0/0x140
[   89.065520] [<00470744>] irq_exit+0x4/0x40
[   89.065830] [<00ede3c4>] timer_interrupt+0x84/0xc0
[   89.066184] [<004274f8>] sys_call_table+0x7a8/0x970
[   89.066546] [<008f75e0>] blk_mq_do_dispatch_sched+0xa0/0x380
[   89.066940] [<008f7ad4>] __blk_mq_sched_dispatch_requests+0x94/0x160
[   89.067359] [<008f7bfc>] blk_mq_sched_dispatch_requests+0x3c/0x80
[   89.067774] handlers:
[   89.067952] [<(ptrval)>] nvme_irq
[   89.068254] [<(ptrval)>] nvme_irq
[   89.068538] Disabling IRQ #9
[   89.069837] random: crng init done
[   89.183077] could not locate request for tag 0x0
[   89.183475] nvme nvme0: invalid id 0 completed on queue 1
[   89.183824] could not locate request for tag 0x0
...
[   89.766750] nvme nvme0: invalid id 0 completed on queue 1
[   89.767076] could not locate request for tag 0x0
[   89.767361] nvme nvme0: invalid id 0 completed on queue 1
[   89.767701] nvme nvme0: request 0x4d genctr mismatch (got 0x0 expected 0x1)
[   89.768114] nvme nvme0: invalid id 77 completed on queue 1
[   89.768455] nvme nvme0: request 0x4c genctr mismatch (got 0x0 expected 0x1)
[   89.768876] nvme nvme0: invalid id 76 completed on queue 1
[   89.769215] nvme nvme0: request 0x4b genctr mismatch (got 0x0 expected 0x1)
[   89.769630] nvme nvme0: invalid id 75 completed on queue 1
[   89.769991] nvme nvme0: request 0x4a genctr mismatch (got 0x0 expected 0x1)
[   89.770409] nvme nvme0: invalid id 74 completed on queue 1
[   89.770750] nvme nvme0: request 0x49 genctr mismatch (got 0x0 expected 0x1)
[   89.771171] nvme nvme0: invalid id 73 completed on queue 1
[   89.771513] nvme nvme0: request 0x48 genctr mismatch (got 0x0 expected 0x1)
[   89.771934] nvme nvme0: invalid id 72 completed on queue 1
[   89.772286] nvme nvme0: request 0x47 genctr mism

Re: [PATCH] mailmap: Fix Stefan Weil author email

2022-12-08 Thread Stefan Weil via

Am 08.12.22 um 16:55 schrieb Philippe Mathieu-Daudé:


Fix authorship of commits 266aaedc37~..ac14949821. See commit
3bd2608db7 ("maint: Add .mailmap entries for patches claiming
list authorship") for rationale.

Signed-off-by: Philippe Mathieu-Daudé 
---
  .mailmap | 1 +
  1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 35dddbe27b..fad2aff5aa 100644
--- a/.mailmap
+++ b/.mailmap
@@ -45,6 +45,7 @@ Ed Swierk  Ed Swierk via Qemu-devel 
 Ian McKellar via Qemu-devel 

  Julia Suvorova  Julia Suvorova via Qemu-devel 

  Justin Terry (VM)  Justin Terry (VM) via Qemu-devel 

+Stefan Weil  Stefan Weil via 
  
  # Next, replace old addresses by a more recent one.

  Aleksandar Markovic  




Signed-off-by: Stefan Weil 

Thanks!






Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-12-08 Thread Guenter Roeck
On Thu, Dec 08, 2022 at 12:13:55PM -0800, Guenter Roeck wrote:
> On Thu, Dec 08, 2022 at 10:47:42AM -0800, Guenter Roeck wrote:
> > > 
> > > A cq head doorbell mmio is skipped... And it is not the fault of the
> > > kernel. The kernel is in it's good right to skip the mmio since the cq
> > > eventidx is not properly updated.
> > > 
> > > Adding that and it boots properly on riscv. But I'm perplexed as to why
> > > this didnt show up on our regularly tested platforms.
> > > 
> > > Gonna try to get this in for 7.2!
> > 
> > I see another problem with sparc64.
> > 
> > [5.261508] could not locate request for tag 0x0
> > [5.261711] nvme nvme0: invalid id 0 completed on queue 1
> > 
> > That is seen repeatedly until the request times out. I'll test with
> > your patch to see if it resolves this problem as well, and will bisect
> > otherwise.
> > 
> The second problem is unrelated to the doorbell problem.
> It is first seen in qemu v7.1. I'll try to bisect.
> 

Unfortunately, the problem observed with sparc64 also bisects to this
patch. Making things worse, "hw/nvme: fix missing cq eventidx update"
does not fix it (which is why I initially thought it was unrelated).

I used the following qemu command line.

qemu-system-sparc64 -M sun4v -cpu "TI UltraSparc IIi" -m 512 -snapshot \
-device nvme,serial=foo,drive=d0,bus=pciB \
-drive file=rootfs.ext2,if=none,format=raw,id=d0 \
-kernel arch/sparc/boot/image -no-reboot \
-append "root=/dev/nvme0n1 console=ttyS0" \
-nographic -monitor none

Guenter
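For readers following the eventidx discussion quoted earlier in this thread: the driver-side decision to skip the doorbell MMIO follows the shadow-doorbell "need event" test from the NVMe Doorbell Buffer Config feature. A hedged sketch of that check (it mirrors the kernel's nvme_dbbuf_need_event() helper):

```c
#include <stdint.h>

/* The driver issues the MMIO doorbell write only when the new doorbell
 * value passes the device-published event index.  If the device never
 * updates its eventidx, this returns 0 and the device stops seeing
 * doorbell writes -- the hang described above. */
static int dbbuf_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}
```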



Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-12-08 Thread Keith Busch
When the request times out, the kernel should be printing the command ID.
What does that say? The driver thinks the 0 is invalid, so I'm just curious
what value it's expecting.
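For context on the "genctr mismatch" messages in the logs: recent Linux NVMe drivers fold a small generation counter into the upper bits of the 16-bit command id, so a completion carrying a stale generation is rejected. The sketch below is illustrative only; the exact field widths and names in the kernel may differ:

```c
#include <stdint.h>

/* Hypothetical layout: low bits = tag, top bits = generation counter. */
#define CID_GENCTR_SHIFT 12
#define CID_GENCTR_MASK  0xf

static uint16_t make_cid(uint16_t genctr, uint16_t tag)
{
    return (uint16_t)(((genctr & CID_GENCTR_MASK) << CID_GENCTR_SHIFT) | tag);
}

/* A completion whose embedded generation disagrees with the driver's
 * bookkeeping triggers the "genctr mismatch (got X expected Y)" warning. */
static int genctr_matches(uint16_t cid, uint16_t expected_genctr)
{
    return ((cid >> CID_GENCTR_SHIFT) & CID_GENCTR_MASK) == expected_genctr;
}
```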

On Thu, Dec 8, 2022, 8:13 PM Guenter Roeck  wrote:

> On Thu, Dec 08, 2022 at 10:47:42AM -0800, Guenter Roeck wrote:
> > >
> > > A cq head doorbell mmio is skipped... And it is not the fault of the
> > > kernel. The kernel is in it's good right to skip the mmio since the cq
> > > eventidx is not properly updated.
> > >
> > > Adding that and it boots properly on riscv. But I'm perplexed as to why
> > > this didnt show up on our regularly tested platforms.
> > >
> > > Gonna try to get this in for 7.2!
> >
> > I see another problem with sparc64.
> >
> > [5.261508] could not locate request for tag 0x0
> > [5.261711] nvme nvme0: invalid id 0 completed on queue 1
> >
> > That is seen repeatedly until the request times out. I'll test with
> > your patch to see if it resolves this problem as well, and will bisect
> > otherwise.
> >
> The second problem is unrelated to the doorbell problem.
> It is first seen in qemu v7.1. I'll try to bisect.
>
> Guenter
>


Re: [PATCH 0/6] Enable Cubieboard A10 boot SPL from SD card

2022-12-08 Thread Niek Linnenbank
Hi Strahinja,

On Thu, Dec 8, 2022 at 8:24 PM Strahinja Jankovic <
strahinjapjanko...@gmail.com> wrote:

> Hi Niek,
>
> On Wed, Dec 7, 2022 at 9:25 PM Niek Linnenbank 
> wrote:
> >
> > Hello Strahinja,
> >
> > Thanks for contribution these patches, and also taking the H3 into
> account :-)
>
> Thank you for looking into these patches and all of the comments. I
> will try to submit V2 of this patch set in the following days.
>
> >
> > I've ran the avocado based acceptance tests for both boards and got
> these results:
> >
> > $ ARMBIAN_ARTIFACTS_CACHED=yes AVOCADO_ALLOW_LARGE_STORAGE=yes
> ./build/tests/venv/bin/avocado --show=app,console run -t
> machine:orangepi-pc tests/avocado/boot_linux_console.py
> > ...
> > RESULTS: PASS 5 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 |
> CANCEL 0
> > JOB TIME   : 114.24 s
> >
> > $ ./build/tests/venv/bin/avocado --show=app,console run -t
> machine:cubieboard tests/avocado/boot_linux_console.py
> > ...
> > RESULTS: PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 |
> CANCEL 0
> > JOB TIME   : 22.79 s
>
> I did not think initially about avocado, but maybe I could also add an
> SPL/SD boot test for the cubieboard, similarly to the way it is run
> for Orange Pi, for V2 of the patch set?
>

Yeah, that would be great. It can help to make testing easier when working
on your current code, since it's all automated.
And when covering the SPL/SD boot with an acceptance test, it also helps to
ensure it keeps working with future updates to the QEMU code too.

One thing to be aware of is to select an image with a URL that is stable.
Once a new test is merged, and the image is subsequently deleted
from the remote server, the test can't run anymore. We've had such a
problem before with the orangepi-pc tests.

Regards,
Niek


>
> Best regards,
> Strahinja
>
>
>
> >
> > So that shows both machines are still running fine. During startup of
> the bionic 20.08 image for orangepi-pc it did show this message:
> >   console: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
> >   console: sy8106a: probe of 0-0065 failed with error -110
> >
> > The SY8106a appears to be a peripheral attached to the I2C bus on the
> orangepi-pc, and we don't emulate the SY8106a yet, so that's an error to be
> expected:
> >   https://linux-sunxi.org/SY8106A
> >
> > So for the series:
> > Tested-by: Niek Linnenbank 
> >
> > I'll try to reply to each patch as well.
> >
> > Kind regards,
> > Niek
> >
> > On Sun, Dec 4, 2022 at 12:19 AM Strahinja Jankovic <
> strahinjapjanko...@gmail.com> wrote:
> >>
> >> This patch series adds missing Allwinner A10 modules needed for
> >> successful SPL boot:
> >> - Clock controller module
> >> - DRAM controller
> >> - I2C0 controller (added also for Allwinner H3 since it is the same)
> >> - AXP-209 connected to I2C0 bus
> >>
> >> It also updates Allwinner A10 emulation so SPL is copied from attached
> >> SD card if `-kernel` parameter is not passed when starting QEMU
> >> (approach adapted from Allwinner H3 implementation).
> >>
> >> Boot from SD card has been tested with Cubieboard Armbian SD card image
> and custom
> >> Yocto image built for Cubieboard.
> >> Example usage for Armbian image:
> >> qemu-system-arm -M cubieboard -nographic -sd
> ~/Armbian_22.11.0-trunk_Cubieboard_kinetic_edge_6.0.7.img
> >>
> >>
> >> Strahinja Jankovic (6):
> >>   hw/misc: Allwinner-A10 Clock Controller Module Emulation
> >>   hw/misc: Allwinner A10 DRAM Controller Emulation
> >>   hw/i2c: Allwinner TWI/I2C Emulation
> >>   hw/misc: Allwinner AXP-209 Emulation
> >>   hw/arm: Add AXP-209 to Cubieboard
> >>   hw/arm: Allwinner A10 enable SPL load from MMC
> >>
> >>  hw/arm/Kconfig|   5 +
> >>  hw/arm/allwinner-a10.c|  40 +++
> >>  hw/arm/allwinner-h3.c |  11 +-
> >>  hw/arm/cubieboard.c   |  11 +
> >>  hw/i2c/Kconfig|   4 +
> >>  hw/i2c/allwinner-i2c.c| 417 ++
> >>  hw/i2c/meson.build|   1 +
> >>  hw/misc/Kconfig   |  10 +
> >>  hw/misc/allwinner-a10-ccm.c   | 224 ++
> >>  hw/misc/allwinner-a10-dramc.c | 179 +++
> >>  hw/misc/allwinner-axp-209.c   | 263 
> >>  hw/misc/meson.build   |   3 +
> >>  include/hw/arm/allwinner-a10.h|  27 ++
> >>  include/hw/arm/allwinner-h3.h |   3 +
> >>  include/hw/i2c/allwinner-i2c.h| 112 +++
> >>  include/hw/misc/allwinner-a10-ccm.h   |  67 +
> >>  include/hw/misc/allwinner-a10-dramc.h |  68 +
> >>  17 files changed, 1444 insertions(+), 1 deletion(-)
> >>  create mode 100644 hw/i2c/allwinner-i2c.c
> >>  create mode 100644 hw/misc/allwinner-a10-ccm.c
> >>  create mode 100644 hw/misc/allwinner-a10-dramc.c
> >>  create mode 100644 hw/misc/allwinner-axp-209.c
> >>  create mode 100644 include/hw/i2c/allwinner-i2c.h
> >>  create mode 100644 include/hw/misc/allwinne

Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-12-08 Thread Guenter Roeck
On Thu, Dec 08, 2022 at 10:47:42AM -0800, Guenter Roeck wrote:
> > 
> > A cq head doorbell mmio is skipped... And it is not the fault of the
> > kernel. The kernel is in its good right to skip the mmio since the cq
> > eventidx is not properly updated.
> > 
> > Adding that and it boots properly on riscv. But I'm perplexed as to why
> > this didn't show up on our regularly tested platforms.
> > 
> > Gonna try to get this in for 7.2!
> 
> I see another problem with sparc64.
> 
> [5.261508] could not locate request for tag 0x0
> [5.261711] nvme nvme0: invalid id 0 completed on queue 1
> 
> That is seen repeatedly until the request times out. I'll test with
> your patch to see if it resolves this problem as well, and will bisect
> otherwise.
> 
The second problem is unrelated to the doorbell problem.
It is first seen in qemu v7.1. I'll try to bisect.

Guenter



Re: [PATCH 0/6] Enable Cubieboard A10 boot SPL from SD card

2022-12-08 Thread Strahinja Jankovic
Hi Niek,

On Wed, Dec 7, 2022 at 9:25 PM Niek Linnenbank  wrote:
>
> Hello Strahinja,
>
> Thanks for contributing these patches, and also taking the H3 into account :-)

Thank you for looking into these patches and all of the comments. I
will try to submit V2 of this patch set in the following days.

>
> I've run the avocado-based acceptance tests for both boards and got these 
> results:
>
> $ ARMBIAN_ARTIFACTS_CACHED=yes AVOCADO_ALLOW_LARGE_STORAGE=yes 
> ./build/tests/venv/bin/avocado --show=app,console run -t machine:orangepi-pc 
> tests/avocado/boot_linux_console.py
> ...
> RESULTS: PASS 5 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | 
> CANCEL 0
> JOB TIME   : 114.24 s
>
> $ ./build/tests/venv/bin/avocado --show=app,console run -t machine:cubieboard 
> tests/avocado/boot_linux_console.py
> ...
> RESULTS: PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | 
> CANCEL 0
> JOB TIME   : 22.79 s

I did not think initially about avocado, but maybe I could also add an
SPL/SD boot test for the cubieboard, similarly to the way it is run
for Orange Pi, for V2 of the patch set?

Best regards,
Strahinja



>
> So that shows both machines are still running fine. During startup of the 
> bionic 20.08 image for orangepi-pc it did show this message:
>   console: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
>   console: sy8106a: probe of 0-0065 failed with error -110
>
> The SY8106a appears to be a peripheral attached to the I2C bus on the 
> orangepi-pc, and we don't emulate the SY8106a yet, so that's an error to be 
> expected:
>   https://linux-sunxi.org/SY8106A
>
> So for the series:
> Tested-by: Niek Linnenbank 
>
> I'll try to reply to each patch as well.
>
> Kind regards,
> Niek
>
> On Sun, Dec 4, 2022 at 12:19 AM Strahinja Jankovic 
>  wrote:
>>
>> This patch series adds missing Allwinner A10 modules needed for
>> successful SPL boot:
>> - Clock controller module
>> - DRAM controller
>> - I2C0 controller (added also for Allwinner H3 since it is the same)
>> - AXP-209 connected to I2C0 bus
>>
>> It also updates Allwinner A10 emulation so SPL is copied from attached
>> SD card if `-kernel` parameter is not passed when starting QEMU
>> (approach adapted from Allwinner H3 implementation).
>>
>> Boot from SD card has been tested with Cubieboard Armbian SD card image and 
>> custom
>> Yocto image built for Cubieboard.
>> Example usage for Armbian image:
>> qemu-system-arm -M cubieboard -nographic -sd 
>> ~/Armbian_22.11.0-trunk_Cubieboard_kinetic_edge_6.0.7.img
>>
>>
>> Strahinja Jankovic (6):
>>   hw/misc: Allwinner-A10 Clock Controller Module Emulation
>>   hw/misc: Allwinner A10 DRAM Controller Emulation
>>   hw/i2c: Allwinner TWI/I2C Emulation
>>   hw/misc: Allwinner AXP-209 Emulation
>>   hw/arm: Add AXP-209 to Cubieboard
>>   hw/arm: Allwinner A10 enable SPL load from MMC
>>
>>  hw/arm/Kconfig|   5 +
>>  hw/arm/allwinner-a10.c|  40 +++
>>  hw/arm/allwinner-h3.c |  11 +-
>>  hw/arm/cubieboard.c   |  11 +
>>  hw/i2c/Kconfig|   4 +
>>  hw/i2c/allwinner-i2c.c| 417 ++
>>  hw/i2c/meson.build|   1 +
>>  hw/misc/Kconfig   |  10 +
>>  hw/misc/allwinner-a10-ccm.c   | 224 ++
>>  hw/misc/allwinner-a10-dramc.c | 179 +++
>>  hw/misc/allwinner-axp-209.c   | 263 
>>  hw/misc/meson.build   |   3 +
>>  include/hw/arm/allwinner-a10.h|  27 ++
>>  include/hw/arm/allwinner-h3.h |   3 +
>>  include/hw/i2c/allwinner-i2c.h| 112 +++
>>  include/hw/misc/allwinner-a10-ccm.h   |  67 +
>>  include/hw/misc/allwinner-a10-dramc.h |  68 +
>>  17 files changed, 1444 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/i2c/allwinner-i2c.c
>>  create mode 100644 hw/misc/allwinner-a10-ccm.c
>>  create mode 100644 hw/misc/allwinner-a10-dramc.c
>>  create mode 100644 hw/misc/allwinner-axp-209.c
>>  create mode 100644 include/hw/i2c/allwinner-i2c.h
>>  create mode 100644 include/hw/misc/allwinner-a10-ccm.h
>>  create mode 100644 include/hw/misc/allwinner-a10-dramc.h
>>
>> --
>> 2.30.2
>>
>
>
> --
> Niek Linnenbank
>



Re: [PATCH 6/6] hw/arm: Allwinner A10 enable SPL load from MMC

2022-12-08 Thread Strahinja Jankovic
On Wed, Dec 7, 2022 at 11:39 PM Niek Linnenbank
 wrote:
>
> Hi Strahinja,
>
>
> On Sun, Dec 4, 2022 at 12:19 AM Strahinja Jankovic 
>  wrote:
>>
>> This patch enables copying of SPL from MMC if `-kernel` parameter is not
>> passed when starting QEMU. SPL is copied to SRAM_A.
>>
>> The approach is reused from Allwinner H3 implementation.
>>
>> Tested with Armbian and custom Yocto image.
>>
>> Signed-off-by: Strahinja Jankovic 
>> ---
>>  hw/arm/allwinner-a10.c | 18 ++
>>  hw/arm/cubieboard.c|  5 +
>>  include/hw/arm/allwinner-a10.h | 21 +
>>  3 files changed, 44 insertions(+)
>>
>> diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
>> index 17e439777e..dc1966ff7a 100644
>> --- a/hw/arm/allwinner-a10.c
>> +++ b/hw/arm/allwinner-a10.c
>> @@ -24,7 +24,9 @@
>>  #include "sysemu/sysemu.h"
>>  #include "hw/boards.h"
>>  #include "hw/usb/hcd-ohci.h"
>> +#include "hw/loader.h"
>>
>> +#define AW_A10_SRAM_A_BASE  0x
>>  #define AW_A10_DRAMC_BASE   0x01c01000
>>  #define AW_A10_MMC0_BASE0x01c0f000
>>  #define AW_A10_CCM_BASE 0x01c2
>> @@ -38,6 +40,22 @@
>>  #define AW_A10_RTC_BASE 0x01c20d00
>>  #define AW_A10_I2C0_BASE0x01c2ac00
>>
>> +void allwinner_a10_bootrom_setup(AwA10State *s, BlockBackend *blk)
>> +{
>> +const int64_t rom_size = 32 * KiB;
>> +g_autofree uint8_t *buffer = g_new0(uint8_t, rom_size);
>> +
>> +if (blk_pread(blk, 8 * KiB, rom_size, buffer, 0) < 0) {
>> +error_setg(&error_fatal, "%s: failed to read BlockBackend data",
>> +   __func__);
>> +return;
>> +}
>> +
>> +rom_add_blob("allwinner-a10.bootrom", buffer, rom_size,
>> +  rom_size, AW_A10_SRAM_A_BASE,
>> +  NULL, NULL, NULL, NULL, false);
>> +}
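As an aside, the 8 KiB offset in the blk_pread() call above is where Allwinner boot ROMs expect the eGON SPL image on an SD card. A quick host-side sanity check of a card image can be sketched in Python; the helper name below is made up for illustration, while the 8 KiB offset and the "eGON.BT0" magic follow the linux-sunxi documentation:

```python
import os
import tempfile

def looks_like_egon_spl(image_path: str) -> bool:
    """Check whether an SD card image carries an Allwinner eGON SPL at the
    8 KiB offset that the A10 boot ROM (and the patch above) reads from."""
    with open(image_path, "rb") as f:
        f.seek(8 * 1024)
        header = f.read(12)
    # eGON header layout: 4-byte branch instruction, then "eGON.BT0" magic.
    return len(header) == 12 and header[4:12] == b"eGON.BT0"

# Synthetic demo image: 8 KiB of padding, a dummy branch insn, the magic.
demo = os.path.join(tempfile.mkdtemp(), "demo.img")
with open(demo, "wb") as f:
    f.write(b"\x00" * (8 * 1024) + b"\x0a\x00\x00\xea" + b"eGON.BT0")

has_spl = looks_like_egon_spl(demo)
```

Running it against an Armbian or Yocto image like the ones mentioned in the cover letter should tell at a glance whether the emulated boot ROM has anything to load.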
>
>
> It's probably fine for now to do it in the same way here for the A10 indeed. 
> Perhaps in the future, we can try
> to share some overlapping code between the A10 and H3.

That definitely makes sense. I plan on submitting support for A20
after this patch set, so maybe that would be a good opportunity to
refactor the Allwinner support in QEMU.

Best regards,
Strahinja


>
> So the patch looks fine to me:
> Reviewed-by: Niek Linnenbank 
>
> Regards,
> Niek
>
>>
>> +
>>  static void aw_a10_init(Object *obj)
>>  {
>>  AwA10State *s = AW_A10(obj);
>> diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
>> index afc7980414..37659c35fd 100644
>> --- a/hw/arm/cubieboard.c
>> +++ b/hw/arm/cubieboard.c
>> @@ -99,6 +99,11 @@ static void cubieboard_init(MachineState *machine)
>>  memory_region_add_subregion(get_system_memory(), AW_A10_SDRAM_BASE,
>>  machine->ram);
>>
>> +/* Load target kernel or start using BootROM */
>> +if (!machine->kernel_filename && blk && blk_is_available(blk)) {
>> +/* Use Boot ROM to copy data from SD card to SRAM */
>> +allwinner_a10_bootrom_setup(a10, blk);
>> +}
>>  /* TODO create and connect IDE devices for ide_drive_get() */
>>
>>  cubieboard_binfo.ram_size = machine->ram_size;
>> diff --git a/include/hw/arm/allwinner-a10.h b/include/hw/arm/allwinner-a10.h
>> index 763935fca9..b3c9ed24c7 100644
>> --- a/include/hw/arm/allwinner-a10.h
>> +++ b/include/hw/arm/allwinner-a10.h
>> @@ -15,6 +15,7 @@
>>  #include "hw/misc/allwinner-a10-ccm.h"
>>  #include "hw/misc/allwinner-a10-dramc.h"
>>  #include "hw/i2c/allwinner-i2c.h"
>> +#include "sysemu/block-backend.h"
>>
>>  #include "target/arm/cpu.h"
>>  #include "qom/object.h"
>> @@ -47,4 +48,24 @@ struct AwA10State {
>>  OHCISysBusState ohci[AW_A10_NUM_USB];
>>  };
>>
>> +/**
>> + * Emulate Boot ROM firmware setup functionality.
>> + *
>> + * A real Allwinner A10 SoC contains a Boot ROM
>> + * which is the first code that runs right after
>> + * the SoC is powered on. The Boot ROM is responsible
>> + * for loading user code (e.g. a bootloader) from any
>> + * of the supported external devices and writing the
>> + * downloaded code to internal SRAM. After loading the SoC
>> + * begins executing the code written to SRAM.
>> + *
>> + * This function emulates the Boot ROM by copying 32 KiB
>> + * of data from the given block device and writes it to
>> + * the start of the first internal SRAM memory.
>> + *
>> + * @s: Allwinner A10 state object pointer
>> + * @blk: Block backend device object pointer
>> + */
>> +void allwinner_a10_bootrom_setup(AwA10State *s, BlockBackend *blk);
>> +
>>  #endif
>> --
>> 2.30.2
>>
>
>
> --
> Niek Linnenbank
>



Re: [PATCH 3/6] hw/i2c: Allwinner TWI/I2C Emulation

2022-12-08 Thread Strahinja Jankovic
Hi Niek,

On Wed, Dec 7, 2022 at 11:06 PM Niek Linnenbank
 wrote:
>
> Hi Strahinja,
>
> On Sun, Dec 4, 2022 at 12:19 AM Strahinja Jankovic 
>  wrote:
>>
>> This patch implements Allwinner TWI/I2C controller emulation. Only
>> master-mode functionality is implemented.
>>
>> The SPL boot for Cubieboard expects AXP209 PMIC on TWI0/I2C0 bus, so this is
>> first part enabling the TWI/I2C bus operation.
>>
>> Since both Allwinner A10 and H3 use the same module, it is added for
>> both boards.
>
>
> The A10 and H3 datasheets have the same introduction text on the TWI, 
> suggesting re-use indeed. Unfortunately
> the A10 datasheet seems to be missing register documentation, so I can't 
> compare that with the H3 datasheet.

The A10 register documentation for TWI exists in
https://linux-sunxi.org/File:Allwinner_A10_User_manual_V1.5.pdf user
manual (unfortunately, register description for many other modules is
missing). From what I could see, the description matches the H3, so I
thought it would be good to use the same implementation.

>
> At least according to what is implemented in the Linux kernel, it looks like 
> both SoCs indeed implement the same I2C module.
> The file drivers/i2c/busses/i2c-mv64xxx.c has the following 
> mv64xxx_i2c_of_match_table:
> { .compatible = "allwinner,sun4i-a10-i2c", .data = 
> &mv64xxx_i2c_regs_sun4i},
> { .compatible = "allwinner,sun6i-a31-i2c", .data = 
> &mv64xxx_i2c_regs_sun4i},
>
> And both SoCs define the sun4i-a10-i2c and sun6i-a31-i2c in their device tree 
> files, respectively.
>
> Could you please also update the documentation files for both boards, so we 
> can show that they now support TWI/I2C?
>   docs/system/arm/cubieboard.rst
>   docs/system/arm/orangepi.rst

Yes, I will update these documents in V2. I will also try to update
the description for the cubieboard to have some examples on how to run
the emulation.

>
>>
>>
>> Signed-off-by: Strahinja Jankovic 
>> ---
>>  hw/arm/Kconfig |   2 +
>>  hw/arm/allwinner-a10.c |   8 +
>>  hw/arm/allwinner-h3.c  |  11 +-
>>  hw/i2c/Kconfig |   4 +
>>  hw/i2c/allwinner-i2c.c | 417 +
>>  hw/i2c/meson.build |   1 +
>>  include/hw/arm/allwinner-a10.h |   2 +
>>  include/hw/arm/allwinner-h3.h  |   3 +
>>  include/hw/i2c/allwinner-i2c.h | 112 +
>>  9 files changed, 559 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/i2c/allwinner-i2c.c
>>  create mode 100644 include/hw/i2c/allwinner-i2c.h
>>
>> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
>> index 140f142ae5..eefe1fd134 100644
>> --- a/hw/arm/Kconfig
>> +++ b/hw/arm/Kconfig
>> @@ -322,6 +322,7 @@ config ALLWINNER_A10
>>  select ALLWINNER_A10_CCM
>>  select ALLWINNER_A10_DRAMC
>>  select ALLWINNER_EMAC
>> +select ALLWINNER_I2C
>>  select SERIAL
>>  select UNIMP
>>
>> @@ -329,6 +330,7 @@ config ALLWINNER_H3
>>  bool
>>  select ALLWINNER_A10_PIT
>>  select ALLWINNER_SUN8I_EMAC
>> +select ALLWINNER_I2C
>>  select SERIAL
>>  select ARM_TIMER
>>  select ARM_GIC
>> diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
>> index a5f7a36ac9..17e439777e 100644
>> --- a/hw/arm/allwinner-a10.c
>> +++ b/hw/arm/allwinner-a10.c
>> @@ -36,6 +36,7 @@
>>  #define AW_A10_OHCI_BASE0x01c14400
>>  #define AW_A10_SATA_BASE0x01c18000
>>  #define AW_A10_RTC_BASE 0x01c20d00
>> +#define AW_A10_I2C0_BASE0x01c2ac00
>>
>>  static void aw_a10_init(Object *obj)
>>  {
>> @@ -56,6 +57,8 @@ static void aw_a10_init(Object *obj)
>>
>>  object_initialize_child(obj, "sata", &s->sata, TYPE_ALLWINNER_AHCI);
>>
>> +object_initialize_child(obj, "i2c0", &s->i2c0, TYPE_AW_I2C);
>> +
>>  if (machine_usb(current_machine)) {
>>  int i;
>>
>> @@ -176,6 +179,11 @@ static void aw_a10_realize(DeviceState *dev, Error 
>> **errp)
>>  /* RTC */
>>  sysbus_realize(SYS_BUS_DEVICE(&s->rtc), &error_fatal);
>>  sysbus_mmio_map_overlap(SYS_BUS_DEVICE(&s->rtc), 0, AW_A10_RTC_BASE, 
>> 10);
>> +
>> +/* I2C */
>> +sysbus_realize(SYS_BUS_DEVICE(&s->i2c0), &error_fatal);
>> +sysbus_mmio_map(SYS_BUS_DEVICE(&s->i2c0), 0, AW_A10_I2C0_BASE);
>> +sysbus_connect_irq(SYS_BUS_DEVICE(&s->i2c0), 0, qdev_get_gpio_in(dev, 
>> 7));
>>  }
>>
>>  static void aw_a10_class_init(ObjectClass *oc, void *data)
>> diff --git a/hw/arm/allwinner-h3.c b/hw/arm/allwinner-h3.c
>> index 308ed15552..bfce3c8d92 100644
>> --- a/hw/arm/allwinner-h3.c
>> +++ b/hw/arm/allwinner-h3.c
>> @@ -53,6 +53,7 @@ const hwaddr allwinner_h3_memmap[] = {
>>  [AW_H3_DEV_UART1]  = 0x01c28400,
>>  [AW_H3_DEV_UART2]  = 0x01c28800,
>>  [AW_H3_DEV_UART3]  = 0x01c28c00,
>> +[AW_H3_DEV_TWI0]   = 0x01c2ac00,
>>  [AW_H3_DEV_EMAC]   = 0x01c3,
>>  [AW_H3_DEV_DRAMCOM]= 0x01c62000,
>>  [AW_H3_DEV_DRAMCTL]= 0x01c63000,
>> @@ -106,7 +107,6 @@ struct AwH3Unimplemented {
>>  { "uart1

Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support

2022-12-08 Thread Guenter Roeck
On Thu, Dec 08, 2022 at 09:08:12AM +0100, Klaus Jensen wrote:
> On Dec  8 08:16, Klaus Jensen wrote:
> > On Dec  7 09:49, Guenter Roeck wrote:
> > > Hi,
> > > 
> > > On Thu, Jun 16, 2022 at 08:34:07PM +0800, Jinhao Fan wrote:
> > > > Implement Doorbell Buffer Config command (Section 5.7 in NVMe Spec 1.3)
> > > > and Shadow Doorbell buffer & EventIdx buffer handling logic (Section 7.13
> > > > in NVMe Spec 1.3). For queues created before the Doorbell Buffer Config
> > > > command, the nvme_dbbuf_config function tries to associate each existing
> > > > SQ and CQ with its Shadow Doorbell buffer and EventIdx buffer address.
> > > > Queues created after the Doorbell Buffer Config command will have the
> > > > doorbell buffers associated with them when they are initialized.
> > > > 
> > > > In nvme_process_sq and nvme_post_cqe, proactively check for Shadow
> > > > Doorbell buffer changes instead of wait for doorbell register changes.
> > > > This reduces the number of MMIOs.
> > > > 
> > > > In nvme_process_db(), update the shadow doorbell buffer value with
> > > > the doorbell register value if it is the admin queue. This is a hack
> > > > since hosts like Linux NVMe driver and SPDK do not use shadow
> > > > doorbell buffer for the admin queue. Copying the doorbell register
> > > > value to the shadow doorbell buffer allows us to support these hosts
> > > > as well as spec-compliant hosts that use shadow doorbell buffer for
> > > > the admin queue.
> > > > 
> > > > Signed-off-by: Jinhao Fan 
> > > 
> > > I noticed that I can no longer boot Linux kernels from nvme on riscv64
> > > systems. The problem is seen with qemu v7.1 and qemu v7.2-rc4.
> > > The log shows:
> > > 
> > > [   35.904128] nvme nvme0: I/O 642 (I/O Cmd) QID 1 timeout, aborting
> > > [   35.905000] EXT4-fs (nvme0n1): mounting ext2 file system using the 
> > > ext4 subsystem
> > > [   66.623863] nvme nvme0: I/O 643 (I/O Cmd) QID 1 timeout, aborting
> > > [   97.343989] nvme nvme0: Abort status: 0x0
> > > [   97.344355] nvme nvme0: Abort status: 0x0
> > > [   97.344647] nvme nvme0: I/O 7 QID 0 timeout, reset controller
> > > [   97.350568] nvme nvme0: I/O 644 (I/O Cmd) QID 1 timeout, aborting
> > > 
> > > This is with the mainline Linux kernel (v6.1-rc8).
> > > 
> > > Bisect points to this patch. Reverting this patch and a number of 
> > > associated
> > > patches (to fix conflicts) fixes the problem.
> > > 
> > > 06143d8771 Revert "hw/nvme: Implement shadow doorbell buffer support"
> > > acb4443e3a Revert "hw/nvme: Use ioeventfd to handle doorbell updates"
> > > d5fd309feb Revert "hw/nvme: do not enable ioeventfd by default"
> > > 1ca1e6c47c Revert "hw/nvme: unregister the event notifier handler on the 
> > > main loop"
> > > 2d26abd51e Revert "hw/nvme: skip queue processing if notifier is cleared"
> > > 99d411b5a5 Revert "hw/nvme: reenable cqe batching"
> > > 2293d3ca6c Revert "hw/nvme: Add trace events for shadow doorbell buffer"
> > > 
> > > Qemu command line:
> > > 
> > > qemu-system-riscv64 -M virt -m 512M \
> > >  -kernel arch/riscv/boot/Image -snapshot \
> > >  -device nvme,serial=foo,drive=d0 \
> > >  -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > >  -bios default \
> > >  -append "root=/dev/nvme0n1 console=ttyS0,115200 
> > > earlycon=uart8250,mmio,0x1000,115200" \
> > >  -nographic -monitor none
> > > 
> > > Guenter
> > 
> > Hi Guenter,
> > 
> > Thanks for the bisect.
> > 
> > The shadow doorbell is also an obvious candidate for this regression. I
> > wonder if this could be a kernel bug, since we are not observing this on
> > other architectures. The memory barriers required are super finicky, but
> > in QEMU all the operations are associated with full memory barriers. The
> > barriers are more fine grained in the kernel though.
> > 
> > I will dig into this together with Keith.
> 
> A cq head doorbell mmio is skipped... And it is not the fault of the
> kernel. The kernel is in its good right to skip the mmio since the cq
> eventidx is not properly updated.
> 
> Adding that and it boots properly on riscv. But I'm perplexed as to why
> this didn't show up on our regularly tested platforms.
> 
> Gonna try to get this in for 7.2!

I see another problem with sparc64.

[5.261508] could not locate request for tag 0x0
[5.261711] nvme nvme0: invalid id 0 completed on queue 1

That is seen repeatedly until the request times out. I'll test with
your patch to see if it resolves this problem as well, and will bisect
otherwise.

Guenter



Re: [PATCH] FreeBSD: Upgrade to 12.4 release

2022-12-08 Thread Warner Losh
On Thu, Dec 8, 2022 at 12:47 AM Philippe Mathieu-Daudé 
wrote:

> On 8/12/22 07:52, Brad Smith wrote:
> > FreeBSD: Upgrade to 12.4 release
> >
> > Signed-off-by: Brad Smith 
> > ---
> >   .gitlab-ci.d/cirrus.yml | 2 +-
> >   tests/vm/freebsd| 4 ++--
> >   2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml
> > index 634a73a742..785b163aa6 100644
> > --- a/.gitlab-ci.d/cirrus.yml
> > +++ b/.gitlab-ci.d/cirrus.yml
> > @@ -50,7 +50,7 @@ x64-freebsd-12-build:
> >   NAME: freebsd-12
> >   CIRRUS_VM_INSTANCE_TYPE: freebsd_instance
> >   CIRRUS_VM_IMAGE_SELECTOR: image_family
> > -CIRRUS_VM_IMAGE_NAME: freebsd-12-3
> > +CIRRUS_VM_IMAGE_NAME: freebsd-12-4
> >   CIRRUS_VM_CPUS: 8
> >   CIRRUS_VM_RAM: 8G
> >   UPDATE_COMMAND: pkg update
> > diff --git a/tests/vm/freebsd b/tests/vm/freebsd
> > index d6ff4461ba..ba2ba23d24 100755
> > --- a/tests/vm/freebsd
> > +++ b/tests/vm/freebsd
> > @@ -28,8 +28,8 @@ class FreeBSDVM(basevm.BaseVM):
> >   name = "freebsd"
> >   arch = "x86_64"
> >
> > -link = "
> https://download.freebsd.org/ftp/releases/ISO-IMAGES/12.3/FreeBSD-12.3-RELEASE-amd64-disc1.iso.xz
> "
> > -csum =
> "36dd0de50f1fe5f0a88e181e94657656de26fb64254412f74e80e128e8b938b4"
> > +link = "
> https://download.freebsd.org/ftp/releases/ISO-IMAGES/12.4/FreeBSD-12.4-RELEASE-amd64-disc1.iso.xz
> "
> > +csum =
> "1dcf6446e31bf3f81b582e9aba3319a258c29a937a2af6138ee4b181ed719a87"
>
> I don't remember and wonder why we don't use the pre-populated image:
>
> https://download.freebsd.org/ftp/releases/VM-IMAGES/12.4-RELEASE/amd64/Latest/FreeBSD-12.4-RELEASE-amd64.qcow2.xz


QEMU's CI pre-dates the FreeBSD project producing those images. I don't
think there's a big technical reason not to use them, though some of the
scripting would need to change (mostly, I think, to delete things, and
maybe to more directly change config files to effect some of the settings
done via the installer).


> Anyhow,
>
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
>

Reviewed-by: Warner Losh 


Re: [PATCH v2 3/3] hw/nvme: fix missing cq eventidx update

2022-12-08 Thread Guenter Roeck
On Thu, Dec 08, 2022 at 01:26:42PM +0100, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Prior to reading the shadow doorbell cq head, we have to update the
> eventidx. Otherwise, we risk that the driver will skip an mmio doorbell
> write. This happens on riscv64, as reported by Guenter.
> 
> Adding the missing update to the cq eventidx fixes the issue.
> 
> Fixes: 3f7fe8de3d49 ("hw/nvme: Implement shadow doorbell buffer support")
> Cc: qemu-sta...@nongnu.org
> Cc: qemu-ri...@nongnu.org
> Reported-by: Guenter Roeck 
> Signed-off-by: Klaus Jensen 

Tested-by: Guenter Roeck 

> ---
>  hw/nvme/ctrl.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index cfab21b3436e..f6cc766aba4a 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -1334,6 +1334,14 @@ static inline void nvme_blk_write(BlockBackend *blk, 
> int64_t offset,
>  }
>  }
>  
> +static void nvme_update_cq_eventidx(const NvmeCQueue *cq)
> +{
> +trace_pci_nvme_update_cq_eventidx(cq->cqid, cq->head);
> +
> +pci_dma_write(PCI_DEVICE(cq->ctrl), cq->ei_addr, &cq->head,
> +  sizeof(cq->head));
> +}
> +
>  static void nvme_update_cq_head(NvmeCQueue *cq)
>  {
>  pci_dma_read(PCI_DEVICE(cq->ctrl), cq->db_addr, &cq->head,
> @@ -1355,6 +1363,7 @@ static void nvme_post_cqes(void *opaque)
>  hwaddr addr;
>  
>  if (n->dbbuf_enabled) {
> +nvme_update_cq_eventidx(cq);
>  nvme_update_cq_head(cq);
>  }
>  



Re: [PATCH] scripts/archive-source: Use more portable argument with tar command

2022-12-08 Thread Philippe Mathieu-Daudé

On 8/12/22 18:15, Daniel P. Berrangé wrote:

On Thu, Dec 08, 2022 at 05:20:51PM +0100, Philippe Mathieu-Daudé wrote:

When using the archive-source.sh script on Darwin we get:

   tar: Option --concatenate is not supported
   Usage:
 List:tar -tf 
 Extract: tar -xf 
 Create:  tar -cf  [filenames...]
 Help:tar --help

Replace the long argument added by commit 8fc76176f6 ("scripts: use
git-archive in archive-source") by their short form to keep this
script functional.


Or install a better tar implementation from brew ?

   https://formulae.brew.sh/formula/gnu-tar


Good idea, this works for me:

-- >8 --
diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
index 23e042dacd..150bdf5536 100755
--- a/scripts/archive-source.sh
+++ b/scripts/archive-source.sh
@@ -20,2 +20,3 @@ fi

+tar=$(command -v gtar || command -v tar)
 tar_file=$(realpath "$1")
@@ -69,3 +70,3 @@ for sm in $submodules; do
 test $? -ne 0 && error "failed to archive submodule $sm ($smhash)"
-tar --concatenate --file "$tar_file" "$sub_file"
+$tar --concatenate --file "$tar_file" "$sub_file"
 test $? -ne 0 && error "failed append submodule $sm to $tar_file"
---



Re: [PATCH 1/4] coroutine: Clean up superfluous inclusion of qemu/coroutine.h

2022-12-08 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> Probably because block layer, aio.h, and coroutine_int.h header files
> already include "qemu/coroutine.h"?

Mostly, but not always.  For instance, crypto/block-luks-priv.h compiles
fine without it, and doesn't include it after this patch.

> Reviewed-by: Stefan Hajnoczi 

Thanks!




Re: [PATCH qemu.git 0/1] hw/arm/virt: add 2x sp804 timer

2022-12-08 Thread Axel Heider

Peter,



For the seL4 specific case, this is currently not possible in
the standard configuration. It's only exposed for a special
debug and benchmarking configuration.


It's not clear to me what you mean here -- the generic
timer in the CPU exists in all configurations, so there
should be no obstacle to seL4 using it.


Access is not exposed to userland in the standard configuration
and the standard kernel API has no timeouts besides zero and
infinite. It's a design thing in the end. Nothing that could not
be hacked around or be changed in the design in the long run. But
my goal is not to hack around, but to have a "proper" machine
simulation instead. Which basically boils down to having a generic
machine in mainline that has a few more customization options.


The really cool customization option would be passing a DTB
to QEMU that describes exactly what "virt" machine is to be
emulated.


This is a firm "no" -- it sounds on the surface like a good
idea but it doesn't actually work in practice -- DTB files
don't provide enough info to be able to build a board from,
except in some specific restricted situations like the Xilinx
one.


I can see the point. But what about supporting an overlay DTB
that takes a stripped-down virt machine as base? This might avoid
some limitations. In the long run, customization via a DTB still seems
better than adding parameters to the command line. For the
short term, a few more command line options seem good enough.

What is the general feeling about having a more general system
emulation option when it comes to the "virt" machine, and a way
of resolving the usage (and security) conflict with the KVM
use case?

Axel






Re: [PATCH] scripts/archive-source: Use more portable argument with tar command

2022-12-08 Thread Daniel P . Berrangé
On Thu, Dec 08, 2022 at 05:20:51PM +0100, Philippe Mathieu-Daudé wrote:
> When using the archive-source.sh script on Darwin we get:
> 
>   tar: Option --concatenate is not supported
>   Usage:
> List:tar -tf 
> Extract: tar -xf 
> Create:  tar -cf  [filenames...]
> Help:tar --help
> 
> Replace the long argument added by commit 8fc76176f6 ("scripts: use
> git-archive in archive-source") by their short form to keep this
> script functional.

Or install a better tar implementation from brew ?

  https://formulae.brew.sh/formula/gnu-tar


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] scripts/archive-source: Use more portable argument with tar command

2022-12-08 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> When using the archive-source.sh script on Darwin we get:
>
>   tar: Option --concatenate is not supported
>   Usage:
> List:tar -tf 
> Extract: tar -xf 
> Create:  tar -cf  [filenames...]
> Help:tar --help
>
> Replace the long argument added by commit 8fc76176f6 ("scripts: use
> git-archive in archive-source") by their short form to keep this
> script functional.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  scripts/archive-source.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
> index 23e042dacd..6a710a212e 100755
> --- a/scripts/archive-source.sh
> +++ b/scripts/archive-source.sh
> @@ -67,7 +67,7 @@ for sm in $submodules; do
>  esac
>  (cd $sm; git archive --format tar --prefix "$sm/" $(tree_ish)) > 
> "$sub_file"
>  test $? -ne 0 && error "failed to archive submodule $sm ($smhash)"
> -tar --concatenate --file "$tar_file" "$sub_file"
> +tar -c -f "$tar_file" "$sub_file"

I'm not sure that is correct. The GNU short form for --concatenate is
-A; -c specifically means create, so I suspect you end up re-creating
the tarball rather than adding to it.

>  test $? -ne 0 && error "failed append submodule $sm to $tar_file"
>  done
>  exit 0


-- 
Alex Bennée



Re: [PATCH qemu.git 0/1] hw/arm/virt: add 2x sp804 timer

2022-12-08 Thread Peter Maydell
On Thu, 8 Dec 2022 at 16:59, Axel Heider  wrote:
>
> Peter,
>
> >> This patch adds timer peripherals to the arm-virt machine.
> > Is there a reason you can't use the CPU's built-in generic timer
> > device ? That is what typical guest code does on this system.
> > I'm a bit reluctant to add more devices to the virt board
> > because over time it gradually gets increasingly complicated,
> > and every new device model we expose to the guest is another
> > thing that's part of the security attack surface for guest
> > code trying to escape from a KVM VM.
>
> For the seL4 specific case, this is currently not possible in
> the standard configuration. It's only exposed for a special
> debug and benchmarking configuration.

It's not clear to me what you mean here -- the generic
timer in the CPU exists in all configurations, so there
should be no obstacle to seL4 using it.

> The catch here is that the virt machine is also a nice generic
> ARM (and RISC-V) machine for OS testing purposes, but it
> sometimes lacks things (see my other patches for the UART). So
> I wonder what would be the best option to continue here. Should
> we consider defining another generic machine profile that is
> better suited to the system emulation use case? That is what OS
> developers could use then. Or could the virt machine get some
> config parameters to customize it further, so that the
> "Machine-specific options" would support an "sp804=on" switch
> that adds two timer peripherals?
>
> The really cool customization option would be passing a DTB
> to QEMU that describes exactly what "virt" machine is to be
> emulated.

This is a firm "no" -- it sounds on the surface like a good
idea but it doesn't actually work in practice -- DTB files
don't provide enough info to be able to build a board from,
except in some specific restricted situations like the Xilinx one.

-- PMM



Re: [PATCH qemu.git 0/1] hw/arm/virt: add 2x sp804 timer

2022-12-08 Thread Axel Heider

Peter,


This patch adds timer peripherals to the arm-virt machine.

Is there a reason you can't use the CPU's built-in generic timer
device ? That is what typical guest code does on this system.
I'm a bit reluctant to add more devices to the virt board
because over time it gradually gets increasingly complicated,
and every new device model we expose to the guest is another
thing that's part of the security attack surface for guest
code trying to escape from a KVM VM.


For the seL4 specific case, this is currently not possible in
the standard configuration. It's only exposed for a special
debug and benchmarking configuration.

The catch here is that the virt machine is also a nice generic
ARM (and RISC-V) machine for OS testing purposes, but it
sometimes lacks things (see my other patches for the UART). So
I wonder what would be the best option to continue here. Should
we consider defining another generic machine profile that is
better suited to the system emulation use case? That is what OS
developers could use then. Or could the virt machine get some
config parameters to customize it further, so that the
"Machine-specific options" would support an "sp804=on" switch
that adds two timer peripherals?

The really cool customization option would be passing a DTB to
QEMU that describes exactly what "virt" machine is to be
emulated. I think the Xilinx fork used to support this feature
partly. Not sure if there was ever an attempt to mainline it?
But it would avoid running into a command-parameter hell for
customization options.

Axel





Re: [PATCH] scripts/archive-source: Use more portable argument with tar command

2022-12-08 Thread Peter Maydell
On Thu, 8 Dec 2022 at 16:21, Philippe Mathieu-Daudé  wrote:
>
> When using the archive-source.sh script on Darwin we get:
>
>   tar: Option --concatenate is not supported
>   Usage:
> List:tar -tf 
> Extract: tar -xf 
> Create:  tar -cf  [filenames...]
> Help:tar --help
>
> Replace the long argument added by commit 8fc76176f6 ("scripts: use
> git-archive in archive-source") with its short form to keep this
> script functional.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  scripts/archive-source.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
> index 23e042dacd..6a710a212e 100755
> --- a/scripts/archive-source.sh
> +++ b/scripts/archive-source.sh
> @@ -67,7 +67,7 @@ for sm in $submodules; do
>  esac
>  (cd $sm; git archive --format tar --prefix "$sm/" $(tree_ish)) > 
> "$sub_file"
>  test $? -ne 0 && error "failed to archive submodule $sm ($smhash)"
> -tar --concatenate --file "$tar_file" "$sub_file"
> +tar -c -f "$tar_file" "$sub_file"

'-c' is not the short-form option of '--concatenate': that would
be '-A'. The problem is not long vs short options, but that
BSD-style tar does not support the --concatenate functionality at all.

>  test $? -ne 0 && error "failed append submodule $sm to $tar_file"
>  done
>  exit 0

thanks
-- PMM
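[Editorial note: BSD tar lacks `-A`/`--concatenate` entirely, as noted above. One portable way to merge one tar archive into another is to unpack the sub-archive and append its members with `-r`, which both GNU and BSD tar support. A minimal sketch under that assumption — illustrative file names only, not the actual archive-source.sh logic:]

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Build two small archives to stand in for $tar_file and $sub_file.
echo main > main.txt
echo sub > sub.txt
tar -cf main.tar main.txt
tar -cf sub.tar sub.txt

# Portable merge: extract the sub-archive, then append its members
# with -r (append), instead of concatenating archive-to-archive.
mkdir unpacked
tar -xf sub.tar -C unpacked
tar -rf main.tar -C unpacked sub.txt

tar -tf main.tar    # now lists both main.txt and sub.txt
```

Note the extra extraction step costs disk I/O that `-A` avoids, which is why `-A` is preferable where GNU tar is available.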



Re: [RFC PATCH-for-8.0] hw: Avoid using inlined functions with external linkage

2022-12-08 Thread Peter Maydell
On Thu, 8 Dec 2022 at 16:11, Philippe Mathieu-Daudé  wrote:
>
> When using Clang ("Apple clang version 14.0.0 (clang-1400.0.29.202)")
> and building with -Wall we get:
>
>   hw/arm/smmu-common.c:173:33: warning: static function 
> 'smmu_hash_remove_by_asid_iova' is used in an inline function with external 
> linkage [-Wstatic-in-inline]
>   hw/arm/smmu-common.h:170:1: note: use 'static' to give inline function 
> 'smmu_iotlb_inv_iova' internal linkage
> void smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
> ^
> static
>
> Nothing in our code base requires or uses inline functions with
> external linkage. Some places use internal inlining in the hot path.
> These two functions are certainly not in any hot path and don't
> justify any inlining.
>
> Reported-by: Stefan Weil 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC: Any better justification?

I don't really understand what the warning is trying to warn
about, and googling didn't enlighten me. Does anybody understand it?

In any case, it does seem weird to define a function inline and
also have it be defined in a C file rather than as a 'static inline'
in a header file, so these are likely oversights rather than
intentional.

> ---
>  hw/arm/smmu-common.c | 10 +-
>  hw/i386/x86.c|  3 +--
>  2 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index e09b9c13b7..298e021cd3 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -116,7 +116,7 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, 
> SMMUTLBEntry *new)
>  g_hash_table_insert(bs->iotlb, key, new);
>  }
>
> -inline void smmu_iotlb_inv_all(SMMUState *s)
> +void smmu_iotlb_inv_all(SMMUState *s)
>  {
>  trace_smmu_iotlb_inv_all();
>  g_hash_table_remove_all(s->iotlb);
> @@ -146,7 +146,7 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer 
> key, gpointer value,
> ((entry->iova & ~info->mask) == info->iova);
>  }
>
> -inline void
> +void
>  smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
>  uint8_t tg, uint64_t num_pages, uint8_t ttl)

While we're changing this, can we put the "void" on the same line as
the rest of the function prototype, to match the style of these other
functions ?

>  {
> @@ -174,7 +174,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t 
> iova,
>  &info);
>  }
>
> -inline void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
> +void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
>  {
>  trace_smmu_iotlb_inv_asid(asid);
>  g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid);
> @@ -374,7 +374,7 @@ error:
>   *
>   * return 0 on success
>   */
> -inline int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags 
> perm,
> +int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
>  SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)

This second line now needs re-indenting.

>  {
>  if (!cfg->aa64) {
> @@ -483,7 +483,7 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
>  }
>
>  /* Unmap all notifiers attached to @mr */
> -inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
> +void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
>  {
>  IOMMUNotifier *n;

> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 78cc131926..9ac1680180 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -64,8 +64,7 @@
>  /* Physical Address of PVH entry point read from kernel ELF NOTE */
>  static size_t pvh_start_addr;
>
> -inline void init_topo_info(X86CPUTopoInfo *topo_info,
> -   const X86MachineState *x86ms)
> +void init_topo_info(X86CPUTopoInfo *topo_info, const X86MachineState *x86ms)
>  {
>  MachineState *ms = MACHINE(x86ms);

This function is not used anywhere outside this file, so we
can delete the prototype from include/hw/i386/x86.h and
make the function "static void".

With those changes,
Reviewed-by: Peter Maydell 

thanks
-- PMM



[PATCH] scripts/archive-source: Use more portable argument with tar command

2022-12-08 Thread Philippe Mathieu-Daudé
When using the archive-source.sh script on Darwin we get:

  tar: Option --concatenate is not supported
  Usage:
List:tar -tf 
Extract: tar -xf 
Create:  tar -cf  [filenames...]
Help:tar --help

Replace the long argument added by commit 8fc76176f6 ("scripts: use
git-archive in archive-source") with its short form to keep this
script functional.

Signed-off-by: Philippe Mathieu-Daudé 
---
 scripts/archive-source.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
index 23e042dacd..6a710a212e 100755
--- a/scripts/archive-source.sh
+++ b/scripts/archive-source.sh
@@ -67,7 +67,7 @@ for sm in $submodules; do
 esac
 (cd $sm; git archive --format tar --prefix "$sm/" $(tree_ish)) > 
"$sub_file"
 test $? -ne 0 && error "failed to archive submodule $sm ($smhash)"
-tar --concatenate --file "$tar_file" "$sub_file"
+tar -c -f "$tar_file" "$sub_file"
 test $? -ne 0 && error "failed append submodule $sm to $tar_file"
 done
 exit 0
-- 
2.38.1




[RFC PATCH-for-8.0] hw: Avoid using inlined functions with external linkage

2022-12-08 Thread Philippe Mathieu-Daudé
When using Clang ("Apple clang version 14.0.0 (clang-1400.0.29.202)")
and building with -Wall we get:

  hw/arm/smmu-common.c:173:33: warning: static function 
'smmu_hash_remove_by_asid_iova' is used in an inline function with external 
linkage [-Wstatic-in-inline]
  hw/arm/smmu-common.h:170:1: note: use 'static' to give inline function 
'smmu_iotlb_inv_iova' internal linkage
void smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
^
static

Nothing in our code base requires or uses inline functions with
external linkage. Some places use internal inlining in the hot path.
These two functions are certainly not in any hot path and don't
justify any inlining.

Reported-by: Stefan Weil 
Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: Any better justification?
---
 hw/arm/smmu-common.c | 10 +-
 hw/i386/x86.c|  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index e09b9c13b7..298e021cd3 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -116,7 +116,7 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, 
SMMUTLBEntry *new)
 g_hash_table_insert(bs->iotlb, key, new);
 }
 
-inline void smmu_iotlb_inv_all(SMMUState *s)
+void smmu_iotlb_inv_all(SMMUState *s)
 {
 trace_smmu_iotlb_inv_all();
 g_hash_table_remove_all(s->iotlb);
@@ -146,7 +146,7 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, 
gpointer value,
((entry->iova & ~info->mask) == info->iova);
 }
 
-inline void
+void
 smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
 uint8_t tg, uint64_t num_pages, uint8_t ttl)
 {
@@ -174,7 +174,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
 &info);
 }
 
-inline void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
+void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
 {
 trace_smmu_iotlb_inv_asid(asid);
 g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid);
@@ -374,7 +374,7 @@ error:
  *
  * return 0 on success
  */
-inline int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
+int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
 SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
 {
 if (!cfg->aa64) {
@@ -483,7 +483,7 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
 }
 
 /* Unmap all notifiers attached to @mr */
-inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
+void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
 {
 IOMMUNotifier *n;
 
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 78cc131926..9ac1680180 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -64,8 +64,7 @@
 /* Physical Address of PVH entry point read from kernel ELF NOTE */
 static size_t pvh_start_addr;
 
-inline void init_topo_info(X86CPUTopoInfo *topo_info,
-   const X86MachineState *x86ms)
+void init_topo_info(X86CPUTopoInfo *topo_info, const X86MachineState *x86ms)
 {
 MachineState *ms = MACHINE(x86ms);
 
-- 
2.38.1




Re: [PATCH v4 0/2] Add OCP extended log to nvme QEMU

2022-12-08 Thread Joel Granados
ping.

Is the solution to the guid constant ok?

Best

On Fri, Nov 25, 2022 at 10:48:06AM +0100, Joel Granados wrote:
> The motivation and description are contained in the last patch in this set.
> Will copy paste it here for convenience:
> 
> In order to evaluate write amplification factor (WAF) within the storage
> stack it is important to know the number of bytes written to the
> controller. The existing SMART log value of Data Units Written is too
> coarse (given in units of 500 Kb) and so we add the SMART health
> information extended from the OCP specification (given in units of bytes).
> 
> To accommodate different vendor specific specifications like OCP, we add a
> multiplexing function (nvme_vendor_specific_log) which will route to the
> different log functions based on arguments and log ids. We only return the
> OCP extended smart log when the command is 0xC0 and ocp has been turned on
> in the args.
> 
> Though we add the whole nvme smart log extended structure, we only 
> populate
> the physical_media_units_{read,written}, log_page_version and
> log_page_uuid.
> 
> V4 changes:
> 1. Fixed cpu_to_le64 instead of cpu_to_le32
> 2. Variable naming : uuid -> guid
> 3. Changed how the guid value appears in the code:
>Used to be:
> smart_l.log_page_uuid[0] = 0xA4F2BFEA2810AFC5;
> smart_l.log_page_uuid[1] = 0xAFD514C97C6F4F9C;
> 
>Now is:
> static const uint8_t guid[16] = {
> 0xC5, 0xAF, 0x10, 0x28, 0xEA, 0xBF, 0xF2, 0xA4,
> 0x9C, 0x4F, 0x6F, 0x7C, 0xC9, 0x14, 0xD5, 0xAF
> };
> 
>This is different from what @klaus suggested because I want to keep it
>consistent to what nvme-cli currently implements. I think here we can
>either change both nvme-cli and this patch or leave the order of the
>bytes as they are here. This all depends on how you interpret the Spec
>(which is ambiguous)
> 
> V3 changes:
> 1. Corrected a bunch of checkpatch issues. Since I changed the first patch
>I did not include the reviewed-by.
> 2. Included some documentation in nvme.rst for the ocp argument
> 3. Squashed the ocp arg changes into the main patch.
> 4. Fixed several comments and an open parenthesis
> 5. Hex values are now in lower case.
> 6. Change the reserved format to rsvd
> 7. Made sure that NvmeCtrl is the first arg in all the functions.
> 8. Fixed comment on commit of main patch
> 
> V2 changes:
> 1. I moved the ocp parameter from the namespace to the subsystem as it is
>defined there in the OCP specification
> 2. I now accumulate statistics from all namespaces and report them back on
>the extended log as per the spec.
> 3. I removed the default case in the switch in nvme_vendor_specific_log as
>it does not have any special function.
> 
> Joel Granados (2):
>   nvme: Move adjustment of data_units{read,written}
>   nvme: Add physical writes/reads from OCP log
> 
>  docs/system/devices/nvme.rst |  7 
>  hw/nvme/ctrl.c   | 73 +---
>  hw/nvme/nvme.h   |  1 +
>  include/block/nvme.h | 36 ++
>  4 files changed, 111 insertions(+), 6 deletions(-)
> 
> -- 
> 2.30.2
> 




Re: [SeaBIOS] Re: [PATCH 4/4] be less conservative with the 64bit pci io window

2022-12-08 Thread Igor Mammedov
On Wed, 23 Nov 2022 11:25:08 +0100
Gerd Hoffmann  wrote:

> On Tue, Nov 22, 2022 at 01:43:16PM -0500, Kevin O'Connor wrote:
> > On Mon, Nov 21, 2022 at 11:32:13AM +0100, Gerd Hoffmann wrote:  
> > > Current seabios code will only enable and use the 64bit pci io window in
> > > case it runs out of space in the 32bit pci mmio window below 4G.
> > > 
> > > This patch will also enable the 64bit pci io window when
> > >   (a) RAM above 4G is present, and
> > >   (b) the physical address space size is known, and
> > >   (c) seabios is running on a 64bit capable processor.
> > > 
> > > This operates with the assumption that guests which are ok with memory
> > > above 4G most likely can handle mmio above 4G too.  
> > 
> > Thanks.  In general, the series looks good to me.  Can you elaborate
> > on the background to this change though?  It sounds like there is a
> > (small) risk of a regression, so I think it would be good to have a
> > high level understanding of what is driving this memory reorg.  
> 
> Well, the idea is to adapt to the world moving forward.  Running a
> 64-bit capable OS is standard these days, and the resources needed
> by devices (especially GPUs) are becoming larger and larger.
> 
> Yes, there is the risk that (old) guests are unhappy with their
> PCI bars suddenly being mapped above 4G.  Can happen only in case
> seabios handles pci initialization (i.e. when running on qemu,
> otherwise coreboot initializes the pci bars).  I hope the memory
> check handles the 'old guest' case: when the guest can't handle
> addresses above 4G it is unlikely that qemu is configured to have
> memory mapped above 4G ...

does it break 32-bit PAE-enabled guests
(which can have more than 4 GB of RAM configured)?

> 
> take care,
>   Gerd
> 
> ___
> SeaBIOS mailing list -- seab...@seabios.org
> To unsubscribe send an email to seabios-le...@seabios.org
> 




Re: [RFC PATCH] migration: reduce time of loading non-iterable vmstate

2022-12-08 Thread Peter Xu
On Thu, Dec 08, 2022 at 10:39:11PM +0800, Chuang Xu wrote:
> 
> On 2022/12/8 上午6:08, Peter Xu wrote:
> > On Thu, Dec 08, 2022 at 12:07:03AM +0800, Chuang Xu wrote:
> > > On 2022/12/6 上午12:28, Peter Xu wrote:
> > > > Chuang,
> > > > 
> > > > No worry on the delay; you're faster than when I read yours. :)
> > > > 
> > > > On Mon, Dec 05, 2022 at 02:56:15PM +0800, Chuang Xu wrote:
> > > > > > As a start, maybe you can try with poison 
> > > > > > address_space_to_flatview() (by
> > > > > > e.g. checking the start_pack_mr_change flag and assert it is not 
> > > > > > set)
> > > > > > during this process to see whether any call stack can even try to
> > > > > > dereference a flatview.
> > > > > > 
> > > > > > It's just that I didn't figure a good way to "prove" its validity, 
> > > > > > even if
> > > > > > I think this is an interesting idea worth thinking to shrink the 
> > > > > > downtime.
> > > > > Thanks for your sugguestions!
> > > > > I used a thread local variable to identify whether the current thread 
> > > > > is a
> > > > > migration thread(main thread of target qemu) and I modified the code 
> > > > > of
> > > > > qemu_coroutine_switch to make sure the thread local variable true 
> > > > > only in
> > > > > process_incoming_migration_co call stack. If the target qemu detects 
> > > > > that
> > > > > start_pack_mr_change is set and address_space_to_flatview() is called 
> > > > > in
> > > > > non-migrating threads or non-migrating coroutine, it will crash.
> > > > Are you using the thread var just to avoid the assert triggering in the
> > > > migration thread when commiting memory changes?
> > > > 
> > > > I think _maybe_ another cleaner way to sanity check this is directly 
> > > > upon
> > > > the depth:
> > > > 
> > > > static inline FlatView *address_space_to_flatview(AddressSpace *as)
> > > > {
> > > >   /*
> > > >* Before using any flatview, sanity check we're not during a 
> > > > memory
> > > >* region transaction or the map can be invalid.  Note that this 
> > > > can
> > > >* also be called during commit phase of memory transaction, but 
> > > > that
> > > >* should also only happen when the depth decreases to 0 first.
> > > >*/
> > > >   assert(memory_region_transaction_depth == 0);
> > > >   return qatomic_rcu_read(&as->current_map);
> > > > }
> > > > 
> > > > That should also cover the safe cases of memory transaction commits 
> > > > during
> > > > migration.
> > > > 
> > > Peter, I tried this way and found that the target qemu will crash.
> > > 
> > > Here is the gdb backtrace:
> > > 
> > > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> > > #1  0x7ff2929d851a in __GI_abort () at abort.c:118
> > > #2  0x7ff2929cfe67 in __assert_fail_base (fmt=, 
> > > assertion=assertion@entry=0x55a32578cdc0 "memory_region_transaction_depth 
> > > == 0", file=file@entry=0x55a32575d9b0 
> > > "/data00/migration/qemu-5.2.0/include/exec/memory.h",
> > >  line=line@entry=766, function=function@entry=0x55a32578d6e0 
> > > <__PRETTY_FUNCTION__.20463> "address_space_to_flatview") at assert.c:92
> > > #3  0x7ff2929cff12 in __GI___assert_fail 
> > > (assertion=assertion@entry=0x55a32578cdc0 
> > > "memory_region_transaction_depth == 0", file=file@entry=0x55a32575d9b0 
> > > "/data00/migration/qemu-5.2.0/include/exec/memory.h", line=line@entry=766,
> > >  function=function@entry=0x55a32578d6e0 <__PRETTY_FUNCTION__.20463> 
> > > "address_space_to_flatview") at assert.c:101
> > > #4  0x55a324b2ed5e in address_space_to_flatview (as=0x55a326132580 
> > > ) at 
> > > /data00/migration/qemu-5.2.0/include/exec/memory.h:766
> > > #5  0x55a324e79559 in address_space_to_flatview (as=0x55a326132580 
> > > ) at ../softmmu/memory.c:811
> > > #6  address_space_get_flatview (as=0x55a326132580 ) 
> > > at ../softmmu/memory.c:805
> > > #7  0x55a324e96474 in address_space_cache_init 
> > > (cache=cache@entry=0x55a32a4fb000, as=, 
> > > addr=addr@entry=68404985856, len=len@entry=4096, is_write=false) at 
> > > ../softmmu/physmem.c:3307
> > > #8  0x55a324ea9cba in virtio_init_region_cache (vdev=0x55a32985d9a0, 
> > > n=0) at ../hw/virtio/virtio.c:185
> > > #9  0x55a324eaa615 in virtio_load (vdev=0x55a32985d9a0, f=<optimized out>, version_id=<optimized out>) at ../hw/virtio/virtio.c:3203
> > > #10 0x55a324c6ab96 in vmstate_load_state (f=f@entry=0x55a329dc0c00, 
> > > vmsd=0x55a325fc1a60 , opaque=0x55a32985d9a0, 
> > > version_id=1) at ../migration/vmstate.c:143
> > > #11 0x55a324cda138 in vmstate_load (f=0x55a329dc0c00, 
> > > se=0x55a329941c90) at ../migration/savevm.c:913
> > > #12 0x55a324cdda34 in qemu_loadvm_section_start_full 
> > > (mis=0x55a3284ef9e0, f=0x55a329dc0c00) at ../migration/savevm.c:2741
> > > #13 qemu_loadvm_state_main (f=f@entry=0x55a329dc0c00, 
> > > mis=mis@entry=0x55a3284ef9e0) at ../migration/savevm.c:2939
> > > #14 0x55a324cdf66a in qemu_loadvm_state (f=0x55a329dc0c00) at 
> > > ../migration

Re: [PATCH-for-8.0 v2 2/4] gdbstub: Use vaddr type for generic insert/remove_breakpoint() API

2022-12-08 Thread Fabiano Rosas
Philippe Mathieu-Daudé  writes:

> Both insert/remove_breakpoint() handlers are used in system and
> user emulation. We cannot use the 'hwaddr' type in user emulation;
> we have to use 'vaddr', which is defined as "wide enough to contain
> any #target_ulong virtual address".
>
> gdbstub.c no longer needs to include "exec/hwaddr.h".
>
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Fabiano Rosas 



[PATCH] mailmap: Fix Stefan Weil author email

2022-12-08 Thread Philippe Mathieu-Daudé
Fix authorship of commits 266aaedc37~..ac14949821. See commit
3bd2608db7 ("maint: Add .mailmap entries for patches claiming
list authorship") for rationale.

Signed-off-by: Philippe Mathieu-Daudé 
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 35dddbe27b..fad2aff5aa 100644
--- a/.mailmap
+++ b/.mailmap
@@ -45,6 +45,7 @@ Ed Swierk  Ed Swierk via 
Qemu-devel  Ian McKellar via Qemu-devel 

 Julia Suvorova  Julia Suvorova via Qemu-devel 

 Justin Terry (VM)  Justin Terry (VM) via Qemu-devel 

+Stefan Weil  Stefan Weil via 
 
 # Next, replace old addresses by a more recent one.
 Aleksandar Markovic  

-- 
2.38.1




Re: [PATCH-for-8.0 v2 0/4] target/cpu: System/User cleanups around hwaddr/vaddr

2022-12-08 Thread Richard Henderson

On 12/8/22 09:35, Philippe Mathieu-Daudé wrote:

Philippe Mathieu-Daudé (4):
   cputlb: Restrict SavedIOTLB to system emulation
   gdbstub: Use vaddr type for generic insert/remove_breakpoint() API
   target/cpu: Restrict cpu_get_phys_page_debug() handlers to sysemu
   target/sparc/sysemu: Remove pointless CONFIG_USER_ONLY guard


Reviewed-by: Richard Henderson 


r~



[PATCH-for-8.0 v2 4/4] target/sparc/sysemu: Remove pointless CONFIG_USER_ONLY guard

2022-12-08 Thread Philippe Mathieu-Daudé
Commit caac44a52a ("target/sparc: Make sparc_cpu_tlb_fill sysemu
only") restricted mmu_helper.c to system emulation. Checking
whether CONFIG_USER_ONLY is defined is now pointless.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/sparc/mmu_helper.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index 919448a494..a7e51e4b7d 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -924,7 +924,6 @@ hwaddr sparc_cpu_get_phys_page_debug(CPUState *cs, vaddr 
addr)
 return phys_addr;
 }
 
-#ifndef CONFIG_USER_ONLY
 G_NORETURN void sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
   MMUAccessType access_type,
   int mmu_idx,
@@ -942,4 +941,3 @@ G_NORETURN void sparc_cpu_do_unaligned_access(CPUState *cs, 
vaddr addr,
 
 cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
 }
-#endif /* !CONFIG_USER_ONLY */
-- 
2.38.1




[PATCH-for-8.0 v2 3/4] target/cpu: Restrict cpu_get_phys_page_debug() handlers to sysemu

2022-12-08 Thread Philippe Mathieu-Daudé
The 'hwaddr' type is only available / meaningful on system emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/alpha/cpu.h| 2 +-
 target/cris/cpu.h | 3 +--
 target/hppa/cpu.h | 2 +-
 target/m68k/cpu.h | 2 +-
 target/nios2/cpu.h| 2 +-
 target/openrisc/cpu.h | 3 ++-
 target/ppc/cpu.h  | 2 +-
 target/rx/cpu.h   | 2 +-
 target/rx/helper.c| 4 ++--
 target/sh4/cpu.h  | 2 +-
 target/sparc/cpu.h| 3 ++-
 target/xtensa/cpu.h   | 2 +-
 12 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
index d0abc949a8..5e67304d81 100644
--- a/target/alpha/cpu.h
+++ b/target/alpha/cpu.h
@@ -276,9 +276,9 @@ extern const VMStateDescription vmstate_alpha_cpu;
 
 void alpha_cpu_do_interrupt(CPUState *cpu);
 bool alpha_cpu_exec_interrupt(CPUState *cpu, int int_req);
+hwaddr alpha_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 #endif /* !CONFIG_USER_ONLY */
 void alpha_cpu_dump_state(CPUState *cs, FILE *f, int flags);
-hwaddr alpha_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 int alpha_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int alpha_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 
diff --git a/target/cris/cpu.h b/target/cris/cpu.h
index e6776f25b1..71fa1f96e0 100644
--- a/target/cris/cpu.h
+++ b/target/cris/cpu.h
@@ -193,12 +193,11 @@ bool cris_cpu_exec_interrupt(CPUState *cpu, int int_req);
 bool cris_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
MMUAccessType access_type, int mmu_idx,
bool probe, uintptr_t retaddr);
+hwaddr cris_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 #endif
 
 void cris_cpu_dump_state(CPUState *cs, FILE *f, int flags);
 
-hwaddr cris_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
-
 int crisv10_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int cris_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int cris_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index 6f3b6beecf..b595ef25a9 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -322,11 +322,11 @@ static inline void cpu_hppa_change_prot_id(CPUHPPAState 
*env) { }
 void cpu_hppa_change_prot_id(CPUHPPAState *env);
 #endif
 
-hwaddr hppa_cpu_get_phys_page_debug(CPUState *cs, vaddr addr);
 int hppa_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int hppa_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void hppa_cpu_dump_state(CPUState *cs, FILE *f, int);
 #ifndef CONFIG_USER_ONLY
+hwaddr hppa_cpu_get_phys_page_debug(CPUState *cs, vaddr addr);
 bool hppa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
MMUAccessType access_type, int mmu_idx,
bool probe, uintptr_t retaddr);
diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 3a9cfe2f33..68ed531fc3 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -176,9 +176,9 @@ struct ArchCPU {
 #ifndef CONFIG_USER_ONLY
 void m68k_cpu_do_interrupt(CPUState *cpu);
 bool m68k_cpu_exec_interrupt(CPUState *cpu, int int_req);
+hwaddr m68k_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 #endif /* !CONFIG_USER_ONLY */
 void m68k_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
-hwaddr m68k_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 int m68k_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int m68k_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 
diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index f85581ee56..2f43b67a8f 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -262,7 +262,6 @@ void nios2_tcg_init(void);
 void nios2_cpu_do_interrupt(CPUState *cs);
 void dump_mmu(CPUNios2State *env);
 void nios2_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
-hwaddr nios2_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 G_NORETURN void nios2_cpu_do_unaligned_access(CPUState *cpu, vaddr addr,
   MMUAccessType access_type, int 
mmu_idx,
   uintptr_t retaddr);
@@ -288,6 +287,7 @@ static inline int cpu_mmu_index(CPUNios2State *env, bool 
ifetch)
 }
 
 #ifndef CONFIG_USER_ONLY
+hwaddr nios2_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 bool nios2_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 MMUAccessType access_type, int mmu_idx,
 bool probe, uintptr_t retaddr);
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 1d5efa5ca2..31a4ae5ad3 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -312,7 +312,6 @@ struct ArchCPU {
 
 void cpu_openrisc_list(void);
 void openrisc_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
-hwaddr openrisc_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 int openrisc_cpu_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int openrisc_cpu_gdb_wri

[PATCH-for-8.0 v2 2/4] gdbstub: Use vaddr type for generic insert/remove_breakpoint() API

2022-12-08 Thread Philippe Mathieu-Daudé
Both insert/remove_breakpoint() handlers are used in system and
user emulation. We cannot use the 'hwaddr' type in user emulation;
we have to use 'vaddr', which is defined as "wide enough to contain
any #target_ulong virtual address".

gdbstub.c no longer needs to include "exec/hwaddr.h".

Signed-off-by: Philippe Mathieu-Daudé 
---
 accel/kvm/kvm-all.c| 4 ++--
 accel/kvm/kvm-cpus.h   | 4 ++--
 accel/tcg/tcg-accel-ops.c  | 4 ++--
 gdbstub/gdbstub.c  | 1 -
 gdbstub/internals.h| 6 --
 gdbstub/softmmu.c  | 5 ++---
 gdbstub/user.c | 5 ++---
 include/sysemu/accel-ops.h | 6 +++---
 8 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f99b0becd8..f3b434c717 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3219,7 +3219,7 @@ bool kvm_supports_guest_debug(void)
 return kvm_has_guest_debug;
 }
 
-int kvm_insert_breakpoint(CPUState *cpu, int type, hwaddr addr, hwaddr len)
+int kvm_insert_breakpoint(CPUState *cpu, int type, vaddr addr, vaddr len)
 {
 struct kvm_sw_breakpoint *bp;
 int err;
@@ -3257,7 +3257,7 @@ int kvm_insert_breakpoint(CPUState *cpu, int type, hwaddr addr, hwaddr len)
 return 0;
 }
 
-int kvm_remove_breakpoint(CPUState *cpu, int type, hwaddr addr, hwaddr len)
+int kvm_remove_breakpoint(CPUState *cpu, int type, vaddr addr, vaddr len)
 {
 struct kvm_sw_breakpoint *bp;
 int err;
diff --git a/accel/kvm/kvm-cpus.h b/accel/kvm/kvm-cpus.h
index fd63fe6a59..ca40add32c 100644
--- a/accel/kvm/kvm-cpus.h
+++ b/accel/kvm/kvm-cpus.h
@@ -19,8 +19,8 @@ void kvm_cpu_synchronize_post_reset(CPUState *cpu);
 void kvm_cpu_synchronize_post_init(CPUState *cpu);
 void kvm_cpu_synchronize_pre_loadvm(CPUState *cpu);
 bool kvm_supports_guest_debug(void);
-int kvm_insert_breakpoint(CPUState *cpu, int type, hwaddr addr, hwaddr len);
-int kvm_remove_breakpoint(CPUState *cpu, int type, hwaddr addr, hwaddr len);
+int kvm_insert_breakpoint(CPUState *cpu, int type, vaddr addr, vaddr len);
+int kvm_remove_breakpoint(CPUState *cpu, int type, vaddr addr, vaddr len);
 void kvm_remove_all_breakpoints(CPUState *cpu);
 
 #endif /* KVM_CPUS_H */
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index 19cbf1db3a..d9228fd403 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -116,7 +116,7 @@ static inline int xlat_gdb_type(CPUState *cpu, int gdbtype)
 return cputype;
 }
 
-static int tcg_insert_breakpoint(CPUState *cs, int type, hwaddr addr, hwaddr len)
+static int tcg_insert_breakpoint(CPUState *cs, int type, vaddr addr, vaddr len)
 {
 CPUState *cpu;
 int err = 0;
@@ -147,7 +147,7 @@ static int tcg_insert_breakpoint(CPUState *cs, int type, hwaddr addr, hwaddr len
 }
 }
 
-static int tcg_remove_breakpoint(CPUState *cs, int type, hwaddr addr, hwaddr len)
+static int tcg_remove_breakpoint(CPUState *cs, int type, vaddr addr, vaddr len)
 {
 CPUState *cpu;
 int err = 0;
diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index be88ca0d71..c3fbc31123 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -48,7 +48,6 @@
 #include "sysemu/runstate.h"
 #include "semihosting/semihost.h"
 #include "exec/exec-all.h"
-#include "exec/hwaddr.h"
 #include "sysemu/replay.h"
 
 #include "internals.h"
diff --git a/gdbstub/internals.h b/gdbstub/internals.h
index eabb0341d1..b23999f951 100644
--- a/gdbstub/internals.h
+++ b/gdbstub/internals.h
@@ -9,9 +9,11 @@
 #ifndef _INTERNALS_H_
 #define _INTERNALS_H_
 
+#include "exec/cpu-common.h"
+
 bool gdb_supports_guest_debug(void);
-int gdb_breakpoint_insert(CPUState *cs, int type, hwaddr addr, hwaddr len);
-int gdb_breakpoint_remove(CPUState *cs, int type, hwaddr addr, hwaddr len);
+int gdb_breakpoint_insert(CPUState *cs, int type, vaddr addr, vaddr len);
+int gdb_breakpoint_remove(CPUState *cs, int type, vaddr addr, vaddr len);
 void gdb_breakpoint_remove_all(CPUState *cs);
 
 #endif /* _INTERNALS_H_ */
diff --git a/gdbstub/softmmu.c b/gdbstub/softmmu.c
index f208c6cf15..129575e510 100644
--- a/gdbstub/softmmu.c
+++ b/gdbstub/softmmu.c
@@ -11,7 +11,6 @@
 
 #include "qemu/osdep.h"
 #include "exec/gdbstub.h"
-#include "exec/hwaddr.h"
 #include "sysemu/cpus.h"
 #include "internals.h"
 
@@ -24,7 +23,7 @@ bool gdb_supports_guest_debug(void)
 return false;
 }
 
-int gdb_breakpoint_insert(CPUState *cs, int type, hwaddr addr, hwaddr len)
+int gdb_breakpoint_insert(CPUState *cs, int type, vaddr addr, vaddr len)
 {
 const AccelOpsClass *ops = cpus_get_accel();
 if (ops->insert_breakpoint) {
@@ -33,7 +32,7 @@ int gdb_breakpoint_remove(CPUState *cs, int type, hwaddr addr, hwaddr len)
 return -ENOSYS;
 }
 
-int gdb_breakpoint_remove(CPUState *cs, int type, hwaddr addr, hwaddr len)
+int gdb_breakpoint_remove(CPUState *cs, int type, vaddr addr, vaddr len)
 {
 const AccelOpsClass *ops = cpus_get_accel();
 if (ops->remove_breakpoint) {
diff --git a/gdbstub/user.c b/gdbstub/user.c
index 0

[PATCH-for-8.0 v2 1/4] cputlb: Restrict SavedIOTLB to system emulation

2022-12-08 Thread Philippe Mathieu-Daudé
Commit 2f3a57ee47 ("cputlb: ensure we save the IOTLB data in
case of reset") added the SavedIOTLB structure -- which is
system emulation specific -- in the generic CPUState structure.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/core/cpu.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 8830546121..bc3229ae13 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -222,7 +222,7 @@ struct CPUWatchpoint {
 QTAILQ_ENTRY(CPUWatchpoint) entry;
 };
 
-#ifdef CONFIG_PLUGIN
+#if defined(CONFIG_PLUGIN) && !defined(CONFIG_USER_ONLY)
 /*
  * For plugins we sometime need to save the resolved iotlb data before
  * the memory regions get moved around  by io_writex.
@@ -406,9 +406,11 @@ struct CPUState {
 
 #ifdef CONFIG_PLUGIN
 GArray *plugin_mem_cbs;
+#if !defined(CONFIG_USER_ONLY)
 /* saved iotlb data from io_writex */
 SavedIOTLB saved_iotlb;
-#endif
+#endif /* !CONFIG_USER_ONLY */
+#endif /* CONFIG_PLUGIN */
 
 /* TODO Move common fields from CPUArchState here. */
 int cpu_index;
-- 
2.38.1




[PATCH-for-8.0 v2 0/4] target/cpu: System/User cleanups around hwaddr/vaddr

2022-12-08 Thread Philippe Mathieu-Daudé
We are not supposed to use the 'hwaddr' type in user emulation.

This series is a preparatory cleanup before a few refactors to
further isolate System vs User code.

Since v1:
- only restrict SavedIOTLB in header (Alex)
- convert insert/remove_breakpoint implementations (Peter)

Philippe Mathieu-Daudé (4):
  cputlb: Restrict SavedIOTLB to system emulation
  gdbstub: Use vaddr type for generic insert/remove_breakpoint() API
  target/cpu: Restrict cpu_get_phys_page_debug() handlers to sysemu
  target/sparc/sysemu: Remove pointless CONFIG_USER_ONLY guard

 accel/kvm/kvm-all.c| 4 ++--
 accel/kvm/kvm-cpus.h   | 4 ++--
 accel/tcg/tcg-accel-ops.c  | 4 ++--
 gdbstub/gdbstub.c  | 1 -
 gdbstub/internals.h| 6 --
 gdbstub/softmmu.c  | 5 ++---
 gdbstub/user.c | 5 ++---
 include/hw/core/cpu.h  | 6 --
 include/sysemu/accel-ops.h | 6 +++---
 target/alpha/cpu.h | 2 +-
 target/cris/cpu.h  | 3 +--
 target/hppa/cpu.h  | 2 +-
 target/m68k/cpu.h  | 2 +-
 target/nios2/cpu.h | 2 +-
 target/openrisc/cpu.h  | 3 ++-
 target/ppc/cpu.h   | 2 +-
 target/rx/cpu.h| 2 +-
 target/rx/helper.c | 4 ++--
 target/sh4/cpu.h   | 2 +-
 target/sparc/cpu.h | 3 ++-
 target/sparc/mmu_helper.c  | 2 --
 target/xtensa/cpu.h| 2 +-
 22 files changed, 36 insertions(+), 36 deletions(-)

-- 
2.38.1




[RFC PATCH] RISC-V: Save mmu_idx using FIELD_DP32 not OR

2022-12-08 Thread Christoph Muellner
From: Christoph Müllner 

Setting flags using OR might work, but is not optimal
for a couple of reasons:
* No way to grep for stores to the field MEM_IDX.
* The return value of cpu_mmu_index() is not masked
  (not a real problem as long as cpu_mmu_index() returns only valid values).
* If the offset of MEM_IDX would get moved to non-0, then this code
  would not work anymore.

Let's use the FIELD_DP32() macro instead of the OR, which is already
used for most other flags.

Signed-off-by: Christoph Müllner 
---
 target/riscv/cpu_helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 278d163803..d68b6b351d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -80,7 +80,8 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 flags |= TB_FLAGS_MSTATUS_FS;
 flags |= TB_FLAGS_MSTATUS_VS;
 #else
-flags |= cpu_mmu_index(env, 0);
+flags = FIELD_DP32(flags, TB_FLAGS, MEM_IDX, cpu_mmu_index(env, 0));
+
 if (riscv_cpu_fp_enabled(env)) {
 flags |= env->mstatus & MSTATUS_FS;
 }
-- 
2.38.1




CVMSEG Emulation

2022-12-08 Thread Christopher Wrogg

In userspace emulation, how do I make a set of addresses always valid and
initialized to 0 even though the process does not map them in? In particular,
I want to map the CVMSEG for Cavium qemu-mips64 and qemu-mipsn32. The
addresses would be 0x8000 - 0xBFFF. I've looked at
target_mmap, but it can't handle addresses that large. The lack of an
emulated MMU for 64-bit guests is going to be a problem.

