Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
On 13/07/2023 22.23, Richard Henderson wrote:
> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
> CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult to tell when
> those changes have been applied with the ifdef we must use with
> CONFIG_CMPXCHG128.  So instead use HAVE_CMPXCHG128, which triggers
> -Werror-undef when the proper header has not been included.
>
> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
> requires CONFIG_ATOMIC128_OPT.  Without this we fall back to
> EXCP_ATOMIC to single-step 128-bit atomics, which is slow
> enough to cause some tests to time out.
>
> Reported-by: Thomas Huth
> Signed-off-by: Richard Henderson
> ---
>
> Thomas, this issue does not quite match the one you bisected, but
> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
>
> As far as I can see, this wasn't broken by the addition of
> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
>
> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

Thanks, I can confirm that this fixes the issue for me, too.

Tested-by: Thomas Huth
Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
On 7/13/23 22:36, Philippe Mathieu-Daudé wrote:
> Hi Richard,
>
> On 13/7/23 22:23, Richard Henderson wrote:
>> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
>> CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult to tell when
>> those changes have been applied with the ifdef we must use with
>> CONFIG_CMPXCHG128.  So instead use HAVE_CMPXCHG128, which triggers
>> -Werror-undef when the proper header has not been included.
>>
>> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
>> requires CONFIG_ATOMIC128_OPT.  Without this we fall back to
>> EXCP_ATOMIC to single-step 128-bit atomics, which is slow
>> enough to cause some tests to time out.
>>
>> Reported-by: Thomas Huth
>> Signed-off-by: Richard Henderson
>> ---
>>
>> Thomas, this issue does not quite match the one you bisected, but
>> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
>> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
>>
>> As far as I can see, this wasn't broken by the addition of
>> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
>>
>> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.
>
> IIUC:
>
> If we have CONFIG_ATOMIC128, we use qatomic_cmpxchg__nocheck;
> else if we have CONFIG_CMPXCHG128 we use __sync_val_compare_and_swap_16;
> in both cases we set HAVE_CMPXCHG128;
> otherwise we cannot use atomic128 cmpxchg().
>
> (I'm trying to figure out why we need both CONFIGs).

Or sometimes we use inline asm, because there's no compiler support
at all.  Please see host/include/*/host/atomic16-*.h.

r~
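The selection ladder discussed above can be sketched roughly as follows. This is only an illustration, not QEMU's code: every `demo_`/`DEMO_` name is hypothetical, `unsigned __int128` assumes GCC or Clang on a 64-bit host, the inline-asm case mentioned in the reply is left out, and the last branch merely imitates what the commit message describes as single-stepping under EXCP_ATOMIC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit type; assumes GCC/Clang on a 64-bit host. */
typedef unsigned __int128 demo_u128;

#if defined(DEMO_CONFIG_ATOMIC128)
/* Case 1: the compiler offers native 128-bit __atomic builtins. */
# define DEMO_HAVE_CMPXCHG128 1
static inline demo_u128 demo_cmpxchg128(demo_u128 *ptr, demo_u128 cmp,
                                        demo_u128 new_val)
{
    /* On failure the builtin stores the observed value into cmp. */
    __atomic_compare_exchange_n(ptr, &cmp, new_val, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return cmp;
}
#elif defined(DEMO_CONFIG_CMPXCHG128)
/* Case 2: only the legacy __sync compare-and-swap works at 16 bytes. */
# define DEMO_HAVE_CMPXCHG128 1
static inline demo_u128 demo_cmpxchg128(demo_u128 *ptr, demo_u128 cmp,
                                        demo_u128 new_val)
{
    return __sync_val_compare_and_swap(ptr, cmp, new_val);
}
#else
/*
 * Case 3: no usable primitive.  This plain compare-and-store is only
 * safe while every other thread is stopped, which is the slow path
 * the patch tries to avoid.
 */
# define DEMO_HAVE_CMPXCHG128 0
static inline demo_u128 demo_cmpxchg128_exclusive(demo_u128 *ptr,
                                                  demo_u128 cmp,
                                                  demo_u128 new_val)
{
    demo_u128 old = *ptr;
    if (old == cmp) {
        *ptr = new_val;
    }
    return old;
}
#endif
```

In both atomic cases the availability macro is set to 1, so callers only need to test that single macro rather than each configuration knob.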
Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
Hi Richard,

On 13/7/23 22:23, Richard Henderson wrote:
> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
> CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult to tell when
> those changes have been applied with the ifdef we must use with
> CONFIG_CMPXCHG128.  So instead use HAVE_CMPXCHG128, which triggers
> -Werror-undef when the proper header has not been included.
>
> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
> requires CONFIG_ATOMIC128_OPT.  Without this we fall back to
> EXCP_ATOMIC to single-step 128-bit atomics, which is slow
> enough to cause some tests to time out.
>
> Reported-by: Thomas Huth
> Signed-off-by: Richard Henderson
> ---
>
> Thomas, this issue does not quite match the one you bisected, but
> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
>
> As far as I can see, this wasn't broken by the addition of
> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
>
> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

IIUC:

If we have CONFIG_ATOMIC128, we use qatomic_cmpxchg__nocheck;
else if we have CONFIG_CMPXCHG128 we use __sync_val_compare_and_swap_16;
in both cases we set HAVE_CMPXCHG128;
otherwise we cannot use atomic128 cmpxchg().

(I'm trying to figure out why we need both CONFIGs).
[PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult to tell when
those changes have been applied with the ifdef we must use with
CONFIG_CMPXCHG128.  So instead use HAVE_CMPXCHG128, which triggers
-Werror-undef when the proper header has not been included.

Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
requires CONFIG_ATOMIC128_OPT.  Without this we fall back to
EXCP_ATOMIC to single-step 128-bit atomics, which is slow
enough to cause some tests to time out.

Reported-by: Thomas Huth
Signed-off-by: Richard Henderson
---

Thomas, this issue does not quite match the one you bisected, but
other than the cmpxchg, I don't see any qemu_{ld,st}_i128
being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.

As far as I can see, this wasn't broken by the addition of
CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.

Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

r~

---
 accel/tcg/tcg-runtime.h            | 2 +-
 include/exec/helper-proto-common.h | 2 ++
 accel/tcg/cputlb.c                 | 2 +-
 accel/tcg/user-exec.c              | 2 +-
 tcg/tcg-op-ldst.c                  | 2 +-
 accel/tcg/atomic_common.c.inc      | 2 +-
 6 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 39e68007f9..186899a2c7 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -58,7 +58,7 @@ DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG,
 DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG,
                    i64, env, i64, i64, i64, i32)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_be, TCG_CALL_NO_WG,
                    i128, env, i64, i128, i128, i32)
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_le, TCG_CALL_NO_WG,
diff --git a/include/exec/helper-proto-common.h b/include/exec/helper-proto-common.h
index 4d4b022668..8b67170a22 100644
--- a/include/exec/helper-proto-common.h
+++ b/include/exec/helper-proto-common.h
@@ -7,6 +7,8 @@
 #ifndef HELPER_PROTO_COMMON_H
 #define HELPER_PROTO_COMMON_H
 
+#include "qemu/atomic128.h"  /* for HAVE_CMPXCHG128 */
+
 #define HELPER_H "accel/tcg/tcg-runtime.h"
 #include "exec/helper-proto.h.inc"
 #undef  HELPER_H
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c2b81ec569..e0079c9a9d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -3105,7 +3105,7 @@ void cpu_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index d95b875a6a..e7225e10e9 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -1385,7 +1385,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr addr, MemOpIdx oi,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/tcg/tcg-op-ldst.c b/tcg/tcg-op-ldst.c
index 0fcc1618e5..d54c305598 100644
--- a/tcg/tcg-op-ldst.c
+++ b/tcg/tcg-op-ldst.c
@@ -778,7 +778,7 @@ typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, TCGv_i64,
 #else
 # define WITH_ATOMIC64(X)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 # define WITH_ATOMIC128(X) X,
 #else
 # define WITH_ATOMIC128(X)
diff --git a/accel/tcg/atomic_common.c.inc b/accel/tcg/atomic_common.c.inc
index ee222fd7e7..95a5c5ff12 100644
--- a/accel/tcg/atomic_common.c.inc
+++ b/accel/tcg/atomic_common.c.inc
@@ -41,7 +41,7 @@ CMPXCHG_HELPER(cmpxchgq_be, uint64_t)
 CMPXCHG_HELPER(cmpxchgq_le, uint64_t)
 #endif
 
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 CMPXCHG_HELPER(cmpxchgo_be, Int128)
 CMPXCHG_HELPER(cmpxchgo_le, Int128)
 #endif
-- 
2.34.1
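For readers unfamiliar with the idiom, here is a minimal sketch of why `#if MACRO` is safer than `#ifdef MACRO` in this situation. The `DEMO_` names are hypothetical stand-ins, and the derivation rule in the "header" part is a simplification assumed for illustration, not a copy of atomic128.h.

```c
#include <stdbool.h>

/* Assumed build-system output for an s390x-like host: only the _OPT knob. */
#define DEMO_CONFIG_ATOMIC128_OPT 1

/*
 * Stand-in for the header: it folds every configuration knob into one
 * macro that is ALWAYS defined, to either 0 or 1.
 */
#if defined(DEMO_CONFIG_ATOMIC128) || defined(DEMO_CONFIG_CMPXCHG128) \
    || defined(DEMO_CONFIG_ATOMIC128_OPT)
# define DEMO_HAVE_CMPXCHG128 1
#else
# define DEMO_HAVE_CMPXCHG128 0
#endif

/*
 * Fragile test: #ifdef on the raw knob is silently false here, because
 * the build system never defined DEMO_CONFIG_CMPXCHG128 for this host.
 */
static inline bool demo_ifdef_sees_cmpxchg128(void)
{
#ifdef DEMO_CONFIG_CMPXCHG128
    return true;
#else
    return false;
#endif
}

/*
 * Robust test: #if on the always-defined macro.  If the header part
 * above were missing, a compiler run with -Wundef -Werror would reject
 * this line instead of quietly evaluating it as 0.
 */
static inline bool demo_if_sees_cmpxchg128(void)
{
#if DEMO_HAVE_CMPXCHG128
    return true;
#else
    return false;
#endif
}
```

With these definitions the `#ifdef` variant reports no 128-bit cmpxchg while the `#if` variant reports that one is available, which mirrors the mismatch the patch describes for the s390x host.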