[PATCH 0/8] aarch64: testsuite: Fix test failures with --enable-default-pie or --enable-default-ssp
Hi, This patch series fixes a lot of test failures with --enable-default-pie or --enable-default-ssp for AArch64 target. Only test files are changed to disable PIE or SSP to satisify the expectation of the developer who programmed the test. Bootstrapped and regtested on aarch64-linux-gnu. Ok for trunk? Xi Ruoyao (8): aarch64: testsuite: disable PIE for aapcs64 tests [PR70150] aarch64: testsuite: disable PIE for tests with large code model [PR70150] aarch64: testsuite: disable PIE for fuse_adrp_add_1.c [PR70150] aarch64: testsuite: disable stack protector for sve-pcs tests aarch64: testsuite: disable stack protector for pr103147-10 tests aarch64: testsuite: disable stack protector for auto-init-7.c aarch64: testsuite: disable stack protector for pr104005.c aarch64: testsuite: disable stack protector for tests relying on stack offset gcc/testsuite/g++.target/aarch64/pr103147-10.C | 2 +- gcc/testsuite/gcc.dg/tls/pr78796.c | 2 +- gcc/testsuite/gcc.target/aarch64/aapcs64/aapcs64.exp | 2 +- gcc/testsuite/gcc.target/aarch64/auto-init-7.c | 2 +- gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr103147-10.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr104005.c| 2 +- gcc/testsuite/gcc.target/aarch64/pr63304_1.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr70120-2.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr78733.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr79041-2.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr94530.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr94577.c | 2 +- gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c | 2 +- gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c | 2 +- gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c | 2 +- gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c | 2 +- .../gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp | 7 --- gcc/testsuite/gcc.target/aarch64/test_frame_17.c | 2 +- 19 files changed, 22 insertions(+), 21 deletions(-) -- 2.39.2
[PATCH 4/8] aarch64: testsuite: disable stack protector for sve-pcs tests
If GCC is configured with --enable-default-ssp, the stack protector can make many sve-pcs tests fail. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp (sve_flags): Add -fno-stack-protector. --- .../gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp b/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp index 5562502cc07..3dbf73f67c9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp @@ -37,11 +37,12 @@ if ![info exists DEFAULT_CFLAGS] then { # Initialize `dg'. dg-init -# Force SVE if we're not testing it already. +# Force SVE if we're not testing it already. And, disable stack protector +# to avoid test failures with --enable-default-ssp. if { [check_effective_target_aarch64_sve] } { -set sve_flags "" +set sve_flags "-fno-stack-protector" } else { -set sve_flags "-march=armv8.2-a+sve" +set sve_flags "-march=armv8.2-a+sve -fno-stack-protector" } # Main loop. -- 2.39.2
[PATCH 6/8] aarch64: testsuite: disable stack protector for auto-init-7.c
The test scans for "const_int 0" in the RTL dump, but stack protector can produce more "const_int 0". To avoid a failure with --enable-default-ssp, disable stack protector for this. gcc/testsuite/ChangeLog: * gcc.target/aarch64/auto-init-7.c (dg-options): Add -fno-stack-protector. --- gcc/testsuite/gcc.target/aarch64/auto-init-7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-7.c b/gcc/testsuite/gcc.target/aarch64/auto-init-7.c index ac27fbe92f4..fde6e568532 100644 --- a/gcc/testsuite/gcc.target/aarch64/auto-init-7.c +++ b/gcc/testsuite/gcc.target/aarch64/auto-init-7.c @@ -1,6 +1,6 @@ /* Verify zero initialization for array, union, and structure type automatic variables. */ /* { dg-do compile } */ -/* { dg-options "-ftrivial-auto-var-init=zero -fdump-rtl-expand" } */ +/* { dg-options "-ftrivial-auto-var-init=zero -fdump-rtl-expand -fno-stack-protector" } */ struct S { -- 2.39.2
[PATCH 3/8] aarch64: testsuite: disable PIE for fuse_adrp_add_1.c [PR70150]
In PIE, symbol "fixed_regs" is addressed via GOT. It will break the scan-assembler pattern and cause test failure with --enable-default-pie. gcc/testsuite/ChangeLog: PR testsuite/70150 * gcc.target/aarch64/fuse_adrp_add_1.c (dg-options): Add -fno-pie. --- gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c b/gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c index e49aadaa639..d66fe3a4b23 100644 --- a/gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c +++ b/gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target aarch64_small } */ -/* { dg-options "-O3 -mcpu=cortex-a57" } */ +/* { dg-options "-O3 -mcpu=cortex-a57 -fno-pie" } */ enum reg_class { NO_REGS, AP_REG, XRF_REGS, GENERAL_REGS, AGRF_REGS, XGRF_REGS, ALL_REGS, LIM_REG_CLASSES }; -- 2.39.2
[PATCH 7/8] aarch64: testsuite: disable stack protector for pr104005.c
Storing stack guarding variable need one stp instruction, breaking the scan-assembler-not pattern in the test. Disable stack protector to avoid a test failure with --enable-default-ssp. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr104005.c (dg-options): Add -fno-stack-protector. --- gcc/testsuite/gcc.target/aarch64/pr104005.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/aarch64/pr104005.c b/gcc/testsuite/gcc.target/aarch64/pr104005.c index 09dd81910eb..9f1ef2dc308 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr104005.c +++ b/gcc/testsuite/gcc.target/aarch64/pr104005.c @@ -1,4 +1,4 @@ -/* { dg-options "-O2 -funroll-loops" } */ +/* { dg-options "-O2 -funroll-loops -fno-stack-protector" } */ typedef int v2 __attribute__((vector_size(8))); -- 2.39.2
[PATCH 2/8] aarch64: testsuite: disable PIE for tests with large code model [PR70150]
These tests set large code model with -mcmodel=large or target pragma for AArch64. But if GCC is configured with --enable-default-pie, it triggers "sorry: unimplemented: code model large with -fpic". Disable PIE to make avoid the issue. gcc/testsuite/ChangeLog: PR testsuite/70150 * gcc.dg/tls/pr78796.c (dg-additional-options): Add -fno-pie -no-pie for aarch64-*-*. * gcc.target/aarch64/pr63304_1.c (dg-options): Add -fno-pie. * gcc.target/aarch64/pr70120-2.c (dg-options): Add -fno-pie. * gcc.target/aarch64/pr78733.c (dg-options): Add -fno-pie. * gcc.target/aarch64/pr79041-2.c (dg-options): Add -fno-pie. * gcc.target/aarch64/pr94530.c (dg-options): Add -fno-pie. * gcc.target/aarch64/pr94577.c (dg-options): Add -fno-pie. * gcc.target/aarch64/reload-valid-spoff.c (dg-options): Add -fno-pie. --- gcc/testsuite/gcc.dg/tls/pr78796.c| 2 +- gcc/testsuite/gcc.target/aarch64/pr63304_1.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr70120-2.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr78733.c| 2 +- gcc/testsuite/gcc.target/aarch64/pr79041-2.c | 2 +- gcc/testsuite/gcc.target/aarch64/pr94530.c| 2 +- gcc/testsuite/gcc.target/aarch64/pr94577.c| 2 +- gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c | 2 +- 8 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tls/pr78796.c b/gcc/testsuite/gcc.dg/tls/pr78796.c index 038e5366e41..96f87d47ba4 100644 --- a/gcc/testsuite/gcc.dg/tls/pr78796.c +++ b/gcc/testsuite/gcc.dg/tls/pr78796.c @@ -1,7 +1,7 @@ /* PR target/78796 */ /* { dg-do run } */ /* { dg-options "-O2" } */ -/* { dg-additional-options "-mcmodel=large" { target aarch64-*-* } } */ +/* { dg-additional-options "-mcmodel=large -fno-pie -no-pie" { target aarch64-*-* } } */ /* { dg-require-effective-target tls_runtime } */ /* { dg-add-options tls } */ diff --git a/gcc/testsuite/gcc.target/aarch64/pr63304_1.c b/gcc/testsuite/gcc.target/aarch64/pr63304_1.c index 9f1ed947806..5d519d817cc 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr63304_1.c +++ b/gcc/testsuite/gcc.target/aarch64/pr63304_1.c @@ -1,6 +1,6 @@ /* { dg-do assemble } */ /* { dg-require-effective-target lp64 } */ -/* { dg-options "-O1 --save-temps" } */ +/* { dg-options "-O1 --save-temps -fno-pie" } */ #pragma GCC push_options #pragma GCC target ("+nothing+simd,cmodel=small") diff --git a/gcc/testsuite/gcc.target/aarch64/pr70120-2.c b/gcc/testsuite/gcc.target/aarch64/pr70120-2.c index 663bf2ed147..8f5cdc93fe3 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr70120-2.c +++ b/gcc/testsuite/gcc.target/aarch64/pr70120-2.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target lp64 } */ -/* { dg-options "-Og -freorder-functions -g3 -mcmodel=large" } */ +/* { dg-options "-Og -freorder-functions -g3 -mcmodel=large -fno-pie" } */ typedef short v32u16 __attribute__ ((vector_size (32))); typedef int v32u32 __attribute__ ((vector_size (32))); diff --git a/gcc/testsuite/gcc.target/aarch64/pr78733.c b/gcc/testsuite/gcc.target/aarch64/pr78733.c index 4695b5c1b2b..8556ef3f371 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr78733.c +++ b/gcc/testsuite/gcc.target/aarch64/pr78733.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mcmodel=large -mpc-relative-literal-loads" } */ +/* { dg-options "-O2 -mcmodel=large -mpc-relative-literal-loads -fno-pie" } */ /* { dg-require-effective-target lp64 } */ /* { dg-skip-if "-mcmodel=large, no support for -fpic" { aarch64-*-* } { "-fpic" } { "" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/pr79041-2.c b/gcc/testsuite/gcc.target/aarch64/pr79041-2.c index 4695b5c1b2b..8556ef3f371 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr79041-2.c +++ b/gcc/testsuite/gcc.target/aarch64/pr79041-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mcmodel=large -mpc-relative-literal-loads" } */ +/* { dg-options "-O2 -mcmodel=large -mpc-relative-literal-loads -fno-pie" } */ /* { dg-require-effective-target lp64 } */ /* { dg-skip-if "-mcmodel=large, no support for -fpic" { aarch64-*-* } { "-fpic" } { "" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/pr94530.c b/gcc/testsuite/gcc.target/aarch64/pr94530.c index 2797d116dcf..5dfdbe3311d 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr94530.c +++ b/gcc/testsuite/gcc.target/aarch64/pr94530.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target lp64 } */ -/* { dg-options "-Os -mcpu=falkor -mpc-relative-literal-loads -mcmodel=large" } */ +/* { dg-options "-Os -mcpu=falkor -mpc-relative-literal-loads -mcmodel=large -fno-pie" } */ extern void bar(const char *); diff --git a/gcc/testsuite/gcc.target/aarch64/pr94577.c b/gcc/testsuite/gcc.target/aarch64/pr94577.c index 6f2d3612c26..d51799fb0bb 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr94577.c +++ b/gcc/testsuite/gcc.target/aarch64/pr94
[PATCH 5/8] aarch64: testsuite: disable stack protector for pr103147-10 tests
Stack protector influence code generation and cause function body checks fail. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr103147-10.c (dg-options): Add -fno-stack-protector. * g++.target/aarch64/pr103147-10.C: Likewise. --- gcc/testsuite/g++.target/aarch64/pr103147-10.C | 2 +- gcc/testsuite/gcc.target/aarch64/pr103147-10.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/g++.target/aarch64/pr103147-10.C b/gcc/testsuite/g++.target/aarch64/pr103147-10.C index 914fdf9c692..e12771533f7 100644 --- a/gcc/testsuite/g++.target/aarch64/pr103147-10.C +++ b/gcc/testsuite/g++.target/aarch64/pr103147-10.C @@ -1,4 +1,4 @@ -/* { dg-options "-O2 -fpack-struct -mstrict-align" } */ +/* { dg-options "-O2 -fpack-struct -mstrict-align -fno-stack-protector" } */ /* { dg-final { check-function-bodies "**" "" "" } } */ #include diff --git a/gcc/testsuite/gcc.target/aarch64/pr103147-10.c b/gcc/testsuite/gcc.target/aarch64/pr103147-10.c index b2c34e4155d..57942bfd10a 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr103147-10.c +++ b/gcc/testsuite/gcc.target/aarch64/pr103147-10.c @@ -1,4 +1,4 @@ -/* { dg-options "-O2 -fpack-struct -mstrict-align" } */ +/* { dg-options "-O2 -fpack-struct -mstrict-align -fno-stack-protector" } */ /* { dg-final { check-function-bodies "**" "" "" } } */ #include -- 2.39.2
[PATCH 8/8] aarch64: testsuite: disable stack protector for tests relying on stack offset
Stack protector needs a guard value on the stack and change the stack layout. So we need to disable it for those tests, to avoid test failure with --enable-default-ssp. gcc/testsuite/ChangeLog: * gcc.target/aarch64/shrink_wrap_1.c (dg-options): Add -fno-stack-protector. * gcc.target/aarch64/stack-check-cfa-1.c (dg-options): Add -fno-stack-protector. * gcc.target/aarch64/stack-check-cfa-2.c (dg-options): Add -fno-stack-protector. * gcc.target/aarch64/test_frame_17.c (dg-options): Add -fno-stack-protector. --- gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c | 2 +- gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c | 2 +- gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c | 2 +- gcc/testsuite/gcc.target/aarch64/test_frame_17.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c index ab7cd74ec3b..067220c04a0 100644 --- a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c +++ b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c @@ -1,5 +1,5 @@ /* { dg-do compile { target { aarch64*-*-* } } } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -fno-stack-protector" } */ /* { dg-final { check-function-bodies "**" "" } } */ /* diff --git a/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c b/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c index 6885894a97e..412a9ed1aab 100644 --- a/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c +++ b/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables" } */ +/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables -fno-stack-protector" } */ /* { dg-require-effective-target supports_stack_clash_protection } */ #define SIZE 128*1024 diff --git a/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c b/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c index 5796a53be06..e440569a078 100644 --- a/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c +++ b/gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables" } */ +/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables -fno-stack-protector" } */ /* { dg-require-effective-target supports_stack_clash_protection } */ #define SIZE 1280*1024 + 512 diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_17.c b/gcc/testsuite/gcc.target/aarch64/test_frame_17.c index 44f13291128..5d432ad0854 100644 --- a/gcc/testsuite/gcc.target/aarch64/test_frame_17.c +++ b/gcc/testsuite/gcc.target/aarch64/test_frame_17.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -fno-stack-protector" } */ /* Test reuse of stack adjustment temporaries. */ -- 2.39.2
[PATCH] LoongArch: Stop -mfpu from silently breaking ABI
In the toolchain convention, we describe -mfpu= as: "Selects the allowed set of basic floating-point instructions and registers. This option should not change the FP calling convention unless it's necessary." Though not explicitly stated, the rationale of this rule is to allow combinations like "-mabi=lp64s -mfpu=64". This will be useful for running applications with LP64S/F ABI on a double-float-capable LoongArch hardware and using a math library with LP64S/F ABI but native double float HW instructions, for a better performance. And now a case in Linux kernel has again proven the usefulness of this kind of combination. The AMDGPU DCN kernel driver needs to perform some floating-point operation, but the entire kernel uses LP64S ABI. So the translation units of the AMDGPU DCN driver need to be compiled with -mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s (or you can't link it with the other part of the kernel). Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to determine the floating calling convention. This causes "-mfpu=64" silently allow using $fa* to pass parameters and return values EVEN IF -mabi=lp64s is used. To make things worse, the generated object file has SOFT-FLOAT set in the eflags field so the linker will happily link it with other LP64S ABI object files, but obviously this will lead to bad results at runtime. The fix is simple: use TARGET_*_FLOAT_ABI instead. But then it causes "-mabi=lp64s -march=loongarch64" to generate code like: movgr2fr.d $fa0, $a0 frecip.d $fa0, $fa0 movfr2gr.d $a0, $fa0 The problem here is "loongarch64" is never strictly defined. So we consider "loongarch64" a "64-bit LoongArch CPU with the simplest FPU needed by the ABI", and if -march=loongarch64 but -mfpu is not explicitly used, we set -mfpu such a simplest one. I consider this a bug fix: the behavior difference from the toolchain convention doc is a bug, and generating object files with SOFT-FLOAT flag but parameters/return values passed through FPRs is definitely a bug. Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? I'm not sure if it's a good idea to backport this into gcc-12 though. gcc/ChangeLog: * config/loongarch/loongarch.h (FP_RETURN): Use TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT. (UNITS_PER_FP_ARG): Likewise. * config/loongarch/loongarch-opts.cc (loongarch_config_target): If -march=loongarch64 and -mfpu not explicitly used, guess FPU capability from ABI. gcc/testsuite/ChangeLog: * gcc.target/loongarch/flt-abi-isa-1.c: New test. * gcc.target/loongarch/flt-abi-isa-2.c: New test. * gcc.target/loongarch/flt-abi-isa-3.c: New test. * gcc.target/loongarch/flt-abi-isa-4.c: New test. * gcc.target/loongarch/flt-abi-isa-5.c: New test. * gcc.target/loongarch/flt-abi-isa-6.c: New test. * gcc.target/loongarch/flt-abi-isa-7.c: New test. * gcc.target/loongarch/flt-abi-isa-8.c: New test. * gcc.target/loongarch/flt-abi-isa-9.c: New test. * gcc.target/loongarch/flt-abi-isa-10.c: New test. --- gcc/config/loongarch/loongarch-opts.cc | 18 ++ gcc/config/loongarch/loongarch.h | 4 ++-- .../gcc.target/loongarch/flt-abi-isa-1.c | 12 .../gcc.target/loongarch/flt-abi-isa-10.c | 7 +++ .../gcc.target/loongarch/flt-abi-isa-2.c | 11 +++ .../gcc.target/loongarch/flt-abi-isa-3.c | 11 +++ .../gcc.target/loongarch/flt-abi-isa-4.c | 12 .../gcc.target/loongarch/flt-abi-isa-5.c | 7 +++ .../gcc.target/loongarch/flt-abi-isa-6.c | 11 +++ .../gcc.target/loongarch/flt-abi-isa-7.c | 5 + .../gcc.target/loongarch/flt-abi-isa-8.c | 7 +++ .../gcc.target/loongarch/flt-abi-isa-9.c | 7 +++ 12 files changed, 110 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-10.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-5.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-6.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-7.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-8.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-9.c diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc index a52e25236ea..bea77da93e9 100644 --- a/gcc/config/loongarch/loongarch-opts.cc +++ b/gcc/config/loongarch/loongarch-opts.cc @@ -251,6 +251,24 @@ config_target_isa: ((t.cpu_arch == CPU_NATIVE && constrained.arch) ?
Re: [PATCH] LoongArch: Stop -mfpu from silently breaking ABI
On Fri, 2023-03-03 at 10:12 +0800, Yujie Yang wrote: > However, "loongarch64" is defined to include the "fpu64" ISA module[1] > (i.e. enable "-mfpu=64" when -mfpu is not explicitly used). So I believe > the above behavior you observed is expected. Ah this make things much simpler and aligns with my gut feeling :). I can drop the change in loongarch-opts.cc now. And the smaller changeset also makes me more confident about a backport to gcc-12. V2 patch is being tested and I'll send it after the testing. Meanwhile I created PR109000 to track the issue. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 01/07] RISC-V: Add auto-vectorization support
Please don't use the same title for all the patches. This will puzzle people running "git log" once they are committed. And when you send 01-07, use "reply" instead of "new" so they will show up in the correct location in a mail client. Or use git send-email which is much eaiser to use. On Thu, 2023-03-02 at 23:52 -0500, Michael Collison wrote: > This patch adds foundational support in the form of: > > 1. New predicates > > 2. New function prototypes > > 3. Exporting emit_vlmax_vsetvl to global scope > > 4. Add a new command line option -mriscv_vector_lmul > > gcc/ChangeLog: > > * config/riscv/riscv-protos.h (riscv_classify_vlmul_field): The change log entries should be indented with one tab, not some whitespaces. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH v2] LoongArch: Stop -mfpu from silently breaking ABI [PR109000]
In the toolchain convention, we describe -mfpu= as: "Selects the allowed set of basic floating-point instructions and registers. This option should not change the FP calling convention unless it's necessary." Though not explicitly stated, the rationale of this rule is to allow combinations like "-mabi=lp64s -mfpu=64". This will be useful for running applications with LP64S/F ABI on a double-float-capable LoongArch hardware and using a math library with LP64S/F ABI but native double float HW instructions, for a better performance. And now a case in Linux kernel has again proven the usefulness of this kind of combination. The AMDGPU DCN kernel driver needs to perform some floating-point operation, but the entire kernel uses LP64S ABI. So the translation units of the AMDGPU DCN driver need to be compiled with -mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s (or you can't link it with the other part of the kernel). Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to determine the floating calling convention. This causes "-mfpu=64" silently allow using $fa* to pass parameters and return values EVEN IF -mabi=lp64s is used. To make things worse, the generated object file has SOFT-FLOAT set in the eflags field so the linker will happily link it with other LP64S ABI object files, but obviously this will lead to bad results at runtime. And for now all loongarch64 CPU models (-march settings) implies -mfpu=64 on by default, so the issue makes a single "-mabi=lp64s" option basically broken (fortunately most projects for eg the Linux kernel have used -msoft-float which implies both -mabi=lp64s and -mfpu=none as we've recommended in the toolchain convention doc). The fix is simple: use TARGET_*_FLOAT_ABI instead. I consider this a bug fix: the behavior difference from the toolchain convention doc is a bug, and generating object files with SOFT-FLOAT flag but parameters/return values passed through FPRs is definitely a bug. Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk and release/gcc-12 branch? gcc/ChangeLog: PR target/109000 * config/loongarch/loongarch.h (FP_RETURN): Use TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT. (UNITS_PER_FP_ARG): Likewise. gcc/testsuite/ChangeLog: PR target/109000 * gcc.target/loongarch/flt-abi-isa-1.c: New test. * gcc.target/loongarch/flt-abi-isa-2.c: New test. * gcc.target/loongarch/flt-abi-isa-3.c: New test. * gcc.target/loongarch/flt-abi-isa-4.c: New test. --- gcc/config/loongarch/loongarch.h | 4 ++-- gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c | 14 ++ gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c | 10 ++ gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c | 9 + gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c | 10 ++ 5 files changed, 45 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h index f4e903d46bb..f8167875646 100644 --- a/gcc/config/loongarch/loongarch.h +++ b/gcc/config/loongarch/loongarch.h @@ -676,7 +676,7 @@ enum reg_class point values. */ #define GP_RETURN (GP_REG_FIRST + 4) -#define FP_RETURN ((TARGET_SOFT_FLOAT) ? GP_RETURN : (FP_REG_FIRST + 0)) +#define FP_RETURN ((TARGET_SOFT_FLOAT_ABI) ? GP_RETURN : (FP_REG_FIRST + 0)) #define MAX_ARGS_IN_REGISTERS 8 @@ -1154,6 +1154,6 @@ struct GTY (()) machine_function /* The largest type that can be passed in floating-point registers. */ /* TODO: according to mabi. */ #define UNITS_PER_FP_ARG \ - (TARGET_HARD_FLOAT ? (TARGET_DOUBLE_FLOAT ? 8 : 4) : 0) + (TARGET_HARD_FLOAT_ABI ? (TARGET_DOUBLE_FLOAT_ABI ? 8 : 4) : 0) #define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN) diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c new file mode 100644 index 000..1c9490f6a87 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-mabi=lp64d -mfpu=64 -march=loongarch64 -O2" } */ +/* { dg-final { scan-assembler "frecip\\.d" } } */ +/* { dg-final { scan-assembler-not "movgr2fr\\.d" } } */ +/* { dg-final { scan-assembler-not "movfr2gr\\.d" } } */ + +/* FPU is used for calculation and FPR is used for arguments and return + values. */ + +double +t (double x) +{ + return 1.0 / x; +} diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c new file mode 100644 index 000..0580fd65d3a --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch
[PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
Some trivial test case fixes. Ok for trunk? Xi Ruoyao (2): LoongArch: testsuite: Disable stack protector for some tests LoongArch: testsuite: Adjust stack offsets in stack-check-cfa tests gcc/testsuite/gcc.target/loongarch/prolog-opt.c| 2 +- gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c | 4 ++-- gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) -- 2.39.2
[PATCH 1/2] LoongArch: testsuite: Disable stack protector for some tests
Stack protector will affect stack layout and break the expectation of these tests, causing test failures if GCC is configured with --enable-default-ssp. gcc/testsuite/ChangeLog: * gcc.target/loongarch/prolog-opt.c (dg-options): Add -fno-stack-protector. * gcc.target/loongarch/stack-check-cfa-1.c (dg-options): Likewise. * gcc.target/loongarch/stack-check-cfa-2.c (dg-options): Likewise. --- gcc/testsuite/gcc.target/loongarch/prolog-opt.c| 2 +- gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c | 2 +- gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.target/loongarch/prolog-opt.c b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c index 0470a1f1eee..e6a64263384 100644 --- a/gcc/testsuite/gcc.target/loongarch/prolog-opt.c +++ b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c @@ -1,7 +1,7 @@ /* Test that LoongArch backend stack drop operation optimized. */ /* { dg-do compile } */ -/* { dg-options "-O2 -mabi=lp64d" } */ +/* { dg-options "-O2 -mabi=lp64d -fno-stack-protector" } */ /* { dg-final { scan-assembler "addi.d\t\\\$r3,\\\$r3,-16" } } */ extern int printf (char *, ...); diff --git a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c index f0c6877fc25..3533fe7b685 100644 --- a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c +++ b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables" } */ +/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables -fno-stack-protector" } */ /* { dg-require-effective-target supports_stack_clash_protection } */ /* { dg-skip-if "" { *-*-* } { "-fstack-check" } { "" } } */ diff --git a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c index c6e07bc561a..e5e711105ac 100644 --- a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c +++ b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables" } */ +/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables -fno-stack-protector" } */ /* { dg-require-effective-target supports_stack_clash_protection } */ /* { dg-skip-if "" { *-*-* } { "-fstack-check" } { "" } } */ -- 2.39.2
[PATCH 2/2] LoongArch: testsuite: Adjust stack offsets in stack-check-cfa tests
Once upon the time we used to save two registers unnecessarily, costing 16 bytes. Now the issue seems fixed (not sure by which commit though), adjust the stack offsets to reflex the change. gcc/testsuite/ChangeLog: * gcc.target/loongarch/stack-check-cfa-1.c (dg-final): Adjust expected stack offset. * gcc.target/loongarch/stack-check-cfa-2.c (dg-final): Likewise. --- gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c | 2 +- gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c index 3533fe7b685..cd72154f46c 100644 --- a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c +++ b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c @@ -6,7 +6,7 @@ #define SIZE 128*1024 #include "stack-check-prologue.h" -/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 131088} 1 } } */ +/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 131072} 1 } } */ /* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 0} 1 } } */ /* Checks that the CFA notes are correct for every sp adjustment. */ diff --git a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c index e5e711105ac..3e5ca05b2da 100644 --- a/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c +++ b/gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c @@ -6,7 +6,7 @@ #define SIZE 1280*1024 + 512 #include "stack-check-prologue.h" -/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 1311248} 1 } } */ +/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 1311232} 1 } } */ /* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 0} 1 } } */ /* Checks that the CFA notes are correct for every sp adjustment. */ -- 2.39.2
[PATCH] driver: toplev: Fix a typo
The driver emits a warning when -fstack-check and -fstack-clash-protection are used together, but the message refers to "-fstack-clash_protection" (underline instead of dash). gcc/ChangeLog: * toplev.cc (process_options): Fix the spelling of "-fstack-clash-protection". --- gcc/toplev.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/toplev.cc b/gcc/toplev.cc index 4c15d4f542e..109c9d58cbd 100644 --- a/gcc/toplev.cc +++ b/gcc/toplev.cc @@ -1580,7 +1580,7 @@ process_options (bool no_backend) if (flag_stack_check != NO_STACK_CHECK && flag_stack_clash_protection) { warning_at (UNKNOWN_LOCATION, 0, - "%<-fstack-check=%> and %<-fstack-clash_protection%> are " + "%<-fstack-check=%> and %<-fstack-clash-protection%> are " "mutually exclusive; disabling %<-fstack-check=%>"); flag_stack_check = NO_STACK_CHECK; } -- 2.39.2
Re: [PATCH] driver: toplev: Fix a typo
On Fri, 2023-03-03 at 16:54 +0800, Xi Ruoyao wrote: > The driver emits a warning when -fstack-check and > -fstack-clash-protection are used together, but the message refers to > "-fstack-clash_protection" (underline instead of dash). Forgot: Ok for trunk? Though I think it's obvious but let's follow the procedure :). > gcc/ChangeLog: > > * toplev.cc (process_options): Fix the spelling of > "-fstack-clash-protection". > --- > gcc/toplev.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/toplev.cc b/gcc/toplev.cc > index 4c15d4f542e..109c9d58cbd 100644 > --- a/gcc/toplev.cc > +++ b/gcc/toplev.cc > @@ -1580,7 +1580,7 @@ process_options (bool no_backend) > if (flag_stack_check != NO_STACK_CHECK && > flag_stack_clash_protection) > { > warning_at (UNKNOWN_LOCATION, 0, > - "%<-fstack-check=%> and %<-fstack-clash_protection%> > are " > + "%<-fstack-check=%> and %<-fstack-clash-protection%> > are " > "mutually exclusive; disabling %<-fstack-check=%>"); > flag_stack_check = NO_STACK_CHECK; > } -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] Always define `WIN32_LEAN_AND_MEAN` before
On Sat, 2023-01-07 at 06:52 +, Jonathan Yong via Gcc-patches wrote: > On 1/6/23 18:10, Jakub Jelinek wrote: > > On Sat, Jan 07, 2023 at 02:01:05AM +0800, LIU Hao via Gcc-patches > > wrote: > > > libgomp/ > > > > > > PR middle-end/108300 > > > * config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN` > > > before > > > . > > > > This change is ok for trunk. > > > > Jakub > > > > Pushed to master branch, thanks LH. The patch touches libgo (w/o mentioning it in the ChangeLog). I guess you need to contribute the libgo part into the upstream Go runtime or the change will be undone when Ian merges libgo next time. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] update copyright year in libstdcc++ manual
Wrong subject: the library is libstdc++, not "cc++". On Fri, 2022-12-30 at 10:30 +, Jonny Grant wrote: > 2022-12-30 Jonathan Grant > * libstdc++-v3/doc/xml/faq.xml: update copyright year in > libstdcc++ manual Wrong ChangeLog format, see https://gcc.gnu.org/codingconventions.html#ChangeLogs > From 60789c24fa54301135dff1c3d0b1514d77b90a82 Mon Sep 17 00:00:00 2001 > From: Jonathan Grant > Date: Fri, 30 Dec 2022 10:28:34 + > Subject: [PATCH] update copyright year date Don't leave these lines here or they'll be the junks in the commit message. (You may leave the "From: Jonathan Grant " line, especially if you are sending the patch from a different email address, but then it should be the first line of the email body.) > > --- > libstdc++-v3/doc/xml/faq.xml | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++- > v3/doc/xml/faq.xml > index 9ae4966ecea..42354f87af7 100644 > --- a/libstdc++-v3/doc/xml/faq.xml > +++ b/libstdc++-v3/doc/xml/faq.xml > @@ -7,7 +7,7 @@ > > > > - 2008-2018 > + 2008-2023 > > > http://www.w3.org/1999/xlink"; > xlink:href="https://www.fsf.org";>FSF -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote: > On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches > wrote: > > > > Some trivial test case fixes. Ok for trunk? > > Ok. Lulu: if you don't object I'll push these two in this week. I tried to bisect for the exact point where the test cases are broken, but it turns out they are broken the first day committed (r13-4401). As the draft of r13-4401 was sent in Sept 2022 but it's committed in Nov 2022, I can only guess something had changed in the two months and broke the tests... -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
On Mon, 2023-03-06 at 09:15 +0800, Lulu Cheng wrote: > > 在 2023/3/6 上午12:21, Xi Ruoyao 写道: > > On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote: > > > On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches > > > wrote: > > > > Some trivial test case fixes. Ok for trunk? > > > Ok. > > Lulu: if you don't object I'll push these two in this week. > > > > I tried to bisect for the exact point where the test cases are broken, > > but it turns out they are broken the first day committed (r13-4401). As > > the draft of r13-4401 was sent in Sept 2022 but it's committed in Nov > > 2022, I can only guess something had changed in the two months and broke > > the tests... > > Sorry for the late reply, the first patch I think is fine. But I haven't > reproduced the problem of the second mail. > > Is there any special option in the configuration? Oh some strange thing might be happening... I'll try to figure out what has caused the behavior difference. > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
On Mon, 2023-03-06 at 10:48 +0800, Xi Ruoyao wrote: > On Mon, 2023-03-06 at 09:15 +0800, Lulu Cheng wrote: > > > > 在 2023/3/6 上午12:21, Xi Ruoyao 写道: > > > On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote: > > > > On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches > > > > wrote: > > > > > Some trivial test case fixes. Ok for trunk? > > > > Ok. > > > Lulu: if you don't object I'll push these two in this week. > > > > > > I tried to bisect for the exact point where the test cases are broken, > > > but it turns out they are broken the first day committed (r13-4401). As > > > the draft of r13-4401 was sent in Sept 2022 but it's committed in Nov > > > 2022, I can only guess something had changed in the two months and broke > > > the tests... > > > > Sorry for the late reply, the first patch I think is fine. But I haven't > > reproduced the problem of the second mail. > > > > Is there any special option in the configuration? > > Oh some strange thing might be happening... I'll try to figure out what > has caused the behavior difference. Oh no, the difference is caused by --enable-default-pie. Maybe I should just add -fno-PIE for the dg-options. But now I'm still puzzled: why would -fPIE affect code generation on LoongArch? AFAIK all the code we are generating is position independent (at least for now). > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2 00/07] RISC-V: autovec: Add auto-vectorization support
On Sun, 2023-03-05 at 22:13 -0500, Michael Collison wrote: /* snip */ > - Fixed ChangeLog email formatting Unfortunately it's not fixed. We expect one tab, but now you have 16 whitespaces. To me it looks like your email client is being too smart and destroying the patch . Try "git send-email" which is much easier to be correctly configured. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
On Mon, 2023-03-06 at 11:16 +0800, Xi Ruoyao wrote: /* snip */ > > > Sorry for the late reply, the first patch I think is fine. But I haven't > > > reproduced the problem of the second mail. > > > > > > Is there any special option in the configuration? > > > > Oh some strange thing might be happening... I'll try to figure out what > > has caused the behavior difference. > > Oh no, the difference is caused by --enable-default-pie. > > Maybe I should just add -fno-PIE for the dg-options. But now I'm still > puzzled: why would -fPIE affect code generation on LoongArch? AFAIK all > the code we are generating is position independent (at least for now). Without -fPIE, the compiler stores a register with no reason: $ cat t.c int test(int x) { char buf[128 << 10]; return buf[x]; } $ ./gcc/cc1 t.c -nostdinc -O2 -fdump-rtl-all -o- 2>/dev/null | grep test: -A20 test: .LFB0 = . lu12i.w $r13,-135168>>12# 0xfffdf000 ori $r13,$r13,4080 add.d $r3,$r3,$r13 .LCFI0 = . lu12i.w $r12,-131072>>12# 0xfffe lu12i.w $r13,131072>>12 # 0x2 add.d $r13,$r13,$r12 addi.d $r12,$r3,16 add.d $r12,$r13,$r12 lu12i.w $r13,131072>>12 # 0x2 st.d$r12,$r3,8 ori $r13,$r13,16 ldx.b $r4,$r12,$r4 add.d $r3,$r3,$r13 .LCFI1 = . jr $r1 .LFE0: .size test, .-test .section.eh_frame,"aw",@progbits Note the "st.d $r12,$r3,8" instruction is completely meaningless. The t.c.300r.ira dump contains some "interesting" thing: Pass 0 for finding pseudo/allocno costs a0 (r87,l0) best GR_REGS, allocno GR_REGS a1 (r84,l0) best NO_REGS, allocno NO_REGS a2 (r83,l0) best GR_REGS, allocno GR_REGS a0(r87,l0) costs: SIBCALL_REGS:2000,2000 JIRL_REGS:2000,2000 CSR_REGS:2000,2000 GR_REGS:2000,2000 FP_REGS:8000,8000 ALL_REGS:32000,32000 MEM:8000,8000 a1(r84,l0) costs: SIBCALL_REGS:100,100 JIRL_REGS:100,100 CSR_REGS:100,100 GR_REGS:100,100 FP_REGS:1004000,1004000 ALL_REGS:1016000,1016000 MEM:1004000,1004000 a2(r83,l0) costs: SIBCALL_REGS:100,100 JIRL_REGS:100,100 CSR_REGS:100,100 GR_REGS:100,100 FP_REGS:1004000,1004000 ALL_REGS:1008000,1008000 MEM:1004000,1004000 Here r84 is the pseudo register for ($frame - 131072). Any idea why the compiler selects "NO_REGS" here? FWIW RISC-V port suffers the same issue: https://godbolt.org/z/aPorqj73b. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Pushed: [PATCH v2] LoongArch: Stop -mfpu from silently breaking ABI [PR109000]
Pushed r13-6500 and r12-9225. On Mon, 2023-03-06 at 15:21 +0800, Lulu Cheng wrote: > > 在 2023/3/3 下午4:16, Xi Ruoyao 写道: > > In the toolchain convention, we describe -mfpu= as: > > > > "Selects the allowed set of basic floating-point instructions and > > registers. This option should not change the FP calling convention > > unless it's necessary." > > > > Though not explicitly stated, the rationale of this rule is to allow > > combinations like "-mabi=lp64s -mfpu=64". This will be useful for > > running applications with LP64S/F ABI on a double-float-capable > > LoongArch hardware and using a math library with LP64S/F ABI but > > native > > double float HW instructions, for a better performance. > > > > And now a case in Linux kernel has again proven the usefulness of > > this > > kind of combination. The AMDGPU DCN kernel driver needs to perform > > some > > floating-point operation, but the entire kernel uses LP64S ABI. So > > the > > translation units of the AMDGPU DCN driver need to be compiled with > > -mfpu=64 (the kernel lacks soft-FP routines in libgcc), but - > > mabi=lp64s > > (or you can't link it with the other part of the kernel). > > > > Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to > > determine the floating calling convention. This causes "-mfpu=64" > > silently allow using $fa* to pass parameters and return values EVEN > > IF > > -mabi=lp64s is used. To make things worse, the generated object > > file > > has SOFT-FLOAT set in the eflags field so the linker will happily > > link > > it with other LP64S ABI object files, but obviously this will lead > > to > > bad results at runtime. And for now all loongarch64 CPU models (- > > march > > settings) implies -mfpu=64 on by default, so the issue makes a > > single > > "-mabi=lp64s" option basically broken (fortunately most projects for > > eg > > the Linux kernel have used -msoft-float which implies both - > > mabi=lp64s > > and -mfpu=none as we've recommended in the toolchain convention > > doc). > > > > The fix is simple: use TARGET_*_FLOAT_ABI instead. > > > > I consider this a bug fix: the behavior difference from the > > toolchain > > convention doc is a bug, and generating object files with SOFT-FLOAT > > flag but parameters/return values passed through FPRs is definitely > > a > > bug. > > > > Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk > > and > > release/gcc-12 branch? > > LGTM! > > Thanks! > > > > > gcc/ChangeLog: > > > > PR target/109000 > > * config/loongarch/loongarch.h (FP_RETURN): Use > > TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT. > > (UNITS_PER_FP_ARG): Likewise. > > > > gcc/testsuite/ChangeLog: > > > > PR target/109000 > > * gcc.target/loongarch/flt-abi-isa-1.c: New test. > > * gcc.target/loongarch/flt-abi-isa-2.c: New test. > > * gcc.target/loongarch/flt-abi-isa-3.c: New test. > > * gcc.target/loongarch/flt-abi-isa-4.c: New test. > > --- > > gcc/config/loongarch/loongarch.h | 4 ++-- > > gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c | 14 > > ++ > > gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c | 10 ++ > > gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c | 9 + > > gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c | 10 ++ > > 5 files changed, 45 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa- > > 1.c > > create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa- > > 2.c > > create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa- > > 3.c > > create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa- > > 4.c > > > > diff --git a/gcc/config/loongarch/loongarch.h > > b/gcc/config/loongarch/loongarch.h > > index f4e903d46bb..f8167875646 100644 > > --- a/gcc/config/loongarch/loongarch.h > > +++ b/gcc/config/loongarch/loongarch.h > > @@ -676,7 +676,7 @@ enum reg_class > > point values. */ > > > > #define GP_RETURN (GP_REG_FIRST + 4) > > -#define FP_RETURN ((TARGET_SOFT_FLOAT) ? GP_RETURN : (FP_REG_FIRST > > + 0)) > > +#define FP_RETURN ((TARGET_SOFT_FLOAT_ABI) ? GP_RETURN : > > (FP_REG_FIRST + 0)) > > > > #define MAX_A
Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
On Mon, 2023-03-06 at 13:59 +0800, Xi Ruoyao via Gcc-patches wrote: > On Mon, 2023-03-06 at 11:16 +0800, Xi Ruoyao wrote: > > /* snip */ > > > > > Sorry for the late reply, the first patch I think is fine. But I > > > > haven't > > > > reproduced the problem of the second mail. > > > > > > > > Is there any special option in the configuration? > > > > > > Oh some strange thing might be happening... I'll try to figure out what > > > has caused the behavior difference. > > > > Oh no, the difference is caused by --enable-default-pie. > > > > Maybe I should just add -fno-PIE for the dg-options. But now I'm still > > puzzled: why would -fPIE affect code generation on LoongArch? AFAIK all > > the code we are generating is position independent (at least for now). > > Without -fPIE, the compiler stores a register with no reason: Pushed the first patch as r13-6501. The second one is dropped and I've created PR109035 for the "unnecessary store" issue (after some failed attempts to triage it). > $ cat t.c > int test(int x) > { > char buf[128 << 10]; > return buf[x]; > } > $ ./gcc/cc1 t.c -nostdinc -O2 -fdump-rtl-all -o- 2>/dev/null | grep test: > -A20 > test: > .LFB0 = . > lu12i.w $r13,-135168>>12# 0xfffdf000 > ori $r13,$r13,4080 > add.d $r3,$r3,$r13 > .LCFI0 = . > lu12i.w $r12,-131072>>12# 0xfffe > lu12i.w $r13,131072>>12 # 0x2 > add.d $r13,$r13,$r12 > addi.d $r12,$r3,16 > add.d $r12,$r13,$r12 > lu12i.w $r13,131072>>12 # 0x2 > st.d$r12,$r3,8 > ori $r13,$r13,16 > ldx.b $r4,$r12,$r4 > add.d $r3,$r3,$r13 > .LCFI1 = . > jr $r1 > .LFE0: > .size test, .-test > .section.eh_frame,"aw",@progbits > > Note the "st.d $r12,$r3,8" instruction is completely meaningless. > > The t.c.300r.ira dump contains some "interesting" thing: > > Pass 0 for finding pseudo/allocno costs > > a0 (r87,l0) best GR_REGS, allocno GR_REGS > a1 (r84,l0) best NO_REGS, allocno NO_REGS > a2 (r83,l0) best GR_REGS, allocno GR_REGS > > a0(r87,l0) costs: SIBCALL_REGS:2000,2000 JIRL_REGS:2000,2000 > CSR_REGS:2000,2000 GR_REGS:2000,2000 FP_REGS:8000,8000 ALL_REGS:32000,32000 > MEM:8000,8000 > a1(r84,l0) costs: SIBCALL_REGS:100,100 JIRL_REGS:100,100 > CSR_REGS:100,100 GR_REGS:100,100 FP_REGS:1004000,1004000 > ALL_REGS:1016000,1016000 MEM:1004000,1004000 > a2(r83,l0) costs: SIBCALL_REGS:100,100 JIRL_REGS:100,100 > CSR_REGS:100,100 GR_REGS:100,100 FP_REGS:1004000,1004000 > ALL_REGS:1008000,1008000 MEM:1004000,1004000 > > > Here r84 is the pseudo register for ($frame - 131072). Any idea why the > compiler selects "NO_REGS" here? > > FWIW RISC-V port suffers the same issue: > https://godbolt.org/z/aPorqj73b. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack
On Mon, 2023-03-06 at 16:12 +0800, Lulu Cheng wrote: > Has the first patch been merged into the main branch yet? > > I think there is one more test case that needs to be modified: > > --- a/gcc/testsuite/gcc.target/loongarch/prolog-opt.c > +++ b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c > @@ -1,7 +1,7 @@ > /* Test that LoongArch backend stack drop operation optimized. */ > > /* { dg-do compile } */ > -/* { dg-options "-O2 -mabi=lp64d" } */ > +/* { dg-options "-O2 -mabi=lp64d -fno-stack-protector" } */ > /* { dg-final { scan-assembler "addi.d\t\\\$r3,\\\$r3,-16" } } */ The first patch contains this hunk. It's r13-6501 now. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Pushed: [PATCH 0/8] aarch64: testsuite: Fix test failures with --enable-default-pie or --enable-default-ssp
On Thu, 2023-03-02 at 10:26 +, Richard Sandiford wrote: > Xi Ruoyao writes: > > Hi, > > > > This patch series fixes a lot of test failures with --enable-default-pie > > or --enable-default-ssp for AArch64 target. Only test files are changed > > to disable PIE or SSP to satisify the expectation of the developer who > > programmed the test. > > > > Bootstrapped and regtested on aarch64-linux-gnu. Ok for trunk? > > OK for the series. Thanks for doing this! Pushed r13-6516 .. r13-6523. > > Richard > > > Xi Ruoyao (8): > > aarch64: testsuite: disable PIE for aapcs64 tests [PR70150] > > aarch64: testsuite: disable PIE for tests with large code model > > [PR70150] > > aarch64: testsuite: disable PIE for fuse_adrp_add_1.c [PR70150] > > aarch64: testsuite: disable stack protector for sve-pcs tests > > aarch64: testsuite: disable stack protector for pr103147-10 tests > > aarch64: testsuite: disable stack protector for auto-init-7.c > > aarch64: testsuite: disable stack protector for pr104005.c > > aarch64: testsuite: disable stack protector for tests relying on stack > > offset > > > > gcc/testsuite/g++.target/aarch64/pr103147-10.C | 2 +- > > gcc/testsuite/gcc.dg/tls/pr78796.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/aapcs64/aapcs64.exp | 2 +- > > gcc/testsuite/gcc.target/aarch64/auto-init-7.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/fuse_adrp_add_1.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr103147-10.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr104005.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr63304_1.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr70120-2.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr78733.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr79041-2.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr94530.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/pr94577.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/reload-valid-spoff.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/stack-check-cfa-1.c | 2 +- > > gcc/testsuite/gcc.target/aarch64/stack-check-cfa-2.c | 2 +- > > .../gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp | 7 --- > > gcc/testsuite/gcc.target/aarch64/test_frame_17.c | 2 +- > > 19 files changed, 22 insertions(+), 21 deletions(-) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] LoongArch: Control all __crc* __crcc* builtin functions with macro __loongarch64.
On Mon, 2023-03-13 at 11:52 +0800, Lulu Cheng wrote: > diff --git a/gcc/config/loongarch/larchintrin.h > b/gcc/config/loongarch/larchintrin.h > index e571ed27b37..09f9a5db846 100644 > --- a/gcc/config/loongarch/larchintrin.h > +++ b/gcc/config/loongarch/larchintrin.h > @@ -145,6 +145,7 @@ __asrtgt_d (long int _1, long int _2) > #error "Unsupported ABI." > #endif > > +#ifdef __loongarch64 Use __loongarch_lp64. __loongarch64 is deemed as "for Compatibility with Historical Code" now, and it's defined "as-is" (__loongarch_grlen == 64). But we are using "long int" in some of these functions, which is only OK in LP64 ABI. Or maybe we can handle this better like... #if __loongarch_grlen == 64 /* the hardware supports crc.w.d.w */ #if __loongarch_lp64 extern __inline int __crc_w_d_w (long int _1, int _2) { return (int) __builtin_loongarch_crc_w_d_w (_1, _2); } #else /* __loongarch_lp64 */ extern __inline int __crc_w_d_w (long long int _1, int _2) { return (int) __builtin_loongarch_crc_w_d_w (_1, _2); } #endif /* __loongarch_lp64 */ #endif /* __loongarch_grlen */ (BTW I'm pretty sure the casts for arguments are not needed; the casts for the return value may be needed to suppress -Wconversion if __builtin_loongarch_crc_w_d_w does not return an int, but why not change __builtin_loongarch_crc_w_d_w to return int then?) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] LoongArch: Control all __crc* __crcc* builtin functions with macro __loongarch64.
On Mon, 2023-03-13 at 12:40 +0800, WANG Xuerui wrote: > This is ugly. The fact all current LA32 models don't support CRC ops is > just a coincidence; it's entirely possible for a future product > iteration to introduce such functionality. It's not like the CRC*.W.W.W > ops require anything wider than 32 bits, after all. Maybe the correct way would be adding a switch like MIPS -mcrc or x86 -mcrc32. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] LoongArch: Control all __crc* __crcc* builtin functions with macro __loongarch64.
On Mon, 2023-03-13 at 12:58 +0800, Lulu Cheng wrote: > > 在 2023/3/13 下午12:54, Xi Ruoyao 写道: > > On Mon, 2023-03-13 at 12:40 +0800, WANG Xuerui wrote: > > > This is ugly. The fact all current LA32 models don't support CRC > > > ops is > > > just a coincidence; it's entirely possible for a future product > > > iteration to introduce such functionality. It's not like the > > > CRC*.W.W.W > > > ops require anything wider than 32 bits, after all. > > Maybe the correct way would be adding a switch like MIPS -mcrc or > > x86 > > -mcrc32. > > > Because these instructions are part of the LA64 base instruction set, > there are no control options here. I'd postpone the change until we add LA32 support because there is no details about LA32 now and it's hard to determine how to gate this in a best way... -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Pushed: [PATCH] builtins: Move the character difference into result instead of reassigning result [PR109086]
Already approved in bugzilla and bootstrapped on x86_64-linux-gnu. Pushed. expand_simple_binop() is allowed to allocate a new pseudo-register and return it, instead of forcing the result into the provided pseudo-register. This can cause a problem when we expand the unrolled loop for __builtin_strcmp: the compiler always generates code for all n iterations of the loop, so "result" will be an alias of the pseudo-register allocated and used in the last iteration; but at runtime the loop can break early, causing this pseudo-register uninitialized. Emit a move instruction in the iteration to force the difference into one register which has been allocated before the loop, to avoid this issue. gcc/ChangeLog: PR other/109086 * builtins.cc (inline_string_cmp): Force the character difference into "result" pseudo-register, instead of reassign the pseudo-register. --- gcc/builtins.cc | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/gcc/builtins.cc b/gcc/builtins.cc index 305c65c29be..90246e214d6 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -7142,8 +7142,16 @@ inline_string_cmp (rtx target, tree var_str, const char *const_str, op0 = convert_modes (mode, unit_mode, op0, 1); op1 = convert_modes (mode, unit_mode, op1, 1); - result = expand_simple_binop (mode, MINUS, op0, op1, - result, 1, OPTAB_WIDEN); + rtx diff = expand_simple_binop (mode, MINUS, op0, op1, + result, 1, OPTAB_WIDEN); + + /* Force the difference into result register. We cannot reassign +result here ("result = diff") or we may end up returning +uninitialized result when expand_simple_binop allocates a new +pseudo-register for returning. */ + if (diff != result) + emit_move_insn (result, diff); + if (i < length - 1) emit_cmp_and_jump_insns (result, CONST0_RTX (mode), NE, NULL_RTX, mode, true, ne_label); -- 2.40.0
Re: [PATCH] libstdc++: use new built-in trait __is_reference
You need to CC libstd...@gcc.gnu.org for any patches touching libstdc++. On Sat, 2023-03-18 at 21:21 -0700, Ken Matsui via Gcc-patches wrote: > libstdc++-v3/ChangeLog: > > * include/std/type_traits (is_reference): Use __is_reference built-in > trait. Bad ChangeLog format. You should have a tab (not 4 or 8 spaces, nor nothing) to indent the ChangeLog content. Is there any benefit to use a builtin, instead of the existing implementation? I can see no but maybe I'm stupid. > --- > diff --git a/libstdc++-v3/include/std/type_traits > b/libstdc++-v3/include/std/type_traits The patch fails to apply. It seems because your mail client inserted an additional newline before "b/". Try to use git-send-email or configure the mail client properly. > index 2bd607a8b8f..18408d8ceb6 100644 > --- a/libstdc++-v3/include/std/type_traits > +++ b/libstdc++-v3/include/std/type_traits > @@ -639,6 +639,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > // Composite type categories. > > /// is_reference > +#if __has_builtin(__is_reference) > + template > + struct is_reference > + : public integral_constant If a patch depends on another patch not applied yet, sent them in a series. Or people are puzzled because when this patch is applied alone, the code fails to build. > + { }; > +#else > template > struct is_reference > : public false_type > @@ -653,6 +659,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > struct is_reference<_Tp&&> > : public true_type > { }; > +#endif > > /// is_arithmetic > template -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] libstdc++: use new built-in trait __is_reference
On Mon, 2023-03-20 at 00:30 -0700, Ken Matsui wrote: > I see. Thank you! Please continue to read. I guess you missed some inline comments from me... > > On Mon, Mar 20, 2023 at 12:26 AM Xi Ruoyao wrote: > > > > You need to CC libstd...@gcc.gnu.org for any patches touching > > libstdc++. > > > > On Sat, 2023-03-18 at 21:21 -0700, Ken Matsui via Gcc-patches wrote: > > > libstdc++-v3/ChangeLog: > > > > > > * include/std/type_traits (is_reference): Use __is_reference > > > built-in > > > trait. > > > > Bad ChangeLog format. You should have a tab (not 4 or 8 spaces, nor > > nothing) to indent the ChangeLog content. > > > > Is there any benefit to use a builtin, instead of the existing > > implementation? I can see no but maybe I'm stupid. > > > > > --- > > > diff --git a/libstdc++-v3/include/std/type_traits > > > b/libstdc++-v3/include/std/type_traits > > > > The patch fails to apply. It seems because your mail client > > inserted an > > additional newline before "b/". Try to use git-send-email or > > configure > > the mail client properly. > > > > > index 2bd607a8b8f..18408d8ceb6 100644 > > > --- a/libstdc++-v3/include/std/type_traits > > > +++ b/libstdc++-v3/include/std/type_traits > > > @@ -639,6 +639,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > > // Composite type categories. > > > > > > /// is_reference > > > +#if __has_builtin(__is_reference) > > > + template > > > + struct is_reference > > > + : public integral_constant > > > > If a patch depends on another patch not applied yet, sent them in a > > series. Or people are puzzled because when this patch is applied > > alone, > > the code fails to build. > > > > > + { }; > > > +#else > > > template > > > struct is_reference > > > : public false_type > > > @@ -653,6 +659,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > > struct is_reference<_Tp&&> > > > : public true_type > > > { }; > > > +#endif > > > > > > /// is_arithmetic > > > template > > > > -- > > Xi Ruoyao > > School of Aerospace Science and Technology, Xidian University -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] libstdc++: use new built-in trait __is_reference
On Mon, 2023-03-20 at 01:03 -0700, Ken Matsui wrote: > Oops, I assumed those were my email... Thank you for your heads up and > your comments! > > > Bad ChangeLog format. You should have a tab (not 4 or 8 spaces, nor > > nothing) to indent the ChangeLog content. > > Do you mean like the following? > > ``` > libstdc++-v3/ChangeLog: > > [TAB]* include/std/type_traits (is_reference): Use __is_reference > built-in > trait. > ``` Yep. > > Is there any benefit to use a builtin, instead of the existing > > implementation? I can see no but maybe I'm stupid. > > My patches are based on the GSoC project "C++: Implement compiler > built-in traits for the standard library traits". These built-in > traits basically make the compilation faster. > > https://gcc.gnu.org/wiki/SummerOfCode Ok, to me making compilation faster is a valid reason. > > The patch fails to apply. It seems because your mail client inserted an > > additional newline before "b/". Try to use git-send-email or configure > > the mail client properly. > > Let me try to use git-send-email instead. I stupidly don't understand > how to use them, so I was making my patches manually... Or adjust the mail client correctly. You can send the patch to yourself first and see if it's not "mangled" by the mail client when you debug such an issue... But when you finally end up sending 10 patches in a series you'll find git send-email much easier :). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2] LoongArch: Libvtv add loongarch support.
On Mon, 2022-10-10 at 10:49 -0700, Caroline Tice via Gcc-patches wrote: > Is "if (VTV_PAGE_SIZE != sysconf (_SC_PAGE_SIZE))" going to fail for > loongarch? Because LoongArch systems may have 4KB, 16KB, or 64KB pages. > If not, why do you need to insert anything here at all? If so, > perhaps you could write something similar to sysconf_SC_PAGE_SIZE for > loongarch (as was done for __CYGWIN__ & __MINGW32__)? I'd like to ask a question: if we set VTV_PAGE_SIZE to 64KB and make the special case, will libvtv work for 4KB and 16KB pages? (If I read code correctly, setting VTV_PAGE_SIZE to 4KB will obviously break 16KB or 64KB configuration.) If VTV_PAGE_SIZE == sysconf (_SC_PAGE_SIZE) is strictly required for libvtv we'll have to keep the check as-is and then we'll only support 16KB page configuration (which is the default in Linux kernel configuration for LoongArch). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] Optimize nested permutation to single VEC_PERM_EXPR [PR54346]
On Mon, 2022-09-26 at 14:56 +0800, Liwei Xu via Gcc-patches wrote: > This patch implemented the optimization in PR 54346, which Merges > > c = VEC_PERM_EXPR ; > d = VEC_PERM_EXPR ; > to > d = VEC_PERM_EXPR ; > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > tree-ssa/forwprop-19.c fail to pass but I'm not sure whether it > is ok to removed it. I'm getting: FAIL: gcc.dg/pr54346.c scan-tree-dump dse1 "VEC_PERM_EXPR.*{ 3, 6, 0, 0 }" FAIL: gcc.dg/pr54346.c scan-tree-dump-times dse1 "VEC_PERM_EXPR" 1 on loongarch64-linux-gnu. Not sure why. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH] LoongArch: implement count_{leading,trailing}_zeros
LoongArch always support clz and ctz instructions, so we can always use __builtin_{clz,ctz} for count_{leading,trailing}_zeros. This improves the code of libgcc, and also benefits Glibc once we merge longlong.h there. Bootstrapped and regtested on loongarch64-linux-gnu. include/ChangeLog: * longlong.h [__loongarch__] (count_leading_zeros): Define. [__loongarch__] (count_trailing_zeros): Likewise. [__loongarch__] (COUNT_LEADING_ZEROS_0): Likewise. --- include/longlong.h | 12 1 file changed, 12 insertions(+) diff --git a/include/longlong.h b/include/longlong.h index 64a7b10f9b2..c3a6f1e7eaa 100644 --- a/include/longlong.h +++ b/include/longlong.h @@ -593,6 +593,18 @@ extern UDItype __umulsidi3 (USItype, USItype); #define UMUL_TIME 14 #endif +#ifdef __loongarch__ +# if W_TYPE_SIZE == 32 +# define count_leading_zeros(count, x) ((count) = __builtin_clz (x)) +# define count_trailing_zeros(count, x) ((count) = __builtin_ctz (x)) +# define COUNT_LEADING_ZEROS_0 32 +# elif W_TYPE_SIZE == 64 +# define count_leading_zeros(count, x) ((count) = __builtin_clzll (x)) +# define count_trailing_zeros(count, x) ((count) = __builtin_ctzll (x)) +# define COUNT_LEADING_ZEROS_0 64 +# endif +#endif + #if defined (__M32R__) && W_TYPE_SIZE == 32 #define add_ss(sh, sl, ah, al, bh, bl) \ /* The cmp clears the condition bit. */ \ -- 2.38.0
Re: [PATCH] Optimize nested permutation to single VEC_PERM_EXPR [PR54346]
On Thu, 2022-10-13 at 14:15 +0800, Levy wrote: > Hi RuoYao > > It’s probably because loongarch64 doesn’t support > can_vec_perm_const_p(result_mode, op_mode, sel2, false) > > I’m not sure whether if loongarch will support it or should I just > limit the test target for pr54346.c? I'm not sure if we can add TARGET_VECTORIZE_VEC_PERM_CONST when we don't actually support vector. (LoongArch has SIMD instructions but the support in GCC won't be added in a very recent future.) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
pushed: [PATCH] LoongArch: implement count_{leading,trailing}_zeros
On Thu, 2022-10-13 at 16:43 +0800, Lulu Cheng wrote: > Looks good to me! > > Thanks! Pushed r13-3269. > > 在 2022/10/12 下午10:23, Xi Ruoyao 写道: > > LoongArch always support clz and ctz instructions, so we can always > > use > > __builtin_{clz,ctz} for count_{leading,trailing}_zeros. This > > improves > > the code of libgcc, and also benefits Glibc once we merge longlong.h > > there. > > > > Bootstrapped and regtested on loongarch64-linux-gnu. > > > > include/ChangeLog: > > > > * longlong.h [__loongarch__] (count_leading_zeros): Define. > > [__loongarch__] (count_trailing_zeros): Likewise. > > [__loongarch__] (COUNT_LEADING_ZEROS_0): Likewise. > > --- > > include/longlong.h | 12 > > 1 file changed, 12 insertions(+) > > > > diff --git a/include/longlong.h b/include/longlong.h > > index 64a7b10f9b2..c3a6f1e7eaa 100644 > > --- a/include/longlong.h > > +++ b/include/longlong.h > > @@ -593,6 +593,18 @@ extern UDItype __umulsidi3 (USItype, USItype); > > #define UMUL_TIME 14 > > #endif > > > > +#ifdef __loongarch__ > > +# if W_TYPE_SIZE == 32 > > +# define count_leading_zeros(count, x) ((count) = __builtin_clz > > (x)) > > +# define count_trailing_zeros(count, x) ((count) = __builtin_ctz > > (x)) > > +# define COUNT_LEADING_ZEROS_0 32 > > +# elif W_TYPE_SIZE == 64 > > +# define count_leading_zeros(count, x) ((count) = __builtin_clzll > > (x)) > > +# define count_trailing_zeros(count, x) ((count) = __builtin_ctzll > > (x)) > > +# define COUNT_LEADING_ZEROS_0 64 > > +# endif > > +#endif > > + > > #if defined (__M32R__) && W_TYPE_SIZE == 32 > > #define add_ss(sh, sl, ah, al, bh, bl) \ > > /* The cmp clears the condition bit. */ \ > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: Announcement: Porting the Docs to Sphinx - 9. November 2022
On Mon, 2022-10-17 at 15:28 +0200, Martin Liška wrote: > Hello. > > Based on the very positive feedback I was given at the Cauldron Sphinx > Documentation BoF, > I'm planning migrating the documentation on 9th November. There are still > some minor comments > from Sandra when it comes to the PDF output, but we can address that once the > conversion is done. > > The reason I'm sending the email now is that I was waiting for latest Sphinx > release (5.3.0) that > simplifies reference format for options and results in much simpler Option > summary section ([1]) > > The current GCC master (using Sphinx 5.3.0) converted docs can be seen here: > https://splichal.eu/scripts/sphinx/ > > If you see any issues with the converted documentation, or have a feedback > about it, > please reply to this email. Ouch. This will be very painful for Linux From Scratch. We'll need to add 23 Python modules to build the documentation, while we only have 88 packages in total currently... And we don't want to omit GCC documentation in our system. Could generated man and info pages be provided as a tarball on gcc.gnu.org or ftp.gnu.org?
Re: Announcement: Porting the Docs to Sphinx - 9. November 2022
(CC our team members.) On Thu, 2022-10-20 at 13:27 +0200, Martin Liška wrote: > > Ouch. This will be very painful for Linux From Scratch. We'll need to > > add 23 Python modules to build the documentation, while we only have 88 > > packages in total currently... And we don't want to omit GCC > > documentation in our system. > > Various other distros will have to face it too. The proper solution is a > multi-build > package (gcc:doc) which can be built later in the dependency chain. Btw. do > you also > provide PDF documentation in your system? No (texlive is much heavier than Sphinx). But generally we expect man pages and info pages. We can separate man and info into the second-time build in BLFS (we're already doing this now for Go, Objective C, etc.), but I don't really like to omit the man and info pages... > > Could generated man and info pages be provided as a tarball on > > gcc.gnu.org or ftp.gnu.org? > > Not planning doing that.
Re: Announcement: Porting the Docs to Sphinx - 9. November 2022
On Thu, 2022-10-20 at 13:53 +0200, Martin Liška wrote: > On 10/20/22 13:49, Xi Ruoyao wrote: > > (CC our team members.) > > > > On Thu, 2022-10-20 at 13:27 +0200, Martin Liška wrote: > > > > Ouch. This will be very painful for Linux From Scratch. We'll need to > > > > add 23 Python modules to build the documentation, while we only have 88 > > > > packages in total currently... And we don't want to omit GCC > > > > documentation in our system. > > > > > > Various other distros will have to face it too. The proper solution is a > > > multi-build > > > package (gcc:doc) which can be built later in the dependency chain. Btw. > > > do you also > > > provide PDF documentation in your system? > > > > No (texlive is much heavier than Sphinx). But generally we expect man > > pages and info pages. > > > > We can separate man and info into the second-time build in BLFS (we're > > already doing this now for Go, Objective C, etc.), > > Do the same for GCC. > > > but I don't really > > like to omit the man and info pages.. > > What should I do about it? We want to switch to a more modern documentation > tool > called Sphinx and yes, it will make packaging of the GCC more complicated. Nothing, I guess. We'll handle it on our side (if we finally decide to ship the man/info tarballs we can generate them by ourselves). I was just trying to find a simpler solution before beginning all the work :). Thanks!
Re: [PATCH v2] RISC-V: Libitm add RISC-V support.
On Thu, 2022-10-27 at 17:44 -0700, Palmer Dabbelt wrote: > though I don't have an opinion on whether libitm should be taking ports > to new targets, I'd never even heard of it before. I asked this question to myself when I reviewed LoongArch libitm port. But I remember one maintainer of Deepin (a distro) has complained that some packages were depending on libitm (and/or libvtv). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2] LoongArch: Optimize immediate load.
On Tue, 2022-11-01 at 14:19 +0800, Lulu Cheng wrote: > +;; Load immediate to the 32-63 bits of the source register. > +(define_insn_and_split "load_hi32" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (ior:DI > + (and:DI (match_operand:DI 1 "register_operand" "0") > + (match_operand 2 "hi32_mask_operand")) > + (match_operand 3 "const_hi32_operand" "x")))] > + "TARGET_64BIT" > + "#" > + "" > + [(set (match_dup 0) > + (ior:DI > + (zero_extend:DI > + (subreg:SI (match_dup 1) 0)) > + (match_dup 4))) > + (set (match_dup 0) > + (ior:DI > + (and:DI (match_dup 0) > + (match_dup 6)) > + (match_dup 5)))] > +{ > + operands[4] = GEN_INT (INTVAL (operands[3]) << 12 >> 12); It's an undefined behavior if INTVAL (operands[3]) is negative. > + operands[5] = GEN_INT (INTVAL (operands[3]) & 0xfff0); > + operands[6] = GEN_INT (0xf); > +} > + [(set_attr "insn_count" "2")]) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2] libgccjit: allow common objects in $(EXTRA_GCC_OBJS) and $(EXTRA_OBJS)
Ping. I'd like to see libgccjit working on LoongArch so I would be able to submit a Rust port to upstream. If the result is NACK I'd like to know alternative approaches to fix the build failure. I doubt if "j...@gcc.gnu.org" is really used, so CC'ed the JIT maintainer listed in MAINTAINERS. On Thu, 2022-05-19 at 16:10 +0800, Yang Yujie wrote: > Hello, > > This patch fixes libgccjit build failure on loongarch* targets, > and could probably be useful for future ports. > > For now, libgccjit is linked with objects from $(EXTRA_GCC_OBJS) and > libbackend.a, which contains object files from $(EXTRA_OBJS). > > This effectively forbids any overlap between those two lists, i.e. all > target-specific shared code between the gcc driver and compiler > executables must go into gcc/common/config//-common.cc, > which feels a bit inconvenient when there are a lot of "common" stuff > that we want to put into separate source files. > > By linking libgccjit with $(EXTRA_GCC_OBJS_EXCLUSIVE), which contains > all elements from $(EXTRA_GCC_OBJS) but not $(EXTRA_OBJS), this > problem > can be alleviated. > > This patch does not affect any other target architecture than > loongarch, > and has been bootstrapped and regression-tested on loongarch64-linux- > gnuf64 > an x86_64-pc-linux-gnu. > > Any recommendations? Please review. Thanks a lot. > > Yujie > > * gcc/jit/ChangeLog: > > * Make-lang.in: only link objects from $(EXTRA_GCC_OBJS) > that's not in $(EXTRA_OBJS) into libgccjit. > --- > gcc/jit/Make-lang.in | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in > index 6e10abfd0ac..248ec45b729 100644 > --- a/gcc/jit/Make-lang.in > +++ b/gcc/jit/Make-lang.in > @@ -157,18 +157,23 @@ LIBGCCJIT_EXTRA_OPTS = > $(LIBGCCJIT_VERSION_SCRIPT_OPTION) \ > endif > endif > > +# Only link objects from $(EXTRA_GCC_OBJS) that's not already > +# included in libbackend.a ($(EXTRA_OBJS)). > +EXTRA_GCC_OBJS_EXCLUSIVE = $(foreach _obj1, $(EXTRA_GCC_OBJS), \ > + $(if $(filter $(_obj1), $(EXTRA_OBJS)),, $(_obj1))) > + > # We avoid using $(BACKEND) from Makefile.in in order to avoid > pulling > # in main.o > $(LIBGCCJIT_FILENAME): $(jit_OBJS) \ > libbackend.a libcommon-target.a libcommon.a \ > $(CPPLIB) $(LIBDECNUMBER) \ > $(LIBDEPS) $(srcdir)/jit/libgccjit.map \ > - $(EXTRA_GCC_OBJS) $(jit.prev) > + $(EXTRA_GCC_OBJS_EXCLUSIVE) $(jit.prev) > @$(call LINK_PROGRESS,$(INDEX.jit),start) > +$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ -shared \ > $(jit_OBJS) libbackend.a libcommon-target.a libcommon.a \ > $(CPPLIB) $(LIBDECNUMBER) $(EXTRA_GCC_LIBS) $(LIBS) > $(BACKENDLIBS) \ > - $(EXTRA_GCC_OBJS) \ > + $(EXTRA_GCC_OBJS_EXCLUSIVE) \ > $(LIBGCCJIT_EXTRA_OPTS) > @$(call LINK_PROGRESS,$(INDEX.jit),end) > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] Mips: Fix the ASAN shadow offset hook for the n32 ABI
On Mon, 2022-06-06 at 09:28 +, Dimitrije Milosevic wrote: > Fix the ASAN shadow offset hook for the n32 ABI. > > gcc/ChangeLog: > > * config/mips/mips.cc (mips_asan_shadow_offset): Reformat > to handle the N32 ABI. > * config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Remove > the macro, as it is not needed anymore. > > --- > > gcc/config/mips/mips.cc | 7 ++- > gcc/config/mips/mips.h | 7 --- > 2 files changed, 6 insertions(+), 8 deletions(-) > > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc > index 2dce4007678..91e651c458e 100644 > --- a/gcc/config/mips/mips.cc > +++ b/gcc/config/mips/mips.cc > @@ -22745,7 +22745,12 @@ mips_constant_alignment (const_tree exp, > HOST_WIDE_INT align) > static unsigned HOST_WIDE_INT > mips_asan_shadow_offset (void) > { > - return SUBTARGET_SHADOW_OFFSET; > + if (mips_abi == ABI_N32) > + return (HOST_WIDE_INT_1 << 29); > + if (POINTER_SIZE == 64) > + return (HOST_WIDE_INT_1 << 37); > + else > + return HOST_WIDE_INT_C (0x0aaa); > } > > /* Implement TARGET_STARTING_FRAME_OFFSET. See > mips_compute_frame_info > diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h > index 858bbba3a36..0029864fdcd 100644 > --- a/gcc/config/mips/mips.h > +++ b/gcc/config/mips/mips.h > @@ -3463,10 +3463,3 @@ struct GTY(()) machine_function { > && !TARGET_MICROMIPS && !TARGET_FIX_24K) > > #define NEED_INDICATE_EXEC_STACK 0 > - > -/* Define the shadow offset for asan. Other OS's can override in the > - respective tm.h files. */ > -#ifndef SUBTARGET_SHADOW_OFFSET > -#define SUBTARGET_SHADOW_OFFSET \ > - (POINTER_SIZE == 64 ? HOST_WIDE_INT_1 << 37 : HOST_WIDE_INT_C > (0x0aaa)) > -#endif > > --- I think this depends on https://reviews.llvm.org/D127096 (not committed yet)? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] Mips: Enable asynchronous unwind tables with both ASAN and TSAN
On Mon, 2022-05-30 at 07:10 +, Dimitrije Milosevic wrote: > Hi Xi, thanks for pointing this out. I'd definitely say that the > https://clang.llvm.org/docs/ThreadSanitizer.html documentation is > outdated. According > tohttps://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#s > upported-platforms TSAN is supported on Mips64. Furthermore, there are > actual code segments (in compiler-rt/lib/tsan/rtl/tsan_platforms.h, > for example) related to Mips64. > I didn't add the 64-bit target check, however. Here is the updated > version of the patch. Well, so should we add TSAN_SUPPORTED=yes for MIPS64 in libsanitizer/configure.tgt first? I'll try this on my MIPS64 in a few days. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2] libgccjit: allow common objects in $(EXTRA_GCC_OBJS) and $(EXTRA_OBJS)
On Mon, 2022-06-06 at 18:33 -0400, David Malcolm wrote: > > On Thu, 2022-05-19 at 16:10 +0800, Yang Yujie wrote: > > > This patch does not affect any other target architecture than > > > loongarch, > > > and has been bootstrapped and regression-tested on loongarch64- > > > linux- > > > gnuf64 > > > an x86_64-pc-linux-gnu. > > > > > > Any recommendations? Please review. Thanks a lot. > > The patch looks good to me. Pushed as r13-1010. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PING][PATCH][WIP] have configure probe prefix for gmp/mpfr/mpc [PR44425]
On Thu, 2022-06-09 at 16:04 -0400, Eric Gallager via Gcc-patches wrote: > Hi, I'd like to ping this patch: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596126.html > (cc-ing the build machinery maintainers listed in MAINTAINERS this > time) > > On Thu, Jun 2, 2022 at 11:53 AM Eric Gallager > wrote: > > > > So, I'm working on fixing PR bootstrap/44425, and have this patch to > > have the top-level configure script check in the value passed to > > `--prefix=` when looking for gmp/mpfr/mpc. It "works" (in that > > configuring with just `--prefix=` and none of > > `--with-gmp=`/`--with-mpfr=`/`--with-mpc=` now works where it failed > > before), but unfortunately it results in a bunch of duplicated > > `-I`/`-L` flags stuck in ${gmplibs} and ${gmpinc}... is that > > acceptable or should I try another approach? > > Eric A patch should not edit configure directly. configure.ac should be edited and configure should be regenerated from it. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] Mips: Enable asynchronous unwind tables with both ASAN and TSAN
> Well, so should we add TSAN_SUPPORTED=yes for MIPS64 in > libsanitizer/configure.tgt first? I'll try this on my MIPS64 in a few > days. Just tried TSAN_SUPPORTED=yes with asynchronous unwind tables enabled, but I got some strange test failures for tls_race.c: FAIL: c-c++-common/tsan/tls_race.c -O0 output pattern test Output was: ThreadSanitizer: CHECK failed: tsan_platform_linux.cpp:452 "((thr_end)) <= ((tls_addr + tls_size))" (0xffec35f8c0, 0xffec35f784) (tid=748216) #0 __tsan::CheckUnwind() ../../../../gcc/libsanitizer/tsan/tsan_rtl.cpp:627 (libtsan.so.2+0xa30ec) #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ../../../../gcc/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 (libtsan.so.2+0xeb8cc) #2 __tsan::ImitateTlsWrite(__tsan::ThreadState*, unsigned long, unsigned long) ../../../../gcc/libsanitizer/tsan/tsan_platform_linux.cpp:452 (libtsan.so.2+0xa0cac) #3 __tsan::ThreadStart(__tsan::ThreadState*, unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../gcc/libsanitizer/tsan/tsan_rtl_thread.cpp:197 (libtsan.so.2+0xc0e88) #4 __tsan_thread_start_func ../../../../gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1009 (libtsan.so.2+0x3e5dc) #5 start_thread /sources/glibc-2.35/nptl/pthread_create.c:442 (libc.so.6+0xc75f4) I've tried to diagnose the root cause but failed. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
PING^5: [PATCH] mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]
Ping again. On Mon, 2021-06-21 at 21:42 +0800, Xi Ruoyao wrote: > Middle-end started to emit vec_cmp and vec_cmpu since GCC 11, causing > ICE on MIPS with MSA enabled. Add the pattern to prevent it. > > Bootstrapped and regression tested on mips64el-linux-gnu. > Ok for trunk? > > gcc/ > > * config/mips/mips-protos.h (mips_expand_vec_cmp_expr): > Declare. > * config/mips/mips.c (mips_expand_vec_cmp_expr): New function. > * config/mips/mips-msa.md (vec_cmp): New > expander. > (vec_cmpu): New expander. > --- > gcc/config/mips/mips-msa.md | 22 ++ > gcc/config/mips/mips-protos.h | 1 + > gcc/config/mips/mips.c | 11 +++ > 3 files changed, 34 insertions(+) > > diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md > index 3ecf2bde19f..3a67f25be56 100644 > --- a/gcc/config/mips/mips-msa.md > +++ b/gcc/config/mips/mips-msa.md > @@ -435,6 +435,28 @@ > DONE; > }) > > +(define_expand "vec_cmp" > + [(match_operand: 0 "register_operand") > + (match_operator 1 "" > + [(match_operand:MSA 2 "register_operand") > + (match_operand:MSA 3 "register_operand")])] > + "ISA_HAS_MSA" > +{ > + mips_expand_vec_cmp_expr (operands); > + DONE; > +}) > + > +(define_expand "vec_cmpu" > + [(match_operand: 0 "register_operand") > + (match_operator 1 "" > + [(match_operand:IMSA 2 "register_operand") > + (match_operand:IMSA 3 "register_operand")])] > + "ISA_HAS_MSA" > +{ > + mips_expand_vec_cmp_expr (operands); > + DONE; > +}) > + > (define_insn "msa_insert_" > [(set (match_operand:MSA 0 "register_operand" "=f,f") > (vec_merge:MSA > diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips- > protos.h > index 2cf4ed50292..a685f7f7dd5 100644 > --- a/gcc/config/mips/mips-protos.h > +++ b/gcc/config/mips/mips-protos.h > @@ -385,6 +385,7 @@ extern mulsidi3_gen_fn mips_mulsidi3_gen_fn (enum > rtx_code); > > extern void mips_register_frame_header_opt (void); > extern void mips_expand_vec_cond_expr (machine_mode, machine_mode, > rtx *); > +extern void mips_expand_vec_cmp_expr (rtx *); > > /* Routines implemented in mips-d.c */ > extern void mips_d_target_versions (void); > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c > index 00a8eef96aa..8f043399a8e 100644 > --- a/gcc/config/mips/mips.c > +++ b/gcc/config/mips/mips.c > @@ -22321,6 +22321,17 @@ mips_expand_msa_cmp (rtx dest, enum rtx_code > cond, rtx op0, rtx op1) > } > } > > +void > +mips_expand_vec_cmp_expr (rtx *operands) > +{ > + rtx cond = operands[1]; > + rtx op0 = operands[2]; > + rtx op1 = operands[3]; > + rtx res = operands[0]; > + > + mips_expand_msa_cmp (res, GET_CODE (cond), op0, op1); > +} > + > /* Expand VEC_COND_EXPR, where: > MODE is mode of the result > VIMODE equivalent integer mode -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
PING^5: [PATCH] mips: Fix up mips_atomic_assign_expand_fenv [PR94780]
Ping again. On Wed, 2021-06-23 at 11:11 +0800, Xi Ruoyao wrote: > Commit message shamelessly copied from 1777beb6b129 by jakub: > > This function, because it is sometimes called even outside of function > bodies, uses create_tmp_var_raw rather than create_tmp_var. But in > order > for that to work, when first referenced, the VAR_DECLs need to appear > in a > TARGET_EXPR so that during gimplification the var gets the right > DECL_CONTEXT and is added to local decls. > > Bootstrapped & regtested on mips64el-linux-gnu. Ok for trunk and > backport > to 11, 10, and 9? > > gcc/ > > * config/mips/mips.c (mips_atomic_assign_expand_fenv): Use > TARGET_EXPR instead of MODIFY_EXPR. > --- > gcc/config/mips/mips.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c > index 8f043399a8e..89d1be6cea6 100644 > --- a/gcc/config/mips/mips.c > +++ b/gcc/config/mips/mips.c > @@ -22439,12 +22439,12 @@ mips_atomic_assign_expand_fenv (tree *hold, > tree *clear, tree *update) > tree get_fcsr = mips_builtin_decls[MIPS_GET_FCSR]; > tree set_fcsr = mips_builtin_decls[MIPS_SET_FCSR]; > tree get_fcsr_hold_call = build_call_expr (get_fcsr, 0); > - tree hold_assign_orig = build2 (MODIFY_EXPR, MIPS_ATYPE_USI, > - fcsr_orig_var, get_fcsr_hold_call); > + tree hold_assign_orig = build4 (TARGET_EXPR, MIPS_ATYPE_USI, > + fcsr_orig_var, get_fcsr_hold_call, > NULL, NULL); > tree hold_mod_val = build2 (BIT_AND_EXPR, MIPS_ATYPE_USI, > fcsr_orig_var, > build_int_cst (MIPS_ATYPE_USI, > 0xf003)); > - tree hold_assign_mod = build2 (MODIFY_EXPR, MIPS_ATYPE_USI, > - fcsr_mod_var, hold_mod_val); > + tree hold_assign_mod = build4 (TARGET_EXPR, MIPS_ATYPE_USI, > + fcsr_mod_var, hold_mod_val, NULL, > NULL); > tree set_fcsr_hold_call = build_call_expr (set_fcsr, 1, > fcsr_mod_var); > tree hold_all = build2 (COMPOUND_EXPR, MIPS_ATYPE_USI, > hold_assign_orig, hold_assign_mod); > @@ -22454,8 +22454,8 @@ mips_atomic_assign_expand_fenv (tree *hold, > tree *clear, tree *update) > *clear = build_call_expr (set_fcsr, 1, fcsr_mod_var); > > tree get_fcsr_update_call = build_call_expr (get_fcsr, 0); > - *update = build2 (MODIFY_EXPR, MIPS_ATYPE_USI, > - exceptions_var, get_fcsr_update_call); > + *update = build4 (TARGET_EXPR, MIPS_ATYPE_USI, > + exceptions_var, get_fcsr_update_call, NULL, NULL); > tree set_fcsr_update_call = build_call_expr (set_fcsr, 1, > fcsr_orig_var); > *update = build2 (COMPOUND_EXPR, void_type_node, *update, > set_fcsr_update_call); -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
PING^w: [PATCH] ipa-devirt: check precision mismatch of enum values [PR101396]
Ping again. On Sun, 2021-07-11 at 01:48 +0800, Xi Ruoyao wrote: > We are comparing enum values (in wide_int) to check ODR violation. > However, if we compare two wide_int values with different precision, > we'll trigger an assert, leading to ICE. With enum-base introduced > in C++11, it's easy to sink into this situation. > > To fix the issue, we need to explicitly check this kind of mismatch, > and emit a proper warning message if there is such one. > > Bootstrapped & regtested on x86_64-linux-gnu. Ok for trunk? > > gcc/ > > PR ipa/101396 > * ipa-devirt.c (ipa_odr_read_section): Compare the precision > of > enum values, and emit a warning if they mismatch. > > gcc/testsuite/ > > PR ipa/101396 > * g++.dg/lto/pr101396_0.C: New test. > * g++.dg/lto/pr101396_1.C: New test. > --- > gcc/ipa-devirt.c | 9 + > gcc/testsuite/g++.dg/lto/pr101396_0.C | 12 > gcc/testsuite/g++.dg/lto/pr101396_1.C | 10 ++ > 3 files changed, 31 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/lto/pr101396_0.C > create mode 100644 gcc/testsuite/g++.dg/lto/pr101396_1.C > > diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c > index 8cd1100aba9..8deec75b2df 100644 > --- a/gcc/ipa-devirt.c > +++ b/gcc/ipa-devirt.c > @@ -4193,6 +4193,8 @@ ipa_odr_read_section (struct lto_file_decl_data > *file_data, const char *data, > if (do_warning != -1 || j >= this_enum.vals.length ()) > continue; > if (strcmp (id, this_enum.vals[j].name) > + || (val.get_precision() != > + this_enum.vals[j].val.get_precision()) > || val != this_enum.vals[j].val) > { > warn_name = xstrdup (id); > @@ -4260,6 +4262,13 @@ ipa_odr_read_section (struct lto_file_decl_data > *file_data, const char *data, > "name %qs differs from name %qs defined" > " in another translation unit", > this_enum.vals[j].name, warn_name); > + else if (this_enum.vals[j].val.get_precision() != > + warn_value.get_precision()) > + inform (this_enum.vals[j].locus, > + "name %qs is defined as %u-bit while > another " > + "translation unit defines it as %u-bit", > + warn_name, > this_enum.vals[j].val.get_precision(), > + warn_value.get_precision()); > /* FIXME: In case there is easy way to print > wide_ints, > perhaps we could do it here instead of overflow > check. */ > else if (wi::fits_shwi_p (this_enum.vals[j].val) > diff --git a/gcc/testsuite/g++.dg/lto/pr101396_0.C > b/gcc/testsuite/g++.dg/lto/pr101396_0.C > new file mode 100644 > index 000..b7a2947a880 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/lto/pr101396_0.C > @@ -0,0 +1,12 @@ > +/* { dg-lto-do link } */ > + > +enum A : __UINT32_TYPE__ { // { dg-lto-warning "6: type 'A' violates > the C\\+\\+ One Definition Rule" } > + a, // { dg-lto-note "3: name 'a' is defined as 32-bit while another > translation unit defines it as 64-bit" } > + b, > + c > +}; > + > +int main() > +{ > + return (int) A::a; > +} > diff --git a/gcc/testsuite/g++.dg/lto/pr101396_1.C > b/gcc/testsuite/g++.dg/lto/pr101396_1.C > new file mode 100644 > index 000..a6d032d694d > --- /dev/null > +++ b/gcc/testsuite/g++.dg/lto/pr101396_1.C > @@ -0,0 +1,10 @@ > +enum A : __UINT64_TYPE__ { // { dg-lto-note "6: an enum with > different value name is defined in another translation unit" } > + a, // { dg-lto-note "3: mismatching definition" } > + b, > + c > +}; > + > +int f(enum A x) > +{ > + return (int) x; > +} -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: PING^w: [PATCH] ipa-devirt: check precision mismatch of enum values [PR101396]
On Fri, 2021-07-30 at 15:00 +0800, Kewen.Lin wrote: > Hi Ruoyao, > > on 2021/7/30 下午12:57, Xi Ruoyao via Gcc-patches wrote: > > Ping again. > > > > This ping-ed patch has been approved by Richard at > > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576001.html > > Just chime in as I guess you didn't receive his mail somehow. Hi Kewen, I guessed I missed some mail because I didn't subscribe to gcc-patches. Thanks for the reminder!
[PATCH committed] ipa-devirt: check precision mismatch of enum values [PR101396]
On Fri, 2021-07-30 at 15:21 +0800, Xi Ruoyao via Gcc-patches wrote: > On Fri, 2021-07-30 at 15:00 +0800, Kewen.Lin wrote: > > Hi Ruoyao, > > > > on 2021/7/30 下午12:57, Xi Ruoyao via Gcc-patches wrote: > > > Ping again. > > > > > > > This ping-ed patch has been approved by Richard at > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576001.html > > > > Just chime in as I guess you didn't receive his mail somehow. > > Hi Kewen, > > I guessed I missed some mail because I didn't subscribe to gcc- > patches. > Thanks for the reminder! > Rebased and bootstrapped & regtested on x86_64-linux-gnu. Committed at 291416d3.
Re: PING^5: [PATCH] mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]
On Fri, 2021-07-30 at 09:04 +0100, Richard Sandiford wrote: > Xi Ruoyao writes: > > Ping again. > > Sorry that this has gone unreviewed for so long. I think in practice > the MIPS port is essentially unmaintained at this point -- it would > be great if someone would volunteer :-) A company working on MIPS has contacted me and said one of their employees may contact the SC and take the role of MIPS maintainer. Not sure their progress though. > It isn't really appropriate for me to review MIPS stuff given that I > work > for a company that has a competing architecture. I think Jeff > expressed > similar concerns given his new role. > That said, the patch looks clearly correct to me, so please go ahead > and apply (to trunk and GCC 11). Thanks for your patience. Thanks! It has been 5 weeks so it's better to rebase and bootstrap & test it again. I'll commit it if there is no regression. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: PING^5: [PATCH] mips: Fix up mips_atomic_assign_expand_fenv [PR94780]
On Fri, 2021-07-30 at 09:11 +0100, Richard Sandiford wrote: > Xi Ruoyao writes: > > Ping again. > > > > On Wed, 2021-06-23 at 11:11 +0800, Xi Ruoyao wrote: > > > Commit message shamelessly copied from 1777beb6b129 by jakub: > > > > > > This function, because it is sometimes called even outside of > > > function > > > bodies, uses create_tmp_var_raw rather than create_tmp_var. But > > > in > > > order > > > for that to work, when first referenced, the VAR_DECLs need to > > > appear > > > in a > > > TARGET_EXPR so that during gimplification the var gets the right > > > DECL_CONTEXT and is added to local decls. > > > > > > Bootstrapped & regtested on mips64el-linux-gnu. Ok for trunk and > > > backport > > > to 11, 10, and 9? > > OK for all, thanks. > > Similar comments to the previous message about the appropriateness > of me reviewing the patch, but like you say, this is doing for MIPS > what we've already had to do for other targets. Thanks for reviewing. Will bootstrap and test it again, and commit if there is no regressions. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
committed: [PATCH] mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]
On Fri, 2021-07-30 at 16:17 +0800, Xi Ruoyao via Gcc-patches wrote: > On Fri, 2021-07-30 at 09:04 +0100, Richard Sandiford wrote: > > Xi Ruoyao writes: > > > Ping again. > > > > Sorry that this has gone unreviewed for so long. I think in > > practice > > the MIPS port is essentially unmaintained at this point -- it would > > be great if someone would volunteer :-) > > A company working on MIPS has contacted me and said one of their > employees may contact the SC and take the role of MIPS maintainer. > Not > sure their progress though. > > > It isn't really appropriate for me to review MIPS stuff given that I > > work > > for a company that has a competing architecture. I think Jeff > > expressed > > similar concerns given his new role. > > > That said, the patch looks clearly correct to me, so please go ahead > > and apply (to trunk and GCC 11). Thanks for your patience. > > Thanks! > > It has been 5 weeks so it's better to rebase and bootstrap & test it > again. I'll commit it if there is no regression. Committed to master at 45cb789e and releases/gcc-11 at 2a47ee78. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
committed: [PATCH] mips: Fix up mips_atomic_assign_expand_fenv [PR94780]
On Fri, 2021-07-30 at 16:23 +0800, Xi Ruoyao via Gcc-patches wrote: > On Fri, 2021-07-30 at 09:11 +0100, Richard Sandiford wrote: > > Xi Ruoyao writes: > > > Ping again. > > > > > > On Wed, 2021-06-23 at 11:11 +0800, Xi Ruoyao wrote: > > > > Commit message shamelessly copied from 1777beb6b129 by jakub: > > > > > > > > This function, because it is sometimes called even outside of > > > > function > > > > bodies, uses create_tmp_var_raw rather than create_tmp_var. But > > > > in > > > > order > > > > for that to work, when first referenced, the VAR_DECLs need to > > > > appear > > > > in a > > > > TARGET_EXPR so that during gimplification the var gets the right > > > > DECL_CONTEXT and is added to local decls. > > > > > > > > Bootstrapped & regtested on mips64el-linux-gnu. Ok for trunk > > > > and > > > > backport > > > > to 11, 10, and 9? > > > > OK for all, thanks. > > > > Similar comments to the previous message about the appropriateness > > of me reviewing the patch, but like you say, this is doing for MIPS > > what we've already had to do for other targets. > > Thanks for reviewing. > > Will bootstrap and test it again, and commit if there is no > regressions. Committed to master at 20656544 and releases/gcc-11 at 7db1795a. Will do it for gcc-10 and gcc-9 tomorrow.
committed: [PATCH] mips: Fix up mips_atomic_assign_expand_fenv [PR94780]
On Sat, 2021-07-31 at 02:08 +0800, Xi Ruoyao via Gcc-patches wrote: > On Fri, 2021-07-30 at 16:23 +0800, Xi Ruoyao via Gcc-patches wrote: > > On Fri, 2021-07-30 at 09:11 +0100, Richard Sandiford wrote: > > > Xi Ruoyao writes: > > > > Ping again. > > > > > > > > On Wed, 2021-06-23 at 11:11 +0800, Xi Ruoyao wrote: > > > > > Commit message shamelessly copied from 1777beb6b129 by jakub: > > > > > > > > > > This function, because it is sometimes called even outside of > > > > > function > > > > > bodies, uses create_tmp_var_raw rather than create_tmp_var. > > > > > But > > > > > in > > > > > order > > > > > for that to work, when first referenced, the VAR_DECLs need to > > > > > appear > > > > > in a > > > > > TARGET_EXPR so that during gimplification the var gets the > > > > > right > > > > > DECL_CONTEXT and is added to local decls. > > > > > > > > > > Bootstrapped & regtested on mips64el-linux-gnu. Ok for trunk > > > > > and > > > > > backport > > > > > to 11, 10, and 9? > > > > > > OK for all, thanks. > > > > > > Similar comments to the previous message about the appropriateness > > > of me reviewing the patch, but like you say, this is doing for > > > MIPS > > > what we've already had to do for other targets. > > > > Thanks for reviewing. > > > > Will bootstrap and test it again, and commit if there is no > > regressions. > > Committed to master at 20656544 and releases/gcc-11 at 7db1795a. Commited to releases/gcc-10 at 613e4ebc and releases/gcc-9 at 79184d8c.
[PATCH] mips: msa: truncate immediate shift amount [PR101922]
When -mloongson-mmi is enabled, SHIFT_COUNT_TRUNCATED is turned off. This causes untruncated immediate shift amount outputed into the asm, and the GNU assembler refuses to assemble it. Truncate immediate shift amount when outputing the asm instruction to make GAS happy again. gcc/ PR target/101922 * config/mips/mips-protos.h (mips_msa_output_shift_immediate): Declare. * config/mips/mips.c (mips_msa_output_shift_immediate): New function. * config/mips/mips-msa.md (vashl3, vashr3, vlshr3): Call it. gcc/testsuite/ PR target/101922 * gcc.target/mips/pr101922.c: New test. --- gcc/config/mips/mips-msa.md | 27 gcc/config/mips/mips-protos.h| 1 + gcc/config/mips/mips.c | 21 ++ gcc/testsuite/gcc.target/mips/pr101922.c | 19 + 4 files changed, 59 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.target/mips/pr101922.c diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md index 3a67f25be56..d3b27d132ad 100644 --- a/gcc/config/mips/mips-msa.md +++ b/gcc/config/mips/mips-msa.md @@ -870,9 +870,12 @@ (define_insn "vlshr3" (match_operand:IMSA 1 "register_operand" "f,f") (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] "ISA_HAS_MSA" - "@ - srl.\t%w0,%w1,%w2 - srli.\t%w0,%w1,%E2" +{ + if (which_alternative == 0) +return "srl.\t%w0,%w1,%w2"; + + return mips_msa_output_shift_immediate("srli.\t%w0,%w1,%E2", operands); +} [(set_attr "type" "simd_shift") (set_attr "mode" "")]) @@ -882,9 +885,12 @@ (define_insn "vashr3" (match_operand:IMSA 1 "register_operand" "f,f") (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] "ISA_HAS_MSA" - "@ - sra.\t%w0,%w1,%w2 - srai.\t%w0,%w1,%E2" +{ + if (which_alternative == 0) +return "sra.\t%w0,%w1,%w2"; + + return mips_msa_output_shift_immediate("srai.\t%w0,%w1,%E2", operands); +} [(set_attr "type" "simd_shift") (set_attr "mode" "")]) @@ -894,9 +900,12 @@ (define_insn "vashl3" (match_operand:IMSA 1 "register_operand" "f,f") (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))] "ISA_HAS_MSA" - "@ - sll.\t%w0,%w1,%w2 - slli.\t%w0,%w1,%E2" +{ + if (which_alternative == 0) +return "sll.\t%w0,%w1,%w2"; + + return mips_msa_output_shift_immediate("slli.\t%w0,%w1,%E2", operands); +} [(set_attr "type" "simd_shift") (set_attr "mode" "")]) diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h index a5e4151b9e6..8d97eb36125 100644 --- a/gcc/config/mips/mips-protos.h +++ b/gcc/config/mips/mips-protos.h @@ -317,6 +317,7 @@ extern const char *mips_output_sync_loop (rtx_insn *, rtx *); extern unsigned int mips_sync_loop_insns (rtx_insn *, rtx *); extern const char *mips_output_division (const char *, rtx *); extern const char *mips_msa_output_division (const char *, rtx *); +extern const char *mips_msa_output_shift_immediate (const char *, rtx *); extern const char *mips_output_probe_stack_range (rtx, rtx); extern bool mips_hard_regno_rename_ok (unsigned int, unsigned int); extern bool mips_linked_madd_p (rtx_insn *, rtx_insn *); diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 89d1be6cea6..3d5be369b1c 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -14495,6 +14495,27 @@ mips_msa_output_division (const char *division, rtx *operands) } return s; } + +/* Return the assembly code for MSA immediate shift instructions, + which has the operands given by OPERANDS. Truncate the shift amount + to make GAS happy. */ + +const char * +mips_msa_output_shift_immediate (const char *shift, rtx *operands) +{ + rtx amount = operands[2]; + machine_mode mode = amount->mode; + + unsigned val = UINTVAL (CONST_VECTOR_ELT (amount, 0)); + val &= GET_MODE_UNIT_BITSIZE (mode) - 1; + if (!val) +return ""; + + rtx c = gen_int_mode (val, GET_MODE_INNER (mode)); + operands[2] = gen_const_vec_duplicate (mode, c); + + return shift; +} /* Return true if destination of IN_INSN is used as add source in OUT_INSN. Both IN_INSN and OUT_INSN are of type fmadd. Example: diff --git a/gcc/testsuite/gcc.target/mips/pr101922.c b/gcc/testsuite/gcc.target/mips/pr101922.c new file mode 100644 index 000..00a6e495ba2 --- /dev/null +++ b/gcc/testsuite/gcc.target/mips/pr101922.c @@ -0,0 +1,19 @@ +/* PR target/101922 + This was triggering an assembler error with -O3 -mmsa -mloongson-mmi. */ + +/* { dg-do assemble } */ +/* { dg-options "-mmsa -mloongson-mmi" } */ + +typedef __INT8_TYPE__ i8; +typedef __INT32_TYPE__ i32; + +i8 d[16]; + +i32 f(i32 x) { + int i; + for (i = 0; i < 16; i++) { +i32 t = (i32) d[i] >> 31; +x &= t; + } + return x; +} -- 2.33.0
Re: [PATCH] mips: msa: truncate immediate shift amount [PR101922]
On Sat, 2021-08-21 at 01:07 +0800, Xi Ruoyao via Gcc-patches wrote: > When -mloongson-mmi is enabled, SHIFT_COUNT_TRUNCATED is turned off. > This causes untruncated immediate shift amount outputed into the asm, > and the GNU assembler refuses to assemble it. > > Truncate immediate shift amount when outputing the asm instruction to > make GAS happy again. > > gcc/ > > PR target/101922 > * config/mips/mips-protos.h (mips_msa_output_shift_immediate): > Declare. > * config/mips/mips.c (mips_msa_output_shift_immediate): New > function. > * config/mips/mips-msa.md (vashl3, vashr3, > vlshr3): Call it. > > gcc/testsuite/ > > PR target/101922 > * gcc.target/mips/pr101922.c: New test. Forgot to mention: tested on mips64el-linux-gnu, OK for trunk?
Re: [PATCH] mips: msa: truncate immediate shift amount [PR101922]
On Sun, 2021-08-22 at 19:21 -0600, Jeff Law wrote: > > > On 8/20/2021 11:07 AM, Xi Ruoyao via Gcc-patches wrote: > > When -mloongson-mmi is enabled, SHIFT_COUNT_TRUNCATED is turned off. > > This causes untruncated immediate shift amount outputed into the > > asm, > > and the GNU assembler refuses to assemble it. > > > > Truncate immediate shift amount when outputing the asm instruction > > to > > make GAS happy again. > > > > gcc/ > > > > PR target/101922 > > * config/mips/mips-protos.h > > (mips_msa_output_shift_immediate): > > Declare. > > * config/mips/mips.c (mips_msa_output_shift_immediate): New > > function. > > * config/mips/mips-msa.md (vashl3, vashr3, > > vlshr3): Call it. > > > > gcc/testsuite/ > > > > PR target/101922 > > * gcc.target/mips/pr101922.c: New test. > OK. Committed @ f93f0868919. > Q. Looking out further, is it going to continue to make sense to have > more sense to have a distinct loongson port? The latest Loongson processors (branded Loongson 3A5000 for desktop, 3B5000 for workstation and server, and 3C5000L for server) have moved away from MIPS to a new RISC architecture named "LoongArch". Its design learnt some traits from MIPSr6 and RISC-V I think, but it's not a simple MIPS variant and will need a new port for GCC. A manual of LoongArch basic instructions is at https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html. LoongArch also have 128 and 256 bit vector insturctions, but the manual is not published yet. A team from Loongson is working on the port, the (experimental) source code is available at https://github.com/loongson/gcc/commits/loongarch-12. It's not ready for upstream reviewing yet. For "legacy" Loongson processors using MIPS, I suggest to keep the support as a MIPS extension. I'll try to keep it in an "usable" state (i. e. fix, or at least workaround ICE and bad assemble code like this). If one day we can't maintain it anymore we'd have to sadly deprecate and remove Loongson MMI support. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] MIPS: use N64 ABI by default if the triple end with -gnuabi64
On Thu, 2021-08-26 at 23:56 -0400, YunQiang Su wrote: > gcc/ChangeLog: > > PR target/102089 > * config.gcc: MIPS: use N64 ABI by default if the triple end > with -gnuabi64, which is used by Debian since 2013. > --- > gcc/config.gcc | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/gcc/config.gcc b/gcc/config.gcc > index 0ff5cac15..0c91be6f3 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -2553,16 +2553,30 @@ mips*-*-linux*) # Linux MIPS, > either endian. > target_cpu_default=MASK_SOFT_FLOAT_ABI > enable_mips_multilibs="yes" > ;; > + mipsisa64r6*-*-linux-gnuabi64) > + default_mips_abi=64 > + default_mips_arch=mips64r6 > + enable_mips_multilibs="yes" > + ;; > mipsisa64r6*-*-linux*) > default_mips_abi=n32 > default_mips_arch=mips64r6 > enable_mips_multilibs="yes" > ;; > + mipsisa64r2*-*-linux-gnuabi64) > + default_mips_abi=64 > + default_mips_arch=mips64r2 > + enable_mips_multilibs="yes" > + ;; > mipsisa64r2*-*-linux*) > default_mips_abi=n32 > default_mips_arch=mips64r2 > enable_mips_multilibs="yes" > ;; > + mips64*-*-linux-gnuabi64 | mipsisa64*-*-linux-gnuabi64) > + default_mips_abi=64 > + enable_mips_multilibs="yes" > + ;; > mips64*-*-linux* | mipsisa64*-*-linux*) > default_mips_abi=n32 > enable_mips_multilibs="yes" LGTM, but I don't have the privilege to approve any change so you'll still need to wait for approval :) And I think the behavior change should be announced in https://gcc.gnu.org/gcc-12/changes.html, if it's approved.
Re: [PATCH v2] [MIPS]: add .module mipsREV to all output asm file
On Fri, 2021-08-27 at 13:11 +0800, YunQiang Su wrote: > 在 2021/6/18 11:29, YunQiang Su 写道: > > Currently, the asm output file for MIPS has no rev info. > > It can make some trouble, for example: > > assembler is mips1 by default, > > gcc is fpxx by default. > > To assemble the output of gcc -S, we have to pass -mips2 > > to assembler. > > > > gcc/ChangeLog: > > > > * gcc/config/mips/mips.c (mips_module_isa_name): New. > > mips_file_start: add .module mipsREV to all asm output > > ping for this patch. > > > --- > > gcc/config/mips/mips.c | 37 + > > 1 file changed, 37 insertions(+) > > > > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c > > index 1f1475cf400..51cc70e6ceb 100644 > > --- a/gcc/config/mips/mips.c > > +++ b/gcc/config/mips/mips.c > > @@ -9877,6 +9877,40 @@ mips_mdebug_abi_name (void) > > } > > } > > > > +static const char * > > +mips_module_isa_name() GNU style: there should be a space before (). > > +{ I think it's better to add enum values like MIPS_ISA_MIPS64R2 and use a switch statement here? switch (mips_isa) { case MIPS_ISA_MIPS1: return "mips1"; // ... } It looks better, and (maybe) generates better code. Just my 2 cents though. > > + if (ISA_MIPS1) > > + return "mips1"; > > + else if (ISA_MIPS2) > > + return "mips2"; > > + else if (ISA_MIPS3) > > + return "mips3"; > > + else if (ISA_MIPS4) > > + return "mips4"; > > + else if (ISA_MIPS32) > > + return "mips32"; > > + else if (ISA_MIPS32R2) > > + return "mips32r2"; > > + else if (ISA_MIPS32R3) > > + return "mips32r3"; > > + else if (ISA_MIPS32R5) > > + return "mips32r5"; > > + else if (ISA_MIPS32R6) > > + return "mips32r6"; > > + else if (ISA_MIPS64) > > + return "mips64"; > > + else if (ISA_MIPS64R2) > > + return "mips64r2"; > > + else if (ISA_MIPS64R3) > > + return "mips64r3"; > > + else if (ISA_MIPS64R5) > > + return "mips64r5"; > > + else if (ISA_MIPS64R6) > > + return "mips64r6"; > > + gcc_unreachable (); > > +} > > + > > /* Implement TARGET_ASM_FILE_START. */ > > > > static void > > @@ -9908,6 +9942,9 @@ mips_file_start (void) > > fprintf (asm_out_file, "\t.nan\t%s\n", > > mips_nan == MIPS_IEEE_754_2008 ? "2008" : "legacy"); > > > > + fprintf (asm_out_file, "\t.module\t%s\n", > > + mips_module_isa_name ()); > > + > > #ifdef HAVE_AS_DOT_MODULE > > /* Record the FP ABI. See below for comments. */ > > if (TARGET_NO_FLOAT) > > >
Re: [PATCH v2] [MIPS]: add .module mipsREV to all output asm file
On Fri, 2021-08-27 at 15:36 -0600, Jeff Law wrote: > It's easier when someone has to debug the code later. > enums show up in debug output by default, while #defines do not. > > > > switch (mips_isa) > > { > > case MIPS_ISA_MIPS1: return "mips1"; > > // ... > > } > > > > It looks better, and (maybe) generates better code. Just my 2 cents > > though. > Coding standards would have that as > > switch (mips_isa) > { > case MIPS_ISA_MIPS_1: > return "mips1"; > ... > } There is some existing code using "case ... : return ..." in one line in mips.c, so I thought it was standard :(. > Presumably .module is supported by all reasonably modern versions of > GAS? It's added by the commit in binutils-gdb: > commit 919731affbef19fcad8dddb0a595bb05755cb345 > Author: mfortune > Date: Tue May 20 13:28:20 2014 +0100 > > Add MIPS .module directive > So it should be supported since binutils-2.25. If we want to support old binutils we'll need something like "-fno-mips- module-directive" and "--without-mips-module-directive". My suggestion is just bumping the binutils requirement for mips*-*-*.
Committed: [PATCH] MIPS: use N64 ABI by default if the triple end with -gnuabi64
On Fri, 2021-08-27 at 15:38 -0600, Jeff Law wrote: > > > On 8/26/2021 10:20 PM, Xi Ruoyao via Gcc-patches wrote: > > On Thu, 2021-08-26 at 23:56 -0400, YunQiang Su wrote: > > > gcc/ChangeLog: > > > > > > PR target/102089 > > > * config.gcc: MIPS: use N64 ABI by default if the triple > > > end > > > with -gnuabi64, which is used by Debian since 2013. > > > --- > > > gcc/config.gcc | 14 ++ > > > 1 file changed, 14 insertions(+) > > > > > > diff --git a/gcc/config.gcc b/gcc/config.gcc > > > index 0ff5cac15..0c91be6f3 100644 > > > --- a/gcc/config.gcc > > > +++ b/gcc/config.gcc > > > @@ -2553,16 +2553,30 @@ mips*-*-linux*) # > > > Linux MIPS, either endian. > > > target_cpu_default=MASK_SOFT_FLOAT_ABI > > > enable_mips_multilibs="yes" > > > ;; > > > + mipsisa64r6*-*-linux-gnuabi64) > > > + default_mips_abi=64 > > > + default_mips_arch=mips64r6 > > > + enable_mips_multilibs="yes" > > > + ;; > > > mipsisa64r6*-*-linux*) > > > default_mips_abi=n32 > > > default_mips_arch=mips64r6 > > > enable_mips_multilibs="yes" > > > ;; > > > + mipsisa64r2*-*-linux-gnuabi64) > > > + default_mips_abi=64 > > > + default_mips_arch=mips64r2 > > > + enable_mips_multilibs="yes" > > > + ;; > > > mipsisa64r2*-*-linux*) > > > default_mips_abi=n32 > > > default_mips_arch=mips64r2 > > > enable_mips_multilibs="yes" > > > ;; > > > + mips64*-*-linux-gnuabi64 | mipsisa64*-*-linux- > > > gnuabi64) > > > + default_mips_abi=64 > > > + enable_mips_multilibs="yes" > > > + ;; > > > mips64*-*-linux* | mipsisa64*-*-linux*) > > > default_mips_abi=n32 > > > enable_mips_multilibs="yes" > > LGTM, but I don't have the privilege to approve any change so you'll > > still need to wait for approval :) > Yea, but having a second pair of eyes chime in is definitely helpful. > > > > > And I think the behavior change should be announced in > > https://gcc.gnu.org/gcc-12/changes.html, if it's approved. > Agreed. > > The configure change is approved. Pushed to trunk as 91f78b67. Jeff: YunQiang is working on compilers for CIP United, a company producing MIPS IPs, chips, and toolchains. Is it possible to grant him a write-after-approval? > And a change to the gcc-12 changes.html is pre-approved.
Re: [PATCH] libffi: Fix MIPS r6 support
On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote: > > > On 8/26/2021 10:58 PM, YunQiang Su wrote: > > for some instructions, MIPS r6 uses different encoding other than > > the previous releases. > > > > 1. mips/n32.S disable .set mips4: since it casuses old insn encoding > > is used. > > https://github.com/libffi/libffi/pull/396 > > 2. mips/ffi.c: the encoding for JR is hardcoded: we need to use > > different value for r6 and pre-r6. > > https://github.com/libffi/libffi/pull/401 > > > > libffi/ > > PR libffi/83636 > > * src/mips/n32.S: disable .set mips4 > > * src/mips/ffi.c: use different JR encoding for r6. > These should go to the upstream libffi project. Once accepted there > you > can add them to GCC. Hi Jeff, The two PRs are already merged, and released since libffi-3.3.0 (now the upstream latest release is 3.4.2). I don't have a MIPSr6 so I can't test though. YunQiang: the commit message should mention the commit SHA in upstream libffi repo's main branch, instead of a URL to PR. You can use "git log" in gcc repo and search for commits for libffi and learn from it. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 1/3] md/define_c_enum: support value assignation
On Sat, 2021-08-28 at 08:05 -0400, YunQiang Su wrote: > Currently, the enums from define_c_enum and define_enum can only > has values one by one from 0. > > In fact we can support the behaviour just like C, aka like > (define_enum "mips_isa" [mips1=1 mips2 mips32=32 mips32r2]), > then we can get > enum mips_isa { > MIPS_ISA_MIPS1 = 1, > MIPS_ISA_MIPS2 = 2, > MIPS_ISA_MIPS32 = 32, > MIPS_ISA_MIPS32R2 = 33 > }; IMO we can just define the enum in mips.h. In a private communication YunQiang has explained he doesn't want to introduce a new large enum at the top of mips.h though. How do others think? > gcc/ChangeLog: > * read-md.c (md_reader::handle_enum): support value > assignation. > * doc/md.texi: record define_c_enum value assignation support. > --- > gcc/doc/md.texi | 4 > gcc/read-md.c | 15 +-- > 2 files changed, 17 insertions(+), 2 deletions(-) > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index f8047aefc..1c1282c4c 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -11074,6 +11074,8 @@ The syntax is as follows: > (define_c_enum "@var{name}" [ > @var{value0} > @var{value1} > + @var{value32}=32 > + @var{value33} > @dots{} > @var{valuen} > ]) > @@ -11086,6 +11088,8 @@ in @file{insn-constants.h}: > enum @var{name} @{ > @var{value0} = 0, > @var{value1} = 1, > + @var{value32} = 32, > + @var{value33} = 33, > @dots{} > @var{valuen} = @var{n} > @}; > diff --git a/gcc/read-md.c b/gcc/read-md.c > index bb419e0f6..43dfbe264 100644 > --- a/gcc/read-md.c > +++ b/gcc/read-md.c > @@ -901,7 +901,8 @@ md_decimal_string (int number) > void > md_reader::handle_enum (file_location loc, bool md_p) > { > - char *enum_name, *value_name; > + char *enum_name, *value_name, *token; > + unsigned int cur_value; > struct md_name name; > struct enum_type *def; > struct enum_value *ev; > @@ -928,6 +929,7 @@ md_reader::handle_enum (file_location loc, bool > md_p) > *slot = def; > } > > + cur_value = def->num_values; > require_char_ws ('['); > > while ((c = read_skip_spaces ()) != ']') > @@ -945,20 +947,29 @@ md_reader::handle_enum (file_location loc, bool > md_p) > if (md_p) > { > value_name = concat (def->name, "_", name.string, NULL); > + value_name = strtok (value_name, "="); > + token = strtok (NULL, "="); > + if (token) > + cur_value = atoi (token); > upcase_string (value_name); > ev->name = xstrdup (name.string); > } > else > { > value_name = xstrdup (name.string); > + value_name = strtok (value_name, "="); > + token = strtok (NULL, "="); > + if (token) > + cur_value = atoi (token); > ev->name = value_name; > } > ev->def = add_constant (get_md_constants (), value_name, > - md_decimal_string (def->num_values), > def); > + md_decimal_string (cur_value), def); > > *def->tail_ptr = ev; > def->tail_ptr = &ev->next; > def->num_values++; > + cur_value++; > } > } > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH 2/3] MIPS: use mips_isa enum instead hardcoded numbers
BLE_PCREL) > > /* Generic ISA defines. */ > -#define ISA_MIPS1 (mips_isa == 1) > -#define ISA_MIPS2 (mips_isa == 2) > -#define ISA_MIPS3 (mips_isa == 3) > -#define ISA_MIPS4 (mips_isa == 4) > -#define ISA_MIPS32 (mips_isa == 32) > -#define ISA_MIPS32R2 (mips_isa == 33) > -#define ISA_MIPS32R3 (mips_isa == 34) > -#define ISA_MIPS32R5 (mips_isa == 36) > -#define ISA_MIPS32R6 (mips_isa == 37) > -#define ISA_MIPS64 (mips_isa == 64) > -#define ISA_MIPS64R2 (mips_isa == 65) > -#define ISA_MIPS64R3 (mips_isa == 66) > -#define ISA_MIPS64R5 (mips_isa == 68) > -#define ISA_MIPS64R6 (mips_isa == 69) > +#define ISA_MIPS1 (mips_isa == MIPS_ISA_MIPS1) > +#define ISA_MIPS2 (mips_isa == MIPS_ISA_MIPS2) > +#define ISA_MIPS3 (mips_isa == MIPS_ISA_MIPS3) > +#define ISA_MIPS4 (mips_isa == MIPS_ISA_MIPS4) > +#define ISA_MIPS32 (mips_isa == MIPS_ISA_MIPS32) > +#define ISA_MIPS32R2 (mips_isa == MIPS_ISA_MIPS32R2) > +#define ISA_MIPS32R3 (mips_isa == MIPS_ISA_MIPS32R3) > +#define ISA_MIPS32R5 (mips_isa == MIPS_ISA_MIPS32R5) > +#define ISA_MIPS32R6 (mips_isa == MIPS_ISA_MIPS32R6) > +#define ISA_MIPS64 (mips_isa == MIPS_ISA_MIPS64) > +#define ISA_MIPS64R2 (mips_isa == MIPS_ISA_MIPS64R2) > +#define ISA_MIPS64R3 (mips_isa == MIPS_ISA_MIPS64R3) > +#define ISA_MIPS64R5 (mips_isa == MIPS_ISA_MIPS64R5) > +#define ISA_MIPS64R6 (mips_isa == MIPS_ISA_MIPS64R6) > > /* Architecture target defines. */ > #define TARGET_LOONGSON_2E (mips_arch == > PROCESSOR_LOONGSON_2E) > @@ -708,25 +708,25 @@ struct mips_cpu_info { > #endif > > #ifndef MULTILIB_ISA_DEFAULT > -#if MIPS_ISA_DEFAULT == 1 > +#if MIPS_ISA_DEFAULT == MIPS_ISA_MIPS1 > #define MULTILIB_ISA_DEFAULT "mips1" > -#elif MIPS_ISA_DEFAULT == 2 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS2 > #define MULTILIB_ISA_DEFAULT "mips2" > -#elif MIPS_ISA_DEFAULT == 3 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS3 > #define MULTILIB_ISA_DEFAULT "mips3" > -#elif MIPS_ISA_DEFAULT == 4 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS4 > #define MULTILIB_ISA_DEFAULT "mips4" > -#elif MIPS_ISA_DEFAULT == 32 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS32 > #define MULTILIB_ISA_DEFAULT "mips32" > -#elif MIPS_ISA_DEFAULT == 33 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS32R2 > #define MULTILIB_ISA_DEFAULT "mips32r2" > -#elif MIPS_ISA_DEFAULT == 37 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS32R6 > #define MULTILIB_ISA_DEFAULT "mips32r6" > -#elif MIPS_ISA_DEFAULT == 64 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS64 > #define MULTILIB_ISA_DEFAULT "mips64" > -#elif MIPS_ISA_DEFAULT == 65 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS64R2 > #define MULTILIB_ISA_DEFAULT "mips64r2" > -#elif MIPS_ISA_DEFAULT == 69 > +#elif MIPS_ISA_DEFAULT == MIPS_ISA_MIPS64R6 > #define MULTILIB_ISA_DEFAULT "mips64r6" > #else > #define MULTILIB_ISA_DEFAULT "mips1" > diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md > index 455b9b802..f35e50ced 100644 > --- a/gcc/config/mips/mips.md > +++ b/gcc/config/mips/mips.md > @@ -21,6 +21,23 @@ > ;; along with GCC; see the file COPYING3. If not see > ;; <http://www.gnu.org/licenses/>. > > +(define_enum "mips_isa" [ > + mips1=1 > + mips2 > + mips3 > + mips4 > + mips32=32 > + mips32r2 > + mips32r3 > + mips32r5=36 > + mips32r6 > + mips64=64 > + mips64r2 > + mips64r3 > + mips64r5=68 > + mips64r6 > +]) > + > (define_enum "processor" [ > r3000 > 4kc -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2 1/2] MIPS: use mips_isa enum instead hardcoded numbers
These two patches look good to me. Still, need a maintainer's approval. On Sun, 2021-08-29 at 22:59 -0400, YunQiang Su wrote: > Currently mips-cpu.defs, mips.h, netbsd.h and config.gcc are > using hardcoded numbers for isa level. > > Let's replace them with more readable enum mips_isa. > > gcc/ChangeLog: > * config/mips/mips.h (struct mips_cpu_info): define enum mips_isa; > use enum instead of int for 'isa' member. > * config/config.gcc, config/mips/mips{.h,-cpus.def}, > config/mips/netbsd.h: replace hardcoded numbers with enum. > --- > gcc/config.gcc | 62 - > gcc/config/mips/mips-cpus.def | 228 +- > gcc/config/mips/mips.c | 5 +- > gcc/config/mips/mips.h | 84 - > gcc/config/mips/netbsd.h | 5 +- > 5 files changed, 203 insertions(+), 181 deletions(-)
Re: [PATCH] match.pd: Fix x * 0.0 -> 0.0 folding [PR104389]
On Sat, 2022-02-05 at 08:21 +0100, Richard Biener via Gcc-patches wrote: > > > > Am 05.02.2022 um 00:08 schrieb Jakub Jelinek via Gcc-patches > > : > > > > Hi! > > > > The recent PR95115 change to punt in const_binop on folding operation > > with non-NaN operands into NaN if flag_trapping_math broke the following > > testcase, because the x * 0.0 simplification punts just if > > x maybe a NaN (because NaN * 0.0 is NaN not 0.0) or if one of the operands > > could be negative zero. But Inf * 0.0 or -Inf * 0.0 is also NaN, not > > 0.0, so when NaNs are honored we need to punt for possible infinities too. Sorry for the trouble, but some warning here: even with this patch applied, Python would still need to replace inf * 0.0 with nan("") or something. Now with folding for inf * 0.0 disabled, the multiplication will be evaluated at runtime and raise FE_INVALID, which is likely unwanted by Python. However, raising FE_INVALID is the correct behavior no matter if we like or dislike it... > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk > > and 11/10 where the PR95115 change has been unfortunately backported to > > as well? As there may be many packages (unfortunately) using the nasty "inf * 0.0" pattern to get NaN, maybe we should just revert the PR95115 change for 11/10, apply PR104389 patch for trunk, and announce the behavior change in gcc-12/changes.html? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v7 08/12] LoongArch Port: libgcc
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote: > + sc = &rt_->uc.uc_mcontext; Get a warning: In file included from ../../../libgcc/unwind-dw2.c:412: ./md-unwind-support.h: In function ‘loongarch_fallback_frame_state’: ./md-unwind-support.h:55:10: warning: assignment to ‘struct sigcontext *’ from incompatible pointer type ‘mcontext_t *’ [-Wincompatible-pointer-types] 55 | sc = &rt_->uc.uc_mcontext; | ^ Maybe we should just add a cast here like `(struct sigcontext *) &rt_->uc.uc_mcontext` ? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH] libgcc: fix a warning calling find_fde_tail
Bootstrapped on x86_64-linux-gnu. OK for master? The third parameter of find_fde_tail is an _Unwind_Ptr (which is an integer type instead of a pointer), but we are passing NULL to it. This causes a -Wint-conversion warning. libgcc/ * unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Call find_fde_tail with 0 instead of NULL. --- libgcc/unwind-dw2-fde-dip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c index 3d6f39f5460..7f9be5e6b02 100644 --- a/libgcc/unwind-dw2-fde-dip.c +++ b/libgcc/unwind-dw2-fde-dip.c @@ -514,7 +514,7 @@ _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases *bases) # if DLFO_STRUCT_HAS_EH_DBASE (_Unwind_Ptr) dlfo.dlfo_eh_dbase, # else - NULL, + 0, # endif bases); else -- 2.35.1
Re: [PATCH] libgcc: fix a warning calling find_fde_tail
On Thu, 2022-02-24 at 19:53 +0100, Jakub Jelinek wrote: > On Fri, Feb 25, 2022 at 02:35:07AM +0800, Xi Ruoyao via Gcc-patches > wrote: > > Bootstrapped on x86_64-linux-gnu. OK for master? > > > > The third parameter of find_fde_tail is an _Unwind_Ptr (which is an > > integer type instead of a pointer), but we are passing NULL to it. > > This > > causes a -Wint-conversion warning. > > > > libgcc/ > > > > * unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Call > > find_fde_tail > > with 0 instead of NULL. > > Ok (except that the second ChangeLog entry line should be indented > just > with a tab, not any spaces after the tab). Pushed as r12-7375, with the ChangeLog corrected.
Re: [PATCH v7 02/12] LoongArch Port: gcc build
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote: > +mstrict-align > +Target Var(TARGET_STRICT_ALIGN) Init(0) > +Do not generate unaligned memory accesses. Any update on the rational to make -mno-strict-align the default? Note that I'm not against this decision: I'm really not a fan of the "dinosaur" or "teaching tool" uarchs with no unaligned access support :). But you should really document this somewhere, for e. g. updating your arch spec, or cliam "any OS on LoongArch should emulate unaligned access if it's not supported by hardware". Maybe this is a little OT though. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v7 11/12] LoongArch Port: gcc/testsuite
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote: > From: chenglulu > > 2022-02-12 Chenghua Xu > Lulu Cheng > > gcc/testsuite/ spec-barrier tests fail with: ./testsuite/c-c++-common/spec-barrier-1.c:21:3: warning: this target does not define a speculation barrier; your program will still execute correctly, but incorrect speculation may not be restricted I'd seen some news saying your uarch has in-silicon defense for speculation related vulnerabilities. If this is true you can just make __builtin_speculation_safe_value a nop. Quote from gcc internal doc: > If this pattern is not defined then the default expansion of > '__builtin_speculation_safe_value' will emit a warning. You can > suppress this warning by defining this pattern with a final > condition of '0' (zero), which tells the compiler that a > speculation barrier is not needed for this target. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v7 05/12] LoongArch Port: Machine description C files and .h files.
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote: > + /* Clean up the vars set above. Note that final_end_function resets > + the global pointer for us. */ We don't have a global pointer. Let's kill this MIPS remenant :). > + reload_completed = 0; -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] libatomic: Improve 16-byte atomics on Intel AVX [PR104688]
On Mon, 2022-02-28 at 07:06 +0100, Jakub Jelinek via Gcc-patches wrote: > +++ libatomic/Makefile.am 2022-02-25 17:25:16.298314196 +0100 > @@ -138,8 +138,9 @@ IFUNC_OPTIONS = -march=i586 > libatomic_la_LIBADD += $(addsuffix _8_1_.lo,$(SIZEOBJS)) > endif > if ARCH_X86_64 > -IFUNC_OPTIONS = -mcx16 > -libatomic_la_LIBADD += $(addsuffix _16_1_.lo,$(SIZEOBJS)) > +IFUNC_OPTIONS = -mcx16 -mcx16 The duplication of "-mcx16" is unintentional, I guess? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v8 00/12] Add LoongArch support.
On Fri, 2022-03-04 at 15:17 +0800, xucheng...@loongson.cn wrote: > The binutils has been merged into trunk: > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=560b3fe208255ae909b4b1c88ba9c28b09043307 > > Note: We split -mabi= into -mabi=lp64d/f/s, the new options not support by > upstream binutils yet, > this GCC port requires the following patch applied to binutils to build. > https://github.com/loongson/binutils-gdb/commit/aacb0bf860f02aa5a7dcb76dd0e392bf871c7586 > (will be submitted to upstream after gcc side comfirmed) I think you don't need a review for binutils change here. You should get it reviewed and applied in binutils-gdb ASAP. Then in install.texi you would add a note like "loongarch64-*-* requires binutils >= 2.39" in "Target specific installation notes", as an unpatched 2.38 does not work. And based on the history of RISC-V port (https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html) the process for a new port seems: 1. Get a permission from the Steering Committee. 2. Add one or two port maintainers into MAINTAINERS file. 3. Now the technical reviewing of the patch series just begin. I'm not an expert in software engineering (or social interaction :) and I don't know if the process has been changed in these years. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v8 00/12] Add LoongArch support.
On Sat, 2022-03-05 at 09:36 +0800, Paul Hua wrote: > > > > And based on the history of RISC-V port > > (https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html) the > > process > > for a new port seems: > > > > 1. Get a permission from the Steering Committee. > > 2. Add one or two port maintainers into MAINTAINERS file. > > 3. Now the technical reviewing of the patch series just begin. > > > > Hi Ruoyao, > Thanks for your advice. But I don't know how to contact the GCC > Steering Committee. https://gcc.gnu.org/steering.html It says the best place to reach the SC members is the list g...@gcc.gnu.org. And Joseph is also a SC member. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v8 04/12] LoongArch Port: Machine description files.
On Fri, 2022-03-04 at 15:18 +0800, xucheng...@loongson.cn wrote: > * config/loongarch/loongarch.md: New file. An ICE happens building OpenSSH-8.9p1. Investigation shows it's caused by the flag "-fzero-call-used-regs=". It's because the compiler attempts to clear FCCx registers but can't figure out how. This flag also triggers ICE for other targets (for example, PR 104820 for MIPS), and the related tests (zero-scratch-regs-{8,9,10,11}.c) are marked dg-skip for many targets. But it's unfortunate that packages like OpenSSH have already start to use this flag... I guess they just enabled it once they saw it was working for i386 :(. So it's better to solve the problem for a new target. A "quick fix" is adding an insn to clear FCCx. This is enough to build OpenSSH and make zero-scratch-regs-{8,9,10,11}.c PASS. diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index a9a8bc4b038..76c5ded9fe4 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -2020,6 +2020,12 @@ DONE; }) +;; Clear one FCC register +(define_insn "movfcc" [(set (match_operand:FCC 0 "register_operand" "=z") + (const_int 0))] + "" + "movgr2cf\t%0,$r0") + ;; Conditional move instructions. (define_insn "*sel_using_" -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
Bootstrapped and regtested on mips64el-linux-gnuabi64. I'm not sure if it's "correct" to clobber other registers during the zeroing of scratch registers. But I can't really come up with a better idea: on MIPS there is no simple way to clear one bit in FCSR (i. e. FCC[x]). We can't just use "c.f.s $fccx,$f0,$f0" because it will raise an exception if $f0 contains a sNaN. -- This fixes the ICEs using -fzero-call-used-regs=all for MIPS target. OpenSSH-8.9p1 has started to enable this by default, giving us a reason to fix -fzero-call-used-regs for more targets. gcc/ PR target/104817 PR target/104820 * config/mips/mips.cc (mips_zero_call_used_regs): New function. (TARGET_ZERO_CALL_USED_REGS): Define. * config/mips/mips.md (mips_zero_fcc): New insn. gcc/testsuite * c-c++-common/zero-scratch-regs-8.c: Enable for MIPS. * c-c++-common/zero-scratch-regs-9.c: Likewise. * c-c++-common/zero-scratch-regs-10.c: Likewise. * c-c++-common/zero-scratch-regs-11.c: Likewise. --- gcc/config/mips/mips.cc | 47 +++ gcc/config/mips/mips.md | 6 +++ .../c-c++-common/zero-scratch-regs-10.c | 2 +- .../c-c++-common/zero-scratch-regs-11.c | 2 +- .../c-c++-common/zero-scratch-regs-8.c| 2 +- .../c-c++-common/zero-scratch-regs-9.c| 2 +- 6 files changed, 57 insertions(+), 4 deletions(-) diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc index 4f9683e8bf4..f0d6c8f564c 100644 --- a/gcc/config/mips/mips.cc +++ b/gcc/config/mips/mips.cc @@ -22611,6 +22611,51 @@ mips_asm_file_end (void) if (NEED_INDICATE_EXEC_STACK) file_end_indicate_exec_stack (); } + +static HARD_REG_SET +mips_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs) +{ + HARD_REG_SET zeroed_hardregs; + CLEAR_HARD_REG_SET (zeroed_hardregs); + + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, HI_REGNUM)) +{ + /* Clear HI and LO altogether. MIPS target treats HILO as a +double-word register. */ + machine_mode dword_mode = TARGET_64BIT ? TImode : DImode; + rtx hilo = gen_rtx_REG (dword_mode, MD_REG_FIRST); + rtx zero = CONST0_RTX (dword_mode); + emit_move_insn (hilo, zero); + + SET_HARD_REG_BIT (zeroed_hardregs, HI_REGNUM); + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, LO_REGNUM)) + SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM); + else + emit_clobber (gen_rtx_REG (word_mode, LO_REGNUM)); +} + + bool zero_fcc = false; + for (int i = ST_REG_FIRST; i <= ST_REG_LAST; i++) +if (TEST_HARD_REG_BIT (need_zeroed_hardregs, i)) + zero_fcc = true; + + /* MIPS does not have a simple way to clear one bit in FCC. We just + clear FCC with ctc1 and clobber all FCC bits. */ + if (zero_fcc) +{ + emit_insn (gen_mips_zero_fcc ()); + for (int i = ST_REG_FIRST; i <= ST_REG_LAST; i++) + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, i)) + SET_HARD_REG_BIT (zeroed_hardregs, i); + else + emit_clobber (gen_rtx_REG (CCmode, i)); +} + + need_zeroed_hardregs &= ~zeroed_hardregs; + return zeroed_hardregs | +default_zero_call_used_regs (need_zeroed_hardregs); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP @@ -22919,6 +22964,8 @@ mips_asm_file_end (void) #undef TARGET_ASM_FILE_END #define TARGET_ASM_FILE_END mips_asm_file_end +#undef TARGET_ZERO_CALL_USED_REGS +#define TARGET_ZERO_CALL_USED_REGS mips_zero_call_used_regs struct gcc_target targetm = TARGET_INITIALIZER; diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index e0f0a582732..edf58710cdd 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -96,6 +96,7 @@ (define_c_enum "unspec" [ ;; Floating-point environment. UNSPEC_GET_FCSR UNSPEC_SET_FCSR + UNSPEC_ZERO_FCC ;; HI/LO moves. UNSPEC_MFHI @@ -7670,6 +7671,11 @@ (define_insn "*mips_set_fcsr" "TARGET_HARD_FLOAT" "ctc1\t%0,$31") +(define_insn "mips_zero_fcc" + [(unspec_volatile [(const_int 0)] UNSPEC_ZERO_FCC)] + "TARGET_HARD_FLOAT" + "ctc1\t$0,$25") + ;; See tls_get_tp_mips16_ for why this form is used. (define_insn "mips_set_fcsr_mips16_" [(unspec_volatile:SI [(match_operand:P 0 "call_insn_operand" "dS") diff --git a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c index 96e0b79b328..c23b2ceb391 100644 --- a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c +++ b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-skip-if "not implemented" { ! { i?86*-*-* x86_64*-*-* sparc*-*-* aarch64*-*-* nvptx*-*-* s390*-*-* } } } */ +/* { dg-skip-if "not implemented" { ! { i?86*-*-* x86_64*-*-* sparc*-*-* aarch64*-*-* nvptx*-*-* s390*-*-* mips*-*-* } } } */ /* { dg-options "-O2" } */ #include diff --git a/gcc/testsuite/c-c++-common/zero-scratch-regs-11.c b/gcc/test
[PATCH] mips: avoid signed overflow in LUI_OPERAND [PR104842]
I think this one obvious. Ok for trunk? gcc/ PR target/104842 * config/mips/mips.h (LUI_OPERAND): Cast the input to an unsigned value before adding an offset. --- gcc/config/mips/mips.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index bf5c1d5a709..0029864fdcd 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -2309,7 +2309,7 @@ enum reg_class #define LUI_OPERAND(VALUE) \ (((VALUE) | 0x7fff) == 0x7fff\ - || ((VALUE) | 0x7fff) + 0x1 == 0) + || ((unsigned HOST_WIDE_INT) (VALUE) | 0x7fff) + 0x1 == 0) /* Return a value X with the low 16 bits clear, and such that VALUE - X is a signed 16-bit value. */ -- 2.35.1
[PATCH] vect: fix out-of-bound access in supports_vec_convert_optab_p [PR 104851]
This should be obvious, OK for trunk? -- >8 -- Calling VECTOR_MODE_P with MAX_MACHINE_MODE has caused out-of-bound access. gcc/ PR tree-optimization/104851 * optabs-query.cc (supports_vec_convert_optab_p): Fix off-by-one error. --- gcc/optabs-query.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc index 713c098ba4e..68dc679cc6a 100644 --- a/gcc/optabs-query.cc +++ b/gcc/optabs-query.cc @@ -720,7 +720,7 @@ static bool supports_vec_convert_optab_p (optab op, machine_mode mode) { int start = mode == VOIDmode ? 0 : mode; - int end = mode == VOIDmode ? MAX_MACHINE_MODE : mode; + int end = mode == VOIDmode ? MAX_MACHINE_MODE - 1 : mode; for (int i = start; i <= end; ++i) if (VECTOR_MODE_P ((machine_mode) i)) for (int j = MIN_MODE_VECTOR_INT; j < MAX_MODE_VECTOR_INT; ++j) -- 2.35.1
Re: [PATCH] vect: fix out-of-bound access in supports_vec_convert_optab_p [PR 104851]
On Wed, 2022-03-09 at 09:37 +0100, Richard Biener wrote: > On Wed, Mar 9, 2022 at 5:06 AM Xi Ruoyao via Gcc-patches > wrote: > > > > This should be obvious, OK for trunk? > > OK. Committed r12-7559. > > -- >8 -- > > > > Calling VECTOR_MODE_P with MAX_MACHINE_MODE has caused out-of-bound > > access. /* snip */ > > - int end = mode == VOIDmode ? MAX_MACHINE_MODE : mode; > > + int end = mode == VOIDmode ? MAX_MACHINE_MODE - 1 : mode; -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH] mips: avoid signed overflow in LUI_OPERAND [PR104842]
On Tue, 2022-03-08 at 18:20 +, Richard Sandiford wrote: > Xi Ruoyao writes: > > I think this one obvious. Ok for trunk? > OK, thanks. Committed r12-7555. /* snip */ > > #define LUI_OPERAND(VALUE) \ > > (((VALUE) | 0x7fff) == 0x7fff\ > > - || ((VALUE) | 0x7fff) + 0x1 == 0) > > + || ((unsigned HOST_WIDE_INT) (VALUE) | 0x7fff) + 0x1 == 0) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH] cse: avoid signed overflow in compute_const_anchors [PR 104843]
Bootstrapped and tested on mips64el-linux-gnuabi64, and MIPS is the only port with a non-zero targetm.const_anchor. Ok for trunk? -- >8 -- With a non-zero const_anchor, the behavior of this function relied on signed overflow. gcc/ PR rtl-optimization/104843 * cse.cc (compute_const_anchors): Cast to unsigned HOST_WIDE_INT to perform overflow arithmetics safely. --- gcc/cse.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/cse.cc b/gcc/cse.cc index a18b599d324..7c39a009449 100644 --- a/gcc/cse.cc +++ b/gcc/cse.cc @@ -1169,10 +1169,10 @@ compute_const_anchors (rtx cst, HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs, HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs) { - HOST_WIDE_INT n = INTVAL (cst); + unsigned HOST_WIDE_INT n = INTVAL (cst); *lower_base = n & ~(targetm.const_anchor - 1); - if (*lower_base == n) + if (*lower_base == INTVAL (cst)) return false; *upper_base = -- 2.35.1
[PATCH v2] cse: avoid signed overflow in compute_const_anchors [PR 104843]
On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote: > isn't it better to make targetm.const_anchor unsigned? > The & and ~ are not subject to overflow rules. It's not enough: if n is the minimum value of HOST_WIDE_INT and const_anchor = 0x8000 (the value for MIPS), we'll have a signed 0x7fff in *upper_base. Then the next line, "*upper_offs = n - *upper_base;" will be a signed overflow again. How about the following? -- >8 -- With a non-zero const_anchor, the behavior of this function relied on signed overflow. gcc/ PR rtl-optimization/104843 * cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for n to perform overflow arithmetics safely. --- gcc/cse.cc | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/cse.cc b/gcc/cse.cc index a18b599d324..052fa0c3490 100644 --- a/gcc/cse.cc +++ b/gcc/cse.cc @@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst, HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs, HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs) { - HOST_WIDE_INT n = INTVAL (cst); - - *lower_base = n & ~(targetm.const_anchor - 1); - if (*lower_base == n) + unsigned HOST_WIDE_INT n = UINTVAL (cst); + unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1); + if (lb == n) return false; + *lower_base = lb; *upper_base = (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1); *upper_offs = n - *upper_base; -- 2.35.1 >
Re: [PATCH v2] cse: avoid signed overflow in compute_const_anchors [PR 104843]
On Thu, 2022-03-10 at 09:01 +0100, Richard Biener wrote: > On Wed, Mar 9, 2022 at 5:12 PM Xi Ruoyao > wrote: > > > > On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote: > > > > > isn't it better to make targetm.const_anchor unsigned? > > > The & and ~ are not subject to overflow rules. > > > > It's not enough: if n is the minimum value of HOST_WIDE_INT and > > const_anchor = 0x8000 (the value for MIPS), we'll have a signed > > 0x7fff > > in *upper_base. Then the next line, "*upper_offs = n - > > *upper_base;" > > will be a signed overflow again. > > > > How about the following? > > Hmm, so all this seems to be to round CST up and down to a multiple of > CONST_ANCHOR. > It works on CONST_INT only which is sign-extended, so if there is > overflow the resulting > anchor is broken as far as I can see. On MIPS addiu/daddiu do 2-complement addition, so the overflowed result is still usable. > So instead of papering over this issue > the function should return false when n is negative since then > n & ~(targetm.const_anchor - 1) is also not n rounded down to a > multiple of const_anchor. This function does work for negative n, like: void g (int, int); void f (void) { g(0x8123, 0x81240001); } It should produce: li $4,-2128347136 # 0x8124 daddiu $5,$4,1 daddiu $4,$4,-1 jal g But return false for negative n will cause regression for this case, producing: li $5,-2128347136 # 0x8124 li $4,-2128412672 # 0x8123 ori $5,$5,0x1 ori $4,$4,0x jal g That being said, it indeed does not work for: void g (int, int); void f () { g (0x7fff, 0x8001); } It produces: li $5,-2147483648 # 0x8000 li $4,2147418112 # 0x7fff daddiu $5,$5,1 ori $4,$4,0x jal g Should be: li $5,-2147483648 # 0x8000 daddiu $5,$5,1 addiu $4,$5,-1 > > -- >8 -- > > > > With a non-zero const_anchor, the behavior of this function relied on > > signed overflow. > > > > gcc/ > > > > PR rtl-optimization/104843 > > * cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for > > n to perform overflow arithmetics safely. > > --- > > gcc/cse.cc | 8 > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/gcc/cse.cc b/gcc/cse.cc > > index a18b599d324..052fa0c3490 100644 > > --- a/gcc/cse.cc > > +++ b/gcc/cse.cc > > @@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst, > > HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs, > > HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs) > > { > > - HOST_WIDE_INT n = INTVAL (cst); > > - > > - *lower_base = n & ~(targetm.const_anchor - 1); > > - if (*lower_base == n) > > + unsigned HOST_WIDE_INT n = UINTVAL (cst); > > + unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1); > > + if (lb == n) > > return false; > > > > + *lower_base = lb; > > *upper_base = > > (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1); > > *upper_offs = n - *upper_base; > > -- > > 2.35.1 > > > > > > > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
On Wed, 2022-03-09 at 18:25 +, Richard Sandiford wrote: > Xi Ruoyao writes: > > Bootstrapped and regtested on mips64el-linux-gnuabi64. > > > > I'm not sure if it's "correct" to clobber other registers during the > > zeroing of scratch registers. But I can't really come up with a > > better > > idea: on MIPS there is no simple way to clear one bit in FCSR (i. e. > > FCC[x]). We can't just use "c.f.s $fccx,$f0,$f0" because it will > > raise > > an exception if $f0 contains a sNaN. > > Yeah, it's a bit of a grey area, but I think it should be fine, > provided > that the extra clobbers are never used as return registers (which is > obviously true for the FCC registers). > > But on that basis… /* snip */ > > + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, LO_REGNUM)) > > + SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM); > > + else > > + emit_clobber (gen_rtx_REG (word_mode, LO_REGNUM)); > > …I don't think this conditional LO_REGNUM code is worth it. > We might as well just add both registers to zeroed_hardregs. (See below) > > + } > > + > > + bool zero_fcc = false; > > + for (int i = ST_REG_FIRST; i <= ST_REG_LAST; i++) > > + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, i)) > > + zero_fcc = true; > > + > > + /* MIPS does not have a simple way to clear one bit in FCC. We just > > + clear FCC with ctc1 and clobber all FCC bits. */ > > + if (zero_fcc) > > + { > > + emit_insn (gen_mips_zero_fcc ()); > > + for (int i = ST_REG_FIRST; i <= ST_REG_LAST; i++) > > + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, i)) > > + SET_HARD_REG_BIT (zeroed_hardregs, i); > > + else > > + emit_clobber (gen_rtx_REG (CCmode, i)); > > + } > > Here too I think we should just do: > > zeroed_hardregs |= reg_class_contents[ST_REGS] & accessible_reg_set; > > to include all available FCC registers. I'm afraid that doing so will cause an ICE (triggering an assertion somewhere). Could someone confirm that returning "more" registers than required is allowed? GCC Internal does not say it explicitly, and x86 port is carefully avoiding from clearing registers not requested to be cleared. > > + need_zeroed_hardregs &= ~zeroed_hardregs; > > + return zeroed_hardregs | > > + default_zero_call_used_regs (need_zeroed_hardregs); > > Nit, but: should be formatted as: > > return (zeroed_hardregs > | default_zero_call_used_regs (need_zeroed_hardregs)); > > > +} Will do. > > /* Initialize the GCC target structure. */ > > #undef TARGET_ASM_ALIGNED_HI_OP > > @@ -22919,6 +22964,8 @@ mips_asm_file_end (void) > > #undef TARGET_ASM_FILE_END > > #define TARGET_ASM_FILE_END mips_asm_file_end > > > > +#undef TARGET_ZERO_CALL_USED_REGS > > +#define TARGET_ZERO_CALL_USED_REGS mips_zero_call_used_regs > > > > struct gcc_target targetm = TARGET_INITIALIZER; > > > > diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md > > index e0f0a582732..edf58710cdd 100644 > > --- a/gcc/config/mips/mips.md > > +++ b/gcc/config/mips/mips.md > > @@ -96,6 +96,7 @@ (define_c_enum "unspec" [ > > ;; Floating-point environment. > > UNSPEC_GET_FCSR > > UNSPEC_SET_FCSR > > + UNSPEC_ZERO_FCC > > > > ;; HI/LO moves. > > UNSPEC_MFHI > > @@ -7670,6 +7671,11 @@ (define_insn "*mips_set_fcsr" > > "TARGET_HARD_FLOAT" > > "ctc1\t%0,$31") > > > > +(define_insn "mips_zero_fcc" > > + [(unspec_volatile [(const_int 0)] UNSPEC_ZERO_FCC)] > > + "TARGET_HARD_FLOAT" > > + "ctc1\t$0,$25") > > I've forgotten a lot of MIPS stuff, so: does this clear only the > FCC registers, or does it clear other things (such as exception bits) > as well? Yes, with fs = 25 CTC1 only clear FCCs. > Does it work even for !ISA_HAS_8CC? For !ISA_HAS_8CC targets, ST_REG_FIRST is not added into need_zeroed_hardregs at all. I think it's another bug I didn't catched... > I think this pattern should explicit clear all eight registers, e.g. > using: > > (set (reg:CC FCC0_REGNUM) (const_int 0)) > (set (reg:CC FCC1_REGNUM) (const_int 0)) > … > > which unfortunately means defining 8 new register constants in mips.md. > I guess for extra safety there should be a separate !ISA_HAS_8CC version > that only sets FCC0_REGNUM. Will do. > An alternative would be to avoid clearing the FCC registers altogether. > I suppose that's less secure, but residual information could leak through > the exception bits as well, and it isn't clear whether those should be > zeroed at the end of each function. I guess it depends on people's > appetite for risk. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH v2 RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
Changes from v1: * Added all zeroed registers into the return value of TARGET_ZERO_CALL_USED_REGS. **The question: is this allowed?** * Define mips_zero_fcc insn only for ISA_HAS_8CC and mips_isa > MIPS_ISA_MIPS4, because MIPS I-IV and MIPSr6 don't support "ISA_HAS_8CC && mips_isa > MIPS_ISA_MIPS4". * Change mips_zero_fcc to explicit clear all eight registers. * Report an error for MIPS I-IV. -- >8 -- This fixes the ICEs using -fzero-call-used-regs=all for MIPS target. OpenSSH-8.9p1 has started to enable this by default, giving us a reason to fix -fzero-call-used-regs for more targets. gcc/ PR target/xx (WIP) PR target/xx (Don't push) * config/mips/mips.cc (mips_zero_call_used_regs): New function. (TARGET_ZERO_CALL_USED_REGS): Define. * config/mips/mips.md (FCC{0..9}_REGNUM): New constants. (mips_zero_fcc): New insn. gcc/testsuite * c-c++-common/zero-scratch-regs-8.c: Enable for MIPS. * c-c++-common/zero-scratch-regs-9.c: Likewise. * c-c++-common/zero-scratch-regs-10.c: Likewise. * c-c++-common/zero-scratch-regs-11.c: Likewise. --- gcc/config/mips/mips.cc | 55 +++ gcc/config/mips/mips.md | 20 +++ .../c-c++-common/zero-scratch-regs-10.c | 2 +- .../c-c++-common/zero-scratch-regs-11.c | 2 +- .../c-c++-common/zero-scratch-regs-8.c | 2 +- .../c-c++-common/zero-scratch-regs-9.c | 2 +- 6 files changed, 79 insertions(+), 4 deletions(-) diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc index 4f9683e8bf4..59eef515826 100644 --- a/gcc/config/mips/mips.cc +++ b/gcc/config/mips/mips.cc @@ -22611,6 +22611,59 @@ mips_asm_file_end (void) if (NEED_INDICATE_EXEC_STACK) file_end_indicate_exec_stack (); } + +static HARD_REG_SET +mips_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs) +{ + HARD_REG_SET zeroed_hardregs; + CLEAR_HARD_REG_SET (zeroed_hardregs); + + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, HI_REGNUM)) + { + /* Clear HI and LO altogether. MIPS target treats HILO as a + double-word register. */ + machine_mode dword_mode = TARGET_64BIT ? TImode : DImode; + rtx hilo = gen_rtx_REG (dword_mode, MD_REG_FIRST); + rtx zero = CONST0_RTX (dword_mode); + emit_move_insn (hilo, zero); + + SET_HARD_REG_BIT (zeroed_hardregs, HI_REGNUM); + SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM); + } + + /* MIPS does not have a simple way to clear one bit in FCC. We just + clear FCC with ctc1 and clobber all FCC bits. */ + HARD_REG_SET fcc = reg_class_contents[ST_REGS] & accessible_reg_set; + if (hard_reg_set_intersect_p (need_zeroed_hardregs, fcc)) + { + static bool issued_error = false; + if (mips_isa <= MIPS_ISA_MIPS4) + { + /* We don't have an easy way to clear FCC on MIPS I, II, III, + and IV. */ + if (!issued_error) + sorry ("%qs not supported on this target", + "-fzero-call-used-regs"); + issued_error = true; + + /* Prevent an ICE. */ + need_zeroed_hardregs &= ~fcc; + } + else + { + /* If the target is MIPSr6, we should not reach here. All other + MIPS targets are ISA_HAS_8CC. */ + gcc_assert (ISA_HAS_8CC); + emit_insn (gen_mips_zero_fcc ()); + zeroed_hardregs |= fcc; + } + } + + need_zeroed_hardregs &= ~zeroed_hardregs; + return (zeroed_hardregs | + default_zero_call_used_regs (need_zeroed_hardregs)); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP @@ -22919,6 +22972,8 @@ mips_asm_file_end (void) #undef TARGET_ASM_FILE_END #define TARGET_ASM_FILE_END mips_asm_file_end +#undef TARGET_ZERO_CALL_USED_REGS +#define TARGET_ZERO_CALL_USED_REGS mips_zero_call_used_regs struct gcc_target targetm = TARGET_INITIALIZER; diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index e0f0a582732..36d6a43d67f 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -167,6 +167,14 @@ (define_constants (SET_FCSR_REGNUM 4) (PIC_FUNCTION_ADDR_REGNUM 25) (RETURN_ADDR_REGNUM 31) + (FCC0_REGNUM 67) + (FCC1_REGNUM 68) + (FCC2_REGNUM 69) + (FCC3_REGNUM 70) + (FCC4_REGNUM 71) + (FCC5_REGNUM 72) + (FCC6_REGNUM 73) + (FCC7_REGNUM 74) (CPRESTORE_SLOT_REGNUM 76) (GOT_VERSION_REGNUM 79) @@ -7670,6 +7678,18 @@ (define_insn "*mips_set_fcsr" "TARGET_HARD_FLOAT" "ctc1\t%0,$31") +(define_insn "mips_zero_fcc" + [(set (reg:CC FCC0_REGNUM) (const_int 0)) + (set (reg:CC FCC1_REGNUM) (const_int 0)) + (set (reg:CC FCC2_REGNUM) (const_int 0)) + (set (reg:CC FCC3_REGNUM) (const_int 0)) + (set (reg:CC FCC4_REGNUM) (const_int 0)) + (set (reg:CC FCC5_REGNUM) (const_int 0)) + (set (reg:CC FCC6_REGNUM) (const_int 0)) + (set (reg:CC FCC7_REGNUM) (const_int 0))] + "TARGET_HARD_FLOAT && ISA_HAS_8CC && mips_isa > MIPS_ISA_MIPS4" + "ctc1\t$0,$25") + ;; See tls_get_tp_mips16_ for why this form is used. (define_insn "mips_set_fcsr_mips16_" [(unspec_volatile:SI [(match_operand:P 0 "call_insn_operand" "dS") diff --git a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c index 96e0b79b328..c23b2ceb391 100644 --- a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c +++ b/gcc/tes
[PATCH v2 RFC, resend] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
On Thu, 2022-03-10 at 21:41 +0800, Xi Ruoyao wrote: > Changes from v1: > > * Added all zeroed registers into the return value of > TARGET_ZERO_CALL_USED_REGS. **The question: is this allowed?** > * Define mips_zero_fcc insn only for ISA_HAS_8CC and mips_isa > > MIPS_ISA_MIPS4, because MIPS I-IV and MIPSr6 don't support > "ISA_HAS_8CC && mips_isa > MIPS_ISA_MIPS4". > * Change mips_zero_fcc to explicit clear all eight registers. > * Report an error for MIPS I-IV. My mail client somehow mangled the patch. Resending... -- >8 -- This fixes the ICEs using -fzero-call-used-regs=all for MIPS target. OpenSSH-8.9p1 has started to enable this by default, giving us a reason to fix -fzero-call-used-regs for more targets. gcc/ PR target/xx (WIP) PR target/xx (Don't push) * config/mips/mips.cc (mips_zero_call_used_regs): New function. (TARGET_ZERO_CALL_USED_REGS): Define. * config/mips/mips.md (FCC{0..9}_REGNUM): New constants. (mips_zero_fcc): New insn. gcc/testsuite * c-c++-common/zero-scratch-regs-8.c: Enable for MIPS. * c-c++-common/zero-scratch-regs-9.c: Likewise. * c-c++-common/zero-scratch-regs-10.c: Likewise. * c-c++-common/zero-scratch-regs-11.c: Likewise. --- gcc/config/mips/mips.cc | 55 +++ gcc/config/mips/mips.md | 20 +++ .../c-c++-common/zero-scratch-regs-10.c | 2 +- .../c-c++-common/zero-scratch-regs-11.c | 2 +- .../c-c++-common/zero-scratch-regs-8.c| 2 +- .../c-c++-common/zero-scratch-regs-9.c| 2 +- 6 files changed, 79 insertions(+), 4 deletions(-) diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc index 4f9683e8bf4..59eef515826 100644 --- a/gcc/config/mips/mips.cc +++ b/gcc/config/mips/mips.cc @@ -22611,6 +22611,59 @@ mips_asm_file_end (void) if (NEED_INDICATE_EXEC_STACK) file_end_indicate_exec_stack (); } + +static HARD_REG_SET +mips_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs) +{ + HARD_REG_SET zeroed_hardregs; + CLEAR_HARD_REG_SET (zeroed_hardregs); + + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, HI_REGNUM)) +{ + /* Clear HI and LO altogether. MIPS target treats HILO as a +double-word register. */ + machine_mode dword_mode = TARGET_64BIT ? TImode : DImode; + rtx hilo = gen_rtx_REG (dword_mode, MD_REG_FIRST); + rtx zero = CONST0_RTX (dword_mode); + emit_move_insn (hilo, zero); + + SET_HARD_REG_BIT (zeroed_hardregs, HI_REGNUM); + SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM); +} + + /* MIPS does not have a simple way to clear one bit in FCC. We just + clear FCC with ctc1 and clobber all FCC bits. */ + HARD_REG_SET fcc = reg_class_contents[ST_REGS] & accessible_reg_set; + if (hard_reg_set_intersect_p (need_zeroed_hardregs, fcc)) +{ + static bool issued_error = false; + if (mips_isa <= MIPS_ISA_MIPS4) + { + /* We don't have an easy way to clear FCC on MIPS I, II, III, +and IV. */ + if (!issued_error) + sorry ("%qs not supported on this target", + "-fzero-call-used-regs"); + issued_error = true; + + /* Prevent an ICE. */ + need_zeroed_hardregs &= ~fcc; + } + else + { + /* If the target is MIPSr6, we should not reach here. All other +MIPS targets are ISA_HAS_8CC. */ + gcc_assert (ISA_HAS_8CC); + emit_insn (gen_mips_zero_fcc ()); + zeroed_hardregs |= fcc; + } +} + + need_zeroed_hardregs &= ~zeroed_hardregs; + return (zeroed_hardregs | + default_zero_call_used_regs (need_zeroed_hardregs)); +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP @@ -22919,6 +22972,8 @@ mips_asm_file_end (void) #undef TARGET_ASM_FILE_END #define TARGET_ASM_FILE_END mips_asm_file_end +#undef TARGET_ZERO_CALL_USED_REGS +#define TARGET_ZERO_CALL_USED_REGS mips_zero_call_used_regs struct gcc_target targetm = TARGET_INITIALIZER; diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index e0f0a582732..36d6a43d67f 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -167,6 +167,14 @@ (define_constants (SET_FCSR_REGNUM4) (PIC_FUNCTION_ADDR_REGNUM 25) (RETURN_ADDR_REGNUM 31) + (FCC0_REGNUM67) + (FCC1_REGNUM68) + (FCC2_REGNUM69) + (FCC3_REGNUM70) + (FCC4_REGNUM71) + (FCC5_REGNUM72) + (FCC6_REGNUM73) + (FCC7_REGNUM74) (CPRESTORE_SLOT_REGNUM 76) (GOT_VERSION_REGNUM 79) @@ -7670,6 +7678,18 @@ (define_insn "*mips_set
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
On Thu, 2022-03-10 at 20:31 +, Qing Zhao wrote: > > > + SET_HARD_REG_BIT (zeroed_hardregs, HI_REGNUM); > > > + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, LO_REGNUM)) > > > + SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM); > > > + else > > > + emit_clobber (gen_rtx_REG (word_mode, LO_REGNUM)); > > > > …I don't think this conditional LO_REGNUM code is worth it. > > We might as well just add both registers to zeroed_hardregs. > > If the LO_REGNUM is NOT in “need_zeroed_hardregs”, adding it to > “zeroed_hardregs” seems not right to me. > What’s you mean by “not worth it”? It's because the MIPS port almost always treat HI as "a subreg of dword HI-LO register". A direct "mthi $0" is possible but MIPS backend does not recognize "emit_move_insn (HI, CONST_0)". In theory it's possible to emit the mthi instruction explicitly here though, but we'll need to clear something NOT in need_zeroed_hardregs for MIPS anyway (see below). > > Here too I think we should just do: > > > > zeroed_hardregs |= reg_class_contents[ST_REGS] & accessible_reg_set; > > > > to include all available FCC registers. > > What’s the relationship between “ST_REGs” and FCC? (sorry for the stupid > question since I am not familiar with the MIPS register set). MIPS instruction manual names the 8 one-bit floating condition codes FCC0, ..., FCC7, but GCC MIPS backend code names the condition codes ST_REG0, ..., ST_REG7. Maybe it's better to always use the name "ST_REG" instead of "FCC" then. > From the above code, looks like that when any “ST_REGs” is in > “need_zeroed_hardregs”,FCC need to be cleared? Because there is no elegant way to clear one specific FCC bit in MIPS. A "ctc1 $0, $25" instruction will zero them altogether. If we really need to clear only one of them (let's say ST_REG3), we'll have to emit something like mtc1 $0, $0 # zero FPR0 to ensure it won't contain sNaN c.f.s $3, $0, $0 Then we'll still need to clobber FPR0 with zero. So anyway we'll have to clear some registers not specified in need_zeroed_hardregs. And the question is: is it really allowed to return something other than a subset of need_zeroed_hardregs for a TARGET_ZERO_CALL_USED_REGS hook? If yes then we'll happily to do so (like how the v2 of the patch does), otherwise we'd need to clobber those registers NOT in need_zeroed_hardregs explicitly. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH 1/2] libsanitizer: cherry-pick db7bca28638e from upstream
libsanitizer/ * sanitizer_common/sanitizer_atomic_clang.h: Ensures to only include sanitizer_atomic_clang_mips.h for O32. --- libsanitizer/sanitizer_common/sanitizer_atomic_clang.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libsanitizer/sanitizer_common/sanitizer_atomic_clang.h b/libsanitizer/sanitizer_common/sanitizer_atomic_clang.h index fc13ca52dda..ccf18f0786d 100644 --- a/libsanitizer/sanitizer_common/sanitizer_atomic_clang.h +++ b/libsanitizer/sanitizer_common/sanitizer_atomic_clang.h @@ -96,8 +96,8 @@ inline bool atomic_compare_exchange_weak(volatile T *a, // This include provides explicit template instantiations for atomic_uint64_t // on MIPS32, which does not directly support 8 byte atomics. It has to // proceed the template definitions above. -#if defined(_MIPS_SIM) && defined(_ABIO32) - #include "sanitizer_atomic_clang_mips.h" +#if defined(_MIPS_SIM) && defined(_ABIO32) && _MIPS_SIM == _ABIO32 +# include "sanitizer_atomic_clang_mips.h" #endif #undef ATOMIC_ORDER -- 2.35.1
[PATCH 2/2] Enable libsanitizer build on mips64
Bootstrapped and regtested on mips64el-linux-gnuabi64. bootstrap-ubsan revealed 3 bugs (PR 104842, 104843, 104851). bootstrap-asan did not reveal any new bug. gcc/ * config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Define. * config/mips/mips.cc (mips_option_override): Make -fsanitize=address imply -fasynchronous-unwind-tables. This is needed by libasan for stack backtrace on MIPS. (mips_asan_shadow_offset): Return SUBTARGET_SHADOW_OFFSET. gcc/testsuite: * c-c++-common/asan/global-overflow-1.c: Skip for MIPS with some optimization levels because inaccurate debug info is causing dg-output mismatch on line numbers. * g++.dg/asan/large-func-test-1.C: Likewise. libsanitizer/ * configure.tgt: Enable build on mips64. --- gcc/config/mips/mips.cc | 9 - gcc/config/mips/mips.h | 7 +++ gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 1 + gcc/testsuite/g++.dg/asan/large-func-test-1.C | 1 + libsanitizer/configure.tgt | 4 5 files changed, 17 insertions(+), 5 deletions(-) diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc index 59eef515826..6b06c6380f6 100644 --- a/gcc/config/mips/mips.cc +++ b/gcc/config/mips/mips.cc @@ -19974,6 +19974,13 @@ mips_option_override (void) target_flags |= MASK_64BIT; } + /* -fsanitize=address needs to turn on -fasynchronous-unwind-tables in + order for tracebacks to be complete but not if any + -fasynchronous-unwind-table were already specified. */ + if (flag_sanitize & SANITIZE_USER_ADDRESS + && !global_options_set.x_flag_asynchronous_unwind_tables) +flag_asynchronous_unwind_tables = 1; + if ((target_flags_explicit & MASK_FLOAT64) != 0) { if (mips_isa_rev >= 6 && !TARGET_FLOAT64) @@ -22591,7 +22598,7 @@ mips_constant_alignment (const_tree exp, HOST_WIDE_INT align) static unsigned HOST_WIDE_INT mips_asan_shadow_offset (void) { - return 0x0aaa; + return SUBTARGET_SHADOW_OFFSET; } /* Implement TARGET_STARTING_FRAME_OFFSET. See mips_compute_frame_info diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 0029864fdcd..858bbba3a36 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -3463,3 +3463,10 @@ struct GTY(()) machine_function { && !TARGET_MICROMIPS && !TARGET_FIX_24K) #define NEED_INDICATE_EXEC_STACK 0 + +/* Define the shadow offset for asan. Other OS's can override in the + respective tm.h files. */ +#ifndef SUBTARGET_SHADOW_OFFSET +#define SUBTARGET_SHADOW_OFFSET \ + (POINTER_SIZE == 64 ? HOST_WIDE_INT_1 << 37 : HOST_WIDE_INT_C (0x0aaa)) +#endif diff --git a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c index 1092a316681..ec412231be0 100644 --- a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c +++ b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c @@ -22,6 +22,7 @@ int main() { return res; } +/* { dg-skip-if "inaccurate debug info" { mips*-*-* } { "*" } { "-O0" } } */ /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */ /* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0|\[^\n\r]*\\+0x\[0-9a-z\]*)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */ /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of global variable" } */ diff --git a/gcc/testsuite/g++.dg/asan/large-func-test-1.C b/gcc/testsuite/g++.dg/asan/large-func-test-1.C index b42c09e3b0d..ac9deb898c8 100644 --- a/gcc/testsuite/g++.dg/asan/large-func-test-1.C +++ b/gcc/testsuite/g++.dg/asan/large-func-test-1.C @@ -35,6 +35,7 @@ int main() { delete x; } +// { dg-skip-if "inaccurate debug info" { mips*-*-* } { "-Os" } { "" } } // { dg-output "ERROR: AddressSanitizer:? heap-buffer-overflow on address\[^\n\r]*" } // { dg-output "0x\[0-9a-f\]+ at pc 0x\[0-9a-f\]+ bp 0x\[0-9a-f\]+ sp 0x\[0-9a-f\]+\[^\n\r]*(\n|\r\n|\r)" } // { dg-output "\[^\n\r]*READ of size 4 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } diff --git a/libsanitizer/configure.tgt b/libsanitizer/configure.tgt index 5a59ea6a1b5..fb89df4935c 100644 --- a/libsanitizer/configure.tgt +++ b/libsanitizer/configure.tgt @@ -54,10 +54,6 @@ case "${target}" in ;; arm*-*-linux*) ;; - mips*64*-*-linux*) - # This clause is only here to not match the supported mips*-*-linux*. - UNSUPPORTED=1 - ;; mips*-*-linux*) ;; aarch64*-*-linux*) -- 2.35.1
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
On Fri, 2022-03-11 at 16:08 +, Qing Zhao wrote: > Why there is “mthi $0” instruction, but there is NO emit_move_insn(HI, > CONST_0)? > Is such mismatch a bug? If not, why? > > > In theory it's possible > > to emit the mthi instruction explicitly here though, but we'll need to > > clear something NOT in need_zeroed_hardregs for MIPS anyway (see below). > > One question here, is there situation when only HI is cleared but LO is not > cleared? No, if I interpret the document of -fzero_call_used_regs and attribute((zero_call_used_regs(...))) correctly. A 2-reg multiplication (or division) always set the value of both HI and LO. Richard has added a comment for this in mips.cc: > 12868 /* After a multiplication or division, clobbering HI makes > 1 the value of LO unpredictable, and vice versa. This means > 2 that, for all interesting cases, HI and LO are effectively > 3 a single register. > 4 > 5 We model this by requiring that any value that uses HI > 6 also uses LO. */ This is also why the handling of emit_move_insn(HI, CONST_0) was removed, I guess (the removal happened in the same commit adding this comment). > > > > Okay, I see. So, each ST_REGi register is a 1-bit pseudo register? > But physically each of them is 1-bit in a physical register? Yes. > > > > Because there is no elegant way to clear one specific FCC bit in MIPS. > > A "ctc1 $0, $25" instruction will zero them altogether. If we really > > need to clear only one of them (let's say ST_REG3), we'll have to emit > > something like > > > > mtc1 $0, $0 # zero FPR0 to ensure it won't contain sNaN > > c.f.s $3, $0, $0 > > > > Then we'll still need to clobber FPR0 with zero. So anyway we'll have > > to clear some registers not specified in need_zeroed_hardregs. > > So, “c.f.s” instruction can be used to clear ONLY one specific FCC bit? > But you have to clear one FPR (floating pointer register?) first to avoid > raising exception? > My question here is: is there a case when only FCC need to be cleared but no > FPR need to be cleared? Yes, for example: double a, b; struct x { double a, b; }; struct x f(void) { struct x x = { .a = a, .b = b }; if (a < b) x.a = x.b; return x; } It does not need to zero the two FPRs, as they contain the return value. But a FCC bit needs to be cleared. > If NOT, then we can always pick one FPRi before c.f.s to avoid the > issue you mentioned (We’ll have to clear some registers not specified > in need_zeroed_hardregs). I'm now thinking: is there always at least one *GPR* which need to be cleared? If it's true, let's say GPR $12, and fcc0 & fcc2 needs to be cleared, we can use something like: cfc1 $12, $25 andi $25, 5 ctc1 $12, $25 move $12, $0 > > And the question is: is it really allowed to return something other than > > a subset of need_zeroed_hardregs for a TARGET_ZERO_CALL_USED_REGS hook? > > Although currently there is no assertion added to force this > requirement, I still think that we should keep it. > > The “need_zeroed_hardregs” is computed based on > > 1. User’s request from command line option; > 2. Data flow info of the routine; > 3. Abi info of the target; > > If zero_call_used_regs target hook return registers out of > “need_zeroed_hardregs” set, then it might out of the user’s exception, > it should be considered as a bug, I think. I have the same concern. But now I'm too sleepy... Will try to improve this tomorrow. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
On Sat, 2022-03-12 at 01:29 +0800, Xi Ruoyao via Gcc-patches wrote: > I'm now thinking: is there always at least one *GPR* which need to be > cleared? If it's true, let's say GPR $12, and fcc0 & fcc2 needs to be > cleared, we can use something like: > > cfc1 $12, $25 > andi $25, 5 $12, 5. I can't type. > ctc1 $12, $25 > move $12, $0 -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]
On Fri, 2022-03-11 at 21:26 +, Qing Zhao wrote: > Hi, Ruoyao, > > (I might not be able to reply to this thread till next Wed due to a > short vacation). > > First, some comments on opening bugs against Gcc: > > I took a look at the bug reports PR104817 and PR104820: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104820 > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104817 > > I didn’t see a testing case and a script to repeat the error, so I > cannot repeat the error at my side. I've put the test case, but maybe you didn't see it because it is too simple: echo 'int t() {}' | /home/xry111/git-repos/gcc-test-mips/gcc/cc1 -nostdinc -fzero-call-used-regs=all An empty function is enough to break -fzero-call-used-regs=all. And if you append -mips64r2 to the cc1 command line you'll get 104820. I enabled 4 existing tests for MIPS (reported "not work" on MIPS) in the patch so I think it's unnecessary to add new test cases. Richard: can we use MIPS_EPILOGUE_TEMP as a scratch register in the sequence for zeroing the call-used registers, and then zero itself (despite it's not in need_zeroed_hardregs)? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University