[PATCH 0/2] Unify and deduplicate FTM code
Greetings!

This patch set replaces all code that defines feature test macros (FTMs)
based on loosely put together conditionals in the standard library with
a unified helper for specifying and requiring feature test macros. It
also updates most usage sites, many of which have been migrated to a
pattern similar, in structure, to:

  ...
  #define __glibcxx_want_foo
  #include <bits/version.h>
  ...
  namespace std {
  ...
  #ifdef __cpp_lib_foo
    template<typename T>
      void foonicate(T&& t)
      { __builtin_foonicate_address(std::__addressof(t)); }
  #endif // __cpp_lib_foo
  ...
  } // namespace std

In the future this should aid in preventing <version> from being
dishonest about what the implementation provides, as well as reducing
the amount of finicky work it takes to update FTMs.

Note that this patchset is not perfect. The usage sites of various
feature test macros still include "wide" condition blocks that shadow
over the blocks that check for FTMs, mostly in places where features
with FTMs are the exception, rather than the norm.

That said, using a pair of scripts[1][2], I've tested that the code
emitted in bits/stdc++.h remains unchanged (save for a misdeclared
__cpp_lib_constexpr_string in !HOSTED), as well as regression-tested
--enable-languages=c,c++,lto on x86_64-pc-linux-gnu, and ran the
libstdc++ testsuite with

  --target_board="unix{,-std=c++98,-std=gnu++11,-std=gnu++20,
  -D_GLIBCXX_USE_CXX11_ABI=0/-D_GLIBCXX_DEBUG,-D_GLIBCXX_DEBUG,
  -std=gnu++23}{-fno-freestanding,-ffreestanding}"

(without the line breaks) to find no relevant failures.

OK for trunk?

Thanks in advance, have a lovely day.
[1] https://git.sr.ht/~arsen/scripts/tree/master/item/difall.bash
[2] https://git.sr.ht/~arsen/scripts/tree/master/item/vercmp.bash

Arsen Arsenović (2):
  libstdc++: Implement more maintainable <version> header
  libstdc++: Replace all manual FTM definitions and use

 libstdc++-v3/include/Makefile.am | 10 +-
 libstdc++-v3/include/Makefile.in | 10 +-
 libstdc++-v3/include/bits/algorithmfwd.h | 7 +-
 libstdc++-v3/include/bits/align.h | 8 +-
 libstdc++-v3/include/bits/alloc_traits.h | 11 +-
 libstdc++-v3/include/bits/allocator.h | 3 +-
 libstdc++-v3/include/bits/atomic_base.h | 14 +-
 libstdc++-v3/include/bits/atomic_wait.h | 10 +-
 libstdc++-v3/include/bits/basic_string.h | 24 +-
 libstdc++-v3/include/bits/char_traits.h | 11 +-
 libstdc++-v3/include/bits/chrono.h | 18 +-
 libstdc++-v3/include/bits/cow_string.h | 9 +-
 libstdc++-v3/include/bits/erase_if.h | 11 +-
 libstdc++-v3/include/bits/forward_list.h | 6 +-
 libstdc++-v3/include/bits/hashtable.h | 9 +-
 libstdc++-v3/include/bits/ios_base.h | 6 +-
 libstdc++-v3/include/bits/move.h | 8 +-
 .../include/bits/move_only_function.h | 9 +-
 libstdc++-v3/include/bits/node_handle.h | 8 +-
 libstdc++-v3/include/bits/ptr_traits.h | 15 +-
 libstdc++-v3/include/bits/range_access.h | 16 +-
 libstdc++-v3/include/bits/ranges_algo.h | 27 +-
 libstdc++-v3/include/bits/ranges_cmp.h | 14 +-
 libstdc++-v3/include/bits/shared_ptr.h | 10 +-
 libstdc++-v3/include/bits/shared_ptr_atomic.h | 6 +-
 libstdc++-v3/include/bits/shared_ptr_base.h | 17 +-
 libstdc++-v3/include/bits/specfun.h | 6 +-
 libstdc++-v3/include/bits/stl_algo.h | 20 +-
 libstdc++-v3/include/bits/stl_algobase.h | 13 +-
 libstdc++-v3/include/bits/stl_function.h | 28 +-
 libstdc++-v3/include/bits/stl_iterator.h | 21 +-
 libstdc++-v3/include/bits/stl_list.h | 6 +-
 libstdc++-v3/include/bits/stl_map.h | 6 +-
 libstdc++-v3/include/bits/stl_pair.h | 12 +-
 libstdc++-v3/include/bits/stl_queue.h | 9 +-
 libstdc++-v3/include/bits/stl_stack.h | 7 +-
 libstdc++-v3/include/bits/stl_tree.h | 7 +-
 libstdc++-v3/include/bits/stl_uninitialized.h | 9 +-
 libstdc++-v3/include/bits/stl_vector.h | 4 +-
 libstdc++-v3/include/bits/unique_ptr.h | 13 +-
 libstdc++-v3/include/bits/unordered_map.h | 8 +-
 .../include/bits/uses_allocator_args.h | 10 +-
 libstdc++-v3/include/bits/utility.h | 21 +-
 libstdc++-v3/include/bits/version.def | 1591 ++
 libstdc++-v3/include/bits/version.h | 1937 +
 libstdc++-v3/include/bits/version.tpl | 209 ++
 .../include/c_compatibility/stdatomic.h | 9 +-
 libstdc++-v3/include/c_global/cmath | 18 +-
 libstdc++-v3/include/c_global/cstddef | 9 +-
 libstdc++-v3/include/std/algorithm | 10 +-
 libstdc++-v3/include/std/any | 9 +-
 libstdc++-v3/include/std/array | 9 +-
 libstdc++-v3/include/std/atomic | 67 +-
 libstdc++-v3/include/std/barrier | 11 +-
 libstdc++-v
[PATCH 1/2] libstdc++: Implement more maintainable <version> header
This commit replaces the ad-hoc logic in <version> with an AutoGen
database that (mostly) declaratively generates a version.h bit which
combines all of the FTM logic across all headers together.

This generated header defines macros of the form __glibcxx_foo,
equivalent to their __cpp_lib_foo variants, according to rules specified
in version.def and, optionally, if __glibcxx_want_foo or
__glibcxx_want_all are defined, also defines __cpp_lib_foo forms with
the same definition.

libstdc++-v3/ChangeLog:

        * include/Makefile.am (bits_freestanding): Add version.h.
        (allcreated): Add version.h.
        (${bits_srcdir}/version.h): New rule.  Regenerates
        version.h out of version.{def,tpl}.
        * include/Makefile.in: Regenerate.
        * include/bits/version.def: New file.  Declares a list of
        all feature test macros, their values and their preconditions.
        * include/bits/version.tpl: New file.  Turns version.def
        into a sequence of #if blocks.
        * include/bits/version.h: New file.  Generated from
        version.def.
        * include/std/version: Replace with a __glibcxx_want_all define
        and bits/version.h include.
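To make the two-level naming described above concrete, here is a minimal sketch of what one generated block could look like and how code would observe it. It is not taken from the patch: the DEMO_ prefix stands in for the reserved __ prefix, and the macro name "foo", the value 202306L and the __cplusplus precondition are assumptions made up for illustration. The real output of version.tpl may be shaped differently.

```cpp
#include <cassert>

// A consuming header announces interest before including the generated
// file.  DEMO_glibcxx_want_foo plays the role of __glibcxx_want_foo.
#define DEMO_glibcxx_want_foo

// --- hypothetical excerpt of the generated header, made-up macro "foo" ---
#if !defined(DEMO_cpp_lib_foo)
# if __cplusplus >= 201103L            // assumed precondition from the .def entry
#  define DEMO_glibcxx_foo 202306L     // internal form, defined whenever supported
#  if defined(DEMO_glibcxx_want_all) || defined(DEMO_glibcxx_want_foo)
#   define DEMO_cpp_lib_foo 202306L    // public form, defined only on request
#  endif
# endif
#endif
#undef DEMO_glibcxx_want_foo           // the request does not leak into later includes
// --- end of excerpt ---

// Library-internal code can always test the internal form.
constexpr long internal_foo_value()
{
#ifdef DEMO_glibcxx_foo
  return DEMO_glibcxx_foo;
#else
  return 0;
#endif
}

// The public form exists only because this translation unit asked for it.
constexpr bool public_foo_visible()
{
#ifdef DEMO_cpp_lib_foo
  return true;
#else
  return false;
#endif
}

// Small self-check tying the two forms together.
inline void check_foo_macros()
{
  assert(public_foo_visible());
  assert(internal_foo_value() == 202306L);
}
```

Removing the DEMO_glibcxx_want_foo line leaves the internal macro defined but hides the public one, which is the property that lets headers share one database without advertising features they did not request.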
---
 libstdc++-v3/include/Makefile.am | 10 +-
 libstdc++-v3/include/Makefile.in | 10 +-
 libstdc++-v3/include/bits/version.def | 1591
 libstdc++-v3/include/bits/version.h | 1937 +
 libstdc++-v3/include/bits/version.tpl | 209 +++
 libstdc++-v3/include/std/version | 350 +
 6 files changed, 3758 insertions(+), 349 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/version.def
 create mode 100644 libstdc++-v3/include/bits/version.h
 create mode 100644 libstdc++-v3/include/bits/version.tpl

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index a880e8ee227..a07b4c18585 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -154,6 +154,7 @@ bits_freestanding = \
         ${bits_srcdir}/stl_raw_storage_iter.h \
         ${bits_srcdir}/stl_relops.h \
         ${bits_srcdir}/stl_uninitialized.h \
+        ${bits_srcdir}/version.h \
         ${bits_srcdir}/string_view.tcc \
         ${bits_srcdir}/uniform_int_dist.h \
         ${bits_srcdir}/unique_ptr.h \
@@ -1113,7 +1114,8 @@ allcreated = \
         ${host_builddir}/c++config.h \
         ${host_builddir}/largefile-config.h \
         ${thread_host_headers} \
-        ${pch_build}
+        ${pch_build} \
+        ${bits_srcdir}/version.h
 
 # Here are the rules for building the headers
 all-local: ${allstamped} ${allcreated}
@@ -1463,6 +1465,12 @@ ${pch3_output}: ${pch3_source} ${pch2_output}
         -mkdir -p ${pch3_output_builddir}
         $(CXX) $(PCHFLAGS) $(AM_CPPFLAGS) -O2 -g ${pch3_source} -o $@
 
+# AutoGen .
+${bits_srcdir}/version.h: ${bits_srcdir}/version.def \
+                ${bits_srcdir}/version.tpl
+        cd $(@D) && \
+            autogen version.def
+
 # The real deal.
 install-data-local: install-headers
 install-headers:
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 0ff875b280b..f5b04d3fe8a 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -509,6 +509,7 @@ bits_freestanding = \
         ${bits_srcdir}/stl_raw_storage_iter.h \
         ${bits_srcdir}/stl_relops.h \
         ${bits_srcdir}/stl_uninitialized.h \
+        ${bits_srcdir}/version.h \
         ${bits_srcdir}/string_view.tcc \
         ${bits_srcdir}/uniform_int_dist.h \
         ${bits_srcdir}/unique_ptr.h \
@@ -1441,7 +1442,8 @@ allcreated = \
         ${host_builddir}/c++config.h \
         ${host_builddir}/largefile-config.h \
         ${thread_host_headers} \
-        ${pch_build}
+        ${pch_build} \
+        ${bits_srcdir}/version.h
 
 # Host includes for threads
@@ -1937,6 +1939,12 @@ ${pch3_output}: ${pch3_source} ${pch2_output}
         -mkdir -p ${pch3_output_builddir}
         $(CXX) $(PCHFLAGS) $(AM_CPPFLAGS) -O2 -g ${pch3_source} -o $@
 
+# AutoGen .
+${bits_srcdir}/version.h: ${bits_srcdir}/version.def \
+                ${bits_srcdir}/version.tpl
+        cd $(@D) && \
+            autogen version.def
+
 # The real deal.
 install-data-local: install-headers
 install-headers:
diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
new file mode 100644
index 000..afdec9acfe3
--- /dev/null
+++ b/libstdc++-v3/include/bits/version.def
@@ -0,0 +1,1591 @@
+// Feature test macro definitions -*- C++ -*-
+// Copyright (C) 2023 Free Software Foundation, Inc.
+
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT AN
[PATCH] OpenACC: Further attach/detach clause fixes for Fortran [PR109622]
This patch moves several tests introduced by the following patch:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616939.html

into the proper location for OpenACC testing (thanks to Thomas for
spotting my mistake!), and also fixes a few additional problems --
missing diagnostics for non-pointer attaches, and a case where a pointer
was incorrectly dereferenced.

Tests are also adjusted for vector-length warnings on nvidia
accelerators.

Tested with offloading to nvptx. OK?

2023-04-29  Julian Brown

        PR fortran/109622

gcc/fortran/
        * trans-openmp.cc (gfc_trans_omp_clauses): Add diagnostic for
        non-pointer/non-allocatable attach/detach.  Remove dereference for
        pointer-to-scalar derived type component attach/detach.

gcc/testsuite/
        * gfortran.dg/goacc/pr109622-5.f90: New test.

libgomp/
        * testsuite/libgomp.fortran/pr109622.f90: Move test...
        * testsuite/libgomp.oacc-fortran/pr109622.f90: ...to here.  Ignore
        vector length warning.
        * testsuite/libgomp.fortran/pr109622-2.f90: Move test...
        * testsuite/libgomp.oacc-fortran/pr109622-2.f90: ...to here.  Add
        missing copyin/copyout variable.  Ignore vector length warnings.
        * testsuite/libgomp.fortran/pr109622-3.f90: Move test...
        * testsuite/libgomp.oacc-fortran/pr109622-3.f90: ...to here.  Ignore
        vector length warnings.
        * testsuite/libgomp.oacc-fortran/pr109622-4.f90: New test.
---
 gcc/fortran/trans-openmp.cc | 38 ---
 .../gfortran.dg/goacc/pr109622-5.f90 | 45 ++
 .../pr109622-2.f90 | 7 ++-
 .../pr109622-3.f90 | 3 ++
 .../libgomp.oacc-fortran/pr109622-4.f90 | 47 +++
 .../pr109622.f90 | 3 ++
 6 files changed, 135 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/pr109622-5.f90
 rename libgomp/testsuite/{libgomp.fortran => libgomp.oacc-fortran}/pr109622-2.f90 (63%)
 rename libgomp/testsuite/{libgomp.fortran => libgomp.oacc-fortran}/pr109622-3.f90 (76%)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr109622-4.f90
 rename libgomp/testsuite/{libgomp.fortran => libgomp.oacc-fortran}/pr109622.f90 (78%)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 6ee22faa836a..b9a4ae3e53a8 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3395,6 +3395,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
                       && (n->u.map_op == OMP_MAP_ATTACH
                           || n->u.map_op == OMP_MAP_DETACH))
                     {
+                      OMP_CLAUSE_DECL (node)
+                        = build_fold_addr_expr (OMP_CLAUSE_DECL (node));
                       OMP_CLAUSE_SIZE (node) = size_zero_node;
                       goto finalize_map_clause;
                     }
@@ -3430,6 +3432,13 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
                         = TYPE_SIZE_UNIT (gfc_charlen_type_node);
                     }
                 }
+              else if (openacc
+                       && (n->u.map_op == OMP_MAP_ATTACH
+                           || n->u.map_op == OMP_MAP_DETACH))
+                gfc_error ("%qs clause argument not pointer or "
+                           "allocatable at %L",
+                           (n->u.map_op == OMP_MAP_ATTACH)
+                           ? "attach" : "detach", &where);
             }
           else if (n->expr
                    && n->expr->expr_type == EXPR_VARIABLE
@@ -3510,6 +3519,13 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
                 }
               else
                 {
+                  if (openacc
+                      && (n->u.map_op == OMP_MAP_ATTACH
+                          || n->u.map_op == OMP_MAP_DETACH))
+                    gfc_error ("%qs clause argument not pointer or "
+                               "allocatable at %L",
+                               (n->u.map_op == OMP_MAP_ATTACH)
+                               ? "attach" : "detach", &where);
                   OMP_CLAUSE_DECL (node) = inner;
                   OMP_CLAUSE_SIZE (node)
                     = TYPE_SIZE_UNIT (TREE_TYPE (inner));
@@ -3523,15 +3539,25 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
               if (n->u.map_op == OMP_MAP_ATTACH
                   || n->u.map_op == OMP_MAP_DETACH)
                 {
-                  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (inner)))
+                  if (POINTER_TYPE_P (TREE_TYPE (inner))
+                      || GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (inner)))
[PATCH V2] RISC-V: decouple stack allocation for rv32e w/o save-restore.
Currently in rv32e, stack allocation for GPR callee-saved registers is
always 12 bytes w/o save-restore. Actually, for the case without
save-restore, less stack memory can be reserved. This patch decouples
stack allocation for rv32e w/o save-restore and makes
riscv_compute_frame_info more readable.

output of testcase rv32e_stack.c
before patch:
        addi    sp,sp,-16
        sw      ra,12(sp)
        call    getInt
        sw      a0,0(sp)
        lw      a0,0(sp)
        call    PrintInts
        lw      a5,0(sp)
        mv      a0,a5
        lw      ra,12(sp)
        addi    sp,sp,16
        jr      ra

after patch:
        addi    sp,sp,-8
        sw      ra,4(sp)
        call    getInt
        sw      a0,0(sp)
        lw      a0,0(sp)
        call    PrintInts
        lw      a5,0(sp)
        mv      a0,a5
        lw      ra,4(sp)
        addi    sp,sp,8
        jr      ra

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_avoid_save_libcall):
        helper function for riscv_use_save_libcall.
        (riscv_use_save_libcall): call riscv_avoid_save_libcall.
        (riscv_compute_frame_info): restructure to decouple stack allocation
        for rv32e w/o save-restore.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rv32e_stack.c: New test.
---
 gcc/config/riscv/riscv.cc | 58
 gcc/testsuite/gcc.target/riscv/rv32e_stack.c | 14 +
 2 files changed, 50 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_stack.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5d2550871c7..8b32977e296 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4772,12 +4772,27 @@ riscv_save_reg_p (unsigned int regno)
   return false;
 }
 
+/* Return TRUE if a libcall to save/restore GPRs should be
+   avoided.  FALSE otherwise.  */
+static bool
+riscv_avoid_save_libcall (void)
+{
+  if (!TARGET_SAVE_RESTORE
+      || crtl->calls_eh_return
+      || frame_pointer_needed
+      || cfun->machine->interrupt_handler_p
+      || cfun->machine->varargs_size != 0
+      || crtl->args.pretend_args_size != 0)
+    return true;
+
+  return false;
+}
+
 /* Determine whether to call GPR save/restore routines.  */
 static bool
 riscv_use_save_libcall (const struct riscv_frame_info *frame)
 {
-  if (!TARGET_SAVE_RESTORE || crtl->calls_eh_return || frame_pointer_needed
-      || cfun->machine->interrupt_handler_p)
+  if (riscv_avoid_save_libcall ())
     return false;
 
   return frame->save_libcall_adjustment != 0;
@@ -4857,7 +4872,7 @@ riscv_compute_frame_info (void)
   struct riscv_frame_info *frame;
   poly_int64 offset;
   bool interrupt_save_prologue_temp = false;
-  unsigned int regno, i, num_x_saved = 0, num_f_saved = 0;
+  unsigned int regno, i, num_x_saved = 0, num_f_saved = 0, x_save_size = 0;
 
   frame = &cfun->machine->frame;
 
@@ -4895,24 +4910,14 @@ riscv_compute_frame_info (void)
           frame->fmask |= 1 << (regno - FP_REG_FIRST), num_f_saved++;
     }
 
-  /* At the bottom of the frame are any outgoing stack arguments.  */
-  offset = riscv_stack_align (crtl->outgoing_args_size);
-  /* Next are local stack variables.  */
-  offset += riscv_stack_align (get_frame_size ());
-  /* The virtual frame pointer points above the local variables.  */
-  frame->frame_pointer_offset = offset;
-  /* Next are the callee-saved FPRs.  */
-  if (frame->fmask)
-    offset += riscv_stack_align (num_f_saved * UNITS_PER_FP_REG);
-  frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
-  /* Next are the callee-saved GPRs.  */
   if (frame->mask)
     {
-      unsigned x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
+      x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD);
       unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask);
 
       /* Only use save/restore routines if they don't alter the stack size.  */
-      if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size)
+      if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size
+          && !riscv_avoid_save_libcall ())
         {
           /* Libcall saves/restores 3 registers at once, so we need to
              allocate 12 bytes for callee-saved register.  */
@@ -4921,9 +4926,21 @@ riscv_compute_frame_info (void)
           frame->save_libcall_adjustment = x_save_size;
         }
-
-      offset += x_save_size;
     }
+
+  /* At the bottom of the frame are any outgoing stack arguments.  */
+  offset = riscv_stack_align (crtl->outgoing_args_size);
+  /* Next are local stack variables.  */
+  offset += riscv_stack_align (get_frame_size ());
+  /* The virtual frame pointer points above the local variables.  */
+  frame->frame_pointer_offset = offset;
+  /* Next are the callee-saved FPRs.  */
+  if (frame->fmask)
+    offset += riscv_stack_align (num_f_saved * UNITS_PER_FP_REG);
+  frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
+  /* Next are the callee-saved GPRs.  */
+  if (frame->mask)
+
[PATCH v2] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
From: Pan Li

When some RVV integer compare operators act on the same vector
registers without mask, they can be simplified to VMSET.

This patch allows eq, le, leu, ge and geu to perform this kind of
simplification by adding vector bool support in relational_result of
simplify-rtx.

Given we have:
vbool1_t test_shortcut_for_riscv_vmseq_case_0(vint8m8_t v1, size_t vl)
{
  return __riscv_vmseq_vv_i8m8_b1(v1, v1, vl);
}

Before this patch:
vsetvli  zero,a2,e8,m8,ta,ma
vl8re8.v v8,0(a1)
vmseq.vv v8,v8,v8
vsetvli  a5,zero,e8,m8,ta,ma
vsm.v    v8,0(a0)
ret

After this patch:
vsetvli  zero,a2,e8,m8,ta,ma
vmset.m  v1                  <- optimized to vmset.m
vsetvli  a5,zero,e8,m8,ta,ma
vsm.v    v1,0(a0)
ret

As above, we may have one instruction eliminated and require fewer
vector registers.

gcc/ChangeLog:

        * machmode.h (VECTOR_BOOL_MODE_P): Add new predication macro.
        * simplify-rtx.cc (relational_result): Add vector bool support.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c:
        Adjust test check condition.

Signed-off-by: Pan Li
---
 gcc/machmode.h | 4
 gcc/simplify-rtx.cc | 4
 .../riscv/rvv/base/integer_compare_insn_shortcut.c | 6 +-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/machmode.h b/gcc/machmode.h
index f1865c1ef42..5fbece0042f 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -134,6 +134,10 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)
 
+/* Nonzero if MODE is a vector bool mode.  */
+#define VECTOR_BOOL_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_BOOL)
+
 /* Nonzero if MODE is a scalar integral mode.  */
 #define SCALAR_INT_MODE_P(MODE) \
   (GET_MODE_CLASS (MODE) == MODE_INT \
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index d4aeebc7a5f..12aba4c4b05 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2535,6 +2535,10 @@ relational_result (machine_mode mode, machine_mode cmp_mode, rtx res)
     {
       if (res == const0_rtx)
         return CONST0_RTX (mode);
+
+      if (VECTOR_BOOL_MODE_P (mode) && res == const1_rtx)
+        return CONSTM1_RTX (mode);
+
 #ifdef VECTOR_STORE_FLAG_VALUE
       rtx val = VECTOR_STORE_FLAG_VALUE (mode);
       if (val == NULL_RTX)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c b/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c
index 8954adad09d..1bca8467a16 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c
@@ -283,9 +283,5 @@ vbool64_t test_shortcut_for_riscv_vmsgeu_case_6(vuint8mf8_t v1, size_t vl) {
   return __riscv_vmsgeu_vv_u8mf8_b64(v1, v1, vl);
 }
 
-/* { dg-final { scan-assembler-times {vmseq\.vv\sv[0-9],\s*v[0-9],\s*v[0-9]} 7 } } */
-/* { dg-final { scan-assembler-times {vmsle\.vv\sv[0-9],\s*v[0-9],\s*v[0-9]} 7 } } */
-/* { dg-final { scan-assembler-times {vmsleu\.vv\sv[0-9],\s*v[0-9],\s*v[0-9]} 7 } } */
-/* { dg-final { scan-assembler-times {vmsge\.vv\sv[0-9],\s*v[0-9],\s*v[0-9]} 7 } } */
-/* { dg-final { scan-assembler-times {vmsgeu\.vv\sv[0-9],\s*v[0-9],\s*v[0-9]} 7 } } */
 /* { dg-final { scan-assembler-times {vmclr\.m\sv[0-9]} 35 } } */
+/* { dg-final { scan-assembler-times {vmset\.m\sv[0-9]} 35 } } */
-- 
2.34.1
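The identity this simplification relies on can be checked with a small scalar model. The sketch below is plain C++, not RVV intrinsics and not GCC code: compare_vv, all_set, self_compare_is_vmset and self_less_is_all_clear are hypothetical helpers that model an unmasked element-wise compare producing one mask bit per element. It deliberately ignores vl, tail elements and the physical width of the mask register.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models an unmasked vector-vector compare: one mask bit per element.
template <typename T, typename Pred>
std::vector<bool> compare_vv(const std::vector<T>& a, const std::vector<T>& b,
                             Pred pred)
{
  assert(a.size() == b.size());
  std::vector<bool> mask(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    mask[i] = pred(a[i], b[i]);
  return mask;
}

// True if every modelled mask bit is 1, the result a "set all" would give.
inline bool all_set(const std::vector<bool>& mask)
{
  for (bool bit : mask)
    if (!bit)
      return false;
  return true;
}

// eq, le and ge of a vector with itself hold for every element; leu and
// geu are the same predicates applied to unsigned element types.
template <typename T>
bool self_compare_is_vmset(const std::vector<T>& v)
{
  return all_set(compare_vv(v, v, [](T x, T y) { return x == y; }))
      && all_set(compare_vv(v, v, [](T x, T y) { return x <= y; }))
      && all_set(compare_vv(v, v, [](T x, T y) { return x >= y; }));
}

// The strict counterpart: lt of a vector with itself never holds.
template <typename T>
bool self_less_is_all_clear(const std::vector<T>& v)
{
  const std::vector<bool> mask =
      compare_vv(v, v, [](T x, T y) { return x < y; });
  for (bool bit : mask)
    if (bit)
      return false;
  return true;
}
```

The model only covers the bits that correspond to elements, so it says nothing about what a real mask register holds beyond them.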
RE: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
Hi Jeff,

I just gave this optimization a try in simplify_rtx in PATCH v2. Could
you please share your thoughts on it when you are free? Thank you!

https://gcc.gnu.org/pipermail/gcc-patches/2023-April/617117.html

Pan

-----Original Message-----
From: Li, Pan2
Sent: Saturday, April 29, 2023 10:55 AM
To: Jeff Law; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, Yanzhang
Subject: RE: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET

Thanks Jeff for the comments.

It makes sense to me. For the EQ operator we should have CONSTM1. Does
this mean the s390 parts have a similar issue here? Then for
instructions like VMSEQ, we need to adjust simplify_rtx up to a point.

Please correct me if I have made any mistake. Thank you again.

Pan

-----Original Message-----
From: Jeff Law
Sent: Saturday, April 29, 2023 5:48 AM
To: Li, Pan2; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, Yanzhang
Subject: Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET

On 4/28/23 09:21, Pan Li via Gcc-patches wrote:
> From: Pan Li
>
> When some RVV integer compare operators act on the same vector
> registers without mask. They can be simplified to VMSET.
>
> This PATCH allows the eq, le, leu, ge, geu to perform such kind of the
> simplification by adding one macro in riscv for simplify rtx.
>
> Given we have:
> vbool1_t test_shortcut_for_riscv_vmseq_case_0(vint8m8_t v1, size_t vl)
> {
>    return __riscv_vmseq_vv_i8m8_b1(v1, v1, vl); }
>
> Before this patch:
> vsetvli  zero,a2,e8,m8,ta,ma
> vl8re8.v v8,0(a1)
> vmseq.vv v8,v8,v8
> vsetvli  a5,zero,e8,m8,ta,ma
> vsm.v    v8,0(a0)
> ret
>
> After this patch:
> vsetvli  zero,a2,e8,m8,ta,ma
> vmset.m  v1                  <- optimized to vmset.m
> vsetvli  a5,zero,e8,m8,ta,ma
> vsm.v    v1,0(a0)
> ret
>
> As above, we may have one instruction eliminated and require less
> vector registers.
>
> Signed-off-by: Pan Li
>
> gcc/ChangeLog:
>
>       * config/riscv/riscv.h (VECTOR_STORE_FLAG_VALUE): Add new macro
>       consumed by simplify_rtx.
>
> gcc/testsuite/ChangeLog:
>
>       * gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c:
>       Adjust test check condition.

I'm not sure this is 100% correct.

What happens to the high bits in the resultant mask register? My
understanding is we have one output bit per input element in the
comparison. So unless the number of elements matches the bit width of
the mask register, this isn't going to work.

Am I missing something?

Jeff
Re: [PATCH] Turn on LRA on all targets
Segher Boessenkool wrote:
> I send this patch now so that people can start testing.
>
> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
> index 89349dae9e62..e32f17377525 100644
> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> @@ -7601,9 +7601,6 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
>  #undef TARGET_ATTRIBUTE_TABLE
>  #define TARGET_ATTRIBUTE_TABLE nvptx_attribute_table
>
> -#undef TARGET_LRA_P
> -#define TARGET_LRA_P hook_bool_void_false
> -
>  #undef TARGET_LEGITIMATE_ADDRESS_P
>  #define TARGET_LEGITIMATE_ADDRESS_P nvptx_legitimate_address_p

I've tested Segher's patch on nvptx-none with make and make -k check
and can confirm there are no new regressions. Nvptx is unique in that
it doesn't use register allocation, i.e. GCC's only
TARGET_NO_REGISTER_ALLOCATION target, so it's a little odd that it
specifies which register allocator it doesn't use.

I hope this helps,
Roger
--
Re: [PATCH] Turn on LRA on all targets
Hi!

On Mon, Apr 24, 2023 at 11:46:50AM +0200, Uros Bizjak wrote:
> On Mon, Apr 24, 2023 at 11:19 AM Segher Boessenkool
> wrote:
> > We still need someone to test this on alpha now, years later, and give
> > a final okay, but hearing this is encouraging :-)
>
> Please note that bootstrap worked on alpha*EV6*, not plain alpha.
>
> Plain alpha is !BWX architecture and uses {un,}aligned_memory_operand
> predicates that call resolve_reload_operand function. Unfortunately,
> this function peeks deep into reload internals (reg_equiv_memory_loc)
> that has no equivalent in LRA. As said in the comment, this internal
> function resolves what reload is going to do with OP if it is a
> register.

Bootstrap works with everything I tried, but building Linux fails with
a few things like

/home/segher/src/kernel/drivers/tty/serial/serial_core.c:1029:1: internal compiler error: maximum number of generated reload insns per insn achieved (90)

(it uses -mcpu=ev5 there; to reproduce just (try to) build a defconfig).

Segher
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On 4/28/23 20:55, Li, Pan2 wrote:
> Thanks Jeff for comments.
>
> It makes sense to me. For the EQ operator we should have CONSTM1.

That's not the way I interpret the RVV documentation.  Of course it's
not terribly clear.  I guess one could do some experiments with qemu or
try to dig into the sail code and figure out the intent from those.

> Does this mean s390 parts has similar issue here? Then for
> instructions like VMSEQ, we need to adjust the simplify_rtx up to a
> point.

You'd have to refer to the s390 instruction set reference to understand
precisely how the vector compares work.

But as it stands this really isn't a simplify-rtx question, but a
question of the semantics of risc-v.  What happens with the high bits
in the destination mask register is critical -- and if risc-v doesn't
set them to all ones in this case, then that would mean that defining
that macro is simply wrong for risc-v.

jeff
Re: [PATCH] Turn on LRA on all targets
On 4/29/23 07:37, Roger Sayle wrote:
> Segher Boessenkool wrote:
>> I send this patch now so that people can start testing.
>>
>> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
>> index 89349dae9e62..e32f17377525 100644
>> --- a/gcc/config/nvptx/nvptx.cc
>> +++ b/gcc/config/nvptx/nvptx.cc
>> @@ -7601,9 +7601,6 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
>>  #undef TARGET_ATTRIBUTE_TABLE
>>  #define TARGET_ATTRIBUTE_TABLE nvptx_attribute_table
>>
>> -#undef TARGET_LRA_P
>> -#define TARGET_LRA_P hook_bool_void_false
>> -
>>  #undef TARGET_LEGITIMATE_ADDRESS_P
>>  #define TARGET_LEGITIMATE_ADDRESS_P nvptx_legitimate_address_p
>
> I've tested Segher's patch on nvptx-none with make and make -k check
> and can confirm there are no new regressions. Nvptx is unique in that
> it doesn't use register allocation, i.e. GCC's only
> TARGET_NO_REGISTER_ALLOCATION target, so it's a little odd that it
> specifies which register allocator it doesn't use.
>
> I hope this helps,

It does. Consider a patch which flips the nvptx port to LRA as
pre-approved.

I tried the FRV just for fun. It faulted all over the place :(

jeff
Re: [PATCH V2] RISC-V: decouple stack allocation for rv32e w/o save-restore.
On 4/29/23 04:59, Fei Gao wrote:
> Currently in rv32e, stack allocation for GPR callee-saved registers is
> always 12 bytes w/o save-restore. Actually, for the case without
> save-restore, less stack memory can be reserved. This patch decouples
> stack allocation for rv32e w/o save-restore and makes
> riscv_compute_frame_info more readable.
>
> output of testcase rv32e_stack.c
> before patch:
>         addi    sp,sp,-16
>         sw      ra,12(sp)
>         call    getInt
>         sw      a0,0(sp)
>         lw      a0,0(sp)
>         call    PrintInts
>         lw      a5,0(sp)
>         mv      a0,a5
>         lw      ra,12(sp)
>         addi    sp,sp,16
>         jr      ra
>
> after patch:
>         addi    sp,sp,-8
>         sw      ra,4(sp)
>         call    getInt
>         sw      a0,0(sp)
>         lw      a0,0(sp)
>         call    PrintInts
>         lw      a5,0(sp)
>         mv      a0,a5
>         lw      ra,4(sp)
>         addi    sp,sp,8
>         jr      ra
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv.cc (riscv_avoid_save_libcall):
>         helper function for riscv_use_save_libcall.
>         (riscv_use_save_libcall): call riscv_avoid_save_libcall.
>         (riscv_compute_frame_info): restructure to decouple stack allocation
>         for rv32e w/o save-restore.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rv32e_stack.c: New test.

Thanks. I rewrapped the ChangeLog and pushed this to the trunk.

jeff
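The numbers in the quoted before/after output can be reproduced with a little arithmetic. The following is a minimal sketch, not GCC code: it assumes 4-byte words and a 4-byte stack alignment for rv32e, and it reads the quoted assembly as one 4-byte local plus a single saved register (ra). All names (stack_align, gpr_save_bytes, frame_bytes, ra_offset) are hypothetical.

```cpp
#include <cassert>

// Hypothetical model of the rv32e frame arithmetic; not GCC code.
// Assumptions: 4-byte words and a 4-byte stack alignment for rv32e.
constexpr unsigned kWordBytes = 4;
constexpr unsigned kStackAlignBytes = 4;

// Round a byte count up to the assumed stack alignment.
constexpr unsigned stack_align(unsigned bytes)
{
  return (bytes + kStackAlignBytes - 1) / kStackAlignBytes * kStackAlignBytes;
}

// Size of the GPR save area.  When the three-register libcall block is
// reserved, 12 bytes are used no matter how few registers are saved.
constexpr unsigned gpr_save_bytes(unsigned num_saved, bool reserve_libcall_block)
{
  return num_saved == 0 ? 0
       : reserve_libcall_block ? 3 * kWordBytes
       : stack_align(num_saved * kWordBytes);
}

// Total frame: locals at the bottom, saved GPRs above them.
constexpr unsigned frame_bytes(unsigned local_bytes, unsigned num_saved,
                               bool reserve_libcall_block)
{
  return stack_align(local_bytes)
       + gpr_save_bytes(num_saved, reserve_libcall_block);
}

// Offset from sp of the topmost saved register (ra in the quoted output).
constexpr unsigned ra_offset(unsigned local_bytes, unsigned num_saved,
                             bool reserve_libcall_block)
{
  return frame_bytes(local_bytes, num_saved, reserve_libcall_block) - kWordBytes;
}

inline void check_rv32e_example()
{
  // Before the patch: the 12-byte block is reserved even without save-restore.
  assert(frame_bytes(4, 1, true) == 16 && ra_offset(4, 1, true) == 12);
  // After the patch: only the one saved word is reserved.
  assert(frame_bytes(4, 1, false) == 8 && ra_offset(4, 1, false) == 4);
}
```

Under those assumptions the model gives a 16-byte frame with ra at 12(sp) before the change and an 8-byte frame with ra at 4(sp) after it, matching the two listings.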
[committed] [PR target/109549] Adjust mips test for recent ifcvt costing changes
MIPS ports have been failing a few tests since the change to add cost
checks in another path through the if-converter pass. As with the
other ports, these look like cases where we don't do good costing in the
MIPS port.

Someone who cares about MIPS will need to fix this properly. In the
mean time this patch adjusts the branch cost when running the two
affected tests and skips them at -Os. This is enough to verify that if
conversion can still happen if the costs are adjusted.

Committed to the trunk.

Jeff

commit ef6c3095aabe75af727a269d91d9ffa37f982ace
Author: Jeff Law
Date:   Sat Apr 29 10:16:21 2023 -0600

    Adjust mips test for recent ifcvt costing changes

    MIPS ports have been failing a few tests since the change to add cost
    checks in another path through the if-converter pass. As with the
    other ports, these look like cases where we don't do good costing in
    the MIPS port.

    Someone who cares about MIPS will need to fix this properly. In the
    mean time this patch adjusts the branch cost when running the two
    affected tests and skips them at -Os. This is enough to verify that
    if conversion can still happen if the costs are adjusted.

    gcc/testsuite
            * gcc.target/mips/mips-ps-type-2.c: Adjust branch cost to
            encourage if-conversion.  Skip for -Os.
            * gcc.target/mips/movcc-3.c: Similarly.

diff --git a/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c b/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c
index ed5d6ee1663..e5cb7d48dae 100644
--- a/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c
+++ b/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c
@@ -1,8 +1,8 @@
 /* Test v2sf calculations.  The nmadd and nmsub patterns need
    -ffinite-math-only.  */
 /* { dg-do compile } */
-/* { dg-options "(HAS_MADDPS) -mmadd4 -mgp32 -mpaired-single -ffinite-math-only forbid_cpu=octeon.*" } */
-/* { dg-skip-if "nmadd and nmsub need combine" { *-*-* } { "-O0" } { "" } } */
+/* { dg-options "(HAS_MADDPS) -mmadd4 -mgp32 -mpaired-single -ffinite-math-only forbid_cpu=octeon.* -mbranch-cost=2" } */
+/* { dg-skip-if "nmadd and nmsub need combine" { *-*-* } { "-O0" "-Os" } { "" } } */
 /* { dg-final { scan-assembler "\tcvt.ps.s\t" } } */
 /* { dg-final { scan-assembler "\tmov.ps\t" } } */
 /* { dg-final { scan-assembler "\tldc1\t" } } */
diff --git a/gcc/testsuite/gcc.target/mips/movcc-3.c b/gcc/testsuite/gcc.target/mips/movcc-3.c
index 55434b72c72..80d44098a3f 100644
--- a/gcc/testsuite/gcc.target/mips/movcc-3.c
+++ b/gcc/testsuite/gcc.target/mips/movcc-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "(HAS_MOVN) -mhard-float" } */
-/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+/* { dg-options "(HAS_MOVN) -mhard-float -mbranch-cost=2" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" "-Os" } { "" } } */
 /* { dg-final { scan-assembler "\tmovt\t" } } */
 /* { dg-final { scan-assembler "\tmovf\t" } } */
 /* { dg-final { scan-assembler "\tmovz.s\t" } } */
[xstormy16 PATCH] Recognize/support swpn (swap nibbles) instruction.
This patch adds support for xstormy16's swap nibbles instruction (swpn).
For the test case:

short foo(short x) {
  return (x&0xff00) | ((x<<4)&0xf0) | ((x>>4)&0x0f);
}

GCC with -O2 currently generates the nine instruction sequence:
foo:    mov r7,r2
        asr r2,#4
        and r2,#15
        mov.w r6,#-256
        and r6,r7
        or r2,r6
        shl r7,#4
        and r7,#255
        or r2,r7
        ret

with this patch, we now generate:
foo:    swpn r2
        ret

To achieve this using combine's four instruction "combinations" requires
a little wizardry. Firstly, define_insn_and_split are introduced to
treat logical shifts followed by bitwise-AND as macro instructions that
are split after reload. This is sufficient to recognize a QImode
nibble swap, which can be implemented by swpn followed by either a
zero-extension or a sign-extension from QImode to HImode. Then finally,
in the correct context, a QImode swap-nibbles pattern can be combined to
preserve the high-byte of a HImode word, matching the xstormy16's swpn
semantics.

The naming of the new code iterators is taken from i386.md. The
any_rotate code iterator is used in my next (split out) patch.

This patch has been tested by building a cross-compiler to xstormy16-elf
from x86_64-pc-linux-gnu and confirming the new test cases pass.
Ok for mainline?

2023-04-29  Roger Sayle

gcc/ChangeLog
        * config/stormy16/stormy16.md (any_lshift): New code iterator.
        (any_or_plus): Likewise.
        (any_rotate): Likewise.
        (*_and_internal): New define_insn_and_split to recognize
        a logical shift followed by an AND, and split it again after
        reload.
        (*swpn): New define_insn matching xstormy16's swpn.
        (*swpn_zext): New define_insn recognizing swpn followed by
        zero_extendqihi2, i.e. with the high byte set to zero.
        (*swpn_sext): Likewise, for swpn followed by cbw.
        (*swpn_sext_2): Likewise, for an alternate RTL form.
        (*swpn_zext_ior): A pre-reload splitter so that an swpn+zext+ior
        sequence is split in the correct place to recognize the *swpn_zext
        followed by any_or_plus (ior, xor or plus) instruction.

gcc/testsuite/ChangeLog * gcc.target/xstormy16/swpn-1.c: New QImode test case. * gcc.target/xstormy16/swpn-2.c: New zero_extend test case. * gcc.target/xstormy16/swpn-3.c: New sign_extend test case. * gcc.target/xstormy16/swpn-4.c: New HImode test case. Thanks in advance, Roger -- diff --git a/gcc/config/stormy16/stormy16.md b/gcc/config/stormy16/stormy16.md index b2e86ee..be1ee04 100644 --- a/gcc/config/stormy16/stormy16.md +++ b/gcc/config/stormy16/stormy16.md @@ -48,6 +48,10 @@ (CARRY_REG 16) ] ) + +(define_code_iterator any_lshift [ashift lshiftrt]) +(define_code_iterator any_or_plus [plus ior xor]) +(define_code_iterator any_rotate [rotate rotatert]) ;; ;; :: @@ -1301,3 +1323,86 @@ [(parallel [(set (match_dup 2) (match_dup 1)) (set (match_dup 1) (match_dup 2))])]) +;; Recognize shl+and and shr+and as macro instructions. +(define_insn_and_split "*_and_internal" + [(set (match_operand:HI 0 "register_operand" "=r") +(and:HI (any_lshift:HI (match_operand 1 "register_operand" "0") + (match_operand 2 "const_int_operand" "i")) + (match_operand 3 "const_int_operand" "i"))) + (clobber (reg:BI CARRY_REG))] + "IN_RANGE (INTVAL (operands[2]), 0, 15)" + "#" + "reload_completed" + [(parallel [(set (match_dup 0) (any_lshift:HI (match_dup 1) (match_dup 2))) + (clobber (reg:BI CARRY_REG))]) + (set (match_dup 0) (and:HI (match_dup 0) (match_dup 3)))]) + +;; Swap nibbles instruction +(define_insn "*swpn" + [(set (match_operand:HI 0 "register_operand" "=r") + (any_or_plus:HI + (any_or_plus:HI + (and:HI (ashift:HI (match_operand:HI 1 "register_operand" "0") + (const_int 4)) + (const_int 240)) + (and:HI (lshiftrt:HI (match_dup 1) (const_int 4)) + (const_int 15))) + (and:HI (match_dup 1) (const_int -256] + "" + "swpn %0") + +(define_insn "*swpn_zext" + [(set (match_operand:HI 0 "register_operand" "=r") + (any_or_plus:HI + (and:HI (ashift:HI (match_operand:HI 1 "register_operand" "0") +(const_int 4)) + (const_int 240)) + (and:HI (lshiftrt:HI (match_dup 1) (const_int 4)) + (const_int 
15] + "" + "swpn %0 | and %0,#255" + [(set_attr "length" "6")]) + +(define_insn "*swpn_sext" + [(set (match_operand:HI 0 "register_operand" "=r") + (sign_extend:HI + (rotate:QI (subreg:QI (match_operand:HI 1 "register_operand" "0") 0) +(const_int 4] + "" + "swpn %0 | cbw %0" + [(set_attr "length" "4")]) + +(define_insn "*swpn_sext_2" + [(set (match_operand:HI 0 "register_operand" "=r") + (sign
[xstormy16 PATCH] Efficient HImode rotate left by a single bit.
This patch contains some minor tweaks to xstormy16's machine description, most significantly providing a pattern for HImode rotate left by a single bit that requires only two instructions.

    unsigned short foo(unsigned short x) {
      return (x << 1) | (x >> 15);
    }

currently with -O2 generates:

    foo:    mov r7,r2
            shr r7,#15
            shl r2,#1
            or r2,r7
            ret

with this patch, GCC now generates:

    foo:    shl r2,#1 | adc r2,#0
            ret

Additionally neghi2 is converted to a define_insn (so that the RTL optimizers see the negation semantics), and HImode rotations by 8 bits can now be recognized and implemented using swpb.

This patch has been tested by building a cross-compiler to xstormy16-elf from x86_64-pc-linux-gnu and confirming the new test cases pass. Ok for mainline?

2023-04-29  Roger Sayle

gcc/ChangeLog
        * config/stormy16/stormy16.md (neghi2): Convert from a define_expand
        to a define_insn.
        (*rotatehi_1): New define_insn for efficient 2 insn sequence.
        (*rotatehi_8, *rotaterthi_8): New define_insn to emit a swpb.

gcc/testsuite/ChangeLog
        * gcc.target/xstormy16/neghi2.c: New test case.
        * gcc.target/xstormy16/rotatehi-1.c: Likewise.
Thanks in advance, Roger -- diff --git a/gcc/config/stormy16/stormy16.md b/gcc/config/stormy16/stormy16.md index b2e86ee..be1ee04 100644 --- a/gcc/config/stormy16/stormy16.md +++ b/gcc/config/stormy16/stormy16.md @@ -514,13 +518,13 @@ ;; Negation -(define_expand "neghi2" - [(set (match_operand:HI 0 "register_operand" "") - (not:HI (match_operand:HI 1 "register_operand" ""))) - (parallel [(set (match_dup 0) (plus:HI (match_dup 0) (const_int 1))) +(define_insn "neghi2" + [(parallel [(set (match_operand:HI 0 "register_operand" "=r") + (neg:HI (match_operand:HI 1 "register_operand" "0"))) (clobber (reg:BI CARRY_REG))])] "" - "") + "not %0 | add %0,#1" + [(set_attr "length" "4")]) ;; ;; :: @@ -554,6 +558,24 @@ (clobber (reg:BI CARRY_REG))] "" "shr %0,%2") + +;; HImode rotate left by 1 bit +(define_insn "*rotatehi_1" + [(set (match_operand:HI 0 "register_operand" "=r") + (rotate:HI (match_operand:HI 1 "register_operand" "0") + (const_int 1))) + (clobber (reg:BI CARRY_REG))] + "" + "shl %0,#1 | adc %0,#0" + [(set_attr "length" "4")]) + +;; HImode rotate left by 8 bits +(define_insn "*hi_8" + [(set (match_operand:HI 0 "register_operand" "=r") + (any_rotate:HI (match_operand:HI 1 "register_operand" "0") + (const_int 8)))] + "" + "swpb %0") ;; ;; :: diff --git a/gcc/testsuite/gcc.target/xstormy16/neghi2.c b/gcc/testsuite/gcc.target/xstormy16/neghi2.c new file mode 100644 index 000..dd3dd1e --- /dev/null +++ b/gcc/testsuite/gcc.target/xstormy16/neghi2.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +short neg(short x) +{ + return -x; +} +/* { dg-final { scan-assembler "not r2 | add r2,#1" } } */ diff --git a/gcc/testsuite/gcc.target/xstormy16/rotatehi-1.c b/gcc/testsuite/gcc.target/xstormy16/rotatehi-1.c new file mode 100644 index 000..586e7dc --- /dev/null +++ b/gcc/testsuite/gcc.target/xstormy16/rotatehi-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +unsigned short foo(unsigned short x) +{ + return (x << 1) | (x >> 15); 
+} + +/* { dg-final { scan-assembler "shl r2,#1" } } */ +/* { dg-final { scan-assembler "adc r2,#0" } } */
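The two-instruction sequence in the patch above relies on the shift leaving the old top bit in the carry flag, so that an add-with-carry of zero drops it back into bit 0. A minimal C sketch of that reasoning follows; the function names are hypothetical and the carry is modelled by hand, so this illustrates the arithmetic only and is not how the backend expresses it.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the source-level idiom from rotatehi-1.c. */
static uint16_t rotl1_ref(uint16_t x)
{
    return (uint16_t)((x << 1) | (x >> 15));
}

/* Hand-written model of "shl r,#1 | adc r,#0" (assumed semantics):
   the shift moves bit 15 into the carry flag, and adding zero with
   carry puts that bit back into bit 0. */
static uint16_t rotl1_shl_adc(uint16_t x)
{
    unsigned carry = (x >> 15) & 1u;          /* bit shifted out by shl */
    uint16_t shifted = (uint16_t)(x << 1);    /* shl r,#1 */
    return (uint16_t)(shifted + 0u + carry);  /* adc r,#0 */
}

/* Exhaustive comparison over all 16-bit inputs; returns 1 if the two
   formulations agree everywhere. */
static int rotl1_models_agree(void)
{
    for (uint32_t x = 0; x <= 0xFFFFu; x++)
        if (rotl1_ref((uint16_t)x) != rotl1_shl_adc((uint16_t)x))
            return 0;
    return 1;
}
```

Because bit 0 of the shifted value is always zero, the addition can never carry further, which is why plain addition and bitwise OR give the same result here.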
Re: [xstormy16 PATCH] Efficient HImode rotate left by a single bit.
On 4/29/23 10:25, Roger Sayle wrote:
> This patch contains some minor tweaks to xstormy16's machine description, most significantly providing a pattern for HImode rotate left by a single bit that requires only two instructions.
>
>     unsigned short foo(unsigned short x) {
>       return (x << 1) | (x >> 15);
>     }
>
> currently with -O2 generates:
>
>     foo:    mov r7,r2
>             shr r7,#15
>             shl r2,#1
>             or r2,r7
>             ret
>
> with this patch, GCC now generates:
>
>     foo:    shl r2,#1 | adc r2,#0
>             ret
>
> Additionally neghi2 is converted to a define_insn (so that the RTL optimizers see the negation semantics), and HImode rotations by 8 bits can now be recognized and implemented using swpb.
>
> This patch has been tested by building a cross-compiler to xstormy16-elf from x86_64-pc-linux-gnu and confirming the new test cases pass. Ok for mainline?
>
> 2023-04-29  Roger Sayle
>
> gcc/ChangeLog
>         * config/stormy16/stormy16.md (neghi2): Convert from a define_expand
>         to a define_insn.
>         (*rotatehi_1): New define_insn for efficient 2 insn sequence.
>         (*rotatehi_8, *rotaterthi_8): New define_insn to emit a swpb.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/xstormy16/neghi2.c: New test case.
>         * gcc.target/xstormy16/rotatehi-1.c: Likewise.

It may be the case that exposing negation as a not + add sequence was thought to potentially produce better code by exposing the component instructions. Or it may have simply been the case that nobody considered the tradeoffs.

Either way, I think the patch is fine. As is always the case, figure ~24hrs after committing we'll have test results.

jeff
Re: [xstormy16 PATCH] Recognize/support swpn (swap nibbles) instruction.
On 4/29/23 10:24, Roger Sayle wrote: This patch adds support for xstormy16's swap nibbles instruction (swpn). For the test case: short foo(short x) { return (x&0xff00) | ((x<<4)&0xf0) | ((x>>4)&0x0f); } GCC with -O2 currently generates the nine instruction sequence: foo:mov r7,r2 asr r2,#4 and r2,#15 mov.w r6,#-256 and r6,r7 or r2,r6 shl r7,#4 and r7,#255 or r2,r7 ret with this patch, we now generate: foo:swpn r2 ret To achieve this using combine's four instruction "combinations" requires a little wizardry. Firstly, define_insn_and_split are introduced to treat logical shifts followed by bitwise-AND as macro instructions that are split after reload. This is sufficient to recognize a QImode nibble swap, which can be implemented by swpn followed by either a zero-extension or a sign-extension from QImode to HImode. Then finally, in the correct context, a QImode swap-nibbles pattern can be combined to preserve the high-byte of a HImode word, matching the xstormy16's swpn semantics. The naming of the new code iterators is taken from i386.md. The any_rotate code iterator is used in my next (split out) patch. This patch has been tested by building a cross-compiler to xstormy16-elf from x86_64-pc-linux-gnu and confirming the new test cases pass. Ok for mainline? 2023-04-29 Roger Sayle gcc/ChangeLog * config/stormy16/stormy16.md (any_lshift): New code iterator. (any_or_plus): Likewise. (any_rotate): Likewise. (*_and_internal): New define_insn_and_split to recognize a logical shift followed by an AND, and split it again after reload. (*swpn): New define_insn matching xstormy16's swpn. (*swpn_zext): New define_insn recognizing swpn followed by zero_extendqihi2, i.e. with the high byte set to zero. (*swpn_sext): Likewise, for swpn followed by cbw. (*swpn_sext_2): Likewise, for an alternate RTL form. 
(*swpn_zext_ior): A pre-reload splitter so that an swpn+zext+ior sequence is split in the correct place to recognize the *swpn_zext followed by any_or_plus (ior, xor or plus) instruction. gcc/testsuite/ChangeLog * gcc.target/xstormy16/swpn-1.c: New QImode test case. * gcc.target/xstormy16/swpn-2.c: New zero_extend test case. * gcc.target/xstormy16/swpn-3.c: New sign_extend test case. * gcc.target/xstormy16/swpn-4.c: New HImode test case. Ah, bridge patterns. OK for the trunk. jeff
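The patch description quoted above says swpn preserves the high byte of a HImode word and swaps the two nibbles of the low byte, with the QImode swap being a rotate by 4. A small C sketch of that reading follows. The function names are hypothetical, unsigned types are used to sidestep sign questions, and the model is an assumption drawn from the description, not from an instruction set reference.

```c
#include <assert.h>
#include <stdint.h>

/* The source-level idiom from the test case, on an unsigned type. */
static uint16_t swap_nibbles_idiom(uint16_t x)
{
    return (uint16_t)((x & 0xff00) | ((x << 4) & 0xf0) | ((x >> 4) & 0x0f));
}

/* QImode nibble swap: an 8-bit rotate by 4. */
static uint8_t rotate8_by_4(uint8_t b)
{
    return (uint8_t)((b << 4) | (b >> 4));
}

/* Assumed model of swpn on a 16-bit register: high byte untouched,
   low byte has its nibbles swapped. */
static uint16_t swpn_model(uint16_t x)
{
    return (uint16_t)((x & 0xff00) | rotate8_by_4((uint8_t)(x & 0xff)));
}

/* Assumed model of the "swpn | and #255" form: the nibble swap followed
   by zero-extension of the low byte. */
static uint16_t swpn_zext_model(uint16_t x)
{
    return (uint16_t)(swpn_model(x) & 0x00ff);
}
```

Under these assumptions the idiom and the model agree for every 16-bit input, which is the equivalence the combined patterns are meant to let combine discover.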
Re: [PATCH] add glibc-stdint.h to vax and lm32 linux target (PR target/105525)
On 4/28/23 11:45, Mikael Pettersson via Gcc-patches wrote: PR target/105525 is a build regression for the vax and lm32 linux targets present in gcc-12/13/head, where the builds fail due to unsatisfied references to __INTPTR_TYPE__ and __UINTPTR_TYPE__, caused by these two targets failing to provide glibc-stdint.h. Fixed thusly, tested by building crosses, which now succeeds. Ok for trunk? (Note I don't have commit rights.) 2023-04-28 Mikael Pettersson PR target/105525 * config.gcc (vax-*-linux*): Add glibc-stdint.h. (lm32-*-uclinux*): Likewise. Thanks. I've pushed this to the trunk. jeff
Re: [PATCH V2] RISC-V: decouple stack allocation for rv32e w/o save-restore.
On Sat, 29 Apr 2023 08:38:06 PDT (-0700), jeffreya...@gmail.com wrote:
> On 4/29/23 04:59, Fei Gao wrote:
> > Currently in rv32e, stack allocation for GPR callee-saved registers is always 12 bytes w/o save-restore. Actually, for the case without save-restore, less stack memory can be reserved. This patch decouples stack allocation for rv32e w/o save-restore and makes riscv_compute_frame_info more readable.

Are you guys using rv32e? It's not widely tested, at least by most upstream folks. If you're actively trying to ship it then we should probably add it to the various lists of targets that get tested, as I'd bet there's a lot of oddness floating around.

> > output of testcase rv32e_stack.c
> > before patch:
> >         addi    sp,sp,-16
> >         sw      ra,12(sp)
> >         call    getInt
> >         sw      a0,0(sp)
> >         lw      a0,0(sp)
> >         call    PrintInts
> >         lw      a5,0(sp)
> >         mv      a0,a5
> >         lw      ra,12(sp)
> >         addi    sp,sp,16
> >         jr      ra
> >
> > after patch:
> >         addi    sp,sp,-8
> >         sw      ra,4(sp)
> >         call    getInt
> >         sw      a0,0(sp)
> >         lw      a0,0(sp)
> >         call    PrintInts
> >         lw      a5,0(sp)
> >         mv      a0,a5
> >         lw      ra,4(sp)
> >         addi    sp,sp,8
> >         jr      ra
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/riscv.cc (riscv_avoid_save_libcall): helper function for riscv_use_save_libcall.
> >         (riscv_use_save_libcall): call riscv_avoid_save_libcall.
> >         (riscv_compute_frame_info): restructure to decouple stack allocation for rv32e w/o save-restore.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/rv32e_stack.c: New test.
> Thanks. I rewrapped the ChangeLog and pushed this to the trunk.

Works for me, thanks for reviewing all this stuff -- we're all pretty buried ;)

> jeff
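The before/after listings above (a 16-byte frame shrinking to 8 bytes) are consistent with a fixed 12-byte save area being replaced by one sized from the registers actually saved. The sketch below is only a toy calculation under that reading. The helper names, the 4-byte word size and the 4-byte stack alignment are assumptions made for illustration, and none of it is taken from riscv_compute_frame_info.

```c
#include <assert.h>

/* Assumptions for this sketch: 4-byte GPRs and a 4-byte aligned stack. */
#define TOY_WORD_BYTES   4u
#define TOY_STACK_ALIGN  4u

/* Round n up to a multiple of a (a must be non-zero). */
static unsigned toy_align_up(unsigned n, unsigned a)
{
    return (n + a - 1u) / a * a;
}

/* "Coupled" sizing: the GPR save area is always 12 bytes, however many
   registers are actually saved. */
static unsigned toy_frame_coupled(unsigned saved_gprs, unsigned local_bytes)
{
    (void)saved_gprs; /* deliberately ignored in the coupled scheme */
    return toy_align_up(12u + local_bytes, TOY_STACK_ALIGN);
}

/* "Decoupled" sizing: the save area covers only the saved registers. */
static unsigned toy_frame_decoupled(unsigned saved_gprs, unsigned local_bytes)
{
    return toy_align_up(saved_gprs * TOY_WORD_BYTES + local_bytes,
                        TOY_STACK_ALIGN);
}
```

With one saved register (ra) and one 4-byte local, as in the listings, the toy calculation gives 16 bytes for the coupled scheme and 8 bytes for the decoupled one.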
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On Sat, Apr 29, 2023 at 8:06 AM Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> On 4/28/23 20:55, Li, Pan2 wrote:
> > Thanks Jeff for comments.
> >
> > It makes sense to me. For the EQ operator we should have CONSTM1.
> That's not the way I interpret the RVV documentation. Of course it's not terribly clear. I guess one could do some experiments with qemu or try to dig into the sail code and figure out the intent from those.
>
> > Does this mean s390 parts has similar issue here? Then for instructions like VMSEQ, we need to adjust the simplify_rtx up to a point.
> You'd have to refer to the s390 instruction set reference to understand precisely how the vector compares work.
>
> But as it stands this really isn't a simplify-rtx question, but a question of the semantics of risc-v. What happens with the high bits in the destination mask register is critical -- and if risc-v doesn't set them to all ones in this case, then that would mean that defining that macro is simply wrong for risc-v.

The relevant statement in the spec is that "the tail elements are always updated with a tail-agnostic policy". The vmset.m instruction will cause mask register bits [0, vl-1] to be set to 1; elements [vl, VLMAX-1] will either be undisturbed or set to 1, i.e., effectively unspecified.

>
> jeff
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On Sat, 29 Apr 2023 10:21:53 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> On Sat, Apr 29, 2023 at 8:06 AM Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > On 4/28/23 20:55, Li, Pan2 wrote:
> > > Thanks Jeff for comments.
> > >
> > > It makes sense to me. For the EQ operator we should have CONSTM1.
> > That's not the way I interpret the RVV documentation. Of course it's not terribly clear. I guess one could do some experiments with qemu or try to dig into the sail code and figure out the intent from those.

QEMU specifically takes advantage of the behavior Andrew is pointing out in the spec, and will soon do so more aggressively (assuming the patches Daniel just sent out get merged).

> > > Does this mean s390 parts has similar issue here? Then for instructions like VMSEQ, we need to adjust the simplify_rtx up to a point.
> > You'd have to refer to the s390 instruction set reference to understand precisely how the vector compares work.
> >
> > But as it stands this really isn't a simplify-rtx question, but a question of the semantics of risc-v. What happens with the high bits in the destination mask register is critical -- and if risc-v doesn't set them to all ones in this case, then that would mean that defining that macro is simply wrong for risc-v.
> The relevant statement in the spec is that "the tail elements are always updated with a tail-agnostic policy". The vmset.m instruction will cause mask register bits [0, vl-1] to be set to 1; elements [vl, VLMAX-1] will either be undisturbed or set to 1, i.e., effectively unspecified.
> > jeff
Re: [PATCH V2] RISC-V: decouple stack allocation for rv32e w/o save-restore.
On 4/29/23 11:00, Palmer Dabbelt wrote:
> On Sat, 29 Apr 2023 08:38:06 PDT (-0700), jeffreya...@gmail.com wrote:
> > On 4/29/23 04:59, Fei Gao wrote:
> > > Currently in rv32e, stack allocation for GPR callee-saved registers is always 12 bytes w/o save-restore. Actually, for the case without save-restore, less stack memory can be reserved. This patch decouples stack allocation for rv32e w/o save-restore and makes riscv_compute_frame_info more readable.
>
> Are you guys using rv32e? It's not widely tested, at least by most upstream folks. If you're actively trying to ship it then we should probably add it to the various lists of targets that get tested, as I'd bet there's a lot of oddness floating around.

No interest at all in rv32 at Ventana.

> > Thanks. I rewrapped the ChangeLog and pushed this to the trunk.
>
> Works for me, thanks for reviewing all this stuff -- we're all pretty buried ;)

Just standard procedure with the trunk re-opened. In the past I would have ignored anything in the risc-v space. I've traded that for ignoring x86 :-)

Jeff
Re: [PATCH V2] RISC-V: decouple stack allocation for rv32e w/o save-restore.
On Sat, 29 Apr 2023 10:44:08 PDT (-0700), jeffreya...@gmail.com wrote:
> On 4/29/23 11:00, Palmer Dabbelt wrote:
> > On Sat, 29 Apr 2023 08:38:06 PDT (-0700), jeffreya...@gmail.com wrote:
> > > On 4/29/23 04:59, Fei Gao wrote:
> > > > Currently in rv32e, stack allocation for GPR callee-saved registers is always 12 bytes w/o save-restore. Actually, for the case without save-restore, less stack memory can be reserved. This patch decouples stack allocation for rv32e w/o save-restore and makes riscv_compute_frame_info more readable.
> >
> > Are you guys using rv32e? It's not widely tested, at least by most upstream folks. If you're actively trying to ship it then we should probably add it to the various lists of targets that get tested, as I'd bet there's a lot of oddness floating around.
> No interest at all in rv32 at Ventana.

Makes sense, I was mostly wondering about the Eswin folks though.

> > > Thanks. I rewrapped the ChangeLog and pushed this to the trunk.
> >
> > Works for me, thanks for reviewing all this stuff -- we're all pretty buried ;)
> Just standard procedure with the trunk re-opened. In the past I would have ignored anything in the risc-v space. I've traded that for ignoring x86 :-)
>
> Jeff
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On 4/29/23 11:28, Palmer Dabbelt wrote:
> On Sat, 29 Apr 2023 10:21:53 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> > On Sat, Apr 29, 2023 at 8:06 AM Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > > On 4/28/23 20:55, Li, Pan2 wrote:
> > > > Thanks Jeff for comments.
> > > >
> > > > It makes sense to me. For the EQ operator we should have CONSTM1.
> > > That's not the way I interpret the RVV documentation. Of course it's not terribly clear. I guess one could do some experiments with qemu or try to dig into the sail code and figure out the intent from those.
>
> QEMU specifically takes advantage of the behavior Andrew is pointing out in the spec, and will soon do so more aggressively (assuming the patches Daniel just sent out get merged).

Yea. And taking advantage of that behavior is definitely a performance issue for QEMU. There's still work to do though. QEMU on vector code is running crazy slow.

jeff
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On Sat, 29 Apr 2023 10:46:37 PDT (-0700), jeffreya...@gmail.com wrote:
> On 4/29/23 11:28, Palmer Dabbelt wrote:
> > On Sat, 29 Apr 2023 10:21:53 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> > > On Sat, Apr 29, 2023 at 8:06 AM Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > > > On 4/28/23 20:55, Li, Pan2 wrote:
> > > > > Thanks Jeff for comments.
> > > > >
> > > > > It makes sense to me. For the EQ operator we should have CONSTM1.
> > > > That's not the way I interpret the RVV documentation. Of course it's not terribly clear. I guess one could do some experiments with qemu or try to dig into the sail code and figure out the intent from those.
> >
> > QEMU specifically takes advantage of the behavior Andrew is pointing out in the spec, and will soon do so more aggressively (assuming the patches Daniel just sent out get merged).
> Yea. And taking advantage of that behavior is definitely a performance issue for QEMU. There's still work to do though. QEMU on vector code is running crazy slow.

I guess we're kind of off the rails for a GCC patch, but that's definitely true. Across the board RVV is going to just need a lot of work, it's very different than SVE or AVX. Unfortunately QEMU performance isn't really a priority on our end, but it's great to see folks digging into it.
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On 4/29/23 11:21, Andrew Waterman wrote:
> The relevant statement in the spec is that "the tail elements are always updated with a tail-agnostic policy". The vmset.m instruction will cause mask register bits [0, vl-1] to be set to 1; elements [vl, VLMAX-1] will either be undisturbed or set to 1, i.e., effectively unspecified.

Makes sense. Just have to stitch together bits from different locations in the manual.

The net being that I don't think we can define that macro for RISC-V in the way that Pan wants, the semantics just don't line up correctly.

jeff
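The point about tail elements can be made concrete with a toy model of the quoted rule: bits [0, vl-1] become 1, and bits [vl, VLMAX-1] are either left undisturbed or set to 1. The sketch below assumes VLMAX is at most 32 so a mask fits in one word. The function names and the tail_ones switch are hypothetical, standing in for a choice the implementation is free to make, so this is an illustration of the argument rather than a model of any real hardware or of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set (n may be 0..32). */
static uint32_t toy_low_bits(unsigned n)
{
    return n >= 32u ? 0xffffffffu : ((1u << n) - 1u);
}

/* Toy model of vmset.m writing a mask register of vlmax bits.
   Body bits [0, vl-1] are set to 1.  Tail bits [vl, vlmax-1] follow the
   tail-agnostic policy: either undisturbed (tail_ones == 0) or all
   ones (tail_ones != 0). */
static uint32_t toy_vmset(uint32_t old, unsigned vl, unsigned vlmax,
                          int tail_ones)
{
    uint32_t body = toy_low_bits(vl);
    uint32_t tail = toy_low_bits(vlmax) & ~body;
    uint32_t tail_value = tail_ones ? tail : (old & tail);
    return body | tail_value;
}
```

With vl = 3 and VLMAX = 8, the two permitted outcomes from an all-zero register are 0x07 and 0xff, so code that assumes the whole register reads as all ones would be wrong in the first case.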
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On 4/29/23 11:48, Palmer Dabbelt wrote:
> > Yea. And taking advantage of that behavior is definitely a performance issue for QEMU. There's still work to do though. QEMU on vector code is running crazy slow.
>
> I guess we're kind of off the rails for a GCC patch, but that's definitely true. Across the board RVV is going to just need a lot of work, it's very different than SVE or AVX. Unfortunately QEMU performance isn't really a priority on our end, but it's great to see folks digging into it.

Well, when a user mode SPEC run goes from ~15 minutes to multiple hours for a single input workload within specint it becomes a development problem. Daniel is loosely affiliated with my group in Ventana, so I can bug him with this kind of stuff.

jeff
Re: [PATCH] reload: Handle generating reloads that also clobbers flags
On 4/18/23 08:12, Hans-Peter Nilsson wrote:
> > Date: Tue, 18 Apr 2023 07:43:41 -0600
> > From: Jeff Law
> >
> > On 2/15/23 08:34, Hans-Peter Nilsson via Gcc-patches wrote:
> > > Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS disabled, where it regresses gcc.target/cris/rld-legit1.c; as expected, because that test guards proper function of its LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of decomposed address elements. LRA also causes a similar decomposition (and worse, in even smaller bits), but it can create valid insns as-is. Unfortunately, it doesn't have something equivalent to LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for cases where that hook helped reload.
> > >
> > > I fear reload-related patches these days are treated like a redheaded stepchild and even worse as this one is intended for stage 1. Either way, I need to create a reference to it, and it's properly tested and has been a help when working towards LRA, thus might help other targets: ok to install for the next stage 1?
> > >
> > > -- >8 --
> > > When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled, this code is now required for reload to generate valid insns from some reload-decomposed addresses, for example the
> > >
> > >     (plus:SI (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
> > >              (reg/v/f:SI 33 [ y ]))
> > >
> > > generated in gcc.target/cris/rld-legit1.c (a valid address but with two registers needing reload). Now after decc0:ing, most SET insns for former cc0 targets need to be a parallel with a clobber of the flags register. Such targets typically have TARGET_FLAGS_REGNUM set to a valid register.
> > >
> > >         * reload1.cc (emit_insn_if_valid_for_reload_1): Rename from emit_insn_if_valid_for_reload.
> > >         (emit_insn_if_valid_for_reload): Call new helper, and if a SET fails to be recognized, also try emitting a parallel that clobbers TARGET_FLAGS_REGNUM, as applicable.
> > But isn't it the case that we're not supposed to be exposing the flags register until after reload? And if that's the case, then why would this be necessary? Clearly I must be missing something.
>
> That "supposed to" is only *one* possible implementation. The one in CRIS - and I believe the preferred one; one I should advocate more - is to *always* expose clobbering of the flags. (I managed to do the CRIS decc0ification transformation without loss of performance. There were much fewer issues with code taking PATTERN (insn) and failing on it being PARALLEL than I had expected, much thanks to use of rtx_single_set.)
>
> Think about it: why should the semantics of a valid insn change after a "random" pass? That's almost as crazy as the implied semantics of cc0.

Ah, yes, thanks for the reminder that there's multiple approaches here. If I cared enough it'd probably make more sense at this point to expose cc0 early on the H8 as doing so would allow easier codegen for overflow tests which in turn could significantly speed up the testsuite.

OK for the trunk.

jeff
Re: [PATCH] build: Use -nostdinc generating macro_list [PR109522]
On 4/15/23 06:01, Xi Ruoyao via Gcc-patches wrote: This prevents a spurious message building a cross-compiler when target libc is not installed yet: cc1: error: no include path in which to search for stdc-predef.h As stdc-predef.h was added to define __STDC_* macros by libc, it's unlikely the header will ever contain some bad definitions w/o "__" prefix so it should be safe. gcc/ChangeLog: PR other/109522 * Makefile.in (s-macro_list): Pass -nostdinc to $(GCC_FOR_TARGET). OK. Thanks. jeff
Re: [PATCH] Handle Windows nul device in unlink-if-ordinary.c
On 3/12/23 23:15, Himal wrote:
> On 3/12/2023 1:48 AM, Jeff Law wrote:
> > On 1/6/23 01:31, anothername27-unity--- via Gcc-patches wrote:
> > > From: Himal
> > >
> > > Hi,
> > >
> > > This might be a better fix.
> > >
> > > Regards.
> > >
> > > PS. I had to use a different email.
> > >
> > > ---
> > >  libiberty/unlink-if-ordinary.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/libiberty/unlink-if-ordinary.c b/libiberty/unlink-if-ordinary.c
> > > index 84328b216..e765ac8b1 100644
> > > --- a/libiberty/unlink-if-ordinary.c
> > > +++ b/libiberty/unlink-if-ordinary.c
> > > @@ -62,6 +62,12 @@ was made to unlink the file because it is special.
> > >  int unlink_if_ordinary (const char *name)
> > >  {
> > > +/* MS-Windows 'stat' function (and in turn, S_ISREG)
> > > +   reports the null device as a regular file.  */
> > > +#ifdef _WIN32
> > > +  if (stricmp (name, "nul") == 0)
> > > +    return 1;
> > > +#endif
>
> Hi Jeff,
>
> Thanks for the response.
>
> > Umm, wouldn't this return true for a real file called nul in the current directory? ie, don't you need to distinguish between the nul device and a file named nul based on the full path?
>
> I don't think that we can create a file called nul under Windows.
>
> > And not being a windows person, I'd really like to see some documentation which indicates that stat on the null device will indicate it's a regular file. Alternately if one of the windows experts here can chime in, it'd be appreciated.
> >
> > jeff
>
> I found these patches that might indicate the same thing.
>
> https://src.fedoraproject.org/rpms/binutils/blob/0b119dd9d51a3763db7d6fea1b51a03494cb96d8/f/binutils-CVE-2021-20197.patch#_121-135
> https://github.com/msys2/MINGW-packages/pull/10541/files
>
> I would like to see some input from a Windows developer as well.
>
> BTW, this doesn't affect anything. I stumbled upon this while debugging another [bug](https://sourceware.org/bugzilla/show_bug.cgi?id=29947). I noticed it's calling the unlink function for the nul device as well, but it wasn't throwing any errors or anything like that.

I'm inclined to go ahead and commit this. I think the only other question I have is the use of stricmp. That's not strictly ISO, strcasecmp would be preferred. But I don't know enough about the windows environment to know if they picked up strcasecmp over time.

jeff
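On the stricmp question: neither stricmp nor strcasecmp is part of ISO C (strcasecmp comes from POSIX), so one way to avoid the portability question altogether is a small hand-rolled comparison built on tolower. The sketch below is hypothetical and is not what the patch or libiberty does; the helper name is_nul_device_name is made up for illustration, and like the patch it only matches the bare name, not a path that ends in it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if NAME is exactly "nul", ignoring case, and 0 otherwise.
   Uses only ISO C facilities, so it does not depend on stricmp or
   strcasecmp being available. */
static int is_nul_device_name(const char *name)
{
    static const char nul[] = "nul";
    size_t i;

    for (i = 0; nul[i] != '\0'; i++)
        /* A shorter NAME hits its terminator here and fails the
           comparison, so we never read past the end of NAME. */
        if (tolower((unsigned char)name[i]) != nul[i])
            return 0;

    /* Reject longer names such as "null". */
    return name[i] == '\0';
}
```

A caller could then write `if (is_nul_device_name (name)) return 1;` inside the `_WIN32` guard, with the same caveat raised in the thread about a real file that happens to be called nul.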
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
On Sat, 29 Apr 2023 10:52:50 PDT (-0700), jeffreya...@gmail.com wrote:
> On 4/29/23 11:48, Palmer Dabbelt wrote:
> > > Yea. And taking advantage of that behavior is definitely a performance issue for QEMU. There's still work to do though. QEMU on vector code is running crazy slow.
> >
> > I guess we're kind of off the rails for a GCC patch, but that's definitely true. Across the board RVV is going to just need a lot of work, it's very different than SVE or AVX. Unfortunately QEMU performance isn't really a priority on our end, but it's great to see folks digging into it.
> Well, when a user mode SPEC run goes from ~15 minutes to multiple hours for a single input workload within specint it becomes a development problem. Daniel is loosely affiliated with my group in Ventana, so I can bug him with this kind of stuff.

We've got another team actually doing the mechanics of the SPEC runs, we just do the compiler. So while I guess it is a problem, it's not my problem ;)

Maybe not the best way to go about things, but there's only so much that can be done...
Re: [PATCH] RISC-V: Allow RVV VMS{Compare}(V1, V1) simplify to VMSET
Hi Jeff:

The RTL pattern already models the tail elements and vector length well, so I don't feel the first version of Pan's patch has any problem?

Input RTL pattern:

#(insn 10 7 12 2 (set (reg:VNx2BI 134 [ _1 ])
#        (if_then_else:VNx2BI (unspec:VNx2BI [
#                    (const_vector:VNx2BI repeat [
#                            (const_int 1 [0x1])
#                        ])  # all-1 mask
#                    (reg:DI 143)  # AVL reg, or vector length
#                    (const_int 2 [0x2])  # mask policy
#                    (const_int 0 [0])  # avl type
#                    (reg:SI 66 vl)
#                    (reg:SI 67 vtype)
#                ] UNSPEC_VPREDICATE)
#            (geu:VNx2BI (reg/v:VNx2QI 137 [ v1 ])
#                (reg/v:VNx2QI 137 [ v1 ]))
#            (unspec:VNx2BI [
#                    (reg:SI 0 zero)
#                ] UNSPEC_VUNDEF)))  # maskoff and tail operand
#     (expr_list:REG_DEAD (reg:DI 143)
#        (expr_list:REG_DEAD (reg/v:VNx2QI 137 [ v1 ])
#            (nil))))

And the split pattern, which only applies when the tail/maskoff operand has an undefined value:

(define_split
  [(set (match_operand:VB 0 "register_operand")
        (if_then_else:VB
          (unspec:VB
            [(match_operand:VB 1 "vector_all_trues_mask_operand")
             (match_operand 4 "vector_length_operand")
             (match_operand 5 "const_int_operand")
             (match_operand 6 "const_int_operand")
             (reg:SI VL_REGNUM)
             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
          (match_operand:VB 3 "vector_move_operand")
          (match_operand:VB 2 "vector_undef_operand")))]  # maskoff and tail operand, only match undef value

Then it turns into vmset, and also discards the mask policy operand (since an undef maskoff means don't care, IMO):

(insn 10 7 12 2 (set (reg:VNx2BI 134 [ _1 ])
        (if_then_else:VNx2BI (unspec:VNx2BI [
                    (const_vector:VNx2BI repeat [
                            (const_int 1 [0x1])
                        ])  # all-1 mask
                    (reg:DI 143)  # AVL reg, or vector length
                    (const_int 2 [0x2])  # mask policy
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:VNx2BI repeat [
                    (const_int 1 [0x1])
                ])  # all-1
            (unspec:VNx2BI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF)))  # still vundef
     (expr_list:REG_DEAD (reg:DI 143)
        (nil)))

On Sat, Apr 29, 2023 at 11:05 PM Jeff Law wrote:
>
> On 4/28/23 20:55, Li, Pan2 wrote:
> > Thanks Jeff for comments.
> >
> > It makes sense to me. For the EQ operator we should have CONSTM1.
> That's not the way I interpret the RVV documentation. Of course it's not terribly clear. I guess one could do some experiments with qemu or try to dig into the sail code and figure out the intent from those.
>
> > Does this mean s390 parts has similar issue here? Then for instructions like VMSEQ, we need to adjust the simplify_rtx up to a point.
> You'd have to refer to the s390 instruction set reference to understand precisely how the vector compares work.
>
> But as it stands this really isn't a simplify-rtx question, but a question of the semantics of risc-v. What happens with the high bits in the destination mask register is critical -- and if risc-v doesn't set them to all ones in this case, then that would mean that defining that macro is simply wrong for risc-v.
>
> jeff
Re: [PATCH V2] RISC-V: decouple stack allocation for rv32e w/o save-restore.
SiFive has tests and delivers RV32E.

On Sun, Apr 30, 2023 at 1:45 AM Palmer Dabbelt wrote:
>
> On Sat, 29 Apr 2023 10:44:08 PDT (-0700), jeffreya...@gmail.com wrote:
> >
> > On 4/29/23 11:00, Palmer Dabbelt wrote:
> > > On Sat, 29 Apr 2023 08:38:06 PDT (-0700), jeffreya...@gmail.com wrote:
> > > >
> > > > On 4/29/23 04:59, Fei Gao wrote:
> > > > > Currently in rv32e, stack allocation for GPR callee-saved registers is always 12 bytes w/o save-restore. Actually, for the case without save-restore, less stack memory can be reserved. This patch decouples stack allocation for rv32e w/o save-restore and makes riscv_compute_frame_info more readable.
> > >
> > > Are you guys using rv32e? It's not widely tested, at least by most upstream folks. If you're actively trying to ship it then we should probably add it to the various lists of targets that get tested, as I'd bet there's a lot of oddness floating around.
> > No interest at all in rv32 at Ventana.
>
> Makes sense, I was mostly wondering about the Eswin folks though.
>
> > > > Thanks. I rewrapped the ChangeLog and pushed this to the trunk.
> > >
> > > Works for me, thanks for reviewing all this stuff -- we're all pretty buried ;)
> > Just standard procedure with the trunk re-opened. In the past I would have ignored anything in the risc-v space. I've traded that for ignoring x86 :-)
> >
> > Jeff
[PATCH] c++: Report invalid id-expression in decltype [PR100482]
This patch ensures that any errors raised by finish_id_expression when
parsing a decltype expression are properly reported, rather than
potentially going ignored and causing invalid code to be accepted.

We can also now remove the separate check for templates without args as
this is also checked for in finish_id_expression.

        PR 100482

gcc/cp/ChangeLog:

        * parser.cc (cp_parser_decltype_expr): Report errors raised by
        finish_id_expression.

gcc/testsuite/ChangeLog:

        * g++.dg/pr100482.C: New test.

Signed-off-by: Nathaniel Shead
---
 gcc/cp/parser.cc                | 22 +++---
 gcc/testsuite/g++.dg/pr100482.C | 11 +++
 2 files changed, 22 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr100482.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e5f032f2330..20ebcdc3cfd 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -16508,10 +16508,6 @@ cp_parser_decltype_expr (cp_parser *parser,
         expr = cp_parser_lookup_name_simple (parser, expr,
                                              id_expr_start_token->location);
 
-      if (expr && TREE_CODE (expr) == TEMPLATE_DECL)
-        /* A template without args is not a complete id-expression.  */
-        expr = error_mark_node;
-
       if (expr
           && expr != error_mark_node
           && TREE_CODE (expr) != TYPE_DECL
@@ -16532,13 +16528,17 @@ cp_parser_decltype_expr (cp_parser *parser,
                    &error_msg,
                    id_expr_start_token->location));
 
-          if (expr == error_mark_node)
-            /* We found an id-expression, but it was something that we
-               should not have found.  This is an error, not something
-               we can recover from, so note that we found an
-               id-expression and we'll recover as gracefully as
-               possible.  */
-            id_expression_or_member_access_p = true;
+          if (error_msg)
+            {
+              /* We found an id-expression, but it was something that we
+                 should not have found.  This is an error, not something
+                 we can recover from, so report the error we found and
+                 we'll recover as gracefully as possible.  */
+              cp_parser_parse_definitely (parser);
+              cp_parser_error (parser, error_msg);
+              id_expression_or_member_access_p = true;
+              return error_mark_node;
+            }
         }
 
       if (expr
diff --git a/gcc/testsuite/g++.dg/pr100482.C b/gcc/testsuite/g++.dg/pr100482.C
new file mode 100644
index 000..dcf6722fda5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr100482.C
@@ -0,0 +1,11 @@
+// { dg-do compile { target c++11 } }
+
+namespace N {}
+decltype(std) x; // { dg-error "expected primary-expression" }
+
+struct S {};
+decltype(S) y; // { dg-error "argument to .decltype. must be an expression" }
+
+template
+struct U {};
+decltype(U) z; // { dg-error "missing template arguments" }
-- 
2.40.0
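
For contrast with the rejected cases in the new test, the following C++11 sketch shows decltype operands that are genuine expressions and therefore must keep compiling, with the three invalid forms left as comments. The declarations (N::value, S::member, U<T>::item, and the variables x, y, z) are made up for this illustration and are not part of the patch.

```cpp
#include <type_traits>

namespace N { int value = 0; }
struct S { int member; };
template <typename T> struct U { T item; };

// Accepted: each operand is an expression.
decltype(N::value) x = 1;         // id-expression naming a variable: int
decltype(S{}) y{};                // prvalue of class type: S
decltype(U<int>{}.item) z = 2;    // class member access: int

static_assert(std::is_same<decltype(x), int>::value, "x is int");
static_assert(std::is_same<decltype(y), S>::value, "y is S");
static_assert(std::is_same<decltype(z), int>::value, "z is int");

// Rejected: each operand names something that is not an expression.
// decltype(N) a;   // N is a namespace
// decltype(S) b;   // S is a type
// decltype(U) c;   // U is a template without arguments
```

Uncommenting any of the last three declarations should produce a diagnostic; the exact wording depends on the compiler version.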
Re: [PATCH] apply debug-remap to file names in .su files
On 2/13/23 12:27, Rasmus Villemoes wrote:
> The .su files generated with -fstack-usage are arguably debug info. In
> order to make builds more reproducible, apply the same remapping logic
> to the recorded file names as for when producing the debug info
> embedded in the object files.
>
> To this end, teach print_decl_identifier() a new
> PRINT_DECL_REMAP_DEBUG flag and use that from output_stack_usage_1().
>
> gcc/ChangeLog:
>
>         * print-tree.h (PRINT_DECL_REMAP_DEBUG): New flag.
>         * print-tree.cc (print_decl_identifier): Implement it.
>         * toplev.cc (output_stack_usage_1): Use it.

OK for the trunk.

jeff
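
To make the remapping idea concrete, here is a small C++ sketch of prefix-based file name rewriting of the kind the patch description refers to. It is an illustration only and not GCC's implementation: remap_prefix, remap_prefix_example, the map representation, and the sample paths are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not GCC's API): rewrite the leading part of `path`
// using the first (old_prefix, new_prefix) pair whose old_prefix matches.
inline std::string remap_prefix(
    const std::string &path,
    const std::vector<std::pair<std::string, std::string>> &maps)
{
  for (const auto &m : maps)
    if (path.compare(0, m.first.size(), m.first) == 0)
      return m.second + path.substr(m.first.size());
  return path;  // no mapping applies, keep the name unchanged
}

// Example: a build-directory prefix is replaced, so the recorded name no
// longer depends on where the build happened.
inline void remap_prefix_example()
{
  std::vector<std::pair<std::string, std::string>> maps;
  maps.emplace_back("/home/user/build", ".");
  assert(remap_prefix("/home/user/build/src/foo.c", maps) == "./src/foo.c");
  assert(remap_prefix("/usr/include/stdio.h", maps) == "/usr/include/stdio.h");
}
```

Applying one such function to every place a file name is recorded is what keeps the debug info and the .su output consistent with each other.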