[PATCH] PHIOPT: Value-replacement check undef
While moving value replacement part of PHIOPT over to use match-and-simplify, I ran into the case where we would have an undef use that was conditional become unconditional. This prevents that. I can't remember at this point what the testcase was though. Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * tree-ssa-phiopt.cc (value_replacement): Reject undef variables so they don't become unconditional used. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-phiopt.cc | 7 +++ 1 file changed, 7 insertions(+) diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc index a2bdcb5eae8..f166c3132cb 100644 --- a/gcc/tree-ssa-phiopt.cc +++ b/gcc/tree-ssa-phiopt.cc @@ -1146,6 +1146,13 @@ value_replacement (basic_block cond_bb, basic_block middle_bb, if (code != NE_EXPR && code != EQ_EXPR) return 0; + /* Do not make conditional undefs unconditional. */ + if ((TREE_CODE (arg0) == SSA_NAME + && ssa_name_maybe_undef_p (arg0)) + || (TREE_CODE (arg1) == SSA_NAME + && ssa_name_maybe_undef_p (arg1))) +return false; + /* If the type says honor signed zeros we cannot do this optimization. */ if (HONOR_SIGNED_ZEROS (arg1)) -- 2.43.0
Re: [PATCH v1] RISC-V: Fix ICE for legitimize move on subreg const_poly_move
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 0519e0679ed..bad23ea487f 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -2786,6 +2786,44 @@ riscv_v_adjust_scalable_frame (rtx target, poly_int64 > offset, bool epilogue) >REG_NOTES (insn) = dwarf; > } > +/* Take care below subreg const_poly_int move: > + > + 1. (set (subreg:DI (reg:TI 237) 8) > +(subreg:DI (const_poly_int:TI [4, 2]) 8)) > + => > + (set (subreg:DI (reg:TI 237) 8) > +(const_int 0)) */ > + > +static bool > +riscv_legitimize_subreg_const_poly_move (machine_mode mode, rtx dest, rtx > src) > +{ > + gcc_assert (SUBREG_P (src) && CONST_POLY_INT_P (SUBREG_REG (src))); > + gcc_assert (SUBREG_BYTE (src).is_constant ()); > + > + int byte_offset = SUBREG_BYTE (src).to_constant (); > + rtx const_poly = SUBREG_REG (src); > + machine_mode subreg_mode = GET_MODE (const_poly); > + > + if (subreg_mode != TImode) /* Only TImode is needed for now. */ > +return false; > + > + if (byte_offset == 8) > +{ /* The const_poly_int cannot exceed int64, just set zero here. */ { /* The const_poly_int cannot exceed int64, just set zero here. */ New line for the comment. > + emit_move_insn (dest, CONST0_RTX (mode)); > + return true; > +} > + > + /* The below transform will be covered in somewhere else. > + Thus, ignore this here. > + 1. (set (subreg:DI (reg:TI 237) 0) > +(subreg:DI (const_poly_int:TI [4, 2]) 0)) > + => > + (set (subreg:DI (reg:TI 237) 0) > +(const_poly_int:DI [4, 2])) */ > + > + return false; > +} > + > /* If (set DEST SRC) is not a valid move instruction, emit an equivalent > sequence that is valid. */ > @@ -2839,6 +2877,11 @@ riscv_legitimize_move (machine_mode mode, rtx dest, > rtx src) > } >return true; > } > + > + if (SUBREG_P (src) && CONST_POLY_INT_P (SUBREG_REG (src)) > +&& riscv_legitimize_subreg_const_poly_move (mode, dest, src)) > +return true; > + >/* Expand > (set (reg:DI target) (subreg:DI (reg:V8QI reg) 0)) > Expand this data movement instead of simply forbid it since > -- > 2.34.1 > >
Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128
On Apr 28, 2024, "Kewen.Lin" wrote: > OK, from this perspective IMHO it seems more clear to adopt xfail > with effective target long_double_64bit? *nod*, yeah, that makes sense. I'm going to travel this week, to speak at FSF's LibrePlanet conference, so I'll look into massaging the patch into that when I get back, if you haven't rendered it obsolete by then ;-) Thanks, -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
On Apr 28, 2024, "Kewen.Lin" wrote: > Nit: Maybe add a prefix "testsuite: ". ACK >> >> From: Kewen Lin > Thanks, you can just drop this. :) I've turned it into Co-Authored-By, since you insist. But unfortunately with the patch it still fails when testing for -mcpu=power7 on ppc64le-linux-gnu: it does vectorize the loop with 13 iterations. We need 16 iterations, as in an earlier version of this test, for it to pass for -mcpu=power7, but then it doesn't pass for -mcpu=power6. It looks like we're going to have to adjust the expectations. -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model
Hi, Han. GCC 14 is branch out. You can commit it to trunk (GCC 15). juzhe.zh...@rivai.ai From: demin.han Date: 2024-04-02 16:30 To: gcc-patches CC: juzhe.zhong; kito.cheng; pan2.li; jeffreyalaw; rdapp.gcc Subject: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model The adjacent_dr_p is sufficient and unnecessary condition for contiguous access. So unnecessary live-ranges are added and result in smaller LMUL. This patch uses MEMORY_ACCESS_TYPE as condition and constrains segment load/store. Tested on RV64 and no regression. PR target/114506 gcc/ChangeLog: * config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): Rename (need_additional_vector_vars_p): Rename and refine condition gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test. Signed-off-by: demin.han --- V2 changes: 1. remove max_point issue 2. minor change in commit message gcc/config/riscv/riscv-vector-costs.cc| 23 --- .../vect/costmodel/riscv/rvv/pr114506.c | 23 +++ 2 files changed, 38 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index f462c272a6e..484196b15b4 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -563,14 +563,24 @@ get_store_value (gimple *stmt) return gimple_assign_rhs1 (stmt); } -/* Return true if it is non-contiguous load/store. */ +/* Return true if addtional vector vars needed. */ static bool -non_contiguous_memory_access_p (stmt_vec_info stmt_info) +need_additional_vector_vars_p (stmt_vec_info stmt_info) { enum stmt_vec_info_type type = STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info)); - return ((type == load_vec_info_type || type == store_vec_info_type) - && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info))); + if (type == load_vec_info_type || type == store_vec_info_type) +{ + if (STMT_VINFO_GATHER_SCATTER_P (stmt_info) + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER) + return true; + + machine_mode mode = TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info)); + int lmul = riscv_get_v_regno_alignment (mode); + if (DR_GROUP_SIZE (stmt_info) * lmul > RVV_M8) + return true; +} + return false; } /* Return the LMUL of the current analysis. */ @@ -739,10 +749,7 @@ update_local_live_ranges ( stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si)); enum stmt_vec_info_type type = STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info)); - if (non_contiguous_memory_access_p (stmt_info) - /* LOAD_LANES/STORE_LANES doesn't need a perm indice. */ - && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) -!= VMAT_LOAD_STORE_LANES) + if (need_additional_vector_vars_p (stmt_info)) { /* For non-adjacent load/store STMT, we will potentially convert it into: diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c new file mode 100644 index 000..a88d24b2d2d --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -mrvv-max-lmul=dynamic -fdump-tree-vect-details" } */ + +float a[32000], b[32000], c[32000], d[32000]; +float aa[256][256], bb[256][256], cc[256][256]; + +void +s2275 () +{ + for (int i = 0; i < 256; i++) +{ + for (int j = 0; j < 256; j++) + { + aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i]; + } + a[i] = b[i] + c[i] * d[i]; +} +} + +/* { dg-final { scan-assembler-times {e32,m8} 1 } } */ +/* { dg-final { scan-assembler-not {e32,m4} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ +/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it has unexpected spills" "vect" } } */ -- 2.44.0
Re: [PATCH] RISC-V: Fix parsing of Zic* extensions
OK for trunk, and my understanding is that flag isn't really used in code gen yet, so it's not necessary to port to GCC 14 branch? On Mon, Apr 29, 2024 at 7:05 AM Christoph Müllner wrote: > > The extension parsing table entries for a range of Zic* extensions > does not match the mask definition in riscv.opt. > This results in broken TARGET_ZIC* macros, because the values of > riscv_zi_subext and riscv_zicmo_subext are set wrong. > > This patch fixes this by moving Zic64b into riscv_zicmo_subext > and all other affected Zic* extensions to riscv_zi_subext. > > gcc/ChangeLog: > > * common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif, > zicclsm, and ziccrse into riscv_zi_subext. > * config/riscv/riscv.opt: Define MASK_ZIC64B for > riscv_ziccmo_subext. > > Signed-off-by: Christoph Müllner > --- > gcc/common/config/riscv/riscv-common.cc | 8 > gcc/config/riscv/riscv.opt | 4 ++-- > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/gcc/common/config/riscv/riscv-common.cc > b/gcc/common/config/riscv/riscv-common.cc > index 43b7549e3ec..8cc0e727737 100644 > --- a/gcc/common/config/riscv/riscv-common.cc > +++ b/gcc/common/config/riscv/riscv-common.cc > @@ -1638,15 +1638,15 @@ static const riscv_ext_flag_table_t > riscv_ext_flag_table[] = > >{"zihintntl", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTNTL}, >{"zihintpause", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTPAUSE}, > + {"ziccamoa", &gcc_options::x_riscv_zi_subext, MASK_ZICCAMOA}, > + {"ziccif", &gcc_options::x_riscv_zi_subext, MASK_ZICCIF}, > + {"zicclsm", &gcc_options::x_riscv_zi_subext, MASK_ZICCLSM}, > + {"ziccrse", &gcc_options::x_riscv_zi_subext, MASK_ZICCRSE}, > >{"zicboz", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOZ}, >{"zicbom", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOM}, >{"zicbop", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOP}, >{"zic64b", &gcc_options::x_riscv_zicmo_subext, MASK_ZIC64B}, > - {"ziccamoa", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCAMOA}, > - {"ziccif", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCIF}, > - {"zicclsm", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCLSM}, > - {"ziccrse", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCRSE}, > >{"zve32x", &gcc_options::x_target_flags, MASK_VECTOR}, >{"zve32f", &gcc_options::x_target_flags, MASK_VECTOR}, > diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt > index b14888e9816..ee824756381 100644 > --- a/gcc/config/riscv/riscv.opt > +++ b/gcc/config/riscv/riscv.opt > @@ -237,8 +237,6 @@ Mask(ZIHINTPAUSE) Var(riscv_zi_subext) > > Mask(ZICOND) Var(riscv_zi_subext) > > -Mask(ZIC64B) Var(riscv_zi_subext) > - > Mask(ZICCAMOA)Var(riscv_zi_subext) > > Mask(ZICCIF) Var(riscv_zi_subext) > @@ -390,6 +388,8 @@ Mask(ZICBOM) Var(riscv_zicmo_subext) > > Mask(ZICBOP) Var(riscv_zicmo_subext) > > +Mask(ZIC64B) Var(riscv_zicmo_subext) > + > TargetVariable > int riscv_zf_subext > > -- > 2.44.0 >
Re: [PATCH v3] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6
I will apply this patch. While we still have a problem about ``` float max(float a, float b) { return a>=b?a:b; } ``` If it is compiled with `-ffinite-math-only -fsigned-zeros -O2 -mips32r6 -mabi=32`, `max.s` can be used. The max.fmt/min.fmt of MIPSr6 can process +0/-0 correctly.
Re: [PATCH v2] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6
Xi Ruoyao 于2024年3月26日周二 18:10写道: > > On Tue, 2024-03-26 at 11:15 +0800, YunQiang Su wrote: > > /* snip */ > > > With -ffinite-math-only -fno-signed-zeros, it does work with > > x >= y ? x : y > > while without `-ffinite-math-only -fno-signed-zeros`, it cannot. > > @Xi Ruoyao Is it expected by IEEE? > > When y is (quiet) NaN and x is not, fmax(x, y) should produce x but x >= > y ? x : y should produce y. Thus -ffinite-math-only is needed. > > When x is +0.0 and y is -0.0, x >= y ? x : y should produce +0.0 but > fmax(x, y) may produce +0.0 or -0.0 (IEEE allows both and I don't see a > more strict requirement in MIPS 6.06 manual either). Thus -fno-signed- > zeros is needed. > Yes, MIPS 6.06 requires `max.f Y,+0,-0` produce +0. There is a table after the description of max.fmt instruction, aka Table 4.1 Special Cases for FP MAX, MIN, MAXA, MINA. > -- > Xi Ruoyao > School of Aerospace Science and Technology, Xidian University
Re: [PATCH] config-ml.in: Fix multi-os-dir search
Jeff Law 于2024年1月3日周三 01:00写道: > > > > On 1/1/24 09:48, YunQiang Su wrote: > > When building multilib libraries, CC/CXX etc are set with an option > > -B*/lib/, instead of -B/lib/. > > This will make some trouble in some case, for example building > > cross toolchain based on Debian's cross packages: > > > >If we have libc6-dev-i386-amd64-cross packages installed on > >a non-x86 machine. This package will have the files in > >/usr/x86_4-linux-gnu/lib32. The fellow configure will fail > >when build libgcc for i386, with complains the libc is not > >i386 ones: > > ../configure --enable-multilib --enable-multilib \ > > --target=x86_64-linux-gnu > > > > Let's insert a "-B*/lib/`CC ${flags} --print-multi-os-directory`" > > before "-B*/lib/". > > > > This patch is based on the patch used by Debian now. > > > > ChangeLog > > > > * config-ml.in: Insert an -B option with multi-os-dir into > > compiler commands used to build libraries. > I would prefer this to wait for gcc-15. I'll go ahead and ACK it for > gcc-15 though. > I noticed that the gcc-14 branch has been created, and the basever has also been 15.0 now. Is it time for this patch now? > What would also be valuable would be to extract out the rest of the > multiarch patches from the Debian patches and get those into into GCC > proper. > > Jeff
[PATCH] Silence two instances of -Wcalloc-transposed-args
Signed-off-by: Peter Damianov --- Fixes these warnings: ../../gcc/gcc/../libgcc/libgcov-util.c: In function 'void tag_counters(unsigned int, int)': ../../gcc/gcc/../libgcc/libgcov-util.c:214:59: warning: 'void* calloc(size_t, size_t)' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Wcalloc-transposed-args] 214 | k_ctrs[tag_ix].values = values = (gcov_type *) xcalloc (sizeof (gcov_type), | ^~ ../../gcc/gcc/../libgcc/libgcov-util.c:214:59: note: earlier argument should specify number of elements, later size of each element ../../gcc/gcc/../libgcc/libgcov-util.c: In function 'void topn_to_memory_representation(gcov_ctr_info*)': ../../gcc/gcc/../libgcc/libgcov-util.c:529:43: warning: 'void* calloc(size_t, size_t)' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Wcalloc-transposed-args] 529 | = (struct gcov_kvp *)xcalloc (sizeof (struct gcov_kvp), n); | ^~~~ ../../gcc/gcc/../libgcc/libgcov-util.c:529:43: note: earlier argument should specify number of elements, later size of each element I think this can be applied as obvious. libgcc/libgcov-util.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/libgcc/libgcov-util.c b/libgcc/libgcov-util.c index ba4b90a480d..f443408c4ab 100644 --- a/libgcc/libgcov-util.c +++ b/libgcc/libgcov-util.c @@ -211,8 +211,8 @@ tag_counters (unsigned tag, int length) gcc_assert (k_ctrs[tag_ix].num == 0); k_ctrs[tag_ix].num = n_counts; - k_ctrs[tag_ix].values = values = (gcov_type *) xcalloc (sizeof (gcov_type), - n_counts); + k_ctrs[tag_ix].values = values = (gcov_type *) xcalloc (n_counts, + sizeof (gcov_type)); gcc_assert (values); if (length > 0) @@ -526,7 +526,7 @@ topn_to_memory_representation (struct gcov_ctr_info *info) if (n > 0) { struct gcov_kvp *tuples - = (struct gcov_kvp *)xcalloc (sizeof (struct gcov_kvp), n); + = (struct gcov_kvp *)xcalloc (n, sizeof (struct gcov_kvp)); for (unsigned i = 0; i < n - 1; i++) tuples[i].next = &tuples[i + 1]; for (unsigned i = 0; i < n; i++) -- 2.39.2
Re: [PATCH v3 1/2] Driver: Add new -truncate option
29 Apr 2024 12:16:26 am Peter Damianov : This commit adds a new option to the driver that truncates one file after linking. Tested likeso: $ gcc hello.c -c $ du -h hello.o 4.0K hello.o $ gcc hello.o -truncate hello.o $ ./a.out Hello world $ du -h hello.o $ 0 hello.o $ gcc hello.o -truncate gcc: error: missing filename after '-truncate' The motivation for adding this is PR110710. It is used by lto-wrapper to truncate files in a shell-independent manner. Signed-off-by: Peter Damianov --- gcc/common.opt | 6 ++ gcc/gcc.cc | 14 ++ 2 files changed, 20 insertions(+) diff --git a/gcc/common.opt b/gcc/common.opt index ad348844775..40cab3cb36a 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -422,6 +422,12 @@ Display target specific command line options (including assembler and linker opt -time Driver Alias(time) +;; Truncate the file specified after linking. +;; This option is used by lto-wrapper to reduce the peak disk-usage when +;; linking with many .LTRANS units. +truncate +Driver Separate Undocumented MissingArgError(missing filename after %qs) + -verbose Driver Alias(v) diff --git a/gcc/gcc.cc b/gcc/gcc.cc index 728332b8153..830a4700a87 100644 --- a/gcc/gcc.cc +++ b/gcc/gcc.cc @@ -2138,6 +2138,10 @@ static int have_E = 0; /* Pointer to output file name passed in with -o. */ static const char *output_file = 0; +/* Pointer to input file name passed in with -truncate. + This file should be truncated after linking. */ +static const char *totruncate_file = 0; + /* This is the list of suffixes and codes (%g/%u/%U/%j) and the associated temp file. If the HOST_BIT_BUCKET is used for %j, no entry is made for it here. */ @@ -4538,6 +4542,11 @@ driver_handle_option (struct gcc_options *opts, do_save = false; break; + case OPT_truncate: + totruncate_file = arg; + do_save = false; + break; + case OPT: /* "-###" This is similar to -v except that there is no execution @@ -9286,6 +9295,11 @@ driver::final_actions () const delete_failure_queue (); delete_temp_files (); + if (totruncate_file != NULL && !seen_error ()) + /* Truncate file specified by -truncate. + Used by lto-wrapper to reduce temporary disk-space usage. */ + truncate(totruncate_file, 0); + if (print_help_list) { printf (("\nFor bug reporting instructions, please see:\n")); -- 2.39.2 I resubmitted the patch because the previous one had a mistake. It didn't set "do_save" to false, so it resulted in problems like this: ./gcc/xgcc -truncate xgcc: error: missing filename after ‘-truncate’ xgcc: fatal error: no input files ./gcc/xgcc -truncate ?? xgcc: error: unrecognized command-line option ‘-truncate’ xgcc: fatal error: no input files Therefore regressing some tests, and not working properly. After fixing this, I ran all of the LTO tests again and observed no failures. I'm not sure how I ever observed it working before, but I'm reasonably confident this is correct now.
[PATCH v3 2/2] lto-wrapper: Truncate files using -truncate driver option [PR110710]
This commit changes the Makefiles generated by lto-wrapper to no longer use the "mv" and "touch" shell commands. These don't exist on Windows, so when the Makefile attempts to call them, it results in errors like: The system cannot find the file specified. This problem only manifested when calling gcc from cmd.exe, and having no sh.exe present on the PATH. The Windows port of GNU Make searches the PATH for an sh.exe, and uses it if present. I have tested this in environments with and without sh.exe on the PATH and confirmed it works as expected. Signed-off-by: Peter Damianov --- gcc/lto-wrapper.cc | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc index 02579951569..cfded757f26 100644 --- a/gcc/lto-wrapper.cc +++ b/gcc/lto-wrapper.cc @@ -2023,14 +2023,12 @@ cont: fprintf (mstream, "%s:\n\t@%s ", output_name, new_argv[0]); for (j = 1; new_argv[j] != NULL; ++j) fprintf (mstream, " '%s'", new_argv[j]); - fprintf (mstream, "\n"); /* If we are not preserving the ltrans input files then truncate them as soon as we have processed it. This reduces temporary disk-space usage. */ if (! save_temps) - fprintf (mstream, "\t@-touch -r \"%s\" \"%s.tem\" > /dev/null " -"2>&1 && mv \"%s.tem\" \"%s\"\n", -input_name, input_name, input_name, input_name); + fprintf (mstream, " -truncate '%s'", input_name); + fprintf (mstream, "\n"); } else { -- 2.39.2
[PATCH v3 1/2] Driver: Add new -truncate option
This commit adds a new option to the driver that truncates one file after linking. Tested likeso: $ gcc hello.c -c $ du -h hello.o 4.0K hello.o $ gcc hello.o -truncate hello.o $ ./a.out Hello world $ du -h hello.o $ 0 hello.o $ gcc hello.o -truncate gcc: error: missing filename after '-truncate' The motivation for adding this is PR110710. It is used by lto-wrapper to truncate files in a shell-independent manner. Signed-off-by: Peter Damianov --- gcc/common.opt | 6 ++ gcc/gcc.cc | 14 ++ 2 files changed, 20 insertions(+) diff --git a/gcc/common.opt b/gcc/common.opt index ad348844775..40cab3cb36a 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -422,6 +422,12 @@ Display target specific command line options (including assembler and linker opt -time Driver Alias(time) +;; Truncate the file specified after linking. +;; This option is used by lto-wrapper to reduce the peak disk-usage when +;; linking with many .LTRANS units. +truncate +Driver Separate Undocumented MissingArgError(missing filename after %qs) + -verbose Driver Alias(v) diff --git a/gcc/gcc.cc b/gcc/gcc.cc index 728332b8153..830a4700a87 100644 --- a/gcc/gcc.cc +++ b/gcc/gcc.cc @@ -2138,6 +2138,10 @@ static int have_E = 0; /* Pointer to output file name passed in with -o. */ static const char *output_file = 0; +/* Pointer to input file name passed in with -truncate. + This file should be truncated after linking. */ +static const char *totruncate_file = 0; + /* This is the list of suffixes and codes (%g/%u/%U/%j) and the associated temp file. If the HOST_BIT_BUCKET is used for %j, no entry is made for it here. */ @@ -4538,6 +4542,11 @@ driver_handle_option (struct gcc_options *opts, do_save = false; break; +case OPT_truncate: + totruncate_file = arg; + do_save = false; + break; + case OPT: /* "-###" This is similar to -v except that there is no execution @@ -9286,6 +9295,11 @@ driver::final_actions () const delete_failure_queue (); delete_temp_files (); + if (totruncate_file != NULL && !seen_error ()) +/* Truncate file specified by -truncate. + Used by lto-wrapper to reduce temporary disk-space usage. */ +truncate(totruncate_file, 0); + if (print_help_list) { printf (("\nFor bug reporting instructions, please see:\n")); -- 2.39.2
[PATCH] RISC-V: Fix parsing of Zic* extensions
The extension parsing table entries for a range of Zic* extensions does not match the mask definition in riscv.opt. This results in broken TARGET_ZIC* macros, because the values of riscv_zi_subext and riscv_zicmo_subext are set wrong. This patch fixes this by moving Zic64b into riscv_zicmo_subext and all other affected Zic* extensions to riscv_zi_subext. gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif, zicclsm, and ziccrse into riscv_zi_subext. * config/riscv/riscv.opt: Define MASK_ZIC64B for riscv_ziccmo_subext. Signed-off-by: Christoph Müllner --- gcc/common/config/riscv/riscv-common.cc | 8 gcc/config/riscv/riscv.opt | 4 ++-- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 43b7549e3ec..8cc0e727737 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -1638,15 +1638,15 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] = {"zihintntl", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTNTL}, {"zihintpause", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTPAUSE}, + {"ziccamoa", &gcc_options::x_riscv_zi_subext, MASK_ZICCAMOA}, + {"ziccif", &gcc_options::x_riscv_zi_subext, MASK_ZICCIF}, + {"zicclsm", &gcc_options::x_riscv_zi_subext, MASK_ZICCLSM}, + {"ziccrse", &gcc_options::x_riscv_zi_subext, MASK_ZICCRSE}, {"zicboz", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOZ}, {"zicbom", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOM}, {"zicbop", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOP}, {"zic64b", &gcc_options::x_riscv_zicmo_subext, MASK_ZIC64B}, - {"ziccamoa", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCAMOA}, - {"ziccif", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCIF}, - {"zicclsm", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCLSM}, - {"ziccrse", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCRSE}, {"zve32x", &gcc_options::x_target_flags, MASK_VECTOR}, {"zve32f", &gcc_options::x_target_flags, MASK_VECTOR}, diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index b14888e9816..ee824756381 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -237,8 +237,6 @@ Mask(ZIHINTPAUSE) Var(riscv_zi_subext) Mask(ZICOND) Var(riscv_zi_subext) -Mask(ZIC64B) Var(riscv_zi_subext) - Mask(ZICCAMOA)Var(riscv_zi_subext) Mask(ZICCIF) Var(riscv_zi_subext) @@ -390,6 +388,8 @@ Mask(ZICBOM) Var(riscv_zicmo_subext) Mask(ZICBOP) Var(riscv_zicmo_subext) +Mask(ZIC64B) Var(riscv_zicmo_subext) + TargetVariable int riscv_zf_subext -- 2.44.0
[Patch, fortran] PR114859 - [14/15 Regression] Seeing new segmentation fault in same_type_as since r14-9752
Hi All, Could this be looked at quickly? The timing of this regression is more than a little embarrassing on the eve of the 14.1 release. The testcase and the comment in gfc_trans_class_init_assign explain what this problem is all about and how the patch fixes it. OK for 15-branch and backporting to 14-branch (hopefully to the RC as well)? Paul Fortran: Fix regression caused by r14-9752 [PR114959] 2024-04-28 Paul Thomas gcc/fortran PR fortran/114959 * trans-expr.cc (gfc_trans_class_init_assign): Return NULL_TREE if the default initializer has all NULL fields. Guard this by a requirement that the code be EXEC_INIT_ASSIGN and that the object be an INTENT_IN dummy. * trans-stmt.cc (gfc_trans_allocate): Change the initializer code for allocate with mold to EXEC_ASSIGN to allow initializer with all NULL fields.. gcc/testsuite/ PR fortran/114959 * gfortran.dg/pr114959.f90: New test. diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 072adf3fe77..0280c441ced 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -1720,6 +1720,7 @@ gfc_trans_class_init_assign (gfc_code *code) gfc_se dst,src,memsz; gfc_expr *lhs, *rhs, *sz; gfc_component *cmp; + gfc_symbol *sym; gfc_start_block (&block); @@ -1736,18 +1737,25 @@ gfc_trans_class_init_assign (gfc_code *code) /* The _def_init is always scalar. */ rhs->rank = 0; - /* Check def_init for initializers. If this is a dummy with all default - initializer components NULL, return NULL_TREE and use the passed value as - required by F2018(8.5.10). */ - if (!lhs->ref && lhs->symtree->n.sym->attr.dummy) + /* Check def_init for initializers. If this is an INTENT(OUT) dummy with all + default initializer components NULL, return NULL_TREE and use the passed + value as required by F2018(8.5.10). */ + sym = code->expr1->expr_type == EXPR_VARIABLE ? code->expr1->symtree->n.sym + : NULL; + if (code->op != EXEC_ALLOCATE + && sym && sym->attr.dummy + && sym->attr.intent == INTENT_OUT) { - cmp = rhs->ref->next->u.c.component->ts.u.derived->components; - for (; cmp; cmp = cmp->next) + if (!lhs->ref && lhs->symtree->n.sym->attr.dummy) { - if (cmp->initializer) - break; - else if (!cmp->next) - return build_empty_stmt (input_location); + cmp = rhs->ref->next->u.c.component->ts.u.derived->components; + for (; cmp; cmp = cmp->next) + { + if (cmp->initializer) + break; + else if (!cmp->next) + return NULL_TREE; + } } } diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc index c34e0b4c0cd..d355009fa5e 100644 --- a/gcc/fortran/trans-stmt.cc +++ b/gcc/fortran/trans-stmt.cc @@ -7262,11 +7262,12 @@ gfc_trans_allocate (gfc_code * code, gfc_omp_namelist *omp_allocate) { /* Use class_init_assign to initialize expr. */ gfc_code *ini; - ini = gfc_get_code (EXEC_INIT_ASSIGN); + ini = gfc_get_code (EXEC_ALLOCATE); ini->expr1 = gfc_find_and_cut_at_last_class_ref (expr, true); tmp = gfc_trans_class_init_assign (ini); gfc_free_statements (ini); - gfc_add_expr_to_block (&block, tmp); + if (tmp != NULL_TREE) + gfc_add_expr_to_block (&block, tmp); } else if ((init_expr = allocate_get_initializer (code, expr))) { diff --git a/gcc/testsuite/gfortran.dg/pr114959.f90 b/gcc/testsuite/gfortran.dg/pr114959.f90 new file mode 100644 index 000..5cc3c052c1d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr114959.f90 @@ -0,0 +1,33 @@ +! { dg-do compile } +! { dg-options "-fdump-tree-original" } +! +! Fix the regression caused by r14-9752 (fix for PR112407) +! Contributed by Orion Poplawski +! Problem isolated by Jakub Jelinek and further +! reduced here. +! +module m + type :: smoother_type +integer :: i + end type + type :: onelev_type +class(smoother_type), allocatable :: sm +class(smoother_type), allocatable :: sm2a + end type +contains + subroutine save_smoothers(level,save1, save2) +Implicit None +type(onelev_type), intent(inout) :: level +class(smoother_type), allocatable , intent(inout) :: save1, save2 +integer(4) :: info + +info = 0 +! r14-9752 causes the 'stat' declaration from the first ALLOCATE statement +! to disappear, which triggers an ICE in gimplify_var_or_parm_decl. The +! second ALLOCATE statement has to be present for the ICE to occur. +allocate(save1, mold=level%sm,stat=info) +allocate(save2, mold=level%sm2a,stat=info) + end subroutine save_smoothers +end module m +! Two 'stat's from the allocate statements and two from the final wrapper. +! { dg-final { scan-tree-dump-times "integer\\(kind..\\) stat" 4 "original" } }
[PATCH] Minor range type fixes for IPA in preparation for prange.
The polymorphic Value_Range object takes a tree type at construction so it can determine what type of range to use (currently irange or frange). It seems a few of the types are slightly off. This isn't a problem now, because IPA only cares about integers and pointers, which can both live in an irange. However, with prange coming about, we need to get the type right, because you can't store an integer in a pointer range or vice versa. Also, in preparation for prange, the irange::supports_p() idiom will become: irange::supports_p () || prange::supports_p() To avoid changing all these palces, I've added an inline function we can later change and change everything at once. Finally, there's a Value_Range::supports_type_p() && irange::supports_p() in the code. The latter is a subset of the former, so there's no need to check both. OK for trunk? gcc/ChangeLog: * ipa-cp.cc (ipa_vr_operation_and_type_effects): Use ipa_supports_p. (ipa_value_range_from_jfunc): Change Value_Range type. (propagate_vr_across_jump_function): Same. * ipa-cp.h (ipa_supports_p): New. * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Change Value_Range type. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Use ipa_supports_p. (ipcp_get_parm_bits): Same. --- gcc/ipa-cp.cc| 14 +++--- gcc/ipa-cp.h | 8 gcc/ipa-fnsummary.cc | 2 +- gcc/ipa-prop.cc | 8 +++- 4 files changed, 19 insertions(+), 13 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index a688dced5c9..5781f50c854 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -1649,7 +1649,7 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr, enum tree_code operation, tree dst_type, tree src_type) { - if (!irange::supports_p (dst_type) || !irange::supports_p (src_type)) + if (!ipa_supports_p (dst_type) || !ipa_supports_p (src_type)) return false; range_op_handler handler (operation); @@ -1720,7 +1720,7 @@ ipa_value_range_from_jfunc (vrange &vr, if (TREE_CODE_CLASS (operation) == tcc_unary) { - Value_Range res (vr_type); + Value_Range res (parm_type); if (ipa_vr_operation_and_type_effects (res, srcvr, @@ -1733,7 +1733,7 @@ ipa_value_range_from_jfunc (vrange &vr, Value_Range op_res (vr_type); Value_Range res (vr_type); tree op = ipa_get_jf_pass_through_operand (jfunc); - Value_Range op_vr (vr_type); + Value_Range op_vr (TREE_TYPE (op)); range_op_handler handler (operation); ipa_range_set_and_normalize (op_vr, op); @@ -2527,7 +2527,7 @@ propagate_vr_across_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc, if (src_lats->m_value_range.bottom_p ()) return dest_lat->set_to_bottom (); - Value_Range vr (operand_type); + Value_Range vr (param_type); if (TREE_CODE_CLASS (operation) == tcc_unary) ipa_vr_operation_and_type_effects (vr, src_lats->m_value_range.m_vr, @@ -2540,16 +2540,16 @@ propagate_vr_across_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc, { tree op = ipa_get_jf_pass_through_operand (jfunc); Value_Range op_vr (TREE_TYPE (op)); - Value_Range op_res (operand_type); + Value_Range op_res (param_type); range_op_handler handler (operation); ipa_range_set_and_normalize (op_vr, op); if (!handler - || !op_res.supports_type_p (operand_type) + || !ipa_supports_p (operand_type) || !handler.fold_range (op_res, operand_type, src_lats->m_value_range.m_vr, op_vr)) - op_res.set_varying (operand_type); + op_res.set_varying (param_type); ipa_vr_operation_and_type_effects (vr, op_res, diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h index 7ff74fb5c98..abeaaa4053e 100644 --- a/gcc/ipa-cp.h +++ b/gcc/ipa-cp.h @@ -291,4 +291,12 @@ public: bool values_equal_for_ipcp_p (tree x, tree y); +/* Return TRUE if IPA supports ranges of TYPE. */ + +static inline bool +ipa_supports_p (tree type) +{ + return irange::supports_p (type); +} + #endif /* IPA_CP_H */ diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc index dff40cd8aa5..1dbf5278149 100644 --- a/gcc/ipa-fnsummary.cc +++ b/gcc/ipa-fnsummary.cc @@ -515,7 +515,7 @@ evaluate_conditions_for_known_args (struct cgraph_node *node, } else if (!op->val[1]) { - Value_Range op0 (op->type); + Value_Range op0 (TREE_TYPE (op->val[0])); range_op_handler handler (op->code); ipa_range_set_and_normalize (op0, op->val[0]); diff --git a/gcc/ip
[PATCH 00/16] prange supporting patchset
In this cycle, we will be contributing ranges for pointers (prange), to disambiguate pointers from integers in a range. Initially they will behave exactly as they do now, with just two integer end points and a bitmask, but eventually we will track points-to info in a less hacky manner than what we do with the pointer equivalency class (pointer_equiv_analyzer). This first set of patches implements a bunch of little cleanups and set-ups that will make it easier to drop in prange in a week or two. The patches in this set are non-intrusive, and don't touch code that changes much in the release, so they should be safe to push now. There should be no change in behavior in any of these patches. All patches have been tested on x86-64 Linux. Aldy Hernandez (16): Make vrange an abstract class. Add a virtual vrange destructor. Make some Value_Range's explicitly integer. Add tree versions of lower and upper bounds to vrange. Move bitmask routines to vrange base class. Remove GTY support for vrange and derived classes. Make fold_cond_with_ops use a boolean type for range_true/range_false. Change range_includes_zero_p argument to a reference. Verify that reading back from vrange_storage doesn't drop bits. Accept a vrange in get_legacy_range. Move get_bitmask_from_range out of irange class. Make some integer specific ranges generic Value_Range's. Accept any vrange in range_includes_zero_p. Move print_irange_* out of vrange_printer class. Remove range_zero and range_nonzero. Callers of irange_bitmask must normalize value/mask pairs. gcc/gimple-range-op.cc | 6 +- gcc/gimple-ssa-warn-access.cc | 4 +- gcc/ipa-cp.cc | 9 +- gcc/ipa-prop.cc | 10 +- gcc/range-op-mixed.h| 2 +- gcc/range-op-ptr.cc | 14 +- gcc/range-op.cc | 20 ++- gcc/range.cc| 14 -- gcc/range.h | 2 - gcc/tree-ssa-ccp.cc | 1 + gcc/tree-ssa-loop-niter.cc | 16 +- gcc/tree-ssa-loop-split.cc | 6 +- gcc/tree-ssa-strlen.cc | 2 +- gcc/value-query.cc | 4 +- gcc/value-range-pretty-print.cc | 83 +- gcc/value-range-pretty-print.h | 2 - gcc/value-range-storage.cc | 20 ++- gcc/value-range-storage.h | 4 - gcc/value-range.cc | 284 gcc/value-range.h | 166 --- gcc/vr-values.cc| 7 +- 21 files changed, 310 insertions(+), 366 deletions(-) -- 2.44.0
[COMMITTED 16/16] Callers of irange_bitmask must normalize value/mask pairs.
As per the documentation, irange_bitmask must have the unknown bits in the mask set to 0 in the value field. Even though we say we must have normalized value/mask bits, we don't enforce it, opting to normalize on the fly in union and intersect. Avoiding this lazy enforcing as well as the extra saving/restoring involved in returning the changed status, gives us a performance increase of 1.25% for VRP and 1.51% for ipa-CP. gcc/ChangeLog: * tree-ssa-ccp.cc (ccp_finalize): Normalize before calling set_bitmask. * value-range.cc (irange::intersect_bitmask): Calculate changed irange_bitmask bits on our own. (irange::union_bitmask): Same. (irange_bitmask::verify_mask): Verify that bits are normalized. * value-range.h (irange_bitmask::union_): Do not normalize. Remove return value. (irange_bitmask::intersect): Same. --- gcc/tree-ssa-ccp.cc | 1 + gcc/value-range.cc | 7 +-- gcc/value-range.h | 24 ++-- 3 files changed, 12 insertions(+), 20 deletions(-) diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc index f6a5cd0ee6e..3749126b5f7 100644 --- a/gcc/tree-ssa-ccp.cc +++ b/gcc/tree-ssa-ccp.cc @@ -1024,6 +1024,7 @@ ccp_finalize (bool nonzero_p) unsigned int precision = TYPE_PRECISION (TREE_TYPE (val->value)); wide_int value = wi::to_wide (val->value); wide_int mask = wide_int::from (val->mask, precision, UNSIGNED); + value = value & ~mask; set_bitmask (name, value, mask); } } diff --git a/gcc/value-range.cc b/gcc/value-range.cc index a27de5534e1..ca6d521c625 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -2067,7 +2067,8 @@ irange::intersect_bitmask (const irange &r) irange_bitmask bm = get_bitmask (); irange_bitmask save = bm; - if (!bm.intersect (r.get_bitmask ())) + bm.intersect (r.get_bitmask ()); + if (save == bm) return false; m_bitmask = bm; @@ -2099,7 +2100,8 @@ irange::union_bitmask (const irange &r) irange_bitmask bm = get_bitmask (); irange_bitmask save = bm; - if (!bm.union_ (r.get_bitmask ())) + bm.union_ (r.get_bitmask ()); + if (save == bm) return false; m_bitmask = bm; @@ -2133,6 +2135,7 @@ void irange_bitmask::verify_mask () const { gcc_assert (m_value.get_precision () == m_mask.get_precision ()); + gcc_checking_assert (wi::bit_and (m_mask, m_value) == 0); } void diff --git a/gcc/value-range.h b/gcc/value-range.h index 0ab717697f0..11c73faca1b 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -139,8 +139,8 @@ public: void set_unknown (unsigned prec); bool unknown_p () const; unsigned get_precision () const; - bool union_ (const irange_bitmask &src); - bool intersect (const irange_bitmask &src); + void union_ (const irange_bitmask &src); + void intersect (const irange_bitmask &src); bool operator== (const irange_bitmask &src) const; bool operator!= (const irange_bitmask &src) const { return !(*this == src); } void verify_mask () const; @@ -233,29 +233,18 @@ irange_bitmask::operator== (const irange_bitmask &src) const return m_value == src.m_value && m_mask == src.m_mask; } -inline bool -irange_bitmask::union_ (const irange_bitmask &orig_src) +inline void +irange_bitmask::union_ (const irange_bitmask &src) { - // Normalize mask. - irange_bitmask src (orig_src.m_value & ~orig_src.m_mask, orig_src.m_mask); - m_value &= ~m_mask; - - irange_bitmask save (*this); m_mask = (m_mask | src.m_mask) | (m_value ^ src.m_value); m_value = m_value & src.m_value; if (flag_checking) verify_mask (); - return *this != save; } -inline bool -irange_bitmask::intersect (const irange_bitmask &orig_src) +inline void +irange_bitmask::intersect (const irange_bitmask &src) { - // Normalize mask. - irange_bitmask src (orig_src.m_value & ~orig_src.m_mask, orig_src.m_mask); - m_value &= ~m_mask; - - irange_bitmask save (*this); // If we have two known bits that are incompatible, the resulting // bit is undefined. It is unclear whether we should set the entire // range to UNDEFINED, or just a subset of it. For now, set the @@ -274,7 +263,6 @@ irange_bitmask::intersect (const irange_bitmask &orig_src) } if (flag_checking) verify_mask (); - return *this != save; } // An integer range without any storage. -- 2.44.0
[COMMITTED 07/16] Make fold_cond_with_ops use a boolean type for range_true/range_false.
Conditional operators are always boolean, regardless of their operands. Getting the type wrong is not currently a problem, but will be when prange's can no longer store an integer. gcc/ChangeLog: * vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Remove type from range_true and range_false. --- gcc/vr-values.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc index a7e291a16e5..ff68d40c355 100644 --- a/gcc/vr-values.cc +++ b/gcc/vr-values.cc @@ -320,9 +320,9 @@ simplify_using_ranges::fold_cond_with_ops (enum tree_code code, range_op_handler handler (code); if (handler && handler.fold_range (res, type, r0, r1)) { - if (res == range_true (type)) + if (res == range_true ()) return boolean_true_node; - if (res == range_false (type)) + if (res == range_false ()) return boolean_false_node; } return NULL; -- 2.44.0
[COMMITTED 12/16] Make some integer specific ranges generic Value_Range's.
There are some irange uses that should be Value_Range, because they can be either integers or pointers. This will become a problem when prange comes live. gcc/ChangeLog: * tree-ssa-loop-split.cc (split_at_bb_p): Make int_range a Value_Range. * tree-ssa-strlen.cc (get_range): Same. * value-query.cc (range_query::get_tree_range): Handle both integers and pointers. * vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Make r0 and r1 Value_Range's. --- gcc/tree-ssa-loop-split.cc | 6 +++--- gcc/tree-ssa-strlen.cc | 2 +- gcc/value-query.cc | 4 +--- gcc/vr-values.cc | 3 ++- 4 files changed, 7 insertions(+), 8 deletions(-) diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc index a770ea371a2..a6be0cef7b0 100644 --- a/gcc/tree-ssa-loop-split.cc +++ b/gcc/tree-ssa-loop-split.cc @@ -144,18 +144,18 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv, value range. */ else { - int_range<2> r; + Value_Range r (TREE_TYPE (op0)); get_global_range_query ()->range_of_expr (r, op0, stmt); if (!r.varying_p () && !r.undefined_p () && TREE_CODE (op1) == INTEGER_CST) { wide_int val = wi::to_wide (op1); - if (known_eq (val, r.lower_bound ())) + if (known_eq (val, wi::to_wide (r.lbound ( { code = (code == EQ_EXPR) ? LE_EXPR : GT_EXPR; break; } - else if (known_eq (val, r.upper_bound ())) + else if (known_eq (val, wi::to_wide (r.ubound ( { code = (code == EQ_EXPR) ? GE_EXPR : LT_EXPR; break; diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc index e09c9cc081f..61c3da22322 100644 --- a/gcc/tree-ssa-strlen.cc +++ b/gcc/tree-ssa-strlen.cc @@ -215,7 +215,7 @@ get_range (tree val, gimple *stmt, wide_int minmax[2], rvals = get_range_query (cfun); } - value_range vr; + Value_Range vr (TREE_TYPE (val)); if (!rvals->range_of_expr (vr, val, stmt)) return NULL_TREE; diff --git a/gcc/value-query.cc b/gcc/value-query.cc index eda71dc89d3..052b7511565 100644 --- a/gcc/value-query.cc +++ b/gcc/value-query.cc @@ -156,11 +156,9 @@ range_query::get_tree_range (vrange &r, tree expr, gimple *stmt) { case INTEGER_CST: { - irange &i = as_a (r); if (TREE_OVERFLOW_P (expr)) expr = drop_tree_overflow (expr); - wide_int w = wi::to_wide (expr); - i.set (TREE_TYPE (expr), w, w); + r.set (expr, expr); return true; } diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc index ff68d40c355..0572bf6c8c7 100644 --- a/gcc/vr-values.cc +++ b/gcc/vr-values.cc @@ -310,7 +310,8 @@ tree simplify_using_ranges::fold_cond_with_ops (enum tree_code code, tree op0, tree op1, gimple *s) { - int_range_max r0, r1; + Value_Range r0 (TREE_TYPE (op0)); + Value_Range r1 (TREE_TYPE (op1)); if (!query->range_of_expr (r0, op0, s) || !query->range_of_expr (r1, op1, s)) return NULL_TREE; -- 2.44.0
[COMMITTED 15/16] Remove range_zero and range_nonzero.
Remove legacy range_zero and range_nonzero as they return by value, which make it not work in a separate irange and prange world. Also, we already have set_zero and set_nonzero methods in vrange. gcc/ChangeLog: * range-op-ptr.cc (pointer_plus_operator::wi_fold): Use method range setters instead of out of line functions. (pointer_min_max_operator::wi_fold): Same. (pointer_and_operator::wi_fold): Same. (pointer_or_operator::wi_fold): Same. * range-op.cc (operator_negate::fold_range): Same. (operator_addr_expr::fold_range): Same. (range_op_cast_tests): Same. * range.cc (range_zero): Remove. (range_nonzero): Remove. * range.h (range_zero): Remove. (range_nonzero): Remove. * value-range.cc (range_tests_misc): Use method instead of out of line function. --- gcc/range-op-ptr.cc | 14 +++--- gcc/range-op.cc | 14 -- gcc/range.cc| 14 -- gcc/range.h | 2 -- gcc/value-range.cc | 7 --- 5 files changed, 19 insertions(+), 32 deletions(-) diff --git a/gcc/range-op-ptr.cc b/gcc/range-op-ptr.cc index 2c85d75b5e8..7343ef635f3 100644 --- a/gcc/range-op-ptr.cc +++ b/gcc/range-op-ptr.cc @@ -101,10 +101,10 @@ pointer_plus_operator::wi_fold (irange &r, tree type, && !TYPE_OVERFLOW_WRAPS (type) && (flag_delete_null_pointer_checks || !wi::sign_mask (rh_ub))) -r = range_nonzero (type); +r.set_nonzero (type); else if (lh_lb == lh_ub && lh_lb == 0 && rh_lb == rh_ub && rh_lb == 0) -r = range_zero (type); +r.set_zero (type); else r.set_varying (type); } @@ -150,9 +150,9 @@ pointer_min_max_operator::wi_fold (irange &r, tree type, // are varying. if (!wi_includes_zero_p (type, lh_lb, lh_ub) && !wi_includes_zero_p (type, rh_lb, rh_ub)) -r = range_nonzero (type); +r.set_nonzero (type); else if (wi_zero_p (type, lh_lb, lh_ub) && wi_zero_p (type, rh_lb, rh_ub)) -r = range_zero (type); +r.set_zero (type); else r.set_varying (type); } @@ -175,7 +175,7 @@ pointer_and_operator::wi_fold (irange &r, tree type, // For pointer types, we are really only interested in asserting // whether the expression evaluates to non-NULL. if (wi_zero_p (type, lh_lb, lh_ub) || wi_zero_p (type, lh_lb, lh_ub)) -r = range_zero (type); +r.set_zero (type); else r.set_varying (type); } @@ -236,9 +236,9 @@ pointer_or_operator::wi_fold (irange &r, tree type, // whether the expression evaluates to non-NULL. if (!wi_includes_zero_p (type, lh_lb, lh_ub) && !wi_includes_zero_p (type, rh_lb, rh_ub)) -r = range_nonzero (type); +r.set_nonzero (type); else if (wi_zero_p (type, lh_lb, lh_ub) && wi_zero_p (type, rh_lb, rh_ub)) -r = range_zero (type); +r.set_zero (type); else r.set_varying (type); } diff --git a/gcc/range-op.cc b/gcc/range-op.cc index 6ea7d624a9b..ab3a4f0b200 100644 --- a/gcc/range-op.cc +++ b/gcc/range-op.cc @@ -4364,9 +4364,11 @@ operator_negate::fold_range (irange &r, tree type, { if (empty_range_varying (r, type, lh, rh)) return true; - // -X is simply 0 - X. - return range_op_handler (MINUS_EXPR).fold_range (r, type, - range_zero (type), lh); + +// -X is simply 0 - X. + int_range<1> zero; + zero.set_zero (type); + return range_op_handler (MINUS_EXPR).fold_range (r, type, zero, lh); } bool @@ -4391,7 +4393,7 @@ operator_addr_expr::fold_range (irange &r, tree type, // Return a non-null pointer of the LHS type (passed in op2). if (lh.zero_p ()) -r = range_zero (type); +r.set_zero (type); else if (lh.undefined_p () || contains_zero_p (lh)) r.set_varying (type); else @@ -4675,7 +4677,7 @@ range_op_cast_tests () if (TYPE_PRECISION (integer_type_node) > TYPE_PRECISION (short_integer_type_node)) { - r0 = range_nonzero (integer_type_node); + r0.set_nonzero (integer_type_node); range_cast (r0, short_integer_type_node); r1 = int_range<1> (short_integer_type_node, min_limit (short_integer_type_node), @@ -4687,7 +4689,7 @@ range_op_cast_tests () // // NONZERO signed 16-bits is [-MIN_16,-1][1, +MAX_16]. // Converting this to 32-bits signed is [-MIN_16,-1][1, +MAX_16]. - r0 = range_nonzero (short_integer_type_node); + r0.set_nonzero (short_integer_type_node); range_cast (r0, integer_type_node); r1 = int_range<1> (integer_type_node, INT (-32768), INT (-1)); r2 = int_range<1> (integer_type_node, INT (1), INT (32767)); diff --git a/gcc/range.cc b/gcc/range.cc index c68f387f71c..b362e0f12e0 100644 --- a/gcc/range.cc +++ b/gcc/range.cc @@ -29,20 +29,6 @@ along with GCC; see the file COPYING3. If not see #include "ssa.h" #include "range.h" -value_range -range_zero (tree type) -{ - wide_int zero = wi::zero (TYPE_PRECISION (type)); - return value_range (type, ze
[COMMITTED 13/16] Accept any vrange in range_includes_zero_p.
Accept a vrange, as this will be used for either integers or pointers. gcc/ChangeLog: * value-range.h (range_includes_zero_p): Accept vrange. --- gcc/value-range.h | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/gcc/value-range.h b/gcc/value-range.h index ede90a496d8..0ab717697f0 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -970,7 +970,7 @@ irange::contains_p (tree cst) const } inline bool -range_includes_zero_p (const irange &vr) +range_includes_zero_p (const vrange &vr) { if (vr.undefined_p ()) return false; @@ -978,8 +978,7 @@ range_includes_zero_p (const irange &vr) if (vr.varying_p ()) return true; - wide_int zero = wi::zero (TYPE_PRECISION (vr.type ())); - return vr.contains_p (zero); + return vr.contains_p (build_zero_cst (vr.type ())); } // Constructors for irange -- 2.44.0
[COMMITTED 01/16] Make vrange an abstract class.
Explicitly make vrange an abstract class. This involves fleshing out the unsupported_range overrides which we were inheriting by default from vrange. gcc/ChangeLog: * value-range.cc (unsupported_range::accept): Move down. (vrange::contains_p): Rename to... (unsupported_range::contains_p): ...this. (vrange::singleton_p): Rename to... (unsupported_range::singleton_p): ...this. (vrange::set): Rename to... (unsupported_range::set): ...this. (vrange::type): Rename to... (unsupported_range::type): ...this. (vrange::supports_type_p): Rename to... (unsupported_range::supports_type_p): ...this. (vrange::set_undefined): Rename to... (unsupported_range::set_undefined): ...this. (vrange::set_varying): Rename to... (unsupported_range::set_varying): ...this. (vrange::union_): Rename to... (unsupported_range::union_): ...this. (vrange::intersect): Rename to... (unsupported_range::intersect): ...this. (vrange::zero_p): Rename to... (unsupported_range::zero_p): ...this. (vrange::nonzero_p): Rename to... (unsupported_range::nonzero_p): ...this. (vrange::set_nonzero): Rename to... (unsupported_range::set_nonzero): ...this. (vrange::set_zero): Rename to... (unsupported_range::set_zero): ...this. (vrange::set_nonnegative): Rename to... (unsupported_range::set_nonnegative): ...this. (vrange::fits_p): Rename to... (unsupported_range::fits_p): ...this. (unsupported_range::operator=): New. (frange::fits_p): New. * value-range.h (class vrange): Make an abstract class. (class unsupported_range): Declare override methods. --- gcc/value-range.cc | 62 ++ gcc/value-range.h | 53 --- 2 files changed, 73 insertions(+), 42 deletions(-) diff --git a/gcc/value-range.cc b/gcc/value-range.cc index 70375f7abf9..632d77305cc 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -37,12 +37,6 @@ irange::accept (const vrange_visitor &v) const v.visit (*this); } -void -unsupported_range::accept (const vrange_visitor &v) const -{ - v.visit (*this); -} - // Convenience function only available for integers and pointers. wide_int @@ -86,52 +80,58 @@ debug (const irange_bitmask &bm) fprintf (stderr, "\n"); } -// Default vrange definitions. +// Definitions for unsupported_range. + +void +unsupported_range::accept (const vrange_visitor &v) const +{ + v.visit (*this); +} bool -vrange::contains_p (tree) const +unsupported_range::contains_p (tree) const { return varying_p (); } bool -vrange::singleton_p (tree *) const +unsupported_range::singleton_p (tree *) const { return false; } void -vrange::set (tree min, tree, value_range_kind) +unsupported_range::set (tree min, tree, value_range_kind) { set_varying (TREE_TYPE (min)); } tree -vrange::type () const +unsupported_range::type () const { return void_type_node; } bool -vrange::supports_type_p (const_tree) const +unsupported_range::supports_type_p (const_tree) const { return false; } void -vrange::set_undefined () +unsupported_range::set_undefined () { m_kind = VR_UNDEFINED; } void -vrange::set_varying (tree) +unsupported_range::set_varying (tree) { m_kind = VR_VARYING; } bool -vrange::union_ (const vrange &r) +unsupported_range::union_ (const vrange &r) { if (r.undefined_p () || varying_p ()) return false; @@ -145,7 +145,7 @@ vrange::union_ (const vrange &r) } bool -vrange::intersect (const vrange &r) +unsupported_range::intersect (const vrange &r) { if (undefined_p () || r.varying_p ()) return false; @@ -164,41 +164,53 @@ vrange::intersect (const vrange &r) } bool -vrange::zero_p () const +unsupported_range::zero_p () const { return false; } bool -vrange::nonzero_p () const +unsupported_range::nonzero_p () const { return false; } void -vrange::set_nonzero (tree type) +unsupported_range::set_nonzero (tree type) { set_varying (type); } void -vrange::set_zero (tree type) +unsupported_range::set_zero (tree type) { set_varying (type); } void -vrange::set_nonnegative (tree type) +unsupported_range::set_nonnegative (tree type) { set_varying (type); } bool -vrange::fits_p (const vrange &) const +unsupported_range::fits_p (const vrange &) const { return true; } +unsupported_range & +unsupported_range::operator= (const vrange &r) +{ + if (r.undefined_p ()) +set_undefined (); + else if (r.varying_p ()) +set_varying (void_type_node); + else +gcc_unreachable (); + return *this; +} + // Assignment operator for generic ranges. Copying incompatible types // is not allowed. @@ -359,6 +371,12 @@ frange::accept (const vrange_visitor &v) const v.visit (*this); } +bool +frange::fits_p (con
[COMMITTED 10/16] Accept a vrange in get_legacy_range.
In preparation for prange, make get_legacy_range take a generic vrange, not just an irange. gcc/ChangeLog: * value-range.cc (get_legacy_range): Make static and add another version of get_legacy_range that takes a vrange. * value-range.h (class irange): Remove unnecessary friendship with get_legacy_range. (get_legacy_range): Accept a vrange. --- gcc/value-range.cc | 17 - gcc/value-range.h | 3 +-- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/gcc/value-range.cc b/gcc/value-range.cc index b901c864a7b..44929b210aa 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -1004,7 +1004,7 @@ irange::operator= (const irange &src) return *this; } -value_range_kind +static value_range_kind get_legacy_range (const irange &r, tree &min, tree &max) { if (r.undefined_p ()) @@ -1041,6 +1041,21 @@ get_legacy_range (const irange &r, tree &min, tree &max) return VR_RANGE; } +// Given a range in V, return an old-style legacy range consisting of +// a value_range_kind with a MIN/MAX. This is to maintain +// compatibility with passes that still depend on VR_ANTI_RANGE, and +// only works for integers and pointers. + +value_range_kind +get_legacy_range (const vrange &v, tree &min, tree &max) +{ + if (is_a (v)) +return get_legacy_range (as_a (v), min, max); + + gcc_unreachable (); + return VR_UNDEFINED; +} + /* Set value range to the canonical form of {VRTYPE, MIN, MAX, EQUIV}. This means adjusting VRTYPE, MIN and MAX representing the case of a wrapping range with MAX < MIN covering [MIN, type_max] U [type_min, MAX] diff --git a/gcc/value-range.h b/gcc/value-range.h index 62f123e2a4b..d2e8fd5a4d9 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -281,7 +281,6 @@ irange_bitmask::intersect (const irange_bitmask &orig_src) class irange : public vrange { - friend value_range_kind get_legacy_range (const irange &, tree &, tree &); friend class irange_storage; friend class vrange_printer; public: @@ -886,7 +885,7 @@ Value_Range::supports_type_p (const_tree type) return irange::supports_p (type) || frange::supports_p (type); } -extern value_range_kind get_legacy_range (const irange &, tree &min, tree &max); +extern value_range_kind get_legacy_range (const vrange &, tree &min, tree &max); extern void dump_value_range (FILE *, const vrange *); extern bool vrp_operand_equal_p (const_tree, const_tree); inline REAL_VALUE_TYPE frange_val_min (const_tree type); -- 2.44.0
[COMMITTED 08/16] Change range_includes_zero_p argument to a reference.
Make range_includes_zero_p take an argument instead of a pointer for consistency in the range-op code. gcc/ChangeLog: * gimple-range-op.cc (cfn_clz::fold_range): Change range_includes_zero_p argument to a reference. (cfn_ctz::fold_range): Same. * range-op.cc (operator_plus::lhs_op1_relation): Same. * value-range.h (range_includes_zero_p): Same. --- gcc/gimple-range-op.cc | 6 +++--- gcc/range-op.cc| 2 +- gcc/value-range.h | 10 +- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index a98f7db62a7..9c50c00549e 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -853,7 +853,7 @@ public: // __builtin_ffs* and __builtin_popcount* return [0, prec]. int prec = TYPE_PRECISION (lh.type ()); // If arg is non-zero, then ffs or popcount are non-zero. -int mini = range_includes_zero_p (&lh) ? 0 : 1; +int mini = range_includes_zero_p (lh) ? 0 : 1; int maxi = prec; // If some high bits are known to be zero, decrease the maximum. @@ -945,7 +945,7 @@ cfn_clz::fold_range (irange &r, tree type, const irange &lh, if (mini == -2) mini = 0; } - else if (!range_includes_zero_p (&lh)) + else if (!range_includes_zero_p (lh)) { mini = 0; maxi = prec - 1; @@ -1007,7 +1007,7 @@ cfn_ctz::fold_range (irange &r, tree type, const irange &lh, mini = -2; } // If arg is non-zero, then use [0, prec - 1]. - if (!range_includes_zero_p (&lh)) + if (!range_includes_zero_p (lh)) { mini = 0; maxi = prec - 1; diff --git a/gcc/range-op.cc b/gcc/range-op.cc index aeff55cfd78..6ea7d624a9b 100644 --- a/gcc/range-op.cc +++ b/gcc/range-op.cc @@ -1657,7 +1657,7 @@ operator_plus::lhs_op1_relation (const irange &lhs, } // If op2 does not contain 0, then LHS and OP1 can never be equal. - if (!range_includes_zero_p (&op2)) + if (!range_includes_zero_p (op2)) return VREL_NE; return VREL_VARYING; diff --git a/gcc/value-range.h b/gcc/value-range.h index 2650ded6d10..62f123e2a4b 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -972,16 +972,16 @@ irange::contains_p (tree cst) const } inline bool -range_includes_zero_p (const irange *vr) +range_includes_zero_p (const irange &vr) { - if (vr->undefined_p ()) + if (vr.undefined_p ()) return false; - if (vr->varying_p ()) + if (vr.varying_p ()) return true; - wide_int zero = wi::zero (TYPE_PRECISION (vr->type ())); - return vr->contains_p (zero); + wide_int zero = wi::zero (TYPE_PRECISION (vr.type ())); + return vr.contains_p (zero); } // Constructors for irange -- 2.44.0
[COMMITTED 14/16] Move print_irange_* out of vrange_printer class.
Move some code out of the irange pretty printers so it can be shared with pointers. gcc/ChangeLog: * value-range-pretty-print.cc (print_int_bound): New. (print_irange_bitmasks): New. (vrange_printer::print_irange_bound): Remove. (vrange_printer::print_irange_bitmasks): Remove. * value-range-pretty-print.h: Remove print_irange_bitmasks and print_irange_bound --- gcc/value-range-pretty-print.cc | 83 - gcc/value-range-pretty-print.h | 2 - 2 files changed, 41 insertions(+), 44 deletions(-) diff --git a/gcc/value-range-pretty-print.cc b/gcc/value-range-pretty-print.cc index c75cbea3955..b6d23dce6d2 100644 --- a/gcc/value-range-pretty-print.cc +++ b/gcc/value-range-pretty-print.cc @@ -30,6 +30,44 @@ along with GCC; see the file COPYING3. If not see #include "gimple-range.h" #include "value-range-pretty-print.h" +static void +print_int_bound (pretty_printer *pp, const wide_int &bound, tree type) +{ + wide_int type_min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type)); + wide_int type_max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type)); + + if (INTEGRAL_TYPE_P (type) + && !TYPE_UNSIGNED (type) + && bound == type_min + && TYPE_PRECISION (type) != 1) +pp_string (pp, "-INF"); + else if (bound == type_max && TYPE_PRECISION (type) != 1) +pp_string (pp, "+INF"); + else +pp_wide_int (pp, bound, TYPE_SIGN (type)); +} + +static void +print_irange_bitmasks (pretty_printer *pp, const irange_bitmask &bm) +{ + if (bm.unknown_p ()) +return; + + pp_string (pp, " MASK "); + char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p; + unsigned len_mask, len_val; + if (print_hex_buf_size (bm.mask (), &len_mask) + | print_hex_buf_size (bm.value (), &len_val)) +p = XALLOCAVEC (char, MAX (len_mask, len_val)); + else +p = buf; + print_hex (bm.mask (), p); + pp_string (pp, p); + pp_string (pp, " VALUE "); + print_hex (bm.value (), p); + pp_string (pp, p); +} + void vrange_printer::visit (const unsupported_range &r) const { @@ -66,51 +104,12 @@ vrange_printer::visit (const irange &r) const for (unsigned i = 0; i < r.num_pairs (); ++i) { pp_character (pp, '['); - print_irange_bound (r.lower_bound (i), r.type ()); + print_int_bound (pp, r.lower_bound (i), r.type ()); pp_string (pp, ", "); - print_irange_bound (r.upper_bound (i), r.type ()); + print_int_bound (pp, r.upper_bound (i), r.type ()); pp_character (pp, ']'); } - print_irange_bitmasks (r); -} - -void -vrange_printer::print_irange_bound (const wide_int &bound, tree type) const -{ - wide_int type_min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type)); - wide_int type_max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type)); - - if (INTEGRAL_TYPE_P (type) - && !TYPE_UNSIGNED (type) - && bound == type_min - && TYPE_PRECISION (type) != 1) -pp_string (pp, "-INF"); - else if (bound == type_max && TYPE_PRECISION (type) != 1) -pp_string (pp, "+INF"); - else -pp_wide_int (pp, bound, TYPE_SIGN (type)); -} - -void -vrange_printer::print_irange_bitmasks (const irange &r) const -{ - irange_bitmask bm = r.m_bitmask; - if (bm.unknown_p ()) -return; - - pp_string (pp, " MASK "); - char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p; - unsigned len_mask, len_val; - if (print_hex_buf_size (bm.mask (), &len_mask) - | print_hex_buf_size (bm.value (), &len_val)) -p = XALLOCAVEC (char, MAX (len_mask, len_val)); - else -p = buf; - print_hex (bm.mask (), p); - pp_string (pp, p); - pp_string (pp, " VALUE "); - print_hex (bm.value (), p); - pp_string (pp, p); + print_irange_bitmasks (pp, r.m_bitmask); } void diff --git a/gcc/value-range-pretty-print.h b/gcc/value-range-pretty-print.h index ca85fd6157c..44cd6e81298 100644 --- a/gcc/value-range-pretty-print.h +++ b/gcc/value-range-pretty-print.h @@ -29,8 +29,6 @@ public: void visit (const irange &) const override; void visit (const frange &) const override; private: - void print_irange_bound (const wide_int &w, tree type) const; - void print_irange_bitmasks (const irange &) const; void print_frange_nan (const frange &) const; void print_real_value (tree type, const REAL_VALUE_TYPE &r) const; -- 2.44.0
[COMMITTED 09/16] Verify that reading back from vrange_storage doesn't drop bits.
We have a sanity check in the irange storage code to make sure that reading back a cache entry we have just written to yields exactly the same range. There's no need to do this only for integers. This patch moves the code to a more generic place. However, doing so tickles a latent bug in the frange code where a range is being pessimized from [0.0, 1.0] to [-0.0, 1.0]. Exclude checking frange's until this bug is fixed. gcc/ChangeLog: * value-range-storage.cc (irange_storage::set_irange): Move verification code from here... (vrange_storage::set_vrange): ...to here. --- gcc/value-range-storage.cc | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/gcc/value-range-storage.cc b/gcc/value-range-storage.cc index f00474ad0e6..09a29776a0e 100644 --- a/gcc/value-range-storage.cc +++ b/gcc/value-range-storage.cc @@ -165,6 +165,19 @@ vrange_storage::set_vrange (const vrange &r) } else gcc_unreachable (); + + // Verify that reading back from the cache didn't drop bits. + if (flag_checking + // FIXME: Avoid checking frange, as it currently pessimizes some ranges: + // + // gfortran.dg/pr49472.f90 pessimizes [0.0, 1.0] into [-0.0, 1.0]. + && !is_a (r) + && !r.undefined_p ()) +{ + Value_Range tmp (r); + get_vrange (tmp, r.type ()); + gcc_checking_assert (tmp == r); +} } // Restore R from storage. @@ -306,13 +319,6 @@ irange_storage::set_irange (const irange &r) irange_bitmask bm = r.m_bitmask; write_wide_int (val, len, bm.value ()); write_wide_int (val, len, bm.mask ()); - - if (flag_checking) -{ - int_range_max tmp; - get_irange (tmp, r.type ()); - gcc_checking_assert (tmp == r); -} } static inline void -- 2.44.0
[COMMITTED 02/16] Add a virtual vrange destructor.
Richi mentioned in PR113476 that it would be cleaner to move the destructor from int_range to the base class. Although this isn't strictly necessary, as there are no users, it is good to future proof things, and the overall impact is miniscule. gcc/ChangeLog: * value-range.h (vrange::~vrange): New. (int_range::~int_range): Make final override. --- gcc/value-range.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/value-range.h b/gcc/value-range.h index e7f61950a24..b7c83982385 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -95,6 +95,7 @@ public: virtual void set_zero (tree type) = 0; virtual void set_nonnegative (tree type) = 0; virtual bool fits_p (const vrange &r) const = 0; + virtual ~vrange () { } bool varying_p () const; bool undefined_p () const; @@ -382,7 +383,7 @@ public: int_range (tree type); int_range (const int_range &); int_range (const irange &); - virtual ~int_range (); + ~int_range () final override; int_range& operator= (const int_range &); protected: int_range (tree, tree, value_range_kind = VR_RANGE); -- 2.44.0
[COMMITTED 11/16] Move get_bitmask_from_range out of irange class.
prange will also have bitmasks, so it will need to use get_bitmask_from_range. gcc/ChangeLog: * value-range.cc (get_bitmask_from_range): Move out of irange class. (irange::get_bitmask): Call function instead of internal method. * value-range.h (class irange): Remove get_bitmask_from_range. --- gcc/value-range.cc | 52 +++--- gcc/value-range.h | 1 - 2 files changed, 26 insertions(+), 27 deletions(-) diff --git a/gcc/value-range.cc b/gcc/value-range.cc index 44929b210aa..d9689bd469f 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -31,6 +31,30 @@ along with GCC; see the file COPYING3. If not see #include "fold-const.h" #include "gimple-range.h" +// Return the bitmask inherent in a range. + +static irange_bitmask +get_bitmask_from_range (tree type, + const wide_int &min, const wide_int &max) +{ + unsigned prec = TYPE_PRECISION (type); + + // All the bits of a singleton are known. + if (min == max) +{ + wide_int mask = wi::zero (prec); + wide_int value = min; + return irange_bitmask (value, mask); +} + + wide_int xorv = min ^ max; + + if (xorv != 0) +xorv = wi::mask (prec - wi::clz (xorv), false, prec); + + return irange_bitmask (wi::zero (prec), min | xorv); +} + void irange::accept (const vrange_visitor &v) const { @@ -1881,31 +1905,6 @@ irange::invert () verify_range (); } -// Return the bitmask inherent in the range. - -irange_bitmask -irange::get_bitmask_from_range () const -{ - unsigned prec = TYPE_PRECISION (type ()); - wide_int min = lower_bound (); - wide_int max = upper_bound (); - - // All the bits of a singleton are known. - if (min == max) -{ - wide_int mask = wi::zero (prec); - wide_int value = lower_bound (); - return irange_bitmask (value, mask); -} - - wide_int xorv = min ^ max; - - if (xorv != 0) -xorv = wi::mask (prec - wi::clz (xorv), false, prec); - - return irange_bitmask (wi::zero (prec), min | xorv); -} - // Remove trailing ranges that this bitmask indicates can't exist. void @@ -2027,7 +2026,8 @@ irange::get_bitmask () const // in the mask. // // See also the note in irange_bitmask::intersect. - irange_bitmask bm = get_bitmask_from_range (); + irange_bitmask bm += get_bitmask_from_range (type (), lower_bound (), upper_bound ()); if (!m_bitmask.unknown_p ()) bm.intersect (m_bitmask); return bm; diff --git a/gcc/value-range.h b/gcc/value-range.h index d2e8fd5a4d9..ede90a496d8 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -352,7 +352,6 @@ private: bool varying_compatible_p () const; bool intersect_bitmask (const irange &r); bool union_bitmask (const irange &r); - irange_bitmask get_bitmask_from_range () const; bool set_range_from_bitmask (); bool intersect (const wide_int& lb, const wide_int& ub); -- 2.44.0
[COMMITTED 04/16] Add tree versions of lower and upper bounds to vrange.
This patch adds vrange::lbound() and vrange::ubound() that return trees. These can be used in generic code that is type agnostic, and avoids special casing for pointers and integers in places where we handle both. It also cleans up a wart in the Value_Range class. gcc/ChangeLog: * tree-ssa-loop-niter.cc (refine_value_range_using_guard): Convert bound to wide_int. * value-range.cc (Value_Range::lower_bound): Remove. (Value_Range::upper_bound): Remove. (unsupported_range::lbound): New. (unsupported_range::ubound): New. (frange::lbound): New. (frange::ubound): New. (irange::lbound): New. (irange::ubound): New. * value-range.h (class vrange): Add lbound() and ubound(). (class irange): Same. (class frange): Same. (class unsupported_range): Same. (class Value_Range): Rename lower_bound and upper_bound to lbound and ubound respectively. --- gcc/tree-ssa-loop-niter.cc | 4 +-- gcc/value-range.cc | 56 -- gcc/value-range.h | 13 +++-- 3 files changed, 48 insertions(+), 25 deletions(-) diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc index cbc9dbc5a1f..adbc1936982 100644 --- a/gcc/tree-ssa-loop-niter.cc +++ b/gcc/tree-ssa-loop-niter.cc @@ -4067,7 +4067,7 @@ record_nonwrapping_iv (class loop *loop, tree base, tree step, gimple *stmt, Value_Range base_range (TREE_TYPE (orig_base)); if (get_range_query (cfun)->range_of_expr (base_range, orig_base) && !base_range.undefined_p ()) - max = base_range.upper_bound (); + max = wi::to_wide (base_range.ubound ()); extreme = fold_convert (unsigned_type, low); if (TREE_CODE (orig_base) == SSA_NAME && TREE_CODE (high) == INTEGER_CST @@ -4090,7 +4090,7 @@ record_nonwrapping_iv (class loop *loop, tree base, tree step, gimple *stmt, Value_Range base_range (TREE_TYPE (orig_base)); if (get_range_query (cfun)->range_of_expr (base_range, orig_base) && !base_range.undefined_p ()) - min = base_range.lower_bound (); + min = wi::to_wide (base_range.lbound ()); extreme = fold_convert (unsigned_type, high); if (TREE_CODE (orig_base) == SSA_NAME && TREE_CODE (low) == INTEGER_CST diff --git a/gcc/value-range.cc b/gcc/value-range.cc index 632d77305cc..ccac517d4c4 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -37,26 +37,6 @@ irange::accept (const vrange_visitor &v) const v.visit (*this); } -// Convenience function only available for integers and pointers. - -wide_int -Value_Range::lower_bound () const -{ - if (is_a (*m_vrange)) -return as_a (*m_vrange).lower_bound (); - gcc_unreachable (); -} - -// Convenience function only available for integers and pointers. - -wide_int -Value_Range::upper_bound () const -{ - if (is_a (*m_vrange)) -return as_a (*m_vrange).upper_bound (); - gcc_unreachable (); -} - void Value_Range::dump (FILE *out) const { @@ -211,6 +191,18 @@ unsupported_range::operator= (const vrange &r) return *this; } +tree +unsupported_range::lbound () const +{ + return NULL; +} + +tree +unsupported_range::ubound () const +{ + return NULL; +} + // Assignment operator for generic ranges. Copying incompatible types // is not allowed. @@ -957,6 +949,18 @@ frange::set_nonnegative (tree type) set (type, dconst0, frange_val_max (type)); } +tree +frange::lbound () const +{ + return build_real (type (), lower_bound ()); +} + +tree +frange::ubound () const +{ + return build_real (type (), upper_bound ()); +} + // Here we copy between any two irange's. irange & @@ -2086,6 +2090,18 @@ irange::union_bitmask (const irange &r) return true; } +tree +irange::lbound () const +{ + return wide_int_to_tree (type (), lower_bound ()); +} + +tree +irange::ubound () const +{ + return wide_int_to_tree (type (), upper_bound ()); +} + void irange_bitmask::verify_mask () const { diff --git a/gcc/value-range.h b/gcc/value-range.h index b7c83982385..f216f1b82c1 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -96,6 +96,8 @@ public: virtual void set_nonnegative (tree type) = 0; virtual bool fits_p (const vrange &r) const = 0; virtual ~vrange () { } + virtual tree lbound () const = 0; + virtual tree ubound () const = 0; bool varying_p () const; bool undefined_p () const; @@ -298,6 +300,8 @@ public: wide_int lower_bound (unsigned = 0) const; wide_int upper_bound (unsigned) const; wide_int upper_bound () const; + virtual tree lbound () const override; + virtual tree ubound () const override; // Predicates. virtual bool zero_p () const override; @@ -419,6 +423,8 @@ public: void set_nonnegative (tree type) final override; bool fits_p (const vrange &) const final override; unsupported_range& operator= (const vrange &r); + tree lbound () const final override; + tree ubo
[COMMITTED 05/16] Move bitmask routines to vrange base class.
Any range can theoretically have a bitmask of set bits. This patch moves the bitmask accessors to the base class. This cleans up some users in IPA*, and will provide a cleaner interface when prange is in place. gcc/ChangeLog: * ipa-cp.cc (propagate_bits_across_jump_function): Access bitmask through base class. (ipcp_store_vr_results): Same. * ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same. (ipcp_get_parm_bits): Same. (ipcp_update_vr): Same. * range-op-mixed.h (update_known_bitmask): Change argument to vrange. * range-op.cc (update_known_bitmask): Same. * value-range.cc (vrange::update_bitmask): New. (irange::set_nonzero_bits): Move to vrange class. (irange::get_nonzero_bits): Same. * value-range.h (class vrange): Add update_bitmask, get_bitmask, get_nonzero_bits, and set_nonzero_bits. (class irange): Make bitmask methods virtual overrides. (class Value_Range): Add get_bitmask and update_bitmask. --- gcc/ipa-cp.cc| 9 +++-- gcc/ipa-prop.cc | 10 -- gcc/range-op-mixed.h | 2 +- gcc/range-op.cc | 4 ++-- gcc/value-range.cc | 16 ++-- gcc/value-range.h| 14 +- 6 files changed, 33 insertions(+), 22 deletions(-) diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc index b7add455bd5..a688dced5c9 100644 --- a/gcc/ipa-cp.cc +++ b/gcc/ipa-cp.cc @@ -2485,8 +2485,7 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int idx, jfunc->m_vr->get_vrange (vr); if (!vr.undefined_p () && !vr.varying_p ()) { - irange &r = as_a (vr); - irange_bitmask bm = r.get_bitmask (); + irange_bitmask bm = vr.get_bitmask (); widest_int mask = widest_int::from (bm.mask (), TYPE_SIGN (parm_type)); widest_int value @@ -6346,14 +6345,13 @@ ipcp_store_vr_results (void) { Value_Range tmp = plats->m_value_range.m_vr; tree type = ipa_get_type (info, i); - irange &r = as_a (tmp); irange_bitmask bm (wide_int::from (bits->get_value (), TYPE_PRECISION (type), TYPE_SIGN (type)), wide_int::from (bits->get_mask (), TYPE_PRECISION (type), TYPE_SIGN (type))); - r.update_bitmask (bm); + tmp.update_bitmask (bm); ipa_vr vr (tmp); ts->m_vr->quick_push (vr); } @@ -6368,14 +6366,13 @@ ipcp_store_vr_results (void) tree type = ipa_get_type (info, i); Value_Range tmp; tmp.set_varying (type); - irange &r = as_a (tmp); irange_bitmask bm (wide_int::from (bits->get_value (), TYPE_PRECISION (type), TYPE_SIGN (type)), wide_int::from (bits->get_mask (), TYPE_PRECISION (type), TYPE_SIGN (type))); - r.update_bitmask (bm); + tmp.update_bitmask (bm); ipa_vr vr (tmp); ts->m_vr->quick_push (vr); } diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 374e998aa64..b57f9750431 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -2381,8 +2381,7 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi, irange_bitmask bm (value, mask); if (!addr_nonzero) vr.set_varying (TREE_TYPE (arg)); - irange &r = as_a (vr); - r.update_bitmask (bm); + vr.update_bitmask (bm); ipa_set_jfunc_vr (jfunc, vr); } else if (addr_nonzero) @@ -5785,8 +5784,8 @@ ipcp_get_parm_bits (tree parm, tree *value, widest_int *mask) vr[i].get_vrange (tmp); if (tmp.undefined_p () || tmp.varying_p ()) return false; - irange &r = as_a (tmp); - irange_bitmask bm = r.get_bitmask (); + irange_bitmask bm; + bm = tmp.get_bitmask (); *mask = widest_int::from (bm.mask (), TYPE_SIGN (TREE_TYPE (parm))); *value = wide_int_to_tree (TREE_TYPE (parm), bm.value ()); return true; @@ -5857,8 +5856,7 @@ ipcp_update_vr (struct cgraph_node *node, ipcp_transformation *ts) if (POINTER_TYPE_P (TREE_TYPE (parm)) && opt_for_fn (node->decl, flag_ipa_bit_cp)) { - irange &r = as_a (tmp); - irange_bitmask bm = r.get_bitmask (); + irange_bitmask bm = tmp.get_bitmask (); unsigned tem = bm.mask ().to_uhwi (); unsigned HOST_WIDE_I
[COMMITTED 06/16] Remove GTY support for vrange and derived classes.
Now that we have a vrange storage class to save ranges in long-term memory, there is no need for GTY markers for any of the vrange classes, since they should never live in GC. gcc/ChangeLog: * value-range-storage.h: Remove friends. * value-range.cc (gt_ggc_mx): Remove. (gt_pch_nx): Remove. * value-range.h (class vrange): Remove GTY markers. (class irange): Same. (class int_range): Same. (class frange): Same. (gt_ggc_mx): Remove. (gt_pch_nx): Remove. --- gcc/value-range-storage.h | 4 --- gcc/value-range.cc| 73 --- gcc/value-range.h | 46 +++- 3 files changed, 4 insertions(+), 119 deletions(-) diff --git a/gcc/value-range-storage.h b/gcc/value-range-storage.h index d94c520aa73..5756de7e32d 100644 --- a/gcc/value-range-storage.h +++ b/gcc/value-range-storage.h @@ -75,10 +75,6 @@ private: static size_t size (const irange &r); const unsigned short *lengths_address () const; unsigned short *write_lengths_address (); - friend void gt_ggc_mx_irange_storage (void *); - friend void gt_pch_p_14irange_storage (void *, void *, - gt_pointer_operator, void *); - friend void gt_pch_nx_irange_storage (void *); // The shared precision of each number. unsigned short m_precision; diff --git a/gcc/value-range.cc b/gcc/value-range.cc index 926f7b707ea..b901c864a7b 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -2165,79 +2165,6 @@ vrp_operand_equal_p (const_tree val1, const_tree val2) return true; } -void -gt_ggc_mx (irange *x) -{ - if (!x->undefined_p ()) -gt_ggc_mx (x->m_type); -} - -void -gt_pch_nx (irange *x) -{ - if (!x->undefined_p ()) -gt_pch_nx (x->m_type); -} - -void -gt_pch_nx (irange *x, gt_pointer_operator op, void *cookie) -{ - for (unsigned i = 0; i < x->m_num_ranges; ++i) -{ - op (&x->m_base[i * 2], NULL, cookie); - op (&x->m_base[i * 2 + 1], NULL, cookie); -} -} - -void -gt_ggc_mx (frange *x) -{ - gt_ggc_mx (x->m_type); -} - -void -gt_pch_nx (frange *x) -{ - gt_pch_nx (x->m_type); -} - -void -gt_pch_nx (frange *x, gt_pointer_operator op, void *cookie) -{ - op (&x->m_type, NULL, cookie); -} - -void -gt_ggc_mx (vrange *x) -{ - if (is_a (*x)) -return gt_ggc_mx ((irange *) x); - if (is_a (*x)) -return gt_ggc_mx ((frange *) x); - gcc_unreachable (); -} - -void -gt_pch_nx (vrange *x) -{ - if (is_a (*x)) -return gt_pch_nx ((irange *) x); - if (is_a (*x)) -return gt_pch_nx ((frange *) x); - gcc_unreachable (); -} - -void -gt_pch_nx (vrange *x, gt_pointer_operator op, void *cookie) -{ - if (is_a (*x)) -gt_pch_nx ((irange *) x, op, cookie); - else if (is_a (*x)) -gt_pch_nx ((frange *) x, op, cookie); - else -gcc_unreachable (); -} - #define DEFINE_INT_RANGE_INSTANCE(N) \ template int_range::int_range(tree_node *, \ const wide_int &,\ diff --git a/gcc/value-range.h b/gcc/value-range.h index 991ffeafcb8..2650ded6d10 100644 --- a/gcc/value-range.h +++ b/gcc/value-range.h @@ -72,7 +72,7 @@ enum value_range_discriminator // if (f.supports_type_p (type)) ... //} -class GTY((user)) vrange +class vrange { template friend bool is_a (vrange &); friend class Value_Range; @@ -279,7 +279,7 @@ irange_bitmask::intersect (const irange_bitmask &orig_src) // An integer range without any storage. -class GTY((user)) irange : public vrange +class irange : public vrange { friend value_range_kind get_legacy_range (const irange &, tree &, tree &); friend class irange_storage; @@ -350,10 +350,6 @@ protected: // Hard limit on max ranges allowed. static const int HARD_MAX_RANGES = 255; private: - friend void gt_ggc_mx (irange *); - friend void gt_pch_nx (irange *); - friend void gt_pch_nx (irange *, gt_pointer_operator, void *); - bool varying_compatible_p () const; bool intersect_bitmask (const irange &r); bool union_bitmask (const irange &r); @@ -379,7 +375,7 @@ protected: // HARD_MAX_RANGES. This new storage is freed upon destruction. template -class GTY((user)) int_range : public irange +class int_range : public irange { public: int_range (); @@ -484,13 +480,10 @@ nan_state::neg_p () const // The representation is a type with a couple of endpoints, unioned // with the set of { -NAN, +Nan }. -class GTY((user)) frange : public vrange +class frange : public vrange { friend class frange_storage; friend class vrange_printer; - friend void gt_ggc_mx (frange *); - friend void gt_pch_nx (frange *); - friend void gt_pch_nx (frange *, gt_pointer_operator, void *); public: frange (); frange (const frange &); @@ -991,37 +984,6 @@ range_includes_zero_p (const irange *vr) return vr->contains_p (zero); } -extern void gt_ggc_mx (vrange *); -extern v
[COMMITTED 03/16] Make some Value_Range's explicitly integer.
Fix some Value_Range's that we know ahead of time will be only integers. This avoids using the polymorphic Value_Range unnecessarily gcc/ChangeLog: * gimple-ssa-warn-access.cc (check_nul_terminated_array): Make Value_Range an int_range. (memmodel_to_uhwi): Same * tree-ssa-loop-niter.cc (refine_value_range_using_guard): Same. (determine_value_range): Same. (infer_loop_bounds_from_signedness): Same. (scev_var_range_cant_overflow): Same. --- gcc/gimple-ssa-warn-access.cc | 4 ++-- gcc/tree-ssa-loop-niter.cc| 12 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index dedaae27b31..450c1caa765 100644 --- a/gcc/gimple-ssa-warn-access.cc +++ b/gcc/gimple-ssa-warn-access.cc @@ -330,7 +330,7 @@ check_nul_terminated_array (GimpleOrTree expr, tree src, tree bound) wide_int bndrng[2]; if (bound) { - Value_Range r (TREE_TYPE (bound)); + int_range<2> r (TREE_TYPE (bound)); get_range_query (cfun)->range_of_expr (r, bound); @@ -2816,7 +2816,7 @@ memmodel_to_uhwi (tree ord, gimple *stmt, unsigned HOST_WIDE_INT *cstval) { /* Use the range query to determine constant values in the absence of constant propagation (such as at -O0). */ - Value_Range rng (TREE_TYPE (ord)); + int_range<2> rng (TREE_TYPE (ord)); if (!get_range_query (cfun)->range_of_expr (rng, ord, stmt) || !rng.singleton_p (&ord)) return false; diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc index c6d010f6d89..cbc9dbc5a1f 100644 --- a/gcc/tree-ssa-loop-niter.cc +++ b/gcc/tree-ssa-loop-niter.cc @@ -214,7 +214,7 @@ refine_value_range_using_guard (tree type, tree var, get_type_static_bounds (type, mint, maxt); mpz_init (minc1); mpz_init (maxc1); - Value_Range r (TREE_TYPE (varc1)); + int_range<2> r (TREE_TYPE (varc1)); /* Setup range information for varc1. */ if (integer_zerop (varc1)) { @@ -368,7 +368,7 @@ determine_value_range (class loop *loop, tree type, tree var, mpz_t off, gphi_iterator gsi; /* Either for VAR itself... */ - Value_Range var_range (TREE_TYPE (var)); + int_range<2> var_range (TREE_TYPE (var)); get_range_query (cfun)->range_of_expr (var_range, var); if (var_range.varying_p () || var_range.undefined_p ()) rtype = VR_VARYING; @@ -382,7 +382,7 @@ determine_value_range (class loop *loop, tree type, tree var, mpz_t off, /* Or for PHI results in loop->header where VAR is used as PHI argument from the loop preheader edge. */ - Value_Range phi_range (TREE_TYPE (var)); + int_range<2> phi_range (TREE_TYPE (var)); for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next (&gsi)) { gphi *phi = gsi.phi (); @@ -408,7 +408,7 @@ determine_value_range (class loop *loop, tree type, tree var, mpz_t off, involved. */ if (wi::gt_p (minv, maxv, sgn)) { - Value_Range vr (TREE_TYPE (var)); + int_range<2> vr (TREE_TYPE (var)); get_range_query (cfun)->range_of_expr (vr, var); if (vr.varying_p () || vr.undefined_p ()) rtype = VR_VARYING; @@ -4367,7 +4367,7 @@ infer_loop_bounds_from_signedness (class loop *loop, gimple *stmt) low = lower_bound_in_type (type, type); high = upper_bound_in_type (type, type); - Value_Range r (TREE_TYPE (def)); + int_range<2> r (TREE_TYPE (def)); get_range_query (cfun)->range_of_expr (r, def); if (!r.varying_p () && !r.undefined_p ()) { @@ -5426,7 +5426,7 @@ scev_var_range_cant_overflow (tree var, tree step, class loop *loop) if (!def_bb || !dominated_by_p (CDI_DOMINATORS, loop->latch, def_bb)) return false; - Value_Range r (TREE_TYPE (var)); + int_range<2> r (TREE_TYPE (var)); get_range_query (cfun)->range_of_expr (r, var); if (r.varying_p () || r.undefined_p ()) return false; -- 2.44.0
[PATCH] expmed: TRUNCATE value1 if needed in store_bit_field_using_insv
PR target/113179. In `store_bit_field_using_insv`, we just use SUBREG if value_mode >= op_mode, while in some ports, a sign_extend will be needed, such as MIPS64: If either GPR rs or GPR rt does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE. The problem happens for the code like: struct xx { int a:4; int b:24; int c:3; int d:1; }; void xx (struct xx *a, long long b) { a->d = b; } In the above code, the hard register contains `b`, may be note well sign-extended. gcc/ PR target/113179 * expmed.c(store_bit_field_using_insv): TRUNCATE value1 if needed. gcc/testsuite PR target/113179 * gcc.target/mips/pr113179.c: New tests. --- gcc/expmed.cc| 12 +--- gcc/testsuite/gcc.target/mips/pr113179.c | 18 ++ 2 files changed, 27 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/mips/pr113179.c diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 4ec035e4843..6a582593da8 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -704,9 +704,15 @@ store_bit_field_using_insv (const extraction_insn *insv, rtx op0, } else { - tmp = gen_lowpart_if_possible (op_mode, value1); - if (! tmp) - tmp = gen_lowpart (op_mode, force_reg (value_mode, value1)); + if (targetm.mode_rep_extended (op_mode, value_mode)) + tmp = simplify_gen_unary (TRUNCATE, op_mode, + value1, value_mode); + else + { + tmp = gen_lowpart_if_possible (op_mode, value1); + if (! tmp) + tmp = gen_lowpart (op_mode, force_reg (value_mode, value1)); + } } value1 = tmp; } diff --git a/gcc/testsuite/gcc.target/mips/pr113179.c b/gcc/testsuite/gcc.target/mips/pr113179.c new file mode 100644 index 000..f32c5a16765 --- /dev/null +++ b/gcc/testsuite/gcc.target/mips/pr113179.c @@ -0,0 +1,18 @@ +/* Check if the operand of INS is sign-extended on MIPS64. */ +/* { dg-options "-mips64r2 -mabi=64" } */ +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */ + +struct xx { +int a:1; +int b:24; +int c:6; +int d:1; +}; + +long long xx (struct xx *a, long long b) { +a->d = b; +return b+1; +} + +/* { dg-final { scan-assembler "\tsll\t\\\$3,\\\$5,0" } } */ +/* { dg-final { scan-assembler "\tdaddiu\t\\\$2,\\\$5,1" } } */ -- 2.39.2
Re: [PATCH 5/4] libbacktrace: improve getting debug information for loaded dlls
On Thu, Apr 25, 2024 at 1:15 PM Björn Schäpers wrote: > > > Attached is the combined version of the two patches, only implementing the > > variant with the tlhelp32 API. > > > > Tested on x86 and x86_64 windows. > > > > Kind regards, > > Björn. > > A friendly ping. Thanks. Committed as follows. Which of your other patches are still relevant? Thanks. Ian 942a9cf2a958113d2ab46f5b015c36e569abedcf diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac index 3e0075a2b79..59e9c415db8 100644 --- a/libbacktrace/configure.ac +++ b/libbacktrace/configure.ac @@ -380,6 +380,10 @@ if test "$have_loadquery" = "yes"; then fi AC_CHECK_HEADERS(windows.h) +AC_CHECK_HEADERS(tlhelp32.h, [], [], +[#ifdef HAVE_WINDOWS_H +# include +#endif]) # Check for the fcntl function. if test -n "${with_target_subdir}"; then diff --git a/libbacktrace/pecoff.c b/libbacktrace/pecoff.c index 9e437d810c7..4f267841178 100644 --- a/libbacktrace/pecoff.c +++ b/libbacktrace/pecoff.c @@ -49,6 +49,18 @@ POSSIBILITY OF SUCH DAMAGE. */ #endif #include + +#ifdef HAVE_TLHELP32_H +#include + +#ifdef UNICODE +/* If UNICODE is defined, all the symbols are replaced by a macro to use the + wide variant. But we need the ansi variant, so undef the macros. */ +#undef MODULEENTRY32 +#undef Module32First +#undef Module32Next +#endif +#endif #endif /* Coff file header. */ @@ -592,7 +604,8 @@ coff_syminfo (struct backtrace_state *state, uintptr_t addr, static int coff_add (struct backtrace_state *state, int descriptor, backtrace_error_callback error_callback, void *data, - fileline *fileline_fn, int *found_sym, int *found_dwarf) + fileline *fileline_fn, int *found_sym, int *found_dwarf, + uintptr_t module_handle ATTRIBUTE_UNUSED) { struct backtrace_view fhdr_view; off_t fhdr_off; @@ -870,12 +883,7 @@ coff_add (struct backtrace_state *state, int descriptor, } #ifdef HAVE_WINDOWS_H - { -uintptr_t module_handle; - -module_handle = (uintptr_t) GetModuleHandle (NULL); -base_address = module_handle - image_base; - } + base_address = module_handle - image_base; #endif if (!backtrace_dwarf_add (state, base_address, &dwarf_sections, @@ -917,12 +925,61 @@ backtrace_initialize (struct backtrace_state *state, int found_sym; int found_dwarf; fileline coff_fileline_fn; + uintptr_t module_handle = 0; +#ifdef HAVE_TLHELP32_H + fileline module_fileline_fn; + int module_found_sym; + HANDLE snapshot; +#endif + +#ifdef HAVE_WINDOWS_H + module_handle = (uintptr_t) GetModuleHandle (NULL); +#endif ret = coff_add (state, descriptor, error_callback, data, - &coff_fileline_fn, &found_sym, &found_dwarf); + &coff_fileline_fn, &found_sym, &found_dwarf, module_handle); if (!ret) return 0; +#ifdef HAVE_TLHELP32_H + do +{ + snapshot = CreateToolhelp32Snapshot (TH32CS_SNAPMODULE, 0); +} + while (snapshot == INVALID_HANDLE_VALUE +&& GetLastError () == ERROR_BAD_LENGTH); + + if (snapshot != INVALID_HANDLE_VALUE) +{ + MODULEENTRY32 entry; + BOOL ok; + entry.dwSize = sizeof (MODULEENTRY32); + + for (ok = Module32First (snapshot, &entry); ok; ok = Module32Next (snapshot, &entry)) + { + if (strcmp (filename, entry.szExePath) == 0) + continue; + + module_handle = (uintptr_t) entry.hModule; + if (module_handle == 0) + continue; + + descriptor = backtrace_open (entry.szExePath, error_callback, data, + NULL); + if (descriptor < 0) + continue; + + coff_add (state, descriptor, error_callback, data, + &module_fileline_fn, &module_found_sym, &found_dwarf, + module_handle); + if (module_found_sym) + found_sym = 1; + } + + CloseHandle (snapshot); +} +#endif + if (!state->threaded) { if (found_sym)
RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val
Kindly ping^^ for this ice fix. Pan -Original Message- From: Li, Pan2 Sent: Thursday, April 18, 2024 9:46 AM To: Jeff Law ; Robin Dapp ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; Liu, Hongtao Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val Kindly ping^ for this ice fix. Pan -Original Message- From: Li, Pan2 Sent: Saturday, April 6, 2024 8:02 PM To: Li, Pan2 ; Jeff Law ; Robin Dapp ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; Wang, Yanzhang ; Liu, Hongtao Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val Kindly ping for this ice. Pan -Original Message- From: Li, Pan2 Sent: Saturday, March 23, 2024 1:45 PM To: Jeff Law ; Robin Dapp ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; Wang, Yanzhang ; Liu, Hongtao Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val Thanks Jeff for comments. > As Richi noted using validate_subreg here isn't great. Does it work to > factor out this code from extract_low_bits > >> if (!int_mode_for_mode (src_mode).exists (&src_int_mode) >> || !int_mode_for_mode (mode).exists (&int_mode)) >> return NULL_RTX; >> >> if (!targetm.modes_tieable_p (src_int_mode, src_mode)) >> return NULL_RTX; >> if (!targetm.modes_tieable_p (int_mode, mode)) >> return NULL_RTX; > And use that in the condition (and in extract_low_bits rather than > duplicating the code)? It can solve the ICE but will forbid all vector modes goes gen_lowpart. Actually only the vector mode size is less than reg nature size will trigger the ICE. Thus, how about just add one more condition before goes to gen_lowpart as below? Feel free to correct me if any misunderstandings. 😉! diff --git a/gcc/dse.cc b/gcc/dse.cc index edc7a1dfecf..258d2ccc299 100644 --- a/gcc/dse.cc +++ b/gcc/dse.cc @@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode read_mode, copy_rtx (store_info->const_rhs)); else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode) && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode)) -&& targetm.modes_tieable_p (read_mode, store_mode)) +&& targetm.modes_tieable_p (read_mode, store_mode) +/* It's invalid in validate_subreg if read_mode size is < reg natural. */ +&& known_ge (GET_MODE_SIZE (read_mode), REGMODE_NATURAL_SIZE (read_mode))) read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs)); else read_reg = extract_low_bits (read_mode, store_mode, Pan -Original Message- From: Jeff Law Sent: Saturday, March 23, 2024 2:54 AM To: Li, Pan2 ; Robin Dapp ; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; Wang, Yanzhang ; Liu, Hongtao Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val On 3/4/24 11:22 PM, Li, Pan2 wrote: > Thanks Jeff for comments. > >> But in the case of a vector modes, we can usually reinterpret the >> underlying bits in whatever mode we want and do any of the usual >> operations on those bits. > > Yes, I think that is why we can allow vector mode in get_stored_val if my > understanding is correct. > And then the different modes will return by gen_low_part. Unfortunately, > there are some modes > (less than a vector bit size like V2SF, V2QI for vlen=128) are considered > as invalid by validate_subreg, > and return NULL_RTX result in the final ICE. That doesn't make a lot of sense to me. Even for vlen=128 I would have expected that we can still use a subreg to access low bits. After all we might have had a V16QI vector and done a reduction of some sort storing the result in the first element and we have to be able to extract that result and move it around. I'm not real keen on a target workaround. While extremely safe, I wouldn't be surprised if other ports could trigger the ICE and we'd end up patching up multiple targets for what is, IMHO, a more generic issue. As Richi noted using validate_subreg here isn't great. Does it work to factor out this code from extract_low_bits: > if (!int_mode_for_mode (src_mode).exists (&src_int_mode) > || !int_mode_for_mode (mode).exists (&int_mode)) > return NULL_RTX; > > if (!targetm.modes_tieable_p (src_int_mode, src_mode)) > return NULL_RTX; > if (!targetm.modes_tieable_p (int_mode, mode)) > return NULL_RTX; And use that in the condition (and in extract_low_bits rather than duplicating the code)? jeff ps. No need to apologize for the pings. This completely fell off my radar.
RE: [PATCH v2] Internal-fn: Introduce new internal function SAT_ADD
Kinding ping for SAT_ADD. Pan -Original Message- From: Li, Pan2 Sent: Sunday, April 7, 2024 3:03 PM To: gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, Yanzhang ; tamar.christ...@arm.com; richard.guent...@gmail.com; Liu, Hongtao ; Li, Pan2 Subject: [PATCH v2] Internal-fn: Introduce new internal function SAT_ADD From: Pan Li Update in v2: * Fix one failure for x86 bootstrap. Original log: This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: * SAT_ADD (1, 254) => 255. * SAT_ADD (1, 255) => 255. * SAT_ADD (2, 255) => 255. * SAT_ADD (255, 255) => 255. The patch also implement the SAT_ADD in the riscv backend as the sample for both the scalar and vector. Given below example: uint64_t sat_add_u64 (uint64_t x, uint64_t y) { return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x)); } Before this patch: uint64_t sat_add_uint64_t (uint64_t x, uint64_t y) { long unsigned int _1; _Bool _2; long unsigned int _3; long unsigned int _4; uint64_t _7; long unsigned int _10; __complex__ long unsigned int _11; ;; basic block 2, loop depth 0 ;;pred: ENTRY _11 = .ADD_OVERFLOW (x_5(D), y_6(D)); _1 = REALPART_EXPR <_11>; _10 = IMAGPART_EXPR <_11>; _2 = _10 != 0; _3 = (long unsigned int) _2; _4 = -_3; _7 = _1 | _4; return _7; ;;succ: EXIT } After this patch: uint64_t sat_add_uint64_t (uint64_t x, uint64_t y) { uint64_t _7; ;; basic block 2, loop depth 0 ;;pred: ENTRY _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call] return _7; ;;succ: EXIT } For vectorize, we leverage the existing vect pattern recog to find the pattern similar to scalar and let the vectorizer to perform the rest part for standard name usadd3 in vector mode. The riscv vector backend have insn "Vector Single-Width Saturating Add and Subtract" which can be leveraged when expand the usadd3 in vector mode. For example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i])); } Before this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]); ivtmp_58 = _80 * 8; vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0); vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0); vect__7.11_66 = vect__4.7_61 + vect__6.10_65; mask__8.12_67 = vect__4.7_61 > vect__7.11_66; vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66); .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72); vectp_x.5_60 = vectp_x.5_59 + ivtmp_58; vectp_y.8_64 = vectp_y.8_63 + ivtmp_58; vectp_out.16_75 = vectp_out.16_74 + ivtmp_58; ivtmp_79 = ivtmp_78 - _80; ... } vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v0,0(a1) vle64.v v1,0(a2) sllia4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vadd.vv v1,v0,v1 vmsgtu.vv v0,v0,v1 vmerge.vim v1,v1,-1,v0 vse64.v v1,0(a0) ... After this patch: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { ... _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]); ivtmp_46 = _62 * 8; vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0); vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0); vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53); .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54); ... } vec_sat_add_u64: ... vsetvli a5,a3,e64,m1,ta,ma vle64.v v1,0(a1) vle64.v v2,0(a2) sllia4,a5,3 sub a3,a3,a5 add a1,a1,a4 add a2,a2,a4 vsaddu.vv v1,v1,v2 vse64.v v1,0(a0) ... To limit the patch size for review, only unsigned version of usadd3 are involved here. The signed version will be covered in the underlying patch(es). The below test suites are passed for this patch. * The riscv fully regression tests. * The aarch64 fully regression tests. * The x86 bootstrap tests. * The x86 fully regression tests. PR target/51492 PR target/112600 gcc/ChangeLog: * config/riscv/autovec.md (usadd3): New pattern expand for unsigned SAT_ADD vector. * config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl to expand usadd3 pattern. (expand_vec_usadd): Ditto but for vector. * config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit the vsadd insn. (expand_vec_usadd): New func impl to expand usadd3 for vector. * config/riscv/riscv.cc (riscv_expand_usadd): New func impl to expand usadd
[pushed] doc: Update David Binderman's entry in contrib.texi
gcc/ChangeLog: * doc/contrib.texi: Update David Binderman's entry. Pushed. Gerald --- gcc/doc/contrib.texi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/doc/contrib.texi b/gcc/doc/contrib.texi index 2a15fd05883..32e89d6df25 100644 --- a/gcc/doc/contrib.texi +++ b/gcc/doc/contrib.texi @@ -64,8 +64,8 @@ improved alias analysis, plus migrating GCC to Bugzilla. Geoff Berry for his Java object serialization work and various patches. @item -David Binderman tests weekly snapshots of GCC trunk against Fedora Rawhide -for several architectures. +David Binderman for testing GCC trunk against Fedora Rawhide +and csmith. @item Laurynas Biveinis for memory management work and DJGPP port fixes. -- 2.44.0
Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128
Hi, on 2024/4/28 16:20, Alexandre Oliva wrote: > On Apr 23, 2024, "Kewen.Lin" wrote: > >> This patch seemed to miss to CC gcc-patches list. :) > > Oops, sorry, thanks for catching that. > > Here it is. FTR, you've already responded suggesting an apparent > preference for addressing PR105359, but since I meant to contribute it, > I'm reposting is to gcc-patches, now with a reference to the PR. OK, from this perspective IMHO it seems more clear to adopt xfail with effective target long_double_64bit? BR, Kewen > > > ppc: testsuite: pr79004 needs -mlong-double-128 > > Some of the asm opcodes expected by pr79004 depend on > -mlong-double-128 to be output. E.g., without this flag, the > conditions of patterns @extenddf2 and extendsf2 do not > hold, and so GCC resorts to libcalls instead of even trying > rs6000_expand_float128_convert. > > Perhaps the conditions are too strict, and they could enable the use > of conversion insns involving __ieee128/_Float128 even with 64-bit > long doubles. Alas, for now, we need this flag for the test to pass > on target variants that use 64-bit long doubles. > > > for gcc/testsuite/ChangeLog > > * gcc.target/powerpr/pr79004.c: Add -mlong-double-128. > --- > gcc/testsuite/gcc.target/powerpc/pr79004.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr79004.c > b/gcc/testsuite/gcc.target/powerpc/pr79004.c > index e411702dc98a9..061a0e83fe2ad 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr79004.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr79004.c > @@ -1,6 +1,6 @@ > /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ > /* { dg-require-effective-target powerpc_p9vector_ok } */ > -/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128" } */ > +/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128 -mlong-double-128" } */ > /* { dg-prune-output ".-mfloat128. option may not be fully supported" } */ > > #include > >
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
Hi, on 2024/4/28 16:14, Alexandre Oliva wrote: > On Apr 24, 2024, "Kewen.Lin" wrote: > >> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one >> line above) >> shows the original intention of this case is to expect not profitable for >> peeling >> so it's not expected to be handled here, can we just tweak the loop bound >> instead, >> such as: > >> -#define N 14 >> +#define N 13 >> #define OFF 4 > >> ?, it can make this loop not profitable to be vectorized for !vect_no_align >> with >> peeling (both pwr7 and pwr6) and keep consistent. > > Like this? I didn't feel I could claim authorship of this one-liner > just because I turned it into a patch and tested it, so I took the > liberty of turning your own words above into the commit message. So Feel free to do so! > far, tested on ppc64le-linux-gnu (ppc9). Testing with vxworks targets > now. Would you like to tweak the commit message to your liking? OK, tweaked as below. > Otherwise, is this ok to install? > > Thanks, > > > adjust iteration count for ppc costmodel 76b Nit: Maybe add a prefix "testsuite: ". > > From: Kewen Lin Thanks, you can just drop this. :) > > The original intention of this case is to expect not profitable for > peeling. Tweak the loop bound to make this loop not profitable to be > vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and > keep consistent. For some hardware which doesn't support unaligned vector memory access, test case costmodel-vect-76b.c expects to see cost modeling would make the decision that it's not profitable for peeling, according to the commit history, test case comments and the way to check. For now, the existing loop bound 14 works well for Power7, but it does not for some targets on which the cost of operation vec_perm can be different from Power7, such as: Power6, it's 3 vs. 1. This difference further causes the difference (10 vs. 12) on the minimum iteration for profitability and cause the failure. To keep the original test point, this patch is to tweak the loop bound to ensure it's not profitable to be vectorized for !vect_no_align with peeling. OK for trunk (assuming the testings run well on p6/p7 too), thanks! BR, Kewen > > > for gcc/testsuite/ChangeLog > > * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak. > --- > .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c > b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c > index cbbfbb24658f8..e48b0ab759e75 100644 > --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c > +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c > @@ -6,7 +6,7 @@ > > /* On Power7 without misalign vector support, this case is to check it's not > profitable to perform vectorization by peeling to align the store. */ > -#define N 14 > +#define N 13 > #define OFF 4 > > /* Check handling of accesses for which the "initial condition" - > >
[PATCH] PR tree-opt/113673: Avoid load merging from potentially trapping additions.
This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been specified. When the operator is | or ^ this is safe, but for addition of signed integer types, a trap may be generated/required, so merging this idiom into a single non-trapping instruction is inappropriate, confusing the compiler by transforming a basic block with an exception edge into one without. One fix is to be more selective for PLUS_EXPR than for BIT_IOR_EXPR or BIT_XOR_EXPR in gimple-ssa-store-merging.cc's find_bswap_or_nop_1 function. An alternate solution might be to notice that in this idiom the addition can't overflow, but that this detail wasn't apparent when exception edges were added to the CFG. In which case, it's safe to remove (or mark for removal) the problematic exceptional edge. Unfortunately updating the CFG is a part of the compiler that I'm less familiar with. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-28 Roger Sayle gcc/ChangeLog PR tree-optimization/113673 * gimple-ssa-store-merging.cc (find_bswap_or_nop_1) : Don't perform load merging if a signed addition may trap. gcc/testsuite/ChangeLog PR tree-optimization/113673 * g++.dg/pr113673.C: New test case. Thanks in advance, Roger -- diff --git a/gcc/gimple-ssa-store-merging.cc b/gcc/gimple-ssa-store-merging.cc index cb0cb5f..41a1066 100644 --- a/gcc/gimple-ssa-store-merging.cc +++ b/gcc/gimple-ssa-store-merging.cc @@ -776,9 +776,16 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number *n, int limit) switch (code) { + case PLUS_EXPR: + /* Don't perform load merging if this addition can trap. */ + if (cfun->can_throw_non_call_exceptions + && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)) + && TYPE_OVERFLOW_TRAPS (TREE_TYPE (rhs1))) + return NULL; + /* Fallthru. */ + case BIT_IOR_EXPR: case BIT_XOR_EXPR: - case PLUS_EXPR: source_stmt1 = find_bswap_or_nop_1 (rhs1_stmt, &n1, limit - 1); if (!source_stmt1) diff --git a/gcc/testsuite/g++.dg/pr113673.C b/gcc/testsuite/g++.dg/pr113673.C new file mode 100644 index 000..1148977 --- /dev/null +++ b/gcc/testsuite/g++.dg/pr113673.C @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-Os -fnon-call-exceptions -ftrapv" } */ + +struct s { ~s(); }; +void +h (unsigned char *data, int c) +{ + s a1; + while (c) +{ + int m = *data++ << 8; + m += *data++; +} +}
[PATCH v3] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6
This patch adds the smin/smax RTL mode for the min/max.fmt instructions. Also, since the min/max.fmt instrucions applies to the IEEE 754-2008 "minNum" and "maxNum" operations, this patch also provides the new "fmin3" and "fmax3" modes. gcc/ChangeLog: * config/mips/i6400.md (i6400_fpu_minmax): New define_insn_reservation. * config/mips/mips.h (ISA_HAS_FMIN_FMAX): Define new macro. * config/mips/mips.md (UNSPEC_FMIN): New unspec. (UNSPEC_FMAX): Same as above. (type): Add fminmax. (smin3): Generates MIN.fmt instructions. (smax3): Generates MAX.fmt instructions. (fmin3): Generates MIN.fmt instructions. (fmax3): Generates MAX.fmt instructions. * config/mips/p6600.md (p6600_fpu_fabs): Include fminmax type. gcc/testsuite/ChangeLog: * gcc.target/mips/mips-minmax1.c: New test for MIPS R6. * gcc.target/mips/mips-minmax2.c: Same as above. --- gcc/config/mips/i6400.md | 6 +++ gcc/config/mips/mips.h | 2 + gcc/config/mips/mips.md | 50 +++- gcc/config/mips/p6600.md | 4 +- gcc/testsuite/gcc.target/mips/mips-minmax1.c | 40 gcc/testsuite/gcc.target/mips/mips-minmax2.c | 36 ++ 6 files changed, 134 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/mips/mips-minmax1.c create mode 100644 gcc/testsuite/gcc.target/mips/mips-minmax2.c diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md index 9f216fe0210..d6f691ee217 100644 --- a/gcc/config/mips/i6400.md +++ b/gcc/config/mips/i6400.md @@ -219,6 +219,12 @@ (eq_attr "type" "fabs,fneg,fmove")) "i6400_fpu_short, i6400_fpu_apu") +;; min, max +(define_insn_reservation "i6400_fpu_minmax" 2 + (and (eq_attr "cpu" "i6400") + (eq_attr "type" "fminmax")) + "i6400_fpu_short+i6400_fpu_logic") + ;; fadd, fsub, fcvt (define_insn_reservation "i6400_fpu_fadd" 4 (and (eq_attr "cpu" "i6400") diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 7145d23c650..5ce984ac99b 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -1259,6 +1259,8 @@ struct mips_cpu_info { #define ISA_HAS_9BIT_DISPLACEMENT (mips_isa_rev >= 6 \ || ISA_HAS_MIPS16E2) +#define ISA_HAS_FMIN_FMAX (mips_isa_rev >= 6) + /* ISA has data indexed prefetch instructions. This controls use of 'prefx', along with TARGET_HARD_FLOAT and TARGET_DOUBLE_FLOAT. (prefx is a cop1x instruction, so can only be used if FP is diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index b0fb5850a9e..26f758c90dd 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -97,6 +97,10 @@ UNSPEC_GET_FCSR UNSPEC_SET_FCSR + ;; Floating-point unspecs. + UNSPEC_FMIN + UNSPEC_FMAX + ;; HI/LO moves. UNSPEC_MFHI UNSPEC_MTHI @@ -370,6 +374,7 @@ ;; frsqrt floating point reciprocal square root ;; frsqrt1 floating point reciprocal square root step1 ;; frsqrt2 floating point reciprocal square root step2 +;; fminmax floating point min/max ;; dspmac DSP MAC instructions not saturating the accumulator ;; dspmacsatDSP MAC instructions that saturate the accumulator ;; accext DSP accumulator extract instructions @@ -387,8 +392,8 @@ prefetch,prefetchx,condmove,mtc,mfc,mthi,mtlo,mfhi,mflo,const,arith,logical, shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move, fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt, - frsqrt,frsqrt1,frsqrt2,dspmac,dspmacsat,accext,accmod,dspalu,dspalusat, - multi,atomic,syncloop,nop,ghost,multimem, + frsqrt,frsqrt1,frsqrt2,fminmax,dspmac,dspmacsat,accext,accmod,dspalu, + dspalusat,multi,atomic,syncloop,nop,ghost,multimem, simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd, simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp, simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill, @@ -7971,6 +7976,47 @@ [(set_attr "move_type" "load") (set_attr "insn_count" "2")]) +;; +;; Float point MIN/MAX +;; + +(define_insn "smin3" + [(set (match_operand:SCALARF 0 "register_operand" "=f") + (smin:SCALARF (match_operand:SCALARF 1 "register_operand" "f") + (match_operand:SCALARF 2 "register_operand" "f")))] + "ISA_HAS_FMIN_FMAX" + "min.\t%0,%1,%2" + [(set_attr "type" "fminmax") + (set_attr "mode" "")]) + +(define_insn "smax3" + [(set (match_operand:SCALARF 0 "register_operand" "=f") + (smax:SCALARF (match_operand:SCALARF 1 "register_operand" "f") + (match_operand:SCALARF 2 "register_operand" "f")))] + "ISA_HAS_FMIN_FMAX" + "max.\t%0,%1,%2" + [(set_attr "type" "fminmax") + (set_attr "mode" "")]) + +(define_insn "fmin3" + [(set (match_operand:SCALARF 0 "register_operand"
[PATCH] make -freg-struct-return visibly a negative alias of -fpcc-struct-return
The fact that both options accept negative forms suggests that maybe they aren't negative forms of each other. They are, but that isn't clear even by examining common.opt. Use NegativeAlias to make it abundantly clear. The 'Optimization' keyword next to freg-struct-return was the only thing that caused flag_pcc_struct_return to be a per-function flag, and ipa-inline relied on that. After making it an alias, the Optimization keyword was no longer operational. I'm not sure it was sensible or desirable for flag_pcc_struct_return to be a per-function setting, but this patch does not intend to change behavior. Regstrapped on x86_64-linux-gnu and ppc64le-linux-gnu. Ok to install? for gcc/ChangeLog * common.opt (freg-struct-return): Make it explicitly fpcc-struct-return's NegativeAlias. Copy Optimization... (freg-struct-return): ... here. --- gcc/common.opt |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/common.opt b/gcc/common.opt index ad3488447752b..12d93c76a1e63 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2406,7 +2406,7 @@ Common RejectNegative Joined UInteger Optimization -fpack-struct= Set initial maximum structure member alignment. fpcc-struct-return -Common Var(flag_pcc_struct_return,1) Init(DEFAULT_PCC_STRUCT_RETURN) +Common Var(flag_pcc_struct_return,1) Init(DEFAULT_PCC_STRUCT_RETURN) Optimization Return small aggregates in memory, not registers. fpeel-loops @@ -2596,7 +2596,7 @@ Common Var(flag_record_gcc_switches) Record gcc command line switches in the object file. freg-struct-return -Common Var(flag_pcc_struct_return,0) Optimization +Common NegativeAlias Alias(fpcc_struct_return) Optimization Return small aggregates in registers. fregmove -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128
On Apr 23, 2024, "Kewen.Lin" wrote: > This patch seemed to miss to CC gcc-patches list. :) Oops, sorry, thanks for catching that. Here it is. FTR, you've already responded suggesting an apparent preference for addressing PR105359, but since I meant to contribute it, I'm reposting is to gcc-patches, now with a reference to the PR. ppc: testsuite: pr79004 needs -mlong-double-128 Some of the asm opcodes expected by pr79004 depend on -mlong-double-128 to be output. E.g., without this flag, the conditions of patterns @extenddf2 and extendsf2 do not hold, and so GCC resorts to libcalls instead of even trying rs6000_expand_float128_convert. Perhaps the conditions are too strict, and they could enable the use of conversion insns involving __ieee128/_Float128 even with 64-bit long doubles. Alas, for now, we need this flag for the test to pass on target variants that use 64-bit long doubles. for gcc/testsuite/ChangeLog * gcc.target/powerpr/pr79004.c: Add -mlong-double-128. --- gcc/testsuite/gcc.target/powerpc/pr79004.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/powerpc/pr79004.c b/gcc/testsuite/gcc.target/powerpc/pr79004.c index e411702dc98a9..061a0e83fe2ad 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr79004.c +++ b/gcc/testsuite/gcc.target/powerpc/pr79004.c @@ -1,6 +1,6 @@ /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ /* { dg-require-effective-target powerpc_p9vector_ok } */ -/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128" } */ +/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128 -mlong-double-128" } */ /* { dg-prune-output ".-mfloat128. option may not be fully supported" } */ #include -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: [PATCH] [x86] Adjust alternative *k to ?k for avx512 mask in zero_extend patterns
On Sun, Apr 28, 2024 at 7:47 AM liuhongt wrote: > > So when both source operand and dest operand require avx512 MASK_REGS, RA > can allocate MASK_REGS register instead of GPR to avoid reload it from > GPR to MASK_REGS. > It's similar as what did for logic patterns. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > * config/i386/i386.md: (zero_extendsidi2): Adjust > alternative *k to ?k. > (zero_extenddi2): Ditto. > (*zero_extendsi2): Ditto. > (*zero_extendqihi2): Ditto. OK. Thanks, Uros. > --- > gcc/config/i386/i386.md | 16 +++ > .../gcc.target/i386/zero_extendkmask.c| 43 +++ > 2 files changed, 51 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/zero_extendkmask.c > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index d4ce3809e6d..f2ab7fdcd58 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -4567,10 +4567,10 @@ (define_expand "zero_extendsidi2" > > (define_insn "*zero_extendsidi2" >[(set (match_operand:DI 0 "nonimmediate_operand" > - "=r,?r,?o,r ,o,?*y,?!*y,$r,$v,$x,*x,*v,*r,*k") > + "=r,?r,?o,r ,o,?*y,?!*y,$r,$v,$x,*x,*v,?r,?k") > (zero_extend:DI > (match_operand:SI 1 "x86_64_zext_operand" > - "0 ,rm,r ,rmWz,0,r ,m ,v ,r ,m ,*x,*v,*k,*km")))] > + "0 ,rm,r ,rmWz,0,r ,m ,v ,r ,m ,*x,*v,?k,?km")))] >"" > { >switch (get_attr_type (insn)) > @@ -4703,9 +4703,9 @@ (define_mode_attr kmov_isa >[(QI "avx512dq") (HI "avx512f") (SI "avx512bw") (DI "avx512bw")]) > > (define_insn "zero_extenddi2" > - [(set (match_operand:DI 0 "register_operand" "=r,*r,*k") > + [(set (match_operand:DI 0 "register_operand" "=r,?r,?k") > (zero_extend:DI > -(match_operand:SWI12 1 "nonimmediate_operand" "m,*k,*km")))] > +(match_operand:SWI12 1 "nonimmediate_operand" "m,?k,?km")))] >"TARGET_64BIT" >"@ > movz{l|x}\t{%1, %k0|%k0, %1} > @@ -4758,9 +4758,9 @@ (define_insn_and_split "zero_extendsi2_and" > (set_attr "mode" "SI")]) > > (define_insn "*zero_extendsi2" > - [(set (match_operand:SI 0 "register_operand" "=r,*r,*k") > + [(set (match_operand:SI 0 "register_operand" "=r,?r,?k") > (zero_extend:SI > - (match_operand:SWI12 1 "nonimmediate_operand" "m,*k,*km")))] > + (match_operand:SWI12 1 "nonimmediate_operand" "m,?k,?km")))] >"!(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))" >"@ > movz{l|x}\t{%1, %0|%0, %1} > @@ -4813,8 +4813,8 @@ (define_insn_and_split "zero_extendqihi2_and" > > ; zero extend to SImode to avoid partial register stalls > (define_insn "*zero_extendqihi2" > - [(set (match_operand:HI 0 "register_operand" "=r,*r,*k") > - (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" > "qm,*k,*km")))] > + [(set (match_operand:HI 0 "register_operand" "=r,?r,?k") > + (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" > "qm,?k,?km")))] >"!(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))" >"@ > movz{bl|x}\t{%1, %k0|%k0, %1} > diff --git a/gcc/testsuite/gcc.target/i386/zero_extendkmask.c > b/gcc/testsuite/gcc.target/i386/zero_extendkmask.c > new file mode 100644 > index 000..6b18980bbd1 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/zero_extendkmask.c > @@ -0,0 +1,43 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-march=x86-64-v4 -O2" } */ > +/* { dg-final { scan-assembler-not {(?n)shr[bwl]} } } */ > +/* { dg-final { scan-assembler-not {(?n)movz[bw]} } } */ > + > +#include > + > +__m512 > +foo (__m512d a, __m512d b, __m512 c, __m512 d) > +{ > + return _mm512_mask_mov_ps (c, (__mmask16) (_mm512_cmpeq_pd_mask (a, b) >> > 1), d); > +} > + > + > +__m512i > +foo1 (__m512d a, __m512d b, __m512i c, __m512i d) > +{ > + return _mm512_mask_mov_epi16 (c, (__mmask32) (_mm512_cmpeq_pd_mask (a, b) > >> 1), d); > +} > + > +__m512i > +foo2 (__m512d a, __m512d b, __m512i c, __m512i d) > +{ > + return _mm512_mask_mov_epi8 (c, (__mmask64) (_mm512_cmpeq_pd_mask (a, b) > >> 1), d); > +} > + > +__m512i > +foo3 (__m512 a, __m512 b, __m512i c, __m512i d) > +{ > + return _mm512_mask_mov_epi16 (c, (__mmask32) (_mm512_cmpeq_ps_mask (a, b) > >> 1), d); > +} > + > +__m512i > +foo4 (__m512 a, __m512 b, __m512i c, __m512i d) > +{ > + return _mm512_mask_mov_epi8 (c, (__mmask64) (_mm512_cmpeq_ps_mask (a, b) > >> 1), d); > +} > + > +__m512i > +foo5 (__m512i a, __m512i b, __m512i c, __m512i d) > +{ > + return _mm512_mask_mov_epi8 (c, (__mmask64) (_mm512_cmp_epi16_mask (a, b, > 5) >> 1), d); > +} > -- > 2.31.1 >
Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
On Apr 24, 2024, "Kewen.Lin" wrote: > For !has_arch_pwr7 case, it still adopts peeling but as the comment (one line > above) > shows the original intention of this case is to expect not profitable for > peeling > so it's not expected to be handled here, can we just tweak the loop bound > instead, > such as: > -#define N 14 > +#define N 13 > #define OFF 4 > ?, it can make this loop not profitable to be vectorized for !vect_no_align > with > peeling (both pwr7 and pwr6) and keep consistent. Like this? I didn't feel I could claim authorship of this one-liner just because I turned it into a patch and tested it, so I took the liberty of turning your own words above into the commit message. So far, tested on ppc64le-linux-gnu (ppc9). Testing with vxworks targets now. Would you like to tweak the commit message to your liking? Otherwise, is this ok to install? Thanks, adjust iteration count for ppc costmodel 76b From: Kewen Lin The original intention of this case is to expect not profitable for peeling. Tweak the loop bound to make this loop not profitable to be vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and keep consistent. for gcc/testsuite/ChangeLog * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak. --- .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c index cbbfbb24658f8..e48b0ab759e75 100644 --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c @@ -6,7 +6,7 @@ /* On Power7 without misalign vector support, this case is to check it's not profitable to perform vectorization by peeling to align the store. */ -#define N 14 +#define N 13 #define OFF 4 /* Check handling of accesses for which the "initial condition" - -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: enable sqrt insns for cdce3.c
On Apr 23, 2024, Hans-Peter Nilsson wrote: > (We could also fix the predicate description to actually say > "for all floating-point modes" and/or split the predicate into > mode-specific variants, etc. ;-) Yeah, I suppose that could make sense. > MMIX has sqrtdf2 but not sqrtsf2, and the latter is what's used > in cdce3.c. I see, thanks for the info. -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: [PATCH v2] [testsuite] require sqrt_insn effective target where needed
On Apr 23, 2024, Iain Sandoe wrote: >>> --- a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c >>> +++ b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c >>> @@ -1,6 +1,7 @@ >>> /* { dg-do run } */ >>> /* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */ >>> /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm >>> -mpowerpc-gpopt" } */ >>> +/* { dg-require-effective-target sqrt_insn } */ >> >> This change looks sensible to me. >> >> Nit: With the proposed change, I'd expect that we can remove the line for >> powerpc*-*-darwin*. >> >> CC Iain to confirm. > Indeed, the check for sqrt_insn fails and so the test is unsupported without > needing the separate > powerpc*-*-darwin* line, Thanks, here's the adjusted version I'm just about to push. [testsuite] require sqrt_insn effective target where needed Some tests fail on ppc and ppc64 when testing a compiler [with options for] for a CPU [emulator] that doesn't support the sqrt insn. The gcc.dg/cdce3.c is one in which the expected shrink-wrap optimization only takes place when the target CPU supports a sqrt insn. The gcc.target/powerpc/pr46728-1[0-4].c tests use -mpowerpc-gpopt and call sqrt(), which involves the sqrt insn that the target CPU under test may not support. Require a sqrt_insn effective target for all the affected tests. for gcc/testsuite/ChangeLog * gcc.dg/cdce3.c: Require sqrt_insn effective target. * gcc.target/powerpc/pr46728-10.c: Likewise. Drop darwin explicit skipping. * gcc.target/powerpc/pr46728-11.c: Likewise. Likewise. * gcc.target/powerpc/pr46728-13.c: Likewise. Likewise. * gcc.target/powerpc/pr46728-14.c: Likewise. Likewise. --- gcc/testsuite/gcc.dg/cdce3.c |3 ++- gcc/testsuite/gcc.target/powerpc/pr46728-10.c |2 +- gcc/testsuite/gcc.target/powerpc/pr46728-11.c |2 +- gcc/testsuite/gcc.target/powerpc/pr46728-13.c |2 +- gcc/testsuite/gcc.target/powerpc/pr46728-14.c |2 +- 5 files changed, 6 insertions(+), 5 deletions(-) diff --git a/gcc/testsuite/gcc.dg/cdce3.c b/gcc/testsuite/gcc.dg/cdce3.c index 601ddf055fd71..f759a95972e8b 100644 --- a/gcc/testsuite/gcc.dg/cdce3.c +++ b/gcc/testsuite/gcc.dg/cdce3.c @@ -1,7 +1,8 @@ /* { dg-do compile } */ /* { dg-require-effective-target hard_float } */ +/* { dg-require-effective-target sqrt_insn } */ /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized" } */ -/* { dg-final { scan-tree-dump "cdce3.c:11: \[^\n\r]* function call is shrink-wrapped into error conditions\." "cdce" } } */ +/* { dg-final { scan-tree-dump "cdce3.c:12: \[^\n\r]* function call is shrink-wrapped into error conditions\." "cdce" } } */ /* { dg-final { scan-tree-dump "sqrtf \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */ /* { dg-skip-if "doesn't have a sqrtf insn" { mmix-*-* } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c index 3be4728d333a4..c04a3101c113f 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c +++ b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c @@ -1,6 +1,6 @@ /* { dg-do run } */ -/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */ /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */ +/* { dg-require-effective-target sqrt_insn } */ #include diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-11.c b/gcc/testsuite/gcc.target/powerpc/pr46728-11.c index 43b6728a4b812..d0e3d60212194 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr46728-11.c +++ b/gcc/testsuite/gcc.target/powerpc/pr46728-11.c @@ -1,6 +1,6 @@ /* { dg-do run } */ -/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */ /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */ +/* { dg-require-effective-target sqrt_insn } */ #include diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-13.c b/gcc/testsuite/gcc.target/powerpc/pr46728-13.c index b9fd63973b728..2b9df737a9b0d 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr46728-13.c +++ b/gcc/testsuite/gcc.target/powerpc/pr46728-13.c @@ -1,6 +1,6 @@ /* { dg-do run } */ -/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */ /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */ +/* { dg-require-effective-target sqrt_insn } */ #include diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-14.c b/gcc/testsuite/gcc.target/powerpc/pr46728-14.c index 5a13bdb6c..e6836f515e4f8 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr46728-14.c +++ b/gcc/testsuite/gcc.target/powerpc/pr46728-14.c @@ -1,6 +1,6 @@ /* { dg-do run } */ -/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */ /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */ +/* { dg-require-effective-target sqrt_insn } */ #include -- Alexand
Re: [PATCH v2] xfail fetestexcept test - ppc always uses fcmpu
On Apr 23, 2024, "Kewen.Lin" wrote: >> --- a/gcc/testsuite/gcc.dg/torture/pr91323.c >> +++ b/gcc/testsuite/gcc.dg/torture/pr91323.c >> @@ -1,4 +1,5 @@ >> -/* { dg-do run } */ >> +/* { dg-do run { xfail powerpc*-*-* } } */ >> +/* The ppc xfail is because of PR target/58684. */ > OK, though the proposed comment is slightly different from what's in > the related commit r8-6445-g86145a19abf39f. :) Thanks! Oh, thanks for the pointer, that was easy to fix. Here's what I'm pushing momentarily... xfail fetestexcept test - ppc always uses fcmpu gcc.dg/torture/pr91323.c tests that a compare with NaNf doesn't set an exception using builtin compare intrinsics, and that it does when using regular compare operators. That doesn't seem to be expected to work on powerpc targets. It fails on GNU/Linux, it's marked to be skipped on AIX, and a similar test, gcc.dg/torture/pr93133.c, has the execution test xfailed for all of powerpc*-*-*. In this test, the functions that use intrinsics for the compare end up with the same code as the one that uses compare operators, using fcmpu, a floating compare that, unlike fcmpo, does not set the invalid operand exception for quiet NaN. I couldn't find any evidence that the rs6000 backend ever outputs fcmpo. Therefore, I'm adding the same execution xfail marker to this test. for gcc/testsuite/ChangeLog PR target/58684 * gcc.dg/torture/pr91323.c: Expect execution fail on powerpc*-*-*. --- gcc/testsuite/gcc.dg/torture/pr91323.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/torture/pr91323.c b/gcc/testsuite/gcc.dg/torture/pr91323.c index 1411fcaa3966c..4574342e728db 100644 --- a/gcc/testsuite/gcc.dg/torture/pr91323.c +++ b/gcc/testsuite/gcc.dg/torture/pr91323.c @@ -1,4 +1,5 @@ -/* { dg-do run } */ +/* { dg-do run { xfail powerpc*-*-* } } */ +/* remove the xfail for powerpc when pr58684 is fixed */ /* { dg-add-options ieee } */ /* { dg-require-effective-target fenv_exceptions } */ /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */ -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive
Re: [PATCH] ppc: testsuite: vec-mul requires vsx runtime
On Apr 23, 2024, "Kewen.Lin" wrote: >> -/* { dg-do run } */ >> +/* { dg-do compile { target { ! vsx_hw } } } */ >> +/* { dg-do run { target vsx_hw } } */ >> /* { dg-require-effective-target powerpc_vsx_ok } */ > Nit: It's useless to check powerpc_vsx_ok for vsx_hw, so powerpc_vsx_ok check > can be moved to be with ! vsx_hw. > OK with this nit tweaked, thanks! Thanks, here's what I'm pushing momentarily... ppc: testsuite: vec-mul requires vsx runtime vec-mul is an execution test, but it only requires a powerpc_vsx_ok effective target, which is enough only for compile tests. In order to check for runtime and execution environment support, we need to require vsx_hw. Make that a condition for execution, but still perform a compile test if the condition is not satisfied. for gcc/testsuite/ChangeLog * gcc.target/powerpc/vec-mul.c: Run on target vsx_hw, just compile otherwise. --- gcc/testsuite/gcc.target/powerpc/vec-mul.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.target/powerpc/vec-mul.c b/gcc/testsuite/gcc.target/powerpc/vec-mul.c index bfcaf80719d1d..aa0ef7aa45acc 100644 --- a/gcc/testsuite/gcc.target/powerpc/vec-mul.c +++ b/gcc/testsuite/gcc.target/powerpc/vec-mul.c @@ -1,5 +1,5 @@ -/* { dg-do run } */ -/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-do compile { target { { ! vsx_hw } && powerpc_vsx_ok } } } */ +/* { dg-do run { target vsx_hw } } */ /* { dg-options "-mvsx -O3" } */ /* Test that the vec_mul builtin works as expected. */ -- Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive