Re: [PATCH RFA] driver: fix validate_switches logic

2022-12-01 Thread Richard Biener via Gcc-patches
On Fri, Dec 2, 2022 at 7:37 AM Alexandre Oliva via Gcc-patches
 wrote:
>
> On Dec  1, 2022, Jason Merrill  wrote:
>
> > Once we see g*, starred is set.  Then we see %:, and it sees that as a
> > zero-length switch, which because starred is still set, matches any and all
> > command-line options.  So targets that use such a spec accept all options in
> > the driver, while ones that don't reject some, such as the recent
> > -nostdlib++.
>
> Woo, nice catch, thanks!
>
> I don't have authority to approve the patch,
> but it's ok as far as I'm concerned.

OK.

> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>    Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 
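
To make the failure mode Jason describes concrete, here is a minimal
sketch.  It is not the real gcc.cc code: spec_switch_matches and
buggy_zero_length_match are hypothetical names, and the matching rule is
simplified to "exact match, or prefix match when starred".

```cpp
#include <string>

// Hypothetical, simplified model of comparing a switch name taken from a
// spec with a command-line option (this is not the real driver code).
// A starred name such as "g*" matches every option that starts with "g".
bool spec_switch_matches(const std::string &name, bool starred,
                         const std::string &option)
{
  if (starred)
    return option.compare(0, name.size(), name) == 0;  // prefix match
  return option == name;                                // exact match
}

// The failure mode: a zero-length name combined with a stale "starred"
// flag is a prefix of every option, so every option validates.
bool buggy_zero_length_match(const std::string &option)
{
  return spec_switch_matches("", /*starred=*/true, option);
}
```

With this model, buggy_zero_length_match ("nostdlib++") is true, while the
same empty name without the stale flag matches nothing but the empty
option.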


Re: [PATCH 19/56] Revert "Move void_list_node init to common code". (8ff2a92a0450243e52d3299a13b30f208bafa7e0)

2022-12-01 Thread Richard Biener via Gcc-patches
On Fri, Dec 2, 2022 at 8:43 AM Richard Biener
 wrote:
>
> On Fri, Dec 2, 2022 at 1:23 AM Zopolis0  wrote:
> >
> > > But that looks like the correct thing to do.
> >
> > It's not. The patch I reverted changes it so that no matter what,
> > void_list_node = build_tree_list (NULL_TREE, void_type_node);.
> >
> > Before, each front-end set it in their own way, but they all set it
> > via void_list_node = build_tree_list (NULL_TREE, void_type_node); or a
> > synonym anyway. So while the patch made sense in a java-free context,
> > given that java sets it a different way, I can't see a world in which
> > this commit stays active and Java works, unless we find a way to set
> > it in tree.cc for every language except Java.
>
> The middle-end expects it to be this way, it's not correct for a frontend
> to define it in other ways.  That means you need to try to understand
> _why_ the frontend isn't happy with the middle-ends definition.
>
> How does Java build end_params_node?

Looking at the 4.7 tree it does

decl.c:  end_params_node = tree_cons (NULL_TREE, void_type_node, NULL_TREE);

that's exactly the same.

> Richard.
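
As a rough illustration of why the two spellings give the same node, here
is a toy model.  ToyListNode, toy_tree_cons and toy_build_tree_list are
hypothetical stand-ins, not GCC's tree API; the sketch only assumes that
building a one-element list is a cons with an empty chain.

```cpp
// Toy stand-in for a TREE_LIST node; all names here are hypothetical.
struct ToyListNode {
  const void *purpose;
  const void *value;
  const ToyListNode *chain;
};

// Models tree_cons (purpose, value, chain).
ToyListNode toy_tree_cons(const void *purpose, const void *value,
                          const ToyListNode *chain)
{
  return {purpose, value, chain};
}

// Models build_tree_list (purpose, value): a cons with an empty chain.
ToyListNode toy_build_tree_list(const void *purpose, const void *value)
{
  return toy_tree_cons(purpose, value, nullptr);
}
```

Under that assumption, toy_tree_cons (nullptr, v, nullptr) and
toy_build_tree_list (nullptr, v) agree field by field.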


Re: [PATCH 19/56] Revert "Move void_list_node init to common code". (8ff2a92a0450243e52d3299a13b30f208bafa7e0)

2022-12-01 Thread Richard Biener via Gcc-patches
On Fri, Dec 2, 2022 at 1:23 AM Zopolis0  wrote:
>
> > But that looks like the correct thing to do.
>
> It's not. The patch I reverted changes it so that no matter what,
> void_list_node = build_tree_list (NULL_TREE, void_type_node);.
>
> Before, each front-end set it in their own way, but they all set it
> via void_list_node = build_tree_list (NULL_TREE, void_type_node); or a
> synonym anyway. So while the patch made sense in a java-free context,
> given that java sets it a different way, I can't see a world in which
> this commit stays active and Java works, unless we find a way to set
> it in tree.cc for every language except Java.

The middle-end expects it to be this way, it's not correct for a frontend
to define it in other ways.  That means you need to try to understand
_why_ the frontend isn't happy with the middle-ends definition.

How does Java build end_params_node?

Richard.


[PATCH] Silence some -Wnarrowing errors

2022-12-01 Thread Eric Gallager via Gcc-patches
I tried turning -Wnarrowing back on earlier this year, but
unfortunately it didn't work due to triggering a bunch of new errors.
This patch silences at least some of them, but there will still be
more left even after applying it. (When compiling with clang,
technically the warning flag is -Wc++11-narrowing, but it's pretty
much the same thing as gcc's -Wnarrowing, albeit with fixit hints,
which I made use of to insert the casts here.)

gcc/ChangeLog:

* ipa-modref.cc (modref_lattice::add_escape_point): Use a
static_cast to silence -Wnarrowing.
(modref_eaf_analysis::record_escape_points): Likewise.
(update_escape_summary_1): Likewise.
* rtl-ssa/changes.cc (function_info::temp_access_array): Likewise.
* rtl-ssa/member-fns.inl: Likewise.
* tree-ssa-structalias.cc (push_fields_onto_fieldstack): Likewise.
* tree-vect-slp.cc (vect_prologue_cost_for_slp): Likewise.
* tree-vect-stmts.cc (vect_truncate_gather_scatter_offset): Likewise.
(vectorizable_operation): Likewise.


patch-Wnarrowing.diff
Description: Binary data


[aarch64] PR107920 - Fix incorrect handling of virtual operands in svld1rq_impl::fold

2022-12-01 Thread Prathamesh Kulkarni via Gcc-patches
Hi,
The following test:

#include "arm_sve.h"

svint8_t
test_s8(int8_t *x)
{
  return svld1rq_s8 (svptrue_b8 (), &x[0]);
}

ICE's with -march=armv8.2-a+sve -O1 -fno-tree-ccp -fno-tree-forwprop:
during GIMPLE pass: fre
pr107920.c: In function ‘test_s8’:
pr107920.c:7:1: internal compiler error: in execute_todo, at passes.cc:2140
7 | }
  | ^
0x7b03d0 execute_todo
../../gcc/gcc/passes.cc:2140

because of incorrect handling of virtual operands in svld1rq_impl::fold:
 # VUSE <.MEM>
  _5 = MEM  [(signed char * {ref-all})x_3(D)];
  _4 = VEC_PERM_EXPR <_5, _5, { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, ... }>;
  # VUSE <.MEM_2(D)>
  return _4;

The attached patch tries to fix the issue by building the replacement
statements in gimple_seq, and passing it to gsi_replace_with_seq_vops,
which resolves the ICE, and results in:
   :
  # VUSE <.MEM_2(D)>
  _5 = MEM  [(signed char * {ref-all})x_3(D)];
  _4 = VEC_PERM_EXPR <_5, _5, { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, ... }>;
  # VUSE <.MEM_2(D)>
  return _4;

Bootstrapped+tested on aarch64-linux-gnu.
OK to commit ?

Thanks,
Prathamesh
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 6347407555f..f5546a65d22 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -45,6 +45,7 @@
 #include "aarch64-sve-builtins-base.h"
 #include "aarch64-sve-builtins-functions.h"
 #include "ssa.h"
+#include "gimple-fold.h"
 
 using namespace aarch64_sve;
 
@@ -1232,7 +1233,9 @@ public:
tree mem_ref_op = fold_build2 (MEM_REF, access_type, arg1, zero);
gimple *mem_ref_stmt
  = gimple_build_assign (mem_ref_lhs, mem_ref_op);
-   gsi_insert_before (f.gsi, mem_ref_stmt, GSI_SAME_STMT);
+
+   gimple_seq stmts = NULL;
+   gimple_seq_add_stmt_without_update (&stmts, mem_ref_stmt);
 
int source_nelts = TYPE_VECTOR_SUBPARTS (access_type).to_constant ();
vec_perm_builder sel (lhs_len, source_nelts, 1);
@@ -1245,8 +1248,11 @@ public:
   indices));
tree mask_type = build_vector_type (ssizetype, lhs_len);
tree mask = vec_perm_indices_to_tree (mask_type, indices);
-   return gimple_build_assign (lhs, VEC_PERM_EXPR,
-   mem_ref_lhs, mem_ref_lhs, mask);
+   gimple *g2 = gimple_build_assign (lhs, VEC_PERM_EXPR,
+ mem_ref_lhs, mem_ref_lhs, mask);
+   gimple_seq_add_stmt_without_update (&stmts, g2);
+   gsi_replace_with_seq_vops (f.gsi, stmts);
+   return g2;
   }
 
 return NULL;
diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index c2d9c806aee..03cdb2f9f49 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -591,7 +591,7 @@ fold_gimple_assign (gimple_stmt_iterator *si)
If the statement has a lhs the last stmt in the sequence is expected
to assign to that lhs.  */
 
-static void
+void
 gsi_replace_with_seq_vops (gimple_stmt_iterator *si_p, gimple_seq stmts)
 {
   gimple *stmt = gsi_stmt (*si_p);
diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
index 7d29ee9a9a4..87ed4e56d25 100644
--- a/gcc/gimple-fold.h
+++ b/gcc/gimple-fold.h
@@ -63,6 +63,7 @@ extern bool arith_code_with_undefined_signed_overflow (tree_code);
 extern gimple_seq rewrite_to_defined_overflow (gimple *, bool = false);
 extern void replace_call_with_value (gimple_stmt_iterator *, tree);
 extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, tree);
+extern void gsi_replace_with_seq_vops (gimple_stmt_iterator *, gimple_seq);
 
 /* gimple_build, functionally matching fold_buildN, outputs stmts
int the provided sequence, matching and simplifying them on-the-fly.
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr107920.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr107920.c
new file mode 100644
index 00000000000..11448ed5e68
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr107920.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fno-tree-ccp -fno-tree-forwprop" } */
+
+#include "arm_sve.h"
+
+svint8_t
+test_s8(int8_t *x)
+{
+  return svld1rq_s8 (svptrue_b8 (), &x[0]);
+}
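
As a scalar picture of what the folded sequence computes: the MEM_REF
loads one 128-bit quadword and the VEC_PERM_EXPR selector { 0, 1, ..., 15,
... } repeats it across the result.  The sketch below assumes that
reading; replicate_quadword is a hypothetical helper, and the template
parameter N stands in for a vector length that is only known at run time
on real SVE hardware.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model: byte i of the result is byte (i mod 16) of the quadword
// at x.  N is a hypothetical fixed vector length in bytes.
template <std::size_t N>
std::array<std::int8_t, N> replicate_quadword(const std::int8_t *x)
{
  static_assert(N % 16 == 0, "vector length must be a multiple of 16 bytes");
  std::array<std::int8_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = x[i % 16];
  return out;
}
```

Only the first 16 bytes at x are ever read, which matches a single
quadword memory access followed by a pure register permute.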


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Richard Biener via Gcc-patches
On Fri, 2 Dec 2022, Richard Biener wrote:

> On Thu, 1 Dec 2022, Siddhesh Poyarekar wrote:
> 
> > On 2022-12-01 11:42, Kees Cook wrote:
> > > On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
> > >> '-Wstrict-flex-arrays'
> > >>   Warn about improper usages of flexible array members according to
> > >>   the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> > >>   the trailing array field of a structure if it's available,
> > >>   otherwise according to the LEVEL of the option
> > >>   '-fstrict-flex-arrays=LEVEL'.
> > >>
> > >>   This option is effective only when LEVEL is bigger than 0.
> > >>   Otherwise, it will be ignored with a warning.
> > >>
> > >>   when LEVEL=1, warnings will be issued for a trailing array
> > >>   reference of a structure that has 2 or more elements if the
> > >>   trailing array is referenced as a flexible array member.
> > >>
> > >>   when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> > >>   issued for a trailing one-element array reference of a structure if
> > >>   the array is referenced as a flexible array member.
> > >>
> > >>   when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> > >>   issued for a trailing zero-length array reference of a structure if
> > >>   the array is referenced as a flexible array member.
> > >>
> > >> At the same time, -Warray-bounds is updated:
> > > 
> > > Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
> > > only the latter was going to exist?
> 
> Sorry for apparently not being clear - I was requesting 
> -Wstrict-flex-arrays to be dropped and instead adjusting -Warray-bounds
> to adhere to -fstrict-flex-arrays in both =1 and =2 where then =2
> would only add the intermediate pointer results verification.
> 
> I think that's reasonable if documented since the default behavior
> with -Wall will not change then unless the -fstrict-flex-arrays
> default is altered.

Btw, your patch seems to implement the above plus adds 
-Wstrict-flex-arrays.  It seems it could be split into two, doing
the -Warray-bounds adjustment as first and the -Wstrict-flex-arrays 
addition as second.  We do all seem to agree on the first so it's easy
to go forward with that?

Thanks,
Richard.
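
For reference, the LEVEL rules in the quoted documentation text can be
written down as a small table.  This is only a restatement of the proposed
wording, not compiler code; TrailingArray and warns_when_used_as_flexible
are hypothetical names.

```cpp
// Kinds of trailing array member, as distinguished by the quoted text.
enum class TrailingArray { TwoOrMore, OneElement, ZeroLength, Flexible };

// Returns whether the proposed -Wstrict-flex-arrays text says a warning
// is issued when such a trailing array is referenced as a flexible array
// member at the given LEVEL (0 means the option is ignored).
bool warns_when_used_as_flexible(TrailingArray kind, int level)
{
  switch (kind) {
    case TrailingArray::TwoOrMore:  return level >= 1;  // e.g. int a[4];
    case TrailingArray::OneElement: return level >= 2;  // e.g. int a[1];
    case TrailingArray::ZeroLength: return level >= 3;  // e.g. int a[0];
    case TrailingArray::Flexible:   return false;       // e.g. int a[];
  }
  return false;
}
```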


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Richard Biener via Gcc-patches
On Thu, 1 Dec 2022, Siddhesh Poyarekar wrote:

> On 2022-12-01 11:42, Kees Cook wrote:
> > On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
> >> '-Wstrict-flex-arrays'
> >>   Warn about improper usages of flexible array members according to
> >>   the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> >>   the trailing array field of a structure if it's available,
> >>   otherwise according to the LEVEL of the option
> >>   '-fstrict-flex-arrays=LEVEL'.
> >>
> >>   This option is effective only when LEVEL is bigger than 0.
> >>   Otherwise, it will be ignored with a warning.
> >>
> >>   when LEVEL=1, warnings will be issued for a trailing array
> >>   reference of a structure that has 2 or more elements if the
> >>   trailing array is referenced as a flexible array member.
> >>
> >>   when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> >>   issued for a trailing one-element array reference of a structure if
> >>   the array is referenced as a flexible array member.
> >>
> >>   when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> >>   issued for a trailing zero-length array reference of a structure if
> >>   the array is referenced as a flexible array member.
> >>
> >> At the same time, -Warray-bounds is updated:
> > 
> > Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
> > only the latter was going to exist?

Sorry for apparently not being clear - I was requesting 
-Wstrict-flex-arrays to be dropped and instead adjusting -Warray-bounds
to adhere to -fstrict-flex-arrays in both =1 and =2 where then =2
would only add the intermediate pointer results verification.

I think that's reasonable if documented since the default behavior
with -Wall will not change then unless the -fstrict-flex-arrays
default is altered.

> Oh my understanding of the consensus was to move flex array related diagnosis
> from -Warray-bounds to -Wstrict-flex-arrays as Qing has done. If only the
> former exists then instead of removing the flex array related statement in the
> documentation as Richard suggested, we need to enhance it to say that
> behaviour of -Warray-bounds will depend on -fstrict-flex-arrays.
> 
> -Warray-bounds does diagnosis beyond just flexible arrays, in case that's the
> confusion.

Richard.

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


RE: [PATCH 1/2]middle-end: Add new tbranch optab to add support for bit-test-and-branch operations

2022-12-01 Thread Richard Biener via Gcc-patches
On Thu, 1 Dec 2022, Tamar Christina wrote:

> > > +/* Check to see if the supplied comparison in PTEST can be performed as a
> > > +   bit-test-and-branch instead.  VAL must contain the original tree
> > > +   expression of the non-zero operand which will be used to rewrite the
> > > +   comparison in PTEST.
> > > +
> > > +   Returns TRUE if operation succeeds and returns updated PMODE and PTEST,
> > > +   else FALSE.  */
> > > +
> > > +enum insn_code
> > > +static validate_test_and_branch (tree val, rtx *ptest, machine_mode *pmode)
> > > +{
> > > +  if (!val || TREE_CODE (val) != SSA_NAME)
> > > +    return CODE_FOR_nothing;
> > > +
> > > +  machine_mode mode = TYPE_MODE (TREE_TYPE (val));
> > > +  rtx test = *ptest;
> > > +
> > > +  if (GET_CODE (test) != EQ && GET_CODE (test) != NE)
> > > +    return CODE_FOR_nothing;
> > > +
> > > +  /* If the target supports the testbit comparison directly, great.  */
> > > +  auto icode = direct_optab_handler (tbranch_optab, mode);
> > > +  if (icode == CODE_FOR_nothing)
> > > +    return icode;
> > > +
> > > +  if (tree_zero_one_valued_p (val))
> > > +{
> > > +  auto pos = BYTES_BIG_ENDIAN ? GET_MODE_BITSIZE (mode) - 1 : 0;
> > 
> > Does this work for BYTES_BIG_ENDIAN && !WORDS_BIG_ENDIAN and
> > mode > word_mode?
> > 
> 
> It does now. In this particular case all that matters is the bit ordering,
> so I've changed it to BITS_BIG_ENDIAN.

It looks like this would fit indeed.

> Also during the review of the AArch64 optab Richard Sandiford wanted me to
> split the optabs apart into two.  The reason is that a match_operator still
> gets the full RTL.
> 
> In the case of a tbranch the full RTL has an invalid comparison, so if a
> target doesn't implement the hook correctly this would lead to incorrect
> code.  We've now moved the operator as part of the name itself to avoid this.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master?

OK if Richard doesn't have any further comments.

Thanks,
Richard.
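
A small sketch of the position choice discussed above, assuming the
BITS_BIG_ENDIAN version of the patch: for a value known to be 0 or 1 only
the least significant bit matters, and where that bit sits depends on the
target's bit numbering.  low_bit_position and branch_taken_ne are
hypothetical helpers, not functions from optabs.cc.

```cpp
#include <cstdint>

// Index of the least significant bit under the target's bit numbering.
// When bit 0 names the most significant bit (the BITS_BIG_ENDIAN case),
// the least significant bit has the highest index.
constexpr unsigned low_bit_position(bool bits_big_endian,
                                    unsigned mode_bitsize)
{
  return bits_big_endian ? mode_bitsize - 1 : 0;
}

// For a value known to be 0 or 1, "value != 0" and "low bit set" are
// the same test, which is what lets a compare-and-branch become a
// bit-test-and-branch.
constexpr bool branch_taken_ne(std::uint32_t zero_one_value)
{
  return (zero_one_value & 1u) != 0;
}
```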

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * dojump.cc (do_jump): Pass along value.
>   (do_jump_by_parts_greater_rtx): Likewise.
>   (do_jump_by_parts_zero_rtx): Likewise.
>   (do_jump_by_parts_equality_rtx): Likewise.
>   (do_compare_rtx_and_jump): Likewise.
>   (do_compare_and_jump): Likewise.
>   * dojump.h (do_compare_rtx_and_jump): New.
>   * optabs.cc (emit_cmp_and_jump_insn_1): Refactor to take optab to check.
>   (validate_test_and_branch): New.
>   (emit_cmp_and_jump_insns): Optionally take a value, and when value is
>   supplied then check if it's suitable for tbranch.
>   * optabs.def (tbranch_eq$a4, tbranch_ne$a4): New.
>   * doc/md.texi (tbranch_@var{op}@var{mode}4): Document it.
>   * optabs.h (emit_cmp_and_jump_insns):
>   * tree.h (tree_zero_one_valued_p): New.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index d0a71ecbb806de3a6564c6ffe973fec5da5c597b..c6c4b13d756de28078a0a779876a00c614246914 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -6964,6 +6964,14 @@ case, you can and should make operand 1's predicate reject some operators
>  in the @samp{cstore@var{mode}4} pattern, or remove the pattern altogether
>  from the machine description.
>  
> +@cindex @code{tbranch_@var{op}@var{mode}4} instruction pattern
> +@item @samp{tbranch_@var{op}@var{mode}4}
> +Conditional branch instruction combined with a bit test-and-compare
> +instruction. Operand 0 is a comparison operator.  Operand 1 is the
> +operand of the comparison. Operand 2 is the bit position of Operand 1 to test.
> +Operand 3 is the @code{code_label} to jump to. @var{op} is one of @var{eq} or
> +@var{ne}.
> +
>  @cindex @code{cbranch@var{mode}4} instruction pattern
>  @item @samp{cbranch@var{mode}4}
>  Conditional branch instruction combined with a compare instruction.
> diff --git a/gcc/dojump.h b/gcc/dojump.h
> index e379cceb34bb1765cb575636e4c05b61501fc2cf..d1d79c490c420a805fe48d58740a79c1f25fb839 100644
> --- a/gcc/dojump.h
> +++ b/gcc/dojump.h
> @@ -71,6 +71,10 @@ extern void jumpifnot (tree exp, rtx_code_label *label,
>  extern void jumpifnot_1 (enum tree_code, tree, tree, rtx_code_label *,
>profile_probability);
>  
> +extern void do_compare_rtx_and_jump (rtx, rtx, enum rtx_code, int, tree,
> +  machine_mode, rtx, rtx_code_label *,
> +  rtx_code_label *, profile_probability);
> +
>  extern void do_compare_rtx_and_jump (rtx, rtx, enum rtx_code, int,
>machine_mode, rtx, rtx_code_label *,
>rtx_code_label *, profile_probability);
> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
> index 2af0cd1aca3b6af13d5d8799094ee93f18022296..190324f36f1a31990f8c49bc8c0f45c23da5c31e 100644
> --- a/gcc/dojump.cc
> +++ b/gcc/dojump.cc
> @@ -619,7 +619,7 @@ do_jump 

[PATCH v5, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-12-01 Thread HAO CHEN GUI via Gcc-patches
Hi,
  For scalar extract/insert instructions, exponent field can be stored in a
32-bit register. So this patch changes the mode of exponent field from DI to
SI so that these instructions can be generated in a 32-bit environment. Also
it removes TARGET_64BIT check for these instructions.
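
To illustrate why SImode is wide enough, here is a portable sketch of the
two operations on an IEEE 754 binary64 value.  It models the bit layout
only, not the built-ins themselves; extract_biased_exponent and
insert_biased_exponent are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// The biased exponent of a double is an 11-bit field (bits 52..62), so
// it fits comfortably in a 32-bit unsigned int.
unsigned int extract_biased_exponent(double x)
{
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return static_cast<unsigned int>((bits >> 52) & 0x7ffu);
}

// Keep the sign and fraction of SIGNIFICAND and replace its exponent
// field with the low 11 bits of EXPONENT.
double insert_biased_exponent(double significand, unsigned int exponent)
{
  std::uint64_t bits;
  std::memcpy(&bits, &significand, sizeof bits);
  bits &= ~(std::uint64_t{0x7ff} << 52);
  bits |= static_cast<std::uint64_t>(exponent & 0x7ffu) << 52;
  double result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}
```

The significand, by contrast, needs 52 bits, which is why the
extract-significand built-in keeps a 64-bit result type.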

  The instructions using DI registers can be invoked with -mpowerpc64 in a
32-bit environment. The patch changes insn condition from TARGET_64BIT to
TARGET_POWERPC64 for those instructions.

  This patch also changes the prototypes and categories of the relevant
built-ins and the effective target checks of the test cases.

  Compared to the last version, the main changes are to remove the 64-bit
environment requirement for the relevant built-ins in extend.texi, and to
change the argument types of the relevant built-ins in rs6000-overload.def.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-12-01  Haochen Gui  

gcc/
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
int and move it from power9-64 to power9 catalog.
(__builtin_vsx_scalar_extract_sig): Set return type to const unsigned
long long.
(__builtin_vsx_scalar_insert_exp): Set type of second argument to
unsigned int.
(__builtin_vsx_scalar_insert_exp_dp): Set type of second argument to
unsigned int and move it from power9-64 to power9 catalog.
* config/rs6000/vsx.md (xsxexpdp): Set mode of first operand to
SImode.  Remove TARGET_64BIT from insn condition.
(xsxsigdp): Change insn condition from TARGET_64BIT to TARGET_POWERPC64.
(xsiexpdp): Change insn condition from TARGET_64BIT to
TARGET_POWERPC64.  Set mode of third operand to SImode.
(xsiexpdpf): Set mode of third operand to SImode.  Remove TARGET_64BIT
from insn condition.
* config/rs6000/rs6000-overload.def
(__builtin_vec_scalar_insert_exp): Set type of second argument to
unsigned int.
* doc/extend.texi (scalar_insert_exp): Set type of second argument to
unsigned int and remove 64-bit environment requirement when
significand has a float type.
(scalar_extract_exp): Remove 64-bit environment requirement.

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Remove lp64 check.
* gcc.target/powerpc/bfp/scalar-extract-exp-1.c: Remove lp64 check.
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Deleted as the case is
invalid now.
* gcc.target/powerpc/bfp/scalar-extract-exp-6.c: Replace lp64 check
with has_arch_ppc64.
* gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-6.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Replace lp64 check
with has_arch_ppc64. Set type of exponent to unsigned int.
* gcc.target/powerpc/bfp/scalar-insert-exp-1.c: Set type of exponent
to unsigned int.
* gcc.target/powerpc/bfp/scalar-insert-exp-12.c: Replace lp64 check
with has_arch_ppc64. Set type of exponent to unsigned int.
* gcc.target/powerpc/bfp/scalar-insert-exp-13.c: Remove lp64 check.
Set type of exponent to unsigned int.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Set type of exponent to
unsigned int.
* gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Remove lp64 check. Set
type of exponent to unsigned int.
* gcc.target/powerpc/bfp/scalar-insert-exp-4.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Deleted as the case is
invalid now.

patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d7..d8d67fa0cad 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2833,6 +2833,11 @@
   const signed int __builtin_dtstsfi_ov_td (const int<6>, _Decimal128);
 TSTSFI_OV_TD dfptstsfi_unordered_td {}

+  const unsigned int __builtin_vsx_scalar_extract_exp (double);
+VSEEDP xsxexpdp {}
+
+  const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned int);
+VSIEDPF xsiexpdpf {}

 [power9-64]
   void __builtin_altivec_xst_len_r (vsc, void *, long);
@@ -2847,19 +2852,13 @@
   pure vsc __builtin_vsx_lxvl (const void *, signed long);
 LXVL lxvl {}

-  const signed long __builtin_vsx_scalar_extract_exp (double);
-VSEEDP xsxexpdp {}
-
-  const signed long __builtin_vsx_scalar_extract_sig (double);
+  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
 VSESDP xsxsigdp {}

   const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
-unsigned long long);
+   unsigned int);
 VSIEDP xsiexpdp {}

-  const double 

[PATCH] Add --param max-unswitch-depth

2022-12-01 Thread Richard Biener via Gcc-patches
The following adds a --param to limit the depth of unswitched loop
nests.  One can use --param max-unswitch-depth=1 to disable unswitching
of outer loops (the innermost loop will then be unswitched).

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107946
* params.opt (-param=max-unswitch-depth=): New.
* doc/invoke.texi (--param=max-unswitch-depth): Document.
* tree-ssa-loop-unswitch.cc (init_loop_unswitch_info): Honor
--param=max-unswitch-depth
---
 gcc/doc/invoke.texi   | 3 +++
 gcc/params.opt| 4 
 gcc/tree-ssa-loop-unswitch.cc | 4 +++-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56e5e875e86..277ac35ad16 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14963,6 +14963,9 @@ The maximum depth of a loop nest suitable for complete peeling.
 @item max-unswitch-insns
 The maximum number of insns of an unswitched loop.
 
+@item max-unswitch-depth
+The maximum depth of a loop nest to be unswitched.
+
 @item lim-expensive
 The minimum cost of an expensive expression in the loop invariant motion.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index c1dcb7ea487..397ec0bd128 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -726,6 +726,10 @@ The maximum number of instructions to consider to unroll in a loop.
 Common Joined UInteger Var(param_max_unswitch_insns) Init(50) Param Optimization
 The maximum number of insns of an unswitched loop.
 
+-param=max-unswitch-depth=
+Common Joined UInteger Var(param_max_unswitch_depth) Init(50) IntegerRange(1, 50) Param Optimization
+The maximum depth of a loop nest to be unswitched.
+
 -param=max-variable-expansions-in-unroller=
 Common Joined UInteger Var(param_max_variable_expansions) Init(1) Param Optimization
 If -fvariable-expansion-in-unroller is used, the maximum number of times that an individual variable will be expanded during loop unrolling.
diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
index e8c9bd6812a..df7a2019b1c 100644
--- a/gcc/tree-ssa-loop-unswitch.cc
+++ b/gcc/tree-ssa-loop-unswitch.cc
@@ -263,8 +263,10 @@ init_loop_unswitch_info (class loop *, 
unswitch_predicate *,
 
   /* Unswitch only nests with no sibling loops.  */
   class loop *outer_loop = loop;
+  unsigned max_depth = param_max_unswitch_depth;
   while (loop_outer (outer_loop)->num != 0
-&& !loop_outer (outer_loop)->inner->next)
+&& !loop_outer (outer_loop)->inner->next
+&& --max_depth != 0)
 outer_loop = loop_outer (outer_loop);
   hottest = NULL;
   hottest_bb = NULL;
-- 
2.35.3
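
To see how the new bound interacts with the walk, here is a toy model of
the loop in init_loop_unswitch_info.  ToyLoop and outermost_to_unswitch
are hypothetical; the sibling check is simplified to a child count, and
the root pseudo-loop is marked by num == 0 as in the patch.

```cpp
#include <cassert>

// Toy loop-tree node; the field names are hypothetical.
struct ToyLoop {
  int num;               // 0 marks the function-level root pseudo-loop
  const ToyLoop *outer;  // enclosing loop, nullptr only for the root
  int child_count;       // number of loops nested directly inside
};

// Walk outwards from LOOP while the parent is a real loop with no
// sibling loops, taking at most MAX_DEPTH - 1 steps.
const ToyLoop *outermost_to_unswitch(const ToyLoop *loop, unsigned max_depth)
{
  assert(max_depth >= 1);          // mirrors IntegerRange(1, 50)
  assert(loop->outer != nullptr);  // LOOP must not be the root
  const ToyLoop *outer_loop = loop;
  while (outer_loop->outer->num != 0
         && outer_loop->outer->child_count == 1
         && --max_depth != 0)
    outer_loop = outer_loop->outer;
  return outer_loop;
}
```

With a perfect nest A > B > C, a bound of 1 returns C itself, 2 returns B,
and the default of 50 climbs to A, where the walk stops at the root.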


Re: [PATCH] [x86] Improve ix86_expand_fast_convert_bf_to_sf with new extendbfsf2_1.

2022-12-01 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 2, 2022 at 5:26 AM liuhongt  wrote:
>
> After supporting extendbfsf2_1, ix86_expand_fast_convert_bf_to_sf can
> also be improved to use pslld.
> CONST_INT_P is not handled since a constant shift can be optimized away.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc
> (ix86_expand_fast_convert_bf_to_sf): Optimized with
> extendbfsf2_1 for non-CONST_INT_P operand.

Please say: "Use extendbfsf2_1 for nonimmediate operand."
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/cbranchbf4.c: New test.

Otherwise OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-expand.cc | 13 ++---
>  gcc/testsuite/gcc.target/i386/cbranchbf4.c | 15 +++
>  2 files changed, 21 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/cbranchbf4.c
>
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index d26e7e41445..0bc80c4b178 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -24155,14 +24155,13 @@ ix86_expand_fast_convert_bf_to_sf (rtx val)
>/* FLOAT_EXTEND simplification will fail if VAL is a sNaN.  */
>ret = gen_reg_rtx (SImode);
>emit_move_insn (ret, GEN_INT (INTVAL (op) & 0xffff));
> +  emit_insn (gen_ashlsi3 (ret, ret, GEN_INT (16)));
> +  return gen_lowpart (SFmode, ret);
>  }
> -  else
> -{
> -  ret = gen_reg_rtx (SImode);
> -  emit_insn (gen_zero_extendhisi2 (ret, op));
> -}
> -  emit_insn (gen_ashlsi3 (ret, ret, GEN_INT (16)));
> -  return gen_lowpart (SFmode, ret);
> +
> +  ret = gen_reg_rtx (SFmode);
> +  emit_insn (gen_extendbfsf2_1 (ret, force_reg (BFmode, val)));
> +  return ret;
>  }
>
>  #include "gt-i386-expand.h"
> diff --git a/gcc/testsuite/gcc.target/i386/cbranchbf4.c b/gcc/testsuite/gcc.target/i386/cbranchbf4.c
> new file mode 100644
> index 00000000000..8241a0c2165
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/cbranchbf4.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fexcess-precision=16 -O -msse2 -mfpmath=sse" } */
> +/* { dg-final { scan-assembler-times "pslld" 4 } } */
> +
> +char
> +foo (__bf16 a, __bf16 b)
> +{
> +  return a > b;
> +}
> +
> +float
> +foo1 (__bf16 a, __bf16 b, float c, float d)
> +{
> +  return a > b ? c : d;
> +}
> --
> 2.27.0
>
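
For context, the conversion that the shift implements: a bfloat16 value is
the upper 16 bits of an IEEE 754 binary32 value, so widening is a 16-bit
left shift of the raw bits.  A minimal sketch, with bf16_bits_to_float as
a hypothetical helper working on the raw encoding:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Widen a raw bfloat16 encoding to float by placing it in the upper
// half of a 32-bit word and zero-filling the lower half.
float bf16_bits_to_float(std::uint16_t bits)
{
  std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float result;
  std::memcpy(&result, &widened, sizeof result);
  return result;
}
```

For example, the encoding 0x3F80 widens to 0x3F800000, which is 1.0f.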


Re: [PATCH RFA] driver: fix validate_switches logic

2022-12-01 Thread Alexandre Oliva via Gcc-patches
On Dec  1, 2022, Jason Merrill  wrote:

> Once we see g*, starred is set.  Then we see %:, and it sees that as a
> zero-length switch, which because starred is still set, matches any and all
> command-line options.  So targets that use such a spec accept all options in
> the driver, while ones that don't reject some, such as the recent
> -nostdlib++.

Woo, nice catch, thanks!

I don't have authority to approve the patch,
but it's ok as far as I'm concerned.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH] Fix PR59447: include "(or later)" in documentation of --with-dwarf2 configure flag

2022-12-01 Thread Eric Gallager via Gcc-patches
On Fri, Dec 2, 2022 at 12:30 AM Sandra Loosemore
 wrote:
>
> On 12/1/22 20:29, Eric Gallager via Gcc-patches wrote:
> > A pretty simple patch; borrowed from Andrew Pinski on bugzilla:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59447
> > Tested by doing `./configure --help` in the gcc subdirectory and
> > noting that the "(or later)" made it into the output. OK for trunk?
> >
> > gcc/ChangeLog:
> >
> >  PR bootstrap/59447
> >  * configure: Regenerate.
> >  * configure.ac: Document --with-dwarf2 flag as also applying to
> > later DWARF standards.
> >  * doc/install.texi: Likewise.
>
> Hmmm.  In this hunk
>
> > diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> > index 589c64965b2..1d7c73eb914 100644
> > --- a/gcc/doc/install.texi
> > +++ b/gcc/doc/install.texi
> > @@ -1914,7 +1914,7 @@ should not be built.
> >
> >  @item --with-dwarf2
> >  Specify that the compiler should
> > -use DWARF 2 debugging information as the default.
> > +use DWARF 2 (or later) debugging information as the default.
> >
> >  @item --with-advance-toolchain=@var{at}
> >  On 64-bit PowerPC Linux systems, configure the compiler to use the
>
> I think it would be better to say
>
> use DWARF format for debugging information as the default; the exact
> DWARF version that is the default is target-specific.
>
> OK with that change.

OK thanks, committed as r13-4457-ga710f3ce747479.

>
> -Sandra


Re: [PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-12-01 Thread HAO CHEN GUI via Gcc-patches
Hi Nilsson,

On 2022/12/2 10:49, Hans-Peter Nilsson wrote:
> On Wed, 23 Nov 2022, HAO CHEN GUI via Gcc-patches wrote:
> 
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 92bda1a7e14..9823eccbe68 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -7094,6 +7094,15 @@ the @code{POLY_VALUE_MIN}, @code{POLY_VALUE_MAX} and
>>  implementation returns the lowest possible value of @var{val}.
>>  @end deftypefn
>>
>> +@deftypefn {Target Hook} bool TARGET_NOCE_TERNARY_CSET_P (struct noce_if_info *@var{if_info}, rtx *@var{outer_cond}, rtx *@var{inner_cond}, int *@var{int1}, int *@var{int2}, int *@var{int3})
>> +This hook returns true if the if-then-else-join blocks describled in
> 
> Random typo spotted: "described"
> 
> Also, IMHO needs more explanation (in tm.texi preferably) why 
> this doesn't happen as part of general "combine" machinery.

Thanks for your comments. Combine can't handle it because the insns are not in
the same block. Combine also has a limit on the number of insns it can combine.
I will add those comments.

Thanks
Gui Haochen

> 
> brgds, H-P


Re: [PATCH] Fix PR59447: include "(or later)" in documentation of --with-dwarf2 configure flag

2022-12-01 Thread Sandra Loosemore

On 12/1/22 20:29, Eric Gallager via Gcc-patches wrote:
> A pretty simple patch; borrowed from Andrew Pinski on bugzilla:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59447
> Tested by doing `./configure --help` in the gcc subdirectory and
> noting that the "(or later)" made it into the output. OK for trunk?
>
> gcc/ChangeLog:
>
>  PR bootstrap/59447
>  * configure: Regenerate.
>  * configure.ac: Document --with-dwarf2 flag as also applying to
> later DWARF standards.
>  * doc/install.texi: Likewise.

Hmmm.  In this hunk

> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 589c64965b2..1d7c73eb914 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -1914,7 +1914,7 @@ should not be built.
>
>  @item --with-dwarf2
>  Specify that the compiler should
> -use DWARF 2 debugging information as the default.
> +use DWARF 2 (or later) debugging information as the default.
>
>  @item --with-advance-toolchain=@var{at}
>  On 64-bit PowerPC Linux systems, configure the compiler to use the

I think it would be better to say

use DWARF format for debugging information as the default; the exact
DWARF version that is the default is target-specific.

OK with that change.

-Sandra


[PATCH] [x86] Improve ix86_expand_fast_convert_bf_to_sf with new extendbfsf2_1.

2022-12-01 Thread liuhongt via Gcc-patches
Now that extendbfsf2_1 is supported, ix86_expand_fast_convert_bf_to_sf can
also be improved to use pslld.
The CONST_INT_P case is left unchanged, since a constant shift can be
optimized away.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
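
For readers unfamiliar with the trick behind this conversion: a bfloat16 value is the upper 16 bits of an IEEE binary32 float, so widening it amounts to a zero-extend followed by a 16-bit left shift, which is the shift the pslld in the test is looking for. The C sketch below only illustrates that bit manipulation; the names bf16_bits_to_float and bf16_example are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): widen raw bfloat16 bits to a
   float by zero-extending to 32 bits and shifting left by 16.  */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float result;
  /* Copy the bit pattern instead of casting pointers, to avoid
     aliasing problems.  */
  memcpy (&result, &wide, sizeof result);
  return result;
}

/* Small usage example with well-known bit patterns.  */
void
bf16_example (void)
{
  assert (bf16_bits_to_float (0x3F80) == 1.0f);   /* 0x3F800000 */
  assert (bf16_bits_to_float (0xC000) == -2.0f);  /* 0xC0000000 */
}
```

The sketch ignores the sNaN concern mentioned in the patch comment; it only shows why a single shift is enough for the value conversion.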

gcc/ChangeLog:

* config/i386/i386-expand.cc
(ix86_expand_fast_convert_bf_to_sf): Optimized with
extendbfsf2_1 for non-CONST_INT_P operand.

gcc/testsuite/ChangeLog:

* gcc.target/i386/cbranchbf4.c: New test.
---
 gcc/config/i386/i386-expand.cc | 13 ++---
 gcc/testsuite/gcc.target/i386/cbranchbf4.c | 15 +++
 2 files changed, 21 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/cbranchbf4.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index d26e7e41445..0bc80c4b178 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -24155,14 +24155,13 @@ ix86_expand_fast_convert_bf_to_sf (rtx val)
   /* FLOAT_EXTEND simplification will fail if VAL is a sNaN.  */
   ret = gen_reg_rtx (SImode);
   emit_move_insn (ret, GEN_INT (INTVAL (op) & 0x));
+  emit_insn (gen_ashlsi3 (ret, ret, GEN_INT (16)));
+  return gen_lowpart (SFmode, ret);
 }
-  else
-{
-  ret = gen_reg_rtx (SImode);
-  emit_insn (gen_zero_extendhisi2 (ret, op));
-}
-  emit_insn (gen_ashlsi3 (ret, ret, GEN_INT (16)));
-  return gen_lowpart (SFmode, ret);
+
+  ret = gen_reg_rtx (SFmode);
+  emit_insn (gen_extendbfsf2_1 (ret, force_reg (BFmode, val)));
+  return ret;
 }
 
 #include "gt-i386-expand.h"
diff --git a/gcc/testsuite/gcc.target/i386/cbranchbf4.c 
b/gcc/testsuite/gcc.target/i386/cbranchbf4.c
new file mode 100644
index 000..8241a0c2165
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cbranchbf4.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-fexcess-precision=16 -O -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "pslld" 4 } } */
+
+char
+foo (__bf16 a, __bf16 b)
+{
+  return a > b;
+}
+
+float
+foo1 (__bf16 a, __bf16 b, float c, float d)
+{
+  return a > b ? c : d;
+}
-- 
2.27.0



[PATCH] Fix PR59447: include "(or later)" in documentation of --with-dwarf2 configure flag

2022-12-01 Thread Eric Gallager via Gcc-patches
A pretty simple patch; borrowed from Andrew Pinski on bugzilla:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59447
Tested by doing `./configure --help` in the gcc subdirectory and
noting that the "(or later)" made it into the output. OK for trunk?

gcc/ChangeLog:

PR bootstrap/59447
* configure: Regenerate.
* configure.ac: Document --with-dwarf2 flag as also applying to
later DWARF standards.
* doc/install.texi: Likewise.


patch-PR59447.diff
Description: Binary data


Re: [PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-12-01 Thread Hans-Peter Nilsson
On Wed, 23 Nov 2022, HAO CHEN GUI via Gcc-patches wrote:

> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 92bda1a7e14..9823eccbe68 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -7094,6 +7094,15 @@ the @code{POLY_VALUE_MIN}, @code{POLY_VALUE_MAX} and
>  implementation returns the lowest possible value of @var{val}.
>  @end deftypefn
> 
> +@deftypefn {Target Hook} bool TARGET_NOCE_TERNARY_CSET_P (struct 
> noce_if_info *@var{if_info}, rtx *@var{outer_cond}, rtx *@var{inner_cond}, 
> int *@var{int1}, int *@var{int2}, int *@var{int3})
> +This hook returns true if the if-then-else-join blocks describled in

Random typo spotted: "described"

Also, IMHO needs more explanation (in tm.texi preferably) why 
this doesn't happen as part of general "combine" machinery.

brgds, H-P


Ping^4: [PATCH V6] rs6000: Optimize cmp on rotated 16bits constant

2022-12-01 Thread Jiufu Guo via Gcc-patches
Hi,

Gentle ping:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html

BR,
Jeff(Jiufu)

Jiufu Guo via Gcc-patches  writes:

> Hi,
>
> Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>
> BR,
> Jeff (Jiufu)
>
>
> Jiufu Guo via Gcc-patches  writes:
>
>> Gentle ping:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>>
>> BR,
>> Jeff (Jiufu)
>>
>> Jiufu Guo via Gcc-patches  writes:
>>
>>> Ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
>>>
>>> BR,
>>> Jeff(Jiufu)
>>>
>>>
>>> Jiufu Guo  writes:
>>>
 Hi,

 When checking eq/ne against a constant that has only 16 significant
 bits once rotated, the comparison can be done on the rotated data
 instead.  This avoids building the full constant in a register.

 As in the example from PR103743:
 For "in == 0x8000LL", this patch generates:
 rotldi %r3,%r3,16
 cmpldi %cr0,%r3,32768
 instead of:
 li %r9,-1
 rldicr %r9,%r9,0,0
 cmpd %cr0,%r3,%r9

 Compared with the previous patches:
 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600385.html
 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600198.html

 This patch lifts the condition on can_create_pseudo_p and adds
 clobbers so that the splitter can run both before and after RA.

 This is an updated patch based on the previous patch and comments:
 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600315.html

 This patch passes bootstrap and regtest on ppc64 and ppc64le.
 Is it ok for trunk?  Thanks for comments!

 BR,
 Jeff(Jiufu)


PR target/103743

 gcc/ChangeLog:

* config/rs6000/rs6000-protos.h (rotate_from_leading_zeros_const): New.
(compare_rotate_immediate_p): New.
* config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New
definition.
(compare_rotate_immediate_p): New definition.
* config/rs6000/rs6000.md (EQNE): New code_attr.
(*rotate_on_cmpdi): New define_insn_and_split.

 gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr103743.c: New test.
* gcc.target/powerpc/pr103743_1.c: New test.

 ---
  gcc/config/rs6000/rs6000-protos.h |  2 +
  gcc/config/rs6000/rs6000.cc   | 41 
  gcc/config/rs6000/rs6000.md   | 62 +++-
  gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++
  gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++
  5 files changed, 251 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c

 diff --git a/gcc/config/rs6000/rs6000-protos.h 
 b/gcc/config/rs6000/rs6000-protos.h
 index b3c16e7448d..78847e6b3db 100644
 --- a/gcc/config/rs6000/rs6000-protos.h
 +++ b/gcc/config/rs6000/rs6000-protos.h
 @@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int 
 *, int *);
  extern int vspltis_shifted (rtx);
  extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
  extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
 +extern int rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT, int);
 +extern bool compare_rotate_immediate_p (unsigned HOST_WIDE_INT);
  extern int num_insns_constant (rtx, machine_mode);
  extern int small_data_operand (rtx, machine_mode);
  extern bool mem_operand_gpr (rtx, machine_mode);
 diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
 index df491bee2ea..a548db42660 100644
 --- a/gcc/config/rs6000/rs6000.cc
 +++ b/gcc/config/rs6000/rs6000.cc
 @@ -14797,6 +14797,47 @@ rs6000_reverse_condition (machine_mode mode, enum 
 rtx_code code)
  return reverse_condition (code);
  }
  
 +/* Check if C can be rotated from an immediate which starts (as 64bit 
 integer)
 +   with at least CLZ bits zero.
 +
 +   Return the number by which C can be rotated from the immediate.
 +   Return -1 if C can not be rotated as from.  */
 +
 +int
 +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)
 +{
 +  /* case a. 0..0xxx: already at least clz zeros.  */
 +  int lz = clz_hwi (c);
 +  if (lz >= clz)
 +return 0;
 +
 +  /* case b. 0..0xxx0..0: at least clz zeros.  */
 +  int tz = ctz_hwi (c);
 +  if (lz + tz >= clz)
 +return tz;
 +
 +  /* case c. xx10.0xx: rotate 'clz + 1' bits firstly, then check case 
 b.
 + ^bit -> Vbit
 +   00...00xxx100, 'clz + 1' >= bits of .  */
 +  const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1;
 +  unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1));
 +  tz = ctz_hwi (rc);
 +  if (clz_hwi (rc) + tz 

[committed] analyzer: handle comparisons against negated symbolic values [PR107948]

2022-12-01 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4456-g0b737090a69624.

gcc/analyzer/ChangeLog:
PR analyzer/107948
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Fold (0 - VAL) to -VAL.
* region-model.cc (region_model::eval_condition): Handle e.g.
"-X <= 0" as equivalent to X >= 0".

gcc/testsuite/ChangeLog:
PR analyzer/107948
* gcc.dg/analyzer/feasibility-pr107948.c: New test.
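
The identity used by the eval_condition change can be sketched in plain C: comparing -X against zero with some operator gives the same answer as comparing X against zero with the operands swapped. The enum and function names below are hypothetical stand-ins (swap_cmp plays the role of swap_tree_comparison), and the sketch assumes X is not INT_MIN, so that negation does not overflow.

```c
#include <assert.h>
#include <stdbool.h>

enum cmp_op { CMP_LT, CMP_LE, CMP_GT, CMP_GE, CMP_EQ, CMP_NE };

/* Hypothetical analogue of swap_tree_comparison: the operator to use
   when the two operands trade places.  */
static enum cmp_op
swap_cmp (enum cmp_op op)
{
  switch (op)
    {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GT: return CMP_LT;
    case CMP_GE: return CMP_LE;
    default: return op;  /* EQ and NE are symmetric.  */
    }
}

static bool
eval_cmp (int lhs, enum cmp_op op, int rhs)
{
  switch (op)
    {
    case CMP_LT: return lhs < rhs;
    case CMP_LE: return lhs <= rhs;
    case CMP_GT: return lhs > rhs;
    case CMP_GE: return lhs >= rhs;
    case CMP_EQ: return lhs == rhs;
    case CMP_NE: return lhs != rhs;
    }
  return false;
}

/* Evaluate "-x OP 0" without negating: use "x SWAP(OP) 0" instead.  */
static bool
eval_negated_vs_zero (int x, enum cmp_op op)
{
  return eval_cmp (x, swap_cmp (op), 0);
}

void
negation_example (void)
{
  for (int x = -3; x <= 3; x++)
    for (int op = CMP_LT; op <= CMP_NE; op++)
      assert (eval_negated_vs_zero (x, (enum cmp_op) op)
              == eval_cmp (-x, (enum cmp_op) op, 0));
}
```

The exhaustive loop over a small range is just a sanity check of the identity, not a model of how the analyzer represents symbolic values.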

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.cc  |  3 ++
 gcc/analyzer/region-model.cc  | 13 +
 .../gcc.dg/analyzer/feasibility-pr107948.c| 49 +++
 3 files changed, 65 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/feasibility-pr107948.c

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index ae63c664ae5..471a9272e41 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -620,6 +620,9 @@ region_model_manager::maybe_fold_binop (tree type, enum 
tree_code op,
   /* (VAL - 0) -> VAL.  */
   if (cst1 && zerop (cst1))
return get_or_create_cast (type, arg0);
+  /* (0 - VAL) -> -VAL.  */
+  if (cst0 && zerop (cst0))
+   return get_or_create_unaryop (type, NEGATE_EXPR, arg1);
   break;
 case MULT_EXPR:
   /* (VAL * 0).  */
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 91b868f7b16..4f623fd6ca3 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -3339,6 +3339,19 @@ region_model::eval_condition (const svalue *lhs,
  return lhs_ts;
  }
  }
+   else if (const unaryop_svalue *unaryop
+  = lhs->dyn_cast_unaryop_svalue ())
+ {
+   if (unaryop->get_op () == NEGATE_EXPR)
+ {
+   /* e.g. "-X <= 0" is equivalent to X >= 0".  */
+   tristate lhs_ts = eval_condition (unaryop->get_arg (),
+ swap_tree_comparison (op),
+ rhs);
+   if (lhs_ts.is_known ())
+ return lhs_ts;
+ }
+ }
   }
 
   /* Handle rejection of equality for comparisons of the initial values of
diff --git a/gcc/testsuite/gcc.dg/analyzer/feasibility-pr107948.c 
b/gcc/testsuite/gcc.dg/analyzer/feasibility-pr107948.c
new file mode 100644
index 000..5eb8b0aef22
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/feasibility-pr107948.c
@@ -0,0 +1,49 @@
+#include "analyzer-decls.h"
+
+void foo(int width) {
+  int i = 0;
+  int base;
+  if (width > 0){
+__analyzer_eval(i == 0); /* { dg-warning "TRUE" } */
+__analyzer_eval(width > 0); /* { dg-warning "TRUE" } */
+__analyzer_eval(width - i > 0); /* { dg-warning "TRUE" } */
+__analyzer_eval(i - width <= 0); /* { dg-warning "TRUE" } */
+if (i - width <= 0) {
+  base = 512;
+}
+else {
+  __analyzer_dump_path (); /* { dg-bogus "path" } */
+}
+base+=1; /* { dg-bogus "uninit" } */
+  }
+}
+
+void test_ge_zero (int x)
+{
+  if (x >= 0)
+{
+  __analyzer_eval(x >= 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval(x > 0); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval(x <= 0); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval(x < 0); /* { dg-warning "FALSE" } */
+  __analyzer_eval(-x <= 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval(-x < 0); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval(-x >= 0); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval(-x > 0); /* { dg-warning "FALSE" } */
+}
+}
+
+void test_gt_zero (int x)
+{
+  if (x > 0)
+{
+  __analyzer_eval(x >= 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval(x > 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval(x <= 0); /* { dg-warning "FALSE" } */
+  __analyzer_eval(x < 0); /* { dg-warning "FALSE" } */
+  __analyzer_eval(-x <= 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval(-x < 0); /* { dg-warning "TRUE" } */
+  __analyzer_eval(-x >= 0); /* { dg-warning "FALSE" } */
+  __analyzer_eval(-x > 0); /* { dg-warning "FALSE" } */
+}
+}
-- 
2.26.3



[committed] analyzer: add test coverage for string ops

2022-12-01 Thread David Malcolm via Gcc-patches
Tested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4455-g5cb7d28dcfb11a.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/string-ops-concat-pair.c: New test.
* gcc.dg/analyzer/string-ops-dup.c: New test.

Signed-off-by: David Malcolm 
---
 .../gcc.dg/analyzer/string-ops-concat-pair.c  | 67 +++
 .../gcc.dg/analyzer/string-ops-dup.c  | 61 +
 2 files changed, 128 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/string-ops-concat-pair.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/string-ops-dup.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/string-ops-concat-pair.c 
b/gcc/testsuite/gcc.dg/analyzer/string-ops-concat-pair.c
new file mode 100644
index 000..f5bcd67594f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/string-ops-concat-pair.c
@@ -0,0 +1,67 @@
+typedef __SIZE_TYPE__ size_t;
+#define NULL ((void *)0)
+
+/* Concatenating a pair of strings.  */
+
+/* Correct but poor implementation with repeated __builtin_strlen calls.  */
+
+char *
+alloc_dup_of_concatenated_pair_1_correct (const char *x, const char *y)
+{
+  size_t sz = __builtin_strlen (x) + __builtin_strlen (y) + 1;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, __builtin_strlen (x));
+  __builtin_memcpy (result + __builtin_strlen (x), y, __builtin_strlen (y));
+  result[__builtin_strlen(x) + __builtin_strlen (y)] = '\0';
+  return result;
+}
+
+/* Incorrect version: forgetting to add space for terminator.  */
+
+char *
+alloc_dup_of_concatenated_pair_1_incorrect (const char *x, const char *y)
+{
+  /* Forgetting to add space for the terminator here.  */
+  size_t sz = __builtin_strlen (x) + __builtin_strlen (y);
+  char *result = __builtin_malloc (sz);
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, __builtin_strlen (x));
+  __builtin_memcpy (result + __builtin_strlen (x), y, __builtin_strlen (y));
+  result[__builtin_strlen(x) + __builtin_strlen (y)] = '\0'; /* { dg-warning 
"heap-based buffer overflow" "PR analyzer/105899" { xfail *-*-* } } */
+  return result;
+}
+
+/* As above, but only calling __builtin_strlen once on each input.  */
+
+char *
+alloc_dup_of_concatenated_pair_2_correct (const char *x, const char *y)
+{
+  size_t len_x = __builtin_strlen (x);
+  size_t len_y = __builtin_strlen (y);
+  size_t sz = len_x + len_y + 1;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, len_x);
+  __builtin_memcpy (result + len_x, y, len_y);
+  result[len_x + len_y] = '\0';
+  return result;
+}
+
+char *
+alloc_dup_of_concatenated_pair_2_incorrect (const char *x, const char *y)
+{
+  size_t len_x = __builtin_strlen (x);
+  size_t len_y = __builtin_strlen (y);
+  size_t sz = len_x + len_y; /* Forgetting to add space for the terminator.  */
+  char *result = __builtin_malloc (sz); /* { dg-message "capacity: 'len_x \\+ 
len_y' bytes" } */
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, len_x);
+  __builtin_memcpy (result + len_x, y, len_y);
+  result[len_x + len_y] = '\0'; /* { dg-warning "heap-based buffer overflow" } 
*/
+  return result;
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/string-ops-dup.c 
b/gcc/testsuite/gcc.dg/analyzer/string-ops-dup.c
new file mode 100644
index 000..44c4e9dc67e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/string-ops-dup.c
@@ -0,0 +1,61 @@
+typedef __SIZE_TYPE__ size_t;
+#define NULL ((void *)0)
+
+/* Duplicating a string.  */
+
+/* Correct but poor implementation with repeated __builtin_strlen calls.  */
+
+char *
+alloc_dup_1_correct (const char *x)
+{
+  size_t sz = __builtin_strlen (x) + 1;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, __builtin_strlen (x));
+  result[__builtin_strlen(x)] = '\0';
+  return result;
+}
+
+/* Incorrect version: forgetting to add space for terminator.  */
+
+char *
+alloc_dup_1_incorrect (const char *x, const char *y)
+{
+  /* Forgetting to add space for the terminator here.  */
+  size_t sz = __builtin_strlen (x) + 1;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, __builtin_strlen (x));
+  result[__builtin_strlen(x)] = '\0'; /* { dg-warning "heap-based buffer 
overflow" "PR analyzer/105899" { xfail *-*-* } } */
+  return result;
+}
+
+/* As above, but only calling __builtin_strlen once.  */
+
+char *
+alloc_dup_2_correct (const char *x)
+{
+  size_t len_x = __builtin_strlen (x);
+  size_t sz = len_x + 1;
+  char *result = __builtin_malloc (sz);
+  if (!result)
+return NULL;
+  __builtin_memcpy (result, x, len_x);
+  result[len_x] = '\0';
+  return result;
+}
+
+char *
+alloc_dup_of_concatenated_pair_2_incorrect (const char *x, const char *y)
+{
+  size_t len_x = __builtin_strlen (x);
+  size_t sz = len_x; /* Forgetting to add space for the terminator.  */
+  char *result = __builtin_malloc (sz); /* { dg-message 

Re: [PATCH v2] Add condition coverage profiling

2022-12-01 Thread Jørgen Kvalsvik via Gcc-patches
On 02/12/2022 00:05, Martin Liška wrote:
> On 11/11/22 06:21, Jørgen Kvalsvik wrote:
>> From: Jørgen Kvalsvik 
>>
>> This patch adds support in gcc+gcov for modified condition/decision
>> coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of
>> test/code coverage and it is particularly important in the aviation and
>> automotive industries for safety-critical applications. MC/DC is
>> required for or recommended by:
>>
>> * DO-178C for the most critical software (Level A) in avionics
>> * IEC 61508 for SIL 4
>> * ISO 26262-6 for ASIL D
>>
>> From the SQLite webpage:
>>
>> Two methods of measuring test coverage were described above:
>> "statement" and "branch" coverage. There are many other test
>> coverage metrics besides these two. Another popular metric is
>> "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
>> MC/DC as follows:
>>
>> * Each decision tries every possible outcome.
>> * Each condition in a decision takes on every possible outcome.
>> * Each entry and exit point is invoked.
>> * Each condition in a decision is shown to independently affect
>>   the outcome of the decision.
>>
>> In the C programming language where && and || are "short-circuit"
>> operators, MC/DC and branch coverage are very nearly the same thing.
>> The primary difference is in boolean vector tests. One can test for
>> any of several bits in bit-vector and still obtain 100% branch test
>> coverage even though the second element of MC/DC - the requirement
>> that each condition in a decision take on every possible outcome -
>> might not be satisfied.
>>
>> https://sqlite.org/testing.html#mcdc
>>
>> Whalen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
>> MC/DC" describes an algorithm for adding instrumentation by carrying
>> over information from the AST, but my algorithm analyses the control
>> flow graph to instrument for coverage. This has the benefit of being
>> programming language independent and faithful to compiler decisions
>> and transformations, although I have only tested it on constructs in C
>> and C++, see testsuite/gcc.misc-tests and testsuite/g++.dg.
>>
>> Like Whalen et al., this implementation records coverage in fixed-size
>> bitsets which gcov knows how to interpret. This is very fast, but
>> limits the number of terms in a single boolean expression to the
>> number of bits in a gcov_unsigned_type (which is typedef'd to
>> uint64_t); for most practical purposes this should be acceptable.
>> This limitation is in the implementation and not the
>> algorithm, so support for more conditions can be added by also
>> introducing arbitrary-sized bitsets.
>>
>> For space overhead, the instrumentation needs two accumulators
>> (gcov_unsigned_type) per condition in the program which will be written
>> to the gcov file. In addition, every function gets a pair of local
>> accumulators, but these accumulators are reused between conditions in the
>> same function.
>>
>> For time overhead, there is a zeroing of the local accumulators for
>> every condition and one or two bitwise operations on every edge taken in
>> an expression.
>>
>> In action it looks pretty similar to the branch coverage. The -g short
>> opt carries no significance, but was chosen because it was an available
>> option with the upper-case free too.
>>
>> gcov --conditions:
>>
>> 3:   17:void fn (int a, int b, int c, int d) {
>> 3:   18:if ((a && (b || c)) && d)
>> condition outcomes covered 3/8
>> condition  0 not covered (true false)
>> condition  1 not covered (true)
>> condition  2 not covered (true)
>> condition  3 not covered (true)
>> 1:   19:x = 1;
>> -:   20:else
>> 2:   21:x = 2;
>> 3:   22:}
>>
>> gcov --conditions --json-format:
>>
>> "conditions": [
>> {
>> "not_covered_false": [
>> 0
>> ],
>> "count": 8,
>> "covered": 3,
>> "not_covered_true": [
>> 0,
>> 1,
>> 2,
>> 3
>> ]
>> }
>> ],
>>
>> Some expressions, mostly those without else-blocks, are effectively
>> "rewritten" in the CFG construction making the algorithm unable to
>> distinguish them:
>>
>> and.c:
>>
>> if (a && b && c)
>> x = 1;
>>
>> ifs.c:
>>
>> if (a)
>> if (b)
>> if (c)
>> x = 1;
>>
>> gcc will build the same graph for both these programs, and gcov will
>> report both as 3-term expressions. It is vital that it is not
>> interpreted the other way around (which is consistent with the shape of
>> the graph) because otherwise the masking would be wrong for the and.c
>> program which is a more severe error. While surprising, users would
>> probably expect some minor rewriting of semantically-identical
>> 
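
The fixed-size bitset bookkeeping mentioned in the quoted description can be illustrated with a deliberately naive C sketch. It records, for the expression (a && (b || c)) && d from the gcov example, which outcomes each condition has taken, using one bit per condition in two 64-bit accumulators. This is not the patch's CFG-based algorithm and it does no MC/DC masking, so its counts are not expected to match the gcov output above; all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accumulators: bit I is set once condition I has been seen
   evaluating to true (resp. false).  */
struct cond_coverage
{
  uint64_t seen_true;
  uint64_t seen_false;
};

/* Record the outcome of condition INDEX and pass the value through.  */
static bool
record (struct cond_coverage *cov, unsigned index, bool value)
{
  if (value)
    cov->seen_true |= UINT64_C (1) << index;
  else
    cov->seen_false |= UINT64_C (1) << index;
  return value;
}

/* Instrumented version of: (a && (b || c)) && d.  Short-circuiting means
   a condition is recorded only when it is actually evaluated.  */
static bool
decision (struct cond_coverage *cov, bool a, bool b, bool c, bool d)
{
  return (record (cov, 0, a)
          && (record (cov, 1, b) || record (cov, 2, c)))
         && record (cov, 3, d);
}

static unsigned
count_bits (uint64_t v)
{
  unsigned n = 0;
  for (; v != 0; v &= v - 1)
    n++;
  return n;
}

static unsigned
outcomes_covered (const struct cond_coverage *cov)
{
  return count_bits (cov->seen_true) + count_bits (cov->seen_false);
}

void
coverage_example (void)
{
  struct cond_coverage cov = { 0, 0 };
  decision (&cov, false, true, true, true);   /* Only a is evaluated.  */
  assert (outcomes_covered (&cov) == 1);
  decision (&cov, true, false, true, false);  /* a, b, c and d evaluated.  */
  assert (outcomes_covered (&cov) == 5);
}
```

With one bit per condition, a 64-bit accumulator caps an expression at 64 terms, which is the limit the description attributes to gcov_unsigned_type.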

Re: Java front-end and library patches.

2022-12-01 Thread Zopolis0 via Gcc-patches
>  the "all in one go" approach that you seem to have attempted (?)

I did do all the patches in one go onto master, but for rebases and
bisects I did apply them on various baselines. See
https://github.com/Zopolis4/gcj-branches, where all the branches
labelled newplan/year-month-day will have these patches reapplied on
that date.

On Fri, Dec 2, 2022 at 11:24 AM Zopolis0  wrote:
>
> In response to the testing thing, one critical issue is that these
> patches aren't entirely functional (see the second point of my
> original message), so I can't test yet. I'll check once I can though.


Re: Java front-end and library patches.

2022-12-01 Thread Zopolis0 via Gcc-patches
In response to the testing thing, one critical issue is that these
patches aren't entirely functional (see the second point of my
original message), so I can't test yet. I'll check once I can though.


Re: [PATCH 19/56] Revert "Move void_list_node init to common code". (8ff2a92a0450243e52d3299a13b30f208bafa7e0)

2022-12-01 Thread Zopolis0 via Gcc-patches
> But that looks like the correct thing to do.

It's not. The patch I reverted changes it so that no matter what,
void_list_node = build_tree_list (NULL_TREE, void_type_node);.

Before, each front-end set it in their own way, but they all set it
via void_list_node = build_tree_list (NULL_TREE, void_type_node); or a
synonym anyway. So while the patch made sense in a java-free context,
given that java sets it a different way, I can't see a world in which
this commit stays active and Java works, unless we find a way to set
it in tree.cc for every language except Java.


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Siddhesh Poyarekar

On 2022-12-01 18:19, Kees Cook wrote:
> On Thu, Dec 01, 2022 at 10:27:41PM +, Qing Zhao wrote:
>> Hi, Sid,
>>
>> Thanks a lot for the input.
>>
>> After more thinking based on your and Kees’ comments, I have the following
>> thoughts:
>>
>> 1. -fstrict-flex-arrays=level should control both GCC code gen and warnings
>> consistently;
>> 2. We need warnings specifically for -fstrict-flex-arrays=level to report any
>> misuse of flexible arrays corresponding to the “level”, to gradually
>> encourage adherence to the language standard.
>>
>> So, based on the above two, I think what I did in this current patch is correct:
>>
>> 1.  We eliminate the control from -Warray-bounds=level on treating flex arrays;
>>  now only "-fstrict-flex-arrays=level" controls how the warning treats
>> flex arrays.
>> 2.  We add a separate new warning -Wstrict-flex-arrays to report any misuse
>> corresponding to the different levels of -fstrict-flex-arrays.
>>
>> Although we can certainly merge these new warnings into -Warray-bounds,
>> as Sid mentioned, -Warray-bounds issues a lot more warnings than just
>> flexible array misuse. I think it’s necessary to provide a separate
>> warning that only reports flexible array misuse.
>>
>> Let me know if you have any more comments on this.

That's how I understood Richard's comment.

> Okay, that seems good. Given that -Warray-bounds is part of -Wall, what
> should happen for -Wstrict-flex-arrays=N?

I suppose it would be independent of -Wall, dependent completely on
-fstrict-flex-arrays.

Thanks,
Sid


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Kees Cook via Gcc-patches
On Thu, Dec 01, 2022 at 10:27:41PM +, Qing Zhao wrote:
> Hi, Sid,
> 
> Thanks a lot for the input.
> 
> After more thinking based on your and Kees’ comments, I have the following
> thoughts:
>
> 1. -fstrict-flex-arrays=level should control both GCC code gen and warnings
> consistently;
> 2. We need warnings specifically for -fstrict-flex-arrays=level to report any
> misuse of flexible arrays corresponding to the “level”, to gradually
> encourage adherence to the language standard.
>
> So, based on the above two, I think what I did in this current patch is
> correct:
>
> 1.  We eliminate the control from -Warray-bounds=level on treating flex
> arrays;
>  now only "-fstrict-flex-arrays=level" controls how the warning treats
> flex arrays.
> 2.  We add a separate new warning -Wstrict-flex-arrays to report any misuse
> corresponding to the different levels of -fstrict-flex-arrays.
>
> Although we can certainly merge these new warnings into -Warray-bounds,
> as Sid mentioned, -Warray-bounds issues a lot more warnings than just
> flexible array misuse. I think it’s necessary to provide a separate
> warning that only reports flexible array misuse.
> 
> Let me know if you have any more comments on this.

Okay, that seems good. Given that -Warray-bounds is part of -Wall, what
should happen for -Wstrict-flex-arrays=N?

-- 
Kees Cook


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Qing Zhao via Gcc-patches
Hi, Sid,

Thanks a lot for the input.

After more thinking based on your and Kees’ comments, I have the following
thoughts:

1. -fstrict-flex-arrays=level should control both GCC code gen and warnings
consistently;
2. We need warnings specifically for -fstrict-flex-arrays=level to report any
misuse of flexible arrays corresponding to the “level”, to gradually
encourage adherence to the language standard.

So, based on the above two, I think what I did in this current patch is correct:

1.  We eliminate the control from -Warray-bounds=level on treating flex arrays;
 now only "-fstrict-flex-arrays=level" controls how the warning treats
flex arrays.
2.  We add a separate new warning -Wstrict-flex-arrays to report any misuse
corresponding to the different levels of -fstrict-flex-arrays.

Although we can certainly merge these new warnings into -Warray-bounds,
as Sid mentioned, -Warray-bounds issues a lot more warnings than just
flexible array misuse. I think it’s necessary to provide a separate
warning that only reports flexible array misuse.

Let me know if you have any more comments on this.

thanks.

Qing



> On Dec 1, 2022, at 2:45 PM, Siddhesh Poyarekar  wrote:
> 
> On 2022-12-01 11:42, Kees Cook wrote:
>> On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
>>> '-Wstrict-flex-arrays'
>>>  Warn about improper usages of flexible array members according to
>>>  the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>>>  the trailing array field of a structure if it's available,
>>>  otherwise according to the LEVEL of the option
>>>  '-fstrict-flex-arrays=LEVEL'.
>>> 
>>>  This option is effective only when LEVEL is bigger than 0.
>>>  Otherwise, it will be ignored with a warning.
>>> 
>>>  when LEVEL=1, warnings will be issued for a trailing array
>>>  reference of a structure that has 2 or more elements if the
>>>  trailing array is referenced as a flexible array member.
>>> 
>>>  when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>>>  issued for a trailing one-element array reference of a structure if
>>>  the array is referenced as a flexible array member.
>>> 
>>>  when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>>>  issued for a trailing zero-length array reference of a structure if
>>>  the array is referenced as a flexible array member.
>>> 
>>> At the same time, -Warray-bounds is updated:
>> Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
>> only the latter was going to exist?
> 
> Oh my understanding of the consensus was to move flex array related diagnosis 
> from -Warray-bounds to -Wstrict-flex-arrays as Qing has done. If only the 
> former exists then instead of removing the flex array related statement in 
> the documentation as Richard suggested, we need to enhance it to say that 
> behaviour of -Warray-bounds will depend on -fstrict-flex-arrays.
> 
> -Warray-bounds does diagnosis beyond just flexible arrays, in case that's the 
> confusion.
> 
> Sid
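
For reference, the trailing-array shapes that the quoted option text distinguishes can be written down in C as below. The struct and function names are hypothetical, and the level annotations simply restate the proposed documentation quoted above: they are not verified against any released GCC. The zero-length form is a GNU extension. Only the C99 flexible array member is actually indexed here, since indexing the other forms past their declared bound is the misuse under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Trailing array with 2 or more elements: per the quoted text, use as a
   flexible array member would be reported starting at LEVEL=1.  */
struct trailing_n { size_t len; int data[4]; };

/* Trailing one-element array: additionally reported at LEVEL=2.  */
struct trailing_one { size_t len; int data[1]; };

/* Trailing zero-length array (GNU extension): additionally reported at
   LEVEL=3.  */
struct trailing_zero { size_t len; int data[0]; };

/* C99 flexible array member: the standard form.  */
struct trailing_flex { size_t len; int data[]; };

/* Allocate a struct trailing_flex with room for N elements and fill
   data[i] with i.  Returns NULL on allocation failure.  */
static struct trailing_flex *
alloc_flex (size_t n)
{
  struct trailing_flex *p = malloc (sizeof *p + n * sizeof p->data[0]);
  if (p == NULL)
    return NULL;
  p->len = n;
  for (size_t i = 0; i < n; i++)
    p->data[i] = (int) i;
  return p;
}

void
flex_example (void)
{
  struct trailing_flex *p = alloc_flex (8);
  assert (p != NULL);
  assert (p->len == 8 && p->data[7] == 7);
  free (p);
}
```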



Re: [PATCH] c++: comptypes ICE with BOUND_TEMPLATE_TEMPLATE_PARMs [PR107539]

2022-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/22 15:57, Patrick Palka wrote:

Here the two BOUND_TEMPLATE_TEMPLATE_PARMs written as C end
up having the same TYPE_CANONICAL since the ctp_table (which interns the
canonical form of a template type parameter) doesn't set the
comparing_specializations flag which controls how PARM_DECLs from
different DECL_CONTEXTs compare equal.

Later (from spec_hasher::equal for the two specializations of i) we end
up calling comptypes on these two types with comparing_specializations
set, which notices their TYPE_CANONICAL is the same despite them no
longer structurally comparing equal (thanks to the flag) and so we ICE:

   internal compiler error: same canonical type node for different types
 'C' and 'C'

This suggests that we should be setting comparing_specializations from
ctp_hasher::equal as well.  But doing so introduces an ICE in
cpp2a/concepts-placeholder3.C:

   internal compiler error: canonical types differ for identical types
   'auto [requires ::same_as<, decltype(f::x)>]' and
   'auto [requires ::same_as<, decltype(g::x)>]'

since norm_hasher::equal doesn't set comparing_specializations.  I'm not
sure when exactly we need to set comparing_specializations given that it
controls three things (TYPENAME_TYPE equality/hashing and PARM_DECL
equality) but it seems to be the conservative choice to set the flag
whenever we have a global hash table that relies on structural equality
of trees.  To that end this patch sets comparing_specializations in
ctp_hasher and the normalization/satisfaction hashers.  This turns out
to be a performance win of about 2% in some concepts tests, probably
because of the improved TYPENAME_TYPE hashing enabled by the flag.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/107539

gcc/cp/ChangeLog:

* constraint.cc (norm_hasher::hash, norm_hasher::equal): Set
comparing_specializations.
(sat_hasher::hash, sat_hasher::equal): Likewise.
* cp-tree.h (atom_hasher::hash, atom_hasher::equal): Likewise.
* pt.cc (ctp_hasher::hash, ctp_hasher::equal): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/canon-type-19.C: New test.
---
  gcc/cp/constraint.cc  | 18 +++---
  gcc/cp/cp-tree.h  | 10 --
  gcc/cp/pt.cc  |  7 ++-
  gcc/testsuite/g++.dg/template/canon-type-19.C | 18 ++
  4 files changed, 47 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/canon-type-19.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index ab0f66b3d7e..37eae03afdb 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -715,14 +715,20 @@ struct norm_hasher : ggc_ptr_hash
  {
static hashval_t hash (norm_entry *e)
{
-hashval_t hash = iterative_hash_template_arg (e->tmpl, 0);
-return iterative_hash_template_arg (e->args, hash);
+++comparing_specializations;
+hashval_t val = iterative_hash_template_arg (e->tmpl, 0);
+val = iterative_hash_template_arg (e->args, val);
+--comparing_specializations;
+return val;
}
  
static bool equal (norm_entry *e1, norm_entry *e2)

{
-return e1->tmpl == e2->tmpl
+++comparing_specializations;
+bool eq = e1->tmpl == e2->tmpl
&& template_args_equal (e1->args, e2->args);
+--comparing_specializations;
+return eq;
}
  };
  
@@ -2530,6 +2536,9 @@ struct sat_hasher : ggc_ptr_hash

  {
static hashval_t hash (sat_entry *e)
{
+auto cso = make_temp_override (comparing_specializations);
+++comparing_specializations;
+
  if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e->atom))
{
/* Atoms with instantiated mappings are built during satisfaction.
@@ -2564,6 +2573,9 @@ struct sat_hasher : ggc_ptr_hash
  
static bool equal (sat_entry *e1, sat_entry *e2)

{
+auto cso = make_temp_override (comparing_specializations);
+++comparing_specializations;
+
  if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e1->atom)
!= ATOMIC_CONSTR_MAP_INSTANTIATED_P (e2->atom))
return false;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 548b533266a..addd26ea077 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8418,12 +8418,18 @@ struct atom_hasher : default_hash_traits
  {
static hashval_t hash (tree t)
{
-return hash_atomic_constraint (t);
+++comparing_specializations;
+hashval_t val = hash_atomic_constraint (t);
+--comparing_specializations;
+return val;
}
  
static bool equal (tree t1, tree t2)

{
-return atomic_constraints_identical_p (t1, t2);
+++comparing_specializations;
+bool eq = atomic_constraints_identical_p (t1, t2);
+--comparing_specializations;
+return eq;
}
  };
  
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc

index 08de273a900..31691618d1b 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -4492,18 +4492,23 @@ struct ctp_hasher : 

Re: [PATCH] c++: explicit spec of constrained member tmpl [PR107522]

2022-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/22 14:51, Patrick Palka wrote:

On Thu, 1 Dec 2022, Jason Merrill wrote:


On 12/1/22 11:37, Patrick Palka wrote:

When defining an explicit specialization of a constrained member template
(of a class template) such as f and g in the below testcase, the
DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
instantiated, whereas its associated constraints are carried over
from the original template and thus are in terms of the original
DECL_TEMPLATE_PARMS.


But why are they carried over?  We wrote a specification of the constraints in
terms of the template parameters of the specialization, why are we throwing
that away?


Using the partially instantiated constraints would require adding a
special case to satisfaction since during satisfaction we currently
always use the full set of template arguments (relative to the most
general template).


But not for partial specializations, right?  It seems natural to handle 
this explicit instantiation the way we handle partial specializations, 
as both have their constraints written in terms of their template 
parameters.



For satisfaction of the partially instantiated
constraints, we'd instead have to use the template arguments relative to
the explicit specialization, e.g. {42} instead of {{int},{42}} for
A<int>::f<42>.  Not sure if that would be preferable, but it seems
doable.




So during normalization for such an explicit
specialization we need to consider the (parameters of) the most general
template, since that's what the constraints are in terms of and since we
always use the full set of template arguments during satisfaction.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 12?

PR c++/107522

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use the
most general template for an explicit specialization of a
member template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-explicit-spec7.C: New test.
---
   gcc/cp/constraint.cc  | 18 ---
   .../g++.dg/cpp2a/concepts-explicit-spec7.C| 31 +++
   2 files changed, 44 insertions(+), 5 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index ab0f66b3d7e..f1df84c2a1c 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -973,11 +973,19 @@ get_normalized_constraints_from_decl (tree d, bool
diag = false)
accepting the latter causes the template parameter level of U
to be reduced in a way that makes it overly difficult substitute
concrete arguments (i.e., eventually {int, int} during satisfaction.
*/
-  if (tmpl)
-  {
-if (DECL_LANG_SPECIFIC(tmpl) && !DECL_TEMPLATE_SPECIALIZATION (tmpl))
-  tmpl = most_general_template (tmpl);
-  }
+  if (tmpl && DECL_LANG_SPECIFIC (tmpl)
+  && (!DECL_TEMPLATE_SPECIALIZATION (tmpl)
+ /* DECL_TEMPLATE_SPECIALIZATION means we're dealing with either a
+partial specialization or an explicit specialization of a member
+template.  In the former case all is well: the constraints are in
+terms in TMPL's parameters.  But in the latter case TMPL's
+parameters are partially instantiated whereas its constraints
+aren't, so we need to consider (the parameters of) the most
+general template.  The following test distinguishes between a
+partial specialization and such an explicit specialization.  */
+ || (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
+ < TMPL_ARGS_DEPTH (DECL_TI_ARGS (tmpl)))))
+tmpl = most_general_template (tmpl);
   d = tmpl ? tmpl : decl;
   diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
new file mode 100644
index 000..5b5a6df20ff
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
@@ -0,0 +1,31 @@
+// PR c++/107522
+// { dg-do compile { target c++20 } }
+
+template
+struct A
+{
+  template
+  static void f() requires (N == 42);
+
+  template
+  struct B {
+template
+static void g() requires (T(N) == 42);
+  };
+};
+
+template<>
+template
+void A::f() requires (N == 42) { }
+
+template<>
+template<>
+template
+void A::B::g() requires (int(N) == 42) { }
+
+int main() {
+  A::f<42>();
+  A::f<43>(); // { dg-error "no match" }
+  A::B::g<42>();
+  A::B::g<43>(); // { dg-error "no match" }
+}









Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2022-12-01 Thread Joseph Myers
On Fri, 2 Dec 2022, Longjun Luo via Gcc-patches wrote:

> They are ./gcc/testsuite/gcc.dg/cpp/warn-redefined.c and
> ./gcc/testsuite/gcc.dg/cpp/warn-redefined-2.c
> 
> These two cases redefine the __TIME__ macro when using the option
> '-Wbuiltin-macro-redefined'.
> 
> I think I should add a test to verify the __LINE__ macro in these two cases.

I think it should be a test that doesn't use either 
-Wbuiltin-macro-redefined or -Wno-builtin-macro-redefined - a test of how 
the compiler behaves by default.

> So, the patch itself has no problem. What I need to do is enrich its test cases
> and update the ChangeLog, right?

The patch needs review, but I'm fine with the principle that 
-Wno-builtin-macro-redefined should apply to __LINE__ as it does to 
various other built-in macros.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] c++: comptypes ICE with BOUND_TEMPLATE_TEMPLATE_PARMs [PR107539]

2022-12-01 Thread Patrick Palka via Gcc-patches
Here the two BOUND_TEMPLATE_TEMPLATE_PARMs written as C end
up having the same TYPE_CANONICAL since the ctp_table (which interns the
canonical form of a template type parameter) doesn't set the
comparing_specializations flag which controls how PARM_DECLs from
different DECL_CONTEXTs compare equal.

Later (from spec_hasher::equal for the two specializations of i) we end
up calling comptypes on these two types with comparing_specializations
set, which notices their TYPE_CANONICAL is the same despite them no
longer structurally comparing equal (thanks to the flag) and so we ICE:

  internal compiler error: same canonical type node for different types
'C' and 'C'

This suggests that we should be setting comparing_specializations from
ctp_hasher::equal as well.  But doing so introduces an ICE in
cpp2a/concepts-placeholder3.C:

  internal compiler error: canonical types differ for identical types
  'auto [requires ::same_as<, decltype(f::x)>]' and
  'auto [requires ::same_as<, decltype(g::x)>]'

since norm_hasher::equal doesn't set comparing_specializations.  I'm not
sure when exactly we need to set comparing_specializations given that it
controls three things (TYPENAME_TYPE equality/hashing and PARM_DECL
equality) but it seems to be the conservative choice to set the flag
whenever we have a global hash table that relies on structural equality
of trees.  To that end this patch sets comparing_specializations in
ctp_hasher and the normalization/satisfaction hashers.  This turns out
to be a performance win of about 2% in some concepts tests, probably
because of the improved TYPENAME_TYPE hashing enabled by the flag.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?
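
A side note for readers of the patch that follows: the hashers bump the
flag in two styles, a manual ++/-- pair around a single computation
(norm_hasher, atom_hasher) and a scope guard obtained from
make_temp_override (sat_hasher), which suits functions with several
return paths.  Below is a minimal standalone sketch of that distinction.
It is not GCC code: the counter, the temp_override_sketch class and all
three functions are hypothetical stand-ins invented for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the global flag; not GCC's declaration.
int comparing_specializations_sketch = 0;

// Hypothetical RAII guard: remembers a variable's value on construction
// and restores it on destruction, so every return path undoes the change.
template <typename T>
class temp_override_sketch
{
  T &ref_;
  T saved_;

public:
  explicit temp_override_sketch (T &ref) : ref_ (ref), saved_ (ref) {}
  ~temp_override_sketch () { ref_ = saved_; }
  temp_override_sketch (const temp_override_sketch &) = delete;
  temp_override_sketch &operator= (const temp_override_sketch &) = delete;
};

// Hypothetical comparison that is only meaningful while the flag is set.
bool structurally_equal (int a, int b)
{
  assert (comparing_specializations_sketch > 0);
  return a == b;
}

// Manual style: a single exit, with the increment and decrement written
// out around the one computation.
bool equal_manual (int a, int b)
{
  ++comparing_specializations_sketch;
  bool eq = structurally_equal (a, b);
  --comparing_specializations_sketch;
  return eq;
}

// Guard style: early returns are safe because the guard's destructor
// restores the saved value on every path out of the function.
bool equal_guarded (int a, int b)
{
  temp_override_sketch<int> guard (comparing_specializations_sketch);
  ++comparing_specializations_sketch;

  if (a < 0 || b < 0)
    return false;
  return structurally_equal (a, b);
}
```

In both styles the counter is back at its previous value once the
function returns, which is what lets the hashers nest safely.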

PR c++/107539

gcc/cp/ChangeLog:

* constraint.cc (norm_hasher::hash, norm_hasher::equal): Set
comparing_specializations.
(sat_hasher::hash, sat_hasher::equal): Likewise.
* cp-tree.h (atom_hasher::hash, atom_hasher::equal): Likewise.
* pt.cc (ctp_hasher::hash, ctp_hasher::equal): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/canon-type-19.C: New test.
---
 gcc/cp/constraint.cc  | 18 +++---
 gcc/cp/cp-tree.h  | 10 --
 gcc/cp/pt.cc  |  7 ++-
 gcc/testsuite/g++.dg/template/canon-type-19.C | 18 ++
 4 files changed, 47 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/canon-type-19.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index ab0f66b3d7e..37eae03afdb 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -715,14 +715,20 @@ struct norm_hasher : ggc_ptr_hash
 {
   static hashval_t hash (norm_entry *e)
   {
-hashval_t hash = iterative_hash_template_arg (e->tmpl, 0);
-return iterative_hash_template_arg (e->args, hash);
+++comparing_specializations;
+hashval_t val = iterative_hash_template_arg (e->tmpl, 0);
+val = iterative_hash_template_arg (e->args, val);
+--comparing_specializations;
+return val;
   }
 
   static bool equal (norm_entry *e1, norm_entry *e2)
   {
-return e1->tmpl == e2->tmpl
+++comparing_specializations;
+bool eq = e1->tmpl == e2->tmpl
   && template_args_equal (e1->args, e2->args);
+--comparing_specializations;
+return eq;
   }
 };
 
@@ -2530,6 +2536,9 @@ struct sat_hasher : ggc_ptr_hash
 {
   static hashval_t hash (sat_entry *e)
   {
+auto cso = make_temp_override (comparing_specializations);
+++comparing_specializations;
+
 if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e->atom))
   {
/* Atoms with instantiated mappings are built during satisfaction.
@@ -2564,6 +2573,9 @@ struct sat_hasher : ggc_ptr_hash
 
   static bool equal (sat_entry *e1, sat_entry *e2)
   {
+auto cso = make_temp_override (comparing_specializations);
+++comparing_specializations;
+
 if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e1->atom)
!= ATOMIC_CONSTR_MAP_INSTANTIATED_P (e2->atom))
   return false;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 548b533266a..addd26ea077 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8418,12 +8418,18 @@ struct atom_hasher : default_hash_traits
 {
   static hashval_t hash (tree t)
   {
-return hash_atomic_constraint (t);
+++comparing_specializations;
+hashval_t val = hash_atomic_constraint (t);
+--comparing_specializations;
+return val;
   }
 
   static bool equal (tree t1, tree t2)
   {
-return atomic_constraints_identical_p (t1, t2);
+++comparing_specializations;
+bool eq = atomic_constraints_identical_p (t1, t2);
+--comparing_specializations;
+return eq;
   }
 };
 
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 08de273a900..31691618d1b 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -4492,18 +4492,23 @@ struct ctp_hasher : ggc_ptr_hash
 {
   static hashval_t hash (tree t)
   {
+++comparing_specializations;
 tree_code 

[PATCH] Fortran: error recovery simplifying UNPACK for insufficient FIELD [PR107922]

2022-12-01 Thread Harald Anlauf via Gcc-patches
Dear all,

we did not properly handle the case of insufficient array-valued
FIELD when trying to simplify UNPACK and could run into a NULL
pointer dereference.  The fix is obvious.
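
For readers less familiar with UNPACK, the failure mode can be pictured
with a small standalone sketch: the result takes the shape of MASK,
true positions consume the next element of VECTOR, and false positions
take the element of FIELD at the same position, so an array FIELD that
is shorter than MASK runs out of elements.  This is not the gfortran
implementation.  The name unpack_sketch and its std::optional return
are hypothetical, and the early exit for a too-short VECTOR is an
assumption added only to keep the sketch total.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of UNPACK on rank-1 integer arrays.  Returns
// std::nullopt instead of reading past the end of an input.
std::optional<std::vector<int>>
unpack_sketch (const std::vector<int> &vec,
               const std::vector<bool> &mask,
               const std::vector<int> &field)
{
  std::vector<int> result;
  result.reserve (mask.size ());
  std::size_t next_vec = 0;

  for (std::size_t i = 0; i < mask.size (); ++i)
    {
      if (mask[i])
        {
          // Assumption: treat an exhausted VECTOR as a failure too.
          if (next_vec >= vec.size ())
            return std::nullopt;
          result.push_back (vec[next_vec++]);
        }
      else
        {
          // Not enough elements in array FIELD: stop instead of
          // dereferencing a missing element.
          if (i >= field.size ())
            return std::nullopt;
          result.push_back (field[i]);
        }
    }
  return result;
}
```

With a two-element FIELD and a three-element MASK whose last entry is
false, the sketch gives up at the third position, which mirrors the
shape mismatch exercised by the new test.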

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 0ff50e73c6fce52263b9530daffe6743c1fc9b2c Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 1 Dec 2022 21:16:46 +0100
Subject: [PATCH] Fortran: error recovery simplifying UNPACK for insufficient
 FIELD [PR107922]

gcc/fortran/ChangeLog:

	PR fortran/107922
	* simplify.cc (gfc_simplify_unpack): Terminate simplification when
	array-valued argument FIELD does not provide enough elements.

gcc/testsuite/ChangeLog:

	PR fortran/107922
	* gfortran.dg/unpack_field_1.f90: New test.
---
 gcc/fortran/simplify.cc  |  9 -
 gcc/testsuite/gfortran.dg/unpack_field_1.f90 | 15 +++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/unpack_field_1.f90

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index b6184181f26..0a0a7755f4c 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -8485,7 +8485,14 @@ gfc_simplify_unpack (gfc_expr *vector, gfc_expr *mask, gfc_expr *field)
 	}
 	}
   else if (field->expr_type == EXPR_ARRAY)
-	e = gfc_copy_expr (field_ctor->expr);
+	if (field_ctor)
+	  e = gfc_copy_expr (field_ctor->expr);
+	else
+	  {
+	/* Not enough elements in array FIELD.  */
+	gfc_free_expr (result);
+	return &gfc_bad_expr;
+	  }
   else
 	e = gfc_copy_expr (field);

diff --git a/gcc/testsuite/gfortran.dg/unpack_field_1.f90 b/gcc/testsuite/gfortran.dg/unpack_field_1.f90
new file mode 100644
index 000..ca3cfbd2bd4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unpack_field_1.f90
@@ -0,0 +1,15 @@
+! { dg-do compile }
+! PR fortran/107922 - ICE in gfc_simplify_unpack
+! Test error recovery when shapes of FIELD and MASK do not match
+! Contributed by G.Steinmetz
+
+program p
+  integer, parameter :: a(2) = 1
+  integer, parameter :: d(3) = 1
+  logical, parameter :: mask(3) = [.false.,.true.,.false.]
+  integer, parameter :: b(2) = unpack(a,mask,a)  ! { dg-error "must have identical shape" }
+  integer :: c(3) = unpack(a,[.false.,.true.,.false.],a) ! { dg-error "must have identical shape" }
+  print *, unpack(a,mask,a)  ! { dg-error "must have identical shape" }
+  print *, unpack(a,mask,d) ! OK
+  print *, unpack(a,mask,3) ! OK
+end
--
2.35.3



Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2022-12-01 Thread Longjun Luo via Gcc-patches



On 12/2/2022 3:07 AM, Joseph Myers wrote:

On Fri, 2 Dec 2022, Longjun Luo via Gcc-patches wrote:


On 12/2/2022 1:01 AM, Joseph Myers wrote:

On Thu, 1 Dec 2022, Longjun Luo via Gcc-patches wrote:


diff --git a/gcc/testsuite/gcc.dg/builtin-redefine.c
b/gcc/testsuite/gcc.dg/builtin-redefine.c
index 882b2210992..9d5b42252ee 100644
--- a/gcc/testsuite/gcc.dg/builtin-redefine.c
+++ b/gcc/testsuite/gcc.dg/builtin-redefine.c
@@ -71,7 +71,6 @@
   /* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } .-1
} */
   #endif
   -#define __LINE__ 0   /* { dg-warning "-:\"__LINE__\" redef" }
*/
   #define __INCLUDE_LEVEL__ 0  /* { dg-warning "-:\"__INCLUDE_LEVEL__\"
redef" } */
   #define __COUNTER__ 0/* { dg-warning "-:\"__COUNTER__\" redef" }
*/

Is there some existing test that verifies that this redefinition is still
diagnosed by default (in the absence of -Wno-builtin-macro-redefined)?

I am not sure I have fully understood your meaning. The problem here is that
if I try to redefine the __LINE__ macro in a project that uses the
option '-Werror', the compilation will fail.

There are two cases:

(a) Is redefinition of __LINE__ diagnosed *without*
-Wno-builtin-macro-redefined?

(b) Is redefinition of __LINE__ diagnosed *with*
-Wno-builtin-macro-redefined?

My understanding is that both (a) and (b) have answer "yes" at present,
and your patch would change the answer to (b) to "no", without changing
the answer to (a).

My question is about whether there is a test verifying the answer to (a).
If not, I think the patch should add one.



After checking the source code, I found two similar existing test cases for 
situation (a).


They are ./gcc/testsuite/gcc.dg/cpp/warn-redefined.c and 
./gcc/testsuite/gcc.dg/cpp/warn-redefined-2.c


These two cases redefine the __TIME__ macro when using the option 
'-Wbuiltin-macro-redefined'.


I think I should add a test to verify the __LINE__ macro in these two cases.

I will write a complete test for situation (a) and situation (b). But I 
need a little time to become familiar with the GCC test cases.


So, the patch itself has no problem. What I need to do is enrich its test 
cases and update the ChangeLog, right?






Re: [PATCH] c++: explicit spec of constrained member tmpl [PR107522]

2022-12-01 Thread Patrick Palka via Gcc-patches
On Thu, 1 Dec 2022, Jason Merrill wrote:

> On 12/1/22 11:37, Patrick Palka wrote:
> > When defining an explicit specialization of a constrained member template
> > (of a class template) such as f and g in the below testcase, the
> > DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
> > instantiated, whereas its associated constraints are carried over
> > from the original template and thus are in terms of the original
> > DECL_TEMPLATE_PARMS.
> 
> But why are they carried over?  We wrote a specification of the constraints in
> terms of the template parameters of the specialization, why are we throwing
> that away?

Using the partially instantiated constraints would require adding a
special case to satisfaction since during satisfaction we currently
always use the full set of template arguments (relative to the most
general template).  For satisfaction of the partially instantiated
constraints, we'd instead have to use the template arguments relative to
the explicit specialization, e.g. {42} instead of {{int},{42}} for
A<int>::f<42>.  Not sure if that would be preferable, but it seems
doable.

> 
> > So during normalization for such an explicit
> > specialization we need to consider the (parameters of) the most general
> > template, since that's what the constraints are in terms of and since we
> > always use the full set of template arguments during satisfaction.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk and perhaps 12?
> > 
> > PR c++/107522
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constraint.cc (get_normalized_constraints_from_decl): Use the
> > most general template for an explicit specialization of a
> > member template.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/concepts-explicit-spec7.C: New test.
> > ---
> >   gcc/cp/constraint.cc  | 18 ---
> >   .../g++.dg/cpp2a/concepts-explicit-spec7.C| 31 +++
> >   2 files changed, 44 insertions(+), 5 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index ab0f66b3d7e..f1df84c2a1c 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -973,11 +973,19 @@ get_normalized_constraints_from_decl (tree d, bool
> > diag = false)
> >accepting the latter causes the template parameter level of U
> >to be reduced in a way that makes it overly difficult substitute
> >concrete arguments (i.e., eventually {int, int} during satisfaction.
> > */
> > -  if (tmpl)
> > -  {
> > -if (DECL_LANG_SPECIFIC(tmpl) && !DECL_TEMPLATE_SPECIALIZATION (tmpl))
> > -  tmpl = most_general_template (tmpl);
> > -  }
> > +  if (tmpl && DECL_LANG_SPECIFIC (tmpl)
> > +  && (!DECL_TEMPLATE_SPECIALIZATION (tmpl)
> > + /* DECL_TEMPLATE_SPECIALIZATION means we're dealing with either a
> > +partial specialization or an explicit specialization of a member
> > +template.  In the former case all is well: the constraints are in
> > +terms in TMPL's parameters.  But in the latter case TMPL's
> > +parameters are partially instantiated whereas its constraints
> > +aren't, so we need to consider (the parameters of) the most
> > +general template.  The following test distinguishes between a
> > +partial specialization and such an explicit specialization.  */
> > + || (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
> > + < TMPL_ARGS_DEPTH (DECL_TI_ARGS (tmpl)
> > +tmpl = most_general_template (tmpl);
> >   d = tmpl ? tmpl : decl;
> >   diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
> > b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
> > new file mode 100644
> > index 000..5b5a6df20ff
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
> > @@ -0,0 +1,31 @@
> > +// PR c++/107522
> > +// { dg-do compile { target c++20 } }
> > +
> > +template
> > +struct A
> > +{
> > +  template
> > +  static void f() requires (N == 42);
> > +
> > +  template
> > +  struct B {
> > +template
> > +static void g() requires (T(N) == 42);
> > +  };
> > +};
> > +
> > +template<>
> > +template
> > +void A::f() requires (N == 42) { }
> > +
> > +template<>
> > +template<>
> > +template
> > +void A::B::g() requires (int(N) == 42) { }
> > +
> > +int main() {
> > +  A::f<42>();
> > +  A::f<43>(); // { dg-error "no match" }
> > +  A::B::g<42>();
> > +  A::B::g<43>(); // { dg-error "no match" }
> > +}
> 
> 



Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Siddhesh Poyarekar

On 2022-12-01 11:42, Kees Cook wrote:

On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:

'-Wstrict-flex-arrays'
  Warn about improper usages of flexible array members according to
  the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
  the trailing array field of a structure if it's available,
  otherwise according to the LEVEL of the option
  '-fstrict-flex-arrays=LEVEL'.

  This option is effective only when LEVEL is bigger than 0.
  Otherwise, it will be ignored with a warning.

  When LEVEL=1, warnings will be issued for a trailing array
  reference of a structure that has 2 or more elements if the
  trailing array is referenced as a flexible array member.

  When LEVEL=2, in addition to LEVEL=1, additional warnings will be
  issued for a trailing one-element array reference of a structure if
  the array is referenced as a flexible array member.

  When LEVEL=3, in addition to LEVEL=2, additional warnings will be
  issued for a trailing zero-length array reference of a structure if
  the array is referenced as a flexible array member.

At the same time, -Warray-bounds is updated:


Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
only the latter was going to exist?


Oh, my understanding of the consensus was to move flex array related 
diagnostics from -Warray-bounds to -Wstrict-flex-arrays as Qing has done. 
If only the former exists then instead of removing the flex array 
related statement in the documentation as Richard suggested, we need to 
enhance it to say that the behaviour of -Warray-bounds will depend on 
-fstrict-flex-arrays.


-Warray-bounds does diagnosis beyond just flexible arrays, in case 
that's the confusion.


Sid


Re: [PATCH] c++: explicit spec of constrained member tmpl [PR107522]

2022-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/22 11:37, Patrick Palka wrote:

When defining an explicit specialization of a constrained member template
(of a class template) such as f and g in the below testcase, the
DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
instantiated, whereas its associated constraints are carried over
from the original template and thus are in terms of the original
DECL_TEMPLATE_PARMS.


But why are they carried over?  We wrote a specification of the 
constraints in terms of the template parameters of the specialization, 
why are we throwing that away?



So during normalization for such an explicit
specialization we need to consider the (parameters of) the most general
template, since that's what the constraints are in terms of and since we
always use the full set of template arguments during satisfaction.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 12?

PR c++/107522

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use the
most general template for an explicit specialization of a
member template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-explicit-spec7.C: New test.
---
  gcc/cp/constraint.cc  | 18 ---
  .../g++.dg/cpp2a/concepts-explicit-spec7.C| 31 +++
  2 files changed, 44 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index ab0f66b3d7e..f1df84c2a1c 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -973,11 +973,19 @@ get_normalized_constraints_from_decl (tree d, bool diag = 
false)
   accepting the latter causes the template parameter level of U
   to be reduced in a way that makes it overly difficult substitute
   concrete arguments (i.e., eventually {int, int} during satisfaction.  */
-  if (tmpl)
-  {
-if (DECL_LANG_SPECIFIC(tmpl) && !DECL_TEMPLATE_SPECIALIZATION (tmpl))
-  tmpl = most_general_template (tmpl);
-  }
+  if (tmpl && DECL_LANG_SPECIFIC (tmpl)
+  && (!DECL_TEMPLATE_SPECIALIZATION (tmpl)
+ /* DECL_TEMPLATE_SPECIALIZATION means we're dealing with either a
+partial specialization or an explicit specialization of a member
+template.  In the former case all is well: the constraints are in
+terms in TMPL's parameters.  But in the latter case TMPL's
+parameters are partially instantiated whereas its constraints
+aren't, so we need to consider (the parameters of) the most
+general template.  The following test distinguishes between a
+partial specialization and such an explicit specialization.  */
+ || (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
+ < TMPL_ARGS_DEPTH (DECL_TI_ARGS (tmpl)))))
+tmpl = most_general_template (tmpl);
  
d = tmpl ? tmpl : decl;
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C

new file mode 100644
index 000..5b5a6df20ff
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
@@ -0,0 +1,31 @@
+// PR c++/107522
+// { dg-do compile { target c++20 } }
+
+template
+struct A
+{
+  template
+  static void f() requires (N == 42);
+
+  template
+  struct B {
+template
+static void g() requires (T(N) == 42);
+  };
+};
+
+template<>
+template
+void A::f() requires (N == 42) { }
+
+template<>
+template<>
+template
+void A::B::g() requires (int(N) == 42) { }
+
+int main() {
+  A::f<42>();
+  A::f<43>(); // { dg-error "no match" }
+  A::B::g<42>();
+  A::B::g<43>(); // { dg-error "no match" }
+}




Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2022-12-01 Thread Joseph Myers
On Fri, 2 Dec 2022, Longjun Luo via Gcc-patches wrote:

> 
> On 12/2/2022 1:01 AM, Joseph Myers wrote:
> > On Thu, 1 Dec 2022, Longjun Luo via Gcc-patches wrote:
> > 
> > > diff --git a/gcc/testsuite/gcc.dg/builtin-redefine.c
> > > b/gcc/testsuite/gcc.dg/builtin-redefine.c
> > > index 882b2210992..9d5b42252ee 100644
> > > --- a/gcc/testsuite/gcc.dg/builtin-redefine.c
> > > +++ b/gcc/testsuite/gcc.dg/builtin-redefine.c
> > > @@ -71,7 +71,6 @@
> > >   /* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } .-1
> > > } */
> > >   #endif
> > >   -#define __LINE__ 0   /* { dg-warning "-:\"__LINE__\" redef" }
> > > */
> > >   #define __INCLUDE_LEVEL__ 0  /* { dg-warning "-:\"__INCLUDE_LEVEL__\"
> > > redef" } */
> > >   #define __COUNTER__ 0/* { dg-warning "-:\"__COUNTER__\" redef" }
> > > */
> > Is there some existing test that verifies that this redefinition is still
> > diagnosed by default (in the absence of -Wno-builtin-macro-redefined)?
> 
> I am not sure I have fully understood your meaning. The problem here is that
> if I try to redefine __LINE__ macro in the situation that projects use the
> option '-Werror', the compile will fail.

There are two cases:

(a) Is redefinition of __LINE__ diagnosed *without* 
-Wno-builtin-macro-redefined?

(b) Is redefinition of __LINE__ diagnosed *with* 
-Wno-builtin-macro-redefined?

My understanding is that both (a) and (b) have answer "yes" at present, 
and your patch would change the answer to (b) to "no", without changing 
the answer to (a).

My question is about whether there is a test verifying the answer to (a).  
If not, I think the patch should add one.

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2022-12-01 Thread Alexander Monakov via Gcc-patches


On Thu, 1 Dec 2022, Joshi, Tejas Sanjay wrote:

> I have addressed all your comments in this revised patch, PFA and inlined 
> below.

Thank you. Honza, please let me know if any further input is needed
from my side. For reference, here's how insn-automata.o table sizes
look with this patch (top 17, in bytes):

20068 r bdver1_fp_check
20068 r bdver1_fp_transitions
26208 r slm_min_issue_delay
27244 r bdver1_fp_min_issue_delay
28518 r glm_check
28518 r glm_transitions
33345 r znver4_fpu_min_issue_delay
33690 r geode_min_issue_delay
46980 r bdver3_fp_min_issue_delay
49428 r glm_min_issue_delay
53730 r btver2_fp_min_issue_delay
53760 r znver1_fp_transitions
93960 r bdver3_fp_transitions
106102 r lujiazui_core_check
106102 r lujiazui_core_transitions
133380 r znver4_fpu_transitions
196123 r lujiazui_core_min_issue_delay

There is a plan to further reduce Lujiazui and b[td]verX table sizes
by properly modeling division units like we did for znver.md (PR 87832).

Alexander


Re: [PATCH v2] match.pd: rewrite select to branchless expression

2022-12-01 Thread Michael Collison

Richard,

Can you submit this patch for me while I sort out git write access?

On 11/18/22 07:57, Richard Biener wrote:

On Fri, Nov 11, 2022 at 3:28 AM Michael Collison  wrote:

This patch transforms ((x & 0x1) == 0) ? y : z <op> y into
(-(typeof(y))(x & 0x1) & z) <op> y, where op is a '^' or a '|'. It also
transforms (cond (and (x , 0x1) != 0), (z op y), y ) into (-(and (x ,
0x1)) & z ) op y.

Matching these patterns allows GCC to generate branchless code for one of
the functions in coremark.
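
The equivalence behind the rewrite can be checked outside the compiler
with a small standalone sketch.  It relies only on unsigned wraparound:
0 - (x & 1) is zero when the low bit is clear and all-ones when it is
set, so the mask either discards z or keeps it.  The function names
below are hypothetical and the code is not taken from the patch or its
test.

```cpp
#include <cstdint>

// Branchy form: ((x & 1) == 0) ? y : (z ^ y).
constexpr std::uint32_t
xor_branchy (std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
  return ((x & 1u) == 0u) ? y : (z ^ y);
}

// Branchless form: the mask is 0 or 0xffffffff, so (mask & z) is
// either 0 (leaving y unchanged) or z.
constexpr std::uint32_t
xor_branchless (std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
  return ((0u - (x & 1u)) & z) ^ y;
}

// The same pair for '|'; y | 0 is y, just as y ^ 0 is y.
constexpr std::uint32_t
ior_branchy (std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
  return ((x & 1u) == 0u) ? y : (z | y);
}

constexpr std::uint32_t
ior_branchless (std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
  return ((0u - (x & 1u)) & z) | y;
}

// Compile-time spot checks for both values of the low bit.
static_assert (xor_branchy (2u, 5u, 3u) == xor_branchless (2u, 5u, 3u),
               "low bit clear: both forms yield y");
static_assert (xor_branchy (3u, 5u, 3u) == xor_branchless (3u, 5u, 3u),
               "low bit set: both forms yield z ^ y");
static_assert (ior_branchy (3u, 4u, 3u) == ior_branchless (3u, 4u, 3u),
               "low bit set: both forms yield z | y");
```

The identity needs an operator for which 0 is the neutral element,
which holds for both '^' and '|'.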

Bootstrapped and tested on x86 and RISC-V. Okay?

OK.

Thanks,
Richard.


Michael.

2022-11-10  Michael Collison  

  * match.pd ((x & 0x1) == 0) ? y : z <op> y
  -> (-(typeof(y))(x & 0x1) & z) <op> y.

2022-11-10  Michael Collison 

  * gcc.dg/tree-ssa/branchless-cond.c: New test.

---

Changes in v2:

- Rewrite comment to use C syntax

- Guard against 1-bit types

- Simplify pattern by using zero_one_valued_p

   gcc/match.pd  | 24 +
   .../gcc.dg/tree-ssa/branchless-cond.c | 26 +++
   2 files changed, 50 insertions(+)
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 194ba8f5188..258531e9046 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3486,6 +3486,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
 (max @2 @1))

+/* ((x & 0x1) == 0) ? y : z <op> y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (eq zero_one_valued_p@0
+integer_zerop)
+@1
+(op:c @2 @1))
+  (if (INTEGRAL_TYPE_P (type)
+   && TYPE_PRECISION (type) > 1
+   && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+   (op (bit_and (negate (convert:type @0)) @2) @1))))
+
+/* ((x & 0x1) == 0) ? z <op> y : y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (ne zero_one_valued_p@0
+integer_zerop)
+   (op:c @2 @1)
+@1)
+  (if (INTEGRAL_TYPE_P (type)
+   && TYPE_PRECISION (type) > 1
+   && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+   (op (bit_and (negate (convert:type @0)) @2) @1))))
+
   /* Simplifications of shift and rotates.  */

   (for rotate (lrotate rrotate)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c 
b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
new file mode 100644
index 000..68087ae6568
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int f1(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z ^ y;
+}
+
+int f2(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z ^ y : y;
+}
+
+int f3(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z | y;
+}
+
+int f4(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z | y : y;
+}
+
+/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
--
2.34.1



Re: [PATCH 2/2]AArch64 Perform more late folding of reg moves and shifts which arrive after expand

2022-12-01 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, November 14, 2022 9:59 PM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov 
>> Subject: Re: [PATCH 2/2]AArch64 Perform more late folding of reg moves
>> and shifts which arrive after expand
>> 
>> (Sorry, immediately following up to myself for a second time recently.)
>> 
>> Richard Sandiford  writes:
>> > Tamar Christina  writes:
>> >>>
>> >>> The same thing ought to work for smov, so it would be good to do both.
>> >>> That would also make the split between the original and new patterns
>> >>> more
>> >>> obvious: left shift for the old pattern, right shift for the new pattern.
>> >>>
>> >>
>> >> Done, though because umov can do multilevel extensions I couldn't
>> >> combine them Into a single pattern.
>> >
>> > Hmm, but the pattern is:
>> >
>> > (define_insn "*si3_insn2_uxtw"
>> >   [(set (match_operand:GPI 0 "register_operand" "=r,r,r")
>> >(zero_extend:GPI (LSHIFTRT_ONLY:SI
>> >  (match_operand:SI 1 "register_operand" "w,r,r")
>> >  (match_operand:QI 2 "aarch64_reg_or_shift_imm_si"
>> "Usl,Uss,r"]
>> >
>> > GPI is just SI or DI, so in the SI case we're zero-extending SI to SI,
>> > which isn't a valid operation.  The original patch was just for
>> > extending to DI, which seems correct.  The choice between printing %x
>> > for smov and %w for umov can then depend on the code.
>
> You're right, GPI made no sense here.  Fixed.
>
>> 
>> My original comment quoted above was about using smov in the zero-
>> extend pattern.  I.e. the original:
>> 
>> (define_insn "*si3_insn2_uxtw"
>>   [(set (match_operand:DI 0 "register_operand" "=r,?r,r")
>>  (zero_extend:DI (LSHIFTRT:SI
>>   (match_operand:SI 1 "register_operand" "w,r,r")
>>   (match_operand:QI 2 "aarch64_reg_or_shift_imm_si"
>> "Usl,Uss,r"]
>> 
>> could instead be:
>> 
>> (define_insn "*si3_insn2_uxtw"
>>   [(set (match_operand:DI 0 "register_operand" "=r,?r,r")
>>  (zero_extend:DI (SHIFTRT:SI
>>   (match_operand:SI 1 "register_operand" "w,r,r")
>>   (match_operand:QI 2 "aarch64_reg_or_shift_imm_si"
>> "Usl,Uss,r"]
>> 
>> with the pattern using "smov %w0, ..." for the ashiftrt case.
>
> Almost, except the non-immediate cases don't work with shifts.
> i.e. a right shift can't be used to sign extend from 32 to 64 bits.

Right, but the pattern I quoted above is doing a zero-extend rather than
a sign-extend, even for the ashiftrt case.  That is, I was suggesting that
we keep the zero_extend fixed but allow zero extensions of both lshiftrts
and ashiftrts.  That works because ASR Wx and SMOV Wx zero-extend the Wn
result to Xn.
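
As a rough C-level picture of the two RTL shapes being discussed (not taken from the patch; the function names are invented for this sketch), both are a 32-bit right shift whose 32-bit result is then zero-extended to 64 bits:

```c
#include <stdint.h>

/* (zero_extend:DI (lshiftrt:SI x n)): logical shift, then zero-extend.  */
uint64_t lshiftrt_then_zext(uint32_t x, unsigned int n)
{
  return (uint64_t)(x >> n);
}

/* (zero_extend:DI (ashiftrt:SI x n)): arithmetic shift in 32 bits, then
   zero-extend.  The upper 32 bits of the result are always zero, even for
   negative inputs.  Right-shifting a negative int is implementation-defined
   in ISO C; GCC documents it as an arithmetic shift, which is assumed here.  */
uint64_t ashiftrt_then_zext(int32_t x, unsigned int n)
{
  return (uint64_t)(uint32_t)(x >> n);
}
```

The second function is the case the suggestion is about: the shift is signed, but the extension to 64 bits is still a zero extension.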

I wasn't suggesting that you add support for SI->DI sign extensions,
although obviously the more cases we optimise the better :-)

The original comment was only supposed to be a small tweak, sorry for
not explaining it properly.

Thanks,
Richard

>
> I've merged the cases but added a guard for this.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.md (*si3_insn_uxtw): Split SHIFT into
>   left and right ones.
>   (*aarch64_ashr_sisd_or_int_3): Support smov.
>   (*si3_insn2_xtw): New.
>   * config/aarch64/constraints.md (Usl): New.
>   * config/aarch64/iterators.md (is_zeroE, extend_op): New.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/shift-read_1.c: New test.
>   * gcc.target/aarch64/shift-read_2.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 
> 39e65979528fb7f748ed456399ca38f929dba1d4..4c181a96e555c2a58c59fc991000b2a2fa9bd244
>  100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -5425,20 +5425,42 @@ (define_split
>  
>  ;; Arithmetic right shift using SISD or Integer instruction
>  (define_insn "*aarch64_ashr_sisd_or_int_3"
> -  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,,")
> +  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r,,")
>   (ashiftrt:GPI
> -   (match_operand:GPI 1 "register_operand" "r,r,w,w,w")
> +   (match_operand:GPI 1 "register_operand" "r,r,w,w,w,w")
> (match_operand:QI 2 "aarch64_reg_or_shift_imm_di"
> -"Us,r,Us,w,0")))]
> +"Us,r,Us,Usl,w,0")))]
>""
> -  "@
> -   asr\t%0, %1, %2
> -   asr\t%0, %1, %2
> -   sshr\t%0, %1, %2
> -   #
> -   #"
> -  [(set_attr "type" 
> "bfx,shift_reg,neon_shift_imm,neon_shift_reg,neon_shift_reg")
> -   (set_attr "arch" "*,*,simd,simd,simd")]
> +  {
> +switch (which_alternative)
> +{
> +  case 0:
> + return "asr\t%0, %1, %2";
> +  case 1:
> + return "asr\t%0, %1, %2";
> +  case 2:
> + return "sshr\t%0, %1, %2";
> +  case 3:
> + {
> +   int val = INTVAL 

Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2022-12-01 Thread Longjun Luo via Gcc-patches



On 12/2/2022 1:01 AM, Joseph Myers wrote:

On Thu, 1 Dec 2022, Longjun Luo via Gcc-patches wrote:


diff --git a/gcc/testsuite/gcc.dg/builtin-redefine.c 
b/gcc/testsuite/gcc.dg/builtin-redefine.c
index 882b2210992..9d5b42252ee 100644
--- a/gcc/testsuite/gcc.dg/builtin-redefine.c
+++ b/gcc/testsuite/gcc.dg/builtin-redefine.c
@@ -71,7 +71,6 @@
  /* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } .-1 } */
  #endif
  
-#define __LINE__ 0   /* { dg-warning "-:\"__LINE__\" redef" } */

  #define __INCLUDE_LEVEL__ 0  /* { dg-warning "-:\"__INCLUDE_LEVEL__\" redef" 
} */
  #define __COUNTER__ 0/* { dg-warning "-:\"__COUNTER__\" redef" } */

Is there some existing test that verifies that this redefinition is still
diagnosed by default (in the absence of -Wno-builtin-macro-redefined)?


I am not sure I have fully understood your question. The problem here is 
that if I try to redefine the __LINE__ macro in a project that uses the 
'-Werror' option, the compilation will fail.


For example, the following compilation will fail:

echo "void main(){}" | gcc -D__LINE__=0 -Werror -x c -


The compilation output is:

: error: "__LINE__" redefined [-Werror]
cc1: all warnings being treated as errors


As far as I know, most projects, including the Linux kernel, enable 
'-Werror' by default. So redefining the __LINE__ macro in this situation 
is impossible.


The reason that I want to redefine __LINE__ macro has been explained in 
the commit.


Thanks for your patience and hope I hit the point.






Re: [PATCH] c++, v2: Incremental fix for g++.dg/gomp/for-21.C [PR84469]

2022-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/22 05:32, Jakub Jelinek wrote:

On Wed, Nov 30, 2022 at 01:52:08PM -0500, Jason Merrill wrote:

It looks like we're already deducing the type for the underlying S variable
in cp_convert_omp_range_for, we just aren't updating the types of the
individual bindings.


You're right.  With this patch (still incremental against the base PR84469
patch) we get the nicer diagnostics in all cases.

Regtested successfully on x86_64-linux (g++ gomp.exp/goacc.exp/goacc-gomp.exp
and libgomp's c++.exp), ok for trunk (including the base patch)
if it passes full bootstrap/regtest?


OK, thanks.


2022-12-01  Jakub Jelinek  

PR c++/84469
gcc/c-family/
* c-omp.cc (c_omp_is_loop_iterator): For range for with structured
binding return TREE_VEC_LENGTH (d->declv) even if decl is equal
to any of the structured binding decls.
gcc/cp/
* parser.cc (cp_convert_omp_range_for): After do_auto_deduction if
!processing_template_decl call cp_finish_decomp with
processing_template_decl temporarily incremented.
gcc/testsuite/
* g++.dg/gomp/for-21.C (f3, f6, f9): Adjust expected diagnostics.
* g++.dg/gomp/for-22.C: New test.

--- gcc/c-family/c-omp.cc.jj2022-10-04 10:36:46.515414485 +0200
+++ gcc/c-family/c-omp.cc   2022-12-01 10:57:56.365253302 +0100
@@ -1311,10 +1311,11 @@ c_omp_is_loop_iterator (tree decl, struc
  else if (TREE_CODE (TREE_VEC_ELT (d->declv, i)) == TREE_LIST
 && TREE_CHAIN (TREE_VEC_ELT (d->declv, i))
 && (TREE_CODE (TREE_CHAIN (TREE_VEC_ELT (d->declv, i)))
-== TREE_VEC)
-&& decl == TREE_VEC_ELT (TREE_CHAIN (TREE_VEC_ELT (d->declv,
- i)), 2))
-  return TREE_VEC_LENGTH (d->declv);
+== TREE_VEC))
+  for (int j = 2;
+  j < TREE_VEC_LENGTH (TREE_CHAIN (TREE_VEC_ELT (d->declv, i))); j++)
+   if (decl == TREE_VEC_ELT (TREE_CHAIN (TREE_VEC_ELT (d->declv, i)), j))
+ return TREE_VEC_LENGTH (d->declv);
return -1;
  }
  
--- gcc/cp/parser.cc.jj	2022-12-01 10:19:27.0 +0100

+++ gcc/cp/parser.cc2022-12-01 10:21:30.760450093 +0100
@@ -43126,8 +43126,16 @@ cp_convert_omp_range_for (tree _pre
tree t = build_x_indirect_ref (input_location, begin, RO_UNARY_STAR,
 NULL_TREE, tf_none);
if (!error_operand_p (t))
-   TREE_TYPE (orig_decl) = do_auto_deduction (TREE_TYPE (orig_decl),
-  t, auto_node);
+   {
+ TREE_TYPE (orig_decl) = do_auto_deduction (TREE_TYPE (orig_decl),
+t, auto_node);
+ if (decomp_first_name)
+   {
+ ++processing_template_decl;
+ cp_finish_decomp (orig_decl, decomp_first_name, decomp_cnt);
+ --processing_template_decl;
+   }
+   }
  }
  
tree v = make_tree_vec (decomp_cnt + 3);

--- gcc/testsuite/g++.dg/gomp/for-21.C.jj   2022-11-30 10:29:09.332186135 
+0100
+++ gcc/testsuite/g++.dg/gomp/for-21.C  2022-12-01 11:05:40.888414600 +0100
@@ -24,9 +24,9 @@ void
  f3 (S ()[10])
  {
#pragma omp for collapse (2)
-  for (auto [i, j, k] : a) // { dg-error "use of 'i' before deduction of 
'auto'" "" { target *-*-* } .+1 }
-for (int l = i; l < j; l += k)  // { dg-error "use of 'j' before 
deduction of 'auto'" }
-  ;// { dg-error "use of 'k' before 
deduction of 'auto'" "" { target *-*-* } .-1 }
+  for (auto [i, j, k] : a) // { dg-error "initializer 
expression refers to iteration variable 'i'" }
+for (int l = i; l < j; l += k)  // { dg-error "condition expression 
refers to iteration variable 'j'" }
+  ;// { dg-error "increment expression 
refers to iteration variable 'k'" "" { target *-*-* } .-2 }
  }
  
  template 

@@ -54,9 +54,9 @@ void
  f6 (S ()[10])
  {
#pragma omp for collapse (2)
-  for (auto [i, j, k] : a) // { dg-error "use of 'i' before deduction of 
'auto'" "" { target *-*-* } .-1 }
-for (int l = i; l < j; l += k)  // { dg-error "use of 'j' before 
deduction of 'auto'" }
-  ;// { dg-error "use of 'k' before 
deduction of 'auto'" "" { target *-*-* } .-3 }
+  for (auto [i, j, k] : a) // { dg-error "initializer expression refers 
to iteration variable 'i'" "" { target *-*-* } .-1 }
+for (int l = i; l < j; l += k)  // { dg-error "condition expression 
refers to iteration variable 'j'" }
+  ;// { dg-error "increment expression 
refers to iteration variable 'k'" "" { target *-*-* } .-3 }
  }
  
  template 

@@ -84,9 +84,9 @@ void
  f9 (U ()[10])
  {
#pragma omp for collapse (2)
-  for (auto [i, j, k] : 

[PATCH] arm: Split up MVE _Generic associations to prevent type clashes [PR107515]

2022-12-01 Thread Stam Markianos-Wright via Gcc-patches

Hi all,

With these previous patches:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606586.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606587.html
we enabled the MVE overloaded _Generic associations to handle more
scalar types, however at PR 107515 we found a new regression that
wasn't detected in our testing:

With glibc's `posix/types.h`:
```
typedef signed int __int32_t;
...
typedef __int32_t int32_t;
```
We would get an `error: '_Generic' specifies two compatible types`
from `__ARM_mve_coerce3`, because when `type` is `int` the associations
`type: param` and `int32_t: param` name the same type under the hood.

The same did not happen with Newlib's header `sys/_stdint.h`:
```
typedef long int __int32_t;
...
typedef __int32_t int32_t ;
```
which worked fine, because it uses `long int`.

The same could feasibly happen in `__ARM_mve_coerce2` between
`__fp16` and `float16_t`.

The solution here is to break the _Generic down, so that the similar
types don't appear at the same level, as is done in `__ARM_mve_typeid`.
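
To make the idea concrete outside of arm_mve.h, here is a small stand-alone sketch. The macro name `KIND_NESTED` and the numeric results are invented for this example; only the nesting technique mirrors what the patch describes.

```c
#include <stdint.h>

/* A flat association list such as
     _Generic((x), type: 1, int32_t: 2, int64_t: 3, default: 0)
   is rejected whenever TYPE is compatible with one of the fixed-width
   typedefs (for example TYPE == int where int32_t is a typedef for int).

   Nesting the fixed-width types under the outer default keeps TYPE and
   the typedefs in separate _Generic selections, so they never clash.  */
#define KIND_NESTED(x, type)                          \
  _Generic((x),                                       \
           type: 1,                                   \
           default: _Generic((x),                     \
                             int32_t: 2,              \
                             int64_t: 3,              \
                             default: 0))
```

With this shape, `KIND_NESTED(5, int)` selects the outer `type` association regardless of how `int32_t` is defined by the C library, while an `int64_t` argument falls through to the inner selection.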

Ok for trunk?

Thanks,
Stam Markianos-Wright

gcc/ChangeLog:
    PR target/96795
    PR target/107515
    * config/arm/arm_mve.h (__ARM_mve_coerce2): Split types.
    (__ARM_mve_coerce3): Likewise.

gcc/testsuite/ChangeLog:
    PR target/96795
    PR target/107515
    * 
gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-fp.c: New test.
    * 
gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c: New test.



=== Inline Ctrl+C, Ctrl+V or patch ===

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 
09167ec118ed3310c5077145e119196f29d83cac..70003653db65736fcfd019e83d9f18153be650dc 
100644

--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -35659,9 +35659,9 @@ extern void *__ARM_undef;
 #define __ARM_mve_coerce1(param, type) \
 _Generic(param, type: param, const type: param, default: *(type 
*)__ARM_undef)

 #define __ARM_mve_coerce2(param, type) \
-    _Generic(param, type: param, float16_t: param, float32_t: param, 
default: *(type *)__ARM_undef)
+    _Generic(param, type: param, __fp16: param, default: _Generic 
(param, _Float16: param, float16_t: param, float32_t: param, default: 
*(type *)__ARM_undef))

 #define __ARM_mve_coerce3(param, type) \
-    _Generic(param, type: param, int8_t: param, int16_t: param, 
int32_t: param, int64_t: param, uint8_t: param, uint16_t: param, 
uint32_t: param, uint64_t: param, default: *(type *)__ARM_undef)
+    _Generic(param, type: param, default: _Generic (param, int8_t: 
param, int16_t: param, int32_t: param, int64_t: param, uint8_t: param, 
uint16_t: param, uint32_t: param, uint64_t: param, default: *(type 
*)__ARM_undef))


 #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point.  */

diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-fp.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-fp.c

new file mode 100644
index 
..427dcacb5ff59b53d5eab1f1582ef6460da3f2f3

--- /dev/null
+++ 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-fp.c

@@ -0,0 +1,65 @@
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2 -Wno-pedantic -Wno-long-long" } */
+#include "arm_mve.h"
+
+float f1;
+double f2;
+float16_t f3;
+float32_t f4;
+__fp16 f5;
+_Float16 f6;
+
+int i1;
+short i2;
+long i3;
+long long i4;
+int8_t i5;
+int16_t i6;
+int32_t i7;
+int64_t i8;
+
+const int ci1;
+const short ci2;
+const long ci3;
+const long long ci4;
+const int8_t ci5;
+const int16_t ci6;
+const int32_t ci7;
+const int64_t ci8;
+
+float16x8_t floatvec;
+int16x8_t intvec;
+
+void test(void)
+{
+    /* Test a few different supported ways of passing an int value.  The
+    intrinsic vmulq was chosen arbitrarily, but it is representative of
+    all intrinsics that take a non-const scalar value.  */
+    intvec = vmulq(intvec, 2);
+    intvec = vmulq(intvec, (int32_t) 2);
+    intvec = vmulq(intvec, (short) 2);
+    intvec = vmulq(intvec, i1);
+    intvec = vmulq(intvec, i2);
+    intvec = vmulq(intvec, i3);
+    intvec = vmulq(intvec, i4);
+    intvec = vmulq(intvec, i5);
+    intvec = vmulq(intvec, i6);
+    intvec = vmulq(intvec, i7);
+    intvec = vmulq(intvec, i8);
+
+    /* Test a few different supported ways of passing a float value.  */
+    floatvec = vmulq(floatvec, 0.5);
+    floatvec = vmulq(floatvec, 0.5f);
+    floatvec = vmulq(floatvec, (__fp16) 0.5);
+    floatvec = vmulq(floatvec, f1);
+    floatvec = vmulq(floatvec, f2);
+    floatvec = vmulq(floatvec, f3);
+    floatvec = vmulq(floatvec, f4);
+    floatvec = vmulq(floatvec, f5);
+    floatvec = vmulq(floatvec, f6);
+    floatvec = vmulq(floatvec, 0.15f16);
+    floatvec = vmulq(floatvec, (_Float16) 0.15);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git 

Re: [PATCH] varasm: Fix type confusion bug

2022-12-01 Thread Alex Coplan via Gcc-patches
On 01/12/2022 16:12, Richard Sandiford wrote:
> Alex Coplan via Gcc-patches  writes:
> > Hi,
> >
> > This patch fixes a type confusion bug in varasm.cc:assemble_variable.
> > The problem is that the current code calls:
> >
> >   sect = get_variable_section (decl, false);
> >
> > and then accesses sect->named.name without checking whether the section
> > is in fact a named section. In the surrounding else clause, we only know
> > that SECTION_STYLE (sect) != SECTION_NOSWITCH, so it is possible that
> > the section is an unnamed section.
> >
> > In practice, this means that we end up doing a wild string compare
> > between a function pointer and the string literal ".vtable_map_vars".
> > This is because sect->named.name aliases sect->unnamed.callback in the
> > section union.
> >
> > This can be seen in GDB with a simple testcase such as "int x;".
> >
> > This patch fixes the issue by checking the SECTION_STYLE of the section
> > is in fact SECTION_NAMED before trying to do the string comparison.
> >
> > We drop the existing check of whether sect->named.name is non-NULL
> > because this should presumably always be the case for a named section.
> >
> > Bootstrapped/regtested on aarch64-none-linux-gnu, OK for trunk?
> 
> OK, thanks.  I think it's OK for backports too if you like,
> since it's a regression from around 2013.

Thanks, I've pushed the patch to trunk, and will backport if there are
no complaints after a week or so.
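
For readers unfamiliar with the pattern, the bug class can be sketched with a simplified tagged union. The types and names below are stand-ins invented for this example and are not GCC's actual `section` definitions; the point is only that the tag must be checked before the `name` member is read.

```c
#include <string.h>

enum section_style { STYLE_UNNAMED, STYLE_NAMED };

struct named_section {
  enum section_style style;
  const char *name;
};

struct unnamed_section {
  enum section_style style;
  void (*callback) (const void *);  /* occupies the same slot as name */
};

union section {
  enum section_style style;
  struct named_section named;
  struct unnamed_section unnamed;
};

static void noop_callback(const void *data) { (void) data; }

union section make_named(const char *name)
{
  union section s;
  s.named.style = STYLE_NAMED;
  s.named.name = name;
  return s;
}

union section make_unnamed(void)
{
  union section s;
  s.unnamed.style = STYLE_UNNAMED;
  s.unnamed.callback = noop_callback;
  return s;
}

/* Check the tag first; only then is named.name a real string.  Testing
   `sect->named.name != NULL` instead would pass for an unnamed section,
   because the callback pointer aliases the name field.  */
int is_vtable_map_section(const union section *sect)
{
  return sect->style == STYLE_NAMED
         && strcmp(sect->named.name, ".vtable_map_vars") == 0;
}
```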

Alex

> 
> Richard
> 
> >
> > Thanks,
> > Alex
> >
> > gcc/ChangeLog:
> >
> > * varasm.cc (assemble_variable): Fix type confusion bug when
> > checking for ".vtable_map_vars" section.
> >
> > diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> > index 9dfbebbb915..6851201b6a2 100644
> > --- a/gcc/varasm.cc
> > +++ b/gcc/varasm.cc
> > @@ -2400,7 +2400,7 @@ assemble_variable (tree decl, int top_level 
> > ATTRIBUTE_UNUSED,
> >else
> >  {
> >/* Special-case handling of vtv comdat sections.  */
> > -  if (sect->named.name
> > +  if (SECTION_STYLE (sect) == SECTION_NAMED
> >   && (strcmp (sect->named.name, ".vtable_map_vars") == 0))
> > handle_vtv_comdat_section (sect, decl);
> >else


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Qing Zhao via Gcc-patches
Richard,

What’s your opinion on this?

Do we need one separate warning option to report only the misuse of flexible 
array members? 
Or should we just fold such warnings into -Warray-bounds when it is combined 
with -fstrict-flex-arrays?

Thanks.

Qing
> On Dec 1, 2022, at 12:18 PM, Kees Cook  wrote:
> 
> On Thu, Dec 01, 2022 at 05:04:02PM +, Qing Zhao wrote:
>> 
>> 
>>> On Dec 1, 2022, at 11:42 AM, Kees Cook  wrote:
>>> 
>>> On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
>>>> '-Wstrict-flex-arrays'
>>>> Warn about improper usages of flexible array members according to
>>>> the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>>>> the trailing array field of a structure if it's available,
>>>> otherwise according to the LEVEL of the option
>>>> '-fstrict-flex-arrays=LEVEL'.
>>>>
>>>> This option is effective only when LEVEL is bigger than 0.
>>>> Otherwise, it will be ignored with a warning.
>>>>
>>>> when LEVEL=1, warnings will be issued for a trailing array
>>>> reference of a structure that have 2 or more elements if the
>>>> trailing array is referenced as a flexible array member.
>>>>
>>>> when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>>>> issued for a trailing one-element array reference of a structure if
>>>> the array is referenced as a flexible array member.
>>>>
>>>> when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>>>> issued for a trailing zero-length array reference of a structure if
>>>> the array is referenced as a flexible array member.
>>>>
>>>> At the same time, -Warray-bounds is updated:
>>> 
>>> Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
>>> only the latter was going to exist?
>> 
>> Yes, It’s very easy to merge these two warnings into one: 
>> 
>> -Warray-bounds, when combined with -fstrict-flex-arrays,  in addition to 
>> report all the out-of-bounds warnings, it also report 
>> the misuse of flexible array members according to the LEVEL of 
>> -fstrict-flex-arrays
>> 
>> The major question is, do we need one separate warning option to report the 
>> misuse of flexible array member only?
>> If so, then we need to add a new one. 
> 
> I guess it is up to you, but I think it just makes things needlessly
> complex. I think having 1 option for behavior (-fstrict-flex-arrays),
> and 1 option for warnings (-Warray-bounds) is sufficient. I think adding
> -Wstrict-flex-arrays is confusing.
> 
> -- 
> Kees Cook



Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Kees Cook via Gcc-patches
On Thu, Dec 01, 2022 at 05:04:02PM +, Qing Zhao wrote:
> 
> 
> > On Dec 1, 2022, at 11:42 AM, Kees Cook  wrote:
> > 
> > On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
> >> '-Wstrict-flex-arrays'
> >> Warn about improper usages of flexible array members according to
> >> the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> >> the trailing array field of a structure if it's available,
> >> otherwise according to the LEVEL of the option
> >> '-fstrict-flex-arrays=LEVEL'.
> >> 
> >> This option is effective only when LEVEL is bigger than 0.
> >> Otherwise, it will be ignored with a warning.
> >> 
> >> when LEVEL=1, warnings will be issued for a trailing array
> >> reference of a structure that have 2 or more elements if the
> >> trailing array is referenced as a flexible array member.
> >> 
> >> when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> >> issued for a trailing one-element array reference of a structure if
> >> the array is referenced as a flexible array member.
> >> 
> >> when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> >> issued for a trailing zero-length array reference of a structure if
> >> the array is referenced as a flexible array member.
> >> 
> >> At the same time, -Warray-bounds is updated:
> > 
> > Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
> > only the latter was going to exist?
> 
> Yes, It’s very easy to merge these two warnings into one: 
> 
>  -Warray-bounds, when combined with -fstrict-flex-arrays,  in addition to 
> report all the out-of-bounds warnings, it also report 
> the misuse of flexible array members according to the LEVEL of 
> -fstrict-flex-arrays
> 
> The major question is, do we need one separate warning option to report the 
> misuse of flexible array member only?
> If so, then we need to add a new one. 

I guess it is up to you, but I think it just makes things needlessly
complex. I think having 1 option for behavior (-fstrict-flex-arrays),
and 1 option for warnings (-Warray-bounds) is sufficient. I think adding
-Wstrict-flex-arrays is confusing.

-- 
Kees Cook


Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Qing Zhao via Gcc-patches


> On Dec 1, 2022, at 11:42 AM, Kees Cook  wrote:
> 
> On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
>> '-Wstrict-flex-arrays'
>> Warn about improper usages of flexible array members according to
>> the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>> the trailing array field of a structure if it's available,
>> otherwise according to the LEVEL of the option
>> '-fstrict-flex-arrays=LEVEL'.
>> 
>> This option is effective only when LEVEL is bigger than 0.
>> Otherwise, it will be ignored with a warning.
>> 
>> when LEVEL=1, warnings will be issued for a trailing array
>> reference of a structure that have 2 or more elements if the
>> trailing array is referenced as a flexible array member.
>> 
>> when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>> issued for a trailing one-element array reference of a structure if
>> the array is referenced as a flexible array member.
>> 
>> when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>> issued for a trailing zero-length array reference of a structure if
>> the array is referenced as a flexible array member.
>> 
>> At the same time, -Warray-bounds is updated:
> 
> Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
> only the latter was going to exist?

Yes, It’s very easy to merge these two warnings into one: 

 -Warray-bounds, when combined with -fstrict-flex-arrays, in addition to 
reporting all the out-of-bounds warnings, also reports 
the misuse of flexible array members according to the LEVEL of 
-fstrict-flex-arrays.

The major question is, do we need one separate warning option to report the 
misuse of flexible array member only?
If so, then we need to add a new one. 

> 
> Are you trying to split code gen (-fstrict-flex-arrays) from warnings?

No.
After this patch, the -fstrict-flex-arrays will consistently control code gens 
and warnings in GCC except the default behavior without -fstrict-flex-arrays:

For code gen, the default behavior is treating all trailing arrays as FAM;
For warnings, the default behavior is treating [], [0],[1] trailing arrays as 
FAM;  [n] is not treated as FAM. 
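
For reference, the trailing-array shapes under discussion look like this in C. This is only an illustrative sketch with made-up struct and function names; it does not show the warnings themselves.

```c
#include <stddef.h>
#include <stdlib.h>

/* C99 flexible array member: the "[]" form.  */
struct packet {
  size_t len;
  int data[];
};

/* Pre-C99 idiom: a one-element trailing array used as if it were flexible.
   This is the "[1]" form; "[0]" (a GNU extension) and "[n]" follow the
   same shape with a different bound.  */
struct legacy_packet {
  size_t len;
  int data[1];
};

/* Allocate a packet with room for LEN trailing elements, zero-filled.  */
struct packet *packet_alloc(size_t len)
{
  struct packet *p = calloc(1, sizeof *p + len * sizeof p->data[0]);
  if (p)
    p->len = len;
  return p;
}

/* Sum the trailing elements; every access is within the allocated LEN.  */
long packet_sum(const struct packet *p)
{
  long sum = 0;
  for (size_t i = 0; i < p->len; i++)
    sum += p->data[i];
  return sum;
}
```

Indexing `data` past its declared bound in `legacy_packet` is the kind of use that the stricter levels are described as diagnosing.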

Qing

> Is that needed?
> 
> -- 
> Kees Cook



Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2022-12-01 Thread Joseph Myers
On Thu, 1 Dec 2022, Longjun Luo via Gcc-patches wrote:

> diff --git a/gcc/testsuite/gcc.dg/builtin-redefine.c 
> b/gcc/testsuite/gcc.dg/builtin-redefine.c
> index 882b2210992..9d5b42252ee 100644
> --- a/gcc/testsuite/gcc.dg/builtin-redefine.c
> +++ b/gcc/testsuite/gcc.dg/builtin-redefine.c
> @@ -71,7 +71,6 @@
>  /* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } .-1 } */
>  #endif
>  
> -#define __LINE__ 0   /* { dg-warning "-:\"__LINE__\" redef" } */
>  #define __INCLUDE_LEVEL__ 0  /* { dg-warning "-:\"__INCLUDE_LEVEL__\" redef" 
> } */
>  #define __COUNTER__ 0/* { dg-warning "-:\"__COUNTER__\" redef" } */

Is there some existing test that verifies that this redefinition is still 
diagnosed by default (in the absence of -Wno-builtin-macro-redefined)?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] libgccjit: Fix float vector comparison

2022-12-01 Thread David Malcolm via Gcc-patches
On Thu, 2022-12-01 at 10:33 -0500, Antoni Boucher wrote:
> On Thu, 2022-12-01 at 10:25 -0500, David Malcolm wrote:
> > On Thu, 2022-12-01 at 10:01 -0500, Antoni Boucher wrote:
> > > Thanks, David.
> > > Since we're not in phase 1 anymore, do we need an approval before
> > > I
> > > merge like last year or can I merge immediately?
> > 
> > I think it counts as a bug fix and thus you can go ahead and merge
> > (assuming you've done the usual testing).
> > 
> > > I also have many other patches (all in jit) that I need to
> > > prepare
> > > and
> > > post to this mailing list.
> > > What do you think?
> > 
> > Given that you're one of the main users of libgccjit I think
> > there's
> > a
> > case for stretching the deadlines a bit here.
> > 
> > Do you have a repo I can look at?
> 
> Yes! The commits are in my fork:
> https://github.com/antoyo/gcc
> 
> The only big one is the one adding support for target-dependent
> builtins:
> https://github.com/antoyo/gcc/commit/6d4313d4c02dd878f43917c978f299f5119330f0
> 
> Regarding this one, there's the issue that since we record the
> builtins
> on the first context run, we only have access to the builtins from
> the
> second run.
> Do you have any idea how to fix this?
> Or do you consider this is acceptable?

This is implemented behind the new
gcc_jit_context_get_target_builtin_function entrypoint, right?

If so, perhaps that recording::context::get_target_builtin_function
could detect if it's the first time it's been called on this context,
and if so make a playback::context to do the detection?  That way it
would be transparent to the user, and work first time.
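
A language-neutral sketch of that "detect on first call" idea, written in plain C with entirely hypothetical names (this is not libgccjit code, and the probe below is a stand-in for whatever the playback context would do):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context: remembers whether the builtin table was probed.  */
struct jit_ctx {
  bool builtins_loaded;
  int probe_runs;              /* counts how often the probe actually ran */
  const char *builtins[2];
};

/* Stand-in for the expensive detection step.  */
static void probe_target_builtins(struct jit_ctx *ctx)
{
  ctx->builtins[0] = "builtin_a";
  ctx->builtins[1] = "builtin_b";
  ctx->probe_runs++;
  ctx->builtins_loaded = true;
}

/* Lookup that transparently runs the probe the first time it is needed.  */
const char *lookup_target_builtin(struct jit_ctx *ctx, const char *name)
{
  if (!ctx->builtins_loaded)
    probe_target_builtins(ctx);
  for (size_t i = 0; i < 2; i++)
    if (strcmp(ctx->builtins[i], name) == 0)
      return ctx->builtins[i];
  return NULL;
}
```

The caller never sees the probe; it only pays for it once, on the first lookup.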


I see you have patches to add function and variable attributes; I
wonder if this would be cleaner internally if there was a
recording::attribute class, rather than the std::pair currently in use
(some attributes have int arguments rather than string, others have
multiple args).

I also wondered if a "gcc_jit_attribute" type could be exposed to the
user, e.g.:

  attr1 = gcc_jit_context_new_attribute (ctxt, "noreturn");
  attr2 = gcc_jit_context_new_attribute_with_string (ctxt, "alias",
"__foo");
  gcc_jit_function_add_attribute (ctxt, attr1);
  gcc_jit_function_add_attribute (ctxt, attr2);

or somesuch?  But I think the API you currently have is OK.


> 
> I also have a WIP branch which adds support for try/catch:
> https://github.com/antoyo/gcc/commit/6219339fcacb079431596a0bc6cf8d430a1bd5a1
> I'm not sure if this one is going to be ready soon or not.

I see that the new entrypoints have e.g.:

/* Add a try/catch statement.
   This is equivalent to this C++ code:
 try {
try_block
 }
 catch {
catch_block
 }
*/

void
gcc_jit_block_add_try_catch (gcc_jit_block *block,
 gcc_jit_location *loc,
 gcc_jit_block *try_block,
 gcc_jit_block *catch_block);

but I'm not sure how this is meant to interact with the CFG-like model
used by the rest of the gcc_jit_block_* API.  What happens at the end
of the blocks?  Does the generated code use the C++ ABI for exception-
handling?

Dave

> 
> Thanks.
> 
> > 
> > Dave
> > 
> > 
> > > 
> > > On Thu, 2022-12-01 at 09:28 -0500, David Malcolm wrote:
> > > > On Sun, 2022-11-20 at 14:03 -0500, Antoni Boucher via Jit
> > > > wrote:
> > > > > Hi.
> > > > > This fixes bug 107770.
> > > > > Thanks for the review.
> > > > 
> > > > Thanks, the patch looks good to me.
> > > > 
> > > > Dave
> > > > 
> > > 
> > 
> 



[PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2022-12-01 Thread Wilco Dijkstra via Gcc-patches
A recent change only initializes the regs.how[] during Dwarf unwinding
which resulted in an uninitialized offset used in return address signing
and random failures during unwinding.  The fix is to use REG_SAVED_OFFSET
as the state where the return address signing bit is valid, and if the
state is REG_UNSAVED, initialize it to 0.
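
As a rough illustration of the rule described above (a sketch only, not
the libgcc code: struct ra_state and both helper names are hypothetical,
and only the REG_* names come from the patch), the offset is trusted
only once the state has been initialized:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the unwinder's per-register state.  */
enum reg_how { REG_UNSAVED, REG_SAVED_OFFSET };

struct ra_state
{
  enum reg_how how;
  long offset;   /* Bit 0 holds the signing state once initialized.  */
};

/* Model of the DW_CFA_GNU_window_save handling: initialize the offset
   on first use, then toggle bit 0.  */
static void
toggle_ra_state (struct ra_state *s)
{
  if (s->how == REG_UNSAVED)
    {
      s->how = REG_SAVED_OFFSET;
      s->offset = 0;
    }
  s->offset ^= 1;
}

/* Model of the check in aarch64_frob_update_context: a stale offset in
   the REG_UNSAVED state is never read as "signed".  */
static bool
ra_is_signed (const struct ra_state *s)
{
  return s->how == REG_SAVED_OFFSET && (s->offset & 1) != 0;
}
```

Starting from REG_UNSAVED with a garbage offset, ra_is_signed stays
false until the first toggle, which is the behaviour the fix is after.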

Passes bootstrap & regress, OK for commit?

libgcc/
PR target/107678
* unwind-dw2.c (execute_cfa_program): Initialize offset of
DWARF_REGNUM_AARCH64_RA_STATE if in REG_UNSAVED state.
* config/aarch64/aarch64-unwind.h (aarch64_frob_update_context):
Check state is REG_SAVED_OFFSET before using offset for RA state.

---

diff --git a/libgcc/config/aarch64/aarch64-unwind.h b/libgcc/config/aarch64/aarch64-unwind.h
index 26db9cbd9e5c526e0c410a4fc6be2bedb7d261cf..597133b3d708a50a366c8bfeff57475f5522b3f6 100644
--- a/libgcc/config/aarch64/aarch64-unwind.h
+++ b/libgcc/config/aarch64/aarch64-unwind.h
@@ -71,21 +71,15 @@ aarch64_demangle_return_addr (struct _Unwind_Context *context,
 }
 
 /* Do AArch64 private initialization on CONTEXT based on frame info FS.  Mark
-   CONTEXT as return address signed if bit 0 of DWARF_REGNUM_AARCH64_RA_STATE is
-   set.  */
+   CONTEXT as having a signed return address if DWARF_REGNUM_AARCH64_RA_STATE
+   is initialized (REG_SAVED_OFFSET state) and the offset has bit 0 set.  */
 
 static inline void
 aarch64_frob_update_context (struct _Unwind_Context *context,
 _Unwind_FrameState *fs)
 {
-  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
-  int ra_signed;
-  if (fs->regs.how[reg] == REG_UNSAVED)
-ra_signed = fs->regs.reg[reg].loc.offset & 0x1;
-  else
-ra_signed = _Unwind_GetGR (context, reg) & 0x1;
-  if (ra_signed)
-/* The flag is used for re-authenticating EH handler's address.  */
+  if (fs->regs.how[DWARF_REGNUM_AARCH64_RA_STATE] == REG_SAVED_OFFSET
+  && (fs->regs.reg[DWARF_REGNUM_AARCH64_RA_STATE].loc.offset & 1) != 0)
 context->flags |= RA_SIGNED_BIT;
   else
 context->flags &= ~RA_SIGNED_BIT;
diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
index eaceace20298b9b13344aff9d1fe9ee5f9c7bd73..87f2ae065b67982ce48f74e45523d9c754a7661c 100644
--- a/libgcc/unwind-dw2.c
+++ b/libgcc/unwind-dw2.c
@@ -1203,11 +1203,16 @@ execute_cfa_program (const unsigned char *insn_ptr,
 
case DW_CFA_GNU_window_save:
 #if defined (__aarch64__) && !defined (__ILP32__)
- /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
-return address signing status.  */
- reg = DWARF_REGNUM_AARCH64_RA_STATE;
- gcc_assert (fs->regs.how[reg] == REG_UNSAVED);
- fs->regs.reg[reg].loc.offset ^= 1;
+/* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
+   the return address signing status.  It is initialized at the first
+   use and the state is stored in bit 0 of the offset.  */
+reg = DWARF_REGNUM_AARCH64_RA_STATE;
+if (fs->regs.how[reg] == REG_UNSAVED)
+  {
+fs->regs.how[reg] = REG_SAVED_OFFSET;
+fs->regs.reg[reg].loc.offset = 0;
+  }
+fs->regs.reg[reg].loc.offset ^= 1;
 #else
  /* ??? Hardcoded for SPARC register window configuration.  */
  if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)



[PATCH RFA] driver: fix validate_switches logic

2022-12-01 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, OK for trunk?

-- 8< --

Under the old logic for validate_switches, once suffix or starred got set,
they stayed set for all later switches found in the spec.  So for e.g.

%{g*:%{%:debug-level-gt(0):

Once we see g*, starred is set.  Then we see %:, and it sees that as a
zero-length switch, which because starred is still set, matches any and all
command-line options.  So targets that use such a spec accept all options in
the driver, while ones that don't reject some, such as the recent
-nostdlib++.

This patch fixes the inconsistency, so all targets would complain about
-nostdlib++, and then sets SKIPOPT for it so they don't.
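
The effect of the stale flag can be sketched with a much simplified
model (this is not the code in gcc.cc; option_accepted, atom_matches and
the atom list are hypothetical, and the spec parsing is reduced to a
list of atoms where a trailing '*' means "prefix match"):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A starred atom matches any option it is a prefix of; otherwise the
   option must match the atom exactly.  */
static bool
atom_matches (const char *atom, size_t len, bool starred, const char *opt)
{
  if (starred)
    return strncmp (opt, atom, len) == 0;
  return strlen (opt) == len && strncmp (opt, atom, len) == 0;
}

/* Return true if OPT is accepted by any atom.  RESET_EACH_TIME selects
   the fixed behaviour (flag cleared for every atom) versus the old one
   (flag stays set once seen).  */
static bool
option_accepted (const char *const *atoms, size_t n_atoms,
                 const char *opt, bool reset_each_time)
{
  bool starred = false;
  for (size_t i = 0; i < n_atoms; i++)
    {
      if (reset_each_time)
        starred = false;
      size_t len = strlen (atoms[i]);
      if (len > 0 && atoms[i][len - 1] == '*')
        {
          starred = true;
          len--;
        }
      if (atom_matches (atoms[i], len, starred, opt))
        return true;
    }
  return false;
}
```

With the atoms { "g*", "" } standing in for the g* switch and the
zero-length switch seen at %:, the old behaviour accepts "nostdlib++"
because the empty atom inherits the star; with the reset it does not.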

gcc/ChangeLog:

* gcc.cc (validate_switches): Reset suffix/starred on loop.

gcc/cp/ChangeLog:

* g++spec.cc (lang_specific_driver): Set SKIPOPT for nostdlib++.
---
 gcc/cp/g++spec.cc | 4 +++-
 gcc/gcc.cc| 7 +--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/g++spec.cc b/gcc/cp/g++spec.cc
index e599ac906f6..3d3b042dd56 100644
--- a/gcc/cp/g++spec.cc
+++ b/gcc/cp/g++spec.cc
@@ -167,8 +167,10 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
  need_experimental = true;
  break;
 
-   case OPT_nostdlib:
case OPT_nostdlib__:
+ args[i] |= SKIPOPT;
+ /* FALLTHRU */
+   case OPT_nostdlib:
case OPT_nodefaultlibs:
  library = -1;
  break;
diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index ca1c9e27a94..2278e2b6bb1 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -9299,12 +9299,15 @@ validate_switches (const char *start, bool user_spec, bool braced)
   const char *atom;
   size_t len;
   int i;
-  bool suffix = false;
-  bool starred = false;
+  bool suffix;
+  bool starred;
 
 #define SKIP_WHITE() do { while (*p == ' ' || *p == '\t') p++; } while (0)
 
 next_member:
+  suffix = false;
+  starred = false;
+
   SKIP_WHITE ();
 
   if (*p == '!')

base-commit: 4304e09a1617bcf1c87f5bc96017ae5017379d75
-- 
2.31.1



RE: [PATCH 2/2]AArch64 Support new tbranch optab.

2022-12-01 Thread Tamar Christina via Gcc-patches
Hi,

I hadn't received any reply, so I implemented various ways to do this
(about 8 of them, in fact).

The conclusion is that no, we cannot emit one big RTL for the final
instruction immediately.  The reason that all comparisons in the AArch64
backend expand to separate CC compares, and separate testing of the
operands, is ifcvt.

The separate CC compare is needed so ifcvt can produce csel, cset etc.
from the compares.  Unlike, say, combine, ifcvt cannot do recog on a
parallel with a clobber.  Should we emit the instruction directly, ifcvt
will not be able to, say, make a csel, because we have no patterns which
handle zero_extract and compare (unlike combine, ifcvt cannot transform
the extract into an AND).

While you could provide various patterns for this (and I did try), you
end up with broken patterns because you can't add the clobber to the CC
register.  If you do, ifcvt recog fails.

i.e.

int
f1 (int x)
{
  if (x & 1)
return 1;
  return x;
}

We lose csel here.

Secondly, the reason the compare with an explicit CC mode is needed is so
that ifcvt can transform the operation into a version that doesn't
require the flags to be set.  But it only does so if it knows the
explicit usage of the CC reg.

For instance:

int
foo (int a, int b)
{
  return ((a & (1 << 25)) ? 5 : 4);
}

doesn't require a comparison; the optimal form is:

foo(int, int):
        ubfx    x0, x0, 25, 1
        add     w0, w0, 4
        ret

and no compare is actually needed.  If you represent the instruction
using an ANDS instead of a zero_extract then you get close, but you end
up with an ands followed by an add, which is a slower operation.
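
For readers following along, the equivalence behind the ubfx/add form
can be checked in plain C.  This is only a sketch of the arithmetic; the
two function names are made up for illustration and say nothing about
how the backend arrives at the sequence:

```c
#include <stdint.h>

/* The source form: select 5 or 4 depending on bit 25.  */
static int
select_with_compare (int a)
{
  return (a & (1 << 25)) ? 5 : 4;
}

/* The flag-free form: extract bit 25 as 0 or 1 (what ubfx does) and
   add 4.  */
static int
select_without_compare (int a)
{
  return (int) (((uint32_t) a >> 25) & 1u) + 4;
}
```

Both functions return the same value for every input, which is why no
comparison is needed once the bit has been extracted.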

These two reasons are the main reasons why all comparisons in AArch64
expand the way they do, so tbranch shouldn't do anything differently
here.  Additionally, the reason for the optab was to pass range
information to the backend during expansion.

In this version, however, I have represented the expand using an ANDS
instead.  This allows us not to regress on -O0 as the previous version
did.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Note that this patch relies on
https://patchwork.sourceware.org/project/gcc/patch/y1+4qitmrqhbd...@arm.com/
which has yet to be reviewed but which cleans up extensions so they can
be used like this.

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64.md (*tb1): Rename to...
(*tb1): ... this.
(tbranch_4): New.
(zero_extend2,
zero_extend2,
zero_extend2): Make dynamic calls with @.
* config/aarch64/iterators.md (ZEROM, zerom): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/tbz_1.c: New test.

--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 4c181a96e555c2a58c59fc991000b2a2fa9bd244..7ee1d01e050004e42cd2d0049f0200da71d918bb 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -946,12 +946,33 @@ (define_insn "*cb1"
  (const_int 1)))]
 )
 
-(define_insn "*tb1"
+(define_expand "tbranch_4"
   [(set (pc) (if_then_else
- (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")
-   (const_int 1)
-   (match_operand 1
- "aarch64_simd_shift_imm_" "n"))
+  (EQL (match_operand:ALLI 0 "register_operand")
+   (match_operand 1 "aarch64_simd_shift_imm_"))
+  (label_ref (match_operand 2 ""))
+  (pc)))]
+  ""
+{
+  rtx bitvalue = gen_reg_rtx (mode);
+  rtx reg = gen_reg_rtx (mode);
+  if (mode == mode)
+reg = operands[0];
+  else
+emit_insn (gen_zero_extend2 (mode, mode, reg, operands[0]));
+  rtx val = GEN_INT (1UL << UINTVAL (operands[1]));
+  emit_insn (gen_and3 (bitvalue, reg, val));
+  operands[1] = const0_rtx;
+  operands[0] = aarch64_gen_compare_reg (, bitvalue,
+operands[1]);
+})
+
+(define_insn "*tb1"
+  [(set (pc) (if_then_else
+ (EQL (zero_extract:GPI (match_operand:ALLI 0 "register_operand" 
"r")
+(const_int 1)
+(match_operand 1
+  "aarch64_simd_shift_imm_" 
"n"))
   (const_int 0))
 (label_ref (match_operand 2 "" ""))
 (pc)))
@@ -962,15 +983,15 @@ (define_insn "*tb1"
   {
if (get_attr_far_branch (insn) == 1)
  return aarch64_gen_far_branch (operands, 2, "Ltb",
-"\\t%0, %1, ");
+"\\t%0, %1, ");
else
  {
operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
-   return "tst\t%0, %1\;\t%l2";
+   return "tst\t%0, %1\;\t%l2";
  }
   }
 else
-  return "\t%0, %1, %l2";
+  return "\t%0, 

Re: [V2][PATCH 1/1] Add a new warning option -Wstrict-flex-arrays.

2022-12-01 Thread Kees Cook via Gcc-patches
On Wed, Nov 30, 2022 at 02:25:56PM +, Qing Zhao wrote:
> '-Wstrict-flex-arrays'
>  Warn about improper usages of flexible array members according to
>  the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>  the trailing array field of a structure if it's available,
>  otherwise according to the LEVEL of the option
>  '-fstrict-flex-arrays=LEVEL'.
> 
>  This option is effective only when LEVEL is bigger than 0.
>  Otherwise, it will be ignored with a warning.
> 
>  when LEVEL=1, warnings will be issued for a trailing array
>  reference of a structure that have 2 or more elements if the
>  trailing array is referenced as a flexible array member.
> 
>  when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>  issued for a trailing one-element array reference of a structure if
>  the array is referenced as a flexible array member.
> 
>  when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>  issued for a trailing zero-length array reference of a structure if
>  the array is referenced as a flexible array member.
> 
> At the same time, -Warray-bounds is updated:

Why is there both -Wstrict-flex-arrays and -Warray-bounds? I thought
only the latter was going to exist?

Are you trying to split code gen (-fstrict-flex-arrays) from warnings?
Is that needed?
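
For context, the levels in the quoted text can be read against the
usual trailing-array idioms.  The sketch below is only an illustration:
the struct and function names are made up, the level annotations follow
the quoted documentation text rather than any tested compiler
behaviour, and the zero-length array relies on the GNU extension:

```c
#include <stddef.h>
#include <stdlib.h>

/* Per the quoted text, using data[] below as a flexible array member
   would be diagnosed from the given LEVEL upwards.  */
struct trailing_4 { size_t len; int data[4]; };   /* LEVEL >= 1 */
struct trailing_1 { size_t len; int data[1]; };   /* LEVEL >= 2 */
struct trailing_0 { size_t len; int data[0]; };   /* LEVEL == 3 (GNU ext.) */

/* A real C99 flexible array member, which none of the quoted levels
   mention.  */
struct trailing_fam { size_t len; int data[]; };

/* Allocate a trailing_fam with room for N elements.  */
static struct trailing_fam *
fam_alloc (size_t n)
{
  struct trailing_fam *p = malloc (sizeof *p + n * sizeof p->data[0]);
  if (p)
    p->len = n;
  return p;
}
```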

-- 
Kees Cook


[PATCH] c++: explicit spec of constrained member tmpl [PR107522]

2022-12-01 Thread Patrick Palka via Gcc-patches
When defining an explicit specialization of a constrained member template
(of a class template) such as f and g in the below testcase, the
DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
instantiated, whereas its associated constraints are carried over
from the original template and thus are in terms of the original
DECL_TEMPLATE_PARMS.  So during normalization for such an explicit
specialization we need to consider the (parameters of) the most general
template, since that's what the constraints are in terms of and since we
always use the full set of template arguments during satisfaction.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 12?

PR c++/107522

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use the
most general template for an explicit specialization of a
member template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-explicit-spec7.C: New test.
---
 gcc/cp/constraint.cc  | 18 ---
 .../g++.dg/cpp2a/concepts-explicit-spec7.C| 31 +++
 2 files changed, 44 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index ab0f66b3d7e..f1df84c2a1c 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -973,11 +973,19 @@ get_normalized_constraints_from_decl (tree d, bool diag = false)
  accepting the latter causes the template parameter level of U
  to be reduced in a way that makes it overly difficult substitute
  concrete arguments (i.e., eventually {int, int} during satisfaction.  */
-  if (tmpl)
-  {
-if (DECL_LANG_SPECIFIC(tmpl) && !DECL_TEMPLATE_SPECIALIZATION (tmpl))
-  tmpl = most_general_template (tmpl);
-  }
+  if (tmpl && DECL_LANG_SPECIFIC (tmpl)
+  && (!DECL_TEMPLATE_SPECIALIZATION (tmpl)
+ /* DECL_TEMPLATE_SPECIALIZATION means we're dealing with either a
+partial specialization or an explicit specialization of a member
+template.  In the former case all is well: the constraints are in
+terms in TMPL's parameters.  But in the latter case TMPL's
+parameters are partially instantiated whereas its constraints
+aren't, so we need to consider (the parameters of) the most
+general template.  The following test distinguishes between a
+partial specialization and such an explicit specialization.  */
+ || (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
+ < TMPL_ARGS_DEPTH (DECL_TI_ARGS (tmpl)
+tmpl = most_general_template (tmpl);
 
   d = tmpl ? tmpl : decl;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
new file mode 100644
index 000..5b5a6df20ff
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-explicit-spec7.C
@@ -0,0 +1,31 @@
+// PR c++/107522
+// { dg-do compile { target c++20 } }
+
+template
+struct A
+{
+  template
+  static void f() requires (N == 42);
+
+  template
+  struct B {
+template
+static void g() requires (T(N) == 42);
+  };
+};
+
+template<>
+template
+void A::f() requires (N == 42) { }
+
+template<>
+template<>
+template
+void A::B::g() requires (int(N) == 42) { }
+
+int main() {
+  A::f<42>();
+  A::f<43>(); // { dg-error "no match" }
+  A::B::g<42>();
+  A::B::g<43>(); // { dg-error "no match" }
+}
-- 
2.39.0.rc0.49.g083e01275b



RE: [PATCH 1/2]middle-end: Add new tbranch optab to add support for bit-test-and-branch operations

2022-12-01 Thread Tamar Christina via Gcc-patches
> > +/* Check to see if the supplied comparison in PTEST can be performed as a
> > +   bit-test-and-branch instead.  VAL must contain the original tree
> > +   expression of the non-zero operand which will be used to rewrite the
> > +   comparison in PTEST.
> > +
> > +   Returns TRUE if operation succeeds and returns updated PMODE and
> PTEST,
> > +   else FALSE.  */
> > +
> > +enum insn_code
> > +static validate_test_and_branch (tree val, rtx *ptest, machine_mode
> > +*pmode) {
> > +  if (!val || TREE_CODE (val) != SSA_NAME)
> > +return CODE_FOR_nothing;
> > +
> > +  machine_mode mode = TYPE_MODE (TREE_TYPE (val));  rtx test =
> > + *ptest;
> > +
> > +  if (GET_CODE (test) != EQ && GET_CODE (test) != NE)
> > +return CODE_FOR_nothing;
> > +
> > +  /* If the target supports the testbit comparison directly, great.
> > + */  auto icode = direct_optab_handler (tbranch_optab, mode);  if
> > + (icode == CODE_FOR_nothing)
> > +return icode;
> > +
> > +  if (tree_zero_one_valued_p (val))
> > +{
> > +  auto pos = BYTES_BIG_ENDIAN ? GET_MODE_BITSIZE (mode) - 1 : 0;
> 
> Does this work for BYTES_BIG_ENDIAN && !WORDS_BIG_ENDIAN and mode > word_mode?
> 

It does now.  In this particular case all that matters is the bit
ordering, so I've changed it to BITS_BIG_ENDIAN.
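
A small model of why only the bit numbering matters here (a sketch; the
helper names are hypothetical and bits_big_endian is an ordinary
parameter standing in for the target macro): for a value known to be 0
or 1, testing "value != 0" is the same as testing the least significant
bit, and the only question is which number that bit carries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of the least significant bit under the given bit-numbering
   convention, for a value that is BITSIZE bits wide.  */
static unsigned
lsb_bit_number (bool bits_big_endian, unsigned bitsize)
{
  return bits_big_endian ? bitsize - 1 : 0;
}

/* Test bit number POS of X, where POS is interpreted according to the
   same convention.  */
static bool
test_bit (uint64_t x, unsigned pos, bool bits_big_endian, unsigned bitsize)
{
  unsigned shift = bits_big_endian ? bitsize - 1 - pos : pos;
  return ((x >> shift) & 1) != 0;
}
```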

Also, during the review of the AArch64 optab, Richard Sandiford wanted me
to split the optabs apart into two.  The reason is that a match_operator
still gets the full RTL.

In the case of a tbranch the full RTL has an invalid comparison, so if a
target doesn't implement the hook correctly this would lead to incorrect
code.  We've now moved the operator into the name itself to avoid this.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* dojump.cc (do_jump): Pass along value.
(do_jump_by_parts_greater_rtx): Likewise.
(do_jump_by_parts_zero_rtx): Likewise.
(do_jump_by_parts_equality_rtx): Likewise.
(do_compare_rtx_and_jump): Likewise.
(do_compare_and_jump): Likewise.
* dojump.h (do_compare_rtx_and_jump): New.
* optabs.cc (emit_cmp_and_jump_insn_1): Refactor to take optab to check.
(validate_test_and_branch): New.
(emit_cmp_and_jump_insns): Optionally take a value, and when value is
supplied then check if it's suitable for tbranch.
* optabs.def (tbranch_eq$a4, tbranch_ne$a4): New.
* doc/md.texi (tbranch_@var{op}@var{mode}4): Document it.
* optabs.h (emit_cmp_and_jump_insns):
* tree.h (tree_zero_one_valued_p): New.

--- inline copy of patch ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index d0a71ecbb806de3a6564c6ffe973fec5da5c597b..c6c4b13d756de28078a0a779876a00c614246914 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6964,6 +6964,14 @@ case, you can and should make operand 1's predicate reject some operators
 in the @samp{cstore@var{mode}4} pattern, or remove the pattern altogether
 from the machine description.
 
+@cindex @code{tbranch_@var{op}@var{mode}4} instruction pattern
+@item @samp{tbranch_@var{op}@var{mode}4}
+Conditional branch instruction combined with a bit test-and-compare
+instruction. Operand 0 is a comparison operator.  Operand 1 is the
+operand of the comparison. Operand 2 is the bit position of Operand 1 to test.
+Operand 3 is the @code{code_label} to jump to. @var{op} is one of @var{eq} or
+@var{ne}.
+
 @cindex @code{cbranch@var{mode}4} instruction pattern
 @item @samp{cbranch@var{mode}4}
 Conditional branch instruction combined with a compare instruction.
diff --git a/gcc/dojump.h b/gcc/dojump.h
index e379cceb34bb1765cb575636e4c05b61501fc2cf..d1d79c490c420a805fe48d58740a79c1f25fb839 100644
--- a/gcc/dojump.h
+++ b/gcc/dojump.h
@@ -71,6 +71,10 @@ extern void jumpifnot (tree exp, rtx_code_label *label,
 extern void jumpifnot_1 (enum tree_code, tree, tree, rtx_code_label *,
 profile_probability);
 
+extern void do_compare_rtx_and_jump (rtx, rtx, enum rtx_code, int, tree,
+machine_mode, rtx, rtx_code_label *,
+rtx_code_label *, profile_probability);
+
 extern void do_compare_rtx_and_jump (rtx, rtx, enum rtx_code, int,
 machine_mode, rtx, rtx_code_label *,
 rtx_code_label *, profile_probability);
diff --git a/gcc/dojump.cc b/gcc/dojump.cc
index 2af0cd1aca3b6af13d5d8799094ee93f18022296..190324f36f1a31990f8c49bc8c0f45c23da5c31e 100644
--- a/gcc/dojump.cc
+++ b/gcc/dojump.cc
@@ -619,7 +619,7 @@ do_jump (tree exp, rtx_code_label *if_false_label,
}
   do_compare_rtx_and_jump (temp, CONST0_RTX (GET_MODE (temp)),
   NE, TYPE_UNSIGNED (TREE_TYPE (exp)),
-  GET_MODE (temp), NULL_RTX,
+  exp, GET_MODE (temp), NULL_RTX,
   

RE: [PATCH 2/2]AArch64 Perform more late folding of reg moves and shifts which arrive after expand

2022-12-01 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, November 14, 2022 9:59 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH 2/2]AArch64 Perform more late folding of reg moves
> and shifts which arrive after expand
> 
> (Sorry, immediately following up to myself for a second time recently.)
> 
> Richard Sandiford  writes:
> > Tamar Christina  writes:
> >>>
> >>> The same thing ought to work for smov, so it would be good to do both.
> >>> That would also make the split between the original and new patterns
> >>> more
> >>> obvious: left shift for the old pattern, right shift for the new pattern.
> >>>
> >>
> >> Done, though because umov can do multilevel extensions I couldn't
> >> combine them Into a single pattern.
> >
> > Hmm, but the pattern is:
> >
> > (define_insn "*si3_insn2_uxtw"
> >   [(set (match_operand:GPI 0 "register_operand" "=r,r,r")
> > (zero_extend:GPI (LSHIFTRT_ONLY:SI
> >   (match_operand:SI 1 "register_operand" "w,r,r")
> >   (match_operand:QI 2 "aarch64_reg_or_shift_imm_si"
> "Usl,Uss,r"]
> >
> > GPI is just SI or DI, so in the SI case we're zero-extending SI to SI,
> > which isn't a valid operation.  The original patch was just for
> > extending to DI, which seems correct.  The choice between printing %x
> > for smov and %w for umov can then depend on the code.

You're right, GPI made no sense here.  Fixed.

> 
> My original comment quoted above was about using smov in the zero-
> extend pattern.  I.e. the original:
> 
> (define_insn "*si3_insn2_uxtw"
>   [(set (match_operand:DI 0 "register_operand" "=r,?r,r")
>   (zero_extend:DI (LSHIFTRT:SI
>(match_operand:SI 1 "register_operand" "w,r,r")
>(match_operand:QI 2 "aarch64_reg_or_shift_imm_si"
> "Usl,Uss,r"]
> 
> could instead be:
> 
> (define_insn "*si3_insn2_uxtw"
>   [(set (match_operand:DI 0 "register_operand" "=r,?r,r")
>   (zero_extend:DI (SHIFTRT:SI
>(match_operand:SI 1 "register_operand" "w,r,r")
>(match_operand:QI 2 "aarch64_reg_or_shift_imm_si"
> "Usl,Uss,r"]
> 
> with the pattern using "smov %w0, ..." for ashiftrt case.

Almost, except the non-immediate cases don't work with shifts,
i.e. a right shift can't be used to sign-extend from 32 to 64 bits.

I've merged the cases but added a guard for this.
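
For the immediate cases that do map onto smov, the arithmetic can be
sketched in C (illustration only: asr32 and sign_extend_lane are
hypothetical helpers, and the lane numbering assumes the little-endian
view where lane 0 is the least significant):

```c
#include <stdint.h>

/* Portable arithmetic right shift of a 32-bit value by N (0 <= N < 32).  */
static int32_t
asr32 (int32_t x, unsigned n)
{
  return x < 0 ? ~(~x >> n) : x >> n;
}

/* Extract lane LANE of width LANE_BITS (8 or 16) from X and sign-extend
   it, which is what smov does for a .b[] or .h[] lane.  */
static int32_t
sign_extend_lane (uint32_t x, unsigned lane_bits, unsigned lane)
{
  uint32_t mask = (1u << lane_bits) - 1;
  uint32_t sign = 1u << (lane_bits - 1);
  uint32_t v = (x >> (lane * lane_bits)) & mask;
  return (int32_t) (v ^ sign) - (int32_t) sign;
}
```

An arithmetic shift right by 16 gives the same result as sign-extending
halfword lane 1, and a shift by 24 the same as sign-extending byte
lane 3, matching the "size == 16" and "size == 8" cases in the patch.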

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64.md (*si3_insn_uxtw): Split SHIFT into
left and right ones.
(*aarch64_ashr_sisd_or_int_3): Support smov.
(*si3_insn2_xtw): New.
* config/aarch64/constraints.md (Usl): New.
* config/aarch64/iterators.md (is_zeroE, extend_op): New.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shift-read_1.c: New test.
* gcc.target/aarch64/shift-read_2.c: New test.

--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 39e65979528fb7f748ed456399ca38f929dba1d4..4c181a96e555c2a58c59fc991000b2a2fa9bd244 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5425,20 +5425,42 @@ (define_split
 
 ;; Arithmetic right shift using SISD or Integer instruction
 (define_insn "*aarch64_ashr_sisd_or_int_3"
-  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,,")
+  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r,,")
(ashiftrt:GPI
- (match_operand:GPI 1 "register_operand" "r,r,w,w,w")
+ (match_operand:GPI 1 "register_operand" "r,r,w,w,w,w")
  (match_operand:QI 2 "aarch64_reg_or_shift_imm_di"
-  "Us,r,Us,w,0")))]
+  "Us,r,Us,Usl,w,0")))]
   ""
-  "@
-   asr\t%0, %1, %2
-   asr\t%0, %1, %2
-   sshr\t%0, %1, %2
-   #
-   #"
-  [(set_attr "type" 
"bfx,shift_reg,neon_shift_imm,neon_shift_reg,neon_shift_reg")
-   (set_attr "arch" "*,*,simd,simd,simd")]
+  {
+switch (which_alternative)
+{
+  case 0:
+   return "asr\t%0, %1, %2";
+  case 1:
+   return "asr\t%0, %1, %2";
+  case 2:
+   return "sshr\t%0, %1, %2";
+  case 3:
+   {
+ int val = INTVAL (operands[2]);
+ int size = 32 - val;
+
+ if (size == 16)
+   return "smov\\t%0, %1.h[1]";
+ if (size == 8)
+   return "smov\\t%0, %1.b[3]";
+ gcc_unreachable ();
+   }
+  case 4:
+   return "#";
+  case 5:
+   return "#";
+  default:
+   gcc_unreachable ();
+}
+  }
+  [(set_attr "type" "bfx,shift_reg,neon_shift_imm,neon_to_gp, 
neon_shift_reg,neon_shift_reg")
+   (set_attr "arch" "*,*,simd,simd,simd,simd")]
 )
 
 (define_split
@@ -5548,7 +5570,7 @@ (define_insn "*rol3_insn"
 ;; zero_extend version of shifts
 (define_insn "*si3_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r,r")
-   (zero_extend:DI (SHIFT_no_rotate:SI
+   

RE: [PATCH]AArch64 Fix vector re-interpretation between partial SIMD modes

2022-12-01 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Sandiford 
> Sent: Friday, November 18, 2022 9:30 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH]AArch64 Fix vector re-interpretation between partial
> SIMD modes
> 
> Richard Sandiford via Gcc-patches  writes:
> > Tamar Christina  writes:
> >> Hi All,
> >>
> >> While writing a patch series I started getting incorrect codegen out
> >> from VEC_PERM on partial struct types.
> >>
> >> It turns out that this was happening because the
> >> TARGET_CAN_CHANGE_MODE_CLASS implementation has a slight bug in
> it.  The hook only checked for SIMD to
> >> Partial but never Partial to SIMD.   This resulted in incorrect subregs to 
> >> be
> >> generated from the fallback code in VEC_PERM_EXPR expansions.
> >>
> >> I have unfortunately not been able to trigger it using a standalone
> >> testcase as the mid-end optimizes away the permute every time I try
> >> to describe a permute that would result in the bug.
> >>
> >> The patch now rejects any conversion of partial SIMD struct types,
> >> unless they are both partial structures of the same number of
> >> registers or one is a SIMD type who's size is less than 8 bytes.
> >>
> >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >>
> >> Ok for master? And backport to GCC 12?
> >>
> >> Thanks,
> >> Tamar
> >>
> >> gcc/ChangeLog:
> >>
> >>* config/aarch64/aarch64.cc (aarch64_can_change_mode_class):
> Restrict
> >>conversions between partial struct types properly.
> >>
> >> --- inline copy of patch --
> >> diff --git a/gcc/config/aarch64/aarch64.cc
> >> b/gcc/config/aarch64/aarch64.cc index
> >>
> d3c3650d7d728f56adb65154127dc7b72386c5a7..84dbe2f4ea7d03b424602ed9
> 8a3
> >> 4e7824217dc91 100644
> >> --- a/gcc/config/aarch64/aarch64.cc
> >> +++ b/gcc/config/aarch64/aarch64.cc
> >> @@ -26471,9 +26471,10 @@ aarch64_can_change_mode_class
> (machine_mode from,
> >>bool from_pred_p = (from_flags & VEC_SVE_PRED);
> >>bool to_pred_p = (to_flags & VEC_SVE_PRED);
> >>
> >> -  bool from_full_advsimd_struct_p = (from_flags == (VEC_ADVSIMD |
> VEC_STRUCT));
> >>bool to_partial_advsimd_struct_p = (to_flags == (VEC_ADVSIMD |
> VEC_STRUCT
> >>   | VEC_PARTIAL));
> >> +  bool from_partial_advsimd_struct_p = (from_flags == (VEC_ADVSIMD
> | VEC_STRUCT
> >> + | VEC_PARTIAL));
> >>
> >>/* Don't allow changes between predicate modes and other modes.
> >>   Only predicate registers can hold predicate modes and only @@
> >> -26496,9 +26497,23 @@ aarch64_can_change_mode_class
> (machine_mode from,
> >>  return false;
> >>
> >>/* Don't allow changes between partial and full Advanced SIMD
> structure
> >> - modes.  */
> >> -  if (from_full_advsimd_struct_p && to_partial_advsimd_struct_p)
> >> -return false;
> >> + modes unless both are a partial struct with the same number of
> registers
> >> + or the vector bitsizes must be the same.  */
> >> +  if (to_partial_advsimd_struct_p ^ from_partial_advsimd_struct_p)
> >> +{
> >> +  /* If they're both partial structures, allow if they have the same
> number
> >> +   or registers.  */
> >> +  if (to_partial_advsimd_struct_p == from_partial_advsimd_struct_p)
> >> +  return known_eq (GET_MODE_SIZE (from), GET_MODE_SIZE (to));
> >
> > It looks like the ^ makes this line unreachable.  I guess it should be
> > a separate top-level condition.
> >
> >> +  /* If one is a normal SIMD register, allow only if no larger than 
> >> 64-bit.
> */
> >> +  if ((to_flags & VEC_ADVSIMD) == to_flags)
> >> +  return known_le (GET_MODE_SIZE (to), 8);
> >> +  else if ((from_flags & VEC_ADVSIMD) == from_flags)
> >> +  return known_le (GET_MODE_SIZE (from), 8);
> >> +
> >> +  return false;
> >> +}
> >
> > I don't think we need to restrict this to SIMD modes.  A plain DI
> > would be OK too.  So I think it should just be:
> >
> > return (known_le (GET_MODE_SIZE (to), 8)
> > || known_le (GET_MODE_SIZE (from), 8));
> 
> Looking again, all the other tests return false if they found a definite 
> problem
> and fall through to later code otherwise.  I think we should do the same here.

I've rewritten the conditions.  I needed to allow any conversions as long
as they're both partial vectors.  There were various intrinsics tests
that rely on, for instance, being able to take a subreg of a VN2x8QI
from a VN3x8QI.  I did not allow only the smaller case, since it didn't
seem logical to block paradoxical subregs of this kind.

Reload seems to be correctly handling them as separate 64-bit registers
(it looks like we have tests for this).

So bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master? And backport to GCC 12?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_can_change_mode_class): 

Re: [PATCH] varasm: Fix type confusion bug

2022-12-01 Thread Richard Sandiford via Gcc-patches
Alex Coplan via Gcc-patches  writes:
> Hi,
>
> This patch fixes a type confusion bug in varasm.cc:assemble_variable.
> The problem is that the current code calls:
>
>   sect = get_variable_section (decl, false);
>
> and then accesses sect->named.name without checking whether the section
> is in fact a named section. In the surrounding else clause, we only know
> that SECTION_STYLE (sect) != SECTION_NOSWITCH, so it is possible that
> the section is an unnamed section.
>
> In practice, this means that we end up doing a wild string compare
> between a function pointer and the string literal ".vtable_map_vars".
> This is because sect->named.name aliases sect->unnamed.callback in the
> section union.
>
> This can be seen in GDB with a simple testcase such as "int x;".
>
> This patch fixes the issue by checking the SECTION_STYLE of the section
> is in fact SECTION_NAMED before trying to do the string comparison.
>
> We drop the existing check of whether sect->named.name is non-NULL
> because this should presumably always be the case for a named section.
>
> Bootstrapped/regtested on aarch64-none-linux-gnu, OK for trunk?

OK, thanks.  I think it's OK for backports too if you like,
since it's a regression from around 2013.

Richard
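
The class of bug described in the quoted text can be sketched with a
reduced tagged union (hypothetical types and names throughout; this is
not GCC's section union, only the same shape): reading the name member
is valid only after the tag says the section is a named one.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced model of a tagged union of section kinds.  */
enum section_style { STYLE_UNNAMED, STYLE_NAMED, STYLE_NOSWITCH };

struct named_section
{
  enum section_style style;
  const char *name;
};

struct unnamed_section
{
  enum section_style style;
  void (*callback) (const void *);   /* Overlaps named_section.name.  */
};

union section
{
  enum section_style style;
  struct named_section named;
  struct unnamed_section unnamed;
};

/* A do-nothing callback so an unnamed section can be built for testing.  */
static void
noop_callback (const void *data)
{
  (void) data;
}

/* Check the tag first; only then is reading named.name meaningful.
   Testing named.name for non-NULL instead would read the callback
   pointer of an unnamed section as if it were a string.  */
static bool
is_vtable_map_section (const union section *sect)
{
  return sect->style == STYLE_NAMED
         && strcmp (sect->named.name, ".vtable_map_vars") == 0;
}
```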

>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   * varasm.cc (assemble_variable): Fix type confusion bug when
>   checking for ".vtable_map_vars" section.
>
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 9dfbebbb915..6851201b6a2 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -2400,7 +2400,7 @@ assemble_variable (tree decl, int top_level 
> ATTRIBUTE_UNUSED,
>else
>  {
>/* Special-case handling of vtv comdat sections.  */
> -  if (sect->named.name
> +  if (SECTION_STYLE (sect) == SECTION_NAMED
> && (strcmp (sect->named.name, ".vtable_map_vars") == 0))
>   handle_vtv_comdat_section (sect, decl);
>else


Re: [PATCH] RISC-V: Add attributes for VSETVL PASS

2022-12-01 Thread Kito Cheng via Gcc-patches
LGTM, and committed to trunk!

On Tue, Nov 29, 2022 at 4:54 PM Kito Cheng  wrote:
>
> > >>> Yeah, I personally want to support RVV intrinsics in GCC13. As RVV
> > >>> intrinsic is going to release soon next week.
> > >>
> > >> OK, that's fine with me -- I was leaning that way, and I think Jeff only
> > >> had a weak opposition.  Are there any more changes required outside the
> > >> RISC-V backend?  Those would be the most controversial and are already
> > >> late, but if it's only backend stuff at this point then I'm OK taking
> > >> the risk for a bit longer.
> > >>
> > >> Jeff?
> > > It's not ideal, but I can live with the bits going into gcc-13 as long
> > > as they don't bleed out of the RISC-V port.
> >
> > Ya, that's kind of what happens every release though (and not just in
> > GCC, it's that way for everything).  Maybe for gcc-14 we can commit to
> > taking the stage1/stage3 split seriously in RISC-V land?
> >
> > It's early enough that nobody should be surprised, and even if we don't
> > need to do it as per the GCC rules we're going to go crazy if we keep
> > letting things go until the last minute like this.  I think the only
> > real fallout we've had so far was the B stuff in binutils, but we've
> > been exceedingly close to broken releases way too many times and it's
> > going to bite us at some point.
>
> I hope we can follow the GCC development rules in GCC 14 too. We didn't have
> enough engineering resources or community in RISC-V GNU land before, but now
> more people have joined the development and review work, so I believe that
> could be improved next year.
>
>
>
> Hi Jeff:
>
> Thanksgiving holiday is over, but I guess it's never too late to say thanks.
> Thank you for joining the RISC-V world and helping review lots of patches :)


Re: Re: [PATCH] RISC-V: Add duplicate vector support.

2022-12-01 Thread Kito Cheng via Gcc-patches
LGTM. As we discussed in another patch [1], I support continuing to merge
RVV-related changes for now; we agreed that it is not ideal but acceptable,
so committed to trunk :)

[1] 
https://patchwork.ozlabs.org/project/gcc/patch/20221128141406.242953-1-juzhe.zh...@rivai.ai/

On Tue, Nov 29, 2022 at 6:55 AM 钟居哲  wrote:
>
> OK.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Jeff Law
> Date: 2022-11-29 00:49
> To: juzhe.zhong; gcc-patches
> CC: kito.cheng
> Subject: Re: [PATCH] RISC-V: Add duplicate vector support.
>
> On 11/25/22 09:06, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/constraints.md (Wdm): New constraint.
> >  * config/riscv/predicates.md (direct_broadcast_operand): New 
> > predicate.
> >  * config/riscv/riscv-protos.h (RVV_VLMAX): New macro.
> >  (emit_pred_op): Refine function.
> >  * config/riscv/riscv-selftests.cc (run_const_vector_selftests): 
> > New function.
> >  (run_broadcast_selftests): Ditto.
> >  (BROADCAST_TEST): New tests.
> >  (riscv_run_selftests): More tests.
> >  * config/riscv/riscv-v.cc (emit_pred_move): Refine function.
> >  (emit_vlmax_vsetvl): Ditto.
> >  (emit_pred_op): Ditto.
> >  (expand_const_vector): New function.
> >  (legitimize_move): Add constant vector support.
> >  * config/riscv/riscv.cc (riscv_print_operand): New asm print rule 
> > for const vector.
> >  * config/riscv/riscv.h (X0_REGNUM): New macro.
> >  * config/riscv/vector-iterators.md: New attribute.
> >  * config/riscv/vector.md (vec_duplicate): New pattern.
> >  (@pred_broadcast): New pattern.
> >
> > gcc/testsuite/ChangeLog:
> >
> >  * gcc.target/riscv/rvv/base/dup-1.c: New test.
> >  * gcc.target/riscv/rvv/base/dup-2.c: New test.
>
> I think this should wait for the next stage1 cycle.
>
> jeff
>
>
>


Re: [PATCH][OG12] amdgcn: Support AMD-specific 'isa' and 'arch' traits in OpenMP context selectors

2022-12-01 Thread Paul-Antoine Arras

On 01/12/2022 13:45, Andrew Stubbs wrote:
P.S. If you want to split the patch into the GCN bits and the bits that 
depend on metadirectives then we can apply the first part to mainline 
right away.


So this is the OG12-specific part (including metadirective and dynamic 
context selectors) of the previous patch.
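
For readers who want a picture of the matching logic this patch touches, here is a minimal sketch. It is not the libgomp code: `toy_evaluate_device` and `TOY_COMPILED_ISA` are hypothetical names, and the processor being compiled for is modelled as a plain string instead of the per-processor preprocessor builtins. The 'fiji'/'gfx803' aliasing is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the processor this translation unit was built for.  */
#ifndef TOY_COMPILED_ISA
#define TOY_COMPILED_ISA "gfx906"
#endif

/* A NULL trait means "not specified" and therefore matches.  */
static bool
toy_evaluate_device (const char *kind, const char *arch, const char *isa)
{
  if (kind && strcmp (kind, "gpu") != 0)
    return false;

  /* Accept both spellings of the architecture name.  */
  if (arch && strcmp (arch, "gcn") != 0 && strcmp (arch, "amdgcn") != 0)
    return false;

  if (!isa)
    return true;

  /* Each isa name matches only the processor actually compiled for.  */
  return strcmp (isa, TOY_COMPILED_ISA) == 0;
}
```

With the default above, `toy_evaluate_device ("gpu", "amdgcn", "gfx906")` is true while the same call with "gfx900" is false, which is the one-name-per-processor behaviour the patch is after.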


Once https://gcc.gnu.org/r13-4446-ge41b243302e996 is backported, is it 
OK for OG12?


Thanks,
--
PA

From 494a815af459b13da6fe9bf5a84b94d4b1f94915 Mon Sep 17 00:00:00 2001
From: Paul-Antoine Arras 
Date: Wed, 30 Nov 2022 14:52:55 +0100
Subject: [PATCH] amdgcn: Support AMD-specific 'isa' and 'arch' traits in
 OpenMP context selectors

Add libgomp support for 'amdgcn' as arch, and for each processor type (as passed
to '-march') as isa traits.
Add test case for all supported 'isa' values used as context selectors in a
metadirective construct.

libgomp/ChangeLog:

* config/gcn/selector.c (GOMP_evaluate_current_device): Recognise
'amdgcn' as arch, and '-march' values (as well as 'gfx803') as isa traits.
* testsuite/libgomp.c-c++-common/metadirective-6.c: New test.
---
 libgomp/ChangeLog.omp |  6 +++
 libgomp/config/gcn/selector.c | 24 --
 .../libgomp.c-c++-common/metadirective-6.c| 48 +++
 3 files changed, 73 insertions(+), 5 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c

diff --git libgomp/ChangeLog.omp libgomp/ChangeLog.omp
index 74053a6eea0..a2f03914725 100644
--- libgomp/ChangeLog.omp
+++ libgomp/ChangeLog.omp
@@ -1,3 +1,9 @@
+2022-12-01  Paul-Antoine Arras 
+
+   * config/gcn/selector.c (GOMP_evaluate_current_device): Recognise
+   'amdgcn' as arch, and '-march' values (as well as 'gfx803') as isa traits.
+   * testsuite/libgomp.c-c++-common/metadirective-6.c: New test.
+
 2022-11-30  Tobias Burnus  
 
Backported from master:
diff --git libgomp/config/gcn/selector.c libgomp/config/gcn/selector.c
index 60793fc05d3..570bc1e8ae6 100644
--- libgomp/config/gcn/selector.c
+++ libgomp/config/gcn/selector.c
@@ -36,20 +36,34 @@ GOMP_evaluate_current_device (const char *kind, const char 
*arch,
   if (kind && strcmp (kind, "gpu") != 0)
 return false;
 
-  if (arch && strcmp (arch, "gcn") != 0)
+  if (arch && (strcmp (arch, "gcn") != 0 && strcmp (arch, "amdgcn") != 0))
 return false;
 
   if (!isa)
 return true;
 
-#ifdef __GCN3__
+#ifdef __gfx803__
   if (strcmp (isa, "fiji") == 0 || strcmp (isa, "gfx803") == 0)
 return true;
 #endif
 
-#ifdef __GCN5__
-  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") == 0
-  || strcmp (isa, "gfx908") == 0)
+#ifdef __gfx900__
+  if (strcmp (isa, "gfx900") == 0)
+return true;
+#endif
+
+#ifdef __gfx906__
+  if (strcmp (isa, "gfx906") == 0)
+return true;
+#endif
+
+#ifdef __gfx908__
+  if (strcmp (isa, "gfx908") == 0)
+return true;
+#endif
+
+#ifdef __gfx90a__
+  if (strcmp (isa, "gfx90a") == 0)
 return true;
 #endif
 
diff --git libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c 
libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c
new file mode 100644
index 000..6d169001db1
--- /dev/null
+++ libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c
@@ -0,0 +1,48 @@
+/* { dg-do link { target { offload_target_amdgcn } } } */
+/* { dg-additional-options "-foffload=-fdump-tree-omp_expand_metadirective" } */
+
+#define N 100
+
+void f (int x[], int y[], int z[])
+{
+  int i;
+
+  #pragma omp target map(to: x, y) map(from: z)
+#pragma omp metadirective \
+  when (device={isa("gfx803")}: teams num_teams(512)) \
+  when (device={isa("gfx900")}: teams num_teams(256)) \
+  when (device={isa("gfx906")}: teams num_teams(128)) \
+  when (device={isa("gfx908")}: teams num_teams(64)) \
+  when (device={isa("gfx90a")}: teams num_teams(32)) \
+  default (teams num_teams(4))
+   for (i = 0; i < N; i++)
+ z[i] = x[i] * y[i];
+}
+
+int main (void)
+{
+  int x[N], y[N], z[N];
+  int i;
+
+  for (i = 0; i < N; i++)
+{
+  x[i] = i;
+  y[i] = -i;
+}
+
+  f (x, y, z);
+
+  for (i = 0; i < N; i++)
+if (z[i] != x[i] * y[i])
+  return 1;
+
+  return 0;
+}
+
+/* The metadirective should be resolved after Gimplification.  */
+
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(512, 512" "omp_expand_metadirective" { target { any-opts "-foffload=-march=fiji" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(256, 256" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx900" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(128, 128" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx906" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 \\(64, 64" "omp_expand_metadirective" { target { any-opts "-foffload=-march=gfx908" } } } } */
+/* { dg-final { scan-offload-tree-dump "__builtin_GOMP_teams4 

[PATCH] varasm: Fix type confusion bug

2022-12-01 Thread Alex Coplan via Gcc-patches
Hi,

This patch fixes a type confusion bug in varasm.cc:assemble_variable.
The problem is that the current code calls:

  sect = get_variable_section (decl, false);

and then accesses sect->named.name without checking whether the section
is in fact a named section. In the surrounding else clause, we only know
that SECTION_STYLE (sect) != SECTION_NOSWITCH, so it is possible that
the section is an unnamed section.

In practice, this means that we end up doing a wild string compare
between a function pointer and the string literal ".vtable_map_vars".
This is because sect->named.name aliases sect->unnamed.callback in the
section union.

This can be seen in GDB with a simple testcase such as "int x;".

This patch fixes the issue by checking the SECTION_STYLE of the section
is in fact SECTION_NAMED before trying to do the string comparison.

We drop the existing check of whether sect->named.name is non-NULL
because this should presumably always be the case for a named section.
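
To illustrate the aliasing described above, here is a minimal, self-contained sketch. The types and names (`toy_section`, `is_vtable_map_section`, and so on) are hypothetical stand-ins and not GCC's real section union; the only point is that the tag has to be checked before the `named` member is read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's section kinds; not the real definitions.  */
enum toy_style { TOY_UNNAMED, TOY_NAMED, TOY_NOSWITCH };

struct toy_common { enum toy_style style; };
struct toy_named { struct toy_common common; const char *name; };
struct toy_unnamed { struct toy_common common; void (*callback) (const void *); };

/* The members share storage, so `named.name` and `unnamed.callback`
   occupy the same bytes.  */
union toy_section
{
  struct toy_common common;
  struct toy_named named;
  struct toy_unnamed unnamed;
};

/* Placeholder callback, used only to populate an unnamed section.  */
static void toy_callback (const void *data) { (void) data; }

/* Only read `named.name` once the tag says the section is named.  */
static bool
is_vtable_map_section (const union toy_section *sect)
{
  return sect->common.style == TOY_NAMED
	 && strcmp (sect->named.name, ".vtable_map_vars") == 0;
}
```

Without the style check, passing an unnamed section would hand the bytes of `callback` to strcmp as if they were a string pointer, which is the wild compare the patch removes.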

Bootstrapped/regtested on aarch64-none-linux-gnu, OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

* varasm.cc (assemble_variable): Fix type confusion bug when
checking for ".vtable_map_vars" section.
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 9dfbebbb915..6851201b6a2 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -2400,7 +2400,7 @@ assemble_variable (tree decl, int top_level 
ATTRIBUTE_UNUSED,
   else
 {
   /* Special-case handling of vtv comdat sections.  */
-  if (sect->named.name
+  if (SECTION_STYLE (sect) == SECTION_NAMED
  && (strcmp (sect->named.name, ".vtable_map_vars") == 0))
handle_vtv_comdat_section (sect, decl);
   else


Re: [PATCH] libgccjit: Fix float vector comparison

2022-12-01 Thread Antoni Boucher via Gcc-patches
On Thu, 2022-12-01 at 10:25 -0500, David Malcolm wrote:
> On Thu, 2022-12-01 at 10:01 -0500, Antoni Boucher wrote:
> > Thanks, David.
> > Since we're not in phase 1 anymore, do we need an approval before I
> > merge like last year or can I merge immediately?
> 
> I think it counts as a bug fix and thus you can go ahead and merge
> (assuming you've done the usual testing).
> 
> > I also have many other patches (all in jit) that I need to prepare
> > and
> > post to this mailing list.
> > What do you think?
> 
> Given that you're one of the main users of libgccjit I think there's
> a
> case for stretching the deadlines a bit here.
> 
> Do you have a repo I can look at?

Yes! The commits are in my fork:
https://github.com/antoyo/gcc

The only big one is the one adding support for target-dependent
builtins:
https://github.com/antoyo/gcc/commit/6d4313d4c02dd878f43917c978f299f5119330f0

Regarding this one, there's the issue that since we record the builtins
on the first context run, we only have access to the builtins from the
second run.
Do you have any idea how to fix this?
Or do you consider this is acceptable?

I also have a WIP branch which adds support for try/catch:
https://github.com/antoyo/gcc/commit/6219339fcacb079431596a0bc6cf8d430a1bd5a1
I'm not sure if this one is going to be ready soon or not.

Thanks.

> 
> Dave
> 
> 
> > 
> > On Thu, 2022-12-01 at 09:28 -0500, David Malcolm wrote:
> > > On Sun, 2022-11-20 at 14:03 -0500, Antoni Boucher via Jit wrote:
> > > > Hi.
> > > > This fixes bug 107770.
> > > > Thanks for the review.
> > > 
> > > Thanks, the patch looks good to me.
> > > 
> > > Dave
> > > 
> > 
> 



Re: [PATCH] gcc: remove incpath.o from CXX_C_OBJS

2022-12-01 Thread Richard Biener via Gcc-patches
On Thu, Dec 1, 2022 at 10:33 AM Martin Liška  wrote:
>
> The object is already included in OBJS (libbackend.a), thus
> we don't need it.
>
> Noticed while using partial linking for libbackend.a.
>
> Ready to be installed?

Looks obvious?

OK.

> Thanks,
> Martin
>
> gcc/cp/ChangeLog:
>
> * Make-lang.in: Remove extra object dependency.
> ---
>  gcc/cp/Make-lang.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
> index af25bdc044a..75e2f7c7ba3 100644
> --- a/gcc/cp/Make-lang.in
> +++ b/gcc/cp/Make-lang.in
> @@ -81,7 +81,7 @@ g++-cross$(exeext): xg++$(exeext)
>
>  # The compiler itself.
>  # Shared with C front end:
> -CXX_C_OBJS = attribs.o incpath.o \
> +CXX_C_OBJS = attribs.o \
> $(C_COMMON_OBJS) $(CXX_TARGET_OBJS)
>
>  # Language-specific object files for C++ and Objective C++.
> --
> 2.38.1
>


Re: [PATCH] libgccjit: Fix float vector comparison

2022-12-01 Thread David Malcolm via Gcc-patches
On Thu, 2022-12-01 at 10:01 -0500, Antoni Boucher wrote:
> Thanks, David.
> Since we're not in phase 1 anymore, do we need an approval before I
> merge like last year or can I merge immediately?

I think it counts as a bug fix and thus you can go ahead and merge
(assuming you've done the usual testing).

> I also have many other patches (all in jit) that I need to prepare
> and
> post to this mailing list.
> What do you think?

Given that you're one of the main users of libgccjit I think there's a
case for stretching the deadlines a bit here.

Do you have a repo I can look at?

Dave


> 
> On Thu, 2022-12-01 at 09:28 -0500, David Malcolm wrote:
> > On Sun, 2022-11-20 at 14:03 -0500, Antoni Boucher via Jit wrote:
> > > Hi.
> > > This fixes bug 107770.
> > > Thanks for the review.
> > 
> > Thanks, the patch looks good to me.
> > 
> > Dave
> > 
> 



Re: [PATCH v2] Add condition coverage profiling

2022-12-01 Thread Martin Liška
On 11/11/22 06:21, Jørgen Kvalsvik wrote:
> From: Jørgen Kvalsvik 
> 
> This patch adds support in gcc+gcov for modified condition/decision
> coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of
> test/code coverage and it is particularly important in the aviation and
> automotive industries for safety-critical applications. MC/DC is
> required for or recommended by:
> 
> * DO-178C for the most critical software (Level A) in avionics
> * IEC 61508 for SIL 4
> * ISO 26262-6 for ASIL D
> 
> From the SQLite webpage:
> 
> Two methods of measuring test coverage were described above:
> "statement" and "branch" coverage. There are many other test
> coverage metrics besides these two. Another popular metric is
> "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
> MC/DC as follows:
> 
> * Each decision tries every possible outcome.
> * Each condition in a decision takes on every possible outcome.
> * Each entry and exit point is invoked.
> * Each condition in a decision is shown to independently affect
>   the outcome of the decision.
> 
> In the C programming language where && and || are "short-circuit"
> operators, MC/DC and branch coverage are very nearly the same thing.
> The primary difference is in boolean vector tests. One can test for
> any of several bits in bit-vector and still obtain 100% branch test
> coverage even though the second element of MC/DC - the requirement
> that each condition in a decision take on every possible outcome -
> might not be satisfied.
> 
> https://sqlite.org/testing.html#mcdc
> 
> Wahlen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
> MC/DC" describes an algorithm for adding instrumentation by carrying
> over information from the AST, but my algorithm analyses the control
> flow graph to instrument for coverage. This has the benefit of being
> programming language independent and faithful to compiler decisions
> and transformations, although I have only tested it on constructs in C
> and C++, see testsuite/gcc.misc-tests and testsuite/g++.dg.
> 
> Like Wahlen et al this implementation records coverage in fixed-size
> bitsets which gcov knows how to interpret. This is very fast, but
> introduces a limit on the number of terms in a single boolean
> expression: the number of bits in a gcov_unsigned_type (which is
> typedef'd to uint64_t). For most practical purposes this should be
> acceptable. This limitation is in the implementation and not the
> algorithm, so support for more conditions can be added by also
> introducing arbitrary-sized bitsets.
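
A minimal sketch of the fixed-size bitset idea, assuming one pair of 64-bit words per boolean expression. This is not the instrumentation from the patch: `toy_cond_coverage`, `toy_record` and `toy_outcomes_seen` are hypothetical names, and the sketch only tracks which outcomes were observed, not the masking and independence analysis that MC/DC proper requires.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record for one boolean expression: bit i of `seen_true`
   (`seen_false`) is set once condition i has evaluated to true (false).
   A 64-bit word caps the expression at 64 conditions.  */
struct toy_cond_coverage
{
  uint64_t seen_true;
  uint64_t seen_false;
};

static void
toy_record (struct toy_cond_coverage *cov, unsigned idx, bool outcome)
{
  uint64_t bit = (uint64_t) 1 << idx;	/* idx must be below 64.  */
  if (outcome)
    cov->seen_true |= bit;
  else
    cov->seen_false |= bit;
}

/* Count the outcomes seen so far among the first `nconds` conditions.  */
static unsigned
toy_outcomes_seen (const struct toy_cond_coverage *cov, unsigned nconds)
{
  unsigned count = 0;
  for (unsigned i = 0; i < nconds; i++)
    {
      count += (cov->seen_true >> i) & 1;
      count += (cov->seen_false >> i) & 1;
    }
  return count;
}
```

Recording is a single OR per evaluation, which is where the speed comes from; the 64-term ceiling falls directly out of the word size.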
> 
> For space overhead, the instrumentation needs two accumulators
> (gcov_unsigned_type) per condition in the program which will be written
> to the gcov file. In addition, every function gets a pair of local
> accumulators, but these accumulators are reused between conditions in the
> same function.
> 
> For time overhead, there is a zeroing of the local accumulators for
> every condition and one or two bitwise operations on every edge taken in
> an expression.
> 
> In action it looks pretty similar to the branch coverage. The -g short
> opt carries no significance, but was chosen because it was an available
> option with the upper-case free too.
> 
> gcov --conditions:
> 
> 3:   17:void fn (int a, int b, int c, int d) {
> 3:   18:if ((a && (b || c)) && d)
> condition outcomes covered 3/8
> condition  0 not covered (true false)
> condition  1 not covered (true)
> condition  2 not covered (true)
> condition  3 not covered (true)
> 1:   19:x = 1;
> -:   20:else
> 2:   21:x = 2;
> 3:   22:}
> 
> gcov --conditions --json-format:
> 
> "conditions": [
> {
> "not_covered_false": [
> 0
> ],
> "count": 8,
> "covered": 3,
> "not_covered_true": [
> 0,
> 1,
> 2,
> 3
> ]
> }
> ],
> 
> Some expressions, mostly those without else-blocks, are effectively
> "rewritten" in the CFG construction making the algorithm unable to
> distinguish them:
> 
> and.c:
> 
> if (a && b && c)
> x = 1;
> 
> ifs.c:
> 
> if (a)
> if (b)
> if (c)
> x = 1;
> 
> gcc will build the same graph for both these programs, and gcov will
> report both as 3-term expressions. It is vital that it is not
> interpreted the other way around (which is consistent with the shape of
> the graph) because otherwise the masking would be wrong for the and.c
> program which is a more severe error. While surprising, users would
> probably expect some minor rewriting of semantically-identical
> expressions.
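
The behavioural equivalence of the two forms can be confirmed with a small sketch. It checks only that both shapes compute the same result for every input; it says nothing about the CFG gcc builds for them. The function names are illustrative.

```c
#include <stdbool.h>

/* The two shapes from the message, each returning the value x ends up with
   (0 when the assignment is skipped).  */
static int
with_and (bool a, bool b, bool c)
{
  int x = 0;
  if (a && b && c)
    x = 1;
  return x;
}

static int
with_nested_ifs (bool a, bool b, bool c)
{
  int x = 0;
  if (a)
    if (b)
      if (c)
	x = 1;
  return x;
}

/* Exhaustively compare the two functions over all eight inputs.  */
static bool
shapes_agree (void)
{
  for (int bits = 0; bits < 8; bits++)
    {
      bool a = bits & 1, b = bits & 2, c = bits & 4;
      if (with_and (a, b, c) != with_nested_ifs (a, b, c))
	return false;
    }
  return true;
}
```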
> 
> and.c.gcov:
> #:2:if (a && b && c)
> condition outcomes covered 6/6
> #:3:x = 1;
> 
> 

Re: [PATCH] libgccjit: Fix float vector comparison

2022-12-01 Thread Antoni Boucher via Gcc-patches
Thanks, David.
Since we're not in phase 1 anymore, do we need an approval before I
merge like last year or can I merge immediately?
I also have many other patches (all in jit) that I need to prepare and
post to this mailing list.
What do you think?

On Thu, 2022-12-01 at 09:28 -0500, David Malcolm wrote:
> On Sun, 2022-11-20 at 14:03 -0500, Antoni Boucher via Jit wrote:
> > Hi.
> > This fixes bug 107770.
> > Thanks for the review.
> 
> Thanks, the patch looks good to me.
> 
> Dave
> 



Re: [PATCH] amdgcn: Add preprocessor builtins for every processor type

2022-12-01 Thread Andrew Stubbs

On 01/12/2022 14:35, Paul-Antoine Arras wrote:

I believe this patch addresses your comments regarding the GCN bits.

The new builtins are consistent with the LLVM naming convention (lower 
case, canonical name). For gfx803, I also kept '__fiji__' to be 
consistent with -march=fiji.


Is it OK for mainline?


You need to wrap the long line in the changelog (I'm not sure it'll even 
let you push it like that), but otherwise it looks fine.


OK.

Andrew


[PATCH] amdgcn: Add preprocessor builtins for every processor type

2022-12-01 Thread Paul-Antoine Arras

Hi Andrew, all,
On 01/12/2022 13:45, Andrew Stubbs wrote:

On 01/12/2022 11:10, Paul-Antoine Arras wrote:
+  if (TARGET_FIJI) \
+    builtin_define ("__FIJI__"); \
+  else if (TARGET_VEGA10) \
+    builtin_define ("__VEGA10__"); \
+  else if (TARGET_VEGA20) \
+    builtin_define ("__VEGA20__"); \
+  else if (TARGET_GFX908) \
+    builtin_define ("__GFX908__"); \
+  else if (TARGET_GFX90a) \
+    builtin_define ("__GFX90a__"); \
+  } while (0)



I don't think it makes sense to say __VEGA10__ when the user asked for 
-march=gfx900.


This whole naming thing is a bit of a mess already, so I think we'd do 
better to either keep the same names throughout or match what LLVM does 
(since it got to these first).


Please use "__gfx900__" etc. (lower case).

[...]

P.S. If you want to split the patch into the GCN bits and the bits that 
depend on metadirectives then we can apply the first part to mainline 
right away.


I believe this patch addresses your comments regarding the GCN bits.

The new builtins are consistent with the LLVM naming convention (lower 
case, canonical name). For gfx803, I also kept '__fiji__' to be 
consistent with -march=fiji.
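
As a rough illustration of how device code might consume these builtins, the sketch below picks a string at preprocessing time. Only the macro names come from this thread (lower case, one per '-march' value, plus '__fiji__'); the helper `device_isa_name` is a hypothetical name. On a non-GCN host none of the macros are defined, so the fallback branch is taken.

```c
/* Hypothetical helper: report which '-march' builtin was defined.  */
static const char *
device_isa_name (void)
{
#if defined (__gfx803__) || defined (__fiji__)
  return "gfx803";
#elif defined (__gfx900__)
  return "gfx900";
#elif defined (__gfx906__)
  return "gfx906";
#elif defined (__gfx908__)
  return "gfx908";
#elif defined (__gfx90a__)
  return "gfx90a";
#else
  return "unknown";	/* Not compiled for an AMD GCN device.  */
#endif
}
```

Because exactly one of these is defined per '-march' value, a chain like this never needs to guess between, say, gfx900 and gfx906 the way a single '__GCN5__' test would.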


Is it OK for mainline?

Thanks,
--
PA

From 238e8e131741fc962fe87482d1e9a6eb1252c75c Mon Sep 17 00:00:00 2001
From: Paul-Antoine Arras 
Date: Thu, 1 Dec 2022 15:09:54 +0100
Subject: [PATCH] amdgcn: Add preprocessor builtins for every processor type

Provide a specific builtin for each possible value of '-march'.

gcc/ChangeLog:

* config/gcn/gcn-opts.h (TARGET_FIJI): -march=fiji.
(TARGET_VEGA10): -march=gfx900.
(TARGET_VEGA20): -march=gfx906.
(TARGET_GFX908): -march=gfx908.
(TARGET_GFX90a): -march=gfx90a.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Define a builtin that 
uniquely maps to '-march'.
---
 gcc/config/gcn/gcn-opts.h |  6 ++
 gcc/config/gcn/gcn.h  | 40 +--
 2 files changed, 32 insertions(+), 14 deletions(-)

diff --git gcc/config/gcn/gcn-opts.h gcc/config/gcn/gcn-opts.h
index b62dfb45f59..b54eae79faf 100644
--- gcc/config/gcn/gcn-opts.h
+++ gcc/config/gcn/gcn-opts.h
@@ -27,6 +27,12 @@ enum processor_type
   PROCESSOR_GFX90a
 };
 
+#define TARGET_FIJI (gcn_arch == PROCESSOR_FIJI)
+#define TARGET_VEGA10 (gcn_arch == PROCESSOR_VEGA10)
+#define TARGET_VEGA20 (gcn_arch == PROCESSOR_VEGA20)
+#define TARGET_GFX908 (gcn_arch == PROCESSOR_GFX908)
+#define TARGET_GFX90a (gcn_arch == PROCESSOR_GFX90a)
+
 /* Set in gcn_option_override.  */
 extern enum gcn_isa {
   ISA_UNKNOWN,
diff --git gcc/config/gcn/gcn.h gcc/config/gcn/gcn.h
index 38f7212db59..1cc5981d904 100644
--- gcc/config/gcn/gcn.h
+++ gcc/config/gcn/gcn.h
@@ -16,20 +16,32 @@
 
 #include "config/gcn/gcn-opts.h"
 
-#define TARGET_CPU_CPP_BUILTINS()  \
-  do   \
-{  \
-  builtin_define ("__AMDGCN__");   \
-  if (TARGET_GCN3) \
-   builtin_define ("__GCN3__");\
-  else if (TARGET_GCN5)\
-   builtin_define ("__GCN5__");\
-  else if (TARGET_CDNA1)   \
-   builtin_define ("__CDNA1__");   \
-  else if (TARGET_CDNA2)   \
-   builtin_define ("__CDNA2__");   \
-}  \
-  while(0)
+#define TARGET_CPU_CPP_BUILTINS()  \
+  do   \
+{  \
+  builtin_define ("__AMDGCN__");   \
+  if (TARGET_GCN3) \
+   builtin_define ("__GCN3__");   \
+  else if (TARGET_GCN5)\
+   builtin_define ("__GCN5__");   \
+  else if (TARGET_CDNA1)   \
+   builtin_define ("__CDNA1__");  \
+  else if (TARGET_CDNA2)   \
+   builtin_define ("__CDNA2__");  \
+  if (TARGET_FIJI) \
+   {  \
+ builtin_define ("__fiji__");

Re: [PATCH] libgccjit: Fix float vector comparison

2022-12-01 Thread David Malcolm via Gcc-patches
On Sun, 2022-11-20 at 14:03 -0500, Antoni Boucher via Jit wrote:
> Hi.
> This fixes bug 107770.
> Thanks for the review.

Thanks, the patch looks good to me.

Dave



Ping^3: [PATCH] libcpp: Improve location for macro names [PR66290]

2022-12-01 Thread Lewis Hyatt via Gcc-patches
Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html

May I please ping this one? Thanks!
I have also re-attached the rebased patch here.

-Lewis

On Wed, Oct 12, 2022 at 06:37:50PM -0400, Lewis Hyatt wrote:
> Hello-
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> 
> Since Jeff was kind enough to ack one of my other preprocessor patches
> today, I have become emboldened to ping this one again too :). Would
> anyone have some time to take a look at it please? Thanks!
> 
> -Lewis
> 
> On Thu, Sep 15, 2022 at 6:31 PM Lewis Hyatt  wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> > May I please ping this patch? Thank you.
> >
> > -Lewis
> >
> > On Fri, Aug 5, 2022 at 12:14 PM Lewis Hyatt  wrote:
> > >
> > >
> > > When libcpp reports diagnostics whose locus is a macro name (such as for
> > > -Wunused-macros), it uses the location in the cpp_macro object that was
> > > stored by _cpp_new_macro. This is currently set to pfile->directive_line,
> > > which contains the line number only and no column information. This patch
> > > changes the stored location to the src_loc for the token defining the 
> > > macro
> > > name, which includes the location and range information.
> > >
> > > libcpp/ChangeLog:
> > >
> > > PR c++/66290
> > > * macro.cc (_cpp_create_definition): Add location argument.
> > > * internal.h (_cpp_create_definition): Adjust prototype.
> > > * directives.cc (do_define): Pass new location argument to
> > > _cpp_create_definition.
> > > (do_undef): Stop passing inferior location to 
> > > cpp_warning_with_line;
> > > the default from cpp_warning is better.
> > > (cpp_pop_definition): Pass new location argument to
> > > _cpp_create_definition.
> > > * pch.cc (cpp_read_state): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR c++/66290
> > > * c-c++-common/cpp/macro-ranges.c: New test.
> > > * c-c++-common/cpp/line-2.c: Adapt to check for column information
> > > on macro-related libcpp warnings.
> > > * c-c++-common/cpp/line-3.c: Likewise.
> > > * c-c++-common/cpp/macro-arg-count-1.c: Likewise.
> > > * c-c++-common/cpp/pr58844-1.c: Likewise.
> > > * c-c++-common/cpp/pr58844-2.c: Likewise.
> > > * c-c++-common/cpp/warning-zero-location.c: Likewise.
> > > * c-c++-common/pragma-diag-14.c: Likewise.
> > > * c-c++-common/pragma-diag-15.c: Likewise.
> > > * g++.dg/modules/macro-2_d.C: Likewise.
> > > * g++.dg/modules/macro-4_d.C: Likewise.
> > > * g++.dg/modules/macro-4_e.C: Likewise.
> > > * g++.dg/spellcheck-macro-ordering.C: Likewise.
> > > * gcc.dg/builtin-redefine.c: Likewise.
> > > * gcc.dg/cpp/Wunused.c: Likewise.
> > > * gcc.dg/cpp/redef2.c: Likewise.
> > > * gcc.dg/cpp/redef3.c: Likewise.
> > > * gcc.dg/cpp/redef4.c: Likewise.
> > > * gcc.dg/cpp/ucnid-11-utf8.c: Likewise.
> > > * gcc.dg/cpp/ucnid-11.c: Likewise.
> > > * gcc.dg/cpp/undef2.c: Likewise.
> > > * gcc.dg/cpp/warn-redefined-2.c: Likewise.
> > > * gcc.dg/cpp/warn-redefined.c: Likewise.
> > > * gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
> > > * gcc.dg/cpp/warn-unused-macros.c: Likewise.
> > > ---
> > >
> > > Notes:
> > > Hello-
> > >
> > > The PR (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66290) was 
> > > originally
> > > about the entirely wrong location for -Wunused-macros in C++ mode, 
> > > which
> > > behavior was fixed by r13-1903, but before closing it out I wanted to 
> > > also
> > > address a second point brought up in the PR comments, namely that we 
> > > do not
> > > include column information when emitting diagnostics for macro names, 
> > > such as
> > > is done for -Wunused-macros. The attached patch updates the location 
> > > stored in
> > > the cpp_macro object so that it includes the column and range 
> > > information for
> > > the token comprising the macro name; previously, the location was 
> > > just the
> > > generic one pointing to the whole line.
> > >
> > > The change to libcpp is very small, the reason for all the testsuite 
> > > changes is
> > > that I have updated all tests explicitly looking for the columnless 
> > > diagnostics
> > > (with the "-:" syntax to dg-warning et al) so that they expect a 
> > > column
> > > instead. I also added a new test which verifies the expected range 
> > > information
> > > in diagnostics with carets.
> > >
> > > Bootstrap + regtest on x86-64 Linux looks good. Please let me know if 
> > > it looks
> > > OK? Thanks!
> > >
> > > -Lewis
> > >
> > >  libcpp/directives.cc  |  13 +-
> > >  libcpp/internal.h |   2 +-
> > >  libcpp/macro.cc   

Re: [PATCH 3/3] vect: inbranch SIMD clones

2022-12-01 Thread Jakub Jelinek via Gcc-patches
On Thu, Dec 01, 2022 at 01:35:38PM +, Andrew Stubbs wrote:
> > Maybe better add -ffat-lto-objects to dg-additional-options and drop
> > the dg-skip-if (if it works with that, for all similar tests)?
> 
> The tests are already run with -ffat-lto-objects and the test still fails
> (pattern found zero times). I don't know why.
> 
> Aside from that, I've made all the other changes you requested.

Ah, I see what's going on.  You match simdclone, which isn't matched just in
the calls (I bet that is what you actually want to count), but also twice
per each simd clone definition (and if somebody has, say, a path to the gcc
tree with simdclone in the name, it could match even more times).

Thus, I think:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16.c
> @@ -0,0 +1,89 @@
> +/* { dg-require-effective-target vect_simd_clones } */
> +/* { dg-additional-options "-fopenmp-simd -fdump-tree-optimized" } */
> +/* { dg-additional-options "-mavx" { target avx_runtime } } */
> +
> +/* Test that simd inbranch clones work correctly.  */
> +
> +#ifndef TYPE
> +#define TYPE int
> +#endif
> +
> +/* A simple function that will be cloned.  */
> +#pragma omp declare simd
> +TYPE __attribute__((noinline))
> +foo (TYPE a)
> +{
> +  return a + 1;
> +}
> +
> +/* Check that "inbranch" clones are called correctly.  */
> +
> +void __attribute__((noinline))

You should use noipa attribute instead of noinline on callers
which aren't declare simd (on declare simd it would prevent cloning
which is essential for the declare simd behavior), so that you don't
get surprises e.g. from extra ipa cp etc.
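
A minimal sketch of what that looks like on a caller, assuming GCC 8 or later where the `noipa` attribute is available (other compilers may ignore it with a warning). `add_one` and `masked_copy` are illustrative names, not the functions from the test.

```c
/* Plain helper standing in for the declare simd function.  */
static int
add_one (int a)
{
  return a + 1;
}

/* `noipa` keeps this caller out of interprocedural optimisation
   (inlining, cloning, IPA constant propagation and so on).  */
__attribute__ ((noipa)) void
masked_copy (const int *a, int *b, int size)
{
  for (int i = 0; i < size; i++)
    b[i] = a[i] < 1 ? add_one (a[i]) : a[i];
}
```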

> +masked (TYPE * __restrict a, TYPE * __restrict b, int size)
> +{
> +  #pragma omp simd
> +  for (int i = 0; i < size; i++)
> +b[i] = a[i]<1 ? foo(a[i]) : a[i];
> +}
> +
> +/* Check that "inbranch" works when there might be unrolling.  */
> +
> +void __attribute__((noinline))

So here too.
> +masked_fixed (TYPE * __restrict a, TYPE * __restrict b)

> +/* Ensure the in-branch simd clones are used on targets that support
> +   them.  These counts include all calls and definitions.  */
> +
> +/* { dg-skip-if "" { x86_64-*-* } { "-flto" } { "" } } */

Drop the line above.

> +/* { dg-final { scan-tree-dump-times "simdclone" 18 "optimized" { target 
> x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "simdclone" 7 "optimized" { target 
> amdgcn-*-* } } } */

And scan-tree-dump-times " = foo.simdclone" 2 "optimized"; I'd think that
should be the right number for all of x86_64, amdgcn and aarch64.  And
please don't forget about i?86-*-* too.
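
To make the counting difference concrete, here is a small sketch using plain substring matching (DejaGnu patterns are regular expressions, so this is only an approximation). The dump excerpt in `toy_dump` is made up for illustration and is not real compiler output; `count_matches` is a hypothetical helper.

```c
#include <string.h>

/* Count non-overlapping occurrences of `needle` in `text`.  */
static unsigned
count_matches (const char *text, const char *needle)
{
  unsigned count = 0;
  size_t len = strlen (needle);
  if (len == 0)
    return 0;
  for (const char *p = strstr (text, needle); p; p = strstr (p + len, needle))
    count++;
  return count;
}

/* Made-up dump excerpt: one clone definition header plus two calls.  */
static const char toy_dump[] =
  ";; Function foo.simdclone.0 (foo.simdclone.0)\n"
  "  _1 = foo.simdclone.0 (a_2, mask_3);\n"
  "  _4 = foo.simdclone.0 (a_5, mask_6);\n";
```

On this excerpt the bare "simdclone" pattern matches four times (twice in the definition header, once per call), while " = foo.simdclone" matches only the two calls.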

> +/* TODO: aarch64 */

For aarch64, one would need to include it in 
check_effective_target_vect_simd_clones
first...

Otherwise LGTM.

Jakub



[COMMITTED] ada: Strip conversions for the implementation of storage models

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This is necessary for unconstrained allocators with qualified expression.

gcc/ada/

* gcc-interface/trans.cc (get_storage_model_access): Strip any type
conversion around the node before looking into it.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gcc-interface/trans.cc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/ada/gcc-interface/trans.cc b/gcc/ada/gcc-interface/trans.cc
index b9d7c015a73..a012271abf3 100644
--- a/gcc/ada/gcc-interface/trans.cc
+++ b/gcc/ada/gcc-interface/trans.cc
@@ -4400,6 +4400,11 @@ get_storage_model_access (Node_Id gnat_node, Entity_Id 
*gnat_smo)
   return;
 }
 
+  /* Now strip any type conversion from GNAT_NODE.  */
+  if (Nkind (gnat_node) == N_Type_Conversion
+  || Nkind (gnat_node) == N_Unchecked_Type_Conversion)
+gnat_node = Expression (gnat_node);
+
   while (node_is_component (gnat_node))
 gnat_node = Prefix (gnat_node);
 
-- 
2.34.1



[COMMITTED] ada: Enforce Aggregate aspect legality rule

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

Ada 2022 requires that an Aggregate aspect specification shall specify a
name for at least one of Add_Named, Add_Unnamed, or Assign_Indexed.
Enforce this rule.

gcc/ada/

* sem_ch13.adb
(Validate_Aspect_Aggregate): Reject illegal case where none of
Add_Named, Add_Unnamed, and Assign_Indexed are specified.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch13.adb | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index a4782747aff..71eabb4f627 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -15729,6 +15729,12 @@ package body Sem_Ch13 is
 return;
  end if;
 
+  elsif No (Add_Named_Subp)
+and then No (Add_Unnamed_Subp)
+and then No (Assign_Indexed_Subp)
+  then
+ Error_Msg_N ("incomplete specification for aggregate", N);
+
   elsif Present (New_Indexed_Subp) /= Present (Assign_Indexed_Subp) then
  Error_Msg_N ("incomplete specification for indexed aggregate", N);
   end if;
-- 
2.34.1



[COMMITTED] ada: Further adjustments to User's Guide for PIE default

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

gcc/ada/

* doc/gnat_ugn/gnat_and_program_execution.rst (Non-Symbolic
Traceback): Add compilation line.
(Symbolic Traceback): Remove obsolete stuff.
* doc/gnat_ugn/gnat_utility_programs.rst (gnatsymbolize): Adjust.
* gnat_ugn.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 .../gnat_ugn/gnat_and_program_execution.rst   | 15 ++-
 .../doc/gnat_ugn/gnat_utility_programs.rst| 16 
 gcc/ada/gnat_ugn.texi | 19 +++
 3 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst 
b/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst
index 45ecea75416..5dab2d45626 100644
--- a/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst
+++ b/gcc/ada/doc/gnat_ugn/gnat_and_program_execution.rst
@@ -928,10 +928,9 @@ Ada facilities defined in ``Ada.Exceptions``. Here is a 
simple example:
  P2;
   end STB;
 
-This program will output:
-
   ::
 
+ $ gnatmake stb -g -bargs -E -largs -no-pie
  $ stb
 
  raised CONSTRAINT_ERROR : stb.adb:12 range check failed
@@ -1070,7 +1069,7 @@ Here is an example:
 
   ::
 
-  $ gnatmake -g .\stb -bargs -E
+  $ gnatmake -g stb -bargs -E
   $ stb
 
   0040149F in stb.p1 at stb.adb:8
@@ -1082,16 +1081,6 @@ Here is an example:
   004011F1 in mainCRTStartup at crt1.c:222
   77E892A4 in ?? at ??:0
 
-In the above example the ``.\`` syntax in the ``gnatmake`` command
-is currently required by ``addr2line`` for files that are in
-the current working directory.
-Moreover, the exact sequence of linker options may vary from platform
-to platform.
-The above :switch:`-largs` section is for Windows platforms. By contrast,
-under Unix there is no need for the :switch:`-largs` section.
-Differences across platforms are due to details of linker implementation.
-
-
 .. rubric:: Tracebacks From Anywhere in a Program
 
 It is possible to get a symbolic stack traceback
diff --git a/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst 
b/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
index f2d42e96bd4..7df45d518aa 100644
--- a/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
+++ b/gcc/ada/doc/gnat_ugn/gnat_utility_programs.rst
@@ -2154,6 +2154,7 @@ building specialized scripts.
 with GNAT.IO; use GNAT.IO;
 with GNAT.Traceback; use GNAT.Traceback;
 with GNAT.Debug_Utilities;
+
 package body Pck is
procedure Call_Me_Third is
   TB : Tracebacks_Array (1 .. 5);
@@ -2177,10 +2178,25 @@ building specialized scripts.
   Call_Me_Second;
end Call_Me_First;
 end Pck;
+
+with GNAT.IO; use GNAT.IO;
+with GNAT.Debug_Utilities;
+with GNAT.Traceback;
+with System;
+
 with Pck; use Pck;
 
 procedure Foo is
+   LA : constant System.Address := \
+ GNAT.Traceback.Executable_Load_Address;
+
+   use type System.Address;
+
 begin
+   if LA /= System.Null_Address then
+  Put_Line ("Load address: " & GNAT.Debug_Utilities.Image_C (LA));
+   end if;
+
Global_Val := 123;
Call_Me_First;
 end Foo;
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index 5224a1201b8..dfe44b0937c 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -19,7 +19,7 @@
 
 @copying
 @quotation
-GNAT User's Guide for Native Platforms , Nov 28, 2022
+GNAT User's Guide for Native Platforms , Dec 01, 2022
 
 AdaCore
 
@@ -19200,13 +19200,9 @@ begin
P2;
 end STB;
 @end example
-@end quotation
-
-This program will output:
-
-@quotation
 
 @example
+$ gnatmake stb -g -bargs -E -largs -no-pie
 $ stb
 
 raised CONSTRAINT_ERROR : stb.adb:12 range check failed
@@ -19350,7 +19346,7 @@ end STB;
 @end example
 
 @example
-$ gnatmake -g .\stb -bargs -E
+$ gnatmake -g stb -bargs -E
 $ stb
 
 0040149F in stb.p1 at stb.adb:8
@@ -19364,15 +19360,6 @@ $ stb
 @end example
 @end quotation
 
-In the above example the @code{.\} syntax in the @code{gnatmake} command
-is currently required by @code{addr2line} for files that are in
-the current working directory.
-Moreover, the exact sequence of linker options may vary from platform
-to platform.
-The above @code{-largs} section is for Windows platforms. By contrast,
-under Unix there is no need for the @code{-largs} section.
-Differences across platforms are due to details of linker implementation.
-
 @subsubheading Tracebacks From Anywhere in a Program
 
 
-- 
2.34.1



[COMMITTED] ada: Use the address type of a Storage_Model_Type for 'Address

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Gary Dismukes 

When an Address attribute applies to an object that is a dereference of
an access value whose type has aspect Designated_Storage_Model, the
attribute will now be treated as having the address type associated
with the Storage_Model_Type of the access type's associated Storage_Model
object instead of being of type System.Address.

gcc/ada/

* sem_attr.adb (Analyze_Attribute, Attribute_Address): In the case
where the attribute's prefix is a dereference of a value of an
access type that has aspect Designated_Storage_Model (or a
renaming of such a dereference), set the attribute's type to the
corresponding Storage_Model_Type's associated address type rather
than System.Address.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_attr.adb | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index 4c76b9344c2..cca6f6f8c7d 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -3430,7 +3430,34 @@ package body Sem_Attr is
  Check_E0;
  Address_Checks;
  Check_Not_Incomplete_Type;
- Set_Etype (N, RTE (RE_Address));
+
+ --  If the prefix is a dereference of a value whose associated access
+ --  type has been specified with aspect Designated_Storage_Model, then
+ --  use the associated Storage_Model_Type's address type as the type
+ --  of the attribute. Otherwise we use System.Address as usual. This
+ --  isn't normally legit for a predefined attribute, but this is for
+ --  our own extension to addressing and currently requires extensions
+ --  to be enabled (such as with -gnatX0).
+
+ declare
+Prefix_Obj : constant Node_Id := Get_Referenced_Object (P);
+Addr_Type  : Entity_Id:= RTE (RE_Address);
+ begin
+if Nkind (Prefix_Obj) = N_Explicit_Dereference then
+   declare
+  P_Type : constant Entity_Id := Etype (Prefix (Prefix_Obj));
+
+  use Storage_Model_Support;
+   begin
+  if Has_Designated_Storage_Model_Aspect (P_Type) then
+ Addr_Type := Storage_Model_Address_Type
+(Storage_Model_Object (P_Type));
+  end if;
+   end;
+end if;
+
+Set_Etype (N, Addr_Type);
+ end;
 
   --
   -- Address_Size --
-- 
2.34.1



[COMMITTED] ada: Fix misphrasing in comment

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Ronan Desplanques 

gcc/ada/

* lib-xref.adb (Generate_Reference): Fix misphrasing in comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/lib-xref.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/lib-xref.adb b/gcc/ada/lib-xref.adb
index e5dcc85523b..182ea2fa9ec 100644
--- a/gcc/ada/lib-xref.adb
+++ b/gcc/ada/lib-xref.adb
@@ -776,7 +776,7 @@ package body Lib.Xref is
Set_Referenced_As_LHS (E, False);
 
 --  For OUT parameter not covered by the above cases, we simply
---  regard it as a non-reference.
+--  regard it as a reference.
 
 else
Set_Referenced_As_Out_Parameter (E);
-- 
2.34.1



[COMMITTED] ada: Fix minor issues in reference manual

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Ronan Desplanques 

This patch fixes a few minor issues in the GNAT library section of
the reference manual.

gcc/ada/

* doc/gnat_rm/the_gnat_library.rst: Fix minor issues.
* gnat_rm.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/doc/gnat_rm/the_gnat_library.rst | 40 +++---
 gcc/ada/gnat_rm.texi | 66 
 2 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/gcc/ada/doc/gnat_rm/the_gnat_library.rst 
b/gcc/ada/doc/gnat_rm/the_gnat_library.rst
index d791f8120ac..3aae70a4409 100644
--- a/gcc/ada/doc/gnat_rm/the_gnat_library.rst
+++ b/gcc/ada/doc/gnat_rm/the_gnat_library.rst
@@ -47,7 +47,7 @@ of GNAT, and will generate a warning message.
 This child of ``Ada.Characters``
 provides a set of definitions corresponding to those in the
 RM-defined package ``Ada.Characters.Latin_1`` but with the
-few modifications required for ``Latin-9``
+few modifications required for ``Latin-9``.
 The provision of such a package
 is specifically authorized by the Ada Reference Manual
 (RM A.3.3(27)).
@@ -69,12 +69,12 @@ instead of ``Character``.  The provision of such a package
 is specifically authorized by the Ada Reference Manual
 (RM A.3.3(27)).
 
-.. _`Ada.Characters.Wide_Latin_9_(a-cwila1.ads)`:
+.. _`Ada.Characters.Wide_Latin_9_(a-cwila9.ads)`:
 
-``Ada.Characters.Wide_Latin_9`` (:file:`a-cwila1.ads`)
+``Ada.Characters.Wide_Latin_9`` (:file:`a-cwila9.ads`)
 ==
 
-.. index:: Ada.Characters.Wide_Latin_9 (a-cwila1.ads)
+.. index:: Ada.Characters.Wide_Latin_9 (a-cwila9.ads)
 
 .. index:: Latin_9 constants for Wide_Character
 
@@ -159,8 +159,8 @@ where this concept makes sense.
 This child of ``Ada.Command_Line``
 provides a mechanism for logically removing
 arguments from the argument list.  Once removed, an argument is not visible
-to further calls on the subprograms in ``Ada.Command_Line`` will not
-see the removed argument.
+to further calls to the subprograms in ``Ada.Command_Line``. These calls
+will not see the removed argument.
 
 .. _`Ada.Command_Line.Response_File_(a-clrefi.ads)`:
 
@@ -833,7 +833,7 @@ obtaining information about exceptions provided by Ada 83 
compilers.
 
 .. index:: Memory corruption debugging
 
-Provide a debugging storage pools that helps tracking memory corruption
+Provides a debugging storage pools that helps tracking memory corruption
 problems.
 See ``The GNAT Debug_Pool Facility`` section in the :title:`GNAT User's Guide`.
 
@@ -1043,7 +1043,7 @@ a message from a subprogram in a pure package, since the
 necessary types and subprograms are in ``Ada.Exceptions``
 which is not a pure unit. ``GNAT.Exceptions`` provides a
 facility for getting around this limitation for a few
-predefined exceptions, and for example allow raising
+predefined exceptions, and for example allows raising
 ``Constraint_Error`` with a message from a pure subprogram.
 
 .. _`GNAT.Expect_(g-expect.ads)`:
@@ -1098,7 +1098,7 @@ in this package can be used to reestablish the required 
mode.
 .. index:: Formatted String
 
 Provides support for C/C++ printf() formatted strings. The format is
-copied from the printf() routine and should therefore gives identical
+copied from the printf() routine and should therefore give identical
 output. Some generic routines are provided to be able to use types
 derived from Integer, Float or enumerations as values for the
 formatted string.
@@ -1314,7 +1314,7 @@ Provides a generator of static minimal perfect hash 
functions. No
 collisions occur and each item can be retrieved from the table in one
 probe (perfect property). The hash table size corresponds to the exact
 size of the key set and no larger (minimal property). The key set has to
-be know in advance (static property). The hash functions are also order
+be known in advance (static property). The hash functions are also order
 preserving. If w2 is inserted after w1 in the generator, their
 hashcode are in the same order. These hashing functions are very
 convenient for use with realtime applications.
@@ -1399,7 +1399,7 @@ this interface usable for large files or socket streams.
 
 .. index:: Secondary Stack Info
 
-Provide the capability to query the high water mark of the current task's
+Provides the capability to query the high water mark of the current task's
 secondary stack.
 
 .. _`GNAT.Semaphores_(g-semaph.ads)`:
@@ -1514,7 +1514,7 @@ targets.
 A high level and portable interface to develop sockets based applications.
 This package is based on the sockets thin binding found in
 ``GNAT.Sockets.Thin``. Currently ``GNAT.Sockets`` is implemented
-on all native GNAT ports and on VxWorks cross prots.  It is not implemented for
+on all native GNAT ports and on VxWorks cross ports.  It is not implemented for
 the LynxOS cross port.
 
 .. _`GNAT.Source_Info_(g-souinf.ads)`:
@@ -1781,12 +1781,12 @@ in various debugging situations.
 
 .. index:: Trace back 

[COMMITTED] ada: Minor updates to gnat/doc configuration

2022-12-01 Thread Marc Poulhiès via Gcc-patches
From: Josue Nava Bello 

Minor updates to conf.py (comments, indentation)

gcc/ada/

* doc/share/conf.py: minor updates

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/doc/share/conf.py | 100 +++---
 1 file changed, 72 insertions(+), 28 deletions(-)

diff --git a/gcc/ada/doc/share/conf.py b/gcc/ada/doc/share/conf.py
index 9ab80e7759e..48f1a96a309 100644
--- a/gcc/ada/doc/share/conf.py
+++ b/gcc/ada/doc/share/conf.py
@@ -2,6 +2,9 @@
 # Style_Check:Python_Fragment (meaning no pyflakes check)
 #
 # GNAT build configuration file
+# 
+# This file defines the configuration for all files created
+# by Sphinx. In this case, pdf (using latex) and html
 
 import sys
 import os
@@ -13,16 +16,12 @@ sys.path.append('.')
 import ada_pygments
 import latex_elements
 
-# Some configuration values for the various documentation handled by
-# this conf.py
-
+# Define list of documents to be built and their title
 DOCS = {
-'gnat_rm': {
-'title': 'GNAT Reference Manual'},
-'gnat_ugn': {
-'title': 'GNAT User\'s Guide for Native Platforms'},
-'gnat-style': {
-'title': 'GNAT Coding Style: A Guide for GNAT Developers'}}
+"gnat_rm": {"title": "GNAT Reference Manual"},
+"gnat_ugn": {"title": "GNAT User's Guide for Native Platforms"},
+"gnat-style": {"title": "GNAT Coding Style: A Guide for GNAT Developers"},
+}
 
 # Then retrieve the source directory
 root_source_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
@@ -30,14 +29,17 @@ gnatvsn_spec = os.path.join(root_source_dir, '..', 
'gnatvsn.ads')
 basever = os.path.join(root_source_dir, '..', '..', 'BASE-VER')
 texi_fsf = True  # Set to False when FSF doc is switched to sphinx by default
 
+# get vsn specs
 with open(gnatvsn_spec, 'r') as fd:
 gnatvsn_content = fd.read()
 
 
+# read copyright test from .rst file (used also for sanity-checking)
 def get_copyright():
 return '2008-%s, Free Software Foundation' % time.strftime('%Y')
 
 
+# get environment gnat version (used also for sanity-checking)
 def get_gnat_version():
 m = re.search(r'Gnat_Static_Version_String : ' +
   r'constant String := "([^\(\)]+)\(.*\)?";',
@@ -58,6 +60,7 @@ def get_gnat_version():
 sys.exit(1)
 
 
+# get gnat build type from runtime
 def get_gnat_build_type():
 m = re.search(r'Build_Type : constant Gnat_Build_Type := (.+);',
   gnatvsn_content)
@@ -70,7 +73,35 @@ def get_gnat_build_type():
 sys.exit(1)
 
 
+# Enable Sphinx extensions
+# Note that these are active for all files to be build (see DOCS list)
+extensions = ['sphinx_rtd_theme']
+
+# todo interprets ".. todo::" commands in .rst files
+# mathjax enables math equations to render correctly
+extensions += ['sphinx.ext.todo', 'sphinx.ext.mathjax']
+todo_include_todos = True
+
+# define templates source folder
+templates_path = ['_templates']
+# define the types of files to read as source for documents
+source_suffix = '.rst'
+
+# enable figure, object, table numeration on documents
+print('enabling table, code-block and figure numeration')
+numfig = True
+numfig_format = {
+'figure': 'figure %s',
+'table': 'table %s',
+'code-block': 'listing %s',
+'section': 'section %s',
+}
+print('done')
+
+
+# Start building the documents
 # First retrieve the name of the documentation we are building
+print('checking doc name... ')
 doc_name = os.environ.get('DOC_NAME', None)
 if doc_name is None:
 print('DOC_NAME environment variable should be set')
@@ -79,7 +110,7 @@ if doc_name is None:
 if doc_name not in DOCS:
 print('%s is not a valid documentation name' % doc_name)
 sys.exit(1)
-
+print('found... ' , doc_name)
 
 # Exclude sources that are not part of the current documentation
 exclude_patterns = []
@@ -88,16 +119,13 @@ for d in os.listdir(root_source_dir):
 exclude_patterns.append(d)
 print('ignoring %s' % d)
 
+# Special condition for gnat_rm
 if doc_name == 'gnat_rm':
 exclude_patterns.append('share/gnat_project_manager.rst')
 print('ignoring share/gnat_project_manager.rst')
 
-extensions = ['sphinx_rtd_theme']
-templates_path = ['_templates']
-source_suffix = '.rst'
-master_doc = doc_name
-
 # General information about the project.
+master_doc = doc_name
 project = DOCS[doc_name]['title']
 
 copyright = get_copyright()
@@ -107,42 +135,58 @@ release = get_gnat_version()
 
 pygments_style = None
 tags.add(get_gnat_build_type())
+
+# Define figures to be included
 html_theme = 'sphinx_rtd_theme'
 if os.path.isfile('adacore_transparent.png'):
+# split html and pdf logos to avoid 'same name' error in sphinx <5.2+
 html_logo = 'adacore_transparent.png'
+latex_logo = 'adacore_transparent.png'
 if os.path.isfile('favicon.ico'):
 html_favicon = 'favicon.ico'
 
 html_static_path = ['_static']
 
+# Use gnat.sty for bulding documents
 latex_additional_files = ['gnat.sty']
 
+# Add 

[PATCH] [committed] arm: Fix MVE testsuite fallouts

2022-12-01 Thread Christophe Lyon via Gcc-patches
After the recent patches to improve / tidy up MVE tests and patterns,
a few more tests need to be updated (replacing spaces with tabs).

Committed as obvious.
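
To illustrate why the space-to-tab change matters, here is a small Python
sketch. It assumes Python's re module treats these particular patterns the
same way the Tcl regexps in the dg-final directives do, and the assembler
line (including its register numbers) is made up for the example.

```python
import re

# Pattern as written in the updated directive: a tab follows the mnemonic.
NEW = re.compile(r"\tvcmp.i8\teq, q[0-9]+, q[0-9]+\n")
# Pattern as written before the update: two spaces follow the mnemonic.
OLD = re.compile(r"\tvcmp.i8  eq, q[0-9]+, q[0-9]+\n")

# Hypothetical compiler output with a tab between mnemonic and operands.
asm = "\tvcmp.i8\teq, q3, q2\n"


def count(pattern, text):
    """Number of non-overlapping matches, like scan-assembler-times counts."""
    return len(pattern.findall(text))


print(count(NEW, asm), count(OLD, asm))
```

With tab-separated output only the updated pattern matches, which is why the
old expected counts could no longer be met.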

gcc/testsuite/ChangeLog:

* gcc.target/arm/simd/mve-compare-1.c: Update.
* gcc.target/arm/simd/mve-compare-scalar-1.c: Update.
* gcc.target/arm/simd/mve-vabs.c: Update.
* gcc.target/arm/simd/mve-vadd-1.c: Update.
* gcc.target/arm/simd/mve-vadd-scalar-1.c: Update.
* gcc.target/arm/simd/mve-vcmp.c: Update.
* gcc.target/arm/simd/pr101325.c: Update.
---
 .../gcc.target/arm/simd/mve-compare-1.c   | 48 +--
 .../arm/simd/mve-compare-scalar-1.c   | 48 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vabs.c  |  2 +-
 .../gcc.target/arm/simd/mve-vadd-1.c  | 10 ++--
 .../gcc.target/arm/simd/mve-vadd-scalar-1.c   | 10 ++--
 gcc/testsuite/gcc.target/arm/simd/mve-vcmp.c  | 16 +++
 gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  4 +-
 7 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-compare-1.c 
b/gcc/testsuite/gcc.target/arm/simd/mve-compare-1.c
index 029c931f47f..887f8dbddd9 100644
--- a/gcc/testsuite/gcc.target/arm/simd/mve-compare-1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/mve-compare-1.c
@@ -50,31 +50,31 @@ TEST_TYPE (vs32, __INT32_TYPE__, COMPARE_REG_AND_ZERO, 16)
 TEST_TYPE (vu32, __UINT32_TYPE__, COMPARE_REG, 16)
 
 /* { 8 bits } x { eq, ne, lt, le, gt, ge, hi, cs }.
-/* { dg-final { scan-assembler-times {\tvcmp.i8  eq, q[0-9]+, q[0-9]+\n} 4 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.i8  ne, q[0-9]+, q[0-9]+\n} 4 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.s8  lt, q[0-9]+, q[0-9]+\n} 2 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.s8  le, q[0-9]+, q[0-9]+\n} 2 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.s8  gt, q[0-9]+, q[0-9]+\n} 2 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.s8  ge, q[0-9]+, q[0-9]+\n} 2 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.u8  hi, q[0-9]+, q[0-9]+\n} 2 } } 
*/
-/* { dg-final { scan-assembler-times {\tvcmp.u8  cs, q[0-9]+, q[0-9]+\n} 2 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.i8\teq, q[0-9]+, q[0-9]+\n} 4 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.i8\tne, q[0-9]+, q[0-9]+\n} 4 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.s8\tlt, q[0-9]+, q[0-9]+\n} 2 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.s8\tle, q[0-9]+, q[0-9]+\n} 2 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.s8\tgt, q[0-9]+, q[0-9]+\n} 2 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.s8\tge, q[0-9]+, q[0-9]+\n} 2 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.u8\thi, q[0-9]+, q[0-9]+\n} 2 } } 
*/
+/* { dg-final { scan-assembler-times {\tvcmp.u8\tcs, q[0-9]+, q[0-9]+\n} 2 } } 
*/
 
 /* { 16 bits } x { eq, ne, lt, le, gt, ge, hi, cs }.
-/* { dg-final { scan-assembler-times {\tvcmp.i16  eq, q[0-9]+, q[0-9]+\n} 4 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.i16  ne, q[0-9]+, q[0-9]+\n} 4 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s16  lt, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s16  le, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s16  gt, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s16  ge, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.u16  hi, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.u16  cs, q[0-9]+, q[0-9]+\n} 2 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.i16\teq, q[0-9]+, q[0-9]+\n} 4 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.i16\tne, q[0-9]+, q[0-9]+\n} 4 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.s16\tlt, q[0-9]+, q[0-9]+\n} 2 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.s16\tle, q[0-9]+, q[0-9]+\n} 2 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.s16\tgt, q[0-9]+, q[0-9]+\n} 2 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.s16\tge, q[0-9]+, q[0-9]+\n} 2 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.u16\thi, q[0-9]+, q[0-9]+\n} 2 } 
} */
+/* { dg-final { scan-assembler-times {\tvcmp.u16\tcs, q[0-9]+, q[0-9]+\n} 2 } 
} */
 
 /* { 32 bits } x { eq, ne, lt, le, gt, ge, hi, cs }.
-/* { dg-final { scan-assembler-times {\tvcmp.i32  eq, q[0-9]+, q[0-9]+\n} 4 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.i32  ne, q[0-9]+, q[0-9]+\n} 4 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s32  lt, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s32  le, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s32  gt, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.s32  ge, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.u32  hi, q[0-9]+, q[0-9]+\n} 2 } 
} */
-/* { dg-final { scan-assembler-times {\tvcmp.u32  cs, q[0-9]+, q[0-9]+\n} 2 } 
} */

Re: [PATCH 3/3] vect: inbranch SIMD clones

2022-12-01 Thread Andrew Stubbs

On 30/11/2022 15:37, Jakub Jelinek wrote:

On Wed, Nov 30, 2022 at 03:17:30PM +, Andrew Stubbs wrote:

--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16.c
@@ -0,0 +1,89 @@
+/* { dg-require-effective-target vect_simd_clones } */
+/* { dg-additional-options "-fopenmp-simd -fdump-tree-optimized" } */
+/* { dg-additional-options "-mavx" { target avx_runtime } } */

...

+/* Ensure the the in-branch simd clones are used on targets that support
+   them.  These counts include all call and definitions.  */
+
+/* { dg-skip-if "" { x86_64-*-* } { "-flto" } { "" } } */


Maybe better add -ffat-lto-objects to dg-additional-options and drop
the dg-skip-if (if it works with that, for all similar tests)?


The tests are already run with -ffat-lto-objects and the test still 
fails (pattern found zero times). I don't know why.


Aside from that, I've made all the other changes you requested.

OK now?

Andrew

vect: inbranch SIMD clones

There has been support for generating "inbranch" SIMD clones for a long time,
but nothing actually uses them (as far as I can see).

This patch adds support for a sub-set of possible cases (those using
mask_mode == VOIDmode).  The other cases fail to vectorize, just as before,
so there should be no regressions.

The sub-set of support should cover all cases needed by amdgcn, at present.
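
The idea can be sketched in plain Python. This is only a conceptual model
under stated assumptions, not what the vectorizer emits: the scalar function
f, the list-of-booleans mask, and the vectorization factor of 4 are all
invented for illustration.

```python
def f(x):
    # Hypothetical scalar function that has an "inbranch" SIMD clone.
    return x * x + 1


def f_simd_inbranch(xs, mask):
    # Model of an inbranch clone: takes a mask and computes active lanes only.
    # Inactive lanes are represented as None here.
    return [f(x) if m else None for x, m in zip(xs, mask)]


def loop_scalar(a, b):
    # Original loop: the call to f sits under a condition.
    for i in range(len(a)):
        if b[i] > 0:
            a[i] = f(b[i])
    return a


def loop_vectorized(a, b, vf=4):
    # After if-conversion the condition becomes a mask passed to the clone.
    for i in range(0, len(a), vf):
        xs = b[i:i + vf]
        mask = [x > 0 for x in xs]
        res = f_simd_inbranch(xs, mask)
        for lane, (r, m) in enumerate(zip(res, mask)):
            if m:  # masked store: only active lanes are written back
                a[i + lane] = r
    return a


b = [1, -2, 3, 0, 5, -6]
print(loop_scalar([0] * 6, b) == loop_vectorized([0] * 6, b))
```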

gcc/ChangeLog:

* internal-fn.cc (expand_MASK_CALL): New.
* internal-fn.def (MASK_CALL): New.
* internal-fn.h (expand_MASK_CALL): New prototype.
* omp-simd-clone.cc (simd_clone_adjust_argument_types): Set vector_type
for mask arguments also.
* tree-if-conv.cc: Include cgraph.h.
(if_convertible_stmt_p): Do if conversions for calls to SIMD calls.
(predicate_statements): Convert functions to IFN_MASK_CALL.
* tree-vect-loop.cc (vect_get_datarefs_in_loop): Recognise
IFN_MASK_CALL as a SIMD function call.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Handle
IFN_MASK_CALL as an inbranch SIMD function call.
Generate the mask vector arguments.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-16.c: New test.
* gcc.dg/vect/vect-simd-clone-16b.c: New test.
* gcc.dg/vect/vect-simd-clone-16c.c: New test.
* gcc.dg/vect/vect-simd-clone-16d.c: New test.
* gcc.dg/vect/vect-simd-clone-16e.c: New test.
* gcc.dg/vect/vect-simd-clone-16f.c: New test.
* gcc.dg/vect/vect-simd-clone-17.c: New test.
* gcc.dg/vect/vect-simd-clone-17b.c: New test.
* gcc.dg/vect/vect-simd-clone-17c.c: New test.
* gcc.dg/vect/vect-simd-clone-17d.c: New test.
* gcc.dg/vect/vect-simd-clone-17e.c: New test.
* gcc.dg/vect/vect-simd-clone-17f.c: New test.
* gcc.dg/vect/vect-simd-clone-18.c: New test.
* gcc.dg/vect/vect-simd-clone-18b.c: New test.
* gcc.dg/vect/vect-simd-clone-18c.c: New test.
* gcc.dg/vect/vect-simd-clone-18d.c: New test.
* gcc.dg/vect/vect-simd-clone-18e.c: New test.
* gcc.dg/vect/vect-simd-clone-18f.c: New test.

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 9471f543191..d9e11bfc62a 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4527,3 +4527,10 @@ void
 expand_ASSUME (internal_fn, gcall *)
 {
 }
+
+void
+expand_MASK_CALL (internal_fn, gcall *)
+{
+  /* This IFN should only exist between ifcvt and vect passes.  */
+  gcc_unreachable ();
+}
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 61516dab66d..301c3780659 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -466,6 +466,9 @@ DEF_INTERNAL_FN (TRAP, ECF_CONST | ECF_LEAF | ECF_NORETURN
 DEF_INTERNAL_FN (ASSUME, ECF_CONST | ECF_LEAF | ECF_NOTHROW
 | ECF_LOOPING_CONST_OR_PURE, NULL)
 
+/* For if-conversion of inbranch SIMD clones.  */
+DEF_INTERNAL_FN (MASK_CALL, ECF_NOVOPS, NULL)
+
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_FLT_FLOATN_FN
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 21b1ce43df6..ced92c041bb 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -244,6 +244,7 @@ extern void expand_SHUFFLEVECTOR (internal_fn, gcall *);
 extern void expand_SPACESHIP (internal_fn, gcall *);
 extern void expand_TRAP (internal_fn, gcall *);
 extern void expand_ASSUME (internal_fn, gcall *);
+extern void expand_MASK_CALL (internal_fn, gcall *);
 
 extern bool vectorized_internal_fn_supported_p (internal_fn, tree);
 
diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc
index 21d69aa8747..afb7d99747b 100644
--- a/gcc/omp-simd-clone.cc
+++ b/gcc/omp-simd-clone.cc
@@ -937,6 +937,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
}
   sc->args[i].orig_type = base_type;
   sc->args[i].arg_type = SIMD_CLONE_ARG_TYPE_MASK;
+  sc->args[i].vector_type = adj.type;
 }
 
   if (node->definition)
diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16.c 

Re: [PATCH][OG12] amdgcn: Support AMD-specific 'isa' and 'arch' traits in OpenMP context selectors

2022-12-01 Thread Andrew Stubbs

On 01/12/2022 11:10, Paul-Antoine Arras wrote:

+  if (TARGET_FIJI)                                \
+   builtin_define ("__FIJI__");                   \
+  else if (TARGET_VEGA10)                         \
+   builtin_define ("__VEGA10__");                 \
+  else if (TARGET_VEGA20)                         \
+   builtin_define ("__VEGA20__");                 \
+  else if (TARGET_GFX908)                         \
+   builtin_define ("__GFX908__");                 \
+  else if (TARGET_GFX90a)                         \
+   builtin_define ("__GFX90a__");                 \
+  } while (0)
 


I don't think it makes sense to say __VEGA10__ when the user asked for 
-march=gfx900.


This whole naming thing is a bit of a mess already, so I think we'd do 
better to either keep the same names throughout or match what LLVM does 
(since it got to these first).


Please use "__gfx900__" etc. (lower case).

I'm half tempted to do a global search and replace on the internal 
names, but since they're not externally visible that would probably just 
be making merge conflicts for the sake of it.


Thanks

Andrew

P.S. If you want to split the patch into the GCN bits and the bits that 
depend on metadirectives then we can apply the first part to mainline 
right away.


Re: [PATCH 2/3]rs6000: NFC use sext_hwi to replace ((v&0xf..f)^0x80..0) - 0x80..0

2022-12-01 Thread guojiufu via Gcc-patches

On 2022-12-01 15:10, Jiufu Guo via Gcc-patches wrote:

Hi Kewen,

在 12/1/22 2:11 PM, Kewen.Lin 写道:

on 2022/12/1 13:35, Jiufu Guo wrote:

Hi Kewen,

Thanks for your quick and insightful review!

在 12/1/22 1:17 PM, Kewen.Lin 写道:

Hi Jeff,

on 2022/12/1 09:36, Jiufu Guo wrote:

Hi,

This patch just uses sext_hwi to replace the expression like:
((value & 0xf..f) ^ 0x80..0) - 0x80..0 for rs6000.cc and rs6000.md.
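
For readers unfamiliar with the idiom, here is a minimal Python sketch of the
arithmetic. It is not GCC code: the function names are invented, and the
second function is just an independent way to compute the same result for
comparison.

```python
def sext_idiom(value: int, prec: int) -> int:
    """Sign-extend the low PREC bits using ((v & 0xf..f) ^ 0x80..0) - 0x80..0."""
    mask = (1 << prec) - 1      # 0xf..f with PREC one-bits
    sign = 1 << (prec - 1)      # 0x80..0, the sign bit of a PREC-bit value
    return ((value & mask) ^ sign) - sign


def sext_reference(value: int, prec: int) -> int:
    """Same result, written as an explicit test of the sign bit."""
    low = value & ((1 << prec) - 1)
    return low - (1 << prec) if low & (1 << (prec - 1)) else low


for v in (0x7FFF, 0x8000, 0xFFFF, 0x12345):
    print(hex(v), sext_idiom(v, 16), sext_reference(v, 16))
```

Both functions agree on every input, which is the equivalence a helper that
sign-extends the low bits of a value relies on.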

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk?


You didn't say it clearly but I guessed you have grepped in the whole
config/rs6000 directory, right?  I noticed there are still two places
using this kind of expression in function constant_generates_xxspltiw,
but I assumed it's intentional as their types are not HOST_WIDE_INT.

gcc/config/rs6000/rs6000.cc:  short sign_h_word = ((h_word & 0x) ^ 0x8000) - 0x8000;
gcc/config/rs6000/rs6000.cc:  int sign_word = ((word & 0x) ^ 0x8000) - 0x8000;

If so, could you state it clearly in commit log like "with type
signed/unsigned HOST_WIDE_INT" or similar?


Good question!

And as you said sext_hwi is more for "signed/unsigned HOST_WIDE_INT".
For these two places, it seems sext_hwi is not needed actually!
And I did see why these expressions are used, may be just an assignment
is ok.


ah, I see.  I agree using the assignment is quite enough.  Could you
please also simplify them together?  Since they are with the form
"((value & 0xf..f) ^ 0x80..0) - 0x80..0" too, and can be refactored
in a better way.  Thanks!


Sure, I believe just "short sign_h_word = vsx_const->half_words[0];"
should be correct :-), and included in the updated patch.

The updated patch is attached; bootstrap is ongoing.


Bootstrap and regtest pass on ppc64{,le}.

BR,
Jeff (Jiufu)



BR,
Jeff (Jiufu)



BR,
Kewen



Re: Java front-end and library patches.

2022-12-01 Thread Thomas Schwinge
Hi!

On 2022-11-30T23:18:06+1100, Zopolis0 via Gcc-patches wrote:
> However, patches 14-19 do need an explanation, as proven by multiple
> reviews simply asking why I had made them. I'll send follow up
> messages to those.

Well, (at least for some of them) re-work rather than explanations.  ;-)


Anyway:

>> Why is it now considered useful to add this front end back?
>
> The way I see it, the Java front end was removed due to a lack of
> maintenance and improvement. To put it simply, I am going to maintain
> and improve it. That is the difference between now and then. There is
> more nuance, but that is the gist of it.

Ha, nice!  As it happens, a few months ago, I started the same task...
(... but with very low priority, so have not yet gotten very far...)


>> How has the series been validated?
>
> I'm not exactly sure what you mean by this.

Testing; the integrated GCC/Java test suites, as well as possibly any
external test suites.  To make sure that we're (a) not regressing
anything in non-Java GCC, and (b) that we're maintaining the
functionality level of the "old" GCC/Java.  That said, I found that the
integrated GCC/Java test suites are not exactly testing all that should
be tested...

My approach has been to establish an "old" baseline, and then gradually
rebase this onto specific GCC master branch commits, and catch up with
tree-wide changes along the way.  I've not gotten all too far yet; made a
stop to first add more testing to the baseline, so that I can be
reasonably sure that GCC/Java doesn't regress in functionality.  (It's
been sitting in that state for a number of months now...)  It may be a
somewhat more painful approach in comparison to the "all in one go"
approach that you seem to have attempted (?), but it seemed more
appropriate for me, as I'm only able to spend occasional small blocks of
time on this.


>> Would you propose to maintain the front end and libraries in future?
>
> I have big plans for the library, and plan to maintain that long into
> the future. In regards to the actual front-end code, I will do what I
> can to make sure it remains at its previous level of function, but
> that is about it. I dislike working with the front end code, so I will
> fix it, but I will not make sweeping changes to it.

I might thus be interested in joining that effort (I'm more interested in
the front end and GCC proper parts) -- but, again, this will be
a low-priority project for me.


Grüße
 Thomas


> Just a brief overview of my plans for the frontend and library-- When
> GCJ was first introduced it was "the free Java implementation". It was
> trying to offer a bytecode compiler, a machine code compiler and a
> runtime library. Clearly, this was too much, as it borrowed another
> bytecode compiler and runtime library, and even then the runtime
> library fell into disarray.
>
> Now, we have many pieces of the puzzle. We have a bounty of free Java
> bytecode compilers, and a free runtime library. The only thing missing
> is a free machine code compiler, which GCJ was and is. I plan to
> replace Classpath with the OpenJDK, and double down on the machine
> code aspect of GCJ, dropping bytecode and interpreted support.
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] vect: Fold LEN_{LOAD, STORE} if it's for the whole vector [PR107412]

2022-12-01 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin"  writes:
> Hi Richard,
>
> on 2022/11/24 17:24, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> Hi,
>>>
>>> As the test case in PR107412 shows, we can fold IFN .LEN_{LOAD,
>>> STORE} into normal vector load/store if the given length is known
>>> to be equal to the length of the whole vector.  It would help to
>>> improve overall cycles as normally the latency of vector access
>>> with length in bytes is bigger than normal vector access, and it
>>> also saves the preparation for length if constant length can not
>>> be encoded into instruction (such as on power).
>>>
>>> Bootstrapped and regtested on x86_64-redhat-linux,
>>> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
>>>
>>> Is it ok for trunk?
>>>
>>> BR,
>>> Kewen
>>> -
>>> PR tree-optimization/107412
>>>
>>> gcc/ChangeLog:
>>>
>>> * gimple-fold.cc (gimple_fold_mask_load_store_mem_ref): Rename to ...
>>> (gimple_fold_partial_load_store_mem_ref): ... this, add one parameter
>>> mask_p indicating it's for mask or length, and add some handlings for
>>> IFN LEN_{LOAD,STORE}.
>>> (gimple_fold_mask_load): Rename to ...
>>> (gimple_fold_partial_load): ... this, add one parameter mask_p.
>>> (gimple_fold_mask_store): Rename to ...
>>> (gimple_fold_partial_store): ... this, add one parameter mask_p.
>>> (gimple_fold_call): Add the handlings for IFN LEN_{LOAD,STORE},
>>> and adjust calls on gimple_fold_mask_load_store_mem_ref to
>>> gimple_fold_partial_load_store_mem_ref.
>> 
>> Sorry to reply so late (still catching up on email), but:
>> 
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/pr107412.c: New test.
>>> * gcc.target/powerpc/p9-vec-length-epil-8.c: Adjust scan times for
>>> folded LEN_LOAD.
>>> ---
>>>  gcc/gimple-fold.cc| 57 ++-
>>>  .../gcc.target/powerpc/p9-vec-length-epil-8.c |  2 +-
>>>  gcc/testsuite/gcc.target/powerpc/pr107412.c   | 19 +++
>>>  3 files changed, 64 insertions(+), 14 deletions(-)
>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107412.c
>>>
>>> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
>>> index a1704784bc9..e3a087defa6 100644
>>> --- a/gcc/gimple-fold.cc
>>> +++ b/gcc/gimple-fold.cc
>>> @@ -5370,19 +5370,39 @@ arith_overflowed_p (enum tree_code code, const_tree 
>>> type,
>>>return wi::min_precision (wres, sign) > TYPE_PRECISION (type);
>>>  }
>>>
>>> -/* If IFN_MASK_LOAD/STORE call CALL is unconditional, return a MEM_REF
>>> +/* If IFN_{MASK,LEN}_LOAD/STORE call CALL is unconditional, return a 
>>> MEM_REF
>>> for the memory it references, otherwise return null.  VECTYPE is the
>>> -   type of the memory vector.  */
>>> +   type of the memory vector.  MASK_P indicates it's for MASK if true,
>>> +   otherwise it's for LEN.  */
>>>
>>>  static tree
>>> -gimple_fold_mask_load_store_mem_ref (gcall *call, tree vectype)
>>> +gimple_fold_partial_load_store_mem_ref (gcall *call, tree vectype, bool 
>>> mask_p)
>>>  {
>>>tree ptr = gimple_call_arg (call, 0);
>>>tree alias_align = gimple_call_arg (call, 1);
>>> -  tree mask = gimple_call_arg (call, 2);
>>> -  if (!tree_fits_uhwi_p (alias_align) || !integer_all_onesp (mask))
>>> +  if (!tree_fits_uhwi_p (alias_align))
>>>  return NULL_TREE;
>>>
>>> +  if (mask_p)
>>> +{
>>> +  tree mask = gimple_call_arg (call, 2);
>>> +  if (!integer_all_onesp (mask))
>>> +   return NULL_TREE;
>>> +} else {
>> 
>> Minor nit: }, else, and { should be on separate lines.  But the thing
>> I actually wanted to say was...
>
> Thanks for catching, I must have forgotten to reformat these lines.
>
>> 
>>> +  tree basic_len = gimple_call_arg (call, 2);
>>> +  if (!tree_fits_uhwi_p (basic_len))
>>> +   return NULL_TREE;
>>> +  unsigned int nargs = gimple_call_num_args (call);
>>> +  tree bias = gimple_call_arg (call, nargs - 1);
>>> +  gcc_assert (tree_fits_uhwi_p (bias));
>>> +  tree biased_len = int_const_binop (MINUS_EXPR, basic_len, bias);
>>> +  unsigned int len = tree_to_uhwi (biased_len);
>>> +  unsigned int vect_len
>>> +   = GET_MODE_SIZE (TYPE_MODE (vectype)).to_constant ();
>>> +  if (vect_len != len)
>>> +   return NULL_TREE;
>> 
>> Using "unsigned int" truncates the value.  I realise that's probably
>> safe in this context, since large values have undefined behaviour.
>> But it still seems better to use an untruncated type, so that it
>> looks less like an oversight.  (I think this is one case where "auto"
>> can help, since it gets the type right automatically.)
>> 
>> It would also be better to avoid the to_constant, since we haven't
>> proven is_constant.  How about:
>> 
>>   tree basic_len = gimple_call_arg (call, 2);
>>   if (!poly_int_tree_p (basic_len))
>>  return NULL_TREE;
>>   unsigned int nargs = gimple_call_num_args (call);
>>   tree bias = gimple_call_arg (call, nargs - 1);
>>   gcc_assert (TREE_CODE (bias) == INTEGER_CST);
>>   
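
To illustrate the truncation concern raised in the review, here is a minimal standalone C sketch. It is not GCC code: `len_matches_truncated` and `len_matches_wide` are hypothetical helpers, and `uint32_t`/`uint64_t` merely stand in for `unsigned int` and an untruncated wide type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compares after narrowing to 32 bits, mirroring
   the effect of storing a wide value in an "unsigned int".  */
static bool
len_matches_truncated (uint64_t len, uint64_t vect_len)
{
  uint32_t narrow_len = (uint32_t) len;           /* high bits are dropped */
  uint32_t narrow_vect_len = (uint32_t) vect_len;
  return narrow_len == narrow_vect_len;
}

/* Hypothetical helper: compares at full width, so no bits are lost.  */
static bool
len_matches_wide (uint64_t len, uint64_t vect_len)
{
  return len == vect_len;
}
```

With a length of 2^32 + 16 and a vector length of 16, the truncated comparison reports a match while the full-width one does not, which is why the narrower type can look like an oversight even when it is safe in context.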

Re: [PATCH][OG12] amdgcn: Support AMD-specific 'isa' and 'arch' traits in OpenMP context selectors

2022-12-01 Thread Paul-Antoine Arras

Hi Kwok,

On 30/11/2022 19:50, Kwok Cheung Yeung wrote:

> Hello PA,
>
>> --- libgomp/config/gcn/selector.c
>> +++ libgomp/config/gcn/selector.c
>> @@ -36,7 +36,7 @@ GOMP_evaluate_current_device (const char *kind, const char *arch,
>>    if (kind && strcmp (kind, "gpu") != 0)
>>      return false;
>>
>> -  if (arch && strcmp (arch, "gcn") != 0)
>> +  if (arch && (strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0))
>>      return false;
>
> The logic here looks wrong to me - surely it should return false if arch
> is not 'gcn' AND it is not 'amdgcn'?

Sure. Fixed in revised patch.
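
For reference, the difference between the two conditions can be sketched in standalone C. The revised patch itself is not shown in this message, so `arch_rejected_buggy` and `arch_rejected_fixed` are hypothetical helpers that only illustrate the logic under discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if ARCH should be rejected.  With ||, any
   non-NULL string differs from at least one of the two names, so every
   arch is rejected, including "gcn" and "amdgcn".  */
static bool
arch_rejected_buggy (const char *arch)
{
  return arch && (strcmp (arch, "gcn") != 0 || strcmp (arch, "amdgcn") != 0);
}

/* Hypothetical helper: reject only if ARCH matches neither name.  A NULL
   ARCH means "no arch constraint" and is never rejected.  */
static bool
arch_rejected_fixed (const char *arch)
{
  return arch && strcmp (arch, "gcn") != 0 && strcmp (arch, "amdgcn") != 0;
}
```

The `&&` form is the De Morgan negation of "arch is 'gcn' or arch is 'amdgcn'", which is the acceptance condition the selector needs.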

>> @@ -48,8 +48,17 @@ GOMP_evaluate_current_device (const char *kind, const char *arch,
>>  #endif
>>
>>  #ifdef __GCN5__
>> -  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0
>> -  || strcmp (isa, "gfx908") == 0)
>> +  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") != 0)
>> +    return true;
>> +#endif
>> +
>> +#ifdef __CDNA1__
>> +  if (strcmp (isa, "gfx908") == 0)
>> +    return true;
>> +#endif
>> +
>> +#ifdef __CDNA2__
>> +  if (strcmp (isa, "gfx90a") == 0)
>>      return true;
>>  #endif
>
> Okay for gfx908 and gfx90a, but is there any way of distinguishing
> between 'gfx900' and 'gfx906' ISAs? I don't think these are mutually
> compatible.




Since I did not find any existing builtin to check the exact ISA, I 
added all of them for consistency. Let me know if that looks good to you.


Thanks,
--
PA

From f846292d2ce953a633fe400226277cf0cb0d6243 Mon Sep 17 00:00:00 2001
From: Paul-Antoine Arras 
Date: Wed, 30 Nov 2022 14:52:55 +0100
Subject: [PATCH] amdgcn: Support AMD-specific 'isa' and 'arch' traits in
 OpenMP context selectors

Add or fix libgomp support for 'amdgcn' as arch, and 'gfx908' and 'gfx90a' as 
isa traits.
Add test case for all supported 'isa' values used as context selectors in a 
metadirective construct.

libgomp/ChangeLog:

* config/gcn/selector.c (GOMP_evaluate_current_device): Recognise 
'amdgcn' as arch, and 'gfx908' and
'gfx90a' as isa traits.
* testsuite/libgomp.c-c++-common/metadirective-6.c: New test.
---
 gcc/config/gcn/gcn-opts.h |  6 +++
 gcc/config/gcn/gcn.h  | 37 --
 libgomp/config/gcn/selector.c | 24 --
 .../libgomp.c-c++-common/metadirective-6.c| 48 +++
 4 files changed, 96 insertions(+), 19 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/metadirective-6.c

diff --git gcc/config/gcn/gcn-opts.h gcc/config/gcn/gcn-opts.h
index 07ddc79cda3..fb7e5d9a5e9 100644
--- gcc/config/gcn/gcn-opts.h
+++ gcc/config/gcn/gcn-opts.h
@@ -27,6 +27,12 @@ enum processor_type
   PROCESSOR_GFX90a
 };
 
+#define TARGET_FIJI (gcn_arch == PROCESSOR_FIJI)
+#define TARGET_VEGA10 (gcn_arch == PROCESSOR_VEGA10)
+#define TARGET_VEGA20 (gcn_arch == PROCESSOR_VEGA20)
+#define TARGET_GFX908 (gcn_arch == PROCESSOR_GFX908)
+#define TARGET_GFX90a (gcn_arch == PROCESSOR_GFX90a)
+
 /* Set in gcn_option_override.  */
 extern enum gcn_isa {
   ISA_UNKNOWN,
diff --git gcc/config/gcn/gcn.h gcc/config/gcn/gcn.h
index 38f7212db59..22a95ba6609 100644
--- gcc/config/gcn/gcn.h
+++ gcc/config/gcn/gcn.h
@@ -16,20 +16,29 @@
 
 #include "config/gcn/gcn-opts.h"
 
-#define TARGET_CPU_CPP_BUILTINS()  \
-  do   \
-{  \
-  builtin_define ("__AMDGCN__");   \
-  if (TARGET_GCN3) \
-   builtin_define ("__GCN3__");\
-  else if (TARGET_GCN5)\
-   builtin_define ("__GCN5__");\
-  else if (TARGET_CDNA1)   \
-   builtin_define ("__CDNA1__");   \
-  else if (TARGET_CDNA2)   \
-   builtin_define ("__CDNA2__");   \
-}  \
-  while(0)
+#define TARGET_CPU_CPP_BUILTINS()  \
+  do   \
+{  \
+  builtin_define ("__AMDGCN__");   \
+  if (TARGET_GCN3) \
+   builtin_define ("__GCN3__");   \
+  else if (TARGET_GCN5)   \
+   builtin_define ("__GCN5__");   \
+  else if (TARGET_CDNA1)   \
+   builtin_define ("__CDNA1__");  \
+  else if (TARGET_CDNA2)   \
+   builtin_define ("__CDNA2__");  \
+  if (TARGET_FIJI) \
+   builtin_define ("__FIJI__");   \
+  else if (TARGET_VEGA10)  \
+   builtin_define 

Re: [PATCH Rust front-end v2 31/37] gccrs: Add GCC Rust front-end Make-lang.in

2022-12-01 Thread Thomas Schwinge
Hi!

On 2022-09-14T15:34:10+0200, Richard Biener via Gcc-patches 
 wrote:
> On Wed, Aug 24, 2022 at 2:22 PM  wrote:
>> --- /dev/null
>> +++ b/gcc/rust/Make-lang.in

>> +# TODO: possibly find a way to ensure C++11 compilation level here?
>> +RUST_CXXFLAGS = -std=c++11 -Wno-unused-parameter -Werror=overloaded-virtual
>
> You probably should inherit from $(CXXFLAGS) here which ensures C++11
> compatibility.

That was done in GCC/Rust commit da13bf4bbc46b399419c3e7f2c358a0efe3bdfdd
"make: Inherit CXXFLAGS, remove compiler-specific warnings", which
changed this to just 'RUST_CXXFLAGS = $(CXXFLAGS)'.

> Note you have to deal with non-g++ host compilers when not
> bootstrapping so adding -Wno-unused-parameter -Werror=overloaded-virtual
> needs to be guarded.

'-Werror=overloaded-virtual' is implied, as by default we have
'-Woverloaded-virtual' and '-Werror'.  (I've verified via putting
'class tmp : public Dump { void visit (int) {} };' into
'gcc/rust/ast/rust-ast-dump.cc', and getting a number of
'error: ‘virtual void Rust::AST::Dump::visit([...])’ was hidden'.)
(Maybe that isn't active for '--disable-bootstrap' builds, but that's
"OK".)

Remains only '-Wno-unused-parameter'.  That one should move into
'rust-warn', where we currently have:

>> +# Use strict warnings for this front end.
>> +rust-warn = $(STRICT_WARN)

Per GCC 4.8 documentation (baseline version to bootstrap GCC), we may use
'-Wno-[...]' without checking whether the corresponding '-W[...]' is
actually supported, so we may specify '-Wno-unused-parameter'
unconditionally, like existing ones, for example in 'gcc/Makefile.in':

# These files are to have specific diagnostics suppressed, [...]
gimple-match.o-warn = -Wno-unused
generic-match.o-warn = -Wno-unused
dfp.o-warn = -Wno-strict-aliasing

I thus understand that non-GCC compilers implement the same '-Wno-[...]'
behavior -- or maybe warning flags are not passed to those at all, at
stage 1 build where this is (only) relevant.

I've thus proposed 
"'rust-warn += -Wno-unused-parameter'".


Grüße
 Thomas


Re: [PATCH][AArch64] Cleanup move immediate code

2022-12-01 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra  writes:
> Hi Richard,
>
>> Just to make sure I understand: isn't it really just MOVN?  I would have
>> expected a 32-bit MOVZ to be equivalent to (and add no capabilities over)
>> a 64-bit MOVZ.
>
> The 32-bit MOVZ immediates are equivalent, MOVN never overlaps, and
> MOVI has some overlaps. Since we allow all 3 variants, the 2 alternatives
> in the movdi pattern are overlapping for MOVZ and MOVI immediates.
>
>> I agree the ctz trick is more elegant than (and an improvement over)
>> the current approach to testing for movz.  But I think the overall logic
>> is harder to follow than it was in the original version.  Initially
>> canonicalising val2 based on the sign bit seems unintuitive since we
>> still need to handle all four combinations of (top bit set, top bit clear)
>> x (low 48 bits set, low 48 bits clear).  I preferred the original
>> approach of testing once with the original value (for MOVZ) and once
>> with the inverted value (for MOVN).
>
> Yes, the canonicalization on the sign ends up requiring 2 special cases.
> Handling the MOVZ case first and then MOVN does avoid that, and makes
> things simpler overall, so I've used that approach in v2.
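
The "MOVZ first, then MOVN" testing order discussed above can be sketched in standalone C. This is an illustrative sketch, not the patch's `aarch64_is_movz` or `aarch64_move_imm`: all function names here are hypothetical, `__builtin_ctzll` is a GCC/Clang builtin, and bitmask (MOVI) immediates are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if VAL is one 16-bit chunk at bit position
   0, 16, 32 or 48 with all other bits clear (a MOVZ-style immediate).  */
static bool
is_movz_imm (uint64_t val)
{
  if (val == 0)
    return true;
  /* Round the position of the lowest set bit down to a multiple of 16,
     then check that everything fits in the 16 bits starting there.  */
  int shift = __builtin_ctzll (val) & ~15;
  return (val >> shift) <= 0xffff;
}

/* Test once with the value itself (MOVZ) and once with its inverse
   (MOVN), treating VAL as a 64-bit immediate.  */
static bool
is_mov_imm_64 (uint64_t val)
{
  return is_movz_imm (val) || is_movz_imm (~val);
}

/* The 32-bit forms clear the upper half of the 64-bit register, so they
   only apply when the upper 32 bits of VAL are zero.  */
static bool
is_mov_imm_32 (uint64_t val)
{
  if ((val >> 32) != 0)
    return false;
  return is_movz_imm (val) || is_movz_imm (~val & 0xffffffffu);
}
```

In this sketch a value such as 0xfffffffe is rejected by the 64-bit test but accepted by the 32-bit one, which matches the observation that the 32-bit MOVN is the form that adds new immediates.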
>
>> Don't the new cases boil down to: if mode is DImode and the upper 32 bits
>> are clear, we can test based on SImode instead?  In other words, couldn't
>> the "(val >> 32) == 0" part of the final test be done first, with the
>> effect of changing the mode to SImode?  Something like:
>
> Yes that works. I used masking of the top bits to avoid repeatedly testing the
> same condition. The new version removes most special cases and ends up
> both smaller and simpler:
>
>
> v2: Simplify the special cases in aarch64_move_imm, use aarch64_is_movz.
>
> Simplify, refactor and improve various move immediate functions.
> Allow 32-bit MOVZ/I/N as a valid 64-bit immediate which removes special
> cases in aarch64_internal_mov_immediate.  Add new constraint so the movdi
> pattern only needs a single alternative for move immediate.
>
> Passes bootstrap and regress, OK for commit?
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.cc (aarch64_bitmask_imm): Use unsigned type.
> (aarch64_zeroextended_move_imm): New function.
> (aarch64_move_imm): Refactor, assert mode is SImode or DImode.
> (aarch64_internal_mov_immediate): Assert mode is SImode or DImode.
> Simplify special cases.
> (aarch64_uimm12_shift): Simplify code.
> (aarch64_clamp_to_uimm12_shift): Likewise.
> (aarch64_movw_imm): Rename to aarch64_is_movz.
> (aarch64_float_const_rtx_p): Pass either SImode or DImode to
> aarch64_internal_mov_immediate.
> (aarch64_rtx_costs): Likewise.
> * config/aarch64/aarch64.md (movdi_aarch64): Merge 'N' and 'M'
> constraints into single 'O'.
> (mov_aarch64): Likewise.
> * config/aarch64/aarch64-protos.h (aarch64_move_imm): Use unsigned.
> (aarch64_bitmask_imm): Likewise.
> (aarch64_uimm12_shift): Likewise.
> (aarch64_zeroextended_move_imm): New prototype.
> * config/aarch64/constraints.md: Add 'O' for 32/64-bit immediates,
> limit 'N' to 64-bit only moves.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 4be93c93c26e091f878bc8e4cf06e90888405fb2..8bce6ec7599edcc2e6a1d8006450f35c0ce7f61f 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -756,7 +756,7 @@ void aarch64_post_cfi_startproc (void);
>  poly_int64 aarch64_initial_elimination_offset (unsigned, unsigned);
>  int aarch64_get_condition_code (rtx);
>  bool aarch64_address_valid_for_prefetch_p (rtx, bool);
> -bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
> +bool aarch64_bitmask_imm (unsigned HOST_WIDE_INT val, machine_mode);
>  unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
>  unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
>  bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode 
> mode);
> @@ -793,7 +793,7 @@ bool aarch64_masks_and_shift_for_bfi_p (scalar_int_mode, 
> unsigned HOST_WIDE_INT,
>   unsigned HOST_WIDE_INT,
>   unsigned HOST_WIDE_INT);
>  bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
> -bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
> +bool aarch64_move_imm (unsigned HOST_WIDE_INT, machine_mode);
>  machine_mode aarch64_sve_int_mode (machine_mode);
>  opt_machine_mode aarch64_sve_pred_mode (unsigned int);
>  machine_mode aarch64_sve_pred_mode (machine_mode);
> @@ -843,8 +843,9 @@ bool aarch64_sve_float_arith_immediate_p (rtx, bool);
>  bool aarch64_sve_float_mul_immediate_p (rtx);
>  bool aarch64_split_dimode_const_store (rtx, rtx);
>  bool aarch64_symbolic_address_p (rtx);
> -bool aarch64_uimm12_shift (HOST_WIDE_INT);
> +bool aarch64_uimm12_shift 

[PATCH] c++, v2: Incremental fix for g++.dg/gomp/for-21.C [PR84469]

2022-12-01 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 30, 2022 at 01:52:08PM -0500, Jason Merrill wrote:
> It looks like we're already deducing the type for the underlying S variable
> in cp_convert_omp_range_for, we just aren't updating the types of the
> individual bindings.

You're right.  With this patch (still incremental against the base PR84469
patch) we get the nicer diagnostics in all cases.

Regtested successfully on x86_64-linux (g++ gomp.exp/goacc.exp/goacc-gomp.exp
and libgomp's c++.exp), ok for trunk (including the base patch)
if it passes full bootstrap/regtest?

2022-12-01  Jakub Jelinek  

PR c++/84469
gcc/c-family/
* c-omp.cc (c_omp_is_loop_iterator): For range for with structured
binding return TREE_VEC_LENGTH (d->declv) even if decl is equal
to any of the structured binding decls.
gcc/cp/
* parser.cc (cp_convert_omp_range_for): After do_auto_deduction if
!processing_template_decl call cp_finish_decomp with
processing_template_decl temporarily incremented.
gcc/testsuite/
* g++.dg/gomp/for-21.C (f3, f6, f9): Adjust expected diagnostics.
* g++.dg/gomp/for-22.C: New test.

--- gcc/c-family/c-omp.cc.jj	2022-10-04 10:36:46.515414485 +0200
+++ gcc/c-family/c-omp.cc	2022-12-01 10:57:56.365253302 +0100
@@ -1311,10 +1311,11 @@ c_omp_is_loop_iterator (tree decl, struc
 else if (TREE_CODE (TREE_VEC_ELT (d->declv, i)) == TREE_LIST
 && TREE_CHAIN (TREE_VEC_ELT (d->declv, i))
 && (TREE_CODE (TREE_CHAIN (TREE_VEC_ELT (d->declv, i)))
-== TREE_VEC)
-&& decl == TREE_VEC_ELT (TREE_CHAIN (TREE_VEC_ELT (d->declv,
- i)), 2))
-  return TREE_VEC_LENGTH (d->declv);
+== TREE_VEC))
+  for (int j = 2;
+  j < TREE_VEC_LENGTH (TREE_CHAIN (TREE_VEC_ELT (d->declv, i))); j++)
+   if (decl == TREE_VEC_ELT (TREE_CHAIN (TREE_VEC_ELT (d->declv, i)), j))
+ return TREE_VEC_LENGTH (d->declv);
   return -1;
 }
 
--- gcc/cp/parser.cc.jj	2022-12-01 10:19:27.0 +0100
+++ gcc/cp/parser.cc	2022-12-01 10:21:30.760450093 +0100
@@ -43126,8 +43126,16 @@ cp_convert_omp_range_for (tree _pre
   tree t = build_x_indirect_ref (input_location, begin, RO_UNARY_STAR,
 NULL_TREE, tf_none);
   if (!error_operand_p (t))
-   TREE_TYPE (orig_decl) = do_auto_deduction (TREE_TYPE (orig_decl),
-  t, auto_node);
+   {
+ TREE_TYPE (orig_decl) = do_auto_deduction (TREE_TYPE (orig_decl),
+t, auto_node);
+ if (decomp_first_name)
+   {
+ ++processing_template_decl;
+ cp_finish_decomp (orig_decl, decomp_first_name, decomp_cnt);
+ --processing_template_decl;
+   }
+   }
 }
 
   tree v = make_tree_vec (decomp_cnt + 3);
--- gcc/testsuite/g++.dg/gomp/for-21.C.jj	2022-11-30 10:29:09.332186135 +0100
+++ gcc/testsuite/g++.dg/gomp/for-21.C	2022-12-01 11:05:40.888414600 +0100
@@ -24,9 +24,9 @@ void
 f3 (S (&a)[10])
 {
   #pragma omp for collapse (2)
-  for (auto [i, j, k] : a) // { dg-error "use of 'i' 
before deduction of 'auto'" "" { target *-*-* } .+1 }
-for (int l = i; l < j; l += k) // { dg-error "use of 'j' 
before deduction of 'auto'" }
-  ;// { dg-error "use of 
'k' before deduction of 'auto'" "" { target *-*-* } .-1 }
+  for (auto [i, j, k] : a) // { dg-error "initializer 
expression refers to iteration variable 'i'" }
+for (int l = i; l < j; l += k) // { dg-error "condition 
expression refers to iteration variable 'j'" }
+  ;// { dg-error 
"increment expression refers to iteration variable 'k'" "" { target *-*-* } .-2 
}
 }
 
 template 
@@ -54,9 +54,9 @@ void
 f6 (S (&a)[10])
 {
   #pragma omp for collapse (2)
-  for (auto [i, j, k] : a) // { dg-error "use of 'i' 
before deduction of 'auto'" "" { target *-*-* } .-1 }
-for (int l = i; l < j; l += k) // { dg-error "use of 'j' 
before deduction of 'auto'" }
-  ;// { dg-error "use of 
'k' before deduction of 'auto'" "" { target *-*-* } .-3 }
+  for (auto [i, j, k] : a) // { dg-error "initializer 
expression refers to iteration variable 'i'" "" { target *-*-* } .-1 }
+for (int l = i; l < j; l += k) // { dg-error "condition 
expression refers to iteration variable 'j'" }
+  ;// { dg-error 
"increment expression refers to iteration variable 'k'" "" { target *-*-* } .-3 
}
 }
 
 template 
@@ -84,9 +84,9 @@ void
 f9 (U (&a)[10])
 {
   #pragma omp for collapse (2)
-  for (auto [i, j, k] : a) // { dg-error "use of 'i' 
before 

[PATCH 0/3] RISC-V: optimize stack manipulation in save-restore

2022-12-01 Thread Fei Gao
The patches allow fewer instructions to be used in stack allocation
and deallocation if save-restore is enabled, and also make the stack
manipulation code more readable.

Fei Gao (3):
  RISC-V: add a new parameter in riscv_first_stack_step.
  RISC-V: optimize stack manipulation in save-restore
  RISC-V: make the stack manipulation codes more readable.

 gcc/config/riscv/riscv.cc | 105 +-
 .../gcc.target/riscv/stack_save_restore.c |  40 +++
 2 files changed, 95 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore.c

-- 
2.17.1



[PATCH 3/3] RISC-V: make the stack manipulation codes more readable.

2022-12-01 Thread Fei Gao
gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_first_stack_step): make codes more 
readable.
(riscv_expand_epilogue): likewise.
---
 gcc/config/riscv/riscv.cc | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a50f2303032..95da08ffb3b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4926,8 +4926,11 @@ riscv_first_stack_step (struct riscv_frame_info *frame, 
poly_int64 remaining_siz
   if (SMALL_OPERAND (remaining_const_size))
 return remaining_const_size;
 
+  poly_int64 callee_saved_first_step =
+remaining_size - frame->frame_pointer_offset;
+  gcc_assert(callee_saved_first_step.is_constant ());
   HOST_WIDE_INT min_first_step =
-riscv_stack_align ((remaining_size - 
frame->frame_pointer_offset).to_constant());
+riscv_stack_align (callee_saved_first_step.to_constant ());
   HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8;
   HOST_WIDE_INT min_second_step = remaining_const_size - max_first_step;
   gcc_assert (min_first_step <= max_first_step);
@@ -4935,7 +4938,7 @@ riscv_first_stack_step (struct riscv_frame_info *frame, 
poly_int64 remaining_siz
   /* As an optimization, use the least-significant bits of the total frame
  size, so that the second adjustment step is just LUI + ADD.  */
   if (!SMALL_OPERAND (min_second_step)
-  && remaining_const_size % IMM_REACH < IMM_REACH / 2
+  && remaining_const_size % IMM_REACH <= max_first_step
   && remaining_const_size % IMM_REACH >= min_first_step)
 return remaining_const_size % IMM_REACH;
 
@@ -5129,14 +5132,14 @@ riscv_adjust_libcall_cfi_epilogue ()
 void
 riscv_expand_epilogue (int style)
 {
-  /* Split the frame into two.  STEP1 is the amount of stack we should
- deallocate before restoring the registers.  STEP2 is the amount we
- should deallocate afterwards.
+  /* Split the frame into 3 steps. STEP1 is the amount of stack we should
+ deallocate before restoring the registers. STEP2 is the amount we
+ should deallocate afterwards including the callee saved regs. STEP3
+ is the amount deallocated by save-restore libcall.
 
  Start off by assuming that no registers need to be restored.  */
   struct riscv_frame_info *frame = &cfun->machine->frame;
   unsigned mask = frame->mask;
-  poly_int64 step1 = frame->total_size;
   HOST_WIDE_INT step2 = 0;
   bool use_restore_libcall = ((style == NORMAL_RETURN)
  && riscv_use_save_libcall (frame));
@@ -5223,7 +5226,7 @@ riscv_expand_epilogue (int style)
   if (use_restore_libcall)
 frame->mask = mask; /* Undo the above fib.  */
 
-  step1 -= step2 + libcall_size;
+  poly_int64 step1 = frame->total_size - step2 - libcall_size;
 
   /* Set TARGET to BASE + STEP1.  */
   if (known_gt (step1, 0))
-- 
2.17.1



[PATCH 1/3] RISC-V: add a new parameter in riscv_first_stack_step.

2022-12-01 Thread Fei Gao
The conversion from frame->total_size to remaining_size is done as an
independent patch, without functional change, as per review comment.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_first_stack_step): add a new function 
parameter remaining_size.
(riscv_compute_frame_info): adapt new riscv_first_stack_step interface.
(riscv_expand_prologue): likewise.
(riscv_expand_epilogue): likewise.
---
 gcc/config/riscv/riscv.cc | 48 +++
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 05bdba5ab4d..f0bbcd6d6be 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4634,7 +4634,7 @@ riscv_save_libcall_count (unsigned mask)
They decrease stack_pointer_rtx but leave frame_pointer_rtx and
hard_frame_pointer_rtx unchanged.  */
 
-static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame);
+static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame, 
poly_int64 remaining_size);
 
 /* Handle stack align for poly_int.  */
 static poly_int64
@@ -4663,7 +4663,7 @@ riscv_compute_frame_info (void)
  save/restore t0.  We check for this before clearing the frame struct.  */
   if (cfun->machine->interrupt_handler_p)
 {
-  HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
+  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, frame->total_size);
   if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1)))
interrupt_save_prologue_temp = true;
 }
@@ -4913,45 +4913,45 @@ riscv_restore_reg (rtx reg, rtx mem)
without adding extra instructions.  */
 
 static HOST_WIDE_INT
-riscv_first_stack_step (struct riscv_frame_info *frame)
+riscv_first_stack_step (struct riscv_frame_info *frame, poly_int64 
remaining_size)
 {
-  HOST_WIDE_INT frame_total_constant_size;
-  if (!frame->total_size.is_constant ())
-frame_total_constant_size
-  = riscv_stack_align (frame->total_size.coeffs[0])
-   - riscv_stack_align (frame->total_size.coeffs[1]);
+  HOST_WIDE_INT remaining_const_size;
+  if (!remaining_size.is_constant ())
+remaining_const_size
+  = riscv_stack_align (remaining_size.coeffs[0])
+- riscv_stack_align (remaining_size.coeffs[1]);
   else
-frame_total_constant_size = frame->total_size.to_constant ();
+remaining_const_size = remaining_size.to_constant ();
 
-  if (SMALL_OPERAND (frame_total_constant_size))
-return frame_total_constant_size;
+  if (SMALL_OPERAND (remaining_const_size))
+return remaining_const_size;
 
   HOST_WIDE_INT min_first_step =
-RISCV_STACK_ALIGN ((frame->total_size - 
frame->frame_pointer_offset).to_constant());
+riscv_stack_align ((remaining_size - 
frame->frame_pointer_offset).to_constant());
   HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8;
-  HOST_WIDE_INT min_second_step = frame_total_constant_size - max_first_step;
+  HOST_WIDE_INT min_second_step = remaining_const_size - max_first_step;
   gcc_assert (min_first_step <= max_first_step);
 
   /* As an optimization, use the least-significant bits of the total frame
  size, so that the second adjustment step is just LUI + ADD.  */
   if (!SMALL_OPERAND (min_second_step)
-  && frame_total_constant_size % IMM_REACH < IMM_REACH / 2
-  && frame_total_constant_size % IMM_REACH >= min_first_step)
-return frame_total_constant_size % IMM_REACH;
+  && remaining_const_size % IMM_REACH < IMM_REACH / 2
+  && remaining_const_size % IMM_REACH >= min_first_step)
+return remaining_const_size % IMM_REACH;
 
   if (TARGET_RVC)
 {
   /* If we need two subtracts, and one is small enough to allow compressed
-loads and stores, then put that one first.  */
+ loads and stores, then put that one first.  */
   if (IN_RANGE (min_second_step, 0,
-   (TARGET_64BIT ? SDSP_REACH : SWSP_REACH)))
-   return MAX (min_second_step, min_first_step);
+(TARGET_64BIT ? SDSP_REACH : SWSP_REACH)))
+   return MAX (min_second_step, min_first_step);
 
   /* If we need LUI + ADDI + ADD for the second adjustment step, then start
-with the minimum first step, so that we can get compressed loads and
-stores.  */
+ with the minimum first step, so that we can get compressed loads and
+ stores.  */
   else if (!SMALL_OPERAND (min_second_step))
-   return min_first_step;
+   return min_first_step;
 }
 
   return max_first_step;
@@ -5037,7 +5037,7 @@ riscv_expand_prologue (void)
   /* Save the registers.  */
   if ((frame->mask | frame->fmask) != 0)
 {
-  HOST_WIDE_INT step1 = riscv_first_stack_step (frame);
+  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, frame->total_size);
   if (size.is_constant ())
step1 = MIN (size.to_constant(), step1);
 
@@ -5216,7 +5216,7 @@ riscv_expand_epilogue (int style)
  possible in 

[PATCH 2/3] RISC-V: optimize stack manipulation in save-restore

2022-12-01 Thread Fei Gao
The stack space that save-restore reserves is not taken into account well
in stack allocation and deallocation.
This patch allows fewer instructions to be used in stack allocation and
deallocation if save-restore is enabled.

before patch:
  bar:
    call    t0,__riscv_save_4
    addi    sp,sp,-64
    ...
    li      t0,-12288
    addi    t0,t0,-1968 # optimized out after patch
    add     sp,sp,t0 # prologue
    ...
    li      t0,12288 # epilogue
    addi    t0,t0,2000 # optimized out after patch
    add     sp,sp,t0
    ...
    addi    sp,sp,32
    tail    __riscv_restore_4

after patch:
  bar:
    call    t0,__riscv_save_4
    addi    sp,sp,-2032
    ...
    li      t0,-12288
    add     sp,sp,t0 # prologue
    ...
    li      t0,12288 # epilogue
    add     sp,sp,t0
    ...
    addi    sp,sp,2032
    tail    __riscv_restore_4
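
The arithmetic behind the two listings can be checked with a small standalone C sketch. Both listings adjust sp by 14320 bytes in total (64 + 12288 + 1968 before, 2032 + 12288 after). The sketch below is a simplified, constant-only model of the first-step choice: `first_stack_step` and `small_operand` are hypothetical stand-ins, the compressed-instruction branch is omitted, and the values 4096 for `IMM_REACH` and 16 bytes for stack alignment are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMM_REACH 4096        /* assumed: 12-bit signed immediates */
#define STACK_ALIGN_BYTES 16  /* assumed: PREFERRED_STACK_BOUNDARY / 8 */

/* Hypothetical stand-in for SMALL_OPERAND: fits a 12-bit signed immediate.  */
static bool
small_operand (int64_t x)
{
  return x >= -IMM_REACH / 2 && x < IMM_REACH / 2;
}

/* Simplified model: pick the first sp adjustment for REMAINING bytes.  */
static int64_t
first_stack_step (int64_t remaining, int64_t min_first_step)
{
  if (small_operand (remaining))
    return remaining;

  int64_t max_first_step = IMM_REACH / 2 - STACK_ALIGN_BYTES;
  int64_t min_second_step = remaining - max_first_step;

  /* Use the low bits of the size first, so that the second adjustment
     is a multiple of IMM_REACH and needs only LUI + ADD.  */
  if (!small_operand (min_second_step)
      && remaining % IMM_REACH < IMM_REACH / 2
      && remaining % IMM_REACH >= min_first_step)
    return remaining % IMM_REACH;

  return max_first_step;
}
```

Under these assumptions, a remaining size of 14320 gives a first step of 2032 and leaves 12288 (3 * 4096) for the second step, which is the split the "after patch" listing shows.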

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_prologue): consider save-restore 
in stack allocation.
(riscv_expand_epilogue): consider save-restore in stack deallocation.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/stack_save_restore.c: New test.
---
 gcc/config/riscv/riscv.cc | 50 ++-
 .../gcc.target/riscv/stack_save_restore.c | 40 +++
 2 files changed, 66 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/stack_save_restore.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f0bbcd6d6be..a50f2303032 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5010,12 +5010,12 @@ void
 riscv_expand_prologue (void)
 {
   struct riscv_frame_info *frame = &cfun->machine->frame;
-  poly_int64 size = frame->total_size;
+  poly_int64 remaining_size = frame->total_size;
   unsigned mask = frame->mask;
   rtx insn;
 
   if (flag_stack_usage_info)
-current_function_static_stack_size = constant_lower_bound (size);
+current_function_static_stack_size = constant_lower_bound (remaining_size);
 
   if (cfun->machine->naked_p)
 return;
@@ -5026,7 +5026,7 @@ riscv_expand_prologue (void)
   rtx dwarf = NULL_RTX;
   dwarf = riscv_adjust_libcall_cfi_prologue ();
 
-  size -= frame->save_libcall_adjustment;
+  remaining_size -= frame->save_libcall_adjustment;
   insn = emit_insn (riscv_gen_gpr_save_insn (frame));
   frame->mask = 0; /* Temporarily fib that we need not save GPRs.  */
 
@@ -5037,16 +5037,14 @@ riscv_expand_prologue (void)
   /* Save the registers.  */
   if ((frame->mask | frame->fmask) != 0)
 {
-  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, frame->total_size);
-  if (size.is_constant ())
-   step1 = MIN (size.to_constant(), step1);
+  HOST_WIDE_INT step1 = riscv_first_stack_step (frame, remaining_size);
 
   insn = gen_add3_insn (stack_pointer_rtx,
stack_pointer_rtx,
GEN_INT (-step1));
   RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
-  size -= step1;
-  riscv_for_each_saved_reg (size, riscv_save_reg, false, false);
+  remaining_size -= step1;
+  riscv_for_each_saved_reg (remaining_size, riscv_save_reg, false, false);
 }
 
   frame->mask = mask; /* Undo the above fib.  */
@@ -5055,29 +5053,29 @@ riscv_expand_prologue (void)
   if (frame_pointer_needed)
 {
   insn = gen_add3_insn (hard_frame_pointer_rtx, stack_pointer_rtx,
-   GEN_INT ((frame->hard_frame_pointer_offset - size).to_constant ()));
+   GEN_INT ((frame->hard_frame_pointer_offset - remaining_size).to_constant ()));
   RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
 
   riscv_emit_stack_tie ();
 }
 
   /* Allocate the rest of the frame.  */
-  if (known_gt (size, 0))
+  if (known_gt (remaining_size, 0))
 {
   /* Two step adjustment:
 1.scalable frame. 2.constant frame.  */
   poly_int64 scalable_frame (0, 0);
-  if (!size.is_constant ())
+  if (!remaining_size.is_constant ())
{
  /* First for scalable frame.  */
- poly_int64 scalable_frame = size;
- scalable_frame.coeffs[0] = size.coeffs[1];
+ poly_int64 scalable_frame = remaining_size;
+ scalable_frame.coeffs[0] = remaining_size.coeffs[1];
   riscv_v_adjust_scalable_frame (stack_pointer_rtx, scalable_frame, false);
- size -= scalable_frame;
+ remaining_size -= scalable_frame;
}
 
   /* Second step for constant frame.  */
-  HOST_WIDE_INT constant_frame = size.to_constant ();
+  HOST_WIDE_INT constant_frame = remaining_size.to_constant ();
   if (constant_frame == 0)
return;
 
@@ -5142,6 +5140,8 @@ riscv_expand_epilogue (int style)
   HOST_WIDE_INT step2 = 0;
   bool use_restore_libcall = ((style == NORMAL_RETURN)
  && riscv_use_save_libcall (frame));
+  unsigned libcall_size = use_restore_libcall ?
+frame->save_libcall_adjustment : 0;
   

[PATCH] IPA: do not release body if still needed

2022-12-01 Thread Martin Liška
Hi.

Noticed while building libbackend.a with LTO partial linking.

The function release_body is called even if clone_of is itself a clone
of another function and thus shares its tree declaration. We should
preserve the body in that situation.
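
The guard can be pictured with a toy model. The sketch below is only an
illustration under simplifying assumptions: struct node and
may_release_origin_body are hypothetical names, the fields stand in for the
corresponding cgraph_node members, and the check is assumed to run after the
removed node has been unlinked from its origin's clone list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few cgraph_node fields the check uses.  */
struct node
{
  bool analyzed;          /* body has been analyzed */
  struct node *clone_of;  /* origin this node was cloned from, or NULL */
  struct node *clones;    /* first clone of this node, or NULL */
};

/* The origin's body may be released only if nothing else can still need
   it: it is not analyzed, it has no remaining clones, the removed node
   has no clones, and (the new condition) the origin is not itself a
   clone sharing a declaration with yet another node.  */
bool
may_release_origin_body (const struct node *removed)
{
  const struct node *origin = removed->clone_of;
  return origin != NULL
	 && !origin->analyzed
	 && origin->clones == NULL
	 && removed->clones == NULL
	 && origin->clone_of == NULL;
}

void
check_clone_chain (void)
{
  struct node root = { false, NULL, NULL };
  struct node mid = { false, &root, NULL };
  struct node leaf = { false, &mid, NULL };
  /* mid is a clone of root, so removing leaf must keep mid's body.  */
  assert (!may_release_origin_body (&leaf));
  /* root is not a clone, so removing mid may release root's body.  */
  assert (may_release_origin_body (&mid));
}
```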

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR ipa/107944

gcc/ChangeLog:

* cgraph.cc (cgraph_node::remove): Do not release body
if a node is a clone of another node.
---
 gcc/cgraph.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index f15cb47c8b8..2e7d77ffd6c 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1893,7 +1893,7 @@ cgraph_node::remove (void)
   else if (clone_of)
 {
   clone_of->clones = next_sibling_clone;
-  if (!clone_of->analyzed && !clone_of->clones && !clones)
+  if (!clone_of->analyzed && !clone_of->clones && !clones && !clone_of->clone_of)
clone_of->release_body ();
 }
   if (next_sibling_clone)
-- 
2.38.1



[PATCH] tree-optimization/107937 - uninit predicate simplification fixup

2022-12-01 Thread Richard Biener via Gcc-patches
The following changes the predicate representation to record the
value of a predicate with an empty set of AND predicates.  That's
necessary to properly represent the conservative fallback for the
def vs use predicates.  Since simplification now can result in
such an empty set this distinction becomes important and we need
to check for this as we otherwise ICE.
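
A minimal sketch of the idea, assuming a predicate is an OR of AND chains
and that an empty set of chains carries an explicit truth value. The names
toy_predicate, toy_is_true and toy_is_false are hypothetical, and the exact
definitions of is_true and is_false in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the predicate class: only the number of AND chains
   and the value an empty set represents are tracked.  */
struct toy_predicate
{
  unsigned n_chains;  /* number of AND chains that are ORed together */
  bool empty_val;     /* value of the predicate when n_chains == 0 */
};

/* An empty predicate is a constant; a non-empty one is neither known
   true nor known false without further analysis.  */
bool
toy_is_true (const struct toy_predicate *p)
{
  return p->n_chains == 0 && p->empty_val;
}

bool
toy_is_false (const struct toy_predicate *p)
{
  return p->n_chains == 0 && !p->empty_val;
}

void
check_empty_predicates (void)
{
  struct toy_predicate always = { 0, true };
  struct toy_predicate never = { 0, false };
  struct toy_predicate guarded = { 2, true };
  assert (toy_is_true (&always) && !toy_is_false (&always));
  assert (toy_is_false (&never) && !toy_is_true (&never));
  /* With chains present, empty_val does not decide anything.  */
  assert (!toy_is_true (&guarded) && !toy_is_false (&guarded));
}
```

Without the stored value, a chain set that simplification has emptied could
not be told apart from one that was never populated, which is the
distinction the description above relies on.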

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107937
* gimple-predicate-analysis.h (predicate::is_true): New.
(predicate::is_false): Likewise.
(predicate::empty_val): Likewise.
(uninit_analysis::uninit_analysis): Properly initialize
def_preds.
* gimple-predicate-analysis.cc (simplify_1b): Indicate
whether the chain became empty.
(predicate::simplify): Release emptied chain before removing it.
(predicate::normalize): Replace temporary object with assertion.
(uninit_analysis::is_use_guarded): Deal with predicates
that simplify to true/false.

* gcc.dg/pr107937.c: New testcase.
---
 gcc/gimple-predicate-analysis.cc | 24 +++-
 gcc/gimple-predicate-analysis.h  | 23 ---
 gcc/testsuite/gcc.dg/pr107937.c  | 24 
 3 files changed, 63 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr107937.c

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index ce2e1d10e43..afe01e7f4b8 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -1249,7 +1249,7 @@ simplify_1a (pred_chain &chain)
 }
 
 /* Implement rule 1b above.  PREDS is the AND predicate to simplify
-   in place.  Returns true if CHAIN simplifies to true.  */
+   in place.  Returns true if CHAIN simplifies to true or false.  */
 
 static bool
 simplify_1b (pred_chain &chain)
@@ -1290,6 +1290,8 @@ simplify_1b (pred_chain &chain)
{
  chain.ordered_remove (j);
  chain.ordered_remove (i);
+ if (chain.is_empty ())
+   return true;
  i--;
  break;
}
@@ -1503,6 +1505,7 @@ predicate::simplify (gimple *use_or_def, bool is_use)
   ::simplify_1a (m_preds[i]);
   if (::simplify_1b (m_preds[i]))
{
+ m_preds[i].release ();
  m_preds.ordered_remove (i);
  i--;
}
@@ -1719,10 +1722,11 @@ predicate::normalize (const pred_chain &chain)
   while (!work_list.is_empty ())
 {
   pred_info pi = work_list.pop ();
-  predicate pred;
   /* The predicate object is not modified here, only NORM_CHAIN and
 WORK_LIST are appended to.  */
-  pred.normalize (&norm_chain, pi, BIT_AND_EXPR, &work_list, &mark_set);
+  unsigned oldlen = m_preds.length ();
+  normalize (&norm_chain, pi, BIT_AND_EXPR, &work_list, &mark_set);
+  gcc_assert (m_preds.length () == oldlen);
 }
 
   m_preds.safe_push (norm_chain);
@@ -1740,7 +1744,7 @@ predicate::normalize (gimple *use_or_def, bool is_use)
   dump (dump_file, use_or_def, is_use ? "[USE]:\n" : "[DEF]:\n");
 }
 
-  predicate norm_preds;
+  predicate norm_preds (empty_val ());
   for (unsigned i = 0; i < m_preds.length (); i++)
 {
   if (m_preds[i].length () != 1)
@@ -2076,6 +2080,8 @@ predicate::operator= (const predicate &rhs)
   if (this == &rhs)
 return *this;
 
+  m_cval = rhs.m_cval;
+
   unsigned n = m_preds.length ();
   for (unsigned i = 0; i != n; ++i)
 m_preds[i].release ();
@@ -2204,11 +2210,15 @@ uninit_analysis::is_use_guarded (gimple *use_stmt, basic_block use_bb,
   /* Try to build the predicate expression under which the PHI flows
  into its use.  This will be empty if the PHI is defined and used
  in the same bb.  */
-  predicate use_preds;
+  predicate use_preds (true);
   if (!init_use_preds (use_preds, def_bb, use_bb))
 return false;
 
   use_preds.simplify (use_stmt, /*is_use=*/true);
+  if (use_preds.is_false ())
+return true;
+  if (use_preds.is_true ())
+return false;
   use_preds.normalize (use_stmt, /*is_use=*/true);
 
   /* Try to prune the dead incoming phi edges.  */
@@ -2227,6 +2237,10 @@ uninit_analysis::is_use_guarded (gimple *use_stmt, basic_block use_bb,
return false;
 
   m_phi_def_preds.simplify (phi);
+  if (m_phi_def_preds.is_false ())
+   return false;
+  if (m_phi_def_preds.is_true ())
+   return true;
   m_phi_def_preds.normalize (phi);
 }
 
diff --git a/gcc/gimple-predicate-analysis.h b/gcc/gimple-predicate-analysis.h
index 972af5e0b2d..c4a7ed51967 100644
--- a/gcc/gimple-predicate-analysis.h
+++ b/gcc/gimple-predicate-analysis.h
@@ -45,7 +45,7 @@ class predicate
 {
  public:
   /* Construct with the specified EVAL object.  */
-  predicate () : m_preds (vNULL) { }
+  predicate (bool empty_val) : m_preds (vNULL), m_cval (empty_val) { }
 
   /* Copy.  */
   predicate (const predicate &rhs) : m_preds (vNULL) { *this = rhs; }
@@ -60,6 +60,21 @@ class predicate
 return 

[PATCH] gcc: remove incpath.o from CXX_C_OBJS

2022-12-01 Thread Martin Liška
The object is already included in OBJS (libbackend.a), thus
we don't need it.

Noticed while using partial linking for libbackend.a.

Ready to be installed?
Thanks,
Martin

gcc/cp/ChangeLog:

* Make-lang.in: Remove extra object dependency.
---
 gcc/cp/Make-lang.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index af25bdc044a..75e2f7c7ba3 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -81,7 +81,7 @@ g++-cross$(exeext): xg++$(exeext)
 
 # The compiler itself.
 # Shared with C front end:
-CXX_C_OBJS = attribs.o incpath.o \
+CXX_C_OBJS = attribs.o \
$(C_COMMON_OBJS) $(CXX_TARGET_OBJS)
 
 # Language-specific object files for C++ and Objective C++.
-- 
2.38.1



[PATCH] tree-optimization/107935 - fixup equivalence handling in PHI VN

2022-12-01 Thread Richard Biener via Gcc-patches
The following makes sure to honor the backedge processing logic
that forces VARYING there.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/107935
* tree-ssa-sccvn.cc (visit_phi): Honor forced VARYING on
backedges.

* gcc.dg/torture/pr107935.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr107935.c | 18 ++
 gcc/tree-ssa-sccvn.cc   |  7 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr107935.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr107935.c b/gcc/testsuite/gcc.dg/torture/pr107935.c
new file mode 100644
index 000..78175100f80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr107935.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+
+int *a, **b;
+int main() {
+  int d = 0, *e = &d;
+ L:
+  *e = d;
+  if (a) {
+int *g = e = *b;
+if (!e)
+  __builtin_abort();
+if (**b)
+  return 0;
+*g = 1;
+goto L;
+  }
+  return 0;
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 6895ae84d13..fa2f65df159 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -5861,7 +5861,12 @@ visit_phi (gimple *phi, bool *inserted, bool backedges_varying_p)
  continue;
  }
/* There's also the possibility to use equivalences.  */
-   if (!FLOAT_TYPE_P (TREE_TYPE (def)))
+   if (!FLOAT_TYPE_P (TREE_TYPE (def))
+   /* But only do this if we didn't force any of sameval or
+  val to VARYING because of backedge processing rules.  */
+   && (TREE_CODE (sameval) != SSA_NAME
+   || SSA_VAL (sameval) == sameval)
+   && (TREE_CODE (def) != SSA_NAME || SSA_VAL (def) == def))
  {
vn_nary_op_t vnresult;
tree ops[2];
-- 
2.35.3


Re: [PATCH V2] rs6000: Support to build constants by li/lis+oris/xoris

2022-12-01 Thread Jiufu Guo via Gcc-patches
Hi Segher,

On 11/28/22 10:18 PM, Segher Boessenkool wrote:
> On Mon, Nov 28, 2022 at 11:37:34AM +0800, Jiufu Guo wrote:
>> Segher Boessenkool  writes:
>>> On Fri, Nov 25, 2022 at 04:11:49PM +0800, Kewen.Lin wrote:
>>>> on 2022/10/26 19:40, Jiufu Guo wrote:
>>>> for "li/lis + oris/xoris", I interpreted it into four combinations:
>>>>
>>>>    li + oris, lis + oris, li + xoris, lis + xoris.
>>>>
>>>> not sure just me interpreting like that, but the actual combinations
>>>> which this patch adopts are:
>>>>
>>>>    li + oris, li + xoris, lis + xoris.
>>>>
>>>> It's a bit off, but not a big deal, up to you to reword it or not.  :)
>>>
>>> The first two are obvious, but the last one is almost never a good idea,
>>> there usually are better ways to do the same.  I cannot even think of
>>> any case where this is best?  A lis;rl* is always preferred (it can
>>> optimise better, be combined with other insns).
>> I understand your point here.  The first two: 'li' for the lowest 16 bits,
>> 'oris/xoris' for the next 16 bits.
>>
>> While for 'lis + xoris', it may not be obvious, because both 'lis' and
>> 'xoris' operate on bits 17-31.
>> 'lis + xoris' is for the case "32(1) || 1(0) || 15(x) || 16(0)". xoris is
>> used to clear bit 31.  This case seems hard to support with 'rlxx'.
> 
> Please put that in a separate patch?  First do a patch with just
> lis;x?oris.  They are unrelated and different in almost every way.
>
I just sent out two patches, one for "lis; xoris" and one for "li; x?oris".

https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607617.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607618.html

Or would we prefer to split this into three patches for easier review? :-)

Thanks for the review!


BR,
Jeff (Jiufu)

>> I found this case while analyzing what kinds of constants can be
>> built by two instructions. I checked the possible combinations:
>> "addi/addis" + "neg/ori/../xoris/rldX/rlwX/../sradi/extswsli"(those
>> instructions which accept one register and one immediate).
>>
>> I also drafted the patch to use "li/lis+rlxx" to build constant.
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601276.html
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601277.html
> 
> Those seem to do many things in one patch as well :-(  It is very hard
> to review such things, it takes many hours each to do properly.
> 
> 
> Segher


[PATCH 2/2] rs6000: use li;x?oris to build constant

2022-12-01 Thread Jiufu Guo via Gcc-patches
Hi,

For constant C:
If '(c & 0xFFFFFFFF00008000ULL) == 0xFFFFFFFF00008000ULL', or say:
32(1) || 16(x) || 1(1) || 15(x), using "li; xoris" would be ok.

If '(c & 0xFFFFFFFF80008000ULL) == 0x80000000ULL', or say:
32(0) || 1(1) || 15(x) || 1(0) || 15(x), we could use "li; oris" to
build constant 'C'.

Here N(M) means N consecutive bits of value M; x for M means either
1 or 0 is ok; '||' means concatenation.
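
The two bit patterns can be checked with a small emulation. This is a
sketch under assumptions, not the code from the patch: li, oris and xoris
below are simplified 64-bit models of the instructions, the function names
and the 0xffff masks are my reading of the pattern notation above, and the
two sample constants are chosen to fit those patterns.

```c
#include <assert.h>
#include <stdint.h>

/* li: load a 16-bit immediate, sign-extended to 64 bits.  */
uint64_t
li (uint16_t imm)
{
  return (imm & 0x8000) ? (imm | ~(uint64_t) 0xffff) : imm;
}

/* oris / xoris: OR / XOR with a 16-bit immediate shifted left by 16.  */
uint64_t
oris (uint64_t reg, uint16_t imm)
{
  return reg | ((uint64_t) imm << 16);
}

uint64_t
xoris (uint64_t reg, uint16_t imm)
{
  return reg ^ ((uint64_t) imm << 16);
}

/* Return 1 and store the rebuilt value in *OUT if C matches one of the
   two patterns, 0 otherwise.  ud1..ud4 are the 16-bit chunks of C from
   least to most significant.  */
int
build_li_x_oris (uint64_t c, uint64_t *out)
{
  uint16_t ud1 = c & 0xffff, ud2 = (c >> 16) & 0xffff;
  uint16_t ud3 = (c >> 32) & 0xffff, ud4 = (c >> 48) & 0xffff;

  if (ud4 == 0xffff && ud3 == 0xffff && (ud1 & 0x8000))
    {
      /* li sets bits 16..63 to ones; xoris flips back the zero bits.  */
      *out = xoris (li (ud1), ud2 ^ 0xffff);
      return 1;
    }
  if (ud4 == 0 && ud3 == 0 && (ud2 & 0x8000) && !(ud1 & 0x8000))
    {
      /* li leaves bits 16..63 zero; oris fills in the second chunk.  */
      *out = oris (li (ud1), ud2);
      return 1;
    }
  return 0;
}

void
check_constants (void)
{
  uint64_t v;
  assert (build_li_x_oris (0xffffffff7cdeab55ULL, &v)
	  && v == 0xffffffff7cdeab55ULL);
  assert (build_li_x_oris (0x98765432ULL, &v) && v == 0x98765432ULL);
}
```

For the first sample, li loads 0xab55 sign-extended and xoris uses the
immediate 0x7cde ^ 0xffff = 0x8321; for the second, li loads 0x5432 and
oris supplies 0x9876.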

This patch updates rs6000_emit_set_long_const to support those constants.
Bootstrap and regtest pass on ppc64{,le}.

Is this ok for trunk?

BR,
Jeff (Jiufu)


PR target/106708

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add building
constants through "li; x?oris".

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106708.c: Add test functions.

---
 gcc/config/rs6000/rs6000.cc | 34 ++---
 gcc/testsuite/gcc.target/powerpc/pr106708.c | 27 +++-
 2 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 7efed94a0bc..316ee97c53d 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10251,6 +10251,14 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
   if (ud1 != 0)
emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
 }
+  else if (ud4 == 0xffff && ud3 == 0xffff && (ud1 & 0x8000))
+{
+  /* li; xoris */
+  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
+  emit_move_insn (temp, GEN_INT (sext_hwi (ud1, 16)));
+  emit_move_insn (dest, gen_rtx_XOR (DImode, temp,
+GEN_INT ((ud2 ^ 0xffff) << 16)));
+}
   else if (ud4 == 0xffff && ud3 == 0xffff && !(ud2 & 0x8000) && ud1 == 0)
 {
   /* lis; xoris */
@@ -10263,10 +10271,28 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
   temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
 
   gcc_assert (ud2 & 0x8000);
-  emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
-  if (ud1 != 0)
-   emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
-  emit_move_insn (dest, gen_rtx_AND (DImode, temp, GEN_INT (0xffffffff)));
+
+  if (ud1 == 0)
+   {
+ /* lis; rldicl */
+ emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
+ emit_move_insn (dest,
+ gen_rtx_AND (DImode, temp, GEN_INT (0xffffffff)));
+   }
+  else if (!(ud1 & 0x8000))
+   {
+ /* li; oris */
+ emit_move_insn (temp, GEN_INT (ud1));
+ emit_move_insn (dest,
+ gen_rtx_IOR (DImode, temp, GEN_INT (ud2 << 16)));
+   }
+  else
+   {
+ emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
+ emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
+ emit_move_insn (dest,
+ gen_rtx_AND (DImode, temp, GEN_INT (0xffffffff)));
+   }
 }
   else if (ud1 == ud3 && ud2 == ud4)
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.c b/gcc/testsuite/gcc.target/powerpc/pr106708.c
index dd0386c109c..fd1e1cab8fc 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106708.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106708.c
@@ -3,7 +3,8 @@
 /* { dg-options "-O2 -mno-prefixed -save-temps" } */
 /* { dg-require-effective-target has_arch_ppc64 } */
 
-long long arr[] = {0xffffffff65430000LL};
+long long arr[]
+  = {0xffffffff65430000LL, 0xffffffff7cdeab55LL, 0x98765432LL, 0xabcd0000LL};
 void __attribute__ ((__noipa__)) lisxoris (long long *arg)
 {
   *arg = 0xffffffff65430000LL;
@@ -11,12 +12,36 @@ void __attribute__ ((__noipa__)) lisxoris (long long *arg)
 /* { dg-final { scan-assembler-times {\mlis .*,0xe543\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mxoris .*0x8000\M} 1 } } */
 
+void __attribute__ ((__noipa__)) lixoris (long long *arg)
+{
+  *arg = 0xffffffff7cdeab55LL;
+}
+/* { dg-final { scan-assembler-times {\mli .*,-21675\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxoris .*0x8321\M} 1 } } */
+
+void __attribute__ ((__noipa__)) lioris (long long *arg)
+{
+  *arg = 0x98765432LL;
+}
+/* { dg-final { scan-assembler-times {\mli .*,21554\M} 1 } } */
+/* { dg-final { scan-assembler-times {\moris .*0x9876\M} 1 } } */
+
+void __attribute__ ((__noipa__)) lisrldicl (long long *arg)
+{
+  *arg = 0xabcd0000LL;
+}
+/* { dg-final { scan-assembler-times {\mlis .*,0xabcd\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mrldicl .*,0,32\M} 1 } } */
+
 int
 main ()
 {
   long long a[sizeof (arr) / sizeof (arr[0])];
 
   lisxoris (a);
+  lixoris (a + 1);  
+  lioris (a + 2);
+  lisrldicl (a + 3);
   if (__builtin_memcmp (a, arr, sizeof (arr)) != 0)
 __builtin_abort ();
   return 0;
-- 
2.17.1


