Re: [PATCH] ipa-cp: Avoid adjusting references through self-recursion (PR 104813)

2022-03-09 Thread Martin Liška

On 3/8/22 22:51, Martin Jambor wrote:

Hi,

when writing the patch that downgrades address-taken references to
load references when IPA-CP can prove that all uses of the taken
address ends up in loads, I unfortunately did not take into account
that find_more_scalar_values_for_callers_subset now happily adds
self-recursive edges to the set of callers which should be immediately
redirected (originally recursion was meant to be handled as edge
redirection in a second pass over the SCC).

The code as it is can now decrement the referece counters too many
times.  This can remedied by removing self-recursive edges earlier, we
already do it because of thunk expansion issues, and so this patch
does exactly that.

Bootstrapped and LTO-bootstrapped and tested on x86_64-linux.  OK for
master?


Yes, thanks.

Martin



Thanks,

Martin


gcc/ChangeLog:

2022-03-07  Martin Jambor  

PR ipa/104813
* ipa-cp.cc (create_specialized_node): Move removal of
self-recursive calls from callers vector before refrence
adjustments.

gcc/testsuite/ChangeLog:

2022-03-07  Martin Jambor  

PR ipa/104813
* gcc.dg/ipa/pr104813.c: New test.
---
  gcc/ipa-cp.cc   | 20 +-
  gcc/testsuite/gcc.dg/ipa/pr104813.c | 32 +
  2 files changed, 42 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr104813.c

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 453e9c93cc3..18047c209a8 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -5099,6 +5099,16 @@ create_specialized_node (struct cgraph_node *node,
else
  new_adjustments = NULL;
  
+  auto_vec self_recursive_calls;

+  for (i = callers.length () - 1; i >= 0; i--)
+{
+  cgraph_edge *cs = callers[i];
+  if (cs->caller == node)
+   {
+ self_recursive_calls.safe_push (cs);
+ callers.unordered_remove (i);
+   }
+}
replace_trees = cinfo ? vec_safe_copy (cinfo->tree_map) : NULL;
for (i = 0; i < count; i++)
  {
@@ -5129,16 +5139,6 @@ create_specialized_node (struct cgraph_node *node,
if (replace_map)
vec_safe_push (replace_trees, replace_map);
  }
-  auto_vec self_recursive_calls;
-  for (i = callers.length () - 1; i >= 0; i--)
-{
-  cgraph_edge *cs = callers[i];
-  if (cs->caller == node)
-   {
- self_recursive_calls.safe_push (cs);
- callers.unordered_remove (i);
-   }
-}
  
unsigned &suffix_counter = clone_num_suffixes->get_or_insert (

   IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (
diff --git a/gcc/testsuite/gcc.dg/ipa/pr104813.c 
b/gcc/testsuite/gcc.dg/ipa/pr104813.c
new file mode 100644
index 000..34f413e3823
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr104813.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O3"  } */
+
+int a, b, c, d, *e;
+void f(int h) {
+  if (b) {
+int g;
+while (g++)
+  d = *e;
+e++;
+  }
+}
+static void i();
+static void j(int *h, int k, int *l) {
+  if (c) {
+int *o = h, m;
+f(*l);
+i(m);
+j(o, 1, o);
+for (;;)
+  ;
+  }
+}
+void i() {
+  int *n = &a;
+  while (1)
+j(n, 1, n);
+}
+int main() {
+  j(&a, 0, &a);
+  return 0;
+}




Re: [x86 PATCH] PR tree-optimization/98335: New peephole2 xorl;movb -> movzbl

2022-03-09 Thread Uros Bizjak via Gcc-patches
On Wed, Mar 9, 2022 at 7:10 PM Roger Sayle  wrote:
>
>
> Hi Uros,
> Firstly many thanks for already (pre)approving half of this patch.  Jakub 
> Jelinek's
> suggestion for creating a testcase that exposes the SImode issue was extremely
> helpful, but interestingly exposed another missed optimization opportunity 
> that's
> also addressed with this revision of my patch.
>
> char c;
> union Data { char a; short b; };
> void val(void) { __asm__ __volatile__ ("" : : "r" ((union Data) { c } )); }
>
> currently on x86_64 with -O2 on mainline generates following code:
>
> xorl%eax, %eax
> movb$0, %ah
> movbc(%rip), %al
> ret
>
> which contains the strict_low_part movb following an SI mode xor that we hope
> to optimize, but infuriatingly we also have a completely redundant movb $0, 
> %ah.
> Hence, with this patch we have a new peephole2 that sees that in the first two
> instructions the movb is redundant, which then allows the SImode/SWI48 xor
> followed by movb peephole2 to optimize the rest to movzbl.  With the revised
> patch below, the above testcase is now compiled as:
>
> movzbl  c(%rip), %eax
> ret
>
> Tested on x86_64-pc-linux-gnu (on top of the revised middle-end patch) with
> make bootstrap and make -k check with no new failures.  Is this revised 
> version
> (still) Ok for mainline?
>
>
> 2022-03-09  Roger Sayle  
>
> gcc/ChangeLog
> PR tree-optimization/98335
> * config/i386/i386.md (peephole2): Eliminate redundant insv.
> Combine movl followed by movb.  Transform xorl followed by
> a suitable movb or movw into the equivalent movz[bw]l.
>
> gcc/testsuite/ChangeLog
> PR tree-optimization/98335
> * g++.target/i386/pr98335.C: New test case.
> * gcc.target/i386/pr98335.c: New test case.

OK.

Thanks,
Uros.

>
>
> Thanks again for your help/suggested revisions.
> Roger
> --
>
> > -Original Message-
> > From: Uros Bizjak 
> > Sent: 09 March 2022 07:36
> > To: Roger Sayle 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] PR tree-optimization/98335: New peephole2
> > xorl;movb -> movzbl
> >
> > On Mon, Mar 7, 2022 at 12:51 PM Roger Sayle
> >  wrote:
> > >
> > >
> > > Hi Uros,
> > >
> > > > Is there a reason that only inserts to DImode registers are implemented?
> > > > IMO, these peepholes should also handle inserts to SImode.
> > >
> > > I wasn't able to construct a test case that produced a byte or word
> > > insert into an SImode register.  The front-ends and middle-end end up
> > > producing different code sequences, and -m32 changes the ABI so that
> > > small structs get passed in memory rather than in registers.
> > >
> > > Here's the expanded testcase that I investigated:
> > >
> > > struct DataCL { char a; int b; };
> > > struct DataWL { short a; int b; };
> > > struct DataIL { int a; int b; };
> > >
> > > struct DataCI { char a; short b; };
> > > struct DataWI { short a; short b; };
> > >
> > > char c;
> > > short w;
> > > int i;
> > >
> > > DataCL foo_cl(int idx) { return { c }; } DataCL bar_cl(int idx) {
> > > return { c, 0 }; } DataWL foo_wl(int idx) { return { w }; } DataWL
> > > bar_wl(int idx) { return { w, 0 }; } DataIL foo_il(int idx) { return {
> > > i }; } DataIL bar_il(int idx) { return { i, 0 }; }
> > >
> > > DataCI foo_ci(int idx) { return { c }; } DataCI bar_ci(int idx) {
> > > return { c, 0 }; } DataWI foo_wi(int idx) { return { w }; } DataWI
> > > bar_wi(int idx) { return { w, 0 }; }
> > >
> > >
> > > I agree that for completeness similar peepholes handling inserts into
> > > SImode would be a good thing, but these wouldn't be restricted by
> > > TARGET_64BIT and would therefore require additional -m32 testing.
> > > The DImode peepholes I can justify for stage4 as PR tree-opt/98335 is
> > > a regression, SImode peepholes would be more of a "leap of faith".
> > > If you’d be willing to accept a patch without a testcase, let me know.
> >
> > We have plenty of these, where we assume that if the pattern works in one
> > mode, it also works in other modes. So, I think that by changing DI mode to
> > SWI48 mode iterator, we are on the safe side. We can also trust bootstrap 
> > and
> > regression tests here.
> >
> > IMO, the patch with SWI48 mode iterator is OK.
> >
> > Thanks,
> > Uros.
> >
> > >
> > > It's also a pity that subreg handling in combine doesn't allow merging
> > > these inserts into zero registers to be combined to zero_extends in a
> > > machine independent way.  My recent patch for PR 95126 (awaiting
> > > review) should also allow front-ends and middle-end passes more
> > > flexibility in optimizing small struct constructors.
> > >
> > > Thanks (as always) for reviewing patches so quickly.
> > >
> > > Roger
> > > --
>
>


Re: [PATCH] Check if loading const from mem is faster

2022-03-09 Thread Richard Biener via Gcc-patches
On Thu, 10 Mar 2022, Jiufu Guo wrote:

> 
> Hi!
> 
> Richard Biener  writes:
> 
> > On Wed, 9 Mar 2022, Jiufu Guo wrote:
> >
> >> 
> >> Hi!
> >> 
> >> Richard Biener  writes:
> >> 
> >> > On Tue, 8 Mar 2022, Jiufu Guo wrote:
> >> >
> >> >> Jiufu Guo  writes:
> >> >> 
> >> >> Hi!
> >> >> 
> >> >> > Hi Sehger,
> >> >> >
> >> >> > Segher Boessenkool  writes:
> >> >> >
> >> >> >> On Tue, Mar 01, 2022 at 10:28:57PM +0800, Jiufu Guo wrote:
> >> >> >>> Segher Boessenkool  writes:
> >> >> >>> > No.  insn_cost is only for correct, existing instructions, not for
> >> >> >>> > made-up nonsense.  I created insn_cost precisely to get away from 
> >> >> >>> > that
> >> >> >>> > aspect of rtx_cost (and some other issues, like, it is incredibly 
> >> >> >>> > hard
> >> >> >>> > and cumbersome to write a correct rtx_cost).
> >> >> >>> 
> >> >> >>> Thanks! The implementations of hook insn_cost are align with this
> >> >> >>> design, they are  checking insn's attributes and COSTS_N_INSNS.
> >> >> >>> 
> >> >> >>> One question on the speciall case: 
> >> >> >>> For instruction: "r119:DI=0x100803004101001"
> >> >> >>> Would we treat it as valid instruction?
> >> >> >>
> >> >> >> Currently we do, alternative 6 in *movdi_internal64: we allow any 
> >> >> >> r<-n.
> >> >> >> This is costed as 5 insns (cost=20).
> >> >> >>
> >> >> >> It generally is better to split things into patterns close to the
> >> >> >> eventual machine isntructions as early as possible: all the more 
> >> >> >> generic
> >> >> >> optimisations can take advantage of that then.
> >> >> > Get it!
> >> >> >>
> >> >> >>> A patch, which is attached the end of this mail, accepts
> >> >> >>> "r119:DI=0x100803004101001" as input of insn_cost.
> >> >> >>> In this patch, 
> >> >> >>> - A tmp instruction is generated via make_insn_raw.
> >> >> >>> - A few calls to rtx_cost (in cse_insn) is replaced by insn_cost.
> >> >> >>> - In hook of insn_cost, checking the special 'constant' instruction.
> >> >> >>> Are these make sense?
> >> >> >>
> >> >> >> I'll review that patch inline.
> >> >> 
> >> >> I drafted a new patch that replace rtx_cost with insn_cost for cse.cc.
> >> >> Different from the previous partial patch, this patch replaces all usage
> >> >> of rtx_cost. It may be better/aggressive than previous one.
> >> >
> >> > I think there's no advantage for using insn_cost over rtx_cost for
> >> > the simple SET case.
> >> 
> >> Thanks for your comments and raise this concern.
> >> 
> >> For those targets which do not implement insn_cost, insn_cost calls
> >> rtx_cost through pattern_cost, then insn_cost is equal to rtx_cost.
> >> 
> >> While, for those targets which have insn_cost, it seems insn_cost would
> >> be better(or say more accurate/consistent?) than rtx_cost. Since:
> >> - insn_cost recog the insn first, and compute cost through something
> >
> > target hooks are expected to call recog on the insn but the generic
> > fallback does not!?  Or do you say a target _could_ call recog?
> > I think it would be valid to only expect recognized insns here
> > and thus cse.cc obligation would be to call regoc on the "fake"
> > instruction which then raises the obvious issue whether you should
> > ever call recog on something "fake".
> Thanks Richard! I also feel this is a main point of insn_cost.
> From my understanding: it would be better to let insn_cost check
> the valid recognizable insns; this would be the major purpose of
> insn_cost.  While, I'm also wondering, we may let it go for 'fake'
> instruction for current implementations (e.g. arm_insn_cost) which
> behaviors to walk/check pattern. 

Note for the CSE case you'll always end up with the single move
pattern so it's somewhat pointless to do the recog & insn_cost
dance there.  For move cost using rtx_cost (SET, ..) should
be good enough.  One could argue we should standardize a (wrapping)
API like move_cost (enum machine_mode mode, rtx to, rtx from),
with to/from optional (if omitted a REG of MODE).  But the existing
rtx_cost target hook implementation should be sufficient to handle
it, without building a fake insn and without doing (the pointless,
if not failing) recog process.

Richard.

> >
> > I also see that rs6000_insn_cost does
> >
> > static int
> > rs6000_insn_cost (rtx_insn *insn, bool speed)
> > {
> >   if (recog_memoized (insn) < 0)
> > return 0;
> >
> > so not recognized insns become quite cheap - it looks like
> > insn_cost has no documented failure mode and initial implementors
> > chickened out asserting that an insn is / can be recognized.
> > In fact I'd assert that INSN_CODE (insn) >= 0 simply because there's
> > no failure mode and the _caller_ should have made sure the
> > insn can be recognized.  That said, the insn_cost should do that.
> Thanks! Using the assert would help to catch un-recog failures.
> 
> > (is INSN_CODE () == 0 a real thing?)
> 0 is specialy, it is some thing like CODE_FOR_nothing. :-)
> 
> >
> >> (like length/cost attributes from .md file) for the 'machine insn'.
>

RE: [PATCH v2] PR tree-optimization/98335: Improvements to DSE's compute_trims.

2022-03-09 Thread Roger Sayle


Hi Richard,

> 2022-03-09  Roger Sayle  
> Richard Biener  
> 
> gcc/ChangeLog
>   PR tree-optimization/98335
>   * builtins.cc (get_object_alignment_2): Export.
>   * builtins.h (get_object_alignment_2): Likewise.
>   * tree-ssa-alias.cc (ao_ref_alignment): New.
>   * tree-ssa-alias.h (ao_ref_alignment): Declare.
> 
>   * tree-ssa-dse.cc (compute_trims): Improve logic deciding whether
>   to align head/tail, writing more bytes but using fewer store insns.

>   (maybe_trim_memstar_call): Silence compiler warnings by using
>   memset to initialize lendata.

Please ignore this final change to maybe_trim_memstar_call, which I've
now removed from the patch.  Jason Merrill has now explained that these
warnings are issues with the (RedHat) system compiler on my host system
rather than code cleanliness issues with GCC source tree.  I just need to
ignore these compiler warnings in stage1 of a bootstrap/build, even in the
source files I'm changing.  Sorry for the noise.  Hopefully the rest looks good.

Cheers,
Roger
--




PATCH, rs6000] Add V1TI into vector comparison expand [PR103316]

2022-03-09 Thread HAO CHEN GUI via Gcc-patches
Hi,
   This patch adds V1TI mode into mode iterator used in vector comparison
expands.With the patch, both built-ins and direct comparison could generate
P10 new V1TI comparison instructions.

   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. Is
this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-03-09 Haochen Gui 

gcc/
PR target/103316
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Enable
gimple folding for RS6000_BIF_VCMPEQUT, RS6000_BIF_VCMPNET,
RS6000_BIF_CMPGE_1TI, RS6000_BIF_CMPGE_U1TI, RS6000_BIF_VCMPGTUT,
RS6000_BIF_VCMPGTST, RS6000_BIF_CMPLE_1TI, RS6000_BIF_CMPLE_U1TI.
* config/rs6000/vector.md (VEC_IC): Define. Add support for new Power10
V1TI instructions.
(vec_cmp): Set mode iterator to VEC_IC.
(vec_cmpu): Likewise.

gcc/testsuite/
PR target/103316
* gcc.target/powerpc/pr103316.c: New.

patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 5d34c1bcfc9..143effa89bf 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1994,6 +1994,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case RS6000_BIF_VCMPEQUH:
 case RS6000_BIF_VCMPEQUW:
 case RS6000_BIF_VCMPEQUD:
+case RS6000_BIF_VCMPEQUT:
 /* We deliberately omit RS6000_BIF_VCMPEQUT for now, because gimple
folding produces worse code for 128-bit compares.  */
   fold_compare_helper (gsi, EQ_EXPR, stmt);
@@ -2002,6 +2003,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case RS6000_BIF_VCMPNEB:
 case RS6000_BIF_VCMPNEH:
 case RS6000_BIF_VCMPNEW:
+case RS6000_BIF_VCMPNET:
 /* We deliberately omit RS6000_BIF_VCMPNET for now, because gimple
folding produces worse code for 128-bit compares.  */
   fold_compare_helper (gsi, NE_EXPR, stmt);
@@ -2015,6 +2017,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case RS6000_BIF_CMPGE_U4SI:
 case RS6000_BIF_CMPGE_2DI:
 case RS6000_BIF_CMPGE_U2DI:
+case RS6000_BIF_CMPGE_1TI:
+case RS6000_BIF_CMPGE_U1TI:
 /* We deliberately omit RS6000_BIF_CMPGE_1TI and RS6000_BIF_CMPGE_U1TI
for now, because gimple folding produces worse code for 128-bit
compares.  */
@@ -2029,6 +2033,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case RS6000_BIF_VCMPGTUW:
 case RS6000_BIF_VCMPGTUD:
 case RS6000_BIF_VCMPGTSD:
+case RS6000_BIF_VCMPGTUT:
+case RS6000_BIF_VCMPGTST:
 /* We deliberately omit RS6000_BIF_VCMPGTUT and RS6000_BIF_VCMPGTST
for now, because gimple folding produces worse code for 128-bit
compares.  */
@@ -2043,6 +2049,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case RS6000_BIF_CMPLE_U4SI:
 case RS6000_BIF_CMPLE_2DI:
 case RS6000_BIF_CMPLE_U2DI:
+case RS6000_BIF_CMPLE_1TI:
+case RS6000_BIF_CMPLE_U1TI:
 /* We deliberately omit RS6000_BIF_CMPLE_1TI and RS6000_BIF_CMPLE_U1TI
for now, because gimple folding produces worse code for 128-bit
compares.  */
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index b87a742cca8..1afb8a6d786 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -26,6 +26,9 @@
 ;; Vector int modes
 (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])

+;; Vector int modes for comparison
+(define_mode_iterator VEC_IC [V16QI V8HI V4SI V2DI V1TI])
+
 ;; 128-bit int modes
 (define_mode_iterator VEC_TI [V1TI TI])

@@ -533,11 +536,12 @@ (define_expand "vcond_mask_"

 ;; For signed integer vectors comparison.
 (define_expand "vec_cmp"
-  [(set (match_operand:VEC_I 0 "vint_operand")
+  [(set (match_operand:VEC_IC 0 "vint_operand")
(match_operator 1 "signed_or_equality_comparison_operator"
- [(match_operand:VEC_I 2 "vint_operand")
-  (match_operand:VEC_I 3 "vint_operand")]))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
+ [(match_operand:VEC_IC 2 "vint_operand")
+  (match_operand:VEC_IC 3 "vint_operand")]))]
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) && mode!= V1TImode)
+   || (mode == V1TImode && TARGET_POWER10)"
 {
   enum rtx_code code = GET_CODE (operands[1]);
   rtx tmp = gen_reg_rtx (mode);
@@ -573,11 +577,12 @@ (define_expand "vec_cmp"

 ;; For unsigned integer vectors comparison.
 (define_expand "vec_cmpu"
-  [(set (match_operand:VEC_I 0 "vint_operand")
+  [(set (match_operand:VEC_IC 0 "vint_operand")
(match_operator 1 "unsigned_or_equality_comparison_operator"
- [(match_operand:VEC_I 2 "vint_operand")
-  (match_operand:VEC_I 3 "vint_operand")]))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
+ [(match_operand:VEC_IC 2 "vint_operand")
+  (match_operand:VEC_IC 3 "vint_operand")]))]
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) && mode != V1TImode)
+   || (mode == V1TImode && TARGET_POWER10)"
 {
   enum rtx_code code = GET_COD

Re: [PATCH v8 04/12] LoongArch Port: Machine description files.

2022-03-09 Thread 程璐璐

Hi,

   We are modifying the code, this support will be

added in the next commit.

Thanks.

在 2022/3/8 上午4:15, Xi Ruoyao 写道:

On Fri, 2022-03-04 at 15:18 +0800, xucheng...@loongson.cn wrote:


 * config/loongarch/loongarch.md: New file.

An ICE happens building OpenSSH-8.9p1.  Investigation shows it's caused
by the flag "-fzero-call-used-regs=".  It's because the compiler
attempts to clear FCCx registers but can't figure out how.

This flag also triggers ICE for other targets (for example, PR 104820
for MIPS), and the related tests (zero-scratch-regs-{8,9,10,11}.c) are
marked dg-skip for many targets.

But it's unfortunate that packages like OpenSSH have already start to
use this flag... I guess they just enabled it once they saw it was
working for i386 :(.  So it's better to solve the problem for a new
target.

A "quick fix" is adding an insn to clear FCCx.  This is enough to build
OpenSSH and make zero-scratch-regs-{8,9,10,11}.c PASS.

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index a9a8bc4b038..76c5ded9fe4 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2020,6 +2020,12 @@
DONE;
  })
  
+;; Clear one FCC register

+(define_insn "movfcc" [(set (match_operand:FCC 0 "register_operand" "=z")
+ (const_int 0))]
+  ""
+  "movgr2cf\t%0,$r0")
+
  ;; Conditional move instructions.
  
  (define_insn "*sel_using_"


Re: [PATCH] c++: PR c++/95999: Improved error recovery in enumeration lists.

2022-03-09 Thread Jason Merrill via Gcc-patches

On 2/21/22 08:17, Roger Sayle wrote:


This patch resolves PR c++/95999 which is an ICE-after-error regression
in the g++ front-end.  When parsing an enumerator list, the C++ parser
assumes that cp_parser_constant_expression always returns either an
INTEGER_CST or error_mark_node, but in testcase reported in the PR
actually returns a VAR_DECL.

The usual (but perhaps controversial) design philosophy is that the
routine that reports the error normally has a duty to indicate this to
the rest of the compiler (via error_mark_node), but here the return
value from calling require_rvalue_constant_expression (parser.cc:10666)
is ignored.  I initially experimented with setting EXPRESSION to
error_mark_node here in cp_parser_constant_expression but (perhaps
conveniently) that's insufficient to resolve the problem.  The simple
fix in this patch is to tweak the two places that require INTEGER_CST
to treat all other tree types as though they are error_mark_node.

This patch has been tested on x86_64-pc-linunx-gnu with make bootstrap
and make -k check with no new (unexpected) failures.  Ok for mainline?


OK.


2022-02-21  Roger Sayle  

gcc/cp/ChangeLog
PR c++/95999
* decl.cc (finish_enum_value_list): If VALUE isn't an INTEGER_CST
consider it to be zero (i.e. treat it like error_mark_node).
(build_enumerator): Likewise, if PREV_VALUE isn't an INTEGER_CST,
set VALUE to error_mark_node.

gcc/testsuite/ChangeLog
PR c++/95999
* g++.dg/pr95999.c: New test case.


Thanks in advance,
Roger
--





Re: [C++ PATCH] PR c++/96442: Another improved error recovery in enumerations.

2022-03-09 Thread Jason Merrill via Gcc-patches

On 2/22/22 08:02, Roger Sayle wrote:


This patch resolves PR c++/96442, another ICE-after-error regression.
In this case, invalid code attempts to use a non-integral type as the
underlying type for an enumeration (a record_type in the example given
in the bugzilla PR), for which the parser emits an error message but
allows the inappropriate type to leak to downstream code.


How does that happen?

Would it help to change dependent_type_p in start_enum to WILDCARD_TYPE_P?


The minimal
safe fix is to double check that the enumeration's underlying type
EUTYPE satisfies INTEGRAL_TYPE_P before calling int_fits_type_p in
build_enumerator.  This is a one line fix, but correcting indentation
and storing a common subexpression in a variable makes the change look
a little bigger.

This patch has been tested on x86_64-pc-linunx-gnu with make bootstrap
and make -k check with no new (unexpected) failures.  Ok for mainline?


2022-02-22  Roger Sayle  

gcc/cp/ChangeLog
PR c++/96442
* decl.cc (build_enumeration): Check ENUM_UNDERLYING_TYPE is
INTEGRAL_TYPE_P before calling int_fits_type_p.

gcc/testsuite/ChangeLog
PR c++/96442
* g++.dg/pr96442.C: New test cae.


Thanks in advance,
Roger
--





Re: [PATCH] PR c++/39751: ICE-on-invalid parsing regression.

2022-03-09 Thread Jason Merrill via Gcc-patches

On 2/26/22 19:55, Roger Sayle wrote:


This is a fix for PR c++/39751 which is an ICE-on-invalid regression in
the C++ parser after encountering the end of file.  The one line change
is to check that the tokens cached in DECL_PENDING_INLINE_INFO haven't
been purged before processing them in cp_parser_late_parsing_for_member.

Alas in addition to the one line fix (and new test case), I've also
taken the opportunity to silence the -Wmissing-field-initializers
warnings compiling this source file, by replacing the " = { };" with
explicit calls to memset to initialize/reset structures.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new (unexpected) failures.  Ok for mainline?


2022-02-26  Roger Sayle  

gcc/cp/ChangeLog
PR c++/39751
* parser.cc (cp_parser_late_parsing_for_member): Confirm the token
stream hasn't been purged before processing DECL_PENDING_INLINE.


This is OK.


(cp_parser_handle_statement_omp_attributes): Silence compilation
warnings using memset to initialize structure.
(cp_parser_late_parsing_omp_declare_simd): Likewise.


These are not; I don't want to uglify the code to satisfy a broken 
warning.  The warning is documented to allow this pattern:


 Likewise, in C++ this option does not warn about the empty { }
 initializer, for example:

  struct s { int f, g, h; };
  s x = { };

So all the instances you changed indicate a bug in the warning that 
should be fixed rather than worked around.


Even if this were OK, it's completely unrelated to PR39751, so should 
have been a separate patch.



+++ b/gcc/testsuite/g++.dg/pr39751.C


Should go in g++.dg/parse/


+/* { dg-options "-O2" } */


Unneeded.

Jason



Re: [PATCH] c++: naming a dependently-scoped template for CTAD [PR104641]

2022-03-09 Thread Jason Merrill via Gcc-patches

On 3/9/22 10:39, Patrick Palka wrote:

On Tue, 8 Mar 2022, Jason Merrill wrote:


On 3/2/22 14:32, Patrick Palka wrote:

In order to be able to perform CTAD for a dependently-scoped template
such as A::B in the testcase below, we need to permit a
typename-specifier to resolve to a template as per [dcl.type.simple]/2,
at least when it appears in a CTAD-enabled context.

This patch implements this using a new tsubst flag tf_tst_ok to control
when a TYPENAME_TYPE is allowed to name a template, and sets this flag
when substituting into the type of a CAST_EXPR, CONSTRUCTOR or VAR_DECL
(each of which is a CTAD-enabled context).


What breaks if we always allow that, or at least in -std that support CTAD?


AFAICT no significant breakage, but some accepts-invalid and diagnostic
regressions crop up, e.g. accepts-invalid for

   using type = typename A::B; // no more diagnostic if typename resolves to 
a
  // template at instantiation time

and diagnostic regression for

   template::B> void f();
   // no more elaboration why deduction failed if typename resolves
   // to a template


Ah, sure, the cost is that we would need to check for this case in 
various callers, rather than setting a flag in a different set of 
callers.  Fair enough.



@@ -16229,6 +16237,12 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
  }
  }
 
+	if (TREE_CODE (f) == TEMPLATE_DECL)

+ {
+   gcc_checking_assert (tst_ok);
+   f = make_template_placeholder (f);
+ }


How about calling make_template_placeholder in make_typename_type?

Jason



Re: [PATCH v2] c++: Don't allow type-constraint auto(x) [PR104752]

2022-03-09 Thread Jason Merrill via Gcc-patches

On 3/9/22 11:22, Marek Polacek wrote:

On Tue, Mar 08, 2022 at 02:47:42PM -0500, Jason Merrill wrote:

On 3/2/22 14:31, Marek Polacek wrote:

104752 points out that

template
concept C = true;
auto y = C auto(1);

is ill-formed as per [dcl.type.auto.deduct]: "For an explicit type conversion,
T is the specified type, which shall be auto." which doesn't allow
type-constraint auto.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/104752

gcc/cp/ChangeLog:

* semantics.cc (finish_compound_literal): Disallow auto{x} for
is_constrained_auto.
* typeck2.cc (build_functional_cast_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast12.C: New test.
---
   gcc/cp/semantics.cc| 1 +
   gcc/cp/typeck2.cc  | 1 +
   gcc/testsuite/g++.dg/cpp23/auto-fncast12.C | 8 
   3 files changed, 10 insertions(+)
   create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast12.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index a2c0eb050e6..5129b12f00f 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3148,6 +3148,7 @@ finish_compound_literal (tree type, tree compound_literal,
 /* C++23 auto{x}.  */
 else if (is_auto (type)
   && !AUTO_IS_DECLTYPE (type)
+  && !is_constrained_auto (type)
   && CONSTRUCTOR_NELTS (compound_literal) == 1)
   {
 if (cxx_dialect < cxx23)
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 39d03e4c3c4..c9314bbeb6f 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -2305,6 +2305,7 @@ build_functional_cast_1 (location_t loc, tree exp, tree 
parms,
init = parms;
 /* C++23 auto(x).  */
 else if (!AUTO_IS_DECLTYPE (anode)
+  && !is_constrained_auto (anode)
   && list_length (parms) == 1)
{
  init = TREE_VALUE (parms);


We might get a better diagnostic by moving this test inside the block and
saying that the auto can't be constrained, instead of "invalid use of auto"?


Sure.  So like this?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
104752 points out that

   template
   concept C = true;
   auto y = C auto(1);

is ill-formed as per [dcl.type.auto.deduct]: "For an explicit type conversion,
T is the specified type, which shall be auto." which doesn't allow
type-constraint auto.

PR c++/104752

gcc/cp/ChangeLog:

* semantics.cc (finish_compound_literal): Disallow auto{x} for
is_constrained_auto.
* typeck2.cc (build_functional_cast_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast12.C: New test.
---
  gcc/cp/semantics.cc| 8 +++-
  gcc/cp/typeck2.cc  | 8 +++-
  gcc/testsuite/g++.dg/cpp23/auto-fncast12.C | 8 
  3 files changed, 22 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast12.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 799ce943279..21740064d3d 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3150,7 +3150,13 @@ finish_compound_literal (tree type, tree 
compound_literal,
   && !AUTO_IS_DECLTYPE (type)
   && CONSTRUCTOR_NELTS (compound_literal) == 1)
  {
-  if (cxx_dialect < cxx23)
+  if (is_constrained_auto (type))
+   {
+ if (complain & tf_error)
+   error ("% cannot be constrained");
+ return error_mark_node;
+   }
+  else if (cxx_dialect < cxx23)
pedwarn (input_location, OPT_Wc__23_extensions,
 "% only available with "
 "%<-std=c++2b%> or %<-std=gnu++2b%>");
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 39d03e4c3c4..a4c825fc34d 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -2308,7 +2308,13 @@ build_functional_cast_1 (location_t loc, tree exp, tree 
parms,
   && list_length (parms) == 1)
{
  init = TREE_VALUE (parms);
- if (cxx_dialect < cxx23)
+ if (is_constrained_auto (anode))
+   {
+ if (complain & tf_error)
+   error_at (loc, "% cannot be constrained");
+ return error_mark_node;
+   }
+ else if (cxx_dialect < cxx23)
pedwarn (loc, OPT_Wc__23_extensions,
 "% only available with "
 "%<-std=c++2b%> or %<-std=gnu++2b%>");
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast12.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast12.C
new file mode 100644
index 000..2856c2846fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast12.C
@@ -0,0 +1,8 @@
+// PR c++/104752
+// { dg-do compile { target c++23 } }
+
+template
+concept C = true;
+auto x = auto(1); // valid (P0849R8)
+auto y = C auto(1);   // { dg-error "cannot be constrained" }
+auto z = C auto{1};   // { dg-error "cannot be constrained" }

base-commit: bded0d549fd

Re: [PATCH] c++: ICE with operator delete [PR104846]

2022-03-09 Thread Jason Merrill via Gcc-patches

On 3/9/22 14:09, Marek Polacek wrote:

This is an ICE-on-invalid with "auto operator delete[] (void *)" whose
return type must be void.  The return type is checked in coerce_delete_type
but we never got there in this test, because we took the wrong path in
grokdeclarator, set type to error_mark_node, ended up creating a FIELD_DECL
with build_decl, and confused grokmethod by giving it a FIELD_DECL.

Fixed by not taking the data member path for a FUNCTION_TYPE.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


PR c++/104846

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Check FUNC_OR_METHOD_TYPE_P before giving
data member errors.

gcc/testsuite/ChangeLog:

* g++.dg/init/delete5.C: New test.
---
  gcc/cp/decl.cc  | 2 +-
  gcc/testsuite/g++.dg/init/delete5.C | 8 
  2 files changed, 9 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/init/delete5.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 992e38385c2..12f2524aafe 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -13751,7 +13751,7 @@ grokdeclarator (const cp_declarator *declarator,
}
  else if (decl_context == FIELD)
{
-   if (!staticp && !friendp && TREE_CODE (type) != METHOD_TYPE)
+   if (!staticp && !friendp && !FUNC_OR_METHOD_TYPE_P (type))
  if (tree auto_node = type_uses_auto (type))
{
  location_t tloc = declspecs->locations[ds_type_spec];
diff --git a/gcc/testsuite/g++.dg/init/delete5.C 
b/gcc/testsuite/g++.dg/init/delete5.C
new file mode 100644
index 000..3555f43bbb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/delete5.C
@@ -0,0 +1,8 @@
+// PR c++/104846
+// { dg-do compile { target c++14 } }
+
+struct S {
+  auto operator delete (void *) {} // { dg-error ".operator delete. must return 
type .void'" }
+  auto operator delete[] (void *) {} // { dg-error ".operator delete. must return 
type .void'" }
+};
+

base-commit: bded0d549fdfdc1328b2c0189dc5f8593b4cbe42




Re: [C++ PATCH] PR c++/96437: ICE-on-invalid-code error recovery.

2022-03-09 Thread Jason Merrill via Gcc-patches

On 3/9/22 14:35, Roger Sayle wrote:


Hi Jason,

Very many thanks for your reviews/approvals of these ICE-on-invalid-code fixes.


+  if (TREE_VALUE (new_parm) != error_mark_node)
+DECL_VIRTUAL_P (TREE_VALUE (new_parm)) = true;


Hmm, I wonder about returning early if there was an error, but this is fine as 
is.


The challenge here is the need to perform "current_binding_level = entry_scope;"
required to restore the parser state before returning.  Now that the parser is
written in C++, we could (in theory) make more use of destructors to 
automatically
restore/pop state, making it safe(r) to just "return error_mark_node" 
immediately
in more places.


Yes, e.g.

auto ov = make_temp_override (current_binding_level);


+/* { dg-options "-O2" } */


This also seems unneeded for this test.


Doh! Force of habit (from working in the middle-end).  Consider this removed
from all the test cases below.

Is there any chance I could ask you to also look at these (that probably slipped
through the net), though PR 84964 might require a nod from a middle-end 
maintainer.

PR c++/39751: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590951.html
PR c++/84964: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590961.html
PR c++/95999: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590653.html
PR c++/96442: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590716.html


Will do.

Jason



Re: [PATCH v2] c++: ->template and implicit typedef [PR104608]

2022-03-09 Thread Jason Merrill via Gcc-patches

On 3/1/22 17:12, Marek Polacek wrote:

On Mon, Feb 28, 2022 at 12:31:37PM -0400, Jason Merrill wrote:

On 2/22/22 17:46, Marek Polacek wrote:

Here we have a forward declaration of Parameter for which we create
an implicit typedef, which is a TYPE_DECL.  Then, when looking it up
at template definition time, cp_parser_template_id gets (since r12-6754)
this TYPE_DECL which it can't handle.


Hmm, getting that global TYPE_DECL from lookup seems like a bug; isn't the
lookup earlier in cp_parser_template_name in object scope?


Yes, it is (in Function), but I think we do the pre-DR1835 lookup.  For

   this->template Parameter();

we don't find Parameter in the object expression's type (Function), so we do
unqualified lookup in the enclosing context and find the global Parameter
TYPE_DECL.  This is implemented in cp_parser_lookup_name:

   decl = lookup_member (object_type, name, ...);
   if (!decl)
  decl = lookup_name (name, ...);

[basic.lookup.qual.general] now says that we only perform unqualified lookup
if the object type isn't dependent.  But I don't think we can fix this by
implementing DR1835 because that would only help in C++23(?).

My v1 patch is wrong in any case; I've come up with template-keyword4.C
where we find a TYPE_DECL which is not an implicit typedef.

Since cp_parser_template_id is only able to handle these TYPE_DECLs:

18353   else if (TREE_CODE (templ) == TYPE_DECL
18354&& TREE_CODE (TREE_TYPE (templ)) == TYPENAME_TYPE)

this v2 patch fixes the problem by repeating lookup of TYPE_DECLs whose
TREE_TYPE is *not* TYPENAME_TYPE.  That fixes my testcases and doesn't
regress any.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
Here we have a forward declaration of Parameter for which we create
an implicit typedef, which is a TYPE_DECL.  Then, when looking it up
at template definition time, cp_parser_template_id gets (since r12-6754)
this TYPE_DECL which it can't handle.

This patch defers lookup for TYPE_DECLs that cp_parser_template_id can't
handle, a la r12-6879.

PR c++/104608

gcc/cp/ChangeLog:

* parser.cc (cp_parser_template_name): Repeat lookup of
TYPE_DECLs.

gcc/testsuite/ChangeLog:

* g++.dg/parse/template-keyword3.C: New test.
* g++.dg/parse/template-keyword4.C: New test.
---
  gcc/cp/parser.cc   |  5 -
  gcc/testsuite/g++.dg/parse/template-keyword3.C | 12 
  gcc/testsuite/g++.dg/parse/template-keyword4.C | 17 +
  3 files changed, 33 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/parse/template-keyword3.C
  create mode 100644 gcc/testsuite/g++.dg/parse/template-keyword4.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 03d99aba13e..02c34bf964e 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -18681,7 +18681,10 @@ cp_parser_template_name (cp_parser* parser,
  return error_mark_node;
}
else if ((!DECL_P (decl) && !is_overloaded_fn (decl))
-  || TREE_CODE (decl) == USING_DECL)
+  || TREE_CODE (decl) == USING_DECL
+  /* cp_parser_template_id can only handle some TYPE_DECLs.  */
+  || (TREE_CODE (decl) == TYPE_DECL
+  && TREE_CODE (TREE_TYPE (decl)) != TYPENAME_TYPE))
/* Repeat the lookup at instantiation time.  */
decl = identifier;
  }
diff --git a/gcc/testsuite/g++.dg/parse/template-keyword3.C 
b/gcc/testsuite/g++.dg/parse/template-keyword3.C
new file mode 100644
index 000..59fe0fc180b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/template-keyword3.C
@@ -0,0 +1,12 @@
+// PR c++/104608
+
+class Parameter;
+template  class Function
+: public R
+{
+Function();
+};
+template 
+Function::Function() {
+this->template Parameter();
+}
diff --git a/gcc/testsuite/g++.dg/parse/template-keyword4.C 
b/gcc/testsuite/g++.dg/parse/template-keyword4.C
new file mode 100644
index 000..c688094bcf2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/template-keyword4.C
@@ -0,0 +1,17 @@
+// PR c++/104608
+// { dg-do compile { target c++11 } }
+
+class S;
+using Parameter = S;
+typedef S Parameter2;
+
+template  class Function
+: public R
+{
+Function();
+};
+template 
+Function::Function() {
+this->template Parameter();
+this->template Parameter2();
+}

base-commit: 4a1c20df82c9e14478d79fbe1ae9690a36285ac1




Re: [PATCH] c++: fold calls to std::move/forward [PR96780]

2022-03-09 Thread Jason Merrill via Gcc-patches

On 3/1/22 18:08, Patrick Palka wrote:

A well-formed call to std::move/forward is equivalent to a cast, but the
former being a function call means it comes with bloated debug info, which
persists even after the call has been inlined away, for an operation that
is never interesting to debug.

This patch addresses this problem in a relatively ad-hoc way by folding
calls to std::move/forward into casts as part of the frontend's general
expression folding routine.  After this patch with -O2 and a non-checking
compiler, debug info size for some testcases decreases by about ~10% and
overall compile time and memory usage decreases by ~2%.


Impressive.  Which testcases?

Do you also want to handle addressof and as_const in this patch, as 
Jonathan suggested?


I think we can do this now, and think about generalizing more in stage 1.


Bootstrapped and regtested on x86_64-pc-linux-gnu, is this something we
want to consider for GCC 12?

PR c++/96780

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold) : When optimizing,
fold calls to std::move/forward into simple casts.
* cp-tree.h (is_std_move_p, is_std_forward_p): Declare.
* typeck.cc (is_std_move_p, is_std_forward_p): Export.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr96780.C: New test.
---
  gcc/cp/cp-gimplify.cc  | 18 ++
  gcc/cp/cp-tree.h   |  2 ++
  gcc/cp/typeck.cc   |  6 ++
  gcc/testsuite/g++.dg/opt/pr96780.C | 24 
  4 files changed, 46 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/opt/pr96780.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index d7323fb5c09..0b009b631c7 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -2756,6 +2756,24 @@ cp_fold (tree x)
  
  case CALL_EXPR:

{
+   if (optimize


I think this should check flag_no_inline rather than optimize.


+   && (is_std_move_p (x) || is_std_forward_p (x)))
+ {
+   /* When optimizing, "inline" calls to std::move/forward by
+  simply folding them into the corresponding cast.  This is
+  cheaper than relying on the inliner to do so, and also
+  means we avoid generating useless debug info for them at all.
+
+  At this point the argument has already been coerced into a
+  reference, so it suffices to use a NOP_EXPR to express the
+  reference-to-reference cast.  */
+   r = CALL_EXPR_ARG (x, 0);
+   if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
+ r = build_nop (TREE_TYPE (x), r);
+   x = cp_fold (r);
+   break;
+ }
+
int sv = optimize, nw = sv;
tree callee = get_callee_fndecl (x);
  
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h

index 37d462fca6e..ab828730b03 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8089,6 +8089,8 @@ extern tree finish_right_unary_fold_expr (tree, int);
  extern tree finish_binary_fold_expr  (tree, tree, int);
  extern tree treat_lvalue_as_rvalue_p   (tree, bool);
  extern bool decl_in_std_namespace_p(tree);
+extern bool is_std_move_p   (tree);
+extern bool is_std_forward_p(tree);
  
  /* in typeck2.cc */

  extern void require_complete_eh_spec_types(tree, tree);
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index bddc83759ad..a3644f8e7f7 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -62,8 +62,6 @@ static bool maybe_warn_about_returning_address_of_local 
(tree, location_t = UNKN
  static void error_args_num (location_t, tree, bool);
  static int convert_arguments (tree, vec **, tree, int,
tsubst_flags_t);
-static bool is_std_move_p (tree);
-static bool is_std_forward_p (tree);
  
  /* Do `exp = require_complete_type (exp);' to make sure exp

 does not have an incomplete type.  (That includes void types.)
@@ -10207,7 +10205,7 @@ decl_in_std_namespace_p (tree decl)
  
  /* Returns true if FN, a CALL_EXPR, is a call to std::forward.  */
  
-static bool

+bool
  is_std_forward_p (tree fn)
  {
/* std::forward only takes one argument.  */
@@ -10224,7 +10222,7 @@ is_std_forward_p (tree fn)
  
  /* Returns true if FN, a CALL_EXPR, is a call to std::move.  */
  
-static bool

+bool
  is_std_move_p (tree fn)
  {
/* std::move only takes one argument.  */
diff --git a/gcc/testsuite/g++.dg/opt/pr96780.C 
b/gcc/testsuite/g++.dg/opt/pr96780.C
new file mode 100644
index 000..ca24b2802bb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr96780.C
@@ -0,0 +1,24 @@
+// PR c++/96780
+// Verify calls to std::move/forward are folded away by the frontend.
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-O -fdump-tree-gimple" }
+
+#include 
+
+struct A;
+
+extern A& a;
+extern const A& ca;
+
+void f() {
+  auto&& x1 = std::move(a);
+  auto&& x2 = std::forward(a);
+  auto&& x3 = std::forward(

[PATCH, V2] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-03-09 Thread Michael Meissner via Gcc-patches
Eliminate power8 fusion options, use power8 tuning, PR target/102059

The power8 fusion support used to be set automatically when -mcpu=power8 or
-mtune=power8 was used, and it was cleared for other cpu's.  However, if you
used the target attribute or target #pragma to change the default cpu type or
tuning, you would get an error that a target specifiction option mismatch
occurred.

This occurred because the rs6000_can_inline_p function just compares the ISA
bits between the called inline function and the caller.  If the ISA flags of
the called function is not a subset of the ISA flags of the caller, we won't do
the inlinging.  When a power9 or power10 function inlines a function that is
explicitly compiled for power8, the power8 function has the power8 fusion bits
set and the power9 or power10 functions do not have the fusion bits set.

This code removes the -mpower8-fusion and -mpower8-fusion-sign options, and
only enables power8 fusion if we are tuning for a power8.  Power8 sign fusion
is only enabled if we are tuning for a power8 and we have -O3 optimization or
higher.

I left the options -mno-power8-fusion and -mno-power8-fusion-sign in rs6000.opt
and they don't issue a warning.  If the user explicitly used -mpower8-fusion or
-mpower8-fusion-sign, then they will get a warning that the swtich has been
removed.

Similarly, I left in the pragma target and attribute target support for the
fusion options, but they don't do anything now.  This is because I believe the
customer who encountered this problem now is explicitly setting the
no-power8-fusion option in the pragma or attribute to avoid the warning.

I have tested this on the following systems, and they all bootstraps fine and
there were no regressions in the test suite:

big endian power8 (both 64-bit and 32-bit)
little endian power9
little endian power10

Can I check this patch into the current master branch for GCC and after a
cooling period check in the patch to the GCC 11 and GCC 10 branches.  The
customer is currently using GCC 10.

2022-03-09   Michael Meissner  

gcc/
PR target/102059
* config/rs6000/rs6000-cpus.def (OTHER_FUSION_MASKS): Delete.
(ISA_3_0_MASKS_SERVER): Don't clear the fusion masks.
(POWERPC_MASKS): Remove OPTION_MASK_P8_FUSION.
* config/rs6000/rs6000.cc (rs6000_option_override_internal):
Delete code that set the power8 fusion options automatically.
(rs6000_opt_masks): Allow #pragma target and attribute target to set
power8-fusion and power8-fusion-sign, but these no longer represent
options that the user can set.
(rs6000_print_options_internal): Skip printing nop options.
* config/rs6000/rs6000.h (TARGET_P8_FUSION): New macro.
(TARGET_P8_FUSION_SIGN): Likewise.
(MASK_P8_FUSION): Delete.
* config/rs6000/rs6000.opt (-mpower8-fusion): Recognize the option but
ignore the no form and warn that the option was removed for the regular
form.
(-mpower8-fusion-sign): Likewise.
* doc/invoke.texi (RS/6000 and PowerPC Options): Delete -mpower8-fusion
and -mpower8-fusion-sign.

gcc/testsuite/
PR target/102059
* gcc.dg/lto/pr102059-1_0.c: Remove -mno-power8-fusion.
* gcc.dg/lto/pr102059-2_0.c: Likewise.
* gcc.target/powerpc/pr102059-3.c: Likewise.
* gcc.target/powerpc/pr102059-4.c: New test.
---
 gcc/config/rs6000/rs6000-cpus.def | 22 +++--
 gcc/config/rs6000/rs6000.cc   | 49 +--
 gcc/config/rs6000/rs6000.h| 14 +-
 gcc/config/rs6000/rs6000.opt  | 19 +--
 gcc/doc/invoke.texi   | 13 +
 gcc/testsuite/gcc.dg/lto/pr102059-1_0.c   |  2 +-
 gcc/testsuite/gcc.dg/lto/pr102059-2_0.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr102059-3.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 23 +
 9 files changed, 75 insertions(+), 71 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c

diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 963947f6939..a05b2d8c41a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -43,9 +43,7 @@
 | OPTION_MASK_ALTIVEC  \
 | OPTION_MASK_VSX)
 
-/* For now, don't provide an embedded version of ISA 2.07.  Do not set power8
-   fusion here, instead set it in rs6000.cc if we are tuning for a power8
-   system.  */
+/* For now, don't provide an embedded version of ISA 2.07.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
@@ -54,19 +52,14 @@
 | OPTION_MASK_QUAD_MEMORY  \
 

Re: [PATCH] Check if loading const from mem is faster

2022-03-09 Thread Jiufu Guo via Gcc-patches


Hi!

Richard Biener  writes:

> On Wed, 9 Mar 2022, Jiufu Guo wrote:
>
>> 
>> Hi!
>> 
>> Richard Biener  writes:
>> 
>> > On Tue, 8 Mar 2022, Jiufu Guo wrote:
>> >
>> >> Jiufu Guo  writes:
>> >> 
>> >> Hi!
>> >> 
>> >> > Hi Sehger,
>> >> >
>> >> > Segher Boessenkool  writes:
>> >> >
>> >> >> On Tue, Mar 01, 2022 at 10:28:57PM +0800, Jiufu Guo wrote:
>> >> >>> Segher Boessenkool  writes:
>> >> >>> > No.  insn_cost is only for correct, existing instructions, not for
>> >> >>> > made-up nonsense.  I created insn_cost precisely to get away from 
>> >> >>> > that
>> >> >>> > aspect of rtx_cost (and some other issues, like, it is incredibly 
>> >> >>> > hard
>> >> >>> > and cumbersome to write a correct rtx_cost).
>> >> >>> 
>> >> >>> Thanks! The implementations of hook insn_cost are align with this
>> >> >>> design, they are  checking insn's attributes and COSTS_N_INSNS.
>> >> >>> 
>> >> >>> One question on the speciall case: 
>> >> >>> For instruction: "r119:DI=0x100803004101001"
>> >> >>> Would we treat it as valid instruction?
>> >> >>
>> >> >> Currently we do, alternative 6 in *movdi_internal64: we allow any r<-n.
>> >> >> This is costed as 5 insns (cost=20).
>> >> >>
>> >> >> It generally is better to split things into patterns close to the
>> >> >> eventual machine isntructions as early as possible: all the more 
>> >> >> generic
>> >> >> optimisations can take advantage of that then.
>> >> > Get it!
>> >> >>
>> >> >>> A patch, which is attached the end of this mail, accepts
>> >> >>> "r119:DI=0x100803004101001" as input of insn_cost.
>> >> >>> In this patch, 
>> >> >>> - A tmp instruction is generated via make_insn_raw.
>> >> >>> - A few calls to rtx_cost (in cse_insn) is replaced by insn_cost.
>> >> >>> - In hook of insn_cost, checking the special 'constant' instruction.
>> >> >>> Are these make sense?
>> >> >>
>> >> >> I'll review that patch inline.
>> >> 
>> >> I drafted a new patch that replace rtx_cost with insn_cost for cse.cc.
>> >> Different from the previous partial patch, this patch replaces all usage
>> >> of rtx_cost. It may be better/aggressive than previous one.
>> >
>> > I think there's no advantage for using insn_cost over rtx_cost for
>> > the simple SET case.
>> 
>> Thanks for your comments and raise this concern.
>> 
>> For those targets which do not implement insn_cost, insn_cost calls
>> rtx_cost through pattern_cost, then insn_cost is equal to rtx_cost.
>> 
>> While, for those targets which have insn_cost, it seems insn_cost would
>> be better(or say more accurate/consistent?) than rtx_cost. Since:
>> - insn_cost recog the insn first, and compute cost through something
>
> target hooks are expected to call recog on the insn but the generic
> fallback does not!?  Or do you say a target _could_ call recog?
> I think it would be valid to only expect recognized insns here
> and thus cse.cc obligation would be to call regoc on the "fake"
> instruction which then raises the obvious issue whether you should
> ever call recog on something "fake".
Thanks Richard! I also feel this is a main point of insn_cost.
>From my understanding: it would be better to let insn_cost check
the valid recognizable insns; this would be the major purpose of
insn_cost.  While, I'm also wondering, we may let it go for 'fake'
instruction for current implementations (e.g. arm_insn_cost) which
behaviors to walk/check pattern. 

>
> I also see that rs6000_insn_cost does
>
> static int
> rs6000_insn_cost (rtx_insn *insn, bool speed)
> {
>   if (recog_memoized (insn) < 0)
> return 0;
>
> so not recognized insns become quite cheap - it looks like
> insn_cost has no documented failure mode and initial implementors
> chickened out asserting that an insn is / can be recognized.
> In fact I'd assert that INSN_CODE (insn) >= 0 simply because there's
> no failure mode and the _caller_ should have made sure the
> insn can be recognized.  That said, the insn_cost should do that.
Thanks! Using the assert would help to catch un-recog failures.

> (is INSN_CODE () == 0 a real thing?)
0 is specialy, it is some thing like CODE_FOR_nothing. :-)

>
>> (like length/cost attributes from .md file) for the 'machine insn'.
>> - rtx_cost estimates the cost through analyzing the 'rtx content'.
>> The accurate estimation relates to the context.
>> 
>> For a special example: "%r100 = C", as a previous patch, by tunning
>> target's rtx_cost hook, cost could be computed according to the value
>> of C. insn_cost may just model the cost in the define of the machine
>> instruction.
>> 
>> These reasons are my initial thoughts.  Segher may have better
>> explain. :-) 
>
> OK, so in theory the targets insn_cost can use insn attributes
> which allows to keep the cost parts of an instruction close to the
> instruction patterns which is probably a good thing for
> maintainability.  But as you point out this requires recognizing
> of possibly generated instructions.  And before reload it has
> the issue that the alternative is not determined which 

[PATCH] middle-end/104854: Avoid overread warning for strnlen and strndup

2022-03-09 Thread Siddhesh Poyarekar
The size argument larger than size of SRC for strnlen and strndup is
problematic only if SRC is not NULL terminated, which invokes undefined
behaviour.  In all other cases, as long as SRC is large enough to have a
NULL char (i.e. size 1 or more), a larger N should not invoke a warning
during compilation.

Such a warning may be a suitable check for the static analyzer instead
with slightly different wording suggesting that choice of size argument
makes the function call equivalent to strlen/strdup.

This change results in the following code going through without a
warning:

--
char *buf;

char *
foo (void)
{
  buf = __builtin_malloc (4);
  __builtin_memset (buf, 'A', 4);

  return __builtin_strndup (buf,  5);
}

int
main ()
{
  __builtin_printf ("%s\n", foo ());
}
--

but the problem above is a missing NULL, not N being larger than the
size of SRC and the overread warning in this context is confusing at
best and misleading (and hinting at the wrong solution) in the worst
case.

gcc/ChangeLog:

middle-end/104854
* gimple-ssa-warn-access.cc (check_access):
New parameter.  Skip warning if in read-only mode, source string
is NULL terminated and has non-zero object size.
(check_access): New parameter.
(check_access): Adjust.
(check_read_access): New parameter.  Adjust for check_access
change.
(pass_waccess::check_builtin): Adjust check_read_access call for
memcmp, memchr.
(pass_waccess::maybe_check_access_sizes): Likewise.

gcc/testsuite/ChangeLog:

middle-end/104854
* gcc.dg/Wstringop-overread.c
(test_strnlen_array, test_strndup_array): Don't expect warning
for non-zero source sizes.
* gcc.dg/attr-nonstring-4.c (strnlen_range): Likewise.
* gcc.dg/pr78902.c: Likewise.
* gcc.dg/warn-strnlen-no-nul.c: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
Tested with an x86_64 bootstrap.  strncmp has a similar issue, I'll post a
separate patch for it.

 gcc/gimple-ssa-warn-access.cc  | 35 ++
 gcc/testsuite/gcc.dg/Wstringop-overread.c  | 26 
 gcc/testsuite/gcc.dg/attr-nonstring-4.c|  2 +-
 gcc/testsuite/gcc.dg/pr78902.c |  1 -
 gcc/testsuite/gcc.dg/warn-strnlen-no-nul.c | 16 +-
 5 files changed, 45 insertions(+), 35 deletions(-)

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index c36cd5d45d4..972e80e4b62 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -1256,7 +1256,7 @@ static bool
 check_access (GimpleOrTree exp, tree dstwrite,
  tree maxread, tree srcstr, tree dstsize,
  access_mode mode, const access_data *pad,
- range_query *rvals)
+ range_query *rvals, bool null_terminated)
 {
   /* The size of the largest object is half the address space, or
  PTRDIFF_MAX.  (This is way too permissive.)  */
@@ -1431,6 +1431,15 @@ check_access (GimpleOrTree exp, tree dstwrite,
}
 }
 
+  /* For functions that take string inputs and stop reading on encountering a
+ NULL, if remaining size in the source is non-zero, it is legitimate for
+ such functions to pass a larger size (that perhaps is the maximum object
+ size of all possible inputs), making the MAXREAD comparison noisy.  */
+  if (null_terminated
+  && pad && pad->mode == access_read_only
+  && pad->src.size_remaining () != 0)
+return true;
+
   /* Check the maximum length of the source sequence against the size
  of the destination object if known, or against the maximum size
  of an object.  */
@@ -1522,10 +1531,10 @@ static bool
 check_access (gimple *stmt, tree dstwrite,
  tree maxread, tree srcstr, tree dstsize,
  access_mode mode, const access_data *pad,
- range_query *rvals)
+ range_query *rvals, bool null_terminated = true)
 {
   return check_access (stmt, dstwrite, maxread, srcstr, dstsize,
-mode, pad, rvals);
+mode, pad, rvals, null_terminated);
 }
 
 bool
@@ -1534,7 +1543,7 @@ check_access (tree expr, tree dstwrite,
  access_mode mode, const access_data *pad /* = NULL */)
 {
   return check_access (expr, dstwrite, maxread, srcstr, dstsize,
-mode, pad, nullptr);
+mode, pad, nullptr, true);
 }
 
 /* Return true if STMT is a call to an allocation function.  Unless
@@ -2109,7 +2118,8 @@ private:
   void check_stxncpy (gcall *);
   void check_strncmp (gcall *);
   void check_memop_access (gimple *, tree, tree, tree);
-  void check_read_access (gimple *, tree, tree = NULL_TREE, int = 1);
+  void check_read_access (gimple *, tree, tree = NULL_TREE, int = 1,
+ bool = true);
 
   void maybe_check_dealloc_call (gcall *);
   void maybe_check_access_sizes (rdwr_map *, tree, tre

[committed] libstdc++: Avoid implicit narrowing from uint128_t [PR104859]

2022-03-09 Thread Patrick Palka via Gcc-patches
We need to be explicit about narrowing conversions from uint128_t since,
on targets that lack __int128, this type is defined as an integer-class
type that is only _explicitly_ convertible to the builtin integer types.
This issue was latent until r12-7563-ge32869a17b788b made the frontend
correctly reject explicit conversion functions during (dependent)
copy-initialization.

Tested on x86_64-pc-linux-gnu using both possible definitions of uint128
(as an alias for unsigned __int128 and as the integer-class type), committed to
trunk as obvious.

PR libstdc++/104859

libstdc++-v3/ChangeLog:

* src/c++17/floating_to_chars.cc (__floating_to_chars_hex):
Be explicit when narrowing the shifted effective_mantissa,
since it may have an integer-class type.
---
 libstdc++-v3/src/c++17/floating_to_chars.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/src/c++17/floating_to_chars.cc 
b/libstdc++-v3/src/c++17/floating_to_chars.cc
index 5825e661bf4..66bd457cbe2 100644
--- a/libstdc++-v3/src/c++17/floating_to_chars.cc
+++ b/libstdc++-v3/src/c++17/floating_to_chars.cc
@@ -801,14 +801,14 @@ template
 char leading_hexit;
 if constexpr (has_implicit_leading_bit)
   {
-   const unsigned nibble = effective_mantissa >> rounded_mantissa_bits;
+   const auto nibble = unsigned(effective_mantissa >> 
rounded_mantissa_bits);
__glibcxx_assert(nibble <= 2);
leading_hexit = '0' + nibble;
effective_mantissa &= ~(mantissa_t{0b11} << rounded_mantissa_bits);
   }
 else
   {
-   const unsigned nibble = effective_mantissa >> (rounded_mantissa_bits-4);
+   const auto nibble = unsigned(effective_mantissa >> 
(rounded_mantissa_bits-4));
__glibcxx_assert(nibble < 16);
leading_hexit = "0123456789abcdef"[nibble];
effective_mantissa &= ~(mantissa_t{0b} << 
(rounded_mantissa_bits-4));
@@ -853,7 +853,7 @@ template
while (effective_mantissa != 0)
  {
nibble_offset -= 4;
-   const unsigned nibble = effective_mantissa >> nibble_offset;
+   const auto nibble = unsigned(effective_mantissa >> nibble_offset);
__glibcxx_assert(nibble < 16);
*first++ = "0123456789abcdef"[nibble];
++written_hexits;
-- 
2.35.1.415.gc2162907e9



Re: [PATCH] rs6000, v3: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Segher Boessenkool
On Wed, Mar 09, 2022 at 10:10:07PM +0100, Jakub Jelinek wrote:
> On Wed, Mar 09, 2022 at 02:57:20PM -0600, Segher Boessenkool wrote:
> > But __ibm128 should *always* be supported, so this is a hypothetical
> > problem.
> 
> I bet that will require much more work.  I think for the barely supported
> (or really unsupported?) case of old sysv IEEE quad

The "q" library routines are double-double.  On RIOS2 (POWER2) there
were "quad" instructions that worked on a pair of FP regs, but that was
handled as a pair of FP regs, and since 2012 we do not support POWER2
anymore anyway.

I have no clue if and when the "q_" library routines are used.  The do
take KFmode params (or TFmode if we use double-double preferably).

Or are you thinking of something else still?

> or for when long double
> is just DFmode we'd need some more (IFmode?) for __ibm128 and deal with
> everything it needs (mapping it to libgcc entrypoints etc.).

Well, the symbols would be called "tf", as all __ibm128 symbols are, for
backwards compatibility; all __ieee128 are "kf", no matter what.  This
is much simpler than the mess we have with internal modes :-)

It also is no big deal if you get link errors for missing libraries if
you try to use an unsupported configuration.  All the basic stuff will
still work.  This is similar in ways to what happens if you try to use
too old system libraries, or a too old binutils.

> > If you are fed up with all this, please commit what you have now (after
> > testing of course ;-) ), and I'll pick up things myself.  Either way,
> > thank you for all your work on this!
> 
> Ok, here is what I'll test momentarily:

Thanks again!


Segher


[committed] c: Revert C2x changes to function type compatibility

2022-03-09 Thread Joseph Myers
In commit cc806126215c3f4dc187eff3bf923458d8cc6b4f, I implemented
changes that C2x had made to compatibility of unprototyped and
prototyped function types.

C2x has since completely removed unprototyped function types, making
() in a function declaration mean (void) as in C++.  While that change
isn't appropriate at the current development stage for GCC 12, it does
mean that it doesn't make sense for GCC 12 to have different rules for
unprototyped functions in C2x mode than in other modes or previous and
subsequent GCC versions.  Thus, revert the previous change to avoid it
getting into a GCC release, and update the corresponding tests to
expect the same behavior with -std=c2x as with -std=c11 (they will of
course need to change again after implementing () as meaning (void)).

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c/
* c-typeck.cc (function_types_compatible_p): Do not handle C2X
differently from earlier standards for unprototyped function type
compatibility.

gcc/testsuite/
* gcc.dg/c11-unproto-1.c, gcc.dg/c11-unproto-2.c: Update comments.
* gcc.dg/c2x-unproto-1.c, gcc.dg/c2x-unproto-2.c: Expect same
results as in C11 mode.  Update comments.

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index b37f3cfcd8b..54b0b0d369b 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1693,7 +1693,7 @@ function_types_compatible_p (const_tree f1, const_tree f2,
 
   if (args1 == NULL_TREE)
 {
-  if (flag_isoc2x ? stdarg_p (f2) : !self_promoting_args_p (args2))
+  if (!self_promoting_args_p (args2))
return 0;
   /* If one of these types comes from a non-prototype fn definition,
 compare that with the other type's arglist.
@@ -1706,7 +1706,7 @@ function_types_compatible_p (const_tree f1, const_tree f2,
 }
   if (args2 == NULL_TREE)
 {
-  if (flag_isoc2x ? stdarg_p (f1) : !self_promoting_args_p (args1))
+  if (!self_promoting_args_p (args1))
return 0;
   if (TYPE_ACTUAL_ARG_TYPES (f2)
  && type_lists_compatible_p (args1, TYPE_ACTUAL_ARG_TYPES (f2),
diff --git a/gcc/testsuite/gcc.dg/c11-unproto-1.c 
b/gcc/testsuite/gcc.dg/c11-unproto-1.c
index ea9e807a68e..0949c7bc90c 100644
--- a/gcc/testsuite/gcc.dg/c11-unproto-1.c
+++ b/gcc/testsuite/gcc.dg/c11-unproto-1.c
@@ -1,6 +1,7 @@
-/* Test compatibility of unprototyped and prototyped function types (C2x makes
-   the case of types affected by default argument promotions compatible).  Test
-   valid-in-C2x usages are not accepted for C11.  */
+/* Test compatibility of unprototyped and prototyped function types (C2x made
+   the case of types affected by default argument promotions compatible, before
+   removing unprototyped functions completely).  Test affected usages are not
+   accepted for C11.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
 
diff --git a/gcc/testsuite/gcc.dg/c11-unproto-2.c 
b/gcc/testsuite/gcc.dg/c11-unproto-2.c
index 0557ae3f5cb..06da935336e 100644
--- a/gcc/testsuite/gcc.dg/c11-unproto-2.c
+++ b/gcc/testsuite/gcc.dg/c11-unproto-2.c
@@ -1,6 +1,7 @@
-/* Test compatibility of unprototyped and prototyped function types (C2x makes
-   the case of types affected by default argument promotions compatible).  Test
-   invalid-in-C2x usages, in C11 mode.  */
+/* Test compatibility of unprototyped and prototyped function types (C2x made
+   the case of types affected by default argument promotions compatible, before
+   removing unprototyped functions completely).  Test always-invalid-in-C2x
+   usages, in C11 mode.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
 
diff --git a/gcc/testsuite/gcc.dg/c2x-unproto-1.c 
b/gcc/testsuite/gcc.dg/c2x-unproto-1.c
index 45d68f2c292..aa87d78610e 100644
--- a/gcc/testsuite/gcc.dg/c2x-unproto-1.c
+++ b/gcc/testsuite/gcc.dg/c2x-unproto-1.c
@@ -1,20 +1,25 @@
-/* Test compatibility of unprototyped and prototyped function types (C2x makes
-   the case of types affected by default argument promotions compatible).  Test
-   valid-in-C2x usages.  */
+/* Test compatibility of unprototyped and prototyped function types (C2x made
+   the case of types affected by default argument promotions compatible, before
+   removing unprototyped functions completely).  Test affected usages are not
+   accepted for C2x.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c2x -pedantic-errors" } */
 
-void f1 ();
-void f1 (float);
+void f1 (); /* { dg-message "previous declaration" } */
+void f1 (float); /* { dg-error "conflicting types" } */
+/* { dg-message "default promotion" "" { target *-*-* } .-1 } */
 
-void f2 (float);
-void f2 ();
+void f2 (float); /* { dg-message "previous declaration" } */
+void f2 (); /* { dg-error "conflicting types" } */
+/* { dg-message "default promotion" "" { target *-*-* } .-1 } */
 
-void f3 ();
-void f3 (char);
+void f3 (); /* { dg-message "previous declaration" } */
+void f3 (char); /

[PATCH] rs6000, v3: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 02:57:20PM -0600, Segher Boessenkool wrote:
> But __ibm128 should *always* be supported, so this is a hypothetical
> problem.

I bet that will require much more work.  I think for the barely supported
(or really unsupported?) case of old sysv IEEE quad or for when long double
is just DFmode we'd need some more (IFmode?) for __ibm128 and deal with
everything it needs (mapping it to libgcc entrypoints etc.).

> If you are fed up with all this, please commit what you have now (after
> testing of course ;-) ), and I'll pick up things myself.  Either way,
> thank you for all your work on this!

Ok, here is what I'll test momentarily:

2022-03-09  Jakub Jelinek  

PR target/99708
* config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Remove
RS6000_BTI_ptr_ieee128_float and RS6000_BTI_ptr_ibm128_float.
(ptr_ieee128_float_type_node, ptr_ibm128_float_type_node): Remove.
* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Return
"**NULL**" if type_node is NULL first.  Handle
ieee128_float_type_node.
(rs6000_init_builtins): Don't initialize ptr_ieee128_float_type_node
and ptr_ibm128_float_type_node.  Set ibm128_float_type_node and
ieee128_float_type_node to NULL rather than long_double_type_node if
they aren't supported.  Do support __ibm128 even if
!TARGET_FLOAT128_TYPE when long double is double double.
(rs6000_expand_builtin): Error if bif_is_ibm128 and
!ibm128_float_type_node.  Remap RS6000_BIF_{,UN}PACK_IF to
RS6000_BIF_{,UN}PACK_TF much earlier and only use bif_is_ibm128 check
for it.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
__SIZEOF_FLOAT128__ here and only iff __float128 macro is defined.
(rs6000_cpu_cpp_builtins): Don't define __SIZEOF_FLOAT128__ here.
Define __SIZEOF_IBM128__=16 if ieee128_float_type_node is non-NULL.
Formatting fix.
* config/rs6000/rs6000-gen-builtins.cc: Document ibm128 attribute.
(struct attrinfo): Add isibm128 member.
(TYPE_MAP_SIZE): Remove.
(type_map): Use [] instead of [TYPE_MAP_SIZE].  For "if" use
ibm128_float_type_node only if it is non-NULL, otherwise fall back
to long_double_type_node.  Remove "pif" entry.
(parse_bif_attrs): Handle ibm128 attribute and print it for debugging.
(write_decls): Output bif_ibm128_bit and bif_is_ibm128.
(write_type_node): Use sizeof type_map / sizeof type_map[0]
instead of TYPE_MAP_SIZE.
(write_bif_static_init): Handle isibm128.
* config/rs6000/rs6000-builtins.def: Document ibm128 attribute.
(__builtin_pack_ibm128, __builtin_unpack_ibm128): Add ibm128
attribute.

* gcc.dg/pr99708.c: New test.
* gcc.target/powerpc/pr99708-2.c: New test.

--- gcc/config/rs6000/rs6000.h.jj   2022-03-09 15:24:50.647017881 +0100
+++ gcc/config/rs6000/rs6000.h  2022-03-09 19:55:31.879255798 +0100
@@ -2444,8 +2444,6 @@ enum rs6000_builtin_type_index
   RS6000_BTI_ptr_long_double,
   RS6000_BTI_ptr_dfloat64,
   RS6000_BTI_ptr_dfloat128,
-  RS6000_BTI_ptr_ieee128_float,
-  RS6000_BTI_ptr_ibm128_float,
   RS6000_BTI_ptr_vector_pair,
   RS6000_BTI_ptr_vector_quad,
   RS6000_BTI_ptr_long_long,
@@ -2541,8 +2539,6 @@ enum rs6000_builtin_type_index
 #define ptr_long_double_type_node   
(rs6000_builtin_types[RS6000_BTI_ptr_long_double])
 #define ptr_dfloat64_type_node  
(rs6000_builtin_types[RS6000_BTI_ptr_dfloat64])
 #define ptr_dfloat128_type_node 
(rs6000_builtin_types[RS6000_BTI_ptr_dfloat128])
-#define ptr_ieee128_float_type_node 
(rs6000_builtin_types[RS6000_BTI_ptr_ieee128_float])
-#define ptr_ibm128_float_type_node  
(rs6000_builtin_types[RS6000_BTI_ptr_ibm128_float])
 #define ptr_vector_pair_type_node   
(rs6000_builtin_types[RS6000_BTI_ptr_vector_pair])
 #define ptr_vector_quad_type_node   
(rs6000_builtin_types[RS6000_BTI_ptr_vector_quad])
 #define ptr_long_long_integer_type_node 
(rs6000_builtin_types[RS6000_BTI_ptr_long_long])
--- gcc/config/rs6000/rs6000-builtin.cc.jj  2022-03-09 15:24:50.642017950 
+0100
+++ gcc/config/rs6000/rs6000-builtin.cc 2022-03-09 20:10:53.778612345 +0100
@@ -402,7 +402,9 @@ rs6000_vector_type (const char *name, tr
 static
 const char *rs6000_type_string (tree type_node)
 {
-  if (type_node == void_type_node)
+  if (type_node == NULL_TREE)
+return "**NULL**";
+  else if (type_node == void_type_node)
 return "void";
   else if (type_node == long_integer_type_node)
 return "long";
@@ -432,6 +434,8 @@ const char *rs6000_type_string (tree typ
 return "ss";
   else if (type_node == ibm128_float_type_node)
 return "__ibm128";
+  else if (type_node == ieee128_float_type_node)
+return "__ieee128";
   else if (type_node == opaque_V4SI_type_node)
 return "opaque";
   else if (POINTER_TYPE_P (type_node))
@@ -709,9 +713,9 @@

[PATCH, committed] PR fortran/104849 - ICE in find_array_section, at fortran/expr.cc:1616

2022-03-09 Thread Harald Anlauf via Gcc-patches
Dear all,

referencing an invalid array section could lead to a NULL pointer
dereference.  Testcase by Gerhard.

Committed to mainline as obvious after regtesting as

https://gcc.gnu.org/g:22015e77d3e45306077396b9de8a8a28bb67fb20

Thanks,
Harald

From 22015e77d3e45306077396b9de8a8a28bb67fb20 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 9 Mar 2022 21:58:26 +0100
Subject: [PATCH] Fortran: improve error recovery on invalid array section

gcc/fortran/ChangeLog:

	PR fortran/104849
	* expr.cc (find_array_section): Avoid NULL pointer dereference on
	invalid array section.

gcc/testsuite/ChangeLog:

	PR fortran/104849
	* gfortran.dg/pr104849.f90: New test.
---
 gcc/fortran/expr.cc| 4 +++-
 gcc/testsuite/gfortran.dg/pr104849.f90 | 9 +
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr104849.f90

diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index c9c0ba4cc2e..86d61fed302 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -1594,7 +1594,9 @@ find_array_section (gfc_expr *expr, gfc_ref *ref)
 	{
 	  if ((begin && begin->expr_type != EXPR_CONSTANT)
 	  || (finish && finish->expr_type != EXPR_CONSTANT)
-	  || (step && step->expr_type != EXPR_CONSTANT))
+	  || (step && step->expr_type != EXPR_CONSTANT)
+	  || (!begin && !lower)
+	  || (!finish && !upper))
 	{
 	  t = false;
 	  goto cleanup;
diff --git a/gcc/testsuite/gfortran.dg/pr104849.f90 b/gcc/testsuite/gfortran.dg/pr104849.f90
new file mode 100644
index 000..ae221b5ba10
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr104849.f90
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! PR fortran/104849 - ICE in find_array_section
+! Contributed by G.Steinmetz
+
+program p
+  integer, parameter :: a(:) = [1, 2] ! { dg-error "deferred shape" }
+  integer :: x(2)
+  data x /a(:)/   ! { dg-error "Invalid" }
+end
--
2.34.1



Re: [PATCH] rs6000, v2: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Segher Boessenkool
On Wed, Mar 09, 2022 at 08:24:24PM +0100, Jakub Jelinek wrote:
> On Wed, Mar 09, 2022 at 12:34:06PM -0600, Segher Boessenkool wrote:
> > > rs6000_expand_builtin has (but at an incorrect spot) code to remap
> > > __builtin_{,un}pack_ibm128 to __builtin_{,un}pack_longdouble under the 
> > > hood,
> > > for 2 reasons:
> > > 1) __ibm128 type is only supported when VSX is enabled,
> > 
> > Ugh, another bug.  *^*ET*(^(*^TE($^TW(*T(W
> 
> I agree, but I don't have much energy to fix all the bugs in the backend in 
> one
> patch.

Yes, but this is a very fundamental thing that causes a lot of
unnecessary complexity :-(

> Probably it could be fixed with incremental:
> 
> --- gcc/config/rs6000/rs6000-builtin.cc.jj2022-03-09 19:55:31.879255798 
> +0100
> +++ gcc/config/rs6000/rs6000-builtin.cc   2022-03-09 20:10:53.778612345 
> +0100
> @@ -713,9 +713,9 @@ rs6000_init_builtins (void)
>   For IEEE 128-bit floating point, always create the type __ieee128.  If 
> the
>   user used -mfloat128, rs6000-c.cc will create a define from __float128 
> to
>   __ieee128.  */
> -  if (TARGET_FLOAT128_TYPE)
> +  if (TARGET_LONG_DOUBLE_128 && (!TARGET_IEEEQUAD || TARGET_FLOAT128_TYPE))
>  {
> -  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
> +  if (!TARGET_IEEEQUAD)
>   ibm128_float_type_node = long_double_type_node;
>else
>   {
> @@ -727,7 +727,12 @@ rs6000_init_builtins (void)
>t = build_qualified_type (ibm128_float_type_node, TYPE_QUAL_CONST);
>lang_hooks.types.register_builtin_type (ibm128_float_type_node,
> "__ibm128");
> +}
> +  else
> +ibm128_float_type_node = NULL_TREE;
>  
> +  if (TARGET_FLOAT128_TYPE)
> +{
>if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
>   ieee128_float_type_node = long_double_type_node;
>else
> @@ -737,7 +742,7 @@ rs6000_init_builtins (void)
> "__ieee128");
>  }
>else
> -ieee128_float_type_node = ibm128_float_type_node = NULL_TREE;
> +ieee128_float_type_node = NULL_TREE;
>  
>/* Vector pair and vector quad support.  */
>vector_pair_type_node = make_node (OPAQUE_TYPE);
> @@ -3419,11 +3424,10 @@ rs6000_expand_builtin (tree exp, rtx tar
>return const0_rtx;
>  }
>  
> -  if (bif_is_ibm128 (*bifaddr)
> -  && !(ibm128_float_type_node || FLOAT128_2REG_P (TFmode)))
> +  if (bif_is_ibm128 (*bifaddr) && !ibm128_float_type_node)
>  {
> -  error ("%qs requires %<__ibm128%> type support or % "
> -  "to be IBM 128-bit format", bifaddr->bifname);
> +  error ("%qs requires %<__ibm128%> type support",
> +  bifaddr->bifname);
>return const0_rtx;
>  }

Something like that yes.  It still is all way more complicated than it
should be.  Not your fault of course.

> which ought to support __ibm128 as the same thing as long double
> if long double is double double, and also as a separate IFmode
> double double if -mabi=ieeelongdouble and TARGET_FLOAT128_TYPE
> (i.e. both IFmode and KFmode are supported).

It should support __ibm128 always, and __ieee128 whenever VSX is
supported (it should be whenever VMX is supported, and even *always* as
well imnsho, but I haven't won that fight yet).

> > >seems the intent
> > >was that those 2 builtins can be used interchangeably unless
> > >long double is the same as __ieee128, in which case
> > >__builtin_{,un}pack_longdouble errors and
> > >__builtin_{,un}pack_ibm128 works and returns/takes __ibm128
> > 
> > Aha.
> > 
> > But the type "if" always means the same thing, in the builtins language.
> 
> True.  Before the patch it means ibm128_float_type_node, which is
> a tree connected to __ibm128 type if it exists (but could very well be
> equal to long_double_type_node) or long_double_type if it doesn't.
> With the patch it means the same thing, except that ibm128_float_type_node
> is NULL if __ibm128 isn't supported, and at that point if means
> long_double_type_node.  So, on the builtins side, if is the same thing
> as before.

Ah good.

> > > 2) even with VSX, unless -mlong-double-128 -mabi=ieeelongdouble is in
> > >effect, ibm128_float_type_node has TFmode rather than IFmode,
> > >so we can't use the {,un}packif expanders but need {,un}packtf
> > >instead
> > 
> > TFmode *is* IFmode, in that case.  TFmode is a workaround for
> > shortcomings in the generic parts of GCC.  This almost works as well
> > even!  But it is confusing no end.
> 
> TFmode isn't IFmode in that case, otherwise there wouldn't be the ICE on
> (insn 9 8 10 2 (set (reg:IF 119)
>  (unspec:TF [
> (reg:DF 120)
> (reg:DF 122)
> ] UNSPEC_PACK_128BIT)) "prqquu.C":4:32 -1
>  (nil))
> etc.  IFmode and TFmode clearly are separate modes, which in those
> cases have the same REAL_MODE_FORMAT.  So, under the hood they are handled
> the same, but still it isn't acceptable in RTL t

Re: [PATCH] rs6000, v2: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 12:34:06PM -0600, Segher Boessenkool wrote:
> It is a common idiom anyway, much clearer than any macro :-)  (But no
> parens please?  sizeof is an operator, not a function).

Ok, changed in my copy.

> > > > +{ "if","ibm128_float_type_node "
> > > > +   "? ibm128_float_type_node "
> > > > +   ": long_double" },
> > > 
> > > I don't think that is correct.  If there is no __ibm128, there is no
> > > IFmode either (they are two sides of the same coin).  Using DFmode
> > > instead (which is the only thing that "long double" could be if not
> > > IFmode) will just ICE later, if this is used at all.  So best case this
> > > will confuse the reader.
> > 
> > rs6000_expand_builtin has (but at an incorrect spot) code to remap
> > __builtin_{,un}pack_ibm128 to __builtin_{,un}pack_longdouble under the hood,
> > for 2 reasons:
> > 1) __ibm128 type is only supported when VSX is enabled,
> 
> Ugh, another bug.  *^*ET*(^(*^TE($^TW(*T(W

I agree, but I don't have much energy to fix all the bugs in the backend in one
patch.

Probably it could be fixed with incremental:

--- gcc/config/rs6000/rs6000-builtin.cc.jj  2022-03-09 19:55:31.879255798 
+0100
+++ gcc/config/rs6000/rs6000-builtin.cc 2022-03-09 20:10:53.778612345 +0100
@@ -713,9 +713,9 @@ rs6000_init_builtins (void)
  For IEEE 128-bit floating point, always create the type __ieee128.  If the
  user used -mfloat128, rs6000-c.cc will create a define from __float128 to
  __ieee128.  */
-  if (TARGET_FLOAT128_TYPE)
+  if (TARGET_LONG_DOUBLE_128 && (!TARGET_IEEEQUAD || TARGET_FLOAT128_TYPE))
 {
-  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
+  if (!TARGET_IEEEQUAD)
ibm128_float_type_node = long_double_type_node;
   else
{
@@ -727,7 +727,12 @@ rs6000_init_builtins (void)
   t = build_qualified_type (ibm128_float_type_node, TYPE_QUAL_CONST);
   lang_hooks.types.register_builtin_type (ibm128_float_type_node,
  "__ibm128");
+}
+  else
+ibm128_float_type_node = NULL_TREE;
 
+  if (TARGET_FLOAT128_TYPE)
+{
   if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
ieee128_float_type_node = long_double_type_node;
   else
@@ -737,7 +742,7 @@ rs6000_init_builtins (void)
  "__ieee128");
 }
   else
-ieee128_float_type_node = ibm128_float_type_node = NULL_TREE;
+ieee128_float_type_node = NULL_TREE;
 
   /* Vector pair and vector quad support.  */
   vector_pair_type_node = make_node (OPAQUE_TYPE);
@@ -3419,11 +3424,10 @@ rs6000_expand_builtin (tree exp, rtx tar
   return const0_rtx;
 }
 
-  if (bif_is_ibm128 (*bifaddr)
-  && !(ibm128_float_type_node || FLOAT128_2REG_P (TFmode)))
+  if (bif_is_ibm128 (*bifaddr) && !ibm128_float_type_node)
 {
-  error ("%qs requires %<__ibm128%> type support or % "
-"to be IBM 128-bit format", bifaddr->bifname);
+  error ("%qs requires %<__ibm128%> type support",
+bifaddr->bifname);
   return const0_rtx;
 }
 

which ought to support __ibm128 as the same thing as long double
if long double is double double, and also as a separate IFmode
double double if -mabi=ieeelongdouble and TARGET_FLOAT128_TYPE
(i.e. both IFmode and KFmode are supported).

> >seems the intent
> >was that those 2 builtins can be used interchangeably unless
> >long double is the same as __ieee128, in which case
> >__builtin_{,un}pack_longdouble errors and
> >__builtin_{,un}pack_ibm128 works and returns/takes __ibm128
> 
> Aha.
> 
> But the type "if" always means the same thing, in the builtins language.

True.  Before the patch it means ibm128_float_type_node, which is
a tree connected to __ibm128 type if it exists (but could very well be
equal to long_double_type_node) or long_double_type if it doesn't.
With the patch it means the same thing, except that ibm128_float_type_node
is NULL if __ibm128 isn't supported, and at that point if means
long_double_type_node.  So, on the builtins side, if is the same thing
as before.

> > 2) even with VSX, unless -mlong-double-128 -mabi=ieeelongdouble is in
> >effect, ibm128_float_type_node has TFmode rather than IFmode,
> >so we can't use the {,un}packif expanders but need {,un}packtf
> >instead
> 
> TFmode *is* IFmode, in that case.  TFmode is a workaround for
> shortcomings in the generic parts of GCC.  This almost works as well
> even!  But it is confusing no end.

TFmode isn't IFmode in that case, otherwise there wouldn't be the ICE on
(insn 9 8 10 2 (set (reg:IF 119)
 (unspec:TF [
(reg:DF 120)
(reg:DF 122)
] UNSPEC_PACK_128BIT)) "prqquu.C":4:32 -1
 (nil))
etc.  IFmode and TFmode clearly are separate modes, which in those
cases have the same REAL_MODE_FORMAT.  So, under the hood they are handled
the same, but still it isn't acceptable in RTL to mix

RE: [C++ PATCH] PR c++/96437: ICE-on-invalid-code error recovery.

2022-03-09 Thread Roger Sayle


Hi Jason,

Very many thanks for your reviews/approvals of these ICE-on-invalid-code fixes.

> > +  if (TREE_VALUE (new_parm) != error_mark_node)
> > +DECL_VIRTUAL_P (TREE_VALUE (new_parm)) = true;
> 
> Hmm, I wonder about returning early if there was an error, but this is fine 
> as is.

The challenge here is the need to perform "current_binding_level = entry_scope;"
required to restore the parser state before returning.  Now that the parser is
written in C++, we could (in theory) make more use of destructors to 
automatically
restore/pop state, making it safe(r) to just "return error_mark_node" 
immediately
in more places.

> > +/* { dg-options "-O2" } */
>
> This also seems unneeded for this test.

Doh! Force of habit (from working in the middle-end).  Consider this removed
from all the test cases below.

Is there any chance I could ask you to also look at these (that probably slipped
through the net), though PR 84964 might require a nod from a middle-end 
maintainer.

PR c++/39751: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590951.html
PR c++/84964: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590961.html
PR c++/95999: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590653.html
PR c++/96442: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590716.html

Very many thanks again for your help/reviews.  Much appreciated.
Roger
--




Re: [PATCH] rs6000, v2: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Segher Boessenkool
Hi!

On Wed, Mar 09, 2022 at 02:27:19PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 07, 2022 at 03:37:18PM -0600, Segher Boessenkool wrote:
> > > 2) adjusts the builtins code to use
> > >ibm128_float_type_node ? ibm128_float_type_node : long_double_type_node
> > >for the 2 builtins, so that we don't ICE during builtins creation
> > >if ibm128_float_type_node is NULL (using error_mark_node instead of
> > >long_double_type_node sounds more risky to me)
> > 
> > I feel the opposite way: (potentially) using the wrong thing is just a
> > ticking time bomb, never "safer".
> 
> Actually, fallback to long_double_type_node is the right thing, see below.

I don't see how that ever could be true, but I'll see below perhaps :-)

> > There needs to be a __SIZEOF_IEEE128__ as well, if you like reality :-)
> 
> Ok, added now.

Thanks!

> > The basic types, that always exist if they are supported at all, are
> > __ibm128 and __ieee128.  Long double picks one of those, or some other
> > type; and __float128 is a source of confusion (it tries to make things
> > more similar to x86, but things are *not* similar anyway :-( )
> 
> Not just x86, there are multiple targets with __float128 support, and
> when it works, it behaves the same on all of them.

Unfortunately it isn't documented anywhere *what* it should mean, then.

> Mike's PR99708 patch (which unfortunately has been backported to various
> branches without resolving these issues first) regressed on powerpc64-linux

It supposedly was tested there though?  Sigh.

> > Tested with which -mcpu=?  You need at least power7 to get __ieee128
> > support, but it should be tested *without* that as well.  (The actual
> > default for powerpc64-linux is power4 (so not even 970), and for
> > powerpc-linux it is 603 or such.)
> 
> ../configure --enable-languages=c,c++,fortran,objc,obj-c++,lto,go 
> --prefix=/home/jakub/GCC; make -j64 > LOG 2>&1 && make -j64 -k check 
> RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}' > LOGC 2>&1; 
> ../contrib/test_summary > LOGT 2>&1
> i.e. the oldest supported one.  On powerpc64le-linux also without any
> --with- tuning, so power8.

So power4 for powerpc64-linux, and power8 for powerpc64le-linux.

I ask because many people do use --with-cpu=, and the difference matters
a lot here.

> > >   * config/rs6000/rs6000-gen-builtins.cc (TYPE_MAP_SIZE): Set to
> > >   85 instead of 86.
> > 
> > This should be auto-generated, or just not exist at all
> > ("sizeof type_map / sizeof type_map[0]" in the one place it is used,
> > instead -- "one place" is after removing it from the definition of the
> > array of course, where it is unneeded :-) )
> 
> Unfortunately the generator doesn't use libiberty.h, so I couldn't use
> ARRAY_SIZE.  Just used sizeof / sizeof [0].

It is a common idiom anyway, much clearer than any macro :-)  (But no
parens please?  sizeof is an operator, not a function).

> > > +{ "if",  "ibm128_float_type_node "
> > > + "? ibm128_float_type_node "
> > > + ": long_double" },
> > 
> > I don't think that is correct.  If there is no __ibm128, there is no
> > IFmode either (they are two sides of the same coin).  Using DFmode
> > instead (which is the only thing that "long double" could be if not
> > IFmode) will just ICE later, if this is used at all.  So best case this
> > will confuse the reader.
> 
> rs6000_expand_builtin has (but at an incorrect spot) code to remap
> __builtin_{,un}pack_ibm128 to __builtin_{,un}pack_longdouble under the hood,
> for 2 reasons:
> 1) __ibm128 type is only supported when VSX is enabled,

Ugh, another bug.  *^*ET*(^(*^TE($^TW(*T(W

>seems the intent
>was that those 2 builtins can be used interchangeably unless
>long double is the same as __ieee128, in which case
>__builtin_{,un}pack_longdouble errors and
>__builtin_{,un}pack_ibm128 works and returns/takes __ibm128

Aha.

But the type "if" always means the same thing, in the builtins language.

> 2) even with VSX, unless -mlong-double-128 -mabi=ieeelongdouble is in
>effect, ibm128_float_type_node has TFmode rather than IFmode,
>so we can't use the {,un}packif expanders but need {,un}packtf
>instead

TFmode *is* IFmode, in that case.  TFmode is a workaround for
shortcomings in the generic parts of GCC.  This almost works as well
even!  But it is confusing no end.

> Which means the
>   ibm128_float_type_node ? ibm128_float_type_node : long_double_type_node
> is actually the right thing (except for -mlong-double-64 but the
> patch will reject those builtins in that case) - if
> ibm128_float_type_node is NULL but long_double_type_node is TFmode,
> we want __builtin_{,un}pack_ibm128 to work like
> __builtin_{,un}pack_longdouble and use long double, if
> ibm128_float_type_node is non-NULL and long_double_type_node is TFmode,
> ibm128_float_type_node == long_double_type_node and again we want it to
> work like __builtin_{,un}pack_longdouble and finally if
> ibm128_float_type_nod

Re: [PATCH RFC] mips: add TARGET_ZERO_CALL_USED_REGS hook [PR104817, PR104820]

2022-03-09 Thread Richard Sandiford via Gcc-patches
Xi Ruoyao  writes:
> Bootstrapped and regtested on mips64el-linux-gnuabi64.
>
> I'm not sure if it's "correct" to clobber other registers during the
> zeroing of scratch registers.  But I can't really come up with a better
> idea: on MIPS there is no simple way to clear one bit in FCSR (i. e.
> FCC[x]).  We can't just use "c.f.s $fccx,$f0,$f0" because it will raise
> an exception if $f0 contains a sNaN.

Yeah, it's a bit of a grey area, but I think it should be fine, provided
that the extra clobbers are never used as return registers (which is
obviously true for the FCC registers).

But on that basis…

> +static HARD_REG_SET
> +mips_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
> +{
> +  HARD_REG_SET zeroed_hardregs;
> +  CLEAR_HARD_REG_SET (zeroed_hardregs);
> +
> +  if (TEST_HARD_REG_BIT (need_zeroed_hardregs, HI_REGNUM))
> +{
> +  /* Clear HI and LO altogether.  MIPS target treats HILO as a
> +  double-word register.  */
> +  machine_mode dword_mode = TARGET_64BIT ? TImode : DImode;
> +  rtx hilo = gen_rtx_REG (dword_mode, MD_REG_FIRST);
> +  rtx zero = CONST0_RTX (dword_mode);
> +  emit_move_insn (hilo, zero);
> +
> +  SET_HARD_REG_BIT (zeroed_hardregs, HI_REGNUM);
> +  if (TEST_HARD_REG_BIT (need_zeroed_hardregs, LO_REGNUM))
> + SET_HARD_REG_BIT (zeroed_hardregs, LO_REGNUM);
> +  else
> + emit_clobber (gen_rtx_REG (word_mode, LO_REGNUM));

…I don't think this conditional LO_REGNUM code is worth it.
We might as well just add both registers to zeroed_hardregs.

> +}
> +
> +  bool zero_fcc = false;
> +  for (int i = ST_REG_FIRST; i <= ST_REG_LAST; i++)
> +if (TEST_HARD_REG_BIT (need_zeroed_hardregs, i))
> +  zero_fcc = true;
> +
> +  /* MIPS does not have a simple way to clear one bit in FCC.  We just
> + clear FCC with ctc1 and clobber all FCC bits.  */
> +  if (zero_fcc)
> +{
> +  emit_insn (gen_mips_zero_fcc ());
> +  for (int i = ST_REG_FIRST; i <= ST_REG_LAST; i++)
> + if (TEST_HARD_REG_BIT (need_zeroed_hardregs, i))
> +   SET_HARD_REG_BIT (zeroed_hardregs, i);
> + else
> +   emit_clobber (gen_rtx_REG (CCmode, i));
> +}

Here too I think we should just do:

  zeroed_hardregs |= reg_class_contents[ST_REGS] & accessible_reg_set;

to include all available FCC registers.

> +
> +  need_zeroed_hardregs &= ~zeroed_hardregs;
> +  return zeroed_hardregs |
> +  default_zero_call_used_regs (need_zeroed_hardregs);

Nit, but: should be formatted as:

  return (zeroed_hardregs
  | default_zero_call_used_regs (need_zeroed_hardregs));

> +}
> +
>  
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_ASM_ALIGNED_HI_OP
> @@ -22919,6 +22964,8 @@ mips_asm_file_end (void)
>  #undef TARGET_ASM_FILE_END
>  #define TARGET_ASM_FILE_END mips_asm_file_end
>  
> +#undef TARGET_ZERO_CALL_USED_REGS
> +#define TARGET_ZERO_CALL_USED_REGS mips_zero_call_used_regs
>  
>  struct gcc_target targetm = TARGET_INITIALIZER;
>  
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index e0f0a582732..edf58710cdd 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -96,6 +96,7 @@ (define_c_enum "unspec" [
>;; Floating-point environment.
>UNSPEC_GET_FCSR
>UNSPEC_SET_FCSR
> +  UNSPEC_ZERO_FCC
>  
>;; HI/LO moves.
>UNSPEC_MFHI
> @@ -7670,6 +7671,11 @@ (define_insn "*mips_set_fcsr"
>"TARGET_HARD_FLOAT"
>"ctc1\t%0,$31")
>  
> +(define_insn "mips_zero_fcc"
> +  [(unspec_volatile [(const_int 0)] UNSPEC_ZERO_FCC)]
> +  "TARGET_HARD_FLOAT"
> +  "ctc1\t$0,$25")

I've forgotten a lot of MIPS stuff, so: does this clear only the
FCC registers, or does it clear other things (such as exception bits)
as well?  Does it work even for !ISA_HAS_8CC?

I think this pattern should explicit clear all eight registers, e.g. using:

  (set (reg:CC FCC0_REGNUM) (const_int 0))
  (set (reg:CC FCC1_REGNUM) (const_int 0))
  …

which unfortunately means defining 8 new register constants in mips.md.
I guess for extra safety there should be a separate !ISA_HAS_8CC version
that only sets FCC0_REGNUM.

An alternative would be to avoid clearing the FCC registers altogether.
I suppose that's less secure, but residual information could leak through
the exception bits as well, and it isn't clear whether those should be
zeroed at the end of each function.  I guess it depends on people's
appetite for risk.

Both ways are OK with me, just mentioning it in case.

Thanks,
Richard

> +
>  ;; See tls_get_tp_mips16_ for why this form is used.
>  (define_insn "mips_set_fcsr_mips16_"
>[(unspec_volatile:SI [(match_operand:P 0 "call_insn_operand" "dS")
> diff --git a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c 
> b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c
> index 96e0b79b328..c23b2ceb391 100644
> --- a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c
> +++ b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* { dg-skip-if "not impl

RE: [x86 PATCH] PR tree-optimization/98335: New peephole2 xorl; movb -> movzbl

2022-03-09 Thread Roger Sayle

Hi Uros,
Firstly many thanks for already (pre)approving half of this patch.  Jakub 
Jelinek's
suggestion for creating a testcase that exposes the SImode issue was extremely
helpful, but interestingly exposed another missed optimization opportunity 
that's
also addressed with this revision of my patch.

char c;
union Data { char a; short b; };
void val(void) { __asm__ __volatile__ ("" : : "r" ((union Data) { c } )); }

currently on x86_64 with -O2 on mainline generates following code:

xorl%eax, %eax
movb$0, %ah
movbc(%rip), %al
ret

which contains the strict_low_part movb following an SI mode xor that we hope
to optimize, but infuriatingly we also have a completely redundant movb $0, %ah.
Hence, with this patch we have a new peephole2 that sees that in the first two
instructions the movb is redundant, which then allows the SImode/SWI48 xor
followed by movb peephole2 to optimize the rest to movzbl.  With the revised
patch below, the above testcase is now compiled as:

movzbl  c(%rip), %eax
ret

Tested on x86_64-pc-linux-gnu (on top of the revised middle-end patch) with
make bootstrap and make -k check with no new failures.  Is this revised version
(still) Ok for mainline?


2022-03-09  Roger Sayle  

gcc/ChangeLog
PR tree-optimization/98335
* config/i386/i386.md (peephole2): Eliminate redundant insv.
Combine movl followed by movb.  Transform xorl followed by
a suitable movb or movw into the equivalent movz[bw]l.

gcc/testsuite/ChangeLog
PR tree-optimization/98335
* g++.target/i386/pr98335.C: New test case.
* gcc.target/i386/pr98335.c: New test case.


Thanks again for your help/suggested revisions.
Roger
--

> -Original Message-
> From: Uros Bizjak 
> Sent: 09 March 2022 07:36
> To: Roger Sayle 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [x86 PATCH] PR tree-optimization/98335: New peephole2
> xorl;movb -> movzbl
> 
> On Mon, Mar 7, 2022 at 12:51 PM Roger Sayle
>  wrote:
> >
> >
> > Hi Uros,
> >
> > > Is there a reason that only inserts to DImode registers are implemented?
> > > IMO, these peepholes should also handle inserts to SImode.
> >
> > I wasn't able to construct a test case that produced a byte or word
> > insert into an SImode register.  The front-ends and middle-end end up
> > producing different code sequences, and -m32 changes the ABI so that
> > small structs get passed in memory rather than in registers.
> >
> > Here's the expanded testcase that I investigated:
> >
> > struct DataCL { char a; int b; };
> > struct DataWL { short a; int b; };
> > struct DataIL { int a; int b; };
> >
> > struct DataCI { char a; short b; };
> > struct DataWI { short a; short b; };
> >
> > char c;
> > short w;
> > int i;
> >
> > DataCL foo_cl(int idx) { return { c }; } DataCL bar_cl(int idx) {
> > return { c, 0 }; } DataWL foo_wl(int idx) { return { w }; } DataWL
> > bar_wl(int idx) { return { w, 0 }; } DataIL foo_il(int idx) { return {
> > i }; } DataIL bar_il(int idx) { return { i, 0 }; }
> >
> > DataCI foo_ci(int idx) { return { c }; } DataCI bar_ci(int idx) {
> > return { c, 0 }; } DataWI foo_wi(int idx) { return { w }; } DataWI
> > bar_wi(int idx) { return { w, 0 }; }
> >
> >
> > I agree that for completeness similar peepholes handling inserts into
> > SImode would be a good thing, but these wouldn't be restricted by
> > TARGET_64BIT and would therefore require additional -m32 testing.
> > The DImode peepholes I can justify for stage4 as PR tree-opt/98335 is
> > a regression, SImode peepholes would be more of a "leap of faith".
> > If you’d be willing to accept a patch without a testcase, let me know.
> 
> We have plenty of these, where we assume that if the pattern works in one
> mode, it also works in other modes. So, I think that by changing DI mode to
> SWI48 mode iterator, we are on the safe side. We can also trust bootstrap and
> regression tests here.
> 
> IMO, the patch with SWI48 mode iterator is OK.
> 
> Thanks,
> Uros.
> 
> >
> > It's also a pity that subreg handling in combine doesn't allow merging
> > these inserts into zero registers to be combined to zero_extends in a
> > machine independent way.  My recent patch for PR 95126 (awaiting
> > review) should also allow front-ends and middle-end passes more
> > flexibility in optimizing small struct constructors.
> >
> > Thanks (as always) for reviewing patches so quickly.
> >
> > Roger
> > --


diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d15170e..c8fbf60 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3180,6 +3180,38 @@
 (const_int 8))
   (subreg:SWI248 (match_dup 1) 0))])
 
+;; Eliminate redundant insv, e.g. xorl %eax,%eax; movb $0, %ah
+(define_peephole2
+  [(parallel [(set (match_operand:SWI48 0 "general_reg_operand")
+  (const_int 0))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (zero_extract:SWI248 (match_operand

[PATCH] c++: ICE with operator delete [PR104846]

2022-03-09 Thread Marek Polacek via Gcc-patches
This is an ICE-on-invalid with "auto operator delete[] (void *)" whose
return type must be void.  The return type is checked in coerce_delete_type
but we never got there in this test, because we took the wrong path in
grokdeclarator, set type to error_mark_node, ended up creating a FIELD_DECL
with build_decl, and confused grokmethod by giving it a FIELD_DECL.

Fixed by not taking the data member path for a FUNCTION_TYPE.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/104846

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Check FUNC_OR_METHOD_TYPE_P before giving
data member errors.

gcc/testsuite/ChangeLog:

* g++.dg/init/delete5.C: New test.
---
 gcc/cp/decl.cc  | 2 +-
 gcc/testsuite/g++.dg/init/delete5.C | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/init/delete5.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 992e38385c2..12f2524aafe 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -13751,7 +13751,7 @@ grokdeclarator (const cp_declarator *declarator,
   }
 else if (decl_context == FIELD)
   {
-   if (!staticp && !friendp && TREE_CODE (type) != METHOD_TYPE)
+   if (!staticp && !friendp && !FUNC_OR_METHOD_TYPE_P (type))
  if (tree auto_node = type_uses_auto (type))
{
  location_t tloc = declspecs->locations[ds_type_spec];
diff --git a/gcc/testsuite/g++.dg/init/delete5.C 
b/gcc/testsuite/g++.dg/init/delete5.C
new file mode 100644
index 000..3555f43bbb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/delete5.C
@@ -0,0 +1,8 @@
+// PR c++/104846
+// { dg-do compile { target c++14 } }
+
+struct S {
+  auto operator delete (void *) {} // { dg-error ".operator delete. must 
return type .void'" }
+  auto operator delete[] (void *) {} // { dg-error ".operator delete. must 
return type .void'" }
+};
+

base-commit: bded0d549fdfdc1328b2c0189dc5f8593b4cbe42
-- 
2.35.1



Re: [PATCH] contrib: Fix non-portable sed commands in gcc-descr [PR102664/]

2022-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Mar 2022 at 17:40, Patrick Palka  wrote:
>
> On Wed, Mar 9, 2022 at 8:54 AM Mikael Morin  wrote:
> >
> > Hello,
> >
> > Le 08/03/2022 à 18:58, Jonathan Wakely via Gcc-patches a écrit :
> > > Replace \([0-9]\+\) with \([0-9][0-9]*\) or with \([1-9][0-9]*\) in 
> > > release branch numbers, where
> > > a leading zero does not occur.
> > >
> > Note that you also changed some gcc-[0-9]* to gcc-[1-9]*, which is a
> > typo/thinko I guess?  It looks like it wouldn’t match gcc-10 any more
> > for example…
>
> Perhaps related to this, I noticed the following
>   git gcc-descr ea1ce0d163ea1d63b6837144ae4be51d92630007
> now fails with
>   fatal: No tags can describe 'ea1ce0d163ea1d63b6837144ae4be51d92630007'.
> instead of outputting
>   r0-52309-gea1ce0d163ea1d

Ah yes, now that one is a bug. Changing it to [1-9]* was intentional,
and does still match gcc-10, but I forgot we have "r0" for really old
commits.




>
> >
> > > diff --git a/contrib/git-descr.sh b/contrib/git-descr.sh
> > > index ba5d711f330..95363279d8c 100755
> > > --- a/contrib/git-descr.sh
> > > +++ b/contrib/git-descr.sh
> > > @@ -18,11 +18,11 @@ do
> > >   done
> > >
> > >   if test x$short = xyes; then
> > > -r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 
> > > 's,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)-\([0-9]\+\)-g[0-9a-f]*$,r\2-\3,p;s,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)$,r\2-0,p');
> > > +r=$(git describe --all --match 'basepoints/gcc-[1-9]*' $c | sed -n 
> > > 's,^tags/,,;s,^basepoints/gcc-\([1-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f]*$,r\1-\2,p;s,^basepoints/gcc-\([1-9][0-9]*\)$,r\1-0,p');
> >
> > …here…
> >
> > >   elif test x$long = xyes; then
> > > -r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' 
> > > $c | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p')
> > > +r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[1-9]*' 
> > > $c | sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
> >
> > … and here …
> >
> > >   else
> > > -r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[0-9]*' 
> > > $c | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p');
> > > +r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[1-9]*' 
> > > $c | sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
> >
> > … and here.
> >
>



Re: [PATCH v2] Add TARGET_MOVE_WITH_MODE_P

2022-03-09 Thread H.J. Lu via Gcc-patches
On Wed, Mar 9, 2022 at 12:25 AM Richard Biener
 wrote:
>
> On Tue, Mar 8, 2022 at 4:44 PM H.J. Lu  wrote:
> >
> > On Mon, Mar 7, 2022 at 5:45 AM Richard Biener
> >  wrote:
> > >
> > > On Wed, Mar 2, 2022 at 10:18 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, Mar 02, 2022 at 09:51:26AM +0100, Richard Biener wrote:
> > > > > On Tue, Mar 1, 2022 at 11:41 PM H.J. Lu via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > Add TARGET_FOLD_MEMCPY_MAX for the maximum number of bytes to fold 
> > > > > > memcpy.
> > > > > > The default is
> > > > > >
> > > > > > MOVE_MAX * MOVE_RATIO (optimize_function_for_size_p (cfun))
> > > > > >
> > > > > > For x86, it is MOVE_MAX to restore the old behavior before
> > > > >
> > > > > I know we've discussed this to death in the PR, I just want to repeat 
> > > > > here
> > > > > that the GIMPLE folding expects to generate a single load and a single
> > > > > store (that is what it does on the GIMPLE level) which is why MOVE_MAX
> > > > > was chosen originally (it's documented to what a "single instruction" 
> > > > > does).
> > > > > In practice MOVE_MAX does not seem to cover vector register sizes
> > > > > so Richard pulled MOVE_RATIO which is really intended to cover
> > > > > the case of using multiple instructions for moving memory (but then I
> > > > > don't remember whether for the ARM case the single load/store GIMPLE
> > > > > will be expanded to multiple load/store instructions).
> > > > >
> > > > > TARGET_FOLD_MEMCPY_MAX sounds like a stop-gap solution,
> > > > > being very specific for memcpy folding (we also fold memmove btw).
> > > > >
> > > > > There is also MOVE_MAX_PIECES which _might_ be more appropriate
> > > > > than MOVE_MAX here and still honor the idea of single instructions.
> > > > > Now neither arm nor aarch64 define this and it defaults to MOVE_MAX,
> > > > > not MOVE_MAX * MOVE_RATIO.
> > > > >
> > > > > So if we need a new hook then that hook should at least get the
> > > > > 'speed' argument of MOVE_RATIO and it should get a better name.
> > > > >
> > > > > I still think that it should be possible to improve the insn check to
> > > > > avoid use of "disabled" modes, maybe that's also a point to add
> > > > > a new hook like .move_with_mode_p or so?  To quote, we do
> > > >
> > > > Here is the v2 patch to add TARGET_MOVE_WITH_MODE_P.
> > >
> > > Again I'd like to shine light on MOVE_MAX_PIECES which explicitely
> > > mentions "a load or store used TO COPY MEMORY" (emphasis mine)
> > > and whose x86 implementation would already be fine (doing larger moves
> > > and also not doing too large moves).  But appearantly the arm folks
> > > decided that that's not fit and instead (mis-?)used MOVE_MAX * MOVE_RATIO.
> > >
> > > Yes, MOVE_MAX_PIECES is documented to apply to move_by_pieces.
> > > Still GIMPLE memcpy/memmove inlining wants to mimic exactly that but
> > > restrict itself to a single load and a single store.
> > >
> > > > >
> > > > >   scalar_int_mode mode;
> > > > >   if (int_mode_for_size (ilen * 8, 0).exists (&mode)
> > > > >   && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
> > > > >   && have_insn_for (SET, mode)
> > > > >   /* If the destination pointer is not aligned we 
> > > > > must be able
> > > > >  to emit an unaligned store.  */
> > > > >   && (dest_align >= GET_MODE_ALIGNMENT (mode)
> > > > >   || !targetm.slow_unaligned_access (mode, 
> > > > > dest_align)
> > > > >   || (optab_handler (movmisalign_optab, mode)
> > > > >   != CODE_FOR_nothing)))
> > > > >
> > > > > where I understand the ISA is enabled and if the user explicitely
> > > > > uses it that's OK but -mprefer-avx128 should tell GCC to never
> > > > > generate AVX256 code where the user was not explicitely using it
> > > > > (still for example glibc might happily use AVX256 code to implement
> > > > > the memcpy we are folding!)
> > > > >
> > > > > Note the BB vectorizer also might end up with using AVX256 because
> > > > > in places it also relies on optab queries and the 
> > > > > vector_mode_supported_p
> > > > > check (but the memcpy folding uses the fake integer modes).  So
> > > > > x86 might need to implement the related_mode hook to avoid 
> > > > > "auto"-using
> > > > > a larger vector mode which the default implementation would happily 
> > > > > do.
> > > > >
> > > > > Richard.
> > > >
> > > > OK for master?
> > >
> > > Looking for opinions from others as well.
> > >
> > > Btw, there's a similar use in expand_DEFERRED_INIT:
> > >
> > >   && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT,
> > > 0).exists (&var_mode)
> > >   && have_insn_for (SET, var_mode))
> > >
> > > So it occured to me that maybe targetm.move_with_mode_p should eventually
> >
> > TARGET_MOVE_WITH_MODE_P shouldn't be used here since we do want to
> > use var_m

Re: [PATCH v2] Add TARGET_MOVE_WITH_MODE_P

2022-03-09 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Wed, Mar 2, 2022 at 10:18 PM H.J. Lu  wrote:
>>
>> On Wed, Mar 02, 2022 at 09:51:26AM +0100, Richard Biener wrote:
>> > On Tue, Mar 1, 2022 at 11:41 PM H.J. Lu via Gcc-patches
>> >  wrote:
>> > >
>> > > Add TARGET_FOLD_MEMCPY_MAX for the maximum number of bytes to fold 
>> > > memcpy.
>> > > The default is
>> > >
>> > > MOVE_MAX * MOVE_RATIO (optimize_function_for_size_p (cfun))
>> > >
>> > > For x86, it is MOVE_MAX to restore the old behavior before
>> >
>> > I know we've discussed this to death in the PR, I just want to repeat here
>> > that the GIMPLE folding expects to generate a single load and a single
>> > store (that is what it does on the GIMPLE level) which is why MOVE_MAX
>> > was chosen originally (it's documented to what a "single instruction" 
>> > does).
>> > In practice MOVE_MAX does not seem to cover vector register sizes
>> > so Richard pulled MOVE_RATIO which is really intended to cover
>> > the case of using multiple instructions for moving memory (but then I
>> > don't remember whether for the ARM case the single load/store GIMPLE
>> > will be expanded to multiple load/store instructions).
>> >
>> > TARGET_FOLD_MEMCPY_MAX sounds like a stop-gap solution,
>> > being very specific for memcpy folding (we also fold memmove btw).
>> >
>> > There is also MOVE_MAX_PIECES which _might_ be more appropriate
>> > than MOVE_MAX here and still honor the idea of single instructions.
>> > Now neither arm nor aarch64 define this and it defaults to MOVE_MAX,
>> > not MOVE_MAX * MOVE_RATIO.
>> >
>> > So if we need a new hook then that hook should at least get the
>> > 'speed' argument of MOVE_RATIO and it should get a better name.
>> >
>> > I still think that it should be possible to improve the insn check to
>> > avoid use of "disabled" modes, maybe that's also a point to add
>> > a new hook like .move_with_mode_p or so?  To quote, we do
>>
>> Here is the v2 patch to add TARGET_MOVE_WITH_MODE_P.
>
> Again I'd like to shine light on MOVE_MAX_PIECES which explicitely
> mentions "a load or store used TO COPY MEMORY" (emphasis mine)
> and whose x86 implementation would already be fine (doing larger moves
> and also not doing too large moves).  But appearantly the arm folks
> decided that that's not fit and instead (mis-?)used MOVE_MAX * MOVE_RATIO.

It seems like there are old comments and old documentation that justify
both interpretations, so there are good arguments on both sides.  But
with this kind of thing I think we have to infer the meaning of the
macro from the way it's currently used, rather than trusting such old
and possibly out-of-date and contradictory information.

FWIW, I agree that (if we exclude old reload, which we should!) the
only direct uses of MOVE_MAX before the patch were not specific to
integer registers and so MOVE_MAX should include vectors if the
target wants vector modes to be used for general movement.

Even if people disagree that that's the current meaning, I think it's
at least a sensible meaning.  It provides information that AFAIK isn't
available otherwise, and it avoids overlap with MAX_FIXED_MODE_SIZE.

So FWIW, I think it'd be reasonable to change non-x86 targets if they
want vector modes to be used for single-insn copies.

Thanks,
Richard


[PATCH] contrib: Fix up git-descr.sh regression [PR102664]

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 12:40:24PM -0500, Patrick Palka via Gcc-patches wrote:
> On Wed, Mar 9, 2022 at 8:54 AM Mikael Morin  wrote:
> > Le 08/03/2022 à 18:58, Jonathan Wakely via Gcc-patches a écrit :
> > > Replace \([0-9]\+\) with \([0-9][0-9]*\) or with \([1-9][0-9]*\) in 
> > > release branch numbers, where
> > > a leading zero does not occur.
> > >
> > Note that you also changed some gcc-[0-9]* to gcc-[1-9]*, which is a
> > typo/thinko I guess?  It looks like it wouldn’t match gcc-10 any more
> > for example…
> 
> Perhaps related to this, I noticed the following
>   git gcc-descr ea1ce0d163ea1d63b6837144ae4be51d92630007
> now fails with
>   fatal: No tags can describe 'ea1ce0d163ea1d63b6837144ae4be51d92630007'.
> instead of outputting
>   r0-52309-gea1ce0d163ea1d

That is because of those [0-9] to [1-9] changes which prevent
basepoints/gcc-0 from working.  While basepoints/gcc-005 etc. are certainly
unespected, basepoints/gcc-0 needs to work.

Ok for trunk?

2022-03-09  Jakub Jelinek  

PR other/102664
* git-descr.sh: Replace all [1-9] occurrences with [0-9].
* git-undescr.sh: Likewise.

--- contrib/git-descr.sh.jj 2022-03-09 09:09:55.392843652 +0100
+++ contrib/git-descr.sh2022-03-09 18:49:39.426914348 +0100
@@ -18,11 +18,11 @@ do
 done
 
 if test x$short = xyes; then
-r=$(git describe --all --match 'basepoints/gcc-[1-9]*' $c | sed -n 
's,^tags/,,;s,^basepoints/gcc-\([1-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f]*$,r\1-\2,p;s,^basepoints/gcc-\([1-9][0-9]*\)$,r\1-0,p');
+r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 
's,^tags/,,;s,^basepoints/gcc-\([0-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f]*$,r\1-\2,p;s,^basepoints/gcc-\([0-9][0-9]*\)$,r\1-0,p');
 elif test x$long = xyes; then
-r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[1-9]*' $c | 
sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
+r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' $c | 
sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
 else
-r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[1-9]*' $c | 
sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
+r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[0-9]*' $c | 
sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
 expr ${r:-no} : 'r[0-9]\+$' >/dev/null && r=${r}-0-g$(git rev-parse $c);
 fi;
 if test -n $r; then
--- contrib/git-undescr.sh.jj   2022-03-09 09:09:55.392843652 +0100
+++ contrib/git-undescr.sh  2022-03-09 18:50:14.425429223 +0100
@@ -3,11 +3,11 @@
 # Script to undescribe a GCC revision
 
 o=$(git config --get gcc-config.upstream);
-r=$(echo $1 | sed -n 's,^r\([1-9][0-9]*\)-[0-9][0-9]*\(-g[0-9a-f]*\)*$,\1,p');
-n=$(echo $1 | sed -n 's,^r[1-9][0-9]*-\([0-9][0-9]*\)\(-g[0-9a-f]*\)*$,\1,p');
+r=$(echo $1 | sed -n 's,^r\([0-9][0-9]*\)-[0-9][0-9]*\(-g[0-9a-f]*\)*$,\1,p');
+n=$(echo $1 | sed -n 's,^r[0-9][0-9]*-\([0-9][0-9]*\)\(-g[0-9a-f]*\)*$,\1,p');
 
 test -z $r && echo Invalid id $1 && exit 1;
 h=$(git rev-parse --verify --quiet ${o:-origin}/releases/gcc-$r);
 test -z $h && h=$(git rev-parse --verify --quiet ${o:-origin}/master);
-p=$(git describe --all --match 'basepoints/gcc-'$r $h | sed -n 
's,^tags/,,;s,^basepoints/gcc-[1-9][0-9]*-\([0-9][0-9]*\)-g[0-9a-f]*$,\1,p;s,^basepoints/gcc-[1-9][0-9]*$,0,p');
+p=$(git describe --all --match 'basepoints/gcc-'$r $h | sed -n 
's,^tags/,,;s,^basepoints/gcc-[0-9][0-9]*-\([0-9][0-9]*\)-g[0-9a-f]*$,\1,p;s,^basepoints/gcc-[0-9][0-9]*$,0,p');
 git rev-parse --verify $h~$(expr $p - $n);


Jakub



[PATCH v2] PR tree-optimization/98335: Improvements to DSE's compute_trims.

2022-03-09 Thread Roger Sayle

Hi Richard,
Many thanks.  Yes, your proposed ao_ref_alignment is exactly what I was looking 
for.
Here's the second revision of my patch for PR tree-optimization/98335 that both 
uses/
introduces ao_ref_alignment and more intelligently aligns/trims both head and 
tail,
for example handling the case discussed by Richard and Jeff Law, of a 16 
byte-aligned
object where we wish to avoid trimming (just) the last three bytes.  It uses 
the useful
property that writing N consecutive bytes, typically requires popcount(N) store
instructions, so we wish to align (if we can) that we begin/end with a store of 
N' bytes
where popcount(N') is one, if that isn't already the case.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and
make -k check with no new failures.  Is this revised version ok for mainline?

2022-03-09  Roger Sayle  
Richard Biener  

gcc/ChangeLog
PR tree-optimization/98335
* builtins.cc (get_object_alignment_2): Export.
* builtins.h (get_object_alignment_2): Likewise.
* tree-ssa-alias.cc (ao_ref_alignment): New.
* tree-ssa-alias.h (ao_ref_alignment): Declare.

* tree-ssa-dse.cc (compute_trims): Improve logic deciding whether
to align head/tail, writing more bytes but using fewer store insns.
(maybe_trim_memstar_call): Silence compiler warnings by using
memset to initialize lendata.

gcc/testsuite/ChangeLog
PR tree-optimization/98335
* g++.dg/pr98335.C: New test case.
* gcc.dg/pr86010.c: New test case.
* gcc.dg/pr86010-2.c: New test case.

Thanks again for your help.
Roger
--

> -Original Message-
> From: Richard Biener 
> Sent: 08 March 2022 10:44
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] PR tree-optimization/98335: Improvements to DSE's
> compute_trims.
> 
> On Tue, Mar 8, 2022 at 11:10 AM Richard Biener
>  wrote:
> >
> > On Mon, Mar 7, 2022 at 11:04 AM Roger Sayle
>  wrote:
> > >
> > >
> > > This patch is the main middle-end piece of a fix for PR
> > > tree-opt/98335, which is a code-quality regression affecting
> > > mainline.  The issue occurs in DSE's (dead store elimination's)
> > > compute_trims function that determines where a store to memory can
> > > be trimmed.  In the testcase given in the PR, this function notices
> > > that the first byte of a DImode store is dead, and replaces the
> > > 8-byte store at (aligned) offset zero, with a 7-byte store at
> > > (unaligned) offset one.  Most architectures can store a power-of-two
> > > bytes (up to a maximum) in single instruction, so writing 7 bytes
> > > requires more instructions than writing 8 bytes.  This patch follows Jakub
> Jelinek's suggestion in comment 5, that compute_trims needs improved
> heuristics.
> > >
> > > In this patch, decision of whether and how to align trim_head is
> > > based on the number of bytes being written, the alignment of the
> > > start of the object and where within the object the first byte is
> > > written.  The first tests check whether we're already writing to the
> > > start of the object, and that we're writing three or more bytes.  If
> > > we're only writing one or two bytes, there's no benefit from providing
> additional alignment.
> > > Then we determine the alignment of the object, which is either 1, 2,
> > > 4, 8 or 16 byte aligned (capping at 16 guarantees that we never
> > > write more than 7 bytes beyond the minimum required).  If the buffer
> > > is only
> > > 1 or 2 byte aligned there's no benefit from additional alignment.
> > > For the remaining cases, alignment of trim_head is based upon where
> > > within each aligned block (word) the first byte is written.  For
> > > example, storing the last byte (or last half-word) of a word can be
> > > performed with a single insn.
> > >
> > > On x86_64-pc-linux-gnu with -O2 the new test case in the PR goes from:
> > >
> > > movl$0, -24(%rsp)
> > > movabsq $72057594037927935, %rdx
> > > movl$0, -21(%rsp)
> > > andq-24(%rsp), %rdx
> > > movq%rdx, %rax
> > > salq$8, %rax
> > > movbc(%rip), %al
> > > ret
> > >
> > > to
> > >
> > > xorl%eax, %eax
> > > movbc(%rip), %al
> > > ret
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make
> > > bootstrap and make -k check with no new failures.  I've also added
> > > new testcases for the original motivating PR
> > > tree-optimization/86010, to ensure that those remain optimized (in 
> > > future).
> Ok for mainline?
> >
> > diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc index
> > 2b22a61..080e406 100644
> > --- a/gcc/tree-ssa-dse.cc
> > +++ b/gcc/tree-ssa-dse.cc
> > @@ -405,10 +405,36 @@ compute_trims (ao_ref *ref, sbitmap live, int
> > *trim_head, int *trim_tail,
> >int first_live = bitmap_first_set_bit (live);
> >*trim_head = first_live - first_orig;
> >
> > -  /* If more than a word remains, the

Re: [PATCH] contrib: Fix non-portable sed commands in gcc-descr [PR102664/]

2022-03-09 Thread Patrick Palka via Gcc-patches
On Wed, Mar 9, 2022 at 8:54 AM Mikael Morin  wrote:
>
> Hello,
>
> Le 08/03/2022 à 18:58, Jonathan Wakely via Gcc-patches a écrit :
> > Replace \([0-9]\+\) with \([0-9][0-9]*\) or with \([1-9][0-9]*\) in release 
> > branch numbers, where
> > a leading zero does not occur.
> >
> Note that you also changed some gcc-[0-9]* to gcc-[1-9]*, which is a
> typo/thinko I guess?  It looks like it wouldn’t match gcc-10 any more
> for example…

Perhaps related to this, I noticed the following
  git gcc-descr ea1ce0d163ea1d63b6837144ae4be51d92630007
now fails with
  fatal: No tags can describe 'ea1ce0d163ea1d63b6837144ae4be51d92630007'.
instead of outputting
  r0-52309-gea1ce0d163ea1d

>
> > diff --git a/contrib/git-descr.sh b/contrib/git-descr.sh
> > index ba5d711f330..95363279d8c 100755
> > --- a/contrib/git-descr.sh
> > +++ b/contrib/git-descr.sh
> > @@ -18,11 +18,11 @@ do
> >   done
> >
> >   if test x$short = xyes; then
> > -r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 
> > 's,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)-\([0-9]\+\)-g[0-9a-f]*$,r\2-\3,p;s,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)$,r\2-0,p');
> > +r=$(git describe --all --match 'basepoints/gcc-[1-9]*' $c | sed -n 
> > 's,^tags/,,;s,^basepoints/gcc-\([1-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f]*$,r\1-\2,p;s,^basepoints/gcc-\([1-9][0-9]*\)$,r\1-0,p');
>
> …here…
>
> >   elif test x$long = xyes; then
> > -r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' $c 
> > | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p')
> > +r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[1-9]*' $c 
> > | sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
>
> … and here …
>
> >   else
> > -r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[0-9]*' $c 
> > | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p');
> > +r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[1-9]*' $c 
> > | sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
>
> … and here.
>



Re: [PATCH] toplevel: Makefile.def: Make configure-sim depend on all-readline

2022-03-09 Thread Richard Biener via Gcc-patches



> Am 09.03.2022 um 17:54 schrieb Hans-Peter Nilsson via Gcc-patches 
> :
> 
> Tom Tromey ok'd this for the sourceware side, but thinks I
> need explicit approval on the gcc side.  Ok to commit?
> 
> --- Start of forwarded message ---
> From: Hans-Peter Nilsson 
> To: "binut...@sourceware.org" ,
>"gdb-patc...@sourceware.org" 
> Subject: [PATCH] toplevel: Makefile.def: Make configure-sim depend on 
> all-readline
> 
> Calling on "global maintainers" as per toplevel/MAINTAINERS
> for "Makefile.*".
> 
> Ok to commit?

Ok 

> (If so, I'll also commit this change to the gcc repo, where
> this dependency is normally unused, i.e. when source trees
> are kept separate.)
> 
> brgds, H-P
> - - 8< -
> 
> Without this, a "make all-sim" without the equivalent of
> libreadline-dev installed on the build system, won't
> properly pick up the in-tree readline build, and you'll see:
> 
> mkdir -p -- ./sim
> Configuring in ./sim
> configure: creating cache ./config.cache
> checking build system type... x86_64-pc-linux-gnu
> checking host system type... x86_64-pc-linux-gnu
> checking target system type... cris-axis-elf
> checking for x86_64-pc-linux-gnu-gcc... gcc
> checking whether the C compiler works... yes
> ...
> checking for library containing tgetent... -ltermcap
> checking for readline in -lreadline... no
> configure: error: the required "readline" library is missing
> make[1]: *** [Makefile:11188: configure-sim] Error 1
> make[1]: Leaving directory '/home/hp/sim/b'
> 
> The sim dependency on readline is apparently (nominally)
> valid as there's a readline call in sim/erc32/sis.c.
> 
> 2022-02-21  Hans-Peter Nilsson  
> 
>* Makefile.def (dependencies): Make configure-sim depend on
>all-readline.
> - ---
> Makefile.def | 2 +-
> Makefile.in  | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Makefile.def b/Makefile.def
> index a504192e6d72..8181a7aa97b5 100644
> - --- a/Makefile.def
> +++ b/Makefile.def
> @@ -570,7 +570,7 @@ dependencies = { module=all-sim; on=all-intl; };
> dependencies = { module=all-sim; on=all-libiberty; };
> dependencies = { module=all-sim; on=all-bfd; };
> dependencies = { module=all-sim; on=all-opcodes; };
> - -dependencies = { module=all-sim; on=all-readline; };
> +dependencies = { module=configure-sim; on=all-readline; };
> 
> // Other host modules.
> dependencies = { module=all-fastjar; on=all-zlib; };
> diff --git a/Makefile.in b/Makefile.in
> index 2b77a4706947..843e150dac63 100644
> - --- a/Makefile.in
> +++ b/Makefile.in
> @@ -63072,7 +63072,7 @@ install-strip-sid: maybe-install-strip-tcl
> install-sid: maybe-install-tk
> install-strip-sid: maybe-install-strip-tk
> configure-sim: maybe-all-gnulib
> - -all-sim: maybe-all-readline
> +configure-sim: maybe-all-readline
> all-fastjar: maybe-all-build-texinfo
> all-libctf: all-libiberty
> all-stage1-libctf: all-stage1-libiberty
> - -- 
> 2.30.2
> --- End of forwarded message ---


Re: [PATCH] contrib: Fix non-portable sed commands in gcc-descr [PR102664/]

2022-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Mar 2022 at 14:01, Mikael Morin  wrote:
>
> Hello,
>
> Le 08/03/2022 à 18:58, Jonathan Wakely via Gcc-patches a écrit :
> > Replace \([0-9]\+\) with \([0-9][0-9]*\) or with \([1-9][0-9]*\) in release 
> > branch numbers, where
> > a leading zero does not occur.
> >
> Note that you also changed some gcc-[0-9]* to gcc-[1-9]*, which is a
> typo/thinko I guess?  It looks like it wouldn’t match gcc-10 any more
> for example…

Those are all in the --match argument, which is a shell glob not a
regex. See the git-describe(1) man page:

  --match 
  Only consider tags matching the given glob(7) pattern

>
> > diff --git a/contrib/git-descr.sh b/contrib/git-descr.sh
> > index ba5d711f330..95363279d8c 100755
> > --- a/contrib/git-descr.sh
> > +++ b/contrib/git-descr.sh
> > @@ -18,11 +18,11 @@ do
> >   done
> >
> >   if test x$short = xyes; then
> > -r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 
> > 's,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)-\([0-9]\+\)-g[0-9a-f]*$,r\2-\3,p;s,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)$,r\2-0,p');
> > +r=$(git describe --all --match 'basepoints/gcc-[1-9]*' $c | sed -n 
> > 's,^tags/,,;s,^basepoints/gcc-\([1-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f]*$,r\1-\2,p;s,^basepoints/gcc-\([1-9][0-9]*\)$,r\1-0,p');
>
> …here…
>
> >   elif test x$long = xyes; then
> > -r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' $c 
> > | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p')
> > +r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[1-9]*' $c 
> > | sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
>
> … and here …
>
> >   else
> > -r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[0-9]*' $c 
> > | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p');
> > +r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[1-9]*' $c 
> > | sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')
>
> … and here.
>



[PATCH] toplevel: Makefile.def: Make configure-sim depend on all-readline

2022-03-09 Thread Hans-Peter Nilsson via Gcc-patches
Tom Tromey ok'd this for the sourceware side, but thinks I
need explicit approval on the gcc side.  Ok to commit?

--- Start of forwarded message ---
From: Hans-Peter Nilsson 
To: "binut...@sourceware.org" ,
"gdb-patc...@sourceware.org" 
Subject: [PATCH] toplevel: Makefile.def: Make configure-sim depend on 
all-readline

Calling on "global maintainers" as per toplevel/MAINTAINERS
for "Makefile.*".

Ok to commit?

(If so, I'll also commit this change to the gcc repo, where
this dependency is normally unused, i.e. when source trees
are kept separate.)

brgds, H-P
- - 8< -

Without this, a "make all-sim" without the equivalent of
libreadline-dev installed on the build system, won't
properly pick up the in-tree readline build, and you'll see:

mkdir -p -- ./sim
Configuring in ./sim
configure: creating cache ./config.cache
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking target system type... cris-axis-elf
checking for x86_64-pc-linux-gnu-gcc... gcc
checking whether the C compiler works... yes
...
checking for library containing tgetent... -ltermcap
checking for readline in -lreadline... no
configure: error: the required "readline" library is missing
make[1]: *** [Makefile:11188: configure-sim] Error 1
make[1]: Leaving directory '/home/hp/sim/b'

The sim dependency on readline is apparently (nominally)
valid as there's a readline call in sim/erc32/sis.c.

2022-02-21  Hans-Peter Nilsson  

* Makefile.def (dependencies): Make configure-sim depend on
all-readline.
- ---
 Makefile.def | 2 +-
 Makefile.in  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile.def b/Makefile.def
index a504192e6d72..8181a7aa97b5 100644
- --- a/Makefile.def
+++ b/Makefile.def
@@ -570,7 +570,7 @@ dependencies = { module=all-sim; on=all-intl; };
 dependencies = { module=all-sim; on=all-libiberty; };
 dependencies = { module=all-sim; on=all-bfd; };
 dependencies = { module=all-sim; on=all-opcodes; };
- -dependencies = { module=all-sim; on=all-readline; };
+dependencies = { module=configure-sim; on=all-readline; };
 
 // Other host modules.
 dependencies = { module=all-fastjar; on=all-zlib; };
diff --git a/Makefile.in b/Makefile.in
index 2b77a4706947..843e150dac63 100644
- --- a/Makefile.in
+++ b/Makefile.in
@@ -63072,7 +63072,7 @@ install-strip-sid: maybe-install-strip-tcl
 install-sid: maybe-install-tk
 install-strip-sid: maybe-install-strip-tk
 configure-sim: maybe-all-gnulib
- -all-sim: maybe-all-readline
+configure-sim: maybe-all-readline
 all-fastjar: maybe-all-build-texinfo
 all-libctf: all-libiberty
 all-stage1-libctf: all-stage1-libiberty
- -- 
2.30.2
--- End of forwarded message ---


Re: [Patch] GCN: Implement __atomic_compare_exchange_{1, 2} in libgcc [PR102215]

2022-03-09 Thread Andrew Stubbs

On 09/03/2022 16:29, Tobias Burnus wrote:

This shows up with with OpenMP offloading as libgomp since a couple
of months uses __atomic_compare_exchange (see PR for details),
causing link errors when the gcn libgomp.a is linked.
It also shows up with sollve_vv.

The implementation does a bit copy'n'paste from the current
implementation + calls the existing word/uint32_t-wide version
of the atomic intrinsic. The semantic is described at
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

In terms of the args, the _4 has:
DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4,
   "__atomic_compare_exchange_4",
   BT_FN_BOOL_VPTR_PTR_I4_BOOL_INT_INT,
   ATTR_NOTHROWCALL_LEAF_LIST)
and the arg names try to match the GCC manual.


OK for mainline?
Tested with libgomp + sollve_vv and -march=gfx908.


OK.

Andrew


[Patch] GCN: Implement __atomic_compare_exchange_{1,2} in libgcc [PR102215]

2022-03-09 Thread Tobias Burnus

This shows up with with OpenMP offloading as libgomp since a couple
of months uses __atomic_compare_exchange (see PR for details),
causing link errors when the gcn libgomp.a is linked.
It also shows up with sollve_vv.

The implementation does a bit copy'n'paste from the current
implementation + calls the existing word/uint32_t-wide version
of the atomic intrinsic. The semantic is described at
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

In terms of the args, the _4 has:
DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4,
  "__atomic_compare_exchange_4",
  BT_FN_BOOL_VPTR_PTR_I4_BOOL_INT_INT,
  ATTR_NOTHROWCALL_LEAF_LIST)
and the arg names try to match the GCC manual.


OK for mainline?
Tested with libgomp + sollve_vv and -march=gfx908.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
GCN: Implement __atomic_compare_exchange_{1,2} in libgcc [PR102215]

libgcc/ChangeLog:

	PR target/102215
	* config/gcn/atomic.c (__sync_val_compare_and_swap_##SIZE): Move
	a line up to non-arg-dependent value first.
	(__ATOMIC_COMPARE_EXCHANGE): Define + call to generate
	__atomic_compare_exchange_{1,2}.

diff --git a/libgcc/config/gcn/atomic.c b/libgcc/config/gcn/atomic.c
index 8784f90dfe2..cf29fa82aba 100644
--- a/libgcc/config/gcn/atomic.c
+++ b/libgcc/config/gcn/atomic.c
@@ -18,42 +18,69 @@
 
You should have received a copy of the GNU General Public License and
a copy of the GCC Runtime Library Exception along with this program;
see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
.  */
 
 #include 
 
 #define __SYNC_SUBWORD_COMPARE_AND_SWAP(TYPE, SIZE)			 \
 	 \
 TYPE	 \
 __sync_val_compare_and_swap_##SIZE (TYPE *ptr, TYPE oldval, TYPE newval) \
 {	 \
+  unsigned int valmask = (1 << (SIZE * 8)) - 1; \
   unsigned int *wordptr = (unsigned int *)((__UINTPTR_TYPE__ ) ptr & ~3UL);  \
   int shift = ((__UINTPTR_TYPE__ ) ptr & 3UL) * 8;			 \
-  unsigned int valmask = (1 << (SIZE * 8)) - 1; \
   unsigned int wordmask = ~(valmask << shift); \
   unsigned int oldword = *wordptr;	 \
   for (;;) \
 {	 \
   TYPE prevval = (oldword >> shift) & valmask;			 \
   if (__builtin_expect (prevval != oldval, 0))			 \
 	return prevval;			 \
   unsigned int newword = oldword & wordmask;			 \
   newword |= ((unsigned int) newval) << shift;			 \
   unsigned int prevword		 \
 	  = __sync_val_compare_and_swap_4 (wordptr, oldword, newword);	 \
   if (__builtin_expect (prevword == oldword, 1))			 \
 	return oldval;			 \
   oldword = prevword;		 \
 }	 \
 }	 \
 	 \
 bool	 \
 __sync_bool_compare_and_swap_##SIZE (TYPE *ptr, TYPE oldval, TYPE newval)\
 {	 \
   return __sync_val_compare_and_swap_##SIZE (ptr, oldval, newval) == oldval; \
 }
 
 __SYNC_SUBWORD_COMPARE_AND_SWAP (unsigned char, 1)
 __SYNC_SUBWORD_COMPARE_AND_SWAP (unsigned short, 2)
 
+
+#define __ATOMIC_COMPARE_EXCHANGE(TYPE,SIZE)  \
+bool	  \
+__atomic_compare_exchange_##SIZE (TYPE *ptr, TYPE *expected,		  \
+  TYPE desired, bool weak,		  \
+  int success_memorder, int failure_memorder) \
+{	  \
+  unsigned int valmask = (1 << (SIZE * 8)) - 1;  \
+	  \
+  unsigned int *wordptr = (unsigned int *)((__UINTPTR_TYPE__ ) ptr & ~3UL);   \
+  int ptrshift = ((__UINTPTR_TYPE__ ) ptr & 3UL) * 8;			  \
+  unsigned int wordmask = ~(valmask << ptrshift);			  \
+	  \
+  unsigned int ptrword = *wordptr;	  \
+  unsigned int exptword = ptrword & wordmask;  \
+  unsigned int newword = ptrword & wordmask;  \
+  exptword |= ((unsigned int) *expected) << ptrshift;			  \
+  newword |= ((unsigned int) desired) << ptrshift;			  \
+  if (__atomic_compare_exchange_4 (wordptr, &exptword, newword, weak,	  \
+   success_memorder, failure_memorder))	  \
+return true;			  \
+  *expected = (TYPE) ((exptword >> ptrshift) & valmask);		  \
+  return false;  \
+}
+
+__ATOMIC_COMPARE_EXCHANGE (unsigned char, 1)
+__ATOMIC_COMPARE_EXCHANGE (unsigned short, 2)


[PATCH v2] cse: avoid signed overflow in compute_const_anchors [PR 104843]

2022-03-09 Thread Xi Ruoyao via Gcc-patches
On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote:

> isn't it better to make targetm.const_anchor unsigned?
> The & and ~ are not subject to overflow rules.

It's not enough: if n is the minimum value of HOST_WIDE_INT and
const_anchor = 0x8000 (the value for MIPS), we'll have a signed 0x7fff
in *upper_base.  Then the next line, "*upper_offs = n - *upper_base;"
will be a signed overflow again.

How about the following?

-- >8 --

With a non-zero const_anchor, the behavior of this function relied on
signed overflow.

gcc/

PR rtl-optimization/104843
* cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for
n to perform overflow arithmetics safely.
---
 gcc/cse.cc | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/cse.cc b/gcc/cse.cc
index a18b599d324..052fa0c3490 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst,
   HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
   HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
 {
-  HOST_WIDE_INT n = INTVAL (cst);
-
-  *lower_base = n & ~(targetm.const_anchor - 1);
-  if (*lower_base == n)
+  unsigned HOST_WIDE_INT n = UINTVAL (cst);
+  unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1);
+  if (lb == n)
 return false;
 
+  *lower_base = lb;
   *upper_base =
 (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
   *upper_offs = n - *upper_base;
-- 
2.35.1


> 


[PATCH v2] c++: Don't allow type-constraint auto(x) [PR104752]

2022-03-09 Thread Marek Polacek via Gcc-patches
On Tue, Mar 08, 2022 at 02:47:42PM -0500, Jason Merrill wrote:
> On 3/2/22 14:31, Marek Polacek wrote:
> > 104752 points out that
> > 
> >template
> >concept C = true;
> >auto y = C auto(1);
> > 
> > is ill-formed as per [dcl.type.auto.deduct]: "For an explicit type 
> > conversion,
> > T is the specified type, which shall be auto." which doesn't allow
> > type-constraint auto.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > PR c++/104752
> > 
> > gcc/cp/ChangeLog:
> > 
> > * semantics.cc (finish_compound_literal): Disallow auto{x} for
> > is_constrained_auto.
> > * typeck2.cc (build_functional_cast_1): Likewise.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp23/auto-fncast12.C: New test.
> > ---
> >   gcc/cp/semantics.cc| 1 +
> >   gcc/cp/typeck2.cc  | 1 +
> >   gcc/testsuite/g++.dg/cpp23/auto-fncast12.C | 8 
> >   3 files changed, 10 insertions(+)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast12.C
> > 
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index a2c0eb050e6..5129b12f00f 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -3148,6 +3148,7 @@ finish_compound_literal (tree type, tree 
> > compound_literal,
> > /* C++23 auto{x}.  */
> > else if (is_auto (type)
> >&& !AUTO_IS_DECLTYPE (type)
> > +  && !is_constrained_auto (type)
> >&& CONSTRUCTOR_NELTS (compound_literal) == 1)
> >   {
> > if (cxx_dialect < cxx23)
> > diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
> > index 39d03e4c3c4..c9314bbeb6f 100644
> > --- a/gcc/cp/typeck2.cc
> > +++ b/gcc/cp/typeck2.cc
> > @@ -2305,6 +2305,7 @@ build_functional_cast_1 (location_t loc, tree exp, 
> > tree parms,
> > init = parms;
> > /* C++23 auto(x).  */
> > else if (!AUTO_IS_DECLTYPE (anode)
> > +  && !is_constrained_auto (anode)
> >&& list_length (parms) == 1)
> > {
> >   init = TREE_VALUE (parms);
> 
> We might get a better diagnostic by moving this test inside the block and
> saying that the auto can't be constrained, instead of "invalid use of auto"?

Sure.  So like this?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
104752 points out that

  template
  concept C = true;
  auto y = C auto(1);

is ill-formed as per [dcl.type.auto.deduct]: "For an explicit type conversion,
T is the specified type, which shall be auto." which doesn't allow
type-constraint auto.

PR c++/104752

gcc/cp/ChangeLog:

* semantics.cc (finish_compound_literal): Disallow auto{x} for
is_constrained_auto.
* typeck2.cc (build_functional_cast_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast12.C: New test.
---
 gcc/cp/semantics.cc| 8 +++-
 gcc/cp/typeck2.cc  | 8 +++-
 gcc/testsuite/g++.dg/cpp23/auto-fncast12.C | 8 
 3 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast12.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 799ce943279..21740064d3d 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3150,7 +3150,13 @@ finish_compound_literal (tree type, tree 
compound_literal,
   && !AUTO_IS_DECLTYPE (type)
   && CONSTRUCTOR_NELTS (compound_literal) == 1)
 {
-  if (cxx_dialect < cxx23)
+  if (is_constrained_auto (type))
+   {
+ if (complain & tf_error)
+   error ("% cannot be constrained");
+ return error_mark_node;
+   }
+  else if (cxx_dialect < cxx23)
pedwarn (input_location, OPT_Wc__23_extensions,
 "% only available with "
 "%<-std=c++2b%> or %<-std=gnu++2b%>");
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 39d03e4c3c4..a4c825fc34d 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -2308,7 +2308,13 @@ build_functional_cast_1 (location_t loc, tree exp, tree 
parms,
   && list_length (parms) == 1)
{
  init = TREE_VALUE (parms);
- if (cxx_dialect < cxx23)
+ if (is_constrained_auto (anode))
+   {
+ if (complain & tf_error)
+   error_at (loc, "% cannot be constrained");
+ return error_mark_node;
+   }
+ else if (cxx_dialect < cxx23)
pedwarn (loc, OPT_Wc__23_extensions,
 "% only available with "
 "%<-std=c++2b%> or %<-std=gnu++2b%>");
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast12.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast12.C
new file mode 100644
index 000..2856c2846fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast12.C
@@ -0,0 +1,8 @@
+// PR c++/104752
+// { dg-do compile { target c++23 } }
+
+template
+concept C = true;
+auto x = auto(1); // valid (P0849R8)
+auto y = C auto(1);   // { dg-error "c

Re: [PATCH] contrib: Avoid use of "echo -n" in git customization [PR102664]

2022-03-09 Thread Jonathan Wakely via Gcc-patches

On 09/03/22 12:15 +, Richard Earnshaw wrote:

The -n option to echo is non-portable.  The generally recommended
alternative is to use the shell printf command.

contrib/ChangeLog:

PR other/102664
* gcc-git-customization.sh (ask): Use printf instead of echo -n.

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index b24948d9874..cf46c494a6a 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -7,7 +7,7 @@ ask () {
question=$1
default=$2
var=$3
-echo -n $question "["$default"]? "
+printf "%s" "$question [$default]? "
read answer
if [ "x$answer" = "x" ]
then


This isn't enough to get the script working on AIX and Solaris. The
attached patch has been tested on Fedora Linux, NetBSD 9.2, AIX 7 and
Solaris 11.

The part checking the result of `git rev-parse --git-path hooks` was
needed to work around Git 2.4.0 on gcc211 in the compile farm, which
is a Solaris 11 sparc box. That's a truly ancient version, but
handling the error (and just skipping installation of the hook) isn't
difficult, so seems worthwhile. I can revert that part if preferred.

OK for trunk?


commit 1f23f091a8d6e3e337fd4647fec8afa8d2d8bd46
Author: Jonathan Wakely 
Date:   Wed Mar 9 14:53:52 2022

contrib: Fix non-portable shell commands in gcc-git-customization.sh [PR102664]

Use printf instead of echo -n. Use Basic Regular Expressions instead of
sed -r. Check for error from ancient Git versions that don't support the
--git-path option for git-rev-parse. Remove -c flag from install
command, as it's ignored by GNU and BSD install, but means something
different for Solaris and AIX.

contrib/ChangeLog:

PR other/102664
* gcc-git-customization.sh: Fix non-portable commands.

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index b24948d9874..914d868f7bd 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -7,7 +7,7 @@ ask () {
 question=$1
 default=$2
 var=$3
-echo -n $question "["$default"]? "
+printf "%s [%s]? " "$question" "$default"
 read answer
 if [ "x$answer" = "x" ]
 then
@@ -110,7 +110,7 @@ then
 	# This is a pure guess, but for many people it might be OK.
 	remote_id=$(whoami)
 else
-	remote_id=$(echo $url | sed -r "s|^.*ssh://(.+)@gcc.gnu.org.*$|\1|")
+	remote_id=$(echo $url | sed 's|^.*ssh://\(..*\)@gcc.gnu.org.*$|\1|')
 	if [ x$remote_id = x$url ]
 	then
 	remote_id=$(whoami)
@@ -118,7 +118,7 @@ then
 fi
 fi
 
-ask "Account name on gcc.gnu.org (for your personal branches area)" $remote_id remote_id
+ask "Account name on gcc.gnu.org (for your personal branches area)" "$remote_id" remote_id
 git config "gcc-config.user" "$remote_id"
 
 old_pfx=$(git config --get "gcc-config.userpfx")
@@ -135,16 +135,20 @@ git config "gcc-config.userpfx" "$new_pfx"
 echo
 ask "Install prepare-commit-msg git hook for 'git commit-mklog' alias" yes dohook
 if [ "x$dohook" = xyes ]; then
-hookdir=`git rev-parse --git-path hooks`
-if [ -f "$hookdir/prepare-commit-msg" ]; then
-	echo " Moving existing prepare-commit-msg hook to prepare-commit-msg.bak"
-	mv "$hookdir/prepare-commit-msg" "$hookdir/prepare-commit-msg.bak"
+hookdir=`git rev-parse --git-path hooks 2>/dev/null`
+if [ $? -eq 0 ]; then
+	if [ -f "$hookdir/prepare-commit-msg" ]; then
+	echo " Moving existing prepare-commit-msg hook to prepare-commit-msg.bak"
+	mv "$hookdir/prepare-commit-msg" "$hookdir/prepare-commit-msg.bak"
+	fi
+	install -c "`git rev-parse --show-toplevel`/contrib/prepare-commit-msg" "$hookdir"
+else
+	echo " `git --version` is too old, cannot find hooks dir"
 fi
-install -c "`git rev-parse --show-toplevel`/contrib/prepare-commit-msg" "$hookdir"
 fi
 
 # Scan the existing settings to see if there are any we need to rewrite.
-vendors=$(git config --get-all "remote.${upstream}.fetch" "refs/vendors/" | sed -r "s:.*refs/vendors/([^/]+)/.*:\1:" | sort | uniq)
+vendors=$(git config --get-all "remote.${upstream}.fetch" "refs/vendors/" | sed 's:.*refs/vendors/\([^/][^/]*\)/.*:\1:' | sort | uniq)
 url=$(git config --get "remote.${upstream}.url")
 pushurl=$(git config --get "remote.${upstream}.pushurl")
 for v in $vendors


Re: [PATCH] cse: avoid signed overflow in compute_const_anchors [PR 104843]

2022-03-09 Thread Richard Biener via Gcc-patches
On Wed, Mar 9, 2022 at 12:20 PM Xi Ruoyao via Gcc-patches
 wrote:
>
> Bootstrapped and tested on mips64el-linux-gnuabi64, and MIPS is the only
> port with a non-zero targetm.const_anchor.  Ok for trunk?
>
> -- >8 --
>
> With a non-zero const_anchor, the behavior of this function relied on
> signed overflow.
>
> gcc/
>
> PR rtl-optimization/104843
> * cse.cc (compute_const_anchors): Cast to unsigned HOST_WIDE_INT
> to perform overflow arithmetics safely.
> ---
>  gcc/cse.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/cse.cc b/gcc/cse.cc
> index a18b599d324..7c39a009449 100644
> --- a/gcc/cse.cc
> +++ b/gcc/cse.cc
> @@ -1169,10 +1169,10 @@ compute_const_anchors (rtx cst,
>HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
>HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
>  {
> -  HOST_WIDE_INT n = INTVAL (cst);
> +  unsigned HOST_WIDE_INT n = INTVAL (cst);

UINTVAL (cst)?

>
>*lower_base = n & ~(targetm.const_anchor - 1);

isn't it better to make targetm.const_anchor unsigned?
The & and ~ are not subject to overflow rules.

> -  if (*lower_base == n)
> +  if (*lower_base == INTVAL (cst))

duplicating this here is definitely ugly.

>  return false;
>
>*upper_base =
> --
> 2.35.1
>
>


Re: [RFC][PATCH, OpenMP/OpenACC, libgomp] Allow base-pointers to be NULL

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 10:43:56PM +0800, Chung-Lin Tang wrote:
> when troubleshooting building/running SPEC HPC 2021 with GCC with OpenMP 
> offloading,
> specifically 534.hpgmgfv_t, an issue encountered in the benchmark was:
> when the benchmark was initializing and creating its data environment on the 
> GPU,
> it was trying to map array sections where the base-pointer is actually NULL:
> ...
> for (block=0;block<3;++block) {
>   #pragma omp target enter data 
> map(to:level->restriction[shape].blocks[block][:length])
>   // level->restriction[shape].blocks[block] == NULL for some values of index 
> 'block'
> ...
> 
> The benchmark appears to be assuming that such NULL base-pointers would 
> simply be
> silently ignored, and the program would just keep running.

I'd say that looks like a clear SPEC HPC 2021 bug, array sections are
similar to array elements and using even &ptr[0] or &ptr[10] when ptr is NULL 
is UB
in C/C++.
Perhaps it doesn't hurt to discuss it on omp-lang, but I wouldn't change
anything on the compiler side until those discussions.

> 2022-03-09  Chung-Lin Tang  
> 
> libgomp/ChangeLog:
> 
>   * target.c (gomp_attach_pointer): When pointer is NULL,
>   return instead of calling gomp_fatal.

> diff --git a/libgomp/target.c b/libgomp/target.c
> index 9017458885e..0e8bbd83c20 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -796,8 +796,8 @@ gomp_attach_pointer (struct gomp_device_descr *devicep,
>  
>if ((void *) target == NULL)
>   {
> -   gomp_mutex_unlock (&devicep->lock);
> -   gomp_fatal ("attempt to attach null pointer");
> +   n->aux->attach_count[idx] = 0;
> +   return;
>   }
>  
>s.host_start = target + bias;


Jakub



[RFC][PATCH, OpenMP/OpenACC, libgomp] Allow base-pointers to be NULL

2022-03-09 Thread Chung-Lin Tang

Hi all,
when troubleshooting building/running SPEC HPC 2021 with GCC with OpenMP 
offloading,
specifically 534.hpgmgfv_t, an issue encountered in the benchmark was:
when the benchmark was initializing and creating its data environment on the 
GPU,
it was trying to map array sections where the base-pointer is actually NULL:
...
for (block=0;block<3;++block) {
  #pragma omp target enter data 
map(to:level->restriction[shape].blocks[block][:length])
  // level->restriction[shape].blocks[block] == NULL for some values of index 
'block'
...

The benchmark appears to be assuming that such NULL base-pointers would simply 
be
silently ignored, and the program would just keep running.

(BTW, the above case needs this patch to compile:
 https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590658.html
 which is still awaiting review :) )

What we currently do in libgomp, however, is that we issue an error and call 
gomp_fatal():
libgomp/target.c:gomp_attach_pointer():
...
   if ((void *) target == NULL)
{
- gomp_mutex_unlock (&devicep->lock);
- gomp_fatal ("attempt to attach null pointer");
+ n->aux->attach_count[idx] = 0;  // proposed change 
attached in patch
+ return;
...
Some quick testing shows that clang/LLVM behaves mostly the same as GCC.

OTOH, nVidia HPC SDK (PGI) does appear to silently go on without bailing out.
(I have not verified if 534.hpgmgfv_t fully works with PGI, just observed how 
their
runtime handles NULL base-pointers)

I don't see any explicit description of this case in the OpenMP specifications, 
just simply
"The corresponding pointer variable becomes an attached pointer", lack of 
description on how
this is to be handled.

So WDYGT? Should libgomp behavior be adjusted here, or should SPEC benchmark 
source be adjusted?
(The attached patch to adjust libgomp attach behavior has been regtested 
without regressions, FWIW)

Thanks,
Chung-Lin

2022-03-09  Chung-Lin Tang  

libgomp/ChangeLog:

* target.c (gomp_attach_pointer): When pointer is NULL,
return instead of calling gomp_fatal.
diff --git a/libgomp/target.c b/libgomp/target.c
index 9017458885e..0e8bbd83c20 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -796,8 +796,8 @@ gomp_attach_pointer (struct gomp_device_descr *devicep,
 
   if ((void *) target == NULL)
{
- gomp_mutex_unlock (&devicep->lock);
- gomp_fatal ("attempt to attach null pointer");
+ n->aux->attach_count[idx] = 0;
+ return;
}
 
   s.host_start = target + bias;


Re: [PATCH] c++: naming a dependently-scoped template for CTAD [PR104641]

2022-03-09 Thread Patrick Palka via Gcc-patches
On Tue, 8 Mar 2022, Jason Merrill wrote:

> On 3/2/22 14:32, Patrick Palka wrote:
> > In order to be able to perform CTAD for a dependently-scoped template
> > such as A::B in the testcase below, we need to permit a
> > typename-specifier to resolve to a template as per [dcl.type.simple]/2,
> > at least when it appears in a CTAD-enabled context.
> > 
> > This patch implements this using a new tsubst flag tf_tst_ok to control
> > when a TYPENAME_TYPE is allowed to name a template, and sets this flag
> > when substituting into the type of a CAST_EXPR, CONSTRUCTOR or VAR_DECL
> > (each of which is a CTAD-enabled context).
> 
> What breaks if we always allow that, or at least in -std that support CTAD?

AFAICT no significant breakage, but some accepts-invalid and diagnostic
regressions crop up, e.g. accepts-invalid for

  using type = typename A::B; // no more diagnostic if typename resolves to a
 // template at instantiation time

and diagnostic regression for

  template::B> void f();
  // no more elaboration why deduction failed if typename resolves
  // to a template



Re: [PATCH] Restore INDIRECT_REF asm operand heuristic with MEM_REF

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 03:21:53PM +0100, Richard Biener wrote:
> As noticed we are looking for INDIRECT_REF with allows_mem to avoid
> a copy since then we're sure the operand is in memory (assuming
> *& is folded).  But INDIRECT_REFs are no longer a thing, the following
> replaces the check with a check for a MEM_REF with a non-ADDR_EXPR
> operand.  This should fix the regression part without fully
> exploring all possibilities around tcc_reference operands.
> 
> I've placed an assert that we do not see an INDIRECT_REF here.
> While we gimplify asm operands we never do any checking on its
> IL afterwards.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK if that succeeds?
> 
> Thanks,
> Richard.
> 
> 2022-03-09  Richard Biener  
> 
>   * cfgexpand.c (expand_gimple_asm): Special-case MEM_REF
>   with non-decl operand, avoiding a copy.

LGTM.

Jakub



[PATCH] Restore INDIRECT_REF asm operand heuristic with MEM_REF

2022-03-09 Thread Richard Biener via Gcc-patches
As noticed we are looking for INDIRECT_REF with allows_mem to avoid
a copy since then we're sure the operand is in memory (assuming
*& is folded).  But INDIRECT_REFs are no longer a thing, the following
replaces the check with a check for a MEM_REF with a non-ADDR_EXPR
operand.  This should fix the regression part without fully
exploring all possibilities around tcc_reference operands.

I've placed an assert that we do not see an INDIRECT_REF here.
While we gimplify asm operands we never do any checking on its
IL afterwards.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK if that succeeds?

Thanks,
Richard.

2022-03-09  Richard Biener  

* cfgexpand.c (expand_gimple_asm): Special-case MEM_REF
with non-decl operand, avoiding a copy.
---
 gcc/cfgexpand.cc | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 4f99f040450..d3cc77d2ca9 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3290,7 +3290,10 @@ expand_asm_stmt (gasm *stmt)
 
   generating_concat_p = 0;
 
-  if ((TREE_CODE (val) == INDIRECT_REF && allows_mem)
+  gcc_assert (TREE_CODE (val) != INDIRECT_REF);
+  if (((TREE_CODE (val) == MEM_REF
+   && TREE_CODE (TREE_OPERAND (val, 0)) != ADDR_EXPR)
+  && allows_mem)
  || (DECL_P (val)
  && (allows_mem || REG_P (DECL_RTL (val)))
  && ! (REG_P (DECL_RTL (val))
-- 
2.34.1


Re: [PATCH] rs6000, v2: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 9 Mar 2022 at 13:27, Jakub Jelinek  wrote:
> On Mon, Mar 07, 2022 at 03:37:18PM -0600, Segher Boessenkool wrote:
> > > * config/rs6000/rs6000-gen-builtins.cc (TYPE_MAP_SIZE): Set to
> > > 85 instead of 86.
> >
> > This should be auto-generated, or just not exist at all
> > ("sizeof type_map / sizeof type_map[0]" in the one place it is used,
> > instead -- "one place" is after removing it from the definition of the
> > array of course, where it is unneeded :-) )
>
> Unfortunately the generator doesn't use libiberty.h, so I couldn't use
> ARRAY_SIZE.  Just used sizeof / sizeof [0].

FWIW this could be simply std::size(type_map) in C++17.

C++11 allows std::end(type_map) - std::begin(type_map) but that's not
really an improvement.

Those functions are all overloaded for arrays and deduce the array length.



Re: [PATCH] contrib: Fix non-portable sed commands in gcc-descr [PR102664/]

2022-03-09 Thread Mikael Morin

Hello,

Le 08/03/2022 à 18:58, Jonathan Wakely via Gcc-patches a écrit :

Replace \([0-9]\+\) with \([0-9][0-9]*\) or with \([1-9][0-9]*\) in release 
branch numbers, where
a leading zero does not occur.

Note that you also changed some gcc-[0-9]* to gcc-[1-9]*, which is a 
typo/thinko I guess?  It looks like it wouldn’t match gcc-10 any more 
for example…



diff --git a/contrib/git-descr.sh b/contrib/git-descr.sh
index ba5d711f330..95363279d8c 100755
--- a/contrib/git-descr.sh
+++ b/contrib/git-descr.sh
@@ -18,11 +18,11 @@ do
  done
  
  if test x$short = xyes; then

-r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 
's,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)-\([0-9]\+\)-g[0-9a-f]*$,r\2-\3,p;s,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)$,r\2-0,p');
+r=$(git describe --all --match 'basepoints/gcc-[1-9]*' $c | sed -n 
's,^tags/,,;s,^basepoints/gcc-\([1-9][0-9]*\)-\([0-9][0-9]*\)-g[0-9a-f]*$,r\1-\2,p;s,^basepoints/gcc-\([1-9][0-9]*\)$,r\1-0,p');


…here…


  elif test x$long = xyes; then
-r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' $c | 
sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p')
+r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[1-9]*' $c | 
sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')


… and here …


  else
-r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[0-9]*' $c | 
sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p');
+r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[1-9]*' $c | 
sed -n 's,^tags/,,;s,^basepoints/gcc-,r,p')


… and here.


[PATCH] testsuite/104759 - adjust gcc.dg/vect/vect-multitypes-12.c

2022-03-09 Thread Richard Biener via Gcc-patches
This adjusts gcc.dg/vect/vect-multitypes-12.c to just look for the
interesting loop vectorization.

tested on x86_64-unknown-linux-gnu, pushed.

2022-03-09  Richard Biener  

PR testsuite/104759
* gcc.dg/vect/vect-multitypes-12.c: Adjust.
---
 gcc/testsuite/gcc.dg/vect/vect-multitypes-12.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-12.c 
b/gcc/testsuite/gcc.dg/vect/vect-multitypes-12.c
index b09caeb51b6..4782d3f7d10 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-12.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-12.c
@@ -26,6 +26,7 @@ int main (void)
 
   for (i=0; i

[PATCH] rs6000, v2: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Mon, Mar 07, 2022 at 03:37:18PM -0600, Segher Boessenkool wrote:
> > As mentioned in the PR, right now on powerpc* __SIZEOF_{FLOAT,IBM}128__
> > macros are predefined unconditionally, because {ieee,ibm}128_float_type_node
> > is always non-NULL, doesn't reflect whether __ieee128 or __ibm128 are
> > actually supported or not.
> > 
> > The following patch:
> > 1) makes those {ieee,ibm}128_float_type_node trees NULL if we don't
> >really support them instead of equal to long_double_type_node
> > 2) adjusts the builtins code to use
> >ibm128_float_type_node ? ibm128_float_type_node : long_double_type_node
> >for the 2 builtins, so that we don't ICE during builtins creation
> >if ibm128_float_type_node is NULL (using error_mark_node instead of
> >long_double_type_node sounds more risky to me)
> 
> I feel the opposite way: (potentially) using the wrong thing is just a
> ticking time bomb, never "safer".

Actually, fallback to long_double_type_node is the right thing, see below.

> There needs to be a __SIZEOF_IEEE128__ as well, if you like reality :-)

Ok, added now.

> The basic types, that always exist if they are supported at all, are
> __ibm128 and __ieee128.  Long double picks one of those, or some other
> type; and __float128 is a source of confusion (it tries to make things
> more similar to x86, but things are *not* similar anyway :-( )

Not just x86, there are multiple targets with __float128 support, and
when it works, it behaves the same on all of them.
Mike's PR99708 patch (which unfortunately has been backported to various
branches without resolving these issues first) regressed on powerpc64-linux
+FAIL: gcc.dg/pr83198.c (test for excess errors)
+FAIL: c-c++-common/ubsan/float-cast-overflow-7.c   -O2  (test for excess 
errors)
+UNRESOLVED: c-c++-common/ubsan/float-cast-overflow-7.c   -O2  compilation 
failed to produce executable
+FAIL: c-c++-common/ubsan/float-cast-overflow-7.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
+UNRESOLVED: c-c++-common/ubsan/float-cast-overflow-7.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  compilation failed to produce 
executable
+FAIL: c-c++-common/ubsan/float-cast-overflow-7.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
+UNRESOLVED: c-c++-common/ubsan/float-cast-overflow-7.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce 
executable
Those are all tests that use __float128 if __SIZEOF_FLOAT128__ is defined.

> Tested with which -mcpu=?  You need at least power7 to get __ieee128
> support, but it should be tested *without* that as well.  (The actual
> default for powerpc64-linux is power4 (so not even 970), and for
> powerpc-linux it is 603 or such.)

../configure --enable-languages=c,c++,fortran,objc,obj-c++,lto,go 
--prefix=/home/jakub/GCC; make -j64 > LOG 2>&1 && make -j64 -k check 
RUNTESTFLAGS='--target_board=unix\{-m32,-m64\}' > LOGC 2>&1; 
../contrib/test_summary > LOGT 2>&1
i.e. the oldest supported one.  On powerpc64le-linux also without any
--with- tuning, so power8.

> > * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
> > __SIZEOF_FLOAT128__ here and only if __float128 macro is defined.
> 
> (if and only if?)

Changed.

> > * config/rs6000/rs6000-gen-builtins.cc (TYPE_MAP_SIZE): Set to
> > 85 instead of 86.
> 
> This should be auto-generated, or just not exist at all
> ("sizeof type_map / sizeof type_map[0]" in the one place it is used,
> instead -- "one place" is after removing it from the definition of the
> array of course, where it is unneeded :-) )

Unfortunately the generator doesn't use libiberty.h, so I couldn't use
ARRAY_SIZE.  Just used sizeof / sizeof [0].

> >  static
> >  const char *rs6000_type_string (tree type_node)
> >  {
> > +  if (type_node == NULL_TREE)
> > +return "unknown";
> 
> Please say this is NULL?  If you want to indicate it is problematic, you
> can say "***NULL***" or such?

Done.

> > +{ "if","ibm128_float_type_node "
> > +   "? ibm128_float_type_node "
> > +   ": long_double" },
> 
> I don't think that is correct.  If there is no __ibm128, there is no
> IFmode either (they are two sides of the same coin).  Using DFmode
> instead (which is the only thing that "long double" could be if not
> IFmode) will just ICE later, if this is used at all.  So best case this
> will confuse the reader.

rs6000_expand_builtin has (but at an incorrect spot) code to remap
__builtin_{,un}pack_ibm128 to __builtin_{,un}pack_longdouble under the hood,
for 2 reasons:
1) __ibm128 type is only supported when VSX is enabled, seems the intent
   was that those 2 builtins can be used interchangeably unless
   long double is the same as __ieee128, in which case
   __builtin_{,un}pack_longdouble errors and
   __builtin_{,un}pack_ibm128 works and returns/takes __ibm128
2) even with VSX, unless -mlong-double-128 -mabi=ieee

Re: [PATCH] handle "invisible" reference in -Wdangling-pointer (PR104436)

2022-03-09 Thread Richard Biener via Gcc-patches
On Fri, Feb 11, 2022 at 12:05 AM Martin Sebor via Gcc-patches
 wrote:
>
> On 2/8/22 15:37, Jason Merrill wrote:
> > On 2/8/22 16:59, Martin Sebor wrote:
> >> Transforming a by-value arguments to by-reference as GCC does for some
> >> class types can trigger -Wdangling-pointer when the argument is used
> >> to store the address of a local variable.  Since the stored value is
> >> not accessible in the caller the warning is a false positive.
> >>
> >> The attached patch handles this case by excluding PARM_DECLs with
> >> the DECL_BY_REFERENCE bit set from consideration.
> >>
> >> While testing the patch I noticed some instances of the warning are
> >> uninitentionally duplicated as the pass runs more than once.  To avoid
> >> that, I also introduce warning suppression into the handler for this
> >> instance of the warning.  (There might still be others.)
> >
> > The second test should verify that we do warn about returning 't' from a
> > function; we don't want to ignore the DECL_BY_REFERENCE RESULT_DECL.
>
> The indirect aggregate case isn't handled and needs more work but
> since you brought it up I thought I should look into finishing it.
> The attached patch #2 adds support for it.  It also incorporates
> Richard's suggestion to test PARM_DECL.  Patch #2 applies on top
> of patch #1 which is unchanged from the first revision.

patch #1 would be OK if you'd do the PARM_DECL check there - I'd
rather defer #2 to stage1.

Richard.

>
> I have retested it on x86_64-linux and by building Glibc and
> Binutils + GDB.
>
> If now is too late for the aggregate enhancement I'm okay with
> deferring it until stage 1.
>
> >
> >> +  tree var = SSA_NAME_VAR (lhs_ref.ref);
> >> +  if (DECL_BY_REFERENCE (var))
> >> +/* Avoid by-value arguments transformed into by-reference.  */
> >> +continue;
> >
> > I wonder if we can we express this property of invisiref parms somewhere
> > more general?  I imagine optimizations would find it useful as well.
> > Could pointer_query somehow treat the reference as pointing to a
> > function-local object?
>
> I don't quite see where in the pointer_query class this would be
> useful (the class also isn't used for optimization).  I could add
> a helper to the access_ref class to query this property in warning
> code but as this is the only potential client I'm also not sure
> that's quite what you had in mind.  I'd need some guidance as to
> what you're thinking of here.
>
> Martin
>
>
> >
> > I previously tried to express this by marking the reference as
> > 'restrict', but that was wrong
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97474).
> >
> > Jason
> >


[PING][PATCH][final] Handle compiler-generated asm insn

2022-03-09 Thread Tom de Vries via Gcc-patches

On 2/22/22 14:55, Tom de Vries wrote:

Hi,

For the nvptx port, with -mptx-comment we have in pr53465.s:
...
 // #APP
// 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1
 // Start: Added by -minit-regs=3:
 // #NO_APP
 mov.u32 %r26, 0;
 // #APP
// 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1
 // End: Added by -minit-regs=3:
 // #NO_APP
...

The comments where generated using the compiler-generated equivalent of:
...
   asm ("// Comment");
...
but both the printed location and the NO_APP/APP are unnecessary for a
compiler-generated asm insn.

Fix this by handling ASM_INPUT_SOURCE_LOCATION == UNKNOWN_LOCATION in
final_scan_insn_1, such what we simply get:
...
 // Start: Added by -minit-regs=3:
 mov.u32 %r26, 0;
 // End: Added by -minit-regs=3:
...

Tested on nvptx.

OK for trunk?



Ping.

Thanks,
- Tom


[final] Handle compiler-generated asm insn

gcc/ChangeLog:

2022-02-21  Tom de Vries  

PR rtl-optimization/104596
* config/nvptx/nvptx.cc (gen_comment): Use gen_rtx_ASM_INPUT instead
of gen_rtx_ASM_INPUT_loc.
* final.cc (final_scan_insn_1): Handle
ASM_INPUT_SOURCE_LOCATION == UNKNOWN_LOCATION.

---
  gcc/config/nvptx/nvptx.cc |  3 +--
  gcc/final.cc  | 17 +++--
  2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 858789e6df7..4124c597f24 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -5381,8 +5381,7 @@ gen_comment (const char *s)
size_t len = strlen (ASM_COMMENT_START) + strlen (sep) + strlen (s) + 1;
char *comment = (char *) alloca (len);
snprintf (comment, len, "%s%s%s", ASM_COMMENT_START, sep, s);
-  return gen_rtx_ASM_INPUT_loc (VOIDmode, ggc_strdup (comment),
-   cfun->function_start_locus);
+  return gen_rtx_ASM_INPUT (VOIDmode, ggc_strdup (comment));
  }
  
  /* Initialize all declared regs at function entry.

diff --git a/gcc/final.cc b/gcc/final.cc
index a9868861bd2..e6443ef7a4f 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -2642,15 +2642,20 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int 
optimize_p ATTRIBUTE_UNUSED,
if (string[0])
  {
expanded_location loc;
+   bool unknown_loc_p
+ = ASM_INPUT_SOURCE_LOCATION (body) == UNKNOWN_LOCATION;
  
-		app_enable ();

-   loc = expand_location (ASM_INPUT_SOURCE_LOCATION (body));
-   if (*loc.file && loc.line)
- fprintf (asm_out_file, "%s %i \"%s\" 1\n",
-  ASM_COMMENT_START, loc.line, loc.file);
+   if (!unknown_loc_p)
+ {
+   app_enable ();
+   loc = expand_location (ASM_INPUT_SOURCE_LOCATION (body));
+   if (*loc.file && loc.line)
+ fprintf (asm_out_file, "%s %i \"%s\" 1\n",
+  ASM_COMMENT_START, loc.line, loc.file);
+ }
fprintf (asm_out_file, "\t%s\n", string);
  #if HAVE_AS_LINE_ZERO
-   if (*loc.file && loc.line)
+   if (!unknown_loc_p && loc.file && *loc.file && loc.line)
  fprintf (asm_out_file, "%s 0 \"\" 2\n", ASM_COMMENT_START);
  #endif
  }


[PATCH] contrib: Avoid use of "echo -n" in git customization [PR102664]

2022-03-09 Thread Richard Earnshaw via Gcc-patches

The -n option to echo is non-portable.  The generally recommended
alternative is to use the shell printf command.

contrib/ChangeLog:

PR other/102664
* gcc-git-customization.sh (ask): Use printf instead of echo -n.diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index b24948d9874..cf46c494a6a 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -7,7 +7,7 @@ ask () {
 question=$1
 default=$2
 var=$3
-echo -n $question "["$default"]? "
+printf "%s" "$question [$default]? "
 read answer
 if [ "x$answer" = "x" ]
 then


Re: [PATCH v3] libgo: Don't use pt_regs member in mcontext_t

2022-03-09 Thread Rich Felker
On Wed, Mar 09, 2022 at 08:26:11AM +0100, Sören Tempel wrote:
> Ian Lance Taylor  wrote:
> > Have you tested this in 32-bit mode?  It does not look correct based
> > on the glibc definitions.  Looking at glibc it seems that it ought to
> > be
> 
> As stated in the commit message, I have only tested this on Alpine Linux
> ppc64le (which uses musl libc). Unfortunately, I don't have access to a
> 32-bit PowerPC machine and hence haven't performed any tests with it.
> 
> > reg.sigpc = ((ucontext_t*)(context))->uc_mcontext.uc_regs->gregs[32];
> 
> While this should work with glibc, it doesn't work with musl. In order
> to support both (musl and glibc) on 32-bit PowerPC, we would have to do
> something along the lines of:
> 
>   #ifdef __PPC__
>   #if defined(__PPC64__)   /* ppc64 glibc & musl */
>   ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.gp_regs[32]
>   #elif defined(__GLIBC__) /* ppc32 glibc */
>   reg.sigpc = ((ucontext_t*)(context))->uc_mcontext.uc_regs->gregs[32];
>   #else/* ppc32 musl */
>   ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.gregs[32];
>   #endif /* __PPC64__ */
>   #endif /* __PPC__ */
> 
> In light of these observations, maybe using asm/ptrace.h and .regs (as
> proposed in the v1 patch) is the "better" (i.e. more readable) solution
> for now? I agree with Rich that using .regs is certainly a "code smell",
> but this gigantic ifdef block also looks pretty smelly to me. That being
> said, I can also send a v4 which uses this ifdef block.

I still prefer the #ifdef chain here, because it's at least using the
intended API. The problem is just that we can't have a matching API
because the only API glibc offers on ppc32 contradicts the POSIX
requirements.

I'm also not understanding how the .regs approach in the v1 patch was
ever supposed to work with musl anyway. The relevant line was:

ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.regs->nip;

and musl has no uc_mcontext.regs member because uc_mcontext is
mcontext_t, not the glibc ppc32 pointer union thing. The only .regs in
musl's ppc32 signal.h/ucontext.h is in struct sigcontext, not anywhere
in ucontext_t.

Rich


Re: [PATCH] middle-end/104786 - ICE with asm and VLA

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 12:30:08PM +0100, Richard Biener wrote:
> The following fixes an ICE observed with a MEM_REF allows_mem asm
> operand referencing a VLA.  The following makes sure to not attempt
> to go the temporary creation way when we cannot.
> 
> 2022-03-09  Richard Biener  
> 
>   PR middle-end/104786
>   * cfgexpand.cc (expand_asm_stmt): Do not generate a copy
>   for VLAs without an upper size bound.
> 
>   * gcc.dg/pr104786.c: New testcase.

LGTM.

Jakub



Re: [PATCH] x86, v2: Define LIBGCC2_UNWIND_ATTRIBUTE on ia32 [PR104781]

2022-03-09 Thread Richard Biener via Gcc-patches
On Wed, 9 Mar 2022, Jakub Jelinek wrote:

> On Wed, Mar 09, 2022 at 08:18:38AM +0100, Richard Biener wrote:
> > I wonder if this is a good case for general-regs-only instead?  At
> > least no-sse cannot be functionally equivalent (since then we would
> > not have needed general-regs-only ...).
> 
> I think general-regs-only is approx. equivalent to no-mmx,no-sse,
> we don't really need/want mmx in the unwinder I guess.
> 
> The following seems to work fine too, regtested with
> -O2 -msse2 -mfpmath=sse -mstackrealign in {C,CXX,XC,TC}FLAGS on i686-linux
> (where without the patch it fails to build).

LGTM.

> 2022-03-09  Jakub Jelinek  
> 
>   PR target/104781
>   * config/i386/i386.h (LIBGCC2_UNWIND_ATTRIBUTE): Define for ia32.
> 
> --- gcc/config/i386/i386.h.jj 2022-02-25 21:59:29.356291326 +0100
> +++ gcc/config/i386/i386.h2022-03-09 10:00:38.242376504 +0100
> @@ -2848,6 +2848,12 @@ extern enum attr_cpu ix86_schedule;
>  #define NUM_X86_64_MS_CLOBBERED_REGS 12
>  #endif
>  
> +/* __builtin_eh_return can't handle stack realignment, so restrict to
> +   general regs in 32-bit libgcc functions that call it.  */
> +#ifndef __x86_64__
> +#define LIBGCC2_UNWIND_ATTRIBUTE __attribute__((target 
> ("general-regs-only")))
> +#endif
> +
>  /*
>  Local variables:
>  version-control: t
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH] middle-end/104786 - ICE with asm and VLA

2022-03-09 Thread Richard Biener via Gcc-patches
On Wed, 9 Mar 2022, Jakub Jelinek wrote:

> On Wed, Mar 09, 2022 at 11:07:21AM +0100, Richard Biener wrote:
> > The following fixes an ICE observed with a MEM_REF allows_mem asm
> > operand.  There's code expecting INDIRECT_REFs that are now never
> > going to appear.  The following simply treats all tcc_reference
> > operands the same.
> 
> The INDIRECT_REF in there seems to be at least from 1995, but my limited
> understanding of what it wants to catch is operands that provably must
> live in memory.  INDIRECT_REF (assuming *&*& is folded already) is
> such a case, MEM_REF could be but not always (there is the case of
> MEM_REF on &decl without TREE_ADDRESSABLE) but usually is,
> other handled_component_p depend on what their first operand is etc.
> 
> I think there are two cases, one is an optimization (the && allows_mem
> case) where we don't want create a temporary (especially a huge one)
> if it clearly has to live in memory.  Example where we mess this up:
> struct T { char b[1024]; };
> struct S { int a; struct T b; } s;
> 
> void
> foo (void)
> {
>   asm volatile ("# %0" : : "rm" (s.b));
> }
> where we create 1024 bytes long temporary on the stack, copy s.b to it
> and that is the operand live in memory.
> So perhaps for the allows_mem && case we could handle that way
> all BLKmode types (BLKmode must live in memory, no?) and otherwise
> look through all handled_component_p's and determine if the base
> will be in memory (addressable decl, MEM_REF except for ADDR of
> non-addressable decl, INDIRECT_REF, what else?).

Well, in theory you could have a MEM_REF (&int_decl)
with a BLKmode struct T but SImode int_decl.  So looking at the
mode isn't good enough I think.  Note that the DECL_P case
does simply allow a non-MEM_P DECL_RTL when allows_mem and
only disallows the case of a REG with a different mode than
the type (but if it say a CONCAT it doesn't care).  That was
my reasoning that any additional restrictions are not necessary.
Of course we might run into

  if (! allows_reg && !MEM_P (op))
{
  error_at (locus, "output number %d not directly 
addressable", i);
  error_seen = true;
}

then, so when !allows_reg it might be better to go the
temporary copy way.  But then I fail to see how !allows_mem
is OK in that case ...

> And another case is where it is not an optimization, where we
> just can't ever create a temporary.
> One of those examples is
>   || TREE_ADDRESSABLE (type)
> the if already has, I guess non-constant length (ok, SVE is ok)
> would be another.
> assign_temp tests !poly_int_tree_p (TYPE_SIZE_UNIT (type), &size),
> so perhaps add || !tree_fits_poly_int64_p (TYPE_SIZE_UNIT (type)) ?

Yes, and that's good enough for the testcase at hand.  assign_temp
also handles some cases through max_int_size_in_bytes.  I'm going
to leave the optimization above for a followup and will test the
following for now.

Richard.

>From 90066f8dc0353103b9e7276b7a3e18e925509ded Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Wed, 9 Mar 2022 10:55:49 +0100
Subject: [PATCH] middle-end/104786 - ICE with asm and VLA
To: gcc-patches@gcc.gnu.org

The following fixes an ICE observed with a MEM_REF allows_mem asm
operand referencing a VLA.  The following makes sure to not attempt
to go the temporary creation way when we cannot.

2022-03-09  Richard Biener  

PR middle-end/104786
* cfgexpand.cc (expand_asm_stmt): Do not generate a copy
for VLAs without an upper size bound.

* gcc.dg/pr104786.c: New testcase.
---
 gcc/cfgexpand.cc| 4 +++-
 gcc/testsuite/gcc.dg/pr104786.c | 8 
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104786.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 87536ec7ccd..271bbfb8fe9 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3297,7 +3297,9 @@ expand_asm_stmt (gasm *stmt)
&& GET_MODE (DECL_RTL (val)) != TYPE_MODE (type)))
  || ! allows_reg
  || is_inout
- || TREE_ADDRESSABLE (type))
+ || TREE_ADDRESSABLE (type)
+ || (!tree_fits_poly_int64_p (type)
+ && !known_size_p (max_int_size_in_bytes (type
{
  op = expand_expr (val, NULL_RTX, VOIDmode,
!allows_reg ? EXPAND_MEMORY : EXPAND_WRITE);
diff --git a/gcc/testsuite/gcc.dg/pr104786.c b/gcc/testsuite/gcc.dg/pr104786.c
new file mode 100644
index 000..3076d236d21
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr104786.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu90" } */
+
+void h(void *di, int num)
+{
+  char (*t)[num] = di;
+  __asm__ ("" : "=X"( *t));
+}
-- 
2.34.1



[PATCH] cse: avoid signed overflow in compute_const_anchors [PR 104843]

2022-03-09 Thread Xi Ruoyao via Gcc-patches
Bootstrapped and tested on mips64el-linux-gnuabi64, and MIPS is the only
port with a non-zero targetm.const_anchor.  Ok for trunk?

-- >8 --

With a non-zero const_anchor, the behavior of this function relied on
signed overflow.

gcc/

PR rtl-optimization/104843
* cse.cc (compute_const_anchors): Cast to unsigned HOST_WIDE_INT
to perform overflow arithmetics safely.
---
 gcc/cse.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cse.cc b/gcc/cse.cc
index a18b599d324..7c39a009449 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -1169,10 +1169,10 @@ compute_const_anchors (rtx cst,
   HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
   HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
 {
-  HOST_WIDE_INT n = INTVAL (cst);
+  unsigned HOST_WIDE_INT n = INTVAL (cst);
 
   *lower_base = n & ~(targetm.const_anchor - 1);
-  if (*lower_base == n)
+  if (*lower_base == INTVAL (cst))
 return false;
 
   *upper_base =
-- 
2.35.1




[PATCH, OpenMP, C++] Allow classes with static members to be mappable

2022-03-09 Thread Chung-Lin Tang

Hi Jakub,
Now in OpenMP 5.x, static members are supposed to be not a barrier for a class
to be target-mapped.

There is the related issue of actually providing access to static 
const/constexpr
members on the GPU (probably a case of 
https://github.com/OpenMP/spec/issues/2158)
but that is for later.

This patch basically just removes the check for static members inside
cp_omp_mappable_type_1, and adjusts a testcase. Not sure if more tests are 
needed.
Tested on trunk without regressions, okay when stage1 reopens?

Thanks,
Chung-Lin

2022-03-09  Chung-Lin Tang  

gcc/cp/ChangeLog:

* decl2.cc (cp_omp_mappable_type_1): Remove requirement that all
members must be non-static; remove check for static fields.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/unmappable-1.C: Adjust testcase.diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index c53acf4546d..ace7783d9bd 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1544,21 +1544,14 @@ cp_omp_mappable_type_1 (tree type, bool notes)
   /* Arrays have mappable type if the elements have mappable type.  */
   while (TREE_CODE (type) == ARRAY_TYPE)
 type = TREE_TYPE (type);
-  /* All data members must be non-static.  */
+
   if (CLASS_TYPE_P (type))
 {
   tree field;
   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
-   if (VAR_P (field))
- {
-   if (notes)
- inform (DECL_SOURCE_LOCATION (field),
- "static field %qD is not mappable", field);
-   result = false;
- }
/* All fields must have mappable types.  */
-   else if (TREE_CODE (field) == FIELD_DECL
-&& !cp_omp_mappable_type_1 (TREE_TYPE (field), notes))
+   if (TREE_CODE (field) == FIELD_DECL
+   && !cp_omp_mappable_type_1 (TREE_TYPE (field), notes))
  result = false;
 }
   return result;
diff --git a/gcc/testsuite/g++.dg/gomp/unmappable-1.C 
b/gcc/testsuite/g++.dg/gomp/unmappable-1.C
index 364f884500c..1532b9c73f1 100644
--- a/gcc/testsuite/g++.dg/gomp/unmappable-1.C
+++ b/gcc/testsuite/g++.dg/gomp/unmappable-1.C
@@ -4,7 +4,7 @@
 class C
 {
 public:
-  static int static_member; /* { dg-message "static field .C::static_member. 
is not mappable" } */
+  static int static_member;
   virtual void f() {}
 };
 


Re: [PATCH] mips: avoid signed overflow in LUI_OPERAND [PR104842]

2022-03-09 Thread Xi Ruoyao via Gcc-patches
On Tue, 2022-03-08 at 18:20 +, Richard Sandiford wrote:
> Xi Ruoyao  writes:
> > I think this one obvious.  Ok for trunk?

> OK, thanks.

Committed r12-7555.

/* snip */

> >  #define LUI_OPERAND(VALUE) \
> >    (((VALUE) | 0x7fff) == 0x7fff\
> > -   || ((VALUE) | 0x7fff) + 0x1 == 0)
> > +   || ((unsigned HOST_WIDE_INT) (VALUE) | 0x7fff) + 0x1 == 0)

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] vect: fix out-of-bound access in supports_vec_convert_optab_p [PR 104851]

2022-03-09 Thread Xi Ruoyao via Gcc-patches
On Wed, 2022-03-09 at 09:37 +0100, Richard Biener wrote:
> On Wed, Mar 9, 2022 at 5:06 AM Xi Ruoyao via Gcc-patches
>  wrote:
> > 
> > This should be obvious, OK for trunk?
> 
> OK.

Committed r12-7559.

> > -- >8 --
> > 
> > Calling VECTOR_MODE_P with MAX_MACHINE_MODE has caused out-of-bound
> > access.

/* snip */

> > -  int end = mode == VOIDmode ? MAX_MACHINE_MODE : mode;
> > +  int end = mode == VOIDmode ? MAX_MACHINE_MODE - 1 : mode;

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] middle-end/104786 - ICE with asm and VLA

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 11:07:21AM +0100, Richard Biener wrote:
> The following fixes an ICE observed with a MEM_REF allows_mem asm
> operand.  There's code expecting INDIRECT_REFs that are now never
> going to appear.  The following simply treats all tcc_reference
> operands the same.

The INDIRECT_REF in there seems to be at least from 1995, but my limited
understanding of what it wants to catch is operands that provably must
live in memory.  INDIRECT_REF (assuming *&*& is folded already) is
such a case, MEM_REF could be but not always (there is the case of
MEM_REF on &decl without TREE_ADDRESSABLE) but usually is,
other handled_component_p depend on what their first operand is etc.

I think there are two cases, one is an optimization (the && allows_mem
case) where we don't want create a temporary (especially a huge one)
if it clearly has to live in memory.  Example where we mess this up:
struct T { char b[1024]; };
struct S { int a; struct T b; } s;

void
foo (void)
{
  asm volatile ("# %0" : : "rm" (s.b));
}
where we create 1024 bytes long temporary on the stack, copy s.b to it
and that is the operand live in memory.
So perhaps for the allows_mem && case we could handle that way
all BLKmode types (BLKmode must live in memory, no?) and otherwise
look through all handled_component_p's and determine if the base
will be in memory (addressable decl, MEM_REF except for ADDR of
non-addressable decl, INDIRECT_REF, what else?).

And another case is where it is not an optimization, where we
just can't ever create a temporary.
One of those examples is
|| TREE_ADDRESSABLE (type)
the if already has, I guess non-constant length (ok, SVE is ok)
would be another.
assign_temp tests !poly_int_tree_p (TYPE_SIZE_UNIT (type), &size),
so perhaps add || !tree_fits_poly_int64_p (TYPE_SIZE_UNIT (type)) ?

Jakub



RE: [aarch64] update reg-costs to include predicate move costs

2022-03-09 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Andre Vieira (lists) 
> Sent: Tuesday, March 8, 2022 3:16 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Sandiford
> 
> Subject: [aarch64] update reg-costs to include predicate move costs
> 
> Hi,
> 
> This patch adds predicate move costs to several SVE enabled cores.
> 

Ok. It's a bit odd to have them in non-SVE tuning structs too, but we do reuse 
these structs across different cores, so it makes sense to have it.

Thanks,
Kyrill

> 
> 2022-02-25  Tamar Christina  
>     Andre Vieira 
> 
> gcc/ChangeLog:
> 
>      * config/aarch64/aarch64-protos.h (struct cpu_regmove_cost):
> Add PR2PR member.
>      * config/aarch64/aarch64.cc (aarch64_register_move_cost): Use
> PR2PR costs when moving a predicate.
>      (generic_regmove_cost, cortexa57_regmove_cost,
> exynosm1_regmove_cost thunderx_regmove_cost, xgene1_regmove_cost,
> qdf24xx_regmove_cost, thunderx2t99_regmove_cost,
> thunderx3t110_regmove_cost, tsv110_regmove_cost, a64fx_regmove_cost):
> Add PR2PR entry.
>      (cortexa76_regmove_cost): New.
>      (neoversen1_tunings): Use cortexa76_regmove_cost.


[PATCH] middle-end/104786 - ICE with asm and VLA

2022-03-09 Thread Richard Biener via Gcc-patches
The following fixes an ICE observed with a MEM_REF allows_mem asm
operand.  There's code expecting INDIRECT_REFs that are now never
going to appear.  The following simply treats all tcc_reference
operands the same.

A more conservative approach would be to use INDIRECT_REF || MEM_REF
which would fix this particular testcase, but all non-register typed
kinds of reference trees could appear here and I'm not sure why
former INDIRECT_REFs should be OK to go this path but not say
a.b as we also handle plain DECL_P here albeit with some extra
constraints.  For the VLA case in the PR another spot to adjust
would be || TREE_ADDRESSABLE (type) noting that we also cannot
create a temporary for not constant-sized types.

That said - INDIRECT_REF catched my eye here, this might have
caused quite some extra copies eventually so I did want to fix
that.

Btw, it seems to me the DECL_P case disallowing promoted variables
should only apply to register types which should be SSA names
here unless TREE_ADDRESSABLE in which case we let them through
anyway.  With the same reasoning a generic REFERENCE_CLASS_P
should be OK, not needing such special-casing.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Richard.

2022-03-09  Richard Biener  

PR middle-end/104786
* cfgexpand.cc (expand_asm_stmt): Specialize REFERENCE_CLASS_P
operands rather than INDIRECT_REFs.

* gcc.dg/pr104786.c: New testcase.
---
 gcc/cfgexpand.cc| 2 +-
 gcc/testsuite/gcc.dg/pr104786.c | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr104786.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 87536ec7ccd..aaccf1a0906 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3290,7 +3290,7 @@ expand_asm_stmt (gasm *stmt)
 
   generating_concat_p = 0;
 
-  if ((TREE_CODE (val) == INDIRECT_REF && allows_mem)
+  if ((REFERENCE_CLASS_P (val) && allows_mem)
  || (DECL_P (val)
  && (allows_mem || REG_P (DECL_RTL (val)))
  && ! (REG_P (DECL_RTL (val))
diff --git a/gcc/testsuite/gcc.dg/pr104786.c b/gcc/testsuite/gcc.dg/pr104786.c
new file mode 100644
index 000..3076d236d21
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr104786.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu90" } */
+
+void h(void *di, int num)
+{
+  char (*t)[num] = di;
+  __asm__ ("" : "=X"( *t));
+}
-- 
2.34.1


[PATCH] x86, v2: Define LIBGCC2_UNWIND_ATTRIBUTE on ia32 [PR104781]

2022-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 09, 2022 at 08:18:38AM +0100, Richard Biener wrote:
> I wonder if this is a good case for general-regs-only instead?  At
> least no-sse cannot be functionally equivalent (since then we would
> not have needed general-regs-only ...).

I think general-regs-only is approx. equivalent to no-mmx,no-sse,
we don't really need/want mmx in the unwinder I guess.

The following seems to work fine too, regtested with
-O2 -msse2 -mfpmath=sse -mstackrealign in {C,CXX,XC,TC}FLAGS on i686-linux
(where without the patch it fails to build).

2022-03-09  Jakub Jelinek  

PR target/104781
* config/i386/i386.h (LIBGCC2_UNWIND_ATTRIBUTE): Define for ia32.

--- gcc/config/i386/i386.h.jj   2022-02-25 21:59:29.356291326 +0100
+++ gcc/config/i386/i386.h  2022-03-09 10:00:38.242376504 +0100
@@ -2848,6 +2848,12 @@ extern enum attr_cpu ix86_schedule;
 #define NUM_X86_64_MS_CLOBBERED_REGS 12
 #endif
 
+/* __builtin_eh_return can't handle stack realignment, so restrict to
+   general regs in 32-bit libgcc functions that call it.  */
+#ifndef __x86_64__
+#define LIBGCC2_UNWIND_ATTRIBUTE __attribute__((target ("general-regs-only")))
+#endif
+
 /*
 Local variables:
 version-control: t


Jakub



Re: [PATCH] vect: fix out-of-bound access in supports_vec_convert_optab_p [PR 104851]

2022-03-09 Thread Richard Biener via Gcc-patches
On Wed, Mar 9, 2022 at 5:06 AM Xi Ruoyao via Gcc-patches
 wrote:
>
> This should be obvious, OK for trunk?

OK.

> -- >8 --
>
> Calling VECTOR_MODE_P with MAX_MACHINE_MODE has caused out-of-bound
> access.
>
> gcc/
>
> PR tree-optimization/104851
> * optabs-query.cc (supports_vec_convert_optab_p): Fix off-by-one
> error.
> ---
>  gcc/optabs-query.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
> index 713c098ba4e..68dc679cc6a 100644
> --- a/gcc/optabs-query.cc
> +++ b/gcc/optabs-query.cc
> @@ -720,7 +720,7 @@ static bool
>  supports_vec_convert_optab_p (optab op, machine_mode mode)
>  {
>int start = mode == VOIDmode ? 0 : mode;
> -  int end = mode == VOIDmode ? MAX_MACHINE_MODE : mode;
> +  int end = mode == VOIDmode ? MAX_MACHINE_MODE - 1 : mode;
>for (int i = start; i <= end; ++i)
>  if (VECTOR_MODE_P ((machine_mode) i))
>for (int j = MIN_MODE_VECTOR_INT; j < MAX_MODE_VECTOR_INT; ++j)
> --
> 2.35.1
>
>


Re: [Patch] Fortran: Fix CLASS handling in SIZEOF intrinsic

2022-03-09 Thread Paul Richard Thomas via Gcc-patches
Hi Tobias,

Thanks for the patch and the particularly comprehensive testcase.

OK for mainline as far as I am concerned.

Paul


On Tue, 8 Mar 2022 at 20:06, Tobias Burnus  wrote:

> Fix SIZEOF handling.
>
> I have to admit that I do understand what the current code does,
> but do not understand what the previous code did. However, it
> still passes the testsuite - and also some code which did ICE
> now compiles :-)
>
> While writing the testcase, I did find two issues:
> * Passing a CLASS to TYPE(*),dimension(..) will have an
>elem_len of the declared type and not of the dynamic type.
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104844
> * var%class_array(1,1)%array will have size(...) == 0
>instead of size(... % array).
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104845
>
> OK for mainline? (Unless you want to hold off until GCC 13)
>
> Tobias
>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
> 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
> Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München;
> Registergericht München, HRB 106955
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


Re: [PATCH v2] Add TARGET_MOVE_WITH_MODE_P

2022-03-09 Thread Richard Biener via Gcc-patches
On Tue, Mar 8, 2022 at 4:44 PM H.J. Lu  wrote:
>
> On Mon, Mar 7, 2022 at 5:45 AM Richard Biener
>  wrote:
> >
> > On Wed, Mar 2, 2022 at 10:18 PM H.J. Lu  wrote:
> > >
> > > On Wed, Mar 02, 2022 at 09:51:26AM +0100, Richard Biener wrote:
> > > > On Tue, Mar 1, 2022 at 11:41 PM H.J. Lu via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > Add TARGET_FOLD_MEMCPY_MAX for the maximum number of bytes to fold 
> > > > > memcpy.
> > > > > The default is
> > > > >
> > > > > MOVE_MAX * MOVE_RATIO (optimize_function_for_size_p (cfun))
> > > > >
> > > > > For x86, it is MOVE_MAX to restore the old behavior before
> > > >
> > > > I know we've discussed this to death in the PR, I just want to repeat 
> > > > here
> > > > that the GIMPLE folding expects to generate a single load and a single
> > > > store (that is what it does on the GIMPLE level) which is why MOVE_MAX
> > > > was chosen originally (it's documented to what a "single instruction" 
> > > > does).
> > > > In practice MOVE_MAX does not seem to cover vector register sizes
> > > > so Richard pulled MOVE_RATIO which is really intended to cover
> > > > the case of using multiple instructions for moving memory (but then I
> > > > don't remember whether for the ARM case the single load/store GIMPLE
> > > > will be expanded to multiple load/store instructions).
> > > >
> > > > TARGET_FOLD_MEMCPY_MAX sounds like a stop-gap solution,
> > > > being very specific for memcpy folding (we also fold memmove btw).
> > > >
> > > > There is also MOVE_MAX_PIECES which _might_ be more appropriate
> > > > than MOVE_MAX here and still honor the idea of single instructions.
> > > > Now neither arm nor aarch64 define this and it defaults to MOVE_MAX,
> > > > not MOVE_MAX * MOVE_RATIO.
> > > >
> > > > So if we need a new hook then that hook should at least get the
> > > > 'speed' argument of MOVE_RATIO and it should get a better name.
> > > >
> > > > I still think that it should be possible to improve the insn check to
> > > > avoid use of "disabled" modes, maybe that's also a point to add
> > > > a new hook like .move_with_mode_p or so?  To quote, we do
> > >
> > > Here is the v2 patch to add TARGET_MOVE_WITH_MODE_P.
> >
> > Again I'd like to shine light on MOVE_MAX_PIECES which explicitely
> > mentions "a load or store used TO COPY MEMORY" (emphasis mine)
> > and whose x86 implementation would already be fine (doing larger moves
> > and also not doing too large moves).  But appearantly the arm folks
> > decided that that's not fit and instead (mis-?)used MOVE_MAX * MOVE_RATIO.
> >
> > Yes, MOVE_MAX_PIECES is documented to apply to move_by_pieces.
> > Still GIMPLE memcpy/memmove inlining wants to mimic exactly that but
> > restrict itself to a single load and a single store.
> >
> > > >
> > > >   scalar_int_mode mode;
> > > >   if (int_mode_for_size (ilen * 8, 0).exists (&mode)
> > > >   && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
> > > >   && have_insn_for (SET, mode)
> > > >   /* If the destination pointer is not aligned we must 
> > > > be able
> > > >  to emit an unaligned store.  */
> > > >   && (dest_align >= GET_MODE_ALIGNMENT (mode)
> > > >   || !targetm.slow_unaligned_access (mode, 
> > > > dest_align)
> > > >   || (optab_handler (movmisalign_optab, mode)
> > > >   != CODE_FOR_nothing)))
> > > >
> > > > where I understand the ISA is enabled and if the user explicitely
> > > > uses it that's OK but -mprefer-avx128 should tell GCC to never
> > > > generate AVX256 code where the user was not explicitely using it
> > > > (still for example glibc might happily use AVX256 code to implement
> > > > the memcpy we are folding!)
> > > >
> > > > Note the BB vectorizer also might end up with using AVX256 because
> > > > in places it also relies on optab queries and the 
> > > > vector_mode_supported_p
> > > > check (but the memcpy folding uses the fake integer modes).  So
> > > > x86 might need to implement the related_mode hook to avoid "auto"-using
> > > > a larger vector mode which the default implementation would happily do.
> > > >
> > > > Richard.
> > >
> > > OK for master?
> >
> > Looking for opinions from others as well.
> >
> > Btw, there's a similar use in expand_DEFERRED_INIT:
> >
> >   && int_mode_for_size (tree_to_uhwi (var_size) * BITS_PER_UNIT,
> > 0).exists (&var_mode)
> >   && have_insn_for (SET, var_mode))
> >
> > So it occured to me that maybe targetm.move_with_mode_p should eventually
>
> TARGET_MOVE_WITH_MODE_P shouldn't be used here since we do want to
> use var_mode.
>
> > check have_insn_for (SET, var_mode) or we should abstract checking the two
> > things to a generic API somewhere (in optabs-query.h maybe, or expmed.h,
> > not sure where it would be more appropriate).
> >
> > > +@deftypefn {Target Hook} bool TARGET_MOVE_WITH_MODE_P (ma