date:20201117

Re: [PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-17 Thread Richard Biener via Gcc-patches

On Tue, Nov 17, 2020 at 2:02 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> We're now able to vectorise the set-up loop:
> >>
> >>   int p = power2 (fns[i].po2);
> >>   for (int j = 0; j < N; j++)
> >> a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
> >>
> >> Rather than adjust the expected output for that, it seemed better
> >> to disable optimisation for the testing code.
> >>
> >> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> >> and x86_64-linux-gnu.  OK to install?
> >
> > In other places we just add a asm ("" : : : "memory") to the loop body, can 
> > you
> > do it like htat?
>
> I wondered about that, but I don't think it's reliable long-term.
> We could (perhaps rightly) decide that it's a win to vectorise the
> rhs of a[j] even if the asm prevents us from doing a vector store.

But this is about dump-scanning and I'd rather avoid optimize attributes
since that removes coverage gained by people running the testsuite
with random set of options.

We do have #pragma no_vector support in the middle-end just not
yet in the C FE parser (see where it builds ANNOTATE_EXPRs,
add support for the annot_expr_no_vector_kind).  If you want a
future-proof reliable way to disable vectorizing a loop, that is.

Richard.

> Thanks,
> Richard
>
> >
> > Thanks,
> > RIchard.
> >
> >> Richard
> >>
> >>
> >> gcc/testsuite/
> >> * gcc.dg/vect/vect-sdiv-pow2-1.c (main): Disable optimization.
> >> ---
> >>  gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
> >> b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> >> index be70bc6c47e..bf387133d01 100644
> >> --- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> >> +++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> >> @@ -53,7 +53,7 @@ power2 (int x)
> >>
> >>  #define N 50
> >>
> >> -int
> >> +int __attribute__ ((optimize (0)))
> >>  main (void)
> >>  {
> >>int a[N], b[N], c[N];

Re: Hash ODR name for OBJ_TYPE_REF

2020-11-17 Thread Richard Biener

On Wed, 18 Nov 2020, Jan Hubicka wrote:

> Hi,
> main purpose of obj_type_ref is to hold the type that was used in
> virutal call.  We do not hash this info in hash_operand that causes a
> lot of miscompares at ICF time.  With LTO this is quite important for
> icf performance and in that case we do have manged type names (for
> non-anonymous types)
> 
> Building libxul without patch we get 1477890 miscompares at:
> libxul.so.wpa.076i.icf:  false returned: 'operand_equal_p failed' in 
> compare_operand at ../../gcc/ipa-icf-gimple.c:356
> With patch this turns into 242454.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Richard.

> Honza
>   PR ipa/92535
>   * fold-const.c (operand_compare::hash_operand): Hash ODR name of
>   obj_type_ref_class.
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index ddf18f27cb7..e759ddb1e60 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -3866,6 +3866,16 @@ operand_compare::hash_operand (const_tree t, 
> inchash::hash ,
> inchash::add_expr (OBJ_TYPE_REF_EXPR (t), hstate, flags);
> inchash::add_expr (OBJ_TYPE_REF_TOKEN (t), hstate, flags);
> inchash::add_expr (OBJ_TYPE_REF_OBJECT (t), hstate, flags);
> +   if (tree c = obj_type_ref_class (t))
> +   {
> + c = TYPE_NAME (TYPE_MAIN_VARIANT (c));
> + /* We compute mangled names only when free_lang_data is run.
> +In that case we can hash precisely.  */
> + if (DECL_ASSEMBLER_NAME_SET_P (c))
> +   hstate.add_object
> +  (IDENTIFIER_HASH_VALUE
> +  (DECL_ASSEMBLER_NAME (c)));
> +   }
> return;
>   default:
> break;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

Re: [PATCH v5] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-11-17 Thread Richard Biener

On Tue, 17 Nov 2020, Jeff Law wrote:

> 
> 
> On 11/4/20 8:10 AM, Raoni Fassina Firmino via Gcc-patches wrote:
> > On Wed, Nov 04, 2020 at 10:35:03AM +0100, Richard Biener wrote:
> >>> +/* Expand call EXP to the fegetround builtin (from C99 fenv.h), 
> >>> returning the
> >>> +   result and setting it in TARGET.  Otherwise return NULL_RTX on 
> >>> failure.  */
> >>> +static rtx
> >>> +expand_builtin_fegetround (tree exp, rtx target, machine_mode 
> >>> target_mode)
> >>> +{
> >>> +  if (!validate_arglist (exp, VOID_TYPE))
> >>> +return NULL_RTX;
> >>> +
> >>> +  insn_code icode = direct_optab_handler (fegetround_optab, SImode);
> >>> +  if (icode == CODE_FOR_nothing)
> >>> +return NULL_RTX;
> >>> +
> >>> +  if (target == 0
> >>> +  || GET_MODE (target) != target_mode
> >>> +  || !(*insn_data[icode].operand[0].predicate) (target, target_mode))
> >>> +target = gen_reg_rtx (target_mode);
> >>> +
> >>> +  rtx pat = GEN_FCN (icode) (target);
> >>> +  if (!pat)
> >>> +return NULL_RTX;
> >>> +  emit_insn (pat);
> >> I think you need to verify whether the expansion ended up in 'target'
> >> and otherwise emit a move since usually 'target' is just a hint.
> > I thought the "if (target == 0 ..." took care of that. The expands do
> > emit a move, if that helps.
> It looks like if we have a passed in target and it either has the wrong
> mode or it does not match the predicate, then we generaet a new target
> and use that instead.? I don't see where we'd copy from that new target
> to the original desired target.? For some expanders the caller would
> handle that, but I don't see how that's possible for this one without
> the caller digging into the generated RTL to determine that
> expand_builtin_fegetround put the result somewhere other than TARGET and
> thus a copy is needed.
> 
> That may be what Richi is worried about.

I know we've added missing

  if (!rtx_equal_p (target, ops[0].value))
emit_move_insn (target, ops[0].value);

to several expanders (using expand_insn rather than manual
GEN_FCN (icode) calls).

Richard.

Hash ODR name for OBJ_TYPE_REF

2020-11-17 Thread Jan Hubicka

Hi,
main purpose of obj_type_ref is to hold the type that was used in
virutal call.  We do not hash this info in hash_operand that causes a
lot of miscompares at ICF time.  With LTO this is quite important for
icf performance and in that case we do have manged type names (for
non-anonymous types)

Building libxul without patch we get 1477890 miscompares at:
libxul.so.wpa.076i.icf:  false returned: 'operand_equal_p failed' in 
compare_operand at ../../gcc/ipa-icf-gimple.c:356
With patch this turns into 242454.

Bootstrapped/regtested x86_64-linux, OK?
Honza
PR ipa/92535
* fold-const.c (operand_compare::hash_operand): Hash ODR name of
obj_type_ref_class.
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index ddf18f27cb7..e759ddb1e60 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3866,6 +3866,16 @@ operand_compare::hash_operand (const_tree t, 
inchash::hash ,
  inchash::add_expr (OBJ_TYPE_REF_EXPR (t), hstate, flags);
  inchash::add_expr (OBJ_TYPE_REF_TOKEN (t), hstate, flags);
  inchash::add_expr (OBJ_TYPE_REF_OBJECT (t), hstate, flags);
+ if (tree c = obj_type_ref_class (t))
+ {
+   c = TYPE_NAME (TYPE_MAIN_VARIANT (c));
+   /* We compute mangled names only when free_lang_data is run.
+  In that case we can hash precisely.  */
+   if (DECL_ASSEMBLER_NAME_SET_P (c))
+ hstate.add_object
+(IDENTIFIER_HASH_VALUE
+(DECL_ASSEMBLER_NAME (c)));
+ }
  return;
default:
  break;

Re: [PATCH 1/2] correct BB frequencies after loop changed

2020-11-17 Thread Richard Biener

On Tue, 17 Nov 2020, Jeff Law wrote:

> 
> Minor questions for Jan and Richi embedded below...
> 
> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
> > When investigating the issue from 
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
> > I find the BB COUNTs of loop seems are not accurate in some case.
> > For example:
> >
> > In below figure:
> >
> >
> >COUNT:268435456  pre-header
> > |
> > |  ..
> > |  ||
> > V  v|
> >COUNT:805306369|
> >/ \  |
> >33%/   \ |
> >  / \|
> > v   v   |
> > COUNT:268435456  COUNT:536870911  | 
> > exit-edge |   latch |
> >   ._.
> >
> > Those COUNTs have below equations:
> > COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
> > COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33
> > COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of 
> > latch:536870911
> >
> >
> > While after pcom:
> >
> >COUNT:268435456  pre-header
> > |
> > |  ..
> > |  ||
> > V  v|
> >COUNT:268435456|
> >/ \  |
> >50%/   \ |
> >  / \|
> > v   v   |
> > COUNT:134217728  COUNT:134217728  | 
> > exit-edge |   latch |
> >   ._.
> >
> > COUNT != COUNT + COUNT
> > COUNT != COUNT
> >
> > In some cases, the probility of exit-edge is easy to estimate, then
> > those COUNTs of other BBs in loop can be re-caculated.
> >
> > Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
> >
> > Jiufu
> >
> > gcc/ChangeLog:
> > 2020-10-09  Jiufu Guo   
> >
> > * cfgloopmanip.h (recompute_loop_frequencies): New function.
> > * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
> > * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
> > recompute_loop_frequencies.
> >
> > ---
> >  gcc/cfgloopmanip.c| 53 +++
> >  gcc/cfgloopmanip.h|  2 +-
> >  gcc/tree-ssa-loop-manip.c | 28 +++--
> >  3 files changed, 57 insertions(+), 26 deletions(-)
> >
> > diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
> > index 73134a20e33..b0ca82a67fd 100644
> > --- a/gcc/cfgloopmanip.c
> > +++ b/gcc/cfgloopmanip.c
> > @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "gimplify-me.h"
> >  #include "tree-ssa-loop-manip.h"
> >  #include "dumpfile.h"
> > +#include "cfgrtl.h"
> >  
> >  static void copy_loops_to (class loop **, int,
> >class loop *);
> > @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
> >  
> >return nloop;
> >  }
> > +
> > +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
> > +   is NEW_PROB.  */
> > +
> > +bool
> > +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
> > +{
> > +  edge exit = single_exit (loop);
> > +  if (!exit)
> > +return false;
> > +
> > +  edge e;
> > +  edge_iterator ei;
> > +  edge non_exit;
> > +  basic_block * bbs;
> > +  profile_count exit_count = loop_preheader_edge (loop)->count ();
> > +  profile_probability exit_p = exit_count.probability_in 
> > (loop->header->count);
> > +  profile_count base_count = loop->header->count;
> > +  profile_count after_num = base_count.apply_probability (exit_p);
> > +  profile_count after_den = base_count.apply_probability (new_prob);
> > +
> > +  /* Update BB counts in loop body.
> > + COUNT = COUNT
> > + COUNT = COUNT * exit_edge_probility
> > + The COUNT = COUNT * old_exit_p / new_prob.  */
> > +  bbs = get_loop_body (loop);
> > +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
> > +after_den);
> > +  free (bbs);
> > +
> > +  /* Update probability and count of the BB besides exit edge (maybe 
> > latch).  */
> > +  FOR_EACH_EDGE (e, ei, exit->src->succs)
> > +if (e != exit)
> > +  break;
> > +  non_exit = e;
> Are we sure that exit->src has just two successors (will that case be
> canonicalized before we get here?).? If it has > 2 successors, then I'm
> pretty sure the frequencies get mucked up.? Richi could probably answer
> whether or not the block with the loop exit edge can have > 2 successors.

There's nothing

Re: RISC-V: Support version controling for ISA standard extensions

2020-11-17 Thread Kito Cheng

Patch set committed :)

On Wed, Nov 18, 2020 at 1:43 PM Kito Cheng  wrote:

> >> Current GCC implementation is RISC-V ISA 2.2, this patch set implement
> v20190608 and v20191213, and also add option
> -misa-spec=[2.2|20190608|20191213] to change the default ISA spec version.
> >>
> >> There is one major incompatible
> >>
> >> That option will effect the default version of each sub-extension, for
> example I-extension is 2.0 for 2.2 and 2.1 for v20190608 and v20191213.
> >>
> >> We also update the -march parser to fit the latest standard, the
> canonical ordering for multi-letter, drop version support for G extension,
> and we also omitted the version for unrecognized extension.
> >>
> >> And we add an special rule for G extension, imafd can't appear again if
> G extension is present, but zicsr and zifencei can.
> >>
> >> The default ISA spec will keep on 2.2, and change that in next GCC
> release.
> >
> >
> > This patch series looks OK to me with minor fixes for the things I
> pointed out.
>
> Thanks for reviewing, I'll commit that after fixing those issues and
> testing again :)
>
> > I assume Nelson Chu will look at adding the missing zifencei binutils
> support?
>
> Nelson has worked before, but got stuck due to arch implication rule,
> i, g, zifencei and zicsr...
> And now he plan to refer GCC implementation on binutils side after I
> commit :P
>
> https://github.com/riscv/riscv-isa-manual/issues/575
>
> > Maybe we should add an attribute for the isa-spec?  This is redundant
> with the arch spec, but might be easier for some folk to handle.  Just one
> version number instead of 6 version numbers.  Though I suppose if we have
> both we might have to deal with accidental conflicts between the two.  We
> would need a binutils patch first.  This can be fixed later if it makes
> sense to do it.
>
> I've considered that before but it seems like getting more confused,
> since arch string can specify the version for each extension which can
> override ISA-spec, but I guess more info is not harmful, I think we
> can create an issue on psabi doc and continue discuss there.
>

Re: [PATCH] PowerPC: Restrict long double test to use IBM long double.

2020-11-17 Thread Michael Meissner via Gcc-patches

On Tue, Nov 17, 2020 at 11:33:29PM -0600, will schmidt wrote:
> On Sun, 2020-11-15 at 12:23 -0500, Michael Meissner via Gcc-patches wrote:
> > PowerPC: Restrict long double test to use IBM long double.
> > 
> > I posted this patch previously as a set of 3 testsuite patches.  I have
> > separated them into separate patches.  This patch marks the convert-bfp-11.c
> > patch as needing IBM extended double.  If you look at the code, it is
> > specifically designed around testing the limits of the IBM 128-bit extended
> > double representation.  I added a new target-supports that says the test
> > requires IBM extended long double, and changed the test to require this
> > effective test.  Can I check this into the master branch?
> 
> 
> It's harder to review that without all the history handy here.
> 
> This will stand alone better if you lead with what you are adding and
> keep it clean.  i.e.

The patch I was referring to was posted on October 22nd:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556865.html
 
> Subject: PowerPC: Add ppc_long_double_ibm effective-target check
> 
> "Add a ppc_long_double_ibm dg-require-effective-target check to ensure
> tests that require LONG_DOUBLE_IBM128 . "
> An additional statement to clarify it's relationship with I128
> wouldn't  hurt if that is the case.  i.e. 
> "This is a counterpart to LONG_DOUBLE_IEEE 128 " 

At the moment, we don't need a target supports for long double IEEE 128-bit or
long double 64-bit.  I can add them if needed.

> Hmm, I have those backwards in my head apparently.  Can the return 1 if
> not-defined logic be flattened out so we see the direct relationship?

I'm not sure what you are asking.  These are preprocessor macros that are only
defined in certain cases.  And remember this is main returning a value, so
returning 0 is true and 1 is false.

In particular:

If your long double is 128-bits and uses the IEEE 128-bit representation, the
following macros are defined:

__LONG_DOUBLE_128__
__LONG_DOUBLE_IEEE128__

If your long double is 128-bit and uses the IBM 128-bit representation (current
default0, the following macros are defined:

__LONG_DOUBLE_128__
__LONG_DOUBLE_IBM128__

If your long double is 64 bits, neither of those two macros are defined.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] Include math.h in nextafter-2.c test.

2020-11-17 Thread Michael Meissner via Gcc-patches

On Tue, Nov 17, 2020 at 11:33:23PM -0600, will schmidt wrote:
> On Sun, 2020-11-15 at 12:12 -0500, Michael Meissner via Gcc-patches wrote:
> > Include math.h in nextafter-2.c test.
> > 
> > I previously posted this with two other patches.  I've separated this into 
> > its
> > own patch.  What happens is because the nextafter-2.c test uses 
> > -fno-builtin,
> > and it does not include math.h, the wrong nextafterl and nextforwardl gets
> > called when long double is not IBM 128-bit (i.e. either 64-bit, or IEEE
> > 128-bit).
> 
> Thats a sandbox issue, or something upstream ?

I'm not sure what you are asking.  If you install the three critical IEEE
128-bit long double patches, and then configure a build with long double
defaulting to IEEE 128-bit, the nextafter-2 test will fail.

The reason is the nextafterl function in GLIBC assumes long double is IBM
128-bit extended double.  The __builtin_nextafterl function calls that
function.

If you compile it normally (with long double using IEEE 128-bit), the compiler
will automatically map nextafterl to __nextafterieee128.

Similarly if you include math.h, and use the -fno-builtin option, the math.h
library will still map nextafterl into __nextafterieee128, and the compiler
will call it.

However, if you do not include math.h and use the -fno-builtin option, the
compiler will call nextafterl, and get the wrong results, because the wrong
function was called.

What I meant in terms of the 3 patches being separated, the last time I posted
a patch for this problem, I grouped together 3 test suite failures into one
patch.  This time, I separated the cases into 3 separate patches (this one, the
fix for pr70117, and the fix for the decimal conversion test).

> > 
> > Rather than add the include only for the PowerPC, I thought it was better to
> > always include it.  There might be some port in the future that has the same
> > issue with multiple long double types without using multilibs.
> > 
> > Can I check this into the master branch.
> > 
> > 2020-11-15  Michael Meissner  
> > 
> > * gcc.dg/nextafter-2.c: Include math.h.
> > ---
> >  gcc/testsuite/gcc.dg/nextafter-2.c | 12 
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/gcc/testsuite/gcc.dg/nextafter-2.c 
> > b/gcc/testsuite/gcc.dg/nextafter-2.c
> > index e51ae94be0c..8149a709fa5 100644
> > --- a/gcc/testsuite/gcc.dg/nextafter-2.c
> > +++ b/gcc/testsuite/gcc.dg/nextafter-2.c
> > @@ -6,6 +6,18 @@
> > 
> >  #include 
> > 
> > +/* In order to run on systems like the PowerPC that have 3 different long
> > +   double types, include math.h so it can choose what is the appropriate
> > +   nextafterl function to use.
> > +
> > +   If we didn't use -fno-builtin for this test, the PowerPC compiler would 
> > have
> > +   changed the names of the built-in functions that use long double.  The
> > +   nextafter-1.c function runs with this mapping.
> > +
> > +   Since this test uses -fno-builtin, include math.h, so that math.h can 
> > make
> > +   the appropriate choice to use.  */
> 
> 
> 
> Can this be simplified to stl
> 
> /* Include math.h so that systems like PowerPC that have different long
> double types can choose the appropriate nextafterl function to use.  */
> 
> 
> > +#include 
> > +
> >  #if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
> >  # if !__GLIBC_PREREQ (2, 24)
> >  /* Workaround buggy nextafterl in glibc 2.23 and earlier,
> > -- 
> > 2.22.0
> > 
> > 

Sure, the comment is just trying to explain why math.h needs to be included.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-17 Thread Kito Cheng via Gcc-patches

>>  - CSR related instructions and fence instructions has to be splitted from
>>baseline ISA, zicsr and zifencei are corresponding sub-extension.
>
>
> It is actually only fence.i that is split off.  fence is still part of the 
> base ISA.  This is why it is called zifencei.

Oh...I didn't notice that, thanks for the review.


>> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
>> index 738556539f6..2aaa8e96451 100644
>> --- a/gcc/config/riscv/riscv.c
>> +++ b/gcc/config/riscv/riscv.c
>> @@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel model)
>>  static bool
>>  riscv_memmodel_needs_release_fence (enum memmodel model)
>>  {
>> +  if (!TARGET_ZIFENCEI)
>> +return false;
>> +
>>switch (model)
>>  {
>>case MEMMODEL_ACQ_REL:
>
>
> This part looks wrong, as riscv_memmodel_needs_release_fence is only used for 
> fence instructions, not for fence.i.
>>
>> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>> index f15bad3b29e..756b35fb8c0 100644
>> --- a/gcc/config/riscv/riscv.md
>> +++ b/gcc/config/riscv/riscv.md
>> @@ -1543,19 +1543,20 @@
>>  LCT_NORMAL, VOIDmode, operands[0], Pmode,
>>  operands[1], Pmode, const0_rtx, Pmode);
>>  #else
>> -  emit_insn (gen_fence_i ());
>> +  if (TARGET_ZIFENCEI)
>> +emit_insn (gen_fence_i ());
>>  #endif
>>DONE;
>>  })
>>
>>  (define_insn "fence"
>>[(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
>> -  ""
>> +  "TARGET_ZIFENCEI"
>>"%|fence%-")
>>
>>  (define_insn "fence_i"
>>[(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
>> -  ""
>> +  "TARGET_ZIFENCEI"
>>"fence.i")
>>
>>  ;;
>
>
> The fence_i and clear_cache patterns are OK.  The fence pattern change is 
> wrong.
>
> You didn't change sync.md, but it only uses fence, so it needs no change.
>
> Jim
>

Re: [PATCH 3/3] RISC-V: Support version controling for ISA standard extensions

2020-11-17 Thread Kito Cheng via Gcc-patches

On Wed, Nov 18, 2020 at 5:29 AM Jim Wilson  wrote:
>
> On Thu, Nov 12, 2020 at 11:28 PM Kito Cheng  wrote:
>>
>> +#ifndef HAVE_AS_MARCH_ZIFENCE
>> +  /* Skip since older binutils don't recognize zifencei,
>> + we mad a mistake that is binutils 2.35 support zicsr but not support
>> + zifencei.  */
>> +  skip_zifencei = true;
>> +#endif
>
>
> I'd suggest something like "Skip since older binutils doesn't recognize 
> zifencei, we made a mistake in that binutils 2.35 supports zicsr but not 
> zifencei."

Thanks for suggestions :)

>>
>> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
>> index 172c7ca7c98..9dec5415eab 100644
>> --- a/gcc/config/riscv/riscv.h
>> +++ b/gcc/config/riscv/riscv.h
>> @@ -70,13 +70,20 @@ extern const char *riscv_default_mtune (int argc, const 
>> char **argv);
>>  #define TARGET_64BIT   (__riscv_xlen == 64)
>>  #endif /* IN_LIBGCC2 */
>>
>> +#ifdef HAVE_AS_MISA_SPEC
>> +#define ASM_MISA_SPEC ""
>> +#else
>> +#define ASM_MISA_SPEC "%{misa-spec=*}"
>> +#endif
>
>
> This is backwards.  We do want to pass -misa-spec to the assembler if it 
> supports it.  Or maybe you meant ifndef?

Oh, it should be ifndef here.

Re: RISC-V: Support version controling for ISA standard extensions

2020-11-17 Thread Kito Cheng via Gcc-patches

>> Current GCC implementation is RISC-V ISA 2.2, this patch set implement 
>> v20190608 and v20191213, and also add option 
>> -misa-spec=[2.2|20190608|20191213] to change the default ISA spec version.
>>
>> There is one major incompatible
>>
>> That option will effect the default version of each sub-extension, for 
>> example I-extension is 2.0 for 2.2 and 2.1 for v20190608 and v20191213.
>>
>> We also update the -march parser to fit the latest standard, the canonical 
>> ordering for multi-letter, drop version support for G extension, and we also 
>> omitted the version for unrecognized extension.
>>
>> And we add an special rule for G extension, imafd can't appear again if G 
>> extension is present, but zicsr and zifencei can.
>>
>> The default ISA spec will keep on 2.2, and change that in next GCC release.
>
>
> This patch series looks OK to me with minor fixes for the things I pointed 
> out.

Thanks for reviewing, I'll commit that after fixing those issues and
testing again :)

> I assume Nelson Chu will look at adding the missing zifencei binutils support?

Nelson has worked before, but got stuck due to arch implication rule,
i, g, zifencei and zicsr...
And now he plan to refer GCC implementation on binutils side after I commit :P

https://github.com/riscv/riscv-isa-manual/issues/575

> Maybe we should add an attribute for the isa-spec?  This is redundant with 
> the arch spec, but might be easier for some folk to handle.  Just one version 
> number instead of 6 version numbers.  Though I suppose if we have both we 
> might have to deal with accidental conflicts between the two.  We would need 
> a binutils patch first.  This can be fixed later if it makes sense to do it.

I've considered that before but it seems like getting more confused,
since arch string can specify the version for each extension which can
override ISA-spec, but I guess more info is not harmful, I think we
can create an issue on psabi doc and continue discuss there.

Re: [PATCH] PowerPC Fix ibm128 defaults for pr70117.c test.

2020-11-17 Thread will schmidt via Gcc-patches

On Sun, 2020-11-15 at 12:17 -0500, Michael Meissner via Gcc-patches wrote:
> From 698d9fd8a5701fa4ed9690ddf71d57765921778c Mon Sep 17 00:00:00 2001
> From: Michael Meissner 
> Date: Sun, 15 Nov 2020 00:48:23 -0500
> Subject: [PATCH] PowerPC Fix ibm128 defaults for pr70117.c test.
> 
> This patch was previously posted as a combined patch with 2 other testsuite
> patches.  I moved it to a separate patch.

I don't see that thread.  Either really old or differently named ?

> 
> This patch fixes up a failure that I saw when I built a compiler with the long
> double default set to IEEE 128-bit instead of IBM 128-bit.  Now compilers with
> either 128-bit long double default pass this test.  Can I check this into the
> master branch?

sandbox or upstream failure?
Perhaps stands alone better as 
" This patch updates the pr70177.c testcase to define IBM128_MAX as
appropriate for the IBM 128 or IEEE 128 type that is currently in use."
?


> 
> gcc/testsuite/
> 2020-11-15  Michael Meissner  
> 
>   PR target/70117
>   * gcc.target/powerpc/pr70117.c: Add support for long double being
>   IEEE 128-bit.
> ---
>  gcc/testsuite/gcc.target/powerpc/pr70117.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c 
> b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> index 3bbd2c595e0..928efe39c7b 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> @@ -9,9 +9,11 @@
> 128-bit floating point, because the type is not enabled on those
> systems.  */
>  #define LDOUBLE __ibm128
> +#define IBM128_MAX ((__ibm128) 1.79769313486231580793728971405301199e+308L)
> 
>  #elif defined(__LONG_DOUBLE_IBM128__)
>  #define LDOUBLE long double
> +#define IBM128_MAX LDBL_MAX
> 


>  #else
>  #error "long double must be either IBM 128-bit or IEEE 128-bit"
> @@ -75,10 +77,10 @@ main (void)
>if (__builtin_isnormal (ld))
>  __builtin_abort ();
> 
> -  ld = LDBL_MAX;
> +  ld = IBM128_MAX;
>if (!__builtin_isnormal (ld))
>  __builtin_abort ();
> -  ld = -LDBL_MAX;
> +  ld = -IBM128_MAX;
>if (!__builtin_isnormal (ld))
>  __builtin_abort ();
> 
> -- 
> 2.22.0
> 
>

Re: [PATCH] Include math.h in nextafter-2.c test.

2020-11-17 Thread will schmidt via Gcc-patches

On Sun, 2020-11-15 at 12:12 -0500, Michael Meissner via Gcc-patches wrote:
> Include math.h in nextafter-2.c test.
> 
> I previously posted this with two other patches.  I've separated this into its
> own patch.  What happens is because the nextafter-2.c test uses -fno-builtin,
> and it does not include math.h, the wrong nextafterl and nextforwardl gets
> called when long double is not IBM 128-bit (i.e. either 64-bit, or IEEE
> 128-bit).

Thats a sandbox issue, or something upstream ?

> 
> Rather than add the include only for the PowerPC, I thought it was better to
> always include it.  There might be some port in the future that has the same
> issue with multiple long double types without using multilibs.
> 
> Can I check this into the master branch.
> 
> 2020-11-15  Michael Meissner  
> 
>   * gcc.dg/nextafter-2.c: Include math.h.
> ---
>  gcc/testsuite/gcc.dg/nextafter-2.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.dg/nextafter-2.c 
> b/gcc/testsuite/gcc.dg/nextafter-2.c
> index e51ae94be0c..8149a709fa5 100644
> --- a/gcc/testsuite/gcc.dg/nextafter-2.c
> +++ b/gcc/testsuite/gcc.dg/nextafter-2.c
> @@ -6,6 +6,18 @@
> 
>  #include 
> 
> +/* In order to run on systems like the PowerPC that have 3 different long
> +   double types, include math.h so it can choose what is the appropriate
> +   nextafterl function to use.
> +
> +   If we didn't use -fno-builtin for this test, the PowerPC compiler would 
> have
> +   changed the names of the built-in functions that use long double.  The
> +   nextafter-1.c function runs with this mapping.
> +
> +   Since this test uses -fno-builtin, include math.h, so that math.h can make
> +   the appropriate choice to use.  */



Can this be simplified to stl

/* Include math.h so that systems like PowerPC that have different long
double types can choose the appropriate nextafterl function to use.  */


> +#include 
> +
>  #if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
>  # if !__GLIBC_PREREQ (2, 24)
>  /* Workaround buggy nextafterl in glibc 2.23 and earlier,
> -- 
> 2.22.0
> 
>

Re: [PATCH] PowerPC: Restrict long double test to use IBM long double.

2020-11-17 Thread will schmidt via Gcc-patches

On Sun, 2020-11-15 at 12:23 -0500, Michael Meissner via Gcc-patches wrote:
> PowerPC: Restrict long double test to use IBM long double.
> 
> I posted this patch previously as a set of 3 testsuite patches.  I have
> separated them into separate patches.  This patch marks the convert-bfp-11.c
> patch as needing IBM extended double.  If you look at the code, it is
> specifically designed around testing the limits of the IBM 128-bit extended
> double representation.  I added a new target-supports that says the test
> requires IBM extended long double, and changed the test to require this
> effective test.  Can I check this into the master branch?


It's harder to review that without all the history handy here.

This will stand alone better if you lead with what you are adding and
keep it clean.  i.e.

Subject: PowerPC: Add ppc_long_double_ibm effective-target check

"Add a ppc_long_double_ibm dg-require-effective-target check to ensure
tests that require LONG_DOUBLE_IBM128 . "
An additional statement to clarify it's relationship with I128
wouldn't  hurt if that is the case.  i.e. 
"This is a counterpart to LONG_DOUBLE_IEEE 128 " 


> 
> gcc/testsuite/
> 2020-11-15  Michael Meissner  
> 
>   * c-c++-common/dfp/convert-bfp-11.c: Require IBM 128-bit long
>   double.
>   * lib/target-supports.exp (check_ppc_long_double_ibm): New
>   function.
>   (is-effective-target): Add ppc_long_double_ibm.




> ---
>  .../c-c++-common/dfp/convert-bfp-11.c |  1 +
>  gcc/testsuite/lib/target-supports.exp | 19 +++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c 
> b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> index 95c433d2c24..87f6716afb3 100644
> --- a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> +++ b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
> @@ -1,4 +1,5 @@
>  /* { dg-skip-if "" { ! "powerpc*-*-linux*" } } */
> +/* { dg-require-effective-target ppc_long_double_ibm } */
> 
>  /* Test decimal float conversions to and from IBM 128-bit long double. 
> Checks are skipped at runtime if long double is not 128 bits.

ok

> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index ceee78c26a9..dc1100ba96c 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -2336,6 +2336,24 @@ proc check_effective_target_ppc_ieee128_ok { } {
>  }]
>  }
> 
> +# Return 1 if the target is a powerpc with the long double format uses the 
> IBM
> +# extended double format.
> +
> +proc check_ppc_long_double_ibm { } {
> +return [check_cached_effective_target ppc_long_double_ibm {
> + check_runtime_nocache ppc_long_double_ibm {
> + int main()
> + {
> +   #ifndef __LONG_DOUBLE_IBM128__
> + return 1;


Hmm, I have those backwards in my head apparently.  Can the return 1 if
not-defined logic be flattened out so we see the direct relationship?


> +   #else
> + return 0;
> +   #endif
> + }
> + }
> +}]
> +}




> +
>  # Return 1 if the target supports executing VSX instructions, 0
>  # otherwise.  Cache the result.
> 
> @@ -7939,6 +7957,7 @@ proc is-effective-target { arg } {
> "power10_hw" { set selected [check_power10_hw_available] }
> "ppc_float128_sw" { set selected [check_ppc_float128_sw_available] }
> "ppc_float128_hw" { set selected [check_ppc_float128_hw_available] }
> +   "ppc_long_double_ibm" { set selected [check_ppc_long_double_ibm] }
> "ppc_recip_hw"   { set selected [check_ppc_recip_hw_available] }
> "ppc_cpu_supports_hw" { set selected 
> [check_ppc_cpu_supports_hw_available] }
> "ppc_mma_hw" { set selected [check_ppc_mma_hw_available] }
> -- 
> 2.22.0
> 
>

Re: [PATCH, rs6000] Add Power10 scheduling description

2020-11-17 Thread will schmidt via Gcc-patches

On Fri, 2020-11-13 at 16:04 -0600, Pat Haugen via Gcc-patches wrote:
> Add Power10 scheduling description.
> 
> This patch adds the Power10 scheduling description. Since power10.md
> was pretty much a complete rewrite (existing version of power10.md is
> mostly just a copy of power9.md), I diffed power10.md with /dev/null
> so that the full contents of the file are shown as opposed to a diff.
> This should make it easier to read. This patch will not apply on
> current trunk do to that reason.
> 
> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new
> regressions. Ok for trunk?
> 
> -Pat


(reviewing in 2 parts, another follows tnis one with bulk of the
attachment inline..)


> 
> 
> 2020-11-13  Pat Haugen  
> 
> gcc/
>   * config/rs6000/rs6000.c (struct processor_costs): New.

Should that add/reference the "power10_cost" structure itself?

>   (rs6000_option_override_internal): Set Power10 costs.

ok


>   (rs6000_issue_rate): Set Power10 issue rate.

ok


>   * config/rs6000/power10.md: Rewrite for Power10.
> 

+;; Copyright (C) 2020-2020 Free Software Foundation, Inc.

Nit: Can probably just be a single 2020 reference.  :-)


thanks,
-Will

Re: [PATCH, rs6000] Add Power10 scheduling description

2020-11-17 Thread will schmidt via Gcc-patches

On Fri, 2020-11-13 at 16:04 -0600, Pat Haugen via Gcc-patches wrote:
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 4d528a39a37..85bb42d6dce 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1080,6 +1080,26 @@ struct processor_costs power9_cost = {
>COSTS_N_INSNS (3), /* SF->DF convert */
>  };
>  
> +/* Instruction costs on POWER10 processors.  */
> +static const
> +struct processor_costs power10_cost = {
> +  COSTS_N_INSNS (1), /* mulsi */
> +  COSTS_N_INSNS (1), /* mulsi_const */
> +  COSTS_N_INSNS (1), /* mulsi_const9 */
> +  COSTS_N_INSNS (1), /* muldi */
> +  COSTS_N_INSNS (4), /* divsi */
> +  COSTS_N_INSNS (4), /* divdi */
> +  COSTS_N_INSNS (2), /* fp */
> +  COSTS_N_INSNS (2), /* dmul */
> +  COSTS_N_INSNS (7), /* sdiv */
> +  COSTS_N_INSNS (9), /* ddiv */
> +  128,   /* cache line size */
> +  32,/* l1 cache */
> +  512,   /* l2 cache */
> +  16,/* prefetch streams */
> +  COSTS_N_INSNS (2), /* SF->DF convert */
> +};
> +


ok

>  /* Instruction costs on POWER A2 processors.  */
>  static const
>  struct processor_costs ppca2_cost = {
> @@ -4734,10 +4754,13 @@ rs6000_option_override_internal (bool global_init_p)
>   break;
>  
>case PROCESSOR_POWER9:
> -  case PROCESSOR_POWER10:
>   rs6000_cost = _cost;
>   break;
>  
> +  case PROCESSOR_POWER10:
> + rs6000_cost = _cost;
> + break;
> +
>case PROCESSOR_PPCA2:
>   rs6000_cost = _cost;
>   break;
> @@ -18001,8 +18024,9 @@ rs6000_issue_rate (void)
>case PROCESSOR_POWER8:
>  return 7;
>case PROCESSOR_POWER9:
> -  case PROCESSOR_POWER10:
>  return 6;
> +  case PROCESSOR_POWER10:
> +return 8;
>default:
>  return 1;
>}

ok


> diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
> new file mode 100644
> index 000..f9ca4cbf10e
> --- /dev/null
> +++ b/gcc/config/rs6000/power10.md
> @@ -0,0 +1,553 @@
> +;; Scheduling description for the IBM POWER10 processor.
> +;; Copyright (C) 2020-2020 Free Software Foundation, Inc.
> +;;
> +;; Contributed by Pat Haugen (pthau...@us.ibm.com).
> +
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it
> +;; under the terms of the GNU General Public License as published
> +;; by the Free Software Foundation; either version 3, or (at your
> +;; option) any later version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT
> +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +;; License for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; .
> +
> +; For Power10 we model (and try to pack) the in-order decode/dispatch groups
> +; which consist of 8 instructions max.  We do not try to model the details of
> +; the out-of-order issue queues and how insns flow to the various execution
> +; units except for the simple representation of the issue limitation of at
> +; most 4 insns to the execution units/2 insns to the load units/2 insns to
> +; the store units.
> +(define_automaton "power10dsp,power10issue,power10div")
> +
> +; Decode/dispatch slots
> +(define_cpu_unit "du0_power10,du1_power10,du2_power10,du3_power10,
> +   du4_power10,du5_power10,du6_power10,du7_power10" "power10dsp")
> +
> +; Four execution units
> +(define_cpu_unit "exu0_power10,exu1_power10,exu2_power10,exu3_power10"
> +  "power10issue")
> +; Two load units and two store units
> +(define_cpu_unit "lu0_power10,lu1_power10" "power10issue")
> +(define_cpu_unit "stu0_power10,stu1_power10" "power10issue")
> +; Create false units for use by non-pipelined div/sqrt
> +(define_cpu_unit "fx_div0_power10,fx_div1_power10" "power10div")
> +(define_cpu_unit "fp_div0_power10,fp_div1_power10,fp_div2_power10,
> +   fp_div3_power10" "power10div")

The spacing catches my eye, I'd want to add spaces around those commas,
etc.   But.. this appears to be consistent with behavior
as seen in the
existing power9.md ; power9.md ; etc. 
So it's either this way per necessity, or this way per history.
Either way, no change requested here given that precedence.
(If this and
the older stuff also needs to be cosmetically tweaked, that can be
handled later on..)


> +
> +
> +; Dispatch slots are allocated in order conforming to program order.
> +(absence_set "du0_power10" "du1_power10,du2_power10,du3_power10,du4_power10,\
> +  du5_power10,du6_power10,du7_power10")
> +(absence_set "du1_power10" "du2_power10,du3_power10,du4_power10,du5_power10,\
> +  du6_power10,du7_power10")
> +(absence_set "du2_power10" "du3_power10,du4_power10,du5_power10,du6_power10,\
> +

libgo patch committed: Update gofrontend mangling checks

2020-11-17 Thread Ian Lance Taylor via Gcc-patches

This libgo patch updates the gofrontend mangling checks, in
preparation for changing the mangling scheme again.

This is a port of two patches in the master repository.

https://golang.org/cl/259298

cmd/cgo: split gofrontend mangling checks into cmd/internal/pkgpath

This is a step toward porting https://golang.org/cl/219817 from the
gofrontend repo to the main repo.

Note that this also corrects the implementation of the v2 mangling
scheme to use ..u and ..U where appropriate.

https://golang.org/cl/259299

cmd/go: use cmd/internal/pkgpath for gccgo pkgpath symbol

This is for https://golang.org/issue/37272 and https://golang.org/issue/41862.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
7690ac52d5c30b74b90818740872cd5c4680a2aa
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index e62578fc781..1f3c25ce710 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-893fa057e36ae6c9b2ac5ffdf74634c35b3489c6
+c948c2d770122932a05b62f653efa2c51f72d3ae
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/check-packages.txt b/libgo/check-packages.txt
index efa7d191180..da77e84517e 100644
--- a/libgo/check-packages.txt
+++ b/libgo/check-packages.txt
@@ -22,6 +22,7 @@ cmd/go/internal/work
 cmd/internal/buildid
 cmd/internal/edit
 cmd/internal/objabi
+cmd/internal/pkgpath
 cmd/internal/test2json
 compress/bzip2
 compress/flate
diff --git a/libgo/go/cmd/cgo/main.go b/libgo/go/cmd/cgo/main.go
index 7fc2508021b..4952118b1f8 100644
--- a/libgo/go/cmd/cgo/main.go
+++ b/libgo/go/cmd/cgo/main.go
@@ -244,8 +244,7 @@ var exportHeader = flag.String("exportheader", "", "where 
to write export header
 var gccgo = flag.Bool("gccgo", false, "generate files for use with gccgo")
 var gccgoprefix = flag.String("gccgoprefix", "", "-fgo-prefix option used with 
gccgo")
 var gccgopkgpath = flag.String("gccgopkgpath", "", "-fgo-pkgpath option used 
with gccgo")
-var gccgoMangleCheckDone bool
-var gccgoNewmanglingInEffect bool
+var gccgoMangler func(string) string
 var importRuntimeCgo = flag.Bool("import_runtime_cgo", true, "import 
runtime/cgo in generated code")
 var importSyscall = flag.Bool("import_syscall", true, "import syscall in 
generated code")
 var goarch, goos string
diff --git a/libgo/go/cmd/cgo/out.go b/libgo/go/cmd/cgo/out.go
index dd03f7d67a5..2ab8425999f 100644
--- a/libgo/go/cmd/cgo/out.go
+++ b/libgo/go/cmd/cgo/out.go
@@ -6,6 +6,7 @@ package main
 
 import (
"bytes"
+   "cmd/internal/pkgpath"
"debug/elf"
"debug/macho"
"debug/pe"
@@ -15,7 +16,6 @@ import (
"go/token"
"internal/xcoff"
"io"
-   "io/ioutil"
"os"
"os/exec"
"path/filepath"
@@ -1286,112 +1286,24 @@ func (p *Package) writeExportHeader(fgcch io.Writer) {
fmt.Fprintf(fgcch, "%s\n", p.gccExportHeaderProlog())
 }
 
-// gccgoUsesNewMangling reports whether gccgo uses the new collision-free
-// packagepath mangling scheme (see determineGccgoManglingScheme for more
-// info).
-func gccgoUsesNewMangling() bool {
-   if !gccgoMangleCheckDone {
-   gccgoNewmanglingInEffect = determineGccgoManglingScheme()
-   gccgoMangleCheckDone = true
-   }
-   return gccgoNewmanglingInEffect
-}
-
-const mangleCheckCode = `
-package läufer
-func Run(x int) int {
-  return 1
-}
-`
-
-// determineGccgoManglingScheme performs a runtime test to see which
-// flavor of packagepath mangling gccgo is using. Older versions of
-// gccgo use a simple mangling scheme where there can be collisions
-// between packages whose paths are different but mangle to the same
-// string. More recent versions of gccgo use a new mangler that avoids
-// these collisions. Return value is whether gccgo uses the new mangling.
-func determineGccgoManglingScheme() bool {
-
-   // Emit a small Go file for gccgo to compile.
-   filepat := "*_gccgo_manglecheck.go"
-   var f *os.File
-   var err error
-   if f, err = ioutil.TempFile(*objDir, filepat); err != nil {
-   fatalf("%v", err)
-   }
-   gofilename := f.Name()
-   defer os.Remove(gofilename)
-
-   if err = ioutil.WriteFile(gofilename, []byte(mangleCheckCode), 0666); 
err != nil {
-   fatalf("%v", err)
-   }
-
-   // Compile with gccgo, capturing generated assembly.
-   gccgocmd := os.Getenv("GCCGO")
-   if gccgocmd == "" {
-   gpath, gerr := exec.LookPath("gccgo")
-   if gerr != nil {
-   fatalf("unable to locate gccgo: %v", gerr)
-   }
-   gccgocmd = gpath
-   }
-   cmd := exec.Command(gccgocmd, "-S", "-o", "-", gofilename)
-   buf, cerr := cmd.CombinedOutput()
-   if cerr != nil {
-   fatalf("%s", cerr)
-   }
-
-   // New mangling: expect go.l..u00e4ufer.Run
-

Re: [PATCH] introduce --param max-object-size

2020-11-17 Thread Martin Sebor via Gcc-patches


On 11/16/20 4:54 PM, Jeff Law wrote:


On 11/16/20 2:04 AM, Richard Biener via Gcc-patches wrote:

On Sun, Nov 15, 2020 at 1:46 AM Martin Sebor via Gcc-patches
 wrote:

GCC considers PTRDIFF_MAX - 1 to be the size of the largest object
so that the difference between a pointer to the byte just past its
end and the first one is no more than PTRDIFF_MAX.  This is too
liberal in LP64 on most systems because the size of the address
space is constrained to much less than that, both by the width
of the address bus for physical memory and by the practical
limitations of disk sizes for swap files.

Shouldn't this be a target hook like MAX_OFILE_ALIGNMENT then?


I think one could argue either way.  Yes, the absolutes are a function
of the underlying hardware and it can change over the lifetime of a
processor family which likey differs from MAX_OFILE_ALIGNMENT.


A PARAM gives the developer  a way to specify the limit which is more
flexible.


What I'm not really not sure of is whether is really matters in practice
for end users.


I'd like to do two things with this change: 1) make it easier
(and encourage users) to detect as early as possible more bugs
due to excessive sizes in various function calls (malloc, memcpy,
etc.), and 2) verify that GCC code uses the limit consistently
and correctly.

I envision the first would appeal to security-minded distros
and other organizations that use GCC as their system compiler.
For those, a target hook would be more convenient than a --param.
But I also expect individual projects wanting to impose stricter
limits than distros select.  For those, a --param is the only
choice (aside from individual -Wlarger-than- options(*)).

With this in mind, I think providing both a target hook and
a --param has the best chance of achieving these goals.

The attached patch does that.

Martin

[*] To enforce more realistic object size limits than PTRDIFF_MAX,
GCC users today have to set no fewer than five (or six if we count
-Wstack-usage) options: -Walloca-larger-than,
-Walloc-size-larger-than, -Wframe-larger-than, -Wlarger-than, and
-Wvla-larger-than.  The limits are all independent of one another.
I expect providing a single configurable baseline value for all
these options to use and refine to be helpful to these users.
Add new parameter max-object-size and new target hook max_object_size.

gcc/ChangeLog:

	* builtins.c (warn_string_no_nul): Print parameter when it's set.
	(maybe_warn_for_bound): Same.
	(compute_objsize_r): Use new parameter.
	* calls.c (alloc_max_size): Same.
	* doc/invoke.texi (-Walloca-larger-than): Mention new --param.
	 (-Walloc-size-larger-than): Same
	* doc/tm.texi.in (TARGET_MAX_OBJECT_SIZE): Document new hook.
	* doc/tm.texi: Same.
	* gimple-ssa-warn-alloca.c (warn_limit_specified_p): Call
	max_object_size.
	(adjusted_warn_limit): Same.
	(pass_walloca::execute): Same.
	* gimple-ssa-sprintf.c (get_string_length): Same.
	* gimple-ssa-warn-restrict.c (maybe_diag_access_bounds):
	* opt-functions.awk (function var_type_struct): Test for Host_Wide_Int.
	* params.opt (max-object-size): New parameter.
	* target.def (max_object_size): New hook.
	* targhooks.h (max_object_size): Declare function.
	* targhooks.c (max_object_size): Define function.
	* tree-ssa-strlen.c (maybe_set_strlen_range): Use new parameter.
	(get_len_or_size): Same.
	* tree.c (max_object_size): Same.

gcc/testsuite/ChangeLog:

	* c-c++-common/Warray-bounds-2.c: Adjust to new parameter value.
	* c-c++-common/Warray-bounds-3.c: Same.
	* c-c++-common/Wrestrict.c: Same.
	* g++.dg/warn/Wplacement-new-size-5.C: Same.
	* gcc.dg/Wstringop-overflow-41.c: Same.
	* gcc.dg/Wstringop-overflow-50.c: Same.
	* gcc.dg/Wstringop-overflow-54.c: Same.
	* gcc.dg/Wstringop-overflow-62.c: Same.
	* gcc.dg/Wstringop-overread-2.c: Same.
	* gcc.dg/attr-nonstring-2.c: Same.
	* gcc.dg/attr-nonstring-3.c: Same.
	* gcc.dg/attr-nonstring-4.c: Same.
	* gcc.dg/strlenopt-40.c: Same.
	* gcc.dg/Walloca-larger-than-4.c: New test.
	* gcc.dg/Wstringop-overflow-64.c: New test.
	* gcc.dg/Wvla-larger-than-5.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index aad99da01c2..05abda5cb13 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1127,6 +1122,7 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 		 (unsigned long long) bndrng[1].to_uhwi ());
 }
 
+  bool max_exceed = false;
   const tree maxobjsize = max_object_size ();
   const wide_int maxsiz = wi::to_wide (maxobjsize);
   if (expr)
@@ -1134,7 +1130,8 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
   tree func = get_callee_fndecl (expr);
   if (bndrng)
 	{
-	  if (wi::ltu_p (maxsiz, bndrng[0]))
+	  max_exceed = wi::ltu_p (maxsiz, bndrng[0]);
+	  if (max_exceed)
 	warned = warning_at (loc, OPT_Wstringop_overread,
  "%K%qD specified bound %s exceeds "
  "maximum object size %E",
@@ -1165,7 +1162,8 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 {
   if (bndrng)
 	{
-	  if

Re: OpenACC 'kernels' testsuite failures

2020-11-17 Thread David Edelsohn via Gcc-patches

Hi, Thomas

The patch resolves the "no such variable" error message, but I see

"during GIMPLE pass: omplower"

excess error message.

I installed Tcl 8.6 with Expect 5.45.  This removes the "no such
variable" error messages for C and C++ test cases, but they still
occur for Fortran.

I guess that the patch is necessary because there seems to be
something else still behaving differently on AIX.

Any insights?

Thanks, David

On Tue, Nov 17, 2020 at 10:03 AM David Edelsohn  wrote:
>
> Hi, Thomas
>
> The standard version of Tcl installed on AIX currently is Tcl 8.4.
> I'll see if I can have a newer version on the side.
>
> The patch resolves the "no such variable" error message.  (Great!
> Thanks!)  I now see:
>
> during GIMPLE pass: omplower
>
> as an Excess error.  Any idea where that comes from and why it doesn't
> appear on other targets?  Does that need to be captured and ignored?
>
> Thanks, David
>
> On Mon, Nov 16, 2020 at 11:46 AM Thomas Schwinge
>  wrote:
> >
> > Hi David!
> >
> > While you were writing your email, I've also been busy:
> >
> > On 2020-11-16T11:32:46-0500, David Edelsohn  wrote:
> > > On Mon, Nov 16, 2020 at 9:16 AM Thomas Schwinge  
> > > wrote:
> > >> On 2020-11-15T15:47:15-0500, David Edelsohn  wrote:
> > >> > I am seeing a number of new failures on AIX related to the OpenACC
> > >> > kernels patches.
> > >> >
> > >> > c-c++-common/goacc/kernels-decompose-1.c
> > >> > c-c++-common/goacc/kernels-decompose-2.c
> > >> > gfortran.dg/goacc/kernels-decompose-1.f95
> > >> > gfortran.dg/goacc/kernels-decompose-2.f95
> > >> > libgomp.oacc-c++/../libgomp.oacc-c-c++-common/kernels-decompose-1.c
> > >> > libgomp.oacc-fortran/pr94358-1.f90
> > >>
> > >> I suppose what you're asking about is what appears in
> > >>  reports as:
> > >>
> > >> ERROR: c-c++-common/goacc/kernels-decompose-1.c: can't read 
> > >> "c_loop_i": no such variable for " dg-line 24 l_loop_i[incr c_loop_i] "
> > >> UNRESOLVED: c-c++-common/goacc/kernels-decompose-1.c: can't read 
> > >> "c_loop_i": no such variable for " dg-line 24 l_loop_i[incr c_loop_i] "
> > >>
> > >> Etc.
> >
> > In the mean time, I did remember that weeks ago, I had noticed this
> > following "detail": on  we
> > read that "Starting with the Tcl 8.5 release, the variable 'varName'
> > passed to 'incr' may be unset, and in that case, it will be set to
> > [...]".  Tcl 8.5 has been released in 2007.
> >
> > Per 'gcc/doc/install.texi' we require:
> >
> > @item DejaGnu 1.4.4
> > @itemx Expect
> > @itemx Tcl
> >
> > Necessary to run the GCC testsuite; [...]
> >
> > DejaGnu has been released in 2004 (so cannot have required Tcl 8.5
> > released in 2007).
> >
> > >> However, per the reports posted there, these really only (!) appear to
> > >> fail for your "Native configuration is powerpc-ibm-aix7.2.3.0" testing,
> > >> strange.  Which versions of DejaGnu and Tcl are used?
> > >
> > > For my internal tester DejaGNU reports the following:
> > >
> > > Expect version is 5.42.1
> > > Tcl version is 8.4
> > > Framework version is 1.5.3
> >
> > There we go: you're on Tcl 8.4.  ;-D
> >
> > >> On  I see there are AIX
> > >> systems gcc111, gcc119 -- maybe I'll have luck reproducing the issue
> > >> there.
> >
> > On these, we've got:
> >
> > tschwinge@gcc111:[/home/tschwinge]/opt/freeware/bin/runtest --version
> > WARNING: Couldn't find the global config file.
> > Expect version is   5.45.4
> > Tcl version is  8.6
> > Framework version is1.4.4
> >
> > tschwinge@gcc119:[/home/tschwinge]/opt/freeware/bin/runtest --version
> > WARNING: Couldn't find the global config file.
> > Expect version is   5.44.1.15
> > Tcl version is  8.5
> > Framework version is1.5.3
> >
> > ..., so can't (easily) be used to reproduce the issue.  (... but it
> > wouldn't be specific to AIX, anyway.)
> >
> > Before I spend time on building/verifying with Tcl 8.4: are you able to
> > give the attached patch "[testsuite] Avoid Tcl 8.5-specific behavior" a
> > try?
> >
> >
> > Grüße
> >  Thomas
> >
> >
> > >> Admittedly, using Tcl code inside DejaGnu directives is not most common,
> > >> but it does make sense conceptually (at least to me), and reportedly does
> > >> work with a large number of DejaGnu/Tcl versions combinations.
> > >>
> > >> > Looking at the testsuite logs I see:
> > >> >
> > >> > fatal error: GCC is not configured to support amdgcn-amdhsa as offload 
> > >> > target
> > >>
> > >> That one's not actually related to the new OpenACC 'kernels' testcases:
> > >> it's just the testsuite harness checking whether GCN offloading is
> > >> configured.  (See  "GCC is not configured to
> > >> support amdgcn-unknown-amdhsa as offload target"; this should one appear
> > >> once per testsuite.)
> > >>
> > >> > I don't know why this is different from the other OpenACC tests.
> >

Re: [committed] libstdc++: Use custom timespec in system calls [PR 93421]

2020-11-17 Thread Jonathan Wakely via Gcc-patches


On 14/11/20 14:23 +, Jonathan Wakely via Libstdc++ wrote:

On Sat, 14 Nov 2020, 13:30 Mike Crowe via Libstdc++, 
wrote:


On Saturday 14 November 2020 at 00:17:59 +, Jonathan Wakely via
Libstdc++ wrote:
> On 32-bit targets where userspace has switched to 64-bit time_t, we
> cannot pass struct timespec to SYS_futex or SYS_clock_gettime, because
> the userspace definition of struct timespec will not match what the
> kernel expects.
>
> We use the existence of the SYS_futex_time64 or SYS_clock_gettime_time64
> macros to imply that userspace *might* have switched to the new timespec
> definition.  This is a conservative assumption. It's possible that the
> new syscall numbers are defined in the libc headers but that timespec
> hasn't been updated yet (as is the case for glibc currently). But using
> the alternative struct with two longs is still OK, it's just redundant
> if userspace timespec still uses a 32-bit time_t.
>
> We also check that SYS_futex_time64 != SYS_futex so that we don't try
> to use a 32-bit tv_sec on modern targets that only support the 64-bit
> system calls and define the old macro to the same value as the new one.
>
> We could possibly check #ifdef __USE_TIME_BITS64 to see whether
> userspace has actually been updated, but it's not clear if user code
> is meant to inspect that or if it's only for libc internal use.
>
> libstdc++-v3/ChangeLog:
>
>   PR libstdc++/93421
>   * src/c++11/chrono.cc [_GLIBCXX_USE_CLOCK_GETTIME_SYSCALL]
>   (syscall_timespec): Define a type suitable for SYS_clock_gettime
>   calls.
>   (system_clock::now(), steady_clock::now()): Use syscall_timespec
>   instead of timespec.
>   * src/c++11/futex.cc (syscall_timespec): Define a type suitable
>   for SYS_futex and SYS_clock_gettime calls.
>   (relative_timespec): Use syscall_timespec instead of timespec.
>   (__atomic_futex_unsigned_base::_M_futex_wait_until): Likewise.
>   (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
>   Likewise.
>
> Tested x86_64-linux, -m32 too. Committed to trunk.
>

> commit 4d039cb9a1d0ffc6910fe09b726c3561b64527dc
> Author: Jonathan Wakely 
> Date:   Thu Sep 24 17:35:52 2020
>
> libstdc++: Use custom timespec in system calls [PR 93421]
>
> On 32-bit targets where userspace has switched to 64-bit time_t, we
> cannot pass struct timespec to SYS_futex or SYS_clock_gettime,
because
> the userspace definition of struct timespec will not match what the
> kernel expects.
>
> We use the existence of the SYS_futex_time64 or
SYS_clock_gettime_time64
> macros to imply that userspace *might* have switched to the new
timespec
> definition.  This is a conservative assumption. It's possible that
the
> new syscall numbers are defined in the libc headers but that timespec
> hasn't been updated yet (as is the case for glibc currently). But
using
> the alternative struct with two longs is still OK, it's just
redundant
> if userspace timespec still uses a 32-bit time_t.
>
> We also check that SYS_futex_time64 != SYS_futex so that we don't try
> to use a 32-bit tv_sec on modern targets that only support the 64-bit
> system calls and define the old macro to the same value as the new
one.
>
> We could possibly check #ifdef __USE_TIME_BITS64 to see whether
> userspace has actually been updated, but it's not clear if user code
> is meant to inspect that or if it's only for libc internal use.

Presumably this is change is only good for the short term? We really want
to be calling the 64-bit time_t versions of SYS_futex and SYS_clock_gettime
passing 64-bit struct timespec so that this code continues to work
correctly after 2038 (for CLOCK_REALTIME) or if someone is unlucky enough
to have a system uptime of over 68 years (for CLOCK_MONOTONIC.) Perhaps
that's part of the post-GCC11 work that you plan to do.



Right. This is definitely a short term solution, but I ran out of time to
do something better for GCC 11.


We can still do a bit better than that patch though, which was just
broken (it's SYS_clock_gettime64 not SYS_clock_gettime_time64 for a
start).

This partially reverts it, to remove the conditional compilation
related to SYS_clock_gettime64, because that's almost certainly never
going to be needed.

Tested x86_64-linux, with glibc 2.12 and 2.31, committed to trunk.



> @@ -195,7 +205,7 @@ namespace
>   if (__s.count() < 0) [[unlikely]]
> return false;
>
> - struct timespec rt;
> + syscall_timespec rt;
>   if (__s.count() > __int_traits::__max) [[unlikely]]
> rt.tv_sec = __int_traits::__max;

Do these now need to be __int_traits::__max in case time_t is 64-bit
yet syscall_timespec is using 32-bit long?



Ah yes. Maybe decltype(rt.tv_sec).


I'll fix that in the next patch.


commit 1e3e6c700f04fe6992b9077541e434172c1cbdae
Author: Jonathan Wakely 
Date:   Tue Nov 17 16:13:02 2020

libstdc++: Revert changes

Re: [PATCH] Remove lambdas from _Rb_tree

2020-11-17 Thread Jonathan Wakely via Gcc-patches


On 17/11/20 21:51 +0100, FranÃ§ois Dumont via Libstdc++ wrote:
This is a change that has been done to _Hashtable and that I forgot to 
propose for _Rb_tree.


The _GLIBCXX_XREF macro can be easily removed of course.

Â Â Â  libstdc++: _Rb_tree code cleanup, remove lambdas.

Â Â Â  Use an additional template parameter on the clone method to 
propagate if the values must be

Â Â Â  copy or move rather than lambdas.

Â Â Â  libstdc++-v3/ChangeLog:

Â Â Â Â Â Â Â Â Â Â Â  * include/bits/move.h (_GLIBCXX_XREF): New.
Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_tree.h: Adapt to use latter.
Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_S_fwd_value_for): New.
Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_M_clone_node): Add _Tree template 
parameter.
Â Â Â Â Â Â Â Â Â Â Â  Use _S_fwd_value_for.
Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_M_cbegin): New.
Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_M_begin): Use latter.
Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_M_copy): Add _Tree template parameter.
Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_M_move_data): Use rvalue reference for 
_Rb_tree parameter.

Â Â Â Â Â Â Â Â Â Â Â  (_Rb_tree<>::_M_move_assign): Likewise.

Tested under Linux x86_64.

Ok to commit ?


GCC is in stage 3 now, so this should have been posted last week
really.



diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 5a4dbdc823c..e0d68ca9108 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -158,9 +158,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

  /// @} group utilities

+#define _GLIBCXX_XREF(_Tp) _Tp&&


I think this does improve the code that uses this. But the correct
name for this is forwarding reference, so I think FWDREF would be
better than XREF. XREF doesn't tell me anything about what it's for.


#define _GLIBCXX_MOVE(__val) std::move(__val)
#define _GLIBCXX_FORWARD(_Tp, __val) std::forward<_Tp>(__val)
#else
+#define _GLIBCXX_XREF(_Tp) const _Tp&
#define _GLIBCXX_MOVE(__val) (__val)
#define _GLIBCXX_FORWARD(_Tp, __val) (__val)
#endif
diff --git a/libstdc++-v3/include/bits/stl_tree.h 
b/libstdc++-v3/include/bits/stl_tree.h
index ec141ea01c7..128c7e2c892 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -478,11 +478,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

template
  _Link_type
-#if __cplusplus < 201103L
- operator()(const _Arg& __arg)
-#else
- operator()(_Arg&& __arg)
-#endif
+ operator()(_GLIBCXX_XREF(_Arg) __arg)
  {
_Link_type __node = static_cast<_Link_type>(_M_extract());
if (__node)
@@ -544,11 +540,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

template
  _Link_type
-#if __cplusplus < 201103L
- operator()(const _Arg& __arg) const
-#else
- operator()(_Arg&& __arg) const
-#endif
+ operator()(_GLIBCXX_XREF(_Arg) __arg) const
  { return _M_t._M_create_node(_GLIBCXX_FORWARD(_Arg, __arg)); }

  private:
@@ -655,11 +647,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_put_node(__p);
  }

-  template
+#if __cplusplus >= 201103L
+  template
+   static constexpr
+   typename conditional::value,
+const value_type&, value_type&&>::type
+   _S_fwd_value_for(value_type& __val) noexcept
+   { return std::move(__val); }
+#else
+  template
+   static const value_type&
+   _S_fwd_value_for(value_type& __val)
+   { return __val; }
+#endif
+
+  template
_Link_type
-   _M_clone_node(_Const_Link_type __x, _NodeGen& __node_gen)
+   _M_clone_node(_GLIBCXX_XREF(_Tree),


Since the _Tree type is only used to decide whether to copy or move,
could it just be a bool instead?

  template
_Link_type
_M_clone_node(_Link_type __x, _NodeGen& __node_gen)

Then it would be called as _M_clone_node<_Move>(__x, __node_gen)
instead of _M_clone_node(_GLIBCXX_FORWARD(_Tree, __t), __x, __node_gen).
That seems easier to read.


+ _Link_type __x, _NodeGen& __node_gen)
{
- _Link_type __tmp = __node_gen(*__x->_M_valptr());
+ _Link_type __tmp
+   = __node_gen(_S_fwd_value_for<_Tree>(*__x->_M_valptr()));


Is _S_fwd_value_for necessary? This would work:

#if __cplusplus >= 201103L
  using _Vp = typename conditional::value,
   const value_type&,
   value_type&&>::type;
#else
  typedef const value_type& _Vp;
#endif
  _Link_type __tmp
= __node_gen(_GLIBCXX_FORWARD(_Vp, *__x->_M_valptr()));

Or with the suggestion above, the typedef would be:

  using _Vp = typename conditional<_Move, value_type&&,
   const value_type&>::type;



  __tmp->_M_color = __x->_M_color;
  __tmp->_M_left = 0;
  __tmp->_M_right = 0;
@@ -748,9 +756,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  { return this->_M_impl._M_header._M_right;

Re: [PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread Peter Bergner via Gcc-patches

On 11/17/20 5:01 PM, Segher Boessenkool wrote:
> On Tue, Nov 17, 2020 at 12:41:58PM -0600, Peter Bergner wrote:
>>> +;; Return 1 if this operand is valid for an MMA disassemble insn.
>>> +(define_predicate "mma_disassemble_output_operand"
>>> +  (match_code "reg,subreg,mem")
>>> +{
>>> +  if (REG_P (op) && !vsx_register_operand (op, mode))
>>> +return false;
>>> +  return true;
>>> +})
>>
>> Do we really want to accept subregs here?  If so, why are they not also 
>> required
>> to be vsx_register_operand()?
> 
> That *does* allow subregs!

We never call vsx_register_operand() with a subreg, because the REG_P() guards 
it.
That means we drop down into the "return true;" for all subregs and mems,
regardless of whether the subreg is of a vsx register or not.

So my question is, if we don't allow a reg of a non-vsx register, why should
we allow subregs of non-vsx registers?

Peter

Re: [PATCH, rs6000] Re-enable vector pair memcpy/memmove expansion

2020-11-17 Thread Segher Boessenkool

On Tue, Nov 17, 2020 at 12:03:05PM -0600, acsaw...@linux.ibm.com wrote:
> From: Aaron Sawdey 
> 
> After the MMA opaque mode patch goes in, we can re-enable
> use of vector pair in the inline expansion of memcpy/memmove.
> 
> After bootstrap/regtest, OK for trunk?

Yes, okay for trunk after the rs6000 "opaque" patch goes in.  Thanks!


Segher

Re: [PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread Segher Boessenkool

On Tue, Nov 17, 2020 at 12:41:58PM -0600, Peter Bergner wrote:
> > +;; Return 1 if this operand is valid for an MMA disassemble insn.
> > +(define_predicate "mma_disassemble_output_operand"
> > +  (match_code "reg,subreg,mem")
> > +{
> > +  if (REG_P (op) && !vsx_register_operand (op, mode))
> > +return false;
> > +  return true;
> > +})
> 
> Do we really want to accept subregs here?  If so, why are they not also 
> required
> to be vsx_register_operand()?

That *does* allow subregs!

;; Return 1 if op is a VSX register.
(define_predicate "vsx_register_operand"
  (match_operand 0 "register_operand")
{
  if (SUBREG_P (op))
{
  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
return 0;

  op = SUBREG_REG (op);
}

...

int
register_operand (rtx op, machine_mode mode)
{
  if (GET_CODE (op) == SUBREG)
{
  rtx sub = SUBREG_REG (op);

  /* Before reload, we can allow (SUBREG (MEM...)) as a register operand
 because it is guaranteed to be reloaded into one.
 Just make sure the MEM is valid in itself.
 (Ideally, (SUBREG (MEM)...) should not exist after reload,
 but currently it does result from (SUBREG (REG)...) where the
 reg went on the stack.)  */
  if (!REG_P (sub) && (reload_completed || !MEM_P (sub)))
return 0;
}
  else if (!REG_P (op))
return 0;
  return general_operand (op, mode);
}

(The SFmode thing and subregs-of-mem are complifying things, just to
keep things interesting.)

We need to allow subregs of reg pretty much always, for reinterpret_cast
like things (say, changing one vector mode to another).

Segher

Re: [PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread Segher Boessenkool

Hi!

On Tue, Nov 17, 2020 at 11:48:04AM -0600, acsaw...@linux.ibm.com wrote:
> This patch changes powerpc MMA builtins to use the new opaque
> mode class and use modes OO (32 bytes) and XO (64 bytes)
> instead of POI/PXI. Using the opaque modes prevents
> optimization from trying to do anything with vector
> pair/quad, which was the problem we were seeing with the
> partial integer modes.

>   * gcc/config/rs6000/mma.md (unspec):
>   Add assemble/extract UNSPECs.

That fits on one line :-)  (More similar stuff below.)

>   * gcc/config/rs6000/rs6000-modes.def: Create XO and OO modes.

You also delete OImode, XImode, POImode, and PXImode here.

> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -19,24 +19,19 @@
>  ;; along with GCC; see the file COPYING3.  If not see
>  ;; .
>  
> -;; The MMA patterns use the multi-register PXImode and POImode partial
> +;; The MMA patterns use the multi-register XOmode and OOmode partial
>  ;; integer modes to implement the target specific __vector_quad and
>  ;; __vector_pair types that the MMA built-in functions reference.

They are opaque modes, not partial integer modes now :-)

> +;; We define these modes with the new OPAQUE_MODE mechanism to prevent
> +;; anything from trying to open them up.

Any time you write "new" it is dated soon.  Maybe just "these are
OPAQUE_MODEs" or similar?

> +(define_insn_and_split "*movxo"
> +  [(set (match_operand:XO 0 "nonimmediate_operand" "=d,m,d,d")
> + (match_operand:XO 1 "input_operand" "m,d,d,O"))]
>"TARGET_MMA
> +   && (gpc_reg_operand (operands[0], XOmode)
> +   || gpc_reg_operand (operands[1], XOmode))"
>"@
> #
> #
> #
> xxsetaccz %A0"
>"&& reload_completed
> +   && !(fpr_reg_operand (operands[0], XOmode) && operands[1] == const0_rtx)"

XOmode can never be a const_int anymore, so this should just be
  "&& reload_completed"
and nothing more.

Or do you need to handle the unspec case in here?  It can just be
handled in a separate pattern in this case, so that would be much
simpler.

> +(define_insn_and_split "*mma_assemble_pair"
> +  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
> + (unspec:OO [(match_operand:V16QI 1 "mma_assemble_input_operand" "mwa")
> + (match_operand:V16QI 2 "mma_assemble_input_operand" "mwa")]
> + UNSPEC_MMA_ASSEMBLE))]
> +  "TARGET_MMA
> +   && vsx_register_operand (operands[0], OOmode)"

operands[0] always is an OOmode vsx_register_operand, so this is just
  "TARGET_MMA"

> +(define_insn_and_split "*mma_disassemble_pair"
> +  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> +   (unspec:V16QI [(match_operand:OO 1 "input_operand" "wa")
> +  (match_operand 2 "const_int_operand")]
> +   UNSPEC_MMA_EXTRACT))]

Use "const_0_to_1_operand" instead?

> +  "TARGET_MMA
> +   && fpr_reg_operand (operands[1], OOmode)"

But registers can be all in "wa", not just the FPRs?

> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  gcc_assert (REG_P (operands[1]));

That is guaranteed by the insn condition.

> +  int reg = REGNO (operands[1]);
> +  int regoff = INTVAL (operands[2]);

Should this consider endianness?

> +  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
> +  emit_move_insn (operands[0], src);
> +  DONE;
> +})


> +(define_expand "mma_disassemble_pair"

Please put the define_expand before the corresponding define_insn.

> +  [(match_operand:V16QI 0 "mma_disassemble_output_operand")
> +   (match_operand:OO 1 "input_operand")
> +   (match_operand 2 "const_int_operand")]

0_to_1 again

> +  "TARGET_MMA"
> +{
> +  rtx src;
> +  int regoff = INTVAL (operands[2]);
> +  gcc_assert (IN_RANGE (regoff, 0, 1));

This assert then trivially isn't needed anymore, either.

> +  src = gen_rtx_UNSPEC (V16QImode,
> +gen_rtvec (2, operands[1], GEN_INT (regoff)),
> +UNSPEC_MMA_EXTRACT);
> +  emit_move_insn (operands[0], src);
>DONE;
>  })


> +(define_insn_and_split "*mma_disassemble_acc"
> +  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> +   (unspec:V16QI [(match_operand:XO 1 "input_operand" "d")
> +  (match_operand 2 "const_int_operand")]

"const_0_to_3_operand", and everything else as well.

> +(define_expand "mma_disassemble_acc"

Also wrt this of course.

>  (define_expand "mma_xxsetaccz"
> -  [(set (match_operand:PXI 0 "fpr_reg_operand")
> +  [(set (match_operand:XO 0 "fpr_reg_operand")
>   (const_int 0))]
>"TARGET_MMA"
>  {
> -  emit_insn (gen_movpxi (operands[0], const0_rtx));
> +  emit_insn (gen_movxo (operands[0], const0_rtx));
>DONE;
>  })

That should wrap the zero in an unspec?

> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1072,8 +1072,9 @@ (define_predicate "input_operand"
>&& easy_fp_constant (op, mode))
>  return 1;
>  
> -  /*

Re: [PATCH v5] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/4/20 8:10 AM, Raoni Fassina Firmino via Gcc-patches wrote:
> On Wed, Nov 04, 2020 at 10:35:03AM +0100, Richard Biener wrote:
>>> +/* Expand call EXP to the fegetround builtin (from C99 fenv.h), returning 
>>> the
>>> +   result and setting it in TARGET.  Otherwise return NULL_RTX on failure. 
>>>  */
>>> +static rtx
>>> +expand_builtin_fegetround (tree exp, rtx target, machine_mode target_mode)
>>> +{
>>> +  if (!validate_arglist (exp, VOID_TYPE))
>>> +return NULL_RTX;
>>> +
>>> +  insn_code icode = direct_optab_handler (fegetround_optab, SImode);
>>> +  if (icode == CODE_FOR_nothing)
>>> +return NULL_RTX;
>>> +
>>> +  if (target == 0
>>> +  || GET_MODE (target) != target_mode
>>> +  || !(*insn_data[icode].operand[0].predicate) (target, target_mode))
>>> +target = gen_reg_rtx (target_mode);
>>> +
>>> +  rtx pat = GEN_FCN (icode) (target);
>>> +  if (!pat)
>>> +return NULL_RTX;
>>> +  emit_insn (pat);
>> I think you need to verify whether the expansion ended up in 'target'
>> and otherwise emit a move since usually 'target' is just a hint.
> I thought the "if (target == 0 ..." took care of that. The expands do
> emit a move, if that helps.
>
> For feclearexcept and feraiseexcept I included tests to variable
> 'target', including none, but now I see that I did not do the same for
> fegetround, I can add the same if it is necessary, but the test do check
> if the return is correct, so I don't know.
>
>
>>> +@cindex @code{fegetround@var{m}} instruction pattern
>>> +@item @samp{fegetround@var{m}}
>>> +Store the current machine floating-point rounding mode into operand 0.
>>> +Operand 0 has mode @var{m}, which is scalar.  This pattern is used to
>>> +implement the @code{fegetround} function from the ISO C99 standard.
>> I think this needs to elaborate on the format of the "rounding mode".
>>
>> AFAICS you do nothing to marshall with the actually used libc
>> implementation which AFAIU can choose arbitrary values for
>> the FE_* macros.  I'm not sure we require the compiler to be
>> configured for one specific C library and for example require
>> matching FE_* macro definitions for all uses of the built
>> compiler.
>>
>> For the patch at hand you seem to assume the libc "format"
>> matches the hardware one (which would of course be reasonable).
>>
>> Does that actually hold up when looking at libcs other than 
>> glibc supporting powerpc?
> I checked in some other libc implementations that have POWER support and
> all have the same value as glic for the four rounding modes and the five
> exception flags from libc. The libcs implementations I checked are:
>
>  - musl
>  - uclibc & uclibc-ng
>  - freebsd
>
> Is There any other I am missing?
I think the concern here is that while the libcs we have visibility into
have consistent values, I don't think that's necessarily guaranteed.  
I'm not sure how to DTRT here.  We may not have the target headers if
we're doing a cross compile, so a configure test may not do what we 
need.  In fact, ISTM that there is no reliable configure or compile time
check we can do since the constants are part of the runtime and can
change independently of the compiler.

Jeff

Re: [PATCH v5] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/4/20 8:10 AM, Raoni Fassina Firmino via Gcc-patches wrote:
> On Wed, Nov 04, 2020 at 10:35:03AM +0100, Richard Biener wrote:
>>> +/* Expand call EXP to the fegetround builtin (from C99 fenv.h), returning 
>>> the
>>> +   result and setting it in TARGET.  Otherwise return NULL_RTX on failure. 
>>>  */
>>> +static rtx
>>> +expand_builtin_fegetround (tree exp, rtx target, machine_mode target_mode)
>>> +{
>>> +  if (!validate_arglist (exp, VOID_TYPE))
>>> +return NULL_RTX;
>>> +
>>> +  insn_code icode = direct_optab_handler (fegetround_optab, SImode);
>>> +  if (icode == CODE_FOR_nothing)
>>> +return NULL_RTX;
>>> +
>>> +  if (target == 0
>>> +  || GET_MODE (target) != target_mode
>>> +  || !(*insn_data[icode].operand[0].predicate) (target, target_mode))
>>> +target = gen_reg_rtx (target_mode);
>>> +
>>> +  rtx pat = GEN_FCN (icode) (target);
>>> +  if (!pat)
>>> +return NULL_RTX;
>>> +  emit_insn (pat);
>> I think you need to verify whether the expansion ended up in 'target'
>> and otherwise emit a move since usually 'target' is just a hint.
> I thought the "if (target == 0 ..." took care of that. The expands do
> emit a move, if that helps.
It looks like if we have a passed in target and it either has the wrong
mode or it does not match the predicate, then we generaet a new target
and use that instead.  I don't see where we'd copy from that new target
to the original desired target.  For some expanders the caller would
handle that, but I don't see how that's possible for this one without
the caller digging into the generated RTL to determine that
expand_builtin_fegetround put the result somewhere other than TARGET and
thus a copy is needed.

That may be what Richi is worried about.

Jeff

Re: [PATCH 1/2] correct BB frequencies after loop changed

2020-11-17 Thread Jeff Law via Gcc-patches



Minor questions for Jan and Richi embedded below...

On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
> When investigating the issue from 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
> I find the BB COUNTs of loop seems are not accurate in some case.
> For example:
>
> In below figure:
>
>
>COUNT:268435456  pre-header
> |
> |  ..
> |  ||
> V  v|
>COUNT:805306369|
>/ \  |
>33%/   \ |
>  / \|
> v   v   |
> COUNT:268435456  COUNT:536870911  | 
> exit-edge |   latch |
>   ._.
>
> Those COUNTs have below equations:
> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33
> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of 
> latch:536870911
>
>
> While after pcom:
>
>COUNT:268435456  pre-header
> |
> |  ..
> |  ||
> V  v|
>COUNT:268435456|
>/ \  |
>50%/   \ |
>  / \|
> v   v   |
> COUNT:134217728  COUNT:134217728  | 
> exit-edge |   latch |
>   ._.
>
> COUNT != COUNT + COUNT
> COUNT != COUNT
>
> In some cases, the probility of exit-edge is easy to estimate, then
> those COUNTs of other BBs in loop can be re-caculated.
>
> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>
> Jiufu
>
> gcc/ChangeLog:
> 2020-10-09  Jiufu Guo   
>
>   * cfgloopmanip.h (recompute_loop_frequencies): New function.
>   * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>   * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>   recompute_loop_frequencies.
>
> ---
>  gcc/cfgloopmanip.c| 53 +++
>  gcc/cfgloopmanip.h|  2 +-
>  gcc/tree-ssa-loop-manip.c | 28 +++--
>  3 files changed, 57 insertions(+), 26 deletions(-)
>
> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
> index 73134a20e33..b0ca82a67fd 100644
> --- a/gcc/cfgloopmanip.c
> +++ b/gcc/cfgloopmanip.c
> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimplify-me.h"
>  #include "tree-ssa-loop-manip.h"
>  #include "dumpfile.h"
> +#include "cfgrtl.h"
>  
>  static void copy_loops_to (class loop **, int,
>  class loop *);
> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>  
>return nloop;
>  }
> +
> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
> +   is NEW_PROB.  */
> +
> +bool
> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
> +{
> +  edge exit = single_exit (loop);
> +  if (!exit)
> +return false;
> +
> +  edge e;
> +  edge_iterator ei;
> +  edge non_exit;
> +  basic_block * bbs;
> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
> +  profile_probability exit_p = exit_count.probability_in 
> (loop->header->count);
> +  profile_count base_count = loop->header->count;
> +  profile_count after_num = base_count.apply_probability (exit_p);
> +  profile_count after_den = base_count.apply_probability (new_prob);
> +
> +  /* Update BB counts in loop body.
> + COUNT = COUNT
> + COUNT = COUNT * exit_edge_probility
> + The COUNT = COUNT * old_exit_p / new_prob.  */
> +  bbs = get_loop_body (loop);
> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
> +  after_den);
> +  free (bbs);
> +
> +  /* Update probability and count of the BB besides exit edge (maybe latch). 
>  */
> +  FOR_EACH_EDGE (e, ei, exit->src->succs)
> +if (e != exit)
> +  break;
> +  non_exit = e;
Are we sure that exit->src has just two successors (will that case be
canonicalized before we get here?).  If it has > 2 successors, then I'm
pretty sure the frequencies get mucked up.  Richi could probably answer
whether or not the block with the loop exit edge can have > 2 successors.


> +
> +  non_exit->probability = new_prob.invert ();
> +  non_exit->dest->count = profile_count::zero ();
> +  FOR_EACH_EDGE (e, ei, non_exit->dest->preds)
> +non_exit->dest->count += e->src->count.apply_probability 
> (e->probability);
This generally looks sensible with the caveat that if exit->src has

[PATCH] recognize implied ranges for modulo.

2020-11-17 Thread Andrew MacLeod via Gcc-patches


PR 91029 observes when

 a % b > 0 && b >= 0,

then a has an implied range of  a >=0.  likewise

a % b < 0 implies a range of a <= 0.

This patch is a good example of how range-ops can be leveraged to solve 
problems. It simply implements operator_trunc_mod::op1_range()  to solve 
for 'A' when the LHS and 'b' are known to be within the specified 
ranges.    I also added a a test case to show folding of conditions 
based on that.


Bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew






commit 1e27e7a582a9b86bcf86f5c103cd947672746e97
Author: Andrew MacLeod 
Date:   Tue Nov 17 14:47:58 2020 -0500

recognize implied ranges for modulo.

implement op1_range for modulo with implied positive and negative ranges.

gcc/
PR tree-optimization/91029
* range-op.cc (operator_trunc_mod::op1_range): New.
gcc/testsuite/
* gcc.dg/pr91029.c: New.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index d0adc95527a..f37796cac70 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2634,6 +2634,9 @@ public:
 		const wide_int _ub,
 		const wide_int _lb,
 		const wide_int _ub) const;
+  virtual bool op1_range (irange , tree type,
+			  const irange ,
+			  const irange ) const;
 } op_trunc_mod;
 
 void
@@ -2680,6 +2683,31 @@ operator_trunc_mod::wi_fold (irange , tree type,
   value_range_with_overflow (r, type, new_lb, new_ub);
 }
 
+bool
+operator_trunc_mod::op1_range (irange , tree type,
+			   const irange ,
+			   const irange ) const
+{
+  // PR 91029.  Check for signed truncation with op2 >= 0.
+  if (TYPE_SIGN (type) == SIGNED && wi::ge_p (op2.lower_bound (), 0, SIGNED))
+{
+  unsigned prec = TYPE_PRECISION (type);
+  // if a & b >=0 , then a >= 0.
+  if (wi::ge_p (lhs.lower_bound (), 0, SIGNED))
+	{
+	  r = value_range (type, wi::zero (prec), wi::max_value (prec, SIGNED));
+	  return true;
+	}
+  // if a & b < 0 , then a <= 0.
+  if (wi::lt_p (lhs.upper_bound (), 0, SIGNED))
+	{
+	  r = value_range (type, wi::min_value (prec, SIGNED), wi::zero (prec));
+	  return true;
+	}
+}
+  return false;
+}
+
 
 class operator_logical_not : public range_operator
 {
diff --git a/gcc/testsuite/gcc.dg/pr91029.c b/gcc/testsuite/gcc.dg/pr91029.c
new file mode 100644
index 000..8a4134c5d96
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr91029.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+void kill (void);
+int xx;
+
+void f1 (int i)
+{
+  if ((i % 7) == 3)
+{
+  xx = (i < 0);
+  if (xx)
+kill ();
+}
+}
+
+void f2 (int i)
+{
+  if ((i % 7) >= 0)
+{
+  xx = (i < 0);
+  if (xx)
+kill ();
+}
+}
+
+void f3 (int i)
+{
+  if ((i % 7) == -3)
+{
+  xx = (i > 0);
+  if (xx)
+kill ();
+}
+}
+
+void f4 (int i)
+{
+  if ((i % 7) < 0)
+{
+  xx = (i > 0);
+  if (xx)
+kill ();
+}
+}
+
+/* { dg-final { scan-tree-dump-not "kill" "evrp" } }  */

Re: [RFC, testsuite] Add dg-save-linenr

2020-11-17 Thread Jeff Law via Gcc-patches




On 10/30/20 5:34 AM, Thomas Schwinge wrote:
> Hi!
>
> On 2017-05-22T18:55:29+0200, Tom de Vries  wrote:
>> On 05/16/2017 03:12 PM, Rainer Orth wrote:
>>> [...], but the new proc ['dg-line'] needs documenting in sourcebuild.texi.
>> Attached patch adds the missing documentation.
> OK to expand that with the attached patch to "Document that 'linenumvar'
> in 'dg-line' may contain Tcl syntax"?  (Hooray for embedded Tcl!  --
> Don't hurt me; I (later) have a use case where this does make things
> easier.)
>
> '{ dg-line LINENUMVAR }'
>  This DejaGnu directive sets the variable LINENUMVAR to the line
>  number of the source line.  The variable LINENUMVAR, which must be
>  unique per testcase, may then be used in subsequent 'dg-error',
>  'dg-warning', 'dg-message' and 'dg-bogus' directives.  For example:
>
>   int a;   /* { dg-line first_def_a } */
>   float a; /* { dg-error "conflicting types of" } */
>   /* { dg-message "previous declaration of" "" { target *-*-* } 
> first_def_a } */
>
>  Note that LINENUMVAR may contain Tcl syntax, for example:
>
>   #pragma acc parallel loop [...] /* { dg-line line[incr 
> line_count] } */
> /* { dg-message "note: [...]" "" { target *-*-* } 
> line$line_count } */
> /* { dg-message "optimized: [...]" "" { target *-*-* } 
> line$line_count } */
> for (int j = 0; j < nj; ++j)
>   {
> #pragma acc loop [...] /* { dg-line line[incr line_count] 
> } */
> /* { dg-message "missed: [...]" "" { target *-*-* } 
> line$line_count } */
> /* { dg-message "optimized: [...]" "" { target *-*-* } 
> line$line_count } */
> /* { dg-message "note: [...]" "" { target *-*-* } 
> line$line_count } */
> for (int i = 0; i < ni; ++i)
>
>  For each 'dg-line', this increments a counter variable 'line_count'
>  to construct unique 'line$line_count' names for LINENUMVAR:
>  'line1', 'line2',   The preceding 'dg-line' may then be
>  referred to via 'line$line_count'.
>
>
> Grüße
>  Thomas
>
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> Alexander Walter
>
> 0001-Document-that-linenumvar-in-dg-line-may-contain-Tcl-.patch
>
> From 2211acd9a902a5cab874762166dbca116a98bea5 Mon Sep 17 00:00:00 2001
> From: Thomas Schwinge 
> Date: Thu, 29 Oct 2020 07:04:54 +0100
> Subject: [PATCH] Document that 'linenumvar' in 'dg-line' may contain Tcl
>  syntax
>
>   gcc/
>   * doc/sourcebuild.texi (dg-line): Document that 'linenumvar' may
>   contain Tcl syntax.
OK
jeff

Re: [AArch64] Add --with-tune configure flag

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/17/20 2:27 PM, Pop, Sebastian via Gcc-patches wrote:
> Hi,
>
> here is a follow-up patch to add missing Arm64 configure flags as aliases to 
> the existing flags.
>
> gcc/
> * config.gcc: add configure flags --with-{cpu,arch,tune}-{32,64}
> as alias flags for --with-{cpu,arch,tune} on AArch64.
> * doc/install.texi: Document new flags for aarch64.
>
> Tested on aarch64-linux with bootstrap and regression test.
> Ok to commit to trunk and back-port to active branches?
OK
jeff

Re: [AArch64] Add --with-tune configure flag

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/17/20 10:53 AM, Pop, Sebastian via Gcc-patches wrote:
> Hi,
>
> the attached patch fixes a configure error on Arm64 when passing 
> --with-tune=... to configure:
> ```
> This target does not support --with-tune.
> Valid --with options are: abi cpu arch
> ```
> The missing flag sets target tuning to a different value than the generic 
> tuning.
>
> gcc/
> * config.gcc: Add --with-tune to AArch64 configure flags.
>
> Tested on aarch64-linux with bootstrap and regression test.
> Ok to commit to trunk and back-port to active branches?
OK

jeff

Re: RISC-V: Support version controling for ISA standard extensions

2020-11-17 Thread Jim Wilson

On Thu, Nov 12, 2020 at 11:27 PM Kito Cheng  wrote:

> Current GCC implementation is RISC-V ISA 2.2, this patch set implement
> v20190608 and v20191213, and also add option
> -misa-spec=[2.2|20190608|20191213] to change the default ISA spec version.
>
> There is one major incompatible
>
> That option will effect the default version of each sub-extension, for
> example I-extension is 2.0 for 2.2 and 2.1 for v20190608 and v20191213.
>
> We also update the -march parser to fit the latest standard, the canonical
> ordering for multi-letter, drop version support for G extension, and we
> also omitted the version for unrecognized extension.
>
> And we add an special rule for G extension, imafd can't appear again if G
> extension is present, but zicsr and zifencei can.
>
> The default ISA spec will keep on 2.2, and change that in next GCC release.
>

This patch series looks OK to me with minor fixes for the things I pointed
out.

I assume Nelson Chu will look at adding the missing zifencei binutils
support?

Maybe we should add an attribute for the isa-spec?  This is redundant with
the arch spec, but might be easier for some folk to handle.  Just one
version number instead of 6 version numbers.  Though I suppose if we have
both we might have to deal with accidental conflicts between the two.  We
would need a binutils patch first.  This can be fixed later if it makes
sense to do it.

Jim

Re: [r11-5094 Regression] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) on Linux/x86_64

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/17/20 2:13 PM, Jan Hubicka wrote:
> Hi,
> I am testing the following fix.  I manually applied a rejected hunk and
> for some reaosn managed to reverse the conditonal :(
>
> Honza
>
>   * ipa-icf.c (sem_function::hash_stmt): Fix conditional on
>   variably_modified_type_p.
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 27eeda3a319..6ae842766e6 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -1459,10 +1459,10 @@ sem_function::hash_stmt (gimple *stmt, inchash::hash 
> )
>  
>   ao_ref_init (, gimple_op (stmt, i));
>   tree t = ao_ref_alias_ptr_type ();
> - if (variably_modified_type_p (t, NULL_TREE))
> + if (!variably_modified_type_p (t, NULL_TREE))
> memory_access_types.safe_push (t);
>   t = ao_ref_base_alias_ptr_type ();
> - if (variably_modified_type_p (t, NULL_TREE))
> + if (!variably_modified_type_p (t, NULL_TREE))
> memory_access_types.safe_push (t);
> }
> }
>
Given Sunil's list of failures, this could well be the same issue as the
one I just sent you.  OH wait, it was rejected.  Probably the binary
attachments looked like a virus/spam or somesuch.


jeff

Re: [PATCH 3/3] RISC-V: Support version controling for ISA standard extensions

2020-11-17 Thread Jim Wilson

On Thu, Nov 12, 2020 at 11:28 PM Kito Cheng  wrote:

> +#ifndef HAVE_AS_MARCH_ZIFENCE
> +  /* Skip since older binutils don't recognize zifencei,
> + we mad a mistake that is binutils 2.35 support zicsr but not support
> + zifencei.  */
> +  skip_zifencei = true;
> +#endif
>

I'd suggest something like "Skip since older binutils doesn't recognize
zifencei, we made a mistake in that binutils 2.35 supports zicsr but not
zifencei."

> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 172c7ca7c98..9dec5415eab 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -70,13 +70,20 @@ extern const char *riscv_default_mtune (int argc,
> const char **argv);
>  #define TARGET_64BIT   (__riscv_xlen == 64)
>  #endif /* IN_LIBGCC2 */
>
> +#ifdef HAVE_AS_MISA_SPEC
> +#define ASM_MISA_SPEC ""
> +#else
> +#define ASM_MISA_SPEC "%{misa-spec=*}"
> +#endif
>

This is backwards.  We do want to pass -misa-spec to the assembler if it
supports it.  Or maybe you meant ifndef?

Jim

[Patch, Fortran, committed]

2020-11-17 Thread Harald Anlauf

Committed to master as obvious.

Thanks,
Harald


Fortran texi: Fix description of GFC_RTCHECK_* macros.

gcc/fortran/ChangeLog:

* gfortran.texi: Fix description of GFC_RTCHECK_* to match actual
code.


diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 453b30f7c61..3b27217d369 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -3866,8 +3866,8 @@ initialization using @code{_gfortran_set_args}.
 Default: enabled.
 @item @var{option}[6] @tab Enables run-time checking.  Possible values
 are (bitwise or-ed): GFC_RTCHECK_BOUNDS (1), GFC_RTCHECK_ARRAY_TEMPS (2),
-GFC_RTCHECK_RECURSION (4), GFC_RTCHECK_DO (16), GFC_RTCHECK_POINTER (32),
-GFC_RTCHECK_BITS (64).
+GFC_RTCHECK_RECURSION (4), GFC_RTCHECK_DO (8), GFC_RTCHECK_POINTER (16),
+GFC_RTCHECK_MEM (32), GFC_RTCHECK_BITS (64).
 Default: disabled.
 @item @var{option}[7] @tab Unused.
 @item @var{option}[8] @tab Show a warning when invoking @code{STOP} and

extend cache_integer_cst

2020-11-17 Thread Nathan Sidwell


This modules-related patch extends cache_integer_cst.  Currently, when
given a small cst, that cst is added to the type's small and /must
not/ already be there.  Large values are fine if they are already in
the large cache.  This adds a parameter to indicate small duplicates
are ok, and it returns the cached value -- either what was already
tehre, or the newly inserted const.

gcc/
* tree.h (cache_integer_cst): Add defaulted might_duplicate parm.
* tree.c (cache_integer_cst): Return the integer cst, add
might_duplicate parm to permit finding a small duplicate.

applying to trunk

--
Nathan Sidwell
diff --git i/gcc/tree.c w/gcc/tree.c
index 569a9b9317b..004385548c9 100644
--- i/gcc/tree.c
+++ w/gcc/tree.c
@@ -1755,8 +1755,15 @@ wide_int_to_tree (tree type, const poly_wide_int_ref )
   return build_poly_int_cst (type, value);
 }
 
-void
-cache_integer_cst (tree t)
+/* Insert INTEGER_CST T into a cache of integer constants.  And return
+   the cached constant (which may or may not be T).  If MIGHT_DUPLICATE
+   is false, and T falls into the type's 'smaller values' range, there
+   cannot be an existing entry.  Otherwise, if MIGHT_DUPLICATE is true,
+   or the value is large, should an existing entry exist, it is
+   returned (rather than inserting T).  */
+
+tree
+cache_integer_cst (tree t, bool might_duplicate ATTRIBUTE_UNUSED)
 {
   tree type = TREE_TYPE (t);
   int ix = -1;
@@ -1770,7 +1777,7 @@ cache_integer_cst (tree t)
   switch (TREE_CODE (type))
 {
 case NULLPTR_TYPE:
-  gcc_assert (integer_zerop (t));
+  gcc_checking_assert (integer_zerop (t));
   /* Fallthru.  */
 
 case POINTER_TYPE:
@@ -1850,21 +1857,32 @@ cache_integer_cst (tree t)
 	  TYPE_CACHED_VALUES (type) = make_tree_vec (limit);
 	}
 
-  gcc_assert (TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) == NULL_TREE);
-  TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
+  if (tree r = TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix))
+	{
+	  gcc_checking_assert (might_duplicate);
+	  t = r;
+	}
+  else
+	TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
 }
   else
 {
   /* Use the cache of larger shared ints.  */
   tree *slot = int_cst_hash_table->find_slot (t, INSERT);
-  /* If there is already an entry for the number verify it's the
- same.  */
-  if (*slot)
-	gcc_assert (wi::to_wide (tree (*slot)) == wi::to_wide (t));
+  if (tree r = *slot)
+	{
+	  /* If there is already an entry for the number verify it's the
+	 same value.  */
+	  gcc_checking_assert (wi::to_wide (tree (r)) == wi::to_wide (t));
+	  /* And return the cached value.  */
+	  t = r;
+	}
   else
 	/* Otherwise insert this one into the hash table.  */
 	*slot = t;
 }
+
+  return t;
 }
 
 
diff --git i/gcc/tree.h w/gcc/tree.h
index bea3e16c091..20f66a02403 100644
--- i/gcc/tree.h
+++ w/gcc/tree.h
@@ -5124,7 +5124,7 @@ extern const_tree strip_invariant_refs (const_tree);
 extern tree lhd_gcc_personality (void);
 extern void assign_assembler_name_if_needed (tree);
 extern bool warn_deprecated_use (tree, tree);
-extern void cache_integer_cst (tree);
+extern tree cache_integer_cst (tree, bool might_duplicate = false);
 extern const char *combined_fn_name (combined_fn);
 
 /* Compare and hash for any structure which begins with a canonical

Re: [AArch64] Add --with-tune configure flag

2020-11-17 Thread Pop, Sebastian via Gcc-patches

Hi,

here is a follow-up patch to add missing Arm64 configure flags as aliases to 
the existing flags.

gcc/
* config.gcc: add configure flags --with-{cpu,arch,tune}-{32,64}
as alias flags for --with-{cpu,arch,tune} on AArch64.
* doc/install.texi: Document new flags for aarch64.

Tested on aarch64-linux with bootstrap and regression test.
Ok to commit to trunk and back-port to active branches?

Thanks,
Sebastian



0001-AArch64-add-with-cpu-arch-tune-32-64-as-alias-flags-.patch
Description: 0001-AArch64-add-with-cpu-arch-tune-32-64-as-alias-flags-.patch

Re: [PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-17 Thread Jim Wilson

On Thu, Nov 12, 2020 at 11:29 PM Kito Cheng  wrote:

>  - CSR related instructions and fence instructions has to be splitted from
>baseline ISA, zicsr and zifencei are corresponding sub-extension.
>

It is actually only fence.i that is split off.  fence is still part of the
base ISA.  This is why it is called zifencei.

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index 738556539f6..2aaa8e96451 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel
> model)
>  static bool
>  riscv_memmodel_needs_release_fence (enum memmodel model)
>  {
> +  if (!TARGET_ZIFENCEI)
> +return false;
> +
>switch (model)
>  {
>case MEMMODEL_ACQ_REL:
>

This part looks wrong, as riscv_memmodel_needs_release_fence is only used
for fence instructions, not for fence.i.

> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index f15bad3b29e..756b35fb8c0 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1543,19 +1543,20 @@
>  LCT_NORMAL, VOIDmode, operands[0], Pmode,
>  operands[1], Pmode, const0_rtx, Pmode);
>  #else
> -  emit_insn (gen_fence_i ());
> +  if (TARGET_ZIFENCEI)
> +emit_insn (gen_fence_i ());
>  #endif
>DONE;
>  })
>
>  (define_insn "fence"
>[(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
> -  ""
> +  "TARGET_ZIFENCEI"
>"%|fence%-")
>
>  (define_insn "fence_i"
>[(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
> -  ""
> +  "TARGET_ZIFENCEI"
>"fence.i")
>
>  ;;
>

The fence_i and clear_cache patterns are OK.  The fence pattern change is
wrong.

You didn't change sync.md, but it only uses fence, so it needs no change.

Jim

c++: duplicate block-scope extern [PR 97877]

2020-11-17 Thread Nathan Sidwell


We ICED with a duplicated block-scope extern, as duplicate_decls was
dropping the decl_lang_specific of olddecl.  Simplys adding
appropriate retrofitting and copying turned out to be insufficient
because you can get a block-scope using decl also matching the extern.
The latter seems a little suspicious and I have asked CWG for advice.
While there robustified the assert about releasing olddecls'
lang-specific -- if it had one, the new decl better have one.

PR c++/97877
gcc/cp/
* decl.c (duplicate_decls): Deal with duplicated DECL_LOCAL_DECL_P
decls.  Extend decl_lang_specific checking assert.
gcc/testsuite/
* g++.dg/lookup/pr97877.C: New.


pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 89bae06cd6b..d90e9840f40 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -2452,6 +2452,20 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
   if (! DECL_COMDAT (olddecl))
 DECL_COMDAT (newdecl) = 0;
 
+  if (VAR_OR_FUNCTION_DECL_P (newdecl) && DECL_LOCAL_DECL_P (newdecl))
+{
+  if (!DECL_LOCAL_DECL_P (olddecl))
+	/* This can happen if olddecl was brought in from the
+	   enclosing namespace via a using-decl.  The new decl is
+	   then not a block-scope extern at all.  */
+	DECL_LOCAL_DECL_P (newdecl) = false;
+  else
+	{
+	  retrofit_lang_decl (newdecl);
+	  DECL_LOCAL_DECL_ALIAS (newdecl) = DECL_LOCAL_DECL_ALIAS (olddecl);
+	}
+}
+
   new_template_info = NULL_TREE;
   if (DECL_LANG_SPECIFIC (newdecl) && DECL_LANG_SPECIFIC (olddecl))
 {
@@ -2735,8 +2749,9 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
  with that from NEWDECL below.  */
   if (DECL_LANG_SPECIFIC (olddecl))
 {
-  gcc_assert (DECL_LANG_SPECIFIC (olddecl)
-		  != DECL_LANG_SPECIFIC (newdecl));
+  gcc_checking_assert (DECL_LANG_SPECIFIC (newdecl)
+			   && (DECL_LANG_SPECIFIC (olddecl)
+			   != DECL_LANG_SPECIFIC (newdecl)));
   ggc_free (DECL_LANG_SPECIFIC (olddecl));
 }

Re: [r11-5094 Regression] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) on Linux/x86_64

2020-11-17 Thread Jan Hubicka

Hi,
I am testing the following fix.  I manually applied a rejected hunk and
for some reaosn managed to reverse the conditonal :(

Honza

* ipa-icf.c (sem_function::hash_stmt): Fix conditional on
variably_modified_type_p.
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 27eeda3a319..6ae842766e6 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1459,10 +1459,10 @@ sem_function::hash_stmt (gimple *stmt, inchash::hash 
)
 
ao_ref_init (, gimple_op (stmt, i));
tree t = ao_ref_alias_ptr_type ();
-   if (variably_modified_type_p (t, NULL_TREE))
+   if (!variably_modified_type_p (t, NULL_TREE))
  memory_access_types.safe_push (t);
t = ao_ref_base_alias_ptr_type ();
-   if (variably_modified_type_p (t, NULL_TREE))
+   if (!variably_modified_type_p (t, NULL_TREE))
  memory_access_types.safe_push (t);
  }
  }

[PATCH] Remove lambdas from _Rb_tree

2020-11-17 Thread François Dumont via Gcc-patches

This is a change that has been done to _Hashtable and that I forgot to 
propose for _Rb_tree.


The _GLIBCXX_XREF macro can be easily removed of course.

    libstdc++: _Rb_tree code cleanup, remove lambdas.

    Use an additional template parameter on the clone method to 
propagate if the values must be

    copy or move rather than lambdas.

    libstdc++-v3/ChangeLog:

    * include/bits/move.h (_GLIBCXX_XREF): New.
    * include/bits/stl_tree.h: Adapt to use latter.
    (_Rb_tree<>::_S_fwd_value_for): New.
    (_Rb_tree<>::_M_clone_node): Add _Tree template parameter.
    Use _S_fwd_value_for.
    (_Rb_tree<>::_M_cbegin): New.
    (_Rb_tree<>::_M_begin): Use latter.
    (_Rb_tree<>::_M_copy): Add _Tree template parameter.
    (_Rb_tree<>::_M_move_data): Use rvalue reference for 
_Rb_tree parameter.

    (_Rb_tree<>::_M_move_assign): Likewise.

Tested under Linux x86_64.

Ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 5a4dbdc823c..e0d68ca9108 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -158,9 +158,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @} group utilities
 
+#define _GLIBCXX_XREF(_Tp) _Tp&&
 #define _GLIBCXX_MOVE(__val) std::move(__val)
 #define _GLIBCXX_FORWARD(_Tp, __val) std::forward<_Tp>(__val)
 #else
+#define _GLIBCXX_XREF(_Tp) const _Tp&
 #define _GLIBCXX_MOVE(__val) (__val)
 #define _GLIBCXX_FORWARD(_Tp, __val) (__val)
 #endif
diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index ec141ea01c7..128c7e2c892 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -478,11 +478,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 	template
 	  _Link_type
-#if __cplusplus < 201103L
-	  operator()(const _Arg& __arg)
-#else
-	  operator()(_Arg&& __arg)
-#endif
+	  operator()(_GLIBCXX_XREF(_Arg) __arg)
 	  {
 	_Link_type __node = static_cast<_Link_type>(_M_extract());
 	if (__node)
@@ -544,11 +540,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 	template
 	  _Link_type
-#if __cplusplus < 201103L
-	  operator()(const _Arg& __arg) const
-#else
-	  operator()(_Arg&& __arg) const
-#endif
+	  operator()(_GLIBCXX_XREF(_Arg) __arg) const
 	  { return _M_t._M_create_node(_GLIBCXX_FORWARD(_Arg, __arg)); }
 
   private:
@@ -655,11 +647,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	_M_put_node(__p);
   }
 
-  template
+#if __cplusplus >= 201103L
+  template
+	static constexpr
+	typename conditional::value,
+			 const value_type&, value_type&&>::type
+	_S_fwd_value_for(value_type& __val) noexcept
+	{ return std::move(__val); }
+#else
+  template
+	static const value_type&
+	_S_fwd_value_for(value_type& __val)
+	{ return __val; }
+#endif
+
+  template
 	_Link_type
-	_M_clone_node(_Const_Link_type __x, _NodeGen& __node_gen)
+	_M_clone_node(_GLIBCXX_XREF(_Tree),
+		  _Link_type __x, _NodeGen& __node_gen)
 	{
-	  _Link_type __tmp = __node_gen(*__x->_M_valptr());
+	  _Link_type __tmp
+	= __node_gen(_S_fwd_value_for<_Tree>(*__x->_M_valptr()));
 	  __tmp->_M_color = __x->_M_color;
 	  __tmp->_M_left = 0;
 	  __tmp->_M_right = 0;
@@ -748,9 +756,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return this->_M_impl._M_header._M_right; }
 
   _Link_type
-  _M_begin() _GLIBCXX_NOEXCEPT
+  _M_cbegin() const _GLIBCXX_NOEXCEPT
   { return static_cast<_Link_type>(this->_M_impl._M_header._M_parent); }
 
+  _Link_type
+  _M_begin() _GLIBCXX_NOEXCEPT
+  { return _M_cbegin(); }
+
   _Const_Link_type
   _M_begin() const _GLIBCXX_NOEXCEPT
   {
@@ -889,15 +901,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_insert_equal_lower(const value_type& __x);
 #endif
 
-  template
+  template
 	_Link_type
-	_M_copy(_Const_Link_type __x, _Base_ptr __p, _NodeGen&);
+	_M_copy(_GLIBCXX_XREF(_Tree), _Link_type, _Base_ptr, _NodeGen&);
 
-  template
+  template
 	_Link_type
-	_M_copy(const _Rb_tree& __x, _NodeGen& __gen)
+	_M_copy(_GLIBCXX_XREF(_Tree) __x, _NodeGen& __gen)
 	{
-	  _Link_type __root = _M_copy(__x._M_begin(), _M_end(), __gen);
+	  _Link_type __root = _M_copy(_GLIBCXX_FORWARD(_Tree, __x),
+  __x._M_cbegin(), _M_end(), __gen);
 	  _M_leftmost() = _S_minimum(__root);
 	  _M_rightmost() = _S_maximum(__root);
 	  _M_impl._M_node_count = __x._M_impl._M_node_count;
@@ -977,7 +990,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_impl(__x._M_impl._M_key_compare, std::move(__a))
   {
 	if (__x._M_root() != nullptr)
-	  _M_move_data(__x, false_type{});
+	  _M_move_data(std::move(__x), false_type{});
   }
 
 public:
@@ -1426,22 +1439,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 private:
   // Move elements from container with equal allocator.
   void
-  _M_move_data(_Rb_tree& __x, true_type)
+  _M_move_data(_Rb_tree&& __x, true_type)
   {

[r11-5094 Regression] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) on Linux/x86_64

2020-11-17 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

afa6adbd6c83eeef6d75655140f7c0c9a02a479e is the first bad commit
commit afa6adbd6c83eeef6d75655140f7c0c9a02a479e
Author: Jan Hubicka 
Date:   Tue Nov 17 15:41:06 2020 +0100

Improve handling of memory operands in ipa-icf 3/4

caused

FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/pr34457-1.c (internal compiler error)
FAIL: gcc.dg/pr34457-1.c (test for excess errors)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5094/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[PATCH] c++: Fix ICE-on-invalid with -Wvexing-parse [PR97881]

2020-11-17 Thread Marek Polacek via Gcc-patches

This invalid (?) code broke my assumption that if decl_specifiers->type
is null, there must be any type-specifiers.  Turn the assert into an if
to fix this crash.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97881
* parser.c (warn_about_ambiguous_parse): Only assume "int" if we
actually saw any type-specifiers.

gcc/testsuite/ChangeLog:

PR c++/97881
* g++.dg/warn/Wvexing-parse9.C: New test.
---
 gcc/cp/parser.c| 11 +--
 gcc/testsuite/g++.dg/warn/Wvexing-parse9.C |  8 
 2 files changed, 13 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wvexing-parse9.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b7ef259b048..7a6bf4ad2cf 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -20792,13 +20792,12 @@ warn_about_ambiguous_parse (const 
cp_decl_specifier_seq *decl_specifiers,
   if (same_type_p (type, void_type_node))
return;
 }
+  else if (decl_specifiers->any_type_specifiers_p)
+/* Code like long f(); will have null ->type.  If we have any
+   type-specifiers, pretend we've seen int.  */
+type = integer_type_node;
   else
-{
-  /* Code like long f(); will have null ->type.  If we have any
-type-specifiers, pretend we've seen int.  */
-  gcc_checking_assert (decl_specifiers->any_type_specifiers_p);
-  type = integer_type_node;
-}
+return;
 
   auto_diagnostic_group d;
   location_t loc = declarator->u.function.parens_loc;
diff --git a/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C 
b/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C
new file mode 100644
index 000..dc4198d6c5e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C
@@ -0,0 +1,8 @@
+// PR c++/97881
+// { dg-do compile }
+
+void
+cb ()
+{
+  volatile _Atomic (int) a1; // { dg-error "expected initializer" }
+}

base-commit: a5f9c27bfc4417224e332392bb81a2d733b2b5bf
-- 
2.28.0

Re: [PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread Peter Bergner via Gcc-patches

On 11/17/20 11:48 AM, acsaw...@linux.ibm.com wrote:
> -;; The MMA patterns use the multi-register PXImode and POImode partial
> +;; The MMA patterns use the multi-register XOmode and OOmode partial
>  ;; integer modes to implement the target specific __vector_quad and

XOmode and OOmode are not partial integer modes, so change to opaque mode.


> +;; Return 1 if this operand is valid for an MMA disassemble insn.
> +(define_predicate "mma_disassemble_output_operand"
> +  (match_code "reg,subreg,mem")
> +{
> +  if (REG_P (op) && !vsx_register_operand (op, mode))
> +return false;
> +  return true;
> +})

Do we really want to accept subregs here?  If so, why are they not also required
to be vsx_register_operand()?


> -   if ((attr & RS6000_BTC_QUAD) == 0)
> +   if ( !( d->code == MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL
> +   || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL)
> +&& ((attr & RS6000_BTC_QUAD) == 0))

No white space after the '('.



> -  if (icode == CODE_FOR_nothing)
> +  /* This is a disassemble pair/acc function. */
> +  if ( d->code == MMA_BUILTIN_DISASSEMBLE_ACC
> +|| d->code == MMA_BUILTIN_DISASSEMBLE_PAIR)

Ditto.



> +  /* The __vector_pair and __vector_quad modes are multi-register
> + modes, so if have to load or store the registers, we have to be
> + careful to properly swap them if we're in little endian mode

s/so if have to/so if we have to/


> -   /* We are writing an accumulator register, so we have to
> -  prime it after we've written it.  */
> -   emit_insn (gen_mma_xxmtacc (dst, dst));
> +   if ( GET_MODE (src) == XOmode )

White space again.


>/* Move register range backwards, if we might have destructive
>overlap.  */
>int i;
> -  for (i = nregs - 1; i >= 0; i--)
> - emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode,
> -  i * reg_mode_size),
> - simplify_gen_subreg (reg_mode, src, mode,
> -  i * reg_mode_size)));
> +  /* XO/OO are opaque so cannot use subregs. */
> +  if ( mode == OOmode || mode == XOmode )

Ditto.


> +   /* XO/OO are opaque so cannot use subregs. */
> +   if ( mode == OOmode || mode == XOmode )

Ditto.


Peter

[AArch64] Add --with-tune configure flag

2020-11-17 Thread Pop, Sebastian via Gcc-patches

Hi,

the attached patch fixes a configure error on Arm64 when passing 
--with-tune=... to configure:
```
This target does not support --with-tune.
Valid --with options are: abi cpu arch
```
The missing flag sets target tuning to a different value than the generic 
tuning.

gcc/
* config.gcc: Add --with-tune to AArch64 configure flags.

Tested on aarch64-linux with bootstrap and regression test.
Ok to commit to trunk and back-port to active branches?

Thanks,
Sebastian



0001-AArch64-add-with-tune-configure-flag.patch
Description: 0001-AArch64-add-with-tune-configure-flag.patch

Re: [gcc r9-8794] aarch64: Clear canary value after stack_protect_test [PR96191]

2020-11-17 Thread Richard Sandiford via Gcc-patches

Sebastian Pop  writes:
> Hi,
>
> On Fri, Aug 7, 2020 at 6:18 AM Richard Sandiford  wrote:
>>
>> https://gcc.gnu.org/g:5380912a17ea09a8996720fb62b1a70c16c8f9f2
>>
>> commit r9-8794-g5380912a17ea09a8996720fb62b1a70c16c8f9f2
>> Author: Richard Sandiford 
>> Date:   Fri Aug 7 12:17:37 2020 +0100
>
> could you please also apply this change to the gcc-8 branch?

I've now pushed the attached patch to GCC 8.  It's somewhat simpler
than the GCC 9+ version since GCC 8 didn't support the sysreg model.

Tested on aarch64-linux-gnu.

Thanks,
Richard


gcc/
PR target/96191
* config/aarch64/aarch64.md (stack_protect_test_): Set the
CC register directly, instead of a GPR.  Replace the original GPR
destination with an extra scratch register.  Zero out operand 3
after use.
(stack_protect_test): Update accordingly.

gcc/testsuite/
PR target/96191
* gcc.target/aarch64/stack-protector-1.c: New test.
* gcc.target/aarch64/stack-protector-2.c: Likewise.

(cherry picked from commit fe1a26429038d7cd17abc53f96a6f3e2639b605f)
---
 gcc/config/aarch64/aarch64.md | 35 
 .../gcc.target/aarch64/stack-protector-1.c| 89 +++
 .../gcc.target/aarch64/stack-protector-2.c|  6 ++
 3 files changed, 110 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/stack-protector-2.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 9fc555c4006..ea1319c56a4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5995,35 +5995,30 @@
(match_operand 2)]
   ""
 {
-  rtx result;
   machine_mode mode = GET_MODE (operands[0]);
 
-  result = gen_reg_rtx(mode);
-
   emit_insn ((mode == DImode
- ? gen_stack_protect_test_di
- : gen_stack_protect_test_si) (result,
-   operands[0],
-   operands[1]));
-
-  if (mode == DImode)
-emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
-   result, const0_rtx, operands[2]));
-  else
-emit_jump_insn (gen_cbranchsi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
-   result, const0_rtx, operands[2]));
+? gen_stack_protect_test_di
+: gen_stack_protect_test_si) (operands[0], operands[1]));
+
+  rtx cc_reg = gen_rtx_REG (CCmode, CC_REGNUM);
+  emit_jump_insn (gen_condjump (gen_rtx_EQ (VOIDmode, cc_reg, const0_rtx),
+   cc_reg, operands[2]));
   DONE;
 })
 
+;; DO NOT SPLIT THIS PATTERN.  It is important for security reasons that the
+;; canary value does not live beyond the end of this sequence.
 (define_insn "stack_protect_test_"
-  [(set (match_operand:PTR 0 "register_operand" "=r")
-   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")
-(match_operand:PTR 2 "memory_operand" "m")]
-UNSPEC_SP_TEST))
+  [(set (reg:CC CC_REGNUM)
+   (unspec:CC [(match_operand:PTR 0 "memory_operand" "m")
+   (match_operand:PTR 1 "memory_operand" "m")]
+  UNSPEC_SP_TEST))
+   (clobber (match_scratch:PTR 2 "="))
(clobber (match_scratch:PTR 3 "="))]
   ""
-  "ldr\t%3, %1\;ldr\t%0, %2\;eor\t%0, %3, %0"
-  [(set_attr "length" "12")
+  "ldr\t%2, %0\;ldr\t%3, %1\;subs\t%2, %2, %3\;mov\t%3, 0"
+  [(set_attr "length" "16")
(set_attr "type" "multiple")])
 
 ;; Write Floating-point Control Register.
diff --git a/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c 
b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
new file mode 100644
index 000..73e83bc413f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
@@ -0,0 +1,89 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fstack_protector } */
+/* { dg-options "-fstack-protector-all -O2" } */
+
+extern volatile long *stack_chk_guard_ptr;
+
+volatile long *
+get_ptr (void)
+{
+  return stack_chk_guard_ptr;
+}
+
+void __attribute__ ((noipa))
+f (void)
+{
+  volatile int x;
+  x = 1;
+  x += 1;
+}
+
+#define CHECK(REG) "\tcmp\tx0, " #REG "\n\tbeq\t1f\n"
+
+asm (
+"  .pushsection .data\n"
+"  .align  3\n"
+"  .globl  stack_chk_guard_ptr\n"
+"stack_chk_guard_ptr:\n"
+#if __ILP32__
+"  .word   __stack_chk_guard\n"
+#else
+"  .xword  __stack_chk_guard\n"
+#endif
+"  .weak   __stack_chk_guard\n"
+"__stack_chk_guard:\n"
+"  .word   0xdead4321\n"
+"  .word   0xbeef8765\n"
+"  .text\n"
+"  .globl  main\n"
+"  .type   main, %function\n"
+"main:\n"
+"  bl  get_ptr\n"
+"  str x0, [sp, #-16]!\n"
+"  bl  f\n"
+"  str x0, [sp, #8]\n"
+"  ldr x0, [sp]\n"
+#if __ILP32__
+"  ldr w0, [x0]\n"
+#else
+"  ldr x0, [x0]\n"
+#endif
+   CHECK (x1)
+   CHECK (x2)
+   CHECK (x3)
+

global trees

2020-11-17 Thread Nathan Sidwell



This reorders the common and c++ global tree arrays.  It introduces a
module-specific High Water Mark, below which are the immutable slots
initialized at startup and beyond which are the lazily filled slots
(and a few immutables we need to locate by name lookup anyway).

gcc/c-family/
* c-common.h (enum c_tree_index): Reorder to place lazy fields
after newly-added CTI_MODULE_HWM.
gcc/cp/
* cp-tree.h (enum cp_tree_index): Reorder to place lazy fields
after newly-added CPTI_MODULE_HWM.

Pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/c-family/c-common.h w/gcc/c-family/c-common.h
index 3c508979b14..f413e8773f5 100644
--- i/gcc/c-family/c-common.h
+++ w/gcc/c-family/c-common.h
@@ -364,13 +364,17 @@ enum c_tree_index
 
 CTI_DEFAULT_FUNCTION_TYPE,
 
+CTI_NULL,
+
 /* These are not types, but we have to look them up all the time.  */
 CTI_FUNCTION_NAME_DECL,
 CTI_PRETTY_FUNCTION_NAME_DECL,
 CTI_C99_FUNCTION_NAME_DECL,
-CTI_SAVED_FUNCTION_NAME_DECLS,
 
-CTI_NULL,
+CTI_MODULE_HWM,
+/* Below here entities change during compilation.  */
+
+CTI_SAVED_FUNCTION_NAME_DECLS,
 
 CTI_MAX
 };
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 9ae6ff5f7a2..81485de94f9 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -128,12 +128,8 @@ enum cp_tree_index
 CPTI_EXPLICIT_VOID_LIST,
 CPTI_VTBL_TYPE,
 CPTI_VTBL_PTR_TYPE,
-CPTI_STD,
-CPTI_ABI,
 CPTI_GLOBAL,
 CPTI_GLOBAL_TYPE,
-CPTI_CONST_TYPE_INFO_TYPE,
-CPTI_TYPE_INFO_PTR_TYPE,
 CPTI_ABORT_FNDECL,
 CPTI_AGGR_TAG,
 CPTI_CONV_OP_MARKER,
@@ -190,8 +186,28 @@ enum cp_tree_index
 CPTI_NOEXCEPT_FALSE_SPEC,
 CPTI_NOEXCEPT_DEFERRED_SPEC,
 
+CPTI_NULLPTR,
+CPTI_NULLPTR_TYPE,
+
+CPTI_ANY_TARG,
+
+CPTI_MODULE_HWM,
+/* Nodes after here change during compilation, or should not be in
+   the module's global tree table.  */
+
+/* We must find these via the global namespace.  */
+CPTI_STD,
+CPTI_ABI,
+
+/* These are created at init time, but the library/headers provide
+   definitions.  */
+CPTI_ALIGN_TYPE,
+CPTI_CONST_TYPE_INFO_TYPE,
+CPTI_TYPE_INFO_PTR_TYPE,
 CPTI_TERMINATE_FN,
 CPTI_CALL_UNEXPECTED_FN,
+
+/* These are lazily inited.  */
 CPTI_GET_EXCEPTION_PTR_FN,
 CPTI_BEGIN_CATCH_FN,
 CPTI_END_CATCH_FN,
@@ -204,13 +220,6 @@ enum cp_tree_index
 CPTI_DSO_HANDLE,
 CPTI_DCAST,
 
-CPTI_NULLPTR,
-CPTI_NULLPTR_TYPE,
-
-CPTI_ALIGN_TYPE,
-
-CPTI_ANY_TARG,
-
 CPTI_SOURCE_LOCATION_IMPL,
 
 CPTI_FALLBACK_DFLOAT32_TYPE,

[PATCH] c++: Allow template lambdas without lambda-declarator [PR97839]

2020-11-17 Thread Marek Polacek via Gcc-patches

Our implementation of template lambdas incorrectly requires the optional
lambda-declarator.  This was probably required by an early draft of
generic lambdas, but now the production is [expr.prim.lambda.general]:

 lambda-expression:
lambda-introducer lambda-declarator [opt] compound-statement
lambda-introducer < template-parameter-list > requires-clause [opt]
  lambda-declarator [opt] compound-statement

Therefore, we should accept the following test.

Incidentally, I noticed we give a terrible diagnostic when the user uses
'mutable', but forgets to type '()' before it, which sounds like a common
mistake.  So it seems to me we should handle that specifically, rather
than to emit this:

lambda-generic8.C: In lambda function:
lambda-generic8.C:8:18: error: expected '{' before 'mutable'
8 |   [] mutable {}.operator()();
  |  ^~~
lambda-generic8.C: In function 'int main()':
lambda-generic8.C:8:17: error: expected ';' before 'mutable'
8 |   [] mutable {}.operator()();
  | ^~~~
  | ;
lambda-generic8.C:8:28: error: expected primary-expression before '.' token
8 |   [] mutable {}.operator()();
  |^
lambda-generic8.C:8:40: error: expected primary-expression before 'int'
8 |   [] mutable {}.operator()();
  |^~~

Is it okay to fix this in stage3?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97839
* parser.c (cp_parser_lambda_declarator_opt): Don't require ().

gcc/testsuite/ChangeLog:

PR c++/97839
* g++.dg/cpp2a/lambda-generic8.C: New test.
---
 gcc/cp/parser.c  | 14 ++
 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C |  9 +
 2 files changed, 15 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..9f09c778c29 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10604,6 +10604,8 @@ cp_parser_trait_expr (cp_parser* parser, enum rid 
keyword)
 
lambda-expression:
  lambda-introducer lambda-declarator [opt] compound-statement
+ lambda-introducer < template-parameter-list > requires-clause [opt]
+   lambda-declarator [opt] compound-statement
 
Returns a representation of the expression.  */
 
@@ -11061,13 +11063,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
 /* Parse the (optional) middle of a lambda expression.
 
lambda-declarator:
- < template-parameter-list [opt] >
-   requires-clause [opt]
- ( parameter-declaration-clause [opt] )
-   attribute-specifier [opt]
+ ( parameter-declaration-clause )
decl-specifier-seq [opt]
-   exception-specification [opt]
-   lambda-return-type-clause [opt]
+   noexcept-specifier [opt]
+   attribute-specifier-seq [opt]
+   trailing-return-type [opt]
requires-clause [opt]
 
LAMBDA_EXPR is the current representation of the lambda expression.  */
@@ -11217,8 +11217,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
tree lambda_expr)
  trailing-return-type in case of decltype.  */
   pop_bindings_and_leave_scope ();
 }
-  else if (template_param_list != NULL_TREE) // generate diagnostic
-cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN);
 
   /* Create the function call operator.
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
new file mode 100644
index 000..f3c3809b36d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
@@ -0,0 +1,9 @@
+// PR c++/97839
+// { dg-do compile { target c++20 } }
+// Test that a lambda with  doesn't require
+// a lambda-declarator.
+
+int main()
+{
+  []{}.operator()();
+}

base-commit: 8661f4faa875f361cd22a197774c1fa04cd0580b
-- 
2.28.0

[PATCH, rs6000] Re-enable vector pair memcpy/memmove expansion

2020-11-17 Thread acsawdey--- via Gcc-patches

From: Aaron Sawdey 

After the MMA opaque mode patch goes in, we can re-enable
use of vector pair in the inline expansion of memcpy/memmove.

After bootstrap/regtest, OK for trunk?

Thanks,
Aaron

gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable vector pair memcpy/memmove expansion.
---
 gcc/config/rs6000/rs6000.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index bb48ed92aef..53f92970414 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4117,11 +4117,10 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
 {
-  /* When the POImode issues of PR96791 are resolved, then we can
-once again enable use of vector pair for memcpy/memmove on
-P10 if we have TARGET_MMA.  For now we make it disabled by
-default for all targets.  */
-  rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
+  if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX)
+   rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
+  else
+   rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
 }
 
   /* Use long double size to select the appropriate long double.  We use
-- 
2.18.4

Re: [PATCH] add -Wmismatched-new-delete to middle end (PR 90629)

2020-11-17 Thread Martin Sebor via Gcc-patches


On 11/16/20 5:54 PM, Jeff Law wrote:


On 11/3/20 4:56 PM, Martin Sebor via Gcc-patches wrote:

Attached is a simple middle end implementation of detection of
mismatched pairs of calls to C++ new and delete, along with
a substantially enhanced implementation of -Wfree-nonheap-object.
The latter option has been in place since 2011 but detected only
the most trivial bugs.

Unlike the Clang -Wmismatched-new-delete which diagnoses
declarations of "overloaded operator new() and operator delete()
functions that do not have a corresponding free store function
defined within the same scope", this patch detects mismatches
between calls to allocation and deallocation functions, such as
calling free() on the result of new, of delete on the result of
array new.  The functionality provided by Clang can be added on
top of what this feature does and since they are so close I think
it's fine to have both under the same option (a new level could
be introduced to distinguish the two).

The -Wfree-nonheap-object enhancement lets the warning detect all
calls to free, realloc, or C++ delete, with pointers that can be
proven not to point to the first byte of an allocated object.

The patch relies on the well-tested compute_objsize() function
for the determination of pointer provenance and makes use of
the changes in the following patch submitted for review just
yesterday:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557807.html

As usual, I tested on x86_64-linux with Glibc & Binutils/GDB with
no new false positives.

Martin

PS A few words on the implementation choices:

The new code is in builtins.c only because -Wfree-nonheap-object
is there.  I still plan to move all of the invalid access checking
code into its own module or pass at some point but I didn't want
to make this improvement contingent on that restructuring.
Even though it's all in builtins.c, the code is called from calls.c.
This is so that simple mismatches can be diagnosed even when free
isn't handed in builtins.c (i.e., without optimization).
The warning makes no attempt to analyze the CFG or handle
conditional mismatches.  That will have to wait until the code
is moved to a GIMPLE pass.

gcc-90629.diff

PR c++/90629 - Support for -Wmismatched-new-delete

gcc/ChangeLog:

PR c++/90629
* builtins.c (access_ref::access_ref): Initialize new member.
(compute_objsize): Use access_ref::deref.  Handle simple pointer
assignment.
(expand_builtin): Remove handling of the free built-in.
(find_assignment_location): New function.
(gimple_call_alloc_p): Same.
(gimple_call_dealloc_argno): Same.
(gimple_call_dealloc_p): Same.
(matching_alloc_calls_p): Same.
(warn_dealloc_offset): Same.
* builtins.h (struct access_ref): Declare new member.
(maybe_emit_free_warning): Make extern.  Make use of access_ref.
Handle -Wmismatched-new-delete.
* calls.c (initialize_argument_information): Call
maybe_emit_free_warning.
* doc/invoke.texi (-Wfree-nonheap-object): Expand documentation.
(-Wmismatched-new-delete): Document new option.

gcc/c-family/ChangeLog:

PR c++/90629
* c.opt (-Wmismatched-new-delete): New option.

gcc/testsuite/ChangeLog:

PR c++/90629
* g++.dg/warn/delete-array-1.C: Add expected warning.
* g++.old-deja/g++.other/delete2.C: Add expected warning.
* g++.dg/warn/Wfree-nonheap-object-2.C: New test.
* g++.dg/warn/Wfree-nonheap-object.C: New test.
* g++.dg/warn/Wmismatched-new-delete.C: New test.
* gcc.dg/Wfree-nonheap-object-2.c: New test.
* gcc.dg/Wfree-nonheap-object-3.c: New test.
* gcc.dg/Wfree-nonheap-object.c: New test.


So do we need to reconcile with David M's patch that adds the 
"deallocated_by" attribute.  In that thread I raised the question about 
using the same attribute to track both pointers as well as things like 
file descriptors.  It looks like this patch ignores everything except 
builtins/language intrinsics as allocation functions.  ISTM that would 
need to be fixed for David's patch to be useful here, right?


Our emails have crossed mid air.  I replied to your comments on
David's patch here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559309.html
The work I point to there extends this warning to user-defined
functions.

I also gave some preliminary comments on David's initial RFC.
It has some more background:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555667.html

As we discussed privately, this can trigger false positives, 
particularly for unoptimized code  as we've seen in Fedora testing.  But 
I'm not terribly worried about these.


I haven't seen anything failing with a valid error because of this, but 
that's because the packages where it's triggering aren't compiling with 
-Werror.  I did some quick grepping of the logs and I think I found 
instances of each of the new warnings.

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 17, 2020 at 06:23:51PM +0100, Philipp Tomsich wrote:
> The rules for E1 << E2 are:
>   - if E2 is negative => undefined
>   - if E1 is unsigned => E1 x 2^E2, reduced module one more than the
> maximum representable value
>   - if E1 is signed and non-negative => E1 x 2^E2, if E1 x 2^E2 is
> representable; otherwise, undefined

Those are rules about UB -fsanitize=shift-base diagnoses, and that greatly
differs between different languages and versions of those languages, and as
we don't really record what it comes from, for the GIMPLE IL everything is
well defined.
What we were talking about before is written earlier in the
"If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is
undefined."
sentence and is what -fsanitize=shift-exponent diagnoses.  In the GIMPLE IL
such shifts are still UB and in RTL only depending on some target macros
(i.e. undefined for some targets, wrapping with some mask or saturating on
others).

Jakub

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 17, 2020 at 09:14:31AM -0800, Jim Wilson wrote:
> On Tue, Nov 17, 2020 at 8:46 AM Jakub Jelinek via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> 
> > On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
> > > > > In other words, the change to VRP canonicalizes what a lshift_expr
> > with an
> > > > > shift-amount outside of the type width means... it doesn't assume
> > anything
> > > > > about the original language.
> >
> > Well, I said if we want to do it at all, it should be done in VRP, because
> > there is not really a difference between ((int) x) << 32 and ((int) x) << y
> > for y in [32, 137] etc.
> >
> 
> How does this stuff interact with SHIFT_COUNT_TRUNCATED and
> TARGET_SHIFT_TRUNCATION_MASK?  We already provide a mechanism to
> truncate shift counts to fit based on how the hardware handles out-of-range
> shift counts.  Handling out-of-range shift counts differently in VRP would
> confuse things.  Maybe VRP should be using SHIFT_COUNT_TRUCNATED and/or
> TARGET_SHIFT_TRUNCATION_MASK here?  Or maybe we give up on the shift
> truncation macros?

Those are RTL only, aren't they?  So, in GIMPLE we can still (and already do
in various places) assume that out of bounds shifts are UB, and only in RTL
follow the rules of those macros/hooks, so only at the RTL level we can e.g.
optimize x << (y & 31) to x << y if the macros/hooks say the target handles
it that way.

Jakub

[PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread acsawdey--- via Gcc-patches

From: Aaron Sawdey 

This patch changes powerpc MMA builtins to use the new opaque
mode class and use modes OO (32 bytes) and XO (64 bytes)
instead of POI/PXI. Using the opaque modes prevents
optimization from trying to do anything with vector
pair/quad, which was the problem we were seeing with the
partial integer modes.

OK for trunk if bootstrap/regtest passes? 

gcc/
* gcc/config/rs6000/mma.md (unspec):
Add assemble/extract UNSPECs.
(movoi): Change to movoo.
(*movpoi): Change to *movoo.
(movxi): Change to movxo.
(*movpxi): Change to *movxo.
(mma_assemble_pair): Change to OO mode.
(*mma_assemble_pair): New define_insn_and_split.
(mma_disassemble_pair): New define_expand.
(*mma_disassemble_pair): New define_insn_and_split.
(mma_assemble_acc): Change to XO mode.
(*mma_assemble_acc): Change to XO mode.
(mma_disassemble_acc): New define_expand.
(*mma_disassemble_acc): New define_insn_and_split.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
* gcc/config/rs6000/predicates.md (input_operand): Allow opaque.
(mma_disassemble_output_operand): New predicate.
* gcc/config/rs6000/rs6000-builtin.def:
Changes to disassemble builtins.
* gcc/config/rs6000/rs6000-call.c (rs6000_return_in_memory):
Disallow __vector_pair/__vector_quad as return types.
(rs6000_promote_function_mode): Remove function return type
check because we can't test it here any more.
(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
as as function arguments.
(rs6000_gimple_fold_mma_builtin):
Handle mma_disassemble_* builtins.
(rs6000_init_builtins): Create types for XO/OO modes.
* gcc/config/rs6000/rs6000-modes.def: Create XO and OO modes.
* gcc/config/rs6000/rs6000-string.c (expand_block_move):
Update to OO mode.
* gcc/config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
Update for XO/OO modes.
(rs6000_modes_tieable_p): Update for XO/OO modes.
(rs6000_debug_reg_global): Update for XO/OO modes.
(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
(reg_offset_addressing_ok_p): Update for XO/OO modes.
(rs6000_emit_move): Update for XO/OO modes.
(rs6000_preferred_reload_class): Update for XO/OO modes.
(rs6000_split_multireg_move): Update for XO/OO modes.
(rs6000_mangle_type): Update for opaque types.
(rs6000_invalid_conversion): Update for XO/OO modes.
* gcc/config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
Update for XO/OO modes.
* gcc/config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
gcc/testsuite/
* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
* gcc.target/powerpc/pr96506-2.c: New test.
---
 gcc/config/rs6000/mma.md  | 385 ++
 gcc/config/rs6000/predicates.md   |  14 +-
 gcc/config/rs6000/rs6000-builtin.def  |  14 +-
 gcc/config/rs6000/rs6000-call.c   | 144 ---
 gcc/config/rs6000/rs6000-modes.def|  10 +-
 gcc/config/rs6000/rs6000-string.c |   6 +-
 gcc/config/rs6000/rs6000.c| 189 +
 gcc/config/rs6000/rs6000.h|   3 +-
 gcc/config/rs6000/rs6000.md   |   2 +-
 .../gcc.target/powerpc/mma-double-test.c  |   3 +
 .../gcc.target/powerpc/mma-single-test.c  |   3 +
 .../powerpc/{pr96506.c => pr96506-1.c}|  24 --
 gcc/testsuite/gcc.target/powerpc/pr96506-2.c  |  38 ++
 13 files changed, 482 insertions(+), 353 deletions(-)
 rename gcc/testsuite/gcc.target/powerpc/{pr96506.c => pr96506-1.c} (61%)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96506-2.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index a3fd28bdd0a..7d520e19b0d 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -19,24 +19,19 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-;; The MMA patterns use the multi-register PXImode and POImode partial
+;; The MMA patterns use the multi-register XOmode and OOmode partial
 ;; integer modes to

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich

Jeff & Jakub,

I went back to reread the C language standard and it turns out that the
delineation between defined and undefined is not as simple as I thought
that I remembered (see below).

On Tue, 17 Nov 2020 at 17:54, Jeff Law  wrote:

>
>
> On 11/17/20 9:46 AM, Jakub Jelinek wrote:
> > On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
>  In other words, the change to VRP canonicalizes what a lshift_expr
> with an
>  shift-amount outside of the type width means... it doesn't assume
> anything
>  about the original language.
>  Do we assume that a LSHIFT_EXPR has the same semantics as for a
>  C-language shift-left? If so, then pre should not generate the
> LSHIFT_EXPR
>  for _9... or we might even catch this later in path isolation (as
>  undefined
>  behavior, insert a __builtin_trap() and emit a warning)?
> 
>  Note that in his comment to patch 2/2, Jim has noted that user code
> for
>  RISC-V may assume a truncation of the shift-operand...
> >>> What I'd suggest doing would be to leave the invalid shift count in the
> >>> IL in VRP, then extend the erroneous path isolation code to turn an
> >>> invalid shift into a trap (conditionally of course).
> >> You had originally suggested to add this to VRP ...
> >> Given the various comments to this patch, do you still want any of
> >> this in VRP or
> >> would you rather see this only in path isolation?
> > Well, I said if we want to do it at all, it should be done in VRP,
> because
> > there is not really a difference between ((int) x) << 32 and ((int) x)
> << y
> > for y in [32, 137] etc.
> Right.  VRP is the right place to discover, but I'm not sure it's the
> right place to clean up the mess though.
>

The rules for E1 << E2 are:
  - if E2 is negative => undefined
  - if E1 is unsigned => E1 x 2^E2, reduced module one more than the
maximum representable value
  - if E1 is signed and non-negative => E1 x 2^E2, if E1 x 2^E2 is
representable; otherwise, undefined

So the test case is 'undefined' due to E2 being negative.
However, if it was a large positive number, the transform would be valid if
E1 is unsigned.

I would propose the following revision to this patch:
 1. tighten the logic in VRP to handle the case of E1 being unsigned and E2
being positive...
 2. catch the undefined case of E2 being negative in path isolation

Agreed?
Philipp.

Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Bernd Edlinger

On 11/17/20 4:41 PM, Richard Earnshaw (lists) wrote:
> 
> libgcc is *still* the wrong place for this.  It belongs in the system
> library (eg newlib, or glibc, or whatever), which knows about the system
> it's running on.  (Sorry, I should have said this before, but I've
> context-switched this out since it's been a long time since it came up).
> 

No problem.  I just saw it from the other end.

It is odd that this problem does not go away even if gcc is configured
with --disable-threads, which should be the default for arm-none-eabi
anyway.

If we assume a threaded environment then it is still libgcc
which does not define __GTHREADS in libgcc/gthr.h, and libstdc++'s
__cxa_guard_acquire is not making use of functions like __gthread_mutex_lock.
But that appears to be this way by design.

Of course the race is not fixed if you ask newlib to implement just this
__sync_synchronize function.

> This hack will just lead to silent code failure of the worst kind
> (non-reproducable, racy) at runtime.
> 

So in a arm-none-eabi system with armv6 or higher where the intrinsic
__sync_synchronize is not a library call but an instruction
we have exactly this worst kind scenario, already.

It is however possible that the default of -fthreadsafe_statics
is inappropriate for --disable-threads ?

Bernd.

> R.
>

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jim Wilson

On Tue, Nov 17, 2020 at 8:46 AM Jakub Jelinek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
> > > > In other words, the change to VRP canonicalizes what a lshift_expr
> with an
> > > > shift-amount outside of the type width means... it doesn't assume
> anything
> > > > about the original language.
>
> Well, I said if we want to do it at all, it should be done in VRP, because
> there is not really a difference between ((int) x) << 32 and ((int) x) << y
> for y in [32, 137] etc.
>

How does this stuff interact with SHIFT_COUNT_TRUNCATED and
TARGET_SHIFT_TRUNCATION_MASK?  We already provide a mechanism to
truncate shift counts to fit based on how the hardware handles out-of-range
shift counts.  Handling out-of-range shift counts differently in VRP would
confuse things.  Maybe VRP should be using SHIFT_COUNT_TRUCNATED and/or
TARGET_SHIFT_TRUNCATION_MASK here?  Or maybe we give up on the shift
truncation macros?

Jim

Re: [PATCH v1 2/2] RISC-V: Adjust predicates for immediate shift operands

2020-11-17 Thread Jim Wilson

On Mon, Nov 16, 2020 at 2:45 PM Philipp Tomsich 
wrote:

> This is an de-optimization only, if applied without patch 1 from the
> series: the change to VRP ensures that the backend will never see a shift
> wider than the immediate field.
> The problem is that if a negative shift-amount makes it to the backend,
> unindented code may be generated (as a shift-amount, in my reading, should
> always be interpreted as unsigned).
>

I doubt that you are catching every possible case.  What kind of testing
have you done to prove this?  The code you are removing does nothing
harmful, but removing it may de-optimize code.  So it should not be removed
without a very good reason and a lot of testing to make sure there is no
regression.

Shift counts do not have to be unsigned at the assembly language level.
Consider this testcase
int sub (int i, int j) { return i << (32 - j); }
Compiling with rv32 -O2 gives me
li a5,32
sub a5,a5,a1
sll a0,a0,a5
which is 3 instructions.  But since we know that shift counts are
truncated, we should be compiling this as
neg a1,a1
sll a0,a0,a1
which gives the exact same result with 2 instructions.  This is on my todo
list.  This deliberately uses a negative shift count, but this is well
defined in the RISC-V ISA, so this poses no problems.

There are other related things we can do here to improve code generation
for shifts.  We should not be limiting optimization at the assembly level
by what the C standard says.

You also might want to look at SHIFT_COUNT_TRUNCATED and
TARGET_SHIFT_TRUNCATION_MASK.  We already have infrastructure to handle
out-of-range shifts as the hardware dictates.  Your changes conflict with
this.  I think we should define TARGET_SHIFT_TRUNCATION_MASK for RISC-V.
That is another item on my todo list.

Jim

[PATCH] IOR with nonzero, range cannot contain 0.

2020-11-17 Thread Andrew MacLeod via Gcc-patches

PR 83072 mentions that we have lost the ability to recognize that when 
we see

  c |= 1;
c cannot be zero.   We can at least put it back for multi-ranges. Added 
a new testcase to make sure EVRP is tracking it.


bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew

commit a5f9c27bfc4417224e332392bb81a2d733b2b5bf
Author: Andrew MacLeod 
Date:   Tue Nov 17 10:04:38 2020 -0500

IOR with nonzero, range cannot contain 0.

Remove zero from IOR ranges with non-zero masks.

gcc/
PR tree-optimization/83072
* range-op.cc (wi_optimize_and_or): Remove zero from IOR range when
mask is non-zero.
gcc/testsuite/
* gcc.dg/pr83072.c: New.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index b746aadb603..d0adc95527a 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2163,6 +2163,14 @@ wi_optimize_and_or (irange ,
   else
 gcc_unreachable ();
   value_range_with_overflow (r, type, res_lb, res_ub);
+
+  // Furthermore, if the mask is non-zero, an IOR cannot contain zero.
+  if (code == BIT_IOR_EXPR && wi::ne_p (mask, 0))
+{
+  int_range<2> tmp;
+  tmp.set_nonzero (type);
+  r.intersect (tmp);
+}
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr83072.c b/gcc/testsuite/gcc.dg/pr83072.c
new file mode 100644
index 000..3bed8d89013
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83072.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre" } */
+
+void kill (void);
+
+int f(int c){
+  c |= 1;
+  if (c == 0)
+kill ();
+
+  return c;
+}
+
+/* { dg-final { scan-tree-dump-not "kill" "evrp" } }  */

Re: [PATCH] Practical Improvement to libgcc Complex Divide

2020-11-17 Thread Patrick McGehearty via Gcc-patches


Joseph, thank you for your detailed review and comments.

I will get to work on the necessary revisions as well
as find for a suitable place for sharing my random number
generating tests.

- patrick

On 11/16/2020 8:34 PM, Joseph Myers wrote:

On Tue, 8 Sep 2020, Patrick McGehearty via Gcc-patches wrote:


This project started with an investigation related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714.  Study of Beebe[1]
provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.

Thanks, I've now read Baudin and Smith so can review the patch properly.
I'm fine with the overall algorithm, so my comments generally relate to
how the code should best be integrated into libgcc while keeping it
properly machine-mode-generic as far as possible.


I developed two sets of test set by randomly distributing values over
a restricted range and the full range of input values. The current

Are these tests available somewhere?


Support for half, float, double, extended, and long double precision
is included as all are handled with suitable preprocessor symbols in a
single source routine. Since half precision is computed with float
precision as per current libgcc practice, the enhanced algorithm
provides no benefit for half precision and would cost performance.
Therefore half precision is left unchanged.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.

In general, libgcc code works with modes, not types; hardcoding references
to a particular mapping between modes and types is problematic.  Rather,
the existing code in c-cppbuiltin.c that has a loop over modes should be
extended to provide whatever information is needed, as macros defined for
each machine mode.

   /* For libgcc-internal use only.  */
   if (flag_building_libgcc)
 {
   /* Properties of floating-point modes for libgcc2.c.  */
   opt_scalar_float_mode mode_iter;
   FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_FLOAT)
 {
[...]

For example, that defines macros such as __LIBGCC_DF_FUNC_EXT__ and
__LIBGCC_DF_MANT_DIG__.  The _FUNC_EXT__ definition involves that code
computing a mapping to types.

I'd suggest defining additional macros such as __LIBGCC_DF_MAX__ in the
same code - for each supported floating-point mode.  They can be defined
to __FLT_MAX__ etc. (the predefined macros rather than the ones in
float.h) - the existing code that computes a suffix for functions can be
adjusted so it also computes the string such as "FLT", "DBL", "LDBL",
"FLT128" etc.

(I suggest defining to __FLT_MAX__ rather than to the expansion of
__FLT_MAX__ because that avoids any tricky interactions with the logic to
compute such expansions lazily.  I suggest __FLT_MAX__ rather than the
FLT_MAX name from float.h because that way you avoid any need to define
feature test macros to access names such as FLT128_MAX.)


diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 74ecca8..02c06d8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1343,6 +1343,11 @@ c_cpp_builtins (cpp_reader *pfile)
builtin_define_with_value ("__LIBGCC_INIT_SECTION_ASM_OP__",
 INIT_SECTION_ASM_OP, 1);
  #endif
+  /* For libgcc float/double optimization */
+#ifdef HAVE_adddf3
+  builtin_define_with_int_value ("__LIBGCC_HAVE_HWDBL__",
+HAVE_adddf3);
+#endif

This is another thing to handle more generically - possibly with something
like the mode_has_fma function, and defining a macro for each mode, named
after the mode, rather than only for DFmode.  For an alternative, see the
discussion below.


diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index ccfd6f6..8bd66c5 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,10 @@
+2020-08-27  Patrick McGehearty 
+
+   * libgcc2.c (__divsc3, __divdc3, __divxc3, __divtc3): Enhance
+   accuracy of complex divide by avoiding underflow/overflow when
+   ratio underflows or when arguments have very large or very
+   small exponents.

Note that diffs to ChangeLog files should not now be included in patches;
the ChangeLog content needs to be included in the proposed commit message
instead for automatic ChangeLog generation.


+#if defined(L_divsc3)
+#define RBIG   ((FLT_MAX)/2.0)
+#define RMIN   (FLT_MIN)
+#define RMIN2  (0x1.0p-21)
+#define RMINSCAL (0x1.0p+19)
+#define RMAX2  ((RBIG)*(RMIN2))
+#endif

I'd expect all of these to use generic macros based on the mode.  For the
division by 2.0, probably also divide by integer 2 not 2.0 to avoid
unwanted conversions to/from double.

[pushed] C++ : Remove an overzealous checking assert [PR97871]

2020-11-17 Thread Iain Sandoe


Hi,

This amends my commit from r11-4927 to remove an assert.

tested on x86_64-darwin,
pushed to master
Iain

---

It seems we accept __attribute__(()) without any diagnostic at present,
so my added checking assert fires for something like:

__attribute__ (()) int a;

Fixed by removing the assert; in the case that the user enters something
like:

__attribute__ (()) extern "C" int foo;

The diagnostic about attributes before linkage specs will fire and show
the empty attributes.

gcc/cp/ChangeLog:

PR c++/97871
* parser.c (cp_parser_declaration): Remove checking assert.
---
 gcc/cp/parser.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..b7ef259b048 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -13536,7 +13536,6 @@ cp_parser_declaration (cp_parser* parser, tree  
prefix_attrs)

 {
   cp_lexer_save_tokens (parser->lexer);
   attributes = cp_parser_attributes_opt (parser);
-  gcc_checking_assert (attributes);
   cp_token *t1 = cp_lexer_peek_token (parser->lexer);
   cp_token *t2 = (t1->type == CPP_EOF
  ? t1 : cp_lexer_peek_nth_token (parser->lexer, 2));
--
2.24.1

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 17, 2020 at 09:54:46AM -0700, Jeff Law wrote:
> > So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
> > we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
> > other possibilities), or e.g. during VRP just canonicalize proven always
> > out of bound shifts to shifts by an out of bound constant and let some later
> > pass warn and/or add __builtin_warning.
> So the idea is to start funneling this through the path isolation code
> and handle the various strategies there.

If the path isolation code would use the ranger for this, it wouldn't need
to be in VRP but could be anywhere, sure.

> __builtin_warning is on hold pending a rework to make it act more like a
> debug statement than a builtin function call.Â  The latter impacts
> various heuristics, which would mean that it could impact codegen which
> would highly undesirable.

Agreed on that.

Jakub

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/17/20 9:46 AM, Jakub Jelinek wrote:
> On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
 In other words, the change to VRP canonicalizes what a lshift_expr with an
 shift-amount outside of the type width means... it doesn't assume anything
 about the original language.
 Do we assume that a LSHIFT_EXPR has the same semantics as for a
 C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
 for _9... or we might even catch this later in path isolation (as
 undefined
 behavior, insert a __builtin_trap() and emit a warning)?

 Note that in his comment to patch 2/2, Jim has noted that user code for
 RISC-V may assume a truncation of the shift-operand...
>>> What I'd suggest doing would be to leave the invalid shift count in the
>>> IL in VRP, then extend the erroneous path isolation code to turn an
>>> invalid shift into a trap (conditionally of course).
>> You had originally suggested to add this to VRP ...
>> Given the various comments to this patch, do you still want any of
>> this in VRP or
>> would you rather see this only in path isolation?
> Well, I said if we want to do it at all, it should be done in VRP, because
> there is not really a difference between ((int) x) << 32 and ((int) x) << y
> for y in [32, 137] etc.
Right.  VRP is the right place to discover, but I'm not sure it's the
right place to clean up the mess though.

> Otherwise it is the general question of what to do upon proven UB, and that
> is a topic discussed several years during Cauldron that it would be nice to
> have switches where users can choose what to do in that case,
> __builtin_unreachable (), __builtin_trap (), ... and another thing is where
> we should warn about it (tight e.g. to the __builtin_warning thing, because
> we don't want these warnings for dead code).
>
> So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
> we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
> other possibilities), or e.g. during VRP just canonicalize proven always
> out of bound shifts to shifts by an out of bound constant and let some later
> pass warn and/or add __builtin_warning.
So the idea is to start funneling this through the path isolation code
and handle the various strategies there.

__builtin_warning is on hold pending a rework to make it act more like a
debug statement than a builtin function call.  The latter impacts
various heuristics, which would mean that it could impact codegen which
would highly undesirable.

jeff
>
>   Jakub

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
> > > In other words, the change to VRP canonicalizes what a lshift_expr with an
> > > shift-amount outside of the type width means... it doesn't assume anything
> > > about the original language.
> > > Do we assume that a LSHIFT_EXPR has the same semantics as for a
> > > C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
> > > for _9... or we might even catch this later in path isolation (as
> > > undefined
> > > behavior, insert a __builtin_trap() and emit a warning)?
> > >
> > > Note that in his comment to patch 2/2, Jim has noted that user code for
> > > RISC-V may assume a truncation of the shift-operand...
> > What I'd suggest doing would be to leave the invalid shift count in the
> > IL in VRP, then extend the erroneous path isolation code to turn an
> > invalid shift into a trap (conditionally of course).
> 
> You had originally suggested to add this to VRP ...
> Given the various comments to this patch, do you still want any of
> this in VRP or
> would you rather see this only in path isolation?

Well, I said if we want to do it at all, it should be done in VRP, because
there is not really a difference between ((int) x) << 32 and ((int) x) << y
for y in [32, 137] etc.
Otherwise it is the general question of what to do upon proven UB, and that
is a topic discussed several years during Cauldron that it would be nice to
have switches where users can choose what to do in that case,
__builtin_unreachable (), __builtin_trap (), ... and another thing is where
we should warn about it (tight e.g. to the __builtin_warning thing, because
we don't want these warnings for dead code).

So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
other possibilities), or e.g. during VRP just canonicalize proven always
out of bound shifts to shifts by an out of bound constant and let some later
pass warn and/or add __builtin_warning.

Jakub

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich via Gcc-patches

Jeff,

On Tue, 17 Nov 2020 at 16:56, Jeff Law  wrote:
> > Note that in his comment to patch 2/2, Jim has noted that user code for
> > RISC-V may assume a truncation of the shift-operand...
> What I'd suggest doing would be to leave the invalid shift count in the
> IL in VRP, then extend the erroneous path isolation code to turn an
> invalid shift into a trap (conditionally of course).

As I remember, FORTRAN allows both LSHIFT(i, shift) or SHIFTL(i, shift) with
'shift' less than or equal to BITSIZE(i) ... this leaves i <<
BITSIZE(i) defined for
FORTRAN and undefined for C.

This seems to indicate that an LSHIFT_EXPR is intentionally not constrained
either to C language (or any other) semantics at this time.  To handle this in
path isolation, should we have different tree expressions for a
left-shift with C
semantics and one with FORTRAN semantics?

Philipp.

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich via Gcc-patches

Jakub,

On Tue, 17 Nov 2020 at 16:56, Jeff Law  wrote:
>
>
>
> On 11/17/20 4:53 AM, Philipp Tomsich wrote:
> > Jeff,
> >
> > On Tue, 17 Nov 2020 at 00:38, Jeff Law  > > wrote:
> >
> >
> > On 11/16/20 11:57 AM, Philipp Tomsich wrote:
> > > From: Philipp Tomsich mailto:p...@gnu.org>>
> > >
> > > While most shifts wider than the bitwidth of a type will be
> > caught by
> > > other passes, it is possible that these show up for VRP.
> > > Consider the following example:
> > >   int func (int a, int b, int c)
> > >   {
> > > return (a << ((b && c) - 1));
> > >   }
> > >
> > > This adds simplify_using_ranges::simplify_lshift_using_ranges to
> > > detect and rewrite such cases.  If the intersection of meaningful
> > > shift amounts for the underlying type and the value-range computed
> > > for the shift-amount (whether an integer constant or a variable) is
> > > empty, the statement is replaced with the zero-constant of the same
> > > precision as the result.
> > >
> > > gcc/ChangeLog:
> > >
> > >* vr-values.h (simplify_using_ranges): Declare.
> > >* vr-values.c (simplify_lshift_using_ranges): New function.
> > >(simplify): Use simplify_lshift_using_ranges for LSHIFT_EXPR.
> >
> > Umm, isn't this a shift wider than the bitwidth undefined
> > behavior?  We
> > should be generating warnings for that, not trying to further optimize
> > it :-)
> >
> >
> > The shift is undefined behavior on the language level (for C) and a
> > warning
> > will be generated, if such a shift is encountered; additionally, the
> > shift will be
> > replaced with the value 0.
> >
> > However, in the above case, the shift is generated only in the middle end:
> > At 136t.walloca, I still have:
> >
> >   # RANGE [-1, 0]
> >   _1 = iftmp.1_2 + -1;
> >   _6 = a_5(D) << _1;
> >
> > Whereas at 137t.pre, this is changed into:
> >
> > Found partial redundancy for expression {lshift_expr,a_5(D),_1} (0006)
> > Inserted _9 = a_5(D) << -1;
> >
> >
> > In other words, the change to VRP canonicalizes what a lshift_expr with an
> > shift-amount outside of the type width means... it doesn't assume anything
> > about the original language.
> > Do we assume that a LSHIFT_EXPR has the same semantics as for a
> > C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
> > for _9... or we might even catch this later in path isolation (as
> > undefined
> > behavior, insert a __builtin_trap() and emit a warning)?
> >
> > Note that in his comment to patch 2/2, Jim has noted that user code for
> > RISC-V may assume a truncation of the shift-operand...
> What I'd suggest doing would be to leave the invalid shift count in the
> IL in VRP, then extend the erroneous path isolation code to turn an
> invalid shift into a trap (conditionally of course).

You had originally suggested to add this to VRP ...
Given the various comments to this patch, do you still want any of
this in VRP or
would you rather see this only in path isolation?

Philipp.

Re: [patch] Fix build when source directory includes @ character

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/17/20 1:22 AM, FX wrote:
>> OK.  You have commit privs, right?
> Yes, and I did commit after Richard’s OK: 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=034db20e2ea8301b5dc251bf10a97ce1cf90655f
>
> … but I forgot to send an email saying I had, sorry.
No worries.  Thanks.
jeff

preprocessor: new callbacks

2020-11-17 Thread Nathan Sidwell



These two callbacks are needed for C++ modules.  The first is for
handling macros from header-units.  These are resolved lazily.  The
second is for include-translation -- whether a #include gets turned
into a header-unit import.

libcpp/
* include/cpplib.h (struct cpp_callbacks): Add
user_deferred_macro & translate_include.

pushing to trunk

--
Nathan Sidwell
diff --git c/libcpp/include/cpplib.h w/libcpp/include/cpplib.h
index 8e398863cf6..81be6457951 100644
--- c/libcpp/include/cpplib.h
+++ w/libcpp/include/cpplib.h
@@ -680,6 +695,9 @@ struct cpp_callbacks
   /* Callback that can change a user lazy into normal macro.  */
   void (*user_lazy_macro) (cpp_reader *, cpp_macro *, unsigned);
 
+  /* Callback to handle deferred cpp_macros.  */
+  cpp_macro *(*user_deferred_macro) (cpp_reader *, location_t, cpp_hashnode *);
+
   /* Callback to parse SOURCE_DATE_EPOCH from environment.  */
   time_t (*get_source_date_epoch) (cpp_reader *);
 
@@ -698,6 +716,11 @@ struct cpp_callbacks
   /* Callback for filename remapping in __FILE__ and __BASE_FILE__ macro
  expansions.  */
   const char *(*remap_filename) (const char*);
+
+  /* Maybe translate a #include into something else.  Return a
+ cpp_buffer containing the translation if translating.  */
+  char *(*translate_include) (cpp_reader *, line_maps *, location_t,
+			  const char *path);
 };
 
 #ifdef VMS

preprocessor: module line maps

2020-11-17 Thread Nathan Sidwell



This patch adds LC_MODULE as a map kind, used to indicate a c++
module.  Unlike a regular source file, it only contains a single
location, and the source locations in that module are represented by
ordinary locations whose 'included_from' location is the module.

It also exposes some entry points that modules will use to create
blocks of line maps.

In the original posting, I'd missed the deletion of the
linemap_enter_macro from internal.h.  That's included here.

libcpp/
* include/line-map.h (enum lc_reason): Add LC_MODULE.
(MAP_MODULE_P): New.
(line_map_new_raw): Declare.
(linemap_enter_macro): Move declaration from internal.h
(linemap_module_loc, linemap_module_reparent)
(linemap_module_restore): Declare.
(linemap_lookup_macro_indec): Declare.
* internal.h (linemap_enter_macro): Moved to line-map.h.
* linemap.c (linemap_new_raw): New, broken out of ...
(new_linemap): ... here.  Call it.
(LAST_SOURCE_LINE_LOCATION): New.
(liemap_module_loc, linemap_module_reparent)
(linemap_module_restore): New.
(linemap_lookup_macro_index): New, broken out of ...
(linemap_macro_map_lookup): ... here.  Call it.
(linemap_dump): Add module dump.

pushing to trunk
--
Nathan Sidwell
diff --git i/libcpp/include/line-map.h w/libcpp/include/line-map.h
index 44008be5c08..50b2e4ff91a 100644
--- i/libcpp/include/line-map.h
+++ w/libcpp/include/line-map.h
@@ -72,6 +72,7 @@ enum lc_reason
   LC_RENAME,		/* Other reason for name change.  */
   LC_RENAME_VERBATIM,	/* Likewise, but "" != stdin.  */
   LC_ENTER_MACRO,	/* Begin macro expansion.  */
+  LC_MODULE,		/* A (C++) Module.  */
   /* FIXME: add support for stringize and paste.  */
   LC_HWM /* High Water Mark.  */
 };
@@ -439,7 +440,8 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
 
   /* Location from whence this line map was included.  For regular
  #includes, this location will be the last location of a map.  For
- outermost file, this is 0.  */
+ outermost file, this is 0.  For modules it could be anywhere
+ within a map.  */
   location_t included_from;
 
   /* Size is 20 or 24 bytes, no padding  */
@@ -662,6 +664,15 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
+/* TRUE if this line map is for a module (not a source file).  */
+
+inline bool
+MAP_MODULE_P (const line_map *map)
+{
+  return (MAP_ORDINARY_P (map)
+	  && linemap_check_ordinary (map)->reason == LC_MODULE);
+}
+
 /* Get the filename of ordinary map MAP.  */
 
 inline const char *
@@ -1076,6 +1087,9 @@ extern void linemap_check_files_exited (class line_maps *);
 extern location_t linemap_line_start
 (class line_maps *set, linenum_type to_line,  unsigned int max_column_hint);
 
+/* Allocate a raw block of line maps, zero initialized.  */
+extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
+
 /* Add a mapping of logical source line to physical source file and
line number. This function creates an "ordinary map", which is a
map that records locations of tokens that are not part of macro
@@ -1093,6 +1107,39 @@ extern const line_map *linemap_add
   (class line_maps *, enum lc_reason, unsigned int sysp,
const char *to_file, linenum_type to_line);
 
+/* Create a macro map.  A macro map encodes source locations of tokens
+   that are part of a macro replacement-list, at a macro expansion
+   point. See the extensive comments of struct line_map and struct
+   line_map_macro, in line-map.h.
+
+   This map shall be created when the macro is expanded. The map
+   encodes the source location of the expansion point of the macro as
+   well as the "original" source location of each token that is part
+   of the macro replacement-list. If a macro is defined but never
+   expanded, it has no macro map.  SET is the set of maps the macro
+   map should be part of.  MACRO_NODE is the macro which the new macro
+   map should encode source locations for.  EXPANSION is the location
+   of the expansion point of MACRO. For function-like macros
+   invocations, it's best to make it point to the closing parenthesis
+   of the macro, rather than the the location of the first character
+   of the macro.  NUM_TOKENS is the number of tokens that are part of
+   the replacement-list of MACRO.  */
+const line_map_macro *linemap_enter_macro (line_maps *, cpp_hashnode *,
+	   location_t, unsigned int);
+
+/* Create a source location for a module.  The creator must either do
+   this after the TU is tokenized, or deal with saving and restoring
+   map state.  */
+
+extern location_t linemap_module_loc
+  (line_maps *, location_t from, const char *name);
+extern void linemap_module_reparent
+  (line_maps *, location_t loc, location_t new_parent);
+
+/* Restore the linemap state such that the map at LWM-1 continues.  */
+extern void linemap_module_restore
+  (line_maps *, unsigned lwm);
+
 /*

[committed] libstdc++: Fix unconditional definition of __cpp_lib_span in [PR 97869}

2020-11-17 Thread Jonathan Wakely via Gcc-patches

The  header is empty unless Concepts are supported, but 
defines the __cpp_lib_span feature test macro unconditionally. It should
be guarded by the same conditions as in .

libstdc++-v3/ChangeLog:

PR libstdc++/97869
* include/precompiled/stdc++.h: Include .
* include/std/version (__cpp_lib_span): Check __cpp_lib_concepts
before defining.

Tested powerpc64le-linux. Committed to trunk.

This also needs to be backported to gcc-10.

commit ecf65330c11544ebf35e198087b4a42be089c620
Author: Jonathan Wakely 
Date:   Tue Nov 17 15:26:29 2020

libstdc++: Fix unconditional definition of __cpp_lib_span in  [PR 
97869}

The  header is empty unless Concepts are supported, but 
defines the __cpp_lib_span feature test macro unconditionally. It should
be guarded by the same conditions as in .

libstdc++-v3/ChangeLog:

PR libstdc++/97869
* include/precompiled/stdc++.h: Include .
* include/std/version (__cpp_lib_span): Check __cpp_lib_concepts
before defining.

diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
b/libstdc++-v3/include/precompiled/stdc++.h
index 8899c323a281..a418c46288de 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -137,6 +137,9 @@
 #include 
 #include 
 #include 
+#if __cpp_impl_coroutine
+# include 
+#endif
 #include 
 #include 
 #include 
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 7f51ef3a6c4f..12455ad93146 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -226,7 +226,9 @@
 # define __cpp_lib_ranges 201911L
 #endif
 #define __cpp_lib_shift 201806L
-#define __cpp_lib_span 202002L
+#if __cpp_lib_concepts
+# define __cpp_lib_span 202002L
+#endif
 #define __cpp_lib_ssize 201902L
 #define __cpp_lib_starts_ends_with 201711L
 # if _GLIBCXX_USE_CXX11_ABI

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jeff Law via Gcc-patches




On 11/17/20 4:53 AM, Philipp Tomsich wrote:
> Jeff,
>
> On Tue, 17 Nov 2020 at 00:38, Jeff Law  > wrote:
>
>
> On 11/16/20 11:57 AM, Philipp Tomsich wrote:
> > From: Philipp Tomsich mailto:p...@gnu.org>>
> >
> > While most shifts wider than the bitwidth of a type will be
> caught by
> > other passes, it is possible that these show up for VRP.
> > Consider the following example:
> >   int func (int a, int b, int c)
> >   {
> >     return (a << ((b && c) - 1));
> >   }
> >
> > This adds simplify_using_ranges::simplify_lshift_using_ranges to
> > detect and rewrite such cases.  If the intersection of meaningful
> > shift amounts for the underlying type and the value-range computed
> > for the shift-amount (whether an integer constant or a variable) is
> > empty, the statement is replaced with the zero-constant of the same
> > precision as the result.
> >
> > gcc/ChangeLog:
> >
> >        * vr-values.h (simplify_using_ranges): Declare.
> >        * vr-values.c (simplify_lshift_using_ranges): New function.
> >        (simplify): Use simplify_lshift_using_ranges for LSHIFT_EXPR.
>
> Umm, isn't this a shift wider than the bitwidth undefined
> behavior?  We
> should be generating warnings for that, not trying to further optimize
> it :-)
>
>
> The shift is undefined behavior on the language level (for C) and a
> warning
> will be generated, if such a shift is encountered; additionally, the
> shift will be
> replaced with the value 0.
>
> However, in the above case, the shift is generated only in the middle end:
> At 136t.walloca, I still have:
>
>   # RANGE [-1, 0]
>   _1 = iftmp.1_2 + -1;
>   _6 = a_5(D) << _1;
>
> Whereas at 137t.pre, this is changed into:
>
> Found partial redundancy for expression {lshift_expr,a_5(D),_1} (0006)
> Inserted _9 = a_5(D) << -1;
>
>
> In other words, the change to VRP canonicalizes what a lshift_expr with an
> shift-amount outside of the type width means... it doesn't assume anything
> about the original language.
> Do we assume that a LSHIFT_EXPR has the same semantics as for a
> C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
> for _9... or we might even catch this later in path isolation (as
> undefined
> behavior, insert a __builtin_trap() and emit a warning)?
>
> Note that in his comment to patch 2/2, Jim has noted that user code for
> RISC-V may assume a truncation of the shift-operand...
What I'd suggest doing would be to leave the invalid shift count in the
IL in VRP, then extend the erroneous path isolation code to turn an
invalid shift into a trap (conditionally of course).

jeff

Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Christophe Lyon via Gcc-patches

On Tue, 17 Nov 2020 at 16:41, Richard Earnshaw (lists) via Gcc-patches
 wrote:
>
> On 17/11/2020 15:18, Bernd Edlinger wrote:
> > On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
> >> On 03/11/2020 15:08, Bernd Edlinger wrote:
> >>> Hi,
> >>>
> >>> this fixes a problem with a missing symbol __sync_synchronize
> >>> which happens when newlib is used together with libstdc++ for
> >>> the non-threaded simulator target arm-none-eabi.
> >>>
> >>> There are several questions on stackoverflow about this issue.
> >>>
> >>> I would like to add a weak symbol for this target, since this
> >>> is only a default implementation and not meant to override a
> >>> possibly more sophisticated synchronization function from the
> >>> c-runtime.
> >>>
> >>>
> >>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
> >>>
> >>> Is it OK for trunk?
> >>>
> >>>
> >>> Thanks
> >>> Bernd.
> >>>
> >>
> >> I seem to recall that this was a deliberate decision - you can't guess
> >> this correctly, at least when trying to build portable code - you just
> >> have to know which runtime you will be using.
> >>
> >
> > Therefore I suggest to use the weak attribute.  It is on purpose not
> > implementing all of the atomics.
> >
> > The use case, is a C++ program which initializes a local static variable.
> >
> > $ cat test.cc
> > #include 
> > main(int argc, char **argv)
> > {
> >   static std::string x = "test";
> >   return 0;
> > }
> >
> > compiles to this:
> > sub sp, sp, #20
> > str r0, [fp, #-24]
> > str r1, [fp, #-28]
> > ldr r3, .L14
> > ldrbr4, [r3]
> > bl  __sync_synchronize
> > and r3, r4, #255
> > and r3, r3, #1
> > cmp r3, #0
> > moveq   r3, #1
> > movne   r3, #0
> > and r3, r3, #255
> > cmp r3, #0
> > beq .L8
> > ldr r0, .L14
> > bl  __cxa_guard_acquire
> > mov r3, r0
> >
> > so __sync_synchronize is not defined in newlib since the target (arm-sim)
> > is known to be not multi-threaded,
> > but __cxa_guard_acquire is also not a thread safe function,
> > because __GTHREADS is not defined by libgcc, since it is known
> > at configure time, that the target does not support threads.
> > So libstdc++ does not try to use a mutex or any atomics either,
> > because it is not compiled with __GTHREADS.
> >
> > I can further narrow down the patch by only defining this function when
> > __GTHREADS is not defined, to make it more clear.
> >
> >
> >> I think Ramana had some changes in the works at one point to address
> >> (some) of this, but I'm not sure what happened to them.  Ramana?
> >>
> >>
> >> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
> >> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
> >> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
> >> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> >> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> >>
> >> Ug, no!  Use the ACLE macros to avoid this sort of mess.
> >>
> >
> > Ah, thanks, copy-paste from freebsd-atomic.c :)
> >
> >
> > I've attached the updated patch.
> > Is it OK?
> >
> >
> > Thanks
> > Bernd.
> >
>
> libgcc is *still* the wrong place for this.  It belongs in the system
> library (eg newlib, or glibc, or whatever), which knows about the system
> it's running on.  (Sorry, I should have said this before, but I've
> context-switched this out since it's been a long time since it came up).
>
> This hack will just lead to silent code failure of the worst kind
> (non-reproducable, racy) at runtime.
>

I haven't fully re-read the thread, but I think this is related to an
old discussion,
not very well archived in:
https://gcc.gnu.org/pipermail/gcc-patches/2016-November/462299.html

There's a pointer to a newlib patch from Ramana.

> R.

Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Richard Earnshaw (lists) via Gcc-patches

On 17/11/2020 15:18, Bernd Edlinger wrote:
> On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
>> On 03/11/2020 15:08, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this fixes a problem with a missing symbol __sync_synchronize
>>> which happens when newlib is used together with libstdc++ for
>>> the non-threaded simulator target arm-none-eabi.
>>>
>>> There are several questions on stackoverflow about this issue.
>>>
>>> I would like to add a weak symbol for this target, since this
>>> is only a default implementation and not meant to override a
>>> possibly more sophisticated synchronization function from the
>>> c-runtime.
>>>
>>>
>>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>>
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>
>> I seem to recall that this was a deliberate decision - you can't guess
>> this correctly, at least when trying to build portable code - you just
>> have to know which runtime you will be using.
>>
> 
> Therefore I suggest to use the weak attribute.  It is on purpose not
> implementing all of the atomics.
> 
> The use case, is a C++ program which initializes a local static variable.
> 
> $ cat test.cc
> #include 
> main(int argc, char **argv)
> {
>   static std::string x = "test";
>   return 0;
> }
> 
> compiles to this:
> sub sp, sp, #20
> str r0, [fp, #-24]
> str r1, [fp, #-28]
> ldr r3, .L14
> ldrbr4, [r3]
> bl  __sync_synchronize
> and r3, r4, #255
> and r3, r3, #1
> cmp r3, #0
> moveq   r3, #1
> movne   r3, #0
> and r3, r3, #255
> cmp r3, #0
> beq .L8
> ldr r0, .L14
> bl  __cxa_guard_acquire
> mov r3, r0
> 
> so __sync_synchronize is not defined in newlib since the target (arm-sim)
> is known to be not multi-threaded,
> but __cxa_guard_acquire is also not a thread safe function,
> because __GTHREADS is not defined by libgcc, since it is known
> at configure time, that the target does not support threads.
> So libstdc++ does not try to use a mutex or any atomics either,
> because it is not compiled with __GTHREADS.
> 
> I can further narrow down the patch by only defining this function when
> __GTHREADS is not defined, to make it more clear.
> 
> 
>> I think Ramana had some changes in the works at one point to address
>> (some) of this, but I'm not sure what happened to them.  Ramana?
>>
>>
>> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
>> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
>> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
>> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>>
>> Ug, no!  Use the ACLE macros to avoid this sort of mess.
>>
> 
> Ah, thanks, copy-paste from freebsd-atomic.c :)
> 
> 
> I've attached the updated patch.
> Is it OK?
> 
> 
> Thanks
> Bernd.
> 

libgcc is *still* the wrong place for this.  It belongs in the system
library (eg newlib, or glibc, or whatever), which knows about the system
it's running on.  (Sorry, I should have said this before, but I've
context-switched this out since it's been a long time since it came up).

This hack will just lead to silent code failure of the worst kind
(non-reproducable, racy) at runtime.

R.

Re: (VAX) cc0 anyone? (was: [PATCH 0/2] Fixes for old version NetBSD targets)

2020-11-17 Thread Kamil Rytarowski

On 17.11.2020 04:49, Hans-Peter Nilsson wrote:
> On Sun, 15 Nov 2020, Maciej W. Rozycki wrote:
> 
>> Hi,
>>
>>  In the course of my recent VAX backend modernisation effort
> 
> Hi.  That reminds me that VAX is "still" a cc0 target.
> 
> Are you aware of anyone planning on that level of modernization?
> 
> More than a year ago, there was a major heads-up that all
> remaining cc0 targets would be "retired" in the next release.
> 
> Now, I'm thinking that was just an empty threat. 0:-)
> 
> brgds, H-P
> PS. not volunteering, sorry.
> 

The "VAX backend modernisation effort" means upgrading out of cc0.



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Bernd Edlinger

On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
> On 03/11/2020 15:08, Bernd Edlinger wrote:
>> Hi,
>>
>> this fixes a problem with a missing symbol __sync_synchronize
>> which happens when newlib is used together with libstdc++ for
>> the non-threaded simulator target arm-none-eabi.
>>
>> There are several questions on stackoverflow about this issue.
>>
>> I would like to add a weak symbol for this target, since this
>> is only a default implementation and not meant to override a
>> possibly more sophisticated synchronization function from the
>> c-runtime.
>>
>>
>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
> 
> I seem to recall that this was a deliberate decision - you can't guess
> this correctly, at least when trying to build portable code - you just
> have to know which runtime you will be using.
> 

Therefore I suggest to use the weak attribute.  It is on purpose not
implementing all of the atomics.

The use case, is a C++ program which initializes a local static variable.

$ cat test.cc
#include 
main(int argc, char **argv)
{
  static std::string x = "test";
  return 0;
}

compiles to this:
sub sp, sp, #20
str r0, [fp, #-24]
str r1, [fp, #-28]
ldr r3, .L14
ldrbr4, [r3]
bl  __sync_synchronize
and r3, r4, #255
and r3, r3, #1
cmp r3, #0
moveq   r3, #1
movne   r3, #0
and r3, r3, #255
cmp r3, #0
beq .L8
ldr r0, .L14
bl  __cxa_guard_acquire
mov r3, r0

so __sync_synchronize is not defined in newlib since the target (arm-sim)
is known to be not multi-threaded,
but __cxa_guard_acquire is also not a thread safe function,
because __GTHREADS is not defined by libgcc, since it is known
at configure time, that the target does not support threads.
So libstdc++ does not try to use a mutex or any atomics either,
because it is not compiled with __GTHREADS.

I can further narrow down the patch by only defining this function when
__GTHREADS is not defined, to make it more clear.


> I think Ramana had some changes in the works at one point to address
> (some) of this, but I'm not sure what happened to them.  Ramana?
> 
> 
> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> 
> Ug, no!  Use the ACLE macros to avoid this sort of mess.
> 

Ah, thanks, copy-paste from freebsd-atomic.c :)


I've attached the updated patch.
Is it OK?


Thanks
Bernd.
From ca44e1fcb4b991306cbcde6293d20c77ce74ad68 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger 
Date: Mon, 2 Nov 2020 11:43:44 +0100
Subject: [PATCH] libgcc: Add a weak stub for __sync_synchronize

This patch adds a default implementation for __sync_synchronize,
which prevents many unresolved symbol errors on arm-none-eabi.
This happens often in C++ programs even without any threading.

libgcc:
2020-11-17  Bernd Edlinger  

	* config.host: Use t-eabi for arm-none-eabi.
	* config/arm/t-eabi: New.
	* config/arm/eabi-sync.c: New.
---
 libgcc/config.host|  2 +-
 libgcc/config/arm/eabi-sync.c | 38 ++
 libgcc/config/arm/t-eabi  |  1 +
 3 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/arm/eabi-sync.c
 create mode 100644 libgcc/config/arm/t-eabi

diff --git a/libgcc/config.host b/libgcc/config.host
index 66af834..eaf74f1 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -496,7 +496,7 @@ arm*-*-eabi* | arm*-*-symbianelf* | arm*-*-rtems*)
 	tm_file="$tm_file arm/bpabi-lib.h"
 	case ${host} in
 	arm*-*-eabi* | arm*-*-rtems*)
-	  tmake_file="${tmake_file} arm/t-bpabi t-crtfm"
+	  tmake_file="${tmake_file} arm/t-bpabi t-crtfm arm/t-eabi"
 	  extra_parts="crtbegin.o crtend.o crti.o crtn.o"
 	  ;;
 	arm*-*-symbianelf*)
diff --git a/libgcc/config/arm/eabi-sync.c b/libgcc/config/arm/eabi-sync.c
new file mode 100644
index 000..c37bacf
--- /dev/null
+++ b/libgcc/config/arm/eabi-sync.c
@@ -0,0 +1,38 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception,

Re: OpenACC 'kernels' testsuite failures

2020-11-17 Thread David Edelsohn via Gcc-patches

Hi, Thomas

The standard version of Tcl installed on AIX currently is Tcl 8.4.
I'll see if I can have a newer version on the side.

The patch resolves the "no such variable" error message.  (Great!
Thanks!)  I now see:

during GIMPLE pass: omplower

as an Excess error.  Any idea where that comes from and why it doesn't
appear on other targets?  Does that need to be captured and ignored?

Thanks, David

On Mon, Nov 16, 2020 at 11:46 AM Thomas Schwinge
 wrote:
>
> Hi David!
>
> While you were writing your email, I've also been busy:
>
> On 2020-11-16T11:32:46-0500, David Edelsohn  wrote:
> > On Mon, Nov 16, 2020 at 9:16 AM Thomas Schwinge  
> > wrote:
> >> On 2020-11-15T15:47:15-0500, David Edelsohn  wrote:
> >> > I am seeing a number of new failures on AIX related to the OpenACC
> >> > kernels patches.
> >> >
> >> > c-c++-common/goacc/kernels-decompose-1.c
> >> > c-c++-common/goacc/kernels-decompose-2.c
> >> > gfortran.dg/goacc/kernels-decompose-1.f95
> >> > gfortran.dg/goacc/kernels-decompose-2.f95
> >> > libgomp.oacc-c++/../libgomp.oacc-c-c++-common/kernels-decompose-1.c
> >> > libgomp.oacc-fortran/pr94358-1.f90
> >>
> >> I suppose what you're asking about is what appears in
> >>  reports as:
> >>
> >> ERROR: c-c++-common/goacc/kernels-decompose-1.c: can't read 
> >> "c_loop_i": no such variable for " dg-line 24 l_loop_i[incr c_loop_i] "
> >> UNRESOLVED: c-c++-common/goacc/kernels-decompose-1.c: can't read 
> >> "c_loop_i": no such variable for " dg-line 24 l_loop_i[incr c_loop_i] "
> >>
> >> Etc.
>
> In the mean time, I did remember that weeks ago, I had noticed this
> following "detail": on  we
> read that "Starting with the Tcl 8.5 release, the variable 'varName'
> passed to 'incr' may be unset, and in that case, it will be set to
> [...]".  Tcl 8.5 has been released in 2007.
>
> Per 'gcc/doc/install.texi' we require:
>
> @item DejaGnu 1.4.4
> @itemx Expect
> @itemx Tcl
>
> Necessary to run the GCC testsuite; [...]
>
> DejaGnu has been released in 2004 (so cannot have required Tcl 8.5
> released in 2007).
>
> >> However, per the reports posted there, these really only (!) appear to
> >> fail for your "Native configuration is powerpc-ibm-aix7.2.3.0" testing,
> >> strange.  Which versions of DejaGnu and Tcl are used?
> >
> > For my internal tester DejaGNU reports the following:
> >
> > Expect version is 5.42.1
> > Tcl version is 8.4
> > Framework version is 1.5.3
>
> There we go: you're on Tcl 8.4.  ;-D
>
> >> On  I see there are AIX
> >> systems gcc111, gcc119 -- maybe I'll have luck reproducing the issue
> >> there.
>
> On these, we've got:
>
> tschwinge@gcc111:[/home/tschwinge]/opt/freeware/bin/runtest --version
> WARNING: Couldn't find the global config file.
> Expect version is   5.45.4
> Tcl version is  8.6
> Framework version is1.4.4
>
> tschwinge@gcc119:[/home/tschwinge]/opt/freeware/bin/runtest --version
> WARNING: Couldn't find the global config file.
> Expect version is   5.44.1.15
> Tcl version is  8.5
> Framework version is1.5.3
>
> ..., so can't (easily) be used to reproduce the issue.  (... but it
> wouldn't be specific to AIX, anyway.)
>
> Before I spend time on building/verifying with Tcl 8.4: are you able to
> give the attached patch "[testsuite] Avoid Tcl 8.5-specific behavior" a
> try?
>
>
> Grüße
>  Thomas
>
>
> >> Admittedly, using Tcl code inside DejaGnu directives is not most common,
> >> but it does make sense conceptually (at least to me), and reportedly does
> >> work with a large number of DejaGnu/Tcl versions combinations.
> >>
> >> > Looking at the testsuite logs I see:
> >> >
> >> > fatal error: GCC is not configured to support amdgcn-amdhsa as offload 
> >> > target
> >>
> >> That one's not actually related to the new OpenACC 'kernels' testcases:
> >> it's just the testsuite harness checking whether GCN offloading is
> >> configured.  (See  "GCC is not configured to
> >> support amdgcn-unknown-amdhsa as offload target"; this should one appear
> >> once per testsuite.)
> >>
> >> > I don't know why this is different from the other OpenACC tests.
> >>
> >> It's not.  At least not intentionally.
> >
> > I don't see any obvious difference in the style of the additional
> > options for the kernels testcases versus others, although it
> > specifically is using an option for that test.  I only see the "GCC is
> > not configured ... amdhsa" for those tests.
> >
> >>
> >> > How
> >> > should these tests be skipped or adjusted to not fail on other
> >> > systems?
> >>
> >> They are expected to work fine on all systems; they're not specific to
> >> actual code offloading.  So if something FAILs, we shall resolve it.
> >
> > Thanks, David
>
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB

Re: [PATCH 0/2] Improve MSP430 hardware multiply support

2020-11-17 Thread Jozef Lawrynowicz

On Mon, Nov 16, 2020 at 06:36:17PM -0700, Jeff Law via Gcc-patches wrote:
> 
> 
> On 11/15/20 2:14 PM, Jozef Lawrynowicz wrote:
> > The attached patch series improves MSP430 hardware multiply support, by
> > improving code generation when setting up the 16-bit and 32-bit hardware
> > multiply functions, and adding a 64-bit hardware multiply library
> > function for devices that have a 32-bit hardware multiplier.
> >
> > Successfully regtested GCC and G++ testsuites for:
> > msp430-sim
> > msp430-sim/-mcpu=msp430
> > msp430-sim/-mhwmult=f5series
> > 
> > msp430-sim/-mhwmult=f5series/-mlarge/-mdata-region=either/-mcode-region=either
> > msp430-sim/-mlarge
> > msp430-sim/-mlarge/-mdata-region=either/-mcode-region=either
> >
> > Additionally regtested GCC execute.exp for:
> > msp430-sim/-mhwmult=16bit
> > msp430-sim/-mhwmult=32bit
> > msp430-sim/-mhwmult=f5series
> > msp430-sim/-mhwmult=none
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=16bit
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=32bit
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=f5series
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=none
> >
> > Ok for trunk?
> >
> > Jozef Lawrynowicz (2):
> >   MSP430: Add mulhi, mulsi and {u,}mulsidi3  expanders
> >   MSP430: Add 64-bit hardware multiply support
> >
> >  gcc/config/msp430/msp430.md   | 61 ++--
> >  libgcc/config/msp430/lib2hw_mul.S | 77 +--
> >  libgcc/config/msp430/lib2mul.c| 52 +
> >  libgcc/config/msp430/t-msp430 |  5 ++
> >  4 files changed, 186 insertions(+), 9 deletions(-)
> Both are fine.

Thanks.

> 
> BTW, what would be a reasonable set of multlibs for automated testing? 
> My tester has the ability to define them on a per-target basis, but I
> haven't tried to do that except for targets that I happen to know
> well.   So right now it's just using the default via
> -target_board=msp430-sim.    Figure we've probably got a time budget to
> add 3 multilibs without causing headaches.  What 3 might you suggest?

In addition to the default config, I would suggest:
  msp430-sim/-mcpu=msp430
Test the 430 ISA
  msp430-sim/-mlarge/-mcode-region=either
Test the large memory model with data assumed to be in the lower
memory region (default, reduces code size penalty of using -mlarge),
whilst shuffling code between the upper and lower memory regions to
make the program fit.
  msp430-sim/-mlarge/-mdata-region=either/-mcode-region=either
   Test the large memory model, shuffling code and data between upper
   and lower memory regions.

I should really use -mlarge/-mcode-region=either, instead of just
-mlarge, as well. -mcode-region=either doesn't change code gen, just
allows the linker shuffling of text sections so more tests build and so
we get better test coverage.

With limited testing capacity, testing hwmult configs is not very useful
unless hwmult behavior is specifically changed. There are msp430
specific tests to verify the options basically work.

Thanks,
Jozef

preprocessor: Fix profiled bootstrap warning [pr97858]

2020-11-17 Thread Nathan Sidwell



As Jakub points out, we only ever pass a single variadic parm (if at
all), so just an optional arg is fine.

PR 97858
libcpp/
* mkdeps.c (munge): Drop varadic args, we only ever use one.

pushing to trunk

--
Nathan Sidwell
diff --git i/libcpp/mkdeps.c w/libcpp/mkdeps.c
index ea5f060c380..a989ed355fa 100644
--- i/libcpp/mkdeps.c
+++ w/libcpp/mkdeps.c
@@ -105,23 +105,20 @@ public:
   unsigned short quote_lwm;
 };
 
-/* Apply Make quoting to STR, TRAIL etc.  Note that it's not possible
-   to quote all such characters - e.g. \n, %, *, ?, [, \ (in some
+/* Apply Make quoting to STR, TRAIL.  Note that it's not possible to
+   quote all such characters - e.g. \n, %, *, ?, [, \ (in some
contexts), and ~ are not properly handled.  It isn't possible to
get this right in any current version of Make.  (??? Still true?
Old comment referred to 3.76.1.)  */
 
 static const char *
-munge (const char *str, const char *trail = NULL, ...)
+munge (const char *str, const char *trail = nullptr)
 {
   static unsigned alloc;
   static char *buf;
   unsigned dst = 0;
-  va_list args;
-  if (trail)
-va_start (args, trail);
 
-  for (bool first = true; str; first = false)
+  for (; str; str = trail, trail = nullptr)
 {
   unsigned slashes = 0;
   char c;
@@ -169,14 +166,7 @@ munge (const char *str, const char *trail = NULL, ...)
 
 	  buf[dst++] = c;
 	}
-
-  if (first)
-	str = trail;
-  else
-	str = va_arg (args, const char *);
 }
-  if (trail)
-va_end (args);
 
   buf[dst] = 0;
   return buf;
@@ -332,7 +322,7 @@ make_write_name (const char *name, FILE *fp, unsigned col, unsigned colmax,
 		 bool quote = true, const char *trail = NULL)
 {
   if (quote)
-name = munge (name, trail, NULL);
+name = munge (name, trail);
   unsigned size = strlen (name);
 
   if (col)

Re: Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Richard Biener

On Tue, 17 Nov 2020, Jan Hubicka wrote:

> Hi,
> thanks!
> > 
> > So do we want to actually compute alias sets and stream them,
> > "freeing up" TYPE_CANONICAL again?  We're sort-of taking it away
> 
> I am not sure what you mean by freeing up TYPE_CANONICAL again :) but I
> was playing with idea of streaming the alias sets from WPA to ltrans. It
> may simplify things especially if our canonical type logic gets more
> complex... In particular I would eventually like to avoid pesimizing
> C/C++ code just becuase we worry about compatibility with Fortran and
> such and had some patches for this direction  but so far there was more
> pressing issues with TBAA than this.
> 
> We do not really use TYPE_CANONICAL outside of tree-ssa-alias and
> alias.c

GIMPLE type checking uses it heavily (aka useless_type_conversion_p).

Richard.

> Honza
> > from FEs which use it before and recompute it possibly in different
> > ways ...
> > 
> > Richard.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

langhooks: preprocessor hooks for c++ modules

2020-11-17 Thread Nathan Sidwell



This is a slightly modified version of 01-langhooks.def.  I realized I
didn't need the deferred macro langhook -- that can be directly
installed into the preprocessor callbacks via preprocess_options lang
hook.

gcc/
* langhooks-def.h (LANG_HOOKS_PREPROCESS_MAIN_FILE)
(LANG_HOOKS_PREPROCESS_OPTIONS, LANG_HOOKS_PREPROCESS_UNDEF)
(LANG_HOOKS_PREPROCESS_TOKEN): New.
(LANG_HOOKS_INITIALIZER): Add them.
* langhooks.h (struct lang_hooks): Add preprocess_main_file,
preprocess_options, preprocess_undef, preprocess_token hooks.  Add
enum PT_flags.
gcc/c-family.
* c-lex.c: #include "langhooks.h".
(cb_undef): Maybe call preprocess_undef lang hook.
* c-opts.c (c_common_post_options): Maybe call preprocess_options
lang hook.
(push_command_line_include): Maybe call preprocess_main_file lang
hook.
(cb_file_change): Likewise.
* c-ppoutput.c: #include "langhooks.h.
(scan_translation_unit): Maybe call preprocess_token lang hook.
(class do_streamer): New, derive from token_streamer.
(directives_only_cb): Data pointer is do_streamer, call
preprocess_token lang hook.
(scan_translation_unit_directives_only): Use do_streamer.
(print_line_1): Move src_line recording to after string output.
(cb_undef): Maybe call preprocess_undef lang hook.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/c-family/c-lex.c w/gcc/c-family/c-lex.c
index 6cd3df7c96f..8dd1420d10d 100644
--- i/gcc/c-family/c-lex.c
+++ w/gcc/c-family/c-lex.c
@@ -28,7 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-pragma.h"
 #include "debug.h"
 #include "file-prefix-map.h" /* remap_macro_filename()  */
-
+#include "langhooks.h"
 #include "attribs.h"
 
 /* We may keep statistics about how long which files took to compile.  */
@@ -274,9 +274,11 @@ cb_define (cpp_reader *pfile, location_t loc, cpp_hashnode *node)
 
 /* #undef callback for DWARF and DWARF2 debug info.  */
 static void
-cb_undef (cpp_reader * ARG_UNUSED (pfile), location_t loc,
-	  cpp_hashnode *node)
+cb_undef (cpp_reader *pfile, location_t loc, cpp_hashnode *node)
 {
+  if (lang_hooks.preprocess_undef)
+lang_hooks.preprocess_undef (pfile, loc, node);
+
   const struct line_map *map = linemap_lookup (line_table, loc);
   (*debug_hooks->undef) (SOURCE_LINE (linemap_check_ordinary (map), loc),
 			 (const char *) NODE_NAME (node));
diff --git i/gcc/c-family/c-opts.c w/gcc/c-family/c-opts.c
index 40e92229d8a..77844d7daf1 100644
--- i/gcc/c-family/c-opts.c
+++ w/gcc/c-family/c-opts.c
@@ -1106,6 +1106,8 @@ c_common_post_options (const char **pfilename)
   struct cpp_callbacks *cb = cpp_get_callbacks (parse_in);
   cb->file_change = cb_file_change;
   cb->dir_change = cb_dir_change;
+  if (lang_hooks.preprocess_options)
+lang_hooks.preprocess_options (parse_in);
   cpp_post_options (parse_in);
   init_global_opts_from_cpp (_options, cpp_get_options (parse_in));
 
@@ -1548,7 +1550,13 @@ push_command_line_include (void)
   cpp_opts->warn_unused_macros = cpp_warn_unused_macros;
   /* Restore the line map back to the main file.  */
   if (!cpp_opts->preprocessed)
-	cpp_change_file (parse_in, LC_RENAME, this_input_filename);
+	{
+	  cpp_change_file (parse_in, LC_RENAME, this_input_filename);
+	  if (lang_hooks.preprocess_main_file)
+	/* We're starting the main file.  Inform the FE of that.  */
+	lang_hooks.preprocess_main_file
+	  (parse_in, line_table, LINEMAPS_LAST_ORDINARY_MAP (line_table));
+	}
 
   /* Set this here so the client can change the option if it wishes,
 	 and after stacking the main file so we don't trace the main file.  */
@@ -1558,14 +1566,19 @@ push_command_line_include (void)
 
 /* File change callback.  Has to handle -include files.  */
 static void
-cb_file_change (cpp_reader * ARG_UNUSED (pfile),
-		const line_map_ordinary *new_map)
+cb_file_change (cpp_reader *reader, const line_map_ordinary *new_map)
 {
   if (flag_preprocess_only)
 pp_file_change (new_map);
   else
 fe_file_change (new_map);
 
+  if (new_map && cpp_opts->preprocessed
+  && lang_hooks.preprocess_main_file && MAIN_FILE_P (new_map)
+  && ORDINARY_MAP_STARTING_LINE_NUMBER (new_map))
+/* We're starting the main file.  Inform the FE of that.  */
+lang_hooks.preprocess_main_file (reader, line_table, new_map);
+
   if (new_map 
   && (new_map->reason == LC_ENTER || new_map->reason == LC_RENAME))
 {
diff --git i/gcc/c-family/c-ppoutput.c w/gcc/c-family/c-ppoutput.c
index 517de15d97c..e3e0e59fcc7 100644
--- i/gcc/c-family/c-ppoutput.c
+++ w/gcc/c-family/c-ppoutput.c
@@ -21,6 +21,7 @@
 #include "coretypes.h"
 #include "c-common.h"		/* For flags.  */
 #include "../libcpp/internal.h"
+#include "langhooks.h"
 #include "c-pragma.h"		/* For parse_in.  */
 #include "file-prefix-map.h"/* remap_macro_filename()  */
 
@@ -301,10 +302,15 @@

[PATCH] x86: Add a testcase for PR target/31799

2020-11-17 Thread H.J. Lu via Gcc-patches

Add a testcase for PR target/31799 which was fixed by

commit 4f0473fe89e68bf7c09542ee5c3684da25a5b435
Author: Uros Bizjak 
Date:   Fri May 12 21:04:05 2017 +0200

compare-elim.c (try_eliminate_compare): Canonicalize operation with 
embedded compare to [(set (reg:CCM) (compare:CCM...

* compare-elim.c (try_eliminate_compare): Canonicalize
operation with embedded compare to
[(set (reg:CCM) (compare:CCM (operation) (immediate)))
 (set (reg) (operation)].

* config/i386/i386.c (TARGET_FLAGS_REGNUM): New define.

in GCC 8.

PR target/31799
* gcc.target/i386/pr31799.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr31799.c | 12 
 1 file changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr31799.c

diff --git a/gcc/testsuite/gcc.target/i386/pr31799.c 
b/gcc/testsuite/gcc.target/i386/pr31799.c
new file mode 100644
index 000..c72c4eab986
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr31799.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo (int x, int *y, int *z)
+{
+  *z = ++x;
+  if (x != 0)
+  *y = 1;
+}
+
+/* { dg-final { scan-assembler-not "test" } } */
-- 
2.28.0

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Rainer Orth

Hi Jonathan,

There are two more uses of librt left:

* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
  idea how long gcc is supposed to support such targets (glibc 2.17 was
  released in December 2012).
>>>
>>> RHEL 7 uses glibc 2.17, so it will still be in use for some time.
>>
>>but at least the comments say < 2.17, so RHEL 7 wouldn't be affected.
>
> Ah right, sorry, I read too quickly. Yes, < 2.17 probably isn't very
> relevant now, although historically libstdc++ has not explicitly
> dropped support older glibc versions. If it builds, then it builds.
>
> We could consider doing some housekeeping in that area, or just
> documenting our requirements more carefully (for example, we now
> require Linux kernel version 2.6.22 for the FUTEX_PRIVATE_FLAG but I
> don't think we say that anywhere).

that would certainly help, if only to set user expectations.  When
e.g. I tried to get any info from the LLVM community which macOS
versions were supposed to be still supported, it was like pulling teeth
and in the end got me nothing.  Not a particularly pleasant experience.
We should be able to do better than that.

>>> The libstdc++ part is OK with those adjustments. Thanks for doing
>>> this, it's really helpful to trim these checks so the unnecessary
>>> parts don't hang around indefinitely.
>>
>>My pleasure: they are easy enough to miss, unfortunately, since the are
>>seldom labeled with `for OS version X.Y' or some such.  E.g. we still
>>have a libexc test in gcc/configure.in, which was only added for Tru64
>>UNIX, I believe (unless Linux/alpha needs it, too).
>
> Do we even have anybody still using alpha?

I have no idea: I donated my last alpha systems to some sort of computer
museum years ago once I had removed Tru64 UNIX support.  There are always
some enthusiasts around, some of which clamour for keeping their pet
target around a bit longer ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-17 Thread Richard Sandiford via Gcc-patches

Richard Biener  writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>> 
>> > > +#include 
>> > > +
>> > 
>> > SVE targets will need a:
>> > 
>> > #pragma GCC target "+nosve"
>> > 
>> > here, since we'll generate different code for SVE.
>> 
>> Fixed.
>> 
>> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
>> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>> > 
>> > Very minor nit, sorry, but I think:
>> > 
>> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>> > 
>> > would be better.  Using "?\t" works, but IIRC it shows up as a tab
>> > character in the testsuite result summary too.
>> 
>> Fixed. Minor nits welcome. :)
>> 
>> 
>> > OK for the aarch64 bits with the testsuite changes above.
>> ok?
>
> The gcc/tree-vect-stmts.c parts are OK.

Same for the AArch64 stuff.

Thanks,
Richard

Re: [2/3][vect] Add widening add, subtract vect patterns

2020-11-17 Thread Richard Sandiford via Gcc-patches

Richard Biener  writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>> 
>> > Like Richard said, the new patterns need to be documented in md.texi
>> > and the new tree codes need to be documented in generic.texi.
>> 
>> Done.
>> 
>> > While we're using tree codes, I think we need to make the naming
>> > consistent with other tree codes: WIDEN_PLUS_EXPR instead of
>> > WIDEN_ADD_EXPR and WIDEN_MINUS_EXPR instead of WIDEN_SUB_EXPR.
>> > Same idea for the VEC_* codes.
>> 
>> Fixed.
>> 
>> > > gcc/ChangeLog:
>> > >
>> > > 2020-11-12  Joel Hutton  
>> > >
>> > > * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases
>> > 
>> > Not that I personally care about this stuff (would love to see changelogs
>> > go away :-)) but some nits:
>> > 
>> > Each description is supposed to start with a capital letter and end with
>> > a full stop (even if it's not a complete sentence).  Same for the rest
>> 
>> Fixed.
>> 
>> > > * optabs-tree.c (optab_for_tree_code): optabs for widening 
>> > > adds,subtracts
>> > 
>> > The line limit for changelogs is 80 characters.  The entry should say
>> > what changed, so ?Handle ?? or ?Add case for ?? or something.
>> 
>> Fixed.
>> 
>> > > * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog 
>> > > ptatern
>> > 
>> > typo: pattern
>> 
>> Fixed.
>> 
>> > > Add widening add, subtract patterns to tree-vect-patterns.
>> > > Add aarch64 tests for patterns.
>> > >
>> > > fix sad
>> > 
>> > Would be good to expand on this for the final commit message.
>> 
>> 'fix sad' was accidentally included when I squashed two commits. I've made 
>> all the commit messages more descriptive.
>> 
>> > > +
>> > > +case VEC_WIDEN_SUB_HI_EXPR:
>> > > +  return (TYPE_UNSIGNED (type)
>> > > +   ? vec_widen_usubl_hi_optab  : vec_widen_ssubl_hi_optab);
>> > > +
>> > > +
>> > 
>> > Nits: excess blank line at the end and excess space before the ?:?s.
>> 
>> Fixed.
>> 
>> > > +OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
>> > > +OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
>> > > +OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
>> > >  OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
>> > >  OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
>> > >  OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
>> > 
>> > Looks like the current code groups signed stuff together and
>> > unsigned stuff together, so would be good to follow that.
>> 
>> Fixed.
>> 
>> > Same comments as the previous patch about having a "+nosve" pragma
>> > and about the scan-assembler-times lines.  Same for the sub test.
>> 
>> Fixed.
>> 
>> > I am missing documentation in md.texi for the new patterns.  In
>> > particular I wonder why you need singed and unsigned variants
>> > for the add/subtract patterns.
>> 
>> Fixed. Signed and unsigned variants because they correspond to signed and
>> unsigned instructions, (uaddl/uaddl2, saddl/saddl2).
>> 
>> > The new functions should have comments before them.  Can probably
>> > just use the vect_recog_widen_mult_pattern comment as a template.
>> 
>> Fixed.
>> 
>> > > +case VEC_WIDEN_SUB_HI_EXPR:
>> > > +case VEC_WIDEN_SUB_LO_EXPR:
>> > > +case VEC_WIDEN_ADD_HI_EXPR:
>> > > +case VEC_WIDEN_ADD_LO_EXPR:
>> > > +  return false;
>> > > +
>> >
>> > I think these should get the same validity checking as
>> > VEC_WIDEN_MULT_HI_EXPR etc.
>> 
>> Fixed.
>> 
>> > > --- a/gcc/tree-vect-patterns.c
>> > > +++ b/gcc/tree-vect-patterns.c
>> > > @@ -1086,8 +1086,10 @@ vect_recog_sad_pattern (vec_info *vinfo,
>> > >   of the above pattern.  */
>> > >
>> > >tree plus_oprnd0, plus_oprnd1;
>> > > -  if (!vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
>> > > -_oprnd0, _oprnd1))
>> > > +  if (!(vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
>> > > +_oprnd0, _oprnd1)
>> > > + || vect_reassociating_reduction_p (vinfo, stmt_vinfo, 
>> > > WIDEN_ADD_EXPR,
>> > > +_oprnd0, _oprnd1)))
>> > >  return NULL;
>> > >
>> > > tree sum_type = gimple_expr_type (last_stmt);
>> >
>> > I think we should make:
>> >
>> >   /* Any non-truncating sequence of conversions is OK here, since
>> >  with a successful match, the result of the ABS(U) is known to fit
>> >  within the nonnegative range of the result type.  (It cannot be the
>> >  negative of the minimum signed value due to the range of the widening
>> >  MINUS_EXPR.)  */
>> >   vect_unpromoted_value unprom_abs;
>> >   plus_oprnd0 = vect_look_through_possible_promotion (vinfo, plus_oprnd0,
>> >   _abs);
>> >
>> > specific to the PLUS_EXPR case.  If we look through promotions on
>> > the operands of a WIDEN_ADD_EXPR, we could potentially

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Jonathan Wakely via Gcc-patches


On 17/11/20 14:25 +0100, Rainer Orth wrote:

Hi Jonathan,


There are two more uses of librt left:

* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
 idea how long gcc is supposed to support such targets (glibc 2.17 was
 released in December 2012).


RHEL 7 uses glibc 2.17, so it will still be in use for some time.


but at least the comments say < 2.17, so RHEL 7 wouldn't be affected.


Ah right, sorry, I read too quickly. Yes, < 2.17 probably isn't very
relevant now, although historically libstdc++ has not explicitly
dropped support older glibc versions. If it builds, then it builds.

We could consider doing some housekeeping in that area, or just
documenting our requirements more carefully (for example, we now
require Linux kernel version 2.6.22 for the FUTEX_PRIVATE_FLAG but I
don't think we say that anywhere).


diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1381,8 +1381,8 @@ dnl
dnl --enable-libstdcxx-time
dnl --enable-libstdcxx-time=yes
dnlchecks for the availability of monotonic and realtime clocks,
-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
-dnllinks in the latter.
+dnlnanosleep and sched_yield in libc and, if needed, links in the
+dnllatter.


"The latter" was referring to libposix4, and we always link to libc,
so "if needed" doesn't apply to it.

So I think it should be:

 dnlchecks for the availability of monotonic and realtime clocks,
-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
-dnllinks in the latter.
+dnlnanosleep and sched_yield in libc.




diff --git a/libstdc++-v3/doc/xml/manual/configure.xml
b/libstdc++-v3/doc/xml/manual/configure.xml
--- a/libstdc++-v3/doc/xml/manual/configure.xml
+++ b/libstdc++-v3/doc/xml/manual/configure.xml
@@ -171,7 +171,7 @@
sched_yield functions, used in the
implementation of [thread.thread.this] of the 2011 ISO C++ standard.
The choice OPTION=yes checks for the availability of the facilities
-   in libc and libposix4.  In case it's needed the latter is also linked
+   in libc.  In case it's needed the latter is also linked
to libstdc++ as part of the build process.  OPTION=rt also checks in


Similarly, the whole "In case it's needed the latter is also linked to
libstdc++ as part of the build process." sentence should be removed. It
only applied to libposix4.


Good catch: I've been too mechanical in my updates.  Btw., can you take
care of regenerating the html files there?


Yes, no problem.


librt (and, if it's needed, links to it).  Note that linking to librt
is not always desirable because for glibc it requires linking to


The libstdc++ part is OK with those adjustments. Thanks for doing
this, it's really helpful to trim these checks so the unnecessary
parts don't hang around indefinitely.


My pleasure: they are easy enough to miss, unfortunately, since the are
seldom labeled with `for OS version X.Y' or some such.  E.g. we still
have a libexc test in gcc/configure.in, which was only added for Tru64
UNIX, I believe (unless Linux/alpha needs it, too).


Do we even have anybody still using alpha?

Re: [PATCH] libstdc++: Fix ranges::search_n for random access iterators [PR97828]

2020-11-17 Thread Jonathan Wakely via Gcc-patches


On 16/11/20 15:25 -0500, Patrick Palka via Libstdc++ wrote:

My ranges transcription of the std::search_n implementation for random
access iterators missed a crucial part of the algorithm which the tests
unfortunately didn't catch.  When __remainder is less than __count at
the start of an iteration of the outer while loop, it means we're
continuing a run of __count - __remainder identical elements from the
previous iteration in which the backwards scan got interrupted by the
predicate failing.  If the backwards scan gets interrupted again at this
next iteration, we need to reset __remainder so that it's offset only by
the size the most recent run of identical elements, rather than by the
sum of all the (disjoint) runs.

This patch fixes this appropriately, mirroring how it's done in the
corresponding std::search_n implementation.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk and the 10
branch?


OK, thanks.

Re: [1/3][aarch64] Add aarch64 support for vec_widen_add, vec_widen_sub patterns

2020-11-17 Thread Richard Sandiford via Gcc-patches

Joel Hutton  writes:
> Tests are still running, but I believe I've addressed the comment.
>
>> There are ways in which we could reduce the amount of cut-&-paste here,
>> but I guess everything is a trade-off between clarity and compactness.
>> One extreme is to write them all out explicitly, another extreme would
>> be to have one define_expand and various iterators and attributes.
>>
>> I think the vec_widen_mult_*_ patterns strike a good balance:
>> the use ANY_EXTEND to hide the sign difference while still having
>> separate hi and lo patterns:
>
> Done
>
> gcc/ChangeLog:
>
> 2020-11-13  Joel Hutton  
>
> * config/aarch64/aarch64-simd.md: New patterns
>   vec_widen_saddl_lo/hi_.
>
> From c52fd11f5d471200c1292fad3bc04056e7721f06 Mon Sep 17 00:00:00 2001
> From: Joel Hutton 
> Date: Mon, 9 Nov 2020 15:35:57 +
> Subject: [PATCH 1/3] [aarch64] Add vec_widen patterns to aarch64
>
> Add widening add and subtract patterns to the aarch64
> backend. These allow taking vectors of N elements of size S
> and performing and add/subtract on the high or low half
> widening the resulting elements and storing N/2 elements of size 2*S.
> These correspond to the addl,addl2,subl,subl2 instructions.
> ---
>  gcc/config/aarch64/aarch64-simd.md | 47 ++
>  1 file changed, 47 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 2cf6fe9154a..30299610635 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3382,6 +3382,53 @@
>[(set_attr "type" "neon__long")]
>  )
>  
> +(define_expand "vec_widen_addl_lo_"
> +  [(match_operand: 0 "register_operand")
> +   (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))
> +   (ANY_EXTEND: (match_operand:VQW 2 "register_operand"))]
> +  "TARGET_SIMD"
> +{
> +  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
> +  emit_insn (gen_aarch64_addl_lo_internal (operands[0], 
> operands[1],
> +   operands[2], p));

Nit: operands[2] should be indented three more columns now that “s” and
“u” have changed to “”.

OK with that change, thanks.

Richard

Re: Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Jan Hubicka

Hi,
thanks!
> 
> So do we want to actually compute alias sets and stream them,
> "freeing up" TYPE_CANONICAL again?  We're sort-of taking it away

I am not sure what you mean by freeing up TYPE_CANONICAL again :) but I
was playing with idea of streaming the alias sets from WPA to ltrans. It
may simplify things especially if our canonical type logic gets more
complex... In particular I would eventually like to avoid pesimizing
C/C++ code just becuase we worry about compatibility with Fortran and
such and had some patches for this direction  but so far there was more
pressing issues with TBAA than this.

We do not really use TYPE_CANONICAL outside of tree-ssa-alias and
alias.c

Honza
> from FEs which use it before and recompute it possibly in different
> ways ...
> 
> Richard.

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Rainer Orth

Hi Jonathan,

>>There are two more uses of librt left:
>>
>>* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
>>  idea how long gcc is supposed to support such targets (glibc 2.17 was
>>  released in December 2012).
>
> RHEL 7 uses glibc 2.17, so it will still be in use for some time.

but at least the comments say < 2.17, so RHEL 7 wouldn't be affected.

>>diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
>>--- a/libstdc++-v3/acinclude.m4
>>+++ b/libstdc++-v3/acinclude.m4
>>@@ -1381,8 +1381,8 @@ dnl
>> dnl --enable-libstdcxx-time
>> dnl --enable-libstdcxx-time=yes
>> dnlchecks for the availability of monotonic and realtime clocks,
>>-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
>>-dnllinks in the latter.
>>+dnlnanosleep and sched_yield in libc and, if needed, links in the
>>+dnllatter.
>
> "The latter" was referring to libposix4, and we always link to libc,
> so "if needed" doesn't apply to it.
>
> So I think it should be:
>
>  dnlchecks for the availability of monotonic and realtime clocks,
> -dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
> -dnllinks in the latter.
> +dnlnanosleep and sched_yield in libc.
>
>
>
>>diff --git a/libstdc++-v3/doc/xml/manual/configure.xml
>> b/libstdc++-v3/doc/xml/manual/configure.xml
>>--- a/libstdc++-v3/doc/xml/manual/configure.xml
>>+++ b/libstdc++-v3/doc/xml/manual/configure.xml
>>@@ -171,7 +171,7 @@
>>  sched_yield functions, used in the
>>  implementation of [thread.thread.this] of the 2011 ISO C++ standard.
>>  The choice OPTION=yes checks for the availability of the facilities
>>- in libc and libposix4.  In case it's needed the latter is also linked
>>+ in libc.  In case it's needed the latter is also linked
>>  to libstdc++ as part of the build process.  OPTION=rt also checks in
>
> Similarly, the whole "In case it's needed the latter is also linked to
> libstdc++ as part of the build process." sentence should be removed. It
> only applied to libposix4.

Good catch: I've been too mechanical in my updates.  Btw., can you take
care of regenerating the html files there?

>>  librt (and, if it's needed, links to it).  Note that linking to librt
>>  is not always desirable because for glibc it requires linking to
>
> The libstdc++ part is OK with those adjustments. Thanks for doing
> this, it's really helpful to trim these checks so the unnecessary
> parts don't hang around indefinitely.

My pleasure: they are easy enough to miss, unfortunately, since the are
seldom labeled with `for OS version X.Y' or some such.  E.g. we still
have a libexc test in gcc/configure.in, which was only added for Tru64
UNIX, I believe (unless Linux/alpha needs it, too).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: V2 [PATCH] Use SHF_GNU_RETAIN to preserve symbol definitions

2020-11-17 Thread H.J. Lu via Gcc-patches

On Mon, Nov 16, 2020 at 7:59 PM Hans-Peter Nilsson  wrote:
>
> On Fri, 13 Nov 2020, H.J. Lu via Gcc-patches wrote:
> > Done.  Here is the updated patch.
>
> Hi.  I see a test-case for this kind of construct:
>
>  int foo __attribute__((__used__, __section__ (".bar"))) = 42;
>
> and IIUC that it's handled as I'd hope (setting "R" on the named
> section, not another derived section), good.
>
> Could you also add a test-case that the same construct
> *without* a specific initializer is handled the same way?
> I.e.:
>  int foo __attribute__((__used__, __section__ (".bar")));
>

Done.  The only changes are

/* { dg-final { scan-assembler ".data.used_bar_sec,\"awR\"" } } */
...
int __attribute__((used,section(".data.used_bar_sec"))) used_bar;

and 2 additional tests for -fcommon.

Thanks.

-- 
H.J.
From d19f2e2ec7f0f47121a2a4c05ffe20af8972c1bb Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 3 Feb 2020 11:55:43 -0800
Subject: [PATCH] Use SHF_GNU_RETAIN to preserve symbol definitions

In assemly code, the section flag 'R' sets the SHF_GNU_RETAIN flag to
indicate that the section must be preserved by the linker.

Add SECTION_RETAIN to indicate a section should be retained by the linker
and set SECTION_RETAIN on section for the preserved symbol if assembler
supports SHF_GNU_RETAIN.  All retained symbols are placed in separate
sections with

	.section .data.rel.local.preserved_symbol,"awR"
preserved_symbol:
...
	.section .data.rel.local,"aw"
not_preserved_symbol:
...

to avoid

	.section .data.rel.local,"awR"
preserved_symbol:
...
not_preserved_symbol:
...

which places not_preserved_symbol definition in the SHF_GNU_RETAIN
section.

gcc/

2020-11-XX  H.J. Lu  

	* configure.ac (HAVE_GAS_SHF_GNU_RETAIN): New.  Define 1 if
	the assembler supports marking sections with SHF_GNU_RETAIN flag.
	* output.h (SECTION_RETAIN): New.  Defined as 0x400.
	(SECTION_MACH_DEP): Changed from 0x400 to 0x800.
	(default_unique_section): Add a bool argument.
	* varasm.c (get_section): Set SECTION_RETAIN for the preserved
	symbol with HAVE_GAS_SHF_GNU_RETAIN.
	(resolve_unique_section): Used named section for the preserved
	symbol if assembler supports SHF_GNU_RETAIN.
	(get_variable_section): Handle the preserved common symbol with
	HAVE_GAS_SHF_GNU_RETAIN.
	(default_elf_asm_named_section): Require the full declaration and
	use the 'R' flag for SECTION_RETAIN.
	* config.in: Regenerated.
	* configure: Likewise.

gcc/testsuite/

2020-11-XX  H.J. Lu  
	Jozef Lawrynowicz  

	* c-c++-common/attr-used.c: Check the 'R' flag.
	* c-c++-common/attr-used-2.c: Likewise.
	* c-c++-common/attr-used-3.c: New test.
	* c-c++-common/attr-used-4.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-1.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-2.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-3.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-4.c: Likewise.
	* lib/target-supports.exp
	(check_effective_target_R_flag_in_section): New proc.
---
 gcc/config.in |  7 +++
 gcc/configure | 51 +++
 gcc/configure.ac  | 20 
 gcc/output.h  |  6 ++-
 gcc/testsuite/c-c++-common/attr-used-2.c  |  1 +
 gcc/testsuite/c-c++-common/attr-used-3.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used-4.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used.c|  1 +
 .../compile/attr-used-retain-1.c  | 37 ++
 .../compile/attr-used-retain-2.c  | 17 +++
 .../compile/attr-used-retain-3.c  | 17 +++
 .../compile/attr-used-retain-4.c  | 17 +++
 gcc/testsuite/lib/target-supports.exp | 40 +++
 gcc/varasm.c  | 17 +--
 14 files changed, 241 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-3.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-4.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-3.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-4.c

diff --git a/gcc/config.in b/gcc/config.in
index b7c3107bfe3..23ae2f9bc1b 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1352,6 +1352,13 @@
 #endif
 
 
+/* Define 0/1 if your assembler supports marking sections with SHF_GNU_RETAIN
+   flag. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GAS_SHF_GNU_RETAIN
+#endif
+
+
 /* Define 0/1 if your assembler supports marking sections with SHF_MERGE flag.
*/
 #ifndef USED_FOR_TARGET
diff --git a/gcc/configure b/gcc/configure
index dbda4415a17..a925a6e5efb 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24272,6 +24272,57 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+# Test if the assembler supports the section flag 'R' for specifying
+#

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Jonathan Wakely via Gcc-patches


On 17/11/20 10:47 +0100, Rainer Orth wrote:

I recently noticed that neither libposix4 nor librt are needed on
Solaris 11 any longer:

* libposix4 was renamed to librt in Solaris 7 back in 1998.

* librt was folded into libc in the OpenSolaris timeframe, leaving librt
 only as a filter on libc.  Thus, it's no longer needed on either
 Solaris 11 or Illumos.

The following patch removes both uses.  At the same time, Ada's use of
libthread has gone: it was folded into libc in Solaris 10 already.
TIME_LIBRARY and friends in g++ are likewise removed: Solaris was the
only user.

Bootstrapped without regressions on i386-pc-solaris2.11,
sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.

Ok for master?

There are two more uses of librt left:

* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
 idea how long gcc is supposed to support such targets (glibc 2.17 was
 released in December 2012).


RHEL 7 uses glibc 2.17, so it will still be in use for some time.



diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1381,8 +1381,8 @@ dnl
dnl --enable-libstdcxx-time
dnl --enable-libstdcxx-time=yes
dnlchecks for the availability of monotonic and realtime clocks,
-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
-dnllinks in the latter.
+dnlnanosleep and sched_yield in libc and, if needed, links in the
+dnllatter.


"The latter" was referring to libposix4, and we always link to libc,
so "if needed" doesn't apply to it.

So I think it should be:

 dnlchecks for the availability of monotonic and realtime clocks,
-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
-dnllinks in the latter.
+dnlnanosleep and sched_yield in libc.




diff --git a/libstdc++-v3/doc/xml/manual/configure.xml 
b/libstdc++-v3/doc/xml/manual/configure.xml
--- a/libstdc++-v3/doc/xml/manual/configure.xml
+++ b/libstdc++-v3/doc/xml/manual/configure.xml
@@ -171,7 +171,7 @@
sched_yield functions, used in the
implementation of [thread.thread.this] of the 2011 ISO C++ standard.
The choice OPTION=yes checks for the availability of the facilities
-   in libc and libposix4.  In case it's needed the latter is also linked
+   in libc.  In case it's needed the latter is also linked
to libstdc++ as part of the build process.  OPTION=rt also checks in


Similarly, the whole "In case it's needed the latter is also linked to
libstdc++ as part of the build process." sentence should be removed. It
only applied to libposix4.


librt (and, if it's needed, links to it).  Note that linking to librt
is not always desirable because for glibc it requires linking to


The libstdc++ part is OK with those adjustments. Thanks for doing
this, it's really helpful to trim these checks so the unnecessary
parts don't hang around indefinitely.

Re: Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Richard Biener

On Tue, 17 Nov 2020, Jan Hubicka wrote:

> Hi,
> this patch fixes profiledbootstrap failure with LTO enabled.
> What happens is that alias_ptr_types_compatible_p relies on the
> fact that alias sets are not refined from WPA to ltrans time:
> 
> /* This function originally abstracts from simply comparing
>get_deref_alias_set so that we are sure this still computes
>the same result after LTO type merging is applied.
>When in LTO type merging is done we can actually do this compare.
> */
>   if (in_lto_p)
> return get_deref_alias_set (t1) == get_deref_alias_set (t2);
>   else
> return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
> == TYPE_MAIN_VARIANT (TREE_TYPE (t2)));
> 
> This conditional is confused - it pesimizes code with -fno-lto
> for no good reason. I will fix that separately: we now have
> lto_streaming_expected_p so I think it should read
> 
>  if (!lto_stremaing_expected_p () || flag_wpa)
>use alias sets
>  else
>use main varaiants as conservative estimate.
> 
> (so if we ever get idea to ICF during incremental link or deduplicate in
> early passes, things will work safely).  I will send separate patch on
> this.
> 
> 
> Not refining alias sets from WPA to ltrans time is a good invariant to
> maintain and the canonical type hash behaves this way.  However I broke
> this with the ODR logic.
> 
> Normally we define canonical types for C++ ODR types according to their
> type names.  However to make ODR types compatible with C types we check
> if structurally equivalent C type exists and if so, we ignore ODR
> names giving up on the precision.
> 
> This however is not stable between WPA and ltrans since at ltrans the
> type merging does not see as many types as WPA does.  To make this
> consistent the patch makes WPA ODR_TYPE_P == 0 for ODR types that
> conflicted with non-ODR type.
> 
> I had to drop one sanity check in ipa-utils.h (that I think is not very
> important - I added it while introducing CXX_ODR_P machinery) and also
> it now may happen that we query odr_based_tbaa_p before registering
> first ODR type so we do not want to ICE here.
> ODR type registration happens early to produce ODR violation warings.
> Those are not done at ltrans, so dropping the registration is safe. The
> type will still be added while computing the type inheritance graph if
> needed for devirtualization (and late devirtualization is not very
> useful anyway since it won't enable inlining).
> 
> Bootstrapped, regtested x86_64-linux, OK?

OK.

> I think this should go to release branches after some soaking in
> mainline too, even if we don't have direct reproducers.
> 
> Note that Martin implemented type checker to sanity check that alias
> sets are never getting refined from compile to WPa and from WPA to
> ltrans.  I recovered the patch and will play with it more.
> I think we should eventually establish this (if alias sets are refined
> from copmile time to WPA it is either wrong code issue or frontend alias
> sets are not as good as they should be), but of course there are fun
> issues.  My plan is to see if I can identify some wrong code bugs and
> leave rest for early next stage1.

So do we want to actually compute alias sets and stream them,
"freeing up" TYPE_CANONICAL again?  We're sort-of taking it away
from FEs which use it before and recompute it possibly in different
ways ...

Richard.

>   PR bootstrap/97857
>   * ipa-devirt.c (odr_based_tbaa_p): Do not ICE when
>   odr_hash is not initialized.
>   * ipa-utils.h (type_with_linkage_p): Do not sanity check
>   CXX_ODR_P.
>   * lto-common.c (gimple_register_canonical_type_1): Only
>   register types with TYPE_CXX_ODR_P flag; sanity check that no
>   conflict happens at ltrans time.
>   * tree-streamer-out.c (pack_ts_type_common_value_fields): Set
>   CXX_ODR_P according to the canonical type.
> diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
> index 067ed5ba073..6e6df0b2af5 100644
> --- a/gcc/ipa-devirt.c
> +++ b/gcc/ipa-devirt.c
> @@ -2032,6 +2032,8 @@ odr_based_tbaa_p (const_tree type)
>  {
>if (!RECORD_OR_UNION_TYPE_P (type))
>  return false;
> +  if (!odr_hash)
> +return false;
>odr_type t = get_odr_type (const_cast  (type), false);
>if (!t || !t->tbaa_enabled)
>  return false;
> diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
> index 880e527c590..91571d8e82a 100644
> --- a/gcc/ipa-utils.h
> +++ b/gcc/ipa-utils.h
> @@ -211,8 +211,6 @@ type_with_linkage_p (const_tree t)
>if (!TYPE_CONTEXT (t))
>  return false;
>  
> -  gcc_checking_assert (TREE_CODE (t) == ENUMERAL_TYPE || TYPE_CXX_ODR_P (t));
> -
>return true;
>  }
>  
> diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
> index 6944c469f89..0a3033c3695 100644
> --- a/gcc/lto/lto-common.c
> +++ b/gcc/lto/lto-common.c
> @@ -415,8 +415,8 @@ gimple_register_canonical_type_1 (tree t, hashval_t hash)
>   that we can use to lookup structurally equivalent non-ODR type.
>

Re: [PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-17 Thread Richard Sandiford via Gcc-patches

Richard Biener via Gcc-patches  writes:
> On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> We're now able to vectorise the set-up loop:
>>
>>   int p = power2 (fns[i].po2);
>>   for (int j = 0; j < N; j++)
>> a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
>>
>> Rather than adjust the expected output for that, it seemed better
>> to disable optimisation for the testing code.
>>
>> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
>> and x86_64-linux-gnu.  OK to install?
>
> In other places we just add a asm ("" : : : "memory") to the loop body, can 
> you
> do it like htat?

I wondered about that, but I don't think it's reliable long-term.
We could (perhaps rightly) decide that it's a win to vectorise the
rhs of a[j] even if the asm prevents us from doing a vector store.

Thanks,
Richard

>
> Thanks,
> RIchard.
>
>> Richard
>>
>>
>> gcc/testsuite/
>> * gcc.dg/vect/vect-sdiv-pow2-1.c (main): Disable optimization.
>> ---
>>  gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
>> b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
>> index be70bc6c47e..bf387133d01 100644
>> --- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
>> +++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
>> @@ -53,7 +53,7 @@ power2 (int x)
>>
>>  #define N 50
>>
>> -int
>> +int __attribute__ ((optimize (0)))
>>  main (void)
>>  {
>>int a[N], b[N], c[N];

Re: [PATCH] Add MODE_OPAQUE

2020-11-17 Thread Richard Sandiford via Gcc-patches

acsaw...@linux.ibm.com writes:
> From: Aaron Sawdey 
>
> Richard,
>   Thanks for the review. I think I have resolved everything, as follows:
>
> * I was able to remove the const_tiny_rtx initialization for
> MODE_OPAQUE.  If that becomes a problem it's a pretty simple matter to
> use an UNSPEC to assign a constant to an opaque mode if necessary. The
> whole point of this exercise was not to have this thing treated as an
> integral type so I think it's best to leave this out if at all
> possible.

OK, sounds good.

> * I ended up adding a precision to opaque after I had put in that hack
> in get_nonzero_bits(). Now that it has a precision (equal to bitsize
> as you say) this is no longer needed. The underlying problem there was
> that without a precision, you ended up returning wi::shwi(-1,0) which
> did not get treated as -1.

OK.

> * I have documented OPAQUE_TYPE in generic.texi and MODE_OPAQUE in
> rtl.texi.
>
> OK for trunk if bootstrap/regtest passes on x86_64 and ppc64le?
>
> Thanks,
> Aaron
>
> gcc/ChangeLog
>   PR target/96791
>   * mode-classes.def: Add MODE_OPAQUE.
>   * machmode.def: Add OPAQUE_MODE.
>   * tree.def: Add OPAQUE_TYPE for types that will use
>   MODE_OPAQUE.
>   * doc/generic.texi: Document OPAQUE_TYPE.
>   * doc/rtl.texi: Document MODE_OPAQUE.
>   * machmode.h: Add OPAQUE_MODE_P().
>   * genmodes.c (complete_mode): Add MODE_OPAQUE.
>   (opaque_mode): New function.
>   * tree.c (tree_code_size): Add OPAQUE_TYPE.
>   * tree.h: Add OPAQUE_TYPE_P().
>   * stor-layout.c (int_mode_for_mode): Treat MODE_OPAQUE modes
>   like BLKmode.
>   * ira.c (find_moveable_pseudos): Treat MODE_OPAQUE modes more
>   like integer/float modes here.
>   * dbxout.c (dbxout_type): Treat OPAQUE_TYPE like VOID_TYPE.
>   * tree-pretty-print.c (dump_generic_node): Treat OPAQUE_TYPE
>   like like other types.
> ---
>  gcc/dbxout.c|  1 +
>  gcc/doc/generic.texi|  8 
>  gcc/doc/rtl.texi|  6 ++
>  gcc/genmodes.c  | 22 ++
>  gcc/ira.c   |  4 +++-
>  gcc/machmode.def|  3 +++
>  gcc/machmode.h  |  4 
>  gcc/mode-classes.def|  3 ++-
>  gcc/stor-layout.c   |  3 +++
>  gcc/tree-pretty-print.c |  1 +
>  gcc/tree.c  |  1 +
>  gcc/tree.def|  6 ++
>  gcc/tree.h  |  3 +++
>  13 files changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/dbxout.c b/gcc/dbxout.c
> index 5a20fdecdcc..eaee2f19ce0 100644
> --- a/gcc/dbxout.c
> +++ b/gcc/dbxout.c
> @@ -1963,6 +1963,7 @@ dbxout_type (tree type, int full)
>  case VOID_TYPE:
>  case NULLPTR_TYPE:
>  case LANG_TYPE:
> +case OPAQUE_TYPE:
>/* For a void type, just define it as itself; i.e., "5=5".
>This makes us consider it defined
>without saying what it is.  The debugger will make it
> diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
> index 7373266c69f..7e7b74c6c8b 100644
> --- a/gcc/doc/generic.texi
> +++ b/gcc/doc/generic.texi
> @@ -302,6 +302,7 @@ The elements are indexed from zero.
>  @tindex ARRAY_TYPE
>  @tindex RECORD_TYPE
>  @tindex UNION_TYPE
> +@tindex OPAQUE_TYPE
>  @tindex UNKNOWN_TYPE
>  @tindex OFFSET_TYPE
>  @findex TYPE_UNQUALIFIED
> @@ -487,6 +488,13 @@ assigned to that constant.  These constants will appear 
> in the order in
>  which they were declared.  The @code{TREE_TYPE} of each of these
>  constants will be the type of enumeration type itself.
>  
> +@item OPAQUE_TYPE
> +Used for things that use a @code{MODE_OPAQUE} mode class in the

Maybe s/use/have/?  Just a suggestion though -- it's ok either way.

> +backend. Opaque types have a size and precision, and can be held in
> +memory or registers. They are used when we do not want the compiler to
> +make assumptions about the availability of other operations as would
> +happen with integer types.
> +
>  @item BOOLEAN_TYPE
>  Used to represent the @code{bool} type.
>  
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 22af5731bb6..cf892d425a2 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -1406,6 +1406,12 @@ Pointer bounds modes.  Used to represent values of 
> pointer bounds type.
>  Operations in these modes may be executed as NOPs depending on hardware
>  features and environment setup.
>  
> +@findex MODE_OPAQUE
> +@item MODE_OPAQUE
> +This is a mode class for modes that don't want to provide operations
> +other than moves between registers/memory. They have a size and

How about “other than register moves, memory moves, loads, stores and
@code{unspec}s?”.  I don't think there's anything stopping us using
unspecs for these modes, and it wasn't clear to me which combinations
were included in “moves between registers/memory”.

OK with those changes if you agree.

Thanks,
Richard

> +precision and that's all.
> +
>  @findex MODE_RANDOM
>  @item MODE_RANDOM
>  This is a catchall mode class for modes which don't

Re: [PATCH 5/5] testsuite: Adjust bb-slp-pr68892.c for AArch64

2020-11-17 Thread Richard Biener via Gcc-patches

On Tue, Nov 17, 2020 at 12:29 PM Richard Sandiford via Gcc-patches
 wrote:
>
> AArch64 passes the "not profitable" test because it treats vec_construct
> as having a high-enough cost.  This means that we can try other vector
> modes, which in turn causes "BB vectorization with gaps at the end of
> a load is not supported" to be printed more than once.  The number of
> times that we print the message doesn't seem important, so the patch
> converts it to a plain scan-tree-dump.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/bb-slp-pr68892.c: Don't XFAIL the profitability
> test for aarch64*-*-*.  Allow the "BB vectorization with gaps"
> message to be printed more than once.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
> index 8cd3a6a1274..e9909cf0dfa 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
> @@ -15,6 +15,6 @@ void foo(void)
>
>  /* ???  Due to the gaps we fall back to scalar loads which makes the
> vectorization profitable.  */
> -/* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail *-*-* } } } */
> -/* { dg-final { scan-tree-dump-times "BB vectorization with gaps at the end 
> of a load is not supported" 1 "slp2" } } */
> -/* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 1 
> "slp2" } } */
> +/* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail { ! 
> aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a 
> load is not supported" "slp2" } } */
> +/* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 1 
> "slp2" { xfail aarch64*-*-* } } } */

Re: [PATCH 4/5] testsuite: Adjust gcc.dg/vect/slp-21.c for Arm targets

2020-11-17 Thread Richard Biener via Gcc-patches

On Tue, Nov 17, 2020 at 12:29 PM Richard Sandiford via Gcc-patches
 wrote:
>
> On arm* and aarch64* targets, we can vectorise the second of the main
> loops using SLP, not just the third.  As the comments say, whether this
> is supported depends on a very specific permutation, so it seemed better
> to use direct target selectors.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/slp-21.c: Expect 4 SLP instances to be vectorized
> on arm* and aarch64* targets.
> ---
>  gcc/testsuite/gcc.dg/vect/slp-21.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-21.c 
> b/gcc/testsuite/gcc.dg/vect/slp-21.c
> index 1f8c82e8ba8..117d65c5ddb 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-21.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-21.c
> @@ -201,6 +201,16 @@ int main (void)
>
>  /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect"  { target 
> { vect_strided4 || vect_extract_even_odd } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
>  { ! { vect_strided4 || vect_extract_even_odd } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target vect_strided4 } } } */
> +/* Some targets can vectorize the second of the three main loops using
> +   hybrid SLP.  For 128-bit vectors, the required 4->3 permutations are:
> +
> +   { 0, 1, 2, 4, 5, 6, 8, 9 }
> +   { 2, 4, 5, 6, 8, 9, 10, 12 }
> +   { 5, 6, 8, 9, 10, 12, 13, 14 }
> +
> +   Not all vect_perm targets support that, and it's a bit too specific to 
> have
> +   its own effective-target selector, so we just test targets directly.  */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { target { aarch64*-*-* arm*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_strided4 && { ! { aarch64*-*-* arm*-*-* } } } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect"  
> { target { ! { vect_strided4 } } } } } */
>

Re: [PATCH 3/5] testsuite: Add vect_perm3_int guards

2020-11-17 Thread Richard Biener via Gcc-patches

On Tue, Nov 17, 2020 at 12:25 PM Richard Sandiford via Gcc-patches
 wrote:
>
> SLP vectorisation of gcc.dg/vect/fast-math-vect-call-1.c involves
> a group of 3 floats, which requires the same permutation as
> vect_perm3_int.
>
> The load/store_lanes XFAILs in gcc.dg/vect/slp-perm-6.c implicitly
> assumed vect_perm3_int, which is true for Advanced SIMD but not for
> VLA SVE.  Whether it's true for fixed-length SVE depends on the
> vector length.
>
> The xfail selector applies on top of the target selector, so it's
> not necessary to make the xfail selector a strict subset of the
> target selector.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/fast-math-vect-call-1.c: Only expect SLP to be used
> on vect_perm3_int targets.
> * gcc.dg/vect/slp-perm-6.c: Likewise.  Only XFAIL the LOAD/STORE_LANES
> tests on vect_perm3_int targets.
> ---
>  gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c | 2 +-
>  gcc/testsuite/gcc.dg/vect/slp-perm-6.c| 8 
>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c 
> b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
> index 877de4eb5be..495c0319c9d 100644
> --- a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
> @@ -97,4 +97,4 @@ main ()
>  }
>
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" { target 
> { vect_call_copysignf && vect_call_sqrtf } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> { target { vect_call_copysignf && vect_call_sqrtf } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> { target { { vect_call_copysignf && vect_call_sqrtf } && vect_perm3_int } } } 
> } */
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c 
> b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
> index cc863de76bf..5f121b52ffb 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
> @@ -106,7 +106,7 @@ int main (int argc, const char* argv[])
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_perm3_int && { {! vect_load_lanes } && {! 
> vect_partial_vectors_usage_1 } } } } } } */
>  /* The epilogues are vectorized using partial vectors.  */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { target { vect_perm3_int && { {! vect_load_lanes } && 
> vect_partial_vectors_usage_1 } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target vect_load_lanes } } } */
> -/* { dg-final { scan-tree-dump "Built SLP cancelled: can use 
> load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } xfail 
> { vect_perm3_int && vect_load_lanes } } } } */
> -/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target { 
> vect_load_lanes } xfail { vect_load_lanes } } } } */
> -/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target { 
> vect_load_lanes } xfail { vect_load_lanes } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_perm3_int && vect_load_lanes } } } } */
> +/* { dg-final { scan-tree-dump "Built SLP cancelled: can use 
> load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } xfail 
> *-*-* } } } */
> +/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target vect_load_lanes 
> xfail vect_perm3_int } } } */
> +/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target vect_load_lanes 
> xfail vect_perm3_int } } } */

Re: [PATCH 2/5] testsuite: Add a vect_partial_vectors_usage_2 guard

2020-11-17 Thread Richard Biener via Gcc-patches

On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
 wrote:
>
> We don't need an epilogue loop if the main loop can operate on
> partial vectors, so this patch disables an associated test.
> The alternative would be to force partial-vectors-usage=1
> on the command line.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/vect-epilogues.c: XFAIL test for epilogue loop
> vectorization if vect_partial_vectors_usage_2.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-epilogues.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c 
> b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
> index a146bb6518a..ab7e8a1a759 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
> @@ -16,4 +16,4 @@ void pixel_avg( unsigned char *dst, int i_dst_stride,
>   }
>   }
>
> -/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { target 
> vect_multiple_sizes xfail { arm32 && be } } } }  */
> +/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { target 
> vect_multiple_sizes xfail { { arm32 && be } || vect_partial_vectors_usage_2 } 
> } } } */

Re: [PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-17 Thread Richard Biener via Gcc-patches

On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
 wrote:
>
> We're now able to vectorise the set-up loop:
>
>   int p = power2 (fns[i].po2);
>   for (int j = 0; j < N; j++)
> a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
>
> Rather than adjust the expected output for that, it seemed better
> to disable optimisation for the testing code.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

In other places we just add a asm ("" : : : "memory") to the loop body, can you
do it like htat?

Thanks,
RIchard.

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/vect-sdiv-pow2-1.c (main): Disable optimization.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> index be70bc6c47e..bf387133d01 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> @@ -53,7 +53,7 @@ power2 (int x)
>
>  #define N 50
>
> -int
> +int __attribute__ ((optimize (0)))
>  main (void)
>  {
>int a[N], b[N], c[N];

Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Jan Hubicka

Hi,
this patch fixes profiledbootstrap failure with LTO enabled.
What happens is that alias_ptr_types_compatible_p relies on the
fact that alias sets are not refined from WPA to ltrans time:

/* This function originally abstracts from simply comparing
   get_deref_alias_set so that we are sure this still computes
   the same result after LTO type merging is applied.
   When in LTO type merging is done we can actually do this compare.
*/
  if (in_lto_p)
return get_deref_alias_set (t1) == get_deref_alias_set (t2);
  else
return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
== TYPE_MAIN_VARIANT (TREE_TYPE (t2)));

This conditional is confused - it pesimizes code with -fno-lto
for no good reason. I will fix that separately: we now have
lto_streaming_expected_p so I think it should read

 if (!lto_stremaing_expected_p () || flag_wpa)
   use alias sets
 else
   use main varaiants as conservative estimate.

(so if we ever get idea to ICF during incremental link or deduplicate in
early passes, things will work safely).  I will send separate patch on
this.


Not refining alias sets from WPA to ltrans time is a good invariant to
maintain and the canonical type hash behaves this way.  However I broke
this with the ODR logic.

Normally we define canonical types for C++ ODR types according to their
type names.  However to make ODR types compatible with C types we check
if structurally equivalent C type exists and if so, we ignore ODR
names giving up on the precision.

This however is not stable between WPA and ltrans since at ltrans the
type merging does not see as many types as WPA does.  To make this
consistent the patch makes WPA ODR_TYPE_P == 0 for ODR types that
conflicted with non-ODR type.

I had to drop one sanity check in ipa-utils.h (that I think is not very
important - I added it while introducing CXX_ODR_P machinery) and also
it now may happen that we query odr_based_tbaa_p before registering
first ODR type so we do not want to ICE here.
ODR type registration happens early to produce ODR violation warings.
Those are not done at ltrans, so dropping the registration is safe. The
type will still be added while computing the type inheritance graph if
needed for devirtualization (and late devirtualization is not very
useful anyway since it won't enable inlining).

Bootstrapped, regtested x86_64-linux, OK?
I think this should go to release branches after some soaking in
mainline too, even if we don't have direct reproducers.

Note that Martin implemented type checker to sanity check that alias
sets are never getting refined from compile to WPa and from WPA to
ltrans.  I recovered the patch and will play with it more.
I think we should eventually establish this (if alias sets are refined
from copmile time to WPA it is either wrong code issue or frontend alias
sets are not as good as they should be), but of course there are fun
issues.  My plan is to see if I can identify some wrong code bugs and
leave rest for early next stage1.

PR bootstrap/97857
* ipa-devirt.c (odr_based_tbaa_p): Do not ICE when
odr_hash is not initialized.
* ipa-utils.h (type_with_linkage_p): Do not sanity check
CXX_ODR_P.
* lto-common.c (gimple_register_canonical_type_1): Only
register types with TYPE_CXX_ODR_P flag; sanity check that no
conflict happens at ltrans time.
* tree-streamer-out.c (pack_ts_type_common_value_fields): Set
CXX_ODR_P according to the canonical type.
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 067ed5ba073..6e6df0b2af5 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -2032,6 +2032,8 @@ odr_based_tbaa_p (const_tree type)
 {
   if (!RECORD_OR_UNION_TYPE_P (type))
 return false;
+  if (!odr_hash)
+return false;
   odr_type t = get_odr_type (const_cast  (type), false);
   if (!t || !t->tbaa_enabled)
 return false;
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index 880e527c590..91571d8e82a 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -211,8 +211,6 @@ type_with_linkage_p (const_tree t)
   if (!TYPE_CONTEXT (t))
 return false;
 
-  gcc_checking_assert (TREE_CODE (t) == ENUMERAL_TYPE || TYPE_CXX_ODR_P (t));
-
   return true;
 }
 
diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index 6944c469f89..0a3033c3695 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -415,8 +415,8 @@ gimple_register_canonical_type_1 (tree t, hashval_t hash)
  that we can use to lookup structurally equivalent non-ODR type.
  In case we decide to treat type as unique ODR type we recompute hash based
  on name and let TBAA machinery know about our decision.  */
-  if (RECORD_OR_UNION_TYPE_P (t)
-  && odr_type_p (t) && !odr_type_violation_reported_p (t))
+  if (RECORD_OR_UNION_TYPE_P (t) && odr_type_p (t)
+  && TYPE_CXX_ODR_P (t) && !odr_type_violation_reported_p (t))
 {
   /* Anonymous namespace types never conflict with non-C++ types.  */
   if

1 2 >

1 - 100 of 134 matches

Mail list logo