date:20161107

Re: [PATCH] Add -fsanitize=shift{,-base,-exponent} support

2016-11-07 Thread Richard Biener

On Tue, 8 Nov 2016, Jakub Jelinek wrote:

> Hi!
> 
> In the libsanitizer merge Maxim posted I've noticed that llvm split
> the -fsanitize=shift option into suboption (at least one really weirdly
> named, I have never seen the shift count being called exponent), this
> patch implements the same with the same option names for gcc.
> The interesting part is if the two suboptions disagree on
> -fsanitize-recover, then we need to emit two checks and two calls (one with
> _abort, one without).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> 2016-11-08  Jakub Jelinek  
> 
>   * flag-types.h (enum sanitize_code): Add SANITIZE_SHIFT_BASE
>   and SANITIZE_SHIFT_EXPONENT, change SANITIZE_SHIFT to bitwise
>   or of them, renumber other enumerators.
>   * opts.c (sanitizer_opts): Add shift-base and shift-exponent.
>   * doc/invoke.texi: Document -fsanitize=shift-base and
>   -fsanitize=shift-exponent, document -fsanitize=shift as
>   having those 2 suboptions.
> c-family/
>   * c-ubsan.c (ubsan_instrument_shift): Handle split
>   -fsanitize=shift-base and -fsanitize=shift-exponent.
> testsuite/
>   * gcc.dg/ubsan/c99-shift-3.c: New test.
>   * gcc.dg/ubsan/c99-shift-4.c: New test.
>   * gcc.dg/ubsan/c99-shift-5.c: New test.
>   * gcc.dg/ubsan/c99-shift-6.c: New test.
> 
> --- gcc/flag-types.h.jj   2016-11-07 14:04:29.129755670 +0100
> +++ gcc/flag-types.h  2016-11-07 18:50:54.989652650 +0100
> @@ -211,24 +211,26 @@ enum sanitize_code {
>    /* LeakSanitizer.  */
>    SANITIZE_LEAK = 1UL << 4,
>    /* UndefinedBehaviorSanitizer.  */
> -  SANITIZE_SHIFT = 1UL << 5,
> -  SANITIZE_DIVIDE = 1UL << 6,
> -  SANITIZE_UNREACHABLE = 1UL << 7,
> -  SANITIZE_VLA = 1UL << 8,
> -  SANITIZE_NULL = 1UL << 9,
> -  SANITIZE_RETURN = 1UL << 10,
> -  SANITIZE_SI_OVERFLOW = 1UL << 11,
> -  SANITIZE_BOOL = 1UL << 12,
> -  SANITIZE_ENUM = 1UL << 13,
> -  SANITIZE_FLOAT_DIVIDE = 1UL << 14,
> -  SANITIZE_FLOAT_CAST = 1UL << 15,
> -  SANITIZE_BOUNDS = 1UL << 16,
> -  SANITIZE_ALIGNMENT = 1UL << 17,
> -  SANITIZE_NONNULL_ATTRIBUTE = 1UL << 18,
> -  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 19,
> -  SANITIZE_OBJECT_SIZE = 1UL << 20,
> -  SANITIZE_VPTR = 1UL << 21,
> -  SANITIZE_BOUNDS_STRICT = 1UL << 22,
> +  SANITIZE_SHIFT_BASE = 1UL << 5,
> +  SANITIZE_SHIFT_EXPONENT = 1UL << 6,
> +  SANITIZE_DIVIDE = 1UL << 7,
> +  SANITIZE_UNREACHABLE = 1UL << 8,
> +  SANITIZE_VLA = 1UL << 9,
> +  SANITIZE_NULL = 1UL << 10,
> +  SANITIZE_RETURN = 1UL << 11,
> +  SANITIZE_SI_OVERFLOW = 1UL << 12,
> +  SANITIZE_BOOL = 1UL << 13,
> +  SANITIZE_ENUM = 1UL << 14,
> +  SANITIZE_FLOAT_DIVIDE = 1UL << 15,
> +  SANITIZE_FLOAT_CAST = 1UL << 16,
> +  SANITIZE_BOUNDS = 1UL << 17,
> +  SANITIZE_ALIGNMENT = 1UL << 18,
> +  SANITIZE_NONNULL_ATTRIBUTE = 1UL << 19,
> +  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 20,
> +  SANITIZE_OBJECT_SIZE = 1UL << 21,
> +  SANITIZE_VPTR = 1UL << 22,
> +  SANITIZE_BOUNDS_STRICT = 1UL << 23,
> +  SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
>    SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
>    | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
>    | SANITIZE_SI_OVERFLOW | SANITIZE_BOOL | SANITIZE_ENUM
> --- gcc/opts.c.jj 2016-11-07 14:04:29.131755645 +0100
> +++ gcc/opts.c  2016-11-07 18:50:54.990652637 +0100
> @@ -1477,6 +1477,8 @@ const struct sanitizer_opts_s sanitizer_
>    SANITIZER_OPT (thread, SANITIZE_THREAD, false),
>    SANITIZER_OPT (leak, SANITIZE_LEAK, false),
>    SANITIZER_OPT (shift, SANITIZE_SHIFT, true),
> +  SANITIZER_OPT (shift-base, SANITIZE_SHIFT_BASE, true),
> +  SANITIZER_OPT (shift-exponent, SANITIZE_SHIFT_EXPONENT, true),
>    SANITIZER_OPT (integer-divide-by-zero, SANITIZE_DIVIDE, true),
>    SANITIZER_OPT (undefined, SANITIZE_UNDEFINED, true),
>    SANITIZER_OPT (unreachable, SANITIZE_UNREACHABLE, false),
> --- gcc/doc/invoke.texi.jj  2016-11-07 14:04:28.853759109 +0100
> +++ gcc/doc/invoke.texi   2016-11-07 18:50:54.998652536 +0100
> @@ -10555,6 +10555,21 @@ at runtime.  Current suboptions are:
>  This option enables checking that the result of a shift operation is
>  not undefined.  Note that what exactly is considered undefined differs
>  slightly between C and C++, as well as between ISO C90 and C99, etc.
> +This option has two suboptions, @option{-fsanitize=shift-base} and
> +@option{-fsanitize=shift-exponent}.
> +
> +@item -fsanitize=shift-exponent
> +@opindex fsanitize=shift-exponent
> +This option enables checking that the second argument of a shift operation
> +is not negative and is smaller than the precision of the promoted first
> +argument.
> +
> +@item -fsanitize=shift-base
> +@opindex fsanitize=shift-base
> +If the second argument of a shift operation is within range, check that the
> +result of a shift operation is not undefined.  Note that what exactly is
> +considered

Re: [PATCH] A special predicate for type size equality

2016-11-07 Thread Richard Biener

On Mon, 7 Nov 2016, Martin Jambor wrote:

> Hi,
> 
> this has been in my TODO list for at least two years, probably longer,
> although I do no longer remember why I added it there.  The idea is to
> introduce a special wrapper around operand_equal_p for TYPE_SIZE
> comparisons, which would try simple pointer equality before calling more
> complex operand_equal_p (TYPE_SIZE (t1), TYPE_SIZE (t2), 0), because
> when equal, the sizes are most likely going to be the same tree anyway.
> 
> All users also test whether both TYPE_SIZEs are NULL, most of them to
> test for known size equality, but unfortunately there is one (ODR
> warning) that tests for known inequality.  Nevertheless, the former use
> case seems so much more natural that I have outlined it into the new
> predicate as well.

But I think this really asks for a tri-state, known equal, known unequal
and unknown (both NULL_TREE).  Also the checking asserts are redundant
with the tree checking done by the TYPE_SIZE accessor.

Richard.
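
Editor's note: the tri-state result suggested above could look roughly like
the sketch below. It is an illustration only, not code from the patch:
SizeCmp, SizeExpr, sizes_structurally_equal and compare_type_sizes are
hypothetical stand-ins (GCC's real tree type and operand_equal_p are not
used), and treating a missing size on either side as "unknown" is an
assumption.

```cpp
// Three possible answers instead of a single bool.
enum class SizeCmp { Equal, Unequal, Unknown };

// Hypothetical stand-in for a size expression; not GCC's `tree`.
struct SizeExpr { long value; };

// Hypothetical stand-in for the more expensive structural comparison
// that operand_equal_p (s1, s2, 0) performs in the real compiler.
static bool sizes_structurally_equal(const SizeExpr *a, const SizeExpr *b)
{
  return a->value == b->value;
}

SizeCmp compare_type_sizes(const SizeExpr *s1, const SizeExpr *s2)
{
  // Assumption: a missing size on either side means nothing is known.
  if (!s1 || !s2)
    return SizeCmp::Unknown;
  // Cheap pointer test first: equal sizes are usually the same node.
  if (s1 == s2)
    return SizeCmp::Equal;
  return sizes_structurally_equal(s1, s2) ? SizeCmp::Equal
                                          : SizeCmp::Unequal;
}
```

With such a result, callers that need known equality would test for Equal,
while the ODR warning, which needs known inequality, would test for Unequal.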

> I am no longer sure whether it is a scenario that happens so often to
> justify a wrapper, but I'd like to propose it anyway, at least to remove
> it from the TODO list as a not-so-good-idea-after-all :-)
> 
> Bootstrapped and tested on x86_64-linux.  Is it a good idea?  OK for
> trunk?
> 
> Thanks,
> 
> Martin
> 
> 2016-11-03  Martin Jambor  
> 
>   * fold-const.c (type_sizes_equal_p): New function.
>   * fold-const.h (type_sizes_equal_p): Declare.
>   * ipa-devirt.c (odr_types_equivalent_p): Use it.
>   * ipa-polymorphic-call.c (meet_with): Likewise.
>   * tree-ssa-alias.c (stmt_kills_ref_p): Likewise.
> ---
>  gcc/fold-const.c   | 19 +++
>  gcc/fold-const.h   |  1 +
>  gcc/ipa-devirt.c   |  2 +-
>  gcc/ipa-polymorphic-call.c | 10 ++
>  gcc/tree-ssa-alias.c   |  7 +--
>  5 files changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 603aff0..ab77b8d 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -3342,6 +3342,25 @@ operand_equal_for_comparison_p (tree arg0, tree arg1, tree other)
>  
>    return 0;
>  }
> +
> +/* Given two types, return true if both have a non-NULL TYPE_SIZE and these
> +   sizes have the same value.  */
> +
> +bool
> +type_sizes_equal_p (const_tree t1, const_tree t2)
> +{
> +  gcc_checking_assert (TYPE_P (t1));
> +  gcc_checking_assert (TYPE_P (t2));
> +  t1 = TYPE_SIZE (t1);
> +  t2 = TYPE_SIZE (t2);
> +
> +  if (!t1 || !t2)
> +    return false;
> +  else if (t1 == t2)
> +    return true;
> +  else
> +    return operand_equal_p (t1, t2, 0);
> +}
>  
>  /* See if ARG is an expression that is either a comparison or is performing
> arithmetic on comparisons.  The comparisons must only be comparing
> diff --git a/gcc/fold-const.h b/gcc/fold-const.h
> index ae37142..014ca34 100644
> --- a/gcc/fold-const.h
> +++ b/gcc/fold-const.h
> @@ -89,6 +89,7 @@ extern void fold_undefer_and_ignore_overflow_warnings (void);
>  extern bool fold_deferring_overflow_warnings_p (void);
>  extern void fold_overflow_warning (const char*, enum warn_strict_overflow_code);
>  extern int operand_equal_p (const_tree, const_tree, unsigned int);
> +extern bool type_sizes_equal_p (const_tree, const_tree);
>  extern int multiple_of_p (tree, const_tree, const_tree);
>  #define omit_one_operand(T1,T2,T3)\
> omit_one_operand_loc (UNKNOWN_LOCATION, T1, T2, T3)
> diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
> index 49e2195..d2db6f2 100644
> --- a/gcc/ipa-devirt.c
> +++ b/gcc/ipa-devirt.c
> @@ -1671,7 +1671,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, bool *warned,
>  
>    /* Those are better to come last as they are utterly uninformative.  */
>    if (TYPE_SIZE (t1) && TYPE_SIZE (t2)
> -      && !operand_equal_p (TYPE_SIZE (t1), TYPE_SIZE (t2), 0))
> +      && !type_sizes_equal_p (t1, t2))
>      {
>        warn_odr (t1, t2, NULL, NULL, warn, warned,
>                  G_("a type with different size "
> diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
> index 8d9f22a..b66fd76 100644
> --- a/gcc/ipa-polymorphic-call.c
> +++ b/gcc/ipa-polymorphic-call.c
> @@ -2454,10 +2454,7 @@ ipa_polymorphic_call_context::meet_with 
> (ipa_polymorphic_call_context ctx,
>if (!dynamic
> && (ctx.dynamic
> || (!otr_type
> -   && (!TYPE_SIZE (ctx.outer_type)
> -   || !TYPE_SIZE (outer_type)
> -   || !operand_equal_p (TYPE_SIZE (ctx.outer_type),
> -TYPE_SIZE (outer_type), 0)
> +   && (!type_sizes_equal_p (ctx.outer_type, outer_type)
>   {
> dynamic = true;
> updated = true;
> @@ -2472,10 +2469,7 @@ ipa_polymorphic_call_context::meet_with 
> (ipa_polymorphic_call_context ctx,
>if (!dynamic
> && (ctx.dynamic
> || (!otr_type
> -   && (!TYPE_SIZE (ctx.outer_type)
> -

Re: [match.pd] Fix for PR35691

2016-11-07 Thread Richard Biener

On Mon, 7 Nov 2016, Prathamesh Kulkarni wrote:

> On 7 November 2016 at 23:06, Prathamesh Kulkarni
>  wrote:
> > On 7 November 2016 at 15:43, Richard Biener  wrote:
> >> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:
> >>
> >>> On 4 November 2016 at 13:41, Richard Biener  wrote:
> >>> > On Thu, 3 Nov 2016, Marc Glisse wrote:
> >>> >
> >>> >> On Thu, 3 Nov 2016, Richard Biener wrote:
> >>> >>
> >>> >> > > > > The transform would also work for vectors (element_precision 
> >>> >> > > > > for
> >>> >> > > > > the test but also a value-matching zero which should ensure the
> >>> >> > > > > same number of elements).
> >>> >> > > > Um sorry, I didn't get how to check vectors to be of equal 
> >>> >> > > > length by a
> >>> >> > > > matching zero.
> >>> >> > > > Could you please elaborate on that ?
> >>> >> > >
> >>> >> > > He may have meant something like:
> >>> >> > >
> >>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
> >>> >> >
> >>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which 
> >>> >> > was the
> >>> >> > point of the pattern.
> >>> >>
> >>> >> Oups, that's what I had written first, and then I somehow managed to 
> >>> >> confuse
> >>> >> myself enough to remove it so as to remove the call to types_match :-(
> >>> >>
> >>> >> > > So the last operand is checked with operand_equal_p instead of
> >>> >> > > integer_zerop. But the fact that we could compute bit_ior on the
> >>> >> > > comparison results should already imply that the number of 
> >>> >> > > elements is the
> >>> >> > > same.
> >>> >> >
> >>> >> > Though for equality compares we also allow scalar results IIRC.
> >>> >>
> >>> >> Oh, right, I keep forgetting that :-( And I have no idea how to 
> >>> >> generate one
> >>> >> for a testcase, at least until the GIMPLE FE lands...
> >>> >>
> >>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe 
> >>> >> > > some
> >>> >> > > vector mode on s390?), it would be cool to do the same for floats 
> >>> >> > > (most
> >>> >> > > likely at the RTL level).
> >>> >> >
> >>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or 
> >>> >> > we cab
> >>> >> > just allow bit-and/or on floats as much as we allow them on pointers.
> >>> >>
> >>> >> Would that generate sensible code on targets that do not have logic 
> >>> >> insns for
> >>> >> floats? Actually, even on x86_64 that generates inefficient code, so 
> >>> >> there
> >>> >> would be some work (for instance grep finds no gen_iordf3, only 
> >>> >> gen_iorv2df3).
> >>> >>
> >>> >> I am also a bit wary of doing those obfuscating optimizations too 
> >>> >> early...
> >>> >> a==0 is something that other optimizations might use. long
> >>> >> c=(long&)a|(long&)b; (double&)c==0; less so...
> >>> >>
> >>> >> (and I am assuming that signaling NaNs don't make the whole 
> >>> >> transformation
> >>> >> impossible, which might be wrong)
> >>> >
> >>> > Yeah.  I also think it's not so much important - I just wanted to 
> >>> > mention
> >>> > vectors...
> >>> >
> >>> > Btw, I still think we need a more sensible infrastructure for passes
> >>> > to gather, analyze and modify complex conditions.  (I'm always pointing
> >>> > to tree-affine.c as an, albeit not very good, example for handling
> >>> > a similar problem)
> >>> Thanks for mentioning the value-matching capture @@, I wasn't aware of
> >>> this match.pd feature.
> >>> The current patch keeps it restricted to only bitwise operators on 
> >>> integers.
> >>> Bootstrap+test running on x86_64-unknown-linux-gnu.
> >>> OK to commit if passes ?
> >>
> >> +/* PR35691: Transform
> >> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
> >> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
> >> +
> >>
> >> Please omit the vertical space
> >>
> >> +(for bitop (bit_and bit_ior)
> >> + cmp (eq ne)
> >> + (simplify
> >> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
> >>
> >> if you capture the first integer_zerop as @2 then you can re-use it...
> >>
> >> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> >> +   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
> >> +   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE
> >> (@1)))
> >> +(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));
> >>
> >> ... here inplace of the { build_zero_cst ... }.
> >>
> >> Ok with that changes.
> > Thanks, committed the attached version as r241915.
> ugh, the svn commit message has:
> 
> testsuite/
> * gcc.dg/pr35691-1.c: New test-case.
> * gcc.dg/pr35691-4.c: Likewise.
> 
> pr35691-4.c was a typo, should be pr35691-2.c :/
> However testsuite/ChangeLog correctly has entry for pr35691-2.c
> Is it possible to edit the commit message for r241915 ?
> Sorry about this.

No, just leave it as-is.

Richard.
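
Editor's note: the transform discussed in this thread,
(x == 0 & y == 0) -> (x | typeof(x)(y)) == 0 and
(x != 0 | y != 0) -> (x | typeof(x)(y)) != 0, can be checked exhaustively for
a small width. The sketch below is not part of the patch; transforms_agree is
a hypothetical helper, and 8-bit operands of differing signedness are chosen
only to keep the search space small while still exercising the conversion.

```cpp
#include <cstdint>

// Returns true if both rewritten forms agree with the original forms for
// every pair of an unsigned 8-bit x and a signed 8-bit y.
bool transforms_agree()
{
  for (int xi = 0; xi <= 255; ++xi)
    for (int yi = -128; yi <= 127; ++yi)
      {
        const std::uint8_t x = static_cast<std::uint8_t>(xi);
        const std::int8_t y = static_cast<std::int8_t>(yi);

        // x | typeof(x)(y): convert y to x's type, then bitwise-or.
        const std::uint8_t merged =
          static_cast<std::uint8_t>(x | static_cast<std::uint8_t>(y));

        const bool and_before = (x == 0) & (y == 0);
        const bool or_before = (x != 0) | (y != 0);

        if (and_before != (merged == 0))
          return false;
        if (or_before != (merged != 0))
          return false;
      }
  return true;
}
```

The equal-precision requirement in the pattern matters because a narrowing
conversion of y could discard set bits and make a non-zero y look like zero.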

> Regards,
> Prathamesh
> >
> >>
> >> Richard.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,

[patch, avr] Add flash size to device info and make wrap around default (was: Re: [patch, avr] Make pmem-wrap-around option as default)

2016-11-07 Thread Pitchumani Sivanupandi


On Thursday 03 November 2016 06:19 PM, Georg-Johann Lay wrote:
> On 03.11.2016 08:58, Pitchumani Sivanupandi wrote:
>> Most of the AVR's 8k memory devices have only the rjmp instruction, not
>> jmp. So, it is important to wrap around the jump destination to check
>> if it can reach backwards.
>>
>> Currently the link specs pass --pmem-wrap-around=xxK when the mrelax and
>> mpmem-wrap-around options are enabled. The attached patch changes the
>> specs so that option --pmem-wrap-around=8K is passed for 8k memory
>> devices if -mno-pmem-wrap-around is not enabled.
>>
>> If OK, could someone commit please?
>>
>> Note: Currently 8k devices are identified based on name prefix. We are
>> working on an alternative method to incorporate flash memory size.
>
> Currently, "at90usb8" is the only prefix that results in wrap_k = 8,
> so without adding knowledge about flash sizes to the compiler, the
> change makes not much sense... (but does not do harm either).

There are ~40 devices with 8K flash. Current checks are out-dated.

> The right place for flash size info would be avr-mcus.def together
> with a new command option to specify the flash size and draw
> conclusions from it.  When I asked Atmel about a mapping of Device to
> flash size, the support told me to check the data sheets -- not really
> practical for ~300 devices in avr-mcus.def :-)  Atmel *must* have
> information about this...

I have updated the patch to include the flash size as well. Took that info
from device headers (it was fed into crt's device information note
section also).

> The new option would render -mn-flash superfluous, but we should keep
> it for backward compatibility.

Ok.

> Shouldn't link_pmem_wrap then be removed from link_relax, i.e. from
> LINK_RELAX_SPEC?  And what happens if relaxation is off?

Yes. Removed link_pmem_wrap from link_relax.
Disabling relaxation doesn't change -mpmem-wrap-around behavior.

Now, wrap around behavior is changed as follows:

For 8K flash devices:
Device specs adds --pmem-wrap-around=8k linker option if 
-mno-pmem-wrap-around is NOT enabled.
It makes the --pmem-wrap-around=8k linker option default for 8k flash 
devices.


For 16/32/64K flash devices:
Spec string 'link_pmem_wrap' added to all 16/32/64k flash devices specs.
Otherwise no changes, i.e. it adds the --pmem-wrap-around=16/32/64k option
if the -mpmem-wrap-around option is enabled.


For other devices, no changes in device specs.
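
Editor's note: the three cases above can be summarised as a small decision
function. This is a sketch of the described behaviour only, not the code in
gen-avr-mmcu-specs.c: WrapFlag and pmem_wrap_linker_option are hypothetical
names, and modelling the driver option as a three-way state (not given,
-mpmem-wrap-around, -mno-pmem-wrap-around) is an assumption.

```cpp
#include <string>

// Assumed model of the driver option state.
enum class WrapFlag { Unset, Enabled, Disabled };

// Returns the linker option that would be added for a device with the
// given flash size in bytes, or an empty string if none is added.
std::string pmem_wrap_linker_option(int flash_size, WrapFlag flag)
{
  const int KiB = 1024;

  // 8K devices: wrap-around is the default unless explicitly disabled.
  if (flash_size == 8 * KiB)
    return flag == WrapFlag::Disabled ? "" : "--pmem-wrap-around=8k";

  // 16/32/64K devices: only when -mpmem-wrap-around is given.
  if (flash_size == 16 * KiB || flash_size == 32 * KiB
      || flash_size == 64 * KiB)
    return flag == WrapFlag::Enabled
             ? "--pmem-wrap-around=" + std::to_string(flash_size / KiB) + "k"
             : "";

  // Other devices: no change.
  return "";
}
```

For example, an 8K device with no option given yields
--pmem-wrap-around=8k, while a 32K device yields nothing unless
-mpmem-wrap-around is passed.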

Reg tested with default and -mrelax options enabled. No issues.

If OK, could someone commit please?

Regards,
Pitchumani


gcc/ChangeLog

2016-11-08  Pitchumani Sivanupandi 

* config/avr/avr-arch.h (avr_mcu_t): Add flash_size member.
* config/avr/avr-devices.c (avr_mcu_types): Add flash size info.
* config/avr/avr-mcus.def: Likewise.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Remove hard-coded prefix
check to find wrap-around value, instead use MCU flash size. For 8k flash
devices, update link_pmem_wrap spec string to add --pmem-wrap-around=8k.
* config/avr/specs.h: Remove link_pmem_wrap from LINK_RELAX_SPEC and
add to linker specs (LINK_SPEC) directly.
diff --git a/gcc/config/avr/avr-arch.h b/gcc/config/avr/avr-arch.h
index 42eaee5..e0961d4 100644
--- a/gcc/config/avr/avr-arch.h
+++ b/gcc/config/avr/avr-arch.h
@@ -122,6 +122,9 @@ typedef struct
 
   /* Number of 64k segments in the flash.  */
   int n_flash;
+
+  /* Flash size in bytes.  */
+  int flash_size;
 } avr_mcu_t;
 
 /* AVR device specific features.
diff --git a/gcc/config/avr/avr-devices.c b/gcc/config/avr/avr-devices.c
index 7d13ba4..cef3b9a 100644
--- a/gcc/config/avr/avr-devices.c
+++ b/gcc/config/avr/avr-devices.c
@@ -111,12 +111,12 @@ avr_texinfo[] =
 const avr_mcu_t
 avr_mcu_types[] =
 {
-#define AVR_MCU(NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH)\
-  { NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH },
+#define AVR_MCU(NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH, FLASH_SIZE)\
+  { NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH, FLASH_SIZE },
 #include "avr-mcus.def"
 #undef AVR_MCU
 /* End of list.  */
-  { NULL, ARCH_UNKNOWN, AVR_ISA_NONE, NULL, 0, 0, 0 }
+  { NULL, ARCH_UNKNOWN, AVR_ISA_NONE, NULL, 0, 0, 0, 0 }
 };
 
 
diff --git a/gcc/config/avr/avr-mcus.def b/gcc/config/avr/avr-mcus.def
index 6bcc6ff..9d4aa1a 100644
--- a/gcc/config/avr/avr-mcus.def
+++ b/gcc/config/avr/avr-mcus.def
@@ -62,295 +62,297 @@
    N_FLASH       Number of 64 KiB flash segments, rounded up.  The default
                  value for -mn-flash=.
 
+   FLASH_SIZE    Flash size in bytes.
+
    "avr2" must be first for the "0" default to work as intended.  */
 
 /* Classic, <= 8K.  */
-AVR_MCU ("avr2", ARCH_AVR2, AVR_ERRATA_SKIP, NULL, 0x0060, 0x0, 6)
-AVR_MCU ("at90s2313",ARCH_AVR2, AVR_SHORT_SP, "__AVR_AT90S2313__", 0x0060, 0x0, 1)
-AVR_MCU ("at90s2323",ARCH_AVR2, AVR_SHORT_SP, "__AVR_AT90S2323__", 0x0060, 0x0, 1)
-AVR_MCU ("at90s2333",

[PATCH] Fix regex_iterator end() state and operator==()

2016-11-07 Thread Tim Shen

This fixes libstdc++/78236. I'm surprised that this bug was not
revealed until now :P.

Bootstrapped and tested under x86_64-linux-gnu.

I'm happy with however many backports.

-- 
Regards,
Tim Shen
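
Editor's note: the behaviour this patch is about can be exercised with the
standard interface alone. The sketch below is not the testcase from the
patch; count_matches and same_range_iterators_equal are hypothetical helpers.
It relies only on the standard guarantees that a default-constructed
regex_iterator is the end-of-sequence iterator and that an iterator holding a
match does not compare equal to it.

```cpp
#include <cstddef>
#include <cstring>
#include <regex>

// Counts matches by advancing until the iterator equals the
// default-constructed (end-of-sequence) iterator.
std::size_t count_matches(const char *text, const char *pattern)
{
  const std::regex re(pattern);
  std::cregex_iterator it(text, text + std::strlen(text), re);
  const std::cregex_iterator end;

  std::size_t n = 0;
  for (; !(it == end); ++it)
    ++n;
  return n;
}

// Two iterators built from the same range and the same regex object
// should compare equal.
bool same_range_iterators_equal(const char *text, const char *pattern)
{
  const std::regex re(pattern);
  const char *last = text + std::strlen(text);
  std::cregex_iterator i(text, last, re);
  std::cregex_iterator j(text, last, re);
  return i == j;
}
```

A search with no match at all leaves the iterator in the end state
immediately, so count_matches returns 0 in that case.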
commit 8aee66b743b5b0ef09cbc9587ebbacf6665ba0cb
Author: Tim Shen 
Date:   Mon Nov 7 21:50:49 2016 -0800

	* libstdc++-v3/include/bits/regex.h (regex_iterator::regex_iterator()):
	Define end() as _M_pregex == nullptr.
	* libstdc++-v3/include/bits/regex.tcc (regex_iterator::operator==(),
	regex_iterator::operator++()): Fix operator==() and operator++() to
	look at null-ness of _M_pregex on both sides.
	* testsuite/28_regex/regression.cc: New testcase.

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index a7d45e6..aadf312 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -2454,7 +2454,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* one-past-the-end of a range.
*/
   regex_iterator()
-  : _M_match()
+  : _M_pregex()
   { }
 
   /**
diff --git a/libstdc++-v3/include/bits/regex.tcc b/libstdc++-v3/include/bits/regex.tcc
index 4a3d7c3..3f8969d 100644
--- a/libstdc++-v3/include/bits/regex.tcc
+++ b/libstdc++-v3/include/bits/regex.tcc
@@ -496,12 +496,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 regex_iterator<_Bi_iter, _Ch_type, _Rx_traits>::
 operator==(const regex_iterator& __rhs) const
 {
-  return (_M_match.empty() && __rhs._M_match.empty())
-	|| (_M_begin == __rhs._M_begin
-	&& _M_end == __rhs._M_end
-	&& _M_pregex == __rhs._M_pregex
-	&& _M_flags == __rhs._M_flags
-	&& _M_match[0] == __rhs._M_match[0]);
+  if (_M_pregex == nullptr && __rhs._M_pregex == nullptr)
+	return true;
+  return _M_pregex == __rhs._M_pregex
+	  && _M_begin == __rhs._M_begin
+	  && _M_end == __rhs._M_end
+	  && _M_flags == __rhs._M_flags
+	  && _M_match[0] == __rhs._M_match[0];
 }
 
   template r("(f+)");
+  {
+std::cregex_iterator i(s, s+sizeof(s), r);
+std::cregex_iterator j(s, s+sizeof(s), r);
+VERIFY(i == j);
+  }
+  // The iterator manipulation code must be repeated in the same scope
+  // to expose the undefined read during the execution of the ==
+  // operator (stack location reuse)
+  {
+std::cregex_iterator i(s, s+sizeof(s), r);
+std::cregex_iterator j;
+VERIFY(!(i == j));
+  }
+}
+
 int
 main()
 {
@@ -80,6 +101,7 @@ main()
   test03();
   test04();
   test05();
+  test06();
   return 0;
 }

[PATCH] Add -fsanitize=shift{,-base,-exponent} support

2016-11-07 Thread Jakub Jelinek

Hi!

In the libsanitizer merge Maxim posted I've noticed that llvm split
the -fsanitize=shift option into suboption (at least one really weirdly
named, I have never seen the shift count being called exponent), this
patch implements the same with the same option names for gcc.
The interesting part is if the two suboptions disagree on
-fsanitize-recover, then we need to emit two checks and two calls (one with
_abort, one without).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
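
Editor's note: the split between the two suboptions can be illustrated with a
small classifier for `base << exponent` on int operands. This is a sketch, not
the instrumentation emitted by c-ubsan.c: ShiftIssue and classify_left_shift
are hypothetical names, and C99-style rules are assumed for the base check
(other language modes differ slightly, as the documentation in the patch
notes).

```cpp
#include <climits>

enum class ShiftIssue { None, Exponent, Base };

ShiftIssue classify_left_shift(int base, int exponent)
{
  const int precision = static_cast<int>(sizeof(int) * CHAR_BIT);

  // shift-exponent: the count must be non-negative and smaller than the
  // precision of the promoted first operand.
  if (exponent < 0 || exponent >= precision)
    return ShiftIssue::Exponent;

  // shift-base (only reached when the count is in range): under the
  // assumed C99 rules, a negative base or a result that does not fit
  // in int is undefined.
  if (base < 0)
    return ShiftIssue::Base;
  if ((static_cast<unsigned>(base) >> (precision - 1 - exponent)) != 0)
    return ShiftIssue::Base;

  return ShiftIssue::None;
}
```

Because the two kinds of problem are reported under separate options, each
can have its own -fsanitize-recover setting, which is why the patch may need
to emit two checks and two calls.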

2016-11-08  Jakub Jelinek  

* flag-types.h (enum sanitize_code): Add SANITIZE_SHIFT_BASE
and SANITIZE_SHIFT_EXPONENT, change SANITIZE_SHIFT to bitwise
or of them, renumber other enumerators.
* opts.c (sanitizer_opts): Add shift-base and shift-exponent.
* doc/invoke.texi: Document -fsanitize=shift-base and
-fsanitize=shift-exponent, document -fsanitize=shift as
having those 2 suboptions.
c-family/
* c-ubsan.c (ubsan_instrument_shift): Handle split
-fsanitize=shift-base and -fsanitize=shift-exponent.
testsuite/
* gcc.dg/ubsan/c99-shift-3.c: New test.
* gcc.dg/ubsan/c99-shift-4.c: New test.
* gcc.dg/ubsan/c99-shift-5.c: New test.
* gcc.dg/ubsan/c99-shift-6.c: New test.

--- gcc/flag-types.h.jj 2016-11-07 14:04:29.129755670 +0100
+++ gcc/flag-types.h  2016-11-07 18:50:54.989652650 +0100
@@ -211,24 +211,26 @@ enum sanitize_code {
   /* LeakSanitizer.  */
   SANITIZE_LEAK = 1UL << 4,
   /* UndefinedBehaviorSanitizer.  */
-  SANITIZE_SHIFT = 1UL << 5,
-  SANITIZE_DIVIDE = 1UL << 6,
-  SANITIZE_UNREACHABLE = 1UL << 7,
-  SANITIZE_VLA = 1UL << 8,
-  SANITIZE_NULL = 1UL << 9,
-  SANITIZE_RETURN = 1UL << 10,
-  SANITIZE_SI_OVERFLOW = 1UL << 11,
-  SANITIZE_BOOL = 1UL << 12,
-  SANITIZE_ENUM = 1UL << 13,
-  SANITIZE_FLOAT_DIVIDE = 1UL << 14,
-  SANITIZE_FLOAT_CAST = 1UL << 15,
-  SANITIZE_BOUNDS = 1UL << 16,
-  SANITIZE_ALIGNMENT = 1UL << 17,
-  SANITIZE_NONNULL_ATTRIBUTE = 1UL << 18,
-  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 19,
-  SANITIZE_OBJECT_SIZE = 1UL << 20,
-  SANITIZE_VPTR = 1UL << 21,
-  SANITIZE_BOUNDS_STRICT = 1UL << 22,
+  SANITIZE_SHIFT_BASE = 1UL << 5,
+  SANITIZE_SHIFT_EXPONENT = 1UL << 6,
+  SANITIZE_DIVIDE = 1UL << 7,
+  SANITIZE_UNREACHABLE = 1UL << 8,
+  SANITIZE_VLA = 1UL << 9,
+  SANITIZE_NULL = 1UL << 10,
+  SANITIZE_RETURN = 1UL << 11,
+  SANITIZE_SI_OVERFLOW = 1UL << 12,
+  SANITIZE_BOOL = 1UL << 13,
+  SANITIZE_ENUM = 1UL << 14,
+  SANITIZE_FLOAT_DIVIDE = 1UL << 15,
+  SANITIZE_FLOAT_CAST = 1UL << 16,
+  SANITIZE_BOUNDS = 1UL << 17,
+  SANITIZE_ALIGNMENT = 1UL << 18,
+  SANITIZE_NONNULL_ATTRIBUTE = 1UL << 19,
+  SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 20,
+  SANITIZE_OBJECT_SIZE = 1UL << 21,
+  SANITIZE_VPTR = 1UL << 22,
+  SANITIZE_BOUNDS_STRICT = 1UL << 23,
+  SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
   | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
   | SANITIZE_SI_OVERFLOW | SANITIZE_BOOL | SANITIZE_ENUM
--- gcc/opts.c.jj   2016-11-07 14:04:29.131755645 +0100
+++ gcc/opts.c  2016-11-07 18:50:54.990652637 +0100
@@ -1477,6 +1477,8 @@ const struct sanitizer_opts_s sanitizer_
   SANITIZER_OPT (thread, SANITIZE_THREAD, false),
   SANITIZER_OPT (leak, SANITIZE_LEAK, false),
   SANITIZER_OPT (shift, SANITIZE_SHIFT, true),
+  SANITIZER_OPT (shift-base, SANITIZE_SHIFT_BASE, true),
+  SANITIZER_OPT (shift-exponent, SANITIZE_SHIFT_EXPONENT, true),
   SANITIZER_OPT (integer-divide-by-zero, SANITIZE_DIVIDE, true),
   SANITIZER_OPT (undefined, SANITIZE_UNDEFINED, true),
   SANITIZER_OPT (unreachable, SANITIZE_UNREACHABLE, false),
--- gcc/doc/invoke.texi.jj  2016-11-07 14:04:28.853759109 +0100
+++ gcc/doc/invoke.texi 2016-11-07 18:50:54.998652536 +0100
@@ -10555,6 +10555,21 @@ at runtime.  Current suboptions are:
 This option enables checking that the result of a shift operation is
 not undefined.  Note that what exactly is considered undefined differs
 slightly between C and C++, as well as between ISO C90 and C99, etc.
+This option has two suboptions, @option{-fsanitize=shift-base} and
+@option{-fsanitize=shift-exponent}.
+
+@item -fsanitize=shift-exponent
+@opindex fsanitize=shift-exponent
+This option enables checking that the second argument of a shift operation
+is not negative and is smaller than the precision of the promoted first
+argument.
+
+@item -fsanitize=shift-base
+@opindex fsanitize=shift-base
+If the second argument of a shift operation is within range, check that the
+result of a shift operation is not undefined.  Note that what exactly is
+considered undefined differs slightly between C and C++, as well as between
+ISO C90 and C99, etc.
 
 @item -fsanitize=integer-divide-by-zero
 @opindex fsanitize=integer-divide-by-zero
--- gcc/c-family/c-ubsan.c.jj   2016-11-07 14:04:29.102756006 +0100
+++

[PATCH] Fix PR71727

2016-11-07 Thread Hurugalawadi, Naveen

Hi,

Please find attached the patch that fixes PR71727.
Please review the patch and let me know if it's okay.

Regression tested on Aarch64 with no regressions.

Thanks,
Naveen

2016-11-08  Naveen H.S  

* config/aarch64/aarch64.c
(aarch64_builtin_support_vector_misalignment): New.
(TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): Define.
* gcc.target/aarch64/pr71727.c : New Testcase.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b7d4640..2649951 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -141,6 +141,10 @@ static bool aarch64_vector_mode_supported_p (machine_mode);
 static bool aarch64_vectorize_vec_perm_const_ok (machine_mode vmode,
 		 const unsigned char *sel);
 static int aarch64_address_cost (rtx, machine_mode, addr_space_t, bool);
+static bool aarch64_builtin_support_vector_misalignment (machine_mode mode,
+			 const_tree type,
+			 int misalignment,
+			 bool is_packed);
 
 /* Major revision number of the ARM Architecture implemented by the target.  */
 unsigned aarch64_architecture_version;
@@ -11148,6 +11152,35 @@ aarch64_simd_vector_alignment_reachable (const_tree type, bool is_packed)
   return true;
 }
 
+static bool
+aarch64_builtin_support_vector_misalignment (machine_mode mode,
+	 const_tree type, int misalignment,
+	 bool is_packed)
+{
+  if (TARGET_SIMD && STRICT_ALIGNMENT)
+{
+  /* Return if movmisalign pattern is not supported for this mode.  */
+  if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
+return false;
+
+  if (misalignment == -1)
+	{
+	  /* Misalignment factor is unknown at compile time but we know
+	 it's word aligned.  */
+	  if (aarch64_simd_vector_alignment_reachable (type, is_packed))
+{
+  int element_size = TREE_INT_CST_LOW (TYPE_SIZE (type));
+
+  if (element_size != 64)
+return true;
+}
+	  return false;
+	}
+}
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+		  is_packed);
+}
+
 /* If VALS is a vector constant that can be loaded into a register
using DUP, generate instructions to do so and return an RTX to
assign to the register.  Otherwise return NULL_RTX.  */
@@ -14398,6 +14431,10 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
 
+#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
+  aarch64_builtin_support_vector_misalignment
+
 #undef TARGET_ARRAY_MODE_SUPPORTED_P
 #define TARGET_ARRAY_MODE_SUPPORTED_P aarch64_array_mode_supported_p
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727.c b/gcc/testsuite/gcc.target/aarch64/pr71727.c
new file mode 100644
index 000..05eef3e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr71727.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-mstrict-align -O3" } */
+
+struct test_struct_s
+{
+  long a;
+  long b;
+  long c;
+  long d;
+  unsigned long e;
+};
+
+
+char _a;
+struct test_struct_s xarray[128];
+
+void
+_start (void)
+{
+  struct test_struct_s *new_entry;
+
+  new_entry = &xarray[0];
+  new_entry->a = 1;
+  new_entry->b = 2;
+  new_entry->c = 3;
+  new_entry->d = 4;
+  new_entry->e = 5;
+
+  return;
+}
+
+/* { dg-final { scan-assembler-times "mov\tx" 5 {target lp64} } } */
+/* { dg-final { scan-assembler-not "add\tx0, x0, :" {target lp64} } } */

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Iain Sandoe


> On 7 Nov 2016, at 13:53, Mike Stump  wrote:
> 
> On Nov 7, 2016, at 10:40 AM, Jeff Law  wrote:
>> 
>> On 11/07/2016 10:48 AM, Mike Stump wrote:
>>> On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
>>>> This is an initial patch in a series that converts Darwin's configury to 
>>>> detect ld64 features, rather than the current process of hard-coding them 
>>>> on target system version.
>>> 
>>> So, I really do hate to ask, but does this have to be a config option?  
>>> Normally, we'd just have configure examine things by itself.  For canadian 
>>> crosses, there should be enough state present to key off of directly, 
>>> specially if they are wired up to work.
>>> 
>>> I'd rather have the thing that doesn't just work without that config flag, 
>>> just work.  I'd like to think I can figure out how to make it just work, if 
>>> given an idea of what doesn't actually work.
>>> 
>>> Essentially, you do the operation that doesn't work, detect it failed to 
>>> work, then you know it didn't work.
>>> 
>> But how is that supposed to work in a cross environment when he can't 
>> directly query the linker's behavior?
> 
> :-)  So, the two most obvious solutions would be the programs that need to 
> exist for a build, are portable and ported to run on a system that the build 
> can use, or one can have a forwarding stub from a system the build uses to 
> machine that can host the software that is less portable.  I've done both 
> before, both work fine.  Portable software can also include things like 
> simulators to run software under simulation on the local machine (or on a 
> machine the forwarding stub links to).  I've done that as well.  For example, 
> I've done native bootstraps of gcc on my non-bootstrappable cross compiler by 
> running everything under GNU sim for bootstrap and enhancing the GNU sim 
> stubs to include a few more system calls that bootstrap uses.  :-) read/write 
> already work, one just needs readdir and a few others.

this is pretty “black belt” stuff - I don’t see most of our users wanting to 
dive this deeply … 
> 
> Also, for darwin, in some cases, we can actually run the target or host 
> programs on the build machine directly.

I have that (at least weakly) in the patch posted - reluctant to add more 
smarts to make it cover more cases unless it’s proven useful.

> 
>> In an ideal world we could trivially query the linker's behavior prior to 
>> invocation.  But we don't have that kind of infrastructure in place.
> 
> There are cases that just work.  If you have a forwarding stub for the cross, 
> then you can just run it as usual.  If you have a BINFMT style simulator on 
> the local machine, again, you can just run it.  And on darwin, there are 
> cases where you can run target and/or host programs on the build machine 
> directly.
> 
> For darwin, I can't tell if he wants the runtime property of the target 
> system for programs that will be linked on it, or a behavior of the local 
> linker that will do the deed.  For the local linker, that can be queried 
> directly. For the target system, we can know its behavior by knowing what 
> the target is.  We already know what the target is from the macosxversion 
> flag, which embodies the dynamic linker. Also, for any specific version of 
> macosx, we can have a table of what version of ld64 it has on it, by fiat. 
> We can say, if you want to target such a system, you should use the latest 
> Xcode that supported that system.  This can reduce complexities and simplify 
> our lives.

.. and produce the situation where we can never have a c++11 compiler on 
powerpc-darwin9, because the “required ld64” doesn’t support it (OK. maybe we 
don’t care about that) but supposing we can now have symbol aliases with 
modern ld64 (I think we can) - would we want to prevent a 10.6 from using that?

So, I am strongly against the situation where we fix the capability of the 
toolchain on some assumption of externally-available tools predicated on the 
system revision.

The intent of my patch is to move away from this to a situation where we use 
configuration tests to determine the capability from the tools [when we can run 
them] and on the basis of their version(s) when we are configuring in a cross 
scenario.

>> ISTM the way to go is to have a configure test to try and DTRT automatically 
>> for native builds and a flag to set for crosses (or potentially override the 
>> configure test).
> 
> 
> Sure, if it can't be known.
> 
> For example, if you have the target include directory, you don't need to have 
> flags for questions that can be answered by the target headers.  Ditto the 
> libraries.  My question is what is the specific question we are asking?  
> Additionally answering things on the basis of version numbers isn't quite in 
> the GNU spirit.  I'm not opposed to it, but, it is slightly better to form 
> the actual question if possible.

Actually, there’s a bunch

Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)

2016-11-07 Thread Martin Sebor


It's taken me longer than I expected to finally get back to this
project.  Sorry about the delay.

  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01110.html

Attached is an updated patch with this enhancement and reflecting
your previous comment.

Besides running the GCC test suite I tested the patch by building
Binutils and the Linux kernel.  It found one stpcpy-related overflow
in Binutils that I'm looking into and reduced by one the number of
problems reported by the -Wformat-length option in the kernel (I
haven't yet checked which one it eliminated).

Although I'm not done investigating the Binutils problem I'm posting
the patch for review now to allow for comments before stage 1 ends.
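
As an illustration of the kind of expression the subject line refers to, here is a minimal sketch of a POINTER_PLUS with a non-constant offset.  The function name and the 16-byte buffer are assumptions made for this example only and are not taken from the patch.  The value returned depends on the compiler version and optimization level, so none is shown.

```cpp
#include <cstddef>

// Hypothetical example, not part of the patch: ask for the remaining
// object size after advancing into a 16-byte buffer by an offset that
// is only known at run time.  For type 0, __builtin_object_size
// (a GCC/Clang builtin) yields (size_t)-1 when no size can be determined.
std::size_t remaining_size(std::size_t offset)
{
  static char buf[16];
  char *p = buf + (offset & 7);   // non-constant offset in the range 0..7
  return __builtin_object_size(p, 0);
}
```

Whatever the compiler reports, the result should either be "unknown" or an upper bound no larger than the whole buffer.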

Martin

PS The tests added in the patch (but nothing else) depend on
the changes in the patch for c/53562:

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00483.html

PR middle-end/77608 - missing protection on trivially detectable runtime buffer overflow

gcc/ChangeLog:
2016-11-07  Martin Sebor  

	PR middle-end/77608
	* tree-object-size.c (nonconst_offsets): New global.
	(compute_object_offset): New function.
	(addr_object_size): Add an argument.
	(compute_builtin_object_size): Rename...
	(internal_object_size): to this.
	(expr_object_size): Adjust.
	(merge_object_sizes): Handle non-constant offset.
	(plus_stmt_object_size): Same.
	(collect_object_sizes_for): Change to return bool.
	(init_object_sizes): Initialize nonconst_offsets.
	(fini_object_sizes): Release nonconst_offsets.

gcc/testsuite/ChangeLog:
2016-11-07  Martin Sebor  

	PR middle-end/77608
	* gcc.c-torture/execute/builtins/lib/chk.c: Add debugging output.
	* gcc.c-torture/execute/builtins/mempcpy-chk.c: Add debugging output.
	(test5): Allow test to emit __mempcpy_chk.
	* gcc.dg/builtin-object-size-18.c: New test.
	* gcc.dg/builtin-stringop-chk-3.c: New test.
	* gcc.dg/pr77608.c: New test.

diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c b/gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c
index b19d7bf..6966b41 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c
@@ -5,6 +5,15 @@
 
 extern void abort (void);
 
+const char *testfunc;
+int testline;
+
+#define abort()\
+  (__builtin_printf ("%s:%i: %s: test failure in %s on line %i\n",	\
+		 __FILE__, __LINE__, __FUNCTION__,			\
+		 testfunc, testline),\
+   __builtin_abort ())
+
 extern int inside_main;
 void *chk_fail_buf[256] __attribute__((aligned (16)));
 volatile int chk_fail_allowed, chk_calls;
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy-chk.c b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy-chk.c
index 7a1737c..b014aff 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy-chk.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtins/mempcpy-chk.c
@@ -9,6 +9,15 @@ extern void *memcpy (void *, const void *, size_t);
 extern void *mempcpy (void *, const void *, size_t);
 extern int memcmp (const void *, const void *, size_t);
 
+extern const char *testfunc;
+extern int testline;
+
+#define memcpy(d, s, n)\
+  (testfunc = __func__, testline = __LINE__, memcpy (d, s, n))
+
+#define mempcpy(d, s, n)			\
+  (testfunc = __func__, testline = __LINE__, mempcpy (d, s, n))
+
 #include "chk.h"
 
 const char s1[] = "123";
@@ -17,6 +26,8 @@ volatile char *s2 = "defg"; /* prevent constant propagation to happen when whole
 volatile char *s3 = "FGH"; /* prevent constant propagation to happen when whole program assumptions are made.  */
 volatile size_t l1 = 1; /* prevent constant propagation to happen when whole program assumptions are made.  */
 
+#define abort() (__builtin_printf ("failure on line %i\n", __LINE__), __builtin_abort ())
+
 void
 __attribute__((noinline))
 test1 (void)
@@ -326,6 +337,8 @@ test4 (void)
   char buf3[20];
 
   chk_fail_allowed = 1;
+  mempcpy_disallowed = 0;
+
   /* Runtime checks.  */
   if (__builtin_setjmp (chk_fail_buf) == 0)
 {
@@ -343,7 +356,9 @@ test4 (void)
   vx = mempcpy (&buf3[19], "ab", 2);
   abort ();
 }
+
   chk_fail_allowed = 0;
+  mempcpy_disallowed = 1;
 }
 
 #ifndef MAX_OFFSET
@@ -377,6 +392,12 @@ test5 (void)
   int off1, off2, len, i;
   char *p, *q, c;
 
+  /* The call to mempcpy below, even though it's safe, may result in
+ a call to __mempcpy_chk when the (maximum) size of the destination
+ is determined.  */
+  int mempcpy_disallowed_save = mempcpy_disallowed;
+  mempcpy_disallowed = 0;
+
   for (off1 = 0; off1 < MAX_OFFSET; off1++)
 for (off2 = 0; off2 < MAX_OFFSET; off2++)
   for (len = 1; len < MAX_COPY; len++)
@@ -410,6 +431,8 @@ test5 (void)
 	if (*q != 'a')
 	  abort ();
 	}
+
+  mempcpy_disallowed = mempcpy_disallowed_save;
 }
 
 #define TESTSIZE 80
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-18.c b/gcc/testsuite/gcc.dg/builtin-object-size-18.c
new file mode 100644
index 000..b1a5666
--- /dev/null
+++

[ssa-coalesce] Rename register_ssa_partition

2016-11-07 Thread kugan


Hi,

In tree-ssa-coalesce, register_ssa_partition () and
register_ssa_partition_check () have lost their meaning over various
commits and now just verify that ssa_var is indeed an SSA_NAME and not
a virtual_operand_p. This is confusing when one looks at it for the first
time and expects more from the name register_ssa_partition.


Attached patch just changes it to verify_ssa_for_coalesce to better 
reflect what it is doing now.


Bootstrap and regression testing is ongoing. Is this OK for trunk if no 
regressions?


Thanks,
Kugan



gcc/ChangeLog:

2016-11-08  Kugan Vivekanandarajah  

	* tree-ssa-coalesce.c (register_default_def): Remove usage of arg
	map which is not used at all.
	(create_outofssa_var_map): Use renamed verify_ssa_for_coalesce from
	register_ssa_partition.
	* tree-ssa-live.c (verify_ssa_for_coalesce): Renamed from
	register_ssa_partition.
	(register_ssa_partition_check): Remove.
	* tree-ssa-live.h (register_ssa_partition): Renamed to
	verify_ssa_for_coalesce.
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 6423cdd..8adbd62 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1040,17 +1040,15 @@ create_default_def (tree var, void *arg 
ATTRIBUTE_UNUSED)
 /* Register VAR's default def in MAP.  */
 
 static void
-register_default_def (tree var, void *map_)
+register_default_def (tree var, void *arg ATTRIBUTE_UNUSED)
 {
-  var_map map = (var_map)map_;
-
   if (!is_gimple_reg (var))
 return;
 
   tree ssa = ssa_default_def (cfun, var);
   gcc_assert (ssa);
 
-  register_ssa_partition (map, ssa);
+  verify_ssa_for_coalesce (ssa);
 }
 
 /* If VAR is an SSA_NAME associated with a PARM_DECL or a RESULT_DECL,
@@ -1096,7 +1094,7 @@ create_outofssa_var_map (coalesce_list *cl, bitmap 
used_in_copy)
 
   map = init_var_map (num_ssa_names);
 
-  for_all_parms (register_default_def, map);
+  for_all_parms (register_default_def, NULL);
 
   FOR_EACH_BB_FN (bb, cfun)
 {
@@ -1114,7 +1112,7 @@ create_outofssa_var_map (coalesce_list *cl, bitmap 
used_in_copy)
 
  res = gimple_phi_result (phi);
  ver = SSA_NAME_VERSION (res);
- register_ssa_partition (map, res);
+ verify_ssa_for_coalesce (res);
 
  /* Register ssa_names and coalesces between the args and the result
 of all PHI.  */
@@ -1125,7 +1123,7 @@ create_outofssa_var_map (coalesce_list *cl, bitmap 
used_in_copy)
  if (TREE_CODE (arg) != SSA_NAME)
continue;
 
- register_ssa_partition (map, arg);
+ verify_ssa_for_coalesce (arg);
  if (gimple_can_coalesce_p (arg, res)
  || (e->flags & EDGE_ABNORMAL))
{
@@ -1154,7 +1152,7 @@ create_outofssa_var_map (coalesce_list *cl, bitmap 
used_in_copy)
 
  /* Register USE and DEF operands in each statement.  */
  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, (SSA_OP_DEF|SSA_OP_USE))
-   register_ssa_partition (map, var);
+   verify_ssa_for_coalesce (var);
 
  /* Check for copy coalesces.  */
  switch (gimple_code (stmt))
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index b9eef20..1fadf86 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -102,6 +102,26 @@ delete_var_map (var_map map)
 }
 
 
+/* Verify that SSA_VAR is a non-virtual SSA_NAME when
+   flag_checking is enabled.  */
+
+void
+verify_ssa_for_coalesce (tree ssa_var)
+{
+  if (flag_checking)
+{
+  gcc_assert (TREE_CODE (ssa_var) == SSA_NAME);
+  if (virtual_operand_p (ssa_var))
+   {
+ fprintf (stderr, "Illegally registering a virtual SSA name :");
+ print_generic_expr (stderr, ssa_var, TDF_SLIM);
+ fprintf (stderr, " in the SSA->Normal phase.\n");
+ internal_error ("SSA corruption");
+   }
+}
+}
+
+
 /* This function will combine the partitions in MAP for VAR1 and VAR2.  It
Returns the partition which represents the new partition.  If the two
partitions cannot be combined, NO_PARTITION is returned.  */
@@ -1276,22 +1296,6 @@ debug (tree_live_info_d *ptr)
 }
 
 
-/* Verify that SSA_VAR is a non-virtual SSA_NAME.  */
-
-void
-register_ssa_partition_check (tree ssa_var)
-{
-  gcc_assert (TREE_CODE (ssa_var) == SSA_NAME);
-  if (virtual_operand_p (ssa_var))
-{
-  fprintf (stderr, "Illegally registering a virtual SSA name :");
-  print_generic_expr (stderr, ssa_var, TDF_SLIM);
-  fprintf (stderr, " in the SSA->Normal phase.\n");
-  internal_error ("SSA corruption");
-}
-}
-
-
 /* Verify that the info in LIVE matches the current cfg.  */
 
 static void
diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h
index 6df102a..6fc0895 100644
--- a/gcc/tree-ssa-live.h
+++ b/gcc/tree-ssa-live.h
@@ -80,7 +80,7 @@ extern void remove_unused_locals (void);
 extern void dump_var_map (FILE *, var_map);
 extern void debug (_var_map &ref);
 extern void debug (_var_map *ptr);
-extern void

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Iain Sandoe

> On 7 Nov 2016, at 13:56, Mike Stump  wrote:
> 
> On Nov 7, 2016, at 9:59 AM, Iain Sandoe  wrote:
>> 
>>> On 7 Nov 2016, at 09:51, Mike Stump  wrote:
>>> 
>>> [ possible dup ]
>>> 
>>>> Begin forwarded message:
>>>>
>>>> From: Mike Stump 
>>>> Subject: Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 
>>>> to be detected as Darwin's linker
>>>> Date: November 7, 2016 at 9:48:53 AM PST
>>>> To: Iain Sandoe 
>>>> Cc: GCC Patches , Jeff Law 
>>>>
>>>> On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
>>>>> This is an initial patch in a series that converts Darwin's configury to 
>>>>> detect ld64 features, rather than the current process of hard-coding them 
>>>>> on target system version.
>>>>
>>>> So, I really do hate to ask, but does this have to be a config option?  
>>>> Normally, we'd just have configure examine things by itself.  For canadian 
>>>> crosses, there should be enough state present to key off of directly, 
>>>> specially if they are wired up to work.
>>>>
>>>> I'd rather have the thing that doesn't just work without that config 
>>>> flag, just work.  I'd like to think I can figure out how to make it just 
>>>> work, if given an idea of what doesn't actually work.
>>>>
>>>> Essentially, you do the operation that doesn't work, detect it failed to 
>>>> work, then you know it didn't work.
>> 
>> Well, if you can run the tool, that’s fine - I wanted to cover the base 
>> where we have a native or canadian that’s using a newer ld64 than is 
>> installed by the ‘last available xcode’ on a given platform - which is the 
>> common case (since the older versions of ld64 in particular don’t really 
>> support the features we want, they def. won’t support building LLVM for ex.).
>> 
>> I am *really really* trying to get away from the assumption that darwinNN 
>> implies some ld64 capability - because that’s just wrong, really - makes way 
>> too many assumptions.  I also want to get to the “end game” that we just 
>> configure *-*-darwin and use the cross-capability of the toolchain (we’re a 
>> ways away from that upstream, but my local patch set achieves it at least 
>> for 5.4 and 6.2).
>> 
>> It’s true that adding configure options is not #1 choice in life - but I 
>> think darwin is getting to the stage where there are too many choices to 
>> cover without.
>> 
>> Open to alternate suggestions, of course
> 
> But, you didn't actually tell me the question that you're interested in.  It 
> is that question that I'm curious about.

a) right now, we need to know the target linker version - while it’s not 
impossible to try and conjure up some test to see if a linker we can run 
supports coalesced sections or not, the configury code and complexity needed to 
support that would exceed what I’m proposing at present (and still would not 
cover the native and canadian cases).

- IMO it’s reasonable to decide on coalesced section availability based on the 
linker version and is at least correct (where deciding on the basis of system 
revision is wishful thinking at best).

I’m not debating the various solutions in your reply to Jeff - but honestly I 
wonder how many of them are realistically in reach of the typical end-user (I 
have done most of them at one stage or another, but I wonder how many would be 
stopped dead by “first find and build ld64, which itself needs a c++11 compiler 
and BTW needs you to build libLTO.dylib .. which needs you to build at least 
LLVM itself").

b) Given the speed of various older hardware, it’s an objective to get to the 
stage where we can build reliable native crosses (it seems that there are also 
people trying to do canadian - despite the trickiness).

c) it’s a high priority on my list to make it possible for Linux folks to be 
able to build a Darwin cross toolchain, I think that will help a lot with 
triage of issues

In short, I expect more use of cross and native crosses in the future…

.. so:

case 1. self-hosted build=host=target, ld64 = xcode whatever - no problem; we 
query live.

case 2. build != host but build arch == host arch - well, sometimes we can run 
the host tools (I’ve put that in my patch).

case 3. build != host, build arch != host arch .. I don’t believe there’s any 
more concise way of expressing the necessary data than passing the linker 
version to configure (there’s really no place one can go and look for its 
capability).
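
The three cases above can be sketched as a small decision function.  This is only a model of the reasoning, written under stated assumptions: the enum, the function names and the triplet strings are hypothetical and do not correspond to anything in the posted patch, and the sketch looks only at build and host because the linker in question has to run on the host.

```cpp
#include <string>

// Hypothetical names throughout; nothing here is taken from the patch.
enum class Ld64Probe {
  QueryLive,           // case 1: run the linker and ask it
  TryRunHostTools,     // case 2: host tools may run on the build machine
  NeedVersionFromUser  // case 3: linker version must be passed to configure
};

// Architecture part of a triplet: the text before the first '-'.
static std::string arch_of(const std::string &triplet)
{
  return triplet.substr(0, triplet.find('-'));
}

// Assumption: only build and host matter for this decision; the target
// is deliberately left out of the sketch.
Ld64Probe choose_probe(const std::string &build, const std::string &host)
{
  if (build == host)
    return Ld64Probe::QueryLive;
  if (arch_of(build) == arch_of(host))
    return Ld64Probe::TryRunHostTools;
  return Ld64Probe::NeedVersionFromUser;
}
```

In case 2 the result only says that running the host tools is worth trying; as noted above it works "sometimes", so a real configure test would still need a fallback.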

So - I agree that there are lots of possible solutions, the question is are 
there any that are less configury / lower maintenance (and accessible to our 
users)?

am I missing a point here?
Iain

Re: [PATCH] [ARC] define SIZE_TYPE and PTRDIFF_TYPE correctly

2016-11-07 Thread Vineet Gupta

On 11/03/2016 03:57 AM, Claudiu Zissulescu wrote:
> Hi Vineet,
>
> Thank you for your contribution.
>
>> gcc/
>> 2016-10-28  Vineet Gupta 
>>
>> * config/arc/arc.h (SIZE_TYPE): define as unsigned int.
>>  * (PTRDIFF_TYPE): define as int.
>>
> Approved and committed. However,  the entry changelog line is not as expected 
> and the patch didn’t apply on the mainline sources seamlessly. Fix them for 
> you :)

Thx Claudiu. I'll be more careful next time !

-Vineet

>
> For ur reference: Committed r241812.
>
> Best,
> Claudiu
>

[openacc] add support for common block data

2016-11-07 Thread Cesar Philippidis

This patch adds support for variables inside common blocks in OpenACC
data clauses. The fortran FE changes are fairly straightforward.
gfc_match_omp_variable_list already has support for common block data,
so all I had to do was teach gfc_match_omp_map_clause to accept
common block arguments.

The gimplifier changes are more interesting. Originally, the gimplifier
wants to treat the common block and its members separately and that
resulted in duplicate data mapping errors at runtime. This patch gets
around that problem by teaching omp_notice_variable to ignore the common
block itself, at least in OpenACC contexts. That ensures that only the
common block members get data clauses. The problem here is that OpenACC
permits the user to transfer individual common block members, so that's
why I ended up with this approach, otherwise it would have been easier to
transfer the common block as a whole.

This patch has been in gomp-4_0-branch for over a month, you can find
the original patch here
.

Is this patch ok for trunk?

Cesar
2016-11-07  Cesar Philippidis  

	gcc/fortran/
	* openmp.c (gfc_match_omp_map_clause): New common_block argument.
	Propagate it to gfc_match_omp_variable_list.
	(gfc_match_omp_clauses): Update calls to gfc_match_omp_map_clause.

	gcc/
	* gimplify.c (oacc_default_clause): Privatize fortran common blocks.
	(omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
	common block decls.

	gcc/testsuite/
	* gfortran.dg/goacc/common-block-1.f90: New test.
	* gfortran.dg/goacc/common-block-2.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-fortran/common-block-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-3.f90: New test.


diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 03e7dbe..9a957e3 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -629,10 +629,11 @@ cleanup:
mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool common_blocks)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, common_blocks, NULL, &head, true)
   == MATCH_YES)
 {
   gfc_omp_namelist *n;
@@ -757,7 +758,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	  && gfc_match ("copy ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_TOFROM))
+	   OMP_MAP_FORCE_TOFROM, openacc))
 	continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	{
@@ -765,7 +766,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-		   OMP_MAP_FORCE_TO))
+		   OMP_MAP_FORCE_TO, true))
 		continue;
 		}
 	  else if (gfc_match_omp_variable_list ("copyin (",
@@ -776,7 +777,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	  && gfc_match ("copyout ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_FROM))
+	   OMP_MAP_FORCE_FROM, true))
 	continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	  && gfc_match_omp_variable_list ("copyprivate (",
@@ -786,14 +787,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	  && gfc_match ("create ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_ALLOC))
+	   OMP_MAP_FORCE_ALLOC, true))
 	continue;
 	  break;
 	case 'd':
 	  if ((mask & OMP_CLAUSE_DELETE)
 	  && gfc_match ("delete ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_DELETE))
+	   OMP_MAP_DELETE, true))
 	continue;
 	  if ((mask & OMP_CLAUSE_DEFAULT)
 	  && c->default_sharing == OMP_DEFAULT_UNKNOWN)
@@ -846,22 +847,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_OACC_DEVICE)
 	  && gfc_match ("device ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_TO))
+	   OMP_MAP_FORCE_TO, false))
 	continue;
 	  if ((mask & OMP_CLAUSE_DEVICEPTR)
-	  && gfc_match ("deviceptr ( ") == MATCH_YES)
-	{
-	  gfc_omp_namelist **list = &c->lists[OMP_LIST_MAP];
-	  gfc_omp_namelist **head = NULL;
-	  if (gfc_match_omp_variable_list ("", list, true, NULL,
-	   &head, false) == MATCH_YES)
-		{
-		  gfc_omp_namelist *n;
-		  for (n = *head; n; n = n->next)
-		n->u.map_op = OMP_MAP_FORCE_DEVICEPTR;
-		  continue;
-		}
-	}
+	  && gfc_match ("deviceptr ( ") == MATCH_YES
+	  &&

Re: RFA (libstdc++): C++ PATCH to implement C++17 noexcept in type system

2016-11-07 Thread Jonathan Wakely

On 7 November 2016 at 22:49, Jason Merrill wrote:
> Tested x86_64-pc-linux-gnu.  Are the libstdc++ changes OK for trunk?

Yes, I like the approach, thanks.

RFA (libstdc++): C++ PATCH to implement C++17 noexcept in type system

2016-11-07 Thread Jason Merrill

This patch implements P0012R1: "Make exception-specifications be part
of the type system, version 5".  Some of the conversion logic was
already there from the transactional memory TS implementation, and
indeed the semantics of the noexcept function pointer conversion was
modeled on that wording, but I've simplified it in some places and
added some missing cases.

One thing that turned up was that we can't defer instantiation of
noexcept-specifiers outside of the toplevel type of a function, so I
needed to add a tsubst flag to indicate that context.

This patch implements Richard Smith's suggestion that we allow
noexcept(E) to be a deduced context in order to avoid another factor
of two expansion in the partial specializations of is_function.  This
is not part of the C++17 CD, but seems like the direction of the
committee.  Accordingly, I've added macros for this to c++config and
adjusted <type_traits> and <new> to use them.
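
To illustrate what "part of the type system" means in practice, here is a minimal sketch.  The function names are hypothetical and chosen for this example; they are not from the patch.  It relies only on the standard rule that a pointer to a noexcept function converts implicitly to a pointer to the corresponding non-noexcept function type, and on std::is_same to observe whether the two function types differ in the current language mode.

```cpp
#include <type_traits>

// Hypothetical helper functions, used only for illustration.
int plain(int x) { return x + 1; }
int quiet(int x) noexcept { return x + 2; }

// True when the compiler treats noexcept as part of the function type
// (C++17 semantics), false in earlier language modes.
constexpr bool noexcept_in_type()
{
  return !std::is_same<decltype(plain), decltype(quiet)>::value;
}

// The function pointer conversion: a noexcept function can still be
// called through a pointer whose type does not mention noexcept.
int call_via_plain_pointer(int x)
{
  int (*fp)(int) = quiet;
  return fp(x);
}
```

The reverse initialization, from plain to a pointer declared noexcept, is what the new type rules are expected to reject in C++17 mode.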

Tested x86_64-pc-linux-gnu.  Are the libstdc++ changes OK for trunk?
commit 9291bd1bb7ffdc08586376df023363af0f5ee34e
Author: Jason Merrill 
Date:   Fri Nov 4 13:19:25 2016 -0400

Implement P0012R1, Make exception specifications part of the type system.

gcc/cp/
* cp-tree.h (enum tsubst_flags): Add tf_fndecl_type.
(flag_noexcept_type, ce_type): New.
* call.c (build_conv): Add ck_fnptr.
(enum conversion_kind): Change ck_tsafe to ck_fnptr.
(convert_like_real): Likewise.
(standard_conversion): Likewise.  Allow function pointer
conversions for pointers to member functions.
(reference_compatible_p): Allow function pointer conversions.
(direct_reference_binding): Likewise.
(reference_binding): Reference-compatible is no longer a subset of
reference-related.
(is_subseq): Also strip ck_lvalue after next_conversion.
* class.c (instantiate_type): Check fnptr_conv_p.
(resolve_address_of_overloaded_function): Likewise.
* cvt.c (can_convert_tx_safety): Now static.
(noexcept_conv_p, fnptr_conv_p, strip_fnptr_conv): New.
* decl.c (flag_noexcept_type): Define.
(cxx_init_decl_processing): Set it.
(bad_specifiers): Check it.
(grokdeclarator) [cdk_function]: Add exception-spec to type here.
* lambda.c (maybe_add_lambda_conv_op): Add exception-spec to
returned pointer.
* mangle.c (struct globals): Add need_cxx1z_warning.
(mangle_decl): Check it.
(write_exception_spec): New.
(write_function_type): Call it.
(canonicalize_for_substitution): Handle exception spec.
(write_type): Likewise.
(write_encoding): Set processing_template_decl across mangling of
partially-instantiated type.
* pt.c (determine_specialization): Pass tf_fndecl_type.
(tsubst_decl, fn_type_unification): Likewise.
(tsubst): Strip tf_fndecl_type, pass it to
tsubst_exception_specification.
(convert_nontype_argument_function): Handle function pointer
conversion.
(convert_nontype_argument): Likewise.
(unify, for_each_template_parm_r): Walk into noexcept-specifier.
* rtti.c (ptr_initializer): Encode noexcept.
* tree.c (canonical_eh_spec): New.
(build_exception_variant): Use it.
* typeck.c (composite_pointer_type): Handle fnptr conversion.
(comp_except_specs): Compare canonical EH specs.
(structural_comptypes): Call it.
gcc/c-family/
* c.opt (Wc++1z-compat): New.
* c-cppbuiltin.c (c_cpp_builtins): Add __cpp_noexcept_function_type.
libstdc++-v3/
* include/bits/c++config (_GLIBCXX_NOEXCEPT_PARM)
(_GLIBCXX_NOEXCEPT_QUAL): New.
* include/std/type_traits (is_function): Use them.
* libsupc++/new (launder): Likewise.
* libsupc++/cxxabi.h (__pbase_type_info::__masks): Add
__noexcept_mask.
* libsupc++/pbase_type_info.cc (__do_catch): Handle function
pointer conversion.
libiberty/
* cp-demangle.c (is_fnqual_component_type): New.
(d_encoding, d_print_comp_inner, d_print_mod_list): Use it.
(FNQUAL_COMPONENT_CASE): New.
(d_make_comp, has_return_type, d_print_comp_inner)
(d_print_function_type): Use it.
(next_is_type_qual): New.
(d_cv_qualifiers, d_print_mod): Handle noexcept and throw-spec.
include/
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_NOEXCEPT, DEMANGLE_COMPONENT_THROW_SPEC.

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index e40fa6f..55dbf44 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -941,6 +941,7 @@

Re: [PATCH] Fix PR78229

2016-11-07 Thread Richard Biener

On November 7, 2016 9:22:38 PM GMT+01:00, Jakub Jelinek  
wrote:
>On Mon, Nov 07, 2016 at 01:20:20PM +0100, Richard Biener wrote:
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
>> and branch.
>> 
>> Richard.
>> 
>> 2016-11-07  Richard Biener  
>> 
>>  PR target/78229
>>  * config/i386/i386.c (ix86_gimple_fold_builtin): Do not adjust
>>  EH info.
>> 
>>  * g++.dg/pr78229.C: New testcase.
>
>On the trunk there are 2 other spots with gsi_replace (..., true); in
>the same function (and both of them are for functions which also should
>really be ECF_NOTHROW, but aren't).
>
>I haven't managed to create a simple testcase.  Anyway, here is a fix,
>bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

>2016-11-07  Jakub Jelinek  
>
>   PR target/78229
>   * config/i386/i386.c (ix86_gimple_fold_builtin): Do not adjust
>
>   EH info even for bzhi and pdep/pext.
>
>--- gcc/config/i386/i386.c.jj  2016-11-07 18:32:14.0 +0100
>+++ gcc/config/i386/i386.c 2016-11-07 18:46:10.901250307 +0100
>@@ -33537,7 +33537,7 @@ ix86_gimple_fold_builtin (gimple_stmt_it
> location_t loc = gimple_location (stmt);
> gimple *g = gimple_build_assign (gimple_call_lhs (stmt), arg0);
> gimple_set_location (g, loc);
>-gsi_replace (gsi, g, true);
>+gsi_replace (gsi, g, false);
> return true;
>   }
>   break;
>@@ -33554,7 +33554,7 @@ ix86_gimple_fold_builtin (gimple_stmt_it
> arg0 = gimple_call_arg (stmt, 0);
> gimple *g = gimple_build_assign (gimple_call_lhs (stmt), arg0);
> gimple_set_location (g, loc);
>-gsi_replace (gsi, g, true);
>+gsi_replace (gsi, g, false);
> return true;
>   }
>   break;
>
>
>   Jakub

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Mike Stump

On Nov 7, 2016, at 9:59 AM, Iain Sandoe  wrote:
> 
>> On 7 Nov 2016, at 09:51, Mike Stump  wrote:
>> 
>> [ possible dup ]
>> 
>>> Begin forwarded message:
>>> 
>>> From: Mike Stump 
>>> Subject: Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to 
>>> be detected as Darwin's linker
>>> Date: November 7, 2016 at 9:48:53 AM PST
>>> To: Iain Sandoe 
>>> Cc: GCC Patches , Jeff Law 
>>> 
>>> On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
>>>> This is an initial patch in a series that converts Darwin's configury to
>>>> detect ld64 features, rather than the current process of hard-coding them
>>>> on target system version.
>>> 
>>> So, I really do hate to ask, but does this have to be a config option?  
>>> Normally, we'd just have configure examine things by itself.  For canadian 
>>> crosses, there should be enough state present to key off of directly, 
>>> especially if they are wired up to work.
>>> 
>>> I'd rather have the thing that doesn't just work without that config flag, 
>>> just work.  I'd like to think I can figure out how to make it just work, if 
>>> given an idea of what doesn't actually work.
>>> 
>>> Essentially, you do the operation that doesn't work, detect it failed to 
>>> work, then you know it didn't work.
> 
> Well, if you can run the tool, that’s fine - I wanted to cover the base where 
> we have a native or canadian that’s using a newer ld64 than is installed by 
> the ‘last available xcode’ on a given platform - which is the common case 
> (since the older versions of ld64 in particular don’t really support the 
> features we want, they def. won’t support building LLVM for ex.).
> 
> I am *really really* trying to get away from the assumption that darwinNN 
> implies some ld64 capability - because that’s just wrong, really - makes way 
> too many assumptions.  I also want to get to the “end game” that we just 
> configure *-*-darwin and use the cross-capability of the toolchain (we’re a 
> ways away from that upstream, but my local patch set achieves it at least for 
> 5.4 and 6.2).
> 
> It’s true that adding configure options is not #1 choice in life - but I 
> think darwin is getting to the stage where there are too many choices to 
> cover without.
> 
> Open to alternate suggestions, of course

But, you didn't actually tell me the question that you're interested in.  It is 
that question that I'm curious about.

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Mike Stump

On Nov 7, 2016, at 10:40 AM, Jeff Law  wrote:
> 
> On 11/07/2016 10:48 AM, Mike Stump wrote:
>> On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
>>> This is an initial patch in a series that converts Darwin's configury to 
>>> detect ld64 features, rather than the current process of hard-coding them 
>>> on target system version.
>> 
>> So, I really do hate to ask, but does this have to be a config option?  
>> Normally, we'd just have configure examine things by itself.  For canadian 
>> crosses, there should be enough state present to key off of directly, 
>> especially if they are wired up to work.
>> 
>> I'd rather have the thing that doesn't just work without that config flag, 
>> just work.  I'd like to think I can figure out how to make it just work, if 
>> given an idea of what doesn't actually work.
>> 
>> Essentially, you do the operation that doesn't work, detect it failed to 
>> work, then you know it didn't work.
>> 
> But how is that supposed to work in a cross environment when he can't 
> directly query the linker's behavior?

:-)  So, the two most obvious solutions would be that the programs that need
to exist for a build are portable and ported to run on a system that the build
can use, or that one has a forwarding stub from a system the build uses to a
machine that can host the software that is less portable.  I've done both
before; both work fine.  Portable software can also include things like
simulators to run software under simulation on the local machine (or on a
machine the forwarding stub links to).  I've done that as well.  For example,
I've done native bootstraps of gcc on my non-bootstrappable cross compiler by
running everything under GNU sim for bootstrap and enhancing the GNU sim stubs
to include a few more system calls that bootstrap uses.  :-)  read/write
already work; one just needs readdir and a few others.

Also, for darwin, in some cases, we can actually run the target or host 
programs on the build machine directly.

> In an ideal world we could trivially query the linker's behavior prior to 
> invocation.  But we don't have that kind of infrastructure in place.

There are cases that just work.  If you have a forwarding stub for the cross, 
then you can just run it as usual.  If you have a BINFMT style simulator on the 
local machine, again, you can just run it.  And on darwin, there are cases 
where you can run target and/or host programs on the build machine directly.

For darwin, I can't tell if he wants the runtime property of the target system
for programs that will be linked on it, or a behavior of the local linker that
will do the deed.  For the local linker, that can be queried directly.  For the
target system, we can know its behavior by knowing what the target is.  We
already know what the target is from the macosxversion flag, which embodies the
dynamic linker.  Also, for any specific version of macosx, we can have a
table of what version of ld64 it has on it, by fiat.  We can say, if you want
to target such a system, you should use the latest Xcode that supported that
system.  This can reduce complexities and simplify our lives.

> ISTM the way to go is to have a configure test to try and DTRT automatically 
> for native builds and a flag to set for crosses (or potentially override the 
> configure test).

Sure, if it can't be known.

For example, if you have the target include directory, you don't need to have
flags for questions that can be answered by the target headers.  Ditto the
libraries.  My question is what is the specific question we are asking?
Additionally, answering things on the basis of version numbers isn't quite in
the GNU spirit.  I'm not opposed to it, but it is slightly better to form the
actual question if possible.

In complex canadian cross scenarios, we might well want to grab the source to 
ld64 and compile it up, just as we would any other software for canadian 
environments.

Re: [PATCH, rs6000] Modify include paths in config.gcc for Advance Toolchain builds

2016-11-07 Thread Peter Bergner


On 11/7/16 3:11 PM, Segher Boessenkool wrote:

On Mon, Nov 07, 2016 at 01:20:29PM -0600, Peter Bergner wrote:

Is this ok for trunk and the GCC 6 branch?

* config.gcc (powerpc*-*-*, rs6000*-*-*): Remove setting of
INCLUDE_EXTRA_SPEC for Advance Toolchain builds.


Okay.  Thanks,


Committed to trunk and the GCC 6 branch.  Thanks!

Peter

Re: [GCC][PATCH] Fix ada compile error on Windows x86_64 (committed as r241907 under the obvious rule)

2016-11-07 Thread Eric Botcazou

> The changes in r240999 re-arranged includes and
> left out signal.h for Windows x86 builds.
> 
> This breaks the build and prevents GCC builds from
> completing with messages such as:
> 
> adaint.c:3317:19: error: 'SIGINT' undeclared (first use in this function);
> did you mean 'SAIT'?
> 
> else if (sig == SIGINT)
> ^~
> 
> Bootstrapped successfully on x86_64-w64-mingw32.

Thanks for fixing the problem!

-- 
Eric Botcazou

Re: [PATCH, rs6000] Modify include paths in config.gcc for Advance Toolchain builds

2016-11-07 Thread Segher Boessenkool

On Mon, Nov 07, 2016 at 01:20:29PM -0600, Peter Bergner wrote:
> Gabriel and I have been tracking down an include path issue for GCC 6
> Advance Toolchain builds (ie, --with-advance-toolchain=...).  The solution
> that fixes the problem for us is to configure with --with-local-prefix=...
> and removing the following hunk from config.gcc.  Gabriel has confirmed
> this fixes his AT builds (native and cross) and I've verified that this
> patch bootstraps with no regressions.
> 
> Is this ok for trunk and the GCC 6 branch?
> 
>   * config.gcc (powerpc*-*-*, rs6000*-*-*): Remove setting of
>   INCLUDE_EXTRA_SPEC for Advance Toolchain builds.

Okay.  Thanks,


Segher

Re: [PATCH] rs6000: Do swdiv at expand time

2016-11-07 Thread Segher Boessenkool

On Mon, Nov 07, 2016 at 08:26:01AM -0500, David Edelsohn wrote:
> On Mon, Nov 7, 2016 at 4:32 AM, Segher Boessenkool
>  wrote:
> > We transform floating point divide instructions to a faster series of
> > simple instructions, "swdiv".  Currently we do not do that until the
> > first splitter pass, which is much too late for most optimisations
> > that can happen on those new instructions, e.g. the constant loads
> > are not CSEd inside an unrolled loop.  This patch changes things so
> > those divide instructions are expanded during expand already.
> >
> > Bootstrapped and tested on powerpc64-linux; Bill has run SPEC on it,
> > and if anything it shows a slight improvement.
> >
> > Is this okay for trunk?
> 
> Okay.
> 
> But commenting on the ChangeLog entry is half the fun!

Okay :-)

2016-11-07  Segher Boessenkool  

* rs6000.md (div3): Expand using rs6000_emit_swdiv if
appropriate.
* vector.md (div3): Ditto.


Segher
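
For readers unfamiliar with the idea, a software divide can be sketched as a
reciprocal estimate refined by Newton-Raphson steps.  That general scheme is
an assumption here: the sketch is not the instruction sequence that
rs6000_emit_swdiv produces, and the name swdiv_sketch, the linear starting
estimate and the iteration count are all hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Illustrative only: approximate n / d using multiplies and adds.
// This is NOT the sequence GCC emits; names and constants are hypothetical.
double
swdiv_sketch (double n, double d)
{
  assert (d != 0.0 && std::isfinite (d));

  // Scale |d| to m in [0.5, 1) so that a cheap linear estimate of 1/m
  // is accurate enough to start from.
  int exp;
  double m = std::frexp (std::fabs (d), &exp);

  // Linear starting estimate of 1/m; it stands in for a hardware
  // reciprocal-estimate instruction.
  double x = 48.0 / 17.0 - 32.0 / 17.0 * m;

  // Each Newton-Raphson step roughly doubles the number of correct bits.
  for (int i = 0; i < 5; i++)
    x = x * (2.0 - m * x);

  // Undo the scaling and restore the sign of d.
  double recip = std::ldexp (x, -exp);
  return n * (d < 0 ? -recip : recip);
}
```

Because the refinement is a chain of ordinary multiplies and adds, exposing it
early gives later passes more to work with, which is the motivation stated in
the quoted description.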

[gomp4] remove GOVD_USE_DEVPTR

2016-11-07 Thread Cesar Philippidis

It looks like gomp-4_0-branch was using the GOVD_USE_DEVPTR attribute
for deviceptr data mappings to implicitly set the implicit data clause
for deviceptr variables to present inside OpenACC offloading regions
that are nested inside an ACC DATA region. This isn't necessary anymore
because the gimplifier is intelligent enough to automatically set the
implicit data mapping as PCOPY for any variable used in an enclosing ACC
DATA region. The PCOPY implies that the variable may already be present
on the device, so the special handling for deviceptr variables is excessive.

I've applied this patch to remove GOVD_USE_DEVPTR from gomp4.

Cesar
2016-11-07  Cesar Philippidis  

	gcc/fortran/
	* openmp.c (gfc_match_omp_map_clause): New common_block argument.
	Propagate it to gfc_match_omp_variable_list.
	(gfc_match_omp_clauses): Update calls to gfc_match_omp_map_clause.

	gcc/
	* gimplify.c (oacc_default_clause): Privatize fortran common blocks.
	(omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
	common block decls.

	gcc/testsuite/
	* gfortran.dg/goacc/common-block-1.f90: New test.
	* gfortran.dg/goacc/common-block-2.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-fortran/common-block-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-3.f90: New test.


diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 03e7dbe..9a957e3 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -629,10 +629,11 @@ cleanup:
mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool common_blocks)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, common_blocks, NULL, &head, true)
   == MATCH_YES)
 {
   gfc_omp_namelist *n;
@@ -757,7 +758,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	  && gfc_match ("copy ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_TOFROM))
+	   OMP_MAP_FORCE_TOFROM, openacc))
 	continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	{
@@ -765,7 +766,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-		   OMP_MAP_FORCE_TO))
+		   OMP_MAP_FORCE_TO, true))
 		continue;
 		}
 	  else if (gfc_match_omp_variable_list ("copyin (",
@@ -776,7 +777,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	  && gfc_match ("copyout ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_FROM))
+	   OMP_MAP_FORCE_FROM, true))
 	continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	  && gfc_match_omp_variable_list ("copyprivate (",
@@ -786,14 +787,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	  && gfc_match ("create ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_ALLOC))
+	   OMP_MAP_FORCE_ALLOC, true))
 	continue;
 	  break;
 	case 'd':
 	  if ((mask & OMP_CLAUSE_DELETE)
 	  && gfc_match ("delete ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_DELETE))
+	   OMP_MAP_DELETE, true))
 	continue;
 	  if ((mask & OMP_CLAUSE_DEFAULT)
 	  && c->default_sharing == OMP_DEFAULT_UNKNOWN)
@@ -846,22 +847,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_OACC_DEVICE)
 	  && gfc_match ("device ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_TO))
+	   OMP_MAP_FORCE_TO, false))
 	continue;
 	  if ((mask & OMP_CLAUSE_DEVICEPTR)
-	  && gfc_match ("deviceptr ( ") == MATCH_YES)
-	{
-	  gfc_omp_namelist **list = &c->lists[OMP_LIST_MAP];
-	  gfc_omp_namelist **head = NULL;
-	  if (gfc_match_omp_variable_list ("", list, true, NULL,
-	   &head, false) == MATCH_YES)
-		{
-		  gfc_omp_namelist *n;
-		  for (n = *head; n; n = n->next)
-		n->u.map_op = OMP_MAP_FORCE_DEVICEPTR;
-		  continue;
-		}
-	}
+	  && gfc_match ("deviceptr ( ") == MATCH_YES
+	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+	   OMP_MAP_FORCE_DEVICEPTR, false))
+	continue;
 	  if ((mask & OMP_CLAUSE_DEVICE_RESIDENT)
 	  && gfc_match_omp_variable_list
 		   ("device_resident (",
@@ -922,7 +914,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	  && gfc_match ("host ( ") == MATCH_YES
 	  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-	   OMP_MAP_FORCE_FROM))
+

Re: [PATCH] Fix PR78189

2016-11-07 Thread Christophe Lyon

Hi Richard,


On 7 November 2016 at 09:01, Richard Biener  wrote:
>
> The following fixes an oversight when computing alignment in the
> vectorizer.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>
> Richard.
>
> 2016-11-07  Richard Biener  
>
> PR tree-optimization/78189
> * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> alignment computation.
>
> * g++.dg/torture/pr78189.C: New testcase.
>
> Index: gcc/testsuite/g++.dg/torture/pr78189.C
> ===
> --- gcc/testsuite/g++.dg/torture/pr78189.C  (revision 0)
> +++ gcc/testsuite/g++.dg/torture/pr78189.C  (working copy)
> @@ -0,0 +1,41 @@
> +/* { dg-do run } */
> +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> +
> +#include <cstddef>
> +
> +struct A
> +{
> +  void * a;
> +  void * b;
> +};
> +
> +struct alignas(16) B
> +{
> +  void * pad;
> +  void * misaligned;
> +  void * pad2;
> +
> +  A a;
> +
> +  void Null();
> +};
> +
> +void B::Null()
> +{
> +  a.a = nullptr;
> +  a.b = nullptr;
> +}
> +
> +void __attribute__((noinline,noclone))
> +NullB(void * misalignedPtr)
> +{
> +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - 
> offsetof(B, misaligned));
> +  b->Null();
> +}
> +
> +int main()
> +{
> +  B b;
> +  NullB(&b.misaligned);
> +  return 0;
> +}
> diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 9346cfe..b03cb1e 100644
> --- gcc/tree-vect-data-refs.c
> +++ gcc/tree-vect-data-refs.c
> @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference 
> *dr)
>base = ref;
>while (handled_component_p (base))
>  base = TREE_OPERAND (base, 0);
> +  unsigned int base_alignment;
> +  unsigned HOST_WIDE_INT base_bitpos;
> +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> +  /* As data-ref analysis strips the MEM_REF down to its base operand
> + to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> + adjust things to make base_alignment valid as the alignment of
> + DR_BASE_ADDRESS.  */
>if (TREE_CODE (base) == MEM_REF)
> -base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> -  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> -  unsigned int base_alignment = get_object_alignment (base);
> +{
> +  base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> +  base_bitpos &= (base_alignment - 1);
> +}
> +  if (base_bitpos != 0)
> +base_alignment = base_bitpos & -base_bitpos;
> +  /* Also look at the alignment of the base address DR analysis
> + computed.  */
> +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> +  if (base_addr_alignment > base_alignment)
> +base_alignment = base_addr_alignment;
>
>if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>  DR_VECT_AUX (dr)->base_element_aligned = true;

Since you committed this patch (r241892), I'm seeing execution failures:
  gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
  gcc.dg/vect/pr40074.c execution test
on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
--with-fpu=neon-fp16
(using qemu as simulator)

Christophe
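
The alignment adjustment in the quoted hunk relies on the expression
x & -x, which isolates the lowest set bit of x.  A minimal stand-alone
sketch of that arithmetic follows; known_alignment is a hypothetical helper,
not a GCC function, and it models only the masking and lowest-set-bit steps,
not the MEM_REF offset adjustment or the later comparison against the pointer
alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not GCC code: given the alignment (in bits) known
// for an object and the bit position of a reference relative to it, return
// the alignment that can still be guaranteed for the reference.
std::uint64_t
known_alignment (std::uint64_t base_alignment, std::uint64_t bitpos)
{
  // Alignments are powers of two.
  assert (base_alignment != 0
          && (base_alignment & (base_alignment - 1)) == 0);

  // Only the misalignment within one alignment window matters.
  bitpos &= base_alignment - 1;

  // x & -x keeps the lowest set bit, i.e. the largest power of two that
  // divides the misalignment.
  if (bitpos != 0)
    base_alignment = bitpos & -bitpos;

  return base_alignment;
}
```

Assuming 8-byte pointers, member a of struct B in the testcase sits at byte
offset 24 inside a 16-byte-aligned object; known_alignment (128, 192) then
yields 64 bits, so the two stores in B::Null can only be assumed 8-byte
aligned.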

[PATCH] DejaGNU support for AIX visibility

2016-11-07 Thread David Edelsohn

Adding visibility support for AIX to GCC requires the DejaGNU
testsuite to know about XCOFF "hidden".  DejaGNU labels XCOFF as
"coff".  The appended patch adds a regex to the DejaGNU
hidden-scan-for procedure to recognize AIX XCOFF hidden pseudo-ops in
assembly language.

Committed.

- David


Index: lib/scanasm.exp
===
--- lib/scanasm.exp (revision 241929)
+++ lib/scanasm.exp (working copy)
@@ -103,6 +103,7 @@
 set objformat [gcc_target_object_format]

 switch $objformat {
+coff { return "$symbol\[,\d\]*hidden" }
 elf  { return "hidden\[ \t_\]*$symbol" }
 mach-o   { return "private_extern\[ \t_\]*_?$symbol" }
 default  { return "" }

[gomp4] backport nvptx_exec cleanups

2016-11-07 Thread Cesar Philippidis

gomp-4_0-branch already contains the default nvptx runtime enhancements
that were recently applied to trunk.  However, I made some tweaks in
trunk involving both pthread locking and formatting changes, which this
patch backports.

I've applied this patch to gomp-4_0-branch.

Cesar
2016-11-07  Cesar Philippidis  

	libgomp/
	Backport from trunk
	2016-11-02  Cesar Philippidis  
		Nathan Sidwell  

	* plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes
	to determine default geometry.

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index bd18418..2e7b020 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -910,10 +910,11 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 	 variable to specify runtime defaults. */
   static int default_dims[GOMP_DIM_MAX];
 
+  pthread_mutex_lock (&ptx_dev_lock);
   if (!default_dims[0])
 	{
 	  /* We only read the environment variable once.  You can't
-	 change it in the middle of execution.  The sytntax  is
+	 change it in the middle of execution.  The syntax  is
 	 the same as for the -fopenacc-dim compilation option.  */
 	  const char *env_var = getenv ("GOMP_OPENACC_DIM");
 	  if (env_var)
@@ -942,15 +943,17 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 	  CUdevice dev = nvptx_thread()->ptx_dev->dev;
 	  /* 32 is the default for known hardware.  */
 	  int gang = 0, worker = 32, vector = 32;
+	  CUdevice_attribute cu_tpb, cu_ws, cu_mpc, cu_tpm;
 
-	  if (CUDA_SUCCESS == cuDeviceGetAttribute
-	  (&block_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, dev)
-	  && CUDA_SUCCESS == cuDeviceGetAttribute
-	  (&warp_size, CU_DEVICE_ATTRIBUTE_WARP_SIZE, dev)
-	  && CUDA_SUCCESS == cuDeviceGetAttribute
-	  (&dev_size, CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, dev)
-	  && CUDA_SUCCESS == cuDeviceGetAttribute
-	  (&cpu_size, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR, dev))
+	  cu_tpb = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK;
+	  cu_ws = CU_DEVICE_ATTRIBUTE_WARP_SIZE;
+	  cu_mpc = CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT;
+	  cu_tpm  = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR;
+
+	  if (cuDeviceGetAttribute (&block_size, cu_tpb, dev) == CUDA_SUCCESS
+	  && cuDeviceGetAttribute (&warp_size, cu_ws, dev) == CUDA_SUCCESS
+	  && cuDeviceGetAttribute (&dev_size, cu_mpc, dev) == CUDA_SUCCESS
+	  && cuDeviceGetAttribute (&cpu_size, cu_tpm, dev)  == CUDA_SUCCESS)
 	{
 	  GOMP_PLUGIN_debug (0, " warp_size=%d, block_size=%d,"
  " dev_size=%d, cpu_size=%d\n",
@@ -980,6 +983,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 			 default_dims[GOMP_DIM_WORKER],
 			 default_dims[GOMP_DIM_VECTOR]);
 	}
+  pthread_mutex_unlock (&ptx_dev_lock);
 
   for (i = 0; i != GOMP_DIM_MAX; i++)
 	if (!dims[i])
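
The locking added above protects a lazily initialized static table.  The same
pattern can be sketched stand-alone as follows; dims_lock, get_default_dims
and the EXAMPLE_DIM environment variable are hypothetical names chosen for
illustration and do not appear in libgomp.

```cpp
#include <cassert>
#include <cstdlib>
#include <pthread.h>

// Hypothetical stand-ins for the plugin's state; illustration only.
static pthread_mutex_t dims_lock = PTHREAD_MUTEX_INITIALIZER;
static int default_dims[3];   // zero-initialized; [0] == 0 means "unset"

// Copy the cached defaults into DIMS, computing them on first use.
void
get_default_dims (int dims[3])
{
  assert (dims != nullptr);

  pthread_mutex_lock (&dims_lock);
  if (!default_dims[0])
    {
      // Reached at most once per process: the environment is read a single
      // time and later callers see the cached values.
      const char *env = std::getenv ("EXAMPLE_DIM");  // hypothetical name
      int gang = env ? std::atoi (env) : 0;
      default_dims[0] = gang > 0 ? gang : 1;
      default_dims[1] = 32;
      default_dims[2] = 32;
    }
  for (int i = 0; i < 3; i++)
    dims[i] = default_dims[i];
  pthread_mutex_unlock (&dims_lock);
}
```

Without the lock, two threads could both observe default_dims[0] == 0 and
fill the table concurrently.  Unlike the patch, the sketch also copies the
values out while holding the lock, which is a simplification.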

Re: [PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

2016-11-07 Thread Bernd Schmidt


On 10/31/2016 08:56 PM, Dominik Vogt wrote:


combine_simplify_rtx() tries to replace rtx expressions with just two
possible values with an expression that uses if_then_else:

  (if_then_else (condition) (value1) (value2))

If the original expression is e.g.

  (and (reg) (const_int 2))


I'm not convinced that if_then_else_cond is the right place to do this. 
That function is designed to answer the question of whether an rtx has 
exactly one of two values and under which condition; I feel it should 
continue to work this way.


Maybe simplify_ternary_expression needs to be taught to deal with this case?


Bernd
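
To make the two forms under discussion concrete, here is a small sketch in
plain C++ rather than RTL.  The exact condition combine builds is not shown in
the quoted text, so the bit test used in ite_form is an assumption, and all
function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors (and (reg) (const_int 2)): mask out a single bit.
std::uint32_t
and_form (std::uint32_t reg)
{
  return reg & 2u;
}

// Mirrors (if_then_else (condition) (value1) (value2)), with the condition
// assumed to be "bit 1 of reg is set", value1 = 2 and value2 = 0.
std::uint32_t
ite_form (std::uint32_t reg)
{
  bool bit_set = (reg >> 1) & 1u;
  return bit_set ? 2u : 0u;
}

// Check that both forms agree over a range of inputs.
void
check_forms_agree ()
{
  for (std::uint32_t reg = 0; reg < 1024; reg++)
    assert (and_form (reg) == ite_form (reg));
}
```

The expression takes exactly two values, 0 and 2, which is why it qualifies
for the if_then_else rewrite in the first place.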

Re: [PATCH] Fix PR78229

2016-11-07 Thread Jakub Jelinek

On Mon, Nov 07, 2016 at 01:20:20PM +0100, Richard Biener wrote:
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
> and branch.
> 
> Richard.
> 
> 2016-11-07  Richard Biener  
> 
>   PR target/78229
>   * config/i386/i386.c (ix86_gimple_fold_builtin): Do not adjust
>   EH info.
> 
>   * g++.dg/pr78229.C: New testcase.

On the trunk there are 2 other spots with gsi_replace (..., true); in
the same function (and both of them are for functions which also should
really be ECF_NOTHROW, but aren't).

I haven't managed to create a simple testcase.  Anyway, here is a fix,
bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-07  Jakub Jelinek  

PR target/78229
* config/i386/i386.c (ix86_gimple_fold_builtin): Do not adjust  
   
EH info even for bzhi and pdep/pext.

--- gcc/config/i386/i386.c.jj   2016-11-07 18:32:14.0 +0100
+++ gcc/config/i386/i386.c  2016-11-07 18:46:10.901250307 +0100
@@ -33537,7 +33537,7 @@ ix86_gimple_fold_builtin (gimple_stmt_it
  location_t loc = gimple_location (stmt);
  gimple *g = gimple_build_assign (gimple_call_lhs (stmt), arg0);
  gimple_set_location (g, loc);
- gsi_replace (gsi, g, true);
+ gsi_replace (gsi, g, false);
  return true;
}
   break;
@@ -33554,7 +33554,7 @@ ix86_gimple_fold_builtin (gimple_stmt_it
  arg0 = gimple_call_arg (stmt, 0);
  gimple *g = gimple_build_assign (gimple_call_lhs (stmt), arg0);
  gimple_set_location (g, loc);
- gsi_replace (gsi, g, true);
+ gsi_replace (gsi, g, false);
  return true;
}
   break;


Jakub

[patch, committed, fortran] Final bug-fixing part of PR 78226

2016-11-07 Thread Thomas Koenig


OK, this is it (I hope).

Finding the different parts where location information was missing
is now complete (I hope), at least the testsuite passes with the
addition of the warnings.  I had a bit more time than expected today,
so I split this up the way I did.

I will submit a patch soon to always run the check for locus
information when checking is enabled.

Regards

Thomas

2016-11-07  Thomas Koenig  

PR fortran/78226
* expr.c (gfc_generate_initializer):  Add where to EXPR_NULL
statement.
* iresolve.c (gfc_resolve_extends_type_of):  Add where to
both arguments of the function.
* resolve.c (resolve_select_type):  Add where to the
second argument of the new statement.
Index: expr.c
===
--- expr.c	(Revision 241887)
+++ expr.c	(Arbeitskopie)
@@ -4367,6 +4367,7 @@ gfc_generate_initializer (gfc_typespec *ts, bool g
 	{
 	  ctor->expr = gfc_get_expr ();
 	  ctor->expr->expr_type = EXPR_NULL;
+	  ctor->expr->where = init->where;
 	  ctor->expr->ts = comp->ts;
 	}
 
Index: iresolve.c
===
--- iresolve.c	(Revision 241887)
+++ iresolve.c	(Arbeitskopie)
@@ -1044,9 +1044,12 @@ gfc_resolve_extends_type_of (gfc_expr *f, gfc_expr
 gfc_add_vptr_component (a);
   else if (a->ts.type == BT_DERIVED)
 {
+  locus where;
+
   vtab = gfc_find_derived_vtab (a->ts.u.derived);
   /* Clear the old expr.  */
   gfc_free_ref_list (a->ref);
+  where = a->where;
   memset (a, '\0', sizeof (gfc_expr));
   /* Construct a new one.  */
   a->expr_type = EXPR_VARIABLE;
@@ -1053,6 +1056,7 @@ gfc_resolve_extends_type_of (gfc_expr *f, gfc_expr
   st = gfc_find_symtree (vtab->ns->sym_root, vtab->name);
   a->symtree = st;
   a->ts = vtab->ts;
+  a->where = where;
 }
 
   /* Replace the second argument with the corresponding vtab.  */
@@ -1060,8 +1064,11 @@ gfc_resolve_extends_type_of (gfc_expr *f, gfc_expr
 gfc_add_vptr_component (mo);
   else if (mo->ts.type == BT_DERIVED)
 {
+  locus where;
+
   vtab = gfc_find_derived_vtab (mo->ts.u.derived);
   /* Clear the old expr.  */
+  where = mo->where;
   gfc_free_ref_list (mo->ref);
   memset (mo, '\0', sizeof (gfc_expr));
   /* Construct a new one.  */
@@ -1069,6 +1076,7 @@ gfc_resolve_extends_type_of (gfc_expr *f, gfc_expr
   st = gfc_find_symtree (vtab->ns->sym_root, vtab->name);
   mo->symtree = st;
   mo->ts = vtab->ts;
+  mo->where = where;
 }
 
   f->ts.type = BT_LOGICAL;
Index: resolve.c
===
--- resolve.c	(Revision 241909)
+++ resolve.c	(Arbeitskopie)
@@ -8863,6 +8863,7 @@ resolve_select_type (gfc_code *code, gfc_namespace
 	  st = gfc_find_symtree (vtab->ns->sym_root, vtab->name);
 	  new_st->expr1->value.function.actual->next = gfc_get_actual_arglist ();
 	  new_st->expr1->value.function.actual->next->expr = gfc_get_variable_expr (st);
+	  new_st->expr1->value.function.actual->next->expr->where = code->loc;
 	  new_st->next = body->next;
 	}
 	if (default_case->next)

[PATCH] print_rtx: implement support for reuse IDs (v2)

2016-11-07 Thread David Malcolm

On Tue, 2016-10-25 at 14:47 +0200, Bernd Schmidt wrote:
> On 10/21/2016 10:27 PM, David Malcolm wrote:
> > Thanks.  I attempted to use those fields of recog_data, but it
> > doesn't
> > seem to be exactly what's needed here.
> 
> Yeah, I may have been confused. I'm not sure that just looking at
> SCRATCHes is the right thing either, but I think you're on the right
> track, and we can use something like your patch for now and extend it
> later if necessary.
> 
> > + public:
> > +  rtx_reuse_manager ();
> > +  ~rtx_reuse_manager ();
> > +  static rtx_reuse_manager *get () { return singleton; }
> 
> OTOH, this setup looks a bit odd to me. Are you trying to avoid
> converting the print_rtx stuff to its own class, or avoid passing the
> reuse manager as an argument to a lot of functions?
>
> Some of this setup might not even be necessary. We have a "used" flag
> on
> rtx objects which is used to unshare RTL, and I think could also be
> used
> for a similar purpose when dumping. So, before printing, call
> reset_insn_used_flags on everything, then have another pass to set
> bits
> on everything that could conceivably be shared, and when you find
> something that already has the bit set, enter it into a table.
> Finally,
> print everything out, using the table. I think this would be somewhat
> simpler than adding another header file and class definition.

Now that we have a class rtx_writer, it's much clearer to drop the
singleton.

In this version I've eliminated the rtx_reuse_manager singleton,
instead allowing callers to pass a rtx_reuse_manager * to
rtx_writer's ctor.  This can be NULL, allowing most dumps to opt
out of the reuse-tracking, minimizing the risk of changing an
existing testcase; only print_rtl_function makes use of it (and
the selftests).

I eliminated print-rtl-reuse.h, moving class rtx_reuse_manager into
print-rtl.h and print-rtl.c

I kept the class rtx_reuse_manager, as it seems appropriate to
put responsibility for this aspect of dumping into its own class.
I attempted to move it into rtx_writer itself, but doing so made
the code less clear.
 
> > +void
> > +rtx_reuse_manager::preprocess (const_rtx x)
> > +{
> > +  subrtx_iterator::array_type array;
> > +  FOR_EACH_SUBRTX (iter, array, x, NONCONST)
> > +if (uses_rtx_reuse_p (*iter))
> > +  {
> > +   if (int *count = m_rtx_occurrence_count.get (*iter))
> > + {
> > +   if (*count == 1)
> > + {
> > +   m_rtx_reuse_ids.put (*iter, m_next_id++);
> > + }
> > +   (*count)++;
> > + }
> > +   else
> > + m_rtx_occurrence_count.put (*iter, 1);
> > +  }
> 
> Formatting rules suggest no braces around single statements, I think
> a
> more readable version of this would be:
> 
>   if (uses_rtx_reuse_p (*iter))
>     {
>       int *count = m_rtx_occurrence_count.get (*iter);
>       if (count)
>         {
>           if ((*count)++ == 1)
>             m_rtx_reuse_ids.put (*iter, m_next_id++);
>         }
>       else
>         m_rtx_occurrence_count.put (*iter, 1);
>     }
> 
> 
> Bernd

Fixed in the way you noted.
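
For readers following the thread outside the GCC tree, the counting pass
quoted above can be sketched stand-alone.  In this sketch std::unordered_map
stands in for GCC's hash_map and const void * for const_rtx; the class name
reuse_tracker and the choice to start IDs at 0 are assumptions made for
illustration.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical miniature of the occurrence-counting pass; not GCC code.
class reuse_tracker
{
public:
  // Record one occurrence of NODE.  The second occurrence assigns an ID;
  // later occurrences only bump the count.
  void preprocess (const void *node)
  {
    assert (node != nullptr);
    auto it = m_count.find (node);
    if (it == m_count.end ())
      m_count.emplace (node, 1);
    else if (it->second++ == 1)
      m_ids.emplace (node, m_next_id++);
  }

  // Return true and store the ID in *OUT if NODE was seen more than once.
  bool has_reuse_id (const void *node, int *out) const
  {
    auto it = m_ids.find (node);
    if (it == m_ids.end ())
      return false;
    *out = it->second;
    return true;
  }

private:
  std::unordered_map<const void *, int> m_count;
  std::unordered_map<const void *, int> m_ids;
  int m_next_id = 0;
};
```

A dumper built on this would call preprocess on every sub-expression first
and then, while printing, consult has_reuse_id to decide whether to emit a
definition prefix or a back-reference.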

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* config/i386/i386.c: Include print-rtl.h.
(selftest::ix86_test_dumping_memory_blockage): New function.
(selftest::ix86_run_selftests): Call it.
* print-rtl-function.c (print_rtx_function): Create an
rtx_reuse_manager and use it.
* print-rtl.c: Include "rtl-iter.h".
(rtx_writer::rtx_writer): Add reuse_manager param.
(rtx_reuse_manager::rtx_reuse_manager): New ctor.
(uses_rtx_reuse_p): New function.
(rtx_reuse_manager::preprocess): New function.
(rtx_reuse_manager::has_reuse_id): New function.
(rtx_reuse_manager::seen_def_p): New function.
(rtx_reuse_manager::set_seen_def): New function.
(rtx_writer::print_rtx): If "in_rtx" has a reuse ID, print it as a
prefix the first time in_rtx is seen, and print reuse_rtx
subsequently.
(print_inline_rtx): Supply NULL for new reuse_manager param.
(debug_rtx): Likewise.
(print_rtl): Likewise.
(print_rtl_single): Likewise.
(rtx_writer::print_rtl_single_with_indent): Likewise.
* print-rtl.h: Include bitmap.h when building for host.
(rtx_writer::rtx_writer): Add reuse_manager param.
(rtx_writer::m_rtx_reuse_manager): New field.
(class rtx_reuse_manager): New class.
* rtl-tests.c (selftest::assert_rtl_dump_eq): Add reuse_manager
param and use it when constructing rtx_writer.
(selftest::test_dumping_rtx_reuse): New function.
(selftest::rtl_tests_c_tests): Call it.
* selftest-rtl.h (class rtx_reuse_manager): New forward decl.
(selftest::assert_rtl_dump_eq): Add reuse_manager param.
(ASSERT_RTL_DUMP_EQ): Supply NULL for reuse_manager param.
(ASSERT_RTL_DUMP_EQ_WITH_REUSE): New macro.
---
 gcc/config/i386/i386.c

C++ PATCH to announce template instantiations if not -quiet

2016-11-07 Thread Jason Merrill

It occurred to me that a simple trace of template instantiations would
fit simply into the stream of function declarations that
announce_function prints when -quiet is not specified to the compiler.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit ae7b4a929fbd05de433451a1d92794d962366646
Author: Jason Merrill 
Date:   Fri Nov 4 09:22:32 2016 -0400

Add template instantiations to the announce_function stream.

* pt.c (push_tinst_level_loc): Add template instantiations to the
announce_function stream.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c8d4a06..f910d40 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -9170,6 +9170,13 @@ push_tinst_level_loc (tree d, location_t loc)
   if (limit_bad_template_recursion (d))
 return false;
 
+  /* When not -quiet, dump template instantiations other than functions, since
+ announce_function will take care of those.  */
+  if (!quiet_flag
+  && TREE_CODE (d) != TREE_LIST
+  && TREE_CODE (d) != FUNCTION_DECL)
+fprintf (stderr, " %s", decl_as_string (d, TFF_DECL_SPECIFIERS));
+
   new_level = ggc_alloc<tinst_level> ();
   new_level->decl = d;
   new_level->locus = loc;

[PATCH, rs6000] Modify include paths in config.gcc for Advance Toolchain builds

2016-11-07 Thread Peter Bergner

Gabriel and I have been tracking down an include path issue for GCC 6
Advance Toolchain builds (ie, --with-advance-toolchain=...).  The solution
that fixes the problem for us is to configure with --with-local-prefix=...
and removing the following hunk from config.gcc.  Gabriel has confirmed
this fixes his AT builds (native and cross) and I've verified that this
patch bootstraps with no regressions.

Is this ok for trunk and the GCC 6 branch?

Peter

* config.gcc (powerpc*-*-*, rs6000*-*-*): Remove setting of
INCLUDE_EXTRA_SPEC for Advance Toolchain builds.

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 241917)
+++ gcc/config.gcc  (working copy)
@@ -4137,16 +4137,6 @@ case "${target}" in
(at="/opt/$with_advance_toolchain"
 echo "/* Use Advance Toolchain $at */"
 echo
-echo "#ifndef USE_AT_INCLUDE_FILES"
-echo "#define USE_AT_INCLUDE_FILES 1"
-echo "#endif"
-echo
-echo "#if USE_AT_INCLUDE_FILES"
-echo "#undef  INCLUDE_EXTRA_SPEC"
-echo "#define INCLUDE_EXTRA_SPEC" \
- "\"-isystem $at/include\""
-echo "#endif"
-echo
 echo "#undef  LINK_OS_EXTRA_SPEC32"
 echo "#define LINK_OS_EXTRA_SPEC32" \
  "\"%(link_os_new_dtags)" \

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Jack Howarth

Iain,
 It certainly looks like you dropped a file here. The proposed
ChangeLog shows...

* config.in: Likewise.

but the previously proposed hunk from...

diff --git a/gcc/config.in b/gcc/config.in
index a736de3..a7ff3ee 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1934,6 +1934,18 @@
 #endif


+/* Define to 1 if ld64 supports '-export_dynamic'. */
+#ifndef USED_FOR_TARGET
+#undef LD64_HAS_EXPORT_DYNAMIC
+#endif
+
+
+/* Define to ld64 version. */
+#ifndef USED_FOR_TARGET
+#undef LD64_VERSION
+#endif
+
+
 /* Define to the linker option to ignore unused dependencies. */
 #ifndef USED_FOR_TARGET
 #undef LD_AS_NEEDED_OPTION

from PR71767-vs-240230 has gone missing. The current patch still
produces a compiler which triggers warnings of...

warning: section "__textcoal_nt" is deprecated

during the bootstrap until that hunk of the original patch is restored.
Jack

On Sun, Nov 6, 2016 at 2:39 PM, Iain Sandoe  wrote:
> Hi Folks,
>
> This is an initial patch in a series that converts Darwin's configury to 
> detect ld64 features, rather than the current process of hard-coding them on 
> target system version.
>
> This adds an option --with-ld64[=version] that allows the configurer to 
> specify that the Darwin ld64 linker is in use.  If the version is given then 
> that will be used to determine the capabilities of the linker in native and 
> canadian crosses.  For Darwin targets this flag will default to "on", since 
> such targets require an ld64-compatible linker.
>
> If a DEFAULT_LINKER is set via --with-ld= then this will also be tested to 
> see if it is ld64.
>
> The ld64 version is determined (unless overridden by --with-ld64=version) and 
> this is exported for use in setting a default value for -mtarget-linker 
> (needed for run-time code-gen changes to section choices).
>
> In this initial patch, support for -rdynamic is converted to be detected at 
> config time, or by the ld64 version if that is explicitly given (as an 
> example of usage).
>
> OK for trunk?
> OK for open branches?
> Iain
>
> gcc/
>
> 2016-11-06  Iain Sandoe  
>
>PR target/71767
> * configure.ac (with-ld64): New arg-with.  gcc_ld64_version: New,
> new test.  gcc_cv_ld64_export_dynamic: New, New test.
> * configure: Regenerate.
> * config.in: Likewise.
> * darwin.h: Use LD64_HAS_DYNAMIC export. DEF_LD64: New, define.
> * darwin10.h(DEF_LD64): Update for this target version.
> * darwin12.h(LINK_GCC_C_SEQUENCE_SPEC): Remove rdynamic test.
> (DEF_LD64): Update for this target version.
> ---
>  gcc/config/darwin.h   | 16 ++-
>  gcc/config/darwin10.h |  5 
>  gcc/config/darwin12.h |  7 -
>  gcc/configure.ac  | 74 
> +++
>  4 files changed, 100 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> index 045f70b..541bcb3 100644
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -165,6 +165,12 @@ extern GTY(()) int darwin_ms_struct;
> specifying the handling of options understood by generic Unix
> linkers, and for positional arguments like libraries.  */
>
> +#if LD64_HAS_EXPORT_DYNAMIC
> +#define DARWIN_EXPORT_DYNAMIC " %{rdynamic:-export_dynamic}"
> +#else
> +#define DARWIN_EXPORT_DYNAMIC " %{rdynamic: %nrdynamic is not supported}"
> +#endif
> +
>  #define LINK_COMMAND_SPEC_A \
> "%{!fdump=*:%{!fsyntax-only:%{!c:%{!M:%{!MM:%{!E:%{!S:\
>  %(linker)" \
> @@ -185,7 +191,9 @@ extern GTY(()) int darwin_ms_struct;
>  %{!nostdlib:%{!nodefaultlibs:\
>%{%:sanitize(address): -lasan } \
>%{%:sanitize(undefined): -lubsan } \
> -  %(link_ssp) %(link_gcc_c_sequence)\
> +  %(link_ssp) \
> +  " DARWIN_EXPORT_DYNAMIC " %<rdynamic \
> +  %(link_gcc_c_sequence) \
>  }}\
>  %{!nostdlib:%{!nostartfiles:%E}} %{T*} %{F*} }}}"
>
> @@ -932,4 +940,10 @@ extern void darwin_driver_init (unsigned int *,struct 
> cl_decoded_option **);
> fall-back default.  */
>  #define DEF_MIN_OSX_VERSION "10.5"
>
> +#ifndef LD64_VERSION
> +#define LD64_VERSION "85.2"
> +#else
> +#define DEF_LD64 LD64_VERSION
> +#endif
> +
>  #endif /* CONFIG_DARWIN_H */
> diff --git a/gcc/config/darwin10.h b/gcc/config/darwin10.h
> index 5829d78..a81fbdc 100644
> --- a/gcc/config/darwin10.h
> +++ b/gcc/config/darwin10.h
> @@ -32,3 +32,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  #undef DEF_MIN_OSX_VERSION
>  #define DEF_MIN_OSX_VERSION "10.6"
> +
> +#ifndef LD64_VERSION
> +#undef DEF_LD64
> +#define DEF_LD64 "97.7"
> +#endif
> diff --git a/gcc/config/darwin12.h b/gcc/config/darwin12.h
> index e366982..f88e2a4 100644
> --- a/gcc/config/darwin12.h
> +++ b/gcc/config/darwin12.h
> @@ -21,10 +21,15 @@ along with GCC; see the file COPYING3.  If not see
>  #undef  LINK_GCC_C_SEQUENCE_SPEC
>  #define LINK_GCC_C_SEQUENCE_SPEC \
>  "%:version-compare(>= 10.6 mmacosx-version-min=

Re: [PATCH v2] aarch64: Add split-stack initial support

2016-11-07 Thread Adhemerval Zanella

On 14/10/2016 15:59, Wilco Dijkstra wrote:
> Hi,
> 

Thanks for the thoughtful review and sorry for the late response.

>> Split-stack prologue on function entry is as follow (this goes before the
>> usual function prologue):
> 
>>  mrs x9, tpidr_el0
>>  mov x10, -
> 
> As Jiong already remarked, the nop won't work. Do we know the maximum 
> adjustment
> that the linker is allowed to make? If so, and we can limit the adjustment to 
> 16MB in
> most cases, emitting 2 subtracts is best. Larger offset need mov/movk/sub but 
> that
> should be extremely rare.

As far as I know, gold's split-stack handling places no limit on the
allocation size, although I think one could be added per backend (in
the method each backend is required to override).

In fact, the nop generation does not really need to be tied to the
instructions generated by 'aarch64_internal_mov_immediate'; doing so
is just a matter of simplifying the linker code.

And although adjustments above 16MB should be rare, the nilptr2.go test
allocates 134217824, so it fails with such a low stack limit.  I am not
sure how typical that stack usage is for Go, but I think we should at
least support the current testcase scenario.  So for this iteration I
kept my current approach, but I am open to suggestions.
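
To make the arithmetic concrete, here is a small standalone C++ sketch of the two-subtract split proposed in the review (N & 0xfff000 and N & 0xfff).  The helper names are hypothetical and this is not the code in the patch; it only shows why an adjustment of 134217824 cannot be covered by two 12-bit immediates.

```cpp
#include <cstdint>

// Two "sub" instructions with 12-bit immediates (one shifted left by 12)
// cover adjustments below 2^24, i.e. 16MB.
constexpr uint64_t two_sub_limit = uint64_t{1} << 24;

// The two immediates: sub x10, sp, HIGH; sub x10, x10, LOW.
struct split_adjust
{
  uint64_t high;
  uint64_t low;
};

// True if N can be expressed exactly as HIGH + LOW.
constexpr bool
fits_two_subs (uint64_t n)
{
  return n < two_sub_limit;
}

// Split N into the two immediates; only meaningful if fits_two_subs (N).
constexpr split_adjust
split (uint64_t n)
{
  return { n & 0xfff000, n & 0xfff };
}
```

Since 134217824 is 2^27 + 96, it is above the 2^24 limit, which is the case that needs the mov/movk sequence.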

> 
>>  nop/movk
> 
>>  add x10, sp, x10
>>  ldr x9, [x9, 16]
> 
> Is there any need to detect underflow of x10 or is there a guarantee that 
> stacks are
> never allocated in the low 2GB (given the maximum adjustment is 2GB)? It's 
> safe
> to do a signed comparison.

I do not think so; at least, none of the current backends that
implement split stack does so.

> 
>>  cmp x10, x9
>>  b.cs enough
> 
> Why save/restore x30 and the call x30+8 trick when we could pass the
> continuation address and use a tailcall? That also avoids emitting extra 
> unwind info.
> 
>>  stp x30, [sp, -16]
>>  bl __morestack
>>  ldp x30, [sp], 16
>>  ret
> 
> This part doesn't make any sense - both x28 and carry flag as an input, and 
> spread
> across the prolog - why???
> 
>> enough:
>>  mov x10, sp
>   [prolog]
>>  b.cs continue
>>  mov x10, x28
> continue:
>   [rest of function]
> 
> Why not do this?
> 
> function:
>   mrs x9, tpidr_el0
>   sub x10, sp, N & 0xfff000
>   sub x10, x10, N & 0xfff
>   ldr x9, [x9, 16]
>   adr x12, main_fn_entry
>   mov x11, sp   [if function has stacked arguments]
>   cmp x10, x9
>   b.ge main_fn_entry
>   b __morestack
> main_fn_entry: [x11 is argument pointer]
>   [prolog]
>   [rest of function]
> 
> In __morestack you need to save x8 as well (another argument register!) and 
> x12 (the 
> continuation address). After returning from the call x8 doesn't need to be 
> preserved.

Indeed, this strategy is much better, and I adjusted the code to follow it.
The only change is that I am using:

[...]
cmp x9, x10
b.lt main_fn_entry
b __morestack
[...]

so that I can issue a 'cmp , 0' on __morestack to indicate
the function was called.

> 
> There are several issues with unwinding in __morestack. x28 is not described 
> as a callee-save
> so will be corrupted if unwinding across a __morestack call. This won't 
> unwind correctly after
> the ldp as the unwinder will use the restored frame pointer to try to restore 
> x29/x30:
> 
> + ldp x29, x30, [x28, STACKFRAME_BASE]
> + ldr x28, [x28, STACKFRAME_BASE + 80]
> +
> + .cfi_remember_state
> + .cfi_restore 30
> + .cfi_restore 29
> + .cfi_def_cfa 31, 0

Indeed, it misses the x28 save/restore. I think I have added the missing bits,
but I must confess that I am not well versed in CFI directives.  I would
appreciate it if you could help me with this in the new version.

> 
> This stores a random x30 value on the stack, what is the purpose of this? 
> Nothing can unwind
> to here:
> 
> + # Start using new stack
> + stp x29, x30, [x0, -16]!
> + mov sp, x0
> 
> Also we no longer need split_stack_arg_pointer_used_p () or any code that 
> uses it (functions
> that don't have any arguments passed on the stack could omit the mov x11, sp).

Right, with the new strategy you proposed of doing a branch, this is indeed
no longer required.  I removed it in this new patch.

> 
> Wilco
> 
From dd2927aa5deb8d609c748014f3b566962fb852c5 Mon Sep 17 00:00:00 2001
From: Adhemerval Zanella 
Date: Wed, 4 May 2016 21:13:39 +
Subject: [PATCH 2/2] aarch64: Add split-stack initial support

This patch adds the split-stack support on aarch64 (PR #67877).  As for
other ports this patch should be used along with glibc and gold support.

The support is done similarly to other architectures: a __private_ss field is
added to the TCB in glibc, a target-specific __morestack implementation and
helper functions are added in libgcc, and compiler support is adjusted
(split-stack prologue, va_start for argument handling).  I also plan

[PATCH] A special predicate for type size equality

2016-11-07 Thread Martin Jambor

Hi,

this has been on my TODO list for at least two years, probably longer,
although I no longer remember why I added it there.  The idea is to
introduce a special wrapper around operand_equal_p for TYPE_SIZE
comparisons, which would try simple pointer equality before calling the
more complex operand_equal_p (TYPE_SIZE (t1), TYPE_SIZE (t2), 0), because
when equal, the sizes are most likely going to be the same tree anyway.

All users also test whether both TYPE_SIZEs are NULL, most of them to
test for known size equality, but unfortunately there is one (ODR
warning) that tests for known inequality.  Nevertheless, the former use
case seems so much more natural that I have outlined it into the new
predicate as well.

I am no longer sure whether this scenario happens often enough to
justify a wrapper, but I'd like to propose it anyway, at least to remove
it from the TODO list as a not-so-good-idea-after-all :-)
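
As a standalone illustration of the idea (not the patch itself), the shape of the predicate can be sketched with a toy type.  Here size_expr and deep_equal are hypothetical stand-ins for a TYPE_SIZE tree and for operand_equal_p respectively.

```cpp
#include <cstdint>

// Toy stand-in for a size expression; GCC trees are far richer.
struct size_expr
{
  int64_t value;
};

// Stand-in for operand_equal_p: the comparatively expensive structural check.
inline bool
deep_equal (const size_expr *a, const size_expr *b)
{
  return a->value == b->value;
}

// True only if both sizes are known and equal.  Identical pointers
// short-circuit the structural comparison.
inline bool
sizes_equal_p (const size_expr *a, const size_expr *b)
{
  if (!a || !b)
    return false;
  if (a == b)
    return true;
  return deep_equal (a, b);
}
```

Note that a false result covers both "known to differ" and "unknown", so a caller that needs known inequality still has to check for NULL sizes itself.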

Bootstrapped and tested on x86_64-linux.  Is it a good idea?  OK for
trunk?

Thanks,

Martin

2016-11-03  Martin Jambor  

* fold-const.c (type_sizes_equal_p): New function.
* fold-const.h (type_sizes_equal_p): Declare.
* ipa-devirt.c (odr_types_equivalent_p): Use it.
* ipa-polymorphic-call.c (meet_with): Likewise.
* tree-ssa-alias.c (stmt_kills_ref_p): Likewise.
---
 gcc/fold-const.c   | 19 +++
 gcc/fold-const.h   |  1 +
 gcc/ipa-devirt.c   |  2 +-
 gcc/ipa-polymorphic-call.c | 10 ++
 gcc/tree-ssa-alias.c   |  7 +--
 5 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 603aff0..ab77b8d 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -3342,6 +3342,25 @@ operand_equal_for_comparison_p (tree arg0, tree arg1, 
tree other)
 
   return 0;
 }
+
+/* Given two types, return true if both have a non-NULL TYPE_SIZE and these
+   sizes have the same value.  */
+
+bool
+type_sizes_equal_p (const_tree t1, const_tree t2)
+{
+  gcc_checking_assert (TYPE_P (t1));
+  gcc_checking_assert (TYPE_P (t2));
+  t1 = TYPE_SIZE (t1);
+  t2 = TYPE_SIZE (t2);
+
+  if (!t1 || !t2)
+return false;
+  else if (t1 == t2)
+return true;
+  else
+return operand_equal_p (t1, t2, 0);
+}
 
 /* See if ARG is an expression that is either a comparison or is performing
arithmetic on comparisons.  The comparisons must only be comparing
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index ae37142..014ca34 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -89,6 +89,7 @@ extern void fold_undefer_and_ignore_overflow_warnings (void);
 extern bool fold_deferring_overflow_warnings_p (void);
 extern void fold_overflow_warning (const char*, enum 
warn_strict_overflow_code);
 extern int operand_equal_p (const_tree, const_tree, unsigned int);
+extern bool type_sizes_equal_p (const_tree, const_tree);
 extern int multiple_of_p (tree, const_tree, const_tree);
 #define omit_one_operand(T1,T2,T3)\
omit_one_operand_loc (UNKNOWN_LOCATION, T1, T2, T3)
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 49e2195..d2db6f2 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -1671,7 +1671,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, bool 
*warned,
 
   /* Those are better to come last as they are utterly uninformative.  */
   if (TYPE_SIZE (t1) && TYPE_SIZE (t2)
-  && !operand_equal_p (TYPE_SIZE (t1), TYPE_SIZE (t2), 0))
+  && !type_sizes_equal_p (t1, t2))
 {
   warn_odr (t1, t2, NULL, NULL, warn, warned,
G_("a type with different size "
diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
index 8d9f22a..b66fd76 100644
--- a/gcc/ipa-polymorphic-call.c
+++ b/gcc/ipa-polymorphic-call.c
@@ -2454,10 +2454,7 @@ ipa_polymorphic_call_context::meet_with 
(ipa_polymorphic_call_context ctx,
   if (!dynamic
  && (ctx.dynamic
  || (!otr_type
- && (!TYPE_SIZE (ctx.outer_type)
- || !TYPE_SIZE (outer_type)
- || !operand_equal_p (TYPE_SIZE (ctx.outer_type),
-  TYPE_SIZE (outer_type), 0)
+ && (!type_sizes_equal_p (ctx.outer_type, outer_type)
{
  dynamic = true;
  updated = true;
@@ -2472,10 +2469,7 @@ ipa_polymorphic_call_context::meet_with 
(ipa_polymorphic_call_context ctx,
   if (!dynamic
  && (ctx.dynamic
  || (!otr_type
- && (!TYPE_SIZE (ctx.outer_type)
- || !TYPE_SIZE (outer_type)
- || !operand_equal_p (TYPE_SIZE (ctx.outer_type),
-  TYPE_SIZE (outer_type), 0)
+ && (!type_sizes_equal_p (ctx.outer_type, outer_type)
dynamic = true;
   outer_type = ctx.outer_type;
   offset = ctx.offset;
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index ebae6cf..98cd1d7 100644
---

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Jeff Law


On 11/07/2016 10:48 AM, Mike Stump wrote:

On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:

This is an initial patch in a series that converts Darwin's configury to detect 
ld64 features, rather than the current process of hard-coding them on target 
system version.


So, I really do hate to ask, but does this have to be a config option?  
Normally, we'd just have configure examine things by itself.  For canadian 
crosses, there should be enough state present to key off of directly, specially 
if they are wired up to work.

I'd rather have the thing that doesn't just work without that config flag,
just work.  I'd like to think I can figure out how to make it just work, if
given an idea of what doesn't actually work.

Essentially, you do the operation that doesn't work, detect it failed to work,
then you know it didn't work.

But how is that supposed to work in a cross environment when he can't 
directly query the linker's behavior?


In an ideal world we could trivially query the linker's behavior prior 
to invocation.  But we don't have that kind of infrastructure in place.


ISTM the way to go is to have a configure test to try and DTRT 
automatically for native builds and a flag to set for crosses (or 
potentially override the configure test).
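
The precedence described here can be sketched as a small decision function.  This is purely illustrative C++, not configure code: the function name and parameters are hypothetical, and an empty string stands for "not given" or "unknown".

```cpp
#include <string>

// Hypothetical decision helper; not part of GCC's configure machinery.
// FLAG_VALUE is what the user passed to the configure flag (empty if absent),
// PROBED_VERSION is what an automatic test found (empty if it found nothing).
inline std::string
pick_ld64_version (const std::string &flag_value, bool can_run_linker,
                   const std::string &probed_version)
{
  if (!flag_value.empty ())
    return flag_value;     // an explicit flag overrides the test
  if (can_run_linker)
    return probed_version; // native build: trust the configure test
  return std::string ();   // cross build without the flag: unknown
}
```

The design choice is simply that the flag is optional for native builds and the only source of information for crosses.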



Jeff

[hsa-branch] Remove superfluous lastprivate check

2016-11-07 Thread Martin Jambor

Hi,

this is another simple cleanup that I forgot to commit, which just
removes a lastprivate check (which hsa now can handle) at a place
where it cannot ever be anyway.

Committed to the hsa branch, will include it in the pile of OpenMP
stuff to request to merge to trunk later this week.

Thanks,

Martin


2016-11-07  Martin Jambor  

* omp-low.c (grid_target_follows_gridifiable_pattern): Do not
check for lastprivate clause on teams construct.
---
 gcc/omp-low.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ac87a91..65b0ddc 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17972,13 +17972,6 @@ grid_target_follows_gridifiable_pattern (gomp_target 
*target, grid_prop *grid)
 "clause is present\n ");
  return false;
 
-   case OMP_CLAUSE_LASTPRIVATE:
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, tloc,
-GRID_MISSED_MSG_PREFIX "a lastprivate "
-"clause is present\n ");
- return false;
-
case OMP_CLAUSE_THREAD_LIMIT:
  if (!integer_zerop (OMP_CLAUSE_OPERAND (clauses, 0)))
group_size = OMP_CLAUSE_OPERAND (clauses, 0);
-- 
2.10.1

[hsa-branch] Append UID to local variable names

2016-11-07 Thread Martin Jambor

Hi,

when looking at stuff to merge to trunk, I have found out that this
patch has slipped through the cracks.  It adds the UID to names of
private symbols so that variables with the same name but different
scope, particularly OpenMP re-mapped ones, do not clash.

Committed to the hsa branch, will include it in the merge to trunk
too.
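
The effect of the change can be illustrated outside GCC with a tiny sketch.  The helper name local_symbol_name is hypothetical and std::string replaces the obstack allocation used in hsa-gen.c; only the "name_UID" format is taken from the patch below.

```cpp
#include <string>

// Hypothetical helper: append a unique ID so that same-named locals from
// different scopes get distinct symbol names.
inline std::string
local_symbol_name (const std::string &name, unsigned uid)
{
  return name + "_" + std::to_string (uid);
}
```

Two variables both called "i" with different UIDs thus map to different symbols.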

Thanks,

Martin


2016-11-07  Martin Jambor  

* hsa-gen.c (hsa_get_declaration_name): Append UID to local variable
names.
---
 gcc/hsa-gen.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index b6e8345..f138434 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -781,7 +781,8 @@ hsa_needs_cvt (BrigType16_t dtype, BrigType16_t stype)
   return false;
 }
 
-/* Return declaration name if exists.  */
+/* Return declaration name if it exists or create one from UID if it does not.
+   If DECL is a local variable, make UID part of its name.  */
 
 const char *
 hsa_get_declaration_name (tree decl)
@@ -789,7 +790,7 @@ hsa_get_declaration_name (tree decl)
   if (!DECL_NAME (decl))
 {
   char buf[64];
-  snprintf (buf, 64, "__hsa_anon_%i", DECL_UID (decl));
+  snprintf (buf, 64, "__hsa_anon_%u", DECL_UID (decl));
   size_t len = strlen (buf);
   char *copy = (char *) obstack_alloc (&hsa_obstack, len + 1);
   memcpy (copy, buf, len + 1);
@@ -808,7 +809,19 @@ hsa_get_declaration_name (tree decl)
   if (name[0] == '*')
 name++;
 
-  return name;
+  if ((TREE_CODE (decl) == VAR_DECL)
+  && decl_function_context (decl))
+{
+  size_t len = strlen (name);
+  char *buf = (char *) alloca (len + 32);
+  snprintf (buf, len + 32, "%s_%u", name, DECL_UID (decl));
+  len = strlen (buf);
+  char *copy = (char *) obstack_alloc (&hsa_obstack, len + 1);
+  memcpy (copy, buf, len + 1);
+  return copy;
+}
+  else
+return name;
 }
 
 /* Lookup or create the associated hsa_symbol structure with a given VAR_DECL
-- 
2.10.1

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Joseph Myers

On Sun, 6 Nov 2016, Iain Sandoe wrote:

> This adds an option --with-ld64[=version] that allows the configurer to 

New configure options should be documented in install.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, Darwin] Fix PR57438 by avoiding empty function bodies and trailing labels.

2016-11-07 Thread Mike Stump

On Nov 6, 2016, at 12:13 PM, Iain Sandoe  wrote:
> 
> OK for trunk?
> OK for open branches?

For the darwin parts, Ok.

> 2016-11-06  Iain Sandoe  
> 
>   PR target/57438
>   * config/i386/i386.c (ix86_code_end): Note that we emitted code where 
> the
>   function might otherwise appear empty for picbase thunks.
>   (ix86_output_function_epilogue): If we find a zero-sized function 
> assume that
>   reaching it is UB and trap.  If we find a trailing label append a nop.
>   * config/rs6000/rs6000.c (rs6000_output_function_epilogue): If we find
>   a zero-sized function assume that reaching it is UB and trap.  If we 
> find a
>   trailing label, append a nop.
> 
> gcc/testsuite/
> 
> 2016-11-06  Iain Sandoe  
> 
>   PR target/57438
>   * gcc.dg/pr57438-1.c: New.
>   * gcc.dg/pr57438-2.c: New.

Re: [PATCH, Darwin] fix for PR67710 : Update 'as' specs and inputs to handle newer assembler versions.

2016-11-07 Thread Mike Stump

On Nov 6, 2016, at 12:53 PM, Iain Sandoe  wrote:
> OK for trunk?
> OK for open branches?

Ok.

> 2016-11-06  Iain Sandoe  
>   Rainer Orth  
> 
>   target/PR67710
>   * config.in: Regenerate
>   * config/darwin-driver.c (darwin_driver_init): Emit a version string 
> for the assembler.
>   * config/darwin.h(ASM_MMACOSX_VERSION_MIN_SPEC): New, new tests.
>   * config/darwin.opt(asm_macosx_version_min): New.
>   * config/i386/darwin.h: Handle ASM_MMACOSX_VERSION_MIN_SPEC.
>   * configure: Regenerate
>   * configure.ac: Check for mmacosx-version-min handling.
> 
> gcc/testsuite/
> 
> 2016-11-06  Iain Sandoe  
>   Rainer Orth  
> 
>   target/PR67710
>   *  gcc.dg/darwin-minversion-1.c: Update min version check.
>   *  gcc.dg/darwin-minversion-2.c: Likewise.
>   *  gcc.dg/darwin-minversion-3.c: Likewise.
> 
> libgcc/
> 
> 2016-11-06  Iain Sandoe  
>   Rainer Orth  
> 
>   target/PR67710
>   *  libgcc/config/t-darwin: Default builds to 10.5 codegen.

Re: [PATCH fix PR71767 3/4 : Darwin sections] Fix PR71767 - adjust the sections used in response to ld64 version.

2016-11-07 Thread Mike Stump

On Nov 6, 2016, at 11:40 AM, Iain Sandoe  wrote:
> 
> OK for trunk?
> OK for open branches?

Ok.

> 2016-11-06  Iain Sandoe  
> 
>   PR target/71767
>   * config/darwin-sections.def (picbase_thunk_section): New.
>   * config/darwin.c (darwin_init_sections): Set up picbase thunk section.
>   (darwin_rodata_section, darwin_objc2_section, machopic_select_section,
>   darwin_asm_declare_constant_name, darwin_emit_weak_or_comdat, 
>   darwin_function_section): Don’t use coalesced with newer linkers.
>   (darwin_override_options): Decide on usage of coalesced sections on the
>   basis of the target linker version.
>   * config/darwin.h (MIN_LD64_NO_COAL_SECTS): New.
>   * config/darwin.opt  (mtarget-linker): New.
>   * config/i386/i386.c (ix86_code_end): Do not force the thunks into a 
> coalesced
>   section, instead use a thunks section.

Re: [PATCH fix PR71767 4/4 : testsuite] Fix testsuite fallout from section and linker sym visibility changes.

2016-11-07 Thread Mike Stump

On Nov 6, 2016, at 11:41 AM, Iain Sandoe  wrote:
> OK for trunk (after the relevant patches are applied)?
> OK for open branches (likewise)?

Ok.

>   PR target/71767
> 
>   * g++.dg/abi/key2.C: Adjust for changed Darwin sections and 
> linker-visible symbols.
>   * g++.dg/torture/darwin-cfstring-3.C: Likewise.
>   * gcc.dg/const-uniq-1.c: Likewise.
>   * gcc.dg/torture/darwin-cfstring-3.c: Likewise.
>   * gcc.target/i386/pr70799-1.c: Likewise.

Re: [PATCH fix PR71767 1/4 : ld64 atoms] Make PIC indirections and constant labels linker-visible.

2016-11-07 Thread Mike Stump

On Nov 6, 2016, at 11:37 AM, Iain Sandoe  wrote:
> OK for trunk?
> OK for open branches?

Ok.

> 2016-11-06  Iain Sandoe  
> 
>   PR target/71767
>   * config/darwin.c (machopic_indirection_name): Make data section 
> indirections
>   linker-visible.
>   * config/darwin.h (ASM_GENERATE_INTERNAL_LABEL): Make local constant
>   labels linker-visible.

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Iain Sandoe

> On 7 Nov 2016, at 09:51, Mike Stump  wrote:
> 
> [ possible dup ]
> 
>> Begin forwarded message:
>> 
>> From: Mike Stump 
>> Subject: Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to 
>> be detected as Darwin's linker
>> Date: November 7, 2016 at 9:48:53 AM PST
>> To: Iain Sandoe 
>> Cc: GCC Patches , Jeff Law 
>> 
>> On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
>>> This is an initial patch in a series that converts Darwin's configury to 
>>> detect ld64 features, rather than the current process of hard-coding them 
>>> on target system version.
>> 
>> So, I really do hate to ask, but does this have to be a config option?  
>> Normally, we'd just have configure examine things by itself.  For canadian 
>> crosses, there should be enough state present to key off of directly, 
>> specially if they are wired up to work.
>> 
>> I'd rather have the thing that doesn't just work without that config flag, 
>> just work.  I'd like to think I can figure out how to make it just work, if 
>> given an idea of what doesn't actually work.
>> 
>> Essentially, you do the operation that doesn't work, detect it failed to 
>> work, then you know it didn't work.

Well, if you can run the tool, that’s fine - I wanted to cover the base where 
we have a native or canadian that’s using a newer ld64 than is installed by the 
‘last available xcode’ on a given platform - which is the common case (since 
the older versions of ld64 in particular don’t really support the features we 
want, they def. won’t support building LLVM for ex.).

I am *really really* trying to get away from the assumption that darwinNN 
implies some ld64 capability - because that’s just wrong, really - makes way 
too many assumptions.  I also want to get to the “end game” that we just 
configure *-*-darwin and use the cross-capability of the toolchain (we’re a 
ways away from that upstream, but my local patch set achieves it at least for 
5.4 and 6.2).

It’s true that adding configure options is not #1 choice in life - but I think 
darwin is getting to the stage where there are too many choices to cover 
without.

Open to alternate suggestions, of course
Iain

Re: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

2016-11-07 Thread Mike Stump

On Nov 1, 2016, at 8:46 AM, Tamar Christina  wrote:
> 
> A glibc update recently broke this test by adding a CPP
> macro that uses the ## string function which traditional-cpp
> does not support.
> The change in glibc that made the test fail is from
> 6962682ffe5e5f0373047a0b894fee7a774be254.
> 
> This fixes (PR78136) by changing the test to use a local
> include file instead of one from glibc.
> The intention of the test is to test that traditional-cpp does
> not expand values inside <> blocks of #includes.
> As such the include has to be included via <> syntax. To do this
> the .exp has been modified to add the test directory to the
> Include search path.
> 
> Ran regression tests on aarch64-none-linux-gnu.
> 
> Ok for trunk?

Ok.

Can you remove the comment: Newlib uses ## when including stdlib.h as of 
2007-09-07.  while you are at it?  I think it doesn't make any sense post the 
change unless one reads history.

> 2016-10-31  Tamar Christina  
> 
>   PR testsuite/78136
>   * gcc.dg/cpp/trad/trad.exp
>   (dg-runtest): Added $srcdir/$subdir/ to Include dirs.
>   * gcc.dg/cpp/trad/include.c: Use local header 
> file.

Re: [PATCH fix PR71767 2/4 : Darwin configury] Arrange for ld64 to be detected as Darwin's linker

2016-11-07 Thread Mike Stump

On Nov 6, 2016, at 11:39 AM, Iain Sandoe  wrote:
> This is an initial patch in a series that converts Darwin's configury to 
> detect ld64 features, rather than the current process of hard-coding them on 
> target system version.

So, I really do hate to ask, but does this have to be a config option?  
Normally, we'd just have configure examine things by itself.  For canadian 
crosses, there should be enough state present to key off of directly, specially 
if they are wired up to work.

I'd rather have the thing that doesn't just work without that config flag, 
just work.  I'd like to think I can figure out how to make it just work, if 
given an idea of what doesn't actually work.

Essentially, you do the operation that doesn't work, detect it failed to work, 
then you know it didn't work.

Re: [match.pd] Fix for PR35691

2016-11-07 Thread Prathamesh Kulkarni

On 7 November 2016 at 23:06, Prathamesh Kulkarni
 wrote:
> On 7 November 2016 at 15:43, Richard Biener  wrote:
>> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:
>>
>>> On 4 November 2016 at 13:41, Richard Biener  wrote:
>>> > On Thu, 3 Nov 2016, Marc Glisse wrote:
>>> >
>>> >> On Thu, 3 Nov 2016, Richard Biener wrote:
>>> >>
>>> >> > > > > The transform would also work for vectors (element_precision for
>>> >> > > > > the test but also a value-matching zero which should ensure the
>>> >> > > > > same number of elements).
>>> >> > > > Um sorry, I didn't get how to check vectors to be of equal length 
>>> >> > > > by a
>>> >> > > > matching zero.
>>> >> > > > Could you please elaborate on that ?
>>> >> > >
>>> >> > > He may have meant something like:
>>> >> > >
>>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>>> >> >
>>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which 
>>> >> > was the
>>> >> > point of the pattern.
>>> >>
>>> >> Oups, that's what I had written first, and then I somehow managed to 
>>> >> confuse
>>> >> myself enough to remove it so as to remove the call to types_match :-(
>>> >>
>>> >> > > So the last operand is checked with operand_equal_p instead of
>>> >> > > integer_zerop. But the fact that we could compute bit_ior on the
>>> >> > > comparison results should already imply that the number of elements 
>>> >> > > is the
>>> >> > > same.
>>> >> >
>>> >> > Though for equality compares we also allow scalar results IIRC.
>>> >>
>>> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate 
>>> >> one
>>> >> for a testcase, at least until the GIMPLE FE lands...
>>> >>
>>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe 
>>> >> > > some
>>> >> > > vector mode on s390?), it would be cool to do the same for floats 
>>> >> > > (most
>>> >> > > likely at the RTL level).
>>> >> >
>>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or we 
>>> >> > can
>>> >> > just allow bit-and/or on floats as much as we allow them on pointers.
>>> >>
>>> >> Would that generate sensible code on targets that do not have logic 
>>> >> insns for
>>> >> floats? Actually, even on x86_64 that generates inefficient code, so 
>>> >> there
>>> >> would be some work (for instance grep finds no gen_iordf3, only 
>>> >> gen_iorv2df3).
>>> >>
>>> >> I am also a bit wary of doing those obfuscating optimizations too 
>>> >> early...
>>> >> a==0 is something that other optimizations might use. long
>>> >> c=(long&)a|(long&)b; (double&)c==0; less so...
>>> >>
>>> >> (and I am assuming that signaling NaNs don't make the whole 
>>> >> transformation
>>> >> impossible, which might be wrong)
>>> >
>>> > Yeah.  I also think it's not so much important - I just wanted to mention
>>> > vectors...
>>> >
>>> > Btw, I still think we need a more sensible infrastructure for passes
>>> > to gather, analyze and modify complex conditions.  (I'm always pointing
>>> > to tree-affine.c as an, albeit not very good, example for handling
>>> > a similar problem)
>>> Thanks for mentioning the value-matching capture @@, I wasn't aware of
>>> this match.pd feature.
>>> The current patch keeps it restricted to only bitwise operators on integers.
>>> Bootstrap+test running on x86_64-unknown-linux-gnu.
>>> OK to commit if passes ?
>>
>> +/* PR35691: Transform
>> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
>> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
>> +
>>
>> Please omit the vertical space
>>
>> +(for bitop (bit_and bit_ior)
>> + cmp (eq ne)
>> + (simplify
>> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
>>
>> if you capture the first integer_zerop as @2 then you can re-use it...
>>
>> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>> +   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
>> +   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE
>> (@1)))
>> +(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));
>>
>> ... here in place of the { build_zero_cst ... }.
>>
>> Ok with those changes.
> Thanks, committed the attached version as r241915.
ugh, the svn commit message has:

testsuite/
* gcc.dg/pr35691-1.c: New test-case.
* gcc.dg/pr35691-4.c: Likewise.

pr35691-4.c was a typo, should be pr35691-2.c :/
However testsuite/ChangeLog correctly has entry for pr35691-2.c
Is it possible to edit the commit message for r241915 ?
Sorry about this.

Regards,
Prathamesh
>
>>
>> Richard.
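
To make the PR35691 transform concrete, here is a minimal C sketch of the two source-level forms and their folded equivalents. It assumes 32-bit operands of differing signedness (the case the value-matching capture was meant to allow); the function names are made up for illustration and are not part of the patch. The conversion is done towards the unsigned type so that it is well defined in plain C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original forms, as written in the PR35691 comment.  */
static int
both_zero (uint32_t x, int32_t y)
{
  return (x == 0) & (y == 0);
}

static int
either_nonzero (uint32_t x, int32_t y)
{
  return (x != 0) | (y != 0);
}

/* Folded forms: convert y to the type of x (same precision), OR the
   operands, and compare against zero once.  */
static int
both_zero_folded (uint32_t x, int32_t y)
{
  return (x | (uint32_t) y) == 0;
}

static int
either_nonzero_folded (uint32_t x, int32_t y)
{
  return (x | (uint32_t) y) != 0;
}

/* Assert that both forms agree on a small set of sample values.  */
static void
check_forms_agree (void)
{
  static const int32_t samples[] = { 0, 1, -1, 42, INT32_MIN, INT32_MAX };
  const size_t n = sizeof samples / sizeof samples[0];

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        uint32_t x = (uint32_t) samples[i];
        int32_t y = samples[j];
        assert (both_zero (x, y) == both_zero_folded (x, y));
        assert (either_nonzero (x, y) == either_nonzero_folded (x, y));
      }
}
```

The equivalence relies on both operands having the same precision, which is why the pattern checks TYPE_PRECISION before applying the conversion.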

Re: [match.pd] Fix for PR35691

2016-11-07 Thread Prathamesh Kulkarni

On 7 November 2016 at 15:43, Richard Biener  wrote:
> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 4 November 2016 at 13:41, Richard Biener  wrote:
>> > On Thu, 3 Nov 2016, Marc Glisse wrote:
>> >
>> >> On Thu, 3 Nov 2016, Richard Biener wrote:
>> >>
>> >> > > > > The transform would also work for vectors (element_precision for
>> >> > > > > the test but also a value-matching zero which should ensure the
>> >> > > > > same number of elements).
>> >> > > > Um sorry, I didn't get how to check vectors to be of equal length 
>> >> > > > by a
>> >> > > > matching zero.
>> >> > > > Could you please elaborate on that ?
>> >> > >
>> >> > > He may have meant something like:
>> >> > >
>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>> >> >
>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was 
>> >> > the
>> >> > point of the pattern.
>> >>
>> >> Oups, that's what I had written first, and then I somehow managed to 
>> >> confuse
>> >> myself enough to remove it so as to remove the call to types_match :-(
>> >>
>> >> > > So the last operand is checked with operand_equal_p instead of
>> >> > > integer_zerop. But the fact that we could compute bit_ior on the
>> >> > > comparison results should already imply that the number of elements 
>> >> > > is the
>> >> > > same.
>> >> >
>> >> > Though for equality compares we also allow scalar results IIRC.
>> >>
>> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate 
>> >> one
>> >> for a testcase, at least until the GIMPLE FE lands...
>> >>
>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe 
>> >> > > some
>> >> > > vector mode on s390?), it would be cool to do the same for floats 
>> >> > > (most
>> >> > > likely at the RTL level).
>> >> >
>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or we 
>> >> > can
>> >> > just allow bit-and/or on floats as much as we allow them on pointers.
>> >>
>> >> Would that generate sensible code on targets that do not have logic insns 
>> >> for
>> >> floats? Actually, even on x86_64 that generates inefficient code, so there
>> >> would be some work (for instance grep finds no gen_iordf3, only 
>> >> gen_iorv2df3).
>> >>
>> >> I am also a bit wary of doing those obfuscating optimizations too early...
>> >> a==0 is something that other optimizations might use. long
>> >> c=(long&)a|(long&)b; (double&)c==0; less so...
>> >>
>> >> (and I am assuming that signaling NaNs don't make the whole transformation
>> >> impossible, which might be wrong)
>> >
>> > Yeah.  I also think it's not so much important - I just wanted to mention
>> > vectors...
>> >
>> > Btw, I still think we need a more sensible infrastructure for passes
>> > to gather, analyze and modify complex conditions.  (I'm always pointing
>> > to tree-affine.c as an, albeit not very good, example for handling
>> > a similar problem)
>> Thanks for mentioning the value-matching capture @@, I wasn't aware of
>> this match.pd feature.
>> The current patch keeps it restricted to only bitwise operators on integers.
>> Bootstrap+test running on x86_64-unknown-linux-gnu.
>> OK to commit if passes ?
>
> +/* PR35691: Transform
> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
> +
>
> Please omit the vertical space
>
> +(for bitop (bit_and bit_ior)
> + cmp (eq ne)
> + (simplify
> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
>
> if you capture the first integer_zerop as @2 then you can re-use it...
>
> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
> +   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE
> (@1)))
> +(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));
>
> ... here in place of the { build_zero_cst ... }.
>
> Ok with those changes.
Thanks, committed the attached version as r241915.

>
> Richard.
2016-11-07  Prathamesh Kulkarni  

PR middle-end/35691
* match.pd: Add following two patterns:
(x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
(x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.

testsuite/
* gcc.dg/pr35691-1.c: New test-case.
* gcc.dg/pr35691-4.c: Likewise.

diff --git a/gcc/match.pd b/gcc/match.pd
index 48f7351..29ddcd8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -519,6 +519,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (TYPE_UNSIGNED (type))
 (bit_and @0 (bit_not (lshift { build_all_ones_cst (type); } @1)
 
+/* PR35691: Transform
+   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
+   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
+(for bitop (bit_and bit_ior)
+ cmp (eq ne)
+ (simplify
+  (bitop (cmp @0 integer_zerop@2) (cmp @1 integer_zerop))
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION

Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy

2016-11-07 Thread Jiong Wang




On 07/11/16 17:04, Bernd Schmidt wrote:

On 11/03/2016 03:00 PM, Eric Botcazou wrote:

FWIW here's a more complete version of my patch which I'm currently
testing. Let me know if you think it's at least a good enough
intermediate step to be installed.


It is, thanks.


Testing showed the same issue as Jiong found, so I've committed it 
with that extra tweak.


Thanks very much!  I have closed PR middle-end/78016

Regards,
Jiong

Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy

2016-11-07 Thread Bernd Schmidt


On 11/03/2016 03:00 PM, Eric Botcazou wrote:

FWIW here's a more complete version of my patch which I'm currently
testing. Let me know if you think it's at least a good enough
intermediate step to be installed.


It is, thanks.


Testing showed the same issue as Jiong found, so I've committed it with 
that extra tweak.



Bernd

Re: [PATCH 0/2] strncmp builtin expansion improvement

2016-11-07 Thread Aaron Sawdey

On Mon, 2016-11-07 at 15:26 +0100, Richard Biener wrote:
> Your patchset doesn't contain a testcase so I really wonder which
> case
> we know the string length but it is not constant.
> 
> Yes, there's COND_EXPR handling in c_strlen but that should be mostly
> dead code -- the real code should be using get_maxval_strlen or
> get_range_strlen but c_strlen does not use those.
> 
> Ideally the str optabs would get profile data and alignment similar
> to
> the mem ones.
> 
> Care to share a testcase?

I think I haven't explained this well. The case I am interested in is
where the string arguments are indeed of unknown length, but the length
argument to strncmp is a constant. This is the case that I'm attempting
to address with this patch series.

This is from the strncmp-1.c test case, but modified for a constant
length argument to strncmp.

#include <stdlib.h>
#include <string.h>

void
test (const unsigned char *s1, const unsigned char *s2, int expected)
{
  register int value = strncmp ((char *) s1, (char *) s2, 5);

  if (expected < 0 && value >= 0)
    abort ();
  else if (expected == 0 && value != 0)
    abort ();
  else if (expected > 0 && value <= 0)
    abort ();
}

I added this small bit to builtins.c so we can see what happens:

Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 241911)
+++ gcc/builtins.c  (working copy)
@@ -67,6 +67,7 @@
 #include "internal-fn.h"
 #include "case-cfn-macros.h"
 #include "gimple-fold.h"
+#include "print-tree.h"

 struct target_builtins default_target_builtins;
@@ -3932,6 +3933,9 @@
 len1 = c_strlen (arg1, 1);
 len2 = c_strlen (arg2, 1);

+printf("len1 = %p len2 = %p\n",(void*)len1,(void*)len2);
+debug_tree(arg3);
+
 if (len1)
   len1 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len1);
 if (len2)

The output then is as follows:

build/gcc/xgcc -B build/gcc -S -O1 strncmp-test.c
len1 = (nil) len2 = (nil)

constant 5>

Looking in the .s file you can see that strncmp was not expanded.
However the current code in i386.md for cmpstrnsi does not handle the
case where the 0 byte in both strings may occur before the length given
to strncmp.

test:
.LFB22:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movl    %edx, %ebx
        movl    $5, %edx
        call    strncmp
        movl    %ebx, %edx

I think it's pretty clear from the code in expand_builtin_strncmp that
if len1 and len2 are both NULL, you end up with len=len2 and then it
returns NULL_RTX.

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
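
As a rough illustration of the case described above (strings of unknown length, constant length argument), the following C sketch shows what an inline expansion of strncmp (s1, s2, 5) has to compute, including the early exit when both strings hit their terminating 0 byte before the length limit. The helper names are hypothetical and this is not the code from the patch series; it only models the semantics at the source level.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical source-level model of strncmp (s1, s2, 5): compare as
   unsigned char, stop at the first difference or at a shared NUL.  */
static int
strncmp_const5 (const char *s1, const char *s2)
{
  for (int i = 0; i < 5; i++)
    {
      unsigned char c1 = (unsigned char) s1[i];
      unsigned char c2 = (unsigned char) s2[i];
      if (c1 != c2)
        return c1 < c2 ? -1 : 1;
      if (c1 == '\0')
        return 0;   /* Both strings ended before the length limit.  */
    }
  return 0;
}

/* Reduce a comparison result to -1, 0 or 1.  */
static int
sign (int v)
{
  return (v > 0) - (v < 0);
}

/* Assert that the model agrees in sign with the library strncmp.  */
static void
check_against_libc (const char *a, const char *b)
{
  assert (sign (strncmp_const5 (a, b)) == sign (strncmp (a, b, 5)));
}
```

The check for a shared NUL inside the loop is the part that matters here: without it, the comparison would read past the end of short strings.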

Re: [PATCH] Make direct emission of time profiler counter

2016-11-07 Thread Christophe Lyon

On 7 November 2016 at 09:58, Martin Liška  wrote:
> On 11/05/2016 09:38 AM, Jan Hubicka wrote:
>> Looks OK if it passes.
>>
>> Honza
>
> Thanks, fixed on trunk as r241894.
> Martin

Thanks, this fixed the problems I reported.

Christophe

[PATCH,testsuite] MIPS: Upgrade to MIPS IV if using (HAS_MOVN) with MIPS III.

2016-11-07 Thread Toma Tabacu

Hi,

The (HAS_MOVN) option should cause an upgrade to MIPS IV if the target is
pre-MIPS IV. However, the upgrade condition checks for "$isa < 3", which means
that we won't upgrade if we're targeting MIPS III.

This results in failures for the movcc-{1,2,3}.c and branch-cost-2.c tests
when the target is MIPS III.

This patch fixes the condition to include MIPS III.

Tested with mips-mti-elf.

Regards,
Toma Tabacu

gcc/testsuite/ChangeLog:

2016-11-07  Toma Tabacu  

* gcc.target/mips/mips.exp (mips-dg-options): Upgrade to MIPS IV if
using (HAS_MOVN) with MIPS III.

diff --git a/gcc/testsuite/gcc.target/mips/mips.exp 
b/gcc/testsuite/gcc.target/mips/mips.exp
index 39f44ff..e22d782 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -1129,7 +1129,7 @@ proc mips-dg-options { args } {
 # We need MIPS IV or higher for:
#
#
-   } elseif { $isa < 3
+   } elseif { $isa < 4
   && [mips_have_test_option_p options "HAS_MOVN"] } {
mips_make_test_option options "-mips4"
 # We need MIPS III or higher for:

Re: Fix build of jit (was Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v3))

2016-11-07 Thread Jakub Jelinek

On Mon, Nov 07, 2016 at 11:07:13AM -0500, David Malcolm wrote:
> The patch (r241896) introduced an error in the build of the jit:
> 
> ../../src/gcc/jit/jit-builtins.c:62:1: error: invalid conversion from
> ‘int’ to ‘gcc::jit::built_in_attribute’ [-fpermissive]
>  };
>  ^
> 
> which seems to be due to the "0" for ATTRS in:
> 
> --- a/gcc/sanitizer.def
> +++ b/gcc/sanitizer.def
> @@ -165,6 +165,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
>  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
> "__asan_after_dynamic_init",
> BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
> +   BT_FN_VOID_PTR_PTRMODE, 0)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, 
> "__asan_unpoison_stack_memory",
> +   BT_FN_VOID_PTR_PTRMODE, 0)

I believe the 0 here is a bug, I'd think we should be using something like
ATTR_TMPURE_NOTHROW_LEAF_LIST that we are using for __asan_load* - the functions
aren't going to throw, nor call anything in the current TU.  Not 100% sure
about the TMPURE, after all they do write/read memory (the shadow one).
So maybe ATTR_NOTHROW_LEAF_LIST instead for now?  Martin?

> Is the attached patch OK as a fix? (assuming testing passes)  Or should
> these builtins have other attrs?  (sorry, am not very familiar with the
> sanitizer code).

Jakub

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-07 Thread Bernd Schmidt


On 11/04/2016 05:35 PM, Jakub Jelinek wrote:


2016-11-04  Jakub Jelinek  

PR target/77834
* dse.c (dse_step5): Call scan_reads even if just
insn_info->frame_read.  Improve and fix dump file messages.


Sounds reasonable, and I checked and it seems not to change code 
generation for any .i files from my collection. So, OK.



Bernd

Fix build of jit (was Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v3))

2016-11-07 Thread David Malcolm

On Mon, 2016-11-07 at 11:03 +0100, Martin Liška wrote:
> Hello.
> 
> After discussion with Jakub, I'm resending new version of the patch,
> where I changed following:
> 1) gimplify_ctxp->live_switch_vars is used to track variables
> introduced in switch_expr. Every time
>a case_label_expr is seen, these are unpoisoned. It's quite
> conservative, however it covers all
>corner cases one can come up with. Compared to clang, we are much
> more precise in switch statements
>where a variable liveness crosses label boundary.
> 2) I found a bug where ASAN_CHECK was optimized out due to missing
> check of IFN_ASAN_MARK internal fn.
>Test was added for that.
> 3) Multiple switch tests have been added, which is going to be sent
> in upcoming email.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression
> tests (+ asan bootstrap finishes
> successfully).

The patch (r241896) introduced an error in the build of the jit:

../../src/gcc/jit/jit-builtins.c:62:1: error: invalid conversion from
‘int’ to ‘gcc::jit::built_in_attribute’ [-fpermissive]
 };
 ^

which seems to be due to the "0" for ATTRS in:

--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -165,6 +165,10 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
  "__asan_after_dynamic_init",
  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
+ BT_FN_VOID_PTR_PTRMODE, 0)
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, 
"__asan_unpoison_stack_memory",
+ BT_FN_VOID_PTR_PTRMODE, 0)

Is the attached patch OK as a fix? (assuming testing passes)  Or should
these builtins have other attrs?  (sorry, am not very familiar with the
sanitizer code).

Dave
From 6db5f9e50dc95f504d33970ee553172bbf400ae7 Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Mon, 7 Nov 2016 11:21:20 -0500
Subject: [PATCH] Fix build of jit

gcc/ChangeLog:
	* asan.c (ATTR_NULL): Define.
	* sanitizer.def (BUILT_IN_ASAN_CLOBBER_N): Use ATTR_NULL rather
	than 0.
	(BUILT_IN_ASAN_UNCLOBBER_N): Likewise.
---
 gcc/asan.c| 2 ++
 gcc/sanitizer.def | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 1e0ce8d..4a124cb 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2463,6 +2463,8 @@ initialize_sanitizer_builtins (void)
 #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
 #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
 #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
+#undef ATTR_NULL
+#define ATTR_NULL 0
 #undef ATTR_NOTHROW_LEAF_LIST
 #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
 #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 1c142e9..596b8b0 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -166,9 +166,9 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
 		  "__asan_after_dynamic_init",
 		  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_CLOBBER_N, "__asan_poison_stack_memory",
-		  BT_FN_VOID_PTR_PTRMODE, 0)
+		  BT_FN_VOID_PTR_PTRMODE, ATTR_NULL)
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNCLOBBER_N, "__asan_unpoison_stack_memory",
-		  BT_FN_VOID_PTR_PTRMODE, 0)
+		  BT_FN_VOID_PTR_PTRMODE, ATTR_NULL)
 
 /* Thread Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init", 
-- 
1.8.5.3

[PING, PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

2016-11-07 Thread Dominik Vogt

Ping.

https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02525.html

On Mon, Oct 31, 2016 at 08:56:10PM +0100, Dominik Vogt wrote:
> The attached patch does a little change in
> combine.c:combine_simplify_rtx() to prevent a "simplification"
> where the rtl code gets more complex in reality.  The complete
> description of the change can be found in the commit comment in
> the attached patch.
> 
> The patch reduces the number of patterns in the s390 backend and
> slightly reduces the size of the compiled SPEC2006 code.  (Code
> size or runtime only tested on s390x with -m64.)  It is
> theoretically possible that this patch leads to somewhat worse
> code on some target if that only has a pattern for the formerly replaced
> rtl expression but not for the original one.
> 
> The patch has passed the testsuite on s390, s390x biarch, x86_64
> and Power biarch.
> 
> --
> 
> (I'm not sure whether the const_int expression can appear in both
> operands or only as the second.  If the latter is the case, the
> conditions can be simplified a bit.)
> 
> What do you think about this patch?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [patch, fortran, committed] Fill in some more locations

2016-11-07 Thread Thomas Koenig


Am 07.11.2016 um 16:25 schrieb Thomas Koenig:


PR fortran/78826


... should have been PR 78226.

[patch, fortran, committed] Fill in some more locations

2016-11-07 Thread Thomas Koenig


Hello world,

I have committed the little patchlet below as obvious, after
regression-testing.

Regards

Thomas

2016-11-07  Thomas Koenig  

PR fortran/78826
* match.c (gfc_match_select_type):  Add where for expr1.
* resolve.c (resolve_select_type): Add where for expr1 of new
statement.
Index: match.c
===
--- match.c	(Revision 241887)
+++ match.c	(Arbeitskopie)
@@ -5898,6 +5898,7 @@ gfc_match_select_type (void)
 {
   expr1 = gfc_get_expr ();
   expr1->expr_type = EXPR_VARIABLE;
+  expr1->where = expr2->where;
   if (gfc_get_sym_tree (name, NULL, &expr1->symtree, false))
 	{
 	  m = MATCH_ERROR;
Index: resolve.c
===
--- resolve.c	(Revision 241887)
+++ resolve.c	(Arbeitskopie)
@@ -8857,6 +8857,7 @@ resolve_select_type (gfc_code *code, gfc_namespace
 	  new_st->expr1->value.function.actual = gfc_get_actual_arglist ();
 	  new_st->expr1->value.function.actual->expr = gfc_get_variable_expr (selector_expr->symtree);
 	  new_st->expr1->value.function.actual->expr->where = code->loc;
+	  new_st->expr1->where = code->loc;
 	  gfc_add_vptr_component (new_st->expr1->value.function.actual->expr);
 	  vtab = gfc_find_derived_vtab (body->ext.block.case_list->ts.u.derived);
 	  st = gfc_find_symtree (vtab->ns->sym_root, vtab->name);

Re: [PATCH] rtx_writer: avoid printing trailing default values

2016-11-07 Thread David Malcolm

On Fri, 2016-11-04 at 20:40 +0100, Bernd Schmidt wrote:
> On 11/04/2016 08:25 PM, David Malcolm wrote:
> 
> >   return m_compact;
> 
> Ok with this one plus a comment.
> 

Thanks.

Using m_compact required turning the static function into a (private)
member function.  For reference, here's what I committed (r241908),
having verified bootstrap   
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 241907)
+++ gcc/ChangeLog	(revision 241908)
@@ -1,3 +1,16 @@
+2016-11-07  David Malcolm  
+
+	* print-rtl.c (rtx_writer::operand_has_default_value_p): New
+	method.
+	(rtx_writer::print_rtx): In compact mode, omit trailing operands
+	that have the default values.
+	* print-rtl.h (rtx_writer::operand_has_default_value_p): New
+	method.
+	* rtl-tests.c (selftest::test_dumping_insns): Remove empty
+	label string from expected dump.
+	(seltest::test_uncond_jump): Remove trailing "(nil)" for REG_NOTES
+	from expected dump.
+
 2016-11-07  Jakub Jelinek  
 
 	PR target/77834
Index: gcc/print-rtl.c
===
--- gcc/print-rtl.c	(revision 241907)
+++ gcc/print-rtl.c	(revision 241908)
@@ -564,6 +564,43 @@
 }
 }
 
+/* Subroutine of rtx_writer::print_rtx.
+   In compact mode, determine if operand IDX of IN_RTX is interesting
+   to dump, or (if in a trailing position) it can be omitted.  */
+
+bool
+rtx_writer::operand_has_default_value_p (const_rtx in_rtx, int idx)
+{
+  const char *format_ptr = GET_RTX_FORMAT (GET_CODE (in_rtx));
+
+  switch (format_ptr[idx])
+{
+case 'e':
+case 'u':
+  return XEXP (in_rtx, idx) == NULL_RTX;
+
+case 's':
+  return XSTR (in_rtx, idx) == NULL;
+
+case '0':
+  switch (GET_CODE (in_rtx))
+	{
+	case JUMP_INSN:
+	  /* JUMP_LABELs are always omitted in compact mode, so treat
+	 any value here as omittable, so that earlier operands can
+	 potentially be omitted also.  */
+	  return m_compact;
+
+	default:
+	  return false;
+
+	}
+
+default:
+  return false;
+}
+}
+
 /* Print IN_RTX onto m_outfile.  This is the recursive part of printing.  */
 
 void
@@ -681,9 +718,18 @@
 	fprintf (m_outfile, " %d", INSN_UID (in_rtx));
 }
 
+  /* Determine which is the final operand to print.
+ In compact mode, skip trailing operands that have the default values
+ e.g. trailing "(nil)" values.  */
+  int limit = GET_RTX_LENGTH (GET_CODE (in_rtx));
+  if (m_compact)
+while (limit > idx && operand_has_default_value_p (in_rtx, limit - 1))
+  limit--;
+
   /* Get the format string and skip the first elements if we have handled
  them already.  */
-  for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++)
+
+  for (; idx < limit; idx++)
 print_rtx_operand (in_rtx, idx);
 
   switch (GET_CODE (in_rtx))
Index: gcc/print-rtl.h
===
--- gcc/print-rtl.h	(revision 241907)
+++ gcc/print-rtl.h	(revision 241908)
@@ -39,6 +39,7 @@
   void print_rtx_operand_code_r (const_rtx in_rtx);
   void print_rtx_operand_code_u (const_rtx in_rtx, int idx);
   void print_rtx_operand (const_rtx in_rtx, int idx);
+  bool operand_has_default_value_p (const_rtx in_rtx, int idx);
 
  private:
   FILE *m_outfile;
Index: gcc/rtl-tests.c
===
--- gcc/rtl-tests.c	(revision 241907)
+++ gcc/rtl-tests.c	(revision 241908)
@@ -122,7 +122,7 @@
   /* Labels.  */
   rtx_insn *label = gen_label_rtx ();
   CODE_LABEL_NUMBER (label) = 42;
-  ASSERT_RTL_DUMP_EQ ("(clabel 0 42 \"\")\n", label);
+  ASSERT_RTL_DUMP_EQ ("(clabel 0 42)\n", label);
 
   LABEL_NAME (label)= "some_label";
   ASSERT_RTL_DUMP_EQ ("(clabel 0 42 (\"some_label\"))\n", label);
@@ -176,8 +176,7 @@
   ASSERT_TRUE (control_flow_insn_p (jump_insn));
 
   ASSERT_RTL_DUMP_EQ ("(cjump_insn 1 (set (pc)\n"
-		  "(label_ref 0))\n"
-		  " (nil))\n",
+		  "(label_ref 0)))\n",
 		  jump_insn);
 }
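
The trailing-operand trimming in rtx_writer::print_rtx can be sketched in isolation as follows. This is a simplified model, assuming operands are represented as an array of strings in which NULL stands for a default value such as "(nil)"; the function name and representation are invented for illustration and do not exist in print-rtl.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: return the index one past the last operand worth
   printing, given that operands before IDX were already handled.
   A NULL entry stands for an operand with its default value.  */
static size_t
trimmed_limit (const char *const *ops, size_t idx, size_t len)
{
  size_t limit = len;

  /* Walk backwards over trailing defaults, but never below IDX.  */
  while (limit > idx && ops[limit - 1] == NULL)
    limit--;
  return limit;
}

/* Small self-check: a default in the middle is kept, trailing ones go.  */
static void
check_trimmed_limit (void)
{
  static const char *const ops[] = { "a", NULL, "b", NULL, NULL };
  assert (trimmed_limit (ops, 0, 5) == 3);
}
```

Only trailing defaults are dropped; a default operand followed by a non-default one still has to be printed so that operand positions stay unambiguous.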

Re: Simplify X / X, 0 / X and X % X

2016-11-07 Thread Jeff Law


On 11/07/2016 03:02 AM, Richard Biener wrote:

On Sat, Nov 5, 2016 at 3:30 AM, Jeff Law  wrote:

On 11/04/2016 02:07 PM, Marc Glisse wrote:


Hello,

since we were discussing this recently...

The condition is copied from the existing 0 % X case, visible in the
context of the diff.

As far as I understand, the main case where we do not want to optimize
is during constexpr evaluation in the C++ front-end (it wants to detect
the undefined behavior), and with late folding I think this means we
only need to care about an explicit 0/0, not about X/X where X would
become 0 after the simplification.

And later, if we do have something like X/0, we could handle it the same
way as we currently handle *(char*)0, insert a trap after that
instruction and clear the following code, which likely gives better code
than replacing 0/0 with 1.


Yup.  I'd prefer to insert a trap if we ultimately expose a division by zero
-- including cases where that division occurs as a result of a PHI arg being
zero and the PHI result being used as a denominator in a division
expression.

It ought to be extremely easy to detect & transform that case (and probably
warn for it too).


We have gimple-ssa-isolate-paths.c for that, right?
Right.   I was thinking about instrumenting for it today to see if it's 
worth any effort.  It shouldn't take more than a few minutes once I 
refamiliarize myself with isolate-paths.






jeff
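
For reference, the identities named in the subject line can be written out in C as below. This is only a sketch with made-up function names: the results hold for any nonzero X, while X == 0 is undefined behavior in C, which is the case the discussion above proposes to handle with a trap instead of a fold.

```c
#include <assert.h>

/* Each function is only defined for x != 0.  */
static int
div_self (int x)
{
  return x / x;   /* Simplifies to 1.  */
}

static int
zero_div (int x)
{
  return 0 / x;   /* Simplifies to 0.  */
}

static int
mod_self (int x)
{
  return x % x;   /* Simplifies to 0.  */
}

/* Assert the three identities for one nonzero value.  */
static void
check_identities (int x)
{
  assert (x != 0);
  assert (div_self (x) == 1);
  assert (zero_div (x) == 0);
  assert (mod_self (x) == 0);
}
```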

Re: [RFA] Fix various PPC build failures due to int-in-boolean-context code

2016-11-07 Thread Bernd Edlinger

On Fri, Oct 28, 2016 at 09:12:29AM -0600, Jeff Law wrote:
 >
 > The PPC port is stumbling over the new integer in boolean context
 > warnings.
 >
 > In particular this code from rs6000_option_override_internal is
 > problematical:
 >
 >   HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT
 >  :
 > processor_target_table[cpu_index].target_enable);
 >
 > The compiler is flagging the (TARGET_DEFAULT) condition.  That's
 > supposed to be a boolean.
 >
 > After all the macro expansions are done it ultimately looks something
 > like this:
 >
 >  long flags = (((1L << 7)) ? (1L << 7)
 > : processor_target_table[cpu_index].target_enable);
 >
 > Note the (1L << 7) used as the condition for the ternary.  That's what
 > has the int-in-boolean-context warning tripping.  It's a false positive
 > IMHO.

Hmm...

 From the warning's perspective it would look far less suspicious,
if we make this an unsigned shift op.

I looked at options.h and I think we could also use one bit more
if the shift was unsigned.

Furthermore there are macros TARGET_..._P which do not put
brackets around the macro parameter.

So how about this?

Cross-compiler for powerpc-eabi builds without warning.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Bernd.
2016-11-07  Bernd Edlinger  

	* opth-gen.awk: Use unsigned shifts for bit masks.  Allow all bits
	to be used.  Add brackets around macro argument.

Index: gcc/opth-gen.awk
===
--- gcc/opth-gen.awk	(revision 241884)
+++ gcc/opth-gen.awk	(working copy)
@@ -350,11 +350,11 @@ for (i = 0; i < n_opts; i++) {
 		mask_bits[name] = 1
 		vname = var_name(flags[i])
 		mask = "MASK_"
-		mask_1 = "1"
+		mask_1 = "1U"
 		if (vname != "") {
 			mask = "OPTION_MASK_"
 			if (host_wide_int[vname] == "yes")
-mask_1 = "HOST_WIDE_INT_1"
+mask_1 = "HOST_WIDE_INT_1U"
 		} else
 			extra_mask_bits[name] = 1
 		print "#define " mask name " (" mask_1 " << " masknum[vname]++ ")"
@@ -362,16 +362,16 @@ for (i = 0; i < n_opts; i++) {
 }
 for (i = 0; i < n_extra_masks; i++) {
 	if (extra_mask_bits[extra_masks[i]] == 0)
-		print "#define MASK_" extra_masks[i] " (1 << " masknum[""]++ ")"
+		print "#define MASK_" extra_masks[i] " (1U << " masknum[""]++ ")"
 }
 
 for (var in masknum) {
 	if (var != "" && host_wide_int[var] == "yes") {
-		print" #if defined(HOST_BITS_PER_WIDE_INT) && " masknum[var] " >= HOST_BITS_PER_WIDE_INT"
+		print "#if defined(HOST_BITS_PER_WIDE_INT) && " masknum[var] " > HOST_BITS_PER_WIDE_INT"
 		print "#error too many masks for " var
 		print "#endif"
 	}
-	else if (masknum[var] > 31) {
+	else if (masknum[var] > 32) {
 		if (var == "")
 			print "#error too many target masks"
 		else
@@ -401,7 +401,7 @@ for (i = 0; i < n_opts; i++) {
 		print "#define TARGET_" name \
 		  " ((" vname " & " mask name ") != 0)"
 		print "#define TARGET_" name "_P(" vname ")" \
-		  " ((" vname " & " mask name ") != 0)"
+		  " (((" vname ") & " mask name ") != 0)"
 	}
 }
 for (i = 0; i < n_extra_masks; i++) {
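
Two of the points in this patch can be shown with a small standalone C sketch: an unsigned shift makes bit 31 usable as a mask, and a macro that does not bracket its parameter can give the wrong answer when passed an expression with lower precedence than "&". The macro and function names below are invented for illustration and are not what opth-gen.awk generates; the sketch also assumes unsigned int is at least 32 bits wide.

```c
#include <assert.h>

/* Unsigned shift: all 32 bits, including bit 31, are available.  */
#define MASK_FOO (1U << 31)

/* Hypothetical macro without brackets around its parameter.  */
#define TARGET_FOO_P_UNBRACKETED(flags) ((flags & MASK_FOO) != 0)

/* Same macro with the parameter bracketed, as in the patch.  */
#define TARGET_FOO_P(flags) (((flags) & MASK_FOO) != 0)

/* With a conditional expression as argument, the unbracketed macro
   parses as  use_a ? a : (b & MASK_FOO)  and so tests the wrong value.  */
static int
unbracketed_result (int use_a, unsigned a, unsigned b)
{
  return TARGET_FOO_P_UNBRACKETED (use_a ? a : b);
}

static int
bracketed_result (int use_a, unsigned a, unsigned b)
{
  return TARGET_FOO_P (use_a ? a : b);
}

/* a == 1 does not have bit 31 set, so the correct answer is 0.  */
static void
check_macro_bracketing (void)
{
  assert (bracketed_result (1, 1U, 0U) == 0);
  assert (unbracketed_result (1, 1U, 0U) == 1);
}
```

The same precedence problem would show up with "|" or "^" in the argument, since both bind less tightly than "&".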

Re: [PATCH] Fix -O0 AVX512 comparison ICE (PR target/78227)

2016-11-07 Thread Uros Bizjak

On Mon, Nov 7, 2016 at 2:02 PM, Jakub Jelinek  wrote:
> Hi!
>
> The following testcases ICE at -O0, because ix86_expand_sse_cmp avoids using
> the passed in dest only if optimize or if there is some value overlap, but
> we actually need to do that also if we have a maskcmp where we want to use
> a different mode than dest has.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-11-07  Jakub Jelinek  
>
> PR target/78227
> * config/i386/i386.c (ix86_expand_sse_cmp): Force dest into
> cmp_mode argument even for -O0 if cmp_mode != mode and maskcmp.
>
> * gcc.target/i386/pr78227-1.c: New test.
> * gcc.target/i386/pr78227-2.c: New test.

OK with a small nit, please see inline ...

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-11-04 20:09:48.0 +0100
> +++ gcc/config/i386/i386.c  2016-11-07 10:14:15.625018144 +0100
> @@ -23561,6 +23561,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_
>  cmp_op1 = force_reg (cmp_ops_mode, cmp_op1);
>
>if (optimize
> +  || (cmp_mode != mode && maskcmp)

Maybe better to switch the condition around, so:

"(maskcmp && cmp_mode != mode)"

>|| (op_true && reg_overlap_mentioned_p (dest, op_true))
>|| (op_false && reg_overlap_mentioned_p (dest, op_false)))
>  dest = gen_reg_rtx (maskcmp ? cmp_mode : mode);
> --- gcc/testsuite/gcc.target/i386/pr78227-1.c.jj   2016-11-07 10:15:52.606762613 +0100
> +++ gcc/testsuite/gcc.target/i386/pr78227-1.c   2016-11-07 10:24:58.821480125 +0100
> @@ -0,0 +1,30 @@
> +/* PR target/78227 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512f -O0 -Wno-psabi" } */
> +
> +typedef int V __attribute__((vector_size (64)));
> +typedef long long int W __attribute__((vector_size (64)));
> +
> +V
> +foo1 (V v)
> +{
> +  return v > 0;
> +}
> +
> +V
> +bar1 (V v)
> +{
> +  return v != 0;
> +}
> +
> +W
> +foo2 (W w)
> +{
> +  return w > 0;
> +}
> +
> +W
> +bar2 (W w)
> +{
> +  return w != 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr78227-2.c.jj   2016-11-07 10:22:17.055670476 +0100
> +++ gcc/testsuite/gcc.target/i386/pr78227-2.c   2016-11-07 10:25:03.722413765 +0100
> @@ -0,0 +1,30 @@
> +/* PR target/78227 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -O0 -Wno-psabi" } */
> +
> +typedef signed char V __attribute__((vector_size (64)));
> +typedef short int W __attribute__((vector_size (64)));
> +
> +V
> +foo1 (V v)
> +{
> +  return v > 0;
> +}
> +
> +V
> +bar1 (V v)
> +{
> +  return v != 0;
> +}
> +
> +W
> +foo2 (W w)
> +{
> +  return w > 0;
> +}
> +
> +W
> +bar2 (W w)
> +{
> +  return w != 0;
> +}
>
> Jakub

[PATCH] Avoid peeling for gaps if accesses are aligned

2016-11-07 Thread Richard Biener


Currently we force peeling for gaps whenever element overrun can occur
but for aligned accesses we know that the loads won't trap and thus
we can avoid this.

Bootstrap and regtest running on x86_64-unknown-linux-gnu (I expect
some testsuite fallout here so didn't bother to invent a new testcase).

Posting this in case somebody thinks the overrun is a bad idea in general
(even when not trapping), for example under ASAN or valgrind.

Richard.

2016-11-07  Richard Biener  

* tree-vect-stmts.c (get_group_load_store_type): If the
access is aligned do not trigger peeling for gaps.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 241893)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -1770,6 +1771,11 @@ get_group_load_store_type (gimple *stmt,
   " non-consecutive accesses\n");
  return false;
}
+ /* If the access is aligned an overrun is fine.  */
+ if (overrun_p
+ && aligned_access_p
+  (STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt))))
+   overrun_p = false;
  if (overrun_p && !can_overrun_p)
{
  if (dump_enabled_p ())
@@ -1789,6 +1795,10 @@ get_group_load_store_type (gimple *stmt,
   /* If there is a gap at the end of the group then these optimizations
 would access excess elements in the last iteration.  */
   bool would_overrun_p = (gap != 0);
+  /* If the access is aligned an overrun is fine.  */
+  if (would_overrun_p
+ && aligned_access_p (STMT_VINFO_DATA_REF (stmt_info)))
+   would_overrun_p = false;
   if (!STMT_VINFO_STRIDED_P (stmt_info)
  && (can_overrun_p || !would_overrun_p)
  && compare_step_with_zero (stmt) > 0)
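
The reasoning that an aligned access cannot trap on the overrun can be sketched in plain C: a naturally aligned load whose size is a power of two never straddles a page boundary, so if its first byte is mapped, the rest is too. The helper names and the 4096-byte page size below are assumptions made for illustration; nothing here is taken from tree-vect-stmts.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page size, for illustration only.  */
#define SKETCH_PAGE_SIZE 4096u

/* Return true if a load of SIZE bytes starting at ADDR touches more
   than one page.  */
bool
load_crosses_page (uintptr_t addr, unsigned size)
{
  uintptr_t first = addr / SKETCH_PAGE_SIZE;
  uintptr_t last = (addr + size - 1) / SKETCH_PAGE_SIZE;
  return first != last;
}

/* Return true if any SIZE-aligned load of SIZE bytes within the first
   two pages crosses a page boundary.  SIZE is assumed to be a power of
   two no larger than the page size.  */
bool
any_aligned_load_crosses (unsigned size)
{
  for (uintptr_t addr = 0; addr < 2 * SKETCH_PAGE_SIZE; addr += size)
    if (load_crosses_page (addr, size))
      return true;
  return false;
}
```

An unaligned 16-byte load at offset 4090 does cross into the next page, which is the case where peeling for gaps is still needed.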

Re: [PATCH 0/2] strncmp builtin expansion improvement

2016-11-07 Thread Richard Biener

On Sun, Nov 6, 2016 at 5:32 AM, Aaron Sawdey
 wrote:
> On Fri, 2016-11-04 at 20:43 -0600, Jeff Law wrote:
>> So what's the motivation here?  When we don't have any constants
>> then
>> I'd think we'd be better off punting into the library.
>
> When none of the args to strncmp are constant, I'd be inclined to
> agree. However the current state of affairs is that strncmp is not
> expanded in the case where the length is a constant but the strings are
> not. This patch allows the expansion to be attempted.
>
> The target's cmpstrnsi pattern can then make the decision of which
> cases to expand and which cases to punt to the library. For instance RX
> might always want to expand this for all cases as that target has an
> instruction that is intended to map to strncmp.
>
> My particular motivation is that I'm working on a cmpstrnsi pattern for
> powerpc64 and I want to have access to the case where the strings are
> not constant but the length is.

Your patchset doesn't contain a testcase so I really wonder in which case
we know the string length but it is not constant.

Yes, there's COND_EXPR handling in c_strlen but that should be mostly
dead code -- the real code should be using get_maxval_strlen or
get_range_strlen but c_strlen does not use those.

Ideally the str optabs would get profile data and alignment similar to
the mem ones.

Care to share a testcase?

Thanks,
Richard.

> Thanks,
>Aaron
>
> --
> Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
> 050-2/C113  (507) 253-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>
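
For reference, the situation Aaron describes (constant length, non-constant strings) might look like the following. This is a hypothetical example, not a testcase from the patch set; the function name is invented.

```c
#include <string.h>

/* Both strings are only known at run time; only the length (8) is a
   compile-time constant.  */
int
same_tag (const char *a, const char *b)
{
  return strncmp (a, b, 8) == 0;
}
```

Whether a given target's cmpstrnsi pattern chooses to expand such a call inline or punt to the library is up to that target.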

[GCC][PATCH] Fix ada compile error on Windows x86_64 (committed as r241907 under the obvious rule)

2016-11-07 Thread Tamar Christina

Hi all,

The changes in r240999 re-arranged includes and
left out signal.h for Windows x86 builds.

This breaks the build and prevents GCC builds from
completing with messages such as:

adaint.c:3317:19: error: 'SIGINT' undeclared (first use in this function); did 
you mean 'SAIT'?

else if (sig == SIGINT)
^~

Bootstrapped successfully on x86_64-w64-mingw32.

Committed as r241907.

Thanks,
Tamar

gcc/

2016-11-07  Tamar Christina  

* gcc/ada/adaint.c: Added signal.h for Windows.

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index 353914708adbdf301f9d59aaa55debfed469f901..819ea47e449725b08c1a531b340ddc6a74b0e5db 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -190,6 +190,7 @@ UINT CurrentCCSEncoding;
 #include 
 #include 
 #include 
+#include <signal.h>
 #undef DIR_SEPARATOR
 #define DIR_SEPARATOR '\\'

Re: New option -flimit-function-alignment

2016-11-07 Thread Bernd Schmidt


On 10/14/2016 08:28 PM, Bernd Schmidt wrote:

On 10/12/2016 09:27 PM, Denys Vlasenko wrote:

Yes, something like "if max_skip >= func_size, temporarily lower
max_skip to func_size-1" (because otherwise we can create padding
bigger-or-equal to the entire function in size, which is stupid
- it's better to just put the function in that space).

This would be nice.


That would be this patch. Bootstrapped and tested on x86_64-linux, ok?


Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01187.html


Bernd
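
The rule Denys suggests above can be written as a small helper. This is only a sketch of the quoted sentence; the function name is hypothetical and the actual patch may implement the limit differently.

```c
/* Hypothetical helper: cap the padding allowed when aligning a
   function so that it is always smaller than the function itself.
   A FUNC_SIZE of zero (size unknown) leaves MAX_SKIP unchanged; that
   choice is an assumption of this sketch.  */
unsigned
limit_max_skip (unsigned max_skip, unsigned func_size)
{
  if (func_size > 0 && max_skip >= func_size)
    return func_size - 1;
  return max_skip;
}
```

With a 15-byte max_skip, an 8-byte function gets at most 7 bytes of padding, while a 16-byte function keeps the full 15.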

Re: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

2016-11-07 Thread Tamar Christina

Ping.

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, November 1, 2016 3:46:07 PM
To: GCC Patches; r...@cebitec.uni-bielefeld.de; mikest...@comcast.net
Cc: nd
Subject: [PATCH][GCC/TESTSUITE] Make test for traditional-cpp depend on

Hi all,

A glibc update recently broke this test by adding a CPP
macro that uses the ## token-pasting operator, which traditional-cpp
does not support.
The change in glibc that made the test fail is from
6962682ffe5e5f0373047a0b894fee7a774be254.

This fixes (PR78136) by changing the test to use a local
include file instead of one from glibc.
The intention of the test is to test that traditional-cpp does
not expand values inside <> blocks of #includes.
As such the include has to be included via <> syntax. To do this
the .exp has been modified to add the test directory to the
include search path.

Ran regression tests on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/

2016-10-31  Tamar Christina  

PR testsuite/78136
* gcc.dg/cpp/trad/trad.exp
(dg-runtest): Added $srcdir/$subdir/ to Include dirs.
* gcc.dg/cpp/trad/include.c: Use local header file.
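
For readers unfamiliar with the construct that broke the test: ## is the ISO C token-pasting operator. The macro and function names below are hypothetical and are not the glibc macro in question; the snippet only shows what the operator does under a standard (non-traditional) preprocessor.

```c
/* Paste two tokens into one identifier.  */
#define GLUE(a, b) a##b

int
glue_demo (void)
{
  int GLUE (coun, ter) = 41;	/* Declares a variable named "counter".  */
  return counter + 1;
}
```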

[GCC][AArch64][PATCH][Testsuite] Fix failing test vector_initialization_nostack.c

2016-11-07 Thread Tamar Christina

Hi all,

This fixes (PR78142) by turning off scheduling for the test.
r241590 is causing more registers to be used and so
the SP register happens to be picked and used.

This test I believe was checking explicitly that the
SP is not used if not needed.  

Ran regression tests on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/

2016-11-07  Tamar Christina  

PR middle-end/78142
* gcc.target/aarch64/vector_initialization_nostack.c
(dg-options): Disabled scheduling.

diff --git a/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c b/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
index bbad04d00263b6a91b826b4911af92bdd226c821..71699281c5ce79fb5cf37e47b8ba078721c19f3a 100644
--- a/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
+++ b/gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -ftree-vectorize -fno-vect-cost-model" } */
+/* { dg-options "-O3 -ftree-vectorize -fno-vect-cost-model -fno-schedule-insns" } */
 float arr_f[100][100];
 float
 f9 (void)

Re: [PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend

2016-11-07 Thread Kyrill Tkachov



On 07/11/16 14:00, Thomas Preudhomme wrote:

What about ARM maintainers?



Fine with me too.
Thanks,
Kyrill


Best regards,

Thomas

On 04/11/16 22:16, Gerald Pfeifer wrote:

On Fri, 4 Nov 2016, Thomas Preudhomme wrote:

This patch documents the newly added support in GCC 7 for Cortex-M23 and
Cortex-M33 processors [1][2].

:

Is this ok for ?


Surely so for me.

Gerald

Re: [PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend

2016-11-07 Thread Thomas Preudhomme


What about ARM maintainers?

Best regards,

Thomas

On 04/11/16 22:16, Gerald Pfeifer wrote:

On Fri, 4 Nov 2016, Thomas Preudhomme wrote:

This patch documents the newly added support in GCC 7 for Cortex-M23 and
Cortex-M33 processors [1][2].

:

Is this ok for ?


Surely so for me.

Gerald

[PATCH] Fix PR78224

2016-11-07 Thread Richard Biener


The following fixes an ICE with call cdce where it fails to handle
PHIs in the fallthru destination of a call with EH.  My simple fix is
to simply split the fallthru edge if the dest may contain PHI nodes.

This may also remove the need to free dominance info (hope there's
a testcase for that -- I'll leave the branches alone).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78224
* tree-call-cdce.c (shrink_wrap_one_built_in_call_with_conds):
Split the fallthru edge in case its successor may have PHIs.
Do not free dominance info.

* g++.dg/torture/pr78224.C: New testcase.

Index: gcc/tree-call-cdce.c
===
--- gcc/tree-call-cdce.c(revision 241893)
+++ gcc/tree-call-cdce.c(working copy)
@@ -807,15 +807,20 @@ shrink_wrap_one_built_in_call_with_conds
 can_guard_call_p.  */
   join_tgt_in_edge_from_call = find_fallthru_edge (bi_call_bb->succs);
   gcc_assert (join_tgt_in_edge_from_call);
-  free_dominance_info (CDI_DOMINATORS);
+  /* We don't want to handle PHIs.  */
+  if (EDGE_COUNT (join_tgt_in_edge_from_call->dest->preds) > 1)
+   join_tgt_bb = split_edge (join_tgt_in_edge_from_call);
+  else
+   join_tgt_bb = join_tgt_in_edge_from_call->dest;
 }
   else
-join_tgt_in_edge_from_call = split_block (bi_call_bb, bi_call);
+{
+  join_tgt_in_edge_from_call = split_block (bi_call_bb, bi_call);
+  join_tgt_bb = join_tgt_in_edge_from_call->dest;
+}
 
   bi_call_bsi = gsi_for_stmt (bi_call);
 
-  join_tgt_bb = join_tgt_in_edge_from_call->dest;
-
   /* Now it is time to insert the first conditional expression
  into bi_call_bb and split this bb so that bi_call is
  shrink-wrapped.  */
Index: gcc/testsuite/g++.dg/torture/pr78224.C
===
--- gcc/testsuite/g++.dg/torture/pr78224.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr78224.C  (working copy)
@@ -0,0 +1,51 @@
+// { dg-do compile }
+
+extern "C"{
+  float sqrtf(float);
+}
+
+inline float squareroot(const float f)
+{
+  return sqrtf(f);
+}
+
+inline int squareroot(const int f)
+{
+  return static_cast<int>(sqrtf(static_cast<float>(f)));
+}
+
+template <class T>
+class vector2d
+{
+public:
+  vector2d(T nx, T ny) : X(nx), Y(ny) {}
+  T getLength() const { return squareroot( X*X + Y*Y ); }
+  T X;
+  T Y;
+};
+
+vector2d getMousePos();
+
+class Client
+{
+public:
+  Client();
+  ~Client();
+};
+
+void the_game(float turn_amount)
+{
+  Client client;
+  bool first = true;
+
+  while (1) {
+  if (first) {
+first = false;
+  } else {
+int dx = getMousePos().X;
+int dy = getMousePos().Y;
+
+turn_amount = vector2d(dx, dy).getLength();
+  }
+  }
+}

[ARM][GCC][PATCHv2 2/3] Add missing Poly64_t intrinsics to GCC

2016-11-07 Thread Tamar Christina

Hi all,

This patch (2 of 3) adds the following NEON intrinsics to
the ARM back-end of GCC:

* vget_lane_p64

Added new tests for these and ran regression tests on aarch64-none-linux-gnu
and on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar

gcc/
2016-11-04  Tamar Christina  

* config/arm/arm_neon.h (vget_lane_p64): New.
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 3898ff7302dc3f21e6b50a8a7b835033c1ae2021..ab29da74e0971cc09ee63b561ecc79e9762e3fb4 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -5411,6 +5411,15 @@ vget_lane_s64 (int64x1_t __a, const int __b)
   return (int64_t)__builtin_neon_vget_lanedi (__a, __b);
 }
 
+#pragma GCC push_options
+#pragma GCC target ("fpu=crypto-neon-fp-armv8")
+__extension__ static __inline poly64_t __attribute__ ((__always_inline__))
+vget_lane_p64 (poly64x1_t __a, const int __b)
+{
+  return (poly64_t)__builtin_neon_vget_lanedi ((int64x1_t) __a, __b);
+}
+
+#pragma GCC pop_options
 __extension__ static __inline uint64_t __attribute__ ((__always_inline__))
 vget_lane_u64 (uint64x1_t __a, const int __b)
 {

[AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

2016-11-07 Thread Tamar Christina

Hi all,

This patch (3 of 3) adds and updates tests for the NEON intrinsics
added by the previous patches.

Ran regression tests on aarch64-none-linux-gnu
and on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar


gcc/testsuite/
2016-11-04  Tamar Christina  

* gcc.target/aarch64/advsimd-intrinsics/p64.c: New.
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(Poly64x1_t, Poly64x2_t): Added type.
(AARCH64_ONLY): Added macro.
* gcc.target/aarch64/advsimd-intrinsics/vcombine.c:
Added test for Poly64.
* gcc.target/aarch64/advsimd-intrinsics/vcreate.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdup-vmov.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdup_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vget_high.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vget_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vget_low.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vldX.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vldX_dup.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vldX_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vstX_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c:
Added AArch64 flags.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c:
Added AArch64 flags.

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
index 462141586b3db7c5256c74b08fa0449210634226..174c1948221025b860aaac503354b406fa804007 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
@@ -32,6 +32,13 @@ extern size_t strlen(const char *);
VECT_VAR(expected, int, 16, 4) -> expected_int16x4
VECT_VAR_DECL(expected, int, 16, 4) -> int16x4_t expected_int16x4
 */
+/* Some instructions don't exist on ARM.
+   Use this macro to guard against them.  */
+#ifdef __aarch64__
+#define AARCH64_ONLY(X) X
+#else
+#define AARCH64_ONLY(X)
+#endif
 
 #define xSTR(X) #X
 #define STR(X) xSTR(X)
@@ -92,6 +99,13 @@ extern size_t strlen(const char *);
 fprintf(stderr, "CHECKED %s %s\n", STR(VECT_TYPE(T, W, N)), MSG);	\
   }
 
+#if defined (__ARM_FEATURE_CRYPTO)
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT) \
+	   CHECK(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#else
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#endif
+
 /* Floating-point variant.  */
 #define CHECK_FP(MSG,T,W,N,FMT,EXPECTED,COMMENT)			\
   {	\
@@ -184,6 +198,9 @@ extern ARRAY(expected, uint, 32, 2);
 extern ARRAY(expected, uint, 64, 1);
 extern ARRAY(expected, poly, 8, 8);
 extern ARRAY(expected, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 1);
+#endif
 extern ARRAY(expected, hfloat, 16, 4);
 extern ARRAY(expected, hfloat, 32, 2);
 extern ARRAY(expected, hfloat, 64, 1);
@@ -197,11 +214,14 @@ extern ARRAY(expected, uint, 32, 4);
 extern ARRAY(expected, uint, 64, 2);
 extern ARRAY(expected, poly, 8, 16);
 extern ARRAY(expected, poly, 16, 8);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 2);
+#endif
 extern ARRAY(expected, hfloat, 16, 8);
 extern ARRAY(expected, hfloat, 32, 4);
 extern ARRAY(expected, hfloat, 64, 2);
 
-#define CHECK_RESULTS_NAMED_NO_FP16(test_name,EXPECTED,comment)		\
+#define CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64(test_name,EXPECTED,comment)		\
   {	\
 CHECK(test_name, int, 8, 8, PRIx8, EXPECTED, comment);		\
 CHECK(test_name, int, 16, 4, PRIx16, EXPECTED, comment);		\
@@ -228,6 +248,13 @@ extern ARRAY(expected, hfloat, 64, 2);
 CHECK_FP(test_name, float, 32, 4, PRIx32, EXPECTED, comment);	\
   }	\
 
+#define CHECK_RESULTS_NAMED_NO_FP16(test_name,EXPECTED,comment)		\
+  {	\
+CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64(test_name, EXPECTED, comment);		\
+CHECK_CRYPTO(test_name, poly, 64, 1, PRIx64, EXPECTED, comment);	\
+CHECK_CRYPTO(test_name, poly, 64, 2, PRIx64, EXPECTED, comment);	\
+  }	\
+
 /* Check results against EXPECTED.  Operates on all possible vector types.  */
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
 #define CHECK_RESULTS_NAMED(test_name,EXPECTED,comment)			\
@@ -398,6 +425,9 @@ static void clean_results (void)
   CLEAN(result, uint, 64, 1);
   CLEAN(result, poly, 8, 8);
   CLEAN(result, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+  CLEAN(result, poly, 64, 1);
+#endif
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
   CLEAN(result, float, 16, 4);
 #endif
@@ -413,6 +443,9 @@ static void clean_results (void)
   CLEAN(result,

[AArch64][GCC][PATCHv2 1/3] Add missing Poly64_t intrinsics to GCC

2016-11-07 Thread Tamar Christina

Hi all,

This patch (1 of 3) adds the following NEON intrinsics
to the AArch64 back-end of GCC:

* vsli_n_p64
* vsliq_n_p64

* vld1_p64
* vld1q_p64
* vld1_dup_p64
* vld1q_dup_p64

* vst1_p64
* vst1q_p64
  
* vld2_p64
* vld3_p64
* vld4_p64
* vld2q_p64
* vld3q_p64
* vld4q_p64

* vld2_dup_p64
* vld3_dup_p64
* vld4_dup_p64

* __aarch64_vdup_lane_p64
* __aarch64_vdup_laneq_p64
* __aarch64_vdupq_lane_p64
* __aarch64_vdupq_laneq_p64

* vget_lane_p64
* vgetq_lane_p64

* vreinterpret_p8_p64
* vreinterpretq_p8_p64
* vreinterpret_p16_p64
* vreinterpretq_p16_p64

* vreinterpret_p64_f16
* vreinterpret_p64_f64
* vreinterpret_p64_s8
* vreinterpret_p64_s16
* vreinterpret_p64_s32
* vreinterpret_p64_s64
* vreinterpret_p64_f32
* vreinterpret_p64_u8
* vreinterpret_p64_u16
* vreinterpret_p64_u32
* vreinterpret_p64_u64
* vreinterpret_p64_p8

* vreinterpretq_p64_f64
* vreinterpretq_p64_s8
* vreinterpretq_p64_s16
* vreinterpretq_p64_s32
* vreinterpretq_p64_s64
* vreinterpretq_p64_f16
* vreinterpretq_p64_f32
* vreinterpretq_p64_u8
* vreinterpretq_p64_u16
* vreinterpretq_p64_u32
* vreinterpretq_p64_u64
* vreinterpretq_p64_p8

* vreinterpret_f16_p64
* vreinterpretq_f16_p64
* vreinterpret_f32_p64
* vreinterpretq_f32_p64
* vreinterpret_f64_p64
* vreinterpretq_f64_p64
* vreinterpret_s64_p64
* vreinterpretq_s64_p64
* vreinterpret_u64_p64
* vreinterpretq_u64_p64
* vreinterpret_s8_p64
* vreinterpretq_s8_p64
* vreinterpret_s16_p64
* vreinterpret_s32_p64
* vreinterpretq_s32_p64
* vreinterpret_u8_p64
* vreinterpret_u16_p64
* vreinterpretq_u16_p64
* vreinterpret_u32_p64
* vreinterpretq_u32_p64

* vset_lane_p64
* vsetq_lane_p64

* vget_low_p64
* vget_high_p64

* vcombine_p64
* vcreate_p64

* vst2_lane_p64
* vst3_lane_p64
* vst4_lane_p64
* vst2q_lane_p64
* vst3q_lane_p64
* vst4q_lane_p64

* vget_lane_p64
* vget_laneq_p64
* vset_lane_p64
* vset_laneq_p64

* vcopy_lane_p64
* vcopy_laneq_p64  

* vdup_n_p64
* vdupq_n_p64
* vdup_lane_p64
* vdup_laneq_p64

* vld1_p64
* vld1q_p64
* vld1_dup_p64
* vld1q_dup_p64
* vld1q_dup_p64
* vmov_n_p64
* vmovq_n_p64
* vst3q_p64
* vst4q_p64

* vld1_lane_p64
* vld1q_lane_p64
* vst1_lane_p64
* vst1q_lane_p64
* vcopy_laneq_p64
* vcopyq_laneq_p64
* vdupq_laneq_p64

Added new tests for these and ran regression tests on aarch64-none-linux-gnu
and on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Tamar

gcc/
2016-11-04  Tamar Christina  

* config/aarch64/aarch64-builtins.c (TYPES_SETREGP): Added poly type.
(TYPES_GETREGP): Likewise.
(TYPES_SHIFTINSERTP): Likewise.
(TYPES_COMBINEP): Likewise.
(TYPES_STORE1P): Likewise.
* config/aarch64/aarch64-simd-builtins.def
(combine): Added poly generator.
(get_dregoi): Likewise.
(get_dregci): Likewise.
(get_dregxi): Likewise.
(ssli_n): Likewise.
(ld1): Likewise.
(st1): Likewise.
* config/aarch64/arm_neon.h
(poly64x1x2_t, poly64x1x3_t): New.
(poly64x1x4_t, poly64x2x2_t): Likewise.
(poly64x2x3_t, poly64x2x4_t): Likewise.
(poly64x1_t): Likewise.
(vcreate_p64, vcombine_p64): Likewise.
(vdup_n_p64, vdupq_n_p64): Likewise.
(vld2_p64, vld2q_p64): Likewise.
(vld3_p64, vld3q_p64): Likewise.
(vld4_p64, vld4q_p64): Likewise.
(vld2_dup_p64, vld3_dup_p64): Likewise.
(vld4_dup_p64, vsli_n_p64): Likewise.
(vsliq_n_p64, vst1_p64): Likewise.
(vst1q_p64, vst2_p64): Likewise.
(vst3_p64, vst4_p64): Likewise.
(__aarch64_vdup_lane_p64, __aarch64_vdup_laneq_p64): Likewise.
(__aarch64_vdupq_lane_p64, __aarch64_vdupq_laneq_p64): Likewise.
(vget_lane_p64, vgetq_lane_p64): Likewise.
(vreinterpret_p8_p64, vreinterpretq_p8_p64): Likewise.
(vreinterpret_p16_p64, vreinterpretq_p16_p64): Likewise.
(vreinterpret_p64_f16, vreinterpret_p64_f64): Likewise.
(vreinterpret_p64_s8, vreinterpret_p64_s16): Likewise.
(vreinterpret_p64_s32, vreinterpret_p64_s64): Likewise.
(vreinterpret_p64_f32, vreinterpret_p64_u8): Likewise.
(vreinterpret_p64_u16, vreinterpret_p64_u32): Likewise.
(vreinterpret_p64_u64, vreinterpret_p64_p8): Likewise.
(vreinterpretq_p64_f64, vreinterpretq_p64_s8): Likewise.
(vreinterpretq_p64_s16, vreinterpretq_p64_s32): Likewise.
(vreinterpretq_p64_s64, vreinterpretq_p64_f16): Likewise.
(vreinterpretq_p64_f32, vreinterpretq_p64_u8): Likewise.
(vreinterpretq_p64_u16, vreinterpretq_p64_u32): Likewise.
(vreinterpretq_p64_u64, vreinterpretq_p64_p8): Likewise.
(vreinterpret_f16_p64, vreinterpretq_f16_p64): Likewise.
(vreinterpret_f32_p64, vreinterpretq_f32_p64): Likewise.
(vreinterpret_f64_p64, vreinterpretq_f64_p64): Likewise.
(vreinterpret_s64_p64, vreinterpretq_s64_p64): Likewise.
(vreinterpret_u64_p64, vreinterpretq_u64_p64): Likewise.
(vreinterpret_s8_p64,

Re: Add missing symbols for versioned namespace

2016-11-07 Thread Jonathan Wakely


On 03/11/16 21:54 +0100, François Dumont wrote:

Hi

   I might not be the right one to propose this patch as I am not 
sure that I fully understand gnu-versioned-namespace.ver organization. 
But with it following test failures when using versioned namespace 
vanish:


FAIL: 20_util/allocator/overaligned.cc (test for excess errors)
FAIL: ext/bitmap_allocator/overaligned.cc (test for excess errors)
FAIL: ext/mt_allocator/overaligned.cc (test for excess errors)
FAIL: ext/new_allocator/overaligned.cc (test for excess errors)
FAIL: ext/pool_allocator/overaligned.cc (test for excess errors)

   Ok to commit ?


This looks correct. OK for trunk, thanks.

[PATCH][AArch64] Optimized implementation of search_line_fast for the CPP lexer

2016-11-07 Thread Richard Earnshaw (lists)

This patch contains an implementation of search_line_fast for the CPP
lexer.  It's based in part on the AArch32 (ARM) code but incorporates
new instructions available in AArch64 (reduction add operations) plus
some tricks for reducing the realignment overheads.  We assume a page
size of 4k, but that's a safe assumption -- AArch64 systems can never
have a smaller page size than that: on systems with larger pages we will
go through the realignment code more often than strictly necessary, but
it's still likely to be in the noise (less than 0.5% of the time).
Bootstrapped on aarch64-none-linux-gnu.


Although this is AArch64 specific and therefore I don't think it
requires approval from anyone else, I'll wait 24 hours for comments.

* lex.c (search_line_fast): New implementation for AArch64.

R.
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 6f65fa1..cea8848 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -752,6 +752,101 @@ search_line_fast (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
   }
 }
 
+#elif defined (__ARM_NEON) && defined (__ARM_64BIT_STATE)
+#include "arm_neon.h"
+
+/* This doesn't have to be the exact page size, but no system may use
+   a size smaller than this.  ARMv8 requires a minimum page size of
+   4k.  The impact of being conservative here is a small number of
+   cases will take the slightly slower entry path into the main
+   loop.  */
+
+#define AARCH64_MIN_PAGE_SIZE 4096
+
+static const uchar *
+search_line_fast (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
+{
+  const uint8x16_t repl_nl = vdupq_n_u8 ('\n');
+  const uint8x16_t repl_cr = vdupq_n_u8 ('\r');
+  const uint8x16_t repl_bs = vdupq_n_u8 ('\\');
+  const uint8x16_t repl_qm = vdupq_n_u8 ('?');
+  const uint8x16_t xmask = (uint8x16_t) vdupq_n_u64 (0x8040201008040201ULL);
+
+#ifdef __AARCH64EB
+  const int16x8_t shift = {8, 8, 8, 8, 0, 0, 0, 0};
+#else
+  const int16x8_t shift = {0, 0, 0, 0, 8, 8, 8, 8};
+#endif
+
+  unsigned int found;
+  const uint8_t *p;
+  uint8x16_t data;
+  uint8x16_t t;
+  uint16x8_t m;
+  uint8x16_t u, v, w;
+
+  /* Align the source pointer.  */
+  p = (const uint8_t *)((uintptr_t)s & -16);
+
+  /* Assuming random string start positions, with a 4k page size we'll take
+ the slow path about 0.37% of the time.  */
+  if (__builtin_expect ((AARCH64_MIN_PAGE_SIZE
+- (((uintptr_t) s) & (AARCH64_MIN_PAGE_SIZE - 1)))
+   < 16, 0))
+{
+  /* Slow path: the string starts near a possible page boundary.  */
+  uint32_t misalign, mask;
+
+  misalign = (uintptr_t)s & 15;
+  mask = (-1u << misalign) & 0xffff;
+  data = vld1q_u8 (p);
+  t = vceqq_u8 (data, repl_nl);
+  u = vceqq_u8 (data, repl_cr);
+  v = vorrq_u8 (t, vceqq_u8 (data, repl_bs));
+  w = vorrq_u8 (u, vceqq_u8 (data, repl_qm));
+  t = vorrq_u8 (v, w);
+  t = vandq_u8 (t, xmask);
+  m = vpaddlq_u8 (t);
+  m = vshlq_u16 (m, shift);
+  found = vaddvq_u16 (m);
+  found &= mask;
+  if (found)
+   return (const uchar*)p + __builtin_ctz (found);
+}
+  else
+{
+  data = vld1q_u8 ((const uint8_t *) s);
+  t = vceqq_u8 (data, repl_nl);
+  u = vceqq_u8 (data, repl_cr);
+  v = vorrq_u8 (t, vceqq_u8 (data, repl_bs));
+  w = vorrq_u8 (u, vceqq_u8 (data, repl_qm));
+  t = vorrq_u8 (v, w);
+  if (__builtin_expect (vpaddd_u64 ((uint64x2_t)t), 0))
+   goto done;
+}
+
+  do
+{
+  p += 16;
+  data = vld1q_u8 (p);
+  t = vceqq_u8 (data, repl_nl);
+  u = vceqq_u8 (data, repl_cr);
+  v = vorrq_u8 (t, vceqq_u8 (data, repl_bs));
+  w = vorrq_u8 (u, vceqq_u8 (data, repl_qm));
+  t = vorrq_u8 (v, w);
+} while (!vpaddd_u64 ((uint64x2_t)t));
+
+done:
+  /* Now that we've found the terminating substring, work out precisely where
+ we need to stop.  */
+  t = vandq_u8 (t, xmask);
+  m = vpaddlq_u8 (t);
+  m = vshlq_u16 (m, shift);
+  found = vaddvq_u16 (m);
+  return (((((uintptr_t) p) < (uintptr_t) s) ? s : (const uchar *)p)
+	  + __builtin_ctz (found));
+}
+
 #elif defined (__ARM_NEON)
 #include "arm_neon.h"
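
The page-boundary test and the 0.37% figure quoted in the patch can be checked with a portable sketch. The function names are hypothetical; only the arithmetic is taken from the patch, and the 4096-byte value mirrors AARCH64_MIN_PAGE_SIZE.

```c
#include <stdint.h>

#define SKETCH_MIN_PAGE_SIZE 4096u

/* Same test as in the patch: true when fewer than 16 bytes remain
   between S and the next (minimum-size) page boundary, so an
   unaligned 16-byte load from S might touch the following page.  */
int
near_page_end (uintptr_t s)
{
  return (SKETCH_MIN_PAGE_SIZE - (s & (SKETCH_MIN_PAGE_SIZE - 1))) < 16;
}

/* Count the start offsets within one page that take the slow path.  */
unsigned
slow_path_offsets (void)
{
  unsigned n = 0;
  for (uintptr_t off = 0; off < SKETCH_MIN_PAGE_SIZE; off++)
    n += near_page_end (off);
  return n;
}
```

Offsets 4081 through 4095 take the slow path, i.e. 15 of 4096 positions, or about 0.37% assuming uniformly random start addresses.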

Re: [PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 3)

2016-11-07 Thread Richard Biener

On Mon, 7 Nov 2016, Jakub Jelinek wrote:

> On Fri, Nov 04, 2016 at 08:07:37PM +0100, Richard Biener wrote:
> > >If/once this is in, I'm planning to test/submit a patch adding
> > >  /* If one decl is known to be a function or label in a function and
> > > the other is some kind of data, they can't overlap.  */
> > >  if ((TREE_CODE (exprx) == FUNCTION_DECL
> > >   || TREE_CODE (exprx) == LABEL_DECL)
> > >  != (TREE_CODE (expry) == FUNCTION_DECL
> > > || TREE_CODE (expry) == LABEL_DECL))
> > >return 1;
> > >before that.
> > >
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > OK for trunk and branches (if appropriate)
> 
> And here is the incremental patch to disambiguate between code section
> objects and variables.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (only)?

Ok.

Richard.

> 2016-11-07  Jakub Jelinek  
> 
>   PR target/77834
>   * alias.c (nonoverlapping_memrefs_p): If one decl is
>   FUNCTION_DECL or LABEL_DECL and the other is not, return 1.
> 
> --- gcc/alias.c.jj2016-11-04 20:13:32.0 +0100
> +++ gcc/alias.c   2016-11-07 11:18:57.982160034 +0100
> @@ -2755,6 +2755,14 @@ nonoverlapping_memrefs_p (const_rtx x, c
>|| TREE_CODE (expry) == CONST_DECL)
>  return 1;
>  
> +  /* If one decl is known to be a function or label in a function and
> + the other is some kind of data, they can't overlap.  */
> +  if ((TREE_CODE (exprx) == FUNCTION_DECL
> +   || TREE_CODE (exprx) == LABEL_DECL)
> +  != (TREE_CODE (expry) == FUNCTION_DECL
> +   || TREE_CODE (expry) == LABEL_DECL))
> +return 1;
> +
>/* If either of the decls doesn't have DECL_RTL set (e.g. marked as
>   living in multiple places), we can't tell anything.  Exception
>   are FUNCTION_DECLs for which we can create DECL_RTL on demand.  */
> @@ -2804,7 +2812,7 @@ nonoverlapping_memrefs_p (const_rtx x, c
>  
>/* Offset based disambiguation not appropriate for loop invariant */
>if (loop_invariant)
> -return 0;  
> +return 0;
>  
>/* Offset based disambiguation is OK even if we do not know that the
>   declarations are necessarily different
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] rs6000: Do swdiv at expand time

2016-11-07 Thread David Edelsohn

On Mon, Nov 7, 2016 at 4:32 AM, Segher Boessenkool
 wrote:
> We transform floating point divide instructions to a faster series of
> simple instructions, "swdiv".  Currently we do not do that until the
> first splitter pass, which is much too late for most optimisations
> that can happen on those new instructions, e.g. the constant loads
> are not CSEd inside an unrolled loop.  This patch changes things so
> those divide instructions are expanded during expand already.
>
> Bootstrapped and tested on powerpc64-linux; Bill has run SPEC on it,
> and if anything it shows a slight improvement.
>
> Is this okay for trunk?

Okay.

But commenting on the ChangeLog entry is half the fun!

- David
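
As background on what a "faster series of simple instructions" for a divide can look like, here is a portable sketch of software division by reciprocal estimate plus Newton-Raphson refinement. It assumes a scheme of that general shape; the initial estimate is computed in C rather than by a hardware estimate instruction, the function name is hypothetical, and this is not the rs6000 expansion itself. B must be finite and non-zero.

```c
#include <math.h>

/* Approximate A / B using only multiplies and adds in the refinement
   steps.  */
float
sketch_swdiv (float a, float b)
{
  int expo;
  /* b = m * 2^expo with 0.5 <= |m| < 1.  */
  float m = frexpf (b, &expo);
  float am = m < 0.0f ? -m : m;

  /* Linear first guess for 1/am on [0.5, 1): 48/17 - 32/17 * am.  */
  float x = 2.8235295f - 1.8823529f * am;

  /* Each step roughly squares the relative error of the estimate.  */
  for (int i = 0; i < 3; i++)
    x = x * (2.0f - am * x);

  if (m < 0.0f)
    x = -x;

  /* 1/b = (1/m) * 2^-expo.  */
  return a * ldexpf (x, -expo);
}
```

The multiplies and constant loads in the loop body are ordinary instructions, which is why exposing them early lets passes such as CSE work on them.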

[PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 3)

2016-11-07 Thread Jakub Jelinek

On Fri, Nov 04, 2016 at 08:07:37PM +0100, Richard Biener wrote:
> >If/once this is in, I'm planning to test/submit a patch adding
> >  /* If one decl is known to be a function or label in a function and
> > the other is some kind of data, they can't overlap.  */
> >  if ((TREE_CODE (exprx) == FUNCTION_DECL
> >   || TREE_CODE (exprx) == LABEL_DECL)
> >  != (TREE_CODE (expry) == FUNCTION_DECL
> >   || TREE_CODE (expry) == LABEL_DECL))
> >return 1;
> >before that.
> >
> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK for trunk and branches (if appropriate)

And here is the incremental patch to disambiguate between code section
objects and variables.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (only)?

2016-11-07  Jakub Jelinek  

PR target/77834
* alias.c (nonoverlapping_memrefs_p): If one decl is
FUNCTION_DECL or LABEL_DECL and the other is not, return 1.

--- gcc/alias.c.jj  2016-11-04 20:13:32.0 +0100
+++ gcc/alias.c 2016-11-07 11:18:57.982160034 +0100
@@ -2755,6 +2755,14 @@ nonoverlapping_memrefs_p (const_rtx x, c
   || TREE_CODE (expry) == CONST_DECL)
 return 1;
 
+  /* If one decl is known to be a function or label in a function and
+ the other is some kind of data, they can't overlap.  */
+  if ((TREE_CODE (exprx) == FUNCTION_DECL
+   || TREE_CODE (exprx) == LABEL_DECL)
+  != (TREE_CODE (expry) == FUNCTION_DECL
+ || TREE_CODE (expry) == LABEL_DECL))
+return 1;
+
   /* If either of the decls doesn't have DECL_RTL set (e.g. marked as
  living in multiple places), we can't tell anything.  Exception
  are FUNCTION_DECLs for which we can create DECL_RTL on demand.  */
@@ -2804,7 +2812,7 @@ nonoverlapping_memrefs_p (const_rtx x, c
 
   /* Offset based disambiguation not appropriate for loop invariant */
   if (loop_invariant)
-return 0;  
+return 0;
 
   /* Offset based disambiguation is OK even if we do not know that the
  declarations are necessarily different


Jakub
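
The check added to nonoverlapping_memrefs_p above compares two boolean tests with !=, so it fires only when exactly one of the two decls is code. A minimal standalone sketch of that logic follows; enum decl_kind and both function names are hypothetical stand-ins for GCC's tree codes, not real GCC API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the tree codes tested in the patch.  */
enum decl_kind { KIND_FUNCTION, KIND_LABEL, KIND_VAR, KIND_CONST };

/* True for decls that live in the code section.  */
static bool
is_code_decl (enum decl_kind k)
{
  return k == KIND_FUNCTION || k == KIND_LABEL;
}

/* Return true if X and Y are known not to overlap because exactly one
   of them is a function or label and the other is some kind of data.  */
bool
code_vs_data_disjoint (enum decl_kind x, enum decl_kind y)
{
  return is_code_decl (x) != is_code_decl (y);
}
```

A false result only means that this particular rule does not apply; the later checks in the function still have to decide.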

Re: [PATCH] combine lhs zero_extract fix (PR78186)

2016-11-07 Thread Segher Boessenkool

On Mon, Nov 07, 2016 at 02:00:46PM +0100, Christophe Lyon wrote:
> > Confirmed.  What a nasty, nasty bug, and it has been here for decades
> > it seems.  Could you please open a PR?
> >
> Sure, I've created PR78232 for this.

Thanks!  I have a patch btw, it's regstrapping.  Not sure it is fully
correct (whether it handles all possible cases), but hey.

Segher

[committed] Move 3 gcc.target/i386/*.C tests

2016-11-07 Thread Jakub Jelinek

Hi!

Richard noticed 3 misplaced tests: C++ tests don't belong in
gcc.target/, which tests just C.

I've bootstrapped/regtested this on x86_64-linux and i686-linux and
committed to trunk as obvious.

2016-11-07  Jakub Jelinek  

PR middle-end/71529
* gcc.target/i386/pr71529.C: Moved to ...
* g++.dg/opt/pr71529.C: ... here.  New test.  Guard for i?86/x86_64.

PR target/64411
* gcc.target/i386/pr64411.C: Moved to ...
* g++.dg/opt/pr64411.C: ... here.  New test.  Guard for i?86/x86_64
lp64.

PR target/65105
* gcc.target/i386/pr65105-4.C: Moved to ...
* g++.dg/opt/pr65105-4.C: ... here.  New test.  Guard for i?86/x86_64.
Run into compile test rather than execute test.

--- gcc/testsuite/gcc.target/i386/pr71529.C.jj  2016-06-15 19:09:09.0 +0200
+++ gcc/testsuite/gcc.target/i386/pr71529.C 2016-11-07 10:56:21.835713206 +0100
@@ -1,22 +0,0 @@
-/* PR71529 */
-/* { dg-do compile { target { ! x32 } } } */
-/* { dg-options "-fcheck-pointer-bounds -mmpx -O2" } */
-
-class c1
-{
- public:
-  virtual ~c1 ();
-};
-
-class c2
-{
- public:
-  virtual ~c2 ();
-};
-
-class c3 : c1, c2 { };
-
-int main (int, char **)
-{
-  c3 obj;
-}
--- gcc/testsuite/gcc.target/i386/pr64411.C.jj  2016-03-15 17:10:18.0 +0100
+++ gcc/testsuite/gcc.target/i386/pr64411.C 2016-11-07 10:54:34.485101960 +0100
@@ -1,27 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-Os -mcmodel=medium -fPIC -fschedule-insns -fselective-scheduling" } */
-
-typedef __SIZE_TYPE__ size_t;
-
-extern "C"  long strtol ()
-  { return 0; }
-
-static struct {
-  void *sp[2];
-} info;
-
-union S813
-{
-  void * c[5];
-}
-s813;
-
-S813 a813[5];
-S813 check813 (S813, S813 *, S813);
-
-void checkx813 ()
-{
-  __builtin_memset (&s813, '\0', sizeof (s813));
-  __builtin_memset (&info, '\0', sizeof (info));
-  check813 (s813, &a813[1], a813[2]);
-}
--- gcc/testsuite/gcc.target/i386/pr65105-4.C.jj  2015-10-11 19:11:14.214767354 +0200
+++ gcc/testsuite/gcc.target/i386/pr65105-4.C   2016-11-07 10:51:05.333808029 +0100
@@ -1,19 +0,0 @@
-/* PR target/pr65105 */
-/* { dg-do run { target { ia32 } } } */
-/* { dg-options "-O2 -march=slm" } */
-
-struct s {
-  long long l1, l2, l3, l4, l5;
-} *a;
-long long b;
-long long fn1()
-{
-  try
-{
-  b = (a->l1 | a->l2 | a->l3 | a->l4 | a->l5);
-  return a->l1;
-}
-  catch (int)
-{
-}
-}
--- gcc/testsuite/g++.dg/opt/pr71529.C.jj   2016-11-07 10:55:34.151330081 +0100
+++ gcc/testsuite/g++.dg/opt/pr71529.C  2016-11-07 10:56:13.319823373 +0100
@@ -0,0 +1,22 @@
+// PR middle-end/71529
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && { ! x32 } } } }
+// { dg-options "-fcheck-pointer-bounds -mmpx -O2" }
+
+class c1
+{
+ public:
+  virtual ~c1 ();
+};
+
+class c2
+{
+ public:
+  virtual ~c2 ();
+};
+
+class c3 : c1, c2 { };
+
+int main (int, char **)
+{
+  c3 obj;
+}
--- gcc/testsuite/g++.dg/opt/pr64411.C.jj   2016-11-07 10:51:38.557378145 +0100
+++ gcc/testsuite/g++.dg/opt/pr64411.C  2016-11-07 10:54:13.115378412 +0100
@@ -0,0 +1,28 @@
+// PR target/64411
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && lp64 } } }
+// { dg-options "-Os -mcmodel=medium -fPIC -fschedule-insns -fselective-scheduling" }
+
+typedef __SIZE_TYPE__ size_t;
+
+extern "C"  long strtol ()
+  { return 0; }
+
+static struct {
+  void *sp[2];
+} info;
+
+union S813
+{
+  void * c[5];
+}
+s813;
+
+S813 a813[5];
+S813 check813 (S813, S813 *, S813);
+
+void checkx813 ()
+{
+  __builtin_memset (&s813, '\0', sizeof (s813));
+  __builtin_memset (&info, '\0', sizeof (info));
+  check813 (s813, &a813[1], a813[2]);
+}
--- gcc/testsuite/g++.dg/opt/pr65105-4.C.jj 2016-11-07 10:48:58.587448018 +0100
+++ gcc/testsuite/g++.dg/opt/pr65105-4.C  2016-11-07 10:50:52.066979690 +0100
@@ -0,0 +1,19 @@
+// PR target/65105
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && ia32 } } }
+// { dg-options "-O2 -march=slm" }
+
+struct s {
+  long long l1, l2, l3, l4, l5;
+} *a;
+long long b;
+long long fn1()
+{
+  try
+{
+  b = (a->l1 | a->l2 | a->l3 | a->l4 | a->l5);
+  return a->l1;
+}
+  catch (int)
+{
+}
+}

Jakub

Re: [patch,avr] Add new option -mabsdata.

2016-11-07 Thread Georg-Johann Lay


On 07.11.2016 13:54, Georg-Johann Lay wrote:

This patch adds a new command line option -mabsdata which can be used to set
attribute absdata for all data in static storage so it can be accessed by LDS
and STS instructions.

This is only useful for some reduced Tiny devices like ATtiny40.

For other reduced Tiny where all of SRAM fits LDS / STS, the new option is
automatically set by the device specs file.

For ordinary devices the option is accepted but has no effect.

Ok for trunk?

Johann


gcc/
PR target/78093
* doc/invoke.texi (AVR Options) [-mabsdata]: Document new option.
* config/avr/avr.opt (-mabsdata): New option.
* config/avr/avr.c (avr_encode_section_info) [AVR_TINY]: If
-mabsdata & symbol is not progmem, tag as AVR_SYMBOL_FLAG_TINY_ABSDATA.
* config/avr/avr-mcus.def (attiny4/5/9/10/20): Use AVR_ISA_LDS.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Print cc1_absdata
spec depending on AVR_ISA_LDS.
* config/avr/specs.h (CC1_SPEC): Enhanced by cc1_absdata spec.

gcc/testsuite/
PR target/78093
* gcc.target/avr/torture/tiny-absdata-2.c: New test.


Here is the complete log entry (avr-arch.h was missing):

gcc/
PR target/78093
* doc/invoke.texi (AVR Options) [-mabsdata]: Document new option.
* config/avr/avr.opt (-mabsdata): New option.
* config/avr/avr-arch.h (avr_device_specific_features): Add AVR_ISA_LDS.
* config/avr/avr.c (avr_encode_section_info) [AVR_TINY]: If
-mabsdata & symbol is not progmem, tag as AVR_SYMBOL_FLAG_TINY_ABSDATA.
* config/avr/avr-mcus.def (attiny4/5/9/10/20): Use AVR_ISA_LDS.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Print cc1_absdata
spec depending on AVR_ISA_LDS.
* config/avr/specs.h (CC1_SPEC): Enhanced by cc1_absdata spec.
gcc/testsuite/
PR target/78093
* gcc.target/avr/torture/tiny-absdata-2.c: New test.

[PATCH] Fix -O0 AVX512 comparison ICE (PR target/78227)

2016-11-07 Thread Jakub Jelinek

Hi!

The following testcases ICE at -O0, because ix86_expand_sse_cmp avoids using
the passed in dest only if optimize or if there is some value overlap, but
we actually need to do that also if we have a maskcmp where we want to use
a different mode than dest has.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-07  Jakub Jelinek  

PR target/78227
* config/i386/i386.c (ix86_expand_sse_cmp): Force dest into
cmp_mode argument even for -O0 if cmp_mode != mode and maskcmp.

* gcc.target/i386/pr78227-1.c: New test.
* gcc.target/i386/pr78227-2.c: New test.

--- gcc/config/i386/i386.c.jj   2016-11-04 20:09:48.0 +0100
+++ gcc/config/i386/i386.c  2016-11-07 10:14:15.625018144 +0100
@@ -23561,6 +23561,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_
 cmp_op1 = force_reg (cmp_ops_mode, cmp_op1);
 
   if (optimize
+  || (cmp_mode != mode && maskcmp)
   || (op_true && reg_overlap_mentioned_p (dest, op_true))
   || (op_false && reg_overlap_mentioned_p (dest, op_false)))
 dest = gen_reg_rtx (maskcmp ? cmp_mode : mode);
--- gcc/testsuite/gcc.target/i386/pr78227-1.c.jj  2016-11-07 10:15:52.606762613 +0100
+++ gcc/testsuite/gcc.target/i386/pr78227-1.c   2016-11-07 10:24:58.821480125 +0100
@@ -0,0 +1,30 @@
+/* PR target/78227 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -O0 -Wno-psabi" } */
+
+typedef int V __attribute__((vector_size (64)));
+typedef long long int W __attribute__((vector_size (64)));
+
+V
+foo1 (V v)
+{
+  return v > 0;
+}
+
+V
+bar1 (V v)
+{
+  return v != 0;
+}
+
+W
+foo2 (W w)
+{
+  return w > 0;
+}
+
+W
+bar2 (W w)
+{
+  return w != 0;
+}
--- gcc/testsuite/gcc.target/i386/pr78227-2.c.jj  2016-11-07 10:22:17.055670476 +0100
+++ gcc/testsuite/gcc.target/i386/pr78227-2.c   2016-11-07 10:25:03.722413765 +0100
@@ -0,0 +1,30 @@
+/* PR target/78227 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O0 -Wno-psabi" } */
+
+typedef signed char V __attribute__((vector_size (64)));
+typedef short int W __attribute__((vector_size (64)));
+
+V
+foo1 (V v)
+{
+  return v > 0;
+}
+
+V
+bar1 (V v)
+{
+  return v != 0;
+}
+
+W
+foo2 (W w)
+{
+  return w > 0;
+}
+
+W
+bar2 (W w)
+{
+  return w != 0;
+}

Jakub
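
The two testcases above rely on comparisons between GNU C vector types. As a small illustration of what such a comparison produces, here is a sketch using 16-byte vectors so that it builds without any AVX512 options. It assumes a compiler that supports the vector_size attribute (GCC or a compatible one); the typedef v4si and the function positive_mask are names chosen for this example only.

```c
/* Four 32-bit lanes; a GNU C extension, not ISO C.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise v > 0.  Each result lane is -1 (all bits set) where the
   comparison holds and 0 where it does not.  */
v4si
positive_mask (v4si v)
{
  v4si zero = { 0, 0, 0, 0 };
  return v > zero;
}
```

With 64-byte vectors and AVX512, the hardware compare writes a mask register rather than a full vector, which is why the compare mode can differ from the mode of dest in the case the patch handles.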

Re: [PATCH] combine lhs zero_extract fix (PR78186)

2016-11-07 Thread Christophe Lyon

On 7 November 2016 at 10:14, Segher Boessenkool
 wrote:
> Hi Christophe,
>
> On Fri, Nov 04, 2016 at 02:31:28PM +0100, Christophe Lyon wrote:
>> Since this commit I have noticed execution failures on "old" arm targets:
>>
>>   gcc.dg/torture/pr48124-4.c   -O1  execution test
>>   gcc.dg/torture/pr48124-4.c   -O2  execution test
>>   gcc.dg/torture/pr48124-4.c   -O2 -flto -fno-use-linker-plugin
>> -flto-partition=none  execution test
>>   gcc.dg/torture/pr48124-4.c   -O2 -flto -fuse-linker-plugin
>> -fno-fat-lto-objects  execution test
>>   gcc.dg/torture/pr48124-4.c   -O3 -g  execution test
>>   gcc.dg/torture/pr48124-4.c   -Os  execution test
>>
>> For instance on target arm-none-linux-gnueabi --with-cpu=cortex-a9
>> --with-mode=arm
>> and running the tests with -march=armv5t
>
> Confirmed.  What a nasty, nasty bug, and it has been here for decades
> it seems.  Could you please open a PR?
>
>
Sure, I've created PR78232 for this.

Thanks.

Christophe

> Segher

[patch,avr] Add new option -mabsdata.

2016-11-07 Thread Georg-Johann Lay

This patch adds a new command line option -mabsdata which can be used to set 
attribute absdata for all data in static storage so it can be accessed by LDS 
and STS instructions.


This is only useful for some reduced Tiny devices like ATtiny40.

For other reduced Tiny where all of SRAM fits LDS / STS, the new option is 
automatically set by the device specs file.


For ordinary devices the option is accepted but has no effect.

Ok for trunk?

Johann


gcc/
PR target/78093
* doc/invoke.texi (AVR Options) [-mabsdata]: Document new option.
* config/avr/avr.opt (-mabsdata): New option.
* config/avr/avr.c (avr_encode_section_info) [AVR_TINY]: If
-mabsdata & symbol is not progmem, tag as AVR_SYMBOL_FLAG_TINY_ABSDATA.
* config/avr/avr-mcus.def (attiny4/5/9/10/20): Use AVR_ISA_LDS.
* config/avr/gen-avr-mmcu-specs.c (print_mcu): Print cc1_absdata
spec depending on AVR_ISA_LDS.
* config/avr/specs.h (CC1_SPEC): Enhanced by cc1_absdata spec.

gcc/testsuite/
PR target/78093
* gcc.target/avr/torture/tiny-absdata-2.c: New test.
Index: config/avr/avr-arch.h
===
--- config/avr/avr-arch.h	(revision 241841)
+++ config/avr/avr-arch.h	(working copy)
@@ -157,7 +157,9 @@ enum avr_device_specific_features
   AVR_ISA_NONE,
   AVR_ISA_RMW = 0x1, /* device has RMW instructions. */
   AVR_SHORT_SP= 0x2, /* Stack Pointer has 8 bits width. */
-  AVR_ERRATA_SKIP = 0x4  /* device has a core erratum. */
+  AVR_ERRATA_SKIP = 0x4, /* device has a core erratum. */
+  AVR_ISA_LDS = 0x8  /* whether LDS / STS is valid for all data in static
+storage.  Only useful for reduced Tiny.  */
 };
 
 /* Map architecture to its texinfo string.  */
Index: config/avr/avr-mcus.def
===
--- config/avr/avr-mcus.def	(revision 241841)
+++ config/avr/avr-mcus.def	(working copy)
@@ -341,11 +341,11 @@ AVR_MCU ("atxmega128a1u",ARCH_AVRXME
 AVR_MCU ("atxmega128a4u",ARCH_AVRXMEGA7, AVR_ISA_RMW,  "__AVR_ATxmega128A4U__",0x2000, 0x0, 3)
 /* Tiny family */
 AVR_MCU ("avrtiny",  ARCH_AVRTINY, AVR_ISA_NONE, NULL, 0x0040, 0x0, 1)
-AVR_MCU ("attiny4",  ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny4__",0x0040, 0x0, 1)
-AVR_MCU ("attiny5",  ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny5__",0x0040, 0x0, 1)
-AVR_MCU ("attiny9",  ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny9__",0x0040, 0x0, 1) 
-AVR_MCU ("attiny10", ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny10__",   0x0040, 0x0, 1)
-AVR_MCU ("attiny20", ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny20__",   0x0040, 0x0, 1)
+AVR_MCU ("attiny4",  ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny4__",0x0040, 0x0, 1)
+AVR_MCU ("attiny5",  ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny5__",0x0040, 0x0, 1)
+AVR_MCU ("attiny9",  ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny9__",0x0040, 0x0, 1) 
+AVR_MCU ("attiny10", ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny10__",   0x0040, 0x0, 1)
+AVR_MCU ("attiny20", ARCH_AVRTINY, AVR_ISA_LDS,  "__AVR_ATtiny20__",   0x0040, 0x0, 1)
 AVR_MCU ("attiny40", ARCH_AVRTINY, AVR_ISA_NONE, "__AVR_ATtiny40__",   0x0040, 0x0, 1)
 /* Assembler only.  */
 AVR_MCU ("avr1", ARCH_AVR1, AVR_ISA_NONE, NULL,0x0060, 0x0, 1)
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 241841)
+++ config/avr/avr.c	(working copy)
@@ -10182,14 +10182,18 @@ avr_encode_section_info (tree decl, rtx
   && SYMBOL_REF_P (XEXP (rtl, 0)))
 {
   rtx sym = XEXP (rtl, 0);
+  bool progmem_p = -1 == avr_progmem_p (decl, DECL_ATTRIBUTES (decl));
 
-  if (-1 == avr_progmem_p (decl, DECL_ATTRIBUTES (decl)))
+  if (progmem_p)
 {
   // Tag symbols for later addition of 0x4000 (AVR_TINY_PM_OFFSET).
   SYMBOL_REF_FLAGS (sym) |= AVR_SYMBOL_FLAG_TINY_PM;
 }
 
   if (avr_decl_absdata_p (decl, DECL_ATTRIBUTES (decl))
+  || (TARGET_ABSDATA
+  && !progmem_p
+  && !addr_attr)
   || (addr_attr
   // If addr_attr is non-null, it has an argument.  Peek into it.
   && TREE_INT_CST_LOW (TREE_VALUE (TREE_VALUE (addr_attr))) < 0xc0))
@@ -10198,7 +10202,7 @@ avr_encode_section_info (tree decl, rtx
   SYMBOL_REF_FLAGS (sym) |= AVR_SYMBOL_FLAG_TINY_ABSDATA;
 }
 
-  if (-1 == avr_progmem_p (decl, DECL_ATTRIBUTES (decl))
+  if (progmem_p
   && avr_decl_absdata_p (decl, DECL_ATTRIBUTES (decl)))
 {
   error ("%q+D has incompatible attributes %qs and %qs",
Index: config/avr/avr.opt
===
--- config/avr/avr.opt	(revision 241841)

[PATCH] Fix PR78205 -- fix BB SLP "gap" handling

2016-11-07 Thread Richard Biener


The following moves an overly conservative check that we do not access
excess elements when vectorizing a BB to a place where we can do
a better job with respect to the elements we actually use.

This means that for the included testcase we are not confused
by the read from c[4] but just do not vectorize the stores to x[0]
and x[1].

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78205
* tree-vect-stmts.c (vectorizable_load): Move check whether
we may run into gaps when BB vectorizing SLP permutations ...
* tree-vect-slp.c (vect_supported_load_permutation_p): ...
here where we can do a more precise check.

* gcc.dg/vect/bb-slp-pr78205.c: New testcase.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 241893)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -6548,18 +6611,6 @@ vectorizable_load (gimple *stmt, gimple_
   if (slp && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ())
slp_perm = true;
 
-  /* ???  The following is overly pessimistic (as well as the loop
- case above) in the case we can statically determine the excess
-elements loaded are within the bounds of a decl that is accessed.
-Likewise for BB vectorizations using masked loads is a possibility.  */
-  if (bb_vinfo && slp_perm && group_size % nunits != 0)
-   {
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-  "BB vectorization with gaps at the end of a load "
-  "is not supported\n");
- return false;
-   }
-
   /* Invalidate assumptions made by dependence analysis when vectorization
 on the unrolled body effectively re-orders stmts.  */
   if (!PURE_SLP_STMT (stmt_info)
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 241893)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1459,6 +1459,25 @@ vect_supported_load_permutation_p (slp_i
SLP_TREE_LOAD_PERMUTATION (node).release ();
  else
{
+ stmt_vec_info group_info
+   = vinfo_for_stmt (SLP_TREE_SCALAR_STMTS (node)[0]);
+ group_info = vinfo_for_stmt (GROUP_FIRST_ELEMENT (group_info));
+ unsigned nunits
+   = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (group_info));
+ unsigned k, maxk = 0;
+ FOR_EACH_VEC_ELT (SLP_TREE_LOAD_PERMUTATION (node), j, k)
+   if (k > maxk)
+ maxk = k;
+ /* In BB vectorization we may not actually use a loaded vector
+accessing elements in excess of GROUP_SIZE.  */
+ if (maxk >= (GROUP_SIZE (group_info) & ~(nunits - 1)))
+   {
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "BB vectorization with gaps at the end of "
+  "a load is not supported\n");
+ return false;
+   }
+
  /* Verify the permutation can be generated.  */
  vec tem;
  unsigned n_perms;
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c
===
--- gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c  (revision 0)
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c  (working copy)
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+double x[2], a[4], b[4], c[5];
+
+void foo ()
+{
+  a[0] = c[0];
+  a[1] = c[1];
+  a[2] = c[0];
+  a[3] = c[1];
+  b[0] = c[2];
+  b[1] = c[3];
+  b[2] = c[2];
+  b[3] = c[3];
+  x[0] = c[4];
+  x[1] = c[4];
+}
+
+/* We may not vectorize the store to x[] as it accesses c out-of bounds
+   but we do want to vectorize the other two store groups.  */
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "x\\\[\[0-1\]\\\] = " 2 "optimized" } } */
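
The arithmetic of the new check in vect_supported_load_permutation_p can be tried out in isolation. The sketch below is a hypothetical helper, not GCC code: it takes the element indices a load permutation uses, the group size, and the number of vector lanes (assumed to be a power of two), and reports whether the permutation reaches into the partial vector at the end of the group.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the patch's test
   maxk >= (GROUP_SIZE & ~(nunits - 1)).
   PERM holds N element indices used by the SLP node.  */
bool
load_perm_touches_gap (const unsigned *perm, unsigned n,
		       unsigned group_size, unsigned nunits)
{
  unsigned maxk = 0;
  for (unsigned i = 0; i < n; i++)
    if (perm[i] > maxk)
      maxk = perm[i];

  /* Elements below this bound are covered by whole vectors.  */
  unsigned full = group_size & ~(nunits - 1);
  return maxk >= full;
}
```

For the testcase, assuming two-lane double vectors, the group c[0..4] has size 5 and the bound is 4. The permutations feeding a[] and b[] use indices up to 1 and 3 and pass; the one feeding x[] uses index 4 and is rejected, so only the stores to x[0] and x[1] stay scalar.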

Re: [PATCH] Fix PR78228

2016-11-07 Thread Richard Biener

On Mon, 7 Nov 2016, Jakub Jelinek wrote:

> On Mon, Nov 07, 2016 at 01:17:25PM +0100, Richard Biener wrote:
> > 
> > The following fixes phiopt to not introduce undefined behavior
> > in its abs replacement code in case we negate only positive values
> > in the original code.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > 
> > Richard.
> > 
> > 2016-11-07  Richard Biener  
> > 
> > PR tree-optimization/78228
> > * tree-ssa-phiopt.c (abs_replacement): Avoid introducing
> > undefined behavior.
> > 
> > * gcc.dg/tree-ssa/phi-opt-15.c: New testcase.
> > 
> > Index: gcc/tree-ssa-phiopt.c
> > ===
> > --- gcc/tree-ssa-phiopt.c   (revision 241891)
> > +++ gcc/tree-ssa-phiopt.c   (working copy)
> > @@ -1453,6 +1453,14 @@ abs_replacement (basic_block cond_bb, ba
> >else
> >  negate = false;
> >  
> > +  /* If the code negates only iff positive then make sure to not
> > + introduce undefined behavior when negating or computing the absolute.
> > + ???  We could use range info if present to check for arg1 == INT_MIN. 
> >  */
> 
> Perhaps just
> 
> > +  if (negate
> > +  && (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
> > + && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1))))
> {
>   wide_int minv = TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (arg1)));
>   if (!expr_not_equal_to (arg1, minv))
>   return false;
> }
> ?

rather wi::min_value (TREE_TYPE (arg1), SIGNED) I guess.  Didn't know
of expr_not_equal_to, seems to be only used from i386.c at the moment.

We can improve things on trunk but I'd prefer to be safe on the 
branch(es).

Richard.

Re: [PATCH] Fix PR78228

2016-11-07 Thread Jakub Jelinek

On Mon, Nov 07, 2016 at 01:17:25PM +0100, Richard Biener wrote:
> 
> The following fixes phiopt to not introduce undefined behavior
> in its abs replacement code in case we negate only positive values
> in the original code.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> 
> Richard.
> 
> 2016-11-07  Richard Biener  
> 
>   PR tree-optimization/78228
>   * tree-ssa-phiopt.c (abs_replacement): Avoid introducing
>   undefined behavior.
> 
>   * gcc.dg/tree-ssa/phi-opt-15.c: New testcase.
> 
> Index: gcc/tree-ssa-phiopt.c
> ===
> --- gcc/tree-ssa-phiopt.c (revision 241891)
> +++ gcc/tree-ssa-phiopt.c (working copy)
> @@ -1453,6 +1453,14 @@ abs_replacement (basic_block cond_bb, ba
>else
>  negate = false;
>  
> +  /* If the code negates only iff positive then make sure to not
> + introduce undefined behavior when negating or computing the absolute.
> + ???  We could use range info if present to check for arg1 == INT_MIN.  
> */

Perhaps just

> +  if (negate
> +  && (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
> +   && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1))))
{
  wide_int minv = TYPE_MIN_VALUE (TYPE_DOMAIN (TREE_TYPE (arg1)));
  if (!expr_not_equal_to (arg1, minv))
return false;
}
?

Jakub

[PATCH] Fix PR78229

2016-11-07 Thread Richard Biener


Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
and branch.

Richard.

2016-11-07  Richard Biener  

PR target/78229
* config/i386/i386.c (ix86_gimple_fold_builtin): Do not adjust
EH info.

* g++.dg/pr78229.C: New testcase.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 241891)
+++ gcc/config/i386/i386.c  (working copy)
@@ -37664,7 +37664,7 @@ ix86_gimple_fold_builtin (gimple_stmt_it
  gsi_insert_before (gsi, g, GSI_SAME_STMT);
  g = gimple_build_assign (gimple_call_lhs (stmt), NOP_EXPR, lhs);
  gimple_set_location (g, loc);
- gsi_replace (gsi, g, true);
+ gsi_replace (gsi, g, false);
  return true;
}
   break;
Index: gcc/testsuite/g++.dg/pr78229.C
===
--- gcc/testsuite/g++.dg/pr78229.C  (revision 0)
+++ gcc/testsuite/g++.dg/pr78229.C  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile { target x86_64-*-* i?86-*-* } } */
+/* { dg-options "-O2 -mbmi -w" } */
+
+void a();
+inline int b(int c) {
+int d = c;
+return __builtin_ia32_tzcnt_u32(d);
+}
+struct e {};
+int f, g, h;
+void fn3() {
+float j;
+
+  {
+   e k;
+   while (h) {
+   if (g == 0)
+ continue;
+   int i = b(g);
+   f = i;
+   }
+   a();
+  }
+}

[PATCH] Fix PR78218

2016-11-07 Thread Richard Biener


Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78218
* gimple-ssa-store-merging.c
(pass_store_merging::terminate_all_aliasing_chains):
Drop unused argument, fix alias check to also consider uses.
(pass_store_merging::execute): Adjust.

* gcc.dg/torture/pr78218.c: New testcase.

Index: gcc/gimple-ssa-store-merging.c
===
--- gcc/gimple-ssa-store-merging.c  (revision 241893)
+++ gcc/gimple-ssa-store-merging.c  (working copy)
@@ -726,7 +726,7 @@ private:
   hash_map m_stores;
 
   bool terminate_and_process_all_chains ();
-  bool terminate_all_aliasing_chains (tree, imm_store_chain_info **,
+  bool terminate_all_aliasing_chains (imm_store_chain_info **,
  bool, gimple *);
   bool terminate_and_release_chain (imm_store_chain_info *);
 }; // class pass_store_merging
@@ -755,8 +755,7 @@ pass_store_merging::terminate_and_proces
If that is the case we have to terminate any chain anchored at BASE.  */
 
 bool
-pass_store_merging::terminate_all_aliasing_chains (tree dest,
-  imm_store_chain_info
+pass_store_merging::terminate_all_aliasing_chains (imm_store_chain_info
 **chain_info,
   bool var_offset_p,
   gimple *stmt)
@@ -788,7 +787,10 @@ pass_store_merging::terminate_all_aliasi
  unsigned int i;
  FOR_EACH_VEC_ELT ((*chain_info)->m_store_info, i, info)
{
- if (stmt_may_clobber_ref_p (info->stmt, dest))
+ if (ref_maybe_used_by_stmt_p (stmt,
+   gimple_assign_lhs (info->stmt))
+ || stmt_may_clobber_ref_p (stmt,
+gimple_assign_lhs (info->stmt)))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -1458,7 +1460,7 @@ pass_store_merging::execute (function *f
}
 
  /* Store aliases any existing chain?  */
- terminate_all_aliasing_chains (lhs, chain_info, false, stmt);
+ terminate_all_aliasing_chains (chain_info, false, stmt);
  /* Start a new chain.  */
  struct imm_store_chain_info *new_chain
= new imm_store_chain_info (base_addr);
@@ -1477,13 +1479,13 @@ pass_store_merging::execute (function *f
}
}
  else
-   terminate_all_aliasing_chains (lhs, chain_info,
+   terminate_all_aliasing_chains (chain_info,
   offset != NULL_TREE, stmt);
 
  continue;
}
 
- terminate_all_aliasing_chains (NULL_TREE, NULL, false, stmt);
+ terminate_all_aliasing_chains (NULL, false, stmt);
}
   terminate_and_process_all_chains ();
 }
Index: gcc/testsuite/gcc.dg/torture/pr78218.c
===
--- gcc/testsuite/gcc.dg/torture/pr78218.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr78218.c  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+
+struct 
+{
+  int v;
+} a[2];
+
+int b; 
+
+void __attribute__((noinline,noclone))
+check ()
+{
+  if (a[0].v != 1)
+__builtin_abort ();
+}
+
+int main ()
+{
+  a[1].v = 1;
+  a[0] = a[1];
+  a[1].v = 0;
+  check (a);
+  return 0;
+}

[PATCH] Fix PR78228

2016-11-07 Thread Richard Biener


The following fixes phiopt to not introduce undefined behavior
in its abs replacement code in case we negate only positive values
in the original code.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-11-07  Richard Biener  

PR tree-optimization/78228
* tree-ssa-phiopt.c (abs_replacement): Avoid introducing
undefined behavior.

* gcc.dg/tree-ssa/phi-opt-15.c: New testcase.

Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 241891)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -1453,6 +1453,14 @@ abs_replacement (basic_block cond_bb, ba
   else
 negate = false;
 
+  /* If the code negates only iff positive then make sure to not
+ introduce undefined behavior when negating or computing the absolute.
+ ???  We could use range info if present to check for arg1 == INT_MIN.  */
+  if (negate
+  && (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg1))
+ && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg1))))
+return false;
+
   result = duplicate_ssa_name (result, NULL);
 
   if (negate)
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c  (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int i)
+{
+  if (i > 0)
+i = -i;
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-not "ABS" "optimized" } } */
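
To see why the replacement is unsafe for types with undefined overflow, it helps to run the source form of the testcase on the extreme value. The sketch below restates foo from the testcase under the hypothetical name neg_if_positive; nothing else is assumed.

```c
#include <limits.h>

/* Same shape as foo in phi-opt-15.c: negate only positive values.
   No input can overflow here, since -i is evaluated only for i > 0.  */
int
neg_if_positive (int i)
{
  if (i > 0)
    i = -i;
  return i;
}
```

INT_MIN simply passes through this function unchanged. Rewriting the body as -abs (i) would instead evaluate abs (INT_MIN), which is undefined in C, so the transformation would introduce undefined behavior that the original code does not have.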

Re: [PATCH][AArch64] Fix PR target/77822: Use tighter predicates for zero_extract patterns

2016-11-07 Thread James Greenhalgh

On Mon, Oct 17, 2016 at 05:15:21PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> For the attached testcase the code ends up trying to extract bits outside the
> range of the normal register widths. The aarch64 patterns for ubfx and tbnz
> end up accepting such operands and emitting invalid assembly
> such as 'ubfx x18,x2,192,32'
> 
> The solution is to add proper predicates and guards to the operands of the
> zero_extract operations that are going on.  I had a look at all the other
> patterns in aarch64 that generate/use zero_extract and they all have guards
> on their
> operands in one form or another to avoid them accessing an area that is out
> of range.
> 
> With this patch the testcase compiles and assembles fine.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk?

Ok, sorry for the delay on review.

Thanks,
James

> 2016-10-17  Kyrylo Tkachov  
> 
> PR target/77822
> * config/aarch64/aarch64.md (*tb1): Use
> aarch64_simd_shift_imm_ predicate for operand 1.
> (, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
> to restrict them to an appropriate range and add FAIL check if the
> region they specify is out of range.  Delete useless constraint
> strings.
> (*, ANY_EXTRACT): Add appropriate predicates on operands
> 2 and 3 to restrict their range and add pattern predicate.
>
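
The condition that the tighter predicates and the FAIL check are meant to enforce can be written down as a plain range test. The following is only a sketch of that condition in C; the function name and parameters are hypothetical and do not correspond to the predicate names used in aarch64.md.

```c
#include <stdbool.h>

/* Return true if a bitfield of WIDTH bits starting at bit POS lies
   entirely inside a register of REG_BITS bits.  */
bool
bitfield_extract_in_range (unsigned pos, unsigned width, unsigned reg_bits)
{
  return width >= 1 && pos < reg_bits && width <= reg_bits - pos;
}
```

For the invalid instruction quoted above, 'ubfx x18,x2,192,32', the start position 192 already lies outside a 64-bit register, so the test fails and no such instruction should be emitted.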

Re: [PATCH][AArch64] Fix PR target/77822: Use tighter predicates for zero_extract patterns

2016-11-07 Thread Kyrill Tkachov


Ping.

Thanks,
Kyrill

On 31/10/16 12:10, Kyrill Tkachov wrote:

Ping.

Thanks,
Kyrill

On 24/10/16 14:12, Kyrill Tkachov wrote:


On 24/10/16 12:29, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01321.html



I just noticed my original ChangeLog entry was truncated.
It is
2016-10-04  Kyrylo Tkachov  

PR target/77822
* config/aarch64/aarch64.md (*tb1): Use
aarch64_simd_shift_imm_ predicate for operand 1.
(, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
to restrict them to an appropriate range and add FAIL check if the
region they specify is out of range.  Delete useless constraint
strings.
(*, ANY_EXTRACT): Add appropriate predicates on operands
2 and 3 to restrict their range and add pattern predicate.

2016-10-04  Kyrylo Tkachov  

PR target/77822
* g++.dg/torture/pr77822.C: New test.

Kyrill



On 17/10/16 17:15, Kyrill Tkachov wrote:

Hi all,

For the attached testcase the code ends up trying to extract bits outside the 
range of the normal register
widths. The aarch64 patterns for ubfx and tbnz end up accepting such operands 
and emitting invalid assembly
such as 'ubfx x18,x2,192,32'

The solution is to add proper predicates and guards to the operands of the 
zero_extract operations that are going on.
I had a look at all the other patterns in aarch64 that generate/use 
zero_extract and they all have guards on their
operands in one form or another to avoid them accessing an area that is out of 
range.

With this patch the testcase compiles and assembles fine.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-10-17  Kyrylo Tkachov  

PR target/77822
* config/aarch64/aarch64.md (*tb1): Use
aarch64_simd_shift_imm_ predicate for operand 1.
(, ANY_EXTRACT): Use tighter predicates on operands 2 and 3
to restrict them to an appropriate range and add FAIL check if the
region they specify is out of range.  Delete useless constraint
strings.
(*, ANY_EXTRACT): Add appropriate predicates on operands
2 and 3 to restrict their range and add pattern predicate.

2016-10-17  Kyrylo Tkachov  

PR target/77822

Re: Ping^6 Re: [Patch AArch64] Add floatdihf2 and floatunsdihf2 patterns

2016-11-07 Thread James Greenhalgh

On Fri, Oct 21, 2016 at 05:31:14PM +0100, James Greenhalgh wrote:
> On Wed, Oct 12, 2016 at 04:56:52PM +0100, James Greenhalgh wrote:
> > On Wed, Sep 28, 2016 at 05:17:14PM +0100, James Greenhalgh wrote:
> > > On Wed, Sep 21, 2016 at 10:42:03AM +0100, James Greenhalgh wrote:
> > > > On Tue, Sep 13, 2016 at 10:31:28AM +0100, James Greenhalgh wrote:
> > > > > On Tue, Sep 06, 2016 at 10:19:50AM +0100, James Greenhalgh wrote:
> > > > > > This patch adds patterns for conversion from 64-bit integer to 16-bit
> > > > > > floating-point values under AArch64 targets which don't have support for
> > > > > > the ARMv8.2-A 16-bit floating point extensions.
> > > > > > 
> > > > > > We implement these by first saturating to a SImode (we know that any
> > > > > > values >= 65504 will round to infinity after conversion to HFmode), then
> > > > > > converting to a DFmode (unsigned conversions could go to SFmode, but there
> > > > > > is no performance benefit to this). Then converting to HFmode.
> > > > > > 
> > > > > > Having added these patterns, the expansion path in "expand_float" will
> > > > > > now try to use them for conversions from SImode to HFmode as there is no
> > > > > > floatsihf2 pattern. expand_float first tries widening the integer size and
> > > > > > looking for a match, so it will try SImode -> DImode. But our DI mode
> > > > > > pattern is going to then saturate us back to SImode which is wasteful.
> > > > > > 
> > > > > > Better would be for us to provide float(uns)sihf2 patterns directly.
> > > > > > So that's what this patch does.
> > > > > > 
> > > > > > The testcase added in this patch would fail on trunk for AArch64. There is
> > > > > > no libgcc routine to make the conversion, and we don't provide appropriate
> > > > > > patterns in the backend, so we get a link-time error.
> > > > > > 
> > > > > > Bootstrapped and tested on aarch64-none-linux-gnu
> > > > > > 
> > > > > > OK for trunk?
> > > > > 
> > > > > Ping.
> > > > 
> > > > Ping^2
> > > 
> > > Ping^3
> > 
> > Ping^4
> 
> Ping^5

Ping^6

Thanks,
James
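
For readers following the quoted description, the conversion sequence can be
modelled in plain C. This is only a rough sketch under stated assumptions, not
the code the patch generates: the function names are hypothetical, the clamp
bounds are simply the 32-bit signed range, and the final narrowing to a 16-bit
float is left out because it depends on target support. It shows the SImode
path (widen to double) and the DImode path (saturate to 32 bits, then widen).

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 64-bit signed value into the 32-bit signed
   range, mirroring the "saturate to SImode" step from the description.  */
static int32_t
toy_saturate_di_to_si (int64_t x)
{
  if (x > INT32_MAX)
    return INT32_MAX;
  if (x < INT32_MIN)
    return INT32_MIN;
  return (int32_t) x;
}

/* SImode path: every 32-bit integer is exactly representable in a double,
   so this widening step introduces no rounding of its own.  */
static double
toy_si_to_df (int32_t x)
{
  return (double) x;
}

/* DImode path: saturate first, then reuse the SImode path.  The result
   would then be narrowed to half precision by the target.  */
static double
toy_di_to_df (int64_t x)
{
  return toy_si_to_df (toy_saturate_di_to_si (x));
}
```

Because the intermediate double holds the saturated value exactly, only the
last narrowing step rounds, which is presumably why going through DFmode is
safe here.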

> > > > > > 2016-09-06  James Greenhalgh  
> > > > > > 
> > > > > > * config/aarch64/aarch64.md (sihf2): Convert to expand.
> > > > > > (dihf2): Likewise.
> > > > > > (aarch64_fp16_hf2): New.
> > > > > > 
> > > > > > 2016-09-06  James Greenhalgh  
> > > > > > 
> > > > > > * gcc.target/aarch64/floatdihf2_1.c: New.
> > > > > > 
> > > > > 
> > > > > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > > > > > index 6afaf90..1882a72 100644
> > > > > > --- a/gcc/config/aarch64/aarch64.md
> > > > > > +++ b/gcc/config/aarch64/aarch64.md
> > > > > > @@ -4630,7 +4630,14 @@
> > > > > >[(set_attr "type" "f_cvti2f")]
> > > > > >  )
> > > > > >  
> > > > > > -(define_insn "hf2"
> > > > > > +;; If we do not have ARMv8.2-A 16-bit floating point extensions, the
> > > > > > +;; midend will arrange for an SImode conversion to HFmode to first go
> > > > > > +;; through DFmode, then to HFmode.  But first it will try converting
> > > > > > +;; to DImode then down, which would match our DImode pattern below and
> > > > > > +;; give very poor code-generation.  So, we must provide our own emulation
> > > > > > +;; of the mid-end logic.
> > > > > > +
> > > > > > +(define_insn "aarch64_fp16_hf2"
> > > > > >[(set (match_operand:HF 0 "register_operand" "=w")
> > > > > > (FLOATUORS:HF (match_operand:GPI 1 "register_operand" "r")))]
> > > > > >"TARGET_FP_F16INST"
> > > > > > @@ -4638,6 +4645,53 @@
> > > > > >[(set_attr "type" "f_cvti2f")]
> > > > > >  )
> > > > > >  
> > > > > > +(define_expand "sihf2"
> > > > > > +  [(set (match_operand:HF 0 "register_operand")
> > > > > > +   (FLOATUORS:HF (match_operand:SI 1 "register_operand")))]
> > > > > > +  "TARGET_FLOAT"
> > > > > > +{
> > > > > > +  if (TARGET_FP_F16INST)
> > > > > > +emit_insn (gen_aarch64_fp16_sihf2 (operands[0], operands[1]));
> > > > > > +  else
> > > > > > +{
> > > > > > +  rtx convert_target = gen_reg_rtx (DFmode);
> > > > > > +  emit_insn (gen_sidf2 (convert_target, operands[1]));
> > > > > > +  emit_insn (gen_truncdfhf2 (operands[0], convert_target));
> > > > > > +}
> > > > > > +  DONE;
> > > > > > +}
> > > > > > +)
> > > > > > +
> > > > > > +;; For DImode there is no wide enough floating-point mode that we
> > > > > > +;; can convert through natively (TFmode would work, but requires a library
> > > > > > +;; call).  However, we know that any value >= 65504 will be rounded
> > > > > > +;; to infinity on conversion.  This is well within the range of SImode, so
> > > > > > +;; we can:
> > > > > > +;;   Saturate to SImode.
> > > > > > +;;   Convert

[RFC] Fix PR rtl-optimization/59461

2016-11-07 Thread Eric Botcazou

It's a missed optimization of a redundant zero-extension on the SPARC, which 
originally comes from PR rtl-optimization/58295 for ARM.  The extension is 
eliminated on the ARM because the load is explicitly zero-extended in RTL;
on the SPARC the load is implicitly zero-extended by means of LOAD_EXTEND_OP 
and the combiner is blocked by limitations of the nonzero_bits machinery.

The approach is two-pronged:
 1. it lifts a limitation in reg_nonzero_bits_for_combine that was recently 
added (https://gcc.gnu.org/ml/gcc-patches/2013-11/msg03782.html) and prevents 
the combiner from reasoning on larger modes under certain circumstances.
 2. it makes nonzero_bits1 propagate results from inner REGs to paradoxical 
SUBREGs if both WORD_REGISTER_OPERATIONS and LOAD_EXTEND_OP are set.

This also eliminates quite a few zero-extensions in the compile.exp testsuite 
at -O2 on the SPARC.  Tested on x86-64/Linux and SPARC/Solaris.
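
The effect of the first prong can be illustrated with a toy model in C. This
is a sketch only, not GCC code: the names are hypothetical, and modes are
reduced to bit precisions. The old rule assumed nothing about bits above the
precision of the mode the register was last set in; the new rule trusts the
recorded mask as is, on the grounds given in the patch comment that the mask
was computed in nonzero_bits_mode.

```c
#include <stdint.h>

/* Hypothetical stand-in for GET_MODE_MASK: all ones in the low
   PRECISION bits.  */
static uint64_t
toy_mode_mask (unsigned precision)
{
  return precision >= 64 ? ~UINT64_C (0)
			 : (UINT64_C (1) << precision) - 1;
}

/* Old rule: if the register was last set in a narrower mode, treat every
   bit above that mode as possibly nonzero.  */
static uint64_t
toy_old_rule (uint64_t last_set_nonzero, unsigned last_prec, unsigned prec)
{
  uint64_t mask = last_set_nonzero;
  if (last_prec < prec)
    mask |= toy_mode_mask (prec) ^ toy_mode_mask (last_prec);
  return mask;
}

/* New rule: use the recorded mask unchanged.  */
static uint64_t
toy_new_rule (uint64_t last_set_nonzero)
{
  return last_set_nonzero;
}

/* An AND with MASK is redundant when no possibly-nonzero bit lies
   outside MASK.  */
static int
toy_and_is_redundant (uint64_t nonzero, uint64_t mask)
{
  return (nonzero & ~mask) == 0;
}
```

For a register last set in an 8-bit mode with recorded mask 0xff and queried
in a 32-bit mode, the old rule widens the mask to 0xffffffff, so a following
AND with 0xff cannot be proven redundant; under the new rule it can.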


2016-11-07  Eric Botcazou  

PR rtl-optimization/59461
* doc/rtl.texi (paradoxical subregs): Add missing word.
* combine.c (reg_nonzero_bits_for_combine): Do not discard results
in modes with precision larger than that of last_set_mode.
* rtlanal.c (nonzero_bits1) : If WORD_REGISTER_OPERATIONS is
set and LOAD_EXTEND_OP is appropriate, propagate results from inner
REGs to paradoxical SUBREGs.
(num_sign_bit_copies1) : Likewise.  Check that the mode is not
larger than a word before invoking LOAD_EXTEND_OP on it.


2016-11-07  Eric Botcazou  

* gcc.target/sparc/pr59461.c: New test.

-- 
Eric Botcazou

/* PR rtl-optimization/59461 */

/* { dg-do compile } */
/* { dg-options "-O2" } */

extern char zeb_test_array[10];

unsigned char ee_isdigit2(unsigned int i)
{
  unsigned char c = zeb_test_array[i];
  unsigned char retval;

  retval = ((c>='0') & (c<='9')) ? 1 : 0;
  return retval;
}

/* { dg-final { scan-assembler-not "and\t%" } } */
Index: doc/rtl.texi
===
--- doc/rtl.texi	(revision 241856)
+++ doc/rtl.texi	(working copy)
@@ -1882,7 +1882,7 @@ When used as an rvalue, the low-order bi
 taken from @var{reg} while the high-order bits may or may not be
 defined.
 
-The high-order bits of rvalues are in the following circumstances:
+The high-order bits of rvalues are defined in the following circumstances:
 
 @itemize
 @item @code{subreg}s of @code{mem}
Index: combine.c
===
--- combine.c	(revision 241856)
+++ combine.c	(working copy)
@@ -9878,18 +9878,17 @@ reg_nonzero_bits_for_combine (const_rtx
 		  (DF_LR_IN (ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb),
 		   REGNO (x)
 {
-  unsigned HOST_WIDE_INT mask = rsp->last_set_nonzero_bits;
-
-  if (GET_MODE_PRECISION (rsp->last_set_mode) < GET_MODE_PRECISION (mode))
-	/* We don't know anything about the upper bits.  */
-	mask |= GET_MODE_MASK (mode) ^ GET_MODE_MASK (rsp->last_set_mode);
-
-  *nonzero &= mask;
+  /* Note that, even if the precision of last_set_mode is lower than that
+	 of mode, record_value_for_reg invoked nonzero_bits on the register
+	 with nonzero_bits_mode (because last_set_mode is necessarily integral
+	 and HWI_COMPUTABLE_MODE_P in this case) so bits in nonzero_bits_mode
+	 are all valid, hence in mode too since nonzero_bits_mode is defined
+	 to the largest HWI_COMPUTABLE_MODE_P mode.  */
+  *nonzero &= rsp->last_set_nonzero_bits;
   return NULL;
 }
 
   tem = get_last_value (x);
-
   if (tem)
 {
   if (SHORT_IMMEDIATES_SIGN_EXTEND)
@@ -9898,7 +9897,8 @@ reg_nonzero_bits_for_combine (const_rtx
 
   return tem;
 }
-  else if (nonzero_sign_valid && rsp->nonzero_bits)
+
+  if (nonzero_sign_valid && rsp->nonzero_bits)
 {
   unsigned HOST_WIDE_INT mask = rsp->nonzero_bits;
 
Index: rtlanal.c
===
--- rtlanal.c	(revision 241856)
+++ rtlanal.c	(working copy)
@@ -4242,7 +4242,7 @@ cached_nonzero_bits (const_rtx x, machin
 /* Given an expression, X, compute which bits in X can be nonzero.
We don't care about bits outside of those defined in MODE.
 
-   For most X this is simply GET_MODE_MASK (GET_MODE (MODE)), but if X is
+   For most X this is simply GET_MODE_MASK (GET_MODE (X)), but if X is
an arithmetic operation, we can do better.  */
 
 static unsigned HOST_WIDE_INT
@@ -4549,18 +4549,17 @@ nonzero_bits1 (const_rtx x, machine_mode
   /* If this is a SUBREG formed for a promoted variable that has
 	 been zero-extended, we know that at least the high-order bits
 	 are zero, though others might be too.  */
-
   if (SUBREG_PROMOTED_VAR_P (x) && SUBREG_PROMOTED_UNSIGNED_P (x))
 	nonzero = GET_MODE_MASK (GET_MODE (x))
 		  & cached_nonzero_bits (SUBREG_REG (x), GET_MODE (x),
 	 known_x, known_mode, known_ret);
 
-  inner_mode = GET_MODE (SUBREG_REG (x));

Re: [PATCH 0/7] Libsanitizer merge from upstream r285547.

2016-11-07 Thread Maxim Ostapenko


On 07/11/16 13:04, Jakub Jelinek wrote:

On Mon, Nov 07, 2016 at 11:22:28AM +0300, Maxim Ostapenko wrote:

Hi,

this patch set performs libsanitizer merge from upstream.

Patch 1 is the library merge itself.

Patch 2 is the reapplied change for SPARC by David S. Miller.

Patch 3 changes heuristic for extracting last PC from stack frame for ARM in
fast unwind routine. More details can be found here
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61771).

Patch 4 replaces Jakub's fix for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63888 and removes
CheckODRViolationViaPoisoning call from RegisterGlobal to avoid false
positive odr violation reports.

Patch 5 combines necessary compiler changes.

Patch 6 adds several new tests, backported from upstream.

The patches 1-6 are ok for trunk now, if you fix the missing space
before ( in patch 5.


Ok, I'm going to land these shortly, thank you for review.




Patch 7 adds support for ASan odr indicators at compiler side.

This one can be applied incrementally once the issues reported in there
are resolved.


Yes, I'll fix the patch.



And the libtsan ABI stuff (__intercept*stat*) can be resolved incrementally
too.

Thanks.

Jakub

Re: [PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)

2016-11-07 Thread Richard Biener

On Mon, 7 Nov 2016, Richard Biener wrote:

> On Fri, 4 Nov 2016, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > Just 2 nits:
> > 
> > On Fri, Oct 28, 2016 at 01:46:57PM +0200, Richard Biener wrote:
> > > +/* Return a pointer to the Nth token in PARERs tokens_buf.  */
> > 
> > PARSERs ?
> 
> Fixed.
> 
> > > @@ -454,7 +423,7 @@ c_lex_one_token (c_parser *parser, c_token *token)
> > >  /* Return a pointer to the next token from PARSER, reading it in if
> > > necessary.  */
> > >  
> > > -static inline c_token *
> > > +c_token *
> > >  c_parser_peek_token (c_parser *parser)
> > >  {
> > >if (parser->tokens_avail == 0)
> > 
> > I wonder if turning all of these into non-inlines is a good idea.
> > Can't you move them to the common header instead?
> 
> The issue with moving is that I failed to export the definition of
> c_parser in c-parser.h due to gengtype putting vec 
> handlers into gtype-c.h but not gtype-objc.h and thus objc bootstrap
> fails :/

If anybody wants to try, f82dc04b921a52a9a5c90d957a824e1c2d04
has it (objc build) still broken on the gimplefe git branch.
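
To make the trade-off concrete, here is a small C sketch of the arrangement
under discussion. It is only an illustration with hypothetical toy_ names, not
the C front end's code: a shared header would expose just an opaque parser
type and an out-of-line accessor, while the full struct definition stays
private. Turning the accessor into a static inline in the header would
require the struct definition to be visible there, which is the export that
ran into the gengtype problem described above.

```c
/* What a shared header could declare: an opaque parser type and a
   non-inline accessor.  */
typedef struct toy_token { int type; } toy_token;
typedef struct toy_parser toy_parser;
toy_token *toy_parser_peek_token (toy_parser *parser);

/* What stays private to the parser's own source file.  */
struct toy_parser
{
  toy_token tokens[2];
  int tokens_avail;
};

/* Hypothetical lexer stub: fills in one token.  */
static void
toy_lex_one_token (toy_parser *parser, toy_token *token)
{
  (void) parser;
  token->type = 1;
}

/* Return a pointer to the next token from PARSER, reading it in if
   necessary.  */
toy_token *
toy_parser_peek_token (toy_parser *parser)
{
  if (parser->tokens_avail == 0)
    {
      toy_lex_one_token (parser, &parser->tokens[0]);
      parser->tokens_avail = 1;
    }
  return &parser->tokens[0];
}
```

Callers within the same translation unit can still have the accessor inlined
when the compiler judges it useful; only callers in other files pay for a
real call.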

> I believe (well, I hope) that code generation for the C parser
> should be mostly unaffected (inlining is still done as determined
> useful) and the performance of the GIMPLE parser shouldn't be
> too important.
> 
> If anybody feels like digging into the gengtype issue, I gave up
> after trying for half a day to trick it to do what I want
> (like for example also putting it in gtype-objc.h).
> 
> > The rest I defer to Joseph or Marek.
> 
> Thanks,
> Richard.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
