date:20151123

Re: [PATCH 1/2] Libsanitizer merge from upstream r253555.

2015-11-23 Thread Maxim Ostapenko


On 23/11/15 16:24, Jakub Jelinek wrote:

On Mon, Nov 23, 2015 at 04:21:34PM +0300, Maxim Ostapenko wrote:

Yeah, right. I've asked about kernel headers just to make sure I correctly
understand the issue.

Actually, I see such code in
lib/sanitizer_common/sanitizer_platform_limits_posix.cc:

#if defined(PTRACE_GETVFPREGS) && defined(PTRACE_SETVFPREGS)
   int ptrace_getvfpregs = PTRACE_GETVFPREGS;
   int ptrace_setvfpregs = PTRACE_SETVFPREGS;
#else
   int ptrace_getvfpregs = -1;
   int ptrace_setvfpregs = -1;
#endif

and in ptrace interceptor:

  else if (request == ptrace_setvfpregs)
 COMMON_INTERCEPTOR_READ_RANGE(ctx, data, struct_user_vfpregs_struct_sz);
  else if (request == ptrace_getvfpregs)
 COMMON_INTERCEPTOR_WRITE_RANGE(ctx, data, struct_user_vfpregs_struct_sz)

So, perhaps we can do the same thing with ARM_VFPREGS_SIZE, something like
this?

diff --git
a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
index 9866cc9..20ff224 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc
@@ -323,10 +323,14 @@ unsigned struct_ElfW_Phdr_sz = sizeof(Elf_Phdr);
unsigned struct_user_fpxregs_struct_sz = sizeof(struct
user_fpxregs_struct);
  #endif // __x86_64 || __mips64 || __powerpc64__ || __aarch64__ || __arm__
  #ifdef __arm__
+#if defined(ARM_VFPREGS_SIZE)
unsigned struct_user_vfpregs_struct_sz = ARM_VFPREGS_SIZE;
  #else
unsigned struct_user_vfpregs_struct_sz = 0;
  #endif
+#else
+  unsigned struct_user_vfpregs_struct_sz = 0;
+#endif

Maybe, but then it would need to be approved upstream.
If you just define ARM_VFPREGS_SIZE to 0 or whatever else in
the GCC owned wrapper headers, you can avoid that.
I guess talk to upstream.

Jakub




Ok, I posted a fix to upstream (http://reviews.llvm.org/D14921) 
yesterday, but it's still not reviewed. So, I'm wondering if I should 
fix the issue locally?

Attaching proposed fix following Jakub's suggestion.

Christophe could you try the patch?
diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog
index b97fc7d..c392c57 100644
--- a/libsanitizer/ChangeLog
+++ b/libsanitizer/ChangeLog
@@ -1,3 +1,7 @@
+2015-11-24  Maxim Ostapenko  
+
+	* include/system/linux/asm/ptrace.h: New header.
+
 2015-11-23  Maxim Ostapenko  
 
 	* All source files: Merge from upstream r253555.
diff --git a/libsanitizer/include/system/linux/asm/ptrace.h b/libsanitizer/include/system/linux/asm/ptrace.h
new file mode 100644
index 000..dbdd58b
--- /dev/null
+++ b/libsanitizer/include/system/linux/asm/ptrace.h
@@ -0,0 +1,8 @@
+#include_next 
+#if defined(__arm__)
+#ifndef ARM_VFPREGS_SIZE
+/* The size of the user-visible VFP state as seen by PTRACE_GET/SETVFPREGS
+   and core dumps.  */
+#define ARM_VFPREGS_SIZE ( 32 * 8 /*fpregs*/ + 4 /*fpscr*/ )
+#endif
+#endif

[PATCH] rs6000: Fix for and_operand oversight (PR68332, PR67677)

2015-11-23 Thread Segher Boessenkool

Calling rs6000_is_valid_and_mask on a reg instead of on a const_int is
not a good idea, as PR68332 and PR67677 as well as testing with
--enable-checking=yes,rtl show.  Fix this.

Bootstrapped and tested on powerpc64-linux.  Is this okay for trunk?


Segher


2015-11-24  Segher Boessenkool  

PR target/66217
PR target/67677
PR target/68332
* config/rs6000/predicates.md (and_operand): Check that the operand
is a const_int before calling rs6000_is_valid_and_mask.

---
 gcc/config/rs6000/predicates.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 3b1a456..362188f 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -864,7 +864,8 @@ (define_predicate "non_logical_cint_operand"
 ;; Return 1 if the operand is either a non-special register or a
 ;; constant that can be used as the operand of a logical AND.
 (define_predicate "and_operand"
-  (ior (match_test "rs6000_is_valid_and_mask (op, mode)")
+  (ior (and (match_code "const_int")
+   (match_test "rs6000_is_valid_and_mask (op, mode)"))
(if_then_else (match_test "fixed_regs[CR0_REGNO]")
 (match_operand 0 "gpc_reg_operand")
 (match_operand 0 "logical_operand"
-- 
1.9.3

[PATCH] combine: Handle aborts in is_parallel_of_n_reg_sets (PR68381)

2015-11-23 Thread Segher Boessenkool

Some users of is_parallel_of_n_reg_sets disregard the clobbers in a
parallel after it has returned "yes, this is a parallel of N sets and
maybe some clobbers".  But combine uses a clobber of const0_rtx to
indicate substitution failure, so this leads to disaster.

Fix this by checking for such special clobbers in is_parallel_of_n_reg_sets.

Tested on powerpc64-linux.  Also tested with Kyrill's testcase, manually
inspected the generated asm and the combine dump file (with some extra
instrumentation).  This testcase needs -O1 btw.

The "performance problem" in the PR (same testcase, but with -O3) is
a missed jump optimization: a pseudo is set (to 0) in one BB (and
nowhere else), and then tested against 0 in another BB.  Nothing after
combine seems to handle this.

Applying this patch to trunk.  Kyrill, could you handle the testcase?
Together with whatever you decide should be done for the -O3 problem.
Thank you for tracking down this nastiness!


Segher


2015-11-24  Segher Boessenkool  

PR rtl-optimization/68381
* combine.c (is_parallel_of_n_reg_sets): Return false if the pattern
is poisoned.

---
 gcc/combine.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 2a66fd5..4958d3b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2512,7 +2512,8 @@ is_parallel_of_n_reg_sets (rtx pat, int n)
|| !REG_P (SET_DEST (XVECEXP (pat, 0, i
   return false;
   for ( ; i < len; i++)
-if (GET_CODE (XVECEXP (pat, 0, i)) != CLOBBER)
+if (GET_CODE (XVECEXP (pat, 0, i)) != CLOBBER
+   || XEXP (XVECEXP (pat, 0, i), 0) == const0_rtx)
   return false;
 
   return true;
-- 
1.9.3

Use TBAA for lto-symtab decl merging warnings

2015-11-23 Thread Jan Hubicka

Hi,
the attached testcase shows that we can not use types_compatible_p to check
if two declarations are compatible in cross-language decl merging.
For example Fortran's C_SIZE_T should be compatible with C size_t while
one is signed and other unsigned.

This patch simply makes us to use TBAA and size alone to check the
compatibility.  This can miss many positives (we miss a way to check that types
are actually representation compatible in some sense), but also can
warn on interesting TBAA issues and seems bit more useful than what we have
now. In fact it reports bugs in libdecnumber:

../../libdecnumber/decNumber.h:179:0: error: type of �decNumberZero� does not 
match original declaration [-Werror=lto-type-mismatch]
   decNumber  * decNumberZero(decNumber *);
 

../../libdecnumber/decNumber.c:3582:0: note: �decNumberZero� was previously 
declared here
 decNumber * decNumberZero(decNumber *dn) {
 
../../libdecnumber/decNumber.h:179:0: note: code may be misoptimized unless 
-fno-strict-aliasing is used

../../gcc/../libdecnumber/decNumber.h:118:0: error: type of 
�decNumberFromString� does not match original declaration 
[-Werror=lto-type-mismatch]
   decNumber * decNumberFromString(decNumber *, const char *, decContext *);
 

../../libdecnumber/decNumber.h:118:0: error: type of �decNumberFromString� does 
not match original declaration [-Werror=lto-type-mismatch]
   decNumber * decNumberFromString(decNumber *, const char *, decContext *);
 

../../libdecnumber/decNumber.h:118:0: error: type of �decNumberFromString� does 
not match original declaration [-Werror=lto-type-mismatch]
   decNumber * decNumberFromString(decNumber *, const char *, decContext *);
 

../../libdecnumber/decNumber.h:118:0: error: type of �decNumberFromString� does 
not match original declaration [-Werror=lto-type-mismatch]
   decNumber * decNumberFromString(decNumber *, const char *, decContext *);
 

../../libdecnumber/decNumber.c:489:0: note: �decNumberFromString� was 
previously declared here
 decNumber * decNumberFromString(decNumber *dn, const char chars[],

../../gcc/../libdecnumber/decNumber.h:118:0: note: code may be misoptimized 
unless -fno-strict-aliasing is used

during cc1 build.  Those seems real bugs, because decNumber is defined with
different size in each unit.  This seems intentional, but it is not valid in C.
I suppose we need to use -fno-strict-aliasing on those or fix the problem 
otherwise.

With -Werror the path bootstrap/regtestes x86_64-linux, OK?
(I am not sure if -Werror LTO bootstrap is supposed to work these days, but 
there
are definitly few other warnings in the way)

* lto-symtab.c: Include alias.h
(warn_type_compatibility_p): Replace types_compatible_p checks by
TBAA and size checks; set bit 2 if locations are TBAA incompatible.
(lto_symtab_merge): Compare DECL sizes.
(lto_symtab_merge_decls_2): Warn about TBAA in compatibility.
* gfortran.dg/lto/bind_c-6_0.f90: New testcase.
* gfortran.dg/lto/bind_c-6_1.c: New testcase.
Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 230718)
+++ lto/lto-symtab.c(working copy)
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
 #include "lto-streamer.h"
 #include "ipa-utils.h"
 #include "builtins.h"
+#include "alias.h"
 
 /* Replace the cgraph node NODE with PREVAILING_NODE in the cgraph, merging
all edges and removing the old node.  */
@@ -179,39 +180,52 @@ lto_varpool_replace_node (varpool_node *
 
 /* Return non-zero if we want to output waring about T1 and T2.
Return value is a bitmask of reasons of violation:
-   Bit 0 indicates that types are not compatible of memory layout.
-   Bit 1 indicates that types are not compatible because of C++ ODR rule.  */
+   Bit 0 indicates that types are not compatible.
+   Bit 1 indicates that types are not compatible because of C++ ODR rule.
+   If COMMON is true, do not warn on size mismatches of arrays.
+   Bit 2 indicates that types are not TBAA compatible
+
+   The interoperability rules are language specific.  At present we do only
+   full checking for C++ ODR rule and for other languages we do basic check
+   that data structures are of same size and TBAA compatible.  Our TBAA
+   implementation should be coarse enough so all valid type transitions
+   across different languages are allowed.
+
+   In particular we thus allow almost arbitrary type changes with
+   -fno-strict-aliasing which may be tough of as a feature rather than bug
+   as it allows to implement dodgy tricks in the language runtimes.
+
+   Naturally this code can be strenghtened significantly if we could track
+   down the language of origin.  */
 
 static int
-warn_type_compatibility_p (tree prevailing_type, tree type)
+warn_type_compatibility_p (tree prevailing_type, tree type,
+  bool common_or_extern)
 {
   int lev = 0;
+  bool odr_p = odr_or_derived_type_p (prevailing_type)
+

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Jan Hubicka

> On November 23, 2015 5:50:25 PM GMT+01:00, Jan Hubicka  wrote:
> >> 
> >> I think it also causes the following and one related ICE
> >> 
> >> FAIL: gcc.dg/vect/pr62021.c -flto -ffat-lto-objects (internal
> >compiler 
> >> error)
> >> 
> >>
> >/space/rguenther/src/svn/trunk3/gcc/testsuite/gcc.dg/vect/pr62021.c:7:1:
> >
> >> internal compiler error: in get_alias_set, at alias.c:880^M
> >> 0x7528a7 get_alias_set(tree_node*)^M
> >> /space/rguenther/src/svn/trunk3/gcc/alias.c:880^M
> >
> >Does this one reproduce with mainline?
> 
> Yes, testsuite run on x86_64
> 
>  All thee ICEs are on the sanity
> >check:
> >gcc_checking_assert (!in_lto_p || !type_with_alias_set_p (t));
> >which check that in LTO all types that ought to have CANONICAL_TYPE
> >gets CANONICAL_TYPE
> >computed.  ICE here probalby means that the type somehow bypassed LTO
> >canonical type merging
> >as well as the TYPE_CANONICAL=MAIN_VARIANT set in lto-streamer.c

Hi,
all the ICE seems to be caused by fact that the simd streaming code creates
array of pointers at ltrans time. Because pointers are now
TYPE_STRUCTURAL_EQUALITY_P, we get array with TYPE_STRUCTURAL_EQUALITY_P
and that sanity check is there to verify that this doesn't happen (because
we lose optimization then)

This patch makes arrays and vectors to be handled same way as pointers:
the canonical type is not computed because it is unused by get_alias_set
anyway.

I implemented it for different reason - my TBAA checking in lto-symtab got
false positives on arrays during the LTO bootstrap.  THe problem was again
the overly generous globbing of TYPE_CANONICAL.

get_alias_set first does

 t = TYPE_CANONICAL (t)

and then for arrays recurses

 set = get_alias_set (TREE_TYPE (t));

Now we can have type

 int *[10];

with TYPE_CANONICAL

 float *[10];

and then get_alias_set (int *[10]) will return get_alias_set (float *)
which is wrong, because then the array does not conflict with its elements.
I did not managed to get any wrong code out of this, but it is an obvious bug.
This problem reproduced prior the pointer patch, too, on other types which are
globed at TYPE_CANONICAL computation.

This also fixeds semi-latent bug with Ada where TYPE_STRUCTURAL_EQUALITY_P type
may win the merging and turn all interoperable arrays into
TYPE_STRUCTURAL_EQUALITY_P.

For next stage1 I think I can move frontends away from computing TYPE_CANONICAL
for those types. At least my initial test for C and C++ seemed to work just 
fine.

Bootstrapped/regtested x86_64-linux, OK?

* lto-streamer-in.c (lto_read_body_or_constructor): Set TYPE_CANONICAL
only for types where LTO sets them.
* tree.c (build_array_type_1): Do ont set TYPE_CANONICAL for LTO.
(make_vector_type): Likewise.
(gimple_canonical_types_compatible_p): Use canonical_type_used_p.
* tree.h (canonical_type_used_p): New inline.
* alias.c (get_alias_set): Handle structural equality for all
types that pass canonical_type_used_p.
(record_component_aliases): Look through all types with
record_component_aliases for possible pointers; sanity check that
the alias sets match.

* lto.c (iterative_hash_canonical_type): Recruse for all types
which pass !canonical_type_used_p.
(gimple_register_canonical_type_1): Sanity check we do not compute
canonical type of anything with !canonical_type_used_p.
(gimple_register_canonical_type): Skip all types that are
!canonical_type_used_p

Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 230718)
+++ lto-streamer-in.c   (working copy)
@@ -1231,7 +1231,9 @@ lto_read_body_or_constructor (struct lto
  if (TYPE_P (t))
{
  gcc_assert (TYPE_CANONICAL (t) == NULL_TREE);
- TYPE_CANONICAL (t) = TYPE_MAIN_VARIANT (t);
+ if (type_with_alias_set_p (t)
+ && canonical_type_used_p (t))
+   TYPE_CANONICAL (t) = TYPE_MAIN_VARIANT (t);
  if (TYPE_MAIN_VARIANT (t) != t)
{
  gcc_assert (TYPE_NEXT_VARIANT (t) == NULL_TREE);
Index: tree.c
===
--- tree.c  (revision 230718)
+++ tree.c  (working copy)
@@ -8236,7 +8243,8 @@ build_array_type_1 (tree elt_type, tree
   if (TYPE_CANONICAL (t) == t)
 {
   if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
- || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type)))
+ || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
+ || in_lto_p)
SET_TYPE_STRUCTURAL_EQUALITY (t);
   else if (TYPE_CANONICAL (elt_type) != elt_type
   || (index_type && TYPE_CANONICAL (index_type) != index_type))
@@ -9849,13 +9857,13 @@ make_vector_type (tree innertype, int nu
   SET_TYPE_VECTOR_SUBPARTS (t, nunits);
   SET_TYPE_MODE (

Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-23 Thread Andrew Pinski

On Mon, Nov 23, 2015 at 9:53 PM, H.J. Lu  wrote:
> On Mon, Nov 23, 2015 at 7:22 PM, Patrick Palka  wrote:
>> On Mon, Nov 23, 2015 at 3:53 PM, H.J. Lu  wrote:
>>> On Mon, Nov 23, 2015 at 1:57 AM, Richard Biener
>>>  wrote:
 On Sat, Nov 21, 2015 at 12:46 AM, H.J. Lu  wrote:
> On Fri, Nov 20, 2015 at 2:17 PM, Jason Merrill  wrote:
>> On 11/20/2015 01:52 PM, H.J. Lu wrote:
>>>
>>> On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
>>>  wrote:

 On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:
>
> Empty record should be returned and passed the same way in C and C++.
> This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
> defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is 
> defined
> to is_really_empty_class, which returns true for C++ empty classes.  
> For
> LTO, we stream out a bit to indicate if a record is empty and we store
> it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
> changed to set bitsize to 0 for empty records.  Middle-end and x86
> backend are updated to ignore empty records for parameter passing and
> function value return.  Other targets may need similar changes.


 Please avoid a new langhook for this and instead claim a bit in
 tree_type_common
 like for example restrict_flag (double-check it is unused for
 non-pointers).
>>>
>>>
>>> There is no bit in tree_type_common I can overload.  restrict_flag is
>>> checked for non-pointers to issue an error when it is used on
>>> non-pointers:
>>>
>>>
>>> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
>>> error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
>>> typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
>>> qualifiers cannot" "" }
>>
>>
>> The C++ front end only needs to check TYPE_RESTRICT for this purpose on
>> front-end-specific type codes like TEMPLATE_TYPE_PARM; cp_type_quals 
>> could
>> handle that specifically if you change TYPE_RESTRICT to only apply to
>> pointers.
>>
>
> restrict_flag is also checked in this case:
>
> [hjl@gnu-6 gcc]$ cat x.i
> struct dummy { };
>
> struct dummy
> foo (struct dummy __restrict__ i)
> {
>   return i;
> }
> [hjl@gnu-6 gcc]$ gcc -S x.i -Wall
> x.i:4:13: error: invalid use of ‘restrict’
>  foo (struct dummy __restrict__ i)
>  ^
> x.i:4:13: error: invalid use of ‘restrict’
> [hjl@gnu-6 gcc]$
>
> restrict_flag can't also be used to indicate `i' is an empty record.

 I'm sure this error can be done during parsing w/o relying on 
 TYPE_RESTRICT.

 But well, use any other free bit (but do not enlarge
 tree_type_common).  Eventually
 you can free up a bit by putting sth into type_lang_specific currently
 using bits
 in tree_type_common.
>>>
>>> There are no bits in tree_type_common I can move.  Instead,
>>> this patch overloads side_effects_flag in tree_base.  Tested on
>>> Linux/x86-64.  OK for trunk?
>>>
>>
>> Hi,
>>
>> Coincidentally a few months ago I was experimenting with making
>> empty-struct function arguments zero-cost (and thus making them behave
>> the same way as in GNU C).  My approach (patch attached) was to assign
>> empty-struct arguments to a virtual register (instead of on the stack
>> or to a hard register) during RTL call expansion.  These
>> virtual-register assignments would then be trivially DCE'd later.
>> This approach seemed to work surprisingly well with minimal code
>> changes.  I wonder what
>> your thoughts are on this approach..
>
> I don't think it works for C++ class.  empty_record_or_union_type_p
> missed:
>
> for (binfo = TYPE_BINFO (type), i = 0;
>BINFO_BASE_ITERATE (binfo, i, base_binfo); ++i)
> if (!is_really_empty_class (BINFO_TYPE (base_binfo)))
>   return false;

This above should not be needed as TYPE_FIELDS should include one
already.  Or do you have prove it does not?

Thanks,
Andrew


>
> Does it work with variable argument list?   Did you run GCC
> testsuite for both i686 and x86-64?
>
>
> --
> H.J.

Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-23 Thread H.J. Lu

On Mon, Nov 23, 2015 at 7:22 PM, Patrick Palka  wrote:
> On Mon, Nov 23, 2015 at 3:53 PM, H.J. Lu  wrote:
>> On Mon, Nov 23, 2015 at 1:57 AM, Richard Biener
>>  wrote:
>>> On Sat, Nov 21, 2015 at 12:46 AM, H.J. Lu  wrote:
 On Fri, Nov 20, 2015 at 2:17 PM, Jason Merrill  wrote:
> On 11/20/2015 01:52 PM, H.J. Lu wrote:
>>
>> On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
>>  wrote:
>>>
>>> On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:

 Empty record should be returned and passed the same way in C and C++.
 This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
 defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is 
 defined
 to is_really_empty_class, which returns true for C++ empty classes.  
 For
 LTO, we stream out a bit to indicate if a record is empty and we store
 it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
 changed to set bitsize to 0 for empty records.  Middle-end and x86
 backend are updated to ignore empty records for parameter passing and
 function value return.  Other targets may need similar changes.
>>>
>>>
>>> Please avoid a new langhook for this and instead claim a bit in
>>> tree_type_common
>>> like for example restrict_flag (double-check it is unused for
>>> non-pointers).
>>
>>
>> There is no bit in tree_type_common I can overload.  restrict_flag is
>> checked for non-pointers to issue an error when it is used on
>> non-pointers:
>>
>>
>> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
>> error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
>> typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
>> qualifiers cannot" "" }
>
>
> The C++ front end only needs to check TYPE_RESTRICT for this purpose on
> front-end-specific type codes like TEMPLATE_TYPE_PARM; cp_type_quals could
> handle that specifically if you change TYPE_RESTRICT to only apply to
> pointers.
>

 restrict_flag is also checked in this case:

 [hjl@gnu-6 gcc]$ cat x.i
 struct dummy { };

 struct dummy
 foo (struct dummy __restrict__ i)
 {
   return i;
 }
 [hjl@gnu-6 gcc]$ gcc -S x.i -Wall
 x.i:4:13: error: invalid use of ‘restrict’
  foo (struct dummy __restrict__ i)
  ^
 x.i:4:13: error: invalid use of ‘restrict’
 [hjl@gnu-6 gcc]$

 restrict_flag can't also be used to indicate `i' is an empty record.
>>>
>>> I'm sure this error can be done during parsing w/o relying on TYPE_RESTRICT.
>>>
>>> But well, use any other free bit (but do not enlarge
>>> tree_type_common).  Eventually
>>> you can free up a bit by putting sth into type_lang_specific currently
>>> using bits
>>> in tree_type_common.
>>
>> There are no bits in tree_type_common I can move.  Instead,
>> this patch overloads side_effects_flag in tree_base.  Tested on
>> Linux/x86-64.  OK for trunk?
>>
>
> Hi,
>
> Coincidentally a few months ago I was experimenting with making
> empty-struct function arguments zero-cost (and thus making them behave
> the same way as in GNU C).  My approach (patch attached) was to assign
> empty-struct arguments to a virtual register (instead of on the stack
> or to a hard register) during RTL call expansion.  These
> virtual-register assignments would then be trivially DCE'd later.
> This approach seemed to work surprisingly well with minimal code
> changes.  I wonder what
> your thoughts are on this approach..

I don't think it works for C++ class.  empty_record_or_union_type_p
missed:

for (binfo = TYPE_BINFO (type), i = 0;
   BINFO_BASE_ITERATE (binfo, i, base_binfo); ++i)
if (!is_really_empty_class (BINFO_TYPE (base_binfo)))
  return false;

Does it work with variable argument list?   Did you run GCC
testsuite for both i686 and x86-64?


-- 
H.J.

[PATCH 1/2] destroy values as well as keys when removing them from hash maps

2015-11-23 Thread tbsaunde+gcc

From: Trevor Saunders 

Hi,

This fixes several leaks where a hash_map user expected deleting the map to
destruct the value part of entries as well as the key.  A couple of these bugs
have already been fixed, but there are more of them for example some of the
sanitizer code, and tree-if-conv.c).  The expectation that hash_map should
destruct values seems to be pretty obviously correct, so we should fix that to
fix the existing bugs and prevent future ones (I also seem to remember that
working at some point, but could be incorrect).

I checked all the existing hash_map users and couldn't find any value types
other than auto_vec with destructors, so its the only one with a non trivial
destructor.  So the only effected case auto_vec is fixed by this patch and no
expectations are broken.

bootstrapped + regtested on x86_64-linux-gnu, ok?

Trev

gcc/ChangeLog:

2015-11-20  Trevor Saunders  

* hash-map-traits.h (simple_hashmap_traits ::remove): call
destructors on values that are being removed.
* mem-stats.h (hash_map): Pass type of values to
simple_hashmap_traits.
* tree-sra.c (sra_deinitialize): Remove work around for hash
maps not destructing values.
* genmatch.c (sinfo_hashmap_traits): Adjust.
* tree-ssa-uncprop.c (val_ssa_equiv_hash_traits): Likewise.
---
 gcc/genmatch.c |  3 ++-
 gcc/hash-map-traits.h  | 32 +---
 gcc/mem-stats.h|  3 ++-
 gcc/tree-sra.c |  6 --
 gcc/tree-ssa-uncprop.c |  3 ++-
 5 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 3a20a48..76c8f1f 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -1397,7 +1397,8 @@ struct sinfo
   unsigned cnt;
 };
 
-struct sinfo_hashmap_traits : simple_hashmap_traits  >
+struct sinfo_hashmap_traits : simple_hashmap_traits,
+   sinfo *>
 {
   static inline hashval_t hash (const key_type &);
   static inline bool equal_keys (const key_type &, const key_type &);
diff --git a/gcc/hash-map-traits.h b/gcc/hash-map-traits.h
index 2225426..773ac1b 100644
--- a/gcc/hash-map-traits.h
+++ b/gcc/hash-map-traits.h
@@ -28,7 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Implement hash_map traits for a key with hash traits H.  Empty and
deleted map entries are represented as empty and deleted keys.  */
 
-template 
+template 
 struct simple_hashmap_traits
 {
   typedef typename H::value_type key_type;
@@ -41,56 +41,58 @@ struct simple_hashmap_traits
   template  static inline void mark_deleted (T &);
 };
 
-template 
+template 
 inline hashval_t
-simple_hashmap_traits ::hash (const key_type &h)
+simple_hashmap_traits ::hash (const key_type &h)
 {
   return H::hash (h);
 }
 
-template 
+template 
 inline bool
-simple_hashmap_traits ::equal_keys (const key_type &k1, const key_type &k2)
+simple_hashmap_traits ::equal_keys (const key_type &k1,
+ const key_type &k2)
 {
   return H::equal (k1, k2);
 }
 
-template 
+template 
 template 
 inline void
-simple_hashmap_traits ::remove (T &entry)
+simple_hashmap_traits ::remove (T &entry)
 {
   H::remove (entry.m_key);
+  entry.m_value.~Value ();
 }
 
-template 
+template 
 template 
 inline bool
-simple_hashmap_traits ::is_empty (const T &entry)
+simple_hashmap_traits ::is_empty (const T &entry)
 {
   return H::is_empty (entry.m_key);
 }
 
-template 
+template 
 template 
 inline bool
-simple_hashmap_traits ::is_deleted (const T &entry)
+simple_hashmap_traits ::is_deleted (const T &entry)
 {
   return H::is_deleted (entry.m_key);
 }
 
-template 
+template 
 template 
 inline void
-simple_hashmap_traits ::mark_empty (T &entry)
+simple_hashmap_traits ::mark_empty (T &entry)
 {
   H::mark_empty (entry.m_key);
 }
 
-template 
+template 
 template 
 inline void
-simple_hashmap_traits ::mark_deleted (T &entry)
+simple_hashmap_traits ::mark_deleted (T &entry)
 {
   H::mark_deleted (entry.m_key);
 }
diff --git a/gcc/mem-stats.h b/gcc/mem-stats.h
index a6489b5..2c68ca7 100644
--- a/gcc/mem-stats.h
+++ b/gcc/mem-stats.h
@@ -3,7 +3,8 @@
 
 /* Forward declaration.  */
 template > >
+typename Traits = simple_hashmap_traits,
+Value> >
 class hash_map;
 
 #define LOCATION_LINE_EXTRA_SPACE 30
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 2835c99..c4fea5b 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -674,12 +674,6 @@ sra_deinitialize (void)
   assign_link_pool.release ();
   obstack_free (&name_obstack, NULL);
 
-  /* TODO: hash_map does not support traits that can release
- value type of the hash_map.  */
-  for (hash_map >::iterator it =
-   base_access_vec->begin (); it != base_access_vec->end (); ++it)
-(*it).second.release ();
-
   delete base_access_vec;
 }
 
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index be6c209d..23b4ca2 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b

[PATCH 2/2] remove val_ssa_equiv_hash_traits

2015-11-23 Thread tbsaunde+gcc

From: Trevor Saunders 

Hi,

this is pretty trivial cleanup after the previous patch, but could wait for
next stage 1 if people don't like the very small risk.

boostrappped + regtested on x86_64-linux-gnu, ok?

Trev

gcc/ChangeLog:

2015-11-20  Trevor Saunders  

* tree-ssa-uncprop.c (struct val_ssa_equiv_hash_traits): Remove.
(val_ssa_equiv_hash_traits::remove): Likewise.
(pass_uncprop::execute): Adjust.
---
 gcc/tree-ssa-uncprop.c | 22 ++
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 23b4ca2..a60184e 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -275,27 +275,10 @@ struct equiv_hash_elt
   vec equivalences;
 };
 
-/* Value to ssa name equivalence hashtable helpers.  */
-
-struct val_ssa_equiv_hash_traits : simple_hashmap_traits  >
-{
-  template static inline void remove (T &);
-};
-
-/* Free an instance of equiv_hash_elt.  */
-
-template
-inline void
-val_ssa_equiv_hash_traits::remove (T &elt)
-{
-  elt.m_value.release ();
-}
-
 /* Global hash table implementing a mapping from invariant values
to a list of SSA_NAMEs which have the same value.  We might be
able to reuse tree-vn for this code.  */
-static hash_map, val_ssa_equiv_hash_traits> *val_ssa_equiv;
+static hash_map > *val_ssa_equiv;
 
 static void uncprop_into_successor_phis (basic_block);
 
@@ -518,8 +501,7 @@ pass_uncprop::execute (function *fun)
   associate_equivalences_with_edges ();
 
   /* Create our global data structures.  */
-  val_ssa_equiv
-= new hash_map, val_ssa_equiv_hash_traits> (1024);
+  val_ssa_equiv = new hash_map > (1024);
 
   /* We're going to do a dominator walk, so ensure that we have
  dominance information.  */
-- 
2.4.0

Re: [PR64164] drop copyrename, integrate into expand

2015-11-23 Thread Jeff Law


On 11/16/2015 05:07 PM, Alexandre Oliva wrote:


The check is not in my patch, indeed.  That's because the previous block
performs the runtime check, and it only lets through two cases: the one
we handle, and the one nobody uses.
That was the conclusion I was starting to come to, but expressed so 
poorly in my last message.  Sadly it was non-obvious from staring at the 
current code.  Though I must admit that after a week, I can see it 
better now.  Maybe that's a result of re-reading your message a 
half-dozen more times with the current code and your patch all visible 
in windows next to each other :-)



Prior to your change we'd just blindly copy from ENTRY_PARM to MEM, 
which would result in missing the implicit shift if MEM wasn't actually 
a memory.


You're just moving that conditional up and handling the shift 
explicitly.  You've got asserts for the cases you're not handling (and 
no, I'm not aware of the need for this on any LE architecture, while I 
am aware of BE architectures that align in both directions).




Any suggestions on how to improve the comments so that they convey
enough of this reasoning to make sense, without our having to write a
book :-) on the topic?
Refer back to this thread? :-)  Seriously though, looking at things a 
week later, I can see it much better now.  Thanks for your patience on this.


OK for the trunk,
jeff

Re: [ptx] Fix sso tests

2015-11-23 Thread Jeff Law


On 11/23/2015 01:54 PM, Nathan Sidwell wrote:

On 11/23/15 15:41, Jeff Law wrote:



In the 'put' function, why not just make all targets go through
putchar?  It's not like this is performance critical code and I
don't think it compromises any of the tests, does it?


I contemplated that, but wondered if someone would complain.  I'm
happy either way.
Let's go with a single codepath here.  The one that ought to work for 
all targets is putchar.


Thanks,
jeff

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Jan Hubicka

> I don't understand why we need this (testcase?) because get_alias_set ()
> is supposed to do the alias-set of pointer globbing for us.
Hi,
After some experimentation I managed to derive self contained testcase (other 
than GCC itself).
struct a/b/c gets the same TYPE_CANONICAL which is different from struct b. 
Without that
change in recording component aliases then the alias_set of struct b has 
component float *
instead of int *. Eventually in main we optimize out the store to int * because 
we do not
see the conflict.

One needs to add the extra garbage around to be sure that things are streamed
in proper order and also in proper order shipped to ltrans and finally the
gimple optimizers are not smart enough to get the propagation without TBAA
oracle help.

Comitted.

* gcc.c-torture/execute/lto-tbaa-1.c: New testcase.
Index: gcc.c-torture/execute/lto-tbaa-1.c
===
--- gcc.c-torture/execute/lto-tbaa-1.c  (revision 0)
+++ gcc.c-torture/execute/lto-tbaa-1.c  (revision 0)
@@ -0,0 +1,42 @@
+/* { dg-additional-options "-fno-early-inlining -fno-ipa-cp" }  */
+struct a {
+  float *b;
+} *a;
+struct b {
+  int *b;
+} b;
+struct c {
+  float *b;
+} *c;
+int d;
+use_a (struct a *a)
+{
+}
+set_b (int **a)
+{
+  *a=&d;
+}
+use_c (struct c *a)
+{
+}
+__attribute__ ((noinline)) int **retme(int **val)
+{
+  return val;
+}
+int e;
+struct b b= {&e};
+struct b b2;
+struct b b3;
+int **ptr = &b2.b;
+main ()
+{
+  a= (void *)0;
+  b.b=&e;
+  ptr =retme ( &b.b);
+  set_b (ptr);
+  b3=b;
+  if (b3.b != &d)
+  __builtin_abort ();
+  c= (void *)0;
+  return 0;
+}

Fix computation of TYPE_CANONICAL of VECTOR_TYPE

2015-11-23 Thread Jan Hubicka

Hi,
this patch fixes ICE triggered by extra sanity check I added while fixing
another type checking ICE during Ada bootstrap.

The canonical types of verctor typs are not constructed correctly.  If
make_vector_type is called with INNERTYPE being a variant (say const char),
it builds first the main variant (i.e. vector for char) but when it does
canonical type of it, it recurses for vector of TYPE_CANONICAL(const char)
instead of TYPE_CANONICAL (char). Se we end up with vector of char while
TYPE_CANONICAL is vector of const char

With the new sanity check I added to type verifier this now reproduces as
several ICEs in the vectorizer testuiste.

Bootstrapped/regtested x86_64-linux, OK?

* tree.c (make_vector_type): Properly compute canonical type of the
main variant.
(verify_type): Verify that TYPE_CANONICAl of TYPE_MAIN_VARIANT is
a main variant.
Index: tree.c
===
--- tree.c  (revision 230783)
+++ tree.c  (working copy)
@@ -9843,19 +9844,21 @@ make_vector_type (tree innertype, int nu
 {
   tree t;
   inchash::hash hstate;
+  tree mv_innertype = TYPE_MAIN_VARIANT (innertype);
 
   t = make_node (VECTOR_TYPE);
-  TREE_TYPE (t) = TYPE_MAIN_VARIANT (innertype);
+  TREE_TYPE (t) = mv_innertype;
   SET_TYPE_VECTOR_SUBPARTS (t, nunits);
   SET_TYPE_MODE (t, mode);
 
-  if (TYPE_STRUCTURAL_EQUALITY_P (innertype))
+  if (TYPE_STRUCTURAL_EQUALITY_P (mv_innertype))
 SET_TYPE_STRUCTURAL_EQUALITY (t);
-  else if ((TYPE_CANONICAL (innertype) != innertype
+  else if ((TYPE_CANONICAL (mv_innertype) != mv_innertype
|| mode != VOIDmode)
   && !VECTOR_BOOLEAN_TYPE_P (t))
 TYPE_CANONICAL (t)
-  = make_vector_type (TYPE_CANONICAL (innertype), nunits, VOIDmode);
+  = make_vector_type (TYPE_CANONICAL (mv_innertype),
+ nunits, VOIDmode);
 
   layout_type (t);
 
@@ -13522,6 +13525,13 @@ verify_type (const_tree t)
   debug_tree (ct);
   error_found = true;
 }
+  if (TYPE_MAIN_VARIANT (t) == t && ct && TYPE_MAIN_VARIANT (ct) != ct)
+   {
+  error ("TYPE_CANONICAL of main variant is not main variant");
+  debug_tree (ct);
+  debug_tree (TYPE_MAIN_VARIANT (ct));
+  error_found = true;
+   }
 
 
   /* Check various uses of TYPE_MINVAL.  */

Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-23 Thread Patrick Palka

On Mon, Nov 23, 2015 at 3:53 PM, H.J. Lu  wrote:
> On Mon, Nov 23, 2015 at 1:57 AM, Richard Biener
>  wrote:
>> On Sat, Nov 21, 2015 at 12:46 AM, H.J. Lu  wrote:
>>> On Fri, Nov 20, 2015 at 2:17 PM, Jason Merrill  wrote:
 On 11/20/2015 01:52 PM, H.J. Lu wrote:
>
> On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
>  wrote:
>>
>> On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:
>>>
>>> Empty record should be returned and passed the same way in C and C++.
>>> This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
>>> defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is defined
>>> to is_really_empty_class, which returns true for C++ empty classes.  For
>>> LTO, we stream out a bit to indicate if a record is empty and we store
>>> it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
>>> changed to set bitsize to 0 for empty records.  Middle-end and x86
>>> backend are updated to ignore empty records for parameter passing and
>>> function value return.  Other targets may need similar changes.
>>
>>
>> Please avoid a new langhook for this and instead claim a bit in
>> tree_type_common
>> like for example restrict_flag (double-check it is unused for
>> non-pointers).
>
>
> There is no bit in tree_type_common I can overload.  restrict_flag is
> checked for non-pointers to issue an error when it is used on
> non-pointers:
>
>
> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
> error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
> typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
> qualifiers cannot" "" }


 The C++ front end only needs to check TYPE_RESTRICT for this purpose on
 front-end-specific type codes like TEMPLATE_TYPE_PARM; cp_type_quals could
 handle that specifically if you change TYPE_RESTRICT to only apply to
 pointers.

>>>
>>> restrict_flag is also checked in this case:
>>>
>>> [hjl@gnu-6 gcc]$ cat x.i
>>> struct dummy { };
>>>
>>> struct dummy
>>> foo (struct dummy __restrict__ i)
>>> {
>>>   return i;
>>> }
>>> [hjl@gnu-6 gcc]$ gcc -S x.i -Wall
>>> x.i:4:13: error: invalid use of ‘restrict’
>>>  foo (struct dummy __restrict__ i)
>>>  ^
>>> x.i:4:13: error: invalid use of ‘restrict’
>>> [hjl@gnu-6 gcc]$
>>>
>>> restrict_flag can't also be used to indicate `i' is an empty record.
>>
>> I'm sure this error can be done during parsing w/o relying on TYPE_RESTRICT.
>>
>> But well, use any other free bit (but do not enlarge
>> tree_type_common).  Eventually
>> you can free up a bit by putting sth into type_lang_specific currently
>> using bits
>> in tree_type_common.
>
> There are no bits in tree_type_common I can move.  Instead,
> this patch overloads side_effects_flag in tree_base.  Tested on
> Linux/x86-64.  OK for trunk?
>

Hi,

Coincidentally a few months ago I was experimenting with making
empty-struct function arguments zero-cost (and thus making them behave
the same way as in GNU C).  My approach (patch attached) was to assign
empty-struct arguments to a virtual register (instead of on the stack
or to a hard register) during RTL call expansion.  These
virtual-register assignments would then be trivially DCE'd later.
This approach seemed to work surprisingly well with minimal code
changes.  I wonder what
your thoughts are on this approach..
From 8eb52639992ad0f6e5482783604f362bcc04d230 Mon Sep 17 00:00:00 2001
From: Patrick Palka 
Date: Mon, 23 Nov 2015 21:02:09 -0500
Subject: [PATCH] zero-cost structs

---
 gcc/calls.c | 15 +++
 gcc/tree-tailcall.c |  7 ++-
 gcc/tree.c  | 17 +
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/gcc/calls.c b/gcc/calls.c
index b56556a..4ca668c 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1394,6 +1394,21 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
   args[i].reg = targetm.calls.function_arg (args_so_far, mode, type,
 		argpos < n_named_args);
 
+  bool empty_record_or_union_type_p (const_tree);
+
+  if (type != NULL_TREE
+#if 0
+	  /* ??? This condition was necessary to fix a C regression whose
+	 details I have forgot about.  In GNU C the mode of an empty struct is BLKmode
+	 (and TYPE_SIZE 0) so this condition makes it so that we don't mess
+	 with the codegen of empty structs in C.  In C++ the mode of the empty struct
+	 is QImode and TYPE_SIZE_UNIT 1.  Maybe it's not necessary anymore?   */
+	  && mode != BLKmode
+#endif
+	  && args[i].reg == NULL_RTX
+	  && empty_record_or_union_type_p (type))
+	args[i].reg = gen_reg_rtx (mode);
+
   if (args[i].reg && CONST_INT_P (args[i].reg))
 	{
 	  args[i].special_slot = args[i].reg;
diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index bbd1b29..fa8f66a 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -49

Re: [PATCH AArch64]Handle REG+REG+CONST and REG+NON_REG+CONST in legitimize address

2015-11-23 Thread Bin.Cheng

On Sat, Nov 21, 2015 at 1:39 AM, Richard Earnshaw
 wrote:
> On 20/11/15 08:31, Bin.Cheng wrote:
>> On Thu, Nov 19, 2015 at 10:32 AM, Bin.Cheng  wrote:
>>> On Tue, Nov 17, 2015 at 6:08 PM, James Greenhalgh
>>>  wrote:
 On Tue, Nov 17, 2015 at 05:21:01PM +0800, Bin Cheng wrote:
> Hi,
> GIMPLE IVO needs to call backend interface to calculate costs for addr
> expressions like below:
>FORM1: "r73 + r74 + 16380"
>FORM2: "r73 << 2 + r74 + 16380"
>
> They are invalid address expression on AArch64, so will be legitimized by
> aarch64_legitimize_address.  Below are what we got from that function:
>
> For FORM1, the address expression is legitimized into below insn sequence
> and rtx:
>r84:DI=r73:DI+r74:DI
>r85:DI=r84:DI+0x3000
>r83:DI=r85:DI
>"r83 + 4092"
>
> For FORM2, the address expression is legitimized into below insn sequence
> and rtx:
>r108:DI=r73:DI<<0x2
>r109:DI=r108:DI+r74:DI
>r110:DI=r109:DI+0x3000
>r107:DI=r110:DI
>"r107 + 4092"
>
> So the costs computed are 12/16 respectively.  The high cost prevents IVO
> from choosing right candidates.  Besides cost computation, I also think 
> the
> legitmization is bad in terms of code generation.
> The root cause in aarch64_legitimize_address can be described by it's
> comment:
>/* Try to split X+CONST into Y=X+(CONST & ~mask), Y+(CONST&mask),
>   where mask is selected by alignment and size of the offset.
>   We try to pick as large a range for the offset as possible to
>   maximize the chance of a CSE.  However, for aligned addresses
>   we limit the range to 4k so that structures with different sized
>   elements are likely to use the same base.  */
> I think the split of CONST is intended for REG+CONST where the const 
> offset
> is not in the range of AArch64's addressing modes.  Unfortunately, it
> doesn't explicitly handle/reject "REG+REG+CONST" and 
> "REG+REG< when the CONST are in the range of addressing modes.  As a result, these 
> two
> cases fallthrough this logic, resulting in sub-optimal results.
>
> It's obvious we can do below legitimization:
> FORM1:
>r83:DI=r73:DI+r74:DI
>"r83 + 16380"
> FORM2:
>r107:DI=0x3ffc
>r106:DI=r74:DI+r107:DI
>   REG_EQUAL r74:DI+0x3ffc
>"r106 + r73 << 2"
>
> This patch handles these two cases as described.

 Thanks for the description, it made the patch very easy to review. I only
 have a style comment.

> Bootstrap & test on AArch64 along with other patch.  Is it OK?
>
> 2015-11-04  Bin Cheng  
>   Jiong Wang  
>
>   * config/aarch64/aarch64.c (aarch64_legitimize_address): Handle
>   address expressions like REG+REG+CONST and REG+NON_REG+CONST.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 5c8604f..47875ac 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -4710,6 +4710,51 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  
> */, machine_mode mode)
>  {
>HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
>HOST_WIDE_INT base_offset;
> +  rtx op0 = XEXP (x,0);
> +
> +  if (GET_CODE (op0) == PLUS)
> + {
> +   rtx op0_ = XEXP (op0, 0);
> +   rtx op1_ = XEXP (op0, 1);

 I don't see this trailing _ on a variable name in many places in the source
 tree (mostly in the Go frontend), and certainly not in the aarch64 backend.
 Can we pick a different name for op0_ and op1_?

> +
> +   /* RTX pattern in the form of (PLUS (PLUS REG, REG), CONST) will
> +  reach here, the 'CONST' may be valid in which case we should
> +  not split.  */
> +   if (REG_P (op0_) && REG_P (op1_))
> + {
> +   machine_mode addr_mode = GET_MODE (op0);
> +   rtx addr = gen_reg_rtx (addr_mode);
> +
> +   rtx ret = plus_constant (addr_mode, addr, offset);
> +   if (aarch64_legitimate_address_hook_p (mode, ret, false))
> + {
> +   emit_insn (gen_adddi3 (addr, op0_, op1_));
> +   return ret;
> + }
> + }
> +   /* RTX pattern in the form of (PLUS (PLUS REG, NON_REG), CONST)
> +  will reach here.  If (PLUS REG, NON_REG) is valid addr expr,
> +  we split it into Y=REG+CONST, Y+NON_REG.  */
> +   else if (REG_P (op0_) || REG_P (op1_))
> + {
> +   machine_mode addr_mode = GET_MODE (op0);
> +   rtx addr = gen_reg_rtx (addr_mode);
> +
> +   /* Switch to make sure that register is in op0_.  */
> +   if (REG_P (op1_))
> +

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-23 Thread Kugan

Hi Richard,

Thanks for you comments. I am attaching  an updated patch with details
below.

On 19/11/15 02:06, Richard Biener wrote:
> On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
>  wrote:
>> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
>>  wrote:
>>>
>>> Attached is the latest version of the patch. With the patches
>>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>>> 0002-Add-type-promotion-pass.patch and
>>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>>
>>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>>> issues in ppc64-linux-gnu regression testing. There are some other test
>>> cases which needs adjustment for scanning for some patterns that are not
>>> valid now.
>>>
>>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>>> index 16c7981..9cf4f43 100644
>>> --- a/gcc/fwprop.c
>>> +++ b/gcc/fwprop.c
>>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>>> new_rtx, rtx_insn *def_insn,
>>>int old_cost = 0;
>>>bool ok;
>>>
>>> +  /* Value to be substituted is the same, nothing to do.  */
>>> +  if (rtx_equal_p (*loc, new_rtx))
>>> +return false;
>>> +
>>>update_df_init (def_insn, insn);
>>>
>>>/* forward_propagate_subreg may be operating on an instruction with
>>
>> Which testcase was this on?

After re-basing the trunk, I cannot reproduce it anymore.

>>
>>> 2. gcc.dg/torture/ftrapv-1.c fails
>>> This is because we are checking for the  SImode trapping. With the
>>> promotion of the operation to wider mode, this is i think expected. I
>>> think the testcase needs updating.
>>
>> No, it is not expected.  As said earlier you need to refrain from promoting
>> integer operations that trap.  You can use ! operation_no_trapping_overflow
>> for this.
>>

I have changed this.

>>> 3. gcc.dg/sms-3.c fails
>>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>>> am looking into it.
>>>
>>>
>>> I also have the following issues based on the previous review (as posted
>>> in the previous patch). Copying again for the review purpose.
>>>
>>> 1.
 you still call promote_ssa on both DEFs and USEs and promote_ssa looks
 at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
 on DEFs and fixup_uses on USEs.
>>>
>>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
>>> there anyway to iterate over this. I have added gcc_assert to make sure
>>> that promote_ssa is called only once.
>>
>>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>>
>> with --disable-checking this will be compiled away so you need to do
>> the assert in a separate statement.
>>
>>> 2.
 Instead of this you should, in promote_all_stmts, walk over all uses
>>> doing what
 fixup_uses does and then walk over all defs, doing what promote_ssa does.

 +case GIMPLE_NOP:
 +   {
 + if (SSA_NAME_VAR (def) == NULL)
 +   {
 + /* Promote def by fixing its type for anonymous def.  */
 + TREE_TYPE (def) = promoted_type;
 +   }
 + else
 +   {
 + /* Create a promoted copy of parameters.  */
 + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));

 I think the uninitialized vars are somewhat tricky and it would be best
 to create a new uninit anonymous SSA name for them.  You can
 have SSA_NAME_VAR != NULL and def _not_ being a parameter
 btw.
>>>
>>> I experimented with get_or_create_default_def. Here  we have to have a
>>> SSA_NAME_VAR (def) of promoted type.
>>>
>>> In the attached patch I am doing the following and seems to work. Does
>>> this looks OK?
>>>
>>> + }
>>> +   else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>>> + {
>>> +   tree var = copy_node (SSA_NAME_VAR (def));
>>> +   TREE_TYPE (var) = promoted_type;
>>> +   TREE_TYPE (def) = promoted_type;
>>> +   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>> + }
>>
>> I believe this will wreck the SSA default-def map so you should do
>>
>>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>>   tree var = create_tmp_reg (promoted_type);
>>   TREE_TYPE (def) = promoted_type;
>>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>   set_ssa_default_def (cfun, var, def);
>>
>> instead.
I have changed this.

>>
>>> I prefer to promote def as otherwise iterating over the uses and
>>> promoting can look complicated (have to look at all the different types
>>> of stmts again and do the right thing as It was in the earlier version
>>> of this before we move to this approach)
>>>
>>> 3)
 you can also transparently handle constants for the cases where promoting
 is required.  At the moment their handling is interwinded with the def
>>> promotion
 cod

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-23 Thread Segher Boessenkool

On Thu, Nov 19, 2015 at 03:20:22PM +, Kyrill Tkachov wrote:
> Hmmm, so the answer to that is a bit further down the validate_replacement: 
> path.
> It's the code after the big comment:
>   /* See if this is a PARALLEL of two SETs where one SET's destination is
>  a register that is unused and this isn't marked as an instruction that
>  might trap in an EH region.  In that case, we just need the other SET.
>  We prefer this over the PARALLEL.
> 
>  This can occur when simplifying a divmod insn.  We *must* test for this
>  case here because the code below that splits two independent SETs 
>  doesn't
>  handle this case correctly when it updates the register status.
> 
>  It's pointless doing this if we originally had two sets, one from
>  i3, and one from i2.  Combining then splitting the parallel results
>  in the original i2 again plus an invalid insn (which we delete).
>  The net effect is only to move instructions around, which makes
>  debug info less accurate.  */
> 
> The code extracts all the valid sets inside the PARALLEL and calls 
> recog_for_combine on them
> individually, ignoring the clobber.

Before I made this use is_parallel_of_n_reg_sets the code used to test
if it is a parallel of two sets, and no clobbers allowed.  So it would
never allow a clobber of zero.  But now it does.  I'll fix this in
is_parallel_of_n_reg_sets.

Thanks for finding the problem!


Segher

Re: [PATCH] lround for PowerPC

2015-11-23 Thread Michael Meissner

Segher Boessenkool reminded me that the lround define_expand should not have
the wa and =d constraints.  This patch fixes that problem.  The little endian
power8 system has passed the bootstrap and make check tests.

The big endian power7 system is chugging away on the stage2 build if I use a
more recent compiler than the host compiler (both trunk without the patches and
with the patches failed in the same way).  Obviously we need to dig into that
failure.

2015-11-23  David Edelsohn  
Michael Meissner  

* config/rs6000/rs6000.md (UNSPEC_XSRDPI): New unspec.
(Fv2): New mode attribute to be used when ISA 2.06 instructions
are used on SF/DF values.
(abs2_fpr): Use  instead of .
(nabs2_fpr): Likewise.
(neg2_fpr): Likewise.
(copysign3_fcpsgn): Likewise.
(smax3_vsx): Likewise.
(smin3_vsx): Likewise.
(floatsi2_lfiwax): Likewise.
(floatunssi2_lfiwz): Likewise.
(fctiwz_): Likewise.
(fctiwuz_): Likewise.
(btrunc2): Likewise.
(ceil2): Likewise.
(floor2): Likewise.
(xsrdpi): Add support for the lround function.
(lround2): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 230768)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -77,6 +77,7 @@ (define_c_enum "unspec"
UNSPEC_FRIN
UNSPEC_FRIP
UNSPEC_FRIZ
+   UNSPEC_XSRDPI
UNSPEC_LD_MPIC  ; load_macho_picbase
UNSPEC_RELD_MPIC; re-load_macho_picbase
UNSPEC_MPIC_CORRECT ; macho_correct_pic
@@ -491,9 +492,17 @@ (define_mode_attr Fvsx [(SF "sp") (DF  "
 ; SF/DF constraint for arithmetic on traditional floating point registers
 (define_mode_attr Ff   [(SF "f") (DF "d") (DI "d")])
 
-; SF/DF constraint for arithmetic on VSX registers
+; SF/DF constraint for arithmetic on VSX registers.  This is intended to be
+; used for DFmode instructions added in ISA 2.06 (power7) and SFmode
+; instructions added in ISA 2.07 (power8)
 (define_mode_attr Fv   [(SF "wy") (DF "ws") (DI "wi")])
 
+; SF/DF constraint for arithmetic on VSX registers using instructions added in
+; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
+; but are used on SFmode, since internally SFmode values are kept in the DFmode
+; format.
+(define_mode_attr Fv2  [(SF "ww") (DF "ws") (DI "wi")])
+
 ; SF/DF constraint for arithmetic on altivec registers
 (define_mode_attr Fa   [(SF "wu") (DF "wv")])
 
@@ -4299,8 +4308,8 @@ (define_expand "abs2"
   "")
 
 (define_insn "*abs2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
   "TARGET__FPR"
   "@
fabs %0,%1
@@ -4309,10 +4318,10 @@ (define_insn "*abs2_fpr"
(set_attr "fp_type" "fp_addsub_")])
 
 (define_insn "*nabs2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
(neg:SFDF
 (abs:SFDF
- (match_operand:SFDF 1 "gpc_reg_operand" ","]
+ (match_operand:SFDF 1 "gpc_reg_operand" ","]
   "TARGET__FPR"
   "@
fnabs %0,%1
@@ -4327,8 +4336,8 @@ (define_expand "neg2"
   "")
 
 (define_insn "*neg2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
   "TARGET__FPR"
   "@
fneg %0,%1
@@ -4557,9 +4566,9 @@ (define_expand "copysign3"
 ;; Use an unspec rather providing an if-then-else in RTL, to prevent the
 ;; compiler from optimizing -0.0
 (define_insn "copysign3_fcpsgn"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",")
- (match_operand:SFDF 2 "gpc_reg_operand" ",")]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",")
+ (match_operand:SFDF 2 "gpc_reg_operand" ",")]
 UNSPEC_COPYSIGN))]
   "TARGET__FPR && TARGET_CMPB"
   "@
@@ -4593,9 +4602,9 @@ (define_expand "smax3"
 })
 
 (define_insn "*smax3_vsx"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,")
-  (match_operand:SFDF 2 "gpc_reg_operand" ",")))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,")
+  (match_operand:SFDF 2 "gpc_reg_operand"

Re: [PATCH] lround for PowerPC

2015-11-23 Thread David Edelsohn

On Mon, Nov 23, 2015 at 4:56 PM, Michael Meissner
 wrote:
> David ping'ed me on internal IRC, and I had a thinko in terms of the use of 
> the
>  mode attribute.  In some of the uses (such as abs, smax, etc.) we want to
> use ISA 2.06 instructions on SFmode, while in other uses (add, mul, etc.) we
> want to use it only if we have the ISA 2.07 instrucitons.
>
> I have split these mode attributes into Fv and Fv2 and gone through all of the
> uses in the compiler to use the appropriate attribute.  I have built a cross
> compiler on x86, but it blew up on a big endian power7 with a segmentation
> violation that I need to look into.  I'm also building on a little endian
> power8 right now, and it has gotten further.
>
> 2015-11-23  David Edelsohn  
> Michael Meissner  
>
> * config/rs6000/rs6000.md (UNSPEC_XSRDPI): New unspec.
> (Fv2): New mode attribute to be used when ISA 2.06 instructions
> are used on SF/DF values.
> (abs2_fpr): Use  instead of .
> (nabs2_fpr): Likewise.
> (neg2_fpr): Likewise.
> (copysign3_fcpsgn): Likewise.
> (smax3_vsx): Likewise.
> (smin3_vsx): Likewise.
> (floatsi2_lfiwax): Likewise.
> (floatunssi2_lfiwz): Likewise.
> (fctiwz_): Likewise.
> (fctiwuz_): Likewise.
> (btrunc2): Likewise.
> (ceil2): Likewise.
> (floor2): Likewise.
> (xsrdpi): Add support for the lround function.
> (lround2): Likewise.

I would prefer that you reverse the meaning of "Fv" and "Fv2".  "Fv"
corresponds to VSX2 and "Fv2" corresponds to VSX, which is confusing
for anyone trying to make sense of this in the future.

Also, the lrounddi2 pattern should use "Fv" not "wa" from my
original patch.  And the ChangeLog entry should list lrounddi2.

Okay with those changes, after the cause of the SEGV is diagnosed and fixed.

Thanks, David

Re: [PATCH, i386] PR68497. Fix ICE with -fno-checking

2015-11-23 Thread Bernd Schmidt


On 11/24/2015 12:09 AM, Mikhail Maltsev wrote:

The attached patch fixes a problem introduced in r229567: the assertion

gcc_assert (is_sse);

is checked if flag_checking is false, and this causes an ICE when compiling with
-fno-checking.


Ok.


Bernd

Re: [RFA] [PATCH] Fix invalid redundant extension elimination for rl78 port

2015-11-23 Thread Bernd Schmidt


As I mentioned in a prior message on the subject, this is only a problem
when the source/dest of the extension are the same.  When the
source/dest of the extension are different, we only optimize when the
original set and extension are in the same block and we verify that all
affected registers are not set/used between the original set and the
extension.
Bootstrapped and regression tested on x86_64-linux-gnu.  Also tested
execute.exp on rl78 with no regressions.


Ok.


Bernd

[PATCH] [graphite] fix PR68314: revert all patches touching the construction of the original schedule

2015-11-23 Thread Sebastian Pop

---
 gcc/graphite-optimize-isl.c |  20 +---
 gcc/graphite-poly.c |   2 -
 gcc/graphite-poly.h |   3 -
 gcc/graphite-sese-to-poly.c | 235 ++--
 4 files changed, 58 insertions(+), 202 deletions(-)

diff --git a/gcc/graphite-optimize-isl.c b/gcc/graphite-optimize-isl.c
index 6ae224f..559afc4 100644
--- a/gcc/graphite-optimize-isl.c
+++ b/gcc/graphite-optimize-isl.c
@@ -442,23 +442,11 @@ optimize_isl (scop_p scop)
 #else
   isl_union_map *schedule_map = get_schedule_map (schedule);
 #endif
+  apply_schedule_map_to_scop (scop, schedule_map);
 
-  if (isl_union_map_is_equal (scop->original_schedule, schedule_map))
-{
-  if (dump_file && dump_flags)
-   fprintf (dump_file, "\nISL schedule same as original schedule\n");
-
-  isl_schedule_free (schedule);
-  isl_union_map_free (schedule_map);
-  return false;
-}
-  else
-{
-  apply_schedule_map_to_scop (scop, schedule_map);
-  isl_schedule_free (schedule);
-  isl_union_map_free (schedule_map);
-  return true;
-}
+  isl_schedule_free (schedule);
+  isl_union_map_free (schedule_map);
+  return true;
 }
 
 #endif /* HAVE_isl */
diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
index 187a3fa..c783fc3 100644
--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -306,7 +306,6 @@ new_scop (edge entry, edge exit)
   scop->must_waw_no_source = NULL;
   scop->may_waw_no_source = NULL;
   scop_set_region (scop, region);
-  scop->original_schedule = NULL;
   scop->pbbs.create (3);
   scop->drs.create (3);
 
@@ -343,7 +342,6 @@ free_scop (scop_p scop)
   isl_union_map_free (scop->may_waw);
   isl_union_map_free (scop->must_waw_no_source);
   isl_union_map_free (scop->may_waw_no_source);
-  isl_union_map_free (scop->original_schedule);
   XDELETE (scop);
 }
 
diff --git a/gcc/graphite-poly.h b/gcc/graphite-poly.h
index dec7fd6..d396d3f 100644
--- a/gcc/graphite-poly.h
+++ b/gcc/graphite-poly.h
@@ -417,9 +417,6 @@ struct scop
   isl_union_map *must_raw, *may_raw, *must_raw_no_source, *may_raw_no_source,
 *must_war, *may_war, *must_war_no_source, *may_war_no_source,
 *must_waw, *may_waw, *must_waw_no_source, *may_waw_no_source;
-
-  /* Original schedule of the SCoP.  */
-  isl_union_map *original_schedule;
 };
 
 extern scop_p new_scop (edge, edge);
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 65ff4e6..ec7248b 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -113,78 +113,51 @@ isl_id_for_pbb (scop_p s, poly_bb_p pbb)
| 0   0   1   0   0   0   0   0  -5  = 0  */
 
 static void
-build_pbb_minimal_scattering_polyhedrons (isl_aff *static_sched, poly_bb_p pbb,
- int *sequence_dims,
- int nb_sequence_dim)
+build_pbb_scattering_polyhedrons (isl_aff *static_sched,
+ poly_bb_p pbb)
 {
-  int local_dim = isl_set_dim (pbb->domain, isl_dim_set);
-
-  /* Remove a sequence dimension if irrelevant to domain of current pbb.  */
-  int actual_nb_dim = 0;
-  for (int i = 0; i < nb_sequence_dim; i++)
-if (sequence_dims[i] <= local_dim)
-  actual_nb_dim++;
-
-  /* Build an array that combines sequence dimensions and loops dimensions 
info.
- This is used later to compute the static scattering polyhedrons.  */
-  bool *sequence_and_loop_dims = NULL;
-  if (local_dim + actual_nb_dim > 0)
-{
-  sequence_and_loop_dims = XNEWVEC (bool, local_dim + actual_nb_dim);
+  isl_val *val;
 
-  int i = 0, j = 0;
-  for (; i < local_dim; i++)
-   {
- if (sequence_dims && sequence_dims[j] == i)
-   {
- /* True for sequence dimension.  */
- sequence_and_loop_dims[i + j] = true;
- j++;
-   }
- /* False for loop dimension.  */
- sequence_and_loop_dims[i + j] = false;
-   }
-  /* Fake loops make things shifted by one.  */
-  if (sequence_dims && sequence_dims[j] == i)
-   sequence_and_loop_dims[i + j] = true;
-}
+  int scattering_dimensions = isl_set_dim (pbb->domain, isl_dim_set) * 2 + 1;
 
-  int scattering_dimensions = local_dim + actual_nb_dim;
   isl_space *dc = isl_set_get_space (pbb->domain);
-  isl_space *dm = isl_space_add_dims (isl_space_from_domain (dc), isl_dim_out,
- scattering_dimensions);
+  isl_space *dm = isl_space_add_dims (isl_space_from_domain (dc),
+ isl_dim_out, scattering_dimensions);
   pbb->schedule = isl_map_universe (dm);
 
-  int k = 0;
   for (int i = 0; i < scattering_dimensions; i++)
 {
-  if (!sequence_and_loop_dims[i])
+  /* Textual order inside this loop.  */
+  if ((i % 2) == 0)
{
- /* Iterations of this loop - loop dimension.  */
- pbb->schedule = isl_map_equate (pbb->schedule, isl_dim_in, k,
- isl_dim_out, i);
-

[PATCH, i386] PR68497. Fix ICE with -fno-checking

2015-11-23 Thread Mikhail Maltsev

Hi!

The attached patch fixes a problem introduced in r229567: the assertion

gcc_assert (is_sse);

is checked if flag_checking is false, and this causes an ICE when compiling with
-fno-checking.

Bootstrapped and regtested on x86_64-pc-linux-gnu. OK for trunk?

-- 
Regards,
Mikhail Maltsev

gcc/ChangeLog:

2015-11-23  Mikhail Maltsev  

PR target/68497
* config/i386/i386.c (output_387_binary_op): Fix assertion for
-fno-checking case.

gcc/testsuite/ChangeLog:

2015-11-23  Mikhail Maltsev  

PR target/68497
* gcc.target/i386/pr68497.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 83749d5..23dbb3a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -17675,18 +17675,20 @@ output_387_binary_op (rtx insn, rtx *operands)
 
   /* Even if we do not want to check the inputs, this documents input
  constraints.  Which helps in understanding the following code.  */
-  if (flag_checking
-  && STACK_REG_P (operands[0])
-  && ((REG_P (operands[1])
-	   && REGNO (operands[0]) == REGNO (operands[1])
-	   && (STACK_REG_P (operands[2]) || MEM_P (operands[2])))
-	  || (REG_P (operands[2])
-	  && REGNO (operands[0]) == REGNO (operands[2])
-	  && (STACK_REG_P (operands[1]) || MEM_P (operands[1]
-  && (STACK_TOP_P (operands[1]) || STACK_TOP_P (operands[2])))
-; /* ok */
-  else
-gcc_checking_assert (is_sse);
+  if (flag_checking)
+{
+  if (STACK_REG_P (operands[0])
+	  && ((REG_P (operands[1])
+	   && REGNO (operands[0]) == REGNO (operands[1])
+	   && (STACK_REG_P (operands[2]) || MEM_P (operands[2])))
+	  || (REG_P (operands[2])
+		  && REGNO (operands[0]) == REGNO (operands[2])
+		  && (STACK_REG_P (operands[1]) || MEM_P (operands[1]
+	  && (STACK_TOP_P (operands[1]) || STACK_TOP_P (operands[2])))
+	; /* ok */
+  else
+	gcc_assert (is_sse);
+}
 
   switch (GET_CODE (operands[3]))
 {
diff --git a/gcc/testsuite/gcc.target/i386/pr68497.c b/gcc/testsuite/gcc.target/i386/pr68497.c
new file mode 100644
index 000..0135cda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr68497.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-checking" } */
+
+long double
+foo (long double x, long double y)
+{
+  return x + y;
+}

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Jan Hubicka

> > > If you have any clue how to debug it further, I would be happy to try.
> > > That atree code is real software engineering treat BTW
> > 
> > I'll have a look at some point, once things have stabilized a bit.
> 
> OK, I will push out the remaining patches needed to get LTO into a shape to
> compile gnat1 and we can try to take a look from that.  It seems that gnat1
> was broken with LTO this way at least since GCC 5.

I will also try to check if -fno-strict-aliasing or -fno-ipa-icf fixes the 
issue.

Honza

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Jan Hubicka

> > If you have any clue how to debug it further, I would be happy to try.
> > That atree code is real software engineering treat BTW
> 
> I'll have a look at some point, once things have stabilized a bit.

OK, I will push out the remaining patches needed to get LTO into a shape to
compile gnat1 and we can try to take a look from that.  It seems that gnat1
was broken with LTO this way at least since GCC 5.

Honza

Re: [PATCH 3/6] Fix memory leaks in IPA devirt

2015-11-23 Thread Trevor Saunders

On Mon, Nov 23, 2015 at 02:48:37PM +0100, marxin wrote:
> gcc/ChangeLog:
> 
> 2015-11-20  Martin Liska  
> 
>   * ipa-devirt.c (ipa_devirt): Use auto_vec instead
>   of a local-scope vec. Release final_warning_records.
> ---
>  gcc/ipa-devirt.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
> index e74f853..6003c92 100644
> --- a/gcc/ipa-devirt.c
> +++ b/gcc/ipa-devirt.c
> @@ -3837,7 +3837,7 @@ ipa_devirt (void)
>  
>if (warn_suggest_final_methods)
>   {
> -   vec decl_warnings_vec = vNULL;
> +   auto_vec decl_warnings_vec;
>  
> final_warning_records->decl_warnings.traverse
>*, add_decl_warning> 
> (&decl_warnings_vec);
> @@ -3887,7 +3887,8 @@ ipa_devirt (void)
> decl, count, dyn_count);
>   }
>   }
> - 
> +
> +  final_warning_records->type_warnings.release ();
>delete (final_warning_records);

You should be able to just make
final_warning_record::type_warnings an auto_vec right? that
seems less error prone, though this is certainly fine for now.

Trev

>final_warning_records = 0;
>  }
> -- 
> 2.6.3
> 
>

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Eric Botcazou

> If you have any clue how to debug it further, I would be happy to try.
> That atree code is real software engineering treat BTW

I'll have a look at some point, once things have stabilized a bit.

-- 
Eric Botcazou

[PATCH] Fix declaration of pthread-structs in s-osinte-rtems.ads (ada/68169)

2015-11-23 Thread Jan Sommer

Just noticed that I forgot to crosspost this mail to the rtems-devel list.

If someone with commit rights could check and push the patches we might get it 
into the next release.

Cheers,

   Jan--- Begin Message ---
Hello,

The paperwork seems to have gone through.
Here is the patch again for the 4.9.x, 5.x and trunk respectively.
I just pulled the head of the corresponding branches and created a new diff, so 
it should apply properly.

Best regards,

   JanIndex: gcc/ada/ChangeLog
===
--- gcc/ada/ChangeLog	(Revision 230563)
+++ gcc/ada/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2015-11-18  Jan Sommer 
+
+	* s-oscons-tmplt.c: Generate pthread constants for RTEMS
+	* s-osinte-rtems.ads: Declare pthread structs as opaque types in Ada
+	Fixes PR ada/68169
+
 2015-10-09  Eric Botcazou  
 
 	* gcc-interface/Make-lang.in: Make sure that GNAT1_OBJS and not just
Index: gcc/ada/s-oscons-tmplt.c
===
--- gcc/ada/s-oscons-tmplt.c	(Revision 230563)
+++ gcc/ada/s-oscons-tmplt.c	(Arbeitskopie)
@@ -154,7 +154,7 @@ pragma Style_Checks ("M32766");
 # include <_types.h>
 #endif
 
-#ifdef __linux__
+#if defined (__linux__) || defined (__rtems__)
 # include 
 # include 
 #endif
@@ -1441,7 +1441,8 @@ CND(CLOCK_THREAD_CPUTIME_ID, "Thread CPU clock")
 CNS(CLOCK_RT_Ada, "")
 #endif
 
-#if defined (__APPLE__) || defined (__linux__) || defined (DUMMY)
+#if defined (__APPLE__) || defined (__linux__) || defined (__rtems__) || \
+  defined (DUMMY)
 /*
 
--  Sizes of pthread data types
@@ -1484,7 +1485,7 @@ CND(PTHREAD_RWLOCKATTR_SIZE, "pthread_rwlockattr_t
 CND(PTHREAD_RWLOCK_SIZE, "pthread_rwlock_t")
 CND(PTHREAD_ONCE_SIZE,   "pthread_once_t")
 
-#endif /* __APPLE__ || __linux__ */
+#endif /* __APPLE__ || __linux__ || __rtems__*/
 
 /*
 
Index: gcc/ada/s-osinte-rtems.ads
===
--- gcc/ada/s-osinte-rtems.ads	(Revision 230563)
+++ gcc/ada/s-osinte-rtems.ads	(Arbeitskopie)
@@ -51,6 +51,8 @@
 --  It is designed to be a bottom-level (leaf) package.
 
 with Interfaces.C;
+with System.OS_Constants;
+
 package System.OS_Interface is
pragma Preelaborate;
 
@@ -60,6 +62,7 @@ package System.OS_Interface is
subtype rtems_id   is Interfaces.C.unsigned;
 
subtype intis Interfaces.C.int;
+   subtype char   is Interfaces.C.char;
subtype short  is Interfaces.C.short;
subtype long   is Interfaces.C.long;
subtype unsigned   is Interfaces.C.unsigned;
@@ -68,7 +71,6 @@ package System.OS_Interface is
subtype unsigned_char  is Interfaces.C.unsigned_char;
subtype plain_char is Interfaces.C.plain_char;
subtype size_t is Interfaces.C.size_t;
-
---
-- Errno --
---
@@ -76,11 +78,11 @@ package System.OS_Interface is
function errno return int;
pragma Import (C, errno, "__get_errno");
 
-   EAGAIN: constant := 11;
-   EINTR : constant := 4;
-   EINVAL: constant := 22;
-   ENOMEM: constant := 12;
-   ETIMEDOUT : constant := 116;
+   EAGAIN: constant := System.OS_Constants.EAGAIN;
+   EINTR : constant := System.OS_Constants.EINTR;
+   EINVAL: constant := System.OS_Constants.EINVAL;
+   ENOMEM: constant := System.OS_Constants.ENOMEM;
+   ETIMEDOUT : constant := System.OS_Constants.ETIMEDOUT;
 
-
-- Signals --
@@ -448,6 +450,7 @@ package System.OS_Interface is
   ss_low_priority : int;
   ss_replenish_period : timespec;
   ss_initial_budget   : timespec;
+  sched_ss_max_repl   : int;
end record;
pragma Convention (C, struct_sched_param);
 
@@ -621,43 +624,34 @@ private
end record;
pragma Convention (C, timespec);
 
-   CLOCK_REALTIME :  constant clockid_t := 1;
-   CLOCK_MONOTONIC : constant clockid_t := 4;
+   CLOCK_REALTIME :  constant clockid_t := System.OS_Constants.CLOCK_REALTIME;
+   CLOCK_MONOTONIC : constant clockid_t := System.OS_Constants.CLOCK_MONOTONIC;
 
+   subtype char_array is Interfaces.C.char_array;
+
type pthread_attr_t is record
-  is_initialized  : int;
-  stackaddr   : System.Address;
-  stacksize   : int;
-  contentionscope : int;
-  inheritsched: int;
-  schedpolicy : int;
-  schedparam  : struct_sched_param;
-  cputime_clocked_allowed : int;
-  detatchstate: int;
+  Data : char_array (1 .. OS_Constants.PTHREAD_ATTR_SIZE);
end record;
pragma Convention (C, pthread_attr_t);
+   for pthread_attr_t'Alignment use Interfaces.C.double'Alignment;
 
type pthread_condattr_t is record
-  flags   : int;
-  process_shared  : int;
+  Data : char_array (1 .. OS_Constants.PTHREAD_CONDATTR_SIZE);
end record;
pragma Convention (C, pthread_condattr_t);
+   for pthread_condattr_t'Alignment use Interfaces.C.double'Alignment;
 
type pthread_mutex

Re: [PATCH] lround for PowerPC

2015-11-23 Thread Michael Meissner

David ping'ed me on internal IRC, and I had a thinko in terms of the use of the
 mode attribute.  In some of the uses (such as abs, smax, etc.) we want to
use ISA 2.06 instructions on SFmode, while in other uses (add, mul, etc.) we
want to use it only if we have the ISA 2.07 instrucitons.

I have split these mode attributes into Fv and Fv2 and gone through all of the
uses in the compiler to use the appropriate attribute.  I have built a cross
compiler on x86, but it blew up on a big endian power7 with a segmentation
violation that I need to look into.  I'm also building on a little endian
power8 right now, and it has gotten further.

2015-11-23  David Edelsohn  
Michael Meissner  

* config/rs6000/rs6000.md (UNSPEC_XSRDPI): New unspec.
(Fv2): New mode attribute to be used when ISA 2.06 instructions
are used on SF/DF values.
(abs2_fpr): Use  instead of .
(nabs2_fpr): Likewise.
(neg2_fpr): Likewise.
(copysign3_fcpsgn): Likewise.
(smax3_vsx): Likewise.
(smin3_vsx): Likewise.
(floatsi2_lfiwax): Likewise.
(floatunssi2_lfiwz): Likewise.
(fctiwz_): Likewise.
(fctiwuz_): Likewise.
(btrunc2): Likewise.
(ceil2): Likewise.
(floor2): Likewise.
(xsrdpi): Add support for the lround function.
(lround2): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 230768)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -77,6 +77,7 @@ (define_c_enum "unspec"
UNSPEC_FRIN
UNSPEC_FRIP
UNSPEC_FRIZ
+   UNSPEC_XSRDPI
UNSPEC_LD_MPIC  ; load_macho_picbase
UNSPEC_RELD_MPIC; re-load_macho_picbase
UNSPEC_MPIC_CORRECT ; macho_correct_pic
@@ -491,9 +492,17 @@ (define_mode_attr Fvsx [(SF "sp") (DF  "
 ; SF/DF constraint for arithmetic on traditional floating point registers
 (define_mode_attr Ff   [(SF "f") (DF "d") (DI "d")])
 
-; SF/DF constraint for arithmetic on VSX registers
+; SF/DF constraint for arithmetic on VSX registers.  This is intended to be
+; used for DFmode instructions added in ISA 2.06 (power7) and SFmode
+; instructions added in ISA 2.07 (power8)
 (define_mode_attr Fv   [(SF "wy") (DF "ws") (DI "wi")])
 
+; SF/DF constraint for arithmetic on VSX registers using instructions added in
+; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
+; but are used on SFmode, since internally SFmode values are kept in the DFmode
+; format.
+(define_mode_attr Fv2  [(SF "ww") (DF "ws") (DI "wi")])
+
 ; SF/DF constraint for arithmetic on altivec registers
 (define_mode_attr Fa   [(SF "wu") (DF "wv")])
 
@@ -4299,8 +4308,8 @@ (define_expand "abs2"
   "")
 
 (define_insn "*abs2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
   "TARGET__FPR"
   "@
fabs %0,%1
@@ -4309,10 +4318,10 @@ (define_insn "*abs2_fpr"
(set_attr "fp_type" "fp_addsub_")])
 
 (define_insn "*nabs2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
(neg:SFDF
 (abs:SFDF
- (match_operand:SFDF 1 "gpc_reg_operand" ","]
+ (match_operand:SFDF 1 "gpc_reg_operand" ","]
   "TARGET__FPR"
   "@
fnabs %0,%1
@@ -4327,8 +4336,8 @@ (define_expand "neg2"
   "")
 
 (define_insn "*neg2_fpr"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (neg:SFDF (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
   "TARGET__FPR"
   "@
fneg %0,%1
@@ -4557,9 +4566,9 @@ (define_expand "copysign3"
 ;; Use an unspec rather providing an if-then-else in RTL, to prevent the
 ;; compiler from optimizing -0.0
 (define_insn "copysign3_fcpsgn"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",")
- (match_operand:SFDF 2 "gpc_reg_operand" ",")]
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
+   (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",")
+ (match_operand:SFDF 2 "gpc_reg_operand" ",")]
 UNSPEC_COPYSIGN))]
   "TARGET__FPR && TARGET_CMPB"
   "@
@@ -4593,9 +4602,9 @@ (define_expand "smax3"
 })
 
 (define_insn "*smax3_vsx"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,")
-   (smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,")
-  (match_operand:SFDF 2 "gpc_reg_operand" ",")))

libgo patch committed: Fix reflect.Call of function returning zero-sized type

2015-11-23 Thread Ian Lance Taylor

PR 68496 points out a bug in the handling of reflect.Call calling a
function that returns a zero-sized type.  libffi doesn't understand
zero-sized types, which don't exist in C, so they require special
handling.  This patch fixes the problem.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline and GCC 5
branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 230759)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-97ec885c715b3922b0866c081554899b8d50933a
+0d979f0b860cfd879754150e0ae5e1018b94d7c4
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/runtime/go-reflect-call.c
===
--- libgo/runtime/go-reflect-call.c (revision 230759)
+++ libgo/runtime/go-reflect-call.c (working copy)
@@ -81,6 +81,12 @@ go_results_size (const struct __go_func_
 
   off = (off + maxalign - 1) & ~ (maxalign - 1);
 
+  // The libffi library doesn't understand a struct with no fields.
+  // We generate a struct with a single field of type void.  When used
+  // as a return value, libffi will think that requires a byte.
+  if (off == 0)
+off = 1;
+
   return off;
 }

Re: [PATCH, PING*4] Track indirect calls for call site information in debug info.

2015-11-23 Thread Jason Merrill


On 08/31/2015 03:28 AM, Pierre-Marie de Rodat wrote:

On 07/20/2015 02:45 PM, Pierre-Marie de Rodat wrote:

On PowerPC targets with -mlongcall, most subprogram calls are turned
into indirect calls: the call target is read from a register even though
it is compile-time known. This makes it difficult for machine code
static analysis engines to recover the callee information. The attached
patch is an attempt to help such engines, generating
DW_AT_abstract_origin attributes for all DW_TAG_GNU_call_site we are
interested in.


Ping for the patch submitted in
.


Jakub, since DW_TAG_GNU_call_site is your feature, could you review this?

Jason

Re: [PATCHES, PING*5] Enhance standard DWARF for Ada

2015-11-23 Thread Jason Merrill


On 11/23/2015 08:53 AM, Pierre-Marie de Rodat wrote:

 Do you think the other patches could make it before the branch? (if they 
could, I will rebase+retest them as quick as possible).


Probably, yes.  I can't find the DW_AT_static_link patch, though; it 
doesn't seem to have been attached to your initial mail.



+  /* If we already met this node, there is nothing to compute anymore.  */
+  if (visited.contains (l))
+   {
+#if ENABLE_CHECKING
+ /* Make sure that the stack size is consistent wherever the execution
+flow comes from.  */
+ gcc_assert ((unsigned) l->dw_loc_frame_offset == frame_offset_);
+#endif
+ break;
+   }
+  visited.add (l);


The 'add' function returns whether or not the set already contained the 
entry, so you don't need to also call 'contains'.



+   /* The called DWARF procedure consumes one stack slot per argument
+  and returns one stack slot.  */
+   tree func
+ = lookup_dwarf_proc_decl (l->dw_loc_oprnd1.v.val_die_ref.die);
+
+   frame_offset += 1;
+   for (tree args = DECL_ARGUMENTS (func);
+args != NULL;
+args = DECL_CHAIN (args))
+ frame_offset_--;


Can you avoid the new hash table by counting the 
DW_TAG_formal_parameters instead of the DECL_ARGUMENTS?


Jason

Re: [PATCH] Don't lower VEC_PERM_EXPR if it can be expanded using vec_shr optab (PR target/68483)

2015-11-23 Thread Richard Biener

On November 23, 2015 8:14:59 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The patches that removed VEC_RSHIFT_EXPR regressed the first of these
>testcases on i?86/-msse2, because can_vec_perm_p returns false for
>that,
>and indeed as can_vec_perm_p is given only the mode and mask indices,
>there is nothing it can do about it.  The former VEC_RSHIFT_EXPR
>is a special VEC_PERM_EXPR with zero (bitwise, so not -0.0) as second
>argument and we can use vec_shr in that case.  The expander knows that,
>but
>veclower hasn't been taught about that, which is what this patch does.
>
>The patch also fixes up the shift_amt_for_vec_perm_mask function,
>if the first index is >= nelt, then it certainly is not a vector shift,
>but
>all zeros result (we should have folded it), plus when first is < nelt,
>then it doesn't make sense to mask the result, even for first == nelt -
>1
>first + nelt - 1 is <= 2 * nelt - 1.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk/5.3?

OK.

I wonder if we want a more "powerful" can_vec_perm that gets the actual 
arguments for example to decide if the result of folding a perm is still valid
To include that shift case.

Richard.

>2015-11-23  Jakub Jelinek  
>
>   PR target/68483
>   * tree-vect-generic.c (lower_vec_perm): If VEC_PERM_EXPR
>   is valid vec_shr pattern, don't lower it even if can_vec_perm_p
>   returns false.
>   * optabs.c (shift_amt_for_vec_perm_mask): Return NULL_RTX
>   whenever first is nelt or above.  Don't mask expected with
>   2 * nelt - 1.
>
>   * gcc.target/i386/pr68483-1.c: New test.
>   * gcc.target/i386/pr68483-2.c: New test.
>
>--- gcc/tree-vect-generic.c.jj 2015-11-23 13:29:41.959236201 +0100
>+++ gcc/tree-vect-generic.c2015-11-23 14:13:10.378094173 +0100
>@@ -1272,6 +1272,30 @@ lower_vec_perm (gimple_stmt_iterator *gs
> update_stmt (stmt);
> return;
>   }
>+  /* Also detect vec_shr pattern - VEC_PERM_EXPR with zero
>+   vector as VEC1 and a right element shift MASK.  */
>+  if (optab_handler (vec_shr_optab, TYPE_MODE (vect_type))
>+!= CODE_FOR_nothing
>+&& TREE_CODE (vec1) == VECTOR_CST
>+&& initializer_zerop (vec1)
>+&& sel_int[0]
>+&& sel_int[0] < elements)
>+  {
>+for (i = 1; i < elements; ++i)
>+  {
>+unsigned int expected = i + sel_int[0];
>+/* Indices into the second vector are all equivalent.  */
>+if (MIN (elements, (unsigned) sel_int[i])
>+!= MIN (elements, expected))
>+  break;
>+  }
>+if (i == elements)
>+  {
>+gimple_assign_set_rhs3 (stmt, mask);
>+update_stmt (stmt);
>+return;
>+  }
>+  }
> }
>   else if (can_vec_perm_p (TYPE_MODE (vect_type), true, NULL))
> return;
>--- gcc/optabs.c.jj2015-11-23 13:29:41.706239800 +0100
>+++ gcc/optabs.c   2015-11-23 13:33:14.857205132 +0100
>@@ -5232,12 +5232,12 @@ shift_amt_for_vec_perm_mask (rtx sel)
> return NULL_RTX;
> 
>   first = INTVAL (CONST_VECTOR_ELT (sel, 0));
>-  if (first >= 2*nelt)
>+  if (first >= nelt)
> return NULL_RTX;
>   for (i = 1; i < nelt; i++)
> {
>   int idx = INTVAL (CONST_VECTOR_ELT (sel, i));
>-  unsigned int expected = (i + first) & (2 * nelt - 1);
>+  unsigned int expected = i + first;
>   /* Indices into the second vector are all equivalent.  */
>   if (idx < 0 || (MIN (nelt, (unsigned) idx) != MIN (nelt, expected)))
>   return NULL_RTX;
>--- gcc/testsuite/gcc.target/i386/pr68483-1.c.jj   2015-11-23
>14:27:54.213534756 +0100
>+++ gcc/testsuite/gcc.target/i386/pr68483-1.c  2015-11-23
>14:33:57.810362424 +0100
>@@ -0,0 +1,22 @@
>+/* PR target/68483 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -ftree-vectorize -msse2 -mno-sse3" } */
>+
>+void
>+test (int *input, int *out, unsigned x1, unsigned x2)
>+{
>+  unsigned i, j;
>+  unsigned end = x1;
>+
>+  for (i = j = 0; i < 1000; i++)
>+{
>+  int sum = 0;
>+  end += x2;
>+  for (; j < end; j++)
>+  sum += input[j];
>+  out[i] = sum;
>+}
>+}
>+
>+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(8,|, 8)" { target ia32
>} } } */
>+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(4,|, 4)" { target ia32
>} } } */
>--- gcc/testsuite/gcc.target/i386/pr68483-2.c.jj   2015-11-23
>14:33:22.436865628 +0100
>+++ gcc/testsuite/gcc.target/i386/pr68483-2.c  2015-11-23
>14:34:33.716851638 +0100
>@@ -0,0 +1,15 @@
>+/* PR target/68483 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -msse2 -mno-sse3" } */
>+
>+typedef int V __attribute__((vector_size (16)));
>+
>+void
>+foo (V *a, V *b)
>+{
>+  V c = { 0, 0, 0, 0 };
>+  V d = { 1, 2, 3, 4 };
>+  *a = __builtin_shuffle (*b, c, d);
>+}
>+
>+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(4,|, 4)" } } */
>
>   Jakub

Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-23 Thread Mike Stump

On Nov 23, 2015, at 3:13 AM, Joseph Myers  wrote:
> On Sun, 22 Nov 2015, David Malcolm wrote:
> 
>> Is there (or could there be) a precanned dg- directive to ask if ObjC is
>> available?  
> 
> I don't think so.  Normal practice is that each language's tests are in 
> appropriate directories for that language, with runtest never called with 
> a --tool option for that language if it wasn't built.

It is also reasonable to have a common to C/ObjC test directory if people want 
to do that and have them run if objc is present.  We do that for some tests in 
the C world.

Re: [PATCH] Don't ICE on symbolic ranges in VRP (PR tree-optimization/68455)

2015-11-23 Thread Richard Biener

On November 23, 2015 6:09:33 PM GMT+01:00, Marek Polacek  
wrote:
>On Mon, Nov 23, 2015 at 05:40:14PM +0100, Richard Biener wrote:
>> On November 23, 2015 5:31:11 PM GMT+01:00, Marek Polacek
> wrote:
>> >We blow up on the following testcase because we find ourselves
>passing
>> >[_13 + 1, INT_MAX] as a vr1 to
>extract_range_from_multiplicative_op_1;
>> >that's bad because this function immediately calls
>vrp_int_const_binop
>> >which just doesn't work for symbolic ranges, it only wants int_csts.
>> >
>> >This started with Richards S.'s changes in r228614 -- we're now
>since
>> >able to recurse into SSA names, thus get better info about ranges.
>> >That means that range_includes_zero_p in
>> >extract_range_from_binary_expr_1
>> >for the *_DIV_EXPR cases was able to determine that the range
>doesn't
>> >include zero, so we went through a different code path and ended up
>> >calling extract_range_from_multiplicative_op_1 even with symbolic
>> >ranges.
>> >
>> >I couldn't come up with anything better than checking that we're
>> >dealing
>> >with nonsymbolic ranges for such a case.
>> >
>> >Bootstrapped/regtested on x86_64-linux, ok for trunk?
>> 
>> Hmm.  I think we can do better if vr0 is symbolical - use min, max
>for it.
>> 
>> I suppose it would be best to implement a get_integer_range ()
>function doing that or also looking at equivalences if we are getting a
>symbolic range.
>> 
>> Anyway, those are future enhancements that shouldn't block this
>patch.
> 
>Is this something for this stage3?  Or should I open a PR and fix it in
>the
>next stage1?

Open a PR for next stage1.  Unless you are able to create a testcase that 
regressed of course.

Richard.

>> Thus OK.
>
>Thanks.
>
>   Marek

Re: [PATCH, PING*3] DWARF: materialize subprogram renamings in Ada as imported declarations

2015-11-23 Thread Jason Merrill


On 08/31/2015 03:26 AM, Pierre-Marie de Rodat wrote:

On 07/25/2015 09:44 PM, Pierre-Marie de Rodat wrote:

This change makes GCC materialize subprogram renamings in Ada as
imported declarations (DW_TAG_imported_declarations). For instance,

 procedure Foo renames Bar;

will output:

 DW_TAG_imported_declaration:
 DW_AT_name: foo
 DW_AT_import: 


Ping for the patch submitted in
.


DWARF changes are OK.

Jason

Re: [ptx] Fix sso tests

2015-11-23 Thread Nathan Sidwell


On 11/23/15 15:41, Jeff Law wrote:



In the 'put' function, why not just make all targets go through putchar?  It's
not like this is performance critical code and I don't think it compromises any
of the tests, does it?


I contemplated that, but wondered if someone would complain.  I'm happy either 
way.

nathan

Re: [PATCH] Add LANG_HOOKS_EMPTY_RECORD_P for C++ empty class

2015-11-23 Thread H.J. Lu

On Mon, Nov 23, 2015 at 1:57 AM, Richard Biener
 wrote:
> On Sat, Nov 21, 2015 at 12:46 AM, H.J. Lu  wrote:
>> On Fri, Nov 20, 2015 at 2:17 PM, Jason Merrill  wrote:
>>> On 11/20/2015 01:52 PM, H.J. Lu wrote:

 On Tue, Nov 17, 2015 at 4:22 AM, Richard Biener
  wrote:
>
> On Tue, Nov 17, 2015 at 12:01 PM, H.J. Lu  wrote:
>>
>> Empty record should be returned and passed the same way in C and C++.
>> This patch adds LANG_HOOKS_EMPTY_RECORD_P for C++ empty class, which
>> defaults to return false.  For C++, LANG_HOOKS_EMPTY_RECORD_P is defined
>> to is_really_empty_class, which returns true for C++ empty classes.  For
>> LTO, we stream out a bit to indicate if a record is empty and we store
>> it in TYPE_LANG_FLAG_0 when streaming in.  get_ref_base_and_extent is
>> changed to set bitsize to 0 for empty records.  Middle-end and x86
>> backend are updated to ignore empty records for parameter passing and
>> function value return.  Other targets may need similar changes.
>
>
> Please avoid a new langhook for this and instead claim a bit in
> tree_type_common
> like for example restrict_flag (double-check it is unused for
> non-pointers).


 There is no bit in tree_type_common I can overload.  restrict_flag is
 checked for non-pointers to issue an error when it is used on
 non-pointers:


 /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/template/qualttp20.C:19:38:
 error: ‘__restrict__’ qualifiers cannot be applied to ‘AS::L’
 typedef typename T::L __restrict__ r;// { dg-error "'__restrict__'
 qualifiers cannot" "" }
>>>
>>>
>>> The C++ front end only needs to check TYPE_RESTRICT for this purpose on
>>> front-end-specific type codes like TEMPLATE_TYPE_PARM; cp_type_quals could
>>> handle that specifically if you change TYPE_RESTRICT to only apply to
>>> pointers.
>>>
>>
>> restrict_flag is also checked in this case:
>>
>> [hjl@gnu-6 gcc]$ cat x.i
>> struct dummy { };
>>
>> struct dummy
>> foo (struct dummy __restrict__ i)
>> {
>>   return i;
>> }
>> [hjl@gnu-6 gcc]$ gcc -S x.i -Wall
>> x.i:4:13: error: invalid use of ‘restrict’
>>  foo (struct dummy __restrict__ i)
>>  ^
>> x.i:4:13: error: invalid use of ‘restrict’
>> [hjl@gnu-6 gcc]$
>>
>> restrict_flag can't also be used to indicate `i' is an empty record.
>
> I'm sure this error can be done during parsing w/o relying on TYPE_RESTRICT.
>
> But well, use any other free bit (but do not enlarge
> tree_type_common).  Eventually
> you can free up a bit by putting sth into type_lang_specific currently
> using bits
> in tree_type_common.

There are no bits in tree_type_common I can move.  Instead,
this patch overloads side_effects_flag in tree_base.  Tested on
Linux/x86-64.  OK for trunk?

Thanks.

-- 
H.J.
From 1e3e6ed42ed86b74d01bd3a6b869c15dba964c21 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 15 Nov 2015 13:19:05 -0800
Subject: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

Empty record should be returned and passed the same way in C and C++.
This patch overloads a bit, side_effects_flag, in tree_base for C++
empty class.  Middle-end and x86 backend are updated to ignore empty
records for parameter passing and function value return.  Other targets
may need similar changes.

get_ref_base_and_extent is changed to set bitsize to 0 for empty records
so that when ref_maybe_used_by_call_p_1 calls get_ref_base_and_extent to
get 0 as the maximum size on empty record.  Otherwise, find_tail_calls
won't perform tail call optimization for functions with empty record
parameters, as shown in g++.dg/pr60336-1.C and g++.dg/pr60336-2.C.

gcc/

	PR c++/60336
	PR middle-end/67239
	PR target/68355
	* calls.c (initialize_argument_information): Replace
	targetm.calls.function_arg, targetm.calls.function_incoming_arg
	and targetm.calls.function_arg_advance with function_arg,
	function_incoming_arg and function_arg_advance.
	(expand_call): Likewise.
	(emit_library_call_value_1): Likewise.
	(store_one_arg): Use 0 for empty record size.  Don't
	push 0 size argument onto stack.
	(must_pass_in_stack_var_size_or_pad): Return false for empty
	record.
	* dse.c (get_call_args): Replace targetm.calls.function_arg
	and targetm.calls.function_arg_advance with function_arg and
	function_arg_advance.
	* expr.c (block_move_libcall_safe_for_call_parm): Likewise.
	* function.c (aggregate_value_p): Replace
	targetm.calls.return_in_memory with return_in_memory.
	(assign_parm_find_entry_rtl): Replace
	targetm.calls.function_incoming_arg with function_incoming_arg.
	(assign_parms): Replace targetm.calls.function_arg_advance with
	function_arg_advance.
	(gimplify_parameters): Replace targetm.calls.function_arg_advance
	with function_arg_advance.
	(locate_and_pad_parm): Use 0 for empty record size.
	(function_arg_advance): New wrapper function.
	(function_arg): Likewise.
	(function_incoming_arg): Likewise.
	(return_in_memory): Likewise.
	* lto-stre

Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-23 Thread Jeff Law


On 11/23/2015 12:50 PM, David Malcolm wrote:

On Mon, 2015-11-23 at 10:25 -0700, Jeff Law wrote:

On 11/23/2015 04:13 AM, Joseph Myers wrote:

On Sun, 22 Nov 2015, David Malcolm wrote:


Is there (or could there be) a precanned dg- directive to ask if ObjC is
available?


I don't think so.  Normal practice is that each language's tests are in
appropriate directories for that language, with runtest never called with
a --tool option for that language if it wasn't built.

Right.  Which argues that we really want to create a new test directory
for objc plugin tests.


Attached is a revised version of the patch which creates an
objc.dg/plugin subdirectory, and builds the plugin that way (directly
reusing the plugin src from the gcc.dg subdir).

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu; adds 16
PASS results to objc.sum.

OK for trunk?


OK.
jeff

Re: [ptx] Fix sso tests

2015-11-23 Thread Jeff Law


On 11/23/2015 01:16 PM, Nathan Sidwell wrote:

The gcc.dg/sso tests gratuitously fail on PTX because they use IO
facilities that don't exist there.  This  patch changes the dumping to
use the putchar function call (and not a macro), and not use fputs.

With this they all pass.

I'm not quite sure where the maintainer  boundaries lie for this kind of
fix. Any objections?
In the 'put' function, why not just make all targets go through putchar? 
 It's not like this is performance critical code and I don't think it 
compromises any of the tests, does it?


jeff

Re: update zlib to 1.2.8

2015-11-23 Thread Matthias Klose


On 23.11.2015 19:13, Joel Brobecker wrote:

In GCC zlib is only used for libjava; for binutils and gdb it is used when
building without --with-system-zlib.  This just updates zlib from 1.2.7 to
1.2.8 (released in 2013).  Applies cleanly, libjava still builds and doesn't
show any regressions in the testsuite.  Ok to apply (even if we already are
in stage3)?



+2015-11-23  Matthias Klose  
+
+   * Imported zlib 1.2.8; merged local changes.


Should not be a problem for GDB, since we're not near branching time.

Out of curiosity, what prompted this update? Just to be in sync with
the latest? Or was there an actual bug that you hit which 1.2.8 fixes?


No, just a packaging issue with somebody mentioning a static binutils build. 
That's when I saw the outdated version.


Now updated in the GCC VCS.

Matthias

Re: [PATCH 1/6] Fix memory leak in cilk

2015-11-23 Thread Trevor Saunders

> diff --git a/gcc/c-family/cilk.c b/gcc/c-family/cilk.c
> index e75e20c..1167b2b 100644
> --- a/gcc/c-family/cilk.c
> +++ b/gcc/c-family/cilk.c
> @@ -844,6 +844,7 @@ gimplify_cilk_spawn (tree *spawn_p)
>   call2, build_empty_stmt (EXPR_LOCATION (call1)));
>append_to_statement_list (spawn_expr, spawn_p);
>  
> +  free (arg_array);

seems like arg_array could just be made an auto_vec, but I guess this is
fine for now and someone can hopefully remember to clean that up later.

Trev

[ptx] Fix sso tests

2015-11-23 Thread Nathan Sidwell

The gcc.dg/sso tests gratuitously fail on PTX because they use IO facilities 
that don't exist there.  This  patch changes the dumping to use the putchar 
function call (and not a macro), and not use fputs.


With this they all pass.

I'm not quite sure where the maintainer  boundaries lie for this kind of fix. 
Any objections?


nathan
2015-11-23  Nathan Sidwell  

	* gcc.dg/sso/dump.h: Force IO to be putchar function call on nvptx.

Index: gcc/testsuite/gcc.dg/sso/dump.h
===
--- gcc/testsuite/gcc.dg/sso/dump.h	(revision 230718)
+++ gcc/testsuite/gcc.dg/sso/dump.h	(working copy)
@@ -1,3 +1,9 @@
+#ifdef __nvptx__
+/* Force function call.  NVPTX's IO is extremely limited.  */
+#undef putchar
+#define putchar (putchar)
+#endif
+
 void dump (void *p, unsigned int len)
 {
   const char digits[17] = "0123456789abcdef";
@@ -14,7 +20,13 @@ void dump (void *p, unsigned int len)
 
 void put (const char s[])
 {
+#ifdef  __nvptx__
+  int i;
+  for (i = 0; s[i]; i++)
+putchar (s[i]);
+#else
   fputs (s, stdout);
+#endif
 }
 
 void new_line (void)

[RFA] [PATCH] Fix invalid redundant extension elimination for rl78 port

2015-11-23 Thread Jeff Law



The core analysis was from Nick.  Essentially:


(insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
(insn  45 (set (reg:QI r10) (mem:QI (reg:HI r18)))
[...]
(insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
[...]
(insn  88 (set (reg:HI r10) (zero_extend:HI (reg:QI r10)))

  (This is on the RL78 target where HImode values occupy two hard
  registers and QImode values only one.  The bug however is generic, not
  RL78 specific).

  The REE pass transforms this into:

(insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
(insn  45 (set (reg:HI r10) (zero_extend:HI (mem:QI (reg:HI r18
[...]
(insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
[...]
(insn  88 deleted)

  Note how the new set at insn 45 clobbers the value loaded by insn 44
  into r11.  Thus when we use the value in insn 71, we're using the
  wrong value.


Nick had a more complex patch which tried to determine if the additional 
hard registers were used/set.  But the implementation was flawed in that 
it assumed the use succeeded the def in the linear insn chain, which is 
an invalid assumption in general.  For this to work what we'd really 
have to do is note all the blocks through which there's a path from the 
def to the use, then check for uses/sets within all those blocks.


Given this scenario is quite rare, it doesn't seem worth the effort. 
Even with an abort in the codepath, I can't get it to trigger during 
normal x86_64 or rl78 builds.  It only triggers on the rl78 with -O1 -free.


As I mentioned in a prior message on the subject, this is only a problem 
when the source/dest of the extension are the same.  When the 
source/dest of the extension are different, we only optimize when the 
original set and extension are in the same block and we verify that all 
affected registers are not set/used between the original set and the 
extension.
Bootstrapped and regression tested on x86_64-linux-gnu.  Also tested 
execute.exp on rl78 with no regressions.


I didn't include a distinct testcase as these are covered by pr42833 and 
strct-stdarg-1.c -- but only when those are run with -O1 -free.  I can 
certainly add a -free test for those tests if folks want.


I took this opportunity to also remove a block of #if 0'd code that I 
had in place for this situation, but had been unable to trigger.  I 
prefer Nick's location for the test.


Ok for the trunk?



Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e560746..29ed4e4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2015-11-18  Nick Clifton  
+   Jeff Law  
+
+   * ree.c (add_removable_extension): Avoid mis-optimizing cases where
+   the source/dest of the target extension require a different number of
+   hard registers.
+   (combine_set_extension): Remove #if 0 code.
+
 2015-11-20  Jim Wilson  
 
* tree-vect-data-refs.c (compare_tree): Call STRIP_NOPS.
diff --git a/gcc/ree.c b/gcc/ree.c
index b8436f2..f3b79e0 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -332,16 +332,6 @@ combine_set_extension (ext_cand *cand, rtx_insn 
*curr_insn, rtx *orig_set)
   else
 new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
 
-#if 0
-  /* Rethinking test.  Temporarily disabled.  */
-  /* We're going to be widening the result of DEF_INSN, ensure that doing so
- doesn't change the number of hard registers needed for the result.  */
-  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
-  != HARD_REGNO_NREGS (REGNO (SET_DEST (*orig_set)),
-  GET_MODE (SET_DEST (*orig_set
-   return false;
-#endif
-
   /* Merge constants by directly moving the constant into the register under
  some conditions.  Recall that RTL constants are sign-extended.  */
   if (GET_CODE (orig_src) == CONST_INT
@@ -1080,6 +1070,18 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
  }
  }
 
+  /* Fourth, if the extended version occupies more registers than the
+original and the source of the extension is the same hard register
+as the destination of the extension, then we can not eliminate
+the extension without deep analysis, so just punt.
+
+We allow this when the registers are different because the
+code in combine_reaching_defs will handle that case correctly.  */
+  if ((HARD_REGNO_NREGS (REGNO (dest), mode)
+  != HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))
+ && REGNO (dest) == REGNO (reg))
+   return;
+
   /* Then add the candidate to the list and insert the reaching definitions
  into the definition map.  */
   ext_cand e = {expr, code, mode, insn};

Re: [PATCH 5/6] Fix parser memory leak in cilk_simd_fn_info

2015-11-23 Thread Jeff Law


On 11/23/2015 06:48 AM, marxin wrote:

gcc/cp/ChangeLog:

2015-11-23  Martin Liska  

* parser.c (cp_parser_late_parsing_cilk_simd_fn_info):
Release tokens.
There's a vec of objects in cilk_simd_fn_info, so unless that vec is 
copied elsewhere, we definitely want to release them before we blow away 
parser->cilk_simd_fn_info.  AFAICT the vec is never copied elsewhere.  So...


OK for the trunk.

jeff

Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-23 Thread David Malcolm

On Mon, 2015-11-23 at 10:25 -0700, Jeff Law wrote:
> On 11/23/2015 04:13 AM, Joseph Myers wrote:
> > On Sun, 22 Nov 2015, David Malcolm wrote:
> >
> >> Is there (or could there be) a precanned dg- directive to ask if ObjC is
> >> available?
> >
> > I don't think so.  Normal practice is that each language's tests are in
> > appropriate directories for that language, with runtest never called with
> > a --tool option for that language if it wasn't built.
> Right.  Which argues that we really want to create a new test directory 
> for objc plugin tests.

Attached is a revised version of the patch which creates an
objc.dg/plugin subdirectory, and builds the plugin that way (directly
reusing the plugin src from the gcc.dg subdir).

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu; adds 16
PASS results to objc.sum.

OK for trunk?

>From f09c48b2ac55b2f9b5c3688e76fb4b91c3325fbb Mon Sep 17 00:00:00 2001
From: David Malcolm 
Date: Fri, 20 Nov 2015 11:12:47 -0500
Subject: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

gcc/c/ChangeLog:
	PR objc/68438
	* c-parser.c (c_parser_postfix_expression): Set up source ranges
	for various Objective-C constructs: Class.name syntax,
	@selector(), @protocol, @encode(), and [] message syntax.

gcc/testsuite/ChangeLog:
	PR objc/68438
	* objc.dg/plugin/diagnostic-test-expressions-1.m: New test file.
	* objc.dg/plugin/plugin.exp: New file, based on
	gcc.dg/plugin/plugin.exp.
---
 gcc/c/c-parser.c   | 17 +++-
 .../objc.dg/plugin/diagnostic-test-expressions-1.m | 94 ++
 gcc/testsuite/objc.dg/plugin/plugin.exp| 90 +
 3 files changed, 198 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m
 create mode 100644 gcc/testsuite/objc.dg/plugin/plugin.exp

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7b10764..18e9957 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7338,10 +7338,13 @@ c_parser_postfix_expression (c_parser *parser)
 		expr.value = error_mark_node;
 		break;
 	  }
-	component = c_parser_peek_token (parser)->value;
+	c_token *component_tok = c_parser_peek_token (parser);
+	component = component_tok->value;
+	location_t end_loc = component_tok->get_finish ();
 	c_parser_consume_token (parser);
 	expr.value = objc_build_class_component_ref (class_name, 
 			 component);
+	set_c_expr_source_range (&expr, loc, end_loc);
 	break;
 	  }
 	default:
@@ -7816,9 +7819,11 @@ c_parser_postfix_expression (c_parser *parser)
 	}
 	  {
 	tree sel = c_parser_objc_selector_arg (parser);
+	location_t close_loc = c_parser_peek_token (parser)->location;
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
 	expr.value = objc_build_selector_expr (loc, sel);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_AT_PROTOCOL:
@@ -7839,9 +7844,11 @@ c_parser_postfix_expression (c_parser *parser)
 	  {
 	tree id = c_parser_peek_token (parser)->value;
 	c_parser_consume_token (parser);
+	location_t close_loc = c_parser_peek_token (parser)->location;
 	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
    "expected %<)%>");
 	expr.value = objc_build_protocol_expr (id);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_AT_ENCODE:
@@ -7860,11 +7867,13 @@ c_parser_postfix_expression (c_parser *parser)
 	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
 	  break;
 	}
-	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
- "expected %<)%>");
 	  {
+	location_t close_loc = c_parser_peek_token (parser)->location;
+	c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
+ "expected %<)%>");
 	tree type = groktypename (t1, NULL, NULL);
 	expr.value = objc_build_encode_expr (type);
+	set_c_expr_source_range (&expr, loc, close_loc);
 	  }
 	  break;
 	case RID_GENERIC:
@@ -7907,9 +7916,11 @@ c_parser_postfix_expression (c_parser *parser)
 	  c_parser_consume_token (parser);
 	  receiver = c_parser_objc_receiver (parser);
 	  args = c_parser_objc_message_args (parser);
+	  location_t close_loc = c_parser_peek_token (parser)->location;
 	  c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE,
  "expected %<]%>");
 	  expr.value = objc_build_message_expr (receiver, args);
+	  set_c_expr_source_range (&expr, loc, close_loc);
 	  break;
 	}
   /* Else fall through to report error.  */
diff --git a/gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m b/gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m
new file mode 100644
index 000..ed7aca3
--- /dev/null
+++ b/gcc/testsuite/objc.dg/plugin/diagnostic-test-expressions-1.m
@@ -0,0 +1,94 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This file is similar to diagnostic-test-expressions-1.c
+   (s

Re: [PATCH] New version of libmpx with new memmove wrapper

2015-11-23 Thread Aleksandra Tsvetkova

gcc/testsuite/ChangeLog
+2015-10-27  Tsvetkova Alexandra  
+
+ * gcc.target/i386/mpx/memmove.c: New test for __mpx_wrapper_memmove.

libmpx/ChangeLog
+2015-10-28  Tsvetkova Alexandra  
+
+ * mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info option.
+ * libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise + includes fixed.
+ * libmpx/Makefile.in: Regenerate.
+ * mpxrt/Makefile.in: Regenerate.
+ * libmpxwrap/Makefile.in: Regenerate.
+ * mpxrt/libtool-version: New version.
+ * libmpxwrap/libtool-version: Likewise.
+ * mpxrt/libmpx.map: Add new version and a new symbol.
+ * mpxrt/mpxrt.h: New file.
+ * mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
+(REG_IP_IDX): Moved to mpxrt.h.
+(REX_PREFIX): Moved to mpxrt.h.
+(XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
+(MPX_L1_SIZE): Moved to mpxrt.h.
+ * libmpxwrap/mpx_wrappers.c: Rewrite __mpx_wrapper_memmove
+ to make it faster.
+ New types: mpx_pointer for extraction of indexes from pointer
+   mpx_bt_entry represents a cell in bounds table.
+ New functions: alloc_bt for allocatinn bounds table
+   get_bt to get address of bounds table
+   copy_if_possible and copy_if_possible_from_end move elements
+   of bounds table if we can
+   move_bounds moves bounds just like memmove


All fixed except for:

>>+static inline void
>>+alloc_bt (void *ptr)
>>+{
>>+  __asm__ __volatile__ ("bndstx %%bnd0, (%0,%0)"::"r" (ptr):"%bnd0");
>>+}
>
>This should be marked as bnd_legacy.

It will not work.

> +void *
> +__mpx_wrapper_memmove (void *dst, const void *src, size_t n)
> +{
> +  if (n == 0)
> +return dst;
> +
> +  __bnd_chk_ptr_bounds (dst, n);
> +  __bnd_chk_ptr_bounds (src, n);
> +
> +  memmove (dst, src, n);
> +  move_bounds (dst, src, n);
> +  return dst;
>  }
>
> You completely remove old algorithm which should be faster on small
> sizes. __mpx_wrapper_memmove should become a dispatcher between old
> and new implementations depending on target (32-bit or 64-bit) and N.
> Since old version performs both data and bounds copy, BD check should
> be moved into __mpx_wrapper_memmove to never call
> it when MPX is disabled.

Even though the old algorithm is faster on small sizes, it should not be used
with the new one because the new one supports unaligned pointers and the
old one does not. Different behavior may cause more problems.

Thanks,
Alexandra.
diff --git a/gcc/testsuite/gcc.target/i386/mpx/memmove.c 
b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
new file mode 100755
index 000..57030a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/memmove.c
@@ -0,0 +1,119 @@
+/* { dg-do run } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
+
+
+#include 
+#include 
+#include 
+#include 
+#include "mpx-check.h"
+
+#ifdef __i386__
+/* i386 directory size is 4MB.  */
+#define MPX_NUM_L2_BITS 10
+#define MPX_NUM_IGN_BITS 2
+#else /* __i386__ */
+/* x86_64 directory size is 2GB.  */
+#define MPX_NUM_L2_BITS 17
+#define MPX_NUM_IGN_BITS 3
+#endif /* !__i386__ */
+
+
+/* bt_num_of_elems is the number of elements in bounds table.  */
+unsigned long bt_num_of_elems = (1UL << MPX_NUM_L2_BITS);
+/* Function to test MPX wrapper of memmove function.
+   src_bigger_dst determines which address is bigger, can be 0 or 1.
+   src_bt_index and dst_bt index are bt_indexes
+   from the beginning of the page.
+   bd_index_end is the bd index of the last element of src if we define
+   bd index of the first element as 0.
+   src_bt index_end is bt index of the last element of src.
+   pointers inside determines if array being copied includes pointers
+   src_align and dst_align are alignments of src and dst.
+   Arrays may contain unaligned pointers.  */
+int
+test (int src_bigger_dst, int src_bt_index, int dst_bt_index,
+  int bd_index_end, int src_bt_index_end, int pointers_inside,
+  int src_align, int dst_align)
+{
+  const int n =
+src_bt_index_end - src_bt_index + bd_index_end * bt_num_of_elems;
+  if (n < 0)
+{
+  return 0;
+}
+  const int num_of_pointers = (bd_index_end + 2) * bt_num_of_elems;
+  void **arr = 0;
+  posix_memalign ((void **) (&arr),
+   1UL << (MPX_NUM_L2_BITS + MPX_NUM_IGN_BITS),
+   num_of_pointers * sizeof (void *));
+  void **src = arr, **dst = arr;
+  if ((src_bigger_dst) && (src_bt_index < dst_bt_index))
+src_bt_index += bt_num_of_elems;
+  if (!(src_bigger_dst) && (src_bt_index > dst_bt_index))
+dst_bt_index += bt_num_of_elems;
+  src += src_bt_index;
+  dst += dst_bt_index;
+  char *realign = (char *) src;
+  realign += src_align;
+  src = (void **) realign;
+  realign = (char *) dst;
+  realign += src_align;
+  dst = (void **) realign;
+  if (pointers_inside)
+{
+  for (int i = 0; i < n; i++)
+src[i] = __bnd_set_ptr_bounds (arr + i, i * sizeof (void *) + 1);
+}
+  memmove (dst, src, n * sizeof (void *));
+  if (pointers_inside)
+{
+  for (int i = 0; i < n; i++)
+{
+  if (dst[

Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-23 Thread Jason Merrill


On 11/23/2015 12:07 PM, Marek Polacek wrote:

On Mon, Nov 23, 2015 at 05:57:54PM +0100, Jakub Jelinek wrote:

On Mon, Nov 23, 2015 at 11:53:40AM -0500, David Malcolm wrote:

Does the following look like the kind of thing you had in mind?  (just
the tree.def part for now).   Presumably usable for both lvalues and
rvalues, where the thing it wraps is what's important.  It merely exists
to add an EXPR_LOCATION, for a usage of the wrapped thing.


Yes, but please see with Jason, Richard and perhaps others if they are ok
with that too before spending too much time in that direction.
All occurrences of it would have to be folded away during the gimplification
at latest, this shouldn't be something we use in the middle-end.


I'd expect LOCATION_EXPR be defined in c-family/c-common.def, not tree.def.
And I'd think it shouldn't survive genericizing, thus never leak into the ME.


Makes sense.

Jason

[PATCH] Don't lower VEC_PERM_EXPR if it can be expanded using vec_shr optab (PR target/68483)

2015-11-23 Thread Jakub Jelinek

Hi!

The patches that removed VEC_RSHIFT_EXPR regressed the first of these
testcases on i?86/-msse2, because can_vec_perm_p returns false for that,
and indeed as can_vec_perm_p is given only the mode and mask indices,
there is nothing it can do about it.  The former VEC_RSHIFT_EXPR
is a special VEC_PERM_EXPR with zero (bitwise, so not -0.0) as second
argument and we can use vec_shr in that case.  The expander knows that, but
veclower hasn't been taught about that, which is what this patch does.

The patch also fixes up the shift_amt_for_vec_perm_mask function,
if the first index is >= nelt, then it certainly is not a vector shift, but
all zeros result (we should have folded it), plus when first is < nelt,
then it doesn't make sense to mask the result, even for first == nelt - 1
first + nelt - 1 is <= 2 * nelt - 1.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5.3?

2015-11-23  Jakub Jelinek  

PR target/68483
* tree-vect-generic.c (lower_vec_perm): If VEC_PERM_EXPR
is valid vec_shr pattern, don't lower it even if can_vec_perm_p
returns false.
* optabs.c (shift_amt_for_vec_perm_mask): Return NULL_RTX
whenever first is nelt or above.  Don't mask expected with
2 * nelt - 1.

* gcc.target/i386/pr68483-1.c: New test.
* gcc.target/i386/pr68483-2.c: New test.

--- gcc/tree-vect-generic.c.jj  2015-11-23 13:29:41.959236201 +0100
+++ gcc/tree-vect-generic.c 2015-11-23 14:13:10.378094173 +0100
@@ -1272,6 +1272,30 @@ lower_vec_perm (gimple_stmt_iterator *gs
  update_stmt (stmt);
  return;
}
+  /* Also detect vec_shr pattern - VEC_PERM_EXPR with zero
+vector as VEC1 and a right element shift MASK.  */
+  if (optab_handler (vec_shr_optab, TYPE_MODE (vect_type))
+ != CODE_FOR_nothing
+ && TREE_CODE (vec1) == VECTOR_CST
+ && initializer_zerop (vec1)
+ && sel_int[0]
+ && sel_int[0] < elements)
+   {
+ for (i = 1; i < elements; ++i)
+   {
+ unsigned int expected = i + sel_int[0];
+ /* Indices into the second vector are all equivalent.  */
+ if (MIN (elements, (unsigned) sel_int[i])
+ != MIN (elements, expected))
+   break;
+   }
+ if (i == elements)
+   {
+ gimple_assign_set_rhs3 (stmt, mask);
+ update_stmt (stmt);
+ return;
+   }
+   }
 }
   else if (can_vec_perm_p (TYPE_MODE (vect_type), true, NULL))
 return;
--- gcc/optabs.c.jj 2015-11-23 13:29:41.706239800 +0100
+++ gcc/optabs.c2015-11-23 13:33:14.857205132 +0100
@@ -5232,12 +5232,12 @@ shift_amt_for_vec_perm_mask (rtx sel)
 return NULL_RTX;
 
   first = INTVAL (CONST_VECTOR_ELT (sel, 0));
-  if (first >= 2*nelt)
+  if (first >= nelt)
 return NULL_RTX;
   for (i = 1; i < nelt; i++)
 {
   int idx = INTVAL (CONST_VECTOR_ELT (sel, i));
-  unsigned int expected = (i + first) & (2 * nelt - 1);
+  unsigned int expected = i + first;
   /* Indices into the second vector are all equivalent.  */
   if (idx < 0 || (MIN (nelt, (unsigned) idx) != MIN (nelt, expected)))
return NULL_RTX;
--- gcc/testsuite/gcc.target/i386/pr68483-1.c.jj2015-11-23 
14:27:54.213534756 +0100
+++ gcc/testsuite/gcc.target/i386/pr68483-1.c   2015-11-23 14:33:57.810362424 
+0100
@@ -0,0 +1,22 @@
+/* PR target/68483 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse2 -mno-sse3" } */
+
+void
+test (int *input, int *out, unsigned x1, unsigned x2)
+{
+  unsigned i, j;
+  unsigned end = x1;
+
+  for (i = j = 0; i < 1000; i++)
+{
+  int sum = 0;
+  end += x2;
+  for (; j < end; j++)
+   sum += input[j];
+  out[i] = sum;
+}
+}
+
+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(8,|, 8)" { target ia32 } } } */
+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(4,|, 4)" { target ia32 } } } */
--- gcc/testsuite/gcc.target/i386/pr68483-2.c.jj2015-11-23 
14:33:22.436865628 +0100
+++ gcc/testsuite/gcc.target/i386/pr68483-2.c   2015-11-23 14:34:33.716851638 
+0100
@@ -0,0 +1,15 @@
+/* PR target/68483 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mno-sse3" } */
+
+typedef int V __attribute__((vector_size (16)));
+
+void
+foo (V *a, V *b)
+{
+  V c = { 0, 0, 0, 0 };
+  V d = { 1, 2, 3, 4 };
+  *a = __builtin_shuffle (*b, c, d);
+}
+
+/* { dg-final { scan-assembler "psrldq\[^\n\r]*(4,|, 4)" } } */

Jakub

[PATCH 2/3] [graphite] fix PR68279: bail out when scev gets instantiated to not_known

2015-11-23 Thread Sebastian Pop

---
 gcc/graphite-poly.c|   1 -
 gcc/graphite-poly.h|   6 +-
 gcc/graphite-sese-to-poly.c| 103 ++---
 gcc/graphite-sese-to-poly.h|  26 ---
 gcc/graphite.c |  27 +++
 gcc/sese.c |   2 +
 gcc/testsuite/gfortran.dg/graphite/pr68279.f90 |  28 +++
 7 files changed, 118 insertions(+), 75 deletions(-)
 delete mode 100644 gcc/graphite-sese-to-poly.h
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/pr68279.f90

diff --git a/gcc/graphite-poly.c b/gcc/graphite-poly.c
index 809670a..bc6befa 100644
--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -307,7 +307,6 @@ new_scop (edge entry, edge exit)
   scop_set_region (scop, region);
   scop->original_schedule = NULL;
   scop->pbbs.create (3);
-  scop->poly_scop_p = false;
   scop->drs.create (3);
 
   return scop;
diff --git a/gcc/graphite-poly.h b/gcc/graphite-poly.h
index 3fcbbaf..dec7fd6 100644
--- a/gcc/graphite-poly.h
+++ b/gcc/graphite-poly.h
@@ -420,10 +420,6 @@ struct scop
 
   /* Original schedule of the SCoP.  */
   isl_union_map *original_schedule;
-
-  /* True when the scop has been converted to its polyhedral
- representation.  */
-  bool poly_scop_p;
 };
 
 extern scop_p new_scop (edge, edge);
@@ -468,4 +464,6 @@ carries_deps (__isl_keep isl_union_map *schedule,
  __isl_keep isl_union_map *deps,
  int depth);
 
+bool build_poly_scop (scop_p);
+
 #endif
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 83acc4a..2054fad 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -65,7 +65,6 @@ extern "C" {
 #include "domwalk.h"
 #include "graphite-poly.h"
 #include "tree-ssa-propagate.h"
-#include "graphite-sese-to-poly.h"
 
 /* Assigns to RES the value of the INTEGER_CST T.  */
 
@@ -589,10 +588,20 @@ set_scop_parameter_dim (scop_p scop)
   scop->param_context = isl_set_universe (space);
 }
 
+static inline bool
+cleanup_loop_iter_dom (isl_set *inner, isl_set *outer, isl_space *space, mpz_t 
g)
+{
+  isl_set_free (inner);
+  isl_set_free (outer);
+  isl_space_free (space);
+  mpz_clear (g);
+  return false;
+}
+
 /* Builds the constraint polyhedra for LOOP in SCOP.  OUTER_PH gives
the constraints for the surrounding loops.  */
 
-static void
+static bool
 build_loop_iteration_domains (scop_p scop, struct loop *loop,
   int nb,
  isl_set *outer, isl_set **doms)
@@ -637,11 +646,17 @@ build_loop_iteration_domains (scop_p scop, struct loop 
*loop,
 
   nb_iters = scalar_evolution_in_region (region, loop, nb_iters);
 
+  /* Bail out as we do not know the scev.  */
+  if (chrec_contains_undetermined (nb_iters))
+   return cleanup_loop_iter_dom (inner, outer, space, g);
+
   aff = extract_affine (scop, nb_iters, isl_set_get_space (inner));
   isl_set *valid = isl_pw_aff_nonneg_set (isl_pw_aff_copy (aff));
   valid = isl_set_project_out (valid, isl_dim_set, 0,
   isl_set_dim (valid, isl_dim_set));
-  scop->param_context = isl_set_intersect (scop->param_context, valid);
+
+  if (valid)
+   scop->param_context = isl_set_intersect (scop->param_context, valid);
 
   isl_local_space *ls = isl_local_space_from_space (isl_space_copy 
(space));
   isl_aff *al = isl_aff_set_coefficient_si (isl_aff_zero_on_domain (ls),
@@ -685,21 +700,24 @@ build_loop_iteration_domains (scop_p scop, struct loop 
*loop,
   else
 gcc_unreachable ();
 
-  if (loop->inner)
-build_loop_iteration_domains (scop, loop->inner, nb + 1,
- isl_set_copy (inner), doms);
+  if (loop->inner
+  && !build_loop_iteration_domains (scop, loop->inner, nb + 1,
+   isl_set_copy (inner), doms))
+return cleanup_loop_iter_dom (inner, outer, space, g);
 
   if (nb != 0
   && loop->next
-  && loop_in_sese_p (loop->next, region))
-build_loop_iteration_domains (scop, loop->next, nb,
- isl_set_copy (outer), doms);
+  && loop_in_sese_p (loop->next, region)
+  && !build_loop_iteration_domains (scop, loop->next, nb,
+   isl_set_copy (outer), doms))
+return cleanup_loop_iter_dom (inner, outer, space, g);
 
   doms[loop->num] = inner;
 
   isl_set_free (outer);
   isl_space_free (space);
   mpz_clear (g);
+  return true;
 }
 
 /* Returns a linear expression for tree T evaluated in PBB.  */
@@ -710,6 +728,11 @@ create_pw_aff_from_tree (poly_bb_p pbb, tree t)
   scop_p scop = PBB_SCOP (pbb);
 
   t = scalar_evolution_in_region (scop->scop_info->region, pbb_loop (pbb), t);
+
+  /* Bail out as we do not know the scev.  */
+  if (chrec_contains_undetermined (t))
+return NULL;
+
   gcc_assert (!automatically_generated_chrec_p (t));
 
   return ex

[PATCH 1/3] [graphite] call update_ssa once

2015-11-23 Thread Sebastian Pop

---
 gcc/graphite-isl-ast-to-gimple.c | 9 -
 gcc/sese.c   | 4 
 2 files changed, 13 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 557c44c..30c3a21 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1022,15 +1022,6 @@ translate_isl_ast_node_user (__isl_keep isl_ast_node 
*node,
   print_loops_bb (dump_file, next_e->src, 0, 3);
 }
 
-  mark_virtual_operands_for_renaming (cfun);
-  update_ssa (TODO_update_ssa);
-
-  if (dump_file)
-{
-  fprintf (dump_file, "\n[codegen] (after update SSA) new basic block\n");
-  print_loops_bb (dump_file, next_e->src, 0, 3);
-}
-
   return next_e;
 }
 
diff --git a/gcc/sese.c b/gcc/sese.c
index b5da428..9b932ce 100644
--- a/gcc/sese.c
+++ b/gcc/sese.c
@@ -301,8 +301,6 @@ sese_insert_phis_for_liveouts (sese_info_p region, 
basic_block bb,
   bitmap_iterator bi;
   bitmap liveouts = BITMAP_ALLOC (NULL);
 
-  update_ssa (TODO_update_ssa);
-
   sese_build_liveouts (region, liveouts);
 
   EXECUTE_IF_SET_IN_BITMAP (liveouts, 0, i, bi)
@@ -310,8 +308,6 @@ sese_insert_phis_for_liveouts (sese_info_p region, 
basic_block bb,
   sese_add_exit_phis_edge (bb, ssa_name (i), false_e, true_e);
 
   BITMAP_FREE (liveouts);
-
-  update_ssa (TODO_update_ssa);
 }
 
 /* Returns the outermost loop in SCOP that contains BB.  */
-- 
1.9.1

[PATCH 3/3] [graphite] fix PR68493: bail out when codegen_error is set

2015-11-23 Thread Sebastian Pop

---
 gcc/graphite-isl-ast-to-gimple.c|  2 ++
 gcc/testsuite/gcc.dg/graphite/pr68493.c | 34 +
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/graphite/pr68493.c

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 30c3a21..2783ac4 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -2760,6 +2760,8 @@ translate_isl_ast_to_gimple::translate_pending_phi_nodes 
()
  fprintf (dump_file, "[codegen] to new-phi: ");
  print_gimple_stmt (dump_file, new_phi, 0, 0);
}
+  if (codegen_error)
+   return;
 }
 }
 
diff --git a/gcc/testsuite/gcc.dg/graphite/pr68493.c 
b/gcc/testsuite/gcc.dg/graphite/pr68493.c
new file mode 100644
index 000..95f3699
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr68493.c
@@ -0,0 +1,34 @@
+/* { dg-options "-O1 -floop-nest-optimize" } */
+
+int ce[2];
+int o5;
+int p7;
+
+int foo (void)
+{
+  int j1;
+  ce[0] = 0;
+  for (j1 = 0; j1 < 2; ++j1)
+for (o5 = 1; o5 >= 0; --o5)
+  p7 += ce[o5];
+  return 0;
+}
+
+int du;
+
+int bar (void)
+{
+  int u7[2];
+  int ar;
+
+  for (ar = 0; ar < 2; ++ar) {
+int xo;
+
+for (xo = 0; xo < 2; ++xo) {
+  du += u7[ar];
+  u7[0] = 0;
+}
+  }
+
+  return 0;
+}
-- 
1.9.1

Re: RFA: PATCH to match.pd for c++/68385

2015-11-23 Thread Jason Merrill


On 11/21/2015 01:30 AM, Richard Biener wrote:

What happens if we remove the nops stripping from integer_zerop?  Do other 
integer predicates strip nops?


Many predicates do, but removing that doesn't break anything in the 
testsuite.  So, how about this?



commit b4714ac166ce22b54e89ebb860d52637a210c550
Author: Jason Merrill 
Date:   Sat Nov 21 07:45:01 2015 -0500

	PR c++/68385

	* tree.c (integer_zerop, integer_onep, integer_each_onep)
	(integer_all_onesp, integer_minus_onep, integer_pow2p)
	(integer_nonzerop, integer_truep, tree_log2, tree_floor_log2)
	(real_zerop, real_onep, real_minus_onep): Remove STRIP_NOPS.

diff --git a/gcc/tree.c b/gcc/tree.c
index 779fe93..01b2aa8 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -2273,8 +2273,6 @@ zerop (const_tree expr)
 int
 integer_zerop (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   switch (TREE_CODE (expr))
 {
 case INTEGER_CST:
@@ -2301,8 +2299,6 @@ integer_zerop (const_tree expr)
 int
 integer_onep (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   switch (TREE_CODE (expr))
 {
 case INTEGER_CST:
@@ -2329,8 +2325,6 @@ integer_onep (const_tree expr)
 int
 integer_each_onep (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == COMPLEX_CST)
 return (integer_onep (TREE_REALPART (expr))
 	&& integer_onep (TREE_IMAGPART (expr)));
@@ -2344,8 +2338,6 @@ integer_each_onep (const_tree expr)
 int
 integer_all_onesp (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == COMPLEX_CST
   && integer_all_onesp (TREE_REALPART (expr))
   && integer_all_onesp (TREE_IMAGPART (expr)))
@@ -2371,8 +2363,6 @@ integer_all_onesp (const_tree expr)
 int
 integer_minus_onep (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == COMPLEX_CST)
 return (integer_all_onesp (TREE_REALPART (expr))
 	&& integer_zerop (TREE_IMAGPART (expr)));
@@ -2386,8 +2376,6 @@ integer_minus_onep (const_tree expr)
 int
 integer_pow2p (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == COMPLEX_CST
   && integer_pow2p (TREE_REALPART (expr))
   && integer_zerop (TREE_IMAGPART (expr)))
@@ -2405,8 +2393,6 @@ integer_pow2p (const_tree expr)
 int
 integer_nonzerop (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   return ((TREE_CODE (expr) == INTEGER_CST
 	   && !wi::eq_p (expr, 0))
 	  || (TREE_CODE (expr) == COMPLEX_CST
@@ -2421,8 +2407,6 @@ integer_nonzerop (const_tree expr)
 int
 integer_truep (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == VECTOR_CST)
 return integer_all_onesp (expr);
   return integer_onep (expr);
@@ -2443,8 +2427,6 @@ fixed_zerop (const_tree expr)
 int
 tree_log2 (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == COMPLEX_CST)
 return tree_log2 (TREE_REALPART (expr));
 
@@ -2457,8 +2439,6 @@ tree_log2 (const_tree expr)
 int
 tree_floor_log2 (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   if (TREE_CODE (expr) == COMPLEX_CST)
 return tree_log2 (TREE_REALPART (expr));
 
@@ -2582,8 +2562,6 @@ tree_ctz (const_tree expr)
 int
 real_zerop (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   switch (TREE_CODE (expr))
 {
 case REAL_CST:
@@ -2612,8 +2590,6 @@ real_zerop (const_tree expr)
 int
 real_onep (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   switch (TREE_CODE (expr))
 {
 case REAL_CST:
@@ -2641,8 +2617,6 @@ real_onep (const_tree expr)
 int
 real_minus_onep (const_tree expr)
 {
-  STRIP_NOPS (expr);
-
   switch (TREE_CODE (expr))
 {
 case REAL_CST:

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Jan Hubicka

BTW for the LTO type merging issues one could probably just drop those types
and all derivations to alias set 0. But indeed rewriting them to pointers would
be better, especially for ABI compatibility.

The Ada ICE I get is:
Continuing.
+===GNAT BUG DETECTED==+
| 6.0.0 20151122 (experimental) (x86_64-pc-linux-gnu) Assert_Failure 
atree.adb:6776|
| Error detected at system.ads:107:4   |
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.|
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.
Consider also -gnatd.n switch (see debug.adb).

../../gcc/ada/system.ads
../../gcc/ada/a-except.adb
../../gcc/ada/a-except.ads
../../gcc/ada/ada.ads
../../gcc/ada/s-parame.ads
../../gcc/ada/s-stalib.ads
../../gcc/ada/a-unccon.ads
../../gcc/ada/s-traent.ads
../../gcc/ada/s-excdeb.ads
../../gcc/ada/s-soflin.ads
../../gcc/ada/s-stache.ads
../../gcc/ada/s-stoele.ads

compilation abandoned

(gdb) bt
#0  atree__unchecked_access__set_flag96.part.697.lto_priv.6676 () at 
../../gcc/ada/atree.adb:6776
#1  0x01711774 in atree__unchecked_access__set_flag96 (n=, val=) at ../../gcc/ada/atree.adb:6774
#2  0x0126a95c in einfo.set_warnings_off (v=, id=0) at 
../../gcc/ada/einfo.adb:6435
#3  sem_prag.analyze_pragma () at ../../gcc/ada/sem_prag.adb:22879
#4  0x00989893 in sem.analyze (n=12466) at ../../gcc/ada/sem.adb:456
#5  0x00cac089 in sem_ch3.analyze_declarations (l=-8775) at 
../../gcc/ada/sem_ch3.adb:2323
#6  0x0134e4d5 in sem_ch7.analyze_package_specification () at 
../../gcc/ada/sem_ch7.adb:1395
#7  0x009898ab in sem.analyze (n=12078) at ../../gcc/ada/sem.adb:450
#8  0x013517d8 in sem_ch7.analyze_package_declaration (n=12875) at 
../../gcc/ada/sem_ch7.adb:1006
#9  0x00989e89 in sem.analyze (n=n@entry=12875) at 
../../gcc/ada/sem.adb:441
#10 0x00998d6d in sem_ch10.analyze_compilation_unit (n=n@entry=12067) 
at ../../gcc/ada/sem_ch10.adb:892
#11 0x00989947 in sem.analyze (n=n@entry=12067) at 
../../gcc/ada/sem.adb:174
#12 0x0099760f in sem.semantics.do_analyze () at 
../../gcc/ada/sem.adb:1337
#13 sem.semantics () at ../../gcc/ada/sem.adb:1517
#14 0x00998039 in sem_ch10.analyze_with_clause (n=n@entry=2286) at 
../../gcc/ada/sem_ch10.adb:2540
#15 0x00989a7f in sem.analyze (n=n@entry=2286) at 
../../gcc/ada/sem.adb:601
#16 0x00991e67 in sem_ch10.analyze_context (n=n@entry=2284) at 
../../gcc/ada/sem_ch10.adb:1371
#17 0x00998cb0 in sem_ch10.analyze_compilation_unit (n=n@entry=2284) at 
../../gcc/ada/sem_ch10.adb:686
#18 0x00989947 in sem.analyze (n=n@entry=2284) at 
../../gcc/ada/sem.adb:174
#19 0x0099760f in sem.semantics.do_analyze () at 
../../gcc/ada/sem.adb:1337
#20 sem.semantics () at ../../gcc/ada/sem.adb:1517
#21 0x0090e5f9 in frontend () at ../../gcc/ada/frontend.adb:408
#22 0x0146de0a in _ada_gnat1drv () at ../../gcc/ada/gnat1drv.adb:1029
#23 0x006f579e in gnat_parse_file() [clone .lto_priv.5151] () at 
../../gcc/ada/gcc-interface/misc.c:121
#24 0x016f723c in compile_file () at ../../gcc/toplev.c:464
#25 0x0068996e in do_compile () at ../../gcc/toplev.c:1951
#26 toplev::main (this=this@entry=0x7fffe850, argc=argc@entry=39, 
argv=argv@entry=0x7fffe958) at ../../gcc/toplev.c:2058
#27 0x00688e29 in main (argc=39, argv=0x7fffe958) at 
../../gcc/main.c:39

If you have any clue how to debug it further, I would be happy to try.
That atree code is real software engineering treat BTW

Honza

Re: [PATCH] PR c/68473: sanitize source range-printing within certain macro expansions

2015-11-23 Thread David Malcolm

On Mon, 2015-11-23 at 18:59 +0100, Bernd Schmidt wrote:
> On 11/23/2015 06:52 PM, David Malcolm wrote:
> > This patch fixes PR c/68473 by bulletproofing the new
> > diagnostic_show_locus implementation against ranges that finish before
> > they start (which can happen when using the C preprocessor), falling
> > back to simply printing a caret.
> 
> Hmm, wouldn't it be better to avoid such a situation? Can you describe a 
> bit more how exactly the macro expansion caused such a situation?

The issue is here:

 1  /* { dg-options "-fdiagnostics-show-caret -mno-fp-ret-in-387" } */
 2  
 3  extern long double fminl (long double __x, long double __y);
 4  
 5  #define TEST_EQ(FUNC) do { \
 6if ((long)FUNC##l(xl,xl) != (long)xl) \
 7  return; \
 8} while (0)
 9  
10  void
11  foo (long double xl)
12  {
13TEST_EQ (fmin); /* { dg-error "x87 register return with x87 disabled" 
} */
14  }


16  /* { dg-begin-multiline-output "" }
17 TEST_EQ (fmin);
18  ^
19 { dg-end-multiline-output "" } */
20  
21  /* { dg-begin-multiline-output "" }
22 if ((long)FUNC##l(xl,xl) != (long)xl) \
23   ^~~~
24 { dg-end-multiline-output "" } */

An error is emitted whilst expanding the macro at line 13, at
input_location.

This is at the expansion of this function call:

   fminl (xl, xl)

Normally we'd emit a source range like this for a function call:

   fminl (xl, xl)
   ^~

However, once we fully resolve locations, the "fmin" part of "fminl"
appears at line 13 here:

13TEST_EQ (fmin);
   ^~~~

giving the location of the caret, and start of the range, whereas the
rest of the the call is spelled here:

 6if ((long)FUNC##l(xl,xl) != (long)xl) \
   ~~~

where the close paren gives the end of the range.

It would be wrong to try to print the whole range (anything might be
between lines 6 and 13).

In theory we could attempt to try to handle this kind of thing by
looking at the macro expansions, and to print something like:

13TEST_EQ (fmin);
   ^~~~
 6if ((long)FUNC##l(xl,xl) != (long)xl) \
  

or whatnot, but that strikes me as error-prone at this stage.


The patch instead detects such a situation, and tries to handle things
gracefully by falling back to simply printing a caret, without any
underlines:

pr68473-1.c: In function ‘foo’:
pr68473-1.c:13:12: error: x87 register return with x87 disabled
   TEST_EQ (fmin);
^

pr68473-1.c:6:13: note: in definition of macro ‘TEST_EQ’
   if ((long)FUNC##l(xl,xl) != (long)xl) \
 ^~~~


Dave

Re: update zlib to 1.2.8

2015-11-23 Thread Joel Brobecker

> In GCC zlib is only used for libjava; for binutils and gdb it is used when
> building without --with-system-zlib.  This just updates zlib from 1.2.7 to
> 1.2.8 (released in 2013).  Applies cleanly, libjava still builds and doesn't
> show any regressions in the testsuite.  Ok to apply (even if we already are
> in stage3)?

> +2015-11-23  Matthias Klose  
> +
> +   * Imported zlib 1.2.8; merged local changes.

Should not be a problem for GDB, since we're not near branching time.

Out of curiosity, what prompted this update? Just to be in sync with
the latest? Or was there an actual bug that you hit which 1.2.8 fixes?

-- 
Joel

Re: [PATCH] PR c/68473: sanitize source range-printing within certain macro expansions

2015-11-23 Thread Bernd Schmidt


On 11/23/2015 06:52 PM, David Malcolm wrote:

This patch fixes PR c/68473 by bulletproofing the new
diagnostic_show_locus implementation against ranges that finish before
they start (which can happen when using the C preprocessor), falling
back to simply printing a caret.


Hmm, wouldn't it be better to avoid such a situation? Can you describe a 
bit more how exactly the macro expansion caused such a situation?



Bernd

[PATCH] PR c/68473: sanitize source range-printing within certain macro expansions

2015-11-23 Thread David Malcolm

This patch fixes PR c/68473 by bulletproofing the new
diagnostic_show_locus implementation against ranges that finish before
they start (which can happen when using the C preprocessor), falling
back to simply printing a caret.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu; adds 7 new
PASS results to gcc.sum.

OK for trunk?

gcc/ChangeLog:
PR c/68473
* diagnostic-show-locus.c (layout::layout): Make loc_range const.
Sanitize the layout_range against ranges that finish before they
start.

gcc/testsuite/ChangeLog:
PR c/68473
* gcc.dg/plugin/diagnostic-test-expressions-1.c (fminl): New decl.
(TEST_EQ): New macro.
(test_macro): New function.
* gcc.target/i386/pr68473-1.c: New test case.
---
 gcc/diagnostic-show-locus.c| 34 ++
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 17 +++
 gcc/testsuite/gcc.target/i386/pr68473-1.c  | 24 +++
 3 files changed, 69 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr68473-1.c

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 9e51b95..968b3cb 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -444,7 +444,7 @@ layout::layout (diagnostic_context * context,
 {
   /* This diagnostic printer can only cope with "sufficiently sane" ranges.
 Ignore any ranges that are awkward to handle.  */
-  location_range *loc_range = richloc->get_range (idx);
+  const location_range *loc_range = richloc->get_range (idx);
 
   /* If any part of the range isn't in the same file as the primary
 location of this diagnostic, ignore the range.  */
@@ -456,16 +456,38 @@ layout::layout (diagnostic_context * context,
if (loc_range->m_caret.file != m_exploc.file)
  continue;
 
+  /* Everything is now known to be in the correct source file,
+but it may require further sanitization.  */
+  layout_range ri (loc_range);
+
+  /* If we have a range that finishes before it starts (perhaps
+from something built via macro expansion), printing the
+range is likely to be nonsensical.  Also, attempting to do so
+breaks assumptions within the printing code  (PR c/68473).  */
+  if (loc_range->m_start.line > loc_range->m_finish.line)
+   {
+ /* Is this the primary location?  */
+ if (m_layout_ranges.length () == 0)
+   {
+ /* We want to print the caret for the primary location, but
+we must sanitize away m_start and m_finish.  */
+ ri.m_start = ri.m_caret;
+ ri.m_finish = ri.m_caret;
+   }
+ else
+   /* This is a non-primary range; ignore it.  */
+   continue;
+   }
+
   /* Passed all the tests; add the range to m_layout_ranges so that
 it will be printed.  */
-  layout_range ri (loc_range);
   m_layout_ranges.safe_push (ri);
 
   /* Update m_first_line/m_last_line if necessary.  */
-  if (loc_range->m_start.line < m_first_line)
-   m_first_line = loc_range->m_start.line;
-  if (loc_range->m_finish.line > m_last_line)
-   m_last_line = loc_range->m_finish.line;
+  if (ri.m_start.m_line < m_first_line)
+   m_first_line = ri.m_start.m_line;
+  if (ri.m_finish.m_line > m_last_line)
+   m_last_line = ri.m_finish.m_line;
 }
 
   /* Adjust m_x_offset.
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c 
b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
index 0d8c7c5..3e38035 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -541,3 +541,20 @@ void test_quadratic (double a, double b, double c)
{ dg-end-multiline-output "" } */
 
 }
+
+/* Reproducer for PR c/68473.  */
+
+extern long double fminl (long double __x, long double __y);
+#define TEST_EQ(FUNC) FUNC##l(xl,xl)
+void test_macro (long double xl)
+{
+  __emit_expression_range (0, TEST_EQ (fmin) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, TEST_EQ (fmin) );
+^
+   { dg-end-multiline-output "" } */
+/* { dg-begin-multiline-output "" }
+ #define TEST_EQ(FUNC) FUNC##l(xl,xl)
+   ^~~~
+   { dg-end-multiline-output "" } */
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr68473-1.c 
b/gcc/testsuite/gcc.target/i386/pr68473-1.c
new file mode 100644
index 000..aa7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr68473-1.c
@@ -0,0 +1,24 @@
+/* { dg-options "-fdiagnostics-show-caret -mno-fp-ret-in-387" } */
+
+extern long double fminl (long double __x, long double __y);
+
+#define TEST_EQ(FUNC) do { \
+  if ((long)FUNC##l(xl,xl) != (long)xl) \
+return; \
+  } while (0)
+
+void
+foo (long double xl)
+{
+  TEST_EQ (fmin

Re: update zlib to 1.2.8

2015-11-23 Thread Jeff Law


On 11/22/2015 09:37 PM, Matthias Klose wrote:

In GCC zlib is only used for libjava; for binutils and gdb it is used
when building without --with-system-zlib.  This just updates zlib from
1.2.7 to 1.2.8 (released in 2013).  Applies cleanly, libjava still
builds and doesn't show any regressions in the testsuite.  Ok to apply
(even if we already are in stage3)?

Matthias

zlib/ChangeLog.gcj

+2015-11-23  Matthias Klose  
+
+   * Imported zlib 1.2.8; merged local changes.
OK, though it's really late for this.  My biggest worry here given GCJ's 
status is build failures on the lesser used hosts.   If it causes 
significant problems we'll have to revert.


Jeff

Re: [PATCH] Fix PR objc/68438 (uninitialized source ranges)

2015-11-23 Thread Jeff Law


On 11/23/2015 04:13 AM, Joseph Myers wrote:

On Sun, 22 Nov 2015, David Malcolm wrote:


Is there (or could there be) a precanned dg- directive to ask if ObjC is
available?


I don't think so.  Normal practice is that each language's tests are in
appropriate directories for that language, with runtest never called with
a --tool option for that language if it wasn't built.
Right.  Which argues that we really want to create a new test directory 
for objc plugin tests.


jeff

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Jan Hubicka

> > You are right, TYPE_NONALIASED_COMPONENT is the wrong way.  I will fix it
> > and try to come up with a testcase (TYPE_NONALIASED_COMPONENT is quite
> > rarely used beast)
> 
> It's only used in Ada as far as I know, but is quite sensitive and quickly 
> leads to wrong code if not handled properly in my experience, so this could 
> well be responsible for the gnat1 miscompilation.

Build fialed same way before my patch. Moreover the problem can only happen on
array of pointers that are type punned (i.e.  using store like 
(void *[r])array = [NULL, NULL, NULL]
and then accessing it as (int *[r]) array.   I do not think C or Ada can produce
code like that.

I am re-testing with the fix to TYPE_NONALIASED_COMPONENT arrays I explained in
other email. Perhaps that helps, or perhaps it is one of those Ada/C type puning
glues getting miscompiled?  Other Ada binaries (gnatbind) seems to work fine.
I will post backtrace once my testing gets to the ICE again.

Honza
> 
> -- 
> Eric Botcazou

Re: [PATCH] Don't ICE on symbolic ranges in VRP (PR tree-optimization/68455)

2015-11-23 Thread Marek Polacek

On Mon, Nov 23, 2015 at 05:40:14PM +0100, Richard Biener wrote:
> On November 23, 2015 5:31:11 PM GMT+01:00, Marek Polacek  
> wrote:
> >We blow up on the following testcase because we find ourselves passing
> >[_13 + 1, INT_MAX] as a vr1 to extract_range_from_multiplicative_op_1;
> >that's bad because this function immediately calls vrp_int_const_binop
> >which just doesn't work for symbolic ranges, it only wants int_csts.
> >
> >This started with Richards S.'s changes in r228614 -- we're now since
> >able to recurse into SSA names, thus get better info about ranges.
> >That means that range_includes_zero_p in
> >extract_range_from_binary_expr_1
> >for the *_DIV_EXPR cases was able to determine that the range doesn't
> >include zero, so we went through a different code path and ended up
> >calling extract_range_from_multiplicative_op_1 even with symbolic
> >ranges.
> >
> >I couldn't come up with anything better than checking that we're
> >dealing
> >with nonsymbolic ranges for such a case.
> >
> >Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> Hmm.  I think we can do better if vr0 is symbolical - use min, max for it.
> 
> I suppose it would be best to implement a get_integer_range () function doing 
> that or also looking at equivalences if we are getting a symbolic range.
> 
> Anyway, those are future enhancements that shouldn't block this patch.
 
Is this something for this stage3?  Or should I open a PR and fix it in the
next stage1?

> Thus OK.

Thanks.

Marek

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Richard Biener

On November 23, 2015 5:50:25 PM GMT+01:00, Jan Hubicka  wrote:
>> 
>> I think it also causes the following and one related ICE
>> 
>> FAIL: gcc.dg/vect/pr62021.c -flto -ffat-lto-objects (internal
>compiler 
>> error)
>> 
>>
>/space/rguenther/src/svn/trunk3/gcc/testsuite/gcc.dg/vect/pr62021.c:7:1:
>
>> internal compiler error: in get_alias_set, at alias.c:880^M
>> 0x7528a7 get_alias_set(tree_node*)^M
>> /space/rguenther/src/svn/trunk3/gcc/alias.c:880^M
>
>Does this one reproduce with mainline?

Yes, testsuite run on x86_64

 All thee ICEs are on the sanity
>check:
>gcc_checking_assert (!in_lto_p || !type_with_alias_set_p (t));
>which check that in LTO all types that ought to have CANONICAL_TYPE
>gets CANONICAL_TYPE
>computed.  ICE here probalby means that the type somehow bypassed LTO
>canonical type merging
>as well as the TYPE_CANONICAL=MAIN_VARIANT set in lto-streamer.c
>
>Honza

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Jan Hubicka

> 
> Please in future leave patches for review again if you do such
> big changes before committing...

Uhm, sorry, next time I will be more cureful.  It seemed rather easy after
debugging it but indeed it is somewhat surprising issue.
> 
> I don't understand why we need this (testcase?) because get_alias_set ()
> is supposed to do the alias-set of pointer globbing for us.

The situation is as follows.  You can have

struct a {
  int *ptr;
}

struct b {
  float *ptr;
}

Now, if becase type is ignored by TYPE_CANONICAL, we have

   TYPE_CANONICAL (sturct b) = struct a.

and while building alias set of "struct a" we record compoents as:

   struct a
  ^
  |
int *

Now do

struct b b = {NULL};
float *p=&b->ptr;
*p=nonnull;
return b.ptr;

Now alias set of the initialization is alias set of "struct a". The alias set
of of the pointer store is "float *".  We ask alias oracle if "struct a" can
conflict with "float *" and answer is no, because "int *" (component of struct
b) and "float *" are not conflicting.  With the change we record component
alias as follows:

   struct a
  ^
  |
   void *

which makes the answer to be correct, because "int *" conflicts with all 
pointers,
so all such queries about a structure gimple_canonical_types_compatible with 
"struct a"
will be handled correctly.

I will add aritifical testcase for this after double checking that the code 
above
miscompiles without that conditional.

I found that having warning about TBAA incompatibility when doing merigng in
lto-symtab is great to catch issues like this.

I had this patch for quite some time, but it was producing obviously wrong
positives (in Fortran, Ada and also sometimes in Firefox on array initializes
of style int a[]={1,2,3}).  I wrongly assumed tha the bug is because we compute
TYPE_CANONICAL sensitively to array size and there are permited transitions
of array sizes between units.  THis is not the case.

Yesterday I found that the problem is with interaction of get_alias_set
globbing and gimple_canonical_types_compatible globbing.  We can't have
get_alias_set globbing more or less coarse than
gimple_canonical_types_compatible does.

Here the situation is reverse to the case above : because array type is
inherited from element type, we can't have TYPE_CANONICAL more globbed for
array than we have get_alias_set.  In this case the problem appeared when
TYPE_NONALIASED_COMPONENT array previaled in type merging other arrays.  The
situation got worse with pointer, becuase array of pointers of one type could
prevail array of pointers of other type and then the array type gets different
alias sets in different units.  This seems to work for C programs, but is
wrong.  I will send patch and separate explanation after re-testing final
version shortly.

Honza

Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-23 Thread Marek Polacek

On Mon, Nov 23, 2015 at 05:57:54PM +0100, Jakub Jelinek wrote:
> On Mon, Nov 23, 2015 at 11:53:40AM -0500, David Malcolm wrote:
> > Does the following look like the kind of thing you had in mind?  (just
> > the tree.def part for now).   Presumably usable for both lvalues and
> > rvalues, where the thing it wraps is what's important.  It merely exists
> > to add an EXPR_LOCATION, for a usage of the wrapped thing.
> 
> Yes, but please see with Jason, Richard and perhaps others if they are ok
> with that too before spending too much time in that direction.
> All occurrences of it would have to be folded away during the gimplification
> at latest, this shouldn't be something we use in the middle-end.

I'd expect LOCATION_EXPR be defined in c-family/c-common.def, not tree.def.
And I'd think it shouldn't survive genericizing, thus never leak into the ME.

Marek

Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-23 Thread Jakub Jelinek

On Mon, Nov 23, 2015 at 11:53:40AM -0500, David Malcolm wrote:
> Does the following look like the kind of thing you had in mind?  (just
> the tree.def part for now).   Presumably usable for both lvalues and
> rvalues, where the thing it wraps is what's important.  It merely exists
> to add an EXPR_LOCATION, for a usage of the wrapped thing.

Yes, but please see with Jason, Richard and perhaps others if they are ok
with that too before spending too much time in that direction.
All occurrences of it would have to be folded away during the gimplification
at latest, this shouldn't be something we use in the middle-end.

> diff --git a/gcc/tree.def b/gcc/tree.def
> index 44e5a5e..30fd766 100644
> --- a/gcc/tree.def
> +++ b/gcc/tree.def
> @@ -1407,6 +1407,13 @@ DEFTREECODE (CILK_SPAWN_STMT, "cilk_spawn_stmt", 
> tcc_statement, 1)
>  /* Cilk Sync statement: Does not have any operands.  */
>  DEFTREECODE (CILK_SYNC_STMT, "cilk_sync_stmt", tcc_statement, 0)
>  
> +/* Wrapper to add a source code location to an expression, either one
> +   that doesn't have one (such as an INTEGER_CST), or to a usage of a
> +   variable (e.g. PARAM_DECL or VAR_DECL), where we want to record
> +   the site in the source where the variable was *used* rather than
> +   where it was declared.  */
> +DEFTREECODE (LOCATION_EXPR, "location_expr", tcc_unary, 1)
> +
>  /*
>  Local variables:
>  mode:c

Jakub

Re: [PATCH/RFC] C++ FE: expression ranges (v2)

2015-11-23 Thread David Malcolm

On Mon, 2015-11-23 at 10:59 +0100, Richard Biener wrote:
> On Sat, Nov 21, 2015 at 9:21 AM, Jakub Jelinek  wrote:
> > On Sat, Nov 21, 2015 at 02:16:49AM -0500, Jason Merrill wrote:
> >> On 11/19/2015 03:46 PM, Jason Merrill wrote:
> >> >On 11/15/2015 12:01 AM, David Malcolm wrote:
> >> >>As with the C frontend, there's an issue with tree nodes that
> >> >>don't have locations: VAR_DECL, INTEGER_CST, etc:
> >> >>
> >> >>   int test (int foo)
> >> >>   {
> >> >> return foo * 100;
> >> >>^^^   ^^^
> >> >>   }
> >> >>
> >> >>where we'd like to access the source spelling ranges of the expressions
> >> >>during parsing, so that we can use them when reporting parser errors.
> >> >
> >> >Hmm, I had been thinking to address this in the C++ front end by
> >> >wrapping uses in another tree: NOP_EXPR for rvalues, VIEW_CONVERT_EXPR
> >> >for lvalues.
> >>
> >> On the other hand, my direction seems likely to cause more issues,
> >> especially with code that doesn't yet know how to handle VIEW_CONVERT_EXPR,
> >> and could create ambiguity with explicit conversions.  So I guess your
> >> approach seems reasonable.
> >
> > But your approach would allow better diagnostics even in places where you
> > don't have the structures with tree, location_t pairs around.  With that
> > it will be limited solely to the parser and nothing else, so even template
> > instantiation if it is something that can be only detected when
> > instantiating would be too late.
> >
> > I think using a new tree (rather than using NOP_EXPR/VIEW_CONVERT_EXPR)
> > that would be just some expression with location and teaching the FE and
> > folder about it might be even better.
> 
> Agreed.  Note that we already have NON_LVALUE_EXPR and fold-const.c uses
> that to stick locations on things that cannot have them.
> 
> OTOH I would like to get rid of NON_LVALUE_EXPR in the middle-end (and thus
> fold-const.c).

Thanks.

Does the following look like the kind of thing you had in mind?  (just
the tree.def part for now).   Presumably usable for both lvalues and
rvalues, where the thing it wraps is what's important.  It merely exists
to add an EXPR_LOCATION, for a usage of the wrapped thing.

diff --git a/gcc/tree.def b/gcc/tree.def
index 44e5a5e..30fd766 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1407,6 +1407,13 @@ DEFTREECODE (CILK_SPAWN_STMT, "cilk_spawn_stmt", tcc_statement, 1)
 /* Cilk Sync statement: Does not have any operands.  */
 DEFTREECODE (CILK_SYNC_STMT, "cilk_sync_stmt", tcc_statement, 0)
 
+/* Wrapper to add a source code location to an expression, either one
+   that doesn't have one (such as an INTEGER_CST), or to a usage of a
+   variable (e.g. PARAM_DECL or VAR_DECL), where we want to record
+   the site in the source where the variable was *used* rather than
+   where it was declared.  */
+DEFTREECODE (LOCATION_EXPR, "location_expr", tcc_unary, 1)
+
 /*
 Local variables:
 mode:c

Re: [PATCH 12/12] always define ENABLE_OFFLOADING

2015-11-23 Thread Ilya Verbin

On Mon, Nov 09, 2015 at 19:41:21 +0100, Bernd Schmidt wrote:
> On 11/09/2015 05:47 PM, tbsaunde+...@tbsaunde.org wrote:
> >-#ifdef ENABLE_OFFLOADING
> >/* If the user didn't specify any, default to all configured offload
> >   targets.  */
> >if (offload_targets == NULL)
> >  handle_foffload_option (OFFLOAD_TARGETS);
> >-#endif
> 
> This one I would keep guarded with an if.
> 
> Otherwise ok modulo stage 1 end.

There are 2 new uses of "#ifdef ENABLE_OFFLOADING" in c_parser_oacc_declare and
cp_parser_oacc_declare.
I don't know how to properly test OpenACC, so here is untested patch.


diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7b10764..1dc0bd5 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -13473,14 +13473,15 @@ c_parser_oacc_declare (c_parser *parser)
  if (node != NULL)
{
  node->offloadable = 1;
-#ifdef ENABLE_OFFLOADING
- g->have_offload = true;
- if (is_a  (node))
+ if (ENABLE_OFFLOADING)
{
- vec_safe_push (offload_vars, decl);
- node->force_output = 1;
+ g->have_offload = true;
+ if (is_a  (node))
+   {
+ vec_safe_push (offload_vars, decl);
+ node->force_output = 1;
+   }
}
-#endif
}
}
}
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 24ed404..a9c0a45 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -34633,14 +34633,15 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token 
*pragma_tok)
  if (node != NULL)
{
  node->offloadable = 1;
-#ifdef ENABLE_OFFLOADING
- g->have_offload = true;
- if (is_a  (node))
+ if (ENABLE_OFFLOADING)
{
- vec_safe_push (offload_vars, decl);
- node->force_output = 1;
+ g->have_offload = true;
+ if (is_a  (node))
+   {
+ vec_safe_push (offload_vars, decl);
+ node->force_output = 1;
+   }
}
-#endif
}
}
}

  -- Ilya

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Jan Hubicka

> 
> I think it also causes the following and one related ICE
> 
> FAIL: gcc.dg/vect/pr62021.c -flto -ffat-lto-objects (internal compiler 
> error)
> 
> /space/rguenther/src/svn/trunk3/gcc/testsuite/gcc.dg/vect/pr62021.c:7:1: 
> internal compiler error: in get_alias_set, at alias.c:880^M
> 0x7528a7 get_alias_set(tree_node*)^M
> /space/rguenther/src/svn/trunk3/gcc/alias.c:880^M

Does this one reproduce with mainline? All thee ICEs are on the sanity check:
gcc_checking_assert (!in_lto_p || !type_with_alias_set_p (t));
which check that in LTO all types that ought to have CANONICAL_TYPE gets 
CANONICAL_TYPE
computed.  ICE here probalby means that the type somehow bypassed LTO canonical 
type merging
as well as the TYPE_CANONICAL=MAIN_VARIANT set in lto-streamer.c

Honza

Re: [PATCH] Fix GC ICE during simd clone creation (PR middle-end/68339)

2015-11-23 Thread Jan Hubicka

> On Fri, Nov 20, 2015 at 9:03 PM, Jakub Jelinek  wrote:
> > Hi!
> >
> > node->get_body () can run various IPA passes and ggc_collect in them
> 
> Aww.  Looks like we never implemented that ggc_defer_collecting idea
> (don't remember the context this popped up, maybe it was when we
> introduced TODO_do_not_ggc_collect).  At least late IPA passes
> might be affected by this issue as well.

We used to have way to push/pop ggc context, but that was removed eventually.
I agree that get_ody doing random ggc_collect is not the best idea.  What 
get_body
is doing is application of transform passes of individual IPA passes.  We could
probably just not do ggc_collect between these? Those are not full bloat
passes, just transform methods.

Honza
> 
> Richard.
> 
> >, so
> > it is undesirable to hold pointers to GC memory in automatic vars over it.
> > While I could store those vars (clone_info, clone and id) into special GTY
> > vars just to avoid collecting them, it seems easier to call node->get_body
> > () earlier.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
> > and 5 branch.
> >
> > 2015-11-20  Jakub Jelinek  
> >
> > PR middle-end/68339
> > * omp-low.c (expand_simd_clones): Call node->get_body () before
> > allocating stuff in GC.
> >
> > * gcc.dg/vect/pr68339.c: New test.
> >
> > --- gcc/omp-low.c.jj2015-11-18 11:19:19.0 +0100
> > +++ gcc/omp-low.c   2015-11-20 12:56:17.075193601 +0100
> > @@ -18319,6 +18319,10 @@ expand_simd_clones (struct cgraph_node *
> >&& TYPE_ARG_TYPES (TREE_TYPE (node->decl)) == NULL_TREE)
> >  return;
> >
> > +  /* Call this before creating clone_info, as it might ggc_collect.  */
> > +  if (node->definition && node->has_gimple_body_p ())
> > +node->get_body ();
> > +
> >do
> >  {
> >/* Start with parsing the "omp declare simd" attribute(s).  */
> > --- gcc/testsuite/gcc.dg/vect/pr68339.c.jj  2015-11-20 
> > 13:10:47.756905395 +0100
> > +++ gcc/testsuite/gcc.dg/vect/pr68339.c 2015-11-20 13:08:13.0 +0100
> > @@ -0,0 +1,17 @@
> > +/* PR middle-end/68339 */
> > +/* { dg-do compile } */
> > +/* { dg-options "--param ggc-min-heapsize=0 --param ggc-min-expand=0 
> > -fopenmp-simd" } */
> > +
> > +#pragma omp declare simd notinbranch
> > +int
> > +f1 (int x)
> > +{
> > +  return x;
> > +}
> > +
> > +#pragma omp declare simd notinbranch
> > +int
> > +f2 (int x)
> > +{
> > +  return x;
> > +}
> >
> > Jakub

Re: [PATCH] Don't ICE on symbolic ranges in VRP (PR tree-optimization/68455)

2015-11-23 Thread Richard Biener

On November 23, 2015 5:31:11 PM GMT+01:00, Marek Polacek  
wrote:
>We blow up on the following testcase because we find ourselves passing
>[_13 + 1, INT_MAX] as a vr1 to extract_range_from_multiplicative_op_1;
>that's bad because this function immediately calls vrp_int_const_binop
>which just doesn't work for symbolic ranges, it only wants int_csts.
>
>This started with Richards S.'s changes in r228614 -- we're now since
>able to recurse into SSA names, thus get better info about ranges.
>That means that range_includes_zero_p in
>extract_range_from_binary_expr_1
>for the *_DIV_EXPR cases was able to determine that the range doesn't
>include zero, so we went through a different code path and ended up
>calling extract_range_from_multiplicative_op_1 even with symbolic
>ranges.
>
>I couldn't come up with anything better than checking that we're
>dealing
>with nonsymbolic ranges for such a case.
>
>Bootstrapped/regtested on x86_64-linux, ok for trunk?

Hmm.  I think we can do better if vr0 is symbolical - use min, max for it.

I suppose it would be best to implement a get_integer_range () function doing 
that or also looking at equivalences if we are getting a symbolic range.

Anyway, those are future enhancements that shouldn't block this patch.

Thus OK.

Richard.

>2015-11-23  Marek Polacek  
>
>   PR tree-optimization/68455
>   * tree-vrp.c (extract_range_from_binary_expr_1): Don't call
>   extract_range_from_multiplicative_op_1 on symbolic ranges.
>
>   * gcc.dg/tree-ssa/pr68455.c: New test.
>
>diff --git gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
>gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
>index e69de29..6b46b30 100644
>--- gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
>+++ gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
>@@ -0,0 +1,19 @@
>+/* PR tree-optimization/68455 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2" } */
>+
>+int r;
>+int n;
>+
>+void
>+fn1 (void)
>+{
>+  int i;
>+
>+  for (i = 0; i < 1; ++i)
>+{
>+  unsigned short int u;
>+  if (u < n)
>+  r = 1 / n;
>+}
>+}
>diff --git gcc/tree-vrp.c gcc/tree-vrp.c
>index 7001190..acbb70b 100644
>--- gcc/tree-vrp.c
>+++ gcc/tree-vrp.c
>@@ -3015,7 +3015,7 @@ extract_range_from_binary_expr_1 (value_range
>*vr,
> return;
>   }
>   }
>-  else
>+  else if (!symbolic_range_p (&vr0) && !symbolic_range_p (&vr1))
>   {
> extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
> return;
>
>   Marek

Re: [AArch64][dejagnu][PATCH 5/7] Dejagnu support for ARMv8.1 Adv.SIMD.

2015-11-23 Thread Matthew Wahab


On 23/11/15 12:24, James Greenhalgh wrote:

On Tue, Oct 27, 2015 at 03:32:04PM +, Matthew Wahab wrote:

On 24/10/15 08:16, Bernhard Reutner-Fischer wrote:

On October 23, 2015 2:24:26 PM GMT+02:00, Matthew Wahab 
 wrote:

The ARMv8.1 architecture extension adds two Adv.SIMD instructions,.
This
patch adds support in Dejagnu for ARMv8.1 Adv.SIMD specifiers and
checks.

The new test options are
- { dg-add-options arm_v8_1a_neon }: Add compiler options needed to
   enable ARMv8.1 Adv.SIMD.
- { dg-require-effective-target arm_v8_1a_neon_hw }: Require a target
   capable of executing ARMv8.1 Adv.SIMD instructions.




Hi Matthew,

I have a couple of comments below. Neither need to block the patch, but
I'd appreciate a reply before I say OK.



diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 4d5b0a3d..0fb679d 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2700,6 +2700,16 @@ proc add_options_for_arm_v8_neon { flags } {
  return "$flags $et_arm_v8_neon_flags -march=armv8-a"
  }

+# Add the options needed for ARMv8.1 Adv.SIMD.
+
+proc add_options_for_arm_v8_1a_neon { flags } {
+if { [istarget aarch64*-*-*] } {
+   return "$flags -march=armv8.1-a"


Should this be -march=armv8.1-a+simd or some other feature flag?



I think it should by armv8.1-a only. +simd is enabled by all -march settings so it 
seems redundant to add it here. An alternative is to add +rdma but that's also 
enabled by armv8.1-a. (I've a patch at 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01973.html which gets rid for +rdma as 
part of an armv8.1-a command line clean up.)



+# Return 1 if the target supports executing the ARMv8.1 Adv.SIMD extension, 0
+# otherwise.  The test is valid for AArch64.
+
+proc check_effective_target_arm_v8_1a_neon_hw { } {
+if { ![check_effective_target_arm_v8_1a_neon_ok] } {
+   return 0;
+}
+return [check_runtime_nocache arm_v8_1a_neon_hw_available {
+   int
+   main (void)
+   {
+ long long a = 0, b = 1;
+ long long result = 0;
+
+ asm ("sqrdmlah %s0,%s1,%s2"
+  : "=w"(result)
+  : "w"(a), "w"(b)
+  : /* No clobbers.  */);


Hm, those types look wrong, I guess this works but it is an unusual way
to write it. I presume this is to avoid including arm_neon.h each time, but
you could just directly use the internal type names for the arm_neon types.
That is to say __Int32x4_t (or whichever mode you intend to use).



I'll rework the patch to use the internal types names.

Matthew

Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-23 Thread Richard Biener

On November 23, 2015 4:37:18 PM GMT+01:00, Tom de Vries 
 wrote:
>On 23/11/15 12:31, Richard Biener wrote:
  From the dump below I understand you want no memory references in
 > >the outer loop?
 > >So the issue seems to be that store motion fails
 > >to insert the preheader load / exit store to the outermost loop
 > >possible and thus another LIM pass is needed to "store motion"
>those
 > >again?
>>> >
>>> >Yep.
>>> >
 > >  But a simple testcase
 > >
 > >int a;
 > >int *p = &a;
 > >int foo (int n)
 > >{
 > >for (int i = 0; i < n; ++i)
 > >  for (int j = 0; j < 100; ++j)
 > >*p += j + i;
 > >return a;
 > >}
 > >
 > >shows that LIM can do this in one step.
>>> >
>>> >I've filed a FTR PR68465 - "pass_lim doesn't detect identical loop
>entry
>>> >conditions" for a test-case where that doesn't happen (when using
>>> >-fno-tree-dominator-opts).
>>> >
 > >Which means it should
 > >be investigated why it doesn't do this properly for your
>testcase
 > >(store motion of *_25).
>>> >
>>> >There seems to be two related problems:
>>> >1. the store has tree_could_trap_p (ref->mem.ref) true, which
>should be
>>> >false. I'll work on a fix for this.
>>> >2. Give that the store can trap, I  was running into PR68465. I
>managed
>>> >to eliminate the 2nd pass_lim by moving the pass_dominator
>instance
>>> >before the pass_lim instance.
>>> >
>>> >Attached patch shows the pass group with only one pass_lim. I hope
>to be able
>>> >to eliminate the first pass_dominator instance before pass_lim once
>I fix 1.
>>> >
 > >Simply adding two LIM passes either papers over a wrong-code
 > >bug (in LIM or in DOM) or over a missed-optimization in LIM.
>>> >
>>> >AFAIU now, it's PR68465, a missed optimization in LIM.
>> Ok, it's not really LIMs job to cleanup loop header copying that way.
>>
>> DOM performs jump-threading for this but FRE should also be able
>> to handle this just fine.  Ah, it doesn't because the outer loop
>> header directly contains the condition
>>
>> Index: gcc/tree-ssa-sccvn.c
>> ===
>> --- gcc/tree-ssa-sccvn.c(revision 230737)
>> +++ gcc/tree-ssa-sccvn.c(working copy)
>> @@ -4357,20 +4402,32 @@ sccvn_dom_walker::before_dom_children (b
>>
>> /* If we have a single predecessor record the equivalence from a
>>possible condition on the predecessor edge.  */
>> -  if (single_pred_p (bb))
>> +  edge pred_e = NULL;
>> +  FOR_EACH_EDGE (e, ei, bb->preds)
>> +{
>> +  if (e->flags & EDGE_DFS_BACK)
>> +   continue;
>> +  if (! pred_e)
>> +   pred_e = e;
>> +  else
>> +   {
>> + pred_e = NULL;
>> + break;
>> +   }
>> +}
>> +  if (pred_e)
>>   {
>> -  edge e = single_pred_edge (bb);
>> /* Check if there are multiple executable successor edges in
>>   the source block.  Otherwise there is no additional info
>>   to be recorded.  */
>> edge e2;
>> -  FOR_EACH_EDGE (e2, ei, e->src->succs)
>> -   if (e2 != e
>> +  FOR_EACH_EDGE (e2, ei, pred_e->src->succs)
>> +   if (e2 != pred_e
>>  && e2->flags & EDGE_EXECUTABLE)
>>break;
>> if (e2 && (e2->flags & EDGE_EXECUTABLE))
>>  {
>> - gimple *stmt = last_stmt (e->src);
>> + gimple *stmt = last_stmt (pred_e->src);
>>if (stmt
>>&& gimple_code (stmt) == GIMPLE_COND)
>>  {
>> @@ -4378,11 +4435,11 @@ sccvn_dom_walker::before_dom_children (b
>>tree lhs = gimple_cond_lhs (stmt);
>>tree rhs = gimple_cond_rhs (stmt);
>>record_conds (bb, code, lhs, rhs,
>> -   (e->flags & EDGE_TRUE_VALUE) != 0);
>> +   (pred_e->flags & EDGE_TRUE_VALUE) != 0);
>>code = invert_tree_comparison (code, HONOR_NANS
>(lhs));
>>if (code != ERROR_MARK)
>>  record_conds (bb, code, lhs, rhs,
>> - (e->flags & EDGE_TRUE_VALUE) == 0);
>> + (pred_e->flags & EDGE_TRUE_VALUE) ==
>0);
>>  }
>>  }
>>   }
>>
>> fixes this for me (for a small testcase).  Does it help yours?
>>
>
>Yes, it has the desired effect (of not needing pass_dominator before 
>pass_lim) . But, patch "Mark by_ref mem_ref in build_receiver_ref as 
>non-trapping" committed as r230738, also has that effect, so AFAIU I 
>don't require this tree-ssa-sccvn.c fix.

OK, I committed it anyway already.

Richard.

>Thanks,
>- Tom
>
>> Otherwise untested of course (I hope EDGE_DFS_BACK is good enough,
>> it's supposed to match edges that have the src dominated by the
>dest).
>> Testing the above now.

[PATCH] Don't ICE on symbolic ranges in VRP (PR tree-optimization/68455)

2015-11-23 Thread Marek Polacek

We blow up on the following testcase because we find ourselves passing
[_13 + 1, INT_MAX] as a vr1 to extract_range_from_multiplicative_op_1;
that's bad because this function immediately calls vrp_int_const_binop
which just doesn't work for symbolic ranges, it only wants int_csts.

This started with Richards S.'s changes in r228614 -- we're now since
able to recurse into SSA names, thus get better info about ranges.
That means that range_includes_zero_p in extract_range_from_binary_expr_1
for the *_DIV_EXPR cases was able to determine that the range doesn't
include zero, so we went through a different code path and ended up
calling extract_range_from_multiplicative_op_1 even with symbolic ranges.

I couldn't come up with anything better than checking that we're dealing
with nonsymbolic ranges for such a case.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-11-23  Marek Polacek  

PR tree-optimization/68455
* tree-vrp.c (extract_range_from_binary_expr_1): Don't call
extract_range_from_multiplicative_op_1 on symbolic ranges.

* gcc.dg/tree-ssa/pr68455.c: New test.

diff --git gcc/testsuite/gcc.dg/tree-ssa/pr68455.c 
gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
index e69de29..6b46b30 100644
--- gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
+++ gcc/testsuite/gcc.dg/tree-ssa/pr68455.c
@@ -0,0 +1,19 @@
+/* PR tree-optimization/68455 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int r;
+int n;
+
+void
+fn1 (void)
+{
+  int i;
+
+  for (i = 0; i < 1; ++i)
+{
+  unsigned short int u;
+  if (u < n)
+   r = 1 / n;
+}
+}
diff --git gcc/tree-vrp.c gcc/tree-vrp.c
index 7001190..acbb70b 100644
--- gcc/tree-vrp.c
+++ gcc/tree-vrp.c
@@ -3015,7 +3015,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
  return;
}
}
-  else
+  else if (!symbolic_range_p (&vr0) && !symbolic_range_p (&vr1))
{
  extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
  return;

Marek

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Arnaud Charlet

> > Will it also fix
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61954
> 
> Yes, probably, as well as the m68k issue, if that's really doable.

Right, that's also why I think it's a more promising approach.
Pretending that system.address is just an unsigned integer is bound to cause
troubles, although handling system.address as a void* in gigi/gcc
is also bound to cause some other troubles at this stage, but I suspect
we'll have to bite the bullet at some point.

Arno

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Eric Botcazou

> Will it also fix
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61954

Yes, probably, as well as the m68k issue, if that's really doable.

-- 
Eric Botcazou

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread H.J. Lu

On Mon, Nov 23, 2015 at 8:02 AM, Eric Botcazou  wrote:
>> But can't you on the GENERIC side drop System.Address to void_ptr_node
>> again and just not make use of the "heavy lifting" you were talking about?
>
> No "heavy lifting" in this thread, just a heavy machinery in the language. :-)
>
>> That is, why is that speciality of System.Address not a Ada FE thing only?
>
> You mean rewriting System.Address into a pointer when translating to GENERIC?
> Yes, I presume that's doable, but I don't see all the consequences right now.
>

Will it also fix

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61954

-- 
H.J.

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-23 Thread Michael Matz

Hi,

On Fri, 20 Nov 2015, Jeff Law wrote:

> > I'm undecided on whether cs-elim is safe wrt the store speculation vs
> > locks concerns raised in the thread discussing Ian's
> > noce_can_store_speculate_p, but that's not something we have to consider
> > to solve the problem at hand.
> I don't think cs-elim is safe WRT locks and such in multi-threaded code.
> 
>In particular it replaces this:
> 
>  bb0:
>if (cond) goto bb2; else goto bb1;
>  bb1:
>*p = RHS;
>  bb2:
> 
>with
> 
>  bb0:
>if (cond) goto bb1; else goto bb2;
>  bb1:
>condtmp' = *p;
>  bb2:
>condtmp = PHI 
>*p = condtmp;

It only does so under some conditions, amongst them if it sees a 
dominating access to the same memory of the same type (load or store) and 
size.  So it doesn't introduce writes on paths that don't already contain 
a write, and hence are multi-thread safe.  Or, at least, that's the 
intention.

> If *p is a shared memory location, then there may be another writer.  
> If that writer happens to store something in that location after the 
> load of *p, but before the store to *p, then that store will get lost in 
> the transformed pseudo code.

Due to the above checking, also the first thread must have been an writer 
so there already are two writers on the same location, and hence a 
race is pre-existing.  It's only when you introduce a write when there was 
none whatsoever before (perhaps due to conditionals) that you create a new 
problem.

Ciao,
Michael.

Re: Fix lto-symtab ICE during Ada LTO bootstrap

2015-11-23 Thread Eric Botcazou

> But can't you on the GENERIC side drop System.Address to void_ptr_node
> again and just not make use of the "heavy lifting" you were talking about?

No "heavy lifting" in this thread, just a heavy machinery in the language. :-)

> That is, why is that speciality of System.Address not a Ada FE thing only?

You mean rewriting System.Address into a pointer when translating to GENERIC?
Yes, I presume that's doable, but I don't see all the consequences right now.

-- 
Eric Botcazou

Re: Remove noce_mem_write_may_trap_or_fault_p in ifcvt

2015-11-23 Thread Michael Matz

Hi,

On Fri, 20 Nov 2015, Bernd Schmidt wrote:

> On 11/19/2015 12:49 AM, Jeff Law wrote:
> > On 11/18/2015 12:16 PM, Bernd Schmidt wrote:
> > > I don't think so, actually. One safe option would be to rip it out and
> > > just stop transforming this case, but let's start by looking at the code
> > > just a bit further down, calling noce_can_store_speculate. This was
> > > added later than the code we're discussing, and it tries to verify that
> > > the same memory location will unconditionally be written to at a point
> > > later than the one we're trying to convert
> > And if we dig into that thread, Ian suggests this isn't terribly
> > important anyway.
> > 
> > However, if we go even further upthread, we find an assertion from
> > Michael that this is critical for 456.hmmer and references BZ 27313.
> 
> Cc'd in case he has additional input.

The transformation I indicated in the mail introducing the cselim pass is 
indeed crucial for hmmer.  But that has nothing to do with 
noce_mem_write_may_trap_or_fault_p.  The latter is purely an RTL 
transformation, the transformation needs to happen on gimple level.  I 
agree with Bernd that the dominator walking code on the RTL side trying to 
find an unconditional write to the same address in order to validate the 
transformation looks wrong.

Ciao,
Michael.

Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-23 Thread Yuri Rumyantsev

Hi Richard,

Did you have a chance to look at this?

Thanks.
Yuri.

2015-11-13 13:35 GMT+03:00 Yuri Rumyantsev :
> Hi Richard,
>
> Here is updated version of the patch which 91) is in sync with trunk
> compiler and (2) contains simple cost model to estimate profitability
> of scalar epilogue elimination. The part related to vectorization of
> loops with small trip count is in process of developing. Note that
> implemented cost model was not tuned  well for HASWELL and KNL but we
> got  ~6% speed-up on 436.cactusADM from spec2006 suite for HASWELL.
>
> 2015-11-10 17:52 GMT+03:00 Richard Biener :
>> On Tue, Nov 10, 2015 at 2:02 PM, Ilya Enkovich  
>> wrote:
>>> 2015-11-10 15:30 GMT+03:00 Richard Biener :
 On Tue, Nov 3, 2015 at 1:08 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> It looks like misunderstanding - we assume that for GCCv6 the simple
> scheme of remainder will be used through introducing new IV :
> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01435.html
>
> Is it true or we missed something?

 
> > Do you have an idea how "masking" is better be organized to be usable
> > for both 4b and 4c?
>
> Do 2a ...
 Okay.
 
>>>
>>> 2a was 'transform already vectorized loop as a separate
>>> post-processing'. Isn't it what this prototype patch implements?
>>> Current version only masks loop body which is in practice applicable
>>> for AVX-512 only in the most cases.  With AVX-512 it's easier to see
>>> how profitable masking might be and it is a main target for the first
>>> masking version.  Extending it to prologues/epilogues and thus making
>>> it more profitable for other targets is the next step and is out of
>>> the scope of this patch.
>>
>> Ok, technically the prototype transforms the already vectorized loop.
>> Of course I meant the vectorized loop be copied, masked and that
>> result used as epilogue...
>>
>> I'll queue a more detailed look into the patch for this week.
>>
>> Did you perform any measurements with this patch like # of
>> masked epilogues in SPEC 2006 FP (and any speedup?)
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Ilya
>>>

 Richard.

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Ilya Verbin

On Mon, Nov 23, 2015 at 16:31:42 +0100, Richard Biener wrote:
> I think it also causes the following and one related ICE
> 
> FAIL: gcc.dg/vect/pr62021.c -flto -ffat-lto-objects (internal compiler 
> error)
> 
> /space/rguenther/src/svn/trunk3/gcc/testsuite/gcc.dg/vect/pr62021.c:7:1: 
> internal compiler error: in get_alias_set, at alias.c:880^M
> 0x7528a7 get_alias_set(tree_node*)^M
> /space/rguenther/src/svn/trunk3/gcc/alias.c:880^M
> 0x751ce5 component_uses_parent_alias_set_from(tree_node const*)^M
> /space/rguenther/src/svn/trunk3/gcc/alias.c:635^M
> 0x7522ad reference_alias_ptr_type_1^M
> /space/rguenther/src/svn/trunk3/gcc/alias.c:747^M
> 0x752683 get_alias_set(tree_node*)^M
> ...

And an ICE in intelmicemul offloading compiler:

FAIL: libgomp.c++/for-11.C (internal compiler error)
FAIL: libgomp.c++/for-13.C (internal compiler error)
FAIL: libgomp.c++/for-14.C (internal compiler error)
FAIL: libgomp.c/for-3.c (internal compiler error)
FAIL: libgomp.c/for-5.c (internal compiler error)
FAIL: libgomp.c/for-6.c (internal compiler error)

libgomp/testsuite/libgomp.c/for-2.h:201:9: internal compiler error: in 
get_alias_set, at alias.c:880
0x710eef get_alias_set(tree_node*)
gcc/alias.c:880
0x71032d component_uses_parent_alias_set_from(tree_node const*)
gcc/alias.c:635
0x7108f5 reference_alias_ptr_type_1
gcc/alias.c:747
0x710ccb get_alias_set(tree_node*)
gcc/alias.c:843
0x89d208 expand_assignment(tree_node*, tree_node*, bool)
gcc/expr.c:5020
0x768ff7 expand_gimple_stmt_1
gcc/cfgexpand.c:3592
0x7693e2 expand_gimple_stmt
gcc/cfgexpand.c:3688
0x7704ed expand_gimple_basic_block
gcc/cfgexpand.c:5694
0x771ff1 execute
gcc/cfgexpand.c:6309
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

  -- Ilya

Re: [PATCH, PR68337] Don't fold memcpy/memmove we want to instrument

2015-11-23 Thread Richard Biener

On Mon, Nov 23, 2015 at 2:45 PM, Ilya Enkovich  wrote:
> On 23 Nov 14:29, Richard Biener wrote:
>> On Mon, Nov 23, 2015 at 12:33 PM, Ilya Enkovich  
>> wrote:
>> >
>> > I see.  But it should still be OK to check type in case of strict 
>> > aliasing, right?
>>
>> No, memcpy is always "no-strict-aliasing"
>>
>
> Thanks a lot for help!  Here is a variant with a size check only as
> you originally suggested.  Is it OK for trunk and gcc-5-branch if
> no regressions?

Ok.

Thanks,
Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2015-11-23  Ilya Enkovich  
>
> * gimple-fold.c: Include ipa-chkp.h.
> (gimple_fold_builtin_memory_op): Don't fold call if we
> are going to instrument it and it may copy pointers.
>
> gcc/testsuite/
>
> 2015-11-23  Ilya Enkovich  
>
> * gcc.target/i386/mpx/pr68337-1.c: New test.
> * gcc.target/i386/mpx/pr68337-2.c: New test.
>
>
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index 1ab20d1..6ff5e26 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gomp-constants.h"
>  #include "optabs-query.h"
>  #include "omp-low.h"
> +#include "ipa-chkp.h"
>
>
>  /* Return true when DECL can be referenced from current unit.
> @@ -664,6 +665,18 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
>unsigned int src_align, dest_align;
>tree off0;
>
> +  /* Inlining of memcpy/memmove may cause bounds lost (if we copy
> +pointers as wide integer) and also may result in huge function
> +size because of inlined bounds copy.  Thus don't inline for
> +functions we want to instrument.  */
> +  if (flag_check_pointer_bounds
> + && chkp_instrumentable_p (cfun->decl)
> + /* Even if data may contain pointers we can inline if copy
> +less than a pointer size.  */
> + && (!tree_fits_uhwi_p (len)
> + || compare_tree_int (len, POINTER_SIZE_UNITS) >= 0))
> +   return false;
> +
>/* Build accesses at offset zero with a ref-all character type.  */
>off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
>  ptr_mode, true), 0);
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c 
> b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
> new file mode 100644
> index 000..3f8d79d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-1.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +
> +#include "mpx-check.h"
> +
> +#define N 2
> +
> +extern void abort ();
> +
> +static int
> +mpx_test (int argc, const char **argv)
> +{
> +  char ** src = (char **)malloc (sizeof (char *) * N);
> +  char ** dst = (char **)malloc (sizeof (char *) * N);
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +src[i] = __bnd_set_ptr_bounds (argv[0] + i, i + 1);
> +
> +  __builtin_memcpy(dst, src, sizeof (char *) * N);
> +
> +  for (i = 0; i < N; i++)
> +{
> +  char *p = dst[i];
> +  if (p != argv[0] + i
> + || __bnd_get_ptr_lbound (p) != p
> + || __bnd_get_ptr_ubound (p) != p + i)
> +   abort ();
> +}
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c 
> b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
> new file mode 100644
> index 000..8845cca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr68337-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fcheck-pointer-bounds -mmpx" } */
> +/* { dg-final { scan-assembler-not "memcpy" } } */
> +
> +void
> +test (void *dst, void *src)
> +{
> +  __builtin_memcpy (dst, src, sizeof (char *) / 2);
> +}

Re: Enable pointer TBAA for LTO

2015-11-23 Thread Richard Biener

On Mon, 23 Nov 2015, Martin Jambor wrote:

> Hi,
> 
> On Mon, Nov 23, 2015 at 12:00:25AM +0100, Jan Hubicka wrote:
> > Hi,
> > here is updated patch which I finally comitted today.  It addresses all the 
> > comments
> > and also fixes one nasty bug that really cost me a lot of time to 
> > understand. 
> > 
> > + /* LTO type merging does not make any difference between 
> > +component pointer types.  We may have
> > +
> > +struct foo {int *a;};
> > +
> > +as TYPE_CANONICAL of 
> > +
> > +struct bar {float *a;};
> > +
> > +Because accesses to int * and float * do not alias, we would get
> > +false negative when accessing the same memory location by
> > +float ** and bar *. We thus record the canonical type as:
> > +
> > +struct {void *a;};
> > +
> > +void * is special cased and works as a universal pointer type.
> > +Accesses to it conflicts with accesses to any other pointer
> > +type.  */
> > 
> > This problem manifested itself only as a lto-bootstrap miscompare on 32bit
> > build and I spent a lot of time localizing the wrong code since it 
> > reproduces
> > only in quite large programs where we get conficts in canonical type merging
> > like this.
> > 
> > The patch thus updates record_component_aliases to substitute void_ptr_type 
> > for
> > all pointer types. I re-did the stats.  Now the improvement on dealII is 14%
> > that is quite a bit lower than earlier, but still substantial.  Since we 
> > have
> > voidptr globing counter, I know that the number of disambiguations would go 
> > at
> > least 19% up if we did not do it.
> > 
> > THere is a lot of low hanging fruit in that area now, but the real solution 
> > is to
> > track types that needs to be merge by fortran rules and don't do all this 
> > fancy
> > globing for C/C++ types.  I will open branch for IPA work and try to 
> > prepare this
> > for next stage 1.
> > 
> > bootstrapped/regtested x86_64-linux and ppc64-linux, earlier version tested 
> > on i386-linux
> > and also on some bigger apps, committed
> > 
> > Note that we still have bootstrap miscompare with LTO build and 
> > --disable-checking,
> > I am looking for that now.  Additoinally after fixing the ICEs preventing 
> > us to build
> > the gnat1 binary, gnat1 aborts. Both these are independent of the patch.
> > 
> > Honza
> > * lto.c (iterative_hash_canonical_type): Always recurse for pointers.
> > (gimple_register_canonical_type_1): Check that pointers do not get
> > canonical types.
> > (gimple_register_canonical_type): Do not register pointers.
> > 
> > * tree.c (build_pointer_type_for_mode,build_reference_type_for_mode):
> > In LTO we do not compute TYPE_CANONICAL of pointers.
> > (gimple_canonical_types_compatible_p): Improve coments; sanity check
> > that pointers do not have canonical type that would make us believe
> > they are different.
> > * alias.c (get_alias_set): Do structural type equality on pointers;
> > enable pointer path for LTO; also glob pointer to vector with pointer
> > to vector element; glob pointers and references for LTO; do more strict
> > sanity checking about build_pointer_type returning the canonical type
> > which is also the main variant.
> > (record_component_aliases): When component type is pointer and we
> > do LTO; record void_type_node alias set.
> 
> ...
> 
> > Index: alias.c
> > ===
> > --- alias.c (revision 230714)
> > +++ alias.c (working copy)
> > @@ -869,13 +869,23 @@ get_alias_set (tree t)
> >set = lang_hooks.get_alias_set (t);
> >if (set != -1)
> > return set;
> > -  return 0;
> > +  /* Handle structure type equality for pointer types.  This is easy
> > +to do, because the code bellow ignore canonical types on these anyway.
> > +This is important for LTO, where TYPE_CANONICAL for pointers can not
> > +be meaningfuly computed by the frotnend.  */
> > +  if (!POINTER_TYPE_P (t))
> > +   {
> > + /* In LTO we set canonical types for all types where it makes
> > +sense to do so.  Double check we did not miss some type.  */
> > + gcc_checking_assert (!in_lto_p || !type_with_alias_set_p (t));
> > +  return 0;
> 
> I have hit this assert on our LTO tests when doing a merge from trunk
> to the HSA branch.  On the branch, we generate very simple static
> constructors/destructors which just call libgomp (un)registration
> routines to which we pass data in static variables of artificial
> types.  The assert happens inside varpool_node::finalize_decl calls on
> those variables, e.g.:
> 
> lto1: internal compiler error: in get_alias_set, at alias.c:880
> 0x613650 get_alias_set(tree_node*)
> /home/mjambor/gcc/branch/src/gcc/alias.c:880
> 0x71d2c7 set_mem_attributes_minus_bitpos(rtx_def*, tree_node*, int, long)
> /home/mjambor/gcc/branch/src/gcc/emit

Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-23 Thread Tom de Vries


On 23/11/15 12:31, Richard Biener wrote:

 From the dump below I understand you want no memory references in
> >the outer loop?
> >So the issue seems to be that store motion fails
> >to insert the preheader load / exit store to the outermost loop
> >possible and thus another LIM pass is needed to "store motion" those
> >again?

>
>Yep.
>

> >  But a simple testcase
> >
> >int a;
> >int *p = &a;
> >int foo (int n)
> >{
> >for (int i = 0; i < n; ++i)
> >  for (int j = 0; j < 100; ++j)
> >*p += j + i;
> >return a;
> >}
> >
> >shows that LIM can do this in one step.

>
>I've filed a FTR PR68465 - "pass_lim doesn't detect identical loop entry
>conditions" for a test-case where that doesn't happen (when using
>-fno-tree-dominator-opts).
>

> >Which means it should
> >be investigated why it doesn't do this properly for your testcase
> >(store motion of *_25).

>
>There seems to be two related problems:
>1. the store has tree_could_trap_p (ref->mem.ref) true, which should be
>false. I'll work on a fix for this.
>2. Give that the store can trap, I  was running into PR68465. I managed
>to eliminate the 2nd pass_lim by moving the pass_dominator instance
>before the pass_lim instance.
>
>Attached patch shows the pass group with only one pass_lim. I hope to be able
>to eliminate the first pass_dominator instance before pass_lim once I fix 1.
>

> >Simply adding two LIM passes either papers over a wrong-code
> >bug (in LIM or in DOM) or over a missed-optimization in LIM.

>
>AFAIU now, it's PR68465, a missed optimization in LIM.

Ok, it's not really LIMs job to cleanup loop header copying that way.

DOM performs jump-threading for this but FRE should also be able
to handle this just fine.  Ah, it doesn't because the outer loop
header directly contains the condition

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 230737)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -4357,20 +4402,32 @@ sccvn_dom_walker::before_dom_children (b

/* If we have a single predecessor record the equivalence from a
   possible condition on the predecessor edge.  */
-  if (single_pred_p (bb))
+  edge pred_e = NULL;
+  FOR_EACH_EDGE (e, ei, bb->preds)
+{
+  if (e->flags & EDGE_DFS_BACK)
+   continue;
+  if (! pred_e)
+   pred_e = e;
+  else
+   {
+ pred_e = NULL;
+ break;
+   }
+}
+  if (pred_e)
  {
-  edge e = single_pred_edge (bb);
/* Check if there are multiple executable successor edges in
  the source block.  Otherwise there is no additional info
  to be recorded.  */
edge e2;
-  FOR_EACH_EDGE (e2, ei, e->src->succs)
-   if (e2 != e
+  FOR_EACH_EDGE (e2, ei, pred_e->src->succs)
+   if (e2 != pred_e
 && e2->flags & EDGE_EXECUTABLE)
   break;
if (e2 && (e2->flags & EDGE_EXECUTABLE))
 {
- gimple *stmt = last_stmt (e->src);
+ gimple *stmt = last_stmt (pred_e->src);
   if (stmt
   && gimple_code (stmt) == GIMPLE_COND)
 {
@@ -4378,11 +4435,11 @@ sccvn_dom_walker::before_dom_children (b
   tree lhs = gimple_cond_lhs (stmt);
   tree rhs = gimple_cond_rhs (stmt);
   record_conds (bb, code, lhs, rhs,
-   (e->flags & EDGE_TRUE_VALUE) != 0);
+   (pred_e->flags & EDGE_TRUE_VALUE) != 0);
   code = invert_tree_comparison (code, HONOR_NANS (lhs));
   if (code != ERROR_MARK)
 record_conds (bb, code, lhs, rhs,
- (e->flags & EDGE_TRUE_VALUE) == 0);
+ (pred_e->flags & EDGE_TRUE_VALUE) == 0);
 }
 }
  }

fixes this for me (for a small testcase).  Does it help yours?



Yes, it has the desired effect (of not needing pass_dominator before 
pass_lim) . But, patch "Mark by_ref mem_ref in build_receiver_ref as 
non-trapping" committed as r230738, also has that effect, so AFAIU I 
don't require this tree-ssa-sccvn.c fix.


Thanks,
- Tom


Otherwise untested of course (I hope EDGE_DFS_BACK is good enough,
it's supposed to match edges that have the src dominated by the dest).
Testing the above now.

[PATCH][RTL-ifcvt] PR rtl-optimization/68435 Allow (c ? x++ : x--) form

2015-11-23 Thread Kyrill Tkachov


Hi all,

In this PR we fail to if-convert a case where in the expression x = c ? a : b;
'a' and 'b' are something like x + 1 and x - 1.
So x appears in a and b.
The code that checks that nothing from the else block modifies the registers 
used in a
rejects this case. It should accept when the modification is in the last insn 
of the block
i.e. insn_a or insn_b in the language of noce_try_cmove_arith because we will 
not be
emitting insn_a and insn_b verbatim, but rather their modified versions emit_a 
and emit_b
that have had their destinations modified to fresh pseudos, so no conflicts 
will arise.

Bootstrapped and tested on arm, aarch64, x86_64.
This improved if-conversion opportunities a bit across SPEC2006, not so much as 
to make
a difference, but definitely a small improvement.

Ok for trunk?

Thanks,
Kyrill

2015-11-23  Kyrylo Tkachov  

PR rtl-optimization/68435
* ifcvt.c (noce_try_cmove_arith): Skip final insn when checking
for clonflicts between a, b and the set destinations.

2015-11-23  Kyrylo Tkachov  

PR rtl-optimization/68435
* gcc.dg/pr68435.c: New test.
commit c890c68e2862980731e95cb4b7982f10b29109b7
Author: Kyrylo Tkachov 
Date:   Fri Nov 20 09:13:04 2015 +

[ifcvt] PR rtl-optimization/68435 Allow (c ? x++ : x--) form

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index d721ec7..af7a3b9 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2210,7 +2210,10 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
   if (tmp_b && then_bb)
 {
   FOR_BB_INSNS (then_bb, tmp_insn)
-	if (modified_in_p (orig_b, tmp_insn))
+	/* Don't check inside insn_a.  We will have changed it to emit_a
+	   with a destination that doesn't conflict.  */
+	if (!(insn_a && tmp_insn == insn_a)
+	&& modified_in_p (orig_b, tmp_insn))
 	  {
 	modified_in_a = true;
 	break;
@@ -2223,7 +2226,10 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
 	if (tmp_b && else_bb)
 	  {
 	FOR_BB_INSNS (else_bb, tmp_insn)
-	  if (modified_in_p (orig_a, tmp_insn))
+	/* Don't check inside insn_b.  We will have changed it to emit_b
+	   with a destination that doesn't conflict.  */
+	  if (!(insn_b && tmp_insn == insn_b)
+		  && modified_in_p (orig_a, tmp_insn))
 		{
 		  modified_in_b = true;
 		  break;
diff --git a/gcc/testsuite/gcc.dg/pr68435.c b/gcc/testsuite/gcc.dg/pr68435.c
new file mode 100644
index 000..765699a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68435.c
@@ -0,0 +1,52 @@
+/* { dg-do compile { target aarch64*-*-* x86_64-*-* } } */
+/* { dg-options "-fdump-rtl-ce1 -O2 -w" } */
+
+typedef struct cpp_reader cpp_reader;
+enum cpp_ttype
+{
+  CPP_EQ =
+0, CPP_NOT, CPP_GREATER, CPP_LESS, CPP_PLUS, CPP_MINUS, CPP_MULT, CPP_DIV,
+  CPP_MOD, CPP_AND, CPP_OR, CPP_XOR, CPP_RSHIFT, CPP_LSHIFT, CPP_MIN,
+  CPP_MAX, CPP_COMPL, CPP_AND_AND, CPP_OR_OR, CPP_QUERY, CPP_COLON,
+  CPP_COMMA, CPP_OPEN_PAREN, CPP_CLOSE_PAREN, CPP_EQ_EQ, CPP_NOT_EQ,
+  CPP_GREATER_EQ, CPP_LESS_EQ, CPP_PLUS_EQ, CPP_MINUS_EQ, CPP_MULT_EQ,
+  CPP_DIV_EQ, CPP_MOD_EQ, CPP_AND_EQ, CPP_OR_EQ, CPP_XOR_EQ, CPP_RSHIFT_EQ,
+  CPP_LSHIFT_EQ, CPP_MIN_EQ, CPP_MAX_EQ, CPP_HASH, CPP_PASTE,
+  CPP_OPEN_SQUARE, CPP_CLOSE_SQUARE, CPP_OPEN_BRACE, CPP_CLOSE_BRACE,
+  CPP_SEMICOLON, CPP_ELLIPSIS, CPP_PLUS_PLUS, CPP_MINUS_MINUS, CPP_DEREF,
+  CPP_DOT, CPP_SCOPE, CPP_DEREF_STAR, CPP_DOT_STAR, CPP_ATSIGN, CPP_NAME,
+  CPP_NUMBER, CPP_CHAR, CPP_WCHAR, CPP_OTHER, CPP_STRING, CPP_WSTRING,
+  CPP_HEADER_NAME, CPP_COMMENT, CPP_MACRO_ARG, CPP_PADDING, CPP_EOF,
+};
+
+static struct op lex (cpp_reader *, int);
+
+struct op
+{
+  enum cpp_ttype op;
+  long value;
+};
+
+int
+_cpp_parse_expr (pfile)
+{
+  struct op init_stack[20];
+  struct op *stack = init_stack;
+  struct op *top = stack + 1;
+  int skip_evaluation = 0;
+  for (;;)
+{
+  struct op op;
+  op = lex (pfile, skip_evaluation);
+  switch (op.op)
+	{
+	case CPP_OR_OR:
+	  if (top->value)
+	skip_evaluation++;
+	  else
+	skip_evaluation--;
+	}
+}
+}
+
+/* { dg-final { scan-rtl-dump "2 true changes made" "ce1" } } */

Re: [PATCH v2] Add uaddv_optab, usubv4_optab

2015-11-23 Thread Bernd Schmidt


On 11/22/2015 09:58 PM, Uros Bizjak wrote:

On Sun, Nov 22, 2015 at 11:38 AM, Richard Henderson  wrote:

* optabs.def (uaddv4_optab, usubv4_optab): New.
* internal-fn.c (expand_addsub_overflow): Use them.
* doc/md.texi (Standard Names): Add uaddv4, usubv4.

* config/i386/i386.c (ix86_cc_mode): Extend add overflow check
to reversed operands.
* config/i386/i386.md (uaddv4, usubv4): New.
(*add3_cconly_overflow_1): Rename *add3_cconly_overflow.
(*add3_cc_overflow_1): Rename *add3_cc_overflow.
(*addsi3_zext_cc_overflow_1): Rename *add3_zext_cc_overflow.
(*add3_cconly_overflow_2): New.
(*add3_cc_overflow_2): New.
(*addsi3_zext_cc_overflow_2): New.


x86 parts are OK with a small change, see below.


The other parts are also OK.


Bernd

[PATCH] Fix problem reported in PR68465

2015-11-23 Thread Richard Biener


The following fixes CH - FRE - LIM not doing store-motion across a loop
nest for redundant header checks.  FRE is supposed to do such
redundant comparison "threading" but didn't do it because of latch
edges confusing the single_pred_p check.

The following fixes it by disregarding edges that come from blocks
we dominate.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-11-23  Richard Biener  

PR tree-optimization/68465
* tree-ssa-sccvn.c (sccvn_dom_walker::before_dom_children):
Also record equalities from multiple predecessor blocks if
only one non-backedge exists.

* gcc.dg/tree-ssa/ssa-fre-52.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c(revision 230737)
--- gcc/tree-ssa-sccvn.c(working copy)
*** sccvn_dom_walker::before_dom_children (b
*** 4357,4376 
  
/* If we have a single predecessor record the equivalence from a
   possible condition on the predecessor edge.  */
!   if (single_pred_p (bb))
  {
-   edge e = single_pred_edge (bb);
/* Check if there are multiple executable successor edges in
 the source block.  Otherwise there is no additional info
 to be recorded.  */
edge e2;
!   FOR_EACH_EDGE (e2, ei, e->src->succs)
!   if (e2 != e
&& e2->flags & EDGE_EXECUTABLE)
  break;
if (e2 && (e2->flags & EDGE_EXECUTABLE))
{
! gimple *stmt = last_stmt (e->src);
  if (stmt
  && gimple_code (stmt) == GIMPLE_COND)
{
--- 4402,4435 
  
/* If we have a single predecessor record the equivalence from a
   possible condition on the predecessor edge.  */
!   edge pred_e = NULL;
!   FOR_EACH_EDGE (e, ei, bb->preds)
! {
!   /* Ignore simple backedges from this to allow recording conditions
!  in loop headers.  */
!   if (dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
!   continue;
!   if (! pred_e)
!   pred_e = e;
!   else
!   {
! pred_e = NULL;
! break;
!   }
! }
!   if (pred_e)
  {
/* Check if there are multiple executable successor edges in
 the source block.  Otherwise there is no additional info
 to be recorded.  */
edge e2;
!   FOR_EACH_EDGE (e2, ei, pred_e->src->succs)
!   if (e2 != pred_e
&& e2->flags & EDGE_EXECUTABLE)
  break;
if (e2 && (e2->flags & EDGE_EXECUTABLE))
{
! gimple *stmt = last_stmt (pred_e->src);
  if (stmt
  && gimple_code (stmt) == GIMPLE_COND)
{
*** sccvn_dom_walker::before_dom_children (b
*** 4378,4388 
  tree lhs = gimple_cond_lhs (stmt);
  tree rhs = gimple_cond_rhs (stmt);
  record_conds (bb, code, lhs, rhs,
!   (e->flags & EDGE_TRUE_VALUE) != 0);
  code = invert_tree_comparison (code, HONOR_NANS (lhs));
  if (code != ERROR_MARK)
record_conds (bb, code, lhs, rhs,
! (e->flags & EDGE_TRUE_VALUE) == 0);
}
}
  }
--- 4437,4447 
  tree lhs = gimple_cond_lhs (stmt);
  tree rhs = gimple_cond_rhs (stmt);
  record_conds (bb, code, lhs, rhs,
!   (pred_e->flags & EDGE_TRUE_VALUE) != 0);
  code = invert_tree_comparison (code, HONOR_NANS (lhs));
  if (code != ERROR_MARK)
record_conds (bb, code, lhs, rhs,
! (pred_e->flags & EDGE_TRUE_VALUE) == 0);
}
}
  }
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-52.c
===
*** gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-52.c  (revision 0)
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-52.c  (working copy)
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-fre1" } */
+ 
+ void bar ();
+ void foo (int n)
+ {
+   if (n > 0)
+ {
+   int j = 0;
+   do
+   {
+ if (n > 0)
+   {
+ int i = 0;
+ do
+   {
+ bar ();
+   }
+ while (i < n);
+   }
+   }
+   while (j < n);
+ }
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "if" 1 "fre1" } } */

Re: [PATCH][AArch64] Documentation fix for -fpic

2015-11-23 Thread Richard Earnshaw

On 12/11/15 11:30, Szabolcs Nagy wrote:
> The documentation for -fpic and -fPIC explicitly mentions some targets
> where the difference matters, but not AArch64.  Specifying the GOT size
> limit is not entirely correct as it can depend on the -mcmodel setting,
> but probably better than leaving the impression that -fpic vs -fPIC does
> not matter on AArch64.
> 
> ChangeLog:
> 
> 2015-11-12  Szabolcs Nagy  
> 
> * doc/invoke.texi (-fpic): Add the AArch64 limit.
> (-fPIC): Add AArch64.
> 
> fpic.diff
> 
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 0121832..f925fe0 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -23951,7 +23951,7 @@ loader is not part of GCC; it is part of the 
> operating system).  If
>  the GOT size for the linked executable exceeds a machine-specific
>  maximum size, you get an error message from the linker indicating that
>  @option{-fpic} does not work; in that case, recompile with @option{-fPIC}
> -instead.  (These maximums are 8k on the SPARC and 32k
> +instead.  (These maximums are 8k on the SPARC, 28k on AArch64 and 32k
>  on the m68k and RS/6000.  The x86 has no such limit.)
>  
>  Position-independent code requires special support, and therefore works
> @@ -23966,7 +23966,7 @@ are defined to 1.
>  @opindex fPIC
>  If supported for the target machine, emit position-independent code,
>  suitable for dynamic linking and avoiding any limit on the size of the
> -global offset table.  This option makes a difference on the m68k,
> +global offset table.  This option makes a difference on the AArch64, m68k,
   ^^^
The use of the definite article here makes this read somewhat awkwardly
(particularly as AArch64 is first and is a term for a set of processor
implementations).  I think it would read better if it were dropped:
"This option makes a difference on AArch64, m68k,..."

>  PowerPC and SPARC@.
>  
>  Position-independent code requires special support, and therefore works
> 

OK with that change if nobody else objects within 24 hours.

R.

Re: [PATCH][RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves

2015-11-23 Thread Kyrill Tkachov


Hi Bernd,

On 20/11/15 01:41, Bernd Schmidt wrote:

I1 is def_insn, I3 is cand->insn. tmp_reg is 'ax'. What we want to do
is reject this transformation
because the destination of def_insn (aka I1), that is 'ax', is not the
operand of the extend operation
in cand->insn (aka I3). As you said, rtx_equal won't work on just
SET_SRC (PATTERN (cand->insn)) because
it's an extend operation. So reg_overlap_mentioned should be appropriate.


Yeah, so just use the src_reg variable for the comparison. I still don't see why you wouldn't want to use the stronger test. But the whole thing still feels not completely ideal somehow, so after reading through ree.c for a while and 
getting a better feeling for how it works, I think the following (which you said is equivalent) would be the most understandable and direct fix.


You said that the two tests should be equivalent, and I agree. I've not found 
cases where the change makes a difference, other than the testcase. Would you 
mind running this version through the testsuite and committing if it passes?

I've shrunk the comment; massive explanations like this for every bug are inappropriate IMO, and the example also duplicates an earlier comment in the same function. And, as I said earlier, the way you placed the comment is confusing 
because only one part of the following if statement is related to it.




Thanks for the comments, here is the final patch that I'll be committing.
It passed testing on arm, aarch64, x86_64.

Thanks,
Kyrill


2015-11-23  Bernd Schmidt 
Kyrylo Tkachov  

PR rtl-optimization/68194
PR rtl-optimization/68328
PR rtl-optimization/68185
* ree.c (combine_reaching_defs): Reject copy_needed case if
copies_list is not empty.

2015-11-23  Kyrylo Tkachov  

PR rtl-optimization/68194
* gcc.c-torture/execute/pr68185.c: Likewise.
* gcc.c-torture/execute/pr68328.c: Likewise.




Bernd


commit ceecbb45212e2c2a665fabba03e07f6fcbe4
Author: Kyrylo Tkachov 
Date:   Fri Nov 13 15:01:47 2015 +

[RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves

diff --git a/gcc/ree.c b/gcc/ree.c
index b8436f2..9d94843 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -770,6 +770,12 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
   if (state->defs_list.length () != 1)
 	return false;
 
+  /* We don't have the structure described above if there are
+	 conditional moves in between the def and the candidate,
+	 and we will not handle them correctly.  See PR68194.  */
+  if (state->copies_list.length () > 0)
+	return false;
+
   /* We require the candidate not already be modified.  It may,
 	 for example have been changed from a (sign_extend (reg))
 	 into (zero_extend (sign_extend (reg))).
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68185.c b/gcc/testsuite/gcc.c-torture/execute/pr68185.c
new file mode 100644
index 000..826531b
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68185.c
@@ -0,0 +1,29 @@
+int a, b, d = 1, e, f, o, u, w = 1, z;
+short c, q, t;
+
+int
+main ()
+{
+  char g;
+  for (; d; d--)
+{
+  while (o)
+	for (; e;)
+	  {
+	c = b;
+	int h = o = z;
+	for (; u;)
+	  for (; a;)
+		;
+	  }
+  if (t < 1)
+	g = w;
+  f = g;
+  g && (q = 1);
+}
+
+  if (q != 1)
+__builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68328.c b/gcc/testsuite/gcc.c-torture/execute/pr68328.c
new file mode 100644
index 000..edf244b
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68328.c
@@ -0,0 +1,44 @@
+int a, b, c = 1, d = 1, e;
+
+__attribute__ ((noinline, noclone))
+ int foo (void)
+{
+  asm volatile ("":::"memory");
+  return 4195552;
+}
+
+__attribute__ ((noinline, noclone))
+ void bar (int x, int y)
+{
+  asm volatile (""::"g" (x), "g" (y):"memory");
+  if (y == 0)
+__builtin_abort ();
+}
+
+int
+baz (int x)
+{
+  char g, h;
+  int i, j;
+
+  foo ();
+  for (;;)
+{
+  if (c)
+	h = d;
+  g = h < x ? h : 0;
+  i = (signed char) ((unsigned char) (g - 120) ^ 1);
+  j = i > 97;
+  if (a - j)
+	bar (0x123456, 0);
+  if (!b)
+	return e;
+}
+}
+
+int
+main ()
+{
+  baz (2);
+  return 0;
+}

Re: [PATCH 0/6] Another fixes of various memory leaks

2015-11-23 Thread Martin Liška

On 11/23/2015 03:36 PM, Bernd Schmidt wrote:
> On 11/23/2015 02:49 PM, marxin wrote:
>> Following series has been just bootregtested on x86_64-linux-gnu
>> (all patches together).
> 
> All ok except 5/6 which I'm not finding obvious. Better to have a cilk/c++ 
> person have a look.

Hi.

Sure, let's wait for some cilk person to do a confirmation.

> 
> In the future, a few more explanations would help with reviewing. Let's say 
> for 4/6, how does the leak occur?

Well, basically the function vect_create_cond_for_alias_checks is not called 
for allocated loop_vec_info.
As I am not vectorizer guy, I can't describe exact scenario how it happens.

> 
> Some changes appear beneficial but unnecessary (converting explicitly 
> released vecs to auto_vecs), and:

It's definitely beneficial, if you take a look at 'expand_an_in_modify_expr', 
the function
contains couple of 'return error_mark_node', where we should call ::release 
method.

> 
>>  static vec *
>> -create_array_refs (location_t loc, vec > an_info,
>> +create_array_refs (location_t loc, const vec > &an_info,
>> vec an_loop_info, size_t size,  size_t rank)
> 
> How does this help prevent leaks? In general we don't want non-bugfixes at 
> this stage.

You are right, this is not directly connected to memory release. I'll remove 
these hunks
in the next version of the patch.

Martin

> 
> 
> Bernd

Re: [PATCH][AArch64][v2] Improve comparison with complex immediates followed by branch/cset

2015-11-23 Thread Kyrill Tkachov



On 23/11/15 14:58, James Greenhalgh wrote:

On Mon, Nov 23, 2015 at 10:33:01AM +, Kyrill Tkachov wrote:

On 12/11/15 12:05, James Greenhalgh wrote:

On Tue, Nov 03, 2015 at 03:43:24PM +, Kyrill Tkachov wrote:

Hi all,

Bootstrapped and tested on aarch64.

Ok for trunk?

Comments in-line.


Here's an updated patch according to your comments.
Sorry it took so long to respin it, had other things to deal with with
stage1 closing...

I've indented the sample code sequences and used valid mnemonics.
These patterns can only match during combine, so I'd expect them to always
split during combine or immediately after, but I don't think that's a documented
guarantee so I've gated them on !reload_completed.

I've used IN_RANGE in the predicate.md hunk and added scan-assembler checks
in the tests.

Is this ok?

Thanks,
Kyrill

2015-11-20  Kyrylo Tkachov  

 * config/aarch64/aarch64.md (*condjump): Rename to...
 (condjump): ... This.
 (*compare_condjump): New define_insn_and_split.
 (*compare_cstore_insn): Likewise.
 (*cstore_insn): Rename to...
 (cstore_insn): ... This.
 * config/aarch64/iterators.md (CMP): Handle ne code.
 * config/aarch64/predicates.md (aarch64_imm24): New predicate.

2015-11-20  Kyrylo Tkachov  

 * gcc.target/aarch64/cmpimm_branch_1.c: New test.
 * gcc.target/aarch64/cmpimm_cset_1.c: Likewise.

commit bb44feed4e6beaae25d9bdffa45073dc61c65838
Author: Kyrylo Tkachov 
Date:   Mon Sep 21 10:56:47 2015 +0100

 [AArch64] Improve comparison with complex immediates

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 11f6387..3e57d08 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -372,7 +372,7 @@ (define_expand "mod3"
}
  )
  
-(define_insn "*condjump"

+(define_insn "condjump"
[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register" "") (const_int 0)])
   (label_ref (match_operand 2 "" ""))
@@ -397,6 +397,41 @@ (define_insn "*condjump"
  (const_int 1)))]
  )
  
+;; For a 24-bit immediate CST we can optimize the compare for equality

+;; and branch sequence from:
+;; mov x0, #imm1
+;; movkx0, #imm2, lsl 16 /* x0 contains CST.  */
+;; cmp x1, x0
+;; b .Label
+;; into the shorter:
+;; sub x0, x0, #(CST & 0xfff000)
+;; subsx0, x0, #(CST & 0x000fff)

   sub x0, x1, #(CST)  ?

The transform doesn't make sense otherwise.


Doh, yes. The source should be x1 of course.

Kyrill




+;; b .Label
+(define_insn_and_split "*compare_condjump"
+  [(set (pc) (if_then_else (EQL
+ (match_operand:GPI 0 "register_operand" "r")
+ (match_operand:GPI 1 "aarch64_imm24" "n"))
+  (label_ref:P (match_operand 2 "" ""))
+  (pc)))]
+  "!aarch64_move_imm (INTVAL (operands[1]), mode)
+   && !aarch64_plus_operand (operands[1], mode)
+   && !reload_completed"
+  "#"
+  "&& true"
+  [(const_int 0)]
+  {
+HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
+HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
+rtx tmp = gen_reg_rtx (mode);
+emit_insn (gen_add3 (tmp, operands[0], GEN_INT (-hi_imm)));
+emit_insn (gen_add3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
+rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
+rtx cmp_rtx = gen_rtx_fmt_ee (, mode, cc_reg, const0_rtx);
+emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
+DONE;
+  }
+)
+
  (define_expand "casesi"
[(match_operand:SI 0 "register_operand" "") ; Index
 (match_operand:SI 1 "const_int_operand" ""); Lower bound
@@ -2901,7 +2936,7 @@ (define_expand "cstore4"
"
  )
  
-(define_insn "*cstore_insn"

+(define_insn "aarch64_cstore"
[(set (match_operand:ALLI 0 "register_operand" "=r")
(match_operator:ALLI 1 "aarch64_comparison_operator"
 [(match_operand 2 "cc_register" "") (const_int 0)]))]
@@ -2910,6 +2945,40 @@ (define_insn "*cstore_insn"
[(set_attr "type" "csel")]
  )
  
+;; For a 24-bit immediate CST we can optimize the compare for equality

+;; and branch sequence from:
+;; mov x0, #imm1
+;; movkx0, #imm2, lsl 16 /* x0 contains CST.  */
+;; cmp x1, x0
+;; csetx2, 
+;; into the shorter:
+;; sub x0, x0, #(CST & 0xfff000)
+;; subsx0, x0, #(CST & 0x000fff)
+;; cset x1, .

Please fix the register allocation in your shorter sequence, these
are not equivalent.


+(define_insn_and_split "*compare_cstore_insn"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+(EQL:GPI (match_operand:GPI 1 "register_operand" "r")
+ (match_operand:GPI 2 "aarch64_imm24" "n")))]
+  "!aarch64_move_imm (INTVAL (operands[2]), mode)
+   && !aarch64_plus_operand (operands[2], mode)
+   && !reload_completed"
+  "#"
+  "&& true"
+  [(const_int 0)]
+

Re: [PATCH, ARM, v2] PR target/68059 libgcc should not use __write for printing fatal error

2015-11-23 Thread Richard Earnshaw

On 06/11/15 13:32, Szabolcs Nagy wrote:
> libgcc/config/arm/linux-atomic-64bit.c uses __write to print an error
> message if the 64bit cmpxchg method is not available in the kernel.
> 
> __write is not part of the public libc abi, so use write instead.
> (user code may define write in iso c conforming mode and then the
> error message may not be visible before the crash.)
> 
> The return type in the declaration of write is fixed too.
> 
> OK for trunk and backporting?
> 
> libgcc/ChangeLog:
> 
> 2015-11-06  Szabolcs Nagy  
> 
> PR target/68059
> * config/arm/linux-atomic-64bit.c (__write): Rename to...
> (write): ...this and fix the return type.
> 

OK.

R.

Re: [PATCH][AArch64][v2] Improve comparison with complex immediates followed by branch/cset

2015-11-23 Thread James Greenhalgh

On Mon, Nov 23, 2015 at 10:33:01AM +, Kyrill Tkachov wrote:
> 
> On 12/11/15 12:05, James Greenhalgh wrote:
> >On Tue, Nov 03, 2015 at 03:43:24PM +, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>Bootstrapped and tested on aarch64.
> >>
> >>Ok for trunk?
> >Comments in-line.
> >
> 
> Here's an updated patch according to your comments.
> Sorry it took so long to respin it, had other things to deal with with
> stage1 closing...
> 
> I've indented the sample code sequences and used valid mnemonics.
> These patterns can only match during combine, so I'd expect them to always
> split during combine or immediately after, but I don't think that's a 
> documented
> guarantee so I've gated them on !reload_completed.
> 
> I've used IN_RANGE in the predicate.md hunk and added scan-assembler checks
> in the tests.
> 
> Is this ok?
> 
> Thanks,
> Kyrill
> 
> 2015-11-20  Kyrylo Tkachov  
> 
> * config/aarch64/aarch64.md (*condjump): Rename to...
> (condjump): ... This.
> (*compare_condjump): New define_insn_and_split.
> (*compare_cstore_insn): Likewise.
> (*cstore_insn): Rename to...
> (cstore_insn): ... This.
> * config/aarch64/iterators.md (CMP): Handle ne code.
> * config/aarch64/predicates.md (aarch64_imm24): New predicate.
> 
> 2015-11-20  Kyrylo Tkachov  
> 
> * gcc.target/aarch64/cmpimm_branch_1.c: New test.
> * gcc.target/aarch64/cmpimm_cset_1.c: Likewise.
> 
> commit bb44feed4e6beaae25d9bdffa45073dc61c65838
> Author: Kyrylo Tkachov 
> Date:   Mon Sep 21 10:56:47 2015 +0100
> 
> [AArch64] Improve comparison with complex immediates
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 11f6387..3e57d08 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -372,7 +372,7 @@ (define_expand "mod3"
>}
>  )
>  
> -(define_insn "*condjump"
> +(define_insn "condjump"
>[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
>   [(match_operand 1 "cc_register" "") (const_int 0)])
>  (label_ref (match_operand 2 "" ""))
> @@ -397,6 +397,41 @@ (define_insn "*condjump"
> (const_int 1)))]
>  )
>  
> +;; For a 24-bit immediate CST we can optimize the compare for equality
> +;; and branch sequence from:
> +;;   mov x0, #imm1
> +;;   movkx0, #imm2, lsl 16 /* x0 contains CST.  */
> +;;   cmp x1, x0
> +;;   b .Label
> +;; into the shorter:
> +;;   sub x0, x0, #(CST & 0xfff000)
> +;;   subsx0, x0, #(CST & 0x000fff)

  sub x0, x1, #(CST)  ?

The transform doesn't make sense otherwise.

> +;;   b .Label
> +(define_insn_and_split "*compare_condjump"
> +  [(set (pc) (if_then_else (EQL
> +   (match_operand:GPI 0 "register_operand" "r")
> +   (match_operand:GPI 1 "aarch64_imm24" "n"))
> +(label_ref:P (match_operand 2 "" ""))
> +(pc)))]
> +  "!aarch64_move_imm (INTVAL (operands[1]), mode)
> +   && !aarch64_plus_operand (operands[1], mode)
> +   && !reload_completed"
> +  "#"
> +  "&& true"
> +  [(const_int 0)]
> +  {
> +HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
> +HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
> +rtx tmp = gen_reg_rtx (mode);
> +emit_insn (gen_add3 (tmp, operands[0], GEN_INT (-hi_imm)));
> +emit_insn (gen_add3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
> +rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
> +rtx cmp_rtx = gen_rtx_fmt_ee (, mode, cc_reg, const0_rtx);
> +emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
> +DONE;
> +  }
> +)
> +
>  (define_expand "casesi"
>[(match_operand:SI 0 "register_operand" ""); Index
> (match_operand:SI 1 "const_int_operand" "")   ; Lower bound
> @@ -2901,7 +2936,7 @@ (define_expand "cstore4"
>"
>  )
>  
> -(define_insn "*cstore_insn"
> +(define_insn "aarch64_cstore"
>[(set (match_operand:ALLI 0 "register_operand" "=r")
>   (match_operator:ALLI 1 "aarch64_comparison_operator"
>[(match_operand 2 "cc_register" "") (const_int 0)]))]
> @@ -2910,6 +2945,40 @@ (define_insn "*cstore_insn"
>[(set_attr "type" "csel")]
>  )
>  
> +;; For a 24-bit immediate CST we can optimize the compare for equality
> +;; and branch sequence from:
> +;;   mov x0, #imm1
> +;;   movkx0, #imm2, lsl 16 /* x0 contains CST.  */
> +;;   cmp x1, x0
> +;;   csetx2, 
> +;; into the shorter:
> +;;   sub x0, x0, #(CST & 0xfff000)
> +;;   subsx0, x0, #(CST & 0x000fff)
> +;;   cset x1, .

Please fix the register allocation in your shorter sequence, these
are not equivalent.

> +(define_insn_and_split "*compare_cstore_insn"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +  (EQL:GPI (match_operand:GPI 1 "register_operand" "r")
> +   (match_operand:GPI 2 "aarch64_imm24" "n")))]
> +  "!aarch64_move_imm (INTVAL (operands[2]), mode)
> +   && !aarch64_p

Re: [PATCH][AArch64] Documentation fix for -fpic

2015-11-23 Thread Szabolcs Nagy


On 12/11/15 11:30, Szabolcs Nagy wrote:

The documentation for -fpic and -fPIC explicitly mentions some targets
where the difference matters, but not AArch64.  Specifying the GOT size
limit is not entirely correct as it can depend on the -mcmodel setting,
but probably better than leaving the impression that -fpic vs -fPIC does
not matter on AArch64.



ping.


ChangeLog:

2015-11-12  Szabolcs Nagy  

 * doc/invoke.texi (-fpic): Add the AArch64 limit.
 (-fPIC): Add AArch64.

Re: [PATCH, ARM, v2] PR target/68059 libgcc should not use __write for printing fatal error

2015-11-23 Thread Szabolcs Nagy


On 06/11/15 13:32, Szabolcs Nagy wrote:

libgcc/config/arm/linux-atomic-64bit.c uses __write to print an error
message if the 64bit cmpxchg method is not available in the kernel.

__write is not part of the public libc abi, so use write instead.
(user code may define write in iso c conforming mode and then the
error message may not be visible before the crash.)

The return type in the declaration of write is fixed too.

OK for trunk and backporting?



ping.


libgcc/ChangeLog:

2015-11-06  Szabolcs Nagy  

 PR target/68059
 * config/arm/linux-atomic-64bit.c (__write): Rename to...
 (write): ...this and fix the return type.

Re: [PATCH][AArch64] PR target/68363 Check that argument is real INSN in aarch64_madd_needs_nop

2015-11-23 Thread James Greenhalgh

On Mon, Nov 16, 2015 at 01:21:48PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> This RTL checking ICE occurs when processing an rtx_insn* in
> aarch64_madd_needs_nop that apparently holds JUMP_TABLE_DATA. This shouldn't
> be passed to recog. So instead reject it with the INSN_P check.  Such
> rtx_insns are not relevant to the code in aarch64_madd_needs_nop anyway.
> 
> Bootstrapped and tested on trunk, GCC 5 and 4.9 configured with
> --enable-fix-cortex-a53-835769 --enable-checking=yes,rtl.
> 
> Ok for all branches? (the testcase passes on 4.8, presumably before the
> conversion to rtx_insn)
> 
> 2015-11-16  Kyrylo Tkachov  
> 
> PR target/68363
> * config/aarch64/aarch64.c (aarch64_madd_needs_nop): Reject arguments
> that are not INSN_P.
> 
> 2015-11-16  Kyrylo Tkachov  
> 
> PR target/68363
> * gcc.target/aarch64/pr68363_1.c: New test.


OK.

Thanks,
James

Re: [hsa] Use new format of device-specific target arguments

2015-11-23 Thread Jakub Jelinek

On Mon, Nov 23, 2015 at 03:35:48PM +0100, Martin Jambor wrote:
> +/* If the value is directly embeded in target argument, it should be a 16-bit
> +   at most and shifted by tis many bits.  */

this

Jakub

[hsa] Perform version checks in HSA plugin

2015-11-23 Thread Martin Jambor

Hi,

the following patch against the HSA branch makes it call
GOMP_offload_register_ver and GOMP_offload_unregister_ver as opposed
to the routines without version information and adds a version check
to the libgomp plugin along the lines other plugins do it.

Committed to the branch, any feedback welcome,

Martin


2015-11-23  Martin Jambor  

gcc/
* builtin-types.def (BT_FN_VOID_PTR_INT_PTR): Removed.
(BT_FN_VOID_UINT_PTR_INT_PTR): New.
* fortran/types.def (BT_FN_VOID_PTR_INT_PTR): Removed.
(BT_FN_VOID_UINT_PTR_INT_PTR): New.
* hsa-brig.c: Include gomp-constants.
(hsa_output_libgomp_mapping): Add version arguments to to registration
and unregistration calls.
* omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): Change to refer
to functions with versions.
(BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise.

include/
* gomp-constants.h (GOMP_VERSION_HSA): New.

libgomp/
* plugin/plugin-hsa.c (GOMP_OFFLOAD_load_image): Check version.
(GOMP_OFFLOAD_unload_image): Likewise.

---
 gcc/builtin-types.def   |  3 ++-
 gcc/fortran/types.def   |  3 ++-
 gcc/hsa-brig.c  | 19 ---
 gcc/omp-builtins.def|  9 +
 include/gomp-constants.h|  1 +
 libgomp/plugin/plugin-hsa.c | 21 +
 6 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 251c980..8dcf3a6 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -450,7 +450,6 @@ DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONG_ULONG_ULONGPTR, 
BT_BOOL, BT_ULONG,
 BT_ULONG, BT_PTR_ULONG)
 DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONGLONG_ULONGLONG_ULONGLONGPTR, BT_BOOL,
 BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG)
-DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_PTR, BT_VOID, BT_PTR, BT_INT, BT_PTR)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
@@ -479,6 +478,8 @@ DEF_FUNCTION_TYPE_4 
(BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
 BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG,
 BT_PTR_ULONGLONG)
+DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
+BT_INT, BT_PTR)
 
 DEF_FUNCTION_TYPE_5 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VALIST_ARG,
 BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING,
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index d5f44ab..283eaf4 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -145,7 +145,6 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_I2, BT
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, 
BT_I16, BT_INT)
-DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_PTR, BT_VOID, BT_PTR, BT_INT, BT_PTR)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_VOID_OMPFN_PTR_UINT_UINT,
  BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT)
@@ -160,6 +159,8 @@ DEF_FUNCTION_TYPE_4 
(BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
 BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG,
 BT_PTR_ULONGLONG)
+DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
+BT_INT, BT_PTR)
 
 DEF_FUNCTION_TYPE_5 (BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT,
 BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT,
diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index f47e9c3..b687cc5 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "print-tree.h"
 #include "symbol-summary.h"
 #include "hsa.h"
+#include "gomp-constants.h"
 
 #define BRIG_ELF_SECTION_NAME ".brig"
 #define BRIG_LABEL_STRING "hsa_brig"
@@ -2216,10 +2217,12 @@ hsa_output_libgomp_mapping (tree brig_decl)
   gcc_checking_assert (offload_register);
 
   append_to_statement_list
-(build_call_expr (offload_register, 3,
+(build_call_expr (offload_register, 4,
+ build_int_cstu (unsigned_type_node,
+ GOMP_VERSION_PACK (GOMP_VERSION,
+GOMP_VERSION_HSA)),
  build_fold_addr_expr (hsa_libgomp_host_table),
- /* 7 stands for HSA.  */
- build_int_cst (integer_type_node, 7),
+ build_int_cst (integer_type_node, GOMP_DEVICE_HSA),
  build_fold_addr_expr (hsa_img_descriptor)),
  &hsa_ctor_statements);
 
@@ -2230,10 +2233,12 @@ hsa_output_libgomp_mapping

Re: [PATCH 0/6] Another fixes of various memory leaks

2015-11-23 Thread Bernd Schmidt


On 11/23/2015 02:49 PM, marxin wrote:

Following series has been just bootregtested on x86_64-linux-gnu
(all patches together).


All ok except 5/6 which I'm not finding obvious. Better to have a 
cilk/c++ person have a look.


In the future, a few more explanations would help with reviewing. Let's 
say for 4/6, how does the leak occur?


Some changes appear beneficial but unnecessary (converting explicitly 
released vecs to auto_vecs), and:



 static vec *
-create_array_refs (location_t loc, vec > an_info,
+create_array_refs (location_t loc, const vec > &an_info,
   vec an_loop_info, size_t size,  size_t rank)


How does this help prevent leaks? In general we don't want non-bugfixes 
at this stage.



Bernd

1 2 >

1 - 100 of 183 matches

Mail list logo