Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to misalign in vect_hw_misalign (PR 78421)

2017-09-25 Thread Christophe Lyon
On 25 September 2017 at 20:19, Mike Stump  wrote:
> On Sep 23, 2017, at 10:52 AM, Christophe Lyon  
> wrote:
>> The attached patch would apply after reverting yours.
>> I've applied it against r253072 (just before your patch) and the
>> results are visible at:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/253072-hw-misalign2/report-build-info.html
>>
>> Do they match your expectations? It looks like a few testcases need to
>> be adjusted,
>> or new bugs are uncovered.
>>
>> Is the patch OK with a suitable ChangeLog entry?
>
> So, I wasn't sure what you meant by the comment.  Below is an example of ! 
> working.
>
> % proc b {} { return 0 }
> % set v [expr ![b]]
> 1
> % set v [ expr !![b]]
> 0
> % proc b {} { return 1 }
> % set v [expr ![b]]
> 0
> % set v [ expr !![b]]
> 1
>
> Given that, if you want to use a simple set var value, you should just do 
> that directly.  [b] is the placeholder for an [] expression, if you want to 
> invert that.  v is the placeholder for the thing you want to set.  All the 
> other bits should be used as given.  Given that code, I'd be interested in 
> what you want to put in the comment, if any.
>
>   set et_vect_hw_misalign_saved($et_index) [expr ![check_effective_target_arm_vect_no_misalign]]
>
> seems like what you want, does that work?
>
Yes, thanks! I was missing the 'expr' part.

Here is what I have committed (r253187), to avoid further noise in the results.


2017-09-26  Christophe Lyon  

   * lib/target-supports.exp (check_effective_target_vect_hw_misalign):
   Fix arm check.


* lib/target-supports.exp
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 253186)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -5951,7 +5951,7 @@
  set et_vect_hw_misalign_saved($et_index) 1
}
if { [istarget arm*-*-*] } {
-   set et_vect_hw_misalign_saved($et_index) ![check_effective_target_arm_vect_no_misalign]
+   set et_vect_hw_misalign_saved($et_index) [expr ![check_effective_target_arm_vect_no_misalign]]
}
 }
 verbose "check_effective_target_vect_hw_misalign:\


Thanks,
Christophe


[Bug demangler/82195] Undemangleable lambda

2017-09-25 Thread nathan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82195

--- Comment #8 from Nathan Sidwell  ---
Author: nathan
Date: Tue Sep 26 02:38:12 2017
New Revision: 253186

URL: https://gcc.gnu.org/viewcvs?rev=253186&root=gcc&view=rev
Log:
PR demangler/82195
* cp-demangle.c (d_encoding): Strip return type when name is a
LOCAL_NAME.
(d_local_name): Strip return type of enclosing TYPED_NAME.
* testsuite/demangle-expected: Add and adjust tests.

Modified:
trunk/libiberty/ChangeLog
trunk/libiberty/cp-demangle.c
trunk/libiberty/testsuite/demangle-expected

[PING 4] [PATCH 3/4] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-09-25 Thread Martin Sebor

Ping #4: https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00912.html

On 09/19/2017 09:44 AM, Martin Sebor wrote:

Ping #3: https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00912.html

Thanks
Martin

On 08/28/2017 08:34 PM, Martin Sebor wrote:

Ping #2: https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00912.html

On 08/23/2017 01:46 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00912.html

Jeff, is this version good to commit or are there any other
changes you'd like to see?

Martin

On 08/14/2017 04:40 PM, Martin Sebor wrote:

On 08/10/2017 01:29 PM, Martin Sebor wrote:

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 016f68d..1aa9e22 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c

[ ... ]

+
+  if (TREE_CODE (type) == ARRAY_TYPE)
+{
+  /* Return the constant size unless it's zero (that's a zero-length
+ array likely at the end of a struct).  */
+  tree size = TYPE_SIZE_UNIT (type);
+  if (size && TREE_CODE (size) == INTEGER_CST
+  && !integer_zerop (size))
+return size;
+}

Q. Do we have a canonical test for the trailing array idiom?  In some
contexts isn't it size 1?  ISTM this test needs slight improvement.
Ideally we'd use some canonical test to detect the trailing array idiom
rather than open-coding it here.  You might look at the array index
warnings in tree-vrp.c to see if it's got a canonical test you can call
or factor and use.


You're right, there is an API for this (array_at_struct_end_p,
as Richard pointed out).  I didn't want to use it because it
treats any array at the end of a struct as a flexible array
member, but simple tests show that that's what
-Wstringop-overflow does now, and it wasn't my intention to
tighten up the checking under this change.  It surprises me that
no tests exposed this.  Let me relax the check and think about
proposing to tighten it up separately.


Done in the attached patch.  (I opened bug 81849 for the enhancement
to have -Wstringop-overflow diagnose overflows when writing to member
arrays bigger than 1 element even if they're last).

(I've left the handling for zero size in place because GCC allows
global arrays to be declared to have zero elements.)
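
For readers less familiar with the idiom under discussion, here is a minimal sketch in plain C (user code, not GCC internals) of a trailing array used as variable-length storage.  The names `struct msg` and `msg_new` are made up for illustration.  It uses a C99 flexible array member; older code declares the member with size 1 or, as a GNU extension, size 0, which is why a check keyed only on the declared size can misjudge how much room the array really has.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message type: TEXT is a trailing array whose real
   capacity is decided at allocation time, not by its declaration.  */
struct msg
{
  size_t len;
  char text[];          /* C99 flexible array member */
};

/* Allocate a message holding a copy of S, including its terminating
   nul.  Returns NULL if the allocation fails.  */
static struct msg *
msg_new (const char *s)
{
  assert (s != NULL);

  size_t len = strlen (s);
  struct msg *m = malloc (sizeof *m + len + 1);
  if (m == NULL)
    return NULL;

  m->len = len;
  /* The room for these bytes comes from the malloc call above,
     not from the declaration of TEXT.  */
  memcpy (m->text, s, len + 1);
  return m;
}
```

A caller would use `msg_new ("hello")` and later release the object with `free`.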




@@ -3883,6 +3920,30 @@ expand_builtin_strncat (tree exp, rtx)
   return NULL_RTX;
 }

+/* Helper to check the sizes of sequences and the destination of calls
+   to __builtin_strncpy (DST, SRC, LEN) and __builtin___strncpy_chk.
+   Returns true on success (no overflow warning), false otherwise.  */
+
+static bool
+check_strncpy_sizes (tree exp, tree dst, tree src, tree len)
+{
+  tree dstsize = compute_objsize (dst, warn_stringop_overflow - 1);
+
+  if (!check_sizes (OPT_Wstringop_overflow_,
+exp, len, /*maxlen=*/NULL_TREE, src, dstsize))
+return false;
+
+  if (!dstsize || TREE_CODE (len) != INTEGER_CST)
+return true;
+
+  if (tree_int_cst_lt (dstsize, len))
+warning_at (EXPR_LOCATION (exp), OPT_Wstringop_truncation,
+"%K%qD specified bound %E exceeds destination size %E",
+exp, get_callee_fndecl (exp), len, dstsize);
+
+  return true;

So in the case where you issue the warning, what should the return value
be?  According to the comment it should be false.  It looks like you got
the wrong return value for the tree_int_cst_lt (dstsize, len) test.


Corrected.  The return value is unused by the only caller so
there is no test to exercise it.


Done in the attached patch.


+/* A helper of handle_builtin_stxncpy.  Check to see if the specified
+   bound is a) equal to the size of the destination DST and if so, b)
+   if it's immediately followed by DST[LEN - 1] = '\0'.  If a) holds
+   and b) does not, warn.  Otherwise, do nothing.  Return true if
+   diagnostic has been issued.
+
+   The purpose is to diagnose calls to strncpy and stpncpy that do
+   not nul-terminate the copy while allowing for the idiom where
+   such a call is immediately followed by setting the last element
+   to nul, as in:
+ char a[32];
+ strncpy (a, s, sizeof a);
+ a[sizeof a - 1] = '\0';
+*/

So using gsi_next to find the next statement could make the heuristic
fail to find the a[sizeof a - 1] = '\0'; statement when debugging is
enabled.

gsi_next_nondebug would be better as it would skip over any debug
insns.


Thanks.  I'll have to remember this.


I went with this simple approach for now since it worked for GDB.
If it turns out that there are important instances of this idiom
that rely on intervening statements the warning can be relaxed.
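
To make the idiom concrete, here is a small standalone C sketch (ordinary user code, not part of the patch).  `copy_raw`, `copy_terminated` and `has_nul` are hypothetical helper names.  It shows why a strncpy call whose bound equals the destination size can leave the buffer unterminated, and how the assignment that follows repairs that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The pattern the warning targets: when SRC has DSTSIZE or more
   characters, strncpy fills DST completely and writes no nul.  */
static void
copy_raw (char *dst, size_t dstsize, const char *src)
{
  strncpy (dst, src, dstsize);
}

/* The accepted idiom: the same call, immediately followed by a store
   of nul to the last element, so DST is always a valid string.  */
static void
copy_terminated (char *dst, size_t dstsize, const char *src)
{
  assert (dstsize > 0);

  strncpy (dst, src, dstsize);
  dst[dstsize - 1] = '\0';
}

/* Return nonzero if the first N bytes of BUF contain a nul.  */
static int
has_nul (const char *buf, size_t n)
{
  return memchr (buf, '\0', n) != NULL;
}
```

With a 4-byte destination and the source "abcdef", `copy_raw` leaves no nul in the buffer, while `copy_terminated` leaves the string "abc".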


What might be even better would be to use the immediate uses of the
memory tag.  For your case there should be only one immediate use and it
should point to the statement which NUL terminates the destination.  Or
maybe that would be worse in that you only want to allow this exception
when the statements are consecutive.



 /* Handle a memcpy-like ({mem{,p}cpy,__mem{,p}cpy_chk}) call.
If strlen of the second argument is known and length of the third argument
is that plus one, 

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-25 Thread Jan Hubicka
> Hi Honza,
> Could you please have a look at this patch ?
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02063.html

I can and I should have done so a long time ago.  I really apologize for the slow response
and I will try to be more timely from now on.  The reason was that I had some
patches that I was thinking I would like to push out first, but I guess since
they are still not ready it is better to go the other way around.

+/* A map from node to subset of callees. The subset contains those callees
+ * whose return-value is returned by the node. */
+static hash_map< cgraph_node *, vec* > *return_callees_map;

Extra * at the beginning of the line.  It would make more sense to put those
and the other bits into function_summary rather than using the hooks,
but that is something we can do incrementally.

I wonder what happens here when, say, ipa-icf redirects the call to an equivalent
function and removes the callee?  Perhaps we really want to have a set of call
sites rather than nodes stored from analysis to execution.  Call sites have
unique stmts and uids, so it will be possible to map them back and forth.

+static bool
+check_retval_uses (tree retval, gimple *stmt)
+{

A top-level comment is missing on those.

+/*
+ * Currently this function does a very conservative analysis to check if
+ * function could be a malloc candidate.
+ *
+ * The function is considered to be a candidate if
+ * 1) The function returns a value of pointer type.
+ * 2) SSA_NAME_DEF_STMT (return_value) is either a function call or
+ *a phi, and element of phi is either NULL or
+ *SSA_NAME_DEF_STMT(element) is function call.
+ * 3) The return-value has immediate uses only within comparisons (gcond or gassign)
+ *and return_stmt (and likewise a phi arg has immediate use only within comparison
+ *or the phi stmt).
+ */

Now the * at the beginning of lines.  Theoretically, by the coding standards, the comment
should start with a description of what the function does and what the parameters are.
I believe Richi already commented on this part, which is more his domain,
but it seems fine to me.
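
As a rough illustration of the three conditions quoted above, here are two small C functions.  This is only a sketch of the kind of source the analysis is aimed at; whether the pass as posted accepts or rejects exactly these functions is my assumption, and the names `alloc_buffer`, `alloc_and_remember` and `last_alloc` are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Plausible candidate: the returned pointer comes straight from a
   call, and its only other use is a comparison against NULL.  */
void *
alloc_buffer (size_t n)
{
  assert (n > 0);

  void *p = malloc (n);
  if (p == NULL)
    return NULL;
  return p;
}

/* Hypothetical global used to show a pointer that escapes.  */
void *last_alloc;

/* Presumably rejected: the pointer is also stored to a global, so
   the caller's copy is not the only reference to the new object.  */
void *
alloc_and_remember (size_t n)
{
  void *p = malloc (n);
  last_alloc = p;
  return p;
}
```

In the second function the store to `last_alloc` is an immediate use that is neither a comparison nor the return statement, which is what condition 3 is meant to exclude.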

Perhaps with the -details dump it would be nice to dump the reason why the malloc
candidate was rejected.

+DEBUG_FUNCTION
+static void
+dump_malloc_lattice (FILE *dump_file, const char *s)

+static void
+propagate_malloc (void)

For coding standards, please add block comments.

With these changes the patch looks good to me!
Honza

> 
> I tested it with SPEC2006 on AArch64 Cortex-a57 processor and saw some
> improvement for
> 433.milc (+1.79%), 437.leslie3d (+2.84%) and 470.lbm (+4%) and not
> much differences for other benchmarks.
> I don't expect them to be precise though, it was run with only one
> iteration of SPEC.
> Thanks!
> 
> Regards,
> Prathamesh


[Bug bootstrap/81037] Xcode 9 requires back ports on gcc-5-branch for bootstrapping under Xcode 9

2017-09-25 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81037

--- Comment #13 from Iain Sandoe  ---
Author: iains
Date: Mon Sep 25 23:49:58 2017
New Revision: 253181

URL: https://gcc.gnu.org/viewcvs?rev=253181&root=gcc&view=rev
Log:
[Patch, Darwin]  Fix 81037 by adjusting headers

2017-09-26  Iain Sandoe  
Ryan Mounce  

PR bootstrap/81037
Backport from mainline r235362
2016-04-22  Szabolcs Nagy  

* system.h (list, map, set, vector): Include conditionally.
* auto-profile.c (INCLUDE_MAP, INCLUDE_SET): Define.
* graphite-isl-ast-to-gimple.c (INCLUDE_MAP): Define.
* ipa-icf.c (INCLUDE_LIST): Define.
* ipa-icf-gimple.c (INCLUDE_LIST): Define.
* config/sh/sh.c (INCLUDE_VECTOR): Define.
* config/sh/sh_treg_combine.cc (INCLUDE_ALGORITHM): Define.
(INCLUDE_LIST, INCLUDE_VECTOR): Define.
* fortran/trans-common.c (INCLUDE_MAP): Define.

Backport from mainline r235361
2016-04-22  Szabolcs Nagy  

* auto-profile.c: Remove  include.
* diagnostic.c: Remove  include.
* genmatch.c: Likewise.
* pretty-print.c: Likewise.
* toplev.c: Likewise
* c/c-objc-common.c: Likewise.
* cp/error.c: Likewise.
* fortran/error.c: Likewise.



Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/auto-profile.c
branches/gcc-5-branch/gcc/c/c-objc-common.c
branches/gcc-5-branch/gcc/config/sh/sh.c
branches/gcc-5-branch/gcc/config/sh/sh_treg_combine.cc
branches/gcc-5-branch/gcc/cp/error.c
branches/gcc-5-branch/gcc/diagnostic.c
branches/gcc-5-branch/gcc/fortran/error.c
branches/gcc-5-branch/gcc/fortran/trans-common.c
branches/gcc-5-branch/gcc/genmatch.c
branches/gcc-5-branch/gcc/graphite-isl-ast-to-gimple.c
branches/gcc-5-branch/gcc/ipa-icf-gimple.c
branches/gcc-5-branch/gcc/ipa-icf.c
branches/gcc-5-branch/gcc/pretty-print.c
branches/gcc-5-branch/gcc/system.h
branches/gcc-5-branch/gcc/toplev.c

[PATCH PR79868 ][aarch64] Fix error calls in aarch64 code so they can be translated (version 2)

2017-09-25 Thread Steve Ellcey
This is a new version of my patch to fix PR target/79868, where some
error messages are impossible to translate correctly due to how the
strings are dynamically constructed.  It also includes some format
changes in the error messages to make the messages more consistent with
each other and with other GCC errors.  This was worked out with help
from Martin Sebor.  I also had to fix some tests to match the new error
string formats.
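
For anyone wondering why the spliced strings are a problem: a translator sees only the format string, so a word substituted through %s cannot be inflected or reordered to fit the target language.  Below is a toy C sketch of the approach taken in the patch, selecting one of two complete messages from a boolean.  It uses plain snprintf instead of GCC's error () and G_ () machinery, the message wording is illustrative, and `format_invalid_arch` is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Write the "invalid name" diagnostic for NAME into BUF.  Each branch
   is a complete sentence, so each can be translated as a unit; only
   the user-supplied NAME is substituted at run time.  Returns the
   value of snprintf.  */
static int
format_invalid_arch (char *buf, size_t size, bool is_pragma,
                     const char *name)
{
  assert (buf != NULL && name != NULL);

  const char *fmt = (is_pragma
                     ? "invalid name (\"%s\") in 'arch' pragma"
                     : "invalid name (\"%s\") in 'arch' attribute");
  return snprintf (buf, size, fmt, name);
}
```

The cost is one duplicated string per message, which is the trade the patch makes in each of the handlers listed below.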

Tested on AArch64 with no regressions.  OK to check in?

Steve Ellcey
sell...@cavium.com


2017-09-25  Steve Ellcey  

PR target/79868
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
Change argument type on aarch64_process_target_attr call.
* config/aarch64/aarch64-protos.h (aarch64_process_target_attr):
Change argument type.
* config/aarch64/aarch64.c (aarch64_attribute_info): Change
field type.
(aarch64_handle_attr_arch): Change argument type, use boolean
argument to use different strings in error calls.
(aarch64_handle_attr_cpu): Ditto.
(aarch64_handle_attr_tune): Ditto.
(aarch64_handle_attr_isa_flags): Ditto.
(aarch64_process_one_target_attr): Ditto.
(aarch64_process_target_attr): Ditto.
(aarch64_option_valid_attribute_p): Change argument type on
aarch64_process_target_attr call.


2017-09-25  Steve Ellcey  

PR target/79868
* gcc.target/aarch64/spellcheck_1.c: Update dg-error string to match
new format.
* gcc.target/aarch64/spellcheck_2.c: Ditto.
* gcc.target/aarch64/spellcheck_3.c: Ditto.
* gcc.target/aarch64/target_attr_11.c: Ditto.
* gcc.target/aarch64/target_attr_12.c: Ditto.
* gcc.target/aarch64/target_attr_17.c: Ditto.

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 177e638..c9945db 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -165,7 +165,7 @@ aarch64_pragma_target_parse (tree args, tree pop_target)
  information that it specifies.  */
   if (args)
 {
-  if (!aarch64_process_target_attr (args, "pragma"))
+  if (!aarch64_process_target_attr (args, true))
 	return false;
 
   aarch64_override_options_internal (&global_options);
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e67c2ed..4323e9e 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -445,7 +445,7 @@ bool aarch64_gen_adjusted_ldpstp (rtx *, bool, scalar_mode, RTX_CODE);
 
 void aarch64_init_builtins (void);
 
-bool aarch64_process_target_attr (tree, const char*);
+bool aarch64_process_target_attr (tree, bool);
 void aarch64_override_options_internal (struct gcc_options *);
 
 rtx aarch64_expand_builtin (tree exp,
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c14008..122ed5e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -67,6 +67,7 @@
 #include "common/common-target.h"
 #include "selftest.h"
 #include "selftest-rtl.h"
+#include "intl.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -9554,15 +9555,15 @@ struct aarch64_attribute_info
   const char *name;
   enum aarch64_attr_opt_type attr_type;
   bool allow_neg;
-  bool (*handler) (const char *, const char *);
+  bool (*handler) (const char *, bool);
   enum opt_code opt_num;
 };
 
 /* Handle the ARCH_STR argument to the arch= target attribute.
-   PRAGMA_OR_ATTR is used in potential error messages.  */
+   IS_PRAGMA is used in potential error messages.  */
 
 static bool
-aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_arch (const char *str, bool is_pragma)
 {
   const struct processor *tmp_arch = NULL;
   enum aarch64_parse_opt_result parse_res
@@ -9579,15 +9580,22 @@ aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
   switch (parse_res)
 {
   case AARCH64_PARSE_MISSING_ARG:
-	error ("missing architecture name in 'arch' target %s", pragma_or_attr);
+	error (is_pragma
+	   ? G_("missing name in % pragma")
+	   : G_("missing name in % attribute"));
 	break;
   case AARCH64_PARSE_INVALID_ARG:
-	error ("unknown value %qs for 'arch' target %s", str, pragma_or_attr);
+	error (is_pragma
+	   ? G_("invalid name (\"%s\") in % pragma")
+	   : G_("invalid name (\"%s\") in % attribute"),
+	   str);
 	aarch64_print_hint_for_arch (str);
 	break;
   case AARCH64_PARSE_INVALID_FEATURE:
-	error ("invalid feature modifier %qs for 'arch' target %s",
-	   str, pragma_or_attr);
+	error (is_pragma
+	   ? G_("invalid value (\"%s\") in % pragma")
+	   : G_("invalid value (\"%s\") in % attribute"),
+	   str);
 	break;
   default:
 	gcc_unreachable ();
@@ -9597,10 

[committed][PATCH] Stack clash protection 06/08 - V5

2017-09-25 Thread Jeff Law
Final committed patch that addresses the minor comments from Segher for
the archives.  Bootstrapped and regression tested on ppc64le.

As with the other patches for stack-clash protection, I'll be here to
deal with any fallout, particularly on other ppc platforms such as AIX.

Addressing Bernhard's comments is next in the queue.

Jeff
commit cc69924dbcb8e67e9ba6783e0c757500938566a2
Author: law 
Date:   Mon Sep 25 23:13:55 2017 +

* config/rs6000/rs6000-protos.h (output_probe_stack_range): Update
prototype for new argument.
* config/rs6000/rs6000.c (rs6000_emit_allocate_stack_1): New function,
mostly extracted from rs6000_emit_allocate_stack.
(rs6000_emit_probe_stack_range_stack_clash): New function.
(rs6000_emit_allocate_stack): Call
rs6000_emit_probe_stack_range_stack_clash as needed.
(rs6000_emit_probe_stack_range): Add additional argument
to call to gen_probe_stack_range{si,di}.
(output_probe_stack_range): New.
(output_probe_stack_range_1): Renamed from output_probe_stack_range.
(output_probe_stack_range_stack_clash): New.
(rs6000_emit_prologue): Emit notes into dump file as requested.
* rs6000.md (allocate_stack): Handle -fstack-clash-protection.
(probe_stack_range): Operand 0 is now early-clobbered.
Add additional operand and pass it to output_probe_stack_range.

* lib/target-supports.exp
(check_effective_target_supports_stack_clash_protection): Enable for
rs6000 and powerpc targets.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253179 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6fbab0123e1..962689bd241 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,22 @@
+2017-09-25  Jeff Law  
+
+   * config/rs6000/rs6000-protos.h (output_probe_stack_range): Update
+   prototype for new argument.
+   * config/rs6000/rs6000.c (rs6000_emit_allocate_stack_1): New function,
+   mostly extracted from rs6000_emit_allocate_stack.
+   (rs6000_emit_probe_stack_range_stack_clash): New function.
+   (rs6000_emit_allocate_stack): Call
+   rs6000_emit_probe_stack_range_stack_clash as needed.
+   (rs6000_emit_probe_stack_range): Add additional argument
+   to call to gen_probe_stack_range{si,di}.
+   (output_probe_stack_range): New.
+   (output_probe_stack_range_1): Renamed from output_probe_stack_range.
+   (output_probe_stack_range_stack_clash): New.
+   (rs6000_emit_prologue): Emit notes into dump file as requested.
+   * rs6000.md (allocate_stack): Handle -fstack-clash-protection.
+   (probe_stack_range): Operand 0 is now early-clobbered.
+   Add additional operand and pass it to output_probe_stack_range.
+
 2017-09-25  Bin Cheng  
 
PR tree-optimization/82163
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 3f86aba947e..781349b850e 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -128,7 +128,7 @@ extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
 extern void rs6000_emit_cbranch (machine_mode, rtx[]);
 extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
-extern const char * output_probe_stack_range (rtx, rtx);
+extern const char * output_probe_stack_range (rtx, rtx, rtx);
 extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
 extern bool rs6000_emit_set_const (rtx, rtx);
 extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3c01546658f..f64a091034a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -25528,6 +25528,221 @@ rs6000_emit_stack_tie (rtx fp, bool hard_frame_needed)
   emit_insn (gen_stack_tie (gen_rtx_PARALLEL (VOIDmode, p)));
 }
 
+/* Allocate SIZE_INT bytes on the stack using a store with update style insn
+   and set the appropriate attributes for the generated insn.  Return the
+   first insn which adjusts the stack pointer or the last insn before
+   the stack adjustment loop. 
+
+   SIZE_INT is used to create the CFI note for the allocation.
+
+   SIZE_RTX is an rtx containing the size of the adjustment.  Note that
+   since stacks grow to lower addresses its runtime value is -SIZE_INT.
+
+   ORIG_SP contains the backchain value that must be stored at *sp.  */
+
+static rtx_insn *
+rs6000_emit_allocate_stack_1 (HOST_WIDE_INT size_int, rtx orig_sp)
+{
+  rtx_insn *insn;
+
+  rtx size_rtx = GEN_INT (-size_int);
+  if (size_int > 32767)
+{
+  rtx tmp_reg = gen_rtx_REG (Pmode, 0);
+  /* Need a note here so that try_split doesn't get confused.  */
+  if (get_last_insn () == NULL_RTX)

[Bug testsuite/82324] Problem in new trunk test case gfortran.dg/promotion_4.f90

2017-09-25 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82324

--- Comment #1 from kargl at gcc dot gnu.org ---
Does this fix the problem?

Index: promotion_3.f90
===
--- promotion_3.f90 (revision 253178)
+++ promotion_3.f90 (working copy)
@@ -1,5 +1,6 @@
 ! { dg-do run }
 ! { dg-options "-fdefault-real-16" }
+! { dg-require-effective-target fortran_large_real }
 !
 ! PR 82143: add a -fdefault-real-16 flag
 !
Index: promotion_4.f90
===
--- promotion_4.f90 (revision 253178)
+++ promotion_4.f90 (working copy)
@@ -1,5 +1,6 @@
 ! { dg-do run }
 ! { dg-options "-fdefault-real-10" }
+! { dg-require-effective-target fortran_large_real }
 !
 ! PR 82143: add a -fdefault-real-16 flag
 !

[Bug ada/80590] [8 regression] non-bootstrap build failure of Ada runtime

2017-09-25 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80590

--- Comment #11 from Eric Botcazou  ---
> That PR is now fixed. I've re-enabled ada in my test setup, and will see if
> I can still reproduce this failure.

It's a different problem since it's on Linux and the other was Darwin-specific.

[Bug c++/82325] New: worse code generated compared to clang when using a constexpr array

2017-09-25 Thread dvd at gnx dot it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82325

Bug ID: 82325
   Summary: worse code generated compared to clang when using a
constexpr array
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dvd at gnx dot it
  Target Milestone: ---

While testing some functions I'm writing for a deflate compressor I've noticed
that the following code is translated differently between gcc 7.2 and clang 5.0

#include <array>
#include <cstddef>
#include <cstdint>

struct code_value
{
    uint16_t base;
    uint8_t bits;
};

constexpr std::array<code_value, 29> al = {{
    {  3, 0}, {  4, 0}, {  5, 0}, {  6, 0}, {  7, 0}, {  8, 0}, {  9, 0}, { 10, 0},
    { 11, 1}, { 13, 1}, { 15, 1}, { 17, 1},
    { 19, 2}, { 23, 2}, { 27, 2}, { 31, 2},
    { 35, 3}, { 43, 3}, { 51, 3}, { 59, 3},
    { 67, 4}, { 83, 4}, { 99, 4}, {115, 4},
    {131, 5}, {163, 5}, {195, 5}, {227, 5},
    {258, 0}
}};


code_value f(int v) {
    size_t index = 0;
    while (index < al.size()) {
        auto mi = al[index].base;
        auto mx = al[index].base + (1 << al[index].bits);
        if (mi <= v && v < mx)
            break;
        index++;
    }
    return al[index];
}

On gcc (with -O3 and -funroll-loops) every iteration is (more or less):

.L4:
  movzx ecx, BYTE PTR al[2+rax*4]
  movzx r9d, WORD PTR al[0+rax*4]
  mov r10d, esi
  sal r10d, cl
  add r10d, r9d
  cmp r10d, edi
  jle .L5
  cmp r9d, edi
  jle .L2

while on clang 

.LBB0_4:
  cmp edi, 13
  jge .LBB0_6
  mov eax, 8
  mov eax, dword ptr [4*rax + al]
  ret

It looks like the latter is able to infer at compile time the values of
`al[index].base + (1 << al[index].bits);`

godbolt link:
https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAKxAEZSBnVAV2OUxAHIBSAJgGY8AO2QAbZlgDU3fgGEAhsWLyAnjOzcADAEEt2hgWLNkBSWiwB9AG7zxmPdwDsAIQeaAnM2EFaANgumAEbyDPb8rjpant4AHAGSgYQMMhG6jgAiKW7aaEIGmAAeAA7EkgboICCKympy5pjWtsyYpJK87uqSttL86dIuTqlRg5KS/K2aTumtI5IALBNTMy6jAKyLGcvOo74b0/3bko57W6MxJwej7heztJonbh0rtPSStEuXtONvHyO06z9Np9joDpo8/tc2r8VgJWrxodteCD4UCRvxXiiwZEPGiAfwEfNvvjUStVq9ifsRqtIRTSOCVr4QXMCTFvsySdt3JD2ZSXC8ATy6dins4vq9Vgi/N8JRzaO4ATLec5eEjWoqhbocS5eKtzpJJhkHIbDeFsnp6o07JIAGYQbySKwASgOjwYeAAXg1TMIsAUen1JqbhQB3BB4USYSR2oS%2BnqyLqiAB0bs9EEdzsG4I88mYREkAFs8P6E9xVs4fYVS%2BlE8FQllhVEc3n836ZH1bKXyzHK6tq7XI3xthBaHGZPGO2WKwUqzWko765qOh48Nao4XR70HdJeL4%2BL4t2OCwV5w2cR5AsRMPIANYL4bL7vT3iuZ93rWZYWXgisIQlyePqs7wyThHVIUQuFWThSCELhNCg1AuFkQdBzKFg2AHARaCgghYJA0DrxAfh%2BETIjSLI8jdnAzg5igmDODg0gEM4KCGBAe4cPokDSDgWAkDQfMinDTAyAoCB%2BMEiNiBAYBHF4UhrXDAhhNYiBAlw0hEiERQVC4LDSH4/NMCEAgAHkhFEHTONILB83kIRgAjdT8EvEw8CsTBWKswpMGQXMOE4PTvEwKiGMMPB81w0DRDwQJWMgUDUCKAg8FQPIuAAWnKNtkGQ59aHdSR0pM/hCutZghGIVBRFEdLRFQRKGBYtD2DoKKINo9SmIKGJfHS3w5kkYBkGQI5E14KNcEIEht3RVpZFQAShNKPh0UdbDIvwwjiPInbSMorgaOgzquBYtjSA4uDQJ4xAUAWiThPIShxKWlBRDs4BVk0e4FNEJTiBUtSrM07TdKggyjNM8zLIYmz3scqznJ85L3M8hjvN8pTQfIYzgvUsKIs4qKYritNGKSlK0s4TKCHQbLctFAqiv4JrWBa2g2s4SCjqsrqer6gaxHeyRVkTTRRYm/AiGWzC5rupaZt4Nbzo20gCKIkjdp2sCDo6nmTsYM6Lrw7XOF4XWGKY9bCdA9z/opkA5iAA%3D

[Bug target/79041] aarch64 backend emits R_AARCH64_ADR_PREL_PG_HI21 relocation despite -mpc-relative-literal-loads option being used

2017-09-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79041

--- Comment #14 from Andrew Pinski  ---
(In reply to Wilco from comment #13)
> It doesn't run in the testsuite with -fpic, so is it a problem?

I run the testsuite with RUNTESTFLAGS='--target_board=unix/\{,-fpic\}' and this
testcase fails when -fpic is selected.  Note this is done as we don't have
enough coverage for -fpic code in the normal testsuite run; running the
testsuite this way has allowed me to file PIC related bugs in the past.

[PATCH v2,rs6000] Replace swap of a loaded vector constant with load of a swapped vector constant

2017-09-25 Thread Kelvin Nilsen

On Power8 little endian, two instructions are needed to load from the
natural in-memory representation of a vector into a vector register: a
load followed by a swap.  When the vector value to be loaded is a
constant, more efficient code can be achieved by swapping the
representation of the constant in memory so that only a load instruction
is required.
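
The equivalence being exploited can be checked with a toy model in C.  This is a sketch, not the rs6000 code: a vector is modeled as two 64-bit doublewords, `load_le` stands in for a little-endian vector load that delivers the doublewords in swapped order, and `swap_dw` for the swap instruction that follows it.  Those names, `vec128` and `check_swapped_constant` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 128-bit vector register: two 64-bit doublewords.  */
typedef struct { uint64_t dw[2]; } vec128;

/* Model of the load: the register receives the two doublewords of
   MEM in swapped order.  */
static vec128
load_le (const uint64_t mem[2])
{
  vec128 v = { { mem[1], mem[0] } };
  return v;
}

/* Model of the swap instruction: exchange the two doublewords.  */
static vec128
swap_dw (vec128 v)
{
  vec128 r = { { v.dw[1], v.dw[0] } };
  return r;
}

/* Check that load + swap of the constant MEM yields the same register
   contents as a bare load of a copy of MEM stored pre-swapped.  */
static void
check_swapped_constant (const uint64_t mem[2])
{
  const uint64_t swapped[2] = { mem[1], mem[0] };
  vec128 two_insns = swap_dw (load_le (mem));
  vec128 one_insn = load_le (swapped);

  assert (two_insns.dw[0] == one_insn.dw[0]);
  assert (two_insns.dw[1] == one_insn.dw[1]);
}
```

Since the constant's in-memory layout is chosen by the compiler, storing it pre-swapped costs nothing at run time and removes one instruction per load.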

This second version of the patch responds to feedback provided by Segher
Boessenkool, Bill Schmidt, and Pat Haugen. Thank you for the careful
reviews:

1. Revised comments in const_load_sequence_p function of rs6000-p8swap.c

2. Restructured nested if statements as a single if-statement with
   compound condition in const_load_sequence_p function of
   rs6000-p8swap.c

3. In replace_swapped_load_constant function of rs6000-p8swap.c,
   replaced two FOR_EACH_INSN_INFO_USE macro expansions with
   non-looping control structures.

4. Added comments and white space to replace_swapped_load_constant
   function of rs6000-p8swap.c to improve readability.

5. Reordered handling of cases in replace_swapped_load_constant
   function of rs6000-p8swap.c, moving V8HImode and V8HFmode
   handling above V4SImode handling.

6. Replaced gcc_assert (0) with gcc_unreachable () in
   replace_swapped_load_constant of rs6000-p8swap.c.

7. In rs6000_analyze_swaps function of rs6000-p8swap.c,
   added requirement that !pass2_insn_entry[i].is_store
   before calling const_load_sequence_p.

8. Removed unnecessary code blocks at end of
   rs6000_analyze_swaps function of rs6000-p8swap.c.

9. Added 15 new tests to exercise different vector element sizes.

This patch has been bootstrapped and tested without regressions on
powerpc64le-unknown-linux (P8) and on powerpc-unknown-linux (P8,
big-endian, with both -m32 and -m64 target options).

Is this ok for trunk?

gcc/ChangeLog:

2017-09-25  Kelvin Nilsen  

* config/rs6000/rs6000-p8swap.c (const_load_sequence_p): Revise
this function to return false if the definition used by the swap
instruction is artificial, or if the memory address from which the
constant value is loaded is not represented by a base address held
in a register or if the base address register is a frame or stack
pointer.  Additionally, return false if the base address of the
loaded constant is a SYMBOL_REF but is not considered to be a
constant.
(replace_swapped_load_constant): New function.
(rs6000_analyze_swaps): Add a new pass to replace a swap of a
loaded constant vector with a load of a swapped constant vector.

gcc/testsuite/ChangeLog:

2017-09-25  Kelvin Nilsen  

* gcc.target/powerpc/swaps-p8-28.c: New test.
* gcc.target/powerpc/swaps-p8-29.c: New test.
* gcc.target/powerpc/swaps-p8-31.c: New test.
* gcc.target/powerpc/swaps-p8-32.c: New test.
* gcc.target/powerpc/swaps-p8-34.c: New test.
* gcc.target/powerpc/swaps-p8-35.c: New test.
* gcc.target/powerpc/swaps-p8-37.c: New test.
* gcc.target/powerpc/swaps-p8-38.c: New test.
* gcc.target/powerpc/swaps-p8-40.c: New test.
* gcc.target/powerpc/swaps-p8-41.c: New test.
* gcc.target/powerpc/swaps-p8-43.c: New test.
* gcc.target/powerpc/swaps-p8-44.c: New test.
* gcc.target/powerpc/swps-p8-30.c: New test.
* gcc.target/powerpc/swps-p8-33.c: New test.
* gcc.target/powerpc/swps-p8-36.c: New test.
* gcc.target/powerpc/swps-p8-39.c: New test.
* gcc.target/powerpc/swps-p8-42.c: New test.
* gcc.target/powerpc/swps-p8-45.c: New test.
Index: gcc/config/rs6000/rs6000-p8swap.c
===
--- gcc/config/rs6000/rs6000-p8swap.c   (revision 252768)
+++ gcc/config/rs6000/rs6000-p8swap.c   (working copy)
@@ -335,21 +335,26 @@ const_load_sequence_p (swap_web_entry *insn_entry,
 
   const_rtx tocrel_base;
 
-  /* Find the unique use in the swap and locate its def.  If the def
- isn't unique, punt.  */
   struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
   df_ref use;
   FOR_EACH_INSN_INFO_USE (use, insn_info)
 {
   struct df_link *def_link = DF_REF_CHAIN (use);
-  if (!def_link || def_link->next)
+
+  /* If there is no def or the def is artificial or there are
+multiple defs, punt.  */
+  if (!def_link || !def_link->ref || DF_REF_IS_ARTIFICIAL (def_link->ref)
+ || def_link->next)
return false;
 
   rtx def_insn = DF_REF_INSN (def_link->ref);
   unsigned uid2 = INSN_UID (def_insn);
+  /* If this is not a load or is not a swap, return false */
   if (!insn_entry[uid2].is_load || !insn_entry[uid2].is_swap)
return false;
 
+  /* If the source of the rtl def is not a set from memory, return
+false.  */
   rtx body = PATTERN (def_insn);
   if (GET_CODE (body) != SET
  || GET_CODE (SET_SRC (body)) != VEC_SELECT

RE: 0002-Part-2.-Document-finstrument-control-flow-and-notrack attribute

2017-09-25 Thread Tsimbalist, Igor V
> -Original Message-
> From: Sandra Loosemore [mailto:san...@codesourcery.com]
> Sent: Monday, September 25, 2017 5:07 AM
> To: Tsimbalist, Igor V ; 'gcc-
> patc...@gcc.gnu.org' 
> Cc: Jeff Law 
> Subject: Re: 0002-Part-2.-Document-finstrument-control-flow-and-notrack
> attribute
> 
> On 09/19/2017 07:45 AM, Tsimbalist, Igor V wrote:
> > Here is an updated patch (version #2). Mainly attribute and option  names
> were changed.
> >
> > gcc/doc/
> > * extend.texi: Add 'nocf_check' documentation.
> > * gimple.texi: Add second parameter to
> gimple_build_call_from_tree.
> > * invoke.texi: Add -fcf-protection documentation.
> > * rtl.texi: Add REG_CALL_NOTRACK documenation.
> >
> > Is it ok for trunk?
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index
> > cd5733e..6bdb183 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -5646,6 +5646,56 @@ Specify which floating-point unit to use.  You
> > must specify the  @code{target("fpmath=sse,387")} option as
> > @code{target("fpmath=sse+387")} because the comma would separate
> > different options.
> > +
> > +@item nocf_check
> > +@cindex @code{nocf_check} function attribute The @code{nocf_check}
> > +attribute on a function is used to inform the compiler that the
> > +function's prolog should not be instrumented when
> 
> s/prolog/prologue/

Fixed.

> > +compiled with the @option{-fcf-protection=branch} option.  The
> > +compiler assumes that the function's address is a valid target for a
> > +control-flow transfer.
> > +
> > +The @code{nocf_check} attribute on a type of pointer to function is
> > +used to inform the compiler that a call through the pointer should
> > +not be instrumented when compiled with the
> > +@option{-fcf-protection=branch} option.  The compiler assumes that
> > +the function's address from the pointer is a valid target for a
> > +control-flow transfer.  A direct function call through a function
> > +name is assumed as a safe call thus direct calls will not be
> 
> ...is assumed to be a safe call, thus direct calls are not...

Fixed.

> > +instrumented by the compiler.
> > +
> > +The @code{nocf_check} attribute is applied to an object's type.  A
> > +The @code{nocf_check} attribute is transfered to a call instruction
> > +at the GIMPLE and RTL translation phases.  The attribute is not
> > +propagated through assignment, store and load.
> 
> extend.texi is user-facing documentation, but the second sentence here is
> implementor-speak and not meaningful to users of GCC.  I don't understand
> what the third sentence is trying to say.

The second sentence is removed. The third sentence is re-written as

When a function address or a function pointer is assigned to another
pointer, the attribute is not carried over from the right-hand
object's type; the type of the left-hand object stays unchanged.  The
compiler checks for a @code{nocf_check} attribute mismatch and issues
a warning if there is one.

> > +
> > +@smallexample
> > +@{
> > +int foo (void) __attribute__(nocf_check); void (*foo1)(void)
> > +__attribute__(nocf_check); void (*foo2)(void);
> > +
> > +int
> > +foo (void) /* The function's address is assumed as valid.  */
> 
> s/as valid/to be valid/

Fixed.

> > +
> > +  /* This call site is not checked for control-flow validness.  */
> 
> s/validness/validity/g

Fixed.

> > +  (*foo1)();
> > +
> > +  foo1 = foo2;
> > +  /* This call site is still not checked for control-flow validness.
> > + */  (*foo1)();
> > +
> > +  /* This call site is checked for control-flow validness.  */
> > + (*foo2)();
> > +
> > +  foo2 = foo1;
> > +  /* This call site is still checked for control-flow validness.  */
> > + (*foo2)();
> > +
> > +  return 0;
> > +@}
> > +@end smallexample
> > +
> >  @end table
> >
> >  On the x86, the inliner does not inline a diff --git
> > a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi index 635abd3..b6d9149
> > 100644
> > --- a/gcc/doc/gimple.texi
> > +++ b/gcc/doc/gimple.texi
> > @@ -1310,9 +1310,11 @@ operand is validated with
> @code{is_gimple_operand}).
> >  @end deftypefn
> >
> >
> > -@deftypefn {GIMPLE function} gcall *gimple_build_call_from_tree (tree
> > call_expr) -Build a @code{GIMPLE_CALL} from a @code{CALL_EXPR} node.
> > The arguments and the -function are taken from the expression
> > directly.  This routine
> > +@deftypefn {GIMPLE function} gcall *gimple_build_call_from_tree (tree
> > +call_expr, @ tree fnptrtype) Build a @code{GIMPLE_CALL} from a
> > +@code{CALL_EXPR} node.  The arguments and the function are taken
> from
> > +the expression directly.  The type is set from the second parameter
> > +passed by a caller.  This routine
> >  assumes that @code{call_expr} is already in GIMPLE form.  That is,
> > its  operands are GIMPLE values and the function call needs no further
> > simplification.  All the call flags in @code{call_expr} are copied
> > over diff --git 

[Bug c++/82307] unscoped enum-base incorrect cast

2017-09-25 Thread pro100fifa at ukr dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82307

--- Comment #2 from Maxim  ---
(In reply to Richard Biener from comment #1)
> clang accepts it.

Yes, I know. I would like to know whether g++ is deviating from the standard,
whether the standard simply says nothing about this, or whether it will be fixed.

[Bug target/80266] ICE in store_pairsi condition with -mabi=ilp32

2017-09-25 Thread qing.zhao at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80266

Qing Zhao  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |qing.zhao at oracle dot com

--- Comment #3 from Qing Zhao  ---
This bug is very similar to PR80295.
I already have a fix for 80295. Hopefully that fix will fix this bug too.

Since I cannot build gnat on the available machines, I cannot confirm this on
my side.

Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-25 Thread Steve Kargl
On Mon, Sep 25, 2017 at 11:14:42PM +0200, Janus Weil wrote:
> 2017-09-25 17:07 GMT+02:00 David Edelsohn :
> > promotion_3.f90 and promotion_4.f90 are failing on at least PowerPC
> > and AArch64.  Are these new tests limited to x86 or some long double
> > assumptions?
> 
> These tests require the availability of  a 10- or 16-byte-wide REAL
> type, respectively. I have to admit that I do not have a complete
> overview of which targets in GCC's wide portfolio provide such a type.
> 
> It seems that REAL(16) is supported via libquadmath on 32-bit x86,
> x86-64 and Itanium at least. I'm not sure about REAL(10).
> 
> Targets that do not support such a type probably need to be XFAILed.
> 

Janus, I think you can control this with a dg directive:

dg-require-effective-target fortran_large_real

See, for example, gfortran.dg/random_3.f90 

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-25 Thread Janus Weil
2017-09-25 17:07 GMT+02:00 David Edelsohn :
> promotion_3.f90 and promotion_4.f90 are failing on at least PowerPC
> and AArch64.  Are these new tests limited to x86 or some long double
> assumptions?

These tests require the availability of  a 10- or 16-byte-wide REAL
type, respectively. I have to admit that I do not have a complete
overview of which targets in GCC's wide portfolio provide such a type.

It seems that REAL(16) is supported via libquadmath on 32-bit x86,
x86-64 and Itanium at least. I'm not sure about REAL(10).

Targets that do not support such a type probably need to be XFAILed.

Cheers,
Janus


[Bug testsuite/82324] New: Problem in new trunk test case gfortran.dg/promotion_4.f90

2017-09-25 Thread seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82324

Bug ID: 82324
   Summary: Problem in new trunk test case
gfortran.dg/promotion_4.f90
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

The new test case gfortran.dg/promotion_4.f90 doesn't seem to compile
correctly.  I saw this on powerpc64 both BE and LE.

make -k check-fortran RUNTESTFLAGS=dg.exp=gfortran.dg/promotion_4.f90
. . .
Running /home/seurer/gcc/gcc-test/gcc/testsuite/gfortran.dg/dg.exp ...
FAIL: gfortran.dg/promotion_4.f90   -O0  (test for excess errors)
FAIL: gfortran.dg/promotion_4.f90   -O1  (test for excess errors)
FAIL: gfortran.dg/promotion_4.f90   -O2  (test for excess errors)
FAIL: gfortran.dg/promotion_4.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gfortran.dg/promotion_4.f90   -O3 -g  (test for excess errors)
FAIL: gfortran.dg/promotion_4.f90   -Os  (test for excess errors)


In the gfortran.log file from a full run:

Executing on host:
/home/seurer/gcc/build/gcc-trunk/gcc/testsuite/gfortran1/../../gfortran
-B/home/seurer/gcc/build/gcc-trunk/gcc/testsuite/gfortran1/../../
-B/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/
/home/seurer/gcc/gcc-trunk/gcc/testsuite/gfortran.dg/promotion_4.f90 
-fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -fdefault-real-10

-B/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libatomic/.libs
-lm -o ./promotion_4.exe    (timeout = 300)
spawn -ignore SIGHUP
/home/seurer/gcc/build/gcc-trunk/gcc/testsuite/gfortran1/../../gfortran
-B/home/seurer/gcc/build/gcc-trunk/gcc/testsuite/gfortran1/../../
-B/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/
/home/seurer/gcc/gcc-trunk/gcc/testsuite/gfortran.dg/promotion_4.f90
-fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -fdefault-real-10
-B/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/build/gcc-trunk/powerpc64-unknown-linux-gnu/./libatomic/.libs
-lm -o ./promotion_4.exe
f951: Fatal Error: REAL(KIND=10) is not available for '-fdefault-real-10'
option
compilation terminated.
compiler exited with status 1
output is:
f951: Fatal Error: REAL(KIND=10) is not available for '-fdefault-real-10'
option
compilation terminated.

FAIL: gfortran.dg/promotion_4.f90   -O0  (test for excess errors)
Excess errors:
f951: Fatal Error: REAL(KIND=10) is not available for '-fdefault-real-10'
option
compilation terminated.

UNRESOLVED: gfortran.dg/promotion_4.f90   -O0  compilation failed to produce
executable

[Bug c++/82316] unexpected warning for using 'register' storage class in extern "C" declarations

2017-09-25 Thread development at jordi dot vilar.cat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82316

--- Comment #4 from Jordi Vilar  ---
I'm sorry if I didn't explain it correctly.

I don't claim that extern "C" declarations have to be interpreted as C in a
C++ translation unit. What I am saying is that most C libraries (libtiff, zlib,
lcms, etc.) provide a header that assumes the C API can be reused from C++ just
by wrapping it in extern "C" {}. If a C library uses register in its API
declarations, then those legitimate C declarations can no longer be used in C++
code, not even when wrapped in extern "C". This is a breaking change.

Suppressing the warning in extern "C" declarations doesn't imply parsing them
as C, because it is still a C++ translation unit, but it would allow the
traditional C libraries to keep being used.

For an example of a C library that uses register in its API, take a look
at Little CMS (lcms), which is included in most Linux distributions. It just
has the classic #ifdef __cplusplus extern "C" { #endif and tons of register
function arguments.

Should C++17 applications REJECT all of those legitimate C libraries?
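
For readers who have not seen the pattern, here is a minimal sketch of the
kind of header being described. The function name demo_clamp_u8 is made up
for illustration and is not taken from lcms or any other library; the point
is only that the declaration is valid C, while the same register parameter
is what a C++17 compiler complains about when the header is included from
C++.

```c
/* Hypothetical library header, following the usual C/C++ sharing idiom. */
#ifdef __cplusplus
extern "C" {
#endif

/* 'register' is the only storage class C allows on a parameter.
   demo_clamp_u8 is an invented name used purely as an example. */
int demo_clamp_u8(register int value);

#ifdef __cplusplus
}
#endif

/* Hypothetical implementation: clamp a value into the range 0..255. */
int demo_clamp_u8(register int value)
{
    if (value < 0)
        return 0;
    if (value > 255)
        return 255;
    return value;
}
```

Compiled as C this is accepted as written; the extern "C" wrapper changes
only the linkage when the header is seen by a C++ compiler, not the language
rules applied to the declaration.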

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-09-25 Thread myriachan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #5 from Melissa  ---
This originated from a Stack Overflow post "supercat" made (I'm the "Myria"
there).

https://stackoverflow.com/questions/46205744/is-this-use-of-unions-strictly-conforming/

[Bug c/82323] circular ifunc attribute on a function definition silently accepted

2017-09-25 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82323

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |diagnostic
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-25
                 CC|                            |egallager at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Eric Gallager  ---
Confirmed that gcc silently accepts the example. I think it'd be okay to just
issue a warning from -Wattributes or something though, instead of rejecting it
completely.

[Bug lto/82302] LTO producing bad code

2017-09-25 Thread krzysio.kurek at wp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302

--- Comment #8 from krzysio.kurek at wp dot pl ---
In the sense that there is no error; the program goes into an infinite loop.

[Bug lto/82302] LTO producing bad code

2017-09-25 Thread krzysio.kurek at wp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302

--- Comment #7 from krzysio.kurek at wp dot pl ---
No, the error you're having is completely different from what I originally
reported.

[Bug lto/82302] LTO producing bad code

2017-09-25 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302

--- Comment #6 from Martin Liška  ---
(In reply to krzysio.kurek from comment #5)
> I can't reproduce your error.

Is the error I see the same one you see with GCC 7? That is, a syntax error in
a shader?

[Bug fortran/82312] [7/8 Regression] Pointer assignment to component of class variable results wrong vptr for the variable.

2017-09-25 Thread pault at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82312

Paul Thomas  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |pault at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |pault at gcc dot gnu.org

--- Comment #2 from Paul Thomas  ---
Created attachment 42235
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42235&action=edit
Patch for the PR

The attached bootstraps and regtests OK.

I am busy on PR77296 right now. Will post this properly when done.

Paul

[Bug lto/82302] LTO producing bad code

2017-09-25 Thread krzysio.kurek at wp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302

--- Comment #5 from krzysio.kurek at wp dot pl ---
I can't reproduce your error.

Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to misalign in vect_hw_misalign (PR 78421)

2017-09-25 Thread Mike Stump
On Sep 23, 2017, at 10:52 AM, Christophe Lyon  
wrote:
> The attached patch would apply after reverting yours.
> I've applied it against r253072 (just before your patch) and the
> results are visible at:
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/253072-hw-misalign2/report-build-info.html
> 
> Do they match your expectations? It looks like a few testcases need to
> be adjusted,
> or new bugs are uncovered.
> 
> If the patch OK with a suitable ChangeLog entry?

So, I wasn't sure what you meant by the comment.  Below is an example of ! 
working.

% proc b {} { return 0 }
% set v [expr ![b]]
1
% set v [ expr !![b]]
0
% proc b {} { return 1 }
% set v [expr ![b]]
0
% set v [ expr !![b]]
1

Given that, if you want to use a simple set var value, you should just do that 
directly.  [b] is the placeholder for an [] expression, if you want to invert 
that.  v is the placeholder for the thing you want to set.  All the other bits 
should be used as given.  Given that code, I'd be interested in what you want 
to put in the comment, if any.

  set et_vect_hw_misalign_saved($et_index) [expr 
![check_effective_target_arm_vect_no_misalign]]

seems like what you want, does that work?



Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Jeff Law
On 09/25/2017 11:25 AM, Segher Boessenkool wrote:
> On Mon, Sep 25, 2017 at 10:00:55AM -0600, Jeff Law wrote:
>> On 09/25/2017 04:52 AM, Segher Boessenkool wrote:
>>> Did you also test Ada?  It needs testing.  I wanted to try it myself,
>>> but your patch doesn't apply (you included the changelog bits in the
>>> patch), and I cannot easily manually apply it either because you sent
>>> it as base64 instead of as text.
>> I didn't test Ada with -fstack-clash-protection on by default.  I did
>> test it as part of the normal bootstrap & regression test cycle with no
>> changes in the Ada testsuites.
>>
>> Testing it with stack clash protection on by default is easy to do :-)
>> I wouldn't be surprised to see failures since some tests are known to
>> test/use -fstack-check= which explicitly conflicts with stack clash
>> protection.
> 
> Right, and that did happen (see other mails).  But that's okay, it won't
> be enabled by default (yet) (right?), and when it does just the testcases
> need fixing?
Correct, it's not enabled by default at this time.  To propose that
we'd need to do some testsuite work.  We'd also need to make some
decisions on how to handle the partially protected targets.

So for the immediate future, stack clash protection has to be explicitly
requested.

We're still working out the long term plan for RHEL, but defaulting it
on for all the RHEL architectures is definitely part of that plan.

> 
 +   SIZE_INT is used to create the CFI note for the allocation.
 +
 +   SIZE_RTX is an rtx containing the size of the adjustment.  Note that
 +   since stacks grow to lower addresses its runtime value is -SIZE_INT.
>>>
>>> The size_rtx doesn't always correspond to size_int so this a bit
>>> misleading.
>> The value at runtime of SIZE_RTX is always -SIZE_INT.   It might be held
>> in a register, but that register's value is always -SIZE_INT.
> 
> Ah, I was looking at the last call in rs6000_emit_allocate_stack.  It's
> a bit hard to follow, but I see you are right.  Unfortunate that you
> won't get good generated code if you just drop that size_rtx parameter
> and generate it from size_int here (or do you?)
I don't think it would make any difference from a code generation
standpoint.  One could argue that passing in SIZE_RTX is just silly as
the callers don't really need that information and it just hides the
invariant that SIZE_RTX is just -SIZE at runtime.  Let me give that a
whirl in the tester.


> 
> Yeah.  But whenever I see "1 <<" I think "will it fit" -- no need for
> that with HOST_WIDE_INT_1 (or unsigned, if you can) :-)
And that's why I just went ahead and wrote a helper.  It ensures we're
good to go even if we expand the limits significantly.
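
As an illustration of the "will it fit" concern, here is a small sketch of
such a helper. It is not the helper from the patch: DEMO_WIDE_INT_1U and
demo_pow2 are invented names, and a 64-bit unsigned type stands in for
GCC's host wide int. A plain 1 << n is computed in int, so it stops being
meaningful once n reaches the width of int, whereas the widened constant
keeps working for larger exponents.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a wide "1" constant; the name is hypothetical and the
   64-bit width is an assumption made for this sketch. */
#define DEMO_WIDE_INT_1U ((uint64_t) 1)

/* Return 2**log2_size, computed in the wide type rather than in int. */
static uint64_t demo_pow2(unsigned log2_size)
{
    /* Shifting by the type width or more is undefined, so reject it. */
    assert(log2_size < 64);
    return DEMO_WIDE_INT_1U << log2_size;
}
```

With a helper like this, a guard size or probe interval given as a log2
value can grow past 31 without every use site having to be audited.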

>> Fixed.  Long term I hope we find that changing these things isn't useful
>> and we just drop them.
> 
> I hope we can change to 64kB for 64-bit Power (instead of 4kB) -- if we
> do, we can probably turn on this protection by default, since the runtime
> cost will be close to zero (almost all functions will need *no* extra
> code compared to no clash protection).
Right.   But I would claim that even today with 4k guard and 4k probe
interval that the overhead is likely not measurable in practice on a
target like ppc.


> 
> But we'll probably need to support 4kB as well, even for 64-bit.
Note that we can adjust the size of the guard and probe interval
independently.  So if a system is providing a multi-page guard we can
take advantage of that by raising the former, but not the latter.
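
To make the independence of the two knobs concrete, here is a deliberately
simplified model. It is not the logic GCC implements; it only assumes that a
frame smaller than the guard needs no explicit probes and that a larger
frame is probed once per probe interval. The function name and both log2
parameters are invented for the sketch.

```c
#include <assert.h>

/* Simplified model, not GCC's actual algorithm: how many explicit probes
   a frame of frame_size bytes needs, given the guard size and the probe
   interval, both expressed as log2 values. */
static unsigned long demo_probe_count(unsigned long frame_size,
                                      unsigned guard_log2,
                                      unsigned interval_log2)
{
    /* unsigned long is at least 32 bits, so keep the shifts below that. */
    assert(guard_log2 < 32 && interval_log2 < 32);

    unsigned long guard = 1UL << guard_log2;
    unsigned long interval = 1UL << interval_log2;

    /* Assumption of this model: frames below the guard size need no probe. */
    if (frame_size < guard)
        return 0;

    /* Otherwise one probe per interval, rounding up. */
    return (frame_size + interval - 1) / interval;
}
```

In this model a 20000-byte frame needs five probes with a 4k guard and 4k
interval, and none once the guard is raised to 64k while the interval stays
at 4k, which is the effect described above.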

> 
>>> Bootstrap+testsuite finished on BE, but I forgot to enable stack-clash
>>> protection by default, whoops.  Will have results later today (also LE).
>> FWIW, I did do BE tests of earlier versions of this code as well as
>> ppc32-be with and without stack clash protection enabled by default.
>> But most of the time I'm testing ppc64le.
> 
> So all testing was fine (except the things with stack clash protection
> on by default, which you are not proposing to commit).
> 
> I'm quite happy with the patch now; it's okay for trunk.
> 
> Thank you for all the work!
Thanks.  Your feedback on both the PPC and generic bits did
significantly improve the implementation.

Jeff



Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-09-25 Thread Prathamesh Kulkarni
On 15 September 2017 at 17:49, Prathamesh Kulkarni
 wrote:
> On 1 September 2017 at 08:09, Prathamesh Kulkarni
>  wrote:
>> On 17 August 2017 at 18:02, Prathamesh Kulkarni
>>  wrote:
>>> On 8 August 2017 at 09:50, Prathamesh Kulkarni
>>>  wrote:
 On 31 July 2017 at 23:53, Prathamesh Kulkarni
  wrote:
> On 23 May 2017 at 19:10, Prathamesh Kulkarni
>  wrote:
>> On 19 May 2017 at 19:02, Jan Hubicka  wrote:

 * LTO and memory management
 This is a general question about LTO and memory management.
 IIUC the following sequence takes place during normal LTO:
 LGEN: generate_summary, write_summary
 WPA: read_summary, execute ipa passes, write_opt_summary

 So I assumed it was OK in LGEN to allocate return_callees_map in
 generate_summary and free it in write_summary and during WPA, allocate
 return_callees_map in read_summary and free it after execute (since
 write_opt_summary does not require return_callees_map).

 However with fat LTO, it seems the sequence changes for LGEN with
 execute phase takes place after write_summary. However since
 return_callees_map is freed in pure_const_write_summary and
 propagate_malloc() accesses it in execute stage, it results in
 segmentation fault.

 To work around this, I am using the following hack in 
 pure_const_write_summary:
 // FIXME: Do not free if -ffat-lto-objects is enabled.
 if (!global_options.x_flag_fat_lto_objects)
   free_return_callees_map ();
 Is there a better approach for handling this ?
>>>
>>> I think most passes just do not free summaries with -flto.  We probably 
>>> want
>>> to fix it to make it possible to compile multiple units i.e. from 
>>> plugin by
>>> adding release_summaries method...
>>> So I would say it is OK to do the same as others do and leak it with 
>>> -flto.
 diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
 index e457166ea39..724c26e03f6 100644
 --- a/gcc/ipa-pure-const.c
 +++ b/gcc/ipa-pure-const.c
 @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-scalar-evolution.h"
  #include "intl.h"
  #include "opts.h"
 +#include "ssa.h"

  /* Lattice values for const and pure functions.  Everything starts out
 being const, then may drop to pure and then neither depending on
 @@ -69,6 +70,15 @@ enum pure_const_state_e

  const char *pure_const_names[3] = {"const", "pure", "neither"};

 +enum malloc_state_e
 +{
 +  PURE_CONST_MALLOC_TOP,
 +  PURE_CONST_MALLOC,
 +  PURE_CONST_MALLOC_BOTTOM
 +};
>>>
>>> It took me a while to work out what PURE_CONST means here :)
>>> I would just call it something like STATE_MALLOC_TOP... or so.
>>> ipa_pure_const is outdated name from the time pass was doing only
>>> those two.
 @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;

  static vec funct_state_vec;

 +/* A map from node to subset of callees. The subset contains those 
 callees
 + * whose return-value is returned by the node. */
 +static hash_map< cgraph_node *, vec* > 
 *return_callees_map;
 +
>>>
>>> Hehe, a special case of return jump function.  We ought to support 
>>> those more generally.
>>> How do you keep it up to date over callgraph changes?
 @@ -921,6 +1055,23 @@ end:
if (TREE_NOTHROW (decl))
  l->can_throw = false;

 +  if (ipa)
 +{
 +  vec v = vNULL;
 +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
 +  if (DECL_IS_MALLOC (decl))
 + l->malloc_state = PURE_CONST_MALLOC;
 +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
 + {
 +   l->malloc_state = PURE_CONST_MALLOC_TOP;
 +   vec *callees_p = new vec (vNULL);
 +   for (unsigned i = 0; i < v.length (); ++i)
 + callees_p->safe_push (v[i]);
 +   return_callees_map->put (fn, callees_p);
 + }
 +  v.release ();
 +}
 +
>>>
>>> I would do non-ipa variant, too.  I think most attributes can be 
>>> detected that way
>>> as well.
>>>
>>> The patch generally makes sense to me.  It would be nice to make it 
>>> easier to write such
>>> a basic propagators across callgraph (perhaps adding a template doing 
>>> the basic
>>> 
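
A rough sketch of the lattice idea discussed in this thread, for readers
following along. This is not code from the patch: all names are hypothetical,
and the meet operation is an assumption based only on the enum order quoted
above (TOP, MALLOC, BOTTOM), where the more pessimistic value wins.

```c
#include <assert.h>

/* Hypothetical names; the ordering mirrors the quoted enum:
   TOP (optimistic candidate), MALLOC (known), BOTTOM (known not malloc). */
enum demo_malloc_state { DEMO_MALLOC_TOP, DEMO_MALLOC, DEMO_MALLOC_BOTTOM };

/* Meet of two lattice values: the more pessimistic one wins. */
static enum demo_malloc_state
demo_meet(enum demo_malloc_state a, enum demo_malloc_state b)
{
    return a > b ? a : b;
}

/* Fold the states of the callees whose return value a function returns.
   With no such callees the optimistic TOP value is kept. */
static enum demo_malloc_state
demo_propagate(const enum demo_malloc_state *callees, unsigned n)
{
    assert(callees != 0 || n == 0);

    enum demo_malloc_state result = DEMO_MALLOC_TOP;
    for (unsigned i = 0; i < n; ++i)
        result = demo_meet(result, callees[i]);
    return result;
}
```

Under this assumption a single BOTTOM callee is enough to drop the caller to
BOTTOM, which matches the usual shape of such propagators over the callgraph.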

[Bug lto/82302] LTO producing bad code

2017-09-25 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302

--- Comment #4 from Martin Liška  ---
Created attachment 42234
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42234&action=edit
Build log

Hm, for me GCC 6 with -O0 also fails. Please take a look.

[Bug libfortran/66756] libgfortran: ThreadSanitizer: lock-order-inversion

2017-09-25 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66756

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

--- Comment #6 from Thomas Koenig  ---
I think the current behavior is correct, only that the thread
sanitizer does not realize it.

From the comment in unit.c:

   Therefore to avoid deadlocks, it is forbidden
   to acquire unit's private locks while holding UNIT_LOCK, except
   for freshly created units (where no other thread can get at their
   address yet) or when using just trylock rather than lock operation.

... and this appears to be exactly what is happening there.

This makes debugging thread-related problems in libgfortran
somewhat harder, so I'm not sure what is the best course.

Should we try to "fix" this? It should be possible to do
file opening under UNIT_LOCK, that should not be a serious
performance bottleneck. OTOH, the current code seems OK, so
it could be a case of "If it ain't broke, don't fix it".

Opinions?

[Bug lto/82302] LTO producing bad code

2017-09-25 Thread krzysio.kurek at wp dot pl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302

--- Comment #3 from krzysio.kurek at wp dot pl ---
This happens only on GCC7 with -flto activated.
clang-6, GCC6 and GCC5 with either flag enabled or disabled compile code that
runs fine.

I don't know why this error is occurring for you Martin.

Re: [PING][patch] PR81794: have "would be stringified in traditional C" warning in libcpp/macro.c be controlled by -Wtraditional

2017-09-25 Thread Eric Gallager
Ping: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01107.html

cc-ing additional libcpp (i.e. preprocessor) maintainers and
diagnostic messages maintainers

On Sun, Sep 17, 2017 at 8:00 PM, Eric Gallager  wrote:
> Attached is a version of
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00481.html that contains
> a combination of both the fix and the testcase update, as requested in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81794#c2
>
> I had to use a different computer than I usually use to send this
> email, as the hard drive that originally had this patch is currently
> unresponsive. Since it's also the one with my ssh keys on it, I can't
> commit with it. Sorry if the ChangeLogs get mangled.
>
> libcpp/ChangeLog:
>
> 2017-03-24  Eric Gallager  
>
>  * macro.c (check_trad_stringification): Have warning be controlled by
>  -Wtraditional.
>
> gcc/testsuite/ChangeLog:
>
> 2017-09-17  Eric Gallager  
>
> PR preprocessor/81794
> * gcc.dg/pragma-diag-7.c: Update to include check for
> stringification.
>
> On Sat, May 6, 2017 at 11:33 AM, Eric Gallager  wrote:
>> Pinging this: https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01325.html
>>
>> On 3/24/17, Eric Gallager  wrote:
>>> It seemed odd to me that gcc was issuing a warning about compatibility
>>> with traditional C that I couldn't turn off by pushing/popping
>>> -Wtraditional over the problem area, so I made the attached (minor)
>>> patch to fix it. Survives bootstrap, but the only testing I've done
>>> with it has been compiling the one file that was giving me issues
>>> previously, which I'd need to reduce further to turn it into a proper
>>> test case.
>>>
>>> Thanks,
>>> Eric Gallager
>>>
>>> libcpp/ChangeLog:
>>>
>>> 2017-03-24  Eric Gallager  
>>>
>>>   * macro.c (check_trad_stringification): Have warning be controlled by
>>>   -Wtraditional.
>>>
>>
>> So I did the reducing I mentioned above and now have a testcase for
>> it; it was pretty similar to the one from here:
>> https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01319.html
>> so I combined them into a single testcase and have attached the
>> combined version. I can confirm that the testcase passes with my patch
>> applied.


Re: [PATCH][aarch64] Fix pr81356 - copy empty string with wrz, not a ldrb/strb

2017-09-25 Thread Steve Ellcey
Ping.

Steve Ellcey
sell...@cavium.com


On Fri, 2017-09-15 at 11:22 -0700, Steve Ellcey wrote:
> PR 81356 points out that doing a __builtin_strcpy of an empty string on
> aarch64 does a copy from memory instead of just writing out a zero byte.
> In looking at this I found that it was because of
> aarch64_use_by_pieces_infrastructure_p, which returns false for
> STORE_BY_PIECES.  The comment says:
> 
>   /* STORE_BY_PIECES can be used when copying a constant string, but
>  in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
>  For now we always fail this and let the move_by_pieces code copy
>  the string from read-only memory.  */
> 
> But this doesn't seem to be the case anymore.  When I remove this function
> and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it the code
> for __builtin_strcpy of a constant string seems to be either better or the
> same.  The only time I got more instructions after removing this function
> was on an 8 byte __builtin_strcpy where we now generate a mov and 3 movk
> instructions to create the source followed by a store instead of doing a
> load/store of 8 bytes.  The comment may have been applicable for
> -mstrict-align at one time but it doesn't seem to be the case now.  I still
> get better code without this routine under that option as well.
> 
> Bootstrapped and tested without regressions, OK to checkin?
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> 
> 2017-09-15  Steve Ellcey  
> 
>   PR target/81356
>   * config/aarch64/aarch64.c
> (aarch64_use_by_pieces_infrastructure_p):
>   Remove.
>   (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.
> 
> 
> 2017-09-15  Steve Ellcey  
> 
>   * gcc.target/aarch64/pr81356.c: New test.


[Bug tree-optimization/82163] [8 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:707

2017-09-25 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82163

--- Comment #3 from amker at gcc dot gnu.org ---
Author: amker
Date: Mon Sep 25 17:32:36 2017
New Revision: 253161

URL: https://gcc.gnu.org/viewcvs?rev=253161&root=gcc&view=rev
Log:
PR tree-optimization/82163
* tree-ssa-loop-manip.h (verify_loop_closed_ssa): New parameter.
(checking_verify_loop_closed_ssa): New parameter.
* tree-ssa-loop-manip.c (check_loop_closed_ssa_use): Delete.
(check_loop_closed_ssa_stmt): Delete.
(check_loop_closed_ssa_def, check_loop_closed_ssa_bb): New functions.
(verify_loop_closed_ssa): Check loop closed ssa form for LOOP.
(tree_transform_and_unroll_loop): Check loop closed ssa form only for
changed loops.

gcc/testsuite
* gcc.dg/tree-ssa/pr82163.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr82163.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-manip.c
trunk/gcc/tree-ssa-loop-manip.h

Re: Weird warning when building gcc

2017-09-25 Thread Eric Gallager
On Mon, Sep 25, 2017 at 11:17 AM, nick  wrote:
>
>
> On 2017-09-24 10:10 AM, Eric Gallager wrote:
>> On Sat, Sep 23, 2017 at 12:34 PM, nick  wrote:
>>> If you're able to just tell me where the functions are located, or how to
>>> enable ctags for all of gcc, that would save me asking stupid questions.
>>> Is there a global setting like make ctags for doing this, or do I have to
>>> do it manually?
>>>
>>> Thanks for the quick response,
>>>
>>> Nick
>>
>> Also, for `make ctags` to work from the top level source directory,
>> this patch needs to be applied:
>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00370.html
>> (The patch is approved but the thread says I was still waiting on
>> commit access at the time; I have since received commit access, but my
>> ssh keys that allow me to commit are currently stuck on a failing hard
>> drive, so if someone else could commit for me, it'd be appreciated.)
>>
>
> Eric,
>
> I rewrote the patch as it was failing for me on the trunk of gcc git. It's
> still reported as failing, but when I inspect the lines I changed, they all
> apply fine now.
>
> Here is the reported git warning:
>
> Applying: Fix Ctags Patch to Work on Master Tree
> .git/rebase-apply/patch:15: trailing whitespace.
> host_modules= { module= libdecnumber; bootstrap=true;
> warning: 1 line adds whitespace errors.
>
> I will fix this after I get some comments on my update of the patch. It's
> attached here. Here is a link to the original patch, which I rewrote
> because it was failing for me:
> https://gcc.gnu.org/ml/gcc-patches/2016-10/txtNehNSQvx5L.txt
>
> Nick

The way to get the patch to work that I was going to suggest would be
to just apply the portion that applies to Makefile.def, and then
regenerate Makefile.in manually by running `autogen Makefile.def`, as
described in the top comments of Makefile.tpl. But if you managed it
another way, that's cool too.

Eric


Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Segher Boessenkool
On Mon, Sep 25, 2017 at 10:00:55AM -0600, Jeff Law wrote:
> On 09/25/2017 04:52 AM, Segher Boessenkool wrote:
> > Did you also test Ada?  It needs testing.  I wanted to try it myself,
> > but your patch doesn't apply (you included the changelog bits in the
> > patch), and I cannot easily manually apply it either because you sent
> > it as base64 instead of as text.
> I didn't test Ada with -fstack-clash-protection on by default.  I did
> test it as part of the normal bootstrap & regression test cycle with no
> changes in the Ada testsuites.
> 
> Testing it with stack clash protection on by default is easy to do :-)
> I wouldn't be surprised to see failures since some tests are known to
> test/use -fstack-check= which explicitly conflicts with stack clash
> protection.

Right, and that did happen (see other mails).  But that's okay, it won't
be enabled by default (yet) (right?), and when it is, just the testcases
need fixing?

> >> +   SIZE_INT is used to create the CFI note for the allocation.
> >> +
> >> +   SIZE_RTX is an rtx containing the size of the adjustment.  Note that
> >> +   since stacks grow to lower addresses its runtime value is -SIZE_INT.
> > 
> > The size_rtx doesn't always correspond to size_int so this is a bit
> > misleading.
> The value at runtime of SIZE_RTX is always -SIZE_INT.   It might be held
> in a register, but that register's value is always -SIZE_INT.

Ah, I was looking at the last call in rs6000_emit_allocate_stack.  It's
a bit hard to follow, but I see you are right.  Unfortunate that you
won't get good generated code if you just drop that size_rtx parameter
and generate it from size_int here (or do you?)

(You could still do
  if (!size_rtx)
size_rtx = GEN_INT (-size_int);
to simplify the callers a bit).

> > (These comments were in the original code already, oh well).
> Happy to adjust if you've got a suggestion on how to make it clearer.

I'm drawing a blank right now :-/

> >> +rs6000_emit_probe_stack_range_stack_clash (HOST_WIDE_INT orig_size,
> >> + rtx copy_reg)
> >> +{
> >> +  rtx orig_sp = copy_reg;
> >> +
> >> +  HOST_WIDE_INT probe_interval
> >> += 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL);
> > 
> > HOST_WIDE_INT_1U << ...
> It won't matter in practice because of the limits we've put on those
> PARAMs, but sure, easy to do except for the long lines, even in a helper
> function :(

Yeah.  But whenever I see "1 <<" I think "will it fit" -- no need for
that with HOST_WIDE_INT_1 (or unsigned, if you can) :-)
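
The "will it fit" concern can be shown outside GCC. This is a minimal sketch, assuming a hypothetical helper `probe_interval_from_log2` (not a GCC function) and using `uint64_t` as a stand-in for a wide host integer. Plain `1 << n` is evaluated in `int`, so it is only safe for small exponents; shifting a 64-bit unsigned one is well defined for any exponent below 64.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: return 2**log2_interval computed
   in a 64-bit unsigned type, so the shift is not limited by the width
   of int.  The caller must pass log2_interval < 64.  */
static uint64_t
probe_interval_from_log2 (unsigned int log2_interval)
{
  return UINT64_C (1) << log2_interval;
}
```

With the shift done in the wide type, the reader no longer has to check the permitted range of the exponent to know the expression is safe.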

> >> +/* Probe a range of stack addresses from REG1 to REG3 inclusive.  These are
> >> +   absolute addresses.  REG2 contains the backchain that must be stored into
> >> +   *sp at each allocation.
> > 
> > I would just remove "These are absolute addresses.", or write something
> > like "These are addresses, not offsets", but that is kind of obvious
> > isn't it ;-)
> :-)  It was copied from the analogous -fstack-check routine.   Note
> there's a similar routine for -fstack-check which uses offsets, so I
> think being very explicit makes sense.  Perhaps just "These are
> addresses, not offsets" is better than the current comment?  I'll go
> with whatever you prefer here.

Yeah that is fine, thanks.

> >> +static const char *
> >> +output_probe_stack_range_stack_clash (rtx reg1, rtx reg2, rtx reg3)
> >> +{
> >> +  static int labelno = 0;
> >> +  char loop_lab[32];
> >> +  rtx xops[3];
> >> +
> >> +  HOST_WIDE_INT probe_interval
> >> += 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL);
> > 
> > Once more :-)  Maybe a helper function is in order?  Would avoid the
> > huge length names at least.
> Fixed.  Long term I hope we find that changing these things isn't useful
> and we just drop them.

I hope we can change to 64kB for 64-bit Power (instead of 4kB) -- if we
do, we can probably turn on this protection by default, since the runtime
cost will be close to zero (almost all functions will need *no* extra
code compared to no clash protection).

But we'll probably need to support 4kB as well, even for 64-bit.
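
To make the 4kB versus 64kB argument concrete, here is a rough sketch that counts how many full probe intervals a frame covers. It is only an illustration under a simplifying assumption (one probe per full interval, nothing for the residual); the function name is made up and the real patch's probing rules are more involved.

```c
#include <stdint.h>

/* Illustrative only: number of full probe intervals covered by a frame
   of frame_size bytes, for an interval of 2**log2_interval bytes.
   Assumes log2_interval < 64.  */
static uint64_t
full_probe_count (uint64_t frame_size, unsigned int log2_interval)
{
  uint64_t interval = UINT64_C (1) << log2_interval;
  /* Round the frame size down to a multiple of the interval.  */
  uint64_t rounded_size = frame_size & ~(interval - 1);
  return rounded_size / interval;
}
```

Under this model a 10000-byte frame covers two full intervals at 4kB but none at 64kB, which is the sense in which most functions would need no extra code with the larger interval.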

> > Bootstrap+testsuite finished on BE, but I forgot to enable stack-clash
> > protection by default, whoops.  Will have results later today (also LE).
> FWIW, I did do BE tests of earlier versions of this code as well as
> ppc32-be with and without stack clash protection enabled by default.
> But most of the time I'm testing ppc64le.

So all testing was fine (except the things with stack clash protection
on by default, which you are not proposing to commit).

I'm quite happy with the patch now; it's okay for trunk.

Thank you for all the work!


Segher


Re: Transform (x / y) != 0 to x >=y and (x / y) == 0 to x < y if x, y are unsigned

2017-09-25 Thread Prathamesh Kulkarni
On 18 September 2017 at 15:40, Prathamesh Kulkarni
 wrote:
> On 15 September 2017 at 22:09, Marc Glisse  wrote:
>> On Fri, 15 Sep 2017, Wilco Dijkstra wrote:
>>
>>> Marc Glisse wrote:
>>>
>>>> The question is whether, having computed c=a/b, it is cheaper to test a<b
>>>> or c!=0.
>>>> I think it is usually the second one, but not for all types on all
>>>> targets. Although since
>>>> you mention VRP, it is easier to do further optimizations using the
>>>> information a<b
>>>
>>>
>>> No, a<b [...]
>>> throughput on
>>> all modern cores, so rather than having to wait until the division
>>> finishes, you can
>>> execute whatever depends on the comparison many cycles earlier.
>>>
>>> Generally you want to avoid division as much as possible and when that
>>> fails
>>> reduce any dependencies on the result of divisions.
>>
>>
>> This would indicate that we do not need to check for single-use, makes the
>> patch simpler, thanks.
>> (let's ignore -Os)
> Hi,
> Thanks for the suggestions, I have updated the patch.
> Is this OK ?
> Bootstrap+test in progress on x86_64-unknown-linux-gnu.
> I will try to address the right shift by 4 case in a follow-up patch.
>
ping https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01145.html

Thanks,
Prathamesh
> Thanks,
> Prathamesh
>>
>> --
>> Marc Glisse
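
The equivalence behind the proposed transform can be checked by brute force. The sketch below is not the patch itself; the function names are invented for illustration. It relies only on the fact that for unsigned `x` and nonzero `y`, `x / y` is zero exactly when `x < y`.

```c
#include <stdbool.h>

/* The original form: test the quotient.  Requires y != 0.  */
static bool
quotient_nonzero_by_div (unsigned int x, unsigned int y)
{
  return (x / y) != 0;
}

/* The transformed form: a plain comparison, no division.  */
static bool
quotient_nonzero_by_cmp (unsigned int x, unsigned int y)
{
  return x >= y;
}

/* Compare both forms for every x in [0, limit) and y in [1, limit);
   return true if they always agree.  */
static bool
forms_agree_up_to (unsigned int limit)
{
  for (unsigned int y = 1; y < limit; y++)
    for (unsigned int x = 0; x < limit; x++)
      if (quotient_nonzero_by_div (x, y) != quotient_nonzero_by_cmp (x, y))
        return false;
  return true;
}
```

The same identity does not hold for signed operands (for example -4 / 2 is nonzero although -4 < 2), which is why the subject line restricts the transform to unsigned x and y.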


Re: [libgfortran] Replace implicit conversions between enums in io/transfer.c by explicit casts.

2017-09-25 Thread Prathamesh Kulkarni
On 12 September 2017 at 17:08, Prathamesh Kulkarni
 wrote:
> Hi,
> I am working on a patch for PR78736
> (https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00011.html),
> which adds a new warning -Wenum-conversion to the C front-end to warn for
> implicit conversion between different enums.
> The warning in that patch triggered on io/transfer.c for the following
> implicit conversions:
> i) Implicit conversion from unit_mode to file_mode
> ii) Implicit conversion from unit_sign_s to unit_sign.
>
> I was wondering if the warning for above implicit conversions would be
> correct since unit_mode
> and file_mode are different enums and similarly unit_sign_s and
> unit_sign are different enums ?
> Or are these warnings false positives ?
>
> The attached patch makes the conversion explicit to silence the warnings.
> Bootstrap+tested on x86_64-unknown-linux-gnu.
> Does the patch look OK ?
ping https://gcc.gnu.org/ml/fortran/2017-09/msg00036.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh


Re: [PATCH PR82163/V2]New interface checking LCSSA for single loop

2017-09-25 Thread Bin.Cheng
On Sat, Sep 23, 2017 at 6:31 PM, Bernhard Reutner-Fischer
 wrote:
> On Fri, Sep 22, 2017 at 11:37:53AM +, Bin Cheng wrote:
>
>> diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
>> index d6ba305..6ad0b75 100644
>> --- a/gcc/tree-ssa-loop-manip.c
>> +++ b/gcc/tree-ssa-loop-manip.c
>> @@ -690,48 +690,62 @@ rewrite_virtuals_into_loop_closed_ssa (struct loop *loop)
>>rewrite_into_loop_closed_ssa_1 (NULL, 0, SSA_OP_VIRTUAL_USES, loop);
>>  }
>
>> -/* Checks invariants of loop closed ssa form in statement STMT in BB.  */
>> +/* Checks invariants of loop closed ssa form in BB.  */
>>
>>  static void
>> -check_loop_closed_ssa_stmt (basic_block bb, gimple *stmt)
>> +check_loop_closed_ssa_bb (basic_block bb)
>>  {
>> -  ssa_op_iter iter;
>> -  tree var;
>> +  for (gphi_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi);
>> +   gsi_next (&bsi))
>> +{
>> +  gphi *phi = bsi.phi ();
>>
>> -  if (is_gimple_debug (stmt))
>> -return;
>> +  if (!virtual_operand_p (PHI_RESULT (phi)))
>> + check_loop_closed_ssa_def (bb, PHI_RESULT (phi));
>> +}
>> +
>> +  for (gimple_stmt_iterator bsi = gsi_start_bb (bb); !gsi_end_p (bsi);
>> +   gsi_next (&bsi))
>> +{
>> +  ssa_op_iter iter;
>> +  tree var;
>> +  gimple *stmt = gsi_stmt (bsi);
>> +
>> +  if (is_gimple_debug (stmt))
>> + continue;
>
> for (gimple_stmt_iterator bsi = gsi_start_nondebug_after_labels_bb (bb);
>  !gsi_end_p (bsi);
>  gsi_next_nondebug (&bsi))
>
> ?
Thanks for the suggestion, patch updated.  I will commit it later
since it's an obvious update.

Thanks,
bin
>>
>> -  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_USE)
>> -check_loop_closed_ssa_use (bb, var);
>> +  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF)
>> + check_loop_closed_ssa_def (bb, var);
>> +}
>>  }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82163.c b/gcc/testsuite/gcc.dg/tree-ssa/pr82163.c
new file mode 100644
index 000..389d5c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82163.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b, c[4], d, e, f, g;
+
+void h ()
+{
+  for (; a; a++)
+{
+  c[a + 3] = g;
+  if (b)
+c[a] = f;
+  else
+{
+  for (; d; d++)
+c[d + 3] = c[d];
+  for (e = 1; e == 2; e++)
+;
+  if (e)
+break;
+}
+}
+}
diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index d6ba305..b08b8b9 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -690,48 +690,59 @@ rewrite_virtuals_into_loop_closed_ssa (struct loop *loop)
   rewrite_into_loop_closed_ssa_1 (NULL, 0, SSA_OP_VIRTUAL_USES, loop);
 }
 
-/* Check invariants of the loop closed ssa form for the USE in BB.  */
+/* Check invariants of the loop closed ssa form for the def in DEF_BB.  */
 
 static void
-check_loop_closed_ssa_use (basic_block bb, tree use)
+check_loop_closed_ssa_def (basic_block def_bb, tree def)
 {
-  gimple *def;
-  basic_block def_bb;
+  use_operand_p use_p;
+  imm_use_iterator iterator;
+  FOR_EACH_IMM_USE_FAST (use_p, iterator, def)
+{
+  if (is_gimple_debug (USE_STMT (use_p)))
+   continue;
 
-  if (TREE_CODE (use) != SSA_NAME || virtual_operand_p (use))
-return;
+  basic_block use_bb = gimple_bb (USE_STMT (use_p));
+  if (is_a <gphi *> (USE_STMT (use_p)))
+   use_bb = EDGE_PRED (use_bb, PHI_ARG_INDEX_FROM_USE (use_p))->src;
 
-  def = SSA_NAME_DEF_STMT (use);
-  def_bb = gimple_bb (def);
-  gcc_assert (!def_bb
- || flow_bb_inside_loop_p (def_bb->loop_father, bb));
+  gcc_assert (flow_bb_inside_loop_p (def_bb->loop_father, use_bb));
+}
 }
 
-/* Checks invariants of loop closed ssa form in statement STMT in BB.  */
+/* Checks invariants of loop closed ssa form in BB.  */
 
 static void
-check_loop_closed_ssa_stmt (basic_block bb, gimple *stmt)
+check_loop_closed_ssa_bb (basic_block bb)
 {
-  ssa_op_iter iter;
-  tree var;
+  for (gphi_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi);
+   gsi_next (&bsi))
+{
+  gphi *phi = bsi.phi ();
 
-  if (is_gimple_debug (stmt))
-return;
+  if (!virtual_operand_p (PHI_RESULT (phi)))
+   check_loop_closed_ssa_def (bb, PHI_RESULT (phi));
+}
 
-  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_USE)
-check_loop_closed_ssa_use (bb, var);
+  for (gimple_stmt_iterator bsi = gsi_start_nondebug_bb (bb); !gsi_end_p (bsi);
+   gsi_next_nondebug (&bsi))
+{
+  ssa_op_iter iter;
+  tree var;
+  gimple *stmt = gsi_stmt (bsi);
+
+  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF)
+   check_loop_closed_ssa_def (bb, var);
+}
 }
 
 /* Checks that invariants of the loop closed ssa form are preserved.
-   Call verify_ssa when VERIFY_SSA_P is true.  */
+   Call verify_ssa when VERIFY_SSA_P is true.  Note all loops are checked
+   if LOOP is NULL, otherwise, only LOOP is checked.  */
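
The def-side check in the patch above can be modelled with a toy loop tree. This is only a sketch: `toy_loop` and both functions are invented stand-ins, not GCC's `struct loop` or `flow_bb_inside_loop_p`, and the model assumes each use has already been attributed to a loop (in the patch, a PHI argument counts as a use in the predecessor block of its edge).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a loop tree: each loop points at its enclosing loop,
   and the outermost pseudo-loop has a null parent.  */
struct toy_loop
{
  struct toy_loop *outer;
};

/* True if INNER is LOOP itself or is nested anywhere inside LOOP.  */
static bool
toy_loop_contains_p (const struct toy_loop *loop,
                     const struct toy_loop *inner)
{
  for (; inner != NULL; inner = inner->outer)
    if (inner == loop)
      return true;
  return false;
}

/* The invariant checked per definition: every use of a name defined in
   DEF_LOOP must sit in DEF_LOOP or in a loop nested inside it.  */
static bool
toy_uses_closed_p (const struct toy_loop *def_loop,
                   const struct toy_loop *const *use_loops, size_t n_uses)
{
  for (size_t i = 0; i < n_uses; i++)
    if (!toy_loop_contains_p (def_loop, use_loops[i]))
      return false;
  return true;
}
```

Walking from each definition to its uses, rather than from each use back to its definition, is what lets the verifier be restricted to the blocks of a single loop.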

[Bug c++/82230] [8 Regression] ICE: in tsubst, at cp/pt.c:13686 when binding lambda to variable inside a generic lambda inside a template member function inside a template class

2017-09-25 Thread xerofoify at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82230

nik  changed:

   What|Removed |Added

 CC||xerofoify at gmail dot com

--- Comment #4 from nik  ---
I am new to the project, but I looked at the code in semantics.c that was
changed by the commit pointed out above. It seems that an else above this if
statement was removed:

if (containing_function && DECL_TEMPLATE_INFO (context)
&& LAMBDA_FUNCTION_P (containing_function))

which in turn means that:
containing_function = NULL_TREE;

sets containing_function to NULL_TREE, which I assume means a null pointer.
Therefore we may be passing a null tree into the wrong branch because of the
missing else. Your trace shows semantics.c being called, so this may be the
cause. Does anybody have any comments?

Re: [patch, fortran] Warn about out-of-bounds access with DO subscripts

2017-09-25 Thread Thomas Koenig

Hi Jerry,


Yes OK,


Thanks for the review, committed as r253156.

Now, on to some other bugs...

Regards

Thomas



[Bug c/82323] New: circular ifunc attribute on a function definition silently accepted

2017-09-25 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82323

Bug ID: 82323
   Summary: circular ifunc attribute on a function definition
silently accepted
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

While testing the implementation of the fix for bug 82301 (and bug 81854) I
noticed that the ifunc attribute is silently accepted on the definition of a
function, like below:

$ cat t.c && gcc -S -Wall -Wextra -Werror t.c
int __attribute__ ((ifunc ("foo")))
foo (void) { return 0; }

The attribute specification on this function is invalid because a) it specifies
the indirect function itself as its own resolver, and b) it specifies a
resolver that doesn't return the expected type (a pointer to the indirect
function).

It seems to me that the ifunc attribute should be rejected with an error on
definitions of functions.

Re: [PATCH] Fix PR80295[aarch64] [7/8 Regression] ICE in __builtin_update_setjmp_buf expander

2017-09-25 Thread Qing Zhao
Hi, Andreas,

thanks for the comment. 

> GNU style is line break before the operator, not after.

updated per your comment.

Qing.

---
 gcc/config/aarch64/aarch64.c   | 12 +---
 gcc/config/aarch64/aarch64.h   |  2 +-
 gcc/config/aarch64/aarch64.md  |  6 +++---
 gcc/testsuite/gcc.target/aarch64/pr80295.c |  8 
 4 files changed, 21 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr80295.c

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 6c3ef76..ff0890d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3693,7 +3693,9 @@ aarch64_expand_prologue (void)
   stack_pointer_rtx,
   GEN_INT (callee_offset)));
   RTX_FRAME_RELATED_P (insn) = 1;
-  emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx));
+  emit_insn (TARGET_ILP32 
+ ? gen_stack_tiesi (stack_pointer_rtx, hard_frame_pointer_rtx)
+ : gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
 }
 
   aarch64_save_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM,
@@ -3750,7 +3752,9 @@ aarch64_expand_epilogue (bool for_sibcall)
   if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca
   || crtl->calls_eh_return)
 {
-  emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
+  emit_insn (TARGET_ILP32 
+ ? gen_stack_tiesi (stack_pointer_rtx, stack_pointer_rtx)
+ : gen_stack_tiedi (stack_pointer_rtx, stack_pointer_rtx));
   need_barrier_p = false;
 }
 
@@ -3774,7 +3778,9 @@ aarch64_expand_epilogue (bool for_sibcall)
callee_adjust != 0, &cfi_ops);
 
   if (need_barrier_p)
-emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
+emit_insn (TARGET_ILP32 
+   ? gen_stack_tiesi (stack_pointer_rtx, stack_pointer_rtx)
+   : gen_stack_tiedi (stack_pointer_rtx, stack_pointer_rtx));
 
   if (callee_adjust != 0)
     aarch64_pop_regs (reg1, reg2, callee_adjust, &cfi_ops);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 8fada9e..df58442 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -782,7 +782,7 @@ typedef struct
 /* Specify the machine mode that the hardware addresses have.
After generation of rtl, the compiler makes no further distinction
between pointers and any other objects of this machine mode.  */
-#define Pmode  DImode
+#define Pmode  (TARGET_ILP32 ? SImode : DImode)
 
 /* A C expression whose value is zero if pointers that need to be extended
from being `POINTER_SIZE' bits wide to `Pmode' are sign-extended and
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index bb7f2c0..30853b2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5533,10 +5533,10 @@
   [(set_attr "type" "call")
(set_attr "length" "16")])
 
-(define_insn "stack_tie"
+(define_insn "stack_tie<mode>"
   [(set (mem:BLK (scratch))
-   (unspec:BLK [(match_operand:DI 0 "register_operand" "rk")
-(match_operand:DI 1 "register_operand" "rk")]
+   (unspec:BLK [(match_operand:GPI 0 "register_operand" "rk")
+(match_operand:GPI 1 "register_operand" "rk")]
UNSPEC_PRLG_STK))]
   ""
   ""
diff --git a/gcc/testsuite/gcc.target/aarch64/pr80295.c b/gcc/testsuite/gcc.target/aarch64/pr80295.c
new file mode 100644
index 000..b3866d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr80295.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=ilp32" } */
+
+void f (void *b) 
+{ 
+  __builtin_update_setjmp_buf (b); 
+}
+
-- 
1.9.1



[PING][PATCH] Fix bug in simplify_ternary_operation

2017-09-25 Thread Tom de Vries

On 09/01/2017 10:51 AM, Tom de Vries wrote:

On 08/31/2017 11:44 PM, Jeff Law wrote:

On 08/28/2017 12:26 PM, Tom de Vries wrote:

Hi,

I think I found a bug in r17465:
...

    * cse.c (simplify_ternary_operation): Handle more IF_THEN_ELSE
    simplifications.

diff --git a/gcc/cse.c b/gcc/cse.c
index e001597..3c27387 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -4713,6 +4713,17 @@ simplify_ternary_operation (code, mode,
op0_mode, op0, op1, op2)


Note: the parameters of simplify_ternary_operation have the following
meaning:
...
/* Simplify CODE, an operation with result mode MODE and three operands,
    OP0, OP1, and OP2.  OP0_MODE was the mode of OP0 before it became
    a constant.  Return 0 if no simplifications is possible.  */

rtx
simplify_ternary_operation (code, mode, op0_mode, op0, op1, op2)
  enum rtx_code code;
  enum machine_mode mode, op0_mode;
  rtx op0, op1, op2;
...


   && rtx_equal_p (XEXP (op0, 1), op1)
   && rtx_equal_p (XEXP (op0, 0), op2))
 return op2;
+  else if (! side_effects_p (op0))
+   {
+ rtx temp;
+ temp = simplify_relational_operation (GET_CODE (op0), op0_mode,
+   XEXP (op0, 0), XEXP (op0, 1));


We're handling code == IF_THEN_ELSE here, so op0 is the condition, op1
is the 'then expr' and op2 is the 'else expr'.

The parameters of simplify_relational_operation have the following 
meaning:

...
/* Like simplify_binary_operation except used for relational operators.
    MODE is the mode of the operands, not that of the result.  If MODE
    is VOIDmode, both operands must also be VOIDmode and we compare the
    operands in "infinite precision".

    If no simplification is possible, this function returns zero.
    Otherwise, it returns either const_true_rtx or const0_rtx.  */

rtx
simplify_relational_operation (code, mode, op0, op1)
  enum rtx_code code;
  enum machine_mode mode;
  rtx op0, op1;
...

The problem in the patch is that we use op0_mode argument for the mode
parameter. The mode parameter of simplify_relational_operation needs to
be the mode of the operands of the condition, while op0_mode is the mode
of the condition.

Patch below fixes this on current trunk.

[ I found this by running into an ICE in
gcc.c-torture/compile/pr28776-2.c for gcn target. I haven't been able to
reproduce this with an upstream branch yet. ]

OK for trunk if bootstrap and reg-test for x86_64 succeeds?

So clearly setting cmp_mode to op0_mode is wrong.   But we also have to
make sure that if cmp_mode is VOIDmode that either XEXP (op0, 0) has a
non-void mode or that XEXP (op0, 1) has a non-void mode, otherwise we're
likely to abort down in simplify_const_relational_operation.



You're referring to this assert:
...
/* Check if the given comparison (done in the given MODE) is actually
    a tautology or a contradiction.  If the mode is VOID_mode, the
    comparison is done in "infinite precision".  If no simplification
    is possible, this function returns zero.  Otherwise, it returns
    either const_true_rtx or const0_rtx.  */

rtx
simplify_const_relational_operation (enum rtx_code code,
  machine_mode mode,
  rtx op0, rtx op1)
{
   ...

   gcc_assert (mode != VOIDmode
   || (GET_MODE (op0) == VOIDmode
   && GET_MODE (op1) == VOIDmode));
...

added by Honza:
...
 * simplify-rtx.c (simplify_relational_operation): Verify that
     mode == VOIDmode implies both operands to be VOIDmode.
...

In other words, rewriting the assert in more readable form:
...
#define BOOL_IMPLIES(a, b) (!(a) || (b))
   gcc_assert (BOOL_IMPLIES (mode == VOIDmode,
     (GET_MODE (op0) == VOIDmode
  && GET_MODE (op1) == VOIDmode)));
...
[ I'd be in favor of rewriting imply relations using a macro or some 
such, I find it easier to understand. ]
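
As a standalone illustration of the implication macro suggested above, here is a small sketch. The `toy_mode` enum and `toy_modes_consistent_p` are invented for the example and are not GCC's machine modes; only the macro itself comes from the message.

```c
#include <stdbool.h>

/* Material implication: false only when A holds and B does not.  */
#define BOOL_IMPLIES(a, b) (!(a) || (b))

/* Toy stand-in for machine modes; not GCC's enum machine_mode.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

/* Same shape as the quoted assert: a void comparison mode is acceptable
   only if both operand modes are void too.  */
static bool
toy_modes_consistent_p (enum toy_mode mode, enum toy_mode op0_mode,
                        enum toy_mode op1_mode)
{
  return BOOL_IMPLIES (mode == TOY_VOIDmode,
                       op0_mode == TOY_VOIDmode
                       && op1_mode == TOY_VOIDmode);
}
```

The macro form states the premise and the conclusion separately, which is the readability gain being argued for; the generated condition is the same as the hand-written `!a || b`.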


Now, simplify_relational_operation starts like this:
...
rtx
simplify_relational_operation (enum rtx_code code, machine_mode mode,
    machine_mode cmp_mode, rtx op0, rtx op1)
{
   rtx tem, trueop0, trueop1;

   if (cmp_mode == VOIDmode)
     cmp_mode = GET_MODE (op0);
   if (cmp_mode == VOIDmode)
     cmp_mode = GET_MODE (op1);

   tem = simplify_const_relational_operation (code, cmp_mode, op0, op1);
...

AFAIU, the cmp_mode ifs ensure that the assert in 
simplify_const_relational_operation doesn't trigger.
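
The claim that the two fallback ifs are enough to keep the assert from firing can be checked on a toy model. The sketch below uses an invented `toy_mode` enum and invented function names; it mirrors only the control flow quoted above, not the real GCC functions.

```c
#include <stdbool.h>

/* Toy stand-in for machine modes; not GCC's enum machine_mode.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

/* Pick the comparison mode the way the quoted prologue does: keep an
   explicit mode, else fall back to operand 0's mode, else operand 1's.  */
static enum toy_mode
toy_effective_cmp_mode (enum toy_mode cmp_mode, enum toy_mode op0_mode,
                        enum toy_mode op1_mode)
{
  if (cmp_mode == TOY_VOIDmode)
    cmp_mode = op0_mode;
  if (cmp_mode == TOY_VOIDmode)
    cmp_mode = op1_mode;
  return cmp_mode;
}

/* The condition the downstream assert requires.  */
static bool
toy_assert_holds_p (enum toy_mode mode, enum toy_mode op0_mode,
                    enum toy_mode op1_mode)
{
  return mode != TOY_VOIDmode
         || (op0_mode == TOY_VOIDmode && op1_mode == TOY_VOIDmode);
}
```

After the fallback, the mode can still be void only when both operand modes are void, which is exactly the case the assert permits.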



ISTM a better fix is to return NULL_RTX if cmp_mode is VOIDmode and both
the sub-operations are VOIDmode as well.



I don't think we need that. simplify_const_relational_operation can 
handle the situation that mode == VOIDmode && GET_MODE (op0) == VOIDmode 
&& GET_MODE (op1) == VOIDmode.




Ping.

Thanks,

- Tom



Can you try that and verify that pr28776-2.c continues to work?
jeff





[Bug fortran/82207] ieee_class identifies signaling NaNs as quiet NaNs

2017-09-25 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82207

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-09-25
 Ever confirmed|0   |1

--- Comment #5 from Dominique d'Humieres  ---
Confirmed from 5.4.0 up to trunk (8.0), 'ieee_arithmetic.mod' was not
implemented before gcc-5.

[Bug fortran/82313] Rejects-valid for sum(minloc(...))) as array dimension

2017-09-25 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82313

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-09-25
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed from 4.4 up to trunk (8.0).

[Bug fortran/82314] internal compiler error: in gfc_conv_expr_descriptor, at fortran/trans-array.c:6972

2017-09-25 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82314

Dominique d'Humieres  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-09-25
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed from 4.4 up to trunk (8.0). Note that the ICE is in
is_illegal_recursion, at fortran/resolve.c from 4.4 up to 4.8.

Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Jeff Law
On 09/25/2017 08:14 AM, Segher Boessenkool wrote:
> On Mon, Sep 25, 2017 at 07:41:18AM -0500, Segher Boessenkool wrote:
>> On Mon, Sep 25, 2017 at 05:52:27AM -0500, Segher Boessenkool wrote:
>>> Bootstrap+testsuite finished on BE, but I forgot to enable stack-clash
>>> protection by default, whoops.  Will have results later today (also LE).
>>
>> Some new failures show up:
>>
>> +FAIL: c-c++-common/ubsan/vla-1.c   -O0  execution test
>>
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:18:7: runtime 
>> error: variable length array bound evaluates to non-positive value -1
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
>> error: variable length array bound evaluates to non-positive value -1
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
>> error: variable length array bound evaluates to non-positive value -1
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
>> error: variable length array bound evaluates to non-positive value -1
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
>> error: variable length array bound evaluates to non-positive value -1
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
>> error: variable length array bound evaluates to non-positive value -1
>> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:36:7: runtime 
>> error: variable length array bound evaluates to non-positive value -5
>>
>> (both gcc and g++, both -m32 and -m64).
>>
>> This is BE; LE is still running.
> 
> LE shows the same, but also
> 
> === acats tests ===
> +FAIL:  c52103x
> +FAIL:  c52104x
> +FAIL:  c52104y
> +FAIL:  cb1010a
> 
> These are
> 
>  C52103X CHECK THAT IN ARRAY ASSIGNMENTS AND IN SLICE ASSIGNMENTS,
> THE LENGTHS MUST MATCH; ALSO CHECK WHETHER
> CONSTRAINT_ERROR OR STORAGE_ERROR ARE RAISED FOR LARGE
> ARRAYS.
>- C52103X NO CONSTRAINT_ERROR FOR TYPE WITH 'LENGTH = INTEGER'LAST + 
> 3.
> 
> raised STORAGE_ERROR : stack overflow or erroneous memory access
> FAIL: c52103x
This is expected.   stack clash protection does not guarantee we can run
the signal handler upon stack overflow -- thus we can not guarantee we
get the constraint error.


> 
> (twice)
> 
>  C52104Y CHECK THAT IN ARRAY ASSIGNMENTS AND IN SLICE ASSIGNMENTS,
> THE LENGTHS MUST MATCH.
>- C52104Y NO CONSTRAINT_ERROR FOR NON-NULL ARRAY SUBTYPE WHEN ONE
> DIMENSION HAS INTEGER'LAST + 3 COMPONENTS.
> 
> raised STORAGE_ERROR : stack overflow or erroneous memory access
> FAIL:   c52104y
Likewise.


> 
> and,
> 
>  CB1010A CHECK THAT STORAGE_ERROR IS RAISED WHEN STORAGE ALLOCATED
> TO A TASK IS EXCEEDED.
>- CB1010A CHECK TASKS THAT DO NOT HANDLE STORAGE_ERROR PRIOR TO
> RENDEZVOUS.
> FAIL:   cb1010a
Likewise.  We can't guarantee we can run the signal handler and thus we
can't inform the Ada runtime about the overflow.

Jeff


Re: [PATCH] Fix PR80295[aarch64] [7/8 Regression] ICE in __builtin_update_setjmp_buf expander

2017-09-25 Thread Andreas Schwab
On Sep 25 2017, Qing Zhao  wrote:

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 6c3ef76..876e9e3 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3693,7 +3693,9 @@ aarch64_expand_prologue (void)
>  stack_pointer_rtx,
>  GEN_INT (callee_offset)));
>RTX_FRAME_RELATED_P (insn) = 1;
> -  emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx));
> +  emit_insn (TARGET_ILP32 ? 
> + gen_stack_tiesi (stack_pointer_rtx, hard_frame_pointer_rtx) :
> + gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));

GNU style is line break before the operator, not after.
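
For readers unfamiliar with the convention, a short sketch of the layout being asked for follows. The function and parameter names are made up for this example and do not come from the patch.

```c
/* Illustrative only: pick one of two widths.  In GNU style, when an
   expression is split across lines, the operator starts the
   continuation line rather than ending the previous one.  */
static int
pick_width (int ilp32_p, int narrow_width, int wide_width)
{
  return (ilp32_p
          ? narrow_width
          : wide_width);
}

/* The same rule applies to binary operators such as &&.  */
static int
all_nonzero_p (int first_value, int second_value, int third_value)
{
  return (first_value != 0
          && second_value != 0
          && third_value != 0);
}
```
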

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[Bug fortran/38936] [F03] ASSOCIATE construct / improved SELECT TYPE (a=>expr)

2017-09-25 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38936

Dominique d'Humieres  changed:

   What|Removed |Added

 CC||pault at gcc dot gnu.org

--- Comment #19 from Dominique d'Humieres  ---
> This commit did implement better handling for association to derived-types,
> but some cases are still not handled (see the XFAIL of associate_9.f03). 
> I wanted to test with the code of PR 45369, but that also uses CLASS(*) so
> I was not able to compile it still.

The XFAIL has been removed at revision r252894. Is there anything left in this
PR or could it be closed?

Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Jeff Law
On 09/25/2017 04:52 AM, Segher Boessenkool wrote:
> 
>> I also flipped things so that clash protection is enabled by default and
>> re-ran the tests.  The idea being to see if I could exercise the path
>> that uses SP_ADJUST a bit more.  But that gave me the same results.
>> While I think the change in the return value in
>> rs6000_emit_probe_stack_range_stack_clash is more correct, I don't have
>> a good way to test it.
> 
> Did you also test Ada?  It needs testing.  I wanted to try it myself,
> but your patch doesn't apply (you included the changelog bits in the
> patch), and I cannot easily manually apply it either because you sent
> it as base64 instead of as text.
I didn't test Ada with -fstack-clash-protection on by default.  I did
test it as part of the normal bootstrap & regression test cycle with no
changes in the Ada testsuites.

Testing it with stack clash protection on by default is easy to do :-)
I wouldn't be surprised to see failures since some tests are known to
test/use -fstack-check= which explicitly conflicts with stack clash
protection.

> 
> (... Okay, I think I have it working; testing now).
> 
> Some comments, mostly trivial comment stuff:

> 
> 
>> +/* Allocate SIZE_INT bytes on the stack using a store with update style insn
>> +   and set the appropriate attributes for the generated insn.  Return the
>> +   generated insn.
> 
> "Return the first generated insn"?
Fixed.

>> +   SIZE_INT is used to create the CFI note for the allocation.
>> +
>> +   SIZE_RTX is an rtx containing the size of the adjustment.  Note that
>> +   since stacks grow to lower addresses its runtime value is -SIZE_INT.
> 
> The size_rtx doesn't always correspond to size_int so this is a bit
> misleading.
The value at runtime of SIZE_RTX is always -SIZE_INT.   It might be held
in a register, but that register's value is always -SIZE_INT.


> 
> (These comments were in the original code already, oh well).
Happy to adjust if you've got a suggestion on how to make it clearer.


> 
>> +   COPY_REG, if non-null, should contain a copy of the original
>> +   stack pointer at exit from this function.
> 
> "Return a copy of the original stack pointer in COPY_REG if that is
> non-null"?  It wasn't clear to me that it is this function that should
> set it :-)
It's not clear to me either.  I'm just trying to preserve behavior of
the existing code in its handling of COPY_REG and COPY_OFF.



> 
>> +static rtx_insn *
>> +rs6000_emit_probe_stack_range_stack_clash (HOST_WIDE_INT orig_size,
>> +   rtx copy_reg)
>> +{
>> +  rtx orig_sp = copy_reg;
>> +
>> +  HOST_WIDE_INT probe_interval
>> += 1 << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL);
> 
> HOST_WIDE_INT_1U << ...
It won't matter in practice because of the limits we've put on those
PARAMs, but sure, easy to do except for the long lines, even in a helper
function :(
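
To illustrate the HOST_WIDE_INT_1U remark, here is a minimal standalone sketch. WIDE_1U and probe_interval_bytes are hypothetical stand-ins, not GCC names, and the sketch assumes only that the interval is supplied as a log2 value, as the quoted hunk suggests. Shifting a plain int 1 is undefined once the shift count reaches the width of int, while shifting a 64-bit unsigned 1 stays well defined for counts up to 63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a wide unsigned 1 such as HOST_WIDE_INT_1U.  */
#define WIDE_1U ((uint64_t) 1)

/* Return 2**LOG2_INTERVAL, computed in 64 bits so that large shift
   counts do not overflow the way a plain "1 << n" on int would.  */
static uint64_t
probe_interval_bytes (unsigned log2_interval)
{
  assert (log2_interval < 64);
  return WIDE_1U << log2_interval;
}
```

With the small values the PARAM limits allow, both forms give the same answer, which matches the remark that it won't matter in practice.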




> 
>> +  /* If explicitly requested,
>> +   or the rounded size is not the same as the original size
>> +   or the the rounded size is greater than a page,
>> + then we will need a copy of the original stack pointer.  */
>> +  if (rounded_size != orig_size
>> +  || rounded_size > probe_interval
>> +  || copy_reg)
>> +{
>> +  /* If the caller requested a copy of the incoming stack pointer,
>> + then ORIG_SP == COPY_REG and will not be NULL.
>> +
>> + If no copy was requested, then we use r0 to hold the copy.  */
>> +  if (orig_sp == NULL_RTX)
>> +orig_sp = gen_rtx_REG (Pmode, 0);
>> +  emit_move_insn (orig_sp, stack_pointer_rtx);
> 
> Maybe just write the "if" as "if (!copy_reg)"?  You can lose the first
> half of the comment then (since it is obvious then).
Agreed.

> 
>> +  for (int i = 0; i < rounded_size; i += probe_interval)
>> +{
>> +  rtx_insn *insn
>> += rs6000_emit_allocate_stack_1 (probe_interval,
>> +probe_int, orig_sp);
>> +  if (!retval)
>> +retval = insn;
>> +}
> 
> Maybe "if (i == 0)" is clearer?
No strong opinions here.  I added a comment and changed it to i == 0.
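
The allocation loop under discussion can be pictured with a small standalone sketch. It assumes the rounded size is the original size rounded down to a multiple of the probe interval, which the quoted hunks do not show; struct probe_plan and plan_probes are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct probe_plan
{
  int64_t rounded_size;  /* Bytes covered by full-interval allocations.  */
  int64_t residual;      /* Bytes left over after the loop.  */
  int64_t n_probes;      /* Number of full-interval allocations.  */
};

/* Split ORIG_SIZE into full PROBE_INTERVAL chunks plus a residual.
   Assumption: rounding is downwards to a multiple of the interval.  */
static struct probe_plan
plan_probes (int64_t orig_size, int64_t probe_interval)
{
  assert (orig_size >= 0 && probe_interval > 0);

  struct probe_plan plan;
  plan.rounded_size = orig_size - orig_size % probe_interval;
  plan.residual = orig_size - plan.rounded_size;
  plan.n_probes = 0;

  /* One allocation (and implicit probe) per full interval.  */
  for (int64_t i = 0; i < plan.rounded_size; i += probe_interval)
    plan.n_probes++;

  return plan;
}
```

Under that assumption a 10000 byte frame with a 4096 byte interval needs two full-interval allocations and leaves 1808 bytes for a final adjustment.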
> 
>> @@ -25509,6 +25703,23 @@ rs6000_emit_allocate_stack (HOST_WIDE_INT size, rtx 
>> copy_reg, int copy_off)
>>  warning (0, "stack limit expression is not supported");
>>  }
>>  
>> +  if (flag_stack_clash_protection)
>> +{
>> +  if (size < (1 << PARAM_VALUE 
>> (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE)))
> 
> HOST_WIDE_INT_1U again.
Fixed.

> 
>> +/* Probe a range of stack addresses from REG1 to REG3 inclusive.  These are
>> +   absolute addresses.  REG2 contains the backchain that must be stored into
>> +   *sp at each allocation.
> 
> I would just remove "These are absolute addresses.", or write something
> like "These are addresses, not offsets", but that is kind of obvious
> isn't it ;-)
:-)  It was copied from the analogous -fstack-check routine.   Note
there's a similar routine for -fstack-check which uses offsets, so I
think being very 

[PATCH] Fix PR80295[aarch64] [7/8 Regression] ICE in __builtin_update_setjmp_buf expander

2017-09-25 Thread Qing Zhao
Hi,

This patch fixes the aarch64 bug 80295
https://gcc.gnu.org/PR80295

The aarch64 backend misses the handling of TARGET_ILP32 in multiple places.
This patch adds the correct handling of TARGET_ILP32 to the aarch64 backend.

A new small test case is added.

Bootstrapped and tested on aarch64-unknown-linux-gnu with no regressions.

Thanks.

Qing

==

gcc/ChangeLog:

   * config/aarch64/aarch64.c (aarch64_expand_prologue):
   Emit the stack_tie insn in a mode that depends on TARGET_ILP32.
   (aarch64_expand_epilogue): Likewise.
   * config/aarch64/aarch64.h (Pmode): Define as SImode or DImode
   depending on TARGET_ILP32.
   * config/aarch64/aarch64.md (stack_tie): Define for both
   modes (SImode/DImode).

gcc/testsuite/ChangeLog:

   PR middle-end/80295
   * gcc.target/aarch64/pr80295.c: New test.

---
 gcc/config/aarch64/aarch64.c   | 12 +---
 gcc/config/aarch64/aarch64.h   |  2 +-
 gcc/config/aarch64/aarch64.md  |  6 +++---
 gcc/testsuite/gcc.target/aarch64/pr80295.c |  8 
 4 files changed, 21 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr80295.c

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 6c3ef76..876e9e3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3693,7 +3693,9 @@ aarch64_expand_prologue (void)
   stack_pointer_rtx,
   GEN_INT (callee_offset)));
   RTX_FRAME_RELATED_P (insn) = 1;
-  emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx));
+  emit_insn (TARGET_ILP32 ? 
+ gen_stack_tiesi (stack_pointer_rtx, hard_frame_pointer_rtx) :
+ gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
 }
 
   aarch64_save_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM,
@@ -3750,7 +3752,9 @@ aarch64_expand_epilogue (bool for_sibcall)
   if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca
   || crtl->calls_eh_return)
 {
-  emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
+  emit_insn (TARGET_ILP32 ? 
+ gen_stack_tiesi (stack_pointer_rtx, stack_pointer_rtx) :
+ gen_stack_tiedi (stack_pointer_rtx, stack_pointer_rtx));
   need_barrier_p = false;
 }
 
@@ -3774,7 +3778,9 @@ aarch64_expand_epilogue (bool for_sibcall)
callee_adjust != 0, &cfi_ops);
 
   if (need_barrier_p)
-emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
+emit_insn (TARGET_ILP32 ? 
+   gen_stack_tiesi (stack_pointer_rtx, stack_pointer_rtx) :
+   gen_stack_tiedi (stack_pointer_rtx, stack_pointer_rtx));
 
   if (callee_adjust != 0)
 aarch64_pop_regs (reg1, reg2, callee_adjust, &cfi_ops);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 8fada9e..df58442 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -782,7 +782,7 @@ typedef struct
 /* Specify the machine mode that the hardware addresses have.
After generation of rtl, the compiler makes no further distinction
between pointers and any other objects of this machine mode.  */
-#define Pmode  DImode
+#define Pmode  (TARGET_ILP32 ? SImode : DImode)
 
 /* A C expression whose value is zero if pointers that need to be extended
from being `POINTER_SIZE' bits wide to `Pmode' are sign-extended and
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index bb7f2c0..30853b2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5533,10 +5533,10 @@
   [(set_attr "type" "call")
(set_attr "length" "16")])
 
-(define_insn "stack_tie"
+(define_insn "stack_tie<mode>"
   [(set (mem:BLK (scratch))
-   (unspec:BLK [(match_operand:DI 0 "register_operand" "rk")
-(match_operand:DI 1 "register_operand" "rk")]
+   (unspec:BLK [(match_operand:GPI 0 "register_operand" "rk")
+(match_operand:GPI 1 "register_operand" "rk")]
UNSPEC_PRLG_STK))]
   ""
   ""
diff --git a/gcc/testsuite/gcc.target/aarch64/pr80295.c 
b/gcc/testsuite/gcc.target/aarch64/pr80295.c
new file mode 100644
index 0000000..b3866d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr80295.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=ilp32" } */
+
+void f (void *b) 
+{ 
+  __builtin_update_setjmp_buf (b); 
+}
+
-- 
1.9.1



[Bug target/82175] [8 Regression] -march=native fails on armv7 big/little system armv7l-unknown-linux-gnueabihf with gcc 8.0.0

2017-09-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82175

Richard Earnshaw  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |NEW
   Last reconfirmed|                              |2017-09-25
           Assignee|unassigned at gcc dot gnu.org |rearnsha at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #3 from Richard Earnshaw  ---
Mine

Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Jeff Law
On 09/25/2017 06:41 AM, Segher Boessenkool wrote:
> On Mon, Sep 25, 2017 at 05:52:27AM -0500, Segher Boessenkool wrote:
>> Bootstrap+testsuite finished on BE, but I forgot to enable stack-clash
>> protection by default, whoops.  Will have results later today (also LE).
> 
> Some new failures show up:
> 
> +FAIL: c-c++-common/ubsan/vla-1.c   -O0  execution test
> 
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:18:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:36:7: runtime 
> error: variable length array bound evaluates to non-positive value -5
Yes.  I've known about this.

What happens is ubsan detects the error, but allows the code to continue
to run and try to allocate huge stacks.  The stack clash code comes
along and tries to probe the just allocated space which fails in the
expected manner.  It's really a testsuite issue and not an issue with
either UB or stack clash protection -- that's why I didn't call it out.

We could ask the sanitizers to abort on detecting UB, but then the test
itself needs to be split up (and that's the right thing to do IMHO).

There are other tests which are going to fail -- things like mixing
-fstack-check and -fstack-clash and an assortment of guality things.

Jeff


[Bug c/82296] Warn for code removal due to "code never accesses array out of bounds" assumption

2017-09-25 Thread lundril at gmx dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82296

--- Comment #6 from Ingo  ---
> https://www.securecoding.cert.org/confluence/display/c/ARR30-C.+Do+not+form+or+use+out-of-bounds+pointers+or+array+subscripts

Just out of curiosity:
I am not able to find any of that in the ANSI/ISO C89 standard. That might be
because I do not know where to look in the C89 standard (I am
definitely not familiar with any of the formal C standard documents).

I also noticed that if I compile the example with

  gcc -std=c89 -O2 -S gcc_check.c

I also get assembler code which basically implements "return 1;".

So does that mean gcc will always define "undefined behavior" according to the
C-2011 standard, even if you use "-std=c89"?

What happens when the standards committee releases a more recent version of the
C standard?
Will upcoming versions of GCC then use the updated definitions of
"undefined behavior" from those standards and thus produce non-working code
for any source code whose author was not able to look into the future and
guess what the C standards committee might later deem "undefined behavior"?

Re: [PATCH] haifa-sched: fix autopref_rank_for_schedule qsort comparator

2017-09-25 Thread Maxim Kuvyrkov
> On Sep 22, 2017, at 4:18 AM, Alexander Monakov  wrote:
> 
> On Tue, 19 Sep 2017, Alexander Monakov wrote:
>>  * haifa-sched.c (autopref_rank_for_schedule): Order 'irrelevant' insns
>>  first, always call autopref_rank_data otherwise.
> 
> May I apply this patch now to unblock qsort checking?  Further changes or
> adjustments can then go in independently at a later time.


Yes, feel free to commit one of your versions.

--
Maxim Kuvyrkov
www.linaro.org



> Thanks.
> Alexander
> 
>> --- a/gcc/haifa-sched.c
>> +++ b/gcc/haifa-sched.c
>> @@ -5707,7 +5707,8 @@ autopref_rank_data (autopref_multipass_data_t data1,
>> static int
>> autopref_rank_for_schedule (const rtx_insn *insn1, const rtx_insn *insn2)
>> {
>> -  for (int write = 0; write < 2; ++write)
>> +  int r = 0;
>> +  for (int write = 0; write < 2 && !r; ++write)
>> {
>>   autopref_multipass_data_t data1
>>  = &INSN_AUTOPREF_MULTIPASS_DATA (insn1)[write];
>> @@ -5716,21 +5717,20 @@ autopref_rank_for_schedule (const rtx_insn *insn1, 
>> const rtx_insn *insn2)
>> 
>>   if (data1->status == AUTOPREF_MULTIPASS_DATA_UNINITIALIZED)
>>  autopref_multipass_init (insn1, write);
>> -  if (data1->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT)
>> -continue;
>> 
>>   if (data2->status == AUTOPREF_MULTIPASS_DATA_UNINITIALIZED)
>>  autopref_multipass_init (insn2, write);
>> -  if (data2->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT)
>> -continue;
>> 
>> -  if (!rtx_equal_p (data1->base, data2->base))
>> -continue;
>> +  int irrel1 = data1->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
>> +  int irrel2 = data2->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
>> 
>> -  return autopref_rank_data (data1, data2);
>> +  if (!irrel1 && !irrel2)
>> +r = autopref_rank_data (data1, data2);
>> +  else
>> +r = irrel2 - irrel1;
>> }
>> 
>> -  return 0;
>> +  return r;
>> }
>> 
>> /* True if header of debug dump was printed.  */
>> 
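
The ordering rule in the quoted patch (irrelevant entries first, everything else ranked by its data) can be sketched outside GCC as follows. struct item, rank_items and check_antisymmetry are hypothetical names, and the integer offset is only an assumed stand-in for whatever autopref_rank_data compares. The point is that the comparator gives a consistent answer for every pair, which is the property qsort checking relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct item
{
  int irrelevant;  /* Nonzero if the item takes no part in ranking.  */
  int offset;      /* Ranking key for relevant items (assumed).  */
};

/* qsort comparator: irrelevant items sort first and compare equal to
   each other; relevant items are ordered by ascending offset.  */
static int
rank_items (const void *pa, const void *pb)
{
  const struct item *a = pa;
  const struct item *b = pb;

  if (!a->irrelevant && !b->irrelevant)
    return (a->offset > b->offset) - (a->offset < b->offset);
  return !!b->irrelevant - !!a->irrelevant;
}

/* Assert that swapping the arguments flips the sign of the result.  */
static void
check_antisymmetry (const struct item *v, size_t n)
{
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      assert (rank_items (&v[i], &v[j]) == -rank_items (&v[j], &v[i]));
}
```

A comparator that simply skips pairs it cannot rank and returns 0 can violate transitivity, since two items may each compare equal to a third while still being ordered against each other.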



Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-25 Thread David Edelsohn
promotion_3.f90 and promotion_4.f90 are failing on at least PowerPC
and AArch64.  Are these new tests limited to x86 or some long double
assumptions?

f951: Fatal Error: REAL(KIND=16) is not available for '-fdefault-real-16' option
compilation terminated.

f951: Fatal Error: REAL(KIND=10) is not available for '-fdefault-real-10' option
compilation terminated.

Thanks, David


[Bug c/81854] weak alias of an incompatible symbol accepted

2017-09-25 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81854

--- Comment #13 from uros at gcc dot gnu.org ---
Author: uros
Date: Mon Sep 25 14:59:19 2017
New Revision: 253153

URL: https://gcc.gnu.org/viewcvs?rev=253153&root=gcc&view=rev
Log:
PR c/81854
* src/c++98/complex_io.cc (_GLIBCXX_LDBL_COMPAT): Declare alias
target as a C++ function with no prototype.


Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/src/c++98/complex_io.cc

[PATCH] Pretty-print GOACC_REDUCTION arguments

2017-09-25 Thread Tom de Vries

Hi,

currently for a GOACC_REDUCTION internal fn call we print:
...
  sum_5 = GOACC_REDUCTION (SETUP, _3, 0, 0, 67, 0);
...

This patch adds a comment for some arguments explaining the meaning of 
the argument:

...
  sum_5 = GOACC_REDUCTION (SETUP, _3, 0, 0 /*gang*/, 67 /*+*/, 0);
...
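
The annotation idea can be tried in isolation with a rough sketch like the one below. It is not the patch itself: enum dim, dim_comment and print_reduction_arg are hypothetical names, the argument is treated as a plain unsigned value rather than a tree, and only the dimension argument (index 3) is handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical dimension codes standing in for the GOMP_DIM_* constants.  */
enum dim { DIM_GANG, DIM_WORKER, DIM_VECTOR };

/* Return the comment text to print after a dimension argument.  */
static const char *
dim_comment (unsigned dim)
{
  switch (dim)
    {
    case DIM_GANG:
      return " /*gang*/";
    case DIM_WORKER:
      return " /*worker*/";
    case DIM_VECTOR:
      return " /*vector*/";
    default:
      assert (!"unexpected dimension");
      return "";
    }
}

/* Print VALUE into BUF; when INDEX is 3 (the dimension argument),
   append the explanatory comment.  */
static void
print_reduction_arg (char *buf, size_t size, unsigned index, unsigned value)
{
  assert (size > 0);
  snprintf (buf, size, "%u", value);
  if (index == 3)
    strncat (buf, dim_comment (value), size - strlen (buf) - 1);
}
```

Calling print_reduction_arg with index 3 and value 0 yields the "0 /*gang*/" fragment seen in the dump; other indices print the bare number.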

OK for trunk, if testing is ok?

Thanks,
- Tom
Pretty-print GOACC_REDUCTION arguments

Prints
  sum_5 = GOACC_REDUCTION (SETUP, _3, 0, 0 /*gang*/, 67 /*+*/, 0);
instead of
  sum_5 = GOACC_REDUCTION (SETUP, _3, 0, 0, 67, 0);

2017-09-25  Tom de Vries  

	* gimple-pretty-print.c (dump_gimple_call_args): Pretty-print
	GOACC_REDUCTION arguments.

---
 gcc/gimple-pretty-print.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index ed8e51c..61efd93 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "asan.h"
+#include "gomp-constants.h"
 
 #define INDENT(SPACE)			\
   do { int i; for (i = 0; i < SPACE; i++) pp_space (buffer); } while (0)
@@ -765,6 +766,40 @@ dump_gimple_call_args (pretty_printer *buffer, gcall *gs, dump_flags_t flags)
   if (i)
 	pp_string (buffer, ", ");
   dump_generic_node (buffer, gimple_call_arg (gs, i), 0, flags, false);
+
+  if (gimple_call_internal_p (gs))
+	switch (gimple_call_internal_fn (gs))
+	  {
+	  case IFN_GOACC_REDUCTION:
+	switch (i)
+	  {
+	  case 3:
+		switch (tree_to_uhwi (gimple_call_arg (gs, i)))
+		  {
+		  case GOMP_DIM_GANG:
+		pp_string (buffer, " /*gang*/");
+		break;
+		  case GOMP_DIM_WORKER:
+		pp_string (buffer, " /*worker*/");
+		break;
+		  case GOMP_DIM_VECTOR:
+		pp_string (buffer, " /*vector*/");
+		break;
+		  default:
+		gcc_unreachable ();
+		  }
+		break;
+	  case 4:
+		{
+		  enum tree_code rcode
+		= (enum tree_code)tree_to_uhwi (gimple_call_arg (gs, i));
+		  pp_string (buffer, " /*");
+		  pp_string (buffer, op_symbol_code (rcode));
+		  pp_string (buffer, "*/");
+		}
+		break;
+	  }
+	  }
 }
 
   if (gimple_call_va_arg_pack_p (gs))


[Bug tree-optimization/82321] [8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:707

2017-09-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82321

Richard Biener  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2017-09-25
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
   Target Milestone|---                           |8.0
     Ever confirmed|0                             |1

--- Comment #1 from Richard Biener  ---
Mine.

Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Segher Boessenkool
On Mon, Sep 25, 2017 at 07:41:18AM -0500, Segher Boessenkool wrote:
> On Mon, Sep 25, 2017 at 05:52:27AM -0500, Segher Boessenkool wrote:
> > Bootstrap+testsuite finished on BE, but I forgot to enable stack-clash
> > protection by default, whoops.  Will have results later today (also LE).
> 
> Some new failures show up:
> 
> +FAIL: c-c++-common/ubsan/vla-1.c   -O0  execution test
> 
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:18:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
> error: variable length array bound evaluates to non-positive value -1
> /home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:36:7: runtime 
> error: variable length array bound evaluates to non-positive value -5
> 
> (both gcc and g++, both -m32 and -m64).
> 
> This is BE; LE is still running.

LE shows the same, but also

=== acats tests ===
+FAIL:  c52103x
+FAIL:  c52104x
+FAIL:  c52104y
+FAIL:  cb1010a

These are

 C52103X CHECK THAT IN ARRAY ASSIGNMENTS AND IN SLICE ASSIGNMENTS,
THE LENGTHS MUST MATCH; ALSO CHECK WHETHER
CONSTRAINT_ERROR OR STORAGE_ERROR ARE RAISED FOR LARGE
ARRAYS.
   - C52103X NO CONSTRAINT_ERROR FOR TYPE WITH 'LENGTH = INTEGER'LAST + 
3.

raised STORAGE_ERROR : stack overflow or erroneous memory access
FAIL:   c52103x

(twice)

 C52104Y CHECK THAT IN ARRAY ASSIGNMENTS AND IN SLICE ASSIGNMENTS,
THE LENGTHS MUST MATCH.
   - C52104Y NO CONSTRAINT_ERROR FOR NON-NULL ARRAY SUBTYPE WHEN ONE
DIMENSION HAS INTEGER'LAST + 3 COMPONENTS.

raised STORAGE_ERROR : stack overflow or erroneous memory access
FAIL:   c52104y

and,

 CB1010A CHECK THAT STORAGE_ERROR IS RAISED WHEN STORAGE ALLOCATED
TO A TASK IS EXCEEDED.
   - CB1010A CHECK TASKS THAT DO NOT HANDLE STORAGE_ERROR PRIOR TO
RENDEZVOUS.
FAIL:   cb1010a


Segher


[Bug tree-optimization/82320] [8 Regression] Compile time hog w/ -O

2017-09-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82320

--- Comment #2 from Richard Biener  ---
Starting iteration 37161
Value numbering md_11 stmt = md_11 = PHI 
Setting value number of md_11 to md_21(D) (changed)
Value numbering md_10 stmt = md_10 = PHI 
Setting value number of md_10 to lq_24(D) (changed)
Value numbering md_9 stmt = md_9 = PHI 
Setting value number of md_9 to md_21(D) (changed)
Starting iteration 37162
Value numbering md_11 stmt = md_11 = PHI 
Setting value number of md_11 to lq_24(D) (changed)
Value numbering md_10 stmt = md_10 = PHI 
Setting value number of md_10 to md_21(D) (changed)
Value numbering md_9 stmt = md_9 = PHI 
Setting value number of md_9 to lq_24(D) (changed)
Starting iteration 37163
Value numbering md_11 stmt = md_11 = PHI 
Setting value number of md_11 to md_21(D) (changed)
Value numbering md_10 stmt = md_10 = PHI 
Setting value number of md_10 to lq_24(D) (changed)
Value numbering md_9 stmt = md_9 = PHI 
Setting value number of md_9 to md_21(D) (changed)
...

So we iterate between the different leaders for "undefined".  I suppose not
considering lq_24(D) -> md_21(D) a change in values would fix this.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 253149)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -3355,6 +3355,12 @@ set_ssa_val_to (tree from, tree to)

   if (currval != to
   && !operand_equal_p (currval, to, 0)
+  /* Different undefined SSA names are not actually different.  See
+ PR82320 for a testcase were we'd otherwise not terminate iteration. 
*/
+  && !(TREE_CODE (currval) == SSA_NAME
+  && TREE_CODE (to) == SSA_NAME
+  && ssa_undefined_value_p (currval, false)
+  && ssa_undefined_value_p (to, false))
   /* ???  For addresses involving volatile objects or types
operand_equal_p
  does not reliably detect ADDR_EXPRs as equal.  We know we are only
 getting invariant gimple addresses here, so can use
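
The effect of the extra condition can be seen in a toy model of the "did the value change?" test. This is only a sketch under simplifying assumptions: struct value, value_changed and set_value are hypothetical, and a value is reduced to an id plus an "undefined" flag instead of a tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct value
{
  int id;          /* Identity of the leader, e.g. an SSA name version.  */
  bool undefined;  /* True if the leader is an undefined value.  */
};

/* Return true if replacing CUR by TO counts as a change.  Two different
   undefined leaders are deliberately treated as the same value.  */
static bool
value_changed (struct value cur, struct value to)
{
  if (cur.id == to.id)
    return false;
  if (cur.undefined && to.undefined)
    return false;
  return true;
}

/* Record TO in *SLOT if it is a real change; report whether it was.  */
static bool
set_value (struct value *slot, struct value to)
{
  assert (slot != NULL);
  if (!value_changed (*slot, to))
    return false;
  *slot = to;
  return true;
}
```

With this rule, flipping between two undefined leaders never reports "changed", so an iteration that only swaps them can reach a fixed point.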

[Bug target/82322] [7/8 Regression] vec_ceil/vec_floor/vec_round intrincics do not work for gcc 8, need __builtin_s390_vfidb

2017-09-25 Thread dje at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82322

David Edelsohn  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-25
                 CC|                            |dje at gcc dot gnu.org,
                   |                            |krebbel at gcc dot gnu.org,
                   |                            |vogt at linux dot vnet.ibm.com
            Summary|vec_ceil/vec_floor/vec_roun |[7/8 Regression]
                   |d intrincics do not work    |vec_ceil/vec_floor/vec_roun
                   |for gcc 8, need             |d intrincics do not work
                   |__builtin_s390_vfidb.       |for gcc 8, need
                   |                            |__builtin_s390_vfidb
     Ever confirmed|0                           |1

--- Comment #1 from David Edelsohn  ---
Confirmed.

[Bug tree-optimization/82320] [8 Regression] Compile time hog w/ -O

2017-09-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82320

Richard Biener  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2017-09-25
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
   Target Milestone|---                           |8.0
     Ever confirmed|0                             |1

--- Comment #1 from Richard Biener  ---
I will have a look - VN does not converge on the SCC formed by md_11, md_10 and
md_9.

[Bug c++/61806] [C++11] Expression sfinae w/o access gives hard error in partial template specializations

2017-09-25 Thread gcc-bugs at marehr dot dialup.fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61806

gcc-bugs at marehr dot dialup.fu-berlin.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gcc-bugs at marehr dot
                   |                            |dialup.fu-berlin.de

--- Comment #3 from gcc-bugs at marehr dot dialup.fu-berlin.de ---
I think I encountered a variant of this bug.

Using this new awesome -fconcepts feature, you can do the following:

```
template <typename type_t>
struct type_trait;

template <>
struct type_trait
{
static constexpr auto length = 0;
};

template <>
struct type_trait
{
private:
static constexpr auto length = 0;
};

template <typename type_t>
concept bool has_length = requires(type_t a)
{
{ type_trait<type_t>::length };
};

int main()
{
static_assert(!has_length); // expect: false, has no ::length
static_assert(has_length); // expect: true, has ::length
static_assert(!has_length); // expect: false, ::length is non-visible
// but, last one fails in a compiler error
return 0;
}
```

This example asks whether a type_trait is defined for a given type. And it
would be super useful to be able to express this.

I think gcc uses SFINAE internally to check this, but unfortunately it fails,
probably because of this bug.

Re: [PATCH] libstdc++: istreambuf_iterator keep attached streambuf

2017-09-25 Thread Jonathan Wakely

On 23/09/17 09:54 +0300, Petr Ovtchenkov wrote:

istreambuf_iterator should not forget about the attached
streambuf when it reaches EOF.

Checks in debug mode no longer influence character
extraction in the istreambuf_iterator increment operators.
In this respect the behaviour in debug and non-debug mode
is now similar.

Test for detached streambuf in istreambuf_iterator:
When istreambuf_iterator reaches EOF of an istream, it should not
forget about the attached streambuf.
From the fact "EOF in stream reached" it does not follow that
the stream has reached its end of life and that further input
operations are impossible.
---
libstdc++-v3/include/bits/streambuf_iterator.h | 41 +++
.../24_iterators/istreambuf_iterator/3.cc  | 61 ++
2 files changed, 80 insertions(+), 22 deletions(-)
create mode 100644 libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h 
b/libstdc++-v3/include/bits/streambuf_iterator.h
index f0451b1..45c3d89 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -136,12 +136,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  istreambuf_iterator&
  operator++()
  {
-   __glibcxx_requires_cond(!_M_at_eof(),
+   __glibcxx_requires_cond(_M_sbuf,
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
if (_M_sbuf)
  {
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   int_type _tmp =


_tmp is not a reserved name, this needs to be __tmp.

I'm still reviewing the rest, to understand what observable behaviour
this changes, and how it differs from the patch François sent.




[Bug target/82322] New: vec_ceil/vec_floor/vec_round intrincics do not work for gcc 8, need __builtin_s390_vfidb.

2017-09-25 Thread markos at freevec dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82322

Bug ID: 82322
   Summary: vec_ceil/vec_floor/vec_round intrincics do not work
for gcc 8, need __builtin_s390_vfidb.
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: markos at freevec dot org
  Target Milestone: ---
  Host: s390x-ibm-linux-gnu
Target: s390x-ibm-linux-gnu
 Build: s390x-ibm-linux-gnu

Created attachment 42233
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42233&action=edit
testcase  demonstrating compiler error for vec_round, etc.

When trying to compile the attached source with gcc 8.0.0:

> g++ -mzvector -march=z14 /home/markos/Development/zvectortest.cpp -o 
> /home/markos/Development/zvectortest
In file included from /home/markos/Development/zvectortest.cpp:1:0:
/home/markos/Development/zvectortest.cpp: In function 'Packet2d pround(const
Packet2d&)':
/home/markos/Development/zvectortest.cpp:6:45: error: '__builtin_s390_vfi' was
not declared in this scope
 Packet2d pround(const Packet2d& a) { return vec_round(a); }
 ^
/home/markos/Development/zvectortest.cpp:6:45: note: suggested alternative:
'__builtin_s390_vfidb'
/home/markos/Development/zvectortest.cpp: In function 'Packet2d pceil(const
Packet2d&)':
/home/markos/Development/zvectortest.cpp:7:45: error: '__builtin_s390_vfi' was
not declared in this scope
 Packet2d pceil(const  Packet2d& a) { return vec_ceil(a); }
 ^~~~
/home/markos/Development/zvectortest.cpp:7:45: note: suggested alternative:
'__builtin_s390_vfidb'
/home/markos/Development/zvectortest.cpp: In function 'Packet2d pfloor(const
Packet2d&)':
/home/markos/Development/zvectortest.cpp:8:45: error: '__builtin_s390_vfi' was
not declared in this scope
 Packet2d pfloor(const Packet2d& a) { return vec_floor(a); }
 ^
/home/markos/Development/zvectortest.cpp:8:45: note: suggested alternative:
'__builtin_s390_vfidb'

Same result with -march=z13. Works fine with g++-6.

However, it works with both compilers if I guard the code with a preprocessor
check on __GNUC__ < 8 and use helper defines built on the
__builtin_s390_vfidb() intrinsic.

Re: [PATCH, libstdc++]: Fix another instance of -Werror=attributes bootststrap failure

2017-09-25 Thread Jonathan Wakely

On 25/09/17 11:11 +0200, Uros Bizjak wrote:

Hello!

Attached patch fixes -Werror=attributes bootstrap failure on
alphaev68-linux-gnu. The patch declares the alias without a prototype,
as suggested in [1].

2017-09-25  Uros Bizjak  

   PR c/81854
   * src/c++98/complex_io.cc (_GLIBCXX_LDBL_COMPAT): Declare alias
   target as a C++ function with no prototype.

Bootstrapped and regression tested on alphaev68-linux-gnu.

OK for mainline?


OK, thanks.



[Bug target/82317] [8 Regression] "'__builtin_s390_vec_min' matching variant requires z14 or higher" for __vector(2) double when it should work on -march=z13 as well

2017-09-25 Thread dje at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82317

David Edelsohn  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-25
            Summary|error                       |[8 Regression]
                   |"'__builtin_s390_vec_min'   |"'__builtin_s390_vec_min'
                   |matching variant requires   |matching variant requires
                   |z14 or higher" for          |z14 or higher" for
                   |__vector(2) double when it  |__vector(2) double when it
                   |should work on -march=z13   |should work on -march=z13
                   |as well                     |as well
     Ever confirmed|0                           |1

--- Comment #1 from David Edelsohn  ---
Confirmed.

Re: [PATCH] tree-sra: fix compare_access_positions qsort comparator

2017-09-25 Thread Alexander Monakov
On Mon, 25 Sep 2017, Martin Jambor wrote:
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -1542,19 +1542,20 @@ compare_access_positions (const void *a, const void 
> *b)
>  && TREE_CODE (f2->type) != COMPLEX_TYPE
>  && TREE_CODE (f2->type) != VECTOR_TYPE)
>   return -1;
> -  /* Put the integral type with the bigger precision first.  */
> +  /* Put any integral type before any non-integral type.  When splicing, 
> we
> +  make sure that those with insufficient precision and occupupying the

Typo (s/upup/up).

> @@ -2102,6 +2106,21 @@ sort_and_splice_var_accesses (tree var)
>this combination of size and offset, the comparison function
>should have put the scalars first.  */
> gcc_assert (first_scalar || !is_gimple_reg_type (ac2->type));
> +   /* It also prefers integral types to non-integral.  However, when the
> +  precision of the selected type does not span the entire area and
> +  should also be used for a non-integer (i.e. float), we must not
> +  let that happen.  */
> +   if (non_full_precision && !INTEGRAL_TYPE_P (ac2->type))
> + {
> +   if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> +   fprintf (dump_file, "Cannot sclarize the following access "

Typo ('scalarize').

Thanks!  If this is resolved, haifa-sched autoprefetch ranking will become the
last remaining (among discovered so far) inconsistent qsort comparator in GCC.

Alexander


[PATCH] Add helper to sort sibling loops, do so in GRAPHITE

2017-09-25 Thread Richard Biener

The following adds a helper to sort the sibling loop list in RPO order
as it can get messed up (we only ever add loops at the start of the list).
GRAPHITE SCOP detection assumes this list is sorted naturally in RPO
order (as a flow_loops_find would generate).
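
The general technique (collect the siblings into an array, sort the array, then relink the list) can be sketched on a plain singly linked list. This is a simplified illustration, not the cfgloop.c code: struct node, cmp_rpo and sort_siblings are hypothetical names, and each node is assumed to already carry its RPO number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node
{
  int rpo;            /* Assumed precomputed RPO number of the header.  */
  struct node *next;  /* Next sibling.  */
};

/* qsort helper: order node pointers by ascending RPO number.  */
static int
cmp_rpo (const void *a_, const void *b_)
{
  const struct node *a = *(const struct node *const *) a_;
  const struct node *b = *(const struct node *const *) b_;
  return (a->rpo > b->rpo) - (a->rpo < b->rpo);
}

/* Sort the sibling list starting at HEAD and return the new head.
   On allocation failure the list is returned unchanged.  */
static struct node *
sort_siblings (struct node *head)
{
  size_t n = 0;
  for (struct node *p = head; p; p = p->next)
    n++;
  if (n < 2)
    return head;

  struct node **v = malloc (n * sizeof *v);
  if (!v)
    return head;

  size_t i = 0;
  for (struct node *p = head; p; p = p->next)
    v[i++] = p;
  assert (i == n);

  qsort (v, n, sizeof *v, cmp_rpo);

  /* Relink the nodes in sorted order.  */
  struct node **linkp = &head;
  for (i = 0; i < n; i++)
    {
      *linkp = v[i];
      linkp = &v[i]->next;
    }
  *linkp = NULL;

  free (v);
  return head;
}
```

Sorting through a temporary array keeps the relinking step simple and avoids writing a list-specific sort.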

Turns out it helps a few more loops in SPEC CPU 2006 to get optimized
by GRAPHITE.

Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC 2k6 is happy
with GRAPHITE.

I've tested the variant below with the extra call in pass_tree_loop_init,
but since no pass other than GRAPHITE cares about the sibling list order
I'll not commit that hunk.

Applied to trunk (w/o that hunk).

Richard.

2017-09-25  Richard Biener  

* cfgloop.h (sort_sibling_loops): Declare.
* cfgloop.c (sort_sibling_loops_cmp): New helper.
(sort_sibling_loops): New function sorting the sibling loop list
in RPO order.
* graphite.c (graphite_transform_loops): Sort sibling loops.

Index: gcc/cfgloop.c
===
--- gcc/cfgloop.c   (revision 253144)
+++ gcc/cfgloop.c   (working copy)
@@ -521,6 +521,58 @@ flow_loops_find (struct loops *loops)
   return loops;
 }
 
+/* qsort helper for sort_sibling_loops.  */
+
+static int *sort_sibling_loops_cmp_rpo;
+static int
+sort_sibling_loops_cmp (const void *la_, const void *lb_)
+{
+  const struct loop *la = *(const struct loop * const *)la_;
+  const struct loop *lb = *(const struct loop * const *)lb_;
+  return (sort_sibling_loops_cmp_rpo[la->header->index]
+ - sort_sibling_loops_cmp_rpo[lb->header->index]);
+}
+
+/* Sort sibling loops in RPO order.  */
+
+void
+sort_sibling_loops (function *fn)
+{
+  /* Match flow_loops_find in the order we sort sibling loops.  */
+  sort_sibling_loops_cmp_rpo = XNEWVEC (int, last_basic_block_for_fn (cfun));
+  int *rc_order = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
+  pre_and_rev_post_order_compute_fn (fn, NULL, rc_order, false);
+  for (int i = 0; i < n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS; ++i)
+sort_sibling_loops_cmp_rpo[rc_order[i]] = i;
+  free (rc_order);
+
+  auto_vec<loop_p, 3> siblings;
+  loop_p loop;
+  FOR_EACH_LOOP_FN (fn, loop, LI_INCLUDE_ROOT)
+if (loop->inner && loop->inner->next)
+  {
+   loop_p sibling = loop->inner;
+   do
+ {
+   siblings.safe_push (sibling);
+   sibling = sibling->next;
+ }
+   while (sibling);
+   siblings.qsort (sort_sibling_loops_cmp);
+   loop_p *siblingp = &loop->inner;
+   for (unsigned i = 0; i < siblings.length (); ++i)
+ {
+   *siblingp = siblings[i];
+   siblingp = &(*siblingp)->next;
+ }
+   *siblingp = NULL;
+   siblings.truncate (0);
+  }
+
+  free (sort_sibling_loops_cmp_rpo);
+  sort_sibling_loops_cmp_rpo = NULL;
+}
+
 /* Ratio of frequencies of edges so that one of more latch edges is
considered to belong to inner loop with same header.  */
 #define HEAVY_EDGE_RATIO 8
Index: gcc/cfgloop.h
===
--- gcc/cfgloop.h   (revision 253144)
+++ gcc/cfgloop.h   (working copy)
@@ -333,6 +333,7 @@ bool mark_irreducible_loops (void);
 void release_recorded_exits (function *);
 void record_loop_exits (void);
 void rescan_loop_exit (edge, bool, bool);
+void sort_sibling_loops (function *);
 
 /* Loop data structure manipulation/querying.  */
 extern void flow_loop_tree_node_add (struct loop *, struct loop *);
Index: gcc/graphite.c
===
--- gcc/graphite.c  (revision 253144)
+++ gcc/graphite.c  (working copy)
@@ -419,6 +419,7 @@ graphite_transform_loops (void)
   isl_options_set_on_error (ctx, ISL_ON_ERROR_ABORT);
   the_isl_ctx = ctx;
 
+  sort_sibling_loops (cfun);
   canonicalize_loop_closed_ssa_form ();
 
   calculate_dominance_info (CDI_POST_DOMINATORS);
Index: gcc/tree-ssa-loop.c
===
--- gcc/tree-ssa-loop.c (revision 253144)
+++ gcc/tree-ssa-loop.c (working copy)
@@ -359,6 +359,7 @@ pass_tree_loop_init::execute (function *
   | LOOPS_HAVE_RECORDED_EXITS);
   rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
   scev_initialize ();
+  sort_sibling_loops (fun);
 
   return 0;
 }



Re: [PATCH][GRAPHITE] More TLC

2017-09-25 Thread Richard Biener
On Fri, 22 Sep 2017, Sebastian Pop wrote:

> On Fri, Sep 22, 2017 at 8:03 AM, Richard Biener  wrote:
> 
> >
> > This simplifies canonicalize_loop_closed_ssa and does other minimal
> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> > when reworking canonicalize_loop_closed_ssa.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >
> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> > -Ofast -march=haswell -floop-nest-optimize are
> >
> >  61 loop nests "optimized"
> >  45 loop nest transforms cancelled because of code generation issues
> >  21 loop nest optimizations timed out the 35 ISL "operations" we allow
> >
> > I say "optimized" because the usual transform I've seen is static tiling
> > as enforced by GRAPHITE according to --param loop-block-tile-size.
> > There's no way to automagically figure what kind of transform ISL did
> >
> 
> Here is how to automate (without magic) the detection
> of the transform that isl did.
> 
> The problem solved by isl is the minimization of strides
> in memory, and to do this, we need to tell the isl scheduler
> the validity dependence graph, in graphite-optimize-isl.c
> see the validity (RAW, WAR, WAW) and the proximity
> (RAR + validity) maps.  The proximity does include the
> read after read, as the isl scheduler needs to minimize
> strides between consecutive reads.
> 
> When you apply the schedule to the dependence graph,
> one can tell from the result the strides in memory, a good
> way to say whether a transform was beneficial is to sum up
> all memory strides, and make sure that the sum of all strides
> decreases after transform.  We could add a printf with the
> sum of strides before and after transforms, and have the
> testcases check for that.

Interesting.  Can you perhaps show me in code how to do that?

Thanks,
Richard.


Re: [PATCH][GRAPHITE] More TLC

2017-09-25 Thread Richard Biener
On Mon, 25 Sep 2017, Bin.Cheng wrote:

> On Mon, Sep 25, 2017 at 1:46 PM, Richard Biener  wrote:
> > On Mon, 25 Sep 2017, Richard Biener wrote:
> >
> >> On Fri, 22 Sep 2017, Richard Biener wrote:
> >>
> >> >
> >> > This simplifies canonicalize_loop_closed_ssa and does other minimal
> >> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> >> > when reworking canonicalize_loop_closed_ssa.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >> >
> >> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> >> > -Ofast -march=haswell -floop-nest-optimize are
> >> >
> >> >  61 loop nests "optimized"
> >> >  45 loop nest transforms cancelled because of code generation issues
> >> >  21 loop nest optimizations timed out the 35 ISL "operations" we 
> >> > allow
> >>
> >> Overall compile time (with -j6) is 695 sec. w/o -floop-nest-optimize
> >> and 709 sec. with (this was with release checking).
> >>
> >> A single-run has 416.gamess (580s -> 618s),
> >> 436.cactusADM (206s -> 182s), 437.leslie3d (228s ->218s),
> >> 450.soplex (229s -> 226s), 465.tonto (428s -> 425s), 401.bzip2 (383s ->
> >> 379s), 462.libquantum (352s -> 343s), ignoring +-2s changes.  Will
> >> do a 3-run for those to confirm (it would be only a single regression
> >> for 416.gamess).
> >
> > 416.gamess regression confirmed, 450.soplex improvement as well,
> 436/437 improvements?  450.soplex (229s -> 226s) looks like noise.

base is with -floop-nest-optimize, peak without.

                  ---------- base ----------    ---------- peak ----------
Benchmark         Ref time  Run time  Ratio     Ref time  Run time  Ratio
416.gamess           19580       619   31.7 S      19580       576   34.0 *
416.gamess           19580       614   31.9 S      19580       577   33.9 S
416.gamess           19580       618   31.7 *      19580       576   34.0 S
436.cactusADM        11950       194   61.5 S      11950       204   58.5 S
436.cactusADM        11950       184   65.0 S      11950       187   63.8 *
436.cactusADM        11950       186   64.1 *      11950       186   64.1 S
437.leslie3d          9400       219   43.0 S       9400       218   43.1 S
437.leslie3d          9400       219   43.0 *       9400       223   42.1 S
437.leslie3d          9400       218   43.0 S       9400       223   42.2 *
450.soplex            8340       225   37.0 S       8340       231   36.1 S
450.soplex            8340       226   36.9 *       8340       230   36.3 *
450.soplex            8340       227   36.8 S       8340       229   36.4 S
465.tonto             9840       426   23.1 S       9840       427   23.0 *
465.tonto             9840       424   23.2 S       9840       430   22.9 S
465.tonto             9840       425   23.2 *       9840       425   23.2 S
401.bzip2             9650       379   25.5 S       9650       378   25.5 S
401.bzip2             9650       379   25.5 *       9650       380   25.4 *
401.bzip2             9650       379   25.5 S       9650       380   25.4 S
462.libquantum       20720       351   59.0 *      20720       349   59.4 S
462.libquantum       20720       351   59.0 S      20720       345   60.1 *
462.libquantum       20720       352   58.8 S      20720       344   60.2 S



> Thanks,
> bin
> > in the three-run 462.libquantum regresses (344s -> 351s) so I suppose
> > that's noise.
> >
> > Richard.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[Bug tree-optimization/82320] New: [8 Regression] Compile time hog w/ -O

2017-09-25 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82320

Bug ID: 82320
   Summary: [8 Regression] Compile time hog w/ -O
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Keywords: compile-time-hog
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-8.0.0-alpha20170917 snapshot (r252896) takes indefinite time when compiling
the following snippet w/ any optimization level except -O0:

void
ec (int n4, short int ea)
{
  if (1)
{
  if (ea != 0)
{
  int *c1 = (int *)

 nn:
  for (;;)
++*c1;
}
}
  else
{
  int *lq = 
  int *md;
  int da;

  goto nn;

 r1:
  md = lq;
  for (da = 0; da < 1; ++da)
{
 ig:
  ++n4;
  *md += n4;
}
}

  for (ea = 0; ea < 1; ++ea)
goto r1;

  goto ig;
}

% timeout 10 gcc-8.0.0-alpha20170924 -O1 -c yz3jwezs.c
zsh: exit 124   timeout 10 gcc-8.0.0-alpha20170924 -O1 -c yz3jwezs.c

perf top shows the following:

  24.40%  cc1   [.] visit_use
  22.59%  cc1   [.] operand_equal_p
   9.84%  cc1   [.] VN_INFO
   9.53%  cc1   [.] set_ssa_val_to
   8.75%  cc1   [.] DFS
   7.36%  cc1   [.] ssa_defined_default_def_p
   4.13%  cc1   [.] mark_use_processed
   3.70%  cc1   [.] ssa_undefined_value_p
   2.22%  cc1   [.] element_precision
   1.80%  cc1   [.] tree_strip_nop_conversions
   1.71%  cc1   [.] is_gimple_min_invariant
   1.35%  cc1   [.] _obstack_begin_worker
   0.76%  cc1   [.] _obstack_free
   0.63%  cc1   [.] mempool_obstack_chunk_free
   0.58%  cc1   [.] _obstack_begin
   0.23%  cc1   [.] mempool_obstack_chunk_alloc
   0.22%  cc1   [.] call_freefun
   0.17%  cc1   [.] call_chunkfun

[Bug tree-optimization/82321] New: [8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:707

2017-09-25 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82321

Bug ID: 82321
   Summary: [8 Regression] ICE in check_loop_closed_ssa_use, at
tree-ssa-loop-manip.c:707
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-8.0.0-alpha20170924 snapshot (r253127) ICEs when compiling the following
snippet w/ -O2 -floop-nest-optimize:

int y8;

void
dm (int io)
{
  if (y8 != 0)
{
  int pu = 1;

  while (io < 2)
{
  int xo = (pu != 0) ? y8 : 0;

  while (y8 != 0)
if (xo != 0)
  {
 gi:
xo = 
pu = 0;
  }
}
}

  if (io != 0)
{
  y8 = 1;
  while (y8 != 0)
if (io / !y8 != 0)
  y8 = 0;

  goto gi;
}
}

% gcc-8.0.0-alpha20170924 -O2 -floop-nest-optimize -w -c r9nismdn.c
during GIMPLE pass: graphite
r9nismdn.c: In function 'dm':
r9nismdn.c:4:1: internal compiler error: in check_loop_closed_ssa_use, at
tree-ssa-loop-manip.c:707
 dm (int io)
 ^~

RE: Enable no-exec stacks for more targets using the Linux kernel

2017-09-25 Thread Nagaraju Mekala
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Nagaraju Mekala
> Sent: Thursday, September 21, 2017 2:56 PM
> To: Joseph Myers 
> Cc: sch...@suse.de; gcc-patches@gcc.gnu.org; l...@redhat.com;
> d...@anglin.bell.net; wil...@tuliptree.org; Michael Eager
> 
> Subject: RE: Enable no-exec stacks for more targets using the Linux kernel
> 
> > -Original Message-
> > From: Joseph Myers [mailto:jos...@codesourcery.com]
> > Sent: Wednesday, September 20, 2017 5:52 PM
> > To: Nagaraju Mekala 
> > Cc: sch...@suse.de; gcc-patches@gcc.gnu.org; l...@redhat.com;
> > d...@anglin.bell.net; wil...@tuliptree.org; Michael Eager
> > 
> > Subject: Re: Enable no-exec stacks for more targets using the Linux
> > kernel
> >
> > On Wed, 20 Sep 2017, Nagaraju Mekala wrote:
> >
> > > > I've sent a glibc patch
> > > > .  I
> > > >think the key questions for architecture experts now are: on each
> > > >of those three architectures, do trampolines ever require
> > > >executable stacks, and, if they do, how does this work at present
> > > >when the kernel defaults to non-executable and my understanding at
> > > > would
> > > >be that glibc would only make thread stacks executable on those
> > > >architectures, not the main process stacks, and GCC will never
> > > >generate an explicit marker on those architectures to request an
> > > >executable stack?
> > >
> > > Microblaze is a soft processor with many configuration options. If
> > > we don't use the MMU, there is nothing preventing execution of code
> > > on the stack in the MicroBlaze architecture.
> > >  With the MMU, you have the option to make any page, including the
> > > stack  pages, executable or not.
> > >
> > > It is recommended to prevent execution on the stack by defining
> > > those pages as non-executable in the MMU. In particular, trampolines
> > > would have to be possible to code without execution on the stack
> >
> > No-MMU configurations are not relevant to a glibc change; the question
> > is how things work for configurations using glibc, with the Linux
> > kernel, with an MMU.  In such a configuration, for MicroBlaze: (a) is
> > the stack in fact executable, now; (b) if it is, what makes it so
> > given the kernel default to non- executable (where does my reasoning
> > about what the kernel and glibc do go wrong); (c) if it is not executable,
> > do trampolines work anyway?
> 
> The MMU configuration doesn't need stack to be executable.
> The glibc patch related to Microblaze looks correct and it should solve the
> issue.
> We will verify the change in a day or two and get back to you.
>
Sorry for the confusion.
We need to apply this GCC patch for the MicroBlaze target and make the stack
executable.
I have verified the patch and there are no regressions with it.
Thanks,
Nagaraju
> 
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com


[Bug c/82318] -fexcess-precision=standard has no effect on a libm function call

2017-09-25 Thread vincent-gcc at vinc17 dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82318

--- Comment #3 from Vincent Lefèvre  ---
(In reply to Andrew Pinski from comment #1)
> This is interesting because log2 should have already done a rounding to
> double before returning.

I suppose that if the C library has been built with GCC without
-fexcess-precision=standard (explicitly or implied), then this rounding is not
done.

I've done my tests on a Debian/unstable machine (currently glibc 2.24).

Re: [PATCH][GRAPHITE] More TLC

2017-09-25 Thread Bin.Cheng
On Mon, Sep 25, 2017 at 1:46 PM, Richard Biener  wrote:
> On Mon, 25 Sep 2017, Richard Biener wrote:
>
>> On Fri, 22 Sep 2017, Richard Biener wrote:
>>
>> >
>> > This simplifies canonicalize_loop_closed_ssa and does other minimal
>> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
>> > when reworking canonicalize_loop_closed_ssa.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>> >
>> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
>> > -Ofast -march=haswell -floop-nest-optimize are
>> >
>> >  61 loop nests "optimized"
>> >  45 loop nest transforms cancelled because of code generation issues
>> >  21 loop nest optimizations timed out the 35 ISL "operations" we allow
>>
>> Overall compile time (with -j6) is 695 sec. w/o -floop-nest-optimize
>> and 709 sec. with (this was with release checking).
>>
>> A single-run has 416.gamess (580s -> 618s),
>> 436.cactusADM (206s -> 182s), 437.leslie3d (228s ->218s),
>> 450.soplex (229s -> 226s), 465.tonto (428s -> 425s), 401.bzip2 (383s ->
>> 379s), 462.libquantum (352s -> 343s), ignoring +-2s changes.  Will
>> do a 3-run for those to confirm (it would be only a single regression
>> for 416.gamess).
>
> 416.gamess regression confirmed, 450.soplex improvement as well,
436/437 improvements?  450.soplex (229s -> 226s) looks like noise.

Thanks,
bin
> in the three-run 462.libquantum regresses (344s -> 351s) so I suppose
> that's noise.
>
> Richard.


Re: Don't query the frontend for unsupported types

2017-09-25 Thread Richard Biener
On Fri, Sep 22, 2017 at 6:42 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Thu, Sep 21, 2017 at 2:56 PM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On September 20, 2017 2:36:03 PM GMT+02:00, Richard Sandiford
  wrote:
>When forcing a constant of mode MODE into memory, force_const_mem
>asks the frontend to provide the type associated with that mode.
>In principle type_for_mode is allowed to return null, and although
>one use site correctly handled that, the other didn't.
>
>I think there's agreement that it's bogus to use type_for_mode for
>this kind of thing, since it forces frontends to handle types that
>don't exist in that language.  See e.g. http://gcc.gnu.org/PR46805
>where the Go frontend was forced to handle vector types even though
>Go doesn't have vector types.
>
>Also, the frontends use code like:
>
>  else if (VECTOR_MODE_P (mode))
>{
>  machine_mode inner_mode = GET_MODE_INNER (mode);
>  tree inner_type = c_common_type_for_mode (inner_mode, unsignedp);
>  if (inner_type != NULL_TREE)
>return build_vector_type_for_mode (inner_type, mode);
>}
>
>and there's no guarantee that every vector mode M used by backend
>rtl has an associated vector type whose TYPE_MODE is M.  I think
>really the type_for_mode hook should only return trees that _do_ have
>the requested TYPE_MODE, but PR46805 linked above shows that this is
>likely to have too many knock-on consequences.  It doesn't make sense
>for force_const_mem to ask about vector modes that aren't valid for
>vector types, so this patch handles the condition there instead.
>
>This is needed for SVE multi-register modes, which are modelled as
>vector modes but are not usable as vector types.
>
>Tested on aarch64-linux-gnu, x86_64-linux-gnu and
>powerpc64le-linux-gnu.
>OK to install?

 I think we should get rid of the use entirely.
>>>
>>> I first read this as not using type_for_mode at all in force_const_mem,
>>> which sounded like a good thing :-)
>>
>> That's what I meant ;)  A mode doesn't really have a type...
>>
>>   I tried it overnight on the usual
>>> at-least-one-target-per-CPU set and diffing the before and after
>>> assembly for the testsuite.  And it looks like i686 relies on this
>>> to get an alignment of 16 rather than 4 for XFmode constants:
>>> GET_MODE_ALIGNMENT (XFmode) == 32 (as requested by i386-modes.def),
>>> but i386's CONSTANT_ALIGNMENT increases it to 128 for static constants.
>>
>> Then the issue is that CONSTANT_ALIGNMENT takes a tree and not a mode...
>> even worse than type_for_mode is a use of make_tree!  Incidentally
>> ix86_constant_alignment _does_ look at the mode in the end...
>
> OK, I guess this means another target hook conversion.  The patch
> below converts CONSTANT_ALIGNMENT with its current interface.
> The definition:
>
>   #define CONSTANT_ALIGNMENT(EXP, ALIGN) \
> (TREE_CODE (EXP) == STRING_CST \
>  && (ALIGN) < BITS_PER_WORD ? BITS_PER_WORD : (ALIGN))
>
> was very common, so the patch adds a canned definition for that,
> called constant_alignment_word_strings.  Some ports had a variation
> that used a port-local FASTEST_ALIGNMENT instead of BITS_PER_WORD;
> the patch uses constant_alignment_word_strings if FASTEST_ALIGNMENT
> was always BITS_PER_WORD and a port-local hook function otherwise.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> Also tested by comparing the testsuite assembly output on at least one
> target per CPU directory.  I don't think this comes under Jeff's
> preapproval due to the constant_alignment_word_strings thing, so:
> OK to install?

Ok.

Thanks,
Richard.

> If so, then I'll follow up with a separate hook for rtl modes, which
> varasm, default_constant_alignment and constant_alignment_word_strings
> can all use.
>
> Thanks,
> Richard
>
>
> 2017-09-22  Richard Sandiford  
>
> gcc/
> * target.def (constant_alignment): New hook.
> * defaults.h (CONSTANT_ALIGNMENT): Delete.
> * doc/tm.texi.in (CONSTANT_ALIGNMENT): Replace with...
> (TARGET_CONSTANT_ALIGNMENT): ...this new hook.
> * doc/tm.texi: Regenerate.
> * targhooks.h (default_constant_alignment): Declare.
> (constant_alignment_word_strings): Likewise.
> * targhooks.c (default_constant_alignment): New function.
> (constant_alignment_word_strings): Likewise.
> * builtins.c (get_object_alignment_2): Use targetm.constant_alignment
> instead of CONSTANT_ALIGNMENT.
> * varasm.c (align_variable, get_variable_align, build_constant_desc)
> (force_const_mem): Likewise.
> * config/aarch64/aarch64.h 

Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to misalign in vect_hw_misalign (PR 78421)

2017-09-25 Thread Christophe Lyon
On 25 September 2017 at 11:36, Tamar Christina  wrote:
>
>
>> -Original Message-
>> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
>> Sent: 23 September 2017 18:52
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd; Ramana Radhakrishnan; Richard Earnshaw;
>> ni...@redhat.com; Kyrylo Tkachov
>> Subject: Re: [GCC][PATCH][TESTSUITE][ARM][COMMITTED] Invert check to
>> misalign in vect_hw_misalign (PR 78421)
>>
>> On 23 September 2017 at 16:12, Christophe Lyon
>>  wrote:
>> > On 21 September 2017 at 16:48, Tamar Christina
>>  wrote:
>> >> Hi All,
>> >>
>> >> Commit r244796 changed vect_hw_misalign for arm to check against
>> >> arm_vect_no_misalign. However vect_hw_misalign is supposed to check
>> >> if a target supports misalign access, while arm_vect_no_misalign
>> >> checks that a target only supports aligned access.
>> >>
>> >> As such the results need to be inverted otherwise the test runs in
>> >> exactly the wrong circumstances.
>> >>
>> >> Committed as r253073 under the GCC obvious rule.
>> >
>> > Hi Tamar,
>> >
>
> Hi Christoph,
>
>> > This is not as obvious as we might think. This patch is causing tcl
>> > syntax errors, such as:
>> > error executing dg-final: expected boolean value but got "!1"
>> >
>
> Ack, I didn't see this before, I only checked for tests using 
> arm_vect_no_misalign and retested those and the few
> that were failing before. Sorry about that.
>

The tcl error messages may be difficult to notice, they tend to be
"hidden" in the middle of the logs.

>> > I plan to send a fix b/o next week.
>> >
>>
>> The attached patch would apply after reverting yours.
>> I've applied it against r253072 (just before your patch) and the results are
>> visible at:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-
>> patches/253072-hw-misalign2/report-build-info.html
>>
>> Do they match your expectations? It looks like a few testcases need to be
>> adjusted, or new bugs are uncovered.
>>
>> Is the patch OK with a suitable ChangeLog entry?
>
> That looks OK to me, about ~40 new tests correctly run now, and I think the 
> new failures are just test-isms but I'll
> look at them more closely today.
>
Thanks for taking a look.

I've added Mike in cc so that he can confirm my patch is right. There
might be a better tcl idiom.

Thanks,

Christophe

> Thanks,
> Tamar
>
>>
>> Thanks,
>>
>> Christophe
>>
>> > Christophe
>> >
>> >>
>> >> Thanks,
>> >> Tamar
>> >>
>> >> gcc/ChangeLog
>> >> 2017-09-21  Tamar Christina  
>> >>
>> >> PR testsuite/78421
>> >> * lib/target-supports.exp
>> (check_effective_target_vect_hw_misalign):
>> >> Invert arm check.
>> >>
>> >> --


Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 13:36, Martin Liška wrote:
> 
> Would be great, what exactly do you want to visualize? For me, even having 
> green/red spots
> works fine in order to quickly identify what builds are wrong.
> 

There are several options and I think mostly it depends on what everyone
would like to see, but I am thinking of a dashboard with green/red
spots as you mention, which depend not on the existence of failures
but on the existence of a regression at a certain revision. Also, a
historical graph of results and gcc build times might be interesting as
well.

For benchmarks like Qt, blitz (as mentioned in the gcc testing page), we
can plot the build time of the benchmark and resulting size when
compiling for size.

Again, I expect that once there's something visible and people are keen
to use it, they'll ask for something specific. However, once the
infrastructure is in place, it shouldn't be too hard to add specific
visualizations.

> 
> Hopefully both. I'm attaching my config file (probably more for inspiration 
> than a real use).
> I'll ask my manager whether we can find a machine that can run more complex 
> tests. I'll inform you.
> 

Thanks for the configuration file. I will take a look. Will eagerly wait
for news on the hardware request.

> 
> Yes, duplication in way that it is (will be) same things. I'm adding author 
> of the tool,
> hopefully we can unify the effort (and resources of course).
> 

Great.

-- 
Paulo Matos


Re: [Ada] Improve performance of 'Image with enumeration types.

2017-09-25 Thread Duncan Sands

Hi,

On 09/25/2017 10:54 AM, Pierre-Marie de Rodat wrote:

This patch improves the performance of the code generated by the compiler
for attribute Image when applied to user-defined enumeration types and the
sources are compiled with optimizations enabled.


it looks like this is in essence inlining the run-time library routine.  In 
which case, shouldn't you only do it if inlining is enabled?  For example, it 
seems rather odd to do this if compiling with -Os.


Best wishes, Duncan.



No test required.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-25  Javier Miranda  

* exp_imgv.adb (Is_User_Defined_Enumeration_Type): New subprogram.
(Expand_User_Defined_Enumeration_Image): New subprogram.
(Expand_Image_Attribute): Enable speed-optimized expansion of
user-defined enumeration types when we are compiling with optimizations
enabled.





Re: [PATCH][GRAPHITE] More TLC

2017-09-25 Thread Richard Biener
On Mon, 25 Sep 2017, Richard Biener wrote:

> On Fri, 22 Sep 2017, Richard Biener wrote:
> 
> > 
> > This simplifies canonicalize_loop_closed_ssa and does other minimal
> > TLC.  It also adds a testcase I reduced from a stupid mistake I made
> > when reworking canonicalize_loop_closed_ssa.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > 
> > SPEC CPU 2006 is happy with it, current statistics on x86_64 with
> > -Ofast -march=haswell -floop-nest-optimize are
> > 
> >  61 loop nests "optimized"
> >  45 loop nest transforms cancelled because of code generation issues
> >  21 loop nest optimizations timed out the 35 ISL "operations" we allow
> 
> Overall compile time (with -j6) is 695 sec. w/o -floop-nest-optimize
> and 709 sec. with (this was with release checking).
> 
> A single-run has 416.gamess (580s -> 618s),
> 436.cactusADM (206s -> 182s), 437.leslie3d (228s ->218s),
> 450.soplex (229s -> 226s), 465.tonto (428s -> 425s), 401.bzip2 (383s -> 
> 379s), 462.libquantum (352s -> 343s), ignoring +-2s changes.  Will
> do a 3-run for those to confirm (it would be only a single regression
> for 416.gamess).

416.gamess regression confirmed, 450.soplex improvement as well,
in the three-run 462.libquantum regresses (344s -> 351s) so I suppose
that's noise.

Richard.


Re: [RFA][PATCH] Stack clash protection 06/08 - V4

2017-09-25 Thread Segher Boessenkool
On Mon, Sep 25, 2017 at 05:52:27AM -0500, Segher Boessenkool wrote:
> Bootstrap+testsuite finished on BE, but I forgot to enable stack-clash
> protection by default, whoops.  Will have results later today (also LE).

Some new failures show up:

+FAIL: c-c++-common/ubsan/vla-1.c   -O0  execution test

/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:18:7: runtime 
error: variable length array bound evaluates to non-positive value -1
/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
error: variable length array bound evaluates to non-positive value -1
/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:24:7: runtime 
error: variable length array bound evaluates to non-positive value -1
/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
error: variable length array bound evaluates to non-positive value -1
/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
error: variable length array bound evaluates to non-positive value -1
/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:30:7: runtime 
error: variable length array bound evaluates to non-positive value -1
/home/segher/src/gcc/gcc/testsuite/c-c++-common/ubsan/vla-1.c:36:7: runtime 
error: variable length array bound evaluates to non-positive value -5

(both gcc and g++, both -m32 and -m64).

This is BE; LE is still running.


Segher


Re: [Ada] Use the Monotonic Clock on Linux

2017-09-25 Thread Duncan Sands

Hi,

On 09/25/2017 10:47 AM, Pierre-Marie de Rodat wrote:

The monotonic clock epoch is set to some undetermined time
in the past (typically system boot time).  In order to use the
monotonic clock for absolute time, the offset from a known epoch
is calculated and incorporated into timed delay and sleep.



--- libgnarl/s-taprop__linux.adb(revision 253134)
+++ libgnarl/s-taprop__linux.adb(working copy)
@@ -257,6 +266,73 @@
end if;
 end Abort_Handler;
  
+   ----------------------------------
+   -- Compute_Base_Monotonic_Clock --
+   ----------------------------------
+
+   function Compute_Base_Monotonic_Clock return Duration is
+  TS_Bef0, TS_Mon0, TS_Aft0 : aliased timespec;
+  TS_Bef,  TS_Mon,  TS_Aft  : aliased timespec;
+  Bef, Mon, Aft : Duration;
+  Res_B, Res_M, Res_A   : Interfaces.C.int;
+   begin
+  Res_B := clock_gettime
+   (clock_id => OSC.CLOCK_REALTIME, tp => TS_Bef0'Unchecked_Access);
+  pragma Assert (Res_B = 0);
+  Res_M := clock_gettime
+   (clock_id => OSC.CLOCK_RT_Ada, tp => TS_Mon0'Unchecked_Access);
+  pragma Assert (Res_M = 0);
+  Res_A := clock_gettime
+   (clock_id => OSC.CLOCK_REALTIME, tp => TS_Aft0'Unchecked_Access);
+  pragma Assert (Res_A = 0);
+
+  for I in 1 .. 10 loop
+ --  Guard against a leap second which will cause CLOCK_REALTIME
+ --  to jump backwards.  In the extremely unlikely event we call
+ --  clock_gettime before and after the jump the epoch result will
+ --  be off slightly.
+ --  Use only results where the tv_sec values match for the sake
+ --  of convenience.
+ --  Also try to calculate the most accurate
+ --  epoch by taking the minimum difference of 10 tries.
+
+ Res_B := clock_gettime
+  (clock_id => OSC.CLOCK_REALTIME, tp => TS_Bef'Unchecked_Access);
+ pragma Assert (Res_B = 0);
+ Res_M := clock_gettime
+  (clock_id => OSC.CLOCK_RT_Ada, tp => TS_Mon'Unchecked_Access);
+ pragma Assert (Res_M = 0);
+ Res_A := clock_gettime
+  (clock_id => OSC.CLOCK_REALTIME, tp => TS_Aft'Unchecked_Access);
+ pragma Assert (Res_A = 0);
+
+ if (TS_Bef0.tv_sec /= TS_Aft0.tv_sec and then
+ TS_Bef.tv_sec  = TS_Aft.tv_sec)
+--  The calls to clock_gettime before the loop were no good.
+or else
+(TS_Bef0.tv_sec = TS_Aft0.tv_sec and then
+ TS_Bef.tv_sec  = TS_Aft.tv_sec and then
+(TS_Aft.tv_nsec  - TS_Bef.tv_nsec <
+ TS_Aft0.tv_nsec - TS_Bef0.tv_nsec))
+--  The most recent calls to clock_gettime were more better.


were more better -> were better

Best wishes, Duncan.


+ then
+TS_Bef0.tv_sec := TS_Bef.tv_sec;
+TS_Bef0.tv_nsec := TS_Bef.tv_nsec;
+TS_Aft0.tv_sec := TS_Aft.tv_sec;
+TS_Aft0.tv_nsec := TS_Aft.tv_nsec;
+TS_Mon0.tv_sec := TS_Mon.tv_sec;
+TS_Mon0.tv_nsec := TS_Mon.tv_nsec;
+ end if;
+  end loop;
+
+  Bef := To_Duration (TS_Bef0);
+  Mon := To_Duration (TS_Mon0);
+  Aft := To_Duration (TS_Aft0);
+
+  return Bef / 2 + Aft / 2 - Mon;
+  --  Distribute the division to avoid potential type overflow someday.
+   end Compute_Base_Monotonic_Clock;
+
 --------------
 -- Lock_RTS --
 --------------




Re: [PATCH] tree-sra: fix compare_access_positions qsort comparator

2017-09-25 Thread Martin Jambor
On Thu, Sep 21, 2017 at 08:27:31PM +0300, Alexander Monakov wrote:
> Hi,
> 
> The compare_access_positions qsort comparator lacks transitivity, although
> somewhat surprisingly this issue didn't manifest on 64-bit x86 bootstraps.
> The first invalid comparison step is here (tree-sra.c:1545):
> 
>   /* Put the integral type with the bigger precision first.  */
>   else if (INTEGRAL_TYPE_P (f1->type)
>  && INTEGRAL_TYPE_P (f2->type))
>   return TYPE_PRECISION (f2->type) - TYPE_PRECISION (f1->type);
> 
> Imagine you have items A, B, C such that they compare equal according to
> preceding comparison steps, A and C are integral and have precision 64 resp.
> 32, B is non-integral.  Then you have C < A according to this step, but
> comparisons against B depend on TYPE_UID, so you can end up with A < B < C < 
> A.
> 
> A minimal fix would be to order all integral items before/after non-integral,
> like preceding code already does for aggregate/vector/complex types.

Thanks for spotting this.  Nevertheless, I would prefer SRA to select
integer types over non-integer ones (i.e. floats), because in the
common scenario which is a union of a float and an int, the int is the
type for which we can enable more subsequent optimizations, even if it
means that we do not scalarize some exotic cases.  So I'm currently
testing the following (if we ever find that this is a problem in
practice, we can fix it at the cost of making
sort_and_splice_var_accesses more complicated but capable of undoing
this decision).
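
The lack of transitivity in the quoted report can be reproduced with a toy model. The sketch below is only an illustration: struct toy_type, its fields and the three sample values are made up here and are not GCC's tree accessors. It models the old rule (compare precisions only when both types are integral, otherwise fall back to the UID) and a rule that orders every integral type before every non-integral one.

```c
/* Toy stand-in for the type information the comparator looks at.  */
struct toy_type
{
  int integral;   /* nonzero for an integer type */
  int precision;  /* precision in bits, meaningful only if integral */
  int uid;        /* unique id used as the final tie-breaker */
};

/* Old rule: compare precisions only when both types are integral,
   otherwise fall through to the uid.  */
int
old_cmp (const struct toy_type *f1, const struct toy_type *f2)
{
  if (f1->integral && f2->integral)
    return f2->precision - f1->precision;
  return f1->uid - f2->uid;
}

/* Revised rule: any integral type sorts before any non-integral one,
   then bigger precision first, then uid.  */
int
new_cmp (const struct toy_type *f1, const struct toy_type *f2)
{
  if (f1->integral && !f2->integral)
    return -1;
  if (!f1->integral && f2->integral)
    return 1;
  if (f1->integral && f2->integral && f1->precision != f2->precision)
    return f2->precision - f1->precision;
  return f1->uid - f2->uid;
}

/* Hypothetical sample types: two integers and one float-like type.  */
const struct toy_type wide = { 1, 64, 3 };
const struct toy_type narrow = { 1, 32, 1 };
const struct toy_type flt = { 0, 0, 2 };

/* Return nonzero if CMP claims wide < narrow < flt < wide.  */
int
has_cycle (int (*cmp) (const struct toy_type *, const struct toy_type *))
{
  return (cmp (&wide, &narrow) < 0
	  && cmp (&narrow, &flt) < 0
	  && cmp (&flt, &wide) < 0);
}
```

With these values has_cycle (old_cmp) is nonzero, because the precision step and the UID step disagree about where the non-integral item belongs, while has_cycle (new_cmp) is zero.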

By the way, after I jettison IPA-SRA from tree-sra.c, I'll start
reworking it in a way that does not have the qsort in it.

Thanks again,

Martin

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 163b7a2d03b..0f92033d0bb 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -1542,19 +1542,20 @@ compare_access_positions (const void *a, const void *b)
   && TREE_CODE (f2->type) != COMPLEX_TYPE
   && TREE_CODE (f2->type) != VECTOR_TYPE)
return -1;
-  /* Put the integral type with the bigger precision first.  */
+  /* Put any integral type before any non-integral type.  When splicing, we
+make sure that those with insufficient precision and occupying the
+same space are not scalarized.  */
   else if (INTEGRAL_TYPE_P (f1->type)
+  && !INTEGRAL_TYPE_P (f2->type))
+   return -1;
+  else if (!INTEGRAL_TYPE_P (f1->type)
   && INTEGRAL_TYPE_P (f2->type))
-   return TYPE_PRECISION (f2->type) - TYPE_PRECISION (f1->type);
-  /* Put any integral type with non-full precision last.  */
-  else if (INTEGRAL_TYPE_P (f1->type)
-  && (TREE_INT_CST_LOW (TYPE_SIZE (f1->type))
-  != TYPE_PRECISION (f1->type)))
return 1;
-  else if (INTEGRAL_TYPE_P (f2->type)
-  && (TREE_INT_CST_LOW (TYPE_SIZE (f2->type))
-  != TYPE_PRECISION (f2->type)))
-   return -1;
+  /* Put the integral type with the bigger precision first.  */
+  else if (INTEGRAL_TYPE_P (f1->type)
+  && INTEGRAL_TYPE_P (f2->type)
+  && (TYPE_PRECISION (f2->type) != TYPE_PRECISION (f1->type)))
+   return TYPE_PRECISION (f2->type) - TYPE_PRECISION (f1->type);
   /* Stabilize the sort.  */
   return TYPE_UID (f1->type) - TYPE_UID (f2->type);
 }
@@ -2055,6 +2056,9 @@ sort_and_splice_var_accesses (tree var)
   bool grp_partial_lhs = access->grp_partial_lhs;
   bool first_scalar = is_gimple_reg_type (access->type);
   bool unscalarizable_region = access->grp_unscalarizable_region;
+  bool non_full_precision = (INTEGRAL_TYPE_P (access->type)
+&& (TREE_INT_CST_LOW (TYPE_SIZE (access->type))
+!= TYPE_PRECISION (access->type)));
 
   if (first || access->offset >= high)
{
@@ -2102,6 +2106,21 @@ sort_and_splice_var_accesses (tree var)
 this combination of size and offset, the comparison function
 should have put the scalars first.  */
  gcc_assert (first_scalar || !is_gimple_reg_type (ac2->type));
+ /* It also prefers integral types to non-integral.  However, when the
+precision of the selected type does not span the entire area and
+should also be used for a non-integer (i.e. float), we must not
+let that happen.  */
+ if (non_full_precision && !INTEGRAL_TYPE_P (ac2->type))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Cannot scalarize the following access "
+  "because insufficient precision integer type was "
+  "selected.\n  ");
+ dump_access (dump_file, access, false);
+   }
+ unscalarizable_region = true;
+   }
  ac2->group_representative = access;
  j++;
}


[Bug debug/82155] [7/8 Regression] ICE in dwarf2out_abstract_function, at dwarf2out.c:21655

2017-09-25 Thread pmderodat at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82155

--- Comment #5 from pmderodat at gcc dot gnu.org ---
Author: pmderodat
Date: Mon Sep 25 12:26:36 2017
New Revision: 253147

URL: https://gcc.gnu.org/viewcvs?rev=253147&root=gcc&view=rev
Log:
[PR82155] Fix crash in dwarf2out_abstract_function

This patch is an attempt to fix the crash reported in PR82155.

When generating a C++ class method for a class that is itself nested in
a class method, dwarf2out_early_global_decl currently leaves the
existing context DIE as it is if it already exists.  However, it is
possible that this call happens at a point where this context DIE is
just a declaration that is itself not located in its own context.

From there, if dwarf2out_early_global_decl is not called on any of the
FUNCTION_DECL in the context chain, DIEs will be left badly scoped and
some (such as the nested method) will be removed by the type pruning
machinery.  As a consequence, dwarf2out_abstract_function will
crash when called on the corresponding DECL because it asserts that the
DECL has a DIE.

This patch fixes this crash making dwarf2out_early_global_decl process
context DIEs the same way we process abstract origins for FUNCTION_DECL:
if the corresponding DIE exists but is only a declaration, call
dwarf2out_decl anyway on it so that it is turned into a more complete
DIE and so that it is relocated in the proper context.

Bootstrapped and regtested on x86_64-linux.

gcc/

PR debug/82155
* dwarf2out.c (dwarf2out_early_global_decl): Call dwarf2out_decl
on the FUNCTION_DECL function context if it has a DIE that is a
declaration.

gcc/testsuite/

* g++.dg/pr82155.C: New testcase.

Added:
trunk/gcc/testsuite/g++.dg/pr82155.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/dwarf2out.c
trunk/gcc/testsuite/ChangeLog

Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-09-25 Thread Pierre-Marie de Rodat

On 09/25/2017 01:54 PM, Richard Biener wrote:

Ok for trunk and gcc-7 branch after a while.


Thank you, Richard! Committed on trunk as 253147; I’ll wait at least one 
week to revisit the gcc-7 branch commit.


--
Pierre-Marie de Rodat


Re: Add VEC_DUPLICATE_{CST,EXPR} and associated optab

2017-09-25 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, Sep 25, 2017 at 1:08 PM, Richard Sandiford
>  wrote:
>> SVE needs a way of broadcasting a scalar to a variable-length vector.
>> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
>> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
>> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
>> equivalent of the existing rtl code VEC_DUPLICATE.
>>
>> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
>> to mark constant nodes, but in response to last year's RFC, Richard B.
>> suggested it would be better to have separate codes for the constant
>> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
>> as a normal unary operation and avoids the previous need for treating
>> it as a GIMPLE_SINGLE_RHS.
>>
>> It might make sense to use VEC_DUPLICATE_CST for all duplicated
>> vector constants, since it's a bit more compact than VECTOR_CST
>> in that case, and is potentially more efficient to process.  I don't
>> have any specific plans to do that though.  We'll need to keep both
>> types of constant around whatever happens.
>
> I think VEC_DUPLICATE_EXPR is a good thing to have.  Looking at the
> changelog you didn't patch build_vector_from_val to make use of either
> new tree code?  That would get you (quite) some testing coverage.

I didn't want to change the use of VECTOR_CST and CONSTRUCTOR for
fixed-length vectors since that would be another invasive change,
and wouldn't remove the need for supporting VECTOR_CST and CONSTRUCTOR
in all the places that currently handle it.  I think it would make sense
to do it only after the variable-length support has settled.

The SVE patches do make build_vector_from_val use these codes
for variable-length vectors:

  if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ())
{
  if (CONSTANT_CLASS_P (sc))
   return build_vec_duplicate_cst (vectype, sc);
  return fold_build1 (VEC_DUPLICATE_EXPR, vectype, sc);
}

> Currently we require all elements of a VECTOR_CST to be present -- how
> difficult would it be to declare that if and only if the first
> element is present then all following elements are the same as the
> last one?  That said, I'm looking for a loop-hole to not add the extra
> VEC_DUPLICATE_CST code ... eventually we can simply allow scalars in
> contexts where vectors are valid?  Like we do for shifts?  OTOH that'd
> be "implicitly typed constants" (depends on context) like CONST_INT,
> so probably not the way to go?

I don't think it would be as bad as CONST_INT, because at least it would
still have a type.  But the vast majority of code that sees an INTEGER_CST
is going to expect it to be a scalar integer.  I don't think trying to
reuse it for vectors would make things cleaner.

Note also that we need a VEC_SERIES_CST and VEC_SERIES_EXPR for linear
series.  Unlike VEC_DUPLICATE_CST, that's restricted to integer types,
to avoid awkward rounding questions with floats.

> The ugly thing about the new codes is that we go from 3 cases when folding
> vector CONSTRUCTOR and VECTOR_CST we now have 24 to cover...
> (if I didn't miscount).  This now really asks for some common iterator over
> elements of a vector (and VEC_DUPLICATE_{EXPR,CST} would just return the first
> elt all the time).  Note that using scalars instead of vectors reduces the
> combinatorial explosion a bit (scalar and scalar const can be handled
> the same).

One of the advantages of restricting the new codes to variable-length
vectors is that you never get combinations of the old and new codes.
So at the moment this adds only a single case for each fold.
With VEC_SERIES_CST we get 4 new cases for PLUS and MINUS, but
not for much else.

If we did extend the new codes to fixed-length vectors, I think we want
to hide it behind a common accessor that gives the value of element
number X, rather than operating directly on TREE_CODE, VECTOR_CST_ELT, etc.
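
As a rough illustration of that idea, here is a toy sketch of a single element accessor over three representations (explicit elements, a duplicated scalar, and a linear series). Everything in it, including struct toy_vec, the enum and the function names, is hypothetical and only models the shape of such an interface; it is not the GCC tree API.

```c
#include <stddef.h>

/* Hypothetical vector constant representations.  */
enum toy_vec_kind { TOY_FULL, TOY_DUPLICATE, TOY_SERIES };

struct toy_vec
{
  enum toy_vec_kind kind;
  const long *elts;  /* TOY_FULL: explicit elements */
  size_t nelts;      /* TOY_FULL: number of explicit elements */
  long base;         /* TOY_DUPLICATE: the value; TOY_SERIES: element 0 */
  long step;         /* TOY_SERIES: difference between adjacent elements */
};

/* Return element I of V regardless of how V is represented.  For
   TOY_FULL the caller must keep I below V->nelts.  */
long
toy_vec_elt (const struct toy_vec *v, size_t i)
{
  switch (v->kind)
    {
    case TOY_FULL:
      return v->elts[i];
    case TOY_DUPLICATE:
      return v->base;
    case TOY_SERIES:
      return v->base + (long) i * v->step;
    }
  return 0;
}

/* Element I of A + B, written once instead of once per pair of kinds.  */
long
toy_fold_plus_elt (const struct toy_vec *a, const struct toy_vec *b,
		   size_t i)
{
  return toy_vec_elt (a, i) + toy_vec_elt (b, i);
}
```

A fold written against toy_vec_elt does not need a separate case for each combination of representations, which is the point of hiding the tree code behind an accessor.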

Thanks,
Richard

>
> So ... I'd rather not have those if we can avoid it but I haven't
> fully thought out
> things as you can see from above.
>
> Richard.


[Bug ada/80590] [8 regression] non-bootstrap build failure of Ada runtime

2017-09-25 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80590

--- Comment #10 from Tom de Vries  ---
(In reply to Tom de Vries from comment #8)
> (In reply to Tom de Vries from comment #7)
> > Backtrace from gdb is more complete:
> 
> Backtrace looks similar to PR 80556 comment 3. Problem in that PR also is
> with g-exptty.adb.

That PR is now fixed. I've re-enabled ada in my test setup, and will see if I
can still reproduce this failure.

Re: GCC Buildbot

2017-09-25 Thread Paulo Matos


On 25/09/17 13:14, Jonathan Wakely wrote:
> On 25 September 2017 at 11:13, Paulo Matos wrote:
>>> Apart from that, I fully agree with octoploid that 
>>> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
>>> on GCC compile farm machines and uses a shell scripts to utilize. I would 
>>> prefer to integrate it to Buildbot and utilize same
>>> GCC Farm machines for native builds.
>>>
>>
>> Octoploid? Is that a typo?
> 
> No, it's Markus Trippelsdorf's username.
> 

Ah, thanks for the clarification.

-- 
Paulo Matos


Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-09-25 Thread Richard Biener
On Tue, Sep 12, 2017 at 8:00 AM, Pierre-Marie de Rodat
 wrote:
> Hello,
>
> This patch is an attempt to fix the crash reported in PR82155.
>
> When generating a C++ class method for a class that is itself nested in
> a class method, dwarf2out_early_global_decl currently leaves the
> existing context DIE as it is if it already exists.  However, it is
> possible that this call happens at a point where this context DIE is
> just a declaration that is itself not located in its own context.
>
> From there, if dwarf2out_early_global_decl is not called on any of the
> FUNCTION_DECL in the context chain, DIEs will be left badly scoped and
> some (such as the nested method) will be removed by the type pruning
> machinery.  As a consequence, dwarf2out_abstract_function will
> crash when called on the corresponding DECL because it asserts that the
> DECL has a DIE.
>
> This patch fixes this crash making dwarf2out_early_global_decl process
> context DIEs the same way we process abstract origins for FUNCTION_DECL:
> if the corresponding DIE exists but is only a declaration, call
> dwarf2out_decl anyway on it so that it is turned into a more complete
> DIE and so that it is relocated in the proper context.
>
> Bootstrapped and regtested on x86_64-linux.  The crash this addresses is
> present both on trunk and on the gcc-7 branch: I suggest we commit this
> patch on both branches.  Ok to commit? Thank you in advance!

Ok for trunk and gcc-7 branch after a while.

Thanks,
Richard.

> gcc/
>
> PR debug/82155
> * dwarf2out.c (dwarf2out_early_global_decl): Call dwarf2out_decl
> on the FUNCTION_DECL function context if it has a DIE that is a
> declaration.
>
> gcc/testsuite/
>
> * g++.dg/pr82155.C: New testcase.
> ---
>  gcc/dwarf2out.c| 10 --
>  gcc/testsuite/g++.dg/pr82155.C | 36 
>  2 files changed, 44 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/pr82155.C
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 00d6d951ba3..4cfc9c186af 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -25500,10 +25500,16 @@ dwarf2out_early_global_decl (tree decl)
>  so that all nested DIEs are generated at the proper scope in the
>  first shot.  */
>   tree context = decl_function_context (decl);
> - if (context != NULL && lookup_decl_die (context) == NULL)
> + if (context != NULL)
> {
> + dw_die_ref context_die = lookup_decl_die (context);
>   current_function_decl = context;
> - dwarf2out_decl (context);
> +
> + /* Avoid emitting DIEs multiple times, but still process CONTEXT
> +enough so that it lands in its own context.  This avoids type
> +pruning issues later on.  */
> + if (context_die == NULL || is_declaration_die (context_die))
> +   dwarf2out_decl (context);
> }
>
>   /* Emit an abstract origin of a function first.  This happens
> diff --git a/gcc/testsuite/g++.dg/pr82155.C b/gcc/testsuite/g++.dg/pr82155.C
> new file mode 100644
> index 000..75d9b615f39
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr82155.C
> @@ -0,0 +1,36 @@
> +/* { dg-do compile { target c++11 } } */
> +/* { dg-options "-g -O2" } */
> +
> +template  struct b { a c; };
> +template  struct e { d *operator->(); };
> +template  class h {
> +public:
> +  typedef e ag;
> +};
> +class i {
> +protected:
> +  i(int);
> +};
> +class j {
> +  virtual void k(int) = 0;
> +
> +public:
> +  int f;
> +  void l() { k(f); }
> +};
> +struct m : i {
> +  int cn;
> +  m() : i(cn) {
> +struct n : j {
> +  n() {}
> +  void k(int) {}
> +};
> +  }
> +};
> +struct o {
> +  o() {
> +for (h>::ag g;;)
> +  g->c.c->l();
> +  }
> +};
> +void fn1() { o(); }
> --
> 2.14.1
>


Re: Update interface to TARGET_VECTORIZE_VEC_PERM_CONST_OK

2017-09-25 Thread Richard Biener
On Fri, Sep 22, 2017 at 6:34 PM, Richard Sandiford
 wrote:
> This patch makes TARGET_VECTORIZE_VEC_PERM_CONST_OK take the permute
> vector in the form of a vec_perm_indices instead of an unsigned char *.
> It follows on from the recent patch that did the same in target-independent
> code.
>
> It was easy to make ARM and AArch64 use vec_perm_indices internally
> as well, and converting AArch64 helps with SVE.  I did try doing the same
> for the other ports, but the surgery needed was much more invasive and
> much less obviously correct.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> Also tested by comparing the testsuite assembly output on at least one
> target per CPU directory.  OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> 2017-09-22  Richard Sandiford  
>
> gcc/
> * target.def (vec_perm_const_ok): Change sel parameter to
> vec_perm_indices.
> * optabs-query.c (can_vec_perm_p): Update accordingly.
> * doc/tm.texi: Regenerate.
> * config/aarch64/aarch64.c (expand_vec_perm_d): Change perm to
> auto_vec_perm_indices and remove separate nelt field.
> (aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
> (aarch64_evpc_ext, aarch64_evpc_rev, aarch64_evpc_dup)
> (aarch64_evpc_tbl, aarch64_expand_vec_perm_const_1)
> (aarch64_expand_vec_perm_const): Update accordingly.
> (aarch64_vectorize_vec_perm_const_ok): Likewise.  Change sel
> to vec_perm_indices.
> * config/arm/arm.c (expand_vec_perm_d): Change perm to
> auto_vec_perm_indices and remove separate nelt field.
> (arm_evpc_neon_vuzp, arm_evpc_neon_vzip, arm_evpc_neon_vrev)
> (arm_evpc_neon_vtrn, arm_evpc_neon_vext, arm_evpc_neon_vtbl)
> (arm_expand_vec_perm_const_1, arm_expand_vec_perm_const): Update
> accordingly.
> (arm_vectorize_vec_perm_const_ok): Likewise.  Change sel
> to vec_perm_indices.
> * config/i386/i386.c (ix86_vectorize_vec_perm_const_ok): Change
> sel to vec_perm_indices.
> * config/ia64/ia64.c (ia64_vectorize_vec_perm_const_ok): Likewise.
> * config/mips/mips.c (mips_vectorize_vec_perm_const_ok): Likewise.
> * config/powerpcspe/powerpcspe.c (rs6000_vectorize_vec_perm_const_ok):
> Likewise.
> * config/rs6000/rs6000.c (rs6000_vectorize_vec_perm_const_ok):
> Likewise.
>
> Index: gcc/target.def
> ===
> --- gcc/target.def  2017-09-22 17:31:36.935337179 +0100
> +++ gcc/target.def  2017-09-22 17:31:56.428954480 +0100
> @@ -1847,7 +1847,7 @@ DEFHOOK
>  DEFHOOK
>  (vec_perm_const_ok,
>   "Return true if a vector created for @code{vec_perm_const} is valid.",
> - bool, (machine_mode, const unsigned char *sel),
> + bool, (machine_mode, vec_perm_indices),
>   NULL)
>
>  /* Return true if the target supports misaligned store/load of a
> Index: gcc/optabs-query.c
> ===
> --- gcc/optabs-query.c  2017-09-14 17:04:19.080694343 +0100
> +++ gcc/optabs-query.c  2017-09-22 17:31:56.428006577 +0100
> @@ -367,7 +367,7 @@ can_vec_perm_p (machine_mode mode, bool
>if (direct_optab_handler (vec_perm_const_optab, mode) != 
> CODE_FOR_nothing
>   && (sel == NULL
>   || targetm.vectorize.vec_perm_const_ok == NULL
> - || targetm.vectorize.vec_perm_const_ok (mode, &(*sel)[0])))
> + || targetm.vectorize.vec_perm_const_ok (mode, *sel)))
> return true;
>  }
>
> Index: gcc/doc/tm.texi
> ===
> --- gcc/doc/tm.texi 2017-09-22 17:31:36.933441374 +0100
> +++ gcc/doc/tm.texi 2017-09-22 17:31:56.428006577 +0100
> @@ -5774,7 +5774,7 @@ correct for most targets.
>  Return true if vector alignment is reachable (by peeling N iterations) for 
> the given scalar type @var{type}.  @var{is_packed} is false if the scalar 
> access using @var{type} is known to be naturally aligned.
>  @end deftypefn
>
> -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST_OK 
> (machine_mode, const unsigned char *@var{sel})
> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST_OK 
> (machine_mode, @var{vec_perm_indices})
>  Return true if a vector created for @code{vec_perm_const} is valid.
>  @end deftypefn
>
> Index: gcc/config/aarch64/aarch64.c
> ===
> --- gcc/config/aarch64/aarch64.c2017-09-21 11:53:16.681759682 +0100
> +++ gcc/config/aarch64/aarch64.c2017-09-22 17:31:56.412840135 +0100
> @@ -141,8 +141,8 @@ static void aarch64_elf_asm_constructor
>  static void aarch64_elf_asm_destructor (rtx, int) ATTRIBUTE_UNUSED;
>  static void aarch64_override_options_after_change (void);
>  static bool 

Re: Change permute index type to unsigned short

2017-09-25 Thread Richard Biener
On Fri, Sep 22, 2017 at 6:36 PM, Richard Sandiford
 wrote:
> This patch changes the element type of (auto_)vec_perm_indices from
> unsigned char to unsigned short.  This is needed for fixed-length
> 2048-bit SVE.  (SVE is variable-length by default, but it's possible
> to ask for specific vector lengths if you want to.)
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> Also tested by comparing the testsuite assembly output on at least one
> target per CPU directory.  OK to install?

Ok.

Richard.

> Richard
>
>
> 2017-09-22  Richard Sandiford  
>
> gcc/
> * target.h (vec_perm_indices): Use unsigned short rather than
> unsigned char.
> (auto_vec_perm_indices): Likewise.
> * config/aarch64/aarch64.c (aarch64_vectorize_vec_perm_const_ok):
> Use unsigned int rather than unsigned char.
> * config/arm/arm.c (arm_vectorize_vec_perm_const_ok): Likewise.
>
> Index: gcc/target.h
> ===
> --- gcc/target.h2017-09-14 17:04:19.080694343 +0100
> +++ gcc/target.h2017-09-22 17:35:22.486794044 +0100
> @@ -193,11 +193,11 @@ enum vect_cost_model_location {
>
>  /* The type to use for vector permutes with a constant permute vector.
> Each entry is an index into the concatenated input vectors.  */
> -typedef vec<unsigned char> vec_perm_indices;
> +typedef vec<unsigned short> vec_perm_indices;
>
>  /* Same, but can be used to construct local permute vectors that are
> automatically freed.  */
> -typedef auto_vec<unsigned char, 32> auto_vec_perm_indices;
> +typedef auto_vec<unsigned short, 32> auto_vec_perm_indices;
>
>  /* The target structure.  This holds all the backend hooks.  */
>  #define DEFHOOKPOD(NAME, DOC, TYPE, INIT) TYPE NAME;
> Index: gcc/config/aarch64/aarch64.c
> ===
> --- gcc/config/aarch64/aarch64.c2017-09-22 17:31:56.412840135 +0100
> +++ gcc/config/aarch64/aarch64.c2017-09-22 17:35:22.483794044 +0100
> @@ -13820,7 +13820,7 @@ aarch64_vectorize_vec_perm_const_ok (mac
>nelt = sel.length ();
>for (i = which = 0; i < nelt; ++i)
>  {
> -  unsigned char e = d.perm[i];
> +  unsigned int e = d.perm[i];
>gcc_assert (e < 2 * nelt);
>which |= (e < nelt ? 1 : 2);
>  }
> Index: gcc/config/arm/arm.c
> ===
> --- gcc/config/arm/arm.c2017-09-22 17:31:56.414735941 +0100
> +++ gcc/config/arm/arm.c2017-09-22 17:35:22.486794044 +0100
> @@ -29261,7 +29261,7 @@ arm_vectorize_vec_perm_const_ok (machine
>nelt = GET_MODE_NUNITS (d.vmode);
>for (i = which = 0; i < nelt; ++i)
>  {
> -  unsigned char e = d.perm[i];
> +  unsigned int e = d.perm[i];
>gcc_assert (e < 2 * nelt);
>which |= (e < nelt ? 1 : 2);
>  }
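
The reason unsigned char is too small for 2048-bit vectors can be checked with a little arithmetic. The helpers below are a sketch written for this note (their names are made up); they assume the selector indexes into two concatenated input vectors, as the comment in target.h says, and that the smallest element is 8 bits wide.

```c
#include <limits.h>

/* Largest index a two-input permute selector can contain when each
   input vector has VECTOR_BITS bits split into ELEMENT_BITS-wide
   elements.  */
unsigned int
max_perm_index (unsigned int vector_bits, unsigned int element_bits)
{
  unsigned int nelt = vector_bits / element_bits;
  return 2 * nelt - 1;
}

/* Nonzero if INDEX can be stored in an unsigned char.  */
int
fits_in_uchar (unsigned int index)
{
  return index <= UCHAR_MAX;
}

/* Nonzero if INDEX can be stored in an unsigned short.  */
int
fits_in_ushort (unsigned int index)
{
  return index <= USHRT_MAX;
}
```

For 2048-bit vectors of bytes there are 256 elements per input, so indices run up to 511, which no longer fits in an 8-bit unsigned char but fits comfortably in an unsigned short. The same reasoning applies to the local variable e in the two target hooks, which would otherwise truncate the index before the gcc_assert.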


[Bug tree-optimization/82285] [6/7 Regression] Optimizing error when using enumeration

2017-09-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82285

Richard Biener  changed:

   What|Removed |Added

  Known to work||8.0
Summary|[6/7/8 Regression]  |[6/7 Regression] Optimizing
   |Optimizing error when using |error when using
   |enumeration |enumeration

--- Comment #5 from Richard Biener  ---
Fixed on trunk so far.

[Bug tree-optimization/82285] [6/7/8 Regression] Optimizing error when using enumeration

2017-09-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82285

--- Comment #4 from Richard Biener  ---
Author: rguenth
Date: Mon Sep 25 11:40:23 2017
New Revision: 253146

URL: https://gcc.gnu.org/viewcvs?rev=253146&root=gcc&view=rev
Log:
2017-09-25  Richard Biener  

PR tree-optimization/82285
* tree-vect-patterns.c (vect_recog_bool_pattern): Also handle
enumeral types.

* gcc.dg/torture/pr82285.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/torture/pr82285.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-patterns.c

Re: Add VEC_DUPLICATE_{CST,EXPR} and associated optab

2017-09-25 Thread Richard Biener
On Mon, Sep 25, 2017 at 1:08 PM, Richard Sandiford
 wrote:
> SVE needs a way of broadcasting a scalar to a variable-length vector.
> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
> equivalent of the existing rtl code VEC_DUPLICATE.
>
> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
> to mark constant nodes, but in response to last year's RFC, Richard B.
> suggested it would be better to have separate codes for the constant
> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
> as a normal unary operation and avoids the previous need for treating
> it as a GIMPLE_SINGLE_RHS.
>
> It might make sense to use VEC_DUPLICATE_CST for all duplicated
> vector constants, since it's a bit more compact than VECTOR_CST
> in that case, and is potentially more efficient to process.  I don't
> have any specific plans to do that though.  We'll need to keep both
> types of constant around whatever happens.

I think VEC_DUPLICATE_EXPR is a good thing to have.  Looking at the changelog
you didn't patch build_vector_from_val to make use of either new tree
code?  That
would get you (quite) some testing coverage.

Currently we require all elements of a VECTOR_CST to be present -- how difficult
would it be to declare that if and only if the first element is
present then all following
elements are the same as the last one?  That said, I'm looking for a loop-hole
to not add the extra VEC_DUPLICATE_CST code ... eventually we can simply
allow scalars in contexts where vectors are valid?  Like we do for shifts?  OTOH
that'd be "implicitly typed constants" (depends on context) like CONST_INT, so
probably not the way to go?

The ugly thing about the new codes is that we go from 3 cases when folding
vector CONSTRUCTOR and VECTOR_CST we now have 24 to cover...
(if I didn't miscount).  This now really asks for some common iterator over
elements of a vector (and VEC_DUPLICATE_{EXPR,CST} would just return the first
elt all the time).  Note that using scalars instead of vectors reduces the
combinatorial explosion a bit (scalar and scalar const can be handled
the same).

So ... I'd rather not have those if we can avoid it but I haven't
fully thought out
things as you can see from above.

Richard.

> The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> OK to install?
>
> Richard
>
>
> 2017-09-25  Richard Sandiford  
> Alan Hayward  
> David Sherwood  
>
> gcc/
> * doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
> (VEC_COND_EXPR): Add missing @tindex.
> * doc/md.texi (vec_duplicate@var{m}): Document.
> * tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
> * tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
> are used for VEC_DUPLICATE_CST as well.
> (tree_vector): Access base.n.nelts directly.
> * tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
> valid codes.
> (VEC_DUPLICATE_CST_ELT): New macro.
> (build_vec_duplicate_cst): Declare.
> * tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
> (integer_zerop, integer_onep, integer_all_onesp, integer_truep)
> (real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
> (walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
> (build_vec_duplicate_cst): New function.
> (uniform_vector_p): Handle the new codes.
> (test_vec_duplicate_predicates_int): New function.
> (test_vec_duplicate_predicates_float): Likewise.
> (test_vec_duplicate_predicates): Likewise.
> (tree_c_tests): Call test_vec_duplicate_predicates.
> * cfgexpand.c (expand_debug_expr): Handle the new codes.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
> * gimple-expr.h (is_gimple_constant): Likewise.
> * gimplify.c (gimplify_expr): Likewise.
> * graphite-isl-ast-to-gimple.c
> (translate_isl_ast_to_gimple::is_constant): Likewise.
> * graphite-scop-detection.c (scan_tree_for_params): Likewise.
> * ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
> (func_checker::compare_operand): Likewise.
> * ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
> * match.pd (negate_expr_p): Likewise.
> * print-tree.c (print_node): Likewise.
> * tree-chkp.c (chkp_find_bounds_1): Likewise.
> * tree-data-ref.c (data_ref_compare_tree): Likewise.
> * 

Re: GCC Buildbot

2017-09-25 Thread Martin Liška
On 09/25/2017 12:13 PM, Paulo Matos wrote:
> 
> 
> On 25/09/17 11:52, Martin Liška wrote:
>> Hi Paulo.
>>
>> Thank you for working on that! To be honest, I've been running local 
>> buildbot on
>> my desktop machine which does builds your buildbot instance can do (please 
>> see:
>> https://pasteboard.co/GLZ0vLMu.png):
>>
> 
> Hi Martin,
> 
> Thanks for sharing your builders. Looks like you've got a good setup going.
> 
> I have done the very basic only since it was my interest to understand
> if people would find it useful. I didn't want to waste my time building
> something people have no interest to use.

Sure, nice kick off.

> 
> It seems there is some interest so I am gathering some requirements in
> the GitHub issues of the project. One very important feature is
> visualization of results, so I am integrating support for data gathering
> in influxdb to display using grafana. I do not work full time on this,
> so it's going slowly but I should have a dashboard to show in the next
> couple of weeks.

Would be great, what exactly do you want to visualize? For me, even having 
green/red spots
works fine in order to quickly identify what builds are wrong.

> 
>> - doing time to time (once a week) sanitizer builds: ASAN, UBSAN and run 
>> test-suite
>> - doing profiled bootstrap, LTO bootstrap (yes, it has been broken for quite 
>> some time) and LTO profiled bootstrap
>> - building project with --enable-gather-detailed-mem-stats
>> - doing coverage --enable-coverage, running test-suite and uploading to a 
>> location: https://gcc.opensuse.org/gcc-lcov/
>> - similar for Doxygen: https://gcc.opensuse.org/gcc-doxygen/
>> - periodic building of some projects: Inkscape, GIMP, linux-kernel, Firefox 
>> - I do it with -O2, -O2+LTO, -O3, ...
>>   Would be definitely fine, but it takes some care to maintain compatible 
>> versions of a project and GCC compiler.
>>   Plus handling of dependencies of external libraries can be irritating.
>> - cross build for primary architectures
>>
>> That's list of what I have and can be inspiration for you. I can help if you 
>> want and we can find a reasonable resources
>> where this can be run.
>>
> 
> Thanks. That's great. As you can see from #9 in
> https://github.com/LinkiTools/gcc-buildbot/issues/9, most of the things
> I hope to be able to run in the CompileFarm unless, of course,
> people host a worker on their own hardware. Regarding your offer for
> resources. Are you offering to merge your config or hardware? Either
> would be great, however I expect your config to have to be ported to
> buildbot nine before merging.

Hopefully both. I'm attaching my config file (probably more for inspiration 
than for real use).
I'll ask my manager whether we can find a machine that can run more complex 
tests. I'll inform you.

> 
>> Apart from that, I fully agree with octoploid that 
>> http://toolchain.lug-owl.de/buildbot/ is duplicated effort which is running
>> on GCC compile farm machines and uses a shell scripts to utilize. I would 
>> prefer to integrate it to Buildbot and utilize same
>> GCC Farm machines for native builds.
>>
> 
> Octoploid? Is that a typo?
> I discussed that in the Cauldron with David and was surprised to know that
> the buildbot you reference is actually not a buildbot implementation
> using the Python framework but a handwritten software. So, in that
> respect is not duplicated effort. It is duplicated effort if on the
> other hand, we try to test the same things. I will try to understand how
> to merge efforts to that buildbot.

Yes, duplication in the sense that it does (or will do) the same things. I'm 
adding the author of the tool;
hopefully we can unify the effort (and resources, of course).

Martin

> 
>> Another inspiration (for builds) can come from what LLVM folks do:
>> http://lab.llvm.org:8011/builders
>>
> 
> Thanks for the pointer. I at one point tried to read their
> configuration. However, found the one by gdb simpler and used it as a
> basis for what I have. I will look at their builders nonetheless to
> understand what they build and how long they take.
> 
>> Anyway, it's good starting point what you did and I'm looking forward to 
>> more common use of the tool.
>> Martin
>>
> 
> Thanks,
> 

# -*- python -*-
# ex: set syntax=python:

# This is a sample buildmaster config file. It must be installed as
# 'master.cfg' in your buildmaster's base directory.

# This is the dictionary that the buildmaster pays attention to. We also use
# a shorter alias to save typing.
c = BuildmasterConfig = {}

from base64 import *
import re
import os

### BUILDSLAVES

# The 'slaves' list defines the set of recognized buildslaves. Each element is
# a BuildSlave object, specifying a unique slave name and password.  The same
# slave name and password must be configured on the slave.
from buildbot.buildslave import BuildSlave
c['slaves'] = []

c['mergeRequests'] = False

# 'protocols' contains information about protocols which master will use for
# communicating with 
