date:20160718

Re: [PATCH], PR 71493, Fix PowerPC ABI breakage on GCC trunk/6.1

2016-07-18 Thread Michael Meissner

On Mon, Jul 18, 2016 at 06:42:02PM -0500, Segher Boessenkool wrote:
> On Mon, Jul 18, 2016 at 07:25:09PM -0400, Michael Meissner wrote:
> > When I added the support for __float128 last year, I accidentally broke
> > returning structures containing a single float or double item using the System
> > V 32-bit calling sequence.
> > 
> > This patch goes back to using SCALAR_FLOAT_TYPE_P (which looks at the tree
> > node) instead of SCALAR_FLOAT_MODE_NOT_VECTOR_P (which only looks at the
> > mode).
> > 
> > I have tested this patch on the trunk on a big endian power7 system, and there
> > were no regressions.  The same patch applies to the GCC-6 branch, and I am
> > testing it now.  Assuming there are no regressions on the GCC-6 branch, can I
> > check this patch into both the trunk and gcc-6-branch?
> 
> Did you test with -m32, too?
> 
> Ah the testcases (thanks) have it explicitly.  Well.  Does this work?
> 
> > +/* { dg-do compile { target { powerpc*-*-linux* && ilp32 } } } */
> > +/* { dg-options "-O2 -m32 -msvr4-struct-return" } */

Yes, both test out ok.

> Are dg-options set before the target test or after?  If before, the ilp32
> is superfluous; if after, the -m32 is.  Or is there more to it?

Not really, using ilp32 and explicit -m32 means -m32 is passed twice.  I will
remove the explicit -m32.

> I think you can drop the ilp32.

You cannot use -m32 on a 64-bit little endian system, so the && ilp32 test
guarantees it is only run on a system that supports 32-bit (a pure 32-bit
system, or a big endian 64-bit system that still has the 32-bit libraries
installed).

I also imagine somebody could build a 64-bit big endian compiler that was
configured with --disable-multilib, and you would not be able to do -m32.

> Please sort that out, make sure the testcases are actually run, and then
> this is okay for trunk as well as 6.

As I said, I dropped the explicit -m32 in dg-options.
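
For reference, a sketch of how pr71493-2.c might look after that change. It assumes the only edit is removing -m32 from dg-options; the target selector and the scan-assembler lines are copied unchanged from the posted patch, and the committed file may differ.

```cpp
/* { dg-do compile { target { powerpc*-*-linux* && ilp32 } } } */
/* { dg-options "-O2 -msvr4-struct-return" } */

/* The ilp32 selector keeps the test from running where 32-bit code
   cannot be generated, so -m32 no longer needs to be spelled out.  */

struct S2 { double d; };

struct S2 foo2 (void)
{
  struct S2 s = { 1.0 };
  return s;
}

/* { dg-final { scan-assembler "lwz" } } */
/* { dg-final { scan-assembler-not "lfd" } } */
```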

> Thanks for taking care of this!

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-18 Thread Martin Sebor


How does this look?


I think it's 99% there.  You've addressed all of my comments so
far -- thanks for that and for being so patient.  I realize it
would be a lot more efficient to get all the feedback (or as much
of it as possible) up front.  Unfortunately, some things don't get
noticed until round 2 or 3 (or even 4).  Please take this in lieu
of an apology for not spotting the issues below until now(*).

For this code:

  void f (void*);

  void g (int n)
  {
    int a [n];
    f (a);
  }

-Wvla-larger-than=32 prints:

  warning: argument to variable-length array may be too large
  note: limit is 32 bytes, but argument may be 18446744073709551612

An int argument cannot be that large.  I suspect the printed value
is actually the size of the VLA in bytes when N is -1, truncated
to size_t, rather than the value of the VLA bound.  To avoid
confusion the note should be corrected to say something like:

  note: limit is 32 bytes, but the variable-length array may be
  as large as 18446744073709551612
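
The arithmetic behind that suspicion can be sketched as follows. This assumes a 4-byte int and a 64-bit size_t, and the helper name is made up for the illustration; it is not code from the checker.

```cpp
#include <cassert>
#include <cstdint>

// Byte size of a VLA with `bound` elements of `elem_size` bytes each,
// computed in a 64-bit unsigned type so that a negative bound wraps the
// way it would in a 64-bit size_t.  Hypothetical helper for this sketch.
std::uint64_t toy_vla_bytes(std::int64_t bound, std::uint64_t elem_size) {
  return static_cast<std::uint64_t>(bound) * elem_size;
}
```

With a bound of -1 and 4-byte elements the result is 2^64 - 4, which is the number in the note, while a bound of 8 gives exactly the 32-byte limit.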

Also, the checker prints false positives for code like:

  void f (void*);

  void g (unsigned x, int *y)
  {
    if (1000 < x) return;

    while (*y) {
      char a [x];
      f (a);
    }
  }

With -Wvla-larger-than=1000 and greater it prints:

  warning: unbounded use of variable-length array

(Same thing with alloca).  There should be no warning for VLAs,
and for alloca, the warning should say "use of variable-length
array within a loop."  The VRP dump suggests the range information
is available within the loop.  Is the get_range_info() function
not returning the corresponding bounds?

Martin

[*] If you want to get me back I invite you (with a bit of
selfishness ;-) to review my -Wformat-length patch.

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Alan Modra

On Mon, Jul 18, 2016 at 08:39:34PM -0400, Patrick Palka wrote:
> One thing that was not clear to me is whether the object file paths
> stored in a thin archive are relative or absolute paths.  If they are
> absolute paths then that would be a problem due to how the build system
> moves build directories in between stages (gcc/ -> prev-gcc/ etc).  But
> it looks like the object file paths are relative to the location of the
> archive which is compatible.

It's simple.  The paths stored in the archive are the paths supplied
on the ar command line (*).  Supply relative, you'll get relative in
the archive and files will be opened relative to the archive
directory.

*) Well, not quite.  Relative paths are adjusted to be relative to the
archive directory.
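
A minimal sketch of that adjustment, assuming it is purely lexical. The function name is hypothetical and this is not ar's code; it only models the rule stated above using std::filesystem (C++17).

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Rewrites a member path given relative to the current directory so that
// it is relative to the directory holding the archive.  Nothing is looked
// up on disk.  Hypothetical helper modelling the described behaviour.
std::string toy_stored_member_path(const fs::path &archive,
                                   const fs::path &member) {
  return member.lexically_relative(archive.parent_path()).generic_string();
}
```

Under this model an archive gcc/libbackend.a built from gcc/tree.o stores just tree.o, so renaming gcc/ to prev-gcc/ leaves the stored path valid.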

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Patrick Palka

On Mon, 18 Jul 2016, Segher Boessenkool wrote:

> On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
> > Or, if using GNU ar, you can even use -S, if that helps (after testing
> > for it in configure, of course).
> 
> I meant -T.  Some day I will learn how to type, promise!

According to the documentation of GNU ar,

  "gnu ar can optionally create a thin archive, which contains a symbol
  index and references to the original copies of the member files of the
  archive. This is useful for building libraries for use within a local
  build tree, where the relocatable objects are expected to remain
  available, and copying the contents of each object would only waste time
  and space."

Since the objects which libbackend.a is composed of remain available
throughout the build process I think it should be safe to make
libbackend.a a thin archive.

So here's a patch which builds libbackend.a as a thin archive if the
toolchain supports it.  The time it takes to rebuild a
--disable-bootstrap tree after touching a single source file is now 7.5s
instead of 35+s -- a much better speedup than when simply eliding the
call to ranlib since the archive is now 1-5MB in size instead of 450MB.

Instead of changing AR_FLAGS, only the invocation of ar on libbackend.a
is changed because that is by far the largest archive (by a factor of
20x) and it seems less risky this way.

One thing that was not clear to me is whether the object file paths
stored in a thin archive are relative or absolute paths.  If they are
absolute paths then that would be a problem due to how the build system
moves build directories in between stages (gcc/ -> prev-gcc/ etc).  But
it looks like the object file paths are relative to the location of the
archive which is compatible.

Bootstrapped on x86_64-pc-linux-gnu.  Thoughts?

-- >8 --

Subject: [PATCH] Build libbackend.a as a thin archive if possible

gcc/ChangeLog:

* configure.ac (thin_archive_support): New variable.  AC_SUBST it.
* configure: Regenerate.
* Makefile.in (THIN_ARCHIVE_SUPPORT): New variable.
(USE_THIN_ARCHIVES): New variable.
(libbackend.a): If USE_THIN_ARCHIVES then pass T to ar to build
this archive as a thin archive.
---
 gcc/Makefile.in  | 17 +
 gcc/configure| 20 ++--
 gcc/configure.ac | 13 +
 3 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0786fa3..15a879b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -275,6 +275,17 @@ else
 LLINKER = $(LINKER)
 endif
 
+THIN_ARCHIVE_SUPPORT = @thin_archive_support@
+
+USE_THIN_ARCHIVES = no
+ifeq ($(THIN_ARCHIVE_SUPPORT),yes)
+ifeq ($(AR_FLAGS),rc)
+ifeq ($(RANLIB_FLAGS),)
+USE_THIN_ARCHIVES = yes
+endif
+endif
+endif
+
 # ---
 # Programs which operate on the build machine
 # ---
@@ -1882,8 +1893,14 @@ compilations: $(BACKEND)
 # This archive is strictly for the host.
 libbackend.a: $(OBJS)
 	-rm -rf libbackend.a
+	@# Build libbackend.a as a thin archive if possible, as doing so
+	@# significantly reduces build times.
+ifeq ($(USE_THIN_ARCHIVES),yes)
+	$(AR) $(AR_FLAGS)T libbackend.a $(OBJS)
+else
 	$(AR) $(AR_FLAGS) libbackend.a $(OBJS)
 	-$(RANLIB) $(RANLIB_FLAGS) libbackend.a
+endif
 
 libcommon-target.a: $(OBJS-libcommon-target)
 	-rm -rf libcommon-target.a
diff --git a/gcc/configure b/gcc/configure
index ed44472..81c81b3 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -679,6 +679,7 @@ zlibinc
 zlibdir
 HOST_LIBS
 enable_default_ssp
+thin_archive_support
 libgcc_visibility
 gcc_cv_readelf
 gcc_cv_objdump
@@ -18475,7 +18476,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18478 "configure"
+#line 18479 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18581,7 +18582,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18584 "configure"
+#line 18585 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -27846,6 +27847,21 @@ $as_echo "#define HAVE_AS_LINE_ZERO 1" >>confdefs.h
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking support for thin archives" >&5
+$as_echo_n "checking support for thin archives... " >&6; }
+thin_archive_support=no
+echo 'int main (void) { return 0; }' > conftest.c
+if ($AR --version | sed 1q | grep "GNU ar" \
+&& $CC $CFLAGS -c conftest.c \
+&& $AR rcT conftest.a conftest.o \
+&& $CC -o conftest conftest.a) >/dev/null 2>&1; then
+  thin_archive_support=yes
+fi
+rm -f conftest.c conftest.o conftest.a conftest
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $thin_archive_support" >&5
+$as_echo "$thin_archive_support" >&6; }
+
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker PT_GNU_EH_FRAME support" >&5

Re: [PATCH], PR 71493, Fix PowerPC ABI breakage on GCC trunk/6.1

2016-07-18 Thread Segher Boessenkool

On Mon, Jul 18, 2016 at 07:25:09PM -0400, Michael Meissner wrote:
> When I added the support for __float128 last year, I accidentally broke
> returning structures containing a single float or double item using the System
> V 32-bit calling sequence.
> 
> This patch goes back to using SCALAR_FLOAT_TYPE_P (which looks at the tree
> node) instead of SCALAR_FLOAT_MODE_NOT_VECTOR_P (which only looks at the
> mode).
> 
> I have tested this patch on the trunk on a big endian power7 system, and there
> were no regressions.  The same patch applies to the GCC-6 branch, and I am
> testing it now.  Assuming there are no regressions on the GCC-6 branch, can I
> check this patch into both the trunk and gcc-6-branch?

Did you test with -m32, too?

Ah the testcases (thanks) have it explicitly.  Well.  Does this work?

> +/* { dg-do compile { target { powerpc*-*-linux* && ilp32 } } } */
> +/* { dg-options "-O2 -m32 -msvr4-struct-return" } */

Are dg-options set before the target test or after?  If before, the ilp32
is superfluous; if after, the -m32 is.  Or is there more to it?

I think you can drop the ilp32.

Please sort that out, make sure the testcases are actually run, and then
this is okay for trunk as well as 6.

Thanks for taking care of this!

Segher

[PATCH], PR 71493, Fix PowerPC ABI breakage on GCC trunk/6.1

2016-07-18 Thread Michael Meissner

When I added the support for __float128 last year, I accidentally broke
returning structures containing a single float or double item using the System
V 32-bit calling sequence.

This patch goes back to using SCALAR_FLOAT_TYPE_P (which looks at the tree
node) instead of SCALAR_FLOAT_MODE_NOT_VECTOR_P (which only looks at the
mode).

I have tested this patch on the trunk on a big endian power7 system, and there
were no regressions.  The same patch applies to the GCC-6 branch, and I am
testing it now.  Assuming there are no regressions on the GCC-6 branch, can I
check this patch into both the trunk and gcc-6-branch?
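
To illustrate why looking at the type rather than the mode matters, here is a toy model. None of these names are GCC's; the enums and predicates are hypothetical stand-ins, and the sketch assumes that a struct wrapping a single double is given the same mode as a plain double.

```cpp
#include <cassert>

// Toy stand-ins for tree codes and machine modes; all names are
// hypothetical and exist only for this illustration.
enum class ToyTypeCode { Real, Record };
enum class ToyMode { SF, DF, SI };

struct ToyType {
  ToyTypeCode code;  // what the front end declared
  ToyMode mode;      // how the value would be held in registers
};

// Type-based test: true only for a genuine floating-point scalar.
bool toy_scalar_float_type_p(const ToyType &t) {
  return t.code == ToyTypeCode::Real;
}

// Mode-based test: true for anything whose mode is a float mode.
bool toy_scalar_float_mode_p(const ToyType &t) {
  return t.mode == ToyMode::SF || t.mode == ToyMode::DF;
}

// A plain double, and a struct wrapping a single double (assumed here to
// share the double's mode).
const ToyType toy_double = {ToyTypeCode::Real, ToyMode::DF};
const ToyType toy_struct_of_double = {ToyTypeCode::Record, ToyMode::DF};

// True when the two predicates disagree about a type.
bool toy_predicates_disagree(const ToyType &t) {
  return toy_scalar_float_type_p(t) != toy_scalar_float_mode_p(t);
}
```

In this model the two tests agree on the plain double and disagree on the struct, which is the case the struct-return convention cares about.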

[gcc]
2016-07-18  Michael Meissner  

PR target/71493
* config/rs6000/rs6000.c (rs6000_function_value): Fix
unintentional System V.4 structure return breakage for structures
with a single floating point element.

[gcc/testsuite]
2016-07-18  Michael Meissner  

PR target/71493
* gcc.target/powerpc/pr71493-1.c: New test.
* gcc.target/powerpc/pr71493-2.c: Likewise.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 238438)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -35467,7 +35467,8 @@ rs6000_function_value (const_tree valtyp
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
 /* _Decimal128 must use an even/odd register pair.  */
 regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_MODE_NOT_VECTOR_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS
+  else if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS
+  && !FLOAT128_VECTOR_P (mode)
   && ((TARGET_SINGLE_FLOAT && (mode == SFmode)) || TARGET_DOUBLE_FLOAT))
 regno = FP_ARG_RETURN;
   else if (TREE_CODE (valtype) == COMPLEX_TYPE
Index: gcc/testsuite/gcc.target/powerpc/pr71493-2.c
===
--- gcc/testsuite/gcc.target/powerpc/pr71493-2.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr71493-2.c(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-linux* && ilp32 } } } */
+/* { dg-options "-O2 -m32 -msvr4-struct-return" } */
+
+struct S2 { double d; };
+
+struct S2 foo2 (void)
+{
+  struct S2 s = { 1.0 };
+  return s;
+}
+
+/* { dg-final { scan-assembler "lwz" } } */
+/* { dg-final { scan-assembler-not "lfd" } } */
Index: gcc/testsuite/gcc.target/powerpc/pr71493-1.c
===
--- gcc/testsuite/gcc.target/powerpc/pr71493-1.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr71493-1.c(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-linux* && ilp32 } } } */
+/* { dg-options "-O2 -m32 -msvr4-struct-return" } */
+
+struct S1 { float f; };
+
+struct S1 foo1 (void)
+{
+  struct S1 s = { 1.0f };
+  return s;
+}
+
+/* { dg-final { scan-assembler "lwz" } } */
+/* { dg-final { scan-assembler-not "lfs" } } */

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] c++/60760 - arithmetic on null pointers should not be allowed in constant expressions

2016-07-18 Thread Martin Sebor


On 07/18/2016 11:51 AM, Jason Merrill wrote:
> On 07/06/2016 06:20 PM, Martin Sebor wrote:
> > @@ -2911,6 +2923,14 @@ cxx_eval_indirect_ref (const constexpr_ctx *ctx, tree t,
> >    if (*non_constant_p)
> >      return t;
> >
> > +  if (integer_zerop (op0))
> > +    {
> > +      if (!ctx->quiet)
> > +        error ("dereferencing a null pointer");
> > +      *non_constant_p = true;
> > +      return t;
> > +    }
>
> I'm skeptical of checking this here, since *p is valid for null p; &*p
> is even a constant expression.  And removing this hunk doesn't seem to
> break any of your tests.
>
> OK with that hunk removed.

With it removed the constexpr-nullptr-2.C test fails on line 64:

  constexpr const int *pi0 = &pa2->pa1->pa0->i;   // { dg-error "null pointer|not a constant" }


Here, pa2 and pa1 are non-null but pa0 is null.

Martin

Re: [PATCH] Giant concepts patch

2016-07-18 Thread Jason Merrill

On Sun, Jul 10, 2016 at 11:20 AM, Andrew Sutton wrote:
> I just tried building a fresh pull of cmcstl2, and I'm not seeing any
> errors as a result of not handling those missing codes in
> tsubst_constraint. At one point, I think it was not possible to get
> those other constraints in this context because they were nested in a
> parm_constr. But that seems obviously untrue now. But still... that
> gcc_unreachable isn't being triggered by any code in cmcstl.

The only one that was triggered by cmcstl was EXPR_CONSTR, and then
only for a member; if you comment out the EXPR_CONSTR case that I
added to tsubst_constraint, this test will ICE.

struct B
{
  template <class T> void f(T t)
    requires requires (T tt) { tt; }
  { }
};

int main()
{
  B().f(42);
}

Jason

[C++ PATCH] cp_parser_save_member_function_body fix (PR c++/71909)

2016-07-18 Thread Jakub Jelinek

Hi!

This patch fixes two issues:
1) as shown in the first testcase, cp_parser_save_member_function_body
   adds the catch () { ... } tokens into the saved token range
   even when there is no function try block (missing try keyword)
2) if the method starts with __transaction_{atomic,relaxed}, and
   e.g. contains {}s somewhere in the mem-initializers, then
   cp_parser_save_member_function_body stops saving the tokens early
   instead of late

The following patch attempts to handle the same cases
cp_parser_function_definition_after_declarator handles (ok, ignores
the already unsupported return extension) - note that
cp_parser_txn_attribute_opt handles only a small subset of C++11 attributes
(and only once, not multiple times).
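
As a reminder of the syntax involved, here is a small stand-alone sketch of a constructor with a function try block and a braced mem-initializer. All names are invented for the illustration; it shows the construct whose tokens must be saved, not the parser change itself.

```cpp
#include <cassert>
#include <stdexcept>

int toy_handled = 0;  // counts how often the constructor's handler ran

struct ToyMember {
  explicit ToyMember(int v) {
    if (v < 0)
      throw std::invalid_argument("negative");
  }
};

struct ToyHolder {
  // Function try block: `try` precedes the mem-initializer list and the
  // handler follows the constructor body.  Without the `try` keyword a
  // trailing `catch` is not part of the function definition.
  explicit ToyHolder(int v) try : m{v} {
  } catch (const std::invalid_argument &) {
    ++toy_handled;  // the exception is rethrown when the handler ends
  }
  ToyMember m;
};

// Returns true if constructing a ToyHolder from `v` throws.
bool toy_construction_throws(int v) {
  try {
    ToyHolder h(v);
    return false;
  } catch (const std::invalid_argument &) {
    return true;
  }
}
```

The handler runs once for a failing mem-initializer and the exception still reaches the caller, since a constructor's function try block rethrows implicitly.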

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-07-18  Jakub Jelinek  

PR c++/71909
* parser.c (cp_parser_save_member_function_body): Consume
__transaction_relaxed or __transaction_atomic with optional
attribute.  Only skip catch with block if try keyword is seen.

* g++.dg/parse/pr71909.C: New test.
* g++.dg/tm/pr71909.C: New test.

--- gcc/cp/parser.c.jj  2016-07-16 10:41:04.0 +0200
+++ gcc/cp/parser.c 2016-07-18 11:47:49.487748010 +0200
@@ -26044,6 +26044,7 @@ cp_parser_save_member_function_body (cp_
   cp_token *first;
   cp_token *last;
   tree fn;
+  bool function_try_block = false;
 
   /* Create the FUNCTION_DECL.  */
   fn = grokmethod (decl_specifiers, declarator, attributes);
@@ -26065,9 +26066,43 @@ cp_parser_save_member_function_body (cp_
   /* Save away the tokens that make up the body of the
  function.  */
   first = parser->lexer->next_token;
+
+  if (cp_lexer_next_token_is_keyword (parser->lexer, RID_TRANSACTION_RELAXED))
+cp_lexer_consume_token (parser->lexer);
+  else if (cp_lexer_next_token_is_keyword (parser->lexer,
+  RID_TRANSACTION_ATOMIC))
+{
+  cp_lexer_consume_token (parser->lexer);
+  /* Match cp_parser_txn_attribute_opt [[ identifier ]].  */
+  if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_SQUARE)
+ && cp_lexer_nth_token_is (parser->lexer, 2, CPP_OPEN_SQUARE)
+ && (cp_lexer_nth_token_is (parser->lexer, 3, CPP_NAME)
+ || cp_lexer_nth_token_is (parser->lexer, 3, CPP_KEYWORD))
+ && cp_lexer_nth_token_is (parser->lexer, 4, CPP_CLOSE_SQUARE)
+ && cp_lexer_nth_token_is (parser->lexer, 5, CPP_CLOSE_SQUARE))
+   {
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+ cp_lexer_consume_token (parser->lexer);
+   }
+  else
+   while (cp_next_tokens_can_be_gnu_attribute_p (parser)
+  && cp_lexer_nth_token_is (parser->lexer, 2, CPP_OPEN_PAREN))
+ {
+   cp_lexer_consume_token (parser->lexer);
+   if (cp_parser_cache_group (parser, CPP_CLOSE_PAREN, /*depth=*/0))
+ break;
+ }
+}
+
   /* Handle function try blocks.  */
   if (cp_lexer_next_token_is_keyword (parser->lexer, RID_TRY))
-cp_lexer_consume_token (parser->lexer);
+{
+  cp_lexer_consume_token (parser->lexer);
+  function_try_block = true;
+}
   /* We can have braced-init-list mem-initializers before the fn body.  */
   if (cp_lexer_next_token_is (parser->lexer, CPP_COLON))
 {
@@ -26085,8 +26120,9 @@ cp_parser_save_member_function_body (cp_
 }
   cp_parser_cache_group (parser, CPP_CLOSE_BRACE, /*depth=*/0);
   /* Handle function try blocks.  */
-  while (cp_lexer_next_token_is_keyword (parser->lexer, RID_CATCH))
-cp_parser_cache_group (parser, CPP_CLOSE_BRACE, /*depth=*/0);
+  if (function_try_block)
+while (cp_lexer_next_token_is_keyword (parser->lexer, RID_CATCH))
+  cp_parser_cache_group (parser, CPP_CLOSE_BRACE, /*depth=*/0);
   last = parser->lexer->next_token;
 
   /* Save away the inline definition; we will process it when the
--- gcc/testsuite/g++.dg/parse/pr71909.C.jj 2016-07-18 11:55:51.169600236 +0200
+++ gcc/testsuite/g++.dg/parse/pr71909.C 2016-07-18 11:57:09.99364 +0200
@@ -0,0 +1,22 @@
+// PR c++/71909
+// { dg-do compile }
+
+struct S
+{
+  S () try : m (0) {}
+  catch (...) {}
+  void foo () try {}
+  catch (int) {}
+  catch (...) {}
+  int m;
+};
+
+struct T
+{
+  T () : m (0) {}
+  catch (...) {}   // { dg-error "expected unqualified-id before" }
+  void foo () {}
+  catch (int) {}   // { dg-error "expected unqualified-id before" }
+  catch (...) {}   // { dg-error "expected unqualified-id before" }
+  int m;
+};
--- gcc/testsuite/g++.dg/tm/pr71909.C.jj 2016-07-18 12:01:59.92245 +0200
+++ gcc/testsuite/g++.dg/tm/pr71909.C   2016-07-18 12:01:14.0 +0200
@@ -0,0 +1,18 @@
+// PR c++/71909
+// { dg-do compile { target c++11 } }
+// { dg-options "-fgnu-tm" }
+

Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-18 Thread Jakub Jelinek

On Mon, Jul 18, 2016 at 02:42:43PM -0400, Jason Merrill wrote:
> Ah, I guess we need to check cxx_dialect in cxx_eval_store_expression,
> not just in potential_constant_expression.

Here is an updated version, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2016-07-18  Jakub Jelinek  

PR c++/50060
* constexpr.c (cxx_eval_builtin_function_call): Pass false as lval
when evaluating call arguments.  Use fold_builtin_call_array instead
of fold_build_call_array_loc, return t if it returns NULL.  Otherwise
check the result with potential_constant_expression and call
cxx_eval_constant_expression on it.

* g++.dg/cpp0x/constexpr-50060.C: New test.
* g++.dg/cpp1y/constexpr-50060.C: New test.
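
For orientation, the value the new tests compare against can be checked at run time with the standard library's frexp. The wrapper below is a hypothetical helper for this sketch and makes no claim about how the constexpr evaluator computes it.

```cpp
#include <cassert>
#include <cmath>

// Result of splitting a double into a fraction in [0.5, 1) and a
// power-of-two exponent.  Struct and function names are assumptions.
struct ToyFrexp {
  double fraction;
  int exponent;
};

// Thin wrapper over std::frexp that returns both outputs by value.
ToyFrexp toy_frexp(double value) {
  ToyFrexp r{};
  r.fraction = std::frexp(value, &r.exponent);
  return r;
}
```

Since 6.5 is 0.8125 times 2 to the power 3, the fraction is exactly 0.8125, the constant used in the static_assert lines of the tests.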

--- gcc/cp/constexpr.c.jj   2016-07-18 20:42:51.163955883 +0200
+++ gcc/cp/constexpr.c  2016-07-18 20:55:47.246152938 +0200
@@ -1105,7 +1105,7 @@ cxx_eval_builtin_function_call (const co
   for (i = 0; i < nargs; ++i)
 {
   args[i] = cxx_eval_constant_expression (_ctx, CALL_EXPR_ARG (t, i),
- lval, , );
+ false, , );
   if (bi_const_p)
/* For __built_in_constant_p, fold all expressions with constant values
   even if they aren't C++ constant-expressions.  */
@@ -1114,13 +1114,31 @@ cxx_eval_builtin_function_call (const co
 
   bool save_ffbcp = force_folding_builtin_constant_p;
   force_folding_builtin_constant_p = true;
-  new_call = fold_build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
-   CALL_EXPR_FN (t), nargs, args);
-  /* Fold away the NOP_EXPR from fold_builtin_n.  */
-  new_call = fold (new_call);
+  new_call = fold_builtin_call_array (EXPR_LOCATION (t), TREE_TYPE (t),
+ CALL_EXPR_FN (t), nargs, args);
   force_folding_builtin_constant_p = save_ffbcp;
-  VERIFY_CONSTANT (new_call);
-  return new_call;
+  if (new_call == NULL)
+{
+  if (!*non_constant_p && !ctx->quiet)
+   {
+ new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
+  CALL_EXPR_FN (t), nargs, args);
+ error ("%q+E is not a constant expression", new_call);
+   }
+  *non_constant_p = true;
+  return t;
+}
+
+  if (!potential_constant_expression (new_call))
+{
+  if (!*non_constant_p && !ctx->quiet)
+   error ("%q+E is not a constant expression", new_call);
+  *non_constant_p = true;
+  return t;
+}
+
+  return cxx_eval_constant_expression (_ctx, new_call, lval,
+  non_constant_p, overflow_p);
 }
 
 /* TEMP is the constant value of a temporary object of type TYPE.  Adjust
--- gcc/testsuite/g++.dg/cpp0x/constexpr-50060.C.jj 2016-07-18 21:03:12.505532831 +0200
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-50060.C 2016-07-18 21:05:41.306655422 +0200
@@ -0,0 +1,21 @@
+// PR c++/50060
+// { dg-do compile { target c++11 } }
+
+extern "C" double frexp (double, int *);
+
+struct S
+{
+  constexpr S (double a) : y {}, x (frexp (a, &y)) {}  // { dg-error "is not a constant expression" "S" { target { ! c++14 } } }
+  double x;
+  int y;
+};
+
+struct T
+{
+  constexpr T (double a) : y {}, x ((y = 1, 0.8125)) {}  // { dg-error "is not a constant-expression" "T" { target { ! c++14 } } }
+  double x;
+  int y;
+};
+
+static_assert (S (6.5).x == 0.8125, "");   // { dg-error "non-constant condition for static assertion|in constexpr expansion" "" { target { ! c++14 } } }
+static_assert (T (6.5).x == 0.8125, "");   // { dg-error "non-constant condition for static assertion|called in a constant expression" "" { target { ! c++14 } } }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C.jj 2016-07-18 20:46:00.992553765 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C 2016-07-18 20:46:00.992553765 +0200
@@ -0,0 +1,100 @@
+// PR c++/50060
+// { dg-do compile { target c++14 } }
+
+// sincos and lgamma_r aren't available in -std=c++14,
+// only in -std=gnu++14.  Use __builtin_* in that case.
+extern "C" void sincos (double, double *, double *);
+extern "C" double frexp (double, int *);
+extern "C" double modf (double, double *);
+extern "C" double remquo (double, double, int *);
+extern "C" double lgamma_r (double, int *);
+
+constexpr double
+f0 (double x)
+{
+  double y {};
+  double z {};
+  __builtin_sincos (x, &y, &z);
+  return y;
+}
+
+constexpr double
+f1 (double x)
+{
+  double y {};
+  double z {};
+  __builtin_sincos (x, &y, &z);
+  return z;
+}
+
+constexpr double
+f2 (double x)
+{
+  int y {};
+  return frexp (x, &y);
+}
+
+constexpr int
+f3 (double x)
+{
+  int y {};
+  frexp (x, &y);
+  return y;
+}
+
+constexpr double
+f4 (double x)
+{
+  double y {};
+  return modf (x, &y);
+}
+
+constexpr double
+f5 (double x)
+{
+  double y {};
+  modf (x, &y);
+  return y;
+}
+
+constexpr double
+f6

Re: [patch, Fortran] Fix some string temporaries

2016-07-18 Thread Mikael Morin


On 18/07/2016 at 22:20, Thomas Koenig wrote:
> On 18.07.2016 at 20:58, Mikael Morin wrote:
>
> > > Unfortunately not.  The original code (before I lifted out the
> > > functionality) sometimes had GFC_DEP_ERROR at the end of the
> > > function, which was then removed by
> > >
> > >   return fin_dep == GFC_DEP_OVERLAP;
> >
> > That is very strange, there is an assert just a few lines before, that
> > fin_dep != GFC_DEP_ERROR.
>
> This is not the only return statement.
>
> For example, look at
>
> @@ -2215,7 +2268,7 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
>   /* Exactly matching and forward overlapping ranges don't cause a
>      dependency.  */
>   if (fin_dep < GFC_DEP_BACKWARD)
> -   return 0;
> +   return fin_dep;
>
> A GFC_DEP_ERROR could 'escape' here.

Indeed, I missed that one.
Then handle the GFC_DEP_ERROR here, or initialize fin_dep with
GFC_DEP_NODEP at the beginning, as you prefer.

OK with either (and the unreachable assertions).

Having an invalid enum value equal to zero helps diagnose
uninitialized values, so I prefer keeping the GFC_DEP_ERROR value
separate from GFC_DEP_NODEP, GFC_DEP_NODEPFOUND, or any other case.
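
A small sketch of that design choice, in C++ rather than the front end's C. The enum below is a toy: its names and ordering are assumptions, not gfortran's definition. The mapping to "needs a temporary" follows the switch discussed earlier in this thread, except that the error value is rejected loudly.

```cpp
#include <cassert>
#include <stdexcept>

// Toy dependency kinds; names and ordering are assumptions of this
// sketch.  The invalid value is deliberately zero, so a value-initialized
// variable that was never assigned shows up as Error.
enum class ToyDep { Error = 0, NoDep, Forward, Backward, Overlap };

// Maps a dependency kind to "needs an array temporary".
bool toy_needs_temporary(ToyDep d) {
  switch (d) {
  case ToyDep::NoDep:
  case ToyDep::Forward:
    return false;
  case ToyDep::Backward:
  case ToyDep::Overlap:
    return true;
  case ToyDep::Error:
    break;
  }
  throw std::logic_error("dependency was never computed");
}
```

Keeping Error distinct from NoDep means a forgotten assignment is reported instead of being silently treated as "no dependency".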

Re: [patch, Fortran] Fix some string temporaries

2016-07-18 Thread Thomas Koenig


On 18.07.2016 at 20:58, Mikael Morin wrote:

> > Unfortunately not.  The original code (before I lifted out the
> > functionality) sometimes had GFC_DEP_ERROR at the end of the
> > function, which was then removed by
> >
> >   return fin_dep == GFC_DEP_OVERLAP;
>
> That is very strange, there is an assert just a few lines before, that
> fin_dep != GFC_DEP_ERROR.


This is not the only return statement.

For example, look at

@@ -2215,7 +2268,7 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
  /* Exactly matching and forward overlapping ranges don't cause a
 dependency.  */
  if (fin_dep < GFC_DEP_BACKWARD)
-   return 0;
+   return fin_dep;

A GFC_DEP_ERROR could 'escape' here.

I didn't change the logic (at least not intentionally), I just made
it more visible.

Regards

Thomas

Re: Debug algorithms

2016-07-18 Thread François Dumont


On 13/07/2016 19:45, Jonathan Wakely wrote:
> On 22/06/16 22:05 +0200, François Dumont wrote:
> > Hi
> >
> >    Here is, at last, the long-promised patch to introduce Debug
> > algos similarly to Debug containers.
>
> I'm trying to decide how much benefit this really gives us, and
> whether the obfuscation to the code (even more namespaces involved,
> and having to use __std_a:: instead of std::) is worth it. It also
> means more code to maintain of course, with extra overloads.
>
> >    Why such an evolution:
> > - More flexibility, debug algos can be used explicitly without
> > activating Debug mode.
>
> Although nice in theory, I doubt this will get much usage in practice.
>
> > - Performance: Debug algos can get rid of the Debug layer on top of
> > container iterators to invoke normal algos. Operations on normal
> > iterators are faster and we also benefit from the same algo
> > specializations that sometimes exist for some container iterators (like
> > the std::deque ones). Also, normal algos now use other normal algos, so
> > Debug checks won't be done several times.
> > - It will be easier to implement new Debug checks without the
> > limitation of doing so through some Debug macro.
> >
> >    To do so I introduced a new namespace __cxx1998_a used for normal
> > algos when Debug mode is active. I couldn't reuse __cxx1998 because
> > with the current implementation of Debug containers __cxx1998 is exposed
> > and because of ADL we could then have ambiguity between Debug and
> > normal versions of the same algos. I also introduced a __std_a
> > namespace which controls the kind of algos used within the library,
> > mostly for container implementation details.
>
> I think I need to apply the patch locally and spend some time looking
> at the new structure, to see what ends up calling what. I'm finding it
> difficult to follow that just from reading the patch.

This is definitely more code to maintain, but I hope this code won't
require much maintenance as it only hosts the Debug logic and not the
algo logic itself. It relies on the normal algos for the algo logic.

You can see this patch as a way to clean up the normal mode too!

If the __cxx1998 namespace were perfectly encapsulated we could avoid
__cxx1998_a, but for the moment the boundary between normal and debug
mode is too tight.
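
The layering described here can be sketched in miniature. Everything below is invented for the illustration (the toy_* namespaces, the TOY_DEBUG macro, the check counter); it is not the patch's code and does not use the library's real namespaces.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

int toy_checks_run = 0;  // counts precondition checks, for illustration

namespace toy_normal {
// Unchecked algorithm: the actual logic lives here.
template <class It, class T>
bool contains_sorted(It first, It last, const T &value) {
  return std::binary_search(first, last, value);
}
}  // namespace toy_normal

namespace toy_debug {
// Checked wrapper: validates the precondition once, then delegates to
// the unchecked version so nothing is checked a second time.
template <class It, class T>
bool contains_sorted(It first, It last, const T &value) {
  ++toy_checks_run;
  if (!std::is_sorted(first, last))
    throw std::invalid_argument("range is not sorted");
  return toy_normal::contains_sorted(first, last, value);
}
}  // namespace toy_debug

// Alias selecting which flavour internal code uses.  TOY_DEBUG is a
// made-up macro for this sketch.
#ifdef TOY_DEBUG
namespace toy_a = toy_debug;
#else
namespace toy_a = toy_normal;
#endif
```

The debug wrapper holds only the check and the normal namespace holds only the algorithm, which is the separation argued for above; code that wants the checks can call the debug flavour directly without switching the whole build.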


François

Re: [patch, Fortran] Fix some string temporaries

2016-07-18 Thread Mikael Morin


Le 17/07/2016 à 18:21, Thomas Koenig a écrit :

Hi Mikael,


Do we actually want to backport this? Technically, it is a regression,
but people are not likely to notice much.


It is not an ICE, neither a code correctness issue as far as I can see,
so I would rather not backport.


Fine with me.



+case GFC_DEP_FORWARD:
+  return 0;



This is doubtfull, but not worse than before I guess.


0 in this case means that you need no array temporary.  This is fine.


+case GFC_DEP_BACKWARD:
+  return 1;
+
+case GFC_DEP_OVERLAP:
+  return 1;
+
+case GFC_DEP_NODEP:
+  return 0;
+
+case GFC_DEP_ERROR:
+  return 0;

Can we put a gcc_unreachable here instead?


Unfortunately not.  The original code (before I lifted out the
functionality) sometimes had GFC_DEP_ERROR at the end of the
function, which was then removed by

  return fin_dep == GFC_DEP_OVERLAP;

That is very strange, there is an assert just a few lines before, that 
fin_dep != GFC_DEP_ERROR.
The only case I can see where GFC_DEP_ERROR could be returned after your 
change is the REF_SUBSTRING case, but then it wouldn't work either 
without substring...
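
For readers following the hunk quoted above, the mapping from dependency kind to "needs a temporary" can be restated as a small standalone sketch. This is only an illustration: the enum and function names below are hypothetical stand-ins (the patch itself uses the GFC_DEP_* values), and the return values are copied from the quoted switch, including the disputed GFC_DEP_ERROR case.

```cpp
// Hypothetical stand-in for the GFC_DEP_* dependency kinds in the quoted hunk.
enum class Dependency { Error, Forward, Backward, Overlap, NoDep };

// Returns true when the quoted switch returns 1, i.e. a temporary is needed.
// The Error case mirrors the patch as posted; the thread questions whether
// it can be reached at all.
constexpr bool needs_temporary (Dependency dep)
{
  switch (dep)
    {
    case Dependency::Backward:
    case Dependency::Overlap:
      return true;
    case Dependency::Forward:
    case Dependency::NoDep:
    case Dependency::Error:
      return false;
    }
  return false;
}

static_assert (needs_temporary (Dependency::Overlap), "overlap needs a copy");
static_assert (!needs_temporary (Dependency::Forward), "forward does not");
```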

Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-18 Thread Jason Merrill

On Mon, Jul 18, 2016 at 2:33 PM, Jakub Jelinek  wrote:
> On Mon, Jul 18, 2016 at 02:07:50PM -0400, Jason Merrill wrote:
>> >/* Fold away the NOP_EXPR from fold_builtin_n.  */
>> >new_call = fold (new_call);
>> >force_folding_builtin_constant_p = save_ffbcp;
>> > +
>> > +  /* Folding some math builtins produces e.g. COMPOUND_EXPR etc.  */
>> > +  if (cxx_dialect >= cxx14)
>> > +return cxx_eval_constant_expression (_ctx, new_call, lval,
>> > +non_constant_p, overflow_p);
>>
>> If we do this unconditionally, can we drop the fold above?
>
> So I've tried following patch, but on
> extern "C" double frexp (double, int *);
>
> struct S
> {
> #ifdef FREXP
>   constexpr S (double a) : y {}, x (frexp (a, &y)) {}
> #else
>   constexpr S (double a) : y {}, x ((y = 1, 0.8125)) {}
> #endif
>   double x;
>   int y;
> };
>
> static_assert (S (6.5).x == 0.8125, "");
>
> it means the testcase is accepted with -DFREXP in both
> -std=gnu++11 and -std=gnu++14 modes and without -DFREXP in
> -std=gnu++14 mode only.  I can move over the fold call
> to the cxx11 block, but I think we need to reject it for C++11.

Ah, I guess we need to check cxx_dialect in cxx_eval_store_expression,
not just in potential_constant_expression.

Jason
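
As a side illustration of why C++14 matters here: a constexpr function may write through a pointer argument in C++14, which is what a frexp-style builtin needs. The sketch below does not use the real frexp builtin; split_mantissa is a hypothetical stand-in that only handles positive finite inputs, and the members are declared with y before x so that y is initialized before the call writes to it.

```cpp
// Hypothetical frexp-like helper: returns the mantissa in [0.5, 1) and
// stores the binary exponent through the pointer.  Assumes value is
// positive and finite; zero, negative, infinite or NaN inputs are not handled.
constexpr double split_mantissa (double value, int *exponent)
{
  int e = 0;
  while (value >= 1.0)
    {
      value /= 2.0;
      ++e;
    }
  while (value < 0.5)
    {
      value *= 2.0;
      --e;
    }
  *exponent = e;
  return value;
}

struct S
{
  // y is declared first, so it is initialized before x's initializer runs.
  constexpr S (double a) : y {}, x (split_mantissa (a, &y)) {}
  int y;
  double x;
};

// Valid in C++14 and later; C++11 constexpr rules reject the loops and stores.
static_assert (S (6.5).x == 0.8125, "mantissa of 6.5");
static_assert (S (6.5).y == 3, "exponent of 6.5");
```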

Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-18 Thread Jakub Jelinek

On Mon, Jul 18, 2016 at 02:07:50PM -0400, Jason Merrill wrote:
> On Mon, Jul 18, 2016 at 2:03 PM, Jakub Jelinek  wrote:
> > On Mon, Jul 18, 2016 at 01:16:26PM -0400, Jason Merrill wrote:
> > That is reasonable, but not 100% sure what to do if it returns NULL
> > - it should return t, but if I do VERIFY_CONSTANT (t); there or
> > manually
> > if (!*non_constant_p)
> >   {
> > if (!allow_non_constant)
> >   error ("%q+E is not a constant expression", t);
> > *non_constant_p = true;
> >   }
> > return t;
> > it will print the original expression (without folded arguments).
> > Another option would be to build_call_array_loc if we want to emit
> > the error.  Preferences?
> 
> Hmm, I guess let's build the call for the error.

Ok.

> >/* Fold away the NOP_EXPR from fold_builtin_n.  */
> >new_call = fold (new_call);
> >force_folding_builtin_constant_p = save_ffbcp;
> > +
> > +  /* Folding some math builtins produces e.g. COMPOUND_EXPR etc.  */
> > +  if (cxx_dialect >= cxx14)
> > +return cxx_eval_constant_expression (_ctx, new_call, lval,
> > +non_constant_p, overflow_p);
> 
> If we do this unconditionally, can we drop the fold above?

So I've tried following patch, but on
extern "C" double frexp (double, int *);

struct S
{
#ifdef FREXP
  constexpr S (double a) : y {}, x (frexp (a, &y)) {}
#else
  constexpr S (double a) : y {}, x ((y = 1, 0.8125)) {}
#endif
  double x;
  int y;
};

static_assert (S (6.5).x == 0.8125, "");

it means the testcase is accepted with -DFREXP in both
-std=gnu++11 and -std=gnu++14 modes and without -DFREXP in
-std=gnu++14 mode only.  I can move over the fold call
to the cxx11 block, but I think we need to reject it for C++11.

2016-07-18  Jakub Jelinek  

PR c++/50060
* constexpr.c (cxx_eval_builtin_function_call): Pass false as lval
when evaluating call arguments.  Use fold_builtin_call_array instead
of fold_build_call_array_loc, return t if it returns NULL.
For C++14 and later, pass new_call to
cxx_eval_constant_expression.

* g++.dg/cpp1y/constexpr-50060.C: New test.

--- gcc/cp/constexpr.c.jj   2016-07-16 10:41:04.525652516 +0200
+++ gcc/cp/constexpr.c  2016-07-18 20:18:08.035709874 +0200
@@ -1105,7 +1105,7 @@ cxx_eval_builtin_function_call (const co
   for (i = 0; i < nargs; ++i)
 {
   args[i] = cxx_eval_constant_expression (_ctx, CALL_EXPR_ARG (t, i),
- lval, , );
+ false, , );
   if (bi_const_p)
/* For __built_in_constant_p, fold all expressions with constant values
   even if they aren't C++ constant-expressions.  */
@@ -1114,13 +1114,23 @@ cxx_eval_builtin_function_call (const co
 
   bool save_ffbcp = force_folding_builtin_constant_p;
   force_folding_builtin_constant_p = true;
-  new_call = fold_build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
-   CALL_EXPR_FN (t), nargs, args);
-  /* Fold away the NOP_EXPR from fold_builtin_n.  */
-  new_call = fold (new_call);
+  new_call = fold_builtin_call_array (EXPR_LOCATION (t), TREE_TYPE (t),
+ CALL_EXPR_FN (t), nargs, args);
   force_folding_builtin_constant_p = save_ffbcp;
-  VERIFY_CONSTANT (new_call);
-  return new_call;
+  if (new_call == NULL)
+{
+  if (!*non_constant_p && !ctx->quiet)
+   {
+ new_call = build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
+  CALL_EXPR_FN (t), nargs, args);
+ error ("%q+E is not a constant expression", new_call);
+   }
+  *non_constant_p = true;
+  return t;
+}
+
+  return cxx_eval_constant_expression (_ctx, new_call, lval,
+  non_constant_p, overflow_p);
 }
 
 /* TEMP is the constant value of a temporary object of type TYPE.  Adjust
--- gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C.jj 2016-07-18 19:45:30.528496123 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C    2016-07-18 19:45:30.528496123 +0200
@@ -0,0 +1,100 @@
+// PR c++/50060
+// { dg-do compile { target c++14 } }
+
+// sincos and lgamma_r aren't available in -std=c++14,
+// only in -std=gnu++14.  Use __builtin_* in that case.
+extern "C" void sincos (double, double *, double *);
+extern "C" double frexp (double, int *);
+extern "C" double modf (double, double *);
+extern "C" double remquo (double, double, int *);
+extern "C" double lgamma_r (double, int *);
+
+constexpr double
+f0 (double x)
+{
+  double y {};
+  double z {};
+  __builtin_sincos (x, , );
+  return y;
+}
+
+constexpr double
+f1 (double x)
+{
+  double y {};
+  double z {};
+  __builtin_sincos (x, , );
+  return z;
+}
+
+constexpr double
+f2 (double x)
+{
+  int y {};
+  return frexp (x, &y);
+}
+
+constexpr int
+f3 (double x)
+{
+  int y {};
+

Re: [C++ PATCH] Fix ICE with SIZEOF_EXPR in default arg (PR c++/71822)

2016-07-18 Thread Jason Merrill

OK.

On Mon, Jul 11, 2016 at 3:25 PM, Jakub Jelinek  wrote:
> Hi!
>
> For SIZEOF_EXPR, we rely on cp_fold to fold it.
> But, for VEC_INIT_EXPR initialization, we actually just genericize it
> without ever folding the expressions, so e.g. if the ctor has default args
> and some complicated expressions in there, they will never be cp_folded.
> This is the only place that calls cp_genericize_tree other than when the
> whole function is genericized, the fix just adds similar folding of the
> expression that cp_fold_function does.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?
>
> 2016-07-11  Jakub Jelinek  
>
> PR c++/71822
> * cp-gimplify.c (cp_gimplify_expr) : Recursively
> fold *expr_p before genericizing it.
>
> * g++.dg/template/defarg21.C: New test.
>
> --- gcc/cp/cp-gimplify.c.jj 2016-07-11 11:14:28.0 +0200
> +++ gcc/cp/cp-gimplify.c    2016-07-11 11:24:30.554083084 +0200
> @@ -621,6 +621,8 @@ cp_gimplify_expr (tree *expr_p, gimple_s
>   init, VEC_INIT_EXPR_VALUE_INIT (*expr_p),
>   from_array,
>   tf_warning_or_error);
> +   hash_set pset;
> +   cp_walk_tree (expr_p, cp_fold_r, , NULL);
> cp_genericize_tree (expr_p);
> ret = GS_OK;
> input_location = loc;
> --- gcc/testsuite/g++.dg/template/defarg21.C.jj 2016-07-11 11:32:34.262266398 +0200
> +++ gcc/testsuite/g++.dg/template/defarg21.C    2016-07-11 11:31:21.0 +0200
> @@ -0,0 +1,21 @@
> +// PR c++/71822
> +// { dg-do compile }
> +
> +int bar (int);
> +
> +template 
> +struct A
> +{
> +  explicit A (int x = bar (sizeof (T)));
> +};
> +
> +struct B
> +{
> +  A  b[2];
> +};
> +
> +void
> +baz ()
> +{
> +  B b;
> +}
>
> Jakub

Re: [C++ PATCH] Fix error recovery in tsubst_baselink (PR c++/71826)

2016-07-18 Thread Jason Merrill

OK.

On Mon, Jul 11, 2016 at 3:34 PM, Jakub Jelinek  wrote:
> Hi!
>
> Most of the spots in tsubst_baselink that actually access baselink after
> it has been assigned lookup_fnfields () test that it is a BASELINK_P, except
> one - the BASELINK_OPTYPE update.  lookup_fnfields can return
> error_mark_node though, perhaps something else too.  The patch just follows
> what the surrounding code does.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-07-11  Jakub Jelinek  
>
> PR c++/71826
> * pt.c (tsubst_baselink): Only set BASELINK_OPTYPE for BASELINK_P.
>
> * g++.dg/template/pr71826.C: New test.
>
> --- gcc/cp/pt.c.jj  2016-07-11 11:14:28.0 +0200
> +++ gcc/cp/pt.c 2016-07-11 12:30:45.939554745 +0200
> @@ -13734,7 +13734,8 @@ tsubst_baselink (tree baselink, tree obj
>   BASELINK_FUNCTIONS (baselink),
>   template_args);
>  /* Update the conversion operator type.  */
> -BASELINK_OPTYPE (baselink) = optype;
> +if (BASELINK_P (baselink))
> +  BASELINK_OPTYPE (baselink) = optype;
>
>  if (!object_type)
>object_type = current_class_type;
> --- gcc/testsuite/g++.dg/template/pr71826.C.jj  2016-07-11 12:34:51.406568756 +0200
> +++ gcc/testsuite/g++.dg/template/pr71826.C     2016-07-11 12:33:35.0 +0200
> @@ -0,0 +1,17 @@
> +// PR c++/71826
> +// { dg-do compile }
> +
> +template  struct A { int i; };  // { dg-message "note" }
> +struct B { void i () {} }; // { dg-message "note" }
> +template  struct C : A , B
> +{
> +  void f () { i (); }  // { dg-error "is ambiguous" }
> +};
> +
> +int
> +main ()
> +{
> +  C  c;
> +  c.f ();
> +  return 0;
> +}
>
> Jakub

Re: [C++ PATCH] Fix lval {REAL,IMAG}PART_EXPR constexpr evaluation (PR c++/71828)

2016-07-18 Thread Jason Merrill

OK.

On Mon, Jul 11, 2016 at 4:08 PM, Jakub Jelinek  wrote:
> Hi!
>
> REALPART_EXPR and IMAGPART_EXPR are handled like unary expressions, even
> though they are references.  For !lval that makes no difference, but for
> lval it means we can get ADDR_EXPR of INTEGER_CST etc., or trying to store
> into an INTEGER_CST.
>
> Fixed by doing roughly what we do for other references like COMPONENT_REF,
> ARRAY_REF etc. in that case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?
>
> 2016-07-11  Jakub Jelinek  
>
> PR c++/71828
> * constexpr.c (cxx_eval_constant_expression) :
> For lval don't use cxx_eval_unary_expression and instead recurse
> and if needed rebuild the reference.
>
> * g++.dg/cpp0x/constexpr-71828.C: New test.
>
> --- gcc/cp/constexpr.c.jj   2016-07-11 11:14:28.0 +0200
> +++ gcc/cp/constexpr.c  2016-07-11 13:30:17.333065119 +0200
> @@ -3790,6 +3790,19 @@ cxx_eval_constant_expression (const cons
>
>  case REALPART_EXPR:
>  case IMAGPART_EXPR:
> +  if (lval)
> +   {
> + r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0), lval,
> +   non_constant_p, overflow_p);
> + if (r == error_mark_node)
> +   ;
> + else if (r == TREE_OPERAND (t, 0))
> +   r = t;
> + else
> +   r = fold_build1 (TREE_CODE (t), TREE_TYPE (t), r);
> + break;
> +   }
> +  /* FALLTHRU */
>  case CONJ_EXPR:
>  case FIX_TRUNC_EXPR:
>  case FLOAT_EXPR:
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-71828.C.jj 2016-07-11 13:39:31.635423827 +0200
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-71828.C    2016-07-11 13:39:02.0 +0200
> @@ -0,0 +1,5 @@
> +// PR c++/71828
> +// { dg-do compile { target c++11 } }
> +
> +constexpr _Complex int a { 1, 2 };
> +static_assert (& __imag a != &__real a, "");
>
> Jakub

Re: [C++ PATCH] Fix diagnostics ICE (PR c++/71835)

2016-07-18 Thread Jason Merrill

OK.

On Mon, Jul 11, 2016 at 4:15 PM, Jakub Jelinek  wrote:
> Hi!
>
> add_conv_candidate creates cand->fn which is not a FUNCTION_DECL, but
> some type, all spots in convert_like_real assume that if fn is non-NULL, it
> is some decl.  Just a couple of lines above this hunk we even have a comment
> reminding us on cand->fn not having to be a decl:
>   /* Since cand->fn will be a type, not a function, for a conversion
>  function, we must be careful not to unconditionally look at
>  DECL_NAME here.  */
> So, this patch uses just convert_like instead of convert_like_with_context
> if cand->fn is not a decl (or shall I test for TREE_CODE (cand->fn) ==
> FUNCTION_DECL instead?).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?
>
> 2016-07-11  Jakub Jelinek  
>
> PR c++/71835
> * call.c (build_op_call_1): Use convert_like_with_context only
> if cand->fn is a decl.
>
> * g++.dg/conversion/ambig3.C: New test.
>
> --- gcc/cp/call.c.jj        2016-07-11 11:14:28.0 +0200
> +++ gcc/cp/call.c   2016-07-11 14:42:44.466186449 +0200
> @@ -4406,8 +4406,11 @@ build_op_call_1 (tree obj, vec
>         result = build_over_call (cand, LOOKUP_NORMAL, complain);
>else
> {
> - obj = convert_like_with_context (cand->convs[0], obj, cand->fn, -1,
> -  complain);
> + if (DECL_P (cand->fn))
> +   obj = convert_like_with_context (cand->convs[0], obj, cand->fn,
> +-1, complain);
> + else
> +   obj = convert_like (cand->convs[0], obj, complain);
>   obj = convert_from_reference (obj);
>   result = cp_build_function_call_vec (obj, args, complain);
> }
> --- gcc/testsuite/g++.dg/conversion/ambig3.C.jj 2016-07-11 14:47:26.668826075 +0200
> +++ gcc/testsuite/g++.dg/conversion/ambig3.C    2016-07-11 14:46:49.0 +0200
> @@ -0,0 +1,13 @@
> +// PR c++/71835
> +// { dg-do compile }
> +
> +typedef void T (int);
> +struct A { operator T * (); }; // { dg-message "candidate" }
> +struct B { operator T * (); }; // { dg-message "candidate" }
> +struct C : A, B {} c;
> +
> +void
> +foo ()
> +{
> +  c (0);   // { dg-error "is ambiguous" }
> +}
>
> Jakub
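
For context on what the quoted test exercises: an object whose class has a conversion operator to pointer-to-function can be called directly, and the call goes through that conversion. The sketch below shows the unambiguous case with a single conversion; the names twice and call_through_conversion are made up for the illustration. The quoted test instead derives from two bases that each provide such a conversion, so the call is ambiguous, and that diagnostic path is where the ICE occurred.

```cpp
// Function type used as the target of the conversion operator.
typedef int Fn (int);

int twice (int x) { return 2 * x; }

struct A
{
  // A conversion to pointer-to-function makes objects of A callable.
  operator Fn * () const { return twice; }
};

int call_through_conversion (int v)
{
  A a;
  // Resolved through the conversion function: equivalent to twice (v).
  return a (v);
}
```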

Re: [C++ PATCH] Fix ICE with PTRMEM_CST (PR c++/70869)

2016-07-18 Thread Jason Merrill

OK for trunk; just the earlier patch is fine for 6.2.

On Mon, Jul 11, 2016 at 4:17 PM, Jakub Jelinek  wrote:
> On Thu, Jul 07, 2016 at 03:06:55PM -0400, Jason Merrill wrote:
>> On Thu, Jul 7, 2016 at 2:23 PM, Jakub Jelinek  wrote:
>> > On Thu, Jul 07, 2016 at 12:32:02PM -0400, Jason Merrill wrote:
>> >> Hmm, I wonder if walk_tree_1 should walk into DECL_EXPR like it does into
>> >> BIND_EXPR_VARS.  But your patch is OK.
>> >
>> > Well, walk_tree_1 does walk into DECL_EXPR, but cp_genericize_r says
>> > *walk_subtrees on the VAR_DECL inside of it.
>> > When walking BIND_EXPR_VARS, it doesn't walk the vars themselves, but
>> > /* Walk the DECL_INITIAL and DECL_SIZE.  We don't want to walk
>> >into declarations that are just mentioned, rather than
>> >declared; they don't really belong to this part of the tree.
>> >And, we can see cycles: the initializer for a declaration
>> >can refer to the declaration itself.  */
>> > WALK_SUBTREE (DECL_INITIAL (decl));
>> > WALK_SUBTREE (DECL_SIZE (decl));
>> > WALK_SUBTREE (DECL_SIZE_UNIT (decl));
>> > Do you mean walk_tree_1 should walk DECL_INITIAL/DECL_SIZE/DECL_SIZE_UNIT
>> > of the var mentioned in the DECL_EXPR?  Then for many vars (which are both
>> > mentioned in BIND_EXPR_VARS and in DECL_EXPR) it would walk them twice.
>>
>> Yes, that's what I meant.  Or perhaps since this is a C++ FE issue,
>> cp_walk_subtrees should walk those fields for artificial variables.
>
> I've already committed the patch, given your "But your patch is OK." above.
> But the following works too, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?
>
> If yes, do you want the combined diff from both patches on the 6.2 branch
> too, or just the earlier patch?
>
> 2016-07-11  Jakub Jelinek  
>
> PR c++/70869
> PR c++/71054
> * cp-gimplify.c (cp_genericize_r): Revert the 2016-07-07 change.
> * tree.c (cp_walk_subtrees): For DECL_EXPR on DECL_ARTIFICIAL
> non-static VAR_DECL, walk the decl's DECL_INITIAL, DECL_SIZE and
> DECL_SIZE_UNIT.
>
> --- gcc/cp/cp-gimplify.c.jj 2016-07-11 11:24:30.554083084 +0200
> +++ gcc/cp/cp-gimplify.c    2016-07-11 15:21:30.459546129 +0200
> @@ -1351,15 +1351,7 @@ cp_genericize_r (tree *stmt_p, int *walk
>  {
>tree d = DECL_EXPR_DECL (stmt);
>if (TREE_CODE (d) == VAR_DECL)
> -   {
> - gcc_assert (CP_DECL_THREAD_LOCAL_P (d) == DECL_THREAD_LOCAL_P (d));
> - /* User var initializers should be genericized during containing
> -BIND_EXPR genericization when walk_tree walks DECL_INITIAL
> -of BIND_EXPR_VARS.  Artificial temporaries might not be
> -mentioned there though, so walk them now.  */
> - if (DECL_ARTIFICIAL (d) && !TREE_STATIC (d) && DECL_INITIAL (d))
> - cp_walk_tree (&DECL_INITIAL (d), cp_genericize_r, data, NULL);
> -   }
> +   gcc_assert (CP_DECL_THREAD_LOCAL_P (d) == DECL_THREAD_LOCAL_P (d));
>  }
>else if (TREE_CODE (stmt) == OMP_PARALLEL
>|| TREE_CODE (stmt) == OMP_TASK
> --- gcc/cp/tree.c.jj        2016-07-11 11:14:28.0 +0200
> +++ gcc/cp/tree.c   2016-07-11 15:30:38.635047697 +0200
> @@ -4075,6 +4075,22 @@ cp_walk_subtrees (tree *tp, int *walk_su
>*walk_subtrees_p = 0;
>break;
>
> +case DECL_EXPR:
> +  /* User variables should be mentioned in BIND_EXPR_VARS
> +and their initializers and sizes walked when walking
> +the containing BIND_EXPR.  Compiler temporaries are
> +handled here.  */
> +  if (VAR_P (TREE_OPERAND (*tp, 0))
> + && DECL_ARTIFICIAL (TREE_OPERAND (*tp, 0))
> + && !TREE_STATIC (TREE_OPERAND (*tp, 0)))
> +   {
> + tree decl = TREE_OPERAND (*tp, 0);
> + WALK_SUBTREE (DECL_INITIAL (decl));
> + WALK_SUBTREE (DECL_SIZE (decl));
> + WALK_SUBTREE (DECL_SIZE_UNIT (decl));
> +   }
> +  break;
> +
>  default:
>return NULL_TREE;
>  }
>
>
> Jakub

Re: [C++ RFC/Patch] PR c++/71665

2016-07-18 Thread Jason Merrill

On Tue, Jul 12, 2016 at 10:30 AM, Paolo Carlini
 wrote:
> On 30/06/2016 19:49, Jason Merrill wrote:
>> I think we should check the type before calling cxx_constant_value.
>>
> Ok, I got the point. I'm not sure however how far we want to go with this
> and which kind of consistency we want to achieve (vs error messages issued
> in other similar circumstances). The below certainly passes testing on
> x86_64-linux.

I meant the actual type of the expression: that is, check
INTEGRAL_OR_ENUMERATION_TYPE_P before calling cxx_constant_value.

Jason

Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-18 Thread Jason Merrill

On Mon, Jul 18, 2016 at 2:03 PM, Jakub Jelinek  wrote:
> On Mon, Jul 18, 2016 at 01:16:26PM -0400, Jason Merrill wrote:
> That is reasonable, but not 100% sure what to do if it returns NULL
> - it should return t, but if I do VERIFY_CONSTANT (t); there or
> manually
> if (!*non_constant_p)
>   {
> if (!allow_non_constant)
>   error ("%q+E is not a constant expression", t);
> *non_constant_p = true;
>   }
> return t;
> it will print the original expression (without folded arguments).
> Another option would be to build_call_array_loc if we want to emit
> the error.  Preferences?

Hmm, I guess let's build the call for the error.

>/* Fold away the NOP_EXPR from fold_builtin_n.  */
>new_call = fold (new_call);
>force_folding_builtin_constant_p = save_ffbcp;
> +
> +  /* Folding some math builtins produces e.g. COMPOUND_EXPR etc.  */
> +  if (cxx_dialect >= cxx14)
> +return cxx_eval_constant_expression (_ctx, new_call, lval,
> +non_constant_p, overflow_p);

If we do this unconditionally, can we drop the fold above?

Jason

Patch ping

2016-07-18 Thread Jakub Jelinek

Hi!

I'd like to ping a couple of C++ patches:

- PR70869 - change fix from cp_genericize_r tweak to cp_walk_subtrees
  http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00568.html

- PR71835 - fix diagnostic ICE
  http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00567.html

- PR71828 - fix *PART_EXPR handling in constexpr
  http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00565.html

- PR71826 - fix error recovery in tsubst_baselink
  http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00530.html

- PR71822 - fix ICE with VEC_INIT_EXPR
  http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00528.html

Jakub

Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-18 Thread Jakub Jelinek

On Mon, Jul 18, 2016 at 01:16:26PM -0400, Jason Merrill wrote:
> On Fri, Jul 15, 2016 at 2:42 PM, Jakub Jelinek  wrote:
> > While in C++11, builtins returning two results, one of them by dereferencing
> > a pointer argument can't be constexpr, in my understanding in C++14
> > generalized constexprs they can.
> 
> Yes.
> 
> > So, this patch tweaks cxx_eval_builtin_function_call so that it handles how
> > builtins.c folds these builtins (i.e. COMPOUND_EXPR with first operand
> > being *arg = const1 and second operand const2, optionally all wrapped into a
> > NON_LVALUE_EXPR.
> 
> Why so specific?  Can't we just pass the return value from fold into
> cxx_eval_constant_expression, if it isn't still a CALL_EXPR?

I'll test it.

> Incidentally, I think we should be using fold_builtin_call_array
> rather than fold_build_call_array_loc.

That is reasonable, but not 100% sure what to do if it returns NULL
- it should return t, but if I do VERIFY_CONSTANT (t); there or
manually
if (!*non_constant_p)
  {
if (!allow_non_constant)
  error ("%q+E is not a constant expression", t);
*non_constant_p = true;
  }
return t;
it will print the original expression (without folded arguments).
Another option would be to build_call_array_loc if we want to emit
the error.  Preferences?

> > In addition, I've noticed that the lval argument is passed down to
> > evaluation of arguments, that doesn't make sense to me, IMHO arguments
> > should be always evaluated as rvalues (and for non-builtins they are).
> 
> I'm a bit nervous about this, since some builtins take arguments by
> magic rather than by value, but I'm willing to accept this and see
> what breaks.

If some builtin is special, wouldn't it be special regardless of whether
we want to take address of the builtin's return value or not?
I'm really not aware of any though, after all, gimplification will turn
all their arguments into rvalues anyway.

The following patch uses VERIFY_CONSTANT (t) for the above mentioned issue,
and passes make check-g++ RUNTESTFLAGS=dg.exp=*constexpr*, is that what you
want or something different?

2016-07-18  Jakub Jelinek  

PR c++/50060
* constexpr.c (cxx_eval_builtin_function_call): Pass false as lval
when evaluating call arguments.  Use fold_builtin_call_array instead
of fold_build_call_array_loc, return t if it returns NULL.
For C++14 and later, pass new_call to
cxx_eval_constant_expression.

* g++.dg/cpp1y/constexpr-50060.C: New test.

--- gcc/cp/constexpr.c.jj   2016-07-16 10:41:04.525652516 +0200
+++ gcc/cp/constexpr.c  2016-07-18 19:53:45.059291912 +0200
@@ -1105,7 +1105,7 @@ cxx_eval_builtin_function_call (const co
   for (i = 0; i < nargs; ++i)
 {
   args[i] = cxx_eval_constant_expression (_ctx, CALL_EXPR_ARG (t, i),
- lval, , );
+ false, , );
   if (bi_const_p)
/* For __built_in_constant_p, fold all expressions with constant values
   even if they aren't C++ constant-expressions.  */
@@ -1114,11 +1114,23 @@ cxx_eval_builtin_function_call (const co
 
   bool save_ffbcp = force_folding_builtin_constant_p;
   force_folding_builtin_constant_p = true;
-  new_call = fold_build_call_array_loc (EXPR_LOCATION (t), TREE_TYPE (t),
-   CALL_EXPR_FN (t), nargs, args);
+  new_call = fold_builtin_call_array (EXPR_LOCATION (t), TREE_TYPE (t),
+ CALL_EXPR_FN (t), nargs, args);
+  if (new_call == NULL)
+{
+  VERIFY_CONSTANT (t);
+  return t;
+}
+
   /* Fold away the NOP_EXPR from fold_builtin_n.  */
   new_call = fold (new_call);
   force_folding_builtin_constant_p = save_ffbcp;
+
+  /* Folding some math builtins produces e.g. COMPOUND_EXPR etc.  */
+  if (cxx_dialect >= cxx14)
+return cxx_eval_constant_expression (_ctx, new_call, lval,
+non_constant_p, overflow_p);
+
   VERIFY_CONSTANT (new_call);
   return new_call;
 }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C.jj 2016-07-18 19:45:30.528496123 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-50060.C    2016-07-18 19:45:30.528496123 +0200
@@ -0,0 +1,100 @@
+// PR c++/50060
+// { dg-do compile { target c++14 } }
+
+// sincos and lgamma_r aren't available in -std=c++14,
+// only in -std=gnu++14.  Use __builtin_* in that case.
+extern "C" void sincos (double, double *, double *);
+extern "C" double frexp (double, int *);
+extern "C" double modf (double, double *);
+extern "C" double remquo (double, double, int *);
+extern "C" double lgamma_r (double, int *);
+
+constexpr double
+f0 (double x)
+{
+  double y {};
+  double z {};
+  __builtin_sincos (x, , );
+  return y;
+}
+
+constexpr double
+f1 (double x)
+{
+  double y {};
+  double z {};
+  __builtin_sincos (x, , );
+  return z;
+}
+
+constexpr double

Re: [PATCH] c++/60760 - arithmetic on null pointers should not be allowed in constant expressions

2016-07-18 Thread Jason Merrill


On 07/06/2016 06:20 PM, Martin Sebor wrote:

@@ -2911,6 +2923,14 @@ cxx_eval_indirect_ref (const constexpr_ctx *ctx, tree t,
   if (*non_constant_p)
 return t;

+  if (integer_zerop (op0))
+   {
+ if (!ctx->quiet)
+   error ("dereferencing a null pointer");
+ *non_constant_p = true;
+ return t;
+   }

I'm skeptical of checking this here, since *p is valid for null p; &*p 
is even a constant expression.  And removing this hunk doesn't seem to 
break any of your tests.


OK with that hunk removed.

Jason

Re: [PATCH] Add qsort comparator consistency checking (PR71702)

2016-07-18 Thread Alexander Monakov

On Mon, 18 Jul 2016, Richard Biener wrote:
> Ugh.  What impact does this have on stage2 compile-time?

It doesn't seem to be high enough to be measured reliably.  I've made a trial
run with -time=time.log in BOOT_CFLAGS, but there's a lot of variability in
timings and the sum total of times ended up 1% lower on the patched compiler.

However, this patch only runs checking for vec::qsort, while I'd like to have
such checking on all qsort calls.  That would make it a bit more concerning.

It is possible to consider other schemes of limiting the impact of this checking
by restricting the subset of pairs being tested. For instance, it's possible to
run all-pairs check on a really small prefix of the sorted array (e.g. 10,
instead of 100 in the proposed patch), and for the rest of the elements, check
only a logarithmic number of pairs. This would make this checking have time
complexity O(n log n), matching qsort (but likely with a lower constant factor).
Would this scheme be appropriate?

Thanks.
Alexander
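
The limited-checking scheme described above could look roughly like the following sketch. It is not the code from the patch: the function name, the default prefix of 10, and the choice of power-of-two distances for the "logarithmic number of pairs" are all assumptions made for illustration. It checks every pair inside a small prefix of the already-sorted array and, for each later element, only the pairs at distances 1, 2, 4, ... behind it.

```cpp
#include <cstddef>
#include <vector>

// Returns false if the qsort-style comparator cmp (negative, zero, positive)
// is visibly inconsistent with the order of the already-sorted input.
template <typename T, typename Cmp>
bool comparator_looks_consistent (const std::vector<T> &sorted, Cmp cmp,
                                  std::size_t prefix = 10)
{
  auto sign = [] (int v) { return (v > 0) - (v < 0); };

  // For i < j in sorted order: the two results must be opposite in sign
  // (antisymmetry) and element i must not compare greater than element j.
  auto pair_ok = [&] (std::size_t i, std::size_t j)
  {
    const int forward = sign (cmp (sorted[i], sorted[j]));
    const int backward = sign (cmp (sorted[j], sorted[i]));
    return forward == -backward && forward <= 0;
  };

  const std::size_t n = sorted.size ();
  const std::size_t p = n < prefix ? n : prefix;

  // All pairs within the small prefix: O(prefix^2) comparisons.
  for (std::size_t i = 0; i < p; ++i)
    for (std::size_t j = i + 1; j < p; ++j)
      if (!pair_ok (i, j))
        return false;

  // Remaining elements: O(log n) pairs each, O(n log n) in total.
  for (std::size_t j = p; j < n; ++j)
    for (std::size_t d = 1; d <= j; d *= 2)
      if (!pair_ok (j - d, j))
        return false;

  return true;
}
```

A check like this can only report inconsistencies it happens to sample; a comparator that passes is not proven to be a valid ordering.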

Re: [C++ PATCH] Allow frexp etc. builtins in c++14 constexpr (PR c++/50060)

2016-07-18 Thread Jason Merrill

On Fri, Jul 15, 2016 at 2:42 PM, Jakub Jelinek  wrote:
> While in C++11, builtins returning two results, one of them by dereferencing
> a pointer argument can't be constexpr, in my understanding in C++14
> generalized constexprs they can.

Yes.

> So, this patch tweaks cxx_eval_builtin_function_call so that it handles how
> builtins.c folds these builtins (i.e. COMPOUND_EXPR with first operand
> being *arg = const1 and second operand const2, optionally all wrapped into a
> NON_LVALUE_EXPR.

Why so specific?  Can't we just pass the return value from fold into
cxx_eval_constant_expression, if it isn't still a CALL_EXPR?

Incidentally, I think we should be using fold_builtin_call_array
rather than fold_build_call_array_loc.

> In addition, I've noticed that the lval argument is passed down to
> evaluation of arguments, that doesn't make sense to me, IMHO arguments
> should be always evaluated as rvalues (and for non-builtins they are).

I'm a bit nervous about this, since some builtins take arguments by
magic rather than by value, but I'm willing to accept this and see
what breaks.

Jason

Re: [C++ PATCH] Fix up vector cond expr handling in templates (PR c++/71871)

2016-07-18 Thread Jason Merrill

OK.

Jason

On Thu, Jul 14, 2016 at 11:21 AM, Jakub Jelinek  wrote:
> Hi!
>
> This patch reverts part of
> https://gcc.gnu.org/ml/gcc-patches/2012-10/msg01665.html
> which looks wrong to me.
> The problem is that in templates, if the build_x_conditional_expr
> arguments aren't type dependent, we might get NON_DEPENDENT_EXPR
> wrappers around the argument, so after issuing possibly needed diagnostics
> we need to return back to the original unmodified arguments,
> which for VEC_COND_EXPR the condition has been bypassing and we ended up
> with NON_DEPENDENT_EXPR in the IL, which nothing strips away
> (plus VEC_COND_EXPR isn't really supported in 
> tsubst_copy/tsubst_copy_and_build
> and perhaps other spots).
> While we could do build_min_non_dep with VEC_COND_EXPR and teach pt.c about
> VEC_COND_EXPR, I really don't see advantages of doing that, if we build just
> COND_EXPR, build_min_non_dep ensures that it will have the right type, and
> when we instantiate it build_x_conditional_expr will be called again and
> that will create VEC_COND_EXPR when not processing_template_decl.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-07-14  Jakub Jelinek  
>
> PR c++/71871
> * typeck.c (build_x_conditional_expr): Revert the 2012-10-25 change.
>
> * g++.dg/ext/vector31.C: New test.
>
> --- gcc/cp/typeck.c.jj  2016-07-11 11:14:28.0 +0200
> +++ gcc/cp/typeck.c 2016-07-14 12:47:48.436699222 +0200
> @@ -6288,8 +6288,7 @@ build_x_conditional_expr (location_t loc
>  }
>
>expr = build_conditional_expr (loc, ifexp, op1, op2, complain);
> -  if (processing_template_decl && expr != error_mark_node
> -  && TREE_CODE (expr) != VEC_COND_EXPR)
> +  if (processing_template_decl && expr != error_mark_node)
>  {
>tree min = build_min_non_dep (COND_EXPR, expr,
> orig_ifexp, orig_op1, orig_op2);
> --- gcc/testsuite/g++.dg/ext/vector31.C.jj  2016-07-14 13:01:04.933206583 +0200
> +++ gcc/testsuite/g++.dg/ext/vector31.C 2016-07-14 13:00:59.943272143 +0200
> @@ -0,0 +1,29 @@
> +// PR c++/71871
> +// { dg-do compile }
> +
> +typedef unsigned int V __attribute__ ((__vector_size__ (32)));
> +
> +template 
> +void
> +foo (V *x)
> +{
> +  V a = *x;
> +  a = a ? a : -1;
> +  *x = a;
> +}
> +
> +template 
> +void
> +bar (T *x)
> +{
> +  T a = *x;
> +  a = a ? a : -1;
> +  *x = a;
> +}
> +
> +void
> +test (V *x, V *y)
> +{
> +  foo<0> (x);
> +  bar (y);
> +}
>
> Jakub

Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns

2016-07-18 Thread Bernd Schmidt


On 07/18/2016 06:34 PM, Segher Boessenkool wrote:



+  /* The frequency of executing the prologue for this BB and all BBs
+ dominated by it.  */
+  gcov_type cost;


Is this frequency consideration the only thing that attempts to prevent
placing prologue insns into loops?


Yes.  The algorithm makes sure the prologues are executed as infrequently
as possible.  If a block that would get a prologue has the same frequency
as a predecessor does, and that predecessor always has that first block as
eventual successor, the prologue is moved to the earlier block (this
handles the case where both have a frequency of zero, and other cases
where the range of freq is too limited).


Ugh, that is really scaring me. I'd much prefer a classification of 
valid blocks based on cfg structure alone - I'll need serious convincing 
that the frequency data is reliable enough for what you are trying to do.



Bernd

[PATCH, Fortran] DEC extra integer intrinsics

2016-07-18 Thread Fritz Reese

All,

Attached is another extension patch introducing a new DEC
compatibility flag -fdec-intrinsic-ints. With this flag the compiler
recognizes the following variants for integer intrinsics which use a
B/I/J/K prefix (with byte/integer/long/quad kind parameters):

IABS (babs, iiabs, jiabs, kiabs)
BTEST (bbtest, bitest, bjtest, bktest)
IAND (biand, iiand, jiand, kiand)
IBCLR (bbclr, iibclr, jibclr, kibclr)
IBITS (bbits, iibits, jibits, kibits)
IEOR (bieor, iieor, jieor, kieor)
IOR (bior, iior, jior, kior)
ISHFT (bshft, iishft, jishft, kishft)
ISHFTC (bshftc, iishftc, jishftc, kishftc)
MOD (bmod, imod, jmod, kmod)
NOT (bnot, inot, jnot, knot)
FLOAT (floati, floatj, floatk)
MVBITS (bmvbits, imvbits, jmvbits, kmvbits)

The patch updates intrinsics.texi to include each new intrinsic
variant and provides its own testcase. With the patch the compiler
builds and passes all regression tests on x86-64-redhat-linux.

---
Fritz Reese

2016-07-18  Fritz Reese  

New flag -fdec-intrinsic-ints for variants of integer intrinsics.

gcc/fortran/
* lang.opt: New option -fdec-intrinsic-ints.
* gfortran.texi, invoke.texi, intrinsics.texi: Update documentation.
* options.c (set_dec_flags): Enable with -fdec.
* intrinsic.c (add_function, add_subroutine): New B/I/J/K
intrinsic variants.

gcc/testsuite/gfortran.dg/
* dec_intrinsic_ints.f90: New testcase.
From f5347dddc8e5cba1f4850576de76bf2defbaa2e1 Mon Sep 17 00:00:00 2001
From: Fritz O. Reese 
Date: Mon, 30 May 2016 15:37:21 -0400
Subject: [PATCH] New flag -fdec-intrinsic-ints for variants of integer intrinsics.

	gcc/fortran/
	* lang.opt: New option -fdec-intrinsic-ints.
	* gfortran.texi, invoke.texi, intrinsics.texi: Update documentation.
	* options.c (set_dec_flags): Enable with -fdec.
	* intrinsic.c (add_function, add_subroutine): New B/I/J/K intrinsic
	variants.

	gcc/testsuite/gfortran.dg/
	* dec_intrinsic_ints.f90: New testcase.
---
 gcc/fortran/gfortran.texi|   53 
 gcc/fortran/intrinsic.c  |  362 ++
 gcc/fortran/intrinsic.texi   |  201 +++-
 gcc/fortran/invoke.texi  |8 +-
 gcc/fortran/lang.opt |4 +
 gcc/fortran/options.c|1 +
 gcc/testsuite/gfortran.dg/dec_intrinsic_ints.f90 |  165 ++
 7 files changed, 780 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/dec_intrinsic_ints.f90

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 4d288ba..54d60ad 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1461,6 +1461,7 @@ without warning.
 * Read/Write after EOF marker::
 * STRUCTURE and RECORD::
 * UNION and MAP::
+* Type variants for integer intrinsics::
 @end menu
 
 @node Old-style kind specifications
@@ -2367,6 +2368,58 @@ a.h  === '.C'
 a.l  ===   '.D'
 @end example
 
+@node Type variants for integer intrinsics
+@subsection Type variants for integer intrinsics
+@cindex intrinsics, integer
+
+Similar to the D/C prefixes to real functions to specify the input/output
+types, GNU Fortran offers B/I/J/K prefixes to integer functions for
+compatibility with DEC programs. The types implied by each are:
+
+@example
+@code{B} - @code{INTEGER(kind=1)}
+@code{I} - @code{INTEGER(kind=2)}
+@code{J} - @code{INTEGER(kind=4)}
+@code{K} - @code{INTEGER(kind=8)}
+@end example
+
+GNU Fortran supports these with the flag @option{-fdec-intrinsic-ints}.
+Intrinsics for which prefixed versions are available and in what form are noted
+in @ref{Intrinsic Procedures}. The complete list of supported intrinsics is
+here:
+
+@multitable @columnfractions .2 .2 .2 .2 .2
+
+@headitem Intrinsic @tab B @tab I @tab J @tab K
+
+@item @code{@ref{ABS}}
+  @tab @code{BABS} @tab @code{IIABS} @tab @code{JIABS} @tab @code{KIABS}
+@item @code{@ref{BTEST}}
+  @tab @code{BBTEST} @tab @code{BITEST} @tab @code{BJTEST} @tab @code{BKTEST}
+@item @code{@ref{IAND}}
+  @tab @code{BIAND} @tab @code{IIAND} @tab @code{JIAND} @tab @code{KIAND}
+@item @code{@ref{IBCLR}}
+  @tab @code{BBCLR} @tab @code{IIBCLR} @tab @code{JIBCLR} @tab @code{KIBCLR}
+@item @code{@ref{IBITS}}
+  @tab @code{BBITS} @tab @code{IIBITS} @tab @code{JIBITS} @tab @code{KIBITS}
+@item @code{@ref{IBSET}}
+  @tab @code{BBSET} @tab @code{IIBSET} @tab @code{JIBSET} @tab @code{KIBSET}
+@item @code{@ref{IEOR}}
+  @tab @code{BIEOR} @tab @code{IIEOR} @tab @code{JIEOR} @tab @code{KIEOR}
+@item @code{@ref{IOR}}
+  @tab @code{BIOR} @tab @code{IIOR} @tab @code{JIOR} @tab @code{KIOR}
+@item @code{@ref{ISHFT}}
+  @tab @code{BSHFT} @tab @code{IISHFT} @tab @code{JISHFT} @tab @code{KISHFT}
+@item @code{@ref{ISHFTC}}
+  @tab @code{BSHFTC} @tab @code{IISHFTC} @tab @code{JISHFTC} @tab @code{KISHFTC}
+@item @code{@ref{MOD}}
+  @tab @code{BMOD} @tab @code{IMOD} @tab @code{JMOD}

Re: [PATCH] c++/58796 Make nullptr match exception handlers of pointer type

2016-07-18 Thread Jason Merrill

Perhaps the right answer is to drop support for catching nullptr as a
pointer to member from the language.

Jason

Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns

2016-07-18 Thread Segher Boessenkool

Hi Bernd,

Thanks for the review.

On Fri, Jul 15, 2016 at 02:42:24PM +0200, Bernd Schmidt wrote:
> I still have misgivings about all the changes needed to the following 
> passes, but I guess there's no choice but to live with it. So, I'm 
> trying to look at this patch, but I'm finding it fairly impenetrable and 
> underdocumented.

I'll add some more comments with a fresh eye.

> >+  /* The concerns for which we want a prologue placed on this BB.  */
> >+  sbitmap pro_concern;
> >+
> >+  /* The concerns for which we placed code at the start of the BB.  */
> >+  sbitmap head_concern;
> 
> What's the difference?

Concerns in head_concern have the prologue code placed at the start
of the bb; concerns in pro_concern have that code placed before the
existing code in this bb, but after the code in any predecessor bb.
It will be inserted on an edge, unless it can be a head_concern or a
tail_concern.

head_concern and tail_concern reduce code size, in the (quite frequent)
cases where cross jumping does a sub-par job.  Originally I didn't have
these; it seems I didn't document them well enough.

> >+  /* The frequency of executing the prologue for this BB and all BBs
> >+ dominated by it.  */
> >+  gcov_type cost;
> 
> Is this frequency consideration the only thing that attempts to prevent 
> placing prologue insns into loops?

Yes.  The algorithm makes sure the prologues are executed as infrequently
as possible.  If a block that would get a prologue has the same frequency
as a predecessor does, and that predecessor always has that first block as
eventual successor, the prologue is moved to the earlier block (this
handles the case where both have a frequency of zero, and other cases
where the range of freq is too limited).
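
As a rough illustration (not code from the patch), the placement rule can be sketched on a toy dominator tree.  All names here (toy_bb, toy_place, toy_prune) are made up for the example, the frequencies are invented, and the tie-breaking rule for equal frequencies described above is left out: a block only wins when it is strictly cheaper than its descendants.

```c
#include <stddef.h>

#define TOY_MAX_KIDS 4

/* Toy stand-in for a basic block in the dominator tree.  */
struct toy_bb
{
  long freq;                /* how often the block executes */
  int needs;                /* block itself needs the component */
  int nkids;                /* number of dominator-tree children */
  int kids[TOY_MAX_KIDS];   /* indices of those children */
  long cost;                /* out: cheapest cost to cover this subtree */
  int place_here;           /* out: put the prologue at this block */
};

/* Bottom-up pass: compute the cheapest cost for the subtree rooted at
   block I and mark candidate placements.  */
static long
toy_place (struct toy_bb *bbs, int i)
{
  struct toy_bb *bb = &bbs[i];
  long kids_cost = 0;

  for (int k = 0; k < bb->nkids; k++)
    kids_cost += toy_place (bbs, bb->kids[k]);

  /* Place here if the block needs the component itself, or if running
     the prologue here is strictly cheaper than running it in all the
     descendants that need it.  */
  bb->place_here = bb->needs || bb->freq < kids_cost;
  bb->cost = bb->place_here ? bb->freq : kids_cost;
  return bb->cost;
}

/* Top-down pass: a placement dominated by another placement is
   redundant, so clear it.  COVERED is nonzero once an ancestor already
   holds the prologue.  */
static void
toy_prune (struct toy_bb *bbs, int i, int covered)
{
  struct toy_bb *bb = &bbs[i];

  if (covered)
    bb->place_here = 0;
  for (int k = 0; k < bb->nkids; k++)
    toy_prune (bbs, bb->kids[k], covered || bb->place_here);
}
```

With an entry block of frequency 100 that dominates a block of frequency 40, which in turn dominates a loop body of frequency 400 that needs the component, the sketch puts the prologue in the frequency-40 block, outside the loop.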

> >+/* Destroy the pass-specific data.  */
> >+static void
> >+fini_separate_shrink_wrap (void)
> >+{
> >+  basic_block bb;
> >+  FOR_ALL_BB_FN (bb, cfun)
> >+if (bb->aux)
> >+  {
> >+sbitmap_free (SW (bb)->has_concern);
> >+sbitmap_free (SW (bb)->pro_concern);
> >+sbitmap_free (SW (bb)->head_concern);
> >+sbitmap_free (SW (bb)->tail_concern);
> >+free (bb->aux);
> >+bb->aux = 0;
> >+  }
> >+}
> 
> Almost makes me want to ask for an sbitmap variant allocated on obstacks.

Heh, yes.  I'll have a look.

> >+  /* If this block does have the concern itself, or it is cheaper to
> >+ put the prologue here than in all the descendants that need it,
> >+ mark it so.  If it is the same cost, put it here if there is no
> >+ block reachable from this block that does not need the prologue.
> >+ The actual test is a bit more stringent but catches most cases.  */
> 
> There's some oddness here with the leading whitespace.

Will fix.

> >+/* Mark HAS_CONCERN for every block dominated by at least one block with
> >+   PRO_CONCERN set, starting at HEAD.  */
> 
> I see a lot of code dealing with the placement of prologue 
> parts/concerns/components, but very little dealing with how to place 
> epilogues, leading me to wonder whether we could do better wrt the 
> latter. Shouldn't there be some mirror symmetry, i.e. 
> spread_concerns_backwards?

That is unfortunately harder to do (the "global" prologue block dominates
everywhere we could put a prologue component, but this is not true for
epilogues -- there can be more exits).

It is also true the epilogues already are sort of optimal in execution
cost: the epilogues are executed at most as often as the prologues,
which are placed optimally by construction.  The win from placing the
epilogues better is from infinite loops and non-returning abnormal
exits; but also you can get somewhat smaller code.

So I left this as a future improvement.

> >+{
> >+  if (first_visit)
> >+{
> >+  bitmap_ior (SW (bb)->has_concern, SW (bb)->pro_concern, concern);
> >+
> >+  if (first_dom_son (CDI_DOMINATORS, bb))
> >+{
> >+  concern = SW (bb)->has_concern;
> >+  bb = first_dom_son (CDI_DOMINATORS, bb);
> >+  continue;
> >+}
> 
> Calling first_dom_son twice with the same args?

I thought it was more readable this way.  It's two derefs and an add or
two.  Maybe we should make it an inline function?

> More importantly, this 
> first_visit business seems very confusing. I'd try to find a way to 
> merge this if with the places that set first_visit to true.

That will break the early-out optimisation I think?  I'll have to look
again.  All the loops here are O(n) (with n the number of edges, or
blocks); but the place_prologues loop is called once for every component,
as well.  So the early-out helps quite a lot here.

> Also - 
> instead of having a "continue;" here it seems the code inside the if 
> represents an inner loop that should be written explicitly. There are 
> two loops with such a structure.

This "loop" is just a non-recursive tree traversal.  The complicated
part is doing the early-out at just the right time.

I'll document it better.
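
For readers unfamiliar with the idiom, here is a minimal sketch of such a non-recursive walk with a first_visit flag.  It uses a toy node type with parent, first-son and next-sibling links instead of the real dominator API, and it leaves out the early-out; toy_node and toy_walk are names invented for the example.

```c
#include <stddef.h>

/* Toy tree node; stands in for a block in the dominator tree.  */
struct toy_node
{
  int id;
  struct toy_node *parent;
  struct toy_node *first_son;
  struct toy_node *next_sibling;
};

/* Visit the subtree rooted at HEAD in pre-order without recursion.
   Store up to MAX visited ids in OUT and return the number of nodes
   visited.  */
static int
toy_walk (struct toy_node *head, int *out, int max)
{
  struct toy_node *bb = head;
  int first_visit = 1;
  int n = 0;

  while (bb)
    {
      if (first_visit)
        {
          if (n < max)
            out[n] = bb->id;
          n++;

          /* Descend first; first_visit stays set for the son.  */
          if (bb->first_son)
            {
              bb = bb->first_son;
              continue;
            }
        }

      /* Back at the root with nothing left below it: done.  */
      if (bb == head)
        break;

      if (bb->next_sibling)
        {
          bb = bb->next_sibling;
          first_visit = 1;
        }
      else
        {
          /* Climb back up; the parent was already visited.  */
          bb = bb->parent;
          first_visit = 0;
        }
    }

  return n;
}
```

The "continue" plays the role of the recursive call, and first_visit distinguishes arriving at a node from above from returning to it from below.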

> >+/* If

[PATCH GCC]Improve no-overflow check in SCEV using value range info.

2016-07-18 Thread Bin Cheng

Hi,
Scalar evolution needs to prove no-overflow for the source variable when 
handling a type conversion.  This is important because otherwise we fail to 
recognize the result of the conversion as a SCEV, resulting in missed loop 
optimizations.  Take the case added by this patch as an example: the loop 
can't be distributed as a memset call because the address of the memory 
reference is not recognized.  At the moment, we rely on type overflow 
semantics and loop niter info for no-overflow checking; unfortunately that's 
not enough.  This patch introduces a new method that checks no-overflow using 
value range information.  As commented in the patch, the value range can only 
be used when the source operand variable is evaluated on every loop 
iteration, rather than guarded by some condition.
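
To make the idea concrete, here is a small sketch of one plausible form of such a check for a 32-bit signed source variable.  This is an assumption for illustration, not the code of scev_var_range_cant_overflow: it supposes the known range [var_min, var_max] holds on every iteration and asks whether one more step would still fit in the type.  The function name and parameters are invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a 32-bit signed variable known to lie in
   [VAR_MIN, VAR_MAX] can take one more STEP without leaving the range
   of its type.  All three arguments are assumed to fit in 32 bits;
   64-bit arithmetic is used so the check itself cannot overflow.  */
static bool
toy_range_cant_overflow (int64_t var_min, int64_t var_max, int64_t step)
{
  if (step < 0)
    /* Counting down: need var_min + step >= INT32_MIN.  */
    return var_min - (int64_t) INT32_MIN >= -step;

  /* Counting up: need var_max + step <= INT32_MAX.  */
  return (int64_t) INT32_MAX - var_max >= step;
}
```

For example, a variable with range [0, 100] and step 1 passes, while a range reaching INT32_MAX with step 1 does not.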

This, together with the patch improving loop niter analysis 
(https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00736.html), can help various 
loop passes like vectorization.
Bootstrapped and tested on x86_64 and AArch64.  Is it OK? 

Thanks,
bin

2016-07-15  Bin Cheng  

* tree-chrec.c (convert_affine_scev): New parameter.  Pass new arg.
(chrec_convert_1, chrec_convert): Ditto.
* tree-chrec.h (chrec_convert, convert_affine_scev): New parameter.
* tree-scalar-evolution.c (interpret_rhs_expr): Pass new arg.
* tree-vrp.c (adjust_range_with_scev): Ditto.
* tree-ssa-loop-niter.c (idx_infer_loop_bounds): Ditto.
(scev_var_range_cant_overflow): New function.
(scev_probably_wraps_p): New parameter.  Call above function.
* tree-ssa-loop-niter.h (scev_probably_wraps_p): New parameter.

gcc/testsuite/ChangeLog
2016-07-15  Bin Cheng  

* gcc.dg/tree-ssa/scev-15.c: New.
diff --git a/gcc/tree-chrec.c b/gcc/tree-chrec.c
index ee789a2..6add426 100644
--- a/gcc/tree-chrec.c
+++ b/gcc/tree-chrec.c
@@ -1162,16 +1162,17 @@ nb_vars_in_chrec (tree chrec)
 
 /* Converts BASE and STEP of affine scev to TYPE.  LOOP is the loop whose iv
the scev corresponds to.  AT_STMT is the statement at that the scev is
-   evaluated.  USE_OVERFLOW_SEMANTICS is true if this function should assume that
-   the rules for overflow of the given language apply (e.g., that signed
-   arithmetics in C does not overflow) -- i.e., to use them to avoid unnecessary
-   tests, but also to enforce that the result follows them.  Returns true if the
-   conversion succeeded, false otherwise.  */
+   evaluated.  USE_OVERFLOW_SEMANTICS is true if this function should assume
+   that the rules for overflow of the given language apply (e.g., that signed
+   arithmetics in C does not overflow) -- i.e., to use them to avoid
+   unnecessary tests, but also to enforce that the result follows them.
+   FROM is the source variable converted if it's not NULL.  Returns true if
+   the conversion succeeded, false otherwise.  */
 
 bool
 convert_affine_scev (struct loop *loop, tree type,
 tree *base, tree *step, gimple *at_stmt,
-bool use_overflow_semantics)
+bool use_overflow_semantics, tree from)
 {
   tree ct = TREE_TYPE (*step);
   bool enforce_overflow_semantics;
@@ -1230,7 +1231,7 @@ convert_affine_scev (struct loop *loop, tree type,
 must_check_rslt_overflow = false;
 
   if (must_check_src_overflow
-  && scev_probably_wraps_p (*base, *step, at_stmt, loop,
+  && scev_probably_wraps_p (from, *base, *step, at_stmt, loop,
use_overflow_semantics))
 return false;
 
@@ -1258,7 +1259,8 @@ convert_affine_scev (struct loop *loop, tree type,
   if (must_check_rslt_overflow
   /* Note that in this case we cannot use the fact that signed variables
 do not overflow, as this is what we are verifying for the new iv.  */
-  && scev_probably_wraps_p (new_base, new_step, at_stmt, loop, false))
+  && scev_probably_wraps_p (NULL, new_base, new_step,
+   at_stmt, loop, false))
 return false;
 
   *base = new_base;
@@ -1288,12 +1290,14 @@ chrec_convert_rhs (tree type, tree chrec, gimple *at_stmt)
 
    USE_OVERFLOW_SEMANTICS is true if this function should assume that
    the rules for overflow of the given language apply (e.g., that signed
-   arithmetics in C does not overflow) -- i.e., to use them to avoid unnecessary
-   tests, but also to enforce that the result follows them.  */
+   arithmetics in C does not overflow) -- i.e., to use them to avoid
+   unnecessary tests, but also to enforce that the result follows them.
+
+   FROM is the source variable converted if it's not NULL.  */
 
 static tree
 chrec_convert_1 (tree type, tree chrec, gimple *at_stmt,
-bool use_overflow_semantics)
+bool use_overflow_semantics, tree from)
 {
   tree ct, res;
   tree base, step;
@@ -1314,7 +1318,7 @@ chrec_convert_1 (tree type, tree chrec, gimple *at_stmt,
   step = CHREC_RIGHT (chrec);
 
   if (convert_affine_scev (loop, type, &base, &step, at_stmt,

Merge switch statements in tree-cfgcleanup

2016-07-18 Thread Bernd Schmidt

The motivating example for this patch was a change that was submitted 
for genattrtab last year, which would have made us generate


switch (type = get_attr_type (insn))
  {
   ... some cases ...
   default:
 switch (type = get_attr_type (insn))
   {
   ... some other cases ...
   }
  }

The idea was to optimize this by merging the code into a single switch. 
My expectation was that this was most likely to occur in 
machine-generated code, but there are a few instances of this pattern in 
the gcc sources themselves. One case is


   code = gimple_code (stmt);
   switch (code)
 {
 
 default:
   if (is_gimple_omp (code))
 {
 }
 }

where is_gimple_omp expands into another switch. More cases exist in the 
compiler as shown by various bootstrap failures along the way; sometimes 
these are exposed after other optimizations. One is in the Ada runtime 
library somewhere, and another (which currently cannot be transformed by 
the patch) is in the Fortran frontend.


In the future we could also look for if statements making another 
comparison of the variable in the default branch, that would be a minor 
extension.


The motivating example currently can't be transformed because 
get_attr_type calls are in the way.
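
For illustration only, here is what the transformation amounts to at the source level, on a made-up function (toy_nested and toy_merged are not from the patch or the testcases).  The inner switch sits in the default case of the outer one and tests the same value; after merging, its cases move into the outer switch, and an inner case already handled by the outer switch is unreachable and dropped.

```c
/* Before: a second switch on the same value inside the default case.  */
int
toy_nested (int v)
{
  switch (v)
    {
    case 1:
      return 10;
    case 2:
      return 20;
    default:
      switch (v)
        {
        case 2:
          /* Unreachable: the outer switch already handles 2.  */
          return -1;
        case 3:
          return 30;
        case 4:
        case 5:
          return 40;
        default:
          return 0;
        }
    }
}

/* After: one switch with the labels of both merged.  */
int
toy_merged (int v)
{
  switch (v)
    {
    case 1:
      return 10;
    case 2:
      return 20;
    case 3:
      return 30;
    case 4:
    case 5:
      return 40;
    default:
      return 0;
    }
}
```

Both functions return the same result for every input; the merged form just needs a single dispatch.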


Bootstrapped and tested on x86_64-linux. Ok?


Bernd
	* tree-cfgcleanup.c (try_merge_switches): New function.
	(cleanup_tree_cfg_bb): Call it.

	* c-c++-common/merge-switch-1.c: New test.
	* c-c++-common/merge-switch-2.c: New test.
	* c-c++-common/merge-switch-3.c: New test.

Index: gcc/tree-cfgcleanup.c
===
--- gcc/tree-cfgcleanup.c	(revision 237797)
+++ gcc/tree-cfgcleanup.c	(working copy)
@@ -630,6 +630,242 @@ fixup_noreturn_call (gimple *stmt)
   return changed;
 }
 
+/* Look for situations where we have a switch inside the default case of
+   another, and they switch on the same condition.  We look for the
+   second switch in BB.  If we find such a situation, merge the two
+   switch statements.  */
+
+static bool
+try_merge_switches (basic_block bb)
+{
+  if (!single_pred_p (bb))
+return false;
+  basic_block pred_bb = single_pred (bb);
+
+  /* Look for a structure with two switch statements on the same value.  */
+  gimple_stmt_iterator gsi1, gsi2;
+  gsi1 = gsi_last_nondebug_bb (pred_bb);
+  if (gsi_end_p (gsi1))
+return false;
+  gimple *pred_end = gsi_stmt (gsi1);
+  if (gimple_code (pred_end) != GIMPLE_SWITCH)
+return false;
+
+  gsi2 = gsi_after_labels (bb);
+  if (gsi_end_p (gsi2))
+return false;
+
+  gimple *stmt = gsi_stmt (gsi2);
+  while (is_gimple_debug (stmt))
+{
+  gsi_next (&gsi2);
+  if (gsi_end_p (gsi2))
+	return false;
+  stmt = gsi_stmt (gsi2);
+}
+
+  if (gimple_code (stmt) != GIMPLE_SWITCH)
+return false;
+  gswitch *sw1 = as_a <gswitch *> (pred_end);
+  gswitch *sw2 = as_a <gswitch *> (stmt);
+  tree idx1 = gimple_switch_index (sw1);
+  tree idx2 = gimple_switch_index (sw2);
+  if (TREE_CODE (idx1) != SSA_NAME || idx1 != idx2)
+return false;
+  size_t n1 = gimple_switch_num_labels (sw1);
+  size_t n2 = gimple_switch_num_labels (sw2);
+  if (n1 <= 1 || n2 <= 1)
+return false;
+  tree sw1_default = gimple_switch_default_label (sw1);
+  if (label_to_block (CASE_LABEL (sw1_default)) != bb)
+return false;
+
+  /* We know we have the basic structure of what we are looking for.  Sort
+ out some special cases regarding phi nodes.  */
+  if (!gsi_end_p (gsi_start_phis (bb)))
+return false;
+
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, bb->succs)
+{
+  basic_block dest = e->dest;
+  if (find_edge (pred_bb, dest))
+	{
+	  /* If a destination block is reached from both switches, any
+	 phi nodes there would become corrupted.  */
+	  gphi_iterator psi = gsi_start_phis (dest);
+	  if (!gsi_end_p (psi))
+	return false;
+	}
+}
+
+  /* At this point we know we are making the transformation.
+ Clear the CASE_CHAIN values to avoid inconsistencies.  */
+  end_recording_case_labels ();
+
+  /* Start at 1 to skip default labels.  */
+  size_t i1 = 1;
+  size_t i2 = 1;
+  tree ce1 = gimple_switch_label (sw1, i1);
+  tree ce2 = gimple_switch_label (sw2, i2);
+  auto_vec<tree> new_labels;
+
+  /* Keep track of the blocks that were reachable from the second switch,
+ and whose edges should be redirected to the first.  Sometimes we
+ eliminate cases from the second switch entirely since they are
+ unreachable; in such cases, the bit for the destination block
+ remains clear.  */
+  bitmap_head redirect_bbs;
+  bitmap_initialize (&redirect_bbs, &bitmap_default_obstack);
+
+  /* Merge case labels from sw2 into those of sw1.  */
+  while (i1 < n1 && i2 < n2)
+{
+  tree min1 = CASE_LOW (ce1);
+  tree min2 = CASE_LOW (ce2);
+  tree max1 = CASE_HIGH (ce1);
+  tree max2 = CASE_HIGH (ce2);
+  if (max1 == NULL_TREE)
+	max1 = min1;
+  if (max2 == NULL_TREE)
+	max2 = min2;
+
+  if

[PATCH, testsuite]: There is no warning for stack check with large frames on alpha

2016-07-18 Thread Uros Bizjak

Hello!

Alpha uses its own stack checking routine (part of the ABI), and
doesn't use generic functionality.

Disable check for warning for large frames on alpha.

2016-07-18  Uros Bizjak  

* gcc.dg/pr70017.c: Do not check for warning on alpha*-*-*.

Tested on x86_64-linux-gnu and alpha-linux-gnu, committed to mainline SVN.

Uros.
diff --git a/gcc/testsuite/gcc.dg/pr70017.c b/gcc/testsuite/gcc.dg/pr70017.c
index 52586fe..f544167 100644
--- a/gcc/testsuite/gcc.dg/pr70017.c
+++ b/gcc/testsuite/gcc.dg/pr70017.c
@@ -17,4 +17,4 @@ void foo(void)
 #define ONE(s) a##s[0] = 0;
   HUNDRED(a)
   HUNDRED(b)
-} /* { dg-warning "frame size too large for reliable stack checking" } */
+} /* { dg-warning "frame size too large for reliable stack checking" "" { target { ! alpha*-*-* } } } */

Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-18 Thread Bin.Cheng

On Mon, Jul 18, 2016 at 4:28 PM, NightStrike  wrote:
> On Mon, Jul 18, 2016 at 3:55 AM, Bin.Cheng  wrote:
>> On Sat, Jul 16, 2016 at 6:28 PM, NightStrike  wrote:
>>> On Fri, Jul 15, 2016 at 1:07 PM, Bin Cheng  wrote:
>>>> Hi,
>>>> This patch removes support for -funsafe-loop-optimizations, as well as
>>>> -Wunsafe-loop-optimizations.  By its name, this option does unsafe
>>>> optimizations by assuming all loops must terminate and don't wrap.
>>>> Unfortunately, it's not as useful as expected because:
>>>> 1) Simply assuming loop must terminate isn't enough.  What we really want
>>>> is to analyze scalar evolution and loop niter bound under such
>>>> assumptions.  This option does nothing in this aspect.
>>>> 2) IIRC, this option generates bogus code for some common programs, that's
>>>> why it's disabled by default even at Ofast level.
>>>>
>>>> After I sent patches handling possible infinite loops in both (scev/niter)
>>>> analyzer and vectorizer, it's a natural step to remove such options in
>>>> GCC.  This patch does so by deleting code for -funsafe-loop-optimizations,
>>>> as well as -Wunsafe-loop-optimizations.  It also deletes the two now
>>>> useless tests, while the option interface is preserved for backward
>>>> compatibility purpose.
>>>
>>> There are a number of bugs opened against those options, including one
>>> that I just opened rather recently:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71769
>>>
>>> but some go back far, in this case 9 years:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34114
>>>
>>> If you are going to remove the options, you should address open bugs
>>> related to those options.
>> Hi,
>> Thanks for pointing me to these PRs, I will have a look at them.
>
> I only highlighted two PRs, I was suggesting that you look for all of them.
>
>> IMHO, the old one reports weakness in loop niter analyzer, the issue
>> exists whether I remove unsafe-loop-optimization or not.  The new one
>> is a little bit trickier, I will put some comments on PR, and again,
>> the issue (if it is) is in niter analyzer which has nothing to do with
>> the option really.
>
> Well, one thing to note is that the warning is an easy way to get a
> notice of a possible missed optimization (and I have many more
> occurrences of it in a particular code base that I use).  If the
> warning is highlighting potential issues that aren't due to the -f
> option but are issues nonetheless, and we remove the warning, then how
> should I go about finding these missed opportunities in the future?
> Is there a different mechanism that does the same thing?
Hmm, good point, I will iterate the patch to see if I can remove only
-funsafe-loop-optimizations, while keeping -Wunsafe-loop-optimizations.

Thanks,
bin

Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-18 Thread NightStrike

On Mon, Jul 18, 2016 at 3:55 AM, Bin.Cheng  wrote:
> On Sat, Jul 16, 2016 at 6:28 PM, NightStrike  wrote:
>> On Fri, Jul 15, 2016 at 1:07 PM, Bin Cheng  wrote:
>>> Hi,
>>> This patch removes support for -funsafe-loop-optimizations, as well as 
>>> -Wunsafe-loop-optimizations.  By its name, this option does unsafe 
>>> optimizations by assuming all loops must terminate and don't wrap.  
>>> Unfortunately, it's not as useful as expected because:
>>> 1) Simply assuming loop must terminate isn't enough.  What we really want 
>>> is to analyze scalar evolution and loop niter bound under such assumptions. 
>>>  This option does nothing in this aspect.
>>> 2) IIRC, this option generates bogus code for some common programs, that's 
>>> why it's disabled by default even at Ofast level.
>>>
>>> After I sent patches handling possible infinite loops in both (scev/niter) 
>>> analyzer and vectorizer, it's a natural step to remove such options in GCC. 
>>>  This patch does so by deleting code for -funsafe-loop-optimizations, as 
>>> well as -Wunsafe-loop-optimizations.  It also deletes the two now useless 
>>> tests, while the option interface is preserved for backward compatibility 
>>> purpose.
>>
>> There are a number of bugs opened against those options, including one
>> that I just opened rather recently:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71769
>>
>> but some go back far, in this case 9 years:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34114
>>
>> If you are going to remove the options, you should address open bugs
>> related to those options.
> Hi,
> Thanks for pointing me to these PRs, I will have a look at them.

I only highlighted two PRs, I was suggesting that you look for all of them.

> IMHO, the old one reports weakness in loop niter analyzer, the issue
> exists whether I remove unsafe-loop-optimization or not.  The new one
> is a little bit trickier, I will put some comments on PR, and again,
> the issue (if it is) is in niter analyzer which has nothing to do with
> the option really.

Well, one thing to note is that the warning is an easy way to get a
notice of a possible missed optimization (and I have many more
occurrences of it in a particular code base that I use).  If the
warning is highlighting potential issues that aren't due to the -f
option but are issues nonetheless, and we remove the warning, then how
should I go about finding these missed opportunities in the future?
Is there a different mechanism that does the same thing?

Re: Importing gnulib into the gcc tree

2016-07-18 Thread ayush goel

Replies inline

--  
Thanks,  
Ayush Goel

On 17 July 2016 at 9:44:27 PM, Manuel López-Ibáñez (lopeziba...@gmail.com) 
wrote:
> On 16 July 2016 at 10:54, ayush goel wrote:
> > Hi,
> > Thanks for the feedback.
> >
> > -> I’m already configuring gcc with multiple languages and multilib enabled
> >
> > -> The changes have been bootstrapped and regression tested (complete
> > check, make -k -j20 check).
> >
> > -> As mentioned, I have locally removed obstack.[ch] from libiberty and
> > built and tested the entire thing.
> >
> > PFA the patch
>  
> This sounds great to me, but I cannot approve it. I hope some of the
> people who can will comment on it.
>  
> One thing that I miss is documenting gnulib in doc/sourcebuild.texi.

I’ve added gnulib as an entry in sourcebuild.texi

> It would be good to document in particular how to add a new module.
> GDB has: 
> https://sourceware.org/gdb/wiki/DeveloperTips#Updating_GDB.27s_import_of_gnulib
>   
> but I think this info should be in sourcebuild.texi (or somewhere else
> under doc/).
> 

This wiki page talks mostly about how to update the version of gnulib present 
inside the gdb directory.
I’ve instead created a gnulib-import.texi file inside /doc with bullet points 
on how to import a new module.

> I see several other mentions of libiberty in doc/, but some of them
> may be just using libiberty as an example, thus not relevant.
> 

Yes, there are several other mentions of libiberty (in contrib.texi, 
install.texi, invoke.texi, etc.); however, these are not relevant to us.

I’m attaching a patch just containing the changes made in the /doc. Once these 
are approved by the community, I’ll add them to the main patch and resubmit it.

PFA


> Cheers,
>  
> Manuel.
>  


gcc_doc.patch
Description: Binary data

Re: C++ PATCHes to mangling of sizeof... and fold-expressions

2016-07-18 Thread Jason Merrill

On Fri, Jul 15, 2016 at 12:52 PM, Jason Merrill  wrote:
> Similarly, 71711 shows that we never implemented mangling of C++17
> fold-expressions, or partial instantiation of them.

...but this patch didn't implement demangling.  So here's that part.
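
As a rough guide to the output forms the patch below produces, here is a small standalone sketch.  It is not demangler code: toy_format_fold is a name invented here, it takes plain strings rather than demangle components, and it ignores any extra parentheses the real printer may add around operands.  KIND is the second letter of the operator code (fl, fr, fL, fR).

```c
#include <stdio.h>

/* Write the fold-expression of kind KIND over operator OP into BUF.
   OP1 is the first operand and OP2 the second (unused for the unary
   forms).  Returns what snprintf returns, or -1 for an unknown kind.  */
static int
toy_format_fold (char *buf, size_t size, char kind,
                 const char *op, const char *op1, const char *op2)
{
  switch (kind)
    {
      /* Unary left fold, (... + X).  */
    case 'l':
      return snprintf (buf, size, "(...%s%s)", op, op1);

      /* Unary right fold, (X + ...).  */
    case 'r':
      return snprintf (buf, size, "(%s%s...)", op1, op);

      /* Binary folds: both print as (A + ... + B).  */
    case 'L':
    case 'R':
      return snprintf (buf, size, "(%s%s...%s%s)", op1, op, op, op2);

    default:
      return -1;
    }
}
```

Note that the binary left and right folds share one output path, as in the switch in d_maybe_print_fold_expression.
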
commit 6854c5ef05271835dd489d3668284d927ec8b394
Author: Jason Merrill 
Date:   Fri Jul 15 14:33:54 2016 -0400

Demangle C++17 fold-expressions.

* cp-demangle.c (cplus_demangle_operators): Add f[lrLR].
(d_expression_1): Handle them.
(d_maybe_print_fold_expression): New.
(d_print_comp_inner): Use it.
(d_index_template_argument): Handle negative index.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 56d3bcb..0c6d714 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -344,7 +344,7 @@ struct d_print_info
   /* Set to 1 if we saw a demangling error.  */
   int demangle_failure;
   /* The current index into any template argument packs we are using
- for printing.  */
+ for printing, or -1 to print the whole pack.  */
   int pack_index;
   /* Number of d_print_flush calls so far.  */
   unsigned long int flush_count;
@@ -1762,6 +1762,10 @@ const struct demangle_operator_info cplus_demangle_operators[] =
   { "eO", NL ("^="),2 },
   { "eo", NL ("^"), 2 },
   { "eq", NL ("=="),2 },
+  { "fL", NL ("..."),   3 },
+  { "fR", NL ("..."),   3 },
+  { "fl", NL ("..."),   2 },
+  { "fr", NL ("..."),   2 },
   { "ge", NL (">="),2 },
   { "gs", NL ("::"),   1 },
   { "gt", NL (">"), 2 },
@@ -3305,6 +3309,9 @@ d_expression_1 (struct d_info *di)
  return NULL;
if (op_is_new_cast (op))
  left = cplus_demangle_type (di);
+   else if (code[0] == 'f')
+ /* fold-expression.  */
+ left = d_operator_name (di);
else
  left = d_expression_1 (di);
if (!strcmp (code, "cl"))
@@ -3339,6 +3346,13 @@ d_expression_1 (struct d_info *di)
second = d_expression_1 (di);
third = d_expression_1 (di);
  }
+   else if (code[0] == 'f')
+ {
+   /* fold-expression.  */
+   first = d_operator_name (di);
+   second = d_expression_1 (di);
+   third = d_expression_1 (di);
+ }
else if (code[0] == 'n')
  {
/* new-expression.  */
@@ -4196,13 +4210,17 @@ cplus_demangle_print (int options, const struct demangle_component *dc,
 }
 
 /* Returns the I'th element of the template arglist ARGS, or NULL on
-   failure.  */
+   failure.  If I is negative, return the entire arglist.  */
 
 static struct demangle_component *
 d_index_template_argument (struct demangle_component *args, int i)
 {
   struct demangle_component *a;
 
+  if (i < 0)
+/* Print the whole argument pack.  */
+return args;
+
   for (a = args;
a != NULL;
a = d_right (a))
@@ -4402,6 +4420,70 @@ d_get_saved_scope (struct d_print_info *dpi,
   return NULL;
 }
 
+/* If DC is a C++17 fold-expression, print it and return true; otherwise
+   return false.  */
+
+static int
+d_maybe_print_fold_expression (struct d_print_info *dpi, int options,
+  const struct demangle_component *dc)
+{
+  const struct demangle_component *ops, *operator_, *op1, *op2;
+  int save_idx;
+
+  const char *fold_code = d_left (dc)->u.s_operator.op->code;
+  if (fold_code[0] != 'f')
+return 0;
+
+  ops = d_right (dc);
+  operator_ = d_left (ops);
+  op1 = d_right (ops);
+  op2 = 0;
+  if (op1->type == DEMANGLE_COMPONENT_TRINARY_ARG2)
+{
+  op2 = d_right (op1);
+  op1 = d_left (op1);
+}
+
+  /* Print the whole pack.  */
+  save_idx = dpi->pack_index;
+  dpi->pack_index = -1;
+
+  switch (fold_code[1])
+{
+  /* Unary left fold, (... + X).  */
+case 'l':
+  d_append_string (dpi, "(...");
+  d_print_expr_op (dpi, options, operator_);
+  d_print_subexpr (dpi, options, op1);
+  d_append_char (dpi, ')');
+  break;
+
+  /* Unary right fold, (X + ...).  */
+case 'r':
+  d_append_char (dpi, '(');
+  d_print_subexpr (dpi, options, op1);
+  d_print_expr_op (dpi, options, operator_);
+  d_append_string (dpi, "...)");
+  break;
+
+  /* Binary left fold, (42 + ... + X).  */
+case 'L':
+  /* Binary right fold, (X + ... + 42).  */
+case 'R':
+  d_append_char (dpi, '(');
+  d_print_subexpr (dpi, options, op1);
+  d_print_expr_op (dpi, options, operator_);
+  d_append_string (dpi, "...");
+  d_print_expr_op (dpi, options, operator_);
+  d_print_subexpr (dpi, options, op2);
+  d_append_char (dpi, ')');
+  break;
+}
+
+  dpi->pack_index = save_idx;
+  return 1;
+}
+
 /* Subroutine to handle components.  */
 
 static void
@@ -5218,6 +5300,9 @@ d_print_comp_inner (struct

Re: [patch] Add new hook to diagnose address space usage

2016-07-18 Thread Bernd Schmidt


On 07/14/2016 05:11 PM, Georg-Johann Lay wrote:
> The hook allows better diagnostics:  The address spaces are registered
> with c_register_addr_space and if the parser comes across an address
> space it provides the hook with the needed information, in particular
> the location of the token so that the message would be something like

Looks reasonable, except...

> +(diagnose_usage,
> + "Define this hook if the availability of an address space depends on\n\
> +command line options and some diagnostics shall be printed when the\n\

"should", not "shall", I think.

> +bool
> +default_addr_space_diagnose_usage (addr_space_t ARG_UNUSED (as),
> +  location_t ARG_UNUSED (loc))
> +{
> +  return false;
> +}


The return value is not used, so it should return void. That would also 
match the documentation you added (which says "does nothing" rather than 
"returns false").


Remove unused arg names in default hook implementations, I think.


Bernd
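
A minimal sketch of what the suggested shape could look like, assuming a hook that is called purely for its side effects. All names below (`addr_space_hooks`, `example_target_diagnose_usage`, the typedefs) are stand-ins invented for the example, not the real GCC declarations. The sketch is plain C, where parameter names cannot simply be omitted from a definition, so it uses `(void)` casts instead.

```c
/* Illustrative stand-ins, not the real GCC types.  */
typedef unsigned char addr_space_t;
typedef unsigned int location_t;

struct addr_space_hooks
{
  /* Returns nothing: callers only want diagnostics as a side effect.  */
  void (*diagnose_usage) (addr_space_t as, location_t loc);
};

/* Default implementation: does nothing, matching the documentation.  */
static void
default_diagnose_usage (addr_space_t as, location_t loc)
{
  (void) as;
  (void) loc;
}

/* Counts diagnostics instead of printing them, to keep the sketch small.  */
static int diagnostics_emitted;

/* Hypothetical target override that rejects address space 1.  */
static void
example_target_diagnose_usage (addr_space_t as, location_t loc)
{
  (void) loc;
  if (as == 1)
    diagnostics_emitted++;
}
```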

Shared libbackend.so (was: [PATCH] Avoid invoking ranlib on libbackend.a)

2016-07-18 Thread Michael Matz

Hi,

On Mon, 18 Jul 2016, Jakub Jelinek wrote:

> On Mon, Jul 18, 2016 at 02:32:40PM +0200, Richard Biener wrote:
> > While eliding ranlib sounds like a no-brainer the real benefit (I/O wise) is
> > when you get rid of the archive or save link time by creating a (partially)
> > linked DSO.  ISTR Michael Matz has patches to do that.  Whether it's
> 
> DSO?  Then we'd have to build everything with -fpic (which we only do when
> building gccjit).  Did you mean just a relocatable object (ld -r) instead?

Yes, a real DSO, and yes with -fpic.  I had hacks in the compiler to make 
functions be protected visibility (and only those, as protected data is a 
PITA) to retain inlining effects to not lose too much performance (and 
some hackery in binutils to make this work as intended).  At that time 
(years ago) a cc1 with shared libbackend.so was even faster than a static 
cc1 (!).

When dusting off that stuff I realized that -fPIE has similar effects 
(symbols aren't preemptable, so inlining still happens, and the code is 
mostly appropriate for a shared object).  But I ran into the same binutils 
artifacts again and haven't found time to really continue with that.  
Some quick hack in binutils to not error out on direct PC-relative 
references to data symbols (for fear of copy relocs) shows that a 
sort-of-shared cc1+libbackend.so is about 1% slower than a static one (on 
some fold-const.i file I had) on x86-64.  These cc1 were built with -O0 
and checking enabled.

If you want to play with that, attached are two diffs, one for GCC, one 
for binutils.  Configure with --enable-backend-shared, and then play with 
the two BACKENDPICFLAG and NO_PIE_CFLAGS flags (the latter should contain 
-fPIE so that no copy relocs are created when BACKENDPICFLAG contains 
-fvisibility=protected).  The binutils hack is only needed if 
libbackend.so is built with -fPIE.

The above timing is with BACKENDPICFLAG="-fPIE -fvisibility=protected" 
and NO_PIE_CFLAGS=-fPIE and the binutils hack.

I still wanted to reimplement the visibility for functions only flag and 
remeasure.  Then the binutils part wouldn't be necessary.


Ciao,
Michael.

Index: Makefile.in
===
--- Makefile.in	(revision 235171)
+++ Makefile.in	(working copy)
@@ -150,6 +150,7 @@ LDFLAGS = @LDFLAGS@
 
 # Should we build position-independent host code?
 PICFLAG = @PICFLAG@
+BACKENDPICFLAG = @BACKENDPICFLAG@
 
 # Flags to determine code coverage. When coverage is disabled, this will
 # contain the optimization flags, as you normally want code coverage
@@ -382,6 +383,7 @@ PLUGINLIBS = @pluginlibs@
 enable_plugin = @enable_plugin@
 
 enable_host_shared = @enable_host_shared@
+enable_backend_shared = @enable_backend_shared@
 
 enable_as_accelerator = @enable_as_accelerator@
 
@@ -1563,7 +1566,12 @@ ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OB
 # compilation or not.
 ALL_HOST_OBJS = $(ALL_HOST_FRONTEND_OBJS) $(ALL_HOST_BACKEND_OBJS)
 
-BACKEND = libbackend.a main.o libcommon-target.a libcommon.a \
+ifeq ($(enable_backend_shared),yes)
+LIBBACKEND = libbackend.so
+else
+LIBBACKEND = libbackend.a
+endif
+BACKEND = $(LIBBACKEND) main.o libcommon-target.a libcommon.a \
 	$(CPPLIB) $(LIBDECNUMBER)
 
 # This is defined to "yes" if Tree checking is enabled, which roughly means
@@ -1587,7 +1595,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-con
  gcc-ranlib$(exeext) \
  gcov-iov$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \
  gcov-tool$(exeect) \
- gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c libbackend.a \
+ gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c $(LIBBACKEND) \
  libcommon-target.a libcommon.a libgcc.mk
 
 # This symlink makes the full installation name of the driver be available
@@ -1842,12 +1850,20 @@ rest.cross: specs
 # This is used only if the user explicitly asks for it.
 compilations: $(BACKEND)
 
+ifeq ($(enable_backend_shared),yes)
+$(OBJS): INTERNAL_CFLAGS += $(BACKENDPICFLAG)
+endif
+
 # This archive is strictly for the host.
 libbackend.a: $(OBJS)
 	-rm -rf libbackend.a
 	$(AR) $(AR_FLAGS) libbackend.a $(OBJS)
 	-$(RANLIB) $(RANLIB_FLAGS) libbackend.a
 
+libbackend.so: $(OBJS)
+	-rm -rf libbackend.so
+	$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -shared -o $@ $(OBJS)
+
 libcommon-target.a: $(OBJS-libcommon-target)
 	-rm -rf libcommon-target.a
 	$(AR) $(AR_FLAGS) libcommon-target.a $(OBJS-libcommon-target)
Index: configure.ac
===
--- configure.ac	(revision 235171)
+++ configure.ac	(working copy)
@@ -5991,6 +5991,12 @@ AC_ARG_ENABLE(host-shared,
 AC_SUBST(enable_host_shared)
 AC_SUBST(PICFLAG)
 
+AC_ARG_ENABLE(backend-shared,
+[AS_HELP_STRING([--enable-backend-shared],
+		[build backend code as shared library])],
+[BACKENDPICFLAG=-fPIC], [BACKENDPICFLAG=])
+AC_SUBST(enable_backend_shared)
+AC_SUBST(BACKENDPICFLAG)
 
 AC_ARG_ENABLE(libquadmath-support,
 [AS_HELP_STRING([--disable-libquadmath-support],
Index: configure

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Patrick Palka

On Mon, Jul 18, 2016 at 9:53 AM, Segher Boessenkool
 wrote:
> On Mon, Jul 18, 2016 at 09:05:13AM -0400, Patrick Palka wrote:
>> On Mon, Jul 18, 2016 at 8:44 AM, Segher Boessenkool
>>  wrote:
>> > On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
>> >> Or, if using GNU ar, you can even use -S, if that helps (after testing
>> >> for it in configure, of course).
>> >
>> > I meant -T.  Some day I will learn how to type, promise!
>>
>> This really helps!   libbackend.a gets built instantly with -T.
>
> Yes, but how does it influence link time?

No significant influence.  Linking cc1plus takes about 6 seconds on my
machine regardless of whether libbackend.a is built with -T or not.

>
>
> Segher

Re: [v3 PATCH] Minor comment cleanup on optional.

2016-07-18 Thread Jonathan Wakely


On 18/07/16 15:43 +0300, Ville Voutilainen wrote:

   Clean up optional's comments.
   * include/std/optional: Remove incorrect section headers
   from comments when redundant, replace bare section
   headers with more descriptive comments.


OK for trunk, thanks.

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Segher Boessenkool

On Mon, Jul 18, 2016 at 09:05:13AM -0400, Patrick Palka wrote:
> On Mon, Jul 18, 2016 at 8:44 AM, Segher Boessenkool
>  wrote:
> > On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
> >> Or, if using GNU ar, you can even use -S, if that helps (after testing
> >> for it in configure, of course).
> >
> > I meant -T.  Some day I will learn how to type, promise!
> 
> This really helps!   libbackend.a gets built instantly with -T.

Yes, but how does it influence link time?


Segher

Re: Ping: Fix PR fortran/71688

2016-07-18 Thread Paul Richard Thomas

Dear Martin,

This looks like an 'obvious' fix. OK for all the branches, 4.9->trunk.

Thanks for the patch

Paul

On 18 July 2016 at 14:53, Martin Jambor  wrote:
> Ping (this time also CCing fort...@gcc.gnu and Honza).
>
> I really think this should be backported to 4.9 in time for the last
> release.
>
> Thanks,
>
> Martin
>
> - Original message from Martin Jambor  -
>
> Date: Thu, 30 Jun 2016 11:13:17 +0200
> From: Martin Jambor 
> To: GCC Patches 
> Subject: Fix PR fortran/71688
>
> Hi,
>
> PR 71688 is about an ICE in cgraphunit.c caused by the fact that
> Fortran FE creates two separate call-graph nodes for a single function
> decl, if you are interested, complete backtraces leading to the point
> of creating them are in bugzilla.
>
> The intuitive fix, changing one of these points so that they call
> cgraph_node::get_create rather than cgraph_node::create works and given the
> comment just before the line also seems like the correct thing to do:
>
>   /* Register this function with cgraph just far enough to get it
>  added to our parent's nested function list.
>  If there are static coarrays in this function, the nested _caf_init
>  function has already called cgraph_create_node, which also created
>  the cgraph node for this function.  */
>
> It is interesting that the bug lurked so long there.  I have
> bootstrapped and tested the patch below on x86_64-linux, is it OK for
> trunk and (after a while) for all active release branches?
>
> Thanks,
>
> Martin
>
>
> 2016-06-29  Martin Jambor  
>
> PR fortran/71688
> * trans-decl.c (gfc_generate_function_code): Use get_create rather
> than create to get a call graph node.
>
> testsuite/
> gfortran.dg/pr71688.f90: New test.
>
>
> diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
> index 2f5e434..0e68736 100644
> --- a/gcc/fortran/trans-decl.c
> +++ b/gcc/fortran/trans-decl.c
> @@ -6336,7 +6336,7 @@ gfc_generate_function_code (gfc_namespace * ns)
>  function has already called cgraph_create_node, which also created
>  the cgraph node for this function.  */
>if (!has_coarray_vars || flag_coarray != GFC_FCOARRAY_LIB)
> -   (void) cgraph_node::create (fndecl);
> +   (void) cgraph_node::get_create (fndecl);
>  }
>else
>  cgraph_node::finalize_function (fndecl, true);
> diff --git a/gcc/testsuite/gfortran.dg/pr71688.f90 
> b/gcc/testsuite/gfortran.dg/pr71688.f90
> new file mode 100644
> index 000..dbb6d18
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr71688.f90
> @@ -0,0 +1,13 @@
> +! { dg-do compile }
> +! { dg-options "-fcoarray=lib" }
> +
> +program p
> +   call s
> +contains
> +   subroutine s
> +  real :: x[*] = 1
> +  block
> +  end block
> +  x = 2
> +   end
> +end
>
> - End original message -



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein
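
To illustrate why get_create is the safer call when a node may already exist, here is a toy sketch of the two behaviours. The `decl` and `node` structs and both functions are invented for the example and only mimic the create versus get-or-create distinction; they are not the cgraph API.

```c
#include <stdlib.h>

struct node;

/* Toy stand-ins for a function decl and its call-graph node.  */
struct decl { struct node *node; };
struct node { struct decl *decl; };

static int nodes_created;

/* Always allocates: calling this twice for one decl yields two nodes.  */
static struct node *
node_create (struct decl *d)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->decl = d;
  d->node = n;
  nodes_created++;
  return n;
}

/* Returns the existing node when there is one, otherwise creates it.  */
static struct node *
node_get_create (struct decl *d)
{
  return d->node ? d->node : node_create (d);
}
```

With the get-or-create variant, a second registration of the same decl is harmless, which is the property the one-line fix relies on.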

[PATCH 3/4] Support movbe as a i386 target optimization node

2016-07-18 Thread marxin

gcc/testsuite/ChangeLog:

2016-07-18  Martin Liska  

* gcc.target/i386/movbe-4.c: New test.

gcc/ChangeLog:

2016-07-18  Martin Liska  

* config/i386/i386.c (ix86_valid_target_attribute_inner_p):
Handle movbe.
---
 gcc/config/i386/i386.c  |  1 +
 gcc/testsuite/gcc.target/i386/movbe-4.c | 20 
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/movbe-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 493b7e6..8a30cf2 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6437,6 +6437,7 @@ ix86_valid_target_attribute_inner_p (tree args, char 
*p_strings[],
 IX86_ATTR_ISA ("mmx",  OPT_mmmx),
 IX86_ATTR_ISA ("pclmul",   OPT_mpclmul),
 IX86_ATTR_ISA ("popcnt",   OPT_mpopcnt),
+IX86_ATTR_ISA ("movbe",OPT_mmovbe),
 IX86_ATTR_ISA ("crc32",OPT_mcrc32),
 IX86_ATTR_ISA ("sse",  OPT_msse),
 IX86_ATTR_ISA ("sse2", OPT_msse2),
diff --git a/gcc/testsuite/gcc.target/i386/movbe-4.c 
b/gcc/testsuite/gcc.target/i386/movbe-4.c
new file mode 100644
index 000..9067091
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/movbe-4.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#pragma GCC target ("movbe")
+
+extern int x;
+
+void
+foo (int i)
+{
+  x = __builtin_bswap32 (i);
+}
+
+int
+bar ()
+{
+  return __builtin_bswap32 (x);
+}
+
+/* { dg-final { scan-assembler-times "movbe\[ \t\]" 2 } } */
-- 
2.8.4

[PATCH 1/4] Fix PR target/71652

2016-07-18 Thread marxin

gcc/ChangeLog:

2016-07-18  Martin Liska  

PR target/71652
* config/i386/i386.c (ix86_option_override_internal): Change
signature and return false when there's an error related to
arch string.
(release_options_strings): New function.
(ix86_valid_target_attribute_tree): Call the function.

gcc/testsuite/ChangeLog:

2016-07-18  Martin Liska  

* gcc.target/i386/pr71652.c: New test.
* gcc.target/i386/pr71652-2.c: New test.
* gcc.target/i386/pr71652-3.c: New test.
---
 gcc/config/i386/i386.c| 62 +--
 gcc/testsuite/gcc.target/i386/pr71652-2.c | 13 +++
 gcc/testsuite/gcc.target/i386/pr71652-3.c | 14 +++
 gcc/testsuite/gcc.target/i386/pr71652.c   | 13 +++
 4 files changed, 83 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71652-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71652-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71652.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ba35dce..c838790 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4698,9 +4698,10 @@ ix86_override_options_after_change (void)
 
 /* Override various settings based on options.  If MAIN_ARGS_P, the
options are from the command line, otherwise they are from
-   attributes.  */
+   attributes.  Return false if there's an error related to march
+   option.  */
 
-static void
+static bool
 ix86_option_override_internal (bool main_args_p,
   struct gcc_options *opts,
   struct gcc_options *opts_set)
@@ -5243,16 +5244,32 @@ ix86_option_override_internal (bool main_args_p,
   for (i = 0; i < pta_size; i++)
 if (! strcmp (opts->x_ix86_arch_string, processor_alias_table[i].name))
   {
+   if (!strcmp (opts->x_ix86_arch_string, "generic"))
+ {
+   error ("generic CPU can be used only for %stune=%s %s",
+  prefix, suffix, sw);
+   return false;
+ }
+   else if (!strcmp (opts->x_ix86_arch_string, "intel"))
+ {
+   error ("intel CPU can be used only for %stune=%s %s",
+  prefix, suffix, sw);
+   return false;
+ }
+
+   if (TARGET_64BIT_P (opts->x_ix86_isa_flags)
+   && !(processor_alias_table[i].flags & PTA_64BIT))
+ {
+   error ("CPU you selected does not support x86-64 "
+  "instruction set");
+   return false;
+ }
+
ix86_schedule = processor_alias_table[i].schedule;
ix86_arch = processor_alias_table[i].processor;
/* Default cpu tuning to the architecture.  */
ix86_tune = ix86_arch;
 
-   if (TARGET_64BIT_P (opts->x_ix86_isa_flags)
-   && !(processor_alias_table[i].flags & PTA_64BIT))
- error ("CPU you selected does not support x86-64 "
-"instruction set");
-
if (processor_alias_table[i].flags & PTA_MMX
&& !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_MMX))
  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_MMX;
@@ -5450,13 +5467,7 @@ ix86_option_override_internal (bool main_args_p,
   if (TARGET_X32 && (ix86_isa_flags & OPTION_MASK_ISA_MPX))
 error ("Intel MPX does not support x32");
 
-  if (!strcmp (opts->x_ix86_arch_string, "generic"))
-error ("generic CPU can be used only for %stune=%s %s",
-  prefix, suffix, sw);
-  else if (!strcmp (opts->x_ix86_arch_string, "intel"))
-error ("intel CPU can be used only for %stune=%s %s",
-  prefix, suffix, sw);
-  else if (i == pta_size)
+  if (i == pta_size)
 error ("bad value (%s) for %sarch=%s %s",
   opts->x_ix86_arch_string, prefix, suffix, sw);
 
@@ -6045,6 +6056,8 @@ ix86_option_override_internal (bool main_args_p,
   ix86_parse_stringop_strategy_string (str, true);
   free (str);
 }
+
+  return true;
 }
 
 /* Implement the TARGET_OPTION_OVERRIDE hook.  */
@@ -6639,6 +6652,15 @@ ix86_valid_target_attribute_inner_p (tree args, char 
*p_strings[],
   return ret;
 }
 
+/* Release allocated strings.  */
+static void
+release_options_strings (char **option_strings)
+{
+  /* Free up memory allocated to hold the strings */
+  for (unsigned i = 0; i < IX86_FUNCTION_SPECIFIC_MAX; i++)
+free (option_strings[i]);
+}
+
 /* Return a TARGET_OPTION_NODE tree of the target options listed or NULL.  */
 
 tree
@@ -6653,7 +6675,6 @@ ix86_valid_target_attribute_tree (tree args,
   int orig_arch_specified = ix86_arch_specified;
   char *option_strings[IX86_FUNCTION_SPECIFIC_MAX] = { NULL, NULL };
   tree t = NULL_TREE;
-  int i;
   struct cl_target_option *def
 = TREE_TARGET_OPTION (target_option_default_node);
   struct gcc_options enum_opts_set;
@@ -6714,7 +6735,12 @@ ix86_valid_target_attribute_tree (tree args,
}
 
   /* Do any overrides, such as arch=xxx, or tune=xxx

[PATCH 4/4] Remove fused-madd from documentation

2016-07-18 Thread marxin

gcc/ChangeLog:

2016-07-18  Martin Liska  

* doc/extend.texi: Remove fused-madd from i386 target
options.
---
 gcc/doc/extend.texi | 5 -
 1 file changed, 5 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 5b9e617..30957ce 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5443,11 +5443,6 @@ Enable/disable the generation of the CLD before string 
moves.
 Enable/disable the generation of the @code{sin}, @code{cos}, and
 @code{sqrt} instructions on the 387 floating-point unit.
 
-@item fused-madd
-@itemx no-fused-madd
-@cindex @code{target("fused-madd")} function attribute, x86
-Enable/disable the generation of the fused multiply/add instructions.
-
 @item ieee-fp
 @itemx no-ieee-fp
 @cindex @code{target("ieee-fp")} function attribute, x86
-- 
2.8.4

[PATCH 2/4] Support crc32 as a i386 target optimization node

2016-07-18 Thread marxin

gcc/ChangeLog:

2016-07-18  Martin Liska  

* config/i386/i386.c (ix86_valid_target_attribute_inner_p):
Handle crc32.

gcc/testsuite/ChangeLog:

2016-07-18  Martin Liska  

* gcc.target/i386/crc32-5.c: New test.
---
 gcc/config/i386/i386.c  |  1 +
 gcc/testsuite/gcc.target/i386/crc32-5.c | 25 +
 2 files changed, 26 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/crc32-5.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c838790..493b7e6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6437,6 +6437,7 @@ ix86_valid_target_attribute_inner_p (tree args, char 
*p_strings[],
 IX86_ATTR_ISA ("mmx",  OPT_mmmx),
 IX86_ATTR_ISA ("pclmul",   OPT_mpclmul),
 IX86_ATTR_ISA ("popcnt",   OPT_mpopcnt),
+IX86_ATTR_ISA ("crc32",OPT_mcrc32),
 IX86_ATTR_ISA ("sse",  OPT_msse),
 IX86_ATTR_ISA ("sse2", OPT_msse2),
 IX86_ATTR_ISA ("sse3", OPT_msse3),
diff --git a/gcc/testsuite/gcc.target/i386/crc32-5.c 
b/gcc/testsuite/gcc.target/i386/crc32-5.c
new file mode 100644
index 000..a47f1e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/crc32-5.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "crc32b\[^\\n\]*eax" } } */
+/* { dg-final { scan-assembler "crc32w\[^\\n\]*eax" } } */
+/* { dg-final { scan-assembler "crc32l\[^\\n\]*eax" } } */
+
+#pragma GCC target ("crc32")
+
+unsigned int
+crc32b (unsigned int x, unsigned char y)
+{
+  return __builtin_ia32_crc32qi (x, y);
+}
+
+unsigned int
+crc32w (unsigned int x, unsigned short y)
+{
+  return __builtin_ia32_crc32hi (x, y);
+}
+
+unsigned int
+crc32d (unsigned int x, unsigned int y)
+{
+  return __builtin_ia32_crc32si (x, y);
+}
-- 
2.8.4

[PATCH 0/4] Properly handle GCC target("march=") (PR71652)

2016-07-18 Thread marxin

Hello.

The following small patch set targets $subject, where we ICE if someone
uses #pragma GCC target ("arch=generic").  My approach is not to
create a new target optimization node when the march string has an
invalid value.  This approach does not generate multiple errors.

Apart from that, I also improved i386 option handling as mentioned
in: [1]

Patch bootstraps and survives regression tests on powerpc64le-unknown-linux-gnu.

Ready for trunk?
Thanks,
Martin

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71652#c4

marxin (4):
  Fix PR target/71652
  Support crc32 as a i386 target optimization node
  Support movbe as a i386 target optimization node
  Remove fused-madd from documentation

 gcc/config/i386/i386.c| 64 ++-
 gcc/doc/extend.texi   |  5 ---
 gcc/testsuite/gcc.target/i386/crc32-5.c   | 25 
 gcc/testsuite/gcc.target/i386/movbe-4.c   | 20 ++
 gcc/testsuite/gcc.target/i386/pr71652-2.c | 13 +++
 gcc/testsuite/gcc.target/i386/pr71652-3.c | 14 +++
 gcc/testsuite/gcc.target/i386/pr71652.c   | 13 +++
 7 files changed, 130 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/crc32-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/movbe-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71652-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71652-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr71652.c

-- 
2.8.4
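
The approach described in the cover letter (report one error, return failure, release the option strings and build no node) can be sketched roughly as follows. This is a simplified model under stated assumptions: `override_options`, `release_option_strings` and `build_target_node` are hypothetical stand-ins, the "node" is just a copied string, and only the "generic" and "intel" rejections are modelled.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum { N_OPTION_STRINGS = 2 };

/* Counts reported errors instead of printing them.  */
static int errors_reported;

/* Stand-in for the option override step: false means the arch string was
   rejected and exactly one error has been reported.  */
static bool
override_options (const char *arch)
{
  if (!strcmp (arch, "generic") || !strcmp (arch, "intel"))
    {
      errors_reported++;
      return false;
    }
  return true;
}

/* Free every option string and clear the slots.  */
static void
release_option_strings (char **strings)
{
  for (unsigned i = 0; i < N_OPTION_STRINGS; i++)
    {
      free (strings[i]);
      strings[i] = NULL;
    }
}

/* Returns a heap-allocated "node" (here just the arch string), or NULL
   when the arch string was rejected.  The caller frees the result.  */
static char *
build_target_node (const char *arch)
{
  char *strings[N_OPTION_STRINGS] = { NULL, NULL };
  char *node = NULL;

  strings[0] = malloc (strlen (arch) + 1);
  if (!strings[0])
    abort ();
  strcpy (strings[0], arch);

  if (override_options (strings[0]))
    {
      node = strings[0];   /* Ownership moves to the caller.  */
      strings[0] = NULL;
    }
  release_option_strings (strings);
  return node;
}
```

Because the failure path returns before any node is built, a rejected string produces a single error rather than a cascade from later checks.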

[PATCH] Fix PR71907

2016-07-18 Thread Richard Biener


The following avoids PR71907 (tree_nonartificial_location failing)
by making sure not to drop BLOCK_ABSTRACT_ORIGIN completely but instead
to add self-references (basically a "this has been inlined" flag) for
BLOCKs that are not inlined_function_outer_scope_p.

LTO bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Without this, all fortify warnings are now silent because GCC 6 honors
the in-system-header flag of locations (-Wsystem-headers makes them
appear again).

Richard.

2016-07-18  Richard Biener  

PR lto/71907
* lto-streamer-out.c (DFS::DFS_write_tree_body): For blocks
with an abstract origin that is not an inlined function outer
scope add a self-reference as abstract origin.
* tree-streamer-out.c (write_ts_block_tree_pointers): Likewise.

Index: gcc/lto-streamer-out.c
===
*** gcc/lto-streamer-out.c  (revision 238426)
--- gcc/lto-streamer-out.c  (working copy)
*** DFS::DFS_write_tree_body (struct output_
*** 890,901 
/* Follow BLOCK_ABSTRACT_ORIGIN for the limited cases we can
 handle - those that represent inlined function scopes.
 For the drop rest them on the floor instead of ICEing
!in dwarf2out.c.  */
if (inlined_function_outer_scope_p (expr))
{
  tree ultimate_origin = block_ultimate_origin (expr);
  DFS_follow_tree_edge (ultimate_origin);
}
/* Do not follow BLOCK_NONLOCALIZED_VARS.  We cannot handle debug
 information for early inlined BLOCKs so drop it on the floor instead
 of ICEing in dwarf2out.c.  */
--- 890,905 
/* Follow BLOCK_ABSTRACT_ORIGIN for the limited cases we can
 handle - those that represent inlined function scopes.
 For the drop rest them on the floor instead of ICEing
!in dwarf2out.c, but keep the notion of whether the block
!is an inlined block by referring to itself for the sake of
!tree_nonartificial_location.  */
if (inlined_function_outer_scope_p (expr))
{
  tree ultimate_origin = block_ultimate_origin (expr);
  DFS_follow_tree_edge (ultimate_origin);
}
+   else if (BLOCK_ABSTRACT_ORIGIN (expr))
+   DFS_follow_tree_edge (expr);
/* Do not follow BLOCK_NONLOCALIZED_VARS.  We cannot handle debug
 information for early inlined BLOCKs so drop it on the floor instead
 of ICEing in dwarf2out.c.  */
Index: gcc/tree-streamer-out.c
===
*** gcc/tree-streamer-out.c (revision 238426)
--- gcc/tree-streamer-out.c (working copy)
*** write_ts_block_tree_pointers (struct out
*** 807,820 
  
/* Stream BLOCK_ABSTRACT_ORIGIN for the limited cases we can handle - those
   that represent inlined function scopes.
!  For the rest them on the floor instead of ICEing in dwarf2out.c.  */
if (inlined_function_outer_scope_p (expr))
  {
tree ultimate_origin = block_ultimate_origin (expr);
stream_write_tree (ob, ultimate_origin, ref_p);
  }
else
! stream_write_tree (ob, NULL_TREE, ref_p);
/* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug 
information
   for early inlined BLOCKs so drop it on the floor instead of ICEing in
   dwarf2out.c.  */
--- 807,823 
  
/* Stream BLOCK_ABSTRACT_ORIGIN for the limited cases we can handle - those
   that represent inlined function scopes.
!  For the rest them on the floor instead of ICEing in dwarf2out.c, but
!  keep the notion of whether the block is an inlined block by referring
!  to itself for the sake of tree_nonartificial_location.  */
if (inlined_function_outer_scope_p (expr))
  {
tree ultimate_origin = block_ultimate_origin (expr);
stream_write_tree (ob, ultimate_origin, ref_p);
  }
else
! stream_write_tree (ob, (BLOCK_ABSTRACT_ORIGIN (expr)
!   ? expr : NULL_TREE), ref_p);
/* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug 
information
   for early inlined BLOCKs so drop it on the floor instead of ICEing in
   dwarf2out.c.  */
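
The self-reference trick can be modelled with a toy block type, shown here as a sketch only. The struct, its field names and both functions are invented for the example. One simplification to be aware of: the real code streams block_ultimate_origin for outer scopes, while this toy writes the direct origin.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a BLOCK tree node; field names are illustrative.  */
struct block
{
  struct block *abstract_origin;
  bool inlined_function_outer_scope;
};

/* What gets written out as B's abstract origin.  */
static const struct block *
streamed_origin (const struct block *b)
{
  if (b->inlined_function_outer_scope)
    return b->abstract_origin;
  /* Keep only the fact that an origin existed, as a self-reference.  */
  return b->abstract_origin ? b : NULL;
}

/* Reader-side check: did this block come from an inlined body?  */
static bool
came_from_inlining (const struct block *streamed)
{
  return streamed != NULL;
}
```

The point of the design is that a non-NULL origin remains a usable "was inlined" flag even when the actual origin cannot be streamed.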

[PATCH] Fix PR71908

2016-07-18 Thread Richard Biener


The following fixes PR71908.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-07-18  Richard Biener  

PR tree-optimization/71908
* tree-ssa-structalias.c (get_constraint_for_component_ref): Handle
symbolic constants in a more reliable way.

* gcc.dg/torture/pr71908.c: New testcase.

Index: gcc/tree-ssa-structalias.c
===
*** gcc/tree-ssa-structalias.c  (revision 238426)
--- gcc/tree-ssa-structalias.c  (working copy)
*** get_constraint_for_component_ref (tree t
*** 3211,3216 
--- 3211,3230 
  
t = get_ref_base_and_extent (t, , , , );
  
+   /* We can end up here for component references on a
+  VIEW_CONVERT_EXPR <>() or things like a
+  BIT_FIELD_REF <[(void *) + 4B], ...>.  So for
+  symbolic constants simply give up.  */
+   if (TREE_CODE (t) == ADDR_EXPR)
+ {
+   constraint_expr result;
+   result.type = SCALAR;
+   result.var = anything_id;
+   result.offset = 0;
+   results->safe_push (result);
+   return;
+ }
+ 
/* Pretend to take the address of the base, we'll take care of
   adding the required subset of sub-fields below.  */
get_constraint_for_1 (t, results, true, lhs_p);
*** get_constraint_for_component_ref (tree t
*** 3298,3311 
else
result.offset += bitpos;
  }
-   else if (result.type == ADDRESSOF)
- {
-   /* We can end up here for component references on a
-  VIEW_CONVERT_EXPR <>().  */
-   result.type = SCALAR;
-   result.var = anything_id;
-   result.offset = 0;
- }
else
  gcc_unreachable ();
  }
--- 3312,3317 
Index: gcc/testsuite/gcc.dg/torture/pr71908.c
===
*** gcc/testsuite/gcc.dg/torture/pr71908.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr71908.c  (working copy)
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ 
+ struct S3
+ {
+   int f3;
+   int f5;
+   char f6;
+   int f7;
+ } b;
+ int a;
+ static struct S3 *c = &b;
+ int *d;
+ int main()
+ {
+   int i;
+   for (;;) {
+   a = 0;
+   int **e = &d;
+   i = 0;
+   for (; i < 2; i++)
+   d = &(*c).f5;
+   *e = d;
+   **e = 3;
+   }
+   return 0;
+ }

[PATCH] Fix PR71901 (and PR71893 in a different way)

2016-07-18 Thread Richard Biener


The following fixes the PRs by no longer storing an element-size
expression inside the reference ops for ARRAY_REFs; instead it stores
their original operand 3 unchanged.  This requires keeping track of the
element alignment and avoids any issues with not folding * /[ex] chains.
It also enables "real" VN of those ARRAY_REFs, where previously
things like valueization wouldn't have worked on expression ops.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-07-18  Richard Biener  

PR tree-optimization/71901
* tree-ssa-sccvn.h (struct vn_reference_op_struct): Add
align member, group stuff with the bitfield.
(vn_ref_op_align_unit): New inline.
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): For ARRAY_REFs
record element alignment and operand 3 unchanged.
(ao_ref_init_from_vn_reference): Adjust.
(valueize_refs_1): Likewise.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.

* gcc.dg/torture/pr71901.c: New testcase.

Index: gcc/tree-ssa-sccvn.h
===
*** gcc/tree-ssa-sccvn.h(revision 238426)
--- gcc/tree-ssa-sccvn.h(working copy)
*** typedef const struct vn_phi_s *const_vn_
*** 81,102 
  typedef struct vn_reference_op_struct
  {
ENUM_BITFIELD(tree_code) opcode : 16;
-   /* 1 for instrumented calls.  */
-   unsigned with_bounds : 1;
/* Dependence info, used for [TARGET_]MEM_REF only.  */
unsigned short clique;
unsigned short base;
/* Constant offset this op adds or -1 if it is variable.  */
HOST_WIDE_INT off;
tree type;
tree op0;
tree op1;
tree op2;
-   bool reverse;
  } vn_reference_op_s;
  typedef vn_reference_op_s *vn_reference_op_t;
  typedef const vn_reference_op_s *const_vn_reference_op_t;
  
  
  /* A reference operation in the hashtable is representation as
 the vuse, representing the memory state at the time of
--- 81,109 
  typedef struct vn_reference_op_struct
  {
ENUM_BITFIELD(tree_code) opcode : 16;
/* Dependence info, used for [TARGET_]MEM_REF only.  */
unsigned short clique;
unsigned short base;
+   /* 1 for instrumented calls.  */
+   unsigned with_bounds : 1;
+   unsigned reverse : 1;
+   /* For storing TYPE_ALIGN for array ref element size computation.  */
+   unsigned align : 6;
/* Constant offset this op adds or -1 if it is variable.  */
HOST_WIDE_INT off;
tree type;
tree op0;
tree op1;
tree op2;
  } vn_reference_op_s;
  typedef vn_reference_op_s *vn_reference_op_t;
  typedef const vn_reference_op_s *const_vn_reference_op_t;
  
+ inline unsigned
+ vn_ref_op_align_unit (vn_reference_op_t op)
+ {
+   return op->align ? ((unsigned)1 << (op->align - 1)) / BITS_PER_UNIT : 0;
+ }
  
  /* A reference operation in the hashtable is representation as
 the vuse, representing the memory state at the time of
Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 238426)
--- gcc/tree-ssa-pre.c  (working copy)
*** create_component_ref_by_pieces_1 (basic_
*** 2570,2584 
   here as the element alignment may be not visible.  See
   PR43783.  Simply drop the element size for constant
   sizes.  */
!   if (tree_int_cst_equal (genop3, TYPE_SIZE_UNIT (elmt_type)))
  genop3 = NULL_TREE;
else
  {
-   genop3 = size_binop (EXACT_DIV_EXPR, genop3,
-size_int (TYPE_ALIGN_UNIT (elmt_type)));
-   /* We may have a useless conversion added by
-  array_ref_element_size via copy_reference_opts_from_ref.  */
-   STRIP_USELESS_TYPE_CONVERSION (genop3);
genop3 = find_or_generate_expression (block, genop3, stmts);
if (!genop3)
  return NULL_TREE;
--- 2581,2593 
   here as the element alignment may be not visible.  See
   PR43783.  Simply drop the element size for constant
   sizes.  */
!   if (TREE_CODE (genop3) == INTEGER_CST
!   && wi::eq_p (wi::to_offset (TYPE_SIZE_UNIT (elmt_type)),
!(wi::to_offset (genop3)
! * vn_ref_op_align_unit (currop
  genop3 = NULL_TREE;
else
  {
genop3 = find_or_generate_expression (block, genop3, stmts);
if (!genop3)
  return NULL_TREE;
Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c(revision 238426)
--- gcc/tree-ssa-sccvn.c(working copy)
*** copy_reference_ops_from_ref (tree ref, v
*** 805,828 
  break;
case ARRAY_RANGE_REF:
case ARRAY_REF:
! /* Record index as
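
As a worked example of the arithmetic in vn_ref_op_align_unit, the sketch below pairs the decoder from the patch with an encoder. The encoder is an assumption, chosen as the inverse of the decoder (0 for "nothing recorded", otherwise log2 of the alignment in bits plus one); `encode_align`, `decode_align_unit` and `element_size_units` are names made up for the example, and BITS_PER_UNIT is assumed to be 8.

```c
enum { BITS_PER_UNIT = 8 };  /* Assumed here; target-dependent in GCC.  */

/* Assumed encoding, the inverse of the decoder below: 0 means no
   alignment recorded, otherwise log2 (alignment in bits) + 1.
   ALIGN_BITS must be zero or a power of two.  */
static unsigned
encode_align (unsigned align_bits)
{
  unsigned code = 0;
  while (align_bits)
    {
      code++;
      align_bits >>= 1;
    }
  return code;
}

/* Same arithmetic as vn_ref_op_align_unit in the patch: the alignment in
   units (bytes here).  Valid for codes up to 32 with a 32-bit unsigned.  */
static unsigned
decode_align_unit (unsigned code)
{
  return code ? ((unsigned) 1 << (code - 1)) / BITS_PER_UNIT : 0;
}

/* Element size in units when operand 3 counts alignment units.  */
static unsigned long
element_size_units (unsigned long op3, unsigned code)
{
  return op3 * decode_align_unit (code);
}
```

For instance, a 32-bit alignment encodes as 6 and decodes to 4 units, so an operand 3 of 3 stands for a 12-unit element.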

Re: [PATCH v5] Allocate constant size dynamic stack space in the prologue

2016-07-18 Thread Andreas Krebbel

> gcc/ChangeLog
> 
>   * cfgexpand.c (expand_stack_vars): Implement dynamic stack space
>   allocation in the prologue.
>   * explow.c (get_dynamic_stack_base): New function to return an address
>   expression for the dynamic stack base.
>   (get_dynamic_stack_size): New function to do the required dynamic stack
>   space size calculations.
>   (allocate_dynamic_stack_space): Use new functions.
>   (align_dynamic_address): Move some code from
>   allocate_dynamic_stack_space to new function.
>   * explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
> gcc/testsuite/ChangeLog
> 
>   * gcc.target/s390/warn-dynamicstack-1.c: New test.
>   * gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
>   stack-layout-dynamic-1.c: New test.

Applied.  Thanks!

-Andreas-

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Patrick Palka

On Mon, Jul 18, 2016 at 8:44 AM, Segher Boessenkool
 wrote:
> On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
>> Or, if using GNU ar, you can even use -S, if that helps (after testing
>> for it in configure, of course).
>
> I meant -T.  Some day I will learn how to type, promise!

This really helps!   libbackend.a gets built instantly with -T.

>
>
> Segher

GCC 4.9.4 Status Report (2016-07-18)

2016-07-18 Thread Richard Biener


Status
======

The GCC 4.9 branch is still open for regression and documentation fixes,
but given that GCC 6.2 is close, it's about time to close the branch with
a last release from it.  Thus in the next week I plan to do an RC
for GCC 4.9.4, followed by a release and the branch closing game.

Please consider backporting of important fixes (wrong-code, rejects-valid)
but err on the side of caution.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1                0
P2              140   +  40
P3               41   +  16
P4               66   +   2
P5               30   -   3
--------        ---   -----------------------
Total P1-P3     181   +  56
Total           277   +  51


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2015-06/msg00260.html

Ping: Fix PR fortran/71688

2016-07-18 Thread Martin Jambor

Ping (this time also CCing fort...@gcc.gnu and Honza).

I really think this should be backported to 4.9 in time for the last
release.

Thanks,

Martin

- Original message from Martin Jambor  -

Date: Thu, 30 Jun 2016 11:13:17 +0200
From: Martin Jambor 
To: GCC Patches 
Subject: Fix PR fortran/71688

Hi,

PR 71688 is about an ICE in cgraphunit.c caused by the fact that the
Fortran FE creates two separate call-graph nodes for a single function
decl.  If you are interested, complete backtraces leading to the point
of creating them are in bugzilla.

The intuitive fix, changing one of these points so that they call
cgraph_node::get_create rather than cgraph_node::create, works and, given
the comment just before the line, also seems like the correct thing to do:

  /* Register this function with cgraph just far enough to get it
 added to our parent's nested function list.
 If there are static coarrays in this function, the nested _caf_init
 function has already called cgraph_create_node, which also created
 the cgraph node for this function.  */

It is interesting that the bug lurked so long there.  I have
bootstrapped and tested the patch below on x86_64-linux, is it OK for
trunk and (after a while) for all active release branches?

Thanks,

Martin


2016-06-29  Martin Jambor  

PR fortran/71688
* trans-decl.c (gfc_generate_function_code): Use get_create rather
than create to get a call graph node.

testsuite/
gfortran.dg/pr71688.f90: New test.


diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 2f5e434..0e68736 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -6336,7 +6336,7 @@ gfc_generate_function_code (gfc_namespace * ns)
 function has already called cgraph_create_node, which also created
 the cgraph node for this function.  */
   if (!has_coarray_vars || flag_coarray != GFC_FCOARRAY_LIB)
-   (void) cgraph_node::create (fndecl);
+   (void) cgraph_node::get_create (fndecl);
 }
   else
 cgraph_node::finalize_function (fndecl, true);
diff --git a/gcc/testsuite/gfortran.dg/pr71688.f90 b/gcc/testsuite/gfortran.dg/pr71688.f90
new file mode 100644
index 0000000..dbb6d18
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr71688.f90
@@ -0,0 +1,13 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=lib" }
+
+program p
+   call s
+contains
+   subroutine s
+  real :: x[*] = 1
+  block
+  end block
+  x = 2
+   end
+end

- End original message -
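
The difference between "create" and "get or create" that the fix relies on
can be illustrated with a minimal sketch in plain C.  This is not GCC code:
the fixed-size table and the names node_create, node_get, node_get_create
and node_count are hypothetical stand-ins, and a "decl" is just an integer
id here.  Calling the create function twice for the same decl leaves two
nodes behind, which mirrors the situation described in the PR, while the
get-or-create variant returns the node that already exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a call-graph node table.  */
#define MAX_NODES 16

struct node { int decl; };

static struct node nodes[MAX_NODES];
static size_t n_nodes;

/* Always make a fresh node, even if DECL already has one.  */
static struct node *
node_create (int decl)
{
  assert (n_nodes < MAX_NODES);
  nodes[n_nodes].decl = decl;
  return &nodes[n_nodes++];
}

/* Return the first existing node for DECL, or NULL if there is none.  */
static struct node *
node_get (int decl)
{
  for (size_t i = 0; i < n_nodes; i++)
    if (nodes[i].decl == decl)
      return &nodes[i];
  return NULL;
}

/* Return the existing node for DECL, creating one only if needed.  */
static struct node *
node_get_create (int decl)
{
  struct node *n = node_get (decl);
  return n ? n : node_create (decl);
}

/* Count how many nodes are registered for DECL.  */
static size_t
node_count (int decl)
{
  size_t count = 0;
  for (size_t i = 0; i < n_nodes; i++)
    if (nodes[i].decl == decl)
      count++;
  return count;
}
```

With this sketch, two calls to node_get_create for the same id yield the
same pointer, whereas a later node_create for that id adds a duplicate.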

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Segher Boessenkool

On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
> Or, if using GNU ar, you can even use -S, if that helps (after testing
> for it in configure, of course).

I meant -T.  Some day I will learn how to type, promise!


Segher

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Richard Biener

On Mon, Jul 18, 2016 at 2:37 PM, Jakub Jelinek  wrote:
> On Mon, Jul 18, 2016 at 02:32:40PM +0200, Richard Biener wrote:
>> While eliding ranlib sounds like a no-brainer the real benefit (I/O wise) is
>> when you get rid of the archive or save link time by creating a (partially)
>> linked DSO.  ISTR Michael Matz has patches to do that.  Whether it's
>
> DSO?  Then we'd have to build everything with -fpic (which we only do when
> building gccjit).  Did you mean just a relocatable object (ld -r) instead?

I think his patches do a DSO (and yes, add -fpic).  But a relocatable object
is also possible of course (not sure if libtool provides support for creating
such a beast portably).

Richard.

> Jakub

[v3 PATCH] Minor comment cleanup on optional.

2016-07-18 Thread Ville Voutilainen

Tested on Linux-x64.

This should be fairly straightforward. :)

2016-07-18  Ville Voutilainen  

Clean up optional's comments.
* include/std/optional: Remove incorrect section headers
from comments when redundant, replace bare section
headers with more descriptive comments.
diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index 2ea4fdd..4c94dff 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -51,13 +51,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @{
*/
 
-  // All subsequent [X.Y.n] references are against n3793.
-
-  // [X.Y.4]
   template<typename _Tp>
 class optional;
 
-  // [X.Y.6]
   /// Tag type to disengage optional objects.
   struct nullopt_t
   {
@@ -72,11 +68,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 explicit constexpr nullopt_t(_Construct) { }
   };
 
-  // [X.Y.6]
   /// Tag to disengage optional objects.
   constexpr nullopt_t nullopt { nullopt_t::_Construct::_Token };
 
-  // [X.Y.7]
   /**
*  @brief Exception class thrown when a disengaged optional object is
*  dereferenced.
@@ -172,7 +166,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using _Stored_type = remove_const_t<_Tp>;
 
 public:
-  // [X.Y.4.1] Constructors.
 
   // Constructors for disengaged optionals.
   constexpr _Optional_base() noexcept
@@ -217,7 +210,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   this->_M_construct(std::move(__other._M_get()));
   }
 
-  // [X.Y.4.3] (partly) Assignment.
+  // Assignment operators.
   _Optional_base&
   operator=(const _Optional_base& __other)
   {
@@ -251,7 +244,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *this;
   }
 
-  // [X.Y.4.2] Destructor.
+  // Destructor.
   ~_Optional_base()
   {
 if (this->_M_engaged)
@@ -560,7 +553,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   explicit constexpr optional(optional<_Up>&& __t)
 : _Base(__t ? optional<_Tp>(std::move(*__t)) : optional<_Tp>()) { }
 
-  // [X.Y.4.3] (partly) Assignment.
+  // Assignment operators.
   optional&
   operator=(nullopt_t) noexcept
   {
@@ -650,9 +643,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  this->_M_construct(__il, std::forward<_Args>(__args)...);
}
 
-  // [X.Y.4.2] Destructor is implicit, implemented in _Optional_base.
+  // Destructor is implicit, implemented in _Optional_base.
 
-  // [X.Y.4.4] Swap.
+  // Swap.
   void
   swap(optional& __other)
   noexcept(is_nothrow_move_constructible<_Tp>()
@@ -674,7 +667,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  }
   }
 
-  // [X.Y.4.5] Observers.
+  // Observers.
   constexpr const _Tp*
   operator->() const
   { return __constexpr_addressof(this->_M_get()); }
@@ -777,7 +770,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using __optional_relop_t =
 enable_if_t<is_convertible<_Tp, bool>::value, bool>;
 
-  // [X.Y.8] Comparisons between optional values.
+  // Comparisons between optional values.
   template<typename _Tp>
 constexpr auto
 operator==(const optional<_Tp>& __lhs, const optional<_Tp>& __rhs)
@@ -828,7 +821,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return !__rhs || (static_cast<bool>(__lhs) && *__lhs >= *__rhs);
 }
 
-  // [X.Y.9] Comparisons with nullopt.
+  // Comparisons with nullopt.
   template<typename _Tp>
 constexpr bool
 operator==(const optional<_Tp>& __lhs, nullopt_t) noexcept
@@ -889,7 +882,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 operator>=(nullopt_t, const optional<_Tp>& __rhs) noexcept
 { return !__rhs; }
 
-  // [X.Y.10] Comparisons with value type.
+  // Comparisons with value type.
   template<typename _Tp>
 constexpr auto
 operator==(const optional<_Tp>& __lhs, const _Tp& __rhs)
@@ -962,7 +955,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 -> __optional_relop_t<decltype(declval<_Tp>() >= declval<_Tp>())>
 { return !__rhs || __lhs >= *__rhs; }
 
-  // [X.Y.11]
+  // Swap and creation functions.
   template<typename _Tp>
 inline void
 swap(optional<_Tp>& __lhs, optional<_Tp>& __rhs)
@@ -984,7 +977,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 make_optional(initializer_list<_Up> __il, _Args&&... __args)
 { return optional<_Tp> { in_place, __il, std::forward<_Args>(__args)... }; }
 
-  // [X.Y.12]
+  // Hash.
   template<typename _Tp>
 struct hash<optional<_Tp>>
 {

Re: [PATCH] c++/58796 Make nullptr match exception handlers of pointer type

2016-07-18 Thread Jonathan Wakely


On 18/07/16 11:51 +0100, Jonathan Wakely wrote:

say anything about the identity of the caught objects when nullptr is
thrown).


Not sure what happened to this sentence, it was supposed to say:

(The standard doesn't say anything about the identity of the caught
objects when nullptr is thrown).

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Jakub Jelinek

On Mon, Jul 18, 2016 at 02:32:40PM +0200, Richard Biener wrote:
> While eliding ranlib sounds like a no-brainer the real benefit (I/O wise) is
> when you get rid of the archive or save link time by creating a (partially)
> linked DSO.  ISTR Michael Matz has patches to do that.  Whether it's

DSO?  Then we'd have to build everything with -fpic (which we only do when
building gccjit).  Did you mean just a relocatable object (ld -r) instead?

Jakub

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Richard Biener

On Mon, Jul 18, 2016 at 1:35 PM, Segher Boessenkool
 wrote:
> On Sun, Jul 17, 2016 at 10:04:13PM -0400, Patrick Palka wrote:
>> I did some digging to figure out the origin of libbackend.a.  It was
>> created to work around a command line length limit on a VAX system
>> (https://gcc.gnu.org/ml/gcc-bugs/2000-06/msg00438.html).  The 1600 byte
>> command line that links cc1plus was too large for this system, so
>> libbackend.a was introduced and used in place of $(OBJS) to reduce the
>> size of this command line.
>
> As Zack said in that thread, at the top of the mail you refer to here:
> "Putting the back end in a library seems like a good idea to me, never
> mind whether or not we're really hitting command line length limits."
>
> Many people agree with that.  Some, anyway.
>
>> I suppose rebuild times could instead be further reduced by splitting
>> libbackend.a into multiple logical archives.  It would also be a small
>> step towards making GCC more modular.
>
> Is it faster to not have one big archive?  There are nine compilers.
>
> If you want to add -s to the ar flags (it is POSIX, most systems should
> have it by now), you should add a configure check (or find one, there is
> bound to be one somewhere already) to test for it, and then add it to
> AR_FLAGS (and there is also AR_CREATE_FOR_TARGET, it may be useful to
> add it there as well -- needs a separate test of course).
>
> Or, if using GNU ar, you can even use -S, if that helps (after testing
> for it in configure, of course).

While eliding ranlib sounds like a no-brainer the real benefit (I/O wise) is
when you get rid of the archive or save link time by creating a (partially)
linked DSO.  ISTR Michael Matz has patches to do that.  Whether it's
portable enough is another question (I suppose it simply uses libtool).

Richard.

>
> Segher

Re: [PATCH, rs6000] Fix vec_construct vectorization cost to be somewhat more accurate

2016-07-18 Thread Richard Biener

On Mon, Jul 18, 2016 at 1:56 PM, Segher Boessenkool
 wrote:
> Hi Bill,
>
> On Fri, Jul 15, 2016 at 08:55:08AM -0500, Bill Schmidt wrote:
>> This patch is a follow-up to Richard's patch of
>> https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00584.html.  The cost of a
>> vec_construct (initialization of an N-way vector by N scalars) is too low,
>> which can cause too-aggressive vectorization in particular for N=8 or
>> higher.  Richard changed the default cost to N-1, which is generally
>> sensible.  For powerpc I am going with a slightly higher cost of N, which
>> will keep us from being less conservative than the previous values when N=2.
>
>> In any case, the purpose of this patch is simply to avoid vectorizing
>> things we shouldn't when we've undercounted the cost of a vec_construct.
>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
>> regressions (hence the vectorization decisions in the test suite have
>> not changed).  Is this ok for trunk?
>
> Do you also have a testcase where it does matter?  It would be good to
> add that, then.  Or is it fixing a regression?
>
> I know nothing about the cost model, so someone else will have to review,
> or I can just say "okay" ;-)

You can maybe look at gcc.dg/vect/slp-4[35].c (and run it with the cost model
enabled).

Richard.

>
> Segher

Re: [PATCH][RFC] PR middle-end/22141 GIMPLE store widening pass

2016-07-18 Thread Richard Biener

On Fri, Jul 15, 2016 at 5:13 PM, Kyrill Tkachov
 wrote:
> Hi all,
>
> This is a GIMPLE pass to implement PR middle-end/22141, that is, to merge narrow
> stores of constants
> into fewer wider stores.  A 2009 patch from Jakub [1] contains many
> testcases but a simple motivating
> case can be:
>
> struct bar {
>   int a;
>   char b;
>   char c;
>   char d;
>   char e;
> }; // packed 64-bit structure
>
> void bar (struct bar *);
>
> void
> foo (struct bar *p)
> {
>   p->b = 0;
>   p->a = 0;
>   p->c = 0;
>   p->d = 1;
>   p->e = 0;
> }
>
> Currently on aarch64 this will generate:
> foo:
> mov w1, 1
> str wzr, [x0]
> strb wzr, [x0, 4]
> strb wzr, [x0, 5]
> strb w1, [x0, 6]
> strb wzr, [x0, 7]
> ret
>
> With this patch this can be improved into a single unaligned store:
> foo:
> mov x1, 0x1000000000000
> str x1, [x0]
> ret
>
> or, if compiled with -mstrict-align:
> foo:
> mov w1, 0x10000
> stp wzr, w1, [x0]
> ret
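
The constant used by the single wide store can be worked out by hand: on a
little-endian target the byte at offset 6 lands in bits 48..55 of the
64-bit value, so storing 1 to p->d and 0 everywhere else gives 1 << 48,
and the upper 32-bit half on its own is 1 << 16.  The sketch below does
that arithmetic in plain C.  It is only an illustration, not the pass from
the patch: struct const_store and merge_stores_le are hypothetical names,
and unlike the pass as described (which gives up on overlapping stores)
this sketch simply lets later stores overwrite earlier ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One constant store: VALUE written to SIZE bytes at byte OFFSET
   from the common base.  */
struct const_store
{
  unsigned offset;
  unsigned size;
  uint64_t value;
};

/* Merge N constant stores into one 64-bit little-endian constant.
   Assumes every store fits inside the 8-byte window.  */
static uint64_t
merge_stores_le (const struct const_store *s, size_t n)
{
  uint64_t merged = 0;
  for (size_t i = 0; i < n; i++)
    {
      assert (s[i].size >= 1 && s[i].offset + s[i].size <= 8);
      /* Mask covering SIZE bytes; avoid shifting by 64.  */
      uint64_t mask = s[i].size == 8
                      ? ~UINT64_C (0)
                      : (UINT64_C (1) << (8 * s[i].size)) - 1;
      unsigned shift = 8 * s[i].offset;
      merged &= ~(mask << shift);
      merged |= (s[i].value & mask) << shift;
    }
  return merged;
}
```

Feeding it the five stores from foo, in program order, yields
0x1000000000000 for the full word and 0x10000 for the upper half.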
>
> The pass is a tree-ssa pass that runs fairly late in the pipeline, after
> pass_optimize_widening_mul.
> I explain the approach taken in the comments in the new
> tree-ssa-store-widening.c file but essentially
> it has 3 phases applied to each basic block:
>
> 1) Scan through the statements recording assignments of constants to
> destinations like ARRAY_REF,
> COMPONENT_REF, MEM_REF which are determined to write to an ultimate common
> destination. get_inner_reference
> is used to decompose these destinations. Continue recording these until we
> encounter a statement that may
> interfere with the stores we've been recording (load or store that may
> alias, volatile access etc).
> These assignments of interest are recorded as store_immediate_info objects
> in the m_store_info vector.
>
> 2) Analyse the stores recorded in phase one (they all write to a destination
> offset from a common base)
> and merge them into wider assignments up to BITS_PER_WORD bits wide. These
> widened assignments are represented
> as merged_store_group objects and they are recorded in the
> m_merged_store_groups vector. This is the
> coalesce_immediate_stores function. It sorts the stores by the bitposition
> they write to and iterates through
> them, merging consecutive stores (it fails the transformation on overlapping
> stores, I don't think that case
> appears often enough to warrant extra logic) up to BITS_PER_WORD-wide
> accesses.
>
> 3) Go through the merged stores recorded in m_merged_store_groups and output
> each widened store. Widened stores
> that are not of a bitsize that is a power of two (for example 48 bits wide)
> are output as multiple stores of decreasing
> power-of-two width. So, for a widened store 48-bits wide this phase would
> emit a 32-bit store followed by a
> 16-bit store. The new sequence is only emitted if it contains fewer
> statements than the original sequence that it
> will replace.  This phase also avoids outputting unaligned stores for
> STRICT_ALIGNMENT targets or targets where
> SLOW_UNALIGNED_ACCESS forbids it. Since some configurations/targets may want
> to avoid generation of unaligned
> stores even when it is legal I've added the new param
> PARAM_STORE_WIDENING_ALLOW_UNALIGNED that can be used
> to disallow unaligned store generation.  Its default setting is to allow
> them (assuming that STRICT_ALIGNMENT
> and SLOW_UNALIGNED_ACCESS allows it).
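
The splitting rule in phase 3 above (a 48-bit merged store becomes a 32-bit
store followed by a 16-bit store) can be sketched as a small greedy loop.
This is a hedged illustration in plain C, not code from the patch:
split_store_width is a hypothetical name, widths are in bits, and
alignment and the "fewer statements than the original" check are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Split a merged store of BITS bits into chunks of decreasing
   power-of-two width, none wider than MAX_BITS.  Writes the chunk
   widths to WIDTHS (capacity CAP) and returns how many were written.
   BITS must be a multiple of 8 and MAX_BITS a power of two >= 8.  */
static size_t
split_store_width (unsigned bits, unsigned max_bits,
                   unsigned *widths, size_t cap)
{
  size_t n = 0;

  assert (bits % 8 == 0);
  assert (max_bits >= 8 && (max_bits & (max_bits - 1)) == 0);

  /* Try the widest store first, then halve the width.  */
  for (unsigned w = max_bits; w >= 8 && bits > 0; w /= 2)
    while (bits >= w)
      {
        assert (n < cap);
        widths[n++] = w;
        bits -= w;
      }
  return n;
}
```

For 48 bits with a 64-bit word this produces the widths 32 and 16; for 56
bits it produces 32, 16 and 8.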
>
> This is my first GIMPLE-level pass so please do point out places where I'm
> not using the interfaces correctly.
> This patch handles bitfields as well, but only if they are a multiple of
> BITS_PER_UNIT. It should be easily
> extensible to handle other bitfields as well, but I'm not entirely sure of
> the rules for laying out such bitfields
> and in particular the byteswap logic that needs to be applied for big-endian
> targets. If someone can shed some light
> on how they should be handled I'll be happy to try it out, but I believe this
> patch is an improvement as it is.
>
> This has been bootstrapped and tested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf and x86_64-unknown-linux-gnu.
> I've also tested it on the big-endian targets: armeb-none-eabi,
> aarch64_be-none-elf. Also tested aarch64-none-elf/-mabi=ilp32.
>
> I've benchmarked it on SPEC2006 on AArch64 on a Cortex-A72 and there were no
> regressions, the overall score improved a bit
> (about 0.1%). The interesting improvements were:
> 458.sjeng (+0.8%)
> 483.xalancbmk (+1.1%)
> 416.gamess(+1.0%)
> 454.calculix  (+1.1%)
>
> An interesting effect was in BZ2_decompress from bzip2 where at -Ofast it
> transformed a long sequence of constant
> byte stores into a much shorter sequence of word-size stores (from ~550
> instructions to ~190).
>
> On x86_64 SPECINT there was no change in the overall score. The code size at
> -Ofast is consistently smaller
> with this patch but the performance differences

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Andreas Schwab

Segher Boessenkool  writes:

> Or, if using GNU ar, you can even use -S, if that helps (after testing
> for it in configure, of course).

BSD ar has it too.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH, rs6000] Fix vec_construct vectorization cost to be somewhat more accurate

2016-07-18 Thread Segher Boessenkool

Hi Bill,

On Fri, Jul 15, 2016 at 08:55:08AM -0500, Bill Schmidt wrote:
> This patch is a follow-up to Richard's patch of
> https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00584.html.  The cost of a
> vec_construct (initialization of an N-way vector by N scalars) is too low,
> which can cause too-aggressive vectorization in particular for N=8 or
> higher.  Richard changed the default cost to N-1, which is generally
> sensible.  For powerpc I am going with a slightly higher cost of N, which
> will keep us from being less conservative than the previous values when N=2.

> In any case, the purpose of this patch is simply to avoid vectorizing
> things we shouldn't when we've undercounted the cost of a vec_construct.
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions (hence the vectorization decisions in the test suite have
> not changed).  Is this ok for trunk?

Do you also have a testcase where it does matter?  It would be good to
add that, then.  Or is it fixing a regression?

I know nothing about the cost model, so someone else will have to review,
or I can just say "okay" ;-)


Segher

Re: [RFC][IPA-VRP] Early VRP Implementation

2016-07-18 Thread Richard Biener

On Fri, Jul 15, 2016 at 9:33 AM, kugan
 wrote:
> Hi Andrew,
>
> On 15/07/16 17:28, Andrew Pinski wrote:
>>
>> On Fri, Jul 15, 2016 at 12:08 AM, kugan
>>  wrote:
>>>
>>> Hi Andrew,
>>>
>>>> Why separate out early VRP from tree-vrp?  Just a little curious.
>>>
>>>
>>>
>>> It is based on the discussion in
>>> https://gcc.gnu.org/ml/gcc/2016-01/msg00069.html.
>>> In summary, conclusion (based on my understanding) was to implement a
>>> simplified VRP algorithm that doesn't require ASSERT_EXPR insertion.
>>
>>
>> But I don't see why you are moving it from tree-vrp.c .  That was my
>> question, you pointing to that discussion does not say to split it
>> into a new file and expose these interfaces.
>>
>
> Are you saying that I should keep this as part of tree-vrp.c?  I am happy to do
> that if this is considered the best approach.

Yes, I think that's the best approach.

Can you, as a refactoring before your patch, please change VRP to use
an alloc-pool for allocating value_range?  The DOM-based VRP will add a
lot of malloc/free churn otherwise.

Generally watch coding-style such as missing function comments.

As you do a non-iterating VRP and thus do not visit back-edges you don't need
to initialize loops or SCEV nor do you need loop-closed SSA.

As you do a DOM-based VRP using SSA propagator stuff like ssa_prop_result
doesn't make any sense.

+edge evrp_dom_walker::before_dom_children (basic_block bb)
+{
+  /* If we are going out of scope, restore the old VR.  */
+  while (!cond_stack.is_empty ()
+&& !dominated_by_p (CDI_DOMINATORS, bb, cond_stack.last ().first))
+{
+  tree var = cond_stack.last ().second.first;
+  value_range *vr = cond_stack.last ().second.second;
+  value_range *vr_to_del = get_value_range (var);
+  XDELETE (vr_to_del);
+  change_value_range (var, vr);
+  cond_stack.pop ();
+}

that should be in after_dom_children I think and use a marker instead
of a DOM query.
See other examples like DOM itself or SCCVN.

+ /* Discover VR when condition is true.  */
+ if (te == e
+ && !TREE_OVERFLOW_P (op0)
+ && !TREE_OVERFLOW_P (op1))

you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE

why do you need those TREE_OVERFLOW checks?

+ tree cond = build2 (code, boolean_type_node, op0, op1);
+ tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
+ extract_range_from_assert (&vr, a);

so I was hoping that the "refactoring" patch in the series would expose a more
useful interface than extract_range_from_assert ... namely one that can
extract a range from the comparison directly and does not require building
a scratch ASSERT_EXPR.

+ /* If we found any usable VR, set the VR to ssa_name and create a
+restore point in the cond_stack with the  old VR. */
+ if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
+   {
+ value_range *new_vr = XCNEW (value_range);
+ *new_vr = vr;
+ cond_stack.safe_push (std::make_pair (bb,
+   std::make_pair (op0,
+   old_vr)));
+ change_value_range (op0, new_vr);

I don't like 'change_value_range' as interface, please integrate that into
a push/pop_value_range style interface instead.
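
A push/pop style interface with a per-block marker, as suggested here and
in the after_dom_children comment above, could look roughly like the
following plain C sketch.  It is an assumption-laden illustration, not the
interface of the patch or of GCC: variables are small integer ids, ranges
are plain min/max pairs, and enter_block, push_value_range and leave_block
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define N_VARS 8
#define STACK_MAX 32
#define MARKER (-1)

struct range { long min, max; };

/* A saved (variable, old range) pair; var == MARKER opens a block.  */
struct saved { int var; struct range old; };

static struct range current[N_VARS];
static struct saved stack[STACK_MAX];
static size_t depth;

/* Called when the walk enters a block: push a marker.  */
static void
enter_block (void)
{
  assert (depth < STACK_MAX);
  stack[depth++].var = MARKER;
}

/* Install R as the range of VAR, remembering the old range.  */
static void
push_value_range (int var, struct range r)
{
  assert (var >= 0 && var < N_VARS && depth < STACK_MAX);
  stack[depth].var = var;
  stack[depth].old = current[var];
  depth++;
  current[var] = r;
}

/* Called when the walk leaves a block: restore every range pushed
   since the matching marker, then drop the marker.  */
static void
leave_block (void)
{
  while (depth > 0 && stack[depth - 1].var != MARKER)
    {
      depth--;
      current[stack[depth].var] = stack[depth].old;
    }
  assert (depth > 0);
  depth--;
}
```

The point of the marker is that leaving a block needs no dominance query:
everything above the most recent marker belongs to the block being left.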

+   vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
+}
+
+  return NULL;

you should return taken_edge_p (misnamed as it isn't a pointer) and take
advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
defer this as a followup improvement).

Note that the advantage of a DOM-based VRP is that backtracking is easy
to implement (you don't do that yet).  That is, after DEF got a (better)
value-range you can simply re-visit all the defs of its uses (and recursively).
I think you have to be careful with stmts that might prematurely leave a BB
though (like via EH).  So sth for a followup as well.

Thanks,
Richard.

> Thanks,
> Kugan

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-18 Thread Segher Boessenkool

On Sun, Jul 17, 2016 at 10:04:13PM -0400, Patrick Palka wrote:
> I did some digging to figure out the origin of libbackend.a.  It was
> created to work around a command line length limit on a VAX system
> (https://gcc.gnu.org/ml/gcc-bugs/2000-06/msg00438.html).  The 1600 byte
> command line that links cc1plus was too large for this system, so
> libbackend.a was introduced and used in place of $(OBJS) to reduce the
> size of this command line.

As Zack said in that thread, at the top of the mail you refer to here:
"Putting the back end in a library seems like a good idea to me, never
mind whether or not we're really hitting command line length limits."

Many people agree with that.  Some, anyway.

> I suppose rebuild times could instead be further reduced by splitting
> libbackend.a into multiple logical archives.  It would also be a small
> step towards making GCC more modular.

Is it faster to not have one big archive?  There are nine compilers.

If you want to add -s to the ar flags (it is POSIX, most systems should
have it by now), you should add a configure check (or find one, there is
bound to be one somewhere already) to test for it, and then add it to
AR_FLAGS (and there is also AR_CREATE_FOR_TARGET, it may be useful to
add it there as well -- needs a separate test of course).

Or, if using GNU ar, you can even use -S, if that helps (after testing
for it in configure, of course).

Segher

Re: [RFC][IPA-VRP] Teach tree-vrp to use the VR set in params

2016-07-18 Thread Richard Biener

On Fri, Jul 15, 2016 at 6:47 AM, kugan
 wrote:
> Hi,
>
>
>
> This patch teaches tree-vrp to use the VR set in params.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 8c87c06..ad3891c 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -667,6 +667,20 @@ get_value_range (const_tree var)
  if (POINTER_TYPE_P (TREE_TYPE (sym))
  && nonnull_arg_p (sym))
set_value_range_to_nonnull (vr, TREE_TYPE (sym));
+ else if (!POINTER_TYPE_P (TREE_TYPE (sym)))

Please use INTEGRAL_TYPE_P

+   {
+ wide_int min, max;
+ value_range_type rtype = get_range_info (var, &min, &max);
+ if (rtype == VR_RANGE || rtype == VR_ANTI_RANGE)
+   {
+ vr->type = rtype;
+ vr->min = wide_int_to_tree (TREE_TYPE (var), min);
+ vr->max = wide_int_to_tree (TREE_TYPE (var), max);
+ vr->equiv = NULL;

Use

 set_value_range (vr, rtype, wide_int..., ..., NULL);

+   }
+ else
+   set_value_range_to_varying (vr);
+   }


Ok with that change.

Thanks,
Richard.

>
>
> Thanks,
>
> Kugan
>
>
>
> gcc/ChangeLog:
>
>
>
> 2016-07-14  Kugan Vivekanandarajah  
>
>
>
>  * tree-vrp.c (get_value_range): Teach PARM_DECL to use ipa-vrp
>
>  results.
>

Re: [PATCH, rs6000] Fix vec_construct vectorization cost to be somewhat more accurate

2016-07-18 Thread Richard Biener

On Fri, Jul 15, 2016 at 3:55 PM, Bill Schmidt
 wrote:
> Hi,
>
> This patch is a follow-up to Richard's patch of
> https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00584.html.  The cost of a
> vec_construct (initialization of an N-way vector by N scalars) is too low,
> which can cause too-aggressive vectorization in particular for N=8 or
> higher.  Richard changed the default cost to N-1, which is generally
> sensible.  For powerpc I am going with a slightly higher cost of N, which
> will keep us from being less conservative than the previous values when N=2.
>
> The whole cost model for powerpc needs more work (in particular we need
> to distinguish among processor models), but that's beyond the scope of
> this patch.  One thing that I've called out in the comments is that a
> vec_construct can have wildly different costs depending on the scalar
> elements.  If they are all the same small constant, then we only need
> a single splat-immediate instruction; but for V4SF the cost is potentially
> higher because of the need to do converts.  For the splat case, we might
> want to teach the vectorizer in general to estimate the cost as just
> a vector_stmt rather than a vec_construct, but that requires some target
> knowledge of which constants can be duplicated with a splat-immediate.
>
> In any case, the purpose of this patch is simply to avoid vectorizing
> things we shouldn't when we've undercounted the cost of a vec_construct.
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions (hence the vectorization decisions in the test suite have
> not changed).  Is this ok for trunk?

Note that most of the vectorizer testsuite is running with -fno-vect-cost-model,
only the costmodel tests are running with the cost model enabled.

Richard.

> Thanks,
> Bill
>
>
> 2016-07-15  Bill Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost):
> Improve vec_construct estimate.
>
>
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c  (revision 238312)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -5138,7 +5138,6 @@ rs6000_builtin_vectorization_cost (enum vect_cost_
> tree vectype, int misalign)
>  {
>unsigned elements;
> -  tree elem_type;
>
>switch (type_of_cost)
>  {
> @@ -5245,16 +5244,16 @@ rs6000_builtin_vectorization_cost (enum vect_cost_
>  return 2;
>
>case vec_construct:
> -   elements = TYPE_VECTOR_SUBPARTS (vectype);
> -   elem_type = TREE_TYPE (vectype);
> -   /* 32-bit vectors loaded into registers are stored as double
> -  precision, so we need n/2 converts in addition to the usual
> -  n/2 merges to construct a vector of short floats from them.  */
> -   if (SCALAR_FLOAT_TYPE_P (elem_type)
> -   && TYPE_PRECISION (elem_type) == 32)
> - return elements + 1;
> -   else
> - return elements / 2 + 1;
> +   /* This is a rough approximation assuming non-constant elements
> +  constructed into a vector via element insertion.  FIXME:
> +  vec_construct is not granular enough for uniformly good
> +  decisions.  If the initialization is a splat, this is
> +  cheaper than we estimate.  If we want to form four SF
> +  values into a vector, it's more expensive (we need to
> +  copy the four elements into two vector registers,
> +  perform two conversions to single precision, and merge
> +  the two result vectors).  Improve this someday.  */
> +   return TYPE_VECTOR_SUBPARTS (vectype);
>
>default:
>  gcc_unreachable ();
>
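
To make the N=2 remark in the description concrete, the three cost
formulas mentioned in this thread can be put side by side.  The sketch
below is plain C written for illustration only; the function names are
hypothetical, the "old" formula is transcribed from the lines the patch
removes, and the generic N-1 value is taken from the description rather
than from the generic hook itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Old rs6000 estimate, as in the lines removed by the patch:
   N + 1 for 32-bit float elements, N / 2 + 1 otherwise.  */
static unsigned
old_rs6000_vec_construct_cost (unsigned n, bool is_32bit_float)
{
  return is_32bit_float ? n + 1 : n / 2 + 1;
}

/* Default cost as described in the mail: N - 1.  */
static unsigned
generic_vec_construct_cost (unsigned n)
{
  return n - 1;
}

/* New rs6000 estimate from the patch: N.  */
static unsigned
new_rs6000_vec_construct_cost (unsigned n)
{
  return n;
}
```

For N=2 the old estimate is 2, the generic N-1 would drop it to 1, and the
new estimate keeps it at 2.  For N=8 the estimate rises from 5 to 8.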

[Committed] S/390: Fix alignment check for literal pool references.

2016-07-18 Thread Andreas Krebbel

Committed to head and GCC 6 branch.

gcc/ChangeLog:

2016-07-18  Andreas Krebbel  

* config/s390/s390.c (s390_encode_section_info): Always set
notaligned marker if mode size is 0 or no MEM_ALIGN info could be
found.

gcc/testsuite/ChangeLog:

2016-07-18  Andreas Krebbel  

* gcc.target/s390/nolrl-1.c: New test.
---
 gcc/ChangeLog   |  9 +
 gcc/config/s390/s390.c  | 35 ++---
 gcc/testsuite/ChangeLog |  7 +++
 gcc/testsuite/gcc.target/s390/nolrl-1.c | 19 ++
 4 files changed, 50 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/nolrl-1.c

 2016-07-16  John David Anglin  
 
* config/pa/pa.c (hppa_profile_hook): Allocate stack space for
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 9d2b2c0..318c021 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -12412,17 +12412,14 @@ s390_encode_section_info (tree decl, rtx rtl, int first)
 {
   /* Store the alignment to be able to check if we can use
 a larl/load-relative instruction.  We only handle the cases
-that can go wrong (i.e. no FUNC_DECLs).  If a symref does
-not have any flag we assume it to be correctly aligned.  */
-
-  if (DECL_ALIGN (decl) % 64)
-   SYMBOL_FLAG_SET_NOTALIGN8 (XEXP (rtl, 0));
-
-  if (DECL_ALIGN (decl) % 32)
-   SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
-
-  if (DECL_ALIGN (decl) == 0 || DECL_ALIGN (decl) % 16)
+that can go wrong (i.e. no FUNC_DECLs).  */
+  if (DECL_ALIGN (decl) == 0
+ || DECL_ALIGN (decl) % 16)
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
+  else if (DECL_ALIGN (decl) % 32)
+   SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
+  else if (DECL_ALIGN (decl) % 64)
+   SYMBOL_FLAG_SET_NOTALIGN8 (XEXP (rtl, 0));
 }
 
   /* Literal pool references don't have a decl so they are handled
@@ -12430,18 +12427,16 @@ s390_encode_section_info (tree decl, rtx rtl, int first)
  entry to decide upon the alignment.  */
   if (MEM_P (rtl)
   && GET_CODE (XEXP (rtl, 0)) == SYMBOL_REF
-  && TREE_CONSTANT_POOL_ADDRESS_P (XEXP (rtl, 0))
-  && MEM_ALIGN (rtl) != 0
-  && GET_MODE_BITSIZE (GET_MODE (rtl)) != 0)
+  && TREE_CONSTANT_POOL_ADDRESS_P (XEXP (rtl, 0)))
 {
-  if (MEM_ALIGN (rtl) % 64)
-   SYMBOL_FLAG_SET_NOTALIGN8 (XEXP (rtl, 0));
-
-  if (MEM_ALIGN (rtl) % 32)
-   SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
-
-  if (MEM_ALIGN (rtl) == 0 || MEM_ALIGN (rtl) % 16)
+  if (MEM_ALIGN (rtl) == 0
+ || GET_MODE_SIZE (GET_MODE (rtl)) == 0
+ || MEM_ALIGN (rtl) % 16)
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
+  else if (MEM_ALIGN (rtl) % 32)
+   SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
+  else if (MEM_ALIGN (rtl) % 64)
+   SYMBOL_FLAG_SET_NOTALIGN8 (XEXP (rtl, 0));
 }
 }
 
diff --git a/gcc/testsuite/gcc.target/s390/nolrl-1.c b/gcc/testsuite/gcc.target/s390/nolrl-1.c
new file mode 100644
index 0000000..e0d1213
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/nolrl-1.c
@@ -0,0 +1,19 @@
+/* Make sure the compiler does not try to use a relative long
+   instruction to load the string since it might not meet the
+   alignment requirements of the instruction.  */
+
+/* { dg-do compile } */
+/* { dg-options "-march=z10 -O3 -mzarch" } */
+
+extern void foo (char*);
+
+void
+bar ()
+{
+unsigned char z[32];
+
+__builtin_memcpy (z, "\001\000\000\000", 4);
+foo (z);
+}
+
+/* { dg-final { scan-assembler-not "lrl" } } */
-- 
2.9.1
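
The else-if chain introduced by the patch picks a single marker per symbol,
and a missing alignment or a zero mode size now always yields the most
pessimistic one.  The following plain C sketch restates that decision as a
pure function.  It is an illustration under assumptions, not the s390
back end: enum align_marker and classify_alignment are hypothetical names,
and alignment is given in bits as in the diff.

```c
#include <assert.h>

enum align_marker
{
  ALIGN_OK,      /* at least 8-byte aligned, no marker needed */
  NOTALIGN2,     /* not known to be 2-byte aligned */
  NOTALIGN4,     /* 2-byte but not 4-byte aligned */
  NOTALIGN8      /* 4-byte but not 8-byte aligned */
};

/* Choose the marker for a literal pool entry with alignment
   ALIGN_BITS and mode size MODE_SIZE, following the order of the
   tests in the patched code.  */
static enum align_marker
classify_alignment (unsigned align_bits, unsigned mode_size)
{
  if (align_bits == 0 || mode_size == 0 || align_bits % 16)
    return NOTALIGN2;
  else if (align_bits % 32)
    return NOTALIGN4;
  else if (align_bits % 64)
    return NOTALIGN8;
  return ALIGN_OK;
}
```

A byte-aligned string constant (8 bits) therefore gets the NOTALIGN2
marker, which is the case the nolrl-1.c test is about.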

Re: [PATCH] c++/58796 Make nullptr match exception handlers of pointer type

2016-07-18 Thread Jonathan Wakely

On 16/07/16 11:11 +0200, Jakub Jelinek wrote:

On Fri, Jul 15, 2016 at 11:14:03PM +0100, Jonathan Wakely wrote:

On 15/07/16 22:53 +0100, Jonathan Wakely wrote:
>On 15/07/16 23:36 +0200, Jakub Jelinek wrote:
>>On Fri, Jul 15, 2016 at 10:07:03PM +0100, Jonathan Wakely wrote:
+  if (typeid (*this) == typeid(__pointer_type_info))
+{
+  *thr_obj = nullptr;
+  return true;
>>
>>But you have the above store too.
>
>That doesn't write to the exception object, it only does a single
>dereference (compared to the double dereference of the racy write), so
>it writes to the local variable in the PERSONALITY_FUNCTION in
>eh_personality.cc
>
>So that shouldn't race with other threads. I think.
>

TSan agrees. When I compile my test and yours (and include
libsupc++/pbase_type_info.cc in the executable, so the writes are also
instrumented by tsan) then I see races for the *(ptrdiff_t*)*thr_obj
writes but not the *thr_obj ones.

It's only the ptr-to-data-member case that scribbles in the actual
exception object.

In:

struct A { int a; };
int a;

int
main ()
{
 try {
   throw nullptr;
 } catch (int * const &p) {
   __builtin_printf ("%p %p\n", p, &p);
 }
 try {
   throw nullptr;
 } catch (int A::* const &p) {
   __builtin_printf ("%p\n", &p);
   asm volatile ("" : : "r" (&p));
 }
 try {
   throw &a;
 } catch (int * const &p) {
   __builtin_printf ("%p %p\n", p, &p);
 }
}

I see &p being the address passed to __cxa_throw only in the second case,
where does the reference point to in the other two cases?  Some temporary in
main?

I'm not sure what the difference is there. There's some difference in
how pointers and non-pointers are handled.

Does that mean if you rethrow the exception multiple times in
different threads you get references to the same object for PMF and to
different objects for pointers?

No, for the throw &a case you'll get the same exception object in
every thread.

say anything about the identity of the caught objects when nullptr is
thrown).

Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-18 Thread Aldy Hernandez


On 07/16/2016 05:07 PM, Martin Sebor wrote:

[Addressed all of Manu's suggestions as well.]


Done.  -Walloca and -Wvla warn on any use of alloca and VLAs,
respectively, with or without optimization.  I sorry() on the bounded
cases.


I think it's an improvement though I suspect we each have a slightly
different understanding of what the sorry message is meant to be used
for.  It's documented in the Diagnostics Conventions section of the
GCC Coding Conventions as:

   sorry is for correct user input programs but unimplemented
   functionalities.

I take that to mean that it should be used for what is considered
valid user input that cannot be processed because the functionality
is not yet implemented (but eventually will be).  So unless this
case falls into this category I would expect GCC to issue a warning
saying that the options have no effect (or limited effect, whatever
the case may be) without optimization. But maybe I'm not reading
the coding conventions text right.


Technically we could add the functionality later.  I don't know whether 
the new range info work can be made to work with lower optimization 
levels.  But really, I don't care :).  Adjusted to a warning.





2) When passed an argument of a signed type, GCC prints

   warning: cast from signed type in alloca

even though there is no explicit cast in the code.  It may not
be obvious why the conversion is a problem in this context.  I
would suggest to rephrase the warning along the lines of
-Wsign-conversion which prints:

   conversion to ‘long unsigned int’ from ‘int’ may change the sign of
the result

and add why it's a potential problem.  Perhaps something like:

   argument to alloca may be too large due to conversion from
   'int' to 'long unsigned int'


Fixed:


Cool.

FWIW, by coincidence I was just educated about the subtle nuances
of quoting in GCC messages in a discussion with David and Manu.
Types, functions, variables, and literals that appear in the source
code should be referenced in diagnostics by using the "%qT", "%qD",
and "%qE" directives so that GCC can add the right quotes and
highlighting.  Enclosing "'%T'" in quotes will not use the same
kind of quotes as "%qT" and won't highlight the type name.


Fixed.



   https://gcc.gnu.org/wiki/DiagnosticsGuidelines


There is a "documented" reason for this: :)

   // Do not warn on VLAs occurring in a loop, since VLAs are
   // guaranteed to be cleaned up when they go out of scope.
   // That is, there is a corresponding __builtin_stack_restore
   // at the end of the scope in which the VLA occurs.


Yes, I understand that VLAs in loops are treated differently than
alloca.  But I don't think this is quite how the logic should work.
I.e., an excessively large VLA should be diagnosed regardless of
whether it's in a loop or outside.  Consider the following case
where with the patch as is, the warning is issued only for one
of the two functions, even though they both allocate a VLA in
excess of the threshold.


Agreed...



   #define FOO(n) if (1) { \
   char a [n]; \
   f (a); \
 } else (void)0

   #define BAR(n) do { \
   char a [n]; \
   f (a); \
 } while (0)

   void f (void*);

   void foo (void)
   {
 int n = 8000;
 FOO (n);// warning with -Wvla-larger-than=4000
   }

   void bar (void)
   {
 int n = 8000;
 BAR (n);// no warning
   }


...though it looks like your testcases may get optimized away.

I've added this testcase:

void
f6 (unsigned stuff)
{
  int n = 7000;
  do {
char a[n]; // { dg-warning "variable-length array is too large" }
f0 (a);
  } while (stuff--);
}


I've changed the logic so we warn on large allocas whether they're for 
VLAs or otherwise, but no warning is given for VLAs within a loop when we 
know the bounds.





5) The -Wvla=N logic only seems to take into consideration the number
of elements but not the size of the element type. For example, I wasn't
able to get it to warn on the following with -Wvla=255 or greater:

   void f0 (void*);

   void f1 (unsigned char a)
   {
 int x [a];   // or even char a [n][__INT_MAX__];
 f0 (x);
   }


That was a huge oversight (or should I say over-engineering) on my part.
  Fixed.
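
To make the point of item 5 concrete, here is a minimal C sketch of the size computation a byte-based limit needs: element count times element size, not the element count alone. The helper names are hypothetical and this is not the code in the Walloca pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size in bytes of a VLA with N elements of
   ELEM_SIZE bytes each.  Saturates to SIZE_MAX if the product would
   overflow size_t.  */
size_t
vla_size_bytes (size_t n, size_t elem_size)
{
  if (elem_size != 0 && n > SIZE_MAX / elem_size)
    return SIZE_MAX;
  return n * elem_size;
}

/* Nonzero if the VLA is larger than LIMIT bytes.  */
int
vla_exceeds_limit (size_t n, size_t elem_size, size_t limit)
{
  return vla_size_bytes (n, elem_size) > limit;
}
```

For the `int x [a]` example with `a` an unsigned char, the largest possible element count is 255, which on its own does not exceed a limit of 255, while the byte size `255 * sizeof (int)` does.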


Looks good.

I did notice one minor glitch, though not one caused by your patch.
GCC apparently transforms simple VLAs that are 256 bytes or less
in size into ordinary arrays (i.e., it doesn't call
__builtin_alloca_with_align).  Because of that, specifying
-Wvla-larger-than=N with N less than 256 may not give a warning,
as in the example below.  I suspect there may not be anything
the Walloca pass can do about this so perhaps just mentioning
it in the manual might be enough to avoid bug reports about false
negatives.


Documentation updated.



   void f0 (void*);

   unsigned f1 (void) { return 256; }

   void f2 (void)
   {
 unsigned n = f1 ();
 char a [n];
 f0 (a);
   }

GCC doesn't do the same transformation for alloca calls so the

Re: [PATCH] Add qsort comparator consistency checking (PR71702)

2016-07-18 Thread Richard Biener

On Fri, Jul 15, 2016 at 6:00 PM, Alexander Monakov  wrote:
> Hi,
>
> this patch adds internal checking for comparator functions that GCC passes to
> qsort.  PR71702 describes an ICE that happens because comparator
> 'dr_group_sort_cmp' used to be non-transitive in some cases until GCC 6.  This
> patch would uncover that issue on a number of small testcases in GCC's
> testsuite, so it should be useful to catch similar issues in the future as 
> well.
>
> Although the meat of the implementation is not tied to vec<>, this patch adds
> verification only to vec::qsort.  I see there's a number of other ::qsort uses
> in GCC; it should be useful to have such checking there as well.  I'd 
> appreciate
> input on that (I'd go with '#define qsort(b, n, s, c) qsort_chk (b, n, s, c)'
> in system.h and implement qsort_chk as a checking wrapper around libc qsort).
>
> Bootstrapped and regtested on x86_64, OK to apply?

Ugh.  What impact does this have on stage2 compile-time?

Richard.

> Alexander
>
> * vec.c (vec_check_sort_cmp): New.  Use it...
> * vec.h (vec::qsort): ...here to verify comparator
> consistency when checking is enabled.
>
> diff --git a/gcc/vec.c b/gcc/vec.c
> index fd200ea..b4ac0b4 100644
> --- a/gcc/vec.c
> +++ b/gcc/vec.c
> @@ -190,6 +190,44 @@ dump_vec_loc_statistics (void)
>vec_mem_desc.dump (VEC_ORIGIN);
>  }
>
> +/* Verify consistency of comparator CMP on array BASE of N SIZE-sized 
> elements.
> +   Assuming the array should be sorted according to CMP, any assertion 
> failure
> +   here implies that CMP is not transitive, or is not anti-commutative.  */
> +
> +void
> +vec_check_sort_cmp (const void *base, size_t n, size_t size,
> +   int (*cmp) (const void *, const void *))
> +{
> +  const char *cbase = (const char *) base;
> +  size_t i1, i2, i, j;
> +  /* The following loop nest has O(n^2) time complexity.  Limit n to avoid
> + slowness on artificial testcases.  */
> +  if (n > 100)
> +n = 100;
> +#define CMP(i, j) cmp (cbase + (i) * size, cbase + (j) * size)
> +  /* The outer loop iterates over maximum spans [i1, i2) such that elements
> + within each span compare equal.  */
> +  for (i1 = 0; i1 < n; i1 = i2)
> +{
> +  /* Position i2 past the last element that compares equal to i1'th.  */
> +  for (i2 = i1 + 1; i2 < n; i2++)
> +   if (CMP (i1, i2))
> + break;
> +   else
> + gcc_assert (!CMP (i2, i1));
> +  /* Verify that all remaining pairs within current span compare equal.  
> */
> +  for (i = i1 + 1; i + 1 < i2; i++)
> +   for (j = i + 1; j < i2; j++)
> + gcc_assert (!CMP (i, j) && !CMP (j, i));
> +  /* Verify that all elements within current span compare less than any
> + element beyond the span.  */
> +  for (i = i1; i < i2; i++)
> +   for (j = i2; j < n; j++)
> + gcc_assert (CMP (i, j) < 0 && CMP (j, i) > 0);
> +}
> +#undef CMP
> +}
> +
>  #ifndef GENERATOR_FILE
>  #if CHECKING_P
>
> diff --git a/gcc/vec.h b/gcc/vec.h
> index eb8c270..ff1e37e 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -182,6 +182,9 @@ extern void *ggc_realloc (void *, size_t MEM_STAT_DECL);
>  /* Support function for statistics.  */
>  extern void dump_vec_loc_statistics (void);
>
> +extern void vec_check_sort_cmp (const void *, size_t, size_t,
> +   int (*) (const void *, const void *));
> +
>  /* Hashtable mapping vec addresses to descriptors.  */
>  extern htab_t vec_mem_usage_hash;
>
> @@ -947,8 +950,11 @@ template
>  inline void
>  vec::qsort (int (*cmp) (const void *, const void *))
>  {
> -  if (length () > 1)
> -::qsort (address (), length (), sizeof (T), cmp);
> +  if (length () <= 1)
> +return;
> +  ::qsort (address (), length (), sizeof (T), cmp);
> +  if (CHECKING_P)
> +vec_check_sort_cmp (address (), length (), sizeof (T), cmp);
>  }
>
>
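
The idea behind the quoted patch can be tried outside GCC with a small standalone checker. The sketch below is a simplification, not the patch's algorithm: instead of walking spans of equal elements, it checks every pair of an already-sorted array for anti-symmetry and for ordering. All names are hypothetical, and it keeps the patch's cap of 100 elements to bound the O(n^2) cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*cmp_fn) (const void *, const void *);

static int
sign (int x)
{
  return (x > 0) - (x < 0);
}

/* Return 1 if CMP looks consistent on the already-sorted array BASE
   of N elements of SIZE bytes each, 0 otherwise.  */
int
sorted_cmp_is_consistent (const void *base, size_t n, size_t size,
                          cmp_fn cmp)
{
  const char *cbase = (const char *) base;
  if (n > 100)
    n = 100;
  for (size_t i = 0; i < n; i++)
    for (size_t j = i + 1; j < n; j++)
      {
        int fwd = cmp (cbase + i * size, cbase + j * size);
        int rev = cmp (cbase + j * size, cbase + i * size);
        /* Anti-symmetry: swapping the arguments must flip the sign.  */
        if (sign (fwd) != -sign (rev))
          return 0;
        /* In a sorted array an earlier element must never compare
           greater than a later one.  */
        if (fwd > 0)
          return 0;
      }
  return 1;
}

/* A well-behaved comparator for ints.  */
int
cmp_int (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Deliberately broken: never reports "greater", so it is not
   anti-symmetric.  */
int
cmp_broken (const void *a, const void *b)
{
  return *(const int *) a != *(const int *) b ? -1 : 0;
}
```

Calling `sorted_cmp_is_consistent` after `qsort` with `cmp_int` should succeed, while passing `cmp_broken` over the same sorted data is rejected on the first pair of distinct elements.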

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-18 Thread Richard Biener

On Fri, Jul 8, 2016 at 4:07 PM, Yuri Rumyantsev  wrote:
> Hi Richard,
>
> Thanks for your help - your patch looks much better.
> Here is a new patch in which an additional argument was added to determine
> the source loop of the reference.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?

Yes.

Thanks,
Richard.

> ChangeLog:
> 2016-07-08  Yuri Rumyantsev  
>
> PR tree-optimization/71734
> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which
> contains REF, use it to check safelen, assume that safelen value
> must be greater 1, fix style.
> (ref_indep_loop_p_2): Add REF_LOOP argument.
> (ref_indep_loop_p): Pass LOOP as additional argument to
> ref_indep_loop_p_2.
> gcc/testsuite/ChangeLog:
> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.
>
> 2016-07-08 11:18 GMT+03:00 Richard Biener :
>> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev  wrote:
>>> I checked simd3.f90 and found out that my additional check reject
>>> independence of references
>>>
>>> REF is independent in loop#3
>>> .istart0.19, .iend0.20
>>> which are defined in loop#1 which is outer for loop#3.
>>> Note that these references are defined by
>>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20);
>>> which is in loop#1.
>>> It is clear that both these references can not be independent for loop#3.
>>
>> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops
>> of LOOP to catch memory references in those as well.  So the issue is really
>> that we look at the wrong loop for safelen and we _do_ want to apply safelen
>> to inner loops as well.
>>
>> So better track the loop we are ultimately asking the question for, like in 
>> the
>> attached patch (fixes the testcase for me).
>>
>> Richard.
>>
>>
>>
>>> 2016-07-07 17:11 GMT+03:00 Richard Biener :
 On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev  wrote:
> I added this check because of new failures in the libgomp.fortran suite.
> Here is copy of Jakub message:
> --- Comment #29 from Jakub Jelinek  ---
> The #c27 r237844 change looks bogus to me.
> First of all, IMNSHO you can argue this way only if ref is a reference 
> seen in
> loop LOOP,

 or inner loops of LOOP I guess.  I _think_ we never call 
 ref_indep_loop_p_1 with
 a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not 
 make
 sense to do that, it would be a waste of time).

 So only if "or inner loops of LOOP" is not correct the check would be 
 needed
 but then my issue with unrolling an inner loop and turning a ref that 
 safelen
 does not apply to into a ref that it now applies to arises.

 I don't fully get what Jakub is hinting at.

 Can you install the safelen > 0 -> safelen > 1 fix please?  Jakub, can you
 explain that bitmap check with a simple testcase?

 Thanks,
 Richard.

> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2
> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - 
> the
> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer 
> loop
> obviously can be dependent on many of the loads and/or stores in the 
> loop, be
> it "omp simd array" or not.
> Say for
> void
> foo (int *p, int *q)
> {
>   #pragma omp simd
>   for (int i = 0; i < 1024; i++)
> p[i] += q[0];
> }
> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could 
> write
> something that changes its value, and then it would behave differently 
> from
> using VF = 1024, where everything is performed in parallel.
> Though, actually, it can alias, just it would have to write the same 
> value as
> was there.  So, if this is used to determine if it is safe to hoist the 
> load
> before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] &&
> &q[0] <= &p[1023], then it is not fine.
>
> For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a
> valid program.  #pragma omp simd I think guarantees that the last 
> iteration is
> executed last, it isn't necessarily executed last alone, it could be, or
> together with one before last iteration, or (for simdlen INT_MAX) even all
> iterations can be done concurrently, in hw or sw, so it is fine if it is
> transformed into:
>   int temp[1024], temp2[1024], temp3[1024];
>   for (int i = 0; i < 1024; i++)
> temp[i] = p[i];
>   for (int i = 0; i < 1024; i++)
> temp2[i] = q[0];
>   /* The above two loops can be also swapped, or intermixed.  */
>   for (int i = 0; i < 1024; i++)
> temp3[i] = temp[i] + temp2[i];
>   for (int i = 0; i < 1024; i++)
>
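
The aliasing argument quoted above can be illustrated with a small C sketch that compares a strictly sequential execution of `p[i] += q[0]` with an "all iterations at once" execution in the style of the temp-array transformation. The quoted text is cut off before its final loop, so the store back into `p` below is an assumption, as are the function names and the reduced trip count of 4.

```c
#include <assert.h>

enum { N = 4 };

/* Iterations run strictly one after another.  */
void
add_sequential (int *p, const int *q)
{
  for (int i = 0; i < N; i++)
    p[i] += q[0];
}

/* All loads happen before any store, as if every iteration ran
   concurrently.  The final store loop is assumed, since the quoted
   transformation is truncated.  */
void
add_all_at_once (int *p, const int *q)
{
  int temp[N], temp2[N];
  for (int i = 0; i < N; i++)
    temp[i] = p[i];
  for (int i = 0; i < N; i++)
    temp2[i] = q[0];
  for (int i = 0; i < N; i++)
    p[i] = temp[i] + temp2[i];
}
```

When `q` points at `p[0]`, the sequential version sees the updated `p[0]` from iteration 1 onward, whereas the all-at-once version only ever sees the original value, so the two produce different results. When `q` does not alias any element of `p`, they agree.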

Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-18 Thread Aldy Hernandez


On 07/17/2016 11:52 AM, Manuel López-Ibáñez wrote:

On 15/07/16 18:05, Aldy Hernandez wrote:

+case OPT_Walloca_larger_than_:
+  if (!value)
+inform (loc, "-Walloca-larger-than=0 is meaningless");
+  break;
+
+case OPT_Wvla_larger_than_:
+  if (!value)
+inform (loc, "-Wvla-larger-than=0 is meaningless");
+  break;
+

We don't give similar notes for any of the other Wx-larger-than=
options. If -Wvla-larger-than=0 suppresses a previous
-Wvla-larger-than=, then it doesn't seem meaningless, but a useful thing
to have.


I'm trying to avoid confusing users that may think that 
-Walloca-larger-than=0 means warn on any use of alloca.  That is what 
-Walloca is for.  But really, I don't care.  If you feel strongly about 
it, I can just remove the block of code.


Aldy

Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-18 Thread Bin.Cheng

On Fri, Jul 15, 2016 at 6:23 PM, Richard Biener
 wrote:
> On July 15, 2016 7:16:42 PM GMT+02:00, Bernd Schmidt  
> wrote:
>>On 07/15/2016 07:07 PM, Bin Cheng wrote:
>>
>>> Bootstrap and test on x86_64.  Is it OK?
>>
>>If you do this you'll also need to remove the use in config/bfin.
>
> OK with that change.
Hi all,
Attached is the updated patch.  Given that it's pre-approved with this
change, I will commit it later.

Thanks,
bin
>
> Richard.
>
>>
>>Bernd
>
>
diff --git a/gcc/common.opt b/gcc/common.opt
index a7c5125..331e1da 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -625,8 +625,8 @@ Common Var(warn_null_dereference) Warning
 Warn if dereferencing a NULL pointer may lead to erroneous or undefined 
behavior.
 
 Wunsafe-loop-optimizations
-Common Var(warn_unsafe_loop_optimizations) Warning
-Warn if the loop cannot be optimized due to nontrivial assumptions.
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 Wmissing-noreturn
 Common Warning Alias(Wsuggest-attribute=noreturn)
@@ -2500,8 +2500,8 @@ Perform loop unrolling for all loops.
 ; that control loops do not overflow and that the loops with nontrivial
 ; exit condition are not infinite
 funsafe-loop-optimizations
-Common Report Var(flag_unsafe_loop_optimizations) Optimization
-Allow loop optimizations to assume that the loops behave in normal way.
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 fassociative-math
 Common Report Var(flag_associative_math) SetByCombined Optimization
diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index 75ddcf0..b6edf2c 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -3375,10 +3375,7 @@ bfin_can_use_doloop_p (const widest_int &, const 
widest_int _max,
   /* Due to limitations in the hardware (an initial loop count of 0
  does not loop 2^32 times) we must avoid to generate a hardware
  loops when we cannot rule out this case.  */
-  if (!flag_unsafe_loop_optimizations
-  && wi::geu_p (iterations_max, 0x))
-return false;
-  return true;
+  return (wi::ltu_p (iterations_max, 0x));
 }
 
 /* Increment the counter for the number of loop instructions in the
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ca8c1b4..4241956 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -302,8 +302,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wswitch  -Wswitch-bool  -Wswitch-default  -Wswitch-enum @gol
 -Wswitch-unreachable  -Wsync-nand @gol
 -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
--Wtype-limits  -Wundef @gol
--Wuninitialized  -Wunknown-pragmas  -Wunsafe-loop-optimizations @gol
+-Wtype-limits  -Wundef -Wuninitialized  -Wunknown-pragmas @gol
 -Wunsuffixed-float-constants  -Wunused  -Wunused-function @gol
 -Wunused-label  -Wunused-local-typedefs -Wunused-parameter @gol
 -Wno-unused-result -Wunused-value @gol -Wunused-variable @gol
@@ -414,7 +413,7 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-switch-conversion -ftree-tail-merge -ftree-ter @gol
 -ftree-vectorize -ftree-vrp -funconstrained-commons @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
--funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
+-funsafe-math-optimizations -funswitch-loops @gol
 -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
 -fweb -fwhole-program -fwpa -fuse-linker-plugin @gol
 --param @var{name}=@var{value}
@@ -4986,14 +4985,6 @@ If the stack usage is (partly) dynamic and not bounded, 
it's:
 @end smallexample
 @end itemize
 
-@item -Wunsafe-loop-optimizations
-@opindex Wunsafe-loop-optimizations
-@opindex Wno-unsafe-loop-optimizations
-Warn if the loop cannot be optimized because the compiler cannot
-assume anything on the bounds of the loop indices.  With
-@option{-funsafe-loop-optimizations} warn if the compiler makes
-such assumptions.
-
 @item -Wno-pedantic-ms-format @r{(MinGW targets only)}
 @opindex Wno-pedantic-ms-format
 @opindex Wpedantic-ms-format
@@ -6817,15 +6808,6 @@ number of iterations of a loop are used to guide loop 
unrolling and peeling
 and loop exit test optimizations.
 This option is enabled by default.
 
-@item -funsafe-loop-optimizations
-@opindex funsafe-loop-optimizations
-This option tells the loop optimizer to assume that loop indices do not
-overflow, and that loops with nontrivial exit condition are not
-infinite.  This enables a wider range of loop optimizations even if
-the loop optimizer itself cannot prove that these assumptions are valid.
-If you use @option{-Wunsafe-loop-optimizations}, the compiler warns you
-if it finds this kind of loop.
-
 @item -funconstrained-commons
 @opindex funconstrained-commons
 This option tells the compiler that variables declared in common blocks
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index da29d22..6b2cf96 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -2820,7

Re: [patch, Fortran] Fix some string temporaries

2016-07-18 Thread Andre Vehreschild

Hi Thomas,

> So, OK with a comment why this appears?  Or should I simply
> rename GFC_DEP_ERROR to GFC_DEP_NODEPFOUND to make this a bit
> clearer?

I recommend the latter. Reporting an error should be done only when an
error occurred, but no dependency detected does not feel like an error.
Let's reserve GFC_DEP_ERROR for real error cases (that may occur in the
future).

With the latter the patch is ok for me. In fact, I was thinking about
doing something similar to the gfc_dependency routines. (Note, I have
no reviewer privileges, so this is just a vote).

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de

Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-18 Thread Bin.Cheng

On Sat, Jul 16, 2016 at 6:28 PM, NightStrike  wrote:
> On Fri, Jul 15, 2016 at 1:07 PM, Bin Cheng  wrote:
>> Hi,
>> This patch removes support for -funsafe-loop-optimizations, as well as
>> -Wunsafe-loop-optimizations.  As its name suggests, this option performs
>> unsafe optimizations by assuming that all loops terminate and that loop
>> indices do not wrap.  Unfortunately, it's not as useful as expected because:
>> 1) Simply assuming that a loop must terminate isn't enough.  What we really
>> want is to analyze scalar evolution and loop niter bounds under such
>> assumptions.  This option does nothing in this respect.
>> 2) IIRC, this option generates bogus code for some common programs, which
>> is why it's disabled by default even at the Ofast level.
>>
>> After I sent patches handling possibly infinite loops in both the
>> (scev/niter) analyzer and the vectorizer, it's a natural step to remove
>> such options from GCC.  This patch does so by deleting the code for
>> -funsafe-loop-optimizations, as well as -Wunsafe-loop-optimizations.  It
>> also deletes the two now-useless tests, while the option interface is
>> preserved for backward compatibility.
>
> There are a number of bugs opened against those options, including one
> that I just opened rather recently:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71769
>
> but some go back far, in this case 9 years:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34114
>
> If you are going to remove the options, you should address open bugs
> related to those options.
Hi,
Thanks for pointing me to these PRs; I will have a look at them.
IMHO, the old one reports a weakness in the loop niter analyzer; the issue
exists whether I remove unsafe-loop-optimizations or not.  The new one
is a little bit trickier.  I will put some comments on the PR, and again,
the issue (if it is one) is in the niter analyzer, which has nothing to do
with the option really.

Thanks,
bin
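
As an illustration of why "the loop terminates and its index does not wrap" is a nontrivial assumption, the following C sketch counts the iterations of `for (i = start; i != end; i += step)` with an 8-bit unsigned index. The function name, the 8-bit width, and the iteration cap are hypothetical choices for the example; this is not how GCC's niter analysis works.

```c
#include <assert.h>
#include <stdint.h>

/* Count iterations of `for (i = start; i != end; i += step)` using
   8-bit unsigned arithmetic.  Returns the iteration count, or -1 if
   the loop has not exited after CAP iterations.  */
long
count_iterations (uint8_t start, uint8_t end, uint8_t step, long cap)
{
  uint8_t i = start;
  long n = 0;
  while (i != end)
    {
      if (n == cap)
        return -1;
      i = (uint8_t) (i + step);  /* wraps modulo 256 */
      n++;
    }
  return n;
}
```

Stepping by 2 from 0 to 10 exits after 5 iterations, stepping from 250 to 4 also exits after 5 but only by wrapping through 0, and stepping by 2 from 0 to 9 never exits because the index stays even. An optimizer that simply assumed termination and no wrap would treat all three the same way.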

Re: [RFC, v2] Test coverage for --param boundary values

2016-07-18 Thread Martin Liška

On 07/15/2016 09:22 AM, Thomas Schwinge wrote:
> Hi!
> 
> On Fri, 8 Jul 2016 14:47:46 +0200, Martin Liška  wrote:
>> From f84ce7be4a998089541fb4512e19f54a4ec25cf6 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Fri, 8 Jul 2016 10:59:24 +0200
>> Subject: [PATCH] Add tests that test boundary values of params
> 
> This became r238249.  Yay for pushing testing boundaries; next is some
> fuzzy testing for GCC?  ;-D
> 
>> gcc/ChangeLog:
>>
>> 2016-07-08  Martin Liska  
>>
>>  * Makefile.in: Append rule for params-options.h.
>>  * params-options.h: New file.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-07-08  Martin Liska  
>>
>>  * gcc.dg/params/blocksort-part.c: New test.
>>  * gcc.dg/params/params.exp: New file.
> 
> :-/ (GNU-style ChangeLogs are just so useful... not.)
> 
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/params/blocksort-part.c
>> @@ -0,0 +1,706 @@
>> +
>> +/*-*/
>> +/*--- Block sorting machinery   ---*/
>> +/*---   blocksort.c ---*/
>> +/*-*/
>> +
>> +/* --
>> +   This file is part of bzip2/libbzip2, a program and library for
>> +   lossless, block-sorting data compression.
>> +
>> +   bzip2/libbzip2 version 1.0.6 of 6 September 2010
>> +   Copyright (C) 1996-2010 Julian Seward 
>> +
>> +   Please read the WARNING, DISCLAIMER and PATENTS sections in the 
>> +   README file.
>> +
>> +   This program is released under the terms of the license contained
>> +   in the file LICENSE.
>> +[...]
> 

Hi Thomas.

> Are there any issues with including that file without the
> accompanying/referenced README and LICENSE files?

Sure, I'll include these files.

> 
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/params/params.exp
>> @@ -0,0 +1,64 @@
>> +[...]
>> +# GCC testsuite that uses the `dg.exp' driver.
>> +
>> +# Load support procs.
>> +load_lib gcc-dg.exp
>> +
>> +# Initialize `dg'.
>> +dg-init
>> +
>> +proc param_run_test { param_name param_value } {
>> +global srcdir
>> +global subdir
>> +
>> +dg-runtest $srcdir/$subdir/blocksort-part.c "" "-O3 --param 
>> $param_name=$param_value"
>> +}
>> +
>> +set fd [open "$objdir/../../params.options" r]
> 
> (I do understand what you're doing there, but) is it kosher to refer to a
> file from GCC's build tree inside the test suite?  I thought the idea was
> that you could run the testsuite without the build tree being available
> -- as much that's possible.  As far as I remember there are a few
> exception to this rule already, so maybe adding one more is not much of a
> problem.  ;-) (I do know that for our internal testing we have a list of
> files that need to be preserved from the GCC build directory for later
> testing without the build directory being available; so I'll add this
> file to the list; Joseph CCed in case he has some additional comments
> after returning from his vacations.)
> 
> 
> shows one instance of the problem, that I could quickly find:
> 
> [...]
> ERROR: tcl error sourcing 
> /home/jenkins/workspace/BuildToolchainThunder_elf_test_upstream/toolchain/src/gcc/testsuite/gcc.dg/params/params.exp.
> ERROR: couldn't open 
> "/home/jenkins/workspace/BuildToolchainThunder_elf_test_upstream/toolchain/testresults/aarch64-thunderx-elf/../../params.options":
>  no such file or directory
> [...]

You are right, I was inspired by what we do for GCC plugins in:
gcc/testsuite/lib/plugin-support.exp

where we have the following comment:
# Note that the plugin test support currently only works when the GCC
# build tree is available. (We make sure that is the case in plugin.exp.)
# Once we have figured out how/where to package/install GCC header files
# for general plugin support, we should modify the following include paths
# accordingly.

Well, I can imagine a guard which tests whether the
"$objdir/../../params.options" file exists,
and only if so are the tests executed.  Is that an acceptable approach?

> 
>> +set text [read $fd]
>> +close $fd
>> +
>> +# Main loop.
>> +foreach params [split $text "\n"] {
>> +set parts [split $params "="]
>> +set name [string trim [lindex $parts 0] '"']
>> +set values [split [lindex $parts 1] ","]
>> +if { [llength $values] == 3 } {
>> +set default [lindex $values 0]
>> +set min [lindex $values 1]
>> +set max [lindex $values 2]
>> +set int_max "INT_MAX"
>> +
>> +if { $min != -1 } {
>> +param_run_test $name $min
>> +}
>> +if { $max != $min && $max > 0 && $max != $int_max } {
>> +param_run_test $name $max
>> +}
>> +}
>> +if { [llength $values] == 5 } {

Re: [PATCH] Fix PR71893

2016-07-18 Thread Thomas Schwinge

Hi!

On Fri, 15 Jul 2016 15:19:59 +0200 (CEST), Richard Biener  
wrote:
> This PR shows that

(There is also PR71901 suspected to be related; David CCed.)

> array_ref_element_size may apply spurious casts which
> in turn end up confusing VN/PRE.
> 
> Fixed as follows, bootstrapped on x86_64-unknown-linux-gnu, testing in
> progress.

Thanks, this resolves PR71893 for nvptx:

>   PR tree-optimization/71893
>   * tree-ssa-pre.c (create_component_ref_by_pieces_1): Compensate
>   for sizetype cast added by array_ref_element_size.
>   * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
> 
> Index: gcc/tree-ssa-pre.c
> ===
> *** gcc/tree-ssa-pre.c(revision 238370)
> --- gcc/tree-ssa-pre.c(working copy)
> *** create_component_ref_by_pieces_1 (basic_
> *** 2576,2581 
> --- 2587,2595 
> {
>   genop3 = size_binop (EXACT_DIV_EXPR, genop3,
>size_int (TYPE_ALIGN_UNIT (elmt_type)));
> + /* We may have a useless conversion added by
> +array_ref_element_size via copy_reference_opts_from_ref.  */
> + STRIP_USELESS_TYPE_CONVERSION (genop3);
>   genop3 = find_or_generate_expression (block, genop3, stmts);
>   if (!genop3)
> return NULL_TREE;
> Index: gcc/tree-ssa-sccvn.c
> ===
> *** gcc/tree-ssa-sccvn.c  (revision 238370)
> --- gcc/tree-ssa-sccvn.c  (working copy)
> *** copy_reference_ops_from_ref (tree ref, v
> *** 810,815 
> --- 810,818 
> /* Always record lower bounds and element size.  */
> temp.op1 = array_ref_low_bound (ref);
> temp.op2 = array_ref_element_size (ref);
> +   /* array_ref_element_size forces the result to sizetype
> +  even if that is the same as bitsizetype.  */
> +   STRIP_USELESS_TYPE_CONVERSION (temp.op2);
> if (TREE_CODE (temp.op0) == INTEGER_CST
> && TREE_CODE (temp.op1) == INTEGER_CST
> && TREE_CODE (temp.op2) == INTEGER_CST)


Grüße
 Thomas



Re: [patch,avr] make progmem work on AVR_TINY, use TARGET_ADDR_SPACE_DIAGNOSE_USAGE

2016-07-18 Thread Denis Chertykov

2016-07-15 18:26 GMT+03:00 Georg-Johann Lay :
> This patch needs new hook TARGET_ADDR_SPACE_DIAGNOSE_USAGE:
> https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00839.html
>
> This patch turns attribute progmem into a working feature for AVR_TINY
> cores.
>
> It boils down to adding 0x4000 to all symbols with progmem:  Flash memory
> can be seen in the RAM address space starting at 0x4000, i.e. data in flash
> can be read by means of LD instruction if we add offsets of 0x4000.  There
> is no need for special access macros like pgm_read_* or special address
> spaces as there is nothing like a LPM instruction.
>
> This is simply achieved by setting a respective symbol_ref_flag, and when
> such a symbol has to be printed, then plus_constant with 0x4000 is used.
>
> Diagnosing of unsupported address spaces is now performed by
> TARGET_ADDR_SPACE_DIAGNOSE_USAGE which has exact location information.
> Hence there is no need to scan all decls for invalid address spaces.
>
> For AVR_TINY, all address spaces have been disabled.  They are of no use.
> Supporting __flash would just make the backend more complicated without any
> gains.
>
>
> Ok for trunk?
>
> Johann
>
>
> gcc/
> * doc/extend.texi (AVR Variable Attributes) [progmem]: Add
> documentation how it works on reduced Tiny cores.
> (AVR Named Address Spaces): No support for reduced Tiny.
> * avr-protos.h (avr_addr_space_supported_p): New prototype.
> * avr.c (AVR_SYMBOL_FLAG_TINY_PM): New macro.
> (avr_address_tiny_pm_p): New static function.
> (avr_print_operand_address) [AVR_TINY]: Add AVR_TINY_PM_OFFSET
> if the address is in progmem.
> (avr_assemble_integer): Same.
> (avr_encode_section_info) [AVR_TINY]: Set AVR_SYMBOL_FLAG_TINY_PM
> for symbol_ref in progmem.
> (TARGET_ADDR_SPACE_DIAGNOSE_USAGE): New hook define...
> (avr_addr_space_diagnose_usage): ...and implementation.
> (avr_addr_space_supported_p): New function.
> (avr_nonconst_pointer_addrspace, avr_pgm_check_var_decl): Only
> report bad address space usage if that space is supported.
> (avr_insert_attributes): Same.  No more complain about unsupported
> address spaces.
> * avr.h (AVR_TINY_PM_OFFSET): New macro.
> * avr-c.c (tm_p.h): Include it.
> (avr_cpu_cpp_builtins) [__AVR_TINY_PM_BASE_ADDRESS__]: Use
> AVR_TINY_PM_OFFSET instead of magic 0x4000 when built-in def'ing.
> Only define addr-space related built-in macro if
> avr_addr_space_supported_p.
> gcc/testsuite/
> * gcc.target/avr/torture/tiny-progmem.c: New test.
>

Approved.
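
The address arithmetic described in the quoted patch summary is small enough to sketch in C. `AVR_TINY_PM_OFFSET` and its value 0x4000 come from the message above; the helper function is hypothetical and only models the offset that the back end adds when it prints a progmem symbol.

```c
#include <assert.h>
#include <stdint.h>

/* Offset at which flash is visible in the RAM address space on
   reduced Tiny cores, per the patch description.  */
#define AVR_TINY_PM_OFFSET 0x4000u

/* Hypothetical helper: map an offset within flash to the address
   at which an ordinary LD instruction can read it.  */
uint16_t
tiny_pm_data_address (uint16_t flash_offset)
{
  return (uint16_t) (flash_offset + AVR_TINY_PM_OFFSET);
}
```

A progmem object placed at flash offset 0x0123 would thus be read through address 0x4123, with no LPM-style instruction or access macro involved.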
