[gomp4] Merge trunk r231075 (2015-11-30) into gomp-4_0-branch

2015-11-30 Thread Thomas Schwinge
Hi!

Committed to gomp-4_0-branch in r231099:

commit 4f88f92b308151aa2c2592102da20c417df69c27
Merge: 24e5942 851c1b0
Author: tschwinge 
Date:   Tue Dec 1 07:44:27 2015 +

svn merge -r 230907:231075 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@231099 138bc75d-0d04-0410-961f-82ee72b054a4


Grüße
 Thomas


[UPC 20/22] libgupc runtime library [6/9]

2015-11-30 Thread Gary Funck
[NOTE: Due to email list size limits, this patch is broken into 9 parts.]

Background
----------

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

Libgupc is the UPC runtime library, for GUPC.  The configuration,
makefile, and documentation related changes have been broken out into
separate patches.

As noted in the ChangeLog entry below, this is all new code.
Two communication layers are supported: (1) SMP, via 'mmap'
or (2) the Portals4 library API, which supports multi-node
operation.  Libgupc generally requires a POSIX-compliant target OS.

The 'smp' runtime is the default runtime.  The 'portals4'
runtime is experimental; it supports multi-node operation
using the Portals4 communications library.
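
For readers unfamiliar with the 'mmap'-based approach, here is a minimal
sketch of the general idea behind an SMP communication layer: several
processes on one node exchange data through a region mapped with MAP_SHARED.
This is not code from libgupc.  The helper name smp_demo_sum() and the use
of fork() to stand in for UPC threads are assumptions made for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of libgupc): start NTHREADS processes
   that each store (index + 1) into their own slot of a shared mapping.
   Returns the sum of all slots, or -1 on error.  */
static long
smp_demo_sum (int nthreads)
{
  if (nthreads <= 0)
    return -1;

  size_t len = (size_t) nthreads * sizeof (long);
  /* An anonymous MAP_SHARED mapping stays shared across fork().  */
  long *slots = mmap (NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (slots == MAP_FAILED)
    return -1;

  int started = 0;
  int ok = 1;
  for (int t = 0; t < nthreads; t++)
    {
      pid_t pid = fork ();
      if (pid < 0)
        {
          ok = 0;
          break;
        }
      if (pid == 0)
        {
          /* Child: write to the shared region and exit.  */
          slots[t] = t + 1;
          _exit (0);
        }
      started++;
    }

  /* Reap every child that was started before reading the results.  */
  for (int t = 0; t < started; t++)
    {
      int status;
      if (wait (&status) < 0 || !WIFEXITED (status)
          || WEXITSTATUS (status) != 0)
        ok = 0;
    }

  long sum = 0;
  for (int t = 0; t < nthreads; t++)
    sum += slots[t];

  munmap (slots, len);
  return ok ? sum : -1;
}
```

Calling smp_demo_sum (4) from a test program shows whether the children's
writes are visible to the parent through the shared mapping.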

Most of the libgupc/include/ directory contains standard headers
defined by the UPC language specification. 'make install' will
install these headers in the directory where other "C"
header files are located.

2015-11-30  Gary Funck  

libgupc/collectives/
* upc_coll_reduce.upc: New.
* upc_coll_scatter.upc: New.
* upc_coll_sort.upc: New.

Index: libgupc/collectives/upc_coll_reduce.upc
===
--- libgupc/collectives/upc_coll_reduce.upc (.../trunk) (revision 0)
+++ libgupc/collectives/upc_coll_reduce.upc (.../branches/gupc) (revision 231080)
@@ -0,0 +1,4296 @@
+/* Copyright (C) 2012-2015 Free Software Foundation, Inc.
+   This file is part of the UPC runtime library.
+   Written by Gary Funck 
+   and Nenad Vukicevic 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/*/
+/*   */
+/*  Copyright (c) 2004, Michigan Technological University*/
+/*  All rights reserved. */
+/*   */
+/*  Redistribution and use in source and binary forms, with or without   */
+/*  modification, are permitted provided that the following conditions   */
+/*  are met: */
+/*   */
+/*  * Redistributions of source code must retain the above copyright */
+/*  notice, this list of conditions and the following disclaimer.*/
+/*  * Redistributions in binary form must reproduce the above*/
+/*  copyright notice, this list of conditions and the following  */
+/*  disclaimer in the documentation and/or other materials provided  */
+/*  with the distribution.   */
+/*  * Neither the name of the Michigan Technological University  */
+/*  nor the names of its contributors may be used to endorse or promote  */
+/*  products derived from this software without specific prior written   */
+/*  permission.  */
+/*   */
+/*  THIS SOFTWAR

[UPC 20/22] libgupc runtime library [4/9]

2015-11-30 Thread Gary Funck
[NOTE: Due to email list size limits, this patch is broken into 9 parts.]

2015-11-30  Gary Funck  

libgupc/collectives/
* upc_coll.h: New.
* upc_coll_broadcast.upc: New.
* upc_coll_err.upc: New.
* upc_coll_exchange.upc: New.
* upc_coll_gather.upc: New.
* upc_coll_gather_all.upc: New.
* upc_coll_init.upc: New.

Index: libgupc/collectives/upc_coll.h
===
--- libgupc/collectives/upc_coll.h  (.../trunk) (revision 0)
+++ libgupc/collectives/upc_coll.h  (.../branches/gupc) (revision 231080)
@@ -0,0 +1,67 @@
+/* Copyright (C) 2012-2015 Free Software Foundation, Inc.
+   This file is part of the UPC runtime library.
+   Written by Gary Funck 
+   and Nenad Vukicevic 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/*/
+/*   */
+/*  Copyright (c) 2004, Michigan Technological University*/
+/*  All rights reserved. */
+/*   */
+/*  Redistribution and use in source and binary forms, with or without   */
+/*  modification, are permitted provided that the following conditions   */
+/*  are met: */
+/*   */
+/*  * Redistributions of source code must retain the above copyright */
+/*  notice, this list of conditions and the following disclaimer.*/
+/*  * Redistributions in binary form must reproduce the above*/
+/*  copyright notice, this list of conditions and the following  */
+/*  disclaimer in the documentation and/or other materials provided  */
+/*  with the distribution.   */
+/*  * Neither the name of the Michigan Technological University  */
+/*  nor the names of its contributors may be used to endorse or promote  */
+/*  products derived from this software without specific prior written   */
+/*  permission.  

[UPC 20/22] libgupc runtime library [1/9]

2015-11-30 Thread Gary Funck
[NOTE: Due to email list size limits, this patch is broken into 9 parts.]

2015-11-30  Gary Funck  

libgupc/
* upc-crtstuff.c: New.
libgupc/include/
* gasp.h: New.
* gasp_upc.h: New.
* gcc-upc.h: New.
* pupc.h: New.
* upc.h: New.
* upc_atomic.h: New.
* upc_castable.h: New.
* upc_collective.h: New.
* upc_nb.h: New.
* upc_relaxed.h: New.
* upc_strict.h: New.
* upc_tick.h: New.
* upc_types.h: New.

Index: libgupc/upc-crtstuff.c
===
--- libgupc/upc-crtstuff.c  (.../trunk) (revision 0)
+++ libgupc/upc-crtstuff.c  (.../branches/gupc) (revision 231080)
@@ -0,0 +1,66 @@
+/* upc-crtstuff.c: UPC specific "C Runtime Support"
+   Copyright (C) 2009-2015 Free Software Foundation, Inc.
+   Contributed by Gary Funck 
+ and Nenad Vukicevic .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "upc-crt-config.h"
+#include "upc-crt-begin-end.h"
+
+/* Only define section start/end if no link script is used.   */
+
+#ifdef CRT_BEGIN
+
+/* Shared begin is always defined in order to allocate space
+   at the beginning of the section.  */
+#ifdef UPC_SHARED_SECTION_BEGIN
+/* Establish a symbol at the beginning of the data section.  */
+UPC_SHARED_SECTION_BEGIN
+#endif /* UPC_SHARED_SECTION_BEGIN */
+
+#ifndef HAVE_UPC_LINK_SCRIPT
+#ifdef UPC_PGM_INFO_SECTION_BEGIN
+/* Establish a symbol at the beginning of the program info data section.  */
+UPC_PGM_INFO_SECTION_BEGIN
+#endif /* UPC_PGM_INFO_SECTION_BEGIN */
+#ifdef UPC_INIT_ARRAY_SECTION_BEGIN
+/* Establish a symbol at the beginning of the initialization array section.  */
+UPC_INIT_ARRAY_SECTION_BEGIN
+#endif /* UPC_INIT_ARRAY_SECTION_BEGIN */
+#endif /* !HAVE_UPC_LINK_SCRIPT */
+
+#elif defined(CRT_END) /* ! CRT_BEGIN */
+
+#ifndef HAVE_UPC_LINK_SCRIPT
+#ifdef UPC_SHARED_SECTION_END
+/* Establish a symbol at the end of the shared data section.  */
+UPC_SHARED_SECTION_END
+#endif /* UPC_SHARED_SECTION_END */
+#ifdef UPC_PGM_INFO_SECTION_END
+/* Establish a symbol at the end of the program info data section.  */
+UPC_PGM_INFO_SECTION_END
+#endif /* UPC_PGM_INFO_SECTION_END */
+#ifdef UPC_INIT_ARRAY_SECTION_END
+/* Establish a symbol at the end of the initialization array section.  */
+UPC_INIT_ARRAY_SECTION_END
+#endif /* UPC_INIT_ARRAY_SECTION_END */
+#endif /* !HAVE_UPC_LINK_SCRIPT */
+#else /* ! CRT_BEGIN && ! CRT_END */
+#error "One of CRT_BEGIN or CRT_END must be defined."
+#endif
Index: libgupc/include/gasp.h
==

When not optimizing do not compute RTX memory attributes

2015-11-30 Thread Jan Hubicka
Hi,
memory attributes are currently optimized and attached to RTL even when not
optimizing. This is obviously just a wasted effort.

Bootstrapped/regtested x86_64-linux, OK?

Honza
* emit-rtl.c (set_mem_attrs, set_mem_attributes_minus_bitpos):
Do not compute memory attributes when not optimizing.

Index: emit-rtl.c
===
--- emit-rtl.c  (revision 231081)
+++ emit-rtl.c  (working copy)
@@ -336,7 +336,8 @@ static void
 set_mem_attrs (rtx mem, mem_attrs *attrs)
 {
   /* If everything is the default, we can just clear the attributes.  */
-  if (mem_attrs_eq_p (attrs, mode_mem_attrs[(int) GET_MODE (mem)]))
+  if (!optimize
+      || mem_attrs_eq_p (attrs, mode_mem_attrs[(int) GET_MODE (mem)]))
     {
       MEM_ATTRS (mem) = 0;
       return;
@@ -1749,6 +1750,9 @@ set_mem_attributes_minus_bitpos (rtx ref
   struct mem_attrs attrs, *defattrs, *refattrs;
   addr_space_t as;
 
+  if (!optimize)
+    return;
+
   /* It can happen that type_for_mode was given a mode for which there
      is no language-level type.  In which case it returns NULL, which
      we can see here.  */


[UPC 21/22] gcc.dg test suite

2015-11-30 Thread Gary Funck

Overview
--------

The test suite additions under gcc/testsuite/gcc.dg/gupc
test most of the "negative" front-end errors generated by GNU UPC.
These are compile-only tests and can be safely run as part of
the gcc.dg test suite.

There are also some tests which test code generation for 'gets'
and 'puts' to UPC shared memory.  These code generation tests
scan the '.original' tree dump for expected UPC runtime calls.

A new gupc.exp file is introduced under gcc/testsuite/gcc.dg/gupc.
It checks that the compiler supports the -fupc switch; if not then
tests will not be run.  gupc.exp sets the compilation flags
to -fno-upc-pre-include by default.  This removes any dependency upon
the libgupc runtime library, but requires that the tests declare
various runtime API's and variables exported by the UPC runtime library
that would have otherwise been declared in the gcc-upc.h file built
under libgupc/include.

2015-11-30  Gary Funck  

gcc/testsuite/gcc.dg/gupc/
* addr-of-shared-bit-field.upc: New.
* assign-local-ptr-to-pts.upc: New.
* assign-pts-to-local-ptr.upc: New.
* assign-pts-with-diff-block-factors-no-cast.upc: New.
* barrier-notify-wait.upc: New.
* block-factor-applied-to-void-type.upc: New.
* block-factor-incompatible-with-ref-type.upc: New.
* block-factor-not-int-constant.upc: New.
* cast-int-to-pts.upc: New.
* cast-local-ptr-to-pts.upc: New.
* cmp-pts-and-local-ptr.upc: New.
* cmp-pts-eq-diff-block-factor-1.upc: New.
* cmp-pts-eq-diff-block-factor-2.upc: New.
* cmp-pts-eq-diff-block-factor-3.upc: New.
* cmp-pts-gt-diff-block-factor-1.upc: New.
* cmp-pts-gt-diff-block-factor-2.upc: New.
* cmp-pts-gt-diff-block-factor-3.upc: New.
* decl-multiple-layout-quals.upc: New.
* deprecated-barrier-notify-stmt.upc: New.
* deprecated-barrier-stmt.upc: New.
* deprecated-barrier-wait-stmt.upc: New.
* diff-pts-and-local-ptr.upc: New.
* dyn-array-decl-threads-more-than-once.upc: New.
* dyn-array-dim-not-simple-multiple-of-threads.upc: New.
* dyn-star-layout-dim-not-multiple-of-threads.upc: New.
* dyn-threads-more-than-once.upc: New.
* dyn-threads-with-indef-block-size.upc: New.
* field-decl-with-shared-qual.upc: New.
* func-decl-has-shared_qual.upc: New.
* get-blk-relaxed.upc: New.
* get-blk-strict.upc: New.
* get-df-relaxed.upc: New.
* get-df-strict.upc: New.
* get-di-relaxed.upc: New.
* get-di-strict.upc: New.
* get-hi-relaxed.upc: New.
* get-hi-strict.upc: New.
* get-qi-relaxed.upc: New.
* get-qi-strict.upc: New.
* get-sf-relaxed.upc: New.
* get-sf-strict.upc: New.
* get-si-relaxed.upc: New.
* get-si-strict.upc: New.
* get-tf-relaxed.upc: New.
* get-tf-strict.upc: New.
* get-ti-relaxed.upc: New.
* get-ti-strict.upc: New.
* getaddr.upc: New.
* gupc.exp: New.  Compile all *.upc tests in this directory.
* init-makes-pts-from-int.upc: New.
* invalid-local-ptr-to-void-arith.upc: New.
* invalid-sizeof-shared-void.upc: New.
* invalid-sizeof-void.upc: New.
* lt-pts-and-local-ptr.upc: New.
* max-block-size-exceeded.upc: New.
* no-closing-layout-qual-bracket.upc: New.
* parm-decl-with-shared-qual.upc: New.
* passing-arg-makes-pts-from-int.upc: New.
* pts-to-void-in-arith.upc: New.
* put-blk-relaxed.upc: New.
* put-blk-strict.upc: New.
* put-df-relaxed.upc: New.
* put-df-strict.upc: New.
* put-di-relaxed.upc: New.
* put-di-strict.upc: New.
* put-hi-relaxed.upc: New.
* put-hi-strict.upc: New.
* put-qi-relaxed.upc: New.
* put-qi-strict.upc: New.
* put-sf-relaxed.upc: New.
* put-sf-strict.upc: New.
* put-si-relaxed.upc: New.
  

[UPC 04/22] Make, Config changes

2015-11-30 Thread Gary Funck

Overview
--------

UPC introduces a new runtime library, libgupc, and a new compiler driver, gupc.
These are defined in the top-level Makefile.def and Makefile.tpl files.

The top-level configure script will disable building the libgupc runtime
library on unsupported targets.  For builds where the target is the
same as the host, configure will check if "UPC linker scripts" can be
supported; this check can be overridden by the --enable-upc-link-script
switch.  The check runs a 'perl' script, so it is only performed if the
host has perl installed.

2015-11-30  Gary Funck  

* Makefile.def (libgupc):  New.  Define libgupc module.
* Makefile.in: Re-generate.
* Makefile.tpl (BUILD_EXPORTS, EXTRA_TARGET_FLAGS):
Add GUPC and GUPCFLAGS.
(BASE_TARGET_EXPORTS, EXTRA_HOST_FLAGS): Add GUPC.
(GUPC_FOR_BUILD, GUPCFLAGS,
GUPC_FOR_TARGET, GUPCFLAGS_FOR_TARGET): New.
* configure: Re-generate.
* configure.ac (target_libraries): Add target-libgupc.
Disable libgupc on unsupported systems.
Add check for 'gupc' as target tool.
(GUPC_FOR_BUILD): New.  Define 'gupc' as a target tool.
contrib/
* gcc_update (libgupc/aclocal.m4, libgupc/config.h.in,
libgupc/configure, libgupc/Makefile.in,
libgupc/testsuite/Makefile.in): New.  Define libgupc targets.
* update-copyright.py: Add libgupc library to copyright scan list.
(skip_extensions): Add .upc.
(GCCCopyright): Add external authors for
contributors to UPC-related additions.
gcc/
* config.in (HAVE_UPC_LINK_SCRIPT): New. Re-generate.
* configure: Re-generate.
* configure.ac (enable-upc-link-script): Add check for UPC
linker script support.
* Makefile.in (INFOFILES): Add doc/gupc.info.
(MANFILES): Add doc/gupc.1.
gcc/c/
* Make-lang.in (gupc): Add rules to build and install the
'gupc' executable.  Add rule to symlink 'upc' to 'gupc' executable.
* config-lang.in (gtfiles): Add UPC garbage collection
support files to gtfiles.

Index: Makefile.def
===
--- Makefile.def(.../trunk) (revision 231059)
+++ Makefile.def(.../branches/gupc) (revision 231080)
@@ -154,6 +154,7 @@ target_modules = { module= libbacktrace;
 target_modules = { module= libquadmath; };
 target_modules = { module= libgfortran; };
 target_modules = { module= libobjc; };
+target_modules = { module= libgupc; };
 target_modules = { module= libgo; };
 target_modules = { module= libtermcap; no_check=true;
missing=mostlyclean;
@@ -284,6 +285,8 @@ flags_to_pass = { flag= GCJ_FOR_TARGET ;
 flags_to_pass = { flag= GFORTRAN_FOR_TARGET ; };
 flags_to_pass = { flag= GOC_FOR_TARGET ; };
 flags_to_pass = { flag= GOCFLAGS_FOR_TARGET ; };
+flags_to_pass = { flag= GUPC_FOR_TARGET ; };
+flags_to_pass = { flag= GUPCFLAGS_FOR_TARGET ; };
 flags_to_pass = { flag= LD_FOR_TARGET ; };
 flags_to_pass = { flag= LIPO_FOR_TARGET ; };
 flags_to_pass = { flag= LDFLAGS_FOR_TARGET ; };
@@ -561,6 +564,8 @@ dependencies = { module=all-target-libja
 dependencies = { module=all-target-libjava; on=all-target-libffi; };
 dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
 dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
+dependencies = { module=all-target-libgupc; on=all-target-libbacktrace; };
+dependencies = { module=all-target-libgupc; on=all-target-libatomic; };
 dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
 dependencies = { module=configure-target-liboffloadmic; on=configure-target-libgomp; };
 dependencies = { module=configure-target-libsanitizer; on=all-target-libstdc++-v3; };
@@ -569,6 +574,9 @@ dependencies = { module=configure-target
 // generated by the libgomp configure.  Unfortunately, due to the use of
 //  recursive make, we can't be that specific.
 dependencies = 

[UPC 07/22] lowering, pointer-to-shared ops

2015-11-30 Thread Gary Funck

Overview
--------

The UPC lowering pass traverses the current function tree
and rewrites UPC related statements and operations into GENERIC.
The resulting GENERIC tree code will retain UPC pointers-to-shared (PTS)
types, but all operations such as 'get' and 'put' which indirect
through a pointer-to-shared have been lowered to use the internal
representation type.  Most of these operations on UPC pointers-to-shared
are implemented in c/c-upc-pts-ops.c.
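
To give a feel for what an "internal representation type" for a
pointer-to-shared can look like, here is a rough sketch.  The field widths
follow c-upc-pts.h in this patch for a target with a 64-bit 'long': 32-bit
phase, 32-bit thread, and a pointer-sized virtual address.  The field order,
the names pts_rep, pts_incr, pts_index and pts_demo, and the stepping rule
(derived from the UPC block-cyclic layout, not copied from c-upc-pts-ops.c)
are assumptions made for illustration.  Indefinite (zero) block sizes are
not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only; the field order is an assumption.  */
struct pts_rep
{
  char *vaddr;      /* address within the owning thread's shared space */
  uint32_t thread;  /* thread that has affinity to the element */
  uint32_t phase;   /* position of the element within its block */
};

/* Hypothetical helper: advance P by one element of ELEM_SIZE bytes in an
   array with block size BLOCK (> 0) spread over NTHREADS threads.  */
static struct pts_rep
pts_incr (struct pts_rep p, size_t elem_size, uint32_t block,
          uint32_t nthreads)
{
  if (++p.phase < block)
    {
      /* Still inside the same block on the same thread.  */
      p.vaddr += elem_size;
      return p;
    }
  /* Move to the start of the corresponding block on the next thread.  */
  p.phase = 0;
  p.vaddr -= (size_t) (block - 1) * elem_size;
  if (++p.thread == nthreads)
    {
      /* Wrapped around: continue with thread 0's next block.  */
      p.thread = 0;
      p.vaddr += (size_t) block * elem_size;
    }
  return p;
}

/* Closed form for element I, used to cross-check pts_incr().  */
static struct pts_rep
pts_index (char *base, size_t i, size_t elem_size, uint32_t block,
           uint32_t nthreads)
{
  size_t blk = i / block;
  struct pts_rep p;
  p.phase = (uint32_t) (i % block);
  p.thread = (uint32_t) (blk % nthreads);
  p.vaddr = base + ((blk / nthreads) * block + p.phase) * elem_size;
  return p;
}

/* Returns 1 if stepping agrees with the closed form for 48 elements of a
   'shared [4] int' style layout over 3 threads, otherwise 0.  */
static int
pts_demo (void)
{
  static char local[64 * sizeof (int)];
  struct pts_rep p = pts_index (local, 0, sizeof (int), 4, 3);
  for (size_t i = 0; i < 48; i++)
    {
      struct pts_rep q = pts_index (local, i, sizeof (int), 4, 3);
      if (p.vaddr != q.vaddr || p.thread != q.thread || p.phase != q.phase)
        return 0;
      p = pts_incr (p, sizeof (int), 4, 3);
    }
  return 1;
}
```

The sketch models only address arithmetic; the 'get' and 'put' operations
mentioned above would additionally call into the UPC runtime.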

The UPC lowering pass is implemented by upc_genericize() in
c/c-upc-low.c.  upc_genericize() is called from finish_function()
in c/c-decl.c. It is called just prior to calling c_genericize(),
if -fupc has been asserted.

The file c/c-upc-rts-names.h defines the names of the UPC runtime
entry points and variables that implement the runtime ABI.
To date, there has been no need to implement target dependent names,
perhaps partly because UPC is supported primarily on POSIX-compliant targets.

UPC requires some special logic for handling file scoped initializations.
This is due to the fact that UPC shared addresses are not known
until runtime and therefore cannot be statically initialized
in the usual way.  For example, 'addr_x' below must be initialized
at runtime.

  shared int x;
  shared int *addr_x = &x;

The routine, upc_check_decl_init(), checks an initialization
statement to determine if it needs special handling.
It is called from store_init_value().  If an initialization
refers to UPC-related constructs that require initialization
at runtime, then upc_decl_init() is called to save the
initialization statement on a list.  This list is
processed by upc_write_global_declarations(), which
is called via a UPC-specific language hook from
c_common_parse_file(), just after calling c_parse_file().
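
The "save on a list, run later" scheme described above can be sketched in
plain C as follows.  This is only an analogy for the compile-time mechanism,
not code from the patch: defer_init(), run_deferred_inits(), shared_base and
demo_deferred_init() are hypothetical names, and an ordinary array stands in
for UPC shared memory whose address is not known until startup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list of initializers that must wait until runtime.  */
typedef void (*init_fn) (void);

#define MAX_INITS 16
static init_fn pending[MAX_INITS];
static size_t n_pending;

/* Record an initializer; returns 0 on success, -1 if the list is full.  */
static int
defer_init (init_fn fn)
{
  if (n_pending == MAX_INITS)
    return -1;
  pending[n_pending++] = fn;
  return 0;
}

/* Run the recorded initializers in order, then empty the list.  */
static void
run_deferred_inits (void)
{
  for (size_t i = 0; i < n_pending; i++)
    pending[i] ();
  n_pending = 0;
}

/* Stand-ins for 'shared int x; shared int *addr_x = &x;'.  The base of
   the shared region is only assigned at startup.  */
static int *shared_base;
static int *addr_x;

static void
init_addr_x (void)
{
  addr_x = shared_base;   /* x is assumed to live at offset 0 */
}

/* Returns 1 if addr_x is unset before, and correct after, the deferred
   initializers have run.  */
static int
demo_deferred_init (void)
{
  static int region[4];

  if (defer_init (init_addr_x) != 0)
    return 0;
  if (addr_x != NULL)
    return 0;               /* nothing may be initialized yet */

  shared_base = region;     /* "runtime" learns the shared address */
  run_deferred_inits ();
  return addr_x == &region[0];
}
```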


2015-11-30  Gary Funck  

gcc/c-family/
* c-upc-pts.h: New.  Define the sizes and types of fields
in the UPC pointer-to-shared representation.
gcc/c/
* c-upc-low.c: New.  Lower UPC constructs to GENERIC.
* c-upc-low.h: New.  Prototypes for c-upc-low.c.
* c-upc-pts-ops.c: New. Implement UPC pointer-to-shared operations.
* c-upc-pts-ops.h: New. Prototypes for c-upc-pts-ops.c.
* c-upc-rts-names.h: New.  Names of some functions in the UPC runtime.

Index: gcc/c-family/c-upc-pts.h
===
--- gcc/c-family/c-upc-pts.h(.../trunk) (revision 0)
+++ gcc/c-family/c-upc-pts.h(.../branches/gupc) (revision 231080)
@@ -0,0 +1,40 @@
+/* Define UPC pointer-to-shared representation characteristics.
+   Copyright (C) 2008-2015 Free Software Foundation, Inc.
+   Contributed by Gary Funck 
+ and Nenad Vukicevic .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef GCC_C_FAMILY_UPC_PTS_H
+#define GCC_C_FAMILY_UPC_PTS_H 1
+
+#define UPC_PTS_SIZE        (LONG_TYPE_SIZE + POINTER_SIZE)
+#define UPC_PTS_PHASE_SIZE  (LONG_TYPE_SIZE / 2)
+#define UPC_PTS_THREAD_SIZE (LONG_TYPE_SIZE / 2)
+#define UPC_PTS_VADDR_SIZE  POINTER_SIZE
+#define UPC_PTS_PHASE_TYPE  ((LONG_TYPE_SIZE == 64) \
+   ? "uint32_t" : "uint16_t")
+#define UPC_PTS_THREAD_TYPE ((LONG_TYPE_SIZE == 64) \
+   ? "uint32_t" : "uint16_t")
+#define UPC_PTS_VADDR_TYPE  "char *"
+
+#define UPC_MAX_THREADS (1 << (((UPC_PTS_THREAD_SIZE) < 30) \
+? (UPC_PTS_THRE

[UPC 11/22] documentation

2015-11-30 Thread Gary Funck

Overview
--------

For UPC, a new gupc.texi file is introduced to describe the
stand-alone 'gupc' command, which is a driver similar to gfortran
that will invoke 'gcc', asserting the -fupc switch and will
compile any .c files on the command line as if they were .upc files.
In addition, it describes how to run UPC programs, along with details
on the command line switches processed by the UPC runtime.

2015-11-30  Gary Funck  

gcc/doc/
* gupc.texi: New.
* install.texi (disable-libgupc, enable-upc-link-script):
New. Describe UPC-specific configure options.
* invoke.texi (fupc, fupc-threads, fupc-pthreads-model-tls,
fupc-inline-lib, fupc-pre-include, fupc-debug, dwarf-2-upc,
fupc-instrument, fupc-instrument-functions):
New. Describe UPC-specific compiler options.
* passes.texi: Describe the UPC lowering pass.
* sourcebuild.texi (libgupc): Add libgupc to list of libraries.
Also make note that target support for UPC is enabled via -fupc.
* tm.texi: Re-generate.
* tm.texi.in (TARGET_UPC_LINK_SCRIPT_P,
TARGET_UPC_SHARED_SECTION_NAME, TARGET_UPC_PGM_INFO_SECTION_NAME,
TARGET_UPC_INIT_ARRAY_SECTION_NAME): Refer to new UPC target hooks.
libgupc/
* libgupc.texi: New.

Index: gcc/doc/gupc.texi
===
--- gcc/doc/gupc.texi   (.../trunk) (revision 0)
+++ gcc/doc/gupc.texi   (.../branches/gupc) (revision 231080)
@@ -0,0 +1,394 @@
+\input texinfo @c -*-texinfo-*-
+@setfilename gupc
+@settitle GNU project UPC compiler
+
+@c Merge the standard indexes into a single one.
+@syncodeindex fn cp
+@syncodeindex vr cp
+@syncodeindex ky cp
+@syncodeindex pg cp
+@syncodeindex tp cp
+
+@include gcc-common.texi
+
+@c Copyright (C) 2001-2015 Free Software Foundation, Inc.
+@c Contributed by Gary Funck 
+@c   and Nenad Vukicevic .
+@c Based on original implementation
+@c   by Jesse M. Draper 
+@c   and William W. Carlson .
+
+@copying
+@c man begin COPYRIGHT
+Copyright @copyright{} 2001-2015 Free Software Foundation, Inc.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``Funding
+Free Software'', the Front-Cover texts being (a) (see below), and with
+the Back-Cover Texts being (b) (see below).  A copy of the license is
+included in the
+@c man end
+section entitled ``GNU Free Documentation License''.
+@ignore
+@c man begin COPYRIGHT
+man page gfdl(7).
+@c man end
+@end ignore
+@c man begin COPYRIGHT
+
+(a) The FSF's Front-Cover Text is:
+
+ A GNU Manual
+
+(b) The FSF's Back-Cover Text is:
+
+ You have freedom to copy and modify this GNU Manual, like GNU
+ software.  Copies published by the Free Software Foundation raise
+ funds for GNU development.
+@c man end
+@end copying
+@c Set file name and title for the man page.
+
+@ifinfo
+@dircategory Software development
+@direntry
+* GNU UPC: (gupc).   A GCC-based compiler for the UPC language
+@end direntry
+
+@insertcopying
+@end ifinfo
+
+@titlepage
+@title The GNU UPC Compiler
+@versionsubtitle
+@author Gary Funck and Nenad Vukicevic
+
+@page
+@vskip 0pt plus 1filll
+Published by the Free Software Foundation @*
+51 Franklin Street, Fifth Floor@*
+Boston, MA 02110-1301, USA@*
+@sp 1
+@insertcopying
+@end titlepage
+@contents
+@page
+
+@node Top
+@chapter @command{gupc}--- UPC compiler for parallel computers
+
+@command{gupc} provides a compilation and execution environment for
+programs written in the UPC (Unified Parallel C) language.
+
+@menu
+* GUPC Intro:: Introduction to gupc.
+* Threads::Number of Execution Threads.
+* Invoking GUPC::  How to use gupc.
+* GUPC Options::   GUPC Options.
+* See also::   References.
+* Contributors::   GUPC Contributors.
+* Index::   

[UPC 16/22] gimple/gimplify changes

2015-11-30 Thread Gary Funck

Overview


In gimple-expr.c, logic is added to useless_type_conversion_p() to
handle conversions involving UPC pointers-to-shared.
lang_hooks.types_compatible_p() is called to check conversions
between UPC pointers-to-shared.  This will in turn call c_types_compatible_p()
which will call upc_types_compatible_p() if -fupc is asserted.

The hook is needed here because the gimple-related routines are
defined at the top-level of the GCC tree and can be linked with
other front-ends.
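
As an illustration only, the pointer-conversion decision described
above can be modelled in stand-alone C.  The toy_* names and the
struct below are hypothetical stand-ins, not GCC trees or the real
hook signature; the function returns true when a conversion must be
retained, which is the inverse of what useless_type_conversion_p()
returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pointer's target type; not a GCC tree.  */
struct toy_type
{
  bool shared;       /* Target type is UPC shared-qualified.  */
  int block_factor;  /* Layout qualifier of the target type.  */
};

/* Hypothetical stand-in for lang_hooks.types_compatible_p.  */
typedef bool (*toy_compat_hook) (const struct toy_type *,
                                 const struct toy_type *);

/* A UPC-aware front end would supply a check along these lines.  */
static bool
toy_upc_types_compatible_p (const struct toy_type *a,
                            const struct toy_type *b)
{
  return a->shared == b->shared && a->block_factor == b->block_factor;
}

/* Return true if the pointer conversion from INNER to OUTER must be
   kept.  Common code only sees the hook, never the front end.  */
static bool
toy_must_retain_conversion (const struct toy_type *outer,
                            const struct toy_type *inner,
                            toy_compat_hook compatible_p)
{
  assert (compatible_p != NULL);
  /* Shared pointer to regular C pointer: always keep the conversion.  */
  if (!outer->shared && inner->shared)
    return true;
  /* Two shared pointers: keep it unless the front end says compatible.  */
  if (outer->shared && inner->shared && !compatible_p (inner, outer))
    return true;
  return false;
}
```

Passing the compatibility check as a function pointer mirrors the
reason given above for using a hook: the caller can be linked with a
front end that knows nothing about UPC.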

In gimplify.c, flag_instrument_functions_exclude_p() is exported
as an external function rather than being defined as a static function.
It is called from upc_genericize_function() defined in c/c-upc-low.c,
when -fupc-instrument-functions is asserted.

2015-11-30  Gary Funck  

gcc/
* gimple-expr.c: #include "langhooks.h".
(useless_type_conversion_p): Retain conversions from a UPC
pointer-to-shared to a regular C pointer.
Retain conversions between incompatible UPC pointers-to-shared.
Call lang_hooks.types_compatible_p() to check type
compatibility between UPC pointers-to-shared.
* gimplify.c (flag_instrument_functions_exclude_p): Make it into
an external function.
* gimplify.h (flag_instrument_functions_exclude_p): New prototype.

Index: gcc/gimple-expr.c
===
--- gcc/gimple-expr.c   (.../trunk) (revision 231059)
+++ gcc/gimple-expr.c   (.../branches/gupc) (revision 231080)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-ssa.h"
 #include "fold-const.h"
 #include "tree-eh.h"
+#include "langhooks.h"
 #include "gimplify.h"
 #include "stor-layout.h"
 #include "demangle.h"
@@ -67,6 +68,19 @@ useless_type_conversion_p (tree outer_ty
   if (POINTER_TYPE_P (inner_type)
   && POINTER_TYPE_P (outer_type))
 {
+  int i_shared = SHARED_TYPE_P (TREE_TYPE (inner_type));
+  int o_shared = SHARED_TYPE_P (TREE_TYPE (outer_type));
+
+  /* Retain conversions from a UPC shared pointer to
+ a regular C pointer.  */
+  if (!o_shared && i_shared)
+return false;
+
+  /* Retain conversions between incompatible UPC shared pointers.  */
+  if (o_shared && i_shared
+ && !lang_hooks.types_compatible_p (inner_type, outer_type))
+return false;
+
   /* Do not lose casts between pointers to different address spaces.  */
   if (TYPE_ADDR_SPACE (TREE_TYPE (outer_type))
  != TYPE_ADDR_SPACE (TREE_TYPE (inner_type)))
Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (.../trunk) (revision 231059)
+++ gcc/gimplify.c  (.../branches/gupc) (revision 231080)
@@ -11269,7 +11269,7 @@ typedef char *char_p; /* For DEF_VEC_P.
 
 /* Return whether we should exclude FNDECL from instrumentation.  */
 
-static bool
+bool
 flag_instrument_functions_exclude_p (tree fndecl)
 {
   vec<char_p> *v;
Index: gcc/gimplify.h
===
--- gcc/gimplify.h  (.../trunk) (revision 231059)
+++ gcc/gimplify.h  (.../branches/gupc) (revision 231080)
@@ -77,6 +77,7 @@ extern enum gimplify_status gimplify_exp
 extern void gimplify_type_sizes (tree, gimple_seq *);
 extern void gimplify_one_sizepos (tree *, gimple_seq *);
 extern gbind *gimplify_body (tree, bool);
+extern bool flag_instrument_functions_exclude_p (tree);
 extern enum gimplify_status gimplify_arg (tree *, gimple_seq *, location_t);
 extern void gimplify_function_tree (tree);
 extern enum gimplify_status gimplify_va_arg_expr (tree *, gimple_seq *,


[UPC 17/22] misc/common changes

2015-11-30 Thread Gary Funck

Overview


Given that UPC pointers-to-shared (PTS's) have special arithmetic rules
and their internal representation is a structure with
three separate fields, they are not meaningfully convertible to integers
and pointer arithmetic involving PTS's cannot be optimized in
the same fashion as normal "C" pointer arithmetic.  Further,
the representation of a NULL pointer-to-shared is different from
a "C" null pointer.  Logic has been added to convert.c and dojump.c
to handle operations involving UPC PTS's.  In function.c,
UPC pointers-to-shared which have an internal representation that
is a 'struct' are treated as aggregates.  Also in function.c
logic is added that prevents marking them as potential
pointer register values.
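
To make the "three separate fields" point concrete, here is a minimal
sketch in stand-alone C.  The field names, their widths, and the
all-zero null value are assumptions made for illustration; they are
not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical three-field layout; names and widths are assumptions.  */
struct toy_pts
{
  uint64_t vaddr;   /* Offset within the shared data area.  */
  uint32_t thread;  /* Thread that has affinity to the object.  */
  uint32_t phase;   /* Position within the current block.  */
};

/* The null pointer-to-shared is a struct value, not the integer 0.
   All-zero is assumed here purely for the sake of the example.  */
static const struct toy_pts toy_null_pts = { 0, 0, 0 };

/* Null test: every field is compared; no integer conversion occurs.  */
static bool
toy_pts_is_null (struct toy_pts p)
{
  return p.vaddr == toy_null_pts.vaddr
         && p.thread == toy_null_pts.thread
         && p.phase == toy_null_pts.phase;
}

/* Equality is field-wise as well.  */
static bool
toy_pts_equal (struct toy_pts a, struct toy_pts b)
{
  return a.vaddr == b.vaddr && a.thread == b.thread && a.phase == b.phase;
}
```

A value such as { 0, 1, 0 } has a zero address field but is not null,
which is why a test against integer zero cannot stand in for the null
check.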

In varasm.c, a check is added for the linker section used by
UPC to coalesce file scoped UPC shared variables.  This section
is used only to assign offsets into UPC's shared data area for
the UPC shared variables.  When UPC linker scripts are supported,
this shared section is not loaded and has an origin of 0.

2015-11-30  Gary Funck  

gcc/
* convert.c (convert_to_pointer): Add check for null
UPC pointer-to-shared.
(convert_to_integer): Do not optimize pointer
subtraction for UPC pointers-to-shared.
(convert_to_integer): Issue error for an attempt
to convert a UPC pointer-to-shared to an integer.
* dojump.c (do_jump): If a UPC pointer-to-shared conversion
can change representation, it must be compared in the result type.
* function.c (aggregate_value_p): Handle 'struct' pointer-to-shared
values as an aggregate when passing them as a return value.
(assign_parm_setup_reg): Do not target UPC pointers-to-shared that are
represented as a 'struct' into a pointer register.
* varasm.c (default_section_type_flags): Handle UPC's shared
section as BSS, and if a UPC link script is supported,
make it a non-loadable, read-only section.

Index: gcc/convert.c
===
--- gcc/convert.c   (.../trunk) (revision 231059)
+++ gcc/convert.c   (.../branches/gupc) (revision 231080)
@@ -53,6 +53,14 @@ convert_to_pointer_1 (tree type, tree ex
   if (TREE_TYPE (expr) == type)
 return expr;
 
+  if (integer_zerop (expr) && POINTER_TYPE_P (type)
+  && SHARED_TYPE_P (TREE_TYPE (type)))
+{
+  expr = copy_node (upc_null_pts_node);
+  TREE_TYPE (expr) = build_unshared_type (type);
+  return expr;
+}
+
   switch (TREE_CODE (TREE_TYPE (expr)))
 {
 case POINTER_TYPE:
@@ -437,6 +445,16 @@ convert_to_integer_1 (tree type, tree ex
   return error_mark_node;
 }
 
+  /* Can't optimize the conversion of UPC shared pointer difference.  */
+  if (ex_form == MINUS_EXPR
+  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 0)))
+  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 1)))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (TREE_OPERAND (expr, 0))))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (TREE_OPERAND (expr, 1)))))
+  {
+  return build1 (CONVERT_EXPR, type, expr);
+  }
+
   if (ex_form == COMPOUND_EXPR)
 {
   tree t = convert_to_integer_1 (type, TREE_OPERAND (expr, 1), dofold);
@@ -581,6 +599,12 @@ convert_to_integer_1 (tree type, tree ex
 {
 case POINTER_TYPE:
 case REFERENCE_TYPE:
+  if (SHARED_TYPE_P (TREE_TYPE (intype)))
+{
+  error ("invalid conversion from a UPC pointer-to-shared "
+"to an integer");
+ expr = integer_zero_node;
+}
   if (integer_zerop (expr))
return build_int_cst (type, 0);
 
Index: gcc/dojump.c
===
--- gcc/dojump.c(.../trunk) (revision 231059)
+++ gcc/dojump.c(.../branches/gupc) (revision 231080)
@@ -468,6 +468,10 @@ do_jump (tree exp, rtx_code_label *if_fa
< TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (exp, 0)
 goto 

[UPC 12/22] DWARF support

2015-11-30 Thread Gary Funck

Overview


The DWARF4 specification defines extensions which add support for UPC.
See: http://dwarfstd.org/doc/DWARF4.pdf for details.  These extensions
are defined in /include/dwarf2.def.  The patch below
implements UPC debugging support.  This support is enabled via
the -dwarf-2-upc compilation switch.  It is not enabled by default,
because some older versions of GDB will abort when encountering
the UPC-related DWARF extensions.

A few years back, we added support to GDB for UPC, though that
support was experimental and not pushed back into the mainline.
A couple of commercial parallel debuggers implemented support
for GNU UPC, utilizing these DWARF extensions.
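
The block-factor encoding spelled out in the modified_type_die()
comment in the patch below can be summarised by a small stand-alone
C function.  This is a sketch only: struct toy_shared_type and
toy_upc_dw_at_count() are hypothetical names, and the block factor is
taken as a plain element count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a shared-qualified type's layout qualifier.  */
struct toy_shared_type
{
  bool has_block_factor;  /* False for plain "shared int x;".  */
  long block_factor;      /* Only meaningful if has_block_factor.  */
};

/* Decide whether DW_AT_count is emitted and with what value.
   Returns true and sets *COUNT if the attribute should be present.  */
static bool
toy_upc_dw_at_count (const struct toy_shared_type *t, long *count)
{
  long block_factor = 1;  /* Default: "shared int x;".  */

  assert (t != NULL && count != NULL);
  if (t->has_block_factor)
    block_factor = t->block_factor;
  if (block_factor == 0)  /* "shared [] int *p;": no attribute.  */
    return false;
  *count = block_factor;
  return true;
}
```

With this model, "shared int x;" yields a count of 1,
"shared [10] int x[50];" yields 10, and "shared [] int *p;" yields no
DW_AT_count attribute at all.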

2015-11-30  Gary Funck  

gcc/
* dwarf2out.c (modified_type_die): If the type is shared qualified,
generate UPC debugging information as defined in
the DWARF4 specification.
(add_subscript_info): If the array index is "THREADS scaled",
add the DW_AT_upc_threads_scaled attribute to the subrange DIE.
(gen_compile_unit_die): If -fupc is asserted,
set the language to DW_LANG_Upc.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (.../trunk) (revision 231059)
+++ gcc/dwarf2out.c (.../branches/gupc) (revision 231080)
@@ -10899,6 +10899,50 @@ modified_type_die (tree type, int cv_qua
mod_type_die = d;
  }
 }
+  else if (use_upc_dwarf2_extensions
+   && (cv_quals & TYPE_QUAL_SHARED))
+{
+  HOST_WIDE_INT block_factor = 1;
+
+  /* Inside the compiler,
+ "shared int x;" TYPE_BLOCK_FACTOR is null.
+ "shared [] int *p;" TYPE_BLOCK_FACTOR is zero.
+ "shared [10] int x[50];" TYPE_BLOCK_FACTOR is 10 * bitsize(int)
+ The DWARF2 encoding is as follows:
+ "shared int x;"  DW_AT_count: 1
+ "shared [] int *p;" 
+ "shared [10] int x[50];" DW_AT_count: 10
+ The logic below handles these various contingencies.  */
+
+  mod_type_die = new_die (DW_TAG_upc_shared_type,
+  comp_unit_die (), type);
+
+  if (TYPE_HAS_BLOCK_FACTOR (type))
+block_factor = TREE_INT_CST_LOW (TYPE_BLOCK_FACTOR (type));
+
+  if (block_factor != 0)
+add_AT_unsigned (mod_type_die, DW_AT_count, block_factor);
+
+  sub_die = modified_type_die (type,
+   cv_quals & ~TYPE_QUAL_SHARED,
+   context_die);
+}
+  else if (use_upc_dwarf2_extensions && cv_quals & TYPE_QUAL_STRICT)
+{
+  mod_type_die = new_die (DW_TAG_upc_strict_type,
+  comp_unit_die (), type);
+  sub_die = modified_type_die (type,
+   cv_quals & ~TYPE_QUAL_STRICT,
+   context_die);
+}
+  else if (use_upc_dwarf2_extensions && cv_quals & TYPE_QUAL_RELAXED)
+{
+  mod_type_die = new_die (DW_TAG_upc_relaxed_type,
+  comp_unit_die (), type);
+  sub_die = modified_type_die (type,
+   cv_quals & ~TYPE_QUAL_RELAXED,
+   context_die);
+}
   else if (code == POINTER_TYPE || code == REFERENCE_TYPE)
 {
   dwarf_tag tag = DW_TAG_pointer_type;
@@ -16992,6 +17036,12 @@ add_subscript_info (dw_die_ref type_die,
   if (!subrange_die)
subrange_die = new_die (DW_TAG_subrange_type, type_die, NULL);
 
+
+  if (use_upc_dwarf2_extensions && TYPE_HAS_THREADS_FACTOR (type))
+{
+ add_AT_flag (subrange_die, DW_AT_upc_threads_scaled, 1);
+   }
+
   if (domain)
{
  /* We have an array type with specified bounds.  */
@@ -20279,6 +20329,10 @@ gen_compile_unit_die (const char *filena
  if (dwarf_version >= 5 /* || !dwarf_strict */)
if (strcmp (language_string, "GNU C11") == 0)
  language = DW_LANG_C11;
+
+  if (use_upc_dwarf2_extensions && flag_upc)
+language = DW_LANG_Upc;
+
   

[UPC 18/22] libatomic changes

2015-11-30 Thread Gary Funck

Overview


The UPC language specification defines atomic operations on
UPC shared data, implemented by a set of library routines.

The UPC runtime library, targeting SMP (symmetric multiprocessor) systems,
uses GCC builtin atomic operations to implement atomic operations
on UPC shared values.  GCC's builtin atomic operations use libatomic
to handle various situations where direct hardware support is unavailable.
During testing, we noticed that when some operations or types are
unsupported, the library calls an internal lock routine, and
this lock routine calls pthread_mutex().  That doesn't work well for
UPC because by default a UPC "thread" maps to an OS process.
We discussed this issue in this bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60790.

To work around this locking issue, we build a statically linked
"convenience" library, libatomic_convenience_no_lock.a.
This is the same as the libatomic_convenience library built for libgo,
except it doesn't include lock.o.  In libgupc/smp, the source
file upc_libat_lock.c defines the same entry points as lock.c,
but implements them using a spin lock.
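
For readers unfamiliar with the technique, a spin lock of the general
kind mentioned above can be sketched with C11 atomics as follows.
This is not the code in upc_libat_lock.c; the toy_* names are
hypothetical, and the entry points that lock.c actually exports are
not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word.  In a multi-process runtime it would have to
   live in memory visible to all processes, e.g. an mmap'ed region.  */
typedef struct
{
  atomic_flag busy;
} toy_spin_lock;

#define TOY_SPIN_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; return true if the lock was acquired.  */
static bool
toy_spin_trylock (toy_spin_lock *l)
{
  return !atomic_flag_test_and_set_explicit (&l->busy,
                                             memory_order_acquire);
}

/* Spin until the lock is acquired.  */
static void
toy_spin_lock_acquire (toy_spin_lock *l)
{
  while (!toy_spin_trylock (l))
    ;  /* Busy-wait; a real implementation would likely back off.  */
}

/* Release the lock so that another waiter can take it.  */
static void
toy_spin_unlock (toy_spin_lock *l)
{
  atomic_flag_clear_explicit (&l->busy, memory_order_release);
}
```

Because the lock state is a single word manipulated only with atomic
operations, it does not depend on any per-process mutex object.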

2015-11-30  Gary Funck  

libatomic/
* Makefile.am (LIBAT_SRC_NO_LOCK, libatomic_convenience_no_lock*):
New.  Add rules to build libatomic_convenience_no_lock.a,
used by libgupc.
* Makefile.in: Re-generate.

Index: libatomic/Makefile.am
===
--- libatomic/Makefile.am   (.../trunk) (revision 231059)
+++ libatomic/Makefile.am   (.../branches/gupc) (revision 231080)
@@ -40,7 +40,8 @@ AM_CCASFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 
 toolexeclib_LTLIBRARIES = libatomic.la
-noinst_LTLIBRARIES = libatomic_convenience.la
+noinst_LTLIBRARIES = libatomic_convenience.la \
+ libatomic_convenience_no_lock.la 
 
 if LIBAT_BUILD_VERSIONED_SHLIB
 if LIBAT_BUILD_VERSIONED_SHLIB_GNU
@@ -67,8 +68,9 @@ endif
 libatomic_version_info = -version-info $(libtool_VERSION)
 
 libatomic_la_LDFLAGS = $(libatomic_version_info) $(libatomic_version_script) $(lt_host_flags)
-libatomic_la_SOURCES = gload.c gstore.c gcas.c gexch.c glfree.c lock.c init.c \
+LIBAT_SRC_NO_LOCK = gload.c gstore.c gcas.c gexch.c glfree.c init.c \
fenv.c fence.c flag.c
+libatomic_la_SOURCES = $(LIBAT_SRC_NO_LOCK) lock.c
 
 SIZEOBJS = load store cas exch fadd fsub fand fior fxor fnand tas
 SIZES = @SIZES@
@@ -139,3 +141,9 @@ endif
 
 libatomic_convenience_la_SOURCES = $(libatomic_la_SOURCES)
 libatomic_convenience_la_LIBADD = $(libatomic_la_LIBADD)
+
+# The "no lock" convenience library is used by libgupc to
+# avoid lock.c's use of pthread_mutex, which won't work
+# for processes using atomics on shared memory.
+libatomic_convenience_no_lock_la_SOURCES = $(LIBAT_SRC_NO_LOCK)
+libatomic_convenience_no_lock_la_LIBADD = $(libatomic_la_LIBADD)


[UPC 14/22] constant folding changes

2015-11-30 Thread Gary Funck

Overview


UPC pointers-to-shared (aka shared pointers) are not interchangeable
with integers as they are in regular "C".  Therefore, addition
and subtraction operations which involve UPC shared pointers
should not be further simplified.
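
A small stand-alone C model may help show why integer-style folding
is unsafe here.  It assumes the usual UPC block-cyclic distribution
and a hypothetical three-field pointer layout; toy_pts_loc and
toy_pts_for_index() are illustrative names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer-to-shared fields; the layout is an assumption.  */
struct toy_pts_loc
{
  size_t vaddr;     /* Byte offset within one thread's shared area.  */
  unsigned thread;  /* Thread with affinity to the element.  */
  unsigned phase;   /* Element position within its block.  */
};

/* Locate element INDEX of a shared array with BLOCK elements per block
   and ELEM_SIZE bytes per element, distributed over THREADS threads.  */
static struct toy_pts_loc
toy_pts_for_index (size_t index, size_t block, size_t elem_size,
                   unsigned threads)
{
  struct toy_pts_loc p;
  size_t block_nr;

  assert (block > 0 && threads > 0);
  block_nr = index / block;
  p.phase = (unsigned) (index % block);
  p.thread = (unsigned) (block_nr % threads);
  /* Blocks owned by the same thread are stored back to back.  */
  p.vaddr = ((block_nr / threads) * block + p.phase) * elem_size;
  return p;
}
```

With a block size of 2, 4-byte elements and 2 threads, elements 0 and
2 both have an address field of 0 (on threads 0 and 1), and elements
0 and 4 are only 8 bytes apart.  Dividing an address difference by the
element size therefore does not give the element distance, so
rewriting such expressions as integer arithmetic would change their
meaning.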

2015-11-30  Gary Funck  

gcc/
* fold-const.c (fold_unary_loc): Do not perform this simplification
if either of the types are UPC pointer-to-shared types.
(fold_binary_loc): Disable optimizations involving UPC
pointers-to-shared because integers are not interoperable
with UPC pointers-to-shared.
* match.pd: Do not simplify POINTER_PLUS operations which
involve UPC pointers-to-shared.  Do not simplify integral
conversions involving UPC pointers-to-shared.  For a chain
of two conversions, do not simplify conversions involving
UPC pointers-to-shared unless they meet specific criteria.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(.../trunk) (revision 231059)
+++ gcc/fold-const.c(.../branches/gupc) (revision 231080)
@@ -7805,10 +7805,16 @@ fold_unary_loc (location_t loc, enum tre
 
   /* Convert (T1)(X p+ Y) into ((T1)X p+ Y), for pointer type, when the new
 cast (T1)X will fold away.  We assume that this happens when X itself
-is a cast.  */
+is a cast.
+
+Do not perform this simplification if either of the types 
+are UPC pointer-to-shared types.  */
   if (POINTER_TYPE_P (type)
  && TREE_CODE (arg0) == POINTER_PLUS_EXPR
- && CONVERT_EXPR_P (TREE_OPERAND (arg0, 0)))
+ && CONVERT_EXPR_P (TREE_OPERAND (arg0, 0))
+ && !SHARED_TYPE_P (TREE_TYPE (type))
+ && !SHARED_TYPE_P (TREE_TYPE (
+  TREE_TYPE (TREE_OPERAND (arg0, 0)))))
{
  tree arg00 = TREE_OPERAND (arg0, 0);
  tree arg01 = TREE_OPERAND (arg0, 1);
@@ -9271,6 +9277,14 @@ fold_binary_loc (location_t loc,
   return NULL_TREE;
 
 case PLUS_EXPR:
+  /* Disable further optimizations involving UPC shared pointers,
+ because integers are not interoperable with shared pointers.  */
+  if ((TREE_TYPE (arg0) && POINTER_TYPE_P (TREE_TYPE (arg0))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (arg0))))
+ || (TREE_TYPE (arg1) && POINTER_TYPE_P (TREE_TYPE (arg1))
+ && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (arg1)))))
+return NULL_TREE;
+
   if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
{
  /* X + (X / CST) * -CST is X % CST.  */
@@ -9679,6 +9693,16 @@ fold_binary_loc (location_t loc,
   return NULL_TREE;
 
 case MINUS_EXPR:
+
+  /* Disable further optimizations involving UPC shared pointers,
+ because integers are not interoperable with shared pointers.
+(The test below also detects pointer difference between
+shared pointers, which cannot be folded.)  */
+
+  if (TREE_TYPE (arg0) && POINTER_TYPE_P (TREE_TYPE (arg0))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (arg0))))
+return NULL_TREE;
+
   /* (-A) - B -> (-B) - A  where B is easily negated and we can swap.  */
   if (TREE_CODE (arg0) == NEGATE_EXPR
  && negate_expr_p (op1)
Index: gcc/match.pd
===
--- gcc/match.pd(.../trunk) (revision 231059)
+++ gcc/match.pd(.../branches/gupc) (revision 231080)
@@ -931,10 +931,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (!fail && wi::bit_and (@1, zero_mask_not) == 0)
 (inner_op @2 { wide_int_to_tree (type, cst_emit); }))
 
-/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
-(simplify
-  (pointer_plus (pointer_plus:s @0 @1) @3)
-  (pointer_plus @0 (plus @1 @3)))
+/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).
+   (Do not apply this simplification to UPC pointers-to-shared
+   because they are not directly convertible 

[UPC 03/22] options processing, driver

2015-11-30 Thread Gary Funck

Overview


UPC language support requires some extensions to the GCC driver.
Most of the new UPC-specific spec's will be triggered by the presence
of -fupc on the gcc command line.  Further, -fupc will be asserted
when source files ending in .upc are compiled.
The linker spec, LINK_COMMAND_SPEC, is extended to bring in
UPC start/end files and to link with libgupc when -fupc is asserted.

Some new UPC-specific command line options are defined in c.opt.
These new UPC-specific options will only have an effect when -fupc
is asserted, and will be detected as an error otherwise.

c_common_parse_file() will call the UPC-specific language hook,
lang_hooks.upc.write_global_declarations() just after
parsing the source file via c_parse_file().  It is called
there to generate an initialization routine
within the scope of the current compilation unit.
This initialization routine will initialize file scope
UPC shared variables, and initialize pointers-to-shared
as needed.  The address of this initialization routine 
is placed in a special linker section named by
targetm.upc.init_array_section_name().
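
The general "table of routine addresses in a named section" technique
can be sketched in stand-alone C as shown below.  This assumes GCC
and GNU ld on an ELF target, where the linker provides __start_ and
__stop_ symbols for sections whose names are valid C identifiers.
The section name "toy_init_array" and all toy_* names are made up;
how libgupc itself locates and walks its section is not shown in this
message.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_init_fn) (void);

static int toy_init_calls;

/* Stand-ins for per-compilation-unit initialization routines.  */
static void toy_init_unit_a (void) { toy_init_calls++; }
static void toy_init_unit_b (void) { toy_init_calls++; }

/* Each unit places the address of its routine in the named section.
   "toy_init_array" is a made-up section name.  */
static toy_init_fn toy_entry_a
  __attribute__ ((section ("toy_init_array"), used)) = toy_init_unit_a;
static toy_init_fn toy_entry_b
  __attribute__ ((section ("toy_init_array"), used)) = toy_init_unit_b;

/* Section bounds provided by GNU ld (an assumption about the linker).  */
extern toy_init_fn __start_toy_init_array[];
extern toy_init_fn __stop_toy_init_array[];

/* Start-up code walks the section and calls every registered routine.
   Returns the number of routines called.  */
static size_t
toy_run_init_array (void)
{
  size_t n = 0;
  toy_init_fn *fn;

  for (fn = __start_toy_init_array; fn < __stop_toy_init_array; fn++)
    {
      assert (*fn != NULL);
      (*fn) ();
      n++;
    }
  return n;
}
```

The linker concatenates the contributions from all object files, so
start-up code can run every unit's routine without knowing their
names in advance.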

A new UPC-specific driver program called 'gupc' is implemented by
gcc/c/gupcspec.c.  It will be installed as both a 'gupc' and 'upc'
executable ('upc' is a symlink to 'gupc').  This is a convenience
driver similar to gfortran.  It asserts -fupc and
will cause .c source files to be compiled as UPC source files.
This implicit handling of .c files as .upc files provides compatibility
with other UPC compilers.
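
The suffix handling described above amounts to a small decision
table, sketched here in stand-alone C.  The toy_* functions are
hypothetical and are not taken from gupcspec.c; an explicit -fupc on
a plain gcc command line is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if FILE ends with SUFFIX.  */
static bool
toy_has_suffix (const char *file, const char *suffix)
{
  size_t flen = strlen (file);
  size_t slen = strlen (suffix);

  return flen >= slen && strcmp (file + flen - slen, suffix) == 0;
}

/* Hypothetical decision: is FILE compiled as UPC source?  VIA_GUPC is
   true when the 'gupc' (or 'upc') driver was invoked.  */
static bool
toy_compile_as_upc (const char *file, bool via_gupc)
{
  assert (file != NULL);
  if (toy_has_suffix (file, ".upc"))
    return true;      /* .upc files assert -fupc under either driver.  */
  if (toy_has_suffix (file, ".c"))
    return via_gupc;  /* gupc treats .c files as UPC source.  */
  return false;
}
```

Under this model, "prog.c" is ordinary C under gcc but UPC under
gupc, while "prog.upc" is UPC under either driver.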

2015-11-30  Gary Funck  

gcc/
* gcc.c (upc_crtbegin_spec, link_upc_spec, upc_crtend_spec,
upc_options): New.  Define UPC-related spec's.
(default_compilers): Add support for .upc files.
(static_specs): Initialize UPC-specific spec's.
(LINK_COMMAND_SPEC): Add UPC-specific linker spec's.
* timevar.def (TV_TREE_UPC_GENERICIZE): New.  Define a new
time variable for the 'UPC genericize' pass.
gcc/c-family/
* c-opts.c: #include "c-upc-pts.h" to bring in UPC_MAX_THREADS.
(upc_init_options, upc_handle_option): New.
(c_common_init_options):
Call upc_init_options() if -fupc is asserted.
(c_common_handle_option): Call upc_handle_option
to handle UPC-specific options.
(c_common_parse_file):
Call lang_hooks.upc.write_global_declarations() if -fupc is asserted.
* c.opt (dwarf-2-upc, fupc, fupc-debug, fupc-inline-lib,
fupc-pre-include, fupc-pthreads-model-tls, fupc-threads,
fupc-instrument, fupc-instrument-functions): New.
Define UPC-specific command line options.
gcc/c/
* gupcspec.c: New.  Implement the 'gupc' driver program.

Index: gcc/gcc.c
===
--- gcc/gcc.c   (.../trunk) (revision 231059)
+++ gcc/gcc.c   (.../branches/gupc) (revision 231080)
@@ -1016,16 +1016,20 @@ proper position among the other output f
 %{flto} %{fno-lto} %{flto=*} %l " LINK_PIE_SPEC \
"%{fuse-ld=*:-fuse-ld=%*} " LINK_COMPRESS_DEBUG_SPEC \
"%X %{o*} %{e*} %{N} %{n} %{r}\
-%{s} %{t} %{u*} %{z} %{Z} %{!nostdlib:%{!nostartfiles:%S}} \
+%{s} %{t} %{u*} %{z} %{Z}\
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtbegin.spec)%(upc_crtbegin)}}}\
+%{!nostdlib:%{!nostartfiles:%S}} \
 %{static:} %{L*} %(mfwrap) %(link_libgcc) " \
 VTABLE_VERIFICATION_SPEC " " SANITIZER_EARLY_SPEC " %o " CHKP_SPEC " \
 %{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*} 1):\
%:include(libgomp.spec)%(link_gomp)}\
 %{fcilkplus:%:include(libcilkrts.spec)%(link_cilkrts)}\
 %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
+%{fupc:%:include(libgupc.spec)%(link_upc)}\
 %(mflib) " STACK_SPLIT_SPEC "\
 %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtend.spec)%(upc_crtend)}}}\
 %{!n

[UPC 05/22] language hooks changes

2015-11-30 Thread Gary Funck

Overview


Two new UPC-specific 'decl' language hooks are defined and then called from
layout_decl() in stor-layout.c.  The layout_decl_p() hook tests whether
the declaration is a UPC shared array declaration that requires special
handling.  If it does, the layout_decl() hook is called.

A few new UPC-specific language hooks are defined in a 'upc' sub-structure
of the language hooks structure.  They are defined as
hooks because they are called from code in the 'c-family/' directory,
but are implemented in the 'c/' directory.
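
The hook pattern described above (do-nothing defaults that a front
end overrides, with common code calling only through the table) can
be sketched in stand-alone C.  All toy_* names are hypothetical; the
real hooks take tree arguments and sit in much larger structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a declaration.  */
struct toy_decl
{
  bool shared_array;      /* UPC shared array declaration?  */
  bool laid_out_by_hook;  /* Set when the layout hook handled it.  */
};

/* Hypothetical, reduced hook table.  */
struct toy_lang_hooks
{
  bool (*layout_decl_p) (struct toy_decl *);
  void (*layout_decl) (struct toy_decl *);
};

/* Language-independent defaults: never claim a declaration.  */
static bool
toy_default_layout_decl_p (struct toy_decl *d)
{
  (void) d;
  return false;
}

static void
toy_default_layout_decl (struct toy_decl *d)
{
  (void) d;
}

/* Overrides that a UPC-aware front end might install.  */
static bool
toy_upc_layout_decl_p (struct toy_decl *d)
{
  return d->shared_array;
}

static void
toy_upc_layout_decl (struct toy_decl *d)
{
  d->laid_out_by_hook = true;
}

static const struct toy_lang_hooks toy_default_hooks
  = { toy_default_layout_decl_p, toy_default_layout_decl };
static const struct toy_lang_hooks toy_upc_hooks
  = { toy_upc_layout_decl_p, toy_upc_layout_decl };

/* Common code: test first, then let the front end do the layout.  */
static void
toy_layout (const struct toy_lang_hooks *hooks, struct toy_decl *d)
{
  assert (hooks != NULL && d != NULL);
  if (hooks->layout_decl_p (d))
    hooks->layout_decl (d);
}
```

With the default table, toy_layout() leaves every declaration alone;
with the overriding table, only declarations flagged as shared arrays
are handed to the front end.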

2015-11-30  Gary Funck  

gcc/
* langhooks-def.h (lhd_do_nothing_b, lhd_do_nothing_t_t):
New do nothing hook prototypes.
(LANG_HOOKS_UPC_TOGGLE_KEYWORDS,
LANG_HOOKS_UPC_PTS_INIT_TYPE, LANG_HOOKS_UPC_BUILD_INIT_FUNC,
LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS): New default UPC hooks.
* langhooks-def.h (LANG_HOOKS_LAYOUT_DECL_P, LANG_HOOKS_LAYOUT_DECL):
New language hook defaults.
(LANG_HOOKS_UPC): New.  Define UPC hooks structure.
* langhooks.c (lhd_do_nothing_b, lhd_do_nothing_t_t):
New do nothing hooks.
* langhooks.h (layout_decl_p, layout_decl): New language hooks.
(lang_hooks_for_upc): New UPC language hooks structure.
* stor-layout.c (layout_decl): Call the layout_decl_p() and
layout_decl() hooks.
gcc/c/
* c-lang.c: #include "c-upc-lang.h".
#include "c-upc-low.h".
(LANG_HOOKS_UPC_TOGGLE_KEYWORDS, LANG_HOOKS_UPC_PTS_INIT_TYPE,
LANG_HOOKS_UPC_BUILD_INIT_FUNC, LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS,
LANG_HOOKS_LAYOUT_DECL_P, LANG_HOOKS_LAYOUT_DECL):
Override defaults.  Define UPC-specific hook routines.
* c-upc-lang.c: New.  Implement UPC-specific hook routines.
* c-upc-lang.h: New.  Define UPC-specific hook prototypes.

Index: gcc/langhooks-def.h
===
--- gcc/langhooks-def.h (.../trunk) (revision 231059)
+++ gcc/langhooks-def.h (.../branches/gupc) (revision 231080)
@@ -35,7 +35,9 @@ struct diagnostic_info;
 /* See langhooks.h for the definition and documentation of each hook.  */
 
 extern void lhd_do_nothing (void);
+extern void lhd_do_nothing_b (bool);
 extern void lhd_do_nothing_t (tree);
+extern void lhd_do_nothing_t_t (tree, tree);
 extern void lhd_do_nothing_f (struct function *);
 extern tree lhd_pass_through_t (tree);
 extern bool lhd_post_options (const char **);
@@ -175,6 +177,10 @@ extern tree lhd_make_node (enum tree_cod
 #define LANG_HOOKS_GET_SUBRANGE_BOUNDS NULL
 #define LANG_HOOKS_DESCRIPTIVE_TYPE NULL
 #define LANG_HOOKS_RECONSTRUCT_COMPLEX_TYPE reconstruct_complex_type
+#define LANG_HOOKS_UPC_TOGGLE_KEYWORDS  lhd_do_nothing_b
+#define LANG_HOOKS_UPC_PTS_INIT_TYPE  lhd_do_nothing
+#define LANG_HOOKS_UPC_BUILD_INIT_FUNC lhd_do_nothing_t
+#define LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS lhd_do_nothing
 #define LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE lhd_enum_underlying_base_type
 
 #define LANG_HOOKS_FOR_TYPES_INITIALIZER { \
@@ -219,6 +225,8 @@ extern tree lhd_make_node (enum tree_cod
 #define LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR NULL
 #define LANG_HOOKS_OMP_CLAUSE_DTOR hook_tree_tree_tree_null
 #define LANG_HOOKS_OMP_FINISH_CLAUSE lhd_omp_finish_clause
+#define LANG_HOOKS_LAYOUT_DECL_P hook_bool_tree_tree_false
+#define LANG_HOOKS_LAYOUT_DECL lhd_do_nothing_t_t
 
 #define LANG_HOOKS_DECLS { \
   LANG_HOOKS_GLOBAL_BINDINGS_P, \
@@ -243,7 +251,9 @@ extern tree lhd_make_node (enum tree_cod
   LANG_HOOKS_OMP_CLAUSE_ASSIGN_OP, \
   LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR, \
   LANG_HOOKS_OMP_CLAUSE_DTOR, \
-  LANG_HOOKS_OMP_FINISH_CLAUSE \
+  LANG_HOOKS_OMP_FINISH_CLAUSE, \
+  LANG_HOOKS_LAYOUT_DECL_P, \
+  LANG_HOOKS_LAYOUT_DECL \
 }
 
 /* LTO hooks.  */
@@ -261,6 +271,13 @@ extern void lhd_end_section (void);
   LANG_HOOKS_END_SECTION \
 }
 
+#define LANG_HOOKS_UPC { \
+  LANG_HOOKS_UPC_TOGGLE_KEYWORDS, \
+  LANG_HOOKS_UPC_PTS_INIT_TYPE, \
+  LANG_HOOKS_UPC_BUILD_INIT_FUNC, \
+  LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS \

[UPC 13/22] C++ changes

2015-11-30 Thread Gary Funck

Background
--

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview


Although UPC is an extension to "C" and not "C++", these changes
are needed to accommodate changes to the common tree-related
code that handles qualified types, and to accommodate UPC's
"layout qualifier" (blocking factor).

In tree.h, check_qualified_type() was changed to accept an
extra argument, block_factor.

/* Check whether CAND is suitable to be returned from get_qualified_type
   (BASE, TYPE_QUALS, BLOCK_FACTOR).  */

extern bool check_qualified_type (const_tree cand, const_tree base,
  int type_quals, tree block_factor);

and the c_build_qualified_type() procedure was renamed to
c_build_qualified_type_1().  c_build_qualified_type was changed
into a macro.

/* Return a version of the TYPE, qualified as indicated by the
   TYPE_QUALS and BLOCK_FACTOR, if one exists.
   If no qualified version exists yet, return NULL_TREE.  */

extern tree get_qualified_type_1 (tree type, int type_quals,
  tree block_factor);
#define get_qualified_type(TYPE, QUALS) \
  get_qualified_type_1 (TYPE, QUALS, 0)

This patch adjusts the C++ front-end so that it works with
the changes described above.
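
The wrapper-macro pattern described above can be sketched in isolation.
This is a minimal illustration and not GCC code: toy_qual_type,
toy_matches_qualified_1 and toy_matches_qualified are hypothetical
stand-ins.  It shows how existing two-argument call sites keep compiling
once the underlying function gains a blocking factor parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type node; this is not GCC's tree.  */
typedef struct toy_qual_type {
  int quals;                  /* bit mask of qualifiers */
  const void *block_factor;   /* NULL means "no layout qualifier" */
} toy_qual_type;

/* Extended function: takes the extra blocking factor argument.  */
static int
toy_matches_qualified_1 (const toy_qual_type *cand, int quals,
                         const void *block_factor)
{
  assert (cand != NULL);
  return cand->quals == quals && cand->block_factor == block_factor;
}

/* Existing two-argument callers are forwarded with a null
   blocking factor, so they need no source changes.  */
#define toy_matches_qualified(CAND, QUALS) \
  toy_matches_qualified_1 (CAND, QUALS, NULL)
```

A front end with no layout qualifiers (such as C++ here) can keep using
the two-argument form, while UPC-aware code calls the "_1" variant directly.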

2015-11-30  Gary Funck  

gcc/cp/
* lex.c (init_reswords): Disable UPC keywords.
* tree.c (c_build_qualified_type_1): Rename.
Was: c_build_qualified_type.  
(cp_check_qualified_type): Adjust call to check_qualified_type
to pass a null UPC blocking factor.
Index: gcc/cp/lex.c
===
--- gcc/cp/lex.c (.../trunk) (revision 231059)
+++ gcc/cp/lex.c (.../branches/gupc) (revision 231080)
@@ -179,6 +179,9 @@ init_reswords (void)
   /* The Objective-C keywords are all context-dependent.  */
   mask |= D_OBJC;
 
+  /* UPC constructs are not supported in C++.  */
+  mask |= D_UPC;
+
   ridpointers = ggc_cleared_vec_alloc ((int) RID_MAX);
   for (i = 0; i < num_c_common_reswords; i++)
 {
Index: gcc/cp/tree.c
===
--- gcc/cp/tree.c   (.../trunk) (revision 231059)
+++ gcc/cp/tree.c   (.../branches/gupc) (revision 231080)
@@ -995,7 +995,8 @@ move (tree expr)
the C version of this function does not properly maintain canonical
types (which are not used in C).  */
 tree
-c_build_qualified_type (tree type, int type_quals)
+c_build_qualified_type_1 (tree type, int type_quals,
+ tree ARG_UNUSED (layout_qualifier))
 {
   return cp_build_qualified_type (type, type_quals);
 }
@@ -1867,7 +1868,7 @@ static bool
 cp_check_qualified_type (const_tree cand, const_tree base, int type_quals,
 cp_ref_qualifier rqual, tree raises)
 {
-  return (check_qualified_type (cand, base, type_quals)
+  return (check_qualified_type (cand, base, type_quals, NULL_TREE)
  && comp_except_specs (raises, TYPE_RAISES_EXCEPTIONS (cand),
ce_exact)
  && type_memfn_rqual (cand) == rqual);


[UPC 10/22] target - rs6000

2015-11-30 Thread Gary Funck

Overview


UPC pointers-to-shared have an internal representation that is a 'struct'.
GCC generally assumes that pointers can be held in registers.
However, various ABIs special-case how structs are passed.

In order to ensure that UPC pointers-to-shared can be passed to
the UPC runtime, which is written in "C", the convention used
to pass a UPC pointer-to-shared must agree with that of a struct,
because the runtime describes the internal representation
as a struct.

The code below checks for UPC pointers-to-shared that are
represented as an aggregate type and ensures that these
pointers are passed as structs.
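
The check can be sketched on a toy type model.  This is only an
illustration of the predicate's shape: struct toy_abi_type and
toy_treat_as_aggregate are hypothetical names, and GCC's real code
works on tree nodes with POINTER_TYPE_P, SHARED_TYPE_P and
AGGREGATE_TYPE_P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type model; this is not GCC's tree representation.  */
enum toy_abi_kind { TOY_K_INT, TOY_K_POINTER, TOY_K_STRUCT };

struct toy_abi_type {
  enum toy_abi_kind kind;
  bool shared;                        /* carries the UPC "shared" qualifier */
  const struct toy_abi_type *target;  /* pointee, for TOY_K_POINTER */
};

/* Return true if TYPE should follow the struct passing convention:
   either it is a struct, or it is a pointer to a shared type and the
   pointer-to-shared representation PTS_REP is itself a struct.  */
static bool
toy_treat_as_aggregate (const struct toy_abi_type *type,
                        const struct toy_abi_type *pts_rep)
{
  assert (type != NULL && pts_rep != NULL);
  bool upc_struct_pts_p = (type->kind == TOY_K_POINTER
                           && type->target != NULL
                           && type->target->shared
                           && pts_rep->kind == TOY_K_STRUCT);
  return type->kind == TOY_K_STRUCT || upc_struct_pts_p;
}
```

An ordinary "C" pointer is unaffected; only a pointer whose target type
is shared is routed through the aggregate path.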

2015-11-30  Gary Funck  

gcc/config/rs6000/
* rs6000.c (rs6000_return_in_memory):
If TYPE is a UPC PTS type with a "struct" internal representation,
handle it as an aggregate type.
(rs6000_function_arg_boundary): For UPC pointers-to-shared with
alignment > 64 that have an internal "struct" representation,
return 128 and skip the ABI warning.
(rs6000_pass_by_reference): If TYPE is a UPC PTS type with
a "struct" internal representation, handle it as an aggregate type.
(rs6000_pass_by_reference): Exclude UPC pointers-to-shared
from the logic that returns pointers in either SImode or DImode.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (.../trunk) (revision 231059)
+++ gcc/config/rs6000/rs6000.c  (.../branches/gupc) (revision 231080)
@@ -9709,12 +9709,21 @@ rs6000_return_in_memory (const_tree type
 NULL, NULL))
 return false;
 
+  /* TRUE if TYPE is a UPC pointer-to-shared type
+ and its underlying representation is an aggregate.  */
+  bool upc_struct_pts_p = (POINTER_TYPE_P (type)
+&& SHARED_TYPE_P (TREE_TYPE (type)))
+  && AGGREGATE_TYPE_P (upc_pts_rep_type_node);
+  /* If TYPE is a UPC struct PTS type, handle it as an aggregate type.  */
+  bool aggregate_p = AGGREGATE_TYPE_P (type)
+ || upc_struct_pts_p;
+
   /* The ELFv2 ABI returns aggregates up to 16B in registers */
-  if (DEFAULT_ABI == ABI_ELFv2 && AGGREGATE_TYPE_P (type)
+  if (DEFAULT_ABI == ABI_ELFv2 && aggregate_p
   && (unsigned HOST_WIDE_INT) int_size_in_bytes (type) <= 16)
 return false;
 
-  if (AGGREGATE_TYPE_P (type)
+  if (aggregate_p
   && (aix_struct_return
  || (unsigned HOST_WIDE_INT) int_size_in_bytes (type) > 8))
 return true;
@@ -10040,6 +10049,18 @@ rs6000_function_arg_boundary (machine_mo
|| DEFAULT_ABI == ABI_ELFv2)
   && type && TYPE_ALIGN (type) > 64)
 {
+
+  /* If the underlying UPC pointer-to-shared representation
+ is an aggregate, and TYPE is either a pointer-to-shared
+or the PTS representation type, then return 16-byte
+alignment and skip the ABI warning.  */
+  if (upc_pts_rep_type_node
+  && AGGREGATE_TYPE_P (upc_pts_rep_type_node)
+  && ((POINTER_TYPE_P (type)
+  && SHARED_TYPE_P (TREE_TYPE (type)))
+  || (TYPE_MAIN_VARIANT (type) == upc_pts_rep_type_node)))
+   return 128;
+
   /* "Aggregate" means any AGGREGATE_TYPE except for single-element
  or homogeneous float/vector aggregates here.  We already handled
  vector aggregates above, but still need to check for float here. */
@@ -11320,7 +11341,16 @@ rs6000_pass_by_reference (cumulative_arg
   return 1;
 }
 
-  if (DEFAULT_ABI == ABI_V4 && AGGREGATE_TYPE_P (type))
+  /* TRUE if TYPE is a UPC pointer-to-shared type
+ and its underlying representation is an aggregate.  */
+  bool upc_struct_pts_p = (POINTER_TYPE_P (type)
+ && SHARED_TYPE_P (TREE_TYPE (type)))
+   && AGGREGATE_TYPE_P (upc_pts_rep_type_node);
+  /* If TYPE is a UPC struct PTS type, handle it as an aggregate type.  */
+  bool aggregate_p 

[UPC 15/22] RTL changes

2015-11-30 Thread Gary Funck

Overview


UPC pointers-to-shared have an internal representation which is
defined as a 'struct' with three fields.  Special logic is
needed in promote_mode() to handle this case.

2015-11-30  Gary Funck  

gcc/
* explow.c (promote_mode): For UPC pointer-to-shared values,
return the mode of the UPC PTS representation type.

Index: gcc/explow.c
===
--- gcc/explow.c (.../trunk) (revision 231059)
+++ gcc/explow.c (.../branches/gupc) (revision 231080)
@@ -794,6 +794,8 @@ promote_mode (const_tree type ATTRIBUTE_
 case REFERENCE_TYPE:
 case POINTER_TYPE:
   *punsignedp = POINTERS_EXTEND_UNSIGNED;
+  if (SHARED_TYPE_P (TREE_TYPE (type)))
+return TYPE_MODE (upc_pts_type_node);
   return targetm.addr_space.address_mode
   (TYPE_ADDR_SPACE (TREE_TYPE (type)));
   break;


[UPC 09/22] target - x86

2015-11-30 Thread Gary Funck

Overview


UPC pointers-to-shared use a struct to describe their internal
representation.  For efficiency and correctness, ensure that a
pointer-to-shared parameter is passed in registers when the struct's
mode is TImode.  Note that the parameter passing logic forces "C"
pointer type parameters to be 'word mode', but that rule doesn't apply
to UPC pointers-to-shared due to their "fat" struct representation.
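
For reference, a 16-byte struct of integer fields is the kind of object
that the x86-64 System V convention passes and returns in two
general-purpose registers.  The sketch below assumes a 64/32/32-bit
field split purely for illustration; struct toy_pts16 and
toy_pts16_next_thread are hypothetical names, and the real layout is
defined by the GUPC runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths, for illustration only.  */
struct toy_pts16 {
  uint64_t vaddr;
  uint32_t thread;
  uint32_t phase;
};

/* Two 8-byte halves: each half fits one general-purpose register.  */
_Static_assert (sizeof (struct toy_pts16) == 16,
                "expected a 16-byte fat pointer");

/* Takes and returns the fat pointer by value, as a runtime routine
   written in "C" would.  */
static struct toy_pts16
toy_pts16_next_thread (struct toy_pts16 p, uint32_t threads)
{
  assert (threads > 0);
  p.thread = (p.thread + 1) % threads;
  p.phase = 0;
  return p;
}
```

If the compiler instead treated the pointer-to-shared as an ordinary
word-sized pointer, caller and runtime would disagree about where the
second half of the value lives.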

2015-11-30  Gary Funck  

gcc/config/i386/
* i386.c (classify_argument): Check for UPC pointer-to-shared,
on 64-bit target.
(function_value_64): Do not force UPC pointers-to-shared
to be returned in word mode.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (.../trunk) (revision 231059)
+++ gcc/config/i386/i386.c  (.../branches/gupc) (revision 231080)
@@ -7943,6 +7943,15 @@ classify_argument (machine_mode mode, co
   && targetm.calls.must_pass_in_stack (mode, type))
 return 0;
 
+  /* Special case check for pointer to shared, on 64-bit target.  */
+  if (TARGET_64BIT && mode == TImode
+  && type && TREE_CODE (type) == POINTER_TYPE
+  && SHARED_TYPE_P (TREE_TYPE (type)))
+{
+  classes[0] = classes[1] = X86_64_INTEGER_CLASS;
+  return 2;
+}
+
   if (type && AGGREGATE_TYPE_P (type))
 {
   int i;
@@ -9536,7 +9545,8 @@ function_value_64 (machine_mode orig_mod
 
   return gen_rtx_REG (mode, regno);
 }
-  else if (POINTER_TYPE_P (valtype))
+  else if (POINTER_TYPE_P (valtype)
+   && !SHARED_TYPE_P (TREE_TYPE (valtype)))
 {
   /* Pointers are always returned in word_mode.  */
   mode = word_mode;
@@ -9680,6 +9690,11 @@ ix86_promote_function_mode (const_tree t
 {
   if (type != NULL_TREE && POINTER_TYPE_P (type))
 {
+  if (SHARED_TYPE_P (TREE_TYPE (type)))
+{
+  *punsignedp = 1;
+  return TYPE_MODE (upc_pts_rep_type_node);
+   }
   *punsignedp = POINTERS_EXTEND_UNSIGNED;
   return word_mode;
 }


[UPC 06/22] target hooks

2015-11-30 Thread Gary Funck

Overview


Four new target hooks are defined for UPC.  They relate to naming
the various linker sections used by UPC as well as testing for
the availability of the "UPC linker script" feature.
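
The hook mechanism can be pictured as a table of function pointers that
generic code consults instead of testing target macros.  The following
is a toy sketch, not the generated GCC hook vector: struct toy_upc_hooks
and every toy_* function are hypothetical, and only two of the four
hooks are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy hook vector; GCC generates the real one from target.def.  */
struct toy_upc_hooks {
  bool (*link_script_p) (void);
  const char *(*shared_section_name) (void);
};

/* Defaults; the toy link_script_p default always answers "no".  */
static bool toy_default_link_script_p (void) { return false; }
static const char *toy_default_shared_section_name (void)
{
  return "upc_shared";
}

/* Hypothetical target override using a Mach-O style section name.  */
static const char *toy_macho_shared_section_name (void)
{
  return "__DATA,upc_shared";
}

/* Generic code asks the hook rather than testing a target macro.  */
static const char *
toy_shared_section (const struct toy_upc_hooks *hooks)
{
  assert (hooks != NULL && hooks->shared_section_name != NULL);
  return hooks->shared_section_name ();
}
```

A target that needs nothing special simply leaves the default entries
in place.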

2015-11-30  Gary Funck  

gcc/
* defaults.h (UPC_SHARED_SECTION_NAME): New macro.
(UPC_PGM_INFO_SECTION_NAME): New macro.
(UPC_INIT_ARRAY_SECTION_NAME): New macro.
* target.def (upc): New hook prefix.
(link_script_p, shared_section_name,
pgm_info_section_name, init_array_section_name):
New target hook definitions.
* targhooks.c (default_upc_link_script_p,
default_upc_shared_section_name, default_upc_pgm_info_section_name,
default_upc_init_array_section_name): New default target hooks.
* targhooks.h (default_upc_link_script_p,
default_upc_shared_section_name, default_upc_pgm_info_section_name,
default_upc_init_array_section_name): New target hook prototypes.

Index: gcc/defaults.h
===
--- gcc/defaults.h  (.../trunk) (revision 231059)
+++ gcc/defaults.h  (.../branches/gupc) (revision 231080)
@@ -1488,4 +1488,23 @@ see the files COPYING3 and COPYING.RUNTI
 
 #endif /* GCC_INSN_FLAGS_H  */
 
+/* UPC section names.  */
+
+/* Name of section used to assign addresses to shared data items.  */
+#ifndef UPC_SHARED_SECTION_NAME
+#define UPC_SHARED_SECTION_NAME "upc_shared"
+#endif
+
+/* Name of section used to hold info. describing how
+   a UPC source file was compiled.  */
+#ifndef UPC_PGM_INFO_SECTION_NAME
+#define UPC_PGM_INFO_SECTION_NAME "upc_pgm_info"
+#endif
+
+/* Name of section that holds an array of addresses that points to 
+   the UPC initialization routines.  */
+#ifndef UPC_INIT_ARRAY_SECTION_NAME
+#define UPC_INIT_ARRAY_SECTION_NAME "upc_init_array"
+#endif
+
 #endif  /* ! GCC_DEFAULTS_H */
Index: gcc/target.def
===
--- gcc/target.def  (.../trunk) (revision 231059)
+++ gcc/target.def  (.../branches/gupc) (revision 231080)
@@ -5496,6 +5496,41 @@ DEFHOOK
 
 HOOK_VECTOR_END (cxx)
 
+/* Functions and data for UPC support.  */
+#undef HOOK_PREFIX
+#define HOOK_PREFIX "TARGET_UPC_"
+HOOK_VECTOR (TARGET_UPC, upc)
+
+DEFHOOK
+(link_script_p,
+"This hook returns true if a linker script will be used to\
+ origin the UPC shared section at 0.",
+ bool, (void),
+ default_upc_link_script_p)
+
+DEFHOOK
+(shared_section_name,
+"This hook returns the name of the section used to assign addresses to\
+ UPC shared data items.",
+ const char *, (void),
+ default_upc_shared_section_name)
+
+DEFHOOK
+(pgm_info_section_name,
+"This hook returns the name of the section used to hold information\
+ describing how a UPC source file was compiled.",
+ const char *, (void),
+ default_upc_pgm_info_section_name)
+
+DEFHOOK
+(init_array_section_name,
+"This hook returns the name of the section used to hold an array\
+ of addresses of UPC initialization routines.",
+ const char *, (void),
+ default_upc_init_array_section_name)
+
+HOOK_VECTOR_END (upc)
+
 /* Functions and data for emulated TLS support.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_EMUTLS_"
Index: gcc/targhooks.c
===
--- gcc/targhooks.c (.../trunk) (revision 231059)
+++ gcc/targhooks.c (.../branches/gupc) (revision 231080)
@@ -1955,4 +1955,32 @@ can_use_doloop_if_innermost (const wides
   return loop_depth == 1;
 }
 
+bool
+default_upc_link_script_p (void)
+{
+#ifdef HAVE_UPC_LINK_SCRIPT
+  return true;
+#else
+  return false;
+#endif
+}
+
+const char *
+default_upc_shared_section_name (void)
+{
+  return UPC_SHARED_SECTION_NAME;
+}
+
+const char *
+default_upc_pgm_info_section_name (void)
+{
+  return UPC_PGM_INFO_SECTION_NAME;
+}
+
+const char *
+default_upc_init_array_section_name (void)
+{
+  return UPC_INIT_ARRAY_SECTION_NAME;
+}
+
 #include "gt-targhooks.h"
Index: gcc/targhooks.h
===

[UPC 08/22] target - Darwin

2015-11-30 Thread Gary Funck

Overview


For Darwin, if -fupc is given, then define various UPC-specific specs.
Also, override default section names for the UPC-related linker sections.
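
The override relies on the usual "#ifndef" default pattern: a target
header defines the section name first, and the generic default applies
only if the macro is still undefined.  A minimal sketch, with both
definitions placed in one file for illustration (in GCC they live in
separate headers, and the toy_* accessor names are hypothetical):

```c
#include <assert.h>

/* A target header defines its own name first.  */
#define UPC_SHARED_SECTION_NAME "__DATA,upc_shared"

/* The generic defaults follow; each applies only if still undefined.  */
#ifndef UPC_SHARED_SECTION_NAME
#define UPC_SHARED_SECTION_NAME "upc_shared"
#endif
#ifndef UPC_PGM_INFO_SECTION_NAME
#define UPC_PGM_INFO_SECTION_NAME "upc_pgm_info"
#endif

/* Overridden by the target definition above.  */
static const char *
toy_shared_section_name (void)
{
  const char *name = UPC_SHARED_SECTION_NAME;
  assert (name[0] != '\0');   /* a section name must be non-empty */
  return name;
}

/* Not overridden, so the generic default is used.  */
static const char *
toy_pgm_info_section_name (void)
{
  const char *name = UPC_PGM_INFO_SECTION_NAME;
  assert (name[0] != '\0');
  return name;
}
```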

2015-11-30  Gary Funck  

gcc/config/
* darwin.h (LINK_COMMAND_SPEC_A): If -fupc is asserted:
add UPC start/end files and include libgupc.spec.
(UPC_SHARED_SECTION_NAME, UPC_PGM_INFO_SECTION_NAME,
UPC_INIT_ARRAY_SECTION_NAME): New.  Override default section names.

Index: gcc/config/darwin.h
===
--- gcc/config/darwin.h (.../trunk) (revision 231059)
+++ gcc/config/darwin.h (.../branches/gupc) (revision 231080)
@@ -176,16 +176,19 @@ extern GTY(()) int darwin_ms_struct;
 %{e*} %{r} \
 %{o*}%{!o:-o a.out} \
 %{!nostdlib:%{!nostartfiles:%S}} \
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtbegin.spec)%(upc_crtbegin)}}}\
 %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
 %{fopenacc|fopenmp|ftree-parallelize-loops=*: \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \
 %{fgnu-tm: \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \
+%{fupc:%:include(libgupc.spec)%(link_upc)} \
 %{!nostdlib:%{!nodefaultlibs:\
   %{%:sanitize(address): -lasan } \
   %{%:sanitize(undefined): -lubsan } \
   %(link_ssp) %(link_gcc_c_sequence)\
 }}\
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtend.spec)%(upc_crtend)}}}\
 %{!nostdlib:%{!nostartfiles:%E}} %{T*} %{F*} }}}"
 
 #define DSYMUTIL "\ndsymutil"
@@ -922,6 +925,11 @@ extern void darwin_driver_init (unsigned
 #undef SUPPORTS_INIT_PRIORITY
 #define SUPPORTS_INIT_PRIORITY 0
 
+/* UPC section names */
+#define UPC_SHARED_SECTION_NAME "__DATA,upc_shared"
+#define UPC_PGM_INFO_SECTION_NAME "__DATA,upc_pgm_info"
+#define UPC_INIT_ARRAY_SECTION_NAME "__DATA,upc_init_array"
+
 /* When building cross-compilers (and native crosses) we shall default to 
providing an osx-version-min of this unless overridden by the User.
10.5 is the only version that fully supports all our archs so that's the


[UPC 02/22] tree-related changes

2015-11-30 Thread Gary Funck

Overview


UPC introduces a new qualifier, "shared", that indicates that the
qualified object is located in a global shared address space that is
accessible by all UPC threads.  Additional qualifiers ("strict" and
"relaxed") further specify the semantics of accesses to
UPC shared objects.

In UPC, a shared qualified array can further specify a "layout
qualifier" (blocking factor) that indicates how the shared data
is blocked and distributed.

The following example illustrates the use of the UPC "shared" qualifier
combined with a layout qualifier.

#define BLKSIZE 5
#define N_PER_THREAD (4 * BLKSIZE)
shared [BLKSIZE] double A[N_PER_THREAD*THREADS];

Above, the "[BLKSIZE]" construct is the UPC layout qualifier; this
specifies that the shared array, A, distributes its elements across
each thread in blocks of 5 elements.  If the program is run with two
threads, then A is distributed as shown below:

Thread 0    Thread 1
---------   ---------
A[ 0.. 4]   A[ 5.. 9]
A[10..14]   A[15..19]
A[20..24]   A[25..29]
A[30..34]   A[35..39]

Above, the elements shown for thread 0 are defined as having "affinity"
to thread 0.  Similarly, those elements shown for thread 1 have
affinity to thread 1.  In UPC, a pointer to a shared object can be
cast to a thread local pointer (a "C" pointer), when the designated
shared object has affinity to the referencing thread.
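
The distribution shown in the table follows from simple index
arithmetic.  The sketch below is an illustration only: the function
names toy_upc_owner and toy_upc_phase are hypothetical, and the block
size and thread count are passed as ordinary arguments rather than
taken from a UPC declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Thread with affinity to element INDEX of an array declared as
   "shared [BLOCK] T A[...]" when run with THREADS threads: blocks are
   dealt out to the threads in round-robin order.  */
static size_t
toy_upc_owner (size_t index, size_t block, size_t threads)
{
  assert (block > 0 && threads > 0);
  return (index / block) % threads;
}

/* Position of element INDEX within its block.  */
static size_t
toy_upc_phase (size_t index, size_t block)
{
  assert (block > 0);
  return index % block;
}
```

With a block size of 5 and two threads, A[7] falls in the second block
and so has affinity to thread 1, while A[12] falls in the third block
and has affinity to thread 0, matching the table.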

A UPC "pointer-to-shared" (PTS) is a pointer that references a UPC
shared object.  A UPC pointer-to-shared is a "fat" pointer with the
following logical fields:
   (virt_addr, thread, phase)

The virtual address (virt_addr) field is combined with the thread
number (thread) to derive the location of the referenced object
within the UPC shared address space.  The phase field is used
to keep track of the current block offset for PTSs that have
a blocking factor that is greater than one.
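
To make the role of the phase field concrete, here is a sketch of
advancing a pointer-to-shared by one element.  It is a simplified model
under stated assumptions: struct toy_pts and toy_pts_incr are
hypothetical names, the field widths are arbitrary, virt_addr is
treated as a byte offset within each thread's shared region, and the
blocking factor is assumed to be at least one.

```c
#include <assert.h>
#include <stddef.h>

/* Logical fields of a pointer-to-shared; widths are an assumption.  */
struct toy_pts {
  size_t virt_addr;   /* byte offset within the owning thread's region */
  size_t thread;      /* thread that the element has affinity to */
  size_t phase;       /* element position within the current block */
};

/* Advance P by one element of ELEM_SIZE bytes, for blocking factor
   BLOCK (at least 1) and THREADS threads.  */
static struct toy_pts
toy_pts_incr (struct toy_pts p, size_t elem_size, size_t block,
              size_t threads)
{
  assert (block > 0 && threads > 0);
  p.phase++;
  p.virt_addr += elem_size;
  if (p.phase == block)
    {
      /* End of block: continue at the start of the corresponding
         block on the next thread.  */
      p.phase = 0;
      p.virt_addr -= block * elem_size;
      p.thread++;
      if (p.thread == threads)
        {
          /* Wrapped past the last thread: move on to the next block
             on thread 0.  */
          p.thread = 0;
          p.virt_addr += block * elem_size;
        }
    }
  return p;
}
```

Without the phase field, the increment could not tell when it has
reached the end of a block and must switch threads.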

GUPC implements pointer-to-shared objects using a "struct" internal
representation.  Until recently, GUPC also supported a "packed"
representation, which is more space efficient, but limits the range of
various fields in the UPC pointer-to-shared representation.  We have
decided to support only the "struct" representation so that the
compiler uses a single ABI that supports the full range of addresses,
threads, and blocking factors.

GCC's internal tree representation is extended to record the UPC
"shared", "strict", "relaxed" qualifiers, and the layout qualifier.

--- gcc/tree-core.h (.../trunk) (revision 228959)
+++ gcc/tree-core.h (.../branches/gupc) (revision 229159)
@@ -470,7 +470,11 @@ enum cv_qualifier {
  TYPE_QUAL_CONST    = 0x1,
   TYPE_QUAL_VOLATILE = 0x2,
   TYPE_QUAL_RESTRICT = 0x4,
-  TYPE_QUAL_ATOMIC   = 0x8
+  TYPE_QUAL_ATOMIC   = 0x8,
+  /* UPC qualifiers */
+  TYPE_QUAL_SHARED   = 0x10,
+  TYPE_QUAL_RELAXED  = 0x20,
+  TYPE_QUAL_STRICT   = 0x40
 };
[...]
@@ -857,9 +875,14 @@ struct GTY(()) tree_base {
   unsigned user_align : 1;
   unsigned nameless_flag : 1;
   unsigned atomic_flag : 1;
-  unsigned spare0 : 3;
-
-  unsigned spare1 : 8;
+  unsigned shared_flag : 1;
+  unsigned strict_flag : 1;
+  unsigned relaxed_flag : 1;
+
+  unsigned threads_factor_flag : 1;
+  unsigned block_factor_0 : 1;
+  unsigned block_factor_x : 1;
+  unsigned spare1 : 5;

A given type is a UPC shared type if its 'shared_flag' is set.
However, for array types, the shared_flag of the *element type*
must be checked.  Thus,

/* Return TRUE if TYPE is a shared type.  For arrays,
   the element type must be queried, because array types
   are never qualified.  */
#define SHARED_TYPE_P(TYPE) \
  ((TYPE) && TYPE_P (TYPE) \
   && TYPE_SHARED ((TREE_CODE (TYPE) != ARRAY_TYPE \
                    ? (TYPE) : strip_array_types (TYPE))))
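
The effect of the macro can be sketched on a toy type model.  This is
not GCC's tree API: struct toy_node, toy_strip_array_types and
toy_shared_type_p are hypothetical names that mirror the logic only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_code { TOY_SCALAR, TOY_ARRAY };

struct toy_node {
  enum toy_code code;
  bool shared_flag;              /* meaningful on non-array types only */
  const struct toy_node *elem;   /* element type, for TOY_ARRAY */
};

/* Peel nested array types down to the innermost element type.  */
static const struct toy_node *
toy_strip_array_types (const struct toy_node *type)
{
  while (type != NULL && type->code == TOY_ARRAY)
    {
      assert (type->elem != NULL);   /* every array has an element type */
      type = type->elem;
    }
  return type;
}

/* A type is shared if it, or its innermost element type in the case
   of arrays, has the shared flag set.  */
static bool
toy_shared_type_p (const struct toy_node *type)
{
  const struct toy_node *t = toy_strip_array_types (type);
  return t != NULL && t->shared_flag;
}
```

A two-dimensional array of shared doubles is therefore reported as
shared even though neither array node carries the flag itself.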

By default, a type has a blocking factor of 1.  If the blocking factor is 0
(known as "indefinite") then 'block_factor_0' is set. If the blocking
factor is neither 0 nor 1, then 'block_factor_x' is set and the

-fstrict-aliasing fixes 3/5: Do not ignore -fstrict-aliasing changes when parsing optimization attribute

2015-11-30 Thread Jan Hubicka
Hi,
this is the third part, which enables us to change -fstrict-aliasing using
the optimize attribute.  This ought to work safely now because the inliner
propagates the flag.

Bootstrapped/regtested x86_64-linux.

Honza

* gcc.c-torture/execute/alias-1.c: New testcase.
* c-common.c: Do not silently ignore -fstrict-aliasing changes.
Index: testsuite/gcc.c-torture/execute/alias-1.c
===
--- testsuite/gcc.c-torture/execute/alias-1.c   (revision 0)
+++ testsuite/gcc.c-torture/execute/alias-1.c   (revision 0)
@@ -0,0 +1,19 @@
+int val;
+
+int *ptr = &val;
+float *ptr2 = &val;
+
+__attribute__((optimize ("-fno-strict-aliasing")))
+typepun ()
+{
+  *ptr2=0;
+}
+
+main()
+{
+  *ptr=1;
+  typepun ();
+  if (*ptr)
+__builtin_abort ();
+}
+
Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 231097)
+++ c-family/c-common.c (working copy)
@@ -9988,7 +9988,6 @@ parse_optimize_options (tree args, bool
   bool ret = true;
   unsigned opt_argc;
   unsigned i;
-  int saved_flag_strict_aliasing;
   const char **opt_argv;
   struct cl_decoded_option *decoded_options;
   unsigned int decoded_options_count;
@@ -10081,8 +10080,6 @@ parse_optimize_options (tree args, bool
   for (i = 1; i < opt_argc; i++)
 opt_argv[i] = (*optimize_args)[i];
 
-  saved_flag_strict_aliasing = flag_strict_aliasing;
-
   /* Now parse the options.  */
   decode_cmdline_options_to_array_default_mask (opt_argc, opt_argv,
&decoded_options,
@@ -10093,9 +10090,6 @@ parse_optimize_options (tree args, bool
 
   targetm.override_options_after_change();
 
-  /* Don't allow changing -fstrict-aliasing.  */
-  flag_strict_aliasing = saved_flag_strict_aliasing;
-
   optimize_args->truncate (0);
   return ret;
 }


RFC: Merge the GUPC branch into the GCC 6.0 trunk

2015-11-30 Thread Gary Funck

Some time ago, we submitted an RFC for the introduction of
UPC support into GCC.  During the intervening time period,
we have continued to keep the 'gupc' (GNU UPC) branch in sync
with the GCC trunk and have incorporated feedback and contributions from
various GCC developers (Joseph Myers, Tom Tromey, Jakub Jelinek,
Richard Henderson, Meador Inge, and others).  We have also implemented
various bug fixes and improvements.

At this time, we would like to re-submit the UPC patches for comment
with the goal of introducing these changes into GCC 6.0.

This email provides an overview of UPC and summarizes the
impact of UPC changes on the GCC front-end.

Subsequent emails will include various patch sets which are grouped
by the area of GCC that they impact (front-end, generic, documentation,
build, test, target-specific, and so on), so that they can receive
a more focused review by their respective maintainers.

The main review-related changes are:

* GUPC is no longer implemented as a separate language
(e.g., Objective-C or C++) compiler.  Rather, a new -fupc switch
has been added, which enables UPC support in the C compiler.

* The UPC blocking factor now only uses two of the tree's
"spare" bits.  If the UPC blocking factor is not the default
value of 1 or the "indefinite" value of 0, then it is recorded
in a separate hash table, indexed by the tree node.

* UPC-specific tree support has been integrated into
gcc/c-family/c-common.def and gcc/c-family/c-common.h.

* The number of UPC-specific configuration options
has been reduced.

* The UPC pointer-to-shared format per-target configuration
has been simplified.  Before, both "packed" and "struct"
pointer-to-shared representations were supported.  Now, only
the "struct" format is supported and various configuration
options for tweaking field sizes and such have been removed.

* In keeping with current GCC development guidelines,
target macros are no longer used.  Rather, where needed,
target hooks are defined and used.

* FIXMEs and TODOs were either fixed or cleaned up.

* The copyright and license notices were updated.

* The code was reviewed for conformance to coding standards and updated.

* Diagnostics now use appropriate format strings rather than building
up the strings with sprintf().

* Files in c-family/ no longer include c-tree.h to conform with modularization
improvements.

* Most of the #ifdef conditionals have been removed.  Some target hooks
have been defined and documented in tm.texi.

* The code was reviewed to verify that it conforms with
current GCC coding practices and that it incorporates cleanups
done in the past several years.

* Comments were added to most new functions, and typos and
spelling errors in comments were fixed.

* Changes that appeared in the diffs that were unrelated to UPC
were removed or incorporated into the trunk.

* The linkage to the libgupc library was changed to use the newly
defined method (used in libgomp/libgo for example) of including
library 'spec' files.  This led to a simplification where we no
longer needed to add UPC-specific spec. files in various
target-specific config. directories.
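
The blocking factor encoding mentioned in the list above (two flag bits
plus a side table for uncommon values) can be sketched as follows.
This is a toy model, not the branch's implementation: every toy_* name
is hypothetical, and a small linear array stands in for the hash table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 16

struct toy_type_node {
  unsigned block_factor_0 : 1;   /* blocking factor is 0 ("indefinite") */
  unsigned block_factor_x : 1;   /* value is kept in the side table */
};

/* Side table keyed by node address.  */
static struct {
  const struct toy_type_node *key;
  unsigned long value;
} toy_table[TOY_MAX_ENTRIES];
static size_t toy_table_len;

static void
toy_set_block_factor (struct toy_type_node *t, unsigned long bf)
{
  t->block_factor_0 = (bf == 0);
  t->block_factor_x = (bf > 1);
  if (bf <= 1)
    return;                      /* 0 and 1 need no table entry */
  for (size_t i = 0; i < toy_table_len; i++)
    if (toy_table[i].key == t)
      {
        toy_table[i].value = bf; /* update an existing entry */
        return;
      }
  assert (toy_table_len < TOY_MAX_ENTRIES);
  toy_table[toy_table_len].key = t;
  toy_table[toy_table_len].value = bf;
  toy_table_len++;
}

static unsigned long
toy_get_block_factor (const struct toy_type_node *t)
{
  if (t->block_factor_0)
    return 0;
  if (!t->block_factor_x)
    return 1;                    /* the default */
  for (size_t i = 0; i < toy_table_len; i++)
    if (toy_table[i].key == t)
      return toy_table[i].value;
  assert (!"flagged node missing from side table");
  return 1;
}
```

The common cases (1 and 0) cost only flag bits in the node; only types
with other blocking factors pay for a table entry.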

Introduction: UPC-related Changes
-

Below, various UPC-related changes are summarized.
This introduction is provided as background for review of the UPC
changes implemented in the GUPC branch.  Each individual change will be
discussed in more detail in the patch sets found in the following emails.

The current GUPC branch is based upon a recent version of the GCC trunk
and has been bootstrapped on x86_64/i686 Linux, x86_64
Darwin, IA64/Altix Linux, PowerPC Power7 (big endian), and Power8
(little endian).  Some testing has also been done on various flavors
of BSD and Solaris; in the past, MIPS was also tested and supported.

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

In the discussion below, some changes are excerpted in order to
highlight important aspects of the changes.

UPC's Shared Qualifier and Layout Qualifier
---

The UPC language specification describes
the language syntax and semantics:
  http://upc.lbl.gov/publications/upc-spec-1.3.pdf

UPC introduces a new qualifier, "shared", which indicates that the
qualified object is located in a global shared address space that is
accessible by all UPC threads.  Additional qualifiers ("strict" and
"relaxed") further specify the semantics of accesses to
UPC shared objects.

In UPC, a shared qualified array can optionally specify a "layout
qualifier" that indicates how the shared data is blocked and
distributed across UPC threads.
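
To make the layout qualifier concrete: under the block-cyclic rule of the UPC
specification, element i of an array declared "shared [B] T a[N]" has affinity
to thread (i / B) mod THREADS.  The sketch below models that rule in plain C++
(not UPC).  The function names upc_affinity and upc_phase are illustrative
only, and a fixed, known thread count is assumed.

```cpp
#include <cassert>
#include <cstddef>

// Thread that owns element `index` of an array declared
// `shared [block_size] T a[N]` when the program runs on `threads` threads.
// block_size must be non-zero; indefinite layouts are not modelled here.
std::size_t upc_affinity(std::size_t index, std::size_t block_size,
                         std::size_t threads) {
  return (index / block_size) % threads;
}

// Position of the element within its block (its "phase").
std::size_t upc_phase(std::size_t index, std::size_t block_size) {
  return index % block_size;
}
```

With a block size of 4 and 3 threads, elements 0..3 land on thread 0,
elements 4..7 on thread 1, elements 8..11 on thread 2, and element 12 wraps
around to thread 0 again.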

There are two language pre-defined identifiers that indicate the
number of threads that will be create

Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Bin.Cheng
On Sat, Nov 28, 2015 at 3:40 AM, Jakub Jelinek  wrote:
> Hi!
>
> The recent changes where vector sqrt is represented in the IL using
> IFN_SQRT instead of target specific builtins broke the discovery
> of vector rsqrt, as targetm.builtin_reciprocal is called only
> on builtin functions (not internal functions).  Furthermore,
> for internal fns, not only the IFN_* is significant, but also the
> types (modes actually) of the lhs and/or arguments.
>
> This patch adjusts the target hook, so that the backends can just inspect
> the call (builtin or internal function), whatever it is.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2015-11-27  Jakub Jelinek  
>
> PR tree-optimization/68501
> * target.def (builtin_reciprocal): Replace the 3 arguments with
> a gcall * one, adjust description.
> * targhooks.h (default_builtin_reciprocal): Replace the 3 arguments
> with a gcall * one.
> * targhooks.c (default_builtin_reciprocal): Likewise.
> * tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Use
> targetm.builtin_reciprocal even on internal functions, adjust
> the arguments and allow replacing an internal function with normal
> built-in.
> * config/i386/i386.c (ix86_builtin_reciprocal): Replace the 3 
> arguments
> with a gcall * one.  Handle internal fns too.
> * config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Likewise.
> * config/aarch64/aarch64.c (aarch64_builtin_reciprocal): Likewise.
> * doc/tm.texi (builtin_reciprocal): Document.
>
> --- gcc/target.def.jj   2015-11-18 11:19:19.0 +0100
> +++ gcc/target.def  2015-11-27 16:37:07.870823670 +0100
> @@ -2463,13 +2463,9 @@ identical versions.",
>  DEFHOOK
>  (builtin_reciprocal,
>   "This hook should return the DECL of a function that implements reciprocal 
> of\n\
> -the builtin function with builtin function code @var{fn}, or\n\
> -@code{NULL_TREE} if such a function is not available.  @var{md_fn} is true\n\
> -when @var{fn} is a code of a machine-dependent builtin function.  When\n\
> -@var{sqrt} is true, additional optimizations that apply only to the 
> reciprocal\n\
> -of a square root function are performed, and only reciprocals of 
> @code{sqrt}\n\
> -function are valid.",
> - tree, (unsigned fn, bool md_fn, bool sqrt),
> +the builtin or internal function call @var{call}, or\n\
> +@code{NULL_TREE} if such a function is not available.",
> + tree, (gcall *call),
>   default_builtin_reciprocal)
>
>  /* For a vendor-specific TYPE, return a pointer to a statically-allocated
> --- gcc/targhooks.h.jj  2015-11-18 11:19:17.0 +0100
> +++ gcc/targhooks.h 2015-11-27 16:37:44.828301093 +0100
> @@ -90,7 +90,7 @@ extern tree default_builtin_vectorized_c
>
>  extern int default_builtin_vectorization_cost (enum vect_cost_for_stmt, 
> tree, int);
>
> -extern tree default_builtin_reciprocal (unsigned int, bool, bool);
> +extern tree default_builtin_reciprocal (gcall *);
>
>  extern HOST_WIDE_INT default_vector_alignment (const_tree);
>
> --- gcc/targhooks.c.jj  2015-11-18 11:19:17.0 +0100
> +++ gcc/targhooks.c 2015-11-27 16:38:21.461783097 +0100
> @@ -600,9 +600,7 @@ default_builtin_vectorization_cost (enum
>  /* Reciprocal.  */
>
>  tree
> -default_builtin_reciprocal (unsigned int fn ATTRIBUTE_UNUSED,
> -   bool md_fn ATTRIBUTE_UNUSED,
> -   bool sqrt ATTRIBUTE_UNUSED)
> +default_builtin_reciprocal (gcall *)
>  {
>return NULL_TREE;
>  }
> --- gcc/tree-ssa-math-opts.c.jj 2015-11-25 09:57:47.0 +0100
> +++ gcc/tree-ssa-math-opts.c2015-11-27 17:07:22.756162308 +0100
> @@ -601,19 +601,17 @@ pass_cse_reciprocals::execute (function
>
>   if (is_gimple_call (stmt1)
>   && gimple_call_lhs (stmt1)
> - && (fndecl = gimple_call_fndecl (stmt1))
> - && (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
> - || DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD))
> + && (gimple_call_internal_p (stmt1)
> + || ((fndecl = gimple_call_fndecl (stmt1))
> + && (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
> + || (DECL_BUILT_IN_CLASS (fndecl)
> + == BUILT_IN_MD)
> {
> - enum built_in_function code;
> - bool md_code, fail;
> + bool fail;
>   imm_use_iterator ui;
>   use_operand_p use_p;
>
> - code = DECL_FUNCTION_CODE (fndecl);
> - md_code = DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD;
> -
> - fndecl = targetm.builtin_reciprocal (code, md_code, false);
> + fndecl = targetm.builtin_reciprocal (as_a <gcall *> (stmt1));
>   if (!fndecl)
> continue;
>
> @@ -639,8 +637,28 

Go patch committed: Don't set TYPE_STRING_FLAG on a type variant

2015-11-30 Thread Ian Lance Taylor
PR 68477 observes that gccgo crashes when using -flto1 because a type
variant has TYPE_STRING_FLAG set.  So, don't do that.
TYPE_STRING_FLAG doesn't really do anything, as far as I can tell,
since all the relevant tests in dwarf2out.c also test isfortran().
But, it seems like the right thing to do.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2015-11-30  Ian Lance Taylor  

PR go/68477
* go-gcc.cc (Gcc_backend::string_constant_expression): Don't set
TYPE_STRING_FLAG on a variant type.
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 230759)
+++ gcc/go/go-gcc.cc(working copy)
@@ -1279,7 +1279,6 @@ Gcc_backend::string_constant_expression(
   tree const_char_type = build_qualified_type(unsigned_char_type_node,
  TYPE_QUAL_CONST);
   tree string_type = build_array_type(const_char_type, index_type);
-  string_type = build_variant_type_copy(string_type);
   TYPE_STRING_FLAG(string_type) = 1;
   tree string_val = build_string(val.length(), val.data());
   TREE_TYPE(string_val) = string_type;


-fstrict-aliasing fixes 2/5: drop alias set 0 streaming

2015-11-30 Thread Jan Hubicka
Hi,
this patch disables streaming of the alias-set-0 flag and adds a comment explaining why.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* lto-streamer-out.c (hash_tree): Do not stream TYPE_ALIAS_SET.
* tree-streamer-out.c (pack_ts_type_common_value_fields): Do not
stream TYPE_ALIAS_SET.
* tree-streamer-in.c (unpack_ts_type_common_value_fields): Do not
stream TYPE_ALIAS_SET.

* lto.c (compare_tree_sccs_1): Do not compare TYPE_ALIAS_SET.
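
The ordering issue named in the comment this patch adds to compare_tree_sccs_1
can be illustrated with a toy model.  This is only a sketch with hypothetical
names (ToyType, toy_get_alias_set), not the streamer code: once a field is
filled in lazily, a comparison that includes it depends on whether the lazy
computation has already run for one of the two nodes.

```cpp
#include <cassert>

// Toy stand-in for a type node; the field names are hypothetical.
struct ToyType {
  int precision;
  int align;
  int alias_set;  // -1 means "not computed yet"; filled in lazily.
};

// Lazily compute the cached alias set, loosely in the spirit of
// get_alias_set.  The concrete value is irrelevant; it only has to change.
int toy_get_alias_set(ToyType &t) {
  if (t.alias_set == -1)
    t.alias_set = 1;
  return t.alias_set;
}

// Comparison that includes the lazily initialized cache field.
bool same_with_cache(const ToyType &a, const ToyType &b) {
  return a.precision == b.precision && a.align == b.align &&
         a.alias_set == b.alias_set;
}

// Comparison that ignores it, as the patch does for TYPE_ALIAS_SET.
bool same_without_cache(const ToyType &a, const ToyType &b) {
  return a.precision == b.precision && a.align == b.align;
}
```

Two identical nodes compare equal either way until toy_get_alias_set is called
on just one of them; after that only same_without_cache still reports them as
equal.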

Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 231081)
+++ lto-streamer-out.c  (working copy)
@@ -1109,10 +1109,6 @@ hash_tree (struct streamer_tree_cache_d
   hstate.commit_flag ();
   hstate.add_int (TYPE_PRECISION (t));
   hstate.add_int (TYPE_ALIGN (t));
-  hstate.add_int ((TYPE_ALIAS_SET (t) == 0
-|| (!in_lto_p
-&& get_alias_set (t) == 0))
-   ? 0 : -1);
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_TRANSLATION_UNIT_DECL))
Index: lto/lto.c
===
--- lto/lto.c   (revision 231081)
+++ lto/lto.c   (working copy)
@@ -1166,7 +1166,9 @@ compare_tree_sccs_1 (tree t1, tree t2, t
   compare_values (TYPE_READONLY);
   compare_values (TYPE_PRECISION);
   compare_values (TYPE_ALIGN);
-  compare_values (TYPE_ALIAS_SET);
+  /* Do not compare TYPE_ALIAS_SET.  Doing so introduce ordering issues
+ with calls to get_alias_set which may initialize it for streamed
+in types.  */
 }
 
   /* We don't want to compare locations, so there is nothing do compare
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 231081)
+++ tree-streamer-out.c (working copy)
@@ -317,13 +317,9 @@ pack_ts_type_common_value_fields (struct
   bp_pack_value (bp, TYPE_RESTRICT (expr), 1);
   bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
   bp_pack_value (bp, TYPE_READONLY (expr), 1);
-  /* Make sure to preserve the fact whether the frontend would assign
- alias-set zero to this type.  Do that only for main variants, because
- type variants alias sets are never computed.
- FIXME:  This does not work for pre-streamed builtin types.  */
-  bp_pack_value (bp, (TYPE_ALIAS_SET (expr) == 0
- || (!in_lto_p && TYPE_MAIN_VARIANT (expr) == expr
- && get_alias_set (expr) == 0)), 1);
+  /* We used to stream TYPE_ALIAS_SET == 0 information to let frontends mark
+ types that are opaque for TBAA.  This however did not work as intended,
+ becuase TYPE_ALIAS_SET == 0 was regularly lost in canonical type merging. 
 */
   if (RECORD_OR_UNION_TYPE_P (expr))
 {
   bp_pack_value (bp, TYPE_TRANSPARENT_AGGR (expr), 1);
Index: tree-streamer-in.c
===
--- tree-streamer-in.c  (revision 231081)
+++ tree-streamer-in.c  (working copy)
@@ -366,7 +366,6 @@ unpack_ts_type_common_value_fields (stru
   TYPE_RESTRICT (expr) = (unsigned) bp_unpack_value (bp, 1);
   TYPE_USER_ALIGN (expr) = (unsigned) bp_unpack_value (bp, 1);
   TYPE_READONLY (expr) = (unsigned) bp_unpack_value (bp, 1);
-  TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, 1) ? 0 : -1;
   if (RECORD_OR_UNION_TYPE_P (expr))
 {
   TYPE_TRANSPARENT_AGGR (expr) = (unsigned) bp_unpack_value (bp, 1);


Re: [PATCH AArch64]Handle REG+REG+CONST and REG+NON_REG+CONST in legitimize address

2015-11-30 Thread Bin.Cheng
On Tue, Nov 24, 2015 at 6:18 PM, Richard Earnshaw
 wrote:
> On 24/11/15 09:56, Richard Earnshaw wrote:
>> On 24/11/15 02:51, Bin.Cheng wrote:
> The aarch64's problem is we don't define addptr3 pattern, and we don't
>>> have direct insn pattern describing the "x + y << z".  According to
>>> gcc internal:
>>>
>>> ‘addptrm3’
>>> Like addm3 but is guaranteed to only be used for address calculations.
>>> The expanded code is not allowed to clobber the condition code. It
>>> only needs to be defined if addm3 sets the condition code.
>
> addm3 on aarch64 does not set the condition codes, so by this rule we
> shouldn't need to define this pattern.
>>> Hi Richard,
>>> I think that rule has a prerequisite that backend needs to support
>>> register shifted addition in addm3 pattern.
>>
>> addm3 is a named pattern and its format is well defined.  It does not
>> take a shifted operand and never has.
>>
>>> Apparently for AArch64,
>>> addm3 only supports "reg+reg" or "reg+imm".  Also we don't really
>>> "does not set the condition codes" actually, because both
>>> "adds_shift_imm_*" and "adds_mul_imm_*" do set the condition flags.
>>
>> You appear to be confusing named patterns (used by expand) with
>> recognizers.  Anyway, we have
>>
>> (define_insn "*add__"
>>   [(set (match_operand:GPI 0 "register_operand" "=r")
>> (plus:GPI (ASHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
>>   (match_operand:QI 2
>> "aarch64_shift_imm_" "n"))
>>   (match_operand:GPI 3 "register_operand" "r")))]
>>
>> Which is a non-flag setting add with shifted operand.
>>
>>> Either way I think it is another backend issue, so do you approve that
>>> I commit this patch now?
>>
>> Not yet.  I think there's something fundamental amiss here.
>>
>> BTW, it looks to me as though addptr3 should have exactly the same
>> operand rules as add3 (documentation reads "like add3"), so a
>> shifted operand shouldn't be supported there either.  If that isn't the
>> case then that should be clearly called out in the documentation.
>>
>> R.
>>
>
> PS.
>
> I presume you are aware of the canonicalization rules for add?  That is,
> for a shift-and-add operation, the shift operand must appear first.  Ie.
>
> (plus (shift (op, op)), op)
>
> not
>
> (plus (op, (shift (op, op))

Hi Richard,
Thanks for the comments.  I realized that the not-recognized insn
issue arises because the original patch builds non-canonical expressions.
When reloading an address expression, LRA generates a non-canonical
register-scaled insn, which can't be recognized by the aarch64 backend.
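
The canonicalization rule Richard quotes (in a shift-and-add, the shift
operand comes first) can be modelled on a toy expression tree.  This is a
sketch only: Expr, Code and canonicalize_plus are hypothetical names, and real
RTL canonicalization covers many more cases.

```cpp
#include <cassert>
#include <utility>

// Toy expression node; real RTL is far richer.  All names are hypothetical.
enum class Code { Reg, Shift, Plus };

struct Expr {
  Code code;
  const Expr *op0;
  const Expr *op1;
};

// Put a PLUS into the order described above: if only the second operand is
// a shift, swap the operands so that the shift comes first.
Expr canonicalize_plus(Expr e) {
  if (e.code == Code::Plus && e.op1 && e.op1->code == Code::Shift &&
      !(e.op0 && e.op0->code == Code::Shift))
    std::swap(e.op0, e.op1);
  return e;
}
```

An expression built as (plus reg (shift reg reg)) comes back as
(plus (shift reg reg) reg), while one already in that order is returned
unchanged.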

Here is the updated patch using the canonical form of the pattern; it passes
bootstrap and regression testing.  The ivo failure still exists,
but it was analyzed in the original message.

Is this patch OK?

As for Jiong's concern about the additional extension instruction, I
think this only applies to atomic load/store instructions.  For
general loads and stores, AArch64 supports zext/sext in the register-scaling
addressing mode, so the additional instruction can be forward-propagated
into the memory reference.  The problem for atomic loads and stores is that
AArch64 only supports the direct register addressing mode.  After LRA reloads
the address expression out of the memory reference, there is no combine/fwprop
optimizer to merge the instructions.  The underlying problem is that
atomic_store's predicate doesn't match its constraint.  The predicate used for
atomic_store is memory_operand, while all other atomic patterns
use aarch64_sync_memory_operand.  I think this might be a typo.  With
this changed, expand will no longer generate addressing modes requiring
reload.  I will test another patch fixing this.

Thanks,
bin
>
> R.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3fe2f0f..5b3e3c4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4757,13 +4757,65 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, 
machine_mode mode)
  We try to pick as large a range for the offset as possible to
  maximize the chance of a CSE.  However, for aligned addresses
  we limit the range to 4k so that structures with different sized
- elements are likely to use the same base.  */
+ elements are likely to use the same base.  We need to be careful
+ not split CONST for some forms address expressions, otherwise it
+ will generate sub-optimal code.  */
 
   if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
 {
   HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
   HOST_WIDE_INT base_offset;
 
+  if (GET_CODE (XEXP (x, 0)) == PLUS)
+   {
+ rtx op0 = XEXP (XEXP (x, 0), 0);
+ rtx op1 = XEXP (XEXP (x, 0), 1);
+
+ /* For addr expression in the form like "r1 + r2 + 0x3ffc".
+Since the offset is within range supported by addressing
+mode "reg+offset", we don't split the const and legalize
+it into below insn and expr sequence:
+  r3

Re: [RFA] Implement incremental IL linking

2015-11-30 Thread Jan Hubicka
> Hi,
> this is a polished version of the patch to implement IL-level incremental
> linking.
> -flinker-output is now documented and can be specified to the GCC driver.
> In this case the plugin gets the option -linker-output-known and stops
> attempting to detect it from info passed down by the linker. I also added doc
> for the flag to invoke.texi.
> 
> Modulo the testsuite compensation, the rest of the patch is basically unchanged
> since the earlier version: lto-wrapper looks for the linker-output flag and
> switches to non-WPA mode (because we do not want to execute ltrans
> compilations), and the lto frontend configures the compiler to output IL and
> possibly a flat lto binary to the object file.
> 
> Bootstrapped/regtested x86_64-linux, OK?
Hmm and now for the fun part.  I just noticed that the patch works well with
both GNU LD and Gold from my system installation, which is

GNU gold (GNU Binutils 2.24.51.20140405) 1.11

while newer version:

GNU gold (GNU Binutils 2.25.51.20150520) 1.11

fails with:

/tmp/ccPIuUSA.lto.o: plugin needed to handle lto object

in the final stage of incremental linking.  This seems like a binutils bug: the
message should be output only if there are LTO objects not claimed by the linker
before invoking the plugin.  There is no need to error out when the plugin itself
produces IL for incremental linking.

I will check whether a newer version fixes it and file a PR.  I suppose I can
whitelist ld versions in the plugin and enable -flinker-output=rel only on
binutils versions where this works.  There is LDPT_GOLD_VERSION, which tells me
the version.  I will update the patch accordingly and check what version range
refuses to finish the link.

Honza

> 
> Honza
> 
>   * lto-plugin.c: Document options; add -linker-output-known;
>   determine when to use rel and when nolto-rel output.
> 
>   * lto-wrapper.c (run_gcc): Look for -flinker-output=rel also in the
>   list of options passed from the driver.
>   * passes.c (ipa_write_summaries): Only modify statements if body
>   is in memory.
>   * cgraphunit.c (ipa_passes): Also produce intermediate code when
>   incrementally linking.
>   (ipa_passes): Likewise.
>   * lto-cgraph.c (lto_output_node): When incrementally linking do not
>   pass down resolution info.
>   * common.opt (flag_incremental_link): Update info.
>   * gcc.c (plugin specs): Turn flinker-output=* to
>   -plugin-opt=-linker-output-known
>   * toplev.c (compile_file): Also cut compilation when doing incremental
>   link.
>   * flag-types.h (enum lto_partition_model): Add
>   LTO_LINKER_OUTPUT_NOLTOREL.
>   (invoke.texi): Add -flinker-output docs.
> 
>   * lang.opt (lto_linker_output): Add nolto-rel.
>   * lto-lang.c (lto_post_options): Handle LTO_LINKER_OUTPUT_REL
>   and LTO_LINKER_OUTPUT_NOLTOREL.
>   (lto_init): Generate lto when doing incremental link.
> 
>   * gcc.dg/lto/20081120-2_0.c: Add -flinker-output=nolto-rel
>   * gcc.dg/lto/20090126-1_0.c: Likewise.
>   * gcc.dg/lto/20091020-2_0.c: Likewise.
>   * gcc.dg/lto/20081204-2_0.c: Likewise.
>   * gcc.dg/lto/20091015-1_0.c: Likewise.
>   * gcc.dg/lto/20090126-2_0.c: Likewise.
>   * gcc.dg/lto/20090116_0.c: Likewise.
>   * gcc.dg/lto/20081224_0.c: Likewise.
>   * gcc.dg/lto/20091027-1_0.c: Likewise.
>   * gcc.dg/lto/20090219_0.c: Likewise.
>   * gcc.dg/lto/20081212-1_0.c: Likewise.
>   * gcc.dg/lto/20091013-1_0.c: Likewise.
>   * gcc.dg/lto/20081126_0.c: Likewise.
>   * gcc.dg/lto/20090206-1_0.c: Likewise.
>   * gcc.dg/lto/20091016-1_0.c: Likewise.
>   * gcc.dg/lto/20081120-1_0.c: Likewise.
>   * gcc.dg/lto/20091020-1_0.c: Likewise.
>   * gcc.dg/lto/20100426_0.c: Likewise.
>   * gcc.dg/lto/20081204-1_0.c: Likewise.
>   * gcc.dg/lto/20091014-1_0.c: Likewise.
>   * g++.dg/lto/20081109-1_0.C: Likewise.
>   * g++.dg/lto/20100724-1_0.C: Likewise.
>   * g++.dg/lto/20081204-1_0.C: Likewise.
>   * g++.dg/lto/pr45679-2_0.C: Likewise.
>   * g++.dg/lto/20110311-1_0.C: Likewise.
>   * g++.dg/lto/20090302_0.C: Likewise.
>   * g++.dg/lto/20081118_0.C: Likewise.
>   * g++.dg/lto/20091002-2_0.C: Likewise.
>   * g++.dg/lto/20081120-2_0.C: Likewise.
>   * g++.dg/lto/20081123_0.C: Likewise.
>   * g++.dg/lto/20090313_0.C: Likewise.
>   * g++.dg/lto/pr54625-1_0.c: Likewise.
>   * g++.dg/lto/pr48354-1_0.C: Likewise.
>   * g++.dg/lto/20081219_0.C: Likewise.
>   * g++.dg/lto/pr48042_0.C: Likewise.
>   * g++.dg/lto/20101015-2_0.C: Likewise.
>   * g++.dg/lto/pr45679-1_0.C: Likewise.
>   * g++.dg/lto/20091026-1_0.C: Likewise.
>   * g++.dg/lto/pr45621_0.C: Likewise.
>   * g++.dg/lto/20081119-1_0.C: Likewise.
>   * g++.dg/lto/20101010-4_0.C: Likewise.
>   * g++.dg/lto/20081120-1_0.C: Likewise.
>   * g++.dg/lto/20091002-1_0.C: Likewise.
>   * g++.dg/lto/20091002-3_0.C: Likewise.
>   * gfortran.dg/lto/20091016-1_0.f90: Li

Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Joseph Myers
On Mon, 30 Nov 2015, Marek Polacek wrote:

> On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> > Different approach: after the FE folds (unexpectedly?), scan the result for
> > SAVE_EXPRs and if found, drop the folding.
> 
> This doesn't fix the problem completely either, because we simply don't know where
> those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
> changed the original testcase a tiny bit (added -), then those SAVE_EXPRs were
> introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).

Well, c_fully_fold should eliminate all C_MAYBE_CONST_EXPRs in its 
argument and never pass anything containing them to the 
language-independent folders.  So it shouldn't matter if something called 
by c_fully_fold introduces a SAVE_EXPR.  If it does matter, that indicates 
the problem was earlier (something earlier putting a tree that 
c_fully_fold doesn't fold around a tree containing a C_MAYBE_CONST_EXPR, 
without folding first).

-- 
Joseph S. Myers
jos...@codesourcery.com


[hsa] Use gimplify_expr in gridification

2015-11-30 Thread Martin Jambor
Hi,

doing some more testing of the branch and combining two of my
testcases, I came across a bug where temporaries created by
force_gimple_operand_gsi were not added to the proper bind and thus
were subsequently re-mapped to error_mark when the target construct
was within some other omp construct.  Fixed with this patch, where
pop_gimplify_context does the right thing like at other places in
omp-low.c.  Committed to the branch.
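
The push/pop discipline described above can be pictured with a small model.
This is a sketch under loud assumptions: ToyGimplifier and ToyBind are
hypothetical stand-ins that only mimic the idea that temporaries created while
a context is active get declared in the bind passed to the pop call.  It is
not the omp-low.c code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: a "bind" owns the declarations visible in a scope.
struct ToyBind {
  std::vector<std::string> vars;
};

// Toy context stack; the names mirror the GCC calls only loosely.
struct ToyGimplifier {
  std::vector<std::vector<std::string>> contexts;
  int next_id = 0;

  void push_context() { contexts.emplace_back(); }

  // Create a temporary and remember it in the innermost context.
  std::string create_tmp() {
    assert(!contexts.empty());
    std::string name = "tmp" + std::to_string(next_id++);
    contexts.back().push_back(name);
    return name;
  }

  // Pop the innermost context and declare its temporaries in `bind`.
  void pop_context(ToyBind &bind) {
    assert(!contexts.empty());
    for (const std::string &v : contexts.back())
      bind.vars.push_back(v);
    contexts.pop_back();
  }
};
```

In this model a temporary can never end up without a declaring bind, because
every temporary is recorded in the active context and each context is handed
to a bind when it is popped.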

Thanks,

Martin



2015-11-30  Martin Jambor  

* omp-low.c (attempt_target_gridification): Use gimplify_expr.
---
 gcc/omp-low.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index bdf6539..7fbdcdf 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17481,6 +17481,7 @@ attempt_target_gridification (gomp_target *target, 
gimple_stmt_iterator *gsi,
  gpukernel);
 
   walk_tree (&group_size, remap_prebody_decls, &wi, NULL);
+  push_gimplify_context ();
   size_t collapse = gimple_omp_for_collapse (inner_loop);
   for (size_t i = 0; i < collapse; i++)
 {
@@ -17499,30 +17500,32 @@ attempt_target_gridification (gomp_target *target, 
gimple_stmt_iterator *gsi,
   tree step;
   step = get_omp_for_step_from_incr (loc,
 gimple_omp_for_incr (inner_loop, i));
-  n1 = force_gimple_operand_gsi (gsi, fold_convert (type, n1), true,
-NULL_TREE, true, GSI_SAME_STMT);
-  n2 = force_gimple_operand_gsi (gsi, fold_convert (itype, n2), true,
-NULL_TREE,
-true, GSI_SAME_STMT);
+  gimple_seq tmpseq = NULL;
+  n1 = fold_convert (itype, n1);
+  n2 = fold_convert (itype, n2);
   tree t = build_int_cst (itype, (cond_code == LT_EXPR ? -1 : 1));
   t = fold_build2 (PLUS_EXPR, itype, step, t);
   t = fold_build2 (PLUS_EXPR, itype, t, n2);
-  t = fold_build2 (MINUS_EXPR, itype, t, fold_convert (itype, n1));
+  t = fold_build2 (MINUS_EXPR, itype, t, n1);
   if (TYPE_UNSIGNED (itype) && cond_code == GT_EXPR)
t = fold_build2 (TRUNC_DIV_EXPR, itype,
 fold_build1 (NEGATE_EXPR, itype, t),
 fold_build1 (NEGATE_EXPR, itype, step));
   else
t = fold_build2 (TRUNC_DIV_EXPR, itype, t, step);
-  t = fold_convert (uint32_type_node, t);
-  tree gs = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true,
- GSI_SAME_STMT);
+  tree gs = fold_convert (uint32_type_node, t);
+  gimplify_expr (&gs, &tmpseq, NULL, is_gimple_val, fb_rvalue);
+  if (!gimple_seq_empty_p (tmpseq))
+   gsi_insert_seq_before (gsi, tmpseq, GSI_SAME_STMT);
+
   tree ws;
   if (i == 0 && group_size)
{
  ws = fold_convert (uint32_type_node, group_size);
- ws = force_gimple_operand_gsi (gsi, ws, true, NULL_TREE, true,
-GSI_SAME_STMT);
+ tmpseq = NULL;
+ gimplify_expr (&ws, &tmpseq, NULL, is_gimple_val, fb_rvalue);
+ if (!gimple_seq_empty_p (tmpseq))
+   gsi_insert_seq_before (gsi, tmpseq, GSI_SAME_STMT);
}
   else
ws = build_zero_cst (uint32_type_node);
@@ -17534,7 +17537,7 @@ attempt_target_gridification (gomp_target *target, 
gimple_stmt_iterator *gsi,
   OMP_CLAUSE_CHAIN (c) = gimple_omp_target_clauses (target);
   gimple_omp_target_set_clauses (target, c);
 }
-
+  pop_gimplify_context (tgt_bind);
   delete declmap;
   return;
 }
-- 
2.6.0



[hsa] Use proper accesses to gimple_omp_for

2015-11-30 Thread Martin Jambor
Hi,

when looking at the attempt_target_gridification function I realized I
forgot to replace some of the early code with proper gimple
statement access function calls.  This patch addresses that.
Committed to the branch.

Thanks,

Martin


2015-11-30  Martin Jambor  

* omp-low.c (attempt_target_gridification): Use proper access into
iter array of the inner loop.
---
 gcc/omp-low.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5933c60..bdf6539 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17484,21 +17484,21 @@ attempt_target_gridification (gomp_target *target, 
gimple_stmt_iterator *gsi,
   size_t collapse = gimple_omp_for_collapse (inner_loop);
   for (size_t i = 0; i < collapse; i++)
 {
-  gimple_omp_for_iter iter = inner_loop->iter[i];
-  walk_tree (&iter.initial, remap_prebody_decls, &wi, NULL);
-  walk_tree (&iter.final, remap_prebody_decls, &wi, NULL);
-
-  tree itype, type = TREE_TYPE (iter.index);
+  tree itype, type = TREE_TYPE (gimple_omp_for_index (inner_loop, i));
   if (POINTER_TYPE_P (type))
itype = signed_type_for (type);
   else
itype = type;
 
-  enum tree_code cond_code = iter.cond;
-  tree n1 = iter.initial;
-  tree n2 = iter.final;
+  enum tree_code cond_code = gimple_omp_for_cond (inner_loop, i);
+  tree n1 = unshare_expr (gimple_omp_for_initial (inner_loop, i));
+  walk_tree (&n1, remap_prebody_decls, &wi, NULL);
+  tree n2 = unshare_expr (gimple_omp_for_final (inner_loop, i));
+  walk_tree (&n2, remap_prebody_decls, &wi, NULL);
   adjust_for_condition (loc, &cond_code, &n2);
-  tree step = get_omp_for_step_from_incr (loc, iter.incr);
+  tree step;
+  step = get_omp_for_step_from_incr (loc,
+gimple_omp_for_incr (inner_loop, i));
   n1 = force_gimple_operand_gsi (gsi, fold_convert (type, n1), true,
 NULL_TREE, true, GSI_SAME_STMT);
   n2 = force_gimple_operand_gsi (gsi, fold_convert (itype, n2), true,
-- 
2.6.0



Re: [PATCH] Fix declaration of pthread-structs in s-osinte-rtems.ads (ada/68169)

2015-11-30 Thread Jeff Law

On 11/30/2015 03:06 PM, Jan Sommer wrote:

> Could someone with write access please commit the patch?
> The paperwork with the FSF has gone through. If something else is missing,
> please tell me.
> I won't be available next week.

I'm not sure what you built your patches against, but I can't apply them
to the trunk.  Can you resend the patch as a diff against the trunk?


Often I can fix things by hand, but this is Ada and I'd be much more 
likely to botch something.



jeff




[gomp4] fortran routine backports

2015-11-30 Thread Cesar Philippidis
This patch backports the recent fortran routine support changes I've
made in trunk to gomp-4_0-branch. Nothing changed in the fortran front
end, but I corrected a couple of problems with the way that gang, worker
and vector were handled in tree-nested.c. And there's a new test case to
exercise those changes.

This patch has been applied to gomp-4_0-branch.
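
The handling of optional clause arguments can be sketched outside GCC as
follows.  This is a toy model with hypothetical names (ToyClause,
clause_operands); it only mirrors the shape of the logic in the tree-nested.c
hunks: visit the first operand if it is present, and for gang also visit the
static argument if it is present.

```cpp
#include <cassert>
#include <vector>

enum class ClauseKind { Gang, Worker, Vector, Seq };

// Toy clause; a null pointer means the optional argument is absent.
struct ToyClause {
  ClauseKind kind;
  const int *operand;      // optional first argument
  const int *gang_static;  // optional second argument, meaningful for gang only
};

// Collect the argument values that a clause walker would have to visit.
std::vector<int> clause_operands(const ToyClause &c) {
  std::vector<int> out;
  if (c.kind == ClauseKind::Seq)
    return out;  // seq takes no arguments
  if (c.operand)
    out.push_back(*c.operand);
  if (c.kind == ClauseKind::Gang && c.gang_static)
    out.push_back(*c.gang_static);
  return out;
}
```

A bare worker clause yields no operands, a gang clause with both arguments
yields two, and a stray static argument on a non-gang clause is ignored.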

Cesar
2015-11-30  Cesar Philippidis  

	gcc/
	* tree-nested.c (convert_nonlocal_omp_clauses): Handle optional
	arguments for OMP_CLAUSE_{GANG,WORKER,VECTOR}.
	(convert_local_omp_clauses): Likewise

	gcc/testsuite/
	* gfortran.dg/goacc/subroutines.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/goacc/subroutines.f90 b/gcc/testsuite/gfortran.dg/goacc/subroutines.f90
new file mode 100644
index 000..6cab798
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/subroutines.f90
@@ -0,0 +1,73 @@
+! Exercise how tree-nested.c handles gang, worker vector and seq.
+
+! { dg-do compile } 
+
+program main
+  integer, parameter :: N = 100
+  integer :: nonlocal_arg
+  integer :: nonlocal_a(N)
+  integer :: nonlocal_i
+  integer :: nonlocal_j
+  
+  nonlocal_a (:) = 5
+  nonlocal_arg = 5
+  
+  call local ()
+  call nonlocal ()
+
+contains
+
+  subroutine local ()
+integer :: local_i
+integer :: local_arg
+integer :: local_a(N)
+integer :: local_j
+
+local_a (:) = 5
+local_arg = 5
+
+!$acc kernels loop gang(num:local_arg) worker(local_arg) vector(local_arg)
+do local_i = 1, N
+   local_a(local_i) = 100
+   !$acc loop seq
+   do local_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+
+!$acc kernels loop gang(static:local_arg) worker(local_arg) &
+!$acc vector(local_arg)
+do local_i = 1, N
+   local_a(local_i) = 100
+   !$acc loop seq
+   do local_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+  end subroutine local
+
+  subroutine nonlocal ()
+nonlocal_a (:) = 5
+nonlocal_arg = 5
+  
+!$acc kernels loop gang(num:nonlocal_arg) worker(nonlocal_arg) &
+!$acc vector(nonlocal_arg)
+do nonlocal_i = 1, N
+   nonlocal_a(nonlocal_i) = 100
+   !$acc loop seq
+   do nonlocal_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+
+!$acc kernels loop gang(static:nonlocal_arg) worker(nonlocal_arg) &
+!$acc vector(nonlocal_arg)
+do nonlocal_i = 1, N
+   nonlocal_a(nonlocal_i) = 100
+   !$acc loop seq
+   do nonlocal_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+  end subroutine nonlocal
+end program main
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index e321072..1c9849b 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -1109,10 +1109,28 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_NUM_GANGS:
 	case OMP_CLAUSE_NUM_WORKERS:
 	case OMP_CLAUSE_VECTOR_LENGTH:
-	  wi->val_only = true;
-	  wi->is_lhs = false;
-	  convert_nonlocal_reference_op (&OMP_CLAUSE_OPERAND (clause, 0),
-	 &dummy, wi);
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_VECTOR:
+	  /* Several OpenACC clauses have optional arguments.  Check if they
+	 are present.  */
+	  if (OMP_CLAUSE_OPERAND (clause, 0))
+	{
+	  wi->val_only = true;
+	  wi->is_lhs = false;
+	  convert_nonlocal_reference_op (&OMP_CLAUSE_OPERAND (clause, 0),
+	 &dummy, wi);
+	}
+
+	  /* The gang clause accepts two arguments.  */
+	  if (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_GANG
+	  && OMP_CLAUSE_GANG_STATIC_EXPR (clause))
+	{
+		wi->val_only = true;
+		wi->is_lhs = false;
+		convert_nonlocal_reference_op
+		  (&OMP_CLAUSE_GANG_STATIC_EXPR (clause), &dummy, wi);
+	}
 	  break;
 
 	case OMP_CLAUSE_DIST_SCHEDULE:
@@ -1176,9 +1194,6 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_THREADS:
 	case OMP_CLAUSE_SIMD:
 	case OMP_CLAUSE_DEFAULTMAP:
-	case OMP_CLAUSE_GANG:
-	case OMP_CLAUSE_WORKER:
-	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_SEQ:
 	  break;
 
@@ -1768,10 +1783,28 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_NUM_GANGS:
 	case OMP_CLAUSE_NUM_WORKERS:
 	case OMP_CLAUSE_VECTOR_LENGTH:
-	  wi->val_only = true;
-	  wi->is_lhs = false;
-	  convert_local_reference_op (&OMP_CLAUSE_OPERAND (clause, 0), &dummy,
-  wi);
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_VECTOR:
+	  /* Several OpenACC clauses have optional arguments.  Check if they
+	 are present.  */
+	  if (OMP_CLAUSE_OPERAND (clause, 0))
+	{
+	  wi->val_only = true;
+	  wi->is_lhs = false;
+	  convert_local_reference_op (&OMP_CLAUSE_OPERAND (clause, 0),
+	  &dummy, wi);
+	}
+
+	  /* The gang clause accepts two arguments.  */
+	  if (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_GANG
+	  && OMP_CLAUSE_GANG_STATIC_EXPR (clause))
+	{
+		wi->val_only = true;
+		wi->is_lhs = false;
+		convert_nonlocal_reference_op
+		  (&OMP_CLAUSE_G

[hsa] Describe grid with target clauses

2015-11-30 Thread Martin Jambor
Hi,

Jakub requested that I move the grid description out of the new fields of
the classes representing gimple omp statements and put it into
special artificial clauses instead.  This patch implements that, with
one target clause per dimension (so up to three clauses) and each one
describing both the grid size and group size along that dimension
(hence the new clause type has two parameters).
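
As a rough picture of the representation described above, the sketch below
chains one clause per dimension and gathers the launch description by walking
the chain.  All names (ToyGridDimClause, ToyLaunchAttrs, collect_launch_attrs)
are hypothetical; the real code uses OMP_CLAUSE__GRIDDIM_ clauses on the
target statement, and this model is an assumption about the general shape
only.

```cpp
#include <cassert>
#include <cstdint>

// Toy clause: one per grid dimension, carrying both sizes.
struct ToyGridDimClause {
  unsigned dimension;            // 0, 1 or 2
  std::uint32_t grid_size;
  std::uint32_t group_size;
  const ToyGridDimClause *next;  // next clause in the chain, or nullptr
};

struct ToyLaunchAttrs {
  unsigned ndim;
  std::uint32_t grid[3];
  std::uint32_t group[3];
};

// Walk the clause chain and gather the launch description.
ToyLaunchAttrs collect_launch_attrs(const ToyGridDimClause *clauses) {
  ToyLaunchAttrs attrs = {};  // zero-initialize all fields
  for (const ToyGridDimClause *c = clauses; c; c = c->next) {
    assert(c->dimension < 3);
    attrs.grid[c->dimension] = c->grid_size;
    attrs.group[c->dimension] = c->group_size;
    if (c->dimension + 1 > attrs.ndim)
      attrs.ndim = c->dimension + 1;
  }
  return attrs;
}
```

Keeping both sizes in a single clause means a dimension is either fully
described or absent, so no separate dimension count has to be stored on the
statement.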

Committed to the branch, I will be preparing a new diff against the
trunk shortly.

Thanks,

Martin


2015-11-30  Martin Jambor  

* gimple.c (gimple_omp_target_init_dimensions): Removed.
* gimple.h (gimple_statement_omp_parallel_layout): Removed fields
dimensions and kernel_dim.
(gimple_omp_target_dimensions): Removed.
(gimple_omp_target_grid_size): Likewise.
(gimple_omp_target_grid_size_ptr): Likewise.
(gimple_omp_target_set_grid_size): Likewise.
(gimple_omp_target_workgroup_size): Likewise.
(gimple_omp_target_workgroup_size_ptr): Likewise.
(gimple_omp_target_set_workgroup_size): Likewise.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
(scan_omp_target): Do not scan kernel_dim.
(region_needs_kernel_p): Use clauses to recognize gridified kernels.
(get_kernel_launch_attributes): Generate launch attributes from
clauses.
(get_target_arguments): Use clauses to recognize gridified kernels.
(expand_target_kernel_body): Likewise.
(attempt_target_gridification): Record grid description into clauses.
* tree-core.h (omp_clause_code): New element OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): New subcode dimension.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
* tree.c (omp_clause_num_ops): Add number of operands of
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Add name of OMP_CLAUSE__GRIDDIM_.
(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
---
 gcc/gimple.c            | 11 ---
 gcc/gimple.h            | 82 -
 gcc/omp-low.c           | 72 ++-
 gcc/tree-core.h         |  9 +-
 gcc/tree-pretty-print.c | 12
 gcc/tree.c              |  5 ++-
 gcc/tree.h              | 11 +++
 7 files changed, 79 insertions(+), 123 deletions(-)

diff --git a/gcc/gimple.c b/gcc/gimple.c
index d876e90..4658f29 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1098,17 +1098,6 @@ gimple_build_omp_target (gimple_seq body, int kind, tree clauses)
   return p;
 }
 
-/* Set dimensions of TARGET to NUM and allocate kernel_dim array of the
-   statement with the appropriate number of elements.  */
-
-void
-gimple_omp_target_init_dimensions (gomp_target *target, size_t num)
-{
-  gcc_assert (num > 0);
-  target->dimensions = num;
-  target->kernel_dim = ggc_cleared_vec_alloc (num);
-}
-
 /* Build a GIMPLE_OMP_TEAMS statement.
 
BODY is the sequence of statements that will be executed.
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 14e6cf6..4c4c799 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -661,21 +661,7 @@ struct GTY((tag("GSS_OMP_PARALLEL_LAYOUT")))
  Shared data argument.  */
   tree data_arg;
 
-  /* TODO: Revisit placement of the following two fields.  On one hand, we
- currently only use them on target construct.  On the other, use on
- parallel construct is also possible in the future.  */
-
   /* [ WORD 11 ] */
-  /* Number of elements in kernel_iter array.  */
-  size_t dimensions;
-
-  /* [ WORD 12 ] */
-  /* If target also contains a GPU kernel, it should be run with the
- following grid sizes.  */
-  struct gimple_omp_target_grid_dim
-* GTY((length ("%h.dimensions"))) kernel_dim;
-
-  /* [ WORD 13 ] */
   /* If set, this statement is part of a gridified kernel, its clauses need to
  be scanned and lowered but the statement should be discarded after
  lowering.  */
@@ -1504,7 +1490,6 @@ gomp_sections *gimple_build_omp_sections (gimple_seq, tree);
 gimple *gimple_build_omp_sections_switch (void);
 gomp_single *gimple_build_omp_single (gimple_seq, tree);
 gomp_target *gimple_build_omp_target (gimple_seq, int, tree);
-void gimple_omp_target_init_dimensions (gomp_target *, size_t);
 gomp_teams *gimple_build_omp_teams (gimple_seq, tree);
 gomp_atomic_load *gimple_build_omp_atomic_load (tree, tree);
 gomp_atomic_store *gimple_build_omp_atomic_store (tree);
@@ -5683,73 +5668,6 @@ gimple_omp_target_set_data_arg (gomp_target *omp_target_stmt,
   omp_target_stmt->data_arg = data_arg;
 }
 
-/* Return the number of dimensions of kernel grid.  */
-
-static inline size_t
-gimple_omp_target_dimensions (gomp_target *omp_target_stmt)
-{
-  return omp_target_stmt->dimensions;
-}
-
-/* Return the size 

-fstrict-aliasing fixes 1/5: propagate -fno-strict-aliasing in the inliner

2015-11-30 Thread Jan Hubicka
Hi,
this is the first patch in the broken-up series.  It adds the logic into
ipa-inline-transform to drop the flag when inlining.  I do it always until
we find a way to make early optimizations safe WRT this transform.

The testcase triggers with GCC 5.0/4.9 too; older compilers pass if
-fstrict-aliasing is used at link time and fail otherwise.

Bootstrapped/regtested x86_64-linux, will commit it after re-testing on
Firefox.

Honza

* ipa-inline-transform.c (inline_call): Drop -fstrict-aliasing when
inlining -fno-strict-aliasing into -fstrict-aliasing body.
* gcc.dg/lto/alias-1_0.c: New testcase.
* gcc.dg/lto/alias-1_1.c: New testcase.
Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision 231081)
+++ ipa-inline-transform.c  (working copy)
@@ -322,6 +322,21 @@ inline_call (struct cgraph_edge *e, bool
   if (DECL_FUNCTION_PERSONALITY (callee->decl))
 DECL_FUNCTION_PERSONALITY (to->decl)
   = DECL_FUNCTION_PERSONALITY (callee->decl);
+  if (!opt_for_fn (callee->decl, flag_strict_aliasing)
+  && opt_for_fn (to->decl, flag_strict_aliasing))
+{
+  struct gcc_options opts = global_options;
+
+  cl_optimization_restore (&opts,
+TREE_OPTIMIZATION (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)));
+  opts.x_flag_strict_aliasing = false;
+  if (dump_file)
+   fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n",
+to->name (), to->order);
+  build_optimization_node (&opts);
+  DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)
+= build_optimization_node (&opts);
+}
 
   /* If aliases are involved, redirect edge to the actual destination and
  possibly remove the aliases.  */
Index: testsuite/gcc.dg/lto/alias-1_0.c
===
--- testsuite/gcc.dg/lto/alias-1_0.c    (revision 0)
+++ testsuite/gcc.dg/lto/alias-1_0.c    (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-lto-do run } */
+/* { dg-lto-options { { -O2 -flto } } } */
+int val;
+
+__attribute__ ((used))
+int *ptr = &val;
+__attribute__ ((used))
+float *ptr2 = (void *)&val;
+
+extern void typefun(float val);
+
+void link_error (void);
+
+int
+main()
+{ 
+  *ptr=1;
+  typefun (0);
+  if (*ptr)
+__builtin_abort ();
+  return 0;
+}
+
Index: testsuite/gcc.dg/lto/alias-1_1.c
===
--- testsuite/gcc.dg/lto/alias-1_1.c    (revision 0)
+++ testsuite/gcc.dg/lto/alias-1_1.c    (revision 0)
@@ -0,0 +1,7 @@
+/* { dg-options "-fno-strict-aliasing" } */
+extern float *ptr2;
+void
+typefun (float val)
+{ 
+  *ptr2=val;
+}
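
The rule applied in inline_call can be modelled in isolation.  The sketch
below is not GCC code: struct fn_opts and merge_strict_aliasing are
hypothetical names, and the real patch goes through opt_for_fn and
build_optimization_node instead.  It only shows the decision: the caller
loses -fstrict-aliasing when a -fno-strict-aliasing callee is inlined into
it, and nothing changes in the other three combinations.

```c
#include <stdbool.h>

/* Hypothetical per-function option record; GCC tracks far more state.  */
struct fn_opts
{
  bool strict_aliasing;
};

/* Model of the inliner rule: when CALLEE was compiled with
   -fno-strict-aliasing and TO (the function being inlined into) with
   -fstrict-aliasing, drop the flag on TO.  Returns true if TO changed.  */
static bool
merge_strict_aliasing (struct fn_opts *to, const struct fn_opts *callee)
{
  if (!callee->strict_aliasing && to->strict_aliasing)
    {
      to->strict_aliasing = false;
      return true;
    }
  return false;
}
```

In the testcase above, typefun stores a float through ptr2 into an int
object, so once its body lands in main, main must be compiled under the
weaker aliasing assumptions as well.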


Re: [PATCH 01/15] Selftest framework (unittests v4)

2015-11-30 Thread Jeff Law

On 11/26/2015 05:37 AM, Bernd Schmidt wrote:

On 11/25/2015 11:47 PM, David Malcolm wrote:

FWIW, the reason I special-cased the linked list was to avoid any
dynamic memory allocation: the ctors run before main, so I wanted to
keep them as simple as possible.


Is there any particular reason for this? C++ doesn't disallow memory
allocation in global constructors, does it?

I'm not aware of any such restriction, but I'm not a C++ guru.

David, what's the reason for avoiding dynamic memory allocation here?





I do want some level of determinism over test ordering, for the sake of
everyone's sanity.  It's probably simplest to either hardcode the order,
or have priority levels.  I favor the former (and right now am leaning
towards a very explicit no-magic approach with no auto-registration,
given the linker issues I've been seeing with auto-registration).


I guess that works too. Certainly explicit function calls are
preferable over #including other C files as a workaround for such a
problem.

My problem with priorities is that it's really just a poor man's
substitute for dependency analysis. And in my experience, it usually
fails.




I still wish others would chime in on the rest of the issues we've
discussed (run to first failure vs. providing elaborate test summaries);
I want to make my preference clear but I don't want to dictate it.

I favor run-all over run-to-first-failure as long as we don't have good
dependency analysis to order the tests.  That in turn tends to imply
that each test ought to have a pass/fail indicator.


If we had good dependency analysis, then run-to-first-failure would be 
my preference.


Jeff
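
For readers following the thread, here is a minimal sketch of the "explicit,
no-magic" variant under discussion: a hard-coded table of tests (so no
constructors and no dynamic allocation), run either to completion or to the
first failure, with a pass/fail line per test.  It is written in C for
brevity and is not the proposed selftest framework; every name in it
(struct selftest, all_tests, run_selftests and the three toy tests) is
hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

typedef int (*test_fn) (void);  /* Returns nonzero on pass.  */

struct selftest
{
  const char *name;
  test_fn fn;
};

/* Three toy tests; the second one fails on purpose.  */
static int test_addition (void) { return 1 + 1 == 2; }
static int test_always_fails (void) { return 0; }
static int test_strings (void) { return sizeof "ab" == 3; }

/* Explicit, hard-coded order: the table is static data, so nothing runs
   before main and nothing is allocated.  */
static const struct selftest all_tests[] = {
  { "addition", test_addition },
  { "always_fails", test_always_fails },
  { "strings", test_strings },
};

/* Run the N tests in TESTS in order, printing a pass/fail indicator for
   each.  If STOP_AT_FIRST_FAILURE is nonzero, stop after the first failing
   test.  Stores the number of tests executed in *N_RUN and returns the
   number of failures.  */
static int
run_selftests (const struct selftest *tests, size_t n,
               int stop_at_first_failure, size_t *n_run)
{
  int failures = 0;
  *n_run = 0;
  for (size_t i = 0; i < n; i++)
    {
      int ok = tests[i].fn ();
      ++*n_run;
      printf ("%s: %s\n", ok ? "PASS" : "FAIL", tests[i].name);
      if (!ok)
        {
          failures++;
          if (stop_at_first_failure)
            break;
        }
    }
  return failures;
}
```

With the table above, run-all executes all three tests and reports one
failure, while run-to-first-failure stops after the second test and never
reaches "strings".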


[RFA] Implement incremental IL linking

2015-11-30 Thread Jan Hubicka
Hi,
this is a polished version of the patch to implement IL-level incremental
linking.  -flinker-output is now documented and can be specified to the GCC
driver.  In this case the plugin gets the option -linker-output-known and
stops attempting to detect the output type from info passed down by the
linker.  I also added documentation for the flag to invoke.texi.

Modulo the testsuite compensation, the rest of the patch is basically
unchanged since the earlier version: lto-wrapper looks for the linker-output
flag and switches to non-WPA mode (because we do not want to execute ltrans
compilations), and the lto frontend configures the compiler to output IL and
possibly a flat lto binary to the object file.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* lto-plugin.c: Document options; add -linker-output-known;
determine when to use rel and when nolto-rel output.

* lto-wrapper.c (run_gcc): Look for -flinker-output=rel also in the
list of options passed from the driver.
* passes.c (ipa_write_summaries): Only modify statements if body
is in memory.
* cgraphunit.c (ipa_passes): Also produce intermediate code when
incrementally linking.
(ipa_passes): Likewise.
* lto-cgraph.c (lto_output_node): When incrementally linking do not
pass down resolution info.
* common.opt (flag_incremental_link): Update info.
* gcc.c (plugin specs): Turn flinker-output=* to
-plugin-opt=-linker-output-known
* toplev.c (compile_file): Also cut compilation when doing incremental
link.
* flag-types.h (enum lto_partition_model): Add
LTO_LINKER_OUTPUT_NOLTOREL.
(invoke.texi): Add -flinker-output docs.

* lang.opt (lto_linker_output): Add nolto-rel.
* lto-lang.c (lto_post_options): Handle LTO_LINKER_OUTPUT_REL
and LTO_LINKER_OUTPUT_NOLTOREL.
(lto_init): Generate lto when doing incremental link.

* gcc.dg/lto/20081120-2_0.c: Add -flinker-output=nolto-rel
* gcc.dg/lto/20090126-1_0.c: Likewise.
* gcc.dg/lto/20091020-2_0.c: Likewise.
* gcc.dg/lto/20081204-2_0.c: Likewise.
* gcc.dg/lto/20091015-1_0.c: Likewise.
* gcc.dg/lto/20090126-2_0.c: Likewise.
* gcc.dg/lto/20090116_0.c: Likewise.
* gcc.dg/lto/20081224_0.c: Likewise.
* gcc.dg/lto/20091027-1_0.c: Likewise.
* gcc.dg/lto/20090219_0.c: Likewise.
* gcc.dg/lto/20081212-1_0.c: Likewise.
* gcc.dg/lto/20091013-1_0.c: Likewise.
* gcc.dg/lto/20081126_0.c: Likewise.
* gcc.dg/lto/20090206-1_0.c: Likewise.
* gcc.dg/lto/20091016-1_0.c: Likewise.
* gcc.dg/lto/20081120-1_0.c: Likewise.
* gcc.dg/lto/20091020-1_0.c: Likewise.
* gcc.dg/lto/20100426_0.c: Likewise.
* gcc.dg/lto/20081204-1_0.c: Likewise.
* gcc.dg/lto/20091014-1_0.c: Likewise.
* g++.dg/lto/20081109-1_0.C: Likewise.
* g++.dg/lto/20100724-1_0.C: Likewise.
* g++.dg/lto/20081204-1_0.C: Likewise.
* g++.dg/lto/pr45679-2_0.C: Likewise.
* g++.dg/lto/20110311-1_0.C: Likewise.
* g++.dg/lto/20090302_0.C: Likewise.
* g++.dg/lto/20081118_0.C: Likewise.
* g++.dg/lto/20091002-2_0.C: Likewise.
* g++.dg/lto/20081120-2_0.C: Likewise.
* g++.dg/lto/20081123_0.C: Likewise.
* g++.dg/lto/20090313_0.C: Likewise.
* g++.dg/lto/pr54625-1_0.c: Likewise.
* g++.dg/lto/pr48354-1_0.C: Likewise.
* g++.dg/lto/20081219_0.C: Likewise.
* g++.dg/lto/pr48042_0.C: Likewise.
* g++.dg/lto/20101015-2_0.C: Likewise.
* g++.dg/lto/pr45679-1_0.C: Likewise.
* g++.dg/lto/20091026-1_0.C: Likewise.
* g++.dg/lto/pr45621_0.C: Likewise.
* g++.dg/lto/20081119-1_0.C: Likewise.
* g++.dg/lto/20101010-4_0.C: Likewise.
* g++.dg/lto/20081120-1_0.C: Likewise.
* g++.dg/lto/20091002-1_0.C: Likewise.
* g++.dg/lto/20091002-3_0.C: Likewise.
* gfortran.dg/lto/20091016-1_0.f90: Likewise.
* gfortran.dg/lto/pr47839_0.f90: Likewise.
* gfortran.dg/lto/pr46911_0.f: Likewise.
* gfortran.dg/lto/20091028-1_0.f90: Likewise.
* gfortran.dg/lto/20091028-2_0.f90: Likewise.
Index: lto-plugin/lto-plugin.c
===
--- lto-plugin/lto-plugin.c (revision 231081)
+++ lto-plugin/lto-plugin.c (working copy)
@@ -27,10 +27,13 @@ along with this program; see the file CO
More information at http://gcc.gnu.org/wiki/whopr/driver.
 
This plugin should be passed the lto-wrapper options and will forward them.
-   It also has 2 options of its own:
+   It also has options at his own:
-debug: Print the command line used to run lto-wrapper.
-nop: Instead of running lto-wrapper, pass the original to the plugin. This
-   only works if the input files are hybrid.  */
+   only works if the input files are hybrid. 
+   -linker-output-known: Do not determine l

Re: [PATCH] fix PR65726

2015-11-30 Thread Jeff Law

On 11/26/2015 11:49 AM, Andreas Tobler wrote:

Hi all,

the attached patch fixes the build issue from this ticket if bootstrap
is disabled.

Tested on x86_64-*-linux* and on x86_64-*-freebsd* with gcc and clang.

Ok for trunk?

And 5.3?

Thanks,
Andreas

2015-11-26  Andreas Tobler  

 PR libffi/65726
 * Makefile.def (lang_env_dependencies): Make libffi depend
 on cxx.
 * Makefile.in: Regenerate.


OK.
jeff


Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

2015-11-30 Thread Jeff Law

On 11/29/2015 09:24 AM, Ajit Kumar Agarwal wrote:


I agree with the above.  To add to it, we only need to calculate the set of
objects (SSA_NAMES) that are live at the birth, i.e. the header, of the loop.
We don't need to calculate liveness through the loop by considering the
live-in and live-out sets of all the basic blocks of the loop.  This is
because the set of objects (SSA_NAMES) that are live-in at the birth or
header of the loop will be live-in at every node in the loop.

If a variable v is live at the header of the loop, then v is live-in at
every node in the loop.  To prove this, consider a loop L with header h such
that the variable v, defined at d, is live-in at h.  Since v is live at h, d
is not part of L.  This follows from the dominance property, i.e. h is
strictly dominated by d.  Furthermore, there exists a path from h to a use of
v which does not go through d.  For every node p of the loop, since the loop
is a strongly connected component of the CFG, there exists a path, consisting
only of nodes of L, from p to h.  Concatenating those two paths proves that v
is live-in and live-out of p.

On top of the live-in set at the birth or header of the loop, as proven
above, we can calculate the live-out set of the exit blocks of the loop and
the live-in set at the destinations of the loop's exit edges.  This accounts
for the liveness outside of the loop.

The above two cases form the basis of a better estimator for register
pressure as far as LICM is concerned.

If you agree with the above, I will add it to the patch for the
register_used estimates, for a better estimate of register pressure for LICM.

Yes, I think we're in agreement.

jeff
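
One way to picture the estimator described above, as a sketch only: treat
each SSA name as one bit, take the union of the names live-in at the loop
header and the names live-in at the destinations of the loop's exit edges,
and count the result.  This is not the proposed patch; the 64-bit set and
the names name_set, count_names and loop_pressure_estimate are assumptions
made for brevity (GCC would use bitmaps over SSA versions).

```c
#include <stdint.h>

/* Each SSA name is modelled as one bit of a 64-bit set.  This is an
   assumption for the sketch, not how GCC stores liveness.  */
typedef uint64_t name_set;

/* Count the names in S.  */
static unsigned
count_names (name_set s)
{
  unsigned n = 0;
  while (s)
    {
      s &= s - 1;  /* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Estimate loop register pressure as the number of distinct names that
   are live-in at the header or live-in at the destination of one of the
   N_EXITS exit edges.  */
static unsigned
loop_pressure_estimate (name_set header_live_in,
                        const name_set *exit_dest_live_in,
                        unsigned n_exits)
{
  name_set live = header_live_in;
  for (unsigned i = 0; i < n_exits; i++)
    live |= exit_dest_live_in[i];
  return count_names (live);
}
```

Because a name that is live-in at the header is live at every node of the
loop, the header set alone covers the names live through the loop; only the
exit destinations need to be consulted in addition.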



Re: [patch] RFC asan support for i?86/x86_64-*freebsd*

2015-11-30 Thread Jeff Law

On 11/29/2015 03:10 PM, Andreas Tobler wrote:

All,

this patch adds support for asan for i?86/x86_64-*freebsd*.

Test results can be found on the list.

These modifications belong only to gcc. There is one modification to
asan/asan_linux.cc, this one is sent upstream. Until this one is in, my
patch is on hold.

One thing to note, FreeBSD does not need to link against -ldl. That is
why I added an extra config check.

But nevertheless I'd like to get some comments on the patch.

Thanks to Jakub and Dan McGregor.

Thanks,
Andreas


2015-11-29  Andreas Tobler  

 * config/i386/i386.h: Define two new macros:
 SUBTARGET_SHADOW_OFFSET_64 and SUBTARGET_SHADOW_OFFSET_32.
 * config/i386/i386.c (ix86_asan_shadow_offset): Use these macros.
 * config/i386/darwin.h: Override the SUBTARGET_SHADOW_OFFSET_64
 macro.
 * config/i386/freebsd.h: Override the SUBTARGET_SHADOW_OFFSET_64
 and the SUBTARGET_SHADOW_OFFSET_32 macro.
 * config/freebsd.h (LIBASAN_EARLY_SPEC): Define.
 (LIBTSAN_EARLY_SPEC): Likewise.
 (LIBLSAN_EARLY_SPEC): Likewise.

2015-11-29  Andreas Tobler  

 * configure.ac: Replace the hard-coded -ldl requirement for
 link_sanitizer_common with a configure time check for -ldl.
 * configure: Regenerate.
 * configure.tgt: Add x86_64- and i?86-*-freebsd* targets.

The configury bits are fine.  Uros would own the review of the x86-specific
changes.


jeff



Re: [PATCH] Fix declaration of pthread-structs in s-osinte-rtems.ads (ada/68169)

2015-11-30 Thread Jan Sommer
Could someone with write access please commit the patch?
The paperwork with the FSF has gone through. If something else is missing, 
please tell me.
I won't be available next week.

Best regards,

   Jan

On Tuesday 24 November 2015, 08:47:49, Jan Sommer wrote:
> It has gone through.
> That was why I resubmitted the patch.
> Joel can confirm. Apparently he is on a respective list and saw my paperwork 
> being cleared.
> 
> Best regards,
> 
>Jan
> 
> On Tuesday 24 November 2015, 07:45:30, Sebastian Huber wrote:
> > Hello Jan,
> > 
> > On 23/11/15 23:15, Jan Sommer wrote:
> > > If someone with commit rights could check and push the patches we might 
> > > get it into the next release.
> > 
> > what is the status of your copyright assignment to the FSF which is 
> > required to integrate changes into GCC?
> > 
> > 
> 



Re: regrename/i386: ROP vs df and stack-regs

2015-11-30 Thread Uros Bizjak
On Mon, Nov 30, 2015 at 10:38 PM, Bernd Schmidt  wrote:
> On 11/27/2015 10:02 AM, Bernd Schmidt wrote:
>>
>> This is a patch for PRs 68471 and 68472, which show problems with the
>> ROP mitigation:
>>   * reg-stack doesn't call df_insn_update when it makes changes, and
>> if df checking is enabled, any subsequent df_analyze call will
>> abort
>>   * Using -mcmodel=medium fails because of a pattern that has lea type
>> and needs its modrm_class overridden.
>>
>> Both of these are fixed in the i386 backend. As a further safety
>> measure, I've added some extra code to regrename to ignore stack regs
>> after regstack_complete - they can't be dealt with anymore.
>>
>> Bootstrapped and tested on x86_64-linux, with -mmitigate-rop forced on.
>> Ok?
>
>
>> PR target/68471
>> PR target/68472
>> * config/i386/i386.c (ix86_mitigate_rop): Don't call
>> compute_bb_for_insn again.  Call df_insn_rescan_all.
>> * config/i386/i386.md (set_got_rex64): Override modrm_class.
>>
>> * regrename.c (build_def_use): Ignore stack regs if
>> regstack_completed.
>>
>> testsuite/
>> * gcc.target/i386/rop1.c: New test.
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 2ac6c25..14c99eb 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -45243,8 +45243,9 @@ ix86_mitigate_rop (void)
>>COPY_HARD_REG_SET (inout_risky, input_risky);
>>IOR_HARD_REG_SET (inout_risky, output_risky);
>>
>> -  compute_bb_for_insn ();
>>df_note_add_problem ();
>> +  /* Fix up what stack-regs did.  */
>> +  df_insn_rescan_all ();
>>df_analyze ();
>>
>>regrename_init (true);
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index a57d165..671580d 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -12418,6 +12418,7 @@
>>"lea{q}\t{_GLOBAL_OFFSET_TABLE_(%%rip), %0|%0,
>> _GLOBAL_OFFSET_TABLE_[rip]}"
>>[(set_attr "type" "lea")
>> (set_attr "length_address" "4")
>> +   (set_attr "modrm_class" "unknown")
>> (set_attr "mode" "DI")])
>>
>>  (define_insn "set_rip_rex64"
>> --- /dev/null   2015-11-23 12:05:22.553607702 +0100
> >> +++ gcc/testsuite/gcc.target/i386/rop1.c   2015-11-24 15:40:04.381086953 +0100
>> @@ -0,0 +1,7 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target lp64 } */
>> +/* { dg-options "-mcmodel=medium -mmitigate-rop" } */
>> +void
>> +foo (void)
>> +{
>> +}
>
>
> Ccing Uros for the i386 bits.

These are OK.

Thanks,
Uros.


Re: regrename/i386: ROP vs df and stack-regs

2015-11-30 Thread Bernd Schmidt

On 11/27/2015 10:02 AM, Bernd Schmidt wrote:

This is a patch for PRs 68471 and 68472, which show problems with the
ROP mitigation:
  * reg-stack doesn't call df_insn_update when it makes changes, and
if df checking is enabled, any subsequent df_analyze call will
abort
  * Using -mcmodel=medium fails because of a pattern that has lea type
and needs its modrm_class overridden.

Both of these are fixed in the i386 backend. As a further safety
measure, I've added some extra code to regrename to ignore stack regs
after regstack_complete - they can't be dealt with anymore.

Bootstrapped and tested on x86_64-linux, with -mmitigate-rop forced on. Ok?



PR target/68471
PR target/68472
* config/i386/i386.c (ix86_mitigate_rop): Don't call
compute_bb_for_insn again.  Call df_insn_rescan_all.
* config/i386/i386.md (set_got_rex64): Override modrm_class.

* regrename.c (build_def_use): Ignore stack regs if regstack_completed.

testsuite/
* gcc.target/i386/rop1.c: New test.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2ac6c25..14c99eb 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45243,8 +45243,9 @@ ix86_mitigate_rop (void)
   COPY_HARD_REG_SET (inout_risky, input_risky);
   IOR_HARD_REG_SET (inout_risky, output_risky);

-  compute_bb_for_insn ();
   df_note_add_problem ();
+  /* Fix up what stack-regs did.  */
+  df_insn_rescan_all ();
   df_analyze ();

   regrename_init (true);
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a57d165..671580d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12418,6 +12418,7 @@
   "lea{q}\t{_GLOBAL_OFFSET_TABLE_(%%rip), %0|%0, _GLOBAL_OFFSET_TABLE_[rip]}"
   [(set_attr "type" "lea")
(set_attr "length_address" "4")
+   (set_attr "modrm_class" "unknown")
(set_attr "mode" "DI")])

 (define_insn "set_rip_rex64"
--- /dev/null   2015-11-23 12:05:22.553607702 +0100
+++ gcc/testsuite/gcc.target/i386/rop1.c   2015-11-24 15:40:04.381086953 +0100
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mcmodel=medium -mmitigate-rop" } */
+void
+foo (void)
+{
+}


Ccing Uros for the i386 bits.


Bernd



Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Ilya Verbin
On Mon, Nov 30, 2015 at 21:49:02 +0100, Jakub Jelinek wrote:
> On Mon, Nov 30, 2015 at 11:29:34PM +0300, Ilya Verbin wrote:
> > You're right, it doesn't deallocate memory on the device if a DSO leaves
> > a nonzero refcount.  And currently the host compiler doesn't set the MSB
> > in host_var_table; it's set only by the accel compiler.  But it's possible
> > to do splay_tree_lookup for each var to determine whether it is linked or
> > not, like in the patch below.
> > Or do you prefer to set the bit in the host compiler too?  It requires
> > lookup_attribute ("omp declare target link") for all vars in the table
> > during compilation, but allows doing splay_tree_lookup at run time only
> > for vars with the MSB set in host_var_table.
> > Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device
> > works only for a DSO; it crashes when an executable leaves a nonzero
> > refcount, because the target device may already be uninitialized from the
> > plugin's __run_exit_handlers (and it is in the case of intelmic), so
> > gomp_exit_data cannot run free_func.
> > Is it possible to add some atexit (...) to libgomp, which will set a
> > shutting_down flag, and just do nothing in gomp_unload_image_from_device
> > if it is set?
> 
> Sorry, I didn't mean you should call gomp_exit_data, what I meant was that
> you perform the same action as would delete(var) do in that case.
> Calling gomp_exit_data e.g. looks it up again etc.
> Supposedly having the MSB in host table too is useful, so if you could
> handle that, it would be nice.  And splay_tree_lookup only if the MSB is
> set.
> So,
> if (!host_data_has_msb_set)
>   splay_tree_remove (&devicep->mem_map, &k);
> else
>   {
>     splay_tree_key n = splay_tree_lookup (&devicep->mem_map, &k);
>     if (n->link_key)
>       {
>         n->refcount = 0;
>         n->link_key = NULL;
>         splay_tree_remove (&devicep->mem_map, n);
>         if (n->tgt->refcount > 1)
>           n->tgt->refcount--;
>         else
>           gomp_unmap_tgt (n->tgt);
>       }
>     else
>       splay_tree_remove (&devicep->mem_map, n);
>   }
> or so.

Ok, but it doesn't solve the issue with doing it for the executable, because
gomp_unmap_tgt (n->tgt) will want to run free_func on uninitialized device.

  -- Ilya


Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 11:29:34PM +0300, Ilya Verbin wrote:
> > This looks wrong, both of these clearly could affect anything with
> > DECL_HAS_VALUE_EXPR_P, not just the link vars.
> > So, if you need to handle the "omp declare target link" vars specially,
> > you should only handle those specially and nothing else.  And please try to
> > explain why.
> 
> Actually these ifndefs are not needed, because assemble_decl never will be
> called by accel compiler for original link vars.  I've added a check into
> output_in_order, but missed a second place where assemble_decl is called -
> symbol_table::output_variables.  So, fixed now.

Great.

> > Do we need to do anything in gomp_unload_image_from_device ?
> > I mean at least in questionable programs that for link vars don't decrement
> > the refcount of the var that replaced the link var to 0 first before
> > dlclosing the library.
> > At least host_var_table[j * 2 + 1] will have the MSB set, so we need to
> > handle it differently.  Perhaps for that case perform a lookup, and if we
> > get something which has link_map non-NULL, first perform as if there is
> > target exit data delete (var) on it first?
> 
> You're right, it doesn't deallocate memory on the device if a DSO leaves a
> nonzero refcount.  And currently the host compiler doesn't set the MSB in
> host_var_table; it's set only by the accel compiler.  But it's possible to
> do splay_tree_lookup for each var to determine whether it is linked or not,
> like in the patch below.
> Or do you prefer to set the bit in the host compiler too?  It requires
> lookup_attribute ("omp declare target link") for all vars in the table
> during compilation, but allows doing splay_tree_lookup at run time only for
> vars with the MSB set in host_var_table.
> Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device
> works only for a DSO; it crashes when an executable leaves a nonzero
> refcount, because the target device may already be uninitialized from the
> plugin's __run_exit_handlers (and it is in the case of intelmic), so
> gomp_exit_data cannot run free_func.
> Is it possible to add some atexit (...) to libgomp, which will set a
> shutting_down flag, and just do nothing in gomp_unload_image_from_device
> if it is set?

Sorry, I didn't mean you should call gomp_exit_data, what I meant was that
you perform the same action as would delete(var) do in that case.
Calling gomp_exit_data e.g. looks it up again etc.
Supposedly having the MSB in host table too is useful, so if you could
handle that, it would be nice.  And splay_tree_lookup only if the MSB is
set.
So,
if (!host_data_has_msb_set)
  splay_tree_remove (&devicep->mem_map, &k);
else
  {
    splay_tree_key n = splay_tree_lookup (&devicep->mem_map, &k);
    if (n->link_key)
      {
        n->refcount = 0;
        n->link_key = NULL;
        splay_tree_remove (&devicep->mem_map, n);
        if (n->tgt->refcount > 1)
          n->tgt->refcount--;
        else
          gomp_unmap_tgt (n->tgt);
      }
    else
      splay_tree_remove (&devicep->mem_map, n);
  }
or so.

Jakub


RE: [PR68001, CilkPlus] Fix for PR68001

2015-11-30 Thread Zamyatin, Igor
> 
> FAIL: obj-c++.dg/property/dotsyntax-11.mm -fgnu-runtime  (test for errors,
> line 51)
> FAIL: obj-c++.dg/property/dotsyntax-11.mm -fgnu-runtime  (test for errors,
> line 56)
> FAIL: obj-c++.dg/property/dotsyntax-11.mm -fgnu-runtime  (test for errors,
> line 59)
> 
> Andreas.

Here is the patch that properly limits GS_ERROR exit only in case of error in 
cilk spawn detection.
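
The control flow this produces for INIT_EXPR can be modelled on its own.
The sketch below is not the GCC code: enum outcome and handle_init_expr are
hypothetical, and the three booleans stand in for fn_contains_cilk_spawn_p,
cilk_detect_spawn_and_unwrap and seen_error.  The point it illustrates is
the short-circuit: is_spawn_detected stays true unless detection actually
ran and returned false, so an unrelated earlier error no longer forces
GS_ERROR.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the modelled INIT_EXPR handling.  */
enum outcome
{
  OUTCOME_ERROR,     /* Corresponds to returning GS_ERROR.  */
  OUTCOME_SPAWN,     /* Spawn found and gimplified.  */
  OUTCOME_CONTINUE   /* Fall through to the normal INIT_EXPR path.  */
};

/* FN_HAS_SPAWN models fn_contains_cilk_spawn_p, DETECT_RESULT models what
   cilk_detect_spawn_and_unwrap would return if called, and ERROR_SEEN
   models seen_error.  */
static enum outcome
handle_init_expr (bool fn_has_spawn, bool detect_result, bool error_seen)
{
  bool is_spawn_detected = true;

  /* Detection only runs, and the flag is only overwritten, when the
     function contains a spawn at all.  */
  if (fn_has_spawn && (is_spawn_detected = detect_result))
    return OUTCOME_SPAWN;

  /* Give up only when detection itself ran, failed, and reported an
     error.  */
  if (!is_spawn_detected && error_seen)
    return OUTCOME_ERROR;

  return OUTCOME_CONTINUE;
}
```

For a function without any spawn, the error path is never taken, which is
the situation in the failing obj-c++ tests quoted above.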

Bootstrapped and regtested on x86_64, ok for trunk?

Thanks,
Igor

cp/Changelog

2015-11-27  Igor Zamyatin  

PR c++/68001
* cp-gimplify.c (cp_gimplify_expr): Limit GS_ERROR only in case of
error in cilk spawn detection.



diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 09ee5ff..3dbbd7f 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -559,6 +559,7 @@ int
 cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 {
   int saved_stmts_are_full_exprs_p = 0;
+  bool is_spawn_detected = true;
   enum tree_code code = TREE_CODE (*expr_p);
   enum gimplify_status ret;
 
@@ -614,12 +615,12 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 25979.  */
 case INIT_EXPR:
   if (fn_contains_cilk_spawn_p (cfun)
- && cilk_detect_spawn_and_unwrap (expr_p))
+ && (is_spawn_detected = cilk_detect_spawn_and_unwrap (expr_p)))
{
  cilk_cp_gimplify_call_params_in_spawned_fn (expr_p, pre_p, post_p);
  return (enum gimplify_status) gimplify_cilk_spawn (expr_p);
}
-  if (seen_error ())
+  if (!is_spawn_detected && seen_error ())
return GS_ERROR;
 
   cp_gimplify_init_expr (expr_p);





Re: [Patch,SLP]: Correction in the comment for SLP vectorization profitable case.

2015-11-30 Thread Jeff Law

On 11/30/2015 02:00 AM, Ajit Kumar Agarwal wrote:

This patch corrects the comment for the SLP vectorization profitable case.

The comment contradicts the condition
vec_outside_cost + vec_inside_cost > scalar_cost.

ChangeLog:
2015-11-30  Ajit Agarwal  

 * tree-vect-slp.c
 (vect_bb_vectorization_profitable_p): Correction in the comment.

OK.  Please install.

Thanks,
Jeff



Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Ilya Verbin
On Mon, Nov 30, 2015 at 13:04:59 +0100, Jakub Jelinek wrote:
> On Fri, Nov 27, 2015 at 07:50:09PM +0300, Ilya Verbin wrote:
> > + /* Most significant bit of the size marks such vars.  */
> > + unsigned HOST_WIDE_INT isize = tree_to_uhwi (size);
> > + isize |= 1ULL << (int_size_in_bytes (const_ptr_type_node) * 8 - 1);
> 
> That supposedly should be BITS_PER_UNIT instead of 8.

Fixed.

> > diff --git a/gcc/varpool.c b/gcc/varpool.c
> > index 36f19a6..cbd1e05 100644
> > --- a/gcc/varpool.c
> > +++ b/gcc/varpool.c
> > @@ -561,17 +561,21 @@ varpool_node::assemble_decl (void)
> >   are not real variables, but just info for debugging and codegen.
> >   Unfortunately at the moment emutls is not updating varpool correctly
> >   after turning real vars into value_expr vars.  */
> > +#ifndef ACCEL_COMPILER
> >if (DECL_HAS_VALUE_EXPR_P (decl)
> >&& !targetm.have_tls)
> >  return false;
> > +#endif
> >  
> >/* Hard register vars do not need to be output.  */
> >if (DECL_HARD_REGISTER (decl))
> >  return false;
> >  
> > +#ifndef ACCEL_COMPILER
> >gcc_checking_assert (!TREE_ASM_WRITTEN (decl)
> >&& TREE_CODE (decl) == VAR_DECL
> >&& !DECL_HAS_VALUE_EXPR_P (decl));
> > +#endif
> 
> This looks wrong, both of these clearly could affect anything with
> DECL_HAS_VALUE_EXPR_P, not just the link vars.
> So, if you need to handle the "omp declare target link" vars specially,
> you should only handle those specially and nothing else.  And please try to
> explain why.

Actually these ifndefs are not needed, because assemble_decl never will be
called by accel compiler for original link vars.  I've added a check into
output_in_order, but missed a second place where assemble_decl is called -
symbol_table::output_variables.  So, fixed now.

> > @@ -1005,13 +1026,18 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
> >for (i = 0; i < num_vars; i++)
> >  {
> >struct addr_pair *target_var = &target_table[num_funcs + i];
> > -  if (target_var->end - target_var->start
> > - != (uintptr_t) host_var_table[i * 2 + 1])
> > +  uintptr_t target_size = target_var->end - target_var->start;
> > +
> > +  /* Most significant bit of the size marks "omp declare target link"
> > +variables.  */
> > +  bool is_link = target_size & (1ULL << (sizeof (uintptr_t) * 8 - 1));
> 
> __CHAR_BIT__ here instead of 8?

Fixed.
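
For reference, the marking scheme discussed in the two hunks above (using
the most significant bit of the size word as a "link" flag, with the bit
position derived from CHAR_BIT rather than a literal 8) can be sketched as
follows.  LINK_BIT, mark_link, is_link and real_size are hypothetical names
for this illustration, not identifiers from libgomp.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* The most significant bit of a uintptr_t, computed from CHAR_BIT so the
   sketch does not assume 8-bit bytes.  */
#define LINK_BIT ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))

/* Tag SIZE as belonging to an "omp declare target link" variable.  */
static uintptr_t
mark_link (uintptr_t size)
{
  return size | LINK_BIT;
}

/* Return true if SIZE carries the link tag.  */
static bool
is_link (uintptr_t size)
{
  return (size & LINK_BIT) != 0;
}

/* Strip the tag and return the plain size.  */
static uintptr_t
real_size (uintptr_t size)
{
  return size & ~LINK_BIT;
}
```

The scheme relies on no real object being large enough to need the top bit
of its size, so the tag and the size can share one table entry.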

> > @@ -1019,7 +1045,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
> >k->host_end = k->host_start + (uintptr_t) host_var_table[i * 2 + 1];
> >k->tgt = tgt;
> >k->tgt_offset = target_var->start;
> > -  k->refcount = REFCOUNT_INFINITY;
> > +  k->refcount = is_link ? REFCOUNT_LINK : REFCOUNT_INFINITY;
> >k->async_refcount = 0;
> >array->left = NULL;
> >array->right = NULL;
> 
> Do we need to do anything in gomp_unload_image_from_device ?
> I mean at least in questionable programs that for link vars don't decrement
> the refcount of the var that replaced the link var to 0 first before
> dlclosing the library.
> At least host_var_table[j * 2 + 1] will have the MSB set, so we need to
> handle it differently.  Perhaps for that case perform a lookup, and if we
> get something which has link_map non-NULL, first perform as if there is
> target exit data delete (var) on it first?

You're right, it doesn't deallocate memory on the device if the DSO leaves a
nonzero refcount.  Currently the host compiler doesn't set the MSB in
host_var_table; only the accel compiler sets it.  But it's possible to do a
splay_tree_lookup for each var to determine whether it is linked or not, as in
the patch below.
Or do you prefer to set the bit in the host compiler too?  That requires
lookup_attribute ("omp declare target link") for all vars in the table during
compilation, but makes it possible to do splay_tree_lookup at run time only for
vars with the MSB set in host_var_table.
Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device works
only for a DSO; it crashes when an executable leaves a nonzero refcount, because
the target device may already be uninitialized from the plugin's
__run_exit_handlers (and it is in the case of intelmic), so gomp_exit_data
cannot run free_func.
Is it possible to add some atexit (...) to libgomp that sets a shutting_down
flag, and just do nothing in gomp_unload_image_from_device if it is set?
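
A rough sketch of the atexit idea, under the assumption that a single process-wide flag is enough.  All names here (mark_shutting_down, register_shutdown_flag, unload_image) are hypothetical stand-ins and not libgomp functions; only atexit itself is a real C library call.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names; libgomp's real teardown code differs.  */
static bool shutting_down = false;

/* Exit handler: record that process teardown has started.  */
static void
mark_shutting_down (void)
{
  shutting_down = true;
}

/* Register the handler once, e.g. from library initialization.
   Returns 0 on success, like atexit.  */
static int
register_shutdown_flag (void)
{
  return atexit (mark_shutting_down);
}

/* Stand-in for gomp_unload_image_from_device: returns true if device
   cleanup was attempted, false if it was skipped because the process
   is already exiting.  */
static bool
unload_image (void)
{
  if (shutting_down)
    return false;
  /* ... release device-side mappings here ... */
  return true;
}
```

Note that atexit handlers run in reverse order of registration, so whether the flag is set before the plugin tears down the device would still depend on registration order.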


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 369574f..b73caa1 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -822,6 +822,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_simd_attribute, false },
   { "omp declare target", 0, 0, true, false, false,
  handle_omp_declare_target_attribute, false },
+  { "omp declare target link", 0, 0, true, false, false,
+  

[PATCH 2/2] [graphite] check for ISL generated code that leads to division by zero

2015-11-30 Thread Sebastian Pop
We used to generate modulo and division by zero because ISL uses big numbers
that translate to zero in modular arithmetic.  The patch also improves error
handling and bails out early in case of wrong code gen.
---
 gcc/graphite-isl-ast-to-gimple.c | 85 +++-
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 16cb5fa..bfce316 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -502,7 +502,7 @@ private:
 tree
 translate_isl_ast_to_gimple::
 gcc_expression_from_isl_ast_expr_id (tree type,
-__isl_keep isl_ast_expr *expr_id,
+__isl_take isl_ast_expr *expr_id,
 ivs_params &ip)
 {
   gcc_assert (isl_ast_expr_get_type (expr_id) == isl_ast_expr_id);
@@ -550,8 +550,13 @@ binary_op_to_tree (tree type, __isl_take isl_ast_expr 
*expr, ivs_params &ip)
   tree tree_lhs_expr = gcc_expression_from_isl_expression (type, arg_expr, ip);
   arg_expr = isl_ast_expr_get_op_arg (expr, 1);
   tree tree_rhs_expr = gcc_expression_from_isl_expression (type, arg_expr, ip);
+
   enum isl_ast_op_type expr_type = isl_ast_expr_get_op_type (expr);
   isl_ast_expr_free (expr);
+
+  if (codegen_error)
+return NULL_TREE;
+
   switch (expr_type)
 {
 case isl_ast_op_add:
@@ -564,15 +569,43 @@ binary_op_to_tree (tree type, __isl_take isl_ast_expr 
*expr, ivs_params &ip)
   return fold_build2 (MULT_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_div:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (EXACT_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_pdiv_q:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (TRUNC_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_pdiv_r:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (TRUNC_MOD_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_fdiv_q:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (FLOOR_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_and:
@@ -620,6 +653,9 @@ ternary_op_to_tree (tree type, __isl_take isl_ast_expr 
*expr, ivs_params &ip)
   tree tree_third_expr
 = gcc_expression_from_isl_expression (type, arg_expr, ip);
   isl_ast_expr_free (expr);
+
+  if (codegen_error)
+return NULL_TREE;
   return fold_build3 (COND_EXPR, type, tree_first_expr,
  tree_second_expr, tree_third_expr);
 }
@@ -635,7 +671,7 @@ unary_op_to_tree (tree type, __isl_take isl_ast_expr *expr, 
ivs_params &ip)
   isl_ast_expr *arg_expr = isl_ast_expr_get_op_arg (expr, 0);
   tree tree_expr = gcc_expression_from_isl_expression (type, arg_expr, ip);
   isl_ast_expr_free (expr);
-  return fold_build1 (NEGATE_EXPR, type, tree_expr);
+  return codegen_error ? NULL_TREE : fold_build1 (NEGATE_EXPR, type, 
tree_expr);
 }
 
 /* Converts an isl_ast_expr_op expression E with unknown number of arguments
@@ -661,11 +697,25 @@ nary_op_to_tree (tree type, __isl_take isl_ast_expr 
*expr, ivs_params &ip)
 }
   isl_ast_expr *arg_expr = isl_ast_expr_get_op_arg (expr, 0);
   tree res = gcc_expression_from_isl_expression (type, arg_expr, ip);
+
+  if (codegen_error)
+{
+  isl_ast_expr_free (expr);
+  return NULL_TREE;
+}
+
   int i;
   for (i = 1; i < isl_ast_expr_get_op_n_arg (expr); i++)
 {
   arg_expr = isl_ast_expr_get_op_arg (expr, i);
   tree t = gcc_expression_from_isl_expression (type, arg_expr, ip);
+
+  if (codegen_error)
+   {
+ isl_ast_expr_free (expr);
+ return NULL_TREE;
+   }
+
   res = fold_build2 (op_code, type, res, t);
 }
   isl_ast_expr_free (expr);
@@ -680,6 +730,12 @@ translate_isl_ast_to_gimple::
 gcc_expression_from_isl_expr_op (tree type, __isl_take isl_ast_expr *expr,
 ivs_params &ip)
 {
+  if (codegen_error)
+{
+  isl_ast_expr_free (expr);
+  return NULL_TREE;
+}
+
   gcc_assert (isl_ast_expr_get_type (expr) =
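
The failure mode described in this patch can be reproduced outside GCC.  The following is a small sketch, assuming a 64-bit unsigned type stands in for the tree type: 2^64 reduces to 0, so a divisor computed in arbitrary precision can become zero once folded.  The helper names pow2_mod_2_64 and checked_div are illustrative; checked_div only mirrors the spirit of the integer_zerop / codegen_error check and is not the GCC code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduce 2^exp modulo 2^64 by repeated doubling; unsigned wraparound
   is well defined in C.  */
static uint64_t
pow2_mod_2_64 (unsigned exp)
{
  uint64_t v = 1;
  unsigned i;
  for (i = 0; i < exp; i++)
    v *= 2;
  return v;
}

/* Hypothetical analogue of the codegen_error check: refuse to divide
   when the divisor has folded to zero, instead of emitting the
   division.  */
static bool
checked_div (uint64_t lhs, uint64_t rhs, uint64_t *out)
{
  if (rhs == 0)
    return false;
  *out = lhs / rhs;
  return true;
}
```

With these helpers, pow2_mod_2_64 (64) yields zero, and checked_div reports failure for it rather than dividing.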

[PATCH 1/2] [graphite] always print parameter names as P_{SSA_NAME_VERSION}

2015-11-30 Thread Sebastian Pop
---
 gcc/graphite-isl-ast-to-gimple.c  |  4 ++--
 gcc/graphite-scop-detection.c | 31 ---
 gcc/graphite-sese-to-poly.c   | 16 +++-
 gcc/testsuite/gcc.dg/graphite/pr35356-1.c |  2 +-
 4 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 33423dd..16cb5fa 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -2220,7 +2220,7 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_args 
(basic_block old_bb,
   get_loc (old_name));
   if (dump_file)
{
- fprintf (dump_file, "[codegen] Adding loop-closed phi: ");
+ fprintf (dump_file, "[codegen] Adding loop close phi: ");
  print_gimple_stmt (dump_file, new_close_phi, 0, 0);
}
 
@@ -2265,7 +2265,7 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_nodes 
(basic_block old_bb,
basic_block new_bb)
 {
   if (dump_file)
-fprintf (dump_file, "[codegen] copying loop closed phi nodes in bb_%d.\n",
+fprintf (dump_file, "[codegen] copying loop close phi nodes in bb_%d.\n",
 new_bb->index);
   /* Loop close phi nodes should have only one argument.  */
   gcc_assert (1 == EDGE_COUNT (old_bb->preds));
diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index 1f8fc76..2f4231a 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -382,7 +382,7 @@ canonicalize_loop_closed_ssa (loop_p loop)
   if (single_pred_p (bb))
 {
   e = split_block_after_labels (bb);
-  DEBUG_PRINT (dp << "\nSplitting bb_" << bb->index);
+  DEBUG_PRINT (dp << "Splitting bb_" << bb->index << ".\n");
   make_close_phi_nodes_unique (e->src);
 }
   else
@@ -391,7 +391,7 @@ canonicalize_loop_closed_ssa (loop_p loop)
   basic_block close = split_edge (e);
 
   e = single_succ_edge (close);
-  DEBUG_PRINT (dp << "\nSplitting edge (" << e->src->index << ","
+  DEBUG_PRINT (dp << "Splitting edge (" << e->src->index << ","
  << e->dest->index << ")\n");
 
   for (psi = gsi_start_phis (bb); !gsi_end_p (psi); gsi_next (&psi))
@@ -846,7 +846,7 @@ scop_detection::merge_sese (sese_l first, sese_l second) 
const
combined.exit = single_succ_edge (imm_succ);
   else
{
- DEBUG_PRINT (dp << "\n[scop-detection-fail] Discarding SCoP because "
+ DEBUG_PRINT (dp << "[scop-detection-fail] Discarding SCoP because "
  << "no single exit (empty succ) for sese exit";
   print_sese (dump_file, combined));
  return invalid_sese;
@@ -870,7 +870,7 @@ scop_detection::build_scop_depth (sese_l s, loop_p loop)
   if (!loop)
 return s;
 
-  DEBUG_PRINT (dp << "\n[Depth loop_" << loop->num << "]");
+  DEBUG_PRINT (dp << "[Depth loop_" << loop->num << "]\n");
   s = build_scop_depth (s, loop->inner);
 
   sese_l s2 = merge_sese (s, get_sese (loop));
@@ -895,7 +895,7 @@ scop_detection::build_scop_breadth (sese_l s1, loop_p loop)
 {
   if (!loop)
 return s1;
-  DEBUG_PRINT (dp << "\n[Breadth loop_" << loop->num << "]");
+  DEBUG_PRINT (dp << "[Breadth loop_" << loop->num << "]\n");
   gcc_assert (s1);
 
   loop_p l = loop;
@@ -981,7 +981,7 @@ scop_detection::loop_is_valid_scop (loop_p loop, sese_l 
scop) const
   if (loop_body_is_valid_scop (loop, scop))
 {
   DEBUG_PRINT (dp << "[valid-scop] loop_" << loop->num
- << "is a valid scop.\n");
+ << " is a valid scop.\n");
   return true;
 }
   return false;
@@ -1013,15 +1013,15 @@ scop_detection::add_scop (sese_l s)
   /* Do not add scops with only one loop.  */
   if (region_has_one_loop (s))
 {
-  DEBUG_PRINT (dp << "\n[scop-detection-fail] Discarding one loop SCoP";
+  DEBUG_PRINT (dp << "[scop-detection-fail] Discarding one loop SCoP.\n";
   print_sese (dump_file, s));
   return;
 }
 
   if (get_exit_bb (s) == EXIT_BLOCK_PTR_FOR_FN (cfun))
 {
-  DEBUG_PRINT (dp << "\n[scop-detection-fail] "
- << "Discarding SCoP exiting to return";
+  DEBUG_PRINT (dp << "[scop-detection-fail] "
+ << "Discarding SCoP exiting to return.";
   print_sese (dump_file, s));
   return;
 }
@@ -1033,7 +1033,7 @@ scop_detection::add_scop (sese_l s)
   remove_intersecting_scops (s);
 
   scops.safe_push (s);
-  DEBUG_PRINT (dp << "\nAdding SCoP "; print_sese (dump_file, s));
+  DEBUG_PRINT (dp << "Adding SCoP "; print_sese (dump_file, s));
 }
 
 /* Return true when a statement in SCOP cannot be represented by Graphite.
@@ -1047,7 +1047,7 @@ scop_detection::harmful_stmt_in_region (sese_l scop) const
   basic_block exit_bb = get_exit_bb (scop);
   basic_block entry_bb = get_entry_bb (scop);
 
-  DEBUG_PRINT (dp << "\n[checking-harmful-bbs]

Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Andreas Tobler

On 30.11.15 17:22, Jakub Jelinek wrote:

On Mon, Nov 30, 2015 at 05:17:29PM +0100, Bernd Schmidt wrote:

On 11/30/2015 01:12 PM, Andreas Tobler wrote:

On 30.11.15 11:28, Bernd Schmidt wrote:

On 11/29/2015 08:32 PM, Andreas Tobler wrote:

-/* { dg-do run { target { *-*-linux* } } } */
+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */


I see a patch from you to add asan support to x86 freebsd, but what
about other architectures?


You mean because of the wildcard? I'll add them as I have time to port
them.

For now they are UNSUPPORTED.


Is that how they show up, or do you get FAILs on other FreeBSDs?


This is inside of asan.exp, which is guarded with
check_effective_target_fsanitize_address
and therefore should not be run at all on non-asan targets.


It manifests this way:

/usr/local/bin/ld: cannot find libasan_preinit.o: No such file or directory
/usr/local/bin/ld: cannot find -lasan
collect2: error: ld returned 1 exit status

Then it bails out and the asan tests are skipped.

...
testsuite/gcc.dg/asan/asan.exp completed in 1 seconds
...

There is no UNSUPPORTED in the log file.


I think the testsuite changes are fine, but it IMHO doesn't make sense to
commit it until the FreeBSD asan support lands (which is dependent on
the upstream libsanitizer change I believe).  Once it happens, it can be
cherry-picked from there, the config/i386 part looks reasonable.


I agree that it doesn't make much sense to commit for the public, but 
I'd have a patch less on the table ;)


But no problem at all.

This is the cherry I'd like to pick once it has landed :)

http://reviews.llvm.org/D15049

The part for lib/asan/asan_linux.cc.

Thanks for the comments!
Andreas


Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-11-30 Thread Florian Weimer
On 11/27/2015 06:55 PM, Eric Botcazou wrote:

> There is no official ABI for Ada so I guess that's not really a problem as 
> long as it's documented on https://gcc.gnu.org/gcc-5/changes.html.

It's still surprising to make such a far-reaching change in a minor
release, I think.

Florian



Re: [OpenACC 0/7] host_data construct

2015-11-30 Thread Julian Brown
On Thu, 19 Nov 2015 16:57:23 +0100
Jakub Jelinek  wrote:

> If it is unclear, I think disallowing acc {parallel,kernels} inside of
> acc host_data might be too big hammer, but perhaps just erroring out
> or warning during gimplification that if you (explicitly or
> implicitly) try to map a var that is in use_device clause in some
> outer context, it is either wrong, unsupported or will not do what
> users think?

I think we can only assume that trying to map a variable declared in
a surrounding use_device clause is undefined behaviour. I haven't had
any response to my questions about host_data & deviceptr on the OpenACC
list.

> > #pragma acc host_data use_device(x)
> > {
> >   target_primitive(x);
> >   #pragma acc parallel deviceptr(x)
> >   {
> > ...
> >   }
> > }
> 
> Is deviceptr as above meant to work?  That is the OpenACC counterpart
> of is_device_ptr, right?  If yes, then I'd suggest just warning if you
> try to implicitly or explicitly map something use_device in outer
> contexts, and just make sure you don't ICE on the cases where you
> warn. If the standard does not say what it means, then it is
> unspecified behavior...

A problem with deviceptr, unlike is_device_ptr, is that it turns out to
be defined only to work with pointers, not arrays (OpenACC 2.0a
2.6.5.2), and there are no rules describing the latter decaying to the
former. So at least if 'x' is an array, it appears the answer is "no".

So, the attached patch disallows (via raising an error):

* Variables being declared in explicit mapping clauses that are
  declared in enclosing host_data regions.

* Variables being implicitly used (mapped) in offloaded regions that
  are declared in enclosing host_data regions.

It's otherwise equivalent to the previously-posted version, but without
the hacks to {maybe_,}lookup_decl_in_outer_ctx. I added checks for the
above conditions during gimplification, which seemed to be about the
same phase that other similar kinds of errors are diagnosed.

Tests look OK (libgomp/gcc/g++/libstdc++), and the new ones pass.

OK for mainline?

Thanks,

Julian

ChangeLog

Julian Brown  
Cesar Philippidis  
James Norris  

gcc/
* c-family/c-pragma.c (oacc_pragmas): Add PRAGMA_OACC_HOST_DATA.
* c-family/c-pragma.h (pragma_kind): Add PRAGMA_OACC_HOST_DATA.
(pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_USE_DEVICE.
* c/c-parser.c (c_parser_omp_clause_name): Add use_device support.
(c_parser_oacc_clause_use_device): New function.
(c_parser_oacc_all_clauses): Add use_device support.
(OACC_HOST_DATA_CLAUSE_MASK): New macro.
(c_parser_oacc_host_data): New function.
(c_parser_omp_construct): Add host_data support.
* c/c-tree.h (c_finish_oacc_host_data): Add prototype.
* c/c-typeck.c (c_finish_oacc_host_data): New function.
(c_finish_omp_clauses): Add use_device support.
* cp/cp-tree.h (finish_oacc_host_data): Add prototype.
* cp/parser.c (cp_parser_omp_clause_name): Add use_device support.
(cp_parser_oacc_all_clauses): Add use_device support.
(OACC_HOST_DATA_CLAUSE_MASK): New macro.
(cp_parser_oacc_host_data): New function.
(cp_parser_omp_construct): Add host_data support.
(cp_parser_pragma): Add host_data support.
* cp/semantics.c (finish_omp_clauses): Add use_device support.
(finish_oacc_host_data): New function.
* gimple-pretty-print.c (dump_gimple_omp_target): Add host_data
support.
* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_HOST_DATA.
(is_gimple_omp_oacc): Add support for above.
* gimplify.c (omp_region_type): Add ORT_ACC_HOST_DATA.
(omp_notice_variable): Diagnose undefined implicit uses of
use_device variables in offloaded regions.
(gimplify_scan_omp_clauses): Add host_data, use_device
support. Diagnose undefined mapping of use_device variables in
OpenACC clauses.
(gimplify_omp_workshare): Add host_data support.
(gimplify_expr): Likewise.
* omp-builtins.def (BUILT_IN_GOACC_HOST_DATA): New.
* omp-low.c (lookup_decl_in_outer_ctx)
(maybe_lookup_decl_in_outer_ctx): Add optional argument to skip
host_data regions.
(scan_sharing_clauses): Support use_device.
(check_omp_nesting_restrictions): Support host_data.
(expand_omp_target): Support host_data.
(lower_omp_target): Skip over outer host_data regions when looking
up decls. Support use_device.
(make_gimple_omp_edges): Support host_data.
* tree-nested.c (convert_nonlocal_omp_clauses): Add use_device
clause.

libgomp/
* oacc-parallel.c (GOACC_host_data): New function.
* libgomp.map (GOACC_host_data): Add to GOACC_2.0.1.
* testsuite/libgomp.oacc-c-c++-common/host_data-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-5.c: New test.
* testsui

[openacc] fortran loop clauses and splitting

2015-11-30 Thread Cesar Philippidis
This patch contains the following bug fixes:

 * Teaches gfortran to accept both num and static gang arguments inside
   the same clause, e.g. gang(num:10, static:30). Currently, gfortran only
   allows one of those arguments to appear in a gang clause.

 * Makes the diagnostics reported by resolve_oacc_positive_int_expr more
   accurate for worker and vector clauses.

 * Updates how combined loops are split to account for the renamed gang
   clause members in gfc_omp_clauses.  Also corrects a bug that Tom
   discovered in the C front end where combined reductions were being
   attached to kernels and parallel constructs. Now, they are only
   associated with the split acc loop.
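
To illustrate the first item, here is a standalone sketch of a parser that accepts num and static in either order within one gang clause.  It is not the gfortran matcher: integer literals stand in for expressions, struct gang_args and parse_gang_args are hypothetical names, and rejecting a repeated num argument is an assumption (the patch text visible here only shows the repeated-static check).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the real front end stores gfc_expr trees.  */
struct gang_args
{
  bool has_num, has_static, static_star;
  long num, static_val;
};

static const char *
skip_ws (const char *p)
{
  while (isspace ((unsigned char) *p))
    p++;
  return p;
}

/* Parse the text between the parentheses of gang(...), for example
   "num:10, static:30" or "static:*".  Returns false on a malformed or
   duplicated argument.  */
static bool
parse_gang_args (const char *p, struct gang_args *out)
{
  char *end;

  memset (out, 0, sizeof *out);
  for (;;)
    {
      p = skip_ws (p);
      if (strncmp (p, "static", 6) == 0 && *skip_ws (p + 6) == ':')
        {
          if (out->has_static)
            return false;
          out->has_static = true;
          p = skip_ws (skip_ws (p + 6) + 1);
          if (*p == '*')
            {
              out->static_star = true;
              p++;
            }
          else
            {
              out->static_val = strtol (p, &end, 10);
              if (end == p)
                return false;
              p = end;
            }
        }
      else
        {
          if (out->has_num)
            return false;
          /* "num:" is optional: a bare value also means num.  */
          if (strncmp (p, "num", 3) == 0 && *skip_ws (p + 3) == ':')
            p = skip_ws (skip_ws (p + 3) + 1);
          out->num = strtol (p, &end, 10);
          if (end == p)
            return false;
          out->has_num = true;
          p = end;
        }
      /* After each argument, expect either the end or a comma.  */
      p = skip_ws (p);
      if (*p == '\0')
        return true;
      if (*p != ',')
        return false;
      p++;
    }
}
```

Looping over arguments, rather than returning after the first match as the old matcher did, is what allows both arguments to appear in one clause.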

Is this OK for trunk?

Cesar
2015-11-30  Cesar Philippidis  

	gcc/fortran/
	* dump-parse-tree.c (show_omp_clauses): Handle optional num and static
	arguments for the gang clause.
	* gfortran.h (gfc_omp_clauses): Rename gang_expr as gang_num_expr.
	Add gang_static_expr.
	* openmp.c (gfc_free_omp_clauses): Update to free gang_num_expr and
	gang_static_expr.
	(match_oacc_clause_gang): Update to support both num and static in
	the same clause.
	(resolve_omp_clauses): Formatting.  Also handle gang_num_expr and
	gang_static_expr.
	(resolve_oacc_params_in_parallel): New const char arg argument.
	Use it to report more accurate gang, worker and vector clause errors.
	(resolve_oacc_loop_blocks): Update calls to
	resolve_oacc_params_in_parallel.
	* trans-openmp.c (gfc_trans_omp_clauses): Update the gimplification of
	the gang clause.
	(gfc_trans_oacc_combined_directive): Make use of gang_num_expr and
	gang_static_expr.  Remove OMP_LIST_REDUCTION from construct_clauses.

	gcc/testsuite/
	* gfortran.dg/goacc/gang-static.f95: Add tests for gang num arguments.
	* gfortran.dg/goacc/loop-2.f95: Update expected diagnostics.
	* gfortran.dg/goacc/loop-6.f95: Likewise.
	* gfortran.dg/goacc/loop-7.f95: New test.
	* gfortran.dg/goacc/reduction-2.f95: New test.

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 48476af..f9abf40 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1146,10 +1146,24 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
   if (omp_clauses->gang)
 {
   fputs (" GANG", dumpfile);
-  if (omp_clauses->gang_expr)
+  if (omp_clauses->gang_num_expr || omp_clauses->gang_static_expr)
 	{
 	  fputc ('(', dumpfile);
-	  show_expr (omp_clauses->gang_expr);
+	  if (omp_clauses->gang_num_expr)
+	{
+	  fprintf (dumpfile, "num:");
+	  show_expr (omp_clauses->gang_num_expr);
+	}
+	  if (omp_clauses->gang_num_expr && omp_clauses->gang_static)
+	fputc (',', dumpfile);
+	  if (omp_clauses->gang_static)
+	{
+	  fprintf (dumpfile, "static:");
+	  if (omp_clauses->gang_static_expr)
+		show_expr (omp_clauses->gang_static_expr);
+	  else
+		fputc ('*', dumpfile);
+	}
 	  fputc (')', dumpfile);
 	}
 }
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 5487c93..90b03ef 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1226,7 +1226,8 @@ typedef struct gfc_omp_clauses
 
   /* OpenACC. */
   struct gfc_expr *async_expr;
-  struct gfc_expr *gang_expr;
+  struct gfc_expr *gang_static_expr;
+  struct gfc_expr *gang_num_expr;
   struct gfc_expr *worker_expr;
   struct gfc_expr *vector_expr;
   struct gfc_expr *num_gangs_expr;
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index a07cee1..2941ad4 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -77,7 +77,8 @@ gfc_free_omp_clauses (gfc_omp_clauses *c)
   gfc_free_expr (c->thread_limit);
   gfc_free_expr (c->dist_chunk_size);
   gfc_free_expr (c->async_expr);
-  gfc_free_expr (c->gang_expr);
+  gfc_free_expr (c->gang_num_expr);
+  gfc_free_expr (c->gang_static_expr);
   gfc_free_expr (c->worker_expr);
   gfc_free_expr (c->vector_expr);
   gfc_free_expr (c->num_gangs_expr);
@@ -395,21 +396,41 @@ cleanup:
 static match
 match_oacc_clause_gang (gfc_omp_clauses *cp)
 {
-  if (gfc_match_char ('(') != MATCH_YES)
+  match ret = MATCH_YES;
+
+  if (gfc_match (" ( ") != MATCH_YES)
 return MATCH_NO;
-  if (gfc_match (" num :") == MATCH_YES)
-{
-  cp->gang_static = false;
-  return gfc_match (" %e )", &cp->gang_expr);
-}
-  if (gfc_match (" static :") == MATCH_YES)
+
+  /* The gang clause accepts two optional arguments, num and static.
+ The num argument may either be explicit (num: ) or
+ implicit without ( without num:).  */
+
+  while (ret == MATCH_YES)
 {
-  cp->gang_static = true;
-  if (gfc_match (" * )") != MATCH_YES)
-	return gfc_match (" %e )", &cp->gang_expr);
-  return MATCH_YES;
+  if (gfc_match (" static :") == MATCH_YES)
+	{
+	  if (cp->gang_static)
+	return MATCH_ERROR;
+	  else
+	cp->gang_static = true;
+	  if (gfc_match_char ('*') == MATCH_YES)
+	cp->gang_static_expr = NULL;
+	  else if (gfc_match (" %e ", &cp->gang_static_expr) != MATCH_YES)
+	return MATCH_ERROR;
+	}
+  

Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

On 30/11/15 17:48, Jakub Jelinek wrote:

On Mon, Nov 30, 2015 at 05:36:25PM +0100, Tom de Vries wrote:

+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init (&results[0], &pData[0]);
+
+#pragma omp parallel for
+  for (int idx = 0; idx < (int)nEvents; idx++)
+results[idx] = coeff * pData[idx];


Could you please add another testcase, where you have say pData
and some other pointer that init sets to alias with pData, and verify
that such loop (would need to be say normal loop inside #pragma omp single
or master) is not vectorized?


I've:
- added a simpler (not vectorizer-based) version of the testcase as
  pr46032-2.c, and
- copied pr46032-2.c to pr46032-3.c and modified it such that two
  pointers are aliasing
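
As a plain-C illustration of what the two tests distinguish, the sketch below runs the same loop body without OpenMP.  The function names are illustrative and not part of the testsuite.  When the pointers are distinct, the value stored through cp is known to be 1; when ap and bp alias, the store through bp overwrites it and the load must be kept.

```c
/* Same loop body as the pr46032 tests, without OpenMP.  */
static int
run_loop (int *ap, int *bp, int *cp, unsigned n)
{
  unsigned idx;
  for (idx = 0; idx < n; idx++)
    {
      ap[idx] = 1;
      bp[idx] = 2;
      cp[idx] = ap[idx];
    }
  return cp[0];
}

/* Distinct arrays, as in pr46032-2.c: the result is 1.  */
static int
distinct_case (void)
{
  int a[2], b[2], c[2];
  return run_loop (a, b, c, 2);
}

/* ap and bp both point into a, as in pr46032-3.c: the result is 2.  */
static int
aliasing_case (void)
{
  int a[2], c[2];
  return run_loop (a, a, c, 2);
}
```

This matches the dump scans in the tests: pr46032-2.c expects two "= 1" stores and no load, while pr46032-3.c expects the load to remain.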

Committed to trunk.

Thanks,
- Tom

Add gcc.dg/pr46032-{2,3}.c test-cases

2015-11-30  Tom de Vries  

	* gcc.dg/pr46032-2.c: New test.
	* gcc.dg/pr46032-3.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032-2.c | 29 +
 gcc/testsuite/gcc.dg/pr46032-3.c | 28 
 2 files changed, 57 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/pr46032-2.c b/gcc/testsuite/gcc.dg/pr46032-2.c
new file mode 100644
index 000..e110880
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -std=c99 -fipa-pta -fdump-tree-optimized" } */
+
+#define N 2
+
+int
+foo (void)
+{
+  int a[N], b[N], c[N];
+  int *ap = &a[0];
+  int *bp = &b[0];
+  int *cp = &c[0];
+
+#pragma omp parallel for
+  for (unsigned int idx = 0; idx < N; idx++)
+{
+  ap[idx] = 1;
+  bp[idx] = 2;
+  cp[idx] = ap[idx];
+}
+
+  return *cp;
+}
+
+/* { dg-final { scan-tree-dump-times "\\] = 1;" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = 2;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = _\[0-9\]*;" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = " 3 "optimized" } } */
+
diff --git a/gcc/testsuite/gcc.dg/pr46032-3.c b/gcc/testsuite/gcc.dg/pr46032-3.c
new file mode 100644
index 000..a4af7ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -std=c99 -fipa-pta -fdump-tree-optimized" } */
+
+#define N 2
+
+int
+foo (void)
+{
+  int a[N], c[N];
+  int *ap = &a[0];
+  int *bp = &a[0];
+  int *cp = &c[0];
+
+#pragma omp parallel for
+  for (unsigned int idx = 0; idx < N; idx++)
+{
+  ap[idx] = 1;
+  bp[idx] = 2;
+  cp[idx] = ap[idx];
+}
+
+  return *cp;
+}
+
+/* { dg-final { scan-tree-dump-times "\\] = 1;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = 2;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = _\[0-9\]*;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = " 3 "optimized" } } */


Re: S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Andreas Krebbel
On 11/30/2015 06:11 PM, Ulrich Weigand wrote:
...
> However, I agree that UNSPEC_P_TO_BLK really should also get the length
> as input, to make it have precisely defined semantics.  Also, I'd rather
> use a more descriptive name, like UNSPEC_REPLICATE_BYTE or the like.
> 
> What would you think about something like the following?
> 
> (define_insn "*setmem_long"
>   [(clobber (match_operand: 0 "register_operand" "=d"))
>(set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
> (unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")
>  (subreg:P (match_dup 3) 1)] UNSPEC_REPLICATE_BYTE))
>(use (match_operand: 1 "register_operand" "d"))
>(clobber (reg:CC CC_REGNUM))]

Fine with me. Thanks!

Bye,

-Andreas-



[gomp4] Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-30 Thread Thomas Schwinge
Hi!

On Wed, 25 Nov 2015 11:43:14 +0100 (CET), Richard Biener  
wrote:
> On Tue, 24 Nov 2015, Tom de Vries wrote:
> > > [...]
> > 
> > Reposting using the in_loop_pipeline style in pass_lim.
> 
> Ok.

I merged trunk r230907 into gomp-4_0-branch in a very simplistic way,
basically just moving pass_fre in between pass_oacc_kernels and the (new)
pass_oacc_kernels2 pass groups.  We'll want to clean this up later (on
gomp-4_0-branch), once we're more clear on what difference will remain
between the trunk and gomp-4_0-branch pass structures (if any); for now
this makes sure we don't regress OpenACC kernels functionality on
gomp-4_0-branch.  In gomp-4_0-branch r231078, I effectively applied the
following:

commit ffae8a36e195172327a233bd397a4230a7939681
Merge: 8249e60 e1e1688
Author: tschwinge 
Date:   Mon Nov 30 17:28:07 2015 +

svn merge -r 230906:230907 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@231078 
138bc75d-0d04-0410-961f-82ee72b054a4

 gcc/ChangeLog   |  6 
 gcc/passes.def  | 13 +++--
 gcc/testsuite/ChangeLog | 76 +
 3 files changed, 92 insertions(+), 3 deletions(-)

[diff --git gcc/ChangeLog gcc/ChangeLog]
diff --git gcc/passes.def gcc/passes.def
index f4eb235..9fe4fec 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -84,36 +84,43 @@ along with GCC; see the file COPYING3.  If not see
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);
  NEXT_PASS (pass_sra_early);
  /* pass_build_ealias is a dummy pass that ensures that we
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
- /* Pass group that runs when there are oacc kernels in the
-function.  */
+ /* Pass group that runs when the function is an offloaded function
+containing oacc kernels loops.  Part 1.  */
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_ch);
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+ POP_INSERT_PASSES ()
+ NEXT_PASS (pass_fre);
+ /* Pass group that runs when the function is an offloaded function
+containing oacc kernels loops.  Part 2.  */
+ NEXT_PASS (pass_oacc_kernels2);
+ PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
+ /* We use pass_lim to rewrite in-memory iteration and reduction
+variable accesses in loops into local variables accesses.  */
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
  NEXT_PASS (pass_scev_cprop);
  NEXT_PASS (pass_tree_loop_done);
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_dce);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
  NEXT_PASS (pass_expand_omp_ssa);
  NEXT_PASS (pass_tree_loop_done);
  POP_INSERT_PASSES ()
- NEXT_PASS (pass_fre);
  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_dse);
  NEXT_PASS (pass_cd_dce);
  NEXT_PASS (pass_early_ipa_sra);
  NEXT_PASS (pass_tail_recursion);
  NEXT_PASS (pass_convert_switch);
  NEXT_PASS (pass_cleanup_eh);
[diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog]

..., so the following difference from trunk to gomp-4_0-branch remains to
be resolved/reduced (plus the corresponding testsuite tree dump scanning
changes):

--- gcc/passes.def
+++ gcc/passes.def
@@ -89,25 +89,36 @@ along with GCC; see the file COPYING3.  If not see
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
  /* Pass group that runs when the function is an offloaded function
 containing oacc kernels loops.  Part 1.  */
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+ NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_ch);
+ NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  POP_INSERT_PASSES ()
  NEXT_PASS (pass_fre);
  /* Pass group that runs when the function is an offloaded function
 containing oacc kernels loops.  Part 2.  */
  NEXT_PASS (pass_oacc_kernels2);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
  /* We use pass_lim to rewrite in-memory iteration and reduction
 variable accesses in loop

Re: S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Ulrich Weigand
Andreas Krebbel wrote:
> On 11/30/2015 04:11 PM, Dominik Vogt wrote:
> > The attached patch fixes some warnings generated by the setmem...
> > patterns in s390.md during build and adds test cases for the
> > patterns.  The patch is to be added on top of the movstr patch:
> > https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03485.html
> > 
> > The test cases validate that the patterns are actually used, but
> > at the moment the setmem_long_and pattern is never actually used
> > and thus the test case would fail.  So I've split the patch in two
> > (both attached to this message) to activate this part of the test
> > once we've fixed that.
> > 
> > The patch has passed the SPEC2006 testsuite without any measurable
> > changes in performance.
> 
> Shouldn't we instead describe the whole setmem operation as unspec including
> the other operands as well? The semantics of the introduced UNSPEC_P_TO_BLK
> operation is not clear to me.  It suggests to be some kind of "cast" which it
> isn't. In fact it is not able to do its job without the length which is
> specified as use outside the unspec.

Well, I guess I suggested to Dominik to leave the basic
[parallel
  (set (dst:BLK) (src:BLK))
  (use (length))]
structure in place; my understanding is that the middle-end recognizes
this as a block move.  As "source" in this case we'd use a BLKmode
operand that consists of the same byte replicated a number of times.

If we were to use just a single UNSPEC, how would we indicate to the
middle-end that a block of memory is modified, without using too coarse-
grained clobbers?

However, I agree that UNSPEC_P_TO_BLK really should also get the length
as input, to make it have precisely defined semantics.  Also, I'd rather
use a more descriptive name, like UNSPEC_REPLICATE_BYTE or the like.

What would you think about something like the following?

(define_insn "*setmem_long"
  [(clobber (match_operand:<DBL> 0 "register_operand" "=d"))
   (set (mem:BLK (subreg:P (match_operand:<DBL> 3 "register_operand" "0") 0))
        (unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")
                     (subreg:P (match_dup 3) 1)] UNSPEC_REPLICATE_BYTE))
   (use (match_operand:<DBL> 1 "register_operand" "d"))
   (clobber (reg:CC CC_REGNUM))]

[ Not sure if we'd need an extra (use (match_dup 3)) any more. ]

B.t.w. this is certainly wrong and cannot be generated by common code:
(and:BLK (unspec:BLK
  [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
  UNSPEC_P_TO_BLK)
 (match_operand 4 "const_int_operand" "n"))
(This explains why the pattern would never match.)

The AND should be on the filler byte instead:
(unspec:BLK [(and:P (match_operand:P 2 "shift_count_or_setmem_operand" "Y")
                    (match_operand:P 4 "const_int_operand" "n"))
             (subreg:P (match_dup 3) 1)] UNSPEC_REPLICATE_BYTE))

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com
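
As a rough illustration of the semantics discussed above for
UNSPEC_REPLICATE_BYTE, here is a plain C model.  It is only a sketch: the
function names are made up for this note, and it assumes the filler byte is
the low eight bits of operand 2, which the thread does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of UNSPEC_REPLICATE_BYTE: the "source" of the block
   set is LEN copies of a single byte.  Assumption: that byte is the low
   eight bits of FILLER.  */
static void
replicate_byte (unsigned char *dst, size_t len, unsigned long filler)
{
  assert (dst != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    dst[i] = (unsigned char) (filler & 0xff);
}

/* Model of the "_and" variant: the AND is applied to the filler operand
   before replication, not to the BLKmode result.  */
static void
replicate_byte_masked (unsigned char *dst, size_t len,
                       unsigned long filler, unsigned long mask)
{
  replicate_byte (dst, len, filler & mask);
}
```

The point of the second function is that the mask is an input to the
replication, which is why the length and the filler both belong inside the
unspec.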



Re: [PATCH 3/4] [ARM] PR63870 Add test cases

2015-11-30 Thread Charles Baylis
Applied to trunk as r231077.

On 26 November 2015 at 09:43, James Greenhalgh  wrote:
> On Thu, Nov 26, 2015 at 09:41:15AM +, Charles Baylis wrote:
>> Hi James,
>>
>> Ping. This needs an ack from an AArch64 reviewer/maintainer
>
> Fine by me, it will considerably clean up my test results for ARM!
>
> Thanks,
> James
>
>


Re: Fix verify_type ICE during Ada bootstrap

2015-11-30 Thread Jan Hubicka
> 
> I think you are doing too many things in one patch.  I'm fine with
> dropping the zero-alias-set streaming (but I'd rather not assert
> as FE get_alias_set langhook may assign zero to random tree nodes).

Ok, the assert was there mostly to double check that all zero alias
sets rematerialize correctly in LTO which I tested so it can go.
> 
> I'm also fine with handling flag_strict_aliasing conservatively
> during inlining - but the condition you placed on this handling
> needs a comment.  I couldn't decipher it ;)

OK, there is a symmetric condition in ipa-inline-analysis; I will comment on it.
It indeed can go in separately.
> 
> > +  if (dump_file)
> > + fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n",
> > +  to->name (), to->order);
> 
> So I wonder if it makes sense to pessimize such inlining as well.

I don't know - even for Firefox, which heavily mixes -fstrict-aliasing
and -fno-strict-aliasing units, this seems to be quite a rare occasion, and it
is hard to judge the cost of dropping flag_strict_aliasing.
> 
> The two above should be enough to fix the correctness issue.

We also need to prevent ipa-icf and fold_const from optimizing functions
early in a way that is not compatible with inlining a -fno-strict-aliasing
comdat into a -fstrict-aliasing function.

Honza
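
A minimal sketch of the conservative handling discussed here, assuming a
simplified per-function option record.  The struct and function names are
hypothetical; the actual condition in the patch is more involved and is not
reproduced in this excerpt.  Only the dump message is taken from the quoted
hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-function option state.  */
struct fn_opts
{
  const char *name;
  int order;
  bool strict_aliasing;
};

/* When FROM is inlined into TO, TO may keep strict aliasing only if FROM
   was compiled with it as well.  Returns true if the flag was dropped.  */
static bool
merge_strict_aliasing (struct fn_opts *to, const struct fn_opts *from,
                       FILE *dump_file)
{
  assert (to != NULL && from != NULL);
  if (!to->strict_aliasing || from->strict_aliasing)
    return false;
  to->strict_aliasing = false;
  if (dump_file)
    fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n",
             to->name, to->order);
  return true;
}
```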
> 
> The parse_optimize_options hack looks indeed interesting, but we solved
> the issue differently by
> 
> 2014-11-27  Richard Biener  
> 
> PR middle-end/63704
> * alias.c (mems_in_disjoint_alias_sets_p): Remove assert
> and instead return false when !fstrict-aliasing.
> 
> So the hack can be removed as a separate commit after the first one
> above.  This should make optimize("fno-strict-aliasing") work.
> 
> 
> I don't really see why we need all the other changes and IMHO the
> get_alias_set interface change is ugly and fragile.  And this doesn't
> look like sth for stage3.
> 
> Thus please split the patch up.
> 
> Thanks,
> Richard.
> 
> > Honza
> > 
> > * tree.c (free_lang_data): Pass true to get_alias_set.
> > * tree-streamer-in.c (unpack_ts_type_common_value_fields): Do not stream
> > alias set.
> > * tree-ssa-alias.c (ao_ref_base_alias_set, ao_ref_alias_set): Pass true
> > to get_alias_set; comment.
> > (same_type_for_tbaa): Likewise.
> > * alias.c (alias_set_subset_of, alias_sets_conflict_p): When strict
> > aliasing is disabled, return true.
> > (get_alias_set): New parameter strict.
> > (new_alias_set): Always produce new alias set.
> > (record_component_aliases): Pass true to get_alias_set.
> > * alias.h (get_alias_set): New optional parameter STRICT.
> > * lto-streamer-out.c (hash_tree): Do not hash alias set.
> > * ipa-inline-transform.c (inline_call): Drop strict aliasing of
> > caller if needed.
> > * ipa-icf-gimple.c (func_checker::compatible_types_p): Pass true
> > to get_alias_set.
> > * tree-streamer-out.c (pack_ts_type_common_value_fields): Do not
> > stream TYPE_ALIAS_SET; sanity check that alias set 0 at LTO time will
> > match what frontends do.
> > * fold-const.c (operand_equal_p): Be careful about TBAA info before
> > inlining even with -fno-strict-aliasing.
> > * gimple.c (gimple_get_alias_set): Pass true to get_alias_set.
> > 
> > * misc.c (gnat_get_alias_set): Pass true to get_alias_set.
> > * utils.c (relate_alias_sets): Likewise.
> > * trans.c (validate_unchecked_conversion): Likewise.
> > 
> > * lto-symtab.c (warn_type_compatibility_p): Pass true to get_alias_set.
> > * lto.c (compare_tree_sccs_1): Do not compare TYPE_ALIAS_SET.
> > 
> > * gcc.c-torture/execute/alias-1.c: New testcase.
> > * gcc.dg/lto/alias-1_0.c: New testcase.
> > * gcc.dg/lto/alias-1_1.c: New testcase.
> > 
> > * c-common.c (parse_optimize_options): Remove hack about
> > flag_strict_aliasing.
> > (convert_vector_to_pointer_for_subscript): Pass true to get_alias_set.
> > 
> > * cp-objcp-common.c (cxx_get_alias_set): Pass true to get_alias_set.
> > 
> > * rtti.c (typeid_ok_p): Pass true to get_alias_set.
> > Index: tree.c
> > ===
> > --- tree.c  (revision 231020)
> > +++ tree.c  (working copy)
> > @@ -5971,7 +5971,8 @@ free_lang_data (void)
> >   while the slots are still in the way the frontends generated them.  */
> >for (i = 0; i < itk_none; ++i)
> >  if (integer_types[i])
> > -  TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i]);
> > +  TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i],
> > +true);
> >  
> >/* Traverse the IL resetting language specific information for
> >   operands, expressions, etc.  */
> > Index: cp/rtti.c
> > ===
> > --- cp/rtti.c   (revision 231020)
> > +++ cp/rtti.c   (working copy)

Re: [PATCH] [PR68603] Associate conditional C++ loop's back-jump with start, not body

2015-11-30 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Fix PR68067

2015-11-30 Thread Jeff Law

On 11/30/2015 01:42 AM, Richard Biener wrote:


> Yeah.  I've pondered with clearing the hashmap after each pass
> (and hope no IPA pass would redirect edges).  Or even more aggressive,
> clear the hashmap as well when we do set_cfun ().
>
> Maybe you can try that?
>
> And no, I don't think any pass expects this stuff to be live across
> passes.
I'd argue that any pass that expects this stuff to be live across a pass 
is fundamentally broken.


jeff



Re: [PATCH] rs6000_adjust_cost old thinko

2015-11-30 Thread Eric Botcazou
> FYI, the function should test recog_memoized (dep_insn) also.

I don't think that's needed as it doesn't call get_attr_type on dep_insn.

-- 
Eric Botcazou


Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 05:36:25PM +0100, Tom de Vries wrote:
> +int
> +main (void)
> +{
> +  unsigned results[nEvents];
> +  unsigned pData[nEvents];
> +  unsigned coeff = 2;
> +
> +  init (&results[0], &pData[0]);
> +
> +#pragma omp parallel for
> +  for (int idx = 0; idx < (int)nEvents; idx++)
> +results[idx] = coeff * pData[idx];

Could you please add another testcase, where you have say pData
and some other pointer that init sets to alias with pData, and verify
that such loop (would need to be say normal loop inside #pragma omp single
or master) is not vectorized?

Jakub
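
One possible shape for such a testcase, as a sketch only: `alias_ptr` and the
loop body are invented for illustration, and a real test would additionally
need the dejagnu directives that check the vectorizer dump.  Without
-fopenmp the pragmas are simply ignored and the loop runs serially.

```c
#include <assert.h>

#define N_EVENTS 1000

/* Set by init so that it aliases p_data; the compiler must not assume
   the two pointers are independent.  */
static unsigned *alias_ptr;

static void __attribute__ ((noinline, noclone))
init (unsigned *p_data)
{
  for (unsigned i = 0; i < N_EVENTS; i++)
    p_data[i] = i % 3;
  alias_ptr = p_data;
}

static unsigned
run_aliasing_loop (void)
{
  unsigned p_data[N_EVENTS];
  unsigned sum = 0;

  init (p_data);
  assert (alias_ptr == p_data);

  /* Each iteration reads the element written by the previous one through
     the alias, so vectorizing without an alias check would be wrong.  */
#pragma omp parallel
#pragma omp single
  for (int idx = 0; idx < N_EVENTS - 1; idx++)
    alias_ptr[idx + 1] = p_data[idx] + 1;

  for (int idx = 0; idx < N_EVENTS; idx++)
    sum += p_data[idx];
  return sum;
}
```

With the dependence honoured, p_data[k] ends up equal to k, so the sum is
0 + 1 + ... + 999.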


[PATCH] [PR68603] Associate conditional C++ loop's back-jump with start, not body

2015-11-30 Thread Andreas Arnez
SVN commit r230979 always associates a loop's back-jump with the start
of the loop body.  This caused a regression for gcov with conditional
loops, because then the loop body appears to be covered twice per
iteration.

gcc/cp/ChangeLog:

PR gcov-profile/68603
* cp-gimplify.c (genericize_cp_loop): For the back-jump's location
use the start of the loop body only if the loop is unconditional.
---
 gcc/cp/cp-gimplify.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index a9a34cd..3c89f1b 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -264,7 +264,9 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, tree cond, tree body,
 }
   else
 {
-  location_t loc = EXPR_LOCATION (expr_first (body));
+  location_t loc = start_locus;
+  if (!cond || integer_nonzerop (cond))
+   loc = EXPR_LOCATION (expr_first (body));
   if (loc == UNKNOWN_LOCATION)
loc = start_locus;
   loop = build1_loc (loc, LOOP_EXPR, void_type_node, stmt_list);
-- 
2.5.0
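
The location choice made by the hunk above can be modelled in isolation.
This is a sketch with stand-in types: `loc_t`, `UNKNOWN_LOC` and the two
boolean parameters are hypothetical replacements for location_t,
UNKNOWN_LOCATION and the `!cond || integer_nonzerop (cond)` test.

```c
#include <stdbool.h>

typedef unsigned int loc_t;       /* stand-in for location_t */
#define UNKNOWN_LOC ((loc_t) 0)   /* stand-in for UNKNOWN_LOCATION */

/* Pick the location for a loop's back-jump: the first statement of the
   body only for unconditional loops, otherwise the loop start.  */
static loc_t
back_jump_location (bool has_cond, bool cond_is_nonzero_const,
                    loc_t body_first_loc, loc_t start_locus)
{
  loc_t loc = start_locus;
  if (!has_cond || cond_is_nonzero_const)
    loc = body_first_loc;
  if (loc == UNKNOWN_LOC)
    loc = start_locus;
  return loc;
}
```

For a conditional loop the back-jump is now attributed to the loop start, so
gcov no longer counts the first body line twice per iteration.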



Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

On 30/11/15 14:24, Richard Biener wrote:

On Mon, 30 Nov 2015, Tom de Vries wrote:


On 30/11/15 10:16, Richard Biener wrote:

On Mon, 30 Nov 2015, Tom de Vries wrote:


Hi,

this patch fixes PR46032.

It handles a call:
...
__builtin_GOMP_parallel (fn, data, num_threads, flags)
...
as:
...
fn (data)
...
in ipa-pta.

This improves ipa-pta alias analysis in the parallelized function fn, and
allows vectorization in the testcase without a runtime alias test.

Bootstrapped and reg-tested on x86_64.

OK for stage3 trunk?


+ /* Assign the passed argument to the appropriate incoming
+parameter of the function.  */
+ struct constraint_expr lhs ;
+ lhs = get_function_part_constraint (fi, fi_parm_base + 0);
+ auto_vec rhsc;
+ struct constraint_expr *rhsp;
+ get_constraint_for_rhs (arg, &rhsc);
+ while (rhsc.length () != 0)
+   {
+ rhsp = &rhsc.last ();
+ process_constraint (new_constraint (lhs, *rhsp));
+ rhsc.pop ();
+   }

please use style used elsewhere with

   FOR_EACH_VEC_ELT (rhsc, j, rhsp)
 process_constraint (new_constraint (lhs, *rhsp));
   rhsc.truncate (0);



That code was copied from find_func_aliases_for_call.
I've factored out the bit that I copied as find_func_aliases_for_call_arg, and
fixed the style there (and dropped 'rhsc.truncate (0)' since AFAIU it's
redundant at the end of a function).


+ /* Parameter passed by value is used.  */
+ lhs = get_function_part_constraint (fi, fi_uses);
+ struct constraint_expr *rhsp;
+ get_constraint_for_address_of (arg, &rhsc);

This isn't correct - you want to use get_constraint_for (arg, &rhsc).
After all rhs is already an ADDR_EXPR.



Can we add an assert somewhere to detect this incorrect usage?


+ FOR_EACH_VEC_ELT (rhsc, j, rhsp)
+   process_constraint (new_constraint (lhs, *rhsp));
+ rhsc.truncate (0);
+
+ /* The caller clobbers what the callee does.  */
+ lhs = get_function_part_constraint (fi, fi_clobbers);
+ rhs = get_function_part_constraint (cfi, fi_clobbers);
+ process_constraint (new_constraint (lhs, rhs));
+
+ /* The caller uses what the callee does.  */
+ lhs = get_function_part_constraint (fi, fi_uses);
+ rhs = get_function_part_constraint (cfi, fi_uses);
+ process_constraint (new_constraint (lhs, rhs));

I don't see why you need those.  The solver should compute these
in even better precision (context sensitive on the call side).

The same is true for the function parameter.  That is, the only
needed part of the patch should be that making sure we see
the "direct" call and assign parameters correctly.



Dropped this bit.

OK for stage3 trunk if bootstrap and reg-test succeeds?


-|| node->address_taken);
+|| (node->address_taken
+&& !node->parallelized_function));

please add a comment here on why this is safe.

Ok with this change.


Updated with comment, committed as attached.

Thanks,
- Tom
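
To make the "call to __builtin_GOMP_parallel (fn, data, ...) is treated as
fn (data)" idea concrete, here is a single-threaded model.  It is a sketch:
`model_gomp_parallel`, `outlined_fn` and `struct omp_data` are hypothetical
names, and the stand-in is not the libgomp entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data block of the kind the compiler builds for an
   outlined parallel region.  */
struct omp_data
{
  unsigned *results;
  const unsigned *p_data;
  unsigned coeff;
  size_t n;
};

/* The outlined body of the parallel loop.  */
static void
outlined_fn (void *data)
{
  struct omp_data *d = data;
  for (size_t i = 0; i < d->n; i++)
    d->results[i] = d->coeff * d->p_data[i];
}

/* Single-threaded stand-in: from the point of view of points-to
   analysis, the only relevant effect is the call fn (data).  */
static void
model_gomp_parallel (void (*fn) (void *), void *data,
                     unsigned num_threads, unsigned flags)
{
  (void) num_threads;
  (void) flags;
  assert (fn != NULL);
  fn (data);
}
```

Seeing the direct call lets the analysis connect `data` in the caller with
the parameter of the outlined function, instead of treating the function as
having escaped.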


Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30  Tom de Vries  

	PR tree-optimization/46032
	* tree-ssa-structalias.c (find_func_aliases_for_call_arg): New function,
	factored out of ...
	(find_func_aliases_for_call): ... here.
	(find_func_aliases_for_builtin_call, find_func_clobbers): Handle
	BUILT_IN_GOMP_PARALLEL.
	(ipa_pta_execute): Same.  Handle node->parallelized_function as a local
	function.

	* gcc.dg/pr46032.c: New test.

	* testsuite/libgomp.c/pr46032.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032.c| 47 +++
 gcc/tree-ssa-structalias.c| 71 ---
 libgomp/testsuite/libgomp.c/pr46032.c | 44 ++
 3 files changed, 149 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr46032.c b/gcc/testsuite/gcc.dg/pr46032.c
new file mode 100644
index 000..b91190e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -ftree-vectorize -std=c99 -fipa-pta -fdump-tree-vect-all" } */
+
+extern void abort (void);
+
+#define nEvents 1000
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+init (unsigned *results, unsigned *pData)
+{
+  unsigned int i;
+  for (i = 0; i < nEvents; ++i)
+pData[i] = i % 3;
+}
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+check (unsigned *results)
+{
+  unsigned sum = 0;
+  for (int idx = 0; idx < (int)nEvents; idx++)
+sum += results[idx];
+
+  if (sum != 1998)
+abort ();
+}
+
+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init (&results[0], &pData[

Re: [PATCH] [ARC] Add support for atomic memory built-in.

2015-11-30 Thread Claudiu Zissulescu
Ping. This patch has been stalled for two weeks.

Thanks,
Claudiu
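
For readers skimming the quoted patch below, the pre-barrier selection in
arc_pre_atomic_barrier can be summarised by the following model.  It is a
sketch: the enums are simplified, hypothetical stand-ins (the SYNC variants
are folded into their base models), not the GCC definitions.

```c
#include <assert.h>

/* Simplified, hypothetical memory-model enum.  */
enum mem_model
{
  MM_RELAXED, MM_CONSUME, MM_ACQUIRE, MM_RELEASE, MM_ACQ_REL, MM_SEQ_CST
};

enum barrier_kind { BARRIER_NONE, BARRIER_MEMBAR, BARRIER_SYNC };

/* Which barrier is emitted before the atomic sequence, following the
   switch statement in the quoted patch.  */
static enum barrier_kind
pre_atomic_barrier (enum mem_model model)
{
  switch (model)
    {
    case MM_RELAXED:
    case MM_CONSUME:
    case MM_ACQUIRE:
      return BARRIER_NONE;
    case MM_RELEASE:
    case MM_ACQ_REL:
      return BARRIER_MEMBAR;
    case MM_SEQ_CST:
      return BARRIER_SYNC;
    }
  assert (!"unknown memory model");
  return BARRIER_NONE;
}
```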

On Mon, Nov 16, 2015 at 11:18 AM, Claudiu Zissulescu
 wrote:
> This patch adds support for atomic memory built-in for ARCHS and ARC700. 
> Tested with dg.exp.
>
> OK to apply?
>
> Thanks,
> Claudiu
>
> ChangeLogs:
> gcc/
>
> 2015-11-12  Claudiu Zissulescu  
>
> * config/arc/arc-protos.h (arc_expand_atomic_op): Prototype.
> (arc_split_compare_and_swap): Likewise.
> (arc_expand_compare_and_swap): Likewise.
> * config/arc/arc.c (arc_init): Check usage atomic option.
> (arc_pre_atomic_barrier): New function.
> (arc_post_atomic_barrier): Likewise.
> (emit_unlikely_jump): Likewise.
> (arc_expand_compare_and_swap_qh): Likewise.
> (arc_expand_compare_and_swap): Likewise.
> (arc_split_compare_and_swap): Likewise.
> (arc_expand_atomic_op): Likewise.
> * config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): New C macro.
> (ASM_SPEC): Enable mlock option when matomic is used.
> * config/arc/arc.md (UNSPEC_ARC_MEMBAR): Define.
> (VUNSPEC_ARC_CAS): Likewise.
> (VUNSPEC_ARC_LL): Likewise.
> (VUNSPEC_ARC_SC): Likewise.
> (VUNSPEC_ARC_EX): Likewise.
> * config/arc/arc.opt (matomic): New option.
> * config/arc/constraints.md (ATO): New constraint.
> * config/arc/predicates.md (mem_noofs_operand): New predicate.
> * doc/invoke.texi: Document -matomic.
> * config/arc/atomic.md: New file.
>
> gcc/testsuite
>
> 2015-11-12  Claudiu Zissulescu  
>
> * lib/target-supports.exp (check_effective_target_arc_atomic): New
> function.
> (check_effective_target_sync_int_long): Add checks for ARC atomic
> feature.
> (check_effective_target_sync_char_short): Likewise.
> ---
>  gcc/config/arc/arc-protos.h   |   4 +
>  gcc/config/arc/arc.c  | 391 ++
>  gcc/config/arc/arc.h  |   6 +-
>  gcc/config/arc/arc.md |   9 +
>  gcc/config/arc/arc.opt|   3 +
>  gcc/config/arc/atomic.md  | 235 
>  gcc/config/arc/constraints.md |   6 +
>  gcc/config/arc/predicates.md  |   4 +
>  gcc/doc/invoke.texi   |   8 +-
>  gcc/testsuite/lib/target-supports.exp |  11 +
>  10 files changed, 675 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/arc/atomic.md
>
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index 6e04351..3581bb0 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -41,6 +41,10 @@ extern int arc_output_commutative_cond_exec (rtx *operands, bool);
>  extern bool arc_expand_movmem (rtx *operands);
>  extern bool prepare_move_operands (rtx *operands, machine_mode mode);
>  extern void emit_shift (enum rtx_code, rtx, rtx, rtx);
> +extern void arc_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
> +extern void arc_split_compare_and_swap (rtx *);
> +extern void arc_expand_compare_and_swap (rtx *);
> +
>  #endif /* RTX_CODE */
>
>  #ifdef TREE_CODE
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 8bb0969..d47bbe4 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "context.h"
>  #include "builtins.h"
>  #include "rtl-iter.h"
> +#include "alias.h"
>
>  /* Which cpu we're compiling for (ARC600, ARC601, ARC700).  */
>  static const char *arc_cpu_string = "";
> @@ -884,6 +885,9 @@ arc_init (void)
>flag_pic = 0;
>  }
>
> +  if (TARGET_ATOMIC && !(TARGET_ARC700 || TARGET_HS))
> +error ("-matomic is only supported for ARC700 or ARC HS cores");
> +
>arc_init_reg_tables ();
>
>/* Initialize array for PRINT_OPERAND_PUNCT_VALID_P.  */
> @@ -9650,6 +9654,393 @@ arc_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
>return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
>  }
>
> +/* Emit a (pre) memory barrier around an atomic sequence according to
> +   MODEL.  */
> +
> +static void
> +arc_pre_atomic_barrier (enum memmodel model)
> +{
> + switch (model & MEMMODEL_MASK)
> +{
> +case MEMMODEL_RELAXED:
> +case MEMMODEL_CONSUME:
> +case MEMMODEL_ACQUIRE:
> +case MEMMODEL_SYNC_ACQUIRE:
> +  break;
> +case MEMMODEL_RELEASE:
> +case MEMMODEL_ACQ_REL:
> +case MEMMODEL_SYNC_RELEASE:
> +  emit_insn (gen_membar (const0_rtx));
> +  break;
> +case MEMMODEL_SEQ_CST:
> +case MEMMODEL_SYNC_SEQ_CST:
> +  emit_insn (gen_sync (const1_rtx));
> +  break;
> +default:
> +  gcc_unreachable ();
> +}
> +}
> +
> +/* Emit a (post) memory barrier around an atomic sequence according to
> +   MODEL.  */
> +
> +static void
> +arc_post_atomic_barrier (enum memmodel model)
> +{
> + switch (model & MEMMODEL_MASK)
> +{
> +case MEMMODEL

Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 05:17:29PM +0100, Bernd Schmidt wrote:
> On 11/30/2015 01:12 PM, Andreas Tobler wrote:
> >On 30.11.15 11:28, Bernd Schmidt wrote:
> >>On 11/29/2015 08:32 PM, Andreas Tobler wrote:
> >>>-/* { dg-do run { target { *-*-linux* } } } */
> >>>+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */
> >>
> >>I see a patch from you to add asan support to x86 freebsd, but what
> >>about other architectures?
> >
> >You mean because of the wildcard? I'll add them as I have time to port
> >them.
> >
> >For now they are UNSUPPORTED.
> 
> Is that how they show up, or do you get FAILs on other FreeBSDs?

This is inside of asan.exp, which is guarded with
check_effective_target_fsanitize_address
and therefore should not be run at all on non-asan targets.
I think the testsuite changes are fine, but IMHO it doesn't make sense to
commit them until the FreeBSD asan support lands (which depends on
the upstream libsanitizer change, I believe).  Once that happens, it can be
cherry-picked from there; the config/i386 part looks reasonable.

Jakub


Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Bernd Schmidt

On 11/30/2015 01:12 PM, Andreas Tobler wrote:

On 30.11.15 11:28, Bernd Schmidt wrote:

On 11/29/2015 08:32 PM, Andreas Tobler wrote:

-/* { dg-do run { target { *-*-linux* } } } */
+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */


I see a patch from you to add asan support to x86 freebsd, but what
about other architectures?


You mean because of the wildcard? I'll add them as I have time to port
them.

For now they are UNSUPPORTED.


Is that how they show up, or do you get FAILs on other FreeBSDs?


Does every *-*-linux* have asan support?


Probably not, but I guess the main ones people tend to test.


Bernd


Re: [RFC] [Patch] PR67326 - relax trap assumption by looking at similar DRS

2015-11-30 Thread H.J. Lu
On Fri, Nov 27, 2015 at 12:24 AM, Kumar, Venkataramanan
 wrote:
> Hi Richard,
>
>> -Original Message-
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: Tuesday, November 24, 2015 9:07 PM
>> To: Kumar, Venkataramanan
>> Cc: Jakub Jelinek (ja...@redhat.com); gcc-patches@gcc.gnu.org
>> Subject: Re: [RFC] [Patch] PR67326 - relax trap assumption by looking at
>> similar DRS
>>
>> On Fri, Nov 20, 2015 at 1:02 PM, Kumar, Venkataramanan
>>  wrote:
>> > Hi Richard,
>> >
>> > As per Jakub's suggestion in
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67326, the below patch fixes
>> > the regression in tree if conversion.
>> > Basically it allows if conversion to happen for a candidate DR if we find
>> > a similar DR with the same dimensions and that DR will not trap.
>> >
>> > To find similar DRs, it uses a hash table that hashes the offset and DR
>> > pairs.  It also reuses the read/written information that was stored for
>> > the reference tree.
>> >
>> > Also:
>> > (1) I guard these checks with -ftree-loop-if-convert-stores and
>> > -fno-common.  Sometimes vectorization flags also trigger if conversion.
>> > (2) Also hashing base DRs for writes only.
>> >
>> > gcc/ChangeLog
>> > 2015-11-19  Venkataramanan  
>> >
>> > PR tree-optimization/67326
>> > * tree-if-conv.c  (offset_DR_map): Define.
>> > (struct ifc_dr): Add new tree base_predicate field.
>> > (hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash
>> offsets, DR pairs
>> > and hash base ref,  DR pairs  for write type DRs.
>> > (ifcvt_memrefs_wont_trap):  Guard checks with -ftree-loop-if-
>> convert-stores flag.
>> >Check for similar DR that are accessed unconditionally.
>> >(if_convertible_loop_p_1):  Initialize and delete offset hash
>> > maps
>> >
>> > gcc/testsuite/ChangeLog
>> > 2015-11-19  Venkataramanan  
>> > * gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.
>> >
>> > Regstrapped on x86_64, Ok for trunk?
>>
>> +  if (offset)
>> +{
>> +  offset_master_dr = &offset_DR_map->get_or_insert (offset,&exist3);
>> +  if (!exist3)
>> +   *offset_master_dr = a;
>> +
>> +  if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
>> +   DR_RW_UNCONDITIONALLY (*offset_master_dr)
>> +   = DR_RW_UNCONDITIONALLY (*master_dr);
>>
>> this is fishy - as far as I can see offset_master globs all _candidates_ and
>>
>> +  else if (DR_OFFSET (a))
>> +{
>> +  offset_dr = offset_DR_map->get (DR_OFFSET (a));
>> +  if ((DR_RW_UNCONDITIONALLY (*offset_dr) == 1)
>> +  && DR_NUM_DIMENSIONS (a) == DR_NUM_DIMENSIONS
>> (*offset_dr))
>> +   {
>> + tree base_tree = get_base_address (DR_REF (a));
>> + if (DECL_P (base_tree)
>> + && flag_tree_loop_if_convert_stores
>> + && decl_binds_to_current_def_p (base_tree)
>> + && !TREE_READONLY (base_tree))
>> +   return true;
>> +   }
>> +}
>>
>> where with this that actually checks something (DR_NUM_DIMENSIONS is
>> not something you can use to identify two arrays with the same domain) will
>> then consider DR_RW_UNCONDITIONALLY ORed from all _candidates_ but
>> not only from those which really have the same domain.
>>
>> You need to do the domain check as part of the hash-map
>> hashing/comparing.
>>
>> Note that there is no bounds info in the data ref info so you need to
>>   a) consider DR_OFFSET + DR_INIT
>>   b) verify the access size is the same
>>      (TYPE_SIZE_UNIT (TREE_TYPE (dr->ref)))
>>   c) verify the base objects are of the same size - note this is somewhat
>>      difficult as the base object for DR_OFFSET/INIT is starting at
>>      DR_BASE_ADDRESS so maybe restrict this to ADDR_EXPR
>>      DR_BASE_ADDRESS cases where you can look at DECL_SIZE (decl) of both
>>      candidates
>>
>> You can also try using indices (DR_BASE_OBJECT plus DR_ACCESS_FNS when
>> DR_UNCONSTRAINED_BASE is false).  If the size of DR_BASE_OBJECT
>> matches and all access functions are equal it should be a compatible enough
>> case as well.
>
> OK, I will take some time to figure out the domain analysis part.
>
>>
>> I'd say you should split out the base_predicate introduction into a separate
>> patch (this change looks ok).
>>
>
> Attached patch has the "base_predicate" introduction part alone.
> It does the predicate folding and hashes base references for only write type
> DRs while hashing.
> I have not added any new test case since we already have  ifc-8.c
>
> Also fixed formatting issues Jakub  pointed out for this patch.
>
> Boot strapped on X86_64.
>
> Ok to upstream if it passes regression tests?
>
> gcc/ChangeLog
> 2015-11-27  Venkataramanan Kumar  
>
> * tree-if-conv.c (struct ifc_dr): Add new tree
> base_predicate field.
> (hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash
> base ref, DR pairs and store base_predicate for write type DRs.
> (ifcvt_memrefs_wont_trap): Guard checks with
> -ftree-loop-if-convert-sto

Re: S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Andreas Krebbel
On 11/30/2015 04:11 PM, Dominik Vogt wrote:
> The attached patch fixes some warnings generated by the setmem...
> patterns in s390.md during build and adds test cases for the
> patterns.  The patch is to be added on top of the movstr patch:
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03485.html
> 
> The test cases validate that the patterns are actually used, but
> at the moment the setmem_long_and pattern is never actually used
> and thus the test case would fail.  So I've split the patch in two
> (both attached to this message) to activate this part of the test
> once we've fixed that.
> 
> The patch has passed the SPEC2006 testsuite without any measurable
> changes in performance.

Shouldn't we instead describe the whole setmem operation as unspec including
the other operands as well? The semantics of the introduced UNSPEC_P_TO_BLK
operation is not clear to me.  It suggests to be some kind of "cast" which it
isn't. In fact it is not able to do its job without the length which is
specified as use outside the unspec.

Bye,

-Andreas-



Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Richard Biener wrote:

> On Mon, 30 Nov 2015, Marek Polacek wrote:
> 
> > On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> > > Different approach: after the FE folds (unexpectedly?), scan the result
> > > for SAVE_EXPRs and if found, drop the folding.
> > 
> > This doesn't fix the problem completely either, because we simply don't
> > know where those SAVE_EXPRs might be introduced: it might be convert(), but
> > e.g. when I changed the original testcase a tiny bit (added -), then those
> > SAVE_EXPRs were introduced in a different spot (via c_process_stmt_expr ->
> > c_fully_fold).
> 
> So the following "disables" save_expr generation from generic-match.c
> by failing to simplify whenever save_expr would actually end up wrapping
> the operand in a SAVE_EXPR.
> 
> I expect this will make fixing PR68590 difficult (w/o re-introducing
> some fold-const.c code or changing genmatch to "special-case"
> things).
> 
> The other option for this PR is to re-introduce the TREE_SIDE_EFFECTS
> check I removed earlier (to avoid un-CSEing large expressions at
> -O0 for example) and thus only FAIL if the save_expr were needed
> for correctness.

And the following will avoid quite some fallout (eventually).  Testing
as desired change independently.

Richard.
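
The reason the match.pd pattern below keeps a floating-point guard can be
seen from plain C.  This is only an illustrative sketch with made-up
function names, assuming IEEE semantics (no -ffast-math):

```c
#include <math.h>

/* x == x is true for every value except NaN.  */
static int
eq_self (double x)
{
  return x == x;
}

/* x >= x behaves exactly like x == x, which is why the pattern may
   rewrite ge/le into eq when NaNs are honored.  */
static int
ge_self (double x)
{
  return x >= x;
}

/* x != x is the classic NaN test.  */
static int
ne_self (double x)
{
  return x != x;
}
```

For integer operands, or when NaNs are not honored, all of the eq/ge/le
self-comparisons fold to constant true.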

Index: gcc/match.pd
===
--- gcc/match.pd(revision 231065)
+++ gcc/match.pd(working copy)
@@ -1828,15 +1828,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  
 /* Simplify comparison of something with itself.  For IEEE
floating-point, we can only do some of these simplifications.  */
-(simplify
- (eq @0 @0)
- (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
-  || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
-  { constant_boolean_node (true, type); }))
-(for cmp (ge le)
+(for cmp (eq ge le)
  (simplify
   (cmp @0 @0)
-  (eq @0 @0)))
+  (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
+       || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
+   { constant_boolean_node (true, type); }
+   (if (cmp != EQ_EXPR)
+    (eq @0 @0)))))
 (for cmp (ne gt lt)
  (simplify
   (cmp @0 @0)


> Richard.
> 
> Index: gcc/tree.c
> ===
> --- gcc/tree.c  (revision 231065)
> +++ gcc/tree.c  (working copy)
> @@ -3231,8 +3231,6 @@ decl_address_ip_invariant_p (const_tree
> not handle arithmetic; that's handled in skip_simple_arithmetic and
> tree_invariant_p).  */
>  
> -static bool tree_invariant_p (tree t);
> -
>  static bool
>  tree_invariant_p_1 (tree t)
>  {
> @@ -3282,7 +3280,7 @@ tree_invariant_p_1 (tree t)
>  
>  /* Return true if T is function-invariant.  */
>  
> -static bool
> +bool
>  tree_invariant_p (tree t)
>  {
>tree inner = skip_simple_arithmetic (t);
> Index: gcc/tree.h
> ===
> --- gcc/tree.h  (revision 231065)
> +++ gcc/tree.h  (working copy)
> @@ -4320,6 +4320,10 @@ extern tree staticp (tree);
>  
>  extern tree save_expr (tree);
>  
> +/* Return true if T is function-invariant.  */
> +
> +extern bool tree_invariant_p (tree);
> +
>  /* Look inside EXPR into any simple arithmetic operations.  Return the
> outermost non-arithmetic or non-invariant node.  */
>  
> Index: gcc/genmatch.c
> ===
> --- gcc/genmatch.c  (revision 231065)
> +++ gcc/genmatch.c  (working copy)
> @@ -3106,7 +3106,9 @@ dt_simplify::gen_1 (FILE *f, int indent,
> else if (is_a  (opr))
>   is_predicate = true;
> /* Search for captures used multiple times in the result expression
> -  and dependent on TREE_SIDE_EFFECTS emit a SAVE_EXPR.  */
> +  and check if we can safely evaluate it multiple times.  Otherwise
> +  fail, avoiding a SAVE_EXPR because that confuses the C FE
> +  const expression folding.  */
> if (!is_predicate)
>   for (int i = 0; i < s->capture_max + 1; ++i)
> {
> @@ -3114,8 +3116,8 @@ dt_simplify::gen_1 (FILE *f, int indent,
> continue;
>   if (cinfo.info[i].result_use_count > 1)
> fprintf_indent (f, indent,
> -   "captures[%d] = save_expr (captures[%d]);\n",
> -   i, i);
> +   "if (! tree_invariant_p (captures[%d])) "
> +   "return NULL_TREE;\n", i);
> }
> for (unsigned j = 0; j < e->ops.length (); ++j)
>   {
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Marek Polacek wrote:

> On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> > Different approach: after the FE folds (unexpectedly?), scan the result for
> > SAVE_EXPRs and if found, drop the folding.
> 
> This doesn't fix the problem completely either, because we simply don't know where
> those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
> changed the original testcase a tiny bit (added -), then those SAVE_EXPRs were
> introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).

So the following "disables" save_expr generation from generic-match.c
by failing to simplify whenever save_expr would actually end up wrapping
the operand in a SAVE_EXPR.

I expect this will make fixing PR68590 difficult (w/o re-introducing
some fold-const.c code or changing genmatch to "special-case"
things).

The other option for this PR is to re-introduce the TREE_SIDE_EFFECTS
check I removed earlier (to avoid un-CSEing large expressions at
-O0 for example) and thus only FAIL if the save_expr were needed
for correctness.

Richard.
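
Why a capture that is used more than once in the result needs either a
SAVE_EXPR or an invariance check can be illustrated in plain C.  This is a
sketch with made-up names; `next_value` stands for any operand with a side
effect.

```c
/* Counts how often the operand has been evaluated.  */
static int eval_count;

/* Hypothetical operand with a side effect.  */
static int
next_value (void)
{
  return ++eval_count;
}

/* x * 2 with the operand evaluated once and reused: this is what
   wrapping the operand in a SAVE_EXPR guarantees.  */
static int
times_two_saved (void)
{
  int x = next_value ();
  return x + x;
}

/* A naive rewrite of x * 2 into x + x that duplicates the operand:
   the side effect happens twice and the result changes.  */
static int
times_two_duplicated (void)
{
  return next_value () + next_value ();
}
```

The patch below takes the third route: if the operand is not invariant, the
simplification is simply not applied, so neither the duplication nor the
SAVE_EXPR arises.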

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 231065)
+++ gcc/tree.c  (working copy)
@@ -3231,8 +3231,6 @@ decl_address_ip_invariant_p (const_tree
not handle arithmetic; that's handled in skip_simple_arithmetic and
tree_invariant_p).  */
 
-static bool tree_invariant_p (tree t);
-
 static bool
 tree_invariant_p_1 (tree t)
 {
@@ -3282,7 +3280,7 @@ tree_invariant_p_1 (tree t)
 
 /* Return true if T is function-invariant.  */
 
-static bool
+bool
 tree_invariant_p (tree t)
 {
   tree inner = skip_simple_arithmetic (t);
Index: gcc/tree.h
===
--- gcc/tree.h  (revision 231065)
+++ gcc/tree.h  (working copy)
@@ -4320,6 +4320,10 @@ extern tree staticp (tree);
 
 extern tree save_expr (tree);
 
+/* Return true if T is function-invariant.  */
+
+extern bool tree_invariant_p (tree);
+
 /* Look inside EXPR into any simple arithmetic operations.  Return the
outermost non-arithmetic or non-invariant node.  */
 
Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 231065)
+++ gcc/genmatch.c  (working copy)
@@ -3106,7 +3106,9 @@ dt_simplify::gen_1 (FILE *f, int indent,
  else if (is_a  (opr))
is_predicate = true;
  /* Search for captures used multiple times in the result expression
-and dependent on TREE_SIDE_EFFECTS emit a SAVE_EXPR.  */
+and check if we can safely evaluate it multiple times.  Otherwise
+fail, avoiding a SAVE_EXPR because that confuses the C FE
+const expression folding.  */
  if (!is_predicate)
for (int i = 0; i < s->capture_max + 1; ++i)
  {
@@ -3114,8 +3116,8 @@ dt_simplify::gen_1 (FILE *f, int indent,
  continue;
if (cinfo.info[i].result_use_count > 1)
  fprintf_indent (f, indent,
- "captures[%d] = save_expr (captures[%d]);\n",
- i, i);
+ "if (! tree_invariant_p (captures[%d])) "
+ "return NULL_TREE;\n", i);
  }
  for (unsigned j = 0; j < e->ops.length (); ++j)
{


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Marek Polacek
On Sat, Nov 28, 2015 at 04:05:30PM +, Joseph Myers wrote:
> On Sat, 28 Nov 2015, Richard Biener wrote:
> 
> > Different approach: after the FE folds (unexpectedly?), scan the result 
> > for SAVE_EXPRs and if found, drop the folding.
> 
> Or, if conversions are going to fold from language-independent code (which 
> is the underlying problem here - a conversion without folding would be 
> preferred once the fallout from that can be resolved), make the front end 
> fold with c_fully_fold before doing the conversion, and wrap the result of 
> the conversion in a C_MAYBE_CONST_EXPR with c_wrap_maybe_const in the same 
> way as done in other places that fold early (if either c_fully_fold 
> indicates it can't occur in a constant expression, or the result of 
> folding / conversion is not an INTEGER_CST).

Unfortunately, even this doesn't seem to work :(; I'm getting leaked
C_MAYBE_CONST_EXPRs e.g. when converting to (_Complex float), and a bunch of
missing warnings resulting in big testsuite fallout.

Marek


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Marek Polacek
On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> Different approach: after the FE folds (unexpectedly?), scan the result for
> SAVE_EXPRs and if found, drop the folding.

This does not fix the problem completely either, because we simply don't know where
those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
changed the original testcase a tiny bit (added -), then those SAVE_EXPRs were
introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).

Marek


Re: [PATCH 3/4] Add libgomp plugin for Intel MIC

2015-11-30 Thread Jakub Jelinek
On Wed, Nov 11, 2015 at 05:56:15PM +0300, Aleksander Ivanyushenko wrote:
> diff --git a/configure.ac b/configure.ac
> index 9241261..b997646 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -494,6 +494,18 @@ else
>  fi])
>  AC_SUBST(extra_liboffloadmic_configure_flags)
>  
> +# Intelmic and intelmicemul require xxd or python.
> +case "${target}" in
> +  *-intelmic-* | *-intelmicemul-*)
> +AC_CHECK_PROG(xxd_present, xxd, "yes", "no")
> +AC_CHECK_PROG(python2_present, python2, "yes", "no")
> +AC_CHECK_PROG(python3_present, python3, "yes", "no")
> +if test "$xxd_present$python2_present$python3_present" = "nonono"; then
> +  AC_MSG_ERROR([cannot find neither xxd nor python])
> +fi
> +;;
> +esac

Why here?  I'd do something like that only in
liboffloadmic/plugin/configure.ac.  Furthermore, it is inconsistent
with what you actually use in liboffloadmic/plugin (where you look only
for python and above you only look for python[23]).

> @@ -73,7 +75,7 @@ main_target_image.h: offload_target_main
>   @echo "};" >> $@
>   @echo "extern \"C\" const MainTargetImage main_target_image = {" >> $@
>   @echo "  image_size, \"offload_target_main\"," >> $@
> - @cat $< | xxd -include >> $@
> + @if test "x$(xxd_path)" != "xno"; then cat $< | $(xxd_path) -include >> $@; else $(python_path) $(XXD_PY) $< >> $@; fi;
>   @echo "};" >> $@

I'd prefer to use $(XXD) and $(PYTHON) instead of $(xxd_path) and
$(python_path); that is more consistent with dozens of other variables
for other tools.

> --- a/liboffloadmic/plugin/configure.ac
> +++ b/liboffloadmic/plugin/configure.ac
> @@ -124,6 +124,10 @@ case ${enable_version_specific_runtime_libs} in
>  ;;
>  esac
>  
> +# Find path to xxd or python
> +AC_PATH_PROG(xxd_path, xxd, "no")
> +AC_PATH_PROG(python_path, python, "no")

I'd use
+AC_PATH_PROG(XXD, xxd, no)
+AC_PATH_PROGS(PYTHON, python python2 python3, no)
and then add the conditional AC_MSG_ERROR if
x$XXD = xno && x$PYTHON = xno

Jakub


S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Dominik Vogt
The attached patch fixes some warnings generated by the setmem...
patterns in s390.md during build and adds test cases for the
patterns.  The patch is to be applied on top of the movstr patch:
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03485.html

The test cases validate that the patterns are actually used, but
at the moment the setmem_long_and pattern is never actually used
and thus the test case would fail.  So I've split the patch in two
(both attached to this message) to activate this part of the test
once we've fixed that.

The patch has passed the SPEC2006 testsuite without any measurable
changes in performance.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.c (s390_expand_setmem): Use new expanders.
* config/s390/s390.md ("*setmem_long")
("*setmem_long_and", "*setmem_long_31z"): Fix warnings.
("setmem_long_"): New expanders.
("setmem_long"): Removed.

gcc/testsuite/ChangeLog

* gcc.target/s390/md/setmem_long-1.c: New test.
* gcc.target/s390/md/setmem_long-2.c: New test.
From 6b484cd8a9f39a38b3e990b4ac160c8254c03f6b Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 4 Nov 2015 03:16:24 +0100
Subject: [PATCH 1/1.5] S/390: Fix warnings in "*setmem_long..." patterns.

---
 gcc/config/s390/s390.c   |  7 ++-
 gcc/config/s390/s390.md  | 18 +-
 gcc/testsuite/gcc.target/s390/md/setmem_long-1.c | 20 
 gcc/testsuite/gcc.target/s390/md/setmem_long-2.c | 20 
 4 files changed, 59 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/setmem_long-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/md/setmem_long-2.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 40ee2f7..8f2396f 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5178,7 +5178,12 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
   else if (TARGET_MVCLE)
 {
   val = force_not_mem (convert_modes (Pmode, QImode, val, 1));
-  emit_insn (gen_setmem_long (dst, convert_to_mode (Pmode, len, 1), val));
+  if (TARGET_64BIT)
+	emit_insn (gen_setmem_long_di (dst, convert_to_mode (Pmode, len, 1),
+   val));
+  else
+	emit_insn (gen_setmem_long_si (dst, convert_to_mode (Pmode, len, 1),
+   val));
 }
 
   else
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 75e9af7..ed98101 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -70,6 +70,9 @@
; Copy CC as is into the lower 2 bits of an integer register
UNSPEC_CC_TO_INT
 
+   ; Convert Pmode to BLKmode
+   UNSPEC_P_TO_BLK
+
; GOT/PLT and lt-relative accesses
UNSPEC_LTREL_OFFSET
UNSPEC_LTREL_BASE
@@ -3281,11 +3284,12 @@
 
 ; Initialize a block of arbitrary length with (operands[2] % 256).
 
-(define_expand "setmem_long"
+(define_expand "setmem_long_"
   [(parallel
 [(clobber (match_dup 1))
  (set (match_operand:BLK 0 "memory_operand" "")
-  (match_operand 2 "shift_count_or_setmem_operand" ""))
+	  (unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "")]
+		  UNSPEC_P_TO_BLK))
  (use (match_operand 1 "general_operand" ""))
  (use (match_dup 3))
  (clobber (reg:CC CC_REGNUM))])]
@@ -3312,7 +3316,8 @@
 (define_insn "*setmem_long"
   [(clobber (match_operand: 0 "register_operand" "=d"))
(set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
-(match_operand 2 "shift_count_or_setmem_operand" "Y"))
+(unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
+		UNSPEC_P_TO_BLK))
(use (match_dup 3))
(use (match_operand: 1 "register_operand" "d"))
(clobber (reg:CC CC_REGNUM))]
@@ -3324,7 +3329,9 @@
 (define_insn "*setmem_long_and"
   [(clobber (match_operand: 0 "register_operand" "=d"))
(set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
-(and (match_operand 2 "shift_count_or_setmem_operand" "Y")
+(and:BLK (unspec:BLK
+	  [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
+	  UNSPEC_P_TO_BLK)
 	 (match_operand 4 "const_int_operand" "n")))
(use (match_dup 3))
(use (match_operand: 1 "register_operand" "d"))
@@ -3338,7 +3345,8 @@
 (define_insn "*setmem_long_31z"
   [(clobber (match_operand:TI 0 "register_operand" "=d"))
(set (mem:BLK (subreg:SI (match_operand:TI 3 "register_operand" "0") 4))
-(match_operand 2 "shift_count_or_setmem_operand" "Y"))
+(unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
+		UNSPEC_P_TO_BLK))
(use (match_dup 3))
(use (match_operand:TI 1 "register_operand" "d"))
(clobber (reg:CC CC_REGNUM))]
diff --git a/gcc/testsuite/gcc.target/s390/md/setmem_long-1.c b/gcc/testsuite/gcc.target/s390/md/setmem_long-1.c
new file mode 100644
index 000..9a926ce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/setmem_lo

Re: [PATCH 3/4] Add libgomp plugin for Intel MIC

2015-11-30 Thread Aleksander Ivanyushenko
On Wed, Nov 11, 2015 at 17:56:15 +0300, Aleksander Ivanyushenko wrote:
> On Mon, Aug 24, 2015 at 10:45:03 +0200, Jakub Jelinek wrote:
> > On Thu, Aug 06, 2015 at 05:34:56PM +0300, Maxim Blumental wrote:
> > >  Applied the idea with python script alternative. Review, please.
> > 
> > > 2015-07-28  Maxim Blumenthal  
> > > 
> > >   * configure.ac: Add a check for xxd or python presence when the target
> > >   is intelmic or intelmicemul.
> > >   * configure: Regenerate.
> > >   * liboffloadmic/plugin/Makefile.am: Add a condition into
> > >   make_target_image.h generating code.  This condition performs an
> > >   action with either xxd or a special python script during the
> > >   generating.
> > >   * liboffloadmic/plugin/xxd.py: New file.
> > >   * liboffloadmic/plugin/Makefile.in: Regenerate.
> > 
> > I still don't like this, there should be no `which ...` uses in the
> > Makefile.
> > Instead, use AC_CHECK_PROG/AC_CHECK_PROGS in configure.ac, for python
> > perhaps search for python python2 python3 or what is common in the python
> > land.  And prepare the command line to use in the Makefile.am in configure
> > too, then AC_SUBST it and use the variable in there (and the variable will
> > use $@ etc.).
> Maxim has left Intel so I have fixed this issue. I tried to build with and
> without xxd, so everything works fine. ok for trunk?
> 
> 2015-11-10  Aleksander Ivanushenko  
>   Maxim Blumenthal  
> 
>   * configure.ac: Add xxd and python check for intelmic and
>   intelmicemul.
>   * configure: Regenerate.
> 
> liboffloadmic/
> 2015-11-10  Aleksander Ivanushenko  
>   Maxim Blumenthal  
>   David Malcolm  
> 
>   * plugin/xxd.py: New file.
>   * plugin/configure.ac: Add searching for xxd and python paths.
>   * plugin/Makefile.am: Add python script usage in case when xxd is not
>   available.
>   * plugin/configure: Regenerate.
>   * plugin/Makefile.in: Regenerate.
> 
>
Ping. 


Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-30 Thread Yuri Rumyantsev
Richard,

Thanks a lot for your detailed comments!

A few words about the 436.cactusADM gain: the loop that was transformed
for AVX2 is very large, and it is the last innermost loop in routine
Bench_StaggeredLeapfrog2 (StaggeredLeapfrog2.F #366). If you don't
have the sources, let me know.

Yuri.

2015-11-27 16:45 GMT+03:00 Richard Biener :
> On Fri, Nov 13, 2015 at 11:35 AM, Yuri Rumyantsev  wrote:
>> Hi Richard,
>>
>> Here is an updated version of the patch which (1) is in sync with the trunk
>> compiler and (2) contains a simple cost model to estimate the profitability
>> of scalar epilogue elimination. The part related to vectorization of
>> loops with small trip count is still under development. Note that the
>> implemented cost model was not tuned well for HASWELL and KNL, but we
>> got a ~6% speed-up on 436.cactusADM from the spec2006 suite for HASWELL.
>
> Ok, so I don't know where to start with this.
>
> First of all, while I wanted to have the actual stmt processing to be
> post-processing on the vectorized loop body, I didn't want to have this
> completely separated from vectorizing.
>
> So, do combine_vect_loop_remainder () from vect_transform_loop, not by
> iterating over all (vectorized) loops at the end.
>
> Second, all the adjustments of the number of iterations for the vector
> loop should be integrated into the main vectorization scheme, as should
> determining the cost of the predication.  So you'll end up adding a
> LOOP_VINFO_MASK_MAIN_LOOP_FOR_EPILOGUE flag, determined during
> cost analysis, and during code generation adjust the vector iteration
> computation accordingly and _not_ generate the epilogue loop (or wire it
> up correctly in the first place).
>
> The actual stmt processing should then still happen in a similar way as
> you do.
>
> So I'm going to comment on that part only as I expect the rest will look a lot
> different.
>
> +/* Generate induction_vector which will be used to mask evaluation.  */
> +
> +static tree
> +gen_vec_induction (loop_vec_info loop_vinfo, unsigned elem_size, unsigned size)
> +{
>
> Please make use of create_iv.  Add more comments.  I reverse-engineered
> that you add a { { 0, ..., vf }, +, { vf, ..., vf } } IV which you use
> in gen_mask_for_remainder by comparing it against { niter, ..., niter }.
>
> +  gsi = gsi_after_labels (loop->header);
> +  niters = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> +  ? LOOP_VINFO_NITERS (loop_vinfo)
> +  : LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo);
>
> That's either wrong or unnecessary.  If there is no peeling for alignment,
> LOOP_VINFO_NITERS is equal to LOOP_VINFO_NITERS_UNCHANGED.
>
> +  ptr = build_int_cst (reference_alias_ptr_type (ref), 0);
> +  if (!SSA_NAME_PTR_INFO (addr))
> +   copy_ref_info (build2 (MEM_REF, TREE_TYPE (ref), addr, ptr), ref);
>
> vect_duplicate_ssa_name_ptr_info.
>
> +
> +static void
> +fix_mask_for_masked_ld_st (vec *masked_stmt, tree mask)
> +{
> +  gimple *stmt, *new_stmt;
> +  tree old, lhs, vectype, var, n_lhs;
>
> No comment?  What's this for?
>
> +/* Convert vectorized reductions to VEC_COND statements to preserve
> +   reduction semantic:
> +   s1 = x + s2 --> t = x + s2; s1 = (mask)? t : s2.  */
> +
> +static void
> +convert_reductions (loop_vec_info loop_vinfo, tree mask)
> +{
>
> for reductions it looks like preserving the last iteration x plus the mask
> could avoid predicating it this way and compensate in the reduction
> epilogue by "subtracting" x & mask?  With true predication support
> that'll likely be more expensive of course.
>
> +  /* Generate new VEC_COND expr.  */
> +  vec_cond_expr = build3 (VEC_COND_EXPR, vectype, mask, new_lhs, rhs);
> +  new_stmt = gimple_build_assign (lhs, vec_cond_expr);
>
> gimple_build_assign (lhs, VEC_COND_EXPR, vectype, mask, new_lhs, rhs);
>
> +/* Return true if MEM_REF is incremented by vector size and false
> +   otherwise.  */
> +
> +static bool
> +mem_ref_is_vec_size_incremented (loop_vec_info loop_vinfo, tree lhs)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>
> what?!  Just look at DR_STEP of the store?
>
>
> +void
> +combine_vect_loop_remainder (loop_vec_info loop_vinfo)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  auto_vec loads;
> +  auto_vec stores;
>
> so you need to re-structure this in a way that it computes
>
>   a) whether it can perform the operation - and you need to do that
>   reliably before the operation has taken place
>   b) its cost
>
> instead of looking at def types or gimple_assign_load/store_p predicates
> please look at STMT_VINFO_TYPE instead.
>
> I don't like the new target hook for the costing.  We do need some major
> re-structuring in the vectorizer cost model implementation, and this doesn't
> go in the right direction.
>
> A simplistic hook following the current scheme would have used
> the vect_cost_for_stmt as argument and mirrored builtin_vectorization_cost.
>
> There is not a single testcase in the patch.  I would have expected one that
> makes sur
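
The masking scheme discussed in the review above can be modelled in
scalar C.  This is a sketch under stated assumptions, not the patch's
code: VF, masked_sum and the per-lane loop are invented for
illustration.  Each "vector" iteration forms the induction vector
elements base + lane, compares them against the iteration count to get a
mask, and applies the reduction as s1 = mask ? t : s2 so that inactive
lanes leave the accumulator untouched.

```c
#include <stddef.h>

#define VF 4   /* assumed vectorization factor */

/* Sum the first NITER elements of X using VF-wide "vector" steps with
   no scalar epilogue.  The load is masked as well, so X only needs
   NITER elements.  */
static long
masked_sum (const int *x, size_t niter)
{
  long acc[VF] = { 0 };

  for (size_t base = 0; base < niter; base += VF)
    for (size_t lane = 0; lane < VF; lane++)
      {
        size_t iv = base + lane;          /* induction vector element */
        int mask = iv < niter;            /* compare with { niter, ... } */
        int xv = mask ? x[iv] : 0;        /* masked load */
        long t = acc[lane] + xv;          /* t = x + s2 */
        acc[lane] = mask ? t : acc[lane]; /* s1 = mask ? t : s2 */
      }

  long sum = 0;                           /* reduction epilogue */
  for (size_t lane = 0; lane < VF; lane++)
    sum += acc[lane];
  return sum;
}
```

For an addition the select is redundant once the load is masked to zero;
it is kept here because it mirrors the VEC_COND form quoted in the
review, which also works for reductions without a neutral element.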

Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Jakub Jelinek wrote:

> On Mon, Nov 30, 2015 at 02:30:04PM +, Richard Sandiford wrote:
> > > keep the builtin_reciprocal hook (perhaps renamed to builtin_rsqrt)
> > > for the purpose of this condition and nothing else (i.e. return a
> > > boolean) and let the rest be determined from the optab, just commit
> > > the already posted patch, something else?
> > 
> > ...I suppose the problem with adding extra conditions to the expander
> > is that it would break cases where the expander is used for target
> > built-ins too.
> > 
> > Maybe optabs shouldn't be used for built-ins if the usage conditions
> > aren't the same.  But if that's fighting too much against existing usage,
> > the hook "hack" could check these conditions too.
> 
> Yeah, I'm aware that the target builtins use those expanders with the
> current conditions and so would need to be renamed to something different
> if we take the approach of adding the conditions to all rsqrt* expanders.
> 
> So, maybe it is best if I just apply my original patch right away so that
> the bug is fixed and we can continue discussions on how we want to handle
> it.

Yes, I see the IFN idea as a follow-up improvement; go with
your original patch for now.

Richard.


[PTX] rework PTX prototype emission

2015-11-30 Thread Nathan Sidwell
This patch moves all PTX prototype emission from a DECL into the renamed
'write_fn_proto'.  Thus we no longer need write_function_decl_and_comment, nor
does nvptx_declare_function_name need to emit the linker comment marker itself
(which is nearly identical to the one that was emitted by
write_function_decl_and_comment).


While there I made the handling of name replacement consistent between the two
prototype emitters and tidied up the argument emission of write_fn_proto.
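
The separator-passing idiom behind the argument tidy-up can be shown in
isolation.  The sketch below is plain C with made-up names (emit_arg,
emit_params); it is not the nvptx.c code, which writes to a
std::stringstream, and the opening "(" as initial separator is an
assumption of this sketch.  Instead of testing "if (i)" before each
argument, the caller hands in the separator to print and switches it to
", " after the first argument.

```c
#include <stdio.h>
#include <string.h>

/* Append one parameter to BUF, preceded by SEP.  Returns the index for
   the next parameter.  Hypothetical helper, not the nvptx.c function.
   SIZE must be non-zero and BUF NUL-terminated.  */
static int
emit_arg (char *buf, size_t size, const char *sep, int i, const char *type)
{
  size_t len = strlen (buf);
  snprintf (buf + len, size - len, "%s.param%s %%in_ar%d", sep, type, i);
  return i + 1;
}

/* Emit N parameters of the given PTX-like types as "(a, b, c)".  */
static void
emit_params (char *buf, size_t size, const char *const *types, int n)
{
  const char *sep = "(";
  int i = 0;

  buf[0] = '\0';
  for (int k = 0; k < n; k++)
    {
      i = emit_arg (buf, size, sep, i, types[k]);
      sep = ", ";             /* every later argument gets a comma */
    }

  /* Close the list; an empty list still needs its opening paren.  */
  size_t len = strlen (buf);
  snprintf (buf + len, size - len, "%s", n ? ")" : "()");
}
```

Passing the separator also lets a split argument emit its two halves
with the right punctuation without knowing its position in the list.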


nathan
2015-11-30  Nathan Sidwell  

	* config/nvptx/nvptx.c (nvptx_name_replacement): Move earlier.
	(write_one_arg): Reorder parms, add 'sep' param.
	(nvptx_write_function_decl): Rename to ...
	(write_fn_proto): ... here.  Do name replacement. Emit linker
	comment marker. Deal with both decls and defns. Simplify argument
	formatting.
	(write_function_decl_and_comment): Delete.
	(write_func_decl_from_insn): Rename to ...
	(write_fn_proto_from_insn): ... here.  Don't do name replacement.
	(nvptx_record_fndecl): Call write_fn_proto.
	(nvptx_record_libfunc): Call write_fn_proto_from_insn.
	(nvptx_declare_function_name): Adjust for write_fn_proto changes.
	(nvptx_output_call_insn): Call write_fn_proto_from_insn.

Index: gcc/config/nvptx/nvptx.c
===
--- gcc/config/nvptx/nvptx.c	(revision 231072)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -224,6 +224,24 @@ nvptx_addr_space_from_sym (rtx sym)
   return ADDR_SPACE_GLOBAL;
 }
 
+/* Check NAME for special function names and redirect them by returning a
+   replacement.  This applies to malloc, free and realloc, for which we
+   want to use libgcc wrappers, and call, which triggers a bug in ptxas.  */
+
+static const char *
+nvptx_name_replacement (const char *name)
+{
+  if (strcmp (name, "call") == 0)
+return "__nvptx_call";
+  if (strcmp (name, "malloc") == 0)
+return "__nvptx_malloc";
+  if (strcmp (name, "free") == 0)
+return "__nvptx_free";
+  if (strcmp (name, "realloc") == 0)
+return "__nvptx_realloc";
+  return name;
+}
+
 /* If MODE should be treated as two registers of an inner mode, return
that inner mode.  Otherwise return VOIDmode.  */
 
@@ -309,8 +327,8 @@ arg_promotion (machine_mode mode)
a decl with zero TYPE_ARG_TYPES, i.e. an old-style C decl.  */
 
 static int
-write_one_arg (std::stringstream &s, tree type, int i, machine_mode mode,
-	   bool no_arg_types)
+write_one_arg (std::stringstream &s, const char *sep, int i,
+	   tree type, machine_mode mode, bool no_arg_types)
 {
   if (!PASS_IN_REG_P (mode, type))
 mode = Pmode;
@@ -318,9 +336,9 @@ write_one_arg (std::stringstream &s, tre
   machine_mode split = maybe_split_mode (mode);
   if (split != VOIDmode)
 {
-  i = write_one_arg (s, NULL_TREE, i, split, false);
-  i = write_one_arg (s, NULL_TREE, i, split, false);
-  return i;
+  i = write_one_arg (s, sep, i, TREE_TYPE (type), split, false);
+  sep = ", ";
+  mode = split;
 }
 
   if (no_arg_types && !AGGREGATE_TYPE_P (type))
@@ -330,8 +348,7 @@ write_one_arg (std::stringstream &s, tre
   mode = arg_promotion (mode);
 }
 
-  if (i)
-s << ", ";
+  s << sep;
   s << ".param" << nvptx_ptx_type_from_mode (mode, false) << " %in_ar"
 << i << (mode == QImode || mode == HImode ? "[1]" : "");
   if (mode == BLKmode)
@@ -349,41 +366,41 @@ write_as_kernel (tree attrs)
 	  || lookup_attribute ("omp target entrypoint", attrs) != NULL_TREE);
 }
 
-/* Write a function decl for DECL to S, where NAME is the name to be used.
-   This includes ptx .visible or .extern specifiers, .func or .kernel, and
-   argument and return types.  */
+/* Write a .func or .kernel declaration or definition along with
+   a helper comment for use by ld.  S is the stream to write to, DECL
+   the decl for the function with name NAME.   For definitions, emit
+   a declaration too.  */
 
-static void
-nvptx_write_function_decl (std::stringstream &s, const char *name, const_tree decl)
+static const char *
+write_fn_proto (std::stringstream &s, bool is_defn,
+		const char *name, const_tree decl)
 {
-  tree fntype = TREE_TYPE (decl);
-  tree result_type = TREE_TYPE (fntype);
-  tree args = TYPE_ARG_TYPES (fntype);
-  tree attrs = DECL_ATTRIBUTES (decl);
-  bool kernel = write_as_kernel (attrs);
-  bool is_main = strcmp (name, "main") == 0;
-  bool args_from_decl = false;
-
-  /* We get:
- NULL in TYPE_ARG_TYPES, for old-style functions
- NULL in DECL_ARGUMENTS, for builtin functions without another
-   declaration.
- So we have to pick the best one we have.  */
-  if (args == 0)
+  if (is_defn)
+/* Emit a declaration. The PTX assembler gets upset without it.   */
+name = write_fn_proto (s, false, name, decl);
+  else
 {
-  args = DECL_ARGUMENTS (decl);
-  args_from_decl = true;
+  /* Avoid repeating the name replacement.  */
+  name = nvptx_name_replacement (name);
+  if (name[0] == '*')
+	name++;
 }
 
+  /* Emit the linker mar

Re: [PATCH] S/390: Fix warning in "*movstr" pattern.

2015-11-30 Thread Dominik Vogt
On Mon, Nov 09, 2015 at 01:33:23PM +0100, Andreas Krebbel wrote:
> On 11/04/2015 02:39 AM, Dominik Vogt wrote:
> > On Tue, Nov 03, 2015 at 06:47:28PM +0100, Ulrich Weigand wrote:
> >> Dominik Vogt wrote:
> >>
> >>> @@ -2936,7 +2936,7 @@
> >>> (set (mem:BLK (match_operand:P 1 "register_operand" "0"))
> >>>   (mem:BLK (match_operand:P 3 "register_operand" "2")))
> >>> (set (match_operand:P 0 "register_operand" "=d")
> >>> - (unspec [(mem:BLK (match_dup 1))
> >>> + (unspec:P [(mem:BLK (match_dup 1))
> >>>(mem:BLK (match_dup 3))
> >>>(reg:SI 0)] UNSPEC_MVST))
> >>> (clobber (reg:CC CC_REGNUM))]
> >>
> >> Don't you have to change the expander too?  Otherwise the
> >> pattern will no longer match ...
> > 
> > Yes, you're right.  This turned out to be a bit tricky to do
> > because the "movstr" expander doesn't allow variants with
> > different modes.  :-/
> > 
> > New patch attached, including a test case that works on 31-bit and
> > 64-bit.
> 
> Could you please check that the generated code doesn't change with a larger
> code base (e.g. speccpu)?  It should not affect it but I really think we
> omitted the mode here for a reason (although I don't remember why).

The attached patch contains a little cleanup in the s390.exp
script and a 31-bit test case.  This has successfully passed the
SPEC2006 testsuite without measurable changes in performance.  In
my view this is now ready to be committed.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.md ("movstr", "*movstr"): Fix warning.
("movstr"): New indirect expanders used by "movstr".

gcc/testsuite/ChangeLog

* gcc.target/s390/md/movstr-1.c: New test.
* gcc.target/s390/s390.exp: Add subdir md.
Do not run hotpatch tests twice.
From 7680c94169918aa22b10d923b18c676e59506d4d Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Tue, 3 Nov 2015 18:03:02 +0100
Subject: [PATCH] S/390: Fix warning in "*movstr" pattern.

---
 gcc/config/s390/s390.md | 20 +---
 gcc/testsuite/gcc.target/s390/md/movstr-1.c | 11 +++
 gcc/testsuite/gcc.target/s390/s390.exp  | 25 -
 3 files changed, 48 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/movstr-1.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index f2bb24c..75e9af7 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2910,13 +2910,27 @@
 ;
 
 (define_expand "movstr"
+  ;; The pattern is never generated.
+  [(match_operand 0 "" "")
+   (match_operand 1 "" "")
+   (match_operand 2 "" "")]
+  ""
+{
+  if (TARGET_64BIT)
+emit_insn (gen_movstrdi (operands[0], operands[1], operands[2]));
+  else
+emit_insn (gen_movstrsi (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
+(define_expand "movstr"
   [(set (reg:SI 0) (const_int 0))
(parallel
 [(clobber (match_dup 3))
  (set (match_operand:BLK 1 "memory_operand" "")
 	  (match_operand:BLK 2 "memory_operand" ""))
- (set (match_operand 0 "register_operand" "")
-	  (unspec [(match_dup 1)
+ (set (match_operand:P 0 "register_operand" "")
+	  (unspec:P [(match_dup 1)
 		   (match_dup 2)
 		   (reg:SI 0)] UNSPEC_MVST))
  (clobber (reg:CC CC_REGNUM))])]
@@ -2937,7 +2951,7 @@
(set (mem:BLK (match_operand:P 1 "register_operand" "0"))
 	(mem:BLK (match_operand:P 3 "register_operand" "2")))
(set (match_operand:P 0 "register_operand" "=d")
-	(unspec [(mem:BLK (match_dup 1))
+	(unspec:P [(mem:BLK (match_dup 1))
 		 (mem:BLK (match_dup 3))
 		 (reg:SI 0)] UNSPEC_MVST))
(clobber (reg:CC CC_REGNUM))]
diff --git a/gcc/testsuite/gcc.target/s390/md/movstr-1.c b/gcc/testsuite/gcc.target/s390/md/movstr-1.c
new file mode 100644
index 000..3429054
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/movstr-1.c
@@ -0,0 +1,11 @@
+/* Machine description pattern tests.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-dP -save-temps" } */
+
+void test(char *dest, const char *src)
+{
+  __builtin_stpcpy (dest, src);
+}
+
+/* { dg-final { scan-assembler-times "\{\\*movstr\}" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp
index 0b8f80ed..0d7a7eb 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -61,20 +61,35 @@ if ![info exists DEFAULT_CFLAGS] then {
 # Initialize `dg'.
 dg-init
 
-set hotpatch_tests $srcdir/$subdir/hotpatch-\[0-9\]*.c
+set md_tests $srcdir/$subdir/md/*.c
 
 # Main loop.
 dg-runtest [lsort [prune [glob -nocomplain $srcdir/$subdir/*.\[cS\]] \
-			 $hotpatch_tests]] "" $DEFAULT_CFLAGS
+			 $md_tests]] "" $DEFAULT_CFLAGS
 
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*vector*/*.\[cS\]]] \
 	"" $DEFAULT_CFLAGS
 
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/md/*.\[cS\]]] \
+	"" $DEFAULT_CFLAGS
+
 # Additional hotpatch torture tests.
 torture-init
-set HOTPATCH_TEST_OPTS [list -Os

[PATCH] Fix PR68592

2015-11-30 Thread Richard Biener

The following fixes PR68592 where I forgot the pattern def seq when
resetting SLP type.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-11-30  Richard Biener  

PR tree-optimization/68592
* tree-vect-loop.c (vect_analyze_loop_2): Reset SLP type also
on the pattern def sequence.

* gfortran.dg/pr68592.f: New testcase.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 231065)
+++ gcc/tree-vect-loop.c(working copy)
@@ -2178,6 +2178,13 @@ again:
{
  gcc_assert (STMT_SLP_TYPE (stmt_info) == loop_vect);
  stmt_info = vinfo_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info));
+ for (gimple_stmt_iterator pi
+= gsi_start (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
+  !gsi_end_p (pi); gsi_next (&pi))
+   {
+ gimple *pstmt = gsi_stmt (pi);
+ STMT_SLP_TYPE (vinfo_for_stmt (pstmt)) = loop_vect;
+   }
}
  STMT_SLP_TYPE (stmt_info) = loop_vect;
}
Index: gcc/testsuite/gfortran.dg/pr68592.f
===
--- gcc/testsuite/gfortran.dg/pr68592.f (revision 0)
+++ gcc/testsuite/gfortran.dg/pr68592.f (working copy)
@@ -0,0 +1,20 @@
+! PR tree-optimization/68592
+! { dg-do compile }
+! { dg-require-profiling "-fprofile-generate" }
+! { dg-options "-Ofast -fprofile-generate" }
+! { dg-additional-options "-mavx" { target x86_64-*-* i?86-*-* } }
+  PARAMETER (MXCPGA=320,ZERO=0.0)
+  DIMENSION CPNORM(MXCPGA),CDNORM(MXCPGA),
+ *  CFNORM(MXCPGA)  
+ KTYPIL= KTYPI()
+ DO 84 K=1,NOGTF
+   LMP=LMP+1
+   CFNORM(LMP)=ZERO
+   IF (KTYPIL.EQ.1) LMP=CMPILMP
+   IF (KTYPIL.EQ.2) CPNORM(LMP)=CMPILMP
+   IF (KTYPIL.EQ.3) CDNORM(LMP)=CMPILMP
+   IF (KTYPIL.EQ.4) LMP=CMPILMP
+   IF (KTYPIL.EQ.6) LMP=CMPILMP
+   84CONTINUE
+ CALL MMPNOR(CPNORM,CDNORM,CFNORM) 
+  END


Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 02:30:04PM +, Richard Sandiford wrote:
> > keep the builtin_reciprocal hook (perhaps renamed to builtin_rsqrt)
> > for the purpose of this condition and nothing else (i.e. return a
> > boolean) and let the rest be determined from the optab, just commit
> > the already posted patch, something else?
> 
> ...I suppose the problem with adding extra conditions to the expander
> is that it would break cases where the expander is used for target
> built-ins too.
> 
> Maybe optabs shouldn't be used for built-ins if the usage conditions
> aren't the same.  But if that's fighting too much against existing usage,
> the hook "hack" could check these conditions too.

Yeah, I'm aware that the target builtins use those expanders with the
current conditions and so would need to be renamed to something different
if we take the approach of adding the conditions to all rsqrt* expanders.

So, maybe it is best if I just apply my original patch right away so that
the bug is fixed and we can continue discussions on how we want to handle
it.

Jakub


Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Richard Sandiford
Jakub Jelinek  writes:
> On Sat, Nov 28, 2015 at 09:38:40AM +0100, Jakub Jelinek wrote:
>> On Sat, Nov 28, 2015 at 08:47:18AM +0100, Richard Biener wrote:
>> > On November 27, 2015 8:40:56 PM GMT+01:00, Jakub Jelinek
>> >  wrote:
>> > >The recent changes where vector sqrt is represented in the IL using
>> > >IFN_SQRT instead of target specific builtins broke the discovery
>> > >of vector rsqrt, as targetm.builtin_reciprocal is called only
>> > >on builtin functions (not internal functions).  Furthermore,
>> > >for internal fns, not only the IFN_* is significant, but also the
>> > >types (modes actually) of the lhs and/or arguments.
>> > >
>> > >This patch adjusts the target hook, so that the backends can just
>> > >inspect
>> > >the call (builtin or internal function), whatever it is.
>> > >
>> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>> > 
>> > OK.  Though the other option would be to add an optab with
>> > corresponding IFN.
>> 
>> Yeah, I've been thinking about IFN_RSQRT and rsqrt optab, perhaps that is
>> cleaner and the target hook could go away completely.
>
> So, had a look at this, and the only issue I see is that the various
> targetm.builtin_reciprocal implementations start with various fancy
> conditions:
>   if (! (TARGET_SSE_MATH && !optimize_insn_for_size_p ()
>  && flag_finite_math_only && !flag_trapping_math
>  && flag_unsafe_math_optimizations))
> return NULL_TREE;
> on i?86,
>   if (optimize_insn_for_size_p ())
> return NULL_TREE;
> on rs6000 and
>   if (flag_trapping_math
>   || !flag_unsafe_math_optimizations
>   || optimize_size
>   || ! (aarch64_tune_params.extra_tuning_flags
>& AARCH64_EXTRA_TUNE_RECIP_SQRT))
> on aarch64.  The recip pass is only guarded by its gate, which doesn't say
> anything from the above.
> So, shall I move these conditions to the rsqrt2 expanders
> (but not sure if e.g. the tuning flags or !optimize_size or
> !optimize_insn_for_size_p () is appropriate for expander conditions),

The size conditions are the same problem as PR68432.  I'm not sure
whether my series for that PR has been officially rejected or not.
If it has then I'll need to hack around it some other way,
e.g. by having a target hook that says whether an optab should be
used when optimising for size or speed.

Maybe that isn't such a hack given...

> keep the builtin_reciprocal hook (perhaps renamed to builtin_rsqrt)
> for the purpose of this condition and nothing else (i.e. return a
> boolean) and let the rest be determined from the optab, just commit
> the already posted patch, something else?

...I suppose the problem with adding extra conditions to the expander
is that it would break cases where the expander is used for target
built-ins too.

Maybe optabs shouldn't be used for built-ins if the usage conditions
aren't the same.  But if that's fighting too much against existing usage,
the hook "hack" could check these conditions too.

E.g.:

  bool
  targetm.optab_supported_p (int optab_tag, machine_mode mode1,
 machine_mode mode2, optimization_type opt_type)

where mode1 and mode2 are the optab modes (only one needed for direct
optabs) and where optimization_type is the type from my PR68432 series.
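
As a rough illustration of how such a hook could fold the target
conditions into one query, here is a minimal stand-alone sketch.  It is
not the GCC interface: `optimization_type`, `struct math_flags` and
`rsqrt_optab_supported_p` are hypothetical names, and the condition
mirrors only the i?86 test quoted earlier in this message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the optimization type from the PR68432
   series.  */
enum optimization_type { OPTIMIZE_FOR_SPEED, OPTIMIZE_FOR_SIZE };

/* Hypothetical snapshot of the flags the quoted i?86 condition
   inspects.  */
struct math_flags
{
  bool sse_math;                  /* Models TARGET_SSE_MATH.  */
  bool finite_math_only;          /* Models flag_finite_math_only.  */
  bool trapping_math;             /* Models flag_trapping_math.  */
  bool unsafe_math_optimizations; /* Models flag_unsafe_math_optimizations.  */
};

/* Return true if the rsqrt optab may be used under FLAGS when
   optimizing for OPT_TYPE.  */
bool
rsqrt_optab_supported_p (const struct math_flags *flags,
                         enum optimization_type opt_type)
{
  return (flags->sse_math
          && opt_type != OPTIMIZE_FOR_SIZE
          && flags->finite_math_only
          && !flags->trapping_math
          && flags->unsafe_math_optimizations);
}
```

With a query of this shape the expander itself would stay unconditional,
so target built-ins that reuse it would not be affected.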

Thanks,
Richard



Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Jakub Jelinek
On Sat, Nov 28, 2015 at 09:38:40AM +0100, Jakub Jelinek wrote:
> On Sat, Nov 28, 2015 at 08:47:18AM +0100, Richard Biener wrote:
> > On November 27, 2015 8:40:56 PM GMT+01:00, Jakub Jelinek  
> > wrote:
> > >The recent changes where vector sqrt is represented in the IL using
> > >IFN_SQRT instead of target specific builtins broke the discovery
> > >of vector rsqrt, as targetm.builtin_reciprocal is called only
> > >on builtin functions (not internal functions).  Furthermore,
> > >for internal fns, not only the IFN_* is significant, but also the
> > >types (modes actually) of the lhs and/or arguments.
> > >
> > >This patch adjusts the target hook, so that the backends can just
> > >inspect
> > >the call (builtin or internal function), whatever it is.
> > >
> > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > OK.  Though the other option would be to add an optab with corresponding 
> > IFN.
> 
> Yeah, I've been thinking about IFN_RSQRT and rsqrt optab, perhaps that is
> cleaner and the target hook could go away completely.

So, had a look at this, and the only issue I see is that the various
targetm.builtin_reciprocal implementations start with various fancy
conditions:
  if (! (TARGET_SSE_MATH && !optimize_insn_for_size_p ()
 && flag_finite_math_only && !flag_trapping_math
 && flag_unsafe_math_optimizations))
return NULL_TREE;
on i?86,
  if (optimize_insn_for_size_p ())
return NULL_TREE;
on rs6000 and
  if (flag_trapping_math
  || !flag_unsafe_math_optimizations
  || optimize_size
  || ! (aarch64_tune_params.extra_tuning_flags
   & AARCH64_EXTRA_TUNE_RECIP_SQRT))
on aarch64.  The recip pass is only guarded by its gate, which doesn't say
anything from the above.
So, shall I move these conditions to the rsqrt2 expanders
(but not sure if e.g. the tuning flags or !optimize_size or
!optimize_insn_for_size_p () is appropriate for expander conditions), keep
the builtin_reciprocal hook (perhaps renamed to builtin_rsqrt) for the
purpose of this condition and nothing else (i.e. return a boolean) and let
the rest be determined from the optab, just commit the already posted patch,
something else?

Jakub


Re: Fix verify_type ICE during Ada bootstrap

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Jan Hubicka wrote:

> Hi,
> here is an updated patch which bootstraps & regtests, lto-bootstraps x86_64-linux
> and also works for Firefox.  The basic pain is to identify which calls to
> get_alias_set are used to build alias sets themselves and thus must be made
> -fstrict-aliasing independent and which are used to drive queries to the oracle
> and thus should follow -fstrict-aliasing.  Fortunately the sanity checking I
> added seems pretty effective at catching bugs in this area: either we get an ICE
> in tree-streamer-out.c because the alias set is 0 when it is not expected to be,
> or we get an ICE in record_component_aliases because the alias set of a component
> gets 0.
> 
> OK?

I think you are doing too many things in one patch.  I'm fine with
dropping the zero-alias-set streaming (but I'd rather not assert
as FE get_alias_set langhook may assign zero to random tree nodes).

I'm also fine with handling flag_strict_aliasing conservatively
during inlining - but the condition you placed on this handling
needs a comment.  I couldn't decipher it ;)
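
For reference, the conservative rule itself (leaving aside the extra
condition in the patch, which is not reproduced here) could be sketched
as below.  `struct fn_options` and `merge_strict_aliasing_for_inline`
are hypothetical names used only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical per-function option record.  */
struct fn_options
{
  bool strict_aliasing;   /* Models flag_strict_aliasing for the function.  */
};

/* Model of inlining CALLEE into CALLER: if the callee was compiled
   without strict aliasing, the caller has to give it up as well.
   Return true if the caller's setting was dropped.  */
bool
merge_strict_aliasing_for_inline (struct fn_options *caller,
                                  const struct fn_options *callee)
{
  if (caller->strict_aliasing && !callee->strict_aliasing)
    {
      caller->strict_aliasing = false;
      return true;
    }
  return false;
}
```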

> +  if (dump_file)
> + fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n",
> +  to->name (), to->order);

So I wonder if it makes sense to pessimize such inlining as well.

The two above should be enough to fix the correctness issue.

The parse_optimize_options hack looks indeed interesting, but we solved
the issue differently by

2014-11-27  Richard Biener  

PR middle-end/63704
* alias.c (mems_in_disjoint_alias_sets_p): Remove assert
and instead return false when !fstrict-aliasing.

So the hack can be removed as a separate commit after the first one
above.  This should make optimize("fno-strict-aliasing") work.
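
A toy model of what that change means for the disjointness query.  This
is a sketch only: the real conflict test also walks alias-set subsets,
which is omitted here, and the `toy_` names are hypothetical.

```c
#include <stdbool.h>

typedef int alias_set_type;

/* Simplified conflict test: alias set 0 conflicts with everything,
   otherwise only identical sets conflict.  Subset relationships are
   deliberately left out of this model.  */
bool
toy_alias_sets_conflict_p (alias_set_type set1, alias_set_type set2)
{
  return set1 == 0 || set2 == 0 || set1 == set2;
}

/* Two memory references can be treated as disjoint only when strict
   aliasing is enabled and their alias sets do not conflict.  Without
   strict aliasing the answer is always false, with no assert.  */
bool
toy_mems_in_disjoint_alias_sets_p (alias_set_type set1, alias_set_type set2,
                                   bool strict_aliasing)
{
  return strict_aliasing && !toy_alias_sets_conflict_p (set1, set2);
}
```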


I don't really see why we need all the other changes and IMHO the
get_alias_set interface change is ugly and fragile.  And this doesn't
look like sth for stage3.

Thus please split the patch up.

Thanks,
Richard.

> Honza
> 
>   * tree.c (free_lang_data): Pass true to get_alias_set.
>   * tree-streamer-in.c (unpack_ts_type_common_value_fields): Do not stream
>   alias set.
>   * tree-ssa-alias.c (ao_ref_base_alias_set, ao_ref_alias_set): Pass true
>   to get_alias_set; comment.
>   (same_type_for_tbaa): Likewise.
>   * alias.c (alias_set_subset_of, alias_sets_conflict_p): When strict
>   aliasing is disabled, return true.
>   (get_alias_set): New parameter strict.
>   (new_alias_set): Always produce new alias set.
>   (record_component_aliases): Pass true to get_alias_set.
>   * alias.h (get_alias_set): New optional parameter STRICT.
>   * lto-streamer-out.c (hash_tree): Do not hash alias set.
>   * ipa-inline-transform.c (inline_call): Drop strict aliasing of
>   caller if needed.
>   * ipa-icf-gimple.c (func_checker::compatible_types_p): Pass true
>   to get_alias_set.
>   * tree-streamer-out.c (pack_ts_type_common_value_fields): Do not
>   stream TYPE_ALIAS_SET; sanity check that alias set 0 at LTO time will
>   match what frontends do.
>   * fold-const.c (operand_equal_p): Be careful about TBAA info before
>   inlining even with -fno-strict-aliasing.
>   * gimple.c (gimple_get_alias_set): Pass true to get_alias_set.
> 
>   * misc.c (gnat_get_alias_set): Pass true to get_alias_set.
>   * utils.c (relate_alias_sets): Likewise.
>   * trans.c (validate_unchecked_conversion): Likewise.
> 
>   * lto-symtab.c (warn_type_compatibility_p): Pass true to get_alias_set.
>   * lto.c (compare_tree_sccs_1): Do not compare TYPE_ALIAS_SET.
> 
>   * gcc.c-torture/execute/alias-1.c: New testcase.
>   * gcc.dg/lto/alias-1_0.c: New testcase.
>   * gcc.dg/lto/alias-1_1.c: New testcase.
> 
>   * c-common.c (parse_optimize_options): Remove hack about
>   flag_strict_aliasing.
>   (convert_vector_to_pointer_for_subscript): Pass true to get_alias_set.
> 
>   * cp-objcp-common.c (cxx_get_alias_set): Pass true to get_alias_set.
>   
>   * rtti.c (typeid_ok_p): Pass true to get_alias_set.
> Index: tree.c
> ===
> --- tree.c(revision 231020)
> +++ tree.c(working copy)
> @@ -5971,7 +5971,8 @@ free_lang_data (void)
>   while the slots are still in the way the frontends generated them.  */
>for (i = 0; i < itk_none; ++i)
>  if (integer_types[i])
> -  TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i]);
> +  TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i],
> +  true);
>  
>/* Traverse the IL resetting language specific information for
>   operands, expressions, etc.  */
> Index: cp/rtti.c
> ===
> --- cp/rtti.c (revision 231020)
> +++ cp/rtti.c (working copy)
> @@ -300,10 +300,10 @@ typeid_ok_p (void)
>/* Make su

Re: [PATCH] rs6000_adjust_cost old thinko

2015-11-30 Thread David Edelsohn
On Mon, Nov 30, 2015 at 4:44 AM, Eric Botcazou  wrote:
>> Note this also is wrong on PA and one of the SPARC adjust_cost macros.
>
> Thanks for the heads up, fixed thusly, applied on the mainline
>
>
> PR target/28115
> * config/sparc/sparc.c (supersparc_adjust_cost): Fix thinko.
> (sparc_adjust_cost): Add missing space.

Eric,

FYI, the function should test recog_memoized (dep_insn) also.

- David


Re: [Patch, fortran] PR68534 - No error on mismatch in number of arguments between submodule and module interface

2015-11-30 Thread Paul Richard Thomas
Committed as revision 231072.

Thanks for the review

Paul

On 28 November 2015 at 17:19, Steve Kargl
 wrote:
> On Sat, Nov 28, 2015 at 11:35:54AM +0100, Paul Richard Thomas wrote:
>> +
>> +   /* Abreviated module procedure declaration is not meant to have any
>
> s/Abreviated/Abbreviated
>
>> +  formal arguments!  */
>> +   if (!sym->abr_modproc_decl && formal && !head)
>> + arg_count_mismatch = true;
>> +
>
> OK to commit.
>
> --
> Steve



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 02:24:18PM +0100, Richard Biener wrote:
> > OK for stage3 trunk if bootstrap and reg-test succeeds?
> 
> -|| node->address_taken);
> +|| (node->address_taken
> +&& !node->parallelized_function));
> 
> please add a comment here on why this is safe.
> 
> Ok with this change.

BTW, __builtin_GOMP_task supposedly can be treated similarly
if the third argument is NULL (if 3rd arg is non-NULL, then
the caller passes a different structure from what the callee receives,
but perhaps it could be emulated as pretending that cpyfn is called first
with address of a temporary var and the data argument and then fn
is called with the address of the temporary var).
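
A minimal sketch of that emulation, as an analysis might model it.  This
is not the libgomp entry point: `model_task_call`, `struct task_args`
and the two example callbacks are hypothetical, and the temporary is
passed in by the caller to keep the model simple.

```c
#include <stddef.h>

typedef void (*task_fn) (void *);
typedef void (*task_cpyfn) (void *, void *);

/* Hypothetical argument block shared by the examples below.  */
struct task_args
{
  int value;
  int *out;
};

/* Example task body: doubles VALUE into *OUT.  */
void
example_fn (void *p)
{
  struct task_args *a = p;
  *a->out = a->value * 2;
}

/* Example copy function: first argument is the destination, second
   the caller's data.  */
void
example_cpyfn (void *dst, void *src)
{
  *(struct task_args *) dst = *(struct task_args *) src;
}

/* Model of a task launch.  With a NULL CPYFN the callee sees DATA
   directly, so the call behaves like fn (data).  Otherwise pretend
   CPYFN fills the temporary TMP from DATA and FN is then called with
   the address of the temporary.  */
void
model_task_call (task_fn fn, void *data, task_cpyfn cpyfn, void *tmp)
{
  if (cpyfn == NULL)
    fn (data);
  else
    {
      cpyfn (tmp, data);
      fn (tmp);
    }
}
```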

Jakub


Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Tom de Vries wrote:

> On 30/11/15 10:16, Richard Biener wrote:
> > On Mon, 30 Nov 2015, Tom de Vries wrote:
> > 
> > > Hi,
> > > 
> > > this patch fixes PR46032.
> > > 
> > > It handles a call:
> > > ...
> > >__builtin_GOMP_parallel (fn, data, num_threads, flags)
> > > ...
> > > as:
> > > ...
> > >fn (data)
> > > ...
> > > in ipa-pta.
> > > 
> > > This improves ipa-pta alias analysis in the parallelized function fn, and
> > > allows vectorization in the testcase without a runtime alias test.
> > > 
> > > Bootstrapped and reg-tested on x86_64.
> > > 
> > > OK for stage3 trunk?
> > 
> > + /* Assign the passed argument to the appropriate incoming
> > +parameter of the function.  */
> > + struct constraint_expr lhs ;
> > + lhs = get_function_part_constraint (fi, fi_parm_base + 0);
> > + auto_vec rhsc;
> > + struct constraint_expr *rhsp;
> > + get_constraint_for_rhs (arg, &rhsc);
> > + while (rhsc.length () != 0)
> > +   {
> > + rhsp = &rhsc.last ();
> > + process_constraint (new_constraint (lhs, *rhsp));
> > + rhsc.pop ();
> > +   }
> > 
> > please use style used elsewhere with
> > 
> >   FOR_EACH_VEC_ELT (rhsc, j, rhsp)
> > process_constraint (new_constraint (lhs, *rhsp));
> >   rhsc.truncate (0);
> > 
> 
> That code was copied from find_func_aliases_for_call.
> I've factored out the bit that I copied as find_func_aliases_for_call_arg, and
> fixed the style there (and dropped 'rhsc.truncate (0)' since AFAIU it's
> redundant at the end of a function).
> 
> > + /* Parameter passed by value is used.  */
> > + lhs = get_function_part_constraint (fi, fi_uses);
> > + struct constraint_expr *rhsp;
> > + get_constraint_for_address_of (arg, &rhsc);
> > 
> > This isn't correct - you want to use get_constraint_for (arg, &rhsc).
> > After all rhs is already an ADDR_EXPR.
> > 
> 
> Can we add an assert somewhere to detect this incorrect usage?
> 
> > + FOR_EACH_VEC_ELT (rhsc, j, rhsp)
> > +   process_constraint (new_constraint (lhs, *rhsp));
> > + rhsc.truncate (0);
> > +
> > + /* The caller clobbers what the callee does.  */
> > + lhs = get_function_part_constraint (fi, fi_clobbers);
> > + rhs = get_function_part_constraint (cfi, fi_clobbers);
> > + process_constraint (new_constraint (lhs, rhs));
> > +
> > + /* The caller uses what the callee does.  */
> > + lhs = get_function_part_constraint (fi, fi_uses);
> > + rhs = get_function_part_constraint (cfi, fi_uses);
> > + process_constraint (new_constraint (lhs, rhs));
> > 
> > I don't see why you need those.  The solver should compute these
> > in even better precision (context sensitive on the call side).
> > 
> > The same is true for the function parameter.  That is, the only
> > needed part of the patch should be that making sure we see
> > the "direct" call and assign parameters correctly.
> > 
> 
> Dropped this bit.
> 
> OK for stage3 trunk if bootstrap and reg-test succeeds?

-|| node->address_taken);
+|| (node->address_taken
+&& !node->parallelized_function));

please add a comment here on why this is safe.

Ok with this change.

Thanks,
Richard.


[PATCH PR68542]

2015-11-30 Thread Yuri Rumyantsev
Hi All,

Here is a patch for the 481.wrf performance regression on avx2, which is a
slightly modified mask store optimization. This transformation allows
performing unpredication for a semi-hammock containing masked stores; in
other words, if we have a loop like
for (i=0; i

PR middle-end/68542
* config/i386/i386.c (ix86_expand_branch): Implement integral vector
comparison with boolean result.
* config/i386/sse.md (define_expand "cbranch4): Add define-expand
for vector comparison with eq/ne only.
* fold-const.c (fold_relational_const): Add handling of vector
comparison with boolean result.
* tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
comparison of vector operands with boolean result for EQ/NE only.
(verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
(verify_gimple_cond): Likewise.
* tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform
combining for non-compatible vector types.
* tree-vect-loop.c (is_valid_sink): New function.
(optimize_mask_stores): Likewise.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimize_mask_stores for
vectorized loops having masked stores.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
corresponding macros.
(optimize_mask_stores): Add prototype.
* tree-vrp.c (register_edge_assert_for): Do not handle NAME with vector
type.

gcc/testsuite/ChangeLog:
* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.
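
To illustrate the idea of guarding masked stores, here is a scalar model
in C.  It is only a sketch under stated assumptions: the loop body, the
vector length `VF` and both function names are hypothetical and are not
taken from the patch or from 481.wrf.

```c
#include <stddef.h>

#define VF 8   /* Hypothetical vector length, e.g. 8 ints with AVX2.  */

/* Scalar reference: a conditional store.  */
void
cond_store_scalar (int *p, const int *c, const int *q, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (c[i])
      p[i] = q[i] + 2;
}

/* Model of the vectorized form with the guard: each block of VF
   elements computes its mask first and skips the value computation
   and the masked store entirely when the mask is all zero.  */
void
cond_store_blocked (int *p, const int *c, const int *q, size_t n)
{
  size_t i = 0;

  for (; i + VF <= n; i += VF)
    {
      int any = 0;
      for (size_t k = 0; k < VF; k++)
        any |= (c[i + k] != 0);

      if (!any)
        continue;   /* Mask is zero: nothing to compute or store.  */

      for (size_t k = 0; k < VF; k++)
        if (c[i + k])   /* Models the per-lane mask of the store.  */
          p[i + k] = q[i + k] + 2;
    }

  /* Scalar epilogue for the remaining elements.  */
  for (; i < n; i++)
    if (c[i])
      p[i] = q[i] + 2;
}
```

Both functions leave the destination array in the same state; the
blocked form merely avoids work for blocks whose condition is false in
every lane.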


PR68542.patch
Description: Binary data


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Marek Polacek
On Fri, Nov 27, 2015 at 10:43:42PM +, Joseph Myers wrote:
> On Fri, 27 Nov 2015, Marek Polacek wrote:
> 
> > I didn't know where to put setting of in_late_processing.  With the current
> > placement, we won't (for valid programs) call c_save_expr from c_genericize
> > or c_gimplify_expr.
> 
> Well, the placement in this patch (in c_parser_compound_statement) is 
> certainly wrong.  It doesn't even save and restore, so after one compound 
> statement inside another, parsing would continue with in_late_processing 
> wrongly set.  And c_save_expr is logically right for any parsing outside 
> compound statements as well (arbitrary expressions can occur in sizeof 
> outside functions and in VLA parameter sizes and should follow the normal 
> rules for what's a constant expression - there's a known bug that 
> statement expressions are wrongly rejected in such contexts).
 
Indeed.  I don't know what I was thinking. :/

> Starting from first principles: parsing takes place from within 
> c_parse_file as the sole external entry point to the parser.  So you could 
> have a parsing_input variable that starts off as false, and where 
> c_parse_file saves it, sets to true, and restores the saved value at the 
> end.  Then you'd use c_save_expr if parsing_input && !in_late_binary_op.
> 
> If that doesn't work, it means there are cases where the hook gets called 
> from folding that takes place during parsing, on expressions that will not 
> subsequently go through c_fully_fold, but without in_late_binary_op set.  
> Knowing what those cases are might help work out any fix for them that is 
> needed.

I'm not sanguine about doing this reliably in stage3.  I think I'll try the
other approach mentioned later in this thread.
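
For the record, the save/set/restore scheme described above would look
roughly like this.  It is a stand-alone sketch: `parsing_input` is the
variable proposed in the quoted text, while `model_parse_file`,
`choose_save_expr` and the enum are hypothetical names for illustration.

```c
#include <stdbool.h>

static bool parsing_input;       /* Proposed flag; starts off false.  */
static bool in_late_binary_op;   /* Models the existing front-end flag.  */

enum save_expr_kind { USE_SAVE_EXPR, USE_C_SAVE_EXPR };

/* Decide which routine the hook would dispatch to.  */
enum save_expr_kind
choose_save_expr (void)
{
  return (parsing_input && !in_late_binary_op
          ? USE_C_SAVE_EXPR : USE_SAVE_EXPR);
}

/* Model of the single external entry point to the parser: save the
   flags, set them for the duration of parsing, restore them at the
   end.  Returns what the hook would pick while parsing.  */
enum save_expr_kind
model_parse_file (bool late_binary_op)
{
  bool saved_parsing = parsing_input;
  bool saved_late = in_late_binary_op;

  parsing_input = true;
  in_late_binary_op = late_binary_op;
  enum save_expr_kind kind = choose_save_expr ();

  in_late_binary_op = saved_late;
  parsing_input = saved_parsing;
  return kind;
}
```

Because the previous values are restored on exit, nothing leaks into
later processing, which is the problem pointed out with the placement in
c_parser_compound_statement.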
 
> > I suppose I should also modify save_expr in fold-const.c to call it via the
> > langhook, if this approach is sane.  Dunno.
> 
> That's a complication.  When the folding is taking place from within 
> c_fully_fold (and so the sub-expressions have already been folded, and had 
> their C_MAYBE_CONST_EXPRs removed, and the result of folding will not be 
> re-folded), it should be using save_expr not c_save_expr.  So maybe the 
> hook needs to say: use c_save_expr, if parsing, not in_late_binary_op and 
> not folding from within c_fully_fold.
 
Oh, I see :(.

> Again long term we should aim for the representation during parsing not to 
> need SAVE_EXPRs and for the folding that creates them (and the other 
> folding for optimization in general) to happen only after parsing

Yeah, let's strike that for gcc7.

Thanks,

Marek


[c-family] Fix -fdump-ada-spec ordering issue in C++

2015-11-30 Thread Eric Botcazou
This fixes an ordering issue in the Ada code generated by the -fdump-ada-spec 
option with the C++ compiler on structures/unions with nested anonymous arrays 
of structures/unions.
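
For illustration, a declaration of the kind meant here might look like
the following.  This is a hypothetical input, not one of the tests in
the patch.

```c
/* A structure whose fields are arrays of anonymous struct and union
   types.  The anonymous element types need their own declarations in
   the generated Ada spec, and those have to come before the
   declaration of the enclosing record.  */
struct outer
{
  int tag;
  struct { int x; int y; } points[2];    /* Array of anonymous structs.  */
  union { int i; float f; } values[3];   /* Array of anonymous unions.  */
};
```

Compiling a header with such a declaration with `g++ -c -fdump-ada-spec`
is the case affected by the ordering issue.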

Given that this only affects the Ada code generated by -fdump-ada-spec and has 
no effect whatsoever on the C and C++ compilers, I have already installed it.

Tested on x86_64-suse-linux, applied on the mainline.

2015-11-30  Eric Botcazou  

c-family/
* c-ada-spec.c (print_ada_macros): Remove redundant blank line.
(decl_sloc_common): Delete and move bulk of processing to...
(decl_sloc): ...here.
(pp_ada_tree_identifier): Remove reference to QUAL_UNION_TYPE.
(dump_ada_double_name): Remove S parameter and compute the suffix.
(dump_ada_array_type): Add PARENT parameter.  Simplify computation of
element type and deal with an anonymous one.
(dump_ada_template): Use RECORD_OR_UNION_TYPE_P macro.
(dump_generic_ada_node): Tweak.  Adjust call to dump_ada_array_type
and remove reference to QUAL_UNION_TYPE.
(dump_nested_types): Make 2 passes on the fields and move bulk to...
(dump_nested_type): ...here.  New function extracted from above.
Generate a full declaration for anonymous element type of arrays.
(print_ada_declaration): Really skip anonymous declarations.  Remove
references to QUAL_UNION_TYPE.  Adjust call to dump_ada_array_type.
Clean up processing of declarations of array types and objects.
(print_ada_struct_decl): Remove reference to QUAL_UNION_TYPE.
Remove obsolete code and tidy up.


2015-11-30  Eric Botcazou  

* gcc.dg/dump-ada-spec-1.c: Move to...
* c-c++-common/dump-ada-spec-1.c: ...here.
* c-c++-common/dump-ada-spec-2.c: New test.

-- 
Eric Botcazou
Index: c-ada-spec.c
===
--- c-ada-spec.c	(revision 231010)
+++ c-ada-spec.c	(working copy)
@@ -375,7 +375,7 @@ print_ada_macros (pretty_printer *pp, cp
 	{
 	  expanded_location sloc = expand_location (macro->line);
 
-	  if (sloc.line != prev_line + 1)
+	  if (sloc.line != prev_line + 1 && prev_line > 0)
 	pp_newline (pp);
 
 	  num_macros++;
@@ -500,39 +500,28 @@ dump_ada_macros (pretty_printer *pp, con
 
 static const char *source_file_base;
 
-/* Compare the declaration (DECL) of struct-like types based on the sloc of
-   their last field (if LAST is true), so that more nested types collate before
-   less nested ones.
-   If ORIG_TYPE is true, also consider struct with a DECL_ORIGINAL_TYPE.  */
+/* Return sloc of DECL, using sloc of last field if LAST is true.  */
 
-static location_t
-decl_sloc_common (const_tree decl, bool last, bool orig_type)
+location_t
+decl_sloc (const_tree decl, bool last)
 {
-  tree type = TREE_TYPE (decl);
+  tree field;
 
+  /* Compare the declaration of struct-like types based on the sloc of their
+ last field (if LAST is true), so that more nested types collate before
+ less nested ones.  */
   if (TREE_CODE (decl) == TYPE_DECL
-  && (orig_type || !DECL_ORIGINAL_TYPE (decl))
-  && RECORD_OR_UNION_TYPE_P (type)
-  && TYPE_FIELDS (type))
+  && !DECL_ORIGINAL_TYPE (decl)
+  && RECORD_OR_UNION_TYPE_P (TREE_TYPE (decl))
+  && (field = TYPE_FIELDS (TREE_TYPE (decl))))
 {
-  tree f = TYPE_FIELDS (type);
-
   if (last)
-	while (TREE_CHAIN (f))
-	  f = TREE_CHAIN (f);
-
-  return DECL_SOURCE_LOCATION (f);
+	while (DECL_CHAIN (field))
+	  field = DECL_CHAIN (field);
+  return DECL_SOURCE_LOCATION (field);
 }
-  else
-return DECL_SOURCE_LOCATION (decl);
-}
 
-/* Return sloc of DECL, using sloc of last field if LAST is true.  */
-
-location_t
-decl_sloc (const_tree decl, bool last)
-{
-  return decl_sloc_common (decl, last, false);
+  return DECL_SOURCE_LOCATION (decl);
 }
 
 /* Compare two locations LHS and RHS.  */
@@ -1258,7 +1247,6 @@ pp_ada_tree_identifier (pretty_printer *
 		  case ARRAY_TYPE:
 		  case RECORD_TYPE:
 		  case UNION_TYPE:
-		  case QUAL_UNION_TYPE:
 		  case TYPE_DECL:
 		if (package_prefix)
 		  {
@@ -1373,10 +1361,10 @@ dump_ada_decl_name (pretty_printer *buff
 }
 }
 
-/* Dump in BUFFER a name based on both T1 and T2, followed by S.  */
+/* Dump in BUFFER a name based on both T1 and T2 followed by a suffix.  */
 
 static void
-dump_ada_double_name (pretty_printer *buffer, tree t1, tree t2, const char *s)
+dump_ada_double_name (pretty_printer *buffer, tree t1, tree t2)
 {
   if (DECL_NAME (t1))
 pp_ada_tree_identifier (buffer, DECL_NAME (t1), t1, false);
@@ -1396,7 +1384,21 @@ dump_ada_double_name (pretty_printer *bu
   pp_scalar (buffer, "%d", TYPE_UID (TREE_TYPE (t2)));
 }
 
-  pp_string (buffer, s);
+  switch (TREE_CODE (TREE_TYPE (t2)))
+{
+case ARRAY_TYPE:
+  pp_string (buffer, "_array");
+  break;
+case RECORD_TYPE:
+  pp_string (buffer, "_struct");
+  break;
+case UNION_TYPE:
+

Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

On 30/11/15 10:16, Richard Biener wrote:
> On Mon, 30 Nov 2015, Tom de Vries wrote:
>
> > Hi,
> >
> > this patch fixes PR46032.
> >
> > It handles a call:
> > ...
> >    __builtin_GOMP_parallel (fn, data, num_threads, flags)
> > ...
> > as:
> > ...
> >    fn (data)
> > ...
> > in ipa-pta.
> >
> > This improves ipa-pta alias analysis in the parallelized function fn, and
> > allows vectorization in the testcase without a runtime alias test.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > OK for stage3 trunk?
>
> + /* Assign the passed argument to the appropriate incoming
> +parameter of the function.  */
> + struct constraint_expr lhs ;
> + lhs = get_function_part_constraint (fi, fi_parm_base + 0);
> + auto_vec rhsc;
> + struct constraint_expr *rhsp;
> + get_constraint_for_rhs (arg, &rhsc);
> + while (rhsc.length () != 0)
> +   {
> + rhsp = &rhsc.last ();
> + process_constraint (new_constraint (lhs, *rhsp));
> + rhsc.pop ();
> +   }
>
> please use style used elsewhere with
>
>   FOR_EACH_VEC_ELT (rhsc, j, rhsp)
> process_constraint (new_constraint (lhs, *rhsp));
>   rhsc.truncate (0);
>

That code was copied from find_func_aliases_for_call.
I've factored out the bit that I copied as 
find_func_aliases_for_call_arg, and fixed the style there (and dropped 
'rhsc.truncate (0)' since AFAIU it's redundant at the end of a function).
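
For comparison, the two loop styles can be modelled on a plain C vector.
This is only a sketch: `struct int_vec` and `FOR_EACH_ELT` are
hypothetical stand-ins for the vec type and FOR_EACH_VEC_ELT, not the
real definitions.

```c
#include <stddef.h>

/* Hypothetical fixed-capacity vector standing in for the vec type.  */
struct int_vec
{
  int elts[8];
  size_t len;
};

/* Hypothetical macro with the same shape as FOR_EACH_VEC_ELT: I is the
   index, P points at the current element.  */
#define FOR_EACH_ELT(V, I, P) \
  for ((I) = 0; (I) < (V)->len && ((P) = &(V)->elts[(I)], 1); ++(I))

/* Style from the first version of the patch: take the last element,
   process it, pop it, until the vector is empty.  The visit order is
   recorded in OUT; the number of visited elements is returned.  */
size_t
drain_by_pop (struct int_vec *v, int *out)
{
  size_t n = 0;
  while (v->len != 0)
    {
      out[n++] = v->elts[v->len - 1];
      v->len--;
    }
  return n;
}

/* Style requested in the review: iterate front to back, then empty the
   vector in one step.  */
size_t
visit_then_truncate (struct int_vec *v, int *out)
{
  size_t i, n = 0;
  int *p;

  FOR_EACH_ELT (v, i, p)
    out[n++] = *p;
  v->len = 0;   /* Models rhsc.truncate (0).  */
  return n;
}
```

Both leave the vector empty; the pop loop visits the elements back to
front, the iterating form front to back.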



> + /* Parameter passed by value is used.  */
> + lhs = get_function_part_constraint (fi, fi_uses);
> + struct constraint_expr *rhsp;
> + get_constraint_for_address_of (arg, &rhsc);
>
> This isn't correct - you want to use get_constraint_for (arg, &rhsc).
> After all rhs is already an ADDR_EXPR.



Can we add an assert somewhere to detect this incorrect usage?


> + FOR_EACH_VEC_ELT (rhsc, j, rhsp)
> +   process_constraint (new_constraint (lhs, *rhsp));
> + rhsc.truncate (0);
> +
> + /* The caller clobbers what the callee does.  */
> + lhs = get_function_part_constraint (fi, fi_clobbers);
> + rhs = get_function_part_constraint (cfi, fi_clobbers);
> + process_constraint (new_constraint (lhs, rhs));
> +
> + /* The caller uses what the callee does.  */
> + lhs = get_function_part_constraint (fi, fi_uses);
> + rhs = get_function_part_constraint (cfi, fi_uses);
> + process_constraint (new_constraint (lhs, rhs));
>
> I don't see why you need those.  The solver should compute these
> in even better precision (context sensitive on the call side).
>
> The same is true for the function parameter.  That is, the only
> needed part of the patch should be that making sure we see
> the "direct" call and assign parameters correctly.



Dropped this bit.

OK for stage3 trunk if bootstrap and reg-test succeeds?

Thanks,
- Tom


Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30  Tom de Vries  

	PR tree-optimization/46032
	* tree-ssa-structalias.c (find_func_aliases_for_call_arg): New function,
	factored out of ...
	(find_func_aliases_for_call): ... here.
	(find_func_aliases_for_builtin_call, find_func_clobbers): Handle
	BUILT_IN_GOMP_PARALLEL.
	(ipa_pta_execute): Same.  Handle node->parallelized_function as a local
	function.

	* gcc.dg/pr46032.c: New test.

	* testsuite/libgomp.c/pr46032.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032.c| 47 +++
 gcc/tree-ssa-structalias.c| 60 +++
 libgomp/testsuite/libgomp.c/pr46032.c | 44 +
 3 files changed, 138 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr46032.c b/gcc/testsuite/gcc.dg/pr46032.c
new file mode 100644
index 000..b91190e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -ftree-vectorize -std=c99 -fipa-pta -fdump-tree-vect-all" } */
+
+extern void abort (void);
+
+#define nEvents 1000
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+init (unsigned *results, unsigned *pData)
+{
+  unsigned int i;
+  for (i = 0; i < nEvents; ++i)
+pData[i] = i % 3;
+}
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+check (unsigned *results)
+{
+  unsigned sum = 0;
+  for (int idx = 0; idx < (int)nEvents; idx++)
+sum += results[idx];
+
+  if (sum != 1998)
+abort ();
+}
+
+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init (&results[0], &pData[0]);
+
+#pragma omp parallel for
+  for (int idx = 0; idx < (int)nEvents; idx++)
+results[idx] = coeff * pData[idx];
+
+  check (&results[0]);
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "note: vectorized 1 loop" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "versioning for alias required" "vect" } } */
+
diff --git a/gcc/tree-ssa-struc

Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Andreas Tobler

On 30.11.15 11:28, Bernd Schmidt wrote:
> On 11/29/2015 08:32 PM, Andreas Tobler wrote:
> > Hi all,
> >
> > the attached patch prepares the testsuite, c and c++, for the upcoming
> > ASAN support for FreeBSD (x86_64 first).
> >
> > I tested the patch on CentOS7.1 x86_64 and on FreeBSD x86_64.
> > Results can be seen on the list.
> >
> > Is this ok for trunk?
>
> -/* { dg-do run { target { *-*-linux* } } } */
> +/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */
>
> I see a patch from you to add asan support to x86 freebsd, but what
> about other architectures?


You mean because of the wildcard? I'll add them as I have time to port them.

For now they are UNSUPPORTED.

Does every *-*-linux* have asan support?

Andreas



[gomp4] Use pass_ch instead of pass_ch_oacc_kernels (was: [PATCH, 8/16] Add pass_ch_oacc_kernels)

2015-11-30 Thread Thomas Schwinge
Hi!

On Wed, 11 Nov 2015 21:29:10 +0100, Tom de Vries  wrote:
> On 09/11/15 19:33, Tom de Vries wrote:
> > On 09/11/15 16:35, Tom de Vries wrote:
> > this patch adds a pass pass_ch_oacc_kernels, which is like pass_ch, but
> > only runs for loops with oacc_kernels_region set.
> >
> > [ But... thinking about it a bit more, I think that we could use a
> > regular pass_ch instead. We only use the kernels pass group for a single
> > loop nest in a kernels region, and we mark all the loops in the loop
> > nest with oacc_kernels_region. So I think that the oacc_kernels_region
> > test in pass_ch_oacc_kernels::process_loop_p evaluates to true. ]
> >
> > So, I'll try to confirm with retesting that we can drop this patch.
> >
> 
> That's confirmed. I can use pass_ch instead of pass_ch_oacc_kernels, so 
> I'm dropping this patch from the series.

Committed to gomp-4_0-branch in r231067:

commit 8249e606d83025092e3b0b227360f7e38fe591d4
Author: tschwinge 
Date:   Mon Nov 30 12:05:50 2015 +

Use pass_ch instead of pass_ch_oacc_kernels

gcc/
* passes.def: Use pass_ch instead of pass_ch_oacc_kernels.
* tree-pass.h (make_pass_ch_oacc_kernels): Remove.
* tree-ssa-loop-ch.c: Revert to trunk r230907 version.
gcc/testsuite/
* gcc.dg/tree-ssa/copy-headers.c: Update for new pass_ch.
* gcc.dg/tree-ssa/foldconst-2.c: Likewise.
* gcc.dg/tree-ssa/loop-40.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@231067 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp   |6 +++
 gcc/passes.def   |2 +-
 gcc/testsuite/ChangeLog.gomp |6 +++
 gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c |4 +-
 gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c  |4 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-40.c  |4 +-
 gcc/tree-pass.h  |1 -
 gcc/tree-ssa-loop-ch.c   |   60 +++---
 8 files changed, 24 insertions(+), 63 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 54712ab..2c8f0c2 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2015-11-30  Thomas Schwinge  
+
+   * passes.def: Use pass_ch instead of pass_ch_oacc_kernels.
+   * tree-pass.h (make_pass_ch_oacc_kernels): Remove.
+   * tree-ssa-loop-ch.c: Revert to trunk r230907 version.
+
 2015-11-18  Nathan Sidwell  
 
* config/nvptx/nvptx.c: Remove unneeded #includes. Backport
diff --git gcc/passes.def gcc/passes.def
index e44bfac..f4eb235 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -93,7 +93,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
- NEXT_PASS (pass_ch_oacc_kernels);
+ NEXT_PASS (pass_ch);
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index dd3b1f5..59733bd 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2015-11-30  Thomas Schwinge  
+
+   * gcc.dg/tree-ssa/copy-headers.c: Update for new pass_ch.
+   * gcc.dg/tree-ssa/foldconst-2.c: Likewise.
+   * gcc.dg/tree-ssa/loop-40.c: Likewise.
+
 2015-11-19  Cesar Philippidis  
 
* gfortran.dg/goacc/routine-6.f90: Ensure that the device clause is
diff --git gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c
index 4241b40..a5a8212 100644
--- gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c
+++ gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-ch-details" } */
+/* { dg-options "-O2 -fdump-tree-ch2-details" } */
 
 extern int foo (int);
 
@@ -12,4 +12,4 @@ void bla (void)
 }
 
 /* There should be a header duplicated.  */
-/* { dg-final { scan-tree-dump-times "Duplicating header" 1 "ch"} } */
+/* { dg-final { scan-tree-dump-times "Duplicating header" 1 "ch2"} } */
diff --git gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c
index eb1e6de..e9a6f87 100644
--- gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c
+++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-ch" } */
+/* { dg-options "-O2 -fdump-tree-ch2" } */
 typedef union tree_node *tree;
 enum tree_code
 {
@@ -56,4 +56,4 @@ emit_support_tinfos (void)
 }
 /* We should copy loop header to fundamentals[0] and then fold it way into
known value.  */
-/* { dg-final { scan-tree-dump-not "fundamentals.0" "ch"} } */
+/* { dg-final { scan-tree-dump-not "fundamentals.0" "ch2"} } */
diff --git gcc/testsuite/gcc.dg/tree-ssa/loo

Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Jakub Jelinek
On Fri, Nov 27, 2015 at 07:50:09PM +0300, Ilya Verbin wrote:
> On Thu, Nov 19, 2015 at 16:31:15 +0100, Jakub Jelinek wrote:
> > On Mon, Nov 16, 2015 at 06:40:43PM +0300, Ilya Verbin wrote:
> > > @@ -2009,7 +2010,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
> > > decl = OMP_CLAUSE_DECL (c);
> > > /* Global variables with "omp declare target" attribute
> > >don't need to be copied, the receiver side will use them
> > > -  directly.  */
> > > +  directly.  However, global variables with "omp declare target link"
> > > +  attribute need to be copied.  */
> > > if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> > > && DECL_P (decl)
> > > && ((OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FIRSTPRIVATE_POINTER
> > > @@ -2017,7 +2019,9 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
> > >  != GOMP_MAP_FIRSTPRIVATE_REFERENCE))
> > > || TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
> > > && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
> > > -   && varpool_node::get_create (decl)->offloadable)
> > > +   && varpool_node::get_create (decl)->offloadable
> > > +   && !lookup_attribute ("omp declare target link",
> > > + DECL_ATTRIBUTES (decl)))
> > 
> > I wonder if Honza/Richi wouldn't prefer to have this info also
> > in cgraph, instead of looking up the attribute in each case.
> 
> So should I add a new flag into cgraph?
> Also it is used in gimplify_adjust_omp_clauses.

Richi said on IRC that lookup_attribute is ok, so let's keep it that way for
now.

> +   /* Most significant bit of the size marks such vars.  */
> +   unsigned HOST_WIDE_INT isize = tree_to_uhwi (size);
> +   isize |= 1ULL << (int_size_in_bytes (const_ptr_type_node) * 8 - 1);

That should presumably be BITS_PER_UNIT instead of 8.

> diff --git a/gcc/varpool.c b/gcc/varpool.c
> index 36f19a6..cbd1e05 100644
> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -561,17 +561,21 @@ varpool_node::assemble_decl (void)
>   are not real variables, but just info for debugging and codegen.
>   Unfortunately at the moment emutls is not updating varpool correctly
>   after turning real vars into value_expr vars.  */
> +#ifndef ACCEL_COMPILER
>if (DECL_HAS_VALUE_EXPR_P (decl)
>&& !targetm.have_tls)
>  return false;
> +#endif
>  
>/* Hard register vars do not need to be output.  */
>if (DECL_HARD_REGISTER (decl))
>  return false;
>  
> +#ifndef ACCEL_COMPILER
>gcc_checking_assert (!TREE_ASM_WRITTEN (decl)
>  && TREE_CODE (decl) == VAR_DECL
>  && !DECL_HAS_VALUE_EXPR_P (decl));
> +#endif

This looks wrong, both of these clearly could affect anything with
DECL_HAS_VALUE_EXPR_P, not just the link vars.
So, if you need to handle the "omp declare target link" vars specially,
you should only handle those specially and nothing else.  And please try to
explain why.

> @@ -1005,13 +1026,18 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
>for (i = 0; i < num_vars; i++)
>  {
>struct addr_pair *target_var = &target_table[num_funcs + i];
> -  if (target_var->end - target_var->start
> -   != (uintptr_t) host_var_table[i * 2 + 1])
> +  uintptr_t target_size = target_var->end - target_var->start;
> +
> +  /* Most significant bit of the size marks "omp declare target link"
> +  variables.  */
> +  bool is_link = target_size & (1ULL << (sizeof (uintptr_t) * 8 - 1));

__CHAR_BIT__ here instead of 8?
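
To illustrate the point about the bit width, here is a minimal standalone
sketch of tagging a size with its most significant bit and recovering the
real size afterwards.  The names LINK_BIT, SIZE_MASK, mark_link, is_link and
real_size are hypothetical and not taken from the patch; CHAR_BIT from
<limits.h> stands in for __CHAR_BIT__.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not from the patch.
   The shift count is derived from CHAR_BIT rather than a literal 8.  */
#define LINK_BIT ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))
#define SIZE_MASK (~LINK_BIT)

/* Set the most significant bit to mark a size as belonging to a
   "link" variable.  */
static uintptr_t
mark_link (uintptr_t size)
{
  return size | LINK_BIT;
}

/* Test whether the marker bit is set.  */
static bool
is_link (uintptr_t tagged)
{
  return (tagged & LINK_BIT) != 0;
}

/* Strip the marker bit to get the real size back.  */
static uintptr_t
real_size (uintptr_t tagged)
{
  return tagged & SIZE_MASK;
}
```

Any code that later uses such a tagged value as a length (for example when
computing an end address from a start address) would have to go through
something like real_size first, otherwise the marker bit ends up in the
arithmetic.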

> @@ -1019,7 +1045,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
>k->host_end = k->host_start + (uintptr_t) host_var_table[i * 2 + 1];
>k->tgt = tgt;
>k->tgt_offset = target_var->start;
> -  k->refcount = REFCOUNT_INFINITY;
> +  k->refcount = is_link ? REFCOUNT_LINK : REFCOUNT_INFINITY;
>k->async_refcount = 0;
>array->left = NULL;
>array->right = NULL;

Do we need to do anything in gomp_unload_image_from_device?
I am thinking at least of questionable programs that, for link vars, don't
decrement the refcount of the var that replaced the link var to 0 before
dlclosing the library.
At the very least host_var_table[j * 2 + 1] will have the MSB set, so we need
to handle it differently.  Perhaps for that case perform a lookup, and if we
get something which has link_map non-NULL, first act as if there were a
target exit data delete (var) on it?

Jakub

