Re: [PATCH] PR fortran/67885 -- PARAMETER needs to be marked in BLOCK

2015-10-26 Thread Thomas Koenig

Hi Steve,


When a specification statement in a BLOCK construct has a
PARAMETER attribute, gfortran currently discards the entity.
This patch marks a PARAMETER entity if it is in a BLOCK.  I'm not
completely convinced that this is the right fix, but it does
allow the testcase to compile and run.  Built and tested
on x86_64-*-freebsd.  OK to commit (if no one has a
better patch)?


OK.  And thanks for the patch!

Thomas



Re: Add non-constant vector ctors to operand_equal_p

2015-10-26 Thread Thomas Schwinge
Hi!

On Thu, 22 Oct 2015 04:09:26 +0200, Jan Hubicka  wrote:
> this patch adds matching of non-constant CONSTRUCTOR expressions into
> operand_equal_p. [...]

> --- testsuite/gcc.dg/tree-ssa/operand-equal-2.c   (revision 0)
> +++ testsuite/gcc.dg/tree-ssa/operand-equal-2.c   (revision 0)
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-forwprop1" } */
> +
> +typedef char __attribute__ ((vector_size (4))) v4qi;
> +
> +v4qi v;
> +void ret(char a)
> +{
> +  v4qi c={a,a,a,a},d={a,a,a,a};
> +  v = (c!=d);
> +}
> +/* { dg-final { scan-tree-dump "v = . 0, 0, 0, 0 ." "forwprop1"} } */

You checked that in with -fdump-tree-forwprop1 but "forwprop2" instead of
"forwprop1" in the dg-final, so we get:

PASS: gcc.dg/tree-ssa/operand-equal-2.c (test for excess errors)
UNRESOLVED: gcc.dg/tree-ssa/operand-equal-2.c scan-tree-dump forwprop2 "v = . 0, 0, 0, 0 ."


Grüße
 Thomas




Re: [Ping] Fix 61441

2015-10-26 Thread Sujoy Saraswati
This is a ping for the patch to fix 61441.

Regards,
Sujoy

On Tue, Oct 13, 2015 at 4:16 PM, Sujoy Saraswati  wrote:
> Hi,
>  This is another modified version of the patch, incorporating the
> previous comments.
>
> Bootstrap and regression tests on x86_64-linux-gnu and
> aarch64-unknown-linux-gnu passed with changes done on trunk.
>
> Is this fine ?
>
> Regards,
> Sujoy
>
> 2015-10-13  Sujoy Saraswati 
>
> PR tree-optimization/61441
> * builtins.c (integer_valued_real_p): Return true for
> NaN values.
> (fold_builtin_trunc, fold_builtin_pow): Avoid the operation
> if flag_signaling_nans is on and the operand is a NaN.
> (fold_builtin_powi): Same.
> * fold-const.c (const_binop): Convert sNaN to qNaN when
> flag_signaling_nans is off.
> (const_unop): Avoid the operation, other than NEGATE and
> ABS, if flag_signaling_nans is on and the operand is a NaN.
> (fold_convert_const_real_from_real): Avoid the operation if
> flag_signaling_nans is on and the operand is a NaN.
> * real.c (do_add): Make resulting NaN value to be qNaN.
> (do_multiply, do_divide, do_fix_trunc): Same.
> (real_arithmetic, real_ldexp): Same
> * simplify-rtx.c (simplify_const_unary_operation): Avoid the
> operation if flag_signaling_nans is on and the operand is a NaN.
> * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Same.
>
> PR tree-optimization/61441
> * gcc.dg/pr61441.c: New testcase.
>
> Index: gcc/builtins.c
> ===
> --- gcc/builtins.c  (revision 228700)
> +++ gcc/builtins.c  (working copy)
> @@ -7357,7 +7357,11 @@ integer_valued_real_p (tree t)
>  && integer_valued_real_p (TREE_OPERAND (t, 2));
>
>  case REAL_CST:
> -  return real_isinteger (TREE_REAL_CST_PTR (t), TYPE_MODE (TREE_TYPE (t)));
> +  /* Return true for NaN values, since real_isinteger would
> + return false if the value is sNaN.  */
> +  return (REAL_VALUE_ISNAN (TREE_REAL_CST (t))
> +  || real_isinteger (TREE_REAL_CST_PTR (t),
> + TYPE_MODE (TREE_TYPE (t))));
>
>  CASE_CONVERT:
>{
> @@ -7910,8 +7914,13 @@ fold_builtin_trunc (location_t loc, tree fndecl, t
>tree type = TREE_TYPE (TREE_TYPE (fndecl));
>
>x = TREE_REAL_CST (arg);
> -  real_trunc (&r, TYPE_MODE (type), &x);
> -  return build_real (type, r);
> +  /* Avoid the folding if flag_signaling_nans is on.  */
> +  if (!(HONOR_SNANS (TYPE_MODE (type))
> +&& REAL_VALUE_ISNAN (x)))
> +  {
> +real_trunc (&r, TYPE_MODE (type), &x);
> +return build_real (type, r);
> +  }
>  }
>
>return fold_trunc_transparent_mathfn (loc, fndecl, arg);
> @@ -8297,9 +8306,15 @@ fold_builtin_pow (location_t loc, tree fndecl, tre
>   bool inexact;
>
>   x = TREE_REAL_CST (arg0);
> +
>   inexact = real_powi (&x, TYPE_MODE (type), &x, n);
> - if (flag_unsafe_math_optimizations || !inexact)
> -   return build_real (type, x);
> +
> +  /* Avoid the folding if flag_signaling_nans is on.  */
> + if (flag_unsafe_math_optimizations
> +  || (!inexact
> +  && !(HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +   && REAL_VALUE_ISNAN (x))))
> + return build_real (type, x);
> }
>
>   /* Strip sign ops from even integer powers.  */
> @@ -8388,8 +8403,14 @@ fold_builtin_powi (location_t loc, tree fndecl ATT
> {
>   REAL_VALUE_TYPE x;
>   x = TREE_REAL_CST (arg0);
> - real_powi (&x, TYPE_MODE (type), &x, c);
> - return build_real (type, x);
> +
> +  /* Avoid the folding if flag_signaling_nans is on.  */
> +  if (!(HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +&& REAL_VALUE_ISNAN (x)))
> +  {
> +   real_powi (&x, TYPE_MODE (type), &x, c);
> +   return build_real (type, x);
> +  }
> }
>
>/* Optimize pow(x,0) = 1.0.  */
> Index: gcc/fold-const.c
> ===
> --- gcc/fold-const.c(revision 228700)
> +++ gcc/fold-const.c(working copy)
> @@ -1185,9 +1185,21 @@ const_binop (enum tree_code code, tree arg1, tree
>/* If either operand is a NaN, just return it.  Otherwise, set up
>  for floating-point trap; we return an overflow.  */
>if (REAL_VALUE_ISNAN (d1))
> -   return arg1;
> +  {
> +/* Make resulting NaN value to be qNaN when flag_signaling_nans
> +   is off.  */
> +d1.signalling = 0;
> +t = build_real (type, d1);
> +   return t;
> +  }
>else if (REAL_VALUE_ISNAN (d2))
> -   return arg2;
> +  {
> +/* Make resulting NaN

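The ChangeLog above talks about turning a signaling NaN into a quiet NaN. As a rough illustration only, the sketch below does this on raw bit patterns, assuming IEEE 754 binary64 and the common convention that the top mantissa bit is the "quiet" bit (some older targets use the opposite convention). The helper names are made up for this example and are not GCC's real.c interfaces.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: IEEE 754 binary64 layout, quiet bit = most significant
// mantissa bit.  All names are illustrative, not taken from GCC.
constexpr std::uint64_t kExponentMask = 0x7FF0000000000000ULL;
constexpr std::uint64_t kMantissaMask = 0x000FFFFFFFFFFFFFULL;
constexpr std::uint64_t kQuietBit     = 0x0008000000000000ULL;

// A NaN has an all-ones exponent and a non-zero mantissa.
bool is_nan_bits (std::uint64_t bits)
{
  return (bits & kExponentMask) == kExponentMask
         && (bits & kMantissaMask) != 0;
}

// A signaling NaN is a NaN whose quiet bit is clear.
bool is_signaling_nan_bits (std::uint64_t bits)
{
  return is_nan_bits (bits) && (bits & kQuietBit) == 0;
}

// Quiet a NaN by setting the quiet bit; the payload is left untouched.
std::uint64_t quiet_nan_bits (std::uint64_t bits)
{
  assert (is_nan_bits (bits));
  return bits | kQuietBit;
}
```

Working on bit patterns rather than on `double` values avoids relying on how a particular host passes signaling NaNs around.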
[PATCH, GCC 4.9 branch] Fix compile time regression caused by fix to PR64111

2015-10-26 Thread Caroline Tice
Here is my promised backport to the GCC 4.9 branch, for the patch below
that went into ToT last week.  As with the previous patch, I've
verified that it fixes the problem, bootstraps and has no new
regression test failures.  Is this ok to commit to the gcc-4_9-branch?

-- Caroline Tice
cmt...@google.com

gcc/ChangeLog:

 2015-10-26  Caroline Tice  

(from Richard Biener)
 * tree.c (int_cst_hash_hash):  Replace XORs with more efficient
 calls to iterative_hash_host_wide_int.


gcc-fsf-4_9.patch
Description: Binary data


[PATCH, GCC 5 branch] Fix compile time regression caused by fix to PR64111

2015-10-26 Thread Caroline Tice
Here is my promised backport to the GCC 5 branch, for the patch below
that went into ToT last week.  As with the previous patch, I've
verified that it fixes the problem, bootstraps and has no new
regression test failures.  Is this ok to commit to the gcc-5-branch?

-- Caroline Tice
cmt...@google.com


On Fri, Oct 23, 2015 at 3:22 PM, Caroline Tice  wrote:
> This patch fixes a compile-time regression that was originally
> introduced by the fix
> for PR64111, in GCC 4.9.3.  One of our users encountered this problem with a
> particular file, where the compile time (on arm) went from 20 seconds
> to 150 seconds.
>
> The fix in this patch was suggested by Richard Biener, who wrote the
> original fix for
> PR64111.  I have verified that this patch fixes the compile time
> regression; I have bootstrapped
> the compiler with this patch; and I have run the regression testsuite
> (no regressions).
> Is this ok to commit to ToT?   (I am also working on backports for
> gcc-5_branch and gcc-4_9-branch).
>
> -- Caroline Tice
> cmt...@google.com

gcc/ChangeLog:

2015-10-26  Caroline Tice  

 (from Richard Biener)
 * tree.c (int_cst_hasher::hash):  Replace XOR with more efficient
 call to iterative_hash_host_wide_int.


gcc-fsf-5.patch
Description: Binary data
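For readers wondering why replacing XOR with an iterative hash can matter, here is a toy comparison. It assumes that poor distribution of the XOR-combined hash was behind the slowdown, which the messages above do not state explicitly. The mixing step is a hypothetical hash-combine function chosen for illustration; it is not GCC's iterative_hash_host_wide_int.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Combine elements with plain XOR: order-insensitive, and equal
// elements cancel each other out.
std::uint64_t hash_xor (const std::vector<std::uint64_t> &elts)
{
  std::uint64_t h = 0;
  for (std::uint64_t e : elts)
    h ^= e;
  return h;
}

// Hypothetical mixing step (illustration only): each new value is
// folded into a state that depends on everything hashed so far.
std::uint64_t mix_step (std::uint64_t h, std::uint64_t v)
{
  return h ^ (v + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2));
}

// Combine elements iteratively, so element order affects the result.
std::uint64_t hash_iterative (const std::vector<std::uint64_t> &elts)
{
  std::uint64_t h = 0;
  for (std::uint64_t e : elts)
    h = mix_step (h, e);
  return h;
}

// Small demonstration of the collisions that XOR produces.
void demo_collisions ()
{
  assert (hash_xor ({1, 2}) == hash_xor ({2, 1}));   // order is lost
  assert (hash_xor ({5, 5}) == 0);                   // duplicates cancel
  assert (hash_iterative ({1, 2}) != hash_iterative ({2, 1}));
}
```

With XOR, multi-word constants whose words are permutations of each other, or that contain repeated words, land in the same bucket; an iterative combination avoids those systematic collisions.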


Re: [PATCH v4] SH FDPIC backend support

2015-10-26 Thread Rich Felker
On Sun, Oct 25, 2015 at 11:28:51PM +0900, Oleg Endo wrote:
> On Fri, 2015-10-23 at 02:32 -0400, Rich Felker wrote:
> > Here's my updated version of the FDPIC patch with all requested
> > changes made and Changelog added. I've included all the original
> > authors. This is my first time writing such an extensive Changelog
> > entry so please let me know if there are things I got wrong.
> 
> I took the liberty and fixed some minor formatting trivia and extracted
> functions sh_emit_storesi and sh_emit_storehi which are used in
>  sh_trampoline_init to effectively memcpy code into the trampoline
> area.  Can you please check it?  If it's OK I'll commit the attached
> patch to trunk.

Is there anything in particular you'd like me to check? It builds fine
for fdpic target, successfully compiles musl libc.so, and busybox runs
with the resulting libc.so. I did a quick visual inspection of the
diff between my version and yours too and didn't see anything that
looked suspicious to me.

Rich


[PING] Re: [PATCH] c++/67913, 67917 - fix new expression with wrong number of elements

2015-10-26 Thread Martin Sebor

[CC Jason]

The patch is at the link below:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01803.html

Thanks

On 10/19/2015 12:50 PM, Martin Sebor wrote:

This is a patch for two C++ bugs:

   67913 - new expression with negative size not diagnosed
   67927 - array new expression with excessive number of elements
   not diagnosed

The C++ front end rejects a subset of array declarators with negative
bounds or with bounds in excess of some implementation defined maximum
(roughly SIZE_MAX / 2), but it does so inconsistently.  For example,
it silently accepts expressions such as

 new int [-1][2];

but invokes operator new to allocate SIZE_MAX bytes.  When operator
new succeeds (as is the case on Linux with memory overcommitment
enabled), the new expression returns a valid pointer to some
unknown amount of storage less than SIZE_MAX.  Accessing the memory
at some non-zero offset then causes a SIGSEGV.

Similarly, GCC accepts the following expression with the same result:

 new int [SIZE_MAX][2];

C++ 14 makes it clear that such expressions are ill-formed and must
be rejected.  This patch adds checks that consistently reject all
new expressions with both negative array bounds and bounds in excess
of the maximum.
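As a rough picture of the kind of check described here, the sketch below validates an array bound before any allocation would happen. The helper name is hypothetical, and the limit of SIZE_MAX / 2 is only the "roughly" figure quoted above, not necessarily the exact value the front end uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: return true if COUNT elements of ELEM_SIZE bytes
// form an acceptable array size.  Negative counts and totals above the
// assumed limit of SIZE_MAX / 2 bytes are rejected.
bool array_new_bound_ok (std::intmax_t count, std::size_t elem_size)
{
  assert (elem_size > 0);
  if (count < 0)
    return false;

  const std::size_t max_bytes = SIZE_MAX / 2;
  // Divide instead of multiplying so the check itself cannot overflow.
  return static_cast<std::uintmax_t> (count) <= max_bytes / elem_size;
}
```

For the two expressions quoted above, `array_new_bound_ok (-1, sizeof (int[2]))` is false because the bound is negative, and a bound close to SIZE_MAX is false because the total size exceeds the limit.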

While I raised these bugs as separate issues I decided to group the
two sets of changes together since they both touch the same function
in similar ways, and hopefully doing so will make them also easier
to review.

I've tested the patch by bootstrapping C/C++ and running the test
suites (including libstdc++) on x86_64 with no regressions.

During the development of the changes I found a few basic mistakes
in my code only after running libstdc++ tests.  To make it possible
to uncover them sooner in the future, I added another test that
isn't directly related to the problem: new45.C.

I also found a minor problem in the GCC regression test suite where
the g++.dg/other/new-size-type.C test for PR 36741 tried to check
that the 'new char[~static_cast<size_t>(0)]' expression was accepted
without a warning.  The complaint in the PR was about the wording of
the warning, not about the validity of the expression (the submitter
agreed that a correctly worded diagnostic would be appropriate).
  I changed the test to expect a meaningful error message.

Once this patch is approved and committed a follow-up patch should
document the implementation-defined maximum in the manual.

Martin




[PING 2] [PATCH] c++/67942 - diagnose placement new buffer overflow

2015-10-26 Thread Martin Sebor

[CC Jason]

When you have a chance, the patch is at the link below for review:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02001.html

On 10/20/2015 01:57 PM, Martin Sebor wrote:

Attached is a slightly updated patch that tweaks the diagnostic
messages to avoid assuming the English punctuation, and adds
a few test cases exercising the text of the diagnostics.

Martin

On 10/13/2015 11:22 AM, Martin Sebor wrote:

C++ placement new expression is susceptible to buffer overflow flaws
(see [1]).  In many such cases GCC has sufficient information to
detect and diagnose such defects. The attached patch is a starting
point for this feature.  It lets GCC diagnose basic cases of buffer
overflows when both the size of the buffer and the type being
constructed are constant expressions.  A more sophisticated
implementation would try to detect additional cases in a manner
similar to _FORTIFY_SOURCE.

Besides buffer overflow, placement new can also be misused to
construct objects in unaligned storage (also discussed in the paper
below).  I leave diagnosing such cases and improving the detection
of buffer overflows via a mechanism like Object Size Checking for
a future patch.
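The two problems mentioned above (a buffer that is too small, and storage that is not suitably aligned) can be guarded against by hand in user code. The wrapper below is a minimal sketch of that idea, not what the patch implements: the size check is done at compile time, the alignment check at run time, and the function name is invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical wrapper: construct a value-initialized T in BUFFER.
// The size check is a compile-time error; the alignment check is a
// run-time assertion because it depends on the buffer's address.
template <typename T, std::size_t N>
T *construct_in (unsigned char (&buffer)[N])
{
  static_assert (sizeof (T) <= N, "buffer is too small for T");
  assert (reinterpret_cast<std::uintptr_t> (buffer) % alignof (T) == 0);
  return ::new (static_cast<void *> (buffer)) T ();
}
```

Given `alignas (long) unsigned char storage[sizeof (long)];`, the call `construct_in<long> (storage)` compiles, while `construct_in<long> (small)` with a one-byte `small` buffer is rejected at compile time.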

Tested on x86_64 with no regressions.

Martin

[1] A New Class of Buffer Overflow Attacks, Kundu, A., Bertino, E.,
31st International Conference on Distributed Computing Systems (ICDCS),
2011 http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5961725







Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-26 Thread kugan



On 23/10/15 01:23, Richard Biener wrote:

On Thu, Oct 22, 2015 at 12:50 PM, Kugan
 wrote:



On 21/10/15 23:45, Richard Biener wrote:

On Tue, Oct 20, 2015 at 10:03 PM, Kugan
 wrote:



On 07/09/15 12:53, Kugan wrote:


This is a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and split the patch to make it easier to review.
There are still a couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed by my patch. I can also reproduce this in x86_64
if I use the same PROMOTE_MODE which is used in aarch64 port. For the
time being, I am using  patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This needs to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g (Error: unaligned opcodes detected in executable segment). It works
fine if I remove the -g. I am looking into it; this needs to be fixed as well.


Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage1. I will try my best to address your review comments
ASAP.


Ok, can you make the whole patch series available so I can poke at the
implementation a bit?  Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).



Thanks. Please find the patches rebased against trunk@229156. I have
skipped the test-case readjustment patches.


Some quick observations.  On x86_64 when building


Hi Richard,

Thanks for the review.


short bar (short y);
int foo (short x)
{
   short y = bar (x) + 15;
   return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

   :
   _1 = (int) x_10(D);
   _2 = (_1) sext (16);
   _11 = bar (_2);
   _5 = (int) _11;
   _12 = (unsigned int) _5;
   _6 = _12 & 65535;
   _7 = _6 + 15;
   _13 = (int) _7;
   _8 = (_13) sext (16);
   _9 = (_8) sext (16);
   return _9;

which looks fine but the VRP optimization doesn't trigger for the redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
  (sext (sext@2 @0 @1) @3)
  (if (tree_int_cst_compare (@1, @3) <= 0)
   @2
   (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove redundant extensions,
otherwise generated code might get worse compared to without the pass(?)
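The nested-sext simplification above can be checked on plain integers. The sketch below models "(v) sext (bits)" as sign-extending the low `bits` bits of a 32-bit value, which is an assumption about the operation's semantics made for this example, and then applies the same rule as the pattern: keep the inner extension when its width is not larger than the outer one, otherwise extend directly with the outer width. It also relies on the usual two's-complement conversion from unsigned to signed.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model of "(v) sext (bits)": sign-extend the low BITS
// bits of V, for 1 <= BITS <= 32.
std::int32_t sext (std::int32_t v, unsigned bits)
{
  assert (bits >= 1 && bits <= 32);
  const std::uint32_t sign = 1u << (bits - 1);
  const std::uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  const std::uint32_t low = static_cast<std::uint32_t> (v) & mask;
  // XOR with the sign bit and subtract it again to propagate the sign.
  return static_cast<std::int32_t> ((low ^ sign) - sign);
}

// Folded form of sext (sext (v, inner), outer), following the rule in
// the pattern quoted above.
std::int32_t sext_sext_folded (std::int32_t v, unsigned inner, unsigned outer)
{
  return inner <= outer ? sext (v, inner) : sext (v, outer);
}
```

For any value, `sext (sext (v, 8), 16)` and `sext (sext (v, 16), 8)` both agree with the folded form, which is what makes the second extension in the dump above removable.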


Do you think that we should enable this pass only when VRP is enabled?
Otherwise, even when we do the simple optimizations you mentioned below,
we might not be able to remove all the redundancies.




I also notice that the 'short' argument does not get its sign-extension removed
as redundant either even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]



I am looking into it.


In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work).  In some way whether the ABI guarantees promoted argument
registers might need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+ || TYPE_PRECISION (type) == 16
+ || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me.  You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?



I will change this. (I have a patch which I am testing with other 
changes you have asked for)



+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+  || !INTEGRAL_TYPE_P (type)
+  || !type_precision_ok (type))
+return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+  && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);



I am changing this too.


You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate), that's a
no-no, commit edge inserts after the domwalk.


I am sorry, I don't understand "commit edge inserts after the domwalk". Is
there a way to do this in the current implementation?
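The general idea behind "commit after the walk" is to queue structural changes while traversing and apply them only once the traversal is finished, so the walk never sees a graph that changes underneath it. The toy model below shows just that pattern; the types and names are invented for illustration and are not GCC data structures. In GCC itself, gsi_insert_on_edge queues a statement on an edge and gsi_commit_edge_inserts later applies all queued insertions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a CFG: a list of block ids plus a queue of blocks
// that should be created once the walk is over.
struct ToyCfg
{
  std::vector<int> blocks;
  std::vector<int> pending;

  // Record a new block without touching BLOCKS.
  void queue_block (int id) { pending.push_back (id); }

  // Apply all queued insertions in one go.
  void commit ()
  {
    blocks.insert (blocks.end (), pending.begin (), pending.end ());
    pending.clear ();
  }
};

// Walk every block; blocks with an even id request a new block.  The
// container being iterated is not modified until the loop has ended.
std::size_t walk_and_split (ToyCfg &cfg)
{
  assert (cfg.pending.empty ());
  std::size_t visited = 0;
  for (int id : cfg.blocks)
    {
      ++visited;
      if (id % 2 == 0)
        cfg.queue_block (id + 100);
    }
  cfg.commit ();
  return visited;
}
```

Appending to `blocks` inside the loop would invalidate the iteration, which is the toy analogue of creating basic blocks in the middle of a dominator walk.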



ssa_sets_h

Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Rich Felker
On Tue, Oct 27, 2015 at 12:16:16AM +, Joseph Myers wrote:
> On Mon, 26 Oct 2015, Rich Felker wrote:
> 
> > On Mon, Oct 26, 2015 at 11:42:37PM +, Joseph Myers wrote:
> > > On Mon, 26 Oct 2015, Rich Felker wrote:
> > > 
> > > > musl explicitly does not support using a mix of libc headers and
> > > > compiler-provided freestanding headers. While there may be
> > > 
> > > In that case the GCC ports for musl should define USER_H = 
> > > $(EXTRA_HEADERS) like t-openbsd does.  (Of course that won't work for 
> > > multilib builds supporting different C libraries with different 
> > > multilibs.)
> > 
> > This sounds interesting. Are there practical ways it's a better
> > solution than what linux.h is doing now for musl? Inability to support
> 
> Well, it ensures the installed compiler can't find the freestanding 
> headers at all, because they're not installed (other than 
> architecture-specific intrinsics headers).

Oh. I think that breaks Linux kernel builds, which use -nostdinc then
add back the gcc include directory or something like that. I don't
remember the details.

Rich


Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Joseph Myers
On Mon, 26 Oct 2015, Rich Felker wrote:

> On Mon, Oct 26, 2015 at 11:42:37PM +, Joseph Myers wrote:
> > On Mon, 26 Oct 2015, Rich Felker wrote:
> > 
> > > musl explicitly does not support using a mix of libc headers and
> > > compiler-provided freestanding headers. While there may be
> > 
> > In that case the GCC ports for musl should define USER_H = 
> > $(EXTRA_HEADERS) like t-openbsd does.  (Of course that won't work for 
> > multilib builds supporting different C libraries with different 
> > multilibs.)
> 
> This sounds interesting. Are there practical ways it's a better
> solution than what linux.h is doing now for musl? Inability to support

Well, it ensures the installed compiler can't find the freestanding 
headers at all, because they're not installed (other than 
architecture-specific intrinsics headers).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Rich Felker
On Mon, Oct 26, 2015 at 11:42:37PM +, Joseph Myers wrote:
> On Mon, 26 Oct 2015, Rich Felker wrote:
> 
> > musl explicitly does not support using a mix of libc headers and
> > compiler-provided freestanding headers. While there may be
> 
> In that case the GCC ports for musl should define USER_H = 
> $(EXTRA_HEADERS) like t-openbsd does.  (Of course that won't work for 
> multilib builds supporting different C libraries with different 
> multilibs.)

This sounds interesting. Are there practical ways it's a better
solution than what linux.h is doing now for musl? Inability to support
multilib well is something I've wondered if we could improve on, but
from what you said it sounds like this wouldn't help.

Rich


Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Joseph Myers
On Mon, 26 Oct 2015, Rich Felker wrote:

> musl explicitly does not support using a mix of libc headers and
> compiler-provided freestanding headers. While there may be

In that case the GCC ports for musl should define USER_H = 
$(EXTRA_HEADERS) like t-openbsd does.  (Of course that won't work for 
multilib builds supporting different C libraries with different 
multilibs.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Joseph Myers
On Mon, 26 Oct 2015, Szabolcs Nagy wrote:

> > FLT_ROUNDS is an ordinary compiler bug (bug 30569), should be fixable
> > reasonably straightforwardly as outlined at
> > , probably within say a
> > week's work if most architecture-specific changes are left to architecture
> > maintainers.
> 
> musl tries to support old compilers in general (it can be built
> with gcc 3.x, and it should be possible to use with a wider range
> of compilers with reasonably consistent semantics, so fixing that
> bug in gcc does not help much.)

Well, the general expectation in the GNU system is that GCC and glibc may 
work around each other's issues if the one doing the working around is 
responsible for the interface that needs the workaround - but also that 
interfaces required for freestanding implementations are GCC's 
responsibility while interfaces involving library functions are the C 
library's responsibility.  GCC fixincludes doesn't try to fix library 
issues not relevant for GCC and its tests unless they actually break use 
of a header with GCC, and glibc doesn't try to fix issues with headers 
provided by GCC.

(There may be the odd deviation from that starting point - GCC provides 
stdatomic.h because it's so closely linked to the compiler despite not 
being required of freestanding implementations, and GCC would not start to 
provide libm in future if adopting TS 18661-1 despite it requiring more 
library functionality for freestanding implementations.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [OpenACC 7/11] execution model

2015-10-26 Thread Nathan Sidwell

Jakub, Richard,
This is the updated version of patch 7, using target-insns.def for the new 
insns.  Otherwise same as yesterday's, which had the following changes:


The significant change is that now the head/tail unique markers are threaded on 
a data dependency variable.  I'd not noticed its lack being a problem, but this 
is certainly more robust in showing the ordering dependency between calls.  The 
dependency var is the 2nd parameter, and all others are simply shifted along by one.


At RTL generation time the data dependency is exposed to the RTL expander, which 
in the PTX case simply does a src->dst move, which will eventually be deleted as 
unnecessary.


ok?

nathan
2015-10-26  Nathan Sidwell  

	* internal-fn.def (IFN_GOACC_LOOP): New.
	* internal-fn.h (enum ifn_unique_kind): Add IFN_UNIQUE_OACC_FORK,
	IFN_UNIQUE_OACC_JOIN, IFN_UNIQUE_OACC_HEAD_MARK,
	IFN_UNIQUE_OACC_TAIL_MARK.
	(enum ifn_goacc_loop_kind): New.
	* internal-fn.c (expand_UNIQUE): Add IFN_UNIQUE_OACC_FORK,
	IFN_UNIQUE_OACC_JOIN cases.
	(expand_OACC_LOOP): New.
	(IFN_GOACC_LOOP_CHUNKS, IFN_GOACC_LOOP_STEP,
	IFN_GOACC_LOOP_OFFSET, IFN_GOACC_LOOP_BOUND): New.
	* target-insns.def (oacc_dim_pos, oacc_dim_size, oacc_fork,
	oacc_join): New.
	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_FORK,
	IFN_UNIQUE_OACC_JOIN.
	(expand_GOACC_DIM_SIZE, expand_GOACC_DIM_POS, expand_GOACC_LOOP): New.
	* omp-low.c (struct omp_context): Remove gwv_below, gwv_this
	fields.
	(enum oacc_loop_flags): New.
	(enclosing_target_ctx): May return NULL.
	(ctx_in_oacc_kernels_region): New.
	(is_oacc_parallel, is_oacc_kernels): New.
	(check_oacc_kernel_gwv): New.
	(oacc_loop_or_target_p): Delete.
	(scan_omp_for): Don't calculate gwv mask.  Check parallel clause
	operands.  Strip reductions from kernels.
	(scan_omp_target): Don't calculate gwv mask.
	(lower_oacc_head_mark, lower_oacc_loop_marker,
	lower_oacc_head_tail): New.
	(expand_omp_for_static_nochunk, expand_omp_for_static_chunk):
	Remove OpenACC handling.
	(struct oacc_collapse): New.
	(expand_oacc_collapse_init, expand_oacc_collapse_vars): New.
	(expand_oacc_for): New.
	(expand_omp_for): Call expand_oacc_for.
	(lower_omp_for): Call lower_oacc_head_tail.

Index: gcc/target-insns.def
===
--- gcc/target-insns.def	(revision 229276)
+++ gcc/target-insns.def	(working copy)
@@ -64,6 +64,8 @@ DEF_TARGET_INSN (memory_barrier, (void))
 DEF_TARGET_INSN (movstr, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (nonlocal_goto, (rtx x0, rtx x1, rtx x2, rtx x3))
 DEF_TARGET_INSN (nonlocal_goto_receiver, (void))
+DEF_TARGET_INSN (oacc_fork, (rtx x0, rtx x1, rtx x2))
+DEF_TARGET_INSN (oacc_join, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (probe_stack, (rtx x0))
 DEF_TARGET_INSN (probe_stack_address, (rtx x0))
Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c	(revision 229276)
+++ gcc/internal-fn.c	(working copy)
@@ -1958,30 +1958,60 @@ expand_VA_ARG (gcall *stmt ATTRIBUTE_UNU
   gcc_unreachable ();
 }
 
 /* Expand the IFN_UNIQUE function according to its first argument.  */
 
 static void
 expand_UNIQUE (gcall *stmt)
 {
   rtx pattern = NULL_RTX;
   enum ifn_unique_kind kind
 = (enum ifn_unique_kind) TREE_INT_CST_LOW (gimple_call_arg (stmt, 0));
 
   switch (kind)
 {
 default:
   gcc_unreachable ();
 
 case IFN_UNIQUE_UNSPEC:
   if (targetm.have_unique ())
 	pattern = targetm.gen_unique ();
   break;
+
+case IFN_UNIQUE_OACC_FORK:
+case IFN_UNIQUE_OACC_JOIN:
+  if (targetm.have_oacc_fork () && targetm.have_oacc_join ())
+	{
+	  tree lhs = gimple_call_lhs (stmt);
+	  rtx target = const0_rtx;
+
+	  if (lhs)
+	target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+	  rtx data_dep = expand_normal (gimple_call_arg (stmt, 1));
+	  rtx axis = expand_normal (gimple_call_arg (stmt, 2));
+
+	  if (kind == IFN_UNIQUE_OACC_FORK)
+	pattern = targetm.gen_oacc_fork (target, data_dep, axis);
+	  else
+	pattern = targetm.gen_oacc_join (target, data_dep, axis);
+	}
+  else
+	gcc_unreachable ();
+  break;
 }
 
   if (pattern)
 emit_insn (pattern);
 }
 
+/* This is expanded by oacc_device_lower pass.  */
+
+static void
+expand_GOACC_LOOP (gcall *stmt ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
+
 /* Routines to expand each internal function, indexed by function number.
Each routine has the prototype:
 
Index: gcc/internal-fn.h
===
--- gcc/internal-fn.h	(revision 229276)
+++ gcc/internal-fn.h	(working copy)
@@ -20,9 +20,52 @@ along with GCC; see the file COPYING3.
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.  */
 enum ifn_unique_kind {
-  IFN_UNIQUE_UNSPEC   /* Undifferentiated UNIQUE.  */
+  IFN_UNIQUE_UNSPEC,  /* Undifferentiated UNIQUE.  */
+
+  /*

[PATCH] Add contains_symbol_ref_p .

2015-10-26 Thread Anatoliy Sokolov

Hello.

  This patch adds a contains_symbol_ref_p function to rtlanal.c and removes
contains_symbol_ref_p from lra-constraints.c and contains_symbol_ref from
var-tracking.c.


Bootstrapped and reg-tested on x86_64-unknown-linux-gnu.

OK for trunk?

2015-10-27  Anatoly Sokolov  

* rtl.h (contains_symbol_ref_p): Declare.
(SYMBOL_REF_P): Define.
* rtlanal.c (contains_symbol_ref_p): New function.
* lra-constraints.c (contains_symbol_ref_p): Remove.
* var-tracking.c (contains_symbol_ref): Remove.
(track_expr_p): Use contains_symbol_ref_p instead of
contains_symbol_ref.

Index: gcc/lra-constraints.c
===
--- gcc/lra-constraints.c   (revision 228971)
+++ gcc/lra-constraints.c   (working copy)
@@ -4007,35 +4007,6 @@
   return false;
 }

-/* Return true if X contains a symbol reg.  */
-static bool
-contains_symbol_ref_p (rtx x)
-{
-  int i, j;
-  const char *fmt;
-  enum rtx_code code;
-
-  code = GET_CODE (x);
-  if (code == SYMBOL_REF)
-return true;
-  fmt = GET_RTX_FORMAT (code);
-  for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
-{
-  if (fmt[i] == 'e')
-   {
- if (contains_symbol_ref_p (XEXP (x, i)))
-   return true;
-   }
-  else if (fmt[i] == 'E')
-   {
- for (j = XVECLEN (x, i) - 1; j >= 0; j--)
-   if (contains_symbol_ref_p (XVECEXP (x, i, j)))
- return true;
-   }
-}
-  return false;
-}
-
 /* Process all regs in location *LOC and change them on equivalent
substitution.  Return true if any change was done.  */
 static bool
Index: gcc/rtl.h
===
--- gcc/rtl.h   (revision 228971)
+++ gcc/rtl.h   (working copy)
@@ -829,6 +829,9 @@
 /* Predicate yielding nonzero iff RTX is a subreg.  */
 #define SUBREG_P(RTX) (GET_CODE (RTX) == SUBREG)

+/* Predicate yielding true iff RTX is a symbol ref.  */
+#define SYMBOL_REF_P(RTX) (GET_CODE (RTX) == SYMBOL_REF)
+
 template <>
 template <>
 inline bool
@@ -2926,6 +2929,7 @@
 /* Functions in rtlanal.c */

 extern rtx single_set_2 (const rtx_insn *, const_rtx);
+extern bool contains_symbol_ref_p (const_rtx);

 /* Handle the cheap and common cases inline for performance.  */

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 228971)
+++ gcc/rtlanal.c   (working copy)
@@ -6232,6 +6232,19 @@
   return SCRATCH;
 }

+/* Return true if RTL X contains a SYMBOL_REF.  */
+
+bool
+contains_symbol_ref_p (const_rtx x)
+{
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, x, ALL)
+if (SYMBOL_REF_P (*iter))
+  return true;
+
+  return false;
+}
+
 /* Return true if X contains a thread-local symbol.  */

 bool
Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c  (revision 228971)
+++ gcc/var-tracking.c  (working copy)
@@ -671,7 +671,6 @@
 static bool dataflow_set_different (dataflow_set *, dataflow_set *);
 static void dataflow_set_destroy (dataflow_set *);

-static bool contains_symbol_ref (rtx);
 static bool track_expr_p (tree, bool);
 static bool same_variable_part_p (rtx, tree, HOST_WIDE_INT);
 static void add_uses_1 (rtx *, void *);
@@ -5031,42 +5030,6 @@
   set->vars = NULL;
 }

-/* Return true if RTL X contains a SYMBOL_REF.  */
-
-static bool
-contains_symbol_ref (rtx x)
-{
-  const char *fmt;
-  RTX_CODE code;
-  int i;
-
-  if (!x)
-return false;
-
-  code = GET_CODE (x);
-  if (code == SYMBOL_REF)
-return true;
-
-  fmt = GET_RTX_FORMAT (code);
-  for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
-{
-  if (fmt[i] == 'e')
-   {
- if (contains_symbol_ref (XEXP (x, i)))
-   return true;
-   }
-  else if (fmt[i] == 'E')
-   {
- int j;
- for (j = 0; j < XVECLEN (x, i); j++)
-   if (contains_symbol_ref (XVECEXP (x, i, j)))
- return true;
-   }
-}
-
-  return false;
-}
-
 /* Shall EXPR be tracked?  */

 static bool
@@ -5147,7 +5110,7 @@
  char **_dl_argv;
   */
   if (decl_rtl && MEM_P (decl_rtl)
-  && contains_symbol_ref (XEXP (decl_rtl, 0)))
+  && contains_symbol_ref_p (XEXP (decl_rtl, 0)))
 return 0;

   /* If RTX is a memory it should not be very large (because it would be

Anatoliy.
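The patch above replaces two hand-written recursive walkers with one iterator-based search. For readers unfamiliar with that style, here is a small stand-alone sketch of the same idea on a toy expression tree, using an explicit worklist instead of recursion. The `Code` and `Node` types are invented for this example and only loosely mimic RTL; the real code uses FOR_EACH_SUBRTX as shown in the diff.

```cpp
#include <cassert>
#include <vector>

// Toy expression codes and nodes, loosely modelled on RTL.
enum class Code { SymbolRef, Reg, Plus, Mem };

struct Node
{
  Code code;
  std::vector<const Node *> operands;
};

// Return true if the expression rooted at ROOT contains a SymbolRef.
// An explicit worklist replaces the recursion of the removed helpers.
bool contains_symbol_ref (const Node *root)
{
  if (!root)
    return false;

  std::vector<const Node *> worklist{root};
  while (!worklist.empty ())
    {
      const Node *n = worklist.back ();
      worklist.pop_back ();
      assert (n != nullptr);   // only non-null nodes are ever queued

      if (n->code == Code::SymbolRef)
        return true;
      for (const Node *op : n->operands)
        if (op)
          worklist.push_back (op);
    }
  return false;
}
```

The search stops at the first match, and the null check at the top mirrors the `if (!x)` guard of the removed var-tracking.c helper.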


[PATCH] Use subreg_regno instead of subreg_regno_offset

2015-10-26 Thread Anatoliy Sokolov

Hello.

  This patch replaces the code 'REGNO (subreg) + subreg_regno_offset (...)' with
subreg_regno (subreg).


Bootstrapped and reg-tested on x86_64-unknown-linux-gnu.

OK for trunk?

2015-10-27  Anatoly Sokolov  

* caller-save.c (add_stored_regs): Use subreg_regno instead of
subreg_regno_offset.
* df-scan.c (df_ref_record): Ditto.
* reg-stack.c (get_true_reg): Ditto.
* reload.c (operands_match_p, find_reloads_address_1,
reg_overlap_mentioned_for_reload_p): Ditto.

Index: gcc/caller-save.c
===
--- gcc/caller-save.c   (revision 229083)
+++ gcc/caller-save.c   (working copy)
@@ -991,31 +991,25 @@
 add_stored_regs (rtx reg, const_rtx setter, void *data)
 {
   int regno, endregno, i;
-  machine_mode mode = GET_MODE (reg);
-  int offset = 0;

   if (GET_CODE (setter) == CLOBBER)
 return;

-  if (GET_CODE (reg) == SUBREG
+  if (SUBREG_P (reg)
   && REG_P (SUBREG_REG (reg))
-  && REGNO (SUBREG_REG (reg)) < FIRST_PSEUDO_REGISTER)
+  && HARD_REGISTER_P (SUBREG_REG (reg)))
 {
-  offset = subreg_regno_offset (REGNO (SUBREG_REG (reg)),
-   GET_MODE (SUBREG_REG (reg)),
-   SUBREG_BYTE (reg),
-   GET_MODE (reg));
-  regno = REGNO (SUBREG_REG (reg)) + offset;
+  regno = subreg_regno (reg);
   endregno = regno + subreg_nregs (reg);
 }
-  else
+  else if (REG_P (reg)
+  && HARD_REGISTER_P (reg))
 {
-  if (!REG_P (reg) || REGNO (reg) >= FIRST_PSEUDO_REGISTER)
-   return;
-
-  regno = REGNO (reg) + offset;
-  endregno = end_hard_regno (mode, regno);
+  regno = REGNO (reg);
+  endregno = end_hard_regno (GET_MODE (reg), regno);
 }
+  else
+return;

   for (i = regno; i < endregno; i++)
 SET_REGNO_REG_SET ((regset) data, i);
Index: gcc/df-scan.c
===
--- gcc/df-scan.c   (revision 229083)
+++ gcc/df-scan.c   (working copy)
@@ -2588,8 +2588,7 @@

   if (GET_CODE (reg) == SUBREG)
{
- regno += subreg_regno_offset (regno, GET_MODE (SUBREG_REG (reg)),
-   SUBREG_BYTE (reg), GET_MODE (reg));
+ regno = subreg_regno (reg);
  endregno = regno + subreg_nregs (reg);
}
   else
Index: gcc/reg-stack.c
===
--- gcc/reg-stack.c (revision 229083)
+++ gcc/reg-stack.c (working copy)
@@ -416,11 +416,7 @@
  rtx subreg;
  if (STACK_REG_P (subreg = SUBREG_REG (*pat)))
{
- int regno_off = subreg_regno_offset (REGNO (subreg),
-  GET_MODE (subreg),
-  SUBREG_BYTE (*pat),
-  GET_MODE (*pat));
- *pat = FP_MODE_REG (REGNO (subreg) + regno_off,
+ *pat = FP_MODE_REG (subreg_regno (subreg),
  GET_MODE (subreg));
  return pat;
}
Index: gcc/reload.c
===
--- gcc/reload.c(revision 229083)
+++ gcc/reload.c(working copy)
@@ -2256,10 +2256,7 @@
  i = REGNO (SUBREG_REG (x));
  if (i >= FIRST_PSEUDO_REGISTER)
goto slow;
- i += subreg_regno_offset (REGNO (SUBREG_REG (x)),
-   GET_MODE (SUBREG_REG (x)),
-   SUBREG_BYTE (x),
-   GET_MODE (x));
+ i = subreg_regno (x);
}
   else
i = REGNO (x);
@@ -2269,10 +2266,7 @@
  j = REGNO (SUBREG_REG (y));
  if (j >= FIRST_PSEUDO_REGISTER)
goto slow;
- j += subreg_regno_offset (REGNO (SUBREG_REG (y)),
-   GET_MODE (SUBREG_REG (y)),
-   SUBREG_BYTE (y),
-   GET_MODE (y));
+ j = subreg_regno (y);
}
   else
j = REGNO (y);
@@ -5522,12 +5516,7 @@
op0 = SUBREG_REG (op0);
code0 = GET_CODE (op0);
if (code0 == REG && REGNO (op0) < FIRST_PSEUDO_REGISTER)
- op0 = gen_rtx_REG (word_mode,
-(REGNO (op0) +
- subreg_regno_offset (REGNO (SUBREG_REG (orig_op0)),
-  GET_MODE (SUBREG_REG (orig_op0)),
-  SUBREG_BYTE (orig_op0),
-  GET_MODE (orig_op0))));
+ op0 = gen_rtx_REG (word_mode, subreg_regno (op0));
  }

if (GET_CODE (op1) == SUBREG)
@@ -5537,12 +5526,7 @@
if (code1 == REG && REGNO

[MCORE] Hookize GO_IF_LEGITIMATE_ADDRESS

2015-10-26 Thread Anatoliy Sokolov

Hi.

This patch removes the obsolete GO_IF_LEGITIMATE_ADDRESS macro from
the MCORE back end of GCC and introduces the equivalent
TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P target hook.
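
As a rough standalone illustration of the displacement rule the new hook applies (per the comment in the patch: 0..15 for QI, 0..30 for HI, 0..60 for SI and 0..56 for DI), the check can be written without any GCC types. The function name and the use of a plain byte size in place of a machine mode are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if OFFSET is an acceptable displacement for an access
   of MODE_SIZE bytes.  MODE_SIZE stands in for GET_MODE_SIZE (mode);
   the name and signature are hypothetical.  */
static bool
toy_mcore_legitimate_index (unsigned mode_size, long offset)
{
  assert (mode_size >= 1 && mode_size <= 64);

  /* Converting to unsigned makes negative offsets very large, so
     they fail the range checks below.  */
  unsigned long uoff = (unsigned long) offset;

  if (mode_size >= 4)
    return uoff % 4 == 0 && uoff <= 64UL - mode_size;
  if (mode_size == 2)
    return uoff % 2 == 0 && uoff <= 30;
  if (mode_size == 1)
    return uoff <= 15;
  return false;
}
```

For example, an 8-byte access accepts 56 but not 60, because the second word-sized load would otherwise fall outside the reachable range.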

Regression tested on mcore-unknown-elf; only the compile tests were run.

OK for trunk?

2015-08-24  Anatoly Sokolov  

* config/mcore/mcore.h (REG_OK_FOR_BASE_P, REG_OK_FOR_INDEX_P,
  BASE_REGISTER_RTX_P, INDEX_REGISTER_RTX_P,
  GO_IF_LEGITIMATE_ADDRESS): Remove macros.
* config/mcore/mcore.c (mcore_reg_ok_for_base_p,
  mcore_base_register_rtx_p, mcore_legitimate_index_p,
  mcore_legitimate_address_p): New functions.
  (TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P): Define.

Index: gcc/config/mcore/mcore.c
===
--- gcc/config/mcore/mcore.c(revision 227044)
+++ gcc/config/mcore/mcore.c(working copy)
@@ -156,6 +156,8 @@
 static bool   mcore_warn_func_return(tree);
 static void   mcore_option_override(void);
 static bool   mcore_legitimate_constant_p   (machine_mode, rtx);
+static bool  mcore_legitimate_address_p(machine_mode, rtx, bool,
+addr_space_t);
 
 /* MCore specific attributes.  */

@@ -243,6 +245,8 @@

 #undef TARGET_LEGITIMATE_CONSTANT_P
 #define TARGET_LEGITIMATE_CONSTANT_P mcore_legitimate_constant_p
+#undef TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P
+#define TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P mcore_legitimate_address_p

 #undef TARGET_WARN_FUNC_RETURN
 #define TARGET_WARN_FUNC_RETURN mcore_warn_func_return
@@ -3196,3 +3200,74 @@
 {
   return GET_CODE (x) != CONST_DOUBLE;
 }
+
+/* Helper function for `mcore_legitimate_address_p'.  */
+
+static bool
+mcore_reg_ok_for_base_p (const_rtx reg, bool strict_p)
+{
+  if (strict_p)
+return REGNO_OK_FOR_BASE_P (REGNO (reg));
+  else
+return (REGNO (reg) <= 16 || !HARD_REGISTER_P (reg));
+}
+
+static bool
+mcore_base_register_rtx_p (const_rtx x, bool strict_p)
+{
+  return REG_P(x) && mcore_reg_ok_for_base_p (x, strict_p);
+}
+
+/*  A legitimate index for a QI is 0..15, for HI is 0..30, for SI is 0..60,
+and for DI is 0..56 because we use two SI loads, etc.  */
+
+static bool
+mcore_legitimate_index_p (machine_mode mode, const_rtx op)
+{
+  if (CONST_INT_P (op))
+{
+  if (GET_MODE_SIZE (mode) >= 4
+ && (((unsigned HOST_WIDE_INT) INTVAL (op)) % 4) == 0
+ &&  ((unsigned HOST_WIDE_INT) INTVAL (op))
+ <= (unsigned HOST_WIDE_INT) 64 - GET_MODE_SIZE (mode))
+   return true;
+  if (GET_MODE_SIZE (mode) == 2
+ && (((unsigned HOST_WIDE_INT) INTVAL (op)) % 2) == 0
+ &&  ((unsigned HOST_WIDE_INT) INTVAL (op)) <= 30)
+   return true;
+  if (GET_MODE_SIZE (mode) == 1
+ && ((unsigned HOST_WIDE_INT) INTVAL (op)) <= 15)
+   return true;
+  }
+  return false;
+}
+
+
+/* Worker function for TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P.
+
+   Allow  REG
+ REG + disp  */
+
+static bool
+mcore_legitimate_address_p (machine_mode mode, rtx x, bool strict_p,
+   addr_space_t as)
+{
+  gcc_assert (ADDR_SPACE_GENERIC_P (as));
+
+  if (mcore_base_register_rtx_p (x, strict_p))
+return true;
+  else if (GET_CODE (x) == PLUS || GET_CODE (x) == LO_SUM)
+{
+  rtx xop0 = XEXP (x, 0);
+  rtx xop1 = XEXP (x, 1);
+  if (mcore_base_register_rtx_p (xop0, strict_p)
+ && mcore_legitimate_index_p (mode, xop1))
+   return true;
+  if (mcore_base_register_rtx_p (xop1, strict_p)
+ && mcore_legitimate_index_p (mode, xop0))
+   return true;
+}
+
+  return false;
+}
+
Index: gcc/config/mcore/mcore.h
===
--- gcc/config/mcore/mcore.h(revision 227044)
+++ gcc/config/mcore/mcore.h(working copy)
@@ -529,91 +529,6 @@
 /* Recognize any constant value that is a valid address.  */
 #define CONSTANT_ADDRESS_P(X)   (GET_CODE (X) == LABEL_REF)

-/* The macros REG_OK_FOR..._P assume that the arg is a REG rtx
-   and check its validity for a certain class.
-   We have two alternate definitions for each of them.
-   The usual definition accepts all pseudo regs; the other rejects
-   them unless they have been allocated suitable hard regs.
-   The symbol REG_OK_STRICT causes the latter definition to be used.  */
-#ifndef REG_OK_STRICT
-
-/* Nonzero if X is a hard reg that can be used as a base reg
-   or if it is a pseudo reg.  */
-#define REG_OK_FOR_BASE_P(X) \
-   (REGNO (X) <= 16 || REGNO (X) >= FIRST_PSEUDO_REGISTER)
-
-/* Nonzero if X is a hard reg that can be used as an index
-   or if it is a pseudo reg.  */
-#define REG_OK_FOR_INDEX_P(X)  0
-
-#else
-
-/* Nonzero if X is a hard reg that can be used as a base reg.  */
-#define REG_OK_FOR_BASE_P(X)   \
-   REGNO_OK_FOR_BASE_P (REGNO (X))
-
-/* Nonzero if X is a hard reg that can be used as an index.  */
-#define REG_OK_FO

Re: [OpenACC 5/11] C++ FE changes

2015-10-26 Thread Cesar Philippidis
On 10/26/2015 03:20 AM, Jakub Jelinek wrote:
> On Sat, Oct 24, 2015 at 02:11:41PM -0700, Cesar Philippidis wrote:

>> --- a/gcc/cp/semantics.c
>> +++ b/gcc/cp/semantics.c
>> @@ -5911,6 +5911,31 @@ finish_omp_clauses (tree clauses, bool allow_fields, 
>> bool declare_simd)
>>  bitmap_set_bit (&firstprivate_head, DECL_UID (t));
>>goto handle_field_decl;
>>  
>> +case OMP_CLAUSE_GANG:
>> +case OMP_CLAUSE_VECTOR:
>> +case OMP_CLAUSE_WORKER:
>> +  /* Operand 0 is the num: or length: argument.  */
>> +  t = OMP_CLAUSE_OPERAND (c, 0);
>> +  if (t == NULL_TREE)
>> +break;
>> +
>> +  if (!processing_template_decl)
>> +t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);
>> +  OMP_CLAUSE_OPERAND (c, 0) = t;
>> +
>> +  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_GANG)
>> +break;
> 
> I think it would be better to do the Operand 1 stuff first for
> case OMP_CLAUSE_GANG: only, and then have /* FALLTHRU */ into
> case OMP_CLAUSE_{VECTOR,WORKER}: which would handle the first argument.
> 
> You should add testing that the operand has INTEGRAL_TYPE_P type
> (except that for processing_template_decl it can be
> type_dependent_expression_p instead of INTEGRAL_TYPE_P).
>
> Also, the if (t == NULL_TREE) stuff looks fishy, because e.g. right now
> if you have OMP_CLAUSE_GANG gang (static: expr) or similar,
> you wouldn't wrap the expr into cleanup point.
> So, instead it should be
>   if (t)
> {
>   if (t == error_mark_node)
>   remove = true;
>   else if (!type_dependent_expression_p (t)
>&& !INTEGRAL_TYPE_P (TREE_TYPE (t)))
>   {
> error_at (OMP_CLAUSE_LOCATION (c), ...);
> remove = true;
> }
>   else
>   {
> t = mark_rvalue_use (t);
> if (!processing_template_decl)
>   t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);
> OMP_CLAUSE_OPERAND (c, 0) = t;
>   }
> }
> or so.  Also, can the expressions be arbitrary integers, or just
> non-negative, or positive?  If it is INTEGER_CST, that is something that
> could be checked here too.
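
A rough sketch of the control flow suggested above, using hypothetical toy_* names rather than the real clause accessors: the gang-only second operand is handled first, then control falls through to the code shared with worker and vector.

```c
#include <assert.h>

/* Hypothetical clause kinds and operand bookkeeping; these are not
   GCC's OMP_CLAUSE_* codes.  */
enum toy_clause_kind { TOY_GANG, TOY_WORKER, TOY_VECTOR };

struct toy_clause
{
  int has_op0;		/* num: or length: argument present.  */
  int has_op1;		/* static: argument present (gang only).  */
  int op0_checked;
  int op1_checked;
};

static void
toy_finish_clause (enum toy_clause_kind kind, struct toy_clause *c)
{
  assert (c);
  switch (kind)
    {
    case TOY_GANG:
      /* Operand 1 exists only for gang, so deal with it first.  */
      if (c->has_op1)
	c->op1_checked = 1;
      /* FALLTHRU */
    case TOY_WORKER:
    case TOY_VECTOR:
      /* Operand 0 is common to all three clause kinds.  */
      if (c->has_op0)
	c->op0_checked = 1;
      break;
    default:
      assert (0);	/* Plays the role of gcc_unreachable ().  */
    }
}
```

With this shape a gang clause that has only a static: argument still reaches the operand-1 handling, which is the case the early break would have missed.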

I ended up handling them with OMP_CLAUSE_NUM_*, since they all require
positive integer expressions. The only exception was OMP_CLAUSE_GANG,
which has two optional arguments.

>>else if (!type_dependent_expression_p (t)
>> && !INTEGRAL_TYPE_P (TREE_TYPE (t)))
>>  {
>> -  error ("num_threads expression must be integral");
>> + switch (OMP_CLAUSE_CODE (c))
>> +{
>> +case OMP_CLAUSE_NUM_TASKS:
>> +  error ("%<num_tasks%> expression must be integral"); break;
>> +case OMP_CLAUSE_NUM_TEAMS:
>> +  error ("%<num_teams%> expression must be integral"); break;
>> +case OMP_CLAUSE_NUM_THREADS:
>> +  error ("%<num_threads%> expression must be integral"); break;
>> +case OMP_CLAUSE_NUM_GANGS:
>> +  error ("%<num_gangs%> expression must be integral"); break;
>> +case OMP_CLAUSE_NUM_WORKERS:
>> +  error ("%<num_workers%> expression must be integral");
>> +  break;
>> +case OMP_CLAUSE_VECTOR_LENGTH:
>> +  error ("%<vector_length%> expression must be integral");
>> +  break;
> 
> When touching these, can you please use error_at (OMP_CLAUSE_LOCATION (c),
> instead of error ( ?

Done

>> +default:
>> +  error ("invalid argument");
> 
> What invalid argument?  I'd say that is clearly gcc_unreachable (); case.
> 
> But, I think it would be better to just use
>   error_at (OMP_CLAUSE_LOCATION (c), "%qs expression must be integral",
>   omp_clause_code_name[c]);

I used that generic message for all of those clauses except for _GANG,
_WORKER and _VECTOR. The gang clause, at the very least, needed it to
disambiguate the static and num arguments. If you want I can handle
_WORKER and _VECTOR with the generic message. I only included it because
those arguments are optional, whereas they are mandatory for the other
clauses.
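
To make the generic-message approach concrete, here is a small standalone sketch of a name table indexed by clause code. The enum, the table and the formatting helper are hypothetical; they only mimic the role that omp_clause_code_name plays in the suggestion above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical clause codes; the real ones are OMP_CLAUSE_*.  */
enum toy_clause_code
{
  TOY_NUM_TASKS,
  TOY_NUM_TEAMS,
  TOY_NUM_THREADS,
  TOY_NUM_GANGS,
  TOY_NUM_WORKERS,
  TOY_VECTOR_LENGTH,
  TOY_CLAUSE_COUNT
};

/* Must stay in the same order as the enum above.  */
static const char *const toy_clause_code_name[TOY_CLAUSE_COUNT] =
{
  "num_tasks", "num_teams", "num_threads",
  "num_gangs", "num_workers", "vector_length"
};

/* Format one diagnostic for any clause, instead of one switch case
   per clause.  Returns what snprintf returns.  */
static int
toy_format_integral_error (enum toy_clause_code code, char *buf,
			   size_t len)
{
  assert (code < TOY_CLAUSE_COUNT);
  return snprintf (buf, len, "'%s' expression must be integral",
		   toy_clause_code_name[code]);
}
```

The trade-off is the one described above: a single format string cannot distinguish between several arguments of the same clause, which is why gang keeps its own messages.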

Is this patch OK for trunk?

Cesar

2015-10-26  Cesar Philippidis  
	Thomas Schwinge  
	James Norris  
	Joseph Myers  
	Julian Brown  
	Nathan Sidwell 
	Bernd Schmidt  

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Add auto, gang, seq,
	vector, worker.
	(cp_parser_oacc_simple_clause): New.
	(cp_parser_oacc_shape_clause): New.
	(cp_parser_oacc_all_clauses): Add auto, gang, seq, vector, worker.
	(OACC_LOOP_CLAUSE_MASK): Likewise.
	* semantics.c (finish_omp_clauses): Add auto, gang, seq, vector,
	worker. Unify the handling of teams, tasks and vector_length with
	the other loop shape clauses.

2015-10-26  Nathan Sidwell 
	Cesar Philippidis  

	gcc/testsuite/
	* g++.dg/gomp/pr33372-1.C: Adjust diagnostic.
	* g++.dg/gomp/pr33372-3.C: Likewise.
			  

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 7555bf3..5d07487 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29064,7 +29064,9 @@ cp_parser

Re: [OpenACC 1/11] UNIQUE internal function

2015-10-26 Thread Nathan Sidwell

Richard, Jakub,
this updates patch 1 to use the target-insns.def mechanism of detecting 
conditionally-implemented instructions.  Otherwise it's the same as yesterday's 
patch.  To recap:


1) Moved the subcodes to an enumeration in internal-fn.h

2) Removed ECF_LEAF

3) Added a check in initialize_ctrl_altering

4) The tracer code now (continues to) only look at the last stmt of a block

I looked at fnsplit and do not believe I need changes there.  That's changing 
things like:

  if (cheap test)
do cheap thing
  else
do complex thing

to break out the else part into a separate function.   That's fine -- it'll copy 
the whole CFG of interest.
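
For anyone unfamiliar with that transformation, here is a hand-written before/after sketch of the kind of split described above. The functions are made up for illustration; the real pass works on GIMPLE and decides for itself where to split.

```c
#include <assert.h>

/* Before: the cheap test and the complex work live in one function,
   so inlining it drags the loop along.  */
static int
toy_original (int x)
{
  assert (x >= 0);
  if (x < 10)
    return x + 1;		/* Cheap thing.  */
  else
    {
      int acc = 0;		/* Complex thing.  */
      for (int i = 0; i < x; i++)
	acc += i % 7;
      return acc;
    }
}

/* After: the else part is broken out into its own function ...  */
static int
toy_complex_part (int x)
{
  int acc = 0;
  for (int i = 0; i < x; i++)
    acc += i % 7;
  return acc;
}

/* ... and the remaining header is small enough to inline cheaply.  */
static int
toy_split (int x)
{
  assert (x >= 0);
  if (x < 10)
    return x + 1;
  return toy_complex_part (x);
}
```

The point for this patch is that the complex part is moved as a whole, so any block containing a unique call is copied once rather than duplicated.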


ok?

nathan
2015-10-26  Nathan Sidwell  
	
	* internal-fn.c (expand_UNIQUE): New.
	* internal-fn.h (enum ifn_unique_kind): New.
	* internal-fn.def (IFN_UNIQUE): New.
	* target-insns.def (unique): Define.
	* gimple.h (gimple_call_internal_unique_p): New.
	* gimple.c (gimple_call_same_target_p): Check internal fn
	uniqueness.
	* tracer.c (ignore_bb_p): Check for IFN_UNIQUE call.
	* tree-ssa-threadedge.c
	(record_temporary_equivalences_from_stmts): Likewise.
	* tree-cfg.c (gimple_call_initialize_ctrl_altering): Likewise.

Index: gcc/target-insns.def
===
--- gcc/target-insns.def	(revision 229276)
+++ gcc/target-insns.def	(working copy)
@@ -89,5 +93,6 @@ DEF_TARGET_INSN (stack_protect_test, (rt
 DEF_TARGET_INSN (store_multiple, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (tablejump, (rtx x0, rtx x1))
 DEF_TARGET_INSN (trap, (void))
+DEF_TARGET_INSN (unique, (void))
 DEF_TARGET_INSN (untyped_call, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (untyped_return, (rtx x0, rtx x1))
Index: gcc/gimple.c
===
--- gcc/gimple.c	(revision 229276)
+++ gcc/gimple.c	(working copy)
@@ -1346,7 +1346,8 @@ gimple_call_same_target_p (const gimple
 {
   if (gimple_call_internal_p (c1))
 return (gimple_call_internal_p (c2)
-	&& gimple_call_internal_fn (c1) == gimple_call_internal_fn (c2));
+	&& gimple_call_internal_fn (c1) == gimple_call_internal_fn (c2)
+	&& !gimple_call_internal_unique_p (as_a <const gcall *> (c1)));
   else
 return (gimple_call_fn (c1) == gimple_call_fn (c2)
 	|| (gimple_call_fndecl (c1)
Index: gcc/gimple.h
===
--- gcc/gimple.h	(revision 229276)
+++ gcc/gimple.h	(working copy)
@@ -2895,6 +2895,21 @@ gimple_call_internal_fn (const gimple *g
   return gimple_call_internal_fn (gc);
 }
 
+/* Return true, if this internal gimple call is unique.  */
+
+static inline bool
+gimple_call_internal_unique_p (const gcall *gs)
+{
+  return gimple_call_internal_fn (gs) == IFN_UNIQUE;
+}
+
+static inline bool
+gimple_call_internal_unique_p (const gimple *gs)
+{
+  const gcall *gc = GIMPLE_CHECK2<const gcall *> (gs);
+  return gimple_call_internal_unique_p (gc);
+}
+
 /* If CTRL_ALTERING_P is true, mark GIMPLE_CALL S to be a stmt
that could alter control flow.  */
 
Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c	(revision 229276)
+++ gcc/internal-fn.c	(working copy)
@@ -1958,6 +1958,30 @@ expand_VA_ARG (gcall *stmt ATTRIBUTE_UNU
   gcc_unreachable ();
 }
 
+/* Expand the IFN_UNIQUE function according to its first argument.  */
+
+static void
+expand_UNIQUE (gcall *stmt)
+{
+  rtx pattern = NULL_RTX;
+  enum ifn_unique_kind kind
+= (enum ifn_unique_kind) TREE_INT_CST_LOW (gimple_call_arg (stmt, 0));
+
+  switch (kind)
+{
+default:
+  gcc_unreachable ();
+
+case IFN_UNIQUE_UNSPEC:
+  if (targetm.have_unique ())
+	pattern = targetm.gen_unique ();
+  break;
+}
+
+  if (pattern)
+emit_insn (pattern);
+}
+
 /* Routines to expand each internal function, indexed by function number.
Each routine has the prototype:
 
Index: gcc/internal-fn.h
===
--- gcc/internal-fn.h	(revision 229276)
+++ gcc/internal-fn.h	(working copy)
@@ -20,6 +20,11 @@ along with GCC; see the file COPYING3.
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+/* INTEGER_CST values for IFN_UNIQUE function arg-0.  */
+enum ifn_unique_kind {
+  IFN_UNIQUE_UNSPEC   /* Undifferentiated UNIQUE.  */
+};
+
 /* Initialize internal function tables.  */
 
 extern void init_internal_fns ();
Index: gcc/internal-fn.def
===
--- gcc/internal-fn.def	(revision 229276)
+++ gcc/internal-fn.def	(working copy)
@@ -65,3 +65,10 @@ DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
+
+/* An unduplicable, uncombinable function.  Generally used to preserve
+   a CFG property in the face of jump threading, tail merging or
+   other such optimizations

Re: [OpenACC 4/11] C FE changes

2015-10-26 Thread Cesar Philippidis
On 10/26/2015 01:59 AM, Jakub Jelinek wrote:

> Ok for trunk with those changes fixed.

Here's the patch with those changes. Nathan will commit this patch with the
rest of the OpenACC execution model patches.
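
For reference, a small example of the clause syntax the new parser code is meant to accept, following the grammar documented in the patch comment. This is only a sketch: the function and macro names are made up, the choice of a kernels region and the clause arguments are assumptions, and a compiler built without OpenACC support simply ignores the pragmas and runs the loops sequentially.

```c
#include <assert.h>

#define TOY_ROWS 8
#define TOY_COLS 32

/* Scale every element of a TOY_ROWS x TOY_COLS matrix stored as a
   flat array.  */
static void
toy_scale (float *a, float factor)
{
  assert (a);
#pragma acc kernels copy(a[0:TOY_ROWS * TOY_COLS])
  {
    /* gang with a static: argument, here the '*' form.  */
#pragma acc loop gang(static: *)
    for (int i = 0; i < TOY_ROWS; i++)
      {
	/* worker with num: and vector with length:.  */
#pragma acc loop worker(num: 4) vector(length: 32)
	for (int j = 0; j < TOY_COLS; j++)
	  a[i * TOY_COLS + j] *= factor;
      }
  }
}
```

The result should be the same with or without OpenACC enabled; only the mapping of iterations onto gangs, workers and vector lanes differs.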

Thanks,
Cesar

2015-10-26  Cesar Philippidis  
	Thomas Schwinge  
	James Norris  
	Joseph Myers  
	Julian Brown  
	Bernd Schmidt  

	gcc/c/
	* c-parser.c (c_parser_oacc_shape_clause): New.
	(c_parser_oacc_simple_clause): New.
	(c_parser_oacc_all_clauses): Add auto, gang, seq, vector, worker.
	(OACC_LOOP_CLAUSE_MASK): Add gang, worker, vector, auto, seq.

2015-10-26  Cesar Philippidis  

	gcc/testsuite/
	* c-c++-common/goacc/loop-shape.c: New test.


diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index c8c6a2d..13f09d8 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11188,6 +11188,167 @@ c_parser_omp_clause_num_workers (c_parser *parser, tree list)
 }
 
 /* OpenACC:
+
+gang [( gang-arg-list )]
+worker [( [num:] int-expr )]
+vector [( [length:] int-expr )]
+
+  where gang-arg is one of:
+
+[num:] int-expr
+static: size-expr
+
+  and size-expr may be:
+
+*
+int-expr
+*/
+
+static tree
+c_parser_oacc_shape_clause (c_parser *parser, omp_clause_code kind,
+			const char *str, tree list)
+{
+  const char *id = "num";
+  tree ops[2] = { NULL_TREE, NULL_TREE }, c;
+  location_t loc = c_parser_peek_token (parser)->location;
+
+  if (kind == OMP_CLAUSE_VECTOR)
+id = "length";
+
+  if (c_parser_next_token_is (parser, CPP_OPEN_PAREN))
+{
+  c_parser_consume_token (parser);
+
+  do
+	{
+	  c_token *next = c_parser_peek_token (parser);
+	  int idx = 0;
+
+	  /* Gang static argument.  */
+	  if (kind == OMP_CLAUSE_GANG
+	  && c_parser_next_token_is_keyword (parser, RID_STATIC))
+	{
+	  c_parser_consume_token (parser);
+
+	  if (!c_parser_require (parser, CPP_COLON, "expected %<:%>"))
+		goto cleanup_error;
+
+	  idx = 1;
+	  if (ops[idx] != NULL_TREE)
+		{
+		  c_parser_error (parser, "too many %<static%> arguments");
+		  goto cleanup_error;
+		}
+
+	  /* Check for the '*' argument.  */
+	  if (c_parser_next_token_is (parser, CPP_MULT))
+		{
+		  c_parser_consume_token (parser);
+		  ops[idx] = integer_minus_one_node;
+
+		  if (c_parser_next_token_is (parser, CPP_COMMA))
+		{
+		  c_parser_consume_token (parser);
+		  continue;
+		}
+		  else
+		break;
+		}
+	}
+	  /* Worker num: argument and vector length: arguments.  */
+	  else if (c_parser_next_token_is (parser, CPP_NAME)
+		   && strcmp (id, IDENTIFIER_POINTER (next->value)) == 0
+		   && c_parser_peek_2nd_token (parser)->type == CPP_COLON)
+	{
+	  c_parser_consume_token (parser);  /* id  */
+	  c_parser_consume_token (parser);  /* ':'  */
+	}
+
+	  /* Now collect the actual argument.  */
+	  if (ops[idx] != NULL_TREE)
+	{
+	  c_parser_error (parser, "unexpected argument");
+	  goto cleanup_error;
+	}
+
+	  location_t expr_loc = c_parser_peek_token (parser)->location;
+	  tree expr = c_parser_expr_no_commas (parser, NULL).value;
+	  if (expr == error_mark_node)
+	goto cleanup_error;
+
+	  mark_exp_read (expr);
+	  expr = c_fully_fold (expr, false, NULL);
+
+	  /* Attempt to statically determine when the number isn't a
+	 positive integer.  */
+
+	  if (!INTEGRAL_TYPE_P (TREE_TYPE (expr)))
+	{
+	  c_parser_error (parser, "expected integer expression");
+	  return list;
+	}
+
+	  tree c = fold_build2_loc (expr_loc, LE_EXPR, boolean_type_node, expr,
+build_int_cst (TREE_TYPE (expr), 0));
+	  if (c == boolean_true_node)
+	{
+	  warning_at (loc, 0,
+			  "%<%s%> value must be positive", str);
+	  expr = integer_one_node;
+	}
+
+	  ops[idx] = expr;
+
+	  if (kind == OMP_CLAUSE_GANG
+	  && c_parser_next_token_is (parser, CPP_COMMA))
+	{
+	  c_parser_consume_token (parser);
+	  continue;
+	}
+	  break;
+	}
+  while (1);
+
+  if (!c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>"))
+	goto cleanup_error;
+}
+
+  check_no_duplicate_clause (list, kind, str);
+
+  c = build_omp_clause (loc, kind);
+
+  if (ops[1])
+OMP_CLAUSE_OPERAND (c, 1) = ops[1];
+
+  OMP_CLAUSE_OPERAND (c, 0) = ops[0];
+  OMP_CLAUSE_CHAIN (c) = list;
+
+  return c;
+
+ cleanup_error:
+  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, 0);
+  return list;
+}
+
+/* OpenACC:
+   auto
+   independent
+   nohost
+   seq */
+
+static tree
+c_parser_oacc_simple_clause (c_parser *parser, enum omp_clause_code code,
+			 tree list)
+{
+  check_no_duplicate_clause (list, code, omp_clause_code_name[code]);
+
+  tree c = build_omp_clause (c_parser_peek_token (parser)->location, code);
+  OMP_CLAUSE_CHAIN (c) = list;
+
+  return c;
+}
+
+/* OpenACC:
async [( int-expr )] */
 
 static tree
@@ -12393,6 +12554,11 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_oacc_clause_async (parser, clauses);
 	  c_name = "async";
 	

[gomp4] reorganize reduction target lowering

2015-10-26 Thread Nathan Sidwell

I've committed this to gomp4.  It changes the target reduction lowering to:

1) use a single internal function with an initial discriminator argument, in the
same manner as IFN_UNIQUE and IFN_GOACC_LOOP


2) Rather than identify reductions with a loop-id and reduction-id, which would
cause difficulties with inlining, we simply identify them with an offset
determined when generating the lowering.  This offset is useful in our case for
worker-level reductions, but could be used for other purposes by different backends.
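
To give a feel for offset-based identification, here is a toy allocator that hands each reduction variable an aligned offset into a shared buffer and tracks the total size needed. It is a sketch under my own assumptions (names, layout policy, power-of-two alignments), not the algorithm in the patch; only the round-up expression mirrors what nvptx_file_end does with worker_red_size.

```c
#include <assert.h>

/* Hypothetical bookkeeping for a shared reduction buffer.  */
struct toy_red_buffer
{
  unsigned size;	/* Bytes used so far.  */
  unsigned align;	/* Largest alignment requested so far.  */
};

/* Round V up to a multiple of ALIGN, which must be a power of two.  */
static unsigned
toy_round_up (unsigned v, unsigned align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (v + align - 1) & ~(align - 1);
}

/* Reserve VAR_SIZE bytes aligned to VAR_ALIGN and return the offset,
   which then serves as the identifier for that reduction.  */
static unsigned
toy_alloc_reduction (struct toy_red_buffer *buf, unsigned var_size,
		     unsigned var_align)
{
  assert (buf);
  unsigned offset = toy_round_up (buf->size, var_align);
  buf->size = offset + var_size;
  if (var_align > buf->align)
    buf->align = var_align;
  return offset;
}
```

Because the offset is fixed at lowering time, it survives inlining unchanged, whereas a (loop-id, reduction-id) pair would have to be renumbered.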


nathan
2015-10-26  Nathan Sidwell  

	* doc/tm.texi: Rebuilt.
	* internal-fn.c (expand_GOACC_REDUCTION_SETUP,
	expand_GOACC_REDUCTION_INIT, expand_GOACC_REDUCTION_FINI,
	expand_GOACC_REDUCTION_TEARDOWN): Replace with ...
	(expand_GOACC_REDUCTION): ... this.
	* internal-fn.def (GOACC_REDUCTION_SETUP,
	GOACC_REDUCTION_INIT, GOACC_REDUCTION_FINI,
	GOACC_REDUCTION_TEARDOWN): Replace with ...
	(GOACC_REDUCTION): ... this.
	* internal-fn.h (enum ifn_goacc_reduction_kind): New.
	* omp-low.c (lower_rec_input_clauses): Adjust OpenACC comment.
	(lower_oacc_reductions): Remove RID & LID, calculate
	offset. Adjust for IFN_GOACC_REDUCTION change.
	(default_goacc_reduction): Don't return bool.  Adjust for argument
	shift.
	(execute_oacc_device_lower): Adjust for IFN_GOACC_REDUCTION
	change.
	* target.def (goacc_reduction): Adjust hook.
	* targhooks.h (default_goacc_reduction): Return void.
	* config/nvptx/nvptx.c (worker_red_hwm): Rename to ...
	(worker_red_size): ... here.
	(var_red_t, struct loop_red, loop_reds): Delete.
	(nvptx_reorg_reductions): Delete.
	(nvptx_reorg): Don't reorg reductions.
	(nvptx_file_end): Adjust worker reduction size name.
	(nvptx_expand_worker_addr): Reimplement.
	(nvptx_init_builtins): Adjust WORKER_ADDR prototype.
	(nvptx_get_worker_red_addr): Reimplement.
	(nvptx_goacc_reduction_setup, nvptx_goacc_reduction_init,
	nvptx_goacc_reduction_fini, nvptx_goacc_reduction_teardown): Don't
	return bool.  Adjust for argument shift & worker offset
	processing.
	(nvptx_goacc_reduction): Adjust.

Index: gcc/config/nvptx/nvptx.c
===
--- gcc/config/nvptx/nvptx.c	(revision 229392)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -119,40 +119,13 @@ static unsigned worker_bcast_align;
 static GTY(()) rtx worker_bcast_sym;
 
 /* Size of buffer needed for worker reductions.  This has to be
-   disjoing from the worker broadcast array, as both may be live
+   distinct from the worker broadcast array, as both may be live
concurrently.  */
-static unsigned worker_red_hwm;
+static unsigned worker_red_size;
 static unsigned worker_red_align;
 #define worker_red_name "__worker_red"
 static GTY(()) rtx worker_red_sym;
 
-/* To process worker-level reductions we need a buffer in CTA local
-   (.shared) memory.  As the number of loops per function and number
-   of reductions per loop are likely to be small numbers, we use
-   simple unsorted vectors to hold the mappings.  */
-
-/* Mapping from a reduction to an offset within the worker reduction
-   array.  */
-typedef std::pair var_red_t;
-
-/* Mapping from loops within a function to lists of reductions on that
-   loop.  */
-struct loop_red
-{
-  unsigned id;  /* Loop ID.  */
-  unsigned hwm;  /* Allocated worker buffer for this loop.  */
-  auto_vec vars;   /* Reduction variables of the loop.  */
-
-  loop_red (unsigned id_)
-  :id (id_), hwm (0) 
-  {
-  }
-};
-
-/* It would be nice to put this intp machine_function, but auto_vec
-   pulls in too much other stuff.   */
-static auto_vec loop_reds;
-
 /* Allocate a new, cleared machine_function structure.  */
 
 static struct machine_function *
@@ -3785,21 +3758,7 @@ nvptx_neuter_pars (parallel *par, unsign
 nvptx_neuter_pars (par->next, modes, outer);
 }
 
-static void
-nvptx_reorg_reductions (void)
-{
-  unsigned ix;
-
-  for (ix = loop_reds.length (); ix--;)
-{
-  if (loop_reds[ix].hwm > worker_red_hwm)
-	worker_red_hwm = loop_reds[ix].hwm;
-  loop_reds.pop ();
-}
-}
-
 /* PTX-specific reorganization
-   - Scan and release reduction buffers
- Split blocks at fork and join instructions
- Compute live registers
- Mark now-unused registers, so function begin doesn't declare
@@ -3812,8 +3771,6 @@ nvptx_reorg_reductions (void)
 static void
 nvptx_reorg (void)
 {
-  nvptx_reorg_reductions ();
-  
   /* We are freeing block_for_insn in the toplev to keep compatibility
  with old MDEP_REORGS that are not CFG based.  Recompute it now.  */
   compute_bb_for_insn ();
@@ -4023,17 +3980,17 @@ nvptx_file_end (void)
 	   worker_bcast_name, worker_bcast_hwm);
 }
 
-  if (worker_red_hwm)
+  if (worker_red_size)
 {
   /* Define the reduction buffer.  */
 
-  worker_red_hwm = (worker_red_hwm + worker_red_align - 1)
+  worker_red_size = (worker_red_size + worker_red_align - 1)
 	& ~(worker_red_align - 1);
   
   fprintf (asm_out_file, "// BEGIN VAR DEF: %s\n", worker_red_name);
   fp

Re: [PATCH][wwwdocs] Mention arm target attributes and pragmas in GCC 6 changes

2015-10-26 Thread Gerald Pfeifer

On Mon, 26 Oct 2015, Kyrill Tkachov wrote:

Here's a patch to mention the new target attributes for arm, in the same
wording as for aarch64.


This looks good to me, thanks.

Gerald


Re: [PATCH] libffi testsuite: Don't run testsuite/libffi.call/float2.c on hppa*-*-hpux*

2015-10-26 Thread Andreas Tobler

Hi John,

On 29.03.15 22:01, Andreas Tobler wrote:

On 28.03.15 20:25, John David Anglin wrote:

The libffi.call/float2.c test uses fabsl which was introduced in c99 and isn't 
available on hppa*-*-hpux*.
In order to use the target selector with dg-run, I need to load 
target-supports-dg.exp in lib/libffi.exp.

Tested on hppa2.0w-hp-hpux11.11.  Okay for trunk?


  From the testsuite pov. ok, but I do not have a picture of the trunk
check in restrictions regarding the next release.


How about committing it now?

Andreas



Re: [PATCH][auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

2015-10-26 Thread Jeff Law

On 10/26/2015 10:07 AM, Richard Sandiford wrote:

Bernd Schmidt  writes:

I seem to recall Richard had a rewrite of all the autoinc code. I wonder
what happened to that?


Although it produced more autoincs, it didn't really improve performance
that much on the targets I was looking at at the time.

I'm afraid the patch is long lost now, and would probably be in an
uncertain copyright situation anyway.

Yup.  I wouldn't want to untangle that mess of legal opinions.

Out of curiosity, what was the basic premise behind what you did to get 
more autoincs?

jeff


C++ PATCH for DR 2179 (ambiguous partial specialization after instantiation)

2015-10-26 Thread Jason Merrill
While discussing issue 2179 at the meeting last week, I noticed that we 
were crashing when a partial specialization made a previous 
instantiation ambiguous.  Fixed thus.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit cb0817f19d7f4ce9878922412c2c1b2d07eb66d7
Author: Jason Merrill 
Date:   Sun Oct 25 05:22:50 2015 -1000

	DR 2179
	* pt.c (process_partial_specialization): Handle error_mark_node
	from most_specialized_partial_spec.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ffe02da..2745b40 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4690,14 +4690,18 @@ process_partial_specialization (tree decl)
 	  : DECL_TEMPLATE_INSTANTIATION (instance))
 	{
 	  tree spec = most_specialized_partial_spec (instance, tf_none);
-	  if (spec && TREE_VALUE (spec) == tmpl)
-	{
-	  tree inst_decl = (DECL_P (instance)
-? instance : TYPE_NAME (instance));
-	  permerror (input_location,
-			 "partial specialization of %qD after instantiation "
-			 "of %qD", decl, inst_decl);
-	}
+	  tree inst_decl = (DECL_P (instance)
+			? instance : TYPE_NAME (instance));
+	  if (!spec)
+	/* OK */;
+	  else if (spec == error_mark_node)
+	permerror (input_location,
+		   "declaration of %qD ambiguates earlier template "
+		   "instantiation for %qD", decl, inst_decl);
+	  else if (TREE_VALUE (spec) == tmpl)
+	permerror (input_location,
+		   "partial specialization of %qD after instantiation "
+		   "of %qD", decl, inst_decl);
 	}
 }
 
diff --git a/gcc/testsuite/g++.dg/template/partial-specialization3.C b/gcc/testsuite/g++.dg/template/partial-specialization3.C
new file mode 100644
index 000..c5f83bd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/partial-specialization3.C
@@ -0,0 +1,7 @@
+// DR 2179
+
+template  class A;
+template  struct A { void f(); };
+template  void g(T) { A().f(); }   // #1
+template struct A {};		// { dg-error "" }
+A f;   // #2


Re: [PATCH] rs6000: Fix tests for xvmadd and xvnmsub

2015-10-26 Thread David Edelsohn
On Sun, Oct 25, 2015 at 8:59 PM, Segher Boessenkool
 wrote:
> The patterns involved can create vmadd resp. vnmsub instructions instead.
> This patch changes the testcases to allow those.
>
> Tested with -m32,-m32/-mpowerpc64,-m64; okay for trunk?
>
>
> Segher
>
>
> 2015-10-26  Segher Boessenkool  
>
> gcc/testsuite/
> * gcc.target/powerpc/vsx-builtin-2.c: Allow vmadd and vnmsub as well
> as xvmadd and xvnmsub.
> * gcc.target/powerpc/vsx-vector-2.c: Allow vmadd as well as xvmadd.

Okay.

thanks, David


Re: [PATCH] rs6000: p8vector-builtin-8.c test requires int128

2015-10-26 Thread David Edelsohn
On Sun, Oct 25, 2015 at 8:59 PM, Segher Boessenkool
 wrote:
> For 32-bit targets p8vector_ok does not imply we have int128.
>
> Tested with -m32,-m32/-mpowerpc64,-m64; okay for trunk?
>
>
> Segher
>
>
> 2015-10-26  Segher Boessenkool  
>
> gcc/testsuite/
> * gcc.target/powerpc/p8vector-builtin-8.c: Add "target int128".

Okay.

Thanks, David


Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Doug Evans
On Fri, Oct 23, 2015 at 11:34 AM, Szabolcs Nagy  wrote:
> On 23/10/15 18:39, Doug Evans wrote:
>>
>> On Fri, Oct 23, 2015 at 10:08 AM, Bernd Schmidt 
>> wrote:
>>>
>>>
>>> On 10/21/2015 09:00 PM, Doug Evans wrote:


 I happened to notice local prefixes not working with musl.
 Fixes thusly.
>>>
>>>
>>>
 Index: config/linux.h
 ===
 --- config/linux.h(revision 229130)
 +++ config/linux.h(working copy)
 @@ -174,6 +174,7 @@
#define INCLUDE_DEFAULTS\
  {\
INCLUDE_DEFAULTS_MUSL_GPP\
 +INCLUDE_DEFAULTS_MUSL_LOCAL\
INCLUDE_DEFAULTS_MUSL_PREFIX\
INCLUDE_DEFAULTS_MUSL_CROSS\
INCLUDE_DEFAULTS_MUSL_TOOL\
>>>
>>>
>>>
>>> Looks pretty obvious given that the macro isn't otherwise used AFAICT.
>>> However, I have no idea whether the order is right, since the purpose of all
>>> this code here is apparently only to provide a different order than the
>>> default.
>>>
>>> So, someone who worked on the original musl patches should comment. I
>>> would also like to know precisely which ordering change over the default is
>>> required, and that it be documented. Ideally we'd come up with a solution
>>> that makes us not duplicate all this stuff in linux.h.
>>>
>>>
>>> Bernd
>>
>>
>> Crap, sorry for the resend. G gmail ...
>>
> >> The only significant difference AFAICT is that GCC_INCLUDE_DIR is moved
>> to later (last).
>> Why this is is briefly described in the intro comment:
>>
>> config/linux.h:
>>   /* musl avoids problematic includes by rearranging the include
>> directories.
>>   * Unfortunately, this is mostly duplicated from cppdefault.c */
>>
>> I've put LOCAL in the same place as the default (as defined by
>> cppdefault.c),
>> so one could separate the issues here ...
>>
>> 1) Where does LOCAL go for musl?
>
>
> LOCAL should go the same place as in cppdefault.c
> so the patch is ok.

Committed. Thanks.


Re: [gomp4] Adjust UNQUE ifn

2015-10-26 Thread Nathan Sidwell

On 10/26/15 07:36, Richard Biener wrote:


Looks better now.

+ {
  #ifdef HAVE_oacc_fork

(etc.)  can you use target-insn.def and targetm.have_oacc_fork () instead?



I've committed this to gomp4.  Will port and update the patches for trunk.

nathan
2015-10-26  Nathan Sidwell  

	* internal-fn.c (expand_UNIQUE, expand_GOACC_DIM_SIZE,
	expand_GOACC_DIM_POS): Use targetm to discover and generate insns.
	* target-insns.def (oacc_dim_pos, oacc_dim_size, oacc_fork,
	oacc_join, unique): Define insns.
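
The change described in the ChangeLog above swaps compile-time `#ifdef HAVE_...` guards for run-time queries on a hook table. A minimal sketch of that pattern, assuming a toy hook structure: every `toy_` name below is hypothetical and none of this is GCC's real `targetm` declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a target hook table.  */
struct toy_target_insns
{
  int (*have_fork) (void);      /* Nonzero if the target has the pattern.  */
  int (*gen_fork) (int axis);   /* Builds the pattern; returns a toy insn id.  */
};

static int toy_have_fork_yes (void) { return 1; }
static int toy_have_fork_no (void) { return 0; }
static int toy_gen_fork (int axis) { return 100 + axis; }

/* Returns the generated insn id, or -1 at the point where the real
   expander would call gcc_unreachable ().  */
static int
toy_expand_fork (const struct toy_target_insns *t, int axis)
{
  if (t->have_fork ())
    return t->gen_fork (axis);
  return -1;
}
```

With this shape both branches are always compiled, so code for targets that lack the pattern cannot silently bit-rot behind a preprocessor guard.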

Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c	(revision 229365)
+++ gcc/internal-fn.c	(working copy)
@@ -1970,32 +1970,30 @@ expand_UNIQUE (gcall *stmt)
   gcc_unreachable ();
 
 case IFN_UNIQUE_UNSPEC:
-#ifdef HAVE_unique
-  pattern = gen_unique ();
-#endif
+  if (targetm.have_unique ())
+	pattern = targetm.gen_unique ();
   break;
 
 case IFN_UNIQUE_OACC_FORK:
 case IFN_UNIQUE_OACC_JOIN:
-  {
-#if defined (HAVE_oacc_fork) && defined (HAVE_oacc_join)
-	tree lhs = gimple_call_lhs (stmt);
-	rtx target = const0_rtx;
-
-	if (lhs)
-	  target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-
-	rtx data_dep = expand_normal (gimple_call_arg (stmt, 1));
-	rtx axis = expand_normal (gimple_call_arg (stmt, 2));
-
-	if (code == IFN_UNIQUE_OACC_FORK)
-	  pattern = gen_oacc_fork (target, data_dep, axis);
-	else
-	  pattern = gen_oacc_join (target, data_dep, axis);
-#else
+  if (targetm.have_oacc_fork () && targetm.have_oacc_join ())
+	{
+	  tree lhs = gimple_call_lhs (stmt);
+	  rtx target = const0_rtx;
+
+	  if (lhs)
+	target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+	  rtx data_dep = expand_normal (gimple_call_arg (stmt, 1));
+	  rtx axis = expand_normal (gimple_call_arg (stmt, 2));
+
+	  if (code == IFN_UNIQUE_OACC_FORK)
+	pattern = targetm.gen_oacc_fork (target, data_dep, axis);
+	  else
+	pattern = targetm.gen_oacc_join (target, data_dep, axis);
+	}
+  else
 	gcc_unreachable ();
-#endif
-  }
   break;
 }
 
@@ -2012,40 +2010,47 @@ expand_GOACC_DATA_END_WITH_ARG (gcall *s
   gcc_unreachable ();
 }
 
+/* GOACC_DIM_SIZE returns the size of the specified compute axis.  */
+
 static void
-expand_GOACC_DIM_SIZE (gcall *ARG_UNUSED (stmt))
+expand_GOACC_DIM_SIZE (gcall *stmt)
 {
-#ifdef HAVE_oacc_dim_size
-  tree lhs = gimple_call_lhs (stmt);
-
-  if (!lhs)
-return;
-  
-  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  rtx dim = expand_expr (gimple_call_arg (stmt, 0), NULL_RTX,
-			 VOIDmode, EXPAND_NORMAL);
-  emit_insn (gen_oacc_dim_size (target, dim));
-#else
-  gcc_unreachable ();
-#endif
+  if (targetm.have_oacc_dim_size ())
+{
+  tree lhs = gimple_call_lhs (stmt);
+
+  if (!lhs)
+	return;
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx dim = expand_expr (gimple_call_arg (stmt, 0), NULL_RTX,
+			 VOIDmode, EXPAND_NORMAL);
+  emit_insn (targetm.gen_oacc_dim_size (target, dim));
+}
+  else
+gcc_unreachable ();
 }
 
+/* GOACC_DIM_POS returns the index of the executing thread along the
+   specified axis.  */
+
 static void
-expand_GOACC_DIM_POS (gcall *ARG_UNUSED (stmt))
+expand_GOACC_DIM_POS (gcall *stmt)
 {
-#ifdef HAVE_oacc_dim_pos
-  tree lhs = gimple_call_lhs (stmt);
-
-  if (!lhs)
-return;
-  
-  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-  rtx dim = expand_expr (gimple_call_arg (stmt, 0), NULL_RTX,
-			 VOIDmode, EXPAND_NORMAL);
-  emit_insn (gen_oacc_dim_pos (target, dim));
-#else
-  gcc_unreachable ();
-#endif
+  if (targetm.have_oacc_dim_pos ())
+{
+  tree lhs = gimple_call_lhs (stmt);
+
+  if (!lhs)
+	return;
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx dim = expand_expr (gimple_call_arg (stmt, 0), NULL_RTX,
+			 VOIDmode, EXPAND_NORMAL);
+  emit_insn (targetm.gen_oacc_dim_pos (target, dim));
+}
+  else
+gcc_unreachable ();
 }
 
 /* All the GOACC_REDUCTION variants  get expanded in oacc_device_lower.  */
Index: gcc/target-insns.def
===
--- gcc/target-insns.def	(revision 229364)
+++ gcc/target-insns.def	(working copy)
@@ -64,6 +64,10 @@ DEF_TARGET_INSN (memory_barrier, (void))
 DEF_TARGET_INSN (movstr, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (nonlocal_goto, (rtx x0, rtx x1, rtx x2, rtx x3))
 DEF_TARGET_INSN (nonlocal_goto_receiver, (void))
+DEF_TARGET_INSN (oacc_dim_pos, (rtx x0, rtx x1))
+DEF_TARGET_INSN (oacc_dim_size, (rtx x0, rtx x1))
+DEF_TARGET_INSN (oacc_fork, (rtx x0, rtx x1, rtx x2))
+DEF_TARGET_INSN (oacc_join, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (probe_stack, (rtx x0))
 DEF_TARGET_INSN (probe_stack_address, (rtx x0))
@@ -89,5 +93,6 @@ DEF_TARGET_INSN (stack_protect_test, (rt
 DEF_TARGET_INSN (store_multiple, (rtx x0, rtx x1, rtx x2))
 

[PATCH] PR fortran/67885 -- PARAMETER needs to be marked in BLOCK

2015-10-26 Thread Steve Kargl
When a specification statement in a BLOCK construct has a
PARAMETER attribute, gfortran currently discards the entity.
This patch marks a PARAMETER entity if it is in a BLOCK.  I'm not
completely convinced that this is the right fix, but it does
allow the testcase to compile and run.  Built and tested
on x86_64-*-freebsd.  OK to commit (if no one has a
better patch)?

2015-10-26  Steven G. Kargl  

PR fortran/67885
* trans-decl.c (generate_local_decl): Mark PARAMETER entities in
BLOCK construct.

2015-10-26  Steven G. Kargl  

PR fortran/67885
* gfortran.dg/pr67885.f90: New test.

-- 
Steve
Index: gcc/fortran/trans-decl.c
===
--- gcc/fortran/trans-decl.c	(revision 229390)
+++ gcc/fortran/trans-decl.c	(working copy)
@@ -5217,6 +5217,16 @@ generate_local_decl (gfc_symbol * sym)
 			  "Unused parameter %qs which has been explicitly "
 			  "imported at %L", sym->name, &sym->declared_at);
 	}
+
+  if (sym->ns
+	  && sym->ns->parent
+	  && sym->ns->parent->code
+	  && sym->ns->parent->code->op == EXEC_BLOCK)
+	{
+	  if (sym->attr.referenced)
+	gfc_get_symbol_decl (sym);
+	  sym->mark = 1;
+	}
 }
   else if (sym->attr.flavor == FL_PROCEDURE)
 {
Index: gcc/testsuite/gfortran.dg/pr67885.f90
===
--- gcc/testsuite/gfortran.dg/pr67885.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/pr67885.f90	(working copy)
@@ -0,0 +1,12 @@
+! { dg-do run }
+! PR fortran/67885
+! Original code contributed by Gerhard Steinmetz
+! gerhard dot steinmetz dot fortran at t-online dot de
+program p
+   block
+  integer, parameter :: a(2) = [1, 2]
+  integer :: x(2)
+  x = a
+  if (x(1) /= 1) call abort
+   end block
+end


Re: [gomp4.1] Handle new form of #pragma omp declare target

2015-10-26 Thread Jakub Jelinek
On Mon, Oct 26, 2015 at 10:39:04PM +0300, Ilya Verbin wrote:
> > Without declare target link or to, you can't use the global variables
> > in orphaned accelerated routines (unless you e.g. take the address of the
> > mapped variable in the region and pass it around).
> > The to variables (non-deferred) are always mapped and are initialized with
> > the original initializer, refcount is infinity.  link (deferred) work more
> > like the normal mapping, referencing those vars when they aren't explicitly
> > (or implicitly) mapped is unspecified behavior, if it is e.g. mapped freshly
> > with to kind, it gets the current value of the host var rather than the
> > original one.  But, beyond the mapping the compiler needs to ensure that
> > all uses of the link global var (or perhaps just all uses of the link global
> > var outside of the target construct body where it is mapped, because you
> > could use there the pointer you got from GOMP_target) are replaced by
> > dereference of some artificial pointer, so a becomes *a_tmp and &a becomes
> > &*a_tmp, and that the runtime library during registration of the tables is
> > told about the address of this artificial pointer.  During registration,
> > I'd expect it would stick an entry for this range into the table, with some
> > special flag or something similar, indicating that it is deferred mapping
> > and where the offloading device pointer is.  During mapping, it would map it
> > as any other not yet mapped object, but additionally would also set this
> > device pointer to the device address of the mapped object.  We also need to
> > ensure that when we drop the refcount of that mapping back to 0, we get it
> > back to the state where it is described as a range with registered deferred
> > mapping and where the device pointer is.
> 
> Ok, got it, I'll try to implement this...

Thanks.

> > > > we actually replace the variables with pointers to variables, then need
> > > > to somehow also mark those in the offloading tables, so that the library
> > > 
> > > I see 2 possible options: use the MSB of the size, or introduce the third field
> > > for flags.
> > 
> > Well, it can be either recorded in the host variable tables (which contain
> > address and size pair, right), or in corresponding offloading device table
> > (which contains the pointer, something else?).
> 
> It contains a size too, which is checked in libgomp:
> gomp_fatal ("Can't map target variables (size mismatch)");
> Yes, we can remove this check, and use second field in device table for flags.

Yeah, or e.g. just use MSB of that size (so check that either the size is
the same (then it is target to) or it is MSB | size (then it is target link)).
Objects larger than half of the address space aren't really supportable
anyway.

Jakub
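
A minimal sketch of the MSB-of-size idea discussed above, assuming the size is stored in a `uintptr_t`-wide field. The `TOY_`/`toy_` names are hypothetical and do not come from libgomp.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical flag: the most significant bit of the size field.  */
#define TOY_LINK_BIT ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))

enum toy_kind { TOY_MISMATCH, TOY_TARGET_TO, TOY_TARGET_LINK };

/* Marks SIZE as belonging to a "declare target link" variable.  */
static uintptr_t
toy_encode_link (uintptr_t size)
{
  return size | TOY_LINK_BIT;
}

/* Compares the host table size with the device table size field.  */
static enum toy_kind
toy_classify (uintptr_t host_size, uintptr_t device_size)
{
  if (device_size == host_size)
    return TOY_TARGET_TO;       /* Plain "to" variable, sizes agree.  */
  if (device_size == (host_size | TOY_LINK_BIT))
    return TOY_TARGET_LINK;     /* Same size, flagged as deferred.  */
  return TOY_MISMATCH;          /* Where the size-mismatch error would fire.  */
}
```

The encoding keeps the existing two-field table layout; the cost is that sizes with the top bit already set cannot be represented, which the message above argues is not a practical limitation.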


Re: [gomp4.1] Handle new form of #pragma omp declare target

2015-10-26 Thread Ilya Verbin
On Mon, Oct 26, 2015 at 20:05:39 +0100, Jakub Jelinek wrote:
> On Mon, Oct 26, 2015 at 09:35:52PM +0300, Ilya Verbin wrote:
> > On Fri, Jul 17, 2015 at 15:05:59 +0200, Jakub Jelinek wrote:
> > > As the testcases show, #pragma omp declare target has now a new form (well,
> > > two; with some issues on it pending), where it is used just as a single
> > > declarative directive rather than a pair of them and allows marking
> > > vars and functions by name as "omp declare target" vars/functions (which the
> > > middle-end etc. already handles), but also "omp declare target link", which
> > > memory with host), but has to be mapped explicitly.
> > 
> > I don't quite understand how link should work.  OpenMP 4.5 says:
> > 
> > "The list items of a link clause are not mapped by the declare target directive.
> > Instead, their mapping is deferred until they are mapped by target data or
> > target constructs. They are mapped only for such regions."
> >
> > But doesn't this mean that the example below should work identically
> > with/without USE_LINK defined?  Or is there some difference on other testcases?
> 
> On your testcase, the end result is pretty much the same, the variable is
> not mapped initially to the device, and at the beginning of omp target it is
> mapped to device, at the end of the region it is unmapped from the device
> (without copying back).
> 
> But consider:
> 
> int a = 1, b = 1;
> #pragma omp declare target link (a) to (b)
> int
> foo (void)
> {
>   return a++ + b++;
> }
> #pragma omp declare target to (foo)
> int
> main ()
> {
>   a = 2;
>   b = 2;
>   int res;
>   #pragma omp target map (to: a, b) map (from: res)
>   {
> res = foo () + foo ();
>   }
>   // This assumes only non-shared address space, so would need to be guarded
>   // for that.
>   if (res != (2 + 1) + (3 + 2))
> __builtin_abort ();
>   return 0;
> }
> 
> Without declare target link or to, you can't use the global variables
> in orphaned accelerated routines (unless you e.g. take the address of the
> mapped variable in the region and pass it around).
> The to variables (non-deferred) are always mapped and are initialized with
> the original initializer, refcount is infinity.  link (deferred) work more
> like the normal mapping, referencing those vars when they aren't explicitly
> (or implicitly) mapped is unspecified behavior, if it is e.g. mapped freshly
> with to kind, it gets the current value of the host var rather than the
> original one.  But, beyond the mapping the compiler needs to ensure that
> all uses of the link global var (or perhaps just all uses of the link global
> var outside of the target construct body where it is mapped, because you
> could use there the pointer you got from GOMP_target) are replaced by
> dereference of some artificial pointer, so a becomes *a_tmp and &a becomes
> &*a_tmp, and that the runtime library during registration of the tables is
> told about the address of this artificial pointer.  During registration,
> I'd expect it would stick an entry for this range into the table, with some
> special flag or something similar, indicating that it is deferred mapping
> and where the offloading device pointer is.  During mapping, it would map it
> as any other not yet mapped object, but additionally would also set this
> device pointer to the device address of the mapped object.  We also need to
> ensure that when we drop the refcount of that mapping back to 0, we get it
> back to the state where it is described as a range with registered deferred
> mapping and where the device pointer is.

Ok, got it, I'll try to implement this...

> > > we actually replace the variables with pointers to variables, then need
> > > to somehow also mark those in the offloading tables, so that the library
> > 
> > I see 2 possible options: use the MSB of the size, or introduce the third field
> > for flags.
> 
> Well, it can be either recorded in the host variable tables (which contain
> address and size pair, right), or in corresponding offloading device table
> (which contains the pointer, something else?).

It contains a size too, which is checked in libgomp:
  gomp_fatal ("Can't map target variables (size mismatch)");
Yes, we can remove this check, and use second field in device table for flags.

  -- Ilya


[Fortran, committed] Statement label on empty line in derived type declaration (PR 66056)

2015-10-26 Thread Louis Krupp
Revision 229390...

Louis



[gomp4.5] Diagnose linear IV on #pragma omp for

2015-10-26 Thread Jakub Jelinek
Hi!

linear clause is allowed on omp for, but the IV can only be private or
lastprivate, not linear.  We weren't diagnosing this.
Fixed thusly, regtested on x86_64-linux.
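
For contrast with the newly diagnosed form, a minimal sketch of a worksharing loop whose iteration variable is lastprivate, which the description above says is allowed. The function name `toy_sum_and_last` is hypothetical; the pragma is guarded so the code also builds without -fopenmp.

```c
#include <assert.h>

/* Sums 0 .. N-1 into *SUM_OUT and returns the value of the iteration
   variable after the loop.  Intended for N > 0.  */
static int
toy_sum_and_last (int n, int *sum_out)
{
  int i;
  int sum = 0;
  /* Writing linear (i:1) here instead of lastprivate (i) is the case
     the patch now rejects for non-simd constructs.  */
#ifdef _OPENMP
  #pragma omp parallel for lastprivate (i) reduction (+:sum)
#endif
  for (i = 0; i < n; i++)
    sum += i;
  *sum_out = sum;
  return i;
}
```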

2015-10-26  Jakub Jelinek  

* gimplify.c (omp_is_private): Diagnose linear iteration variables
on non-simd constructs.

* gcc.dg/gomp/linear-1.c: New test.
* g++.dg/gomp/linear-2.C: New test.

--- gcc/gimplify.c.jj   2015-10-26 15:38:20.0 +0100
+++ gcc/gimplify.c  2015-10-26 18:25:58.860633072 +0100
@@ -6090,6 +6090,9 @@ omp_is_private (struct gimplify_omp_ctx
  else if ((n->value & GOVD_REDUCTION) != 0)
error ("iteration variable %qE should not be reduction",
   DECL_NAME (decl));
+ else if (simd == 0 && (n->value & GOVD_LINEAR) != 0)
+   error ("iteration variable %qE should not be linear",
+  DECL_NAME (decl));
  else if (simd == 1 && (n->value & GOVD_LASTPRIVATE) != 0)
error ("iteration variable %qE should not be lastprivate",
   DECL_NAME (decl));
--- gcc/testsuite/gcc.dg/gomp/linear-1.c.jj 2015-10-26 18:32:57.721611756 +0100
+++ gcc/testsuite/gcc.dg/gomp/linear-1.c 2015-10-26 18:36:53.373224158 +0100
@@ -0,0 +1,57 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+int i, j;
+
+void
+f1 (void)
+{
+  #pragma omp for linear (i:1) /* { dg-error "iteration variable .i. should not be linear" } */
+  for (i = 0; i < 32; i++)
+;
+}
+
+void
+f2 (void)
+{
+  #pragma omp distribute parallel for linear (i:1) /* { dg-error ".linear. is not valid for .#pragma omp distribute parallel for." } */
+  for (i = 0; i < 32; i++)
+;
+}
+
+void
+f3 (void)
+{
+  #pragma omp parallel for linear (i:1) collapse(1) /* { dg-error "iteration variable .i. should not be linear" } */
+  for (i = 0; i < 32; i++)
+;
+}
+
+void
+f4 (void)
+{
+  #pragma omp for linear (i:1) linear (j:2) collapse(2) /* { dg-error "iteration variable .i. should not be linear" } */
+  for (i = 0; i < 32; i++) /* { dg-error "iteration variable .j. should not be linear" "" { target *-*-* } 33 } */
+for (j = 0; j < 32; j+=2)
+  ;
+}
+
+void
+f5 (void)
+{
+  #pragma omp target teams distribute parallel for linear (i:1) linear (j:2) collapse(2) /* { dg-error ".linear. is not valid for .#pragma omp target teams distribute parallel for." } */
+  for (i = 0; i < 32; i++)
+for (j = 0; j < 32; j+=2)
+  ;
+}
+
+void
+f6 (void)
+{
+  #pragma omp parallel for linear (i:1) collapse(2) linear (j:2) /* { dg-error "iteration variable .i. should not be linear" } */
+  for (i = 0; i < 32; i++) /* { dg-error "iteration variable .j. should not be linear" "" { target *-*-* } 51 } */
+for (j = 0; j < 32; j+=2)
+  ;
+}
+
+#pragma omp declare target to (i, j, f2)
--- gcc/testsuite/g++.dg/gomp/linear-2.C.jj 2015-10-26 18:33:32.352113927 +0100
+++ gcc/testsuite/g++.dg/gomp/linear-2.C 2015-10-26 18:54:30.869022177 +0100
@@ -0,0 +1,128 @@
+// { dg-do compile }
+// { dg-options "-fopenmp" }
+
+#pragma omp declare target
+
+int i, j;
+
+void
+f1 ()
+{
+  #pragma omp for linear (i:1) // { dg-error "iteration variable .i. should not be linear" }
+  for (i = 0; i < 32; i++)
+;
+}
+
+void
+f2 ()
+{
+  #pragma omp distribute parallel for linear (i:1) // { dg-error ".linear. is not valid for .#pragma omp distribute parallel for." }
+  for (i = 0; i < 32; i++)
+;
+}
+
+void
+f3 ()
+{
+  #pragma omp parallel for linear (i:1) collapse(1)
+  for (i = 0; i < 32; i++) // { dg-error "iteration variable .i. should not be linear" }
+;
+}
+
+void
+f4 ()
+{
+  #pragma omp for linear (i:1) linear (j:2) collapse(2) // { dg-error "iteration variable .i. should not be linear" }
+  for (i = 0; i < 32; i++) // { dg-error "iteration variable .j. should not be linear" "" { target *-*-* } 35 }
+for (j = 0; j < 32; j+=2)
+  ;
+}
+
+void
+f5 ()
+{
+  #pragma omp target teams distribute parallel for linear (i:1) linear (j:2) collapse(2) // { dg-error ".linear. is not valid for .#pragma omp target teams distribute parallel for." }
+  for (i = 0; i < 32; i++)
+for (j = 0; j < 32; j+=2)
+  ;
+}
+
+void
+f6 ()
+{
+  #pragma omp parallel for linear (i:1) collapse(2) linear (j:2) // { dg-error "iteration variable .i. should not be linear" "" { target *-*-* } 54 }
+  for (i = 0; i < 32; i++) // { dg-error "iteration variable .j. should not be linear" }
+for (j = 0; j < 32; j+=2)
+  ;
+}
+
+template 
+void
+f7 ()
+{
+  #pragma omp for linear (i:1) // { dg-error "iteration variable .i. should not be linear" }
+  for (i = 0; i < 32; i++)
+;
+}
+
+template 
+void
+f8 ()
+{
+  #pragma omp distribute parallel for linear (i:1) // { dg-error ".linear. 
is not val

Re: [gomp4.1] Handle new form of #pragma omp declare target

2015-10-26 Thread Jakub Jelinek
On Mon, Oct 26, 2015 at 09:35:52PM +0300, Ilya Verbin wrote:
> On Fri, Jul 17, 2015 at 15:05:59 +0200, Jakub Jelinek wrote:
> > As the testcases show, #pragma omp declare target has now a new form (well,
> > two; with some issues on it pending), where it is used just as a single
> > declarative directive rather than a pair of them and allows marking
> > vars and functions by name as "omp declare target" vars/functions (which the
> > middle-end etc. already handles), but also "omp declare target link", which
> > is a deferred var, that is not initially mapped (on devices without shared
> > memory with host), but has to be mapped explicitly.
> 
> I don't quite understand how link should work.  OpenMP 4.5 says:
> 
> "The list items of a link clause are not mapped by the declare target directive.
> Instead, their mapping is deferred until they are mapped by target data or
> target constructs. They are mapped only for such regions."
>
> But doesn't this mean that the example below should work identically
> with/without USE_LINK defined?  Or is there some difference on other testcases?

On your testcase, the end result is pretty much the same, the variable is
not mapped initially to the device, and at the beginning of omp target it is
mapped to device, at the end of the region it is unmapped from the device
(without copying back).

But consider:

int a = 1, b = 1;
#pragma omp declare target link (a) to (b)
int
foo (void)
{
  return a++ + b++;
}
#pragma omp declare target to (foo)
int
main ()
{
  a = 2;
  b = 2;
  int res;
  #pragma omp target map (to: a, b) map (from: res)
  {
res = foo () + foo ();
  }
  // This assumes only non-shared address space, so would need to be guarded
  // for that.
  if (res != (2 + 1) + (3 + 2))
__builtin_abort ();
  return 0;
}

Without declare target link or to, you can't use the global variables
in orphaned accelerated routines (unless you e.g. take the address of the
mapped variable in the region and pass it around).
The to variables (non-deferred) are always mapped and are initialized with
the original initializer, refcount is infinity.  link (deferred) work more
like the normal mapping, referencing those vars when they aren't explicitly
(or implicitly) mapped is unspecified behavior, if it is e.g. mapped freshly
with to kind, it gets the current value of the host var rather than the
original one.  But, beyond the mapping the compiler needs to ensure that
all uses of the link global var (or perhaps just all uses of the link global
var outside of the target construct body where it is mapped, because you
could use there the pointer you got from GOMP_target) are replaced by
dereference of some artificial pointer, so a becomes *a_tmp and &a becomes
&*a_tmp, and that the runtime library during registration of the tables is
told about the address of this artificial pointer.  During registration,
I'd expect it would stick an entry for this range into the table, with some
special flag or something similar, indicating that it is deferred mapping
and where the offloading device pointer is.  During mapping, it would map it
as any other not yet mapped object, but additionally would also set this
device pointer to the device address of the mapped object.  We also need to
ensure that when we drop the refcount of that mapping back to 0, we get it
back to the state where it is described as a range with registered deferred
mapping and where the device pointer is.
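
The scheme described above (uses of a become *a_tmp, the runtime sets the pointer on mapping and resets it when the refcount drops to 0) can be modelled on the host alone. This is only a sketch under stated assumptions: the struct layout, the `toy_` functions and the single static "device" cell are all hypothetical, and nothing here is the libgomp implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry for one "declare target link" variable.  */
struct toy_link_entry
{
  int *host_addr;       /* Address of the host variable.  */
  int **device_slot;    /* Address of the artificial pointer.  */
  unsigned refcount;    /* 0 means registered but not currently mapped.  */
};

static int toy_device_storage;   /* Stands in for device memory.  */
static int *a_tmp;               /* Artificial pointer; a becomes *a_tmp.  */

/* Maps the variable with "to" kind: the first mapping copies the current
   host value and stores the device address into the artificial pointer.  */
static void
toy_map_to (struct toy_link_entry *e)
{
  if (e->refcount++ == 0)
    {
      toy_device_storage = *e->host_addr;
      *e->device_slot = &toy_device_storage;
    }
}

/* Drops one reference; at 0 the entry returns to the deferred state.
   Nothing is copied back to the host.  */
static void
toy_unmap (struct toy_link_entry *e)
{
  if (--e->refcount == 0)
    *e->device_slot = NULL;
}

/* Device-side routine after rewriting: "return a++" becomes this.  */
static int
toy_foo (void)
{
  return (*a_tmp)++;
}
```

In this model a fresh mapping sees the current host value (2 in the example above) rather than the original initializer, and the host copy is untouched after unmapping.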

> > This patch only marks them with the new attribute, the actual middle-end
> > implementation needs to be implemented.
> > 
> > I believe OpenACC has something similar, but no idea if it is already
> > implemented.
> > 
> > Anyway, I think the implementation should be that in some pass running on
> > the ACCEL_COMPILER side (guarded by separate address space aka non-HSA)
> 
> HSA does not define ACCEL_COMPILER, because it uses only one compiler.

HSA is a non-issue here, as it has shared address space, therefore map
clause does nothing, declare target to or link clauses also don't do
anything.

> > we actually replace the variables with pointers to variables, then need
> > to somehow also mark those in the offloading tables, so that the library
> 
> I see 2 possible options: use the MSB of the size, or introduce the third field
> for flags.

Well, it can be either recorded in the host variable tables (which contain
address and size pair, right), or in corresponding offloading device table
(which contains the pointer, something else?).

Jakub


Re: [PATCH 7/9] ENABLE_CHECKING refactoring: middle-end, LTO FE

2015-10-26 Thread Jeff Law

On 10/26/2015 11:14 AM, Bernd Schmidt wrote:

On 10/26/2015 05:59 PM, Jeff Law wrote:

I left it as-is.  Obviously if you really want the unnecessary braces
squashed out, we can do that.


It's not a rule I particularly care about myself, but I'm flagging stuff
like this for the sake of consistency.

Noted.




Don't change the whitespace here. Looks like you probably removed a page
break.

Not obvious where it got lost as there's no filename in the review
comments :-)


Oops. I think it was one of the sel-sched files.

I'll do a looksie over things for undesirable whitespace changes.

At least some of the CHECKING_P vs flag_checking thingies are for things
that are used outside the compiler proper, i.e. things that don't have
the options structure.  The pretty-printer and diagnostics stuff was of
that nature.  So I'm keeping CHECKING_P rather than flag_checking in those.
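
A minimal sketch of the distinction drawn above, assuming a compile-time macro for code that has no options structure and a run-time flag for code that does. `TOY_CHECKING_P` and `toy_flag_checking` are hypothetical stand-ins, not the real GCC definitions.

```c
#include <assert.h>

/* Hypothetical compile-time switch, fixed when the file is built.  */
#ifndef TOY_CHECKING_P
#define TOY_CHECKING_P 1
#endif

/* Hypothetical run-time option; only code with access to the options
   structure can consult it.  */
static int toy_flag_checking = 0;

/* Inside the compiler proper: checks follow the run-time flag.  */
static int
toy_checks_enabled_in_compiler (void)
{
  return toy_flag_checking != 0;
}

/* In shared code such as a pretty-printer: no options are visible, so
   the decision is made by the compile-time macro.  */
static int
toy_checks_enabled_in_printer (void)
{
  return TOY_CHECKING_P != 0;
}
```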


Jeff


Re: [gomp4.1] Handle new form of #pragma omp declare target

2015-10-26 Thread Ilya Verbin
On Fri, Jul 17, 2015 at 15:05:59 +0200, Jakub Jelinek wrote:
> As the testcases show, #pragma omp declare target has now a new form (well,
> two; with some issues on it pending), where it is used just as a single
> declarative directive rather than a pair of them and allows marking
> vars and functions by name as "omp declare target" vars/functions (which the
> middle-end etc. already handles), but also "omp declare target link", which
> is a deferred var, that is not initially mapped (on devices without shared
> memory with host), but has to be mapped explicitly.

I don't quite understand how link should work.  OpenMP 4.5 says:

"The list items of a link clause are not mapped by the declare target directive.
Instead, their mapping is deferred until they are mapped by target data or
target constructs. They are mapped only for such regions."

But doesn't this mean that the example below should work identically
with/without USE_LINK defined?  Or is there some difference on other testcases?

int a = 1;

#ifdef USE_LINK
#pragma omp declare target link(a)
#endif

int main ()
{
  a = 2;
  int res;
  #pragma omp target map(to: a) map(from: res)
res = a;
  return res;
}

> This patch only marks them with the new attribute, the actual middle-end
> implementation needs to be implemented.
> 
> I believe OpenACC has something similar, but no idea if it is already
> implemented.
> 
> Anyway, I think the implementation should be that in some pass running on
> the ACCEL_COMPILER side (guarded by separate address space aka non-HSA)

HSA does not define ACCEL_COMPILER, because it uses only one compiler.

> we actually replace the variables with pointers to variables, then need
> to somehow also mark those in the offloading tables, so that the library

I see 2 possible options: use the MSB of the size, or introduce the third field
for flags.

> registers them (the locations of the pointers to the vars), but also marks
> them for special treatment, and then when actually trying to map them
> (or their parts, guess that needs to be discussed) we allocate them or
> whatever is requested and store the device pointer into the corresponding
> variable.
> 
> Ilya, Thomas, thoughts on this?

  -- Ilya


Re: [Bulk] [OpenACC 0/7] host_data construct

2015-10-26 Thread Jakub Jelinek
On Fri, Oct 23, 2015 at 10:51:42AM -0500, James Norris wrote:
> @@ -12942,6 +12961,7 @@ c_finish_omp_clauses (tree clauses, bool is_omp, bool declare_simd)
>   case OMP_CLAUSE_GANG:
>   case OMP_CLAUSE_WORKER:
>   case OMP_CLAUSE_VECTOR:
> + case OMP_CLAUSE_USE_DEVICE:
> pc = &OMP_CLAUSE_CHAIN (c);
> continue;
>  

Are there any restrictions on whether you can specify the same var multiple
times in use_device clause?
#pragma acc host_data use_device (x) use_device (x) use_device (y, y, y)
?
If not, have you verified that the gimplifier doesn't ICE on it?  Generally
it doesn't like the same var being mentioned multiple times.
If yes, you can use e.g. the generic_head bitmap for that and in any case,
cover that with sufficient testsuite coverage.
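
A minimal sketch of the bitmap approach suggested above for catching a variable listed more than once, assuming each variable has a small integer id (in the spirit of a decl UID). The `toy_` names and the fixed capacity are hypothetical; GCC's own bitmap API is not used here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_MAX_UID 256
#define TOY_WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Fixed-size bit set indexed by a hypothetical per-variable id.  */
struct toy_bitmap
{
  unsigned long words[TOY_MAX_UID / TOY_WORD_BITS + 1];
};

/* Sets bit UID (which must be below TOY_MAX_UID) and returns nonzero
   if it was already set.  */
static int
toy_bitmap_test_and_set (struct toy_bitmap *b, unsigned uid)
{
  unsigned long mask = 1UL << (uid % TOY_WORD_BITS);
  unsigned long *w = &b->words[uid / TOY_WORD_BITS];
  int seen = (*w & mask) != 0;
  *w |= mask;
  return seen;
}

/* Counts how many entries of UIDS repeat an earlier entry; a front end
   would emit a diagnostic for each instead of counting.  */
static size_t
toy_count_duplicates (const unsigned *uids, size_t n)
{
  struct toy_bitmap seen = { { 0 } };
  size_t dups = 0;
  for (size_t i = 0; i < n; i++)
    if (toy_bitmap_test_and_set (&seen, uids[i]))
      dups++;
  return dups;
}
```

For the clause list in the question above, use_device (x) use_device (x) use_device (y, y, y), giving x and y two distinct ids yields three repeated mentions.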

> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index ab9e540..0c32219 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -93,6 +93,8 @@ enum gimplify_omp_var_data
>  
>GOVD_MAP_0LEN_ARRAY = 32768,
>  
> +  GOVD_USE_DEVICE = 65536,
> +
>GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
>  | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
>  | GOVD_LOCAL)
> @@ -116,7 +118,9 @@ enum omp_region_type
>ORT_COMBINED_TARGET = 33,
>/* Dummy OpenMP region, used to disable expansion of
>   DECL_VALUE_EXPRs in taskloop pre body.  */
> -  ORT_NONE = 64
> +  ORT_NONE = 64,
> +  /* An OpenACC host-data region.  */
> +  ORT_HOST_DATA = 128

I'd prefer ORT_NONE to be the last one, can you just renumber it and put
ORT_HOST_DATA before it?

> +static tree
> +gimplify_oacc_host_data_1 (tree *tp, int *walk_subtrees,
> +void *data ATTRIBUTE_UNUSED)
> +{

Your use_device sounds very similar to use_device_ptr clause in OpenMP,
which is allowed on #pragma omp target data construct and is implemented
quite a bit differently from this; it is unclear if the OpenACC standard
requires this kind of implementation, or you just chose to implement it this
way.  In particular, the GOMP_target_data call puts the variables mentioned
in the use_device_ptr clauses into the mapping structures (similarly how
map clause appears) and the corresponding vars are privatized within the
target data region (which is a host region, basically a fancy { } braces),
where the private variables contain the offloading device's pointers.

> +  splay_tree_node n = NULL;
> +  location_t loc = EXPR_LOCATION (*tp);
> +
> +  switch (TREE_CODE (*tp))
> +{
> +case ADDR_EXPR:
> +  {
> + tree decl = TREE_OPERAND (*tp, 0);
> +
> + switch (TREE_CODE (decl))
> +   {
> +   case ARRAY_REF:
> +   case ARRAY_RANGE_REF:
> +   case COMPONENT_REF:
> +   case VIEW_CONVERT_EXPR:
> +   case REALPART_EXPR:
> +   case IMAGPART_EXPR:
> + if (TREE_CODE (TREE_OPERAND (decl, 0)) == VAR_DECL)
> +   n = splay_tree_lookup (gimplify_omp_ctxp->variables,
> +  (splay_tree_key) TREE_OPERAND (decl, 0));
> + break;

I must say this looks really strange, you throw away all the offsets
embedded in the component codes (fixed or variable).
Where does the above list come from?  What about other components (say bit
field refs, etc.)?

> +case VAR_DECL:

What is so special about VAR_DECLs?  Shouldn't PARM_DECLs / RESULT_DECLs
be treated the same way?
> --- a/libgomp/libgomp.map
> +++ b/libgomp/libgomp.map
> @@ -378,6 +378,7 @@ GOACC_2.0 {
>   GOACC_wait;
>   GOACC_get_thread_num;
>   GOACC_get_num_threads;
> + GOACC_deviceptr;
>  };
>  
>  GOACC_2.0.1 {

You shouldn't be adding new symbols into a symbol version that appeared in a
compiler that shipped already (GCC 5 already had GOACC_2.0 symbols).
So it should go into GOACC_2.0.1.

> diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
> index af067d6..497ab92 100644
> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c
> @@ -204,6 +204,38 @@ acc_deviceptr (void *h)
>return d;
>  }
>  
> +/* This function is used as a helper in generated code to implement pointer
> +   lookup in host_data regions.  Unlike acc_deviceptr, it returns its argument
> +   unchanged on a shared-memory system (e.g. the host).  */
> +
> +void *
> +GOACC_deviceptr (void *h)
> +{
> +  splay_tree_key n;
> +  void *d;
> +  void *offset;
> +
> +  goacc_lazy_initialize ();
> +
> +  struct goacc_thread *thr = goacc_thread ();
> +
> +  if ((thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) == 0)
> +{
> +  n = lookup_host (thr->dev, h, 1);

What is supposed to be the behavior when the h pointer points at an object
boundary, rather than into the middle of an existing mapped object?

Say you have:
  char a[16], b[0], c[16]; // b is GCC extension
Now, char *p = &a[5]; is unambiguous, either a is mapped, or not.
But, if p = &a[16];, then it could be either the one-past-last byte in a,
or it could be the start of b (== one-past-last byte in b) or it could be
the pointer to
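
The boundary question above can be made concrete with a small sketch.  It is
not libgomp's splay-tree lookup: struct mapping, lookup_range and the integer
addresses are assumptions made purely for illustration.  Each mapped object is
modelled as a half-open range, and a one-byte query (like the length 1 passed
to lookup_host in the quoted patch) only matches a range that fully contains it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one mapped host object: [start, end).  */
struct mapping
{
  uintptr_t start;
  uintptr_t end;   /* One past the last byte.  */
  int id;
};

/* Return the id of the first mapping that fully contains
   [addr, addr + len), or -1 when no mapping does.  */
int
lookup_range (const struct mapping *maps, size_t n, uintptr_t addr, size_t len)
{
  assert (maps != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (addr >= maps[i].start && addr + len <= maps[i].end)
      return maps[i].id;
  return -1;
}
```

With a at [100, 116) and c at [116, 132), a query at 105 resolves to a, while
a query at 116 (one past the end of a) resolves to c, or to nothing if only a
is mapped.  Under this model the one-past-the-end pointer never identifies a.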

Re: [PATCH] libjava: fix locale handling when sorting JNI methods

2015-10-26 Thread Mike Frysinger
On 26 Oct 2015 12:10, Tom Tromey wrote:
> Mike> URL: https://bugs.gentoo.org/563710
> Mike> Reported-by: Miroslav Šulc 
> 
> Mike> 2015-10-22  Mike Frysinger  
> 
> Mike> * scripts/check_jni_methods.sh.in: Run sort with LC_ALL=C, and
> Mike> combine `sort|uniq` into `sort -u`.
> Mike> ---
> Mike>  libjava/classpath/scripts/check_jni_methods.sh.in | 4 ++--
> 
> Just FYI - Classpath changes should also go upstream.

i sent it to Andrew & cc-ed their patches list independently now
-mike




Re: [PATCH] libjava: fix locale handling when sorting JNI methods

2015-10-26 Thread Tom Tromey
Mike> URL: https://bugs.gentoo.org/563710
Mike> Reported-by: Miroslav Šulc 

Mike> 2015-10-22  Mike Frysinger  

Mike>   * scripts/check_jni_methods.sh.in: Run sort with LC_ALL=C, and
Mike>   combine `sort|uniq` into `sort -u`.
Mike> ---
Mike>  libjava/classpath/scripts/check_jni_methods.sh.in | 4 ++--

Just FYI - Classpath changes should also go upstream.

Tom


[C++ Patch] PR c++/67846

2015-10-26 Thread Paolo Carlini

Hi,

in mainline this ICE on invalid doesn't exist anymore but we may want to 
avoid issuing the additional redundant "error: cannot convert ‘A::foo’ 
from type ‘void (A::)()’ to type ‘void (A::*)()’" and/or make the error 
message more informative by printing the member used invalidly. Tested 
x86_64-linux.


Thanks, Paolo.

///

/cp
2015-10-26  Paolo Carlini  

PR c++/67846
* parser.c (cp_parser_lambda_body): Check lambda_return_type
return value.
* typeck2.c (cxx_incomplete_type_diagnostic): Print member or
member function used invalidly.

/testsuite
2015-10-26  Paolo Carlini  

PR c++/67846
* g++.dg/cpp0x/lambda/lambda-ice15.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 229351)
+++ cp/parser.c (working copy)
@@ -9892,7 +9892,12 @@ cp_parser_lambda_body (cp_parser* parser, tree lam
if (cp_parser_parse_definitely (parser))
  {
if (!processing_template_decl)
- apply_deduced_return_type (fco, lambda_return_type (expr));
+ {
+   tree type = lambda_return_type (expr);
+   apply_deduced_return_type (fco, type);
+   if (type == error_mark_node)
+ expr = error_mark_node;
+ }
 
/* Will get error here if type not deduced yet.  */
finish_return_stmt (expr);
Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 229351)
+++ cp/typeck2.c(working copy)
@@ -517,12 +517,12 @@ cxx_incomplete_type_diagnostic (const_tree value,
if (DECL_FUNCTION_MEMBER_P (member)
&& ! flag_ms_extensions)
  emit_diagnostic (diag_kind, input_location, 0,
-  "invalid use of member function "
-  "(did you forget the %<()%> ?)");
+  "invalid use of member function %qD "
+  "(did you forget the %<()%> ?)", member);
else
  emit_diagnostic (diag_kind, input_location, 0,
-  "invalid use of member "
-  "(did you forget the %<&%> ?)");
+  "invalid use of member %qD "
+  "(did you forget the %<&%> ?)", member);
   }
   break;
 
Index: testsuite/g++.dg/cpp0x/lambda/lambda-ice15.C
===
--- testsuite/g++.dg/cpp0x/lambda/lambda-ice15.C(revision 0)
+++ testsuite/g++.dg/cpp0x/lambda/lambda-ice15.C(working copy)
@@ -0,0 +1,10 @@
+// PR c++/67846
+// { dg-do compile { target c++11 } }
+
+class A
+{
+  void foo ()
+  {
+[=] { return foo; };  // { dg-error "invalid use of member function" }
+  }
+};


Re: [OpenACC 8/11] device-specific lowering

2015-10-26 Thread Nathan Sidwell

On 10/26/15 09:51, Jakub Jelinek wrote:


If not, I think the only thing remaining is the IFN_UNIQUE patch, which
(at least) needs an update to use targetm.have... conversion.


Ok, will wait till you make those changes then?


Hope to have that later today.

nathan


[PATCH, GCC 4.9 branch] Fix compile time regression caused by fix to PR64111

2015-10-26 Thread Caroline Tice
Here is my promised backport to the GCC 4.9 branch, for the patch below
that went into ToT last week.  As with the previous patch, I've
verified that it fixes the problem, bootstraps and has no new
regression test failures.  Is this ok to commit to the gcc-4_9-branch?

-- Caroline Tice
cmt...@google.com


On Fri, Oct 23, 2015 at 3:22 PM, Caroline Tice  wrote:
> This patch fixes a compile-time regression that was originally
> introduced by the fix
> for PR64111, in GCC 4.9.3.  One of our users encountered this problem with a
> particular file, where the compile time (on arm) went from 20 seconds
> to 150 seconds.
>
> The fix in this patch was suggested by Richard Biener, who wrote the
> original fix for
> PR64111.  I have verified that this patch fixes the compile time
> regression; I have bootstrapped
> the compiler with this patch; and I have run the regression testsuite
> (no regressions).
> Is this ok to commit to ToT?   (I am also working on backports for
> gcc-5_branch and gcc-4_9-branch).
>
> -- Caroline Tice
> cmt...@google.com
>
 gcc/ChangeLog:

 2015-10-26  Caroline Tice  

(from Richard Biener)
 * tree.c (int_cst_hash_hash):  Replace XORs with more efficient
 calls to iterative_hash_host_wide_int.
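
As a general illustration of why an XOR fold can be a weak way to combine hash
words (an assumption about the motivation, not something the ChangeLog states),
compare it with an iterative combine.  The functions hash_xor and
hash_iterative are hypothetical and the multiplier is arbitrary; this is not
the implementation of iterative_hash_host_wide_int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR fold: equal words cancel and word order is ignored.  */
uint64_t
hash_xor (const uint64_t *w, size_t n)
{
  assert (w != NULL || n == 0);
  uint64_t h = 0;
  for (size_t i = 0; i < n; i++)
    h ^= w[i];
  return h;
}

/* Iterative combine: each step mixes the running hash with the next
   word, so position matters.  Unsigned arithmetic wraps on overflow.  */
uint64_t
hash_iterative (const uint64_t *w, size_t n)
{
  assert (w != NULL || n == 0);
  uint64_t h = 0;
  for (size_t i = 0; i < n; i++)
    h = h * 31u + w[i];
  return h;
}
```

For the two-word inputs {5, 5} and {9, 9} the XOR fold yields 0 in both cases,
whereas the iterative combine keeps them apart.  A hash table keyed on the XOR
fold would put all such values in one bucket.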


gcc-fsf-4_9.patch
Description: Binary data


[PATCH, GCC 5 branch] Fix compile time regression caused by fix to PR64111

2015-10-26 Thread Caroline Tice
Here is my promised backport to the GCC 5 branch, for the patch below
that went into ToT last week.  As with the previous patch, I've
verified that it fixes the problem, bootstraps and has no new
regression test failures.  Is this ok to commit to the gcc-5-branch?

-- Caroline Tice
cmt...@google.com


On Fri, Oct 23, 2015 at 3:22 PM, Caroline Tice  wrote:
> This patch fixes a compile-time regression that was originally
> introduced by the fix
> for PR64111, in GCC 4.9.3.  One of our users encountered this problem with a
> particular file, where the compile time (on arm) went from 20 seconds
> to 150 seconds.
>
> The fix in this patch was suggested by Richard Biener, who wrote the
> original fix for
> PR64111.  I have verified that this patch fixes the compile time
> regression; I have bootstrapped
> the compiler with this patch; and I have run the regression testsuite
> (no regressions).
> Is this ok to commit to ToT?   (I am also working on backports for
> gcc-5_branch and gcc-4_9-branch).
>
> -- Caroline Tice
> cmt...@google.com
>
 gcc/ChangeLog:

 2015-10-26  Caroline Tice  

 (from Richard Biener)
 * tree.c (int_cst_hasher::hash):  Replace XOR with more efficient
 call to iterative_hash_host_wide_int.


gcc-fsf-5.patch
Description: Binary data


Re: [Patch] Avoid is_simple_use bug in vectorizable_live_operation

2015-10-26 Thread Alan Hayward


On 26/10/2015 13:35, "Richard Biener"  wrote:

>On Mon, Oct 26, 2015 at 1:33 PM, Alan Hayward 
>wrote:
>> There is a potential bug in vectorizable_live_operation.
>>
>> Consider the case where the first op for stmt is valid, but the second
>>is
>> null.
>> The first time through the for () loop, it will call out to
>> vect_is_simple_use () which will set dt.
>> The second time, because op is null, vect_is_simple_use () will not be
>> called.
>> However, dt is still set to a valid value, therefore the loop will
>> continue.
>>
>> Note this is different from the case where the first op is null, which
>> will cause the loop not call vect_is_simple_use () and then return
>>false.
>>
>> It is possible that this was intentional, however that is not clear from
>> the code.
>>
>> The fix is to simply ensure dt is initialized to a default value on each
>> iteration.
>
>I think the patch is a strict improvement, thus OK.  Still a NULL operand
>is not possible in GIMPLE so the op && check is not necessary.  The way
>it iterates over all stmt uses is a bit scary anyway.  As it is ok with
>all invariants it should probably simply do sth like
>
>   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
> if (!vect_is_simple_use (op, ))
>
>and be done with that.  Unvisited uses can only be constants (ok).
>
>Care to rework the function like that if you are here?
>

Ok, I’ve updated as requested.
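
The stale-value problem described in the quoted text can be sketched in
isolation.  This is a minimal illustration, not the vectorizer code: classify,
all_ops_simple_buggy and all_ops_simple_fixed are hypothetical names, plain
strings stand in for operands, and a NULL entry is used only to reproduce the
scenario (as noted above, a NULL operand cannot occur in GIMPLE).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the definition kind computed per operand.  */
enum def_kind { DEF_UNKNOWN, DEF_SIMPLE, DEF_COMPLEX };

/* Hypothetical classifier: an operand is "simple" when its name
   starts with 'c'.  */
static enum def_kind
classify (const char *op)
{
  return op[0] == 'c' ? DEF_SIMPLE : DEF_COMPLEX;
}

/* Buggy shape: KIND lives outside the loop, so a NULL operand silently
   inherits the result computed for the previous operand.  */
int
all_ops_simple_buggy (const char *const *ops, size_t n)
{
  assert (ops != NULL || n == 0);
  enum def_kind kind = DEF_UNKNOWN;
  for (size_t i = 0; i < n; i++)
    {
      if (ops[i] != NULL)
        kind = classify (ops[i]);
      if (kind != DEF_SIMPLE)
        return 0;
    }
  return 1;
}

/* Fixed shape: KIND is reset to a default on every iteration.  */
int
all_ops_simple_fixed (const char *const *ops, size_t n)
{
  assert (ops != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      enum def_kind kind = DEF_UNKNOWN;
      if (ops[i] != NULL)
        kind = classify (ops[i]);
      if (kind != DEF_SIMPLE)
        return 0;
    }
  return 1;
}
```

With the operands { "c1", NULL } the buggy version accepts the statement and
the fixed one rejects it; with { NULL, "c1" } both reject it, which matches
the asymmetry described in the original report.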


Cheers,
Alan.



avoid_issimple2.patch
Description: Binary data


Re: [PATCH 7/9] ENABLE_CHECKING refactoring: middle-end, LTO FE

2015-10-26 Thread Bernd Schmidt

On 10/26/2015 05:59 PM, Jeff Law wrote:

I left it as-is.  Obviously if you really want the unnecessary braces
squashed out, we can do that.


It's not a rule I particularly care about myself, but I'm flagging stuff 
like this for the sake of consistency.



Don't change the whitespace here. Looks like you probably removed a page
break.

Not obvious where it got lost as there's no filename in the review
comments :-)


Oops. I think it was one of the sel-sched files.


Bernd


Re: Drop types_compatible_p from operand_equal_p

2015-10-26 Thread Jan Hubicka
> On Sat, 24 Oct 2015, Jan Hubicka wrote:
> 
> > Hi, as discussed earlier, the types_compatible_p in operand_equal_p 
> > seems redundant (it is the caller's job to figure out how much type 
> > matching it wants; if not, we probably want to treat most other 
> > references and casts in a similar way).
> > 
> > Bootstrapped/regtested x86_64-linux. OK?
> 
> Ok.
> 
> Btw, you need to audit tree hashing for required changes with respect
> to your ones to operand_equal_p.  operand_equal_p is the equality
> function for it.

Yep, I am aware that tree hashing must match.  I think my changes are safe so
far:
 - ctors are already hashed reasonably
 - types are not hashed so the changes strengthening OEP_ADDRESS_OF are safe
 - OEP_ADDRESS_OF (so far) still boils down to syntactic matching.

Looking at this I noticed that simple_cst_equal in tree.c seems to reimplement
OEP_CONSTANT part of operand_equal_p and it seems to have bugs - i.e. not
comparing index for CONSTRUCTOR ELTs.

Also I was always a bit unsure how the path through operand_equal_p that allows
different tree codes to match (stripping MEM_REF) and use of STRIP_NOPS combine
with add_expr that doesn't do these.  Do we have some mechanism that will
prevent this from corrupting hashtables?
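
The general hazard behind that question can be shown with a toy example.  It
says nothing about whether GCC's add_expr is actually affected; struct node,
node_equal, node_hash_naive and node_hash_stripped are hypothetical names.  The
equality function looks through wrapper nodes (loosely analogous to what
STRIP_NOPS does), so a hash that does not look through them breaks the rule
that equal keys must hash equally.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_LEAF, NODE_NOP };

struct node
{
  enum node_kind kind;
  int value;                 /* Meaningful for NODE_LEAF.  */
  const struct node *inner;  /* Meaningful for NODE_NOP.  */
};

/* Skip any chain of wrapper nodes.  */
static const struct node *
strip_nops (const struct node *n)
{
  assert (n != NULL);
  while (n->kind == NODE_NOP)
    {
      n = n->inner;
      assert (n != NULL);
    }
  return n;
}

/* Equality that ignores wrappers.  */
int
node_equal (const struct node *a, const struct node *b)
{
  return strip_nops (a)->value == strip_nops (b)->value;
}

/* Hash that includes the wrapper: inconsistent with node_equal.  */
unsigned
node_hash_naive (const struct node *n)
{
  unsigned h = (unsigned) n->kind * 31u;
  if (n->kind == NODE_LEAF)
    return h + (unsigned) n->value;
  return h + node_hash_naive (n->inner);
}

/* Hash that strips wrappers first: consistent with node_equal.  */
unsigned
node_hash_stripped (const struct node *n)
{
  return (unsigned) strip_nops (n)->value;
}
```

A leaf and the same leaf behind one wrapper compare equal, yet the naive hash
places them in different buckets, so a table lookup could miss an existing
entry.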

Honza
> 
> Thanks,
> Richard.
> 
> > * fold-const.c (operand_equal_p): Drop types_compatible_p when
> > comparing references.
> > 
> > Index: fold-const.c
> > ===
> > --- fold-const.c(revision 229278)
> > +++ fold-const.c(working copy)
> > @@ -2982,9 +2982,6 @@ operand_equal_p (const_tree arg0, const_
> >TYPE_SIZE (TREE_TYPE (arg1)),
> >flags)))
> > return 0;
> > - /* Verify that access happens in similar types.  */
> > - if (!types_compatible_p (TREE_TYPE (arg0), TREE_TYPE (arg1)))
> > -   return 0;
> >   /* Verify that accesses are TBAA compatible.  */
> >   if (flag_strict_aliasing
> >   && (!alias_ptr_types_compatible_p
> > 
> > 
> 
> -- 
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)


Re: [PATCH 7/9] ENABLE_CHECKING refactoring: middle-end, LTO FE

2015-10-26 Thread Jeff Law

On 10/19/2015 06:13 AM, Bernd Schmidt wrote:

diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c
-
-#ifdef ENABLE_CHECKING
-  verify_flow_info ();
-#endif
+  checking_verify_flow_info ();


This looks misindented.
Looks that way, but it was a spaces vs tabs issue.  Oh how I long for 
the day when there's a commit hook that just fixes that stuff for us.






-#ifdef ENABLE_CHECKING
cgraph_edge *e;
gcc_checking_assert (
  !(e = caller->get_edge (call_stmt)) || e->speculative);
-#endif


While you're here, that would look nicer as
  gcc_checking_assert (!(e = caller->get_edge (call_stmt))
   || e->speculative);

Agreed & fixed.




-#ifdef ENABLE_CHECKING
-  if (check_same_comdat_groups)
+  if (CHECKING_P && check_same_comdat_groups)


flag_checking

Agreed & fixed.




-#ifdef ENABLE_CHECKING
-  struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb);
-#endif
+  struct df_rd_bb_info *bb_info = flag_checking ? DF_RD_BB_INFO (g->bb)
+: NULL;


I think no need to make that conditional, that's a bit too ugly.
Given that BB_INFO is only used in a checking path, I think we can just 
sink the variable and its initialization into that path and drop the 
conditional nonsense.  Done.  Fixed minor indentation in the assert as well.







+  if (CHECKING_P)
+sparseset_set_bit (active_defs_check, regno);



+  if (CHECKING_P)
+sparseset_clear (active_defs_check);


 > -#ifdef ENABLE_CHECKING
 > -  active_defs_check = sparseset_alloc (max_reg_num ());
 > -#endif

 > +  if (CHECKING_P)
 > +active_defs_check = sparseset_alloc (max_reg_num ());

 > +  if (CHECKING_P)
 > +sparseset_free (active_defs_check);

flag_checking. Lots of other occurrences, I'll mention some but not all
but please fix them for consistency.
Those which were used in conditionals like above I fixed.  I left the 
CPP conditionals alone.  Obviously a second round of conditional 
compilation is going to be needed :-)





  void
  sem_item_optimizer::verify_classes (void)
  {
-#if ENABLE_CHECKING
+  if (!flag_checking)
+return;
+


Not entirely sure whether you want to wrap this into a
checking_verify_classes instead so that it remains easily callable by
the debugger?
I'd tend to agree.  If I call a verify routine from the debugger, I 
expect it to actually verify state, not to early exit because checking 
was turned off entirely.  I clearly missed that when initially reviewing 
the ICF work.


I think pushing the conditional up to the callers should be sufficient, 
which is what I've done.
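
The wrapper alternative discussed above can be sketched as follows.  This is
an illustration under assumptions, not the committed code: checking_enabled
stands in for a runtime flag such as flag_checking, and verify_sorted is a
made-up verifier.  The verifier itself always does its work, so calling it
from a debugger is meaningful; only the wrapper consults the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical runtime switch standing in for a checking flag.  */
bool checking_enabled = false;

/* Counts how often the real verifier ran (for demonstration only).  */
int verify_calls = 0;

/* Always performs the check, regardless of the flag.  */
bool
verify_sorted (const int *v, int n)
{
  assert (n <= 1 || v != NULL);
  verify_calls++;
  for (int i = 1; i < n; i++)
    if (v[i - 1] > v[i])
      return false;
  return true;
}

/* Wrapper for normal call sites: a no-op unless checking is enabled.  */
bool
checking_verify_sorted (const int *v, int n)
{
  return checking_enabled ? verify_sorted (v, n) : true;
}
```

Pushing the condition into each caller, as done in the patch, has the same
effect on the verifier; the wrapper merely keeps the call sites shorter.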







+  if (flag_checking)
+{
+  for (symtab_node *n = node->same_comdat_group;
+   n != node;
+   n = n->same_comdat_group)
+/* If at least one of same comdat group functions is external,
+   all of them have to be, otherwise it is a front-end bug.  */
+gcc_assert (DECL_EXTERNAL (n->decl));
+}


Unnecessary set of braces.
Agreed.  But I've looked at both formattings and the one above actually 
seems easier to look at visually to me.  I thought it might be the 
comments making the code look like a longer block, so I tried pushing in 
the extra braces, but that wasn't really an improvement (to me) over 
Mikhail's version.


I left it as-is.  Obviously if you really want the unnecessary braces 
squashed out, we can do that.






diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c
index 2986f57..941a829 100644
--- a/gcc/lra-assigns.c
+++ b/gcc/lra-assigns.c
@@ -1591,7 +1591,7 @@ lra_assign (void)
bitmap_initialize (&all_spilled_pseudos, &reg_obstack);
create_live_range_start_chains ();
setup_live_pseudos_and_spill_after_risky_transforms
(&all_spilled_pseudos);
-#ifdef ENABLE_CHECKING
+#if CHECKING_P
if (!flag_ipa_ra)
  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0


Seems inconsistent, use flag_checking and no #if? Looks like the problem
you're trying to solve is that a structure field exists only with
checking, I think that could just be made available unconditionally -
the struct is huge anyway.
Yea, more importantly, I don't like structs that change size based on 
checking bits.   I've made the field available and removed the 
conditional assignment to the field.  I can't see how this could 
possibly be performance critical.




As mentioned in the other mail, I see no value changing the #ifdefs to
#ifs here or elsewhere in the patch.


-  check_rtl (false);
-#endif
+  if (flag_checking)
+check_rtl (/*final_p=*/false);


Lose the /*final_p=*/.

Fixed both occurrences.




-#ifdef ENABLE_CHECKING
+#if CHECKING_P
gcc_assert (!bitmap_bit_p (output, DECL_UID (node->decl)));
bitmap_set_bit (output, DECL_UID (node->decl));
  #endif


Not entirely clear why this isn't using flag_checking.
Me neither.   I think we can drop the conditional code here.  Just to be 
consistent with c

Re: [patch] Extend former fold_widened_comparison to all integral types

2015-10-26 Thread Eric Botcazou
> Maybe you should count how many times __atomic_is_lock_free is called
> instead of the current patch to the testsuite.

I have changed it to make __atomic_is_lock_free set a value through p.

-- 
Eric Botcazou


Re: [OpenACC 8/11] device-specific lowering

2015-10-26 Thread Jakub Jelinek
On Mon, Oct 26, 2015 at 09:13:28AM -0700, Nathan Sidwell wrote:
> On 10/26/15 08:13, Jakub Jelinek wrote:
> 
> >>It won't convert them into such representations.
> >
> >Can you fix that incrementally?  I'd expect that code marked with acc loop vector
> >can't have loop carried backward lexical dependencies, at least not within
> >the adjacent number of iterations specified in vector clause?
> 
> Sure.  I was using 'won't' to describe the patch, not claiming it could
> never be changed to do that kind of thing.

Ok.

> >Otherwise LGTM.
> 
> I think all your other comments are spot on and will address.  Do you want
> another review with them fixed?

Just committing fixed version (and posting what you've committed for patches
that changed since the patch that has been posted earlier) is enough.

> If not, I think the only thing remaining is the IFN_UNIQUE patch, which
> (at least) needs an update to use targetm.have... conversion.

Ok, will wait till you make those changes then?

Jakub


Re: [PATCH] Adjust some patterns wrt :c

2015-10-26 Thread Marc Glisse

On Mon, 26 Oct 2015, Richard Biener wrote:


@@ -435,7 +435,7 @@ (define_operator_list RINT BUILT_IN_RINT

/* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
(simplify
- (minus (bit_and:cs @0 (bit_not @1)) (bit_and:s @0 @1))
+ (minus (bit_and:cs @0 (bit_not @1)) (bit_and:cs @0 @1))
  (minus (bit_xor @0 @1) @1))


Sorry, I should have listed them all, but the same applies to
/* Fold (A & B) - (A & ~B) into B - (A ^ B).  */
a few lines below.
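
Both folds mentioned here are easy to sanity-check by brute force.  The sketch
below is only an illustration in plain C, not part of match.pd or its
testsuite; the function names are made up, and 8-bit unsigned values are used
so that every pair of inputs can be tried with wrapping arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* (A & ~B) - (A & B) and its folded form (A ^ B) - B.  */
static uint8_t
minus_andnot_and (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a & ~b) - (a & b));
}

static uint8_t
minus_xor_b (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a ^ b) - b);
}

/* (A & B) - (A & ~B) and its folded form B - (A ^ B).  */
static uint8_t
minus_and_andnot (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a & b) - (a & ~b));
}

static uint8_t
minus_b_xor (uint8_t a, uint8_t b)
{
  return (uint8_t) (b - (a ^ b));
}

/* Return 1 when both identities hold for all A, B below LIMIT.  */
int
folds_hold (unsigned limit)
{
  assert (limit <= 256);
  for (unsigned a = 0; a < limit; a++)
    for (unsigned b = 0; b < limit; b++)
      {
        uint8_t x = (uint8_t) a, y = (uint8_t) b;
        if (minus_andnot_and (x, y) != minus_xor_b (x, y)
            || minus_and_andnot (x, y) != minus_b_xor (x, y))
          return 0;
      }
  return 1;
}
```

The identities follow from splitting the bits into A-only, B-only and shared
parts: A ^ B is the sum of the first two and B is the sum of the last two, so
the B-only part cancels in either difference.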

--
Marc Glisse


Re: [PATCH] PR fortran/36192 -- Check for valid BT_INTEGER

2015-10-26 Thread Dominique d'Humières
> … . The testcase demonstrates that the segfault in F951 (caused by calling
>  mpz_set with an invalid mpz_t) does not happen. 

If I am not mistaken, the test compiles without the patch (with different 
messages, at least on x86_64-apple-darwin14):

/opt/gcc/work/gcc/testsuite/gfortran.dg/pr36192.f90:6:18:

   real, dimension(n,d) :: x  ! { dg-error "of INTEGER type|of INTEGER type" }
  1
Error: Expression at (1) must be of INTEGER type, found REAL
/opt/gcc/work/gcc/testsuite/gfortran.dg/pr36192.f90:6:20:

   real, dimension(n,d) :: x  ! { dg-error "of INTEGER type|of INTEGER type" }
1
Error: Expression at (1) must be of INTEGER type, found REAL
/opt/gcc/work/gcc/testsuite/gfortran.dg/pr36192.f90:6:27:

   real, dimension(n,d) :: x  ! { dg-error "of INTEGER type|of INTEGER type" }
   1
Error: The module or main program array 'x' at (1) must have constant shape
/opt/gcc/work/gcc/testsuite/gfortran.dg/pr36192.f90:7:2:

   x(1,:) = (/ 1.0, 0.0 /)
  1
Error: Different shape for array assignment at (1) on dimension 1 (0 and 2)

Dominique

> Le 26 oct. 2015 à 11:15, Dominique d'Humières  a écrit :
> 
> With the patch compiling the original test still gives
> 
> …
> pr36192.f90:39:10:
> 
>x_n, v_n, &  ! Configuration at t+dt with step dt
>  1
> Error: The module or main program array 'x_n' at (1) must have constant shape
> f951: internal compiler error: Segmentation fault: 11
> 
> Dominique
> 



Re: abort might not flush all open streams before process termination (was: aarch64-suse-linux-gnu: libgomp.oacc-c-c++-common/abort-1.c, libgomp.oacc-c-c++-common/abort-3.c FAILs)

2015-10-26 Thread Thomas Schwinge
Hi!

On Wed, 7 Oct 2015 12:13:41 +0200, I wrote:
> Copying glibc for your information/in case anyone has any further
> comments, and the man-pages maintainer, Michael Kerrisk.  The issue is
> that abort might not flush all open streams before process termination;
> original thread starting at
> .
> 
> On Tue, 06 Oct 2015 13:55:00 +0200, Andreas Schwab  
> wrote:
> > Thomas Schwinge  writes:
> > 
> > > | The two regressed test cases use __builtin_printf instead of fprintf to
> > > | stderr, but as far as I know, abort is to flush all open streams before
> > > | process termination?
> > 
> > It can't, since abort must be async-signal-safe.
> 
> It's still surprising to me that the message written to stderr is lost in
> your aarch64-suse-linux-gnu configuration (only): from a quick look,
> (current) glibc's stdlib/abort.c tries to actually close/flush all open
> streams before process termination.

That's , Andreas told us.

> This is also what's documented on
> : "all open streams
> are closed and flushed".
> 
> does sound more "conservative": "[abort] may include an attempt to effect
> fclose() on all open streams".  Should the man-page be edited to that
> effect?  And, the following patch be applied to GCC?

I convinced myself that the GCC patch was obvious enough; committed in
r229382:

commit 005c2a97673312fa25486a70bd810b9a1b37d367
Author: tschwinge 
Date:   Mon Oct 26 16:25:04 2015 +

abort might not flush all open streams before process termination

libgomp/
* testsuite/libgomp.oacc-c-c++-common/abort-1.c: Print to stderr.
* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229382 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 3 +++
 libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c | 3 ++-
 libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c | 3 ++-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index fa9027b..afc49ae 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2015-10-26  Thomas Schwinge  
 
+   * testsuite/libgomp.oacc-c-c++-common/abort-1.c: Print to stderr.
+   * testsuite/libgomp.oacc-c-c++-common/abort-3.c: Likewise.
+
* testsuite/libgomp.oacc-c-c++-common/lib-1.c: Remove explicit
acc_device_nvidia usage.
* testsuite/libgomp.oacc-c-c++-common/lib-10.c: Likewise.
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c
index 6a9b1df..296708f 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/abort-1.c
@@ -1,11 +1,12 @@
 /* { dg-do run } */
 
+#include 
 #include 
 
 int
 main (void)
 {
-  __builtin_printf ("CheCKpOInT\n");
+  fprintf (stderr, "CheCKpOInT\n");
 #pragma acc parallel
   {
 abort ();
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
index 2c8f347..bca425e 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
@@ -1,11 +1,12 @@
 /* { dg-do run } */
 
+#include 
 #include 
 
 int
 main (void)
 {
-  __builtin_printf ("CheCKpOInT\n");
+  fprintf (stderr, "CheCKpOInT\n");
 #pragma acc kernels
   {
 abort ();
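
A portable alternative to relying on abort to flush is to flush explicitly.
In ISO C, whether abort flushes open streams with unwritten buffered data is
implementation-defined, and stderr is not fully buffered when initially
opened, which is consistent with the committed change printing to stderr.  The
helper below is a hypothetical sketch (emit_checkpoint is not part of the
testsuite) that makes the flush explicit for any stream.

```c
#include <assert.h>
#include <stdio.h>

/* Write MSG to STREAM and flush immediately, so the text does not
   depend on what abort does with buffered data.  Returns 0 on
   success and EOF on failure.  */
int
emit_checkpoint (FILE *stream, const char *msg)
{
  assert (stream != NULL && msg != NULL);
  if (fputs (msg, stream) == EOF)
    return EOF;
  return fflush (stream);
}
```

A test could call emit_checkpoint (stdout, "CheCKpOInT\n") right before the
region that is expected to abort; the message is then handed to the operating
system before the process terminates.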


Grüße
 Thomas




RE: FW: [PATCH] Target hook for disabling the delay slot filler.

2015-10-26 Thread Simon Dardis
> On 10/23/2015 11:31 AM, Bernd Schmidt wrote:
> > On 10/23/2015 04:57 PM, Simon Dardis wrote:
> >
> >> Patch below. Target hook renamed to
> >> TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P.
> >>
> >> Tested on mips-img-elf, no new regressions.
> >
> > As far as I'm concerned this is ok, and IIUC Jeff was on board too.
> > This is assuming the test included a bootstrap, otherwise please do
> > that. You should also include a ChangeLog in future submissions.
> Just to be explicit, I'm on board.
> 
> Jeff

I've done bootstrap and regression. No new failures.

gcc/
* target.def (TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P): New hook.
* doc/tm.texi.in (TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P): Document.
* doc/tm.texi: Regenerated.
* reorg.c (dbr_schedule): Use new hook.
* config/mips/mips.c (mips_no_speculation_in_delay_slots_p): New.

testsuite/
* gcc.target/mips/ds-schedule-1.c: New.
* gcc.target/mips/ds-schedule-2.c: New.

Committed as r229383.

Thanks,
Simon


Re: [PATCH, 2/2] Handle recursive restrict pointer in create_variable_info_for_1

2015-10-26 Thread Tom de Vries

On 26/10/15 12:23, Tom de Vries wrote:

Hi,

this patch enables recursive restrict handling in
create_variable_info_for_1.

This allows us to interpret all restricts in a function parameter
"int *restrict *restrict *restrict a".

This patch is the first step towards implementing a generic fix for
PR67742.

Bootstrapped and reg-tested on x86_64.



Reposting with restrict_pointed_var as a hash_map rather than a field.


OK for trunk?



Thanks,
- Tom
Handle recursive restrict pointer in create_variable_info_for_1

2015-10-26  Tom de Vries  

	* tree-ssa-structalias.c (create_variable_info_for_1): Enable recursive
	handling of restrict pointers.
	(make_restrict_var_constraints): Handle restrict vars recursively.

	* gcc.dg/tree-ssa/restrict-7.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/restrict-7.c | 12 
 gcc/tree-ssa-structalias.c | 13 +++--
 2 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/restrict-7.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/restrict-7.c b/gcc/testsuite/gcc.dg/tree-ssa/restrict-7.c
new file mode 100644
index 000..f7a68c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/restrict-7.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+
+int
+f (int *__restrict__ *__restrict__ *__restrict__ a, int *b)
+{
+  *b = 1;
+  ***a  = 2;
+  return *b;
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "fre1" } } */
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index f1325e0..79e9773 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -5726,7 +5726,7 @@ create_variable_info_for_1 (tree decl, const char *name, bool handle_param)
 	  varinfo_t rvi;
 	  tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type));
 	  DECL_EXTERNAL (heapvar) = 1;
-	  rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", false);
+	  rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", true);
 	  rvi->is_restrict_var = 1;
 	  insert_vi_for_tree (heapvar, rvi);
 	  insert_restrict_pointed_var (vi, rvi);
@@ -5897,7 +5897,16 @@ make_restrict_var_constraints (varinfo_t vi)
 if (vi->may_have_pointers)
   {
 	if (vi->only_restrict_pointers)
-	  make_constraint_from_global_restrict (vi, "GLOBAL_RESTRICT");
+	  {
+	varinfo_t rvi = lookup_restrict_pointed_var (vi);
+	if (rvi != NULL)
+	  {
+		make_constraint_from (vi, rvi->id);
+		make_restrict_var_constraints (rvi);
+	  }
+	else
+	  make_constraint_from_global_restrict (vi, "GLOBAL_RESTRICT");
+	  }
 	else
 	  make_copy_constraint (vi, nonlocal_id);
   }
-- 
1.9.1



Re: libgomp testsuite: Remove some explicit acc_device_nvidia usage

2015-10-26 Thread Thomas Schwinge
Hi!

On Wed, 14 Oct 2015 14:05:49 +0200, Bernd Schmidt  wrote:
> On 10/09/2015 05:11 PM, Thomas Schwinge wrote:
> > On Wed, 22 Jul 2015 16:39:54 +0200, I wrote:
> >> [...] cleanup; committed to
> >> gomp-4_0-branch in r226072: [...]
> >
> > OK for trunk?
> 
> I think all three patches here look OK.

Thanks for the review.  Committed in r229379, r229380, r229381:

commit 54c8f61c735f98c73033ad04fc5db9062f8ad3a8
Author: tschwinge 
Date:   Mon Oct 26 16:24:28 2015 +

[libgomp/66518] Resolve XFAIL in libgomp.oacc-c-c++-common/lib-3.c

libgomp/
PR libgomp/66518
* testsuite/libgomp.oacc-c-c++-common/lib-3.c: Resolve XFAIL.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229379 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog   | 3 +++
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c | 7 ---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index 76cb423..e99f924 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2015-10-26  Thomas Schwinge  
 
+   PR libgomp/66518
+   * testsuite/libgomp.oacc-c-c++-common/lib-3.c: Resolve XFAIL.
+
PR libgomp/65437
PR libgomp/66518
* oacc-mem.c (update_dev_host): Call goacc_lazy_initialize.
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c
index 7118797..f9a73397 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-3.c
@@ -1,4 +1,6 @@
-/* { dg-do run } */
+/* Expect an error message when shutting down a device different from the one
+   that has been initialized.  */
+/* { dg-do run { target { ! openacc_host_selected } } } */
 
 #include 
 #include 
@@ -15,6 +17,5 @@ main (int argc, char **argv)
 }
 
 /* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */
-/* TODO: currently prints: "libgomp: no device found".  */
-/* { dg-output "device \[0-9\]+\\\(\[0-9\]+\\\) is initialized" { xfail *-*-* } } */
+/* { dg-output "no device initialized" } */
 /* { dg-shouldfail "" } */

commit cfe316ad3a766aa93361cec6325a3bc75c310e59
Author: tschwinge 
Date:   Mon Oct 26 16:24:44 2015 +

libgomp: Additional acc_shutdown bug fixing and testing

libgomp/
* oacc-init.c (acc_shutdown): Call gomp_init_targets_once.
* testsuite/libgomp.oacc-c-c++-common/lib-8.c: New file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229380 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog   |  3 +++
 libgomp/oacc-init.c |  2 ++
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-8.c | 19 +++
 3 files changed, 24 insertions(+)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index e99f924..ad970df 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2015-10-26  Thomas Schwinge  
 
+   * oacc-init.c (acc_shutdown): Call gomp_init_targets_once.
+   * testsuite/libgomp.oacc-c-c++-common/lib-8.c: New file.
+
PR libgomp/66518
* testsuite/libgomp.oacc-c-c++-common/lib-3.c: Resolve XFAIL.
 
diff --git libgomp/oacc-init.c libgomp/oacc-init.c
index a0e62a4..9a9a0b0 100644
--- libgomp/oacc-init.c
+++ libgomp/oacc-init.c
@@ -449,6 +449,8 @@ ialias (acc_init)
 void
 acc_shutdown (acc_device_t d)
 {
+  gomp_init_targets_once ();
+
   gomp_mutex_lock (&acc_device_lock);
 
   acc_shutdown_1 (d);
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-8.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/lib-8.c
new file mode 100644
index 000..ea28b6b
--- /dev/null
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-8.c
@@ -0,0 +1,19 @@
+/* Expect error message when shutting down a device that has never been
+   initialized.  */
+/* { dg-do run } */
+
+#include 
+#include 
+
+int
+main (int argc, char **argv)
+{
+  fprintf (stderr, "CheCKpOInT\n");
+  acc_shutdown (acc_device_default);
+
+  return 0;
+}
+
+/* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */
+/* { dg-output "no device initialized" } */
+/* { dg-shouldfail "" } */

commit 3c41a4f17f727373340749ec0699c62b62b93b67
Author: tschwinge 
Date:   Mon Oct 26 16:24:54 2015 +

libgomp testsuite: Remove some explicit acc_device_nvidia usage.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/lib-1.c: Remove explicit
acc_device_nvidia usage.
* testsuite/libgomp.oacc-c-c++-common/lib-10.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-9.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229381 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog|  6 ++
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-1.c  | 14 ++
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-10.c |  9 +
 libgomp/testsuite/libgomp.oacc

[PATCH][wwwdocs] Mention arm target attributes and pragmas in GCC 6 changes

2015-10-26 Thread Kyrill Tkachov

Hi all,

Christian asked me for this patch some time ago, but it slipped under my radar.
Here's a patch to mention the new target attributes for arm, in the same 
wording as for aarch64.

Ok to install?

Thanks,
Kyrill
Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.36
diff -U 3 -r1.36 changes.html
--- htdocs/gcc-6/changes.html	12 Oct 2015 16:55:25 -	1.36
+++ htdocs/gcc-6/changes.html	16 Oct 2015 14:49:17 -
@@ -182,8 +182,16 @@
target-specific options is now supported.
  

-
 
+ARM
+   
+ 
+   The arm port now supports target attributes and pragmas.  Please
+   refer to the <a href="https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html#ARM-Function-Attributes">
+   documentation</a> for details of available attributes and
+   pragmas as well as usage instructions.
+ 
+   
 
 
 IA-32/x86-64


Re: [PR libgomp/65437, libgomp/66518] Initialize runtime in acc_update_device, acc_update_self

2015-10-26 Thread Thomas Schwinge
Hi!

On Wed, 14 Oct 2015 14:08:42 +0200, Bernd Schmidt  wrote:
> On 10/09/2015 05:14 PM, Thomas Schwinge wrote:
> > On Fri, 19 Jun 2015 09:47:41 +0200, I wrote:
> >> On Tue, 5 May 2015 11:43:20 +0200, I wrote:
> >>> On Mon, 4 May 2015 10:20:14 -0400, John David Anglin 
> >>>  wrote:
>  FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-42.c
>  -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 output pattern test, is , should match
>  \[[0-9a-fA-FxX]+,256\] is not mapped

> > OK to commit?
> 
> Ok.

Thanks for the review.  Committed in r229378:

commit a6dcb5581494a8b750daf173f04ef087d6dc60c5
Author: tschwinge 
Date:   Mon Oct 26 16:24:17 2015 +

[PR libgomp/65437, libgomp/66518] Initialize runtime in acc_update_device, 
acc_update_self

libgomp/
PR libgomp/65437
PR libgomp/66518
* oacc-mem.c (update_dev_host): Call goacc_lazy_initialize.
* testsuite/libgomp.oacc-c-c++-common/lib-42.c: Remove XFAIL.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229378 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog| 7 +++
 libgomp/oacc-mem.c   | 6 +++---
 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c | 4 +---
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index 658c47b..76cb423 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,10 @@
+2015-10-26  Thomas Schwinge  
+
+   PR libgomp/65437
+   PR libgomp/66518
+   * oacc-mem.c (update_dev_host): Call goacc_lazy_initialize.
+   * testsuite/libgomp.oacc-c-c++-common/lib-42.c: Remove XFAIL.
+
 2015-10-23  Tom de Vries  
 
PR testsuite/68063
diff --git libgomp/oacc-mem.c libgomp/oacc-mem.c
index af067d6..5410906 100644
--- libgomp/oacc-mem.c
+++ libgomp/oacc-mem.c
@@ -547,6 +547,9 @@ update_dev_host (int is_dev, void *h, size_t s)
 {
   splay_tree_key n;
   void *d;
+
+  goacc_lazy_initialize ();
+
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
 
@@ -554,9 +557,6 @@ update_dev_host (int is_dev, void *h, size_t s)
 
   n = lookup_host (acc_dev, h, s);
 
-  /* No need to call lazy open, as the data must already have been
- mapped.  */
-
   if (!n)
 {
   gomp_mutex_unlock (&acc_dev->lock);
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c
index 95c4162..de5d1c1 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/lib-42.c
@@ -35,7 +35,5 @@ main (int argc, char **argv)
 }
 
 /* { dg-output "CheCKpOInT(\n|\r\n|\r).*" } */
-/* TODO: currently doesn't print anything; SIGSEGV.
-   .  */
-/* { dg-output "\\\[\[0-9a-fA-FxX\]+,256\\\] is not mapped" { xfail *-*-* } } */
+/* { dg-output "\\\[\[0-9a-fA-FxX\]+,256\\\] is not mapped" } */
 /* { dg-shouldfail "" } */
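
The change above moves initialization in front of the first use of per-thread state. A minimal sketch of that lazy-initialization pattern follows. It is a toy model only: `Runtime`, `Device`, `lazy_initialize` and `update` are hypothetical names for illustration, not libgomp's types or functions, and the sketch ignores locking.

```cpp
#include <cassert>

// Hypothetical stand-in for a device handle; not libgomp's type.
struct Device {
  bool is_open = false;
};

// Hypothetical runtime state.  DEV stays null until initialization runs.
struct Runtime {
  Device *dev = nullptr;
  Device storage;      // backing object handed out on initialization
  int init_count = 0;  // how many times real initialization ran
};

// Initialize on first use only; later calls are no-ops.
void lazy_initialize (Runtime &rt)
{
  if (rt.dev != nullptr)
    return;
  rt.storage.is_open = true;
  rt.dev = &rt.storage;
  ++rt.init_count;
}

// An entry point that may be the very first call a program makes.
// Returns true if the device could be consulted.
bool update (Runtime &rt)
{
  lazy_initialize (rt);    // must come before any use of rt.dev
  return rt.dev->is_open;  // safe: rt.dev is non-null here
}
```

Without the `lazy_initialize` call at the top of `update`, the first call would dereference a null pointer, which is the toy equivalent of the SIGSEGV noted in the removed TODO.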


Regards,
 Thomas




[PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-10-26 Thread Alan Lawrence
The included testcase demonstrates the ICE: aarch64_valid_floating_const
(via aarch64_float_const_representable_p) disables HFmode immediates, but
allows 0.0. However, *movhf_aarch64 does not allow this insn:

(insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0 *f_2(D)+0 S2 A16])
(const_double:HF 0.0 [0x0.0p+0])) test.c:8 -1
 (nil))

Fix is to allow the second operand to be zero, in the same way as
*movsf_aarch64.
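
As a rough illustration of why the old insn condition rejected a store of 0.0 to memory, here is a toy model of the two conditions. The `Operand` categories and helper names are hypothetical simplifications; real RTL predicates and constraints carry far more detail, and this sketch only assumes that aarch64_reg_or_fp_zero accepts a register or a floating-point zero.

```cpp
#include <cassert>

// Hypothetical operand categories for illustration only.
enum class Operand { Register, Memory, FpZero, FpNonZero };

bool is_register (Operand op) { return op == Operand::Register; }

// Models a "register or floating-point zero" predicate.
bool is_reg_or_fp_zero (Operand op)
{
  return op == Operand::Register || op == Operand::FpZero;
}

// Insn condition before the patch: one side must be a register.
bool move_ok_before (Operand dst, Operand src)
{
  return is_register (dst) || is_register (src);
}

// Insn condition after the patch: the source may also be 0.0.
bool move_ok_after (Operand dst, Operand src)
{
  return is_register (dst) || is_reg_or_fp_zero (src);
}
```

In this model, a memory destination with a zero source fails `move_ok_before` but passes `move_ok_after`, while a memory destination with a nonzero constant is rejected by both.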

Bootstrapped + check-gcc on aarch64-none-linux-gnu.
New test also passing on arm-none-eabi.

OK for trunk?

gcc/ChangeLog:

* config/aarch64/aarch64.md (*movhf_aarch64): Use
aarch64_reg_or_fp_zero for second operand.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/fp16/set_zero_1.c: New.
---
 gcc/config/aarch64/aarch64.md  |  2 +-
 gcc/testsuite/gcc.target/aarch64/fp16/set_zero_1.c | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fp16/set_zero_1.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 78b9ae2..8895a4e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1120,7 +1120,7 @@
   [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
-|| register_operand (operands[1], HFmode))"
+|| aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
mov\\t%0.h[0], %w1
umov\\t%w0, %1.h[0]
diff --git a/gcc/testsuite/gcc.target/aarch64/fp16/set_zero_1.c 
b/gcc/testsuite/gcc.target/aarch64/fp16/set_zero_1.c
new file mode 100644
index 000..36cadfd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fp16/set_zero_1.c
@@ -0,0 +1,22 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-mfp16-format=ieee" { target "arm*-*-*" } } */
+
+extern void abort (void);
+
+__attribute__ ((noinline))
+void
+setfoo (__fp16 *f)
+{
+  *f = 0.0;
+}
+
+int
+main (int argc, char **argv)
+{
+  __fp16 a = 1.0;
+  setfoo (&a);
+  if (a != 0.0)
+abort ();
+  return 0;
+}
-- 
1.9.1



Re: [PATCH, 1/2] Add handle_param parameter to create_variable_info_for_1

2015-10-26 Thread Tom de Vries

On 26/10/15 14:42, Richard Biener wrote:

On Mon, Oct 26, 2015 at 12:22 PM, Tom de Vries  wrote:

>Hi,
>
>this no-functional-changes patch copies the restrict var declaration code
>from intra_create_variable_infos to create_variable_info_for_1.
>
>The code was copied rather than moved, since in fipa-pta mode the varinfo p
>for the parameter t may already exist due to create_function_info_for, in
>which case we're not calling create_variable_info_for_1 to set p, meaning
>the copied code won't get triggered.
>
>Bootstrapped and reg-tested on x86_64.
>
>OK for trunk?

@@ -272,6 +272,9 @@ struct variable_info
/* True if this field has only restrict qualified pointers.  */
unsigned int only_restrict_pointers : 1;

+  /* The id of the pointed-to restrict var in case only_restrict_pointers.  */
+  unsigned int restrict_pointed_var;
+
/* True if this represents a heap var created for a restrict qualified
   pointer.  */
unsigned int is_restrict_var : 1;
@@ -5608,10 +5611,10 @@ check_for_overlaps (vec fieldstack)


Please don't split the  bitfield like that.


Ah, indeed.


Note that variable_info is kept as small as possible because there may be
a _lot_ of them.  Is it really necessary to have this info be persistent?


The info is only needed for a short period, roughly the period of an 
invocation of intra_create_variable_infos.


I've removed the field, and now the info is stored in a hash_map.
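
A minimal sketch of the side-table idea, assuming a much simplified `VarInfo` and using std::unordered_map in place of GCC's hash_map. All names here are hypothetical; the real code is in the patch below.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical, simplified stand-in for GCC's variable_info.
struct VarInfo {
  unsigned id;
  unsigned only_restrict_pointers : 1;
  unsigned is_restrict_var : 1;
};

using VarInfoPtr = VarInfo *;

// Short-lived side table: restrict pointer -> the variable it points to.
// Keeping this outside VarInfo leaves the struct itself small.
class RestrictPointedVars {
public:
  // Record TARGET for POINTER; returns false if POINTER was already mapped.
  bool insert (VarInfoPtr pointer, VarInfoPtr target)
  {
    return map_.emplace (pointer, target).second;
  }

  // Return the recorded target, or nullptr if POINTER has none.
  VarInfoPtr lookup (VarInfoPtr pointer) const
  {
    auto it = map_.find (pointer);
    return it == map_.end () ? nullptr : it->second;
  }

  // Drop all entries once the consumer is done with them.
  void clear () { map_.clear (); }

private:
  std::unordered_map<VarInfoPtr, VarInfoPtr> map_;
};
```

The trade-off is an extra lookup for the few variables that need the information, in exchange for not growing every variable_info instance.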


 Why does create_variable_info_for_1 only handle
the single-field case?  That is, I expected this to be handled by c_v_r_f_1
fully.



Yep, that's the goal of PR67742.  I wrote a patch implementing that in an 
earlier version of the patch series, and I'm currently porting it to this 
series.  I'll post it asap.


I decided to post these two patches which do not present a full 
solution, since they are already an improvement over the current situation.


Thanks,
- Tom
Add handle_param parameter to create_variable_info_for_1

2015-10-26  Tom de Vries  

	* tree-ssa-structalias.c (restrict_pointed_var): New static variable.
	(insert_restrict_pointed_var, lookup_restrict_pointed_var): New
	functions.
	(create_variable_info_for_1): Add and handle handle_param parameter.
	(create_variable_info_for): Call create_variable_info_for_1 with extra
	arg.
	(intra_create_variable_infos): Same.  Handle case that
	lookup_restrict_pointed_var (p) is not NULL.
---
 gcc/tree-ssa-structalias.c | 73 +++---
 1 file changed, 63 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 63a3d02..f1325e0 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -5606,12 +5606,45 @@ check_for_overlaps (vec fieldstack)
   return false;
 }
 
+/* Map from restrict pointer variable info to restrict var variable info.  */
+
+static hash_map<varinfo_t, varinfo_t> *restrict_pointed_var = NULL;
+
+/* Insert VI2 as the restrict var for VI in the restrict_pointed_var map.  */
+
+static void
+insert_restrict_pointed_var (varinfo_t vi, varinfo_t vi2)
+{
+  if (restrict_pointed_var == NULL)
+    restrict_pointed_var = new hash_map<varinfo_t, varinfo_t>;
+
+  bool mapped = restrict_pointed_var->put (vi, vi2);
+  gcc_assert (!mapped);
+}
+
+/* Find the restrict var for restrict pointer VI in the restrict_pointed_var
+   map.  If VI does not exist in the map, return NULL, otherwise, return the
+   varinfo we found.  */
+
+static varinfo_t
+lookup_restrict_pointed_var (varinfo_t vi)
+{
+  if (restrict_pointed_var == NULL)
+return NULL;
+  varinfo_t *slot = restrict_pointed_var->get (vi);
+  if (slot == NULL)
+return NULL;
+
+  return *slot;
+}
+
+
 /* Create a varinfo structure for NAME and DECL, and add it to VARMAP.
This will also create any varinfo structures necessary for fields
-   of DECL.  */
+   of DECL.  DECL is a function parameter if HANDLE_PARAM is set.  */
 
 static varinfo_t
-create_variable_info_for_1 (tree decl, const char *name)
+create_variable_info_for_1 (tree decl, const char *name, bool handle_param)
 {
   varinfo_t vi, newvi;
   tree decl_type = TREE_TYPE (decl);
@@ -5687,6 +5720,17 @@ create_variable_info_for_1 (tree decl, const char *name)
   if (POINTER_TYPE_P (TREE_TYPE (decl))
 	  && TYPE_RESTRICT (TREE_TYPE (decl)))
 	vi->only_restrict_pointers = 1;
+  if (vi->only_restrict_pointers
+	  && handle_param)
+	{
+	  varinfo_t rvi;
+	  tree heapvar = build_fake_var_decl (TREE_TYPE (decl_type));
+	  DECL_EXTERNAL (heapvar) = 1;
+	  rvi = create_variable_info_for_1 (heapvar, "PARM_NOALIAS", false);
+	  rvi->is_restrict_var = 1;
+	  insert_vi_for_tree (heapvar, rvi);
+	  insert_restrict_pointed_var (vi, rvi);
+	}
   fieldstack.release ();
   return vi;
 }
@@ -5738,7 +5782,7 @@ create_variable_info_for_1 (tree decl, const char *name)
 static unsigned int
 create_variable_info_for (tree decl, const char *name)
 {
-  varinfo_t vi = create_variable_info_for_1 (decl, name);
+  varinfo_t vi = create_variable_info_for_1 (decl

Re: [OpenACC 8/11] device-specific lowering

2015-10-26 Thread Nathan Sidwell

On 10/26/15 08:13, Jakub Jelinek wrote:


It won't convert them into such representations.


Can you fix that incrementally?  I'd expect that code marked with acc loop
vector can't have loop carried backward lexical dependencies, at least not
within the adjacent number of iterations specified in vector clause?


Sure.  I was using 'won't' to describe the patch, not claiming it could never
be changed to do that kind of thing.




Otherwise LGTM.


I think all your other comments are spot on and will address.  Do you want 
another review with them fixed?


If not, I think the only thing remaining is the IFN_UNIQUE patch, which (at
least) needs an update to use targetm.have... conversion.


nathan



Re: [PATCH] PR fortran/36192 -- Check for valid BT_INTEGER

2015-10-26 Thread Steve Kargl
On Mon, Oct 26, 2015 at 03:43:37PM +0100, FX wrote:
> > Because the code issues two errors, one for each dimension.
> 
> Then shouldn't it be "string.*string" to match
> two occurrences of the string, with some stuff (incl. newline) in the middle?
> 

I don't know dejagnu well enough to know if some other regex pattern
would capture all 3 errors.  I'm simply using the advice given on
the wiki: https://gcc.gnu.org/wiki/TestCaseWriting
If you have a better pattern, I'm more than willing to change the
testcase.

The point of the testcase isn't to see if 3 error messages or even
1 error message is issued.  The testcase demonstrates that the segfault
in F951 (caused by calling mpz_set with an invalid mpz_t) does not happen.

-- 
Steve


Re: [PATCH][auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

2015-10-26 Thread Richard Sandiford
Bernd Schmidt  writes:
> I seem to recall Richard had a rewrite of all the autoinc code. I wonder 
> what happened to that?

Although it produced more autoincs, it didn't really improve performance
that much on the targets I was looking at at the time.

I'm afraid the patch is long lost now, and would probably be in an
uncertain copyright situation anyway.

Thanks,
Richard



Re: Handle OBJ_TYPE_REF in operand_equal_p

2015-10-26 Thread Jan Hubicka
> > This thing simply stays in GIMPLE and serves no purpose.  I tried to remove
> > it at one point, but ran into some regressions.  I plan to return to this.
> > Have no idea how this was intended to work.
> > Some bits were added by:
> >  https://gcc.gnu.org/ml/gcc-patches/2005-04/txt00100.txt
> > Code in build_objc_method_call seems predating changelogs.
> 
> I see.  But the above should be harmless unless you throw it onto the ODR
> predicates?

Yes, but it is what I do two lines below...
> 
> >>
> >> > + /* Match the type information stored in the wrapper.  */
> >> > +
> >> > + flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
> >> > + return (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
> >> > + == tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1))
> >>
> >> Err, please use tree_int_cst_eq ()
> >
> > OK updated in my local copy and, I will update other places too, this was
> > directly copied from ipa-icf-gimple.
> >>
> >> > + && types_same_for_odr (obj_type_ref_class (arg0),
> >> > +obj_type_ref_class (arg1))
> >>
> >> Why do you need this check?  The args of OBJ_TYPE_REF should be fully
> >> specifying the object, no?
> >
> > This is the type of THIS pointer that is extracted from pointer type of
> > OBJ_TYPE_REF.  Because we otherwise consider different pointers compatible,
> > I think we need to check it here.
> 
> Hum, this is fold-const.c and thus GENERIC.  GENERIC doesn't consider pointer
> types compatible in general.  So I think it is not needed.  As you say
> elsewhere, the caller is responsible for restricting the type of the trees it
> asks for equality.

...

My understanding is that operand_equal_p should check that the value of the
operand is the same and will be the same after subsequent optimizations.

For this reason it does its own type matching (comparison of TYPE_MODE for
expressions, TYPE_UNSIGNED where it matters).  It also compares alias
information for MEM_REF to be sure that both reads from memory will always
return the same even after later optimizations.  OBJ_TYPE_REF type info is no
different from TBAA alias info.  It is just additional information which is
used for later optimization and therefore it must match across the two copies.
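
A toy model of the point above, assuming a virtual call reference is described by just three things: the object pointer, the vtable slot (token), and the class the call was written against. `VirtualCallRef` and `same_virtual_call` are hypothetical names for illustration, not GCC's tree representation or its API.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a virtual call reference; not GCC's tree node.
struct VirtualCallRef {
  const void *object;     // the object pointer the call goes through
  unsigned long token;    // vtable slot index
  std::string otr_class;  // static class the call was written against
};

// Two references are interchangeable only if the extra type information
// matches too, since later optimizations may rely on it.
bool same_virtual_call (const VirtualCallRef &a, const VirtualCallRef &b)
{
  return a.object == b.object
	 && a.token == b.token
	 && a.otr_class == b.otr_class;
}
```

Two references through the same pointer and the same slot still compare unequal in this model when their classes differ, which mirrors the role of the types_same_for_odr check in the hunk quoted earlier.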

Modulo the fact that we don't seem to fold these, I think one way to confuse us
is:

  test ? ((derivedobject1 *)ptr)->foo () : ((derivedobject *)ptr)->foo ()

Here we have two NOP_EXPRs that convert PTR to two derived objects, and
OBJ_TYPE_REF will look equivalent modulo having a different type.  If the return
value is the same,
I think it w

Plus operand_equal_p is used for both GENERIC and GIMPLE (which is somewhat
ugly indeed.  We may just fork the implementation eventually if we move away
completely from sharing code paths between GENERIC and GIMPLE).

Ok, I tried to fix my testcase to be optimized and modify it to a case where we
need to type match.  I can't do that with the current tree but I think it is just
laziness elsewhere that should be fixed.  Doing:

struct foo {
  virtual int test () __attribute__ ((pure));
};

int ret (struct foo *f, int v)
{
   return v ? f->test() : f->test();
}

We actually match both OBJ_TYPE_REFs in GENERIC folding:

#0  operand_equal_p (arg0=0x76ad6b10, arg1=0x76ad6a50, 
flags=flags@entry=0) at ../../gcc/fold-const.c:2706
#1  0x009fa979 in operand_equal_p (arg0=arg0@entry=0x76ae20e0, 
arg1=arg1@entry=0x76ae20a8, flags=flags@entry=0) at 
../../gcc/fold-const.c:3130
#2  0x01097e78 in generic_simplify_COND_EXPR (code=COND_EXPR, 
op2=0x76ae20e0, op1=0x76ae20a8, op0=0x76c35758, 
type=0x76adb7e0, loc=1042) at generic-match.c:24047
#3  generic_simplify (loc=loc@entry=1042, code=code@entry=COND_EXPR, 
type=type@entry=0x76adb7e0, op0=op0@entry=0x76c35758, 
op1=op1@entry=0x76ae20a8, 
op2=op2@entry=0x76ae20e0) at generic-match.c:24230
#4  0x00a066cd in fold_ternary_loc (loc=loc@entry=1042, 
code=code@entry=COND_EXPR, type=type@entry=0x76adb7e0, op0=0x76c35758, 
op1=0x76ae20a8, op2=0x76ae20e0)
at ../../gcc/fold-const.c:11377
#5  0x00a28c85 in fold (expr=) at 
../../gcc/fold-const.c:11996
#6  0x0079118a in fold_if_not_in_template (expr=) at 
../../gcc/cp/tree.c:4287

This will not result in the desired optimization because we give up in the
outer operand_equal_p matching CALL_EXPR:

0x009fa979 in operand_equal_p (arg0=arg0@entry=0x76ae20e0, 
arg1=arg1@entry=0x76ae20a8, flags=flags@entry=0) at 
../../gcc/fold-const.c:3130
3130  if (! operand_equal_p (CALL_EXPR_FN (arg0), CALL_EXPR_FN 
(arg1),
Value returned is $13 = 1
(gdb) l
3125}
3126  else
3127{
3128  /* If the CALL_EXPRs call different functions, then they 
are not
3129 equal.  */
3130  if (! operand_equal_p (CALL_EXPR_FN (arg0), CALL_EXPR_FN 
(arg1),
3131   

Re: [PATCH] New attribute to create target clones

2015-10-26 Thread Jeff Law

On 10/12/2015 05:35 PM, Evgeny Stupachenko wrote:

Hi All,

Here is a new version of patch (attached).
Bootstrap and make check are in progress (all new tests passed).

New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
than "-mavx" are passed.
However it has the same behavior if "target_clones" attribute is
replaced by 2 corresponding "target" attributes.
I've filed PR67946 on this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946

Thanks,
Evgeny

ChangeLog:

2015-10-13  Evgeny Stupachenko
gcc/
 * Makefile.in (OBJS): Add multiple_target.o.
 * attrib.c (make_attribute): Moved from config/i386/i386.c
 * config/i386/i386.c (make_attribute): Deleted.
 * multiple_target.c (make_attribute): New.
 (create_dispatcher_calls): Ditto.
 (get_attr_len): Ditto.
 (get_attr_str): Ditto.
 (is_valid_asm_symbol): Ditto.
 (create_new_asm_name): Ditto.
 (create_target_clone): Ditto.
 (expand_target_clones): Ditto.
 (ipa_target_clone): Ditto.
 (ipa_dispatcher_calls): Ditto.
 * passes.def (pass_target_clone): Two new ipa passes.
 * tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
 * c-common.c (handle_target_clones_attribute): New.
 * (c_common_attribute_table): Add handle_target_clones_attribute.
 * (handle_always_inline_attribute): Add check on target_clones
 attribute.
 * (handle_target_attribute): Ditto.

gcc/testsuite
 * gcc.dg/mvc1.c: New test for multiple targets cloning.
 * gcc.dg/mvc2.c: Ditto.
 * gcc.dg/mvc3.c: Ditto.
 * gcc.dg/mvc4.c: Ditto.
 * gcc.dg/mvc5.c: Ditto.
 * gcc.dg/mvc6.c: Ditto.
 * gcc.dg/mvc7.c: Ditto.
 * g++.dg/ext/mvc1.C: Ditto.
 * g++.dg/ext/mvc2.C: Ditto.
 * g++.dg/ext/mvc3.C: Ditto.
 * g++.dg/ext/mvc4.C: Ditto.

gcc/doc
 * doc/extend.texi (target_clones): New attribute description.




diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 23e6a76..f9d28d1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3066,6 +3066,19 @@ This function attribute make a stack protection of the 
function if
  flags @option{fstack-protector} or @option{fstack-protector-strong}
  or @option{fstack-protector-explicit} are set.

+@item target_clones (@var{options})
+@cindex @code{target_clones} function attribute
+The @code{target_clones} attribute is used to specify that a function is to
+be cloned into multiple versions compiled with different target options
+than specified on the command line.  The supported options and restrictions
+are the same as for @code{target}.

"as for @code{target}" -> "as for the @code{target} attribute."

I think that makes it a tiny bit clearer.





+
+/*  Creates target clone of NODE.  */
+
+static cgraph_node *
+create_target_clone (cgraph_node *node, bool definition)
+{
+  cgraph_node *new_node;
+  if (definition)
+{
+  new_node = node->create_version_clone_with_body (vNULL, NULL,
+  NULL, false,
+  NULL, NULL,
+  "target_clone");
+  new_node->externally_visible = node->externally_visible;
+  new_node->address_taken = node->address_taken;
+  new_node->thunk = node->thunk;
+  new_node->alias = node->alias;
+  new_node->weakref = node->weakref;
+  new_node->cpp_implicit_alias = node->cpp_implicit_alias;
+  new_node->local.local = node->local.local;
So do we need to explicitly clear TREE_PUBLIC here?  It also seems like 
copying externally_visible, address_taken and possibly some of those 
other fields is wrong.  The clone is going to be local to the CU, it 
doesn't inherit those properties from the original -- only the 
dispatcher needs to inherit those properties, right?



+
+
+  for (i = 0; i < attrnum; i++)
+{
+  char *attr = attrs[i];
+  cgraph_node *new_node = create_target_clone (node, defenition);
+  char *new_asm_name =
+ XNEWVEC (char, strlen (old_asm_name) + strlen (attr) + 2);
+  create_new_asm_name (old_asm_name, attr, new_asm_name);
I thought after discussions with Jan that this wasn't going to be 
necessary as cloning should create a suitable name for us?



Jeff


Commit: Fix spelling mistake in RL78 documentation

2015-10-26 Thread Nick Clifton
Hi Guys,

  I am checking in the patch below to fix a small spelling mistake that
  I recently introduced to the RL78 documentation:

gcc/ChangeLog
2015-10-26  Nick Clifton  

* doc/invoke.texi (RL78 Options): Fix spelling mistake.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 229376)
+++ gcc/doc/invoke.texi (working copy)
@@ -19118,7 +19118,7 @@
 @option{-mmul=none} option on the command line.  Thus specifying
 @option{-mcpu=g13} enables the use of the G13 hardware multiply
 peripheral and specifying @option{-mcpu=g10} disables the use of
-hardware multipications altogether.
+hardware multiplications altogether.
 
 Note, although the RL78/G14 core is the default target, specifying
 @option{-mcpu=g14} or @option{-mcpu=rl78} on the command line does





[Ada] Renamings of volatile objects

2015-10-26 Thread Arnaud Charlet
This patch implements the following SPARK RM 7.1.3(12) rule:

   Contrary to the general SPARK 2014 rule that expression evaluation cannot
   have side effects, a read of an effectively volatile object with the
   properties Async_Writers or Effective_Reads set to True is considered to
   have an effect when read. To reconcile this discrepancy, a name denoting
   such an object shall only occur in a non-interfering context. A name occurs
   in a non-interfering context if it is:

  * the object_name of an object_renaming_declaration


-- Source --


--  volatile_renamings.ads

package Volatile_Renamings with SPARK_Mode is
   type Vol_Rec is record
  Comp : Integer;
   end record with Volatile;

   Vol_Obj_1 : Vol_Rec;
   Vol_Obj_2 : Integer with Volatile;

   Vol_Ren_1 : Vol_Rec renames Vol_Obj_1;
   Vol_Ren_2 : Integer renames Vol_Obj_2;

   function Rec_Func return Vol_Rec with Volatile_Function;
   function Int_Func return Integer with Volatile_Function;

   function Error_Rec_Func return Vol_Rec;
   function Error_Int_Func return Integer;
end Volatile_Renamings;

--  volatile_renamings.adb

package body Volatile_Renamings with SPARK_Mode is
   function Rec_Func return Vol_Rec is
   begin
  return Vol_Ren_1;  --  OK
   end Rec_Func;

   function Int_Func return Integer is
   begin
  return Vol_Ren_2;  --  OK
   end Int_Func;

   function Error_Rec_Func return Vol_Rec is
   begin
  return Vol_Ren_1;  --  Error
   end Error_Rec_Func;

   function Error_Int_Func return Integer is
   begin
  return Vol_Ren_2;  --  Error
   end Error_Int_Func;
end Volatile_Renamings;


-- Compilation and output --


$ gcc -c volatile_renamings.adb
volatile_renamings.adb:14:14: volatile object cannot appear in this context
  (SPARK RM 7.1.3(12))
volatile_renamings.adb:19:14: volatile object cannot appear in this context
  (SPARK RM 7.1.3(12))
volatile_renamings.ads:15:13: nonvolatile function "Error_Rec_Func" cannot have
  a volatile return type

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-26  Hristian Kirtchev  

* sem_res.adb (Is_OK_Volatile_Context): A volatile object may appear
in an object declaration as long as it denotes the name.

Index: sem_res.adb
===
--- sem_res.adb (revision 229362)
+++ sem_res.adb (working copy)
@@ -6993,6 +6993,13 @@
return True;
 end if;
 
+ --  The volatile object acts as the name of a renaming declaration
+
+ elsif Nkind (Context) = N_Object_Renaming_Declaration
+   and then Name (Context) = Obj_Ref
+ then
+return True;
+
  --  The volatile object appears as an actual parameter in a call to an
  --  instance of Unchecked_Conversion whose result is renamed.
 


[Ada] Single protected declaration transformation guarantee

2015-10-26 Thread Arnaud Charlet
This patch adds a check to ensure that there is no attempt to expand a single
protected declaration as the declaration should have been transformed into a
protected type along with an anonymous object. No change in behavior, no test
needed.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-26  Hristian Kirtchev  

* expander.adb (Expand): Expand a single protected declaration.
* exp_ch9.ads, exp_ch9.adb (Expand_N_Single_Protected_Declaration): New
routine.

Index: exp_ch9.adb
===
--- exp_ch9.adb (revision 229328)
+++ exp_ch9.adb (working copy)
@@ -11388,14 +11388,28 @@
   end loop;
end Expand_N_Selective_Accept;
 
+   ---
+   -- Expand_N_Single_Protected_Declaration --
+   ---
+
+   --  A single protected declaration should never be present after semantic
+   --  analysis because it is transformed into a protected type declaration
+   --  and an accompanying anonymous object. This routine ensures that the
+   --  transformation takes place.
+
+   procedure Expand_N_Single_Protected_Declaration (N : Node_Id) is
+   begin
+  raise Program_Error;
+   end Expand_N_Single_Protected_Declaration;
+
--
-- Expand_N_Single_Task_Declaration --
--
 
-   --  Single task declarations should never be present after semantic
-   --  analysis, since we expect them to be replaced by a declaration of an
-   --  anonymous task type, followed by a declaration of the task object. We
-   --  include this routine to make sure that is happening.
+   --  A single task declaration should never be present after semantic
+   --  analysis because it is transformed into a task type declaration and
+   --  an accompanying anonymous object. This routine ensures that the
+   --  transformation takes place.
 
procedure Expand_N_Single_Task_Declaration (N : Node_Id) is
begin
Index: exp_ch9.ads
===
--- exp_ch9.ads (revision 229313)
+++ exp_ch9.ads (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1992-2014, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -266,12 +266,13 @@
--  allows these two nodes to be found from the type, without benefit of
--  further attributes, using Corresponding_Record.
 
-   procedure Expand_N_Requeue_Statement  (N : Node_Id);
-   procedure Expand_N_Selective_Accept   (N : Node_Id);
-   procedure Expand_N_Single_Task_Declaration(N : Node_Id);
-   procedure Expand_N_Task_Body  (N : Node_Id);
-   procedure Expand_N_Task_Type_Declaration  (N : Node_Id);
-   procedure Expand_N_Timed_Entry_Call   (N : Node_Id);
+   procedure Expand_N_Requeue_Statement(N : Node_Id);
+   procedure Expand_N_Selective_Accept (N : Node_Id);
+   procedure Expand_N_Single_Protected_Declaration (N : Node_Id);
+   procedure Expand_N_Single_Task_Declaration  (N : Node_Id);
+   procedure Expand_N_Task_Body(N : Node_Id);
+   procedure Expand_N_Task_Type_Declaration(N : Node_Id);
+   procedure Expand_N_Timed_Entry_Call (N : Node_Id);
 
procedure Expand_Protected_Body_Declarations
  (N   : Node_Id;
Index: expander.adb
===
--- expander.adb(revision 229313)
+++ expander.adb(working copy)
@@ -432,6 +432,9 @@
when N_Selective_Accept =>
   Expand_N_Selective_Accept (N);
 
+   when N_Single_Protected_Declaration =>
+  Expand_N_Single_Protected_Declaration (N);
+
when N_Single_Task_Declaration =>
   Expand_N_Single_Task_Declaration (N);
 
@@ -471,7 +474,7 @@
when N_Variant_Part =>
   Expand_N_Variant_Part (N);
 
-  --  For all other node kinds, no expansion activity required
+   --  For all other node kinds, no expansion activity required
 
when others =>
   null;


[Ada] Ghost types, objects and synchronization

2015-10-26 Thread Arnaud Charlet
This patch implements the following SPARK RM 6.9(19) rule:

   A ghost type shall not have a task or protected part. A ghost object shall
   not be of a type which yields synchronized objects. A ghost object shall not
   have a volatile part. A synchronized state abstraction shall not be a ghost
   state abstraction.


-- Source --


--  yso.ads

--  Yields synchronized object

with Ada.Synchronous_Task_Control; use Ada.Synchronous_Task_Control;

package YSO with SPARK_Mode is

   --  Protected interface

   type Prot_Iface is protected interface;

   --  Protected type

   protected type Prot_Typ is
  entry Ent;
   end Prot_Typ;

   --  Synchronized interface

   type Sync_Iface is synchronized interface;

   --  Task interface

   type Task_Iface is task interface;

   --  Task type

   task type Task_Typ;

   --  Array type

   type Arr_Typ is array (1 .. 5) of Prot_Typ;

   --  Descendant of Suspension_Object

   type Sus_Obj is new Suspension_Object;

   --  Record type

   type Rec_Typ is record
  Comp_1 : Prot_Typ;
  Comp_2 : Task_Typ;
  Comp_3 : Arr_Typ;
  Comp_4 : Sus_Obj;
  Comp_5 : Suspension_Object;
   end record with Volatile;

   --  Type extension

   type Deriv_Typ is new Rec_Typ;
end YSO;

--  lr19.ads

with Ada.Synchronous_Task_Control; use Ada.Synchronous_Task_Control;
with YSO;  use YSO;

package LR19
  with SPARK_Mode,

   --  A synchronized state abstraction shall not be a ghost state
   --  abstraction.

   Abstract_State => ((Error_1 with Ghost, Synchronous), --  Error
  (Error_2 with Synchronous, Ghost)) --  Error
is
   --  A ghost type shall not have a task or protected type

   protected type Prot_Typ is end Prot_Typ;
   task type Task_Typ;

   protected type Error_3 with Ghost is end Error_3; --  Error
   protected Error_4 with Ghost is end Error_4;  --  Error
   task type Error_5 with Ghost; --  Error
   task Error_6 with Ghost;  --  Error

   type Error_7 is record
  Comp : Prot_Typ;   --  Error
   end record with Ghost, Volatile;

   type Error_8 is array (1 .. 3) of Task_Typ with Ghost;--  Error

   --  A ghost object shall not be of a type which yields a synchronized object

   Error_9  : Prot_Typ with Ghost;   --  Error
   Error_10 : Task_Typ with Ghost;   --  Error
   Error_11 : Arr_Typ with Ghost;--  Error
   Error_12 : Sus_Obj with Ghost;--  Error
   Error_13 : Suspension_Object with Ghost;  --  Error
   Error_14 : Rec_Typ with Ghost;--  Error
   Error_15 : Deriv_Typ with Ghost;  --  Error

   --  A ghost object shall not have a [n effectively?] volatile part

   type Vol_Int is new Integer range 1 .. 5 with Volatile;

   type Vol_Rec is record
  Comp : Vol_Int;
   end record with Volatile;

   Error_16 : Integer with Ghost, Volatile;  --  Error
   Error_17 : Vol_Int with Ghost;--  Error
   Error_18 : Vol_Rec with Ghost;--  Error
end LR19;


-- Compilation and output --


$ gcc -c lr19.ads
lr19.ads:10:27: synchronized state cannot be ghost
lr19.ads:11:27: synchronized state cannot be ghost
lr19.ads:18:32: aspect "GHOST" cannot apply to a protected type
lr19.ads:19:27: aspect "GHOST" cannot apply to a protected type
lr19.ads:20:27: aspect "GHOST" cannot apply to a task type
lr19.ads:21:22: aspect "GHOST" cannot apply to a task type
lr19.ads:23:09: ghost type "Error_7" cannot be volatile
lr19.ads:24:07: component "Comp" of ghost type "ERROR_7" cannot be concurrent
lr19.ads:27:04: ghost array type "ERROR_8" cannot have concurrent component type
lr19.ads:31:04: ghost object "Error_9" cannot be synchronized
lr19.ads:31:29: aspect "GHOST" cannot apply to a protected object
lr19.ads:32:04: ghost object "Error_10" cannot be synchronized
lr19.ads:32:29: aspect "GHOST" cannot apply to a task object
lr19.ads:33:04: ghost object "Error_11" cannot be synchronized
lr19.ads:34:04: ghost object "Error_12" cannot be synchronized
lr19.ads:35:04: ghost object "Error_13" cannot be synchronized
lr19.ads:36:04: ghost object "Error_14" cannot be synchronized
lr19.ads:37:04: ghost object "Error_15" cannot be synchronized
lr19.ads:47:04: ghost object "Error_16" cannot be volatile
lr19.ads:48:04: ghost object "Error_17" cannot be volatile
lr19.ads:49:04: ghost object "Error_18" cannot be volatile

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-26  Hristian Kirtchev  

* contracts.adb (Analyze_Object_Contract):

[PATCH] [PR tree-optimization/68013] Make sure first block in FSM path is in VISITED_BBs

2015-10-26 Thread Jeff Law


The problem here is the first block in an FSM path may not be added to 
VISITED_BBs.  That in turn results in the threader finding a path which 
passes through that initial block a second time, reaching a different 
destination the second time through.


This is caught by the assertion (to avoid generating incorrect code). 
Fixing is simple.  But...
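
To illustrate the failure mode described above, here is a minimal sketch in plain C. It is not the GCC code: the toy CFG structure and the names toy_cfg, reaches_backward and find_def_block are hypothetical, invented only for this illustration. It walks predecessor edges backward from the last block of a path and shows that, unless the first block of the path is seeded into the visited set, the search can pass through that block a second time.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 16

/* Hypothetical toy CFG: preds[b] lists the predecessors of block b.  */
struct toy_cfg
{
  int n_preds[MAX_BLOCKS];
  int preds[MAX_BLOCKS][MAX_BLOCKS];
};

/* Walk predecessor edges backward from BB looking for TARGET.  Blocks
   already in VISITED are never entered again.  */
static bool
reaches_backward (const struct toy_cfg *cfg, int bb, int target,
                  bool visited[MAX_BLOCKS])
{
  if (bb == target)
    return true;
  if (visited[bb])
    return false;
  visited[bb] = true;
  for (int i = 0; i < cfg->n_preds[bb]; i++)
    if (reaches_backward (cfg, cfg->preds[bb][i], target, visited))
      return true;
  return false;
}

/* Search backward from the predecessors of LAST_BB for DEF_BB.  When
   SEED_FIRST is true, FIRST_BB (the first block of the existing path)
   is marked visited up front, so the search cannot re-enter it.  */
static bool
find_def_block (const struct toy_cfg *cfg, int last_bb, int def_bb,
                int first_bb, bool seed_first)
{
  bool visited[MAX_BLOCKS] = { false };

  if (seed_first)
    visited[first_bb] = true;

  for (int i = 0; i < cfg->n_preds[last_bb]; i++)
    if (reaches_backward (cfg, cfg->preds[last_bb][i], def_bb, visited))
      return true;
  return false;
}
```

With a CFG where block 0 (the first block of the path) is the only predecessor of block 1 and block 2 is the only predecessor of block 0, the unseeded search from block 1 finds block 2 by going through block 0 again, while the seeded search rejects that route.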


I'm not real happy with how the affected code is structured, but I'm at
a loss right now how to reorganize it in the immediate term.  I'm hoping
to get some clarity as I continue to look to replace the EDGE_DFS_BACK
support in the old threader with the FSM threader (fixing 67892 along
the way).


Anyway, bootstrapped and regression tested on x86_64-linux-gnu. 
Installed on the trunk.


Jeff
commit 803e64f4cf1221027844db60db0a480e64653664
Author: law 
Date:   Mon Oct 26 15:36:04 2015 +

[PATCH] [PR tree-optimization/68013] Make sure first block in FSM path
is in VISITED_BBs

PR tree-optimization/68013
* tree-ssa-threadbackward.c
(fsm_find_control_statement_thread_paths): Make sure the first block
in the path is in VISITED_BBs.

PR tree-optimization/68013
* gcc.c-torture/compile/pr68013.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229375 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 95479f3..b5cfa1e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2015-10-26  Jeff Law  
+
+   PR tree-optimization/68013
+   * tree-ssa-threadbackward.c
+   (fsm_find_control_statement_thread_paths): Make sure the first block
+   in the path is in VISITED_BBs.
+
 2015-10-26  Richard Biener  
Dominik Vogt  
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index fd5ade4..688f745 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-10-26  Jeff Law  
+
+   PR tree-optimization/68013
+   * gcc.c-torture/compile/pr68013.c: New test.
+
 2015-10-26  Richard Biener  
Dominik Vogt  
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr68013.c 
b/gcc/testsuite/gcc.c-torture/compile/pr68013.c
new file mode 100644
index 000..cc500da
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr68013.c
@@ -0,0 +1,16 @@
+int a, b, c, d, e, f;
+
+void
+fn1 ()
+{
+  for (; e;)
+{
+  e = f;
+  for (; b;)
+   {
+ b = a;
+ f = a || d ? 0 : c;
+   }
+  d = 0;
+}
+}
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 9128094..cfb4ace 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -136,6 +136,11 @@ fsm_find_control_statement_thread_paths (tree name,
   vec *next_path;
   vec_alloc (next_path, n_basic_blocks_for_fn (cfun));
 
+  /* When VAR_BB == LAST_BB_IN_PATH, then the first block in the path
+will already be in VISITED_BBS.  When they are not equal, then we
+must ensure that first block is accounted for to ensure we do not
+create bogus jump threading paths.  */
+  visited_bbs->add ((*path)[0]);
   FOR_EACH_EDGE (e, ei, last_bb_in_path->preds)
{
  hash_set *visited_bbs = new hash_set;


[Ada] References to task and protected types in aspects/pragmas

2015-10-26 Thread Arnaud Charlet
This patch implements the following rules from SPARK RM 6.1.4:

   For purposes of the rules concerning the Global, Depends, Refined_Global,
   and Refined_Depends aspects, when any of these aspects are specified for a
   task unit the task unit's body is considered to be the body of a procedure
   and the current instance of the task unit is considered to be a formal
   parameter (of that notional procedure) of mode IN OUT.

   Similarly, for purposes of the rules concerning the Global, Refined_Global,
   Depends, and Refined_Depends aspects as they apply to protected operations,
   the current instance of the enclosing protected unit is considered to be a
   formal parameter (of mode IN for a protected function, of mode IN OUT
   otherwise) and a protected entry is considered to be a protected procedure.

The patch also introduces the concept of a body "freezing" the contract of its
initial declaration.


-- Source --


--  synchronized_contracts.ads

package Synchronized_Contracts
  with SPARK_Mode,
   Abstract_State => State
is
   Var : Integer := 1;

   protected type Prot_Typ_1 is
  entry Prot_Ent (Formal : out Integer)
with Global  => (Input => (State, Var)),
 Depends => ((Prot_Typ_1, Formal) => (State, Var, Prot_Typ_1));
   end Prot_Typ_1;

   protected Prot_Typ_2 is
  entry Prot_Ent (Formal : out Integer);
  pragma Global  ((Input => State));
  pragma Depends ((Formal => State));
   end Prot_Typ_2;

   task type Task_Typ_1
 with Global  => (Input => State, Output => Var),
  Depends => ((Var, Task_Typ_1) => (State, Task_Typ_1));

   task Task_Typ_2;
   pragma Global  ((Output => (State, Var)));
   pragma Depends (((State, Var) => null));
end Synchronized_Contracts;

--  synchronized_contracts.adb

package body Synchronized_Contracts
  with SPARK_Mode,
   Refined_State => (State => Constit)
is
   Constit : Integer := 2;

   protected body Prot_Typ_1 is
  entry Prot_Ent (Formal : out Integer) when True is
 pragma Refined_Global  ((Input => (Constit, Var)));
 pragma Refined_Depends (((Prot_Typ_1, Formal) =>
 (Constit, Var, Prot_Typ_1)));
  begin
 Formal := Constit + Var;
  end Prot_Ent;
   end Prot_Typ_1;

   protected body Prot_Typ_2 is
  entry Prot_Ent (Formal : out Integer)
with Refined_Global  => (Input => Constit),
 Refined_Depends => (Formal => Constit)
when True is
  begin
 Formal := Constit + 1;
  end Prot_Ent;
   end Prot_Typ_2;

   task body Task_Typ_1 is
  pragma Refined_Global  ((Input => Constit, Output => Var));
  pragma Refined_Depends (((Var, Task_Typ_1) => (Constit, Task_Typ_1)));
   begin
  null;
   end Task_Typ_1;

   task body Task_Typ_2
 with Refined_Global  => (Output => (Constit, Var)),
  Refined_Depends => ((Constit, Var) => null)
   is
   begin
  null;
   end Task_Typ_2;
end Synchronized_Contracts;

-- Compilation --

$ gcc -c synchronized_contracts.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-10-26  Hristian Kirtchev  

* atree.ads, atree.adb (Ekind_In): New 10 and 11 parameter versions.
* contracts.ads, contracts.adb (Analyze_Initial_Declaration_Contract):
New routine.
* sem_ch6.adb (Analyze_Generic_Subprogram_Body):
Analyze the contract of the initial declaration.
(Analyze_Subprogram_Body_Helper): Analyze the contract of the
initial declaration.
* sem_ch7.adb (Analyze_Package_Body_Helper): Analyze the contract
of the initial declaration.
* sem_ch9.adb (Analyze_Entry_Body): Analyze the contract of
the initial declaration.
(Analyze_Protected_Body): Analyze
the contract of the initial declaration.
(Analyze_Task_Body): Analyze the contract of the initial declaration.
* sem_prag.adb (Add_Entity_To_Name_Buffer): Use "type" rather
than "unit" as it makes the error messages sound better.
(Add_Item_To_Name_Buffer): Update comment on usage. The routine
now supports discriminants and current instances of concurrent
types.
(Analyze_Depends_In_Decl_Part): Install the discriminants
of a task type.
(Analyze_Global_In_Decl_Part): Install the discriminants of a task type.
(Analyze_Global_Item): Add processing for current instances of
concurrent types and include discriminants as valid global items.
(Analyze_Input_Output): Discriminants and current instances of
concurrent types are now valid items. Update various error messages.
(Check_Usage): Current instances of protected and task types behave
as formal parameters.
(Collect_Subprogram_Inputs_Outputs): There is
no longer a need to manually analyze [Refined_]Global thanks to
freezing of initial declaration contracts.  Add pr

Re: [vec-cmp, patch 4/6] Support vector mask invariants

2015-10-26 Thread Richard Biener
On Wed, Oct 14, 2015 at 6:13 PM, Ilya Enkovich  wrote:
> On 14 Oct 13:50, Ilya Enkovich wrote:
>> 2015-10-14 11:49 GMT+03:00 Richard Biener :
>> > On Tue, Oct 13, 2015 at 4:52 PM, Ilya Enkovich  
>> > wrote:
>> >> I don't understand what you mean. vect_get_vec_def_for_operand has two
>> >> changes made.
>> >> 1. For boolean invariants use build_same_sized_truth_vector_type
>> >> instead of get_vectype_for_scalar_type in case statement produces a
>> >> boolean vector. This covers cases when we use invariants in
>> >> comparison, AND, IOR, XOR.
>> >
>> > Yes, I understand we need this special-casing to differentiate between
>> > the vector type
>> > used for boolean-typed loads/stores and the type for boolean typed 
>> > constants.
>> > What happens if we mix them btw, like with
>> >
>> >   _Bool b = bools[i];
>> >   _Bool c = b || d;
>> >   ...
>> >
>> > ?
>>
>> Here both statements should get vector of char as a vectype and we
>> never go VECTOR_BOOLEAN_TYPE_P way for them
>>
>> >
>> >> 2. COND_EXPR is an exception because it has built-in boolean vector
>> >> result not reflected in its vecinfo. Thus I added additional operand
>> >> for vect_get_vec_def_for_operand to directly specify vectype for
>> >> vector definition in case it is a loop invariant.
>> >> So what do you propose to do with these changes?
>> >
>> > This is the change I don't like and don't see why we need it.  It works 
>> > today
>> > and the comparison operands should be of appropriate type already?
>>
>> Today it works because we always create vector of integer constant.
>> With boolean vectors it may be either integer vector or boolean vector
>> depending on context. Consider:
>>
>> _Bool _1;
>> int _2;
>>
>> _2 = _1 != 0 ? 0 : 1
>>
>> We have two zero constants here requiring different vectypes.
>>
>> Ilya
>>
>> >
>> > Richard.
>> >
>> >> Thanks,
>> >> Ilya
>
> Here is an updated patch version.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-10-14  Ilya Enkovich  
>
> * expr.c (const_vector_mask_from_tree): New.
> (const_vector_from_tree): Use const_vector_mask_from_tree
> for boolean vectors.
> * tree-vect-stmts.c (vect_init_vector): Support boolean vector
> invariants.
> (vect_get_vec_def_for_operand): Add VECTYPE arg.
> (vectorizable_condition): Directly provide vectype for invariants
> used in comparison.
> * tree-vectorizer.h (vect_get_vec_def_for_operand): Add VECTYPE
> arg.
>
>
> diff --git a/gcc/expr.c b/gcc/expr.c
> index b5ff598..ab25d1a 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -11344,6 +11344,40 @@ try_tablejump (tree index_type, tree index_expr, 
> tree minval, tree range,
>return 1;
>  }
>
> +/* Return a CONST_VECTOR rtx representing vector mask for
> +   a VECTOR_CST of booleans.  */
> +static rtx
> +const_vector_mask_from_tree (tree exp)
> +{
> +  rtvec v;
> +  unsigned i;
> +  int units;
> +  tree elt;
> +  machine_mode inner, mode;
> +
> +  mode = TYPE_MODE (TREE_TYPE (exp));
> +  units = GET_MODE_NUNITS (mode);
> +  inner = GET_MODE_INNER (mode);
> +
> +  v = rtvec_alloc (units);
> +
> +  for (i = 0; i < VECTOR_CST_NELTS (exp); ++i)
> +{
> +  elt = VECTOR_CST_ELT (exp, i);
> +
> +  gcc_assert (TREE_CODE (elt) == INTEGER_CST);
> +  if (integer_zerop (elt))
> +   RTVEC_ELT (v, i) = CONST0_RTX (inner);
> +  else if (integer_onep (elt)
> +  || integer_minus_onep (elt))
> +   RTVEC_ELT (v, i) = CONSTM1_RTX (inner);
> +  else
> +   gcc_unreachable ();
> +}
> +
> +  return gen_rtx_CONST_VECTOR (mode, v);
> +}
> +
>  /* Return a CONST_VECTOR rtx for a VECTOR_CST tree.  */
>  static rtx
>  const_vector_from_tree (tree exp)
> @@ -11359,6 +11393,9 @@ const_vector_from_tree (tree exp)
>if (initializer_zerop (exp))
>  return CONST0_RTX (mode);
>
> +  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (exp)))
> +return const_vector_mask_from_tree (exp);
> +
>units = GET_MODE_NUNITS (mode);
>inner = GET_MODE_INNER (mode);
>
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 6a52895..01168ae 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1308,7 +1308,22 @@ vect_init_vector (gimple *stmt, tree val, tree type, 
> gimple_stmt_iterator *gsi)
>if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
> {
>   if (CONSTANT_CLASS_P (val))
> -   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
> +   {
> + /* Can't use VIEW_CONVERT_EXPR for booleans because
> +of possibly different sizes of scalar value and
> +vector element.  */
> + if (VECTOR_BOOLEAN_TYPE_P (type))
> +   {
> + if (integer_zerop (val))
> +   val = build_int_cst (TREE_TYPE (type), 0);
> + else if (integer_onep (val))
> +   val = build_int_cst (TREE_TYPE (type), 1);
> + else
> +
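
As a rough illustration of the element mapping performed by const_vector_mask_from_tree in the quoted patch (zero stays zero, while one and minus one both become an all-ones element), here is a standalone sketch in plain C. It does not use any GCC data structures; the function name bool_elts_to_mask and the choice of int32_t as the mask element type are assumptions made only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Convert N boolean-vector elements (each expected to be 0, 1 or -1)
   into mask elements: 0 stays 0, while 1 and -1 both become all-ones.
   Returns false on any other input value, which is the case where the
   quoted GCC code calls gcc_unreachable.  */
static bool
bool_elts_to_mask (const int *elts, size_t n, int32_t *mask)
{
  for (size_t i = 0; i < n; i++)
    {
      if (elts[i] == 0)
        mask[i] = 0;
      else if (elts[i] == 1 || elts[i] == -1)
        mask[i] = -1;	/* All bits set.  */
      else
        return false;
    }
  return true;
}
```

For the input {0, 1, -1, 0} this fills the mask with {0, -1, -1, 0}; an element such as 2 is rejected.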

Re: [PATCH] Fix PR67443

2015-10-26 Thread Richard Biener
On Thu, Oct 22, 2015 at 11:58 AM, Dominik Vogt  wrote:
>> Eventually we'll get another testcase so I'll leave this for
>> comments a while and will commit somewhen later this week.
>
> In case you're referring to my attempt to port the test case to
> x86:  All the efforts to reproduce the bug on x86 have failed so
> far.  It seems that Gcc is much better in handling bit field
> accesses on little endian targets (x86) than on big endian
> (s390[x], power).  In other words, on x86 earlier optimisations
> prevent the dse2 pass from having to deal with the problematic
> situation.  So, don't wait for me coming up with more test cases;
> I'll keep trying, but I'm pessimistic.

Now applied to trunk.

Richard.

> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany
>


Re: [OpenACC 8/11] device-specific lowering

2015-10-26 Thread Jakub Jelinek
On Wed, Oct 21, 2015 at 03:49:08PM -0400, Nathan Sidwell wrote:
> This patch is the device-specific half of the previous patch.  It processes
> the partition head & tail markers and loop abstraction functions inserted
> during omp lowering.

> > I don't see anything that would e.g. set the various flags that e.g. OpenMP
> > #pragma omp simd or Cilk+ #pragma simd sets, like loop->safelen,
> > loop->force_vectorize, maybe loop->simduid and promote some vars to simduid
> > arrays if that is relevant to OpenACC.

> It won't convert them into such representations.

Can you fix that incrementally?  I'd expect that code marked with acc loop
vector can't have loop carried backward lexical dependencies, at least not
within the adjacent number of iterations specified in vector clause?

> +/* Find the number of threads (POS = false), or thread number (POS =
> +   tre) for an OpenACC region partitioned as MASK.  Setup code

Typo, tre -> true.

> +static tree
> +oacc_thread_numbers (bool pos, int mask, gimple_seq *seq)
> +{
> +  tree res = pos ? NULL_TREE :  build_int_cst (unsigned_type_node, 1);

Formatting, too many spaces.

> +  if (res == NULL_TREE)
> +res = build_int_cst (integer_type_node, 0);

integer_zero_node ?

> +/* Transform IFN_GOACC_LOOP calls to actual code.  See
> +   expand_oacc_for for where these are generated.  At the vector
> +   level, we stride loops, such that each  member of a warp will

Too many spaces before member.

> +  gimple_stmt_iterator gsi = gsi_for_stmt (call);
> +  unsigned code = (unsigned)TREE_INT_CST_LOW (gimple_call_arg (call, 0));

Missing space before T.

> +  tree dir = gimple_call_arg (call, 1);
> +  tree range = gimple_call_arg (call, 2);
> +  tree step = gimple_call_arg (call, 3);
> +  tree chunk_size = NULL_TREE;
> +  unsigned mask = (unsigned)TREE_INT_CST_LOW (gimple_call_arg (call, 5));

Ditto.

> +static void
> +oacc_loop_xform_head_tail (gcall *from, int level)
> +{
> +  gimple_stmt_iterator gsi = gsi_for_stmt (from);
> +  unsigned code = TREE_INT_CST_LOW (gimple_call_arg (from, 0));
> +  tree replacement  = build_int_cst (unsigned_type_node, level);

Too many spaces.

> +  switch (gimple_call_internal_fn (call))
> + {
> + case IFN_UNIQUE:
> +   {
> + unsigned c = TREE_INT_CST_LOW (gimple_call_arg (call, 0));

Shouldn't c be of type enum ifn_unique_kind ?
What about code?
> +
> + default:
> +   break;
> + }
> +}
> +
> + break2:;

Can't you replace goto break2; with return; and
remove break2:; ?
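
The suggestion above applies whenever the label sits at the very end of the function, so nothing remains to be done after the loops. A small generic sketch in C (unrelated to the OpenACC code itself; find_first and its parameters are made up for the illustration) shows the pattern of leaving nested loops with return rather than goto:

```c
#include <stdbool.h>
#include <stddef.h>

/* Search a ROWS x COLS grid stored in row-major order for KEY and
   report the position of its first occurrence.  */
static bool
find_first (const int *grid, size_t rows, size_t cols, int key,
            size_t *row_out, size_t *col_out)
{
  for (size_t r = 0; r < rows; r++)
    for (size_t c = 0; c < cols; c++)
      if (grid[r * cols + c] == key)
        {
          *row_out = r;
          *col_out = c;
          /* Leaves both loops at once; no "goto done;" and no
             trailing "done:;" label are needed.  */
          return true;
        }

  return false;
}
```

If cleanup code had to run after the loops, the goto form (or a helper function) would still be the simpler choice.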

> +   if (TREE_INT_CST_LOW (gimple_call_arg (call, 0))
> +   == IFN_GOACC_LOOP_BOUND)
> + goto break2;
> + }
> +
> +  /* If we didn't see LOOP_BOUND, it should be in the single
> +  successor block.  */
> +  basic_block bb = single_succ (gsi_bb (gsi));
> +  gsi = gsi_start_bb (bb);
> +}
> +
> + break2:;

Similarly.
> + if (gimple_vdef (call))
> +   replace_uses_by (gimple_vdef (call),
> +gimple_vuse (call));

Why the line break in between the arguments?  The line wouldn't be really
long.

Otherwise LGTM.

Jakub


Re: [PATCH] Add missing INCLUDE_DEFAULTS_MUSL_LOCAL

2015-10-26 Thread Rich Felker
On Mon, Oct 26, 2015 at 12:32:01PM +, Szabolcs Nagy wrote:
> On 23/10/15 21:20, Joseph Myers wrote:
> >On Fri, 23 Oct 2015, Szabolcs Nagy wrote:
> >
> >>i think bsd libcs do the same, compiler headers interfering
> >>with libc headers is problematic (e.g. FLT_ROUNDS is wrong
> >>in gcc float.h, applications shouldn't see that), i'm not sure
> >
> >FLT_ROUNDS is an ordinary compiler bug (bug 30569), should be fixable
> >reasonably straightforwardly as outlined at
> >, probably within say a
> >week's work if most architecture-specific changes are left to architecture
> >maintainers.
> 
> musl tries to support old compilers in general (it can be built
> with gcc 3.x, and it should be possible to use with a wider range
> of compilers with reasonably consistent semantics, so fixing that
> bug in gcc does not help much.)
> 
> a better example would be stddef.h (it has incompatible definition
> of NULL, max_align_t etc, the ifdefs in gcc are fragile and none
> of the __need_FOO patterns match the ones musl use).
> 
> i think in general the higher level layer should come first
> (c++ first, then libc, then compiler include paths), so the one
> closer to the user gets a chance to override the ones after it,
> stdc-predef.h was a good step toward that.

musl explicitly does not support using a mix of libc headers and
compiler-provided freestanding headers. While there may be
circumstances under which no effective breakage occurs, this is merely
by chance and is not a supported usage. No effort is made by musl to
interact with the gcc headers (e.g. defining the macros they use to
prevent multiple definitions or control multiple inclusion, or testing
for whether they have already been included). It is necessary
to use the BSD-style header order, so that the libc headers get used
instead of the compiler-provided ones, to have a reliable musl target.

Rich


Re: [vec-cmp, patch 3/6] Vectorize comparison

2015-10-26 Thread Richard Biener
On Wed, Oct 14, 2015 at 6:12 PM, Ilya Enkovich  wrote:
> On 14 Oct 15:06, Ilya Enkovich wrote:
>>
>> Will send an updated version after testing.
>>
>> Thanks,
>> Ilya
>>
>
> Here is an updated patch version.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-10-14  Ilya Enkovich  
>
> * tree-vect-data-refs.c (vect_get_new_vect_var): Support 
> vect_mask_var.
> (vect_create_destination_var): Likewise.
> * tree-vect-stmts.c (vectorizable_comparison): New.
> (vect_analyze_stmt): Add vectorizable_comparison.
> (vect_transform_stmt): Likewise.
> * tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
> (enum stmt_vec_info_type): Add comparison_vec_info_type.
> (vectorizable_comparison): New.
>
>
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 8a4d489..0be0523 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -3870,6 +3870,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind 
> var_kind, const char *name)
>case vect_scalar_var:
>  prefix = "stmp";
>  break;
> +  case vect_mask_var:
> +prefix = "mask";
> +break;
>case vect_pointer_var:
>  prefix = "vectp";
>  break;
> @@ -4424,7 +4427,11 @@ vect_create_destination_var (tree scalar_dest, tree 
> vectype)
>tree type;
>enum vect_var_kind kind;
>
> -  kind = vectype ? vect_simple_var : vect_scalar_var;
> +  kind = vectype
> +? VECTOR_BOOLEAN_TYPE_P (vectype)
> +? vect_mask_var
> +: vect_simple_var
> +: vect_scalar_var;
>type = vectype ? vectype : TREE_TYPE (scalar_dest);
>
>gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 23cec8a..6a52895 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -7516,6 +7516,192 @@ vectorizable_condition (gimple *stmt, 
> gimple_stmt_iterator *gsi,
>return true;
>  }
>
> +/* vectorizable_comparison.
> +
> +   Check if STMT is comparison expression that can be vectorized.
> +   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
> +   comparison, put it in VEC_STMT, and insert it at GSI.
> +
> +   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
> +
> +bool
> +vectorizable_comparison (gimple *stmt, gimple_stmt_iterator *gsi,
> +gimple **vec_stmt, tree reduc_def,
> +slp_tree slp_node)
> +{
> +  tree lhs, rhs1, rhs2;
> +  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
> +  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
> +  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> +  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
> +  tree vec_compare;
> +  tree new_temp;
> +  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> +  tree def;
> +  enum vect_def_type dts[2] = {vect_unknown_def_type, vect_unknown_def_type};
> +  unsigned nunits;
> +  int ncopies;
> +  enum tree_code code;
> +  stmt_vec_info prev_stmt_info = NULL;
> +  int i, j;
> +  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
> +  vec vec_oprnds0 = vNULL;
> +  vec vec_oprnds1 = vNULL;
> +  gimple *def_stmt;
> +  tree mask_type;
> +  tree mask;
> +
> +  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
> +return false;
> +
> +  mask_type = vectype;
> +  nunits = TYPE_VECTOR_SUBPARTS (vectype);
> +
> +  if (slp_node || PURE_SLP_STMT (stmt_info))
> +ncopies = 1;
> +  else
> +ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
> +
> +  gcc_assert (ncopies >= 1);
> +  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> +return false;
> +
> +  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
> +  && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
> +  && reduc_def))
> +return false;
> +
> +  if (STMT_VINFO_LIVE_P (stmt_info))
> +{
> +  if (dump_enabled_p ())
> +   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +"value used after loop.\n");
> +  return false;
> +}
> +
> +  if (!is_gimple_assign (stmt))
> +return false;
> +
> +  code = gimple_assign_rhs_code (stmt);
> +
> +  if (TREE_CODE_CLASS (code) != tcc_comparison)
> +return false;
> +
> +  rhs1 = gimple_assign_rhs1 (stmt);
> +  rhs2 = gimple_assign_rhs2 (stmt);
> +
> +  if (!vect_is_simple_use_1 (rhs1, stmt, stmt_info->vinfo,
> +&def_stmt, &def, &dts[0], &vectype1))
> +return false;
> +
> +  if (!vect_is_simple_use_1 (rhs2, stmt, stmt_info->vinfo,
> +&def_stmt, &def, &dts[1], &vectype2))
> +   return false;
> +
> +  if (vectype1 && vectype2
> +  && TYPE_VECTOR_SUBPARTS (vectype1) != TYPE_VECTOR_SUBPARTS (vectype2))
> +return false;
> +
> +  vectype = vectype1 ? vectype1 : vectype2;
> +
> +  /* Invariant comparison.  */
> +  if (!vectype)
> +{
> +  vectype = build_vector_type (TREE_TYPE (rhs1), nunits);
> +  if (tree_to_shwi (TYPE_SIZE_UNIT (vectype)) != current_vector_size)

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-26 Thread Richard Biener
On Fri, Oct 23, 2015 at 5:20 PM, Alan Lawrence  wrote:
> vect_analyze_slp_instance currently only creates an slp_instance if _all_
> stores in a group fitted the same pattern. This patch splits non-matching
> groups up on vector boundaries, allowing only part of the group to be SLP'd,
> or multiple subgroups to be SLP'd differently.
>
> The algorithm could be made more efficient (we have more info available in
> the matches vector, and a single match in a vector full of non-matches means
> we will be unable to SLP). But the double-recursion case has at most log(N)
> depth and the single-recursion case is at worst N^2 in *the group width*,
> which is generally small.
>
> This could possibly be extended to hybrid SLP, but I believe that would also
> require splitting the load groups, as at present removing the bb_vinfo check
> ends up with data being stored to the wrong locations e.g. in slp-11a.c.
> Hence, leaving that extension to a future patch.
>
> Bootstrapped + check-{gcc,g++,fortran} on aarch64, x86_64 and arm (v7-a neon).
>
> Thanks, Alan
>
> gcc/ChangeLog:
>
> * tree-vect-slp.c (vect_split_slp_group): New.
> (vect_analyze_slp_instance): Recurse on subgroup(s) if
> vect_build_slp_tree fails during basic block SLP.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/bb-slp-7.c: Make that SLP group even more 
> non-isomorphic.
> * gcc.dg/vect/bb-slp-subgroups-1.c: New.
> * gcc.dg/vect/bb-slp-subgroups-2.c: New.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-7.c   |  8 ++--
>  gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-1.c | 44 ++
>  gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-2.c | 42 +
>  gcc/tree-vect-slp.c| 63 
> +-
>  4 files changed, 152 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-2.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c
> index ab54a48..b012d78 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c
> @@ -16,12 +16,12 @@ main1 (unsigned int x, unsigned int y)
>unsigned int *pout = &out[0];
>unsigned int a0, a1, a2, a3;
>
> -  /* Non isomorphic.  */
> +  /* Non isomorphic, even 64-bit subgroups.  */
>a0 = *pin++ + 23;
> -  a1 = *pin++ + 142;
> +  a1 = *pin++ * 142;
>a2 = *pin++ + 2;
>a3 = *pin++ * 31;
> -
> +
>*pout++ = a0 * x;
>*pout++ = a1 * y;
>*pout++ = a2 * x;
> @@ -47,4 +47,4 @@ int main (void)
>  }
>
>  /* { dg-final { scan-tree-dump-times "basic block vectorized" 0 "slp2" } } */
> -
> +
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-1.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-1.c
> new file mode 100644
> index 000..39c23c3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-1.c
> @@ -0,0 +1,44 @@
> +/* { dg-require-effective-target vect_int } */
> +/* PR tree-optimization/67682.  */
> +
> +#include "tree-vect.h"
> +
> +int __attribute__((__aligned__(8))) a[8];
> +int __attribute__((__aligned__(8))) b[4];
> +
> +__attribute__ ((noinline)) void
> +test ()
> +{
> +a[0] = b[0];
> +a[1] = b[1];
> +a[2] = b[2];
> +a[3] = b[3];
> +a[4] = 0;
> +a[5] = 0;
> +a[6] = 0;
> +a[7] = 0;
> +}
> +
> +int
> +main (int argc, char **argv)
> +{
> +  check_vect ();
> +
> +  for (int i = 0; i < 8; i++)
> +a[i] = 1;
> +  for (int i = 0; i < 4; i++)
> +b[i] = i + 4;
> +  __asm__ volatile ("" : : : "memory");
> +  test (a, b);
> +  __asm__ volatile ("" : : : "memory");
> +  for (int i = 0; i < 4; i++)
> +if (a[i] != i+4)
> +  abort ();
> +  for (int i = 4; i < 8; i++)
> +if (a[i] != 0)
> +  abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Basic block will be vectorized using 
> SLP" 1 "slp2" } } */
> +/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2" } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-2.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-2.c
> new file mode 100644
> index 000..06099fd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-2.c
> @@ -0,0 +1,42 @@
> +/* { dg-require-effective-target vect_int } */
> +/* PR tree-optimization/67682.  */
> +
> +#include "tree-vect.h"
> +
> +int __attribute__((__aligned__(8))) a[8];
> +int __attribute__((__aligned__(8))) b[8];
> +
> +__attribute__ ((noinline)) void
> +test ()
> +{
> +a[0] = b[0];
> +a[1] = b[1] + 1;
> +a[2] = b[2] * 2;
> +a[3] = b[3] + 3;
> +
> +a[4] = b[4] + 4;
> +a[5] = b[5] + 4;
> +a[6] = b[6] + 4;
> +a[7] = b[7] + 4;
> +}
> +
> +int
> +main (int argc, char **argv)
> +{
> +  check_vect ();
> +
> +  for (int i = 0; i < 8; i++)
> +b[i] = i + 1;
> +  __asm__ volatile ("" : : : "memory");
> +  test (a, b);
> +  __asm__ volatile ("" : : : "memory");
> +  if 
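
The idea of splitting a store group on a vector boundary, as described in the quoted message, can be sketched outside of GCC as follows. This is a simplified illustration and not the code from the patch: the function first_split_point, the plain bool array standing in for the matches vector, and the rounding rule are assumptions chosen to mirror the description (find the first non-matching element, then round down to a multiple of the vector width).

```c
#include <stdbool.h>
#include <stddef.h>

/* Given MATCHES[i] telling whether element I of a store group fits the
   pattern of the first element, return the index at which the group
   could be split so that the leading part consists only of matching
   elements and ends on a vector boundary.  VEC_WIDTH must be nonzero.

   A result of 0 means the first mismatch lies inside the first vector,
   so no matching leading subgroup exists; a result equal to GROUP_SIZE
   (when that is a multiple of VEC_WIDTH) means no split is needed.  */
static size_t
first_split_point (const bool *matches, size_t group_size,
                   size_t vec_width)
{
  size_t i = 0;

  while (i < group_size && matches[i])
    i++;

  /* Round the first mismatch down to a vector boundary.  */
  return i - i % vec_width;
}
```

For a group shaped like the one in bb-slp-subgroups-1.c (four copies followed by four constant stores) and a width of four elements, the split point is 4, leaving two subgroups that can each be analysed on their own.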

Re: Handle OBJ_TYPE_REF in operand_equal_p

2015-10-26 Thread Richard Biener
On Mon, Oct 26, 2015 at 3:34 PM, Jan Hubicka  wrote:
>> > + /* Objective-C frontend produce ill formed OBJ_TYPE_REF which
>> > +probably should be dropped before reaching middle-end.  */
>> > + if (!virtual_method_call_p (arg0) || !virtual_method_call_p 
>> > (arg1))
>> > +   return false;
>>
>> So what kind of brokenness is this?
>
> As the comment says, Objective-C uses OBJ_TYPE_REF in its own way. It also
> represents a kind of polymorphic call, but not in a way the middle-end would
> understand, because Objective-C types have no TYPE_BINFO and ODR information,
> so they can't be used for devirtualization in any way.
>
> /* Possibly rewrite a function CALL into an OBJ_TYPE_REF expression.  This
>needs to be done if we are calling a function through a cast.  */
>
> tree
> objc_rewrite_function_call (tree function, tree first_param)
> {
>   if (TREE_CODE (function) == NOP_EXPR
>   && TREE_CODE (TREE_OPERAND (function, 0)) == ADDR_EXPR
>   && TREE_CODE (TREE_OPERAND (TREE_OPERAND (function, 0), 0))
>  == FUNCTION_DECL)
> {
>   function = build3 (OBJ_TYPE_REF, TREE_TYPE (function),
>  TREE_OPERAND (function, 0),
>  first_param, size_zero_node);
> }
>
>   return function;
> }
>
> This thing simply stays in GIMPLE and serves no purpose.  I tried to remove
> it at one point, but ran into some regressions.  I plan to return to this.
> I have no idea how this was intended to work.
> Some bits were added by:
>  https://gcc.gnu.org/ml/gcc-patches/2005-04/txt00100.txt
> The code in build_objc_method_call seems to predate the changelogs.

I see.  But the above should be harmless unless you throw it onto the ODR
predicates?

>>
>> > + /* Match the type information stored in the wrapper.  */
>> > +
>> > + flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
>> > + return (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
>> > + == tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1))
>>
>> Err, please use tree_int_cst_eq ()
>
> OK, updated in my local copy, and I will update other places too; this was
> directly copied from ipa-icf-gimple.
>>
>> > + && types_same_for_odr (obj_type_ref_class (arg0),
>> > +obj_type_ref_class (arg1))
>>
>> Why do you need this check?  The args of OBJ_TYPE_REF should be fully
>> specifying the object, no?
>
> This is the type of THIS pointer that is extracted from pointer type of
> OBJ_TYPE_REF.  Because we otherwise consider different pointers compatible,
> I think we need to check it here.

Hum, this is fold-const.c and thus GENERIC.  GENERIC doesn't consider pointer
types compatible in general.  So I think it is not needed.  As you say elsewhere,
the caller is responsible for restricting the types of the trees it asks about.

>  It basically compares
>
> TREE_TYPE (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (TREE_TYPE (obj_type_ref)
>
> I suppose it can also do
>
> TYPE_METHOD_BASETYPE (TREE_TYPE (TREE_TYPE (obj_type_ref)))
>
> will look into that independently.  In any case we need to compare it.

Not on GENERIC IMHO and thus not in operand_equal_p.

Richard.

>>
>> > + && operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
>> > + OBJ_TYPE_REF_OBJECT (arg1),
>> > + flags));
>> > default:
>> >   return 0;
>> > }


[gomp4.5] target teams expression evaluation

2015-10-26 Thread Jakub Jelinek
Hi!

The OpenMP 4.5 spec says that for combined target teams the num_teams
and thread_limit expressions are evaluated on the host before the
target construct.
Additionally, this patch tries to determine during gimplification if
it is easily possible to compute those expressions on the host even
if the construct is not combined (basically if the expressions are simple
integer arithmetic using constants and variables that are either
explicitly firstprivate, or explicitly mapped always {to,tofrom}, or
implicitly firstprivatized on the target construct).

The runtime library function is then told these values (> 0 stands
for those values, 0 stands for clauses not specified, meaning implementation
defined values can be used, -1 stands for could not determine what will be
passed; for missing teams construct which is essentially a request to have
exactly 1 team and implementation defined number of maximum threads we pass
1, 0).
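
The encoding described above can be sketched roughly as follows.  This is
only an illustration of the convention stated in this message, not code from
the patch: ClauseState, encode_clause, TeamsArgs and
encode_no_teams_construct are hypothetical names, and the host value is
assumed to be positive when it is computable.

```cpp
#include <cassert>

// Hypothetical summary of what the compiler knows about one clause
// (num_teams or thread_limit) when it emits the target call.
enum class ClauseState { Absent, HostComputable, NotComputable };

// Encode one clause following the convention described in the message:
//   > 0  the value, evaluated on the host
//     0  clause not specified (implementation defined value may be used)
//    -1  could not determine on the host what will be passed
int encode_clause(ClauseState state, int host_value) {
  switch (state) {
    case ClauseState::Absent:
      return 0;
    case ClauseState::HostComputable:
      return host_value;  // assumed > 0 by the caller
    case ClauseState::NotComputable:
      return -1;
  }
  return -1;
}

// Pair of values handed to the runtime (hypothetical struct).
struct TeamsArgs {
  int num_teams;
  int thread_limit;
};

// A target region with no teams construct at all: exactly one team and an
// implementation defined maximum number of threads, i.e. (1, 0).
TeamsArgs encode_no_teams_construct() { return {1, 0}; }
```

With this convention the runtime can tell "nothing was requested" (0) apart
from "something was requested, but only the device knows the value" (-1).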

Regtested on x86_64-linux, both intelmicemul and no offloading.

As an incremental change, the GOMP_target_ext call could probably serve
as the GOMP_teams call too if both of those values are >= 0, and then the
GOMP_teams call should be removed.

2015-10-26  Jakub Jelinek  

gcc/
* gimplify.c (enum gimplify_omp_var_data): Add GOVD_MAP_ALWAYS_TO.
(gimplify_scan_omp_clauses): Or in GOVD_MAP_ALWAYS_TO for
GOMP_MAP_ALWAYS_TO or GOMP_MAP_ALWAYS_TOFROM kinds.
(find_omp_teams, computable_teams_clause, optimize_target_teams): New
functions.
(gimplify_omp_workshare): Call optimize_target_teams.
* omp-low.c (expand_omp_target): Pass num_teams and thread_limit
arguments to BUILT_IN_GOMP_TARGET.
* omp-builtins.def (BUILT_IN_GOMP_TARGET): Rename GOMP_target_41
to GOMP_target_ext.  Add num_teams and thread_limit arguments.
(BUILT_IN_GOMP_TARGET_DATA): Rename GOMP_target_data_41
to GOMP_target_data_ext.
(BUILT_IN_GOMP_TARGET_UPDATE): Rename GOMP_target_update_41
to GOMP_target_update_ext.
* builtin-types.def
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR): Remove.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): New.
gcc/c/
* c-parser.c: Include gimple-expr.h.
(c_parser_omp_target): Evaluate num_teams and thread_limit
expressions on combined target teams before the target.
gcc/cp/
* parser.c (cp_parser_omp_target): Evaluate num_teams and
thread_limit expressions on combined target teams before the
target.
* pt.c (tsubst_find_omp_teams): New function.
(tsubst_expr): Evaluate num_teams and thread_limit expressions on
combined target teams before the target.
gcc/fortran/
* types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR): Remove.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): New.
gcc/testsuite/
* c-c++-common/gomp/target-teams-1.c: New test.
* g++.dg/gomp/target-teams-1.C: New test.
include/
* gomp-constants.h (GOMP_TARGET_FLAG_NOWAIT): Adjust comment.
libgomp/
* target.c (GOMP_target_41): Renamed to ...
(GOMP_target_ext): ... this.  Add num_teams and thread_limit
arguments.
(GOMP_target_data_41): Renamed to ...
(GOMP_target_data_ext): ... this.
(GOMP_target_update_41): Renamed to ...
(GOMP_target_update_ext): ... this.
* libgomp.map (GOMP_4.5): Export GOMP_target_ext,
GOMP_target_data_ext and GOMP_target_update_ext instead of
GOMP_target_41, GOMP_target_data_41 and GOMP_target_update_41.
* libgomp_g.h (GOMP_target_41): Renamed to ...
(GOMP_target_ext): ... this.  Add num_teams and thread_limit
arguments.
(GOMP_target_data_41): Renamed to ...
(GOMP_target_data_ext): ... this.
(GOMP_target_update_41): Renamed to ...
(GOMP_target_update_ext): ... this.
* testsuite/libgomp.c/target-teams-1.c: New test.

--- gcc/gimplify.c.jj   2015-10-22 18:01:45.0 +0200
+++ gcc/gimplify.c  2015-10-23 17:44:44.891528688 +0200
@@ -93,6 +93,9 @@ enum gimplify_omp_var_data
 
   GOVD_MAP_0LEN_ARRAY = 32768,
 
+  /* Flag for GOVD_MAP, if it is always, to or always, tofrom mapping.  */
+  GOVD_MAP_ALWAYS_TO = 65536,
+
   GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
   | GOVD_LOCAL)
@@ -6757,6 +6760,9 @@ gimplify_scan_omp_clauses (tree *list_p,
  break;
}
  flags = GOVD_MAP | GOVD_EXPLICIT;
+ if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_TO
+ || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_TOFROM)
+   flags |= GOVD_MAP_ALWAYS_TO;
  goto do_add;
 
case OMP_CLAUSE_DEPEND:
@@ -8543,6 +8549,201 @@ gimplify_omp_for (tree *expr_p, gimple_s
   return GS_ALL_DONE;
 }
 
+/* Helper function of optimize_target_teams, find 

Re: [PATCH] PR fortran/36192 -- Check for valid BT_INTEGER

2015-10-26 Thread FX
> Because the code issues two errors, one for each dimension.

Then shouldn’t it be “string.*string” to match two occurrences of the string,
with some stuff (incl. newline) in the middle?

FX

Re: [gomp4] Adjust UNQUE ifn

2015-10-26 Thread Nathan Sidwell

On 10/26/15 07:36, Richard Biener wrote:


+ {
  #ifdef HAVE_oacc_fork

(etc.)  can you use target-insn.def and targetm.have_oacc_fork () instead?


I can try ...


nathan


Re: C++ PATCH for DR 1518 (c++/54835, c++/60417)

2015-10-26 Thread Jason Merrill

On 10/25/2015 09:04 PM, Ville Voutilainen wrote:

On 25 October 2015 at 22:15, Ville Voutilainen
 wrote:

It seems to me that there's a discrepancy in handling explicit
default constructors. Based on my tests, this is rejected, as expected:

struct X {explicit X() {}};

void f(X) {}

int main()
{
 f({});
}

However, if the explicit constructor is defaulted, gcc accepts the code:

struct X {explicit X() = default;};

void f(X) {}

int main()
{
 f({});
}


And to clarify, I'd expect both of those snippets to be rejected, but only the
former is.


The latter is accepted because the second X is an aggregate, and the 
aggregate initialization bullet comes before value-initialization in 8.5.4.


Jason



Re: [PATCH] Fix PR68087

2015-10-26 Thread Markus Trippelsdorf
On 2015.10.26 at 13:26 +0100, Richard Biener wrote:
> On Mon, Oct 26, 2015 at 12:32 PM, Markus Trippelsdorf 
>  wrote:
> > diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> > index ebca411b3eb4..0828a90b0e75 100644
> > --- a/gcc/cp/constexpr.c
> > +++ b/gcc/cp/constexpr.c
> > @@ -1782,8 +1782,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, 
> > tree t,
> >gcc_unreachable ();
> >  }
> >
> > -  i = tree_to_shwi (index);
> > -  if (i < 0)
> > +  if (!tree_fits_shwi_p (index) || tree_to_shwi (index) < 0)
> >  {
> 
> Err, but that also catches very large positive constants.  Why not use
> wi::lt (index, 0)?
> Or is index to be interpreted as signed?  Then use wi::lts (index, 0).

I think the compiler will reject the array as too large before that can
happen. But anyway, thanks for your suggestion. Here is an updated
patch, that still passes bootstrap and regtesting on ppc64le.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index ebca411..11e4ef6 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1782,8 +1782,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree 
t,
   gcc_unreachable ();
 }
 
-  i = tree_to_shwi (index);
-  if (i < 0)
+  if (wi::lts_p (index, 0))
 {
   if (!ctx->quiet)
error ("negative array subscript");
@@ -1792,6 +1791,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree 
t,
 }
 
   bool found;
+  i = tree_to_shwi (index);
   if (TREE_CODE (ary) == CONSTRUCTOR)
 {
   HOST_WIDE_INT ix = find_array_ctor_elt (ary, index);
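
The ordering in the updated patch (sign test first, narrowing conversion
afterwards) can be illustrated with a toy model.  This is not GCC's wide-int
API: WideConst, is_negative, fits_shwi and classify are hypothetical names,
and a 128-bit two's-complement pair stands in for an INTEGER_CST that may be
wider than a host word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit two's-complement constant: 'hi' holds the upper
// 64 bits (including the sign), 'lo' the lower 64 bits.
struct WideConst {
  uint64_t lo;
  int64_t hi;
};

// Sign test that is valid at full precision, whatever the magnitude.
bool is_negative(const WideConst &c) { return c.hi < 0; }

// The value fits a signed 64-bit host integer only if the upper half is
// the sign extension of the lower half.
bool fits_shwi(const WideConst &c) {
  bool lo_top_bit = (c.lo >> 63) != 0;
  return c.hi == (lo_top_bit ? -1 : 0);
}

enum class Subscript { Negative, TooLarge, Ok };

// Classify an array subscript without narrowing it first.
Subscript classify(const WideConst &index) {
  if (is_negative(index))
    return Subscript::Negative;  // diagnose "negative array subscript"
  if (!fits_shwi(index))
    return Subscript::TooLarge;  // narrowing here would not be valid
  return Subscript::Ok;          // safe to convert to a host integer
}
```

The point of the sketch is only that the sign can be decided before any
conversion to a host-sized integer is attempted.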

-- 
Markus


Re: [PATCH 0/4] OpenMP 4.0 offloading to Intel MIC

2015-10-26 Thread Ilya Verbin
On Fri, Oct 23, 2015 at 10:10:06 +0200, Jakub Jelinek wrote:
> On Thu, Oct 22, 2015 at 09:26:37PM +0300, Ilya Verbin wrote:
> > On Mon, Dec 22, 2014 at 13:01:40 +0100, Thomas Schwinge wrote:
> > > By chance (when tracking down a different problem), I've found the
> > > following.  Would you please check whether that's a real problem in
> > > liboffloadmic, or its libgomp plugin, or just a mis-diagnosis by
> > > Valgrind?
> > > 
> > > ==21327== Syscall param write(buf) points to uninitialised byte(s)
> > 
> > Finally we have investigated this :)  Valgrind warns about uninitialized
> > bytes inserted into the struct for alignment.  It's possible to avoid the
> > warning with the patch below.  Should I commit it, or just leave it as is?
> 
> Or use calloc instead of malloc, or add two uint8_t padding fields after the
> two uint8_t fields and initialize them too.  Though, as you have some
> padding after the name, I think calloc is best.
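
A minimal sketch of the calloc suggestion, assuming a made-up descriptor
layout (Descriptor below is hypothetical and is not liboffloadmic's
FunctionDescriptor): calloc hands back zeroed storage, so padding bytes and
any trailing space start out initialized even though no member assignment
ever names them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical descriptor: two one-byte flags followed by a wider field,
// so the compiler will typically insert padding after the flags.
struct Descriptor {
  uint8_t console_enabled;
  uint8_t timer_enabled;
  int32_t offload_number;
};

// Allocate the descriptor plus 'extra_bytes' of trailing data, all zeroed.
// With malloc the padding and the tail would hold indeterminate bytes.
Descriptor *alloc_descriptor(std::size_t extra_bytes) {
  return static_cast<Descriptor *>(
      std::calloc(1, sizeof(Descriptor) + extra_bytes));
}

// Helper: check that every byte in [p, p + n) is zero.
bool all_bytes_zero(const void *p, std::size_t n) {
  const unsigned char *bytes = static_cast<const unsigned char *>(p);
  for (std::size_t i = 0; i < n; ++i)
    if (bytes[i] != 0)
      return false;
  return true;
}
```

Compared with adding explicit padding fields, this also covers whatever
padding follows the variable-length tail, which is the case mentioned above.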

Here is what I committed to trunk together with an obvious change.


liboffloadmic/
* runtime/offload_host.cpp (OffloadDescriptor::setup_misc_data): Use
calloc instead of malloc.
(__offload_fini_library): Set mic_engines_total to zero.


diff --git a/liboffloadmic/runtime/offload_host.cpp 
b/liboffloadmic/runtime/offload_host.cpp
index c6c6518..a150410 100644
--- a/liboffloadmic/runtime/offload_host.cpp
+++ b/liboffloadmic/runtime/offload_host.cpp
@@ -2424,8 +2424,8 @@ bool OffloadDescriptor::setup_misc_data(const char *name)
 }
 
 // initialize function descriptor
-m_func_desc = (FunctionDescriptor*) malloc(m_func_desc_size +
-   misc_data_size);
+m_func_desc = (FunctionDescriptor*) calloc(1, m_func_desc_size
+ + misc_data_size);
 if (m_func_desc == NULL)
   LIBOFFLOAD_ERROR(c_malloc);
 m_func_desc->console_enabled = console_enabled;
@@ -5090,6 +5090,7 @@ static void __offload_fini_library(void)
 OFFLOAD_DEBUG_TRACE(2, "Cleanup offload library ...\n");
 if (mic_engines_total > 0) {
 delete[] mic_engines;
+mic_engines_total = 0;
 
 if (mic_proxy_fs_root != 0) {
 free(mic_proxy_fs_root);


  -- Ilya


Re: [gomp4] Adjust UNQUE ifn

2015-10-26 Thread Richard Biener
On Mon, Oct 26, 2015 at 3:26 PM, Nathan Sidwell  wrote:
> On 10/26/15 02:10, Thomas Schwinge wrote:
>>
>> Hi Nathan!
>
>  }
>>
>>
>>  [...]/source-gcc/gcc/internal-fn.c: In function 'void
>> expand_UNIQUE(gcall*)':
>>  [...]/source-gcc/gcc/internal-fn.c:1982:6: error: variable 'target'
>> set but not used [-Werror=unused-but-set-variable]
>>rtx target = const0_rtx;
>>^
>>  [...]/source-gcc/gcc/internal-fn.c:1987:6: error: unused variable
>> 'data_dep' [-Werror=unused-variable]
>>rtx data_dep = expand_normal (gimple_call_arg (stmt, 1));
>>^
>>  [...]/source-gcc/gcc/internal-fn.c:1988:6: error: unused variable
>> 'axis' [-Werror=unused-variable]
>>rtx axis = expand_normal (gimple_call_arg (stmt, 2));
>
>
>
> Fixed thusly.

Looks better now.

+ {
 #ifdef HAVE_oacc_fork

(etc.)  can you use target-insn.def and targetm.have_oacc_fork () instead?

> nathan


Re: Handle OBJ_TYPE_REF in operand_equal_p

2015-10-26 Thread Jan Hubicka
> > + /* Objective-C frontend produce ill formed OBJ_TYPE_REF which
> > +probably should be dropped before reaching middle-end.  */
> > + if (!virtual_method_call_p (arg0) || !virtual_method_call_p 
> > (arg1))
> > +   return false;
> 
> So what kind of brokeness is this?

As the comment says, Objective-C uses OBJ_TYPE_REF in its own way.  It also
represents a kind of polymorphic call, but not in a way the middle-end would
understand, because Objective-C types have no TYPE_BINFO and ODR information,
so they can't be used for devirtualization in any way.

/* Possibly rewrite a function CALL into an OBJ_TYPE_REF expression.  This
   needs to be done if we are calling a function through a cast.  */

tree
objc_rewrite_function_call (tree function, tree first_param)
{
  if (TREE_CODE (function) == NOP_EXPR
  && TREE_CODE (TREE_OPERAND (function, 0)) == ADDR_EXPR
  && TREE_CODE (TREE_OPERAND (TREE_OPERAND (function, 0), 0))
 == FUNCTION_DECL)
{
  function = build3 (OBJ_TYPE_REF, TREE_TYPE (function),
 TREE_OPERAND (function, 0),
 first_param, size_zero_node);
}

  return function;
}

This thing simply stays in GIMPLE and serves no purpose.  I tried to remove
it at one point, but ran into some regressions.  I plan to return to this.
I have no idea how this was intended to work.
Some bits were added by:
 https://gcc.gnu.org/ml/gcc-patches/2005-04/txt00100.txt
The code in build_objc_method_call seems to predate the changelogs.
> 
> > + /* Match the type information stored in the wrapper.  */
> > +
> > + flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
> > + return (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
> > + == tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1))
> 
> Err, please use tree_int_cst_eq ()

OK, updated in my local copy.  I will update the other places too; this was
directly copied from ipa-icf-gimple.
> 
> > + && types_same_for_odr (obj_type_ref_class (arg0),
> > +obj_type_ref_class (arg1))
> 
> Why do you need this check?  The args of OBJ_TYPE_REF should be fully
> specifying the object, no?

This is the type of the THIS pointer, which is extracted from the pointer type
of OBJ_TYPE_REF.  Because we otherwise consider different pointers compatible,
I think we need to check it here.  It basically compares

TREE_TYPE (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (TREE_TYPE (obj_type_ref)))))

I suppose it can also do

TYPE_METHOD_BASETYPE (TREE_TYPE (TREE_TYPE (obj_type_ref)))

I will look into that independently.  In any case we need to compare it.
> 
> > + && operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
> > + OBJ_TYPE_REF_OBJECT (arg1),
> > + flags));
> > default:
> >   return 0;
> > }


Re: [gomp4] Adjust UNQUE ifn

2015-10-26 Thread Nathan Sidwell

On 10/26/15 02:10, Thomas Schwinge wrote:

Hi Nathan!

 }


 [...]/source-gcc/gcc/internal-fn.c: In function 'void 
expand_UNIQUE(gcall*)':
 [...]/source-gcc/gcc/internal-fn.c:1982:6: error: variable 'target' set 
but not used [-Werror=unused-but-set-variable]
   rtx target = const0_rtx;
   ^
 [...]/source-gcc/gcc/internal-fn.c:1987:6: error: unused variable 
'data_dep' [-Werror=unused-variable]
   rtx data_dep = expand_normal (gimple_call_arg (stmt, 1));
   ^
 [...]/source-gcc/gcc/internal-fn.c:1988:6: error: unused variable 'axis' 
[-Werror=unused-variable]
   rtx axis = expand_normal (gimple_call_arg (stmt, 2));



Fixed thusly.

nathan

2015-10-26  Nathan Sidwell  

	* internal-fn.c (expand_UNIQUE): Protect fork & join inside
	combined if defined.

Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c	(revision 229297)
+++ gcc/internal-fn.c	(working copy)
@@ -1978,6 +1978,7 @@ expand_UNIQUE (gcall *stmt)
 case IFN_UNIQUE_OACC_FORK:
 case IFN_UNIQUE_OACC_JOIN:
   {
+#if defined (HAVE_oacc_fork) && defined (HAVE_oacc_join)
 	tree lhs = gimple_call_lhs (stmt);
 	rtx target = const0_rtx;
 
@@ -1988,21 +1989,12 @@ expand_UNIQUE (gcall *stmt)
 	rtx axis = expand_normal (gimple_call_arg (stmt, 2));
 
 	if (code == IFN_UNIQUE_OACC_FORK)
-	  {
-#ifdef HAVE_oacc_fork
-	pattern = gen_oacc_fork (target, data_dep, axis);
-#else
-	gcc_unreachable ();
-#endif
-	  }
+	  pattern = gen_oacc_fork (target, data_dep, axis);
 	else
-	  {
-#ifdef HAVE_oacc_join
-	pattern = gen_oacc_join (target, data_dep, axis);
+	  pattern = gen_oacc_join (target, data_dep, axis);
 #else
-	gcc_unreachable ();
+	gcc_unreachable ();
 #endif
-	  }
   }
   break;
 }


[PATCH] Remove fold_call_stmt calls, cleanup DOM threadedge a bit

2015-10-26 Thread Richard Biener

The following removes calls to fold_call_stmt as that does not end up
exercising all the match.pd rules we have for builtins now.  Instead
use gimple_fold_stmt_to_constant[_1].

It also removes some awkward manual re-folding code from
tree-ssa-threadedge.c and notes further cleanup opportunities around
its callback.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-10-26  Richard Biener  

* tree-object-size.c: Remove builtins.h include, include tree-cfg.h.
(do_valueize): New function.
(pass_object_sizes::execute): Use gimple_fold_stmt_to_constant and
replace_uses_by.
* tree-ssa-threadedge.c: Remove builtins.h include, include
gimple-fold.h
(fold_assignment_stmt): Remove.
(threadedge_valueize): New function.
(record_temporary_equivalences_from_stmts): Use
gimple_fold_stmt_to_constant_1, note additional cleanup
opportunities.

Index: gcc/tree-object-size.c
===
*** gcc/tree-object-size.c  (revision 229311)
--- gcc/tree-object-size.c  (working copy)
*** along with GCC; see the file COPYING3.
*** 36,42 
  #include "gimple-iterator.h"
  #include "tree-pass.h"
  #include "tree-ssa-propagate.h"
! #include "builtins.h"
  
  struct object_size_info
  {
--- 36,42 
  #include "gimple-iterator.h"
  #include "tree-pass.h"
  #include "tree-ssa-propagate.h"
! #include "tree-cfg.h"
  
  struct object_size_info
  {
*** public:
*** 1231,1236 
--- 1231,1244 
  
  }; // class pass_object_sizes
  
+ /* Dummy valueize function.  */
+ 
+ static tree
+ do_valueize (tree t)
+ {
+   return t;
+ }
+ 
  unsigned int
  pass_object_sizes::execute (function *fun)
  {
*** pass_object_sizes::execute (function *fu
*** 1287,1293 
  continue;
}
  
! result = fold_call_stmt (as_a <gcall *> (call), false);
  if (!result)
{
  tree ost = gimple_call_arg (call, 1);
--- 1295,1305 
  continue;
}
  
! tree lhs = gimple_call_lhs (call);
! if (!lhs)
!   continue;
! 
! result = gimple_fold_stmt_to_constant (call, do_valueize);
  if (!result)
{
  tree ost = gimple_call_arg (call, 1);
*** pass_object_sizes::execute (function *fu
*** 1318,1339 
  fprintf (dump_file, "\n");
}
  
- tree lhs = gimple_call_lhs (call);
- if (!lhs)
-   continue;
- 
  /* Propagate into all uses and fold those stmts.  */
! gimple *use_stmt;
! imm_use_iterator iter;
! FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
!   {
! use_operand_p use_p;
! FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
!   SET_USE (use_p, result);
! gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
! fold_stmt (&gsi);
! update_stmt (gsi_stmt (gsi));
!   }
}
  }
  
--- 1330,1337 
  fprintf (dump_file, "\n");
}
  
  /* Propagate into all uses and fold those stmts.  */
! replace_uses_by (lhs, result);
}
  }
  
Index: gcc/tree-ssa-threadedge.c
===
*** gcc/tree-ssa-threadedge.c   (revision 229311)
--- gcc/tree-ssa-threadedge.c   (working copy)
*** along with GCC; see the file COPYING3.
*** 36,42 
  #include "tree-ssa-threadedge.h"
  #include "tree-ssa-threadbackward.h"
  #include "tree-ssa-dom.h"
! #include "builtins.h"
  
  /* To avoid code explosion due to jump threading, we limit the
 number of statements we are going to copy.  This variable
--- 36,42 
  #include "tree-ssa-threadedge.h"
  #include "tree-ssa-threadbackward.h"
  #include "tree-ssa-dom.h"
! #include "gimple-fold.h"
  
  /* To avoid code explosion due to jump threading, we limit the
 number of statements we are going to copy.  This variable
*** record_temporary_equivalences_from_phis
*** 180,233 
return true;
  }
  
! /* Fold the RHS of an assignment statement and return it as a tree.
!May return NULL_TREE if no simplification is possible.  */
  
  static tree
! fold_assignment_stmt (gimple *stmt)
  {
!   enum tree_code subcode = gimple_assign_rhs_code (stmt);
! 
!   switch (get_gimple_rhs_class (subcode))
  {
! case GIMPLE_SINGLE_RHS:
!   return fold (gimple_assign_rhs1 (stmt));
! 
! case GIMPLE_UNARY_RHS:
!   {
! tree lhs = gimple_assign_lhs (stmt);
! tree op0 = gimple_assign_rhs1 (stmt);
! return fold_unary (subcode, TREE_TYPE (lhs), op0);
!   }
! 
! case GIMPLE_BINARY_RHS:
!   {
! tree lhs = gimple_assign_lhs (stmt);
! tree op0 = gimple_assign_rhs1 (stmt);
! tree op1 = gimple_assign_rhs2 (stmt);
! return fold_bi

Re: [vec-cmp, patch 6/6, i386] Add i386 support for vector comparison

2015-10-26 Thread Kirill Yukhin
Hi Ilya,
On 08 Oct 18:32, Ilya Enkovich wrote:
> Hi,
> 
> This patch adds patterns for vec_cmp optabs.  Vector comparison expand code
> was moved from VEC_COND_EXPR expanders into separate functions.  AVX-512
> patterns use simpler masked versions.
> 
> Thanks,
> Ilya
> --
> gcc/
> 
> 2015-10-08  Ilya Enkovich  
> 
>   * config/i386/i386-protos.h (ix86_expand_mask_vec_cmp): New.
>   (ix86_expand_int_vec_cmp): New.
>   (ix86_expand_fp_vec_cmp): New.
>   * config/i386/i386.c (ix86_expand_sse_cmp): Allow NULL for
>   op_true and op_false.
>   (ix86_int_cmp_code_to_pcmp_immediate): New.
>   (ix86_fp_cmp_code_to_pcmp_immediate): New.
>   (ix86_cmp_code_to_pcmp_immediate): New.
>   (ix86_expand_mask_vec_cmp): New.
>   (ix86_expand_fp_vec_cmp): New.
>   (ix86_expand_int_sse_cmp): New.
>   (ix86_expand_int_vcond): Use ix86_expand_int_sse_cmp.
>   (ix86_expand_fp_vcond): Use ix86_expand_sse_cmp.
>   (ix86_expand_int_vec_cmp): New.
>   (ix86_get_mask_mode): New.
>   (TARGET_VECTORIZE_GET_MASK_MODE): New.
>   * config/i386/sse.md (avx512fmaskmodelower): New.
>   (vec_cmp): New.
>   (vec_cmp): New.
>   (vec_cmpv2div2di): New.
>   (vec_cmpu): New.
>   (vec_cmpu): New.
>   (vec_cmpuv2div2di): New.

Patch is OK for trunk.
(Although vec_cmp and vec_cmpu might be merged w/ subst).

--
Thanks, K


Re: [gomp4.1] map clause parsing improvements

2015-10-26 Thread Ilya Verbin
On Mon, Oct 26, 2015 at 14:07:13 +0100, Jakub Jelinek wrote:
> On Mon, Oct 26, 2015 at 03:53:57PM +0300, Ilya Verbin wrote:
> > @@ -7363,7 +7363,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
> > *list_p,
> >   n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
> >   if ((ctx->region_type & ORT_TARGET) != 0
> >   && !(n->value & GOVD_SEEN)
> > - && ((OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS) == 0
> > + && (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) == 0
> >   || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_STRUCT))
> 
> The || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_STRUCT part can go then too,
> it was there only because (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS)
> has been non-zero for GOMP_MAP_STRUCT (and the () pair around the condition
> too).

Oops, missed that.

> We want to be able to remove all map clauses on the target construct, except
> if it is always {to,from,tofrom}.
> We do not want to remove release or delete, but those only exist on target
> exit data and thus are handled by (ctx->region_type & ORT_TARGET) != 0.
> 
> > @@ -142,6 +143,10 @@ enum gomp_map_kind
> >  #define GOMP_MAP_ALWAYS_FROM_P(X) \
> >(((X) == GOMP_MAP_ALWAYS_FROM) || ((X) == GOMP_MAP_ALWAYS_TOFROM))
> >  
> > +#define GOMP_MAP_ALWAYS_P(X) \
> > +  (((X) == GOMP_MAP_ALWAYS_TO) || ((X) == GOMP_MAP_ALWAYS_FROM) \
> > +   || ((X) == GOMP_MAP_ALWAYS_TOFROM))
> 
> You could simplify this e.g. to
>   (((X) == GOMP_MAP_ALWAYS_TO) || GOMP_MAP_ALWAYS_FROM_P (X))
> or
>   (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM))
> 
> Otherwise, LGTM.
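
The difference between the old flag test and the equality-based predicate,
and the equivalence of the long and short spellings of that predicate, can be
checked with a small stand-alone model.  All DEMO_* names and the numeric
values below are assumptions made up for the illustration; they are not the
definitions from gomp-constants.h.

```cpp
#include <cassert>

// Illustrative values only; the DEMO_* names are hypothetical.
enum demo_map_kind {
  DEMO_TO = 1,
  DEMO_FROM = 2,
  DEMO_TOFROM = 3,
  DEMO_FLAG_SPECIAL_2 = 1 << 4,
  DEMO_ALWAYS_TO = DEMO_FLAG_SPECIAL_2 | DEMO_TO,
  DEMO_ALWAYS_FROM = DEMO_FLAG_SPECIAL_2 | DEMO_FROM,
  DEMO_ALWAYS_TOFROM = DEMO_FLAG_SPECIAL_2 | DEMO_TOFROM,
  DEMO_STRUCT = DEMO_FLAG_SPECIAL_2 | 4  // shares the flag bit
};

#define DEMO_ALWAYS_FROM_P(X) \
  (((X) == DEMO_ALWAYS_FROM) || ((X) == DEMO_ALWAYS_TOFROM))

// Long form: three explicit comparisons.
#define DEMO_ALWAYS_P_LONG(X) \
  (((X) == DEMO_ALWAYS_TO) || ((X) == DEMO_ALWAYS_FROM) \
   || ((X) == DEMO_ALWAYS_TOFROM))

// Short form: reuse the existing FROM predicate.
#define DEMO_ALWAYS_P_SHORT(X) \
  (((X) == DEMO_ALWAYS_TO) || DEMO_ALWAYS_FROM_P (X))

// Old-style test on the shared bit; also true for DEMO_STRUCT.
#define DEMO_FLAG_TEST(X) (((X) & DEMO_FLAG_SPECIAL_2) != 0)

// Check that both spellings agree for every 8-bit kind value.
bool forms_agree() {
  for (int k = 0; k < 256; ++k)
    if (DEMO_ALWAYS_P_LONG(k) != DEMO_ALWAYS_P_SHORT(k))
      return false;
  return true;
}
```

In this model the bit test cannot tell DEMO_STRUCT from the always kinds,
while the equality-based predicate can, so no extra check for the struct
kind is needed next to it.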

Done.  Here is what I committed:


gcc/
* gimplify.c (gimplify_scan_omp_clauses): Use GOMP_MAP_ALWAYS_P.
(gimplify_adjust_omp_clauses): Likewise.
include/
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_2): Define.
(GOMP_MAP_FLAG_ALWAYS): Remove.
(enum gomp_map_kind): Use GOMP_MAP_FLAG_SPECIAL_2 instead of
GOMP_MAP_FLAG_ALWAYS for GOMP_MAP_ALWAYS_TO, GOMP_MAP_ALWAYS_FROM,
GOMP_MAP_ALWAYS_TOFROM, GOMP_MAP_STRUCT, GOMP_MAP_RELEASE.
(GOMP_MAP_ALWAYS_P): Define.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index ee5cb95..a308307 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6613,7 +6613,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  struct_map_to_clause->put (decl, *list_p);
  list_p = &OMP_CLAUSE_CHAIN (*list_p);
  flags = GOVD_MAP | GOVD_EXPLICIT;
- if (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS)
+ if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)))
flags |= GOVD_SEEN;
  goto do_add_decl;
}
@@ -6623,7 +6623,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  tree *sc = NULL, *pt = NULL;
  if (!ptr && TREE_CODE (*osc) == TREE_LIST)
osc = &TREE_PURPOSE (*osc);
- if (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS)
+ if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)))
n->value |= GOVD_SEEN;
  offset_int o1, o2;
  if (offset)
@@ -7363,8 +7363,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree 
*list_p,
  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
  if ((ctx->region_type & ORT_TARGET) != 0
  && !(n->value & GOVD_SEEN)
- && ((OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_FLAG_ALWAYS) == 0
- || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_STRUCT))
+ && GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) == 0)
{
  remove = true;
  /* For struct element mapping, if struct is never referenced
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index f834dec..008a4a4 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -39,10 +39,9 @@
 /* Special map kinds, enumerated starting here.  */
 #define GOMP_MAP_FLAG_SPECIAL_0(1 << 2)
 #define GOMP_MAP_FLAG_SPECIAL_1(1 << 3)
+#define GOMP_MAP_FLAG_SPECIAL_2(1 << 4)
 #define GOMP_MAP_FLAG_SPECIAL  (GOMP_MAP_FLAG_SPECIAL_1 \
 | GOMP_MAP_FLAG_SPECIAL_0)
-/* OpenMP always flag.  */
-#define GOMP_MAP_FLAG_ALWAYS   (1 << 6)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
 #define GOMP_MAP_FLAG_FORCE(1 << 7)
 
@@ -95,29 +94,31 @@ enum gomp_map_kind
 GOMP_MAP_FORCE_TOFROM =(GOMP_MAP_FLAG_FORCE | GOMP_MAP_TOFROM),
 /* If not already present, allocate.  And unconditionally copy to
device.  */
-GOMP_MAP_ALWAYS_TO =   (GOMP_MAP_FLAG_ALWAYS | GOMP_MAP_TO),
+GOMP_MAP_ALWAYS_TO =   (GOMP_MAP_FLAG_SPECIAL_2 | GOMP_MAP_TO),
 /* If not 

Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-26 Thread Richard Biener
On Mon, Oct 26, 2015 at 2:48 PM, Richard Biener
 wrote:
> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
>> On 10/21/2015 04:06 PM, Richard Biener wrote:
>>> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  wrote:
 On 10/21/2015 11:59 AM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  wrote:
>> On 10/20/2015 03:39 PM, Richard Biener wrote:
>>> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  wrote:
 Hello.

 As part of upcoming merge of HSA branch, we would like to have 
 possibility to terminate
 pass manager after execution of the HSA generation pass. The HSA 
 back-end is implemented
 as a tree pass that directly emits HSAIL from gimple tree 
 representation. The pass operates
 on clones created by HSA IPA pass and the pass manager should stop 
 execution of further
 RTL passes.

 Suggested patch survives bootstrap and regression tests on 
 x86_64-linux-pc.

 What do you think about it?
>>>
>>> Are you sure it works this way?
>>>
>>> Btw, you will miss executing of all the cleanup passes that will
>>> eventually free memory
>>> associated with the function.  So I'd rather support a
>>> TODO_discard_function which
>>> should basically release the body from the cgraph.
>>
>> Hi.
>>
>> Agree with you that I should execute all TODOs, which can be easily done.
>> However, if I just try to introduce the suggested TODO and handle it 
>> properly
>> by calling cgraph_node::release_body, then for instance fn->gimple_df, 
>> fn->cfg are
>> released and I hit ICEs on many places.
>>
>> Stopping the pass manager looks necessary, or do I miss something?
>
> "Stopping the pass manager" is necessary after TODO_discard_function, yes.
> But that may be simply done via a has_body () check then?

 Thanks, there's second version of the patch. I'm going to start regression 
 tests.
>>>
>>> As release_body () will free cfun you should pop_cfun () before
>>> calling it (and thus
>>
>> Well, release_function_body calls both push & pop, so I think calling pop
>> before cgraph_node::release_body is not necessary.
>
> (ugh).
>
>> If tried to call pop_cfun before cgraph_node::release_body, I have cfun still
>> pointing to the same (function *) (which is gcc_freed, but cfun != NULL).
>
> Hmm, I meant to call pop_cfun after it then (unless you want to fix the above;
> none of the freeing functions should technically need 'cfun', just add
> 'fn' parameters ...).

I'm giving that a shot now (removing push/pop_cfun in release_body)

>
> I expected pop_cfun to eventually set cfun to NULL if it popped the
> "last" cfun.  Why
> doesn't it do that?
>
>>> drop its modification).  Also TODO_discard_function should be only set for
>>> local passes thus you should probably add a gcc_assert (cfun).
>>> I'd move its handling earlier, definitely before the ggc_collect, eventually
>>> before the pass_fini_dump_file () (do we want a last dump of the
>>> function or not?).
>>
>> Fully agree, moved here.
>>
>>>
>>> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
>>>  {
>>>gcc_assert (pass->type == GIMPLE_PASS
>>>   || pass->type == RTL_PASS);
>>> +
>>> +
>>> +  if (!gimple_has_body_p (current_function_decl))
>>> +   return;
>>>
>>> too much vertical space.  With popping cfun before releasing the body the 
>>> check
>>> might just become if (!cfun) and
>>
>> As mentioned above, as release body is symmetric (calling push & pop), the 
>> suggested
>> guard will not work.
>
> I suggest to fix it.  If it calls push/pop it should leave with the
> original cfun, popping again
> should make it NULL?
>
>>>
>>> @@ -2409,7 +2426,7 @@ execute_pass_list (function *fn, opt_pass *pass)
>>>  {
>>>push_cfun (fn);
>>>execute_pass_list_1 (pass);
>>> -  if (fn->cfg)
>>> +  if (gimple_has_body_p (current_function_decl) && fn->cfg)
>>>  {
>>>free_dominance_info (CDI_DOMINATORS);
>>>free_dominance_info (CDI_POST_DOMINATORS);
>>>
>>> here you'd need to guard the pop_cfun call on cfun != NULL.  IMHO it's 
>>> better
>>> to not let cfun point to deallocated data.
>>
>> As I've read the code, since we gcc_free 'struct function', one can just 
>> rely on
>> gimple_has_body_p (current_function_decl) as it correctly realizes that
>> DECL_STRUCT_FUNCTION (current_function_decl) == NULL.
>
> I'm sure that with some GC checking ggc_free makes them #deadbeef or so:
>
> void
> ggc_free (void *p)
> {
> ...
> #ifdef ENABLE_GC_CHECKING
>   /* Poison the data, to indicate the data is garbage.  */
>   VALGRIND_DISCARD (VALGRIND_MAKE_MEM_UNDEFINED (p, size));
>   memset (p, 0xa5, size);
> #endif
>
> so I don't think that's a good thing to rely on ;)
>
> Richard.
>
>> I'm attaching v3.
>>
>> Thanks,
>> Martin
>>
>>>
>>> Otherwise looks good to me.
>

Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-26 Thread Richard Biener
On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
> On 10/21/2015 04:06 PM, Richard Biener wrote:
>> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  wrote:
>>> On 10/21/2015 11:59 AM, Richard Biener wrote:
 On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  wrote:
> On 10/20/2015 03:39 PM, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  wrote:
>>> Hello.
>>>
>>> As part of upcoming merge of HSA branch, we would like to have 
>>> possibility to terminate
>>> pass manager after execution of the HSA generation pass. The HSA 
>>> back-end is implemented
>>> as a tree pass that directly emits HSAIL from gimple tree 
>>> representation. The pass operates
>>> on clones created by HSA IPA pass and the pass manager should stop 
>>> execution of further
>>> RTL passes.
>>>
>>> Suggested patch survives bootstrap and regression tests on 
>>> x86_64-linux-pc.
>>>
>>> What do you think about it?
>>
>> Are you sure it works this way?
>>
>> Btw, you will miss executing of all the cleanup passes that will
>> eventually free memory
>> associated with the function.  So I'd rather support a
>> TODO_discard_function which
>> should basically release the body from the cgraph.
>
> Hi.
>
> Agree with you that I should execute all TODOs, which can be easily done.
> However, if I just try to introduce the suggested TODO and handle it
> properly by calling cgraph_node::release_body, then for instance
> fn->gimple_df and fn->cfg are released and I hit ICEs in many places.
>
> Stopping the pass manager looks necessary, or am I missing something?

 "Stopping the pass manager" is necessary after TODO_discard_function, yes.
 But that may be simply done via a has_body () check then?
>>>
>>> Thanks, here is the second version of the patch. I'm going to start
>>> regression tests.
>>
>> As release_body () will free cfun you should pop_cfun () before
>> calling it (and thus
>
> Well, release_function_body calls both push & pop, so I think calling pop
> before cgraph_node::release_body is not necessary.

(ugh).

> If I try to call pop_cfun before cgraph_node::release_body, cfun still
> points to the same (function *) (which is ggc_freed, but cfun != NULL).

Hmm, I meant to call pop_cfun then after it (unless you want to fix the above,
none of the freeing functions should technically need 'cfun', just add
'fn' parameters ...).

I expected pop_cfun to eventually set cfun to NULL if it popped the
"last" cfun.  Why
doesn't it do that?
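
For illustration, the push/pop expectation above can be pictured with a toy current-function stack. This is only a sketch under stated assumptions: `toy_function`, `toy_cfun`, `toy_push_cfun`, `toy_pop_cfun` and `toy_release_body` are made-up names and not GCC's implementation. It shows that a balanced push/pop pair restores the previous value and that popping the outermost entry leaves NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct function'.  */
struct toy_function { const char *name; };

#define TOY_STACK_MAX 16

/* Toy equivalent of the global 'cfun' plus a stack of saved values.  */
static struct toy_function *toy_cfun = NULL;
static struct toy_function *toy_saved[TOY_STACK_MAX];
static size_t toy_depth = 0;

/* Save the current value and make FN current.  */
void
toy_push_cfun (struct toy_function *fn)
{
  assert (toy_depth < TOY_STACK_MAX);
  toy_saved[toy_depth++] = toy_cfun;
  toy_cfun = fn;
}

/* Restore whatever was current before the matching push.  */
void
toy_pop_cfun (void)
{
  assert (toy_depth > 0);
  toy_cfun = toy_saved[--toy_depth];
}

/* Models a release routine that does its own balanced push/pop,
   so the caller's current function is unchanged on return.  */
void
toy_release_body (struct toy_function *fn)
{
  toy_push_cfun (fn);
  /* ... per-function data would be freed here ... */
  toy_pop_cfun ();
}
```

In this model, a caller that pushed once, called `toy_release_body`, and then popped once ends up with `toy_cfun == NULL`, provided nothing was current before the first push.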

>> drop its modification).  Also TODO_discard_function should be only set for
>> local passes thus you should probably add a gcc_assert (cfun).
>> I'd move its handling earlier, definitely before the ggc_collect, eventually
>> before the pass_fini_dump_file () (do we want a last dump of the
>> function or not?).
>
> Fully agree, moved here.
>
>>
>> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
>>  {
>>gcc_assert (pass->type == GIMPLE_PASS
>>   || pass->type == RTL_PASS);
>> +
>> +
>> +  if (!gimple_has_body_p (current_function_decl))
>> +   return;
>>
>> too much vertical space.  With popping cfun before releasing the body the 
>> check
>> might just become if (!cfun) and
>
> As mentioned above, as release body is symmetric (calling push & pop), the
> suggested guard will not work.

I suggest to fix it.  If it calls push/pop it should leave with the
original cfun, popping again
should make it NULL?
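
The `if (!cfun)` guard discussed above can be sketched with a toy pass loop. Everything here is hypothetical (`toy_fn`, `toy_pass`, `TOY_TODO_DISCARD_FUNCTION`, `toy_execute_pass_list`); it only models the control flow, assuming that discarding a function also clears the current-function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TODO flag a pass returns to request that the
   function body be thrown away.  */
#define TOY_TODO_DISCARD_FUNCTION (1u << 0)

struct toy_fn { int has_body; int passes_run; };

struct toy_pass
{
  unsigned int (*execute) (struct toy_fn *);
  struct toy_pass *next;
};

/* Toy 'cfun': NULL once the current function has been discarded.  */
static struct toy_fn *toy_cur_fn = NULL;

/* Run passes in order; stop as soon as no function is current.  */
void
toy_execute_pass_list (struct toy_fn *fn, struct toy_pass *pass)
{
  assert (fn != NULL);
  toy_cur_fn = fn;
  for (; pass != NULL; pass = pass->next)
    {
      if (toy_cur_fn == NULL)
        return;

      unsigned int todo = pass->execute (toy_cur_fn);
      if (todo & TOY_TODO_DISCARD_FUNCTION)
        {
          /* Release the body and clear the pointer so that nothing
             dereferences stale data afterwards.  */
          toy_cur_fn->has_body = 0;
          toy_cur_fn = NULL;
        }
    }
  toy_cur_fn = NULL;
}

/* Two sample passes: an ordinary one and one that discards.  */
unsigned int
toy_plain_pass (struct toy_fn *fn)
{
  fn->passes_run++;
  return 0;
}

unsigned int
toy_discarding_pass (struct toy_fn *fn)
{
  fn->passes_run++;
  return TOY_TODO_DISCARD_FUNCTION;
}
```

With a list of plain, discarding, plain passes, the third pass never runs in this model, because the pointer is already NULL when the loop reaches it.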

>>
>> @@ -2409,7 +2426,7 @@ execute_pass_list (function *fn, opt_pass *pass)
>>  {
>>push_cfun (fn);
>>execute_pass_list_1 (pass);
>> -  if (fn->cfg)
>> +  if (gimple_has_body_p (current_function_decl) && fn->cfg)
>>  {
>>free_dominance_info (CDI_DOMINATORS);
>>free_dominance_info (CDI_POST_DOMINATORS);
>>
>> here you'd need to guard the pop_cfun call on cfun != NULL.  IMHO it's better
>> to not let cfun point to deallocated data.
>
> As I've read the code, since we ggc_free 'struct function', one can just rely
> on gimple_has_body_p (current_function_decl) as it correctly realizes that
> DECL_STRUCT_FUNCTION (current_function_decl) == NULL.

I'm sure that with some GC checking ggc_free makes them #deadbeef or so:

void
ggc_free (void *p)
{
...
#ifdef ENABLE_GC_CHECKING
  /* Poison the data, to indicate the data is garbage.  */
  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_UNDEFINED (p, size));
  memset (p, 0xa5, size);
#endif

so I don't think that's a good thing to rely on ;)

Richard.
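
To illustrate the poisoning concern, here is a small sketch. `toy_obj`, `toy_poison` and `toy_is_poisoned` are hypothetical helpers, not GCC functions; only the 0xa5 fill byte is taken from the snippet quoted above. The point is that once memory has been poisoned, a field that used to be NULL no longer compares equal to NULL, so testing fields of a freed object is unreliable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object with a pointer field a caller might test.  */
struct toy_obj { void *cfg; };

/* Fill SIZE bytes at P with 0xa5, as a checking allocator might do
   just before releasing the memory.  */
void
toy_poison (void *p, size_t size)
{
  assert (p != NULL);
  memset (p, 0xa5, size);
}

/* Return nonzero if every byte of P[0..SIZE) holds the poison value.  */
int
toy_is_poisoned (const void *p, size_t size)
{
  const unsigned char *bytes = (const unsigned char *) p;
  for (size_t i = 0; i < size; i++)
    if (bytes[i] != 0xa5)
      return 0;
  return 1;
}
```

The sketch poisons a live object on purpose; reading genuinely freed memory would be undefined behaviour, which is one more reason not to depend on its contents.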

> I'm attaching v3.
>
> Thanks,
> Martin
>
>>
>> Otherwise looks good to me.
>>
>> Richard.
>>
>>
>>> Martin
>>>

> Thanks,
> Martin
>
>>
>> Richard.
>>
>>> Thanks,
>>> Martin
>
>>>
>


Re: [PATCH] PR fortran/36192 -- Check for valid BT_INTEGER

2015-10-26 Thread Steve Kargl
On Mon, Oct 26, 2015 at 09:49:10AM +0100, FX wrote:
> > 2015-10-25  Steven G. Kargl  
> > 
> > PR fortran/36192
> > * array.c (gfc_ref_dimen_size): Check for BT_INTEGER before calling
> > mpz_set.
> > 
> > 
> > 2015-10-25  Steven G. Kargl  
> > 
> > PR fortran/36192
> > * gfortran.dg/pr36192.f90: New test.
> 
> OK. But I don't understand why the testcase's dg-error pattern has this
> form: a regex "or" (|) of two identical strings?
> 

Because the code issues two errors, one for each dimension.
I thought testing for the third (which I prune) to be
excessive.

laptop-kargl:kargl[202] gfc -c pr36192.f90
pr36192.f90:6:18:

   real, dimension(n,d) :: x ! { dg-error "of INTEGER type|of INTEGER type" }
  1
Error: Expression at (1) must be of INTEGER type, found REAL
pr36192.f90:6:20:

   real, dimension(n,d) :: x ! { dg-error "of INTEGER type|of INTEGER type" }
1
Error: Expression at (1) must be of INTEGER type, found REAL
pr36192.f90:6:27:

   real, dimension(n,d) :: x ! { dg-error "of INTEGER type|of INTEGER type" }
   1
Error: The module or main program array 'x' at (1) must have constant shape

-- 
Steve


Re: [PATCH, 1/2] Add handle_param parameter to create_variable_info_for_1

2015-10-26 Thread Richard Biener
On Mon, Oct 26, 2015 at 12:22 PM, Tom de Vries  wrote:
> Hi,
>
> this no-functional-changes patch copies the restrict var declaration code
> from intra_create_variable_infos to create_variable_info_for_1.
>
> The code was copied rather than moved, since in fipa-pta mode the varinfo p
> for the parameter t may already exist due to create_function_info_for, in
> which case we're not calling create_variable_info_for_1 to set p, meaning
> the copied code won't get triggered.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?

@@ -272,6 +272,9 @@ struct variable_info
   /* True if this field has only restrict qualified pointers.  */
   unsigned int only_restrict_pointers : 1;

+  /* The id of the pointed-to restrict var in case only_restrict_pointers.  */
+  unsigned int restrict_pointed_var;
+
   /* True if this represents a heap var created for a restrict qualified
  pointer.  */
   unsigned int is_restrict_var : 1;
@@ -5608,10 +5611,10 @@ check_for_overlaps (vec fieldstack)


Please don't split the bitfield like that.  Note that variable_info
is kept as small as
possible because there may be a _lot_ of them.  Is it really necessary to have
this info be persistent?  Why does create_variable_info_for_1 only handle
the single-field case?  That is, I expected this to be handled by c_v_i_f_1
fully.

Richard.
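
The size concern can be shown with two toy structs. This is a sketch only: `toy_grouped` and `toy_split` borrow three member names from the hunk above but are not the real `variable_info`, and exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Two 1-bit flags kept adjacent: they can share one storage unit.  */
struct toy_grouped
{
  unsigned int only_restrict_pointers : 1;
  unsigned int is_restrict_var : 1;
  unsigned int restrict_pointed_var;
};

/* A full-width member placed between the flags: each flag now
   typically ends up in a storage unit of its own.  */
struct toy_split
{
  unsigned int only_restrict_pointers : 1;
  unsigned int restrict_pointed_var;
  unsigned int is_restrict_var : 1;
};

/* Extra bytes per object caused by splitting the bit-field run.  */
size_t
toy_split_overhead (void)
{
  assert (sizeof (struct toy_split) >= sizeof (struct toy_grouped));
  return sizeof (struct toy_split) - sizeof (struct toy_grouped);
}
```

On a typical ABI with 4-byte `unsigned int` one would expect 12 bytes for the split layout against 8 for the grouped one, though that figure is an assumption about the platform rather than a guarantee.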





> Thanks,
> - Tom


Re: [Patch] Avoid is_simple_use bug in vectorizable_live_operation

2015-10-26 Thread Richard Biener
On Mon, Oct 26, 2015 at 1:33 PM, Alan Hayward  wrote:
> There is a potential bug in vectorizable_live_operation.
>
> Consider the case where the first op for stmt is valid, but the second is
> null.
> The first time through the for () loop, it will call out to
> vect_is_simple_use () which will set dt.
> The second time, because op is null, vect_is_simple_use () will not be
> called.
> However, dt is still set to a valid value, therefore the loop will
> continue.
>
> Note this is different from the case where the first op is null, which
> will cause the loop not to call vect_is_simple_use () and then return false.
>
> It is possible that this was intentional, however that is not clear from
> the code.
>
> The fix is to simply ensure dt is initialized to a default value on each
> iteration.

I think the patch is a strict improvement, thus OK.  Still a NULL operand
is not possible in GIMPLE so the op && check is not necessary.  The way
it iterates over all stmt uses is a bit scary anyway.  As it is ok with
all invariants it should probably simply do sth like

   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
 if (!vect_is_simple_use (op, ))

and be done with that.  Unvisited uses can only be constants (ok).

Care to rework the function like that if you are here?

Thanks,
Richard.
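
The stale-value problem described in the quoted report can be reproduced in miniature. The names below (`toy_def_type`, `toy_is_simple_use`, `toy_all_ops_ok`) are hypothetical and the classifier is a placeholder; the sketch only mirrors the loop shape described above, where the classifier call is skipped for a NULL operand but the result variable is still inspected.

```c
#include <assert.h>
#include <stddef.h>

enum toy_def_type { TOY_DT_UNKNOWN, TOY_DT_CONSTANT, TOY_DT_EXTERNAL };

/* Hypothetical classifier: sets *DT and returns nonzero on success.  */
int
toy_is_simple_use (const int *op, enum toy_def_type *dt)
{
  assert (op != NULL && dt != NULL);
  *dt = (*op == 0) ? TOY_DT_CONSTANT : TOY_DT_EXTERNAL;
  return 1;
}

/* Return nonzero if all N operands are acceptable.  If RESET_EACH_ITER
   is zero, DT keeps its value from the previous iteration, so a NULL
   operand following a valid one is silently accepted.  */
int
toy_all_ops_ok (const int *const *ops, size_t n, int reset_each_iter)
{
  enum toy_def_type dt = TOY_DT_UNKNOWN;
  for (size_t i = 0; i < n; i++)
    {
      if (reset_each_iter)
        dt = TOY_DT_UNKNOWN;
      if (ops[i] != NULL && !toy_is_simple_use (ops[i], &dt))
        return 0;
      if (dt != TOY_DT_CONSTANT && dt != TOY_DT_EXTERNAL)
        return 0;
    }
  return 1;
}
```

With operands { valid, NULL } the non-resetting variant accepts the list while the resetting variant rejects it; with { NULL, valid } both reject, matching the two cases distinguished in the report.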

>
> 2015-09-07  Alan Hayward  alan.hayw...@arm.com
> * tree-vect-loop.c (vectorizable_live_operation): localize variable.
>
>
> Cheers,
> Alan
>


Re: [RFC] Improving alias dumps

2015-10-26 Thread Richard Biener
On Mon, Oct 26, 2015 at 1:26 PM, Tom de Vries  wrote:
> Hi,
>
> After spending some time looking at ealias/pta dumps, I realized that
> they're hard to understand because we use varinfo names to identify
> varinfos, while those names are not necessarily unique.
>
> F.i., for a function f:
> ...
> void
> f (int *__restrict__ a, int *__restrict__ b)
> {
>   *a = 1;
>   *b = 2;
> }
> ...
>
> we have at ealias the constraints:
> ...
> a = &PARM_NOALIAS
> PARM_NOALIAS = NONLOCAL
> b = &PARM_NOALIAS
> PARM_NOALIAS = NONLOCAL
> derefaddrtmp = &NONLOCAL
> *a = derefaddrtmp
> derefaddrtmp = &NONLOCAL
> *b = derefaddrtmp
> ...
> F.i. PARM_NOALIAS occurs several times, and it's not clear if there are one
> or two varinfos with that name.
>
> Using attached patch, it's clearer what varinfos the constraints relate to:
> ...
> a(8) = &PARM_NOALIAS(9)
> PARM_NOALIAS(9) = NONLOCAL(5)
> b(10) = &PARM_NOALIAS(11)
> PARM_NOALIAS(11) = NONLOCAL(5)
> derefaddrtmp(12) = &NONLOCAL(5)
> *a(8) = derefaddrtmp(12)
> derefaddrtmp(13) = &NONLOCAL(5)
> *b(10) = derefaddrtmp(13)
> ...
>
> Is this a good idea, f.i. guarded by (dump_flags & TDF_DETAILS) not to
> disturb scans of current tests?
>
> Or, do we f.i. want to fix the names themselves to be unique?

I think so, in most cases the (n) adds clutter without extra info.

Richard.
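
A guarded "name(id)" dump of the kind proposed above might look like the following sketch. `toy_varinfo` and `toy_format_varinfo` are hypothetical, and the `details` argument merely stands in for a check such as (dump_flags & TDF_DETAILS); none of this is the attached patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed-down varinfo: just a name and a unique id.  */
struct toy_varinfo { const char *name; unsigned int id; };

/* Write VI's label into BUF.  With DETAILS nonzero the id is appended
   as "name(id)"; otherwise only the name is written, so existing dump
   scans keep matching.  Returns snprintf's result.  */
int
toy_format_varinfo (char *buf, size_t size, const struct toy_varinfo *vi,
                    int details)
{
  assert (buf != NULL && vi != NULL);
  if (details)
    return snprintf (buf, size, "%s(%u)", vi->name, vi->id);
  return snprintf (buf, size, "%s", vi->name);
}
```

Two varinfos that share the name PARM_NOALIAS but carry ids 9 and 11 would then be distinguishable in the detailed dump only.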

>
> Thanks,
> - Tom


[PATCH] Adjust some patterns wrt :c

2015-10-26 Thread Richard Biener

This removes the genmatch warnings.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-10-26  Richard Biener  

* match.pd ((A & ~B) - (A & B) -> (A ^ B) - B): Add missing :c.
( (X & ~Y) | (~X & Y) -> X ^ Y): Remove redundant :c.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 229307)
+++ gcc/match.pd(working copy)
@@ -435,7 +435,7 @@ (define_operator_list RINT BUILT_IN_RINT
 
 /* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
 (simplify
- (minus (bit_and:cs @0 (bit_not @1)) (bit_and:s @0 @1))
+ (minus (bit_and:cs @0 (bit_not @1)) (bit_and:cs @0 @1))
   (minus (bit_xor @0 @1) @1))
 (simplify
  (minus (bit_and:s @0 INTEGER_CST@2) (bit_and:s @0 INTEGER_CST@1))
@@ -449,7 +449,7 @@ (define_operator_list RINT BUILT_IN_RINT
 
 /* Simplify (X & ~Y) | (~X & Y) -> X ^ Y.  */
 (simplify
- (bit_ior:c (bit_and:c @0 (bit_not @1)) (bit_and:c (bit_not @0) @1))
+ (bit_ior (bit_and:c @0 (bit_not @1)) (bit_and:c (bit_not @0) @1))
   (bit_xor @0 @1))
 (simplify
  (bit_ior:c (bit_and @0 INTEGER_CST@2) (bit_and (bit_not @0) INTEGER_CST@1))
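
As a sanity check on the two identities named in the comments above, here is a brute-force sketch in plain C. The function names are made up for the example, and it says nothing about how genmatch treats :c; it only confirms the arithmetic for all 8-bit operand pairs using unsigned (wrapping) arithmetic, including the case where the operands of the second bit_and appear in swapped order.

```c
#include <assert.h>

/* Left-hand side of (A & ~B) - (A & B) -> (A ^ B) - B.  */
unsigned int
toy_minus_lhs (unsigned int a, unsigned int b)
{
  return (a & ~b) - (a & b);
}

/* Right-hand side of the same rule.  */
unsigned int
toy_minus_rhs (unsigned int a, unsigned int b)
{
  return (a ^ b) - b;
}

/* Left-hand side of (X & ~Y) | (~X & Y) -> X ^ Y.  */
unsigned int
toy_ior_lhs (unsigned int x, unsigned int y)
{
  return (x & ~y) | (~x & y);
}

/* Check both identities for every pair of 8-bit operands.  */
void
toy_check_identities (void)
{
  for (unsigned int a = 0; a < 256; a++)
    for (unsigned int b = 0; b < 256; b++)
      {
        assert (toy_minus_lhs (a, b) == toy_minus_rhs (a, b));
        /* Same subtraction with the bit_and operands swapped.  */
        assert (((~b & a) - (b & a)) == toy_minus_rhs (a, b));
        assert (toy_ior_lhs (a, b) == (a ^ b));
      }
}
```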



Re: [PATCH]Add -fprofile-use option for check_effective_target_freorder.

2015-10-26 Thread Bernd Schmidt

On 10/26/2015 02:17 PM, Teresa Johnson wrote:
> On Mon, Oct 26, 2015 at 2:00 AM, Renlin Li  wrote:
>>  * lib/target-supports.exp (check_effective_target_freorder): Add
>>  -fprofile-use flag.


Hmmm, the testcases themselves which use this predicate only use
-freorder-blocks-and-partition, so maybe this requires more thought.



Bernd



Re: [PATCH]Add -fprofile-use option for check_effective_target_freorder.

2015-10-26 Thread Teresa Johnson
Looks good to me, but I don't have approval rights.
Thanks for fixing,
Teresa

On Mon, Oct 26, 2015 at 2:00 AM, Renlin Li  wrote:
> Hi all,
>
> After r228136, flag_reorder_blocks_and_partition is canceled when
> -fprofile-use is not specified.
>
> In this case check_effective_target_freorder() is not able to check the
> proper target support.
> This is a simple patch to add the "-fprofile-use" option to that effective
> target check.
>
> Okay to commit on the trunk?
>
> Regards,
> Renlin Li
>
> gcc/testsuite/ChangeLog:
>
> 2015-10-26  Renlin Li  
>
> * lib/target-supports.exp (check_effective_target_freorder): Add
> -fprofile-use flag.



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

