Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 07:17:19AM +, Nathan Sidwell wrote:
 On 11/06/13 21:25, Dodji Seketeli wrote:
 It appeared that gcov.c and input.c both have a static function named
 read_line.  I guess we ought to keep just one.
 
 As a general rule, that's good.

Generally yes, but IMHO not in this case, both because of the problems you
are mentioning, because it already now needs to do quite different things
and because I think in the future the version for caret diagnostics should
do much more complicated things (caching of the lines for faster further
caret diagnostics).  Furthermore, ftell/fseek aren't something that should
be used unless strictly necessary.

Jakub


Re: [gomp4 1/9] Add missing include.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:15PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   libgomp/
   * libgomp_g.h: Include stddef.h for size_t.

I'm surprised why it is needed, because libgomp.h that includes it includes
stdlib.h that should provide size_t too.  But, including this certainly
doesn't hurt and makes the header more self-contained.  So, ok for
trunk/gomp-4_0-branch.

Jakub


Re: [gomp4 2/9] libgomp: Prepare for testcases without -fopenmp.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:16PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   libgomp/
   * testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
   ALWAYS_CFLAGS.
   * testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
   * testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
   * testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
   * testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
   Likewise.

Ok for trunk/gomp-4_0-branch.

Jakub


Re: [gomp4 3/9] OpenACC: Recognize -fopenacc.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:17PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   gcc/c-family/
   * c.opt (fopenacc): New option.
   gcc/fortran/
   * lang.opt (fopenacc): New option.
   * invoke.texi (-fopenacc): Document it.
   * gfortran.h (gfc_option_t): New member.
   * options.c (gfc_init_options, gfc_handle_option): Handle it.
   gcc/testsuite/
   * lib/target-supports.exp (check_effective_target_fopenacc): New
   procedure.
   gcc/
   * doc/invoke.texi (-fopenacc): Document it.
   * doc/sourcebuild.texi (fopenacc): Document it.
   gcc/testsuite/
   * c-c++-common/cpp/openacc-define-1.c: New file.
   * c-c++-common/cpp/openacc-define-2.c: Likewise.
   * c-c++-common/cpp/openacc-define-3.c: Likewise.
   * gfortran.dg/openacc-define-1.f90: Likewise.
   * gfortran.dg/openacc-define-2.f90: Likewise.
   * gfortran.dg/openacc-define-3.f90: Likewise.

Ok for branch.

Jakub


Re: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote:
 --- gcc/config/arc/arc.h
 +++ gcc/config/arc/arc.h
 @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
  %(linker) %l  LINK_PIE_SPEC %X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} 
 %{r}\
  %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
  %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
 -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
 +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\

Perhaps add here |ftree-parallelize-loops=* too?

%{!shared: \
 - %{mt|pthread:%{fopenmp:-lrt} -lpthread} \
 + %{mt|pthread:%{fopenacc|fopenmp:-lrt} -lpthread} \

Likewise.

   %{p:%{!mlp64:-L/usr/lib/hpux32/libp} \
%{mlp64:-L/usr/lib/hpux64/libp} -lprof} \
   %{pg:%{!mlp64:-L/usr/lib/hpux32/libp} \
 diff --git gcc/config/pa/pa-hpux11.h gcc/config/pa/pa-hpux11.h
 index c217398..0676d74 100644
 --- gcc/config/pa/pa-hpux11.h
 +++ gcc/config/pa/pa-hpux11.h
 @@ -120,7 +120,7 @@ along with GCC; see the file COPYING3.  If not see
  #undef LIB_SPEC
  #define LIB_SPEC \
%{!shared:\
 - %{fopenmp:%{static:-a archive_shared} -lrt %{static:-a archive}}\
 + %{fopenacc|fopenmp:%{static:-a archive_shared} -lrt %{static:-a 
 archive}}\

Ditto (and several times further).

Otherwise LGTM for branch.

Jakub


Re: [gomp4 5/9] OpenACC: preprocessor definition, Fortran integer parameter.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:19PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   gcc/c-family/
   * c-cppbuiltin.c (c_cpp_builtins): Conditionally define _OPENACC.
   gcc/fortran/
   * cpp.c (cpp_define_builtins): Conditionally define _OPENACC.
   gcc/testsuite/
   * c-c++-common/cpp/openacc-define-1.c: Test _OPENACC.
   * c-c++-common/cpp/openacc-define-2.c: Likewise.
   * c-c++-common/cpp/openacc-define-3.c: Likewise.
   * gfortran.dg/openacc-define-1.f90: Likewise.
   * gfortran.dg/openacc-define-2.f90: Likewise.
   * gfortran.dg/openacc-define-3.f90: Likewise.
   libgomp/
   * openacc.f90 (openacc_version): New integer parameter.
   * openacc_lib.h (openacc_version): Likewise.
   * testsuite/libgomp.oacc-fortran/openacc_version-1.f: New file.
   * testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.

Ok for branch.

Jakub


Re: [gomp4 7/9] OpenACC: Use OpenMP's lowering and expansion passes.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:21PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   gcc/
   * gimplify.c (gimplify_body): Consider flag_openacc additionally
   to flag_openmp.
   * omp-low.c (execute_expand_omp, execute_lower_omp)
   (gate_diagnose_omp_blocks): Likewise.
   gcc/testsuite/
   * gcc.dg/goacc-gomp/goacc-gomp.exp: New file.
   * gcc.dg/goacc/goacc.exp: Likewise.

Ok for branch.

Jakub


Re: [gomp4 6/9] OpenACC: Infrastructure for builtins.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:20PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   gcc/
   * oacc-builtins.def: New file.
   * Makefile.in (BUILTINS_DEF): Add oacc-builtins.def.
   * builtins.def (DEF_GOACC_BUILTIN): New macro.
   Include oacc-builtins.def.
   gcc/fortran/
   * f95-lang.c (DEF_GOACC_BUILTIN): New macro.
   Include ../oacc-builtins.def.
   libgomp/
   * libgomp.map (GOACC_2.0): New symbol version.

Ok for branch.

Jakub


Re: [gomp4 8/9] OpenACC: Basic support for #pragma acc in the C front end.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:22PM +0100, tho...@codesourcery.com wrote:
 From: Thomas Schwinge tho...@codesourcery.com
 
   gcc/c-family/
   * c-pragma.c (oacc_pragmas): New array.
   (c_pp_lookup_pragma, init_pragma): Handle it.
   gcc/
   * doc/invoke.texi (-fopenacc): Update.
 
   gcc/c/
   * c-parser.c (c_parser_omp_all_clauses): Make a parser error
   message suitable for OpenACC, too.
   gcc/cp/
   * parser.c (cp_parser_omp_all_clauses): Make a parser error
   message suitable for OpenACC, too.

Ok for branch.

Jakub


Re: [gomp4 9/9] OpenACC: Basic support for #pragma acc parallel.

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:53:00PM +0100, Thomas Schwinge wrote:
 Forgot to pass the --patience switch to Git, so the diff algorithm
 decided to first patch the existing expand_omp_taskreg into
 expand_oacc_parallel, and then later re-add expand_omp_taskreg.  Here's a
 more readable version of that patch, avoiding all that back'n'forth:

I've just quickly skimmed this, other than what I've mentioned earlier
I haven't found anything totally wrong, so with that thing fixed I guess
it is ok for the branch, but my quick skimming was far from a proper review.

Jakub


Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote:
 Hmmm, good point.  I've moved update_stmt and company to the caller,
 and modified the caller to call regimplify_operands only for
 GIMPLE_RETURN which is the special case.

Can't you (later) handle that without regimplification too?
 
 Let me know if this is still ok so I can commit.

Ok.

Jakub


Re: [PATCH] Fix PR58941

2013-11-07 Thread Richard Biener
On Wed, 6 Nov 2013, Richard Sandiford wrote:

 Richard Biener rguent...@suse.de writes:
  --- 599,615 

  exp = TREE_OPERAND (exp, 0);
}

  +   /* We need to deal with variable arrays ending structures.  */
  +   if (seen_variable_array_ref
  +maxsize != -1
  +(!bit_offset.fits_shwi ()
  + || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
  + || (bit_offset.to_shwi () + maxsize
  + == (signed) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp))
  + maxsize = -1;
 
 I realise this was in the old code too, but should it be HOST_WIDE_INT
 rather than signed?

Hmm, indeed.  I'll fix it.

Thanks,
Richard.

 Ignore me if not.  It just stood out as a bit weird in today's wide-int merge.
 
 Thanks,
 Richard
 
 
  + 
  +  done:
  if (!bit_offset.fits_shwi ())
{
  *poffset = 0;
  *** get_ref_base_and_extent (tree exp, HOST_
  *** 614,637 

  hbit_offset = bit_offset.to_shwi ();

  -   /* We need to deal with variable arrays ending structures such as
  -struct { int length; int a[1]; } x;   x.a[d]
  -struct { struct { int a; int b; } a[1]; } x;  x.a[d].a
  -struct { struct { int a[1]; } a[1]; } x;  x.a[0][d], x.a[d][0]
  -struct { int len; union { int a[1]; struct X x; } u; } x; x.u.a[d]
  -  where we do not know maxsize for variable index accesses to
  -  the array.  The simplest way to conservatively deal with this
  -  is to punt in the case that offset + maxsize reaches the
  -  base type boundary.  This needs to include possible trailing padding
  -  that is there for alignment purposes.  */
  - 
  -   if (seen_variable_array_ref
  -maxsize != -1
  -(!host_integerp (TYPE_SIZE (base_type), 1)
  - || (hbit_offset + maxsize
  - == (signed) TREE_INT_CST_LOW (TYPE_SIZE (base_type)
  - maxsize = -1;
  - 
  /* In case of a decl or constant base object we can do better.  */

  if (DECL_P (exp))
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend


Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Richard Biener
On Wed, 6 Nov 2013, Jakub Jelinek wrote:

 On Tue, Nov 05, 2013 at 02:00:16PM -0800, Cong Hou wrote:
   I'm also curious -- did this code show up in a particular benchmark, if 
   so,
   which one?
  
  I didn't find this problem from any benchmark, but from another
  concern about loop upper bound estimation. Look at the following code:
  
  int foo(unsigned int n, int r)
  {
int i;
if (n  0)
  if (n  4)
  {
do {
   --n;
   r *= 2;
} while (n  0);
  }
return r+n;
  }
  
  
  In order to get the upper bound of the loop in this function, GCC
  traverses conditions n4 and n0 separately and tries to get any
  useful information. But as those two conditions cannot be combined
  into one due to this issue (note that n0 will be transformed into
  n!=0), when GCC sees n4, it will consider the possibility that n may
  be equal to 0, in which case the upper bound is UINT_MAX. If those two
  conditions can be combined into one, which is n-1=2, then we can get
  the correct upper bound of the loop.
 
 I've looked at the above testcase to see why we aren't able to determine
 the number of iterations upper bound properly here.
 
 That doesn't mean your patch is useless, though I must say I'm far from
 being convinced it is safe ATM and also it looks like quite ugly special
 case (trying to undo a VRP optimization but only one single specific case).
 
 The first problem is VRP, we end up with:
   bb 5:
   # n_1 = PHI n_5(D)(4), n_7(6)
   # r_3 = PHI r_6(D)(4), r_8(6)
   # RANGE [0, 4294967295]
   n_7 = n_1 + 4294967295;
   # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe
   r_8 = r_3 * 2;
   if (n_7 != 0)
 goto bb 6;
   else
 goto bb 7;
   
   bb 6:
   goto bb 5;
 - missing range on n_1, extremely conservative range on n_7.  With the
 attached patch we get instead:
   bb 5:
   # RANGE [1, 3] NONZERO 0x3
   # n_1 = PHI n_5(D)(4), n_7(6)
   # r_3 = PHI r_6(D)(4), r_8(6)
   # RANGE [0, 2] NONZERO 0x3
   n_7 = n_1 + 4294967295;
   # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe
   r_8 = r_3 * 2;
   if (n_7 != 0)
 goto bb 6;
   else
 goto bb 7;
 
   bb 6:
   goto bb 5;
 
 The issue is that we use live_on_edge to determine if it is desirable
 to added ASSERT_EXPRs, but as we walk bbs in RPO order, we first enter
 the loop through the bb with exit edge and thus live of the latch isn't
 computed (and, generally the propagation for live ignores dfs back edges
 anyway), and because in the above live_on_edge ((5)-(6), n_7) is false,
 we don't add ASSERT_EXPR that n_7 is != 0 in the latch bb, so during
 iteration we start with n_1 being assumed [1, 3] (that is range of the
 assertion from the preceeding conditions on n_5(D)), but in the next
 iteration widen it to [0, 3] because we think n_7 can be [0, 2] in the
 PHI and thus end up with uselessly wide range because we think the
 subtraction can wrap around.  This patch improves live_on_edge for
 empty latches, by just looking at the PHIs on loop-header from the
 latch - header edge and noting which SSA_NAMEs are used there.
 
 I had to add -fno-tree-vrp to 4 unroll_*.c tests, because they disable
 various tree optimizations already and want to test unrolling of loops
 iterating exactly twice, but with this VRP change VRP is smart enough
 to replace the PHI argument on the i IV from
 -  # i_13 = PHI i_8(3), 0(2)
 +  # i_13 = PHI 1(3), 0(2)
 (the loop iterates exactly twice) and RTL unrolling doesn't do the
 tested thing in that case.  But, this should affect only loops that roll
 exactly twice and those realy should be unrolled already far before, so IMHO
 it isn't something to try to optimize better at the RTL level.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 2013-11-06  Jakub Jelinek  ja...@redhat.com
 
   * tree-vrp.c (live_on_edge): If e-dest is an empty latch
   of some loop, but live[e-dest-index] is not computed, look
   for SSA_NAMEs used in loop header PHIs from the latch edge.
 
   * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
   * gcc.dg/unroll_2.c: Likewise.
   * gcc.dg/unroll_3.c: Likewise.
   * gcc.dg/unroll_4.c: Likewise.
 
 --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100
 +++ gcc/tree-vrp.c2013-11-06 09:32:19.205104029 +0100
 @@ -92,6 +92,42 @@ static sbitmap *live;
  static bool
  live_on_edge (edge e, tree name)
  {
 +  if (live[e-dest-index] == NULL)
 +{

I'd rather have live[] computed correctly than the following
which I think is a hack ... as you say the issue is ordering
(or rather that there isn't an order for CFG cycles unless
we want to iterate).  For loop BB order we explicitely handle
the latch, maybe just using a different order than
RPO order, with special-casing the latch, makes more sense?

Richard.

 +  /* Handle edges to empty latch blocks.  If NAME appears
 +  in loop header phis on edges from latch, return true
 +  and 

Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-07 Thread Kenneth Zadeck
I very strongly disagree with this.  The standard needs to be high than does 
it pass the test suite.

What we are introducing is a case where the program will behave one way with 
optimization and another way without it.  While, this is always true for timing 
dependent code, it is pretty rare for math.   We should always tread carefully 
when doing that, and the excuse that it is cleaner does not, in my mind, fit 
the bill.



On Nov 7, 2013, at 2:08 AM, Richard Sandiford rdsandif...@googlemail.com 
wrote:

 Kenneth Zadeck zad...@naturalbridge.com writes:
 So what is the big plan here? if you remove it here and then do not 
 do it in wide int, then it is not going to be truncated.
 
 Yeah, that is the big plan for trees.  Mainline doesn't truncate at the
 tree level after:
 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00285.html
 
 and this patch brings wide-int back in line.  (The truncations in this patch
 were only ever local to wide-int.)
 
 If you're wondering why we don't want to truncate at the tree level,
 see the second half of:
 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00174.html
 
 Thanks,
 Richard
 
 On 11/06/2013 05:10 PM, Richard Sandiford wrote:
 Following the removal of SHIFT_COUNT_TRUNCATED from double-int, this patch
 reverts the changed I'd made to mimic the old behaviour on wide-int.
 
 Tested on powerpc64-linux-gnu and by assembly comparison on a range of 
 targets.
 OK to install?
 
 Thanks,
 Richard
 
 
 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c2013-11-05 13:06:56.985255941 +
 +++ gcc/fold-const.c2013-11-05 13:12:28.805655903 +
 @@ -1007,13 +1007,9 @@ int_const_binop_1 (enum tree_code code,
  /* It's unclear from the C standard whether shifts can overflow.
 The following code ignores overflow; perhaps a C standard
 interpretation ruling is needed.  */
 -res = wi::rshift (arg1, arg2, sign,
 -  SHIFT_COUNT_TRUNCATED
 -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
 +res = wi::rshift (arg1, arg2, sign);
else
 -res = wi::lshift (arg1, arg2,
 -  SHIFT_COUNT_TRUNCATED
 -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
 +res = wi::lshift (arg1, arg2);
break;
 
  case RROTATE_EXPR:
 Index: gcc/tree-ssa-ccp.c
 ===
 --- gcc/tree-ssa-ccp.c2013-11-05 13:07:25.659474362 +
 +++ gcc/tree-ssa-ccp.c2013-11-05 13:12:28.806655910 +
 @@ -1272,20 +1272,15 @@ bit_value_binop_1 (enum tree_code code,
else
  code = RSHIFT_EXPR;
  }
 -  int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0;
if (code == RSHIFT_EXPR)
  {
 -  *mask = wi::rshift (wi::ext (r1mask, width, sgn),
 -  shift, sgn, shift_precision);
 -  *val = wi::rshift (wi::ext (r1val, width, sgn),
 - shift, sgn, shift_precision);
 +  *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn);
 +  *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn);
  }
else
  {
 -  *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision),
 -   width, sgn);
 -  *val = wi::ext (wi::lshift (r1val, shift, shift_precision),
 -  width, sgn);
 +  *mask = wi::ext (wi::lshift (r1mask, shift), width, sgn);
 +  *val = wi::ext (wi::lshift (r1val, shift), width, sgn);
  }
  }
  }


Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-07 Thread Richard Biener
On Thu, 7 Nov 2013, Kenneth Zadeck wrote:

 I very strongly disagree with this.  The standard needs to be high than does 
 it pass the test suite.
 
 What we are introducing is a case where the program will behave one way 
 with optimization and another way without it.  While, this is always 
 true for timing dependent code, it is pretty rare for math.  We should 
 always tread carefully when doing that, and the excuse that it is 
 cleaner does not, in my mind, fit the bill.

It happens always for code with undefined behavior.  See the gazillions
of INVALID bugreports that complain that -O0 produces different results
than -On.

Patch is ok.

Thanks,
Richard. 

 On Nov 7, 2013, at 2:08 AM, Richard Sandiford rdsandif...@googlemail.com 
 wrote:
 
  Kenneth Zadeck zad...@naturalbridge.com writes:
  So what is the big plan here? if you remove it here and then do not 
  do it in wide int, then it is not going to be truncated.
  
  Yeah, that is the big plan for trees.  Mainline doesn't truncate at the
  tree level after:
  
 http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00285.html
  
  and this patch brings wide-int back in line.  (The truncations in this patch
  were only ever local to wide-int.)
  
  If you're wondering why we don't want to truncate at the tree level,
  see the second half of:
  
 http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00174.html
  
  Thanks,
  Richard
  
  On 11/06/2013 05:10 PM, Richard Sandiford wrote:
  Following the removal of SHIFT_COUNT_TRUNCATED from double-int, this patch
  reverts the changed I'd made to mimic the old behaviour on wide-int.
  
  Tested on powerpc64-linux-gnu and by assembly comparison on a range of 
  targets.
  OK to install?
  
  Thanks,
  Richard
  
  
  Index: gcc/fold-const.c
  ===
  --- gcc/fold-const.c2013-11-05 13:06:56.985255941 +
  +++ gcc/fold-const.c2013-11-05 13:12:28.805655903 +
  @@ -1007,13 +1007,9 @@ int_const_binop_1 (enum tree_code code,
   /* It's unclear from the C standard whether shifts can overflow.
  The following code ignores overflow; perhaps a C standard
  interpretation ruling is needed.  */
  -res = wi::rshift (arg1, arg2, sign,
  -  SHIFT_COUNT_TRUNCATED
  -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
  +res = wi::rshift (arg1, arg2, sign);
 else
  -res = wi::lshift (arg1, arg2,
  -  SHIFT_COUNT_TRUNCATED
  -  ? GET_MODE_BITSIZE (TYPE_MODE (type)) : 0);
  +res = wi::lshift (arg1, arg2);
 break;
  
   case RROTATE_EXPR:
  Index: gcc/tree-ssa-ccp.c
  ===
  --- gcc/tree-ssa-ccp.c2013-11-05 13:07:25.659474362 +
  +++ gcc/tree-ssa-ccp.c2013-11-05 13:12:28.806655910 +
  @@ -1272,20 +1272,15 @@ bit_value_binop_1 (enum tree_code code,
 else
   code = RSHIFT_EXPR;
   }
  -  int shift_precision = SHIFT_COUNT_TRUNCATED ? width : 0;
 if (code == RSHIFT_EXPR)
   {
  -  *mask = wi::rshift (wi::ext (r1mask, width, sgn),
  -  shift, sgn, shift_precision);
  -  *val = wi::rshift (wi::ext (r1val, width, sgn),
  - shift, sgn, shift_precision);
  +  *mask = wi::rshift (wi::ext (r1mask, width, sgn), shift, sgn);
  +  *val = wi::rshift (wi::ext (r1val, width, sgn), shift, sgn);
   }
 else
   {
  -  *mask = wi::ext (wi::lshift (r1mask, shift, shift_precision),
  -   width, sgn);
  -  *val = wi::ext (wi::lshift (r1val, shift, shift_precision),
  -  width, sgn);
  +  *mask = wi::ext (wi::lshift (r1mask, shift), width, sgn);
  +  *val = wi::ext (wi::lshift (r1val, shift), width, sgn);
   }
   }
   }
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend


Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 09:48:46AM +0100, Richard Biener wrote:
  --- gcc/tree-vrp.c.jj   2013-11-06 08:48:01.0 +0100
  +++ gcc/tree-vrp.c  2013-11-06 09:32:19.205104029 +0100
  @@ -92,6 +92,42 @@ static sbitmap *live;
   static bool
   live_on_edge (edge e, tree name)
   {
  +  if (live[e-dest-index] == NULL)
  +{
 
 I'd rather have live[] computed correctly than the following
 which I think is a hack ... as you say the issue is ordering
 (or rather that there isn't an order for CFG cycles unless
 we want to iterate).  For loop BB order we explicitely handle
 the latch, maybe just using a different order than
 RPO order, with special-casing the latch, makes more sense?

But is there any order that would help?  Sure, we could say just tweak
the rpo/bb_rpo arrays and put there latch bbs before any other bbs from the
same loop, but that would only help us in calling find_assert_locations_1
on the latch before other bbs in the loop.  But that call does nothing,
and we'd need to special case handling of the live for the latch block
anyway, just not in live_on_edge, but in find_assert_locations instead.

Is that what what you are looking for?

Because without that special casing of handling latch edges, you'd need to
process the header rather than latch first, and propagate through DFS_BACK
edges, but then you couldn't propagate through other edges, nor in this
testcase would be the latch live computed when processing the header which
has there the condition.

Jakub


Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Richard Biener
On Thu, 7 Nov 2013, Jakub Jelinek wrote:

 On Thu, Nov 07, 2013 at 09:48:46AM +0100, Richard Biener wrote:
   --- gcc/tree-vrp.c.jj 2013-11-06 08:48:01.0 +0100
   +++ gcc/tree-vrp.c2013-11-06 09:32:19.205104029 +0100
   @@ -92,6 +92,42 @@ static sbitmap *live;
static bool
live_on_edge (edge e, tree name)
{
   +  if (live[e-dest-index] == NULL)
   +{
  
  I'd rather have live[] computed correctly than the following
  which I think is a hack ... as you say the issue is ordering
  (or rather that there isn't an order for CFG cycles unless
  we want to iterate).  For loop BB order we explicitely handle
  the latch, maybe just using a different order than
  RPO order, with special-casing the latch, makes more sense?
 
 But is there any order that would help?  Sure, we could say just tweak
 the rpo/bb_rpo arrays and put there latch bbs before any other bbs from the
 same loop, but that would only help us in calling find_assert_locations_1
 on the latch before other bbs in the loop.  But that call does nothing,
 and we'd need to special case handling of the live for the latch block
 anyway, just not in live_on_edge, but in find_assert_locations instead.
 
 Is that what what you are looking for?
 
 Because without that special casing of handling latch edges, you'd need to
 process the header rather than latch first, and propagate through DFS_BACK
 edges, but then you couldn't propagate through other edges, nor in this
 testcase would be the latch live computed when processing the header which
 has there the condition.

I'm looking for adjusting of live compute - either by adjusting the
algorithm or by special casing the latches.  Like for example
with the following (untested, needs cleanup - the PHI processing
can be factored out from find_assert_locations_1 and re-used):

@@ -5904,6 +5898,27 @@ find_assert_locations (void)
   for (i = 0; i  rpo_cnt; ++i)
 bb_rpo[rpo[i]] = i;
 
+  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
+ the order we compute liveness and insert asserts we otherwise
+ fail to insert asserts into the loop latch.  */
+  loop_p loop;
+  loop_iterator li;
+  FOR_EACH_LOOP (li, loop, 0)
+{
+  i = loop-latch-index;
+  live[i] = sbitmap_alloc (num_ssa_names);
+  bitmap_clear (live[i]);
+  for (gimple_stmt_iterator gsi = gsi_start_phis (loop-header);
+  !gsi_end_p (gsi); gsi_next (gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ if (virtual_operand_p (gimple_phi_result (phi)))
+   continue;
+ for (unsigned j = 0; j  gimple_phi_num_args (phi); ++j)
+   if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME)
+ bitmap_set_bit (live[i], SSA_NAME_VERSION 
(gimple_phi_arg_def (phi, j)));
+   }
+}



Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Dodji Seketeli
I was a bit too hasty in committing this patch, thinking the review was
done.  I have reverted it yesterday night.

Let's try again after a night of sleep.

Bernd Edlinger bernd.edlin...@hotmail.de writes:

 I still do not see how this is supposed to work:

 If the previous invocation of get_line already had read some
 characters of the following line(s), how is that information
 recovered?

It's lost, you are right.  I felt under my radar that the file position
indicator adjustment that should be done in get_line after it finds a
line delimiter felt through the cracks in the transformation made from
the getline of glibc.  Hopefully in this version, the file position
indicator should be properly set when get_line finishes.

So am proposing two patches that I think should be committed in pair.

The first one is the one we've been discussing lately; it's amended to
make get_line remember update the file position indicator.

The second one is to factorize the two read_line (from gcov.c and
input.c) and make gcov.c just use one from input.c.

Nathan Sidwell nat...@acm.org writes:

 How is input.o linked into gcov?

I didn't changed the way input.o is linked into gcov.  I just launched
the debugger on some of the regression tests of gcov in the suite, to
see that the read_line from input.c was being picked up.  But now that
you are asking, here is what I see from gcc/Makefile.in:

input.o is part of OBJS-libcommon:

OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o 
input.o version.o

$(OBJS-libcommon) ends up in libcommon.a which itself is part of LIBS
(and LIBDEPS)

And the gcov executable is linked with LIBS as per:

GCOV_OBJS = gcov.o
gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@

 The former's a compiler proper source file. Furthermore, if it is
 linked in, doesn't it drag in a whole bunch of the compiler itself
 into gcov?

From the above, what I can say is that input.o was already linked with
gcov.  But I think it's minimal enough to only drag libcpp and the
diagnostic subsystem.

 I think that means read_line needs to be broken out into a separate
 source file -- or at least one that's already common between gcov
 and cc1 (etc).

Hopefully this concern should be addressed by the above.


 Changelog doesn't match diff. (function names wrong, no mention of gcov)

Woops, sorry about that.  The second patch attached is an update that
should address that.  And with the changes in the first patch, this
new factorization patch is smaller.

Jakub Jelinek ja...@redhat.com writes:

 On Thu, Nov 07, 2013 at 07:17:19AM +, Nathan Sidwell wrote:
 On 11/06/13 21:25, Dodji Seketeli wrote:
 It appeared that gcov.c and input.c both have a static function named
 read_line.  I guess we ought to keep just one.
 
 As a general rule, that's good.

 Generally yes, but IMHO not in this case, both because of the problems you
 are mentioning, because it already now needs to do quite different things
 and because I think in the future the version for caret diagnostics should
 do much more complicated things (caching of the lines for faster further
 caret diagnostics).  Furthermore, ftell/fseek aren't something that should
 be used unless strictly necessary.

OK, let's say I am sending these patches here just for the record so
that it doesn't get lost :-)  I'll devise another scheme right away to
avoid the ftell/seek that you don't like.

I agree that, as you proposed at the beginning of the initial thread,
the caret diagnostics system should avoid opening/closing one file just
to cycle over it to find one line, repeatedly.  I just thought that
would come as a later patch after we fix this PR specifically.

Cheers.

From 75cea3f6883f192e9472fd7905f42d84d875e4e7 Mon Sep 17 00:00:00 2001
From: dodji dodji@138bc75d-0d04-0410-961f-82ee72b054a4
Date: Wed, 6 Nov 2013 11:33:52 +
Subject: [PATCH 1/2] PR preprocessor/58580 - preprocessor goes OOM with
 warning for zero literals

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function do the
	actual line reading.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.


Re: patch to fix PR58784 (ARM LRA crash)

2013-11-07 Thread Yvan Roux
Hi,

I've tested LRA on ARM in several configuration and it occurs that
only one regression is observed in Fortran for targets A15 and A9 in
hard and soft fp amd Thumb mode enabled by default (ARMv5 is clean).
The failing test case is gfortran.dg/realloc_on_assign_11.f90 in -O3
and the error seems to be of the same nature of
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58784, in the sens that
LRA ICEs in check_rtl on an insn which doesn't satisfy its
constraints.  Here is the insn :

(insn 633 543 656 24 (parallel [
(set (mem/c:SI (plus:SI (reg/f:SI 907)
(const_int 8 [0x8])) [4 A.54+8 S4 A64])
(smax:SI (reg:SI 562 [ D.5235 ])
(reg:SI 621 [ D.5235 ])))
(clobber (reg:CC 100 cc))
]).../gfortran.dg/realloc_on_assign_11.f90:29 121 {*store_minmaxsi}
 (expr_list:REG_DEAD (reg:SI 621 [ D.5235 ])
(expr_list:REG_UNUSED (reg:CC 100 cc)
(nil

Thanks,
Yvan


On 31 October 2013 17:31, Yvan Roux yvan.r...@linaro.org wrote:
 (this time in plain text !)

 Does this mean we can now turn on LRA for ARM state by default and
 start looking at performance issues if any ?

 With the other patch for 58785 and a small one I've to post, we are even
 about to turn LRA on by default completely, as the compiler now bootstrap
 also in thumb and first testsuite are clean :-)

 I juste want ton pass a full validation before.

 Yvan


 Ramana





  2013-10-30  Vladimir Makarov  vmaka...@redhat.com
 
  PR target/58784
  * lra.c (check_rtl): Remove address check before LRA work.
 
  2013-10-30  Vladimir Makarov  vmaka...@redhat.com
 
  PR target/58784
  * gcc.target/arm/pr58784.c: New.
 


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-07 Thread Joey Ye
-  warning_at (DECL_SOURCE_LOCATION (p),
-   OPT_Wunused_but_set_variable,
-   variable %qD set but not used, p);
+  {
+if (!TREE_THIS_VOLATILE (p))
+  warning_at (DECL_SOURCE_LOCATION (p),
+OPT_Wunused_but_set_variable,
+variable %qD set but not used, p);
+  }
I'd prefer

else if (DECL_CONTEXT (p) == current_function_decl
 !TREE_THIS_VOLATILE (p))

which concises the logic and avoids indent change

Thanks,
Joey

On Thu, Nov 7, 2013 at 3:37 PM, Bin.Cheng amker.ch...@gmail.com wrote:
 On Thu, Nov 7, 2013 at 11:53 AM, Mingjie Xing mingjie.x...@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 You miss a testcase.

 Also why should the warning be omitted for unused automatic
 volatile variables?  They cannot be used in any way.

 Richard.

 Thanks.  I've updated the patch with a test case.

 c/ChangeLog

 PR 57258

 Should be:
PR c/57258

 Thanks,
 bin
 * c-decl.c (pop_scope): Don't emit unused variable warnings for
 accessed volatile variables.

 testsuite/ChangeLog

 * gcc.dg/Wunused-var-4.c: New test.

 Mingjie



 --
 Best Regards.


[rl78] Use canonical const_int for one_cmplqi2

2013-11-07 Thread Richard Sandiford
This patch just changes a QImode (const_int 255) to (const_int -1),
since the canonical form is to sign-extend.  As things stand the pattern
trips a new assert added on the wide-int branch.

Tested by building rl78-elf before and after the patch and making sure
that there were no changes in assembly code for gcc.c-torture, gcc.dg
and g++dg.  OK to install?

Thanks,
Richard


gcc/
* config/rl78/rl78-expand.md (one_cmplqi2): Use (const_int -1)
rather than (const_int 255).

Index: gcc/config/rl78/rl78-expand.md
===
--- gcc/config/rl78/rl78-expand.md  2013-10-15 19:24:15.821884272 +0100
+++ gcc/config/rl78/rl78-expand.md  2013-11-07 09:41:50.754226143 +
@@ -177,7 +177,7 @@ (define_expand xorqi3
 (define_expand one_cmplqi2
   [(set (match_operand:QI 0 nonimmediate_operand)
(xor:QI (match_operand:QI 1 general_operand)
-   (const_int 255)))
+   (const_int -1)))
]
   
   if (rl78_force_nonfar_2 (operands, gen_one_cmplqi2))


Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-07 Thread Ilya Enkovich
2013/11/6 Ilya Enkovich enkovich@gmail.com:
 2013/11/5 Jeff Law l...@redhat.com:
 On 11/04/13 06:27, Richard Biener wrote:

 On Thu, Oct 31, 2013 at 10:02 AM, Ilya Enkovich enkovich@gmail.com
 wrote:

 Hi,

 Here is a patch which hadles the problem with optimization of
 BUILT_IN_CHKP_ARG_BND calls.  Pointer Bounds Checker expects that argument
 of this call is a default SSA_NAME of the PARM_DECL whose bounds we want to
 get.  The problem is in optimizations which may replace arg with it's copy
 or a known value.  This patch suppress such modifications.


 This doesn't seem like a good fix.  I suppose you require the same on
 RTL, that is, have the incoming arg reg coalesced with the use?
 In that case better set SSA_NAME_OCCURS_IN_ABNORMAL_PHI.

 I agree.  This seems like the wrong direction.

 While I don't like abusing SSA_NAME_OCCURS_IN_ABNORMAL_PHI in this manner,
 setting that flag should give them precisely the behaviour they want.  That
 makes me think SSA_NAME_OCCURS_IN_ABNORMAL_PHI is poorly named.  But
 addressing that is separate from this patch.

 Ilya, can you convert your code to set SSA_NAME_OCCURS_IN_ABNORMAL_PHI and
 drop this patch?

 OK. I'll try SSA_NAME_OCCURS_IN_ABNORMAL_PHI approach and see if it
 causes any problems.

Seems flag works fine.  All my tests pass.  No need to make additional
changes in verifiers etc.  Fix will be included into instrumentation
pass patch.  This patch is dropped.

Thanks,
Ilya


 Thanks,
 Ilya



 Jeff



Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Richard Biener
On Wed, Nov 6, 2013 at 9:56 PM, Andrew MacLeod amacl...@redhat.com wrote:
 On 11/06/2013 06:31 AM, Richard Biener wrote:

 On Wed, Nov 6, 2013 at 12:02 PM, Bernd Schmidt ber...@codesourcery.com
 wrote:


 Maybe we need to revisit it? As one of those who were not in favour of
 the C++ move, can I ask you guys to step back for a moment and think
 about - what do all of these changes buy us, exactly? Imagine the state
 at the end, where everything is converted and supposedly the temporary
 ugliness is gone, what have we gained over the code as it is now?

 as_a gains us less runtime checking and more static type checking
 which is good.

 I still think all this effort is misdirected and distracts us from
 making changes that improve gcc for its users.

 That I agree to.  Instead of fixing the less than optimal separation /
 boundary
 between frontends and the middle-end, or fixing several other
 long-standing
 issues with GCC we spend a lot of time refactoring things to be C++.
 But that was kind of part of the decision (though I remember that we
 mainly wanted to convert containters and isolated stuff, not gimple
 or trees (I bet that'll be next)).

 Of course I don't see contributors of changes that improve gcc for its
 users
 now wasting their time with converting code to C++.  That conversion
 may slow down those people, but only so much.  It'll get more interesting
 with branch maintainance ...


 What does it buy us?  I won't comment on the merit of other changes, but I
 will reiterate what I'm doing.

 1 - I think the code restructuring  to enable the changes I proposed is
 quite worthwhile.  Our includes are a mess and prototypes locations are all
 over the map. I think this is a worthwhile cleanup that I probably wouldn't
 be doing otherwise.

True.

 2 - I really believe gimple needs a type system different from front end
 trees, that is my primary motivation.  I'm tired of jumping through hoops to
 do anything slightly different, and I got fed up with it.  With a separate
 type system for gimple, we can rid ourselves of all the stuff that isn't
 related to optimization and codegen...   Could we do this with trees?
 possibly, but if I'm going to go to the effort of converting the front end
 tree types into a new or reduced-type subset, I might as well put that
 effort into something that is more appropriate right from the start.

I'm not sure.  Especially the idea of wrapping everything in a
I'm-not-a-tree-well-but-I-really-still-am scares the hell out of me.
And I put that into the very same basket of things you now complain about.

 3 - Trees are very polymorphic and as a result, its difficult to tell where
 types are being used.

It's a feature.  Now the type vs. object thing may be something to
change (historically the reason they are entangled is that things
like a vector and a list also were trees - which is a feature).

 By replacing those uses with an actual type, we'll get
 static type checking which is another improvement.   There is only one way
 to move the middle/back end off of types as trees that I can see.  Go
 through the source base and find all the places which use trees as types,
 and replace them with an actual type.  This can't be done all at once, its
 too big a job...  So I propose a plan which allows a new gimple type and
 current trees to coexist.  Once the conversion is complete we can easily
 implement a type system that has the features and traits we want.

Note that on 'gimple' there are no types ... the types are on the objects
'gimple' refers to and those are 'tree's.  So you'd need to change the
'tree's to have TREE_TYPE not be a 'tree'.  Or simply subclass
'tree' so you can distinguish a tree type from a tree non-type statically
(a bit awkward with the current union setup).

But maybe I'm confusing what you call 'type'.

 4 - I realized that if I am going through the entire code base to find all
 the type locations, its very little extra work to also do the same for decls
 and expressions. Then we'll have static type checking for those as well.

 5 - When done we will be significantly closer to having the ability to
 create a self contained IL that we can strictly define...  That alone seems
 a worthwhile goal to me.  The IL then becomes appropriate for streaming at
 any point in during compilation.  This means any front end or tool can
 easily generate or consume a gimple stream which opens up numerous
 opportunities...  That is pretty hard to enable today.

Have fun with dealing with our legacy frontends ;)  Or didn't you
actually mean any point during compilation? ... oh, you talk about
GIMPLE.  So no plans for Frontend ASTs aka GENERIC for our legacy
frontends?

 Im going to do the initial bits on a branch for the first couple of months,
 and during that time convert a number of files over to the new gimple
 interface.  Then we can look at it, measure it, comment on it, tweak it,
 whatever.  You can see exactly what we'll be getting into before we fully
 

Re: PATCH: PR target/59034: FAIL gcc.c-torture/compile/20031208-1.c

2013-11-07 Thread H.J. Lu
On Wed, Nov 6, 2013 at 11:42 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Thu, Nov 7, 2013 at 7:19 AM, H.J. Lu hongjiu...@intel.com wrote:

 We should use Pmode with stack_pointer_rtx.  OK for trunk and 4.8
 branch?

 OK everywhere.


I checked it into trunk and 4.8 branch.  For 4.7, I only checked
in the i386.md change since the testcases use the new
-maddress-mode= in 4.8.

Thanks.

-- 
H.J.


Re: [PATCH, i386]: Fix PR 59021, new vzeroupper instructions generated with -mavx

2013-11-07 Thread Uros Bizjak
On Wed, Nov 6, 2013 at 11:06 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Attached patch fixes PR 59021, where new vzeroupper instructions are
 generated for -mavx after Vlad's useless insn removal patch.

 The problem was, that we depent on (useless) moves to AVX256 function
 argument registers in front of the function call to switch the mode to
 DIRTY mode. This is not true anymore, so call_insn has to switch to
 DIRTY mode by itself.

 Thanks for fixing this!

OTOH, looking a bit deeper, it looks that there is a problem in
mode-switching infrastructure. If we have a BB without any mode
requirements, but an insn that switched the mode via MODE_AFTER, we
should not mark the block as transparent. Indeed, we even have a N.B.
comment in mode-switching.c:

  /* Check for blocks without ANY mode requirements.
 N.B. because of MODE_AFTER, last_mode might still be different
 from no_mode.  */

But, we do nothing w.r.t to transparency

Attached incremental patch also fixes additional vzeroupper problem.
Calls with AVX argument register in fact do not have any mode
requirements, so we don't need to fake MODE_DIRTY requirement.

I have added Joern to the discussion.

Uros.
Index: mode-switching.c
===
--- mode-switching.c(revision 204496)
+++ mode-switching.c(working copy)
@@ -577,6 +577,8 @@ optimize_mode_switching (void)
{
  ptr = new_seginfo (no_mode, BB_END (bb), bb-index, live_now);
  add_seginfo (info + bb-index, ptr);
+ if (last_mode != no_mode)
+   bitmap_clear_bit (transp[bb-index], j);
}
}
 #if defined (MODE_ENTRY)  defined (MODE_EXIT)
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 204496)
+++ config/i386/i386.c  (working copy)
@@ -15708,7 +15708,7 @@ ix86_avx_u128_mode_needed (rtx insn)
  rtx arg = XEXP (XEXP (link, 0), 0);
 
  if (ix86_check_avx256_register (arg, NULL))
-   return AVX_U128_DIRTY;
+   return AVX_U128_ANY;
}
}
 


Re: Re-factor tree.h - Part 1

2013-11-07 Thread Richard Biener
On Wed, Nov 6, 2013 at 6:10 PM, Diego Novillo dnovi...@google.com wrote:
 On Wed, Nov 6, 2013 at 2:04 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Tue, 5 Nov 2013, Diego Novillo wrote:

 This is the first patch in a series of patches to cleanup tree.h to
 reduce the exposure it has all over the compiler.

 In this patch, I'm moving functions that are used once into the files
 that use them, and make them private to that file. These functions
 were declared extern in tree.h and called from exactly one place.


 I am not a big fan of doing it so automatically. For instance
 widest_int_cst_value should imho remain next to int_cst_value since they
 mostly share the same implementation. Doing this also doesn't promote code
 reuse: if I am looking for a function that does some basic operation on
 trees, I won't only need to look in the file that is semantically relevant,

 Yeah, that was what I was trying to impose. Functions that
 semantically make better sense in the place where they're called from.
 I filtered functions that were used once but would not make much sense
 to move (for instance, the init functions).

 How about this. Below is the list of functions that I took out of
 tree.h.  They are either not used or used only once, in which case I
 moved them (and their private transitive closure) over to the file
 that uses them.  There are more, but these are the ones that I
 originally considered to be too specific to the user file to be
 globally exposed.  Which ones would you folks keep global?

 I have marked some that we may decide to keep extern and one, which I
 would not mind to not move at all.  I also marked the functions I
 removed and the ones that I just made private to their definition
 file:

 Moved to builtins.c:
 more_const_call_expr_args_p
 expand_stack_restore
 expand_stack_save

Makes sense.

 Moved to cfgexpand.c:
 expand_main_function
 stack_protect_prologue
 expand_asm_stmt
 expand_computed_goto
 expand_goto
 expand_return

Likewise.

 Moved to cgraphclones.c:
 build_function_decl_skip_args

Likewise.

 Moved to explow.c:
 tree_expr_size (maybe keep extern?)

Weird enough function, no need to export it.

 Moved to expr.c:
 fields_length

Likewise.

 Moved to fold-const.c:
 size_low_cst

Same as int_cst_value just with lack of any checking?

 Moved to gimple-fold.c:
 truth_type_for (maybe keep extern?)

Uses a langhook so should stay here.

 Moved to symtab.c:
 decl_assembler_name_hash
 decl_assembler_name_equal

Ok.

 Moved to trans-mem.c:
 is_tm_safe_or_pure

Ok.

 Moved to tree-eh.c:
 in_array_bounds_p
 range_in_array_bounds_p

I'm not sure ...

 Moved to tree-ssa-dom.c:
 iterative_hash_exprs_commutative

I'm not sure - it seems that iterative_hash_expr does exactly
the same thing?  So instead of moving things this should
be refactored (make that function re-usable from iterative_hash_expr).

 Moved to tree-ssa-math-opts.c:
 widest_int_cst_value (maybe keep extern?)

Keep it in tree.c

 Moved to tree-vrp.c:
 fold_unary_to_constant (?)

No.

 Moved to cp/call.c
 ctor_to_vec

No.

 Moved to cp/decl.c:
 supports_one_only
 chain_member

No.

 Moved to java/class.c:
 build_method_type

No.

 Removed (not used anywhere)
 build_type_no_quals
 omp_remove_redundant_declare_simd_attrs
 fold_build3_initializer_loc
 real_twop
 print_vec_tree
 list_equal_p
 ssa_name_nonnegative_p
 addr_expr_of_non_mem_decl_p
 save_vtable_map_decl

 Made static (only used in its defining file)
 stabilize_reference_1
 tree_expr_nonzero_p
 tree_invalid_nonnegative_warnv_p
 tree_expr_nonzero_warnv_p
 fold_builtin_snprintf_chk
 validate_arglist
 simple_cst_list_equal
 lookup_scoped_attribute_spec
 get_attribute_namespace
 fini_object_sizes

Those are obvious.

Richard.


 Thanks.  Diego.


Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart

2013-11-07 Thread Basile Starynkevitch
On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote:
 
 I decided to name the new file gimple-expr.[ch] instead of
 gimple-decl   This will eventually split into gimple-type.[ch],
 gimple-decl.[ch], and gimple-expr.[ch].  


Since we are adding *new* C++ files, can't we please name them *.cc 
for the implementation part, so at least create gimple-expr.h and 
gimple-expr.cc but not gimple-expr.c, please!

There are some reasons to keep existing *.c files containing C++ code
(IMHO the reasons are bad ones, and related to poor habits and to 
deficiencies in the version control system we have to use, 
but I really don't want to open that debate again).

But for **NEW** files which are definitely in C++, I don't understand 
why they should be named .c files; this is confusing for all (and, 
for instance, when compiling them with Clang we are getting -IMHO 
rightly- some warnings about the file naming).

If I remember well, there have been (in the discussion about naming C++ source 
files of GCC) a suggestion (and perhaps even a consensus), 
probably by Diego Novillo, to name *.cc our new files which are in C++.


Having old C++ files named *.c is already a big frustration 
(most editors are by default configured to handle them as C, not as C++, files
and most developers, notably newbies to GCC contributions, 
are expecting them to be C files not C++ ones). 

But IMHO having new source files inside GCC coded in C++ with a *.c file 
extension
is non-sense; such files should have a .cc (or maybe .cpp or .cxx) extension, 
not a .c extension.

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: [PATCH]Fix computation of offset in ivopt

2013-11-07 Thread Richard Biener
On Wed, Nov 6, 2013 at 6:06 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Hi,

 bin.cheng bin.ch...@arm.com writes:
 Index: gcc/tree-ssa-loop-ivopts.c
 ===
 --- gcc/tree-ssa-loop-ivopts.c(revision 203267)
 +++ gcc/tree-ssa-loop-ivopts.c(working copy)
 @@ -2037,12 +2037,12 @@ find_interesting_uses (struct ivopts_data *data)

  static tree
  strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
 - unsigned HOST_WIDE_INT *offset)
 + HOST_WIDE_INT *offset)
  {
tree op0 = NULL_TREE, op1 = NULL_TREE, tmp, step;
enum tree_code code;
tree type, orig_type = TREE_TYPE (expr);
 -  unsigned HOST_WIDE_INT off0, off1, st;
 +  HOST_WIDE_INT off0, off1, st;
tree orig_expr = expr;

STRIP_NOPS (expr);
 @@ -2133,19 +2133,32 @@ strip_offset_1 (tree expr, bool inside_addr, bool
break;

  case COMPONENT_REF:
 -  if (!inside_addr)
 - return orig_expr;
 +  {
 + tree field;

 -  tmp = component_ref_field_offset (expr);
 -  if (top_compref
 -cst_and_fits_in_hwi (tmp))
 - {
 -   /* Strip the component reference completely.  */
 -   op0 = TREE_OPERAND (expr, 0);
 -   op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
 -   *offset = off0 + int_cst_value (tmp);
 -   return op0;
 - }
 + if (!inside_addr)
 +   return orig_expr;
 +
 + tmp = component_ref_field_offset (expr);
 + field = TREE_OPERAND (expr, 1);
 + if (top_compref
 +  cst_and_fits_in_hwi (tmp)
 +  cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))

 While comparing output for wide-int and mainline, I noticed that
 this condition is now always false on x86_64, since DECL_FIELD_BIT_OFFSET
 is a 128-bit bitsizetype and since cst_and_fits_in_hwi rejects constants
 with precision greater than HWI:

   if (TREE_CODE (x) != INTEGER_CST)
 return false;

   if (TYPE_PRECISION (TREE_TYPE (x))  HOST_BITS_PER_WIDE_INT)
 return false;

I think that's simply overly restrictive on the side of cst_and_fits_in_hwi.
I suppose this function is meant to complement int_cst_value (digging
in history might be nice here).

 Should this be host_integerp (DECL_FIELD_BIT_OFFSET (field), 0) instead?

That's not the same as it rejects -1U.  The function seems to ask
if the value, if casted to unsigned fits in a HOST_WIDE_INT.

So just drop the precision check from cst_and_fits_in_hwi.

Richard.


 +   {
 + HOST_WIDE_INT boffset, abs_off;
 +
 + /* Strip the component reference completely.  */
 + op0 = TREE_OPERAND (expr, 0);
 + op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
 + boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
 + abs_off = abs_hwi (boffset) / BITS_PER_UNIT;
 + if (boffset  0)
 +   abs_off = -abs_off;
 +
 + *offset = off0 + int_cst_value (tmp) + abs_off;
 + return op0;
 +   }
 +  }
break;

 Thanks,
 Richard


Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 10:32:10AM +0100, Richard Biener wrote:
 I'm looking for adjusting of live compute - either by adjusting the
 algorithm or by special casing the latches.  Like for example
 with the following (untested, needs cleanup - the PHI processing
 can be factored out from find_assert_locations_1 and re-used):
 
 @@ -5904,6 +5898,27 @@ find_assert_locations (void)
for (i = 0; i  rpo_cnt; ++i)
  bb_rpo[rpo[i]] = i;
  
 +  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
 + the order we compute liveness and insert asserts we otherwise
 + fail to insert asserts into the loop latch.  */
 +  loop_p loop;
 +  loop_iterator li;
 +  FOR_EACH_LOOP (li, loop, 0)
 +{
 +  i = loop-latch-index;
 +  live[i] = sbitmap_alloc (num_ssa_names);
 +  bitmap_clear (live[i]);
 +  for (gimple_stmt_iterator gsi = gsi_start_phis (loop-header);
 +  !gsi_end_p (gsi); gsi_next (gsi))
 +   {
 + gimple phi = gsi_stmt (gsi);
 + if (virtual_operand_p (gimple_phi_result (phi)))
 +   continue;
 + for (unsigned j = 0; j  gimple_phi_num_args (phi); ++j)
 +   if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME)
 + bitmap_set_bit (live[i], SSA_NAME_VERSION 
 (gimple_phi_arg_def (phi, j)));
 +   }
 +}

LGTM, except that I think we should only care about phi args from the
latch - header edge, not others and IMHO it is memory waste to allocate
sbitmap even when we wouldn't add anything.  So, I'm going to
rebootstrap/regtest the following (including new testcase Jeff asked for):

2013-11-07  Richard Biener  rguent...@suse.de
Jakub Jelinek  ja...@redhat.com

* tree-vrp.c (find_assert_locations): Pre-seed live bitmaps for loop
latches from header PHI arguments from the latch edge.

* gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
* gcc.dg/unroll_2.c: Likewise.
* gcc.dg/unroll_3.c: Likewise.
* gcc.dg/unroll_4.c: Likewise.
* gcc.dg/vrp90.c: New test.

--- gcc/tree-vrp.c.jj   2013-11-06 16:58:04.675678567 +0100
+++ gcc/tree-vrp.c  2013-11-07 10:59:38.841283931 +0100
@@ -5906,6 +5906,34 @@ find_assert_locations (void)
   for (i = 0; i  rpo_cnt; ++i)
 bb_rpo[rpo[i]] = i;
 
+  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
+ the order we compute liveness and insert asserts we otherwise
+ fail to insert asserts into the loop latch.  */
+  loop_p loop;
+  loop_iterator li;
+  FOR_EACH_LOOP (li, loop, 0)
+{
+  i = loop-latch-index;
+  unsigned int j = find_edge (loop-latch, loop-header)-dest_idx;
+  for (gimple_stmt_iterator gsi = gsi_start_phis (loop-header);
+  !gsi_end_p (gsi); gsi_next (gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ if (virtual_operand_p (gimple_phi_result (phi)))
+   continue;
+ tree arg = gimple_phi_arg_def (phi, j);
+ if (TREE_CODE (arg) == SSA_NAME)
+   {
+ if (live[i] == NULL)
+   {
+ live[i] = sbitmap_alloc (num_ssa_names);
+ bitmap_clear (live[i]);
+   }
+ bitmap_set_bit (live[i], SSA_NAME_VERSION (arg));
+   }
+   }
+}
+
   need_asserts = false;
   for (i = rpo_cnt - 1; i = 0; --i)
 {
--- gcc/testsuite/gcc.dg/unroll_1.c.jj  2013-09-09 11:32:36.0 +0200
+++ gcc/testsuite/gcc.dg/unroll_1.c 2013-11-06 17:10:52.900722932 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
-fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll } */
+/* { dg-options -O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
-fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli 
-fenable-rtl-loop2_unroll } */
 
 unsigned a[100], b[100];
 inline void bar()
--- gcc/testsuite/gcc.dg/unroll_2.c.jj  2013-08-30 14:38:39.0 +0200
+++ gcc/testsuite/gcc.dg/unroll_2.c 2013-11-06 17:10:30.751845504 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
-fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
-fenable-rtl-loop2_unroll } */
+/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp 
-fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
-fenable-rtl-loop2_unroll } */
 
 unsigned a[100], b[100];
 inline void bar()
--- gcc/testsuite/gcc.dg/unroll_3.c.jj  2013-08-30 14:38:39.0 +0200
+++ gcc/testsuite/gcc.dg/unroll_3.c 2013-11-06 17:10:38.864800338 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
-fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll=foo } 
*/
+/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp 
-fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll=foo } 
*/
 
 unsigned a[100], b[100];
 inline void bar()
--- 

Re: [PATCH] Improve VRP assert creation for loops

2013-11-07 Thread Richard Biener
On Thu, 7 Nov 2013, Jakub Jelinek wrote:

 On Thu, Nov 07, 2013 at 10:32:10AM +0100, Richard Biener wrote:
  I'm looking for adjusting of live compute - either by adjusting the
  algorithm or by special casing the latches.  Like for example
  with the following (untested, needs cleanup - the PHI processing
  can be factored out from find_assert_locations_1 and re-used):
  
  @@ -5904,6 +5898,27 @@ find_assert_locations (void)
 for (i = 0; i  rpo_cnt; ++i)
   bb_rpo[rpo[i]] = i;
   
  +  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
  + the order we compute liveness and insert asserts we otherwise
  + fail to insert asserts into the loop latch.  */
  +  loop_p loop;
  +  loop_iterator li;
  +  FOR_EACH_LOOP (li, loop, 0)
  +{
  +  i = loop-latch-index;
  +  live[i] = sbitmap_alloc (num_ssa_names);
  +  bitmap_clear (live[i]);
  +  for (gimple_stmt_iterator gsi = gsi_start_phis (loop-header);
  +  !gsi_end_p (gsi); gsi_next (gsi))
  +   {
  + gimple phi = gsi_stmt (gsi);
  + if (virtual_operand_p (gimple_phi_result (phi)))
  +   continue;
  + for (unsigned j = 0; j  gimple_phi_num_args (phi); ++j)
  +   if (TREE_CODE (gimple_phi_arg_def (phi, j)) == SSA_NAME)
  + bitmap_set_bit (live[i], SSA_NAME_VERSION 
  (gimple_phi_arg_def (phi, j)));
  +   }
  +}
 
 LGTM, except that I think we should only care about phi args from the
 latch - header edge, not others and IMHO it is memory waste to allocate
 sbitmap even when we wouldn't add anything.  So, I'm going to
 rebootstrap/regtest the following (including new testcase Jeff asked for):
 
 2013-11-07  Richard Biener  rguent...@suse.de
   Jakub Jelinek  ja...@redhat.com
 
   * tree-vrp.c (find_assert_locations): Pre-seed live bitmaps for loop
   latches from header PHI arguments from the latch edge.
 
 * gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
 * gcc.dg/unroll_2.c: Likewise.
 * gcc.dg/unroll_3.c: Likewise.
 * gcc.dg/unroll_4.c: Likewise.
   * gcc.dg/vrp90.c: New test.
 
 --- gcc/tree-vrp.c.jj 2013-11-06 16:58:04.675678567 +0100
 +++ gcc/tree-vrp.c2013-11-07 10:59:38.841283931 +0100
 @@ -5906,6 +5906,34 @@ find_assert_locations (void)
for (i = 0; i  rpo_cnt; ++i)
  bb_rpo[rpo[i]] = i;
  
 +  /* Pre-seed loop latch liveness from loop header PHI nodes.  Due to
 + the order we compute liveness and insert asserts we otherwise
 + fail to insert asserts into the loop latch.  */
 +  loop_p loop;
 +  loop_iterator li;
 +  FOR_EACH_LOOP (li, loop, 0)
 +{
 +  i = loop-latch-index;
 +  unsigned int j = find_edge (loop-latch, loop-header)-dest_idx;

Actually there is only one edge from latch to header by definition,
so single_succ_edge (loop-latch) would be better?

Otherwise ok.

Thanks,
Richard.

 +  for (gimple_stmt_iterator gsi = gsi_start_phis (loop-header);
 +!gsi_end_p (gsi); gsi_next (gsi))
 + {
 +   gimple phi = gsi_stmt (gsi);
 +   if (virtual_operand_p (gimple_phi_result (phi)))
 + continue;
 +   tree arg = gimple_phi_arg_def (phi, j);
 +   if (TREE_CODE (arg) == SSA_NAME)
 + {
 +   if (live[i] == NULL)
 + {
 +   live[i] = sbitmap_alloc (num_ssa_names);
 +   bitmap_clear (live[i]);
 + }
 +   bitmap_set_bit (live[i], SSA_NAME_VERSION (arg));
 + }
 + }
 +}
 +
need_asserts = false;
for (i = rpo_cnt - 1; i = 0; --i)
  {
 --- gcc/testsuite/gcc.dg/unroll_1.c.jj2013-09-09 11:32:36.0 
 +0200
 +++ gcc/testsuite/gcc.dg/unroll_1.c   2013-11-06 17:10:52.900722932 +0100
 @@ -1,5 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options -O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
 -fdisable-tree-cunroll -fdisable-tree-cunrolli -fenable-rtl-loop2_unroll } */
 +/* { dg-options -O2 -fdump-rtl-loop2_unroll=stderr -fno-peel-loops 
 -fno-tree-vrp -fdisable-tree-cunroll -fdisable-tree-cunrolli 
 -fenable-rtl-loop2_unroll } */
  
  unsigned a[100], b[100];
  inline void bar()
 --- gcc/testsuite/gcc.dg/unroll_2.c.jj2013-08-30 14:38:39.0 
 +0200
 +++ gcc/testsuite/gcc.dg/unroll_2.c   2013-11-06 17:10:30.751845504 +0100
 @@ -1,5 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
 -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
 -fenable-rtl-loop2_unroll } */
 +/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops -fno-tree-vrp 
 -fdisable-tree-cunroll=foo -fdisable-tree-cunrolli=foo 
 -fenable-rtl-loop2_unroll } */
  
  unsigned a[100], b[100];
  inline void bar()
 --- gcc/testsuite/gcc.dg/unroll_3.c.jj2013-08-30 14:38:39.0 
 +0200
 +++ gcc/testsuite/gcc.dg/unroll_3.c   2013-11-06 17:10:38.864800338 +0100
 @@ -1,5 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options -O2 -fdump-rtl-loop2_unroll -fno-peel-loops 
 

[PATCH] Fix casting in get_ref_base_and_extent

2013-11-07 Thread Richard Biener

As noticed by Richard.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-07  Richard Biener  rguent...@suse.de

* tree-dfa.c (get_ref_base_and_extent): Fix casting.

Index: gcc/tree-dfa.c
===
--- gcc/tree-dfa.c  (revision 204458)
+++ gcc/tree-dfa.c  (working copy)
@@ -569,7 +569,7 @@ get_ref_base_and_extent (tree exp, HOST_
   (!bit_offset.fits_shwi ()
  || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
  || (bit_offset.to_shwi () + maxsize
- == (signed) TREE_INT_CST_LOW
+ == (HOST_WIDE_INT) TREE_INT_CST_LOW
(TYPE_SIZE (TREE_TYPE (exp))
maxsize = -1;
 
@@ -606,7 +606,8 @@ get_ref_base_and_extent (tree exp, HOST_
(!bit_offset.fits_shwi ()
  || !host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
  || (bit_offset.to_shwi () + maxsize
- == (signed) TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp))
+ == (HOST_WIDE_INT) TREE_INT_CST_LOW
+  (TYPE_SIZE (TREE_TYPE (exp))
 maxsize = -1;
 
  done:


[PATCH][2/2] Get TREE_OVERFLOW somewhat under control

2013-11-07 Thread Richard Biener

This fixes CCP and DOM to not spread TREE_OVERFLOW eventually set
from fold.  I've chosen to change the lattice updating which is
easiest.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-07  Richard Biener  rguent...@suse.de

* tree-ssa-ccp.c (canonicalize_float_value): Rename to ...
(canonicalize_value): ... this.  Also handle stripping of
TREE_OVERFLOW.
(get_value, set_lattice_value, get_value_for_expr): Adjust.
* gimple-fold.c (canonicalize_constructor_val): Strip
TREE_OVERFLOW.
* tree-ssa-threadedge.c (set_ssa_name_value): Likewise.

Index: gcc/tree-ssa-ccp.c
===
--- gcc/tree-ssa-ccp.c  (revision 204504)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -168,7 +168,7 @@ typedef struct prop_value_d prop_value_t
 static prop_value_t *const_val;
 static unsigned n_const_val;
 
-static void canonicalize_float_value (prop_value_t *);
+static void canonicalize_value (prop_value_t *);
 static bool ccp_fold_stmt (gimple_stmt_iterator *);
 
 /* Dump constant propagation value VAL to file OUTF prefixed by PREFIX.  */
@@ -326,7 +326,7 @@ get_value (tree var)
   if (val-lattice_val == UNINITIALIZED)
 *val = get_default_value (var);
 
-  canonicalize_float_value (val);
+  canonicalize_value (val);
 
   return val;
 }
@@ -378,17 +378,24 @@ set_value_varying (tree var)
  that HONOR_NANS is false, and we try to change the value of x to 0,
  causing an ICE.  With HONOR_NANS being false, the real appearance of
  NaN would cause undefined behavior, though, so claiming that y (and x)
- are UNDEFINED initially is correct.  */
+ are UNDEFINED initially is correct.
+
+  For other constants, make sure to drop TREE_OVERFLOW.  */
 
 static void
-canonicalize_float_value (prop_value_t *val)
+canonicalize_value (prop_value_t *val)
 {
   enum machine_mode mode;
   tree type;
   REAL_VALUE_TYPE d;
 
-  if (val-lattice_val != CONSTANT
-  || TREE_CODE (val-value) != REAL_CST)
+  if (val-lattice_val != CONSTANT)
+return;
+
+  if (TREE_OVERFLOW_P (val-value))
+val-value = drop_tree_overflow (val-value);
+
+  if (TREE_CODE (val-value) != REAL_CST)
 return;
 
   d = TREE_REAL_CST (val-value);
@@ -454,7 +461,7 @@ set_lattice_value (tree var, prop_value_
   /* We can deal with old UNINITIALIZED values just fine here.  */
   prop_value_t *old_val = const_val[SSA_NAME_VERSION (var)];
 
-  canonicalize_float_value (new_val);
+  canonicalize_value (new_val);
 
   /* We have to be careful to not go up the bitwise lattice
  represented by the mask.
@@ -569,7 +576,7 @@ get_value_for_expr (tree expr, bool for_
   val.lattice_val = CONSTANT;
   val.value = expr;
   val.mask = double_int_zero;
-  canonicalize_float_value (val);
+  canonicalize_value (val);
 }
   else if (TREE_CODE (expr) == ADDR_EXPR)
 val = get_value_from_alignment (expr);
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 204504)
+++ gcc/gimple-fold.c   (working copy)
@@ -205,6 +205,8 @@ canonicalize_constructor_val (tree cval,
cval = fold_convert (TREE_TYPE (orig_cval), cval);
   return cval;
 }
+  if (TREE_OVERFLOW_P (cval))
+return drop_tree_overflow (cval);
   return orig_cval;
 }
 
Index: gcc/tree-ssa-threadedge.c
===
--- gcc/tree-ssa-threadedge.c   (revision 204504)
+++ gcc/tree-ssa-threadedge.c   (working copy)
@@ -58,6 +58,8 @@ set_ssa_name_value (tree name, tree valu
 {
   if (SSA_NAME_VERSION (name) = ssa_name_values.length ())
 ssa_name_values.safe_grow_cleared (SSA_NAME_VERSION (name) + 1);
+  if (value  TREE_OVERFLOW_P (value))
+value = drop_tree_overflow (value);
   ssa_name_values[SSA_NAME_VERSION (name)] = value;
 }
 


[PATCH, MPX, 2/X] Pointers Checker. Remove checker language hook.

2013-11-07 Thread Ilya Enkovich
Hi,

Here is a patch to remove language hook used by Pointer Bounds Checker.  To 
disable checker on non C languages option is moved to c.opt.

Thanks,
Ilya
--
gcc/

2013-11-06  Ilya Enkovich  ilya.enkov...@intel.com

* common.opt (fcheck-pointer-bounds): Move to ...
* c-family/c.opt: ... here.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): Remove.
(LANG_HOOKS_INITIALIZER): Remove LANG_HOOKS_CHKP_SUPPORTED.
* langhooks.h (lang_hooks): Remove chkp_supported field.
* toplev.c (process_options): Remove chkp_supported check.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index b862eb9..c082bac 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -850,6 +850,11 @@ fcanonical-system-headers
 C ObjC C++ ObjC++
 Where shorter, use canonicalized paths to systems headers.
 
+fcheck-pointer-bounds
+C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds)
+Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
+control instrumentation.
+
 fcilkplus
 C ObjC C++ ObjC++ LTO Report Var(flag_enable_cilkplus) Init(0)
 Enable Cilk Plus
diff --git a/gcc/common.opt b/gcc/common.opt
index 5c2f56e..deeb3f2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -874,11 +874,6 @@ fbounds-check
 Common Report Var(flag_bounds_check)
 Generate code to check bounds before indexing arrays
 
-fcheck-pointer-bounds
-Common Report Var(flag_check_pointer_bounds)
-Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
-control instrumentation.  Currently available for C, C++ and ObjC.
-
 fbranch-count-reg
 Common Report Var(flag_branch_on_count_reg) Init(1) Optimization
 Replace add, compare, branch with branch on count register
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 0597036..b7be472 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -118,7 +118,6 @@ extern bool lhd_omp_mappable_type (tree);
 #define LANG_HOOKS_BLOCK_MAY_FALLTHRU  hook_bool_const_tree_true
 #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP  false
 #define LANG_HOOKS_DEEP_UNSHARING  false
-#define LANG_HOOKS_CHKP_SUPPORTED  false
 
 /* Attribute hooks.  */
 #define LANG_HOOKS_ATTRIBUTE_TABLE NULL
@@ -306,8 +305,7 @@ extern void lhd_end_section (void);
   LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
   LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
   LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
-  LANG_HOOKS_DEEP_UNSHARING, \
-  LANG_HOOKS_CHKP_SUPPORTED \
+  LANG_HOOKS_DEEP_UNSHARING \
 }
 
 #endif /* GCC_LANG_HOOKS_DEF_H */
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 002d7eb..a83bf7b 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -472,9 +472,6 @@ struct lang_hooks
  gimplification.  */
   bool deep_unsharing;
 
-  /* True if this language allows pointers checker instrumentation.  */
-  bool chkp_supported;
-
   /* Whenever you add entries here, make sure you adjust langhooks-def.h
  and langhooks.c accordingly.  */
 };
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 0eaf081..d3dac07 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1286,9 +1286,6 @@ process_options (void)
 {
   if (targetm.chkp_bound_mode () == VOIDmode)
error (-fcheck-pointers is not supported for this target);
-
-  if (!lang_hooks.chkp_supported)
-   flag_check_pointer_bounds = 0;
 }
 
   /* One region RA really helps to decrease the code size.  */


[PATCH][AArch64][committed] Add comment on why plus_constant is not used in aarch64_legitimize_reload_address

2013-11-07 Thread Kyrill Tkachov

Hi all,

I've committed this patch as obvious. It adds a comment in 
aarch64_legitimize_reload_address explaining that we need the RTL structure to 
be preserved before push_reload and using plus_constant would fold that, causing 
ICEs.


Thanks,
Kyrill

[gcc/]
2013-11-07  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_legitimize_reload_address):
Explain why plus_constant is not used.diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ee2cb4c..3fe70c1 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4007,7 +4007,11 @@ aarch64_legitimize_reload_address (rtx *x_p,
 	cst = force_const_mem (xmode, cst);
 
   /* Reload high part into base reg, leaving the low part
-	 in the mem instruction.  */
+	 in the mem instruction.
+	 Note that replacing this gen_rtx_PLUS with plus_constant is
+	 wrong in this case because we rely on the
+	 (plus (plus reg c1) c2) structure being preserved so that
+	 XEXP (*p, 0) in push_reload below uses the correct term.  */
   x = gen_rtx_PLUS (xmode,
 			gen_rtx_PLUS (xmode, XEXP (x, 0), cst),
 			GEN_INT (low));

Re: [PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification

2013-11-07 Thread Ilya Enkovich
Hi,

Here is an updated patch version.

Thanks,
Ilya
--
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 52d9ab0..9d7ae85 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -3030,40 +3030,54 @@ gimple_check_call_args (gimple stmt, tree fndecl, bool 
args_count_match)
 {
   for (i = 0, p = DECL_ARGUMENTS (fndecl);
   i  nargs;
-  i++, p = DECL_CHAIN (p))
+  i++)
{
- tree arg;
+ tree arg = gimple_call_arg (stmt, i);
+
+ /* Skip bound args inserted by Pointer Bounds Checker.  */
+ if (POINTER_BOUNDS_P (arg))
+   continue;
+
  /* We cannot distinguish a varargs function from the case
 of excess parameters, still deferring the inlining decision
 to the callee is possible.  */
  if (!p)
break;
- arg = gimple_call_arg (stmt, i);
+
  if (p == error_mark_node
  || arg == error_mark_node
  || (!types_compatible_p (DECL_ARG_TYPE (p), TREE_TYPE (arg))
   !fold_convertible_p (DECL_ARG_TYPE (p), arg)))
 return false;
+
+ p = DECL_CHAIN (p);
}
   if (args_count_match  p)
return false;
 }
   else if (parms)
 {
-  for (i = 0, p = parms; i  nargs; i++, p = TREE_CHAIN (p))
+  for (i = 0, p = parms; i  nargs; i++)
{
- tree arg;
+ tree arg = gimple_call_arg (stmt, i);
+
+ /* Skip bound args inserted by Pointer Bounds Checker.  */
+ if (POINTER_BOUNDS_P (arg))
+   continue;
+
  /* If this is a varargs function defer inlining decision
 to callee.  */
  if (!p)
break;
- arg = gimple_call_arg (stmt, i);
+
  if (TREE_VALUE (p) == error_mark_node
  || arg == error_mark_node
  || TREE_CODE (TREE_VALUE (p)) == VOID_TYPE
  || (!types_compatible_p (TREE_VALUE (p), TREE_TYPE (arg))
   !fold_convertible_p (TREE_VALUE (p), arg)))
 return false;
+
+ p = TREE_CHAIN (p);
}
 }
   else
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 20f6010..0fc704b 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -3043,20 +3043,27 @@ canonicalize_cond_expr_cond (tree t)
 }
 
 /* Build a GIMPLE_CALL identical to STMT but skipping the arguments in
-   the positions marked by the set ARGS_TO_SKIP.  */
+   the positions marked by the set ARGS_TO_SKIP.  ARGS_TO_SKIP does not
+   take into account bounds arguments.  Bounds passed for skipped args
+   are also skipped.  */
 
 gimple
 gimple_call_copy_skip_args (gimple stmt, bitmap args_to_skip)
 {
-  int i;
+  int i, bit;
   int nargs = gimple_call_num_args (stmt);
   vectree vargs;
   vargs.create (nargs);
   gimple new_stmt;
 
-  for (i = 0; i  nargs; i++)
-if (!bitmap_bit_p (args_to_skip, i))
-  vargs.quick_push (gimple_call_arg (stmt, i));
+  for (i = 0, bit = 0; i  nargs; i++, bit++)
+  if (POINTER_BOUNDS_P (gimple_call_arg (stmt, i)))
+   {
+ if (!bitmap_bit_p (args_to_skip, --bit))
+   vargs.quick_push (gimple_call_arg (stmt, i));
+   }
+  else if (!bitmap_bit_p (args_to_skip, bit))
+ vargs.quick_push (gimple_call_arg (stmt, i));
 
   if (gimple_call_internal_p (stmt))
 new_stmt = gimple_build_call_internal_vec (gimple_call_internal_fn (stmt),
@@ -3702,6 +3709,9 @@ validate_call (gimple stmt, tree fndecl)
   if (!targs)
return true;
   tree arg = gimple_call_arg (stmt, i);
+  /* Skip bounds.  */
+  if (POINTER_BOUNDS_P (arg))
+   continue;
   if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
   INTEGRAL_TYPE_P (TREE_VALUE (targs)))
;


Re: [PATCH, MPX, 2/X] Pointers Checker. Remove checker language hook.

2013-11-07 Thread Richard Biener
On Thu, Nov 7, 2013 at 12:46 PM, Ilya Enkovich enkovich@gmail.com wrote:
 Hi,

 Here is a patch to remove language hook used by Pointer Bounds Checker.  To 
 disable checker on non C languages option is moved to c.opt.

Ok.

Thanks,
Richard.

 Thanks,
 Ilya
 --
 gcc/

 2013-11-06  Ilya Enkovich  ilya.enkov...@intel.com

 * common.opt (fcheck-pointer-bounds): Move to ...
 * c-family/c.opt: ... here.
 * langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): Remove.
 (LANG_HOOKS_INITIALIZER): Remove LANG_HOOKS_CHKP_SUPPORTED.
 * langhooks.h (lang_hooks): Remove chkp_supported field.
 * toplev.c (process_options): Remove chkp_supported check.


 diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
 index b862eb9..c082bac 100644
 --- a/gcc/c-family/c.opt
 +++ b/gcc/c-family/c.opt
 @@ -850,6 +850,11 @@ fcanonical-system-headers
  C ObjC C++ ObjC++
  Where shorter, use canonicalized paths to systems headers.

 +fcheck-pointer-bounds
 +C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds)
 +Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
 +control instrumentation.
 +
  fcilkplus
  C ObjC C++ ObjC++ LTO Report Var(flag_enable_cilkplus) Init(0)
  Enable Cilk Plus
 diff --git a/gcc/common.opt b/gcc/common.opt
 index 5c2f56e..deeb3f2 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -874,11 +874,6 @@ fbounds-check
  Common Report Var(flag_bounds_check)
  Generate code to check bounds before indexing arrays

 -fcheck-pointer-bounds
 -Common Report Var(flag_check_pointer_bounds)
 -Add Pointer Bounds Checker instrumentation.  fchkp-* flags are used to
 -control instrumentation.  Currently available for C, C++ and ObjC.
 -
  fbranch-count-reg
  Common Report Var(flag_branch_on_count_reg) Init(1) Optimization
  Replace add, compare, branch with branch on count register
 diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
 index 0597036..b7be472 100644
 --- a/gcc/langhooks-def.h
 +++ b/gcc/langhooks-def.h
 @@ -118,7 +118,6 @@ extern bool lhd_omp_mappable_type (tree);
  #define LANG_HOOKS_BLOCK_MAY_FALLTHRU  hook_bool_const_tree_true
  #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP  false
  #define LANG_HOOKS_DEEP_UNSHARING  false
 -#define LANG_HOOKS_CHKP_SUPPORTED  false

  /* Attribute hooks.  */
  #define LANG_HOOKS_ATTRIBUTE_TABLE NULL
 @@ -306,8 +305,7 @@ extern void lhd_end_section (void);
LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
 -  LANG_HOOKS_DEEP_UNSHARING, \
 -  LANG_HOOKS_CHKP_SUPPORTED \
 +  LANG_HOOKS_DEEP_UNSHARING \
  }

  #endif /* GCC_LANG_HOOKS_DEF_H */
 diff --git a/gcc/langhooks.h b/gcc/langhooks.h
 index 002d7eb..a83bf7b 100644
 --- a/gcc/langhooks.h
 +++ b/gcc/langhooks.h
 @@ -472,9 +472,6 @@ struct lang_hooks
   gimplification.  */
bool deep_unsharing;

 -  /* True if this language allows pointers checker instrumentation.  */
 -  bool chkp_supported;
 -
/* Whenever you add entries here, make sure you adjust langhooks-def.h
   and langhooks.c accordingly.  */
  };
 diff --git a/gcc/toplev.c b/gcc/toplev.c
 index 0eaf081..d3dac07 100644
 --- a/gcc/toplev.c
 +++ b/gcc/toplev.c
 @@ -1286,9 +1286,6 @@ process_options (void)
  {
if (targetm.chkp_bound_mode () == VOIDmode)
 error (-fcheck-pointers is not supported for this target);
 -
 -  if (!lang_hooks.chkp_supported)
 -   flag_check_pointer_bounds = 0;
  }

/* One region RA really helps to decrease the code size.  */


RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Bernd Edlinger
Hi,

just some thoughts...


fgetc will definitely be much faster than fread 16K and fseek back to the end 
of line position.

Note: fgetc is already buffered and not too slow on average, but only if you do 
not
fseek around. In that case the buffered file-stream data is lost.

Well, reading 16K in one piece _could_ be faster than fgetc, but only if you 
read the
data once and do not memcpy or memmove the buffer afterwards.

But if you read 16K starting from 0, then read 16K starting from line 1, and so 
on,
you will be really really slow. Did you ever profile your proposed patch?

The most easy way to do something for the performance would be in
location_get_source_line:

If you remember the last file and position, and keep the file open, just in 
case,
that the next time, either the same or a following line is requested. That 
would be
a really simple optimization.

Maybe that would be too simple? I do not know.

 fseek (fp, prev_pos + cur_len, SEEK_SET);
Furthermore this seek will position in the middle of the \r\n sequence on 
windows.


I am furthermore not sure, how the gcov.c would react on a NUL-char in the 
input stream.
Maybe in a comment, somewhere? Could it be possible that this reaches the 
gcov.c?

In that case the same OOM would occur because the memory size is always 
duplicated
after an incomplete line?

And gcov.c uses printf (..%s\n) to print a source listing right?

If the line contains NUL-chars, the listing will only contain the part of the 
line
until the NUL-char, which is not what we want?

So in that case it would be better to replace NUL-chars in the line
buffer with SPACEs, to get a printable C string? What do you think?

Isn't a C string always more easy to handle, than this buffer with zero-chars?


Regards
Bernd.

[Commited] Fix for PR59029

2013-11-07 Thread Yury Gribov

Preapproved by Jakub in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59029
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4991a3a..535b670 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-11-07  Yury Gribov  y.gri...@samsung.com
+	Jakub Jelinek  ja...@redhat.com
+
+	PR sanitizer/59029
+	* gcc/asan.c (get_mem_refs_of_builtin_call): Allow
+	integer literals as addresses in instrumented builtins.
+
 2013-11-07  H.J. Lu  hongjiu...@intel.com
 
 	PR target/59034
diff --git a/gcc/asan.c b/gcc/asan.c
index fdca377..950d332 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -715,7 +715,7 @@ get_mem_refs_of_builtin_call (const gimple call,
 	   instrument_derefs.  */
 	if (TREE_CODE (dest) == ADDR_EXPR)
 	  dest = TREE_OPERAND (dest, 0);
-	else if (TREE_CODE (dest) == SSA_NAME)
+	else if (TREE_CODE (dest) == SSA_NAME || TREE_CODE (dest) == INTEGER_CST)
 	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
 			 dest, build_int_cst (TREE_TYPE (dest), 0));
 	else
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 56d30a3..f6e735f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2013-11-07  Yury Gribov  y.gri...@samsung.com
+	Jakub Jelinek  ja...@redhat.com
+
+	PR sanitizer/59029
+	* c-c++-common/asan/pr59029.c: New test.
+
 2013-11-07  H.J. Lu  hongjiu...@intel.com
 
 	PR target/59034
diff --git a/gcc/testsuite/c-c++-common/asan/pr59029.c b/gcc/testsuite/c-c++-common/asan/pr59029.c
new file mode 100644
index 000..a1319b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/pr59029.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+
+int
+foo ()
+{
+  return __sync_fetch_and_add ((int *) 0, 1);
+}


Re: [Commited] Fix for PR59029

2013-11-07 Thread Marek Polacek
On Thu, Nov 07, 2013 at 04:32:56PM +0400, Yury Gribov wrote:
 + PR sanitizer/59029
 + * gcc/asan.c (get_mem_refs_of_builtin_call): Allow
 + integer literals as addresses in instrumented builtins.
 +

The prefix gcc/ hasn't been dropped as it should.

Marek


[ia64] [PR target/57491] internal compiler error: in ia64_split_tmode -O2, quadmath

2013-11-07 Thread Kirill Yukhin
Hello,
I've looked into RTL after register allocation. Insn which
lead to assert is:
(insn 77 180 79 4 (set (reg/v:TF 44 loc12 [orig:359 T8 ] [359])
(mem:TF (post_modify:DI (reg/f:DI 14 r14 [435])
(plus:DI (reg/f:DI 14 r14 [435])
(reg:DI 45 loc13 [orig:342 D.3347 ] [342]))) [2 *R1_102+0 
S16 A128])) 1.i:2370 131 {*movtf_intern\
al}

When trying to split it into two DI loads here:
 ia64_split_tmode_move

We check if source and destination RTX are register overlapped,
and if so, set flag that says that load adress is dead after insns:
  if (GET_CODE (operands[1]) == MEM
   reg_overlap_mentioned_p (operands[0], operands[1]))
...
dead = true;

This flags is asserted (to be unset) later in splitting of memory operand in 
ia64_split_tmode:
  case POST_MODIFY:
gcc_assert (!reversed  !dead);

I think that the insn (although op[0] and op1[1] are overlapped by regs)
is still safe to split. And we might end up with something like this:
  (insn 272 0 0 (set (reg:DI 44 loc12)
  (mem:DI (post_inc:DI (reg/f:DI 14 r14 [435])) [2 *R1_102+0 S8 A128]))
  (insn 273 272 0 (set (reg:DI 45 loc13)
  (mem:DI (post_modify:DI (reg/f:DI 14 r14 [435])
  (plus:DI (reg/f:DI 14 r14 [435])
  (reg:DI 45 loc13 [orig:342 D.3347 ] [342]))) [2 *R1_102+8 
S8 A64]))
  (set (reg/f:DI 14 r14 [435])
  (plus:DI (reg/f:DI 14 r14 [435])
  (const_int -8 [0xfff8])))

As far as I understand, second insn will update loc13 *after* ld is executed,
so value of location in r14 will be correct.

Patch in the bottom.

ChangeLog entry:
  * config/ia64/ia64.c (ia64_split_tmode_move): Relax `dead' flag 
setting.

Testcase builds w/o ICE with the patch.
Bootstrapped.

Could you pls take a look?

--
Thanks, K
---
 gcc/config/ia64/ia64.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index a128b19..446ee59 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -1527,12 +1527,19 @@ ia64_split_tmode_move (rtx operands[])
reg_overlap_mentioned_p (operands[0], operands[1]))
 {
   rtx base = XEXP (operands[1], 0);
+  rtx first_write = gen_rtx_REG (DImode, REGNO (operands[0]));
   while (GET_CODE (base) != REG)
base = XEXP (base, 0);

   if (REGNO (base) == REGNO (operands[0]))
-   reversed = true;
-  dead = true;
+   {
+ reversed = true;
+ first_write = gen_rtx_REG (DImode, REGNO (operands[0]) + 1);
+   }
+
+  if (GET_CODE (operands[0]) == REG
+  reg_overlap_mentioned_p (first_write, operands[1]))
+   dead = true;
 }
   /* Another reason to do the moves in reversed order is if the first
  element of the target register pair is also the second element of


Re: [Commited] Fix for PR59029

2013-11-07 Thread Yury Gribov
Yup, Jakub already pointed this out. Should I fix the changelog with 
another commit?


---
From: Marek Polacek pola...@redhat.com
Sent:  Thursday, November 07, 2013 4:38PM
To: Yury Gribov y.gri...@samsung.com
Cc: gcc-patches@gcc.gnu.org, Jakub Jelinek ja...@redhat.com, 
reich...@gcc.gnu.org, Viacheslav Garbuzov v.garbu...@samsung.com, 
Evgeny Gavrin e.gav...@samsung.com

Subject: Re: [Commited] Fix for PR59029
On 11/07/2013 04:38 PM, Marek Polacek wrote:

On Thu, Nov 07, 2013 at 04:32:56PM +0400, Yury Gribov wrote:


+   PR sanitizer/59029
+   * gcc/asan.c (get_mem_refs_of_builtin_call): Allow
+   integer literals as addresses in instrumented builtins.
+


The prefix gcc/ hasn't been dropped as it should.

Marek





Re: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote:
 just some thoughts...
 
 
 fgetc will definitely be much faster than fread 16K and fseek back to the end 
 of line position.
 
 Note: fgetc is already buffered and not too slow on average, but only if you 
 do not
 fseek around. In that case the buffered file-stream data is lost.

fgetc (unless you are talking about fgetc_unlocked which is somewhat better)
is actually very expensive.  Of course, fseek is highly undersirable, but
I've already said that.

Jakub


Re: [Commited] Fix for PR59029

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 04:48:10PM +0400, Yury Gribov wrote:
 Yup, Jakub already pointed this out. Should I fix the changelog with
 another commit?

Yes, just fix up the ChangeLog and commit, for ChangeLog changes
of course you don't add a new ChangeLog entry.

Jakub


Re: [Commited] Fix for PR59029

2013-11-07 Thread Yury Gribov

 Should I fix the changelog with another commit?
 Yes, just fix up the ChangeLog and commit

Fixed.

-Y



RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-11-07 Thread Bernd Edlinger
Hi,

On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote:

 On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hi,

 Eh ... even

 register struct { int i; int a[0]; } asm (ebx);

 works. Also with int a[1] but not with a[2]. So just handling trailing
 arrays makes this case regress as well.

 Now I'd call this use questionable ... but we've likely supported that
 for decades and cannot change that now.

 Back to fixing everything in expand.

 Richard.


 Ok, finally you asked for it.

 Here is my previous version of that patch again.

 I have now added a new value EXPAND_REFERENCE to the expand_modifier
 enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
 constant values.

 I have done the same modification to VIEW_CONVERT_EXPR too, because
 this is a possible inner reference, itself. It is however inherently hard to
 test around this code.

 To understand this patch it is good to know what type of object the
 return value tem of get_inner_reference can be.

 From the program logic at get_inner_reference it is clear that the
 return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
 ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
 be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
 further restricted because exp is gimplified.

 Usually the result will be a MEM_REF or a SSA_NAME of the memory where
 the structure is to be found.

 When you look at where EXPAND_MEMORY is handled you see it is special-cased
 in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
 ARRAY_RANGE_REF.

 At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
 same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
 If it is an unaligned memory, we just return the unaligned reference.

 This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test 
 case,
 because it is only a problem for STRICT_ALIGNMENT targets, and even there it 
 will
 certainly be really hard to find test cases that exercise this code.

 In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
 we do not have to touch the handling of the outer modifier. However we pass
 EXPAND_REFERENCE to the inner object, which should not be a recursive
 use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.

 TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
 EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.


 Boot-strapped and regression-tested on x86_64-linux-gnu
 OK for trunk?

 You point to a weak spot in expansion - that it handles creating
 the base MEM to offset with handled components by recursing
 into the case that handles bare MEM_REFs. This makes the
 bare MEM_REF handling code somewhat awkward (it's the
 one to assign mem-attrs which are later adjusted for example).

 Maybe a better appropach than adding yet another expand
 modifier would be to split out the base MEM expansion part
 out of the bare MEM_REF handling code so we can call that
 instead of recursing.

 In this light - instead of a new expand modifier don't you want
 an actual flag that specifies we are coming from a call that
 wants to expand a base? That is, allow EXPAND_SUM
 but with the recursion flag set?


I think you are right. After some thought, I start to like that idea.

This way we have at least much more flexibility, how to handle the inner
references correctly, and if I change only  the interfaces of 
expand_expr_real/real_1
that will not be used at too many places, either.

 Finally I think the recursion into the VIEW_CONVERT_EXPR case
 is only there because of the keep_aligning flag of get_inner_reference
 which should be obsolete now that we properly handle its effects
 in get_object_alignment. So you wouldn't need to adjust this path
 if we finally can get rid of that.


I proposed a patch for that, but did not hear from you:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02267.html

nevertheless, even if the VIEW_CONVERT_EXPR may or may not be an inner
reference, the code there should be kept in sync with the normal_inner_ref,
and MEM_REF, IMHO.

Attached, my updated patch for PR57748, Part 2.

Boot-strapped and regression-tested on X86_64-pc-linux-gnu.

Ok for trunk?


Thanks
Bernd.

 What do others think?

 Thanks,
 Richard.

 Thanks
 Bernd. 

RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-11-07 Thread Bernd Edlinger
oops - this time with attachments...


 Hi,

 On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote:

 On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
 Hi,

 Eh ... even

 register struct { int i; int a[0]; } asm (ebx);

 works. Also with int a[1] but not with a[2]. So just handling trailing
 arrays makes this case regress as well.

 Now I'd call this use questionable ... but we've likely supported that
 for decades and cannot change that now.

 Back to fixing everything in expand.

 Richard.


 Ok, finally you asked for it.

 Here is my previous version of that patch again.

 I have now added a new value EXPAND_REFERENCE to the expand_modifier
 enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
 constant values.

 I have done the same modification to VIEW_CONVERT_EXPR too, because
 this is a possible inner reference, itself. It is however inherently hard to
 test around this code.

 To understand this patch it is good to know what type of object the
 return value tem of get_inner_reference can be.

 From the program logic at get_inner_reference it is clear that the
 return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
 ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
 be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
 further restricted because exp is gimplified.

 Usually the result will be a MEM_REF or a SSA_NAME of the memory where
 the structure is to be found.

 When you look at where EXPAND_MEMORY is handled you see it is special-cased
 in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
 ARRAY_RANGE_REF.

 At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
 same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
 If it is an unaligned memory, we just return the unaligned reference.

 This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test 
 case,
 because it is only a problem for STRICT_ALIGNMENT targets, and even there 
 it will
 certainly be really hard to find test cases that exercise this code.

 In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
 we do not have to touch the handling of the outer modifier. However we pass
 EXPAND_REFERENCE to the inner object, which should not be a recursive
 use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.

 TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
 EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.


 Boot-strapped and regression-tested on x86_64-linux-gnu
 OK for trunk?

 You point to a weak spot in expansion - that it handles creating
 the base MEM to offset with handled components by recursing
 into the case that handles bare MEM_REFs. This makes the
 bare MEM_REF handling code somewhat awkward (it's the
 one to assign mem-attrs which are later adjusted for example).

 Maybe a better appropach than adding yet another expand
 modifier would be to split out the base MEM expansion part
 out of the bare MEM_REF handling code so we can call that
 instead of recursing.

 In this light - instead of a new expand modifier don't you want
 an actual flag that specifies we are coming from a call that
 wants to expand a base? That is, allow EXPAND_SUM
 but with the recursion flag set?


 I think you are right. After some thought, I start to like that idea.

 This way we have at least much more flexibility, how to handle the inner
 references correctly, and if I change only the interfaces of 
 expand_expr_real/real_1
 that will not be used at too many places, either.

 Finally I think the recursion into the VIEW_CONVERT_EXPR case
 is only there because of the keep_aligning flag of get_inner_reference
 which should be obsolete now that we properly handle its effects
 in get_object_alignment. So you wouldn't need to adjust this path
 if we finally can get rid of that.


 I proposed a patch for that, but did not hear from you:
 http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02267.html

 nevertheless, even if the VIEW_CONVERT_EXPR may or may not be an inner
 reference, the code there should be kept in sync with the normal_inner_ref,
 and MEM_REF, IMHO.

 Attached, my updated patch for PR57748, Part 2.

 Boot-strapped and regression-tested on X86_64-pc-linux-gnu.

 Ok for trunk?


 Thanks
 Bernd.

 What do others think?

 Thanks,
 Richard.

 Thanks
 Bernd.2013-11-07  Bernd Edlinger  bernd.edlin...@hotmail.de

PR middle-end/57748
* expr.h (expand_expr_real, expand_expr_real_1): Add new parameter
expand_reference.
(expand_expr, expand_normal): Adjust.
* expr.c (expand_expr_real, expand_expr_real_1): Add new parameter
expand_reference. Use expand_reference to expand inner references.
(store_expr): Adjust.
* cfgexpand.c (expand_call_stmt): Adjust.

testsuite:
2013-11-07  Bernd Edlinger  bernd.edlin...@hotmail.de

PR middle-end/57748
* 

[C++ Patch/RFC] PR 58176 (ICE with nullptr at -O0 in output_constant)

2013-11-07 Thread Paolo Carlini

Hi,

at -O0 we ICE for this valid testcase in output_constant because 
NULLPTR_TYPE is not handled. Should I dig further because we never want 
nullptr to reach varasm? Because otherwise the patchlet works fine, I 
checked that expand_expr can cope perfectly well with NULLPTR_TYPE as 
TREE_TYPE of the INTEGER_CST (generally, we are already handling 
NULLPTR_TYPE in a number of places outside /cp)


Thanks,
Paolo.


Index: testsuite/g++.dg/cpp0x/nullptr30.C
===
--- testsuite/g++.dg/cpp0x/nullptr30.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/nullptr30.C  (working copy)
@@ -0,0 +1,40 @@
+// PR c++/58176
+// { dg-do compile { target c++11 } }
+
+// Nil
+struct nil_ { constexpr nil_ () {} };
+constexpr nil_ nil;
+
+// Cons
+template class H, class T = nil_
+struct cons_ {
+using head_ = H;
+using tail_ = T;
+
+H head;
+T tail;
+
+constexpr cons_() {}
+constexpr cons_(H const h, T const t) : head(h), tail(t) {}
+};
+template class H, class T = nil_
+constexpr cons_H, T cons (H const h, T const t = nil) { return
+cons_H,T(h,t); }
+
+// List
+template class... T struct list_s;
+template class H, class... T
+struct list_sH, T... {
+using type = cons_H, typename list_sT...::type;
+};
+template 
+struct list_s {
+using type = nil_;
+};
+template class... T
+using list_ = typename list_sT...::type;
+constexpr nil_ list () { return nil; }
+template class H, class... T
+constexpr list_H, T... list (H h, T... t) { return cons(h, list(t...)); }
+
+constexpr auto l1 = list(monkey, 123.4, cons(1, 2), nullptr);
Index: varasm.c
===
--- varasm.c(revision 204508)
+++ varasm.c(working copy)
@@ -4685,6 +4685,7 @@ output_constant (tree exp, unsigned HOST_WIDE_INT
 case OFFSET_TYPE:
 case FIXED_POINT_TYPE:
 case POINTER_BOUNDS_TYPE:
+case NULLPTR_TYPE:
   if (! assemble_integer (expand_expr (exp, NULL_RTX, VOIDmode,
   EXPAND_INITIALIZER),
  MIN (size, thissize), align, 0))


Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-07 Thread Joseph S. Myers
On Thu, 7 Nov 2013, Mingjie Xing wrote:

 2013/11/6 Richard Biener richard.guent...@gmail.com:
  You miss a testcase.
 
  Also why should the warning be omitted for unused automatic
  volatile variables?  They cannot be used in any way.
 
  Richard.
 
 Thanks.  I've updated the patch with a test case.

You don't seem to have answered Richard's question about why the change is 
desirable in the first place.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Joseph S. Myers
On Thu, 7 Nov 2013, Richard Biener wrote:

 Well, I'm betting that you'll re-invent sth like 'tree' just don't
 call it 'tree' ;)
 You need to transparently refer to constants, SSA names and decls
 (at least) as GIMPLE statement operands.  You probably will make
 a gimple statement operand base class.  Well - that's a 'tree' ;)

Uses of constants and decls in expressions are why it's not obvious that 
those get a static type different from expressions (or at least, they 
probably do need a common base class).  But my model of how tree should 
ideally be split up has expressions (maybe including decls) as a 
completely separate static type from types, and identifiers as yet another 
static type, none of those inheriting from a common base class tree.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, MPX, 2/X] Pointers Checker [6/25] Instrumentation pass

2013-11-07 Thread Joseph S. Myers
On Thu, 7 Nov 2013, Ilya Enkovich wrote:

 Here is a new split.  I still include rtl.h into tree-chkp.c for MEM_P 
 usage.  Is it OK?

I have no further comments on this patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


Generally link to libgomp for -ftree-parallelize-loops=* (was: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.)

2013-11-07 Thread Thomas Schwinge
Hi!

On Thu, 7 Nov 2013 09:15:45 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote:
  --- gcc/config/arc/arc.h
  +++ gcc/config/arc/arc.h
  @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
   %(linker) %l  LINK_PIE_SPEC %X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} 
  %{r}\
   %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
   %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
  -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
  +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
 
 Perhaps add here |ftree-parallelize-loops=* too?  [...]

Like this (for trunk and gomp-4_0-branch)?

gcc/
* config/arc/arc.h (LINK_COMMAND_SPEC): For
-ftree-parallelize-loops=*, link to libgomp and its dependencies.
* config/ia64/hpux.h (LIB_SPEC): Likewise.
* config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
* config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
* gcc.c (GOMP_SELF_SPECS): Update comment about libgomp's
dependencies.
libgomp/
* libgomp.spec.in: Update comment about libgomp's dependencies.
* configure.ac: Likewise.
* configure: Regenerate.

diff --git gcc/config/arc/arc.h gcc/config/arc/arc.h
index 637f7b6..87908d4 100644
--- gcc/config/arc/arc.h
+++ gcc/config/arc/arc.h
@@ -174,7 +174,8 @@ along with GCC; see the file COPYING3.  If not see
 %(linker) %l  LINK_PIE_SPEC %X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\
 %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
 %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
-%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
+%{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
+%(mflib)\
 %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
 %{!A:%{!nostdlib:%{!nostartfiles:%E}}} %{T*} }}
diff --git gcc/config/ia64/hpux.h gcc/config/ia64/hpux.h
index ca592e4..517132c 100644
--- gcc/config/ia64/hpux.h
+++ gcc/config/ia64/hpux.h
@@ -92,7 +92,7 @@ do {  \
 #undef  LIB_SPEC
 #define LIB_SPEC \
   %{!shared: \
- %{mt|pthread:%{fopenmp:-lrt} -lpthread} \
+ %{mt|pthread:%{fopenmp|ftree-parallelize-loops=*:-lrt} -lpthread} \
  %{p:%{!mlp64:-L/usr/lib/hpux32/libp} \
 %{mlp64:-L/usr/lib/hpux64/libp} -lprof} \
  %{pg:%{!mlp64:-L/usr/lib/hpux32/libp} \
diff --git gcc/config/pa/pa-hpux11.h gcc/config/pa/pa-hpux11.h
index c217398..4da1b09 100644
--- gcc/config/pa/pa-hpux11.h
+++ gcc/config/pa/pa-hpux11.h
@@ -120,7 +120,8 @@ along with GCC; see the file COPYING3.  If not see
 #undef LIB_SPEC
 #define LIB_SPEC \
   %{!shared:\
- %{fopenmp:%{static:-a archive_shared} -lrt %{static:-a archive}}\
+ %{fopenmp|ftree-parallelize-loops=*:%{static:-a archive_shared} -lrt\
+   %{static:-a archive}}\
  %{mt|pthread:-lpthread} -lc\
  %{static:%{!nolibdld:-a archive_shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive\
diff --git gcc/config/pa/pa64-hpux.h gcc/config/pa/pa64-hpux.h
index a3e21d3..9efc137 100644
--- gcc/config/pa/pa64-hpux.h
+++ gcc/config/pa/pa64-hpux.h
@@ -58,19 +58,22 @@ along with GCC; see the file COPYING3.  If not see
 #if ((TARGET_DEFAULT | TARGET_CPU_DEFAULT)  MASK_GNU_LD)
 #define LIB_SPEC \
   %{!shared:\
- %{!p:%{!pg:%{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+ %{!p:%{!pg:%{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\
+  %{static:-a archive}}\
%{mt|pthread:-lpthread} -lc\
%{static:%{!nolibdld:-a shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive}\
  %{p:%{!pg:%{static:%{!mhp-ld:-a shared}%{mhp-ld:-a archive_shared}}\
   -lprof %{static:-a archive}\
-  %{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+  %{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\
+ %{static:-a archive}}\
   %{mt|pthread:-lpthread} -lc\
   %{static:%{!nolibdld:-a shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive}\
  %{pg:%{static:%{!mhp-ld:-a shared}%{mhp-ld:-a archive_shared}}\
-lgprof %{static:-a archive}\
-   %{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+   %{fopenmp|ftree-parallelize-loops=*:%{static:-a shared} -lrt\
+ %{static:-a archive}}\
%{mt|pthread:-lpthread} -lc\
%{static:%{!nolibdld:-a shared -ldld -a archive -lc}\
%{!mt:%{!pthread:-a shared -lc -a archive}\
@@ -78,19 +81,22 @@ along with GCC; see the file COPYING3.  If not see
 #else
 #define LIB_SPEC \
   %{!shared:\
- %{!p:%{!pg:%{fopenmp:%{static:-a shared} -lrt %{static:-a archive}}\
+ 

PATCH: Don't set misaligned_prologue_used twice

2013-11-07 Thread H.J. Lu
I checked in this patch to remove redundant setting of
misaligned_prologue_used.

H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 204511)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2013-11-07  H.J. Lu  hongjiu...@intel.com
+
+   * config/i386/i386.c (ix86_expand_set_or_movmem): Don't set
+   misaligned_prologue_used when it has been set.
+
 2013-11-07  Yury Gribov  y.gri...@samsung.com
Jakub Jelinek  ja...@redhat.com
 
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 204511)
+++ config/i386/i386.c  (working copy)
@@ -23796,7 +23796,6 @@ ix86_expand_set_or_movmem (rtx dst, rtx
   if (misaligned_prologue_used)
 {
   /* Misaligned move prologue handled small blocks by itself.  */
-  misaligned_prologue_used = true;
   expand_set_or_movmem_prologue_epilogue_by_misaligned_moves
   (dst, src, destreg, srcreg,
move_mode, promoted_val, vec_promoted_val,


Re: [C++ Patch/RFC] PR 58176 (ICE with nullptr at -O0 in output_constant)

2013-11-07 Thread Jason Merrill

OK.


Re: [PATCH] remove nolonger needed {cgraph,varpool}_node_{,asm_}name () functions

2013-11-07 Thread Václav Zeman
On 11/07/2013 01:58 AM, tsaund...@mozilla.com wrote:
 From: Trevor Saunders tsaund...@mozilla.com

 Hi,

 I admit its another c++ification, but the functions are useless, and if I'm
 changing the name of the called function its just easier to drop the class 
 name
 bit all together.

 Trev

[...]
 diff --git a/gcc/cgraph.h b/gcc/cgraph.h
 index 4f93713..754d245 100644
 --- a/gcc/cgraph.h
 +++ b/gcc/cgraph.h
 @@ -44,6 +44,12 @@ class GTY((desc (%h.type), tag (SYMTAB_SYMBOL),
symtab_node
  {
  public:
 +  /* Return name.  */
 +  const char *name () const;
 +
 +  /* Return asm name.  */
 +const char * asm_name () const;
 +
Missing indentation?

-- 
VZ


Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-07 Thread Tejas Belagod

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the same in
+ either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+   REG_P (XEXP (src, 0))  REG_P (dst)
+   REGNO (XEXP (src, 0)) == REGNO (dst))
+{
+  rtx par = XEXP (src, 1);
+  int i;
+
+  for (i = 0; i  XVECLEN (par, 0); i++)
+   {
+ rtx tem = XVECEXP (par, 0, i);
+ if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+   return 0;
+   }
+  return 1;
+}
+

I think for big endian it needs to be:

INTVAL (tem) != i + base

where base is something like:

int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0);

E.g. a big-endian V4HI looks like:

msb  lsb


and shortening it to say V2HI only gives the low 32 bits:

msb  lsb


But, in this case we want

 msb  lsb
 

It depends on whether the result occupies a full register or not.
I was thinking of the case where it didn't, but I realise now you were
thinking of the case where it did.  And yeah, my suggestion doesn't
cope with that...

I was under the impression that the const vector parallel for vec_select 
represents the element indexes of the array in memory order.


Therefore, in bigendian,

  msb lsb
     
element  a[0] a[1] a[2] a[3]

and in littleendian

  msb lsb
     
element  a[3] a[2] a[1] a[0]

Right.  But if an N-bit value is stored in a register, it's assumed to
occupy the lsb of the register and the N-1 bits above that.  The other
bits in the register are don't-care.

E.g., leaving vectors to one side, if you have:

   (set (reg:HI N) (truncate:SI (reg:SI N)))

on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

   msb  lsb
   01234567
   
   4567

rather than:

   msb  lsb
   01234567
   
   0123

for both endiannesses.  The same principle applies to vectors.
The lsb of the register is always assumed to be significant.

So maybe the original patch was correct for partial-register and
full-register results on little-endian, but only for full-register
results on big-endian.

Ah, ok! I think I get it. By eliminating
set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

using the check INTVAL (tem) != i, I'm essentially making subsequent operations 
use (reg:V2DI n) in DI mode which is a partial register result and this

gives me
the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial 
register mode, I have to make sure the correct elements coincide with

the lsb in
big-endian...

Thanks for your input, I'll apply the offset correction for big-endian you 
suggested. I'll respin the patch.


Thanks.  Just for avoidance of doubt, the result might be a full or
partial register, depending on the mode and target.  I was trying to
correct myself by agreeing that your original was right and mine was
wrong for big-endian if the result is a full register.



What I had in mind when I implemented this was a partial-reg result, but 
obviously it was wrong.


Sorry, I'm going to take a step back - I'm trying to figure out what a full 
register result would look like. Looking at the pattern,


  set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

the result is always a mode smaller than the src if the vec_select selects a 
subset of lanes. Plus if the src and dst are the same reg, because we're 
re-writing the src reg, wouldn't it always end up being a partial-reg?

In this case, wouldn't (reg:DI n) always represent

 msb  lsb
 

Thanks,
Tejas.



I don't know if there are existing helper functions for this kind of thing.

Richard






Re: [PATCH] Use get_range_info during number of iterations analysis

2013-11-07 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 03:21:04PM -0700, Jeff Law wrote:
 2013-11-06  Jakub Jelinek  ja...@redhat.com
 
  * tree-ssa-loop-niter.c: Include tree-ssanames.h.
  (determine_value_range): Add loop argument.  Use get_range_info to
  improve range.
  (bound_difference): Adjust caller.
 Clever, using the range from the PHI in the loop header.  While it
 may be too wide, it's likely going to be a lot better than nothing.
 
 Add a testcase and this is OK.

Ok, thanks, here is what I've committed after another bootstrap/regtest.

2013-11-07  Jakub Jelinek  ja...@redhat.com

* tree-ssa-loop-niter.c: Include tree-ssanames.h.
(determine_value_range): Add loop argument.  Use get_range_info to
improve range.
(bound_difference): Adjust caller.

* gcc.dg/tree-ssa/loop-39.c: New test.

--- gcc/tree-ssa-loop-niter.c.jj2013-10-24 10:19:21.0 +0200
+++ gcc/tree-ssa-loop-niter.c   2013-11-07 11:49:37.568083138 +0100
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.
 #include diagnostic-core.h
 #include tree-inline.h
 #include tree-pass.h
+#include tree-ssanames.h
 
 
 #define SWAP(X, Y) do { affine_iv *tmp = (X); (X) = (Y); (Y) = tmp; } while (0)
@@ -119,9 +120,12 @@ split_to_var_and_offset (tree expr, tree
in TYPE to MIN and MAX.  */
 
 static void
-determine_value_range (tree type, tree var, mpz_t off,
+determine_value_range (struct loop *loop, tree type, tree var, mpz_t off,
   mpz_t min, mpz_t max)
 {
+  double_int minv, maxv;
+  enum value_range_type rtype = VR_VARYING;
+
   /* If the expression is a constant, we know its value exactly.  */
   if (integer_zerop (var))
 {
@@ -130,9 +134,73 @@ determine_value_range (tree type, tree v
   return;
 }
 
+  get_type_static_bounds (type, min, max);
+
+  /* See if we have some range info from VRP.  */
+  if (TREE_CODE (var) == SSA_NAME  INTEGRAL_TYPE_P (type))
+{
+  edge e = loop_preheader_edge (loop);
+  gimple_stmt_iterator gsi;
+
+  /* Either for VAR itself...  */
+  rtype = get_range_info (var, minv, maxv);
+  /* Or for PHI results in loop-header where VAR is used as
+PHI argument from the loop preheader edge.  */
+  for (gsi = gsi_start_phis (loop-header); !gsi_end_p (gsi); gsi_next 
(gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ double_int minc, maxc;
+ if (PHI_ARG_DEF_FROM_EDGE (phi, e) == var
+  (get_range_info (gimple_phi_result (phi), minc, maxc)
+ == VR_RANGE))
+   {
+ if (rtype != VR_RANGE)
+   {
+ rtype = VR_RANGE;
+ minv = minc;
+ maxv = maxc;
+   }
+ else
+   {
+ minv = minv.max (minc, TYPE_UNSIGNED (type));
+ maxv = maxv.min (maxc, TYPE_UNSIGNED (type));
+ gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) = 0);
+   }
+   }
+   }
+  if (rtype == VR_RANGE)
+   {
+ mpz_t minm, maxm;
+ gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) = 0);
+ mpz_init (minm);
+ mpz_init (maxm);
+ mpz_set_double_int (minm, minv, TYPE_UNSIGNED (type));
+ mpz_set_double_int (maxm, maxv, TYPE_UNSIGNED (type));
+ mpz_add (minm, minm, off);
+ mpz_add (maxm, maxm, off);
+ /* If the computation may not wrap or off is zero, then this
+is always fine.  If off is negative and minv + off isn't
+smaller than type's minimum, or off is positive and
+maxv + off isn't bigger than type's maximum, use the more
+precise range too.  */
+ if (nowrap_type_p (type)
+ || mpz_sgn (off) == 0
+ || (mpz_sgn (off)  0  mpz_cmp (minm, min) = 0)
+ || (mpz_sgn (off)  0  mpz_cmp (maxm, max) = 0))
+   {
+ mpz_set (min, minm);
+ mpz_set (max, maxm);
+ mpz_clear (minm);
+ mpz_clear (maxm);
+ return;
+   }
+ mpz_clear (minm);
+ mpz_clear (maxm);
+   }
+}
+
   /* If the computation may wrap, we know nothing about the value, except for
  the range of the type.  */
-  get_type_static_bounds (type, min, max);
   if (!nowrap_type_p (type))
 return;
 
@@ -405,8 +473,8 @@ bound_difference (struct loop *loop, tre
   mpz_init (maxx);
   mpz_init (miny);
   mpz_init (maxy);
-  determine_value_range (type, varx, offx, minx, maxx);
-  determine_value_range (type, vary, offy, miny, maxy);
+  determine_value_range (loop, type, varx, offx, minx, maxx);
+  determine_value_range (loop, type, vary, offy, miny, maxy);
 
   mpz_sub (bnds-below, minx, maxy);
   mpz_sub (bnds-up, maxx, miny);
--- gcc/testsuite/gcc.dg/tree-ssa/loop-39.c.jj  2013-11-07 11:52:01.871309995 
+0100
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-39.c 2013-11-07 

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Richard Biener
On Thu, Nov 7, 2013 at 2:44 PM, Joseph S. Myers jos...@codesourcery.com wrote:
 On Thu, 7 Nov 2013, Richard Biener wrote:

 Well, I'm betting that you'll re-invent sth like 'tree' just don't
 call it 'tree' ;)
 You need to transparently refer to constants, SSA names and decls
 (at least) as GIMPLE statement operands.  You probably will make
 a gimple statement operand base class.  Well - that's a 'tree' ;)

 Uses of constants and decls in expressions are why it's not obvious that
 those get a static type different from expressions (or at least, they
 probably do need a common base class).  But my model of how tree should
 ideally be split up has expressions (maybe including decls) as a
 completely separate static type from types, and identifiers as yet another
 static type, none of those inheriting from a common base class tree.

Sure.  Also you make containers no longer trees (TREE_CHAIN
anyone, TREE_VEC, TREE_LIST ...).  Should be the first step
I think (already started some time ago with using VECs for some
pieces).

Richard.

 --
 Joseph S. Myers
 jos...@codesourcery.com


RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Bernd Edlinger
On Thu, 7 Nov 2013 13:48:14, Jakub Jelinek wrote:
 On Thu, Nov 07, 2013 at 01:25:00PM +0100, Bernd Edlinger wrote:
 just some thoughts...


 fgetc will definitely be much faster than fread 16K and fseek back to the 
 end of line position.

 Note: fgetc is already buffered and not too slow on average, but only if you 
 do not
 fseek around. In that case the buffered file-stream data is lost.

 fgetc (unless you are talking about fgetc_unlocked which is somewhat better)
 is actually very expensive. Of course, fseek is highly undersirable, but
 I've already said that.

 Jakub

yes. That's true.

I think, maybe this patch tries just to reach too many goals at the same time.

What is the _real_ bug which should be fixed first, is that even a very short 
file
with NUL-chars goes OOM in read_line.

So stay with fgets, just detect that the file is binary and return in that case.

The performance and duplicate code can still be improved by a follow-up patch.

We do not really need any caret positions on a binary file, after all, we just
should not ICE.

Maybe that could be done by just detecting that we have a BINARY file, and stop
printing any further lines.

like this?

--- input.c.jj    2013-01-10 21:38:27.0 +0100
+++ input.c    2013-11-07 15:33:53.804990865 +0100
@@ -106,12 +106,15 @@ read_line (FILE *file)
 {
   size_t len = strlen (string + pos);
 
-  if (string[pos + len - 1] == '\n')
+  if (len  string[pos + len - 1] == '\n')
 {
   string[pos + len - 1] = 0;
   return string;
 }
   pos += len;
+  /* test if binary file or last line incomplete? */
+  if (pos  string_len-1)
+    return pos  feof (file) ? string : NULL;
   string = XRESIZEVEC (char, string, string_len * 2);
   string_len *= 2;
 }


Bernd.

Re: Generally link to libgomp for -ftree-parallelize-loops=* (was: [gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.)

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 02:58:31PM +0100, Thomas Schwinge wrote:
 On Thu, 7 Nov 2013 09:15:45 +0100, Jakub Jelinek ja...@redhat.com wrote:
  On Wed, Nov 06, 2013 at 08:42:18PM +0100, tho...@codesourcery.com wrote:
   --- gcc/config/arc/arc.h
   +++ gcc/config/arc/arc.h
   @@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
%(linker) %l  LINK_PIE_SPEC %X %{o*} %{A} %{d} %{e*} %{m} %{N} 
   %{n} %{r}\
%{s} %{t} %{u*} %{x} %{z} %{Z} 
   %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
%{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
   -%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
   +%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
  
  Perhaps add here |ftree-parallelize-loops=* too?  [...]
 
 Like this (for trunk and gomp-4_0-branch)?
 
   gcc/
   * config/arc/arc.h (LINK_COMMAND_SPEC): For
   -ftree-parallelize-loops=*, link to libgomp and its dependencies.
   * config/ia64/hpux.h (LIB_SPEC): Likewise.
   * config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
   * config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
   * gcc.c (GOMP_SELF_SPECS): Update comment about libgomp's
   dependencies.
   libgomp/
   * libgomp.spec.in: Update comment about libgomp's dependencies.
   * configure.ac: Likewise.
   * configure: Regenerate.

Yes, thanks.

Jakub


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-07 Thread Andrew MacLeod

On 11/07/2013 05:08 AM, Richard Biener wrote:



2 - I really believe gimple needs a type system different from front end
trees, that is my primary motivation.  I'm tired of jumping through hoops to
do anything slightly different, and I got fed up with it.  With a separate
type system for gimple, we can rid ourselves of all the stuff that isn't
related to optimization and codegen...   Could we do this with trees?
possibly, but if I'm going to go to the effort of converting the front end
tree types into a new or reduced-type subset, I might as well put that
effort into something that is more appropriate right from the start.

I'm not sure.  Especially the idea of wrapping everything in a
I'm-not-a-tree-well-but-I-really-still-am scares the hell out of me.
And I put that into the very same basket of things you now complain about.


Well, this first phase is a transitional one... If it were possible, I'd 
change them all at once to not use a tree... but I don't think that is 
feasible. There is too much to find and change, and the source base is 
constantly changing.  We can't just freeze the base for a year.   so 
until I can get them all changes, they have to act like trees and 
interact like trees, so we make them trees under the covers.   Short 
term pain and discomfort...


If it helps, during the transition you can view it as replacing trees 
with oddly named tree subclasses which provide static type checking 
instead of run time GIMPLE_CHECK() calls :-)





By replacing those uses with an actual type, we'll get
static type checking which is another improvement.   There is only one way
to move the middle/back end off of types as trees that I can see.  Go
through the source base and find all the places which use trees as types,
and replace them with an actual type.  This can't be done all at once, its
too big a job...  So I propose a plan which allows a new gimple type and
current trees to coexist.  Once the conversion is complete we can easily
implement a type system that has the features and traits we want.

Note that on 'gimple' there are no types ... the types are on the objects
'gimple' refers to and those are 'tree's.  So you'd need to change the
'tree's to have TREE_TYPE not be a 'tree'.  Or simply subclass
'tree' so you can distinguish a tree type from a tree non-type statically
(a bit awkward with the current union setup).

But maybe I'm confusing what you call 'type'.


Im talking about changing the type gimple statements refer to, and also 
what decls and expressions will refer to...  making that the gimple_type.


 Today, thats a TREE_TYPE tree.  During the transition period this 
gimple_type is indeed a tree under the covers.. so that all the existing 
code can work even if it hasnt been converted yet.   Once everything is 
converted, then gimple_type has the tree removed from its declaration 
and we replace it with the bits we need/want... and it will no longer 
function as a tree..  gimplification will create these from the trees 
the front end created.





4 - I realized that if I am going through the entire code base to find all
the type locations, its very little extra work to also do the same for decls
and expressions. Then we'll have static type checking for those as well.

5 - When done we will be significantly closer to having the ability to
create a self contained IL that we can strictly define...  That alone seems
a worthwhile goal to me.  The IL then becomes appropriate for streaming at
any point in during compilation.  This means any front end or tool can
easily generate or consume a gimple stream which opens up numerous
opportunities...  That is pretty hard to enable today.

Have fun with dealing with our legacy frontends ;)  Or didn't you
actually mean any point during compilation? ... oh, you talk about
GIMPLE.  So no plans for Frontend ASTs aka GENERIC for our legacy
frontends?
I meant any point where we are in gimple...  Thats the only point we 
really care about.  Once control is passed to the middle end.  No plans 
to touch GENERIC at all... in fact, GENERIC and 'tree' will become true 
synonyms when I'm done, because gimple wont have any trees in it 
anywhere.  And rtl will also refer to gimple nodes.   It just wont 
matter what warts are in trees.

Im going to do the initial bits on a branch for the first couple of months,
and during that time convert a number of files over to the new gimple
interface.  Then we can look at it, measure it, comment on it, tweak it,
whatever.  You can see exactly what we'll be getting into before we fully
commit.

Before we fully commit we want to see the underlying object changes.
Because I still doubt this all will work out without seriously restructuring
the entanglement of our tree-based frontends and the middle-end
(say fold).

That is, I still think you are starting this at the wrong end.  You first
need to solve the entanglement issue before you can seriously consider
making any real change to the data structures GIMPLE relies 

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-07 Thread Richard Sandiford
Tejas Belagod tbela...@arm.com writes:
 Richard Sandiford wrote:
 Tejas Belagod tbela...@arm.com writes:
 Richard Sandiford wrote:
 Tejas Belagod tbela...@arm.com writes:
 Richard Sandiford wrote:
 Tejas Belagod tbela...@arm.com writes:
 +  /* This is big-endian-safe because the elements are kept in target
 + memory order.  So, for eg. PARALLEL element value of 2 is the
 same in
 + either endian-ness.  */
 +  if (GET_CODE (src) == VEC_SELECT
 +   REG_P (XEXP (src, 0))  REG_P (dst)
 +   REGNO (XEXP (src, 0)) == REGNO (dst))
 +{
 +  rtx par = XEXP (src, 1);
 +  int i;
 +
 +  for (i = 0; i  XVECLEN (par, 0); i++)
 +   {
 + rtx tem = XVECEXP (par, 0, i);
 + if (!CONST_INT_P (tem) || INTVAL (tem) != i)
 +   return 0;
 +   }
 +  return 1;
 +}
 +
 I think for big endian it needs to be:

 INTVAL (tem) != i + base

 where base is something like:

 int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN 
 (par, 0);

 E.g. a big-endian V4HI looks like:

 msb  lsb
 

 and shortening it to say V2HI only gives the low 32 bits:

 msb  lsb
 
 But, in this case we want

  msb  lsb
  
 It depends on whether the result occupies a full register or not.
 I was thinking of the case where it didn't, but I realise now you were
 thinking of the case where it did.  And yeah, my suggestion doesn't
 cope with that...

 I was under the impression that the const vector parallel for vec_select 
 represents the element indexes of the array in memory order.

 Therefore, in bigendian,

   msb lsb
      
 element  a[0] a[1] a[2] a[3]

 and in littleendian

   msb lsb
      
 element  a[3] a[2] a[1] a[0]
 Right.  But if an N-bit value is stored in a register, it's assumed to
 occupy the lsb of the register and the N-1 bits above that.  The other
 bits in the register are don't-care.

 E.g., leaving vectors to one side, if you have:

(set (reg:HI N) (truncate:SI (reg:SI N)))

 on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

msb  lsb
01234567

4567

 rather than:

msb  lsb
01234567

0123

 for both endiannesses.  The same principle applies to vectors.
 The lsb of the register is always assumed to be significant.

 So maybe the original patch was correct for partial-register and
 full-register results on little-endian, but only for full-register
 results on big-endian.
 Ah, ok! I think I get it. By eliminating
 set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

 using the check INTVAL (tem) != i, I'm essentially making subsequent
 operations
 use (reg:V2DI n) in DI mode which is a partial register result and this
 gives me
 the wrong set of lanes in bigendian. So, if I want to use (reg n) in 
 partial 
 register mode, I have to make sure the correct elements coincide with
 the lsb in
 big-endian...

 Thanks for your input, I'll apply the offset correction for big-endian you 
 suggested. I'll respin the patch.
 
 Thanks.  Just for avoidance of doubt, the result might be a full or
 partial register, depending on the mode and target.  I was trying to
 correct myself by agreeing that your original was right and mine was
 wrong for big-endian if the result is a full register.
 

 What I had in mind when I implemented this was a partial-reg result, but 
 obviously it was wrong.

 Sorry, I'm going to take a step back - I'm trying to figure out what a full 
 register result would look like. Looking at the pattern,

set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

 the result is always a mode smaller than the src if the vec_select selects a 
 subset of lanes. Plus if the src and dst are the same reg, because we're 
 re-writing the src reg, wouldn't it always end up being a partial-reg?
 In this case, wouldn't (reg:DI n) always represent

   msb  lsb
   

The problem is that one reg rtx can span several hard registers.
E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
but it might instead represent two 32-bit registers (nos. 32 and 33).
Obviously the latter's not very likely for vectors this small,
but more likely for larger ones (including on NEON IIRC).

So if we had 2 32-bit registers being treated as a V4HI, it would be:

   --3233--
   msb  lsb
   
   
   
   msb  lsb
   --32--

for big endian and:

   --3332--
   msb  lsb
   
   
   
   msb  lsb
   --32--

for little endian.

Thanks,
Richard


Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-07 Thread Michael Matz
Hi,

On Wed, 6 Nov 2013, David Malcolm wrote:

  I don't like that.  The empty classes are just useless, they imply a 
  structure that isn't really there, some of the separate gimple codes 
  are basically selectors of specific subtypes of a generic concept, 
  without additional data or methods; creating a type for those is 
  confusing.
 
 A type system does more than just express memory layouts:

I know all that.  But there's a point on the scale from a void* type 
system to a different-type-for-everything system where the advantages of a 
more complex type system are anulled by the disadvantages of a more 
complex type system.  And I'm merely of the opinion that you crossed that 
point with this separation.

You know we could have a typesystem where every different natural number 
is a type (and with C++ we _really_ could do that).  Obviously that's also 
a reductio ad absurdum, but it demonstrates the point that there _are_ 
type systems with a too low complexity to expressiveness ratio.

 * it permits the proof of absence of certain bugs
 * it supports abstraction
 * it documents the intent of the code

Well, yes.  You can document code with other things than types, you can 
add abstractions via different means, and certain bugs can indeed only be 
ruled out by different static types.

In my mind types should abstract different larger concepts foremost (and 
not every little aspect of a concept).  Only _then_ should they be used 
for other effects like the above, and only if the disadvantages don't 
overshadow that.  Disadvantages being mostly simply more complexity: which 
types relate to others in which way; what are the methods you can do with 
that but not the other type; which are the members of the 100 types there 
are (and if a type system is too large to be used only by IDEs with code 
completion, then it's too complex); and of course also simply having to 
write more text when you can't use the same gimple variable for several 
statements.

 To give an example from the patch, the proposed gimple_statement_switch
 subclass of gimple_statement_with_ops adds no extra fields, but it adds
 the invariant that (assuming the ptr is non-NULL), that code ==
 GIMPLE_SWITCH, or, more intuitively, it's a switch statement.
 
 This means that if we have a gimple_statement_switch, that both a human
 reading the code and the compiler can know at compile-time that the
 code == GIMPLE_SWITCH. 

But a gimple switch is actually a special case of a gimple conditional 
jump (the normal GIMPLE_COND being another) which itself is a special 
case of a gimple control transfer (a thing causing one or multiple 
outgoing edges and ending a basic block).  That enlightens that your 
hierarchy might not actually be the best split of types, it's merely a 
formal translation of gimple codes into types.

And I question exactly that automatic introduction of types.  There's not 
enough human thought put into the hierarchy to warrant the change.  Types 
are a fundament and they shouldn't be introduced lightly.  As gimple codes 
are simply flat it doesn't matter so much that the inherent structure of 
gimple statements isn't reflected (although there _is_ some structure by 
having the with_ops and with_mem_ops codes be together).  But for a _type 
hierarchy_ I would pretty much require that real thought and insights are 
put into it.

 Consider tree-switch-conversion.c: this contains various gimple
 variables but they don't work on arbitrary gimple; they require code ==
 GIMPLE_SWITCH:
 ...
 We could be doing this directly in the type system, and using the
 compiler to check it.

Then please do that while introducing one new type.

The fewer 
  types the better IMO.
 
 There's a reductio ad absurdum of that argument:

See above for my variant of that showing that it also goes into the other 
direction of more types.


Ciao,
Michael.
P.S.: And do away with the _statement part of the type, gimple_switch 
ought to be enough.


Re: [PATCH] [libiberty] MAX_PATH problems with mingw gcc

2013-11-07 Thread Ian Lance Taylor
On Wed, Nov 6, 2013 at 11:55 PM, Vladimir Simonov
vladimir.simo...@acronis.com wrote:
 -Original Message-
 From: Ian Lance Taylor [mailto:i...@google.com]
 Sent: Wednesday, November 06, 2013 6:42 PM
 To: Joey Ye
 Cc: gcc-patches; d...@redhat.com; Vladimir Simonov
 Subject: Re: [PATCH] [libiberty] MAX_PATH problems with mingw gcc

 Jan, thank you for your attention.

 It looks to me that you missed that the patch changes current gcc policy for 
 work with
 pathname  separators on hosts supporting both back and forward slashes from
 neutral(undefined) behavior to more defined - From now on hosts/builds (in 
 terms of host-build-target)
 supporting both back and forward slashes gcc tries to use forward slashes 
 both in filenames saved
 in binaries for target and for internal work.
 And this patch is just first, little step in this direction. In fact the 
 patch was published just
 to show problems and start discussion about ways for their solution.

 Above may not satisfy you and other gcc developers/consumers.
 As minimum I'm interested in Mingw people opinion.

 Arguments for new policy are simple - this policy should not affect
 native builds but helps a lot in case when host/build supports both kinds 
 of separators but
 target supports only forward slashes.

 Without explicit consensus on above I see no sense in the patch details 
 discussion.


Personally, I'm fine with that proposed change in policy.

As somebody who does not use Windows, it does not affect me.  I agree
that getting the opinion of the mingw maintainers makes sense.

I commented on the details of the patch because Joey pinged me to
submit it, and I am a libiberty maintainer.  If you want to discuss
the above proposal, then by all means discuss it.

Ian


[PATCH, ARM, LRA] Fixed bootstrap failure in Thumb mode

2013-11-07 Thread Yvan Roux
Hi,

this patch fixed an LRA cycling due to secondary reload (Thumb mode).
Notice that this patch is a prerequisite to turn on LRA by default on
ARM.  Bootstrapped on a9 and a15 without any regression in the
testsuite as LRA is off by default and with the regression reported in
the thread bellow when LRA is on.

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html

Thanks,
Yvan

2013-11-07  Yvan Roux  yvan.r...@linaro.org

* config/arm/arm.h (THUMB_SECONDARY_INPUT_RELOAD_CLASS): Return NO_REGS
for LRA.
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 1781b75..d054906 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1266,11 +1266,12 @@ enum reg_class
 
 /* Must leave BASE_REGS reloads alone */
 #define THUMB_SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, X)		\
-  ((CLASS) != LO_REGS  (CLASS) != BASE_REGS\
-   ? ((true_regnum (X) == -1 ? LO_REGS	\
-   : (true_regnum (X) + HARD_REGNO_NREGS (0, MODE)  8) ? LO_REGS	\
-   : NO_REGS)) 			\
-   : NO_REGS)
+  (lra_in_progress ? NO_REGS		\
+   : ((CLASS) != LO_REGS  (CLASS) != BASE_REGS			\
+  ? ((true_regnum (X) == -1 ? LO_REGS\
+ : (true_regnum (X) + HARD_REGNO_NREGS (0, MODE)  8) ? LO_REGS	\
+ : NO_REGS)) 			\
+  : NO_REGS))
 
 #define THUMB_SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, X)		\
   ((CLASS) != LO_REGS  (CLASS) != BASE_REGS\


[PATCH] Don't simplify_conversion_from_bitmask on x 1 (PR tree-optimization/59014)

2013-11-07 Thread Marek Polacek
Here, forward propagation turned
  _6 = a.1_5  1;
  _7 = (_Bool) _6;
into
  _7 = (_Bool) a.1_5;
but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1.
If a is an odd number, this is correct, but we don't know the value
of a, so I disabled this optimization for x  1 cases.  For e.g. x  255,
this still works correctly.

In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were
miscompiling the following testcase...

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-11-07  Marek Polacek  pola...@redhat.com

PR tree-optimization/59014
* tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't
optimize x  1 expressions here.
testsuite/
* gcc.dg/torture/pr59014.c: New test.

--- gcc/testsuite/gcc.dg/torture/pr59014.c.mp   2013-11-07 14:47:38.040941144 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr59014.c  2013-11-07 14:49:03.646271031 
+0100
@@ -0,0 +1,26 @@
+/* PR tree-optimization/59014 */
+/* { dg-do run } */
+
+int a = 2, b, c, d;
+
+int
+foo ()
+{
+  for (;; c++)
+if ((b  0) | (a  1))
+  ;
+else
+  {
+   d = a;
+   return 0;
+  }
+}
+
+int
+main ()
+{
+  foo ();
+  if (d != 2)
+__builtin_abort ();
+  return 0;
+}
--- gcc/tree-ssa-forwprop.c.mp  2013-11-07 14:30:43.785733810 +0100
+++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100
@@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple
   if (TREE_CODE (rhs_def_operand1) == SSA_NAME
   ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1)
   TREE_CODE (rhs_def_operand2) == INTEGER_CST
+ /* We can't turn
+  _6 = a  1;
+  _7 = (_Bool) _6;
+into
+  _7 = (_Bool) a;
+as that's wrong if A is an even number.  */
+  ! integer_onep (rhs_def_operand2)
   operand_equal_p (rhs_def_operand2,
  build_low_bits_mask (TREE_TYPE (rhs_def_operand2),
-  TYPE_PRECISION (lhs_type)),
+  TYPE_PRECISION (lhs_type)),
   0))
{
  /* This is an optimizable case.  Replace the source operand

Marek


Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Aldy Hernandez

On 11/07/13 00:38, Jakub Jelinek wrote:

On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote:

Hmmm, good point.  I've moved update_stmt and company to the caller,
and modified the caller to call regimplify_operands only for
GIMPLE_RETURN which is the special case.


Can't you (later) handle that without regimplification too?


Sure!  See attached patch.

But as discussed on IRC, I wonder whether we can do without the 
following in the attached patch:


+   tree repl = make_ssa_name (TREE_TYPE (retval), NULL);
+   stmt = gimple_build_assign (repl, retval);
+   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+   stmt = gimple_build_assign (ref, repl);

...and unconditionally do:

+ stmt = gimple_build_assign (ref, retval);

...since it seems all the GIMPLE_RETURNs I see can be replaced by 
``retval_array[iter] = whatever'' without creating something non-gimple 
(thus avoiding an SSA variable).


Either way, I'm ok.  Let me know.

Aldy
gcc/ChangeLog.gomp

* omp-low.c (ipa_simd_modify_function_body): Avoid
regimplification of GIMPLE_RETURNs.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 14b8571..6039de9 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11175,26 +11175,31 @@ ipa_simd_modify_function_body (struct cgraph_node 
*node,
{
case GIMPLE_RETURN:
  {
-   tree old_retval = gimple_return_retval (stmt);
-   if (old_retval)
+   tree retval = gimple_return_retval (stmt);
+   if (!retval)
  {
-   /* Replace `return foo' by `retval_array[iter] = foo'.  */
-   stmt = gimple_build_assign (build4 (ARRAY_REF,
-   TREE_TYPE (old_retval),
-   retval_array, iter,
-   NULL, NULL),
-   old_retval);
-   gsi_replace (gsi, stmt, true);
-   gimple_regimplify_operands (stmt, gsi);
-   info.modified = true;
+   gsi_remove (gsi, true);
+   continue;
  }
+
+   /* Replace `return foo' with `retval_array[iter] = foo'.  */
+   tree ref = build4 (ARRAY_REF,
+  TREE_TYPE (retval),
+  retval_array, iter,
+  NULL, NULL);
+   if (is_gimple_reg (retval))
+ stmt = gimple_build_assign (ref, retval);
else
  {
-   gsi_remove (gsi, true);
-   continue;
+   tree repl = make_ssa_name (TREE_TYPE (retval), NULL);
+   stmt = gimple_build_assign (repl, retval);
+   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+   stmt = gimple_build_assign (ref, repl);
  }
-   break;
+   gsi_replace (gsi, stmt, true);
+   info.modified = true;
  }
+ break;
 
default:
  walk_gimple_op (stmt, ipa_simd_modify_stmt_ops, wi);


Re: [PATCH] Add check for aarch64 in vect_cmdline_needed

2013-11-07 Thread Cesar Philippidis
On 11/6/13, 5:06 PM, Joseph S. Myers wrote:

 You should be testing aarch64*-*-* so as to match aarch64_be targets.

Thank you for catching that. Please commit this new patch if is OK. I
don't have SVN access.

Thanks,
Cesar

2013-11-06  Cesar Philippidis  ce...@codesourcery.com

gcc/testsuite/
* lib/target-supports.exp 
(check_effective_target_vect_cmdline_neeed): Add aarch64 to the list 
of targets that do not need command line argument to enable SIMD.

Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 423006)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -1920,7 +1920,8 @@ proc check_effective_target_vect_cmdline_needed {
 || [check_effective_target_powerpc_altivec]))
 || ([istarget sparc*-*-*]  [check_effective_target_sparc_vis])
  || [istarget spu-*-*]
-|| ([istarget arm*-*-*]  [check_effective_target_arm_neon]) } {
+|| ([istarget arm*-*-*]  [check_effective_target_arm_neon]) 
+|| [istarget aarch64*-*-*] } {
   set et_vect_cmdline_needed_saved 0
}
 }


Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 08:17:13AM -0700, Aldy Hernandez wrote:
 But as discussed on IRC, I wonder whether we can do without the
 following in the attached patch:
 
 + tree repl = make_ssa_name (TREE_TYPE (retval), NULL);
 + stmt = gimple_build_assign (repl, retval);
 + gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
 + stmt = gimple_build_assign (ref, repl);
 
 ...and unconditionally do:
 
 +   stmt = gimple_build_assign (ref, retval);

This should be sufficient.
 
 ...since it seems all the GIMPLE_RETURNs I see can be replaced by
 ``retval_array[iter] = whatever'' without creating something
 non-gimple (thus avoiding an SSA variable).
 
 Either way, I'm ok.  Let me know.

Thanks.

Jakub


Re: [PATCH] Don't simplify_conversion_from_bitmask on x 1 (PR tree-optimization/59014)

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 04:15:11PM +0100, Marek Polacek wrote:
 Here, forward propagation turned
   _6 = a.1_5  1;
   _7 = (_Bool) _6;
 into
   _7 = (_Bool) a.1_5;
 but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1.
 If a is an odd number, this is correct, but we don't know the value
 of a, so I disabled this optimization for x  1 cases.  For e.g. x  255,
 this still works correctly.
 
 In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were
 miscompiling the following testcase...
 
 Regtested/bootstrapped on x86_64-linux, ok for trunk?

Isn't the problem here not the constant 1, but the fact that TREE_CODE
(lhs_type) == BOOLEAN_TYPE, with the special properties of boolean types?
I mean, if lhs_type is say some normal INTEGER_TYPE with precision 1
(or say 3 and stmt  3 etc.), then the cast alone should cover the required
masking.

 2013-11-07  Marek Polacek  pola...@redhat.com
 
   PR tree-optimization/59014
   * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't
   optimize x  1 expressions here.
 testsuite/
   * gcc.dg/torture/pr59014.c: New test.
 
 --- gcc/tree-ssa-forwprop.c.mp2013-11-07 14:30:43.785733810 +0100
 +++ gcc/tree-ssa-forwprop.c   2013-11-07 14:42:31.836765375 +0100
 @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple
if (TREE_CODE (rhs_def_operand1) == SSA_NAME
  ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1)
  TREE_CODE (rhs_def_operand2) == INTEGER_CST
 +   /* We can't turn
 +_6 = a  1;
 +_7 = (_Bool) _6;
 +  into
 +_7 = (_Bool) a;
 +  as that's wrong if A is an even number.  */
 +! integer_onep (rhs_def_operand2)
  operand_equal_p (rhs_def_operand2,
 build_low_bits_mask (TREE_TYPE (rhs_def_operand2),
 -TYPE_PRECISION (lhs_type)),
 +TYPE_PRECISION (lhs_type)),
  0))
   {
 /* This is an optimizable case.  Replace the source operand
 
   Marek

Jakub


Re: [PATCH] Don't simplify_conversion_from_bitmask on x 1 (PR tree-optimization/59014)

2013-11-07 Thread Richard Biener
On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek pola...@redhat.com wrote:
 Here, forward propagation turned
   _6 = a.1_5  1;
   _7 = (_Bool) _6;
 into
   _7 = (_Bool) a.1_5;
 but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 1.

?

(_Bool) 2

should truncate the value to zero.

Richard.

 If a is an odd number, this is correct, but we don't know the value
 of a, so I disabled this optimization for x  1 cases.  For e.g. x  255,
 this still works correctly.


 In this PR, VRP created wrong ASSERT_EXPR because of this, thus we were
 miscompiling the following testcase...

 Regtested/bootstrapped on x86_64-linux, ok for trunk?

 2013-11-07  Marek Polacek  pola...@redhat.com

 PR tree-optimization/59014
 * tree-ssa-forwprop.c (simplify_conversion_from_bitmask): Don't
 optimize x  1 expressions here.
 testsuite/
 * gcc.dg/torture/pr59014.c: New test.

 --- gcc/testsuite/gcc.dg/torture/pr59014.c.mp   2013-11-07 14:47:38.040941144 
 +0100
 +++ gcc/testsuite/gcc.dg/torture/pr59014.c  2013-11-07 14:49:03.646271031 
 +0100
 @@ -0,0 +1,26 @@
 +/* PR tree-optimization/59014 */
 +/* { dg-do run } */
 +
 +int a = 2, b, c, d;
 +
 +int
 +foo ()
 +{
 +  for (;; c++)
 +if ((b  0) | (a  1))
 +  ;
 +else
 +  {
 +   d = a;
 +   return 0;
 +  }
 +}
 +
 +int
 +main ()
 +{
 +  foo ();
 +  if (d != 2)
 +__builtin_abort ();
 +  return 0;
 +}
 --- gcc/tree-ssa-forwprop.c.mp  2013-11-07 14:30:43.785733810 +0100
 +++ gcc/tree-ssa-forwprop.c 2013-11-07 14:42:31.836765375 +0100
 @@ -1199,9 +1199,16 @@ simplify_conversion_from_bitmask (gimple
if (TREE_CODE (rhs_def_operand1) == SSA_NAME
! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs_def_operand1)
TREE_CODE (rhs_def_operand2) == INTEGER_CST
 + /* We can't turn
 +  _6 = a  1;
 +  _7 = (_Bool) _6;
 +into
 +  _7 = (_Bool) a;
 +as that's wrong if A is an even number.  */
 +  ! integer_onep (rhs_def_operand2)
operand_equal_p (rhs_def_operand2,
   build_low_bits_mask (TREE_TYPE 
 (rhs_def_operand2),
 -  TYPE_PRECISION (lhs_type)),
 +  TYPE_PRECISION (lhs_type)),
0))
 {
   /* This is an optimizable case.  Replace the source operand

 Marek


Re: [PATCH] Don't simplify_conversion_from_bitmask on x 1 (PR tree-optimization/59014)

2013-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2013 at 04:39:03PM +0100, Richard Biener wrote:
 On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek pola...@redhat.com wrote:
  Here, forward propagation turned
_6 = a.1_5  1;
_7 = (_Bool) _6;
  into
_7 = (_Bool) a.1_5;
  but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 
  1.
 
 ?
 
 (_Bool) 2
 
 should truncate the value to zero.

You're right, say:
_Bool foo (int x)
{
  return (_Bool) (x  1);
}
which due to the simplify_conversion_from_bitmask optimization is optimized
into:
  _Bool _3;

  bb 2:
  _3 = (_Bool) x_1(D);
  return _3;
is indeed expanded as:
(set (reg:QI 90 [ D.1789 ])
(and:QI (reg:QI 90 [ D.1789 ])
(const_int 1 [0x1])))

Jakub


[PATCH] make has_gate and has_execute useless

2013-11-07 Thread tsaunders
From: Trevor Saunders tsaund...@mozilla.com

Hi,

 This is the result of seeing what it would take to get rid of the has_gate and
has_execute flags on pass_data.  It turns out not much, but I wanted
confirmation this part is ok before I go deal with all the places that
initialize the fields.

I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite
regressions (ignoring the silk stuff because the full paths in its test names
break my test script for now) Any reason this patch with the actual removal of 
the flags wouldn't be ok?

Trev

2013-11-06  Trevor Saunders  tsaund...@mozilla.com

* pass_manager.h (pass_manager): Adjust.
* passes.c (opt_pass::execute): Tell the pass manager it doesn't need
to do anything for this pass.
(pass_manager::register_dump_files_1): Don't uselessly deal with
properties of passes.
(pass_manager::register_dump_files): Adjust.
(dump_one_pass): Just call pass-gate ().
(execute_ipa_summary_passes): Likewise.
(execute_one_pass): Don't check pass-has_execute flag.
(ipa_write_summaries_2): Don't check pass-has_gate flag.
(ipa_write_optimization_summaries_1): Likewise.
(ipa_read_summaries_1): Likewise.
(ipa_read_optimization_summaries_1): Likewise.
(execute_ipa_stmt_fixups): Likewise.
* tree-pass.h (pass_data): Rename has_gate to useless_has_gate and
has_execute to useless_has_execute to be sure they're unused.
(TODO_absolutely_nothing): New constant.


diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index 77d78eb..3bc0a99 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -93,7 +93,7 @@ public:
 
 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  int register_dump_files_1 (struct opt_pass *pass, int properties);
+  void register_dump_files_1 (struct opt_pass *pass);
   void register_dump_files (struct opt_pass *pass, int properties);
 
 private:
diff --git a/gcc/passes.c b/gcc/passes.c
index 19e5869..3b28dc9 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -112,7 +112,7 @@ opt_pass::gate ()
 unsigned int
 opt_pass::execute ()
 {
-  return 0;
+  return TODO_absolutely_nothing;
 }
 
 opt_pass::opt_pass (const pass_data data, context *ctxt)
@@ -701,33 +701,21 @@ pass_manager::register_one_dump_file (struct opt_pass 
*pass)
 
 /* Recursive worker function for register_dump_files.  */
 
-int
+void
 pass_manager::
-register_dump_files_1 (struct opt_pass *pass, int properties)
+register_dump_files_1 (struct opt_pass *pass)
 {
   do
 {
-  int new_properties = (properties | pass-properties_provided)
-   ~pass-properties_destroyed;
-
   if (pass-name  pass-name[0] != '*')
 register_one_dump_file (pass);
 
   if (pass-sub)
-new_properties = register_dump_files_1 (pass-sub, new_properties);
-
-  /* If we have a gate, combine the properties that we could have with
- and without the pass being examined.  */
-  if (pass-has_gate)
-properties = new_properties;
-  else
-properties = new_properties;
+register_dump_files_1 (pass-sub);
 
   pass = pass-next;
 }
   while (pass);
-
-  return properties;
 }
 
 /* Register the dump files for the pass_manager starting at PASS.
@@ -739,7 +727,7 @@ pass_manager::
 register_dump_files (struct opt_pass *pass,int properties)
 {
   pass-properties_required |= properties;
-  register_dump_files_1 (pass, properties);
+  register_dump_files_1 (pass);
 }
 
 struct pass_registry
@@ -847,7 +835,7 @@ dump_one_pass (struct opt_pass *pass, int pass_indent)
   const char *pn;
   bool is_on, is_really_on;
 
-  is_on = pass-has_gate ? pass-gate () : true;
+  is_on = pass-gate ();
   is_really_on = override_gate_status (pass, current_function_decl, is_on);
 
   if (pass-static_pass_number = 0)
@@ -2002,7 +1990,7 @@ execute_ipa_summary_passes (struct ipa_opt_pass_d 
*ipa_pass)
 
   /* Execute all of the IPA_PASSes in the list.  */
   if (ipa_pass-type == IPA_PASS
-  ((!pass-has_gate) || pass-gate ())
+  pass-gate ()
   ipa_pass-generate_summary)
{
  pass_init_dump_file (pass);
@@ -2153,7 +2141,7 @@ execute_one_pass (struct opt_pass *pass)
 
   /* Check whether gate check should be avoided.
  User controls the value of the gate through the parameter gate_status. 
*/
-  gate_status = pass-has_gate ? pass-gate () : true;
+  gate_status = pass-gate ();
   gate_status = override_gate_status (pass, current_function_decl, 
gate_status);
 
   /* Override gate with plugin.  */
@@ -2210,11 +2198,11 @@ execute_one_pass (struct opt_pass *pass)
 timevar_push (pass-tv_id);
 
   /* Do it!  */
-  if (pass-has_execute)
-{
-  todo_after = pass-execute ();
-  do_per_function (clear_last_verified, NULL);
-}
+  todo_after = pass-execute ();
+  if (todo_after != TODO_absolutely_nothing)
+do_per_function (clear_last_verified, NULL);
+  

Re: RFC: simd enabled functions (omp declare simd / elementals)

2013-11-07 Thread Martin Jambor
Hi,

On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
 Hello gentlemen.  I'm CCing all of you, because each of you can
 provide valuable feedback to various parts of the compiler which I
 touch.  I have sprinkled love notes with your names throughout the
 post :).

sorry it took me so long, for various reasons I out of my control I've
accumulated quite a backlog of email and tasks last week and it took
me a lot of time to chew through it all.

...

 [Martin] I have adapted the ipa_parm_adjustment infrastructure to
 allow adding new arguments out of the blue like you mentioned was
 missing in ipa-prop.h.  I have also added support for creating
 vectors of arguments.  Could you take a look at my changes to
 ipa-prop.[ch]?

Sure, though I have only looked at ipa-* and tree-sra.c stuff.  I do
not have any real objections but would suggest a few amendments.  

I am glad this is becoming a useful infrastructure rather than just a
part of IPA-SRA.  Note that while ipa_combine_adjustments is not used
from anywhere and thus probably buggy anyway, it should in theory be
able to process new_param adjustments too.  Can you please at least
put a not implemented assert there?  (The reason is that the plan
still is to replace args_to_skip bitmaps in cgraphclones.c by
adjustments one day and we do need to combine clones.)

 
 [Martin] I need to add new arguments in the case of inbranch clones,
 which add an additional vector with a mask as the last argument:
 For the following:
 
 #pragma omp declare simd simdlen(4) inbranch
 int foo (int a)
 {
   return a + 1234;
 }
 
 ...we would generate a clone with:
 
 vector(4) int
 foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)
 
 I thought it best to enhance ipa_modify_formal_parameters() and
 associated machinery than to add the new argument ad-hoc.  We
 already have enough ways of doing tree and cgraph versioning in the
 compiler ;-).
 

...

 gcc/ChangeLog.elementals
 
   * Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.
   * tree-vect-stmts.c (vectorizable_call): Allow  3 arguments when
   a SIMD clone may be available.
   (vectorizable_function): Use SIMD clone if available.
   * ipa-cp.c (determine_versionability): Nodes with SIMD clones are
   not versionable.
   * ggc.h (ggc_alloc_cleared_simd_clone_stat): New.
   * cgraph.h (enum linear_stride_type): New.
   (struct simd_clone_arg): New.
   (struct simd_clone): New.
   (struct cgraph_node): Add simdclone and simdclone_of fields.
   (get_simd_clone): Protoize.
   * cgraph.c (get_simd_clone): New.
   Add `has_simd_clones' field.
   * ipa-cp.c (determine_versionability): Disallow functions with
   simd clones.

(This looks like a repeated entry.)

   * ipa-prop.h (ipa_sra_modify_function_body): Protoize.
   (sra_ipa_modify_expr): Same.
   (struct ipa_parm_adjustment): Add new_arg_prefix and new_param
   fields.  Document their use.
   * ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand
   new parameters and minor cleanups.
   * omp-low.c: Add new pass_omp_simd_clone support code.
   (make_pass_omp_simd_clone): New.
   (pass_data_omp_simd_clone): Declare.
   (class pass_omp_simd_clone): Declare.
   (vecsize_mangle): New.
   (ipa_omp_simd_clone): New.
   (simd_clone_clauses_extract): New.
   (simd_clone_compute_base_data_type): New.
   (simd_clone_compute_vecsize_and_simdlen): New.
   (simd_clone_create): New.
   (simd_clone_adjust_return_type): New.
   (simd_clone_adjust_return_types): New.
   (simd_clone_adjust): New.
   (simd_clone_init_simd_arrays): New.
   (ipa_simd_modify_function_body): New.
   (simd_clone_mangle): New.
   (simd_clone_struct_alloc): New.
   (simd_clone_struct_copy): New.
   (class argno_map): New.
   (argno_map::argno_map(tree)): New.
   (argno_map::~argno_map): New.
   (argno_map::operator []): New.
   (argno_map::length): New.
   (expand_simd_clones): New.
   (create_tmp_simd_array): New.
   * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New.
   * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
   * tree-pass.h (make_pass_omp_simd_clone): New.
   * passes.def (pass_omp_simd_clone): New.
   * target.def: Define new hook prefix TARGET_CILKPLUS_.
   (default_vecsize_mangle): New.
   (vecsize_for_mangle): New.
   * doc/tm.texi.in: Add placeholder for
   TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and
   TARGET_CILKPLUS_VECSIZE_FOR_MANGLE.
   * tree-sra.c (sra_ipa_modify_expr): Remove static modifier.
   (ipa_sra_modify_function_body): Same.
   * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
   * doc/tm.texi: Regenerate.
   * config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New.
   (ix86_cilkplus_vecsize_for_mangle): New.
   (TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE): New.
  

Re: [PATCH] Don't simplify_conversion_from_bitmask on x 1 (PR tree-optimization/59014)

2013-11-07 Thread Marek Polacek
On Thu, Nov 07, 2013 at 04:39:03PM +0100, Richard Biener wrote:
 On Thu, Nov 7, 2013 at 4:15 PM, Marek Polacek pola...@redhat.com wrote:
  Here, forward propagation turned
_6 = a.1_5  1;
_7 = (_Bool) _6;
  into
_7 = (_Bool) a.1_5;
  but that's wrong: e.g. if a = 2 then (_Bool) _6 is 0, but (_Bool) a.1_5 is 
  1.
 
 ?
 
 (_Bool) 2
 
 should truncate the value to zero.

Aha.  I can see now, said the blind man.

Anyway, I'll revisit this in stage 3; I now suspect the special
handling of BIT_IOR_EXPR in register_edge_assert_for_1.

Marek


Re: Implement C11 _Atomic

2013-11-07 Thread Uros Bizjak
On Wed, Nov 6, 2013 at 12:21 AM, Joseph S. Myers
jos...@codesourcery.com wrote:
 This patch, relative to trunk and based on work done on the C11-atomic
 branch, adds support for C11 _Atomic.  It is intended to include all
 the required language support.

 It does not include the stdatomic.h header; there's a version on the
 branch, but it needs further review against the standard and test
 coverage adding to the testsuite before I can propose it for mainline.

 Support for atomic types having bigger alignment than the
 corresponding non-atomic types is limited: it includes the code to
 increase the alignment of types whose size is exactly 1, 2, 4, 8 or 16
 to that of the corresponding integer type [*], but not anything for
 target-specific alignment increases.  There's code for target-specific
 alignment on the branch (and I intend to merge trunk back to the
 branch once this patch is on trunk, so it's easy to tell what the
 changes still left on the branch are), should any target maintainers
 wish to have such alignment.  Note however that ideally libstdc++
 atomics would be ABI-compatible with C atomics, requiring them to get
 the same alignment; the branch has an atomic attribute for that
 purpose, but I think further work on the C++ front-end changes would
 be needed for them to be ready for mainline.

 [*] The c-common.c code to resolve size-generic atomic built-in
 functions to size-specific ones assumes that types are naturally
 aligned (and so resolves to calls to the size-specific functions that
 require natural alignment) without checking it.  If users should be
 able to use the size-generic functions on types with lesser alignment,
 e.g. _Complex double (8-byte rather than 16-byte aligned), rather than
 just on their _Atomic variants, this is of course a bug in the code
 resolving those built-in functions.  (The libatomic functions properly
 check for alignment at runtime.)  But it seems reasonable enough
 anyway to make _Atomic _Complex double 16-byte aligned.

 Full use of _Atomic requires linking with libatomic, if you're doing
 any atomic operations that aren't fully expanded inline; arguably
 libatomic should be linked in by default with --as-needed (given that
 previously ordinary C code didn't require the user to link any
 libraries explicitly unless they directly called library functions),
 but that's independent of this patch.  Correct handling of atomic
 compound assignment for floating-point types also requires built-in
 support for floating-point environment manipulation operations roughly
 equivalent to feholdexcept, feclearexcept and feupdateenv (built-ins
 are used to avoid introducing libm dependencies, which generic C code
 not actually calling libm library functions shouldn't require); this
 patch includes such support for x86 [*], and I expect other
 architectures to be simpler.  If you don't have a libatomic port, the
 execution tests for _Atomic simply won't be run; if you have libatomic
 but not the floating-point support, the tests will be run but
 c11-atomic-exec-5.c will fail (assuming the library features are
 present for that test to run at all - it also requires pthreads).

 [*] This could be optimized if desired by passing the types in
 question to the target hook so it can generate code for only one of
 x87 and SSE in most cases, much like an optimization present in glibc
 when it internally does feholdexcept / fesetenv / feupdateenv
 sequences.

 _Atomic support is currently disabled for Objective-C and OpenMP.  For
 both (but mainly OpenMP), the relevant parser code needs checking to
 determine where convert_lvalue_to_rvalue calls need inserting to
 ensure that accesses to atomic variables involve atomic loads.  For
 Objective-C, there are also various special cases of compound
 assignment that need special handling for atomics just as standard C
 compound assignment is handled differently for atomics, as well as
 some TYPE_MAIN_VARIANT calls to check for correctness for atomics; see
 the comment on the relevant sorry () call for details.  OpenMP should
 also have TYPE_MAIN_VARIANT uses checked as well as a use of
 TYPE_QUALS_NO_ADDR_SPACE for a diagnostic in
 c_parser_omp_declare_reduction (where the diagnostic refers to a
 particular list of qualifiers).

 Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
 commit (the various non-front-end pieces)?

 gcc:
 2013-11-05  Andrew MacLeod  amacl...@redhat.com
 Joseph Myers  jos...@codesourcery.com

 * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC.
 (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE,
 TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE.
 (struct tree_base): Add atomic_flag field.
 * tree.h (TYPE_ATOMIC): New accessor macro.
 (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC.
 (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro.
 (atomicQI_type_node, atomicHI_type_node, 

Clean up atomic tests

2013-11-07 Thread Joseph S. Myers
This patch cleans up various issues with the tests of atomics built-in
functions and libatomic functions, in preparation for adapting those
tests to add test coverage of stdatomic.h macros.  The tests were
missing a return type for main (C11 doesn't allow implicit int return
types).  Some tests were incrementing a variable in one part of an
expression and using its value in another part without a sequence
point in between.  And the libatomic tests shouldn't have been
restricted to targets with hardware sync_* support, or adding
command-line options to make such support available, since they are
built with -fno-inline-atomics anyway to ensure the library
functionality is what's tested, and the library is meant to cover all
types regardless of what hardware support may be available.

Tested for x86_64-unknown-linux-gnu.  OK to commit?

gcc/testsuite:
2013-11-07  Joseph Myers  jos...@codesourcery.com

* gcc.dg/atomic-compare-exchange-1.c,
gcc.dg/atomic-compare-exchange-2.c,
gcc.dg/atomic-compare-exchange-3.c,
gcc.dg/atomic-compare-exchange-4.c,
gcc.dg/atomic-compare-exchange-5.c, gcc.dg/atomic-exchange-1.c,
gcc.dg/atomic-exchange-2.c, gcc.dg/atomic-exchange-3.c,
gcc.dg/atomic-exchange-4.c, gcc.dg/atomic-exchange-5.c,
gcc.dg/atomic-fence.c, gcc.dg/atomic-flag.c,
gcc.dg/atomic-generic.c, gcc.dg/atomic-invalid.c,
gcc.dg/atomic-load-1.c, gcc.dg/atomic-load-2.c,
gcc.dg/atomic-load-3.c, gcc.dg/atomic-load-4.c,
gcc.dg/atomic-load-5.c, gcc.dg/atomic-lockfree.c,
gcc.dg/atomic-noinline.c, gcc.dg/atomic-op-1.c,
gcc.dg/atomic-op-2.c, gcc.dg/atomic-op-3.c, gcc.dg/atomic-op-4.c,
gcc.dg/atomic-op-5.c, gcc.dg/atomic-param.c,
gcc.dg/atomic-store-1.c, gcc.dg/atomic-store-2.c,
gcc.dg/atomic-store-3.c, gcc.dg/atomic-store-4.c,
gcc.dg/atomic-store-5.c: Declare main as returning int.
* gcc.dg/atomic-exchange-1.c, gcc.dg/atomic-exchange-2.c,
gcc.dg/atomic-exchange-3.c, gcc.dg/atomic-exchange-4.c,
gcc.dg/atomic-exchange-5.c: Separate increments of count from
expression using value of count.

libatomic:
2013-11-07  Joseph Myers  jos...@codesourcery.com

* testsuite/libatomic.c/atomic-compare-exchange-1.c,
testsuite/libatomic.c/atomic-compare-exchange-2.c,
testsuite/libatomic.c/atomic-compare-exchange-3.c,
testsuite/libatomic.c/atomic-compare-exchange-4.c,
testsuite/libatomic.c/atomic-compare-exchange-5.c,
testsuite/libatomic.c/atomic-exchange-1.c,
testsuite/libatomic.c/atomic-exchange-2.c,
testsuite/libatomic.c/atomic-exchange-3.c,
testsuite/libatomic.c/atomic-exchange-4.c,
testsuite/libatomic.c/atomic-exchange-5.c,
testsuite/libatomic.c/atomic-generic.c,
testsuite/libatomic.c/atomic-load-1.c,
testsuite/libatomic.c/atomic-load-2.c,
testsuite/libatomic.c/atomic-load-3.c,
testsuite/libatomic.c/atomic-load-4.c,
testsuite/libatomic.c/atomic-load-5.c,
testsuite/libatomic.c/atomic-op-1.c,
testsuite/libatomic.c/atomic-op-2.c,
testsuite/libatomic.c/atomic-op-3.c,
testsuite/libatomic.c/atomic-op-4.c,
testsuite/libatomic.c/atomic-op-5.c,
testsuite/libatomic.c/atomic-store-1.c,
testsuite/libatomic.c/atomic-store-2.c,
testsuite/libatomic.c/atomic-store-3.c,
testsuite/libatomic.c/atomic-store-4.c,
testsuite/libatomic.c/atomic-store-5.c: Declare main as returning
int.  Do not require built-in sync support or add target-specific
options.
* testsuite/libatomic.c/atomic-exchange-1.c,
testsuite/libatomic.c/atomic-exchange-2.c,
testsuite/libatomic.c/atomic-exchange-3.c,
testsuite/libatomic.c/atomic-exchange-4.c,
testsuite/libatomic.c/atomic-exchange-5.c: Separate increments of
count from expression using value of count.

Index: gcc/testsuite/gcc.dg/atomic-op-2.c
===
--- gcc/testsuite/gcc.dg/atomic-op-2.c  (revision 204508)
+++ gcc/testsuite/gcc.dg/atomic-op-2.c  (working copy)
@@ -528,6 +528,7 @@
 abort ();
 }
 
+int
 main ()
 {
   test_fetch_add ();
Index: gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
===
--- gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c(revision 204508)
+++ gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c(working copy)
@@ -16,6 +16,7 @@
 #define STRONG 0
 #define WEAK 1
 
+int
 main ()
 {
 
Index: gcc/testsuite/gcc.dg/atomic-exchange-3.c
===
--- gcc/testsuite/gcc.dg/atomic-exchange-3.c(revision 204508)
+++ gcc/testsuite/gcc.dg/atomic-exchange-3.c(working copy)
@@ -9,25 +9,31 @@
 
 int v, count, ret;
 
+int
 main ()
 {
   v = 0;
   count = 0;
 
-  if (__atomic_exchange_n (v, count 

Re: Implement C11 _Atomic

2013-11-07 Thread Jakub Jelinek
On Tue, Nov 05, 2013 at 11:21:56PM +, Joseph S. Myers wrote:
 2013-11-05  Andrew MacLeod  amacl...@redhat.com
   Joseph Myers  jos...@codesourcery.com
 
   * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC.
   (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE,
   TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE.
   (struct tree_base): Add atomic_flag field.
   * tree.h (TYPE_ATOMIC): New accessor macro.
   (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC.
   (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro.
   (atomicQI_type_node, atomicHI_type_node, atomicSI_type_node)
   (atomicDI_type_node, atomicTI_type_node): New macros for type
   nodes.
   * tree.c (set_type_quals): Set TYPE_ATOMIC.
   (find_atomic_core_type): New function.
   (build_qualified_type): Adjust alignment for qualified types.
   (build_atomic_base): New function
   (build_common_tree_nodes): Build atomicQI_type_node,
   atomicHI_type_node, atomicSI_type_node, atomicDI_type_node and
   atomicTI_type_node.
   * print-tree.c (print_node): Print atomic qualifier.
   * tree-pretty-print.c (dump_generic_node): Print atomic type
   attribute.
   * target.def (atomic_assign_expand_fenv): New hook.
   * doc/tm.texi.in (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New @hook.
   * doc/tm.texi: Regenerate.
   * targhooks.c (default_atomic_assign_expand_fenv): New function.
   * targhooks.h (default_atomic_assign_expand_fenv): Declare.
   * sync-builtins.def (__atomic_feraiseexcept): New built-in
   function.

 2013-11-05  Joseph Myers  jos...@codesourcery.com
 
   * lib/target-supports.exp
   (check_effective_target_fenv_exceptions): New function.
   * lib/atomic-dg.exp, gcc.dg/atomic/atomic.exp: New files.
   * gcc.dg/atomic/c11-atomic-exec-1.c,
   gcc.dg/atomic/c11-atomic-exec-2.c,
   gcc.dg/atomic/c11-atomic-exec-3.c,
   gcc.dg/atomic/c11-atomic-exec-4.c,
   gcc.dg/atomic/c11-atomic-exec-5.c, gcc.dg/c11-atomic-1.c,
   gcc.dg/c11-atomic-2.c, gcc.dg/c11-atomic-3.c,
   gcc.dg/c90-atomic-1.c, gcc.dg/c99-atomic-1.c: New tests.
 
 libatomic:
 2013-11-05  Joseph Myers  jos...@codesourcery.com
 
   * fenv.c: New file.
   * libatomic.map (LIBATOMIC_1.1): New symbol version.  Include
   __atomic_feraiseexcept.
   * configure.ac (libtool_VERSION): Change to 2:0:1.
   (fenv.h): Test for header.
   * Makefile.am (libatomic_la_SOURCES): Add fenv.c.
   * Makefile.in, auto-config.h.in, configure: Regenerate.

The middle-end, libatomic and testsuite changes are ok.

Jakub


Re: Cilk Library

2013-11-07 Thread Jeff Law

On 11/07/13 06:11, Thomas Schwinge wrote:

Hi!

On Wed, 9 Oct 2013 18:32:11 +, Iyer, Balaji V balaji.v.i...@intel.com 
wrote:

 * Makefile.def: Add libcilkrts to target_modules.  Make libcilkrts
 depend on libstdc++ and libgcc.
 [...]
 * Makefile.in: Added libcilkrts related fields to support building it.


How did you modify the latter file?  I noticed it is no longer in sync
with the former: if I regenerate it (»autogen Makefile.def«), then there
are differences.  This is easily fixed, of course, but as everyone now
has been using the out-of-sync Makefile.in, I wonder whether the
regeneration qualifies as obvious to commit, or rather something in
Makefile.def needs to be changed to make it match the Makefile.in as it
has been checked in in r204173?

* Makefile.in: Regenerate.
I think we should consider regeneration as an obvious change.  If that 
breaks something, then it's the original author who introduced the 
change without a corresponding regenerate that needs to fix their code.


Just to be explicit, this is fine for the trunk.

Jeff



Re: Re-factor tree.h - Part 1

2013-11-07 Thread Jeff Law

On 11/07/13 10:24, Diego Novillo wrote:

On Thu, Nov 7, 2013 at 5:29 AM, Richard Biener
richard.guent...@gmail.com wrote:


Moved to fold-const.c:
 size_low_cst


Same as int_cst_value just with lack of any checking?


It may be. It's used by the POINTER_EXPR_PLUS handler. I will try to
replace it in a later patch.




Moved to gimple-fold.c:
 truth_type_for (maybe keep extern?)


Uses a langhook so should stay here.


Done.


Moved to tree-ssa-dom.c:
 iterative_hash_exprs_commutative


I'm not sure - it seems that iterative_hash_expr does exactly
the same thing?  So instead of moving things this should
be refactored (make that function re-usable from iterative_hash_expr).


Future patch? Making it static will at least prevent diverging uses of
the similar functions.
I think future patch.  In theory we ought to just make 
iterative_hash_exprs_commutative just go away, but there's some 
restructuring of the DOM code that would need to happen first.


jeff



Committed: arc_ifcvt: take commutativity into account

2013-11-07 Thread Joern Rennecke



tmp
Description: Binary data


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-07 Thread Joseph S. Myers
On Thu, 7 Nov 2013, Rong Xu wrote:

 Thanks Joseph for these detailed comments/suggestions.
 The fixed patch is attached to this email.
 The only thing left out is the Texinfo manual. Do you mean this tool
 should have its it's own texi file in gcc/doc?

Its own texi file, probably included as a chapter by gcc.texi (like 
gcov.texi is).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] make has_gate and has_execute useless

2013-11-07 Thread Jeff Law

On 11/07/13 09:00, tsaund...@mozilla.com wrote:

From: Trevor Saunders tsaund...@mozilla.com

Hi,

  This is the result of seeing what it would take to get rid of the has_gate and
has_execute flags on pass_data.  It turns out not much, but I wanted
confirmation this part is ok before I go deal with all the places that
initialize the fields.

Most definitely, the has_gate/has_execute stuff is definitely a wart.



I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test suite
regressions (ignoring the silk stuff because the full paths in its test names
break my test script for now) Any reason this patch with the actual removal of 
the flags wouldn't be ok?

I don't see anything glaringly wrong.

Any reason why the register_dump_files bits were mucking with 
properties.  That just seems wrong.




+  todo_after = pass-execute ();
+  if (todo_after != TODO_absolutely_nothing)
+do_per_function (clear_last_verified, NULL);
+  else
+todo_after = ~TODO_absolutely_nothing;


Isn't the last assignment there just
todo_after = 0;

?

Jeff


[wide-int, committed] Fix nds32 build

2013-11-07 Thread Richard Sandiford
Get the new port building on wide-int.  Seemed pretty obvious so I went
ahead and installed it.

Thanks,
Richard


Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c2013-11-05 13:06:54.744239262 +
+++ gcc/config/nds32/nds32.c2013-11-07 10:27:03.339028225 +
@@ -1273,7 +1273,7 @@ nds32_construct_isr_vectors_information
  /* Pick up each vector id value.  */
  id = TREE_VALUE (id_list);
  /* Add vector_number_offset to get actual vector number.  */
- vector_id = TREE_INT_CST_LOW (id) + vector_number_offset;
+ vector_id = tree_to_uhwi (id) + vector_number_offset;
 
  /* Enable corresponding vector and set function name.  */
  nds32_isr_vectors[vector_id].category = (intr)
@@ -1315,7 +1315,7 @@ nds32_construct_isr_vectors_information
 
   /* The total vectors = interrupt + exception numbers + reset.
  There are 8 exception and 1 reset in nds32 architecture.  */
-  nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1;
+  nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1;
   strcpy (nds32_isr_vectors[0].func_name, func_name);
 
   /* Retrieve nmi and warm function.  */
@@ -3145,8 +3145,8 @@ nds32_insert_attributes (tree decl, tree
  id = TREE_VALUE (id_list);
  /* Issue error if it is not a valid integer value.  */
  if (TREE_CODE (id) != INTEGER_CST
- || TREE_INT_CST_LOW (id)  lower_bound
- || TREE_INT_CST_LOW (id)  upper_bound)
+ || wi::ltu_p (id, lower_bound)
+ || wi::gtu_p (id, upper_bound))
error (invalid id value for interrupt/exception attribute);
 
  /* Advance to next id.  */
@@ -3173,8 +3173,8 @@ nds32_insert_attributes (tree decl, tree
 
  /* 3. Check valid integer value for reset.  */
  if (TREE_CODE (id) != INTEGER_CST
- || TREE_INT_CST_LOW (id)  lower_bound
- || TREE_INT_CST_LOW (id)  upper_bound)
+ || wi::ltu_p (id, lower_bound)
+ || wi::gtu_p (id, upper_bound))
error (invalid id value for reset attribute);
 
  /* 4. Check valid function for nmi/warm.  */


Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise

2013-11-07 Thread Bruce Korb

On 11/06/13 15:29, Ian Lance Taylor wrote:

When fenv.h is not fixed, libquadmath does not build.

This patch works around the problem.  Bootstrapped and tested on
x86_64-unknown-linux-gnu.

OK for mainline?


Hi Ian,

Yes, please.

This time, I'm on my dev box and looked at the code.
You remembered correctly that the first file name in the list
of file names needs to not have wild card characters so that
the testing scheme can create a file by that name.  A directory
named * is possible, but inconvenient.

Anyway, it also matches prior art in:


/*
 * glibc_c99_inline_3
 */
fix = {
hackname  = glibc_c99_inline_3;
files = bits/string2.h, '*/bits/string2.h';


but is inconsistent with:


/* Some versions of glibc have a version of bits/string2.h that
   produces value computed is not used warnings from strncpy; fix
   this definition by using __builtin_strncpy instead as in newer
   versions.  */
fix = {
hackname  = glibc_strncpy;
files = bits/string2.h;


Hmmm.  Does that fix need fixing, too?

Thanks -Bruce


wide-int testing results

2013-11-07 Thread Richard Sandiford
I wanted to make sure that each backend still builds with wide-int and that
there weren't any unexplained changes in assembly output.  I arbitrarily
picked one target for each CPU:

aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf
fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf
lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf
microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf
msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf
powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf
tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf
x86_64-darwin

and built gcc and g++ from both the merge point and the wide-int branch.
The branch included the patches I've posted and a local change
to put CONST_WIDE_INT at the end of rtl.def (just for comparison,
as mentioned before).  I then compiled gcc.c-torture, gcc.dg and g++.dg
at -O2 and compared the asm output.  This obviously isn't a very strong
test, since none of the libraries are built, and since some of the test
cases rely on system headers, but it should at least be better than nothing.

pdp11 and picochip-elf don't build on mainline due to:

   gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was not 
declared in this scope
   #   define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor

Both the merge point and the branch built the other targets and there were
no new warnings on the branch.  The only asm differences were in:

* gcc.dg/fixed-point/const-1.s on targets that support it

  I think this is a known improvement.  The test case checks a series of
  conversions to make sure that the out-of-range ones produce a warning.
  E.g.:

short _Fract sfF = 1.0;  /* { dg-warning overflow } */

  The asm differences are in the values produced for these out-of-range
  casees.  The old code used real_to_integer2, which saturates the result
  to double_int precision.  It then uses the fractional part of that result
  to initialise sfF.  The result therefore depends on the host (at least
  in the general case).

  The new code instead saturates to the precision of the result, just like
  we already do for:

signed short s = 32768.0f;

  So this seems like a good change. 

* gcc.dg/vect/pr51799.c on m32c-elf and
* gcc.c-torture/execute/divconst-2.c on xstormy16-elf

  These are cases where wide-int converts a narrower-than-HWI
  multiplication by 1  (GET_MODE_PRECISION - 1) into a shift but
  mainline doesn't.  The mainline code looks like:

  /* Convert multiply by constant power of two into shift unless
 we are still generating RTL.  This test is a kludge.  */
  if (CONST_INT_P (trueop1)
   (val = exact_log2 (UINTVAL (trueop1))) = 0
  /* If the mode is larger than the host word size, and the
 uppermost bit is set, then this isn't a power of two due
 to implicit sign extension.  */
   (width = HOST_BITS_PER_WIDE_INT
  || val != HOST_BITS_PER_WIDE_INT - 1))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));

  and this exact_log2 doesn't cope properly with the sign-extended
  0xff8000... CONST_INT.  The wide-int code is:

  /* Convert multiply by constant power of two into shift.  */
  if (CONST_SCALAR_INT_P (trueop1))
{
  val = wi::exact_log2 (std::make_pair (trueop1, mode));
  if (val = 0  val  GET_MODE_BITSIZE (mode))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
}

  So this too seems like a good thing.

Thanks,
Richard


Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise

2013-11-07 Thread Bruce Korb

So is this the right patch?


$ svn diff inclhack.def
Index: inclhack.def
===
--- inclhack.def(revision 204533)
+++ inclhack.def(working copy)
@@ -1738,7 +1738,7 @@
versions.  */
 fix = {
 hackname  = glibc_strncpy;
-files = bits/string2.h;
+files = bits/string2.h, '*/bits/string2.h';
 bypass= __builtin_strncpy;
 c_fix = format;
 c_fix_arg = #  define strncpy(dest, src, n) __builtin_strncpy (dest, src, 
n);
@@ -2411,7 +2411,7 @@
  */
 fix = {
 hackname  = huge_val_hex;
-files = bits/huge_val.h;
+files = bits/huge_val.h, '*/bits/huge_val.h';
 select= ^#[ \t]*define[ \t]*HUGE_VAL[ \t].*0x1\\.0p.*;
 bypass= __builtin_huge_val;

@@ -2426,7 +2426,7 @@
  */
 fix = {
 hackname  = huge_valf_hex;
-files = bits/huge_val.h;
+files = bits/huge_val.h, '*/bits/huge_val.h';
 select= ^#[ \t]*define[ \t]*HUGE_VALF[ \t].*0x1\\.0p.*;
 bypass= __builtin_huge_valf;

@@ -2441,7 +2441,7 @@
  */
 fix = {
 hackname  = huge_vall_hex;
-files = bits/huge_val.h;
+files = bits/huge_val.h, '*/bits/huge_val.h';
 select= ^#[ \t]*define[ \t]*HUGE_VALL[ \t].*0x1\\.0p.*;
 bypass= __builtin_huge_vall;

@@ -4226,7 +4226,7 @@
 fix = {
 hackname  = thread_keyword;
 files = pthread.h;
-files = bits/sigthread.h;
+files = bits/sigthread.h, '*/bits/sigthread.h';
 select= ([* ])__thread([,)]);
 c_fix = format;
 c_fix_arg = %1__thr%2;
@@ -4767,7 +4767,7 @@
 fix = {
 hackname  = feraiseexcept_nosse_invalid;
 mach  = 'i[34567]86-*-linux*', 'x86*-linux*', 'amd64-*-linux*';
-files = bits/fenv.h;
+files = bits/fenv.h, '*/bits/fenv.h';
 select= ^([\t ]*)__asm__ __volatile__ \\(\divss %0, %0 *\ : 
: \x\ \\(__f\\)\\);$;
 bypass= \fdiv .*; fwait\;
@@ -4794,7 +4794,7 @@
 fix = {
 hackname  = feraiseexcept_nosse_divbyzero;
 mach  = 'i[34567]86-*-linux*', 'x86*-linux*', 'amd64-*-linux*';
-files = bits/fenv.h;
+files = bits/fenv.h, '*/bits/fenv.h';
 select= ^([\t ]*)__asm__ __volatile__ \\(\divss %1, %0 *\ : 
: \x\ \\(__f\\), \x\ \\(__g\\)\\);$;
 bypass= \fdivp .*; fwait\;


Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-07 Thread Tejas Belagod

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the
same in
+ either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+   REG_P (XEXP (src, 0))  REG_P (dst)
+   REGNO (XEXP (src, 0)) == REGNO (dst))
+{
+  rtx par = XEXP (src, 1);
+  int i;
+
+  for (i = 0; i  XVECLEN (par, 0); i++)
+   {
+ rtx tem = XVECEXP (par, 0, i);
+ if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+   return 0;
+   }
+  return 1;
+}
+

I think for big endian it needs to be:

INTVAL (tem) != i + base

where base is something like:

int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0);

E.g. a big-endian V4HI looks like:

msb  lsb


and shortening it to say V2HI only gives the low 32 bits:

msb  lsb


But, in this case we want

 msb  lsb
 

It depends on whether the result occupies a full register or not.
I was thinking of the case where it didn't, but I realise now you were
thinking of the case where it did.  And yeah, my suggestion doesn't
cope with that...

I was under the impression that the const vector parallel for vec_select 
represents the element indexes of the array in memory order.


Therefore, in bigendian,

  msb lsb
     
element  a[0] a[1] a[2] a[3]

and in littleendian

  msb lsb
     
element  a[3] a[2] a[1] a[0]

Right.  But if an N-bit value is stored in a register, it's assumed to
occupy the lsb of the register and the N-1 bits above that.  The other
bits in the register are don't-care.

E.g., leaving vectors to one side, if you have:

   (set (reg:HI N) (truncate:SI (reg:SI N)))

on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

   msb  lsb
   01234567
   
   4567

rather than:

   msb  lsb
   01234567
   
   0123

for both endiannesses.  The same principle applies to vectors.
The lsb of the register is always assumed to be significant.

So maybe the original patch was correct for partial-register and
full-register results on little-endian, but only for full-register
results on big-endian.

Ah, ok! I think I get it. By eliminating
set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

using the check INTVAL (tem) != i, I'm essentially making subsequent
operations
use (reg:V2DI n) in DI mode which is a partial register result and this
gives me
the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial 
register mode, I have to make sure the correct elements coincide with

the lsb in
big-endian...

Thanks for your input, I'll apply the offset correction for big-endian you 
suggested. I'll respin the patch.

Thanks.  Just for avoidance of doubt, the result might be a full or
partial register, depending on the mode and target.  I was trying to
correct myself by agreeing that your original was right and mine was
wrong for big-endian if the result is a full register.

What I had in mind when I implemented this was a partial-reg result, but 
obviously it was wrong.


Sorry, I'm going to take a step back - I'm trying to figure out what a full 
register result would look like. Looking at the pattern,


   set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

the result is always a mode smaller than the src if the vec_select selects a 
subset of lanes. Plus if the src and dst are the same reg, because we're 
re-writing the src reg, wouldn't it always end up being a partial-reg?

In this case, wouldn't (reg:DI n) always represent

  msb  lsb
  


The problem is that one reg rtx can span several hard registers.
E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
but it might instead represent two 32-bit registers (nos. 32 and 33).
Obviously the latter's not very likely for vectors this small,
but more likely for larger ones (including on NEON IIRC).

So if we had 2 32-bit registers being treated as a V4HI, it would be:

   --3233--
   msb  lsb
   
   
   
   msb  lsb
   --32--

for big endian and:

   --3332--
   msb  lsb
   
   
   
   msb  lsb
   --32--

for little endian.


Ah, ok, that makes things clearer. Thanks for that.

I can't find any helper function that figures out if we're writing partial or 
full result regs. Would something like


REGNO (src) == REGNO (dst) 
HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1

be a sane check for partial result regs?


Re: [gomp4] rewrite simd clone argument adjustment to avoid regimplification

2013-11-07 Thread Martin Jambor
Hi,

On Wed, Nov 06, 2013 at 05:37:03PM -0700, Aldy Hernandez wrote:
 On 11/06/13 15:48, Jakub Jelinek wrote:
 On Wed, Nov 06, 2013 at 03:24:40PM -0700, Aldy Hernandez wrote:
 I have checked the following patch with the attached testcases that
 were previously ICEing, and with a handful of handcrafted tests that
 I checked manually (array references on lhs and rhs, vectors of
 pointers, etc).
 
 I'd like Martin to eyeball the ipa changes eventually.
 
 Indeed!  Likewise for all my previous changes to ipa-prop.[ch] and
 tree-sra.c.

Sorry for the delay.  I'd just like to re-iterate one comment from my
previous email which is that I do not think tree-sra.c should export
any function to the outside world apart from the entry points of the
passes (yes, there is already build_ref_for_offset which I admit is
entirely my creation but that should be moved elswhere too).

So please put sra_ipa_get_adjustment_candidate to ipa-prop.c.  I see
that it requires get_ssa_base_param to be moved there as well but
since IPA-SRA uses it in different places, it would need exporting
too, which would be weird because it does not really do anything with
parameters.  Since the function is so trivial, I would even suggest
introducing another private copy to ipa-prop.c (and leaving the
original without the new parameter).  Alternatively, you can move the
function to tree.c but for that it looks too specialized.

Thanks,

Martin

 
 +  if (!SSA_VAR_P (*tp))
 +{
 +  /* Make sure we treat subtrees as a RHS.  This makes sure that
 +when examining the `*foo' in *foo=x, the `foo' get treated as
 +a use properly.  */
 
 I think it wouldn't hurt to e.g. do
  if (TYPE_P (*tp))
*walk_subtrees = 0;
 here (and drop ATTRIBUTE_UNUSED above.
 
 Done.
 
 
 +  wi-is_lhs = false;
 +  wi-val_only = true;
 +  return NULL_TREE;
 +}
 +  struct modify_stmt_info *info = (struct modify_stmt_info *) wi-info;
 +  struct ipa_parm_adjustment *cand
 += sra_ipa_get_adjustment_candidate (tp, NULL, info-adjustments, true);
 +  if (!cand)
 +return NULL_TREE;
 +
 tree t = *tp;
 +  tree repl = make_ssa_name (create_tmp_var (TREE_TYPE (t), NULL), NULL);
 
 Just do
tree repl = make_ssa_name (TREE_TYPE (t), NULL);
 no need to create underlying vars since 4.8.
 
 Done.
 
 
 +  /* We have modified in place; update the SSA operands.  */
 +  info-modified = false;
 
 So you always set modified to false?  I was expecting you'd set
 it to true here and defer update_stmt and maybe_clean_eh_stmt etc.
 to after walk_gimple_op (so, do it only when all the changes on the stmt
 have been performed).  Plus, when you modify something, there is no need
 to walk subtrees, so you can just do *walk_subtrees = 0; too.
 
 Hmmm, good point.  I've moved update_stmt and company to the caller,
 and modified the caller to call regimplify_operands only for
 GIMPLE_RETURN which is the special case.
 
 Let me know if this is still ok so I can commit.
 
 Thanks.
 Aldy

 gcc/ChangeLog.gomp
   * ipa-prop.h (sra_ipa_get_adjustment_candidate): Protoize.
   * omp-low.c (struct modify_stmt_info): New.
   (ipa_simd_modify_function_body_ops_1): Remove.
   (ipa_simd_modify_stmt_ops): New.
   (ipa_simd_modify_function_body_ops): Remove.
   (ipa_simd_modify_function_body): Set new callback info.
   Remove special casing.  Handle all operators with walk_gimple_op.
   * tree-sra.c (get_ssa_base_param): Add new argument.  Use it.
   (disqualify_base_of_expr): Pass new argument to
   get_ssa_base_param.
   (sra_ipa_modify_expr): Abstract candidate search into...
   (sra_ipa_get_adjustment_candidate): ...here.
 


Re: wide-int testing results

2013-11-07 Thread Kenneth Zadeck

I doubt that this list is comprehensive.

When i many of these things a long time ago, I was not that interested 
in bit for bit compatibly as long as was never introducing any 
problems.in some cases it would have been hard to replicate some of 
the end cases with the wide-int code.


On 11/07/2013 12:51 PM, Richard Sandiford wrote:

I wanted to make sure that each backend still builds with wide-int and that
there weren't any unexplained changes in assembly output.  I arbitrarily
picked one target for each CPU:

 aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
 avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf
 fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf
 lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf
 microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf
 msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf
 powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
 score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf
 tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf
 x86_64-darwin

and built gcc and g++ from both the merge point and the wide-int branch.
The branch included the patches I've posted and a local change
to put CONST_WIDE_INT at the end of rtl.def (just for comparison,
as mentioned before).  I then compiled gcc.c-torture, gcc.dg and g++.dg
at -O2 and compared the asm output.  This obviously isn't a very strong
test, since none of the libraries are built, and since some of the test
cases rely on system headers, but it should at least be better than nothing.

pdp11 and picochip-elf don't build on mainline due to:

gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was not 
declared in this scope
#   define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor

Both the merge point and the branch built the other targets and there were
no new warnings on the branch.  The only asm differences were in:

* gcc.dg/fixed-point/const-1.s on targets that support it

   I think this is a known improvement.  The test case checks a series of
   conversions to make sure that the out-of-range ones produce a warning.
   E.g.:

 short _Fract sfF = 1.0;  /* { dg-warning overflow } */

   The asm differences are in the values produced for these out-of-range
   casees.  The old code used real_to_integer2, which saturates the result
   to double_int precision.  It then uses the fractional part of that result
   to initialise sfF.  The result therefore depends on the host (at least
   in the general case).

   The new code instead saturates to the precision of the result, just like
   we already do for:

 signed short s = 32768.0f;

   So this seems like a good change.

* gcc.dg/vect/pr51799.c on m32c-elf and
* gcc.c-torture/execute/divconst-2.c on xstormy16-elf

   These are cases where wide-int converts a narrower-than-HWI
   multiplication by 1  (GET_MODE_PRECISION - 1) into a shift but
   mainline doesn't.  The mainline code looks like:

   /* Convert multiply by constant power of two into shift unless
 we are still generating RTL.  This test is a kludge.  */
   if (CONST_INT_P (trueop1)
   (val = exact_log2 (UINTVAL (trueop1))) = 0
  /* If the mode is larger than the host word size, and the
 uppermost bit is set, then this isn't a power of two due
 to implicit sign extension.  */
   (width = HOST_BITS_PER_WIDE_INT
  || val != HOST_BITS_PER_WIDE_INT - 1))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));

   and this exact_log2 doesn't cope properly with the sign-extended
   0xff8000... CONST_INT.  The wide-int code is:

   /* Convert multiply by constant power of two into shift.  */
   if (CONST_SCALAR_INT_P (trueop1))
{
  val = wi::exact_log2 (std::make_pair (trueop1, mode));
  if (val = 0  val  GET_MODE_BITSIZE (mode))
return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
}

   So this too seems like a good thing.

Thanks,
Richard




Re: Implement C11 _Atomic

2013-11-07 Thread Uros Bizjak
On Thu, Nov 7, 2013 at 5:45 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Nov 05, 2013 at 11:21:56PM +, Joseph S. Myers wrote:
 2013-11-05  Andrew MacLeod  amacl...@redhat.com
   Joseph Myers  jos...@codesourcery.com

   * tree-core.h (enum cv_qualifier): Add TYPE_QUAL_ATOMIC.
   (enum tree_index): Add TI_ATOMICQI_TYPE, TI_ATOMICHI_TYPE,
   TI_ATOMICSI_TYPE, TI_ATOMICDI_TYPE and TI_ATOMICTI_TYPE.
   (struct tree_base): Add atomic_flag field.
   * tree.h (TYPE_ATOMIC): New accessor macro.
   (TYPE_QUALS, TYPE_QUALS_NO_ADDR_SPACE): Add TYPE_QUAL_ATOMIC.
   (TYPE_QUALS_NO_ADDR_SPACE_NO_ATOMIC): New macro.
   (atomicQI_type_node, atomicHI_type_node, atomicSI_type_node)
   (atomicDI_type_node, atomicTI_type_node): New macros for type
   nodes.
   * tree.c (set_type_quals): Set TYPE_ATOMIC.
   (find_atomic_core_type): New function.
   (build_qualified_type): Adjust alignment for qualified types.
   (build_atomic_base): New function
   (build_common_tree_nodes): Build atomicQI_type_node,
   atomicHI_type_node, atomicSI_type_node, atomicDI_type_node and
   atomicTI_type_node.
   * print-tree.c (print_node): Print atomic qualifier.
   * tree-pretty-print.c (dump_generic_node): Print atomic type
   attribute.
   * target.def (atomic_assign_expand_fenv): New hook.
   * doc/tm.texi.in (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New @hook.
   * doc/tm.texi: Regenerate.
   * targhooks.c (default_atomic_assign_expand_fenv): New function.
   * targhooks.h (default_atomic_assign_expand_fenv): Declare.
   * sync-builtins.def (__atomic_feraiseexcept): New built-in
   function.

 2013-11-05  Joseph Myers  jos...@codesourcery.com

   * lib/target-supports.exp
   (check_effective_target_fenv_exceptions): New function.
   * lib/atomic-dg.exp, gcc.dg/atomic/atomic.exp: New files.
   * gcc.dg/atomic/c11-atomic-exec-1.c,
   gcc.dg/atomic/c11-atomic-exec-2.c,
   gcc.dg/atomic/c11-atomic-exec-3.c,
   gcc.dg/atomic/c11-atomic-exec-4.c,
   gcc.dg/atomic/c11-atomic-exec-5.c, gcc.dg/c11-atomic-1.c,
   gcc.dg/c11-atomic-2.c, gcc.dg/c11-atomic-3.c,
   gcc.dg/c90-atomic-1.c, gcc.dg/c99-atomic-1.c: New tests.

 libatomic:
 2013-11-05  Joseph Myers  jos...@codesourcery.com

   * fenv.c: New file.
   * libatomic.map (LIBATOMIC_1.1): New symbol version.  Include
   __atomic_feraiseexcept.
   * configure.ac (libtool_VERSION): Change to 2:0:1.
   (fenv.h): Test for header.
   * Makefile.am (libatomic_la_SOURCES): Add fenv.c.
   * Makefile.in, auto-config.h.in, configure: Regenerate.

 The middle-end, libatomic and testsuite changes are ok.

Please note that following code form fenv.c won't generate overflow
exception on x87:

  if (excepts  FP_EX_OVERFLOW)
{
  volatile float max = __FLT_MAX__;
  r = max * max;
}

Uros.


Re: [patch] Create gimple-expr..[ch] ... was Re: RFC: gimple.[ch] break apart

2013-11-07 Thread Diego Novillo
On Thu, Nov 7, 2013 at 8:23 AM, Andrew MacLeod amacl...@redhat.com wrote:
 On 11/07/2013 05:36 AM, Basile Starynkevitch wrote:

 On Tue, Nov 05, 2013 at 11:26:46AM -0500, Andrew MacLeod wrote:

 I decided to name the new file gimple-expr.[ch] instead of
 gimple-decl   This will eventually split into gimple-type.[ch],
 gimple-decl.[ch], and gimple-expr.[ch].


 Since we are adding *new* C++ files, can't we please name them *.cc
 for the implementation part, so at least create gimple-expr.h and
 gimple-expr.cc but not gimple-expr.c, please!

 Assuming we put this into stage 1 next year,  I would imagine we'd rename a
 number of things, including .cc, drop the tree-* from the tree-ssa-blah.[c]h
 files, etc etc.  There have been a few things people have suggested
 renaming... I think if we do renaming, they ought to be all at one time to
 minimize the pain.

 At the moment, the new gimple-* files I'm creating are still C, so they are
 .c files...

Agreed. When we start shuffling files around seems a better time.
Doing both operations at once will be easier than going through two
phases of naming changes.


Diego.


Re: [wide-int] Remove SHIFT_COUNT_TRUNCATED uses at tree/gimple level

2013-11-07 Thread Richard Sandiford
Kenneth Zadeck zad...@naturalbridge.com writes:
 I very strongly disagree with this.  The standard needs to be high than
 does it pass the test suite.

 What we are introducing is a case where the program will behave one way
 with optimization and another way without it.  While, this is always
 true for timing dependent code, it is pretty rare for math.  We should
 always tread carefully when doing that, and the excuse that it is
 cleaner does not, in my mind, fit the bill.

If it makes you feel any better (probably not), this was already true
for !SHIFT_COUNT_TRUNCATED targets.  The tree level would then not do
any truncation, whereas as you said before, all real targets do in practice.
And that affected many of the main targets, including x86_64, powerpc,
z/Series, ARM, etc.

(And the tree level has to do _something_.  unsigned int i = 1  32;
must compile with default options, even if the behaviour's undefined.
Only rtl has the luxury of being able to punt.)

Although you could go out of your way to try to make the undefined
behaviour match what the target instructions would do, that'd be a
lot of work, especially for cases that are implemented using library
calls.  And there's a lot to be said for making the behaviour
consistent across targets, which is what removing the truncation
from the tree level does.

Things are different at the rtl level because there a backend can
legitimately exploit the behaviour of the shift instructions when
implementing optabs.  And optabs.c itself does this when creating
doubleword shifts.  So at the rtl level we can only optimise out-of-range
shifts if we're sure what the target would do.

Thanks,
Richard


RE: Initial submission of OpenACC support integrated into OpenMP's lowering and expansion passes

2013-11-07 Thread Evgeny Gavrin
Hi, Thomas!

This is really great!
I've looked at your changes and in most of front-end parts it looks reasonable 
to me. As you know, we're like-minded with you about how OpenACC's front-ends 
should look like. So, I think it's good that you have working flow for your 
implementation. Now we can start to merge front-ends from openacc-1_0-branch to 
GOMP, while keeping all understandings of ACC in sync and working.

Jakub, don't you mind if I prepare a few front-end patches likewise for review 
to extend Thomas' vision with the things that was already done by my colleagues 
on the ACC branch? I hope it'll much speed up the development.

 the last patch adds GOACC_parallel, which so far simply branches to 
 GOMP_target.  There's more to come.
I've reviewed ACC-related code from [openacc-1_0-branch] and from the patches, 
I noticed there are some mess around marking OpenACC specific code. There are 
ACC, GACC, OACC, GOACC abbreviations used across code, which one is the best? I 
prefer ACC, this is not critical, I'm just curious how to resolve this.

 Due to the conceptual similarity compared to OpenMP, and that (later) 
 it is reasonable to expect to embed OpenACC directives into OpenMP 
 ones, the approach I've chosen is to directly embed OpenACC handling 
 into the existing OpenMP infrastructure/passes.
Regarding front-ends, not only from view of OMP/ACC, but front-ends itself, it 
makes sense to move some parsing routines out of gomp/acc context. This will be 
useful to implement not only ACC, but also HMPP, if someone is interested, or 
other similar technology. And of cause, these functions should be used behind 
the existing facilities - not to break current implementation and other 
dependencies. So it won't not introduce any difficulties for back porting, 
while generalizing front-ends a bit.
For now, I prefer to extend the approved way of developing, but keep it in 
mind, please.

 This patch series doesn't contain any substantial rumtime library work 
 yet;
Can you share some technical details/ideas about further implementation, it's 
not going to be trivial. On the [openacc-1_0-branch] we resolved it with 
OpenCL-driven runtime. Also, OpenACC 2.0 standard declares some vendor-specific 
routines, this should be noted too.

 This is in contrast to Samsung's work, who are implementing OpenACC
 separately from the existing OpenMP support.
I'm not fully agree that this in contrast to things are taking place in 
[openacc-1_0-branch]. The main reason for keeping middle-end and back-end 
solutions far from GOMP is the understanding of OpenACC semantics and unclear 
understanding of what should be done to generalize support of accelerators in 
GCC. OpenACC and OpenOMP has some semantically differences, but such things 
matters only for middle-end and further code generation. The main one, is the 
memory concept. The 

 Yet, I hope we'll be able to share/re-use at least front end and some
middle end code.
I absolutely agree, it's needed to merge our efforts regarding front-ends.
Many things from ACC in [openacc-1_0-branch] is already done, so it doesn't 
necessary to re-implement same things. Maybe we can polish front-ends together 
and then commit changes to GOMP?

 We directly strive for OpenACC 2.0 support, skipping OpenACC 1.  We're 
 focussing on the C front end implementation first, following on with 
 C++ and Fortran later on.
How do you think, does it make sense to adopt current Fortran implementation 
from the OpenACC branch? This can be done quite soon. Also, I may help to 
extend C and C++ front-end the same way it's done now, by adding other 
implemented directives, too.

Regarding proposed patches:
[8/9] c_parser_omp_all_clauses and cp_parser_omp_all_clauses
- c_parser_error (parser, expected %#pragma omp% clause);
+ c_parser_error (parser, expected clause);

- cp_parser_error (parser, expected %#pragma omp% clause);
+ cp_parser_error (parser, expected clause);
Does it really needed to remove %#pragma omp% in these cases? We handle both, 
I think.

[9/9]
* gimple.h (gimple_build_oacc_*): New declaration.
gimple_oacc_parallel_* and other functions like that. Let's move these 
functions out of gimple.* to gimple-oacc.*! There are going to be much more of 
the stuff like this, maybe it'll make sense to keep it externally? May be not 
only gimple manipulation functions but also pretty-print Also, have a look on 
openacc_1-0_branch please, for gimple-oacc.*, is OK if I'll add some of this?

Handling clauses.
+  GIMPLE_CHECK (gs, GIMPLE_OACC_PARALLEL);  
+ gs-gimple_omp_parallel.clauses = clauses;
I think, we can freely add some kind of flag to gimple_omp_parallel to indicate 
that we're are working with ACC.
Or even add new gimple_acc_* statements.

On gimplification, lowering stuff and further, I think it's okay if targeting 
GOMP_target, but on my view, this is going to prevent us from emiting OpenCL or 
do some extended analysis of the code for error analysis to 

Re: Clean up atomic tests

2013-11-07 Thread Mike Stump
On Nov 7, 2013, at 8:43 AM, Joseph S. Myers jos...@codesourcery.com wrote:
 This patch cleans up various issues with the tests of atomics built-in
 functions and libatomic functions, in preparation for adapting those
 tests to add test coverage of stdatomic.h macros.  The tests were
 missing a return type for main (C11 doesn't allow implicit int return
 types).

Ok for those changes.

 Some tests were incrementing a variable in one part of an
 expression and using its value in another part without a sequence
 point in between.

Ok for those changes.

 And the libatomic tests shouldn't have been
 restricted to targets with hardware sync_* support, or adding
 command-line options to make such support available,

For this I'd like a domain expert to weigh in.  If that is you, then you can 
check it in as obvious if you'd like.  If you want a review, you can wait for 
them to speak up.

Re: [wide-int, committed] Fix nds32 build

2013-11-07 Thread Mike Stump
On Nov 7, 2013, at 9:47 AM, Richard Sandiford rdsandif...@googlemail.com 
wrote:
 Get the new port building on wide-int.  Seemed pretty obvious so I went
 ahead and installed it.

Looks good.

Re: wide-int testing results

2013-11-07 Thread Richard Sandiford
Kenneth Zadeck zad...@naturalbridge.com writes:
 I doubt that this list is comprehensive.

Hmm?  It was supposed to be one target for each CPU, so if you think
I've missed one, please point it out.  It certainly wasn't supposed
to be every target triple that gcc supports.

Even one target per CPU was useful, since quite a few wouldn't compile.

 When i many of these things a long time ago, I was not that interested 
 in bit for bit compatibly as long as was never introducing any 
 problems.in some cases it would have been hard to replicate some of 
 the end cases with the wide-int code.

Not sure what you mean.  The point was that if wide-int made things better,
we should just say good and move on.  Which is why I'm not trying to
get bit-for-bit compatibility in the testcases I pointed out.  But for
a change of this size I think we should at least know why any differences
occur.

Thanks,
Richard

 On 11/07/2013 12:51 PM, Richard Sandiford wrote:
 I wanted to make sure that each backend still builds with wide-int and that
 there weren't any unexplained changes in assembly output.  I arbitrarily
 picked one target for each CPU:

  aarch64-linux-gnueabi alpha-linux-gnu arc-elf arm-linux-gnueabi
  avr-rtems bfin-elf c6x-elf cr16-elf cris-elf epiphany-elf
  fr30-elf frv-linux-gnu h8300-elf ia64-linux-gnu iq2000-elf
  lm32-elf m32c-elf m32r-elf m68k-linux-gnu mcore-elf mep-elf
  microblaze-elf mips-linux-gnu mmix mn10300-elf moxie-elf
  msp430-elf nds32le-elf hppa64-hp-hpux11.23 pdp11 picochip-elf
  powerpc-linux-gnu powerpc-eabispe rl78-elf rx-elf s390-linux-gnu
  score-elf sh-linux-gnu sparc-linux-gnu spu-elf tilegx-elf
  tilepro-elf xstormy16-elf v850-elf vax-netbsdelf xtensa-elf
  x86_64-darwin

 and built gcc and g++ from both the merge point and the wide-int branch.
 The branch included the patches I've posted and a local change
 to put CONST_WIDE_INT at the end of rtl.def (just for comparison,
 as mentioned before).  I then compiled gcc.c-torture, gcc.dg and g++.dg
 at -O2 and compared the asm output.  This obviously isn't a very strong
 test, since none of the libraries are built, and since some of the test
 cases rely on system headers, but it should at least be better than nothing.

 pdp11 and picochip-elf don't build on mainline due to:

 gcc/target-def.h:69:34: error: ‘default_stabs_asm_out_destructor’ was 
 not declared in this scope
 #   define TARGET_ASM_DESTRUCTOR default_stabs_asm_out_destructor

 Both the merge point and the branch built the other targets and there were
 no new warnings on the branch.  The only asm differences were in:

 * gcc.dg/fixed-point/const-1.s on targets that support it

I think this is a known improvement.  The test case checks a series of
conversions to make sure that the out-of-range ones produce a warning.
E.g.:

  short _Fract sfF = 1.0;  /* { dg-warning overflow } */

The asm differences are in the values produced for these out-of-range
casees.  The old code used real_to_integer2, which saturates the result
to double_int precision.  It then uses the fractional part of that result
to initialise sfF.  The result therefore depends on the host (at least
in the general case).

The new code instead saturates to the precision of the result, just like
we already do for:

  signed short s = 32768.0f;

So this seems like a good change.

 * gcc.dg/vect/pr51799.c on m32c-elf and
 * gcc.c-torture/execute/divconst-2.c on xstormy16-elf

These are cases where wide-int converts a narrower-than-HWI
multiplication by 1  (GET_MODE_PRECISION - 1) into a shift but
mainline doesn't.  The mainline code looks like:

/* Convert multiply by constant power of two into shift unless
   we are still generating RTL.  This test is a kludge.  */
if (CONST_INT_P (trueop1)
 (val = exact_log2 (UINTVAL (trueop1))) = 0
/* If the mode is larger than the host word size, and the
   uppermost bit is set, then this isn't a power of two due
   to implicit sign extension.  */
 (width = HOST_BITS_PER_WIDE_INT
|| val != HOST_BITS_PER_WIDE_INT - 1))
  return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));

and this exact_log2 doesn't cope properly with the sign-extended
0xff8000... CONST_INT.  The wide-int code is:

/* Convert multiply by constant power of two into shift.  */
if (CONST_SCALAR_INT_P (trueop1))
  {
val = wi::exact_log2 (std::make_pair (trueop1, mode));
if (val = 0  val  GET_MODE_BITSIZE (mode))
  return simplify_gen_binary (ASHIFT, mode, op0, GEN_INT (val));
  }

So this too seems like a good thing.

 Thanks,
 Richard


RE: [PATCH] Factorize the two read_line functions present in gcov.c and input.c

2013-11-07 Thread Bernd Edlinger
 On 11/07/13 09:32, Dodji Seketeli wrote:

 From the above, what I can say is that input.o was already linked with
 gcov. But I think it's minimal enough to only drag libcpp and the
 diagnostic subsystem.

 I disagree. While input.o was available to gcov, I don't think it was being
 pulled into the executable (if it was, I'd like to understand why). I think
 dragging in libcpp  diagnostics is a major bloat -- have you measured the
 before and after gcov text sizes?


I agree that this should be avoided. However both versions of read_line
need to be fixed eventually.

Just for the records, this is not only a theoretical issue for gcov:
The OOM error can also be reproduced in gcov.

The attached test.c has some dozends of comment lines:
//xx\0EOL

That are lines with text before and after the \0.
If compiled without -Wall, it will crash in gcov:


gcc -fprofile-arcs -ftest-coverage test.c
./a.out
gcov test
File 'test.c'
Lines executed:100.00% of 1
Creating 'test.c.gcov'

gcov: out of memory allocating 1677721600 bytes after a total of 135168 bytes


Bernd.//00EOL
//01EOL
//02EOL
//03EOL
//04EOL
//05EOL
//06EOL
//07EOL
//08EOL
//09EOL
//10EOL
//11EOL
//12EOL
//13EOL
//14EOL
//15EOL
//16EOL
//17EOL
//18EOL
//19EOL
//20EOL
//21EOL
//22EOL
//23EOL
//24EOL
//25EOL
//26EOL
//27EOL
//28EOL
//29EOL
//30EOL
//31EOL
//32EOL
//33EOL
//34EOL
//35EOL
//36EOL
//37EOL
//38EOL
//39EOL
//40EOL
//41EOL
//42EOL
int main() { }
// EOF


Re: [PATCH] make has_gate and has_execute useless

2013-11-07 Thread Trevor Saunders
On Thu, Nov 07, 2013 at 10:30:16AM -0700, Jeff Law wrote:
 On 11/07/13 09:00, tsaund...@mozilla.com wrote:
 From: Trevor Saunders tsaund...@mozilla.com
 
 Hi,
 
   This is the result of seeing what it would take to get rid of the has_gate 
  and
 has_execute flags on pass_data.  It turns out not much, but I wanted
 confirmation this part is ok before I go deal with all the places that
 initialize the fields.
 Most definitely, the has_gate/has_execute stuff is definitely a wart.
 
 
 I bootstrapped this part on x86_64-unknown-linux-gnu and saw no new test 
 suite
 regressions (ignoring the silk stuff because the full paths in its test names
 break my test script for now) Any reason this patch with the actual removal 
 of the flags wouldn't be ok?
 I don't see anything glaringly wrong.

great, I'll work on ripping them out then :)

 Any reason why the register_dump_files bits were mucking with
 properties.  That just seems wrong.

 no, it looks like its pretty old, the most recent change I can see is
 r110742 in 2006, but I agree it seems crazy, but atleast its not
 actually doing anything now.

 +  todo_after = pass-execute ();
 +  if (todo_after != TODO_absolutely_nothing)
 +do_per_function (clear_last_verified, NULL);
 +  else
 +todo_after = ~TODO_absolutely_nothing;
 
 Isn't the last assignment there just
 todo_after = 0;

yes

Trev

 
 ?
 
 Jeff


Re: fixincludes patch RFA: Fix fenv.h on Ubuntu Precise

2013-11-07 Thread Ian Lance Taylor
On Thu, Nov 7, 2013 at 8:48 AM, Bruce Korb bk...@gnu.org wrote:

 This time, I'm on my dev box and looked at the code.
 You remembered correctly that the first file name in the list
 of file names needs to not have wild card characters so that
 the testing scheme can create a file by that name.  A directory
 named * is possible, but inconvenient.

 Anyway, it also matches prior art in:

 /*
  * glibc_c99_inline_3
  */
 fix = {
 hackname  = glibc_c99_inline_3;
 files = bits/string2.h, '*/bits/string2.h';


 but is inconsistent with:

 /* Some versions of glibc have a version of bits/string2.h that
produces value computed is not used warnings from strncpy; fix
this definition by using __builtin_strncpy instead as in newer
versions.  */
 fix = {
 hackname  = glibc_strncpy;
 files = bits/string2.h;


 Hmmm.  Does that fix need fixing, too?

I didn't worry about that and the other cases because they are fixing
problems that existed in past glibc versions before Ubuntu changed
their directory layouts.  But of course there is nothing wrong with
fixing them too.

Your proposed patch looks fine to me and I think you should go ahead
and commit.  Or I can if you prefer for some reason.

Ian


Re: Implement C11 _Atomic

2013-11-07 Thread Joseph S. Myers
On Thu, 7 Nov 2013, Uros Bizjak wrote:

 Please note that following code form fenv.c won't generate overflow
 exception on x87:
 
   if (excepts  FP_EX_OVERFLOW)
 {
   volatile float max = __FLT_MAX__;
   r = max * max;
 }

r being volatile is intended to ensure that the result does get stored 
back to memory, and so in particular that a result computed with excess 
precision gets converted back to float and the exception is raised.

-- 
Joseph S. Myers
jos...@codesourcery.com


  1   2   >