[PATCH,i386] Enable FMA4 for AMD bdver3

2013-10-16 Thread Gopalasubramanian, Ganesh
Hi 

The below patch enables FMA4 for AMD bdver3 architectures.

make -k check passes.

Is it OK for upstream?

Regards
Ganesh


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fb5b267..cbb5311 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-10-16 Ganesh Gopalasubramanian  ganesh.gopalasubraman...@amd.com
+
+   * config/i386/i386.c (ix86_option_override_internal): Enable FMA4
+   for AMD bdver3.
+
 2013-10-16  Hans-Peter Nilsson  h...@axis.com

* config/cris/t-elfmulti (MULTILIB_OPTIONS, MULTILIB_DIRNAMES)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b5796db..c24ce36 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3104,7 +3104,7 @@ ix86_option_override_internal (bool main_args_p,
   {bdver3, PROCESSOR_BDVER3, CPU_BDVER3,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-   | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
+   | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
| PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
| PTA_FMA | PTA_PRFCHW | PTA_FXSR | PTA_XSAVE
| PTA_XSAVEOPT | PTA_FSGSBASE},



Re: [PATCH,i386] Enable FMA4 for AMD bdver3

2013-10-16 Thread Uros Bizjak
On Wed, Oct 16, 2013 at 8:28 AM, Gopalasubramanian, Ganesh
ganesh.gopalasubraman...@amd.com wrote:

 The below patch enables FMA4 for AMD bdver3 architectures.

 make -k check passes.

 +2013-10-16 Ganesh Gopalasubramanian  ganesh.gopalasubraman...@amd.com
 +
 +   * config/i386/i386.c (ix86_option_override_internal): Enable FMA4
 +   for AMD bdver3.
 +

OK for mainline and 4.8 branch (for 4.8.2 if approved by Jakub,
otherwise please wait for branch to open).

Thanks,
Uros.


Re: [PATCH,i386] Enable FMA4 for AMD bdver3

2013-10-16 Thread Jakub Jelinek
On Wed, Oct 16, 2013 at 09:00:58AM +0200, Uros Bizjak wrote:
 On Wed, Oct 16, 2013 at 8:28 AM, Gopalasubramanian, Ganesh
 ganesh.gopalasubraman...@amd.com wrote:
 
  The below patch enables FMA4 for AMD bdver3 architectures.
 
  make -k check passes.
 
  +2013-10-16 Ganesh Gopalasubramanian  ganesh.gopalasubraman...@amd.com
  +
  +   * config/i386/i386.c (ix86_option_override_internal): Enable FMA4
  +   for AMD bdver3.
  +
 
 OK for mainline and 4.8 branch (for 4.8.2 if approved by Jakub,
 otherwise please wait for branch to open).

4.8.2 is already rolling, so too late for that.

Jakub


Re: libgo patch committed: Don't clobber context when catching signal

2013-10-16 Thread Uros Bizjak
Hello!

 This patch to libgo fixes a dumb bug in which catching a signal via the
 os/signal package could clobber the saved split-stack context, leading
 to disaster if the signal arrived at the wrong time.  Bootstrapped and
 ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.
 Will commit to 4.8 branch when it reopens.

Attached patch is re-enables bootstrap on targets without split stack
due to unused variable warning. The patch was tested on
alphaev68-linux-gnu [1].

[1] http://gcc.gnu.org/ml/gcc-testresults/2013-10/msg01200.html

Uros.

Index: runtime/go-signal.c
===
--- runtime/go-signal.c (revision 203610)
+++ runtime/go-signal.c (working copy)
@@ -399,7 +399,9 @@
 {
   G *gp;
   M *mp;
+#ifdef USING_SPLIT_STACK
   void *stack_context[10];
+#endif

   /* We are now running on the stack registered via sigaltstack.
  (Actually there is a small span of time between runtime_siginit


[buildrobot] Minor fallout of: Fix PR 57756

2013-10-16 Thread Jan-Benedict Glaw
On Wed, 2013-10-16 16:24:12 +1030, Alan Modra amo...@gmail.com wrote:
 On Tue, Oct 15, 2013 at 02:45:23PM -0700, Sriraman Tallam wrote:
  I committed this patch after making the above change.
 
 /src/gcc-virgin/gcc/config/rs6000/rs6000.c: At global scope:
 /src/gcc-virgin/gcc/config/rs6000/rs6000.c:31122:29: error: invalid 
 conversion from ‘void (*)(cl_target_option*)’ to ‘void (*)(cl_target_option*, 
 gcc_options*)’ [-fpermissive]
 /src/gcc-virgin/gcc/config/rs6000/rs6000.c:31122:29: error: invalid 
 conversion from ‘void (*)(cl_target_option*)’ to ‘void (*)(gcc_options*, 
 cl_target_option*)’ [-fpermissive]

Not quite...

Also causes another build failure seen only on some targets (m68k,
mips, ppc, s390 and sparc), but not all, which build robot[1] found:

m68k-linux  
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19881
mips-linux  
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19892
mips64-linux
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19920
mips64el-linux  
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19879
mipsel-linux
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19924
powerpc-linux   
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19903
powerpc64le-linux 
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19917
powerpcle-linux 
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19884
s390-linux  
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19889
s390x-linux 
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19913
sparc-linux 
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19957
sparc64-linux   
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=19898

They're all similar, like this:

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
-Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. -I. 
-I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
-I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o sparc.o -MT sparc.o -MMD -MP 
-MF ./.deps/sparc.TPo /home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c
/home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c: In function ‘void 
sparc_option_override()’:
/home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c:1228: error: 
‘target_flags_explicit’ was not declared in this scope
/home/jbglaw/repos/gcc/gcc/config/sparc/sparc.c:1322: error: 
‘target_flags_explicit’ was not declared in this scope

It's always a missing declaration of `target_flags_explicit', so this
should be quite easy to fix.

MfG, JBG
[1] http://toolchain.lug-owl.de/buildbot/
http://toolchain.lug-owl.de/buildbot/timeline.php
-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:  Fortschritt bedeutet, einen Schritt so zu machen,
the second  :   daß man den nächsten auch noch machen kann.


signature.asc
Description: Digital signature


Re: [PATCH] Fix PR54702 and LTO bootstrap (for me)

2013-10-16 Thread Richard Biener
On Tue, 15 Oct 2013, Mike Stump wrote:

 On Oct 15, 2013, at 12:45 AM, Richard Biener rguent...@suse.de wrote:
  On Mon, 14 Oct 2013, Mike Stump wrote:
  On Sep 25, 2012, at 8:00 AM, Richard Guenther rguent...@suse.de wrote:
  2012-09-25  Richard Guenther  rguent...@suse.de
  
PR lto/54625
* lto-symtab.c (lto_symtab_merge_cgraph_nodes_1): Do not merge
cgraph nodes for builtins.
  
* gcc.dg/lto/pr54702_0.c: New testcase.
* gcc.dg/lto/pr54702_1.c: Likewise.
* gcc.dg/lto/pr54625-1_0.c: Likewise.
* gcc.dg/lto/pr54625-1_1.C: Likewise.
* gcc.dg/lto/pr54625-2_0.c: Likewise.
* gcc.dg/lto/pr54625-2_1.C: Likewise.
  
  
  xgcc: error: /home/mrs/wg/gcc/gcc/testsuite/gcc.dg/lto/pr54625-2_1.C: C++ 
  compiler not installed on this system
  
  FAIL: gcc.dg/lto/pr54625-2 c_lto_pr54625-2_1.o assemble, -O2 -flto 
  -fuse-linker-plugin
  
  shouldn't C++ testcases go into the C++ testsuite?
  
  Well ... this is a mixed C / C++ testcase (see the other file 
  participating, gcc.dg/lto/pr54625-2_0.c).  The C testsuite is the only
  one where the driver switches languages based on the file ending.
 
 That is an interesting inquiry, but, doesn't lead to the error.  Not sure 
 exactly why you bring it up:
 
 From a random C++ test suite run:
 
 spawn /home/mrs/work2/gcc-port/gcc/xgcc -B/home/mrs/work2/gcc-port/gcc/ 
 -fno-diagnostics-show-caret -O0 -flto -flto-partition=none 
 -fuse-linker-plugin -DNO_TRAMPOLINES -c -DNO_TRAMPOLINES -o 
 cp_lto_20100603-1_1.o 
 /home/mrs/work2/gcc/gcc/testsuite/g++.dg/lto/20100603-1_1.c
 
 here, we see that xgcc is used to build a C part of a mix language testcase 
 in the C++ test suite.
 
 If you follow that style, is there some reason why it won't just work as you 
 want?
 
  Hmm, it seems at least the fortran frontend knows how to mix fortran
  and C.
 
 And the fortran/c test cases are in the fortran test suite.
 
  So if you really did not build the C++ compiler then we need,
 
 Yup, no C++ compiler.
 
  at least for lto.exp, a way to say dg-require-c++-frontend in the main file
  (thats the _0 one).
 
 Why do that?
 
  Btw, can we end up with the C frontend not built?
 
 I'v never contemplated that idea?  but no, c is automatically added in order 
 I suppose, to build the language independent runtime (libgcc) with.

I suppose it's fine to move the testcase to the C++ lto testsuite
if you verify it will use the C compiler for the C parts.

Thanks,
Richard.


Re: [PATCH] tree_code_name wrapper

2013-10-16 Thread Richard Biener
On Tue, Oct 15, 2013 at 5:11 PM, Paulo Matos pma...@broadcom.com wrote:
 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
 Behalf Of Jakub Jelinek
 Sent: 15 October 2013 15:45
 To: Paulo Matos
 Cc: Richard Biener; Paolo Carlini; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] tree_code_name wrapper

 The ChangeLog is still wrong:


 OK, sorry.
 If this is ok now I will open a bottle of champagne.

 Thank you all for your time reviewing this. Ok to submit?

Ok.

Thanks.
Richard.

 2013-10-15  Paulo Matos  pma...@broadcom.com

 gcc/
 * tree-core.h (tree_code_name): Remove.
 * tree.h (get_tree_code_name): New prototype.
 * tree.c (tree_code_name): Make static.
 (get_tree_code_name): New function.
 (dump_tree_statistics, tree_check_failed, tree_not_check_failed,
 tree_class_check_failed, tree_range_check_failed,
 tree_not_class_check_failed, omp_clause_check_failed,
 tree_contains_struct_check_failed, tree_operand_check_failed): Use new
 wrapper get_tree_code_name instead of calling tree_code_name directly.
 * tree-vrp.c (dump_asserts_for): Likewise.
 * tree-dump.c (dequeue_and_dump): Likewise.
 * tree-pretty-print.c (do_niy, dump_generic_node): Likewise.
 * tree-pretty-print.h (pp_unsupported_tree): Likewise.
 * lto-streamer-out.c (lto_write_tree, DFS_write_tree): Likewise.
 * tree-ssa-dom.c (print_expr_hash_elt): Likewise.
 * gimple-pretty-print.c (dump_unary_rhs, dump_binary_rhs,
 dump_ternary_rhs, dump_gimple_assign, dump_gimple_cond,
 dump_gimple_omp_for): Likewise.
 * tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.
 * tree-ssa-pre.c (print_pre_expr): Likewise.
 * ipa-prop.c (ipa_print_node_jump_functions_for_edge): Likewise.
 * print-tree.c (print_node_brief, print_node): Likewise.
 * gimple.c (gimple_check_failed): Likewise.
 * lto-streamer.c (lto_tag_name, print_lto_report): Likewise.
 * config/frv/frv.c (frv_init_cumulative_args): Likewise.
 * config/mep/mep.c (mep_validate_vliw): Likewise.
 * config/iq2000/iq2000.c (init_cumulative_args): Likewise.
 * config/rs6000/rs6000.c (init_cumulative_args): Likewise.

 gcc/cp/
 * error.c (code_to_string): Use new wrapper get_tree_code_name.
 * cxx-pretty-print.c (pp_cxx_assignment_operator): Likewise.
 * pt.c (tsubst): Likewise.
 * semantics.c (cxx_eval_constant_expression,
 potential_constant_expression_1): Likewise.
 * mangle.c (MANGLE_TRACE_TREE, dump_substitution_candidates,
 add_substitution, find_substitution): Likewise.

 --
 Paulo Matos


Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) ~(CST2 - CST1)) == 0

2013-10-16 Thread Richard Biener
On Tue, Oct 15, 2013 at 6:55 PM, Jeff Law l...@redhat.com wrote:
 On 10/15/13 10:53, Jakub Jelinek wrote:

 On Tue, Oct 15, 2013 at 10:50:39AM -0600, Jeff Law wrote:

 I noticed that we're now including rtl.h and tm_p.h in
 tree-ssa-reassoc.c, which seems wrong.


 Isn't that required for BRANCH_COST use?
 Other option would be to add some dummy wrapper around
 BRANCH_COST, put that wrapper into some file that already includes rtl.h
 and
 tm_p.h and just call it from there.

 Yea, looking at it now, I'm a bit surprised this hasn't been converted to a
 target hook which would avoid this problem entirely.

You got the job!

;)

Richard.

 jeff


RE: [PATCH] Fix PR58682

2013-10-16 Thread Paulo Matos
 -Original Message-
 From: Mike Stump [mailto:mikest...@comcast.net]
 Sent: 14 October 2013 20:46
 To: Kyrill Tkachov
 Cc: Paulo Matos; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Fix PR58682

 In this case, I've reviewed the license I suspect you have, and I suspect it 
 does
 not have a grant of enough rights to allow it to be contributed to gcc.


Assuming we cannot get the testcase into GCC, can we get the patch into trunk? 
If you try it locally you can definitely see the ICE and that the patch fixes 
it.

-- 
Paulo Matos


Re: [Patch, Fortran] PR58652 - accept CLASS(*) as argument to CLASS(*)

2013-10-16 Thread Paul Richard Thomas
Dear Tobias,

Your patch is fine for trunk.

Thanks

Paul

On 16 October 2013 00:38, Tobias Burnus bur...@net-b.de wrote:
 As the test case (see also PR) showed, gfortran was rejecting:

  subroutine list_move_alloc(self,item)
class(list_node),intent(inout) :: self
class(*),intent(inout),allocatable :: item
 ...
  class(*), allocatable :: expr
 ...
call ast%move_alloc(expr)

 with the bogus message:

 call ast%move_alloc(expr)
 1
 Error: Actual argument to 'item' at (1) must have the same declared type


 The attached patch now also accepts passing CLASS(*) to CLASS(*).

 Built and currently regtesting on x86-64-gnu-linux (when successful:)
 OK for the trunk?

 Tobias



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


[PATCH] Enhance ifcombine to recover non short circuit branches

2013-10-16 Thread Zhenqiang Chen
Hi,

The patch enhances ifcombine pass to recover some non short circuit
branches. Basically, it will do the following transformation:

Case 1:
  if (cond1)
if (cond2)
==
  if (cond1  cond2)

Case 2:
  if (cond1)
goto L1
  if (cond2)
goto L1
==
  if (cond1 || cond2)
goto L1

Case 3:
  if (cond1)
goto L1
  else
goto L2
L1:
  if (cond2)
goto L2
==
  if (invert (cond1) || cond2)
goto L2

Case 4:
  if (cond1)
goto L1
  if (cond2)
goto L2
L1:
==
  if (invert (cond1)  cond2)
goto L2

Bootstrap on X86-64 and ARM.

Two new FAILs in regression tests:
gcc.dg/uninit-pred-8_b.c
gcc.dg/uninit-pred-9_b.c
uninit pass should be enhanced to handle more complex conditions. Will
submit a bug to track it and fix it later.

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2013-10-16  Zhenqiang Chen  zhenqiang.c...@linaro.org

* fold-const.c (simple_operand_p_2): Make it global.
* tree.h (simple_operand_p_2): Declare it.
* tree-ssa-ifcombine.c: Include rtl.h and tm_p.h.
(bb_has_overhead_p, generate_condition_node,
ifcombine_ccmp): New functions.
(ifcombine_fold_ifandif): New function, extracted from
ifcombine_ifandif.
(ifcombine_ifandif): Call ifcombine_ccmp.
(tree_ssa_ifcombine_bb): Skip optimized bb.

testsuite/ChangeLog
2013-10-16  Zhenqiang Chen  zhenqiang.c...@linaro.org

* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: New test case.
* gcc.dg/tree-ssa/ssa-dom-thread-3.c: Updated.


ifcombine.patch
Description: Binary data


Re: [PATCH] Move effective target check after other directives in c-c++-common/cpp/openmp-define-3.c

2013-10-16 Thread Thomas Schwinge
Hi!

On Mon, 14 Oct 2013 11:26:21 +0100, Kyrill Tkachov kyrylo.tkac...@arm.com 
wrote:
 This minuscule patch moves the dg-require-effective-target fopenmp line after 
 the other directives in the c-c++-common/cpp/openmp-define-3.c test. 
 Otherwise 
 dejagnu still proceeds to add -fopenmp to the options which fails on bare 
 metal 
 arm and aarch64 targets.

Thanks for reporting and fixing this -- I had not been aware of the
ordering requirements of dg-* directives.  From gcc/doc/sourcebuild.texi:

@item @{ dg-require-effective-target @var{keyword} [@{ @var{selector} @}] @}
Skip the test if [...]
This directive must appear after any @code{dg-do} directive in the test
and before any @code{dg-additional-sources} directive.
[...]

 --- a/gcc/testsuite/c-c++-common/cpp/openmp-define-3.c
 +++ b/gcc/testsuite/c-c++-common/cpp/openmp-define-3.c
 @@ -1,6 +1,6 @@
 -/* { dg-require-effective-target fopenmp } */
  /* { dg-options -fopenmp } */
  /* { dg-do preprocess } */
 +/* { dg-require-effective-target fopenmp } */

The other testsuite files that I added in r203417 should also be adapted
accordingly.  Do you want to do that, or want me to do it?


Grüße,
 Thomas


pgp8dkUCJT8ZE.pgp
Description: PGP signature


Re: RFC: Add of type-demotion pass

2013-10-16 Thread Richard Biener
On Wed, Oct 16, 2013 at 7:59 AM, Jeff Law l...@redhat.com wrote:
 On 07/08/13 11:45, Kai Tietz wrote:

 Hello,

 These passes are implementing type-demotion (type sinking into
 statements) for some gimple statements.  I limitted inital
 implementation to the set of multiply, addition, substraction, and
 binary-and/or/xor.  Additional this pass adds some rules to sink
 type-cast sequences - eg. (int) (short) x; with char as type of x.
 This special handing in this pass of such type-sequence
 simplification is necessary to avoid too complex cast-sequences by
 type-unsigned conversion used by this pass to avoid undefined
 overflow behaviour.

 I will sent separate patch with some test-cases to demonstrate and
 verify operation of this new optimization.  Just one sample I will
 cite here to demonstrate operation of type-demotion pass.

 So I think we want to come back to this patch and make some decisions about
 how to go forward.

 I've been reviewing all the discussions back through last year and I think
 the single biggest thing we need to realize is that there's nothing in this
 patch that *couldn't* be handled by tree-ssa-forwprop with a suitable amount
 of following the use-def chains through casts in various transformations and
 determining when those casts can be safely ignored.  However, I don't think
 extending tree-ssa-forwprop in this way is wise.

 As I see it, the value in this patch is it allows us to avoid that nonsense
 by essentially reformulating this a problem of moving and reformulating the
 casts in the IL.

 I see two primary effects of type sinking.

Note it was called type demotion ;)

  First and probably the most
 important in my mind is by sinking a cast through its uses the various
 transformations we already perform are more likely to apply *without*
 needing to handle optimizing through typecasts explicitly.

I would say it is desirable to express arithmetic in the smallest possible
types (see what premature optimization the C family frontends do
to narrow operations again after C integer promotion applied).

You need some kind of range information to do this, thus either integrate
it into VRP (there is already code that does this there) or use range
information from VRP which we now preserve.

 The second primary effect is, given two casts where the first indirectly
 feeds the second (ie, the first feeds some statement, which then feeds the
 second cast), if we're able to sink the first cast, we end up with the first
 cast directly feeding the second cast.  When this occurs one of the two
 casts can often be eliminated.   Sadly, I didn't keep any of those test
 files, but I regularly saw them in GCC bootstraps.

This transformation is applied both by fold-const.c and by SSA forwprop
(our GIMPLE combiner).  Doing it in yet another pass looks wrong
(and it isn't type demotion but also can be promotion).

 Kai, you mentioned you had more tests, but I never saw those posted.  I
 pulled together a handful of tests from the PRs  discussion threads. Some
 may be duplicates of yours (mine are attached).  If you could post yours as
 well, it'd be helpful.  Not all of mine are helped by this patch, but at
 least two are.

 There was some question about what to do with PROMOTE_MODE based targets.
 On those targets there's a lot of value in getting something into word mode
 and doing everything in word mode.  I think that can/should be a separate
 issue -- ISTM that ought to be handled fairly late in the tree optimizers.
 I don't see a strong argument for gating this patch on catering to
 PROMOTE_MODE.

In contrast to the desire of expressing operations in the smallest required
type there is the desire of exposing the effect of PROMOTE_MODE on
GIMPLE instead of only during RTL expansion.  This is because the
truncations (sext and zext) PROMOTE_MODE introduced are
easier to optimize away when range information is available (see the
attempts to address this at RTL expansion time from Kugan from Linaro).


 Similarly, I know there's a type hoisting patch that's also queued up. I
 think it should be handled separately as well.

I think we need to paint a picture of the final result - what is the
main objective of the various(?!) passes in question?  Where do
we do the same kind of transformation already?

AFAIK we have many places that optimize a series of conversions
(that's a combine-like problem), I'd call that conversion contraction.
We do have a set of foldings in the C family frontends, in fold and
in forwprop that try to un-do the effect of C integer promotions
(that's also a combine-like problem).

We have no pass that tries to promote or demote the types of
variables with using a data-flow approach (VRP comes closest,
but the transform is again pattern-matching, thus combine-like).
I do not object to adding this kind of pass, but I suggest to
look at the targets desires when implementing it - which eventually
means to honor PROMOTE_MODE (be careful about pass
placement here - you want 

[C++ Patch] PR 58724

2013-10-16 Thread Paolo Carlini

Hi,

we ICE on the C++11 version of the GNU visibility attribute on a 
namespace. I suppose we want to accept it, then just use 
get_attribute_name instead of TREE_PURPOSE. The bug report also asked 
for an example in the docs of attribute on a namespace declaration.


Tested x86_64-linux.

Thanks,
Paolo.

///
2013-10-16  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58724
* doc/extend.texi [visibility (visibility_type)]: Add example
about visibility attribute on namespace declaration.

/cp
2013-10-16  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58724
* name-lookup.c (handle_namespace_attrs): Use get_attribute_name.

/testsuite
2013-10-16  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58724
* g++.dg/cpp0x/gen-attrs-56.C: New.
Index: cp/name-lookup.c
===
--- cp/name-lookup.c(revision 203691)
+++ cp/name-lookup.c(working copy)
@@ -3571,7 +3571,7 @@ handle_namespace_attrs (tree ns, tree attributes)
 
   for (d = attributes; d; d = TREE_CHAIN (d))
 {
-  tree name = TREE_PURPOSE (d);
+  tree name = get_attribute_name (d);
   tree args = TREE_VALUE (d);
 
   if (is_attribute_p (visibility, name))
Index: doc/extend.texi
===
--- doc/extend.texi (revision 203691)
+++ doc/extend.texi (working copy)
@@ -4220,6 +4220,12 @@ the One Definition Rule; for example, it is usuall
 an inline method as hidden without marking the whole class as hidden.
 
 A C++ namespace declaration can also have the visibility attribute.
+
+@smallexample
+namespace nspace1 __attribute__ ((visibility (protected)))
+@{ /* @r{Do something.} */; @}
+@end smallexample
+
 This attribute applies only to the particular namespace body, not to
 other definitions of the same namespace; it is equivalent to using
 @samp{#pragma GCC visibility} before and after the namespace
Index: testsuite/g++.dg/cpp0x/gen-attrs-56.C
===
--- testsuite/g++.dg/cpp0x/gen-attrs-56.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/gen-attrs-56.C   (working copy)
@@ -0,0 +1,5 @@
+// PR c++/58724
+// { dg-do compile { target c++11 } }
+
+namespace foo __attribute__((visibility(default))) {}
+namespace bar [[gnu::visibility(default)]] {}


Re: [PATCH] Move effective target check after other directives in c-c++-common/cpp/openmp-define-3.c

2013-10-16 Thread Kyrill Tkachov

Hi Thomas,

--- a/gcc/testsuite/c-c++-common/cpp/openmp-define-3.c
+++ b/gcc/testsuite/c-c++-common/cpp/openmp-define-3.c
@@ -1,6 +1,6 @@
-/* { dg-require-effective-target fopenmp } */
  /* { dg-options -fopenmp } */
  /* { dg-do preprocess } */
+/* { dg-require-effective-target fopenmp } */

The other testsuite files that I added in r203417 should also be adapted
accordingly.  Do you want to do that, or want me to do it?

I'll leave the honors to you :)

Kyrill




Grüße,
  Thomas





Re: [PATCH,ARM] fix testsuite failures for arm-none-linux-gnueabihf

2013-10-16 Thread Charles Baylis
PIng?

On 19 September 2013 18:21, Charles Baylis charles.bay...@linaro.org wrote:
 Hi

 Here is an updated version.

 Changelog:

 * gcc.dg/builtin-apply2.c: skip test on arm hardfloat ABI targets
 * gcc.dg/tls/pr42894.c: Remove options, forcing -mthumb fails
 with hardfloat, and test is not thumb-specific
 * gcc,target/arm/thumb-ltu.c: Avoid test failure with
 hardfloat ABI by requiring arm_thumb1_ok
 * lib/target-supports.exp
 (check_effective_target_arm_fp16_ok_nocache): don't force
 -mfloat-abi=soft when building for hardfloat target

 On 19 August 2013 16:34, Richard Earnshaw rearn...@arm.com wrote:
 On 15/08/13 15:10, Charles Baylis wrote:
 Hi

 The attached patch fixes some tests which fail when testing gcc for a
 arm-none-linux-gnueabihf target because they do not expect to be built
 with a hard float ABI.

 The change in target-supports.exp fixes arm-fp16-ops-5.c and 
 arm-fp16-ops-6.c.

 Tested on arm-none-linux-gnueabihf using qemu-arm, and does not cause
 any other tests to break.

 Comments? This is my first patch, so please point out anything wrong.





 2013-08-15  Charles Baylis  charles.bay...@linaro.org

 * gcc.dg/builtin-apply2.c: skip test on arm hardfloat ABI targets
 * gcc.dg/tls/pr42894.c: Use -mfloat-abi=soft as Thumb1 does
 not support hardfloat ABI
 * arm/thumb-ltu.c: Use -mfloat-abi=soft as Thumb1 does not
 support hardfloat ABI
 * target-supports.exp: don't force -mfloat-abi=soft when
 building for hardfloat target


 hf-fixes.txt


 Index: gcc/testsuite/gcc.dg/builtin-apply2.c
 ===
 --- gcc/testsuite/gcc.dg/builtin-apply2.c (revision 201726)
 +++ gcc/testsuite/gcc.dg/builtin-apply2.c (working copy)
 @@ -1,6 +1,7 @@
  /* { dg-do run } */
  /* { dg-skip-if Variadic funcs have all args on stack. Normal funcs have 
 args in registers. { aarch64*-*-* avr-*-*  } { * } {  } } */
  /* { dg-skip-if Variadic funcs use Base AAPCS.  Normal funcs use VFP 
 variant. { arm*-*-* } { -mfloat-abi=hard } {  } } */
 +/* { dg-skip-if Variadic funcs use Base AAPCS.  Normal funcs use VFP 
 variant. { arm*-*-gnueabihf } { * } { -mfloat-abi=soft* } } */



 As you've noticed, basing the test's behaviour on the config variant
 doesn't work reliably.  The builtin-apply2 test really should be skipped
 if the current test variant is not soft-float.  We already have
 check_effective_target_arm_hf_eabi in target-supports.exp that checks
 whether __ARM_PCS_VFP is defined during a compilation.  So can  replace
 both arm related lines in builtin-apply2 with

  /* { dg-skip-if Variadic funcs use Base AAPCS.  Normal funcs use VFP
 variant. { arm*-*-*  arm_hf_eabi} { * } {  } } */

  /* PR target/12503 */
  /* Origin: pierre.nguyen-tu...@asim.lip6.fr */
 Index: gcc/testsuite/gcc.dg/tls/pr42894.c
 ===
 --- gcc/testsuite/gcc.dg/tls/pr42894.c(revision 201726)
 +++ gcc/testsuite/gcc.dg/tls/pr42894.c(working copy)
 @@ -1,6 +1,7 @@
  /* PR target/42894 */
  /* { dg-do compile } */
  /* { dg-options -march=armv5te -mthumb { target arm*-*-* } } */
 +/* { dg-options -march=armv5te -mthumb -mfloat-abi=soft { target 
 arm*-*-*hf } } */
  /* { dg-require-effective-target tls } */


 Although the original PR was for Thumb1, this is a generic test.  I'm
 not convinced that on ARM it should try to force thumb1.  Removing the
 original dg-options line should solve the problem and we then get better
 multi-lib testing as well.

  extern __thread int t;
 Index: gcc/testsuite/gcc.target/arm/thumb-ltu.c
 ===
 --- gcc/testsuite/gcc.target/arm/thumb-ltu.c  (revision 201726)
 +++ gcc/testsuite/gcc.target/arm/thumb-ltu.c  (working copy)
 @@ -1,6 +1,6 @@
  /* { dg-do compile } */
  /* { dg-skip-if incompatible options { arm*-*-* } { -march=* } { 
 -march=armv6 -march=armv6j -march=armv6z } } */
 -/* { dg-options -mcpu=arm1136jf-s -mthumb -O2 } */
 +/* { dg-options -mcpu=arm1136jf-s -mthumb -O2 -mfloat-abi=soft } */


 This won't work if there's an explict -mfloat-abi={softfp,hard} on the
 multilib options.  Probably the best thing to do here is to skip the
 test if arm_thumb1_ok is not true.

  void f(unsigned a, unsigned b, unsigned c, unsigned d)
  {
 Index: gcc/testsuite/lib/target-supports.exp
 ===
 --- gcc/testsuite/lib/target-supports.exp (revision 201726)
 +++ gcc/testsuite/lib/target-supports.exp (working copy)
 @@ -2445,6 +2445,11 @@
   # Must generate floating-point instructions.
   return 0
  }
 +if [check-flags [list  { *-*-gnueabihf } { * } {  } ]] {
 +# Use existing float-abi and force an fpu which supports fp16

 This should use arm_hf_eabi as described above.

 + set et_arm_fp16_flags -mfpu=vfpv4
 + return 1;
 +}
  if [check-flags 

RE: [PATCH,i386] Enable FMA4 for AMD bdver3

2013-10-16 Thread Gopalasubramanian, Ganesh
 4.8.2 is already rolling, so too late for that.

Is 4.8 branch (gcc/branches/gcc-4_8-branch) open?
If yes, shall I commit these changes?

Regards
Ganesh

-Original Message-
From: Jakub Jelinek [mailto:ja...@redhat.com] 
Sent: Wednesday, October 16, 2013 12:41 PM
To: Uros Bizjak
Cc: Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH,i386] Enable FMA4 for AMD bdver3

On Wed, Oct 16, 2013 at 09:00:58AM +0200, Uros Bizjak wrote:
 On Wed, Oct 16, 2013 at 8:28 AM, Gopalasubramanian, Ganesh 
 ganesh.gopalasubraman...@amd.com wrote:
 
  The below patch enables FMA4 for AMD bdver3 architectures.
 
  make -k check passes.
 
  +2013-10-16 Ganesh Gopalasubramanian  
  +ganesh.gopalasubraman...@amd.com
  +
  +   * config/i386/i386.c (ix86_option_override_internal): Enable FMA4
  +   for AMD bdver3.
  +
 
 OK for mainline and 4.8 branch (for 4.8.2 if approved by Jakub, 
 otherwise please wait for branch to open).

4.8.2 is already rolling, so too late for that.

Jakub




FW: MAX_PATH problems with mingw gcc

2013-10-16 Thread Vladimir Simonov
Hi,

Resending filename-normalize patch to correct list gcc-patches@gcc.gnu.org.
All context, please, see below.

Regards
Vladimir Simonov

-Original Message-
From: Vladimir Simonov 
Sent: Wednesday, October 16, 2013 12:52 PM
To: 'Jiri Dobry'; 'gcc-h...@gcc.gnu.org'
Subject: RE: MAX_PATH problems with mingw gcc

Hi all,

Sending new version of filename-normalize patch because people asking to make 
it public.

This version also fixes problems with gcov support of Linux binaries 
crosscompiled on Windows.
Problem: If you do crossbuild with gcov support, binaries refer gcda files like 
C:\Myproject\Mydir\Myobj.gcda.
After run on Linux files with names  like C:\Myproject\Mydir\Myobj.gcda are 
created.
No any directory tree.
The last chunk of new patch fixes this.

Best regards
Vladimir Simonov

-Original Message-
From: Simonov, Vladimir 
Sent: Monday, May 27, 2013 8:15 PM
To: gcc-h...@gcc.gnu.org
Subject: MAX_PATH problems with mingw gcc

Hi all,

Constructions #include ../../some_module/some_header.h are quite popular in 
our project. They lead to funny for Linux user problem which becomes real 
headache when we build the project by mingw based cross gcc.

Several levels of such include may easy exceed Windows MAX_PATH limitation 
for file I/O functions (260 symbols).

For example
g:/w/project1/product1/bundle/console/mv_system/centralized/dialogs/selectors/../../../../machines/dialogs/selectors/../../../components/machines/browsers/tree_based_group_browser/../../../hierarchy_model_browser/hierarchy_model_browser.h:37:61:
 fatal error: ../../remote/async_invokers/client/async_action.h: No such file 
or directory  #include ../../remote/async_invokers/client/async_action.h

As you see gcc tries to open
g:/w/project1/product1/bundle/console/mv_system/centralized/dialogs/selectors/../../../../machines/dialogs/selectors/../../../components/machines/browsers/tree_based_group_browser/../../../hierarchy_model_browser/../../remote/async_invokers/client/async_action.h
which length is 263 symbols.

IMO the best solution would be to switch on Unicode version of file IO 
functions in libiberty but it is quite radical change.

Attaching patch which works for me many years and illustrates another approach.
Its main idea is to simplify paths in some key places working with filenames 
(libcpp, parse_include in directives.c and find_file_in_dir in files.c).
With this patch code above compiles OK.

There are many pro and contras, i.e. it adds additional, probably unnecessary 
work on Linux time but makes filenames shorter, it affects libiberty which is 
shared between gcc/binutils/gdb, but it may be useful in other packages, etc., 
etc.

Anyway I'll be glad to hear any comments and may be to see some fix for above 
problem in gcc code.

Best regards
Vladimir Simonov



gcc-4.8.1-filename-normalize.patch
Description: gcc-4.8.1-filename-normalize.patch


Re: [PATCH] Move effective target check after other directives in c-c++-common/cpp/openmp-define-3.c

2013-10-16 Thread Thomas Schwinge
Hi!

On Wed, 16 Oct 2013 10:54:48 +0100, Kyrill Tkachov kyrylo.tkac...@arm.com 
wrote:
  --- a/gcc/testsuite/c-c++-common/cpp/openmp-define-3.c
  +++ b/gcc/testsuite/c-c++-common/cpp/openmp-define-3.c
  @@ -1,6 +1,6 @@
  -/* { dg-require-effective-target fopenmp } */
/* { dg-options -fopenmp } */
/* { dg-do preprocess } */
  +/* { dg-require-effective-target fopenmp } */
  The other testsuite files that I added in r203417 should also be adapted
  accordingly.  Do you want to do that, or want me to do it?
 I'll leave the honors to you :)

Sure.  ;-)

Pushed to trunk, r203697:

gcc/testsuite/
* c-c++-common/cpp/openmp-define-1.c: Move
dg-require-effective-target fopenmp after dg-do directive.
* c-c++-common/cpp/openmp-define-2.c: Likewise.
* gfortran.dg/openmp-define-1.f90: Likewise.
* gfortran.dg/openmp-define-2.f90: Likewise.
* gfortran.dg/openmp-define-3.f90: Likewise.

Completion of r203588, fix-up for r203417.

diff --git gcc/testsuite/c-c++-common/cpp/openmp-define-1.c 
gcc/testsuite/c-c++-common/cpp/openmp-define-1.c
index 0e7943a..c537922 100644
--- gcc/testsuite/c-c++-common/cpp/openmp-define-1.c
+++ gcc/testsuite/c-c++-common/cpp/openmp-define-1.c
@@ -1,5 +1,5 @@
-/* { dg-require-effective-target fopenmp } */
 /* { dg-do preprocess } */
+/* { dg-require-effective-target fopenmp } */
 
 #ifdef _OPENMP
 # error _OPENMP defined
diff --git gcc/testsuite/c-c++-common/cpp/openmp-define-2.c 
gcc/testsuite/c-c++-common/cpp/openmp-define-2.c
index f89..8823e29 100644
--- gcc/testsuite/c-c++-common/cpp/openmp-define-2.c
+++ gcc/testsuite/c-c++-common/cpp/openmp-define-2.c
@@ -1,6 +1,6 @@
-/* { dg-require-effective-target fopenmp } */
 /* { dg-options -fno-openmp } */
 /* { dg-do preprocess } */
+/* { dg-require-effective-target fopenmp } */
 
 #ifdef _OPENMP
 # error _OPENMP defined
diff --git gcc/testsuite/gfortran.dg/openmp-define-1.f90 
gcc/testsuite/gfortran.dg/openmp-define-1.f90
index d4ef1d1..4e1c076 100644
--- gcc/testsuite/gfortran.dg/openmp-define-1.f90
+++ gcc/testsuite/gfortran.dg/openmp-define-1.f90
@@ -1,6 +1,6 @@
-! { dg-require-effective-target fopenmp }
 ! { dg-options -cpp }
 ! { dg-do preprocess }
+! { dg-require-effective-target fopenmp }
 
 #ifdef _OPENMP
 # error _OPENMP defined
diff --git gcc/testsuite/gfortran.dg/openmp-define-2.f90 
gcc/testsuite/gfortran.dg/openmp-define-2.f90
index 651dae7..cd167ea 100644
--- gcc/testsuite/gfortran.dg/openmp-define-2.f90
+++ gcc/testsuite/gfortran.dg/openmp-define-2.f90
@@ -1,6 +1,6 @@
-! { dg-require-effective-target fopenmp }
 ! { dg-options -cpp -fno-openmp }
 ! { dg-do preprocess }
+! { dg-require-effective-target fopenmp }
 
 #ifdef _OPENMP
 # error _OPENMP defined
diff --git gcc/testsuite/gfortran.dg/openmp-define-3.f90 
gcc/testsuite/gfortran.dg/openmp-define-3.f90
index dfc45b1..3d55986 100644
--- gcc/testsuite/gfortran.dg/openmp-define-3.f90
+++ gcc/testsuite/gfortran.dg/openmp-define-3.f90
@@ -1,6 +1,6 @@
-! { dg-require-effective-target fopenmp }
 ! { dg-options -cpp -fopenmp }
 ! { dg-do preprocess }
+! { dg-require-effective-target fopenmp }
 
 #ifndef _OPENMP
 # error _OPENMP not defined


Grüße,
 Thomas


pgp0HJQRNg51x.pgp
Description: PGP signature


[C PATCH] Warn for _Alignas in an array declarator (PR c/58267)

2013-10-16 Thread Marek Polacek
This PR is about _Alignas in contexts like
char x[_Alignas (int) 20];
C grammar does not allow _Alignas to be in an array declarator, yet we
don't issue any warning or an error.  This patch implements a pedwarn
for this.  I'm however highly unsure whether we want pedwarn, error, or
normal warning for this - currently we just DTRT and ignore the
_Alignas altogether.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-10-16  Marek Polacek  pola...@redhat.com

PR c/58267
c/
* c-parser.c (c_parser_direct_declarator_inner): Record whether
_Alignas has been seen.
* c-decl.c (build_array_declarator): Add alignas_p parameter.  Warn
if _Alignas occurs in an array declarator.
* c-tree.h (build_array_declarator): Adjust declaration.
testsuite/
* gcc.dg/c1x-align-5.c: New test.

--- gcc/c/c-tree.h.mp   2013-10-15 19:36:09.548670833 +0200
+++ gcc/c/c-tree.h  2013-10-15 19:36:25.003727398 +0200
@@ -480,8 +480,8 @@ extern void c_init_decl_processing (void
 extern void c_print_identifier (FILE *, tree, int);
 extern int quals_from_declspecs (const struct c_declspecs *);
 extern struct c_declarator *build_array_declarator (location_t, tree,
-   struct c_declspecs *,
-   bool, bool);
+   struct c_declspecs *,
+   bool, bool, bool);
 extern tree build_enumerator (location_t, location_t, struct c_enum_contents *,
  tree, tree);
 extern tree check_for_loop_decls (location_t, bool);
--- gcc/c/c-parser.c.mp 2013-10-15 19:33:47.706150200 +0200
+++ gcc/c/c-parser.c2013-10-15 19:35:57.557627164 +0200
@@ -3096,8 +3096,10 @@ c_parser_direct_declarator_inner (c_pars
   struct c_declspecs *quals_attrs = build_null_declspecs ();
   bool static_seen;
   bool star_seen;
+  bool alignas_seen;
   tree dimen;
   c_parser_consume_token (parser);
+  alignas_seen = c_parser_next_token_is_keyword (parser, RID_ALIGNAS);
   c_parser_declspecs (parser, quals_attrs, false, false, true, 
cla_prefer_id);
   static_seen = c_parser_next_token_is_keyword (parser, RID_STATIC);
   if (static_seen)
@@ -3169,7 +3171,8 @@ c_parser_direct_declarator_inner (c_pars
   if (dimen)
mark_exp_read (dimen);
   declarator = build_array_declarator (brace_loc, dimen, quals_attrs,
-  static_seen, star_seen);
+  static_seen, star_seen,
+  alignas_seen);
   if (declarator == NULL)
return NULL;
   inner = set_array_declarator_inner (declarator, inner);
--- gcc/c/c-decl.c.mp   2013-10-15 19:35:30.172525717 +0200
+++ gcc/c/c-decl.c  2013-10-16 11:05:06.046383071 +0200
@@ -3866,7 +3866,7 @@ quals_from_declspecs (const struct c_dec
 struct c_declarator *
 build_array_declarator (location_t loc,
tree expr, struct c_declspecs *quals, bool static_p,
-   bool vla_unspec_p)
+   bool vla_unspec_p, bool alignas_p)
 {
   struct c_declarator *declarator = XOBNEW (parser_obstack,
struct c_declarator);
@@ -3908,6 +3908,9 @@ build_array_declarator (location_t loc,
}
   current_scope-had_vla_unspec = true;
 }
+  if (alignas_p)
+pedwarn (loc, OPT_Wpedantic, %_Alignas% meaningless in array 
+declarator);
   return declarator;
 }
 
--- gcc/testsuite/gcc.dg/c1x-align-5.c.mp   2013-10-16 11:11:16.432690963 
+0200
+++ gcc/testsuite/gcc.dg/c1x-align-5.c  2013-10-16 11:15:09.269514593 +0200
@@ -0,0 +1,17 @@
+/* Test C1X alignment support.  Test invalid code.  */
+/* { dg-do compile } */
+/* { dg-options -std=c1x -pedantic-errors } */
+
+static int
+foo (int a[_Alignas (0) 10])   /* { dg-error meaningless in array declarator 
} */
+{
+  return a[0];
+}
+
+int
+main ()
+{
+  int a[_Alignas (int) 10];/* { dg-error meaningless in array declarator 
} */
+  foo (a);
+  return 0;
+}

Marek


Re: [PATCH] Enhance ifcombine to recover non short circuit branches

2013-10-16 Thread Marc Glisse

On Wed, 16 Oct 2013, Zhenqiang Chen wrote:


The patch enhances ifcombine pass to recover some non short circuit
branches. Basically, it will do the following transformation:

Case 1:
 if (cond1)
   if (cond2)
==
 if (cond1  cond2)


Hello,

it might be more clear if you wrote (cond1  cond2) so it can't be 
confused with a short-circuit?


Using branch probabilities to guide the decision would be nice, don't know 
how hard that would be though.


--
Marc Glisse


Re: [PATCH] tree_code_name wrapper

2013-10-16 Thread Paolo Carlini

Hi,

bootstrap is currently broken, I'm going to test and commit the below if 
everything goes well.


Thanks,
Paolo.

///
Index: pt.c
===
--- pt.c(revision 203698)
+++ pt.c(working copy)
@@ -12102,7 +12102,7 @@ tsubst (tree t, tree args, tsubst_flags_t complain
   gcc_unreachable ();
 
 default:
-  sorry (use of %qs in template, get_tree_code_name ((int) code));
+  sorry (use of %qs in template, get_tree_code_name (code));
   return error_mark_node;
 }
 }


[COMMITTED] LRA enabling on ARM

2013-10-16 Thread Yvan Roux
Hi,

after the discussion in the thread below I've committed the enabling
of LRA on ARM through the new option -mlra. Notice that we still rely
on reload until the regression has been resolved.

http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00908.html

Thanks,
Yvan

2013-10-16  Yvan Roux  yvan.r...@linaro.org

* config/arm/arm.opt (mlra): New option.
* config/arm/arm.c (arm_lra_p): New function.
(TARGET_LRA_P): Define.


arm-lra.diff
Description: Binary data


[PATCH, rs6000] Fix vsx_unpack expansions for endianness

2013-10-16 Thread Bill Schmidt
For vector unpack operations, the meaning of high and low is
reversed for little endian.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  This fixes one test case for little endian
(gcc.dg/vect/vect-122.c).

Ok for trunk?

Thanks,
Bill


2013-10-16  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc/config/rs6000/vector.md (vec_unpacks_hi_v4sf): Correct for
endianness.
(vec_unpacks_lo_v4sf): Likewise.
(vec_unpacks_float_hi_v4si): Likewise.
(vec_unpacks_float_lo_v4si): Likewise.
(vec_unpacku_float_hi_v4si): Likewise.
(vec_unpacku_float_lo_v4si): Likewise.


Index: gcc/config/rs6000/vector.md
===
--- gcc/config/rs6000/vector.md (revision 203508)
+++ gcc/config/rs6000/vector.md (working copy)
@@ -872,7 +872,7 @@
 {
   rtx reg = gen_reg_rtx (V4SFmode);
 
-  rs6000_expand_interleave (reg, operands[1], operands[1], true);
+  rs6000_expand_interleave (reg, operands[1], operands[1], BYTES_BIG_ENDIAN);
   emit_insn (gen_vsx_xvcvspdp (operands[0], reg));
   DONE;
 })
@@ -884,7 +884,7 @@
 {
   rtx reg = gen_reg_rtx (V4SFmode);
 
-  rs6000_expand_interleave (reg, operands[1], operands[1], false);
+  rs6000_expand_interleave (reg, operands[1], operands[1], !BYTES_BIG_ENDIAN);
   emit_insn (gen_vsx_xvcvspdp (operands[0], reg));
   DONE;
 })
@@ -896,7 +896,7 @@
 {
   rtx reg = gen_reg_rtx (V4SImode);
 
-  rs6000_expand_interleave (reg, operands[1], operands[1], true);
+  rs6000_expand_interleave (reg, operands[1], operands[1], BYTES_BIG_ENDIAN);
   emit_insn (gen_vsx_xvcvsxwdp (operands[0], reg));
   DONE;
 })
@@ -908,7 +908,7 @@
 {
   rtx reg = gen_reg_rtx (V4SImode);
 
-  rs6000_expand_interleave (reg, operands[1], operands[1], false);
+  rs6000_expand_interleave (reg, operands[1], operands[1], !BYTES_BIG_ENDIAN);
   emit_insn (gen_vsx_xvcvsxwdp (operands[0], reg));
   DONE;
 })
@@ -920,7 +920,7 @@
 {
   rtx reg = gen_reg_rtx (V4SImode);
 
-  rs6000_expand_interleave (reg, operands[1], operands[1], true);
+  rs6000_expand_interleave (reg, operands[1], operands[1], BYTES_BIG_ENDIAN);
   emit_insn (gen_vsx_xvcvuxwdp (operands[0], reg));
   DONE;
 })
@@ -932,7 +932,7 @@
 {
   rtx reg = gen_reg_rtx (V4SImode);
 
-  rs6000_expand_interleave (reg, operands[1], operands[1], false);
+  rs6000_expand_interleave (reg, operands[1], operands[1], !BYTES_BIG_ENDIAN);
   emit_insn (gen_vsx_xvcvuxwdp (operands[0], reg));
   DONE;
 })




Re: [C PATCH] Warn for _Alignas in an array declarator (PR c/58267)

2013-10-16 Thread Joseph S. Myers
On Wed, 16 Oct 2013, Marek Polacek wrote:

 This PR is about _Alignas in contexts like
 char x[_Alignas (int) 20];
 C grammar does not allow _Alignas to be in an array declarator, yet we
 don't issue any warning or an error.  This patch implements a pedwarn
 for this.  I'm however highly unsure whether we want pedwarn, error, or
 normal warning for this - currently we just DTRT and ignore the
 _Alignas altogether.

I think this should be an error.

I see this as a problem with how the parsing of array declarators uses 
c_parser_declspecs to parse a sequence of qualifiers and attributes, with 
scspec_ok=false and typespec_ok=false to ensure those kinds of specifiers 
can't occur in the list, but without any measure to exclude alignment 
specifiers.  That is, I think c_parser_declspecs should gain an argument 
specifying whether alignment specifiers are OK, and the relevant calls 
from c_parser_direct_declarator_inner would pass false for that argument, 
and uses of alignment specifiers here would result in parse errors.

(Incidentally, the comments in c-parser.c that are supposed to give the 
grammar accepted are missing the syntax of array-declarator.)

 --- gcc/testsuite/gcc.dg/c1x-align-5.c.mp 2013-10-16 11:11:16.432690963 
 +0200
 +++ gcc/testsuite/gcc.dg/c1x-align-5.c2013-10-16 11:15:09.269514593 
 +0200
 @@ -0,0 +1,17 @@
 +/* Test C1X alignment support.  Test invalid code.  */
 +/* { dg-do compile } */
 +/* { dg-options -std=c1x -pedantic-errors } */
 +
 +static int
 +foo (int a[_Alignas (0) 10]) /* { dg-error meaningless in array declarator 
 } */
 +{
 +  return a[0];
 +}
 +
 +int
 +main ()
 +{
 +  int a[_Alignas (int) 10];  /* { dg-error meaningless in array declarator 
 } */
 +  foo (a);
 +  return 0;
 +}

I think this should also be tested after static (given that there are 
two separate calls to c_parser_declspecs involved), as well as in an 
abstract declarator.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PING] [PATCH] PR58143/58227 wrong code at -O3

2013-10-16 Thread Michael Matz
Hi,

On Tue, 15 Oct 2013, Richard Biener wrote:

 On Sun, Oct 6, 2013 at 2:22 PM, Bernd Edlinger
 bernd.edlin...@hotmail.de wrote:
  How I should proceed with this patch, is it OK?
 
  The latest version was posted at: 
  http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00234.html
 
 I have now attached a quick draft of a patch that rewrites the moved stmts

No, you haven't :)


Ciao,
Michael.


RE: [PATCH] Portable Volatility Warning

2013-10-16 Thread Bernd Edlinger
Hi Richard,

well I think I have now a solution for both of your comments on the
initial version of the portable volatility warning patch.
Furthermore I have integrated Sandra's comments.

Therefore I think it might be worth another try, if you don't mind.

Technically this patch is not dependent on Sandra's patch any more.
However it may be useful to use -Wportable-volatility together with
-fno-strict-volatile-bitmaps, to get correct code, together with a warning
where the device registers are accessed in possible wrong way.
As we know the -fstrict-volatile-bitmaps generates wrong
code most of the time, but the warning does not cover all cases
where -fstrict-volatile-bitmaps generates wrong code
without Sandra's patch.

On Tue, 3 Sep 2013 10:53:10, Richard Biener wrote:
 Just a quick first note:

 @@ -3470,6 +3470,13 @@ optimize_bit_field_compare (location_t l
 || offset != 0 || TREE_CODE (linner) == PLACEHOLDER_EXPR)
 return 0;

 + /* Do not try to optimize on volatiles, as it would defeat the
 + -Wportable-volatility warning later:
 + If the COMPONENT_REF is replaced by a BIT_FIELD_REF we loose
 + information and thus the warning cannot be generated correctly. */
 + if (warn_portable_volatility  lvolatilep)
 + return 0;

 changing code-generation with a warning option looks wrong to me. Note
 that I completely removed this code once (it also generates strange
 BIT_FIELD_REFs) but re-instantiated it on request because of lost
 optimizations.


I just emit the warning, when the COMPONENT_REF is about to be replaced,
but leave the code generation flow intact.

 Similar for

 @@ -609,11 +609,14 @@ layout_decl (tree decl, unsigned int kno
 occupying a complete byte or bytes on proper boundary,
 and not -fstrict-volatile-bitfields. If the latter is set,
 we unfortunately can't check TREE_THIS_VOLATILE, as a cast
 - may make a volatile object later. */
 + may make a volatile object later.
 + Handle -Wportable-volatility analogous, as we would loose
 + information and defeat the warning later. */
 if (TYPE_SIZE (type) != 0
  TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
  GET_MODE_CLASS (TYPE_MODE (type)) == MODE_INT
 -  flag_strict_volatile_bitfields = 0)
 +  flag_strict_volatile_bitfields = 0
 +  warn_portable_volatility == 0)
 {
 enum machine_mode xmode
 = mode_for_size_tree (DECL_SIZE (decl), MODE_INT, 1);

 I also wonder about we unfortunately can't check TREE_THIS_VOLATILE, as a 
 cast
 may make a volatile object later. - does that mean that AAPCS requires

 struct X { int x : 8; char c; };

 void foo (struct X *p)
 {
 ((volatile struct X *)p)-x = 1;
 }

 to be treated as volatile bitfield access? Similar, is

 volatile struct X x;
 x.x = 1;

 a volatile bitifled access?

 I think the warning can be completely implemented inside struct-layout.c
 for example in finish_bitfield_representative (if you pass that the first 
 field
 in the group, too). Of course that is at the expense of warning for
 struct declarations rather than actual code differences (the struct may
 not be used)

 Richard.


Bootstrapped and regression-tested on x86_64-linux-gnu.

OK for trunk?

Thanks
Bernd.2013-10-16  Bernd Edlinger  bernd.edlin...@hotmail.de

Implement -Wportable-volatility warning to warn about
code that accesses volatile structure members for
which different ABI specifications exist.
* expr.c (check_portable_volatility): New function.
(expand_assignment): Call check_portable_volatility.
(expand_real_expr_1): Likewise.
* expr.h (check_portable_volatility): New function.
* fold-const.c (optimize_bit_field_compare): Call
check_portable_volatility.  Removed if-statement, because
condition flag_strict_volatile_bitfields  0 is always false.
* c-family/c.opt: Add -Wportable-volatility option.
* doc/invoke.texi: Add documentation about -Wportable-volatility.

testsuite:
c-c++-common/
* Wportable-volatility-1.c: New testcase.
* Wportable-volatility-2.c: New testcase.



patch-portable-volatility.diff
Description: Binary data


Re: [patch] (un)fix loop unswitching.

2013-10-16 Thread Andrew MacLeod
Bah, bounced due to html or something. My latest email client upgrade 
must have lost the 'plain text only' attribute..


On 10/16/2013 08:02 AM, Andrew MacLeod wrote:
More or less  the same patch that I put in bugzilla 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58697


I've bootstrapped it and run it through regressions.  Jakubs test case 
fails again like it did before my changes, so it no longer masks the 
problem.


Loop unswitching was calling estimate_loop_iterations_int, which for 
various annoying reasons ended up in cfgloop and called 
get_estimate_loop_iterations, so it wasn't doing the calculations. 
Hopefully its clearer now.


The end result now is that cfgloop.c has 4  accessor routines which 
just retrieve info:

get_estimated_loop_iterations, get_estimated_loop_iterations_int,
get_max_loop_iterations, and get_max_loop_iterations_int.

tree-ssa-loop-niter.c has 4 routines which create the information 
first, if they can, then retrieve the information.

estimated_loop_iterations, estimated_loop_iterations_int,
max_loop_iterations, and max_loop_iterations_int.

all the cfg*  routines call the 'get_' variety, and all the tree-ssa 
routines call the non 'get_' variety to calculate things if need be.


Bootstraps on x86_64-unknown-linux-gnu and no regressions. OK? I 
presume Jakub's test case isn't in the testsuite anywhere then?  I 
presume it would go in along with a fix to the original problem


Andrew



	* cfgloop.c (get_estimated_loop_iterations_int): Rename from 
	estimated_loop_iterations_int.
	(max_stmt_executions_int): Call get_max_loop_iterations_int.
	(get_max_loop_iterations_int): New.  HWINT version of 
	get_max_loop_iterations.
	* cfgloop.h: Add prototypes.
	* loop-iv.c (find_simple_exit): call get_estimated_loop_iterations_int.
	* loop-unroll.c (decide_peel_once_rolling): Call 
	get_estimated_loop_iterations_int.
	* tree-ssa-loop-niter.c (estimated_loop_iterations_int): Add back.
	* tree-ssa-loop-niter.h: Tweak prototypes.

Index: cfgloop.c
===
*** cfgloop.c	(revision 203625)
--- cfgloop.c	(working copy)
*** record_niter_bound (struct loop *loop, d
*** 1815,1826 
  loop-nb_iterations_estimate = loop-nb_iterations_upper_bound;
  }
  
! /* Similar to estimated_loop_iterations, but returns the estimate only
 if it fits to HOST_WIDE_INT.  If this is not the case, or the estimate
 on the number of iterations of LOOP could not be derived, returns -1.  */
  
  HOST_WIDE_INT
! estimated_loop_iterations_int (struct loop *loop)
  {
double_int nit;
HOST_WIDE_INT hwi_nit;
--- 1815,1826 
  loop-nb_iterations_estimate = loop-nb_iterations_upper_bound;
  }
  
! /* Similar to get_estimated_loop_iterations, but returns the estimate only
 if it fits to HOST_WIDE_INT.  If this is not the case, or the estimate
 on the number of iterations of LOOP could not be derived, returns -1.  */
  
  HOST_WIDE_INT
! get_estimated_loop_iterations_int (struct loop *loop)
  {
double_int nit;
HOST_WIDE_INT hwi_nit;
*** estimated_loop_iterations_int (struct lo
*** 1842,1848 
  HOST_WIDE_INT
  max_stmt_executions_int (struct loop *loop)
  {
!   HOST_WIDE_INT nit = max_loop_iterations_int (loop);
HOST_WIDE_INT snit;
  
if (nit == -1)
--- 1842,1848 
  HOST_WIDE_INT
  max_stmt_executions_int (struct loop *loop)
  {
!   HOST_WIDE_INT nit = get_max_loop_iterations_int (loop);
HOST_WIDE_INT snit;
  
if (nit == -1)
*** get_max_loop_iterations (struct loop *lo
*** 1891,1893 
--- 1891,1915 
*nit = loop-nb_iterations_upper_bound;
return true;
  }
+ 
+ /* Similar to get_max_loop_iterations, but returns the estimate only
+if it fits to HOST_WIDE_INT.  If this is not the case, or the estimate
+on the number of iterations of LOOP could not be derived, returns -1.  */
+ 
+ HOST_WIDE_INT
+ get_max_loop_iterations_int (struct loop *loop)
+ {
+   double_int nit;
+   HOST_WIDE_INT hwi_nit;
+ 
+   if (!get_max_loop_iterations (loop, nit))
+ return -1;
+ 
+   if (!nit.fits_shwi ())
+ return -1;
+   hwi_nit = nit.to_shwi ();
+ 
+   return hwi_nit  0 ? -1 : hwi_nit;
+ }
+ 
+ 
Index: cfgloop.h
===
*** cfgloop.h	(revision 203625)
--- cfgloop.h	(working copy)
*** loop_outermost (struct loop *loop)
*** 740,747 
  }
  
  extern void record_niter_bound (struct loop *, double_int, bool, bool);
! extern HOST_WIDE_INT estimated_loop_iterations_int (struct loop *);
! extern HOST_WIDE_INT max_loop_iterations_int (struct loop *);
  extern bool get_estimated_loop_iterations (struct loop *loop, double_int *nit);
  extern bool get_max_loop_iterations (struct loop *loop, double_int *nit);
  
--- 740,747 
  }
  
  extern void record_niter_bound (struct loop *, double_int, bool, bool);
! extern HOST_WIDE_INT get_estimated_loop_iterations_int (struct loop *);
! extern 

Re: Minor ChangeLog goof

2013-10-16 Thread Martin Jambor
On Tue, Oct 15, 2013 at 09:53:22AM -0600, Jeff Law wrote:
 
 Martin put his entry in the wrong ChangeLog.  (toplevel instead of
 inside the gcc subdirectory).

Ouch, sorry and thanks for spotting and fixing it.

Martin


 
 Fixed in the obvious way.
 

 diff --git a/ChangeLog b/ChangeLog
 index 63c6cd8..0d3c199 100644
 --- a/ChangeLog
 +++ b/ChangeLog
 @@ -1,9 +1,3 @@
 -2013-10-15  Martin Jambor  mjam...@suse.cz
 -
 - * ipa-utils.h (ipa_edge_within_scc): Declare.
 - * ipa-cp.c (edge_within_scc): Moved...
 - * ipa-utils.c (ipa_edge_within_scc): ...here.  Updated all callers.
 -
  2013-01-10  Joern Rennecke  joern.renne...@embecosm.com
  
   Import from savannah.gnu.org:
 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 4142dd2..09e2494 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -753,6 +753,11 @@
   (avx512f_getmantmode): Ditto.
   (avx512f_rndscalemode): Fix formatting.
  
 +2013-10-15  Martin Jambor  mjam...@suse.cz
 +
 + * ipa-utils.h (ipa_edge_within_scc): Declare.
 + * ipa-cp.c (edge_within_scc): Moved...
 + * ipa-utils.c (ipa_edge_within_scc): ...here.  Updated all callers.
  
  2013-10-15  Alexander Ivchenko  alexander.ivche...@intel.com
   Maxim Kuznetsov  maxim.kuznet...@intel.com



Re: Patch: Add #pragma ivdep support to the ME and C FE

2013-10-16 Thread Frederic Riss
Hi,

Just one question. You describe the pragma in the doco patch as:

+This pragma tells the compiler that the immediately following @code{for}
+loop can be executed in any loop index order without affecting the result.
+The pragma aids optimization and in particular vectorization as the
+compiler can then assume a vectorization safelen of infinity.

I'm not a specialist, but I was always told that the 'original'
meaning of ivdep (which I believe was introduced by Cray), was that
the compiler could assume that there are only forward dependencies in
the loop, but not that it can be executed in any order. The Intel docs
give this example:

void ignore_vec_dep(int *a, int k, int c, int m)
{
  #pragma ivdep
  for (int i = 0; i  m; i++)
a[i] = a[i + k] * c;
}

Given your description, this loop wouldn't be a candidate for ivdep,
as reversing the loop index order changes the semantics. I believe
that the way you interpret it (ie. setting vectorization safe length
to INT_MAX) is correct with respect to this other definition, though.

Is there any normative definition for such language extensions ?

Oh, and are there any plans to maintain this information in some way
till the back-end? Software pipelining could be another huge winner
for that kind of dependency analysis simplification.

Thanks,
Fred


On 10 October 2013 10:45, Tobias Burnus bur...@net-b.de wrote:
 Richard: Could you review the Gimple part? Thanks!
 Joseph: Could you have a look at the C part? Thanks!

 Jakub Jelinek wrote:

 +  if (loop-latch)
 +   FOR_EACH_EDGE (e, ei, loop-latch-succs)
 + {
 +   if (e-dest-flags  BB_RTL)
 + break;

 I'd prefer Richard to review this (and probably Joseph the C FE part).
 You can't really have loop-latch in GIMPLE and the successors
 in RTL, so perhaps you can check that in the if (loop-latch) check
 already.
 GIMPLE_COND must be the last in the bb, can't be followed by debug stmts,
 so you can safely use just gsi_last_bb (e-dest) instead.


 I have done the two modifications you suggested, fixed the formatting
 glitches. And completed it by another all-language bootstrap and regtesting.
 (Including also the separately posted Fortran [1] and the C++ [2] patches.)
 OK for the trunk?

 [1] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00517.html
 [2] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00656.html

 +#define SET_ANNOTATE_EXPR_ID(NODE, ID) \
 +  (ANNOTATE_EXPR_CHECK(NODE)-base.u.version = ID)

 Likewise.  Shouldn't it be = (ID) ?


 Probably less relevant for assignments, but for safety, it cannot harm.

 Tobias


RE: [PATCH] tree_code_name wrapper

2013-10-16 Thread Paulo Matos
 -Original Message-
 From: Paolo Carlini [mailto:paolo.carl...@oracle.com]
 Sent: 16 October 2013 12:54
 To: Richard Biener; Paulo Matos
 Cc: Jakub Jelinek; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] tree_code_name wrapper
 
 Hi,
 
 bootstrap is currently broken, I'm going to test and commit the below if
 everything goes well.
 

Sorry all, it might be because I did --enable-languages=c and that was in the 
C++ part, even though with g++ being used to bootstrap I would have expected it 
to break the build anyway.

Thanks Paolo for fixing it.

-- 
Paulo Matos


Re: libgo patch committed: Don't clobber context when catching signal

2013-10-16 Thread Ian Lance Taylor
On Wed, Oct 16, 2013 at 12:24 AM, Uros Bizjak ubiz...@gmail.com wrote:

 This patch to libgo fixes a dumb bug in which catching a signal via the
 os/signal package could clobber the saved split-stack context, leading
 to disaster if the signal arrived at the wrong time.  Bootstrapped and
 ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.
 Will commit to 4.8 branch when it reopens.

 Attached patch is re-enables bootstrap on targets without split stack
 due to unused variable warning. The patch was tested on
 alphaev68-linux-gnu [1].

Whoops, sorry about that.

Committed to mainline.

Ian


website patch committed: 4.8.2 supports Go 1.1.2

2013-10-16 Thread Ian Lance Taylor
I committed this patch to the website to document that the 4.8.2 release
includes full support for Go 1.1.2.

Ian

Index: gcc-4.8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.122
diff -u -r1.122 changes.html
--- gcc-4.8/changes.html	16 Oct 2013 06:57:54 -	1.122
+++ gcc-4.8/changes.html	16 Oct 2013 13:46:57 -
@@ -441,6 +441,8 @@
 
 h3 id=goGo/h3
   ul
+liGCC 4.8.2 provides a complete implementation of the Go 1.1.2
+  release./li
 liGCC 4.8.0 and 4.8.1 implement a preliminary version of the Go
   1.1 release.  The library support is not quite complete./li
 liGo has been tested on GNU/Linux and Solaris platforms for


[wide-int] int_traits tree

2013-10-16 Thread Richard Biener

Speaking in terms of a patch:

Index: tree.h
===
--- tree.h  (revision 203690)
+++ tree.h  (working copy)
@@ -5184,14 +5184,11 @@ wi::int_traits const_tree::decompose (
  / HOST_BITS_PER_WIDE_INT);
   unsigned int xprecision = get_precision (x);
 
-  gcc_assert (precision = xprecision);
+  gcc_checking_assert (precision = xprecision  precision != 0);
 
-  /* Got to be careful of precision 0 values.  */
-  if (precision)
-len = MIN (len, max_len);
   if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED)
 {
-  unsigned int small_prec = precision  (HOST_BITS_PER_WIDE_INT - 1);
+  unsigned int small_prec = xprecision  (HOST_BITS_PER_WIDE_INT - 
1);
   if (small_prec)
{
  /* We have to futz with this because the canonization for
@@ -5201,7 +5198,7 @@ wi::int_traits const_tree::decompose (
{
  for (unsigned int i = 0; i  len - 1; i++)
scratch[i] = val[i];
- scratch[len - 1] = sext_hwi (val[len - 1], precision);
+ scratch[len - 1] = sext_hwi (val[len - 1], small_prec);
  return wi::storage_ref (scratch, len, precision);
}
}

the precision 0 thing is gone as far as I understand?

Also sign-extending the upper hwi must be done with precision % 
HOST_BITS_PER_WIDE_INT, no?  And it assumes that we only ever
have to extend the upper most HWI, thus 'len' is never too big.

I note that we immediately return in the above case, so

  if (precision  xprecision + HOST_BITS_PER_WIDE_INT)
{
  len = wi::force_to_size (scratch, val, len, xprecision, 
precision, UNSIGNED);
  return wi::storage_ref (scratch, len, precision);
}

applies only when the desired precision is a multiple of a HWI.
I assume it adds that extra zero word in case the MSB is set,
right?  I am confused about the condition written here and
how we look at precision and not xprecision when deciding how
to extend - given a 8-bit precision unsigned 0xff and precision == 64
we do not even consider sign-extending the value because we look
at precision and not xprecision.  Don't we need to look at
xprecision here?  After all an assert precision == xprecision
does not work in this routine.

Quickly execute.exp tested only.

To avoid code quality wreckage we have to implement a different way
of allocating the 'scratch' storage of wide_int_ref_storage
(well, ideally we wouldn't need it ...).  I thought of simply
allocating it via alloca (), but we need to do that in all
callers that build a wide_int_ref (eventually via a hidden
default arg).  But as that's a template specialization of
generic_wide_int ... so the option is to provide a function
wrapper inside wi:: for this, like

template typename T
wide_int_ref
wi::ref (const T t, HOST_WIDE_INT *scratch = XALLOCAVEC (HOST_WIDE_INT, 
WIDE_INT_MAX_ELTS))
{
  return wide_int_ref (t, scratch);
}

and amend the storage constructors to get a scratch argument.

The reason for all this is that otherwise the wide_int_ref object
escapes when we pass the scratch storage to the workers in
int_traits const_tree.

Thanks,
Richard.


[c++-concepts] Shorthand notation

2013-10-16 Thread Andrew Sutton
I've committed initial support for shorthand constraints. This patch
adds support for parsing a concept-names as non-class names. When
introducing a template parameter, the concept name is transformed into
a constraint on the template parameter. Constrained parameters can
introduce type, non-type and template template parameters.

This has initial support for variadic constraints, but it's not well tested.

This patch does not yet support default arguments for constrained
template parameters, nor does it support the use of concept ids of
this form:

  templatetypename T, FunctionT F
void f();

There are a couple of interesting things in the patch. I'm using a
PLACEHOLDER_EXPR as a template argument in order to resolve constraint
names. Effectively, I deduce concepts by creating an expression like:

  Equality_comparable?()

where ? is a placeholder, and after coerce_template_arguments
completes, I can extract the matched parameter from the placeholder.
This works nicely when concepts are overloaded or have default
arguments (although I'm not sure I'm testing that very thoroughly
right now).

With variadic constraints, I've had to add functionality to expand a
pack as a conjunction of requirements. For example, if you declare:

  templateClass... Ts // ClassT() must be true for each T in Ts
void f();

The transformation is:

  templatetypename... Ts
requires ClassTs()...
  void f();

Where ClassTs()... expands to ClassT1()  ClassT2()  ... etc.
I feel like the current implementation is a bit hacky, and I'm
wondering if I should introduce a new node for a pack conjunction.

Change log follows.

2013-10-16  Andrew Sutton  andrew.n.sut...@gmail.com
* gcc/cp/constraint.cc (conjoin_requiremens): New.
(resolve_constraint_check): Filter non-concept candidates before
coercing arguments. Perform deduction in a template-decl processing
context to prevent errors during diagnosis.
(finish_concept_name), (finish_shorthand_requirement),
(get_shorthand_requirements): New.
* gcc/cp/pt.c (template_parm_to_arg): Make non-static.
(process_templat_parm): Build shorthand requirements from the
parameter description.
(end_templat_parm_list): New.
(convert_placeholder_argument): New.
(convert_template_argument): Match placeholder arguments against
any template parameter.
(tsubst_pack_conjuction):  New.
(tsubst_expr): Expand a pack as a conjunction.
(type_dependent_expression_p): Placeholders are always type
dependent.
* gcc/cp/parser.c (cp_is_constrained_parameter),
(cp_finish_template_type_parm), (cp_finish_template_template_parm)
(cp_finish_non_type_template_parm), (cp_finish_constrined_parameter):
New.
(cp_parser_template_parameter): Handle constrained parameters.
(cp_parser_nonclass_name): An identifier naming an overload set
may declare a constrained parameter.
(cp_parser_type_parameter), (cp_parser_template_declaration_after_exp):
Get shorthand requirements from the tmeplate parameter list.
* gcc/cp/cp-tree.h (TEMPLATE_PARM_CONSTRAINTS): New.

Committed in 203704.

Andrew Sutton


Re: [patch] (un)fix loop unswitching.

2013-10-16 Thread Richard Biener
On Wed, Oct 16, 2013 at 2:56 PM, Andrew MacLeod amacl...@redhat.com wrote:
 Bah, bounced due to html or something. My latest email client upgrade must
 have lost the 'plain text only' attribute..


 On 10/16/2013 08:02 AM, Andrew MacLeod wrote:

 More or less  the same patch that I put in bugzilla
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58697

 I've bootstrapped it and run it through regressions.  Jakubs test case
 fails again like it did before my changes, so it no longer masks the
 problem.

 Loop unswitching was calling estimate_loop_iterations_int, which for
 various annoying reasons ended up in cfgloop and called
 get_estimate_loop_iterations, so it wasn't doing the calculations. Hopefully
 its clearer now.

 The end result now is that cfgloop.c has 4  accessor routines which just
 retrieve info:
 get_estimated_loop_iterations, get_estimated_loop_iterations_int,
 get_max_loop_iterations, and get_max_loop_iterations_int.

 tree-ssa-loop-niter.c has 4 routines which create the information first,
 if they can, then retrieve the information.
 estimated_loop_iterations, estimated_loop_iterations_int,
 max_loop_iterations, and max_loop_iterations_int.

 all the cfg*  routines call the 'get_' variety, and all the tree-ssa
 routines call the non 'get_' variety to calculate things if need be.

 Bootstraps on x86_64-unknown-linux-gnu and no regressions. OK? I

Ok.

Thanks,
Richard.

presume
 Jakub's test case isn't in the testsuite anywhere then?  I presume it would
 go in along with a fix to the original problem

 Andrew




Re: [PATCH, rs6000] Fix vsx_unpack expansions for endianness

2013-10-16 Thread David Edelsohn
On Wed, Oct 16, 2013 at 8:22 AM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 For vector unpack operations, the meaning of high and low is
 reversed for little endian.

 Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
 regressions.  This fixes one test case for little endian
 (gcc.dg/vect/vect-122.c).

 Ok for trunk?

 Thanks,
 Bill


 2013-10-16  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gcc/config/rs6000/vector.md (vec_unpacks_hi_v4sf): Correct for
 endianness.
 (vec_unpacks_lo_v4sf): Likewise.
 (vec_unpacks_float_hi_v4si): Likewise.
 (vec_unpacks_float_lo_v4si): Likewise.
 (vec_unpacku_float_hi_v4si): Likewise.
 (vec_unpacku_float_lo_v4si): Likewise.

Okay.

Thanks, David


Re: Patch: Add #pragma ivdep support to the ME and C FE

2013-10-16 Thread Tobias Burnus
Frederic Riss wrote:
 Just one question. You describe the pragma in the doco patch as:

 +This pragma tells the compiler that the immediately following @code{for}
 +loop can be executed in any loop index order without affecting the result.
 +The pragma aids optimization and in particular vectorization as the
 +compiler can then assume a vectorization safelen of infinity.

 I'm not a specialist, but I was always told that the 'original'
 meaning of ivdep (which I believe was introduced by Cray), was that
 the compiler could assume that there are only forward dependencies in
 the loop, but not that it can be executed in any order.

The nice thing about #pragma ivdep is that there is no real standard. And
the explanation of the different vendors is also not completely clear.


Some overview about this is given in the following file on pages 13-14 for
Cray Reaseach PVP, MIPSPRO  Open64, Intel ICC, Multiflow
http://sysrun.haifa.il.ibm.com/hrl/greps2007/papers/GREPS2007-Benoit.pdf

That's summerized as:
- vector: ignore lexical upward dependencies (Cray PVP, Intel ICC)
- parallel: ignore loop-carried dependencies (MIPSPRO, Open64)
- liberal: ignore loop-variant dependencies (Multiflow)


The quotes for Cray and Intel are below.

Cray: 
http://docs.cray.com/books/004-2179-001/html-004-2179-001/brtlrwh.html#EKZ5MRWH
The ivdep directive tells the compiler to ignore vector dependencies for
 the loop immediately following the directive. Conditions other than vector
 dependencies can inhibit vectorization. If these conditions are satisfactory,
 the loop vectorizes. This directive is useful for some loops that contain
 pointers and indirect addressing. The format of this directive is as follows:
 #pragma _CRI ivdep

Intel: 
http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-B25ABCC2-BE6F-4599-AEDF-2434F4676E1B.htm
The ivdep pragma instructs the compiler to ignore assumed vector dependencies.
 To ensure correct code, the compiler treats an assumed dependence as a proven
 dependence, which prevents vectorization. This pragma overrides that decision.
 Use this pragma only when you know that the assumed loop dependencies are safe
 to ignore.


 The Intel docs give this example:
...
 Given your description, this loop wouldn't be a candidate for ivdep,
 as reversing the loop index order changes the semantics. I believe
 that the way you interpret it (ie. setting vectorization safe length
 to INT_MAX) is correct with respect to this other definition, though.

Do you have a suggestion for a better wording? My idea was to interpret
this part similar to OpenMP's simd with safelen=infinity. (Actually, I
believe loop-safelen was added for OpenMPv4's and/or Cilk Plus's simd.)

OpenMPv4.0, http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf , states
for this (excerpt from page 70):
A SIMD loop has logical iterations numbered 0, 1,...,N-1 where N is the
number of loop iterations, and the logical numbering denotes the sequence
in which the iterations would be executed if the associated loop(s) were
executed with no SIMD instructions. If the safelen clause is used then no
two iterations executed concurrently with SIMD instructions can have a
greater distance in the logical iteration space than its value. The
parameter of the safelen clause must be a constant positive integer
expression. The number of iterations that are executed concurrently at
any given time is implementation defined. Each concurrent iteration will
be executed by a different SIMD lane. Each set of concurrent iterations
is a SIMD chunk.


 Oh, and are there any plans to maintain this information in some way
 till the back-end? Software pipelining could be another huge winner
 for that kind of dependency analysis simplification.

I don't know until when loop-safelen is kept. As it is late in the
middle-end, providing the backend with this information should be
simple.

Tobias


[AArch64] Fix preferred_reload_class for regclass STACK_REG.

2013-10-16 Thread Marcus Shawcroft


The existing AArch64 implementation of preferred_reload_class when 
called with a destination regclass STACK_REG returns the wider register 
class GENERAL_REG.  This patch changes the behavior such that we only 
return a narrower class (or NO_REGS) and never return a wider class.


I'll leave this on the list 24h before committing.

/Marcus


2013-10-16  Marcus Shawcroft  marcus.shawcr...@arm.com

* config/aarch64/aarch64.c (aarch64_preferred_reload_class): Adjust
handling of STACK_REG.diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index da3962f..7fce7a0 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4219,9 +4219,18 @@ aarch64_class_max_nregs (reg_class_t regclass, enum machine_mode mode)
 static reg_class_t
 aarch64_preferred_reload_class (rtx x, reg_class_t regclass)
 {
-  if (regclass == POINTER_REGS || regclass == STACK_REG)
+  if (regclass == POINTER_REGS)
 return GENERAL_REGS;
 
+  if (regclass == STACK_REG)
+{
+  if (REG_P(x)
+	   reg_class_subset_p (REGNO_REG_CLASS (REGNO (x)), POINTER_REGS))
+	  return regclass;
+
+  return NO_REGS;
+}
+
   /* If it's an integer immediate that MOVI can't handle, then
  FP_REGS is not an option, so we return NO_REGS instead.  */
   if (CONST_INT_P (x)  reg_class_subset_p (regclass, FP_REGS)

[AArch64] Classify FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM as POINTER_REGS.

2013-10-16 Thread Marcus Shawcroft


The existing aarch64 implementation of REGNO_REGCLASS classifies the 
soft frame and arg registers as CORE_REGS.  However either could be 
eliminated against FP or SP, in case of the latter CORE_REGS is too 
narrow.  This patch changes the classification to POINTER_REGS a 
superset of CORE_REGS that includes SP.


Regressed aarch64-none-elf, committed.

/Marcus

2013-10-16  Marcus Shawcroft  marcus.shawcr...@arm.com

* config/aarch64/aarch64.c (aarch64_regno_regclass): Classify
FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM as POINTER_REGS.diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f6f587a..da3962f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3904,7 +3904,7 @@ aarch64_regno_regclass (unsigned regno)
 
   if (regno == FRAME_POINTER_REGNUM
   || regno == ARG_POINTER_REGNUM)
-return CORE_REGS;
+return POINTER_REGS;
 
   if (FP_REGNUM_P (regno))
 return FP_LO_REGNUM_P (regno) ?  FP_LO_REGS : FP_REGS;

[AArch64] Fix output template for Scalar Neon-Neon register move.

2013-10-16 Thread James Greenhalgh

Hi,

To move a scalar char/short/int around in the vector registers there
is no such instruction as:
  dup v0, v0.h[0]
But there is:
  dup h0, v0.h[0]
(Alternately there is dup v0.4h, v0.h[0], but I don't think that
is what we are aiming for).

Fix the output template we are using to reflect this.

aarch64.exp came back clean and the correct instruction form is
now generated.

OK?

Thanks,
James

---
2013-10-14  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64.md
(*movmode_aarch64): Fix output template for DUP (element) Scalar.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 01664665e7d309f2cf2076bdc3ca6e0825612cea..758be47420e95fad74c57c1a9dcb7934b87c141e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -789,7 +789,7 @@ (define_insn *movmode_aarch64
  case 8:
return dup\t%0.Vallxd, %w1;
  case 9:
-   return dup\t%0, %1.v[0];
+   return dup\t%Vetype0, %1.v[0];
  default:
gcc_unreachable ();
  }

Re: [AArch64] Fix output template for Scalar Neon-Neon register move.

2013-10-16 Thread Marcus Shawcroft
On 16 October 2013 15:58, James Greenhalgh james.greenha...@arm.com wrote:

 Hi,

 To move a scalar char/short/int around in the vector registers there
 is no such instruction as:
   dup v0, v0.h[0]
 But there is:
   dup h0, v0.h[0]
 (Alternately there is dup v0.4h, v0.h[0], but I don't think that
 is what we are aiming for).

 Fix the output template we are using to reflect this.

 aarch64.exp came back clean and the correct instruction form is
 now generated.

 OK?

OK

/Marcus


Re: [C PATCH] Warn for _Alignas in an array declarator (PR c/58267)

2013-10-16 Thread Marek Polacek
On Wed, Oct 16, 2013 at 12:28:32PM +, Joseph S. Myers wrote:
 On Wed, 16 Oct 2013, Marek Polacek wrote:
 
  This PR is about _Alignas in contexts like
  char x[_Alignas (int) 20];
  C grammar does not allow _Alignas to be in an array declarator, yet we
  don't issue any warning or an error.  This patch implements a pedwarn
  for this.  I'm however highly unsure whether we want pedwarn, error, or
  normal warning for this - currently we just DTRT and ignore the
  _Alignas altogether.
 
 I think this should be an error.

Makes sense.
 
 I see this as a problem with how the parsing of array declarators uses 
 c_parser_declspecs to parse a sequence of qualifiers and attributes, with 
 scspec_ok=false and typespec_ok=false to ensure those kinds of specifiers 
 can't occur in the list, but without any measure to exclude alignment 
 specifiers.  That is, I think c_parser_declspecs should gain an argument 
 specifying whether alignment specifiers are OK, and the relevant calls 
 from c_parser_direct_declarator_inner would pass false for that argument, 
 and uses of alignment specifiers here would result in parse errors.

Thanks, should be done in the attached patch.

 (Incidentally, the comments in c-parser.c that are supposed to give the 
 grammar accepted are missing the syntax of array-declarator.)

I've fixed that up.

 I think this should also be tested after static (given that there are 
 two separate calls to c_parser_declspecs involved), as well as in an 
 abstract declarator.

Done.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-10-16  Marek Polacek  pola...@redhat.com

PR c/58267
c/
* c-parser.c (c_parser_declspecs): Add alignspec_ok parameter.
Document syntax of the array-declarator.
(c_parser_declspecs) RID_ALIGNAS: Bail out if alignment specs
are not permitted.
(c_parser_declaration_or_fndef): Adjust c_parser_declspecs call.
(c_parser_struct_declaration): Likewise.
(c_parser_declarator): Likewise.
(c_parser_direct_declarator_inner): Likewise.
(c_parser_parameter_declaration): Likewise.
(c_parser_type_name): Likewise.
testsuite/
* gcc.dg/c1x-align-5.c: New test.

--- gcc/c/c-parser.c.mp 2013-10-16 15:00:36.774530501 +0200
+++ gcc/c/c-parser.c2013-10-16 16:31:30.731611508 +0200
@@ -1124,7 +1124,7 @@ static void c_parser_declaration_or_fnde
 static void c_parser_static_assert_declaration_no_semi (c_parser *);
 static void c_parser_static_assert_declaration (c_parser *);
 static void c_parser_declspecs (c_parser *, struct c_declspecs *, bool, bool,
-   bool, enum c_lookahead_kind);
+   bool, bool, enum c_lookahead_kind);
 static struct c_typespec c_parser_enum_specifier (c_parser *);
 static struct c_typespec c_parser_struct_or_union_specifier (c_parser *);
 static tree c_parser_struct_declaration (c_parser *);
@@ -1494,7 +1494,8 @@ c_parser_declaration_or_fndef (c_parser
   fndef_ok = !nested;
 }
 
-  c_parser_declspecs (parser, specs, true, true, start_attr_ok, 
cla_nonabstract_decl);
+  c_parser_declspecs (parser, specs, true, true, start_attr_ok,
+ true, cla_nonabstract_decl);
   if (parser-error)
 {
   c_parser_skip_to_end_of_block_or_statement (parser);
@@ -1942,8 +1943,9 @@ c_parser_static_assert_declaration_no_se
 /* Parse some declaration specifiers (possibly none) (C90 6.5, C99
6.7), adding them to SPECS (which may already include some).
Storage class specifiers are accepted iff SCSPEC_OK; type
-   specifiers are accepted iff TYPESPEC_OK; attributes are accepted at
-   the start iff START_ATTR_OK.
+   specifiers are accepted iff TYPESPEC_OK; alignment specifiers are
+   accepted iff ALIGNSPEC_OK; attributes are accepted at the start
+   iff START_ATTR_OK.
 
declaration-specifiers:
  storage-class-specifier declaration-specifiers[opt]
@@ -2039,7 +2041,7 @@ c_parser_static_assert_declaration_no_se
 static void
 c_parser_declspecs (c_parser *parser, struct c_declspecs *specs,
bool scspec_ok, bool typespec_ok, bool start_attr_ok,
-   enum c_lookahead_kind la)
+   bool alignspec_ok, enum c_lookahead_kind la)
 {
   bool attrs_ok = start_attr_ok;
   bool seen_type = specs-typespec_kind != ctsk_none;
@@ -2235,6 +2237,8 @@ c_parser_declspecs (c_parser *parser, st
  declspecs_add_attrs (loc, specs, attrs);
  break;
case RID_ALIGNAS:
+ if (!alignspec_ok)
+   goto out;
  align = c_parser_alignas_specifier (parser);
  declspecs_add_alignas (loc, specs, align);
  break;
@@ -2640,7 +2644,8 @@ c_parser_struct_declaration (c_parser *p
 }
   specs = build_null_declspecs ();
   decl_loc = c_parser_peek_token (parser)-location;
-  c_parser_declspecs (parser, specs, false, true, true, cla_nonabstract_decl);
+  c_parser_declspecs (parser, specs, false, true, true,
+

Re: [PATCH v2 1/4] Ignore access-control keywords when parsing fields.

2013-10-16 Thread David Malcolm
On Tue, 2013-10-15 at 12:36 -0600, Jeff Law wrote:
 On 09/24/13 11:49, David Malcolm wrote:
[...]
  * gengtype-parse.c (struct_field_seq): Ignore access-control
  keywords (public: etc).
 OK.
Thanks; committed to trunk as r203708.





Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread David Malcolm
On Tue, 2013-10-15 at 13:19 +, Paulo Matos wrote:
  -Original Message-
  From: Jakub Jelinek [mailto:ja...@redhat.com]
  Sent: 15 October 2013 10:51
  To: Paulo Matos
  Cc: Richard Biener; Paolo Carlini; gcc-patches@gcc.gnu.org
  Subject: Re: [PATCH] tree_code_name wrapper
  
  On Tue, Oct 15, 2013 at 09:42:17AM +, Paulo Matos wrote:
   Thanks, regarding the indentation I was convinced we used 2 space
   indentation with maximum line length of 80 characters.
  
  We do, however 8 consecutive spaces in the indentation should be always
  replaced by a tab.
  
 
 This means that every sequence of 8 spaces should be converted into tabs? So, 
 if we indent something 4 times, that becomes a tab instead of 4 times 2 
 spaces. 
 
 Just to confirm so I can setup my emacs to use this.

Probably a silly question, but what's the invocation in emacs to do
this, and is it possible to set up the source tree so that this happens
automatically for everyone?

(I've been using whitespace-mode and manually copying-and-pasting tab
characters, which is clearly suboptimal).



Re: [patch] combine ICE fix

2013-10-16 Thread Cesar Philippidis
On 10/15/13 12:16 PM, Jeff Law wrote:
 On 10/10/13 10:25, Jakub Jelinek wrote:
 On Thu, Oct 10, 2013 at 07:26:43AM -0700, Cesar Philippidis wrote:
 This patch addresses an ICE when building qemu for a Mips target in
 Yocto. Both gcc-trunk, gcc-4.8 and all of the targets are potentially
 affected. The problem occurs because the instruction combine phase uses
 two data structures to keep track of registers, reg_stat and
 regstat_n_sets_and_refs, but they potentially end up out of sync; when
 combine inserts a new register into reg_stat it does not update
 regstat_n_sets_and_refs. Failing to update the latter results in an
 occasional segmentation fault.

 Is this OK for trunk and gcc-4.8? If so, please check it in. I tested it
 on Mips and x86-64 and no regressions showed up.

 That looks broken.  You leave everything from the last size till the
 current
 one uninitialized, so it would only work if combine_split_insns
 always increments max_reg_num () by at most one.
 I don't think that assumption is safe.  Consider a parallel with a bunch
 of (clobber (match_scratch)) expressions.

I address that in the patch posted here
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00802.html. Is that still
insufficient?

 Furthermore, there are macros which should be used to access
 the fields, and, if the vector is ever going to be resized, supposedly
 it should be vec.h vector rather than just array.
 Or perhaps take into account:
 /* If a pass need to change these values in some magical way or the
 pass needs to have accurate values for these and is not using
 incremental df scanning, then it should use REG_N_SETS and
 REG_N_USES.  If the pass is doing incremental scanning then it
 should be getting the info from DF_REG_DEF_COUNT and
 DF_REG_USE_COUNT.  */
 and not use REG_N_SETS etc. but instead the df stuff.
 Which begs the question, how exactly is combine utilizing the
 regstat_n_* structures and is that use even valid for combine?

I'll take a look at that.

Cesar



RE: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Paulo Matos
For me it worked with:
(add-hook 'c-mode-hook 'gcc-c-mode)
(defun gcc-c-mode()
  ;; set gnu style.
  (c-set-style gnu)
  ;; TAB offset set to 2
  (setq c-basic-offset 2)
  ;; Tabs should be 8
  (setq indent-tabs-mode t)
  )

Then I simply edit all my software as if it was GCC (haven't found a way to say 
: 'only use this for GCC sources'). I am just missing a good mode for .md files.

Paulo Matos


 -Original Message-
 From: David Malcolm [mailto:dmalc...@redhat.com]
 Sent: 16 October 2013 16:33
 To: Paulo Matos
 Cc: Jakub Jelinek; Richard Biener; Paolo Carlini; gcc-patches@gcc.gnu.org
 Subject: Teaching emacs about GCC coding conventions (was Re: [PATCH]
 tree_code_name wrapper)
 
 On Tue, 2013-10-15 at 13:19 +, Paulo Matos wrote:
   -Original Message-
   From: Jakub Jelinek [mailto:ja...@redhat.com]
   Sent: 15 October 2013 10:51
   To: Paulo Matos
   Cc: Richard Biener; Paolo Carlini; gcc-patches@gcc.gnu.org
   Subject: Re: [PATCH] tree_code_name wrapper
  
   On Tue, Oct 15, 2013 at 09:42:17AM +, Paulo Matos wrote:
Thanks, regarding the indentation I was convinced we used 2 space
indentation with maximum line length of 80 characters.
  
   We do, however 8 consecutive spaces in the indentation should be always
   replaced by a tab.
  
 
  This means that every sequence of 8 spaces should be converted into tabs? 
  So,
 if we indent something 4 times, that becomes a tab instead of 4 times 2 
 spaces.
 
  Just to confirm so I can setup my emacs to use this.
 
 Probably a silly question, but what's the invocation in emacs to do
 this, and is it possible to set up the source tree so that this happens
 automatically for everyone?
 
 (I've been using whitespace-mode and manually copying-and-pasting tab
 characters, which is clearly suboptimal).
 



Re: [wide-int] int_traits tree

2013-10-16 Thread Kenneth Zadeck

On 10/16/2013 09:55 AM, Richard Biener wrote:

Speaking in terms of a patch:

Index: tree.h
===
--- tree.h  (revision 203690)
+++ tree.h  (working copy)
@@ -5184,14 +5184,11 @@ wi::int_traits const_tree::decompose (
   / HOST_BITS_PER_WIDE_INT);
unsigned int xprecision = get_precision (x);
  
-  gcc_assert (precision = xprecision);

+  gcc_checking_assert (precision = xprecision  precision != 0);
  
-  /* Got to be careful of precision 0 values.  */

-  if (precision)
-len = MIN (len, max_len);
if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED)
  {
-  unsigned int small_prec = precision  (HOST_BITS_PER_WIDE_INT - 1);
+  unsigned int small_prec = xprecision  (HOST_BITS_PER_WIDE_INT -
1);
if (small_prec)
 {
   /* We have to futz with this because the canonization for
@@ -5201,7 +5198,7 @@ wi::int_traits const_tree::decompose (
 {
   for (unsigned int i = 0; i  len - 1; i++)
 scratch[i] = val[i];
- scratch[len - 1] = sext_hwi (val[len - 1], precision);
+ scratch[len - 1] = sext_hwi (val[len - 1], small_prec);
   return wi::storage_ref (scratch, len, precision);
 }
 }

the precision 0 thing is gone as far as I understand?
There is still the case where the front ends create types with precision 
0, and that causes a few odd cases, but the precision 0 that you refer 
to is gone.

Also sign-extending the upper hwi must be done with precision %
HOST_BITS_PER_WIDE_INT, no?  And it assumes that we only ever
have to extend the upper most HWI, thus 'len' is never too big.

yes

I note that we immediately return in the above case, so

   if (precision  xprecision + HOST_BITS_PER_WIDE_INT)
 {
   len = wi::force_to_size (scratch, val, len, xprecision,
precision, UNSIGNED);
   return wi::storage_ref (scratch, len, precision);
 }

applies only when the desired precision is a multiple of a HWI.

yes

I assume it adds that extra zero word in case the MSB is set,
right?

yes

I am confused about the condition written here and
how we look at precision and not xprecision when deciding how
to extend - given a 8-bit precision unsigned 0xff and precision == 64
we do not even consider sign-extending the value because we look
at precision and not xprecision.  Don't we need to look at
xprecision here?

I do not think so.   There are three cases here:

1) xprecision  precision:
say xprecision = 8 and precision = 32.

The number is between 0 and 255 and after canonicalization the number 
will be between 0 and 255.


2) xprecision == precision
not much to talk about here.

3) xprecision  precision

We need to loose the upper bits of the input and it does not matter what 
they were.The only thing that we are concerned about is that the 
bits above what is now the sign bit pointed to by precision are matched 
in the bits above precision.



I do think we could change is that the if (precision  xprecision + 
HWI) to become else if (...), which will save a test.

After all an assert precision == xprecision
does not work in this routine.

Quickly execute.exp tested only.

To avoid code quality wreckage we have to implement a different way
of allocating the 'scratch' storage of wide_int_ref_storage
(well, ideally we wouldn't need it ...).  I thought of simply
allocating it via alloca (), but we need to do that in all
callers that build a wide_int_ref (eventually via a hidden
default arg).  But as that's a template specialization of
generic_wide_int ... so the option is to provide a function
wrapper inside wi:: for this, like
I want richard and mike to be the people who respond to the next 
point.I am not a c++ person and all of this storage manager layer 
stuff gives me a headache.



template typename T
wide_int_ref
wi::ref (const T t, HOST_WIDE_INT *scratch = XALLOCAVEC (HOST_WIDE_INT,
WIDE_INT_MAX_ELTS))
{
   return wide_int_ref (t, scratch);
}

and amend the storage constructors to get a scratch argument.

The reason for all this is that otherwise the wide_int_ref object
escapes when we pass the scratch storage to the workers in
int_traits const_tree.

Thanks,
Richard.




Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Andreas Schwab
Paulo Matos pma...@broadcom.com writes:

 (haven't found a way to say : 'only use this for GCC sources')

You could put it in .dir-locals.el, or put a hook on find-file-hook that
looks at buffer-file-name.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


[PATCH i386 AVX2] Remove redundant expands.

2013-10-16 Thread Kirill Yukhin
Hello,

It seems that gang of AVX* patterns were copy and
pasted from SSE, however as far as they are NDD,
we may remove corresponding expands which sort operands.

ChangeLog:
* config/i386/sse.md (vec_widen_umult_even_v8si): Remove expand,
make insn visible, remove redundant check.
(vec_widen_smult_even_v8si): Ditto.
(avx2_pmaddwd): Ditto.
(avx2_eqmode3): Ditto.
(avx512f_eqmode3): Ditto.

Bootrstrap pass.
All AVX* tests pass.

Is it ok to commit to main trunk?

--
Thanks, K

---
 gcc/ChangeLog  |   9 
 gcc/config/i386/sse.md | 119 ++---
 2 files changed, 23 insertions(+), 105 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ebaa3e0..b25d8eb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2013-10-16  Kirill Yukhin  kirill.yuk...@intel.com
+
+   * config/i386/sse.md (vec_widen_umult_even_v8si): Remove expand,
+   make insn visible, remove redundant check.
+   (vec_widen_smult_even_v8si): Ditto.
+   (avx2_pmaddwd): Ditto.
+   (avx2_eqmode3): Ditto.
+   (avx512f_eqmode3): Ditto.
+
 2013-10-16  Yvan Roux  yvan.r...@linaro.org
 
* config/arm/arm.opt (mlra): New option.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 2046dd5..57e4c2b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6067,23 +6067,7 @@
(set_attr prefix orig,vex)
(set_attr mode sseinsnmode)])
 
-(define_expand vec_widen_umult_even_v8si
-  [(set (match_operand:V4DI 0 register_operand)
-   (mult:V4DI
- (zero_extend:V4DI
-   (vec_select:V4SI
- (match_operand:V8SI 1 nonimmediate_operand)
- (parallel [(const_int 0) (const_int 2)
-(const_int 4) (const_int 6)])))
- (zero_extend:V4DI
-   (vec_select:V4SI
- (match_operand:V8SI 2 nonimmediate_operand)
- (parallel [(const_int 0) (const_int 2)
-(const_int 4) (const_int 6)])]
-  TARGET_AVX2
-  ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);)
-
-(define_insn *vec_widen_umult_even_v8si
+(define_insn vec_widen_umult_even_v8si
   [(set (match_operand:V4DI 0 register_operand =x)
(mult:V4DI
  (zero_extend:V4DI
@@ -6096,7 +6080,7 @@
  (match_operand:V8SI 2 nonimmediate_operand xm)
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])]
-  TARGET_AVX2  ix86_binary_operator_ok (MULT, V8SImode, operands)
+  TARGET_AVX2
   vpmuludq\t{%2, %1, %0|%0, %1, %2}
   [(set_attr type sseimul)
(set_attr prefix vex)
@@ -6137,28 +6121,12 @@
(set_attr prefix orig,vex)
(set_attr mode TI)])
 
-(define_expand vec_widen_smult_even_v8si
-  [(set (match_operand:V4DI 0 register_operand)
-   (mult:V4DI
- (sign_extend:V4DI
-   (vec_select:V4SI
- (match_operand:V8SI 1 nonimmediate_operand)
- (parallel [(const_int 0) (const_int 2)
-(const_int 4) (const_int 6)])))
- (sign_extend:V4DI
-   (vec_select:V4SI
- (match_operand:V8SI 2 nonimmediate_operand)
- (parallel [(const_int 0) (const_int 2)
-(const_int 4) (const_int 6)])]
-  TARGET_AVX2
-  ix86_fixup_binary_operands_no_copy (MULT, V8SImode, operands);)
-
-(define_insn *vec_widen_smult_even_v8si
+(define_insn vec_widen_smult_even_v8si
   [(set (match_operand:V4DI 0 register_operand =x)
(mult:V4DI
  (sign_extend:V4DI
(vec_select:V4SI
- (match_operand:V8SI 1 nonimmediate_operand x)
+ (match_operand:V8SI 1 nonimmediate_operand %x)
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])))
  (sign_extend:V4DI
@@ -6166,7 +6134,7 @@
  (match_operand:V8SI 2 nonimmediate_operand xm)
  (parallel [(const_int 0) (const_int 2)
 (const_int 4) (const_int 6)])]
-  TARGET_AVX2  ix86_binary_operator_ok (MULT, V8SImode, operands)
+  TARGET_AVX2
   vpmuldq\t{%2, %1, %0|%0, %1, %2}
   [(set_attr isa avx)
(set_attr type sseimul)
@@ -6210,41 +6178,7 @@
(set_attr prefix orig,vex)
(set_attr mode TI)])
 
-(define_expand avx2_pmaddwd
-  [(set (match_operand:V8SI 0 register_operand)
-   (plus:V8SI
- (mult:V8SI
-   (sign_extend:V8SI
- (vec_select:V8HI
-   (match_operand:V16HI 1 nonimmediate_operand)
-   (parallel [(const_int 0) (const_int 2)
-  (const_int 4) (const_int 6)
-  (const_int 8) (const_int 10)
-  (const_int 12) (const_int 14)])))
-   (sign_extend:V8SI
- (vec_select:V8HI
-   (match_operand:V16HI 2 nonimmediate_operand)
-   (parallel [(const_int 0) (const_int 2)
-  

Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.

2013-10-16 Thread Kirill Yukhin
Hello,
On 15 Oct 08:46, Richard Henderson wrote:
 On 10/09/2013 03:31 AM, Kirill Yukhin wrote:
  +  rtx op1 = operands[1];
  +  if (REG_P (op1))
  +op1 = gen_rtx_REG (V16HImode, REGNO (op1));
  +  else
  +op1 = gen_lowpart (V16HImode, op1);
 
 The IF case is incorrect.  You need to use gen_lowpart always.
I suspect gen_lowpart is bad turn when reload is completed, as
far as it can create new pseudo. gen_lowpart () may call
gen_reg_rtx (), which contain corresponging gcc_assert ().

I've rewrote this pattern w/o explicit insn emit:
(define_insn_and_split vec_extract_lo_v32hi
  [(set (match_operand:V16HI 0 nonimmediate_operand =v,m)
(vec_select:V16HI
  (match_operand:V32HI 1 nonimmediate_operand vm,v)
  (parallel [(const_int 0) (const_int 1)
 (const_int 2) (const_int 3)
 (const_int 4) (const_int 5)
 (const_int 6) (const_int 7)
 (const_int 8) (const_int 9)
 (const_int 10) (const_int 11)
 (const_int 12) (const_int 13)
 (const_int 14) (const_int 15)])))]
  TARGET_AVX512F  !(MEM_P (operands[0])  MEM_P (operands[1]))
  #
   reload_completed
  [(set (match_dup 0) (match_dup 1))]
{
  if (REG_P (operands[1]))
operands[1] = gen_rtx_REG (V16HImode, REGNO (operands[1]));
  else
operands[1] = adjust_address (operands[1], V16HImode, 0);
})

 The second alternative only matches for v/m/1.  While I imagine that it 
 doesn't
 really matter, it might be better to swap the two so that vmovddup gets used
 for the v/v/1 case too.
Done.

 Why do you have separate define_expand and define_insn for this pattern?
Expand removed.

 And is this really the best description for this insn?  It's a store to 
 memory.
  Surely it's better to say that we store V8QI.
I believe it is a bug. We store 8 bytes, no zeroes. Fixed.

 We don't need to use ix86_fixup_binary_operands for any three-operand insn.
 That function is in order to help SSE's two-operand insns.  You can drop the
 define-expand and just keep the define_insn.
Fixed.

 You want a %v for operand 1, to make the operands commutative.
Done.

  +(define_insn *vec_widen_smult_even_v16si
  +  [(set (match_operand:V8DI 0 register_operand =x)
  +(mult:V8DI
 
 Similarly, plus errant x.
Done.

Whole patch below.

Is it ok now?

Bootstrap pass, all AVX* tests pass.

--
Thanks, K

 gcc/config/i386/i386.md   |   5 +
 gcc/config/i386/predicates.md |  40 ++
 gcc/config/i386/sse.md| 873 +-
 3 files changed, 912 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 10ca6cb..e7e9f2d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -831,6 +831,11 @@
 (define_code_attr s [(sign_extend s) (zero_extend u)])
 (define_code_attr u_bool [(sign_extend false) (zero_extend true)])
 
+;; Used in signed and unsigned truncations.
+(define_code_iterator any_truncate [ss_truncate truncate us_truncate])
+;; Instruction suffix for truncations.
+(define_code_attr trunsuffix [(ss_truncate s) (truncate ) (us_truncate 
us)])
+
 ;; Used in signed and unsigned fix.
 (define_code_iterator any_fix [fix unsigned_fix])
 (define_code_attr fixsuffix [(fix ) (unsigned_fix u)])
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 06b2914..999d8ab 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -752,6 +752,11 @@
   (and (match_code const_int)
(match_test IN_RANGE (INTVAL (op), 6, 7
 
+;; Match 8 to 9.
+(define_predicate const_8_to_9_operand
+  (and (match_code const_int)
+   (match_test IN_RANGE (INTVAL (op), 8, 9
+
 ;; Match 8 to 11.
 (define_predicate const_8_to_11_operand
   (and (match_code const_int)
@@ -762,16 +767,51 @@
   (and (match_code const_int)
(match_test IN_RANGE (INTVAL (op), 8, 15
 
+;; Match 10 to 11.
+(define_predicate const_10_to_11_operand
+  (and (match_code const_int)
+   (match_test IN_RANGE (INTVAL (op), 10, 11
+
+;; Match 12 to 13.
+(define_predicate const_12_to_13_operand
+  (and (match_code const_int)
+   (match_test IN_RANGE (INTVAL (op), 12, 13
+
 ;; Match 12 to 15.
 (define_predicate const_12_to_15_operand
   (and (match_code const_int)
(match_test IN_RANGE (INTVAL (op), 12, 15
 
+;; Match 14 to 15.
+(define_predicate const_14_to_15_operand
+  (and (match_code const_int)
+   (match_test IN_RANGE (INTVAL (op), 14, 15
+
+;; Match 16 to 19.
+(define_predicate const_16_to_19_operand
+  (and (match_code const_int)
+   (match_test IN_RANGE (INTVAL (op), 16, 19
+
 ;; Match 16 to 31.
 (define_predicate const_16_to_31_operand
   (and (match_code const_int)
(match_test IN_RANGE (INTVAL (op), 16, 31
 
+;; Match 20 to 23.
+(define_predicate const_20_to_23_operand
+  (and (match_code const_int)
+   (match_test IN_RANGE (INTVAL (op), 20, 23
+
+;; Match 24 

powerpc support for TARGET_SUPPORTS_WIDE_INT.

2013-10-16 Thread Kenneth Zadeck
This patch makes the PPC the first public target to use the 
CONST_WIDE_INT representation rather than CONST_DOUBLES to represent 
larger integers.   There are two  parts to this patch.   The changes to 
genrecog.c represent fixes that would be needed for any port that 
supports CONST_WIDE_INTS and the rest are changes specific to the rs6000 
port.   Bootstrapped and tested on a ppc64.


Committed as revision 204710.

Kenny
Index: gcc/genrecog.c
===
--- gcc/genrecog.c	(revision 203701)
+++ gcc/genrecog.c	(working copy)
@@ -588,6 +588,7 @@ validate_pattern (rtx pattern, rtx insn,
 		  GET_CODE (src) != PC
 		  GET_CODE (src) != CC0
 		  !CONST_INT_P (src)
+		  !CONST_WIDE_INT_P (src)
 		  GET_CODE (src) != CALL)
 	  {
 	const char *which;
@@ -772,13 +773,14 @@ add_to_sequence (rtx pattern, struct dec
 
 	   We can optimize the generated code a little if either
 	   (a) the predicate only accepts one code, or (b) the
-	   predicate does not allow CONST_INT, in which case it
-	   can match only if the modes match.  */
+	   predicate does not allow CONST_INT or CONST_WIDE_INT,
+	   in which case it can match only if the modes match.  */
 	pred = lookup_predicate (pred_name);
 	if (pred)
 	  {
 		test-u.pred.data = pred;
-		allows_const_int = pred-codes[CONST_INT];
+		allows_const_int = (pred-codes[CONST_INT]
+|| pred-codes[CONST_WIDE_INT]);
 		if (was_code == MATCH_PARALLEL
 		 pred-singleton != PARALLEL)
 		  message_with_line (pattern_lineno,
Index: gcc/recog.c
===
--- gcc/recog.c	(revision 203701)
+++ gcc/recog.c	(working copy)
@@ -1198,7 +1198,8 @@ const_scalar_int_operand (rtx op, enum m
 
 /* Returns 1 if OP is an operand that is a CONST_WIDE_INT of mode
MODE.  This most likely is not as useful as
-   const_scalar_int_operand, but is here for consistancy.  */
+   const_scalar_int_operand since it does not accept CONST_INTs, but
+   is here for consistancy.  */
 int
 const_wide_int_operand (rtx op, enum machine_mode mode)
 {
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md	(revision 203701)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -19,7 +19,7 @@
 
 ;; Return 1 for anything except PARALLEL.
 (define_predicate any_operand
-  (match_code const_int,const_double,const,symbol_ref,label_ref,subreg,reg,mem))
+  (match_code const_int,const_double,const_wide_int,const,symbol_ref,label_ref,subreg,reg,mem))
 
 ;; Return 1 for any PARALLEL.
 (define_predicate any_parallel_operand
@@ -596,7 +596,7 @@ (define_predicate easy_vector_constant_
 
 ;; Return 1 if operand is constant zero (scalars and vectors).
 (define_predicate zero_constant
-  (and (match_code const_int,const_double,const_vector)
+  (and (match_code const_int,const_double,const_wide_int,const_vector)
(match_test op == CONST0_RTX (mode
 
 ;; Return 1 if operand is 0.0.
@@ -790,7 +790,7 @@ (define_predicate logical_operand
 ;; Return 1 if op is a constant that is not a logical operand, but could
 ;; be split into one.
 (define_predicate non_logical_cint_operand
-  (and (match_code const_int,const_double)
+  (and (match_code const_int,const_wide_int)
(and (not (match_operand 0 logical_operand))
 	(match_operand 0 reg_or_logical_cint_operand
 
@@ -1058,7 +1058,7 @@ (define_predicate current_file_function
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate input_operand
   (match_code symbol_ref,const,reg,subreg,mem,
-	   const_double,const_vector,const_int)
+	   const_double,const_wide_int,const_vector,const_int)
 {
   /* Memory is always valid.  */
   if (memory_operand (op, mode))
@@ -1071,8 +1071,7 @@ (define_predicate input_operand
 
   /* Allow any integer constant.  */
   if (GET_MODE_CLASS (mode) == MODE_INT
-   (GET_CODE (op) == CONST_INT
-	  || GET_CODE (op) == CONST_DOUBLE))
+   CONST_SCALAR_INT_P (op))
 return 1;
 
   /* Allow easy vector constants.  */
@@ -,7 +1110,7 @@ (define_predicate input_operand
 ;; Return 1 if this operand is a valid input for a vsx_splat insn.
 (define_predicate splat_input_operand
   (match_code symbol_ref,const,reg,subreg,mem,
-	   const_double,const_vector,const_int)
+	   const_double,const_wide_int,const_vector,const_int)
 {
   if (MEM_P (op))
 {
Index: gcc/config/rs6000/darwin.h
===
--- gcc/config/rs6000/darwin.h	(revision 203701)
+++ gcc/config/rs6000/darwin.h	(working copy)
@@ -421,3 +421,4 @@ do {	\
 /* So far, there is no rs6000_fold_builtin, if one is introduced, then
this will need to be modified similar to the x86 case.  */
 #define TARGET_FOLD_BUILTIN SUBTARGET_FOLD_BUILTIN
+
Index: gcc/config/rs6000/rs6000.c

Re: [PATCH i386 AVX2] Remove redundant expands.

2013-10-16 Thread Uros Bizjak
On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

 It seems that gang of AVX* patterns were copy and
 pasted from SSE, however as far as they are NDD,
 we may remove corresponding expands which sort operands.

 ChangeLog:
 * config/i386/sse.md (vec_widen_umult_even_v8si): Remove expand,
 make insn visible, remove redundant check.
 (vec_widen_smult_even_v8si): Ditto.
 (avx2_pmaddwd): Ditto.
 (avx2_eqmode3): Ditto.
 (avx512f_eqmode3): Ditto.


 -(define_insn *vec_widen_smult_even_v8si
 +(define_insn vec_widen_smult_even_v8si
[(set (match_operand:V4DI 0 register_operand =x)
 (mult:V4DI
   (sign_extend:V4DI
 (vec_select:V4SI
 - (match_operand:V8SI 1 nonimmediate_operand x)
 + (match_operand:V8SI 1 nonimmediate_operand %x)
   (parallel [(const_int 0) (const_int 2)
  (const_int 4) (const_int 6)])))
   (sign_extend:V4DI
 @@ -6166,7 +6134,7 @@
   (match_operand:V8SI 2 nonimmediate_operand xm)
   (parallel [(const_int 0) (const_int 2)
  (const_int 4) (const_int 6)])]
 -  TARGET_AVX2  ix86_binary_operator_ok (MULT, V8SImode, operands)
 +  TARGET_AVX2
vpmuldq\t{%2, %1, %0|%0, %1, %2}
[(set_attr isa avx)

Please also remove the above set_attr, it is not needed when the insn
is constrainted with TARGET_AVX2.

OK with this addition.

Thanks,
Uros.


Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Tom Tromey
Andreas You could put it in .dir-locals.el, or put a hook on
Andreas find-file-hook that looks at buffer-file-name.

We checked in a .dir-locals.el for gdb.  I recommend it for GCC as well.

The gdb one is appended.  You'll at least want to edit the bugzilla URL
to suit.

To make this work, just drop it in gcc/.dir-locals.el.
Emacs will pick it up automatically.

Not sure if the Tcl settings are right for gcc's test suite, but they
are for gdb's.

Tom

;; Emacs settings.
;; Copyright (C) 2012-2013 Free Software Foundation, Inc.

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3 of the License, or
;; (at your option) any later version.

;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see http://www.gnu.org/licenses/.

(
 (tcl-mode . ((tcl-indent-level . 4)
  (tcl-continued-indent-level . 4)
  (indent-tabs-mode . t)))
 (nil . ((bug-reference-url-format . 
http://sourceware.org/bugzilla/show_bug.cgi?id=%s;)))
 (c-mode . ((c-file-style . GNU)
(indent-tabs-mode . t)))
)


Re: RFC: Add of type-demotion pass

2013-10-16 Thread Jeff Law

On 10/16/13 03:31, Richard Biener wrote:

I see two primary effects of type sinking.


Note it was called type demotion ;)
;)  It's a mental block of mine; it's been called type hoisting/sinking 
in various contexts and I see parallels between the code motion 
algorithms and how the type promotion/demotion exposes unnecessary type 
conversions.  So I keep calling them hoisting/sinking.  I'll try to use 
promotion/demotion.







  First and probably the most
important in my mind is by sinking a cast through its uses the various
transformations we already perform are more likely to apply *without*
needing to handle optimizing through typecasts explicitly.


I would say it is desirable to express arithmetic in the smallest possible
types (see what premature optimization the C family frontends do
to narrow operations again after C integer promotion applied).
I don't see this as the major benefit of type demotion.  Yes, there is 
some value in shrinking constants and the like, but in my experience the 
benefits are relatively small and often get lost in things like partial 
register stalls on x86, the PA and probably others (yes, the PA has 
partial register stalls, it's just that nobody used that term).


What I really want to get at here is avoiding having a large number of 
optimizers looking back through the use-def chains and attempting to 
elide typecasts in the middle of a chain of statements of interest.




You need some kind of range information to do this, thus either integrate
it into VRP (there is already code that does this there) or use range
information from VRP which we now preserve.
If the primary goal is to shrink types, then yes, you want to use 
whatever information you can, including VRP.  But that's not the primary 
goal in my mind, at least not at this stage.


There's no reason why this pass couldn't utilize VRP information to 
provide more opportunities to demote types and achieve the goal you 
want.  But I'd consider that a follow-on opportunity.








The second primary effect is, given two casts where the first indirectly
feeds the second (ie, the first feeds some statement, which then feeds the
second cast), if we're able to sink the first cast, we end up with the first
cast directly feeding the second cast.  When this occurs one of the two
casts can often be eliminated.   Sadly, I didn't keep any of those test
files, but I regularly saw them in GCC bootstraps.


This transformation is applied both by fold-const.c and by SSA forwprop
(our GIMPLE combiner).  Doing it in yet another pass looks wrong
(and it isn't type demotion but also can be promotion).
Yes, I know.  And we need to get this back down to a single 
implementation.  I don't much care which of the 3 implementations we 
keep, but it really should just be one and it needs to be reusable.


I probably should have stated this differently -- the second primary 
effect is to expose more cases where type conversions can be eliminated 
via type promotion/demotion.  I don't much care which of the 3 blobs of 
code to eliminate the conversions we use -- I do care that we've got a 
consistent way to promote/demote conversions to expose the unnecessary 
type conversions.




In contrast to the desire of expressing operations in the smallest required
type there is the desire of exposing the effect of PROMOTE_MODE on
GIMPLE instead of only during RTL expansion.  This is because the
truncations (sext and zext) PROMOTE_MODE introduced are
easier to optimize away when range information is available (see the
attempts to address this at RTL expansion time from Kugan from Linaro).
Right.  I'm aware of this work and the problem he's trying to solve and 
have been loosely watching it -- primarily for the persistent VRP 
information.





Similarly, I know there's a type hoisting patch that's also queued up. I
think it should be handled separately as well.


I think we need to paint a picture of the final result - what is the
main objective of the various(?!) passes in question?  Where do
we do the same kind of transformation already?
I thought we'd done this at a high level already.  At the heart of this 
work is to:


  1. Isolate, to the fullest extent possible, code which promotes and
 demotes types.  We have this stuff all over the place right now
 and it's very ad-hoc.

  2. Promote/demote types to allow our optimizers to not concern
 themselves with walking back through type conversions when applying
 optimizations.

  3. Promote/demote types to expose unnecessary type conversions.


If we look at #2 and #3 we can expect that we'd want a structure which 
allows for a simplification/optimization step to occur after types are 
promoted or demoted.  ie, a pipeline that looks like:


promote types - optimize1 - demote types - optimize2

Now where that little mini pipeline lands is still a big question to me. 
 optimize1 may be a fairly significant hunk of our pipeline.  optimize2 
probably isn't (may just be a final 

Re: [PATCH i386 AVX2] Remove redundant expands.

2013-10-16 Thread Uros Bizjak
On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

 It seems that gang of AVX* patterns were copy and
 pasted from SSE, however as far as they are NDD,
 we may remove corresponding expands which sort operands.

OTOH, I have some second thoughts on removing AVX2 expanders.

Please consider the situation, where we have *both* operands in
memory, and the insn is inside the loop. When reload comes around, it
will fixup one of the operands with a load from memory. However,
having insn in the loop, I suspect the load won't be moved out of
loop.

So, I guess even AVX/AVX2 insn patterns should call
ix86_fixup_binary_operands_*, and this fixup function should be
improved to load one of the operands into register, in case both
operands are in memory.

This also means, that you still need expanders for AVX512 commutative
multiplies.

Uros.


Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.

2013-10-16 Thread Richard Henderson
On 10/16/2013 09:07 AM, Kirill Yukhin wrote:
 I suspect gen_lowpart is bad turn when reload is completed, as
 far as it can create new pseudo. gen_lowpart () may call
 gen_reg_rtx (), which contain corresponging gcc_assert ().

False.  gen_lowpart is perfectly safe post-reload.
Indeed, taking the subreg of a hard register should arrive

  x = gen_rtx_REG_offset (op, outermode, final_regno, final_offset);

in simplify_subreg.

Have you encountered some specific problem with gen_lowpart?


r~



Re: [PATCH i386 AVX2] Remove redundant expands.

2013-10-16 Thread Richard Henderson
On 10/16/2013 09:47 AM, Uros Bizjak wrote:
 On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin kirill.yuk...@gmail.com 
 wrote:
 
 It seems that gang of AVX* patterns were copy and
 pasted from SSE, however as far as they are NDD,
 we may remove corresponding expands which sort operands.
 
 OTOH, I have some second thoughts on removing AVX2 expanders.
 
 Please consider the situation, where we have *both* operands in
 memory, and the insn is inside the loop. When reload comes around, it
 will fixup one of the operands with a load from memory. However,
 having insn in the loop, I suspect the load won't be moved out of
 loop.
 
 So, I guess even AVX/AVX2 insn patterns should call
 ix86_fixup_binary_operands_*, and this fixup function should be
 improved to load one of the operands into register, in case both
 operands are in memory.
 
 This also means, that you still need expanders for AVX512 commutative
 multiplies.
 
 Uros.
 

Fair enough.


r~


Re: [C PATCH] Warn for _Alignas in an array declarator (PR c/58267)

2013-10-16 Thread Joseph S. Myers
On Wed, 16 Oct 2013, Marek Polacek wrote:

 Regtested/bootstrapped on x86_64-linux, ok for trunk?
 
 2013-10-16  Marek Polacek  pola...@redhat.com
 
   PR c/58267
 c/
   * c-parser.c (c_parser_declspecs): Add alignspec_ok parameter.
   Document syntax of the array-declarator.
   (c_parser_declspecs) RID_ALIGNAS: Bail out if alignment specs
   are not permitted.
   (c_parser_declaration_or_fndef): Adjust c_parser_declspecs call.
   (c_parser_struct_declaration): Likewise.
   (c_parser_declarator): Likewise.
   (c_parser_direct_declarator_inner): Likewise.
   (c_parser_parameter_declaration): Likewise.
   (c_parser_type_name): Likewise.
 testsuite/
   * gcc.dg/c1x-align-5.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[jit] Revamp of locals handling (changing API)

2013-10-16 Thread David Malcolm
I've committed the following to dmalcolm/jit:

Convert local creation to take a function rather than a context.

Simplify the API by eliminating the separate local type in favor of
just using the lvalue type, as we do for globals.  As well as reducing
the number of API entrypoints, this simplifies client code by eliminating
various casts from local to lvalue.

To port from the old API:

  s/gcc_jit_context_new_local/gcc_jit_function_new_local/g
  s/gcc_jit_local_as_rvalue/gcc_jit_lvalue_as_rvalue/g
  s/gcc_jit_local/gcc_jit_lvalue/g

and eliminate calls to gcc_jit_local_as_lvalue, and update the arguments
to gcc_jit_function_new_local (was gcc_jit_context_new_local) to reflect
the type change.

gcc/jit/
* TODO.rst (gcc_jit_context_new_local): Remove completed item.
* internal-api.c (gcc::jit::context::new_local): Replace with...
(gcc::jit::function::new_local): ...this, and change return type
from (local*) to (lvalue*).
* internal-api.h (gcc::jit::local): Eliminate.
(gcc::jit::context::new_local): Replace with...
(gcc::jit::function::new_local): ...this, and change return type
from (local*) to (lvalue*).
* libgccjit.c (gcc_jit_local): Eliminate.
(gcc_jit_context_new_local): Replace with...
(gcc_jit_function_new_local): ...this, and change return type
from (gcc_jit_local*) to (gcc_jit_lvalue*).
(gcc_jit_local_as_lvalue): Remove.
(gcc_jit_local_as_rvalue): Remove.
* libgccjit.h (gcc_jit_local): Remove.
(gcc_jit_context_new_local): Replace with...
(gcc_jit_function_new_local): ...this, and change return type
from (gcc_jit_local*) to (gcc_jit_lvalue*).
(gcc_jit_local_as_lvalue): Remove.
(gcc_jit_local_as_rvalue): Remove.
* libgccjit.map (gcc_jit_context_new_local): Replace with...
(gcc_jit_function_new_local): ...this.
(gcc_jit_local_as_lvalue): Remove.
(gcc_jit_local_as_rvalue): Remove.

gcc/testsuite/
* jit.dg/test-dot-product.c (code_making_callback): Update for
API changes to locals.
* jit.dg/test-sum-of-squares.c (code_making_callback): Likewise.
---
 gcc/jit/ChangeLog.jit  | 27 ++
 gcc/jit/TODO.rst   |  3 ---
 gcc/jit/internal-api.c | 33 +--
 gcc/jit/internal-api.h | 19 +---
 gcc/jit/libgccjit.c| 36 +-
 gcc/jit/libgccjit.h| 23 +--
 gcc/jit/libgccjit.map  |  4 +---
 gcc/testsuite/ChangeLog.jit|  6 +
 gcc/testsuite/jit.dg/test-dot-product.c| 24 ++--
 gcc/testsuite/jit.dg/test-sum-of-squares.c | 28 +++
 10 files changed, 97 insertions(+), 106 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index da3e75a..54953b6 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,30 @@
+2013-10-16  David Malcolm  dmalc...@redhat.com
+
+   * TODO.rst (gcc_jit_context_new_local): Remove completed item.
+   * internal-api.c (gcc::jit::context::new_local): Replace with...
+   (gcc::jit::function::new_local): ...this, and change return type
+   from (local*) to (lvalue*).
+   * internal-api.h (gcc::jit::local): Eliminate.
+   (gcc::jit::context::new_local): Replace with...
+   (gcc::jit::function::new_local): ...this, and change return type
+   from (local*) to (lvalue*).
+   * libgccjit.c (gcc_jit_local): Eliminate.
+   (gcc_jit_context_new_local): Replace with...
+   (gcc_jit_function_new_local): ...this, and change return type
+   from (gcc_jit_local*) to (gcc_jit_lvalue*).
+   (gcc_jit_local_as_lvalue): Remove.
+   (gcc_jit_local_as_rvalue): Remove.
+   * libgccjit.h (gcc_jit_local): Remove.
+   (gcc_jit_context_new_local): Replace with...
+   (gcc_jit_function_new_local): ...this, and change return type
+   from (gcc_jit_local*) to (gcc_jit_lvalue*).
+   (gcc_jit_local_as_lvalue): Remove.
+   (gcc_jit_local_as_rvalue): Remove.
+   * libgccjit.map (gcc_jit_context_new_local): Replace with...
+   (gcc_jit_function_new_local): ...this.
+   (gcc_jit_local_as_lvalue): Remove.
+   (gcc_jit_local_as_rvalue): Remove.
+
 2013-10-15  David Malcolm  dmalc...@redhat.com
 
* libgccjit.h (gcc_jit_location): Rewrite comment to reflect
diff --git a/gcc/jit/TODO.rst b/gcc/jit/TODO.rst
index feb4d2b..d750b2a 100644
--- a/gcc/jit/TODO.rst
+++ b/gcc/jit/TODO.rst
@@ -1,8 +1,5 @@
 TODOs:
 
-* gcc_jit_context_new_local should take a function, not the context (is
-  it building a global?)
-
 * error-handling:
 * have a client-provided error-handling callback for the context, and
   call it, rather than asserting/crashing etc, to make the API resilient 
and helpful
diff 

Re: [wide-int] int_traits tree

2013-10-16 Thread Mike Stump
On Oct 16, 2013, at 8:47 AM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 To avoid code quality wreckage we have to implement a different way
 of allocating the 'scratch' storage of wide_int_ref_storage

 I want richard and mike to be the people who respond to the next point.I 
 am not a c++ person and all of this storage manager layer stuff gives me a 
 headache.

I don't have a problem with that direction if it improves code-gen...

Go patch committed: Updated 4.8 branch

2013-10-16 Thread Ian Lance Taylor
I committed this patch to the 4.8 branch to bring in the last set of Go
1.1.2 bug fixes and patches.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.

Ian


2013-10-16  Ian Lance Taylor  i...@google.com

Bring in from mainline:

2013-10-11  Chris Manghane  cm...@google.com
* go-gcc.cc (Gcc_backend::function_code_expression): New
function.

2013-10-10  Chris Manghane  cm...@google.com
* go-gcc.cc (Backend::error_function): New function.
(Backend::function): New function.
(Backend::make_function): New function.
(function_to_tree): New function.


Index: libgo/Makefile.am
===
--- libgo/Makefile.am	(revision 203707)
+++ libgo/Makefile.am	(working copy)
@@ -424,6 +424,7 @@ runtime_files = \
 	runtime/go-caller.c \
 	runtime/go-callers.c \
 	runtime/go-can-convert-interface.c \
+	runtime/go-cdiv.c \
 	runtime/go-cgo.c \
 	runtime/go-check-interface.c \
 	runtime/go-construct-map.c \
Index: libgo/runtime/go-signal.c
===
--- libgo/runtime/go-signal.c	(revision 203707)
+++ libgo/runtime/go-signal.c	(working copy)
@@ -399,6 +399,9 @@ sig_tramp_info (int sig, Siginfo *info,
 {
   G *gp;
   M *mp;
+#ifdef USING_SPLIT_STACK
+  void *stack_context[10];
+#endif
 
   /* We are now running on the stack registered via sigaltstack.
  (Actually there is a small span of time between runtime_siginit
@@ -409,7 +412,7 @@ sig_tramp_info (int sig, Siginfo *info,
   if (gp != NULL)
 {
 #ifdef USING_SPLIT_STACK
-  __splitstack_getcontext (gp-stack_context[0]);
+  __splitstack_getcontext (stack_context[0]);
 #endif
 }
 
@@ -432,7 +435,7 @@ sig_tramp_info (int sig, Siginfo *info,
   if (gp != NULL)
 {
 #ifdef USING_SPLIT_STACK
-  __splitstack_setcontext (gp-stack_context[0]);
+  __splitstack_setcontext (stack_context[0]);
 #endif
 }
 }
Index: libgo/runtime/go-cdiv.c
===
--- libgo/runtime/go-cdiv.c	(revision 0)
+++ libgo/runtime/go-cdiv.c	(revision 0)
@@ -0,0 +1,46 @@
+/* go-cdiv.c -- complex division routines
+
+   Copyright 2013 The Go Authors. All rights reserved.
+   Use of this source code is governed by a BSD-style
+   license that can be found in the LICENSE file.  */
+
+/* Calls to these functions are generated by the Go frontend for
+   division of complex64 or complex128.  We use these because Go's
+   complex division expects slightly different results from the GCC
+   default.  When dividing NaN+1.0i / 0+0i, Go expects NaN+NaNi but
+   GCC generates NaN+Infi.  NaN+Infi seems wrong seems the rules of
+   C99 Annex G specify that if either side of a complex number is Inf,
+   the the whole number is Inf, but an operation involving NaN ought
+   to result in NaN, not Inf.  */
+
+__complex float
+__go_complex64_div (__complex float a, __complex float b)
+{
+  if (__builtin_expect (b == 0+0i, 0))
+{
+  if (!__builtin_isinff (__real__ a)
+	   !__builtin_isinff (__imag__ a)
+	   (__builtin_isnanf (__real__ a) || __builtin_isnanf (__imag__ a)))
+	{
+	  /* Pass 1 to nanf to match math/bits.go.  */
+	  return __builtin_nanf(1) + __builtin_nanf(1)*1i;
+	}
+}
+  return a / b;
+}
+
+__complex double
+__go_complex128_div (__complex double a, __complex double b)
+{
+  if (__builtin_expect (b == 0+0i, 0))
+{
+  if (!__builtin_isinf (__real__ a)
+	   !__builtin_isinf (__imag__ a)
+	   (__builtin_isnan (__real__ a) || __builtin_isnan (__imag__ a)))
+	{
+	  /* Pass 1 to nan to match math/bits.go.  */
+	  return __builtin_nan(1) + __builtin_nan(1)*1i;
+	}
+}
+  return a / b;
+}
Index: libgo/runtime/go-make-slice.c
===
--- libgo/runtime/go-make-slice.c	(revision 203707)
+++ libgo/runtime/go-make-slice.c	(working copy)
@@ -34,7 +34,10 @@ __go_make_slice2 (const struct __go_type
   std = (const struct __go_slice_type *) td;
 
   ilen = (intgo) len;
-  if (ilen  0 || (uintptr_t) ilen != len)
+  if (ilen  0
+  || (uintptr_t) ilen != len
+  || (std-__element_type-__size  0
+	   len  MaxMem / std-__element_type-__size))
 runtime_panicstring (makeslice: len out of range);
 
   icap = (intgo) cap;
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 203707)
+++ gcc/go/go-gcc.cc	(working copy)
@@ -232,6 +232,9 @@ class Gcc_backend : public Backend
   Bexpression*
   convert_expression(Btype* type, Bexpression* expr, Location);
 
+  Bexpression*
+  function_code_expression(Bfunction*, Location);
+
   // Statements.
 
   Bstatement*
@@ -334,6 +337,17 @@ class Gcc_backend : public Backend
   Bexpression*
   label_address(Blabel*, Location);
 
+  // Functions.
+
+  Bfunction*
+  error_function()
+  { return this-make_function(error_mark_node); }
+
+  Bfunction*
+  function(Btype* 

Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-10-16 Thread Aldy Hernandez

On 10/02/13 16:53, Jason Merrill wrote:

On 08/27/2013 04:03 PM, Aldy Hernandez wrote:





+  else if (!TREE_TYPE (e) || !TREE_CONSTANT (e)
+   || !INTEGRAL_TYPE_P (TREE_TYPE (e)))
+cp_parser_error (parser,
+ step size must be an integer constant);


Can't the step size be a value-dependent expression like a template
non-type parameter?


[Balaji, this is for the linear clause that is either linear(var) or 
linear(var : step)].


Jason, I'm not sure.  The standard says:

The conditional-expression in a simd-linear-step shall either
satisfy the requirements of an integer constant expression, or
be a reference to a variable with integer type.

I take this to mean, an integral expression or a plain old variable.  Is 
this the case, Balaji?


Jason, are you asking about the validity of something like this:

int *p;

template int argoop
void foobar(int a)
{
  int j = 123;
#pragma simd linear(j : argoop)
  for (int i=0; i  a; ++i)
p[i] = argoop;
}

void funky()
{
  foobar 69 (1000);
}

If this is what you're asking, and the above standardeese allows it, 
then we have a problem, because the code parsing the j in the linear 
clause uses cp_parser_id_expression() which gets horrendously confused 
with the colon, expecting a '::'.  For that matter, even doing this 
causes problems:


int j = 123, k = 456;
#pragma simd linear (j : k)
...

Is this what you were asking about?  If so, then perhaps something 
besides cp_parser_id_expression is in order, for the sake of the 
template argument or just a plain variable.


Aldy


alias fix for PR58685

2013-10-16 Thread Bernd Schmidt
The sequence of events here can be summarized as shrink-wrapping causes
the i386 backend to do something that confuses alias analysis. The
miscompilation is that two instructions are swapped by the scheduler
when they shouldn't be, due to an incorrect reg_base_value.

The miscompiled function has the following parts:
 * a loop which uses %rdi as an incoming argument register
 * following that, a decrement of the stack pointer (shrink-wrapped
   to this point rather than the start of the function)
 * the rest of the function which has one set of %rdi to
   (symbol_ref LC0).

The argument register %rdi is dead by the point where we decrement the
stack pointer. The i386 backend splits the sub into a sequence of two
insns, a clobber of %rdi and a push of it (which register is chosen
appears to be somewhat random).

When called from the scheduler, init_alias_analysis incorrectly deduces
that %rdi has a base value of (symbol_ref LC0). Although it tries to
track which registers are argument registers that may hold a pointer,
this information is lost when we encounter the clobber. The main part of
the following patch is to modify that piece of code to also set reg_seen
for argument registers when encountering a clobber.

There are other problems in this area, one of which showed up while
testing an earlier patch. i386/pr57003.c demonstrates that return values
of FUNCTION_ARG_REGNO are not constant across the whole compilation;
they are affected by the ms_abi attribute. This necessitates changing
from a static_reg_base_value array to a per-function one. Once that is
done, it's better to look at DECL_ARGUMENTS in a similar way to what
combine.c does to identify the arguments of the specific function we're
compiling rather than using the less specific FUNCTION_ARG_REGNO.
Lastly, I modified a test for Pmode to use the valid_pointer_mode hook
which should be a little more correct.

Bootstrapped and tested on x86_64-linux. Ok?


Bernd
	PR rtl-optimization/58685
	* alias.c (static_reg_base_value): Delete macro.
	(reg_base_value_initial): New macro to replace it.  All uses changed.
	(record_set): For a clobber of an arg reg, set reg_seen.
	(setup_initial_base_values): Renamed from init_alias_target and made
	static.  Check the DECL_INCOMING_RTL of the DECL_ARGUMENTS.  Record
	that the array was set up for the current function.
	(init_alias_analysis): Call setup_initial_base_values if necessary.
	* function.h (struct emit_status): Add fields initial_reg_base_value
	and initial_reg_base_value_setup.
	* rtl.h (struct target_rtl): Remove x_static_reg_base_value.
	(init_alias_target): Don't declare.
	* toplev.c (backend_init_target): Don't call it.

	PR rtl-optimization/58685
	* gcc.c-torture/execute/pr58685.c: New test.

diff --git a/gcc/alias.c b/gcc/alias.c
index a48bb51..b1189b7 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -229,8 +229,7 @@ static GTY((deletable)) vecrtx, va_gc *old_reg_base_value;
 #define UNIQUE_BASE_VALUE_FP	-3
 #define UNIQUE_BASE_VALUE_HFP	-4
 
-#define static_reg_base_value \
-  (this_target_rtl-x_static_reg_base_value)
+#define reg_base_value_initial (crtl-emit.initial_reg_base_value)
 
 #define REG_BASE_VALUE(X)	\
   (REGNO (X)  vec_safe_length (reg_base_value)			\
@@ -1306,6 +1305,20 @@ record_set (rtx dest, const_rtx set, void *data ATTRIBUTE_UNUSED)
 	 set).  */
   if (GET_CODE (set) == CLOBBER)
 	{
+	  if (new_reg_base_value[regno] != 0
+	   regno != HARD_FRAME_POINTER_REGNUM
+	   regno != FRAME_POINTER_REGNUM
+	   regno != STACK_POINTER_REGNUM
+	   regno != ARG_POINTER_REGNUM)
+	{
+	  if (!bitmap_bit_p (reg_seen, regno))
+		{
+		  /* We have to make an exception here for an argument
+		 register, in case it is used before the clobber.  */
+		  gcc_assert (new_reg_base_value[regno] == arg_base_value);
+		  bitmap_set_bit (reg_seen, regno);
+		}
+	}
 	  new_reg_base_value[regno] = 0;
 	  return;
 	}
@@ -1693,7 +1706,7 @@ find_base_term (rtx x)
 	return ret;
 
   if (cselib_sp_based_value_p (val))
-	return static_reg_base_value[STACK_POINTER_REGNUM];
+	return reg_base_value_initial[STACK_POINTER_REGNUM];
 
   f = val-locs;
   /* Temporarily reset val-locs to avoid infinite recursion.  */
@@ -1795,7 +1808,7 @@ bool
 may_be_sp_based_p (rtx x)
 {
   rtx base = find_base_term (x);
-  return !base || base == static_reg_base_value[STACK_POINTER_REGNUM];
+  return !base || base == reg_base_value_initial[STACK_POINTER_REGNUM];
 }
 
 /* Return 0 if the addresses X and Y are known to point to different
@@ -2809,34 +2822,34 @@ may_alias_p (const_rtx mem, const_rtx x)
   return rtx_refs_may_alias_p (x, mem, false);
 }
 
-void
-init_alias_target (void)
+static void
+setup_initial_base_values (void)
 {
-  int i;
-
   if (!arg_base_value)
 arg_base_value = gen_rtx_ADDRESS (VOIDmode, 0);
 
-  memset (static_reg_base_value, 0, sizeof static_reg_base_value);
+  memset (reg_base_value_initial, 0, sizeof reg_base_value_initial);
 
-  for (i = 0; i  

Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) ~(CST2 - CST1)) == 0

2013-10-16 Thread Jeff Law

On 10/16/13 02:21, Richard Biener wrote:

On Tue, Oct 15, 2013 at 6:55 PM, Jeff Law l...@redhat.com wrote:

On 10/15/13 10:53, Jakub Jelinek wrote:


On Tue, Oct 15, 2013 at 10:50:39AM -0600, Jeff Law wrote:


I noticed that we're now including rtl.h and tm_p.h in
tree-ssa-reassoc.c, which seems wrong.



Isn't that required for BRANCH_COST use?
Other option would be to add some dummy wrapper around
BRANCH_COST, put that wrapper into some file that already includes rtl.h
and
tm_p.h and just call it from there.


Yea, looking at it now, I'm a bit surprised this hasn't been converted to a
target hook which would avoid this problem entirely.


You got the job!

Joys :-)

What's the policy on the GDFL stuff.  I've tried so hard to avoid having 
to worry about that rats nest that I have no idea what our policy is. 
Basically I just want to take the old docs for BRANCH_COST and re-use 
those.  Is that considered kosher with all the GDFL nonsense?


Jeff


Re: [patch] combine ICE fix

2013-10-16 Thread Jeff Law

On 10/16/13 09:34, Cesar Philippidis wrote:

On 10/15/13 12:16 PM, Jeff Law wrote:

On 10/10/13 10:25, Jakub Jelinek wrote:

On Thu, Oct 10, 2013 at 07:26:43AM -0700, Cesar Philippidis wrote:

This patch addresses an ICE when building qemu for a Mips target in
Yocto. Both gcc-trunk, gcc-4.8 and all of the targets are potentially
affected. The problem occurs because the instruction combine phase uses
two data structures to keep track of registers, reg_stat and
regstat_n_sets_and_refs, but they potentially end up out of sync; when
combine inserts a new register into reg_stat it does not update
regstat_n_sets_and_refs. Failing to update the latter results in an
occasional segmentation fault.

Is this OK for trunk and gcc-4.8? If so, please check it in. I tested it
on Mips and x86-64 and no regressions showed up.


That looks broken.  You leave everything from the last size till the
current
one uninitialized, so it would only work if combine_split_insns
always increments max_reg_num () by at most one.

I don't think that assumption is safe.  Consider a parallel with a bunch
of (clobber (match_scratch)) expressions.


I address that in the patch posted here
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00802.html. Is that still
insufficient?

Thanks.  I wasn't aware of the follow-up.




Furthermore, there are macros which should be used to access
the fields, and, if the vector is ever going to be resized, supposedly
it should be vec.h vector rather than just array.
Or perhaps take into account:
/* If a pass need to change these values in some magical way or the
 pass needs to have accurate values for these and is not using
 incremental df scanning, then it should use REG_N_SETS and
 REG_N_USES.  If the pass is doing incremental scanning then it
 should be getting the info from DF_REG_DEF_COUNT and
 DF_REG_USE_COUNT.  */
and not use REG_N_SETS etc. but instead the df stuff.

Which begs the question, how exactly is combine utilizing the
regstat_n_* structures and is that use even valid for combine?


I'll take a look at that.

This needs to be resolved before we can go forward with your patch.

jeff



Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-10-16 Thread Jason Merrill

On 10/16/2013 01:48 PM, Aldy Hernandez wrote:

On 10/02/13 16:53, Jason Merrill wrote:



Can't the step size be a value-dependent expression like a template
non-type parameter?



Jason, I'm not sure.  The standard says:

 The conditional-expression in a simd-linear-step shall either
 satisfy the requirements of an integer constant expression, or
 be a reference to a variable with integer type.

I take this to mean, an integral expression or a plain old variable.  Is
this the case, Balaji?


Right.  An integral constant expression can be value-dependent.


If this is what you're asking, and the above standardeese allows it,
then we have a problem, because the code parsing the j in the linear
clause uses cp_parser_id_expression() which gets horrendously confused
with the colon, expecting a '::'.


Sounds like you need to turn off parser-colon_corrects_to_scope_p when 
parsing the pragma arguments.


Jason



RE: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-10-16 Thread Iyer, Balaji V


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Aldy Hernandez
 Sent: Wednesday, October 16, 2013 1:49 PM
 To: Jason Merrill
 Cc: Richard Henderson; gcc-patches; Iyer, Balaji V
 Subject: Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk
 
 On 10/02/13 16:53, Jason Merrill wrote:
  On 08/27/2013 04:03 PM, Aldy Hernandez wrote:
 
 
  +  else if (!TREE_TYPE (e) || !TREE_CONSTANT (e)
  +   || !INTEGRAL_TYPE_P (TREE_TYPE (e)))
  +cp_parser_error (parser,
  + step size must be an integer constant);
 
  Can't the step size be a value-dependent expression like a template
  non-type parameter?
 
 [Balaji, this is for the linear clause that is either linear(var) or 
 linear(var : step)].
 
 Jason, I'm not sure.  The standard says:
 
   The conditional-expression in a simd-linear-step shall either
   satisfy the requirements of an integer constant expression, or
   be a reference to a variable with integer type.
 
 I take this to mean, an integral expression or a plain old variable.  Is this 
 the
 case, Balaji?
 
Yup.


 Jason, are you asking about the validity of something like this:
 
 int *p;
 
 template int argoop
 void foobar(int a)
 {
int j = 123;
 #pragma simd linear(j : argoop)
for (int i=0; i  a; ++i)
  p[i] = argoop;
 }
 
 void funky()
 {
foobar 69 (1000);
 }
 
 If this is what you're asking, and the above standardeese allows it, then we 
 have
 a problem, because the code parsing the j in the linear clause uses
 cp_parser_id_expression() which gets horrendously confused with the colon,
 expecting a '::'.  For that matter, even doing this causes problems:
 

In parser structure, there is a variable that called colon_corrects_to_scope_p. 
Set that to false right before searching for a colon. That should fix this 
looking for :: issue.

   int j = 123, k = 456;
   #pragma simd linear (j : k)
   ...
 
 Is this what you were asking about?  If so, then perhaps something besides
 cp_parser_id_expression is in order, for the sake of the template argument or
 just a plain variable.
 

Thanks,

Balaji V. Iyer.


 Aldy


Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Mike Stump
On Oct 16, 2013, at 9:26 AM, Tom Tromey tro...@redhat.com wrote:
 Andreas You could put it in .dir-locals.el, or put a hook on
 Andreas find-file-hook that looks at buffer-file-name.

 We checked in a .dir-locals.el for gdb.  I recommend it for GCC as well.

I've copied the gdb one and updated it for our tree:

$ cat .dir-locals.el
;; Emacs settings.
;; Copyright (C) 2012-2013 Free Software Foundation, Inc.

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3 of the License, or
;; (at your option) any later version.

;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with this program.  If not, see http://www.gnu.org/licenses/.

(
 (tcl-mode . ((tcl-indent-level . 4)
  (tcl-continued-indent-level . 4)
  (indent-tabs-mode . t)))
 (nil . ((bug-reference-url-format . http://gcc.gnu.org/PR%s;)))
 (c-mode . ((c-file-style . GNU)
(indent-tabs-mode . t)))
)

Committed revision 203715.

I've tested it to ensure everything works as expected, except I didn't see 
exactly how to test the bug-reference…  If someone knows how to test that, I'd 
appreciate it.  If I understand the net result of this, it should make the 
lives of people that work on multiple trees with different formatting easier.

Thanks guys.

Now, if you've been manually fidgeting with tab characters, you should be able 
to just hit tab on the line your editing and have it do the right thing (if 
your emacs isn't really old).

Re: [PATCH i386 AVX2] Remove redundant expands.

2013-10-16 Thread Uros Bizjak
On Wed, Oct 16, 2013 at 7:01 PM, Richard Henderson r...@redhat.com wrote:
 On 10/16/2013 09:47 AM, Uros Bizjak wrote:
 On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin kirill.yuk...@gmail.com 
 wrote:

 It seems that gang of AVX* patterns were copy and
 pasted from SSE, however as far as they are NDD,
 we may remove corresponding expands which sort operands.

 OTOH, I have some second thoughts on removing AVX2 expanders.

 Please consider the situation, where we have *both* operands in
 memory, and the insn is inside the loop. When reload comes around, it
 will fixup one of the operands with a load from memory. However,
 having insn in the loop, I suspect the load won't be moved out of
 loop.

 So, I guess even AVX/AVX2 insn patterns should call
 ix86_fixup_binary_operands_*, and this fixup function should be
 improved to load one of the operands into register, in case both
 operands are in memory.

 This also means, that you still need expanders for AVX512 commutative
 multiplies.

 Fair enough.

I have checked ix86_fixup_binary_operands and ix86_binary_operator_ok,
and they should be OK for destructive (SSE2) and non-destructive (AVX)
commutative instructions. Vector instructions have always destination
in a register, so most of fixups and checks do not apply at all. We
can probably use:

{
  if (MEM_P (operands[1])  MEM_P (operands[2]))
operands[1] = force_reg (MODEmode, operands[1]);
}

in expanders and

!(MEM_P (operands[0])  MEM_P (operands[1]))

in insn constraints for most of SSE and AVX commutative insns.

Uros.


Re: [wide-int] int_traits tree

2013-10-16 Thread Richard Sandiford
Kenneth Zadeck zad...@naturalbridge.com writes:
 I note that we immediately return in the above case, so

if (precision  xprecision + HOST_BITS_PER_WIDE_INT)
  {
len = wi::force_to_size (scratch, val, len, xprecision,
 precision, UNSIGNED);
return wi::storage_ref (scratch, len, precision);
  }

 applies only when the desired precision is a multiple of a HWI.
 yes
 I assume it adds that extra zero word in case the MSB is set,
 right?
 yes
 I am confused about the condition written here and
 how we look at precision and not xprecision when deciding how
 to extend - given a 8-bit precision unsigned 0xff and precision == 64
 we do not even consider sign-extending the value because we look
 at precision and not xprecision.  Don't we need to look at
 xprecision here?
 I do not think so.   There are three cases here:

 1) xprecision  precision:
 say xprecision = 8 and precision = 32.

 The number is between 0 and 255 and after canonicalization the number 
 will be between 0 and 255.

 2) xprecision == precision
 not much to talk about here.

 3) xprecision  precision

 We need to loose the upper bits of the input and it does not matter what 
 they were.The only thing that we are concerned about is that the 
 bits above what is now the sign bit pointed to by precision are matched 
 in the bits above precision.

(3) is gone with the assert though.

As mentioned in my message yesterday, I thought your new way of canonising
unsigned tree constants meant that there was always an upper zero bit.
Is that right?

If so, xprecision  precision is a no-op, because the number always
has the right form for wider precisions.  The only difficult case is
xprecision == precision, since then we need to peel off any upper -1 HWIs.

If that's right and xprecision == precision can use val with an adjusted len,
then...

 After all an assert precision == xprecision
 does not work in this routine.

 Quickly execute.exp tested only.

 To avoid code quality wreckage we have to implement a different way
 of allocating the 'scratch' storage of wide_int_ref_storage
 (well, ideally we wouldn't need it ...).  I thought of simply
 allocating it via alloca (), but we need to do that in all
 callers that build a wide_int_ref (eventually via a hidden
 default arg).  But as that's a template specialization of
 generic_wide_int ... so the option is to provide a function
 wrapper inside wi:: for this, like
 I want richard and mike to be the people who respond to the next 
 point.I am not a c++ person and all of this storage manager layer 
 stuff gives me a headache.

...the use of the scratch array goes away completely for addr_wide_int
and max_wide_int.  If this code is on a hot path for wide_int then maybe
we should simply require that wide_int operations involving trees have
explicit extensions, like in rtl.  (I.e. only do the implicit extension
for addr_wide_int and max_wide_int.)

I'm not opposed to going back to a separate scratch array if all else fails,
but it's really ugly, so I'd like to look at the alternatives first...

Thanks,
Richard


Merge from GCC 4.8 branch to gccgo branch

2013-10-16 Thread Ian Lance Taylor
I've merged revision 203713 from the GCC 4.8 branch to the gccgo branch.

Ian


Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5

2013-10-16 Thread Michael Meissner
On Mon, Oct 14, 2013 at 08:59:50PM -0400, David Edelsohn wrote:
 Please use VMX instead of AV, not because I am pedantic about
 IBM's name for the feature, but AV is meaningless to me and anyone
 else reading the code. If you need a three-letter name, use VMX.
 
 Okay with that change.

I can change it to VMX, but VMX is not referenced very much in the port.  For
better or worse, Altivec is used everywhere.  So I think I will just spell it
out.

I had used _av previously in rs6000_output_move_128bit.  Did you want me to
clean up those places as well?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] decide edge's hotness when there is profile info

2013-10-16 Thread Teresa Johnson
On Tue, Oct 15, 2013 at 3:39 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Oct 15, 2013 at 2:57 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On Tue, Oct 15, 2013 at 2:03 PM, Jan Hubicka hubi...@ucw.cz wrote:
  Yes, that would work. So let's discard this patch because the fix for
  comdat can also fix this problem.
 
  Unforutnately ipa-profile-estimate is an IPA pass and as such it does
  not have access to profile_status_to_function.

 I see. I was coding that up today but hadn't tested it yet.

  You can probably just factor this out into a function that can be called
  and for normal FDO we call it where the loop stis now and for auto-FDO we 
  can
  probably have another invocation from before early passes where auto-FDO 
  is collected.

 Ok, let's go with that approach for now. It won't address the 0 count
 COMDAT calling another 0 count COMDAT problem, but I will probably
 just find a way to deal with this when inlining.

 You can still propagate, since tree-profile is an simple-ipa pass.

 Ok, I think I will leave that for the second patch, since I need to do
 some testing on the effects - i.e. the propagation I had in mind would
 make any 0-count routine called by a routine that was dropped to
 guessed to also be dropped to guessed (and no longer unlikely). It may
 be too aggressive, I need to check.


   +  if (node-count)
   + continue;
  Also here we should sum the counts and consider function non unlikely 
  executed
  in the same way as probably_never_executed does.

 I assume you mean by doing the same comparison to the number of
 profile-runs. Yes, this makes sense.

 Yes.

 
  I can prepare updated patch, but i am currently travelling, so i would not
  be disapointed if you beat me ;)

 I'm working on it, and I think based on Dehao's needs I am going to
 split up the patch into two phases, the one that does just the part
 you had sent a patch for (making sure 0 count routines with non-zero
 calls are marked guessed and have their node frequency set
 appropriately), and a subsequent one to do the count application when
 we inline a 0-count routine into a non-zero callsite. I'll shoot for
 getting this ready by tomorrow.

 BTW, in your original patch you are checking for both e-count or
 cgraph_maybe_hot_edge_p(e), but AFAICT the call to
 cgraph_maybe_hot_edge_p will never return true when e-count is zero.
 When there is a profile it will return false via maybe_hot_count_p
 since e-count == 0. When there is no profile it will return false
 when the callee has NODE_FREQUENCY_UNLIKELY_EXECUTED. So I think just
 checking for e-count 0 is sufficient here.

 I think I was checking caller count here (that is read) and the code
 was supposed to make functoin with hot caller to be hot...

 The code in your patch was just checking the edge count, not the
 caller count. cgraph_maybe_hot_edge_p also doesn't check the caller
 count, just the edge count. Do we want to make all functions with hot
 callers also be hot, even when the edge count is not hot? This may be
 to aggressive. I was simply going to make the code check e-count.

Here is the patch from Honza, that I revised based on discussions and
testing. Once this related patch goes in, I can change the check against the
profile_info-runs that hardcodes the threshold with the new parameter
proposed for probably_never_executed there:
  http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00743.html

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

Thanks, Teresa

2013-10-16  Jan Hubicka  j...@suse.cz
Teresa Johnson  tejohn...@google.com

* predict.c (handle_missing_profiles): New function.
(counts_to_freqs): Don't overwrite estimated frequencies
when function has no profile counts.
* predict.h (handle_missing_profiles): Declare.
* tree-profile.c (tree_profiling): Invoke handle_missing_profiles.

Index: predict.c
===
--- predict.c   (revision 203633)
+++ predict.c   (working copy)
@@ -2742,6 +2742,39 @@ estimate_loops (void)
   BITMAP_FREE (tovisit);
 }

+void
+handle_missing_profiles (void)
+{
+  struct cgraph_node *node;
+  /* See if 0 count function has non-0 count callers.  In this case we
+ lost some profile.  Drop its function profile to PROFILE_GUESSED.  */
+  FOR_EACH_DEFINED_FUNCTION (node)
+{
+  struct cgraph_edge *e;
+  gcov_type call_count = 0;
+  struct function *fn = DECL_STRUCT_FUNCTION (node-symbol.decl);
+  if (node-count)
+continue;
+  for (e = node-callers; e; e = e-next_caller)
+call_count += e-count;
+  if (call_count
+   fn  fn-cfg
+   (call_count * 4 = profile_info-runs))
+{
+  bool maybe_hot = maybe_hot_count_p (NULL, call_count);
+  if (dump_file)
+fprintf (dump_file,
+ Dropping 0 profile for %s/%i. %s based on calls.\n,
+ cgraph_node_name (node), 

Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Tom Tromey
 Mike == Mike Stump mikest...@comcast.net writes:

Mike I've tested it to ensure everything works as expected, except I
Mike didn't see exactly how to test the bug-reference… If someone knows
Mike how to test that, I'd appreciate it.  If I understand the net
Mike result of this, it should make the lives of people that work on
Mike multiple trees with different formatting easier.

Visit ChangeLog.
Find a reference to a PR.
M-x bug-reference-mode
Click or C-c RET on the PR text

If it pops up the bug in your browser, it works.

There's also bug-reference-prog-mode, for programming modes.
It only fontifies PR references in comments and strings.

Tom


Re: [PATCH, powerpc] PR target/58673: Fix power8 quad memory/no-vsx-timode interaction (revised)

2013-10-16 Thread Michael Meissner
On Mon, Oct 14, 2013 at 08:51:15PM -0400, David Edelsohn wrote:
 As Ken Zadeck and Richard Sandiford noticed this morning, the F
 constraint seems to be a cut-and-paste error because the F
 constraint corresponds to an immediate scalar or vector float value,
 which never will match TImode. Should this constraint be n?
 
 The GCC testsuite apparently is missing a test to ensure that a large
 TImode immediate value can be loaded on targets that support int128_t.

I'll try n.  It probably should be something that allows CONST_DOUBLE ints as
well as CONST_INT.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Reassociate X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) ~(CST2 - CST1)) == 0

2013-10-16 Thread Joseph S. Myers
On Wed, 16 Oct 2013, Jeff Law wrote:

 What's the policy on the GDFL stuff.  I've tried so hard to avoid having to
 worry about that rats nest that I have no idea what our policy is. Basically I
 just want to take the old docs for BRANCH_COST and re-use those.  Is that
 considered kosher with all the GDFL nonsense?

When moving documentation text from the manual into target.def (so it ends 
up in both target.def and tm.texi, with tm.texi.in just having an @hook 
line to show where the regeneration of tm.texi should put the text), CC 
one of the people listed as docstring relicensing maintainers and ask us 
to approve the GPL relicensing in that case.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-10-16 Thread Jeff Law

On 10/15/13 15:30, Wei Mi wrote:




Aren't you just trying to see if we have a comparison feeding the
conditional jump and if they're already adjacent?  Do you actually need to
get the condition code regs to do that test?



Yes, I am trying to see if we have a comparison feeding the
conditional jump and if they're already adjacent. Do you have more
easier way to do that test?
Can't you just look at the last insn in the block and if it's a 
conditional peek at the previous insn and see if it sets CC mode register?


Hmm, I guess that's effectively what you're doing, I guess I was just 
surprised by the need to first get the fixed_condition_code_regs as I 
expected you to just extract them from the conditional jump.   But 
thinking a bit more about it now your solution seems rather clean.





   static void haifa_init_only_bb (basic_block, basic_block);
diff --git a/gcc/sched-rgn.c b/gcc/sched-rgn.c
index e1a2dce..156359e 100644
--- a/gcc/sched-rgn.c
+++ b/gcc/sched-rgn.c
@@ -2443,6 +2443,8 @@ add_branch_dependences (rtx head, rtx tail)
cc0 setters remain at the end because they can't be moved away from
their cc0 user.

+ Predecessors of SCHED_GROUP_P instructions at the end remain at the
end.
+
COND_EXEC insns cannot be moved past a branch (see e.g. PR17808).

Insns setting TARGET_CLASS_LIKELY_SPILLED_P registers (usually
return
@@ -2465,7 +2467,8 @@ add_branch_dependences (rtx head, rtx tail)
   #endif
   || (!reload_completed
sets_likely_spilled (PATTERN (insn)
-|| NOTE_P (insn))
+|| NOTE_P (insn)
+|| (last != 0  SCHED_GROUP_P (last)))
   {
 if (!NOTE_P (insn))
  {


This looks like a straighforward bugfix and probably should go forward
independent of this enhancement.


Ok, I will separate it into another patch.
Go ahead and consider that pre-approved.  Just send it to the list with 
a note that I approved it in this thread.




Jeff




Re: [PING] 3 patches waiting for approval/review

2013-10-16 Thread Jeff Law

On 10/11/13 11:23, Andreas Krebbel wrote:

On 10/10/13 18:41, Jeff Law wrote:

On 10/10/13 04:00, Andreas Krebbel wrote:

On 09/10/13 21:46, Jeff Law wrote:

On 08/21/13 03:21, Andreas Krebbel wrote:

[RFC] Allow functions calling mcount before prologue to be leaf functions
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html

I don't think this is necessarily correct for all targets.  ISTM the
ability to consider a function calling mcount as a leaf needs to be a
property of the target.


We have already profile_before_prologue as a target property. Shouldn't this 
be enough to decide
upon this? When a function calls mcount before the prologue it shouldn't matter 
whether the function
is leaf or not.

I don't think so, I think it'd break the PA's 32 bit ABI, maybe the 64
bit ABI as well.  It's the caller's responsibility to build a mini stack
frame if the function makes any calls.  If the code in the prologue
expander uses leafness to make the decision about whether or not to
allocate the mini frame, then it'd do the wrong thing here.


Since it seems to be about PROFILE_HOOKS vs FUNCTION_PROFILER targets what 
about the following patch:
It's not about PROFILE_HOOKS vs FUNCTION_PROFILER, but the ABI and how 
the backend changes its behavior on the return value of leaf_function_p.


Effectively you want to have functions which are not leaf functions 
(they call mcount) be treated as leaf functions.  I have great concerns 
about the safety of allowing that without exports on each port weighing 
in on its correctness for their port.



I still really feel this should be a target hook that is off by default 
so that the target maintainers can audit their target to ensure it 
operates correctly.


Maybe I'm missing something, so perhaps another approach.  Can you 
explain why you think it is safe to ignore calls to mcount when trying 
to determine if a function is a leaf or not?



BTW, there is a hunk of this patch that can and should go forward now 
rather than later.  Specifically removing the profile_arc_flag part of 
the test.  If you wanted to push that forward immediately, I'd support 
that and you should consider it pre-approved.



jeff



Re: [PATCH v4 04/20] add configury

2013-10-16 Thread Tom Tromey
 Gerald == Gerald Pfeifer ger...@pfeifer.com writes:

Gerald   % mkdir /scratch2/tmp/gerald/OBJ-1014-1920
Gerald   % cd /scratch2/tmp/gerald/OBJ-1014-1920
Gerald   % $GCC_SOURCE/configure --prefix=/home/gerald/something
Gerald   % gmake bootstrap-lean
Gerald   % gmake install

Gerald Wait a minute!  

Gerald It _looks_ like I can get this to succeed using gmake bootstrap 
Gerald and have the very same script fail using gmake bootstrap-lean.

Gerald Can you reproduce that?

Yeah, the bootstrap-lean + install combination fails for me.

Tom


Re: [wide-int] int_traits tree

2013-10-16 Thread Kenneth Zadeck

On 10/16/2013 02:45 PM, Richard Sandiford wrote:

Kenneth Zadeck zad...@naturalbridge.com writes:

I note that we immediately return in the above case, so

if (precision  xprecision + HOST_BITS_PER_WIDE_INT)
  {
len = wi::force_to_size (scratch, val, len, xprecision,
precision, UNSIGNED);
return wi::storage_ref (scratch, len, precision);
  }

applies only when the desired precision is a multiple of a HWI.

yes

I assume it adds that extra zero word in case the MSB is set,
right?

yes

I am confused about the condition written here and
how we look at precision and not xprecision when deciding how
to extend - given a 8-bit precision unsigned 0xff and precision == 64
we do not even consider sign-extending the value because we look
at precision and not xprecision.  Don't we need to look at
xprecision here?

I do not think so.   There are three cases here:

1) xprecision  precision:
say xprecision = 8 and precision = 32.

The number is between 0 and 255 and after canonicalization the number
will be between 0 and 255.

2) xprecision == precision
not much to talk about here.

3) xprecision  precision

We need to loose the upper bits of the input and it does not matter what
they were.The only thing that we are concerned about is that the
bits above what is now the sign bit pointed to by precision are matched
in the bits above precision.

(3) is gone with the assert though.

you seem to be correct here.

As mentioned in my message yesterday, I thought your new way of canonising
unsigned tree constants meant that there was always an upper zero bit.
Is that right?

i believe this is correct.

If so, xprecision  precision is a no-op, because the number always
has the right form for wider precisions.  The only difficult case is
xprecision == precision, since then we need to peel off any upper -1 HWIs.
say my HWI size is 8 bits (just to keep from typing a million 'f's.   if 
i have a 16 bit unsigned number that is all 1s in the tree world it is 3 
hwis

0x00 0xff 0xff.

but inside regular wide int, it would take 1 wide int whose value is 0xff.
inside of max it would be the same as the tree, but then the test 
precision  xprecision + hbpwi never kicks in because precision is 
guaranteed to be huge.


inside of addr_wide_int, i think we tank with the assertion.

the case actually comes up on the ppc because they do a lot of 128 bit 
math.I think i got thru the x86-64 without noticing this.





If that's right and xprecision == precision can use val with an adjusted len,
then...


After all an assert precision == xprecision
does not work in this routine.

Quickly execute.exp tested only.

To avoid code quality wreckage we have to implement a different way
of allocating the 'scratch' storage of wide_int_ref_storage
(well, ideally we wouldn't need it ...).  I thought of simply
allocating it via alloca (), but we need to do that in all
callers that build a wide_int_ref (eventually via a hidden
default arg).  But as that's a template specialization of
generic_wide_int ... so the option is to provide a function
wrapper inside wi:: for this, like

I want richard and mike to be the people who respond to the next
point.I am not a c++ person and all of this storage manager layer
stuff gives me a headache.

...the use of the scratch array goes away completely for addr_wide_int
and max_wide_int.  If this code is on a hot path for wide_int then maybe
we should simply require that wide_int operations involving trees have
explicit extensions, like in rtl.  (I.e. only do the implicit extension
for addr_wide_int and max_wide_int.)

i do not understand how this is true.

I'm not opposed to going back to a separate scratch array if all else fails,
but it's really ugly, so I'd like to look at the alternatives first...

Thanks,
Richard




Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Paulo J. Matos

On 16/10/13 19:32, Mike Stump wrote:


Committed revision 203715.



Awesome!
Can I backport your patch to 4_8?

--
PMatos



Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #5

2013-10-16 Thread David Edelsohn
On Wed, Oct 16, 2013 at 3:00 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 On Mon, Oct 14, 2013 at 08:59:50PM -0400, David Edelsohn wrote:
 Please use VMX instead of AV, not because I am pedantic about
 IBM's name for the feature, but AV is meaningless to me and anyone
 else reading the code. If you need a three-letter name, use VMX.

 Okay with that change.

 I can change it to VMX, but VMX is not referenced very much in the port.  For
 better or worse, Altivec is used everywhere.  So I think I will just spell it
 out.

 I had used _av previously in rs6000_output_move_128bit.  Did you want me to
 clean up those places as well?

Yes, please remove uses of av because I really don't think that it
is meaningful or informative to someone reading the source code. If
you change that use to _vmx, then VMX will be used more in the code.
I'm not going to insist on VMX, but it never will be used more in the
code if we always avoid using it.

Thanks, David


Re: [PATCH, powerpc] PR target/58673: Fix power8 quad memory/no-vsx-timode interaction (revised)

2013-10-16 Thread David Edelsohn
On Wed, Oct 16, 2013 at 3:08 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:

 I'll try n.  It probably should be something that allows CONST_DOUBLE ints 
 as
 well as CONST_INT.

The 'n' constraint matches CONST_INT and CONST_DOUBLE.

- David


Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-16 Thread Jeff Law

On 09/11/13 12:18, Iyer, Balaji V wrote:

Hello Everyone, Couple weeks back, I had submitted a patch for review
that will implement Cilk keywords (_Cilk_spawn and _Cilk_sync) into
the C compiler. I recently finished C++ implementation also. In this
email, I am attaching 2 patches: 1 for C (and the common parts for C
and C++) and 1 for C++. The C++ Changelog is labelled
cp-ChangeLog.cilkplus and the other one is just ChangeLog.cilkplus.
There isn't much changes in the C patch. Only noticeable changes
would be moving functions to the common parts so that C++ can use
them.

It passes all the tests and does not affect  (by affect I mean fail a
passing test or pass a failing one) any of the other tests in the
testsuite directory.

Is this Ok for trunk?
This is not a complete review, but obviously I am starting to look at 
the patch and figured you could start addressing issues as I find them. 
 I usually start by trying to filter out all the stuff that is fairly 
benign, so I'm looking at code in a fairly random order.


You'll also note some items are just questions I'd like you to answer 
and don't (at this time) require any code changes -- they just help me 
understand everything.


I'm also not looking at the C++ bits -- my familiarity with the C++ 
front-end is minimal at best and I'd probably do more harm than good 
reviewing that code.




In gcc/Makefile.in, since the original submission of your patch we have 
integrated automatic dependency generation.  As a result several hunks 
to provide dependencies for cilk.o and update dependencies for other 
objects are no longer needed.  This is true for hte various Make-lang.in 
files as well.


builtins.c:

+  if (flag_enable_cilkplus  (!strcmp (name, __cilkrts_detach)
+  || !strcmp (name, __cilkrts_pop_frame)))
Formatting nit.
  if (fubar
   (com | baz))

in cppbuiltin.c, presumably the __cilk predefine is some kind of version 
#?  I'm going to assume that __clik is the same #define that ICC sets 
up, correct?


In function.h:

+  /* This will indicate whether a function is a cilk function */
+  unsigned int is_cilk_function : 1;

Doesn't this really mean calls into the cilk runtime?



In ira.c:

+   /* We need a frame pointer for all Cilk Plus functions that use
+ Cilk keywords.  */
+   || (flag_enable_cilkplus  cfun-is_cilk_function)
Can you explain to me a bit more why you need a frame pointer?  I'm 
trying to determine if it's best to leave this as-is or have this code 
detect a property in the generated code for the function.  From a 
modularity standpoint it seems pretty gross that we have to peek at this 
within IRA.




In a couple places I saw this comment:
+  /* Cilk keywords currently need to replace some variables that
+ ordinary nested functions do not.  */
+  bool remap_var_for_cilk;
I didn't see anywhere that explained exactly why some variables that do 
not ordinarily need replacing need replacing when cilk is enabled.  If 
it's in the patch somewhere, just point me to it. If not, add 
documentation about why these variables need remapping for cilk.



In gimplify.c:
+  /* Implicit _Cilk_sync must be inserted right before any return statement
+ if there is a _Cilk_spawn in the function.  If the user has provided a
+ _Cilk_sync, the optimizer should remove this duplicate one.  */
+  if (fn_contains_cilk_spawn_p (cfun))
+{
+  tree impl_sync = build0 (CILK_SYNC_STMT, void_type_node);
+  gimplify_and_add (impl_sync, pre_p);
+}
+
Does anything actually ensure we don't have multiple syncs?


What's the thinking behind parsing calls to cilk_spawn as a normal call 
if there's an error?  Referring to this code in gimplify.c:

+   case CILK_SPAWN_STMT:
+ gcc_assert
+   (fn_contains_cilk_spawn_p (cfun)
+ lang_hooks.cilkplus.cilk_detect_spawn_and_unwrap (expr_p));
+ if (!seen_error ())
+   {
+ ret = (enum gimplify_status)
+   lang_hooks.cilkplus.gimplify_cilk_spawn (expr_p, pre_p,
+post_p);
+ break;
+   }
+ /* If errors are seen, then just process it as a CALL_EXPR.  */
+

Meta-question, when we're not in cilk mode, should we be consuming the 
cilk tokens?  I'm not familiar at all with our parser, so I'm not sure 
if we can handle this gracefully.  Though I guess parsing hte token and 
warning/error if not in Cilk mode is probably the best course of action.


Can you take a look at calls.c::special_function_p and determine if we 
need to do something special for spawn here?



More tomorrow...

jeff


Re: [PATCH][i386]Fix PR 57756

2013-10-16 Thread Sriraman Tallam
On Tue, Oct 15, 2013 at 10:54 PM, Alan Modra amo...@gmail.com wrote:
 On Tue, Oct 15, 2013 at 02:45:23PM -0700, Sriraman Tallam wrote:
 I committed this patch after making the above change.

 /src/gcc-virgin/gcc/config/rs6000/rs6000.c: At global scope:
 /src/gcc-virgin/gcc/config/rs6000/rs6000.c:31122:29: error: invalid 
 conversion from ‘void (*)(cl_target_option*)’ to ‘void (*)(cl_target_option*, 
 gcc_options*)’ [-fpermissive]
 /src/gcc-virgin/gcc/config/rs6000/rs6000.c:31122:29: error: invalid 
 conversion from ‘void (*)(cl_target_option*)’ to ‘void (*)(gcc_options*, 
 cl_target_option*)’ [-fpermissive]

This patch fixes it, ok to submit?


* config/rs6000/rs6000.c (rs6000_function_specific_save): New unused parameter.
(rs6000_function_specifc_restore): Ditto.

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c  (revision 203711)
+++ config/rs6000/rs6000.c  (working copy)
@@ -29995,7 +29995,8 @@ rs6000_set_current_function (tree fndecl)
 /* Save the current options */

 static void
-rs6000_function_specific_save (struct cl_target_option *ptr)
+rs6000_function_specific_save (struct cl_target_option *ptr,
+  struct gcc_options *ARG_UNUSED (opts))
 {
   ptr-x_rs6000_isa_flags = rs6000_isa_flags;
   ptr-x_rs6000_isa_flags_explicit = rs6000_isa_flags_explicit;
@@ -30004,7 +30005,8 @@ static void
 /* Restore the current options */

 static void
-rs6000_function_specific_restore (struct cl_target_option *ptr)
+rs6000_function_specific_restore (struct gcc_options *ARG_UNUSED (opts),
+ struct cl_target_option *ptr)
 {
   rs6000_isa_flags = ptr-x_rs6000_isa_flags;
   rs6000_isa_flags_explicit = ptr-x_rs6000_isa_flags_explicit;


Thanks
Sri




 --
 Alan Modra
 Australia Development Lab, IBM


C++ PATCH for c++/57850 (-fdump-translation-unit and PCH)

2013-10-16 Thread Jason Merrill
Early exit from cp_write_global_declarations in the PCH case completely 
broke -fdump-translation-unit; the simple fix is just to do the dump on 
the PCH path as well.


Tested x86_64-pc-linux-gnu, applying to trunk and 4.8.
commit 106091d492179afa32d3042c4c74fb2ae6c440f4
Author: Jason Merrill ja...@redhat.com
Date:   Thu Sep 12 18:33:21 2013 -0400

	PR c++/57850
	* decl2.c (dump_tu): Split out from...
	(cp_write_global_declarations): ...here.  Call it in PCH mode.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 4ebc155..5e5f5e8 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4041,6 +4041,22 @@ handle_tls_init (void)
   expand_or_defer_fn (finish_function (0));
 }
 
+/* The entire file is now complete.  If requested, dump everything
+   to a file.  */
+
+static void
+dump_tu (void)
+{
+  int flags;
+  FILE *stream = dump_begin (TDI_tu, flags);
+
+  if (stream)
+{
+  dump_node (global_namespace, flags  ~TDF_SLIM, stream);
+  dump_end (TDI_tu, stream);
+}
+}
+
 /* This routine is called at the end of compilation.
Its job is to create all the code needed to initialize and
destroy the global aggregates.  We do the destruction
@@ -4071,6 +4087,7 @@ cp_write_global_declarations (void)
   if (pch_file)
 {
   c_common_write_pch ();
+  dump_tu ();
   return;
 }
 
@@ -4457,16 +4474,7 @@ cp_write_global_declarations (void)
 
   /* The entire file is now complete.  If requested, dump everything
  to a file.  */
-  {
-int flags;
-FILE *stream = dump_begin (TDI_tu, flags);
-
-if (stream)
-  {
-	dump_node (global_namespace, flags  ~TDF_SLIM, stream);
-	dump_end (TDI_tu, stream);
-  }
-  }
+  dump_tu ();
 
   if (flag_detailed_statistics)
 {


C++ Patch for core issue 1591 (deduction of array bound from initializer list)

2013-10-16 Thread Jason Merrill
If we're deducing an array type from an initializer list, we can also 
deduce the bound.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c1effcb5fe5b63628a4a7ef75c96d9415ee18f20
Author: Jason Merrill ja...@redhat.com
Date:   Mon Sep 23 06:15:25 2013 -0500

	Core 1591
	* pt.c (unify_array_domain): Split out from unify.
	(unify): Use it for list deduction, too.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index eed648a..a1eb7a2 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -17007,6 +17007,72 @@ unify_pack_expansion (tree tparms, tree targs, tree packed_parms,
   return unify_success (explain_p);
 }
 
+/* Handle unification of the domain of an array.  PARM_DOM and ARG_DOM are
+   INTEGER_TYPEs representing the TYPE_DOMAIN of ARRAY_TYPEs.  The other
+   parameters and return value are as for unify.  */
+
+static int
+unify_array_domain (tree tparms, tree targs,
+		tree parm_dom, tree arg_dom,
+		bool explain_p)
+{
+  tree parm_max;
+  tree arg_max;
+  bool parm_cst;
+  bool arg_cst;
+
+  /* Our representation of array types uses N - 1 as the
+ TYPE_MAX_VALUE for an array with N elements, if N is
+ not an integer constant.  We cannot unify arbitrarily
+ complex expressions, so we eliminate the MINUS_EXPRs
+ here.  */
+  parm_max = TYPE_MAX_VALUE (parm_dom);
+  parm_cst = TREE_CODE (parm_max) == INTEGER_CST;
+  if (!parm_cst)
+{
+  gcc_assert (TREE_CODE (parm_max) == MINUS_EXPR);
+  parm_max = TREE_OPERAND (parm_max, 0);
+}
+  arg_max = TYPE_MAX_VALUE (arg_dom);
+  arg_cst = TREE_CODE (arg_max) == INTEGER_CST;
+  if (!arg_cst)
+{
+  /* The ARG_MAX may not be a simple MINUS_EXPR, if we are
+	 trying to unify the type of a variable with the type
+	 of a template parameter.  For example:
+
+	   template unsigned int N
+	   void f (char () [N]);
+	   int g();
+	   void h(int i) {
+	 char a[g(i)];
+	 f(a);
+	   }
+
+	 Here, the type of the ARG will be int [g(i)], and
+	 may be a SAVE_EXPR, etc.  */
+  if (TREE_CODE (arg_max) != MINUS_EXPR)
+	return unify_vla_arg (explain_p, arg_dom);
+  arg_max = TREE_OPERAND (arg_max, 0);
+}
+
+  /* If only one of the bounds used a MINUS_EXPR, compensate
+ by adding one to the other bound.  */
+  if (parm_cst  !arg_cst)
+parm_max = fold_build2_loc (input_location, PLUS_EXPR,
+integer_type_node,
+parm_max,
+integer_one_node);
+  else if (arg_cst  !parm_cst)
+arg_max = fold_build2_loc (input_location, PLUS_EXPR,
+			   integer_type_node,
+			   arg_max,
+			   integer_one_node);
+
+  return unify (tparms, targs, parm_max, arg_max,
+		UNIFY_ALLOW_INTEGER, explain_p);
+}
+
 /* Deduce the value of template parameters.  TPARMS is the (innermost)
set of template parameters to a template.  TARGS is the bindings
for those template parameters, as determined thus far; TARGS may
@@ -17092,13 +17158,17 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
 	   flag_deduce_init_list)
 	parm = listify (parm);
 
-  if (!is_std_init_list (parm))
+  if (!is_std_init_list (parm)
+	   TREE_CODE (parm) != ARRAY_TYPE)
 	/* We can only deduce from an initializer list argument if the
-	   parameter is std::initializer_list; otherwise this is a
-	   non-deduced context. */
+	   parameter is std::initializer_list or an array; otherwise this
+	   is a non-deduced context. */
 	return unify_success (explain_p);
 
-  elttype = TREE_VEC_ELT (CLASSTYPE_TI_ARGS (parm), 0);
+  if (TREE_CODE (parm) == ARRAY_TYPE)
+	elttype = TREE_TYPE (parm);
+  else
+	elttype = TREE_VEC_ELT (CLASSTYPE_TI_ARGS (parm), 0);
 
   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (arg), i, elt)
 	{
@@ -17121,6 +17191,15 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
    explain_p);
 	}
 
+  if (TREE_CODE (parm) == ARRAY_TYPE)
+	{
+	  /* Also deduce from the length of the initializer list.  */
+	  tree max = size_int (CONSTRUCTOR_NELTS (arg));
+	  tree idx = compute_array_index_type (NULL_TREE, max, tf_none);
+	  return unify_array_domain (tparms, targs, TYPE_DOMAIN (parm),
+ idx, explain_p);
+	}
+
   /* If the std::initializer_listT deduction worked, replace the
 	 deduced A with std::initializer_listA.  */
   if (orig_parm != parm)
@@ -17494,63 +17573,8 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
   RECUR_AND_CHECK_FAILURE (tparms, targs, TREE_TYPE (parm), TREE_TYPE (arg),
 			   strict  UNIFY_ALLOW_MORE_CV_QUAL, explain_p);
   if (TYPE_DOMAIN (parm) != NULL_TREE)
-	{
-	  tree parm_max;
-	  tree arg_max;
-	  bool parm_cst;
-	  bool arg_cst;
-
-	  /* Our representation of array types uses N - 1 as the
-	 TYPE_MAX_VALUE for an array with N elements, if N is
-	 not an integer constant.  We cannot unify arbitrarily
-	 complex expressions, so we eliminate the MINUS_EXPRs
-	 here.  */
-	  parm_max = TYPE_MAX_VALUE (TYPE_DOMAIN (parm));
-	  parm_cst = TREE_CODE (parm_max) == INTEGER_CST;
-	

[Patch] Fix undefined behaviors in regex

2013-10-16 Thread Tim Shen
It's detected by `clang++ -g -std=c++11 -fsanitize=undefined` but not
detected by g++ with flags `g++ -g -std=c++11 -fsanitize=undefined
-lpthread`. Am I on the right way?

-m32 and -m64 tested :)

Thanks!


-- 
Tim Shen


a.patch
Description: Binary data


C++ PATCH to printing 'noexcept' in diagnostics

2013-10-16 Thread Jason Merrill

Printing noexcept (true) is redundant.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 5305dc0cdb0736a0c56ce3d126b4baad0521f096
Author: Jason Merrill ja...@redhat.com
Date:   Thu Oct 3 15:35:11 2013 -0400

	* error.c (dump_exception_spec): Print noexcept rather than
	noexcept (true).

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 0c3cead..3f6f594 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1595,13 +1595,16 @@ dump_exception_spec (cxx_pretty_printer *pp, tree t, int flags)
   if (t  TREE_PURPOSE (t))
 {
   pp_cxx_ws_string (pp, noexcept);
-  pp_cxx_whitespace (pp);
-  pp_cxx_left_paren (pp);
-  if (DEFERRED_NOEXCEPT_SPEC_P (t))
-	pp_cxx_ws_string (pp, uninstantiated);
-  else
-	dump_expr (pp, TREE_PURPOSE (t), flags);
-  pp_cxx_right_paren (pp);
+  if (!integer_onep (TREE_PURPOSE (t)))
+	{
+	  pp_cxx_whitespace (pp);
+	  pp_cxx_left_paren (pp);
+	  if (DEFERRED_NOEXCEPT_SPEC_P (t))
+	pp_cxx_ws_string (pp, uninstantiated);
+	  else
+	dump_expr (pp, TREE_PURPOSE (t), flags);
+	  pp_cxx_right_paren (pp);
+	}
 }
   else if (t)
 {


Small C++ PATCH to apply_late_template_attributes

2013-10-16 Thread Jason Merrill
While reviewing a recent patch I noticed that we weren't using 
attribute_takes_identifier_p here.  Fixed.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 021fe309badd55090a5bc212c44d24b7c23f37af
Author: Jason Merrill ja...@redhat.com
Date:   Fri Oct 11 10:38:23 2013 -0400

	* pt.c (apply_late_template_attributes): Use
	attribute_takes_identifier_p.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a1eb7a2..e9c3ce2 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8610,10 +8610,7 @@ apply_late_template_attributes (tree *decl_p, tree attributes, int attr_flags,
 		 pass it through tsubst.  Attributes like mode, format,
 		 cleanup and several target specific attributes expect it
 		 unmodified.  */
-	  else if (TREE_VALUE (t)
-		TREE_CODE (TREE_VALUE (t)) == TREE_LIST
-		TREE_VALUE (TREE_VALUE (t))
-		identifier_p (TREE_VALUE (TREE_VALUE (t
+	  else if (attribute_takes_identifier_p (TREE_PURPOSE (t)))
 		{
 		  tree chain
 		= tsubst_expr (TREE_CHAIN (TREE_VALUE (t)), args, complain,


Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Mike Stump
On Oct 16, 2013, at 1:51 PM, Paulo J. Matos pa...@matos-sorge.com wrote:
 On 16/10/13 19:32, Mike Stump wrote:
 
 Committed revision 203715.
 
 Awesome!
 Can I backport your patch to 4_8?

First, we like to wait and let patches bake on mainline before considering back 
porting, this has no bake time yet in our tree.  Second, only patches that 
impact quality enough should be back ported.  I tend to think that this one 
does not impact quality enough to worry about.  Also, you should develop on 
trunk, not 4_8.  Arguably, I would say no.  Now, a release manager can always 
review approve it; it should be very, very low risk.

If you have your own private tree, you can also place it in your tree.  If you 
merely have a checkout, you can place it into your checkout.  This should get 
you 90% of the benefit.  Additionally, you can remove customizations from your 
environment that conflict with the style set in that file, and instead, bundle 
such a file with the projects that want a non-default style.  The default style 
works well with GNU software (as long as you don't edit .md files).

Re: [Patch] Fix undefined behaviors in regex

2013-10-16 Thread Paolo Carlini

On 10/16/2013 11:48 PM, Tim Shen wrote:

It's detected by `clang++ -g -std=c++11 -fsanitize=undefined` but not
detected by g++ with flags `g++ -g -std=c++11 -fsanitize=undefined
-lpthread`. Am I on the right way?

-m32 and -m64 tested :)

Thank you!

Paolo.


Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.

2013-10-16 Thread Cong Hou
On Wed, Oct 16, 2013 at 2:02 AM, Richard Biener rguent...@suse.de wrote:
 On Tue, 15 Oct 2013, Cong Hou wrote:

 Thank you for your reminder, Jeff! I just noticed Richard's comment. I
 have modified the patch according to that.

 The new patch is attached.

 (posting patches inline is easier for review, now you have to deal
 with no quoting markers ;))

 Comments inline.

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 8a38316..2637309 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,8 @@
 +2013-10-15  Cong Hou  co...@google.com
 +
 +   * tree-vect-loop-manip.c (vect_loop_versioning): Hoist loop invariant
 +   statement that contains data refs with zero-step.
 +
  2013-10-14  David Malcolm  dmalc...@redhat.com

 * dumpfile.h (gcc::dump_manager): New class, to hold state
 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
 index 075d071..9d0f4a5 100644
 --- a/gcc/testsuite/ChangeLog
 +++ b/gcc/testsuite/ChangeLog
 @@ -1,3 +1,7 @@
 +2013-10-15  Cong Hou  co...@google.com
 +
 +   * gcc.dg/vect/pr58508.c: New test.
 +
  2013-10-14  Tobias Burnus  bur...@net-b.de

 PR fortran/58658
 diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c 
 b/gcc/testsuite/gcc.dg/vect/pr58508.c
 new file mode 100644
 index 000..cb22b50
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/vect/pr58508.c
 @@ -0,0 +1,20 @@
 +/* { dg-do compile } */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */
 +
 +
 +/* The GCC vectorizer generates loop versioning for the following loop
 +   since there may exist aliasing between A and B.  The predicate checks
 +   if A may alias with B across all iterations.  Then for the loop in
 +   the true body, we can assert that *B is a loop invariant so that
 +   we can hoist the load of *B before the loop body.  */
 +
 +void foo (int* a, int* b)
 +{
 +  int i;
 +  for (i = 0; i  10; ++i)
 +a[i] = *b + 1;
 +}
 +
 +
 +/* { dg-final { scan-tree-dump-times hoist 2 vect } } */
 +/* { dg-final { cleanup-tree-dump vect } } */
 diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
 index 574446a..f4fdec2 100644
 --- a/gcc/tree-vect-loop-manip.c
 +++ b/gcc/tree-vect-loop-manip.c
 @@ -2477,6 +2477,92 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
  }


 Note that applying this kind of transform at this point invalidates
 some of the earlier analysis the vectorizer performed (namely the
 def-kind which now effectively gets vect_external_def from
 vect_internal_def).  In this case it doesn't seem to cause any
 issues (we re-compute the def-kind everytime we need it (how wasteful)).

 +  /* Extract load and store statements on pointers with zero-stride
 + accesses.  */
 +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
 +{
 +  /* In the loop body, we iterate each statement to check if it is a load
 +or store.  Then we check the DR_STEP of the data reference.  If
 +DR_STEP is zero, then we will hoist the load statement to the loop
 +preheader, and move the store statement to the loop exit.  */

 We don't move the store yet.  Micha has a patch pending that enables
 vectorization of zero-step stores.

 +  for (gimple_stmt_iterator si = gsi_start_bb (loop-header);
 +  !gsi_end_p (si);)

 While technically ok now (vectorized loops contain a single basic block)
 please use LOOP_VINFO_BBS () to get at the vector of basic-blcoks
 and iterate over them like other code does.


Have done it.



 +   {
 + gimple stmt = gsi_stmt (si);
 + stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
 + struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
 +
 + if (dr  integer_zerop (DR_STEP (dr)))
 +   {
 + if (DR_IS_READ (dr))
 +   {
 + if (dump_enabled_p ())
 +   {
 + dump_printf_loc
 + (MSG_NOTE, vect_location,
 +  hoist the statement to outside of the loop );

 hoisting out of the vectorized loop: 

 + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
 + dump_printf (MSG_NOTE, \n);
 +   }
 +
 + gsi_remove (si, false);
 + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), 
 stmt);

 Note that this will result in a bogus VUSE on the stmt at this point which
 will be only fixed because of implementation details of loop versioning.
 Either get the correct VUSE from the loop header virtual PHI node
 preheader edge (if there is none then the current VUSE is the correct one
 to use) or clear it.


I just cleared the VUSE since I noticed that after the vectorization
pass the correct VUSE is reassigned to the load.



 +   }
 + /* TODO: We also consider vectorizing loops containing zero-step
 +data refs as writes.  For example:

Re: Teaching emacs about GCC coding conventions (was Re: [PATCH] tree_code_name wrapper)

2013-10-16 Thread Mike Stump
On Oct 16, 2013, at 12:03 PM, Tom Tromey tro...@redhat.com wrote:
 Visit ChangeLog.
 Find a reference to a PR.
 M-x bug-reference-mode
 Click or C-c RET on the PR text

Tested, confirmed it works as expected; thanks.


Re: [Patch] Fix undefined behaviors in regex

2013-10-16 Thread Tim Shen
On Wed, Oct 16, 2013 at 6:13 PM, Paolo Carlini paolo.carl...@oracle.com wrote:
 Thank you!

Committed.

Thanks!


-- 
Tim Shen


Re: [Patch] Fix undefined behaviors in regex

2013-10-16 Thread Paolo Carlini

On 10/17/2013 12:29 AM, Tim Shen wrote:

On Wed, Oct 16, 2013 at 6:13 PM, Paolo Carlini paolo.carl...@oracle.com wrote:

Thank you!

Committed.
By the way, please feel free to prepare a reduced testcase for the 
-fsanitize people (Marek, Jakub?)


Thanks,
Paolo.


Re: [Patch] Fix undefined behaviors in regex

2013-10-16 Thread Tim Shen
On Wed, Oct 16, 2013 at 6:33 PM, Paolo Carlini paolo.carl...@oracle.com wrote:
 By the way, please feel free to prepare a reduced testcase for the
 -fsanitize people (Marek, Jakub?)

Here it is.

And we should get undefined behaviors before this commit(r203732).


-- 
Tim Shen


split.cc
Description: Binary data


Re: [Patch] Fix undefined behaviors in regex

2013-10-16 Thread Paolo Carlini

On 10/17/2013 12:39 AM, Tim Shen wrote:

On Wed, Oct 16, 2013 at 6:33 PM, Paolo Carlini paolo.carl...@oracle.com wrote:

By the way, please feel free to prepare a reduced testcase for the
-fsanitize people (Marek, Jakub?)

Here it is.

And we should get undefined behaviors before this commit(r203732).
To be honest, I was thinking something much smaller than the whole 
regex ;) But let's add Marek in CC.


Paolo.


[rl78] add rtx paranoia

2013-10-16 Thread DJ Delorie

Building newlib uncovered a few invalid assumptions...  Committed.

* config/rl78/rl78.c (rl78_alloc_address_registers_macax): Verify
op is a REG before checking REGNO.
(rl78_alloc_physical_registers): Verify pattern is a SET before
checking SET_SRC.

Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c  (revision 203732)
+++ config/rl78/rl78.c  (working copy)
@@ -3047,13 +3047,14 @@ rl78_alloc_address_registers_macax (rtx 
  replace_in_op0 = (op  0  rtx_equal_p (OP (op), OP (0)));
  replace_in_op1 = (op  1  rtx_equal_p (OP (op), OP (1)));
 
  OP (op) = transcode_memory_rtx (OP (op), HL, insn);
  if (op == 2
   MEM_P (OP (op))
-  (REGNO (XEXP (OP (op), 0)) == SP_REG
+  ((GET_CODE (XEXP (OP (op), 0)) == REG
+   REGNO (XEXP (OP (op), 0)) == SP_REG)
  || (GET_CODE (XEXP (OP (op), 0)) == PLUS
   REGNO (XEXP (XEXP (OP (op), 0), 0)) == SP_REG)))
{
  emit_insn_before (gen_movhi (HL, gen_rtx_REG (HImode, 
SP_REG)), insn);
  OP (op) = replace_rtx (OP (op), gen_rtx_REG (HImode, SP_REG), 
HL);
}
@@ -3137,13 +3138,14 @@ rl78_alloc_physical_registers (void)
pattern = XVECEXP (pattern, 0, 0);
   if (JUMP_P (insn) || CALL_P (insn) || GET_CODE (pattern) == CALL)
clear_content_memory ();
   if (GET_CODE (pattern) != SET
   GET_CODE (pattern) != CALL)
continue;
-  if (GET_CODE (SET_SRC (pattern)) == ASM_OPERANDS)
+  if (GET_CODE (pattern) == SET
+  GET_CODE (SET_SRC (pattern)) == ASM_OPERANDS)
continue;
 
   valloc_method = get_attr_valloc (insn);
 
   PATTERN (insn) = copy_rtx_if_shared (PATTERN (insn));
 


  1   2   >