Re: Commit: RX: Fix simple_return pattern

2012-06-27 Thread nick clifton

Hi Mike,

>>   I plan on applying a similar patch to the mainline sources once I have
>>   finished regression testing them.
>
> Really, trunk should always go in first...  Could you hold 4.7 until trunk goes
> in?


Sorry, I had already checked the patch in.  I have now checked in the 
trunk patch, with only 1 day between the two patches going in.  In the 
future I'll be sure to patch the trunk first.


Cheers
  Nick





Re: [PATCH 1/2] gcc symbol database

2012-06-27 Thread Yunfeng ZHANG
Hi Dodji Seketeli:

Is it possible for gcc to accept libcpp.patch and plugin.patch?

I recently rewrote my doc.txt, mainly adding a new section, Macro
Expansion Overview, which focuses on how pfile.context is used during
macro expansion. I think it's important to use the cb_macro_start/end
callbacks, because most users only care about the outermost macro
expansion and test whether pfile.context.prev == NULL; however, a user
who doesn't know about the cascaded-macro case will find that his code
crashes.

Sincerely
 Yunfeng
// vim: foldmarker=([{,}]) foldmethod=marker
 Gcc symbol database (symdb)
 zyf.zer...@gmail.com
  November 24, 2009
   revised on May 24, 2012

// Purpose ([{
This file records an idea I had: collecting gcc internal data (definitions,
file dependences etc.) and writing them to a database for further use.  Do
you know cscope? I think it's more appropriate for symbols to be collected
by gcc itself. The later sections fall into two categories:

For users (here a user is an IDE-like development tool, not an end user)
1) To learn what symdb can do, go to section Feature List.
2) Go to section User Manual for how to use symdb.
3) Sections Multiple results and Tested cases have more about the plugin.
For gcc internal developers
1) Section New Token Type defines some new token types used in my symdb.
2) Sections Gcc  Macro Expansion show some complex cases related to macro
expansion. In every section I list the calling sequence from the plugin side
and a stack snapshot from the gcc side; read them carefully, they are the key
to understanding so.c:class mo.  test/testplan.txt and test/macro have the
testcases.
3) Section Patch Overview describes which files are changed in the patch and
how.

Before we go on, let's clear up some terminology and abbreviations used in my
symdb
1) cpp abbreviates C preprocessor (following gcc internal convention); cxx
represents C++.
2) In gcc/c-ppoutput.c, gcc defines a compilation unit as the compilation of
one file; the new term `compilation session' means compiling all files of a
project.
// }])

For User
// Feature List ([{
1) The plugin only works on C, not C++.
2) The plugin can collect all extern definitions and dump them to the database.
3) By convention, the members of an extern enum are collected and dumped to
the database.
4) Function call relationships are collected too, just like cscope.
5) You can use the FileDependence table of the database to reconstruct the
file dependence relationships.
6) Unlike cscope, you can use `gs addsym/rmsym' to re-edit the database,
removing duplicate results and appending new symbols; see section
Multiple-results.
7) I finished a vim script `helper.vim' to help you use the database in vim.
8) My plugin handles cases that cscope cannot, since it catches definitions
after macro expansion (for example, it can tell you where `sys_open' is
defined in the linux source) and skips code excluded by `#ifdef/#if'.
// }])

// User Manual ([{
Note: Use my plugin on correct code; buggy code may cause my plugin to loop
forever.

Prepare stage (patch on gcc-4.6.3):
1) cp gcc.patches/* gcc.src/patches
2) quilt push -a
3) make # as usual
More about compilation suites (such as crosstool-ng-1.13.2):
1) Since a gcc plugin is implemented as a shared library, disable the option
for building a static toolchain.
2) Add `--enable-plugin' to your gcc configure line, or append
`CT_CC_GCC_ENABLE_PLUGINS=y' to your crossng.config.
3) See section Tested cases for a sample command line on gcc-4.6.3.

Compiling source with the patched gcc (cd myplugin.src/):
1) make
2) cp gs helper.vim init.sql target.src/ && cd target.src/
3) ./gs initdb ./ # Initialize database. If you want to customize the plugin,
update the plugin-control-fields of database:ProjectOverview.
4) Append `-fplugin=/path/to/symdb.so
-fplugin-arg-symdb-dbfile=/target.src/gccsym.db' to your CFLAGS.
5) Since sqlite uses a file lock to synchronize the database, use `make -j1' to
compile the source.
6) Compiling your project will take longer, because my plugin needs to check
whether each token has already been inserted into the database, and multiple
cores can't help (see the previous step), so do it overnight.
7) ./gs vacuumdb ./ # Rearrange and defrag your database.
Of course, you can use some shortcuts to compile your projects without any
modification:
alias gcc='gccplugin -fplugin=symdb.so -fplugin-arg-symdb-dbfile=gccsym.db'
alias make='make -j1'

Working with the new database:
1) cd /target.src
2) vi
3) Execute `:source helper.vim'.
4) Use `CTRL-]' to search for a definition.
5) Use `CTRL-[' to search for the functions that call the function.
6) Use `CTRL-T' to jump back.
7) Use `Gs def yoursymbol' to search for a definition.
8) Use `Gs callee yourfunction' to search function call relationships.

Vim quickref:
Since my database stores the file offset of every token:
1) Use `:go fileoffset' to jump to the token.
2) Use `gCTRL-g' on the char to get the 

[PATCH] Disable loop2_invariant for -Os

2012-06-27 Thread Zhenqiang Chen
Hi,

In general, invariant motion itself can not reduce code size. But it will
change the liverange of the invariant, which might lead to more spilling.
The patch disables loop2_invariant when optimizing for size.

I measured the code size benefit for four targets based on CSiBE benchmark:

ARM: 0.33%
MIPS: 1.15%
PPC: 0.24%
X86: 0.45%

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2012-06-27  Zhenqiang Chen zhenqiang.c...@arm.com

* loop-init.c (gate_rtl_move_loop_invariants): Disable loop2_invariant
when optimizing function for size.

diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 03f8f61..5d8cf73 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
 static bool
 gate_rtl_move_loop_invariants (void)
 {
+  /* In general, invariant motion can not reduce code size. But it will
+     change the liverange of the invariant, which increases the register
+     pressure and might lead to more spilling.  */
+  if (optimize_function_for_size_p (cfun))
+    return false;
+
   return flag_move_loop_invariants;
 }





Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-27 Thread Richard Guenther
On Tue, Jun 26, 2012 at 11:28 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Jun 26, 2012 at 2:04 PM, Alexandre Oliva aol...@redhat.com wrote:
 I test i686-linux-gnu in a presumably unusual setting: it's an
 x86_64-linux-gnu system, and I've configured the GCC build to use as
 builddev tools wrapper scripts for as, ld, gnatmake and gcc that add
 flags that make them default to 32-bit.

 This worked fine for regression testing, but I've recently realized
 (with the PR49888/53671 mishap) that I'm getting tons of LTO testsuite
 failures (before and after, so no regression), because the 32-bit LTO
 plugin built in this setting can't possibly be used by the 64-bit linker
 installed on the system.  Obviously, passing -melf_i386 to the linker
 through the wrapper is not enough for it to be able to dlopen a 32-bit
 plugin ;-)

 I am using this Makefile fragment to bootstrap and test
 -m32 and -mx32 GCC on Linux/x86-64:

 ifneq ($(BUILD-ARCH),$(CPU))
 ifeq (i386,$(ARCH))
 TARGET-FLAGS=$(TARGET)
 CC=gcc -m32
 CXX=g++ -m32
 FLAGS-TO-PASS+=CC=$(CC)
 FLAGS-TO-PASS+=CXX=$(CXX)
 # Need 32bit linker for LTO.  */
 PATH:=/usr/local32/bin:$(PATH)
 endif

 ifeq (x32,$(ARCH))
 CC=gcc -mx32
 CXX=g++ -mx32
 FLAGS-TO-PASS+=CC=$(CC)
 FLAGS-TO-PASS+=CXX=$(CXX)
 # Need x32 linker for LTO.  */
 PATH:=/usr/localx32/bin:$(PATH)
 endif
 endif

 [hjl@gnu-32 gcc-32bit]$ file /usr/localx32/bin/ld
 /usr/localx32/bin/ld: ELF 32-bit LSB executable, x86-64, version 1
 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.38,
 BuildID[sha1]=0x85a2821594e122d4fc60741e2664c2b57888682e, not stripped
 [hjl@gnu-32 gcc-32bit]$ file /usr/local32/bin/ld
 /usr/local32/bin/ld: ELF 32-bit LSB executable, Intel 80386, version 1
 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9,
 not stripped
 [hjl@gnu-32 gcc-32bit]$

So I suppose the above would be good to have in collect-ld?



 --
 H.J.


Re: [RFC] Tweak reload to const propagate into matching constraint output

2012-06-27 Thread Richard Guenther
On Wed, Jun 27, 2012 at 5:02 AM, Richard Henderson r...@redhat.com wrote:
 The problem I'd like to solve is stuff like

        pxor    %xmm4, %xmm4
 ...
        movdqa  %xmm4, %xmm2
        pcmpgtd %xmm0, %xmm2

 In that there's no point performing the copy from xmm4
 rather than just emitting a new pxor insn.

 The Real Problem, as I see it, is that at the point (g)cse
 runs we have no visibility into the 2-operand matching
 constraint on that pcmpgtd so we make the wrong choice
 in sharing the zero.

 If we're using AVX, instead of SSE, we don't use matching
 constraints and given the 3-operand insn, hoisting the zero
 is the right and proper thing to do because we won't need
 to emit that movdqa.

 Of course, this fires for normal integer code as well.
 Some cases it's a clear win:

 -:      41 be 1f 00 00 00       mov    $0x1f,%r14d
 ...
 -:      4c 89 f1                mov    %r14,%rcx
 +:      b9 1f 00 00 00          mov    $0x1f,%ecx

 sometimes not (increased code size):

 -:      41 bd 01 00 00 00       mov    $0x1,%r13d
 -:      4d 89 ec                mov    %r13,%r12
 +:      41 bc 01 00 00 00       mov    $0x1,%r12d
 +:      41 bd 01 00 00 00       mov    $0x1,%r13d

I suppose that might be fixed if instead of

+  /* Only use the constant when it's just as cheap as a reg move.  */
+  if (set_src_cost (c, optimize_function_for_speed_p (cfun)) == 0)
+    return c;

you'd unconditionally use size costs?

 although the total difference is minimal, and ambiguous:

        new text        old text
 cc1     13971302        13971342
 cc1plus 15882736        15882728

 Also, note that in the first case above, r14 is otherwise
 unused, and we wind up with an unnecessary save/restore of
 the register in the function.

 Thoughts?

We have an inverse issue elsewhere in that we don't CSE a propagated constant
but get

   mov $0, (%eax)
   mov $0, 4(%eax)
...

instead of doing one register clearing and then re-using that as zero.  But I
suppose reload is not exactly the place to fix that ;)

Richard.



 r~


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Richard Guenther
On Wed, Jun 27, 2012 at 2:35 AM, Magnus Fromreide ma...@lysator.liu.se wrote:
 On Mon, 2012-06-25 at 15:17 -0700, Lawrence Crowl wrote:
 On 6/25/12, Joseph S. Myers jos...@codesourcery.com wrote:
  On Mon, 25 Jun 2012, Diego Novillo wrote:
   [ Added doc maintainers in CC ]
  

 I have added a bit more in the rationale, reached through the link
 at the end of that section.

+<p>
+Indent protection labels by one space.
+</p>
+
+<p>
+Indent class members by two spaces.
 
  Do all the listed indentation rules correspond to what a TAB
  will do by default when editing C++ code in GNU Emacs?  If not,
  we have conflicting notions of GNU C++ indentation conventions.

 I have no idea.  I don't use emacs.  The two-space rule for members
 comes from the wiki.  The one-space rule for protection labels is
 common practice.  If folks want something else, changes are fine
 with me.

 Two spaces for members is common practice with GNU, and it seems to be
 used for libstdc++.

 One space for protection labels is not something I have heard of before
 and libstdc++ uses no indentation for them.

 A freshly started emacs also doesn't indent access labels.

 I do think there is some value in using the same coding style for
 libstdc++ and the compiler.

I agree here.  It's the same we do for case labels.

Richard.

 /MF



Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Gabriel Dos Reis

[...]

|  Two spaces for members is common practice with GNU, and it seems to be
|  used for libstdc++.
| 
|  One space for protection labels is not something I have heard of before
|  and libstdc++ uses no indentation for them.
| 
|  A freshly started emacs also doesn't indent access labels.
| 
|  I do think there is some value in using the same coding style for
|  libstdc++ and the compiler.
| 
| I agree here.  It's the same we do for case labels.
| 
| Richard.

I think we reached total agreement :-)

-- Gaby


Re: [PATCH] Disable loop2_invariant for -Os

2012-06-27 Thread Richard Guenther
On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
> Hi,
>
> In general, invariant motion itself can not reduce code size.

It can expose CSE opportunities across loops though.

> But it will
> change the liverange of the invariant, which might lead to more spilling.

might - indeed.  I wonder what the trade-off is here ... but given that you
leave tree loop invariant motion enabled it might not make much of a difference.

Still as this is mostly a spilling issue it looks odd to do that generally.  In
fact you could improve things by only disabling motion when that increases
register lifetime - it can after all reduce overall register lifetime:

for (;;)
  inv = inv1 + inv2;
  ... use inv;

to

inv = inv1 + inv2;
for (;;)
  ... use inv;

has register lifetime reduced.
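
For illustration, the same transformation written as two compilable C functions. This is only a sketch with made-up names (`sum_scaled_in_loop`, `sum_scaled_hoisted`), not code from the patch: in the first version both `inv1` and `inv2` have to stay live across the loop, while in the hoisted version only `inv` does.

```c
#include <stddef.h>

/* Before hoisting: the invariant sum is recomputed in every iteration,
   so inv1 and inv2 are both live for the whole loop.  */
long
sum_scaled_in_loop (const long *v, size_t n, long inv1, long inv2)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    {
      long inv = inv1 + inv2;
      total += v[i] * inv;
    }
  return total;
}

/* After hoisting: the sum is computed once, and only inv is live
   inside the loop.  */
long
sum_scaled_hoisted (const long *v, size_t n, long inv1, long inv2)
{
  long inv = inv1 + inv2;
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i] * inv;
  return total;
}
```

Both functions return the same value; whether the hoisted form actually needs fewer registers depends on what else is live around the loop.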

Or at least like I suggest below.

> The patch disables loop2_invariant when optimizing for size.
>
> I measured the code size benefit for four targets based on CSiBE benchmark:
>
> ARM: 0.33%
> MIPS: 1.15%
> PPC: 0.24%
> X86: 0.45%
>
> Is it OK for trunk?
>
> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2012-06-27  Zhenqiang Chen zhenqiang.c...@arm.com
>
>        * loop-init.c (gate_rtl_move_loop_invariants): Disable
> loop2_invariant
>        when optimizing function for size.
>
> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
> index 03f8f61..5d8cf73 100644
> --- a/gcc/loop-init.c
> +++ b/gcc/loop-init.c
> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>  static bool
>  gate_rtl_move_loop_invariants (void)
>  {
> +  /* In general, invariant motion can not reduce code size. But it will
> +     change the liverange of the invariant, which increases the register
> +     pressure and might lead to more spilling.  */
> +  if (optimize_function_for_size_p (cfun))
> +    return false;
> +

Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

Thanks,
Richard.

>   return flag_move_loop_invariants;
>  }





Re: [PATCH] Disable loop2_invariant for -Os

2012-06-27 Thread Steven Bosscher
On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
> Hi,
>
> In general, invariant motion itself can not reduce code size. But it will
> change the liverange of the invariant, which might lead to more spilling.

This may be true for ARM but it's not true in general. Sometimes
loop-invariant address arithmetic, that is not exposed in GIMPLE, is
profitable to hoist out of the loop. See e.g. PR41026 (for which I
still have a patch in the queue).

If this goes in anyway, please mention PR39837 in your ChangeLog entry.

Ciao!
Steven


Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-27 Thread Alexandre Oliva
[Adding gcc@]

On Jun 26, 2012, H.J. Lu hjl.to...@gmail.com wrote:

 On Tue, Jun 26, 2012 at 3:39 PM, Mike Stump mikest...@comcast.net wrote:
 On Jun 26, 2012, at 2:04 PM, Alexandre Oliva wrote:
 I test i686-linux-gnu in a presumably unusual setting
 
 I like the setup and testing...
 
 This worked fine for regression testing, but I've recently realized
 (with the PR49888/53671 mishap) that I'm getting tons of LTO testsuite
 failures (before and after, so no regression), because the 32-bit LTO
 plugin built in this setting can't possibly be used by the 64-bit linker
 installed on the system.  Obviously, passing -melf_i386 to the linker
 through the wrapper is not enough for it to be able to dlopen a 32-bit
 plugin ;-)

 So, let's kick this back to the gcc list for all the interested
 parties to chime in on...  I'd rather have 5 interested people
 architect the right, nice solution that is engineered to work and
 then use it.

 H.J.'s solution seems like the most reasonable short term solution

It's not a “solution”, it's just the same local arrangement I mentioned
I was leaning towards, after fixing the problem in the test harness,
that lets GCC use the plugin and fail even after explicitly testing for
that.  I don't see how that can possibly be perceived as not a bug.
Which is not to say that there aren't *other* bugs in place, and that
some of them might alleviate this one.

 If we build the plugin after sensing the 64-bitness of ld, using the
 flags appropriate for the linker...

Then we'd be disobeying the host setting specified or detected by
configure.

 failing the build early when mismatched

Why?  We don't demand a working plugin.  Indeed, we disable the use of
the plugin if we find a linker that doesn't support it.  We just don't
account for the possibility of finding a linker that supports plugins,
but that doesn't support the one we'll build later.

 Bootstrap/test -m32/-mx32 with LTO on -m64 is similar to cross-compile,
 in the sense that the native linker may not work with plugin.  We just
 need to make the right linker available to GCC. My kludge uses PATH.
 One can also include binutils source in GCC source tree.

These are all reasonable suggestions to make the plugin work.  But
that's not what the patch addresses, namely what to do during testing
when the plugin is found to NOT work.

As I wrote in the original post, even if we were to detect that the
plugin is not supported and refrain from building it and testing it,
it's more valuable that the test summaries explicitly indicate, in each
FAIL or XFAIL, that the plugin was not used, rather than make room for
uncertainty as to whether the plugin was implicitly used or not.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2

2012-06-27 Thread Christophe Lyon

On 26.06.2012 22:17, Alexandre Oliva wrote:

On Jun 26, 2012, Christophe Lyon christophe.l...@st.com wrote:


On 25.06.2012 17:24, Joseph S. Myers wrote:

On Mon, 25 Jun 2012, Christophe Lyon wrote:


Ping?

I advise CCing appropriate maintainers (in this case, build system
maintainers) on pings.


Ping again, CCing build system maintainers as suggested by Joseph.

Is this a ping for the patch you quoted in your Jun 25 email?  It's
generally good practice to include a link to the message holding the
patch in a ping.

Yes it is, sorry.
The original proposal was:
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01855.html


I looked at the patch in there, and I'm afraid I don't understand how it
achieves the ChangeLog-suggested purpose of ensuring -O2 makes it to
C*FLAGS_FOR_TARGET, when all it appears to do is to prepend -g.  Can you
please clarify?


With more context, the current code fragment is:
  CFLAGS_FOR_TARGET=$CFLAGS
  case " $CFLAGS " in
    *" -O2 "*) ;;
    *) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
  esac
  case " $CFLAGS " in
    *" -g "* | *" -g3 "*) ;;
    *) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
  esac

where pre-pending -g discards -O2 if it was pre-pended just above.
That's why I replace CFLAGS by CFLAGS_FOR_TARGET when pre-pending -g.

Ditto for CXXFLAGS.
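
To make the overwrite visible outside the shell, here is a small C model of the two `case` statements. It is a sketch only: the helper names (`has_flag`, `target_flags_current`, `target_flags_fixed`) are invented for the example, buffers are fixed-size for brevity, and `has_flag` approximates the shell patterns by matching whole space-delimited words.

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if FLAG occurs in FLAGS as a whole space-delimited word.  */
static int
has_flag (const char *flags, const char *flag)
{
  char padded[256], needle[64];
  snprintf (padded, sizeof padded, " %s ", flags);
  snprintf (needle, sizeof needle, " %s ", flag);
  return strstr (padded, needle) != NULL;
}

/* Current behavior: both branches start again from CFLAGS, so the second
   assignment throws away the -O2 added by the first.  */
void
target_flags_current (const char *cflags, char *out, size_t size)
{
  snprintf (out, size, "%s", cflags);
  if (!has_flag (cflags, "-O2"))
    snprintf (out, size, "-O2 %s", cflags);
  if (!has_flag (cflags, "-g") && !has_flag (cflags, "-g3"))
    snprintf (out, size, "-g %s", cflags);
}

/* Proposed behavior: the -g branch extends the value built so far.  */
void
target_flags_fixed (const char *cflags, char *out, size_t size)
{
  char tmp[256];
  snprintf (out, size, "%s", cflags);
  if (!has_flag (cflags, "-O2"))
    snprintf (out, size, "-O2 %s", cflags);
  if (!has_flag (cflags, "-g") && !has_flag (cflags, "-g3"))
    {
      /* Copy first: source and destination of snprintf must not overlap.  */
      snprintf (tmp, sizeof tmp, "%s", out);
      snprintf (out, size, "-g %s", tmp);
    }
}
```

With CFLAGS set to `-Wall`, the first function yields `-g -Wall` (the -O2 is lost) and the second yields `-g -O2 -Wall`.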

Christophe.




Re: [PATCH] Fix accesses to freed up memory in var-tracking (PR debug/53706)

2012-06-27 Thread Alexandre Oliva
On Jun 21, 2012, Uros Bizjak ubiz...@gmail.com wrote:

 Hello!
  During htab_delete (dropped_values), loc_exp_dep_pool
  allocated objects might be accessed, so it is better to free the
  pool afterwards.
 
  Bootstrapped/regtested on i686-linux, ok for trunk?
 
 Looks obvious.

 The patch doesn't fix all writes to freed up memory, please see
 comment #8 in the PR audit trail.

So, I've tested your patch in comment #10 on ia64-linux-gnu, and it
worked, but it failed on i686- and x86_64-linux-gnu, just because in
some cases we decided not to go through vt_emit_notes(), so
loc_exp_dep_pool remained uninitialized, and free_alloc_pool doesn't
like to release NULL pools ;-)

The resulting patch was regstrapped on i686- and x86_64-linux-gnu.  I'm
going to check it in as obvious after getting some sleep.
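
The failure mode and the guard can be reduced to a few lines of plain C. Everything in this sketch is a hypothetical stand-in (`struct pool`, `pool_create`, `pool_release`, `emit_notes_step`, `finalize_step`), not the alloc-pool API: a release routine that refuses NULL, a pool that is only created on one path, and a finalizer that therefore has to check before releasing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocation pool.  */
struct pool { void *storage; };

static struct pool *dep_pool;   /* stays NULL if notes are never emitted */
static int pools_released;

static struct pool *
pool_create (void)
{
  struct pool *p = malloc (sizeof *p);
  assert (p != NULL);
  p->storage = NULL;
  return p;
}

/* Like the release routine described above, this rejects NULL.  */
static void
pool_release (struct pool *p)
{
  assert (p != NULL);
  free (p->storage);
  free (p);
  pools_released++;
}

/* Only some compilations reach this step.  */
void
emit_notes_step (void)
{
  dep_pool = pool_create ();
}

/* Always runs, so it must cope with a pool that was never created.  */
void
finalize_step (void)
{
  if (dep_pool)
    pool_release (dep_pool);
  dep_pool = NULL;
}
```

Calling `finalize_step` without a preceding `emit_notes_step` is the case that tripped on i686 and x86_64; with the check it is a no-op.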

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com,
	Uros Bizjak  ubiz...@gmail.com, Jakub Jelinek  ja...@redhat.com

	PR debug/53706
	PR debug/47624
	* var-tracking.c (vt_emit_notes): Release loc_exp_dep_pool...
	(vt_finalize): ... here instead, if needed.

Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c.orig	2012-06-27 02:25:13.903896343 -0300
+++ gcc/var-tracking.c	2012-06-27 03:22:25.0 -0300
@@ -9260,11 +9260,7 @@ vt_emit_notes (void)
   dataflow_set_destroy (cur);
 
   if (MAY_HAVE_DEBUG_INSNS)
-{
-  free_alloc_pool (loc_exp_dep_pool);
-  loc_exp_dep_pool = NULL;
-  htab_delete (dropped_values);
-}
+htab_delete (dropped_values);
 
   emit_notes = false;
 }
@@ -9974,6 +9970,9 @@ vt_finalize (void)
 
   if (MAY_HAVE_DEBUG_INSNS)
 {
+  if (loc_exp_dep_pool)
+	free_alloc_pool (loc_exp_dep_pool);
+  loc_exp_dep_pool = NULL;
   free_alloc_pool (valvar_pool);
   VEC_free (rtx, heap, preserved_values);
   cselib_finish ();


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: RFA: dead_debug_* ICE

2012-06-27 Thread Alexandre Oliva
On Jun 25, 2012, Alexandre Oliva aol...@redhat.com wrote:

 On Jun 24, 2012, Richard Sandiford rdsandif...@googlemail.com wrote:
 gcc.c-torture/compile/vector-2.c fails on mips64-elf with RTL checking
 enabled because dead_debug_insert_temp tries to read the REGNO of something
 that it has already replaced with a debug temporary.

 This sounds like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53740

 The patch looks reasonable to me, but I don't think I'm entitled to
 approve it.

So, it turned out that the observable problems were a red herring.  The
problem was that we were failing to emit debug temps for regs that were
set *and* used, but that died before their last debug use.  The “else”
that I'd recently introduced was a mistake, for it stopped us from
satisfying the needed debug binding with the correct expression if it
had any nondebug uses.  The need would get past (backwards) the setting
point, reaching some earlier set that happened to be actually dead and
getting us thoroughly confused.  This was obviously not supposed to
happen.

Once I removed the incorrectly-added “else”s, the problem no longer
occurred, and I'm pretty sure it won't.

Now, while investigating the problem, I noticed a number of suspicious
paradoxical SUBREGs that I'm pretty sure were supposed to refer to full
double words rather than extending single words to double.  Indeed, we
had a problem in tracking single-reg sets for multi-reg debug uses: we'd
use the value stored in the lowest-numbered as if it was the whole
multireg value.  This would be a sign extension of the low part given
the right endianness, but it would become utter gibberish for the wrong
one.  There was code in place to avoid this incorrect use if we happened
to be storing the part of the value in a SUBREG, but if we wrote to any
entire REG, we'd happily do the wrong thing.
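
A toy model of the multi-reg problem, for readers who want to see the two outcomes side by side. This is an illustration under stated assumptions, not the df code: a 64-bit value is split over two 32-bit "registers", the `words_big_endian` parameter and both function names are invented for the example, and the faulty shortcut reads only the lowest-numbered register.

```c
#include <stdint.h>

/* Split a 64-bit VALUE across two 32-bit "registers".  When
   WORDS_BIG_ENDIAN is nonzero, the high word lands in the
   lowest-numbered register.  */
void
store_pair (uint32_t regs[2], uint64_t value, int words_big_endian)
{
  uint32_t lo = (uint32_t) value;
  uint32_t hi = (uint32_t) (value >> 32);
  regs[0] = words_big_endian ? hi : lo;
  regs[1] = words_big_endian ? lo : hi;
}

/* The faulty shortcut: treat the lowest-numbered register as if it
   held the whole value, sign-extending it to 64 bits.  */
int64_t
value_from_lowest_reg (const uint32_t regs[2])
{
  return (int64_t) (int32_t) regs[0];
}
```

For the value 0x0000000100000002 the shortcut yields 2 (the low part) with little-endian word order and 1 (the high part) with big-endian word order; neither is the full value.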

I have plans on how to deal with multiregs properly, as noted in the
newly-added comments, but I haven't decided it's worth tackling yet.

Richard, is it easy for you to confirm that the patch fixes the mips64-
problem too?  If not, I'll build a cross and hope I can trigger the
problem with it.  (no chance of my building an rtx-checking native on my
poor yeeloong ;-)

This was regstrapped on x86_64- and i686-linux-gnu.  I'm checking it in
as obvious after catching some sleep.

for  gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/53740
	PR debug/52983
	PR debug/48866
	* dce.c (word_dce_process_block): Check whether inserting debug
	temps are needed even for needed insns.
	(dce_process_block): Likewise.
	* df-problems.c (dead_debug_add): Add comment about multi-regs.
	(dead_debug_insert_temp): Likewise.  Don't subreg when we're
	setting fewer regs than a multi-reg requires.

Index: gcc/dce.c
===
--- gcc/dce.c.orig	2012-06-27 02:29:32.290377543 -0300
+++ gcc/dce.c	2012-06-27 02:30:56.072721259 -0300
@@ -864,9 +864,12 @@ word_dce_process_block (basic_block bb, 
 	   anything in local_live.  */
 	if (marked_insn_p (insn))
 	  df_word_lr_simulate_uses (insn, local_live);
+
 	/* Insert debug temps for dead REGs used in subsequent debug
-	   insns.  */
-	else if (debug.used && !bitmap_empty_p (debug.used))
+	   insns.  We may have to emit a debug temp even if the insn
+	   was marked, in case the debug use was after the point of
+	   death.  */
+	if (debug.used && !bitmap_empty_p (debug.used))
 	  {
 	df_ref *def_rec;
 
@@ -963,9 +966,12 @@ dce_process_block (basic_block bb, bool 
 	   anything in local_live.  */
 	if (needed)
 	  df_simulate_uses (insn, local_live);
+
 	/* Insert debug temps for dead REGs used in subsequent debug
-	   insns.  */
-	else if (debug.used && !bitmap_empty_p (debug.used))
+	   insns.  We may have to emit a debug temp even if the insn
+	   was marked, in case the debug use was after the point of
+	   death.  */
+	if (debug.used && !bitmap_empty_p (debug.used))
 	  for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++)
 	dead_debug_insert_temp (debug, DF_REF_REGNO (*def_rec), insn,
 DEBUG_TEMP_BEFORE_WITH_VALUE);
Index: gcc/df-problems.c
===
--- gcc/df-problems.c.orig	2012-06-27 02:30:45.0 -0300
+++ gcc/df-problems.c	2012-06-27 02:30:56.073721179 -0300
@@ -3179,6 +3179,9 @@ dead_debug_add (struct dead_debug *debug
   if (!debug->used)
 debug->used = BITMAP_ALLOC (NULL);
 
+  /* ??? If we dealt with split multi-registers below, we should set
+ all registers for the used mode in case of hardware
+ registers.  */
   bitmap_set_bit (debug->used, uregno);
 }
 
@@ -3269,6 +3272,15 @@ dead_debug_insert_temp (struct dead_debu
 	  /* Hmm...  Something's fishy, we should be setting REG here.  */
 	  if (REGNO (dest) != REGNO (reg))
 	breg = NULL;
+	  /* If we're not overwriting all the hardware registers that
+	 setting REG in its mode would, we won't know what to bind
+	 the debug temp to.  ??? We 

[RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-27 Thread Matthew Gretton-Dann

All,

This patch enables the dump-noaddr test to work in out-of-build-tree testing.

It does this by making sure that the dump files generated during the
test are created under $tmpdir.

gcc/testsuite/ChangeLog:
2012-06-27  Matthew Gretton-Dann  matthew.gretton-d...@arm.com

* gcc.c-torture/unsorted/dump-noaddr.x: Generate dump files in
tmpdir.

Tested both in and out of the build tree against an arm-none-eabi targeted
compiler.


OK?

Thanks,

Matt

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd
diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
index a8174e0..bd84c06 100644
--- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
+++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x
@@ -9,14 +9,14 @@ proc dump_compare { src options } {
 
 # loop through all the options
 foreach option $option_list {
-	file delete -force dump1
-	file mkdir dump1
-	c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
-	file delete -force dump2
-	file mkdir dump2
-	c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
-	foreach dump1 [lsort [glob -nocomplain dump1/*]] {
-	regsub dump1/ $dump1 dump2/ dump2
+	file delete -force $tmpdir/dump1
+	file mkdir $tmpdir/dump1
+	c-torture-compile $src $option $options -dumpbase $tmpdir/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+	file delete -force $tmpdir/dump2
+	file mkdir $tmpdir/dump2
+	c-torture-compile $src $option $options -dumpbase $tmpdir/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+	foreach dump1 [lsort [glob -nocomplain $tmpdir/dump1/*]] {
+	set dump2 $tmpdir/dump2/[file tail $dump1]
 	set dumptail gcc.c-torture/unsorted/[file tail $dump1]
 	#puts $option $dump1
 	set tmp [ diff $dump1 $dump2 ]
@@ -30,8 +30,8 @@ proc dump_compare { src options } {
 	#exec diff $dump1 $dump2
 	}
 }
-file delete -force dump1
-file delete -force dump2
+file delete -force $tmpdir/dump1
+file delete -force $tmpdir/dump2
 }
 
 catch {dump_compare $src $options} result

[PATCH] Fix PR53774

2012-06-27 Thread Richard Guenther

This fixes PR53774, a case where reassoc produced non-canonical
statements like a = 4 + b.  The reason it did so was that it
assigned a rank of zero to things that are not constant.  Fixed
by pre-computing a rank for all SSA default defs.
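
The effect of the rank on operand order can be sketched in isolation. The code below is a simplified model, not the reassoc implementation: `struct operand`, `by_decreasing_rank` and `canonicalize` are names made up for the example, and the only rule modeled is "sort by decreasing rank, constants have rank 0".

```c
#include <stdlib.h>

/* Hypothetical operand record: constants carry rank 0, everything else
   is supposed to carry a positive rank.  */
struct operand { const char *name; long rank; };

/* Order by decreasing rank, so that rank-0 constants end up last.  */
static int
by_decreasing_rank (const void *pa, const void *pb)
{
  const struct operand *a = pa, *b = pb;
  return (a->rank < b->rank) - (a->rank > b->rank);
}

void
canonicalize (struct operand *ops, size_t n)
{
  qsort (ops, n, sizeof *ops, by_decreasing_rank);
}
```

With `b` at a positive rank and the constant `4` at rank 0, sorting always gives `b + 4`. If `b` is wrongly given rank 0 as well, the two compare equal and nothing in this ordering keeps the constant last, which is how a statement like `a = 4 + b` can come out.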

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-06-27  Richard Guenther  rguent...@suse.de

PR tree-optimization/53774
* tree-ssa-reassoc.c (get_rank): All default defs have
precomputed rank.
(init_reassoc): Precompute rank for all SSA default defs.

Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 188987)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -383,14 +383,10 @@ get_rank (tree e)
   int i, n;
   tree op;
 
-  if (TREE_CODE (SSA_NAME_VAR (e)) == PARM_DECL
-      && SSA_NAME_IS_DEFAULT_DEF (e))
+  if (SSA_NAME_IS_DEFAULT_DEF (e))
return find_operand_rank (e);
 
   stmt = SSA_NAME_DEF_STMT (e);
-  if (gimple_bb (stmt) == NULL)
-   return 0;
-
   if (gimple_code (stmt) == GIMPLE_PHI)
return phi_rank (stmt);
 
@@ -484,7 +480,7 @@ sort_by_operand_rank (const void *pa, co
   /* It's nicer for optimize_expression if constants that are likely
  to fold when added/multiplied//whatever are put next to each
  other.  Since all constants have rank 0, order them by type.  */
-  if (oeb->rank == 0 &&  oea->rank == 0)
+  if (oeb->rank == 0 && oea->rank == 0)
 {
   if (constant_type (oeb->op) != constant_type (oea->op))
return constant_type (oeb->op) - constant_type (oea->op);
@@ -3441,7 +3437,7 @@ transform_stmt_to_multiply (gimple_stmt_
   print_gimple_stmt (dump_file, stmt, 0, 0);
 }
 
-  gimple_assign_set_rhs_with_ops_1 (gsi, MULT_EXPR, rhs1, rhs2, NULL_TREE);
+  gimple_assign_set_rhs_with_ops (gsi, MULT_EXPR, rhs1, rhs2);
   update_stmt (gsi_stmt (*gsi));
   remove_visited_stmt_chain (rhs1);
 
@@ -3647,7 +3643,6 @@ init_reassoc (void)
 {
   int i;
   long rank = 2;
-  tree param;
   int *bbs = XNEWVEC (int, last_basic_block + 1);
 
   /* Find the loops, so that we can prevent moving calculations in
@@ -3666,24 +3661,15 @@ init_reassoc (void)
   bb_rank = XCNEWVEC (long, last_basic_block + 1);
   operand_rank = pointer_map_create ();
 
-  /* Give each argument a distinct rank.   */
-  for (param = DECL_ARGUMENTS (current_function_decl);
-   param;
-   param = DECL_CHAIN (param))
-{
-  if (gimple_default_def (cfun, param) != NULL)
-   {
- tree def = gimple_default_def (cfun, param);
- insert_operand_rank (def, ++rank);
-   }
-}
-
-  /* Give the chain decl a distinct rank. */
-  if (cfun->static_chain_decl != NULL)
-{
-  tree def = gimple_default_def (cfun, cfun->static_chain_decl);
-  if (def != NULL)
-   insert_operand_rank (def, ++rank);
+  /* Give each default definition a distinct rank.  This includes
+ parameters and the static chain.  Walk backwards over all
+ SSA names so that we get proper rank ordering according
+ to tree_swap_operands_p.  */
+  for (i = num_ssa_names - 1; i > 0; --i)
+{
+  tree name = ssa_name (i);
+  if (name && SSA_NAME_IS_DEFAULT_DEF (name))
+   insert_operand_rank (name, ++rank);
 }
 
   /* Set up rank for each BB  */


[PATCH][RFC] Fix PR53695

2012-06-27 Thread Richard Guenther

This fixes PR53695, where we recognize a loop which has only an
abnormal goto edge from its header to its tail.  During recognition
we already reject loops whose header has at least one abnormal
predecessor, so it seems reasonable to also require at least one
regular CFG path from the header to the latch (and thus to allow
abnormal edges only as extra entries / exits).

Not entirely trivial to test as one can see below.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any comments?

Thanks,
Richard.

2012-06-27  Richard Guenther  rguent...@suse.de

PR middle-end/53695
* cfgloop.c (path_without_edge_flags_1): New function.
(path_without_edge_flags): Likewise.
(flow_loops_find): Require at least one regular path from
loop header to latch.

Index: gcc/cfgloop.c
===
*** gcc/cfgloop.c   (revision 188987)
--- gcc/cfgloop.c   (working copy)
*** init_loops_structure (struct loops *loop
*** 365,370 
--- 365,424 
    loops->tree_root = root;
  }
  
+ /* Return true if there is a path from FROM to the dominated TO where no
+edge on that path contains FLAGS.  */
+ 
+ static bool
+ path_without_edge_flags_1 (basic_block from, basic_block to, int flags,
+  bitmap *visited)
+ {
+   while (to != from)
+ {
+   /* At least one such path to the immediate dominator.  */
+   if (single_pred_p (to))
+   {
+ edge e = single_pred_edge (to);
+ if (e->flags & flags)
+   return false;
+ to = e->src;
+   }
+   else
+   {
+ basic_block dom;
+ edge_iterator ei;
+ edge e;
+ 
+ /* We have to guard ourselves to not loop in the face of subloops.  */
+ if (!*visited)
+   *visited = BITMAP_ALLOC (NULL);
+ if (!bitmap_set_bit (*visited, to->index))
+   return false;
+ 
+ dom = get_immediate_dominator (CDI_DOMINATORS, to);
+ FOR_EACH_EDGE(e, ei, to->preds)
+   if (!(e->flags & flags)
+       && (e->src == dom
+           || path_without_edge_flags_1 (dom, e->src, flags, visited)))
+ break;
+ 
+ to = dom;
+   }
+ }
+ 
+   return true;
+ }
+ 
+ static bool
+ path_without_edge_flags (basic_block from, basic_block to, int flags)
+ {
+   bitmap visited = NULL;
+   bool res;
+   res = path_without_edge_flags_1 (from, to, flags, &visited);
+   if (visited)
+ BITMAP_FREE (visited);
+   return res;
+ }
+ 
  /* Find all the natural loops in the function and save in LOOPS structure and
 recalculate loop_depth information in basic block structures.
 Return the number of natural loops found.  */
*** flow_loops_find (struct loops *loops)
*** 422,430 
 by this block.  A natural loop has a single entry
 node (header) that dominates all the nodes in the
 loop.  It also has single back edge to the header
!from a latch node.  */
  if (latch != ENTRY_BLOCK_PTR
!  && dominated_by_p (CDI_DOMINATORS, latch, header))
{
  /* Shared headers should be eliminated by now.  */
  SET_BIT (headers, header->index);
--- 476,488 
 by this block.  A natural loop has a single entry
 node (header) that dominates all the nodes in the
 loop.  It also has single back edge to the header
!from a latch node.
!If there is no regular path from the header to the
!latch do not consider this latch (not worth the
!problems).  */
  if (latch != ENTRY_BLOCK_PTR
!  && dominated_by_p (CDI_DOMINATORS, latch, header)
!  && path_without_edge_flags (header, latch, EDGE_COMPLEX))
{
  /* Shared headers should be eliminated by now.  */
  SET_BIT (headers, header->index);
*** verify_loop_structure (void)
*** 1388,1393 
--- 1446,1460 
  err = 1;
}
}
+   else if (loop->latch)
+   {
+ if (find_edge (loop->latch, loop->header) == NULL)
+   {
+ error ("loop %d%'s latch has no edge to the header", i);
+ err = 1;
+   }
+   }
+ 
if (loop->header->loop_father != loop)
{
  error ("loop %d%'s header does not belong directly to it", i);
Index: gcc/testsuite/gcc.dg/torture/pr53695.c
===
*** gcc/testsuite/gcc.dg/torture/pr53695.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr53695.c  (working copy)
***
*** 0 
--- 1,14 
+ /* { dg-do compile } */
+ /* { dg-options "-ftracer" } */
+ 
+ void
+ foo (const void **p)
+ {
+   void *labs[] = { &&l1, &&l2, &&l3 };
+ l1:
+   goto *p++;
+ l2:
+   goto *p;
+ l3:
+   ;
+ }
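
A minimal sketch of the idea behind the patch above, outside of GCC: a
back edge only counts as a loop if the latch can be reached from the
header along edges that carry none of a given set of flags.  The sketch
uses a plain breadth-first search rather than the dominator walk in the
patch, and every name in it (toy_edge, EDGE_ABNORMAL_LIKE,
regular_path_exists) is hypothetical and not part of cfgloop.c.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-ins for GCC's CFG types; names are illustrative only.
enum edge_flag { EDGE_NONE = 0, EDGE_ABNORMAL_LIKE = 1 };

struct toy_edge { int src; int dest; int flags; };

// Return true if DEST is reachable from SRC using only edges whose
// flags have no bit in common with FORBIDDEN.
static bool
regular_path_exists (int n_blocks, const std::vector<toy_edge> &edges,
                     int src, int dest, int forbidden)
{
  std::vector<bool> visited (n_blocks, false);
  std::queue<int> work;
  visited[src] = true;
  work.push (src);
  while (!work.empty ())
    {
      int bb = work.front ();
      work.pop ();
      if (bb == dest)
        return true;
      for (std::size_t i = 0; i < edges.size (); ++i)
        {
          const toy_edge &e = edges[i];
          // Skip edges from other blocks, flagged edges, and seen targets.
          if (e.src != bb || (e.flags & forbidden) || visited[e.dest])
            continue;
          visited[e.dest] = true;
          work.push (e.dest);
        }
    }
  return false;
}
```

With only a flagged edge from block 0 to block 1 the function reports
no regular path; adding an unflagged edge between the same blocks makes
it succeed.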


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Chiheng Xu
On Wed, Jun 27, 2012 at 2:19 AM, Lawrence Crowl cr...@google.com wrote:
 On 6/26/12, Jason Merrill ja...@redhat.com wrote:
 On 06/25/2012 06:26 PM, Lawrence Crowl wrote:
  +or <code>gcc_unreachable</code>.  If the checks are expensive or the
  +compiler can reasonably carry on after the error, they may be
  +conditioned on <code>--enable-checking</code>.</p>

 by using <code>gcc_checking_assert</code>

 I inserted that suggestion, but note that I did not create that text,
 only moved it.

 [Rationale]
  +FIXME: Discussion of deleting inappropraite special members.

 Is this FIXME still needed?  The text following it seems to cover the
 issue well enough.

 No longer needed.

  +However, by default, RTTI is not permitted
  +and the compiler must build cleanly with <code>-fno-rtti</code>.

 This seems like an unnecessary restriction to me.

  +Disabling RTTI will save space in the compiler.

 This is a fine reason to disable RTTI if it isn't used, but doesn't
 seem like a strong reason to prohibit using it.  The information is
 only emitted for classes with virtual functions, isn't very large,
 is typically emitted in only one object file, and since it lives
 in .rodata it can be shared between multiple compiler processes.

  +Checking the type of a class at runtime usually indicates a design 
  problem.

 The tree_contains_struct machinery recently added to GCC maps
 directly onto dynamic_cast;

 if (CODE_CONTAINS_STRUCT(TREE_CODE(t),TS_DECL_WRTL))
   /* do something with t->decl_with_rtl */

 translates roughly to

 if (decl_with_rtl *p = dynamic_cast <decl_with_rtl *>(t))
   /* do something with p */

 When new interfaces are added partway down an inheritance tree,
 dynamic_cast is the right way to access them.  This isn't checking
 what the type is, it's checking whether the object supports a
 particular method.

 The fact that we've gone to the trouble to implement this
 functionality in C suggests to me that using it isn't always
 indicative of a design problem.  :)

 Personally, I am fine with using dynamic_cast.  I was writing up what
 I thought was a consensus on the wiki.  I can live without it, or I
 can use it profitably.  Whatever you all decide, I will write up.


dynamic_cast uses RTTI, while TREE_CODE is a poor man's type info. RTTI
is better than TREE_CODE. But if you decide to use RTTI, TREE_CODE
becomes redundant, which means all uses of TREE_CODE should be removed
sooner or later. Are you prepared for that?

If every type in the type hierarchy, not just the root type and the
leaf types, has a TREE_CODE, and you maintain a database of
inheritance relationships and other info about the types of different
TREE_CODEs, then for an object of a given TREE_CODE you know its
type info, you know whether or not the object can be cast to another
type, and you can look up other info about the type, such as the
object size and the type's string name.

This can be implemented as follows. For every type in the type
hierarchy, statically define an object of a meta type (meta class, or
class class) that describes the type info for a given TREE_CODE. All of
the meta type objects for the different TREE_CODEs are organized as an
array, indexed by the value of TREE_CODE. You then provide a set of C
functions that judge whether an object that has a given TREE_CODE can
be cast to a type that has another TREE_CODE, and that retrieve the
type info of an object that has a given TREE_CODE.
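
The scheme described above might look roughly like the following.  This
is only a sketch under stated assumptions: the codes, structures and
function names (toy_code, toy_type_info, toy_can_cast, toy_info) are
made up for illustration and do not correspond to anything in tree.h.

```cpp
#include <cstddef>

// Hypothetical node codes; these are not GCC's real TREE_CODE values.
enum toy_code { TC_NODE, TC_DECL, TC_VAR_DECL, TC_EXPR, TC_LAST };

// One descriptor per code: name, object size, and the code of the base type.
struct toy_type_info
{
  const char *name;
  std::size_t size;
  toy_code parent;   // TC_LAST marks the root, which has no parent.
};

struct toy_node { toy_code code; };
struct toy_decl { toy_node base; int uid; };
struct toy_var_decl { toy_decl base; int is_static; };
struct toy_expr { toy_node base; int n_operands; };

// The table is indexed directly by the code value.
static const toy_type_info toy_types[TC_LAST] = {
  { "node", sizeof (toy_node), TC_LAST },
  { "decl", sizeof (toy_decl), TC_NODE },
  { "var_decl", sizeof (toy_var_decl), TC_DECL },
  { "expr", sizeof (toy_expr), TC_NODE },
};

// Return true if an object whose code is FROM can be treated as type TO,
// that is, if TO is FROM itself or one of its ancestors.
static bool
toy_can_cast (toy_code from, toy_code to)
{
  for (toy_code c = from; c != TC_LAST; c = toy_types[c].parent)
    if (c == to)
      return true;
  return false;
}

// Look up the descriptor for the object N.
static const toy_type_info *
toy_info (const toy_node *n)
{
  return &toy_types[n->code];
}
```

Here toy_can_cast (TC_VAR_DECL, TC_NODE) holds because the parent chain
of a var_decl reaches the root, while toy_can_cast (TC_EXPR, TC_DECL)
does not.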


  +If you need to know the type of a class for some other reason,
  +use an enum or a virtual member function
  +that converts a pointer to the more derived class.
  +For example,
  +</p>
  +
  +<blockquote><pre><code>
  +common_type *p = ;
  +if (specific_type *q = p-&gt;to_specific ()) {
  +  // We have and can use a specific_type pointed to by q.
  +}
  +</code></pre></blockquote>

 This is basically equivalent to dynamic_cast except that you need
 to clutter up your root base class with one virtual function for
 each derived class you want to be able to convert to.

 Not quite equivalent, as you can implement to_specific with
 non-virtual classes using the TREE_CODE.  Thus we could implement
 a type-safe dynamic pointer converter before converting to virtual
 classes.

 OTOH, we could probably work up a template function that looks like
 dynamic_cast but uses the TREE_CODE instead, achieving the same
 intermediate step.

 I agree that it does clutter up the base class.
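
 A rough sketch of such a template function, under stated assumptions:
 the stored kind plays the role of TREE_CODE, all names (toy_kind,
 toy_base, kind_cast) are hypothetical, and only exact matches on leaf
 types are handled.

```cpp
// Hypothetical types; the kind field plays the role of TREE_CODE.
enum toy_kind { K_DECL, K_EXPR };

struct toy_base { toy_kind kind; };

struct toy_decl_node : toy_base
{
  static toy_kind kind_of () { return K_DECL; }
  int uid;
};

struct toy_expr_node : toy_base
{
  static toy_kind kind_of () { return K_EXPR; }
  int n_operands;
};

// Looks like dynamic_cast at the call site, but checks the stored kind
// instead of RTTI, so it needs no virtual functions.  Returns a null
// pointer when P is null or the kinds do not match.
template <typename T>
T *
kind_cast (toy_base *p)
{
  if (p && p->kind == T::kind_of ())
    return static_cast<T *> (p);
  return 0;
}
```

 A call such as kind_cast<toy_decl_node> (p) then reads much like the
 dynamic_cast form, while the classes stay non-virtual.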


A template function may not be necessary.
And a virtual method is not necessary either.
Plain C functions can work if you have the type info for a given TREE_CODE.

 Shall we enable RTTI?
Are you prepared to remove all uses of TREE_CODE?



-- 
Chiheng Xu


[Patch, Fortran, OOP] PR 49591/41951 (Multiple identical specific procedures in type-bound operator not detected)

2012-06-27 Thread Janus Weil
Hi all,

here is a patch related to type-bound operators, which fixes both PR
49591 and parts of PR 41951 (namely comment #12). The central piece of
the patch is the resolve.c part. This adds all type-bound operator
routines also to the non-typebound operator list (i.e. ns->op). In
this way, duplicates and ambiguities will be found also for cases
where
1) operators are bound to different types or
2) one operator is bound to a type and one is not.

We have to be careful to do this only in the original namespace (where
the type is defined), and not in namespaces where the type is
use-associated (otherwise operators will be added twice). Also we can
not do it for private operators and deferred ones (which is the reason
for not throwing an error on PR 41951 comment #11 and friends).

The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus


2012-06-27  Janus Weil  ja...@gcc.gnu.org

PR fortran/41951
PR fortran/49591
* interface.c (check_new_interface): Rename, add 'loc' argument,
make non-static.
(gfc_add_interface): Rename 'check_new_interface'
* gfortran.h (gfc_check_new_interface): Add prototype.
* resolve.c (resolve_typebound_intrinsic_op): Add typebound operator
targets to non-typebound operator list.


2012-06-27  Janus Weil  ja...@gcc.gnu.org

PR fortran/41951
PR fortran/49591
* gfortran.dg/typebound_operator_16.f03: New.


pr41951_49591.diff
Description: Binary data


typebound_operator_16.f03
Description: Binary data


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Richard Guenther
On Wed, Jun 27, 2012 at 2:35 PM, Chiheng Xu chiheng...@gmail.com wrote:
 On Wed, Jun 27, 2012 at 2:19 AM, Lawrence Crowl cr...@google.com wrote:
 On 6/26/12, Jason Merrill ja...@redhat.com wrote:
 On 06/25/2012 06:26 PM, Lawrence Crowl wrote:
  +or <code>gcc_unreachable</code>.  If the checks are expensive or the
  +compiler can reasonably carry on after the error, they may be
  +conditioned on <code>--enable-checking</code>.</p>

 by using <code>gcc_checking_assert</code>

 I inserted that suggestion, but note that I did not create that text,
 only moved it.

 [Rationale]
  +FIXME: Discussion of deleting inappropraite special members.

 Is this FIXME still needed?  The text following it seems to cover the
 issue well enough.

 No longer needed.

  +However, by default, RTTI is not permitted
  +and the compiler must build cleanly with <code>-fno-rtti</code>.

 This seems like an unnecessary restriction to me.

  +Disabling RTTI will save space in the compiler.

 This is a fine reason to disable RTTI if it isn't used, but doesn't
 seem like a strong reason to prohibit using it.  The information is
 only emitted for classes with virtual functions, isn't very large,
 is typically emitted in only one object file, and since it lives
 in .rodata it can be shared between multiple compiler processes.

  +Checking the type of a class at runtime usually indicates a design 
  problem.

 The tree_contains_struct machinery recently added to GCC maps
 directly onto dynamic_cast;

 if (CODE_CONTAINS_STRUCT(TREE_CODE(t),TS_DECL_WRTL))
   /* do something with t->decl_with_rtl */

 translates roughly to

 if (decl_with_rtl *p = dynamic_cast <decl_with_rtl *>(t))
   /* do something with p */

 When new interfaces are added partway down an inheritance tree,
 dynamic_cast is the right way to access them.  This isn't checking
 what the type is, it's checking whether the object supports a
 particular method.

 The fact that we've gone to the trouble to implement this
 functionality in C suggests to me that using it isn't always
 indicative of a design problem.  :)

 Personally, I am fine with using dynamic_cast.  I was writing up what
 I thought was a consensus on the wiki.  I can live without it, or I
 can use it profitably.  Whatever you all decide, I will write up.


 dynamic_cast use RTTI, while TREE_CODE are poor man's type info. RTTI
 is better than TREE_CODE.

RTTI requires more space than TREE_CODE, so it's not universally better.

 But, If you decide to use RTTI,  TREE_CODE
 become redundant, that means all use of TREE_CODE should be removed,
 sooner or later. Are you prepared for that ?

 If every types in the type hierarchy, not just the root type and
 leaves types,  has a TREE_CODE,  and you maintain a database of
 inheritance relationship and other info of types of different
 TREE_CODEs,  then,  for a object of a given TREE_CODE,  you know its
 TREE_CODE and type info,   you know whether or not the object can be
 cast to another type,  and,  you can know other info about the type,
 like the size of object of the type, string name of the type.

 This can be implemented as below. For every type in the type
 hierarchy, static define a object of meta type( meta class, or class
 class) to describe the type info of the specific type for a given
 TREE_CODE. All of the meta type objects of different TREE_CODEs are
 organized as an array, which is indexed by the value of TREE_CODE. And
 you provide a set of C functions , that judge whether a object that
 has a given TREE_CODE can be cast to a type that has another TREE_CODE
 and retrieve the type info of a object that has a given TREE_CODE.

We already have all (or most) of this.

  +If you need to know the type of a class for some other reason,
  +use an enum or a virtual member function
  +that converts a pointer to the more derived class.
  +For example,
  +</p>
  +
  +<blockquote><pre><code>
  +common_type *p = ;
  +if (specific_type *q = p-&gt;to_specific ()) {
  +  // We have and can use a specific_type pointed to by q.
  +}
  +</code></pre></blockquote>

 This is basically equivalent to dynamic_cast except that you need
 to clutter up your root base class with one virtual function for
 each derived class you want to be able to convert to.

 Not quite equivalent, as you can implement to_specific with
 non-virtual classes using the TREE_CODE.  Thus we could implement
 a type-safe dynamic pointer converter before converting to virtual
 classes.

 OTOH, we could probably work up a template function that looks like
 dynamic_cast but uses the TREE_CODE instead, achieving the same
 intermediate step.

 I agree that it does clutter up the base class.


 Template function may be not necessary.
 And, virtual method is not necessary.
 Just normal C functions can work if you have the type info of a given 
 TREE_CODE.

 Shall we enable RTTI?
 Are you prepared to remove all use of TREE_CODE ?

To answer, no - we should not enable RTTI (nor exceptions).

Richard.


Commit: RX: Fix comparesi3_extend pattern

2012-06-27 Thread Nick Clifton
Hi Guys,

  I am checking in the patch below to the mainline and 4.7 branch
  sources to fix a typo in the comparesi3_extend patterns in the rx.md
  file.  Operand 0 is an input operand but it had an = modifier applied
  to it.  This confused gcc's internals and resulted in several ICEs in
  the gcc testsuite.

Cheers
  Nick

gcc/ChangeLog
2012-06-27  Nick Clifton  ni...@redhat.com

* config/rx/rx.md (comparesi3_extend): Remove = modifier from
input operand.

Index: gcc/config/rx/rx.md
===
--- gcc/config/rx/rx.md (revision 189013)
+++ gcc/config/rx/rx.md (working copy)
@@ -1868,7 +1868,7 @@
 
 (define_insn "comparesi3_<extend_types:code><small_int_modes:mode>"
   [(set (reg:CC CC_REG)
-       (compare:CC (match_operand:SI 0 "register_operand" "=r")
+       (compare:CC (match_operand:SI 0 "register_operand" "r")
                    (extend_types:SI (match_operand:small_int_modes 1 "rx_restricted_mem_operand" "Q"))))]
   "(optimize < 3 || optimize_size)"
   "cmp\t%<extend_types:letter>1, %0"


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Chiheng Xu
On Tue, Jun 19, 2012 at 6:28 AM, Lawrence Crowl cr...@google.com wrote:
  <p>Function prototypes for extern functions should only occur in
  header files.  Functions should be ordered within source files to
  minimize the number of function prototypes, by defining them before
 @@ -121,13 +208,13 @@
  necessary, to break mutually recursive cycles.</p>


If you always put entry functions at the bottom of a source file and,
more generally, always put upper-layer functions below lower-layer
functions, then there will probably be no need for function prototypes
in a source file.
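
As a small illustration of this ordering (all function names here are
made up for the example), helpers come first and the entry point last;
only the mutually recursive pair still needs a forward declaration,
which is the case the quoted text allows for:

```cpp
// Lower-layer helpers come first, so no separate prototypes are needed.
static int
square (int x)
{
  return x * x;
}

static int
sum_of_squares (int a, int b)
{
  return square (a) + square (b);
}

// Mutual recursion is the case that still needs one forward declaration.
static bool is_odd (unsigned n);

static bool
is_even (unsigned n)
{
  return n == 0 ? true : is_odd (n - 1);
}

static bool
is_odd (unsigned n)
{
  return n == 0 ? false : is_even (n - 1);
}

// The entry point of the file goes last.  Returns 0 if the sum of the
// squares of A and B is even, and 1 otherwise.
int
entry (int a, int b)
{
  return is_even (sum_of_squares (a, b)) ? 0 : 1;
}
```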


 +<h4><a name="Namespace_Use">Namespaces</a></h4>
 +
 +<p>
 +Namespaces are encouraged.
 +All separable libraries should have a unique global namespace.
 +All individual tools should have a unique global namespace.
 +Nested include directories names should map to nested namespaces when
 possible.
 +</p>

Is there a consensus on the use of namespaces?

 +
 +<p>
 +Header files should have neither <code>using</code> directives
 +nor namespace-scope <code>using</code> declarations.
 +</p>
 +
 +<p>
 +There is no alternative to <code>using</code> declarations
 +in class definitions to manage names within an inheritance hierarchy,
 +so they are necessarily permitted.
 +</p>


 +<h4><a name="Exceptions">Exceptions</a></h4>
 +
 +<p>
 +Exceptions and throw specifications are not permitted
 +and the compiler must build cleanly with <code>-fno-exceptions</code>.
 +</p>
 +
 +<p>
 +<a href="codingrationale.html#exceptions">Rationale and Discussion</a>
 +</p>
 +
 +
 +<h4><a name="Standard_Library">The Standard Library</a></h4>
 +
 +<p>
 +Use of the standard library is permitted.
 +Note, however, that it is currently not usable with garbage collected
 data.
 +</p>
 +
 +<p>
 +For compiler messages, indeed any text that needs i18n,
 +should continue to use the existing facilities.
 +</p>
 +
 +<p>
 +For long-term code, at least for now,
 +we will continue to use <code>printf</code> style I/O
 +rather than <code>&lt;iostream&gt;</code> style I/O.
 +For quick debugging code,
 +<code>&lt;iostream&gt;</code> is permitted.
 +</p>

Is iostream really suitable or necessary for GCC?
Have you thought about writing another, thinner interface, like Java's I/O streams?


-- 
Chiheng Xu


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Martin Jambor
Hi,

On Tue, Jun 26, 2012 at 11:06:15AM -0700, Lawrence Crowl wrote:
 On 6/26/12, Martin Jambor mjam...@suse.cz wrote:
  On Mon, Jun 25, 2012 at 03:26:01PM -0700, Lawrence Crowl wrote:
I have no idea.  I don't use emacs.  The two-space rule for
members comes from the wiki.  The one-space rule for protection
labels is common practice.  If folks want something else,
changes are fine with me.
 
  I'll also need an emacs C++ indentation style that conforms to
  this in order to really be able to produce complying code myself.
  So if anybody else will be working on that, I'm interested (to
  use it and perhaps help crafting it) and I guess a number of
  other people on this list are too...
 
 Alternatively, one could change the conventions to match an emacs
 style.  Either is fine with me, as long as the style is reasonable.

That would be very nice :-) 
Of course, if many people do not find it reasonable, I'm sure there is
an easy way to tweak it.

...

   +but think twice before using it in code
   +intended to last a long time.
 
  I think all committed code should be expected to have long-lasting
  quality.  I would not encourage people to think otherwise and would
  drop the long time reference here.  If anybody ever commits
  something ugly to bridge a short time period, it should only be done
  under the maintainers grant exceptions rule anyway.
 
   +</p> +<p>
   +For long-term code, at least for now,
   +we will continue to use <code>printf</code> style I/O
   +rather than <code>&lt;iostream&gt;</code> style I/O.
   +For quick debugging code,
   +<code>&lt;iostream&gt;</code> is permitted.
   +</p>
 
  Similarly here, no quick and dirty debugging output should ever be
  committed, we should not
 
   +<h4><a name="stdlib">The Standard Library</a></h4>
   +
   +<p>
   +At present, C++ provides no great advantage for i18n.
   +GCC does type checking for <code>printf</code> arguments,
   +so the type insecurity of <code>printf</code> is moot,
   +but the clarity in layout persists.
   +For quick debugging output, &lt;iostream&gt; requires less work.
   +</p>
 
  The same applies here.
 
 The value of these changes depends on when the rules are enforced.
 If they are enforced only on trunk, then the changes seem fine
 to me.  However, if they are enforced on branches, then they could
 unnecessarily slow down development.
 
 Comments?

I think that if you have a private branch, you are basically its
maintainer and can grant yourself any exception from any rule you
want.  Of course, that might make your life harder if you later want
to contribute the changes to the trunk, release branches, other
people's branches and generally anywhere.

Thanks,

Martin


[patch c-family]: Fix for PR 37215

2012-06-27 Thread Kai Tietz
Hello,

this patch fixes an ICE on valid code in the preprocessor, as described in PR 37215.
ChangeLog

2012-06-27  Kai Tietz

PR preprocessor/37215
* c-ppoutput.c (preprocess_file): Check for none-empty buffer.

Tested for x86_64-unknown-linux-gnu, and i686-pc-cygwin.  Ok for apply?

Regards,
Kai

Index: c-family/c-ppoutput.c
===
--- c-family/c-ppoutput.c   (revision 183106)
+++ c-family/c-ppoutput.c   (working copy)
@@ -86,7 +86,7 @@
 {
   /* A successful cpp_read_main_file guarantees that we can call
  cpp_scan_nooutput or cpp_get_token next.  */
-  if (flag_no_output)
+  if (flag_no_output && pfile->buffer)
     {
       /* Scan -included buffers, then the main file.  */
       while (pfile->buffer->prev)


Re: [patch c-family]: Fix for PR 37215

2012-06-27 Thread Joseph S. Myers
On Wed, 27 Jun 2012, Kai Tietz wrote:

 2012-06-27  Kai Tietz
 
 PR preprocessor/37215
 * c-ppoutput.c (preprocess_file): Check for none-empty buffer.

nonempty

 Tested for x86_64-unknown-linux-gnu, and i686-pc-cygwin.  Ok for apply?

OK with the ChangeLog typo fixed.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Jason Merrill

On 06/27/2012 08:35 AM, Chiheng Xu wrote:

dynamic_cast use RTTI, while TREE_CODE are poor man's type info. RTTI
is better than TREE_CODE. But, If you decide to use RTTI,  TREE_CODE
become redundant, that means all use of TREE_CODE should be removed,
sooner or later. Are you prepared for that ?


I wasn't suggesting we would change trees to use inheritance in the 
foreseeable future; my point was that RTTI is used in patterns like what 
we already do with trees, so I don't think that using it indicates a 
design problem.


If we were to change trees to use inheritance and virtual functions, 
which seems unlikely to me, then I think it would make sense to use RTTI 
instead of TREE_CODE.


On 06/27/2012 09:02 AM, Richard Guenther wrote:

RTTI requires more space than TREE_CODE, so it's not universally better.


RTTI requires no additional space in each object, and no space at all 
for classes with no virtual functions.
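
A small sketch of what this means in practice, assuming a common ABI
such as the Itanium C++ ABI that GCC follows (the class names are made
up for illustration).  A class without virtual functions carries no
hidden pointer at all, and a polymorphic class carries only the vtable
pointer it needs anyway for virtual calls; as far as I know the type
information is found through that same pointer, so enabling RTTI adds
no per-object field.

```cpp
struct plain { int x; };          // no virtual functions, no hidden pointer

struct shape                      // polymorphic base class
{
  virtual ~shape () {}
  virtual int sides () const { return 0; }
};

struct square_shape : shape
{
  virtual int sides () const { return 4; }
};

// Returns a pointer to the derived object if S really is a square_shape,
// otherwise a null pointer.  This relies on RTTI being enabled.
static const square_shape *
as_square (const shape *s)
{
  return dynamic_cast<const square_shape *> (s);
}
```

On such an ABI sizeof (plain) equals sizeof (int) and sizeof (shape)
equals sizeof (void *); neither figure is guaranteed by the C++
standard itself.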



To answer, no - we should not enable RTTI (nor exceptions).


The problem with exceptions is that the compiler is not exception-safe. 
 I don't think there's a good reason to prohibit RTTI.  As I said in 
another message, I don't mind turning it off for now, but I don't think 
doing so has much benefit either, since it only affects classes with 
virtual functions.


Jason


Re: [PATCH] Move Graphite to upstream cloog 0.17.0

2012-06-27 Thread Diego Novillo
On Fri, Jun 22, 2012 at 9:16 AM, Richard Guenther rguent...@suse.de wrote:

 This bumps the requirement to enable Graphite to using cloog 0.17.0
 which is the last release from upstream.  The patch removes the
 support for the legacy cloog versions, too.

 I am bootstrapping and testing this now with cloog 0.17.0 built
 against the upstream ISL 0.10 version.

 If this ends up being approved I will put the cloog 0.17.0 tarball
 in the infrastructure directory.

 Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

 Ok for trunk (for the build parts)?

The build parts look fine.


Diego.


Re: [RFC C++ / PR51033 ] Handle __builtin_shuffle in constexpr properly in the C++ frontend.

2012-06-27 Thread Ramana Radhakrishnan
On 25 June 2012 04:32, Jason Merrill ja...@redhat.com wrote:
 On 06/18/2012 09:04 AM, Ramana Radhakrishnan wrote:

 +  location_t loc = EXPR_LOC_OR_HERE (t);


 We should only use EXPR_LOC_OR_HERE for diagnostics.  For a location to use
 in building other expressions, use EXPR_LOCATION.

Thanks for the review. I've made that change and committed the following patch.

Ramana



 OK with that change.

 Jason



committed.patch
Description: Binary data


Re: [Patch, Fortran, OOP] PR 49591/41951 (Multiple identical specific procedures in type-bound operator not detected)

2012-06-27 Thread Tobias Burnus

Hi Janus,

On 06/27/2012 02:42 PM, Janus Weil wrote:

here is a patch related to type-bound operators, which fixes both PR
49591 and parts of PR 41951 (namely comment #12). The central piece of
the patch is the resolve.c part. This adds all type-bound operator
routines also to the non-typebound operator list (i.e. ns->op). In
this way, duplicates and ambiguities will be found also for cases
where
1) operators are bound to different types or
2) one operator is bound to a type and one is not.

We have to be careful to do this only in the original namespace (where
the type is defined), and not in namespaces where the type is
use-associated (otherwise operators will be added twice). Also we can
not do it for private operators and deferred ones (which is the reason
for not throwing an error on PR 41951 comment #11 and friends).


Can you add a note to the PR after committal?

Regarding the test case: Please also refer to interpretation request 
F03/0018.


Sorry for being a bean counter, but in gfc_add_interface the lines are 
too long: Before 84 now 107 characters.


Otherwise, the patch looks OK. Thanks for going through the list of OOP 
bugs.


Tobias


The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus


2012-06-27  Janus Weil  ja...@gcc.gnu.org

PR fortran/41951
PR fortran/49591
* interface.c (check_new_interface): Rename, add 'loc' argument,
make non-static.
(gfc_add_interface): Rename 'check_new_interface'
* gfortran.h (gfc_check_new_interface): Add prototype.
* resolve.c (resolve_typebound_intrinsic_op): Add typebound operator
targets to non-typebound operator list.


2012-06-27  Janus Weil  ja...@gcc.gnu.org

PR fortran/41951
PR fortran/49591
* gfortran.dg/typebound_operator_16.f03: New.





[PATCH] Add generic vector lowering for integer division and modulus (PR tree-optimization/53645)

2012-06-27 Thread Jakub Jelinek
Hi!

This patch makes veclower2 attempt to emit integer division/modulus of
vectors by constants using vector multiplication, shifts or masking.

It is somewhat similar to the vect_recog_divmod_pattern, but it needs
to analyze everything first, see if all divisions or modulos are doable
using the same sequence of vector insns, and then emit vector insns
as opposed to the scalar ones the pattern recognizer adds.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The testcase was additionally eyeballed for -mavx2, which unlike -mavx
has vector >> vector shifts.
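
For readers unfamiliar with the technique, here is a scalar sketch of
the arithmetic involved, not the vector lowering itself.  The divisors
8 and 3, the 32-bit width and the function names are choices made for
this example only; the constant 0xAAAAAAAB is ceil(2^33 / 3).

```cpp
#include <stdint.h>

// x / 8 and x % 8 for unsigned x: a shift and a mask.
static uint32_t div_by_8 (uint32_t x) { return x >> 3; }
static uint32_t mod_by_8 (uint32_t x) { return x & 7u; }

// x / 3 for any 32-bit unsigned x: multiply by ceil(2^33 / 3) in
// 64 bits and shift the product right by 33.
static uint32_t
div_by_3 (uint32_t x)
{
  return (uint32_t) (((uint64_t) x * (uint64_t) 0xAAAAAAABu) >> 33);
}

// x % 3 follows from the quotient.
static uint32_t
mod_by_3 (uint32_t x)
{
  return x - 3u * div_by_3 (x);
}
```

The vector case has the extra constraint described above: every element
has to be expressible with the same instruction sequence, with only the
constants differing per element.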

2012-06-27  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/53645
* tree-vect-generic.c (add_rshift): New function.
(expand_vector_divmod): New function.
(expand_vector_operation): Use it for vector integer
TRUNC_{DIV,MOD}_EXPR by VECTOR_CST.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Replace
unused lguup variable with dummy_int.

* gcc.c-torture/execute/pr53645.c: New test.

--- gcc/tree-vect-generic.c.jj  2012-06-26 10:00:42.935832834 +0200
+++ gcc/tree-vect-generic.c 2012-06-27 10:15:20.534103045 +0200
@@ -391,6 +391,515 @@ expand_vector_comparison (gimple_stmt_it
   return t;
 }
 
+/* Helper function of expand_vector_divmod.  Gimplify a RSHIFT_EXPR in type
+   of OP0 with shift counts in SHIFTCNTS array and return the temporary holding
+   the result if successful, otherwise return NULL_TREE.  */
+static tree
+add_rshift (gimple_stmt_iterator *gsi, tree type, tree op0, int *shiftcnts)
+{
+  optab op;
+  unsigned int i, nunits = TYPE_VECTOR_SUBPARTS (type);
+  bool scalar_shift = true;
+
+  for (i = 1; i < nunits; i++)
+    {
+      if (shiftcnts[i] != shiftcnts[0])
+        scalar_shift = false;
+    }
+
+  if (scalar_shift && shiftcnts[0] == 0)
+    return op0;
+
+  if (scalar_shift)
+    {
+      op = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);
+      if (op != NULL
+          && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+        return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
+                                build_int_cst (NULL_TREE, shiftcnts[0]));
+    }
+
+  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
+  if (op != NULL
+      && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+    {
+      tree *vec = XALLOCAVEC (tree, nunits);
+      for (i = 0; i < nunits; i++)
+        vec[i] = build_int_cst (TREE_TYPE (type), shiftcnts[i]);
+      return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
+                              build_vector (type, vec));
+    }
+
+  return NULL_TREE;
+}
+
+/* Try to expand integer vector division by constant using
+   widening multiply, shifts and additions.  */
+static tree
+expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
+ tree op1, enum tree_code code)
+{
+  bool use_pow2 = true;
+  bool has_vector_shift = true;
+  int mode = -1, this_mode;
+  int pre_shift = -1, post_shift;
+  unsigned int nunits = TYPE_VECTOR_SUBPARTS (type);
+  int *shifts = XALLOCAVEC (int, nunits * 4);
+  int *pre_shifts = shifts + nunits;
+  int *post_shifts = pre_shifts + nunits;
+  int *shift_temps = post_shifts + nunits;
+  unsigned HOST_WIDE_INT *mulc = XALLOCAVEC (unsigned HOST_WIDE_INT, nunits);
+  int prec = TYPE_PRECISION (TREE_TYPE (type));
+  int dummy_int;
+  unsigned int i, unsignedp = TYPE_UNSIGNED (TREE_TYPE (type));
+  unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type)));
+  optab op;
+  tree *vec;
+  unsigned char *sel;
+  tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem;
+
+  if (prec > HOST_BITS_PER_WIDE_INT)
+    return NULL_TREE;
+
+  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
+  if (op == NULL
+      || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+    has_vector_shift = false;
+
+  /* Analysis phase.  Determine if all op1 elements are either power
+ of two and it is possible to expand it using shifts (or for remainder
+ using masking).  Additionally compute the multiplicative constants
+ and pre and post shifts if the division is to be expanded using
+ widening or high part multiplication plus shifts.  */
+  for (i = 0; i < nunits; i++)
+    {
+      tree cst = VECTOR_CST_ELT (op1, i);
+      unsigned HOST_WIDE_INT ml;
+
+      if (!host_integerp (cst, unsignedp) || integer_zerop (cst))
+        return NULL_TREE;
+      pre_shifts[i] = 0;
+      post_shifts[i] = 0;
+      mulc[i] = 0;
+      if (use_pow2
+          && (!integer_pow2p (cst) || tree_int_cst_sgn (cst) != 1))
+        use_pow2 = false;
+      if (use_pow2)
+        {
+          shifts[i] = tree_log2 (cst);
+          if (shifts[i] != shifts[0]
+              && code == TRUNC_DIV_EXPR
+              && !has_vector_shift)
+            use_pow2 = false;
+        }
+      if (mode == -2)
+        continue;
+      if (unsignedp)
+        {
+          unsigned HOST_WIDE_INT mh;
+ unsigned HOST_WIDE_INT d = 

Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-27 Thread Iain Buclaw
On 19 June 2012 17:20, Joseph S. Myers jos...@codesourcery.com wrote:
 On Mon, 18 Jun 2012, Iain Buclaw wrote:

 [PATCH 1/4]:
 The D compiler frontend
  -  gcc/d

 Only selectively reviewed, but here are some comments:

 diff -Naur gcc-4.8-20120617/gcc/d/asmstmt.cc gcc-4.8/gcc/d/asmstmt.cc
 --- gcc-4.8-20120617/gcc/d/asmstmt.cc   1970-01-01 01:00:00.0 +0100
 +++ gcc-4.8/gcc/d/asmstmt.cc    2012-06-05 13:42:09.044876794 +0100
 @@ -0,0 +1,2731 @@
 +// asmstmt.cc -- D frontend for GCC.
 +// Originally contributed by David Friedman
 +// Maintained by Iain Buclaw
 +
 +// GCC is free software; you can redistribute it and/or modify it under

 Every file more than ten lines long needs a copyright notice as well as
 the license notice.  See
 http://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html for
 instructions, including the case of multiple copyright holders - though if
 there are any significant (more than fifteen lines of copyrightable text
 or so) contributors not assigning copyright to the FSF then special
 approval from the FSF will be needed to include the front end.

 I would say that the files in dfrontend/ need copyright and license
 notices as well, though not necessarily in exactly GNU form.  Thus, you
 will need to get Digital Mars to approve appropriate notices for those
 files (aav.c is the first I see that's lacking such a notice but is long
 enough to need one; likewise async.c, gnuc.c, speller.c; rmem.c just says
 All Rights Reserved and needs a proper license notice like other files;
 likewise rmem.h).


I have raised this with Walter, and the licensing has been fixed in
all frontend code.


 +#ifdef TARGET_80387
 +#include "d-asm-i386.h"
 +#else
 +#define D_NO_INLINE_ASM_AT_ALL
 +#endif

 Ugh.  We want to move away from target macros, and this isn't even a
 proper target macro.  It would be better to define target hooks for the D
 inline asm support - possibly with a D-specific hook structure, like the C
 hooks structure.  (Even if you avoid needing copyright assignments for the
 front end itself, such hook implementations will probably need to be
 assigned.)


This code has been removed entirely.

 +/* Apple GCC extends ASM_EXPR to five operands; cannot use build4. */

 I don't see why that should be in the least relevant to a contribution to
 FSF GCC.  If you can do things in a more natural way in FSF GCC, then do
 so.


Now use build5, similar to other frontends.


 Each function in the GCC-specific parts of the code should have a comment
 on it, explaining the semantics of the function, its operands and its
 return value if any.


Am working on this.

 For new code in GCC, it's better to use snprintf than sprintf.


Have fixed this. Thanks.
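
A minimal sketch of the snprintf point, for illustration only: the helper name format_label and its arguments are hypothetical and are not taken from the GDC sources. snprintf is given the destination size and returns the length the full string would have had, so truncation can be detected instead of overflowing the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "<prefix>_<n>" into BUF of SIZE bytes.
   Returns 1 if the whole string fit, 0 if it was truncated.  */
static int
format_label (char *buf, size_t size, const char *prefix, int n)
{
  int len;

  assert (buf != NULL && size > 0);
  /* snprintf writes at most SIZE bytes, including the terminating NUL,
     and returns the length the untruncated string would have had.  */
  len = snprintf (buf, size, "%s_%d", prefix, n);
  return len >= 0 && (size_t) len < size;
}
```

With sprintf the second call in a sequence such as format_label (buf, 8, "verylongprefix", 1) would write past the end of an 8-byte buffer; here it returns 0 and leaves a NUL-terminated prefix in place.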


 +extern void decode_options (struct gcc_options *, struct gcc_options *,

 Please use appropriate headers rather than local declarations of GCC
 functions.

 +// d-bi-attr.h -- D frontend for GCC.

 This file looks like it's largely copied from elsewhere in GCC.  In such a
 case, please work out a better way to refactor the code so that it can be
 shared rather than duplicated.  (Again, such common code will no doubt
 need full copyright assignments.)

 I don't know whether your assignment Assigns Past and Future Changes to
 the GNU D Compiler (GDC) covers changes elsewhere in GCC.  But I expect a
 general assignment for GCC to be needed for any refactoring involved in
 adapting common code for use in D.  (And such refactoring would be a new
 contribution so there shouldn't be any issues with unknown previous
 contributors without assignments - those would only arise if significant
 amounts of previously written D front-end code are being moved into common
 code.)


It's copied as including c-common.c / .h causes problems with a fair
number of references pulled in that need to be stubbed out - also,
some GCC function attributes that we use do not make any sense to have
in D code (eg: gnu_inline, artificial, cleanup).  It could certainly
be possible though ... Will need to review this in more detail.


 +#if D_VA_LIST_TYPE_VOIDPTR

 Please avoid #if conditionals on anything that could be a target property.
 It's generally better to use if conditionals instead of #if, so that all
 cases are checked for syntax in all compiles.

 I see #if conditions on defines such as V2 and V1 as well.  Unless
 something is an *existing* target macro or configure macro in GCC, use
 if conditions and ensure that the macro is defined to true or false
 values (rather than defined or not defined).  But if a macro is always
 defined, or never defined, then just avoiding the conditionals may be
 better.


Have removed this from the gdc glue.
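
For reference, a small sketch of the "if rather than #if" advice quoted above. The macro EXAMPLE_VA_LIST_IS_VOIDPTR and the function are hypothetical names made up for this illustration; the point is only that the macro is always defined (to 0 or 1) and tested with an ordinary if, so both branches are checked in every build.

```c
#include <assert.h>

/* Hypothetical configuration macro: always defined, to 0 or 1, rather
   than "defined or not defined".  */
#ifndef EXAMPLE_VA_LIST_IS_VOIDPTR
#define EXAMPLE_VA_LIST_IS_VOIDPTR 0
#endif

/* Both branches are parsed and type-checked in every compile; the
   compiler folds the constant condition and drops the dead branch.  */
static int
va_list_slot_size (int pointer_size, int struct_size)
{
  assert (pointer_size > 0 && struct_size > 0);
  if (EXAMPLE_VA_LIST_IS_VOIDPTR)
    return pointer_size;
  else
    return struct_size;
}
```

Compared with #if, a syntax or type error in the branch that is unused on the build host is still reported, which is the benefit described in the review comment.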

 The gcc/d/dfrontend/readme.txt says:

 +These sources are free, they are redistributable and modifiable
 +under the terms of the GNU General Public License (attached as gpl.txt),
 +or the Artistic License (attached as artistic.txt).

 But that license is GPLv2.  We need an explicit notice (approved by the
 copyright 

[committed] Fix up i386/sse4_1-pmuldq.c (and i386/avx-vpmuldq.c) testcase

2012-06-27 Thread Jakub Jelinek
Hi!

When testing SVN valgrind AVX support, I've discovered a bug in
this testcase, which has been initializing only half of the array and
performed testing on the second half of the array with uninitialized random
stack data.

Tested on x86_64-linux and i686-linux, committed to trunk.

2012-06-27  Jakub Jelinek  ja...@redhat.com

* gcc.target/i386/sse4_1-pmuldq.c (TEST): Initialize
even src1.i and src2.i fields even in the second half of the arrays.

--- gcc/testsuite/gcc.target/i386/sse4_1-pmuldq.c.jj	2008-10-23 13:20:53.0 +0200
+++ gcc/testsuite/gcc.target/i386/sse4_1-pmuldq.c	2012-06-26 13:06:06.890051299 +0200
@@ -32,7 +32,7 @@ TEST (void)
   int i, sign = 1;
   long long value;
 
-  for (i = 0; i < NUM; i += 2)
+  for (i = 0; i < NUM * 2; i += 2)
 {
   src1.i[i] = i * i * sign;
   src2.i[i] = (i + 20) * sign;

Jakub
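
The effect of the loop bound can be sketched in isolation. This is an illustration under assumptions, not the testcase itself: NUM is given a made-up value, the array stands in for the src1.i / src2.i fields, and the sign handling of the real test is left out.

```c
#include <assert.h>

#define NUM 8   /* hypothetical size; the testcase defines its own NUM */

/* Write the even-indexed elements of a 2*NUM-element array, stopping at
   LIMIT, and return how many elements were written.  */
static int
init_even (int *a, int limit)
{
  int i, written = 0;

  assert (limit <= NUM * 2);
  for (i = 0; i < limit; i += 2)
    {
      a[i] = i * i;
      written++;
    }
  return written;
}
```

With limit == NUM only the even elements in the first half of the array are written; with limit == NUM * 2 the even elements of the whole array are covered, which is what the fix achieves.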


[RFC, ARM] later split of symbol_refs

2012-06-27 Thread Dmitry Melnik

Hi,

We'd like to draw attention to CodeSourcery's patch for the ARM backend, from 
which GCC mainline can gain 4% on SPEC2K INT: 
http://cgit.openembedded.org/openembedded/plain/recipes/gcc/gcc-4.5/linaro/gcc-4.5-linaro-r99369.patch 
(the patch is also attached).


Originally, we noticed that GNU Go runs 6% faster on cortex-a8 with 
-fno-gcse.  After profiling we found that this is most likely caused by 
cache misses when accessing global variables.  GCC generates ldr 
instructions for them, while this can be avoided by emitting a movt/movw 
pair in such cases.  The RTL expressions for these instructions are high 
and lo_sum.  Currently, symbol_ref expands as high and lo_sum, but then 
cprop1 decides that this is redundant and merges them into one load insn.
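
As an illustration of what the movw/movt pair computes (a plain C model, not compiler code; the function names are made up for this sketch): movw writes a 16-bit immediate into the low half of the register and clears the top half, and movt then replaces the top 16 bits while keeping the low half. Together they materialize a 32-bit address without a load from memory.

```c
#include <assert.h>
#include <stdint.h>

/* Model of movw: low 16 bits set, top half cleared.  */
static uint32_t
movw (uint16_t imm16)
{
  return imm16;
}

/* Model of movt: top 16 bits replaced, low half preserved.  */
static uint32_t
movt (uint32_t reg, uint16_t imm16)
{
  return (reg & 0xffffu) | ((uint32_t) imm16 << 16);
}

/* Build a 32-bit address from its two halves, mirroring the
   high/lo_sum split of a symbol_ref.  */
static uint32_t
load_address (uint32_t addr)
{
  uint32_t reg = movw ((uint16_t) (addr & 0xffffu));

  reg = movt (reg, (uint16_t) (addr >> 16));
  assert (reg == addr);
  return reg;
}
```

The ldr alternative fetches the same 32-bit value from a literal pool, which is where the data-cache misses mentioned above can come from.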


The problem was also found by the Linaro community: 
https://bugs.launchpad.net/gcc-linaro/+bug/886124 .
There is also a patch from CodeSourcery (attached), which was ported to 
linaro gcc 4.5, but is missing in later linaro releases.
This patch splits symbol_refs at a later stage (after cprop), 
instead of generating movt/movw at expand.


It fixed our test case on GNU Go.  Also we tested it on SPEC2K INT (ref) 
with GCC 4.8 snapshot from May 12, 2012 on cortex-a9 with -O2 and -mthumb:


                  Base      Base    Base      Peak      Peak    Peak
Benchmarks    Ref Time  Run Time   Ratio  Ref Time  Run Time   Ratio   Change
------------  --------  --------  ------  --------  --------  ------  -------
164.gzip          1400       492     284      1400       497     282   -0.70%
175.vpr           1400       433     323      1400       458     306   -5.26%
176.gcc           1100       203     542      1100       198     557    2.77%
181.mcf           1800       529     340      1800       528     341    0.29%
186.crafty        1000       261     383      1000       256     391    2.09%
197.parser        1800       709     254      1800       701     257    1.18%
252.eon           1300       219     594      1300       202     644    8.42%
253.perlbmk       1800       389     463      1800       367     490    5.83%
254.gap           1100       259     425      1100       236     467    9.88%
255.vortex        1900       498     382      1900       442     430   12.57%
256.bzip2         1500       452     332      1500       424     354    6.63%
300.twolf         3000       916     328      3000       853     352    7.32%
SPECint_base2000                     376
SPECint2000                                                      391    3.99%


SPEC2K INT grows by 4% (up to 12.5% on vortex; vpr slowdown is likely 
because of big variance on this test).
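
For readers checking the last column of the table: it is the relative change from the Base ratio to the Peak ratio. A small sketch of that arithmetic (the function name is made up for this illustration):

```c
#include <assert.h>

/* Percentage change from the BASE ratio to the PEAK ratio.  */
static double
ratio_change_percent (double base, double peak)
{
  assert (base > 0.0);
  return (peak - base) / base * 100.0;
}
```

For example, vortex goes from 382 to 430, i.e. (430 - 382) / 382 = 12.57%, and the overall score goes from 376 to 391, i.e. 3.99%.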


Similarly, there are gains of 3-4% without -mthumb on cortex-a9 and on 
cortex-a8 (thumb2 and ARM modes).


This patch can be applied to current trunk and passes regtest 
successfully on qemu-arm.

Maybe it will be good to have it in trunk?
If everybody agrees, we can take care of committing it.

--
Best regards,
  Dmitry
2010-08-20  Jie Zhang  j...@codesourcery.com

	Merged from Sourcery G++ 4.4:

	gcc/
	2009-05-29  Julian Brown  jul...@codesourcery.com
	Merged from Sourcery G++ 4.3:
	* config/arm/arm.md (movsi): Don't split symbol refs here.
	(define_split): New.

 2010-08-18  Julian Brown  jul...@codesourcery.com
 
 	Issue #9222

=== modified file 'gcc/config/arm/arm.md'
--- old/gcc/config/arm/arm.md	2010-08-20 16:41:37 +
+++ new/gcc/config/arm/arm.md	2010-08-23 14:39:12 +
@@ -5150,14 +5150,6 @@
 			   optimize && can_create_pseudo_p ());
   DONE;
 }
-
-  if (TARGET_USE_MOVT && !target_word_relocations
-	  && GET_CODE (operands[1]) == SYMBOL_REF
-	  && !flag_pic && !arm_tls_referenced_p (operands[1]))
-	{
-	  arm_emit_movpair (operands[0], operands[1]);
-	  DONE;
-	}
 }
   else /* TARGET_THUMB1...  */
 {
@@ -5265,6 +5257,19 @@
   
 )
 
+(define_split
+  [(set (match_operand:SI 0 "arm_general_register_operand" "")
+	(match_operand:SI 1 "general_operand" ""))]
+  "TARGET_32BIT
+   && TARGET_USE_MOVT && GET_CODE (operands[1]) == SYMBOL_REF
+   && !flag_pic && !target_word_relocations
+   && !arm_tls_referenced_p (operands[1])"
+  [(clobber (const_int 0))]
+{
+  arm_emit_movpair (operands[0], operands[1]);
+  DONE;
+})
+
 (define_insn "*thumb1_movsi_insn"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=l,l,l,l,l,>,l, m,*lhk")
 	(match_operand:SI 1 "general_operand"      "l, I,J,K,>,l,mi,l,*lhk"))]



Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-27 Thread Iain Buclaw
On 19 June 2012 17:08, Steven Bosscher stevenb@gmail.com wrote:

 Many functions have no leading comment, and other GNU coding standard
 requirements are not followed either. Those should IMHO be fixed also,
 before this front end can be accepted.



To separate this from the other listed items: I am aware of certain
things that definitely need a spot check.  (NB: The coding
convention was originally K&R, then changed over to Allman about 2
years ago, and recently changed over to GNU-like via several vim
macros I wrote to carry out the quick re-format job).

As well as the GCC coding convention, I would also like to know which
parts of the C++ coding conventions on the wiki absolutely *must* be
followed to the letter.  It states at the top that it is only a
proposed set of coding conventions to be used when writing GCC in C++,
and I am well aware that in GDC I do not follow some items, such as
"All data members should have names which end with an underscore".

http://gcc.gnu.org/wiki/CppConventions


Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: [RFC, ARM] later split of symbol_refs

2012-06-27 Thread Steven Bosscher
On Wed, Jun 27, 2012 at 4:58 PM, Dmitry Melnik d...@ispras.ru wrote:
 This patch can be applied to current trunk and passes regtest successfully
 on qemu-arm.
 Maybe it will be good to have it in trunk?
 If everybody agrees, we can take care of committing it.

If the patch is approved, can you please add a brief comment before
the define_split to explain why it's there and what it does?

Ciao!
Steven


Re: [RFC] Tweak reload to const propagate into matching constraint output

2012-06-27 Thread Richard Henderson
On 06/27/2012 01:45 AM, Richard Guenther wrote:
  Of course, this fires for normal integer code as well.
  Some cases it's a clear win:
 
  -:  41 be 1f 00 00 00   mov    $0x1f,%r14d
  ...
  -:  4c 89 f1            mov    %r14,%rcx
  +:  b9 1f 00 00 00      mov    $0x1f,%ecx
 
  sometimes not (increased code size):
 
  -:  41 bd 01 00 00 00   mov    $0x1,%r13d
  -:  4d 89 ec            mov    %r13,%r12
  +:  41 bc 01 00 00 00   mov    $0x1,%r12d
  +:  41 bd 01 00 00 00   mov    $0x1,%r13d
 I suppose that might be fixed if instead of
 
 +  /* Only use the constant when it's just as cheap as a reg move.  */
 +  if (set_src_cost (c, optimize_function_for_speed_p (cfun)) == 0)
 +return c;
 
 you'd unconditionally use size costs?
 

For one, without x86 cost changes that wouldn't affect anything.
For another, unconditionally using size costs, locally, would then
exchange the missed optimization from the second case to the first.

 We have an inverse issue elsewhere in that we don't CSE a propagated constant
 but get
 
mov $0, %(eax)
mov $0, 4%(eax)
 ...
 
 instead of doing one register clearing and then re-using that as zero.  But I
 suppose reload is not exactly the place to fix that ;)

That would be exactly because x86 doesn't model immediate costs properly.

My patch is trying to un-cse in exactly the spot where the value is about
to be clobbered.  While we could give a go at this in a pre-reload pass,
it would be just a guess until register allocation does or does not
assign a hard reg to the constant, and does or does not choose an
alternative that requires the constant match an output.

Having reviewed more of the cc1 asm diff, the vast majority of cases are:
  * the cx input to string insns,
  * (1 << n).
These results are certainly skewed by the kind of stuff we do in gcc, but
it makes a fair amount of sense.


r~


Re: [RFC] Tweak reload to const propagate into matching constraint output

2012-06-27 Thread Bernd Schmidt
On 06/27/2012 10:45 AM, Richard Guenther wrote:
 On Wed, Jun 27, 2012 at 5:02 AM, Richard Henderson r...@redhat.com wrote:
 sometimes not (increased code size):

 -:  41 bd 01 00 00 00   mov    $0x1,%r13d
 -:  4d 89 ec            mov    %r13,%r12
 +:  41 bc 01 00 00 00   mov    $0x1,%r12d
 +:  41 bd 01 00 00 00   mov    $0x1,%r13d
 
 I suppose that might be fixed if instead of
 
 +  /* Only use the constant when it's just as cheap as a reg move.  */
 +  if (set_src_cost (c, optimize_function_for_speed_p (cfun)) == 0)
 +return c;
 
 you'd unconditionally use size costs?

I've added some code last year or so in rtl.h to operate on
full_rtx_costs, taking both into account (use the primary cost
comparison, and if that's equal, use the secondary). Would that work here?
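
A rough sketch of the two-level comparison described here. The struct and function names are hypothetical stand-ins written for this illustration and are not the rtl.h declarations: the primary metric decides, and the secondary metric is consulted only on a tie.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a speed/size cost pair; not the rtl.h type.  */
struct cost_pair
{
  int speed;
  int size;
};

/* Return true if A is cheaper than B.  The primary metric is speed when
   SPEED_P is true and size otherwise; the other metric breaks ties.  */
static bool
cost_pair_lt (const struct cost_pair *a, const struct cost_pair *b,
	      bool speed_p)
{
  assert (a != 0 && b != 0);
  if (speed_p)
    return a->speed < b->speed
	   || (a->speed == b->speed && a->size < b->size);
  return a->size < b->size
	 || (a->size == b->size && a->speed < b->speed);
}
```

Applied to the example in this thread, a 3-byte register move and a 6-byte immediate move with equal speed cost would compare as "register move is cheaper" even when optimizing for speed, because the size tie-break applies.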


Bernd


Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-27 Thread Joseph S. Myers
On Wed, 27 Jun 2012, Iain Buclaw wrote:

 It's copied as including c-common.c / .h causes problems with a fair
 number of references pulled in that need to be stubbed out - also,
 some GCC function attributes that we use do not make any sense to have
 in D code (eg: gnu_inline, artificial, cleanup).  It could certainly
 be possible though ... Will need to review this in more detail.

Quite possibly you need to split up c-common so that the parts that can 
also be shared with D are in separate files.

 The D frontend is completely independent of the GCC backend, and any
 alterations are purely for portability (eg, the use of real_t rather
 than long double for the representation of floats).   There is no

If for portability, I'd hope they wouldn't need to be conditional - 
rather, the common repository used for all the compilers using the 
dfrontend code should be able to have them, unconditionally, and another 
such compiler might have a typedef of real_t to long double if that's what 
that other compiler wishes to use.  Hopefully you can work with the people 
maintaining other such compilers so that there can genuinely be a shared, 
portable source base for the shared code, in a public repository used by 
all those maintainers, without conditionals based on which compiler it's 
used in, and with that shared source base only using an absolute minimum 
of headers from whatever compiler it's used in (so only minimal GCC 
headers when used in GCC, etc.).

 Likewise, have removed it as is in fact no longer required.   The
 optimize #undef remains for the time being as it conflicts with the
 name of a member in the D frontend sources.  If the D frontend
 followed the C++ Coding Conventions as outlined in
 gcc.gnu.org/wiki/CppConventions then this wouldn't be an issue.
 Though I don't think it has an obligation to being essentially
 disconnected from calling any GCC code.

But it ought to be possible to stop the shared D front end from including 
the relevant GCC headers at all, if it has a clean interface to the rest 
of GCC

 Have removed all alloca handling from GDC and replaced with simply
 including libiberty.h.

system.h includes libiberty.h, so direct inclusion of libiberty.h 
shouldn't be needed (unless you are trying to avoid using system.h in code 
shared with other D compilers).

 https://github.com/D-Programming-GDC/GDC/commits/master
 
 
 I do have a question though, what is available for the transition of
 development from git to svn?  Other than a lot of reading and getting
 used to the various switches and commands on my part.

Once there is a front end ready to commit and approved in technical and 
GNU policy terms, you'd just do svn add on the files to add them to 
trunk.  It's up to you how you handle keeping the dfrontend/ changes in 
sync with an external shared repository (with all changes going to the 
external repository first); Ian Taylor may have some automation for that 
issue for Go.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC, ARM] later split of symbol_refs

2012-06-27 Thread Julian Brown
On Wed, 27 Jun 2012 18:58:36 +0400
Dmitry Melnik d...@ispras.ru wrote:

 This patch can be applied to current trunk and passes regtest 
 successfully on qemu-arm.
 Maybe it will be good to have it in trunk?
 If everybody agrees, we can take care of committing it.

No objection from me (as the original author), FWIW.

Thanks!

Julian


Re: [RFC, ARM] later split of symbol_refs

2012-06-27 Thread Richard Earnshaw
On 27/06/12 15:58, Dmitry Melnik wrote:
 Hi,
 
 We'd like to note about CodeSourcery's patch for ARM backend, from which 
 GCC mainline can gain 4% on SPEC2K INT: 
 http://cgit.openembedded.org/openembedded/plain/recipes/gcc/gcc-4.5/linaro/gcc-4.5-linaro-r99369.patch
  
 (also the patch is attached).
 
 Originally, we noticed that GNU Go works 6% faster on cortex-a8 with 
 -fno-gcse.  After profiling we found that this is most likely caused by 
 cache misses when accessing global variables.  GCC generates ldr 
 instructions for them, while this can be avoided by emitting movt/movw 
 pair for such cases. RTL expressions for these instructions is high_ and 
 lo_sum.  Currently, symbol_ref expands as high_ and lo_sum but then 
 cprop1 decides that this is redundant and merges them into one load insn.
 
 The problem was also found by Linaro community: 
 https://bugs.launchpad.net/gcc-linaro/+bug/886124 .
 Also there is a patch from codesourcery (attached), which was ported to 
 linaro gcc 4.5, but is missing in later linaro releases.
 This patch makes split of symbol_refs at the later stage (after cprop), 
 instead of generating movt/movw at expand.
 
 It fixed our test case on GNU Go.  Also we tested it on SPEC2K INT (ref) 
 with GCC 4.8 snapshot from May 12, 2012 on cortex-a9 with -O2 and -mthumb:
 
                   Base      Base    Base      Peak      Peak    Peak
 Benchmarks    Ref Time  Run Time   Ratio  Ref Time  Run Time   Ratio   Change
 ------------  --------  --------  ------  --------  --------  ------  -------
 164.gzip          1400       492     284      1400       497     282   -0.70%
 175.vpr           1400       433     323      1400       458     306   -5.26%
 176.gcc           1100       203     542      1100       198     557    2.77%
 181.mcf           1800       529     340      1800       528     341    0.29%
 186.crafty        1000       261     383      1000       256     391    2.09%
 197.parser        1800       709     254      1800       701     257    1.18%
 252.eon           1300       219     594      1300       202     644    8.42%
 253.perlbmk       1800       389     463      1800       367     490    5.83%
 254.gap           1100       259     425      1100       236     467    9.88%
 255.vortex        1900       498     382      1900       442     430   12.57%
 256.bzip2         1500       452     332      1500       424     354    6.63%
 300.twolf         3000       916     328      3000       853     352    7.32%
 SPECint_base2000                     376
 SPECint2000                                                      391    3.99%
 
 
 SPEC2K INT grows by 4% (up to 12.5% on vortex; vpr slowdown is likely 
 because of big variance on this test).
 
 Similarly, there are gains of 3-4% without -mthumb on cortex-a9 and on 
 cortex-a8 (thumb2 and ARM modes).
 
 This patch can be applied to current trunk and passes regtest 
 successfully on qemu-arm.
 Maybe it will be good to have it in trunk?
 If everybody agrees, we can take care of committing it.
 
 --
 Best regards,
Dmitry
 
 
 gcc-4.5-linaro-r99369.patch
 

Please update the ChangeLog entry (it's not appropriate to mention
Sourcery G++) and add a comment as Steven has suggested.

Otherwise OK.

R.



Re: [RFC, ARM] later split of symbol_refs

2012-06-27 Thread Ramana Radhakrishnan
On 27 June 2012 15:58, Dmitry Melnik d...@ispras.ru wrote:
 Hi,

 We'd like to note about CodeSourcery's patch for ARM backend, from which GCC
 mainline can gain 4% on SPEC2K INT:
 http://cgit.openembedded.org/openembedded/plain/recipes/gcc/gcc-4.5/linaro/gcc-4.5-linaro-r99369.patch
 (also the patch is attached).

 Originally, we noticed that GNU Go works 6% faster on cortex-a8 with
 -fno-gcse.  After profiling we found that this is most likely caused by
 cache misses when accessing global variables.  GCC generates ldr
 instructions for them, while this can be avoided by emitting movt/movw pair
 for such cases. RTL expressions for these instructions is high_ and lo_sum.
  Currently, symbol_ref expands as high_ and lo_sum but then cprop1 decides
 that this is redundant and merges them into one load insn.

 The problem was also found by Linaro community:
 https://bugs.launchpad.net/gcc-linaro/+bug/886124 .

The reason IIRC this isn't in our later releases is that it wasn't
thought beneficial enough to upstream. Now you've got some evidence to
the contrary.

 Also there is a patch from codesourcery (attached), which was ported to
 linaro gcc 4.5, but is missing in later linaro releases.
 This patch makes split of symbol_refs at the later stage (after cprop),
 instead of generating movt/movw at expand.

I must admit that I had been suggesting to Zhenqiang about turning
this off by tightening the movsi_insn predicates rather than adding a
split, but given that it appears to produce enough benefit in this
case I don't have any reasons to object ...

However it's interesting that this doesn't seem to help vpr 


Ramana


Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant

2012-06-27 Thread Ramana Radhakrishnan
On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:
 Hi

 In RTL, subtracting a constant c is expressed as adding the value -c, so it
 is also processed by adddi3, and I have extended it to handle subtraction of a
 64bit constant. I created an insn pattern arm_subdi3_immediate to specifically
 represent subtraction of a 64bit constant while keeping the add rtl
 expression.


Sorry about the time it has taken to review this patch. Thanks for
tackling this, but I'm not convinced that this patch is correct, and it
definitely can be more efficient.

The range of valid 64 bit constants allowed would, in my opinion, be
the following, obtained by dividing the 64 bit constant into 2 32 bit
halves (upper32 and lower32, referred to as upper and lower below):

 arm_not_operand (upper) && arm_add_operand (lower), which boils down
to the valid combinations of

  adds lo ; adc hi  - both positive constants.
  adds lo ; sbc hi  - lower positive, upper negative
  subs lo ; sbc hi  - lower negative, upper negative
  subs lo ; adc hi  - lower negative, upper positive

Therefore I'd do the following -

* Don't make *arm_adddi3 a named pattern - we don't need that.
* Change the *addsi3_carryin_<optab> pattern to be something like this:

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1001,12 +1001,14 @@
 )

 (define_insn "*addsi3_carryin_<optab>"
-  [(set (match_operand:SI 0 "s_register_operand" "=r")
-   (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r")
-                     (match_operand:SI 2 "arm_rhs_operand" "rI"))
+  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
+   (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r,r")
+                     (match_operand:SI 2 "arm_not_operand" "rI,K"))
             (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))))]
   "TARGET_32BIT"
-  "adc%?\\t%0, %1, %2"
+  "@
+  adc%?\\t%0, %1, %2
+  sbc%?\\t%0, %1, %#n2"
   [(set_attr "conds" "use")]
 )

* I'd like a new const_ok_for_dimode_op function that dealt with each
of these operations, thus your plus operation with a DImode constant
would just be a check similar to what I've said above.
* You then don't need the new subdi3_immediate pattern and the split
can happen after reload. Adjust predicates and constraints
accordingly, delete it. Also please use CONST_INT_P instead of
GET_CODE (x) == CONST_INT in your patch.

regards,
Ramana











 Tested on arm qemu with both arm/thumb modes. OK for trunk?

 thanks
 Carrot


 2012-06-08  Wei Guozhi  car...@google.com

        PR target/53447
        * gcc.target/arm/pr53447-1.c: New testcase.
        * gcc.target/arm/pr53447-5.c: New testcase.


 2012-06-08  Wei Guozhi  car...@google.com

        PR target/53447
        * config/arm/constraints.md (Dd): New constraint.
        * config/arm/predicates.md (arm_neg_immediate_di_operand): New
        predicate.
        * config/arm/arm.md (adddi3): Extend it to handle constants.
        (arm_subdi3_immediate): New insn pattern.
        (arm_adddi3): Extend it to handle constants.
        * config/arm/neon.md (adddi3_neon): Likewise.


 Index: testsuite/gcc.target/arm/pr53447-1.c
 ===
 --- testsuite/gcc.target/arm/pr53447-1.c        (revision 0)
 +++ testsuite/gcc.target/arm/pr53447-1.c        (revision 0)
 @@ -0,0 +1,8 @@
 +/* { dg-options "-O2" }  */
 +/* { dg-require-effective-target arm32 } */
 +/* { dg-final { scan-assembler-not "mov" } } */
 +
 +void t0p(long long * p)
 +{
 +  *p += 0x10001;
 +}
 Index: testsuite/gcc.target/arm/pr53447-5.c
 ===
 --- testsuite/gcc.target/arm/pr53447-5.c        (revision 0)
 +++ testsuite/gcc.target/arm/pr53447-5.c        (revision 0)
 @@ -0,0 +1,8 @@
 +/* { dg-options "-O2" }  */
 +/* { dg-require-effective-target arm32 } */
 +/* { dg-final { scan-assembler-not "mov" } } */
 +
 +void t0p(long long * p)
 +{
 +  *p -= 0x10008;
 +}
 Index: config/arm/neon.md
 ===
 --- config/arm/neon.md  (revision 187751)
 +++ config/arm/neon.md  (working copy)
 @@ -588,9 +588,9 @@
  )

  (define_insn "adddi3_neon"
 -  [(set (match_operand:DI 0 "s_register_operand" "=w,?r,?r,?w")
 -        (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,w")
 -                 (match_operand:DI 2 "s_register_operand" "w,r,0,w")))
 +  [(set (match_operand:DI 0 "s_register_operand" "=w,?r,?r,?w,?r,?r,?r")
 +        (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,w,r,0,r")
 +                 (match_operand:DI 2 "arm_di_operand"     "w,r,0,w,r,Di,Di")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
  {
 @@ -600,13 +600,16 @@
     case 3: return "vadd.i64\t%P0, %P1, %P2";
     case 1: return "#";
     case 2: return "#";
 +    case 4: return "#";
 +    case 5: return "#";
 +    case 6: return "#";
     default: gcc_unreachable ();
     }
  }
 -  [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1")
 -   (set_attr "conds" "*,clob,clob,*")
 -   (set_attr 

Re: [PATCH 1/3] Add rtx costs for sse integer ops

2012-06-27 Thread Igor Zamyatin
May I ask about the purpose of the following part of the change? Doesn't
it affect non-SSE cases as well?

@@ -32038,7 +32042,15 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
case ASHIFTRT:
case LSHIFTRT:
case ROTATERT:
-  if (!TARGET_64BIT && GET_MODE (XEXP (x, 0)) == DImode)
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+   {
+ /* ??? Should be SSE vector operation cost.  */
+ /* At least for published AMD latencies, this really is the same
+as the latency for a simple fpu operation like fabs.  */
+ *total = cost->fabs;
+ return false;
+   }
+  if (GET_MODE_SIZE (mode) < UNITS_PER_WORD)
   {
 if (CONST_INT_P (XEXP (x, 1)))
   {

It also seems that we reversed the condition for the code that is now
under if (GET_MODE_SIZE (mode) < UNITS_PER_WORD). Why do we need this?


Thanks,
Igor

-Original Message-
From: gcc-patches-ow...@gcc.gnu.org
[mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Richard Henderson
Sent: Saturday, June 16, 2012 12:57 AM
To: gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de; ubiz...@gmail.com; hjl.to...@gmail.com
Subject: [PATCH 1/3] Add rtx costs for sse integer ops

---
 gcc/config/i386/i386.c |   50 ++-
 1 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e2f5740..578a756 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31990,13 +31990,16 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
           break;
         case 0:
         case -1:
-           /* Start with (MEM (SYMBOL_REF)), since that's where
-              it'll probably end up.  Add a penalty for size.  */
-           *total = (COSTS_N_INSNS (1)
-                     + (flag_pic != 0 && !TARGET_64BIT)
-                     + (mode == SFmode ? 0 : mode == DFmode ? 1 : 2));
           break;
         }
+      /* FALLTHRU */
+
+    case CONST_VECTOR:
+      /* Start with (MEM (SYMBOL_REF)), since that's where
+        it'll probably end up.  Add a penalty for size.  */
+      *total = (COSTS_N_INSNS (1)
+               + (flag_pic != 0 && !TARGET_64BIT)
+               + (mode == SFmode ? 0 : mode == DFmode ? 1 : 2));
      return true;

    case ZERO_EXTEND:
@@ -32016,8 +32019,9 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
      return false;

    case ASHIFT:
-      if (CONST_INT_P (XEXP (x, 1))
-          && (GET_MODE (XEXP (x, 0)) != DImode || TARGET_64BIT))
+      if (SCALAR_INT_MODE_P (mode)
+          && GET_MODE_SIZE (mode) < UNITS_PER_WORD
+          && CONST_INT_P (XEXP (x, 1)))
       {
         HOST_WIDE_INT value = INTVAL (XEXP (x, 1));
         if (value == 1)
@@ -32038,7 +32042,15 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
    case ASHIFTRT:
    case LSHIFTRT:
    case ROTATERT:
-      if (!TARGET_64BIT && GET_MODE (XEXP (x, 0)) == DImode)
+      if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+       {
+         /* ??? Should be SSE vector operation cost.  */
+         /* At least for published AMD latencies, this really is the same
+            as the latency for a simple fpu operation like fabs.  */
+         *total = cost->fabs;
+         return false;
+       }
+      if (GET_MODE_SIZE (mode) < UNITS_PER_WORD)
       {
         if (CONST_INT_P (XEXP (x, 1)))
           {
@@ -32107,6 +32119,16 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
         *total = cost->fmul;
         return false;
       }
+      else if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+       {
+         /* Without sse4.1, we don't have PMULLD; it's emulated with 7
+            insns, including two PMULUDQ.  */
+         if (mode == V4SImode && !(TARGET_SSE4_1 || TARGET_AVX))
+           *total = cost->fmul * 2 + cost->fabs * 5;
+         else
+           *total = cost->fmul;
+         return false;
+       }
      else
       {
         rtx op0 = XEXP (x, 0);
@@ -32171,7 +32193,7 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,

    case PLUS:
      if (GET_MODE_CLASS (mode) == MODE_INT
-               && GET_MODE_BITSIZE (mode) <= GET_MODE_BITSIZE (Pmode))
+          && GET_MODE_SIZE (mode) <= UNITS_PER_WORD)
       {
         if (GET_CODE (XEXP (x, 0)) == PLUS
             && GET_CODE (XEXP (XEXP (x, 0), 0)) == MULT
@@ -32271,6 +32293,14 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
      /* FALLTHRU */

    case NOT:
+      if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+       {
+         /* ??? Should be SSE vector operation cost.  */
+         /* At least for published AMD latencies, this really is the same
+            as the latency for a simple fpu operation like fabs.  */
+         *total = cost->fabs;
+         return false;
+       }
      if (!TARGET_64BIT && mode == DImode)
       *total = cost->add * 2;
      else
@@ -32331,7 
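
The V4SImode multiply cost in the hunk above is plain arithmetic, and a minimal, self-contained C sketch may make it easier to follow. The `struct toy_costs` type, its numbers, and the `v4si_mult_cost` function are hypothetical stand-ins for illustration; they are not GCC's cost tables or the real `ix86_rtx_costs` code. Only the formula (two multiplies plus five simple operations when PMULLD is unavailable) is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified cost table.  The field names mirror the
   ones used in the patch; the values are supplied by the caller.  */
struct toy_costs
{
  int fmul;  /* cost of one multiply */
  int fabs;  /* cost of one simple operation */
};

/* Cost of a V4SImode multiply: one instruction when PMULLD is
   available, otherwise seven instructions, two of them multiplies.  */
int
v4si_mult_cost (const struct toy_costs *cost, bool have_pmulld)
{
  if (have_pmulld)
    return cost->fmul;
  return cost->fmul * 2 + cost->fabs * 5;
}
```

With made-up costs of 5 for a multiply and 2 for a simple operation, the emulated sequence costs 5 * 2 + 2 * 5 = 20, against 5 for the single-instruction case.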

Re: [ARM Patch 2/n]PR53447: optimizations of 64bit ALU operation with constant

2012-06-27 Thread Ramana Radhakrishnan
On 28 May 2012 11:08, Carrot Wei car...@google.com wrote:
 Hi

 This is the second part of the patches that deals with 64bit and. It directly
 extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
 constant operands.


Comments about const_di_ok_for_op still apply from earlier review.

However, I don't see why and / ior / xor with constants that have either
the low or high parts set can't be expanded directly into ands of subregs
with moves of zeros or the original value?

This and some of the other boolean operations should be targeting PR
target/53189 by the way.


Ramana

 Tested on arm qemu without regression.

 OK for trunk?

 thanks
 Carrot

 2012-05-28  Wei Guozhi  car...@google.com

        PR target/53447
        * gcc.target/arm/pr53447-2.c: New testcase.


 2012-05-28  Wei Guozhi  car...@google.com

        PR target/53447
        * config/arm/arm-protos.h (const_ok_for_anddi): New prototype.
        * config/arm/arm.c (const_ok_for_anddi): New function.
        * config/arm/constraints.md (De): New constraint.
        * config/arm/predicates.md (arm_anddi_operand): New predicate.
        (arm_immediate_anddi_operand): Likewise.
        (anddi_operand): Likewise.
        * config/arm/arm.md (anddi3): Extend it to handle 64bit constants.
        (anddi3_insn): Likewise.
        * config/arm/neon.md (anddi3_neon): Likewise.



 Index: testsuite/gcc.target/arm/pr53447-2.c
 ===
 --- testsuite/gcc.target/arm/pr53447-2.c        (revision 0)
 +++ testsuite/gcc.target/arm/pr53447-2.c        (revision 0)
 @@ -0,0 +1,8 @@
 +/* { dg-options -O2 }  */
 +/* { dg-require-effective-target arm32 } */
 +/* { dg-final { scan-assembler-not mov } } */
 +
 +void t0p(long long * p)
 +{
 +  *p = 0x10002;
 +}
 Index: config/arm/arm.c
 ===
 --- config/arm/arm.c    (revision 187927)
 +++ config/arm/arm.c    (working copy)
 @@ -2497,6 +2497,18 @@
     }
  }

 +/* Return TRUE if int I is a valid immediate constant used by pattern
 +   anddi3_insn.  */
 +int
 +const_ok_for_anddi (HOST_WIDE_INT i)
 +{
 +  HOST_WIDE_INT high = ARM_SIGN_EXTEND ((i  32)  0x);
 +  HOST_WIDE_INT low = ARM_SIGN_EXTEND (i  0x);
 +
 +  return (TARGET_32BIT  (const_ok_for_arm (low) || const_ok_for_arm (~low))
 +          (const_ok_for_arm (high) || const_ok_for_arm (~high)));
 +}
 +
  /* Emit a sequence of insns to handle a large constant.
    CODE is the code of the operation required, it can be any of SET, PLUS,
    IOR, AND, XOR, MINUS;
 Index: config/arm/arm-protos.h
 ===
 --- config/arm/arm-protos.h     (revision 187927)
 +++ config/arm/arm-protos.h     (working copy)
 @@ -47,6 +47,7 @@
  extern bool arm_small_register_classes_for_mode_p (enum machine_mode);
  extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode);
  extern bool arm_modes_tieable_p (enum machine_mode, enum machine_mode);
 +extern int const_ok_for_anddi (HOST_WIDE_INT);
  extern int const_ok_for_arm (HOST_WIDE_INT);
  extern int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
  extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx,
 Index: config/arm/neon.md
 ===
 --- config/arm/neon.md  (revision 187927)
 +++ config/arm/neon.md  (working copy)
 @@ -774,9 +774,9 @@
  )

  (define_insn anddi3_neon
 -  [(set (match_operand:DI 0 s_register_operand =w,w,?r,?r,?w,?w)
 -        (and:DI (match_operand:DI 1 s_register_operand %w,0,0,r,w,0)
 -               (match_operand:DI 2 neon_inv_logic_op2 w,DL,r,r,w,DL)))]
 +  [(set (match_operand:DI 0 s_register_operand 
 =w,w,?r,?r,?w,?w,?r,?r)
 +        (and:DI (match_operand:DI 1 s_register_operand %w,0,0,r,w,0,0,r)
 +               (match_operand:DI 2 anddi_operand w,DL,r,r,w,DL,De,De)))]
   TARGET_NEON
  {
   switch (which_alternative)
 @@ -788,12 +788,14 @@
                     DImode, 1, VALID_NEON_QREG_MODE (DImode));
     case 2: return #;
     case 3: return #;
 +    case 6: return #;
 +    case 7: return #;
     default: gcc_unreachable ();
     }
  }
 -  [(set_attr neon_type neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1)
 -   (set_attr length *,*,8,8,*,*)
 -   (set_attr arch nota8,nota8,*,*,onlya8,onlya8)]
 +  [(set_attr neon_type 
 neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1,*,*)
 +   (set_attr length *,*,8,8,*,*,8,8)
 +   (set_attr arch nota8,nota8,*,*,onlya8,onlya8,*,*)]
  )

  (define_insn ornmode3_neon
 Index: config/arm/constraints.md
 ===
 --- config/arm/constraints.md   (revision 187927)
 +++ config/arm/constraints.md   (working copy)
 @@ -29,7 +29,7 @@
  ;; in Thumb-1 state: I, J, K, L, M, N, O

  ;; The following multi-letter normal constraints have been used:
 -;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di, Dt, Dz
 +;; in ARM/Thumb-2 

Re: [ARM Patch 2/n]PR53447: optimizations of 64bit ALU operation with constant

2012-06-27 Thread Ramana Radhakrishnan
On 28 May 2012 11:08, Carrot Wei car...@google.com wrote:
 Hi

 This is the second part of the patches that deals with 64bit and. It directly
 extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
 constant operands.


Comments about const_di_ok_for_op still apply from my review of your add patch.

However, I don't see why and / ior / xor with constants that have either
the low or high parts set can't be expanded directly into ands of
subregs with moves of zeros or the original value, especially if you
aren't looking at doing 64-bit operations in Neon.  With Neon being
used for 64-bit arithmetic it gets more interesting.
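
The suggestion above can be sketched in plain C for the AND case only. This is an illustration of the idea, not code from the patch or from the ARM backend; `and_half` and `and64_by_halves` are hypothetical names. A 64-bit AND with a constant is done on the two 32-bit halves independently, and a half whose constant is all zeros or all ones needs no AND at all.

```c
#include <stdint.h>

/* AND one 32-bit half of the value with one half of the constant,
   skipping the AND when the constant half makes it trivial.  */
uint32_t
and_half (uint32_t value, uint32_t mask)
{
  if (mask == 0)
    return 0;            /* result is a plain move of zero */
  if (mask == UINT32_MAX)
    return value;        /* result is the original value */
  return value & mask;   /* a real 32-bit AND is needed */
}

/* Hypothetical helper: compute X & C on 64-bit operands by working on
   the low and high 32-bit halves independently.  */
uint64_t
and64_by_halves (uint64_t x, uint64_t c)
{
  uint32_t lo = and_half ((uint32_t) x, (uint32_t) c);
  uint32_t hi = and_half ((uint32_t) (x >> 32), (uint32_t) (c >> 32));
  return ((uint64_t) hi << 32) | lo;
}
```

The result always equals `x & c`; the two early returns only mark the halves for which a compiler could emit a move instead of an AND.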

Finally this should target PR target/53189.


Ramana

 Tested on arm qemu without regression.

 OK for trunk?

 thanks
 Carrot

 2012-05-28  Wei Guozhi  car...@google.com

        PR target/53447
        * gcc.target/arm/pr53447-2.c: New testcase.


 2012-05-28  Wei Guozhi  car...@google.com

        PR target/53447
        * config/arm/arm-protos.h (const_ok_for_anddi): New prototype.
        * config/arm/arm.c (const_ok_for_anddi): New function.
        * config/arm/constraints.md (De): New constraint.
        * config/arm/predicates.md (arm_anddi_operand): New predicate.
        (arm_immediate_anddi_operand): Likewise.
        (anddi_operand): Likewise.
        * config/arm/arm.md (anddi3): Extend it to handle 64bit constants.
        (anddi3_insn): Likewise.
        * config/arm/neon.md (anddi3_neon): Likewise.



 Index: testsuite/gcc.target/arm/pr53447-2.c
 ===
 --- testsuite/gcc.target/arm/pr53447-2.c        (revision 0)
 +++ testsuite/gcc.target/arm/pr53447-2.c        (revision 0)
 @@ -0,0 +1,8 @@
 +/* { dg-options -O2 }  */
 +/* { dg-require-effective-target arm32 } */
 +/* { dg-final { scan-assembler-not mov } } */
 +
 +void t0p(long long * p)
 +{
 +  *p = 0x10002;
 +}
 Index: config/arm/arm.c
 ===
 --- config/arm/arm.c    (revision 187927)
 +++ config/arm/arm.c    (working copy)
 @@ -2497,6 +2497,18 @@
     }
  }

 +/* Return TRUE if int I is a valid immediate constant used by pattern
 +   anddi3_insn.  */
 +int
 +const_ok_for_anddi (HOST_WIDE_INT i)
 +{
 +  HOST_WIDE_INT high = ARM_SIGN_EXTEND ((i  32)  0x);
 +  HOST_WIDE_INT low = ARM_SIGN_EXTEND (i  0x);
 +
 +  return (TARGET_32BIT  (const_ok_for_arm (low) || const_ok_for_arm (~low))
 +          (const_ok_for_arm (high) || const_ok_for_arm (~high)));
 +}
 +
  /* Emit a sequence of insns to handle a large constant.
    CODE is the code of the operation required, it can be any of SET, PLUS,
    IOR, AND, XOR, MINUS;
 Index: config/arm/arm-protos.h
 ===
 --- config/arm/arm-protos.h     (revision 187927)
 +++ config/arm/arm-protos.h     (working copy)
 @@ -47,6 +47,7 @@
  extern bool arm_small_register_classes_for_mode_p (enum machine_mode);
  extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode);
  extern bool arm_modes_tieable_p (enum machine_mode, enum machine_mode);
 +extern int const_ok_for_anddi (HOST_WIDE_INT);
  extern int const_ok_for_arm (HOST_WIDE_INT);
  extern int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
  extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx,
 Index: config/arm/neon.md
 ===
 --- config/arm/neon.md  (revision 187927)
 +++ config/arm/neon.md  (working copy)
 @@ -774,9 +774,9 @@
  )

  (define_insn anddi3_neon
 -  [(set (match_operand:DI 0 s_register_operand =w,w,?r,?r,?w,?w)
 -        (and:DI (match_operand:DI 1 s_register_operand %w,0,0,r,w,0)
 -               (match_operand:DI 2 neon_inv_logic_op2 w,DL,r,r,w,DL)))]
 +  [(set (match_operand:DI 0 s_register_operand 
 =w,w,?r,?r,?w,?w,?r,?r)
 +        (and:DI (match_operand:DI 1 s_register_operand %w,0,0,r,w,0,0,r)
 +               (match_operand:DI 2 anddi_operand w,DL,r,r,w,DL,De,De)))]
   TARGET_NEON
  {
   switch (which_alternative)
 @@ -788,12 +788,14 @@
                     DImode, 1, VALID_NEON_QREG_MODE (DImode));
     case 2: return #;
     case 3: return #;
 +    case 6: return #;
 +    case 7: return #;
     default: gcc_unreachable ();
     }
  }
 -  [(set_attr neon_type neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1)
 -   (set_attr length *,*,8,8,*,*)
 -   (set_attr arch nota8,nota8,*,*,onlya8,onlya8)]
 +  [(set_attr neon_type 
 neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1,*,*)
 +   (set_attr length *,*,8,8,*,*,8,8)
 +   (set_attr arch nota8,nota8,*,*,onlya8,onlya8,*,*)]
  )

  (define_insn ornmode3_neon
 Index: config/arm/constraints.md
 ===
 --- config/arm/constraints.md   (revision 187927)
 +++ config/arm/constraints.md   (working copy)
 @@ -29,7 +29,7 @@
  ;; in Thumb-1 state: I, J, K, L, M, N, O

  ;; The following multi-letter normal constraints 

Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-27 Thread Iain Buclaw
On 27 June 2012 16:47, Joseph S. Myers jos...@codesourcery.com wrote:
 On Wed, 27 Jun 2012, Iain Buclaw wrote:

 It's copied as including c-common.c / .h causes problems with a fair
 number of references pulled in that need to be stubbed out - also,
 some GCC function attributes that we use do not make any sense to have
 in D code (eg: gnu_inline, artificial, cleanup).  It could certainly
 be possible though ... Will need to review this in more detail.

 Quite possibly you need to split up c-common so that the parts that can
 also be shared with D are in separate files.

 The D frontend is completely independent of the GCC backend, and any
 alterations are purely for portability (eg, the use of real_t rather
 than long double for the representation of floats).   There is no

 If for portability, I'd hope they wouldn't need to be conditional -
 rather, the common repository used for all the compilers using the
 dfrontend code should be able to have them, unconditionally, and another
 such compiler might have a typedef of real_t to long double if that's what
 that other compiler wishes to use.  Hopefully you can work with the people
 maintaining other such compilers so that there can genuinely be a shared,
 portable source base for the shared code, in a public repository used by
 all those maintainers, without conditionals based on which compiler it's
 used in, and with that shared source base only using an absolute minimum
 of headers from whatever compiler it's used in (so only minimal GCC
 headers when used in GCC, etc.).


In some ways, some elements of the code are already shared, and I've
had no trouble sending patches for GDC-specific elements provided
that they are wrapped in #ifdef IN_GCC.



 Likewise, have removed it as is in fact no longer required.   The
 optimize #undef remains for the time being as it conflicts with the
 name of a member in the D frontend sources.  If the D frontend
 followed the C++ Coding Conventions as outlined in
 gcc.gnu.org/wiki/CppConventions then this wouldn't be an issue.
 Though I don't think it has an obligation to being essentially
 disconnected from calling any GCC code.

 But it ought to be possible to stop the shared D front end from including
 the relevant GCC headers at all, if it has a clean interface to the rest
 of GCC


It's the other way round, actually.  It's the GDC code that errors on
compilation from including the main header (mars.h) in the D frontend.
 The dfrontend code does not touch d-gcc-includes.h, and I can add in
a compile #error to assert that is always the case now that I'm using
IN_GCC_FRONTEND to build the GDC sources.


 Have removed all alloca handling from GDC and replaced with simply
 including libiberty.h.

 system.h includes libiberty.h, so direct inclusion of libiberty.h
 shouldn't be needed (unless you are trying to avoid using system.h in code
 shared with other D compilers).


I'd rather not include GCC headers in the D frontend, however useful
they may be.

One reason for this is that system.h poisons at least one function
that the D frontend uses (strdup in rmem.c).  That said, I'm sure the
maintainers would be happy to accept any patches I send if the D
frontend must be compliant and avoid the functions that are banned in GCC.


 https://github.com/D-Programming-GDC/GDC/commits/master


 I do have a question though, what is available for the transition of
 development from git to svn?  Other than a lot of reading and getting
 used to the various switches and commands on my part.

 Once there is a front end ready to commit and approved in technical and
 GNU policy terms, you'd just do svn add on the files to add them to
 trunk.  It's up to you how you handle keeping the dfrontend/ changes in
 sync with an external shared repository (with all changes going to the
 external repository first); Ian Taylor may have some automation for that
 issue for Go.


OK, thanks.  As the D frontend goes through a sometimes experimental
development process between each release, I'd rather have it so that I
merge the frontend into GDC as each release happens, instead of
keeping in constant sync.  This is how I handle such merges at the
moment, although it doesn't help if you require a track of all changes
made to the D frontend.


Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-27 Thread Joseph S. Myers
On Wed, 27 Jun 2012, Iain Buclaw wrote:

 OK, thanks.  As the D frontend goes through a sometimes experimental
 development process between each release, I'd rather have it so that I
 merge the frontend into GDC as each release happens, instead of
 keeping in constant sync.  This is how I handle such merges at the
 moment, although it doesn't help if you require a track of all changes
 made to the D frontend.

As long as you don't make changes to the dfrontend code directly in the 
GCC repository (only import versions from another repository, verbatim) I 
think that would work (importing from releases / release branches in the 
upstream repository rather than always the development mainline).  
Likewise any other directories taken verbatim from some external 
repository shared between D compilers.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 1/3] Add rtx costs for sse integer ops

2012-06-27 Thread Richard Henderson
On 06/27/2012 09:07 AM, Igor Zamyatin wrote:
 May I ask about the purpose of the following piece of change? Doesn't
 it affect non-sse cases either?

Err, no, it doesn't affect non-sse cases.  All MODE_VECTOR_INT
cases will be implemented in the xmm registers (ignoring the
deprecated and largely ignored mmx case).


 
 @@ -32038,7 +32042,15 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
 case ASHIFTRT:
 case LSHIFTRT:
 case ROTATERT:
 -  if (!TARGET_64BIT && GET_MODE (XEXP (x, 0)) == DImode)
 +  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
 +   {
 + /* ??? Should be SSE vector operation cost.  */
 + /* At least for published AMD latencies, this really is the same
 +as the latency for a simple fpu operation like fabs.  */
 + *total = cost->fabs;
 + return false;
 +   }
 +  if (GET_MODE_SIZE (mode)  UNITS_PER_WORD)
{
  if (CONST_INT_P (XEXP (x, 1)))
{
 
 It also seems that we reversed the condition for the code that is now
 under if (GET_MODE_SIZE (mode)  UNITS_PER_WORD). Why do we need this?

I'm not sure what you're suggesting.  But we certainly don't use
the xmm registers to implement DImode operations in 32-bit, so...


r~


Re: [PATCH 1/3] Add rtx costs for sse integer ops

2012-06-27 Thread Igor Zamyatin
On Wed, Jun 27, 2012 at 9:00 PM, Richard Henderson r...@redhat.com wrote:
 On 06/27/2012 09:07 AM, Igor Zamyatin wrote:
 May I ask about the purpose of the following piece of change? Doesn't
 it affect non-sse cases either?

 Err, no, it doesn't affect non-sse cases.  All MODE_VECTOR_INT
 cases will be implemented in the xmm registers (ignoring the
 deprecated and largely ignored mmx case).

Probably I misunderstand something... This condition - GET_MODE_SIZE
(mode)  UNITS_PER_WORD - is outside the check for MODE_VECTOR_INT.




 @@ -32038,7 +32042,15 @@ ix86_rtx_costs (rtx x, int code, int outer_code_i, int opno, int *total,
     case ASHIFTRT:
     case LSHIFTRT:
     case ROTATERT:
 -      if (!TARGET_64BIT && GET_MODE (XEXP (x, 0)) == DImode)
 +      if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
 +       {
 +         /* ??? Should be SSE vector operation cost.  */
 +         /* At least for published AMD latencies, this really is the same
 +            as the latency for a simple fpu operation like fabs.  */
 +         *total = cost->fabs;
 +         return false;
 +       }
 +      if (GET_MODE_SIZE (mode)  UNITS_PER_WORD)
        {
          if (CONST_INT_P (XEXP (x, 1)))
            {

 It also seems that we reversed the condition for the code that is now
 under if (GET_MODE_SIZE (mode)  UNITS_PER_WORD). Why do we need this?

 I'm not sure what you're suggesting.  But we certainly don't use
 the xmm registers to implement DImode operations in 32-bit, so...


 r~


Re: [PATCH 1/3] Add rtx costs for sse integer ops

2012-06-27 Thread Richard Henderson
On 06/27/2012 10:08 AM, Igor Zamyatin wrote:
 On Wed, Jun 27, 2012 at 9:00 PM, Richard Henderson r...@redhat.com wrote:
  On 06/27/2012 09:07 AM, Igor Zamyatin wrote:
  May I ask about the purpose of the following piece of change? Doesn't
  it affect non-sse cases either?
 
  Err, no, it doesn't affect non-sse cases.  All MODE_VECTOR_INT
  cases will be implemented in the xmm registers (ignoring the
  deprecated and largely ignored mmx case).
 Probably I misunderstand something... This condition - GET_MODE_SIZE
 (mode)  UNITS_PER_WORD - is outside the check for MODE_VECTOR_INT.

Of course.

We currently have

if (vector mode)
else if (mode = word size)
else /* scalar mode  word size */

We could no doubt legitimately rearrange this to

if (mode  word size)
else if (vector mode)
else /* scalar mode  word size */

but I don't see how that's any clearer.  We certainly
can't eliminate any tests, since there are in fact
three different possibilities.
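
The three possibilities can be written out as a small, self-contained C sketch. The enum, the function name, and the parameters are hypothetical and exist only for illustration; they are not GCC interfaces. Which side of the word-size comparison each scalar branch takes is an assumption of this sketch.

```c
#include <stdbool.h>

enum mode_kind
{
  KIND_VECTOR,       /* integer vector mode */
  KIND_SCALAR_WORD,  /* scalar mode that fits in one word */
  KIND_SCALAR_WIDE   /* scalar mode wider than one word */
};

/* Classify a mode by testing for vectors first, then comparing the
   size of the remaining (scalar) modes against the word size.  */
enum mode_kind
classify_mode (bool is_vector, unsigned size, unsigned word_size)
{
  if (is_vector)
    return KIND_VECTOR;
  else if (size <= word_size)
    return KIND_SCALAR_WORD;
  else
    return KIND_SCALAR_WIDE;
}
```

Under these assumptions an 8-byte scalar on a target with 4-byte words lands in `KIND_SCALAR_WIDE`, while a 16-byte integer vector is `KIND_VECTOR` regardless of the word size.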



r~


Re: [PATCH 1/3] Add rtx costs for sse integer ops

2012-06-27 Thread Igor Zamyatin
On Wed, Jun 27, 2012 at 9:14 PM, Richard Henderson r...@redhat.com wrote:
 On 06/27/2012 10:08 AM, Igor Zamyatin wrote:
 On Wed, Jun 27, 2012 at 9:00 PM, Richard Henderson r...@redhat.com wrote:
  On 06/27/2012 09:07 AM, Igor Zamyatin wrote:
  May I ask about the purpose of the following piece of change? Doesn't
  it affect non-sse cases either?
 
  Err, no, it doesn't affect non-sse cases.  All MODE_VECTOR_INT
  cases will be implemented in the xmm registers (ignoring the
  deprecated and largely ignored mmx case).
 Probably I misunderstand something... This condition - GET_MODE_SIZE
 (mode)  UNITS_PER_WORD - is outside the check for MODE_VECTOR_INT.

 Of course.

 We currently have

        if (vector mode)
        else if (mode = word size)
        else /* scalar mode  word size */

Sure, it's clear. So am I correct that, for the second possibility, we
now run the code that was executed for the third possibility before the
patch (it was under !TARGET_64BIT && GET_MODE (XEXP (x, 0)) == DImode)?
If so, why?


 We could no doubt legitimately rearrange this to

        if (mode  word size)
        else if (vector mode)
        else /* scalar mode  word size */

 but I don't see how that's any clearer.  We certainly
 can't eliminate any tests, since there are in fact
 three different possibilities.



 r~


Re: [C++ Patch] for c++/51214

2012-06-27 Thread Fabien Chêne
2012/6/7 Fabien Chêne fabien.ch...@gmail.com:
[...]
 ... committed as rev 188294.
 I will backport it to 4.7 when it unfreezes.

... Eventually backported as rev 189021.

-- 
Fabien


Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 7:45 AM, Iain Buclaw wrote:
 I do have a question though, what is available for the transition of
 development from git to svn?  Other than a lot of reading and getting
 used to the various switches and commands on my part.

Why transition?  Quite a few people around here use git on a day to day basis 
and just push and pull to/from svn as they see fit.  gcc has a read-only git 
repo you can track and pull from.  For pushing into svn, you can use git to do 
that as well (dcommit).  You'll want to read up on work flows on the net... as 
dcommit and merges require a little extra caution that isn't obvious.

 As the D frontend goes through a sometimes experimental
 development process between each release, I'd rather have it so that I
 merge the frontend into GDC as each release happens, instead of
 keeping in constant sync.

You can evolve when and how often you push and pull...


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 3:33 AM, Matthew Gretton-Dann wrote:
 This patch enables the dump-noaddr test to work in out-of-build-tree testing.

   * gcc.c-torture/unsorted/dump-noaddr.x: Generate dump files in
   tmpdir.

 OK?

Ok.


Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support

2012-06-27 Thread Matt Turner
On Tue, Jun 26, 2012 at 10:56 AM, nick clifton ni...@redhat.com wrote:
 Hi Matt,


 There's also a trivial documentation fix:

 [PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation

 and a test to exercise the intrinsics:

 [PATCH 2/2] arm: add iwMMXt mmx-2.c test


 These have both been checked in.

 It turns out that both needed minor updates as some of the builtins have
 changed since these patches were written.  I have taken care of this
 however.

 Cheers
  Nick

Thanks a lot, Nick!


Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs

2012-06-27 Thread Richard Henderson
On 06/26/2012 01:54 PM, Alexandre Oliva wrote:
 +  track_stack_pointer (dst, src1, src2);

Why does this function return a value then?


r~


C++ PATCH for c++/53563 (ice-on-invalid with s<void>::s<void>)

2012-06-27 Thread Jason Merrill
Here, we were parsing s<void>::s<void> wrong.  When we've seen a 
class-key, an injected-class-name must be considered to name the class, 
not the constructor, because the class-key limits lookup to finding type 
names.  Fixing that corrects the error message for this testcase, and 
avoids the ICE.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit f087a69f6320f9114101eb9117fb8e1c6ee44773
Author: Jason Merrill ja...@redhat.com
Date:   Wed Jun 27 14:33:06 2012 -0400

	PR c++/53563
	* parser.c (cp_parser_template_id): Add tag_type parm.
	(cp_parser_template_name): Likewise.
	(cp_parser_id_expression, cp_parser_unqualified_id): Adjust.
	(cp_parser_pseudo_destructor_name, cp_parser_type_name): Adjust.
	(cp_parser_simple_type_specifier, cp_parser_class_name): Adjust.
	(cp_parser_elaborated_type_specifier, cp_parser_class_head): Adjust.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 46f1401..7012caa 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2062,9 +2062,9 @@ static tree cp_parser_template_parameter
 static tree cp_parser_type_parameter
   (cp_parser *, bool *);
 static tree cp_parser_template_id
-  (cp_parser *, bool, bool, bool);
+  (cp_parser *, bool, bool, enum tag_types, bool);
 static tree cp_parser_template_name
-  (cp_parser *, bool, bool, bool, bool *);
+  (cp_parser *, bool, bool, bool, enum tag_types, bool *);
 static tree cp_parser_template_argument_list
   (cp_parser *);
 static tree cp_parser_template_argument
@@ -4466,6 +4466,7 @@ cp_parser_id_expression (cp_parser *parser,
   id = cp_parser_template_id (parser,
   /*template_keyword_p=*/false,
   /*check_dependency_p=*/true,
+  none_type,
   declarator_p);
   /* If that worked, we're done.  */
   if (cp_parser_parse_definitely (parser))
@@ -4543,6 +4544,7 @@ cp_parser_unqualified_id (cp_parser* parser,
 	/* Try a template-id.  */
 	id = cp_parser_template_id (parser, template_keyword_p,
 check_dependency_p,
+none_type,
 declarator_p);
 	/* If it worked, we're done.  */
 	if (cp_parser_parse_definitely (parser))
@@ -4554,6 +4556,7 @@ cp_parser_unqualified_id (cp_parser* parser,
 case CPP_TEMPLATE_ID:
   return cp_parser_template_id (parser, template_keyword_p,
 check_dependency_p,
+none_type,
 declarator_p);
 
 case CPP_COMPL:
@@ -4769,6 +4772,7 @@ cp_parser_unqualified_id (cp_parser* parser,
 	  /* Try a template-id.  */
 	  id = cp_parser_template_id (parser, template_keyword_p,
   /*check_dependency_p=*/true,
+  none_type,
   declarator_p);
 	  /* If that worked, we're done.  */
 	  if (cp_parser_parse_definitely (parser))
@@ -6280,6 +6284,7 @@ cp_parser_pseudo_destructor_name (cp_parser* parser,
   cp_parser_template_id (parser,
 			 /*template_keyword_p=*/true,
 			 /*check_dependency_p=*/false,
+			 class_type,
 			 /*is_declaration=*/true);
   /* Look for the `::' token.  */
   cp_parser_require (parser, CPP_SCOPE, RT_SCOPE);
@@ -12376,6 +12381,7 @@ static tree
 cp_parser_template_id (cp_parser *parser,
 		   bool template_keyword_p,
 		   bool check_dependency_p,
+		   enum tag_types tag_type,
 		   bool is_declaration)
 {
   int i;
@@ -12432,6 +12438,7 @@ cp_parser_template_id (cp_parser *parser,
   templ = cp_parser_template_name (parser, template_keyword_p,
    check_dependency_p,
    is_declaration,
+   tag_type,
    is_identifier);
   if (templ == error_mark_node || is_identifier)
 {
@@ -12604,6 +12611,7 @@ cp_parser_template_name (cp_parser* parser,
 			 bool template_keyword_p,
 			 bool check_dependency_p,
 			 bool is_declaration,
+			 enum tag_types tag_type,
 			 bool *is_identifier)
 {
   tree identifier;
@@ -12710,7 +12718,7 @@ cp_parser_template_name (cp_parser* parser,
 
   /* Look up the name.  */
   decl = cp_parser_lookup_name (parser, identifier,
-none_type,
+tag_type,
 /*is_template=*/true,
 /*is_namespace=*/false,
 check_dependency_p,
@@ -13699,6 +13707,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 	  type = cp_parser_template_id (parser,
 	/*template_keyword_p=*/true,
 	/*check_dependency_p=*/true,
+	none_type,
 	/*is_declaration=*/false);
 	  /* If the template-id did not name a type, we are out of
 	 luck.  */
@@ -13811,6 +13820,7 @@ cp_parser_type_name (cp_parser* parser)
   type_decl = cp_parser_template_id (parser,
 	 /*template_keyword_p=*/false,
 	 /*check_dependency_p=*/false,
+	 none_type,
 	 /*is_declaration=*/false);
   /* Note that this must be an instantiation of an alias template
 	 because [temp.names]/6 says:
@@ -14035,6 +14045,7 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
   token = cp_lexer_peek_token (parser->lexer);
   decl = cp_parser_template_id (parser, template_p,
 /*check_dependency_p=*/true,
+tag_type,
 is_declaration);
   /* If we didn't find a 

Re: [testsuite] don't use lto plugin if it doesn't work

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
 Why?  We don't demand a working plugin.  Indeed, we disable the use of
 the plugin if we find a linker that doesn't support it.  We just don't
 account for the possibility of finding a linker that supports plugins,
 but that doesn't support the one we'll build later.

If this is the preferred solution, then having configure check the 64-bitness 
of ld and turning off the plugin altogether on mismatches sounds like a 
reasonable course of action to me.


[patch testsuite]: Fix another LP64 vs LLP64 issue

2012-06-27 Thread Kai Tietz
Hi,

This patch fixes a testsuite failure on LLP64 targets.

ChangeLog

2012-06-27  Kai Tietz

* g++.dg/cpp0x/constexpr-52672.C (ul_ptr): Use SIZE_TYPE instead of
hard-coded 'unsigned long'.
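
A short C sketch of the underlying LP64 vs LLP64 difference, offered as an illustration rather than as part of the patch. It relies on `__SIZE_TYPE__`, which GCC and compatible compilers predefine; the typedef and function names are hypothetical. On LLP64 targets `unsigned long` stays 32 bits wide while pointers are 64 bits, so a typedef built on `unsigned long` no longer matches the pointer width there.

```c
#include <stddef.h>

/* __SIZE_TYPE__ is predefined by GCC (and compatible compilers) as the
   type underlying size_t.  */
typedef __SIZE_TYPE__ size_word;

/* Nonzero when unsigned long is as wide as a pointer (LP64-style);
   zero on LLP64 targets, where long stays 32 bits wide.  */
int
long_matches_pointer (void)
{
  return sizeof (unsigned long) == sizeof (void *);
}

/* Nonzero when the __SIZE_TYPE__-based typedef is as wide as a
   pointer on the current target.  */
int
size_type_matches_pointer (void)
{
  return sizeof (size_word) == sizeof (void *);
}
```

On x86_64-unknown-linux-gnu both functions should return nonzero; on x86_64-w64-mingw32 only the second one is expected to.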

Tested on x86_64-w64-mingw32 and x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai


Index: testsuite/g++.dg/cpp0x/constexpr-52672.C
===
--- testsuite/g++.dg/cpp0x/constexpr-52672.C    (revision 189009)
+++ testsuite/g++.dg/cpp0x/constexpr-52672.C    (working copy)
@@ -2,7 +2,7 @@
 // { dg-do compile }
 // { dg-options "-std=c++11" }

-typedef unsigned long * ul_ptr;
+__extension__ typedef __SIZE_TYPE__ * ul_ptr;
 constexpr unsigned long a = *((ul_ptr)0x0); // { dg-error "" }
 constexpr unsigned long b = *((ul_ptr)(*((ul_ptr)0x0))); // { dg-error "" }
 constexpr unsigned long c = *((ul_ptr)*((ul_ptr)(*((ul_ptr)0x0)))); // { dg-error "" }


[patch i386]: always allow for pe-coff that relocations can be put into readonly memory

2012-06-27 Thread Kai Tietz
Hello,

This patch makes sure that, for PE(+)-COFF targets, relocations are
always allowed in read-only memory.
This fixes some testcases for the x86_64-w64-mingw32 target.
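
A minimal sketch of how a relocation mask of this kind can be consulted, assuming (as I understand the hook's documentation) that bit 0 stands for local relocations and bit 1 for global ones. All names below are hypothetical and the code is not taken from GCC; it only shows why a mask of 0 means that no relocation forces data into a writable section.

```c
#include <stdbool.h>

/* Kinds of relocation an initializer may need, as a bitmask.  */
enum
{
  RELOC_LOCAL = 1,   /* relocation against a local symbol */
  RELOC_GLOBAL = 2   /* relocation against a global symbol */
};

/* Hypothetical stand-in for a target hook that always returns 0.  */
int
toy_pe_reloc_rw_mask (void)
{
  return 0;
}

/* Hypothetical stand-in for a target that wants every relocated
   object in writable memory.  */
int
toy_all_rw_mask (void)
{
  return RELOC_LOCAL | RELOC_GLOBAL;
}

/* Return true if data needing the relocation kinds in RELOC must go
   into a read-write section under the mask returned by HOOK.  */
bool
needs_rw_section (int reloc, int (*hook) (void))
{
  return (reloc & hook ()) != 0;
}
```

With the zero mask, `needs_rw_section` is false for every combination of relocation kinds, which corresponds to the behaviour this patch is after for PE-COFF.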

ChangeLog

2012-06-27  Kai Tietz

* config/i386/winnt.c (i386_pe_reloc_rw_mask): New function.
* config/i386/i386-protos.h (i386_pe_reloc_rw_mask): Add
prototype.
* config/i386/cygming.h (TARGET_ASM_RELOC_RW_MASK): Define
as i386_pe_reloc_rw_mask.

Tested on i686-pc-cygwin, i686-w64-mingw32, and x86_64-w64-mingw32.
OK to apply?

Regards,
Kai

Index: config/i386/winnt.c
===
--- config/i386/winnt.c (revision 189009)
+++ config/i386/winnt.c (working copy)
@@ -421,6 +421,14 @@
   DECL_SECTION_NAME (decl) = build_string (len, string);
 }

+/* Local and global relocs can always be placed into readonly memory
+   for PE-COFF targets.  */
+int
+i386_pe_reloc_rw_mask (void)
+{
+  return 0;
+}
+
 /* Select a set of attributes for section NAME based on the properties
of DECL and whether or not RELOC indicates that DECL's initializer
might contain runtime relocations.
Index: config/i386/i386-protos.h
===
--- config/i386/i386-protos.h   (revision 189009)
+++ config/i386/i386-protos.h   (working copy)
@@ -264,6 +264,8 @@
 extern bool i386_pe_type_dllimport_p (tree);
 extern bool i386_pe_type_dllexport_p (tree);

+extern int i386_pe_reloc_rw_mask (void);
+
 extern rtx maybe_get_pool_constant (rtx);

 extern char internal_label_prefix[16];
Index: config/i386/cygming.h
===
--- config/i386/cygming.h   (revision 189009)
+++ config/i386/cygming.h   (working copy)
@@ -225,6 +225,11 @@

 #define SUBTARGET_ENCODE_SECTION_INFO  i386_pe_encode_section_info

+/* Local and global relocs can always be placed into readonly memory
+   for PE-COFF targets.  */
+#undef TARGET_ASM_RELOC_RW_MASK
+#define TARGET_ASM_RELOC_RW_MASK i386_pe_reloc_rw_mask
+
 /* Output a common block.  */
 #undef ASM_OUTPUT_ALIGNED_DECL_COMMON
 #define ASM_OUTPUT_ALIGNED_DECL_COMMON \


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-27 Thread Andrew Pinski
On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
matthew.gretton-d...@arm.com wrote:
 All,

 This patch enables the dump-noaddr test to work in out-of-build-tree
 testing.

 It does this by making sure that the dump files generated during the
 test are created under $tmpdir.

I created a much simpler patch which I have been meaning to submit.
I attached it for reference.


Thanks,
Andrew Pinski

ChangeLog:
* testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use
an absolute dump base instead of a relative one.





 gcc/testsuite/ChangeLog:
 2012-06-27  Matthew Gretton-Dann  matthew.gretton-d...@arm.com

        * gcc.c-torture/unsorted/dump-noaddr.x: Generate dump files in
        tmpdir.

 Tested both in and out of build-tree against an arm-none-eabi targeted
 compiler.

 OK?

 Thanks,

 Matt

 --
 Matthew Gretton-Dann
 Principal Engineer, PD Software - Tools, ARM Ltd
Index: gcc.c-torture/unsorted/dump-noaddr.x
===
--- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452)
+++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453)
@@ -11,10 +11,10 @@ proc dump_compare { src options } {
 foreach option $option_list {
file delete -force dump1
file mkdir dump1
-   c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+   c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
file delete -force dump2
file mkdir dump2
-   c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
+   c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr
foreach dump1 [lsort [glob -nocomplain dump1/*]] {
regsub dump1/ $dump1 dump2/ dump2
set dumptail gcc.c-torture/unsorted/[file tail $dump1]


[patch] Move some dbxout/xcoffout defines from header files into .c files

2012-06-27 Thread Steven Bosscher
Hello,

This small patch moves a couple of #defines into the only files that
use them. En passant xcoff.h (gcc/xcoff.h, *not* the rs6000/xcoff.h)
becomes unused and can be removed.

Tested by building a cross cc1 from x86_64-unknown-linux-gnu to
rs6000-ibm-aix4.3, just to be sure.

OK for trunk?

Ciao!
Steven


debug_cleanup.diff
Description: Binary data


Re: [Patch ARM] Improve vdup_n intrinsics.

2012-06-27 Thread Ramana Radhakrishnan
On 27 June 2012 21:27, Richard Henderson r...@redhat.com wrote:
 On 06/20/2012 05:44 AM, Ramana Radhakrishnan wrote:
 +         case NEON_DUP:
 +           if (TREE_CODE (argp[0]) == INTEGER_CST
 +               || TREE_CODE (argp[0]) == REAL_CST)
 +             return build_vector_from_val (result_type, argp[0]);
 +           return NULL_TREE;

 You can expand this in all cases.  Constants go to VECTOR_CST,
 as you're doing, but variables can be expanded via a CONSTRUCTOR.

 Check out what we generate for

  (v4si){ x, x, x, x }

Oh, nice. I'll see what I can spin up.



Ramana



 r~
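
As a point of reference for the `(v4si){ x, x, x, x }` suggestion, here is a minimal sketch using GCC's generic vector extension. It assumes a 32-bit `int`; `splat_v4si` and `v4si_lane` are hypothetical names chosen for illustration and are not part of the ARM patch.

```c
#include <string.h>

/* GCC vector extension: four 32-bit ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Broadcast X to every lane; with a variable X the front end builds
   this initializer as a CONSTRUCTOR rather than a VECTOR_CST.  */
static v4si
splat_v4si (int x)
{
  return (v4si){ x, x, x, x };
}

/* Hypothetical helper: read lane I (0..3) by copying the vector out,
   so the sketch does not depend on vector subscripting support.  */
static int
v4si_lane (v4si v, int i)
{
  int lanes[4];
  memcpy (lanes, &v, sizeof lanes);
  return lanes[i];
}
```

Compiling `splat_v4si` with `-fdump-tree-gimple` is one way to inspect what gets generated for the non-constant case.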


[PATCH] Add MULT_HIGHPART_EXPR

2012-06-27 Thread Richard Henderson

I was sitting on this patch until I got around to fixing up Jakub's
existing vector divmod code to use it.  But seeing as how he's adding
more uses, I think it's better to get it in earlier.

Tested via a patch sent under separate cover that changes
__builtin_alpha_umulh to immediately fold to MULT_HIGHPART_EXPR.


r~

---

* tree.def (MULT_HIGHPART_EXPR): New.
* cfgexpand.c (expand_debug_expr): Ignore it.
* expr.c (expand_expr_real_2): Handle it.
* fold-const.c (int_const_binop_1): Likewise.
* optabs.c (optab_for_tree_code): Likewise.
* tree-cfg.c (verify_gimple_assign_binary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
(op_code_prio, op_symbol_code): Likewise.
* tree.c (commutative_tree_code): Likewise.  Also handle
WIDEN_MULT_EXPR, VEC_WIDEN_MULT_HI_EXPR, VEC_WIDEN_MULT_LO_EXPR.
---
 gcc/ChangeLog   |   14 ++
 gcc/cfgexpand.c |6 +-
 gcc/expr.c  |1 +
 gcc/fold-const.c|   10 ++
 gcc/optabs.c|3 +++
 gcc/tree-cfg.c  |1 +
 gcc/tree-inline.c   |1 +
 gcc/tree-pretty-print.c |5 +
 gcc/tree.c  |4 
 gcc/tree.def|4 
 10 files changed, 48 insertions(+), 1 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index a8397c6..ad2f667 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3415,7 +3415,7 @@ expand_debug_expr (tree exp)
 case VEC_PERM_EXPR:
   return NULL;
 
-   /* Misc codes.  */
+/* Misc codes.  */
 case ADDR_SPACE_CONVERT_EXPR:
 case FIXED_CONVERT_EXPR:
 case OBJ_TYPE_REF:
@@ -3466,6 +3466,10 @@ expand_debug_expr (tree exp)
}
   return NULL;
 
+case MULT_HIGHPART_EXPR:
+  /* ??? Similar to the above.  */
+  return NULL;
+
 case WIDEN_SUM_EXPR:
 case WIDEN_LSHIFT_EXPR:
   if (SCALAR_INT_MODE_P (GET_MODE (op0))
diff --git a/gcc/expr.c b/gcc/expr.c
index cad5b10..5295da2 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8551,6 +8551,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
   return expand_divmod (0, code, mode, op0, op1, target, unsignedp);
 
 case RDIV_EXPR:
+case MULT_HIGHPART_EXPR:
   goto binop;
 
 case TRUNC_MOD_EXPR:
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 877cf32..702f4e0 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -999,6 +999,16 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree arg2,
 res.low, res.high);
   break;
 
+case MULT_HIGHPART_EXPR:
+  /* ??? Need quad precision, or an additional shift operand
+to the multiply primitive, to handle very large highparts.  */
+  if (TYPE_PRECISION (type) > HOST_BITS_PER_WIDE_INT)
+   return NULL_TREE;
+  tmp = double_int_mul (op1, op2);
+  res = double_int_rshift (tmp, TYPE_PRECISION (type),
+  TYPE_PRECISION (type), !uns);
+  break;
+
 case TRUNC_DIV_EXPR:
 case FLOOR_DIV_EXPR: case CEIL_DIV_EXPR:
 case EXACT_DIV_EXPR:
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 9a549ff..3094476 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -367,6 +367,9 @@ optab_for_tree_code (enum tree_code code, const_tree type,
 case BIT_XOR_EXPR:
   return xor_optab;
 
+case MULT_HIGHPART_EXPR:
+  return TYPE_UNSIGNED (type) ? umul_highpart_optab : smul_highpart_optab;
+
 case TRUNC_MOD_EXPR:
 case CEIL_MOD_EXPR:
 case FLOOR_MOD_EXPR:
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index d7ab090..fe5af70 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3731,6 +3731,7 @@ do_pointer_plus_expr_check:
   return false;
 
 case MULT_EXPR:
+case MULT_HIGHPART_EXPR:
 case TRUNC_DIV_EXPR:
 case CEIL_DIV_EXPR:
 case FLOOR_DIV_EXPR:
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 6746296..c3d3fb6 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3379,6 +3379,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 case POINTER_PLUS_EXPR:
 case MINUS_EXPR:
 case MULT_EXPR:
+case MULT_HIGHPART_EXPR:
 case FMA_EXPR:
 
 case ADDR_SPACE_CONVERT_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index f810d77..44d3c10 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1612,6 +1612,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
 case WIDEN_SUM_EXPR:
 case WIDEN_MULT_EXPR:
 case MULT_EXPR:
+case MULT_HIGHPART_EXPR:
 case PLUS_EXPR:
 case POINTER_PLUS_EXPR:
 case MINUS_EXPR:
@@ -2674,6 +2675,7 @@ op_code_prio (enum tree_code code)
 case WIDEN_MULT_PLUS_EXPR:
 case WIDEN_MULT_MINUS_EXPR:
 case MULT_EXPR:
+case MULT_HIGHPART_EXPR:
 case TRUNC_DIV_EXPR:
 case CEIL_DIV_EXPR:
 case 

[PATCH] alpha: Cleanup builtins and folding

2012-06-27 Thread Richard Henderson
* config/alpha/alpha.c (alpha_dimode_u): New.
(alpha_init_builtins): Initialize it, and use it.
(alpha_fold_builtin_cmpbge): Use alpha_dimode_u.
(alpha_fold_builtin_zapnot, alpha_fold_builtin_insxx): Likewise.
(alpha_fold_vector_minmax, alpha_fold_builtin_perr): Likewise.
(alpha_fold_builtin_pklb, alpha_fold_builtin_pkwb): Likewise.
(alpha_fold_builtin_unpkbl, alpha_fold_builtin_unpkbw): Likewise.
(alpha_fold_builtin_cttz, alpha_fold_builtin_ctlz): Likewise.
(alpha_fold_builtin_ctpop): Likewise.
(alpha_fold_builtin_umulh): Remove.
(alpha_fold_builtin): Use MULT_HIGHPART_EXPR for UMULH; fix
typo in MAX_ARGS check.
---
 gcc/ChangeLog|   15 +++
 gcc/config/alpha/alpha.c |   99 ++---
 2 files changed, 46 insertions(+), 68 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8720ec4..09c2d56 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,20 @@
 2012-06-27  Richard Henderson  r...@redhat.com
 
+   * config/alpha/alpha.c (alpha_dimode_u): New.
+   (alpha_init_builtins): Initialize it, and use it.
+   (alpha_fold_builtin_cmpbge): Use alpha_dimode_u.
+   (alpha_fold_builtin_zapnot, alpha_fold_builtin_insxx): Likewise.
+   (alpha_fold_vector_minmax, alpha_fold_builtin_perr): Likewise.
+   (alpha_fold_builtin_pklb, alpha_fold_builtin_pkwb): Likewise.
+   (alpha_fold_builtin_unpkbl, alpha_fold_builtin_unpkbw): Likewise.
+   (alpha_fold_builtin_cttz, alpha_fold_builtin_ctlz): Likewise.
+   (alpha_fold_builtin_ctpop): Likewise.
+   (alpha_fold_builtin_umulh): Remove.
+   (alpha_fold_builtin): Use MULT_HIGHPART_EXPR for UMULH; fix
+   typo in MAX_ARGS check.
+
+2012-06-27  Richard Henderson  r...@redhat.com
+
* tree.def (MULT_HIGHPART_EXPR): New.
* cfgexpand.c (expand_debug_expr): Ignore it.
* expr.c (expand_expr_real_2): Handle it.
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index a881a9e..5617ea3 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -6461,6 +6461,7 @@ static struct alpha_builtin_def const two_arg_builtins[] = {
   { __builtin_alpha_perr,ALPHA_BUILTIN_PERR, MASK_MAX, true }
 };
 
+static GTY(()) tree alpha_dimode_u;
 static GTY(()) tree alpha_v8qi_u;
 static GTY(()) tree alpha_v8qi_s;
 static GTY(()) tree alpha_v4hi_u;
@@ -6514,25 +6515,23 @@ alpha_add_builtins (const struct alpha_builtin_def *p, size_t count,
 static void
 alpha_init_builtins (void)
 {
-  tree dimode_integer_type_node;
   tree ftype;
 
-  dimode_integer_type_node = lang_hooks.types.type_for_mode (DImode, 0);
+  alpha_dimode_u = lang_hooks.types.type_for_mode (DImode, 1);
+  alpha_v8qi_u = build_vector_type (unsigned_intQI_type_node, 8);
+  alpha_v8qi_s = build_vector_type (intQI_type_node, 8);
+  alpha_v4hi_u = build_vector_type (unsigned_intHI_type_node, 4);
+  alpha_v4hi_s = build_vector_type (intHI_type_node, 4);
 
-  ftype = build_function_type_list (dimode_integer_type_node, NULL_TREE);
-  alpha_add_builtins (zero_arg_builtins, ARRAY_SIZE (zero_arg_builtins),
- ftype);
+  ftype = build_function_type_list (alpha_dimode_u, NULL_TREE);
+  alpha_add_builtins (zero_arg_builtins, ARRAY_SIZE (zero_arg_builtins), ftype);
 
-  ftype = build_function_type_list (dimode_integer_type_node,
-   dimode_integer_type_node, NULL_TREE);
-  alpha_add_builtins (one_arg_builtins, ARRAY_SIZE (one_arg_builtins),
- ftype);
+  ftype = build_function_type_list (alpha_dimode_u, alpha_dimode_u, NULL_TREE);
+  alpha_add_builtins (one_arg_builtins, ARRAY_SIZE (one_arg_builtins), ftype);
 
-  ftype = build_function_type_list (dimode_integer_type_node,
-   dimode_integer_type_node,
-   dimode_integer_type_node, NULL_TREE);
-  alpha_add_builtins (two_arg_builtins, ARRAY_SIZE (two_arg_builtins),
- ftype);
+  ftype = build_function_type_list (alpha_dimode_u, alpha_dimode_u,
+   alpha_dimode_u, NULL_TREE);
+  alpha_add_builtins (two_arg_builtins, ARRAY_SIZE (two_arg_builtins), ftype);
 
   ftype = build_function_type_list (ptr_type_node, NULL_TREE);
   alpha_builtin_function (__builtin_thread_pointer, ftype,
@@ -6558,11 +6557,6 @@ alpha_init_builtins (void)
 
   vms_patch_builtins ();
 }
-
-  alpha_v8qi_u = build_vector_type (unsigned_intQI_type_node, 8);
-  alpha_v8qi_s = build_vector_type (intQI_type_node, 8);
-  alpha_v4hi_u = build_vector_type (unsigned_intHI_type_node, 4);
-  alpha_v4hi_s = build_vector_type (intHI_type_node, 4);
 }
 
 /* Expand an expression EXP that calls a built-in function,
@@ -6675,10 +6669,10 @@ alpha_fold_builtin_cmpbge (unsigned HOST_WIDE_INT opint[], long op_const)
  if (c0 >= c1)
val |= 1 << i;
}
-  return build_int_cst 
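
The hunk above is cut off in the archive. For context, the following is a sketch of the byte-wise comparison that the visible `c0 >= c1` fragment belongs to, written from the Alpha CMPBGE instruction semantics (unsigned per-byte "greater than or equal", one result bit per byte). `cmpbge_model` is a hypothetical standalone function, and the loop around the fragment is an assumption since it is not shown in the message.

```c
#include <stdint.h>

/* Model of CMPBGE: bit I of the result is set when byte I of A is
   greater than or equal to byte I of B, compared as unsigned.  */
static uint64_t
cmpbge_model (uint64_t a, uint64_t b)
{
  uint64_t val = 0;

  for (int i = 0; i < 8; ++i)
    {
      uint64_t c0 = (a >> (i * 8)) & 0xff;
      uint64_t c1 = (b >> (i * 8)) & 0xff;

      if (c0 >= c1)
        val |= (uint64_t) 1 << i;
    }
  return val;
}
```

Two equal operands give 0xff, since every byte compares greater than or equal to itself.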

Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-27 Thread Steven Bosscher
On Wed, Jun 27, 2012 at 11:37 PM, Richard Henderson r...@redhat.com wrote:
        * tree.def (MULT_HIGHPART_EXPR): New.
        * cfgexpand.c (expand_debug_expr): Ignore it.
        * expr.c (expand_expr_real_2): Handle it.
        * fold-const.c (int_const_binop_1): Likewise.
        * optabs.c (optab_for_tree_code): Likewise.
        * tree-cfg.c (verify_gimple_assign_binary): Likewise.
        * tree-inline.c (estimate_operator_cost): Likewise.
        * tree-pretty-print.c (dump_generic_node): Likewise.
        (op_code_prio, op_symbol_code): Likewise.
        * tree.c (commutative_tree_code): Likewise.  Also handle
        WIDEN_MULT_EXPR, VEC_WIDEN_MULT_HI_EXPR, VEC_WIDEN_MULT_LO_EXPR.

Maybe also a bit in doc/generic.texi? Or is this not supposed to be
exposed to the front ends?

Ciao!
Steven


Re: [PATCH] Add MULT_HIGHPART_EXPR

2012-06-27 Thread Richard Henderson
On 06/27/2012 02:42 PM, Steven Bosscher wrote:
 Maybe also a bit in doc/generic.texi? Or is this not supposed to be
 exposed to the front ends?

I can't imagine it being terribly useful to front ends, but there's
certainly nothing that ought to prevent it.  How's this?


r~
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index e99366f..c48b663 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1235,6 +1235,7 @@ the byte offset of the field, but should not be used directly; call
 @tindex PLUS_EXPR
 @tindex MINUS_EXPR
 @tindex MULT_EXPR
+@tindex MULT_HIGHPART_EXPR
 @tindex RDIV_EXPR
 @tindex TRUNC_DIV_EXPR
 @tindex FLOOR_DIV_EXPR
@@ -1433,6 +1434,11 @@ one operand is of floating type and the other is of integral type.
 The behavior of these operations on signed arithmetic overflow is
 controlled by the @code{flag_wrapv} and @code{flag_trapv} variables.
 
+@item MULT_HIGHPART_EXPR
+This node represents the ``high-part'' of a widening multiplication.
+For an integral type with @var{b} bits of precision, the result is
+the most significant @var{b} bits of the full @math{2@var{b}} product.
+
 @item RDIV_EXPR
 This node represents a floating point division operation.
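
To make the MULT_HIGHPART_EXPR wording concrete, here is a small sketch of the same operation in plain C for b = 32: widen to 64 bits, multiply, keep the top 32 bits. The function names are hypothetical. The signed variant assumes that right-shifting a negative value is an arithmetic shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <stdint.h>

/* Unsigned high part: most significant 32 bits of the 64-bit product.  */
static uint32_t
umul_highpart32 (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* Signed high part.  Assumes >> on a negative int64_t is an arithmetic
   shift (implementation-defined in ISO C; true for GCC).  */
static int32_t
smul_highpart32 (int32_t a, int32_t b)
{
  return (int32_t) (((int64_t) a * b) >> 32);
}
```

For example, 0xFFFFFFFF * 0xFFFFFFFF is 0xFFFFFFFE00000001, so the unsigned high part is 0xFFFFFFFE.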
 


[testsuite] gcc.dg/vect/vect-50.c: combine two scans

2012-06-27 Thread Janis Johnson
These scans from gcc.dg/vect/vect-50.c, and others similar to them in
other vect tests, hurt my brain:

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */

Both of these PASS for i686-pc-linux-gnu, causing duplicate lines in the
gcc test summary.  I'm pretty sure the following accomplishes the same
goal:

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */

That is, run the check everywhere and expect it to fail for effective
targets for which vect_no_align is true and vect_hw_misalign is false.

Tested on i686-pc-linux-gnu and arm-none-eabi.  I'm confused enough that
I'm not going to call this one obvious; it needs a sanity check from
someone else.  OK for trunk?

Janis
2012-06-27  Janis Johnson  jani...@codesourcery.com

* gcc.dg/vect/vect-50.c: Combine two scans.

Index: gcc.dg/vect/vect-50.c
===
--- gcc.dg/vect/vect-50.c   (revision 189025)
+++ gcc.dg/vect/vect-50.c   (working copy)
@@ -61,8 +61,7 @@
align the store will not force the two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align && {! vect_hw_misalign } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */


[patch] Remove IFCVT_EXTRA_FIELDS

2012-06-27 Thread Steven Bosscher
Hello,

The only user of IFCVT_EXTRA_FIELDS is FR-V -- and it's not even using
the macro to define extra fields...

This patch removes IFCVT_EXTRA_FIELDS and replaces the related
IFCVT_INIT_EXTRA_FIELDS with IFCVT_MACHDEP_INIT.

Bootstrapped and tested on x86_64-unknown-linux-gnu, and build-tested with
a cross from powerpc64-unknown-linux-gnu to frv-elf.
OK for trunk?

Ciao!
Steven


cleanup_IFCVT_EXTRA_FIELDS.diff
Description: Binary data


Re: [patch] Remove IFCVT_EXTRA_FIELDS

2012-06-27 Thread Richard Henderson
On 06/27/2012 03:53 PM, Steven Bosscher wrote:
   * system.h (IFCVT_EXTRA_FIELDS): Poison.
   (IFCVT_INIT_EXTRA_FIELDS): Poison.
   * basic-block.h (struct ce_if_block): Remove IFCVT_EXTRA_FIELDS.
   * ifcvt.c (find_if_header): Use IFCVT_MACHDEP_INIT instead of
   IFCVT_INIT_EXTRA_FIELDS.
   * gengtype-parse.c (struct_field_seq): Remove obsolete comment.
   * config/frv/frv.h (IFCVT_INIT_EXTRA_FIELDS): Rename to
   IFCVT_MACHDEP_INIT.
   * config/frv/frv.c (frv_ifcvt_init_extra_fields): Rename to
   frv_ifcvt_machdep_init.
   * doc/tm.texi.in (IFCVT_INIT_EXTRA_FIELDS, IFCVT_EXTRA_FIELDS):
   Remove documentation.
   (IFCVT_MACHDEP_INIT): Document.
   * doc/tm.texi: Regenerate.

Ok.

r~
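
For anyone unfamiliar with what "Poison" means in the system.h entry, this is a minimal sketch of the mechanism, assuming it refers to `#pragma GCC poison`. All `DEMO_` and `demo_` names are hypothetical stand-ins; the real change poisons IFCVT_EXTRA_FIELDS and IFCVT_INIT_EXTRA_FIELDS.

```c
#include <stddef.h>

/* Hypothetical legacy macro that used to inject fields into a struct.  */
#define DEMO_EXTRA_FIELDS int demo_extra;

/* Old layout: the macro adds target-specific fields.  */
struct demo_block_old { int id; DEMO_EXTRA_FIELDS };

/* Retire the macro, then poison the name so that any later use,
   including a new #define, becomes a hard compile error.  */
#undef DEMO_EXTRA_FIELDS
#pragma GCC poison DEMO_EXTRA_FIELDS

/* New layout: no macro-injected fields.  */
struct demo_block_new { int id; };

/* Bytes removed from the struct by dropping the injected field.  */
static size_t
demo_bytes_saved (void)
{
  return sizeof (struct demo_block_old) - sizeof (struct demo_block_new);
}
```

Poisoning the old names means an out-of-tree port that still defines them fails to build instead of silently losing its extra fields.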


Re: [patch i386]: always allow for pe-coff that relocations can be put into readonly memory

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 1:13 PM, Richard Henderson wrote:
 Would it be of any use to introduce an .rdata$N section (equivalent
 to .data.ro) so that most of the runtime relocations are adjacent,
 and more of the executable image is sharable?

I can't help but think that grouping them together would be a win, provided the 
relocations are allowed...  I know on darwin, we kinda hate those sorts of 
relocations because they are a performance sap.  This type of performance sap 
is nasty as it is pervasive and invisible and hard to ever get back.  Would be 
nice to ensure any change applied doesn't hurt performance...


Re: [patch testsuite]: Fix another LP64 vs LLP64 issue

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 12:21 PM, Kai Tietz wrote:
 this patch fixes a testsuite-failure for LLP64 targets.
 
 ChangeLog
 
 2012-06-27  Kai Tietz
 
* g++.dg/cpp0x/constexpr-52672.C (ul_ptr): Use SIZE_TYPE instead of
hard-coded 'unsigned long'.

 Ok for apply?

Ok.
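
A short illustration of the LP64 versus LLP64 difference behind this fix, as a sketch only: it assumes the `SIZE_TYPE` in the ChangeLog refers to the predefined `__SIZE_TYPE__` macro, and the names below are hypothetical rather than taken from the test case.

```c
/* __SIZE_TYPE__ is predefined by GCC and expands to the type of
   size_t: unsigned long on LP64, unsigned long long on LLP64.  */
typedef __SIZE_TYPE__ ul_ptr_demo;

/* Nonzero when the size type is wide enough to hold a pointer value.  */
static int
size_type_holds_pointer (void)
{
  return sizeof (ul_ptr_demo) >= sizeof (void *);
}

/* Nonzero when unsigned long is wide enough to hold a pointer value.
   True on LP64 (for example x86_64 Linux), false on LLP64 (for
   example x86_64-w64-mingw32, where unsigned long is 32 bits).  */
static int
unsigned_long_holds_pointer (void)
{
  return sizeof (unsigned long) >= sizeof (void *);
}
```

The first check holds on both data models, which is why the size type is the safer spelling in a test that runs on LLP64 targets.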


Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 3:36 PM, Janis Johnson wrote:
 These scans from gcc.dg/vect/vect-50.c, and others similar to them in
 other vect tests, hurt my brain:
 
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
 
 Both of these PASS for i686-pc-linux-gnu, causing duplicate lines in the
 gcc test summary.  I'm pretty sure the following accomplishes the same
 goal:
 
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */

I don't think so?  The first sets the xfail status for the testcase.  If you 
change the condition, you change the xfail state for some targets, which would 
be wrong (without a vect person chiming in).

I'd like to think you can compose the two with some spelling...  I just don't 
think this one is it.

I grepped around and found:

  /* { dg-message "does break strict-aliasing" "" { target { *-*-* && lp64 } xfail *-*-* } 8 } */

which might have the right way to spell it, though, I always test to ensure the 
construct does what I want.

 That is, run the check everywhere

We don't want to run the test on other than vect_hw_misalign targets, right?
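
To compare the directives under discussion, here is a small sketch that tabulates what each one expects for a given pair of effective-target answers. It assumes the usual DejaGnu reading: a `target` selector decides whether a check runs at all, and an `xfail` selector marks it as an expected failure. The enum and function names are hypothetical.

```c
/* What a scan directive expects on a given target.  */
enum expectation { NOT_RUN, EXPECT_PASS, EXPECT_FAIL };

/* Original first scan: runs everywhere, xfail on vect_no_align.  */
static enum expectation
old_scan_xfail (int no_align, int hw_misalign)
{
  (void) hw_misalign;
  return no_align ? EXPECT_FAIL : EXPECT_PASS;
}

/* Original second scan: runs only on vect_hw_misalign targets.  */
static enum expectation
old_scan_target (int no_align, int hw_misalign)
{
  (void) no_align;
  return hw_misalign ? EXPECT_PASS : NOT_RUN;
}

/* Proposed combined scan: runs everywhere, xfail only when
   vect_no_align is true and vect_hw_misalign is false.  */
static enum expectation
combined_scan (int no_align, int hw_misalign)
{
  return (no_align && !hw_misalign) ? EXPECT_FAIL : EXPECT_PASS;
}
```

Under this reading the forms differ on targets where both properties are true: the two original scans disagree with each other (one expects failure, one expects a pass), while the combined scan expects a pass. On targets where vect_hw_misalign is false, the combined scan matches the first original scan.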


Re: [RFA] Enable dump-noaddr test to work in out of build tree testing

2012-06-27 Thread Mike Stump
On Jun 27, 2012, at 1:35 PM, Andrew Pinski wrote:
 On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann
 matthew.gretton-d...@arm.com wrote:
 All,
 
 This patch enables the dump-noaddr test to work in out-of-build-tree
 testing.
 
 It does this by making sure that the dump files generated during the
 test are created under $tmpdir.
 
 I created a much simpler patch which I have been meaning to submit.
 I attached it for reference.

I'll let Matthew chime in.  I do wonder if the testcase name comes out short 
[basename]...  If Matthew likes your version better, I'll pre-approve the 
backout of his and the addition of yours.


Re: [wwwdocs] Update coding conventions for C++

2012-06-27 Thread Lawrence Crowl
On 6/27/12, Martin Jambor mjam...@suse.cz wrote:
 On Tue, Jun 26, 2012 at 11:06:15AM -0700, Lawrence Crowl wrote:
  On 6/26/12, Martin Jambor mjam...@suse.cz wrote:
   On Mon, Jun 25, 2012 at 03:26:01PM -0700, Lawrence Crowl wrote:
+but think twice before using it in code
+intended to last a long time.
  
   I think all committed code should be expected to have
   long-lasting quality.  I would not encourage people to think
   otherwise and would drop the long time reference here.
   If anybody ever commits something ugly to bridge a short time
   period, it should only be done under the maintainers grant
   exceptions rule anyway.
  
+</p> +<p>
+For long-term code, at least for now,
+we will continue to use <code>printf</code> style I/O
+rather than <code>&lt;iostream&gt;</code> style I/O.
+For quick debugging code,
+<code>&lt;iostream&gt;</code> is permitted.
+</p>
  
   Similarly here, no quick and dirty debugging output should
   ever be committed, we should not
  
+<h4><a name="stdlib">The Standard Library</a></h4>
+
+<p>
+At present, C++ provides no great advantage for i18n.
+GCC does type checking for <code>printf</code> arguments,
+so the type insecurity of <code>printf</code> is moot,
+but the clarity in layout persists.
+For quick debugging output, &lt;iostream&gt; requires less work.
+</p>
  
   The same applies here.
 
  The value of these changes depends on when the rules are
  enforced.  If they are enforced only on trunk, then the changes
  seem fine to me.  However, if they are enforced on branches,
  then they could unnecessarily slow down development.
 
  Comments?

 I think that if you have a private branch, you are basically
 its maintainer and can grant yourself any exception from any
 rule you want.  Of course, that might make your life harder if
 you later want to contribute the changes to the trunk, release
 branches, other people's branches and generally anywhere.

I am not concerned about private branches, but public branches for
which sharing might be needed, but for which cleanup before going
into trunk or a release is reasonable.

C++ streams are much more convenient for free-form output than
C-based solutions.  Having said that, does anyone object to removing
the permission to use C++ streams?

-- 
Lawrence Crowl
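
As background for the quoted remark that GCC type-checks `printf` arguments, here is a minimal sketch of how a project-local wrapper can opt in to the same checking through the `format` function attribute. `debug_format` is a hypothetical helper and not a function from the GCC sources.

```c
#include <stdarg.h>
#include <stdio.h>

/* Argument 3 is the format string and checking starts at argument 4,
   so -Wformat diagnoses mismatches such as passing an int for %s.  */
static int debug_format (char *buf, size_t size, const char *fmt, ...)
  __attribute__ ((format (printf, 3, 4)));

/* Format into BUF (at most SIZE bytes, always NUL-terminated when
   SIZE > 0) and return the length the full output would have.  */
static int
debug_format (char *buf, size_t size, const char *fmt, ...)
{
  va_list ap;
  int written;

  va_start (ap, fmt);
  written = vsnprintf (buf, size, fmt, ap);
  va_end (ap);
  return written;
}
```

A call such as `debug_format (buf, sizeof buf, "%d-%s", 42, "x")` is checked at compile time when `-Wformat` is enabled.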


Re: [patch i386]: always allow for pe-coff that relocations can be put into readonly memory

2012-06-27 Thread Kai Tietz
2012/6/27 Richard Henderson r...@redhat.com:
 On 06/27/2012 12:47 PM, Kai Tietz wrote:
 2012-06-27  Kai Tietz

       * config/i386/winnt.c (i386_pe_reloc_rw_mask): New function.
       * config/i386/i386-protos.h (i386_pe_reloc_rw_mask): Add
       prototype.
       * config/i386/cygming.h (TARGET_ASM_RELOC_RW_MASK): Define
       as i386_pe_reloc_rw_mask.

 Tested for i686-pc-cygwin, i686-w64-mingw32, and x86_64-w64-mingw32.
 Ok for apply?

 Plausible.

 I suppose this gets handled by the windows loader similar to how
 the .data.ro sections get handled by an elf loader with -z relro?
 I.e. relocations applied then the page protections reapplied?

Correct.

 Would it be of any use to introduce an .rdata$N section (equivalent
 to .data.ro) so that most of the runtime relocations are adjacent,
 and more of the executable image is sharable?

Sounds interesting from the perspective of startup speed.  I wouldn't
assume that it has much effect on memory savings.  Checking this is the
subject of a different patch, but I will give it a try.

 r~

Kai