Re: [SH] PR 49880 - Fix some more -mdiv option issues

2013-03-12 Thread Kaz Kojima
Oleg Endo  wrote:
> The attached patch should make the -mdiv= option work as it is described
> in the documentation (which I updated recently as part of PR 56529).
> 
> Tested with 'make all' and
> 
> make -k check-gcc RUNTESTFLAGS="sh.exp=pr49880* --target_board=sh-sim
> \{-m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,
> -m4-single-only,-m4a,-m4a-single,-m4a-single-only}"
> 
> OK for 4.8 and 4.7?
 
OK.

Regards,
kaz


[SH] PR 49880 - Fix some more -mdiv option issues

2013-03-12 Thread Oleg Endo
Hi,

Initially I just wanted to simplify two lines as mentioned in the PR.
However, when I started writing the test cases a small can of worms
popped up.
'-m4 -mdiv=call-div1' would not link on bare metal configs because of
missing functions in libgcc, '-m2a -mdiv=call-fp' would ICE and/or not
link, and '*-nofpu -mdiv=call-fp' would invoke library functions that use
the FPU.
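
For illustration only, here is a minimal sketch of the kind of source that
exercises these code paths.  It is not one of the pr49880-*.c files from the
patch, and the function names are made up.  The assumption is that, on a
target without a hardware integer divider, a division with two variable
operands ends up as a call into one of the libgcc division routines
(the sdivsi3/udivsi3 family named in the libgcc ChangeLog below), so building
this with the various -m* and -mdiv= combinations should show whether the
result compiles and links.

```c
/* Hypothetical smoke test, not taken from the patch.  */
#include <assert.h>

/* Signed 32-bit division with variable operands.  The compiler cannot
   fold this, so the selected -mdiv= strategy decides what is emitted.  */
int
test_sdiv (int a, int b)
{
  assert (b != 0);
  return a / b;
}

/* Unsigned counterpart, which goes through the unsigned division path.  */
unsigned int
test_udiv (unsigned int a, unsigned int b)
{
  assert (b != 0);
  return a / b;
}
```

Calling both functions from a small main and linking the result is probably
enough to reproduce a missing-symbol failure of the kind described above.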

I've also run across some confusion regarding TARGET_FPU_SINGLE and
friends, but I'm leaving a better cleanup for 4.9.  Basically it's not
possible to distinguish between -m4-nofpu and -m4-single-only, because
there are no corresponding bits.  Thus the two new mask bits in sh.opt,
which required converting some of the existing mask bits into a var, as
we already ran out of bits once in the past.

The attached patch should make the -mdiv= option work as it is described
in the documentation (which I updated recently as part of PR 56529).

Tested with 'make all' and

make -k check-gcc RUNTESTFLAGS="sh.exp=pr49880* --target_board=sh-sim
\{-m2,-m2a,-m2a-nofpu,-m2a-single,-m2a-single-only,-m3,-m3e,-m4,-m4-single,
-m4-single-only,-m4a,-m4a-single,-m4a-single-only}"

OK for 4.8 and 4.7?

Cheers,
Oleg

gcc/ChangeLog:

PR target/49880
* config/sh/sh.opt (FPU_SINGLE_ONLY): New mask.
(musermode): Convert to Var(TARGET_USERMODE).
* config/sh/sh.h (SELECT_SH2A_SINGLE_ONLY, 
SELECT_SH4_SINGLE_ONLY, MASK_ARCH): Add MASK_FPU_SINGLE_ONLY.
* config/sh/sh.c (sh_option_override): Use
TARGET_FPU_DOUBLE || TARGET_FPU_SINGLE_ONLY for call-fp case.
* config/sh/sh.md (udivsi3_i1, divsi3_i1): Remove ! TARGET_SH4 
condition.
(udivsi3_i4, divsi3_i4): Use TARGET_FPU_DOUBLE condition instead
of TARGET_SH4.
(udivsi3_i4_single, divsi3_i4_single): Use 
TARGET_FPU_SINGLE_ONLY || TARGET_FPU_DOUBLE instead of 
TARGET_HARD_SH4.

libgcc/ChangeLog:

PR target/49880
* config/sh/lib1funcs.S (sdivsi3_i4, udivsi3_i4): Enable for 
SH2A.
(sdivsi3, udivsi3): Remove SH4 check and always compile these 
functions.

testsuite/ChangeLog:

PR target/49880
* testsuite/gcc.target/sh/pr49880-1.c: New.
* testsuite/gcc.target/sh/pr49880-2.c: New.
* testsuite/gcc.target/sh/pr49880-3.c: New.
* testsuite/gcc.target/sh/pr49880-4.c: New.
* testsuite/gcc.target/sh/pr49880-5.c: New.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 196589)
+++ gcc/config/sh/sh.md	(working copy)
@@ -2154,7 +2154,7 @@
(clobber (reg:SI PR_REG))
(clobber (reg:SI R4_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1)"
+  "TARGET_SH1 && TARGET_DIVIDE_CALL_DIV1"
   "jsr	@%1%#"
   [(set_attr "type" "sfunc")
(set_attr "needs_delay_slot" "yes")])
@@ -2217,7 +2217,7 @@
(clobber (reg:SI R5_REG))
(use (reg:PSI FPSCR_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "TARGET_SH4 && ! TARGET_FPU_SINGLE"
+  "TARGET_FPU_DOUBLE && ! TARGET_FPU_SINGLE"
   "jsr	@%1%#"
   [(set_attr "type" "sfunc")
(set_attr "fp_mode" "double")
@@ -2236,7 +2236,8 @@
(clobber (reg:SI R4_REG))
(clobber (reg:SI R5_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "(TARGET_HARD_SH4 || TARGET_SHCOMPACT) && TARGET_FPU_SINGLE"
+  "(TARGET_FPU_SINGLE_ONLY || TARGET_FPU_DOUBLE || TARGET_SHCOMPACT)
+   && TARGET_FPU_SINGLE"
   "jsr	@%1%#"
   [(set_attr "type" "sfunc")
(set_attr "needs_delay_slot" "yes")])
@@ -2358,7 +2359,7 @@
(clobber (reg:SI R2_REG))
(clobber (reg:SI R3_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "TARGET_SH1 && (! TARGET_SH4 || TARGET_DIVIDE_CALL_DIV1)"
+  "TARGET_SH1 && TARGET_DIVIDE_CALL_DIV1"
   "jsr	@%1%#"
   [(set_attr "type" "sfunc")
(set_attr "needs_delay_slot" "yes")])
@@ -2487,7 +2488,7 @@
(clobber (reg:DF DR2_REG))
(use (reg:PSI FPSCR_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "TARGET_SH4 && ! TARGET_FPU_SINGLE"
+  "TARGET_FPU_DOUBLE && ! TARGET_FPU_SINGLE"
   "jsr	@%1%#"
   [(set_attr "type" "sfunc")
(set_attr "fp_mode" "double")
@@ -2501,7 +2502,8 @@
(clobber (reg:DF DR2_REG))
(clobber (reg:SI R2_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
-  "(TARGET_HARD_SH4 || TARGET_SHCOMPACT) && TARGET_FPU_SINGLE"
+  "(TARGET_FPU_SINGLE_ONLY || TARGET_FPU_DOUBLE || TARGET_SHCOMPACT)
+   && TARGET_FPU_SINGLE"
   "jsr	@%1%#"
   [(set_attr "type" "sfunc")
(set_attr "needs_delay_slot" "yes")])
Index: gcc/config/sh/sh.opt
===
--- gcc/config/sh/sh.opt	(revision 196589)
+++ gcc/config/sh/sh.opt	(working copy)
@@ -24,6 +24,10 @@
 ;; Set if the default precision of th FPU is single.
 Mask(FPU_SINGLE)
 
+;; Set if the a double-precision FPU is present but is restricted to
+;; single precisi

[Patch, microblaze]: Add MicroBlaze TLS configure support

2013-03-12 Thread David Holsgrove
Add test for MicroBlaze TLS support to gcc/configure.ac

gcc/Changelog

2013-03-13  Edgar E. Iglesias 
David Holsgrove 

  * configure.ac: Add MicroBlaze TLS support detection.
  * configure: Regenerate.

Signed-off-by: Edgar E. Iglesias 
Signed-off-by: David Holsgrove 



0002-Patch-microblaze-Add-MicroBlaze-TLS-configure-suppor.patch
Description: 0002-Patch-microblaze-Add-MicroBlaze-TLS-configure-suppor.patch


[Patch, microblaze]: Add support for TLS in MicroBlaze

2013-03-12 Thread David Holsgrove
Add support for thread local storage (general dynamic and local dynamic models) 
in MicroBlaze.


gcc/Changelog

2013-03-13  Edgar E. Iglesias 
David Holsgrove 

 * config/microblaze/microblaze-protos.h (microblaze_cannot_force_const_mem,
   microblaze_tls_referenced_p, symbol_mentioned_p,
   label_mentioned_p): Add prototypes.
 * config/microblaze/microblaze.c (microblaze_address_type): Add ADDRESS_TLS
   and tls_reloc address types.
   (microblaze_address_info): Add tls_reloc.
   (TARGET_HAVE_TLS): Define.
   (get_tls_get_addr, microblaze_tls_symbol_p, microblaze_tls_operand_p_1,
   microblaze_tls_referenced_p, microblaze_cannot_force_const_mem,
   symbol_mentioned_p, label_mentioned_p, tls_mentioned_p, load_tls_operand,
   microblaze_call_tls_get_addr, microblaze_legitimize_tls_address): New
   functions.
   (microblaze_classify_unspec): Handle UNSPEC_TLS.
   (get_base_reg): Use microblaze_tls_symbol_p.
   (microblaze_classify_address): Handle TLS.
   (microblaze_legitimate_pic_operand): Use symbol_mentioned_p,
   label_mentioned_p and microblaze_tls_referenced_p.
   (microblaze_legitimize_address): Handle TLS.
   (microblaze_address_insns): Handle ADDRESS_TLS.
   (pic_address_needs_scratch): Handle TLS.
   (print_operand_address): Handle TLS.
   (microblaze_expand_prologue): Check TLS_NEEDS_GOT.
   (microblaze_expand_move): Handle TLS.
   (microblaze_legitimate_constant_p): Check microblaze_cannot_force_const_mem
   and microblaze_tls_symbol_p.
   (TARGET_CANNOT_FORCE_CONST_MEM): Define.
 * config/microblaze/microblaze.h (TLS_NEEDS_GOT): Define.
   (PIC_OFFSET_TABLE_REGNUM): Set.
 * config/microblaze/linux.h (TLS_NEEDS_GOT): Define.
 * config/microblaze/microblaze.md (UNSPEC_TLS): Define.
   (addsi3, movsi_internal2, movdf_internal): Update constraints.
 * config/microblaze/predicates.md (arith_plus_operand): Define.
   (move_operand): Redefine as move_src_operand, check
   microblaze_tls_referenced_p.

Signed-off-by: Edgar E. Iglesias 
Signed-off-by: David Holsgrove 



0001-Patch-microblaze-Add-support-for-TLS.patch
Description: 0001-Patch-microblaze-Add-support-for-TLS.patch


Re: [trunk][google/gcc47]Add dependence of configure-target-libmudflap on configure-target-libstdc++-v3 (issue7740043)

2013-03-12 Thread Diego Novillo

On 2013-03-12 13:24 , Jing Yu wrote:

I made a mistake in my previous patch. I did not notice that
Makefile.in was a generated file. Here is the updated patch.

2013-03-12  Jing Yu  

* Makefile.def (Target modules dependencies): Add new dependency.
* Makefile.in: Re-generate.


Index: Makefile.in
===
--- Makefile.in (revision 196604)
+++ Makefile.in (working copy)
@@ -6,6 +6,7 @@ all-target-libjava: maybe-all-target-boehm-gc
  all-target-libjava: maybe-all-target-libffi
  configure-target-libobjc: maybe-configure-target-boehm-gc
  all-target-libobjc: maybe-all-target-boehm-gc
+configure-target-libmudflap: maybe-configure-target-libstdc++-v3
  configure-target-libstdc++-v3: maybe-configure-target-libgomp

  configure-stage1-target-libstdc++-v3: maybe-configure-stage1-target-libgomp
Index: Makefile.def
===
--- Makefile.def (revision 196604)
+++ Makefile.def (working copy)
@@ -504,6 +504,7 @@ dependencies = { module=all-target-libjava; on=all
  dependencies = { module=all-target-libjava; on=all-target-libffi; };
  dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
  dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
+dependencies = { module=configure-target-libmudflap; on=configure-target-libstdc++-v3; };
  dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
  // parallel_list.o and parallel_settings.o depend on omp.h, which is
  // generated by the libgomp configure.  Unfortunately, due to the use of

On Mon, Mar 11, 2013 at 5:21 PM, Jing Yu  wrote:

Don't know why the email body became attachment. Sent it again.
The review link is https://codereview.appspot.com/7740043

Hi Diego,

The nightly build of gcc-4.7 based ppc64 and ppc32 crosstools have failed since
the build server upgraded to gPrecise one week ago. Log shows a configuration
failure on libmudflap.

checking for suffix of object files... /lib/cpp
configure: error: in
`/g/nightly/build/work/gcc-4.7.x-grtev3-powerpc32-8540/rpmbuild/BUILD/.../powerpc-grtev3-linux-gnu/libmudflap':
configure: error: C++ preprocessor "/lib/cpp" fails sanity check
See `config.log' for more details.

There is no /lib/cpp on gprecise server, though it should not be used here.

What happened was that libmudflap configure looks for a preprocessor
by trying $CXX -E and then backing off to /lib/cpp.  $CXX -E is
failing with "unrecognized command line option
‘-funconfigured-libstdc++’", and the /lib/cpp backstop then fails
also. The -funconfigured-libstdc++ is because configure can't find
libstdc++/scripts/testsuite_flags. This is a so-far undiagnosed race
in gcc make, masked where /lib/cpp is available.   And that's absent
because in this build, for whatever reason, libstdc++ loses a race
with libmudflap.

The theory is confirmed by:
  1) if we force --job=1, build can succeed
  2) If we apply the following patch to build-gcc/Makefile, build can
succeed. After removing this dependency, build fails with the same
error again.

Is this patch ok for google/gcc-4_7?

If the same issue exists on upstream trunk, how does the patch sound to trunk?

Thanks,
Jing

2013-03-11  Jing Yu  

 * Makefile.in: (maybe-configure-target-libmudflap):
 Add dependence on configure-target-libstdc++-v3.


OK for google/gcc-4_7.  It's fine for trunk as well, but it may need to 
wait until trunk opens up again.



Diego.



Index: Makefile.in
===
--- Makefile.in (revision 196604)
+++ Makefile.in (working copy)
@@ -31879,6 +31879,9 @@ maybe-configure-target-libmudflap:
  @if gcc-bootstrap
  configure-target-libmudflap: stage_current
  @endif gcc-bootstrap
+@if target-libstdc++-v3
+configure-target-libmudflap: configure-target-libstdc++-v3
+@endif target-libstdc++-v3
  @if target-libmudflap
  maybe-configure-target-libmudflap: configure-target-libmudflap
  configure-target-libmudflap:




Re: [trunk][google/gcc47]Add dependence of configure-target-libmudflap on configure-target-libstdc++-v3 (issue7740043)

2013-03-12 Thread Jing Yu
I made a mistake in my previous patch. I did not notice that
Makefile.in was a generated file. Here is the updated patch.

2013-03-12  Jing Yu  

   * Makefile.def (Target modules dependencies): Add new dependency.
   * Makefile.in: Re-generate.


Index: Makefile.in
===
--- Makefile.in (revision 196604)
+++ Makefile.in (working copy)
@@ -6,6 +6,7 @@ all-target-libjava: maybe-all-target-boehm-gc
 all-target-libjava: maybe-all-target-libffi
 configure-target-libobjc: maybe-configure-target-boehm-gc
 all-target-libobjc: maybe-all-target-boehm-gc
+configure-target-libmudflap: maybe-configure-target-libstdc++-v3
 configure-target-libstdc++-v3: maybe-configure-target-libgomp

 configure-stage1-target-libstdc++-v3: maybe-configure-stage1-target-libgomp
Index: Makefile.def
===
--- Makefile.def (revision 196604)
+++ Makefile.def (working copy)
@@ -504,6 +504,7 @@ dependencies = { module=all-target-libjava; on=all
 dependencies = { module=all-target-libjava; on=all-target-libffi; };
 dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
 dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
+dependencies = { module=configure-target-libmudflap; on=configure-target-libstdc++-v3; };
 dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
 // parallel_list.o and parallel_settings.o depend on omp.h, which is
 // generated by the libgomp configure.  Unfortunately, due to the use of

On Mon, Mar 11, 2013 at 5:21 PM, Jing Yu  wrote:
> Don't know why the email body became attachment. Sent it again.
> The review link is https://codereview.appspot.com/7740043
>
> Hi Diego,
>
> The nightly build of gcc-4.7 based ppc64 and ppc32 crosstools have failed since
> the build server upgraded to gPrecise one week ago. Log shows a configuration
> failure on libmudflap.
>
> checking for suffix of object files... /lib/cpp
> configure: error: in
> `/g/nightly/build/work/gcc-4.7.x-grtev3-powerpc32-8540/rpmbuild/BUILD/.../powerpc-grtev3-linux-gnu/libmudflap':
> configure: error: C++ preprocessor "/lib/cpp" fails sanity check
> See `config.log' for more details.
>
> There is no /lib/cpp on gprecise server, though it should not be used here.
>
> What happened was that libmudflap configure looks for a preprocessor
> by trying $CXX -E and then backing off to /lib/cpp.  $CXX -E is
> failing with "unrecognized command line option
> ‘-funconfigured-libstdc++’", and the /lib/cpp backstop then fails
> also. The -funconfigured-libstdc++ is because configure can't find
> libstdc++/scripts/testsuite_flags. This is a so-far undiagnosed race
> in gcc make, masked where /lib/cpp is available.   And that's absent
> because in this build, for whatever reason, libstdc++ loses a race
> with libmudflap.
>
> The theory is confirmed by:
>  1) if we force --job=1, build can succeed
>  2) If we apply the following patch to build-gcc/Makefile, build can
> succeed. After removing this dependency, build fails with the same
> error again.
>
> Is this patch ok for google/gcc-4_7?
>
> If the same issue exists on upstream trunk, how does the patch sound to trunk?
>
> Thanks,
> Jing
>
> 2013-03-11  Jing Yu  
>
> * Makefile.in: (maybe-configure-target-libmudflap):
> Add dependence on configure-target-libstdc++-v3.
>
> Index: Makefile.in
> ===
> --- Makefile.in (revision 196604)
> +++ Makefile.in (working copy)
> @@ -31879,6 +31879,9 @@ maybe-configure-target-libmudflap:
>  @if gcc-bootstrap
>  configure-target-libmudflap: stage_current
>  @endif gcc-bootstrap
> +@if target-libstdc++-v3
> +configure-target-libmudflap: configure-target-libstdc++-v3
> +@endif target-libstdc++-v3
>  @if target-libmudflap
>  maybe-configure-target-libmudflap: configure-target-libmudflap
>  configure-target-libmudflap:


[patch, committed, wwwdata] Add id="current" to Current releases in gcc.gnu.org/onlinedocs/

2013-03-12 Thread Tobias Burnus

Hi,

I have committed the attached trivial patch in order to link directly to 
the documentation of the developer version:


http://gcc.gnu.org/onlinedocs/#current

Tobias

PS: For releases, one can use the version number followed by "/" instead, e.g.
http://gcc.gnu.org/onlinedocs/4.7.2/
Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/onlinedocs/index.html,v
retrieving revision 1.129
diff -u -r1.129 index.html
--- index.html	20 Sep 2012 10:17:30 -	1.129
+++ index.html	12 Mar 2013 16:55:05 -
@@ -734,7 +734,7 @@
 
 
 
-Current development
+Current development
 
 Please note that the following documentation refers to current
 development.  Some information may not be applicable to any


Re: [PATCH][6/n] tree LIM TLC

2013-03-12 Thread Steven Bosscher
On Tue, Mar 12, 2013 at 4:33 PM, Richard Biener wrote:
> On Tue, 12 Mar 2013, Steven Bosscher wrote:

>> I suppose this renders my LIM patch obsolete.
>
> Not really - it's still
>
>  tree loop invariant motion: 588.31 (78%) usr
>
> so limiting the O(n^2) dependence testing is a good thing.  But I
> can take it over from here and implement that on top of my patches
> if you like.

That'd be good, let's keep it in one hand, one set.


>> Did you also look at the memory footprint?
>
> Yeah, unfortunately processing outermost loops separately doesn't
> reduce peak memory consumption.  I'll look into getting rid of the
> all-refs bitmaps, but I'm not there yet.

A few more ideas (though probably not with as much impact):

Is it possible to use a bitmap_head for the (now merged)
dep_loop/indep_loop, instead of bitmap? Likewise for a few other
bitmaps, especially the vectors of bitmaps.

Put "struct depend" in an alloc pool. (Also allows one to wipe them
all out in free_lim_aux_data.)
Likewise for "struct mem_ref".

Use a shared mem_ref for the error_mark_node case (and hoist the
MEM_ANALYZABLE checks in refs_independent_p above the bitmap tests).

Use nameless temps instead of lsm_tmp_name_add.


> Currently the testcase peaks at 1.7GB for me (after LIM, then
> it gets worse with DSE and IRA).  And I only tested -O1 so far.

Try my DSE patch (corrected version attached).

What are you using now to measure per-pass memory usage? I'm still
using my old hack (also attached) but it's not quite optimal.

Ciao!
Steven


PR39326_RTLDSE.diff
Description: Binary data


passes_memstat.diff
Description: Binary data


Re: [PATCH] Fix install-plugin with vxworks-dummy.h (PR plugins/45078)

2013-03-12 Thread Matthias Klose
Am 06.03.2013 20:44, schrieb Jakub Jelinek:
> Hi!
> 
> On Wed, Mar 06, 2013 at 06:57:03PM +0800, Matthias Klose wrote:
>> There is still vxworks-dummy.h, which is not installed, see PR45078. Would the
>> same approach work?
> 
> Like this?  Untested though, and no access to most of the targets.

Looks OK.

I'm using the first chunk, as in a patch proposed earlier (or maybe just applied
locally). It works for arm and sparc; sh4 has not built yet; for mips, a tri-arch
build currently fails with

Bootstrap comparison failure!
mips-linux-gnu/64/libstdc++-v3/src/c++98/sstream-inst.o differs
mips-linux-gnu/64/libstdc++-v3/src/c++98/istream-inst.o differs
mips-linux-gnu/64/libstdc++-v3/src/c++98/ostream-inst.o differs
make[4]: *** [compare] Error 1

  Matthias



Re: [PATCH][6/n] tree LIM TLC

2013-03-12 Thread Richard Biener
On Tue, 12 Mar 2013, Steven Bosscher wrote:

> On Tue, Mar 12, 2013 at 4:25 PM, Richard Biener wrote:
> >
> > (Un-?)surprisingly the most effective compile-time reduction for
> > the testcase in PR39326 is to employ ao_ref caching for
> > alias oracle queries and caching of expanded affine-combinations
> > for affine disambiguations.
> >
> > This reduces compile-time to a manageable amount in the first place
> > for me (so I'm sending it "late" in the series).
> 
> I suppose this renders my LIM patch obsolete.

Not really - it's still

 tree loop invariant motion: 588.31 (78%) usr

so limiting the O(n^2) dependence testing is a good thing.  But I
can take it over from here and implement that on top of my patches
if you like.

> Did you also look at the memory footprint?

Yeah, unfortunately processing outermost loops separately doesn't
reduce peak memory consumption.  I'll look into getting rid of the
all-refs bitmaps, but I'm not there yet.

Currently the testcase peaks at 1.7GB for me (after LIM, then
it gets worse with DSE and IRA).  And I only tested -O1 so far.

Thanks,
Richard.


Re: [PATCH][6/n] tree LIM TLC

2013-03-12 Thread Steven Bosscher
On Tue, Mar 12, 2013 at 4:25 PM, Richard Biener wrote:
>
> (Un-?)surprisingly the most effective compile-time reduction for
> the testcase in PR39326 is to employ ao_ref caching for
> alias oracle queries and caching of expanded affine-combinations
> for affine disambiguations.
>
> This reduces compile-time to a manageable amount in the first place
> for me (so I'm sending it "late" in the series).

I suppose this renders my LIM patch obsolete.

Did you also look at the memory footprint?

Ciao!
Steven


[PATCH][6/n] tree LIM TLC

2013-03-12 Thread Richard Biener

(Un-?)surprisingly the most effective compile-time reduction for
the testcase in PR39326 is to employ ao_ref caching for
alias oracle queries and caching of expanded affine-combinations
for affine disambiguations.

This reduces compile-time to a manageable amount in the first place
for me (so I'm sending it "late" in the series).
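
To make the caching idea concrete, here is a self-contained sketch of the
"compute once, reuse on every later query" pattern.  It deliberately does not
use GCC's ao_ref or aff_tree types: every struct, field and function name
below is a hypothetical stand-in, and the overlap test is a toy replacement
for the real affine disambiguation.  The only part borrowed from the patch is
the shape of the idea, namely a per-reference cache slot that starts out in a
"not computed" state (compare the ref->aff_off.type = NULL_TREE
initialization in mem_ref_alloc) and is filled in on first use.

```c
/* Illustrative sketch only; none of these names exist in GCC.  */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the expensive-to-compute expanded form of an access.  */
struct expanded_offset
{
  bool valid;   /* False until the first query computes the value.  */
  long offset;  /* Start of the accessed byte range.  */
  long size;    /* Length of the accessed byte range.  */
};

struct mem_access
{
  long base, scale, index;       /* Access is at base + scale * index.  */
  long size;                     /* Number of bytes accessed.  */
  struct expanded_offset cache;  /* Lazily filled cache slot.  */
  unsigned expansions;           /* How often the slow path ran.  */
};

/* The "expensive" step; runs at most once per access.  */
static inline void
expand_offset (struct mem_access *m)
{
  assert (m->size > 0);
  m->cache.offset = m->base + m->scale * m->index;
  m->cache.size = m->size;
  m->cache.valid = true;
  m->expansions++;
}

/* Return the cached expansion, computing it on first use.  */
static inline const struct expanded_offset *
get_expanded (struct mem_access *m)
{
  if (!m->cache.valid)
    expand_offset (m);
  return &m->cache;
}

/* Return true if the two byte ranges may overlap.  */
bool
accesses_may_overlap (struct mem_access *a, struct mem_access *b)
{
  const struct expanded_offset *ea = get_expanded (a);
  const struct expanded_offset *eb = get_expanded (b);
  return ea->offset < eb->offset + eb->size
	 && eb->offset < ea->offset + ea->size;
}
```

With n references queried pairwise, the expansion runs n times instead of
once per pair, which is the kind of saving the caching is after.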

Bootstrap and regtest scheduled on x86_64-unknown-linux-gnu, queued
for 4.9.

Richard.

2013-03-12  Richard Biener  

PR tree-optimization/39326
* tree-ssa-loop-im.c (struct mem_ref): Replace mem member
with an ao_ref typed one.  Add affine-combination cache members.
(MEM_ANALYZABLE): Adjust.
(memref_eq): Likewise.
(mem_ref_alloc): Likewise.
(gather_mem_refs_stmt): Likewise.
(execute_sm_if_changed_flag_set): Likewise.
(execute_sm): Likewise.
(ref_always_accessed_p): Likewise.
(refs_independent_p): Likewise.
(can_sm_ref_p): Likewise.
(mem_refs_may_alias_p): Use ao_ref members to query the oracle.
Cache expanded affine combinations.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c   2013-03-12 15:11:12.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c2013-03-12 16:20:49.115169595 +0100
*** typedef struct mem_ref_locs
*** 117,126 
  
  typedef struct mem_ref
  {
-   tree mem;   /* The memory itself.  */
unsigned id;/* ID assigned to the memory reference
   (its index in memory_accesses.refs_list)  */
hashval_t hash; /* Its hash value.  */
bitmap stored;  /* The set of loops in that this memory location
   is stored to.  */
vec accesses_in_loop;
--- 117,130 
  
  typedef struct mem_ref
  {
unsigned id;/* ID assigned to the memory reference
   (its index in memory_accesses.refs_list)  */
hashval_t hash; /* Its hash value.  */
+ 
+   /* The memory access itself and associated caching of alias-oracle
+  query meta-data.  */
+   ao_ref mem; /* The ao_ref of this memory access.  */
+ 
bitmap stored;  /* The set of loops in that this memory location
   is stored to.  */
vec accesses_in_loop;
*** typedef struct mem_ref
*** 142,147 
--- 146,155 
bitmap indep_ref;   /* The set of memory references on that
   this reference is independent.  */
bitmap dep_ref; /* The complement of INDEP_REF.  */
+ 
+   /* The expanded affine combination of this memory access.  */
+   aff_tree aff_off;
+   double_int aff_size;
  } *mem_ref_p;
  
  
*** static bool ref_indep_loop_p (struct loo
*** 186,192 
  #define SET_ALWAYS_EXECUTED_IN(BB, VAL) ((BB)->aux = (void *) (VAL))
  
  /* Whether the reference was analyzable.  */
! #define MEM_ANALYZABLE(REF) ((REF)->mem != error_mark_node)
  
  static struct lim_aux_data *
  init_lim_data (gimple stmt)
--- 194,200 
  #define SET_ALWAYS_EXECUTED_IN(BB, VAL) ((BB)->aux = (void *) (VAL))
  
  /* Whether the reference was analyzable.  */
! #define MEM_ANALYZABLE(REF) ((REF)->mem.ref != error_mark_node)
  
  static struct lim_aux_data *
  init_lim_data (gimple stmt)
*** memref_eq (const void *obj1, const void
*** 1435,1441 
  {
const struct mem_ref *const mem1 = (const struct mem_ref *) obj1;
  
!   return operand_equal_p (mem1->mem, (const_tree) obj2, 0);
  }
  
  /* Releases list of memory reference locations ACCS.  */
--- 1443,1449 
  {
const struct mem_ref *const mem1 = (const struct mem_ref *) obj1;
  
!   return operand_equal_p (mem1->mem.ref, (const_tree) obj2, 0);
  }
  
  /* Releases list of memory reference locations ACCS.  */
*** static mem_ref_p
*** 1477,1483 
  mem_ref_alloc (tree mem, unsigned hash, unsigned id)
  {
mem_ref_p ref = XNEW (struct mem_ref);
!   ref->mem = mem;
ref->id = id;
ref->hash = hash;
ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack);
--- 1485,1491 
  mem_ref_alloc (tree mem, unsigned hash, unsigned id)
  {
mem_ref_p ref = XNEW (struct mem_ref);
!   ao_ref_init (&ref->mem, mem);
ref->id = id;
ref->hash = hash;
ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack);
*** mem_ref_alloc (tree mem, unsigned hash,
*** 1487,1492 
--- 1495,1502 
ref->dep_ref = BITMAP_ALLOC (&lim_bitmap_obstack);
ref->accesses_in_loop.create (0);
  
+   ref->aff_off.type = NULL_TREE;
+ 
return ref;
  }
  
*** gather_mem_refs_stmt (struct loop *loop,
*** 1586,1592 
if (dump_file && (dump_flags & TDF_DETAILS))
{
  fprintf (dump_file, "Memory reference %u: ", id);
! print_generic_expr (dump_file, ref->

Re: [PATCH][5/n] tree LIM TLC

2013-03-12 Thread Steven Bosscher
On Tue, Mar 12, 2013 at 4:16 PM, Richard Biener wrote:
> --- 127,145 
> /* The locations of the accesses.  Vector
>indexed by the loop number.  */
>
> !   /* The following sets are computed on demand.  We use two bits per
> !  information to represent the not-computed state.  */
> !
> !   /* The set of loops in that the memory reference is independent
> !  (2 * loop->num) or dependent (2 * loop->num + 1) in.
> !  If it is stored in the loop, this store is independent on all other
> !  loads and stores.
> !  If it is only loaded, then it is independent on all stores in the loop.  */
> !   bitmap loop_dependence;
> !
> !   /* The set of memory references on that this reference is independent
> !  (2 * mem->id) or dependent (2 * mem->id + 1).  */
> !   bitmap ref_dependence;
>   } *mem_ref_p;

Perhaps add simple inline functions to test those bits, to avoid:

> --- 2174,2182 
> ref2 = tem;
>   }
>
> !   if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id))
>   return true;
> !   if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id + 1))
>   return false;
>
> if (dump_file && (dump_flags & TDF_DETAILS))

?

That kind of explicit 2*x+[01] is bound to go wrong at some point.
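
A rough sketch of what such helpers could look like follows.  It is
standalone so that it compiles outside of GCC: struct dep_cache and its two
bit accessors are a hypothetical stand-in for GCC's bitmap, bitmap_bit_p and
bitmap_set_bit, and all names here are invented for the example.  The point
is only that the 2 * id and 2 * id + 1 arithmetic lives in exactly one place.

```c
/* Illustrative sketch only; these names are not from the patch.  */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEP_CACHE_MAX_IDS 256
#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Fixed-size stand-in for a GCC bitmap, two bits per id.  */
struct dep_cache
{
  unsigned long words[(2 * DEP_CACHE_MAX_IDS + WORD_BITS - 1) / WORD_BITS];
};

enum dep_state { DEP_UNKNOWN, DEP_INDEPENDENT, DEP_DEPENDENT };

static inline bool
cache_bit_p (const struct dep_cache *c, unsigned bit)
{
  return (c->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1UL;
}

static inline void
cache_set_bit (struct dep_cache *c, unsigned bit)
{
  c->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
}

/* Record the result for ID: bit 2 * ID means independent,
   bit 2 * ID + 1 means dependent.  */
static inline void
dep_cache_record (struct dep_cache *c, unsigned id, bool independent)
{
  assert (id < DEP_CACHE_MAX_IDS);
  cache_set_bit (c, 2 * id + (independent ? 0 : 1));
}

/* Return the cached state for ID, or DEP_UNKNOWN if neither bit is set.  */
static inline enum dep_state
dep_cache_query (const struct dep_cache *c, unsigned id)
{
  assert (id < DEP_CACHE_MAX_IDS);
  if (cache_bit_p (c, 2 * id))
    return DEP_INDEPENDENT;
  if (cache_bit_p (c, 2 * id + 1))
    return DEP_DEPENDENT;
  return DEP_UNKNOWN;
}
```

Callers such as refs_independent_p would then test the returned state instead
of spelling out the bit numbers themselves.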

Ciao!
Steven


[PATCH][5/n] tree LIM TLC

2013-03-12 Thread Richard Biener

This merges the bitmaps of the tri-state caches (not computed, dependent,
independent) to improve cache locality.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
Queued for 4.9.

Richard.

2013-03-12  Richard Biener  

* tree-ssa-loop-im.c (struct mem_ref): Merge indep_loop and
dep_loop into loop_dependence, merge indep_ref and dep_ref
into ref_dependence.
(mem_ref_alloc): Adjust.
(refs_independent_p): Likewise.
(record_indep_loop): Likewise.
(ref_indep_loop_p): Likewise.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c   2013-03-12 14:18:44.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c2013-03-12 14:27:37.054487784 +0100
*** typedef struct mem_ref
*** 127,147 
/* The locations of the accesses.  Vector
   indexed by the loop number.  */
  
!   /* The following sets are computed on demand.  We keep both set and
!  its complement, so that we know whether the information was
!  already computed or not.  */
!   bitmap indep_loop;  /* The set of loops in that the memory
!  reference is independent, meaning:
!  If it is stored in the loop, this store
!is independent on all other loads and
!stores.
!  If it is only loaded, then it is independent
!on all stores in the loop.  */
!   bitmap dep_loop;/* The complement of INDEP_LOOP.  */
! 
!   bitmap indep_ref;   /* The set of memory references on that
!  this reference is independent.  */
!   bitmap dep_ref; /* The complement of INDEP_REF.  */
  } *mem_ref_p;
  
  
--- 127,145 
/* The locations of the accesses.  Vector
   indexed by the loop number.  */
  
!   /* The following sets are computed on demand.  We use two bits per
!  information to represent the not-computed state.  */
! 
!   /* The set of loops in that the memory reference is independent
!  (2 * loop->num) or dependent (2 * loop->num + 1) in.
!  If it is stored in the loop, this store is independent on all other
!  loads and stores.
!  If it is only loaded, then it is independent on all stores in the loop.  */
!   bitmap loop_dependence; 
! 
!   /* The set of memory references on that this reference is independent
!  (2 * mem->id) or dependent (2 * mem->id + 1).  */
!   bitmap ref_dependence;
  } *mem_ref_p;
  
  
*** mem_ref_alloc (tree mem, unsigned hash,
*** 1481,1490 
ref->id = id;
ref->hash = hash;
ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack);
!   ref->indep_loop = BITMAP_ALLOC (&lim_bitmap_obstack);
!   ref->dep_loop = BITMAP_ALLOC (&lim_bitmap_obstack);
!   ref->indep_ref = BITMAP_ALLOC (&lim_bitmap_obstack);
!   ref->dep_ref = BITMAP_ALLOC (&lim_bitmap_obstack);
ref->accesses_in_loop.create (0);
  
return ref;
--- 1479,1486 
ref->id = id;
ref->hash = hash;
ref->stored = BITMAP_ALLOC (&lim_bitmap_obstack);
!   ref->loop_dependence = BITMAP_ALLOC (&lim_bitmap_obstack);
!   ref->ref_dependence = BITMAP_ALLOC (&lim_bitmap_obstack);
ref->accesses_in_loop.create (0);
  
return ref;
*** refs_independent_p (mem_ref_p ref1, mem_
*** 2178,2186 
ref2 = tem;
  }
  
!   if (bitmap_bit_p (ref1->indep_ref, ref2->id))
  return true;
!   if (bitmap_bit_p (ref1->dep_ref, ref2->id))
  return false;
  
if (dump_file && (dump_flags & TDF_DETAILS))
--- 2174,2182 
ref2 = tem;
  }
  
!   if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id))
  return true;
!   if (bitmap_bit_p (ref1->ref_dependence, 2 * ref2->id + 1))
  return false;
  
if (dump_file && (dump_flags & TDF_DETAILS))
*** refs_independent_p (mem_ref_p ref1, mem_
*** 2190,2203 
if (mem_refs_may_alias_p (ref1->mem, ref2->mem,
&memory_accesses.ttae_cache))
  {
!   bitmap_set_bit (ref1->dep_ref, ref2->id);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "dependent.\n");
return false;
  }
else
  {
!   bitmap_set_bit (ref1->indep_ref, ref2->id);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "independent.\n");
return true;
--- 2186,2199 
if (mem_refs_may_alias_p (ref1->mem, ref2->mem,
&memory_accesses.ttae_cache))
  {
!   bitmap_set_bit (ref1->ref_dependence, 2 * ref2->id + 1);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "dependent.\n");
return false;
  }
els

Re: [Patch,AVR]: Fix PR56263

2013-03-12 Thread Jakub Jelinek
On Mon, Mar 11, 2013 at 07:03:14PM +0100, Georg-Johann Lay wrote:
>   PR target/56263
>   * doc/invoke.texi (AVR Options): Document it.

This change broke building of info doc everywhere:

../../gcc/doc//invoke.texi:11652: @item found outside of an insertion block.
makeinfo: Removing output file `doc/gcc.info' due to errors; use --force to preserve.
make: *** [doc/gcc.info] Error 1

Fixed thusly, committed as obvious.

2013-03-12  Jakub Jelinek  

* doc/invoke.texi (-Waddr-space-convert): Move into the table earlier.

--- gcc/doc/invoke.texi (revision 196613)
+++ gcc/doc/invoke.texi (working copy)
@@ -11632,6 +11632,11 @@ sbiw r26, const   ; X -= const
 @item -mtiny-stack
 @opindex mtiny-stack
 Only change the lower 8@tie{}bits of the stack pointer.
+
+@item -Waddr-space-convert
+@opindex Waddr-space-convert
+Warn about conversions between address spaces in the case where the
+resulting address space is not contained in the incoming address space.
 @end table
 
 @subsubsection @code{EIND} and Devices with more than 128 Ki Bytes of Flash
@@ -11649,11 +11654,6 @@ when @code{EICALL} or @code{EIJMP} instr
 Indirect jumps and calls on these devices are handled as follows by
 the compiler and are subject to some limitations:
 
-@item -Waddr-space-convert
-@opindex Waddr-space-convert
-Warn about conversions between address spaces in the case where the
-resulting address space is not contained in the incoming address space.
-
 @itemize @bullet
 
 @item


Jakub


Re: [patch testsuite]: Fix gcc.target/i386 cases for mingw-targets

2013-03-12 Thread Kai Tietz
What is there to ping about here?  I got an OK from rth.

Kai


Do not output references from external vtables into LTO symbol table

2013-03-12 Thread Jan Hubicka
Hi,
this patch fixes GCC side of PR 56557 where we fail to link

#include <fstream>

int main()
{
  std::fstream x;
}

with -flto -rdynamic because we do not see the definition of
_ZTCSt13basic_fstreamIcSt11char_traitsIcEE0_Sd.  This symbol mistakenly appears
in the LTO symbol table because it is used from an external constructor that is
kept around only for optimization.  There is also a BFD linker bug that creates
stale dynamic table entries for this kind of symbol.
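
The rule the patch applies to external symbols can be modeled in isolation. The following is only a simplified sketch under stated assumptions, not the GCC implementation: `struct sym`, `struct ref`, the two enums, and `external_symbol_needed` are hypothetical stand-ins for the symtab and ipa_ref structures. It shows the intended decision: an external symbol stays in the symbol table only if some non-alias reference comes from a function or from a variable that is itself not external.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the symtab types.  */
enum ref_use { REF_ADDR, REF_ALIAS };
enum sym_kind { SYM_FUNCTION, SYM_VARIABLE };

struct sym
{
  enum sym_kind kind;
  bool external;            /* models DECL_EXTERNAL */
};

struct ref
{
  const struct sym *referring;  /* who refers to the symbol */
  enum ref_use use;
};

/* Return true if an external symbol with the N referring entries in REFS
   should still be emitted into the symbol table.  */
static bool
external_symbol_needed (const struct ref *refs, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Aliases do not count as real uses.  */
      if (refs[i].use == REF_ALIAS)
        continue;
      /* Any reference from a function is a real use.  */
      if (refs[i].referring->kind == SYM_FUNCTION)
        return true;
      /* So is a reference from a variable defined in this unit.  */
      if (!refs[i].referring->external)
        return true;
    }
  /* Only external variable initializers refer to it: skip it.  */
  return false;
}
```

In this model a construction vtable referenced only from the initializer of another external vtable yields false, which corresponds to the case that broke the link.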

Bootstrapped and regtested on x86_64-linux, tested on Mozilla and Qt LTO builds,
committed.

Honza

PR lto/56557
* lto-streamer-out.c (output_symbol_p): Skip references from
constructors of external variables.
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 196611)
+++ lto-streamer-out.c  (working copy)
@@ -1265,17 +1265,36 @@ bool
 output_symbol_p (symtab_node node)
 {
   struct cgraph_node *cnode;
-  struct ipa_ref *ref;
-
   if (!symtab_real_symbol_p (node))
 return false;
   /* We keep external functions in symtab for sake of inlining
  and devirtualization.  We do not want to see them in symbol table as
- references.  */
+ references unless they are really used.  */
   cnode = dyn_cast <cgraph_node> (node);
-  if (cnode && DECL_EXTERNAL (cnode->symbol.decl))
-return (cnode->callers
-   || ipa_ref_list_referring_iterate (&cnode->symbol.ref_list, 0, ref));
+  if (cnode && DECL_EXTERNAL (cnode->symbol.decl)
+  && cnode->callers)
+return true;
+
+ /* Ignore all references from external vars initializers - they are not really
+part of the compilation unit until they are used by folding.  Some symbols,
+like references to external construction vtables can not be referred to at all.
+We decide this at can_refer_decl_in_current_unit_p.  */
+ if (DECL_EXTERNAL (node->symbol.decl))
+{
+  int i;
+  struct ipa_ref *ref;
+  for (i = 0; ipa_ref_list_referring_iterate (&node->symbol.ref_list,
+ i, ref); i++)
+   {
+ if (ref->use == IPA_REF_ALIAS)
+   continue;
+  if (is_a <cgraph_node> (ref->referring))
+   return true;
+ if (!DECL_EXTERNAL (ref->referring->symbol.decl))
+   return true;
+   }
+  return false;
+}
   return true;
 }
 


[PATCH][4/n] tree LIM TLC

2013-03-12 Thread Richard Biener

This makes us only consider stores instead of all references when
looking for store motion opportunities.

Bootstrap pending on x86_64-unknown-linux-gnu, queued for 4.9.

To be committed with the gcc_checking_assert omitted.

Richard.

2013-03-12  Richard Biener  

* tree-ssa-loop-im.c (can_sm_ref_p): Do not test whether
ref is stored in the loop.
(find_refs_for_sm): Walk only over all stores.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c   2013-03-12 13:25:30.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c2013-03-12 13:28:49.468824315 +0100
*** can_sm_ref_p (struct loop *loop, mem_ref
*** 2284,2291 
  return false;
  
/* Unless the reference is stored in the loop, there is nothing to do.  */
!   if (!bitmap_bit_p (ref->stored, loop->num))
! return false;
  
/* It should be movable.  */
if (!is_gimple_reg_type (TREE_TYPE (ref->mem))
--- 2284,2290 
  return false;
  
/* Unless the reference is stored in the loop, there is nothing to do.  */
!   gcc_checking_assert (bitmap_bit_p (ref->stored, loop->num));
  
/* It should be movable.  */
if (!is_gimple_reg_type (TREE_TYPE (ref->mem))
*** can_sm_ref_p (struct loop *loop, mem_ref
*** 2322,2328 
  static void
  find_refs_for_sm (struct loop *loop, bitmap sm_executed, bitmap refs_to_sm)
  {
!   bitmap refs = memory_accesses.all_refs_in_loop[loop->num];
unsigned i;
bitmap_iterator bi;
mem_ref_p ref;
--- 2321,2327 
  static void
  find_refs_for_sm (struct loop *loop, bitmap sm_executed, bitmap refs_to_sm)
  {
!   bitmap refs = memory_accesses.all_refs_stored_in_loop[loop->num];
unsigned i;
bitmap_iterator bi;
mem_ref_p ref;


[PATCH][3/n] tree LIM TLC

2013-03-12 Thread Richard Biener

This exploits symmetry in memory reference disambiguation.  That
should reduce the number of queries and space needed in the
cache bitmaps.
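
The effect of canonicalizing the pair can be illustrated with a small standalone model. This is a sketch, not the GCC code: the fixed-size `unsigned` arrays stand in for the `indep_ref`/`dep_ref` bitmaps, and `may_alias`, `oracle_queries`, and `MAX_REFS` are hypothetical names invented for the example. Because the smaller id is always used as the cache row, a query for (a, b) and a later query for (b, a) share one cache entry and one oracle call.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REFS 8              /* hypothetical limit for this sketch */

static unsigned indep[MAX_REFS];    /* bit j of indep[i]: i, j independent */
static unsigned dep[MAX_REFS];      /* bit j of dep[i]: i, j dependent */
static unsigned oracle_queries;     /* counts expensive oracle calls */

/* Stand-in for the alias oracle: ids of equal parity "may alias".  */
static bool
may_alias (unsigned a, unsigned b)
{
  oracle_queries++;
  return (a % 2) == (b % 2);
}

static bool
refs_independent (unsigned id1, unsigned id2)
{
  assert (id1 < MAX_REFS && id2 < MAX_REFS);

  if (id1 == id2)
    return true;

  /* Dependence is symmetric, so order the pair and cache it once.  */
  if (id1 > id2)
    {
      unsigned tem = id1;
      id1 = id2;
      id2 = tem;
    }

  if (indep[id1] & (1u << id2))
    return true;
  if (dep[id1] & (1u << id2))
    return false;

  if (may_alias (id1, id2))
    {
      dep[id1] |= 1u << id2;
      return false;
    }
  indep[id1] |= 1u << id2;
  return true;
}
```

Calling `refs_independent (1, 2)` and then `refs_independent (2, 1)` consults the stand-in oracle only once; without the swap, either the second call would query again or both rows would have to be written, as the old code did.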

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2013-03-12  Richard Biener  

PR tree-optimization/39326
* tree-ssa-loop-im.c (refs_independent_p): Exploit symmetry.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c   2013-03-12 11:41:58.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c2013-03-12 11:44:11.507745552 +0100
*** ref_always_accessed_p (struct loop *loop
*** 2163,2177 
  static bool
  refs_independent_p (mem_ref_p ref1, mem_ref_p ref2)
  {
!   if (ref1 == ref2
!   || bitmap_bit_p (ref1->indep_ref, ref2->id))
  return true;
!   if (bitmap_bit_p (ref1->dep_ref, ref2->id))
! return false;
if (!MEM_ANALYZABLE (ref1)
|| !MEM_ANALYZABLE (ref2))
  return false;
  
if (dump_file && (dump_flags & TDF_DETAILS))
  fprintf (dump_file, "Querying dependency of refs %u and %u: ",
 ref1->id, ref2->id);
--- 2163,2188 
  static bool
  refs_independent_p (mem_ref_p ref1, mem_ref_p ref2)
  {
!   if (ref1 == ref2)
  return true;
! 
if (!MEM_ANALYZABLE (ref1)
|| !MEM_ANALYZABLE (ref2))
  return false;
  
+   /* Reference dependence in a loop is symmetric.  */
+   if (ref1->id > ref2->id)
+ {
+   mem_ref_p tem = ref1;
+   ref1 = ref2;
+   ref2 = tem;
+ }
+ 
+   if (bitmap_bit_p (ref1->indep_ref, ref2->id))
+ return true;
+   if (bitmap_bit_p (ref1->dep_ref, ref2->id))
+ return false;
+ 
if (dump_file && (dump_flags & TDF_DETAILS))
  fprintf (dump_file, "Querying dependency of refs %u and %u: ",
 ref1->id, ref2->id);
*** refs_independent_p (mem_ref_p ref1, mem_
*** 2180,2186 
&memory_accesses.ttae_cache))
  {
bitmap_set_bit (ref1->dep_ref, ref2->id);
-   bitmap_set_bit (ref2->dep_ref, ref1->id);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "dependent.\n");
return false;
--- 2191,2196 
*** refs_independent_p (mem_ref_p ref1, mem_
*** 2188,2194 
else
  {
bitmap_set_bit (ref1->indep_ref, ref2->id);
-   bitmap_set_bit (ref2->indep_ref, ref1->id);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "independent.\n");
return true;
--- 2198,2203 


[PATCH][2/n] tree LIM TLC

2013-03-12 Thread Richard Biener

This makes LIM work per outermost loop, reducing peak memory use.

Not necessarily 2/n, but I've completed and tested it
on x86_64-unknown-linux-gnu.  Queued for 4.9.

Richard.

2013-03-12  Richard Biener  

* tree-ssa-loop-im.c (determine_invariantness_stmt): Rename to ...
(determine_invariantness_bb): ... this.  Adjust for ...
(determine_invariantness): ... walk all blocks of the loop
we process.
(move_computations_stmt): Rename to ...
(move_computations_bb): ... this.  Adjust for ...
(move_computations): ... walk all blocks of the loop we process.
(analyze_memory_references): Likewise.
(store_motion): Process all sub-loops of the loop we process.
(fill_always_executed_in): Likewise.
(tree_ssa_lim_initialize): Move global bits to tree_ssa_lim.
(tree_ssa_lim_finalize): Likewise.
(tree_ssa_lim_1): Split out from ...
(tree_ssa_lim): ... this.  Perform global init and iterate over
all outermost loops.

Index: gcc/tree-ssa-loop-im.c
===
*** gcc/tree-ssa-loop-im.c.orig 2013-03-11 16:11:02.0 +0100
--- gcc/tree-ssa-loop-im.c  2013-03-12 10:09:58.923878391 +0100
*** rewrite_bittest (gimple_stmt_iterator *b
*** 1040,1047 
 Callback for walk_dominator_tree.  */
  
  static void
! determine_invariantness_stmt (struct dom_walk_data *dw_data ATTRIBUTE_UNUSED,
! basic_block bb)
  {
enum move_pos pos;
gimple_stmt_iterator bsi;
--- 1040,1046 
 Callback for walk_dominator_tree.  */
  
  static void
! determine_invariantness_bb (basic_block bb)
  {
enum move_pos pos;
gimple_stmt_iterator bsi;
*** determine_invariantness_stmt (struct dom
*** 1050,1058 
struct loop *outermost = ALWAYS_EXECUTED_IN (bb);
struct lim_aux_data *lim_data;
  
-   if (!loop_outer (bb->loop_father))
- return;
- 
if (dump_file && (dump_flags & TDF_DETAILS))
  fprintf (dump_file, "Basic block %d (loop %d -- depth %d):\n\n",
 bb->index, bb->loop_father->num, loop_depth (bb->loop_father));
--- 1049,1054 
*** determine_invariantness_stmt (struct dom
*** 1177,1211 
 each statement.  */
  
  static void
! determine_invariantness (void)
  {
!   struct dom_walk_data walk_data;
! 
!   memset (&walk_data, 0, sizeof (struct dom_walk_data));
!   walk_data.dom_direction = CDI_DOMINATORS;
!   walk_data.before_dom_children = determine_invariantness_stmt;
  
!   init_walk_dominator_tree (&walk_data);
!   walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR);
!   fini_walk_dominator_tree (&walk_data);
  }
  
  /* Hoist the statements in basic block BB out of the loops prescribed by
 data stored in LIM_DATA structures associated with each statement.  Callback
 for walk_dominator_tree.  */
  
! static void
! move_computations_stmt (struct dom_walk_data *dw_data,
!   basic_block bb)
  {
struct loop *level;
gimple_stmt_iterator bsi;
gimple stmt;
unsigned cost = 0;
struct lim_aux_data *lim_data;
! 
!   if (!loop_outer (bb->loop_father))
! return;
  
for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); )
  {
--- 1173,1202 
 each statement.  */
  
  static void
! determine_invariantness (struct loop *loop, basic_block *bbs)
  {
!   unsigned i;
  
!   for (i = 0; i < loop->num_nodes; ++i)
! {
!   basic_block bb = bbs[i];
!   determine_invariantness_bb (bb);
! }
  }
  
  /* Hoist the statements in basic block BB out of the loops prescribed by
 data stored in LIM_DATA structures associated with each statement.  Callback
 for walk_dominator_tree.  */
  
! static unsigned
! move_computations_bb (basic_block bb)
  {
struct loop *level;
gimple_stmt_iterator bsi;
gimple stmt;
unsigned cost = 0;
struct lim_aux_data *lim_data;
!   unsigned todo = 0;
  
for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); )
  {
*** move_computations_stmt (struct dom_walk_
*** 1260,1266 
   gimple_phi_result (stmt),
   t, arg0, arg1);
  SSA_NAME_DEF_STMT (gimple_phi_result (stmt)) = new_stmt;
! *((unsigned int *)(dw_data->global_data)) |= TODO_cleanup_cfg;
}
gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
remove_phi_node (&bsi, false);
--- 1251,1257 
   gimple_phi_result (stmt),
   t, arg0, arg1);
  SSA_NAME_DEF_STMT (gimple_phi_result (stmt)) = new_stmt;
! todo |= TODO_cleanup_cfg;
}
gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
remove_phi_node (&bsi, false);
*** move_computations_stmt (struct dom_walk_
*** 1323,1351 
gsi_

Re: [patch testsuite]: Fix gcc.target/i386 cases for mingw-targets

2013-03-12 Thread NightStrike
On Fri, Mar 8, 2013 at 8:21 AM, Rainer Orth wrote:
> Hi Kai,
>
>>>> Index: gcc.target/i386/pr20020-1.c
>>>> ===
>>>> --- gcc.target/i386/pr20020-1.c   (revision 196507)
>>>> +++ gcc.target/i386/pr20020-1.c   (working copy)
>>>> @@ -1,5 +1,6 @@
>>>>  /* Check that 128-bit struct's are represented as TImode values.  */
>>>>  /* { dg-do compile { target int128 } } */
>>>> +/* { dg-skip-if "" { x86_64-*-mingw* } { "*" } { "" } } */
>>>
>>> Please omit the default { "*" } { "" } here and in other tests below.
>>> And again: explain why the test is skipped.
>>
>> Hmm, why should I omit the default here?  I checked the tree, and most
>> dg-skip-if statements spell out the default too.
>
> just because support for omitting the default isn't that old.  There's
> certainly opportunity for cleanup, but we shouldn't spread this any
> further.
>
>> Well, x64 mingw is skipped here because it has a different ABI than
>> x86_64.  I will add that to the skip message.
>
> Thanks.  It's always far easier to have this in the testsuite to spare
> the next guy from doing software archaeology.
>
>> Ok to apply with those changes?
>
> Again, I prefer to defer to the target maintainers.
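
For reference, the form requested in the review above (default include/exclude option lists dropped, reason filled in) might look like the following. This is only an illustration: the message text "different ABI" is a placeholder, not the wording of the committed patch.

```c
/* Check that 128-bit struct's are represented as TImode values.  */
/* { dg-do compile { target int128 } } */
/* { dg-skip-if "different ABI" { x86_64-*-mingw* } } */
```

With the trailing `{ "*" } { "" }` omitted, dg-skip-if falls back to its defaults, so the test is still skipped for every option combination on the matching target.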

Ping


Re: [PATCH][4.8][4.7][4.6] Make -shared-libgcc the default on Cygwin.

2013-03-12 Thread Richard Biener
On Tue, Mar 12, 2013 at 2:44 AM, Dave Korn  wrote:
>
> Hello list,
>
>   The attached patch makes -shared-libgcc the default for Cygwin.  This is
> something that I should have done some time ago, as shared libgcc on Cygwin is
> more than mature.  What's more, it is vital for reliable compilation of
> applications that throw exceptions or share TLS variables from DLLs into the
> main executable; at present these compile incorrectly without an explicit
> -shared-libgcc.  For instance, the just-released MPFR-3.1.2 doesn't work
> without it.
>
>   Given that it's a very simple tweak to the compiler specs on a single
> platform only, I would like to use my target maintainer's discretion to apply
> it even at this late stage, but I figure it's so close to RC1 that I should
> ask the RM's permission anyway.
>
>   I'd also like to backport it to all the currently-open branches.
>
> gcc/ChangeLog
>
> 2013-03-12  Dave Korn  
>
> * config/i386/cygwin.h (SHARED_LIBGCC_SPEC): Make shared libgcc the
> default setting.
>
>   Is this OK by everyone?

Ok for trunk (4.8).  Please add a documentation entry to gcc-4.8/changes.html.

I'm not sure whether this kind of stuff should change on a release branch,
I'll defer to others for this.  Still, if you backport it, add a
gcc-4.x/changes.html
item to the sub-release sections.

Richard.

> cheers,
>   DaveK
>
>
>


Re: [PATCH][1/n] tree LIM TLC

2013-03-12 Thread Richard Biener
On Mon, 11 Mar 2013, Richard Biener wrote:

> 
> This is a series of patches applying some TLC to LIM.  This first
> patch gets rid of the remains of create_vop_ref_mapping and
> alongside cleans up how we record references.

Actually I rolled some more changes into this patch,
bootstrapped and tested on x86_64-unknown-linux-gnu, queued for 4.9.

Richard.

2013-03-11  Richard Biener  

* tree-ssa-loop-im.c (record_mem_ref_loc): Record ref as
stored here.
(gather_mem_refs_stmt): Instead of here and in
create_vop_ref_mapping_loop.
(gather_mem_refs_in_loops): Fold into ...
(analyze_memory_references): ... this.  Move data structure
init to tree_ssa_lim_initialize.  Propagate stored refs info
as well.
(create_vop_ref_mapping_loop): Remove.
(create_vop_ref_mapping): Likewise.
(tree_ssa_lim_initialize): Initialize ref bitmaps here.
Split always-executed computation into ...
(fill_always_executed_in): ... this.  Rename original to ...
(fill_always_executed_in_1): ... this.
(tree_ssa_lim): Call fill_always_executed_in here.

Index: trunk/gcc/tree-ssa-loop-im.c
===
*** trunk.orig/gcc/tree-ssa-loop-im.c   2013-03-11 12:38:43.0 +0100
--- trunk/gcc/tree-ssa-loop-im.c2013-03-11 14:12:00.143773587 +0100
*** mem_ref_locs_alloc (void)
*** 1518,1528 
 description REF.  The reference occurs in statement STMT.  */
  
  static void
! record_mem_ref_loc (mem_ref_p ref, struct loop *loop, gimple stmt, tree *loc)
  {
mem_ref_loc_p aref = XNEW (struct mem_ref_loc);
mem_ref_locs_p accs;
-   bitmap ril = memory_accesses.refs_in_loop[loop->num];
  
if (ref->accesses_in_loop.length ()
<= (unsigned) loop->num)
--- 1518,1528 
 description REF.  The reference occurs in statement STMT.  */
  
  static void
! record_mem_ref_loc (mem_ref_p ref, bool is_stored,
!   struct loop *loop, gimple stmt, tree *loc)
  {
mem_ref_loc_p aref = XNEW (struct mem_ref_loc);
mem_ref_locs_p accs;
  
if (ref->accesses_in_loop.length ()
<= (unsigned) loop->num)
*** record_mem_ref_loc (mem_ref_p ref, struc
*** 1536,1556 
  
aref->stmt = stmt;
aref->ref = loc;
- 
accs->locs.safe_push (aref);
-   bitmap_set_bit (ril, ref->id);
- }
- 
- /* Marks reference REF as stored in LOOP.  */
  
! static void
! mark_ref_stored (mem_ref_p ref, struct loop *loop)
! {
!   for (;
!loop != current_loops->tree_root
!&& !bitmap_bit_p (ref->stored, loop->num);
!loop = loop_outer (loop))
! bitmap_set_bit (ref->stored, loop->num);
  }
  
  /* Gathers memory references in statement STMT in LOOP, storing the
--- 1536,1552 
  
aref->stmt = stmt;
aref->ref = loc;
accs->locs.safe_push (aref);
  
!   bitmap_set_bit (memory_accesses.refs_in_loop[loop->num], ref->id);
!   if (is_stored)
! {
!   bitmap_set_bit (memory_accesses.all_refs_stored_in_loop[loop->num],
! ref->id);
!   while (loop != current_loops->tree_root
!&& bitmap_set_bit (ref->stored, loop->num))
!   loop = loop_outer (loop);
! }
  }
  
  /* Gathers memory references in statement STMT in LOOP, storing the
*** gather_mem_refs_stmt (struct loop *loop,
*** 1582,1590 
  fprintf (dump_file, "Unanalyzed memory reference %u: ", id);
  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
}
!   if (gimple_vdef (stmt))
!   mark_ref_stored (ref, loop);
!   record_mem_ref_loc (ref, loop, stmt, mem);
return;
  }
  
--- 1578,1584 
  fprintf (dump_file, "Unanalyzed memory reference %u: ", id);
  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
}
!   record_mem_ref_loc (ref, gimple_vdef (stmt), loop, stmt, mem);
return;
  }
  
*** gather_mem_refs_stmt (struct loop *loop,
*** 1611,1627 
}
  }
  
!   if (is_stored)
! mark_ref_stored (ref, loop);
! 
!   record_mem_ref_loc (ref, loop, stmt, mem);
return;
  }
  
  /* Gathers memory references in loops.  */
  
  static void
! gather_mem_refs_in_loops (void)
  {
gimple_stmt_iterator bsi;
basic_block bb;
--- 1605,1618 
}
  }
  
!   record_mem_ref_loc (ref, is_stored, loop, stmt, mem);
return;
  }
  
  /* Gathers memory references in loops.  */
  
  static void
! analyze_memory_references (void)
  {
gimple_stmt_iterator bsi;
basic_block bb;
*** gather_mem_refs_in_loops (void)
*** 1647,1731 
alrefs = memory_accesses.all_refs_in_loop[loop->num];
bitmap_ior_into (alrefs, lrefs);
  
!   if (loop_outer (loop) == current_loops->tree_root)
continue;
  
!   alrefso = memory_accesses.all_refs_in_loop[loop_outer (loop)->num];
bitmap_ior_into (alrefso, alrefs);
  }
  }
  
- /* 

Re: extend fwprop optimization

2013-03-12 Thread Wei Mi
Thanks for the helpful comments! I have some replies inlined.

Regards,
Wei.

On Mon, Mar 11, 2013 at 12:52 PM, Steven Bosscher  wrote:
> On Mon, Mar 11, 2013 at 6:52 AM, Wei Mi wrote:
>> This is the fwprop extension patch, now put in order. Regression
>> tests and bootstrap pass. Please help review whether the approach is
>> sound. The following is a brief description of what I have done in
>> the patch.
>> In order to make fwprop more effective in rtl optimization, we extend
>> it to handle general expressions instead of the three cases listed in
>> the head comment in fwprop.c. The major changes include a) We need to
>> check propagation correctness for src exprs of def which contain mem
>> references. Previous fwprop for the three cases above doesn't have the
>> problem. b) We need a better cost model because the benefit is usually
>> not so apparent as the three cases above.
>>
>> For a general fwprop problem, there are two possible sources of
>> benefit. The first is that the new use insn after propagation
>> and simplification may have a lower cost than it had before propagation,
>> or propagation may create a new insn that could be split or
>> peephole optimized later and get a lower cost. The second is that if
>> all the uses are replaced with the src of the def insn, the def insn
>> could be deleted.
>>
>> So instead of checking each def-use pair independently, we use the DU
>> chain to track all the uses for a def. For each def-use pair, we attempt the
>> propagation, record the change candidate in changes[] array, but we
>> wait to confirm the changes until all the pairs with the same def are
>> iterated. The changes confirmation is done in the func
>> confirm_change_group_by_cost. We only do this for fwprop. For
>> fwprop_addr, the benefit of each change is ensured by
>> propagation_rtx_1 using should_replace_address, so we just confirm all
>> the changes without checking benefit again.
>
> Hello Wei Mi,
>
> So IIUC, in essence you are doing:
>
> main:
>   FOR_EACH_BB:
> FOR_BB_INSNS, non-debug insns only:
>   for each df_ref DEF operand on insn:
> iterate_def_uses
>
> iterate_def_uses:
>   for each UD chain from DEF to USE(i):
> forward_propagate_into
>   confirm changes by total benefit
>
> I still like the idea, but there are also still a few "design issues"
> to resolve.
>
> Some of the same comments as before apply: Do you really, really,
> reallyreally have to go so low-level as to insn splitting, peephole
> optimizations, and even register allocation, to get the cost right?
> That will almost certainly not be acceptable, and I for one would
> oppose such a change. It's IMHO a violation of proper engineering when
> your medium-to-high level code transformations have to do that. If you
> have strong reasons for your approach, it'd be helpful if you can
> explain them so that we can together look for a less intrusive
> solution (e.g. splitting earlier, adjusting the cost model, etc.).
>
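
The group-wise confirmation discussed above can be sketched in isolation. This is a hedged model only, not the code from the patch: `struct candidate`, its fields, and `confirm_group_by_cost` are hypothetical names, and the cost numbers are abstract units. It combines the two benefit sources named in the description: the per-use cost delta, plus the cost of the def insn when every use was replaced and the def can be deleted.

```c
#include <stdbool.h>
#include <stddef.h>

/* One tentative propagation of a def's source into a use (hypothetical).  */
struct candidate
{
  int cost_before;   /* cost of the use insn before propagation */
  int cost_after;    /* cost after propagation and simplification */
  bool replaced;     /* false if the propagation could not be done */
};

/* Decide whether to commit all N tentative changes for one def.
   DEF_COST is saved only if every use was replaced, since only then
   can the def insn be deleted.  */
static bool
confirm_group_by_cost (const struct candidate *c, size_t n, int def_cost)
{
  int benefit = 0;
  bool all_replaced = true;

  for (size_t i = 0; i < n; i++)
    {
      if (!c[i].replaced)
        {
          all_replaced = false;
          continue;
        }
      benefit += c[i].cost_before - c[i].cost_after;
    }

  if (all_replaced)
    benefit += def_cost;

  return benefit > 0;
}
```

The point of deciding per group rather than per pair is visible here: two uses that each get slightly more expensive can still be a net win if the def insn disappears, while the same changes are a loss as soon as one use cannot be replaced.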

For the motivational case, I need insn splitting to get the cost
right. Insn splitting is not very intrusive: all I need is to call the
split_insns func. I think split_insns is just a pattern-matching func
like recog(), which is called in many places. Peephole is not
necessary (I added it in order to find as many opportunities as possible,
but from my trace analysis, it doesn't help much). Calling
peephole2_insn() is indeed intrusive: because peephole assumes reg
allocation is completed, I have to insert the ugly workaround below.
Peephole also requires setting the DF_LR_RUN_DCE flag and some
initialization of the peep2_insn_data array.

So how about keeping split_insns and removing peephole from the cost
estimation func?

> So things like:
>> +  /* we may call peephole2_insns in fwprop phase to estimate how
>> + peephole will affect the cost of the insn transformed by fwprop.
>> + fwprop is done before ira phase. In that case, we simply return
>> + a new pseudo register.  */
>> +  if (!strncmp (current_pass->name, "fwprop", 6))
>> +return gen_reg_rtx (mode);
>
> and
>
>> Index: config/i386/i386.c
>> ===
>> --- config/i386/i386.c(revision 196270)
>> +++ config/i386/i386.c(working copy)
>> @@ -15901,8 +15901,14 @@ ix86_expand_clear (rtx dest)
>>  {
>>rtx tmp;
>>
>> -  /* We play register width games, which are only valid after reload.  */
>> -  gcc_assert (reload_completed);
>> +  /* We play register width games, which are only valid after reload.
>> + An exception: fwprop call peephole to estimate the change benefit,
>> + and peephole will call this func. That is before reload complete.
>> + It will not bring any problem because the peephole2_insns call is
>> + only used for cost estimation in fwprop, and its change will be
>> + abandoned immediately after the cost estimation.  */
>> +  if (strncmp (current_pass->name, "fwprop", 6))
>> +gcc_assert (reload_completed);
>
> are IMHO not OK.

Re: [Patch,AVR]: Fix PR56263

2013-03-12 Thread Denis Chertykov
2013/3/11 Georg-Johann Lay :
> This patch implements a new warning option, -Waddr-space-convert, that warns
> about conversions to a non-containing address space.
>
> Address spaces are implemented in such a way that each address space is
> contained in each other space so that casting is possible, e.g. in code like
>
> char read_c (bool in_flash, const char *str)
> {
> if (in_flash)
>   return * (const __flash char*) str;
> else
>   return *str;
> }
>
> However, there is no warning about implicit or explicit address space
> conversions, which makes it hard to port progmem code to address space code.
>
> If an address space qualifier is missing, e.g. when calling a function that is
> not qualified correctly, this would result in wrong code (in contrast to
> pgm_read, which works no matter how the address is qualified).
>
> There is still some work to do to get more precise warnings and this is just a
> first take to implement PR56263, see the FIXME in the source.
>
> The details can be worked out later, e.g. for 4.8.1.
>
>
> Ok for trunk?
>
>
> PR target/56263
> * config/avr/avr.c (TARGET_CONVERT_TO_TYPE): Define to...
> (avr_convert_to_type): ...this new static function.
> * config/avr/avr.opt (-Waddr-space-convert): New C option.
> * doc/invoke.texi (AVR Options): Document it.

Approved.

Denis.