RFA: Fix rtl-optimization/57425

2013-06-15 Thread Joern Rennecke

Bootstrapped/regtested on i686-pc-linux-gnu.
2013-06-15  Joern Rennecke 

gcc:
PR rtl-optimization/57425
* alias.c (write_dependence_p): Add new parameters mem_size and
canon_mem_addr.  Changed all callers.
(canon_anti_dependence): New function.
* cse.c (check_dependence): Use canon_anti_dependence.
* cselib.c (cselib_invalidate_mem): Likewise.
* rtl.h (canon_anti_dependence): Declare.
gcc/testsuite:
PR rtl-optimization/57425
* gcc.dg/torture/pr57425-1.c, gcc.dg/torture/pr57425-2.c: New files.
* gcc.dg/torture/pr57425-3.c: Likewise.

Index: alias.c
===
--- alias.c (revision 200126)
+++ alias.c (working copy)
@@ -156,7 +156,7 @@ static int insert_subset_children (splay
 static alias_set_entry get_alias_set_entry (alias_set_type);
 static bool nonoverlapping_component_refs_p (const_rtx, const_rtx);
 static tree decl_for_component_ref (tree);
-static int write_dependence_p (const_rtx, const_rtx, int);
+static int write_dependence_p (const_rtx, unsigned, rtx, const_rtx, int);
 
 static void memory_modified_1 (rtx, const_rtx, void *);
 
@@ -2553,10 +2553,12 @@ canon_true_dependence (const_rtx mem, en
 }
 
 /* Returns nonzero if a write to X might alias a previous read from
-   (or, if WRITEP is nonzero, a write to) MEM.  */
+   (or, if WRITEP is nonzero, a write to) MEM.
+   If CANON_MEM_ADDR is nonzero, it is the canonicalized address of MEM.  */
 
 static int
-write_dependence_p (const_rtx mem, const_rtx x, int writep)
+write_dependence_p (const_rtx mem, unsigned mem_size, rtx canon_mem_addr,
+   const_rtx x, int writep)
 {
   rtx x_addr, mem_addr;
   rtx base;
@@ -2612,9 +2614,14 @@ write_dependence_p (const_rtx mem, const
 return 0;
 
   x_addr = canon_rtx (x_addr);
-  mem_addr = canon_rtx (mem_addr);
+  if (canon_mem_addr)
+mem_addr = canon_mem_addr;
+  else
+mem_addr = canon_rtx (mem_addr);
+  if (!mem_size)
+mem_size = SIZE_FOR_MODE (mem);
 
-  if ((ret = memrefs_conflict_p (SIZE_FOR_MODE (mem), mem_addr,
+  if ((ret = memrefs_conflict_p (mem_size, mem_addr,
 SIZE_FOR_MODE (x), x_addr, 0)) != -1)
 return ret;
 
@@ -2629,7 +2636,19 @@ write_dependence_p (const_rtx mem, const
 int
 anti_dependence (const_rtx mem, const_rtx x)
 {
-  return write_dependence_p (mem, x, /*writep=*/0);
+  return write_dependence_p (mem, 0, NULL_RTX, x, /*writep=*/0);
+}
+
+/* Likewise, but we already have a canonicalized MEM_ADDR for MEM.
+   Also, consider MEM in MEM_MODE (which might be from an enclosing
+   STRICT_LOW_PART / ZERO_EXTRACT).  */
+
+int
+canon_anti_dependence (const_rtx mem, enum machine_mode mem_mode,
+  rtx mem_addr, const_rtx x)
+{
+  return write_dependence_p (mem, GET_MODE_SIZE (mem_mode), mem_addr,
+x, /*writep=*/0);
 }
 
 /* Output dependence: X is written after store in MEM takes place.  */
@@ -2637,7 +2656,7 @@ anti_dependence (const_rtx mem, const_rt
 int
 output_dependence (const_rtx mem, const_rtx x)
 {
-  return write_dependence_p (mem, x, /*writep=*/1);
+  return write_dependence_p (mem, 0, NULL_RTX, x, /*writep=*/1);
 }
 
 
Index: cse.c
===
--- cse.c   (revision 200126)
+++ cse.c   (working copy)
@@ -1824,7 +1824,7 @@ flush_hash_table (void)
   }
 }
 
-/* Function called for each rtx to check whether true dependence exist.  */
+/* Function called for each rtx to check whether an anti dependence exist.  */
 struct check_dependence_data
 {
   enum machine_mode mode;
@@ -1837,7 +1837,7 @@ check_dependence (rtx *x, void *data)
 {
   struct check_dependence_data *d = (struct check_dependence_data *) data;
   if (*x && MEM_P (*x))
-return canon_true_dependence (d->exp, d->mode, d->addr, *x, NULL_RTX);
+return canon_anti_dependence (d->exp, d->mode, d->addr, *x);
   else
 return 0;
 }
Index: cselib.c
===
--- cselib.c(revision 200126)
+++ cselib.c(working copy)
@@ -2263,8 +2263,8 @@ cselib_invalidate_mem (rtx mem_rtx)
  continue;
}
  if (num_mems < PARAM_VALUE (PARAM_MAX_CSELIB_MEMORY_LOCATIONS)
- && ! canon_true_dependence (mem_rtx, GET_MODE (mem_rtx),
- mem_addr, x, NULL_RTX))
+ && ! canon_anti_dependence (mem_rtx, GET_MODE (mem_rtx),
+ mem_addr, x))
{
  has_mem = true;
  num_mems++;
Index: rtl.h
===
--- rtl.h   (revision 200126)
+++ rtl.h   (working copy)
@@ -2705,6 +2705,8 @@ extern int canon_true_dependence (const_
  const_rtx, rtx);
 extern int read_dependence (const_rtx, const_rtx);
 extern int anti_depe

[PATCH] MIPS r5900, --with-llsc=?

2013-06-15 Thread Jürgen Urban
Hello Richard,

> >> > How much other changes will be currently accepted here? There is other
> >> > stuff which I want to prepare and submit here, e.g.:

> >> > 3. fix use of ll/sc in libgomp, either increase mips ISA level or use
> >> > syscall (which is broken in Linux 2.6.35.4).

The attached patch fixes problem 3. libgomp was not the cause of the problem. 
When linux is detected in gcc/config.gcc, the variable "with_llsc" is set to 
"yes". This happens before the CPU is checked. I fixed this by storing the 
original parameter. I think this is better than moving the code up.

The patch for gcc/config/mips/mips.h fixes that ".set mips2" wasn't used when 
with_llsc=yes was configured.

The patch for gcc/config/mips/mips.c gets lld and scd working. These 
instructions are normally not emulated by the Linux kernel and the syscall only 
supports 32 bit. So I changed my kernel to support lld and scd. On the long 
term I plan to use registers k0 or k1 as address registers for the instructions 
lb, lw, ld, lq, sb, sw, sd and sq which was suggested for the PS2 in an old 
paper, because the registers are changed by the kernel on interrupts. There 
were measurements which showed much performance improvement on the PS2 when no 
illegal instructions are used.

Best regards
JürgenIndex: gcc/config.gcc
===
--- gcc/config.gcc	(Revision 199708)
+++ gcc/config.gcc	(Arbeitskopie)
@@ -297,6 +297,9 @@
 	;;
 esac
 
+# Save parameter --with-llsc for later check.
+param_llsc="$with_llsc"
+
 # Set default cpu_type, tm_file, tm_p_file and xm_file so it can be
 # updated in each machine entry.  Also set default extra_headers for some
 # machines.
@@ -2985,7 +2988,7 @@
 mips64r5900-*-* | mips64r5900el-*-* | mipsr5900-*-* | mipsr5900el-*-*)
   with_arch=r5900
   with_tune=r5900
-  if test x$with_llsc = x; then
+  if test x$param_llsc = x; then
 	# r5900 doesn't support ll, sc, lld and scd instructions:
 	with_llsc=no
   fi
Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c	(Revision 199708)
+++ gcc/config/mips/mips.c	(Arbeitskopie)
@@ -12463,7 +12463,11 @@
   if (!ISA_HAS_LL_SC)
 {
   output_asm_insn (".set\tpush", 0);
-  output_asm_insn (".set\tmips2", 0);
+  if (TARGET_64BIT) {
+output_asm_insn (".set\tmips3", 0);
+  } else {
+output_asm_insn (".set\tmips2", 0);
+  }
 }
 }
 
Index: gcc/config/mips/mips.h
===
--- gcc/config/mips/mips.h	(Revision 199708)
+++ gcc/config/mips/mips.h	(Arbeitskopie)
@@ -1063,7 +1081,7 @@
 /* ISA includes ll and sc.  Note that this implies ISA_HAS_SYNC
because the expanders use both ISA_HAS_SYNC and ISA_HAS_LL_SC
instructions.  */
-#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16)
+#define ISA_HAS_LL_SC (mips_isa >= 2 && !TARGET_MIPS16 && !TARGET_MIPS5900)
 #define GENERATE_LL_SC			\
   (target_flags_explicit & MASK_LLSC	\
? TARGET_LLSC && !TARGET_MIPS16	\


Re: [PATCH] Proof of concept: multiple gc heaps

2013-06-15 Thread Mike Stump
On Jun 14, 2013, at 8:21 PM, David Malcolm  wrote:
> I'm hoping that gcc 4.9 can support multiple "parallel universes" of gcc
> state within one process

> One issue with the above is the garbage collector.

> I think there are two possible ways in which "universe instances" could
> interact with the GC:
> 
> (a) have the universe instances be GC-managed: all parallel universes
> share the same heap, requiring a rewrite of the GC code to be
> thread-safe,

Yeah, I think going this route would be bad.  The extra locking won't win in 
the end.

> I don't think (a) is feasible.


I agree.

> Hence (a) would require all threads to synchronize on GC-safe locations.

Yup.

> It seems much simpler to me to go with (b): multiple independent
> GC-heaps.

Yup.

> I'm attaching a patch which converts all state within ggc into a gc_heap
> class, so that you can have multiple instances of ggc heaps.  For now
> there's a singleton instance "the_heap", which is used implicitly in a
> few places.

I like the design.

> I don't yet know to what extent this affects performance of the garbage
> collector (extra register pressure from passing around "this",
> perhaps?).

Would be nice to do a release style build (gcc trunk will default to lots of 
extra checking, since it isn't a release) and gather performance numbers for it.

> Thoughts?

I looked through the patch, and it looks reasonable to me (informal code 
review).  I think for formal review, the reviewer should see and consider the 
performance impact.  You can do something just a hair wrong, and kill 
performance.  So, a C++ generate a pch file, say of 100,000 lines or typical 
C++ code, and see the time of an -O2 and a -O0 build.  -O0 influence the 
edit->debug cycle time.  If performance is more than 30% off, it would seem you 
should be able to regain that by fixing your code.  I'd hope than pch 
generation time is slower than less than 3%, ideally, around 0.4%.

Re: [PATCH] Basic support for MIPS r5900

2013-06-15 Thread Richard Sandiford
"Jürgen Urban"  writes:
>> Richard Sandiford  writes:
>>
>> > I can't approve the Makefile.in bits.  I've cc'ed Ian, who's the libgcc
>> > maintainer.  Ian: the problem is that "_muldi3.o" on 64-bit targets
>> > is actually an implementation of __multi3.  Jürgen wants to have a
>> > __muldi3 too, with the same implementation as on 32-bit targets.
>>
>> My assumption is that target maintainers can approve target-specific
>> changes to libgcc, including Makefile changes.
>>
>> That said, it seems that this particular patch ought to mostly be in
>> libgcc/config/mips/t-mips, using LIB2FUNCS_EXCLUDE and LIB2ADD.  It's
>> not clear to me that libgcc/Makefile.in needs any changes, and in case
>> it should not be necessary for it to have anything like MIPSTYPE.  That
>> kind of thing belongs in config/mips.
>
> The code is now completely moved into libgcc/config/mips/t-mips and
> libgcc/config/mips/lib2funcs.c (new file).
> The code should now be easier to understand.
> I used the code from libgcc/config/m32c as example (e.g. same file name
> lib2funcs.c). I copied the file header (LGPL) from a file which I
> believed to be new and correct.

Thanks, this looks very clean.  It's good that the new file compiles
to nothing for ISA_HAS_DMULT/ISA_HAS_DDIV targets.  In that case though,
I think it should be added to LIB2ADD_ST rather than LIB2ADD.
Objects from LIB2ADD are included in libgcc.so, which needs to have
a stable interface, whereas LIB2ADD_ST is purely for libgcc.a,
where this kind of variation is OK.

Putting them in one file means they'll either all be pulled in or none will,
but I doubt the size is going to matter in practice, right?  Besides,
that kind of thing could easily be tweaked later if it shows up as a problem.

Also, I see you changed the patch so that mul3 tests ISA_HAS_MULT
in the C body rather than in the condition.  Was that in response to my
previous comment about define_expand conditions?  Your first version was
right because mul3 is a public pattern that's exposed to optabs
(insn-opinit.c tests HAVE_mul3).  The other two define_expands
you mentioned are private to the MIPS port though and we never use
HAVE_* for them.  We only use them from places where the insns are
already known to be valid.

The ISA_HAS_DMUL3 part was redundant, sorry for not noticing it last time.

Does it still work with those changes, as below?  If so, I'll check it in.

Thanks,
Richard


gcc/
2013-06-15  Jürgen Urban  

* config/mips/mips.h (ISA_HAS_MUL3): Include TARGET_R5900.
(ISA_HAS_MULT, ISA_HAS_DMULT, ISA_HAS_DIV, ISA_HAS_DDIV): New macros.
* config/mips/mips.md (mul3, mul3_internal)
(mul3_r4000): Require ISA_HAS_MULT.
(mul3_mul3): Handle TARGET_R5900.
(mulsidi3_64bit_dmul): Remove redundant TARGET_64BIT test.
(muldi3_highpart, muldi3_highpart_internal, mulditi3)
(mulditi3_internal, mulditi3_r4000): Require ISA_HAS_DMULT
instead of TARGET_64BIT.
(divmod4, udivmod4, divmod4_hilo_):
Require ISA_HAS_DIV.

libgcc/
2013-06-15  Jürgen Urban  

* config/mips/lib2funcs.c: New file.
* config/mips/t-mips (LIB2ADD_ST): Add it.

Index: gcc/config/mips/mips.h
===
--- gcc/config/mips/mips.h  2013-06-15 14:55:16.985850737 +0100
+++ gcc/config/mips/mips.h  2013-06-15 19:32:51.637536044 +0100
@@ -807,6 +807,7 @@ #define ISA_HAS_BRANCHLIKELY(!ISA_MIPS1
 #define ISA_HAS_MUL3   ((TARGET_MIPS3900   \
  || TARGET_MIPS5400\
  || TARGET_MIPS5500\
+ || TARGET_MIPS5900\
  || TARGET_MIPS7000\
  || TARGET_MIPS9000\
  || TARGET_MAD \
@@ -821,6 +822,22 @@ #define ISA_HAS_DMUL3  (TARGET_64BIT
 && TARGET_OCTEON   \
 && !TARGET_MIPS16)
 
+/* ISA supports instructions DMULT and DMULTU. */
+#define ISA_HAS_DMULT  (TARGET_64BIT && !TARGET_MIPS5900)
+
+/* ISA supports instructions MULT and MULTU.
+   This is always true, but the macro is needed for ISA_HAS_MULT
+   in mips.md.  */
+#define ISA_HAS_MULT   (1)
+
+/* ISA supports instructions DDIV and DDIVU. */
+#define ISA_HAS_DDIV   (TARGET_64BIT && !TARGET_MIPS5900)
+
+/* ISA supports instructions DIV and DIVU.
+   This is always true, but the macro is needed for ISA_HAS_DIV
+   in mips.md.  */
+#define ISA_HAS_DIV(1)
+
 #define ISA_HAS_DIV3   ((TARGET_LOONGSON_2EF   \
  || TARGET_LOONGSON_3A)\
 && !TARGET_MIPS16)
Index: gcc/config/mips/mips.md
==

Re: [PATCH, libcpp] Do not decrease highest_location if the included file has be included twice.

2013-06-15 Thread Dehao Chen
ping ^2

On Tue, Jun 4, 2013 at 10:02 AM, Dehao Chen  wrote:
> Hi, Dodji,
>
> Thanks for helping update the patch. The new patch passed all
> regression test and can fix the problem in my huge source file. I
> added ChangeLog entry to the patch. Could any libcpp maintainers help
> check if it is ok for trunk?
>
> Thanks,
> Dehao
>
> libcpp/ChangeLog:
>
> 2013-06-04  Dehao Chen  
>
>  * files.c (_cpp_stack_include): Fix the highest_location when header
>  file is guarded by #ifndef and is included twice.
>
> Index: libcpp/files.c
> ===
> --- libcpp/files.c (revision 199570)
> +++ libcpp/files.c (working copy)
> @@ -983,6 +983,7 @@ _cpp_stack_include (cpp_reader *pfile, const char
>  {
>struct cpp_dir *dir;
>_cpp_file *file;
> +  bool stacked;
>
>dir = search_path_head (pfile, fname, angle_brackets, type);
>if (!dir)
> @@ -993,19 +994,26 @@ _cpp_stack_include (cpp_reader *pfile, const char
>if (type == IT_DEFAULT && file == NULL)
>  return false;
>
> -  /* Compensate for the increment in linemap_add that occurs in
> - _cpp_stack_file.  In the case of a normal #include, we're
> - currently at the start of the line *following* the #include.  A
> - separate source_location for this location makes no sense (until
> - we do the LC_LEAVE), and complicates LAST_SOURCE_LINE_LOCATION.
> - This does not apply if we found a PCH file (in which case
> - linemap_add is not called) or we were included from the
> - command-line.  */
> +  /* Compensate for the increment in linemap_add that occurs if
> +  _cpp_stack_file actually stacks the file.  In the case of a
> + normal #include, we're currently at the start of the line
> + *following* the #include.  A separate source_location for this
> + location makes no sense (until we do the LC_LEAVE), and
> + complicates LAST_SOURCE_LINE_LOCATION.  This does not apply if we
> + found a PCH file (in which case linemap_add is not called) or we
> + were included from the command-line.  */
>if (file->pchname == NULL && file->err_no == 0
>&& type != IT_CMDLINE && type != IT_DEFAULT)
>  pfile->line_table->highest_location--;
>
> -  return _cpp_stack_file (pfile, file, type == IT_IMPORT);
> +  stacked = _cpp_stack_file (pfile, file, type == IT_IMPORT);
> +
> +  if (!stacked)
> +/* _cpp_stack_file didn't stack the file, so let's rollback the
> +   compensation dance we performed above.  */
> +pfile->line_table->highest_location++;
> +
> +  return stacked;
>  }
>
>  /* Could not open FILE.  The complication is dependency output.  */


Re: [Patch, fortran] PR 49074 ICE on defined assignment with class arrays.

2013-06-15 Thread Tobias Burnus

Mikael Morin wrote:

Dominique noticed that the patch also fixed PR56136 whose test is very
close to the one of PR49074.
This made me notice that while the PR56136 test should use a temporary
(and does), the PR49074 one shouldn't.  That is fixed with the attached
patch.
Then the ICE which was fixed by the previous patch isn't
reachable any more in the PR49074 test-case, so I also add a test
based on the PR56136 one.

Regression tested on x86_64-unknown-linux-gnu.  OK for trunk?


OK. Thanks to the patch - and for voiding the temporary.

Tobias

PS: Pending patch:Print exception status at STOP, 
http://gcc.gnu.org/ml/fortran/2013-06/msg00077.html


Re: [Patch, fortran] PR 49074 ICE on defined assignment with class arrays.

2013-06-15 Thread Mikael Morin
Hello,

Dominique noticed that the patch also fixed PR56136 whose test is very
close to the one of PR49074.
This made me notice that while the PR56136 test should use a temporary
(and does), the PR49074 one shouldn't.  That is fixed with the attached
patch.
Then the ICE which was fixed by the previous patch isn't
reachable any more in the PR49074 test-case, so I also add a test
based on the PR56136 one.

Regression tested on x86_64-unknown-linux-gnu.  OK for trunk?

Mikael





2013-06-15  Mikael Morin  

PR fortran/49074
PR fortran/56136
* dependency.c (gfc_check_argument_var_dependency): Return 0 in the
array constructor case.

Index: dependency.c
===
--- dependency.c(révision 200067)
+++ dependency.c(copie de travail)
@@ -990,7 +990,9 @@ gfc_check_argument_var_dependency (gfc_expr *var,
   return 0;
 
 case EXPR_ARRAY:
-  return gfc_check_dependency (var, expr, 1);
+  /* the scalarizer always generates a temporary for array constructors,
+so there is no dependency.  */
+  return 0;
 
 case EXPR_FUNCTION:
   if (intent != INTENT_IN)
2013-06-15  Mikael Morin  

PR fortran/49074
PR fortran/56136
* gfortran.dg/typebound_assignment_5.f03: Check the absence of any 
packing.
* gfortran.dg/typebound_assignment_6.f03: New.


Index: typebound_assignment_5.f03
===
--- typebound_assignment_5.f03  (révision 200070)
+++ typebound_assignment_5.f03  (copie de travail)
@@ -1,4 +1,5 @@
 ! { dg-do run }
+! { dg-options "-fdump-tree-original" }
 !
 ! PR fortran/49074
 ! ICE on defined assignment with class arrays.
@@ -38,3 +39,6 @@
 if (any(foobar%i /= [1, 2])) call abort
   end program
 
+! { dg-final { scan-tree-dump-not "_gfortran_internal_pack" "original" } }
+! { dg-final { scan-tree-dump-not "_gfortran_internal_unpack" "original" } }
+! { dg-final { cleanup-tree-dump "original"} }
! { dg-do run }
! { dg-options "-fdump-tree-original" }
!
! PR fortran/56136
! ICE on defined assignment with class arrays.
!
! Original testcase by Alipasha 

  MODULE A_TEST_M
TYPE :: A_TYPE
  INTEGER :: I
  CONTAINS
  GENERIC :: ASSIGNMENT (=) => ASGN_A
  PROCEDURE, PRIVATE :: ASGN_A
END TYPE

CONTAINS

ELEMENTAL SUBROUTINE ASGN_A (A, B)
  CLASS (A_TYPE), INTENT (INOUT) :: A
  CLASS (A_TYPE), INTENT (IN) :: B
  A%I = B%I
END SUBROUTINE
  END MODULE A_TEST_M
  
  PROGRAM ASGN_REALLOC_TEST
USE A_TEST_M
TYPE (A_TYPE), ALLOCATABLE :: A(:)
INTEGER :: I, J

ALLOCATE (A(100))
A = (/ (A_TYPE(I), I=1,SIZE(A)) /)
A(1:50) = A(51:100)
IF (ANY(A%I /= (/ ((50+I, I=1,SIZE(A)/2), J=1,2) /))) CALL ABORT
A(::2) = A(1:50)! pack/unpack
IF (ANY(A( ::2)%I /= (/ (50+I, I=1,SIZE(A)/2) /))) CALL ABORT  
IF (ANY(A(2::2)%I /= (/ ((50+2*I, I=1,SIZE(A)/4), J=1,2) /))) CALL 
ABORT  
  END PROGRAM

! { dg-final { scan-tree-dump-times "_gfortran_internal_pack" 1 "original" } }
! { dg-final { scan-tree-dump-times "_gfortran_internal_unpack" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }



Re: [PATCH] Re-write LTO type merging again, do tree merging

2013-06-15 Thread Jan Hubicka
> 
> Patch didn't survive a kernel lto build. LTO test cases are tricky as usual, 
> but I can give you a objdir
> or core file if you want.

I was looking into this ICE and it is DECL_CONTEXT being wrong.  There is
another PR about the same, so i will try to debug it and figure out why.
Otherwise WPA with earlier version of patch was 1.1GB and 30 second for Martin
Liska and me ;)

Honza


Re: [PATCH] Re-write LTO type merging again, do tree merging

2013-06-15 Thread Jan Hubicka
> 
> I've managed to fix nearly all reported missed merged types for cc1.
> Remaining are those we'll never be able to merge (merging would
> change the SCC shape) and those that eventually end up refering
> to a TYPE_STUB_DECL with a make_anon_name () IDENTIFIER_NODE.
> For the latter we should find a middle-end solution as a followup
> in case it really matters.
> 
> WPA statistics for stage2 cc1 are
> 
> [WPA] read 2495082 SCCs of average size 2.380088
> [WPA] 5938514 tree bodies read in total
> [WPA] tree SCC table: size 524287, 260253 elements, collision ratio: 
> 0.804380
> [WPA] tree SCC max chain length 11 (size 1)
> [WPA] Compared 429412 SCCs, 7039 collisions (0.016392)
> [WPA] Merged 426111 SCCs
> [WPA] Merged 3313709 tree bodies
> [WPA] Merged 225079 types
> [WPA] 162844 types prevailed (488124 associated trees)
> [WPA] Old merging code merges an additional 22412 types of which 21492 are 
> in the same SCC with their prevailing variant (345831 and 323276 
> associated trees)
> 
> which shows there are 920 such TYPE_STUB_DECL issues and 21492
> merges the old code did that destroyed SCCs.
> 
> Compared to the old code which only unified types and some selected
> trees (INTEGER_CSTs), the new code can immediately ggc_free the
> unified SCCs after they have been read which results in 55% of
> all tree bodies input into WPA stage to be freed (rather than hoping
> on secondary GC walk effects as the old code relied on), 58% of
> all types are recycled.
> 
> Compile-time is at least on-par with the old code now and disk-usage
> grows by a moderate 10% due to the streaming format change.

On Firefox we now get
[WPA] read 43144472 SCCs of average size 2.270524
[WPA] 97960575 tree bodies read in total
[WPA] tree SCC table: size 8388593, 3936571 elements, collision ratio: 0.727773
[WPA] tree SCC max chain length 88 (size 1)
[WPA] Compared 19030240 SCCs, 337719 collisions (0.017746)
[WPA] Merged 18957101 SCCs
[WPA] Merged 58202930 tree bodies
[WPA] Merged 11800337 types
[WPA] 4506307 types prevailed (13699881 associated trees)
[WPA] Old merging code merges an additional 2174796 types of which 141104 are 
in the same SCC with their prevailing variant (12811826 and 6367853 associated 
trees)
[WPA] GIMPLE canonical type table: size 131071, 77871 elements, 4506442 
searches, 1130903 collisions (ratio: 0.250953)
[WPA] GIMPLE canonical type hash table: size 8388593, 4506386 elements, 
15712947 searches, 12879021 collisions (ratio: 0.819644)

and about 5GB of GGC memory after merging, overall the footprint is still 
around 10GB.
It is notable improvmenet over old code however, where we needed 16GB.

[LTRANS] read 319710 SCCs of average size 6.184039
[LTRANS] 1977099 tree bodies read in total
[LTRANS] GIMPLE canonical type table: size 16381, 9569 elements, 473131 
searches, 24899 collisions (ratio: 0.052626)
[LTRANS] GIMPLE canonical type hash table: size 1048573, 473076 elements, 
1611909 searches, 1340396 collisions (ratio: 0.831558)

CPU: AMD64 family10, speed 2100 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask 
of 0x00 (No unit mask) count 75
samples  %app name symbol name
4504711.7420  lto1 inflate_fast
34224 8.9209  lto1 streamer_read_uhwi(lto_input_block*)
24630 6.4201  lto1 compare_tree_sccs_1(tree_node*, 
tree_node*, tree_node***)
23205 6.0487  lto1 pointer_map_insert(pointer_map_t*, 
void const*)
20829 5.4293  lto1 unpack_value_fields(data_in*, 
bitpack_d*, tree_node*)
13545 3.5307  lto1 ht_lookup_with_hash(ht*, unsigned 
char const*, unsigned long, unsigned int, ht_lookup_option)
12841 3.3472  libc-2.11.1.so   memset
11840 3.0862  lto1 htab_find_slot_with_hash
11397 2.9708  lto1 
streamer_tree_cache_insert_1(streamer_tree_cache_d*, tree_node*, unsigned int, 
unsigned int*, bool)
11086 2.8897  lto1 lto_input_tree(lto_input_block*, 
data_in*)
10522 2.7427  lto1 lto_input_tree_1(lto_input_block*, 
data_in*, LTO_tags, unsigned int)
8853  2.3076  lto1 unify_scc(streamer_tree_cache_d*, 
unsigned int, unsigned int, unsigned int, unsigned int)
8539  2.2258  lto1 hash_table::find_slot_with_hash(tree_scc const*, unsigned int, insert_option)
7987  2.0819  lto1 adler32
7743  2.0183  lto1 
streamer_read_tree_body(lto_input_block*, data_in*, tree_node*)

Can't we free the pointer map in streamer after every SCC?

 phase stream in : 244.05 (47%) usr   7.14 (25%) sys 252.74 (46%) wall 
3478752 kB (93%) ggc
 phase stream out: 222.50 (43%) usr  21.22 (73%) sys 243.97 (44%) wall  
  7160 kB ( 0%) ggc
 garbage collection  :  12.88 ( 2%) usr   0.00 ( 0%) sys  12.88 ( 2%) wall  
 0 kB

Re: [PATCH GCC]Fix PR57540, try to choose scaled_offset address mode when expanding array reference

2013-06-15 Thread Eric Botcazou
> As reported in pr57540, gcc chooses bad address mode, resulting in A)
> invariant part of address expression is not kept or hoisted; b) additional
> computation which should be encoded in address expression.  The reason is
> when gcc runs into "addr+offset" (which is invalid) during expanding, it
> pre-computes the entire address and accesses memory unit using "MEM[reg]".
> Yet we can force addr into register and try to generate "reg+offset" which
> is valid for targets like ARM.  By doing this, we can:
> 1) keep addr in loop invariant form and hoist it later;
> 2) saving additional computation by taking advantage of scaled addressing
> mode;

Does the invalid address not go through arm_legitimize_address from here?

  /* Perform machine-dependent transformations on X
 in certain cases.  This is not necessary since the code
 below can handle all possible cases, but machine-dependent
 transformations can make better code.  */
  {
rtx orig_x = x;
x = targetm.addr_space.legitimize_address (x, oldx, mode, as);
if (orig_x != x && memory_address_addr_space_p (mode, x, as))
  goto done;
  }

-- 
Eric Botcazou


Re: [PATCH 4/4] Fix leading spaces.

2013-06-15 Thread Ondřej Bílka
On Sat, Jun 15, 2013 at 05:13:31PM +0800, Chung-Ju Wu wrote:
> 2013/6/14 Joseph S. Myers :
> > On Thu, 13 Jun 2013, Richard Biener wrote:
> >
> >> Btw, rather than these kind of patches I'd appreciate if someone would look
> >> at a simple pre(post?)-commit hook that enforces those whitespace rules.
> >
> > In the cpp testsuite we definitely want tests with bad whitespace (e.g.
> > gcc.dg/cpp/backslash*.c) and generally I think it's wrong to be changing
> > whitespace in the testsuite without an understanding of what the test is
> > testing and how the whitespace is irrelevant to that (more generally,
> > cleanups of compiler tests are suspect without such an understanding of
> > what is or is not significant in a particular test).  And so you need to
> > allow addition of otherwise bad whitespace there.
> >
> > It's not obvious whether there might be other cases needing such
> > whitespace as well.
> >
> >> Either by adjusting the committed content or by rejecting the commit(?)
> >
> > I don't think hooks adjusting committed content are likely to work at all.
> >
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com
> 
> By having a look at Ondřej's patch, it is nice to fix existing
> codes with proper whitespace/tab rules.
> 
> But it covers too much under whole gcc source tree.
> Like Joseph said, we may accidentally change the cases
> that need bad whitespace.
> 
> Perhaps we should gradually fix them?  i.e. We can only fix
> leading spaces for some obvious cases (gcc/*.c gcc/c-family/*.c ...),
> leaving other source (gcc/testsuite/* gcc/config/* ...) untouched.
>
Here we need a way to clearly mark where preserving whitespaces is
important and where it is not.

You could add files named say

.indent-on .indent-off

to enable or disable formating on per directory basis. I leave decision
if its whitelist or blacklist upto you.

I will follow convention that in files you can selectively disable
formatter by comments:
/*INDENT-OFF*/ /*INDENT-ON*/ 

I now accept only files with .c and .h extensions. If you want to format 
ada/fortran files it is also possible.

> Once Ondřej or others figure out the purpose of those non-obvious cases,
> they can propose another patches to fix them.
> 
> 
> Best regards,
> jasonwucj


Re: [PATCH 4/4] Fix leading spaces.

2013-06-15 Thread Chung-Ju Wu
2013/6/14 Joseph S. Myers :
> On Thu, 13 Jun 2013, Richard Biener wrote:
>
>> Btw, rather than these kind of patches I'd appreciate if someone would look
>> at a simple pre(post?)-commit hook that enforces those whitespace rules.
>
> In the cpp testsuite we definitely want tests with bad whitespace (e.g.
> gcc.dg/cpp/backslash*.c) and generally I think it's wrong to be changing
> whitespace in the testsuite without an understanding of what the test is
> testing and how the whitespace is irrelevant to that (more generally,
> cleanups of compiler tests are suspect without such an understanding of
> what is or is not significant in a particular test).  And so you need to
> allow addition of otherwise bad whitespace there.
>
> It's not obvious whether there might be other cases needing such
> whitespace as well.
>
>> Either by adjusting the committed content or by rejecting the commit(?)
>
> I don't think hooks adjusting committed content are likely to work at all.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

By having a look at Ondřej's patch, it is nice to fix existing
codes with proper whitespace/tab rules.

But it covers too much under whole gcc source tree.
Like Joseph said, we may accidentally change the cases
that need bad whitespace.

Perhaps we should gradually fix them?  i.e. We can only fix
leading spaces for some obvious cases (gcc/*.c gcc/c-family/*.c ...),
leaving other source (gcc/testsuite/* gcc/config/* ...) untouched.

Once Ondřej or others figure out the purpose of those non-obvious cases,
they can propose another patches to fix them.


Best regards,
jasonwucj