fix memory leak in gengtype

2011-04-20 Thread Dimitrios Apostolou
Hello list, while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM killed. That's when I noticed that its RAM usage peaks at 150MB, which is a bit excessive for parsing a ~500K text file. The attached patch fixes the leak and gengtype now uses a peak of 4MB heap. Hopefully I

Re: fix memory leak in gengtype

2011-04-20 Thread Dimitrios Apostolou
On Wed, 20 Apr 2011, Jeff Law wrote: On 04/20/11 15:08, Dimitrios Apostolou wrote: Hello list, while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM killed. That's when I noticed that its RAM usage peaks at 150MB, which is

Re: fix memory leak in gengtype

2011-04-21 Thread Dimitrios Apostolou
On Thu, 21 Apr 2011, Laurynas Biveinis wrote: :( Why don't you get yourself a compile farm account? http://gcc.gnu.org/wiki/CompileFarm Thanks Laurynas, I am absolutely thrilled to see such a variety of hardware! I'll try applying, but I'm not sure I'm eligible, my contributions to OSS are

Re: Patch: speed up compiler a little bit by optimizing lookup_attribute() and is_attribute_p()

2011-06-21 Thread Dimitrios Apostolou
FWIW I think that most of the speedup is due to inlining lookup_attribute(). I got almost the same by applying only the attached very simple patch, since strlen() was called too often (according to the profile at [1]). I used the always_inline attribute to avoid using a macro. I was going to

Re: Patch: speed up compiler a little bit by optimizing lookup_attribute() and is_attribute_p()

2011-06-21 Thread Dimitrios Apostolou
Hi Nicola, my patch is too simple compared to yours, feel free to work on it as much as you wish, no need to credit me since you posted it independently. I just posted it to note that the inlining part is the one providing most performance benefit. richi: I used always_inline because it is t

[df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-07 Thread Dimitrios Apostolou
Hello list, The attached patch does two things for df_get_call_refs(): * First it uses HARD_REG_SETs for defs_generated and regs_invalidated_by_call, instead of bitmaps. Replacing in total more than 400K calls (for my testcase) to bitmap_bit_p() with the much faster TEST_HARD_REG_BIT, reduces
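
To illustrate why swapping a bitmap for a HARD_REG_SET can pay off, here is a minimal sketch, not GCC's actual code. It assumes a fixed-size register set stored as a small array of words, next to a sparse linked-list bitmap. All names (sketch_reg_set, sketch_test_bit, sketch_list_test_bit, SKETCH_NUM_REGS) are hypothetical stand-ins, and GCC's real HARD_REG_SET and bitmap types differ in detail.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical register count; in GCC the real limit is target-specific.  */
#define SKETCH_NUM_REGS 76
#define SKETCH_WORD_BITS (sizeof (unsigned long) * CHAR_BIT)
#define SKETCH_NUM_WORDS \
  ((SKETCH_NUM_REGS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Fixed-size set: one bit per hard register, no allocation needed.  */
typedef struct
{
  unsigned long w[SKETCH_NUM_WORDS];
} sketch_reg_set;

static inline void
sketch_set_bit (sketch_reg_set *s, unsigned bit)
{
  assert (bit < SKETCH_NUM_REGS);
  s->w[bit / SKETCH_WORD_BITS] |= 1UL << (bit % SKETCH_WORD_BITS);
}

/* Testing a bit is one index computation, one load, one shift.  */
static inline int
sketch_test_bit (const sketch_reg_set *s, unsigned bit)
{
  assert (bit < SKETCH_NUM_REGS);
  return (s->w[bit / SKETCH_WORD_BITS] >> (bit % SKETCH_WORD_BITS)) & 1;
}

/* Sparse bitmap element: testing a bit has to walk a list first.  */
typedef struct sketch_elt
{
  struct sketch_elt *next;
  unsigned index;       /* which word of the bit space this element covers */
  unsigned long bits;
} sketch_elt;

static int
sketch_list_test_bit (const sketch_elt *head, unsigned bit)
{
  unsigned index = (unsigned) (bit / SKETCH_WORD_BITS);
  for (const sketch_elt *e = head; e != NULL; e = e->next)
    if (e->index == index)
      return (e->bits >> (bit % SKETCH_WORD_BITS)) & 1;
  return 0;
}
```

The fixed-size test can be fully inlined, while the list-based one needs a function call and a pointer chase per lookup, which is plausibly where the instruction counts reported in this thread come from.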

Re: [df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-07 Thread Dimitrios Apostolou
To document the gains from the bitmaps, here is (part of) the annotated source from callgrind profiler, showing instruction count. Before: 1,154,400 if (bitmap_bit_p(regs_invalidated_by_call_regset, i) 8,080,800 => bitmap.c:bitmap_bit_p (192400x) 1,021,200 && !bitmap_bit_p (&d

Re: [df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-07 Thread Dimitrios Apostolou
And here is the patch that breaks things. By moving df_defs_record() *after* df_get_call_refs() most times collection_rec remains sorted, and about 50M instructions are avoided in qsort() calls of df_canonize_collection_rec(). Unfortunately this does not work. Sometimes cc1 crashes, for exampl

Re: [df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-08 Thread Dimitrios Apostolou
On Fri, 8 Jul 2011, Steven Bosscher wrote: On Fri, Jul 8, 2011 at 5:20 AM, Dimitrios Apostolou wrote: The attached patch does two things for df_get_call_refs(): How did you test this patch? Normally, a patch submission comes with text like, "Bootstrapped & tested on ..., no re

Re: [df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-08 Thread Dimitrios Apostolou
On Fri, 8 Jul 2011, Jakub Jelinek wrote: On Fri, Jul 08, 2011 at 06:20:04AM +0300, Dimitrios Apostolou wrote: The attached patch does two things for df_get_call_refs(): * First it uses HARD_REG_SETs for defs_generated and regs_invalidated_by_call, instead of bitmaps. Replacing in total more

Re: [df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-08 Thread Dimitrios Apostolou
On Fri, 8 Jul 2011, Richard Guenther wrote: On Fri, Jul 8, 2011 at 5:20 AM, Dimitrios Apostolou wrote: Hello list, The attached patch does two things for df_get_call_refs(): * First it uses HARD_REG_SETs for defs_generated and regs_invalidated_by_call, instead of bitmaps. Replacing in total

Re: what can be in a group set?

2011-07-08 Thread Dimitrios Apostolou
On Fri, 8 Jul 2011, Paolo Bonzini wrote: On 07/08/2011 12:43 PM, Richard Sandiford wrote: The docs also say that the first expr_list can be null: If @var{lval} is a @code{parallel}, it is used to represent the case of a function returning a structure in multiple registers. Each element

Re: what can be in a group set?

2011-07-08 Thread Dimitrios Apostolou
Thanks Paolo for the detailed explanation! On Fri, 8 Jul 2011, Paolo Bonzini wrote: That said, changing exit_block_uses and entry_block_defs to HARD_REG_SET would be a nice cleanup, but it would also touch target code due to targetm.extra_live_on_entry (entry_block_defs); I've already done

Re: [df-scan.c] Optimise DF_REFs ordering in collection_rec, use HARD_REG_SETs instead of bitmaps

2011-07-08 Thread Dimitrios Apostolou
On Fri, 8 Jul 2011, Paolo Bonzini wrote: On 07/08/2011 05:51 AM, Dimitrios Apostolou wrote: + /* first write DF_REF_BASE */ This is not necessary. These uses are written to use_vec, while the uses from REG_EQUIV and REG_EQUAL are written to eq_use_vec (see df_ref_create_structure

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-10-08 Thread Dimitrios Apostolou
them will make it into 4.7? On Mon, 22 Aug 2011, Dimitrios Apostolou wrote: For the record I'm posting here the final version of this patch, in case it gets applied. It adds minor stylistic fixes, plus a small change in alloc_pool sizes. Any further testing I do will be posted under

Re: [dwarf2out, elfos] Output assembly faster

2011-10-30 Thread Dimitrios Apostolou
l GNU assembler with -s parameter, though it's pretty hard to be compliant. * Even further in the future we could generate binary data, if we *know* the assembler is GAS. Slightly more descriptive changelog: 2011-08-12 Dimitrios Apostolou * final.c, output.h (fprint_whex, fprint_w,

[DF] Generate REFs in REGNO order

2012-05-20 Thread Dimitrios Apostolou
0.720s Tested on i686, ppc64. No regressions. Paolo: I couldn't find a single test-case where the mw_reg_pool was heavily used so I reduced its size. You think it's OK for all archs? 2012-05-20 Dimitrios Apostolou Paolo Bonzini Provide almost 2% speedup on

Show hash table stats when -fmem-report

2012-05-21 Thread Dimitrios Apostolou
ion than a run-time one. This is probably an overkill so I think I'll skip it. Thanks, Dimitris 2012-05-21 Dimitrios Apostolou Print various statistics about hash tables when called with -fmem-report. If the tables are created once use htab_dump_statistics(), if t

Re: Show hash table stats when -fmem-report

2012-05-21 Thread Dimitrios Apostolou
One line patch to update Makefile. 2012-05-21 Dimitrios Apostolou * gcc/Makefile.in: (toplev.o) toplev.o depends on cselib.h. === modified file 'gcc/Makefile.in' --- gcc/Makefile.in 2012-05-04 20:04:47 + +++ gcc/Makefile.in 2012-05-21 14:08:45 + @@ -2751

Re: [DF] Generate REFs in REGNO order

2012-05-21 Thread Dimitrios Apostolou
Hi Paolo, On Mon, 21 May 2012, Paolo Bonzini wrote: Il 20/05/2012 20:50, Dimitrios Apostolou ha scritto: Paolo: I couldn't find a single test-case where the mw_reg_pool was heavily used so I reduced its size. You think it's OK for all archs? Makes sense, we can see if someth

Re: [DF] Generate REFs in REGNO order

2012-05-22 Thread Dimitrios Apostolou
On Tue, 22 May 2012, Paolo Bonzini wrote: Il 21/05/2012 19:49, Dimitrios Apostolou ha scritto: Thanks for reviewing, in the meantime I'll try to figure out why this patch doesn't offer any speed-up on ppc64 (doesn't break anything though), so expect a followup by tomorrow.

Re: [DF] Generate REFs in REGNO order

2012-05-22 Thread Dimitrios Apostolou
On Tue, 22 May 2012, Paolo Bonzini wrote: Il 22/05/2012 18:26, Dimitrios Apostolou ha scritto: You are right, and I noticed that if we reverse (actually put straight) the loop for the PARALLEL defs inside df_defs_record() then the speedup stands for both x86 and ppc64. The following patch

[line-map] simple oneliner that speeds up track-macro-expansion

2012-06-03 Thread Dimitrios Apostolou
t was zeroed in every macro. Maybe there are pathological cases that I don't see? 2012-06-04 Dimitrios Apostolou * line-map.c (linemap_enter_macro): Don't zero max_column_hint in every macro. This improves performance by reducing the number of reallocations w

Re: [line-map] simple oneliner that speeds up track-macro-expansion

2012-06-04 Thread Dimitrios Apostolou
Hi Dodji, On Mon, 4 Jun 2012, Dodji Seketeli wrote: Hello Dimitrios, I cannot approve or deny your patch, but I have one question. Who should I CC then? I saw that you have commits in that file. I am wondering why this change implies better performance. Is this because when we later want

ping again - Show hash table stats when -fmem-report

2012-07-07 Thread Dimitrios Apostolou
g) table->size); + + fprintf (stderr, "\tused\t\t%lu (%.2f%%)\n", + (unsigned long) table->n_elements, + table->n_elements * 100.0 / table->size); + fprintf (stderr, "\t\tvalid\t\t%lu\n", + (unsigned long) n_valid); + fprintf (stderr, "

PR #53525 - track-macro-expansion performance regression

2012-07-07 Thread Dimitrios Apostolou
With the attached patches I introduce four new obstacks in struct cpp_reader to substitute malloc's/realloc's when expanding macros. Numbers have been posted in the PR, but to summarize: before: 0.785 s or 2201 M instr after: 0.760 s or 2108 M instr Memory overhead is some tens of kilobytes wo

cosmetic change - simplify cse.c:preferable()

2012-07-08 Thread Dimitrios Apostolou
Hello, I've had this patch for some time now, it's simple and cosmetic only, I had done it while trying to understand expression costs in CSE. I think it's more readable than the previous one. FWIW it passed all tests on x86. Thanks, Dimitris === modified file 'gcc/cse.c' --- gcc/cse.c 2012-06

PR 51094 - fprint_w() in output_addr_const() reinstated

2012-07-09 Thread Dimitrios Apostolou
ys be?). Bootstrapped/tested on i386, regtested on x86_64 multilib, i386-pc-solaris2.10 (thanks ro), i686-darwin9 (thanks iains). 2012-07-09 Dimitrios Apostolou * final.c, output.h (fprint_w): New function to write a HOST_WIDE_INT to a file, fast. * final.c (output_addr_

Re: ping again - Show hash table stats when -fmem-report

2012-08-03 Thread Dimitrios Apostolou
l come in a separate patch. The notes quoted from earlier mail still apply: On Sun, 8 Jul 2012, Dimitrios Apostolou wrote: Hi, This patch adds many nice stats about hash tables when gcc is run with -fmem-report. Attached patch tested on x86, no regressions. Also attached is sample output

Re: ping again - Show hash table stats when -fmem-report

2012-08-03 Thread Dimitrios Apostolou
I'm always forgetting something, now it was the changelog, see attached (same as old, nothing significant changed). On Fri, 3 Aug 2012, Dimitrios Apostolou wrote: Hi, I've updated this patch to trunk and rebootstrapped it, so I'm resubmitting it, I'm also making a tr

Re: cosmetic change - simplify cse.c:preferable()

2012-08-03 Thread Dimitrios Apostolou
On Thu, 19 Jul 2012, Richard Guenther wrote: I don't think it's any good or clearer to understand. Hi Richi, I had forgotten I prepared this for PR #19832, maybe you want to take a look. FWIW, with my patch applied there is a difference of ~3 M instr, which is almost unmeasurable in time. Bu

[libiberty] add obstack macros (was Re: PR #53525 - track-macro-expansion performance regression)

2012-08-03 Thread Dimitrios Apostolou
nd I'll back out the patch. 2012-08-04 Dimitrios Apostolou * libiberty.h (XOBDELETE,XOBGROW,XOBGROWVEC,XOBSHRINK,XOBSHRINKVEC): New type-safe macros for obstack allocation. (XOBFINISH): Renamed argument to PT since it is a pointer to T. === modified file

Re: [libiberty] add obstack macros (was Re: PR #53525 - track-macro-expansion performance regression)

2012-08-04 Thread Dimitrios Apostolou
On Fri, 3 Aug 2012, Ian Lance Taylor wrote: 2012-08-04 Dimitrios Apostolou * libiberty.h (XOBDELETE,XOBGROW,XOBGROWVEC,XOBSHRINK,XOBSHRINKVEC): New type-safe macros for obstack allocation. (XOBFINISH): Renamed argument to PT since it is a pointer to T

Re: [libiberty] add obstack macros (was Re: PR #53525 - track-macro-expansion performance regression)

2012-08-05 Thread Dimitrios Apostolou
On Sat, 4 Aug 2012, Ian Lance Taylor wrote: On Fri, 3 Aug 2012, Ian Lance Taylor wrote: I'm not sure where you are looking. I only see one call to _obstack_begin in the gcc directory, and it could easily be replaced with a call to obstack_specify_allocation instead. In libcpp/ mostly, but o

Assembly output optimisations (was: PR 51094 - fprint_w() in output_addr_const() reinstated)

2012-08-06 Thread Dimitrios Apostolou
strapped on x86, no regressions for C,C++ testsuite. Thanks Andreas, hp, Mike, for your comments. Mike I'd appreciate if you elaborated on how to speed-up sprint_uw_rev(), I don't think I understood what you have in mind. Thanks, Dimitris 2012-08-07 Dimitrios Apostolou

add strnlen to libiberty (was Re: Assembly output optimisations)

2012-08-06 Thread Dimitrios Apostolou
As an addendum to my previous patch, I made an attempt to properly add strnlen() to libiberty, with the code copied from gnulib. Unfortunately it seems I've messed it up somewhere since defining HAVE_STRNLEN to 0 doesn't seem to build strnlen.o for me. Any ideas? Thanks, Dimitris === modified

Re: add strnlen to libiberty (was Re: Assembly output optimisations)

2012-08-06 Thread Dimitrios Apostolou
On Mon, 6 Aug 2012, Ian Lance Taylor wrote: On Mon, Aug 6, 2012 at 9:34 PM, Dimitrios Apostolou wrote: As an addendum to my previous patch, I made an attempt to properly add strnlen() to libiberty, with the code copied from gnulib. Unfortunately it seems I've messed it up somewhere

Re: Assembly output optimisations (was: PR 51094 - fprint_w() in output_addr_const() reinstated)

2012-08-07 Thread Dimitrios Apostolou
I should mention that with my patch .ascii is used more aggressively than before, so if a string is longer than ELF_STRING_LIMIT it will be written as .ascii all of it, while in the past it would use .string for the string's tail. Example diff to original behaviour: .LASF15458: - .ascii

Re: Assembly output optimisations (was: PR 51094 - fprint_w() in output_addr_const() reinstated)

2012-08-07 Thread Dimitrios Apostolou
On Tue, 7 Aug 2012, Ian Lance Taylor wrote: On Tue, Aug 7, 2012 at 2:24 PM, Dimitrios Apostolou wrote: BTW I can't find why ELF_STRING_LIMIT is only 256, it seems GAS supports arbitrary lengths. I'd have to change my code if we ever set it too high (or even unlimited) since I al

Re: [RFC] Replace some bitmaps with HARD_REG_SETs

2011-07-24 Thread Dimitrios Apostolou
Hi Steven, On Sun, 24 Jul 2011, Steven Bosscher wrote: Can you please create your patches with the -p option, so that it's easier to see what function you are changing? Also, even for an RFC patch a ChangeLog is more than just nice to have ;-) Do you mean an entry in Changelog file in root dir

Re: [RFC] Replace some bitmaps with HARD_REG_SETs - second version

2011-07-25 Thread Dimitrios Apostolou
Bug found, in df_mark_reg I need to iterate until regno + n, not n. The error is at the following hunk: --- gcc/df-scan.c 2011-02-02 20:08:06 + +++ gcc/df-scan.c 2011-07-24 17:16:46 + @@ -3713,35 +3717,40 @@ df_mark_reg (rtx reg, void *vset) if (regno < FIRST_PSEUDO_REGIST
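
The off-by-bound described above can be sketched as follows. This is an illustration under assumptions, not the actual df_mark_reg code: a register starting at regno occupies n consecutive hard registers, and the names sketch_mark_buggy, sketch_mark_fixed and SKETCH_MAX_REGS are hypothetical.

```c
#include <assert.h>

/* Hypothetical number of hard registers, for illustration only.  */
#define SKETCH_MAX_REGS 32

/* Buggy variant: the loop bound is N, a count, although the loop
   variable starts at REGNO.  For regno >= n nothing is marked.  */
static void
sketch_mark_buggy (unsigned char *marked, unsigned regno, unsigned n)
{
  for (unsigned i = regno; i < n; i++)
    marked[i] = 1;
}

/* Fixed variant: iterate over the N registers starting at REGNO.  */
static void
sketch_mark_fixed (unsigned char *marked, unsigned regno, unsigned n)
{
  assert (regno + n <= SKETCH_MAX_REGS);
  for (unsigned i = regno; i < regno + n; i++)
    marked[i] = 1;
}
```

With regno = 5 and n = 2, the buggy loop never runs, while the fixed one marks registers 5 and 6. A mistake of this kind only shows up for multi-register values in higher-numbered registers, which may be why it is easy to miss in testing.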

eliminate bitmap regs_invalidated_by_call_regset

2011-07-25 Thread Dimitrios Apostolou
un 10) execution test FAIL: libmudflap.cth/pass39-frag.c (-O3) (rerun 10) output pattern test Performance measured not to be affected, maybe it is now a couple milliseconds faster: Original: PC1:0.878s, PC2:6.55s, 2105.6 M instr Patched : PC1:0.875s, PC2:6.54s, 2104.9 M instr 2011-07-25 Dimi

Re: [RFC] Replace some bitmaps with HARD_REG_SETs - second version

2011-07-25 Thread Dimitrios Apostolou
That was a bug, indeed, but unfortunately it wasn't the one causing the crash I posted earlier... Even after fixing it I get the same backtrace from gdb. So the petition "spot the bug" holds... Thanks, Dimitris

added some assert checks in hard-reg-set.h

2011-07-25 Thread Dimitrios Apostolou
Dimitrios Apostolou * hard-reg-set.h (TEST_HARD_REG_BIT, SET_HARD_REG_BIT, CLEAR_HARD_REG_BIT): Added some assert checks for test, set and clear operations of HARD_REG_SETs, enabled when RTL checks are on. Runtime overhead was measured as negligible. Thanks, Dimitris === modified file

Re: [RFC] Replace some bitmaps with HARD_REG_SETs - second version

2011-07-26 Thread Dimitrios Apostolou
Bug found at last, it's in the following hunk, the ampersand in &exit_block_uses is wrong... :-@ @@ -3951,7 +3949,7 @@ df_get_exit_block_use_set (bitmap exit_b { rtx tmp = EH_RETURN_STACKADJ_RTX; if (tmp && REG_P (tmp)) - df_mark_reg (tmp, exit_block_uses); + df_

[DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Dimitrios Apostolou
improvements are welcome. http://gcc.gnu.org/wiki/OptimisingGCC?action=AttachFile&do=view&target=callgrind-tcp_ipv4-trunk-co-109439-prod.txt http://gcc.gnu.org/wiki/OptimisingGCC?action=AttachFile&do=view&target=callgrind-tcp_ipv4-df2-co-prod.txt Changelog: 2011-07-29 Dim

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Dimitrios Apostolou
Completely forgot it: Tested on i386, no regressions. Dimitrios

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-29 Thread Dimitrios Apostolou
On Fri, 29 Jul 2011, Kenneth Zadeck wrote: I really think that patches of this magnitude having to do with the RTL level should be tested on more than one platform. I'd really appreciate further testing on alternate platforms from whoever does it casually, for me it would take too much time to s

[RFC] hard-reg-set.h refactoring

2011-07-30 Thread Dimitrios Apostolou
Hello list, the attached patch changes hard-reg-set.h in the following areas: 1) HARD_REG_SET is now always a struct so that it can be used in files where we don't want to include tm.h. Many thanks to Paolo for providing the idea and the original patch. 2) Code for specific HARD_REG_SET_LONG

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-07-31 Thread Dimitrios Apostolou
On Sun, 31 Jul 2011, Steven Bosscher wrote: On Fri, Jul 29, 2011 at 11:48 PM, Steven Bosscher wrote: I'll see if I can test the patch on the compile farm this weekend, just to be sure. Worked fine with some cross-builds to arm-eabi. Bootstrap on ia64-unknown-linux-gnu is in stage2 but it is

Re: [RFC] hard-reg-set.h refactoring

2011-08-01 Thread Dimitrios Apostolou
On Sun, 31 Jul 2011, Paolo Bonzini wrote: On Sat, Jul 30, 2011 at 19:21, Dimitrios Apostolou wrote: Nevertheless I'd appreciate comments on whether any part of this patch is worth keeping. FWIW I've profiled this on i386 to be about 4 M instr slower out of ~1.5 G inst. I'll be n

Re: [RFC] hard-reg-set.h refactoring

2011-08-01 Thread Dimitrios Apostolou
On Mon, 1 Aug 2011, Paolo Bonzini wrote: On 08/01/2011 05:57 PM, Dimitrios Apostolou wrote: I don't fully understand the output from -fdump-tree-all, but my conclusion based also on profiler output and objdump, is that both unrolling and inlining are happening in both versions. Neverthel

Decrease fill-ratio of hash tables

2011-08-08 Thread Dimitrios Apostolou
ample, for the mem_attrs_htab hash table, coll/searches ratio is still sometimes higher than 0.5. Changelog: 2011-08-09 Dimitrios Apostolou * symtab.c (ht_lookup_with_hash): Hash table will now be doubled when 50% full, not 75%, to reduce collisions. *
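
The fill-ratio change in the changelog above can be sketched roughly like this. It is a simplified model, not the code in symtab.c: it assumes the table doubles as soon as the number of elements reaches the fraction num/den of its slots, and the names sketch_should_grow and sketch_size_after_inserts are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return nonzero when a table with SIZE slots holding N_ELEMENTS entries
   has reached the fill ratio NUM/DEN (e.g. 3/4 or 1/2).  Integer
   arithmetic avoids a floating-point division on every insert.  */
static int
sketch_should_grow (size_t n_elements, size_t size, size_t num, size_t den)
{
  assert (den != 0 && size != 0);
  return n_elements * den >= size * num;
}

/* Model INSERTS insertions into a table of INITIAL_SIZE slots and return
   the final slot count, doubling whenever the threshold is reached.  */
static size_t
sketch_size_after_inserts (size_t initial_size, size_t inserts,
                           size_t num, size_t den)
{
  size_t size = initial_size;
  for (size_t n = 1; n <= inserts; n++)
    if (sketch_should_grow (n, size, num, den))
      size *= 2;
  return size;
}
```

In this model, ten insertions into a 16-slot table leave it at 16 slots with a 75% threshold but grow it to 32 slots with a 50% threshold. Probe sequences get shorter, at the cost of more memory and a larger cache footprint.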

Dump stats about hottest hash tables when -fmem-report

2011-08-08 Thread Dimitrios Apostolou
hangelog: 2011-08-09 Dimitrios Apostolou * cgraph.c, cgraph.h (cgraph_dump_stats): New function to dump stats about cgraph_hash hash table. * cselib.c, cselib.h (cselib_dump_stats): New function to dump stats about cselib_hash_table. * cselib.c (cselib_finis

Re: Dump stats about hottest hash tables when -fmem-report

2011-08-09 Thread Dimitrios Apostolou
I forgot to include the dwarf2out.c:file_table. Stats are printed when -g. See attached patch. Additional Changelog: * dwarf2out.c (dwarf2out_finish): Call htab_dump_statistics() if -fmem-report. Dimitris === modified file 'gcc/dwarf2out.c' --- gcc/dwarf2out.c 2011-06-06 1

Re: Dump stats about hottest hash tables when -fmem-report

2011-08-09 Thread Dimitrios Apostolou
On Tue, 9 Aug 2011, Tom Tromey wrote: "Richard" == Richard Guenther writes: The libcpp part is ok with this change. Richard> Note that sparsely populated hashes come at the cost of increased Richard> cache footprint. Not sure what is more important here though, memory Richard> access or h

Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Dimitrios Apostolou
On Sun, 6 Nov 2011, Joern Rennecke wrote: But where HARD_REG_SETS make no material difference in speed, and the compilation unit has no other tight coupling with tm.h, it would really be cleaner to move from HARD_REG_SETS to a target-independent type, like sbitmap or bitmap. Maybe we want someth

Re: repost: [DF] Use HARD_REG_SETs instead of bitmaps

2011-11-06 Thread Dimitrios Apostolou
On Mon, 7 Nov 2011, Jakub Jelinek wrote: On Mon, Nov 07, 2011 at 12:01:29AM +0200, Dimitrios Apostolou wrote: On Sun, 6 Nov 2011, Joern Rennecke wrote: But where HARD_REG_SETS make no material difference in speed, and the compilation unit has no other tight coupling with tm.h, it would really

Re: [PATCH] Fix Linux/sparc build after generic asm output optimizations.

2011-11-11 Thread Dimitrios Apostolou
Hi David, I couldn't imagine such breakage... If too many platforms break perhaps we should undo the optimisation - see attached patch. Thanks, Dimitris P.S. See also bug #51094, where I've attached some more fixes === modified file 'gcc/config/elfos.h' --- gcc/config/elfos.h 2011-10-30 01:45:46

Re: [PATCH] Fix Linux/sparc build after generic asm output optimizations.

2011-11-12 Thread Dimitrios Apostolou
Hi, On Sat, 12 Nov 2011, Eric Botcazou wrote: We just need to declare it in system.h in order to use the definition in libiberty. OK, this should be fine. Do the patches I sent for bug #51094 solve the problems? http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51094 Thanks, Dimitris

Re: Dump stats about hottest hash tables when -fmem-report

2011-08-19 Thread Dimitrios Apostolou
On Fri, 19 Aug 2011, Tom Tromey wrote: I think you are the most likely person to do this sort of testing. You can use machines on the GCC compile farm for this. Your patch to change the symbol table's load factor is fine technically. I think the argument for putting it in is lacking; what I wou

Various minor speed-ups

2011-08-22 Thread Dimitrios Apostolou
Hello list, the followup patches are a selection of minor changes introduced at various times during my GSOC project. They mostly are simple or not that important to be posted alone, so I'll post them all together under this thread. Nevertheless they have been carefully selected from a pool of

mem_attrs_htab

2011-08-22 Thread Dimitrios Apostolou
2011-08-22 Dimitrios Apostolou * emit-rtl.c (mem_attrs_htab_hash): Hash massively by calling iterative_hash(). We disregard the offset,size rtx fields of the mem_attrs struct, but overall this hash is a *huge* improvement to the previous one, it reduces the

graphds.[ch]: alloc_pool for edges

2011-08-22 Thread Dimitrios Apostolou
free() was called way too often before, this patch reduces it significantly. Minor speed-up here too, I don't mention it individually since numbers are within noise margins. 2011-08-22 Dimitrios Apostolou * graphds.h (struct graph): Added edge_pool as a pool for alloc

tree-ssa*: reduce malloc() calls by preallocating hot VECs on the stack

2011-08-22 Thread Dimitrios Apostolou
2011-08-22 Dimitrios Apostolou Allocate some very frequently used vectors on the stack: * vecir.h: Defined a tree vector on the stack. * tree-ssa-sccvn.c (print_scc, sort_scc, process_scc) (extract_and_process_scc_for_name): Allocate the scc vector on the

tree-ssa-structalias.c: alloc_pool for struct equiv_class_label

2011-08-22 Thread Dimitrios Apostolou
2011-08-22 Dimitrios Apostolou * tree-ssa-structalias.c (equiv_class_add) (perform_var_substitution, free_var_substitution_info): Created a new equiv_class_pool allocator pool for struct equiv_class_label. Changed the pointer_equiv_class_table and

Re: Various minor speed-ups

2011-08-22 Thread Dimitrios Apostolou
2011-08-22 Dimitrios Apostolou * tree-ssa-pre.c (phi_trans_add, init_pre, fini_pre): Added a pool for phi_translate_table elements to avoid free() calls from htab_delete(). === modified file 'gcc/tree-ssa-pre.c' --- gcc/tree-ssa-pre.c 2011-05-04 09:0

Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label

2011-08-22 Thread Dimitrios Apostolou
Forgot the patch... On Mon, 22 Aug 2011, Dimitrios Apostolou wrote: 2011-08-22 Dimitrios Apostolou * tree-ssa-structalias.c (equiv_class_add) (perform_var_substitution, free_var_substitution_info): Created a new equiv_class_pool allocator pool for struct

Re: mem_attrs_htab

2011-08-22 Thread Dimitrios Apostolou
Hi Jakub, I forgot to mention that all patches are against mid-July trunk, I was hoping I'd have no conflicts. Anyway thanks for letting me know, if there are conflicts with my other patches please let me know, and I'll post an updated version at a later date. All your other concerns are val

Re: Various minor speed-ups

2011-08-22 Thread Dimitrios Apostolou
For whoever is concerned about memory usage, I didn't measure a real increase, besides a few KB. These are very hot allocation pools and allocating too many blocks of 10 elements is suboptimal. 2011-08-22 Dimitrios Apostolou * cselib.c (cselib_init): Increased initial si
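
To show why the block size of a hot allocation pool matters, here is a minimal fixed-size pool sketch. It is not GCC's alloc-pool implementation. It assumes a pool hands out equally sized elements carved from malloc()ed blocks and keeps freed elements on a free list, and every name in it (sketch_pool, sketch_pool_alloc and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Used only to get a conservative alignment for pool elements.  */
typedef union { void *p; long long ll; long double ld; } sketch_align;

typedef union sketch_block
{
  union sketch_block *next;     /* chain of malloc()ed blocks */
  sketch_align align;
} sketch_block;

typedef struct sketch_free { struct sketch_free *next; } sketch_free;

typedef struct
{
  size_t elt_size;              /* rounded up to the alignment */
  size_t elts_per_block;
  sketch_block *blocks;
  sketch_free *free_list;
  size_t n_blocks;              /* how many times malloc() was called */
} sketch_pool;

static void
sketch_pool_init (sketch_pool *p, size_t elt_size, size_t elts_per_block)
{
  size_t a = sizeof (sketch_align);
  assert (elt_size > 0 && elts_per_block > 0);
  p->elt_size = (elt_size + a - 1) / a * a;
  p->elts_per_block = elts_per_block;
  p->blocks = NULL;
  p->free_list = NULL;
  p->n_blocks = 0;
}

static void *
sketch_pool_alloc (sketch_pool *p)
{
  if (p->free_list == NULL)
    {
      /* Out of elements: grab one more block and thread it onto the
         free list.  Bigger blocks mean fewer trips through here.  */
      sketch_block *b = malloc (sizeof *b + p->elt_size * p->elts_per_block);
      if (b == NULL)
        return NULL;
      b->next = p->blocks;
      p->blocks = b;
      p->n_blocks++;
      char *base = (char *) (b + 1);
      for (size_t i = 0; i < p->elts_per_block; i++)
        {
          sketch_free *f = (sketch_free *) (base + i * p->elt_size);
          f->next = p->free_list;
          p->free_list = f;
        }
    }
  sketch_free *f = p->free_list;
  p->free_list = f->next;
  return f;
}

static void
sketch_pool_free (sketch_pool *p, void *elt)
{
  sketch_free *f = elt;
  f->next = p->free_list;
  p->free_list = f;
}

static void
sketch_pool_destroy (sketch_pool *p)
{
  while (p->blocks != NULL)
    {
      sketch_block *next = p->blocks->next;
      free (p->blocks);
      p->blocks = next;
    }
  p->free_list = NULL;
}
```

In this sketch, 100 live elements cost ten malloc() calls with blocks of 10 elements but only two with blocks of 64, while the extra memory is bounded by one partly used block per pool.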

cse.c: preferable()

2011-08-22 Thread Dimitrios Apostolou
Attached patch is also posted at bug #19832 and I think resolves it, as well as /maybe/ offers a negligible speedup of 3-4 M instr or a couple milliseconds. I also post it here for comments. 2011-08-13 Dimitrios Apostolou * cse.c (preferable): Make it more readable and slightly faster

Re: Dump stats about hottest hash tables when -fmem-report

2011-08-22 Thread Dimitrios Apostolou
sing hash table much simpler, and I think there is only one case we actually delete strings. Have to look further into this one. All comments welcome, Dimitris Changelog: 2011-08-22 Dimitrios Apostolou * cgraph.c, cgraph.h (cgraph_dump_stats): New function to dump

Re: tree-ssa-structalias.c: alloc_pool for struct equiv_class_label

2011-08-22 Thread Dimitrios Apostolou
On Mon, 22 Aug 2011, Richard Guenther wrote: On Mon, Aug 22, 2011 at 9:46 AM, Dimitrios Apostolou wrote: 2011-08-22 Dimitrios Apostolou * tree-ssa-structalias.c (equiv_class_add) (perform_var_substitution, free_var_substitution_info): Created a new equiv_class_pool

[var-tracking] small speed-ups

2011-08-22 Thread Dimitrios Apostolou
completion I'll also post a follow-up patch where I delete/simplify a big part of var-tracking, unfortunately with some impact on performance. 2011-08-22 Dimitrios Apostolou * var-tracking.c (init_attrs_list_set): Remove function, instead use a memset() call to zero t

[var-tracking] [not-good!] disable shared_hash and other simplifications

2011-08-22 Thread Dimitrios Apostolou
Hello, the attached patch applies after my previous one, and actually cancels all runtime gains from it. It doesn't make things worse than initially, so it's not *that* bad. While trying to understand var-tracking I deleted the whole shared hash table concept and some other indirections. It

Re: Dump stats about hottest hash tables when -fmem-report

2011-08-22 Thread Dimitrios Apostolou
I should note here that specialised hash-tables in pointer-set.c have a load-factor of at most 25%. Also another very fast hash table I've studied, dense_hash_map from Google's sparse_hash_table, has a load factor of 50% max. As I understand it a good hash function gives a perfectly random val

[dwarf2out, elfos] Output assembly faster

2011-08-22 Thread Dimitrios Apostolou
parameter, though it's pretty hard to be compliant. * Even further in the future we could generate binary data, if we *know* the assembler is GAS. Changelog: 2011-08-12 Dimitrios Apostolou * final.c, output.h (fprint_whex, fprint_w, fprint_ul, sprint_ul): New func
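
The changelog above names helpers such as fprint_whex and sprint_ul. Their bodies are not shown here, so the following is only a sketch of the general technique: formatting a number by hand into a small buffer instead of going through printf-style format parsing. The function name sketch_sprint_hex and the "0x" prefix are assumptions for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write VALUE as "0x" followed by lowercase hex digits into OUT, which
   has room for OUT_SIZE bytes.  Returns the number of characters
   written, not counting the terminating NUL.  */
static size_t
sketch_sprint_hex (char *out, size_t out_size, unsigned long value)
{
  static const char digits[] = "0123456789abcdef";
  char tmp[sizeof (unsigned long) * CHAR_BIT / 4];
  size_t n = 0;

  /* Produce digits least-significant first; no format string to parse.  */
  do
    {
      tmp[n++] = digits[value & 0xf];
      value >>= 4;
    }
  while (value != 0);

  assert (out_size >= n + 3);
  out[0] = '0';
  out[1] = 'x';
  /* Reverse into the output buffer.  */
  for (size_t i = 0; i < n; i++)
    out[2 + i] = tmp[n - 1 - i];
  out[2 + n] = '\0';
  return n + 2;
}
```

A caller would then hand the buffer to fwrite() or fputs(), so the per-call cost is a short loop rather than a full format-string interpretation. That matters when millions of numbers are written into assembly output.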

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-08-22 Thread Dimitrios Apostolou
Hi Steven, On Mon, 1 Aug 2011, Steven Bosscher wrote: On Sun, Jul 31, 2011 at 11:59 PM, Steven Bosscher wrote: On Fri, Jul 29, 2011 at 11:48 PM, Steven Bosscher wrote: I'll see if I can test the patch on the compile farm this weekend, just to be sure. Bootstrap on ia64-unknown-linux-gnu is

Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order

2011-08-22 Thread Dimitrios Apostolou
On Mon, 22 Aug 2011, Dimitrios Apostolou wrote: Hi Steven, On Mon, 1 Aug 2011, Steven Bosscher wrote: On Sun, Jul 31, 2011 at 11:59 PM, Steven Bosscher wrote: On Fri, Jul 29, 2011 at 11:48 PM, Steven Bosscher wrote: I'll see if I can test the patch on the compile farm this weekend, ju

Re: [var-tracking] small speed-ups

2011-08-23 Thread Dimitrios Apostolou
Hi Jakub, On Mon, 22 Aug 2011, Jakub Jelinek wrote: On Mon, Aug 22, 2011 at 01:30:33PM +0300, Dimitrios Apostolou wrote: @@ -1191,7 +1189,7 @@ dv_uid2hash (dvuid uid) static inline hashval_t dv_htab_hash (decl_or_value dv) { - return dv_uid2hash (dv_uid (dv)); + return (hashval_t

Re: [var-tracking] small speed-ups

2011-08-23 Thread Dimitrios Apostolou
On Tue, 23 Aug 2011, Jakub Jelinek wrote: On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote: dst->vars = (shared_hash) pool_alloc (shared_hash_pool); dst->vars->refcount = 1; dst->vars->htab - = htab_create (MAX (src1_elems, src2_elems), var

Re: [PATCH][C++] Save memory and reallocations in name-lookup

2012-08-17 Thread Dimitrios Apostolou
Hi, On Fri, 17 Aug 2012, Jakub Jelinek wrote: On Fri, Aug 17, 2012 at 06:41:37AM -0500, Gabriel Dos Reis wrote: I am however concerned with: static void store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings) { ! static VEC(tree,heap) *bindings_need_stored = NULL; I wo

Speedups/Cleanups: End of GSOC patch collection

2012-08-18 Thread Dimitrios Apostolou
Hello list, for the following couple of days I'll be posting under this thread my collection of patches. Unless otherwise mentioned they've been bootstrapped and tested on x86, but with a three-week-old snapshot, that is pre-C++ conversion. I plan to test again next week with a latest snaps

Re: Speedups/Cleanups: End of GSOC patch collection

2012-08-18 Thread Dimitrios Apostolou
2012-08-18 Dimitrios Apostolou * dwarf2out.c (output_indirect_string): Use ASM_OUTPUT_INTERNAL_LABEL instead of slower ASM_OUTPUT_LABEL. * varasm.c (assemble_string): Don't break string in chunks, this is assembler specific and already done in most versio

Re: Speedups/Cleanups: End of GSOC patch collection

2012-08-18 Thread Dimitrios Apostolou
documented way of allocating macros. 2012-08-18 Dimitrios Apostolou * include/libiberty.h (XOBDELETE, XOBGROW, XOBGROWVEC, XOBSHRINK) (XOBSHRINKVEC, XOBFINISH): New type-safe macros for obstack operations. (XOBFINISH): Changed to return (T *) instead of T. All

[graphds.h] Allocate graph from obstack

2012-08-18 Thread Dimitrios Apostolou
acceptable, and also where I should initialise the obstack once, and avoid checking if it's NULL in every use. Minor speed gains (couple of ms), tested with pre-C++ conversion snapshot, I'll retest soon and post update. Thanks, Dimitris 2012-08-18 Dimitrios Apostolou

more malloc mitigation

2012-08-18 Thread Dimitrios Apostolou
Hi, 2012-08-18 Dimitrios Apostolou * gcc/tree-ssa-sccvn.c (struct vn_tables_s): Add obstack_start to mark the first allocated object on the obstack. (process_scc, allocate_vn_table): Use it. (init_scc_vn): Don't truncate shared_lookup_references v

obstack for equiv_class_label, more vectors on stack

2012-08-19 Thread Dimitrios Apostolou
2012-08-19 Dimitrios Apostolou * gcc/tree-ssa-structalias.c: Change declaration of ce_s type vector from heap to stack. Update all relevant functions to VEC_alloc() such vector upfront with enough (32) slots so that malloc() calls are mostly avoided

alloc_pool for tree-ssa-pre.c:phi_translate_table

2012-08-19 Thread Dimitrios Apostolou
2012-08-19 Dimitrios Apostolou * gcc/tree-ssa-pre.c (phi_translate_pool): New static global alloc_pool, used for allocating struct expr_pred_trans_d for phi_translate_table. (phi_trans_add, init_pre, fini_pre): Use it, avoids thousands of malloc() and

enlarge hot allocation pools

2012-08-19 Thread Dimitrios Apostolou
Hello, 2012-08-19 Dimitrios Apostolou * gcc/cselib.c (cselib_init): Make allocation pools larger since they are too hot and are shown to expand often in the profiler. * gcc/df-problems.c (df_chain_alloc): Same. * gcc/et-forest.c (et_new_occ, et_new_tree): Same

Re: enlarge hot allocation pools

2012-08-19 Thread Dimitrios Apostolou
Hi Steven, On Sun, 19 Aug 2012, Steven Bosscher wrote: On Sun, Aug 19, 2012 at 8:31 PM, Dimitrios Apostolou wrote: Hello, 2012-08-19 Dimitrios Apostolou * gcc/cselib.c (cselib_init): Make allocation pools larger since they are too hot and show to expand often on the

Re: alloc_pool for tree-ssa-pre.c:phi_translate_table

2012-08-20 Thread Dimitrios Apostolou
On Mon, 20 Aug 2012, Jakub Jelinek wrote: I'd note for all the recently posted patches from Dimitrios, the gcc/ prefix doesn't belong to the ChangeLog entry pathnames, the filenames are relative to the corresponding ChangeLog location. Ah sorry, it's what the mklog utility generates, it seems

Re: [graphds.h] Allocate graph from obstack

2012-08-20 Thread Dimitrios Apostolou
Hi Paolo, On Mon, 20 Aug 2012, Paolo Bonzini wrote: Il 19/08/2012 18:55, Richard Guenther ha scritto: Initially I had one obstack per struct graph, which was better than using XNEW for every edge, but still obstack_init() called from new_graph() was too frequent. So in this iteration of the pa

[DF] RFC: obstacks in DF

2012-08-20 Thread Dimitrios Apostolou
Hi, while I was happy using obstacks in other parts of the compiler I thought they would provide a handy solution for the XNEWVECs/XRESIZEVECs in df-scan.c, especially df_install_refs() which is the heaviest malloc() user after the rest of my patches. In the process I realised that obstacks

failed attempt: retain identifier length from frontend to backend

2012-08-20 Thread Dimitrios Apostolou
Hello, my last attempt at improving something serious was about three weeks ago, trying to keep all lengths of all strings parsed in the frontend for the whole compilation phase until the assembly output. I was hoping that would help in using faster hashes (knowing the length allows us to hash