On Tue, 23 Aug 2011, Jakub Jelinek wrote:
On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote:

  dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
  dst->vars->refcount = 1;
  dst->vars->htab
-    = htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
+    = htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
                   variable_htab_eq, variable_htab_free);

This looks wrong, 2 * max is definitely too much.

For a hash table to fit N elements, it has to have at least 4/3*N
slots, or 2*N slots if htab has the 50% load factor I was proposing.

For var-tracking, 50% load factor is IMHO a very bad idea, memory
consumption of var-tracking is already high right now, we IMHO don't have
the luxury to waste more RAM.

Agreed, then I 'll change it to 4/3 * MAX so that I avoid expansions.

In total for dataflow_set_destroy I can see that calls to
attrs_list_clear() have been reduced from 500K to 250K, and I can
also see a reduction of free() calls from htab_delete(), from 30K to

free calls?  If you avoid free calls, then you end up with a memory leak.
I can understand when you avoid pool_free calls...

You are right, I just mentioned the total difference in free() calls from all my patches. But in this part there is no free() involved, so the major gain should be from avoiding htab_delete() iterating many times over the hash tables. Annotated source from the callgrind profiler (shows instruction count):

Before:

         .  void
15,820,597  htab_delete (htab_t htab)
   250,914  {
    41,819    size_t size = htab_size (htab);
    41,819    PTR *entries = htab->entries;
         .    int i;
         .
    83,638    if (htab->del_f)
66,493,884      for (i = size - 1; i >= 0; i--)
49,825,998        if (entries[i] != HTAB_EMPTY_ENTRY && entries[i] != 
HTAB_DELETED_ENTRY)
 1,813,018      (*htab->del_f) (entries[i]);
 1,800,731  => tree-into-ssa.c:def_blocks_free (10988x)
   345,910  => tree-into-ssa.c:repl_map_free (2815x)
    53,950  => tree-scalar-evolution.c:del_scev_info (1197x)
   139,354  => tree-ssa-loop-im.c:memref_free (198x)
   281,777  => tree-ssa-sccvn.c:free_phi (3788x)
    81,726  => tree-ssa-uncprop.c:equiv_free (463x)
17,359,572  => var-tracking.c:variable_htab_free (835512x)
    42,161  => cfgloop.c:loop_exit_free (720x)
   284,904  => ira-costs.c:cost_classes_del (2270x)
       157  => tree-ssa-loop-im.c:vtoe_free (1x)
11,454,221  => ???:free (37228x)
 1,460,684  => tree-ssa-sccvn.c:free_reference (11329x)
         .
   125,457    if (htab->free_f != NULL)



After:

         .  void
 6,543,474  htab_delete (htab_t htab)
   250,914  {
    41,819    size_t size = htab_size (htab);
    41,819    PTR *entries = htab->entries;
         .    int i;
         .
    83,638    if (htab->del_f)
29,288,268      for (i = size - 1; i >= 0; i--)
21,927,330        if (entries[i] != HTAB_EMPTY_ENTRY && entries[i] != 
HTAB_DELETED_ENTRY)
 1,738,584      (*htab->del_f) (entries[i]);
   139,344  => tree-ssa-loop-im.c:memref_free (198x)
       157  => tree-ssa-loop-im.c:vtoe_free (1x)
    81,762  => tree-ssa-uncprop.c:equiv_free (463x)
   281,884  => tree-ssa-sccvn.c:free_phi (3788x)
    42,145  => cfgloop.c:loop_exit_free (720x)
   345,870  => tree-into-ssa.c:repl_map_free (2815x)
 1,462,000  => tree-ssa-sccvn.c:free_reference (11329x)
 1,800,951  => tree-into-ssa.c:def_blocks_free (10988x)
16,518,004  => var-tracking.c:variable_htab_free (824010x)
   284,979  => ira-costs.c:cost_classes_del (2270x)
 9,080,245  => ???:free (11513x)
    53,921  => tree-scalar-evolution.c:del_scev_info (1197x)
         .
   125,457    if (htab->free_f != NULL)



So if the part relevant to this patch is the number of calls to variable_htab_free, I suppose it won't make a big difference.

But if I take a look at the top callers and top callees of htab_delete():

Before:

 19,590,149  < tree-ssa-structalias.c:solve_constraints (2058x) [cc1]
 33,299,064  < var-tracking.c:shared_hash_destroy (6923x) [cc1]
 68,941,719  < tree-ssa-pre.c:execute_pre (2058x) [cc1]
134,915,334  *  hashtab.c:htab_delete [cc1]
 17,359,572  >   var-tracking.c:variable_htab_free (835512x) [cc1]
 11,454,221  >   ???:free (108456x) [libc-2.12.so]
  1,800,731  >   tree-into-ssa.c:def_blocks_free (10988x) [cc1]


After:

  8,756,328  < tree-ssa-structalias.c:delete_points_to_sets (1029x) [cc1]
  9,316,165  < tree-ssa-dom.c:tree_ssa_dominator_optimize (562x) [cc1]
 83,696,211  < var-tracking.c:shared_hash_destroy (6923x) [cc1]
 60,459,493  *  hashtab.c:htab_delete [cc1]
 16,518,004  >   var-tracking.c:variable_htab_free (824010x) [cc1]
  9,080,245  >   ???:free (82741x) [libc-2.12.so]
  1,800,951  >   tree-into-ssa.c:def_blocks_free (10988x) [cc1]


Then I'm more confused, htab_delete() seems to be much colder, probably due to less traversals of hash tables, but It's not clear how much var-tracking changes are responsible for this. I'll keep in mind to revert the do_free parameter and report back. I guess for less than 10 M instr it's not worth the hassle.


Thanks,
Dimitris

Reply via email to