Hello,

I'm attaching here the final version of statistics dumping, I think I made some stylistic/insignificant changes. Tested on i386 and x86_64. I have withheld the hash table size changes, so please apply if you find everything good.

Would you agree on a future patch that would make such statistics being computed only when the gather-mem-stats compile-time variable is set?


On Sat, 20 Aug 2011, Richard Guenther wrote:
On Fri, Aug 19, 2011 at 6:40 PM, Tom Tromey <tro...@redhat.com> wrote:

Your patch to change the symbol table's load factor is fine technically.
I think the argument for putting it in is lacking; what I would like to
see is either some rationale explaining that the increased memory use is
not important, or some numbers showing that it still performs well on
more than a single machine.  My reason for wanting this is just that,
historically, GCC has been very sensitive to increases in memory use.
Alternatively, comments from more active maintainers indicating that
they don't care about this would also help your case.

I can't approve or reject the libiberty change, just the libcpp one.

Yes, memory usage is as important as compile-time.  We still have testcases
that show a vast imbalance of them.  I don't know if the symbol table hash
is ever the problem, but changing the generic load factor in libiberty doesn't
sound a good idea - maybe instead have a away of specifying that factor
per hashtable instance.  Or, as usual, improve the hash function to
reduce the collision rate and/or to make re-hashing cheaper.

Regarding hash table size, I will back off from hashtab.c. I really believe that even though it makes a difference the proper solution would be a much more intrusive one: complete reimplementation.

If we primarily care about memory usage we should use a closed-addressing (with linked-lists in each bucket) hash table, and set the load-factor high. If we care about cache-efficiency and speed we should use an open-addressing hash table, like the one we have, but decrease the load-factor and resolve collisions by quadratic probing instead of rehashing. Also we should make sure our hash table is always power-of-2 sized, and that hash values are always computed using good and proved hash functions (we use Bob Jenkins v2 hash, we should upgrade to v3 even though v2 seems very good *when* we actually do use it). That would probably mean the interface should change to disallow custom hash values, and while we do that I'd really like to see new functions dedicated to searching and inserting.

I've experimented with some of these changes, but these changes individually do not offer significant speed-up. I believe that change will show if hash tables are completely reimplemented.

But regarding symtab.c, I only see benefits. Collisions are reduced and extra memory usage is negligible. For a hash table with 32K elements and 64K slots, the overhead of empty slots is 32K*8 = 256KB. It it was 75% full it would be 128KB. Actual measurements on i386 show negligible overhead, it could be by luck that final ht size is the same (they are always power-of-2 sized, so it's possible). Anyway I'll hopefully test in other platforms within September and report back.

A change that would probably help for symtab would be to store custom string structs, that include string length so we avoid several strlen() calls. Another could be to not support deletion of strings (since what we are actually doing is string interning, isn't it?). This would make the open-addressing hash table much simpler, and I think there is only one case we actually delete strings. Have to look further into this one.


All comments welcome,
Dimitris




Changelog:

2011-08-22  Dimitrios Apostolou  <ji...@gmx.net>

        * cgraph.c, cgraph.h (cgraph_dump_stats): New function to dump
        stats about cgraph_hash hash table.
        * cselib.c, cselib.h (cselib_dump_stats): New function to dump
        stats about cselib_hash_table.
        * cselib.c (cselib_finish): Keep statistics by increasing values
        of new global variables
        cselib_htab_{num,expands,searches,collisions} if -fmem-report.
        * emit-rtl.c, emit-rtl.h (mem_attrs_dump_stats): New function to
        dump statistics about mem_attrs_htab hash table.
        * tree.c (print_type_hash_statistics): Used the new
        htab_dump_statistics() function.
        * var-tracking.c (shared_hash_destroy): Keep statistics by
        increasing values of new global variables
        vars_htab_{num,expands,searches,collisions} if -fmem-report.
        (vt_finalize): Keep statistics by increasing values of new global
        variables cv_htab_{num,expands,searches,collisions} and
        vc_htab_{num,expands,searches,collisions} if -fmem-report.
        * var-tracking.c, rtl.h (vt_dump_stats): New function to dump
        stats about vars->htab, changed_variables and value_chains hash
        tables.
        * hashtab.c, hashtab.h: Added "expands" variable to htab_t type
        for tracking total times the hash table was expanded.
        * hashtab.c, hashtab.h (htab_dump_statistics, htab_collisions_num)
        (htab_searches_num, htab_expands_num): New functions for  statistics.
        * hashtab.c: Included assert.h for checks in htab_dump_statistics.
        * symtab.c, symtab.h: Added "expands" variable to hash_table type
        for tracking total times the hash table was expanded.
        * symtab.c (ht_dump_statistics): Beautified stats output.
        * dwarf2out.c (dwarf2out_finish): Call htab_dump_statistics() if
        -fmem-report.
        * cgraph.h, varpool.c (varpool_dump_stats): New function to dump
        stats about varpool_hash hash table.
        * tree-ssa.c (delete_tree_ssa): Keep statistics for hash tables by
        increasing the new referenced_vars_* and default_defs_* global
        variables.
        * tree.h, tree-ssa.c (tree_ssa_dump_stats): New function to print
        statistics about all referenced_vars and default_defs hash tables.
        * tree.c (dump_tree_statistics): Call tree_ssa_dump_stats().
        * toplev.c: Included cselib.h for cselib_dump_stats().
        (dump_memory_report): Call all the above functions to provide
        better statistics.
=== modified file 'gcc/cgraph.c'
--- gcc/cgraph.c        2011-06-06 17:12:25 +0000
+++ gcc/cgraph.c        2011-08-22 08:41:38 +0000
@@ -2897,4 +2897,11 @@ cgraph_used_from_object_file_p (struct c
   return false;
 }
 
+void
+cgraph_dump_stats (void)
+{
+  fprintf (stderr, "\ncgraph.c:cgraph_hash hash table stats:\n");
+  htab_dump_statistics (cgraph_hash, sizeof (struct cgraph_node));
+}
+
 #include "gt-cgraph.h"

=== modified file 'gcc/cgraph.h'
--- gcc/cgraph.h        2011-05-31 14:58:49 +0000
+++ gcc/cgraph.h        2011-08-22 08:41:38 +0000
@@ -540,6 +540,7 @@ bool cgraph_can_remove_if_no_direct_call
 bool resolution_used_from_other_file_p (enum ld_plugin_symbol_resolution 
resolution);
 bool cgraph_used_from_object_file_p (struct cgraph_node *node);
 bool varpool_used_from_object_file_p (struct varpool_node *node);
+void cgraph_dump_stats (void);
 
 /* In cgraphunit.c  */
 extern FILE *cgraph_dump_file;
@@ -659,6 +660,7 @@ struct varpool_node * varpool_extra_name
 const char * varpool_node_name (struct varpool_node *node);
 void varpool_reset_queue (void);
 bool const_value_known_p (tree);
+void varpool_dump_stats (void);
 
 /* Walk all reachable static variables.  */
 #define FOR_EACH_STATIC_VARIABLE(node) \

=== modified file 'gcc/cselib.c'
--- gcc/cselib.c        2011-05-31 19:14:21 +0000
+++ gcc/cselib.c        2011-08-22 08:41:38 +0000
@@ -2518,6 +2518,11 @@ cselib_init (int record_what)
   next_uid = 1;
 }
 
+static unsigned int cselib_htab_num;
+static unsigned long cselib_htab_expands;
+static unsigned long cselib_htab_searches;
+static unsigned long cselib_htab_collisions;
+
 /* Called when the current user is done with cselib.  */
 
 void
@@ -2531,6 +2536,13 @@ cselib_finish (void)
   free_alloc_pool (elt_loc_list_pool);
   free_alloc_pool (cselib_val_pool);
   free_alloc_pool (value_pool);
+  if (mem_report)
+    {
+      cselib_htab_num++;
+      cselib_htab_expands += htab_expands_num (cselib_hash_table);
+      cselib_htab_searches += htab_searches_num (cselib_hash_table);
+      cselib_htab_collisions += htab_collisions_num (cselib_hash_table);
+    }
   cselib_clear_table ();
   htab_delete (cselib_hash_table);
   free (used_regs);
@@ -2630,4 +2642,18 @@ dump_cselib_table (FILE *out)
   fprintf (out, "next uid %i\n", next_uid);
 }
 
+void
+cselib_dump_stats (void)
+{
+  if (cselib_htab_num > 0)
+    {
+      fprintf (stderr, "\ncselib stats for %u hash tables\n", cselib_htab_num);
+      fprintf (stderr, "\ttotal expansions\t%lu\n", cselib_htab_expands);
+      fprintf (stderr, "\ttotal searches\t\t%lu\n", cselib_htab_searches);
+      fprintf (stderr, "\ttotal collisions\t%lu\n", cselib_htab_collisions);
+      fprintf (stderr, "\ttotal coll/search\t%.4f\n",
+              (float) cselib_htab_collisions / cselib_htab_searches);
+    }
+}
+
 #include "gt-cselib.h"

=== modified file 'gcc/cselib.h'
--- gcc/cselib.h        2011-02-03 06:04:04 +0000
+++ gcc/cselib.h        2011-08-22 08:41:38 +0000
@@ -98,3 +98,4 @@ extern void cselib_preserve_only_values 
 extern void cselib_preserve_cfa_base_value (cselib_val *, unsigned int);
 
 extern void dump_cselib_table (FILE *);
+extern void cselib_dump_stats (void);

=== modified file 'gcc/dwarf2out.c'
--- gcc/dwarf2out.c     2011-06-06 17:46:00 +0000
+++ gcc/dwarf2out.c     2011-08-22 08:41:38 +0000
@@ -24820,6 +24820,13 @@ dwarf2out_finish (const char *filename)
        add_comp_dir_attribute (comp_unit_die ());
     }
 
+  if (mem_report)
+    {
+      fprintf(stderr, "\ndwarf2out.c: file_table hash table statistics:\n");
+      htab_dump_statistics(file_table, sizeof (struct dwarf_file_data));
+    }
+
+
   for (i = 0; i < VEC_length (deferred_locations, deferred_locations_list); 
i++)
     {
       add_location_or_const_value_attribute (

=== modified file 'gcc/emit-rtl.c'
--- gcc/emit-rtl.c      2011-05-29 17:40:05 +0000
+++ gcc/emit-rtl.c      2011-08-22 08:41:38 +0000
@@ -5425,6 +5425,13 @@ gen_rtx_CONST_VECTOR (enum machine_mode 
   return gen_rtx_raw_CONST_VECTOR (mode, v);
 }
 
+void
+mem_attrs_dump_stats (void)
+{
+  fprintf (stderr, "\nemit-rtl.c:mem_attrs_htab hash table:\n");
+  htab_dump_statistics (mem_attrs_htab, sizeof (mem_attrs));
+}
+
 /* Initialise global register information required by all functions.  */
 
 void

=== modified file 'gcc/emit-rtl.h'
--- gcc/emit-rtl.h      2011-01-03 20:52:22 +0000
+++ gcc/emit-rtl.h      2011-08-22 08:41:38 +0000
@@ -62,6 +62,7 @@ extern void set_reg_attrs_for_parm (rtx,
 extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
 extern void adjust_reg_mode (rtx, enum machine_mode);
 extern int mem_expr_equal_p (const_tree, const_tree);
+extern void mem_attrs_dump_stats (void);
 
 /* Return the first insn of the current sequence or current function.  */
 

=== modified file 'gcc/rtl.h'
--- gcc/rtl.h   2011-05-03 13:08:36 +0000
+++ gcc/rtl.h   2011-08-22 08:41:38 +0000
@@ -2527,6 +2527,7 @@ extern bool expensive_function_p (int);
 
 /* In var-tracking.c */
 extern unsigned int variable_tracking_main (void);
+extern void vt_dump_stats (void);
 
 /* In stor-layout.c.  */
 extern void get_mode_bounds (enum machine_mode, int, enum machine_mode,

=== modified file 'gcc/toplev.c'
--- gcc/toplev.c        2011-05-25 11:00:14 +0000
+++ gcc/toplev.c        2011-08-22 08:41:38 +0000
@@ -77,6 +77,7 @@ along with GCC; see the file COPYING3.  
 #include "gimple.h"
 #include "tree-ssa-alias.h"
 #include "plugin.h"
+#include "cselib.h"            /* only for cselib_dump_stats() */
 
 #if defined (DWARF2_UNWIND_INFO) || defined (DWARF2_DEBUGGING_INFO)
 #include "dwarf2out.h"
@@ -1824,8 +1825,14 @@ dump_memory_report (bool final)
 {
   ggc_print_statistics ();
   stringpool_statistics ();
-  dump_tree_statistics ();
   dump_gimple_statistics ();
+  dump_tree_statistics ();
+  mem_attrs_dump_stats ();
+  cgraph_dump_stats ();
+  if (flag_var_tracking)
+    vt_dump_stats ();
+  cselib_dump_stats ();
+  varpool_dump_stats ();
   dump_rtx_statistics ();
   dump_alloc_pool_statistics ();
   dump_bitmap_statistics ();

=== modified file 'gcc/tree-ssa.c'
--- gcc/tree-ssa.c      2011-05-18 10:36:45 +0000
+++ gcc/tree-ssa.c      2011-08-22 08:41:38 +0000
@@ -1172,6 +1172,12 @@ uid_ssaname_map_hash (const void *item)
   return ((const_tree)item)->ssa_name.var->decl_minimal.uid;
 }
 
+/* hash table statistics */
+
+static unsigned long referenced_vars_expands, default_defs_expands;
+static unsigned long referenced_vars_searches, default_defs_searches;
+static unsigned long referenced_vars_collisions, default_defs_collisions;
+static unsigned int referenced_vars_num, default_defs_num;
 
 /* Initialize global DFA and SSA structures.  */
 
@@ -1208,6 +1214,25 @@ delete_tree_ssa (void)
          *DECL_VAR_ANN_PTR (var) = NULL;
        }
     }
+
+  if (mem_report)
+    {
+      referenced_vars_num++;
+      referenced_vars_expands +=
+       htab_expands_num (gimple_referenced_vars (cfun));
+      referenced_vars_searches +=
+       htab_searches_num (gimple_referenced_vars (cfun));
+      referenced_vars_collisions +=
+       htab_collisions_num (gimple_referenced_vars (cfun));
+      default_defs_num++;
+      default_defs_expands +=
+       htab_expands_num (cfun->gimple_df->default_defs);
+      default_defs_searches +=
+       htab_searches_num (cfun->gimple_df->default_defs);
+      default_defs_collisions +=
+       htab_collisions_num (cfun->gimple_df->default_defs);
+    }
+
   htab_delete (gimple_referenced_vars (cfun));
   cfun->gimple_df->referenced_vars = NULL;
 
@@ -1231,6 +1256,26 @@ delete_tree_ssa (void)
   redirect_edge_var_map_destroy ();
 }
 
+void
+tree_ssa_dump_stats (void)
+{
+  fprintf (stderr, "\ntree-ssa.c stats\n");
+  
+  fprintf (stderr, "\t%u referenced_vars hash tables:\n", referenced_vars_num);
+  fprintf (stderr, "\t\ttotal expansions\t%lu\n", referenced_vars_expands);
+  fprintf (stderr, "\t\ttotal searches\t\t%lu\n", referenced_vars_searches);
+  fprintf (stderr, "\t\ttotal collisions\t%lu\n", referenced_vars_collisions);
+  fprintf (stderr, "\t\ttotal coll/search\t%.4f\n",
+          (float) referenced_vars_collisions / referenced_vars_searches);
+
+  fprintf (stderr, "\t%u default_defs hash tables\n", default_defs_num);
+  fprintf (stderr, "\t\ttotal expansions\t%lu\n", default_defs_expands);
+  fprintf (stderr, "\t\ttotal searches\t\t%lu\n", default_defs_searches);
+  fprintf (stderr, "\t\ttotal collisions\t%lu\n", default_defs_collisions);
+  fprintf (stderr, "\t\ttotal coll/search\t%.4f\n",
+          (float) default_defs_collisions / default_defs_searches);
+}
+
 /* Return true if the conversion from INNER_TYPE to OUTER_TYPE is a
    useless type conversion, otherwise return false.
 

=== modified file 'gcc/tree.c'
--- gcc/tree.c  2011-06-07 14:37:39 +0000
+++ gcc/tree.c  2011-08-22 08:41:38 +0000
@@ -6225,10 +6225,8 @@ type_hash_marked_p (const void *p)
 static void
 print_type_hash_statistics (void)
 {
-  fprintf (stderr, "Type hash: size %ld, %ld elements, %f collisions\n",
-          (long) htab_size (type_hash_table),
-          (long) htab_elements (type_hash_table),
-          htab_collisions (type_hash_table));
+  fprintf (stderr, "\ntree.c:type_hash_table stats\n");
+  htab_dump_statistics (type_hash_table, sizeof (struct type_hash));
 }
 
 /* Compute a hash code for a list of attributes (chain of TREE_LIST nodes
@@ -8564,10 +8562,11 @@ dump_tree_statistics (void)
 #else
   fprintf (stderr, "(No per-node statistics)\n");
 #endif
-  print_type_hash_statistics ();
   print_debug_expr_statistics ();
   print_value_expr_statistics ();
   lang_hooks.print_statistics ();
+  print_type_hash_statistics ();
+  tree_ssa_dump_stats ();
 }
 
 #define FILE_FUNCTION_FORMAT "_GLOBAL__%s_%s"

=== modified file 'gcc/tree.h'
--- gcc/tree.h  2011-06-07 14:37:39 +0000
+++ gcc/tree.h  2011-08-22 08:41:38 +0000
@@ -5698,6 +5698,7 @@ struct GTY(()) tree_priority_map {
 /* In tree-ssa.c */
 
 tree target_for_debug_bind (tree);
+void tree_ssa_dump_stats (void);
 
 /* In tree-ssa-address.c.  */
 extern tree tree_mem_ref_addr (tree, tree);

=== modified file 'gcc/var-tracking.c'
--- gcc/var-tracking.c  2011-06-03 01:42:31 +0000
+++ gcc/var-tracking.c  2011-08-22 08:41:38 +0000
@@ -1422,6 +1422,11 @@ shared_hash_copy (shared_hash vars)
   return vars;
 }
 
+static unsigned long vars_htab_expands;
+static unsigned long vars_htab_searches;
+static unsigned long vars_htab_collisions;
+static unsigned int vars_htab_num;
+
 /* Decrement reference counter and destroy hash table if not shared
    anymore.  */
 
@@ -1431,6 +1436,13 @@ shared_hash_destroy (shared_hash vars)
   gcc_checking_assert (vars->refcount > 0);
   if (--vars->refcount == 0)
     {
+      if (mem_report)
+       {
+         vars_htab_num++;
+         vars_htab_expands += htab_expands_num (vars->htab);
+         vars_htab_searches += htab_searches_num (vars->htab);
+         vars_htab_collisions += htab_collisions_num (vars->htab);
+       }
       htab_delete (vars->htab);
       pool_free (shared_hash_pool, vars);
     }
@@ -8982,6 +8994,11 @@ vt_debug_insns_local (bool skipped ATTRI
   delete_debug_insns ();
 }
 
+static unsigned long cv_htab_expands, vc_htab_expands;
+static unsigned long cv_htab_searches, vc_htab_searches;
+static unsigned long cv_htab_collisions, vc_htab_collisions;
+static unsigned int cv_htab_num, vc_htab_num;
+
 /* Free the data structures needed for variable tracking.  */
 
 static void
@@ -9006,6 +9023,13 @@ vt_finalize (void)
     }
   free_aux_for_blocks ();
   htab_delete (empty_shared_hash->htab);
+  if (mem_report)
+    {
+      cv_htab_num++;
+      cv_htab_expands += htab_expands_num (changed_variables);
+      cv_htab_searches += htab_searches_num (changed_variables);
+      cv_htab_collisions += htab_collisions_num (changed_variables);
+    }
   htab_delete (changed_variables);
   free_alloc_pool (attrs_pool);
   free_alloc_pool (var_pool);
@@ -9014,6 +9038,13 @@ vt_finalize (void)
 
   if (MAY_HAVE_DEBUG_INSNS)
     {
+      if (mem_report)
+       {
+         vc_htab_num++;
+         vc_htab_expands += htab_expands_num (value_chains);
+         vc_htab_searches += htab_searches_num (value_chains);
+         vc_htab_collisions += htab_collisions_num (value_chains);
+       }
       htab_delete (value_chains);
       free_alloc_pool (value_chain_pool);
       free_alloc_pool (valvar_pool);
@@ -9029,6 +9060,36 @@ vt_finalize (void)
   vui_allocated = 0;
 }
 
+void
+vt_dump_stats (void)
+{
+  fprintf (stderr, "\nvar-tracking.c stats\n");
+  
+  fprintf (stderr, "\t%u vars->htab hash tables:\n", vars_htab_num);
+  fprintf (stderr, "\t\ttotal expansions\t%lu\n", vars_htab_expands);
+  fprintf (stderr, "\t\ttotal searches\t\t%lu\n", vars_htab_searches);
+  fprintf (stderr, "\t\ttotal collisions\t%lu\n", vars_htab_collisions);
+  fprintf (stderr, "\t\ttotal coll/search\t%.4f\n",
+          (float) vars_htab_collisions / vars_htab_searches);
+
+  fprintf (stderr, "\t%u changed_variables hash tables\n", cv_htab_num);
+  fprintf (stderr, "\t\ttotal expansions\t%lu\n", cv_htab_expands);
+  fprintf (stderr, "\t\ttotal searches\t\t%lu\n", cv_htab_searches);
+  fprintf (stderr, "\t\ttotal collisions\t%lu\n", cv_htab_collisions);
+  fprintf (stderr, "\t\ttotal coll/search\t%.4f\n",
+          (float) cv_htab_collisions / cv_htab_searches);
+
+  if (MAY_HAVE_DEBUG_INSNS)
+    {
+      fprintf (stderr, "\t%u value_chains hash tables\n", vc_htab_num);
+      fprintf (stderr, "\t\ttotal expansions\t%lu\n", vc_htab_expands);
+      fprintf (stderr, "\t\ttotal searches\t\t%lu\n", vc_htab_searches);
+      fprintf (stderr, "\t\ttotal collisions\t%lu\n", vc_htab_collisions);
+      fprintf (stderr, "\t\ttotal coll/search\t%.4f\n",
+              (float) vc_htab_collisions / vc_htab_searches);
+    }
+}
+
 /* The entry point to variable tracking pass.  */
 
 static inline unsigned int

=== modified file 'gcc/varpool.c'
--- gcc/varpool.c       2011-06-03 18:23:22 +0000
+++ gcc/varpool.c       2011-08-22 08:41:38 +0000
@@ -724,4 +724,11 @@ varpool_used_from_object_file_p (struct 
   return false;
 }
 
+void
+varpool_dump_stats (void)
+{
+  fprintf (stderr, "\nvarpool.c:varpool_hash hash table stats:\n");
+  htab_dump_statistics (varpool_hash, sizeof (struct varpool_node));
+}
+
 #include "gt-varpool.h"

=== modified file 'include/hashtab.h'
--- include/hashtab.h   2010-06-08 06:25:24 +0000
+++ include/hashtab.h   2011-08-22 08:41:38 +0000
@@ -127,6 +127,9 @@ struct GTY(()) htab {
      of collisions fixed for time of work with the hash table. */
   unsigned int collisions;
 
+  /* Number of times we reallocated the table to change its capacity. */
+  unsigned int expands;
+
   /* Pointers to allocate/free functions.  */
   htab_alloc alloc_f;
   htab_free free_f;
@@ -187,6 +190,10 @@ extern void        htab_traverse_noresize (htab
 extern size_t  htab_size (htab_t);
 extern size_t  htab_elements (htab_t);
 extern double  htab_collisions (htab_t);
+extern void     htab_dump_statistics (htab_t, size_t);
+extern unsigned int    htab_collisions_num (htab_t);
+extern unsigned int    htab_searches_num (htab_t);
+extern unsigned int    htab_expands_num (htab_t);
 
 /* A hash function for pointers.  */
 extern htab_hash htab_hash_pointer;

=== modified file 'libcpp/include/symtab.h'
--- libcpp/include/symtab.h     2011-01-03 20:52:22 +0000
+++ libcpp/include/symtab.h     2011-08-22 08:41:38 +0000
@@ -63,6 +63,7 @@ struct ht
   struct cpp_reader *pfile;
 
   /* Table usage statistics.  */
+  unsigned int expands;
   unsigned int searches;
   unsigned int collisions;
 

=== modified file 'libcpp/symtab.c'
--- libcpp/symtab.c     2011-08-09 02:39:45 +0000
+++ libcpp/symtab.c     2011-08-22 08:41:38 +0000
@@ -219,6 +219,7 @@ ht_expand (hash_table *table)
   table->entries_owned = true;
   table->entries = nentries;
   table->nslots = size;
+  table->expands++;
 }
 
 /* For all nodes in TABLE, callback CB with parameters TABLE->PFILE,
@@ -276,7 +277,7 @@ ht_load (hash_table *ht, hashnode *entri
 void
 ht_dump_statistics (hash_table *table)
 {
-  size_t nelts, nids, overhead, headers;
+  size_t nelts, nids, obmem, headers;
   size_t total_bytes, longest, deleted = 0;
   double sum_of_squares, exp_len, exp_len2, exp2_len;
   hashnode *p, *limit;
@@ -307,34 +308,41 @@ ht_dump_statistics (hash_table *table)
   while (++p < limit);
 
   nelts = table->nelements;
-  overhead = obstack_memory_used (&table->stack) - total_bytes;
+  obmem = obstack_memory_used (&table->stack);
   headers = table->nslots * sizeof (hashnode);
 
-  fprintf (stderr, "\nString pool\nentries\t\t%lu\n",
-          (unsigned long) nelts);
-  fprintf (stderr, "identifiers\t%lu (%.2f%%)\n",
+  fprintf (stderr, "\nlibcpp symtab string pool:\n");
+  fprintf (stderr, "\tidentifiers\t%lu (%.2f%%)\n",
           (unsigned long) nids, nids * 100.0 / nelts);
-  fprintf (stderr, "slots\t\t%lu\n",
-          (unsigned long) table->nslots);
-  fprintf (stderr, "deleted\t\t%lu\n",
+  fprintf (stderr, "\tentries\t\t%lu (%.2f%%)\n",
+          (unsigned long) nelts, nelts * 100.0 / table->nslots);
+  fprintf (stderr, "\tdeleted\t\t%lu\n",
           (unsigned long) deleted);
-  fprintf (stderr, "bytes\t\t%lu%c (%lu%c overhead)\n",
+  fprintf (stderr, "\tslots\t\t%u\n",
+          table->nslots);
+  fprintf (stderr, "\tstring bytes\t%lu%c (%lu%c obstack_memory_used)\n",
           SCALE (total_bytes), LABEL (total_bytes),
-          SCALE (overhead), LABEL (overhead));
-  fprintf (stderr, "table size\t%lu%c\n",
+          SCALE (obmem), LABEL (obmem));
+  fprintf (stderr, "\ttable size\t%lu%c\n",
           SCALE (headers), LABEL (headers));
 
   exp_len = (double)total_bytes / (double)nelts;
   exp2_len = exp_len * exp_len;
   exp_len2 = (double) sum_of_squares / (double) nelts;
 
-  fprintf (stderr, "coll/search\t%.4f\n",
+  fprintf (stderr, "\texpansions\t%u\n",
+          table->expands);
+  fprintf (stderr, "\tsearches\t%u\n",
+          table->searches);
+  fprintf (stderr, "\tcollisions\t%u\n",
+          table->collisions);
+  fprintf (stderr, "\tcoll/search\t%.4f\n",
           (double) table->collisions / (double) table->searches);
-  fprintf (stderr, "ins/search\t%.4f\n",
+  fprintf (stderr, "\tins/search\t%.4f\n",
           (double) nelts / (double) table->searches);
-  fprintf (stderr, "avg. entry\t%.2f bytes (+/- %.2f)\n",
+  fprintf (stderr, "\tavg. entry\t%.2f bytes (+/- %.2f)\n",
           exp_len, approx_sqrt (exp_len2 - exp2_len));
-  fprintf (stderr, "longest entry\t%lu\n",
+  fprintf (stderr, "\tlongest entry\t%lu\n",
           (unsigned long) longest);
 #undef SCALE
 #undef LABEL

=== modified file 'libiberty/hashtab.c'
--- libiberty/hashtab.c 2011-08-09 02:39:45 +0000
+++ libiberty/hashtab.c 2011-08-22 08:41:38 +0000
@@ -58,6 +58,7 @@ Boston, MA 02110-1301, USA.  */
 #endif
 
 #include <stdio.h>
+#include <assert.h>
 
 #include "libiberty.h"
 #include "ansidecl.h"
@@ -564,6 +565,7 @@ htab_expand (htab_t htab)
   htab->size_prime_index = nindex;
   htab->n_elements -= htab->n_deleted;
   htab->n_deleted = 0;
+  htab->expands++;
 
   p = oentries;
   do
@@ -813,6 +815,99 @@ htab_collisions (htab_t htab)
   return (double) htab->collisions / (double) htab->searches;
 }
 
+/* Return the number of expands */
+
+unsigned int
+htab_expands_num (htab_t htab)
+{
+  return htab->expands;
+}
+
+/* Return the number of collisions */
+
+unsigned int
+htab_collisions_num (htab_t htab)
+{
+  return htab->collisions;
+}
+
+/* Return the number of searches */
+
+unsigned int
+htab_searches_num (htab_t htab)
+{
+  return htab->searches;
+}
+
+/* Dump allocation statistics to stderr. If elem_size > 0 display total memory
+ * usage of hash table too. */
+
+void
+htab_dump_statistics (htab_t table, size_t elem_size)
+{
+  size_t n_valid, headers, contents, empties, deleted;
+  void **p, **limit;
+
+#define SCALE(x) ((unsigned long) ((x) < 1024*10 \
+                 ? (x) \
+                 : ((x) < 1024*1024*10 \
+                    ? (x) / 1024 \
+                    : (x) / (1024*1024))))
+#define LABEL(x) ((x) < 1024*10 ? ' ' : ((x) < 1024*1024*10 ? 'k' : 'M'))
+
+  deleted = n_valid = empties = 0;
+  p = table->entries;
+  limit = p + table->size;
+  do
+    if (*p == HTAB_DELETED_ENTRY)
+      ++deleted;
+    else if (*p == HTAB_EMPTY_ENTRY)
+      ++empties;
+    else
+      ++n_valid;
+  while (++p < limit);
+
+  assert (deleted == table->n_deleted);
+  assert (empties == table->size - table->n_elements);
+  assert (n_valid + deleted == table->n_elements);
+
+  headers = table->size * sizeof (*(table->entries));
+  contents = n_valid * elem_size;
+
+  fprintf (stderr, "\tslots\t\t%lu\n",
+          (unsigned long) table->size);
+  fprintf (stderr, "\tidentifiers\t%lu\n",
+          (unsigned long) n_valid);
+  fprintf (stderr, "\tentries\t\t%lu (%.2f%%)\n",
+          (unsigned long) table->n_elements,
+          table->n_elements * 100.0 / table->size);
+  fprintf (stderr, "\tdeleted\t\t%lu\n",
+          (unsigned long) table->n_deleted);
+  fprintf (stderr, "\tstruct htab\t%lu  B\n",
+          (unsigned long) sizeof (*table));
+  fprintf (stderr, "\ttable size\t%lu %cB\n",
+          SCALE (headers), LABEL (headers));
+  if (elem_size > 0)
+    {
+      fprintf (stderr, "\telement\t\t%lu  B\n",
+              (unsigned long) elem_size);
+      fprintf (stderr, "\ttotal contents\t%lu %cB\n",
+              SCALE (contents), LABEL (contents));
+    }
+  fprintf (stderr, "\texpansions\t%u\n",
+          table->expands);
+  fprintf (stderr, "\tsearches\t%u\n",
+          table->searches);
+  fprintf (stderr, "\tcollisions\t%u\n",
+          table->collisions);
+  fprintf (stderr, "\tcoll/search\t%.4f\n",
+          (double) table->collisions / (double) table->searches);
+
+#undef SCALE
+#undef LABEL
+}
+
+
 /* Hash P as a null-terminated string.
 
    Copied from gcc/hashtable.c.  Zack had the following to say with respect

Reply via email to