Author: njn
Date: 2007-10-30 21:37:49 +0000 (Tue, 30 Oct 2007)
New Revision: 7058

Log:
tweak comments, mostly to clean up the list of things that are critical for
a release.

Modified:
   branches/MASSIF2/massif/docs/ms-manual.xml
   branches/MASSIF2/massif/ms_main.c


Modified: branches/MASSIF2/massif/docs/ms-manual.xml
===================================================================
--- branches/MASSIF2/massif/docs/ms-manual.xml  2007-10-30 19:31:43 UTC (rev 7057)
+++ branches/MASSIF2/massif/docs/ms-manual.xml  2007-10-30 21:37:49 UTC (rev 7058)
@@ -6,6 +6,25 @@
 <chapter id="ms-manual" xreflabel="Massif: a heap profiler">
   <title>Massif: a heap profiler</title>
 
+</pre>
+ Docs:
+ - Mention that complex functions names are best protected with single
+   quotes, eg:
+       --alloc-fn='operator new(unsigned, std::nothrow_t const&)'
+   [XXX: that doesn't work if the option is in a .valgrindrc file or in
+    $VALGRIND_OPTS.  In m_commandline.c:add_args_from_string() need to
+    respect single quotes...]
+ - Explain the --threshold=0 case -- entries with zero bytes must have
+   allocated some memory and then freed it all again.
+ - Explain that no peak will be taken if no deallocations are done.
+ - Explain how the stack is computed -- size is assumed to be zero when
+   code starts executing, which isn't true, but reflects what you have
+   control over in a normal program.
+ - file format -- not specified, because it may change in the future to 
+   become more generic
+</pre>
+
+
 <para>To use this tool, you must specify
 <computeroutput>--tool=massif</computeroutput> on the Valgrind
 command line.</para>
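
The Docs note in this hunk says single-quoted option values are not honoured when options come from a .valgrindrc file or $VALGRIND_OPTS, and that add_args_from_string() in m_commandline.c would need to respect single quotes. A minimal sketch of quote-aware splitting follows. It is not Valgrind's code: split_args_quoted() and its signature are hypothetical, and it assumes the quotes should be dropped from the resulting argument, as a shell would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Valgrind's add_args_from_string(): splits
   'line' in place into whitespace-separated arguments.  Text inside
   single quotes stays part of one argument and the quote characters
   themselves are dropped.  Returns the number of arguments stored in
   argv (at most max_args). */
static size_t split_args_quoted(char *line, char **argv, size_t max_args)
{
    size_t n = 0;
    char *src = line;

    assert(line != NULL && argv != NULL);

    while (*src != '\0' && n < max_args) {
        /* Skip whitespace between arguments. */
        while (*src == ' ' || *src == '\t' || *src == '\n')
            src++;
        if (*src == '\0')
            break;

        /* Copy the argument down over itself, removing quote characters.
           dst never overtakes src, so the copy is safe in place. */
        char *dst = src;
        int in_quotes = 0;
        argv[n++] = dst;
        while (*src != '\0') {
            if (*src == '\'') {
                in_quotes = !in_quotes;
            } else if (!in_quotes &&
                       (*src == ' ' || *src == '\t' || *src == '\n')) {
                break;                      /* unquoted separator ends it */
            } else {
                *dst++ = *src;
            }
            src++;
        }
        if (*src != '\0')
            src++;                          /* step past the separator */
        *dst = '\0';                        /* terminate this argument */
    }
    return n;
}
```

With this sketch, a line such as --alloc-fn='operator new(unsigned, std::nothrow_t const&)' yields one argument containing the spaces, rather than being split at each space. An unterminated quote simply runs to the end of the string; a real implementation would probably want to report that.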

Modified: branches/MASSIF2/massif/ms_main.c
===================================================================
--- branches/MASSIF2/massif/ms_main.c   2007-10-30 19:31:43 UTC (rev 7057)
+++ branches/MASSIF2/massif/ms_main.c   2007-10-30 21:37:49 UTC (rev 7058)
@@ -30,108 +30,70 @@
 //---------------------------------------------------------------------------
 // XXX:
 //---------------------------------------------------------------------------
-// Performance:
-//
-//   perl perf/vg_perf --tools=massif --reps=3 perf/{heap,tinycc} massif
-//   time valgrind --tool=massif --depth=100 konqueror
-//
-// The other benchmarks don't do much allocation, and so give similar speeds
-// to Nulgrind.
-//
-// Initial slowdown (r6976 + r6979):
-//   heap      0.24s  ma:26.7s (111.4x, -----)
-//   tinycc    0.44s  ma:10.7s (24.4x, -----)
-//   many-xpts 0.11s  ma:32.8s (298.0x, -----)
-//
-// Changed alloc_fns from an OSet to an XArray (r6981):
-//   heap      0.24s  ma:19.4s (80.6x, -----)
-//   tinycc    0.49s  ma: 7.8s (16.0x, -----)
-//   many-xpts 0.12s  ma:26.8s (223.4x, -----)
-//
-// Changed get_IPs so that all alloc_fns in a chain must be mentioned, not
-// just the bottom one, greatly reducing the number of calls to is_alloc_fn
-// (r6983):
-//   heap      0.24s  ma:18.8s (78.5x, -----)
-//   tinycc    0.45s  ma: 7.4s (16.4x, -----)
-//   many-xpts 0.05s  ma:23.5s (470.6x, -----)
-//
-// Don't dup children of insignificant XPts in dup_XTree.  Made many-xpts
-// more than 10x faster (r6984):
-//   heap      0.59s  ma:20.3s (34.5x, -----)
-//   tinycc    0.49s  ma: 7.6s (15.4x, -----)
-//   many-xpts 0.04s  ma: 1.9s (46.2x, -----)
-//
-// Make many-xpts run for longer (r7001):
-//   heap      0.59s  ma:20.3s (34.5x, -----)
-//   tinycc    0.49s  ma: 7.6s (15.4x, -----)
-//   many-xpts 0.13s  ma: 2.8s (21.6x, -----)
-//   konqueror 4:37 real  4:14 user
-//
-// Minimised the number of dup'd XPts by introducing SXPts (r7004):
-//   heap      0.56s  ma:20.8s (37.2x, -----)
-//   tinycc    0.45s  ma: 7.1s (15.7x, -----)
-//   many-xpts 0.05s  ma: 1.6s (33.0x, -----)
-//   konqueror 3:45 real  3:35 user
-//
-// Moved main-or-below-main filtering to the end, avoiding *many* calls to
-// VG_(get_fnname) (r7010):
-//   heap      0.62s  ma:12.4s (20.0x, -----)
-//   tinycc    0.45s  ma: 5.1s (11.4x, -----)
-//   many-xpts 0.09s  ma: 2.1s (23.8x, -----)
-//   konqueror 0:46 real  0:36 user
-//
-// Instead of sorting XPt children at duplication time, sort them at print
-// time (ie. many fewer sorts required) (r7011):
-//   heap      0.36s  ma:12.3s (34.1x, -----)
-//   tinycc    0.46s  ma: 4.8s (10.3x, -----)
-//   many-xpts 0.09s  ma: 2.0s (22.3x, -----)
-//   konqueror 0:42 real  0:34 user
-//
-// Don't do the significance test (which involves a division) for every
-// child, instead compute a threshold (which involves a division) which can
-// be reused for every child of an XPt (r7012):
-//   heap      0.60s  ma:12.4s (20.6x, -----)
-//   tinycc    0.48s  ma: 4.8s (10.1x, -----)
-//   many-xpts 0.10s  ma: 2.2s (22.1x, -----)
-//   konqueror 37.7s real  0:29.5s user
-//
-// By default, only snapshot a peak if it's 1% larger than the previous peak,
-// rather than snapshotting every peak.  Greatly reduces the number of peak
-// snapshots taken for larger programs like konqueror (r7013):
-//   heap      0.53s  ma:12.4s (23.5x, -----)
-//   tinycc    0.46s  ma: 4.9s (10.7x, -----)
-//   many-xpts 0.08s  ma: 2.0s (25.0x, -----)
-//   konqueror 29.6s real  0:21.0s user
-//
-// Performance todos:
-// - get_XCon accounts for about 9% of konqueror startup time.  Try keeping
-//   XPt children sorted by 'ip' and use binary search in get_XCon.
-//
 // Todo -- critical for release:
-// - decide on a name!
+// - decide on a name!  (This is the only thing blocking a merge with the
+//   trunk)
 // - do a graph-drawing test
 // - write a good basic test that shows how the tool works, suitable for
 //   documentation
 // - write documentation
+// - make --threshold and --peak-inaccuracy fractional
+// - do filename properly, clean up Valgrind-wide log file naming mess
+// - currently recording asked-for sizes of heap blocks, not actual sizes.
+//   Should add the difference to heap-admin, and change heap-admin name to
+//   something else (heap-extra?).
+//
+// Todo -- nice, but less critical:
 // - make file format more generic.  Obstacles:
 //   - unit prefixes are not generic
 //   - preset column widths for stats are not generic
 //   - preset column headers are not generic
 //   - "Massif arguments:" line is not generic
-//
-// Todo -- nice, but less critical:
 // - do snapshots on client requests
+//   - (Michael Meeks): have an interactive way to request a dump
+//     (callgrind_control-style)
+//     - "profile now"
+//     - "show me the extra allocations since the last snapshot"
+//     - "start/stop logging" (eg. quickly skip boring bits)
 // - Add ability to draw multiple graphs, eg. heap-only, stack-only, total.
 //   Give each graph a title.  (try to do it generically!)
 // - allow truncation of long fnnames if the exact line number is
 //   identified?  [hmm, could make getting the name of alloc-fns more
 //   difficult] [could dump full names to file, truncate in ms_print]
+// - make --show-below-main=no work
 //
+// Performance:
+// - To run the benchmarks:
+//
+//     perl perf/vg_perf --tools=massif --reps=3 perf/{heap,tinycc} massif
+//     time valgrind --tool=massif --depth=100 konqueror
+//
+//   The other benchmarks don't do much allocation, and so give similar speeds
+//   to Nulgrind.
+//
+//   Timing results on 'nevermore' (njn's machine) as of r7013:
+//
+//     heap      0.53s  ma:12.4s (23.5x, -----)
+//     tinycc    0.46s  ma: 4.9s (10.7x, -----)
+//     many-xpts 0.08s  ma: 2.0s (25.0x, -----)
+//     konqueror 29.6s real  0:21.0s user
+//
+// - get_XCon accounts for about 9% of konqueror startup time.  Try
+//   keeping XPt children sorted by 'ip' and use binary search in get_XCon.
+//   Requires factoring out binary search code from various places into a
+//   VG_(bsearch) function.  
+//
 // Todo -- low priority:
 // - Consider 'instructions executed' as a time unit -- more regular than
 //   ms, less artificial than B (bug #121629).
 // - In each XPt, record both bytes and the number of allocations, and
 //   possibly the global number of allocations.
+// - (Artur Wisz) add a feature to Massif to ignore any heap blocks larger
+//   than a certain size!  Because: "linux's malloc allows to set a
+//   MMAP_THRESHOLD value, so we set it to 4096 - all blocks above that will
+//   be handled directly by the kernel, and are guaranteed to be returned to
+//   the system when freed. So we needed to profile only blocks below this
+//   limit."
 //
 // Examine and fix bugs on bugzilla:
 // IGNORE:
@@ -174,44 +136,6 @@
 // 146456   (update_XCon): Assertion 'xpt->curr_space >= -space_delta' failed.
 //   - better sanity-checking should help this greatly
 //
-// Michael Meeks:
-// - wants an interactive way to request a dump (callgrind_control-style)
-//   - "profile now"
-//   - "show me the extra allocations since the last snapshot"
-//   - "start/stop logging" (eg. quickly skip boring bits)
-//
-// Artur Wisz:
-// - added a feature to Massif to ignore any heap blocks larger than a
-//   certain size!  Because:
-//     "linux's malloc allows to set a MMAP_THRESHOLD value, so we
-//      set it to 4096 - all blocks above that will be handled directly by
-//      the kernel, and are guaranteed to be returned to the system when
-//      freed. So we needed to profile only blocks below this limit."
-//
-// Other:
-//   - am I recording asked-for sizes or actual rounded-up sizes?
-//     [asked-for.  Should probably be actual.  But that might be
-//     confusing...]
-//   - could conceivably remove XPts that have their szB reduced to zero.
-//   - allow the output file name to be changed
-//
-// Docs:
-// - Mention that complex functions names are best protected with single
-//   quotes, eg:
-//       --alloc-fn='operator new(unsigned, std::nothrow_t const&)'
-//   [XXX: that doesn't work if the option is in a .valgrindrc file or in
-//    $VALGRIND_OPTS.  In m_commandline.c:add_args_from_string() need to
-//    respect single quotes...]
-// - Explain the --threshold=0 case -- entries with zero bytes must have
-//   allocated some memory and then freed it all again.
-// - Explain that no peak will be taken if no deallocations are done.
-// - Explain how the stack is computed -- size is assumed to be zero when
-//   code starts executing, which isn't true, but reflects what you have
-//   control over in a normal program.
-//
-// Tests:
-// - tests/overloaded_new.cpp is there
-//
 // File format working notes:
 
 #if 0
@@ -245,10 +169,16 @@
 
 n_events: n  time(ms)  total(B)    useful-heap(B)  admin-heap(B)  stacks(B)
 t_events: B
-n  0 0 0 0 0 
-td 0 0 0 0 0
-n1:
+n 0 0 0 0 0 
+n 0 0 0 0 0
+t1: 5 <string...>
+ t1: 6 <string...>
 
+Ideas:
+- each snapshot specifies an x-axis value and one or more y-axis values.
+- can display the y-axis values separately if you like
+- can completely separate connection between snapshots and trees.
+
 Challenges:
 - how to specify and scale/abbreviate units on axes?
 - how to combine multiple values into the y-axis?
@@ -469,7 +399,7 @@
 static Bool clo_heap            = True;
 static UInt clo_heap_admin      = 8;
 static Bool clo_stacks          = False;
-static UInt clo_depth           = 30;      // XXX: too low?
+static UInt clo_depth           = 30;
 static UInt clo_threshold       = 100;     // 100 == 1%
 static UInt clo_peak_inaccuracy = 100;     // 100 == 1%
 static UInt clo_time_unit       = TimeMS;
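
The performance notes removed in this commit (r7012) describe computing a significance threshold once per XPt instead of dividing for every child, and clo_threshold above uses units where 100 == 1%. A small sketch of that idea follows. The function names, the use of plain C types instead of Valgrind's, and the choice of ">=" for the comparison are all assumptions made for illustration, not taken from ms_main.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts a threshold given in hundredths of a
   percent (100 == 1%, as for clo_threshold) into a byte count.  This is
   the only division; the result is reused for every child.  Very large
   totals could overflow the multiplication; this sketch ignores that. */
static unsigned long long significance_cutoff(unsigned long long total_szB,
                                              unsigned threshold)
{
    return total_szB * threshold / 10000ULL;
}

/* Counts children whose size reaches the cutoff.  Whether the real test
   is ">=" or ">" is an assumption of this sketch. */
static size_t count_significant(const unsigned long long *child_szB,
                                size_t n_children,
                                unsigned long long total_szB,
                                unsigned threshold)
{
    unsigned long long cutoff = significance_cutoff(total_szB, threshold);
    size_t i, n = 0;

    assert(child_szB != NULL || n_children == 0);

    for (i = 0; i < n_children; i++) {
        if (child_szB[i] >= cutoff)     /* comparison only, no division */
            n++;
    }
    return n;
}
```

The same fixed-point convention would apply to clo_peak_inaccuracy, which is why the new todo item about making --threshold and --peak-inaccuracy fractional would change how these values are parsed and scaled.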

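The new Performance todo in ms_main.c suggests keeping XPt children sorted by 'ip' and using binary search in get_XCon. A rough sketch of that lookup follows. XPtSketch is a hypothetical, cut-down stand-in for the real XPt structure, Addr is approximated with unsigned long, and find_child_by_ip() is an invented name; the proposed VG_(bsearch) does not exist as of this commit.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long Addr;   /* stand-in for Valgrind's Addr type */

/* Hypothetical, cut-down XPt: only the fields this sketch needs. */
typedef struct XPtSketch {
    Addr ip;                        /* instruction pointer for this node */
    size_t n_children;
    struct XPtSketch **children;    /* assumed sorted by ascending ip */
} XPtSketch;

/* Returns the child whose ip matches, or NULL if there is none. */
static XPtSketch *find_child_by_ip(const XPtSketch *parent, Addr ip)
{
    size_t lo = 0, hi;

    assert(parent != NULL);
    hi = parent->n_children;        /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        Addr mid_ip = parent->children[mid]->ip;
        if (mid_ip == ip)
            return parent->children[mid];
        if (mid_ip < ip)
            lo = mid + 1;           /* target is in the upper half */
        else
            hi = mid;               /* target is in the lower half */
    }
    return NULL;
}
```

The lookup is O(log n) per level, but insertion then has to keep the children array sorted, so whether this beats a linear scan depends on how many children a typical XPt has.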
