I have not had a chance to check out the tmp branch for this (I'm
currently in an airport without network access), but it all sounds
good in principle to me. Forgive me if I've said these things before,
but here's what I'd like to see if possible:
- configure output shows whether this stuff is enabled
- e.g., does it check for the relevant macros in valgrind's header files? (I assume so; I've totally forgotten...) Ensure that these checks are output in configure's stdout
- ompi_info shows whether this stuff is enabled
- obvious user-level configure errors raise errors/abort configure (e.g., --enable-memchecker is specified but --enable-debug is not), or make some obvious assumptions about "what the user meant" (e.g., if --enable-memchecker is specified but --enable-debug is not, then automatically enable --enable-debug and output a message saying so)
- I think we've said ad nauseam that there should be zero performance penalty when this stuff is not enabled, and I'm guessing that this is still true. :-)
- some kind of documentation should be written up about how to use this stuff, perhaps in the FAQ (e.g., pairing it with a valgrind-enabled libibverbs for max benefit, etc.)
On Jan 8, 2008, at 3:11 PM, Rainer Keller wrote:
Hello dear all,
WHAT:
We would like to integrate the changes on the memchecker branch into the trunk, as planned in the
WHY:
The branch offers memory checking for certain user and OMPI-internal errors, such as buffer overruns, size mismatches, and wrong send/receive buffers.
WHERE: OMPI trunk and v1.3 phase3
WHEN:
Integration into trunk of the memchecker branch: 25.1. (although off-by-default, this leaves enough time before Feature Freeze on 8.2.)
TIMEOUT: None
===============================================================
The memchecker branch contains checks for memory buffer faults, either in the user code or in OMPI code itself.
It uses the valgrind API to set/reset the validity of the user buffers passed to the MPI layer. Additionally, OMPI-internal datatypes are checked.
Both are configurable using the flags:
--enable-memchecker
--with-valgrind=DIR (if needed)
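Putting the two flags together (and keeping in mind the PS below that --enable-debug is needed as well), a typical invocation might look like this; the install prefix is just an example:

```shell
./configure --enable-debug --enable-memchecker \
            --with-valgrind=/opt/valgrind
```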
A decent/recent valgrind is needed (for getting and setting VBITS using the newer macros). The valgrind version is checked for; at least version 3.2.0 is required.
The actual checking is done in the MPI layer; in order not to trap any (correct) access in the BTL, the user buffer is reset to accessible in the PML layer (currently OB1 -- others won't make much sense?).
The default behaviour is to *NOT* enable memchecker.
If it is enabled but valgrind is not being run, the cost of the buffer checks is minimal; the cost of the checks on each OMPI data structure (like a datatype or communicator passed in) is not.
Further information regarding penalties and performance may be found in:
http://www.open-mpi.org/papers/parco-2007
Comments from the Paris meeting have been integrated.
Are there any objections or hints?
With best regards,
Shiqing and Rainer
PS: If --enable-memchecker is on, --enable-debug should be on as well to make sense.
--
----------------------------------------------------------------
Dipl.-Inf. Rainer Keller http://www.hlrs.de/people/keller
HLRS Tel: ++49 (0)711-685 6 5858
Nobelstrasse 19 Fax: ++49 (0)711-685 6 5832
70550 Stuttgart email: kel...@hlrs.de
Germany AIM/Skype:rusraink
"Emails save time, not printing them saves trees!"
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
--
Jeff Squyres
Cisco Systems