I have not had a chance to check out the tmp branch for this (I'm currently in an airport without network access), but it all sounds good in principle to me. Forgive me if I've said these things before, but here's what I'd like to see if possible:

- configure output shows whether this stuff is enabled
- e.g., does it check for the relevant macros in valgrind's header files? (I assume so; I've totally forgotten...) Ensure that these checks are output in configure's stdout

- ompi_info shows whether this stuff is enabled

- obvious user-level configure errors raise errors/abort configure (e.g., --enable-memchecker is specified but --enable-debug is not), or make some obvious assumptions about "what the user meant" (e.g., if --enable-memchecker is specified but --enable-debug is not, then automatically enable --enable-debug and output a message saying so).

- I think we've said ad nauseam that there should be zero performance penalty for when this stuff is not enabled, and I'm guessing that this is still true. :-)

- some kind of documentation should be written up about how to use this stuff, perhaps in the FAQ (e.g., pairing it with a valgrind-enabled libibverbs for max benefit, etc.).
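
To make the "zero performance penalty when not enabled" point concrete, here is a minimal sketch of the usual compile-time pattern. The names HYPOTHETICAL_ENABLE_MEMCHECKER, MEMCHK_CHECK_DEFINED and sum_bytes are made up for illustration and are not taken from the memchecker branch; only VALGRIND_CHECK_MEM_IS_DEFINED is a real macro from valgrind's memcheck.h.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL_ENABLE_MEMCHECKER is an assumed symbol; a real build would
 * use whatever configure defines when --enable-memchecker is given. */
#if defined(HYPOTHETICAL_ENABLE_MEMCHECKER)
#include <valgrind/memcheck.h>
/* Enabled: ask valgrind whether every byte in [addr, addr+len) is defined. */
#define MEMCHK_CHECK_DEFINED(addr, len) \
    ((void) VALGRIND_CHECK_MEM_IS_DEFINED((addr), (len)))
#else
/* Disabled: the macro expands to an empty expression, so the compiler
 * emits no code for it at all. */
#define MEMCHK_CHECK_DEFINED(addr, len) ((void) 0)
#endif

/* Hypothetical stand-in for a library entry point that reads a user buffer. */
int sum_bytes(const unsigned char *buf, size_t len)
{
    int total = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    MEMCHK_CHECK_DEFINED(buf, len);   /* vanishes when memchecker is off */
    for (i = 0; i < len; ++i) {
        total += buf[i];
    }
    return total;
}
```

With the symbol undefined, the check line compiles to nothing, so the hot path is identical to a build that never heard of memchecker.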



If --enable-memchecker is on, --enable-debug should be on as well to make
sense


On Jan 8, 2008, at 3:11 PM, Rainer Keller wrote:

Hello dear all,

WHAT:
We would like to integrate the changes on the memchecker-branch to trunk, as
planned in the

WHY:
The branch offers memory checking for certain User and OMPI-internal errors, like buffer overruns, size mismatches, and wrong send/receive buffers.

WHERE: OMPI trunk and v1.3 phase3

WHEN:
Integration into Trunk of memchecker branch: 25.1. (although off-by-default,
this leaves enough time before Feature Freeze on 8.2.)

TIMEOUT: None
===============================================================

The memchecker branch contains checks for memory buffer faults either in the
User-Code or in ompi-code itself.
It uses the valgrind-API to set/reset buffer validity of the user buffers passed to the MPI-layer. Additionally, ompi-internal datatypes are checked.
Both are configurable using the flags:
  --enable-memchecker
  --with-valgrind=DIR (if needed)

A decent/recent valgrind is needed (for getting and setting VBITS / using the newer macros). The valgrind version is checked; at least version
3.2.0 is required.

The actual checking is done in the MPI-layer. In order not to trap any
(correct) access in the BTL, the user buffer is reset to accessible in the
PML-layer (currently OB1 -- others won't make much sense?).
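
As an illustration of the MPI-layer check followed by the PML-layer reset, the following is a minimal sketch of one plausible sequence. It is not the code on the branch: the function names mpi_layer_send_enter and pml_layer_enter are hypothetical, and marking the buffer inaccessible in between is an assumption. The VALGRIND_* macros are the real ones from valgrind's memcheck.h; when that header is not installed, the sketch substitutes no-op stand-ins so it still compiles.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define SKETCH_HAVE_MEMCHECK_H 1
#  endif
#endif

#ifndef SKETCH_HAVE_MEMCHECK_H
/* Stand-ins used only when valgrind's header is unavailable: they do
 * nothing and report "no error", like the real macros outside valgrind. */
#  define VALGRIND_CHECK_MEM_IS_DEFINED(a, n) ((void) (a), (void) (n), 0)
#  define VALGRIND_MAKE_MEM_NOACCESS(a, n)    ((void) (a), (void) (n), 0)
#  define VALGRIND_MAKE_MEM_DEFINED(a, n)     ((void) (a), (void) (n), 0)
#endif

/* Hypothetical MPI-layer hook: report whether the user's send buffer has
 * undefined bytes, then (assumption) mark it inaccessible so that stray
 * accesses while the operation is in flight are flagged.
 * Returns nonzero if valgrind found a problem. */
int mpi_layer_send_enter(const void *buf, size_t len)
{
    int bad;

    assert(buf != NULL);
    bad = (VALGRIND_CHECK_MEM_IS_DEFINED(buf, len) != 0);
    (void) VALGRIND_MAKE_MEM_NOACCESS(buf, len);
    return bad;
}

/* Hypothetical PML-layer hook: make the buffer accessible again so that
 * legitimate reads further down (e.g. in the BTL) are not trapped. */
void pml_layer_enter(const void *buf, size_t len)
{
    assert(buf != NULL);
    (void) VALGRIND_MAKE_MEM_DEFINED(buf, len);
}
```

Outside valgrind both hooks cost next to nothing and change no data; under valgrind the first reports undefined bytes in the user buffer and the second lifts the access restriction before lower layers touch it.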

The default behaviour is to *NOT* enable memchecker.
If it is enabled but the application is not run under valgrind, the cost of the buffer checks is minimal; the cost of checking each ompi data structure (such as a datatype or
communicator passed in) is not.
Further information regarding penalties and performance may be found in:
http://www.open-mpi.org/papers/parco-2007

Comments from the Paris meeting have been integrated.
Are there any objections or hints?

With best regards,
Shiqing and Rainer

PS: If --enable-memchecker is on, --enable-debug should be on as well to make
sense.
--
----------------------------------------------------------------
Dipl.-Inf. Rainer Keller   http://www.hlrs.de/people/keller
 HLRS                          Tel: ++49 (0)711-685 6 5858
 Nobelstrasse 19                  Fax: ++49 (0)711-685 6 5832
 70550 Stuttgart                    email: kel...@hlrs.de
 Germany                             AIM/Skype:rusraink

"Emails save time, not printing them saves trees!"


--
Jeff Squyres
Cisco Systems
