If MPE and Vampir represent the class of tools you're interested in, there is a performance-tool FAQ at http://www.open-mpi.org/faq/?category=perftools listing some other tools in this class.

Note that these are really postmortem tools: you typically run the code first and then look at the results afterwards. In certain cases you can start looking at results while the job is still running, but these tools are mostly built for postmortem analysis.

That may still work for you. For example, Sun Studio Analyzer (which happens to be the only one of these tools I know well) lets you look at in-flight messages or bytes, either in general or for a specific connection.

But I'm guessing these are indirect ways of looking at what you really want to know. It sounds like you want to be able to watch some % utilization of a hardware interface as the program is running. I *think* these tools (the ones on the FAQ, including MPE, Vampir, and Sun Studio) are not of that class.

But maybe the indirect, postmortem methods suffice.  You decide.
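If what you really want is a live view of the link utilization, it may be enough to sample the kernel's byte counters on the compute nodes yourself. Here is a rough sketch that polls /proc/net/dev once a second and prints throughput as a percentage of an assumed link speed. The interface name ("eth0") and the 1 Gbit/s capacity are assumptions; point it at your bond device and its aggregate speed instead.

    #!/usr/bin/env python
    # Rough sketch: sample /proc/net/dev once a second and report throughput
    # as a percentage of an assumed link speed. The interface name and the
    # 1 Gbit/s figure are assumptions -- for bonded links, use the bond
    # device (e.g. bond0) and its aggregate capacity.
    import time

    IFACE = "eth0"            # assumption: change to bond0 / eth1 as needed
    LINK_BITS_PER_SEC = 1e9   # assumption: 1 GbE; use 2e9 for two bonded 1 GbE links

    def read_bytes(iface):
        """Return (rx_bytes, tx_bytes) for iface from /proc/net/dev."""
        with open("/proc/net/dev") as f:
            for line in f:
                if ":" not in line:
                    continue  # skip the two header lines
                name, data = line.split(":", 1)
                if name.strip() == iface:
                    fields = data.split()
                    # field 0 = RX bytes, field 8 = TX bytes
                    return int(fields[0]), int(fields[8])
        raise ValueError("interface %s not found" % iface)

    prev_rx, prev_tx = read_bytes(IFACE)
    while True:
        time.sleep(1)
        rx, tx = read_bytes(IFACE)
        rx_bps = (rx - prev_rx) * 8.0
        tx_bps = (tx - prev_tx) * 8.0
        prev_rx, prev_tx = rx, tx
        print("%s: rx %6.1f Mbit/s (%4.1f%%)  tx %6.1f Mbit/s (%4.1f%%)"
              % (IFACE, rx_bps / 1e6, 100.0 * rx_bps / LINK_BITS_PER_SEC,
                 tx_bps / 1e6, 100.0 * tx_bps / LINK_BITS_PER_SEC))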

Matthieu Brucher wrote:

You can try MPE (free) or Vampir (not free, but it can be integrated
with Open MPI).

2009/9/29 Rahul Nabar <rpna...@gmail.com>:
I have a code that seems to run about 40% faster when I bond together
twin eth interfaces. The question, of course, arises: is it really
producing so much traffic to keep twin 1 Gig eth interfaces busy? I
don't really believe this but need a way to check.

What are good tools to monitor the MPI performance of a running job?
Basically, what throughput load is it imposing on the eth interfaces?
Any suggestions?

The code does not seem to produce much disk I/O, as profiled via
strace (to check whether NFS I/O is a bottleneck at all).
