There might be one reason for slowing down the application quite a bit. If the timer you're using interacts with libevent (the library we use internally to manage all kinds of events), then we might end up in a situation where we call poll on every iteration of the event library, and this is really expensive.
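To make the mechanism concrete, here is a minimal sketch of the kind of interval-timer registration a sampling tool typically performs (I'm assuming something like setitimer()/SIGPROF here; that's a guess on my part, not something confirmed about OSS). A signal firing in the middle of the event loop is exactly the kind of interaction that could change how often we fall into poll:

    #include <signal.h>
    #include <string.h>
    #include <sys/time.h>

    /* Hypothetical sampling setup -- only to illustrate the mechanism,
     * not the actual OSS code. */
    static void sample_handler(int sig)
    {
        /* record a sample here; the handler can interrupt the event
         * library anywhere, including in or around poll() */
    }

    static void install_sampling_timer(void)
    {
        struct sigaction sa;
        struct itimerval it;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sample_handler;
        sa.sa_flags = SA_RESTART;        /* without this, poll() can return EINTR */
        sigaction(SIGPROF, &sa, NULL);

        it.it_interval.tv_sec = 0;
        it.it_interval.tv_usec = 10000;  /* 10ms sampling period (made-up value) */
        it.it_value = it.it_interval;
        setitimer(ITIMER_PROF, &it, NULL);
    }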

A quick way to figure out if this is the case is to run Open MPI without support for shared memory (--mca btl ^sm). This way we will call poll on a regular basis anyway, and if there is no difference between a normal run and an OSS run, we know at least where to start looking ...
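For example (the process count and application name below are just placeholders):

    mpirun --mca btl ^sm -np 4 ./my_app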

  george.

On Jan 12, 2009, at 13:00 , Jeff Squyres wrote:

On Jan 9, 2009, at 12:39 AM, William Hachfeld wrote:

Can any of the OpenMPI developers speculate as to possible mechanisms by which the ptrace() attachment, signal handler, or timer registration and corresponding signal delivery could cause large amounts of time to be spent within the "progress" functions of the OpenMPI library with an apparent lack of any real progress? Any ideas/information would be greatly appreciated.


Hum; interesting. I can't think of any reason why that would be a problem offhand. The mca_btl_sm_component_progress() function is the shared memory progression function. opal_progress() and mca_bml_r2_progress() are likely mainly dispatching off to this function.
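To illustrate what I mean by "dispatching off", the progress path is essentially a busy loop over registered callbacks, something along these lines (a simplified sketch with made-up names, not the actual Open MPI source), which is why time accumulates in the progress functions even when each call finds nothing to do:

    /* Simplified sketch of the progress-dispatch pattern. */
    typedef int (*progress_cb_t)(void);

    static progress_cb_t callbacks[8];
    static int num_callbacks;

    int sketch_opal_progress(void)
    {
        int i, events = 0;
        for (i = 0; i < num_callbacks; ++i) {
            events += callbacks[i]();   /* e.g. the sm BTL polling its queues */
        }
        return events;                  /* 0 == no real progress on this pass */
    }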

Does OSS interfere with shared memory between processes in any way? (I'm not enough of a kernel guy to know what the ramifications of ptrace and whatnot are)

--
Jeff Squyres
Cisco Systems

_______________________________________________
