Hi Lars,
First off, I think Jeff makes some very good points.
If you still think your applications will benefit from yielding
instead of hogging the CPU, you should probably try the parameter
"mpi_show_mca_params". It will give you a list of the MCA parameters
at runtime, so you can see what the yield_when_idle parameter really
looks like when your job runs; Open MPI sometimes overrides the user's
setting. If yield_when_idle is disabled, I think changes have to be
made to the Open MPI code to make it yield.
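(For example, and assuming an Open MPI 1.2-era build, something like
"mpirun --mca mpi_show_mca_params 1 ..." should print the effective MCA
parameter values during MPI_Init; check ompi_info on your version for
the exact values this parameter accepts.)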
Guess this didn't help at all, but at least you can check if you are
curious :)
Best regards,
Torje Henriksen
On Apr 13, 2008, at 1:51 PM, Jeff Squyres wrote:
Sorry for the delays in replying.
The central problem is that Open MPI is much more aggressive about its
message passing progress than LAM is -- it simply wasn't designed to
share the CPU well; that's a deliberate trade-off to get as high
performance as possible.
mpi_yield_when_idle is most helpful only for certain transports that
actively use our event engine, such as the TCP device. Since you're
using the LAM sysv RPI, I assume you're using the TCP and shared
memory devices in OMPI, right? If you're using InfiniBand, for
example, the event engine is not called much because IB has its own
progression engine that is unrelated to OMPI's (and therefore we don't
invoke OMPI's much).
mpi_yield_when_idle is also only helpful if you're going into the MPI
layer often and making message passing progress (i.e., OMPI's event
engine is actively being invoked). Is this true for your application?
If mpi_yield_when_idle really doesn't help much, you may consider
sprinkling calls to sched_yield() in your code to force the process
to yield the processor.
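As an illustration only, a minimal sketch of that idea -- a
hypothetical polling wait loop, not code from either application --
could look like this in C:

#include <sched.h>   /* sched_yield() */
#include <mpi.h>

/* Hypothetical helper: poll an outstanding request and give up the
   processor between polls instead of spinning at full speed. */
static void wait_politely(MPI_Request *req, MPI_Status *status)
{
    int done = 0;
    while (!done) {
        /* MPI_Test drives Open MPI's progress engine on each call. */
        MPI_Test(req, &done, status);
        if (!done) {
            sched_yield();   /* let other runnable processes have the CPU */
        }
    }
}

Whether this helps depends on how much time the processes actually
spend in such wait loops.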
On Apr 4, 2008, at 2:30 AM, Lars Andersson wrote:
Hi,
I'm just in the process of moving our application from LAM/MPI to
OpenMPI, mainly because OpenMPI makes it easier for a user to run
multiple jobs (MPI universes) simultaneously. This is useful if a user
wants to run smaller experiments without disturbing a large experiment
running in the background. I've been evaluating the performance using
a simple test, running on a heterogeneous cluster of 2 x dual-core
Opteron machines, a couple of dual-core P4 Xeon machines and an 8-core
Core2 machine. The main structure of the application is a master rank
distributing job packages to the rest of the ranks and collecting the
results. We don't use any fancy MPI features but rather see it as an
efficient low-level tool for broadcasting and transferring data.
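A generic sketch of that master/worker structure (hypothetical tags
and a dummy payload, not the actual application code) might look like:

#include <mpi.h>

#define JOB_TAG    1   /* hypothetical tag for outgoing job packages */
#define RESULT_TAG 2   /* hypothetical tag for returned results */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* Master: hand one job to every worker, then collect the results. */
        double job = 0.0, result;
        for (int r = 1; r < size; ++r)
            MPI_Send(&job, 1, MPI_DOUBLE, r, JOB_TAG, MPI_COMM_WORLD);
        for (int r = 1; r < size; ++r)
            MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, RESULT_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        /* Worker: receive a job package, do the work, send the result back. */
        double job, result;
        MPI_Recv(&job, 1, MPI_DOUBLE, 0, JOB_TAG, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        result = job + rank;   /* stand-in for the real computation */
        MPI_Send(&result, 1, MPI_DOUBLE, 0, RESULT_TAG, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}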
When a single user runs a job (fully subscribed nodes, but not
oversubscribed, i.e. one process per CPU core) on an otherwise unloaded
cluster, both LAM/MPI and OpenMPI average runtimes of about 1m33s
(OpenMPI has a slightly lower average).
When I start the same job simultaneously as two different users (thus
oversubscribing the nodes 2x) under LAM/MPI, the two jobs finish in an
average time of about 3m, thus scaling very well (we use the -ssi rpi
sysv option to mpirun under LAM/MPI to avoid busy waiting).
When running that same two-user experiment under OpenMPI, the average
runtime jumps up to about 3m30s, with runs occasionally taking more
than 4 minutes to complete. I do use the "--mca mpi_yield_when_idle 1"
option to mpirun, but it doesn't seem to make any difference. I've
also tried setting the environment variable
OMPI_MCA_mpi_yield_when_idle=1, but still no change. ompi_info says:
ompi_info --param all all | grep yield
                MCA mpi: parameter "mpi_yield_when_idle" (current value: "1")
The cluster is used for various tasks, running MPI applications as
well as non-MPI applications, so we would like to avoid spending too
many cycles on busy-waiting. Any ideas on how to tweak OpenMPI to get
better performance and more cooperative behavior in this case would be
greatly appreciated.
Cheers,
Lars
--
Jeff Squyres
Cisco Systems