On 2 October 2010 at 21:51, Manuel Prinz wrote:
| Hi Zack!
|
| On Sat, Oct 02, 2010 at 08:39:06AM -0700, Zack Weinberg wrote:
| > On Sat, Oct 2, 2010 at 6:01 AM, Manuel Prinz <[email protected]> wrote:
| > >> On 29 September 2010 at 18:22, Zack Weinberg wrote:
| > >> | (on an 8-core machine), CPU utilization jumps *immediately* from 98% idle
| > >> | to 20% user, 70% system, 12% idle. strace reveals that each slave is
| > >> | spinning through poll() calls with timeout zero, rather than blocking
| > >> | until a message arrives, as the documentation for mpi.probe() suggests
| > >> | should happen.
| > ...
| > > Well, no. Actually, this behavior is by design. I'm not sure about the
| > > details exactly but can get back to Jeff if you're interested in those.
| > > This comes up every now and then in the BTS or on the user list. Open MPI
| > > basically burns every free cycle that is not used for computation (busy
| > > wait). There are no immediate plans to change that, as far as I know.
|
| I did some reading, and it seems that Open MPI does indeed support two modes
| of waiting: aggressive and degraded. The default behavior is "aggressive",
| but you can switch between them by setting the mpi_yield_when_idle MCA
| parameter. See the following FAQ entries (and the links therein):
|
| http://www.open-mpi.org/faq/?category=running#force-aggressive-degraded
| http://www.open-mpi.org/faq/?category=running#oversubscribing
|
| I guess this is basically the behaviour you want. It would be great if you
| could give it a try and report back whether it works for you. If it doesn't
| do what you (and I) expect, I'll forward this issue upstream.
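For anyone wanting to test this, the FAQ entries above suggest the MCA parameter can be set on the mpirun command line or via the environment. A sketch (the script name and process count here are placeholders, not from the thread):

```shell
# Request "degraded" mode: idle ranks yield the CPU instead of
# busy-polling, so blocking in mpi.probe() should stop pegging the cores.
mpirun --mca mpi_yield_when_idle 1 -np 8 R --vanilla -f your_script.R

# Equivalently, Open MPI MCA parameters can be passed through the
# environment using the OMPI_MCA_ prefix:
OMPI_MCA_mpi_yield_when_idle=1 mpirun -np 8 R --vanilla -f your_script.R
```

Note that degraded mode trades latency for idle CPU: message progress relies on yielding rather than spinning, so tightly coupled computations may run somewhat slower.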
Nice work! That totally rhymes with what I recall from late in the 1.2.*
cycle, and it would indeed be nice if we could get this tested and then
documented.

Dirk

--
Dirk Eddelbuettel | [email protected] | http://dirk.eddelbuettel.com

--
To UNSUBSCRIBE, email to [email protected]
with a subject of "unsubscribe". Trouble? Contact [email protected]

