----- "Eugene Loh" <[email protected]> wrote:

> This is an important discussion.

Indeed! My big fear is that people won't pick up the significance of
the change and will complain about performance regressions in the
middle of an OMPI stable release cycle.

> Do note:
>
> 1) Bind-to-core is actually the default behavior of many MPIs today.

We had this issue with MVAPICH before we dumped it to go to OpenMPI:
if we had (for example) two 4-core jobs running on the same node,
they'd both go at half speed whilst the node itself was 50% idle.
It turned out they'd both bound to cores 0-3, leaving cores 4-7
unused. :-(

Fortunately there was an undocumented environment variable that let
us turn it off for all jobs, but getting rid of that misbehaviour was
a major reason for switching to OpenMPI.

> 2) The proposed OMPI bind-to-socket default is less severe. In the
> general case, it would allow multiple jobs to bind in the same way
> without oversubscribing any core or socket. (This comment added to
> the trac ticket.)

That's a nice clarification, thanks. I suspect though that the same
issue we had with MVAPICH would occur if two 4-core jobs both bound
themselves to the first socket.

Thinking further, it would be interesting to find out how this code
would behave on a system where cpusets are in use, so that OMPI has
to submit to the will of the scheduler regarding cores/sockets.

> 3) Defaults (if I understand correctly) can be set differently
> on each cluster.

Yes, but the defaults should be sensible for the majority of
clusters. If the majority do indeed share nodes between jobs, then I
would suggest that the default should be off and the minority who
don't share nodes should have to enable it.

There's also the issue of those users who (for whatever reason) like
to build their own MPI stack and who are even less likely to
understand the impact that they may have on others. :-(

cheers!
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
