On 2/27/14 14:06 PM, Noam Bernstein wrote:
On Feb 27, 2014, at 2:36 AM, Patrick Begou <patrick.be...@legi.grenoble-inp.fr> 
wrote:

Bernd Dammann wrote:
The '--bind-to-core' workaround only makes sense for jobs that allocate full 
nodes, but the majority of our jobs don't do that.
Why?
We still use this option in OpenMPI (1.6.x, 1.7.x) with OpenFOAM and other 
applications to pin each process to its own core, because Linux sometimes 
migrates processes and two processes can end up running on the same core, 
slowing the application. We do this even when we do not use full nodes.
'--bind-to-core' is only inapplicable if you mix OpenMP and MPI, since all the 
threads of a process will be bound to the same core, but I do not recall that 
OpenFOAM does this yet.
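
For readers unfamiliar with pinning, here is a minimal sketch of what binding a process to one core amounts to at the OS level. It is an illustration only, not how Open MPI implements '--bind-to-core': it uses Python's Linux-only os.sched_getaffinity / os.sched_setaffinity calls, the helper name pin_to_one_core is hypothetical, and picking the lowest allowed core is an arbitrary assumption.

```python
import os

def pin_to_one_core():
    """Pin the calling process to a single core it is already allowed to use.

    Returns (previous_affinity, new_affinity), or None on platforms where
    the Linux-only affinity calls are unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    previous = os.sched_getaffinity(0)   # 0 means "the calling process"
    core = min(previous)                 # arbitrary choice: lowest allowed core
    os.sched_setaffinity(0, {core})      # the scheduler can no longer migrate us
    return previous, os.sched_getaffinity(0)

result = pin_to_one_core()
if result is not None:
    previous, current = result
    print("allowed before:", sorted(previous))
    print("allowed after: ", sorted(current))
    os.sched_setaffinity(0, previous)    # restore the original mask
```

Threads started after the call inherit the same single-core mask, which is why this kind of binding is a poor fit for hybrid OpenMP/MPI runs.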

But if your jobs don't allocate full nodes and there are two jobs on the same 
node, they can end up bound to the same subset of cores.

Exactly, that's our problem!

Torque cpusets should in principle be able to prevent this (the queuing system 
allocates distinct sets of cores to distinct jobs), but I've never used them 
myself.
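
The collision and the cpuset remedy can be sketched as a toy model. This is a simulation only: it assumes each job's launcher binds rank i to the i-th core it is allowed to see and knows nothing about the other job. The 8-core node and the helper names bind_ranks and shared_cores are hypothetical.

```python
def bind_ranks(n_ranks, allowed_cores):
    """Bind rank i to the i-th core of allowed_cores (one rank per core)."""
    cores = sorted(allowed_cores)
    if n_ranks > len(cores):
        raise ValueError("more ranks than cores")
    return {rank: cores[rank] for rank in range(n_ranks)}

def shared_cores(*bindings):
    """Return the set of cores used by more than one job."""
    seen, shared = set(), set()
    for binding in bindings:
        used = set(binding.values())
        shared |= seen & used
        seen |= used
    return shared

NODE = range(8)  # hypothetical 8-core node

# No cpusets: each launcher sees the whole node and starts at core 0.
job_a = bind_ranks(4, NODE)
job_b = bind_ranks(4, NODE)
print(sorted(shared_cores(job_a, job_b)))  # [0, 1, 2, 3]

# With cpusets: the queuing system hands each job a distinct set of cores.
job_a = bind_ranks(4, range(0, 4))
job_b = bind_ranks(4, range(4, 8))
print(sorted(shared_cores(job_a, job_b)))  # []
```

In the first case both jobs pile onto cores 0-3 while cores 4-7 sit idle; in the second the per-job core sets keep the bindings apart.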


We started to use them at some point, but they had some side effects (leaving dangling jobs/processes), so we stopped using them. Certain ISV applications have issues as well.

Here we've basically given up on jobs that allocate a non-integer number of 
nodes.  In principle they can (and then I turn off bind by core), but hardly 
anyone does it except for some serial jobs.  Then again, we have a mix of 8- and 
16-core nodes.  If we had only 32- or 64-core nodes we might be less tolerant of 
this restriction.


We are running a system with a very inhomogeneous workload: in-house applications and applications that we compile ourselves, but also third-party applications that are not always designed with a (multi-user) cluster in mind.

Rgds,
Bernd
