> On Oct 6, 2015, at 12:41 PM, marcin.krotkiewski <marcin.krotkiew...@gmail.com> wrote:
>
> Ralph, maybe I was not precise - most likely --cpu_bind does not work on my system because it is disabled in SLURM, and is not caused by any problem in OpenMPI. I am not certain and I will have to investigate this further, so please do not waste your time on this.

No problem - I understood your note. It does require you to enable it in the slurm config.
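For reference, a quick way to see what your cluster is actually configured with - treat this as a sketch, since the exact plugin set is site-specific and up to your admins:

  # show the task plugin(s) this SLURM installation is configured with
  scontrol show config | grep -i TaskPlugin

  # --cpu_bind generally needs a task plugin that does binding enabled in
  # slurm.conf, typically TaskPlugin=task/affinity (and/or task/cgroup)

Once that is enabled, srun's --cpu_bind=...,verbose output will show the mask it applied to each task.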
> What do you mean by 'loss of dynamics support'?

There is no support for things like connect/accept, comm_spawn (since the procs cannot do a connect/accept after start), publish, lookup, etc. These are all considered part of MPI's dynamics definitions. You'll also have fewer map/bind options, if that matters.

> Thanks,
>
> Marcin
>
>
> On 10/06/2015 09:35 PM, Ralph Castain wrote:
>> I'll have to fix it later this week - out due to eye surgery today. Looks like something didn't get across to 1.10 as it should have. There are other tradeoffs that occur when you go to direct launch (e.g., loss of dynamics support) - may or may not be of concern to your usage.
>>
>>
>>> On Oct 6, 2015, at 11:57 AM, marcin.krotkiewski <marcin.krotkiew...@gmail.com> wrote:
>>>
>>> Thanks, Gilles. This is a good suggestion and I will pursue this direction. The problem is that currently SLURM does not support --cpu_bind on my system for whatever reason. I may work towards turning this option on if that will be necessary, but it would also be good to be able to do it with pure OpenMPI.
>>>
>>> Marcin
>>>
>>>
>>> On 10/06/2015 08:01 AM, Gilles Gouaillardet wrote:
>>>> Marcin,
>>>>
>>>> did you investigate direct launch (e.g., srun) instead of mpirun?
>>>>
>>>> for example, you can do
>>>> srun --ntasks=2 --cpus-per-task=4 -l grep Cpus_allowed_list /proc/self/status
>>>>
>>>> note, you might have to use the srun --cpu_bind option, and make sure your slurm config supports that:
>>>> srun --ntasks=2 --cpus-per-task=4 --cpu_bind=core,verbose -l grep Cpus_allowed_list /proc/self/status
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On 10/6/2015 4:38 AM, marcin.krotkiewski wrote:
>>>>> Yet another question about cpu binding under the SLURM environment.
>>>>>
>>>>> Short version: will OpenMPI support SLURM_CPUS_PER_TASK for the purpose of cpu binding?
>>>>>
>>>>> Full version: When you allocate a job like, e.g., this
>>>>>
>>>>> salloc --ntasks=2 --cpus-per-task=4
>>>>>
>>>>> SLURM will allocate 8 cores in total, 4 for each 'assumed' MPI task. This is useful for hybrid jobs, where each MPI process spawns some internal worker threads (e.g., OpenMP). The intention is that 2 MPI procs are started, each of them 'bound' to 4 cores. SLURM will also set an environment variable
>>>>>
>>>>> SLURM_CPUS_PER_TASK=4
>>>>>
>>>>> which should (probably?) be taken into account by the method that launches the MPI processes to figure out the cpuset. In the case of OpenMPI + mpirun I think something should happen in orte/mca/ras/slurm/ras_slurm_module.c, where the variable _is_ actually parsed. Unfortunately, it is never really used...
>>>>>
>>>>> As a result, the cpuset of all tasks started on a given compute node includes all CPU cores of all MPI tasks on that node, just as provided by SLURM (in the above example - 8). In general, there is no simple way for the user code in the MPI procs to 'split' the cores between themselves. I imagine the original intention to support this in OpenMPI was something like
>>>>>
>>>>> mpirun --bind-to subtask_cpuset
>>>>>
>>>>> with an artificial bind target that would cause OpenMPI to divide the allocated cores between the MPI tasks. Is this right? If so, it seems that at this point this is not implemented. Are there plans to do this?
>>>>> If not, does anyone know another way to achieve that?
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Marcin
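As for another way to achieve that in the meantime: one possible workaround (a sketch, not something I have verified on your cluster - it assumes the PE mapping modifier available in the 1.8/1.10 mpirun, and my_hybrid_app is just a placeholder name) is to tell mpirun explicitly how many cores each rank should get, keeping the value in sync with --cpus-per-task by hand:

  # inside the salloc above (2 tasks x 4 cpus per task), check what each
  # rank would be bound to:
  mpirun -np 2 --map-by slot:pe=4 --bind-to core --report-bindings \
         grep Cpus_allowed_list /proc/self/status

  # and for the actual hybrid job, something along these lines:
  mpirun -np 2 --map-by slot:pe=4 --bind-to core -x OMP_NUM_THREADS=4 ./my_hybrid_app

Each rank should then see a 4-core Cpus_allowed_list instead of all 8, though you still have to duplicate the cpus-per-task count on the mpirun command line - which is exactly the duplication that honoring SLURM_CPUS_PER_TASK would remove.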