mpirun reads the allocation directly from Slurm's environment variables - there is 
no need to create a hostfile.
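
For example, inside an existing allocation something like the following is normally 
all that is required (a minimal sketch - "./my_app" is just a placeholder, and the 
exact salloc flags are illustrative):

% salloc -N 2 --ntasks-per-node=4     # Slurm grants the allocation
% mpirun -np 8 ./my_app               # mpirun picks up the node list from the
                                      # SLURM_* variables (e.g. SLURM_JOB_NODELIST),
                                      # so no hostfile or -host argument is needed

mpirun launches its daemons through srun under the hood, so everything stays on the 
nodes Slurm assigned to the job.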


> On Oct 11, 2021, at 11:12 AM, Sheppard, Raymond W <rshep...@iu.edu> wrote:
> 
> Hi,
>  Personally, I have had trouble with Slurm not wanting to give mpirun a 
> hostfile to work with.  How do you get around that?  Thanks.
>            Ray
> 
> ________________________________________
> From: users <users-boun...@lists.open-mpi.org> on behalf of Ralph Castain via 
> users <users@lists.open-mpi.org>
> Sent: Monday, October 11, 2021 1:49 PM
> To: Open MPI Users
> Cc: Ralph Castain
> Subject: Re: [OMPI users] [External] Re: cpu binding of mpirun to follow 
> slurm setting
> 
> Oh my - that is a pretty strong statement. It depends on what you are trying 
> to do, and whether or not Slurm offers a mapping pattern that matches. mpirun 
> tends to have a broader range of options, which is why many people use it. It 
> also means that your job script is portable and not locked to a specific RM, 
> which is important to quite a few users.
> 
> However, if Slurm has something you can use/like and you don't need to worry 
> about portability, then by all means one should use it.
> 
> Just don't assume that everyone fits in that box :-)
> 
> 
> On Oct 11, 2021, at 10:40 AM, Chang Liu via users 
> <users@lists.open-mpi.org> wrote:
> 
> OK thank you. Seems that srun is a better option for normal users.
> 
> Chang
> 
> On 10/11/21 1:23 PM, Ralph Castain via users wrote:
> Sorry, your output wasn't clear about cores vs hwthreads. Apparently, your 
> Slurm config is set up to use hwthreads as independent cpus - what you are 
> calling "logical cores", which is a little confusing.
> 
> No, mpirun has no knowledge of what mapping pattern you passed to salloc. We 
> don't have any good way of obtaining config information, for one thing - 
> e.g., that Slurm is treating hwthreads as cpus. So we can't really interpret 
> what they might have done.
> 
> Given this clarification, you can probably get what you want with:
> 
> mpirun --use-hwthread-cpus --map-by hwthread:pe=2 ...
> On Oct 11, 2021, at 7:35 AM, Chang Liu via users 
> <users@lists.open-mpi.org> wrote:
> 
> This is not what I need. The cpu can run 4 threads per core, so "--bind-to 
> core" results in one process occupying 4 logical cores.
> 
> I want one process to occupy 2 logical cores, so two processes sharing a 
> physical core.
> 
> I guess there is a way to do that by playing with the mapping options. I just 
> want to know if this is a bug in mpirun, or if this feature for interacting 
> with Slurm was never implemented.
> 
> Chang
> 
> On 10/11/21 10:07 AM, Ralph Castain via users wrote:
> You just need to tell mpirun that you want your procs to be bound to cores, 
> not to sockets (which is the default).
> 
> Add "--bind-to core" to your mpirun command line.
> On Oct 10, 2021, at 11:17 PM, Chang Liu via users 
> <users@lists.open-mpi.org> wrote:
> 
> Yes they are. This is an interactive job from
> 
> salloc -N 1 --ntasks-per-node=64 --cpus-per-task=2 --gpus-per-node=4 
> --gpu-mps --time=24:00:00
> 
> Chang
> 
> On 10/11/21 2:09 AM, Åke Sandgren via users wrote:
> On 10/10/21 5:38 PM, Chang Liu via users wrote:
> OMPI v4.1.1-85-ga39a051fd8
> 
> % srun bash -c "cat /proc/self/status|grep Cpus_allowed_list"
> Cpus_allowed_list:      58-59
> Cpus_allowed_list:      106-107
> Cpus_allowed_list:      110-111
> Cpus_allowed_list:      114-115
> Cpus_allowed_list:      16-17
> Cpus_allowed_list:      36-37
> Cpus_allowed_list:      54-55
> ...
> 
> % mpirun bash -c "cat /proc/self/status|grep Cpus_allowed_list"
> Cpus_allowed_list:      0-127
> Cpus_allowed_list:      0-127
> Cpus_allowed_list:      0-127
> Cpus_allowed_list:      0-127
> Cpus_allowed_list:      0-127
> Cpus_allowed_list:      0-127
> Cpus_allowed_list:      0-127
> ...
> Was that run in the same batch job? If not, the data is useless.
> 
> --
> Chang Liu
> Staff Research Physicist
> +1 609 243 3438
> c...@pppl.gov
> Princeton Plasma Physics Laboratory
> 100 Stellarator Rd, Princeton NJ 08540, USA
> 
> --
> Chang Liu
> Staff Research Physicist
> +1 609 243 3438
> c...@pppl.gov
> Princeton Plasma Physics Laboratory
> 100 Stellarator Rd, Princeton NJ 08540, USA
> 
> --
> Chang Liu
> Staff Research Physicist
> +1 609 243 3438
> c...@pppl.gov
> Princeton Plasma Physics Laboratory
> 100 Stellarator Rd, Princeton NJ 08540, USA
> 

