Hi Ralph,

Thank you for looking into this!
The job

    #BSUB -J "task_geometry"
    #BSUB -n 9
    #BSUB -R "span[ptile=3]"
    #BSUB -m "p10a30 p10a33 p10a35 p10a55 p10a58"
    #BSUB -R "affinity[core]"
    #BSUB -e "task_geometry.stderr.%J"
    #BSUB -o "task_geometry.stdout.%J"
    #BSUB -q "normal"
    #BSUB -M "800"
    #BSUB -R "rusage[mem=800]"
    #BSUB -x

    export PATH=/usr/local/OpenMPI/1.10.2/bin:${PATH}
    export LD_LIBRARY_PATH=/usr/local/OpenMPI/1.10.2/lib:${LD_LIBRARY_PATH}
    export LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}"

    mpirun /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI

fails with the message

    --------------------------------------------------------------------------
    A request was made to bind to that would result in binding more
    processes than cpus on a resource:

       Bind to:     CORE
       Node:        p10a55
       #processes:  2
       #cpus:       1

    You can override this protection by adding the "overload-allowed"
    option to your binding directive.
    --------------------------------------------------------------------------

(Please see the first set of LSF output files in the original message.) I did not expect this failure: I have not asked for more than one MPI process per core.

In an attempt to work around this failure, I added the option -bind-to core:overload-allowed, and this led to 20 MPI processes (there are 20 cores on each host in this cluster) being started on just one of the hosts. That is, neither job did what I expected.

I will try to put you in touch with someone in LSF development immediately.

Regards,

Farid Parpia
IBM Corporation: 710-2-RF28, 2455 South Road, Poughkeepsie, NY 12601, USA; Telephone: (845) 433-8420 = Tie Line 293-8420

----- Forwarded by Farid Parpia/Poughkeepsie/IBM on 04/18/2016 06:56 PM -----

From: Ralph Castain <r...@open-mpi.org>
To: Open MPI Users <us...@open-mpi.org>
Date: 04/18/2016 06:53 PM
Subject: Re: [OMPI users] LSF's LSB_PJL_TASK_GEOMETRY + OpenMPI 1.10.2
Sent by: "users" <users-boun...@open-mpi.org>

Hi Farid,

I'm not sure I understand what you are asking here.
If your point is that OMPI isn't placing and binding procs per the LSF directives, then you are quite correct. The LSF folks never provided that level of integration, nor the information by which we might have derived it (e.g., how the pattern is communicated). If someone from IBM would like to provide that code, we'd be happy to help answer questions as to how to perform the integration.

On Apr 18, 2016, at 10:13 AM, Farid Parpia <par...@us.ibm.com> wrote:

Greetings!

The following batch script successfully demonstrates the use of LSF's task geometry feature with IBM Parallel Environment:

    #BSUB -J "task_geometry"
    #BSUB -n 9
    #BSUB -R "span[ptile=3]"
    #BSUB -network "type=sn_single:mode=us"
    #BSUB -R "affinity[core]"
    #BSUB -e "task_geometry.stderr.%J"
    #BSUB -o "task_geometry.stdout.%J"
    #BSUB -q "normal"
    #BSUB -M "800"
    #BSUB -R "rusage[mem=800]"
    #BSUB -x

    export LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}"
    ldd /gpfs/gpfs_stage1/parpia/PE_tests/reporter/bin/reporter_MPI
    /gpfs/gpfs_stage1/parpia/PE_tests/reporter/bin/reporter_MPI

The reporter_MPI utility simply reports the hostname and affinitization of each MPI process; I use it to verify that the job is distributed to the allocated nodes, and affinitized on them, as expected.
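For readers who do not have the reporter_MPI binary, a minimal stand-in can be sketched in shell. This is hypothetical code, not part of the original post; it assumes Linux (the Cpus_allowed_list field of /proc/self/status) and the OMPI_COMM_WORLD_RANK environment variable that Open MPI's launcher exports to each rank:

```shell
#!/bin/sh
# Hypothetical stand-in for the reporter_MPI utility described above: print
# the hostname, the MPI rank (as exported by Open MPI's launcher, if any),
# and the CPU affinity of the current process.  Assumes Linux /proc.
report_affinity() {
    # Cpus_allowed_list is inherited by child processes, so reading it
    # from awk's /proc/self/status reflects this shell's affinity.
    cpus=$(awk '/^Cpus_allowed_list/ { print $2 }' /proc/self/status)
    printf '%s rank=%s cpus=%s\n' \
        "$(uname -n)" "${OMPI_COMM_WORLD_RANK:-n/a}" "$cpus"
}

report_affinity
```

One could run one copy per rank, e.g. mpirun sh report.sh, and compare the cpus= fields against the placement requested in LSB_PJL_TASK_GEOMETRY.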
Typical output is attached.

To adapt the above batch script to use OpenMPI, I modify it to

    #BSUB -J "task_geometry"
    #BSUB -n 9
    #BSUB -R "span[ptile=3]"
    #BSUB -m "p10a30 p10a33 p10a35 p10a55 p10a58"
    #BSUB -R "affinity[core]"
    #BSUB -e "task_geometry.stderr.%J"
    #BSUB -o "task_geometry.stdout.%J"
    #BSUB -q "normal"
    #BSUB -M "800"
    #BSUB -R "rusage[mem=800]"
    #BSUB -x

    export PATH=/usr/local/OpenMPI/1.10.2/bin:${PATH}
    export LD_LIBRARY_PATH=/usr/local/OpenMPI/1.10.2/lib:${LD_LIBRARY_PATH}
    export LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}"

    echo "=== LSB_DJOB_HOSTFILE ==="
    cat ${LSB_DJOB_HOSTFILE}
    echo "=== LSB_AFFINITY_HOSTFILE ==="
    cat ${LSB_AFFINITY_HOSTFILE}
    echo "=== LSB_DJOB_RANKFILE ==="
    cat ${LSB_DJOB_RANKFILE}
    echo "========================="

    ldd /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI
    mpirun /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI

The additional lines of scripting were inserted to help with debugging this failing job. The output files from the job are attached.

If I change the last line of the job script immediately above to

    mpirun -bind-to core:overload-allowed /gpfs/gpfs_stage1/parpia/OpenMPI_tests/reporter/bin/reporter_MPI

the job runs through, but the host selection and affinitization are completely wrong (you can extract the relevant information with grep "can be sched" *.stdout.* | sort -n -k 9); the output files are attached.

OpenMPI 1.10.2 was built using the attached script, and was installed with make install executed from the top of the build tree.
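Since mpirun does not interpret LSB_PJL_TASK_GEOMETRY, one conceivable workaround is to translate the geometry string, together with the unique hosts from LSB_DJOB_HOSTFILE, into an Open MPI rankfile for mpirun -rf. The sketch below is illustrative only (the script name, output filename, and fallback host list are invented, and it has not been validated under LSF): group i of the geometry is mapped to the i-th allocated host, one slot per task.

```shell
#!/bin/sh
# Hypothetical workaround sketch (not part of LSF or Open MPI): translate
# LSB_PJL_TASK_GEOMETRY, e.g. "{(5)(4,3)(2,1,0)}", into an Open MPI
# rankfile.  Group i of the geometry is assigned to the i-th unique host
# from LSB_DJOB_HOSTFILE, one slot per task on that host.
geom=${LSB_PJL_TASK_GEOMETRY:-"{(5)(4,3)(2,1,0)}"}

# Unique hosts in first-seen order; fall back to the first three hosts of
# the job script above when run outside LSF (illustration only).
hostlist=$(awk '!seen[$1]++ { print $1 }' "${LSB_DJOB_HOSTFILE:-/dev/null}" 2>/dev/null)
[ -z "$hostlist" ] && hostlist='p10a30
p10a33
p10a35'

rankfile=task_geometry.rankfile
: > "$rankfile"

g=1
# Replace every brace/parenthesis with a newline and drop blank lines,
# leaving one geometry group per line: "5", "4,3", "2,1,0".
printf '%s\n' "$geom" | tr '(){}' '\n\n\n\n' | grep . | while read -r grp; do
    host=$(printf '%s\n' "$hostlist" | sed -n "${g}p")
    slot=0
    oldIFS=$IFS; IFS=,
    for t in $grp; do                 # walk the task ranks in this group
        echo "rank $t=$host slot=$slot" >> "$rankfile"
        slot=$((slot + 1))
    done
    IFS=$oldIFS
    g=$((g + 1))
done

cat "$rankfile"
```

One would then launch with something like mpirun -rf task_geometry.rankfile reporter_MPI; whether the slot numbers correspond to the intended physical cores depends on the binding policy in effect, so this is a starting point rather than a verified fix.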
Here is the output of ompi_info --all (attached).

Regards,

Farid Parpia
IBM Corporation: 710-2-RF28, 2455 South Road, Poughkeepsie, NY 12601, USA; Telephone: (845) 433-8420 = Tie Line 293-8420

Attachments: <task_geometry.stdout.43915.gz> <task_geometry.stderr.43915.gz> <task_geometry.stderr.43918.gz> <task_geometry.stdout.43918.gz> <task_geometry.stderr.43953.gz> <task_geometry.stdout.43953.gz> <build_OpenMPI.sh> <ompi_info--all.gz>

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/04/28955.php

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/04/28958.php