Re: [OMPI users] Infiniband Question
Yep -- it's normal. Those IP addresses are used for bootstrapping/startup, not for MPI traffic. In particular, that "HNP URI" stuff is used by Open MPI's underlying run-time environment. It's not used by the MPI layer at all.

On Feb 5, 2010, at 2:32 PM, Mike Hanby wrote:

> Howdy,
>
> When running a Gromacs job using OpenMPI 1.4.1 on Infiniband enabled nodes,
> I'm seeing the following process listing:
>
> \_ -bash /opt/gridengine/default/spool/compute-0-3/job_scripts/97037
>   \_ mpirun -np 4 mdrun_mpi -v -np 4 -s production-Npt-323K_4CPU -o
>      production-Npt-323K_4CPU -c production-Npt-323K_4CPU -x
>      production-Npt-323K_4CPU -g production-Npt-323K_4CPU.log
>     \_ /opt/gridengine/bin/lx26-amd64/qrsh -inherit -nostdin -V
>        compute-0-4.local orted -mca ess env -mca orte_ess_jobid 945881088
>        -mca orte_ess_vpid 1 -mca orte_ess_num_procs 4 --hnp-uri
>        "945881088.0;tcp://192.168.20.252:39440;tcp://192.168.21.252:39440"
>     \_ /opt/gridengine/bin/lx26-amd64/qrsh -inherit -nostdin -V
>        compute-0-2.local orted -mca ess env -mca orte_ess_jobid 945881088
>        -mca orte_ess_vpid 2 -mca orte_ess_num_procs 4 --hnp-uri
>        "945881088.0;tcp://192.168.20.252:39440;tcp://192.168.21.252:39440"
>     \_ /opt/gridengine/bin/lx26-amd64/qrsh -inherit -nostdin -V
>        compute-0-1.local orted -mca ess env -mca orte_ess_jobid 945881088
>        -mca orte_ess_vpid 3 -mca orte_ess_num_procs 4 --hnp-uri
>        "945881088.0;tcp://192.168.20.252:39440;tcp://192.168.21.252:39440"
>     \_ mdrun_mpi -v -np 4 -s production-Npt-323K_4CPU -o
>        production-Npt-323K_4CPU -c production-Npt-323K_4CPU
>        -x production-Npt-323K_4CPU -g production-Npt-323K_4CPU.log
>
> Is it normal for these tcp addresses to be listed if the job is using
> Infiniband?
>
> The 192.168.20.x subnet is the eth0 GigE network
> And the 192.168.21.x subnet is the ib0 IPoverIB network
>
> Or is this job actually using TCPIP over Infiniband / GigE?
>
> I'm running mpirun without any special fabric includes / excludes.
>
> ompi_info lists openib as a valid fabric:
> $ ompi_info | grep openib
>     MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.1)
>
> Thanks for any insight,
>
> Mike
> =
> Mike Hanby
> mha...@uab.edu
> Information Systems Specialist II
> IT HPCS / Research Computing
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
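If you want to double-check that the MPI layer itself is on InfiniBand, here is a sketch (not from the original exchange) using standard Open MPI 1.4-era MCA flags; the mdrun_mpi arguments are abbreviated from the job listing above:

```shell
# Restrict MPI traffic to the InfiniBand (openib), shared-memory, and
# self BTLs; mpirun then aborts at startup instead of silently falling
# back to TCP if openib is unusable:
mpirun -np 4 --mca btl openib,sm,self mdrun_mpi -v -s production-Npt-323K_4CPU

# Or leave component selection alone and have the BTL framework report
# which transports it actually picked on each node:
mpirun -np 4 --mca btl_base_verbose 30 mdrun_mpi -v -s production-Npt-323K_4CPU
```

Either way, the qrsh/orted processes will still show TCP addresses in their --hnp-uri arguments, since that bootstrap channel is separate from the MPI transports.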
[OMPI users] Infiniband Question
Howdy,

When running a Gromacs job using OpenMPI 1.4.1 on Infiniband enabled nodes, I'm seeing the following process listing:

\_ -bash /opt/gridengine/default/spool/compute-0-3/job_scripts/97037
  \_ mpirun -np 4 mdrun_mpi -v -np 4 -s production-Npt-323K_4CPU -o
     production-Npt-323K_4CPU -c production-Npt-323K_4CPU -x
     production-Npt-323K_4CPU -g production-Npt-323K_4CPU.log
    \_ /opt/gridengine/bin/lx26-amd64/qrsh -inherit -nostdin -V
       compute-0-4.local orted -mca ess env -mca orte_ess_jobid 945881088
       -mca orte_ess_vpid 1 -mca orte_ess_num_procs 4 --hnp-uri
       "945881088.0;tcp://192.168.20.252:39440;tcp://192.168.21.252:39440"
    \_ /opt/gridengine/bin/lx26-amd64/qrsh -inherit -nostdin -V
       compute-0-2.local orted -mca ess env -mca orte_ess_jobid 945881088
       -mca orte_ess_vpid 2 -mca orte_ess_num_procs 4 --hnp-uri
       "945881088.0;tcp://192.168.20.252:39440;tcp://192.168.21.252:39440"
    \_ /opt/gridengine/bin/lx26-amd64/qrsh -inherit -nostdin -V
       compute-0-1.local orted -mca ess env -mca orte_ess_jobid 945881088
       -mca orte_ess_vpid 3 -mca orte_ess_num_procs 4 --hnp-uri
       "945881088.0;tcp://192.168.20.252:39440;tcp://192.168.21.252:39440"
    \_ mdrun_mpi -v -np 4 -s production-Npt-323K_4CPU -o
       production-Npt-323K_4CPU -c production-Npt-323K_4CPU
       -x production-Npt-323K_4CPU -g production-Npt-323K_4CPU.log

Is it normal for these tcp addresses to be listed if the job is using Infiniband?

The 192.168.20.x subnet is the eth0 GigE network, and the 192.168.21.x subnet is the ib0 IP-over-IB network.

Or is this job actually using TCP/IP over Infiniband / GigE?

I'm running mpirun without any special fabric includes / excludes. ompi_info lists openib as a valid fabric:

$ ompi_info | grep openib
    MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.1)

Thanks for any insight,

Mike
=
Mike Hanby
mha...@uab.edu
Information Systems Specialist II
IT HPCS / Research Computing
Re: [OMPI users] infiniband question
Correct, you don't need DAPL. Can you send all the information listed here: http://www.open-mpi.org/community/help/

On Sep 17, 2009, at 9:17 AM, Yann JOBIC wrote:

> Hi,
> I'm new to InfiniBand. I installed the rdma_cm, rdma_ucm and ib_uverbs kernel modules.
> When I run a ring-test Open MPI code, I get:
>
> [Lidia][0,1,1][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
> [Lidia][0,1,1][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
> [Lilou][0,1,0][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
> [Lilou][0,1,0][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
>
> And then the program hangs.
> I thought I only need RDMA communications, and don't need the DAPL lib (with the ipoib module). Am I wrong?
>
> Thanks,
> Yann

-- 
Jeff Squyres
jsquy...@cisco.com
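For reference, a sketch of commands that gather the usual pieces the help page asks about (an assumption about its exact contents; the commands themselves are all standard in an OFED + Open MPI install):

```shell
# Full Open MPI build/runtime configuration:
ompi_info --all > ompi_info.txt

# Verbs-level view of the HCA: firmware, ports, and whether each port's
# state is PORT_ACTIVE:
ibv_devinfo > ibv_devinfo.txt

# Locked-memory limit -- a low "max locked memory" is a classic cause of
# openib hangs, so it should be "unlimited" (or very large) on all nodes:
ulimit -l
```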
[OMPI users] infiniband question
Hi,

I'm new to InfiniBand. I installed the rdma_cm, rdma_ucm and ib_uverbs kernel modules.
When I run a ring-test Open MPI code, I get:

[Lidia][0,1,1][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
[Lidia][0,1,1][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
[Lilou][0,1,0][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)
[Lilou][0,1,0][btl_openib_endpoint.c:992:mca_btl_openib_endpoint_qp_init_query] Set MTU to IBV value 4 (2048 bytes)

And then the program hangs.

I thought I only need RDMA communications, and don't need the DAPL lib (with the ipoib module). Am I wrong?

Thanks,
Yann
-- 
___
Yann JOBIC
HPC engineer
Polytech Marseille DME
IUSTI-CNRS UMR 6595
Technopôle de Château Gombert
5 rue Enrico Fermi
13453 Marseille cedex 13
Tel : (33) 4 91 10 69 39 or (33) 4 91 10 69 43
Fax : (33) 4 91 10 69 69
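A side note on the DAPL question: the openib BTL talks to libibverbs directly, so a verbs-level ping-pong between the two hosts is a reasonable sanity check that RDMA works with no DAPL or IPoIB involvement. A sketch using the ibv_rc_pingpong utility shipped with libibverbs (hostnames taken from the log above):

```shell
# On the first node (server side), wait for a connection:
ibv_rc_pingpong

# On the second node (client side), connect to the server by hostname:
ibv_rc_pingpong Lilou
```

If the ping-pong completes, the verbs stack is healthy, and the hang is more likely an Open MPI configuration issue (e.g. locked-memory limits) than a missing library.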