Hi Tohiko

If you compiled Open MPI on a computer with IB hardware and then copied the installation tree to another machine, or if you installed it from an RPM or other package built on a machine with IB, then I think your Open MPI will have IB support enabled, even if the machine where it runs has no IB.
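One way to see whether a given installation was built with IB support is to look at the BTL components reported by ompi_info. The following is only a sketch: the helper name list_btl_components is made up for illustration, and it assumes that ompi_info prints one "MCA btl: <name> (...)" line per component, which may differ between Open MPI versions.

```shell
#!/bin/sh
# list_btl_components (hypothetical helper): reads ompi_info output on stdin
# and prints the name of each BTL component, one per line.
# Assumption: component lines look like "MCA btl: openib (MCA v2.0, ...)".
list_btl_components() {
    sed -n 's/^[[:space:]]*MCA btl:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Only query the local installation if ompi_info is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | list_btl_components
fi
```

If openib shows up in the list on a machine without IB, the warning can probably be avoided by excluding that component, e.g. "mpiexec -mca btl ^openib -np 4 ./my_mpi_program" [some shells may need the ^openib quoted].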
This is a matter of taste, but here is what I think, regarding a previous question you sent.
I would rather compile Open MPI from source, on the machine[s] where it will run, and install it with the same path on all machines [or in a single NFS shared directory], to make things simpler.
I would use the most homogeneous set of machines possible, to avoid too many headaches, i.e. use the least common denominator, so to speak.
Say, everything x86_64, all with Ethernet only [or all with IB + Ethernet, but you don't seem to have IB, at least not on all machines].

I hope this helps,
Gus Correa

On Feb 15, 2012, at 1:27 AM, Tohiko Looka wrote:

> Mm... This is really strange
> I don't have that service and there is no ib* output in 'ifconfig -a' or
> 'Infiniband' in 'lspci'
> Which makes me believe that I don't have such a network. I also checked on an
> identical computer on the same network with the same results.
>
> What's strange is that these messages didn't use to show up and they don't
> show up on that identical computer; only on mine. Even though both computers
> have the same hardware, openMPI version and on the same network.
>
> I guess I can safely ignore these warnings and run on Ethernet, but it would
> be nice to know what happened there, in case anybody has an idea.
>
> Thank you,
>
> On Wed, Feb 15, 2012 at 12:52 AM, Gustavo Correa <g...@ldeo.columbia.edu>
> wrote:
> Hi Tohiko
>
> OpenFabrics network a.k.a. Infiniband a.k.a. IB.
> To check if the compute nodes have IB interfaces, try:
>
> lspci [and search the output for Infiniband]
>
> To see if the IB interface is configured try:
>
> ifconfig -a [and search the output for ib0, ib1, or similar]
>
> To check if the OFED module is up try:
>
> 'service openibd status'
>
>
> As an alternative, you could also try to run your program over Ethernet,
> avoiding Infiniband,
> in case you don't have IB or if somehow it is broken.
> It is slower than Infiniband, though.
>
> Try something like this:
>
> mpiexec -mca btl tcp,sm,self -np 4 ./my_mpi_program
>
> I hope this helps,
> Gus Correa
>
> On Feb 14, 2012, at 4:02 PM, Tohiko Looka wrote:
>
> > Sorry for the noob question, but how do I check my network type and if OFED
> > service is running correctly or not? And how do I run it
> >
> > Thank you,
> >
> > On Tue, Feb 14, 2012 at 2:14 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> > Do you have an OpenFabrics-based network? (e.g., InfiniBand or iWarp)
> >
> > If so, this error message usually means that OFED is either installed
> > incorrectly, or is not running properly (e.g., its services didn't get
> > started properly upon boot).
> >
> > If you don't have an OpenFabrics-based network, then it usually means that
> > you have OpenFabrics services running when you really shouldn't (because
> > you don't have any OpenFabrics-based devices).
> >
> >
> > On Feb 14, 2012, at 4:48 AM, Tohiko Looka wrote:
> >
> > > Greetings,
> > >
> > > Until today I was running my openmpi applications with no errors/warnings
> > > Today I restarted my computer (possibly after an automatic openmpi
> > > update) and got these warnings when
> > > running my program
> > > [tohiko@kw12614 1d]$ mpirun -x LD_LIBRARY_PATH -hostfile hosts -np 10 hello
> > > librdmacm: couldn't read ABI version.
> > > librdmacm: assuming: 4
> > > CMA: unable to get RDMA device list
> > > --------------------------------------------------------------------------
> > > [[21652,1],0]: A high-performance Open MPI point-to-point messaging module
> > > was unable to find any relevant network interfaces:
> > >
> > > Module: OpenFabrics (openib)
> > > Host: kw12614
> > >
> > > Another transport will be used instead, although this may result in
> > > lower performance.
> > > --------------------------------------------------------------------------
> > > [kw12614:03195] 10 more processes have sent help message
> > > help-mpi-btl-base.txt / btl:no-nics
> > > [kw12614:03195] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> > > all help / error messages
> > >
> > >
> > > Is this normal? And how come it happened now?
> > > -- Tohiko
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/