Thanks, I'll give it a try!

Lee Manko
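For reference, Ralph's interface-exclusion suggestion below, written out as a full launch command. This is a sketch, not the exact command from the thread: the interface names `eth0`/`eth1` and the hostfile name are assumptions about this particular cluster (check yours with `ip link` or `ifconfig`):

```shell
# Tell both the runtime (oob) and the MPI TCP transport (btl) to skip
# the DHCP/corp NIC -- eth1 is assumed to be that second interface:
mpirun -mca oob_tcp_if_exclude eth1 -mca btl_tcp_if_exclude eth1 \
       -np 4 -hostfile mpi-hostfile ./MPI_Example

# Equivalently, whitelist only the cluster-facing NIC (assumed eth0)
# instead of blacklisting the bad one:
mpirun -mca oob_tcp_if_include eth0 -mca btl_tcp_if_include eth0 \
       -np 4 -hostfile mpi-hostfile ./MPI_Example
```

The include form is often safer on machines whose interface list changes, since anything not explicitly listed is ignored.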
On Tue, Feb 2, 2010 at 10:01 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Probably the easiest solution is to tell OMPI not to use the second NIC.
> For example, if that NIC is eth1, then you could do this:
>
>     mpirun -mca oob_tcp_if_exclude eth1 -mca btl_tcp_if_exclude eth1 ...
>
> This tells both the MPI layer and the RTE to ignore the eth1 interface.
>
>
> On Tue, Feb 2, 2010 at 10:04 AM, Lee Manko <lma...@gblsys.com> wrote:
>
>> Thank you Jody and Ralph. Your suggestions got me up and running (well,
>> sort of). I have run into another issue that I was wondering if you have
>> had any experience with. My server has one NIC that is static and a
>> second that is DHCP on a corp network (the only way to get to the
>> outside world). My scatter/gather process does not work when the second
>> NIC is plugged in, but does work when it is unplugged. It appears to
>> have something to do with DHCP discovery.
>>
>> Any suggestions?
>>
>> Lee Manko
>>
>>
>> On Thu, Jan 28, 2010 at 11:53 AM, Lee Manko <lma...@gblsys.com> wrote:
>>
>>> See, it was a simple thing. Thank you for the information. I am trying
>>> it now; I have to recompile and re-install Open MPI for a heterogeneous
>>> network.
>>>
>>> Now, knowing what to search for, I found that I can set the
>>> configuration of the cluster in a file that mpirun and mpiexec can
>>> read:
>>>
>>>     mpirun --app my_appfile
>>>
>>> where the appfile contains the same --host information. That makes
>>> customizing the cluster for certain applications very easy.
>>>
>>> Thanks for the guidance to this MPI newbie.
>>>
>>> Lee
>>>
>>>
>>> On Wed, Jan 27, 2010 at 11:43 PM, jody <jody....@gmail.com> wrote:
>>>
>>>> Hi
>>>> I'm not sure I completely understood.
>>>> Is it the case that an application compiled on the Dell will not work
>>>> on the PS3, and vice versa?
>>>>
>>>> If this is the case, you could try this:
>>>>
>>>>     shell$ mpirun -np 1 --host a app_ps3 : -np 1 --host b app_dell
>>>>
>>>> where app_ps3 is your application compiled on the PS3 and a is your
>>>> PS3 host, and app_dell is your application compiled on the Dell and b
>>>> is your Dell host.
>>>>
>>>> Check the MPI FAQs:
>>>> http://www.open-mpi.org/faq/?category=running#mpmd-run
>>>> http://www.open-mpi.org/faq/?category=running#mpirun-host
>>>>
>>>> Hope this helps
>>>> Jody
>>>>
>>>> On Thu, Jan 28, 2010 at 3:08 AM, Lee Manko <lma...@gblsys.com> wrote:
>>>> > OK, so please stop me if you have heard this before, but I couldn't
>>>> > find anything in the archives that addressed my situation.
>>>> >
>>>> > I have a Beowulf cluster where ALL the nodes are PS3s running Yellow
>>>> > Dog Linux 6.2, and a host (server) that is a Dell i686 quad-core
>>>> > running Fedora Core 12. After a failed attempt at letting yum
>>>> > install openmpi, I downloaded v1.4.1, then compiled and installed it
>>>> > on all machines (PS3s and Dell). I have an NFS shared directory on
>>>> > the host where the application resides after building. All nodes
>>>> > have access to the shared volume and can see any files in it.
>>>> >
>>>> > I wrote a very simple master/slave application where the slave does
>>>> > a simple computation and gets the processor name. The slave returns
>>>> > both pieces of information to the master, who then simply displays
>>>> > them in the terminal window. After the slaves work on 1024 such
>>>> > tasks, the master exits.
>>>> >
>>>> > When I run on the host, without distributing to the nodes, I use the
>>>> > command:
>>>> >
>>>> >     mpirun -np 4 ./MPI_Example
>>>> >
>>>> > Compiling and running the application on the native hardware works
>>>> > perfectly (i.e., compiled and run on the PS3, or compiled and run on
>>>> > the Dell).
>>>> >
>>>> > However, when I went to scatter the tasks to the nodes using the
>>>> > following command,
>>>> >
>>>> >     mpirun -np 4 -hostfile mpi-hostfile ./MPI_Example
>>>> >
>>>> > the application fails. I'm surmising that the issue is with running
>>>> > code that was compiled for the Dell on the PS3, since the launcher
>>>> > starts the application from the shared volume.
>>>> >
>>>> > So, I took the source code, compiled it on both the Dell and the
>>>> > PS3, placed the executables in /shared_volume/Dell and
>>>> > /shared_volume/PS3, and added the paths to the environment variable
>>>> > PATH. I tried to run the application from the host again using the
>>>> > following command,
>>>> >
>>>> >     mpirun -np 4 -hostfile mpi-hostfile -wdir /shared_volume/PS3 ./MPI_Example
>>>> >
>>>> > hoping that -wdir would set the working directory at launch time so
>>>> > that the PS3 version of the executable would be run.
>>>> >
>>>> > I get the error:
>>>> >
>>>> >     Could not execute the executable "./MPI_Example": Exec format error
>>>> >     This could mean that your PATH or executable name is wrong, or
>>>> >     that you do not have the necessary permissions. Please ensure
>>>> >     that the executable is able to be found and executed.
>>>> >
>>>> > Now, I know I'm gonna get some heat for this, but all of these
>>>> > machines use only the root account with full root privileges, so
>>>> > it's not a permissions issue.
>>>> >
>>>> > I am sure there is a simple solution to my problem. Replacing the
>>>> > host with a PS3 is not an option. Does anyone have any suggestions?
>>>> >
>>>> > Thanks.
>>>> >
>>>> > PS: When I get to programming the Cell BE, then I'll use the IBM
>>>> > Cell SDK with its cross-compiler toolchain.
>>>> >
>>>> > _______________________________________________
>>>> > users mailing list
>>>> > us...@open-mpi.org
>>>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
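Tying the thread together: for the heterogeneous Dell/PS3 case, the MPMD appfile mentioned above can point each host at its architecture-matched build, which avoids the "Exec format error" from launching a Dell binary on a PS3. A sketch only: the hostnames (`dell-host`, `ps3-01`, …) and the rank counts are placeholders, while the /shared_volume paths come from the thread itself:

```shell
# my_appfile -- one line per program/host group in the MPMD launch.
# Substitute your own hostnames from mpi-hostfile.
-np 1 --host dell-host /shared_volume/Dell/MPI_Example
-np 3 --host ps3-01,ps3-02,ps3-03 /shared_volume/PS3/MPI_Example
```

Launched with `mpirun --app my_appfile`, each rank executes the binary built for its own architecture; using absolute paths sidesteps the PATH/-wdir guessing entirely.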