[OMPI users] mpirun links wrong library with BLACS tester
I have installed Open MPI 1.4.1 locally for one user on a cluster where several other MPI implementations are already installed. When I try to run an executable through mpirun (I am running the BLACS tester), I get:

xFbtest_MPI-LINUX-0: error while loading shared libraries: liblam.so.0: cannot open shared object file: No such file or directory

If I run the executable directly, it works. ldd always shows the correct libraries (even when run under mpirun) and no liblam. The environment also looks normal in both cases (both PATH and LD_RUN_PATH have the installation as the first path).

I tried setting -rpath to */lib and */lib/openmpi, and reducing the environment to a basic one and using it in all the shells both when compiling and running, but to no avail. The examples in the openmpi directory seem to work without problems.

I did manage to run the BLACS tester, but not reproducibly: I really don't know what I did to make it work, and the same binary stopped working very quickly. The same setup works on another machine (and I think the BLACS flags are OK).

I am going crazy; any pointer to what else I could try would be greatly appreciated.

gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
G95 (GCC 4.0.3 (g95 0.92!) Jun 24 2009)

Thanks,
Fawzi
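One way to compare what the dynamic loader resolves with and without mpirun is to capture the ldd output in both settings and scan it for unresolved or unexpected libraries. The following is only a minimal sketch, assuming the usual "name => path (address)" layout of ldd output; the function name, the "unwanted" prefix list, and the sample text are hypothetical and not taken from the thread.

```python
def suspicious_libs(ldd_output, unwanted=("liblam",)):
    """Return (name, reason) pairs for unresolved or unwanted libraries."""
    findings = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The library name is the first whitespace-separated token.
        name = line.split()[0]
        if "not found" in line:
            findings.append((name, "not found"))
        elif any(name.startswith(prefix) for prefix in unwanted):
            findings.append((name, "unwanted"))
    return findings


# Illustrative sample only; real text would come from running
# `ldd ./xFbtest_MPI-LINUX-0` directly and again under mpirun.
sample = """\
    libmpi.so.0 => /home/user/openmpi/lib/libmpi.so.0 (0x00002b0000000000)
    liblam.so.0 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00002b0000400000)
"""
print(suspicious_libs(sample))
```

If the two captures differ, the difference would point at the environment that mpirun hands to the launched process rather than at the binary itself.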
Re: [OMPI users] How to start MPI_Spawn child processes early?
My question is why? If you are willing to reserve a chunk of your machine for yet-to-exist tasks, why not just create them all at mpirun time and slice and dice your communicators as appropriate?

On Thu, 2010-01-28 at 09:24 +1100, Jaison Paul wrote:
> Hi, I am just reposting my early query once again. If anyone one can
> give some hint, that would be great.
>
> Thanks, Jaison
> ANU
>
> Jaison Paul wrote:
> > Hi All,
> >
> > I am trying to use MPI for scientific High Performance (hpc)
> > applications. I use MPI_Spawn to create child processes. Is there a
> > way to start child processes early than the parent process, using
> > MPI_Spawn?
> >
> > I want this because, my experiments showed that the time to spawn the
> > children by parent is too long for HPC apps which slows down the whole
> > process. If the children are ready when parent application process
> > seeks for them, that initial delay can be avoided. Is there a way to
> > do that?
> >
> > Thanks in advance,
> >
> > Jaison
> > Australian National University
[OMPI users] Non-homogeneous Cluster Implementation
OK, so please stop me if you have heard this before, but I couldn't find anything in the archives that addressed my situation.

I have a Beowulf cluster where all the nodes are PS3s running Yellow Dog Linux 6.2, and a host (server) that is a Dell i686 quad-core running Fedora Core 12. After a failed attempt at letting yum install openmpi, I downloaded v1.4.1, then compiled and installed it on all machines (PS3s and Dell). I have an NFS shared directory on the host where the application resides after building. All nodes have access to the shared volume and can see any files in it.

I wrote a very simple master/slave application where the slave does a simple computation and gets the processor name. The slave returns both pieces of information to the master, which then simply displays them in the terminal window. After the slaves work on 1024 such tasks, the master exits.

When I run on the host, without distributing to the nodes, I use the command:

mpirun -np 4 ./MPI_Example

Compiling and running the application on the native hardware works perfectly (i.e., compiled and run on the PS3, or compiled and run on the Dell). However, when I went to scatter the tasks to the nodes using the following command,

mpirun -np 4 -hostfile mpi-hostfile ./MPI_Example

the application fails. I'm surmising that the issue is with running code that was compiled for the Dell on the PS3, since MPI_Init will launch the application from the shared volume.

So, I took the source code, compiled it on both the Dell and the PS3, placed the executables in /shared_volume/Dell and /shared_volume/PS3, and added the paths to the environment variable PATH. I tried to run the application from the host again using the following command,

mpirun -np 4 -hostfile mpi-hostfile -wdir /shared_volume/PS3 ./MPI_Example

hoping that -wdir would set the working directory at the time of the call to MPI_Init(), so that MPI_Init would launch the PS3 version of the executable. I get the error:

Could not execute the executable "./MPI_Example" : Exec format error

This could mean that your PATH or executable name is wrong, or that you do not have the necessary permissions. Please ensure that the executable is able to be found and executed.

Now, I know I'm gonna get some heat for this, but all of these machines use only the root account with full root privileges, so it's not a permission issue.

I am sure there is a simple solution to my problem. Replacing the host with a PS3 is not an option. Does anyone have any suggestions?

Thanks.

PS: When I get to programming the Cell BE, then I'll use the IBM Cell SDK with its cross-compiler toolchain.
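An "Exec format error" generally means the kernel was asked to run a binary built for a different architecture, so it can help to check which architecture each copy of the executable actually targets. The sketch below reads the e_machine field from an ELF header; the machine codes are from the ELF specification, while the function names and the synthetic header are hypothetical and only for illustration.

```python
import struct

# e_machine values from the ELF specification (subset).
ELF_MACHINES = {3: "x86 (i386)", 20: "PowerPC", 21: "PowerPC64", 62: "x86-64"}


def elf_machine(header):
    """Return a readable architecture name from the first bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)


def elf_machine_of(path):
    """Read the header of the file at `path` and classify it."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))


# Synthetic 20-byte header for illustration: 32-bit, big-endian, PowerPC.
fake_ppc = b"\x7fELF" + bytes([1, 2, 1]) + bytes(9) + struct.pack(">HH", 2, 20)
print(elf_machine(fake_ppc))
```

Calling elf_machine_of on the copies under /shared_volume/Dell and /shared_volume/PS3 from each node would show whether the node that reports the error is being handed the other architecture's build.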
Re: [OMPI users] How to start MPI_Spawn child processes early?
It sounds to me a bit like asking to be born before your mother. Unless I misunderstand the question...

Douglas.

On Thu, Jan 28, 2010 at 09:24:29AM +1100, Jaison Paul wrote:
> Hi, I am just reposting my early query once again. If anyone one can
> give some hint, that would be great.
>
> Thanks, Jaison
> ANU
>
> Jaison Paul wrote:
>> Hi All,
>>
>> I am trying to use MPI for scientific High Performance (hpc)
>> applications. I use MPI_Spawn to create child processes. Is there a
>> way to start child processes early than the parent process, using
>> MPI_Spawn?
>>
>> I want this because, my experiments showed that the time to spawn the
>> children by parent is too long for HPC apps which slows down the whole
>> process. If the children are ready when parent application process
>> seeks for them, that initial delay can be avoided. Is there a way to
>> do that?
>>
>> Thanks in advance,
>>
>> Jaison
>> Australian National University
Re: [OMPI users] How to start MPI_Spawn child processes early?
I cannot resist:

Jaison -

The MPI_Comm_spawn call specifies what you want to have happen. The child launch is what does happen.

If we can come up with a way to have things happen correctly before we know what it is that we want to have happen, the heck with this HPC stuff. Let's get together and place stock orders on yesterday's market.

Just joking - Ralph's suggestion about launching all parts of the application up front and then using JOIN or ACCEPT/CONNECT will work.

I also agree with his skepticism about the problem. Most applications that are worth running in parallel take long enough that the time it takes to spawn should be barely noticeable. Are you using parallelism for something that only takes a few seconds and, if so, why not just do it with a serial run?

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363

users-boun...@open-mpi.org wrote on 01/27/2010 06:07:43 PM:

> Re: [OMPI users] How to start MPI_Spawn child processes early?
>
> Ralph Castain to: Open MPI Users, 01/27/2010 06:09 PM
> Sent by: users-boun...@open-mpi.org
> Please respond to Open MPI Users
>
> I can't imagine how you would do that - only thing I can think of
> would be to start your "child" processes as one job, then start your
> "parent" processes and have them do an MPI_Comm_join with the child job.
>
> That said, I can't imagine that comm_spawn is -that- slow to make
> much difference to an HPC application! At least, not in anything
> I've measured.
>
> On Jan 27, 2010, at 3:24 PM, Jaison Paul wrote:
>
> > Hi, I am just reposting my early query once again. If anyone one
> > can give some hint, that would be great.
> >
> > Thanks, Jaison
> > ANU
> >
> > Jaison Paul wrote:
> >> Hi All,
> >>
> >> I am trying to use MPI for scientific High Performance (hpc)
> >> applications. I use MPI_Spawn to create child processes. Is there a
> >> way to start child processes early than the parent process, using
> >> MPI_Spawn?
> >>
> >> I want this because, my experiments showed that the time to spawn
> >> the children by parent is too long for HPC apps which slows down the
> >> whole process. If the children are ready when parent application
> >> process seeks for them, that initial delay can be avoided. Is there
> >> a way to do that?
> >>
> >> Thanks in advance,
> >>
> >> Jaison
> >> Australian National University
Re: [OMPI users] How to start MPI_Spawn child processes early?
I can't imagine how you would do that - only thing I can think of would be to start your "child" processes as one job, then start your "parent" processes and have them do an MPI_Comm_join with the child job.

That said, I can't imagine that comm_spawn is -that- slow to make much difference to an HPC application! At least, not in anything I've measured.

On Jan 27, 2010, at 3:24 PM, Jaison Paul wrote:
> Hi, I am just reposting my early query once again. If anyone one can give
> some hint, that would be great.
>
> Thanks, Jaison
> ANU
>
> Jaison Paul wrote:
>> Hi All,
>>
>> I am trying to use MPI for scientific High Performance (hpc) applications. I
>> use MPI_Spawn to create child processes. Is there a way to start child
>> processes early than the parent process, using MPI_Spawn?
>>
>> I want this because, my experiments showed that the time to spawn the
>> children by parent is too long for HPC apps which slows down the whole
>> process. If the children are ready when parent application process seeks for
>> them, that initial delay can be avoided. Is there a way to do that?
>>
>> Thanks in advance,
>>
>> Jaison
>> Australian National University
Re: [OMPI users] How to start MPI_Spawn child processes early?
Hi, I am just reposting my earlier query once again. If anyone can give some hint, that would be great.

Thanks,
Jaison
ANU

Jaison Paul wrote:

Hi All,

I am trying to use MPI for scientific high-performance computing (HPC) applications. I use MPI_Spawn to create child processes. Is there a way to start the child processes earlier than the parent process, using MPI_Spawn?

I want this because my experiments showed that the time the parent takes to spawn the children is too long for HPC apps, which slows down the whole process. If the children are already running when the parent application process looks for them, that initial delay can be avoided. Is there a way to do that?

Thanks in advance,

Jaison
Australian National University
Re: [OMPI users] How to check OMPI is using IB or not?
You could also rule Ethernet (TCP) out. E.g.,

mpirun --mca btl self,openib ./a.out

Or, if you wanted the opposite (Ethernet/TCP, but not IB), then

mpirun --mca btl self,tcp ./a.out

> If an infiniband network is configured successfully, how to confirm
> that Open MPI is using infiniband, not other ethernet network
> available?
Re: [OMPI users] How to check OMPI is using IB or not?
Thanks, Brett, for the useful information.

On Wed, Jan 27, 2010 at 12:40 PM, Brett Pemberton wrote:
> - "Sangamesh B" wrote:
>
> > Hi all,
> >
> > If an infiniband network is configured successfully, how to confirm
> > that Open MPI is using infiniband, not other ethernet network
> > available?
>
> At a low level simplistic way, how about:
>
> [root@tango003 ~]# lsof | grep /dev/infiniband
> namd2 7271 weimin mem CHR 231,192 8306 /dev/infiniband/uverbs0
> namd2 7271 weimin 13u CHR 231,192 8306 /dev/infiniband/uverbs0
> ...
>
> Here i can see that the namd that I compiled with openmpi is using IB.
>
> cheers,
>
> / Brett
>
> --
> Brett Pemberton - VPAC HPC Team Leader
> http://www.vpac.org/ - (03) 9925 4899
Re: [OMPI users] How to check OMPI is using IB or not?
- "Sangamesh B" wrote:

> Hi all,
>
> If an infiniband network is configured successfully, how to confirm
> that Open MPI is using infiniband, not other ethernet network
> available?

At a low level simplistic way, how about:

[root@tango003 ~]# lsof | grep /dev/infiniband
namd2 7271 weimin mem CHR 231,192 8306 /dev/infiniband/uverbs0
namd2 7271 weimin 13u CHR 231,192 8306 /dev/infiniband/uverbs0
...

Here i can see that the namd that I compiled with openmpi is using IB.

cheers,

/ Brett

--
Brett Pemberton - VPAC HPC Team Leader
http://www.vpac.org/ - (03) 9925 4899
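The lsof check above can be turned into a small filter that lists which processes hold an InfiniBand device open. This is only a rough sketch, assuming lsof's default layout with the command name first, the PID second, and the file name last; the function name is hypothetical, and the sample lines are adapted from the output quoted in this thread plus one made-up non-IB line.

```python
def infiniband_users(lsof_output):
    """Map PID -> command name for processes holding /dev/infiniband devices."""
    users = {}
    for line in lsof_output.splitlines():
        fields = line.split()
        # Expect at least COMMAND PID USER ... NAME, with NAME as the last field.
        if len(fields) < 4 or not fields[-1].startswith("/dev/infiniband/"):
            continue
        if fields[1].isdigit():
            users[int(fields[1])] = fields[0]
    return users


# Sample lines adapted from the lsof output quoted in this thread;
# the sshd line is invented to show a non-matching entry.
sample = """\
namd2 7271 weimin mem CHR 231,192 8306 /dev/infiniband/uverbs0
namd2 7271 weimin 13u CHR 231,192 8306 /dev/infiniband/uverbs0
sshd 812 root 3u IPv4 5120 0t0 TCP *:ssh (LISTEN)
"""
print(infiniband_users(sample))
```

As with the manual check, an open uverbs device only shows that the process opened the verbs interface; it does not by itself show that all of the job's traffic goes over InfiniBand.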
[OMPI users] How to check OMPI is using IB or not?
Hi all,

If an InfiniBand network is configured successfully, how can I confirm that Open MPI is using InfiniBand and not another available Ethernet network?

In earlier versions, I've seen that if OMPI was running on Ethernet, it gave a warning that it was running on a slower network. Is this available in version 1.3.3 also?

The Linux command "netstat" does not confirm this, as OMPI works over RDMA.

Thanks