Re: [OMPI users] Help on the big picture..
this is a mailing list, and some of us are new, others older and more experienced. The new ones might not know the protocol commonly used, but we should at least treat each other in a friendlier way, without judging the interests of others ahead of time, because you are wrong: all the answers received were useful for me.

thanks...
Cristobal

On Fri, Jul 23, 2010 at 10:27 PM, Tim Prince wrote:
> On 7/22/2010 4:11 PM, Gus Correa wrote:
>>
>> Hi Cristobal
>>
>> Cristobal Navarro wrote:
>>>
>>> yes,
>>> i was aware of the big difference hehe.
>>>
>>> now that OpenMP and Open MPI are in the talk, i've always wondered if it's a
>>> good idea to model a solution in the following way, using both OpenMP
>>> and Open MPI.
>>> suppose you have n nodes, each node has a quadcore (so you have n*4
>>> processors):
>>> launch n processes according to the n nodes available.
>>> set a resource manager like SGE to fill the n*4 slots using round robin.
>>> on each process, make use of the other cores available on the node
>>> with OpenMP.
>>>
>>> if this is possible, then each node could make use of the shared
>>> memory model locally, avoiding unnecessary I/O through the
>>> network. what do you think?
>>>
> Before asking what we think about this, please check the many references
> posted on this subject over the last decade. Then refine your question to
> what you are interested in hearing about; evidently you have no interest in
> much of this topic.
>>
>> Yes, it is possible, and many of the atmosphere/oceans/climate codes
>> that we run are written with this capability. In other areas of
>> science and engineering this is probably the case too.
>>
>> However, this is not necessarily better/faster/simpler than dedicating all
>> the cores to MPI processes.
>>
>> In my view, this is due to:
>>
>> 1) OpenMP has a different scope than MPI,
>> and to some extent is limited by more stringent requirements than MPI;
>>
>> 2) Most modern MPI implementations (and Open MPI is an example) use shared
>> memory mechanisms to communicate between processes that reside
>> on a single physical node/computer;
>
> The shared memory communication of several MPI implementations does greatly
> improve the efficiency of message passing among ranks assigned to the same node.
> However, these ranks also communicate with ranks on other nodes, so there
> is a large potential advantage for hybrid MPI/OpenMP as the number of cores
> in use increases. If you aren't interested in running on more than 8 nodes
> or so, perhaps you won't care about this.
>>
>> 3) Writing hybrid code with MPI and OpenMP requires more effort,
>> and much care so as not to let the two forms of parallelism step on
>> each other's toes.
>
> The MPI standard specifies the use of MPI_Init_thread to indicate which
> combination of MPI and threading you intend to use, and to inquire whether
> that model is supported by the active MPI.
> In the case where there is only 1 MPI process per node (possibly using
> several cores via OpenMP threading) there is no requirement for special
> affinity support.
> If there is more than 1 FUNNELED rank per multi-CPU node, it becomes
> important to maintain cache locality for each rank.
>>
>> OpenMP operates mostly through compiler directives/pragmas interspersed
>> in the code. For instance, you can parallelize inner loops in no time,
>> granted that there are no data dependencies across the commands within the
>> loop. All it takes is to write one or two directive/pragma lines.
>> More than loop parallelization can be done with OpenMP, of course,
>> although not as much as can be done with MPI.
>> Still, with OpenMP you are restricted to working in a shared memory
>> environment.
>>
>> By contrast, MPI requires more effort to program, but it takes advantage
>> of shared memory and networked environments
>> (and perhaps extended grids too).
>>
>
> snipped tons of stuff rather than attempt to reconcile top postings
>
> --
> Tim Prince
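To make the hybrid model discussed above a bit more concrete, here is a minimal sketch of the FUNNELED style Tim describes: one MPI rank per node, OpenMP threads using the cores inside the node, and only the master thread making MPI calls. This is only an illustration (the file name hybrid.c and the loop are made up, not anything from the thread); it should build with something like "mpicc -fopenmp hybrid.c -o hybrid".

/* hybrid.c - minimal MPI + OpenMP sketch (FUNNELED threading model) */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;
    long local = 0, global = 0;

    /* Ask for FUNNELED: threads may exist, but only the thread that called
       MPI_Init_thread makes MPI calls. Check what the library actually granted. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED)
        fprintf(stderr, "warning: MPI_THREAD_FUNNELED not available\n");

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Shared-memory parallelism inside the node: OpenMP over a local loop. */
    #pragma omp parallel for reduction(+:local)
    for (long i = 0; i < 1000000; i++)
        local += 1;

    /* Message passing across nodes: only the master thread calls MPI here. */
    MPI_Reduce(&local, &global, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d ranks x %d threads, total = %ld\n",
               nranks, omp_get_max_threads(), global);

    MPI_Finalize();
    return 0;
}

To launch it with one rank per node and, say, four threads each, something along the lines of "mpiexec -np <nnodes> -npernode 1 -x OMP_NUM_THREADS=4 ./hybrid" usually works, though flag names vary a bit between Open MPI versions, so check "mpiexec --help".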
Re: [OMPI users] mpiexec hangs - new install
I don't think that could be the problem. I can ssh between machines, have a
couple of common directories shared with NFS, etc. And Open MPI runs (or
starts, anyway) under ssh, doesn't it?

James

On Fri, 23 Jul 2010 14:17:48 -0700, Ralph Castain wrote:
> Check for a firewall blocking tcp communications - that's the most common
> issue.
>
> On Jul 23, 2010, at 3:05 PM, James wrote:
>
>> Hi,
>>
>> I am trying to get OpenMPI running on my home network. This has two
>> machines, t61 and quad, both running SuSE 11. I'm using the "hello_c"
>> program from the examples as a test. It will run fine on each machine,
>> using whatever number of processes I specify. However, when I try to
>> run on multiple machines, it hangs.
>>
>> If I start from t61 with the command "mpiexec -host t61,quad -np 2 hello"
>> then I see that command when I do a ps -ax on t61. On quad I see
>> "orted --daemonize (long parameter string)". Both of them seem to be
>> silently waiting on some event, but I've no idea what.
>>
>> Both machines are running OpenMPI 1.4.2 (compiled from the same tar file),
>> installed in /opt/openmpi. The executables are in the same user/path
>> on each machine (/home/me/src/openmpi/examples), and PATH,
>> LD_LIBRARY_PATH, and so on all seem the same.
>>
>> Any suggestions?
>>
>> Thanks,
>> James
>>
>> PS: Also, may I suggest putting something in the FAQ pointing out
>> that the environment vars need to be set in .tcshrc, not .login?
>> It would have saved me several hours.
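A quick way to test Ralph's suggestion before changing anything is to see whether an arbitrary high TCP port on one machine is reachable from the other. A sketch, assuming the hostnames above; the port 36525 is arbitrary (Open MPI normally picks its TCP ports dynamically), and some netcat variants want "-l -p <port>" instead of "-l <port>":

# on quad: listen on a test port
nc -l 36525

# on t61: try to connect to it
nc -z quad 36525 && echo reachable || echo blocked

If the connection is blocked even though ssh works, a firewall rule between the nodes is the likely culprit, since ssh (port 22) is usually open while the high ports Open MPI uses are not.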
Re: [OMPI users] mpiexec hangs - new install
OK, that's the problem. I turned the firewall off on both machines, and it
works.

Now the question: how do I fix it? I searched through the archives and found
that it seems to be a pretty common problem. Unfortunately, I didn't see a
solution that I could understand. (I'm not a sysadmin, just a person trying
to do some programming.)

I have a couple of machines on a local net, with IP addresses in the
192.168.10.1xx range. There's a router at 192.168.10.1, which is connected
to the internet via a cable modem. So how do I set up my system so that my
local machines can do whatever talking between themselves is needed by
OpenMPI, while still having a firewall between my system and the outside
world?

Thanks,
James

PS: Hate to kvetch, but wouldn't it save a lot of wasted time if basic
problems like this were addressed in the FAQ?

On Fri, 23 Jul 2010 14:17:48 -0700, Ralph Castain wrote:
> Check for a firewall blocking tcp communications - that's the most common
> issue.
>
> On Jul 23, 2010, at 3:05 PM, James wrote:
>
>> Hi,
>>
>> I am trying to get OpenMPI running on my home network. This has two
>> machines, t61 and quad, both running SuSE 11. I'm using the "hello_c"
>> program from the examples as a test. It will run fine on each machine,
>> using whatever number of processes I specify. However, when I try to
>> run on multiple machines, it hangs.
>>
>> If I start from t61 with the command "mpiexec -host t61,quad -np 2 hello"
>> then I see that command when I do a ps -ax on t61. On quad I see
>> "orted --daemonize (long parameter string)". Both of them seem to be
>> silently waiting on some event, but I've no idea what.
>>
>> Both machines are running OpenMPI 1.4.2 (compiled from the same tar file),
>> installed in /opt/openmpi. The executables are in the same user/path
>> on each machine (/home/me/src/openmpi/examples), and PATH,
>> LD_LIBRARY_PATH, and so on all seem the same.
>>
>> Any suggestions?
>>
>> Thanks,
>> James
>>
>> PS: Also, may I suggest putting something in the FAQ pointing out
>> that the environment vars need to be set in .tcshrc, not .login?
>> It would have saved me several hours.
Re: [OMPI users] mpiexec hangs - new install
On Jul 24, 2010, at 4:40 PM, James wrote:

> OK, that's the problem. I turned the firewall off on both machines, and
> it works.
>
> Now the question: how do I fix it? I searched through the archives and
> found that it seems to be a pretty common problem. Unfortunately, I didn't
> see a solution that I could understand. (I'm not a sysadmin, just a person
> trying to do some programming.)
>
> I have a couple of machines on a local net, with IP addresses in the
> 192.168.10.1xx range. There's a router at 192.168.10.1, which is connected
> to the internet via a cable modem. So how do I set up my system so my
> local machines can do whatever talking between themselves is needed by
> OpenMPI, while still having a firewall between my system and the outside
> world?

Most routers provide their own internal-to-external firewall - you might
check its setup and see. If it does, then you don't need to also have one on
your individual machines.

> Thanks,
> James
>
> PS: Hate to kvetch, but wouldn't it save a lot of wasted time if basic
> problems like this were addressed in the FAQ?

Yes, it probably should be. However, a simple search for "firewall" on the
user mailing list provides lots of info on how to deal with this issue.

> On Fri, 23 Jul 2010 14:17:48 -0700, Ralph Castain wrote:
>
>> Check for a firewall blocking tcp communications - that's the most common
>> issue.
>>
>> On Jul 23, 2010, at 3:05 PM, James wrote:
>>
>>> Hi,
>>>
>>> I am trying to get OpenMPI running on my home network. This has two
>>> machines, t61 and quad, both running SuSE 11. I'm using the "hello_c"
>>> program from the examples as a test. It will run fine on each machine,
>>> using whatever number of processes I specify. However, when I try to
>>> run on multiple machines, it hangs.
>>>
>>> If I start from t61 with the command "mpiexec -host t61,quad -np 2 hello"
>>> then I see that command when I do a ps -ax on t61. On quad I see
>>> "orted --daemonize (long parameter string)". Both of them seem to be
>>> silently waiting on some event, but I've no idea what.
>>>
>>> Both machines are running OpenMPI 1.4.2 (compiled from the same tar file),
>>> installed in /opt/openmpi. The executables are in the same user/path
>>> on each machine (/home/me/src/openmpi/examples), and PATH,
>>> LD_LIBRARY_PATH, and so on all seem the same.
>>>
>>> Any suggestions?
>>>
>>> Thanks,
>>> James
>>>
>>> PS: Also, may I suggest putting something in the FAQ pointing out
>>> that the environment vars need to be set in .tcshrc, not .login?
>>> It would have saved me several hours.
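One way to get both - MPI traffic allowed between the local machines, the outside world still filtered - is to trust only the LAN subnet on each node. A minimal sketch, assuming the LAN interface is eth0 (an assumption; check with /sbin/ifconfig) and that the subnet behind the router is 192.168.10.0/24:

# run as root on each node
# accept anything arriving on the LAN interface from the local subnet
iptables -I INPUT -i eth0 -s 192.168.10.0/24 -j ACCEPT
# verify the rule landed at the top of the INPUT chain
iptables -L INPUT -n --line-numbers

Rules added this way do not survive a reboot on their own; on SuSE the persistent place for the same idea is usually the SuSEfirewall2 configuration (marking the LAN interface or subnet as internal/trusted), or you can simply rely on the router's firewall as Ralph suggests and leave the host firewalls off.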