Re: [OMPI users] Question about checkpoint/restart protocol

2009-11-05 Thread Sergio Díaz
Hi, Did you load the BLCR modules before compiling OpenMPI? Regards, Sergio Mohamed Adel wrote: Dear OMPI users, I'm a new OpenMPI user. I've configured openmpi-1.3.3 with those options "./configure --prefix=/home/mab/openmpi-1.3.3 --with-sge --enable-ft-thread --with-ft=cr --enable-mpi-

Re: [OMPI users] Question about checkpoint/restart protocol

2009-11-05 Thread Mohamed Adel
Dear Sergio, Thank you for your reply. I've inserted the modules into the kernel and it all worked fine. But there is still a weird issue. I use the command "mpirun -n 2 -am ft-enable-cr -H comp001 checkpoint-restart-test" to start an MPI job. I then use "ompi-checkpoint PID" to checkpoint

[OMPI users] Mac OSX 10.6 (SL) + openMPI 1.3.3 + Intel Compilers 11.1.076

2009-11-05 Thread Christophe Peyret
Hello, I'm trying to launch a job with mpirun on my Mac Pro and I have a strange error message, any idea ? Christophe [santafe.onera:00235] orte:plm:xgrid: Connection to XGrid controller unexpectedly closed: (600) The operation couldn’t be completed. (BEEP error 600.) 2009-11-05 13:13:5

Re: [OMPI users] Mac OSX 10.6 (SL) + openMPI 1.3.3 + Intel Compilers 11.1.076

2009-11-05 Thread Jeff Squyres
I'm afraid that Open MPI v1.3.x's xgrid support is currently broken -- we haven't had anyone with the knowledge or experience available to fix it. :-( Patches would be welcome... Note that Open MPI itself works fine on Snow Leopard -- it's just the xgrid launching support that is broken.

Re: [OMPI users] Mac OSX 10.6 (SL) + openMPI 1.3.3 + Intel Compilers 11.1.076

2009-11-05 Thread Christophe Peyret
How can I deactivate Xgrid launching in order to be able to use open-mpi under Snow Leopard? On 5 Nov 2009, at 13:18, Christophe Peyret wrote: Hello, I'm trying to launch a job with mpirun on my Mac Pro and I have a strange error message, any idea? Christophe [santafe.onera:0023

Re: [OMPI users] Mac OSX 10.6 (SL) + openMPI 1.3.3 + Intel Compilers 11.1.076

2009-11-05 Thread Jeff Squyres
On Nov 5, 2009, at 9:00 AM, Christophe Peyret wrote: How can I deactivate Xgrid launching in order to be able to use open-mpi under snow leopard ? Easiest way is to just remove the xgrid plugin: rm where_you_installed_ompi/lib/openmpi/mca_*xgrid* -- Jeff Squyres jsquy...@cisco.com
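
The removal step suggested above can be sketched with a dry run first, so the matched files can be reviewed before anything is deleted. This is only an illustration: the function names are hypothetical, and the only thing taken from the thread is the glob pattern lib/openmpi/mca_*xgrid* under the install prefix.

```python
import glob
import os


def find_xgrid_plugins(prefix):
    """Return files matching lib/openmpi/mca_*xgrid* under an install prefix."""
    pattern = os.path.join(prefix, "lib", "openmpi", "mca_*xgrid*")
    return sorted(glob.glob(pattern))


def remove_xgrid_plugins(prefix, dry_run=True):
    """Return the matched plugin files; delete them only when dry_run is False."""
    matches = find_xgrid_plugins(prefix)
    for path in matches:
        if not dry_run:
            os.remove(path)
    return matches
```

Calling remove_xgrid_plugins(prefix) with the default dry_run=True only reports what would be removed; passing dry_run=False performs the deletion.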

[OMPI users] Help: Firewall problems

2009-11-05 Thread Lee Amy
Hi, As I recall, MPI does not depend on TCP/IP, so why do the default iptables rules prevent MPI programs from running? After I stop iptables, the programs run well. I use Ethernet as the connection. Could anyone give me tips on fixing this problem? Thank you very much. Amy

[OMPI users] Segmentation fault whilst running RaXML-MPI

2009-11-05 Thread Nick Holway
Dear all, I'm trying to run RaXML 7.0.4 on my 64-bit Rocks 5.1 cluster (i.e. CentOS 5.2). I compiled Open MPI 1.3.3 using the Intel compilers v 11.1.056 using ./configure CC=icc CXX=icpc F77=ifort FC=ifort --with-sge --prefix=/usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man --with-memory-manager=none.

Re: [OMPI users] Segmentation fault whilst running RaXML-MPI

2009-11-05 Thread Jeff Squyres
FWIW, I think Intel released 11.1.059 earlier today (I've been trying to download it all morning). I doubt it's an issue in this case, but I thought I'd mention it as a public service announcement. ;-) Seg faults are *usually* an application issue (never say "never", but they *usually* ar

Re: [OMPI users] Help: Firewall problems

2009-11-05 Thread Terry Dontje
Technically, the MPI spec may not require TCP/IP; however, Open MPI's runtime environment needs some way to launch jobs and pass data around in a standard way, and it currently uses TCP/IP. That being said, there have been rumblings for some time to use other protocols but that has not ye

Re: [OMPI users] Help: Firewall problems

2009-11-05 Thread Jeff Squyres
On Nov 5, 2009, at 11:28 AM, Lee Amy wrote: As I recall, MPI does not depend on TCP/IP, so why do the default iptables rules prevent MPI programs from running? After I stop iptables, the programs run well. I use Ethernet as the connection. Note that Open MPI *can* use TCP as an interface for MPI mess
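
Since the replies in this thread point at TCP traffic between nodes, a quick way to see whether a firewall is in the way is to try a plain TCP connection from one node to another. The sketch below is a generic check, not anything Open MPI provides; the function name is hypothetical, and the host and port to probe are assumptions you would have to supply (something must be listening on that port for the check to succeed).

```python
import socket


def tcp_port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A False result means either that nothing is listening on that port or that the connection was blocked on the way, so it is most useful when compared with and without the firewall rules in place.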

[OMPI users] Openmpi on Heterogeneous environment

2009-11-05 Thread Yogesh Aher
Dear Open-mpi users, I have installed openmpi on 2 different machines with different architectures (INTEL and x86_64) separately (command: ./configure --enable-heterogeneous). Compiled executables of the same code for these 2 arch. Kept these executables on individual machines. Prepared a hostfile

Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-05 Thread Pallab Datta
I have had issues running across platforms, i.e. Mac OS X and Linux (Ubuntu), and haven't got them resolved. Check whether a firewall is blocking any communication. > Dear Open-mpi users, > > I have installed openmpi on 2 different machines with different > architectures (INTEL and x86_64) separately

[OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread qing pang
Dear Sir/Madam, I'm having a problem running the example program. Please kindly help --- I've been fooling with it for days, kind of getting lost. - MPIRUN fails on example hello program - unable to launch the specified ap

Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Jeff Squyres
The short version of the answer is to check to see that the executable is in the same location on both nodes (apparently: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out). Open MPI is complaining that it can't find that specific executable on the .194 node. See below for more detai
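
The check described above (is the executable at the same path on every node?) can be sketched as a small helper that asks each host over ssh. This is an illustration only: the function names are hypothetical, it assumes passwordless ssh to each host and a POSIX `test` command on the remote side, and the host names and path are whatever you pass in.

```python
import shlex
import subprocess


def build_check_command(host, path):
    """Build an ssh command that tests whether path is executable on host."""
    return ["ssh", host, "test -x " + shlex.quote(path)]


def missing_on_hosts(hosts, path, run=subprocess.run):
    """Return hosts where the check failed (file absent, not executable, or ssh error)."""
    missing = []
    for host in hosts:
        result = run(build_check_command(host, path))
        if result.returncode != 0:
            missing.append(host)
    return missing
```

An empty list from missing_on_hosts means every host reported the file as present and executable at that path; the `run` parameter exists only so the ssh call can be replaced when trying the helper out.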

Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Qing Pang
Thank you Jeff! That solves the problem. :-) You are the lifesaver! So does that mean I always need to copy my application to all the nodes? Or should I give the pathname of my executable in a different way to avoid this? Do I need a network file system for that? Jeff Squyres wrote: The

Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Jeff Squyres
On Nov 5, 2009, at 4:15 PM, Qing Pang wrote: Thank you Jeff! That solves the problem. :-) You are the lifesaver! So does that mean I always need to copy my application to all the nodes? Or should I give the pathname of my executable in a different way to avoid this? Do I need a network f

Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Douglas Guptill
On Thu, Nov 05, 2009 at 03:15:33PM -0600, Qing Pang wrote: > Thank you Jeff! That solves the problem. :-) You are the lifesaver! > So does that mean I always need to copy my application to all the > nodes? Or should I give the pathname of my executable in a different > way to avoid this?

Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Jeff Squyres
On Nov 5, 2009, at 5:34 PM, Douglas Guptill wrote: I am currently using sshfs to mount both OpenMPI and my application on the "other" computers/nodes. The advantage to this is that I have only one copy of OpenMPI and my application. There may be a performance penalty, but I haven't seen it yet

Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Terry Frankcombe
For small ad hoc COWs I'd vote for sshfs too. It may well be as slow as a dog, but it actually has some security, unlike NFS, and is a doddle to make work with no superuser access on the server, unlike NFS. On Thu, 2009-11-05 at 17:53 -0500, Jeff Squyres wrote: > On Nov 5, 2009, at 5:34 PM, Doug