[OMPI users] Problem in remote nodes

2010-03-17 Thread uriz . 49949
Hi everyone I'm a new Open MPI user and I have just installed Open MPI in a 6 nodes cluster with Scientific Linux. When I execute it in local it works perfectly, but when I try to execute it on the remote nodes with the --host option it hangs and gives no message. I think that the problem could be

Re: [OMPI users] Problem in remote nodes

2010-03-17 Thread Jeff Squyres
On Mar 17, 2010, at 4:39 AM, wrote: > Hi everyone I'm a new Open MPI user and I have just installed Open MPI in > a 6 nodes cluster with Scientific Linux. When I execute it in local it > works perfectly, but when I try to execute it on the remote nodes with the > --host option it hangs and gives

Re: [OMPI users] Problem in remote nodes

2010-03-17 Thread Fernando Lemos
On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres wrote: > On Mar 17, 2010, at 4:39 AM, wrote: > >> Hi everyone I'm a new Open MPI user and I have just installed Open MPI in >> a 6 nodes cluster with Scientific Linux. When I execute it in local it >> works perfectly, but when I try to execute it on t

Re: [OMPI users] Problem in remote nodes

2010-03-19 Thread uriz . 49949
The processes are running on the remote nodes but they don't give the response to the origin node. I don't know why. With the option --mca btl_base_verbose 30, I have the same problems and it doesn't show any message. Thanks > On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres wrote: >> On Mar 17, 20

Re: [OMPI users] Problem in remote nodes

2010-03-19 Thread Ralph Castain
Did you configure OMPI with --enable-debug? You should do this so that more diagnostic output is available. You can also add the following to your cmd line to get more info: --debug --debug-daemons --leave-session-attached Something is likely blocking proper launch of the daemons and processes

Re: [OMPI users] Problem in remote nodes

2010-03-30 Thread uriz . 49949
I've been investigating and there is no firewall that could stop TCP traffic in the cluster. With the option --mca plm_base_verbose 30 I get the following output: [itanium1] /home/otro > mpirun --mca plm_base_verbose 30 --host itanium2 helloworld.out [itanium1:08311] mca: base: components_open: Lo

Re: [OMPI users] Problem in remote nodes

2010-03-30 Thread Ralph Castain
Looks to me like you have an error in your cmd line - you aren't specifying the number of procs to run. My guess is that the system is hanging trying to resolve the process map as a result. Try adding "-np 1" to the cmd line. The output indicates it is dropping slurm because it doesn't see a slu
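
For anyone following along, the command-line suggestions made so far in this thread (an explicit "-np", "--mca plm_base_verbose 30", "--debug-daemons", "--leave-session-attached") can be collected into a single invocation. Below is a minimal sketch that only assembles and prints the command; it does not run mpirun. The helper name build_mpirun_cmd and its parameters are made up for illustration; the flags, the host name itanium2 and the program helloworld.out are the ones quoted in the messages above.

```python
import shlex


def build_mpirun_cmd(host, program, np=1, plm_verbose=None, debug_daemons=False):
    """Assemble an mpirun argument list from the options discussed in this thread.

    Hypothetical helper for illustration only; it is not part of Open MPI.
    """
    cmd = ["mpirun", "-np", str(np), "--host", host]
    if plm_verbose is not None:
        # Verbosity level for the process launch framework, as used earlier in the thread.
        cmd += ["--mca", "plm_base_verbose", str(plm_verbose)]
    if debug_daemons:
        # Keep daemon output attached to the terminal, as suggested earlier in the thread.
        cmd += ["--debug-daemons", "--leave-session-attached"]
    cmd.append(program)
    return cmd


if __name__ == "__main__":
    argv = build_mpirun_cmd("itanium2", "helloworld.out", np=1,
                            plm_verbose=30, debug_daemons=True)
    # Print a copy-pasteable command line instead of executing it.
    print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to try on a machine without Open MPI; the resulting line can be pasted into a shell on the head node.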

Re: [OMPI users] Problem in remote nodes

2010-03-30 Thread Robert Collyer
I've been having similar problems using Fedora core 9. I believe the issue may be with SELinux, but this is just an educated guess. In my setup, shortly after a login via mpi, there is a notation in the /var/log/messages on the compute node as follows: Mar 30 12:39:45 kernel: type=1400 audi

Re: [OMPI users] Problem in remote nodes

2010-03-30 Thread Robert Collyer
I changed the SELinux config to permissive (log only), and it didn't change anything. Back to the drawing board. Robert Collyer wrote: I've been having similar problems using Fedora core 9. I believe the issue may be with SELinux, but this is just an educated guess. In my setup, shortly aft

Re: [OMPI users] Problem in remote nodes

2010-03-31 Thread uriz . 49949
I've been checking the /var/log/messages on the compute node and there is nothing new after executing ' mpirun --host itanium2 -np 2 helloworld.out', but in the /var/log/messages file on the remote node it appears the following messages, nothing about unix_chkpwd. Mar 31 11:56:51 itanium2 sshd(pam

Re: [OMPI users] Problem in remote nodes

2010-03-31 Thread Jeff Squyres (jsquyres)
trying to open some tcp sockets back). Can you open random tcp sockets between your nodes? (E.g., in non-mpi processes) -jms Sent from my PDA. No type good. - Original Message - From: users-boun...@open-mpi.org To: Open MPI Users Sent: Wed Mar 31 06:25:43 2010 Subject: Re: [OMPI user
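
One way to try the "random tcp sockets" check with non-MPI processes is a tiny listener/prober pair. The following is only a sketch under assumptions: the function names (open_listener, serve_once, probe) and the "ok" reply are invented for this example and have nothing to do with Open MPI itself. On a real cluster the listener would run on one node and the probe on another, using whatever port the listener reports; the self-test at the bottom only exercises the loopback interface.

```python
import socket
import threading


def open_listener(host="0.0.0.0", port=0):
    """Bind a listening TCP socket; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def serve_once(srv):
    """Accept a single connection, send a short reply, then close everything."""
    conn, _addr = srv.accept()
    with conn:
        conn.sendall(b"ok")
    srv.close()


def probe(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds and replies b'ok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            data = b""
            while True:
                chunk = s.recv(16)
                if not chunk:  # peer closed the connection
                    break
                data += chunk
            return data == b"ok"
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def loopback_self_test():
    """Run listener and probe in one process over 127.0.0.1."""
    srv = open_listener("127.0.0.1", 0)
    port = srv.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(srv,), daemon=True)
    worker.start()
    ok = probe("127.0.0.1", port)
    worker.join(timeout=5.0)
    return ok


if __name__ == "__main__":
    print("loopback reachable:", loopback_self_test())
```

If the probe returns False between two nodes while the loopback self-test succeeds on each of them, that would point toward something between the nodes (firewall rules, routing, interface selection) rather than toward MPI.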

Re: [OMPI users] Problem in remote nodes

2010-03-31 Thread Jeff Squyres
On Mar 30, 2010, at 4:28 PM, Robert Collyer wrote: > I changed the SELinux config to permissive (log only), and it didn't > change anything. Back to the drawing board. I'm afraid I have no experience with SELinux -- I don't know what it restricts. Generally, you need to be able to run processe

Re: [OMPI users] Problem in remote nodes

2010-04-07 Thread Robert Collyer
es) -jms Sent from my PDA. No type good. - Original Message - From: users-boun...@open-mpi.org To: Open MPI Users Sent: Wed Mar 31 06:25:43 2010 Subject: Re: [OMPI users] Problem in remote nodes I've been checking the /var/log/messages on the compute node and there is nothing new