On Apr 13, 2011, at 10:29 AM, Jack Bryan wrote:

> Hi,
> 
> If I cannot ssh to a worker node, does that mean my program cannot work
> correctly?

No, that's not true. People assumed you were on a cluster that uses ssh as the
launcher. From your prior notes, you are using Torque, so not being allowed to
ssh to the nodes is just an admin policy.

> 
> I can run it with 32 nodes * 4 cores/node parallel processes. But with a
> larger number of parallel processes,
> 128 nodes * 1 cpu/node, it is killed by signal 9.
> 
> Is this the reason?

No, it isn't: the ssh restriction does not cause the signal 9.
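
On the original question of following per-process memory use without ssh
access: one option is to have each process report its own usage from inside
the job. The following is only a rough sketch, assuming Linux nodes with
/proc mounted and a program that can be modified or wrapped to call it. The
function names parse_status_kb and memory_report are made up for
illustration; OMPI_COMM_WORLD_RANK is the environment variable Open MPI sets
for each launched process.

```python
import os
import socket


def parse_status_kb(status_text, field):
    """Return the value in kB of a field such as 'VmRSS' from the text of
    /proc/<pid>/status, or None if the field is not present."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            # Lines look like "VmRSS:      123456 kB"
            return int(line.split()[1])
    return None


def memory_report(status_path="/proc/self/status"):
    """Build a one-line report of this process's current (VmRSS) and
    peak (VmHWM) resident memory, tagged with host name and MPI rank."""
    with open(status_path) as f:
        text = f.read()
    # Falls back to "?" when not started by Open MPI's launcher.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "?")
    return "host=%s rank=%s VmRSS=%s kB VmHWM=%s kB" % (
        socket.gethostname(),
        rank,
        parse_status_kb(text, "VmRSS"),
        parse_status_kb(text, "VmHWM"),
    )
```

Calling print(memory_report()) periodically from each rank, with the output
going to the job's stdout or to a per-rank file, would give a memory trace
for every node over the 8 hour run without needing to log in to the nodes.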

> 
> thanks
> 
> > Date: Wed, 13 Apr 2011 05:59:10 -0700
> > From: n...@aol.com
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] OMPI monitor each process behavior
> > 
> > On 4/12/2011 8:55 PM, Jack Bryan wrote:
> > 
> > >
> > > I need to monitor the memory usage of each parallel process on a linux
> > > Open MPI cluster.
> > >
> > > But the top and ps commands cannot help here because they only show
> > > head node information.
> > >
> > > I need to follow the behavior of each process on each cluster node.
> > Did you consider ganglia et al?
> > >
> > > I cannot use ssh to access each node.
> > How can MPI run?
> > >
> > > The program takes 8 hours to finish.
> > 
> > 
> > 
> > -- 
> > Tim Prince
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
