Re: [OMPI users] Error on running large number of processes

2007-11-15 Thread Jeff Squyres
My guess is that this is similar to the last post: you are
oversubscribing the nodes so heavily that the OS is running out of
some resource (perhaps regular or registered memory?), so Open MPI
is unable to set up its network transport layers properly.
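
To put rough numbers on the oversubscription, here is a minimal sketch
in Python. It only computes the processes-per-CPU ratio for the two runs
described in the quoted message and prints a few per-process soft limits
on the node where it runs. The helper names (procs_per_cpu, soft_limits)
and the choice of limits to inspect are my own assumptions for
illustration; nothing in the error message says which resource was
actually exhausted, and the locked-memory limit is listed only because
it is commonly associated with registered memory.

```python
import resource  # Unix-only standard library module


def procs_per_cpu(num_procs, num_cpus):
    """Return how many MPI processes share each CPU."""
    if num_cpus <= 0:
        raise ValueError("num_cpus must be positive")
    return num_procs / num_cpus


def soft_limits():
    """Collect a few soft resource limits for the current process.

    The set of limits is a guess at what might matter here; a limit
    that this platform does not define is skipped.
    """
    candidates = {
        "open files": "RLIMIT_NOFILE",
        "locked memory (bytes)": "RLIMIT_MEMLOCK",
        "address space (bytes)": "RLIMIT_AS",
    }
    limits = {}
    for label, attr in candidates.items():
        rlimit = getattr(resource, attr, None)
        if rlimit is None:
            continue  # not available on this platform
        soft, _hard = resource.getrlimit(rlimit)
        limits[label] = "unlimited" if soft == resource.RLIM_INFINITY else soft
    return limits


if __name__ == "__main__":
    # The two runs from the report: 100 and 200 processes on 16 CPUs.
    for n in (100, 200):
        print(f"{n} processes on 16 CPUs: {procs_per_cpu(n, 16):.2f} per CPU")
    for label, value in soft_limits().items():
        print(f"{label}: {value}")
```

The 100-process run works out to 6.25 processes per CPU and the
200-process run to 12.5. Running the script inside a PBS job (rather
than on the login node) would show the limits the MPI processes
actually inherit, which may differ from an interactive shell.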



On Nov 15, 2007, at 6:35 AM, Clement Kam Man Chu wrote:


Hi,

   I am using Open MPI 1.2.3 on an ia64 machine with the PBS job
scheduler.  I can successfully run 100 processes on 16 CPUs, but I get
an error if I run 200 processes on the same number of CPUs.  The error
is:


 PML add procs failed
 --> Returned "Temporarily out of resource" (-3) instead of "Success" (0)



Please help.

Regards,
Clement

--
Clement Kam Man Chu
Research Assistant
Faculty of Information Technology
Monash University, Caulfield Campus
Ph: 61 3 9903 2355




--
Jeff Squyres
Cisco Systems



[OMPI users] Error on running large number of processes

2007-11-15 Thread Clement Kam Man Chu

Hi,

   I am using Open MPI 1.2.3 on an ia64 machine with the PBS job
scheduler.  I can successfully run 100 processes on 16 CPUs, but I get
an error if I run 200 processes on the same number of CPUs.  The error is:


 PML add procs failed
 --> Returned "Temporarily out of resource" (-3) instead of "Success" (0)


Please help.

Regards,
Clement

--
Clement Kam Man Chu
Research Assistant
Faculty of Information Technology
Monash University, Caulfield Campus
Ph: 61 3 9903 2355