We read the nodes from the file named by PBS_NODEFILE. Paul, can you pass along the contents of that file?
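
For anyone who wants to check the file themselves, here is a rough sketch of how slots can be tallied from it. It assumes the usual Torque convention of one hostname per line, repeated once per allocated processor; the function name slots_per_host is made up for illustration, and this is not the actual Open MPI code. If the file on this system holds only a single line, a count like this would come out as 1 slot, which would match the error quoted below, but that is only a guess until we see the file.

```python
from collections import Counter


def slots_per_host(nodefile_text):
    """Tally slots per host from nodefile contents.

    Assumption: one hostname per line, repeated once per allocated slot
    (the common Torque layout). Blank lines are ignored.
    """
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return Counter(hosts)


if __name__ == "__main__":
    import os

    # PBS_NODEFILE is set by the batch system inside a job; skip if absent.
    path = os.environ.get("PBS_NODEFILE")
    if path:
        with open(path) as f:
            for host, slots in slots_per_host(f.read()).items():
                print(f"{host}: {slots} slot(s)")
```

Run inside the batch job, this prints one line per distinct host with the number of times it appears in the file.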

On Jan 31, 2014, at 2:33 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I am trying to test the trunk on an SGI UV (to validate Nathan's port of 
> btl:vader to SGI's variant of xpmem).
> 
> At configure time, PBS's TM support was correctly located.
> 
> My PBS batch script includes
>   #PBS -l ncpus=16
> because that is what this installation requires (not nodes, mppnodes, or 
> anything like that).
> One is allocating cpus on a large shared-memory machine, not a set of nodes 
> in a cluster.
> 
> However, this appears to be causing mpirun to think I have just 1 slot:
> 
> + mpirun -np 2 ./ring_c
> --------------------------------------------------------------------------
> There are not enough slots available in the system to satisfy the 2 slots 
> that were requested by the application:
>   ./ring_c
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --------------------------------------------------------------------------
> 
> In case they contain useful info, here are the PBS env vars in the job:
> 
> PBS_HT_NCPUS=32
> PBS_VERSION=TORQUE-2.3.13
> PBS_JOBNAME=qs
> PBS_ENVIRONMENT=PBS_BATCH
> PBS_HOME=/var/spool/torque
> PBS_O_WORKDIR=/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-trunk-linux-x86_64-uv-trunk/BLD/examples
> PBS_PPN=16
> PBS_TASKNUM=1
> PBS_O_HOME=/usr/users/6/hargrove
> PBS_MOMPORT=15003
> PBS_O_QUEUE=debug
> PBS_O_LOGNAME=hargrove
> PBS_O_LANG=en_US.UTF-8
> PBS_JOBCOOKIE=9EEF5DF75FA705A241FEF66EDFE01C5B
> PBS_NODENUM=0
> PBS_O_SHELL=/usr/psc/shells/bash
> PBS_SERVER=tg-login1.blacklight.psc.teragrid.org
> PBS_JOBID=314827.tg-login1.blacklight.psc.teragrid.org
> PBS_NCPUS=16
> PBS_O_HOST=tg-login1.blacklight.psc.teragrid.org
> PBS_VNODENUM=0
> PBS_QUEUE=debug_r1
> PBS_O_MAIL=/var/mail/hargrove
> PBS_NODEFILE=/var/spool/torque/aux//314827.tg-login1.blacklight.psc.teragrid.org
> PBS_O_PATH=[...removed...]
> 
> If any additional info is needed to help make mpirun "just work", please let 
> me know.
> 
> However, at this point I am mostly interested in any work-arounds that will 
> let me run something other than a singleton on this system.
> 
> -Paul
> 
> -- 
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
