I notice that the Torque in OSCAR5 is pretty old -- 2.0.0p8, while
the current version is 2.1.something (the last stable 2.0.0 was
p11). Is there a reason OSCAR ships such an old version?
I ran into a weird situation with Torque on my OSCAR cluster: it
gives me *cpus* instead of *nodes*. Specifically, if I did this:
    qsub -I -lnodes=16
I'd get 16 *cpus* on 2 nodes (these are 4-core machines with
hyperthreading accidentally enabled, so OSCAR/Torque thinks they're
8-processor machines; I just haven't gotten around to fixing that
yet). Indeed, even if I did this:
    qsub -I -lnodes=16:ppn=1
I'd still get 16 CPUs, not 16 nodes (i.e., "cat $PBS_NODEFILE" would
show 2 hosts, each listed 8 times). This was quite repeatable --
successive and simultaneous jobs showed the same behavior, just on
different nodes.
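
For anyone who wants to check this on their own cluster, a quick way
to see what was actually allocated is to count how often each host
appears in the node file. This is only a minimal sketch: it assumes
the node file lists one hostname per line, repeated once per
allocated slot, and summarize_nodefile is just a name made up for
this example.

```shell
# Summarize a PBS node file: print one line per host with its slot count.
# Assumption: one hostname per line, repeated once per allocated slot.
# summarize_nodefile is a hypothetical helper name, not part of Torque.
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{ printf "%s %d\n", $2, $1 }'
}

# Inside a job, Torque sets PBS_NODEFILE; outside a job this does nothing.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    summarize_nodefile "$PBS_NODEFILE"
fi
```

In the situation above, this would print 2 lines, each with a count
of 8, rather than 16 lines with a count of 1.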
My understanding of Torque (which is admittedly far from perfect) is
that I should get 16 *nodes*, not 16 *cpus*.
Does anyone know if this is simply a bug in this old version of
Torque, or whether there is a common configuration problem that
causes this issue?
Thanks.
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel