<x-flowed>
I'm not sure that PBS supports multiple ppn specifications within the same submit. Also, keep in mind that some admins might do post-install configuration that changes PBS's default ppn to 1, thus preventing you from filling the cluster with that command at all.
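If it does, my understanding of the node-spec syntax is that the groups are joined with "+" and "nodes=" appears only once, at the front. Here is an untested sketch of building such a spec from a total CPU count. nodespec and job.sh are made-up names, not part of PBS, and the script only prints the qsub line instead of running it.

```shell
#!/bin/sh
# nodespec is a hypothetical helper, not part of PBS.
# It builds the value for "-l nodes=" from a total CPU count and the
# number of CPUs per machine on the target cluster.
nodespec() {
    total=$1                 # total CPUs wanted
    ppn=$2                   # CPUs per machine on this cluster
    full=$((total / ppn))    # machines used completely
    rest=$((total % ppn))    # CPUs left over for one partial machine
    spec=""
    if [ "$full" -gt 0 ]; then
        spec="${full}:ppn=${ppn}"
    fi
    if [ "$rest" -gt 0 ]; then
        # Join the partial machine to the full ones with "+".
        if [ -n "$spec" ]; then spec="${spec}+"; fi
        spec="${spec}1:ppn=${rest}"
    fi
    echo "$spec"
}

# Print the command rather than run it; job.sh is a placeholder script name.
echo "qsub -l nodes=$(nodespec 5 2) job.sh"
```

For 5 CPUs on dual-processor machines this prints "qsub -l nodes=2:ppn=2+1:ppn=1 job.sh". Whether the queue accepts it still depends on the resource limits the admin has set.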
There is one other way to do it, but it's kind of a hack. qsub takes switches that keep a job from starting until after another job starts, until after another job finishes, or until exactly when another job starts (I think). You would have to submit multiple jobs to make up the one job, though, and that is way too much of a kludge if you ask me.
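Roughly what I have in mind, going from my memory of the qsub man page: the switch is -W depend=, with "after", "afterany", and "syncwith" as the dependency types (syncwith also needs the first job submitted with a matching synccount). Treat this as an untested sketch. depend_opt, the job ID, and part2.sh are placeholders, and the script only prints the commands.

```shell
#!/bin/sh
# depend_opt is a hypothetical helper that formats qsub's -W depend option.
depend_opt() {
    kind=$1     # dependency type: after, afterany, or syncwith
    jobid=$2    # job ID printed by an earlier qsub
    echo "-W depend=${kind}:${jobid}"
}

# Placeholder job ID; in practice this is whatever the first qsub printed.
first="123.headnode"

# Print the commands rather than run them; part2.sh is a placeholder script.
echo "qsub $(depend_opt after "$first") part2.sh"      # start once $first has started
echo "qsub $(depend_opt afterany "$first") part2.sh"   # start once $first has finished
echo "qsub $(depend_opt syncwith "$first") part2.sh"   # start together with $first
```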
Is your goal to be able to find out the max number of procs available for a job, or to be able to launch the same job w/ different PPN specifications?

Jeremy

At 11:59 AM 10/5/2001 -0500, Jim Basney wrote:
Hello,

Jeremy Enos and I installed OSCAR 1.1 on a cluster of 5 dual-processor machines (1 head node and 4 compute nodes) here at NCSA for our Grid-in-a-Box project. I'm hoping you can answer (what I hope are) some simple PBS questions for me. I want to be able to submit mpich jobs to OSCAR clusters without needing to know how many processors each machine in each cluster has. Instead, I just want to know how many total CPUs each cluster has. This simplifies my life when doing "Grid computing", where I'm submitting to different clusters across the network, where some clusters may have dual-processor machines and others may have single-processor machines.

When I submit an MPI job to PBS on our cluster with "qsub -l nodes=4", PBS allocates 2 processors each on 2 compute nodes to my job, as opposed to allocating 1 processor each on 4 compute nodes. That's great because it makes me think "nodes" in PBS terminology means CPUs rather than machines. However, when I try "qsub -l nodes=8", I get "Job exceeds queue resource limits".

Is there some trick I can play to get this to work without needing to set ppn=2 in my qsub line on the clusters with dual-processor machines? If I have to use ppn, how would I request 5 CPUs from the cluster? "qsub -l nodes=2:ppn=2+nodes=1:ppn=1" fails with "Job exceeds queue resource limits" but "qsub -l nodes=3:ppn=2" succeeds. Is my syntax wrong?

Any help would be greatly appreciated.

-Jim


_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users

</x-flowed>
