I hope someone can help with this. We submitted hundreds of jobs using something similar to qsub -pe my_pe 4 my_job.sh. Every time we checked, one node with 8 slots was completely empty. In the queue listing pasted below, either comp03 or comp04 is idle while a bunch of jobs are waiting in the queue. Ideally both comp03 and comp04 should have 2 jobs running at all times. Given our setup we expect 6 jobs to run simultaneously, but only 4 are running. These are long-running jobs, each of which may last 4 hours, so we have checked the queue plenty of times and always see this behaviour.

Can someone shed some light on this?

---------------------------------------------------------------------------------
[email protected]  BIP   0/4/6 2.45     linux-x64
   1344 0.55500 npairs_run anita        r     05/27/2013 01:20:46     4
---------------------------------------------------------------------------------
[email protected]  BIP   0/4/6 2.71     linux-x64
   1343 0.55500 npairs_run anita        r     05/27/2013 00:37:16     4
---------------------------------------------------------------------------------
[email protected]  BIP   0/8/8 5.46     linux-x64
   1345 0.55500 npairs_run anita        r     05/27/2013 02:17:01     4
   1346 0.55500 npairs_run anita        r     05/27/2013 03:11:31     4
---------------------------------------------------------------------------------
[email protected]  BIP   0/0/8 0.02     linux-x64

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
   1347 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1348 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1349 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1350 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1351 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1352 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1353 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
   1354 0.55500 npairs_run anita        qw    05/23/2013 16:37:25     4
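
To make the arithmetic behind "6 expected, 4 running" explicit, here is a minimal sketch using the slot counts from the listing above. It assumes each 4-slot job has to fit on a single host, which is what our expectation of 6 jobs implies. The names comp01 and comp02 are only my labels for the first two queue instances, and the assignment of comp03/comp04 to the last two follows the listing order.

```python
# Slots requested per job (from "qsub -pe my_pe 4").
SLOTS_PER_JOB = 4

# Total and used slots per queue instance, read off the listing above.
# Host labels are assumptions based on listing order.
total_slots = {"comp01": 6, "comp02": 6, "comp03": 8, "comp04": 8}
used_slots = {"comp01": 4, "comp02": 4, "comp03": 8, "comp04": 0}


def jobs_that_fit(slots, per_job=SLOTS_PER_JOB):
    """Number of whole jobs that fit on one host (assumes no host spanning)."""
    return slots // per_job


expected_jobs = sum(jobs_that_fit(s) for s in total_slots.values())
running_jobs = sum(jobs_that_fit(s) for s in used_slots.values())

print("expected:", expected_jobs)  # 1 + 1 + 2 + 2 = 6
print("running: ", running_jobs)   # 1 + 1 + 2 + 0 = 4
```

So the two missing jobs are exactly the two that should fit on the idle 8-slot node.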



_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
