Hi Reuti,

>> I use multiple queues to divide up available resources based on job run
>> times.
>
> So you are requesting "-l h_rt=..."?

Yes.

> Yes, the problem is that you can't address a specific queue in
> `qrsh -inherit ...` and if you get several queues on a machine you might
> have used up the slots of the queue that is selected first for the
> `qrsh -inherit ...`.
> https://arc.liv.ac.uk/trac/SGE/ticket/813

Thanks for this information. Switching the allocation rule from
"round_robin" to "fill_up" gets rid of the problem for me, but I am not
sure if this is just because fewer queues are being used on each host.

> It should help to have a PE for each queue, but you end up with 9 PEs for
> each PE you have right now.

But this would limit the maximum size of parallel jobs to the maximum
number of slots on a single queue, right?

> BUT: What type of parallel applications are you using? With a tight
> integration of MPICH2/3 and Open MPI there is only one `qrsh -inherit ...`
> call per exechost and all other processes are forks. And as you get
> "Execution daemon on host <hostname> didn't accept task" you are having a
> tight integration.

I am using MPICH2 version 1.4, which does have tight integration built in.

Thanks!
Brendan
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users