Hello gridengine-users,
We're running SGE on a small cluster consisting of nodes offering 16
slots each.
One of our cluster users asked us to change the cluster configuration in
this way:
"Even when a job requests an integer multiple of 16 processes, these
tasks are distributed across many nodes with a range of tasks/node from
16 to 1; obviously the total performance is reduced to that of the
slowest task, which is that of the isolated process for which all
communication goes over the InfiniBand. The difference in performance
is serious.
Please can you reconfigure the queue to assign tasks to nodes such that
large parallel jobs will not be split up to fill in all small spaces on
the queue, but rather will be run on the minimum number of complete nodes?"
We are not sure how to achieve this. Could you please give us a hint?
TIA & Kind Regards,
Christian