Hello gridengine-users,

We're running SGE on a small cluster whose nodes offer 16 slots each.

One of our cluster users asked us to change the cluster configuration as follows:

"Even when a job requests an integer multiple of 16 processes, its tasks are distributed across many nodes, with anywhere from 16 down to 1 tasks per node. Obviously the total performance is reduced to that of the slowest task, which is the isolated process for which all communication goes over InfiniBand. The difference in performance is serious.

Please can you reconfigure the queue to assign tasks to nodes such that large parallel jobs will not be split up to fill in all small spaces on the queue, but rather will be run on the minimum number of complete nodes?"

We are not sure how to achieve this. Could you please give us a hint?

TIA & Kind Regards,
Christian

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
