On Fri, Dec 22, 2017 at 05:55:26PM -0500, [email protected] wrote:
> True, but even with that info, there doesn't seem to be any universal
> way to tell an arbitrary GPU job which GPU to use -- they all default
> to device 0.

With Nvidia GPUs we use a prolog script that manipulates lock files to select a GPU and then chgrps the selected /dev/nvidia? file so that its group is the group associated with the job. An epilog script undoes all of this. The permissions on the /dev/nvidia? files are set so that they are inaccessible to anyone other than the owner (root) and the group. However, you have to pass a magic option to the kernel to prevent the permissions from being reset whenever anyone tries to access the device. This seems to be a fairly bulletproof way of restricting jobs to their assigned GPU.

William
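
For illustration only, here is a minimal Python sketch of the lock-file and chgrp steps described above. It is not the actual prolog/epilog in use: the function names, the ".lock" naming and the lock directory are all hypothetical, and how the prolog learns the job's group ID is left out. It assumes the script runs as root and that the driver has already been told not to reset the device file permissions.

```python
import os

DEVICE_MODE = 0o660  # owner and group only, no access for others


def claim_gpu(devices, lock_dir, job_gid):
    """Prolog side: lock the first free device and hand it to job_gid.

    Returns (device_path, lock_path), or None if every device is taken.
    All names here are hypothetical, not taken from the real scripts.
    """
    for dev in sorted(devices):
        lock_path = os.path.join(lock_dir, os.path.basename(dev) + ".lock")
        try:
            # O_CREAT | O_EXCL is atomic: only one prolog can create the file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            continue  # another job already holds this GPU
        os.close(fd)
        try:
            os.chown(dev, -1, job_gid)  # chgrp only; the owner is left unchanged
            os.chmod(dev, DEVICE_MODE)
        except OSError:
            os.unlink(lock_path)  # do not leak the lock if the chgrp fails
            raise
        return dev, lock_path
    return None


def release_gpu(dev, lock_path, default_gid=0):
    """Epilog side: give the device back to default_gid and drop the lock."""
    os.chown(dev, -1, default_gid)
    os.chmod(dev, DEVICE_MODE)
    os.unlink(lock_path)
```

A prolog would call something like claim_gpu(glob.glob("/dev/nvidia[0-9]"), "/var/lock/gpu", gid), with the lock directory again being an assumption, and the epilog would pass the returned pair to release_gpu.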
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
