On Fri, Dec 22, 2017 at 05:55:26PM -0500, [email protected] wrote:
> True, but even with that info, there doesn't seem to be any universal
> way to tell an arbitrary GPU job which GPU to use -- they all default
> to device 0.

With Nvidia GPUs we use a prolog script that manipulates lock files
to select a GPU, then chgrps the selected /dev/nvidia? file so that its
group is the group associated with the job. An epilog script undoes all
of this. The /dev/nvidia? file permissions are set so that the devices are
inaccessible to anyone other than the owner (root) and the group. However,
you have to pass a magic option to the kernel to prevent the permissions
from being reset whenever anyone tries to access the device.
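
As a rough illustration of the lock-file-plus-chgrp idea described above
(not the actual prolog and epilog scripts), the logic could look something
like this in Python. The function names, the lock directory layout, and the
0660 mode are assumptions made for the sketch. A real prolog would run as
root and be given the /dev/nvidia? paths and the job's group ID.

```python
import os


def claim_gpu(devices, lock_dir, job_id, job_gid):
    """Prolog side (hypothetical): claim the first free device for a job."""
    os.makedirs(lock_dir, exist_ok=True)
    for dev in devices:
        lock = os.path.join(lock_dir, os.path.basename(dev) + ".lock")
        try:
            # O_CREAT | O_EXCL is atomic, so only one prolog can win the lock.
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            continue  # device already claimed by another job
        with os.fdopen(fd, "w") as f:
            f.write(str(job_id))  # record the owner for later debugging
        os.chown(dev, -1, job_gid)  # -1 keeps the owner, changes group only
        os.chmod(dev, 0o660)  # owner and group only, nothing for others
        return dev
    return None  # no free device


def release_gpu(dev, lock_dir, idle_gid=0):
    """Epilog side (hypothetical): hand the device back and drop the lock."""
    os.chown(dev, -1, idle_gid)
    os.chmod(dev, 0o660)
    os.remove(os.path.join(lock_dir, os.path.basename(dev) + ".lock"))
```

The lock is taken before the group is changed, so two jobs starting at the
same moment cannot end up with the same device. The sketch does not cover
telling the job which device it was given.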

This seems to be a fairly bulletproof way of restricting jobs to
their assigned GPU.


William


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
