I think you can make each GPU a consumable resource, so that once a
specific GPU on a specific host is in use, no other job can land on
it.
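
To make the "one job per GPU" idea concrete, here is a minimal sketch of
the kind of wrapper script mentioned further down the thread. It is not
part of Grid Engine: the names claim_gpu and gpu_environment, the lock
directory, and the one-lock-file-per-GPU layout are all assumptions made
for illustration. It takes an exclusive, non-blocking lock for the first
free GPU index on the host and exposes that index through
CUDA_VISIBLE_DEVICES.

```python
import fcntl
import os
import tempfile


def claim_gpu(num_gpus, lock_dir):
    """Claim the first free GPU index on this host.

    Hypothetical helper: one lock file per GPU index in lock_dir.
    Returns (index, handle), or None if every GPU is already claimed.
    The claim lasts until the returned handle is closed.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for index in range(num_gpus):
        path = os.path.join(lock_dir, f"gpu{index}.lock")
        handle = open(path, "a")
        try:
            # Exclusive and non-blocking: fails at once if another holder exists.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            handle.close()
            continue
        return index, handle
    return None


def gpu_environment(index, base_env=None):
    """Return a copy of the environment that exposes only the claimed GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env


if __name__ == "__main__":
    # Assumed values for illustration: a two-GPU node and a scratch lock directory.
    locks = os.path.join(tempfile.gettempdir(), "gpu-locks")
    claim = claim_gpu(num_gpus=2, lock_dir=locks)
    if claim is None:
        print("no free GPU on this host")
    else:
        index, handle = claim
        print("CUDA_VISIBLE_DEVICES =", gpu_environment(index)["CUDA_VISIBLE_DEVICES"])
        handle.close()  # releases the claim
```

A job wrapper would keep the handle open for the lifetime of the job and
pass the returned environment to the real program. A consumable resource
in the scheduler would still be needed to cap how many such jobs land on
the host; the lock only decides which GPU each of them gets.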

Ian

On Mon, Apr 14, 2014 at 11:06 AM, Feng Zhang <prod.f...@gmail.com> wrote:
> Thanks, Ian!
>
> I haven't checked the GPU load sensor in detail, either. It sounds to
> me like it only tracks the number of GPUs allocated to a job; the job
> doesn't know which GPUs it actually got, so it can't set
> CUDA_VISIBLE_DEVICES (some programs need this variable to be set).
> This can be done by writing some scripts/programs, but to me that is
> not a reliable solution, since some jobs may still collide with each
> other on the same GPU on a multi-GPU node. If GE could record which
> GPUs are allocated to each job, that would be perfect.
>
>
> On Mon, Apr 14, 2014 at 1:46 PM, Ian Kaufman <ikauf...@eng.ucsd.edu> wrote:
>> I believe there already is support for GPUs - there is a GPU Load
>> Sensor in Open Grid Engine. You may have to build it yourself, I
>> haven't checked to see if it comes pre-packaged.
>>
>> Univa has Phi support, and I believe OGE/OGS has it as well, or at
>> least has been working on it.
>>
>> Ian
>>
>> On Mon, Apr 14, 2014 at 10:35 AM, Feng Zhang <prod.f...@gmail.com> wrote:
>>> Hi,
>>>
>>> Is there any plan to implement GPU resource management in SGE in
>>> the near future, as in Slurm or Torque? There are some ways to do
>>> this using scripts/programs, but I wonder whether SGE itself can
>>> recognize and manage GPUs (and Phi). It doesn't need to be
>>> complicated or powerful, just do the basic work.
>>>
>>> Thanks,
>>> _______________________________________________
>>> users mailing list
>>> users@gridengine.org
>>> https://gridengine.org/mailman/listinfo/users
>>
>>
>>
>> --
>> Ian Kaufman
>> Research Systems Administrator
>> UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu


