Ok Nico.. Thank you very much for your explanation.
Hmm.. I believe I'll start in production by just defining the GPUs as a
consumable resource, and then adapt and fix things when trouble
arises. :-) In fact I'm new to GPUs and don't know much about them, so
I'll rely on the users, who know much more about GPUs than I do, as
they also develop programs for GPUs. ;-)
Thank you and best regards.
Robi
Nicolás Serrano Martínez Santos wrote:
This sensor seems to add information for the scheduler to track, and you could
use that information in many ways. For instance, you could raise the reported
load on a host while a GPU is in use, so the host accepts no further jobs.
Depending on your requirements, you may need to install or write a
custom shell script like the one proposed in the link I attached previously.
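For reference, a custom load sensor is just a script that speaks the simple protocol sge_execd expects: it reads one line per polling interval ("quit" on shutdown) and prints a begin/end-delimited report of host:complex:value lines. A minimal sketch, assuming a complex named gpu_free has already been added via qconf -mc and nvidia-smi is on the PATH (the 5% utilization cutoff is an arbitrary placeholder):

```shell
#!/bin/sh
# Minimal GPU load sensor sketch for OGS/GE. Assumptions: a "gpu_free"
# complex exists in qconf -mc, and nvidia-smi is available on the host.
HOST=$(hostname)

gpu_report() {
    # Count devices with (almost) no compute load; 5% is a placeholder cutoff.
    FREE=$(nvidia-smi --query-gpu=utilization.gpu \
                      --format=csv,noheader,nounits 2>/dev/null |
           awk '$1 < 5' | wc -l | tr -d ' ')
    echo begin
    echo "$HOST:gpu_free:$FREE"
    echo end
}

sensor_loop() {
    # sge_execd writes a line per interval and "quit" when shutting down.
    while read line; do
        [ "$line" = quit ] && return 0
        gpu_report
    done
}

# Demo: simulate two polls from the execd, then a shutdown.
printf '\n\nquit\n' | sensor_loop
```

The script would then be wired up with the load_sensor parameter in the host or global configuration (qconf -mconf).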
What I have not been able to replicate is a variable similar to h_vmem, which
automatically kills the process once the limit is reached. In fact,
in GE the process is launched with only that much memory available (I don't
know how this is done). I think this feature is not implemented in the sensor
you attached.
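Lacking a built-in limit, one way to approximate h_vmem-style behaviour is a watchdog that polls per-process GPU memory and kills offenders. A rough sketch only, under the assumptions that the driver's nvidia-smi supports the --query-compute-apps query and that the limit reaches the job through an environment variable (GPU_MEM_LIMIT_MB is an invented name):

```shell
#!/bin/sh
# Watchdog sketch approximating an h_vmem-style limit for GPU memory.
# Assumptions: nvidia-smi supports --query-compute-apps, and the limit
# arrives via the (invented) GPU_MEM_LIMIT_MB environment variable.
LIMIT_MB=${GPU_MEM_LIMIT_MB:-4096}

check_once() {
    # Each output line is "pid, used_memory_in_MiB".
    nvidia-smi --query-compute-apps=pid,used_memory \
               --format=csv,noheader,nounits 2>/dev/null |
    while IFS=', ' read pid mem; do
        if [ -n "$mem" ] && [ "$mem" -gt "$LIMIT_MB" ]; then
            echo "killing pid $pid: ${mem} MiB > ${LIMIT_MB} MiB"
            kill "$pid"
        fi
    done
}

# In a real job you would poll alongside the GPU program, e.g.:
#   while kill -0 "$JOB_PID" 2>/dev/null; do check_once; sleep 10; done
check_once
```

Note this kills any over-limit process it sees, not just the submitting job's, so in practice you would also match the pid against the job's process tree.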
Regards,
NiCo
Excerpts from Roberto Nunnari's message of Tue May 28 10:54:53 +0200 2013:
Nicolás Serrano Martínez Santos wrote:
As far as I know, there is not much you can do besides defining a consumable
for each GPU:
http://serverfault.com/questions/322073/howto-set-up-sge-for-cuda-devices
At our university we also have a Tesla in our GE cluster. The Tesla lets you
define several virtual GPUs (e.g. 3 or 4 slots). You may find it useful to
define a gpu_memory consumable to limit the graphics memory per process,
because processes crash when the card's memory is exhausted. However, GE is
not able to (easily) monitor the memory actually used per process: you can
define the consumable in GE and reserve it at submission time, but in our
case each process monitors its own memory.
If you wish to limit GPU use to a particular queue, I think you can also
define gpu as a consumable on the queue.
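Concretely, the per-GPU consumable plus the optional per-queue restriction might look like the fragment below. The complex names gpu/gpu_mem, the host node01, and the queue gpu.q are placeholders, not anything OGS predefines:

```shell
# 1) Define the complexes (qconf -mc opens an editor; add lines like these):
#    name     shortcut  type    relop requestable consumable default urgency
#    gpu      gpu       INT     <=    YES         YES        0       0
#    gpu_mem  gmem      MEMORY  <=    YES         YES        0       0

# 2) Attach capacity to the GPU host (qconf -me node01):
#    complex_values   gpu=2,gpu_mem=12G

# 3) Optionally cap the consumable on a dedicated queue (qconf -mq gpu.q):
#    complex_values   gpu=2

# 4) Jobs then reserve a device and some graphics memory at submission:
qsub -l gpu=1,gpu_mem=2G job.sh
```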
Best Regards,
NiCo
Hi Nico,
Thank you for your answer.
What about this?
https://gridscheduler.svn.sourceforge.net/svnroot/gridscheduler/trunk/source/dist/gpu/gpu_sensor.c
Have you ever tried it? Do you think it could be useful? What are the
advantages and/or the scenarios for using that sensor?
Thank you and best regards.
Robi
Excerpts from Roberto Nunnari's message of Tue May 28 10:20:13 +0200 2013:
Anybody on this, please? I'm sorry to insist, but I posted on the 24th
and the 27th and have got no answer yet..
Best regards,
Robi
Roberto Nunnari wrote:
Hello.
Anybody on this, please? In the meantime, I went ahead a little and
implemented it like this:
[root@master ~]# qconf -sc | grep gpu
gpu gpu INT <= YES YES 0 0
[root@master ~]# qhost -F | grep gpu
hc:gpu=1.000000
Now users can access the GPU by specifying 'qsub -l gpu=1'.
I haven't defined a dedicated queue for the GPU, and I see that when a
job is running on the GPU, the scheduler also reserves a CPU slot on
that host.. that's good, because a GPU job will also consume CPU time..
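One gap worth noting with a plain consumable: it counts GPUs but does not tell the job which device it got, so on a multi-GPU host two jobs can collide on device 0. A common workaround is a lock directory per device in the job script. Everything below (the lock paths, the 0-3 device range, the CUDA_VISIBLE_DEVICES handling) is a sketch, not an official OGS mechanism:

```shell
#!/bin/sh
#$ -l gpu=1
# Sketch: claim a free device via per-device lock directories, since the
# consumable only counts GPUs and does not assign one. The paths and the
# device range 0-3 are placeholders for a 4-GPU host.
for dev in 0 1 2 3; do
    if mkdir "/tmp/gpu_lock_$dev" 2>/dev/null; then
        CUDA_VISIBLE_DEVICES=$dev
        export CUDA_VISIBLE_DEVICES
        # Release the lock when the job exits, however it ends.
        trap 'rmdir "/tmp/gpu_lock_$dev"' EXIT
        break
    fi
done
echo "job would run on GPU $CUDA_VISIBLE_DEVICES"
# ./my_cuda_program   # hypothetical binary goes here
```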
More hints, tips or advice, please?
Thank you and best regards,
Robi
Roberto Nunnari wrote:
Hi all.
I'm just doing my first tests with Open Grid Scheduler and GPGPU.
For testing I set up Open Grid Scheduler on two hosts: one is the
frontend, and one is the execution node. The execution node has 64 cores
and an NVIDIA Tesla M2090. (Most probably, the final solution will be
made up of one master, 20-30 execution nodes with 8-12 cores each, a
couple of file servers, and 4-8 GPUs attached to some of the execution
nodes.)
At present I have set up my testing environment's queues similar to the
existing production cluster, so the scheduler has three queues:
1hour, 1day, and unlimited.
I believe I was once told that the best way to add GPGPUs to a CPU
cluster is to add a queue dedicated to the GPUs together with consumable
resources, and maybe also to play with priorities for using the CPUs on
hosts with GPUs.. do you agree?
I was also told that Open Grid Scheduler has added support for GPUs..
Could anybody tell me more about that, please?
So.. I'm new to GPUs and would like some help/direction from the experts
on how to build a mixed CPU/GPU cluster using Open Grid Scheduler.
Here's my environment:
- OGS/GE 2011.11p1
- CentOS 6.4
[root@master ~]# qhost -q
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
node01                  linux-x64      64 14.33   94.4G   50.3G  186.3G     0.0
   1h.q                 BIP          0/0/20
   1d.q                 BP           0/0/20
   long.q               BP           0/0/20
Any help/tips/direction greatly appreciated! :-)
Robi
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users