Just remember, GPUs are non-fungible and there can be many per blade. Also, fwiw, many folks in the scientific community have found OpenCL limiting and eventually found themselves supporting CUDA, so the JIRA should probably plan for either.
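As a rough illustration of the non-fungibility point, here is a minimal Python sketch. Every name in it (`GpuDevice`, `pick_gpus`, the field names) is hypothetical and is not a Mesos, HTCondor, OpenCL, or CUDA API. It only shows why a plain scalar count such as "gpus:2" loses information: a request has to be matched against specific devices on a specific node, filtered by the compute API and memory each device actually offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuDevice:
    """One physical GPU; hypothetical shape, not a real Mesos type."""
    node: str          # host (blade) the device is attached to
    index: int         # device index on that host
    model: str         # free-form model name
    memory_mb: int     # global memory size in MB
    apis: frozenset    # compute APIs the device supports, e.g. {"opencl", "cuda"}


def pick_gpus(devices, count, api, min_memory_mb=0, in_use=frozenset()):
    """Return `count` specific free devices on a single node that support
    `api` and have at least `min_memory_mb` of memory, or None if no single
    node can satisfy the request."""
    candidates_by_node = {}
    for dev in devices:
        if (dev.node, dev.index) in in_use:
            continue  # a GPU is handed out whole, not shared like CPU shares
        if api not in dev.apis or dev.memory_mb < min_memory_mb:
            continue
        candidates_by_node.setdefault(dev.node, []).append(dev)

    # Devices on different blades cannot be combined into one allocation.
    for node in sorted(candidates_by_node):
        if len(candidates_by_node[node]) >= count:
            return candidates_by_node[node][:count]
    return None


# Example inventory: one blade with two different GPUs.
inventory = [
    GpuDevice("blade1", 0, "vendor-a", 4096, frozenset({"opencl", "cuda"})),
    GpuDevice("blade1", 1, "vendor-b", 2048, frozenset({"opencl"})),
]
```

With this inventory, a request for two OpenCL devices can be met on blade1, but a request for two CUDA devices cannot, even though the blade "has 2 GPUs".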

I recommend cribbing ideas from here (namely discovery):
http://research.cs.wisc.edu/htcondor/HTCondorWeek2014/presentations/KnoellerJ-GPU.pptx

Cheers,
Tim

----- Original Message -----
> From: "Chester Kuo" <[email protected]>
> To: [email protected]
> Sent: Tuesday, January 27, 2015 3:18:21 AM
> Subject: Re: GPU computing resource add into Mesos
>
> Ya, i'm working on it, will publish to JIRA once done.
>
> On Tue, Jan 27, 2015 at 12:16 AM, Christos Kozyrakis
> <[email protected]> wrote:
> > Chester, this sounds great. Do you want to start a design doc about
> > extensions needed in slave/isolators/containerizer/... for GPUs. It would
> > be useful to separate what is a minimum vs complete set of features to
> > consider. The doc will be a good starting point for discussion.
> >
> > On Mon, Jan 26, 2015 at 1:18 AM, Chester Kuo <[email protected]> wrote:
> >
> >> Hi Tom
> >>
> >> Ya, the GPGPU resources needs to provided from slave , but we need to
> >> extend to have it to query GPGPU resources such as GPU devices
> >> (single or multiple) ,CU(compute unit) , global/local memory embedded
> >> in the slave node, with this info , framework can utilize it as we did
> >> of generic CPU/Memory.
> >>
> >> Besides i'd like to have OpenCL (https://www.khronos.org/opencl/) to
> >> help to query slave's capability and its more generic and portable,
> >> and i also plan to have other framework (such as Spark) have knowledge
> >> about GPGPU resources for computing performance up (Planning).
> >>
> >>
> >> Chester
> >>
> >>
> >> On Mon, Jan 26, 2015 at 4:48 PM, Tom Arnfeld <[email protected]> wrote:
> >> > Chester, you can specify arbitrary resources using the --resources flag
> >> to the slave and Mesos will share out the resources to frameworks, and
> >> then
> >> your framework can do as it pleases.
> >> >
> >> >
> >> > I'm not sure any changes are required in Mesos itself to support this,
> >> unless I'm missing something.
> >> >
> >> >
> >> > --
> >> >
> >> >
> >> > Tom Arnfeld
> >> >
> >> > Developer // DueDil
> >> >
> >> >
> >> > (+44) 7525940046
> >> >
> >> > 25 Christopher Street, London, EC2A 2BS
> >> >
> >> > On Mon, Jan 26, 2015 at 6:15 AM, Chester Kuo <[email protected]>
> >> > wrote:
> >> >
> >> >> Hi All
> >> >> I'd like to extend and add new feature into Mesos to support GPU
> >> >> resource allocation, so we can put OpenCL application/framework on top
> >> >> of Mesos and make it write once run across cluster.
> >> >> Why choose OpenCL, due to it was widely supported by Intel , Nvidia,
> >> >> AMD, Qualcomm GPGPU, so we may extended to have other framework (ex:
> >> >> Spark) can try to utilize GPGPU computing resource.
> >> >> Any Comments?
> >> >> Chester
> >> >

--
Cheers,
Timothy St. Clair
Red Hat Inc.
