[ https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315915#comment-14315915 ]
chester kuo commented on MESOS-2262:
------------------------------------

First draft for review.
https://reviews.apache.org/r/30736/

> Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-2262
>                 URL: https://issues.apache.org/jira/browse/MESOS-2262
>             Project: Mesos
>          Issue Type: Task
>          Components: slave
>         Environment: OpenCL support env, such as OS X, Linux, Windows..
>            Reporter: chester kuo
>            Assignee: chester kuo
>            Priority: Minor
>
> This task extends Mesos to support heterogeneous resources such as GPGPUs
> and FPGAs as computing resources in the data center. OpenCL will be the
> first target added to Mesos (it is supported by all major GPU vendors);
> support for others, such as CUDA, is reserved for the future.
> With this feature, the slave will perform resource discovery, including
> but not limited to:
> (1) Heterogeneous computing programming model: "OpenCL", "CUDA", "HSA"
> (2) Computing global memory (MB)
> (3) Computing runtime version, such as "1.2", "2.0"
> (4) Computing compute units (double)
> (5) Computing device type: GPGPU, CPU, accelerator device
> (6) Computing number of devices (double)
> Heterogeneous resource isolation will be supported in the framework rather
> than on the slave device side. The main reason is that ecosystems such as
> OpenCL operate on top of private device drivers owned by vendors, and only
> the runtime library (OpenCL) is a user-space application, so it is hard for
> us to provide CPU/memory-style resource isolation the way Linux cgroups do.
> As a result, we may use the runtime library to do device isolation and
> memory allocation.
> (PS: if anyone knows how to do this for GPGPU drivers, please drop me a note.)
> Meanwhile, some runtime libraries (such as OpenCL) can also run on top of
> the CPU, so we need to use the isolator API to signal this once it is
> allocated.
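
For illustration only, the discovery items (1) to (6) in the description could be modeled roughly as below. This is a sketch and not the code under review: the class, the functions and every resource name (compute_devices, compute_units, compute_mem) are hypothetical, and the string only loosely follows the "name:value;name:value" shape used by the slave's --resources flag. A real implementation would fill the records from the OpenCL runtime rather than by hand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ComputeDevice:
    """One discovered device. Every field name here is hypothetical."""
    model: str            # (1) programming model, e.g. "OpenCL"
    global_mem_mb: int    # (2) global memory in MB
    runtime_version: str  # (3) runtime version, e.g. "1.2"
    compute_units: int    # (4) compute units on this device
    device_type: str      # (5) "GPGPU", "CPU" or "Accelerator"


def to_resources(devices: List[ComputeDevice]) -> str:
    """Sum the scalar values into a "name:value;name:value" string.

    The resource names are made up for this sketch; Mesos defines none of them.
    """
    total_units = sum(d.compute_units for d in devices)
    total_mem = sum(d.global_mem_mb for d in devices)
    # (6) the number of devices is simply the length of the list.
    return (f"compute_devices:{len(devices)};"
            f"compute_units:{total_units};"
            f"compute_mem:{total_mem}")


def to_attributes(devices: List[ComputeDevice]) -> Dict[str, List[str]]:
    """Collect the text-valued properties, de-duplicated and sorted."""
    return {
        "model": sorted({d.model for d in devices}),
        "runtime_version": sorted({d.runtime_version for d in devices}),
        "device_type": sorted({d.device_type for d in devices}),
    }


if __name__ == "__main__":
    # Hand-written sample data standing in for real discovery results.
    found = [
        ComputeDevice("OpenCL", 2048, "1.2", 8, "GPGPU"),
        ComputeDevice("OpenCL", 4096, "1.2", 4, "CPU"),
    ]
    print(to_resources(found))
    print(to_attributes(found))
```

The numeric items are kept apart from the text items because only the former can be summed and offered as scalar quantities; the programming model, runtime version and device type describe the slave rather than measure it.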
-- This message was sent by Atlassian JIRA (v6.3.4#6332)