[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2016-01-18 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106106#comment-15106106
 ] 

Benjamin Mahler commented on MESOS-2262:


I've created an epic (MESOS-4424) to track initial support for GPU resources and 
added the watchers from this ticket. A design doc will be circulated for 
community feedback soon. Looking forward to hearing from interested folks!

> Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous 
> resource are available from slave
> ---
>
> Key: MESOS-2262
> URL: https://issues.apache.org/jira/browse/MESOS-2262
> Project: Mesos
> Issue Type: Task
> Components: slave
> Environment: OpenCL-capable environments such as OS X, Linux, and Windows
> Reporter: chester kuo
> Assignee: chester kuo
> Priority: Minor
>
> Extend Mesos to support heterogeneous resources such as GPGPUs and FPGAs as 
> computing resources in the data center. OpenCL will be the first target 
> (it is supported by all major GPU vendors); support for others, such as 
> CUDA, is reserved for the future.
> In this feature, the slave will support resource discovery, including 
> but not limited to:
> (1) Heterogeneous computing programming model: "OpenCL", "CUDA", "HSA"
> (2) Global memory (MB)
> (3) Runtime version, such as "1.2" or "2.0"
> (4) Compute units (double)
> (5) Device type: GPGPU, CPU, or accelerator device
> (6) Number of devices (double)
> Isolation of heterogeneous resources will be handled in the framework 
> rather than on the slave. The main reason is that ecosystems such as 
> OpenCL run on top of proprietary device drivers owned by the vendors; only 
> the runtime library (OpenCL) is a user-space component, so it is hard to 
> provide cgroup-style isolation of the kind Linux offers for CPU and memory. 
> As a result, we may use the runtime library to do device isolation and 
> memory allocation.
> (PS: if anyone knows how to do this at the GPGPU driver level, please drop 
> me a note.)
> Meanwhile, some runtime libraries (such as OpenCL) can also run on top of 
> the CPU, so we need to use the isolator API to report this once it is 
> allocated.
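
The discovery list in the description above could be modelled roughly as follows. This is only a sketch under assumptions: the class name `ComputeDevice`, the helper names, and the resource/attribute names `gpus`, `gpu_mem` and `compute_models` are hypothetical and are not taken from the ticket or from Mesos. It renders the result as semicolon-separated `name:value` pairs, the general shape used by the agent's `--resources` and `--attributes` flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeDevice:
    """One discovered device; fields mirror items (1)-(5) of the description."""
    programming_model: str   # (1) e.g. "OpenCL", "CUDA", "HSA"
    global_memory_mb: int    # (2) global memory in MB
    runtime_version: str     # (3) e.g. "1.2", "2.0"
    compute_units: float     # (4) compute units
    device_type: str         # (5) "GPGPU", "CPU", or "Accelerator"


def summarize(devices):
    """Split discovered devices into countable resources and descriptive attributes.

    All key names below are hypothetical placeholders.
    """
    gpus = [d for d in devices if d.device_type == "GPGPU"]
    resources = {
        "gpus": float(len(gpus)),                                 # (6) number of devices
        "gpu_mem": float(sum(d.global_memory_mb for d in gpus)),  # total global memory
    }
    attributes = {
        "compute_models": "_".join(sorted({d.programming_model for d in gpus})),
    }
    return resources, attributes


def to_flag(pairs):
    """Render a dict as 'name:value;name:value'."""
    parts = []
    for name, value in pairs.items():
        text = f"{value:g}" if isinstance(value, float) else str(value)
        parts.append(f"{name}:{text}")
    return ";".join(parts)


if __name__ == "__main__":
    found = [
        ComputeDevice("OpenCL", 4096, "1.2", 20.0, "GPGPU"),
        ComputeDevice("OpenCL", 2048, "1.2", 8.0, "GPGPU"),
        ComputeDevice("OpenCL", 0, "1.2", 4.0, "CPU"),
    ]
    resources, attributes = summarize(found)
    print(to_flag(resources))
    print(to_flag(attributes))
```

Splitting countable quantities (resources) from descriptive properties (attributes) is one possible design; the ticket itself does not say which of the six items would become resources.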



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2015-11-26 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029432#comment-15029432
 ] 

haosdent commented on MESOS-2262:
-

I think that
> so we can know if any GPU/Heterogeneous resource are available from slave
does not depend at all on adding GPGPU as a resource type. We can use 
constraints to ask Mesos to run our tasks on agents that have special 
hardware, including NVIDIA cards.

The nvidia-docker image already lets us use GPGPUs well inside Docker containers.

Adding GPGPU as a resource type in Mesos would just make this more user-friendly.
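
As a rough illustration of the attribute-based approach in this comment, a framework could filter the offers it receives by agent attribute. The sketch assumes offers are available as plain dictionaries shaped like the JSON form of a Mesos `Offer`, and that operators have tagged GPU agents with a text attribute named `gpu`; that attribute name and both function names are hypothetical.

```python
def attribute_text(offer, name):
    """Return the text value of the named attribute, or None if it is absent."""
    for attr in offer.get("attributes", []):
        if attr.get("name") == name and attr.get("type") == "TEXT":
            return attr["text"]["value"]
    return None


def gpu_offers(offers, name="gpu"):
    """Keep only offers from agents that advertise the (hypothetical) GPU attribute."""
    return [offer for offer in offers if attribute_text(offer, name) is not None]


if __name__ == "__main__":
    offers = [
        {"id": {"value": "offer-1"},
         "attributes": [{"name": "gpu", "type": "TEXT", "text": {"value": "nvidia"}}]},
        {"id": {"value": "offer-2"}, "attributes": []},
    ]
    for offer in gpu_offers(offers):
        print(offer["id"]["value"])
```

This only tells the framework which agents have the hardware. It does not count or isolate devices, which is the gap a dedicated resource type would fill.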



[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2015-11-26 Thread Yubo Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029370#comment-15029370
 ] 

Yubo Li commented on MESOS-2262:


Is there any progress on this?



[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2015-11-15 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005880#comment-15005880
 ] 

haosdent commented on MESOS-2262:
-

I think you could use the Docker image provided by NVIDIA 
(https://github.com/NVIDIA/nvidia-docker) together with Marathon's constraint 
operators to make sure your jobs run only on machines that have GPGPU 
resources. Mesos does not yet provide more precise control over GPGPU resources.
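
A Marathon app definition using a constraint might be built like this. It is a minimal sketch: the agent attribute `gpu`, its value `nvidia`, the app id and the command are all assumptions for illustration, and exposing the GPU devices to the container (what nvidia-docker handles) is not covered here.

```python
import json


def gpu_app(app_id, cmd, attribute="gpu", pattern="nvidia"):
    """Build a Marathon app definition pinned to agents whose attribute matches.

    The attribute name and pattern are hypothetical; they must match whatever
    operators set on the GPU machines.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 1.0,
        "mem": 512,
        "instances": 1,
        # LIKE matches the agent attribute value against a regular expression.
        "constraints": [[attribute, "LIKE", pattern]],
    }


if __name__ == "__main__":
    # The resulting JSON is what would be posted to Marathon's /v2/apps endpoint.
    print(json.dumps(gpu_app("/gpu-job", "nvidia-smi"), indent=2))
```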



[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2015-10-29 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981964#comment-14981964
 ] 

Guangya Liu commented on MESOS-2262:


[~chesterkuo] Could MESOS-3366 help? That ticket lets end users write 
a hook module to collect customized resources.



[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2015-10-28 Thread Bill Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979424#comment-14979424
 ] 

Bill Zhao commented on MESOS-2262:
--

Any progress on this? [~benjaminhindman], are you going to add this support to 
Mesos?



[jira] [Commented] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

2015-02-11 Thread chester kuo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315915#comment-14315915
 ] 

chester kuo commented on MESOS-2262:


First draft for review.

https://reviews.apache.org/r/30736/

