I will reiterate some feedback I left on the PR: it’s not immediately clear that we should be opinionated about supporting GPUs in the Docker images in a first-class way.

First, there’s the question of how we arbitrate which kinds of customizations we support going forward. For example, if we say we support GPUs now, what’s to say we shouldn’t also support FPGAs?

Second, what kind of testing can we add to CI to ensure that what we’ve provided in this Dockerfile actually works?

Instead, we can keep the Spark images to the bare minimum needed to run basic Spark applications, and then provide detailed instructions for building custom Docker images (mostly just making sure the custom image has the right entry point); a rough sketch follows.
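
For illustration only, a custom image could be as simple as the sketch below. The base image tag and the CUDA package are made-up placeholders, and the sketch assumes a Debian-based base image; the entrypoint path matches what the Spark Kubernetes Dockerfile uses, but check whatever base image you actually build from.

    # Hypothetical custom image: extend a Spark base image with GPU libraries.
    # "spark:2.4.0" and the CUDA package below are placeholders; this also
    # assumes a Debian-based image, so adjust the package manager to match.
    FROM spark:2.4.0

    USER root
    # Install whatever GPU user-space libraries the application needs.
    RUN apt-get update && \
        apt-get install -y --no-install-recommends cuda-toolkit-10-0 && \
        rm -rf /var/lib/apt/lists/*

    # The part that matters: keep the Spark entrypoint so spark-submit can
    # still launch drivers and executors in this image. ENTRYPOINT is
    # inherited from the base image anyway; re-declaring it just makes the
    # requirement explicit.
    ENTRYPOINT [ "/opt/entrypoint.sh" ]

An image like this is built and pushed like any other and handed to Spark through the usual image configuration (on Kubernetes, spark.kubernetes.container.image), with no changes needed in Spark itself.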

-Matt Cheah

From: Rong Ou <rong...@gmail.com>
Date: Friday, February 8, 2019 at 2:28 PM
To: "dev@spark.apache.org" <dev@spark.apache.org>
Subject: building docker images for GPU

Hi Spark dev,

I created a JIRA issue a while ago (https://issues.apache.org/jira/browse/SPARK-26398) to add GPU support to Spark Docker images, and sent a PR (https://github.com/apache/spark/pull/23347) that went through several iterations. It was suggested that it be discussed on the dev mailing list, so here we are. Please chime in if you have any questions or concerns.

A little more background: I have mainly been looking at running XGBoost on Spark using GPUs. Preliminary results show the potential for a significant speedup in training time, and this seems to be a popular use case for Spark. In any event, it would be nice for Spark to have better support for GPUs, and building GPU-enabled Docker images seems like a useful first step.

Thanks,

Rong
