[ https://issues.apache.org/jira/browse/SPARK-26398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757293#comment-17757293 ]

comet commented on SPARK-26398:
-------------------------------

I see #23347 was closed without being merged. Does that mean GPU support is not 
available in Spark and we need to build the Docker image ourselves? Is there a 
step-by-step guide available?

> Support building GPU docker images
> ----------------------------------
>
>                 Key: SPARK-26398
>                 URL: https://issues.apache.org/jira/browse/SPARK-26398
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes, Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Rong Ou
>            Priority: Minor
>
> To run Spark on Kubernetes, a user first needs to build docker images using 
> the `bin/docker-image-tool.sh` script. However, this script only supports 
> building images for running on CPUs. As parts of Spark and related libraries 
> (e.g. XGBoost) get accelerated on GPUs, it's desirable to build base images 
> that can take advantage of GPU acceleration.
> This issue only addresses building docker images with CUDA support. Actually 
> accelerating Spark on GPUs is outside the scope, as is supporting other types 
> of GPUs.
> Today if anyone wants to experiment with running Spark on Kubernetes with GPU 
> support, they have to write their own custom `Dockerfile`. By providing an 
> "official" way to build GPU-enabled docker images, we can make it easier to 
> get started.
> For now probably not that many people care about this, but it's a necessary 
> first step towards GPU acceleration for Spark on Kubernetes.
> The risks are minimal as we only need to make minor changes to 
> `bin/docker-image-tool.sh`. The PR is already done and will be attached. 
> Success means anyone can easily build Spark docker images with GPU support.
> Proposed API changes: add an optional `-g` flag to `bin/docker-image-tool.sh` 
> for building GPU versions of the JVM/Python/R docker images. When the `-g` 
> flag is omitted, the existing behavior is preserved.
> Design sketch: when the `-g` flag is specified, we append `-gpu` to the 
> docker image names and switch to dockerfiles based on the official CUDA 
> images. Since the CUDA images are based on Ubuntu while the Spark dockerfiles 
> are based on Alpine, the steps for setting up additional packages differ, so 
> there is a parallel set of `Dockerfile.gpu` files.
> Alternative: if we are willing to forgo Alpine and switch to Ubuntu for the 
> CPU-only images, the two sets of dockerfiles can be unified, and we can simply 
> pass in a different base image depending on whether the `-g` flag is present.
>  
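
To make the quoted proposal concrete: #23347 was never merged, so the `-g` flag 
below is hypothetical, but an invocation of `bin/docker-image-tool.sh` with it 
would presumably have looked something like this (the repository and tag are 
placeholders):

    # Hypothetical usage of the proposed (unmerged) -g flag.
    # Without -g, the existing CPU-only images would be built as before.
    ./bin/docker-image-tool.sh -r my-registry.example.com/spark -t v2.4.0 -g build
    ./bin/docker-image-tool.sh -r my-registry.example.com/spark -t v2.4.0 -g push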

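The design sketch (append `-gpu` to the image names and switch to CUDA-based 
dockerfiles) would roughly amount to logic like the following inside 
`bin/docker-image-tool.sh`; the variable names and paths here are assumptions, 
not the actual script contents:

    # Sketch only: pick an image-name suffix and a Dockerfile based on a GPU flag.
    REPO="my-registry.example.com/spark"   # placeholder registry/repo
    TAG="v2.4.0"                           # placeholder tag
    if [ -n "$GPU_BUILD" ]; then
      SUFFIX="-gpu"
      BASEDOCKERFILE="kubernetes/dockerfiles/spark/Dockerfile.gpu"   # CUDA/Ubuntu-based
    else
      SUFFIX=""
      BASEDOCKERFILE="kubernetes/dockerfiles/spark/Dockerfile"       # existing Alpine-based
    fi
    docker build -t "$REPO/spark$SUFFIX:$TAG" -f "$BASEDOCKERFILE" .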

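The alternative mentioned at the end of the description (one set of dockerfiles 
parameterized by base image) could look roughly like this; the `base_img` build 
argument and the image tags are illustrative only:

    # Sketch of the unified-Dockerfile alternative: the Dockerfile would start with
    #   ARG base_img
    #   FROM ${base_img}
    # and the build script would pick the base image depending on the GPU flag.
    REPO="my-registry.example.com/spark"   # placeholder registry/repo
    TAG="v2.4.0"                           # placeholder tag
    if [ -n "$GPU_BUILD" ]; then
      BASE_IMG="nvidia/cuda:10.0-runtime-ubuntu18.04"   # example CUDA base image
    else
      BASE_IMG="ubuntu:18.04"
    fi
    docker build --build-arg base_img="$BASE_IMG" \
      -t "$REPO/spark:$TAG" -f kubernetes/dockerfiles/spark/Dockerfile .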

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
