[
https://issues.apache.org/jira/browse/SPARK-26398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724436#comment-16724436
]
Rong Ou commented on SPARK-26398:
---------------------------------
https://github.com/apache/spark/pull/23347
> Support building GPU docker images
> ----------------------------------
>
> Key: SPARK-26398
> URL: https://issues.apache.org/jira/browse/SPARK-26398
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 2.4.0
> Reporter: Rong Ou
> Priority: Minor
>
> To run Spark on Kubernetes, a user first needs to build docker images using
> the `bin/docker-image-tool.sh` script. However, this script only supports
> building images for running on CPUs. As parts of Spark and related libraries
> (e.g. XGBoost) get accelerated on GPUs, it's desirable to build base images
> that can take advantage of GPU acceleration.
> This issue only addresses building docker images with CUDA support. Actually
> accelerating Spark on GPUs is outside the scope, as is supporting other types
> of GPUs.
> Today if anyone wants to experiment with running Spark on Kubernetes with GPU
> support, they have to write their own custom `Dockerfile`. By providing an
> "official" way to build GPU-enabled docker images, we can make it easier to
> get started.
> Few people probably care about this today, but it's a necessary first step
> towards GPU acceleration for Spark on Kubernetes.
> The risks are minimal as we only need to make minor changes to
> `bin/docker-image-tool.sh`. The PR is already done and will be attached.
> Success means anyone can easily build Spark docker images with GPU support.
> Proposed API changes: add an optional `-g` flag to
> `bin/docker-image-tool.sh` for building GPU versions of the JVM/Python/R
> docker images. When `-g` is omitted, existing behavior is preserved.
> Design sketch: when the `-g` flag is specified, we append `-gpu` to the
> docker image names and switch to dockerfiles based on the official CUDA
> images. Since the CUDA images are based on Ubuntu while the Spark dockerfiles
> are based on Alpine, the steps for setting up additional packages differ,
> so there is a parallel set of `Dockerfile.gpu` files.
> Alternative: if we are willing to forgo Alpine and switch to Ubuntu for the
> CPU-only images, the two sets of dockerfiles can be unified, and we can just
> pass in a different base image depending on whether the `-g` flag is present
> or not.
>
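
The design sketch quoted above can be illustrated roughly as follows. This is
a minimal sketch only, not the contents of `bin/docker-image-tool.sh` or of
the linked PR: the function names (`parse_gpu_flag`, `image_name`,
`dockerfile_for`), the image names `spark`, `spark-py` and `spark-r`, and the
`dockerfiles/...` directories are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the proposed -g behavior; all names here are
# illustrative and not taken from the real bin/docker-image-tool.sh.

# Print "true" if -g appears among the arguments, otherwise "false".
parse_gpu_flag() {
  local gpu=false arg
  for arg in "$@"; do
    if [ "$arg" = "-g" ]; then
      gpu=true
    fi
  done
  echo "$gpu"
}

# Print the image name, appending "-gpu" in GPU mode.
image_name() {
  local base="$1" gpu="$2"
  if [ "$gpu" = "true" ]; then
    echo "${base}-gpu"
  else
    echo "$base"
  fi
}

# Print the dockerfile to build from: the parallel Dockerfile.gpu in GPU
# mode, the existing Dockerfile otherwise.
dockerfile_for() {
  local dir="$1" gpu="$2"
  if [ "$gpu" = "true" ]; then
    echo "${dir}/Dockerfile.gpu"
  else
    echo "${dir}/Dockerfile"
  fi
}

# Demo: show what would be built for the JVM, Python and R images.
# The image names and directories below are assumed for illustration.
gpu="$(parse_gpu_flag "$@")"
for pair in "spark:dockerfiles/spark" \
            "spark-py:dockerfiles/spark/bindings/python" \
            "spark-r:dockerfiles/spark/bindings/R"; do
  name="${pair%%:*}"
  dir="${pair#*:}"
  echo "$(image_name "$name" "$gpu") <- $(dockerfile_for "$dir" "$gpu")"
done
```

Run without arguments, the demo lists the unchanged image names and
dockerfiles, which mirrors the requirement that existing behavior is
preserved when `-g` is omitted. Under the alternative described above (a
single Ubuntu-based dockerfile), `dockerfile_for` would go away and the flag
would only select which base image is passed to the build.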