Unfortunately, the answer you got from the forum is correct. The current
spark-rapids package doesn't support RDDs. Please see
https://nvidia.github.io/spark-rapids/docs/FAQ.html#what-parts-of-apache-spark-are-accelerated
I guess to be able to use spark-rapids, one option you have would be to
spark-rapids is not part of Spark, so I can't speak to it, but Spark
itself does not use GPUs at all.
It does let you configure a task to request a certain number of GPUs, and
that would work for RDDs, but it's up to the code being executed to use the
GPUs.
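
To make that concrete, here is a rough sketch of what requesting GPUs per task
might look like from PySpark. The three `spark.*.resource.gpu.*` keys and
`TaskContext.resources()` are Spark 3 resource-scheduling features; the helper
names (`gpu_scheduling_conf`, `assigned_gpus`, `run_on_gpu`) and the discovery
script path are made up for illustration, and the right values depend on your
cluster.

```python
def gpu_scheduling_conf(gpus_per_executor, gpus_per_task, discovery_script):
    """Build Spark 3 resource-scheduling settings that request GPUs.

    Hypothetical helper: it only assembles configuration keys and does not
    start Spark or touch any GPU.
    """
    return {
        # GPUs each executor asks the cluster manager for
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # GPUs each task needs before Spark will schedule it
        "spark.task.resource.gpu.amount": str(gpus_per_task),
        # Script the executor runs to report which GPU addresses it has
        "spark.executor.resource.gpu.discoveryScript": discovery_script,
    }


def assigned_gpus():
    """Return the GPU addresses Spark assigned to the current task.

    Must be called inside a running task. PySpark is imported lazily so the
    config helper above can be used without PySpark installed.
    """
    from pyspark import TaskContext
    return TaskContext.get().resources()["gpu"].addresses


def run_on_gpu(partition):
    """Example mapPartitions function (hypothetical).

    Spark only hands out the addresses; the code here would have to drive the
    GPU itself (e.g. through a CUDA-aware library). This sketch just tags each
    record with the first assigned address.
    """
    gpus = assigned_gpus()
    for record in partition:
        yield (gpus[0], record)


# Usage on a cluster with GPUs (the script path is a placeholder):
#   from pyspark import SparkConf, SparkContext
#   settings = gpu_scheduling_conf(1, 1, "/path/to/gpu_discovery.sh")
#   sc = SparkContext(conf=SparkConf().setAll(list(settings.items())))
#   sc.parallelize(range(8), 4).mapPartitions(run_on_gpu).collect()
```

Note that nothing here makes the RDD work itself run on a GPU; it only gets
each task scheduled with a GPU address it can use.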
On Tue, Sep 21, 2021 at 1:23 PM
Hi,
I am currently trying to run genomic analysis pipelines using Hail (a library
for genomic analysis written in Python and Scala). Recently, Apache Spark
3 was released, and it supports GPU usage.
I tried using the spark-rapids library to start an on-premise Slurm cluster
with GPU nodes. I was able to