Hi Martin,

I tested local mode with the RAPIDS Accelerator for Apache Spark and it
works fine for me.
The only possible issue is CUDA 11.2: the supported CUDA version per
https://nvidia.github.io/spark-rapids/docs/download.html is 11.0.

Here is a quick test using Spark local mode.
Note: when I was testing local mode, I made sure there was nothing in
spark-defaults.conf, so everything started clean.
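
For reference, a minimal way to launch spark-shell in local mode with the
plugin enabled looks like this (the jar path is a placeholder for whatever
release you have; depending on the release you may also need the matching
cudf jar on --jars):

======
$SPARK_HOME/bin/spark-shell \
  --master local[*] \
  --jars /path/to/rapids-4-spark_2.12-<version>.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin
======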

======
scala> val df = sc.makeRDD(1 to 100, 6).toDF
df: org.apache.spark.sql.DataFrame = [value: int]

scala> val df2 = sc.makeRDD(1 to 100, 6).toDF
df2: org.apache.spark.sql.DataFrame = [value: int]

scala> df.select($"value" as "a").join(df2.select($"value" as "b"), $"a" === $"b").count
res0: Long = 100

scala> df.select($"value" as "a").join(df2.select($"value" as "b"), $"a" === $"b").explain()
== Physical Plan ==
GpuColumnarToRow false
+- GpuShuffledHashJoin [a#29], [b#31], Inner, GpuBuildRight, false
   :- GpuShuffleCoalesce 2147483647
   :  +- GpuColumnarExchange gpuhashpartitioning(a#29, 10), ENSURE_REQUIREMENTS, [id=#221]
   :     +- GpuProject [value#2 AS a#29]
   :        +- GpuRowToColumnar TargetSize(2147483647)
   :           +- *(1) SerializeFromObject [input[0, int, false] AS value#2]
   :              +- Scan[obj#1]
   +- GpuCoalesceBatches RequireSingleBatch
      +- GpuShuffleCoalesce 2147483647
         +- GpuColumnarExchange gpuhashpartitioning(b#31, 10), ENSURE_REQUIREMENTS, [id=#228]
            +- GpuProject [value#8 AS b#31]
               +- GpuRowToColumnar TargetSize(2147483647)
                  +- *(2) SerializeFromObject [input[0, int, false] AS value#8]
                     +- Scan[obj#7]
======
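
If you want to compare against the CPU plan in the same session, the plugin
can be toggled at runtime via spark.rapids.sql.enabled (a quick sketch):

======
scala> spark.conf.set("spark.rapids.sql.enabled", "false")

scala> df.select($"value" as "a").join(df2.select($"value" as "b"), $"a" === $"b").explain()
// the plan should now show the regular CPU operators instead of the Gpu* ones

scala> spark.conf.set("spark.rapids.sql.enabled", "true")
======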

Thanks,
Hao


