Vajiha has filed a spark-rapids discussion at
https://github.com/NVIDIA/spark-rapids/discussions/7205, so if you are
interested, please follow it there.
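
For anyone finding this thread later: the fatal error in the quoted log
("cudaMallocAsync not supported with this CUDA driver/runtime version")
is RMM failing to create its ASYNC pool, which requires a driver new
enough for the cudaMallocAsync API introduced in CUDA 11.2. Until the
NVIDIA driver is upgraded, one possible workaround (a sketch based on
the error text, not a fix confirmed in the linked discussion) is to
move the plugin off the ASYNC allocator with the documented
spark.rapids.memory.gpu.pool setting; the app name and the toy query
below are illustrative only:

    from pyspark.sql import SparkSession

    # Minimal PySpark session wired for the RAPIDS Accelerator.
    # The pool setting is the point of the sketch: ARENA (or NONE)
    # sidesteps the cudaMallocAsync path the older driver rejects.
    spark = (
        SparkSession.builder
        .appName("rapids-pool-check")  # illustrative name
        .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
        .config("spark.rapids.sql.enabled", "true")
        .config("spark.rapids.memory.gpu.pool", "ARENA")  # log shows ASYNC was used
        .getOrCreate()
    )

    # Tiny GPU-eligible query to confirm the plugin initializes.
    spark.range(1000).selectExpr("id", "id * 2 AS doubled").show(5)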

On Wed, Nov 30, 2022 at 7:17 AM Vajiha Begum S A <
vajihabegu...@maestrowiz.com> wrote:

> Hi,
> I'm using an Ubuntu system with an NVIDIA Quadro K1200 GPU with 20 GB of
> GPU memory.
> Installed: the cudf 22.10.0 jar, the rapids-4-spark_2.12-22.10.0 jar,
> CUDA Toolkit 11.8.0 (Linux version), and Java 8.
> I'm running only a single server; the master is localhost.
>
> I'm trying to run PySpark code through spark-submit and Python IDLE, and
> I'm getting errors. Kindly help me resolve this error and suggest where
> I have made mistakes.
>
> *Error when running code through spark-submit:*
>    spark-submit /home/mwadmin/Documents/test.py
> 22/11/30 14:59:32 WARN Utils: Your hostname, mwadmin-HP-Z440-Workstation
> resolves to a loopback address: 127.0.1.1; using ***.***.**.** instead (on
> interface eno1)
> 22/11/30 14:59:32 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
> another address
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 22/11/30 14:59:32 INFO SparkContext: Running Spark version 3.2.2
> 22/11/30 14:59:32 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 22/11/30 14:59:33 INFO ResourceUtils:
> ==============================================================
> 22/11/30 14:59:33 INFO ResourceUtils: No custom resources configured for
> spark.driver.
> 22/11/30 14:59:33 INFO ResourceUtils:
> ==============================================================
> 22/11/30 14:59:33 INFO SparkContext: Submitted application: Spark.com
> 22/11/30 14:59:33 INFO ResourceProfile: Default ResourceProfile created,
> executor resources: Map(cores -> name: cores, amount: 1, script: , vendor:
> , memory -> name: memory, amount: 1024, script: , vendor: , offHeap ->
> name: offHeap, amount: 0, script: , vendor: , gpu -> name: gpu, amount: 1,
> script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0,
> gpu -> name: gpu, amount: 0.5)
> 22/11/30 14:59:33 INFO ResourceProfile: Limiting resource is cpus at 1
> tasks per executor
> 22/11/30 14:59:33 WARN ResourceUtils: The configuration of resource: gpu
> (exec = 1, task = 0.5/2, runnable tasks = 2) will result in wasted
> resources due to resource cpus limiting the number of runnable tasks per
> executor to: 1. Please adjust your configuration.
> 22/11/30 14:59:33 INFO ResourceProfileManager: Added ResourceProfile id: 0
> 22/11/30 14:59:33 INFO SecurityManager: Changing view acls to: mwadmin
> 22/11/30 14:59:33 INFO SecurityManager: Changing modify acls to: mwadmin
> 22/11/30 14:59:33 INFO SecurityManager: Changing view acls groups to:
> 22/11/30 14:59:33 INFO SecurityManager: Changing modify acls groups to:
> 22/11/30 14:59:33 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users  with view permissions: Set(mwadmin);
> groups with view permissions: Set(); users  with modify permissions:
> Set(mwadmin); groups with modify permissions: Set()
> 22/11/30 14:59:33 INFO Utils: Successfully started service 'sparkDriver'
> on port 45883.
> 22/11/30 14:59:33 INFO SparkEnv: Registering MapOutputTracker
> 22/11/30 14:59:33 INFO SparkEnv: Registering BlockManagerMaster
> 22/11/30 14:59:33 INFO BlockManagerMasterEndpoint: Using
> org.apache.spark.storage.DefaultTopologyMapper for getting topology
> information
> 22/11/30 14:59:33 INFO BlockManagerMasterEndpoint:
> BlockManagerMasterEndpoint up
> 22/11/30 14:59:33 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
> 22/11/30 14:59:33 INFO DiskBlockManager: Created local directory at
> /tmp/blockmgr-647d2c2a-72e4-402d-aeff-d7460726eb6d
> 22/11/30 14:59:33 INFO MemoryStore: MemoryStore started with capacity
> 366.3 MiB
> 22/11/30 14:59:33 INFO SparkEnv: Registering OutputCommitCoordinator
> 22/11/30 14:59:33 INFO Utils: Successfully started service 'SparkUI' on
> port 4040.
> 22/11/30 14:59:33 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at
> http://localhost:4040
> 22/11/30 14:59:33 INFO ShimLoader: Loading shim for Spark version: 3.2.2
> 22/11/30 14:59:33 INFO ShimLoader: Complete Spark build info: 3.2.2,
> https://github.com/apache/spark, HEAD,
> 78a5825fe266c0884d2dd18cbca9625fa258d7f7, 2022-07-11T15:44:21Z
> 22/11/30 14:59:33 INFO ShimLoader: findURLClassLoader found a
> URLClassLoader org.apache.spark.util.MutableURLClassLoader@1530c739
> 22/11/30 14:59:33 INFO ShimLoader: Updating spark classloader
> org.apache.spark.util.MutableURLClassLoader@1530c739 with the URLs:
> jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark3xx-common/,
> jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark322/
> 22/11/30 14:59:33 INFO ShimLoader: Spark classLoader
> org.apache.spark.util.MutableURLClassLoader@1530c739 updated successfully
> 22/11/30 14:59:33 INFO ShimLoader: Updating spark classloader
> org.apache.spark.util.MutableURLClassLoader@1530c739 with the URLs:
> jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark3xx-common/,
> jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark322/
> 22/11/30 14:59:33 INFO ShimLoader: Spark classLoader
> org.apache.spark.util.MutableURLClassLoader@1530c739 updated successfully
> 22/11/30 14:59:33 INFO RapidsPluginUtils: RAPIDS Accelerator build:
> {version=22.10.0, user=, url=https://github.com/NVIDIA/spark-rapids.git,
> date=2022-10-17T11:25:41Z,
> revision=c75a2eafc9ce9fb3e6ab75c6677d97bf681bff50, cudf_version=22.10.0,
> branch=HEAD}
> 22/11/30 14:59:33 INFO RapidsPluginUtils: RAPIDS Accelerator JNI build:
> {version=22.10.0, user=, url=
> https://github.com/NVIDIA/spark-rapids-jni.git,
> date=2022-10-14T05:19:41Z,
> revision=b2c02b61afe1747f3741d6c5e2064edb8da51b32, branch=HEAD}
> 22/11/30 14:59:33 INFO RapidsPluginUtils: cudf build: {version=22.10.0,
> user=, date=2022-10-14T01:51:22Z,
> revision=8ffe375d85f8fd0f98e0052f36ccd820a669d0ab, branch=HEAD}
> 22/11/30 14:59:33 WARN RapidsPluginUtils: RAPIDS Accelerator 22.10.0 using
> cudf 22.10.0.
> 22/11/30 14:59:33 WARN RapidsPluginUtils:
> spark.rapids.sql.multiThreadedRead.numThreads is set to 20.
> 22/11/30 14:59:33 WARN RapidsPluginUtils: RAPIDS Accelerator is enabled,
> to disable GPU support set `spark.rapids.sql.enabled` to false.
> 22/11/30 14:59:33 WARN RapidsPluginUtils: spark.rapids.sql.explain is set
> to `NOT_ON_GPU`. Set it to 'NONE' to suppress the diagnostics logging about
> the query placement on the GPU.
> 22/11/30 14:59:33 INFO DriverPluginContainer: Initialized driver component
> for plugin com.nvidia.spark.SQLPlugin.
> 22/11/30 14:59:33 WARN ResourceUtils: The configuration of resource: gpu
> (exec = 1, task = 0.5/2, runnable tasks = 2) will result in wasted
> resources due to resource cpus limiting the number of runnable tasks per
> executor to: 1. Please adjust your configuration.
> 22/11/30 14:59:34 INFO Executor: Starting executor ID driver on host
> ***.***.**.**
> 22/11/30 14:59:34 INFO RapidsExecutorPlugin: RAPIDS Accelerator build:
> {version=22.10.0, user=, url=https://github.com/NVIDIA/spark-rapids.git,
> date=2022-10-17T11:25:41Z,
> revision=c75a2eafc9ce9fb3e6ab75c6677d97bf681bff50, cudf_version=22.10.0,
> branch=HEAD}
> 22/11/30 14:59:34 INFO RapidsExecutorPlugin: cudf build: {version=22.10.0,
> user=, date=2022-10-14T01:51:22Z,
> revision=8ffe375d85f8fd0f98e0052f36ccd820a669d0ab, branch=HEAD}
> 22/11/30 14:59:34 INFO RapidsExecutorPlugin: Initializing memory from
> Executor Plugin
> 22/11/30 14:59:47 INFO Executor: Told to re-register on heartbeat
> 22/11/30 14:59:47 INFO BlockManager: BlockManager null re-registering with
> master
> 22/11/30 14:59:48 INFO BlockManagerMaster: Registering BlockManager null
> 22/11/30 14:59:48 ERROR Inbox: Ignoring error
> java.lang.NullPointerException
> at org.apache.spark.storage.BlockManagerMasterEndpoint.org
> $apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:534)
> at
> org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:117)
> at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
> at org.apache.spark.rpc.netty.MessageLoop.org
> $apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
> at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:750)
> 22/11/30 14:59:48 WARN Executor: Issue communicating with driver in
> heartbeater
> org.apache.spark.SparkException: Exception thrown in awaitResult:
> at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
> at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
> at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:103)
> at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:87)
> at
> org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:78)
> at org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:626)
> at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1009)
> at
> org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:212)
> at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2048)
> at org.apache.spark.Heartbeater$$anon$1.run(Heartbeater.scala:46)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.NullPointerException
> at org.apache.spark.storage.BlockManagerMasterEndpoint.org
> $apache$spark$storage$BlockManagerMasterEndpoint$$register(BlockManagerMasterEndpoint.scala:534)
> at
> org.apache.spark.storage.BlockManagerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(BlockManagerMasterEndpoint.scala:117)
> at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:103)
> at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
> at org.apache.spark.rpc.netty.MessageLoop.org
> $apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
> at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> 22/11/30 14:59:52 INFO GpuDeviceManager: Initializing RMM ASYNC pool size
> = 3137.0625 MB on gpuId 0
> 22/11/30 14:59:52 INFO GpuDeviceManager: Using per-thread default stream
> 22/11/30 14:59:52 ERROR RapidsExecutorPlugin: Exception in the executor
> plugin, shutting down!
> *ai.rapids.cudf.CudfException: RMM failure at:
> /home/jenkins/agent/workspace/jenkins-cudf-release-39-cuda11/cpp/build/_deps/rmm-src/include/rmm/mr/device/cuda_async_memory_resource.hpp:90:
> cudaMallocAsync not supported with this CUDA driver/runtime version*
> at ai.rapids.cudf.Rmm.initializeInternal(Native Method)
> at ai.rapids.cudf.Rmm.initialize(Rmm.java:119)
> at
> com.nvidia.spark.rapids.GpuDeviceManager$.initializeRmm(GpuDeviceManager.scala:296)
> at
> com.nvidia.spark.rapids.GpuDeviceManager$.initializeMemory(GpuDeviceManager.scala:328)
> at
> com.nvidia.spark.rapids.GpuDeviceManager$.initializeGpuAndMemory(GpuDeviceManager.scala:137)
> at com.nvidia.spark.rapids.RapidsExecutorPlugin.init(Plugin.scala:258)
> at
> org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
> at
> scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> at
> scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
> at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
> at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
> at
> org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113)
> at
> org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
> at
> org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
> at
> org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:253)
> at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:231)
> at org.apache.spark.executor.Executor.<init>(Executor.scala:253)
> at
> org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:64)
> at
> org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:132)
> at
> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:220)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:581)
> at
> org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> at py4j.Gateway.invoke(Gateway.java:238)
> at
> py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
> at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
> at
> py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
> at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
> at java.lang.Thread.run(Thread.java:750)
> 22/11/30 14:59:52 INFO DiskBlockManager: Shutdown hook called
> 22/11/30 14:59:52 INFO ShutdownHookManager: Shutdown hook called
> 22/11/30 14:59:52 INFO ShutdownHookManager: Deleting directory
> /tmp/spark-58488513-7d53-42f2-8bc4-cdcb34b5cf49
> 22/11/30 14:59:52 INFO ShutdownHookManager: Deleting directory
> /tmp/spark-24b8e0ea-43d4-430a-9756-b1e84ceaa1ff/userFiles-5ce7f28f-16db-48fd-94bd-e9ef563c01f1
> 22/11/30 14:59:52 INFO ShutdownHookManager: Deleting directory
> /tmp/spark-24b8e0ea-43d4-430a-9756-b1e84ceaa1ff
>
>
>
> *Error when running code through Python IDLE:*
>     raise Py4JNetworkError("Answer from Java side is empty")
> py4j.protocol.Py4JNetworkError: Answer from Java side is empty
>
> During handling of the above exception, another exception occurred:
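One more note for the archives: the repeated ResourceUtils warning in
the log is worth acting on as well. With
spark.executor.resource.gpu.amount=1 and
spark.task.resource.gpu.amount=0.5, the GPU can host two concurrent
tasks, but a single executor core caps the executor at one task, so
half of the GPU share goes unused. A minimal sketch of one consistent
pairing, assuming a single-GPU local setup (the exact amounts are
illustrative):

    from pyspark.sql import SparkSession

    # Two executor cores paired with half a GPU per task: the CPU
    # task slots (2) now match the GPU task slots (1 / 0.5 = 2).
    spark = (
        SparkSession.builder
        .config("spark.executor.cores", "2")
        .config("spark.executor.resource.gpu.amount", "1")
        .config("spark.task.resource.gpu.amount", "0.5")
        .getOrCreate()
    )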
