Re: Spark with GPU

2023-02-07 Thread Alessandro Bellina
For Apache Spark a stand-alone worker can manage all the resources of the
box, including all GPUs. So a Spark worker could be set up to manage N GPUs
in the box via spark.worker.resource.gpu.amount, and then
spark.executor.resource.gpu.amount, as provided on app submit, assigns
GPU resources to executors as they come up. Here is a getting started guide
for spark-rapids, but I am not sure if that's what you are looking to use.
Either way, it may help with the resource setup:
https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html#spark-standalone-cluster
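
To make that concrete, here is a minimal sketch of the standalone setup
(hostnames, GPU counts, and paths are illustrative assumptions;
getGpusResources.sh is the discovery script that ships with the Spark
distribution under examples/src/main/scripts):

    # spark-env.sh on each GPU worker: advertise 2 GPUs and how to find them
    SPARK_WORKER_OPTS="-Dspark.worker.resource.gpu.amount=2 \
      -Dspark.worker.resource.gpu.discoveryScript=/opt/spark/examples/src/main/scripts/getGpusResources.sh"

    # on app submit: one GPU per executor, two tasks sharing each GPU
    spark-submit \
      --master spark://master-host:7077 \
      --conf spark.executor.resource.gpu.amount=1 \
      --conf spark.task.resource.gpu.amount=0.5 \
      my_app.py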

Not every node in the cluster needs to have GPUs. You could request 0 GPUs
for an app (default value of spark.executor.resource.gpu.amount), and the
executors will not require this resource.

If you are using a YARN or Kubernetes cluster there are other configs to pay
attention to; for example, Kubernetes additionally needs a resource vendor
(see the sketch below). If you need help with those, let us know.
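
A sketch of the Kubernetes-specific bits (the discovery script path inside
the container is an assumption):

    --conf spark.executor.resource.gpu.amount=1 \
    --conf spark.executor.resource.gpu.vendor=nvidia.com \
    --conf spark.executor.resource.gpu.discoveryScript=/opt/sparkRapidsPlugin/getGpusResources.sh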

On Sun, Feb 5, 2023 at 1:50 PM Jack Goodson  wrote:

> As far as I understand, you will need a GPU for each worker node, or you
> will need to partition the GPU processing somehow across the nodes, which I
> think would defeat the purpose. In Databricks, for example, when you select
> GPU workers there is a GPU allocated to each worker. I assume this is the
> “correct” approach to this problem.
>
> On Mon, 6 Feb 2023 at 8:17 AM, Mich Talebzadeh 
> wrote:
>
>> if you have several nodes with only one node having GPUs, you still have
>> to wait for the result set to complete. In other words it will only be as
>> fast as the slowest node, the lowest common denominator.
>>
>> my postulation
>>
>> HTH
>>
>> On Sun, 5 Feb 2023 at 13:38, Irene Markelic  wrote:
>>
>>> Hello,
>>>
>>> has anyone used spark with GPUs? I wonder if every worker node in a
>>> cluster needs one GPU or if you can have several worker nodes of which
>>> only one has a GPU.
>>>
>>> Thank you!
>>>
>>>
>>>
>>>
>>>


Re: Spark with GPU

2023-02-05 Thread Jack Goodson
As far as I understand, you will need a GPU for each worker node, or you will
need to partition the GPU processing somehow across the nodes, which I think
would defeat the purpose. In Databricks, for example, when you select GPU
workers there is a GPU allocated to each worker. I assume this is the
“correct” approach to this problem.

On Mon, 6 Feb 2023 at 8:17 AM, Mich Talebzadeh 
wrote:

> if you have several nodes with only one node having GPUs, you still have
> to wait for the result set to complete. In other words it will only be as
> fast as the slowest node, the lowest common denominator.
>
> my postulation
>
> HTH
>
> On Sun, 5 Feb 2023 at 13:38, Irene Markelic  wrote:
>
>> Hello,
>>
>> has anyone used spark with GPUs? I wonder if every worker node in a
>> cluster needs one GPU or if you can have several worker nodes of which
>> only one has a GPU.
>>
>> Thank you!
>>
>>
>>
>>
>>


Re: Spark with GPU

2023-02-05 Thread Mich Talebzadeh
if you have several nodes with only one node having GPUs, you still have to
wait for the result set to complete. In other words it will only be as fast
as the slowest node, the lowest common denominator.

my postulation

HTH



view my LinkedIn profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sun, 5 Feb 2023 at 13:38, Irene Markelic  wrote:

> Hello,
>
> has anyone used spark with GPUs? I wonder if every worker node in a
> cluster needs one GPU or if you can have several worker nodes of which
> only one has a GPU.
>
> Thank you!
>
>
>
>
>


Spark with GPU

2023-02-05 Thread Irene Markelic

Hello,

has anyone used spark with GPUs? I wonder if every worker node in a 
cluster needs one GPU or if you can have several worker nodes of which 
only one has a GPU.


Thank you!



-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Error - using Spark with GPU

2022-11-30 Thread Vajiha Begum S A
 spark-submit /home/mwadmin/Documents/test.py
22/11/30 14:59:32 WARN Utils: Your hostname, mwadmin-HP-Z440-Workstation
resolves to a loopback address: 127.0.1.1; using ***.***.**.** instead (on
interface eno1)
22/11/30 14:59:32 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
another address
Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
22/11/30 14:59:32 INFO SparkContext: Running Spark version 3.2.2
22/11/30 14:59:32 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
22/11/30 14:59:33 INFO ResourceUtils:
==
22/11/30 14:59:33 INFO ResourceUtils: No custom resources configured for
spark.driver.
22/11/30 14:59:33 INFO ResourceUtils:
==
22/11/30 14:59:33 INFO SparkContext: Submitted application: Spark.com
22/11/30 14:59:33 INFO ResourceProfile: Default ResourceProfile created,
executor resources: Map(cores -> name: cores, amount: 1, script: , vendor:
, memory -> name: memory, amount: 1024, script: , vendor: , offHeap ->
name: offHeap, amount: 0, script: , vendor: , gpu -> name: gpu, amount: 1,
script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0,
gpu -> name: gpu, amount: 0.5)
22/11/30 14:59:33 INFO ResourceProfile: Limiting resource is cpus at 1
tasks per executor
22/11/30 14:59:33 WARN ResourceUtils: The configuration of resource: gpu
(exec = 1, task = 0.5/2, runnable tasks = 2) will result in wasted
resources due to resource cpus limiting the number of runnable tasks per
executor to: 1. Please adjust your configuration.
22/11/30 14:59:33 INFO ResourceProfileManager: Added ResourceProfile id: 0
22/11/30 14:59:33 INFO SecurityManager: Changing view acls to: mwadmin
22/11/30 14:59:33 INFO SecurityManager: Changing modify acls to: mwadmin
22/11/30 14:59:33 INFO SecurityManager: Changing view acls groups to:
22/11/30 14:59:33 INFO SecurityManager: Changing modify acls groups to:
22/11/30 14:59:33 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users  with view permissions: Set(mwadmin);
groups with view permissions: Set(); users  with modify permissions:
Set(mwadmin); groups with modify permissions: Set()
22/11/30 14:59:33 INFO Utils: Successfully started service 'sparkDriver' on
port 45883.
22/11/30 14:59:33 INFO SparkEnv: Registering MapOutputTracker
22/11/30 14:59:33 INFO SparkEnv: Registering BlockManagerMaster
22/11/30 14:59:33 INFO BlockManagerMasterEndpoint: Using
org.apache.spark.storage.DefaultTopologyMapper for getting topology
information
22/11/30 14:59:33 INFO BlockManagerMasterEndpoint:
BlockManagerMasterEndpoint up
22/11/30 14:59:33 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
22/11/30 14:59:33 INFO DiskBlockManager: Created local directory at
/tmp/blockmgr-647d2c2a-72e4-402d-aeff-d7460726eb6d
22/11/30 14:59:33 INFO MemoryStore: MemoryStore started with capacity 366.3
MiB
22/11/30 14:59:33 INFO SparkEnv: Registering OutputCommitCoordinator
22/11/30 14:59:33 INFO Utils: Successfully started service 'SparkUI' on
port 4040.
22/11/30 14:59:33 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at
http://localhost:4040
22/11/30 14:59:33 INFO ShimLoader: Loading shim for Spark version: 3.2.2
22/11/30 14:59:33 INFO ShimLoader: Complete Spark build info: 3.2.2,
https://github.com/apache/spark, HEAD,
78a5825fe266c0884d2dd18cbca9625fa258d7f7, 2022-07-11T15:44:21Z
22/11/30 14:59:33 INFO ShimLoader: findURLClassLoader found a
URLClassLoader org.apache.spark.util.MutableURLClassLoader@1530c739
22/11/30 14:59:33 INFO ShimLoader: Updating spark classloader
org.apache.spark.util.MutableURLClassLoader@1530c739 with the URLs:
jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark3xx-common/,
jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark322/
22/11/30 14:59:33 INFO ShimLoader: Spark classLoader
org.apache.spark.util.MutableURLClassLoader@1530c739 updated successfully
22/11/30 14:59:33 INFO ShimLoader: Updating spark classloader
org.apache.spark.util.MutableURLClassLoader@1530c739 with the URLs:
jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark3xx-common/,
jar:file:/home/mwadmin/spark-3.2.2-bin-hadoop3.2/jars/rapids-4-spark_2.12-22.10.0.jar!/spark322/
22/11/30 14:59:33 INFO ShimLoader: Spark classLoader
org.apache.spark.util.MutableURLClassLoader@1530c739 updated successfully
22/11/30 14:59:33 INFO RapidsPluginUtils: RAPIDS Accelerator build:
{version=22.10.0, user=, url=https://github.com/NVIDIA/spark-rapids.git,
date=2022-10-17T11:25:41Z,
revision=c75a2eafc9ce9fb3e6ab75c6677d97bf681bff50, cudf_version=22.10.0,
branch=HEAD}
22/11/30 14:59:33 INFO RapidsPluginUtils: RAPIDS Accelerator JNI build:
{version=22.10.0, user=, url=https://github.com/NVIDIA/spark-rapids-jni.git,
date=2022-10-14T05:19:41Z,
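
The actionable part of this log is the ResourceUtils warning: with the
defaults shown (1 executor core, task cpus 1.0, and 1 GPU per executor at
0.5 GPU per task), the CPU side allows only 1 running task per executor
while the GPU side allows 2, so half of each GPU sits idle. One way to
balance it, as a sketch (assuming two concurrent tasks per GPU is what you
want), is to raise the executor cores to match:

    spark-submit \
      --conf spark.executor.cores=2 \
      --conf spark.executor.resource.gpu.amount=1 \
      --conf spark.task.resource.gpu.amount=0.5 \
      /home/mwadmin/Documents/test.py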

Re: Spark with GPU

2022-08-13 Thread Gourav Sengupta
One of the best things that could have happened to Spark (now mostly an
overhyped ETL tool with small incremental optimisation changes and no
large-scale innovation) is the release by NVIDIA of GPU processing support.
You need some time to get your head around it, but it is supported quite
easily in AWS EMR with a few configuration changes, and you can see massive
gains, given AWS has different varieties of GPUs.

We can end up saving a lot of time and money running a few selected
processes on the GPU, and there is an easy fallback option to the CPU,
obviously.
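
For EMR specifically, enabling the plug-in is roughly a matter of cluster
configuration along these lines (a sketch based on EMR's RAPIDS support;
check the EMR and spark-rapids docs for the exact classifications and
values before relying on it):

    [
      { "Classification": "spark",
        "Properties": { "enableSparkRapids": "true" } },
      { "Classification": "spark-defaults",
        "Properties": {
          "spark.plugins": "com.nvidia.spark.SQLPlugin",
          "spark.executor.resource.gpu.amount": "1",
          "spark.task.resource.gpu.amount": "0.25"
        } }
    ]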

If you are in AWS, try Athena, or Redshift, or Snowflake; they get a lot
more done with less overhead and fewer heartaches. I particularly like how
native integration between ML systems like SageMaker works via Redshift
queries, and Aurora Postgres - that is true unified data analytics at work.


Regards,
Gourav Sengupta


On Sat, Aug 13, 2022 at 6:16 PM Alessandro Bellina 
wrote:

> This thread may be better suited as a discussion in our Spark plug-in’s
> repo:
> https://github.com/NVIDIA/spark-rapids/discussions.
>
> Just to answer the questions that were asked so far:
>
> I would recommend checking our documentation for what is supported as of
> our latest release (22.06):
> https://nvidia.github.io/spark-rapids/docs/supported_ops.html, as we have
> quite a bit of support for decimal and also nested types and keep adding
> coverage.
>
> For UDFs, if you are willing to rewrite it to use the RAPIDS cuDF API, we
> do have support and examples on how to do this, please check out this:
>
> https://nvidia.github.io/spark-rapids/docs/additional-functionality/rapids-udfs.html.
> Automatically translating UDFs to GPUs is not easy. We have a Scala UDF to
> catalyst transpiler that will be able to handle very simple UDFs where
> every operation has a corresponding catalyst expression, that may be worth
> checking out:
>
> https://nvidia.github.io/spark-rapids/docs/additional-functionality/udf-to-catalyst-expressions.html.
> This transpiler falls back if it can’t translate any part of the UDF.
>
> The plug-in will not fail in cases where it can’t run part of a query on
> the GPU; it will fall back and run on the CPU for the parts of the query
> that are not supported. It will also output what it can’t optimize on the
> driver (on .explain), which should help narrow down an expression or exec
> that should be looked at further.
>
> There are other resources all linked from here:
> https://nvidia.github.io/spark-rapids/ (of interest may be the
> Qualification Tool, and our Getting Started guide for different cloud
> providers and distros).
>
> I’d say let’s continue this in the discussions or as issues in the
> spark-rapids repo if you have further questions or run into issues, as it’s
> not specific to Apache Spark.
>
> Thanks!
>
> Alessandro
>
> On Sat, Aug 13, 2022 at 10:53 AM Sean Owen  wrote:
>
>> This isn't a Spark question, but rather a question about whatever Spark
>> application you are talking about. RAPIDS?
>>
>> On Sat, Aug 13, 2022 at 10:35 AM rajat kumar 
>> wrote:
>>
>>> Thanks Sean.
>>>
>>> Also, I observed that lots of things are not supported on GPU by NVIDIA,
>>> e.g. nested types, decimal type, UDFs, etc.
>>> So, will it use the CPU automatically for running those tasks which require
>>> nested types, or will it run on the GPU and fail?
>>>
>>> Thanks
>>> Rajat
>>>
>>> On Sat, Aug 13, 2022, 18:54 Sean Owen  wrote:
>>>
 Spark does not use GPUs itself, but tasks you run on Spark can.
 The only 'support' there is, is for requesting GPUs as resources for
 tasks, so it's just a question of resource management. That's in OSS.

 On Sat, Aug 13, 2022 at 8:16 AM rajat kumar 
 wrote:

> Hello,
>
> I have been hearing about GPU in spark3.
>
> For batch jobs, will GPUs help to improve performance? Also, is GPU
> support available only on Databricks, or on cloud-based Spark clusters?
>
> I am new; if anyone can share insight, it will help.
>
> Thanks
> Rajat
>



Re: Spark with GPU

2022-08-13 Thread Alessandro Bellina
This thread may be better suited as a discussion in our Spark plug-in’s
repo:
https://github.com/NVIDIA/spark-rapids/discussions.

Just to answer the questions that were asked so far:

I would recommend checking our documentation for what is supported as of
our latest release (22.06):
https://nvidia.github.io/spark-rapids/docs/supported_ops.html, as we have
quite a bit of support for decimal and also nested types and keep adding
coverage.

For UDFs, if you are willing to rewrite it to use the RAPIDS cuDF API, we
do have support and examples on how to do this, please check out this:
https://nvidia.github.io/spark-rapids/docs/additional-functionality/rapids-udfs.html.
Automatically translating UDFs to GPUs is not easy. We have a Scala UDF to
catalyst transpiler that will be able to handle very simple UDFs where
every operation has a corresponding catalyst expression, that may be worth
checking out:
https://nvidia.github.io/spark-rapids/docs/additional-functionality/udf-to-catalyst-expressions.html.
This transpiler falls back if it can’t translate any part of the UDF.
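
As a toy illustration of the kind of UDF the transpiler can handle (the
DataFrame and column names are made up; the config key is the one documented
for the plug-in, so treat this as a sketch rather than a guarantee):

    import org.apache.spark.sql.functions.{col, udf}

    // enable the Scala UDF -> Catalyst compiler
    spark.conf.set("spark.rapids.sql.udfCompiler.enabled", "true")

    // arithmetic only: every operation has a Catalyst equivalent, so the
    // transpiler can replace the UDF; anything it can't handle falls back
    val toFahrenheit = udf((c: Double) => c * 9.0 / 5.0 + 32.0)
    df.withColumn("temp_f", toFahrenheit(col("temp_c")))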

The plug-in will not fail in cases where it can’t run part of a query on the
GPU; it will fall back and run on the CPU for the parts of the query that
are not supported. It will also output what it can’t optimize on the driver
(on .explain), which should help narrow down an expression or exec that
should be looked at further.
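
For example (a sketch; spark.rapids.sql.explain is the plug-in config that
controls this output, and NOT_ON_GPU limits it to the unsupported parts):

    spark.conf.set("spark.rapids.sql.explain", "NOT_ON_GPU")
    df.explain()  // the driver output then notes what stayed on the CPU and why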

There are other resources all linked from here:
https://nvidia.github.io/spark-rapids/ (of interest may be the
Qualification Tool, and our Getting Started guide for different cloud
providers and distros).

I’d say let’s continue this in the discussions or as issues in the
spark-rapids repo if you have further questions or run into issues, as it’s
not specific to Apache Spark.

Thanks!

Alessandro

On Sat, Aug 13, 2022 at 10:53 AM Sean Owen  wrote:

> This isn't a Spark question, but rather a question about whatever Spark
> application you are talking about. RAPIDS?
>
> On Sat, Aug 13, 2022 at 10:35 AM rajat kumar 
> wrote:
>
>> Thanks Sean.
>>
>> Also, I observed that lots of things are not supported on GPU by NVIDIA,
>> e.g. nested types, decimal type, UDFs, etc.
>> So, will it use the CPU automatically for running those tasks which require
>> nested types, or will it run on the GPU and fail?
>>
>> Thanks
>> Rajat
>>
>> On Sat, Aug 13, 2022, 18:54 Sean Owen  wrote:
>>
>>> Spark does not use GPUs itself, but tasks you run on Spark can.
>>> The only 'support' there is, is for requesting GPUs as resources for
>>> tasks, so it's just a question of resource management. That's in OSS.
>>>
>>> On Sat, Aug 13, 2022 at 8:16 AM rajat kumar 
>>> wrote:
>>>
 Hello,

 I have been hearing about GPU in spark3.

 For batch jobs, will GPUs help to improve performance? Also, is GPU
 support available only on Databricks, or on cloud-based Spark clusters?

 I am new; if anyone can share insight, it will help.

 Thanks
 Rajat

>>>


Re: Spark with GPU

2022-08-13 Thread Sean Owen
This isn't a Spark question, but rather a question about whatever Spark
application you are talking about. RAPIDS?

On Sat, Aug 13, 2022 at 10:35 AM rajat kumar 
wrote:

> Thanks Sean.
>
> Also, I observed that lots of things are not supported on GPU by NVIDIA,
> e.g. nested types, decimal type, UDFs, etc.
> So, will it use the CPU automatically for running those tasks which require
> nested types, or will it run on the GPU and fail?
>
> Thanks
> Rajat
>
> On Sat, Aug 13, 2022, 18:54 Sean Owen  wrote:
>
>> Spark does not use GPUs itself, but tasks you run on Spark can.
>> The only 'support' there is, is for requesting GPUs as resources for
>> tasks, so it's just a question of resource management. That's in OSS.
>>
>> On Sat, Aug 13, 2022 at 8:16 AM rajat kumar 
>> wrote:
>>
>>> Hello,
>>>
>>> I have been hearing about GPU in spark3.
>>>
>>> For batch jobs, will GPUs help to improve performance? Also, is GPU
>>> support available only on Databricks, or on cloud-based Spark clusters?
>>>
>>> I am new; if anyone can share insight, it will help.
>>>
>>> Thanks
>>> Rajat
>>>
>>


Re: Spark with GPU

2022-08-13 Thread rajat kumar
Thanks Sean.

Also, I observed that lots of things are not supported on GPU by NVIDIA,
e.g. nested types, decimal type, UDFs, etc.
So, will it use the CPU automatically for running those tasks which require
nested types, or will it run on the GPU and fail?

Thanks
Rajat

On Sat, Aug 13, 2022, 18:54 Sean Owen  wrote:

> Spark does not use GPUs itself, but tasks you run on Spark can.
> The only 'support' there is, is for requesting GPUs as resources for tasks,
> so it's just a question of resource management. That's in OSS.
>
> On Sat, Aug 13, 2022 at 8:16 AM rajat kumar 
> wrote:
>
>> Hello,
>>
>> I have been hearing about GPU in spark3.
>>
>> For batch jobs, will GPUs help to improve performance? Also, is GPU
>> support available only on Databricks, or on cloud-based Spark clusters?
>>
>> I am new; if anyone can share insight, it will help.
>>
>> Thanks
>> Rajat
>>
>


Re: Spark with GPU

2022-08-13 Thread Sean Owen
Spark does not use GPUs itself, but tasks you run on Spark can.
The only 'support' there is, is for requesting GPUs as resources for tasks,
so it's just a question of resource management. That's in OSS.
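
Concretely, a task can look up which GPU addresses the scheduler assigned it
via TaskContext and hand them to whatever GPU library it uses (a minimal
Scala sketch; rdd and the work done per partition are placeholders):

    import org.apache.spark.TaskContext

    rdd.mapPartitions { iter =>
      // filled in by Spark's resource scheduling, e.g. Array("0")
      val gpuAddrs = TaskContext.get().resources()("gpu").addresses
      // ... run the partition's work on those GPUs ...
      iter
    }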

On Sat, Aug 13, 2022 at 8:16 AM rajat kumar 
wrote:

> Hello,
>
> I have been hearing about GPU in spark3.
>
> For batch jobs, will GPUs help to improve performance? Also, is GPU
> support available only on Databricks, or on cloud-based Spark clusters?
>
> I am new; if anyone can share insight, it will help.
>
> Thanks
> Rajat
>


Spark with GPU

2022-08-13 Thread rajat kumar
Hello,

I have been hearing about GPU in spark3.

For batch jobs, will GPUs help to improve performance? Also, is GPU
support available only on Databricks, or on cloud-based Spark clusters?

I am new; if anyone can share insight, it will help.

Thanks
Rajat