

Use of ML in certain aspects of Spark to improve the performance

2023-08-08 Thread Mich Talebzadeh
I am thinking out loud here and sharing openly. Given our reliance on
gathered statistics, the question arises whether we could integrate
specific machine learning components into Spark Structured Streaming.
Consider a scenario where we want to adjust configuration values on the
fly based on collected statistics; the Spark GUI, for instance, already
gathers such statistics via Spark listeners. Is this a viable approach?
It aligns with use cases involving Spark on Kubernetes autopilot and
Spark Structured Streaming. This may veer slightly from our usual
discussions, and I admit my lack of expertise in machine learning, but
there are experts among us who could lend their knowledge to explore
this avenue.
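
To make the idea concrete, here is a minimal sketch of the kind of hook I
have in mind, using PySpark's StreamingQueryListener (available from
PySpark 3.4). It assumes an active SparkSession named spark, and the
hard-coded threshold is only a stand-in for whatever trained model would
actually make the decision:

from pyspark.sql.streaming import StreamingQueryListener

class StatsCollector(StreamingQueryListener):
    """Collect per-batch statistics that a model could learn from."""

    def onQueryStarted(self, event):
        pass

    def onQueryProgress(self, event):
        p = event.progress
        rate = p.processedRowsPerSecond
        # Stand-in for an ML model: flag batches with low throughput
        # as candidates for on-the-fly reconfiguration.
        if rate is not None and rate < 1000:
            print(f"batch {p.batchId}: {rate:.0f} rows/s, "
                  f"candidate for reconfiguration")

    def onQueryTerminated(self, event):
        pass

spark.streams.addListener(StatsCollector())

A real implementation would feed these per-batch metrics to a model and
translate its output into configuration changes; that is the part where
the ML expertise would come in.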

Mich Talebzadeh,
Solutions Architect/Engineering Lead
London
United Kingdom


view my LinkedIn profile

https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.


Re: Dynamic allocation does not deallocate executors

2023-08-08 Thread Holden Karau
So, from memory: if you disable shuffle tracking but enable shuffle block
decommissioning, it should work.
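
If I understand the combination correctly, it would look something like
this (an untested sketch, assuming Spark 3.4+ on k8s):

spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.shuffleTracking.enabled=false
spark.decommission.enabled=true
spark.storage.decommission.enabled=true
spark.storage.decommission.shuffleBlocks.enabled=true

The idea being that decommissioning migrates shuffle blocks off departing
executors, so dynamic allocation no longer has to keep executors alive
just because they still hold shuffle data.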

On Tue, Aug 8, 2023 at 4:13 AM Mich Talebzadeh 
wrote:

> Hm. I don't think it will work. With
>
> --conf spark.dynamicAllocation.shuffleTracking.enabled=false
>
> in Spark 3.4.1 running Spark on k8s, you get:
>
> org.apache.spark.SparkException: Dynamic allocation of executors
> requires the external shuffle service. You may enable this through
> spark.shuffle.service.enabled.
>
> HTH
>
> On Mon, 7 Aug 2023 at 21:24, Holden Karau  wrote:
>
>> I think you need to set spark.dynamicAllocation.shuffleTracking.enabled
>> to false.
>>
>> On Mon, Aug 7, 2023 at 2:50 AM Mich Talebzadeh 
>> wrote:
>>
>>> Yes, I have seen cases where the driver is gone but a couple of executors
>>> are hanging on. Sounds like a code issue.
>>>
>>> HTH
>>>
>>> On Thu, 27 Jul 2023 at 15:01, Sergei Zhgirovski 
>>> wrote:
>>>
 Hi everyone

 I'm trying to use pyspark 3.3.2.
 I have these relevant options set:

 
 spark.dynamicAllocation.enabled=true
 spark.dynamicAllocation.shuffleTracking.enabled=true
 spark.dynamicAllocation.shuffleTracking.timeout=20s
 spark.dynamicAllocation.executorIdleTimeout=30s
 spark.dynamicAllocation.cachedExecutorIdleTimeout=40s
 spark.executor.instances=0
 spark.dynamicAllocation.minExecutors=0
 spark.dynamicAllocation.maxExecutors=20
 spark.master=k8s://https://k8s-api.<>:6443
 

 So I'm using Kubernetes to deploy up to 20 executors.

 Then I run this piece of code:
 
 df = spark.read.parquet("s3a://>>> files>")
 print(df.count())
 time.sleep(999)
 

 This works fine and as expected: during the execution ~1600 tasks are
 completed, 20 executors get deployed, and they are quickly removed after
 the calculation is complete.

 Next, I add these to the config:
 
 spark.decommission.enabled=true
 spark.storage.decommission.shuffleBlocks.enabled=true
 spark.storage.decommission.enabled=true
 spark.storage.decommission.rddBlocks.enabled=true
 

 I repeat the experiment on an empty Kubernetes cluster, so that no
 actual pod eviction is occurring.

 This time executor deallocation does not work as expected: depending
 on the run, after the job is complete, 0-3 executors out of 20 remain
 present forever and never seem to get removed.

 I tried to debug the code and found that inside the
 'ExecutorMonitor.timedOutExecutors' function, the executors that never get
 removed do not make it into the 'timedOutExecs' variable, because the
 property 'hasActiveShuffle' remains 'true' for them.

 I'm a little stuck trying to understand how pod management, shuffle
 tracking and decommissioning are supposed to work together, how to debug
 this, and whether this is expected behavior at all (to me it is not). A
 sketch of how I inspect the lingering executors is below.
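
 For reference, one way to check which executors linger is Spark's
 monitoring REST API on the driver UI (port 4040 by default); a quick
 sketch, with the driver host as a placeholder:

 import requests

 # Driver UI base URL; replace localhost with the actual driver host.
 base = "http://localhost:4040/api/v1"

 # Grab the application id, then list its currently active executors.
 app_id = requests.get(f"{base}/applications").json()[0]["id"]
 for e in requests.get(f"{base}/applications/{app_id}/executors").json():
     if e["id"] != "driver":
         print(e["id"], "isActive:", e["isActive"])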

 Thank you for any hints!

>>>
>>
--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

