Re: What are the most common operators for shuffle in Spark

2022-01-23 Thread Khalid Mammadov
know some operators in Spark are expensive because of shuffle. This document describes shuffle https://www.educba.com/spark-shuffle/ and says: More shufflings in numbers are not always bad. Memory constraints and other impossibilities can be overcome by shuffling

What are the most common operators for shuffle in Spark

2022-01-23 Thread ashok34...@yahoo.com.INVALID
Hello, I know some operators in Spark are expensive because of shuffle. This document describes shuffle https://www.educba.com/spark-shuffle/ and says: More shufflings in numbers are not always bad. Memory constraints and other impossibilities can be overcome by shuffling. In RDD, the below

Shuffle in Spark with Kubernetes

2021-10-27 Thread Mich Talebzadeh
As I understand, Spark releases > 3 currently do not support external shuffle. Is there a timeline for when this could be available? For now we have two parameters for Dynamic Resource Allocation. These are --conf spark.dynamicAllocation.enabled=true \ --conf

Regression of external shuffle service spark 2.3 vs spark 2.2

2018-11-19 Thread igor.berman
Hi, any input regarding the below is welcome. We are running with the external shuffle service on a Mesos cluster (1.5.1). After upgrading our production workload to Spark 2.3, we started to see OOM failures of the external shuffle services (running on each node). Has anybody experienced the same problems? Any

Re: shuffle in spark

2016-03-14 Thread Jules Damji
Hello Ashok, I found three sources on how shuffle works (and what transformations trigger it) instructive and illuminating. After learning from them, you should be able to extrapolate how your particular and practical use case would work.

shuffle in spark

2016-03-14 Thread Ashok Kumar
Experts, I need to understand how shuffling works in Spark and which parameters influence it. I am sorry, but my knowledge of shuffling is very limited. I would appreciate a practical use case if you can provide one. Regards

Re: How to clear the temp files that gets created by shuffle in Spark Streaming

2015-11-19 Thread swetha kasireddy
t gets created due to intermediate operations in group by? Thanks, Swetha

How to clear the temp files that gets created by shuffle in Spark Streaming

2015-11-18 Thread swetha

Re: How to clear the temp files that gets created by shuffle in Spark Streaming

2015-11-18 Thread Ted Yu
ue to intermediate operations in group by? Thanks, Swetha