OK, I ran a small test which, if my assertion is valid, shows that the
shuffle spills to memory first and then to disk.
The sample code I wrote is as follows:
import sys
from pyspark.sql import SparkSession
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import functions
You will find all the required details at the following link:
https://aws.amazon.com/blogs/containers/best-practices-for-running-spark-on-amazon-eks/
Let me know if you require further information.
Regards,
Vaquar khan
On Mon, May 15, 2023, 10:14 PM Mich Talebzadeh wrote:
> Couple of points
>
> Why