date:20230517

Re: Spark shuffle and inevitability of writing to Disk

2023-05-17 Thread Mich Talebzadeh

OK, I ran a small test which, if my assertion is valid, shows that the shuffle does spill to memory and then to disk. The sample code I wrote is as follows:
import sys
from pyspark.sql import SparkSession
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import func

Re: [spark-core] Can executors recover/reuse shuffle files upon failure?

2023-05-17 Thread vaquar khan

The following link has all the required details: https://aws.amazon.com/blogs/containers/best-practices-for-running-spark-on-amazon-eks/
Let me know if you require further information.
Regards, Vaquar khan
On Mon, May 15, 2023, 10:14 PM Mich Talebzadeh wrote:
> Couple of points
>
> Why