Daniel
>
> [1] https://github.com/apache/spark/blob/8f5a647b0bbb6e83ee484091d3422b24baea5a80/core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala#L369
>
> [2] https://github.com/apache/spark/blob/c4e4497ff7e747eb71d087cdfb1b51673c53b83b/core/src/main/sc
February 18, 2024 at 1:38 AM
Cc: "user@spark.apache.org"
Subject: RE: [EXTERNAL] Re-create SparkContext of SparkSession inside long-lived Spark app
Hi,
What do you propose, or what do you think will help, when these Spark jobs are independent of each other? Once a job/iteration is complete, there is no need to retain its shuffle files. You have a number of options to consider, starting with Spark configuration parameters:
https://spa
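
(Not part of the original reply, but as a minimal sketch of the configuration route: the property names below are stock Spark settings, while the values are only illustrative and assume a long-lived app on Spark 3.x.)

import org.apache.spark.sql.SparkSession

// Sketch only: real Spark property names, illustrative values.
val spark = SparkSession.builder()
  .appName("long-lived-app")
  // Force a periodic JVM GC so the ContextCleaner notices shuffle
  // dependencies that are no longer referenced and deletes their files.
  .config("spark.cleaner.periodicGC.interval", "15min")
  // Block on shuffle cleanup so files are actually gone before the
  // next iteration starts writing new ones.
  .config("spark.cleaner.referenceTracking.blocking.shuffle", "true")
  .getOrCreate()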
You can try shuffling to S3 using the cloud shuffle storage plugin for S3
(https://aws.amazon.com/blogs/big-data/introducing-the-cloud-shuffle-storage-plugin-for-apache-spark/)
- the performance of the new plugin is sufficient for many Spark jobs (it
also works on EMR). Then you can use S3 lifecycle policies to clean up the
shuffle files automatically.
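
(For illustration: a sketch of enabling that plugin, with the class and property names as I read them in the AWS post above - worth double-checking against it - and a placeholder bucket prefix that an S3 lifecycle rule can then expire after a few days.)

import org.apache.spark.sql.SparkSession

// Sketch; assumes the Cloud Shuffle Storage Plugin jar is on the classpath.
val spark = SparkSession.builder()
  // Route shuffle I/O through the plugin instead of local disk.
  .config("spark.shuffle.sort.io.plugin.class",
          "com.amazonaws.spark.shuffle.io.cloud.ChopperPlugin")
  // Placeholder prefix; point a lifecycle expiration rule at it.
  .config("spark.shuffle.storage.path", "s3://my-bucket/shuffle/")
  .getOrCreate()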
If you're using dynamic allocation, this can be caused by executors holding
shuffle data being deallocated before the shuffle is cleaned up. Once that
happens, those shuffle files never get cleaned up until the YARN application
ends. This was a big issue for us, so I added support for deleting shuffle
files of deallocated executors.
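
(The thread doesn't say which patch this refers to, but stock Spark has related knobs; a sketch, assuming Spark 3.3+ with the external shuffle service enabled, not necessarily the poster's own fix.)

import org.apache.spark.sql.SparkSession

// Sketch of stock settings that address the same failure mode.
val spark = SparkSession.builder()
  // Keep executors with live shuffle output from being released too early.
  .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
  // Let the external shuffle service delete shuffle blocks of deallocated
  // executors once the shuffle is deregistered (SPARK-37618, Spark 3.3+).
  .config("spark.shuffle.service.removeShuffle", "true")
  .getOrCreate()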