I think the temporary folders are used to store blocks and shuffle data.
That doesn't depend on the cluster manager.

Ideally they should be removed after the application has been
terminated.

Can you check if there are contents under those folders?
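
Something like this (just a sketch using plain JDK file APIs; it assumes the
spark-* naming mentioned below) would list what is left under java.io.tmpdir:

import java.io.File

// List leftover spark-* directories under java.io.tmpdir and count their entries.
val tmpDir = new File(System.getProperty("java.io.tmpdir"))
val sparkDirs = Option(tmpDir.listFiles()).getOrElse(Array.empty[File])
  .filter(d => d.isDirectory && d.getName.startsWith("spark-"))

sparkDirs.foreach { dir =>
  val entries = Option(dir.listFiles()).getOrElse(Array.empty[File])
  println(s"${dir.getAbsolutePath}: ${entries.length} entries")
}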

 

________________________________

From: Taeyun Kim [mailto:taeyun....@innowireless.com] 
Sent: Friday, May 08, 2015 9:42 AM
To: 'Ted Yu'; 'Todd Nist'; user@spark.apache.org
Subject: RE: Spark does not delete temporary directories

 

Thanks, but it seems that those options are for Spark standalone mode only.

I've (lightly) tested the options with local mode and yarn-client mode;
the temp directories were still not deleted.
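
For example, one way to set those options in local mode (a sketch only; the
app name and TTL value are placeholders, and spark.worker.cleanup.* are read
by the standalone Worker, so no effect is expected here):

import org.apache.spark.{SparkConf, SparkContext}

// The cleanup flags are set on the application's SparkConf, but they are
// standalone Worker settings, so they are not expected to affect
// local or yarn-client runs.
val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("cleanup-test")
  .set("spark.worker.cleanup.enabled", "true")
  .set("spark.worker.cleanup.appDataTtl", "3600")

val sc = new SparkContext(conf)
sc.parallelize(1 to 100).count()
sc.stop()  // the spark-* temp directory can still remain after this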

 

From: Ted Yu [mailto:yuzhih...@gmail.com] 
Sent: Thursday, May 07, 2015 10:47 PM
To: Todd Nist
Cc: Taeyun Kim; user@spark.apache.org
Subject: Re: Spark does not delete temporary directories

 

Default value for spark.worker.cleanup.enabled is false:


private val CLEANUP_ENABLED =
  conf.getBoolean("spark.worker.cleanup.enabled", false)

 

I wonder if the default should be set to true.
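
For a quick check from an application, a bare SparkConf resolves the flag the
same way (a minimal sketch; it only reads the value, it does not change any
cleanup behavior):

import org.apache.spark.SparkConf

// A bare SparkConf: the flag resolves to false unless it is set explicitly
// (or provided via spark-defaults.conf / system properties).
val conf = new SparkConf()
val cleanupEnabled = conf.getBoolean("spark.worker.cleanup.enabled", false)
println(s"spark.worker.cleanup.enabled = $cleanupEnabled")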

 

Cheers

 

On Thu, May 7, 2015 at 6:19 AM, Todd Nist <tsind...@gmail.com> wrote:

Have you tried to set the following?

spark.worker.cleanup.enabled=true 
spark.worker.cleanup.appDataTtl=<seconds>

 

 

On Thu, May 7, 2015 at 2:39 AM, Taeyun Kim <taeyun....@innowireless.com>
wrote:

Hi,

 

After a Spark program completes, 3 temporary directories remain in the temp
directory.

The directory names are like this: spark-2e389487-40cc-4a82-a5c7-353c0feefbb7

 

And since the Spark program runs on Windows, a snappy DLL file also remains in
the temp directory.

The file name is like this:
snappy-1.0.4.1-6e117df4-97b6-4d69-bf9d-71c4a627940c-snappyjava

 

They are created every time the Spark program runs. So the number of
files and directories keeps growing.

 

How can I have them deleted?

 

Spark version is 1.3.1 with Hadoop 2.6.

 

Thanks.