Thank you for your response.

We are using Spark 1.6.2 in standalone deploy mode with dynamic allocation 
disabled.

I have traced the code. IMHO, this cleanup is not handled by the shutdown 
hooks directly. The shutdown hooks only send an “ExecutorStateChanged” message 
to the worker, and when the worker receives it, it cleans up the directories 
only once the application has finished. In our case, the application never 
finishes (it is long running). The executor exits due to some unknown error 
and is restarted by the worker right away, so in this scenario the old 
directories are never deleted.

While the application is still running, is it safe to delete the old “blockmgr” 
directories and leave only the newest one?

Our temporary workaround is to restart the application regularly, but we are 
looking for a more elegant solution.
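
For what it is worth, the kind of automated cleanup we have in mind (run from 
a crontab on each worker node) would look roughly like the sketch below. This 
assumes it is indeed safe to keep only the newest blockmgr-* directory while 
the application is still running; the local directory path is a placeholder 
for our own setup, not a Spark default, and it assumes the blockmgr-* 
directories sit directly under that path.

    #!/usr/bin/env python
    # Sketch: remove all but the most recently modified blockmgr-* directory
    # in a Spark local directory. The path below is an assumption about our
    # deployment; adjust the glob pattern to match your actual layout.
    import glob
    import os
    import shutil

    SPARK_LOCAL_DIR = "/data/spark/local"  # placeholder path for our setup

    def clean(local_dir):
        dirs = [d for d in glob.glob(os.path.join(local_dir, "blockmgr-*"))
                if os.path.isdir(d)]
        if len(dirs) <= 1:
            return
        dirs.sort(key=os.path.getmtime)   # oldest first
        for old in dirs[:-1]:             # keep only the newest directory
            shutil.rmtree(old, ignore_errors=True)

    if __name__ == "__main__":
        clean(SPARK_LOCAL_DIR)

Would something along these lines be reasonable, or is the modification time 
not a reliable way to tell which directory the live executor is using?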

Thanks.

Yang


> On Sep 2, 2016, at 4:11 PM, Sun Rui <sunrise_...@163.com> wrote:
> 
> Hi,
> Could you give more information about your Spark environment? Cluster 
> manager, Spark version, whether dynamic allocation is used, etc.
> 
> Generally, executors will delete temporary directories for shuffle files on 
> exit because JVM shutdown hooks are registered, unless they are brutally 
> killed.
> 
> You can safely delete the directories when you are sure that the spark 
> applications related to them have finished. A crontab task may be used for 
> automatic clean up.
> 
>> On Sep 2, 2016, at 12:18, 汪洋 <tiandiwo...@icloud.com> wrote:
>> 
>> Hi all,
>> 
>> I discovered that sometimes an executor exits unexpectedly, and when it is 
>> restarted, it creates another blockmgr directory without deleting the old 
>> ones. Thus, for a long-running application, some shuffle files will never 
>> be cleaned up. Sometimes those files can take up the whole disk. 
>> 
>> Is there a way to clean up those unused files automatically? Or is it safe 
>> to delete the old directories manually, leaving only the newest one?
>> 
>> Here is the executor’s local directory.
>> [screenshot attachment: D7718580-FF26-47F8-B6F8-00FB1F20A8C0.png]
>> 
>> Any advice on this?
>> 
>> Thanks.
>> 
>> Yang
> 
> 
> 
