I believe it will help in your case, as the executors' shuffle files will be managed by the external service. It is described in the Spark docs under "Graceful Decommission of Executors":
http://spark.apache.org/docs/latest/job-scheduling.html#graceful-decommission-of-executors
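
For example, in standalone mode it is mostly a matter of configuration (a minimal sketch based on the 1.6 docs; please double-check the property names against your version):

    # spark-defaults.conf (or pass via --conf)
    # With this set on the workers, each standalone Worker starts the
    # external shuffle service and serves the executors' shuffle files,
    # so the files outlive any single executor process.
    spark.shuffle.service.enabled  true
    spark.shuffle.service.port     7337   # the default port

The service is required for dynamic allocation, but it can be used on its own as well.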
Artur

On Fri, Sep 2, 2016 at 1:01 PM 汪洋 <tiandiwo...@icloud.com> wrote:

> On Sep 2, 2016, at 5:58 PM, 汪洋 <tiandiwo...@icloud.com> wrote:
>
> Yeah, using the external shuffle service is a reasonable choice, but I
> think we will still face the same problem. We use SSDs to store shuffle
> files for performance reasons, so if the shuffle files are no longer
> going to be used, we want them deleted instead of taking up valuable
> SSD space.
>
> I am not very familiar with the external shuffle service, though. Is it
> going to help in this case? -:)
>
> On Sep 2, 2016, at 5:40 PM, Artur Sukhenko <artur.sukhe...@gmail.com> wrote:
>
> Hi Yang,
>
> Isn't the external shuffle service better for long-running applications?
> "It runs as a standalone application and manages shuffle output files so
> they are available for executors at all time"
>
> It is described here:
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-ExternalShuffleService.html
>
> ---
> Artur
>
> On Fri, Sep 2, 2016 at 12:30 PM 汪洋 <tiandiwo...@icloud.com> wrote:
>
>> Thank you for your response.
>>
>> We are using Spark 1.6.2 in standalone deploy mode with dynamic
>> allocation disabled.
>>
>> I have traced the code. IMHO, this cleanup is not handled by the
>> shutdown hooks directly. The shutdown hooks only send an
>> "ExecutorStateChanged" message to the worker, and when the worker sees
>> the message it cleans up the directory *only when the application has
>> finished*. In our case, the application has not finished (it is
>> long-running). The executor exits due to some unknown error and is
>> restarted by the worker right away. In this scenario, the old
>> directories are never deleted.
>>
>> If the application is still running, is it safe to delete the old
>> "blockmgr" directories, leaving only the newest one?
>>
>> Our temporary solution is to restart the application regularly, but we
>> are looking for a more elegant way.
>>
>> Thanks.
>>
>> Yang
>>
>> On Sep 2, 2016, at 4:11 PM, Sun Rui <sunrise_...@163.com> wrote:
>>
>> Hi,
>> Could you give more information about your Spark environment? Cluster
>> manager, Spark version, whether dynamic allocation is used, etc.
>>
>> Generally, executors delete their temporary directories for shuffle
>> files on exit, because JVM shutdown hooks are registered, unless the
>> executors are killed brutally.
>>
>> You can safely delete the directories once you are sure that the Spark
>> applications related to them have finished. A crontab task may be used
>> for automatic cleanup.
>>
>> On Sep 2, 2016, at 12:18, 汪洋 <tiandiwo...@icloud.com> wrote:
>>
>> Hi all,
>>
>> I have discovered that sometimes an executor exits unexpectedly, and
>> when it is restarted it creates another blockmgr directory without
>> deleting the old ones. Thus, for a long-running application, some
>> shuffle files are never cleaned up, and sometimes those files can take
>> up the whole disk.
>>
>> Is there a way to clean up those unused files automatically? Or is it
>> safe to delete the old directories manually, leaving only the newest
>> one?
>>
>> Here is the executor's local directory:
>> [attachment: D7718580-FF26-47F8-B6F8-00FB1F20A8C0.png (screenshot of
>> the executor's local directory)]
>>
>> Any advice on this?
>>
>> Thanks.
>>
>> Yang
>
> --
> Artur Sukhenko
>
--
Artur Sukhenko
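
P.S. On the question in the thread above, whether it is safe to delete the old "blockmgr" directories while the application is still running: I don't have an authoritative answer, but if you want to experiment, something like the sketch below is what I would try on a test node first. It is untested and makes assumptions: the local dir path is a placeholder, and it assumes the worker runs a single executor, so that the newest directory is the live one.

    #!/usr/bin/env python
    # Sketch: remove all but the most recently modified blockmgr-*
    # directory under a Spark local dir. Assumes a single executor per
    # local dir, so the newest directory belongs to the live executor.
    # LOCAL_DIR is a placeholder; point it at your spark.local.dir /
    # SPARK_LOCAL_DIRS location.
    import os
    import shutil

    LOCAL_DIR = "/path/to/spark/local/dir"

    dirs = [os.path.join(LOCAL_DIR, d)
            for d in os.listdir(LOCAL_DIR)
            if d.startswith("blockmgr-")]
    dirs.sort(key=os.path.getmtime)

    # Keep the newest directory, delete the rest.
    for d in dirs[:-1]:
        print("removing %s" % d)
        shutil.rmtree(d, ignore_errors=True)

Run from cron, this would also cover Sun Rui's crontab suggestion. For applications that have already finished, the standalone worker can clean up by itself if you set spark.worker.cleanup.enabled=true, but as discussed above that does not help while the application is still running.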