I believe it will help in your case, as the executors' shuffle files will be managed by the external service. It is described in the Spark docs under "Graceful Decommission of Executors":
http://spark.apache.org/docs/latest/job-scheduling.html#graceful-decommission-of-executors
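
For example, in standalone mode it is mostly a matter of configuration (a minimal sketch based on the 1.6 docs; please double-check the property names against your version):

    # spark-defaults.conf (or pass via --conf)
    # With this set on the workers, each standalone Worker starts the
    # external shuffle service and serves the executors' shuffle files,
    # so the files outlive any single executor process.
    spark.shuffle.service.enabled  true
    spark.shuffle.service.port     7337   # the default port

The service is required for dynamic allocation, but it can be used on its own as well.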
Artur

On Fri, Sep 2, 2016 at 1:01 PM 汪洋 <tiandiwo...@icloud.com> wrote:

> On Sep 2, 2016, at 5:58 PM, 汪洋 <tiandiwo...@icloud.com> wrote:
>
> Yeah, using the external shuffle service is a reasonable choice, but I
> think we will still face the same problem. We use SSDs to store shuffle
> files for performance reasons, so if the shuffle files are no longer
> going to be used, we want them deleted instead of taking up valuable
> SSD space.
>
> I am not very familiar with the external shuffle service, though. Is it
> going to help in this case? -:)
>
> On Sep 2, 2016, at 5:40 PM, Artur Sukhenko <artur.sukhe...@gmail.com> wrote:
>
> Hi Yang,
>
> Isn't the external shuffle service better for long-running applications?
> "It runs as a standalone application and manages shuffle output files so
> they are available for executors at all time"
>
> It is described here:
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-ExternalShuffleService.html
>
> ---
> Artur
>
> On Fri, Sep 2, 2016 at 12:30 PM 汪洋 <tiandiwo...@icloud.com> wrote:
>
>> Thank you for your response.
>>
>> We are using Spark 1.6.2 in standalone deploy mode with dynamic
>> allocation disabled.
>>
>> I have traced the code. IMHO, this cleanup is not handled by the
>> shutdown hooks directly. The shutdown hooks only send an
>> "ExecutorStateChanged" message to the worker, and when the worker sees
>> the message it cleans up the directory *only when the application has
>> finished*. In our case, the application has not finished (it is
>> long-running). The executor exits due to some unknown error and is
>> restarted by the worker right away. In this scenario, the old
>> directories are never deleted.
>>
>> If the application is still running, is it safe to delete the old
>> "blockmgr" directories, leaving only the newest one?
>>
>> Our temporary solution is to restart the application regularly, but we
>> are looking for a more elegant way.
>>
>> Thanks.
>>
>> Yang
>>
>> On Sep 2, 2016, at 4:11 PM, Sun Rui <sunrise_...@163.com> wrote:
>>
>> Hi,
>> Could you give more information about your Spark environment? Cluster
>> manager, Spark version, whether dynamic allocation is used, etc.
>>
>> Generally, executors delete their temporary directories for shuffle
>> files on exit, because JVM shutdown hooks are registered, unless the
>> executors are killed brutally.
>>
>> You can safely delete the directories once you are sure that the Spark
>> applications related to them have finished. A crontab task may be used
>> for automatic cleanup.
>>
>> On Sep 2, 2016, at 12:18, 汪洋 <tiandiwo...@icloud.com> wrote:
>>
>> Hi all,
>>
>> I have discovered that sometimes an executor exits unexpectedly, and
>> when it is restarted it creates another blockmgr directory without
>> deleting the old ones. Thus, for a long-running application, some
>> shuffle files are never cleaned up, and sometimes those files can take
>> up the whole disk.
>>
>> Is there a way to clean up those unused files automatically? Or is it
>> safe to delete the old directories manually, leaving only the newest
>> one?
>>
>> Here is the executor's local directory:
>> [attachment: D7718580-FF26-47F8-B6F8-00FB1F20A8C0.png (screenshot of
>> the executor's local directory)]
>>
>> Any advice on this?
>>
>> Thanks.
>>
>> Yang
>
> --
> Artur Sukhenko
>
--
Artur Sukhenko
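
P.S. On the question in the thread above, whether it is safe to delete the old "blockmgr" directories while the application is still running: I don't have an authoritative answer, but if you want to experiment, something like the sketch below is what I would try on a test node first. It is untested and makes assumptions: the local dir path is a placeholder, and it assumes the worker runs a single executor, so that the newest directory is the live one.

    #!/usr/bin/env python
    # Sketch: remove all but the most recently modified blockmgr-*
    # directory under a Spark local dir. Assumes a single executor per
    # local dir, so the newest directory belongs to the live executor.
    # LOCAL_DIR is a placeholder; point it at your spark.local.dir /
    # SPARK_LOCAL_DIRS location.
    import os
    import shutil

    LOCAL_DIR = "/path/to/spark/local/dir"

    dirs = [os.path.join(LOCAL_DIR, d)
            for d in os.listdir(LOCAL_DIR)
            if d.startswith("blockmgr-")]
    dirs.sort(key=os.path.getmtime)

    # Keep the newest directory, delete the rest.
    for d in dirs[:-1]:
        print("removing %s" % d)
        shutil.rmtree(d, ignore_errors=True)

Run from cron, this would also cover Sun Rui's crontab suggestion. For applications that have already finished, the standalone worker can clean up by itself if you set spark.worker.cleanup.enabled=true, but as discussed above that does not help while the application is still running.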