OK. We have a long-running streaming job. I was thinking that maybe we should have a cron job to clear files that are older than 2 days. What would be an appropriate way to do that?
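
To make the idea concrete, here is a rough sketch of the kind of cleanup script I have in mind for cron to call. Everything in it is an assumption on my side: the function name `remove_old_files` is made up, the directory to scan would be whatever local directory the shuffle files land in on our nodes, and the 2-day threshold is just the number from above. It only looks at file modification time, so it cannot tell whether the running job still needs a file, which is the part I am unsure about.

```python
import os
import time
from pathlib import Path


def remove_old_files(root, max_age_days=2.0, now=None, dry_run=False):
    """Delete regular files under `root` last modified more than `max_age_days` ago.

    Returns the paths that were removed (or, with dry_run=True, would be removed).
    Directories are left in place; symlinks are skipped.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.is_symlink():
                    continue  # never follow or delete links
                if path.stat().st_mtime < cutoff:
                    if not dry_run:
                        path.unlink()
                    removed.append(str(path))
            except FileNotFoundError:
                # The file disappeared between listing and stat/unlink,
                # e.g. because something else cleaned it up first.
                continue
    return removed
```

Running it with `dry_run=True` first would show what it would delete without touching anything.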

On Wed, Nov 18, 2015 at 7:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Have you seen SPARK-5836 ?
> Note TD's comment at the end.
>
> Cheers
>
> On Wed, Nov 18, 2015 at 7:28 PM, swetha <swethakasire...@gmail.com> wrote:
>
>> Hi,
>>
>> We have a lot of temp files that gets created due to shuffles caused by
>> group by. How to clear the files that gets created due to intermediate
>> operations in group by?
>>
>> Thanks,
>> Swetha
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-clear-the-temp-files-that-gets-created-by-shuffle-in-Spark-Streaming-tp25425.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org