Re: spark streaming - how to purge old data files in data directory

2016-06-18 Thread Akhil Das
Currently, there is no out-of-the-box solution for this. However, you can
use HDFS utilities to remove older files from the directory (say, files more
than 24 hours old). Another approach is discussed here

.
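
As a rough illustration of the "HDFS utilities" route, here is a minimal
Python sketch that could run from cron on an edge node. It assumes the `hdfs`
command is on the PATH, that `hdfs dfs -ls` prints its usual 8-column listing
with timestamps in the client's local time, and that files older than the
cutoff are no longer needed by the streaming job. The names `old_files` and
`purge` and the path `/data/in` are hypothetical, chosen only for this example.

```python
import subprocess
from datetime import datetime, timedelta


def old_files(ls_output, cutoff):
    """Return paths of regular files last modified before `cutoff`.

    Assumes the 8-column `hdfs dfs -ls` layout:
    permissions, replication, owner, group, size, date, time, path.
    """
    paths = []
    for line in ls_output.splitlines():
        fields = line.split(None, 7)
        # Skip the "Found N items" header and any directory entries.
        if len(fields) < 8 or fields[0].startswith("d"):
            continue
        modified = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M")
        if modified < cutoff:
            paths.append(fields[7])
    return paths


def purge(data_dir, max_age_hours=24, dry_run=True):
    """List `data_dir` and delete files older than `max_age_hours`."""
    listing = subprocess.run(
        ["hdfs", "dfs", "-ls", data_dir],
        check=True, capture_output=True, text=True,
    ).stdout
    cutoff = datetime.now() - timedelta(hours=max_age_hours)
    victims = old_files(listing, cutoff)
    if victims and not dry_run:
        # -skipTrash deletes immediately instead of moving files to .Trash.
        subprocess.run(["hdfs", "dfs", "-rm", "-skipTrash"] + victims, check=True)
    return victims
```

Calling `purge("/data/in")` only reports what would be removed; pass
`dry_run=False` to actually delete. For directories holding a very large
number of files, the delete call would probably need to be split into batches.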

On Sun, Jun 19, 2016 at 7:28 AM, Vamsi Krishna 
wrote:

> Hi,
>
> I'm on an HDP 2.3.2 cluster (Spark 1.4.1).
> I have a Spark Streaming app that uses 'textFileStream' to stream simple
> CSV files and process them.
> I see that the old data files that have been processed are left in the data
> directory.
> What is the right way to purge the old data files from the data directory on
> HDFS?
>
> Thanks,
> Vamsi Attluri
> --
> Vamsi Attluri
>



-- 
Cheers!


spark streaming - how to purge old data files in data directory

2016-06-18 Thread Vamsi Krishna
Hi,

I'm on an HDP 2.3.2 cluster (Spark 1.4.1).
I have a Spark Streaming app that uses 'textFileStream' to stream simple
CSV files and process them.
I see that the old data files that have been processed are left in the data
directory.
What is the right way to purge the old data files from the data directory on
HDFS?

Thanks,
Vamsi Attluri
-- 
Vamsi Attluri