Hi,
I'm on HDP 2.3.2 cluster (Spark 1.4.1).
I have a Spark Streaming app which uses 'textFileStream' to stream simple
CSV files and process them.
I see that old data files that have already been processed are left in the data
directory. What is the right way to purge the old data files from the data
directory on HDFS?
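For context, 'textFileStream' only picks up newly arrived files; it never deletes the ones it has processed, so cleanup has to be done separately. A minimal sketch of one common approach, a periodic cleanup using the Hadoop FileSystem API (the directory path and the 60-minute retention window are assumptions, not from the thread):

```scala
// Sketch: delete input files older than a retention window.
// Keep the window comfortably longer than the batch interval (and
// spark.streaming.fileStream.minRememberDuration) so files the stream
// may still be tracking are not removed out from under it.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(new Configuration())
val retentionMs = 60 * 60 * 1000L                       // assumed: 60 minutes
val cutoff = System.currentTimeMillis() - retentionMs

// "/data/incoming" is a hypothetical streaming input directory.
fs.listStatus(new Path("/data/incoming")).foreach { status =>
  if (status.isFile && status.getModificationTime < cutoff) {
    fs.delete(status.getPath, false)                    // false: non-recursive
  }
}
```

This could run from a scheduled job (e.g. cron or Oozie) outside the streaming app, so the app itself stays read-only on the input directory.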
Thanks. It works.
On Thu, Jun 16, 2016 at 5:32 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> It will 'auto-detect' the compression codec from the file extension and then
> decompress and read it correctly.
>
> Thanks!
>
> 2016-06-16 20:27 GMT+09:00 Vamsi Krishna
Hi,
I'm using Spark 1.4.1 (HDP 2.3.2).
As per the spark-csv documentation (https://github.com/databricks/spark-csv),
I see that we can write to a csv file in compressed form using the 'codec'
option.
But I didn't see support for a 'codec' option when reading a csv file.
Is there a way to read a compressed csv file?
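As the reply above notes, no 'codec' option is needed on the read side, because the underlying Hadoop input format picks the decompression codec from the file extension. A minimal sketch for Spark 1.4.x with spark-csv (the file path and header option are assumptions):

```scala
// Sketch: reading a gzipped CSV with spark-csv. The same read call
// used for plain CSV works here -- the ".gz" extension triggers
// automatic decompression via Hadoop's codec detection.
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)   // sc: an existing SparkContext
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")           // assumed: file has a header row
  .load("hdfs:///data/people.csv.gz") // hypothetical path
```

Note that gzip is not splittable, so each .gz file is read by a single task; bzip2 or uncompressed files parallelize better for large inputs.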