Working on LZOP Files

2014-09-25 Thread Harsha HN
Hi, is anybody processing LZOP files in Spark? We have a huge volume of LZOP files in HDFS to process through Spark. The MapReduce framework automatically detects the file format and sends the decompressed data to the Mappers. Is there any such support in Spark? As of now I am manually downloading,

Re: Working on LZOP Files

2014-09-25 Thread Andrew Ash
Hi Harsha, I use LZOP files extensively on my Spark cluster -- see my writeup on how to do this in this mailing list post: http://mail-archives.apache.org/mod_mbox/spark-user/201312.mbox/%3CCAOoZ679ehwvT1g8=qHd2n11Z4EXOBJkP+q=Aj0qE_=shhyl...@mail.gmail.com%3E Maybe we should better document how
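
For readers landing on this thread, here is a minimal PySpark sketch of one common way to read LZO-compressed text directly from HDFS. It is not necessarily the approach described in the linked writeup. It assumes the hadoop-lzo jar and the native LZO libraries are already installed and on the driver and executor classpaths; `read_lzo_text` and the example path are hypothetical names used only for illustration.

```python
# Class names below come from the hadoop-lzo project (assumed to be installed
# on the cluster) and from Hadoop itself.
LZO_TEXT_INPUT_FORMAT = "com.hadoop.mapreduce.LzoTextInputFormat"
KEY_CLASS = "org.apache.hadoop.io.LongWritable"
VALUE_CLASS = "org.apache.hadoop.io.Text"


def read_lzo_text(sc, path):
    """Return an RDD of decompressed text lines from the .lzo files at `path`.

    `sc` is an existing SparkContext. This helper is hypothetical, not part
    of Spark or hadoop-lzo.
    """
    # Ask Hadoop's input format to decompress and split the files, the same
    # way a MapReduce job would.
    pairs = sc.newAPIHadoopFile(
        path, LZO_TEXT_INPUT_FORMAT, KEY_CLASS, VALUE_CLASS
    )
    # Each record is a (byte offset, line) pair; keep only the line.
    return pairs.map(lambda kv: kv[1])
```

Usage would look like `lines = read_lzo_text(sc, "hdfs:///path/to/logs/*.lzo")`, after which `lines` behaves like any other RDD of strings. Note that hadoop-lzo can only split a .lzo file across tasks when an index file has been built for it; without one, each file is read by a single task.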