I have a similar requirement: I want to write Avro data into Parquet, but it seems you have to do it on your own. The parquet-mr project uses Hadoop to do this. I am trying to write a Spark job that does something similar.
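For what it's worth, the parquet-avro module inside parquet-mr can do this directly: its AvroParquetWriter stores the Avro schema in the Parquet file footer metadata, so an AvroParquetReader can later rebuild records without a separate schema file. Below is a minimal sketch, assuming parquet-avro and the Hadoop client jars are on the classpath; the schema and output path are made up for illustration.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import parquet.avro.AvroParquetWriter;

public class AvroToParquet {
    public static void main(String[] args) throws Exception {
        // Hypothetical Avro schema: a record with a single string field.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
            + "[{\"name\":\"name\",\"type\":\"string\"}]}");

        // Hypothetical HDFS output path.
        Path out = new Path("hdfs:///tmp/users.parquet");

        // The writer embeds the Avro schema in the Parquet metadata,
        // so downstream Avro readers need no external schema.
        AvroParquetWriter<GenericRecord> writer =
            new AvroParquetWriter<GenericRecord>(out, schema);
        try {
            GenericRecord record = new GenericData.Record(schema);
            record.put("name", "jerry");
            writer.write(record);
        } finally {
            writer.close();
        }
    }
}
```

This sidesteps schemaRDD.saveAsParquetFile entirely, at the cost of writing the file yourself rather than through Spark SQL.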
On Fri, Jan 9, 2015 at 3:20 AM, Jerry Lam <chiling...@gmail.com> wrote:
> Hi spark users,
>
> I'm using Spark SQL to create Parquet files on HDFS. I would like to store
> the Avro schema in the Parquet metadata so that non-Spark-SQL applications
> can unmarshall the data without the Avro schema, using the Avro Parquet
> reader. Currently, schemaRDD.saveAsParquetFile does not allow this. Is
> there another API that lets me do this?
>
> Best Regards,
>
> Jerry