When storing as a Parquet file, I don't think the header option, option("header","true"), is required.
Give it a try by removing the header option and then reading the file back. I haven't tried it myself; just a thought.
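
About the CSV error in your quoted message below: the struct<type:tinyint,size:int,indices:array<int>,values:array<double>> type is the vector column produced by OneHotEncoder, and the CSV data source cannot serialize vectors. A rough, untested sketch of one workaround (the column name "features" and the path "output" are placeholders for whatever your pipeline actually uses) would be to read the Parquet output back and convert the vector column to a plain string before writing CSV:

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

# Read back the Parquet data that .save("output") wrote out
df = spark.read.parquet("output")

# The CSV writer can't handle the vector column from OneHotEncoder,
# so convert it to a plain string first ("features" is a placeholder name)
vector_to_string = udf(
    lambda v: str(v.toArray().tolist()) if v is not None else None,
    StringType(),
)
csv_ready = df.withColumn("features", vector_to_string("features"))

# Every remaining column is a simple type, so the CSV writer should accept it
csv_ready.coalesce(1).write.option("header", "true").mode("overwrite").csv("output.csv")

Again, just a sketch that I haven't run. If you only want to inspect the other fields, simply dropping the vector column before writing CSV would also work.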
Thank you,
Naresh

On Tue, Mar 27, 2018 at 9:47 PM Mina Aslani <aslanim...@gmail.com> wrote:

> Hi,
>
> I am using pyspark. To transform my sample data and create a model, I use
> StringIndexer and OneHotEncoder.
>
> However, when I try to write the data as CSV using the command below
>
> df.coalesce(1).write.option("header","true").mode("overwrite").csv("output.csv")
>
> I get an UnsupportedOperationException:
>
> java.lang.UnsupportedOperationException: CSV data source does not support
> struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data type.
>
> Therefore, to save the data and avoid the error, I use
>
> df.coalesce(1).write.option("header","true").mode("overwrite").save("output")
>
> The above command saves the data, but in Parquet format.
> How can I read the Parquet file back and convert it to CSV to look at the data?
>
> When I use
>
> df = spark.read.parquet("1.parquet"), it throws:
>
> ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
>
> Your input is appreciated.
>
> Best regards,
> Mina

--
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/