Your DataFrame has an array data type, which is NOT supported by CSV. How could a CSV file include an array or other nested structure?
If you want your data to be human-readable text, then write it out as JSON in your case.

Yong

________________________________
From: Mina Aslani <aslanim...@gmail.com>
Sent: Wednesday, March 28, 2018 12:22 AM
To: naresh Goud
Cc: user @spark
Subject: Re: java.lang.UnsupportedOperationException: CSV data source does not support struct/ERROR RetryingBlockFetcher

Hi Naresh,

Thank you for the quick response, appreciate it.

After removing the option("header","true") and trying df = spark.read.parquet("test.parquet"), reading the parquet file now works. However, I would still like to find a way to have the data in CSV or another readable form, and saving df as CSV still throws:

java.lang.UnsupportedOperationException: CSV data source does not support struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data type.

Any idea?

Best regards,
Mina

On Tue, Mar 27, 2018 at 10:51 PM, naresh Goud <nareshgoud.du...@gmail.com> wrote:

In the case of storing as a parquet file, I don't think it requires a header:

option("header","true")

Give it a try by removing the header option and then reading it back. I haven't tried it; just a thought.

Thank you,
Naresh

On Tue, Mar 27, 2018 at 9:47 PM Mina Aslani <aslanim...@gmail.com> wrote:

Hi,

I am using pyspark. To transform my sample data and create a model, I use StringIndexer and OneHotEncoder. However, when I try to write the data as CSV using the command below

df.coalesce(1).write.option("header","true").mode("overwrite").csv("output.csv")

I get an UnsupportedOperationException:

java.lang.UnsupportedOperationException: CSV data source does not support struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data type.

Therefore, to save the data and avoid the error, I use

df.coalesce(1).write.option("header","true").mode("overwrite").save("output")

The above command saves the data, but in parquet format. How can I read the parquet file and convert it to CSV to observe the data?
When I use df = spark.read.parquet("1.parquet"), it throws:

ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks

Your input is appreciated.

Best regards,
Mina

--
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/
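Yong's suggestion above can be sketched as follows. Spark's JSON writer supports struct and array columns, and each line of the resulting part files is one JSON record, so the output can be inspected with plain Python. The Spark calls are commented out because they need a live SparkSession, and the sample record is illustrative only; the real layout depends on your schema:

```python
import json

# With a SparkSession and DataFrame df, write JSON instead of CSV:
#   df.coalesce(1).write.mode("overwrite").json("output_json")
# and read it back for inspection:
#   spark.read.json("output_json").show(truncate=False)

# Each output line is one JSON record; this one is a made-up illustration.
sample_line = '{"label": 1.0, "features": {"size": 2, "indices": [0], "values": [1.0]}}'
record = json.loads(sample_line)
print(record["features"]["values"])  # nested arrays survive in JSON: [1.0]
```

Unlike CSV, this keeps the nested structure intact while remaining human-readable text.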