Your dataframe has an array data type, which is NOT supported by CSV. How could a CSV file include an array or other nested structure?


If you want your data to be human-readable text, then write it out as JSON in your case.
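
For example, a minimal sketch (assuming the df from your message below; the output path is made up):

# JSON keeps struct/array columns as nested values, one JSON object per row,
# so the vector column needs no conversion before writing
df.coalesce(1).write.mode("overwrite").json("output_json")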


Yong


________________________________
From: Mina Aslani <aslanim...@gmail.com>
Sent: Wednesday, March 28, 2018 12:22 AM
To: naresh Goud
Cc: user @spark
Subject: Re: java.lang.UnsupportedOperationException: CSV data source does not 
support struct/ERROR RetryingBlockFetcher

Hi Naresh,

Thank you for the quick response, appreciate it.
Removing the option("header","true") and trying

df = spark.read.parquet("test.parquet"), reading the parquet file now works.
However, I would like to find a way to have the data in CSV/readable form.
I still cannot save df as CSV, as it throws:

java.lang.UnsupportedOperationException: CSV data source does not support
struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data type.

Any idea?

Best regards,

Mina


On Tue, Mar 27, 2018 at 10:51 PM, naresh Goud <nareshgoud.du...@gmail.com> wrote:
When storing as a parquet file, I don't think it requires the header option:
option("header","true")

Give it a try by removing the header option and then try to read it. I haven't tried it; just a thought.
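
Something like this, maybe (an untested sketch; the paths are made up):

# Parquet stores its own schema, so no header option is needed on write or read
df.write.mode("overwrite").parquet("test.parquet")
df2 = spark.read.parquet("test.parquet")
df2.printSchema()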

Thank you,
Naresh


On Tue, Mar 27, 2018 at 9:47 PM Mina Aslani <aslanim...@gmail.com> wrote:

Hi,


I am using PySpark. To transform my sample data and create a model, I use
StringIndexer and OneHotEncoder.

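Roughly like this (a simplified sketch; the column names here are made up):

from pyspark.ml.feature import StringIndexer, OneHotEncoder

indexer = StringIndexer(inputCol="category", outputCol="category_idx")
df_idx = indexer.fit(df).transform(df)

# OneHotEncoder outputs a Vector column, which the CSV writer cannot represent
# (in Spark 2.x OneHotEncoder is a plain Transformer; newer versions need fit() first)
encoder = OneHotEncoder(inputCol="category_idx", outputCol="category_vec")
df_enc = encoder.transform(df_idx)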

However, when I try to write the data as CSV using the command below

df.coalesce(1).write.option("header","true").mode("overwrite").csv("output.csv")


I get an UnsupportedOperationException:

java.lang.UnsupportedOperationException: CSV data source does not support 
struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data type.

Therefore, to save the data and avoid the error, I use


df.coalesce(1).write.option("header","true").mode("overwrite").save("output")


The above command saves the data, but in Parquet format.
How can I read the Parquet file and convert it to CSV to observe the data?

When I use

df = spark.read.parquet("1.parquet"), it throws:

ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding 
blocks

Your input is appreciated.


Best regards,

Mina



--
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/

