Re: java.lang.UnsupportedOperationException: CSV data source does not support struct/ERROR RetryingBlockFetcher

2018-03-28 Thread Jiří Syrový
Quick comment:

Excel CSV (a very special case, though) supports array-like values by allowing "\n"
inside quoted fields, but you then have to use "\r\n" (the Windows EOL) as the row terminator.
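A minimal plain-Python sketch of that convention (the field values here are illustrative, not from the thread):

```python
import csv
import io

buf = io.StringIO()
# "\r\n" terminates rows; a bare "\n" may appear inside a quoted field.
writer = csv.writer(buf, lineterminator="\r\n", quoting=csv.QUOTE_MINIMAL)
writer.writerow(["id", "values"])
writer.writerow([1, "a\nb\nc"])  # multi-line field is quoted automatically
raw = buf.getvalue()

# Reading it back: the embedded "\n" survives inside the quoted field.
rows = list(csv.reader(io.StringIO(raw)))
```

Excel follows the same convention when importing such a file, which is what makes this a workable (if fragile) way to smuggle list-like data through CSV.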

Cheers,
Jiri

2018-03-28 14:14 GMT+02:00 Yong Zhang <java8...@hotmail.com>:

> Your dataframe has an array data type, which is NOT supported by CSV. How
> could a CSV file include an array or other nested structure?
>
>
> If you want your data to be human-readable text, write it out as JSON in
> your case instead.
>
>
> Yong
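To make Yong's point concrete: JSON represents nesting directly, while CSV has no standard flat encoding for it. A plain-Python sketch with illustrative values (not the thread's actual data):

```python
import json

# A record shaped like the failing struct<type,size,indices,values> column.
record = {"type": 0, "size": 4, "indices": [1, 3], "values": [1.0, 1.0]}

# One JSON document per line is exactly what Spark's JSON writer emits.
line = json.dumps(record, sort_keys=True)
round_tripped = json.loads(line)  # nesting survives the round trip
```

In PySpark terms this corresponds to replacing the failing `.csv(...)` call with `df.write.json("output")`.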
>
>
> --
> *From:* Mina Aslani <aslanim...@gmail.com>
> *Sent:* Wednesday, March 28, 2018 12:22 AM
> *To:* naresh Goud
> *Cc:* user @spark
> *Subject:* Re: java.lang.UnsupportedOperationException: CSV data source
> does not support struct/ERROR RetryingBlockFetcher
>
> Hi Naresh,
>
> Thank you for the quick response, appreciate it.
> Removing the option("header","true") and trying
>
> df = spark.read.parquet("test.parquet") now works; I can read the parquet
> file. However, I would like to find a way to have the data in CSV/readable
> form. I still cannot save df as CSV, as it throws:
>
> java.lang.UnsupportedOperationException: CSV data source does not support
> struct<type:tinyint,size:int,indices:array,values:array>
> data type.
>
> Any idea?
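One possible workaround, offered as a sketch rather than what the thread settled on: serialize the unsupported nested column to a JSON string before writing CSV, since plain strings are CSV-friendly. In plain Python, with hypothetical rows mirroring the struct<type,size,indices,values> column:

```python
import csv
import io
import json

# Hypothetical rows; "features" stands in for the offending struct column.
rows = [
    {"id": 1, "features": {"type": 0, "size": 3, "indices": [0, 2], "values": [1.0, 1.0]}},
    {"id": 2, "features": {"type": 0, "size": 3, "indices": [1], "values": [1.0]}},
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "features"])
for r in rows:
    # Stringify the nested value so the row becomes flat and CSV-safe.
    writer.writerow([r["id"], json.dumps(r["features"])])
csv_text = buf.getvalue()
```

In PySpark the analogous step might be `df.withColumn("features", to_json("features"))` (from `pyspark.sql.functions`) before the `.csv(...)` write, assuming the column really is a struct; vector columns produced by OneHotEncoder would first need converting to a struct or string, which this sketch glosses over.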
>
>
> Best regards,
>
> Mina
>
>
> On Tue, Mar 27, 2018 at 10:51 PM, naresh Goud <nareshgoud.du...@gmail.com>
> wrote:
>
> In the case of storing as a parquet file, I don't think it requires the
> header option option("header","true").
>
> Give it a try by removing the header option and then try to read it. I
> haven't tried it; just a thought.
>
> Thank you,
> Naresh
>
>
> On Tue, Mar 27, 2018 at 9:47 PM Mina Aslani <aslanim...@gmail.com> wrote:
>
> Hi,
>
>
> I am using PySpark. To transform my sample data and create a model, I use
> StringIndexer and OneHotEncoder.
>
>
> However, when I try to write the data as CSV using the command below
>
> df.coalesce(1).write.option("header","true").mode("overwrite").csv("output.csv")
>
>
> I get UnsupportedOperationException
>
> java.lang.UnsupportedOperationException: CSV data source does not support
> struct<type:tinyint,size:int,indices:array,values:array>
> data type.
>
> Therefore, to save data and avoid getting the error I use
>
>
> df.coalesce(1).write.option("header","true").mode("overwrite").save("output")
>
>
> The above command saves the data, but in parquet format.
> How can I read the parquet file and convert it to CSV to observe the data?
>
> When I use
>
> df = spark.read.parquet("1.parquet"), it throws:
>
> ERROR RetryingBlockFetcher: Exception while beginning fetch of 1
> outstanding blocks
>
> Your input is appreciated.
>
>
> Best regards,
>
> Mina
>
>
>
> --
> Thanks,
> Naresh
> www.linkedin.com/in/naresh-dulam
> http://hadoopandspark.blogspot.com/
>
>
>

