To: dev@arrow.apache.org
Subject: Re: Major difference between Spark and Arrow Parquet Implementations
hi Erin -- please send a separate e-mail to dev-unsubscr...@arrow.apache.org
Thanks
On Wed, Aug 16, 2017 at 1:06 PM, Erin Sobkow wrote:
> Hi Wes:
>
> Somehow I have been inadv
> From: Wes McKinney [mailto:wesmck...@gmail.com]
> Sent: August 16, 2017 10:04 AM
> To: dev@arrow.apache.org
> Subject: Re: Major difference between Spark and Arrow Parquet Implementations
>
> hi Lucas,
>
> My understanding is that the Parquet format by itself does not place any such
> restrictions on the names of fields, and so this is a Spark SQL-specific issue
> (anyone please correct me if I'm mistaken about this). I would be happy to help
> add a schema cleaning option to normalize field names.
From: Wes McKinney [mailto:wesmck...@gmail.com]
Sent: August 16, 2017 10:04 AM
To: dev@arrow.apache.org
Subject: Re: Major difference between Spark and Arrow Parquet Implementations
hi Lucas,
My understanding is that the Parquet format by itself does not place
any such restrictions on the names of fields, and so this is a Spark
SQL-specific issue (anyone please correct me if I'm mistaken about
this). I would be happy to help add a schema cleaning option to
normalize field names.
Hello,
I have been using pyarrow and PySpark to write Parquet files. I have used
pyarrow to successfully write out a Parquet file with spaces in column names.
E.g. 'X Coordinate'.
When I try to write out the same dataset using Spark's Parquet writer, it fails
claiming:
"Attribute name "X Coordinate" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it."