Spark 2.0 Encoder().schema() is sorting StructFields

Paul Stewart Wed, 12 Oct 2016 07:57:13 -0700

Hi all,

I am using Spark 2.0 to read a CSV file into a Dataset in Java.  This works 
fine if I define the StructType with the StructField array ordered by hand.  
What I would like to do is use a bean class for both the schema and the Dataset 
row type.  For example,


Dataset<Bean> beanDS = spark.read()
    .schema(Encoders.bean(Bean.class).schema())
    .csv(path)  // path: placeholder for the CSV file location
    .as(Encoders.bean(Bean.class));

When I use Encoders.bean(Bean.class).schema() to generate the StructType, its
StructFields come back sorted by attribute name rather than in the order the
attributes are defined in Bean.class.  This makes the schema unusable for
reading a CSV file, where the columns are matched to the schema by position
and the ordering of the attributes is therefore significant.
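
The ordering difference can be reproduced without Spark. The sketch below
uses only the JDK and rests on an assumption not confirmed in this thread:
that the bean encoder discovers its fields through JavaBeans introspection,
which reports properties sorted by name. The Bean class and its attribute
names are hypothetical, chosen so that the defined order differs from the
sorted order.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class BeanOrderDemo {

    // Hypothetical bean: attributes defined in the order name, id, amount.
    public static class Bean {
        private String name;
        private int id;
        private double amount;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public double getAmount() { return amount; }
        public void setAmount(double amount) { this.amount = amount; }
    }

    // Property names as reported by JavaBeans introspection.
    // Object.class is the stop class, so the "class" property is excluded.
    public static List<String> introspectedOrder(Class<?> cls) {
        try {
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd :
                    Introspector.getBeanInfo(cls, Object.class).getPropertyDescriptors()) {
                names.add(pd.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Field names as returned by reflection. On common JVMs this follows the
    // source order, but the Java specification does not guarantee it.
    public static List<String> declaredOrder(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            if (!f.isSynthetic()) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("introspected: " + introspectedOrder(Bean.class));
        System.out.println("declared:     " + declaredOrder(Bean.class));
    }
}
```

If the assumption holds, the first list corresponds to the field order of the
generated schema, while the second is the order a CSV file laid out after the
class definition would need.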

Is there any way to make Encoders.bean(Bean.class).schema() return the array
of StructFields in the order of the original bean class definition?  (Aside
from prefixing every attribute name to maintain order.)

Would this be considered a bug/enhancement?

Regards,
Paul


