[ https://issues.apache.org/jira/browse/SPARK-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624791#comment-16624791 ]
Apache Spark commented on SPARK-17952:
--------------------------------------

User 'michalsenkyr' has created a pull request for this issue:
https://github.com/apache/spark/pull/22527

> SparkSession createDataFrame method throws exception for nested JavaBeans
> -------------------------------------------------------------------------
>
>                 Key: SPARK-17952
>                 URL: https://issues.apache.org/jira/browse/SPARK-17952
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 2.0.1, 2.3.0
>            Reporter: Amit Baghel
>            Priority: Major
>
> As per the latest Spark documentation for Java at
> http://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection:
> {quote}
> Nested JavaBeans and List or Array fields are supported though.
> {quote}
> However, nested JavaBeans do not work. Please see the code below.
> SubCategory class:
> {code}
> public class SubCategory implements Serializable {
>     private String id;
>     private String name;
>
>     public String getId() {
>         return id;
>     }
>     public void setId(String id) {
>         this.id = id;
>     }
>     public String getName() {
>         return name;
>     }
>     public void setName(String name) {
>         this.name = name;
>     }
> }
> {code}
> Category class:
> {code}
> public class Category implements Serializable {
>     private String id;
>     private SubCategory subCategory;
>
>     public String getId() {
>         return id;
>     }
>     public void setId(String id) {
>         this.id = id;
>     }
>     public SubCategory getSubCategory() {
>         return subCategory;
>     }
>     public void setSubCategory(SubCategory subCategory) {
>         this.subCategory = subCategory;
>     }
> }
> {code}
> SparkSample class (imports added for completeness):
> {code}
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
>
> import org.apache.spark.sql.Dataset;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.SparkSession;
>
> public class SparkSample {
>     public static void main(String[] args) throws IOException {
>
>         SparkSession spark = SparkSession
>                 .builder()
>                 .appName("SparkSample")
>                 .master("local")
>                 .getOrCreate();
>         //SubCategory
>         SubCategory sub = new SubCategory();
>         sub.setId("sc-111");
>         sub.setName("Sub-1");
>         //Category
>         Category category = new Category();
>         category.setId("s-111");
>         category.setSubCategory(sub);
>         //categoryList
>         List<Category> categoryList = new ArrayList<Category>();
>         categoryList.add(category);
>         //DF
>         Dataset<Row> dframe = spark.createDataFrame(categoryList, Category.class);
>         dframe.show();
>     }
> }
> {code}
> The above code throws the error below.
> {code}
> Exception in thread "main" scala.MatchError: com.sample.SubCategory@e7391d (of class com.sample.SubCategory)
> 	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:256)
> 	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:251)
> 	at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:103)
> 	at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:403)
> 	at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
> 	at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> 	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> 	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> 	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> 	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
> 	at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1106)
> 	at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1104)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> 	at scala.collection.Iterator$class.toStream(Iterator.scala:1322)
> 	at scala.collection.AbstractIterator.toStream(Iterator.scala:1336)
> 	at scala.collection.TraversableOnce$class.toSeq(TraversableOnce.scala:298)
> 	at scala.collection.AbstractIterator.toSeq(Iterator.scala:1336)
> 	at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:373)
> 	at com.sample.SparkSample.main(SparkSample.java:33)
> {code}
> The createDataFrame method throws the above exception, but I observed that the createDataset method works fine with the code below.
> {code}
> Encoder<Category> encoder = Encoders.bean(Category.class);
> Dataset<Category> dframe = spark.createDataset(categoryList, encoder);
> dframe.show();
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
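Editorial aside: the `scala.MatchError` comes from Catalyst's row conversion (`CatalystTypeConverters`), which in the affected versions did not handle a bean-typed field, while `Encoders.bean` does traverse nested properties. The failure is therefore not in JavaBean discovery itself. The sketch below uses only the JDK's `java.beans.Introspector` to show that standard introspection already sees the nested `subCategory` property on the report's beans; the class name `BeanSchemaSketch` and the helper `describe` are invented for illustration and are not Spark code.

```java
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

public class BeanSchemaSketch {

    // The two beans from the report, reproduced in shape.
    public static class SubCategory implements Serializable {
        private String id;
        private String name;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    public static class Category implements Serializable {
        private String id;
        private SubCategory subCategory;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public SubCategory getSubCategory() { return subCategory; }
        public void setSubCategory(SubCategory subCategory) { this.subCategory = subCategory; }
    }

    // Lists the bean properties the JDK Introspector reports for a class,
    // as "name:Type" pairs (the Introspector returns them sorted by name).
    // Spark's real inference lives in Catalyst; this only illustrates that
    // the nested property is discoverable by ordinary bean reflection.
    public static String describe(Class<?> beanClass) {
        try {
            StringBuilder sb = new StringBuilder();
            for (PropertyDescriptor pd :
                    Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors()) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(pd.getName()).append(':').append(pd.getPropertyType().getSimpleName());
            }
            return sb.toString();
        } catch (java.beans.IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Category.class));    // id:String subCategory:SubCategory
        System.out.println(describe(SubCategory.class)); // id:String name:String
    }
}
```

Since introspection finds `subCategory:SubCategory` without trouble, the nested field must be lost later, in the bean-to-`Row` conversion, which matches where the stack trace fails.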