[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110796#comment-15110796 ]
Apache Spark commented on SPARK-12932:
--------------------------------------

User 'andygrove' has created a pull request for this issue:
https://github.com/apache/spark/pull/10865

> Bad error message with trying to create Dataset from RDD of Java objects that
> are not bean-compliant
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12932
>                 URL: https://issues.apache.org/jira/browse/SPARK-12932
>             Project: Spark
>          Issue Type: Improvement
>          Components: Java API
>    Affects Versions: 1.6.0
>        Environment: Ubuntu 15.10 / Java 8
>            Reporter: Andy Grove
>            Priority: Minor
>
> When trying to create a Dataset from an RDD of Person (all using the Java
> API), I got the error "java.lang.UnsupportedOperationException: no encoder
> found for example_java.dataset.Person". This is not a very helpful error, and
> no other logging information was apparent to help troubleshoot it.
> It turned out that the root cause was that my Person class had neither a
> default constructor nor setter methods.
> This JIRA is for implementing a more useful error message to help Java
> developers who are trying out the Dataset API for the first time.
> The full stack trace is:
> {code}
> Exception in thread "main" java.lang.UnsupportedOperationException: no encoder found for example_java.common.Person
> 	at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403)
> 	at org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314)
> 	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75)
> 	at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176)
> 	at org.apache.spark.sql.Encoders.bean(Encoder.scala)
> {code}
> NOTE that if I provide EITHER the default constructor OR the setters, but
> not both, then I get a stack trace with much more useful information;
> omitting BOTH causes this issue.
> The original source is below.
> {code:title=Example.java}
> public class JavaDatasetExample {
>
>     public static void main(String[] args) throws Exception {
>
>         SparkConf sparkConf = new SparkConf()
>                 .setAppName("Example")
>                 .setMaster("local[*]");
>
>         JavaSparkContext sc = new JavaSparkContext(sparkConf);
>         SQLContext sqlContext = new SQLContext(sc);
>
>         List<Person> people = ImmutableList.of(
>                 new Person("Joe", "Bloggs", 21, "NY")
>         );
>
>         Dataset<Person> dataset = sqlContext.createDataset(people,
>                 Encoders.bean(Person.class));
>     }
> }
> {code}
> {code:title=Person.java}
> class Person implements Serializable {
>
>     String first;
>     String last;
>     int age;
>     String state;
>
>     public Person() {
>     }
>
>     public Person(String first, String last, int age, String state) {
>         this.first = first;
>         this.last = last;
>         this.age = age;
>         this.state = state;
>     }
>
>     public String getFirst() {
>         return first;
>     }
>
>     public String getLast() {
>         return last;
>     }
>
>     public int getAge() {
>         return age;
>     }
>
>     public String getState() {
>         return state;
>     }
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To
unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
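[Editor's note] The bean-compliance requirement described in the issue can be illustrated with plain JDK introspection. The sketch below (not part of the issue or the linked PR; the class body is an assumed fix along the lines the reporter describes) gives Person a public no-arg constructor plus a getter AND setter per field, then uses java.beans.Introspector, the JDK's own bean scanner, to confirm that every property is both readable and writable. That is the JavaBeans convention the reporter found Encoders.bean to depend on: omitting both the default constructor and the setters triggered the unhelpful error.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

// Hypothetical bean-compliant version of the issue's Person class:
// public no-arg constructor plus a getter and setter for each field.
public class Person implements Serializable {
    private String first;
    private String last;
    private int age;
    private String state;

    public Person() { }  // the no-arg constructor the original class lacked

    public String getFirst() { return first; }
    public void setFirst(String first) { this.first = first; }
    public String getLast() { return last; }
    public void setLast(String last) { this.last = last; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
    public String getState() { return state; }
    public void setState(String state) { this.state = state; }

    public static void main(String[] args) throws Exception {
        // Introspect Person, stopping before Object so only Person's own
        // properties are listed. A property descriptor has a non-null
        // read method only if a conventional getter exists, and a non-null
        // write method only if a conventional setter exists.
        BeanInfo info = Introspector.getBeanInfo(Person.class, Object.class);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            System.out.println(pd.getName()
                    + " readable=" + (pd.getReadMethod() != null)
                    + " writable=" + (pd.getWriteMethod() != null));
        }
    }
}
```

Running main lists the four properties (age, first, last, state) as readable and writable; deleting the setters makes the write methods disappear from the descriptors, which mirrors the non-compliance that broke encoder creation in the issue.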