[ https://issues.apache.org/jira/browse/SPARK-21989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164152#comment-16164152 ]
Jen-Ming Chung edited comment on SPARK-21989 at 9/13/17 5:27 AM:
-----------------------------------------------------------------

Hi [~client.test],

I rewrote the code above in Scala and ran it on Spark 2.2.0; it shows the schema and content you expected.

{code:language=Scala}
import com.google.gson.Gson
import org.apache.spark.rdd.RDD

case class SampleData(str: String)

...

import spark.implicits._

val arr = Seq("{\"str\": \"everyone\"}", "{\"str\": \"Hello\"}")

// Parse each JSON string into a SampleData instance with Gson.
val rdd: RDD[SampleData] = spark
  .sparkContext
  .parallelize(arr)
  .map(v => new Gson().fromJson[SampleData](v, classOf[SampleData]))

// The encoder for the case class comes from spark.implicits._
val ds = spark.createDataset(rdd)

ds.printSchema()
// root
//  |-- str: string (nullable = true)

ds.show(false)
// +--------+
// |str     |
// +--------+
// |everyone|
// |Hello   |
// +--------+
{code}


was (Author: jmchung):
Hi [~client.test],

I rewrote the code above in Scala and ran it on Spark 2.2.0; it shows the schema and content you expected.

{code:language=Scala}
case class SimpleData(str: String)

...

import spark.implicits._

val arr = Seq("{\"str\": \"everyone\"}", "{\"str\": \"Hello\"}")

val rdd: RDD[SimpleData] = spark
  .sparkContext
  .parallelize(arr)
  .map(v => new Gson().fromJson[SimpleData](v, classOf[SimpleData]))

val ds = spark.createDataset(rdd)

ds.printSchema()
root
 |-- str: string (nullable = true)

ds.show(false)
+--------+
|str     |
+--------+
|everyone|
|Hello   |
+--------+
{code}

> createDataset and the schema of encoder class
> ---------------------------------------------
>
>                 Key: SPARK-21989
>                 URL: https://issues.apache.org/jira/browse/SPARK-21989
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: taiho choi
>
> Hello.
> public class SampleData implements Serializable {
>     public String str;
> }
> ArrayList<String> arr = new ArrayList<>();
> arr.add("{\"str\": \"everyone\"}");
> arr.add("{\"str\": \"Hello\"}");
> JavaRDD<SampleData> data2 = sc.parallelize(arr).map(v -> {return new
> Gson().fromJson(v, SampleData.class);});
> Dataset<SampleData> df = sqc.createDataset(data2.rdd(),
> Encoders.bean(SampleData.class));
> df.printSchema();
>
> The expected result of printSchema is the str field of the SampleData class.
> The actual result is the following:
> root
>
> And if I call df.show(), it displays the following:
> ++
> ||
> ++
> ||
> ||
> ++
>
> What I expected is that "Hello" and "everyone" would be displayed.
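
A note on the empty schema reported above, which the thread itself does not explain: `Encoders.bean` derives its columns from JavaBean properties (getter/setter pairs), so a class that exposes only a public field, as `SampleData` does in the report, would plausibly yield no columns. The sketch below uses only the JDK's `java.beans.Introspector` to show that difference and does not call Spark. The names `BeanPropertyCheck`, `PublicFieldSample`, and `BeanSample` are hypothetical, introduced for illustration, and the assumption that Spark filters properties exactly this way is mine.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class BeanPropertyCheck {

    // Hypothetical stand-in for the reporter's class: a public field, no accessors.
    public static class PublicFieldSample implements Serializable {
        public String str;
    }

    // Hypothetical JavaBean-style variant: private field plus getter and setter.
    public static class BeanSample implements Serializable {
        private String str;

        public String getStr() { return str; }

        public void setStr(String str) { this.str = str; }
    }

    // Lists the bean properties of cls that have both a getter and a setter.
    // Assumption: this approximates what a bean-based encoder would see.
    public static List<String> propertyNames(Class<?> cls) {
        try {
            List<String> names = new ArrayList<>();
            // Stop introspection at Object so the "class" property is excluded.
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(cls, Object.class).getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(PublicFieldSample.class)); // prints []
        System.out.println(propertyNames(BeanSample.class));        // prints [str]
    }
}
```

If that assumption holds, giving `SampleData` a `getStr`/`setStr` pair should let `Encoders.bean(SampleData.class)` report the `str` column; this was not verified against Spark 2.2.0 here.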