[ https://issues.apache.org/jira/browse/SPARK-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032946#comment-15032946 ]
Michael Armbrust commented on SPARK-11941:
------------------------------------------

Sorry, maybe I'm misunderstanding. Can you construct a case where we serialize the case class representation to and from json and we lose information? If you can, then I agree this is a bug and we should fix it. Otherwise, it seems like an inconvenience.

> JSON representation of nested StructTypes could be more uniform
> ---------------------------------------------------------------
>
>                 Key: SPARK-11941
>                 URL: https://issues.apache.org/jira/browse/SPARK-11941
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Henri DF
>
> I have a json file with a single row:
> {code}
> {"a":1, "b": 1.0, "c": "asdfasd", "d":[1, 2, 4]}
> {code}
> After reading that file in, the schema is correctly inferred:
> {code}
> scala> df.printSchema
> root
>  |-- a: long (nullable = true)
>  |-- b: double (nullable = true)
>  |-- c: string (nullable = true)
>  |-- d: array (nullable = true)
>  |    |-- element: long (containsNull = true)
> {code}
> However, the json representation has a strange nesting under "type" for
> column "d":
> {code}
> scala> df.collect()(0).schema.prettyJson
> res60: String =
> {
>   "type" : "struct",
>   "fields" : [ {
>     "name" : "a",
>     "type" : "long",
>     "nullable" : true,
>     "metadata" : { }
>   }, {
>     "name" : "b",
>     "type" : "double",
>     "nullable" : true,
>     "metadata" : { }
>   }, {
>     "name" : "c",
>     "type" : "string",
>     "nullable" : true,
>     "metadata" : { }
>   }, {
>     "name" : "d",
>     "type" : {
>       "type" : "array",
>       "elementType" : "long",
>       "containsNull" : true
>     },
>     "nullable" : true,
>     "metadata" : { }
>   }]
> }
> {code}
> Specifically, in the last element, "type" is an object instead of being a
> string. I would expect the last element to be:
> {code}
> {
>   "name":"d",
>   "type":"array",
>   "elementType":"long",
>   "containsNull":true,
>   "nullable":true,
>   "metadata":{}
> }
> {code}
> There's a similar issue for nested structs.
> (I ran into this while writing node.js bindings, wanted to recurse down this
> representation, which would be nicer if it was uniform...).
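
For anyone else recursing over this representation from outside the JVM, here is a minimal sketch of one way to walk it as it is emitted today. It assumes only what the output quoted above shows: the value under a field's "type" is itself a complete type node, either a bare string for a primitive or an object carrying its own "type" key. The helper name `describe` is hypothetical and not part of Spark, and the sketch deliberately covers only the struct and array shapes that appear in this issue.

```python
import json

# Schema JSON as reported in the issue (output of schema.prettyJson), compacted.
SCHEMA_JSON = """
{
  "type": "struct",
  "fields": [
    {"name": "a", "type": "long", "nullable": true, "metadata": {}},
    {"name": "b", "type": "double", "nullable": true, "metadata": {}},
    {"name": "c", "type": "string", "nullable": true, "metadata": {}},
    {"name": "d",
     "type": {"type": "array", "elementType": "long", "containsNull": true},
     "nullable": true, "metadata": {}}
  ]
}
"""


def describe(data_type):
    """Render a type node as a compact string (hypothetical helper, not a Spark API)."""
    # Primitive types appear as bare strings such as "long".
    if isinstance(data_type, str):
        return data_type
    # Complex types are objects whose own "type" key names their kind.
    kind = data_type["type"]
    if kind == "array":
        return "array<" + describe(data_type["elementType"]) + ">"
    if kind == "struct":
        inner = ", ".join(
            field["name"] + ": " + describe(field["type"])
            for field in data_type["fields"]
        )
        return "struct<" + inner + ">"
    # Any other complex kind is outside the scope of this sketch.
    raise ValueError("unhandled type: " + str(kind))


schema = json.loads(SCHEMA_JSON)
summary = describe(schema)
print(summary)
```

Read this way, the nesting under "type" for column "d" is the same recursion step that applies at the root: the walker needs one branch for strings and one per complex kind, and does not have to special-case fields.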