[ https://issues.apache.org/jira/browse/SPARK-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065021#comment-15065021 ]
Simeon Simeonov commented on SPARK-7301:
----------------------------------------

It would be nice if using backticks to refer to the column resolved the ambiguity, e.g., {{select `A` as upperA, `a` as lowerA from test}}.

> Issue with duplicated fields in interpreted JSON schemas
> --------------------------------------------------------
>
>                 Key: SPARK-7301
>                 URL: https://issues.apache.org/jira/browse/SPARK-7301
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: David Crossland
>
> I have a large JSON dataset that has evolved over time; some fields have
> been slightly renamed or capitalised differently along the way. As a
> result, there are certain fields that Spark considers ambiguous when I
> attempt to access them, and I get the following error:
>
> org.apache.spark.sql.AnalysisException: Ambiguous reference to fields
> StructField(Currency,StringType,true), StructField(currency,StringType,true);
>
> There appears to be no way to resolve an ambiguous field after it has been
> inferred by Spark SQL, other than to manually construct the schema using
> StructType/StructField, which is a bit heavy-handed as the schema is quite
> large. Is there some way to resolve an ambiguous reference, or to affect
> the schema post-inference? It seems like something of a bug that I can't
> tell Spark to treat both fields as though they were the same. I've created
> a test where I manually defined a schema as
>
> val schema = StructType(Seq(StructField("A", StringType, true)))
>
> and it returns 2 rows when I perform a count on the following dataset:
>
> {"A":"test1"}
> {"a":"test2"}
>
> If I could modify the schema to remove the duplicate entries, then I could
> work around this issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
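As a workaround sketch (not a Spark feature, and not from the ticket itself): since the ambiguity arises during schema inference over the raw JSON text, one option is to normalize case-variant keys in each record before Spark ever sees them, so inference produces a single field. The helper name `normalize_keys` and the lowercase-and-last-non-null-wins merge policy are assumptions for illustration; it is shown here in plain Python over JSON lines rather than through the Spark API.

```python
import json

def normalize_keys(record: dict) -> dict:
    """Merge case-variant keys (e.g. "A" and "a") by lowercasing each key.

    If both variants appear in one record, a non-null value is preferred
    over a null one; otherwise the first value seen is kept.
    """
    out = {}
    for key, value in record.items():
        k = key.lower()
        if k not in out or out[k] is None:
            out[k] = value
    return out

# The two records from the ticket, as JSON lines:
lines = ['{"A":"test1"}', '{"a":"test2"}']
normalized = [normalize_keys(json.loads(line)) for line in lines]
# Both records now expose the single key "a", so Spark's schema
# inference would see one StructField instead of two ambiguous ones.
```

The normalized records could then be written back out as JSON lines (or parallelized directly) and read with the usual Spark JSON reader, sidestepping the ambiguous StructField pair entirely.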