In the schema shown, tag is nested under element, i.e. it lives inside the structs that make up the rules array. I wonder if that was a factor.
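One other thing that might be worth ruling out: the commented-out chain calls rawDF.col(tagCol), and the column list in the exception (actor, body, ..., verb) looks like rawDF's own top-level columns, which do not include tag. As far as I know, col() resolves a name against the DataFrame it is called on, so the lookup may fail simply because tag only exists on the result of selectExpr. Below is a toy sketch of that idea in plain Java. It is not Spark code: Frame, selectStarPlus and the shortened column list are hypothetical stand-ins I made up to model the name lookup only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Spark's DataFrame, just enough to show name resolution.
final class ColumnResolutionSketch {

    /** Hypothetical stand-in for a DataFrame: it only knows its top-level column names. */
    static final class Frame {
        private final List<String> columns;

        Frame(List<String> columns) {
            this.columns = new ArrayList<>(columns);
        }

        /** Models selectExpr("*", path): keeps every column and adds one named after the last path segment. */
        Frame selectStarPlus(String path) {
            List<String> out = new ArrayList<>(columns);
            out.add(path.substring(path.lastIndexOf('.') + 1));
            return new Frame(out);
        }

        /** Models col(name): the name is looked up in this frame's own columns only. */
        String col(String name) {
            if (!columns.contains(name)) {
                throw new IllegalArgumentException(
                    "Cannot resolve column name \"" + name + "\" among " + columns);
            }
            return name;
        }
    }

    public static void main(String[] args) {
        // Shortened, assumed column list; the real rawDF has more columns.
        Frame rawDF = new Frame(Arrays.asList("actor", "body", "pip", "verb"));
        Frame emptyDF1 = rawDF.selectStarPlus("pip.rules.tag");

        // Looking the name up on the derived frame works.
        System.out.println("emptyDF1 resolves: " + emptyDF1.col("tag"));

        // Looking it up on the original frame fails, as in the commented-out chain.
        try {
            rawDF.col("tag");
        } catch (IllegalArgumentException e) {
            System.out.println("rawDF fails: " + e.getMessage());
        }
    }
}
```

If that is what is happening, it would also explain why the two-step version works: there the column is fetched from emptyDF1, which already has tag.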

On Fri, May 13, 2016 at 11:49 AM, Andy Davidson <a...@santacruzintegration.com> wrote:

> I am using spark-1.6.1.
>
> I create a data frame from a very complicated JSON file. I would assume
> that the query planner would treat both versions of my transformation
> chain the same way.
>
> // org.apache.spark.sql.AnalysisException: Cannot resolve column name
> "tag" among (actor, body, generator, pip, id, inReplyTo, link, object,
> objectType, postedTime, provider, retweetCount, twitter_entities, verb);
>
> // DataFrame emptyDF = rawDF.selectExpr("*", "pip.rules.tag")
> //     .filter(rawDF.col(tagCol).isNull());
>
> DataFrame emptyDF1 = rawDF.selectExpr("*", "pip.rules.tag");
> DataFrame emptyDF = emptyDF1.filter(emptyDF1.col("tag").isNull());
>
> Here is the schema for the gnip structure
>
>  |-- pip: struct (nullable = true)
>  |    |-- _profile: struct (nullable = true)
>  |    |    |-- topics: array (nullable = true)
>  |    |    |    |-- element: string (containsNull = true)
>  |    |-- rules: array (nullable = true)
>  |    |    |-- element: struct (containsNull = true)
>  |    |    |    |-- tag: string (nullable = true)
>
> Is this a bug?
>
> Andy