Hi Ted

It seems really strange. It seems that in the version where I used two data frames, Spark added "as(tag)" (which is really nice). Odd that I got different behavior.

Is this a bug?

Kind regards

Andy

From: Ted Yu <yuzhih...@gmail.com>
Date: Friday, May 13, 2016 at 12:38 PM
To: Andrew Davidson <a...@santacruzintegration.com>
Cc: "user @spark" <user@spark.apache.org>
Subject: Re: strange behavior when I chain data frame transformations

> In the structure shown, tag is under element.
>
> I wonder if that was a factor.
>
> On Fri, May 13, 2016 at 11:49 AM, Andy Davidson
> <a...@santacruzintegration.com> wrote:
>> I am using spark-1.6.1.
>>
>> I create a data frame from a very complicated JSON file. I would assume that
>> the query planner would treat both versions of my transformation chain the
>> same way.
>>
>> // org.apache.spark.sql.AnalysisException: Cannot resolve column name "tag"
>> // among (actor, body, generator, pip, id, inReplyTo, link, object, objectType,
>> // postedTime, provider, retweetCount, twitter_entities, verb);
>> // DataFrame emptyDF = rawDF.selectExpr("*", "pip.rules.tag")
>> //     .filter(rawDF.col(tagCol).isNull());
>>
>> DataFrame emptyDF1 = rawDF.selectExpr("*", "pip.rules.tag");
>> DataFrame emptyDF = emptyDF1.filter(emptyDF1.col("tag").isNull());
>>
>> Here is the schema for the gnip structure
>>
>> |-- pip: struct (nullable = true)
>> |    |-- _profile: struct (nullable = true)
>> |    |    |-- topics: array (nullable = true)
>> |    |    |    |-- element: string (containsNull = true)
>> |    |-- rules: array (nullable = true)
>> |    |    |-- element: struct (containsNull = true)
>> |    |    |    |-- tag: string (nullable = true)
>>
>> Is this a bug?
>>
>> Andy
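
One possible reading of the difference between the two versions, not confirmed anywhere in this thread: the chained version asks rawDF for the column, and the exception lists only rawDF's top-level columns, none of which is "tag". The two-step version asks emptyDF1, which already carries the extra column produced by selectExpr. The sketch below is a toy model in plain Java, not Spark code. ToyFrame, selectStarPlus and the shortened column list are hypothetical names made up for illustration, and the aliasing rule (name the new column after the last path segment) is an assumption based on the "as(tag)" behavior described in the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-in for a data frame: only top-level column names are modelled.
// This is NOT the Spark API; every name here is hypothetical.
final class ToyFrame {
    private final List<String> columns;

    ToyFrame(String... columns) {
        this.columns = new ArrayList<>(Arrays.asList(columns));
    }

    List<String> columns() {
        return columns;
    }

    // Mimics selectExpr("*", "a.b.c"): keep every existing column and append
    // one more, named after the last path segment (assumed aliasing rule).
    ToyFrame selectStarPlus(String nestedPath) {
        String alias = nestedPath.substring(nestedPath.lastIndexOf('.') + 1);
        List<String> out = new ArrayList<>(columns);
        out.add(alias);
        return new ToyFrame(out.toArray(new String[0]));
    }

    // Mimics col(name): the lookup only sees this frame's own top-level columns.
    String col(String name) {
        if (!columns.contains(name)) {
            throw new IllegalArgumentException(
                "Cannot resolve column name \"" + name + "\" among " + columns);
        }
        return name;
    }

    public static void main(String[] args) {
        // Shortened, illustrative column list.
        ToyFrame rawDF = new ToyFrame("actor", "body", "pip");
        ToyFrame emptyDF1 = rawDF.selectStarPlus("pip.rules.tag");

        // Two-step version: the derived frame has a top-level "tag" column.
        System.out.println("resolved on emptyDF1: " + emptyDF1.col("tag"));

        // Chained version: the lookup goes to rawDF, which has no top-level "tag".
        try {
            rawDF.col("tag");
        } catch (IllegalArgumentException e) {
            System.out.println("failed on rawDF: " + e.getMessage());
        }
    }
}
```

If this reading holds, the two versions differ in which frame resolves the column name rather than in how the query planner treats the chain. Whether Spark 1.6.1 behaves exactly this way would need to be checked against the real API.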