Re: strange behavior when I chain data frame transformations

Ted Yu Fri, 13 May 2016 12:39:29 -0700

In the schema shown, tag is nested under element: pip.rules is an array of
structs, and tag is a field of each struct.

I wonder if that was a factor.
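
Another possible factor, as far as I can tell: DataFrame.col() resolves the
name against the frame it is called on. In the failing chain the filter uses
rawDF.col(...), and rawDF has no top-level "tag" column. That matches the
column list in the exception. The working version asks emptyDF1, which does
have it. Below is a toy model of that lookup in plain Java. It is not the
Spark API: Frame, withExtra and canResolve are hypothetical names made up for
illustration, and the column list is shortened.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only, NOT Spark: a "Frame" is just its set of top-level column names.
class ColumnResolutionSketch {

    static final class Frame {
        final Set<String> columns;

        Frame(String... names) {
            columns = new LinkedHashSet<>(Arrays.asList(names));
        }

        // Models a projection like selectExpr("*", "pip.rules.tag"):
        // keeps every existing column and appends one new top-level column.
        Frame withExtra(String name) {
            Set<String> all = new LinkedHashSet<>(columns);
            all.add(name);
            return new Frame(all.toArray(new String[0]));
        }

        // Models a name lookup bound to THIS frame's top-level columns.
        boolean canResolve(String name) {
            return columns.contains(name);
        }
    }

    // Shortened, illustrative stand-in for the raw frame's top-level columns.
    static Frame raw() {
        return new Frame("actor", "body", "pip");
    }

    public static void main(String[] args) {
        Frame rawDF = raw();
        Frame emptyDF1 = rawDF.withExtra("tag");

        // The raw frame only knows "pip", not the projected "tag".
        System.out.println("rawDF resolves tag:    " + rawDF.canResolve("tag"));
        // The projected frame has "tag" as a top-level column.
        System.out.println("emptyDF1 resolves tag: " + emptyDF1.canResolve("tag"));
    }
}
```

If that is what is happening, a single chain should also work when the filter
does not go through rawDF, for example with a string condition such as
filter("tag is null"). I have not tried that against this schema.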


On Fri, May 13, 2016 at 11:49 AM, Andy Davidson <
a...@santacruzintegration.com> wrote:

> I am using spark-1.6.1.
>
> I create a data frame from a very complicated JSON file. I would assume
> that the query planner would treat both versions of my transformation
> chain the same way.
>
>
> // Version 1 (single chain) fails with:
> // org.apache.spark.sql.AnalysisException: Cannot resolve column name
> // "tag" among (actor, body, generator, pip, id, inReplyTo, link, object,
> // objectType, postedTime, provider, retweetCount, twitter_entities, verb);
> // DataFrame emptyDF = rawDF.selectExpr("*", "pip.rules.tag")
> //     .filter(rawDF.col(tagCol).isNull());
>
> // Version 2 (two steps) works:
> DataFrame emptyDF1 = rawDF.selectExpr("*", "pip.rules.tag");
> DataFrame emptyDF = emptyDF1.filter(emptyDF1.col("tag").isNull());
>
>
> Here is the schema for the gnip structure
>
>  |-- pip: struct (nullable = true)
>  |    |-- _profile: struct (nullable = true)
>  |    |    |-- topics: array (nullable = true)
>  |    |    |    |-- element: string (containsNull = true)
>  |    |-- rules: array (nullable = true)
>  |    |    |-- element: struct (containsNull = true)
>  |    |    |    |-- tag: string (nullable = true)
>
>
> Is this a bug?
>
>
> Andy
>
>
>
