Hi Ted

It seems really strange. It seems that in the version where I used 2 data
frames, Spark added "as(tag)" (which is really nice).

Odd that I got different behavior

Is this a bug?

Kind regards

Andy



From:  Ted Yu <yuzhih...@gmail.com>
Date:  Friday, May 13, 2016 at 12:38 PM
To:  Andrew Davidson <a...@santacruzintegration.com>
Cc:  "user @spark" <user@spark.apache.org>
Subject:  Re: strange behavior when I chain data frame transformations

> In the structure shown, tag is under element.
> 
> I wonder if that was a factor.
> 
> On Fri, May 13, 2016 at 11:49 AM, Andy Davidson
> <a...@santacruzintegration.com> wrote:
>> I am using spark-1.6.1.
>> 
>> I create a data frame from a very complicated JSON file. I would assume that
>> the query planner would treat both versions of my transformation chains the
>> same way.
>> 
>> 
>> // org.apache.spark.sql.AnalysisException: Cannot resolve column name "tag"
>> among (actor, body, generator, pip, id, inReplyTo, link, object, objectType,
>> postedTime, provider, retweetCount, twitter_entities, verb);
>> 
>> // DataFrame emptyDF = rawDF.selectExpr("*", "pip.rules.tag")
>> 
>> // .filter(rawDF.col(tagCol).isNull());
>> 
>> DataFrame emptyDF1 = rawDF.selectExpr("*", "pip.rules.tag");
>> 
>> DataFrame emptyDF = emptyDF1.filter(emptyDF1.col("tag").isNull());
>> 
>> 
>> 
>> Here is the schema for the gnip structure
>> 
>>  |-- pip: struct (nullable = true)
>> 
>>  |    |-- _profile: struct (nullable = true)
>> 
>>  |    |    |-- topics: array (nullable = true)
>> 
>>  |    |    |    |-- element: string (containsNull = true)
>> 
>>  |    |-- rules: array (nullable = true)
>> 
>>  |    |    |-- element: struct (containsNull = true)
>> 
>>  |    |    |    |-- tag: string (nullable = true)
>> 
>> 
>> 
>> Is this a bug?
>> 
>> 
>> 
>> Andy
>> 
>> 
> 

