Could you reproduce this problem in 1.5 or 1.6?
On Sun, Dec 6, 2015 at 12:29 AM, YaoPau wrote:
> If anyone runs into the same issue, I found a workaround:
>
> >>> df.where('state_code = "NY"')
>
> works for me.
>
> >>> df.where(df.state_code == "NY").collect()
>
> fails with the error from the first post.
I'm working with a Hadoop distribution that doesn't support 1.5 yet; we'll
probably be able to upgrade in about two months. For now I'm seeing the same
issue, with Spark not recognizing an existing column name in many
Hive-table-to-DataFrame situations:
Py4JJavaError: An error occurred while calling
When I run df.printSchema() I get:
root
|-- durable_key: string (nullable = true)
|-- code: string (nullable = true)
|-- desc: string (nullable = true)
|-- city: string (nullable = true)
|-- state_code: string (nullable = true)
|-- zip_code: string (nullable = true)
|-- county: string
If anyone runs into the same issue, I found a workaround:
>>> df.where('state_code = "NY"')
works for me.
>>> df.where(df.state_code == "NY").collect()
fails with the error from the first post.
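
When the value to filter on comes from a variable, the string form of the workaround has to be assembled by hand. Below is a minimal sketch of one way to do that. The helper name eq_predicate is hypothetical (it is not part of PySpark), and the backslash-escaping of embedded quotes is an assumption about the SQL parser, so it is worth checking against the Spark version in use.

```python
def eq_predicate(column, value):
    """Build a SQL filter string such as: state_code = "NY".

    Hypothetical helper, not part of PySpark. Backslash-escaping of
    embedded quotes is an assumption about the SQL parser in use.
    """
    # Escape backslashes first, then double quotes, so the value cannot
    # close the string literal early.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '{0} = "{1}"'.format(column, escaped)


print(eq_predicate("state_code", "NY"))

# With a DataFrame in hand, the workaround would then read:
# df.where(eq_predicate("state_code", "NY")).collect()
```

This only covers string comparisons with "="; numeric values or other operators would need their own formatting.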