Each DataFrame operation works only on the output of the operation before
it and knows nothing about the original DataFrame.  select("field1")
returns a new DataFrame containing only field1, so by the time the filter
runs, filter_field no longer exists and the filter has nothing to operate on.
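
One way around this is to filter while the column is still present and
only then narrow the projection.  The following is a minimal sketch, not
taken from the original thread: the object name FilterThenSelect, the
helper field1WhereMatches, and the sample rows are all hypothetical, and it
assumes Spark 2.0 or later, where SparkSession is the entry point (on the
1.x releases current when this thread was written, a SQLContext would play
that role).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object FilterThenSelect {
  // Hypothetical helper: apply the filter while filter_field is still a
  // column of the DataFrame, then project down to field1 only.
  def field1WhereMatches(df: DataFrame, value: String): DataFrame =
    df.filter(df("filter_field") === value).select("field1")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumes Spark 2.0+).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("filter-then-select")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data using the column names from the question.
    val df = Seq(("a", "value"), ("b", "other")).toDF("field1", "filter_field")

    // The result has a single column, field1, restricted to matching rows.
    field1WhereMatches(df, "value").show()

    spark.stop()
  }
}
```

Ordering the calls this way gives the same result as the select / filter /
drop work-around quoted below, without carrying filter_field through the
projection first.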

On Fri, Jul 17, 2015 at 11:55 AM, Mike Trienis <mike.trie...@orcsol.com>
wrote:

> I'd like to understand why the where field must exist in the select
> clause.
>
> For example, the following select statement works fine
>
>    - df.select("field1", "filter_field").filter(df("filter_field") === "value").show()
>
> However, the next one fails with the error "in operator !Filter
> (filter_field#60 = value);"
>
>    - df.select("field1").filter(df("filter_field") === "value").show()
>
> As a work-around, it seems that I can do the following
>
>    - df.select("field1", "filter_field").filter(df("filter_field") === "value").drop("filter_field").show()
>
>
> Thanks, Mike.
>
