I'd like to understand why a column used in a filter (where) condition must also appear in the select clause.
For example, the following select statement works fine:

    df.select("field1", "filter_field").filter(df("filter_field") === "value").show()

However, the next one fails with the error "in operator !Filter (filter_field#60 = value);":

    df.select("field1").filter(df("filter_field") === "value").show()

As a workaround, it seems that I can do the following:

    df.select("field1", "filter_field").filter(df("filter_field") === "value").drop("filter_field").show()

Thanks,
Mike.