Yeah, the filter gets in front of the select after analysis:

scala> b.where($"bar" === 20).explain(true)
== Parsed Logical Plan ==
'Filter ('bar = 20)
+- AnalysisBarrier
   +- Project [foo#6]
      +- Project [_1#3 AS foo#6, _2#4 AS bar#7]
         +- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#3, assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2 AS _2#4]
            +- ExternalRDD [obj#2]

== Analyzed Logical Plan ==
foo: int
Project [foo#6]
+- Filter (bar#7 = 20)
   +- Project [foo#6, bar#7]
      +- Project [_1#3 AS foo#6, _2#4 AS bar#7]
         +- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#3, assertnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2 AS _2#4]
            +- ExternalRDD [obj#2]

== Optimized Logical Plan ==
Project [_1#3 AS foo#6]
+- Filter (_2#4 = 20)
   +- SerializeFromObject [assertnotnull(input[0, scala.Tuple2, true])._1 AS _1#3, assertnotnull(input[0, scala.Tuple2, true])._2 AS _2#4]
      +- ExternalRDD [obj#2]

== Physical Plan ==
*(1) Project [_1#3 AS foo#6]
+- *(1) Filter (_2#4 = 20)
   +- *(1) SerializeFromObject [assertnotnull(input[0, scala.Tuple2, true])._1 AS _1#3, assertnotnull(input[0, scala.Tuple2, true])._2 AS _2#4]
      +- Scan ExternalRDDScan[obj#2]

On Wed, Feb 13, 2019 at 8:04 PM Yeikel <em...@yeikel.com> wrote:

> This is indeed strange. To add to the question, I can see that if I use a
> filter I get an exception (as expected), so I am not sure what the
> difference is between the where clause and filter:
>
> b.filter(s => {
>   val bar : String = s.getAs("bar")
>   bar.equals("20")
> }).show
>
> java.lang.IllegalArgumentException: Field "bar" does not exist.
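
For anyone who wants to reproduce the difference, here is a minimal sketch. The setup is an assumption inferred from the plans above (an RDD of tuples converted with toDF("foo", "bar"), then select("foo")); the object name, the sample rows and the local master are made up for illustration. As far as I can tell, where(...) is just an alias for filter(Column), so the real contrast is between a Column predicate, which the analyzer resolves against the plan underneath the select, and a typed lambda, which only sees Rows with the current schema (foo alone).

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical reproduction; names and data are illustrative assumptions.
object WhereVsTypedFilter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("where-vs-typed-filter")
      .getOrCreate()
    import spark.implicits._

    // Assumed setup: an RDD of (Int, Int) tuples, which matches the
    // ExternalRDD / SerializeFromObject nodes in the plans.
    val df = spark.sparkContext
      .parallelize(Seq((1, 20), (2, 30)))
      .toDF("foo", "bar")
    val b = df.select("foo")

    // Column predicate: `bar` is not in b's schema, but the analyzer can
    // still resolve it from the child of the Project and drop it afterwards.
    b.where($"bar" === 20).explain(true)
    b.where($"bar" === 20).show()

    // Typed lambda: evaluated against Row objects that carry b's schema,
    // so the lookup of "bar" is expected to fail when the job runs.
    val hasBar20: Row => Boolean = r => r.getAs[Int]("bar") == 20
    try {
      b.filter(hasBar20).show()
    } catch {
      case e: Exception =>
        // The IllegalArgumentException may arrive wrapped in a SparkException.
        println(s"typed filter failed: ${e.getClass.getName}: ${e.getMessage}")
    }

    spark.stop()
  }
}
```

If the typed filter is really what is needed, keeping bar in the select and dropping it after the filter avoids relying on the analyzer to pull the column back in.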