GitHub user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22518#discussion_r232580202

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
    @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
           assert(getNumSortsInQuery(query5) == 1)
         }
       }
    +
    +  test("SPARK-25482: Reuse same Subquery in order to execute it only once") {
    +    withTempView("t1", "t2") {
    +      sql("create temporary view t1(a int) using parquet")
    +      sql("create temporary view t2(b int) using parquet")
    +      val plan = sql("select * from t2 where b > (select max(a) from t1)")
    --- End diff --

    Sure, could you please check the PR description? I think the context is explained quite well there. As a quick summary: in this case `b > (select max(a) from t1)` is pushed down as a datasource filter, so we end up with 2 instances of `b > (select max(a) from t1)` and the subquery result is not reused between them. It is not reused because the copied plan satisfies `==`, so even when `ReuseSubquery` replaces it, the change is ignored.
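
    To make the last point concrete, here is a minimal sketch of the effect, assuming a transform that keeps the old node whenever the rule's result compares equal to it. `Subquery`, `Filter` and `transformSubquery` are hypothetical stand-ins for illustration only, not Spark's classes or the actual `ReuseSubquery` code.

```scala
// Hypothetical stand-in for a subquery plan: equality is structural,
// so two separate copies of the same query compare equal with ==.
final class Subquery(val sql: String) {
  override def equals(other: Any): Boolean = other match {
    case s: Subquery => s.sql == sql
    case _           => false
  }
  override def hashCode: Int = sql.hashCode
}

// Hypothetical stand-in for a filter that references a subquery.
case class Filter(subquery: Subquery) {
  // Assumption: like an equality-checked tree transform, the original
  // node is kept when the rule's result is == to the current child.
  def transformSubquery(rule: Subquery => Subquery): Filter = {
    val replaced = rule(subquery)
    if (replaced == subquery) this else copy(subquery = replaced)
  }
}

object ReuseDemo {
  def main(args: Array[String]): Unit = {
    val original       = new Subquery("select max(a) from t1")
    val pushedDownCopy = new Subquery("select max(a) from t1")

    // A reuse rule tries to point the pushed-down filter at `original`.
    val rewritten = Filter(pushedDownCopy).transformSubquery(_ => original)

    // The replacement is == to the copy, so the change is dropped and
    // the filter still holds the separate instance.
    println(rewritten.subquery eq original)       // false
    println(rewritten.subquery eq pushedDownCopy) // true
  }
}
```

    In this model the two instances stay distinct after the rewrite, so each one would be executed on its own, which is the behaviour the new test is meant to guard against.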