sigmod commented on a change in pull request #34470: URL: https://github.com/apache/spark/pull/34470#discussion_r742194450
##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
##########
@@ -1931,18 +1931,29 @@ class SubquerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
         sql(
           """
             |SELECT c1, s, s * 10 FROM (
-            |  SELECT c1, (SELECT FIRST(c2) FROM t2 WHERE t1.c1 = t2.c1) s FROM t1)
+            |  SELECT c1, (SELECT MIN(c2) FROM t2 WHERE t1.c1 = t2.c1) s FROM t1)

Review comment:
Can we keep the test query unchanged and assert the error instead?

> Just a side note - I have been arguing, that first/last should be deterministic functions

+1, even though FIRST/LAST are not truly deterministic during execution. The purpose of this field is to determine the eligibility of query rewrites. Postgres has a nice categorization of those:
https://www.postgresql.org/docs/8.3/xfunc-volatility.html

SUM and AVG are not completely deterministic either (when run in a distributed fashion), but we can still do query optimizations over them, and I think it would be fine for FIRST/LAST to belong there too.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: reviews-h...@spark.apache.org