Leona Yoda created SPARK-37387:
----------------------------------

             Summary: Allow nondeterministic expression in aggregate function
                 Key: SPARK-37387
                 URL: https://issues.apache.org/jira/browse/SPARK-37387
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Leona Yoda

Nondeterministic expressions are not allowed in the arguments of an aggregate function in Spark, so we cannot execute a query like

{code:java}
SELECT COUNT(RANDOM());
{code}

It fails with the error message {{nondeterministic expression ... should not appear in the arguments of an aggregate function.}}

[related code section|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L298]

In contrast, other databases such as PostgreSQL accept this SQL.

{code:java}
postgres=# SELECT COUNT(RANDOM());
 count
-------
     1
(1 row)
{code}

I tried removing the section that raises the error, and found that Spark could then execute the query.

{code:java}
scala> spark.sql("SELECT COUNT(RANDOM())").show()
+-------------+
|count(rand())|
+-------------+
|            1|
+-------------+
{code}

It could be useful for Spark users to be able to execute this kind of query, because they could then simply call

{code:java}
spark.sql("SELECT COUNT(DISTINCT(INPUT_FILE_NAME())) FROM table WHERE ...")
{code}

to find target files, for example.
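
For comparison, a possible workaround under the current check is to evaluate the nondeterministic expression in a projection first and aggregate over the resulting column. This is only a minimal sketch, assuming a local SparkSession; the alias {{r}}, the app name, and the use of {{spark.range(5)}} as input are illustrative and not part of the original report.

{code:java}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, rand}

// Local session for illustration; in spark-shell, getOrCreate() returns the existing session.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("spark-37387-workaround") // hypothetical app name
  .getOrCreate()

// SQL form: RANDOM() is evaluated in the inner projection (alias `r` is illustrative),
// so the argument of COUNT is a plain column reference.
spark.sql("SELECT COUNT(r) FROM (SELECT RANDOM() AS r)").show()

// DataFrame form of the same idea: project first, then aggregate.
spark.range(5)
  .select(rand().as("r"))
  .agg(count("r"))
  .show()
{code}

The {{INPUT_FILE_NAME()}} query could presumably be rewritten the same way, by selecting {{INPUT_FILE_NAME()}} under an alias in a subquery and applying {{COUNT(DISTINCT ...)}} to that alias in the outer query. Allowing the expression directly in the aggregate, as proposed here, would make this extra projection unnecessary.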