[ https://issues.apache.org/jira/browse/SPARK-37387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446365#comment-17446365 ]
Zhen Wang commented on SPARK-37387:
-----------------------------------

[~yoda-mon] What is the use case behind this? RANDOM() doesn't seem to justify your usage.

> Allow nondeterministic expression in aggregate function
> -------------------------------------------------------
>
>                 Key: SPARK-37387
>                 URL: https://issues.apache.org/jira/browse/SPARK-37387
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Leona Yoda
>            Priority: Minor
>
> Nondeterministic expressions are not allowed in aggregate functions in Spark, so
> we cannot execute a query like
> {code:java}
> SELECT COUNT(RANDOM());
> {code}
> Spark raises the error {{nondeterministic expression ... should not appear in the
> arguments of an aggregate function.}}
> [related code
> section|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L298]
>
> In contrast, other databases such as PostgreSQL accept this query.
> {code:java}
> postgres=# SELECT COUNT(RANDOM());
>  count
> -------
>      1
> (1 row)
> {code}
>
> I tried removing the code section that raises the error, and found that Spark
> could then execute the query.
> {code:java}
> scala> spark.sql("SELECT COUNT(RANDOM())").show()
> +-------------+
> |count(rand())|
> +-------------+
> |            1|
> +-------------+
> {code}
>
> It could be useful for Spark users to be able to execute these kinds of
> queries. For example, to find target files they could simply call
> {code:java}
> spark.sql("SELECT COUNT(DISTINCT(INPUT_FILE_NAME())) FROM table WHERE ...")
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)