xqods9o5ekm3 created SPARK-30335:
------------------------------------

             Summary: Clarify behavior of FIRST and LAST without OVER clause.
                 Key: SPARK-30335
                 URL: https://issues.apache.org/jira/browse/SPARK-30335
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.4.0, 3.0.0
            Reporter: xqods9o5ekm3

Unlike many databases, Spark SQL allows the use of {{FIRST}} and {{LAST}} in non-analytic contexts, that is, without an {{OVER}} clause.

The current description of {{FIRST}}

> first(expr[, isIgnoreNull]) - Returns the first value of {{expr}} for a group
> of rows. If {{isIgnoreNull}} is true, returns only non-null values.

and of {{LAST}}

> last(expr[, isIgnoreNull]) - Returns the last value of {{expr}} for a group
> of rows. If {{isIgnoreNull}} is true, returns only non-null values.

suggest that their behavior is deterministic, and many users assume that they return a specific value, for example in a query such as

{code:sql}
SELECT first(foo)
FROM (
    SELECT * FROM table ORDER BY bar
)
{code}

That, however, doesn't seem to be the case. To make the situation worse, the query often appears to work (for example on small samples in local mode).
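
To illustrate why the result can depend on execution details, here is a toy model in plain Python. It is not Spark code and does not describe Spark's actual implementation. It assumes that each partition produces its own partial "first" value and that the final step keeps whichever partial result it sees first. The names {{first_per_partition}}, {{merged_first}} and {{arrival_order}} are hypothetical and exist only for this sketch.

```python
def first_per_partition(partition):
    """Partial aggregate: the first row seen in one partition, or None."""
    return partition[0] if partition else None


def merged_first(partitions, arrival_order):
    """Final aggregate: keep the first non-empty partial result that arrives.

    arrival_order models the order in which partial results reach the
    final step; that order is an assumption of this toy model.
    """
    for idx in arrival_order:
        partial = first_per_partition(partitions[idx])
        if partial is not None:
            return partial
    return None


# Rows are (foo, bar) pairs, sorted by bar and then split into
# contiguous chunks, one chunk per partition.
rows = sorted(
    [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5), ("f", 6)],
    key=lambda r: r[1],
)
partitions = [rows[0:2], rows[2:4], rows[4:6]]

# Partial results arrive in partition order: matches the user's expectation.
in_order = merged_first(partitions, [0, 1, 2])

# Partial results arrive in a different order: a different "first" row wins.
out_of_order = merged_first(partitions, [2, 0, 1])

# An order-aware choice does not depend on arrival order at all.
order_aware = min(rows, key=lambda r: r[1])

print(in_order, out_of_order, order_aware)
```

With a single partition, as may be the case for a small sample in local mode, there is only one possible arrival order in this model, which would explain why the query appears to work there.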