xqods9o5ekm3 created SPARK-30335:
------------------------------------

             Summary: Clarify behavior of FIRST and LAST without OVER clause.
                 Key: SPARK-30335
                 URL: https://issues.apache.org/jira/browse/SPARK-30335
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.4.0, 3.0.0
            Reporter: xqods9o5ekm3


Unlike many databases, Spark SQL allows usage of {{FIRST}} and {{LAST}} in 
non-analytic contexts.

At the moment, the descriptions of {{FIRST}}

> first(expr[, isIgnoreNull]) - Returns the first value of {{expr}} for a group 
> of rows. If {{isIgnoreNull}} is true, returns only non-null values.

and {{LAST}}

> last(expr[, isIgnoreNull]) - Returns the last value of {{expr}} for a group 
> of rows. If {{isIgnoreNull}} is true, returns only non-null values.

suggest that their behavior is deterministic, and many users assume that they 
return a specific value, for example in a query such as

{code:sql}
SELECT first(foo)
FROM (
    SELECT * FROM table ORDER BY bar
)
{code}

That, however, doesn't seem to be the case.

To make the situation worse, the query often appears to work as expected (for 
example, on small samples in local mode).
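
For comparison, here is a minimal sketch of queries that state the intended ordering explicitly rather than relying on the row order of a subquery. It reuses the placeholder names {{table}}, {{foo}} and {{bar}} from the example above and assumes that {{bar}} is non-null and unique; with ties or nulls in {{bar}}, the row returned may still vary. This is an illustration only, not a proposed resolution of this issue.

{code:sql}
-- Value of foo from the row with the smallest bar (the intended "first").
-- Assumption: bar is non-null and unique.
SELECT foo
FROM table
ORDER BY bar
LIMIT 1;

-- Value of foo from the row with the largest bar (the intended "last").
SELECT foo
FROM table
ORDER BY bar DESC
LIMIT 1;
{code}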





--
This message was sent by Atlassian Jira
(v8.3.4#803005)
