[ https://issues.apache.org/jira/browse/SPARK-39169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalii Li updated SPARK-39169:
-------------------------------
    Description:
When `FIRST` is a single aggregate function in `Aggregate` we could either rewrite whole query or optimize execution logic.
* Plan => `SELECT FIRST(<col>) FROM <table>` => `SELECT <col> FROM <table> LIMIT 1`. Note that setting `ignoreNulls` to `true` should block such rewrite since returns could differ in case all values of <col> are `NULL`
* Execution => `SELECT FIRST(<col>) FROM <table> GROUP BY <some_col>` => short circuit iteration per key once a value for `FIRST` is set.

  was:
When `FIRST` is a single aggregate function in `Aggregate` we could either rewrite whole query or optimize execution logic.
* Plan => `SELECT FIRST(<col>) FROM <table> [GROUP BY <col>]` => `SELECT <col> FROM <table> LIMIT 1`. Note that setting `ignoreNulls` to `true` should block such rewrite since returns could differ in case all values of <col> are `NULL`
* Execution => `SELECT FIRST(<col>) FROM <table> GROUP BY <col2>` => short circuit iteration per key once a value for `FIRST` is set.

> Optimize FIRST when used as non-aggregate
> -----------------------------------------
>
>                 Key: SPARK-39169
>                 URL: https://issues.apache.org/jira/browse/SPARK-39169
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Vitalii Li
>            Priority: Major
>
> When `FIRST` is a single aggregate function in `Aggregate` we could either
> rewrite whole query or optimize execution logic.
> * Plan => `SELECT FIRST(<col>) FROM <table>` => `SELECT <col> FROM <table>
> LIMIT 1`. Note that setting `ignoreNulls` to `true` should block such rewrite
> since returns could differ in case all values of <col> are `NULL`
> * Execution => `SELECT FIRST(<col>) FROM <table> GROUP BY <some_col>` =>
> short circuit iteration per key once a value for `FIRST` is set.
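
The two ideas in the description can be illustrated with a small stand-alone model. This is only a sketch in plain Python, not Spark code: `first`, `limit_1` and `first_per_key` are hypothetical helpers invented for this illustration, `None` stands in for `NULL`, and rows are assumed to arrive in list order. In this toy model the `LIMIT 1` rewrite agrees with `FIRST` when `ignore_nulls` is false, and disagrees when `ignore_nulls` is true and a `None` precedes a non-`None` value.

```python
from typing import Any, Dict, Hashable, Iterable, List, Tuple


def first(values: Iterable[Any], ignore_nulls: bool = False) -> Any:
    """Toy model of FIRST(<col>): the first value seen, or the first
    non-None value when ignore_nulls is True (None if there is none)."""
    for value in values:
        if value is not None or not ignore_nulls:
            return value
    return None


def limit_1(values: Iterable[Any]) -> List[Any]:
    """Toy model of SELECT <col> FROM <table> LIMIT 1: at most one row."""
    for value in values:
        return [value]
    return []


def first_per_key(
    rows: Iterable[Tuple[Hashable, Any]], ignore_nulls: bool = False
) -> Tuple[Dict[Hashable, Any], int]:
    """Toy model of FIRST(<col>) ... GROUP BY <some_col> with short-circuiting.

    Returns the per-key result and the number of values actually inspected;
    rows for a key whose value is already set are skipped without inspection.
    """
    result: Dict[Hashable, Any] = {}
    done = set()          # keys whose FIRST value is already set
    inspected = 0
    for key, value in rows:
        if key in done:
            continue      # short circuit: nothing left to do for this key
        inspected += 1
        if value is not None or not ignore_nulls:
            result[key] = value
            done.add(key)
        else:
            result.setdefault(key, None)  # key exists, value still pending
    return result, inspected


if __name__ == "__main__":
    column = [None, 7, 8]
    print(first(column), limit_1(column))                     # same value
    print(first(column, ignore_nulls=True), limit_1(column))  # values differ

    rows = [("a", None), ("a", 1), ("b", 2), ("a", 3), ("b", 4)]
    print(first_per_key(rows))
    print(first_per_key(rows, ignore_nulls=True))
```

The `inspected` counter is there only to make the saving visible: with five input rows, the model touches two values when nulls are kept and three when they are ignored. How an actual Spark operator would track finished keys is not specified in this issue.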