[ 
https://issues.apache.org/jira/browse/FLINK-26764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562047#comment-17562047
 ] 

Jark Wu edited comment on FLINK-26764 at 7/5/22 7:57 AM:
---------------------------------------------------------

I checked some resources[1][2][3], and it seems the default behavior of 
first_value is "respect nulls"[3]: 

> The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead, 
> lag, first_value, last_value, and nth_value. This is not implemented in 
> PostgreSQL: the behavior is always the same as the standard's default, namely 
> RESPECT NULLS. 

I think we should follow the SQL standard while keeping compatibility. 
Therefore, we should change the default behavior to respect nulls and provide 
an option for users to switch back to the previous behavior.
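
To make the difference concrete, here is a minimal Python sketch of the two 
null treatments over an already ordered window frame. It is not Flink code: 
the helper names first_value and last_value and the ignore_nulls flag are 
illustrative assumptions, and None stands in for SQL NULL.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def first_value(rows: Sequence[Optional[T]], ignore_nulls: bool = False) -> Optional[T]:
    """Return the first value of an ordered window frame.

    ignore_nulls=False models RESPECT NULLS: the first row wins even if it is None.
    ignore_nulls=True models IGNORE NULLS: leading None rows are skipped.
    """
    for value in rows:
        if value is not None or not ignore_nulls:
            return value
    # Empty frame, or every row was None under IGNORE NULLS.
    return None


def last_value(rows: Sequence[Optional[T]], ignore_nulls: bool = False) -> Optional[T]:
    """Same rule as first_value, applied from the end of the frame."""
    return first_value(list(reversed(rows)), ignore_nulls)


# Hypothetical frame: NULLs at both ends.
frame = [None, "a", "b", None]
respecting = (first_value(frame), last_value(frame))
ignoring = (first_value(frame, ignore_nulls=True), last_value(frame, ignore_nulls=True))
```

With this frame, the RESPECT NULLS pair is (None, None) while the IGNORE NULLS 
pair is ("a", "b"), which is the behavior change users would see if the 
default is switched.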

Besides, [~luoyuxia] could you help to check the null behavior of LEAD and LAG 
in Flink SQL? We should also fix them if they ignore nulls. 

Regarding the config option name, I would suggest 
{{table.exec.navigation-functions.null-treatment=respect_nulls/ignore_nulls}} 
or 
{{table.exec.first-last-value.null-treatment=respect_nulls/ignore_nulls}} if 
only first_value and last_value need to be fixed.
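
As a rough sketch of how such an option could be resolved, assuming the first 
proposed key and a default of respect_nulls as suggested above (the 
NullTreatment enum and resolve_null_treatment helper are hypothetical, not 
existing Flink API):

```python
from enum import Enum
from typing import Mapping


class NullTreatment(Enum):
    RESPECT_NULLS = "respect_nulls"
    IGNORE_NULLS = "ignore_nulls"


# Option key as proposed in this comment; the option does not exist in Flink yet.
OPTION_KEY = "table.exec.navigation-functions.null-treatment"


def resolve_null_treatment(conf: Mapping[str, str]) -> NullTreatment:
    """Read the proposed option, defaulting to the SQL standard's RESPECT NULLS."""
    raw = conf.get(OPTION_KEY, NullTreatment.RESPECT_NULLS.value)
    normalized = raw.strip().lower()
    try:
        return NullTreatment(normalized)
    except ValueError:
        allowed = ", ".join(t.value for t in NullTreatment)
        raise ValueError(
            f"Invalid value {raw!r} for {OPTION_KEY}; expected one of: {allowed}"
        ) from None


# Users who rely on the previous behavior would set the option explicitly.
legacy = resolve_null_treatment({OPTION_KEY: "ignore_nulls"})
default = resolve_null_treatment({})
```

Making the standard behavior the default and the legacy behavior opt-in keeps 
existing jobs fixable with a single config change.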


[1]: https://modern-sql.com/caniuse/T617
[2]: https://modern-sql.com/caniuse/first_value
[3]: https://www.postgresql.org/docs/current/functions-window.html



was (Author: jark):
I checked some resources[1][2][3], and it seems the default behavior of 
first_value is "respect nulls"[3]: 

> The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead, 
> lag, first_value, last_value, and nth_value. This is not implemented in 
> PostgreSQL: the behavior is always the same as the standard's default, namely 
> RESPECT NULLS. 

I think we should follow the SQL standard while keeping compatibility. 
Therefore, I agree with [~godfreyhe] on adding a config option to respect 
nulls (with ignore nulls as the default).

Besides, [~luoyuxia] could you help to check the null behavior of LEAD and LAG 
in Flink SQL? We should also fix them if they ignore nulls. 

Regarding the config option name, I would suggest 
{{table.exec.navigation-functions.null-treatment=respect_nulls/ignore_nulls}} 
or 
{{table.exec.first-last-value.null-treatment=respect_nulls/ignore_nulls}} if 
only first_value and last_value need to be fixed.


[1]: https://modern-sql.com/caniuse/T617
[2]: https://modern-sql.com/caniuse/first_value
[3]: https://www.postgresql.org/docs/current/functions-window.html


> Support RESPECT NULLS for FIRST_VALUE/LAST_VALUE
> ------------------------------------------------
>
>                 Key: FLINK-26764
>                 URL: https://issues.apache.org/jira/browse/FLINK-26764
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API, Table SQL / Planner
>            Reporter: luoyuxia
>            Assignee: luoyuxia
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.16.0
>
>
> Flink supports the functions FIRST_VALUE/LAST_VALUE, but they always ignore 
> null values.
> However, 
> [Spark|https://spark.apache.org/docs/2.4.2/api/sql/index.html#first_value], 
> [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+windowingandanalytics],
> [Oracle|https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions057.htm],
> [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/first_value.html],
> etc. also support respecting nulls for FIRST_VALUE/LAST_VALUE.
> Should we also allow users to specify whether to ignore nulls?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
