[ https://issues.apache.org/jira/browse/SPARK-29638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dylan Guedes updated SPARK-29638:
---------------------------------
    Description:
Currently, Spark handles 'NaN' as 0 in window functions, such that 3+'NaN'=3. PgSQL, on the other hand, handles the entire result as 'NaN', as in 3+'NaN' = 'NaN'.

I experienced this with the query below:
{code:sql}
SELECT a, b,
       SUM(b) OVER(ORDER BY A ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM (VALUES(1,1),(2,2),(3,(cast('nan' as int))),(4,3),(5,4)) t(a,b);
{code}

  was:
Currently, Spark handles 'NaN' as 0 in window functions, such that 3+'NaN'=3. PgSQL, on the other hand, handles the entire result as 'NaN', as in 3+'NaN' = 'NaN'.

> Spark handles 'NaN' as 0 in sums
> --------------------------------
>
>                 Key: SPARK-29638
>                 URL: https://issues.apache.org/jira/browse/SPARK-29638
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Dylan Guedes
>            Priority: Major
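
The two behaviours described in the report can be sketched outside of either engine. The snippet below is a minimal illustration in plain Python, not Spark or PostgreSQL code: it computes a sum over a `ROWS BETWEEN 1 PRECEDING AND CURRENT ROW` frame, once letting NaN propagate (the result the report attributes to PgSQL) and once skipping the NaN row (which yields the "3+'NaN'=3" result the report attributes to Spark). The function name `sliding_sum` and the `skip_nan` flag are hypothetical, and the column `b` is modelled as floats so that a NaN value can be represented at all.

```python
import math


def sliding_sum(values, preceding=1, skip_nan=False):
    """Sum each row together with up to `preceding` rows before it.

    Models a frame of ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW.
    With skip_nan=True, NaN entries are dropped from the frame before
    summing, so they contribute nothing to the total.
    """
    results = []
    for i in range(len(values)):
        # Frame covers the current row and up to `preceding` earlier rows.
        frame = values[max(0, i - preceding): i + 1]
        if skip_nan:
            frame = [v for v in frame if not math.isnan(v)]
        results.append(sum(frame))
    return results


# Column b from the query in the report, with the third row as NaN.
b = [1.0, 2.0, float("nan"), 3.0, 4.0]

propagating = sliding_sum(b)               # NaN poisons every frame it is in
skipping = sliding_sum(b, skip_nan=True)   # NaN row adds nothing to the sum

print(propagating)
print(skipping)
```

With NaN propagation, the frames for rows 3 and 4 both contain the NaN row and so both sums are NaN. With the NaN row skipped, the same two frames sum to 2 and 3 respectively.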