[
https://issues.apache.org/jira/browse/SPARK-35955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karen Feng updated SPARK-35955:
---
Description:
Fix decimal overflow issues for decimal average in ANSI mode. Linked to
SPARK-32018 and SPARK-28067, which address decimal sum.
Repro:
{code:java}
import org.apache.spark.sql.functions._
spark.conf.set("spark.sql.ansi.enabled", true)
val df = Seq(
(BigDecimal("1000"), 1),
(BigDecimal("1000"), 1),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2),
(BigDecimal("1000"), 2)).toDF("decNum", "intNum")
val df2 = df.withColumnRenamed("decNum", "decNum2").join(df, "intNum").agg(mean("decNum"))
df2.show(40, false)
{code}
Expected: an exception is thrown, because the intermediate sum overflows. Actual: the query returns null:
{code:java}
+-----------+
|avg(decNum)|
+-----------+
|null       |
+-----------+
{code}
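The overflow condition can be checked without Spark. This is a minimal sketch, assuming that decNum is inferred as decimal(38,18) (Spark's default mapping for Scala BigDecimal) and that the intermediate sum is held at that same type. The helpers fitsDecimal and joinedRowCount are hypothetical names, not Spark APIs, and the per-row value in main is illustrative rather than taken from the repro.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalOverflowSketch {
  // Hypothetical helper (not a Spark API): true if `value` can be stored as
  // decimal(precision, scale) after rounding to `scale` fractional digits.
  def fitsDecimal(value: JBigDecimal, precision: Int = 38, scale: Int = 18): Boolean =
    value.setScale(scale, RoundingMode.HALF_UP).precision <= precision

  // Hypothetical helper: an inner self-join on a key yields n * n rows
  // for every key that occurs n times.
  def joinedRowCount(keys: Seq[Int]): Long =
    keys.groupBy(identity).values.map(g => g.size.toLong * g.size).sum

  def main(args: Array[String]): Unit = {
    // Same key distribution as the repro: two rows with key 1, ten with key 2,
    // so the join produces 2*2 + 10*10 rows.
    val keys = Seq.fill(2)(1) ++ Seq.fill(10)(2)
    val rows = joinedRowCount(keys)
    println(s"joined rows: $rows")

    // decimal(38,18) leaves 20 integer digits, so the sum must stay below 10^20.
    val perRow = new JBigDecimal("1e19") // illustrative value (assumption)
    val sum = perRow.multiply(JBigDecimal.valueOf(rows))
    println(s"per-row value fits: ${fitsDecimal(perRow)}")
    println(s"sum fits: ${fitsDecimal(sum)}")
  }
}
```

Under these assumptions, every input row can fit the column type while the sum over the joined rows does not. That is the case in which ANSI mode is expected to raise an error rather than return null.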
was: Return null on overflow for decimal average. Linked to SPARK-32018 and
SPARK-28067, which address decimal sum.
> Fix decimal overflow issues for Average
> ---
>
> Key: SPARK-35955
> URL: https://issues.apache.org/jira/browse/SPARK-35955
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Karen Feng
> Priority: Major