[
https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813733#comment-17813733
]
Nicholas Chammas commented on SPARK-42399:
------------------------------------------
This issue does indeed appear to be resolved on {{master}} when ANSI mode is
enabled:
{code:java}
>>> spark.sql(f"SELECT CONV('{'f' * 64}', 16, 10) AS result").show(truncate=False)
+--------------------+
|result              |
+--------------------+
|18446744073709551615|
+--------------------+
>>> spark.conf.set("spark.sql.ansi.enabled", "true")
>>> spark.sql(f"SELECT CONV('{'f' * 64}', 16, 10) AS result").show(truncate=False)
Traceback (most recent call last):
...
pyspark.errors.exceptions.captured.ArithmeticException: [ARITHMETIC_OVERFLOW]
Overflow in function conv(). If necessary set "spark.sql.ansi.enabled" to
"false" to bypass this error. SQLSTATE: 22003
== SQL (line 1, position 8) ==
SELECT CONV('ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff',
16, 10) AS result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
{code}
However, there is still a silent overflow when ANSI mode is disabled. The error
message offers setting "spark.sql.ansi.enabled" to "false" as a way to bypass
the error, which suggests this is intended behavior.
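To make the size of the discrepancy concrete, here is a small sketch in plain Python (no Spark session involved). It only compares the mathematically correct value of the input with the value shown in the non-ANSI output above. The variable names are illustrative, and nothing here is a claim about how CONV() is implemented internally.

```python
# Plain Python arithmetic; this does not call Spark.
hex_input = "f" * 64                 # same literal the queries above pass to CONV()
true_value = int(hex_input, 16)      # Python ints are arbitrary precision
returned = 18446744073709551615      # value shown in the non-ANSI output above

UINT64_MAX = 2**64 - 1               # largest unsigned 64-bit integer

print(true_value == 2**256 - 1)      # the input is a 256-bit value
print(returned == UINT64_MAX)        # the output is exactly the unsigned 64-bit maximum
print(true_value == returned)        # the two values do not match
```

The returned value equals the largest unsigned 64-bit integer, while the input needs 256 bits, so the non-ANSI result is not the converted number.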
cc [~gengliang] and [~gurwls223], who resolved SPARK-42427.
> CONV() silently overflows returning wrong results
> -------------------------------------------------
>
> Key: SPARK-42399
> URL: https://issues.apache.org/jira/browse/SPARK-42399
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Serge Rielau
> Priority: Critical
> Labels: correctness, pull-request-available
>
> spark-sql> SELECT
> CONV(SUBSTRING('0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff',
> 3), 16, 10);
> 18446744073709551615
> Time taken: 2.114 seconds, Fetched 1 row(s)
> spark-sql> set spark.sql.ansi.enabled = true;
> spark.sql.ansi.enabled true
> Time taken: 0.068 seconds, Fetched 1 row(s)
> spark-sql> SELECT
> CONV(SUBSTRING('0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff',
> 3), 16, 10);
> 18446744073709551615
> Time taken: 0.05 seconds, Fetched 1 row(s)
> In ANSI mode we should raise an error for sure.
> In non-ANSI mode, either an error or a NULL may be acceptable.
> Alternatively, of course, we could consider if we can support arbitrary
> domains since the result is a STRING again.
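As an illustration of the "arbitrary domains" alternative mentioned in the issue description, the following is a minimal sketch in plain Python of a base conversion with no 64-bit limit. The helper name `conv_unbounded` is hypothetical; it is not Spark code, it handles only bases 2 through 36, and it makes no attempt to reproduce any other CONV() semantics.

```python
import string

# Digit alphabet for bases 2..36: 0-9 followed by A-Z.
DIGITS = string.digits + string.ascii_uppercase


def conv_unbounded(num: str, from_base: int, to_base: int) -> str:
    """Hypothetical helper: convert between bases without a fixed-width limit."""
    if not (2 <= from_base <= 36 and 2 <= to_base <= 36):
        raise ValueError("bases must be in the range 2..36")
    value = int(num, from_base)      # raises ValueError on invalid digits
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = []
    while value:
        value, rem = divmod(value, to_base)
        out.append(DIGITS[rem])      # least significant digit first
    return sign + "".join(reversed(out))


# The input from the issue, converted without truncation to 64 bits.
print(conv_unbounded("f" * 64, 16, 10))
```

Because the result is built as a string from an arbitrary-precision integer, its length grows with the input instead of being capped by a machine word.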