[ https://issues.apache.org/jira/browse/SPARK-40616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xsys updated SPARK-40616:
-------------------------
    Description: 
h3. Describe the bug

We are trying to save {{DECIMAL}} values with high precision in a table using 
the Spark SQL shell. When we {{INSERT}} decimal values with more significant 
digits than a double can represent, precision is lost: 
{{8.8888888888888888888e9}} is stored as {{8888888888.8888900000}} instead of 
{{8888888888.8888888888}}.

This appears to be caused by type inference during parsing: the exponent 
literal is typed as a {{DOUBLE}}, which holds only about 16 significant 
decimal digits, so the value is rounded before it is cast to 
{{DECIMAL(20,10)}}.
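
A quick way to confirm the inference is the built-in {{typeof}} function 
(available since Spark 3.0); a minimal check, run in the same shell:
{code:sql}
-- The exponent literal is typed as DOUBLE, so digits beyond double
-- precision are already lost before any cast to DECIMAL happens.
SELECT typeof(8.8888888888888888888e9);   -- double

-- A plain decimal literal keeps the exact DECIMAL type.
SELECT typeof(8888888888.8888888888);     -- decimal(20,10)
{code}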
h3. To reproduce

On Spark 3.2.1 (commit {{4f25b3f712}}), using {{spark-sql}}:
{code:bash}
$SPARK_HOME/bin/spark-sql{code}
In the shell:
{code:sql}
CREATE TABLE t(c0 DECIMAL(20,10));
INSERT INTO t VALUES (8.8888888888888888888e9);
SELECT * FROM t;{code}
Executing the above produces the following output:
{code}
spark-sql> CREATE TABLE t(c0 DECIMAL(20,10));
22/09/29 11:28:41 WARN ResolveSessionCatalog: A Hive serde table will be 
created as there is no table provider specified. You can set 
spark.sql.legacy.createHiveTableByDefault to false so that native data source 
table will be created instead.
Time taken: 0.118 seconds
spark-sql> INSERT INTO t VALUES (8.8888888888888888888e9);
Time taken: 0.392 seconds
spark-sql> SELECT * FROM t;
8888888888.8888900000
Time taken: 0.197 seconds, Fetched 1 row(s){code}
h3. Expected behavior

We expect the inserted value to retain the precision specified by the 
{{DECIMAL(20,10)}} type parameters. For the example above, 
{{SELECT * FROM t}} should return {{8888888888.8888888888}}.
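h3. Workaround

Until this is fixed, avoiding the exponent form seems to preserve all digits; 
the sketch below assumes the table {{t}} from the reproduction steps above 
and is not verified output. There is also a legacy flag, 
{{spark.sql.legacy.exponentLiteralAsDecimal.enabled}}, that (if present in 
your build) makes exponent literals parse as {{DECIMAL}} instead of 
{{DOUBLE}}.
{code:sql}
-- Plain decimal literal: parsed directly as DECIMAL, no DOUBLE rounding.
INSERT INTO t VALUES (8888888888.8888888888);

-- String cast: the exponent form is parsed exactly (as a java.math.BigDecimal).
INSERT INTO t VALUES (CAST('8.8888888888888888888e9' AS DECIMAL(20,10)));

SELECT * FROM t;  -- expected: 8888888888.8888888888 for both new rows
{code}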

> Loss of precision using SparkSQL shell on high-precision DECIMAL types
> ----------------------------------------------------------------------
>
>                 Key: SPARK-40616
>                 URL: https://issues.apache.org/jira/browse/SPARK-40616
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: xsys
>            Priority: Major


