Benny Lu created SPARK-29123:
--------------------------------

             Summary: DecimalType multiplication precision scale loss 
                 Key: SPARK-29123
                 URL: https://issues.apache.org/jira/browse/SPARK-29123
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.4.3
            Reporter: Benny Lu


When doing multiplication on DecimalType columns in PySpark, precision is lost.

For example, multiplying two columns of type decimal(38,10) returns a result of 
type decimal(38,6) instead of decimal(38,10), and the value itself is rounded 
(here to 265.695000, effectively three decimal places), which is an incorrect 
result.
{code:java}
from decimal import Decimal

from pyspark.sql import SparkSession
from pyspark.sql.types import DecimalType, StructField, StructType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("amount", DecimalType(38, 10)),
    StructField("fx", DecimalType(38, 10)),
])
df = spark.createDataFrame([(Decimal(233.00), Decimal(1.1403218880))], schema=schema)

df.printSchema()
df = df.withColumn("amount_usd", df.amount * df.fx)
df.printSchema()
df.show()
{code}
Result
{code:java}
>>> df.printSchema()
root
 |-- amount: decimal(38,10) (nullable = true)
 |-- fx: decimal(38,10) (nullable = true)

>>> df = df.withColumn("amount_usd", df.amount * df.fx)
>>> df.printSchema()
root
 |-- amount: decimal(38,10) (nullable = true)
 |-- fx: decimal(38,10) (nullable = true)
 |-- amount_usd: decimal(38,6) (nullable = true)

>>> df.show()
+--------------+------------+----------+
|        amount|          fx|amount_usd|
+--------------+------------+----------+
|233.0000000000|1.1403218880|265.695000|
+--------------+------------+----------+
{code}
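The resulting decimal(38,6) is consistent with the precision-adjustment rule Spark applies when the literal result type of a decimal multiplication (precision p1+p2+1, scale s1+s2) overflows the 38-digit maximum and {{spark.sql.decimal.operations.allowPrecisionLoss}} is true (the default). A minimal plain-Python sketch of that rule, based on my reading of Spark's DecimalPrecision / adjustPrecisionScale logic (the function name below is hypothetical):

{code:java}
MAX_PRECISION = 38
MINIMUM_ADJUSTED_SCALE = 6

def multiply_result_type(p1, s1, p2, s2):
    # Literal result type of decimal multiplication.
    precision = p1 + p2 + 1
    scale = s1 + s2
    if precision <= MAX_PRECISION:
        return precision, scale
    # Overflow: keep all integral digits, shrink the scale, but never
    # below MINIMUM_ADJUSTED_SCALE (or the natural scale, if smaller).
    int_digits = precision - scale
    min_scale = min(scale, MINIMUM_ADJUSTED_SCALE)
    adjusted_scale = max(MAX_PRECISION - int_digits, min_scale)
    return MAX_PRECISION, adjusted_scale

print(multiply_result_type(38, 10, 38, 10))  # (38, 6)
{code}
With both inputs at decimal(38,10), the literal result would be decimal(77,20), so the scale bottoms out at the minimum of 6, which is exactly the decimal(38,6) seen above. One possible workaround is to cast the operands to a smaller precision first, e.g. decimal(18,10), so that p1+p2+1 = 37 fits in 38 digits and the full scale of 20 survives.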
Rounded to two decimals, Spark's result gives 265.70, but the exact product is 
265.694999904, which rounds to 265.69.
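For comparison, the exact product computed with Python's decimal module (values constructed from strings to avoid float artifacts):

{code:java}
from decimal import ROUND_HALF_UP, Decimal

amount = Decimal("233.0000000000")
fx = Decimal("1.1403218880")

exact = amount * fx
print(exact)                                           # 265.69499990400000000000
print(exact.quantize(Decimal("0.01"), ROUND_HALF_UP))  # 265.69
{code}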

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
