viirya commented on issue #7121:
URL: https://github.com/apache/arrow-datafusion/issues/7121#issuecomment-1654866409

   I'm not sure if this is a bug.
   
   Actually, even if you do something like:
   
   ```
   ❯ select 6.4053151420411946063694043751862251568;
   +----------------------------+
   | Float64(6.405315142041195) |
   +----------------------------+
   | 6.405315142041195          |
   +----------------------------+
   1 row in set. Query took 0.005 seconds.
   ```
   
   You can see that you still cannot get the full precision of the input floating-point literal.
   
   It is more of a precision limit than an incorrect result.
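   For context, an `f64` only carries roughly 15-17 significant decimal digits, so the rounding happens when the literal is parsed, before any query runs. A minimal Rust sketch (independent of DataFusion's actual literal parsing) that shows the same rounding:
   
   ```
   fn main() {
       // An f64 keeps roughly 15-17 significant decimal digits, so the long
       // literal is rounded to the nearest representable double at parse time.
       let x: f64 = "6.4053151420411946063694043751862251568".parse().unwrap();
       // Prints 6.405315142041195 -- the same value DataFusion displays above.
       println!("{x}");
   }
   ```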
   
   Further, if you look at Spark, `6.4053151420411946063694043751862251568` is treated as a decimal instead of a float64:
   
   ```
   scala> sql("select 6.4053151420411946063694043751862251568").printSchema
   root
    |-- 6.4053151420411946063694043751862251568: decimal(38,37) (nullable = false)
   ```
   
   So in Spark, you are actually just casting decimal to decimal?
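   For comparison, a `decimal(38,37)` keeps all 38 digits of the literal as an unscaled integer with scale 37, and a 38-digit unscaled value always fits in a 128-bit integer (which is how a 128-bit decimal type such as Arrow's Decimal128 can hold it exactly). A small Rust illustration of that, not either engine's code:
   
   ```
   fn main() {
       // decimal(38, 37): all 38 digits of the literal are stored as an
       // unscaled integer, with 37 digits after the point. i128 (max ~1.7e38)
       // holds the unscaled value exactly, so nothing is rounded.
       let unscaled: i128 = 64053151420411946063694043751862251568;
       let digits = unscaled.to_string();
       // Re-insert the decimal point at scale 37 to recover the exact input.
       let (int_part, frac_part) = digits.split_at(digits.len() - 37);
       println!("{int_part}.{frac_part}"); // 6.4053151420411946063694043751862251568
   }
   ```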
   
   But in DataFusion, it is float64:
   
   ```
   ❯ explain select 6.4053151420411946063694043751862251568;
   +---------------+--------------------------------------------------------------------------+
   | plan_type     | plan                                                                     |
   +---------------+--------------------------------------------------------------------------+
   | logical_plan  | Projection: Float64(6.405315142041195)                                   |
   |               |   EmptyRelation                                                          |
   | physical_plan | ProjectionExec: expr=[6.405315142041195 as Float64(6.405315142041195)]  |
   |               |   EmptyExec: produce_one_row=true                                        |
   |               |                                                                          |
   +---------------+--------------------------------------------------------------------------+
   2 rows in set. Query took 0.006 seconds.
   
   ```
   
   This explains why the cast result is exactly the same as the input: there will be precision loss when casting a floating-point value to decimal.
   
   Then, what if we force Spark to treat the input value as a float64 (note the `d` suffix)?
   
   ```
   scala> sql("select cast(6.4053151420411946063694043751862251568d as decimal(38,37))").collect()
   res13: Array[org.apache.spark.sql.Row] = Array([6.4053151420411950000000000000000000000])
   ```
   
   You can see that now Spark also returns a decimal with precision loss.
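   To connect the two results: once the literal has been parsed as an `f64`, only the ~16 surviving digits can be carried into the decimal, so rescaling to 37 fractional digits merely pads with zeros. A rough Rust sketch of that effect (it mimics the observed output, not Spark's actual cast implementation):
   
   ```
   fn main() {
       // Parse the literal as f64 first -- this is where the precision is lost.
       let x: f64 = "6.4053151420411946063694043751862251568".parse().unwrap();
       // Shortest round-trip representation of the rounded double.
       let shortest = format!("{x}"); // "6.405315142041195"
       let (int_part, frac_part) = shortest.split_once('.').unwrap();
       // Pad the fractional part to scale 37, as a decimal(38,37) would show it.
       println!("{int_part}.{frac_part:0<37}"); // 6.4053151420411950000000000000000000000
   }
   ```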

