[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-507143300 @rdblue I have checked the I[nformation technology — Database languages — SQL —Part 2 Foundation (SQL/Foundation)](https://www.iso.org/standard/53682.html) of the year 2011 In section 15.10: Effect of inserting tables into base tables ![image](https://user-images.githubusercontent.com/1097932/60416770-33dcdb80-9c11-11e9-9cc8-07b739fc2ef9.png) So we can refer the rules In section 9.2: Store assignment ![image](https://user-images.githubusercontent.com/1097932/60416836-5d960280-9c11-11e9-81be-29bb26897eba.png) From ISO SQL standard, "If a value of the declared type of T can be obtained from V by rounding or truncation, then the value of T is set to that value". Checking whether data types strictly upcast-able in analyzer doesn't match the standard. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-507140862 > - Add a decimal type for SQL literals that can be cast to float because the intended type of the literal is not known, or use some analysis rule that matches literals for the same purpose > - Parse literals as floats and insert an implicit cast from float to decimal Sorry I meant there is not a conclusion in the sync. I came up with the proposals in the sync, but I don't think they are good enough. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-506578483 > This is not a safe cast. I think it is explainable. There is no precision loss in the casting itself > In the DSv2 sync we discussed options that would work. Why not go with those? As I remember, we didn't have a solution in DSv2 sync and we decide to come up with solutions offline. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-506327498 > regarding Upcast, seems both float/double -> int/long and decimal -> float/double should be forbidden. float/double can be Infinity, I don't think allowing float/double -> int/long is a good idea. > How about we update DataType.canWrite I have tried creating a patch to allow such conversion for LocalRelation: https://github.com/apache/spark/compare/master...gengliangwang:allowUpcastDecimalLocalRelation?expand=1 The code is actually quite ugly. If `canWrite` is true, then the conversion is created as `Upcast`. In such case, we can't simply update `DataType.canWrite`. We will have to mark some of the unsafe conversions as `Cast`. I still suggest that we should simply allow upcasting tighter decimal to double/float. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-506266045 @wangyum @cloud-fan Thanks a lot for the investigation. I have also investigated Mysql/Oracle/PostgreSQL, all of them allow casting float/double value to int/long column. We can't use their behaviors as references since their data type casting allows precision loss. The safe casting we are proposing in Spark 3.0 is a totally new behavior. It checks the data types in analyzer statically. However, considering most of the popular DBMS are using implicit casting, are we making it too strict for this specific case `decimal -> double/float`? It is true that the **computation** result can be different between the following. ``` scala> BigDecimal("1115.32") + BigDecimal("0.0049") res0: scala.math.BigDecimal = 1115.3249 scala> 1115.32 +0.0049 res1: Double = 1115.32488 ``` But there is no precision loss in the **casting** itself ``` scala> BigDecimal(1115.3249).toDouble res2: Double = 1115.3249 ``` I think we should reconsider the proposal in this PR unless there is a better solution. BTW, I have been thinking about other solutions to this problem, e.g. only allow such casting for `LocalRelation`, or only allow such casting for table insertion. But allowing special cases makes the casting behaviors inconsistent. The proposal in this PR is simple and explainable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-501186464 @rxin Thanks, +1. I think we can close this one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float URL: https://github.com/apache/spark/pull/24849#issuecomment-501168100 @cloud-fan @ueshin @maropu @dongjoon-hyun @rxin @rdblue @gatorsmile Actually, I am not super confident about this. I would like to know your idea. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org