Yes, it was done on purpose to match the behavior of Hive (https://issues.apache.org/jira/browse/SPARK-10865).
And I believe Hive returns `Long`s because they adopted the definition used in MySQL (https://issues.apache.org/jira/browse/HIVE-615).

On Fri, May 19, 2017 at 10:51 AM, Anton Okolnychyi <anton.okolnyc...@gmail.com> wrote:

> Hi Dongjoon,
>
> yeah, it seems to be the same. So, was it done on purpose to match the
> behavior of Hive?
>
> Best regards,
> Anton
>
> 2017-05-19 16:39 GMT+02:00 Dong Joon Hyun <dh...@hortonworks.com>:
>
>> Hi, Anton.
>>
>> It’s the same result with Hive, isn’t it?
>>
>> hive> select 9.223372036854786E20, ceil(9.223372036854786E20);
>> OK
>> _c0                     _c1
>> 9.223372036854786E20    9223372036854775807
>> Time taken: 2.041 seconds, Fetched: 1 row(s)
>>
>> Bests,
>> Dongjoon.
>>
>> *From: *Anton Okolnychyi <anton.okolnyc...@gmail.com>
>> *Date: *Friday, May 19, 2017 at 7:26 AM
>> *To: *"dev@spark.apache.org" <dev@spark.apache.org>
>> *Subject: *[Spark SQL] ceil and floor functions on doubles
>>
>> Hi all,
>>
>> I am wondering why the results of ceil and floor functions on doubles are
>> internally casted to longs. This causes loss of precision since doubles can
>> hold bigger numbers.
>>
>> Consider the following example:
>>
>> // 9.223372036854786E20 is greater than Long.MaxValue
>> val df = sc.parallelize(Array(("col", 9.223372036854786E20))).toDF()
>> df.createOrReplaceTempView("tbl")
>> spark.sql("select _2 AS original_value, ceil(_2) as ceil_result from tbl").show()
>>
>> +----------------------+---------------------+
>> | original_value       | ceil_result         |
>> +----------------------+---------------------+
>> | 9.223372036854786E20 | 9223372036854775807 |
>> +----------------------+---------------------+
>>
>> So, the original double value is rounded to 9223372036854775807, which is
>> Long.MaxValue.
>>
>> I think that it would be better to return 9.223372036854786E20 as it was
>> (and as it is actually returned by math.ceil before the cast to long). If
>> it is a problem, then I can fix this.
>>
>> Best regards,
>> Anton
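
For reference, the saturation discussed in this thread can be reproduced without Spark or Hive. The sketch below is plain Scala and relies only on JVM semantics: converting a Double larger than Long.MaxValue to Long yields Long.MaxValue. The names `CeilSaturation` and `ceilAsLong` are hypothetical and chosen for illustration; this is not the Spark or Hive implementation, only a model of "ceil, then cast to long" as described above.

```scala
object CeilSaturation {
  // Hypothetical helper: take the ceiling as a Double, then narrow it to Long.
  // On the JVM, a Double above Long.MaxValue narrows to Long.MaxValue.
  def ceilAsLong(x: Double): Long = math.ceil(x).toLong

  def main(args: Array[String]): Unit = {
    val original = 9.223372036854786E20 // greater than Long.MaxValue

    // math.ceil alone stays in Double and keeps the value.
    println(math.ceil(original)) // 9.223372036854786E20

    // The narrowing conversion saturates at Long.MaxValue.
    println(ceilAsLong(original)) // 9223372036854775807
    println(ceilAsLong(original) == Long.MaxValue) // true
  }
}
```

So the value is only lost at the conversion step; math.ceil itself returns the original double unchanged, which matches the observation in the first message.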