Hi, all,

FYI:

>> @Yuming Wang the results in float8.sql are from PostgreSQL directly?
>> Interesting if it also returns the same less accurate result, which
>> might suggest it's more to do with underlying OS math libraries. You
>> noted that these tests sometimes gave platform-dependent differences
>> in the last digit, so wondering if the test value directly reflects
>> PostgreSQL or just what we happen to return now.

The results in float8.sql.out were recomputed in Spark/JVM.
The expected output of the PostgreSQL test is here:
https://github.com/postgres/postgres/blob/master/src/test/regress/expected/float8.out#L493

As you can see in that file (float8.out), results other than atanh also
differ between Spark/JVM and PostgreSQL. For example, the answers for
acosh are:

-- PostgreSQL
https://github.com/postgres/postgres/blob/master/src/test/regress/expected/float8.out#L487
1.31695789692482

-- Spark/JVM
https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/results/pgSQL/float8.sql.out#L523
1.3169578969248166

btw, the PostgreSQL implementation of atanh just calls atanh in math.h:
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/float.c#L2606

Bests,
Takeshi

On Sat, Jul 27, 2019 at 10:35 AM bo zhaobo <bzhaojyathousa...@gmail.com> wrote:

> Hi all,
>
> Thanks for your concern. Yes, it is worth testing this against the backend
> database as well. Note, however, that this issue is hit in Spark SQL itself:
> we only test with Spark and do not integrate other databases.
>
> Best Regards,
>
> ZhaoBo
>
> On Fri, Jul 26, 2019 at 5:46 PM Sean Owen <sro...@gmail.com> wrote:
>
>> Interesting. I don't think log(3) is special, it's just that some
>> differences in how it's implemented and floating-point values on
>> aarch64 vs x86, or in the JVM, manifest at some values like this. It's
>> still a little surprising! BTW Wolfram Alpha suggests that the correct
>> value is more like ...810969..., right between the two.
>> java.lang.Math doesn't guarantee strict IEEE floating-point behavior, but
>> java.lang.StrictMath is supposed to, at the potential cost of speed,
>> and it gives ...81096, in agreement with aarch64.
>>
>> @Yuming Wang the results in float8.sql are from PostgreSQL directly?
>> Interesting if it also returns the same less accurate result, which
>> might suggest it's more to do with underlying OS math libraries. You
>> noted that these tests sometimes gave platform-dependent differences
>> in the last digit, so wondering if the test value directly reflects
>> PostgreSQL or just what we happen to return now.
>>
>> One option is to use StrictMath in special cases like computing atanh.
>> That gives a value that agrees with aarch64.
>> I also note that 0.5 * (math.log(1 + x) - math.log(1 - x)) gives the
>> more accurate answer too, and makes the result agree with, say,
>> Wolfram Alpha for atanh(0.5).
>> (Actually if we do that, better still is 0.5 * (math.log1p(x) -
>> math.log1p(-x)) for best accuracy near 0)
>> Commons Math also has implementations of sinh, cosh, atanh that we
>> could call. It claims it's possibly more accurate and faster. I
>> haven't tested its result here.
>>
>> FWIW the "log1p" version appears, from some informal testing, to be
>> most accurate (in agreement with Wolfram) and using StrictMath doesn't
>> matter. If we change something, I'd use that version above.
>> The only issue is if this causes the result to disagree with
>> PostgreSQL, but then again it's more correct and maybe the DB is
>> wrong.
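
The formulations of atanh discussed above can be put side by side in a minimal Scala sketch. The object and method names (AtanhVariants, atanhRatio, atanhLogDiff, atanhLog1p, atanhStrict) are hypothetical and chosen only for illustration; this is not Spark's implementation. Only math.log, math.log1p and StrictMath.log are standard JVM calls. The last printed digit may differ between x86_64 and aarch64, so no expected output is shown.

```scala
object AtanhVariants {
  // Form used in the REPL session in this thread: 0.5 * log((1 + x) / (1 - x)).
  def atanhRatio(x: Double): Double =
    0.5 * math.log((1.0 + x) / (1.0 - x))

  // Difference of two logarithms, as suggested in the message above.
  def atanhLogDiff(x: Double): Double =
    0.5 * (math.log(1.0 + x) - math.log(1.0 - x))

  // log1p-based form, suggested for best accuracy near 0.
  def atanhLog1p(x: Double): Double =
    0.5 * (math.log1p(x) - math.log1p(-x))

  // Same as atanhRatio, but using StrictMath for platform-independent results.
  def atanhStrict(x: Double): Double =
    0.5 * StrictMath.log((1.0 + x) / (1.0 - x))

  def main(args: Array[String]): Unit = {
    val x = 0.5
    println(s"ratio      = ${atanhRatio(x)}")
    println(s"log diff   = ${atanhLogDiff(x)}")
    println(s"log1p      = ${atanhLog1p(x)}")
    println(s"StrictMath = ${atanhStrict(x)}")
  }
}
```

Near 0 the difference is more than a last-digit effect: for x = 1e-20, 1.0 + x rounds to 1.0, so the ratio form returns 0.0, while the log1p form keeps the input's magnitude.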
>>
>> The rest may be a test vs PostgreSQL issue; see
>> https://issues.apache.org/jira/browse/SPARK-28316
>>
>> On Fri, Jul 26, 2019 at 2:32 AM Tianhua huang <huangtianhua...@gmail.com> wrote:
>> >
>> > Hi, all
>> >
>> > Sorry to disturb again, there are several sql tests failed on arm64 instance:
>> >
>> > pgSQL/float8.sql *** FAILED ***
>> > Expected "0.549306144334054[9]", but got "0.549306144334054[8]" Result did not match for query #56
>> > SELECT atanh(double('0.5')) (SQLQueryTestSuite.scala:362)
>> > pgSQL/numeric.sql *** FAILED ***
>> > Expected "2 2247902679199174[72 224790267919917955.1326161858
>> > 4 7405685069595001 7405685069594999.0773399947
>> > 5 5068226527.321263 5068226527.3212726541
>> > 6 281839893606.99365 281839893606.9937234336
>> > 7 1716699575118595840 1716699575118597095.4233081991
>> > 8 167361463828.0749 167361463828.0749132007
>> > 9 107511333880051856] 107511333880052007....", but got "2 2247902679199174[40 224790267919917955.1326161858
>> > 4 7405685069595001 7405685069594999.0773399947
>> > 5 5068226527.321263 5068226527.3212726541
>> > 6 281839893606.99365 281839893606.9937234336
>> > 7 1716699575118595580 1716699575118597095.4233081991
>> > 8 167361463828.0749 167361463828.0749132007
>> > 9 107511333880051872] 107511333880052007...."
>> > Result did not match for query #496
>> > SELECT t1.id1, t1.result, t2.expected
>> > FROM num_result t1, num_exp_power_10_ln t2
>> > WHERE t1.id1 = t2.id
>> > AND t1.result != t2.expected (SQLQueryTestSuite.scala:362)
>> >
>> > The first test failed because the value of math.log(3.0) is different on aarch64:
>> >
>> > # on x86_64:
>> >
>> > scala> val a = 0.5
>> > a: Double = 0.5
>> >
>> > scala> a * math.log((1.0 + a) / (1.0 - a))
>> > res1: Double = 0.5493061443340549
>> >
>> > scala> math.log((1.0 + a) / (1.0 - a))
>> > res2: Double = 1.0986122886681098
>> >
>> > # on aarch64:
>> >
>> > scala> val a = 0.5
>> > a: Double = 0.5
>> >
>> > scala> a * math.log((1.0 + a) / (1.0 - a))
>> > res20: Double = 0.5493061443340548
>> >
>> > scala> math.log((1.0 + a) / (1.0 - a))
>> > res21: Double = 1.0986122886681096
>> >
>> > I also tried several other numbers, such as math.log(4.0) and math.log(5.0),
>> > and they are the same on both platforms. I don't know why math.log(3.0) is so
>> > special, but the result is indeed different on aarch64. If you are interested,
>> > please try it.
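
To quantify a difference like the one reported above, one option is to count how many representable doubles lie between two results. The following is a minimal Scala sketch under that assumption; LogUlpCheck and ulpDistance are hypothetical names introduced here, and the helper is only valid for finite doubles of the same sign. What it prints for math.log versus StrictMath.log depends on the platform and JVM, so no expected output is shown.

```scala
object LogUlpCheck {
  // Number of representable doubles between a and b.
  // Assumes both values are finite and have the same sign, so that
  // their IEEE 754 bit patterns are ordered like the values themselves.
  def ulpDistance(a: Double, b: Double): Long =
    math.abs(java.lang.Double.doubleToLongBits(a) - java.lang.Double.doubleToLongBits(b))

  def main(args: Array[String]): Unit = {
    // Compare the platform-dependent Math.log with the reproducible StrictMath.log.
    for (x <- Seq(3.0, 4.0, 5.0)) {
      val fast = math.log(x)
      val strict = StrictMath.log(x)
      println(s"log($x): Math=$fast StrictMath=$strict ulpDistance=${ulpDistance(fast, strict)}")
    }
    // The two values of log(3.0) reported in this thread.
    println(ulpDistance(1.0986122886681098, 1.0986122886681096))
  }
}
```

Applied to the two reported values of math.log(3.0), the helper returns 1: they are adjacent doubles, so the platforms disagree only in the last bit.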
>> >
>> > The second test failed because some values of pow(10, x) are different on
>> > aarch64. Following the Spark SQL tests, I ran similar tests on aarch64 and
>> > x86_64, taking '-83028485' as an example:
>> >
>> > # on x86_64:
>> > scala> import java.lang.Math._
>> > import java.lang.Math._
>> > scala> var a = -83028485
>> > a: Int = -83028485
>> > scala> abs(a)
>> > res4: Int = 83028485
>> > scala> math.log(abs(a))
>> > res5: Double = 18.234694299654787
>> > scala> pow(10, math.log(abs(a)))
>> > res6: Double = 1.71669957511859584E18
>> >
>> > # on aarch64:
>> >
>> > scala> var a = -83028485
>> > a: Int = -83028485
>> > scala> abs(a)
>> > res38: Int = 83028485
>> >
>> > scala> math.log(abs(a))
>> > res39: Double = 18.234694299654787
>> > scala> pow(10, math.log(abs(a)))
>> > res40: Double = 1.71669957511859558E18
>> >
>> > I sent an email to jdk-dev in the hope that someone can help, and I also
>> > filed this in JIRA: https://issues.apache.org/jira/browse/SPARK-28519. If you
>> > are interested, you are welcome to join the discussion. Thank you very much.
>> >
>> > On Thu, Jul 18, 2019 at 11:12 AM Tianhua huang <huangtianhua...@gmail.com> wrote:
>> >>
>> >> Thanks for your reply.
>> >>
>> >> About the first problem: we didn't find any other reason in the log, only a
>> >> timeout while waiting for the executor to come up. After increasing the
>> >> timeout from 10000 ms to 30000 ms (or even 20000 ms) at
>> >> https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L764
>> >> https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L792
>> >> the test passed, and more than one executor came up. We are not sure whether
>> >> this is related to the flavor of our aarch64 instance; the flavor is
>> >> currently 8C8G, and we may try a bigger one later. If anyone has other
>> >> suggestions, please contact me. Thank you.
>> >>
>> >> About the second problem, I proposed a pull request to apache/spark,
>> >> https://github.com/apache/spark/pull/25186. If you have time, would you
>> >> please help review it? Thank you very much.
>> >>
>> >> On Wed, Jul 17, 2019 at 8:37 PM Sean Owen <sro...@gmail.com> wrote:
>> >>>
>> >>> On Wed, Jul 17, 2019 at 6:28 AM Tianhua huang <huangtianhua...@gmail.com> wrote:
>> >>> > Two failed and the reason is 'Can't find 1 executors before 10000
>> >>> > milliseconds elapsed', see below. When we increased the timeout the tests
>> >>> > passed, so I wonder if we can increase the timeout? I also have another
>> >>> > question about
>> >>> > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TestUtils.scala#L285,
>> >>> > why is it not >=? According to the comment on the function, it should be >=.
>> >>> >
>> >>>
>> >>> I think it's ">" because the driver is also an executor, but not 100%
>> >>> sure. In any event it passes in general.
>> >>> These errors typically mean "I didn't start successfully" for some
>> >>> other reason that may be in the logs.
>> >>>
>> >>> > The other two failed and the reason is '2143289344 equaled 2143289344'.
>> >>> > This is because the value of floatToRawIntBits(0.0f/0.0f) on the aarch64
>> >>> > platform is 2143289344, which equals floatToRawIntBits(Float.NaN). I sent
>> >>> > an email about this to jdk-dev and opened a topic in the Scala community,
>> >>> > https://users.scala-lang.org/t/the-value-of-floattorawintbits-0-0f-0-0f-is-different-on-x86-64-and-aarch64-platforms/4845
>> >>> > and https://github.com/scala/bug/issues/11632. I thought it was something
>> >>> > about the JDK or Scala, but after discussion it appears to be related to
>> >>> > the platform, so it seems the following asserts are not appropriate?
>> >>> > https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala#L704-L705
>> >>> > and
>> >>> > https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala#L732-L733
>> >>>
>> >>> These tests could special-case execution on ARM, like you'll see some
>> >>> tests handle big-endian architectures.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>

--
---
Takeshi Yamamuro