Hi, all,

FYI:
>> @Yuming Wang the results in float8.sql are from PostgreSQL directly?
>> Interesting if it also returns the same less accurate result, which
>> might suggest it's more to do with underlying OS math libraries. You
>> noted that these tests sometimes gave platform-dependent differences
>> in the last digit, so wondering if the test value directly reflects
>> PostgreSQL or just what we happen to return now.

The results in float8.sql.out were recomputed in Spark/JVM.
The expected output of the PostgreSQL test is here:
https://github.com/postgres/postgres/blob/master/src/test/regress/expected/float8.out#L493

As you can see in that file (float8.out), results other than atanh also
differ between Spark/JVM and PostgreSQL.
For example, the results of acosh are:
-- PostgreSQL
https://github.com/postgres/postgres/blob/master/src/test/regress/expected/float8.out#L487
1.31695789692482

-- Spark/JVM
https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/results/pgSQL/float8.sql.out#L523
1.3169578969248166
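
For reference, java.lang.Math has no acosh, so a JVM-side acosh is typically
derived from the log identity. Here's a minimal sketch (an illustration, not
necessarily Spark's actual implementation; the class name is made up). Note the
JVM prints the full shortest round-trip representation (17 significant digits
here), while the PostgreSQL expected file shows fewer digits, so some of the
difference may be output precision rather than computation:

```java
public class AcoshSketch {
    // acosh(x) = ln(x + sqrt(x^2 - 1)) for x >= 1.
    static double acosh(double x) {
        return Math.log(x + Math.sqrt(x * x - 1.0));
    }

    public static void main(String[] args) {
        // acosh(2) = ln(2 + sqrt(3)) ~= 1.3169578969248166...
        System.out.println(acosh(2.0));
    }
}
```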

btw, the PostgreSQL implementation for atanh just calls atanh in math.h:
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/float.c#L2606

Bests,
Takeshi

On Sat, Jul 27, 2019 at 10:35 AM bo zhaobo <bzhaojyathousa...@gmail.com>
wrote:

> Hi all,
>
> Thanks for your concern. Yeah, it's worth testing in the backend
> database as well. But note that this issue was hit in Spark SQL; we only
> tested it with Spark itself, without integrating other databases.
>
> Best Regards,
>
> ZhaoBo
>
>
>
>
> Sean Owen <sro...@gmail.com> wrote on Fri, Jul 26, 2019 at 5:46 PM:
>
>> Interesting. I don't think log(3) is special, it's just that some
>> differences in how it's implemented and floating-point values on
>> aarch64 vs x86, or in the JVM, manifest at some values like this. It's
>> still a little surprising! BTW Wolfram Alpha suggests that the correct
>> value is more like ...810969..., right between the two. java.lang.Math
>> doesn't guarantee strict IEEE floating-point behavior, but
>> java.lang.StrictMath is supposed to, at the potential cost of speed,
>> and it gives ...81096, in agreement with aarch64.
>>
>> @Yuming Wang the results in float8.sql are from PostgreSQL directly?
>> Interesting if it also returns the same less accurate result, which
>> might suggest it's more to do with underlying OS math libraries. You
>> noted that these tests sometimes gave platform-dependent differences
>> in the last digit, so wondering if the test value directly reflects
>> PostgreSQL or just what we happen to return now.
>>
>> One option is to use StrictMath in special cases like computing atanh.
>> That gives a value that agrees with aarch64.
>> I also note that 0.5 * (math.log(1 + x) - math.log(1 - x)) gives the
>> more accurate answer too, and makes the result agree with, say,
>> Wolfram Alpha for atanh(0.5).
>> (Actually if we do that, better still is 0.5 * (math.log1p(x) -
>> math.log1p(-x)) for best accuracy near 0)
>> Commons Math also has implementations of sinh, cosh, atanh that we
>> could call. It claims it's possibly more accurate and faster. I
>> haven't tested its result here.
>>
>> FWIW the "log1p" version appears, from some informal testing, to be
>> most accurate (in agreement with Wolfram) and using StrictMath doesn't
>> matter. If we change something, I'd use that version above.
>> The only issue is if this causes the result to disagree with
>> PostgreSQL, but then again it's more correct and maybe the DB is
>> wrong.
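
To make the comparison concrete, here is a small sketch of the three atanh
formulations discussed above (the method names are made up for illustration;
which formulation Spark should use is exactly the open question):

```java
public class AtanhVariants {
    // Direct identity: atanh(x) = 0.5 * ln((1 + x) / (1 - x)).
    static double atanhLogRatio(double x) {
        return 0.5 * Math.log((1.0 + x) / (1.0 - x));
    }

    // log1p form: atanh(x) = 0.5 * (ln(1 + x) - ln(1 - x)), written with
    // log1p for best accuracy near 0.
    static double atanhLog1p(double x) {
        return 0.5 * (Math.log1p(x) - Math.log1p(-x));
    }

    // StrictMath variant: reproducible bit-for-bit across platforms
    // (fdlibm semantics), at a possible speed cost.
    static double atanhStrict(double x) {
        return 0.5 * StrictMath.log((1.0 + x) / (1.0 - x));
    }

    public static void main(String[] args) {
        System.out.println(atanhLogRatio(0.5)); // may vary by platform
        System.out.println(atanhLog1p(0.5));
        System.out.println(atanhStrict(0.5));
    }
}
```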
>>
>>
>> The rest may be a test vs PostgreSQL issue; see
>> https://issues.apache.org/jira/browse/SPARK-28316
>>
>>
>> On Fri, Jul 26, 2019 at 2:32 AM Tianhua huang <huangtianhua...@gmail.com>
>> wrote:
>> >
>> > Hi, all
>> >
>> >
>> > Sorry to disturb again; there are several SQL tests failing on the arm64
>> instance:
>> >
>> > pgSQL/float8.sql *** FAILED ***
>> > Expected "0.549306144334054[9]", but got "0.549306144334054[8]" Result
>> did not match for query #56
>> > SELECT atanh(double('0.5')) (SQLQueryTestSuite.scala:362)
>> > pgSQL/numeric.sql *** FAILED ***
>> > Expected "2 2247902679199174[72 224790267919917955.1326161858
>> > 4 7405685069595001 7405685069594999.0773399947
>> > 5 5068226527.321263 5068226527.3212726541
>> > 6 281839893606.99365 281839893606.9937234336
>> > 7 1716699575118595840 1716699575118597095.4233081991
>> > 8 167361463828.0749 167361463828.0749132007
>> > 9 107511333880051856] 107511333880052007....", but got "2
>> > 2247902679199174[40 224790267919917955.1326161858
>> > 4 7405685069595001 7405685069594999.0773399947
>> > 5 5068226527.321263 5068226527.3212726541
>> > 6 281839893606.99365 281839893606.9937234336
>> > 7 1716699575118595580 1716699575118597095.4233081991
>> > 8 167361463828.0749 167361463828.0749132007
>> > 9 107511333880051872] 107511333880052007...." Result did not match for
>> query #496
>> > SELECT t1.id1, t1.result, t2.expected
>> > FROM num_result t1, num_exp_power_10_ln t2
>> > WHERE t1.id1 = t2.id
>> > AND t1.result != t2.expected (SQLQueryTestSuite.scala:362)
>> >
>> > The first test failed because the value of math.log(3.0) is different
>> on aarch64:
>> >
>> > # on x86_64:
>> >
>> > scala> val a = 0.5
>> > a: Double = 0.5
>> >
>> > scala> a * math.log((1.0 + a) / (1.0 - a))
>> > res1: Double = 0.5493061443340549
>> >
>> > scala> math.log((1.0 + a) / (1.0 - a))
>> > res2: Double = 1.0986122886681098
>> >
>> > # on aarch64:
>> >
>> > scala> val a = 0.5
>> >
>> > a: Double = 0.5
>> >
>> > scala> a * math.log((1.0 + a) / (1.0 - a))
>> >
>> > res20: Double = 0.5493061443340548
>> >
>> > scala> math.log((1.0 + a) / (1.0 - a))
>> >
>> > res21: Double = 1.0986122886681096
>> >
>> > And I tried several other numbers, like math.log(4.0) and math.log(5.0),
>> and they are the same; I don't know why math.log(3.0) is so special. But the
>> result is indeed different on aarch64. If you are interested, please try
>> it.
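
The two platform results for atanh(0.5) are in fact adjacent doubles, i.e.
they differ by exactly one ulp, which is easy to check directly (a quick
sketch):

```java
public class UlpCheck {
    public static void main(String[] args) {
        double x86 = 0.5493061443340549;  // value observed on x86_64
        double arm = 0.5493061443340548;  // value observed on aarch64
        // Adjacent finite doubles of the same sign have raw bit patterns
        // that differ by exactly 1.
        long diff = Double.doubleToLongBits(x86) - Double.doubleToLongBits(arm);
        System.out.println(diff);                        // 1
        System.out.println(x86 - arm == Math.ulp(arm));  // true
    }
}
```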
>> >
>> > The second test failed because some values of pow(10, x) are different
>> on aarch64. Following the SQL tests of Spark, I ran similar tests on
>> aarch64 and x86_64; take '-83028485' as an example:
>> >
>> > # on x86_64:
>> > scala> import java.lang.Math._
>> > import java.lang.Math._
>> > scala> var a = -83028485
>> > a: Int = -83028485
>> > scala> abs(a)
>> > res4: Int = 83028485
>> > scala> math.log(abs(a))
>> > res5: Double = 18.234694299654787
>> > scala> pow(10, math.log(abs(a)))
>> > res6: Double = 1.71669957511859584E18
>> >
>> > # on aarch64:
>> >
>> > scala> var a = -83028485
>> > a: Int = -83028485
>> > scala> abs(a)
>> > res38: Int = 83028485
>> >
>> > scala> math.log(abs(a))
>> >
>> > res39: Double = 18.234694299654787
>> > scala> pow(10, math.log(abs(a)))
>> > res40: Double = 1.71669957511859558E18
>> >
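
One way to sidestep this platform difference is java.lang.StrictMath, which
follows the fdlibm algorithms and is specified to return bit-identical results
on every platform, at a possible speed cost. A quick sketch of the computation
above using StrictMath (no expected output shown, since the point is only that
the result is reproducible, not what the digits are):

```java
public class StrictPowCheck {
    public static void main(String[] args) {
        int a = -83028485;
        double x = StrictMath.log(Math.abs(a));
        // Unlike Math.pow, which may use platform-specific intrinsics,
        // StrictMath.pow gives the same bits on x86_64 and aarch64.
        System.out.println(x);
        System.out.println(StrictMath.pow(10.0, x));
    }
}
```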
>> > I sent an email to jdk-dev, hoping someone can help, and I also filed
>> this in JIRA: https://issues.apache.org/jira/browse/SPARK-28519. If you
>> are interested, welcome to join and discuss; thank you very much.
>> >
>> >
>> > On Thu, Jul 18, 2019 at 11:12 AM Tianhua huang <
>> huangtianhua...@gmail.com> wrote:
>> >>
>> >> Thanks for your reply.
>> >>
>> >> About the first problem, we didn't find any other reason in the logs,
>> just a timeout waiting for the executors to come up. After increasing the
>> timeout from 10000 ms to 30000 (or even 20000) ms in
>> https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L764
>>
>> https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L792
>> the tests passed, and more than one executor came up. Not sure whether
>> it's related to the flavor of our aarch64 instance? The current flavor is
>> 8C8G; maybe we will try a bigger flavor later. If anyone has other
>> suggestions, please contact me, thank you.
>> >>
>> >> About the second problem, I proposed a pull request to apache/spark:
>> https://github.com/apache/spark/pull/25186  If you have time, would you
>> please help to review it? Thank you very much.
>> >>
>> >> On Wed, Jul 17, 2019 at 8:37 PM Sean Owen <sro...@gmail.com> wrote:
>> >>>
>> >>> On Wed, Jul 17, 2019 at 6:28 AM Tianhua huang <
>> huangtianhua...@gmail.com> wrote:
>> >>> > Two failed and the reason is 'Can't find 1 executors before 10000
>> milliseconds elapsed' (see below); after we increased the timeout, the tests
>> passed, so I wonder if we can increase the timeout? And here I have another
>> question about
>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TestUtils.scala#L285:
>> why is it not >=? Per the comment on the function, it should be >=?
>> >>> >
>> >>>
>> >>> I think it's ">" because the driver is also an executor, but not 100%
>> >>> sure. In any event it passes in general.
>> >>> These errors typically mean "I didn't start successfully" for some
>> >>> other reason that may be in the logs.
>> >>>
>> >>> > The other two failed and the reason is '2143289344 equaled
>> 2143289344'; this is because the value of floatToRawIntBits(0.0f/0.0f) on
>> the aarch64 platform is 2143289344, which equals floatToRawIntBits(Float.NaN).
>> About this I sent an email to jdk-dev and opened a topic on the Scala forum
>> https://users.scala-lang.org/t/the-value-of-floattorawintbits-0-0f-0-0f-is-different-on-x86-64-and-aarch64-platforms/4845
>> and https://github.com/scala/bug/issues/11632. I thought it was something
>> in the JDK or Scala, but after discussion it appears to be platform-related,
>> so it seems the following asserts are not appropriate?
>> https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala#L704-L705
>> and
>> https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala#L732-L733
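
A portable way to compare NaN bit patterns in such tests is
Float.floatToIntBits, which collapses every NaN to the canonical pattern
0x7fc00000 (= 2143289344), unlike floatToRawIntBits, which preserves whatever
bits the hardware produced. A small sketch:

```java
public class NanBits {
    public static void main(String[] args) {
        float hwNan = 0.0f / 0.0f;  // hardware-generated NaN; raw bits vary by platform
        // floatToIntBits canonicalizes all NaNs, so this prints 7fc00000 everywhere.
        System.out.println(Integer.toHexString(Float.floatToIntBits(hwNan)));
        // floatToRawIntBits keeps the exact bits and may differ across platforms.
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(hwNan)));
    }
}
```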
>> >>>
>> >>> These tests could special-case execution on ARM, like you'll see some
>> >>> tests handle big-endian architectures.
>>
>>

-- 
---
Takeshi Yamamuro
