Could you do that benchmark with e.g. JMH rather than taking the difference
of System.currentTimeMillis?  That would probably make it easier to read
and trust the results.
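
Something like the minimal JMH sketch below would report per-operation
times with warmup, forks, and dead-code elimination handled for you.
(Untested sketch; the class name and operand values are my own
placeholders, not the shapes from the TPC-DS run -- pick operands that
actually exercise the zero-stripping path you profiled.)

    import java.math.BigDecimal;
    import java.math.MathContext;
    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Thread)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 5, time = 1)
    @Fork(3)
    public class BigDecimalDivideBench {

        // Placeholder operands: chosen so the exact quotient carries many
        // trailing zeros, which is what stripping has to remove.
        private BigDecimal dividend;
        private BigDecimal divisor;

        @Setup
        public void setup() {
            dividend = new BigDecimal("12345000000000000000000000000000000");
            divisor  = new BigDecimal("5");
        }

        @Benchmark
        public BigDecimal divide() {
            // Returning the result lets JMH consume it, preventing DCE.
            return dividend.divide(divisor, MathContext.DECIMAL128);
        }
    }

Build it with the usual JMH Maven archetype and run java -jar
target/benchmarks.jar; the score column then gives you a per-divide
average you can compare across JDK builds.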

On Sun, Feb 12, 2023, 7:56 PM Sergey Kuksenko <skukse...@openjdk.org> wrote:

> On Fri, 10 Feb 2023 10:00:05 GMT, Xiaowei Lu <d...@openjdk.org> wrote:
>
> > [JDK-8269667](https://bugs.openjdk.org/browse/JDK-8269667) has
> > uncovered the poor performance of BigDecimal.divide under certain
> > circumstances.
> >
> > We ran into a similar situation when benchmarking Spark 3 on the
> > TPC-DS test kit. According to the flame graph below, it is StripZeros
> > that accounts for most of the time spent in BigDecimal.divide. Hence
> > we propose this patch to optimize stripping zeros.
> > ![Figure 1](https://user-images.githubusercontent.com/39413832/218062061-53cd0220-776e-4b72-8b9a-6b0f11707986.png)
> >
> > Currently, createAndStripZerosToMatchScale() works linearly. That is,
> > the target value is processed from back to front, stripping out a
> > single '0' at a time. To optimize, we adopt the method of binary
> > search: each time we try to strip out ${scale/2} '0's.
> >
> > The performance looks good. Theoretically, the time complexity of our
> > method is O(log n), while that of the current one is O(n). In
> > practice, benchmarks on Spark 3 show that 1/3 less time (102s -> 68s)
> > is spent on TPC-DS query4. We also ran jtreg and JCK to check
> > correctness, and it seems fine.
> >
> > More about the environment:
> > We run Spark 3.3.0 on OpenJDK 11, but the JDK version doesn't seem to
> > have much impact on BigDecimal. The Spark cluster consists of a main
> > node and 2 core nodes, each with 4 cores, 16 GB memory, and 4x500 GB
> > storage.
>
> As for TPC-DS:
> [AUTO-RESULT] QueryTotal=1968s      vs      [AUTO-RESULT] QueryTotal=1934s
> That is a difference of 34s out of 1968s, i.e. ~1.7%.
> Are you sure this small diff is a real difference and not run-to-run
> variance?
>
> -------------
>
> PR: https://git.openjdk.org/jdk/pull/12509
>
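
For what it's worth, the binary-search stripping described in the quoted
message could look roughly like the sketch below. This is my own generic
rendering of the idea, not the actual patch (the real code lives in
BigDecimal.createAndStripZerosToMatchScale and also has to respect the
preferred scale):

    import java.math.BigInteger;

    public class StripZerosSketch {
        // The predicate "10^k divides val" is monotone in k, so the
        // largest such k can be found by binary search: O(log n)
        // division attempts instead of O(n) single-digit strips.
        static BigInteger stripTrailingZeros(BigInteger val) {
            if (val.signum() == 0)
                return val;
            int low = 0;                                  // 10^0 always divides
            int high = val.abs().toString().length() - 1; // d digits => < d trailing zeros
            while (low < high) {
                int mid = (low + high + 1) >>> 1;
                if (val.remainder(BigInteger.TEN.pow(mid)).signum() == 0)
                    low = mid;       // at least mid trailing zeros
                else
                    high = mid - 1;  // fewer than mid trailing zeros
            }
            return val.divide(BigInteger.TEN.pow(low));
        }

        public static void main(String[] args) {
            // Prints 123: six trailing zeros are stripped in three probes.
            System.out.println(stripTrailingZeros(new BigInteger("123000000")));
        }
    }

Note that each probe is itself a big-integer division, so the O(log n)
claim counts division attempts rather than total digit work.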
