On Fri, 10 Feb 2023 10:00:05 GMT, Xiaowei Lu <d...@openjdk.org> wrote:

> [JDK-8269667](https://bugs.openjdk.org/browse/JDK-8269667) has uncovered the 
> poor performance of BigDecimal.divide under certain circumstance.
> 
> We confront similar situations when benchmarking Spark3 on TPC-DS test kit. 
> According to the flame-graph below, it is StripZeros that spends most of the 
> time of BigDecimal.divide. Hence we propose this patch to optimize stripping 
> zeros.
> ![图片 
> 1](https://user-images.githubusercontent.com/39413832/218062061-53cd0220-776e-4b72-8b9a-6b0f11707986.png)
> 
> Currently, createAndStripZerosToMatchScale() is performed linearly. That is, 
> the target value is parsed from back to front, each time stripping out single 
> ‘0’. To optimize, we can adopt the method of binary search. That is, each 
> time we try to strip out ${scale/2} ‘0’s. 
> 
> The performance looks good. Therotically, time complexity of our method is 
> O(log n), while the current one is O(n). In practice, benchmarks on Spark3 
> show that 1/3 less time (102s->68s) is spent on TPC-DS query4. We also runs 
> Jtreg and JCK to check correctness, and it seems fine.
> 
> More about environment: 
> we run Spark3.3.0 on Openjdk11, but it seems jdk version doesn’t have much 
> impact on BigDecimal. Spark cluster consists of a main node and 2 core nodes, 
> each has 4cores, 16g memory and 4x500GB storage.

After making sure that `intVal` is even, and before attempting a division by a 
power of 10, it might help to check if 5 divides `intVal` in the first place. 
If it doesn't, there no point in performing the division.

It can be shown that 5 divides `intVal` if and only if it divides the `long` 
sum of all `int`s in the `mag` array of `intVal`.

I didn't try out if this contributes to improve the overall performance, but it 
might be worth giving a try.

-------------

PR: https://git.openjdk.org/jdk/pull/12509

Reply via email to