> BigInteger currently uses three different algorithms for multiply. The simple > quadratic algorithm, then the slightly better Karatsuba if we exceed a bit > count and then Toom Cook 3 once we go into the several thousands of bits. > Since Toom Cook 3 is a recursive algorithm, it is trivial to parallelize it. > I have demonstrated this several times in conference talks. In order to be > consistent with other classes such as Arrays and Collection, I have added a > parallelMultiply() method. Internally we have added a parameter to the > private multiply method to indicate whether the calculation should be done in > parallel. > > The performance improvements are as should be expected. Fibonacci of 100 > million (using a single-threaded Dijkstra's sum of squares version) completes > in 9.2 seconds with the parallelMultiply() vs 25.3 seconds with the > sequential multiply() method. This is on my 1-8-2 laptop. The final > multiplications are with very large numbers, which then benefit from the > parallelization of Toom-Cook 3. Fibonacci 100 million is a 347084 bit number. > > We have also parallelized the private square() method. Internally, the > square() method defaults to be sequential. > > Some benchmark results, run on my 1-6-2 server: > > > Benchmark (n) Mode Cnt Score > Error Units > BigIntegerParallelMultiply.multiply 1000000 ss 4 51.707 > ± 11.194 ms/op > BigIntegerParallelMultiply.multiply 10000000 ss 4 988.302 > ± 235.977 ms/op > BigIntegerParallelMultiply.multiply 100000000 ss 4 24662.063 > ± 1123.329 ms/op > BigIntegerParallelMultiply.parallelMultiply 1000000 ss 4 49.337 > ± 26.611 ms/op > BigIntegerParallelMultiply.parallelMultiply 10000000 ss 4 527.560 > ± 268.903 ms/op > BigIntegerParallelMultiply.parallelMultiply 100000000 ss 4 9076.551 > ± 1899.444 ms/op > > > We can see that for larger calculations (fib 100m), the execution is 2.7x > faster in parallel. For medium size (fib 10m) it is 1.873x faster. And for > small (fib 1m) it is roughly the same. Considering that the fibonacci > algorithm that we used was in itself sequential, and that the last 3 > calculations would dominate, 2.7x faster should probably be considered quite > good on a 1-6-2 machine.
kabutz has updated the pull request incrementally with one additional commit since the last revision: Updated comment to include information about performance ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/6409/files - new: https://git.openjdk.java.net/jdk/pull/6409/files/fc7b844a..ef74878e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6409&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6409&range=08-09 Stats: 9 lines in 1 file changed: 8 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/6409.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/6409/head:pull/6409 PR: https://git.openjdk.java.net/jdk/pull/6409