On Fri, 19 Nov 2021 10:42:23 GMT, kabutz <d...@openjdk.java.net> wrote:

> > I think the functionality in this PR is worth pursuing, but with the JDK 18 
> > rampdown 1 date fast approaching, as a non-urgent issue, I think we 
> > shouldn't try to rush it into JDK 18.
> 
> > I looked more closely and now understand a bit more about the threshold. It 
> > would be useful to have some implementation notes detailing approximately 
> > when the parallel execution kicks in. For `a*a` it's `>= 
> > TOOM_COOK_SQUARE_THRESHOLD(216)` and for `a*b` it's `>= 
> > TOOM_COOK_THRESHOLD(240)`, so we could refer to certain bit lengths.
> > The branching factor is 4 (well 3 i guess, but its easier to think in 
> > powers of 2). It might be reasonable to assume the problem gets split 
> > equally in 4 parts. I don't know in practice what the depth of recursion 
> > might, its hard to see this getting completely out of control, but we could 
> > always keep track of the depth and cut off in proportion to the # runtime 
> > processors if need be.
> > Given the existing degree of specialization, the minimal changes to code, 
> > and the fact that the creation of recursive tasks is in the noise this PR 
> > looks quite reasonable.
> 
> I have run some more tests. For my fibonacci algorithm, here are the worst 
> cases for the various calculations.
> 
> ```
>             n   most_bits     tasks  time_ms
>          1000         694         0      1
>        10_000       6_942         0      1
>       100_000      69_424        18     13
>     1_000_000     694_241       468    143
>    10_000_000   6_942_418    11_718   1049
>   100_000_000  69_424_191   292_968  13695
> 1_000_000_000 694_241_913 7_324_218 237282
> ```
> 
> Each data point has 10x the number of bits in the final result and the number 
> of tasks in the final calculation is 25x more. Perhaps I can make the 
> threshold the recursive depth up to which we would run in parallel. And that 
> recursive depth could the availableProcessors() or some multiple thereof.
> 

Defense in depth :-) Perhaps a depth that is between some multiple of log2 and 
log4 of `availableProcessors()`, given the branching factor.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6409

Reply via email to