Re: RFR: 8196106: Support nested infinite or recursive flat mapped streams [v2]

2024-04-09 Thread Paul Sandoz
On Tue, 9 Apr 2024 10:07:46 GMT, Viktor Klang  wrote:

>> This PR implements a Gatherer-inspired encoding of `flatMap` and shows that 
>> it is both competitive performance-wise and improves correctness.
>> 
>> Below is the performance of `Stream::flatMap` (for reference types):
>> 
>> Before this PR:
>> 
>> Benchmark            (size)   Mode  Cnt        Score        Error  Units
>> FlatMap.par_array        10  thrpt   12   294008,937 ±  54369,110  ops/s
>> FlatMap.par_array       100  thrpt   12    62411,229 ±  14868,119  ops/s
>> FlatMap.par_array      1000  thrpt   12     8263,821 ±    452,622  ops/s
>> FlatMap.par_iterate      10  thrpt   12    23029,978 ±   4274,449  ops/s
>> FlatMap.par_iterate     100  thrpt   12    10532,907 ±    321,694  ops/s
>> FlatMap.par_iterate    1000  thrpt   12      981,571 ±    135,270  ops/s
>> FlatMap.seq_array        10  thrpt   12  2955648,495 ±  32539,142  ops/s
>> FlatMap.seq_array       100  thrpt   12    41851,009 ±    377,546  ops/s
>> FlatMap.seq_array      1000  thrpt   12     1740,281 ±   1229,974  ops/s
>> FlatMap.seq_iterate      10  thrpt   12   321727,690 ±   5149,356  ops/s
>> FlatMap.seq_iterate     100  thrpt   12     8437,198 ±     56,635  ops/s
>> FlatMap.seq_iterate    1000  thrpt   12       76,994 ±      0,965  ops/s
>> 
>> 
>> After this PR:
>> 
>> 
>> Benchmark            (size)   Mode  Cnt        Score         Error  Units
>> FlatMap.par_array        10  thrpt   12   283350,051 ±   35567,223  ops/s
>> FlatMap.par_array       100  thrpt   12    53846,906 ±   19241,913  ops/s
>> FlatMap.par_array      1000  thrpt   12     8230,909 ±     156,362  ops/s
>> FlatMap.par_iterate      10  thrpt   12    26328,500 ±    5411,401  ops/s
>> FlatMap.par_iterate     100  thrpt   12    10470,862 ±     249,991  ops/s
>> FlatMap.par_iterate    1000  thrpt   12      986,511 ±     224,050  ops/s
>> FlatMap.seq_array        10  thrpt   12  5654826,565 ±   27317,453  ops/s
>> FlatMap.seq_array       100  thrpt   12   187929,786 ±     542,787  ops/s
>> FlatMap.seq_array      1000  thrpt   12     2385,346 ±       9,827  ops/s
>> FlatMap.seq_iterate      10  thrpt   12   812722,403 ±  160500,399  ops/s
>> FlatMap.seq_iterate     100  thrpt   12    13542,472 ±     118,769  ops/s
>> FlatMap.seq_iterate    1000  thrpt   12      157,056 ±       1,814  ops/s
>> 
>> 
>> For streams of size 100k, the following numbers are interesting:
>> 
>> Before this PR:
>> 
>> Benchmark            (size)   Mode  Cnt  Score    Error  Units
>> FlatMap.par_array        10  thrpt   12  0,325 ±  0,004  ops/s
>> FlatMap.par_iterate      10  thrpt   12  0,106 ±  0,008  o...
>
> Viktor Klang has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Updating copyright year

src/java.base/share/classes/java/util/stream/DoublePipeline.java line 280:

> 278: result.sequential().allMatch(this);
> 279: else
> 280: result.sequential().forEach(sink::accept);

I think `sink::accept` might create a new `DoubleConsumer` instance for every 
input element. Alternatively, you can compute it once and cache it in a field 
that replaces `shorts`, then use a `null` check.
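
One possible reading of that suggestion, as a minimal standalone sketch. The class `CachedConsumerSketch`, its fields, and the `push` method are hypothetical names for illustration and are not the code in `DoublePipeline`; the sketch assumes that a `null` field stands for "the pipeline is short-circuiting".

```java
import java.util.function.DoubleConsumer;
import java.util.function.DoublePredicate;
import java.util.stream.DoubleStream;

// Hypothetical stand-in for the FlatMap sink under review; not the JDK code.
final class CachedConsumerSketch implements DoublePredicate {
    private final DoubleConsumer downstream;
    // Assumption: null means "short-circuiting pipeline, use allMatch".
    // Otherwise this holds the consumer, created once instead of per element.
    private final DoubleConsumer fastPath;

    CachedConsumerSketch(DoubleConsumer downstream, boolean shortCircuiting) {
        this.downstream = downstream;
        // The method reference is evaluated once, here, rather than per call.
        this.fastPath = shortCircuiting ? null : downstream::accept;
    }

    @Override
    public boolean test(double value) {
        downstream.accept(value);
        return true; // a real sink would return false once cancellation is requested
    }

    // Pushes every element of one mapped stream downstream.
    void push(DoubleStream result) {
        if (fastPath == null)
            result.sequential().allMatch(this);
        else
            result.sequential().forEach(fastPath);
    }

    public static void main(String[] args) {
        double[] sum = {0.0};
        CachedConsumerSketch sketch = new CachedConsumerSketch(v -> sum[0] += v, false);
        sketch.push(DoubleStream.of(1.0, 2.0));
        sketch.push(DoubleStream.of(3.0));
        System.out.println(sum[0]);
    }
}
```

Whether the per-element allocation actually shows up depends on the JIT, so this is only worth doing if a benchmark confirms it.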

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18625#discussion_r1557995257


Re: RFR: 8196106: Support nested infinite or recursive flat mapped streams [v2]

2024-04-09 Thread Viktor Klang
On Tue, 9 Apr 2024 10:04:44 GMT, Viktor Klang  wrote:

> Viktor Klang has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Updating copyright year

src/java.base/share/classes/java/util/stream/DoublePipeline.java line 268:

> 266: class FlatMap implements Sink.OfDouble, DoublePredicate {
> 267: boolean cancel;
> 268: private final boolean shorts = isShortCircuitingPipeline();

If there's a known short-circuiting operation in the pipeline, we have to use 
`allMatch`, but when we know that nothing can short-circuit we can take the 
fast path via `forEach`.
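
To illustrate the difference with the public `DoubleStream` API only (the class and method names below are made up for this sketch and are not part of the PR): `allMatch` can stop traversal as soon as its predicate returns `false`, which is what a short-circuiting pipeline needs, while `forEach` always consumes its whole source.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.DoubleStream;

// Hypothetical demo class; it only uses public java.util.stream API.
final class ShortCircuitSketch {

    // Traverses an infinite source but stops once `limit` elements were seen,
    // because allMatch ends traversal on the first `false`.
    static int visitedWithAllMatch(int limit) {
        AtomicInteger seen = new AtomicInteger();
        DoubleStream.iterate(0.0, d -> d + 1.0)
                    .allMatch(d -> seen.incrementAndGet() < limit);
        return seen.get();
    }

    // forEach has no way to stop early, so it visits every element.
    static int visitedWithForEach(double... values) {
        AtomicInteger seen = new AtomicInteger();
        DoubleStream.of(values).forEach(d -> seen.incrementAndGet());
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(visitedWithAllMatch(3));
        System.out.println(visitedWithForEach(1.0, 2.0, 3.0, 4.0));
    }
}
```

Calling `forEach` on the infinite source in the first method would never return, which is why the fast path is only safe when nothing downstream can short-circuit.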

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18625#discussion_r1557369039


Re: RFR: 8196106: Support nested infinite or recursive flat mapped streams [v2]

2024-04-09 Thread Viktor Klang
Viktor Klang has updated the pull request incrementally with one additional 
commit since the last revision:

  Updating copyright year

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18625/files
  - new: https://git.openjdk.org/jdk/pull/18625/files/e31c764f..3ff40739

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18625&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18625&range=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/18625.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18625/head:pull/18625

PR: https://git.openjdk.org/jdk/pull/18625