Re: RFR: 8280915: Better parallelization for AbstractSpliterator and IteratorSpliterator when size is unknown [v5]

Paul Sandoz Thu, 14 Apr 2022 09:24:19 -0700

On Thu, 10 Feb 2022 04:30:34 GMT, Tagir F. Valeev <[email protected]> wrote:


>> See the bug description for details.
>> 
>> I propose a simple solution. Let's allow ArraySpliterator to be non-SIZED 
>> and report an artificial estimateSize(), much bigger than the real one. 
>> This will allow AbstractSpliterator and IteratorSpliterator to produce a 
>> prefix whose size is comparable to Long.MAX_VALUE (say, starting with 
>> Long.MAX_VALUE/2), and this will enable further splitting of the prefix. 
>> This change will drastically improve parallel streaming for affected streams 
>> of size <= 1024 and significantly improve it for streams of size 
>> 1025..20000. The cost is higher-grained splitting for huge streams of 
>> unknown size. This might add a minor overhead for such scenarios which, I 
>> believe, is completely tolerable.
>> 
>> No public API changes are necessary, and sequential processing should not 
>> be affected, except for an extra field in ArraySpliterator, which increases 
>> its footprint by 8 bytes.
>> 
>> I added a simple test using an artificial collector to ensure that at least 
>> two non-empty parts are created when parallelizing a Stream.iterate source. 
>> More testing ideas are welcome.
>
> Tagir F. Valeev has updated the pull request incrementally with two 
> additional commits since the last revision:
> 
>  - Update copyright year
>  - Cosmetic fixes

Getting back to this after much delay! Approving. But I would like to 
document this design decision in comments, and maybe in implementation notes. 
We can do that in a follow-on PR.
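
For readers of the archive, here is a minimal sketch of the kind of check 
the quoted description mentions: a collector that records how many elements 
each leaf task received, applied to a parallel Stream.iterate source of 
unknown size. It is not the test from the PR. The class name 
UnknownSizeSplitDemo and the helpers leafSizes() and nonEmptyParts() are 
hypothetical names chosen for illustration. The number of parts printed 
depends on the JDK build and on the parallelism of the common pool, so no 
particular output is claimed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class UnknownSizeSplitDemo {

    // Hypothetical helper: the result lists how many elements each leaf
    // task accumulated. Every fresh container is [0]; the accumulator bumps
    // that single counter; the combiner concatenates the per-leaf counters.
    static Collector<Object, List<Integer>, List<Integer>> leafSizes() {
        return Collector.<Object, List<Integer>>of(
            () -> {
                List<Integer> counts = new ArrayList<>();
                counts.add(0);
                return counts;
            },
            (counts, ignored) -> counts.set(0, counts.get(0) + 1),
            (left, right) -> {
                left.addAll(right);
                return left;
            });
    }

    // Runs a parallel stream over 0..n-1 from a source whose size is not
    // known up front, and returns the sizes of the non-empty parts.
    static List<Integer> nonEmptyParts(int n) {
        return Stream.iterate(0, i -> i < n, i -> i + 1)
                .parallel()
                .collect(leafSizes())
                .stream()
                .filter(count -> count > 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> parts = nonEmptyParts(100);
        // A single part means the whole stream was processed by one task.
        System.out.println("non-empty parts: " + parts.size() + " " + parts);
    }
}
```

If the result contains only one non-empty part for a small n, the stream 
was effectively processed sequentially, which is the situation the bug 
describes for affected streams of size <= 1024.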

-------------

Marked as reviewed by psandoz (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/7279
