Re: Collectors.toSet() small performance improvement
> On 16 Aug 2016, at 20:08, Tagir F. Valeev wrote: > > Hello, Paul! > > Thank you. Here's issue: > https://bugs.openjdk.java.net/browse/JDK-8164189 > And webrev: > http://cr.openjdk.java.net/~tvaleev/webrev/8164189/r1/ > Thanks will push today or tomorrow, Paul. > With best regards, > Tagir Valeev. > > PS> Hi Tagir, > > PS> I like it. Please open an issue and i will sponsor the fix for you. > > PS> Thanks, > PS> Paul. > >>> On 8 Aug 2016, at 22:31, Tagir F. Valeev wrote: >>> >>> Hello! >>> >>> I'd like to propose a simple performance patch for Collectors.toSet(): >>> >>> --- a/src/java.base/share/classes/java/util/stream/Collectors.java Tue >>> Aug 02 07:19:06 2016 +0530 >>> +++ b/src/java.base/share/classes/java/util/stream/Collectors.java Tue >>> Aug 09 11:47:20 2016 +0700 >>> @@ -295,7 +295,12 @@ >>>public static >>>Collector> toSet() { >>>return new CollectorImpl<>((Supplier>) HashSet::new, Set::add, >>> - (left, right) -> { left.addAll(right); >>> return left; }, >>> + (left, right) -> { >>> + if (left.size() < right.size()) { >>> + right.addAll(left); return >>> right; >>> + } >>> + left.addAll(right); return left; >>> + }, >>> CH_UNORDERED_ID); >>>} >>> >>> This will add the smaller set to the larger which may improve parallel >>> performance when subtasks produced uneven results. In normal case the >>> additional overhead is insignificant (additional HashSet size >>> comparison per each combine operation which usually performed >>> ~4*parallelism times). >>> >>> Here's simple JDK benchmark with results (Core-i7-4702MQ 4 HT cores, >>> Win7, JDK 9-ea+129 64bit) which confirms the speedup: >>> http://cr.openjdk.java.net/~tvaleev/patches/toSet/ >>> >>> When data is unevenly distributed, it's significantly faster: >>> >>> Original: >>> ToSet.toSetParallel1 true avgt 50 88,580 ± 0,874 >>> us/op >>> ToSet.toSetParallel 10 true avgt 50676,243 ± 41,188 >>> us/op >>> ToSet.toSetParallel 100 true avgt 50 9126,399 ± 83,260 >>> us/op >>> Patched: >>> ToSet.toSetParallel1 true avgt 50 62,121 ± 1,207 >>> us/op >>> ToSet.toSetParallel 10 true avgt 50556,968 ± 37,404 >>> us/op >>> ToSet.toSetParallel 100 true avgt 50 6572,372 ± 183,466 >>> us/op >>> >>> When data is evenly distributed, no statistically significant changes >>> observed: >>> >>> Original: >>> ToSet.toSetParallel1 false avgt 50177,137 ±5,941 >>> us/op >>> ToSet.toSetParallel 10 false avgt 50 2056,332 ± 87,433 >>> us/op >>> ToSet.toSetParallel 100 false avgt 50 51864,198 ± 560,883 >>> us/op >>> Patched: >>> ToSet.toSetParallel1 false avgt 50172,606 ± 7,321 >>> us/op >>> ToSet.toSetParallel 10 false avgt 50 1922,478 ± 79,593 >>> us/op >>> ToSet.toSetParallel 100 false avgt 50 52648,100 ± 621,526 >>> us/op >>> >>> What do you think? Should I open an issue? >>> >>> With best regards, >>> Tagir Valeev. >>> >
Re: Collectors.toSet() small performance improvement
Hello, Paul! Thank you. Here's issue: https://bugs.openjdk.java.net/browse/JDK-8164189 And webrev: http://cr.openjdk.java.net/~tvaleev/webrev/8164189/r1/ With best regards, Tagir Valeev. PS> Hi Tagir, PS> I like it. Please open an issue and i will sponsor the fix for you. PS> Thanks, PS> Paul. >> On 8 Aug 2016, at 22:31, Tagir F. Valeev wrote: >> >> Hello! >> >> I'd like to propose a simple performance patch for Collectors.toSet(): >> >> --- a/src/java.base/share/classes/java/util/stream/Collectors.java Tue >> Aug 02 07:19:06 2016 +0530 >> +++ b/src/java.base/share/classes/java/util/stream/Collectors.java Tue >> Aug 09 11:47:20 2016 +0700 >> @@ -295,7 +295,12 @@ >> public static >> Collector> toSet() { >> return new CollectorImpl<>((Supplier>) HashSet::new, Set::add, >> - (left, right) -> { left.addAll(right); >> return left; }, >> + (left, right) -> { >> + if (left.size() < right.size()) { >> + right.addAll(left); return right; >> + } >> + left.addAll(right); return left; >> + }, >>CH_UNORDERED_ID); >> } >> >> This will add the smaller set to the larger which may improve parallel >> performance when subtasks produced uneven results. In normal case the >> additional overhead is insignificant (additional HashSet size >> comparison per each combine operation which usually performed >> ~4*parallelism times). >> >> Here's simple JDK benchmark with results (Core-i7-4702MQ 4 HT cores, >> Win7, JDK 9-ea+129 64bit) which confirms the speedup: >> http://cr.openjdk.java.net/~tvaleev/patches/toSet/ >> >> When data is unevenly distributed, it's significantly faster: >> >> Original: >> ToSet.toSetParallel1 true avgt 50 88,580 ± 0,874 us/op >> ToSet.toSetParallel 10 true avgt 50676,243 ± 41,188 us/op >> ToSet.toSetParallel 100 true avgt 50 9126,399 ± 83,260 us/op >> Patched: >> ToSet.toSetParallel1 true avgt 50 62,121 ± 1,207 us/op >> ToSet.toSetParallel 10 true avgt 50556,968 ± 37,404 us/op >> ToSet.toSetParallel 100 true avgt 50 6572,372 ± 183,466 us/op >> >> When data is evenly distributed, no statistically significant changes >> observed: >> >> Original: >> ToSet.toSetParallel1 false avgt 50177,137 ±5,941 >> us/op >> ToSet.toSetParallel 10 false avgt 50 2056,332 ± 87,433 >> us/op >> ToSet.toSetParallel 100 false avgt 50 51864,198 ± 560,883 >> us/op >> Patched: >> ToSet.toSetParallel1 false avgt 50172,606 ± 7,321 us/op >> ToSet.toSetParallel 10 false avgt 50 1922,478 ± 79,593 us/op >> ToSet.toSetParallel 100 false avgt 50 52648,100 ± 621,526 us/op >> >> What do you think? Should I open an issue? >> >> With best regards, >> Tagir Valeev. >>
Re: Collectors.toSet() small performance improvement
Hi Tagir, I like it. Please open an issue and i will sponsor the fix for you. Thanks, Paul. > On 8 Aug 2016, at 22:31, Tagir F. Valeev wrote: > > Hello! > > I'd like to propose a simple performance patch for Collectors.toSet(): > > --- a/src/java.base/share/classes/java/util/stream/Collectors.java Tue > Aug 02 07:19:06 2016 +0530 > +++ b/src/java.base/share/classes/java/util/stream/Collectors.java Tue > Aug 09 11:47:20 2016 +0700 > @@ -295,7 +295,12 @@ > public static > Collector> toSet() { > return new CollectorImpl<>((Supplier>) HashSet::new, Set::add, > - (left, right) -> { left.addAll(right); > return left; }, > + (left, right) -> { > + if (left.size() < right.size()) { > + right.addAll(left); return right; > + } > + left.addAll(right); return left; > + }, >CH_UNORDERED_ID); > } > > This will add the smaller set to the larger which may improve parallel > performance when subtasks produced uneven results. In normal case the > additional overhead is insignificant (additional HashSet size > comparison per each combine operation which usually performed > ~4*parallelism times). > > Here's simple JDK benchmark with results (Core-i7-4702MQ 4 HT cores, > Win7, JDK 9-ea+129 64bit) which confirms the speedup: > http://cr.openjdk.java.net/~tvaleev/patches/toSet/ > > When data is unevenly distributed, it's significantly faster: > > Original: > ToSet.toSetParallel1 true avgt 50 88,580 ± 0,874 us/op > ToSet.toSetParallel 10 true avgt 50676,243 ± 41,188 us/op > ToSet.toSetParallel 100 true avgt 50 9126,399 ± 83,260 us/op > Patched: > ToSet.toSetParallel1 true avgt 50 62,121 ± 1,207 us/op > ToSet.toSetParallel 10 true avgt 50556,968 ± 37,404 us/op > ToSet.toSetParallel 100 true avgt 50 6572,372 ± 183,466 us/op > > When data is evenly distributed, no statistically significant changes > observed: > > Original: > ToSet.toSetParallel1 false avgt 50177,137 ±5,941 us/op > ToSet.toSetParallel 10 false avgt 50 2056,332 ± 87,433 us/op > ToSet.toSetParallel 100 false avgt 50 51864,198 ± 560,883 us/op > Patched: > ToSet.toSetParallel1 false avgt 50172,606 ± 7,321 us/op > ToSet.toSetParallel 10 false avgt 50 1922,478 ± 79,593 us/op > ToSet.toSetParallel 100 false avgt 50 52648,100 ± 621,526 us/op > > What do you think? Should I open an issue? > > With best regards, > Tagir Valeev. >
Re: Collectors.toSet() small performance improvement
On 08/09/2016 10:19 AM, Tagir F. Valeev wrote: > AS> Good trick, but does it work properly with the sets that care about the > AS> add order, e.g. LinkedHashSet? I guess our saving grace here is > AS> Collector.toSet() is declared UNORDERED, so we can disturb the encounter > AS> order without violating the API contract. > > This implementation is bound to HashSet which is created right here, > in supplier: > > return new CollectorImpl<>((Supplier>) HashSet::new, Set::add,... ) > > No other set implementation is possible here. Users who want to use > custom set should use Collectors.toCollection(LinkedHashSet::new) > which I'm not going to change as indeed here order might be important. Right. (seeps some green tea trying to wake up) Looks good then. Thanks, -Aleksey
Re: Collectors.toSet() small performance improvement
Hello, Aleksey! Thank you for review. AS> Good trick, but does it work properly with the sets that care about the AS> add order, e.g. LinkedHashSet? I guess our saving grace here is AS> Collector.toSet() is declared UNORDERED, so we can disturb the encounter AS> order without violating the API contract. This implementation is bound to HashSet which is created right here, in supplier: return new CollectorImpl<>((Supplier>) HashSet::new, Set::add,... ) No other set implementation is possible here. Users who want to use custom set should use Collectors.toCollection(LinkedHashSet::new) which I'm not going to change as indeed here order might be important. AS> I would make it a bit cleaner though: AS> (left, right) -> { AS> if (left.size() < right.size()) { AS>right.addAll(left); return right; AS> } else { AS>left.addAll(right); return left; AS> } AS> } Agreed. With best regards, Tagir Valeev.
Re: Collectors.toSet() small performance improvement
On 08/09/2016 08:31 AM, Tagir F. Valeev wrote: > --- a/src/java.base/share/classes/java/util/stream/Collectors.java Tue > Aug 02 07:19:06 2016 +0530 > +++ b/src/java.base/share/classes/java/util/stream/Collectors.java Tue > Aug 09 11:47:20 2016 +0700 > @@ -295,7 +295,12 @@ > public static > Collector> toSet() { > return new CollectorImpl<>((Supplier>) HashSet::new, Set::add, > - (left, right) -> { left.addAll(right); > return left; }, > + (left, right) -> { > + if (left.size() < right.size()) { > + right.addAll(left); return right; > + } > + left.addAll(right); return left; > + }, > CH_UNORDERED_ID); > } Good trick, but does it work properly with the sets that care about the add order, e.g. LinkedHashSet? I guess our saving grace here is Collector.toSet() is declared UNORDERED, so we can disturb the encounter order without violating the API contract. I would make it a bit cleaner though: (left, right) -> { if (left.size() < right.size()) { right.addAll(left); return right; } else { left.addAll(right); return left; } } Thanks, -Aleksey
Collectors.toSet() small performance improvement
Hello! I'd like to propose a simple performance patch for Collectors.toSet(): --- a/src/java.base/share/classes/java/util/stream/Collectors.java Tue Aug 02 07:19:06 2016 +0530 +++ b/src/java.base/share/classes/java/util/stream/Collectors.java Tue Aug 09 11:47:20 2016 +0700 @@ -295,7 +295,12 @@ public static Collector> toSet() { return new CollectorImpl<>((Supplier>) HashSet::new, Set::add, - (left, right) -> { left.addAll(right); return left; }, + (left, right) -> { + if (left.size() < right.size()) { + right.addAll(left); return right; + } + left.addAll(right); return left; + }, CH_UNORDERED_ID); } This will add the smaller set to the larger which may improve parallel performance when subtasks produced uneven results. In normal case the additional overhead is insignificant (additional HashSet size comparison per each combine operation which usually performed ~4*parallelism times). Here's simple JDK benchmark with results (Core-i7-4702MQ 4 HT cores, Win7, JDK 9-ea+129 64bit) which confirms the speedup: http://cr.openjdk.java.net/~tvaleev/patches/toSet/ When data is unevenly distributed, it's significantly faster: Original: ToSet.toSetParallel1 true avgt 50 88,580 ± 0,874 us/op ToSet.toSetParallel 10 true avgt 50676,243 ± 41,188 us/op ToSet.toSetParallel 100 true avgt 50 9126,399 ± 83,260 us/op Patched: ToSet.toSetParallel1 true avgt 50 62,121 ± 1,207 us/op ToSet.toSetParallel 10 true avgt 50556,968 ± 37,404 us/op ToSet.toSetParallel 100 true avgt 50 6572,372 ± 183,466 us/op When data is evenly distributed, no statistically significant changes observed: Original: ToSet.toSetParallel1 false avgt 50177,137 ±5,941 us/op ToSet.toSetParallel 10 false avgt 50 2056,332 ± 87,433 us/op ToSet.toSetParallel 100 false avgt 50 51864,198 ± 560,883 us/op Patched: ToSet.toSetParallel1 false avgt 50172,606 ± 7,321 us/op ToSet.toSetParallel 10 false avgt 50 1922,478 ± 79,593 us/op ToSet.toSetParallel 100 false avgt 50 52648,100 ± 621,526 us/op What do you think? Should I open an issue? With best regards, Tagir Valeev.