On Sun, 7 Nov 2021 06:53:12 GMT, kabutz <d...@openjdk.java.net> wrote:

>>> The net effect of this change might depend on your workload. If you call
>>> stream() on empty collections that have cheap isEmpty(), this change will
>>> likely improve performance and reduce waste. However, this same change
>>> might do the opposite if some of your collections aren't empty or have
>>> costly isEmpty(). It would be good to have benchmarks for different
>>> workloads.
>>
>> Yes, I also thought about the cost of isEmpty() on concurrent collections.
>> There are four concurrent collections that have a linear time cost size()
>> method: CLQ, CLD, LTQ and CHM. However, in each of these cases, the
>> isEmpty() method has constant time cost. There might be collections defined
>> outside the JDK where this could be the case.
>>
>> However, I will extend the benchmark to include a few of those cases too, as
>> well as different sizes and collection sizes.
>>
>> Thank you so much for your input.
>
>> wouldn't this make streams no longer lazy if the collection is empty?
>>
>> ```java
>> List<String> list = new ArrayList<>();
>> Stream<String> stream = list.stream();
>>
>> list.addAll(List.of("one", "two", "three"));
>>
>> stream.forEach(System.out::println); // prints one two three
>> ```
>
> I did not consider this case, thank you for bringing it up. I have always
> found this behaviour a bit strange and have never used it "in the real
> world". It is also not consistent between collections. Here is an example
> with four collections: ArrayList, CopyOnWriteArrayList, ConcurrentSkipListSet
> and ArrayBlockingQueue:
>
>
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.Collection;
> import java.util.List;
> import java.util.Objects;
> import java.util.concurrent.ArrayBlockingQueue;
> import java.util.concurrent.ConcurrentSkipListSet;
> import java.util.concurrent.CopyOnWriteArrayList;
> import java.util.function.Supplier;
> import java.util.stream.IntStream;
>
> public class LazyStreamDemo {
>     public static void main(String... args) {
>         List<Supplier<Collection<String>>> suppliers =
>                 List.of(ArrayList::new, // fail-fast
>                         CopyOnWriteArrayList::new, // snapshot
>                         ConcurrentSkipListSet::new, // weakly-consistent
>                         () -> new ArrayBlockingQueue<>(10) // weakly-consistent
>                 );
>         for (Supplier<Collection<String>> supplier : suppliers) {
>             Collection<String> c = supplier.get();
>             System.out.println(c.getClass());
>             IntStream stream = c.stream()
>                     .sorted()
>                     .filter(Objects::nonNull)
>                     .mapToInt(String::length)
>                     .sorted();
>
>             c.addAll(List.of("one", "two", "three", "four", "five"));
>             System.out.println("stream = " +
>                     Arrays.toString(stream.toArray()));
>         }
>     }
> }
>
>
> The output is:
>
>
> class java.util.ArrayList
> stream = [3, 3, 4, 4, 5]
> class java.util.concurrent.CopyOnWriteArrayList
> stream = []
> class java.util.concurrent.ConcurrentSkipListSet
> stream = []
> class java.util.concurrent.ArrayBlockingQueue
> stream = [3, 3, 4, 4, 5]
>
>
> At least with the EmptyStream we would have consistent output of always []

@kabutz I agree that I wouldn't consider it clean code to use a stream the
way I did in the example. I only brought it up because it might break
existing code, since I think streams are expected to be lazy. Interesting to
see that they are in fact not lazy in all situations - I put that into my
notes.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6275
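
For reference, here is a minimal sketch of the compatibility concern discussed in this thread. It is not the code from the PR: `EmptyCheckSketch` and `streamOrEmpty` are hypothetical names, and the helper only mimics the idea of checking isEmpty() at the moment stream() is called. With ArrayList, whose spliterator is late-binding, the plain stream sees elements added before the terminal operation, while the empty-check variant has already committed to an empty stream.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EmptyCheckSketch {

    // Hypothetical helper (an assumption, not the PR's implementation):
    // decides at call time whether the collection is empty and, if so,
    // returns a fixed empty stream instead of a stream over the collection.
    static <T> Stream<T> streamOrEmpty(Collection<T> c) {
        return c.isEmpty() ? Stream.empty() : c.stream();
    }

    // Plain ArrayList.stream(): the source is bound only when the
    // terminal operation starts, so the later addAll() is visible.
    static List<String> lateBound() {
        List<String> list = new ArrayList<>();
        Stream<String> stream = list.stream();
        list.addAll(List.of("one", "two", "three"));
        return stream.collect(Collectors.toList());
    }

    // Same sequence of calls, but the empty check has already run
    // before the elements were added, so the result stays empty.
    static List<String> shortCircuited() {
        List<String> list = new ArrayList<>();
        Stream<String> stream = streamOrEmpty(list);
        list.addAll(List.of("one", "two", "three"));
        return stream.collect(Collectors.toList());
    }

    public static void main(String... args) {
        System.out.println("late-binding: " + lateBound());      // [one, two, three]
        System.out.println("empty check:  " + shortCircuited()); // []
    }
}
```

Whether code that relies on the first behaviour exists in practice is the open question here; the sketch only shows that the two variants can be told apart.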