On Sun, 7 Nov 2021 06:26:22 GMT, kabutz <[email protected]> wrote:

>> (immutable collections could override stream() instead, since they don't 
>> have that problem)
>
>> The net effect of this change might depend on your workload. If you call 
>> stream() on empty collections that have cheap isEmpty(), this change will 
>> likely improve performance and reduce waste. However, this same change might 
>> do the opposite if some of your collections aren't empty or have costly 
>> isEmpty(). It would be good to have benchmarks for different workloads.
> 
> Yes, I also thought about the cost of isEmpty() on concurrent collections. 
> There are four concurrent collections that have a linear time cost size() 
> method: CLQ, CLD, LTQ and CHM. However, in each of these cases, the isEmpty() 
> method has constant time cost. There might be collections defined outside the 
> JDK where this could be the case.
> 
> However, I will extend the benchmark to include a few of those cases too, as 
> well as different sizes and collection sizes.
> 
> Thank you so much for your input.

> wouldn't this make streams no longer lazy if the collection is empty?
> 
> ```java
>         List<String> list = new ArrayList<>();
>         Stream<String> stream = list.stream();
> 
>         list.addAll(List.of("one", "two", "three"));
> 
>         stream.forEach(System.out::println); // prints one two three
> ```

I did not consider this case; thank you for bringing it up. I have always found 
this behaviour a bit strange and have never used it "in the real world". It is 
also not consistent between collections. Here is an example that builds a stream 
pipeline on four initially empty collections (ArrayList, CopyOnWriteArrayList, 
ConcurrentSkipListSet and ArrayBlockingQueue) and adds elements before the 
terminal operation runs:


```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;
import java.util.stream.IntStream;

public class LazyStreamDemo {
    public static void main(String... args) {
        List<Supplier<Collection<String>>> suppliers =
                List.of(ArrayList::new, // fail-fast
                        CopyOnWriteArrayList::new, // snapshot
                        ConcurrentSkipListSet::new, // weakly-consistent
                        () -> new ArrayBlockingQueue<>(10) // weakly-consistent
                );
        for (Supplier<Collection<String>> supplier : suppliers) {
            Collection<String> c = supplier.get();
            System.out.println(c.getClass());
            // Build the pipeline while the collection is still empty.
            IntStream stream = c.stream()
                    .sorted()
                    .filter(Objects::nonNull)
                    .mapToInt(String::length)
                    .sorted();

            // Add elements after stream() but before the terminal operation.
            c.addAll(List.of("one", "two", "three", "four", "five"));
            System.out.println("stream = " + Arrays.toString(stream.toArray()));
        }
    }
}
```


The output is:


```
class java.util.ArrayList
stream = [3, 3, 4, 4, 5]
class java.util.concurrent.CopyOnWriteArrayList
stream = []
class java.util.concurrent.ConcurrentSkipListSet
stream = []
class java.util.concurrent.ArrayBlockingQueue
stream = [3, 3, 4, 4, 5]
```


At least with the EmptyStream, the output would consistently be [] for all four 
collections.
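
To make the difference concrete, here is a minimal sketch of what an eager 
emptiness check does to an ArrayList. The helper `streamOf` is hypothetical and 
only stands in for the idea; it uses `Stream.empty()` rather than the 
EmptyStream from the PR, so treat it as an illustration under that assumption, 
not as the proposed implementation:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Stream;

public class EagerEmptyStreamSketch {
    // Hypothetical helper (not from the PR): decides at stream-creation time
    // whether the collection is empty, instead of binding at the terminal op.
    static <T> Stream<T> streamOf(Collection<T> c) {
        return c.isEmpty() ? Stream.empty() : c.stream();
    }

    public static void main(String... args) {
        // Current behaviour: ArrayList's spliterator is late-binding, so
        // elements added before the terminal operation are seen.
        List<String> lateBound = new ArrayList<>();
        Stream<String> lazy = lateBound.stream();
        lateBound.addAll(List.of("one", "two", "three"));
        System.out.println("late-binding count = " + lazy.count());

        // With the eager check, the stream was fixed as empty when created,
        // so the later additions are not seen.
        List<String> checkedEarly = new ArrayList<>();
        Stream<String> eager = streamOf(checkedEarly);
        checkedEarly.addAll(List.of("one", "two", "three"));
        System.out.println("eager-check count = " + eager.count());
    }
}
```

In this sketch the first stream sees the three added elements and the second 
sees none, which matches what CopyOnWriteArrayList and ConcurrentSkipListSet 
already do in the demo above.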

-------------

PR: https://git.openjdk.java.net/jdk/pull/6275
