Re: RFR: 8180352: Add Stream.toList() method

2021-02-05 Thread Brian Goetz




I have been reading previous threads, the original bug request, and exploring 
the javadoc and implementation of toList() on Stream in JDK 16. I don’t want to 
waste time rehashing previous discussions, but I want to understand and 
prioritize the motivation of this change, and propose what I believe is a safer 
alternative name for this method based on the current implementation: 
Stream.toUnmodifiableList().


Big +1 to "let's not rehash previous discussions, but help us understand 
the motivation."  Stewarding the core libraries is a complex task, and 
there are rarely hard-and-fast rules for doing the Right Thing.


Your question seems to have two main aspects:

 - Why this method, why not others, and why now
 - Why take such a strong anti-mutability position with this method

The desire for a Stream::toList method has a long history; when we first 
did streams, it was one of the first convenience methods to be 
"requested".  We resisted then, for good reasons, but we knew this saga 
was not over.


"Convenience" methods are a constant challenge in the JDK.  On the one 
hand, they are, well, convenient, and we want Java to be easy and 
pleasant to program in.  On the other, the number of potentially-useful 
imaginable convenience methods is infinite, and the widespread 
perception is that they are so easy, that all that is needed is for 
someone to propose the idea.  (The (admittedly soft) criteria we use for 
judging whether a convenience method meets the bar is an interesting 
one, which we can have separately.)


There are basically two stable points with respect to convenience 
methods in API design; zero tolerance, and "don't worry, be happy". In 
the former, the methods of an API are like a basis (ideally, an 
orthonormal one) of a vector space; the minimum number of API points 
from which you can derive all possible usages.  At the other extreme, 
every reasonable combination of methods gets its own special form of 
expression.  Of course, both are extremes (Stream::count and 
IntStream::sum are conveniences for reduce, and even Haskell's Monad has 
multiple ways to represent bind), but APIs tend to align themselves in 
one direction or another.  And, as the JDK APIs go, Streams treats 
sparsity and orthogonality as virtues to be striven for.


Eclipse Collections chooses a different (and also valid!) philosophy: 
completeness, and it walks the walk.  (Having 81 (template-generated) 
implementations of HashMap is proof.) Similarly, Tagir's StreamEx is an 
example of an extension to Stream that takes the other approach.  And 
both are great!  But also, they are not how the JDK rolls.  Which is 
fine; it's a big ecosystem, and there's room for multiple philosophies, 
and each can find its fans and detractors.


The calls for a convenience for Stream::toList have come pretty much 
continuously since we first resisted it (but, we knew even then that if 
we had a lifetime budget for just one convenience method, it would end 
up being toList.)  We knew then that there would be questions to ask 
about what the ideal dial settings would be for toList, and were not yet 
ready to confront the question, nor did we want to add fuel to the 
demands for more convenience methods ("No toSet?  Inconsistent!")


When an API is new, and all things are possible, we tend to be in 
"imagine everything we could put into it" mode, and streams was no 
different.  It is wise to resist this temptation -- and maybe even 
over-rotate in the other direction -- to allow for some time for the 
spirit of what you've built to make itself clear; even creators are not 
always immediately clear on the nuances of their creation.  So we tried 
hard to resist the calls for unnecessary methods, knowing that they 
could always be added, but not taken away, and also, allowing for the 
true gaps to emerge from usage.  (The first method to be added, 
takeWhile(), was the very opposite of a convenience; it represented a 
reasonable use case that the original design didn't support.)


So, why toList now?  Well, a number of reasons.  Collecting to a list is 
one of the most common terminal operations, so any small irritant (like 
a clumsy locution) adds up.  And, as has been pointed out, it can be 
more efficient if it is brought into the stream core rather than held at 
arm's length through Collector.  So if we're going to compromise our 
principles in one place, after thinking about it for a long time, this 
seemed a worthy candidate.  (And still, we hesitate, because we knew it 
would be firing the starting gun for the "But where's toSet?" arguments.)


So yes, there are lots of good reasons to continue to Just Say No to 
conveniences, but, there are also reasonable times to make exceptions -- 
especially when it is not purely about convenience. And, data suggests 
that toList is 5-10x more popular than the next most popular collector, 
so there's a clear argument to say that toList is pretty special, and we 
can stop there.



List 

RFR: 8180352: Add Stream.toList() method

2021-02-04 Thread Donald Raab
I have been reading previous threads, the original bug request, and exploring 
the javadoc and implementation of toList() on Stream in JDK 16. I don’t want to 
waste time rehashing previous discussions, but I want to understand and 
prioritize the motivation of this change, and propose what I believe is a safer 
alternative name for this method based on the current implementation: 
Stream.toUnmodifiableList().


The original justification for Stream.toList() as described in the bug request 
filed in 2017 was: "It makes the coding just easier, saves some time and place.”

see: https://bugs.openjdk.java.net/browse/JDK-8180352 


Summarizing my interpretation of the eventual motivations for the change, over 
various threads (please correct as needed):

1. Convenience (easier to discover API and less code for developers to write - 
potential refactoring of large number of unmodifiable usages of 
Collectors.toList())
2. Performance (less copying during parallel operations and pre-sizing for 
serial operations)
3. Unmodifiability (List result but thread-safe and usable in static contexts 
and as hash keys - unsafe for mutating operations)
4. Support for nulls (Inconsistent with Collectors.toUnmodifiableList() but 
consistent with other Stream methods)


List is a mutable interface. It’s contract allows for “conditional mutability”. 
A convention was established in 2014 with Collectors.toList() returning a 
mutable List (ArrayList). The spec in Collectors.toList() says there is no 
guarantee on the type, but there appears to be no appetite to undo this. The 
current Collectors.toList() convention was then further strengthened with the 
addition of Collectors.toUnmodifiableList() in Java 10 in 2018. 


The problem as I see it is that we are currently only looking inside of the JDK 
for convention. The convention of using the toList() name has existed in other 
libraries and application code for as long as List has been available as an 
interface. The problem of inconsistency of convention within the JDK itself can 
be argued away as a historical mistake, but harder to claim for the 
"established convention" in other libraries.


We learned an important lesson in JDK 15 when the addition of a default isEmpty 
method to CharSequence broke Eclipse Collections. See this excellent blog from 
Stuart Marks: 
https://stuartmarks.wordpress.com/2020/09/22/incompatibilities-with-jdk-15-charsequence-isempty/.


The Stream interface has existed for seven years now. The method toList() is a 
prime candidate for anyone who wanted to optimize for convenience over the past 
seven years.


I found this default definition of toList() on the Folds interface in Cyclops:
https://github.com/aol/cyclops/blob/dc26bdad0f2b8a5aa87de4aa946ed52a0dbf00de/cyclops/src/main/java/com/oath/cyclops/types/foldable/Folds.java#L99


I discovered these occurrences of toList() in the jOOQ library.
https://github.com/jOOQ/jOOL/blob/main/jOOL/src/main/java/org/jooq/lambda/Collectable.java#L682
 
https://github.com/jOOQ/jOOL/blob/main/jOOL/src/main/java/org/jooq/lambda/SeqImpl.java#L557
 


Incidentally, this interface also defines toUnmodifiableList().
https://github.com/jOOQ/jOOL/blob/main/jOOL/src/main/java/org/jooq/lambda/Collectable.java#L692
 


jOOQ did not define it’s toList() method as a default method on Collectable, so 
it should not trigger the default method issue we discovered in JDK 15, but the 
specifications of toList() on Collectable and Stream are incompatible. Which 
specification should win?


Eclipse Collections consistent convention in all of its interfaces is that 
toList returns a MutableList [1]. MutableList extends java.util.List, which is 
a mutable interface. The Stream.toList() method will violate the principle of 
least surprise for Eclipse Collections users based on long time established 
conventions for the the library. There may be other libraries and applications 
that have defined their own toList() methods to return List as well.


Lastly, Craig Motlin explains the performance optimization we use for toList in 
our parallel implementation in his talk in 2014: 
https://www.infoq.com/presentations/java-streams-scala-parallel-collections/


Thanks,
Don


[1] Example usages of Eclipse Collections toList:


// toList result is mutable for all of these usages with Eclipse Collections
List list1 = mutableSet.toList();
List list2 = mutableSet.asLazy().toList(); 
List list3 = mutableSet.asParallel(Executors.newWorkStealingPool(), 
10).toList();
List list4 = mutableSet.stream().collect(Collectors.toList());
List list5 = mutableSet.stream().collect(Collectors2.toList());


// toList result is currently unmodifiable in Stream
List list6 = mutableSet.stream().toList();

Re: RFR: 8180352: Add Stream.toList() method [v4]

2020-11-30 Thread Stuart Marks
On Wed, 25 Nov 2020 17:14:43 GMT, Peter Levart  wrote:

>> An alternative with similar performance would be to do a Stream.toArray() 
>> and then copy that array into new Object[] and then wrap that copy with 
>> listFromTrustedArrayNullsAllowed(). The difference would be in the 
>> serialization format of the resulting List and maybe also in the access 
>> performance of resulting List (no indirection via the unmodifiableList 
>> wrapper and different type for JIT to speculate about). So if we want the 
>> resulting List to behave exactly the same in both implementations of 
>> toList(), then this alternative might be preferable. WDYT?
>
> Such alternative might also be faster using intrisified array copying method 
> (here measuring just copying overhead):
> 
> Benchmark  (len)  Mode  Cnt   Score  Error  Units
> ToListBench.toList1   10  avgt   10  14.213 ±0.061  ns/op
> ToListBench.toList1 1000  avgt   10 541.883 ±3.845  ns/op
> ToListBench.toList1  100  avgt   10  753223.523 ± 4656.664  ns/op
> ToListBench.toList2   10  avgt   10   8.810 ±0.052  ns/op
> ToListBench.toList2 1000  avgt   10 264.748 ±0.807  ns/op
> ToListBench.toList2  100  avgt   10  349518.502 ± 3242.061  ns/op
> 
> https://gist.github.com/plevart/974b67b65210f8dd122773f481c0a603

I don't want to change the default implementation and its specification, for a 
couple reasons. First, this implementation won't be used in normal operation, 
as it's overridden by the JDK implementation. (I expect that stream extension 
libraries will be updated to override it as well.) Second, I'm quite 
uncomfortable using an internal method from within a default implementation. 
Right now this method is an agreement between the unmodifiable collections 
implementation and the ReferencePipeline implementation, and we have complete 
freedom to change this agreement at any time. If this internal method were used 
from a default implementation, its behavior would have to be specified in 
`@implSpec` somehow. Certainly one could write some very vague words for the 
specification that allow some freedom to make changes, but this has to choose 
between a better specification and more implementation freedom. I don't see the 
value in having to make this tradeoff at all.

An extra copy can be avoided via a private agreement between ArrayList and 
Arrays.asList, which should probably be done in any case.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v4]

2020-11-25 Thread Paul Sandoz
On Tue, 24 Nov 2020 07:10:14 GMT, Stuart Marks  wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> Stuart Marks has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add comment and assert regarding array class; use switch expression.

Recommend you review the problem list of this source file in IntelliJ, which 
shows some potential cleanups. This can be done as a follow on PR if 
appropriate, as can any improvements to the default implementation.

src/java.base/share/classes/java/util/ImmutableCollections.java line 163:

> 161:  *
> 162:  * @param  the List's element type
> 163:  * @param input the input array

Rogue parameter declaration.

-

Marked as reviewed by psandoz (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v4]

2020-11-25 Thread Peter Levart
On Wed, 25 Nov 2020 14:07:07 GMT, Peter Levart  wrote:

>> This is the default implementation in the Stream interface, which is 
>> overridden by an implementation in the ReferencePipeline class, so it will 
>> rarely be used in normal operation. The ReferencePipeline version (part of 
>> this changeset) is based on toArray() and avoids any copying. I'm thus not 
>> inclined to add new interfaces in order to support this default 
>> implementation.
>> 
>> As written it's true that the default implementation does perform apparently 
>> redundant copies, but we can't be assured that toArray() actually returns a 
>> freshly created array. Thus, we wrap it using Arrays.asList and then copy it 
>> using the ArrayList constructor. This is unfortunate but necessary to avoid 
>> situations where someone could hold a reference to the internal array of a 
>> List, allowing modification of a List that's supposed to be unmodifiable.
>
> An alternative with similar performance would be to do a Stream.toArray() and 
> then copy that array into new Object[] and then wrap that copy with 
> listFromTrustedArrayNullsAllowed(). The difference would be in the 
> serialization format of the resulting List and maybe also in the access 
> performance of resulting List (no indirection via the unmodifiableList 
> wrapper and different type for JIT to speculate about). So if we want the 
> resulting List to behave exactly the same in both implementations of 
> toList(), then this alternative might be preferable. WDYT?

Such alternative might also be faster using intrisified array copying method 
(here measuring just copying overhead):

Benchmark  (len)  Mode  Cnt   Score  Error  Units
ToListBench.toList1   10  avgt   10  14.213 ±0.061  ns/op
ToListBench.toList1 1000  avgt   10 541.883 ±3.845  ns/op
ToListBench.toList1  100  avgt   10  753223.523 ± 4656.664  ns/op
ToListBench.toList2   10  avgt   10   8.810 ±0.052  ns/op
ToListBench.toList2 1000  avgt   10 264.748 ±0.807  ns/op
ToListBench.toList2  100  avgt   10  349518.502 ± 3242.061  ns/op

https://gist.github.com/plevart/974b67b65210f8dd122773f481c0a603

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v4]

2020-11-25 Thread Peter Levart
On Tue, 24 Nov 2020 03:46:38 GMT, Stuart Marks  wrote:

>> In `ReferencePipeline` class we have:
>> 
>> @Override
>> public final Object[] toArray() {
>> return toArray(Object[]::new);
>> }
>> 
>> @Override
>> @SuppressWarnings("unchecked")
>> public final  A[] toArray(IntFunction generator) {
>> ...
>> @SuppressWarnings("rawtypes")
>> IntFunction rawGenerator = (IntFunction) generator;
>> return (A[]) Nodes.flatten(evaluateToArrayNode(rawGenerator), 
>> rawGenerator)
>>   .asArray(rawGenerator);
>> }
>> 
>> 
>> In `Nodes` class we have:
>> 
>> 
>> public static  Node flatten(Node node, IntFunction 
>> generator) {
>> if (node.getChildCount() > 0) {
>> ... 
>> T[] array = generator.apply((int) size);
>> new ToArrayTask.OfRef<>(node, array, 0).invoke();
>> return node(array);
>> } else {
>> return node;
>> }
>> }
>> 
>> 
>> 
>> It looks like it is required to implement `toList` method in a similar way 
>> in order to avoid array copy. i.e. there will be `IntFunction> 
>> generator` which will generate 'ArrayList' with specified size and the 
>> list's `add` method will be called to add elements to the list.
>
> This is the default implementation in the Stream interface, which is 
> overridden by an implementation in the ReferencePipeline class, so it will 
> rarely be used in normal operation. The ReferencePipeline version (part of 
> this changeset) is based on toArray() and avoids any copying. I'm thus not 
> inclined to add new interfaces in order to support this default 
> implementation.
> 
> As written it's true that the default implementation does perform apparently 
> redundant copies, but we can't be assured that toArray() actually returns a 
> freshly created array. Thus, we wrap it using Arrays.asList and then copy it 
> using the ArrayList constructor. This is unfortunate but necessary to avoid 
> situations where someone could hold a reference to the internal array of a 
> List, allowing modification of a List that's supposed to be unmodifiable.

An alternative with similar performance would be to do a Stream.toArray() and 
then copy that array into new Object[] and then wrap that copy with 
listFromTrustedArrayNullsAllowed(). The difference would be in the 
serialization format of the resulting List and maybe also in the access 
performance of resulting List (no indirection via the unmodifiableList wrapper 
and different type for JIT to speculate about). So if we want the resulting 
List to behave exactly the same in both implementations of toList(), then this 
alternative might be preferable. WDYT?

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v4]

2020-11-25 Thread Remi Forax
- Mail original -
> De: "Stuart Marks" 
> À: "core-libs-dev" 
> Envoyé: Mardi 24 Novembre 2020 08:10:14
> Objet: Re: RFR: 8180352: Add Stream.toList() method [v4]

>> This change introduces a new terminal operation on Stream. This looks like a
>> convenience method for Stream.collect(Collectors.toList()) or
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this
>> method directly on Stream enables it to do what can't easily by done by a
>> Collector. In particular, it allows the stream to deposit results directly 
>> into
>> a destination array (even in parallel) and have this array be wrapped in an
>> unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental
>> things (like toArray) appear directly on Stream. This is true of most
>> Collections, but it does seem that List is special. It can be a thin wrapper
>> around an array; it can handle generics better than arrays; and unlike an
>> array, it can be made unmodifiable (shallowly immutable); and it can be
>> value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't
>> specified, though; a general statement about null handling in Streams is
>> probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with
>> what this operation returns), as collecting into a List is the most common
>> stream terminal operation.
> 
> Stuart Marks has updated the pull request incrementally with one additional
> commit since the last revision:
> 
>  Add comment and assert regarding array class; use switch expression.
> 
> -

Hi Stuart,
This version is Ok for me.

I still think that using a null friendly List is a mistake, but you, Brian and 
John all think it's not an issue, so let's go with it.

> 
> Changes:
>  - all: https://git.openjdk.java.net/jdk/pull/1026/files
>  - new: https://git.openjdk.java.net/jdk/pull/1026/files/15beacd2..bd890ae5
> 
> Webrevs:
> - full: https://webrevs.openjdk.java.net/?repo=jdk=1026=03
> - incr: https://webrevs.openjdk.java.net/?repo=jdk=1026=02-03
> 
>  Stats: 20 lines in 1 file changed: 4 ins; 4 del; 12 mod
>  Patch: https://git.openjdk.java.net/jdk/pull/1026.diff
>  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1026/head:pull/1026
> 
> PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-24 Thread Stuart Marks




On 11/18/20 3:55 AM, Florian Weimer wrote:

I think it's also needed for an efficient null element check in
List::copyOf.  I have a hunch that with the posted implementation, it
would incorrectly produce an immutable list that contains null
elements.


(Sorry for the delay; oddly, this comment didn't make it into the PR.)

Yes, there was a bug in List::copyOf in that it would pass through a list containing 
nulls, a clear violation of its spec. I fixed this in my second-most recent commit 
on this PR.


s'marks



Re: RFR: 8180352: Add Stream.toList() method [v4]

2020-11-23 Thread Stuart Marks
> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

Stuart Marks has updated the pull request incrementally with one additional 
commit since the last revision:

  Add comment and assert regarding array class; use switch expression.

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1026/files
  - new: https://git.openjdk.java.net/jdk/pull/1026/files/15beacd2..bd890ae5

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=1026=03
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1026=02-03

  Stats: 20 lines in 1 file changed: 4 ins; 4 del; 12 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1026.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1026/head:pull/1026

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-23 Thread Stuart Marks
On Sat, 21 Nov 2020 11:02:27 GMT, Rémi Forax 
 wrote:

>> Stuart Marks has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Adjust List.copyOf to null-check and copy allowNulls lists.
>>   Fix equals, hashCode, indexOf, lastIndexOf to handle nulls properly.
>>   Add MOAT tests for new lists; add equals and hashCode tests.
>
> src/java.base/share/classes/java/util/ImmutableCollections.java line 222:
> 
>> 220: default:
>> 221: return (List) new ListN<>(input, false);
>> 222: }
> 
> Using a switch expression + arrow (->) here will allow you too factor the 
> cast, even if it disappear in the generated bytecode, I think it makes the 
> code more readable

Heh yes, I haven't gotten used to using those yet!

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-23 Thread Stuart Marks
On Sat, 21 Nov 2020 10:58:49 GMT, Rémi Forax 
 wrote:

>> It's an implementation invariant that the internal array be Object[]. Having 
>> it be something other than Object[] can lead to subtle bugs. See 
>> [JDK-6260652](https://bugs.openjdk.java.net/browse/JDK-6260652) for example.
>
> you can still calls the varargs with an already created array
> listFromTrustedArray(new String[] { "foo" });
> 
> I think at least an assert is missing
> assert input.getClass() == Object.class;

If the parameter were of type `E...`, then a call such as 
`listFromTrustedArray("foo", "bar")` would result in a `String[]`. The idea is 
to avoid accidents such as the one that caused JDK-6260652, and not prevent 
deliberate passing of an array of the wrong type, though this is an internal 
interface so it's unlikely to occur. Still, a stronger comment and an assert 
might be worthwhile.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-23 Thread Stuart Marks
On Mon, 23 Nov 2020 14:06:02 GMT, sergus13 
 wrote:

>> This implementation duplicates the array twice, once in this.toArray() and 
>> once in the constructor of ArrayList that calls toArray() on 
>> Collections.ArrayList.
>> 
>> I'm sure there is a better way
>
> In `ReferencePipeline` class we have:
> 
> @Override
> public final Object[] toArray() {
> return toArray(Object[]::new);
> }
> 
> @Override
> @SuppressWarnings("unchecked")
> public final  A[] toArray(IntFunction generator) {
> ...
> @SuppressWarnings("rawtypes")
> IntFunction rawGenerator = (IntFunction) generator;
> return (A[]) Nodes.flatten(evaluateToArrayNode(rawGenerator), 
> rawGenerator)
>   .asArray(rawGenerator);
> }
> 
> 
> In `Nodes` class we have:
> 
> 
> public static  Node flatten(Node node, IntFunction 
> generator) {
> if (node.getChildCount() > 0) {
> ... 
> T[] array = generator.apply((int) size);
> new ToArrayTask.OfRef<>(node, array, 0).invoke();
> return node(array);
> } else {
> return node;
> }
> }
> 
> 
> 
> It looks like it is required to implement `toList` method in a similar way in 
> order to avoid array copy. i.e. there will be `IntFunction> 
> generator` which will generate 'ArrayList' with specified size and the list's 
> `add` method will be called to add elements to the list.

This is the default implementation in the Stream interface, which is overridden 
by an implementation in the ReferencePipeline class, so it will rarely be used 
in normal operation. The ReferencePipeline version (part of this changeset) is 
based on toArray() and avoids any copying. I'm thus not inclined to add new 
interfaces in order to support this default implementation.

As written it's true that the default implementation does perform apparently 
redundant copies, but we can't be assured that toArray() actually returns a 
freshly created array. Thus, we wrap it using Arrays.asList and then copy it 
using the ArrayList constructor. This is unfortunate but necessary to avoid 
situations where someone could hold a reference to the internal array of a 
List, allowing modification of a List that's supposed to be unmodifiable.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-23 Thread sergus13
On Sat, 21 Nov 2020 11:22:55 GMT, Rémi Forax 
 wrote:

>> `listFromTrustedArrayNullsAllowed` is clearly not an option, as it will 
>> produce shared-secret leaking (see 
>> [JDK-8254090](https://bugs.openjdk.java.net/browse/JDK-8254090) for a 
>> similar case). StreamEx solution is dirty as it relies on the implementation 
>> detail. I believe, OpenJDK team is not very interested in providing optimal 
>> implementations for third-party stream implementations, as third-party 
>> implementations will likely update by themselves when necessary. At least, 
>> my suggestion to make the default `mapMulti` implementation better was 
>> declined.
>
> This implementation duplicates the array twice, once in this.toArray() and 
> once in the constructor of ArrayList that calls toArray() on 
> Collections.ArrayList.
> 
> I'm sure there is a better way

In `ReferencePipeline` class we have:

@Override
public final Object[] toArray() {
return toArray(Object[]::new);
}

@Override
@SuppressWarnings("unchecked")
public final  A[] toArray(IntFunction generator) {
...
@SuppressWarnings("rawtypes")
IntFunction rawGenerator = (IntFunction) generator;
return (A[]) Nodes.flatten(evaluateToArrayNode(rawGenerator), 
rawGenerator)
  .asArray(rawGenerator);
}


In `Nodes` class we have:


public static  Node flatten(Node node, IntFunction generator) 
{
if (node.getChildCount() > 0) {
... 
T[] array = generator.apply((int) size);
new ToArrayTask.OfRef<>(node, array, 0).invoke();
return node(array);
} else {
return node;
}
}



It looks like it is required to implement `toList` method in a similar way in 
order to avoid array copy. i.e. there will be `IntFunction> generator` 
which will generate 'ArrayList' with specified size and the list's `add` method 
will be called to add elements to the list.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-21 Thread Rémi Forax
On Wed, 4 Nov 2020 09:21:12 GMT, Tagir F. Valeev  wrote:

>> src/java.base/share/classes/java/util/stream/Stream.java line 1192:
>> 
>>> 1190: @SuppressWarnings("unchecked")
>>> 1191: default List toList() {
>>> 1192: return (List) Collections.unmodifiableList(new 
>>> ArrayList<>(Arrays.asList(this.toArray(;
>> 
>> Why can't we return `listFromTrustedArrayNullsAllowed` here as in 
>> `ReferencePipeline`?
>> Or at least, we should avoid unnecessary copying of arrays. See [how this is 
>> done](https://github.com/amaembo/streamex/blob/master/src/main/java/one/util/streamex/AbstractStreamEx.java#L1313)
>>  in StreamEx.
>
> `listFromTrustedArrayNullsAllowed` is clearly not an option, as it will 
> produce shared-secret leaking (see 
> [JDK-8254090](https://bugs.openjdk.java.net/browse/JDK-8254090) for a similar 
> case). StreamEx solution is dirty as it relies on the implementation detail. 
> I believe, OpenJDK team is not very interested in providing optimal 
> implementations for third-party stream implementations, as third-party 
> implementations will likely update by themselves when necessary. At least, my 
> suggestion to make the default `mapMulti` implementation better was declined.

This implementation duplicates the array twice, once in this.toArray() and once 
in the constructor of ArrayList that calls toArray() on Collections.ArrayList.

I'm sure there is a better way

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-21 Thread Rémi Forax
On Wed, 18 Nov 2020 22:20:26 GMT, Stuart Marks  wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> Stuart Marks has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Adjust List.copyOf to null-check and copy allowNulls lists.
>   Fix equals, hashCode, indexOf, lastIndexOf to handle nulls properly.
>   Add MOAT tests for new lists; add equals and hashCode tests.

src/java.base/share/classes/java/util/ImmutableCollections.java line 222:

> 220: default:
> 221: return (List) new ListN<>(input, false);
> 222: }

Using a switch expression + arrow (->) here will allow you too factor the cast, 
even if it disappear in the generated bytecode, I think it makes the code more 
readable

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-21 Thread Rémi Forax
On Thu, 5 Nov 2020 17:26:48 GMT, Stuart Marks  wrote:

>> src/java.base/share/classes/java/util/ImmutableCollections.java line 199:
>> 
>>> 197:  * safely reused as the List's internal storage, avoiding a 
>>> defensive copy. Declared
>>> 198:  * with Object... instead of E... as the parameter type so that 
>>> varargs calls don't
>>> 199:  * accidentally create an array of type other than Object[].
>> 
>> Why would that be a problem? If the resulting list is immutable, then the 
>> actual array type doesn't really matter, right?
>
> It's an implementation invariant that the internal array be Object[]. Having 
> it be something other than Object[] can lead to subtle bugs. See 
> [JDK-6260652](https://bugs.openjdk.java.net/browse/JDK-6260652) for example.

you can still calls the varargs with an already created array
listFromTrustedArray(new String[] { "foo" });

I think at least an assert is missing
assert input.getClass() == Object.class;

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v3]

2020-11-18 Thread Stuart Marks
> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

Stuart Marks has updated the pull request incrementally with one additional 
commit since the last revision:

  Adjust List.copyOf to null-check and copy allowNulls lists.
  Fix equals, hashCode, indexOf, lastIndexOf to handle nulls properly.
  Add MOAT tests for new lists; add equals and hashCode tests.

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1026/files
  - new: https://git.openjdk.java.net/jdk/pull/1026/files/cf849755..15beacd2

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=1026=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1026=01-02

  Stats: 136 lines in 3 files changed: 104 ins; 22 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1026.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1026/head:pull/1026

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-18 Thread Stuart Marks
On Tue, 17 Nov 2020 20:04:58 GMT, Stuart Marks  wrote:

>>> @plevart wrote:
>>> 
>>> > But the question is how does having a separate CollSer.IMM_LIST_NULLS 
>>> > type prevent that from happening?
>>> 
>>> When a serialized list with IMM_LIST_NULLS is deserialized on an older JDK, 
>>> it'll throw InvalidObjectException since that tag isn't valid on older 
>>> JDKs. Obviously this is still an error, but it's a fail-fast approach that 
>>> avoids letting nulls leak into a data structure where they might cause a 
>>> problem some arbitrary time later.
>>> 
>> 
>> Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both 
>> apps run on JDK16+, there will be no exception.
>> 
>> What I'm trying to say is that the same problem of injecting unexpected 
>> nulls via serialization/deserialization can happen also if App V2 starts 
>> using ArrayList to construct the data structure and serialize it while App 
>> V1 deserializes it and expects non-null values only. App V1 would already 
>> have to guard against null values during deserialization in that case, 
>> because possibility of null values in deserialized data structure is nothing 
>> new for App V1.
>
> @plevart wrote:
>> Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both 
>> apps run on JDK16+, there will be no exception.
> 
> Sure, the IMM_LIST_NULLS tag only helps with serialization compatibility 
> across JDK releases. There are lots of ways an app can make incompatible 
> changes to the serialized forms of its objects that we have no way of 
> detecting.

>> Sure, the IMM_LIST_NULLS tag only helps with serialization compatibility 
>> across JDK releases.

> I would say it goes the other way - it worsens the serialization
compatibility.

OK, I was imprecise. The IMM_LIST_NULLS tag has an effect only on serialization 
across JDK releases, not changes to the application's serialization format 
using the same JDK release, or even on many changes to the app's serialization 
format across JDK releases. By "helps with serialization compatibility" I meant 
that this new serialized form helps the general issue of serialization 
compatibility (really, incompatibility) by failing fast in certain cases, 
instead of possibly allowing polluted data to leak into the receiving 
application and causing some arbitrary exception later during the run.

But as you noted last, this is a different kind of object, and it has different 
behavior, so it needs a different encoding in the serialized form.

I'll update this PR shortly with changes to fix null handling and other issues.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-18 Thread Remi Forax
- Mail original -
> De: "Florian Weimer" 
> À: "Peter Levart" 
> Cc: "Stuart Marks" , "core-libs-dev" 
> 
> Envoyé: Mercredi 18 Novembre 2020 12:55:02
> Objet: Re: RFR: 8180352: Add Stream.toList() method [v2]

> * Peter Levart:
> 
>> But I see that the new  IMM_LIST_NULLS type is needed for one other
>> thing - the immutable list implementation of that type has different
>> behavior of indexOf and lastIndexOf methods (it doesn't throw NPE when
>> null is passed to those methods) so this behavior has to be preserved in
>> the deserialized instance and there is not way to achieve that on old
>> JDK with existing serialization format, so there has to be an
>> incompatible change in the serialization format for that matter.
> 
> I think it's also needed for an efficient null element check in
> List::copyOf.  I have a hunch that with the posted implementation, it
> would incorrectly produce an immutable list that contains null
> elements.

yes !

Rémi


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-18 Thread Florian Weimer
* Peter Levart:

> But I see that the new  IMM_LIST_NULLS type is needed for one other 
> thing - the immutable list implementation of that type has different 
> behavior of indexOf and lastIndexOf methods (it doesn't throw NPE when 
> null is passed to those methods) so this behavior has to be preserved in 
> the deserialized instance and there is not way to achieve that on old 
> JDK with existing serialization format, so there has to be an 
> incompatible change in the serialization format for that matter.

I think it's also needed for an efficient null element check in
List::copyOf.  I have a hunch that with the posted implementation, it
would incorrectly produce an immutable list that contains null
elements.


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-18 Thread Peter Levart



On 11/17/20 9:08 PM, Stuart Marks wrote:

@plevart wrote:

Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both 
apps run on JDK16+, there will be no exception.

Sure, the IMM_LIST_NULLS tag only helps with serialization compatibility across 
JDK releases.



I would say it goes the other way - it worsens the serialization 
compatibility. You are introducing a serialization format that is 
incompatible with older JDK releases. OTOH the problem of introducing 
nulls into List(s) in serializable data structures already exists 
because serializable List implementations that allow nulls already 
exist, so by introducing another one that is able to be deserialized on 
older JDK would not worsen the situation.




  There are lots of ways an app can make incompatible changes to the serialized 
forms of its objects that we have no way of detecting.



So why bother with making the result of serialized stream.toList() not 
de-serializable on older JDK(s)? By making a favor to a hypothetical app 
that runs on older JDK and does not expect nulls in Lists of a data 
structure, you are also preventing an app that allows nulls in similar 
data structure from deserializing the datastructure on older JDK when 
the serialized form was produced with serializing stream.toList() on 
newer JDK.


But I see that the new  IMM_LIST_NULLS type is needed for one other 
thing - the immutable list implementation of that type has different 
behavior of indexOf and lastIndexOf methods (it doesn't throw NPE when 
null is passed to those methods) so this behavior has to be preserved in 
the deserialized instance and there is not way to achieve that on old 
JDK with existing serialization format, so there has to be an 
incompatible change in the serialization format for that matter.


Peter




Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-17 Thread Stuart Marks
On Tue, 10 Nov 2020 09:34:56 GMT, Peter Levart  wrote:

>> I can see that having a separate IMM_LIST_NULLS type might be necessary to 
>> preserve the allows-null/disallows-null behaviour of indexOf and lastIndexOf 
>> methods...
>> 
>> NOTE ALSO that ListN.equals(o) and ListN.hashCode() are inherited from 
>> AbstractImmutableList:
>> 
>> @Override
>> public boolean equals(Object o) {
>> if (o == this) {
>> return true;
>> }
>> 
>> if (!(o instanceof List)) {
>> return false;
>> }
>> 
>> Iterator oit = ((List) o).iterator();
>> for (int i = 0, s = size(); i < s; i++) {
>> if (!oit.hasNext() || !get(i).equals(oit.next())) {
>> return false;
>> }
>> }
>> return !oit.hasNext();
>> }
>> and
>> public int hashCode() {
>> int hash = 1;
>> for (int i = 0, s = size(); i < s; i++) {
>> hash = 31 * hash + get(i).hashCode();
>> }
>> return hash;
>> }
>> 
>> ...which means they will throw NPE when the list contains null. The same 
>> goes for SubList.
>
>> @plevart wrote:
>> 
>> > But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
>> > prevent that from happening?
>> 
>> When a serialized list with IMM_LIST_NULLS is deserialized on an older JDK, 
>> it'll throw InvalidObjectException since that tag isn't valid on older JDKs. 
>> Obviously this is still an error, but it's a fail-fast approach that avoids 
>> letting nulls leak into a data structure where they might cause a problem 
>> some arbitrary time later.
>> 
> 
> Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both 
> apps run on JDK16+, there will be no exception.
> 
> What I'm trying to say is that the same problem of injecting unexpected nulls 
> via serialization/deserialization can happen also if App V2 starts using 
> ArrayList to construct the data structure and serialize it while App V1 
> deserializes it and expects non-null values only. App V1 would already have 
> to guard against null values during deserialization in that case, because 
> possibility of null values in deserialized data structure is nothing new for 
> App V1.

@plevart wrote:
> Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both 
> apps run on JDK16+, there will be no exception.

Sure, the IMM_LIST_NULLS tag only helps with serialization compatibility across 
JDK releases. There are lots of ways an app can make incompatible changes to 
the serialized forms of its objects that we have no way of detecting.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-10 Thread Peter Levart
On Sun, 8 Nov 2020 15:55:58 GMT, Peter Levart  wrote:

>> Hi Stuart,
>> 
>> I would like to discuss the serialization. You introduce new 
>> CollSer.IMM_LIST_NULLS type of immutable collection. This means that if this 
>> change goes into JDK16 for example, JDK15 and before will not be able to 
>> deserialize such list as they know nothing about IMM_LIST_NULLS even if such 
>> lists don't contain nulls. The reason you say to chose new type of 
>> serialization format is the following:
>> 
>>> "Suppose I had an application that created a data structure that used lists 
>>> from List.of(), and I had a global assertion over that structure that it 
>>> contained no nulls. Further suppose that I serialized and deserizalized 
>>> this structure. I'd want that assertion to be preserved after 
>>> deserialization. If another app (or a future version of this app) created 
>>> the structure using Stream.to
>>>  List(), this would allow nulls to leak into that structure and violate 
>>> that assertion. Therefore, the result of Stream.toList() should not be 
>>> serialization-compatible with List.of() et. al. That's why there's the new 
>>> IMM_LIST_NULLS tag in the serial format"
>> 
>> I don't quite get this reasoning. Let's try to decompose the reasoning 
>> giving an example. Suppose we had the following data structure:
>> 
>> public class Names implements Serializable {
>>   private final List names;
>>   Names(List names) {
>> this.names = names;
>>   }
>>   public List names() { return names; }
>> }
>> 
>> App v1 creates such structures using new Names(List.of(...)) and 
>> serializes/deserializes them. They keep the invariant that no nulls are 
>> present. Now comes App v2 that starts using new Names(stream.toList()) which 
>> allows nulls to be present. When such Names instance from app v2 is 
>> serialized and then deserialized in app v1, nulls "leak" into data structure 
>> of app v1 that does not expect them.
>> 
>> But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
>> prevent that from happening?
>
> I can see that having a separate IMM_LIST_NULLS type might be necessary to 
> preserve the allows-null/disallows-null behaviour of indexOf and lastIndexOf 
> methods...
> 
> NOTE ALSO that ListN.equals(o) and ListN.hashCode() are inherited from 
> AbstractImmutableList:
> 
> @Override
> public boolean equals(Object o) {
> if (o == this) {
> return true;
> }
> 
> if (!(o instanceof List)) {
> return false;
> }
> 
> Iterator oit = ((List) o).iterator();
> for (int i = 0, s = size(); i < s; i++) {
> if (!oit.hasNext() || !get(i).equals(oit.next())) {
> return false;
> }
> }
> return !oit.hasNext();
> }
> and
> public int hashCode() {
> int hash = 1;
> for (int i = 0, s = size(); i < s; i++) {
> hash = 31 * hash + get(i).hashCode();
> }
> return hash;
> }
> 
> ...which means they will throw NPE when the list contains null. The same goes 
> for SubList.

> @plevart wrote:
> 
> > But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
> > prevent that from happening?
> 
> When a serialized list with IMM_LIST_NULLS is deserialized on an older JDK, 
> it'll throw InvalidObjectException since that tag isn't valid on older JDKs. 
> Obviously this is still an error, but it's a fail-fast approach that avoids 
> letting nulls leak into a data structure where they might cause a problem 
> some arbitrary time later.
> 

Yes, but that is JDK16+ vs. JDK15- and not App V1 vs. App V2 thing. If both 
apps run on JDK16+, there will be no exception.

What I'm trying to say is that the same problem of injecting unexpected nulls 
via serialization/deserialization can happen also if App V2 starts using 
ArrayList to construct the data structure and serialize it while App V1 
deserializes it and expects non-null values only. App V1 would already have to 
guard against null values during deserialization in that case, because 
possibility of null values in deserialized data structure is nothing new for 
App V1.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-09 Thread Stuart Marks
On Sun, 8 Nov 2020 15:55:58 GMT, Peter Levart  wrote:

>> Hi Stuart,
>> 
>> I would like to discuss the serialization. You introduce new 
>> CollSer.IMM_LIST_NULLS type of immutable collection. This means that if this 
>> change goes into JDK16 for example, JDK15 and before will not be able to 
>> deserialize such list as they know nothing about IMM_LIST_NULLS even if such 
>> lists don't contain nulls. The reason you say to chose new type of 
>> serialization format is the following:
>> 
>>> "Suppose I had an application that created a data structure that used lists 
>>> from List.of(), and I had a global assertion over that structure that it 
>>> contained no nulls. Further suppose that I serialized and deserizalized 
>>> this structure. I'd want that assertion to be preserved after 
>>> deserialization. If another app (or a future version of this app) created 
>>> the structure using Stream.to
>>>  List(), this would allow nulls to leak into that structure and violate 
>>> that assertion. Therefore, the result of Stream.toList() should not be 
>>> serialization-compatible with List.of() et. al. That's why there's the new 
>>> IMM_LIST_NULLS tag in the serial format"
>> 
>> I don't quite get this reasoning. Let's try to decompose the reasoning 
>> giving an example. Suppose we had the following data structure:
>> 
>> public class Names implements Serializable {
>>   private final List names;
>>   Names(List names) {
>> this.names = names;
>>   }
>>   public List names() { return names; }
>> }
>> 
>> App v1 creates such structures using new Names(List.of(...)) and 
>> serializes/deserializes them. They keep the invariant that no nulls are 
>> present. Now comes App v2 that starts using new Names(stream.toList()) which 
>> allows nulls to be present. When such Names instance from app v2 is 
>> serialized and then deserialized in app v1, nulls "leak" into data structure 
>> of app v1 that does not expect them.
>> 
>> But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
>> prevent that from happening?
>
> I can see that having a separate IMM_LIST_NULLS type might be necessary to 
> preserve the allows-null/disallows-null behaviour of indexOf and lastIndexOf 
> methods...
> 
> NOTE ALSO that ListN.equals(o) and ListN.hashCode() are inherited from 
> AbstractImmutableList:
> 
> @Override
> public boolean equals(Object o) {
> if (o == this) {
> return true;
> }
> 
> if (!(o instanceof List)) {
> return false;
> }
> 
> Iterator oit = ((List) o).iterator();
> for (int i = 0, s = size(); i < s; i++) {
> if (!oit.hasNext() || !get(i).equals(oit.next())) {
> return false;
> }
> }
> return !oit.hasNext();
> }
> and
> public int hashCode() {
> int hash = 1;
> for (int i = 0, s = size(); i < s; i++) {
> hash = 31 * hash + get(i).hashCode();
> }
> return hash;
> }
> 
> ...which means they will throw NPE when the list contains null. The same goes 
> for SubList.

@plevart wrote:
> But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
> prevent that from happening?

When a serialized list with IMM_LIST_NULLS is deserialized on an older JDK, 
it'll throw InvalidObjectException since that tag isn't valid on older JDKs. 
Obviously this is still an error, but it's a fail-fast approach that avoids 
letting nulls leak into a data structure where they might cause a problem some 
arbitrary time later.

> NOTE ALSO that ListN.equals(o) and ListN.hashCode() are inherited from 
> AbstractImmutableList [...] which means they will throw NPE when the list 
> contains null. The same goes for SubList.

Good catch! Yes, this is a problem. I'll do some rearranging here and add more 
test cases. Thanks for spotting this.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-08 Thread Alan Snyder



> On Nov 8, 2020, at 3:32 AM, fo...@univ-mlv.fr wrote:
> 
> It's a strawman but the effect is real, the more interfaces you have the less 
> you can reuse code because the code is written with the wrong interface for 
> your use case. 

There are points of compromise. If someone can accept misbehavior, they can use 
interfaces that are pure access interfaces, such as Iterator and Stream.

> Following the same reasoning, you can say that having interfaces saying that 
> collections are null hostile is even more important because it alleviate the 
> issue with NullPointerException. 

My own immutable collection classes are also null-intolerant. No explosion 
there. Use Optional instead.

> Array vs List is not something we can be very proud of, by example in 
> java.lang.invoke, we have several methods that are duplicated only because of 
> this dichotomy. 
> File vs Path is less an issue because it's between implementation classes, 
> not interfaces, anyway, you still have a number of methods inside 
> java.io/java.nio  that are duplicated.

Progress with backwards compatibility requires some duplication. It is nothing 
to be ashamed about.

I’m more ashamed that Type<> notation cannot be used with arrays and [] 
notation cannot be used with Lists.

  Alan



Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-08 Thread Peter Levart
On Sun, 8 Nov 2020 10:47:08 GMT, Peter Levart  wrote:

>>> Simon Roberts wrote:
>> 
>>> This discussion of unmodifiable lists brings me back to the thought that
>>> there would be good client-side reasons for inserting an UnmodifiableList
>>> interface as a parent of LIst, not least because all our unmodifiable
>>> variants from the Collections.unmodifiableList proxy onward fail the Liskov
>>> substitution test for actually "being contract-fulfilling Lists".
>> 
>> At some point there probably will need to be a long article explaining all 
>> the issues here, but at the moment the best writeup I have is this one:
>> 
>> https://stackoverflow.com/a/57926310/1441122
>> 
>> TL;DR there are a few different ways to approach retrofitting something like 
>> this, but they all have enough compromises that the net benefits are unclear.
>
> Hi Stuart,
> 
> I would like to discuss the serialization. You introduce new 
> CollSer.IMM_LIST_NULLS type of immutable collection. This means that if this 
> change goes into JDK16 for example, JDK15 and before will not be able to 
> deserialize such list as they know nothing about IMM_LIST_NULLS even if such 
> lists don't contain nulls. The reason you say to chose new type of 
> serialization format is the following:
> 
>> "Suppose I had an application that created a data structure that used lists 
>> from List.of(), and I had a global assertion over that structure that it 
>> contained no nulls. Further suppose that I serialized and deserizalized this 
>> structure. I'd want that assertion to be preserved after deserialization. If 
>> another app (or a future version of this app) created the structure using 
>> Stream.to
>>  List(), this would allow nulls to leak into that structure and violate that 
>> assertion. Therefore, the result of Stream.toList() should not be 
>> serialization-compatible with List.of() et. al. That's why there's the new 
>> IMM_LIST_NULLS tag in the serial format"
> 
> I don't quite get this reasoning. Let's try to decompose the reasoning giving 
> an example. Suppose we had the following data structure:
> 
> public class Names implements Serializable {
>   private final List names;
>   Names(List names) {
> this.names = names;
>   }
>   public List names() { return names; }
> }
> 
> App v1 creates such structures using new Names(List.of(...)) and 
> serializes/deserializes them. They keep the invariant that no nulls are 
> present. Now comes App v2 that starts using new Names(stream.toList()) which 
> allows nulls to be present. When such Names instance from app v2 is 
> serialized and then deserialized in app v1, nulls "leak" into data structure 
> of app v1 that does not expect them.
> 
> But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
> prevent that from happening?

I can see that having a separate IMM_LIST_NULLS type might be necessary to 
preserve the allows-null/disallows-null behaviour of indexOf and lastIndexOf 
methods...

NOTE ALSO that while ListN.equals(o) method is using Objects.equals(o1, o2) to 
compare elements, hashCode is inherited from AbstractImmutableList:

public int hashCode() {
int hash = 1;
for (int i = 0, s = size(); i < s; i++) {
hash = 31 * hash + get(i).hashCode();
}
return hash;
}

...which means it will throw NPE when the list contains null. The same goes for 
SubList.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-08 Thread forax
> De: "Alan Snyder" 
> À: "Remi Forax" 
> Cc: "Simon Roberts" , "core-libs-dev"
> 
> Envoyé: Vendredi 6 Novembre 2020 18:55:40
> Objet: Re: RFR: 8180352: Add Stream.toList() method
Hi Alan, 

>>> This discussion of unmodifiable lists brings me back to the thought that
>>> there would be good client-side reasons for inserting an UnmodifiableList
>>> interface as a parent of LIst
>> On Nov 6, 2020, at 1:14 AM, Remi Forax < [ mailto:fo...@univ-mlv.fr |
>> fo...@univ-mlv.fr ] > wrote:

>> This question is asked at least every six months since 1998
>> [
>> https://docs.oracle.com/javase/8/docs/technotes/guides/collections/designfaq.html#a1
>> |
>> https://docs.oracle.com/javase/8/docs/technotes/guides/collections/designfaq.html#a1
>> ]

> The question that Simon asked is not exactly the question that is answered in
> this link.

> The question of whether List should be a subtype of (some kind of) 
> ImmutableList
> is answered in
> Stuart’s stack overflow answer ( [ 
> https://stackoverflow.com/a/57926310/1441122
> | https://stackoverflow.com/a/57926310/1441122 ] ). The answer
> is that it should not.

> The question answered in the link is basically a straw man: why not capture
> every conceivable
> semantic distinction in the collections type system. And the answer is, not
> surprisingly, that
> there would be way too many types.

It's a strawman but the effect is real, the more interfaces you have the less 
you can reuse code because the code is written with the wrong interface for 
your use case. 

> But a question that deserves ongoing review is whether Java should support
> immutable collections
> using a separate type hierarchy. In other words, Immutable List would not be a
> subtype of List
> and List would not be a subtype of Immutable List. The linked answer says:
> " Adding this support to the type hierarchy requires four more interfaces.” 
> Four
> additional
> interfaces sounds like a small cost for a significant benefit. As Java evolves
> to better support
> functional programming styles, immutable collections seems like an obvious 
> next
> step.

> Much could be written, and probably has been, about the disadvantages of 
> basing
> the
> collections framework on mutable collections. Let me just remark that in
> addition to the
> already mentioned UnsupportedOperationException, there is also the more subtle
> ConcurrentModificationException, both of which would be absent in fully
> immutable collections.

Following the same reasoning, you can say that having interfaces saying that 
collections are null hostile is even more important because it alleviate the 
issue with NullPointerException. 

> The lack of subtyping between List and ImmutableList is not terrible. Arrays
> coexist with
> Lists with no subtyping relationship. java.io.File coexists with
> java.nio.filePath with no
> subtyping relationship. It means that explicit conversions are required 
> between
> List and
> ImmutableList. Extra copying may be needed, but there are tricks for reducing
> copying, and the
> need for defensive copying is removed.

Array vs List is not something we can be very proud of, by example in 
java.lang.invoke, we have several methods that are duplicated only because of 
this dichotomy. 
File vs Path is less an issue because it's between implementation classes, not 
interfaces, anyway, you still have a number of methods inside java.io/java.nio 
that are duplicated. 

Anyway, I don't think we can *all* agree about how a collection API should be 
implemented because it is equivalent to answering to the question what you want 
to be part of the type system and what you want to be runtime exceptions. 
With that in minde, Java is in the middle between Clojure and Scala. 

> Alan

Rémi 


Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-08 Thread Peter Levart
On Fri, 6 Nov 2020 03:05:45 GMT, Stuart Marks  wrote:

>> @ePaul wrote:
>> 
>>> The Stream API works just as well with [third party] collection libraries 
>>> instead of the java.util ones, just using different collectors. Adding a 
>>> method which directly returns a java.util.List somewhat couples it to the 
>>> Java Collection API.
>>> 
>>> Now this was mentioned two and a half year ago. Did something change which 
>>> made this consideration irrelevant? I would expect at least some mention of 
>>> it in the discussion here.
>> 
>> The separation between streams and the java.util Collections Framework is a 
>> good design principle, but it isn't an ironclad rule. It's still easy to 
>> have streams create instances of other collections libraries using the 
>> Collector interface. What's different here is that the Collections Framework 
>> has "leaked into" streams a little bit more, so that they're now more 
>> interdependent. This doesn't seem to have any disadvantages; it seems 
>> unlikely that the Collections Framework will ever be unplugged from the JDK. 
>> However, the benefits are that a List is the closest thing we have to an 
>> unmodifiable array that also plays well with generics and that can be 
>> value-based; these benefits are considerable.
>
>> Simon Roberts wrote:
> 
>> This discussion of unmodifiable lists brings me back to the thought that
>> there would be good client-side reasons for inserting an UnmodifiableList
>> interface as a parent of LIst, not least because all our unmodifiable
>> variants from the Collections.unmodifiableList proxy onward fail the Liskov
>> substitution test for actually "being contract-fulfilling Lists".
> 
> At some point there probably will need to be a long article explaining all 
> the issues here, but at the moment the best writeup I have is this one:
> 
> https://stackoverflow.com/a/57926310/1441122
> 
> TL;DR there are a few different ways to approach retrofitting something like 
> this, but they all have enough compromises that the net benefits are unclear.

Hi Stuart,

I would like to discuss the serialization. You introduce new 
CollSer.IMM_LIST_NULLS type of immutable collection. This means that if this 
change goes into JDK16 for example, JDK15 and before will not be able to 
deserialize such list as they know nothing about IMM_LIST_NULLS even if such 
lists don't contain nulls. The reason you say to chose new type of 
serialization format is the following:

> "Suppose I had an application that created a data structure that used lists 
> from List.of(), and I had a global assertion over that structure that it 
> contained no nulls. Further suppose that I serialized and deserizalized this 
> structure. I'd want that assertion to be preserved after deserialization. If 
> another app (or a future version of this app) created the structure using 
> Stream.to
>  List(), this would allow nulls to leak into that structure and violate that 
> assertion. Therefore, the result of Stream.toList() should not be 
> serialization-compatible with List.of() et. al. That's why there's the new 
> IMM_LIST_NULLS tag in the serial format"

I don't quite get this reasoning. Let's try to decompose the reasoning giving 
an example. Suppose we had the following data structure:

public class Names implements Serializable {
  private final List names;
  Names(List names) {
this.names = names;
  }
  public List names() { return names; }
}

App v1 creates such structures using new Names(List.of(...)) and 
serializes/deserializes them. They keep the invariant that no nulls are 
present. Now comes App v2 that starts using new Names(stream.toList()) which 
allows nulls to be present. When such Names instance from app v2 is serialized 
and then deserialized in app v1, nulls "leak" into data structure of app v1 
that does not expect them.

But the question is how does having a separate CollSer.IMM_LIST_NULLS type 
prevent that from happening?

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-06 Thread Martin Desruisseaux

Le 06/11/2020 à 19:00, Alan Snyder a écrit :
(…snip…) But a question that deserves ongoing review is whether Java 
should support immutable collections using a separate type hierarchy 
(…snip…).


Maybe an alternative way (admittedly more difficult) to have immutable 
collections without introducing new interfaces could be to take 
inspiration from the C++ "const" keyword? Especially since in Java, 
"const" is already a reserved keyword, just not used yet.


I realize that it would be a difficult task: how to handle private 
fields that are just caches in a const object; where to add the "const" 
keyword in existing JDK API; how to make such change in a backward 
compatible way (e.g. when a legacy code overrides a "const" method). I 
do not have any expectation. But given the inconvenient of alternatives, 
I wonder if there is some research about the long-term feasibility of a 
"const" semantic in Java? If it was the case, then maybe it would be 
better to not add immutable collection interfaces in Java for now.


    Regards

        Martin




Re: RFR: 8180352: Add Stream.toList() method

2020-11-06 Thread Alan Snyder
>> This discussion of unmodifiable lists brings me back to the thought that
>> there would be good client-side reasons for inserting an UnmodifiableList
>> interface as a parent of LIst

> On Nov 6, 2020, at 1:14 AM, Remi Forax  > wrote:
> 
> 
> This question is asked at least every six months since 1998
> https://docs.oracle.com/javase/8/docs/technotes/guides/collections/designfaq.html#a1
>  
> 

The question that Simon asked is not exactly the question that is answered in 
this link.

The question of whether List should be a subtype of (some kind of) 
ImmutableList is answered in
Stuart’s stack overflow answer (https://stackoverflow.com/a/57926310/1441122 
). The answer
is that it should not.

The question answered in the link is basically a straw man: why not capture 
every conceivable
semantic distinction in the collections type system. And the answer is, not 
surprisingly, that
there would be way too many types.

But a question that deserves ongoing review is whether Java should support 
immutable collections
using a separate type hierarchy. In other words, Immutable List would not be a 
subtype of List
and List would not be a subtype of Immutable List. The linked answer says:
"Adding this support to the type hierarchy requires four more interfaces.” Four 
additional
interfaces sounds like a small cost for a significant benefit. As Java evolves 
to better support
functional programming styles, immutable collections seems like an obvious next 
step.

Much could be written, and probably has been, about the disadvantages of basing 
the
collections framework on mutable collections. Let me just remark that in 
addition to the
already mentioned UnsupportedOperationException, there is also the more subtle
ConcurrentModificationException, both of which would be absent in fully 
immutable collections.

The lack of subtyping between List and ImmutableList is not terrible. Arrays 
coexist with
Lists with no subtyping relationship. java.io.File coexists with 
java.nio.filePath with no
subtyping relationship. It means that explicit conversions are required between 
List and
ImmutableList. Extra copying may be needed, but there are tricks for reducing 
copying, and the
need for defensive copying is removed.

  Alan



Re: RFR: 8180352: Add Stream.toList() method [v2]

2020-11-06 Thread Stuart Marks
> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

Stuart Marks has updated the pull request incrementally with one additional 
commit since the last revision:

  Merge ListNNullsAllowed into ListN. Update spec for Stream.toList.
  Add nulls-allowed empty list. Simplify indexOf/lastIndexOf.
  Reduce tests, add contains(null) tests.

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1026/files
  - new: https://git.openjdk.java.net/jdk/pull/1026/files/3e05564d..cf849755

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=1026=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1026=00-01

  Stats: 181 lines in 3 files changed: 16 ins; 117 del; 48 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1026.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1026/head:pull/1026

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-06 Thread Remi Forax
- Mail original -
> De: "Simon Roberts" 
> À: "core-libs-dev" 
> Envoyé: Jeudi 5 Novembre 2020 18:40:44
> Objet: Re: RFR: 8180352: Add Stream.toList() method

> At the risk of a can of worms, or at least of raising something that has
> long since been discussed and rejected...
> 
> This discussion of unmodifiable lists brings me back to the thought that
> there would be good client-side reasons for inserting an UnmodifiableList
> interface as a parent of LIst, not least because all our unmodifiable
> variants from the Collections.unmodifiableList proxy onward fail the Liskov
> substitution test for actually "being contract-fulfilling Lists".
> 
> Is this something that has been considered and rejected? (If so, is it easy
> to point me at the discussion, as I expect I'd find it fascinating). Is it
> worth considering, might it have some merit, or merely horrible
> unforeseen-by-me consequences to the implementation?

This question is asked at least every six months since 1998
https://docs.oracle.com/javase/8/docs/technotes/guides/collections/designfaq.html#a1

> 
> Cheers,
> Simon

regards,
Rémi Forax

> 
> 
> 
> On Thu, Nov 5, 2020 at 10:30 AM Stuart Marks 
> wrote:
> 
>> On Wed, 4 Nov 2020 23:03:02 GMT, Paŭlo Ebermann > 645859+ep...@openjdk.org> wrote:
>>
>> >> This change introduces a new terminal operation on Stream. This looks
>> like a convenience method for Stream.collect(Collectors.toList()) or
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this
>> method directly on Stream enables it to do what can't easily by done by a
>> Collector. In particular, it allows the stream to deposit results directly
>> into a destination array (even in parallel) and have this array be wrapped
>> in an unmodifiable List without copying.
>> >>
>> >> In the past we've kept most things from the Collections Framework as
>> implementations of Collector, not directly on Stream, whereas only
>> fundamental things (like toArray) appear directly on Stream. This is true
>> of most Collections, but it does seem that List is special. It can be a
>> thin wrapper around an array; it can handle generics better than arrays;
>> and unlike an array, it can be made unmodifiable (shallowly immutable); and
>> it can be value-based. See John Rose's comments in the bug report:
>> >>
>> >>
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> >>
>> >> This operation is null-tolerant, which matches the rest of Streams.
>> This isn't specified, though; a general statement about null handling in
>> Streams is probably warranted at some point.
>> >>
>> >> Finally, this method is indeed quite convenient (if the caller can deal
>> with what this operation returns), as collecting into a List is the most
>> common stream terminal operation.
>> >
>> > src/java.base/share/classes/java/util/ImmutableCollections.java line 199:
>> >
>> >> 197:  * safely reused as the List's internal storage, avoiding a
>> defensive copy. Declared
>> >> 198:  * with Object... instead of E... as the parameter type so
>> that varargs calls don't
>> >> 199:  * accidentally create an array of type other than Object[].
>> >
>> > Why would that be a problem? If the resulting list is immutable, then
>> the actual array type doesn't really matter, right?
>>
>> It's an implementation invariant that the internal array be Object[].
>> Having it be something other than Object[] can lead to subtle bugs. See
>> [JDK-6260652](https://bugs.openjdk.java.net/browse/JDK-6260652) for
>> example.
>>
>> -
>>
>> PR: https://git.openjdk.java.net/jdk/pull/1026
>>
> 
> 
> --
> Simon Roberts
> (303) 249 3613


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Stuart Marks
On Fri, 6 Nov 2020 03:01:42 GMT, Stuart Marks  wrote:

>> Looking at the linked issue, I see [this comment from Rémi 
>> Forax](https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14171626=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14171626):
>> 
>>> Please talk with Brian about this change because it nulls a property of the 
>>> Stream API we (the lambda-util group) have take time to keep.
>> The whole Stream API doesn't depends on the Collection API, [...] so the 
>> Stream API can be easily integrated with other collection API if in the 
>> future we want by example add a persistent collection API with no link with 
>> java.util.List.
>> 
>> That's an argument I can understand – as is, the Stream API works just as 
>> well with collections from e.g. Scala's or Kotlin's (or Ceylon's) collection 
>> libraries instead of the java.util ones, just using different collectors. 
>> Adding a method which directly returns a java.util.List somewhat couples it 
>> to the Java Collection API.
>> 
>> Now this was mentioned two and a half year ago. Did something change which 
>> made this consideration irrelevant? I would expect at least some mention of 
>> it in the discussion here.
>
> @ePaul wrote:
> 
>> The Stream API works just as well with [third party] collection libraries 
>> instead of the java.util ones, just using different collectors. Adding a 
>> method which directly returns a java.util.List somewhat couples it to the 
>> Java Collection API.
>> 
>> Now this was mentioned two and a half year ago. Did something change which 
>> made this consideration irrelevant? I would expect at least some mention of 
>> it in the discussion here.
> 
> The separation between streams and the java.util Collections Framework is a 
> good design principle, but it isn't an ironclad rule. It's still easy to have 
> streams create instances of other collections libraries using the Collector 
> interface. What's different here is that the Collections Framework has 
> "leaked into" streams a little bit more, so that they're now more 
> interdependent. This doesn't seem to have any disadvantages; it seems 
> unlikely that the Collections Framework will ever be unplugged from the JDK. 
> However, the benefits are that a List is the closest thing we have to an 
> unmodifiable array that also plays well with generics and that can be 
> value-based; these benefits are considerable.

> Simon Roberts wrote:

> This discussion of unmodifiable lists brings me back to the thought that
> there would be good client-side reasons for inserting an UnmodifiableList
> interface as a parent of LIst, not least because all our unmodifiable
> variants from the Collections.unmodifiableList proxy onward fail the Liskov
> substitution test for actually "being contract-fulfilling Lists".

At some point there probably will need to be a long article explaining all the 
issues here, but at the moment the best writeup I have is this one:

https://stackoverflow.com/a/57926310/1441122

TL;DR there are a few different ways to approach retrofitting something like 
this, but they all have enough compromises that the net benefits are unclear.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Stuart Marks
On Wed, 4 Nov 2020 23:18:29 GMT, Paŭlo Ebermann 
 wrote:

>> Changes requested by orio...@github.com (no known OpenJDK username).
>
> Looking at the linked issue, I see [this comment from Rémi 
> Forax](https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14171626=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14171626):
> 
>> Please talk with Brian about this change because it nulls a property of the 
>> Stream API we (the lambda-util group) have take time to keep.
> The whole Stream API doesn't depends on the Collection API, [...] so the 
> Stream API can be easily integrated with other collection API if in the 
> future we want by example add a persistent collection API with no link with 
> java.util.List.
> 
> That's an argument I can understand – as is, the Stream API works just as 
> well with collections from e.g. Scala's or Kotlin's (or Ceylon's) collection 
> libraries instead of the java.util ones, just using different collectors. 
> Adding a method which directly returns a java.util.List somewhat couples it 
> to the Java Collection API.
> 
> Now this was mentioned two and a half year ago. Did something change which 
> made this consideration irrelevant? I would expect at least some mention of 
> it in the discussion here.

@ePaul wrote:

> The Stream API works just as well with [third party] collection libraries 
> instead of the java.util ones, just using different collectors. Adding a 
> method which directly returns a java.util.List somewhat couples it to the 
> Java Collection API.
> 
> Now this was mentioned two and a half year ago. Did something change which 
> made this consideration irrelevant? I would expect at least some mention of 
> it in the discussion here.

The separation between streams and the java.util Collections Framework is a 
good design principle, but it isn't an ironclad rule. It's still easy to have 
streams create instances of other collections libraries using the Collector 
interface. What's different here is that the Collections Framework has "leaked 
into" streams a little bit more, so that they're now more interdependent. This 
doesn't seem to have any disadvantages; it seems unlikely that the Collections 
Framework will ever be unplugged from the JDK. However, the benefits are that a 
List is the closest thing we have to an unmodifiable array that also plays well 
with generics and that can be value-based; these benefits are considerable.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Paul Sandoz
On Fri, 6 Nov 2020 02:50:29 GMT, Stuart Marks  wrote:

>> test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToListOpTest.java
>>  line 73:
>> 
>>> 71: }
>>> 72: 
>>> 73: @Test(dataProvider = "withNull:StreamTestData", 
>>> dataProviderClass = StreamTestDataProvider.class)
>> 
>> Given the non-default `toList()` implementation defers to the `toArray()` 
>> terminal op there is no need for this and the following tests, which are 
>> really designed to shake out the optimizations when producing and operating 
>> on arrays.
>
> OK, I'll remove all the tests starting from here to the end of the file. I'm 
> assuming that's the set of tests you're referring to.

Yes from line 73 to G (the end).

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Stuart Marks
On Thu, 5 Nov 2020 19:47:43 GMT, Paul Sandoz  wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToListOpTest.java
>  line 73:
> 
>> 71: }
>> 72: 
>> 73: @Test(dataProvider = "withNull:StreamTestData", 
>> dataProviderClass = StreamTestDataProvider.class)
> 
> Given the non-default `toList()` implementation defers to the `toArray()` 
> terminal op there is no need for this and the following tests, which are 
> really designed to shake out the optimizations when producing and operating 
> on arrays.

OK, I'll remove all the tests starting from here to the end of the file. I'm 
assuming that's the set of tests you're referring to.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Paul Sandoz
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToListOpTest.java
 line 73:

> 71: }
> 72: 
> 73: @Test(dataProvider = "withNull:StreamTestData", 
> dataProviderClass = StreamTestDataProvider.class)

Given the non-default `toList()` implementation defers to the `toArray()` 
terminal op there is no need for this and the following tests, which are really 
designed to shake out the optimizations when producing and operating on arrays.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Simon Roberts
At the risk of a can of worms, or at least of raising something that has
long since been discussed and rejected...

This discussion of unmodifiable lists brings me back to the thought that
there would be good client-side reasons for inserting an UnmodifiableList
interface as a parent of LIst, not least because all our unmodifiable
variants from the Collections.unmodifiableList proxy onward fail the Liskov
substitution test for actually "being contract-fulfilling Lists".

Is this something that has been considered and rejected? (If so, is it easy
to point me at the discussion, as I expect I'd find it fascinating). Is it
worth considering, might it have some merit, or merely horrible
unforeseen-by-me consequences to the implementation?

Cheers,
Simon



On Thu, Nov 5, 2020 at 10:30 AM Stuart Marks 
wrote:

> On Wed, 4 Nov 2020 23:03:02 GMT, Paŭlo Ebermann  645859+ep...@openjdk.org> wrote:
>
> >> This change introduces a new terminal operation on Stream. This looks
> like a convenience method for Stream.collect(Collectors.toList()) or
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this
> method directly on Stream enables it to do what can't easily by done by a
> Collector. In particular, it allows the stream to deposit results directly
> into a destination array (even in parallel) and have this array be wrapped
> in an unmodifiable List without copying.
> >>
> >> In the past we've kept most things from the Collections Framework as
> implementations of Collector, not directly on Stream, whereas only
> fundamental things (like toArray) appear directly on Stream. This is true
> of most Collections, but it does seem that List is special. It can be a
> thin wrapper around an array; it can handle generics better than arrays;
> and unlike an array, it can be made unmodifiable (shallowly immutable); and
> it can be value-based. See John Rose's comments in the bug report:
> >>
> >>
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> >>
> >> This operation is null-tolerant, which matches the rest of Streams.
> This isn't specified, though; a general statement about null handling in
> Streams is probably warranted at some point.
> >>
> >> Finally, this method is indeed quite convenient (if the caller can deal
> with what this operation returns), as collecting into a List is the most
> common stream terminal operation.
> >
> > src/java.base/share/classes/java/util/ImmutableCollections.java line 199:
> >
> >> 197:  * safely reused as the List's internal storage, avoiding a
> defensive copy. Declared
> >> 198:  * with Object... instead of E... as the parameter type so
> that varargs calls don't
> >> 199:  * accidentally create an array of type other than Object[].
> >
> > Why would that be a problem? If the resulting list is immutable, then
> the actual array type doesn't really matter, right?
>
> It's an implementation invariant that the internal array be Object[].
> Having it be something other than Object[] can lead to subtle bugs. See
> [JDK-6260652](https://bugs.openjdk.java.net/browse/JDK-6260652) for
> example.
>
> -
>
> PR: https://git.openjdk.java.net/jdk/pull/1026
>


-- 
Simon Roberts
(303) 249 3613


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Stuart Marks
On Wed, 4 Nov 2020 23:03:02 GMT, Paŭlo Ebermann 
 wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> src/java.base/share/classes/java/util/ImmutableCollections.java line 199:
> 
>> 197:  * safely reused as the List's internal storage, avoiding a 
>> defensive copy. Declared
>> 198:  * with Object... instead of E... as the parameter type so that 
>> varargs calls don't
>> 199:  * accidentally create an array of type other than Object[].
> 
> Why would that be a problem? If the resulting list is immutable, then the 
> actual array type doesn't really matter, right?

It's an implementation invariant that the internal array be Object[]. Having it 
be something other than Object[] can lead to subtle bugs. See 
[JDK-6260652](https://bugs.openjdk.java.net/browse/JDK-6260652) for example.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-05 Thread Stuart Marks
On Wed, 4 Nov 2020 09:46:32 GMT, Zheka Kozlov 
 wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> src/java.base/share/classes/java/util/ImmutableCollections.java line 240:
> 
>> 238: static  List listFromTrustedArrayNullsAllowed(Object... 
>> input) {
>> 239: if (input.length == 0) {
>> 240: return (List) EMPTY_LIST;
> 
> If we return a `ListN` here, does this mean that 
> `Stream.of().toList().contains(null)` will throw an NPE? But this is 
> incorrect because `toList()` returns a null-tolerant List.

Yes, good point, we might need to have a null-tolerant empty list.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Tagir Valeev
Hello!

> The min() and max() methods are not null hostile.  If you pass a
> null-friendly comparator (such as returned by the `nullsFirst()` and
> `nullsLast()` comparator combinators), nulls are not a problem.

No, it is the problem, the same as with findFirst(). Try it:

Comparator nullFriendly =
Comparator.nullsFirst(Comparator.naturalOrder());
List data = Arrays.asList("hello", null, "world");
System.out.println(Collections.min(data, nullFriendly)); // null -- correct
System.out.println(data.stream().min(nullFriendly)); // NPE!

With best regards,
Tagir Valeev.


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Paŭlo Ebermann
On Wed, 4 Nov 2020 08:53:23 GMT, Zheka Kozlov 
 wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> Changes requested by orio...@github.com (no known OpenJDK username).

Looking at the linked issue, I see [this comment from Rémi 
Forax](https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14171626=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14171626):

> Please talk with Brian about this change because it nulls a property of the 
> Stream API we (the lambda-util group) have take time to keep.
The whole Stream API doesn't depends on the Collection API, [...] so the Stream 
API can be easily integrated with other collection API if in the future we want 
by example add a persistent collection API with no link with java.util.List.

That's an argument I can understand – as is, the Stream API works just as well 
with collections from e.g. Scala's or Kotlin's (or Ceylon's) collection 
libraries instead of the java.util ones, just using different collectors. 
Adding a method which directly returns a java.util.List somewhat couples it to 
the Java Collection API.

Now this was mentioned two and a half year ago. Did something change which made 
this consideration irrelevant? I would expect at least some mention of it in 
the discussion here.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Paŭlo Ebermann
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

src/java.base/share/classes/java/util/ImmutableCollections.java line 199:

> 197:  * safely reused as the List's internal storage, avoiding a 
> defensive copy. Declared
> 198:  * with Object... instead of E... as the parameter type so that 
> varargs calls don't
> 199:  * accidentally create an array of type other than Object[].

Why would that be a problem? If the resulting list is immutable, then the 
actual array type doesn't really matter, right?

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Brian Goetz




I wonder if we should not do the same with toList(), having toList() to be equivalent to 
collect(Collectors.toUnmodifiableList()) and toList(IntFunction>) 
allowing to use a null friendly List implementation like ArrayList.


If you pull on this string all the way, we end up with "let's just not 
do anything."


Adding conveniences only begets demands for more conveniences, and the 
endpoint there is a class with hundreds of trivial methods. This is why 
we held the line for so long.  Over that time, it emerged that the _one_ 
convenience method we could justify was toList(), since 
collect(toList()) is single most commonly used terminal op -- but only 
if we could justify stopping there.  Which I think we can, so we're 
stopping there.  (Maybe in five years we'll add another one.)





Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread forax
- Mail original -
> De: "Brian Goetz" 
> À: "Tagir Valeev" 
> Cc: "Remi Forax" , "Stuart Marks" 
> , "core-libs-dev"
> 
> Envoyé: Mercredi 4 Novembre 2020 18:58:59
> Objet: Re: RFR: 8180352: Add Stream.toList() method

>> As for nullity topic, I really welcome that the proposed toList() is
>> null-tolerant but it worth mentioning that existing terminal
>> operations like findFirst(), findAny(), min() and max() are already
>> null-hostile.
> The min() and max() methods are not null hostile.  If you pass a
> null-friendly comparator (such as returned by the `nullsFirst()` and
> `nullsLast()` comparator combinators), nulls are not a problem.  The
> null hostility comes not from streams, but from the behaviors that
> clients pass to stream methods, whether they be min() or max() or
> map().  If the behaviors passed to _any_ stream method are null-hostile,
> then (usually) so will be that stream pipeline.  But this is something
> entirely under the control of the user; users should ensure that their
> domain and the behaviors that operate on that domain are consistent.
> 
> What we're talking about is what _streams_ should do.  Streams should
> not gratutiously narrow the domain.  Since `toList()` will be built into
> streams, we have to define this clearly, and there's no justification
> for making this method null-hostile.  The arguments made so far that
> toList() should somehow be null-hostile appear to be nothing more than
> weak wrappers around "I hate nulls, so let's make new methods
> null-hostile."
> 
> Remi say:
> 
>> You know that you can not change the implementation of
>> Collectors.toList(), and you also know that if there is a method
>> Stream.toList(), people will replace the calls to
>> .collect(Collectors.toList()) by a call to Stream.toList() for the
>> very same reason but you want the semantics of Stream.toList() to be
>> different from the semantics of Collectors.toList().
> 
> This is what I call a "for consistency" argument.  Now, we all know that
> consistency is a good starting point, but almost invariably, when
> someone says "you should do X because for consistency with Y", that "for
> consistency" argument turns out to be little more than a thin wrapper
> around "I prefer Y, and I found a precedent for it." Yes, people will be
> tempted to make this assumption -- at first. (And most of the time, that
> will be fine -- the most common thing people do after collecting to a
> list is iterate the list.)   But it is a complete inversion to say that
> the built-in method must be consistent with any given existing
> Collector, even if that collector has a similar name.  The built-in
> method should provide sensible default to-list behavior, and if you want
> _any other_ to-list behavior -- a mutable list, a different type of
> list, a null-hostile list, a list that drops prime-numbered elements,
> whatever -- you use the more general tool, which is collect(), which
> lets you do whatever you want, and comes with a variety of
> pre-configured options.  And this is the most sensible default behavior
> for a built-in to-list operation.

As i said since, i can live with only one method toList() instead of two 
(toList() and toUnmodifiableList()), as you said this requires to provide a 
good default.
Obviously i still disagree with you about what the good default is.

This morning i was thinking about the symmetry between stream.toList() and 
stream.toArray(), there are two variants of toarray, toArray() and 
toArray(IntFunction),
the second variant let you choose as a user which implementation of array you 
want.

I wonder if we should not do the same with toList(), having toList() to be 
equivalent to collect(Collectors.toUnmodifiableList()) and 
toList(IntFunction>) allowing to use a null friendly List 
implementation like ArrayList. 

> 
> firstFirst/findAny are indeed sad corner cases, because we didn't have a
> good way of representing "maybe absent nullable value."  (If that case
> were more important, we might have done more work to support it.)  But I
> think it would be a poor move to try and extrapolate from this behavior;
> this behavior is merely a hostage to circumstance.

Rémi


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Brian Goetz




As for nullity topic, I really welcome that the proposed toList() is
null-tolerant but it worth mentioning that existing terminal
operations like findFirst(), findAny(), min() and max() are already
null-hostile.
The min() and max() methods are not null hostile.  If you pass a 
null-friendly comparator (such as returned by the `nullsFirst()` and 
`nullsLast()` comparator combinators), nulls are not a problem.  The 
null hostility comes not from streams, but from the behaviors that 
clients pass to stream methods, whether they be min() or max() or 
map().  If the behaviors passed to _any_ stream method are null-hostile, 
then (usually) so will be that stream pipeline.  But this is something 
entirely under the control of the user; users should ensure that their 
domain and the behaviors that operate on that domain are consistent.


What we're talking about is what _streams_ should do.  Streams should 
not gratutiously narrow the domain.  Since `toList()` will be built into 
streams, we have to define this clearly, and there's no justification 
for making this method null-hostile.  The arguments made so far that 
toList() should somehow be null-hostile appear to be nothing more than 
weak wrappers around "I hate nulls, so let's make new methods 
null-hostile."


Remi say:

You know that you can not change the implementation of 
Collectors.toList(), and you also know that if there is a method 
Stream.toList(), people will replace the calls to 
.collect(Collectors.toList()) by a call to Stream.toList() for the 
very same reason but you want the semantics of Stream.toList() to be 
different from the semantics of Collectors.toList().


This is what I call a "for consistency" argument.  Now, we all know that 
consistency is a good starting point, but almost invariably, when 
someone says "you should do X because for consistency with Y", that "for 
consistency" argument turns out to be little more than a thin wrapper 
around "I prefer Y, and I found a precedent for it." Yes, people will be 
tempted to make this assumption -- at first. (And most of the time, that 
will be fine -- the most common thing people do after collecting to a 
list is iterate the list.)   But it is a complete inversion to say that 
the built-in method must be consistent with any given existing 
Collector, even if that collector has a similar name.  The built-in 
method should provide sensible default to-list behavior, and if you want 
_any other_ to-list behavior -- a mutable list, a different type of 
list, a null-hostile list, a list that drops prime-numbered elements, 
whatever -- you use the more general tool, which is collect(), which 
lets you do whatever you want, and comes with a variety of 
pre-configured options.  And this is the most sensible default behavior 
for a built-in to-list operation.


firstFirst/findAny are indeed sad corner cases, because we didn't have a 
good way of representing "maybe absent nullable value."  (If that case 
were more important, we might have done more work to support it.)  But I 
think it would be a poor move to try and extrapolate from this behavior; 
this behavior is merely a hostage to circumstance.





Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Remi Forax
- Mail original -
> De: "Stuart Marks" 
> À: "core-libs-dev" 
> Envoyé: Mercredi 4 Novembre 2020 01:14:54
> Objet: Re: RFR: 8180352: Add Stream.toList() method

> On Tue, 3 Nov 2020 18:53:23 GMT, Stuart Marks  wrote:
> 
>>> Should there be a test that tests the default implementation in 
>>> `j.u.s.Stream`?
>>> Or maybe there is and I missed that?
>>
>> @dfuch wrote:
>>> Should there be a test that tests the default implementation in 
>>> `j.u.s.Stream`?
>>> Or maybe there is and I missed that?
>> 
>> The test method `testDefaultOps` does that. The stream test framework has a
>> thing called `DefaultMethodStreams` that delegates everything except default
>> methods to another Stream instance, so this should test the default
>> implementation.
> 
> OK, there are rather too many forking threads here for me to try to reply to 
> any
> particular message, so I'll try to summarize things and say what I intend to
> do.
> 
> Null tolerance. While part of me wants to prohibit nulls, the fact is that
> Streams pass through nulls, and toList() would be much less useful if it were
> to reject nulls. The affinity here is closer to Stream.toArray(), which allows
> nulls, rather than Collectors.to[Unmodifiable]List.
> 
> Unmodifiability. Unlike with nulls, this is where we've been taking a strong
> stand recently, where new constructs are unmodifiable ("shallowly immutable").
> Consider the Java 9 unmodifiable collections, records, primitive classes, etc.
> -- they're all unmodifiable. They're data-race-free and are resistant to a
> whole class of bugs that arise from mutability.

again, the strong stance about null in collections was taken earlier that the 
one about unmodifiable collection,
all collections introduced since java 1.6 are null hostile.

I think we are bound to rediscover why :(


And BTW, an unmodifiable list is not data-race free, it depends if the content 
is immutable or not.


> 
> Unfortunately, the combination of the above means that the returned List is
> neither like an ArrayList nor like an unmodifiable list produced by List.of()
> or by Collectors.toUnmodifiableList(). [Heavy sigh.] While I've been somewhat
> reluctant to introduce this new thing, the alternatives of trying to reuse one
> of the existing things are worse.

This is the smell test here.
You want to introduce a 4th semantics, we already have ArrayList, 
Arrays.asList() and List.of().
Add a new semantics has not a linear cost, the cost is bigger than linear 
because of the all the interactions with iterators, streams, etc.
(BTW this remember me that List.of().stream() is not optimize well compared to 
ArrayList.stream())

This is a real danger, Java is already complex, if it becomes more and more, it 
will be irrelevant, because it becomes harder and harder to internalize all the 
semantics inside of your head.
There is already a C++, let us not become another one.

So i really like the fact that you have decided to go bold here and choose to 
return an unmodifiable list but we should go to full monty and be null hostile 
too, like toUnmodifiableList().

> 
> John and Rémi pointed out that the way I implemented this, using a subclass,
> reintroduces the possibility of problems with megamorphic dispatch which we 
> had
> so carefully avoided by reducing the number of implementation classes in this
> area. I agree this is a problem.
> 
> I also had an off-line discussion with John where we discussed the 
> serialization
> format, which unfortunately is somewhat coupled with this issue. (Fortunately 
> I
> think it can mostly be decoupled.) Given that we're introducing a new thing,
> which is an unmodifiable-list-that-allows-nulls, this needs to be manifested 
> in
> the serialized form of these new objects. (This corresponds to the new tag
> value of IMM_LIST_NULLS in the CollSer class.) The alternative is to reuse the
> existing serialized form, IMM_LIST, for both of these cases, relaxing it to
> allow nulls. This would be a serialization immutability problem. Suppose I had
> an application that created a data structure that used lists from List.of(),
> and I had a global assertion over that structure that it contained no nulls.
> Further suppose that I serialized and deserizalized this structure. I'd want
> that assertion to be preserved after deserialization. If another app (or a
> future version of this app) created the structure using Stream.to
> List(), this would allow nulls to leak into that structure and violate that
> assertion. Therefore, the result of Stream.toList() should not be
> serialization-compatible with List.of() et. al. That's why there's the new
> IMM_LIST_NULLS tag in the serial format.
> 
> However, the n

Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Justin Dekeyser
Hello,

Ah that's a super good news, thanks a lot for sharing !
I dream of this since 2 years <3

Best regards,

Justin Dekeyser


On Wed, Nov 4, 2020 at 10:24 AM Tagir Valeev  wrote:
>
> Hello!
>
> On Wed, Nov 4, 2020 at 4:16 PM Justin Dekeyser
>  wrote:
> > What about trying to make the Stream interface *flexible* to users,
> > instead of adding new functionalities that we could regret later?
> > Then, as the language usages and the trends evolve, we could really
> > see which of the "utils" functions are relevant to add.
> >
> > For example, we could maybe invent a method of the form (just a
> > sketch, not an actual well thought proposal)
> > `
> > default > S wrap(Function, S> wrapper) {
> >return wrapper.apply(this);
> > }
> > `
> > on Stream interface, in such a way one could use the "natural flow"
> > of writing while binding its own implementation and, therefore, use
> > its own shortcuts.
>
> See https://bugs.openjdk.java.net/browse/JDK-8210372
> (also, String::transform)
>
> With best regards,
> Tagir Valeev


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Zheka Kozlov
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

src/java.base/share/classes/java/util/ImmutableCollections.java line 240:

> 238: static  List listFromTrustedArrayNullsAllowed(Object... input) 
> {
> 239: if (input.length == 0) {
> 240: return (List) EMPTY_LIST;

If we return a `ListN` here, does this mean that 
`Stream.of().toList().contains(null)` will throw an NPE? But this is incorrect 
because `toList()` returns a null-tolerant List.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Tagir F . Valeev
On Wed, 4 Nov 2020 08:50:49 GMT, Zheka Kozlov 
 wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> src/java.base/share/classes/java/util/stream/Stream.java line 1192:
> 
>> 1190: @SuppressWarnings("unchecked")
>> 1191: default List toList() {
>> 1192: return (List) Collections.unmodifiableList(new 
>> ArrayList<>(Arrays.asList(this.toArray(;
> 
> Why can't we return `listFromTrustedArrayNullsAllowed` here as in 
> `ReferencePipeline`?
> Or at least, we should avoid unnecessary copying of arrays. See [how this is 
> done](https://github.com/amaembo/streamex/blob/master/src/main/java/one/util/streamex/AbstractStreamEx.java#L1313)
>  in StreamEx.

`listFromTrustedArrayNullsAllowed` is clearly not an option, as it will produce 
shared-secret leaking (see 
[JDK-8254090](https://bugs.openjdk.java.net/browse/JDK-8254090) for a similar 
case). StreamEx solution is dirty as it relies on the implementation detail. I 
believe, OpenJDK team is not very interested in providing optimal 
implementations for third-party stream implementations, as third-party 
implementations will likely update by themselves when necessary. At least, my 
suggestion to make the default `mapMulti` implementation better was declined.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Tagir Valeev
Hello!

On Wed, Nov 4, 2020 at 4:16 PM Justin Dekeyser
 wrote:
> What about trying to make the Stream interface *flexible* to users,
> instead of adding new functionalities that we could regret later?
> Then, as the language usages and the trends evolve, we could really
> see which of the "utils" functions are relevant to add.
>
> For example, we could maybe invent a method of the form (just a
> sketch, not an actual well thought proposal)
> `
> default > S wrap(Function, S> wrapper) {
>return wrapper.apply(this);
> }
> `
> on Stream interface, in such a way one could use the "natural flow"
> of writing while binding its own implementation and, therefore, use
> its own shortcuts.

See https://bugs.openjdk.java.net/browse/JDK-8210372
(also, String::transform)

With best regards,
Tagir Valeev


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Justin Dekeyser
Hello,

I think the `collect(toList())` is interesting, as the developer
really knows that the internal operation is collect-like.
I say this because there would be another, much more functional
approach, that would be of the form
`stream.map(List::of).reduce(List.empty, List::concat)`,
(if ever the List has been made with monads in minds... (concat does
not exist, and there is no unit for the empty list monad).)
The toList() obfuscates both the list implementation (as the current
utils actually) but also obfuscates the philosophy of the conversion.

As suggested by the previous mail, collect version looks much more
explicit, because it really suggests an aggregate operation and allows
one to use the exact constructor.

What about trying to make the Stream interface *flexible* to users,
instead of adding new functionalities that we could regret later?
Then, as the language usages and the trends evolve, we could really
see which of the "utils" functions are relevant to add.

For example, we could maybe invent a method of the form (just a
sketch, not an actual well thought proposal)
`
default > S wrap(Function, S> wrapper) {
   return wrapper.apply(this);
}
`
on Stream interface, in such a way one could use the "natural flow"
of writing while binding its own implementation and, therefore, use
its own shortcuts.

Best regards,

Justin Dekeyser

On Wed, Nov 4, 2020 at 9:59 AM Tagir Valeev  wrote:
>
> Hello!
>
> On Wed, Nov 4, 2020 at 2:55 AM Brian Goetz  wrote:
> > (In fact, we would be entirely justified to change the behavior of
> > Collectors::toList tomorrow, to match the proposed Stream::toList; the
> > spec was crafted to preserve this option.  We probably won't, because
> > there are too many programmers who refuse to read the specification, and
> > have baked in the assumption that it returns an ArrayList, and this
> > would likely just be picking a fight for little good reason, regardless
> > of who is right.)
>
> My experience based on writing Stream code as well as reading Stream
> code written by other developers suggest that one of the most popular
> reasons to modify the resulting list is caused by the need to add a
> special value (or a few of them) to the resulting list. We have
> several options:
> 1. List result =
> Stream.concat(xyz.stream().filter(...).map(...),
> Stream.of("special")).collect(toList());
> Canonical, recommended way, no mutability. However concat is really
> ugly, it breaks the fluent style of stream API calls, it's hard to
> read and people just hate it.
>
> 2. List result =
> xyz.stream().filter(...).map(...).collect(toCollection(ArrayList::new));
> result.add("special");
> Mutable result, but much more readable. Also, according to spec. But
> this ArrayList::new is hard to type and makes the code longer.
>
> 3. List result = xyz.stream().filter(...).map(...).collect(toList());
> result.add("special");
> Shorter and readable! Well, violates the spec but works!
>
> What people really need is an easy and readable way to concatenate
> lists and streams, appending new elements, etc. The JDK doesn't offer
> really great options here, so people often end up with external
> utility methods or libraries.
>
> Something like this would be very welcomed (and would reduce the need
> for mutable lists):
> List result =
> xyz.stream().filter(...).map(...).append("special").collect(toList());
>
> As for nullity topic, I really welcome that the proposed toList() is
> null-tolerant but it worth mentioning that existing terminal
> operations like findFirst(), findAny(), min() and max() are already
> null-hostile.
>
> With best regards,
> Tagir Valeev.


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Zheka Kozlov
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

src/java.base/share/classes/java/util/stream/Stream.java line 1192:

> 1190: @SuppressWarnings("unchecked")
> 1191: default List toList() {
> 1192: return (List) Collections.unmodifiableList(new 
> ArrayList<>(Arrays.asList(this.toArray(;

Why can't we return `listFromTrustedArrayNullsAllowed` here as in 
`ReferencePipeline`?
Or at least, we should avoid unnecessary copying of arrays. See [how this is 
done](https://github.com/amaembo/streamex/blob/master/src/main/java/one/util/streamex/AbstractStreamEx.java#L1313)
 in StreamEx.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Tagir Valeev
Hello!

On Wed, Nov 4, 2020 at 2:55 AM Brian Goetz  wrote:
> (In fact, we would be entirely justified to change the behavior of
> Collectors::toList tomorrow, to match the proposed Stream::toList; the
> spec was crafted to preserve this option.  We probably won't, because
> there are too many programmers who refuse to read the specification, and
> have baked in the assumption that it returns an ArrayList, and this
> would likely just be picking a fight for little good reason, regardless
> of who is right.)

My experience based on writing Stream code as well as reading Stream
code written by other developers suggest that one of the most popular
reasons to modify the resulting list is caused by the need to add a
special value (or a few of them) to the resulting list. We have
several options:
1. List result =
Stream.concat(xyz.stream().filter(...).map(...),
Stream.of("special")).collect(toList());
Canonical, recommended way, no mutability. However concat is really
ugly, it breaks the fluent style of stream API calls, it's hard to
read and people just hate it.

2. List result =
xyz.stream().filter(...).map(...).collect(toCollection(ArrayList::new));
result.add("special");
Mutable result, but much more readable. Also, according to spec. But
this ArrayList::new is hard to type and makes the code longer.

3. List result = xyz.stream().filter(...).map(...).collect(toList());
result.add("special");
Shorter and readable! Well, violates the spec but works!

What people really need is an easy and readable way to concatenate
lists and streams, appending new elements, etc. The JDK doesn't offer
really great options here, so people often end up with external
utility methods or libraries.

Something like this would be very welcomed (and would reduce the need
for mutable lists):
List result =
xyz.stream().filter(...).map(...).append("special").collect(toList());

As for nullity topic, I really welcome that the proposed toList() is
null-tolerant but it worth mentioning that existing terminal
operations like findFirst(), findAny(), min() and max() are already
null-hostile.

With best regards,
Tagir Valeev.


Re: RFR: 8180352: Add Stream.toList() method

2020-11-04 Thread Zheka Kozlov
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

Changes requested by orio...@github.com (no known OpenJDK username).

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Stuart Marks
On Tue, 3 Nov 2020 18:53:23 GMT, Stuart Marks  wrote:

>> Should there be a test that tests the default implementation in 
>> `j.u.s.Stream`? Or maybe there is and I missed that?
>
> @dfuch wrote:
>> Should there be a test that tests the default implementation in 
>> `j.u.s.Stream`? Or maybe there is and I missed that?
> 
> The test method `testDefaultOps` does that. The stream test framework has a 
> thing called `DefaultMethodStreams` that delegates everything except default 
> methods to another Stream instance, so this should test the default 
> implementation.

OK, there are rather too many forking threads here for me to try to reply to 
any particular message, so I'll try to summarize things and say what I intend 
to do.

Null tolerance. While part of me wants to prohibit nulls, the fact is that 
Streams pass through nulls, and toList() would be much less useful if it were 
to reject nulls. The affinity here is closer to Stream.toArray(), which allows 
nulls, rather than Collectors.to[Unmodifiable]List.

Unmodifiability. Unlike with nulls, this is where we've been taking a strong 
stand recently, where new constructs are unmodifiable ("shallowly immutable"). 
Consider the Java 9 unmodifiable collections, records, primitive classes, etc. 
-- they're all unmodifiable. They're data-race-free and are resistant to a 
whole class of bugs that arise from mutability.

Unfortunately, the combination of the above means that the returned List is 
neither like an ArrayList nor like an unmodifiable list produced by List.of() 
or by Collectors.toUnmodifiableList(). [Heavy sigh.] While I've been somewhat 
reluctant to introduce this new thing, the alternatives of trying to reuse one 
of the existing things are worse.

John and Rémi pointed out that the way I implemented this, using a subclass, 
reintroduces the possibility of problems with megamorphic dispatch which we had 
so carefully avoided by reducing the number of implementation classes in this 
area. I agree this is a problem.

I also had an off-line discussion with John where we discussed the 
serialization format, which unfortunately is somewhat coupled with this issue. 
(Fortunately I think it can mostly be decoupled.) Given that we're introducing 
a new thing, which is an unmodifiable-list-that-allows-nulls, this needs to be 
manifested in the serialized form of these new objects. (This corresponds to 
the new tag value of IMM_LIST_NULLS in the CollSer class.) The alternative is 
to reuse the existing serialized form, IMM_LIST, for both of these cases, 
relaxing it to allow nulls. This would be a serialization immutability problem. 
Suppose I had an application that created a data structure that used lists from 
List.of(), and I had a global assertion over that structure that it contained 
no nulls. Further suppose that I serialized and deserizalized this structure. 
I'd want that assertion to be preserved after deserialization. If another app 
(or a future version of this app) created the structure using Stream.to
 List(), this would allow nulls to leak into that structure and violate that 
assertion. Therefore, the result of Stream.toList() should not be 
serialization-compatible with List.of() et. al. That's why there's the new 
IMM_LIST_NULLS tag in the serial format.

However, the new representation in the serial format does NOT necessarily 
require the implementation to be a different class. I'm going to investigate 
collapsing ListN and ListNNullsAllowed back into a single class, while 
preserving the separate serialized forms. This should mitigate the concerns 
about megamorphic dispatch.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Stuart Marks




On 11/3/20 3:10 AM, Florian Weimer wrote:

I'd expect a test here that if the list contains a null element, `List::copyOf` 
throws `NullPointerException`.


The new Stream.toList() actually permits null elements (by design) so it goes 
through a different path than List::copyOf. The new tests' data provider does 
include nulls in the input, and these should be accepted.


Rejection of nulls for List::copyOf is be handled by tests in

test/jdk/java/util/List/ListFactories.java

s'marks


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Remi Forax
- Mail original -
> De: "Martin Desruisseaux" 
> À: "core-libs-dev" 
> Envoyé: Mardi 3 Novembre 2020 23:16:49
> Objet: Re: RFR: 8180352: Add Stream.toList() method

> Hello
> 
> Le 03/11/2020 à 21:30, fo...@univ-mlv.fr a écrit :
> 
>> You know that you can not change the implementation of
>> Collectors.toList(), and you also know that if there is a method
>> Stream.toList(), people will replace the calls to
>> .collect(Collectors.toList()) by a call to Stream.toList() for the
>> very same reason (…snip…)
>>
> Would they would be so numerous to do this change without verifying if
> the specifications match? (on my side I do read the method javadoc). But
> even if we worry about developers not reading javadoc, the argument
> could go both ways: they could assume that all Stream methods accepts
> null and not read that Stream.toList() specifies otherwise.

yes, that why i said to Brian that he is right that stream.toList() should 
return a List that accept null.

> 
>      Martin

Rémi


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread forax
Hi Aaron, 

> De: "Aaron Scott-Boddendijk" 
> À: "Remi Forax" 
> Cc: "Brian Goetz" , "Stuart Marks"
> , "core-libs-dev" 
> Envoyé: Mardi 3 Novembre 2020 21:59:37
> Objet: Re: RFR: 8180352: Add Stream.toList() method

> >>> But it can not be immutable too, for the same reason.

> >> Nope. The spec of toList() very clearly says: no guarantees as to the type,
> >> mutability, serializability, etc etc of the returned list. That doesn't 
> >> mean
> >> that every method returning a List added to the JDK after 
> >> Collectors::toList
> >> must similarly disavow all such guarantees.

> >> (In fact, we would be entirely justified to change the behavior of
> >> Collectors::toList tomorrow, to match the proposed Stream::toList; the 
> >> spec was
> >> crafted to preserve this option. We probably won't, because there are too 
> >> many
> >> programmers who refuse to read the specification, and have baked in the
> >> assumption that it returns an ArrayList, and this would likely just be 
> >> picking
> >> a fight for little good reason, regardless of who is right.)

> >I don't understand your argument.
>>You know that you can not change the implementation of Collectors.toList(), 
>>and
>>you also know that if there is a method Stream.toList(), people will replace
> >the calls to
>>.collect(Collectors.toList()) by a call to Stream.toList() for the very same
>>reason but you want the semantics of Stream.toList() to be different from the
> >semantics of Collectors.toList().

> Surely, taking that position, List.of(...) should not have produced an
> unmodifiable list since many combinations of creating Lists would not have
> produced unmodifiable lists.

The problem is that people will see stream.toList() as a shorcut for 
stream.collect(Collectors.toList()), 
i've not seen my student thinking that List.of() is a shortcut for creating an 
ArrayList with some elements, but maybe it's because we introduce ArrayList 
later compared to List.of(...). 

> In all of the teams I've worked with, through many Java version transitions 
> (and
> 3rdparty library version transitions), dealing with the selective adoption of
> new APIs is nothing new.

> This case is no different, developers will need to identify the cases in which
> they can accept the list as unmodifiable and replace those uses of
> Collectors.toList() and Collectors.toUnmodifiableList().

We have Collectors.toList() and Collectors.toUnmodifiableList(), would it make 
more sense to have stream.toList() and stream.toUnmodifiableList() instead of 
having stream.toList() that behave half way in betwen the semantics of 
Collectors.toList() and Collectors.toUnmodifiableList(). 

> And static-analysis will be able to propose some of those replacements for us.

I believe that a tool able to suggest to use Collectors.toUnmodifiableList() 
instead of Collectors.toList() will rely one a global interprocedural static 
analysis, so such analysis will be invalid once you will update one of your 
dependency. 

> Does this mean it's more complex than a text-replace, sure. Will there be
> mistakes, sure. But are there benefits to the immutability of the API result? 
> I
> suggest that's pretty hard to debate against given the risk of a mistake is an
> exception rather than bad data outcomes.

You want immutability/unmodifiability over the cost of people miss mis-using 
the API. I think it's a sin as bad as me want stream.toList() to return a null 
hostile List :) 
As Brain implies, Java is a blue collar language, we should not forget it, 
whatever we think about null or immutability. 

And I said above perhaps we should not discussed about adding a method 
Stream.toList() but a method Stream.toUnmodifiableList(). 

> Please don't blunt the tool to make it safer for children.

> --
> Aaron Scott-Boddendijk

Rémi 

> On Wed, Nov 4, 2020 at 9:30 AM < [ mailto:fo...@univ-mlv.fr | 
> fo...@univ-mlv.fr
> ] > wrote:

>> > De: "Brian Goetz" < [ mailto:brian.go...@oracle.com | 
>> > brian.go...@oracle.com ] >
>> > À: "Remi Forax" < [ mailto:fo...@univ-mlv.fr | fo...@univ-mlv.fr ] >
>>> Cc: "Stuart Marks" < [ mailto:sma...@openjdk.java.net | 
>>> sma...@openjdk.java.net
>> > ] >, "core-libs-dev"
>> > < [ mailto:core-libs-dev@openjdk.java.net | core-libs-dev@openjdk.java.net 
>> > ] >
>> > Envoyé: Mardi 3 Novembre 2020 20:54:57
>> > Objet: Re: RFR: 8180352: Add Stream.toList() method

>> >>> There is no value in making users remember which stream methods are
>> >>> nul

Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Martin Desruisseaux

Hello

Le 03/11/2020 à 21:30, fo...@univ-mlv.fr a écrit :

You know that you can not change the implementation of 
Collectors.toList(), and you also know that if there is a method 
Stream.toList(), people will replace the calls to 
.collect(Collectors.toList()) by a call to Stream.toList() for the 
very same reason (…snip…)


Would they would be so numerous to do this change without verifying if 
the specifications match? (on my side I do read the method javadoc). But 
even if we worry about developers not reading javadoc, the argument 
could go both ways: they could assume that all Stream methods accepts 
null and not read that Stream.toList() specifies otherwise.


    Martin




Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Aaron Scott-Boddendijk
 >>> But it can not be immutable too, for the same reason.

>> Nope. The spec of toList() very clearly says: no guarantees as to the
type,
>> mutability, serializability, etc etc of the returned list. That doesn't
mean
>> that every method returning a List added to the JDK after
Collectors::toList
>> must similarly disavow all such guarantees.

>> (In fact, we would be entirely justified to change the behavior of
>> Collectors::toList tomorrow, to match the proposed Stream::toList; the
spec was
>> crafted to preserve this option. We probably won't, because there are
too many
>> programmers who refuse to read the specification, and have baked in the
>> assumption that it returns an ArrayList, and this would likely just be
picking
>> a fight for little good reason, regardless of who is right.)

>I don't understand your argument.
>You know that you can not change the implementation of
Collectors.toList(), and you also know that if there is a method
Stream.toList(), people will replace the calls to
>.collect(Collectors.toList()) by a call to Stream.toList() for the very
same reason but you want the semantics of Stream.toList() to be different
from the semantics of Collectors.toList().

Surely, taking that position, List.of(...) should not have produced an
unmodifiable list since many combinations of creating Lists would not have
produced unmodifiable lists.

In all of the teams I've worked with, through many Java version transitions
(and 3rdparty library version transitions), dealing with the selective
adoption of new APIs is nothing new.

This case is no different, developers will need to identify the cases in
which they can accept the list as unmodifiable and replace those uses of
Collectors.toList() and Collectors.toUnmodifiableList().

And static-analysis will be able to propose some of those replacements for
us.

Does this mean it's more complex than a text-replace, sure. Will there be
mistakes, sure. But are there benefits to the immutability of the API
result? I suggest that's pretty hard to debate against given the risk of a
mistake is an exception rather than bad data outcomes.

Please don't blunt the tool to make it safer for children.

--
Aaron Scott-Boddendijk


On Wed, Nov 4, 2020 at 9:30 AM  wrote:

> > De: "Brian Goetz" 
> > À: "Remi Forax" 
> > Cc: "Stuart Marks" , "core-libs-dev"
> > 
> > Envoyé: Mardi 3 Novembre 2020 20:54:57
> > Objet: Re: RFR: 8180352: Add Stream.toList() method
>
> >>> There is no value in making users remember which stream methods are
> >>> null-hostile and which are null-tolerant; this is just more accidental
> >>> complexity.
>
> >> I think that ship has sailed a long ago.
> >> Some collectors are null hostile, some are not.
> >> You can make a point that a Collector is technically not the Stream API
> per se.
>
> > Yes, and this is no mere "technical argument". A Collector is a
> collection of
> > behaviors, governed by its own spec. The behaviors passed to stream
> operations,
> > whether they be lambdas passed by the user (`.map(x ->
> x.hopeXIsntNull())`) or
> > collector objects or comparators, are under the control of the user, and
> it is
> > the user's responsibility to pass behaviors which are compatible with
> the data
> > domain -- which the user knows and the streams implementation cannot
> know.
>
> >> Because of that, i don't think we even have the choice of the semantics
> for
> >> Stream.toList(), it has to be the same as
> stream.collect(Collectors.toList()).
>
> > This doesn't remotely follow. (And, if it did, I would likely not even
> support
> > this RFE.)
> It seems you had an issue when replying to my mail, there is a paragraph
> before "Because"
>
> > Let's take a step back,
> > if we introduce a method toList() on Stream it will be used a lot, i
> mean really
> > a lot, to the point where people will change the code from
> > stream.collect(Collectors.toList()) to use stream.toList() instead.
>
> > The spec of Collectors::toList was crafted to disavow pretty much
> anything other
> > than List-hood, in part in anticipation of this addition.
>
> >> But it can not be immutable too, for the same reason.
>
> > Nope. The spec of toList() very clearly says: no guarantees as to the
> type,
> > mutability, serializability, etc etc of the returned list. That doesn't
> mean
> > that every method returning a List added to the JDK after
> Collectors::toList
> > must similarly disavow all such guarantees.
>
> > (In fact, we would be entirely justified to change the behavior of
> > Collectors::to

Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread forax
> De: "Brian Goetz" 
> À: "Remi Forax" 
> Cc: "Stuart Marks" , "core-libs-dev"
> 
> Envoyé: Mardi 3 Novembre 2020 20:54:57
> Objet: Re: RFR: 8180352: Add Stream.toList() method

>>> There is no value in making users remember which stream methods are
>>> null-hostile and which are null-tolerant; this is just more accidental
>>> complexity.

>> I think that ship has sailed a long ago.
>> Some collectors are null hostile, some are not.
>> You can make a point that a Collector is technically not the Stream API per 
>> se.

> Yes, and this is no mere "technical argument". A Collector is a collection of
> behaviors, governed by its own spec. The behaviors passed to stream 
> operations,
> whether they be lambdas passed by the user (`.map(x -> x.hopeXIsntNull())`) or
> collector objects or comparators, are under the control of the user, and it is
> the user's responsibility to pass behaviors which are compatible with the data
> domain -- which the user knows and the streams implementation cannot know.

>> Because of that, i don't think we even have the choice of the semantics for
>> Stream.toList(), it has to be the same as 
>> stream.collect(Collectors.toList()).

> This doesn't remotely follow. (And, if it did, I would likely not even support
> this RFE.)
It seems you had an issue when replying to my mail, there is a paragraph before 
"Because" 

> Let's take a step back, 
> if we introduce a method toList() on Stream it will be used a lot, i mean 
> really 
> a lot, to the point where people will change the code from 
> stream.collect(Collectors.toList()) to use stream.toList() instead. 

> The spec of Collectors::toList was crafted to disavow pretty much anything 
> other
> than List-hood, in part in anticipation of this addition.

>> But it can not be immutable too, for the same reason.

> Nope. The spec of toList() very clearly says: no guarantees as to the type,
> mutability, serializability, etc etc of the returned list. That doesn't mean
> that every method returning a List added to the JDK after Collectors::toList
> must similarly disavow all such guarantees.

> (In fact, we would be entirely justified to change the behavior of
> Collectors::toList tomorrow, to match the proposed Stream::toList; the spec 
> was
> crafted to preserve this option. We probably won't, because there are too many
> programmers who refuse to read the specification, and have baked in the
> assumption that it returns an ArrayList, and this would likely just be picking
> a fight for little good reason, regardless of who is right.)
I don't understand your argument. 
You know that you can not change the implementation of Collectors.toList(), and 
you also know that if there is a method Stream.toList(), people will replace 
the calls to .collect(Collectors.toList()) by a call to Stream.toList() for the 
very same reason but you want the semantics of Stream.toList() to be different 
from the semantics of Collectors.toList(). 

Rémi 


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Brian Goetz




There is no value in making users remember which stream methods are
null-hostile and which are null-tolerant; this is just more accidental
complexity.

I think that ship has sailed a long ago.
Some collectors are null hostile, some are not.
You can make a point that a Collector is technically not the Stream API per se.


Yes, and this is no mere "technical argument".  A Collector is a 
collection of behaviors, governed by its own spec.  The behaviors passed 
to stream operations, whether they be lambdas passed by the user 
(`.map(x -> x.hopeXIsntNull())`) or collector objects or comparators, 
are under the control of the user, and it is the user's responsibility 
to pass behaviors which are compatible with the data domain -- which the 
user knows and the streams implementation cannot know.



Because of that, i don't think we even have the choice of the semantics for 
Stream.toList(), it has to be the same as stream.collect(Collectors.toList()).


This doesn't remotely follow.  (And, if it did, I would likely not even 
support this RFE.)  The spec of Collectors::toList was crafted to 
disavow pretty much anything other than List-hood, in part in 
anticipation of this addition.




But it can not be immutable too, for the same reason.


Nope.  The spec of toList() very clearly says: no guarantees as to the 
type, mutability, serializability, etc etc of the returned list.  That 
doesn't mean that every method returning a List added to the JDK after 
Collectors::toList must similarly disavow all such guarantees.


(In fact, we would be entirely justified to change the behavior of 
Collectors::toList tomorrow, to match the proposed Stream::toList; the 
spec was crafted to preserve this option.  We probably won't, because 
there are too many programmers who refuse to read the specification, and 
have baked in the assumption that it returns an ArrayList, and this 
would likely just be picking a fight for little good reason, regardless 
of who is right.)


Please, can we stop having this same tired "I hate nulls, let's find a 
new way to punish people who like them" discussion for every single 
feature?




Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread forax
- Mail original -
> De: "Brian Goetz" 
> À: "Remi Forax" , "Stuart Marks" 
> Cc: "core-libs-dev" 
> Envoyé: Mardi 3 Novembre 2020 19:02:37
> Objet: Re: RFR: 8180352: Add Stream.toList() method

>>> This change introduces a new terminal operation on Stream. This looks like a
>>> convenience method for Stream.collect(Collectors.toList()) or
>>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this
>>> method directly on Stream enables it to do what can't easily by done by a
>>> Collector. In particular, it allows the stream to deposit results directly 
>>> into
>>> a destination array (even in parallel) and have this array be wrapped in an
>>> unmodifiable List without copying.
>>>
>>> Hi Stuart,
>>> I'm Okay with the idea of having a method toList() on Stream but really 
>>> dislike
>>> the proposed semantics because tit is neither stream.collect(toList()) nor
>>> stream.collect(toUnmodifiableList()) but something in between.
>>>
>>> It's true that a Stream support nulls, we want people to be able map() with 
>>> a
>>> method that returns null and then filter out the nulls (even if using 
>>> flatMap
>>> for this case is usually a better idea),
>>> but it doesn't mean that all methods of the Stream interface has to support
>>> nulls, the original idea was more to allow nulls to flow in the stream 
>>> because
>>> at some point they will be removed before being stored in a collection.
> 
> Uhm ... no and no.
> 
> It does mean that all methods of the stream interface have to support
> nulls.  Streams are null tolerant.  Because ...
> 
> The original idea was not "the nulls can be removed later."  The
> original idea was "Streams are plumbing, they pass values through
> pipelines, to user-specified lambdas, into arrays, etc, and the stream
> plumbing should not have an opinion on the values that are flowing
> through."   And this was the right choice.

A stream is computation, so i agree that letting nulls to flow is the right 
call.
So i stand corrected on that point.

> 
> There is no value in making users remember which stream methods are
> null-hostile and which are null-tolerant; this is just more accidental
> complexity. 

I think that ship has sailed a long ago.
Some collectors are null hostile, some are not.
You can make a point that a Collector is technically not the Stream API per se.

[...]

Let's take a step back,
if we introduce a method toList() on Stream it will be used a lot, i mean 
really a lot, to the point where people will change the code from 
stream.collect(Collectors.toList()) to use stream.toList() instead.

Because of that, i don't think we even have the choice of the semantics for 
Stream.toList(), it has to be the same as stream.collect(Collectors.toList()).

So you are right that toList() can not be null hostile because 
Collectors.toList() is not null hostile.

But it can not be immutable too, for the same reason.

> Stuart made the right choice here.

I would say half of the right choices, null hostile: right, immutable: wrong :)

And if we at some point introduce a method toImmutableList() on Stream, it will 
have to be null hostile.

Rémi


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread John Rose
I agree with Stuart and Brian that streams should be null-tolerant,
including in this case.  And of course I’m delighted to see an
immutable container as the output to the utility method; I’m
a fan of reduced-copy, race-free, bug-resistant algorithms you
get from immutability.

On Nov 3, 2020, at 6:53 AM, Remi Forax  wrote:
> 
> Also, adding a third immutable list creates a problem, it means that now when 
> we get an immutable list it can be 3 different implementations but the VM 
> only uses bimorphic inlining cache,
> so more callsites will fail to inline because of that. I think we have 
> already reduced the number of implementation of immutable map from 3 to 2 for 
> the very same reasons.

Yes, this part concerns me, with my JIT hat on.

It seems to me that the problem can be removed simply by allowing
the existing ListN object to contain nulls.  That’s not the same
thing as allowing List.of(1,2,3, null):  That factory method can
reject nulls, while the privileged factory method that makes
an array-backed ListN can simply sidestep the null check.

And if a call to Stream::toList call returns 0, 1, or 2 items,
it’s a straightforward matter to test if either is null, and then
choose to use either ListN or List12.  No new implementation
classes, at all!

Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Stuart Marks
On Tue, 3 Nov 2020 10:06:26 GMT, Daniel Fuchs  wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> Should there be a test that tests the default implementation in 
> `j.u.s.Stream`? Or maybe there is and I missed that?

@dfuch wrote:
> Should there be a test that tests the default implementation in 
> `j.u.s.Stream`? Or maybe there is and I missed that?

The test method `testDefaultOps` does that. The stream test framework has a 
thing called `DefaultMethodStreams` that delegates everything except default 
methods to another Stream instance, so this should test the default 
implementation.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Stuart Marks
On Tue, 3 Nov 2020 11:05:21 GMT, Stephen Colebourne  
wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> src/java.base/share/classes/java/util/stream/Stream.java line 1168:
> 
>> 1166:  * Accumulates the elements of this stream into a {@code List}. 
>> The elements in
>> 1167:  * the list will be in this stream's encounter order, if one 
>> exists. There are no
>> 1168:  * guarantees on the implementation type, mutability, 
>> serializability, or
> 
> It would be useful for callers to feel more confident that they will get an 
> immutable instance. In java.time.* we have wording like "This interface 
> places no restrictions on the mutability of implementations, however 
> immutability is strongly recommended." Could something like that work here, 
> emphasising that everyone implementing this method should seek to return an 
> immutable list?

Yes, good point, the "no guarantee of mutability" clashes with the later 
statement about the possibility of the returned instance being value-based, 
which strongly implies immutability. I'll work on tuning this up to be a 
stronger statement on immutability, while retaining "no-guarantee" for 
implementation type, serializability, etc. I think we do want to preserve 
future implementation flexibility in those areas.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Stuart Marks
On Tue, 3 Nov 2020 10:58:25 GMT, Stephen Colebourne  
wrote:

>> This change introduces a new terminal operation on Stream. This looks like a 
>> convenience method for Stream.collect(Collectors.toList()) or 
>> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
>> method directly on Stream enables it to do what can't easily by done by a 
>> Collector. In particular, it allows the stream to deposit results directly 
>> into a destination array (even in parallel) and have this array be wrapped 
>> in an unmodifiable List without copying.
>> 
>> In the past we've kept most things from the Collections Framework as 
>> implementations of Collector, not directly on Stream, whereas only 
>> fundamental things (like toArray) appear directly on Stream. This is true of 
>> most Collections, but it does seem that List is special. It can be a thin 
>> wrapper around an array; it can handle generics better than arrays; and 
>> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
>> can be value-based. See John Rose's comments in the bug report:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
>> 
>> This operation is null-tolerant, which matches the rest of Streams. This 
>> isn't specified, though; a general statement about null handling in Streams 
>> is probably warranted at some point.
>> 
>> Finally, this method is indeed quite convenient (if the caller can deal with 
>> what this operation returns), as collecting into a List is the most common 
>> stream terminal operation.
>
> src/java.base/share/classes/java/util/ImmutableCollections.java line 211:
> 
>> 209: }
>> 210: 
>> 211: switch (input.length) { // implicit null check of elements
> 
> Was a variable renamed? Should "elements" be "input"?

Yes, actually that comment belongs up above. I'll fix it, thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Brian Goetz




This change introduces a new terminal operation on Stream. This looks like a
convenience method for Stream.collect(Collectors.toList()) or
Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this
method directly on Stream enables it to do what can't easily by done by a
Collector. In particular, it allows the stream to deposit results directly into
a destination array (even in parallel) and have this array be wrapped in an
unmodifiable List without copying.

Hi Stuart,
I'm Okay with the idea of having a method toList() on Stream but really dislike 
the proposed semantics because tit is neither stream.collect(toList()) nor 
stream.collect(toUnmodifiableList()) but something in between.

It's true that a Stream support nulls, we want people to be able map() with a 
method that returns null and then filter out the nulls (even if using flatMap 
for this case is usually a better idea),
but it doesn't mean that all methods of the Stream interface has to support 
nulls, the original idea was more to allow nulls to flow in the stream because 
at some point they will be removed before being stored in a collection.


Uhm ... no and no.

It does mean that all methods of the stream interface have to support 
nulls.  Streams are null tolerant.  Because ...


The original idea was not "the nulls can be removed later."  The 
original idea was "Streams are plumbing, they pass values through 
pipelines, to user-specified lambdas, into arrays, etc, and the stream 
plumbing should not have an opinion on the values that are flowing 
through."   And this was the right choice.


There is no value in making users remember which stream methods are 
null-hostile and which are null-tolerant; this is just more accidental 
complexity.  And the root cause of that accidental complexity is the 
misguided belief that we can (and should) contain the consequences of 
nullity by making ad-hoc restrictions about nullity.   That doesn't 
work, and all it does is add more complexity and more ways for X to not 
interact with Y.


I understand the hatred for nulls, but it's a fantasy to think we can 
contain them with ad-hoc restrictions.   And the aggregate cost in 
complexity of all the ad-hoc decisions is pretty significant. Stuart 
made the right choice here.





Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Remi Forax
- Mail original -
> De: "Stuart Marks" 
> À: "core-libs-dev" 
> Envoyé: Mardi 3 Novembre 2020 04:18:08
> Objet: RFR: 8180352: Add Stream.toList() method

> This change introduces a new terminal operation on Stream. This looks like a
> convenience method for Stream.collect(Collectors.toList()) or
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this
> method directly on Stream enables it to do what can't easily by done by a
> Collector. In particular, it allows the stream to deposit results directly 
> into
> a destination array (even in parallel) and have this array be wrapped in an
> unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as
> implementations of Collector, not directly on Stream, whereas only fundamental
> things (like toArray) appear directly on Stream. This is true of most
> Collections, but it does seem that List is special. It can be a thin wrapper
> around an array; it can handle generics better than arrays; and unlike an
> array, it can be made unmodifiable (shallowly immutable); and it can be
> value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This isn't
> specified, though; a general statement about null handling in Streams is
> probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with
> what this operation returns), as collecting into a List is the most common
> stream terminal operation.

Hi Stuart,
I'm Okay with the idea of having a method toList() on Stream but really dislike 
the proposed semantics because tit is neither stream.collect(toList()) nor 
stream.collect(toUnmodifiableList()) but something in between.

It's true that a Stream support nulls, we want people to be able map() with a 
method that returns null and then filter out the nulls (even if using flatMap 
for this case is usually a better idea),
but it doesn't mean that all methods of the Stream interface has to support 
nulls, the original idea was more to allow nulls to flow in the stream because 
at some point they will be removed before being stored in a collection.

For me allowing in 2020 to store null in a collection is very backward, all 
collections since 1.6 doesn't support nulls.

Also, adding a third immutable list creates a problem, it means that now when 
we get an immutable list it can be 3 different implementations but the VM only 
uses bimorphic inlining cache,
so more callsites will fail to inline because of that. I think we have already 
reduced the number of implementation of immutable map from 3 to 2 for the very 
same reasons.

I believe that instead of inventing a third semantics that allows to store null 
in a collection, we should use the semantics of 
stream.collect(Collectors.toUnmodifiableList()) even if it means that toList() 
will throw a NPE if one element is null.

Rémi


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Florian Weimer
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToListOpTest.java
 line 47:

> 45: }
> 46: 
> 47: private void checkUnmodifiable(List list) {

I'd expect a test here that if the list contains a null element, `List::copyOf` 
throws `NullPointerException`.

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Stephen Colebourne
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

src/java.base/share/classes/java/util/ImmutableCollections.java line 211:

> 209: }
> 210: 
> 211: switch (input.length) { // implicit null check of elements

Was a variable renamed? Should "elements" be "input"?

src/java.base/share/classes/java/util/stream/Stream.java line 1168:

> 1166:  * Accumulates the elements of this stream into a {@code List}. The 
> elements in
> 1167:  * the list will be in this stream's encounter order, if one 
> exists. There are no
> 1168:  * guarantees on the implementation type, mutability, 
> serializability, or

It would be useful for callers to feel more confident that they will get an 
immutable instance. In java.time.* we have wording like "This interface places 
no restrictions on the mutability of implementations, however immutability is 
strongly recommended." Could something like that work here, emphasising that 
everyone implementing this method should seek to return an immutable list?

-

PR: https://git.openjdk.java.net/jdk/pull/1026


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread Daniel Fuchs
On Tue, 3 Nov 2020 01:33:32 GMT, Stuart Marks  wrote:

> This change introduces a new terminal operation on Stream. This looks like a 
> convenience method for Stream.collect(Collectors.toList()) or 
> Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
> method directly on Stream enables it to do what can't easily by done by a 
> Collector. In particular, it allows the stream to deposit results directly 
> into a destination array (even in parallel) and have this array be wrapped in 
> an unmodifiable List without copying.
> 
> In the past we've kept most things from the Collections Framework as 
> implementations of Collector, not directly on Stream, whereas only 
> fundamental things (like toArray) appear directly on Stream. This is true of 
> most Collections, but it does seem that List is special. It can be a thin 
> wrapper around an array; it can handle generics better than arrays; and 
> unlike an array, it can be made unmodifiable (shallowly immutable); and it 
> can be value-based. See John Rose's comments in the bug report:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065
> 
> This operation is null-tolerant, which matches the rest of Streams. This 
> isn't specified, though; a general statement about null handling in Streams 
> is probably warranted at some point.
> 
> Finally, this method is indeed quite convenient (if the caller can deal with 
> what this operation returns), as collecting into a List is the most common 
> stream terminal operation.

Should there be a test that tests the default implementation in `j.u.s.Stream`? 
Or maybe there is and I missed that?

-

PR: https://git.openjdk.java.net/jdk/pull/1026


RFR: 8180352: Add Stream.toList() method

2020-11-02 Thread Stuart Marks
This change introduces a new terminal operation on Stream. This looks like a 
convenience method for Stream.collect(Collectors.toList()) or 
Stream.collect(Collectors.toUnmodifiableList()), but it's not. Having this 
method directly on Stream enables it to do what can't easily by done by a 
Collector. In particular, it allows the stream to deposit results directly into 
a destination array (even in parallel) and have this array be wrapped in an 
unmodifiable List without copying.

In the past we've kept most things from the Collections Framework as 
implementations of Collector, not directly on Stream, whereas only fundamental 
things (like toArray) appear directly on Stream. This is true of most 
Collections, but it does seem that List is special. It can be a thin wrapper 
around an array; it can handle generics better than arrays; and unlike an 
array, it can be made unmodifiable (shallowly immutable); and it can be 
value-based. See John Rose's comments in the bug report:

https://bugs.openjdk.java.net/browse/JDK-8180352?focusedCommentId=14133065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14133065

This operation is null-tolerant, which matches the rest of Streams. This isn't 
specified, though; a general statement about null handling in Streams is 
probably warranted at some point.

Finally, this method is indeed quite convenient (if the caller can deal with 
what this operation returns), as collecting into a List is the most common 
stream terminal operation.

-

Commit messages:
 - 8180352: Add Stream.toList() method

Changes: https://git.openjdk.java.net/jdk/pull/1026/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=1026=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8180352
  Stats: 405 lines in 6 files changed: 358 ins; 23 del; 24 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1026.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1026/head:pull/1026

PR: https://git.openjdk.java.net/jdk/pull/1026