----- Mail original ----- > De: "Peter Levart" <peter.lev...@gmail.com> > À: "Remi Forax" <fo...@univ-mlv.fr>, "dfranken jdk" <dfranken....@gmail.com> > Cc: "Stuart Marks" <stuart.ma...@oracle.com>, "core-libs-dev" > <core-libs-dev@openjdk.java.net> > Envoyé: Mardi 11 Mai 2021 13:44:08 > Objet: Re: [External] : Re: ReversibleCollection proposal
> Hi Remi, Hi Peter, > > > I see you avoided addFirst/addLast methods. While getFirst/removeFirst > could be specified to be consistent with iterator().next(), so the > "first" part of the name would be softer, addFirst is cursed two times: > it has no equivalent in current API and it clashes with void returning > method in Deque. So the best "soft" alternative seems to be the existing > Collection.add(E) method which is equivalent to addFirst() when > Collection is ordered or something totally different when it is sorted > (i.e. SortedSet) and again different when it is neither. yes, i've chosen to not have addFirst/addLast because of addFirst/addLast on Deque returning void. To simplify the issue of adding addFirst/addLast on List, SortedSet/NavigableSet, EnumSet, LinkedHashSet and adding addFirstEntry/addLastEntry on SortedMap/NavigableMap, EnumMap and LinkedHashMap can be discussed separately. > > To avoid bugs when an API expects an ordered Collection but has no > static type to express that, a means to perform a runtime check would be > needed. Marker interface is a way to add a bit of static typing to be > used in generic methods like: > > > <E, C extends Collectiom<E> & Ordered> void doSomething(C > orderedCollection); > > > ...but that's very limited. So probably not worth it. yes, and also if we go with Ordered people will also want Sorted, NoDuplicate, NonNullable, etc. > Stream API choose a different approach: Spliterator.characteristics(). So > probably, > Collection.characteristics() could also be added. The set of bit > constants in Spliterator is applicable also to Collection (except > SUBSIZED which has no sense in Collection, while SIZED is always present): > > ORDERED > DISTINCT > SORTED > SIZED > NONNULL > IMMUTABLE > CONCURRENT > > Specifically for Collection I would also add: > > REVERSIBLE > > ...to indicate that Collection.reversed() would succeed. > > > The question is what would the default value of > Collection.characteristics() be. The most sensible would be SIZED with > overrides in: > > - List (SIZED | ORDERED | REVERSIBLE) > - Set (SIZED | DISTINCT) > - SortedSet (SIZED | SORTED | REVERSIBLE | DISTINCT) > - LinkedHashSet (SIZED | ORDERED | REVERSIBLE | DISTINCT) > - Queue (SIZED | ORDERED) > - Deque (SIZED | ORDERED | REVERSIBLE) > > ... Immutable JDK List and Set implementations could also add IMMUTABLE >| NONNULL, while concurrent JDK implementations would add CONCURRENT | > NONNULL. > > > ...but one could also base default value on something like this: > > > default int characteristics() { > return stream().spliterator().characteristics() & > ~Spliterator.SUBSIZED; > } There is already a method Collection.spliterator() [1], no need to use an intermediary stream and also collection.spliterator() != collection.stream().spliterator(), but i'm not sure if it is specified somewhere. For me, adding a method characteristics() to Collection is also a separate issue, perhaps it's more a documentation issue, so adding a paragraph in the javadoc explaining what are the Spliterator characteristics of each collections is enough. > > > Wrappers like Collections.unmodifiableXXX() would just delegate to the > underlying Collection and augment the returned bits (with IMMUTABLE for > example). > > > An API that expects a particular kind of collection could then check > that at runtime. Some of existing collections would have to be updated > to express the correct characteristics, but defaults in interfaces would > be sufficient for most and conservative. > > > WDYT? > > Peter Rémi [1] https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/util/Collection.html#spliterator() > > > On 5/11/21 9:26 AM, fo...@univ-mlv.fr wrote: >> >>> Are you proposing to just add methods to the root Collection interface, >>> because it already has methods which may or may not be implemented or >>> throw Exceptions by its implementations ? >> yes, the idea is to add getFirst/getLast/removeFirst/removeLast and reversed >> on >> Collection as default methods. >> Something like >> >> interface Collection<E> { >> /** >> * throws NoSuchElementException if the collection is empty >> */ >> public default E getFirst() { >> return iterator().next(); >> } >> >> /** >> * throws NoSuchElementException if the collection is empty >> * throws UnsupportedOperationException if the collection is non mutable >> */ >> public default E removeFirst() { >> var it = iterator(); >> var element = it.next(); >> it.remove(); >> return element; >> } >> >> /** >> * throws NonOrderedCollectionException if the collection is non ordered >> */ >> public default Collection<E> reversed() { >> throw new NonOrderedCollectionException(); >> } >> >> /** >> * throws NoSuchElementException if the collection is empty >> * throws NonOrderedCollectionException if the collection is non ordered >> */ >> public default E getLast() { >> return reversed().getFirst(); >> } >> >> /** >> * throws NoSuchElementException if the collection is empty >> * throws UnsupportedOperationException if the collection is non mutable >> * throws NonOrderedCollectionException if the collection is non ordered >> */ >> public default E removeLast() { >> return reversed().removeFirst(); >> } >> } >> >> >> And adds an implementation of reversed(), getLast() and removeLast() on >> NavigableSet/LinkedHashSet/EnumSet/Deque and List. >> >> And do the same thing for Map: >> >> interface Map<K, V> { >> /** >> * throws NoSuchElementException if the map is empty >> */ >> public default Map.Entry<K,V> getFirstEntry() { >> return entrySet().iterator().next(); >> } >> >> /** >> * throws NoSuchElementException if the map is empty >> * throws UnsupportedOperationException if the map is non mutable >> */ >> public default Map.Entry<K,V> removeFirstEntry() { >> var it = entrySet().iterator(); >> var element = it.next(); >> it.remove(); >> return element; >> } >> >> /** >> * throws NonOrderedCollectionException if the map.keySet() is non ordered >> */ >> public default Map<K,V> reversedMap() { >> throw new NonOrderedCollectionException(); >> } >> >> /** >> * throws NoSuchElementException if the map is empty >> * throws NonOrderedCollectionException if the map.keySet() is non ordered >> */ >> public default E getLastEntry() { >> return reversedMap().getFirstEntry(); >> } >> >> /** >> * throws NoSuchElementException if the map is empty >> * throws UnsupportedOperationException if the map is non mutable >> * throws NonOrderedCollectionException if the map.keySet() is non ordered >> */ >> public default E removeLast() { >> return reversedMap().removeFirstEntry(); >> } >> } >> >> And adds an implementation of reversedMap(), getLastEntry() and >> removeLastEntry() on NavigableMap/LinkedHasMap and EnumMap. >> >> >>> But we could also view "having (possibly duplicate) items in a >>> sequence" as a capability and that is defined by an actual interface, >>> List. >> It already exist for a Spliterator, the capability is called DISTINCT, by >> example, set.stream() and list.stream().distinct() are both Streams with no >> duplicate. >> >>> I do fear that if we add methods to the Collection interface, users >>> might not expect them to throw exceptions. E.g. if we add getFirst(), >>> users may just expect it to return the element that iterator().next() >>> or .stream().findFirst().orElseThrow() would return instead of an >>> UnsupportedOperationException. And while HashSet does not have a >>> defined order, it would in practice always return the same element. >> I agree, getFirst() and removeFirst() should have their semantics based on >> iterator (see above), >> because as you said, it's already how findFirst() is specified. >> >>> So while I accept that some hidden capabilities are surfaced only >>> through return values of certain methods, I think an interface is still >>> a better choice here. However, if we go that route, I also agree on the >>> fact that we might need to explore defining proper interfaces for the >>> other capabilities at some point. >>> >>> Kind regards, >>> >>> Dave >> regards, >> Rémi >> >>> On Mon, 2021-05-10 at 12:22 +0200, fo...@univ-mlv.fr wrote: >>>> ----- Mail original ----- >>>>> De: "dfranken jdk" <dfranken....@gmail.com> >>>>> À: "Remi Forax" <fo...@univ-mlv.fr>, "Stuart Marks" >>>>> <stuart.ma...@oracle.com> >>>>> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net> >>>>> Envoyé: Dimanche 9 Mai 2021 12:13:58 >>>>> Objet: Re: [External] : Re: ReversibleCollection proposal >>>>> When I thought about Collection's role in the hierarchy, it seemed >>>>> to >>>>> me that 'Collection' is an interface for describing how elements >>>>> are >>>>> stored in a cardboard box (we can and and remove them) and that we >>>>> might need a different, yet related, interface to describe how to >>>>> retrieve the items from the box. This way we are not tied to the >>>>> Collection hierarchy and we could have one Set implementation which >>>>> is >>>>> ordered and another Set implementation which is not and they would >>>>> both >>>>> still implement Collection, but only one would implement the >>>>> interface. >>>> So you want to model ReversibleSet as Set + Reversible, >>>> Reversible being an interface with a small number of method specific >>>> to the fact that the implementation is reversible. >>>> >>>> This does not work well for two reasons, >>>> - there is no syntax to represent the union of types in Java, >>>> Set & Reversible set = ... >>>> is not a valid syntax. You can use a type variable with several >>>> bounds instead but it does work nicely with the rest of the language >>>> (overidding, overloading, etc). >>>> >>>> - having public methods that takes an interface with few methods is a >>>> common design error. >>>> Let suppose you have a method foo that only prints the elements of >>>> a collection, in that case you may want to type the first parameter >>>> as Iterable or Collection. >>>> But requirements change an now you want to prints the elements >>>> sorted, oops, you have to change the signature of the public method >>>> which may be something not possible >>>> depending how many "clients" this method has. >>>> Providing interfaces with a small number of access methods will >>>> lead to this kind of issue. >>>> >>>>> Imagine an interface like 'java.lang.OrderedEnumerable` if you will >>>>> with methods such as 'first(), last(), etc', then each >>>>> implementation >>>>> would be free to choose to implement the interface or not. I also >>>>> thought about 'OrderedIterable', but there would be too much >>>>> confusion >>>>> with 'Iterable', but I think these are related too. Retrieving >>>>> items is >>>>> an iteration problem, not a storage problem. >>>> The problem is that is you multiply the number of interfaces to >>>> access the elements, you add the dilemma of choice in the mix. >>>> The first iteration of the Scala Collection were like this, too many >>>> interfaces, at least for my taste. >>>> >>>>> While I would love to see the Collection hierarchy redesigned to >>>>> also >>>>> allow for ImmutableCollection which for instance would not have an >>>>> `add(T e)` method, I don't think we can simply do something like >>>>> that >>>>> without breaking too many things. >>>> Dear Santa, i want an interface BuilderCollection that only allows >>>> add() but no remove()/clear(), because if you can not remove >>>> elements, then all the iterators can implement the snapshot at the >>>> beginning semantics, >>>> so no ConcurrentModificationException anymore. For me, being able to >>>> snapshot/freeze a collection is better than an ImmutableCollection, >>>> because it can be immutable when you want. >>>> >>>> Anyway, it's not gonna to happen, there is so many ways to slice an >>>> onion and each have pros and cons and providing all the possible >>>> interfaces is a good.way to make something simple complex. >>>> >>>>> So in short, separating retrieval aspects from storage aspects with >>>>> a >>>>> different interface might be the way to go. >>>>> >>>>> Kind regards, >>>>> >>>>> Dave Franken >>>>> >>>> regards, >>>> Rémi >>>> >>>>> On Wed, 2021-05-05 at 12:41 +0200, fo...@univ-mlv.fr wrote: >>>>>> ----- Mail original ----- >>>>>>> De: "Stuart Marks" <stuart.ma...@oracle.com> >>>>>>> À: "Remi Forax" <fo...@univ-mlv.fr> >>>>>>> Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net> >>>>>>> Envoyé: Mercredi 5 Mai 2021 02:00:03 >>>>>>> Objet: Re: [External] : Re: ReversibleCollection proposal >>>>>>> On 5/1/21 5:57 AM, fo...@univ-mlv.fr wrote: >>>>>>>> I suppose the performance issue comes from the fact that >>>>>>>> traversing a >>>>>>>> LinkedHahSet is slow because it's a linked list ? >>>>>>>> >>>>>>>> You can replace a LinkedHashSet by a List + Set, the List >>>>>>>> keeps >>>>>>>> the values in >>>>>>>> order, the Set avoids duplicates. >>>>>>>> >>>>>>>> Using a Stream, it becomes >>>>>>>> Stream.of(getItemsFromSomeplace(), >>>>>>>> getItemsFromAnotherPlace(), >>>>>>>> getItemsFromSomeplaceElse()) >>>>>>>> .flatMap(List::stream) >>>>>>>> .distinct() // use a Set internally >>>>>>>> .collect(toList()); >>>>>>> The problem with any example is that simplifying assumptions >>>>>>> are >>>>>>> necessary for >>>>>>> showing the example, but those assumptions enable it to be >>>>>>> easily >>>>>>> picked apart. >>>>>>> Of >>>>>>> course, the example isn't just a particular example; it is a >>>>>>> *template* for a >>>>>>> whole >>>>>>> space of possible examples. Consider the possibility that the >>>>>>> items >>>>>>> processing >>>>>>> client needs to do some intermediate processing on the first >>>>>>> group >>>>>>> of items >>>>>>> before >>>>>>> adding the other groups. This might not be possible to do using >>>>>>> streams. Use >>>>>>> your >>>>>>> imagination. >>>>>>> >>>>>>>> I think there are maybe some scenarios where >>>>>>>> ReversibleCollection >>>>>>>> can be useful, >>>>>>>> but they are rare, to the point where when there is a good >>>>>>>> scenario for it >>>>>>>> people will not recognize it because ReversibleCollection >>>>>>>> will >>>>>>>> not be part of >>>>>>>> their muscle memory. >>>>>>> I'm certain that uses of RC/RS will be rare in the beginning, >>>>>>> because they will >>>>>>> be >>>>>>> new, and people won't be familar with them. And then there will >>>>>>> the >>>>>>> people who >>>>>>> say >>>>>>> "I can't use them because I'm stuck on JDK 11 / 8 / 7 / 6 ...." >>>>>>> It >>>>>>> was the same >>>>>>> thing with lambdas and streams in Java 8, with List.of() etc in >>>>>>> Java 9, records >>>>>>> in >>>>>>> Java 16, etc. This wasn't an argument not to add them, and it >>>>>>> isn't >>>>>>> an argument >>>>>>> not >>>>>>> to add RC/RS. >>>>>> All the changes you are listing here are "client side" changes, >>>>>> the >>>>>> ones that can be adopted quickly because they do not require to >>>>>> change the API side of any libraries. >>>>>> ReversibleCollection is an API side change, like generics is, >>>>>> those >>>>>> changes have a far higher cost because you have to wait your >>>>>> library >>>>>> dependencies to be updated. >>>>>> On the Valhalla list, we have discussed several times about how >>>>>> to >>>>>> alleviate those API side change cost using automatic bridging or >>>>>> methods forwarding, even for Valhalla, we are currently moving in >>>>>> a >>>>>> state where those mechanisms are not needed. >>>>>> >>>>>>>> There is a real value to add methods like >>>>>>>> descendingSet/descendingList()/getFirst/getLast on existing >>>>>>>> collections, but we >>>>>>>> don't need a new interface (or two) for that. >>>>>>> It depends on what you mean by "need". Sure, we could get away >>>>>>> without this; >>>>>>> after >>>>>>> all, we've survived the past twenty years without it, so we >>>>>>> could >>>>>>> probably >>>>>>> survive >>>>>>> the next twenty years as well. >>>>>>> >>>>>>> It would indeed be useful to add various methods to List, >>>>>>> Deque, >>>>>>> SortedSet, and >>>>>>> LinkedHashSet to provide a full set of methods on all of them. >>>>>>> And >>>>>>> it would also >>>>>>> be >>>>>>> nice to have those methods be similar to one another. An >>>>>>> interface >>>>>>> helps with >>>>>>> that, >>>>>>> but I agree, that's not really the reason to have an interface >>>>>>> though. >>>>>>> >>>>>>> The reversed-view concept could also be added individually to >>>>>>> the >>>>>>> different >>>>>>> places. >>>>>>> A reverse-ordered List is a List, a reverse-ordered Deque is a >>>>>>> Deque, a >>>>>>> reverse-ordered SortedSet is a SortedSet, and a reverse-ordered >>>>>>> LinkedHashSet is >>>>>>> a >>>>>>> ... ? And what is the type of the keySet of a LinkedHashMap, >>>>>>> that >>>>>>> enables one to >>>>>>> (say) get the last element? >>>>>> see below >>>>>> >>>>>>> After working with a system like this for a while, it begins to >>>>>>> emerge that >>>>>>> there is >>>>>>> an abstraction missing from the collections framework, >>>>>>> something >>>>>>> like an >>>>>>> "ordered >>>>>>> collection". People have been missing this for quite a long >>>>>>> time. >>>>>>> The most >>>>>>> recent >>>>>>> example (which this proposal builds on) is Tagir's proposal >>>>>>> from a >>>>>>> year ago. And >>>>>>> it's been asked about several times before that. >>>>>>> ReversibleCollection fills in >>>>>>> that >>>>>>> missing abstraction. >>>>>> The abstraction already exists but it's not defined in term of >>>>>> interface because it's an implementation decision and those are >>>>>> cleanly separated in the current Collection design. >>>>>> >>>>>> Let take a step back, the collection API defines basic data >>>>>> structure >>>>>> operations in term of interfaces like List, Deque, Set, etc those >>>>>> interfaces are decoupled from implementation capabilities like >>>>>> mutable, nullable, ordered and checked. >>>>>> >>>>>> Depending on the implementation capabilities, the interfaces >>>>>> method >>>>>> implementation may throw an exception, non-mutable >>>>>> implementations >>>>>> use UnsupportedOperationException, non-nullable implementations >>>>>> use >>>>>> NPE and checked implementations use CCE. >>>>>> >>>>>> So what is missing is methods on Collection interfaces that >>>>>> require >>>>>> the collection implementation to be ordered like >>>>>> descendingList(), >>>>>> getFirst(), etc. >>>>>> Those methods that may throw a specific exception if the >>>>>> implementation is not ordered, not UnsupportedOperationException >>>>>> but >>>>>> a new one like NotOrderedException. >>>>>> >>>>>> So to answer to your question about LinkedHashSet, the reverse- >>>>>> ordered LinkedHashSet is a Set with a method descendingSet() that >>>>>> do >>>>>> not throw NotOrderedException like any Set with an order. >>>>>> >>>>>> To summarize, if we introduce ReversibleCollection, we should >>>>>> also >>>>>> introduce ImmutableCollection, NonNullableCollection and >>>>>> CheckedCollection. >>>>>> I think it's better to consider the fact that being ordered as a >>>>>> capability (hint: this is already what the Spliterator API does) >>>>>> and >>>>>> not as a specific interface. >>>>>> >>>>>>> s'marks > >>>>> Rémi