Hi Remi,

I see you avoided addFirst/addLast methods. While getFirst/removeFirst could be specified to be consistent with iterator().next(), so the "first" part of the name would be softer, addFirst is cursed two times: it has no equivalent in current API and it clashes with void returning method in Deque. So the best "soft" alternative seems to be the existing Collection.add(E) method which is equivalent to addFirst() when Collection is ordered or something totally different when it is sorted (i.e. SortedSet) and again different when it is neither.

To avoid bugs when an API expects an ordered Collection but has no static type to express that, a means to perform a runtime check would be needed. Marker interface is a way to add a bit of static typing to be used in generic methods like:


<E, C extends Collectiom<E> & Ordered> void doSomething(C orderedCollection);


...but that's very limited. So probably not worth it. Stream API choose a different approach: Spliterator.characteristics(). So probably, Collection.characteristics() could also be added. The set of bit constants in Spliterator is applicable also to Collection (except SUBSIZED which has no sense in Collection, while SIZED is always present):

     ORDERED
     DISTINCT
     SORTED
     SIZED
     NONNULL
     IMMUTABLE
     CONCURRENT

Specifically for Collection I would also add:

    REVERSIBLE

...to indicate that Collection.reversed() would succeed.


The question is what would the default value of Collection.characteristics() be. The most sensible would be SIZED with overrides in:

- List (SIZED | ORDERED | REVERSIBLE)
- Set (SIZED | DISTINCT)
- SortedSet (SIZED | SORTED | REVERSIBLE | DISTINCT)
- LinkedHashSet (SIZED | ORDERED | REVERSIBLE | DISTINCT)
- Queue (SIZED | ORDERED)
- Deque (SIZED | ORDERED | REVERSIBLE)

... Immutable JDK List and Set implementations could also add IMMUTABLE | NONNULL, while concurrent JDK implementations would add CONCURRENT | NONNULL.


...but one could also base default value on something like this:


default int characteristics() {
    return stream().spliterator().characteristics() & ~Spliterator.SUBSIZED;
}



Wrappers like Collections.unmodifiableXXX() would just delegate to the underlying Collection and augment the returned bits (with IMMUTABLE for example).


An API that expects a particular kind of collection could then check that at runtime. Some of existing collections would have to be updated to express the correct characteristics, but defaults in interfaces would be sufficient for most and conservative.


WDYT?

Peter


On 5/11/21 9:26 AM, fo...@univ-mlv.fr wrote:

Are you proposing to just add methods to the root Collection interface,
because it already has methods which may or may not be implemented or
throw Exceptions by its implementations ?
yes, the idea is to add getFirst/getLast/removeFirst/removeLast and reversed on 
Collection as default methods.
Something like

interface Collection<E> {
   /**
    * throws NoSuchElementException if the collection is empty
    */
   public default E getFirst() {
     return iterator().next();
   }

   /**
    * throws NoSuchElementException if the collection is empty
    * throws UnsupportedOperationException if the collection is non mutable
    */
   public default E removeFirst() {
     var it = iterator();
     var element = it.next();
     it.remove();
     return element;
   }

   /**
    * throws NonOrderedCollectionException if the collection is non ordered
    */
   public default Collection<E> reversed() {
     throw new NonOrderedCollectionException();
   }

   /**
    * throws NoSuchElementException if the collection is empty
    * throws NonOrderedCollectionException if the collection is non ordered
    */
   public default E getLast() {
     return reversed().getFirst();
   }

   /**
    * throws NoSuchElementException if the collection is empty
    * throws UnsupportedOperationException if the collection is non mutable
    * throws NonOrderedCollectionException if the collection is non ordered
    */
   public default E removeLast() {
     return reversed().removeFirst();
   }
}


And adds an implementation of reversed(), getLast() and removeLast() on 
NavigableSet/LinkedHashSet/EnumSet/Deque and List.

And do the same thing for Map:

interface Map<K, V> {
   /**
    * throws NoSuchElementException if the map is empty
    */
   public default Map.Entry<K,V> getFirstEntry() {
     return entrySet().iterator().next();
   }

   /**
    * throws NoSuchElementException if the map is empty
    * throws UnsupportedOperationException if the map is non mutable
    */
   public default  Map.Entry<K,V> removeFirstEntry() {
     var it = entrySet().iterator();
     var element = it.next();
     it.remove();
     return element;
   }

   /**
    * throws NonOrderedCollectionException if the map.keySet() is non ordered
    */
   public default Map<K,V> reversedMap() {
     throw new NonOrderedCollectionException();
   }

   /**
    * throws NoSuchElementException if the map is empty
    * throws NonOrderedCollectionException if the map.keySet() is non ordered
    */
   public default E getLastEntry() {
     return reversedMap().getFirstEntry();
   }

   /**
    * throws NoSuchElementException if the map is empty
    * throws UnsupportedOperationException if the map is non mutable
    * throws NonOrderedCollectionException if the map.keySet() is non ordered
    */
   public default E removeLast() {
     return reversedMap().removeFirstEntry();
   }
}

And adds an implementation of reversedMap(), getLastEntry() and 
removeLastEntry() on NavigableMap/LinkedHasMap and EnumMap.


But we could also view "having (possibly duplicate) items in a
sequence" as a capability and that is defined by an actual interface,
List.
It already exist for a Spliterator, the capability is called DISTINCT, by 
example, set.stream() and list.stream().distinct() are both Streams with no 
duplicate.

I do fear that if we add methods to the Collection interface, users
might not expect them to throw exceptions. E.g. if we add getFirst(),
users may just expect it to return the element that iterator().next()
or .stream().findFirst().orElseThrow() would return instead of an
UnsupportedOperationException. And while HashSet does not have a
defined order, it would in practice always return the same element.
I agree, getFirst() and removeFirst() should have their semantics based on 
iterator (see above),
because as you said, it's already how findFirst() is specified.

So while I accept that some hidden capabilities are surfaced only
through return values of certain methods, I think an interface is still
a better choice here. However, if we go that route, I also agree on the
fact that we might need to explore defining proper interfaces for the
other capabilities at some point.

Kind regards,

Dave
regards,
Rémi

On Mon, 2021-05-10 at 12:22 +0200, fo...@univ-mlv.fr wrote:
----- Mail original -----
De: "dfranken jdk" <dfranken....@gmail.com>
À: "Remi Forax" <fo...@univ-mlv.fr>, "Stuart Marks"
<stuart.ma...@oracle.com>
Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
Envoyé: Dimanche 9 Mai 2021 12:13:58
Objet: Re: [External] : Re: ReversibleCollection proposal
When I thought about Collection's role in the hierarchy, it seemed
to
me that 'Collection' is an interface for describing how elements
are
stored in a cardboard box (we can and and remove them) and that we
might need a different, yet related, interface to describe how to
retrieve the items from the box. This way we are not tied to the
Collection hierarchy and we could have one Set implementation which
is
ordered and another Set implementation which is not and they would
both
still implement Collection, but only one would implement the
interface.
So you want to model ReversibleSet as Set + Reversible,
Reversible being an interface with a small number of method specific
to the fact that the implementation is reversible.

This does not work well for two reasons,
- there is no syntax to represent the union of types in Java,
    Set & Reversible set = ...
   is not a valid syntax. You can use a type variable with several
bounds instead but it does work nicely with the rest of the language
(overidding, overloading, etc).

- having public methods that takes an interface with few methods is a
common design error.
   Let suppose you have a method foo that only prints the elements of
a collection, in that case you may want to type the first parameter
as Iterable or Collection.
   But requirements change an now you want to prints the elements
sorted, oops, you have to change the signature of the public method
which may be something not possible
   depending how many "clients" this method has.
   Providing interfaces with a small number of access methods will
lead to this kind of issue.

Imagine an interface like 'java.lang.OrderedEnumerable` if you will
with methods such as 'first(), last(), etc', then each
implementation
would be free to choose to implement the interface or not. I also
thought about 'OrderedIterable', but there would be too much
confusion
with 'Iterable', but I think these are related too. Retrieving
items is
an iteration problem, not a storage problem.
The problem is that is you multiply the number of interfaces to
access the elements, you add the dilemma of choice in the mix.
The first iteration of the Scala Collection were like this, too many
interfaces, at least for my taste.

While I would love to see the Collection hierarchy redesigned to
also
allow for ImmutableCollection which for instance would not have an
`add(T e)` method, I don't think we can simply do something like
that
without breaking too many things.
Dear Santa, i want an interface BuilderCollection that only allows
add() but no remove()/clear(), because if you can not remove
elements, then all the iterators can implement the snapshot at the
beginning semantics,
so no ConcurrentModificationException anymore. For me, being able to
snapshot/freeze a collection is better than an ImmutableCollection,
because it can be immutable when you want.

Anyway, it's not gonna to happen, there is so many ways to slice an
onion and each have pros and cons and providing all the possible
interfaces is a good.way to make something simple complex.

So in short, separating retrieval aspects from storage aspects with
a
different interface might be the way to go.

Kind regards,

Dave Franken

regards,
Rémi

On Wed, 2021-05-05 at 12:41 +0200, fo...@univ-mlv.fr wrote:
----- Mail original -----
De: "Stuart Marks" <stuart.ma...@oracle.com>
À: "Remi Forax" <fo...@univ-mlv.fr>
Cc: "core-libs-dev" <core-libs-dev@openjdk.java.net>
Envoyé: Mercredi 5 Mai 2021 02:00:03
Objet: Re: [External] : Re: ReversibleCollection proposal
On 5/1/21 5:57 AM, fo...@univ-mlv.fr wrote:
I suppose the performance issue comes from the fact that
traversing a
LinkedHahSet is slow because it's a linked list ?

You can replace a LinkedHashSet by a List + Set, the List
keeps
the values in
order, the Set avoids duplicates.

Using a Stream, it becomes
    Stream.of(getItemsFromSomeplace(),
getItemsFromAnotherPlace(),
    getItemsFromSomeplaceElse())
     .flatMap(List::stream)
     .distinct()              // use a Set internally
     .collect(toList());
The problem with any example is that simplifying assumptions
are
necessary for
showing the example, but those assumptions enable it to be
easily
picked apart.
Of
course, the example isn't just a particular example; it is a
*template* for a
whole
space of possible examples. Consider the possibility that the
items
processing
client needs to do some intermediate processing on the first
group
of items
before
adding the other groups. This might not be possible to do using
streams. Use
your
imagination.

I think there are maybe some scenarios where
ReversibleCollection
can be useful,
but they are rare, to the point where when there is a good
scenario for it
people will not recognize it because ReversibleCollection
will
not be part of
their muscle memory.
I'm certain that uses of RC/RS will be rare in the beginning,
because they will
be
new, and people won't be familar with them. And then there will
the
people who
say
"I can't use them because I'm stuck on JDK 11 / 8 / 7 / 6 ...."
It
was the same
thing with lambdas and streams in Java 8, with List.of() etc in
Java 9, records
in
Java 16, etc. This wasn't an argument not to add them, and it
isn't
an argument
not
to add RC/RS.
All the changes you are listing here are "client side" changes,
the
ones that can be adopted quickly because they do not require to
change the API side of any libraries.
ReversibleCollection is an API side change, like generics is,
those
changes have a far higher cost because you have to wait your
library
dependencies to be updated.
On the Valhalla list, we have discussed several times about how
to
alleviate those API side change cost using automatic bridging or
methods forwarding, even for Valhalla, we are currently moving in
a
state where those mechanisms are not needed.

There is a real value to add methods like
descendingSet/descendingList()/getFirst/getLast on existing
collections, but we
don't need a new interface (or two) for that.
It depends on what you mean by "need". Sure, we could get away
without this;
after
all, we've survived the past twenty years without it, so we
could
probably
survive
the next twenty years as well.

It would indeed be useful to add various methods to List,
Deque,
SortedSet, and
LinkedHashSet to provide a full set of methods on all of them.
And
it would also
be
nice to have those methods be similar to one another. An
interface
helps with
that,
but I agree, that's not really the reason to have an interface
though.

The reversed-view concept could also be added individually to
the
different
places.
A reverse-ordered List is a List, a reverse-ordered Deque is a
Deque, a
reverse-ordered SortedSet is a SortedSet, and a reverse-ordered
LinkedHashSet is
a
... ? And what is the type of the keySet of a LinkedHashMap,
that
enables one to
(say) get the last element?
see below

After working with a system like this for a while, it begins to
emerge that
there is
an abstraction missing from the collections framework,
something
like an
"ordered
collection". People have been missing this for quite a long
time.
The most
recent
example (which this proposal builds on) is Tagir's proposal
from a
year ago. And
it's been asked about several times before that.
ReversibleCollection fills in
that
missing abstraction.
The abstraction already exists but it's not defined in term of
interface because it's an implementation decision and those are
cleanly separated in the current Collection design.

Let take a step back, the collection API defines basic data
structure
operations in term of interfaces like List, Deque, Set, etc those
interfaces are decoupled from implementation capabilities like
mutable, nullable, ordered and checked.

Depending on the implementation capabilities, the interfaces
method
implementation may throw an exception, non-mutable
implementations
use UnsupportedOperationException, non-nullable implementations
use
NPE and checked implementations use CCE.

So what is missing is methods on Collection interfaces that
require
the collection implementation to be ordered like
descendingList(),
getFirst(), etc.
Those methods that may throw a specific exception if the
implementation is not ordered, not UnsupportedOperationException
but
a new one like NotOrderedException.

So to answer to your question about LinkedHashSet, the reverse-
ordered LinkedHashSet is a Set with a method descendingSet() that
do
not throw NotOrderedException like any Set with an order.

To summarize, if we introduce ReversibleCollection, we should
also
introduce ImmutableCollection, NonNullableCollection and
CheckedCollection.
I think it's better to consider the fact that being ordered as a
capability (hint: this is already what the Spliterator API does)
and
not as a specific interface.

s'marks
Rémi

Reply via email to