------------------------------------------------------------------------------------
*From: *"Tagir Valeev" <amae...@gmail.com>
*To: *"Stuart Marks" <stuart.ma...@oracle.com>
*Cc: *"Remi Forax" <fo...@univ-mlv.fr>, "core-libs-dev"
<core-libs-dev@openjdk.java.net>
*Sent: *Saturday, February 12, 2022 4:24:24 AM
*Subject: *Re: [External] : Sequenced Collections
Wow, I missed that the Sequenced Collections JEP draft was posted!
Of course, I strongly support this initiative and am happy that my proposal
got
some love and is moving forward. In general, I like the JEP in the way it
is. I
have only two slight concerns:
1. I'm not sure that having addition methods (addFirst, addLast, putFirst,
putLast) is a good idea, as not every mutable implementation may support
them.
While this adds some API unification, it's quite a rare case when this
could be
necessary. I think most real world applications of Sequenced* types would be
around querying, or maybe draining (so removal operations are ok). Probably
it
would be enough to add addFirst, addLast, putFirst, putLast directly to the
compatible implementations/subinterfaces like List, LinkedHashSet, and
LinkedHashMap removing them from the Sequenced* interfaces. In this case,
SortedSet interface will not be polluted with operations that can never be
implemented. Well my opinion is not very strong here.
2. SequencedCollection name is a little bit too long. I think every extra
letter
adds a hesitation for users to use the type, especially in APIs where it
could
be the most useful. I see the Naming section and must admit that I don't
have
better ideas. Well, maybe just Sequenced would work? Or too vague?
Speaking of Remi's suggestion, I don't think it's a good idea. Maybe it
could be
if we designed the Collection API from scratch.
??
Here is the first sentence of the javadoc for java.util.List "An ordered collection
(also known as a /sequence/)."
And the first paragraph of java.util.RandomAccess "Marker interface used by |List|
implementations to indicate that they support fast (generally constant time) random
access. The primary purpose of this interface is to allow generic algorithms to
alter their behavior to provide good performance when applied to either random or
sequential access lists"
You can find that the actual design, mixing ordered collection and indexed
collection into one interface named List not great, but you can not say that this is
not the actual design.
But given the current state of Java collections, it's better to add new
interfaces than to put the new semantics to the java.util.List and greatly
increase the amount of non-random-accessed lists in the wild.
There are tons of code that implicitly assume fast random access of every
incoming list (using indexed iteration inside). The suggested approach could
become a performance disaster.
If you take several Java developers, some will stick to the javadoc definition, a
List is either sequential or random access and some will think that a List is only
random access. Because of that, adding more sequential implementations under the
List interface is an issue.
Introducing SequencesCollection (more on the name later), a super interface of List
solves that issue, the new implementations will only implement the sequential part
of interface List.
But it does not solve the other problems, mainly adding 4 interfaces when one is
enough, not being backward compatible because of inference and the weird semantics
of LinkedHashMap.
We still need SortedSet or LinkedHashSet to not directly implement
SequencesCollection but to use delegation and a have a method returning a view. The
same reasoning applied to SortedMap, LinkedHashMap.
By using views, there is no need to the two other proposed interfaces SequenceSet
and SequenceMap.
Another question is ListIterator, a list can be iterated forward and backward, a
SequenceCollection can do almost the same thing, with iterator() and
reversed().iterator(). It's not exactly the same semantics but i don't think it
exist an implementation of SequenceCollection that can be implemented without the
property that given one element, it's predecessor and successor can be found in O(1).
Do we need a new SequenceCollectionIterator that provides the method
next/hasNext/previous/hasPrevious but not add/set/nextIndex/previousIndex ?
For the name, Java uses single simple name of one syllable for the important
interface List, Set, Map or Deque (the javadoc of Deque insist that Deque should be
pronounced using one syllable).
So the name should be Seq.
The main issue with the name "Seq" is that it is perhaps too close to the name
"Set".
Also, it can not be "Sequence" because of CharSequence.
interface Seq<E> extends Collection<E> {
void addFirst();
void addLast();
E getFirst();
E getLast();
E removeFirst();
E removeLast();
Seq<E> reversed();
}
interface List<E> extends Seq<E> { }
interface SortedSet<E> implements Set<E> { // or NavigableSet
// new methods
Seq<E> asSeq();
}
interface SortedMap<K,V> implements Map<K,V> { // or NavigableMap
// new methods
Seq<K> keySeq(); // do not use covariant return type
Seq<V> valueSeq();
Seq<Map.Entry<K,V>> entrySeq();
}
I'm still not sure that introducing an interface like Seq instead of using List is
the way to go.
If we do that, there will be a lot of blog post/bikeshedding about when to use List
vs Seq and a lot of github issues about taking a Seq instead of a List as parameter
of a method of a library.
With best regards,
Tagir Valeev.
Rémi
On Sat, Feb 12, 2022 at 2:26 AM Stuart Marks <stuart.ma...@oracle.com
<mailto:stuart.ma...@oracle.com>> wrote:
Hi Rémi,
I see that you're trying to reduce the number of interfaces introduced
by
unifying
things around an existing interface, List. Yes, it's true that List is
an
ordered
collection. However, your analysis conveniently omits other facts about
List
that
make it unsuitable as a general "ordered collection" interface.
Specifically:
1) List supports element access by int index; and
2) List is externally ordered. That is, its ordering is determined by a
succession
of API calls, irrespective of element values. This is in contrast to
SortedSet et al
which are internally ordered, in that the ordering is determined by the
element values.
The problem with indexed element access is that it creates a bunch of
hidden
performance pitfalls for any data structure where element access is
other
than O(1).
So get(i) degrades to O(n), binarySearch degrades from O(log n) to O(n).
(This is in
the sequential implementation; the random access implementation
degrades to
O(n log
n)). Apparently innocuous indexed for-loops degrade to quadratic. This
is
one of the
reasons why LinkedList is a bad List implementation.
If we refactor LinkedHashSet to implement List, we basically have
created
another
situation just like LinkedList. That's a step in the wrong direction.
Turning to internal ordering (SortedSet): it's fundamentally
incompatible with
List's external ordering. List has a lot of positional mutation
operations
such as
add(i, obj); after this call, you expect obj to appear at position i.
That
can't
work with a SortedSet.
There is implicit positioning semantics in other methods that don't
have index
arguments. For example, replaceAll replaces each element of a List with
the
result
of calling a function on that element. Crucially, the function result
goes
into the
same location as the original element. That to cannot work with
SortedSet.
Well, we can try to deal with these issues somehow, like making certain
methods
throw UnsupportedOperationException, or by relaxing the semantics of the
methods so
that they no longer have the same element positioning semantics. Either
of
these
approaches contorts the List interface to such an extent that it's no
longer
a List.
So, no, it's not useful or effective to try to make List be the common
"ordered
collection" interface.
s'marks
On 2/10/22 3:14 AM, Remi Forax wrote:
> I've read the draft of the JEP on sequenced collection, and i think
the
proposed design can be improved.
> https://bugs.openjdk.java.net/browse/JDK-8280836
<https://bugs.openjdk.java.net/browse/JDK-8280836>
>
> I agree with the motivation, there is a need for an API to consider
the
element of a list, a sorted set and a linked hash set as an ordered
sequence
of elements with a simple way to access/add/remove the first/last
element
and also reverse the elements as view.
>
> I disagree about the conclusion that we need to introduce 4 new
interfaces for that matter.
>
> Here are the reasons
> 1/ Usually an ordered collection is called a list. Introducing an
interface SequencedCollection for something which is usually called a
list
will cause more harm than good. Or maybe we should rename LISP to SEQP
:)
>
> 2/ There is already an interface List in Java, that represents an
ordered
sequence of elements, with LinkedList being the name of the the double
linked list implementation. You can argue that there is a slight
difference
between the semantics of java.util.List and the proposed syntax of
java.util.SequencedCollection, but given that people already have
difficulties to understand basic data structure concepts, as a teacher i
dread to have a discussion on those slight differences that are only
true in
Java.
>
> If the collection API was not already existing, we may discuss about
having the same interface java.util.List to both indexed collection and
ordered collection, but that boat has sailed a long time ago.
>
> So in first approach, we should refactor sorted set and linked hash
set
to directly implement java.util.List and all the proposed methods into
java.util.List. But as you hint in the Risks and Assumptions section,
this
will cause regression due to inference and also we will have trouble
with
LinkedHashMap (see below).
>
> 3/ LinkedHashMap mixes 3 implementations in one class, some of these
implementations does not conform to the semantics of SequencedMap.
> - You can opt-out having the key sequentially ordered as defined by
SequencedMap by using the constructor LinkedHashMap(int initialCapacity,
float loadFactor, boolean accessOrder) and passing true as last
parameter.
> - You can opt-out having the key sequentially ordered as defined by
SequencedMap by overriding removeEldestEntry(), removing the first
entry at
the same time you add a new one.
>
> Because all these reasons, i think we should move to another design,
using delegation instead of inheritance, which for the collection
framework
means exposing new way to access/modify sorted set and linked hash set
through java.util.List views.
>
> The concept of views is not a new concept, it's used in
Arrays.asList(),
List.subList() or Map.keySet()/values()/entrySet() (and more). The idea
is
not that a sorted set is a list but that it provides a method to see it
as a
list. It solves our problem of compatibility by not adding super types
to
existing type and also the problem of the semantics of LinkedHashMap
because
a view keeps the semantics of the data structure it originated.
>
> Here is the proposed new methods in List, SortedSet and SortedMap.
>
> interface List<E> extends Collection<E> {
> // new methods
> void addFirst();
> void addLast();
> E getFirst();
> E getLast();
> E removeFirst();
> E removeLast();
> List<E> reversedList(); // or descendingList() ??
> }
>
> interface SortedSet<E> implements Set<E> {
> // new methods
> List<E> asList();
> }
>
> interface SortedMap<K,V> implements Map<K,V> {
> // new methods
> List<K> keyList(); // do not use covariant return type
> List<Map.Entry<K,V>> entryList(); // same
> }
>
> I believe this design is objectively better than the one proposed
because
as a user being able to use new interfaces is a slow process, the
libraries/dependencies must be updated to take the new interfaces as
parameter before the new types can be used. By contrast, the proposed
design
only enhance existing interfaces so people will enjoy the new methods
directly when introduced.
>
> Rémi
>