Hi Stuart,

It seems we have one major disagreement. You believe the collections framework 
specification is inconsistent. I believe it is consistent. (We agree that there 
is room for improvement.) That disagreement is making communication more 
difficult.

Based on your reply, I assume the disagreement revolves around this fact:

Using public classes provided by the framework, it is possible to:

* Create an object that is an instance of Set but does not support the general 
contract of Set.

* Create an object that is an instance of Map but does not support the general 
contract of Map.

I quoted the relevant paragraphs from the specifications of these classes in my 
previous message. These paragraphs make it clear that the area where these 
objects fail to support the general contract of their interfaces is that they 
define membership in a way that is inconsistent with equals() and hashCode(). 
The specification goes to some length to explain that membership is based on 
equals() and hashCode() and that optimizations based on that dependence are 
permitted.

Obviously, there is a conflict here, but I would not say that the specification 
itself is inconsistent (which would imply that the specification needs to be 
“fixed”).

Instead, I see this as a pragmatic decision (one might say “kludge”, but that 
is not the point) by the designers of the framework to allow these concrete 
classes to be used both to create objects that conform (semantically) to their 
interfaces and to create objects that do not (fully) conform to their 
interfaces. The designers could have instead said that the non-conforming uses 
are erroneous and not supported, which would have been a cleaner solution. 
Instead, they documented the potential non-conforming cases.

I presume that the non-conforming cases are blessed (not rejected) because they 
are useful (the classes are described as being well defined either way) and 
possibly also because conformance is not generally testable.

Having an object that does not conform (semantically) to its interface is an 
anomaly. It introduces the possibility of unexpected behavior if one does not 
fully understand the divergence between the object’s behavior and the interface 
specification and the potential impact of that divergence. This is the sense in 
which these objects are “second class” or “ersatz”. It is not an overstatement 
or a new idea, it is the plain reading of the current specification.

If you pass one of these objects as a parameter to a method that declares the 
parameter to be of the interface type, then that method is free to assume that 
object satisfies the (general contract of the) interface. If the object does 
not, the method may not behave as expected. If there is a bug, it is in the 
caller, not the method.

removeAll() is an example of such a method. As its parameter is of type 
Collection, the framework specification implies that it is a correct 
implementation to iterate over the parameter, call hashCode() on the elements 
returned by the iterator, and call equals() only on the target elements with a 
matching hash, which is what removeAll() currently may do when the target is a 
HashSet. (The fact that removeAll() uses the sizes of the collections to choose 
an implementation may be troubling, but that does not imply that either of the 
two implementations are incorrect. I believe they are both correct. Using your 
terms, I believe that removeAll(), viewed as a binary operation, is symmetric 
or reversible in the current specification.)

Does this disagreement over consistency of the specification matter? Perhaps 
not in practical terms. Even if the current specification is consistent, it is 
too easy to make mistakes using the above classes. A developer must read the 
specification carefully and make logical deductions to understand why 
removeAll() might behave in an unexpected way. That is not a good thing. In 
addition, whether or not the disputed implementation of removalAll() is 
strictly a bug, your proposed change to removeAll() is still an incompatible 
change, both in specification and performance, and needs to be evaluated 
relative to the benefits of the change. (I assume a CSR is forthcoming to 
initiate this evaluation formally.)

By the way, I would describe the benefit of your change in practical terms as 
reducing the probability of programming errors. Improving consistency is vague; 
it is also problematic if one does not believe that the specification is 
currently inconsistent.

 Alan



> On May 20, 2019, at 3:57 PM, Stuart Marks <stuart.ma...@oracle.com> wrote:
> 
> On 5/16/19 10:06 PM, Alan Snyder wrote:
>> Could you explain the inconsistency in the specification that affects 
>> removeAll()? I don’t see it.
> 
> It's the assumption that the operation can be reversed without changing its 
> semantics. This isn't true, given the existence of SortedSet et. al. This is 
> the point of this bug report. The inconsistency exists in the collections 
> framework specifications taken as a whole, not removeAll() in isolation.
> 
>> I believe I already accepted the fact of ersatz Collections.
> 
> Of course I don't know your state of mind, but it it doesn't sound to me like 
> you've accepted them. You're calling them "ersatz" collections, which sound 
> to me like "fake" collections. You're carving them out of "real" collections 
> and treating them like second-class citizens. I would have thought that was 
> an overstatement, until I read this:
> 
>> Promoting ersatz Collections to first class citizens would be a significant 
>> incompatible change...
> 
> My goal here is to reconcile the various parts of the collections framework 
> specifications so that they are more self-consistent as a whole. Treating 
> certain pieces as second-class citizens makes the problem worse, not better.
> 
>> What I would most like to preserve is the performance advantages of using 
>> hash coding, which apparently was a goal of the original design:
>> "implementations of the various Collections Framework interfaces are free to 
>> take advantage of the specified behavior of underlying {@link Object} 
>> methods wherever the implementor deems it appropriate"
>> Promoting ersatz Collections to first class citizens would be a significant 
>> incompatible change to the specification because it invalidates this 
>> statement.
> 
> My changes don't affect the relationship of equals() and hashCode() in the 
> slightest.
> 
> s'marks
> 

Reply via email to