Hello core-lib-dev, I think it is a mistake to make `AbstractSet#removeAll` use the membership semantics of the argument collection instead of those of the receiver set, and to iterate over the receiver instead of the argument collection.
I think that there are much stronger arguments for doing it the other way around: to use the membership semantics of the receiver and to iterate over the argument. This is what Stuart suggest in his first prototype [1], but unlike the how it looks in the current webrev [2]. After having read through the original thread (most importantly this message: [3]) and the review threads, it is my impression that the disadvantages of this solution have been over-emphasised and the advantages under-emphasised. This is my impression of the advantages and disadvantages of making `removeAll` use the membership semantics of the receiver set: ## Advantages 1. It leads to the expected behaviour when custom membership semantics are used. 2. It preserves the behaviour in the common case when custom membership semantics are used. 3. It has better performance for hashed sets in the common case. ## Disadvantages 4. It is different from `AbstractCollection`. 5. It is not possible to do in `retainAll`, and thus make the two methods inconsistent. (See [3].) I don't think 1-2 has been given enough weight in the discussion, and that 4-5 have been given too much. ## Details 1. I think that often `removeAll` is called without much tough about the properties of the argument collection, except for its members. It will be confusing, annoying and error prone if code like this doesn't work as expected: var s = new TreeSet<>(String::compareToIgnoreCase); s.addAll(List.of("a", "b", "c"); // Should remove "a" and "b" s.removeAll(List.of("A", "B")); Non-set collections on the other hand are probably much less likely to have custom membership semantics. Because of this they can get away with using the membership semantics of the argument, without exhibiting unexpected behaviour for code that is similar to the above. 2. The consequences are even more sever if code like the above stops working as expected. This will probably cause some code that has been working for years (decades?) to suddenly fails. * I think that it is much more common to use `removeAll` to remove a small number of elements from a bigger set. If the membership semantics of the argument are to be used in that case, then the behaviour of existing code will change. If on the other hand the membership semantics of the receiver are to be used, then existing code will continue to behave in the same way. * I think that existing code that works in the opposite way, that depends on the custom membership semantics of the argument of `removeAll`, is much more uncommon, and thus that it is more acceptable that such code stops working. 3. Alan Snyder wrote about performance at length in his mails. This is an important point, even if the arguments above are more important. For example, some applications with large HashSet-based caches might stop working in a satisfactory way. 4. I think that consistency with `AbstractCollection#removeAll` has been given too much importance in the discussion. Sets are different from lists, and it seem justified that the implementation of `removeAll` in AbstractSet works in a different way from `AbstractCollection`. * Collections that allow duplicates are forced to iterate `this` out of necessity. * But sets have no duplicates. It is natural take advantage of this fact and iterate the argument. It seems unlikely that this would be unexpected for anyone or lead to problems. * Non-set collections are probably much less likely to have custom equality semantics, which makes iterating over the argument less problematic. 5. The difference in behaviour between `retainAll` and `removeAll` that is the result of using the membership semantics of the receiver is, I think, the most serious disadvantage of this solution. This disadvantage is not as strong as the advantages in 1-2 above, however. Also, I think that there is some justification to it. I think `retainAll` is perceived much more as a set theoretic operation than `removeAll`. This makes it more intuitive for a user that you should ensure that the argument to that method is a collection with the right membership semantics. ### Conclusion I think the following original suggestion [1] is a much better choice for an implementation of `AbstractSet#removeAll` than the existing implementation in `AbstractCollection`: public boolean removeAll(Collection<?> c) { Objects.requireNonNull(c); boolean modified = false; for (Object e : c) modified |= remove(e); return modified; } [1]: https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/058172.html [2]: http://cr.openjdk.java.net/~smarks/reviews/6394757/webrev.1/ [3]: https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-February/058539.html BR, Jens Lideström Proud new Eclipse committer