> On Oct 19, 2017, at 4:29 PM, Xiaodi Wu via swift-dev <swift-dev@swift.org> wrote:
> 
> Differing behavior in generic and concrete contexts is simply too subtle to 
> be understandable to the reader.

Hardly more subtle than the current “Equatable works like this, with these 
strong guarantees. Oh, except in some cases it doesn’t, in which case 
¯\_(ツ)_/¯”

I realize that Comparable admits an exception for FP. This is, IMO, a serious 
mistake and needs to be reversed. Equatable has no such exception and rightly 
so.

The clearest demonstrations of how flawed this approach is can be found in the 
Standard Library. You can throw a brick at it and hit an example of something 
that’s broken by the presence of .nan: random sort orders, == implementations 
that differ based on the identity of the buffer, arbitrary duplication in 
Set/Dictionary, etc.
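
To make that concrete (a quick sketch – the == and Set behavior below follows 
directly from .nan != .nan; the exact sort output depends on the algorithm):

    let values: [Double] = [2.0, .nan, 1.0]

    Double.nan == Double.nan     // false – reflexivity fails
    values.index(of: .nan)       // nil, even though .nan is right there
    values.contains(.nan)        // false, for the same reason

    var set = Set<Double>()
    set.insert(.nan)
    set.insert(.nan)
    set.insert(.nan)
    set.count                    // 3 – “duplicate” NaNs accumulate

    [1.0, .nan, 2.0].sorted()    // some permutation; .nan answers false to
                                 // every < comparison it takes part in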

The important point to take from this is not “how do we fix the Standard 
Library?” but rather “these errors are easy to make” by anyone writing generic 
code using standard protocols. If the Standard Library can’t get these right, 
how can we expect others to? There are potentially far worse bugs that could 
result. A differently-written sorting algorithm could corrupt elements (because 
it relied on substitutability). Other sorting or searching algorithms could 
easily go into an infinite loop. These problems exist because the code relies 
on the documented behavior of the protocol. And it should be able to, because 
if it can’t, then what is the point of documenting that behavior?
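
A sketch of the kind of code I mean (the helper is invented for illustration, 
not from the Standard Library – it leans on nothing beyond Equatable’s 
documented semantics, and quietly gives the wrong answer for NaN):

    func firstDuplicateIndex<C: Collection>(_ c: C) -> C.Index?
        where C.Element: Equatable {
        var i = c.startIndex
        while i != c.endIndex {
            if c[c.index(after: i)...].contains(c[i]) { return i }
            i = c.index(after: i)
        }
        return nil
    }

    firstDuplicateIndex([3, 1, 4, 1, 5])      // 1 – finds the repeated 1
    firstDuplicateIndex([1.0, .nan, .nan])    // nil – the NaNs “aren’t duplicates”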

I don’t support solutions such as adding a property indicating 
“containsExceptionalValues” (whatever that means), and expecting every author 
of a generic algorithm that uses Equatable to remember to call it, and craft 
custom paranoid behavior (if there is any reasonable behavior) based on it. 
With recursive conformance landed on master, we finally have a generics system 
where writing algorithms against Collection can be considered approachable by 
ordinary users. You no longer have to know things like how 
Collection.SubSequence needs to be constrained to also be a Collection – it 
just is. We would be negating this good work to now introduce a whole new set 
of gotchas that we expect people to know (without the type system even helping 
them in this case) about how some types, including standard library types, 
flout the documented rules for Equatable and Comparable, and that you need to 
use one of a handful of properties to hack in special cases to handle it.

And as a user of algorithms, what should you do? If a generic algorithm doesn’t 
document how it handles these special cases, should you assume it doesn’t? 
Check the code? Experiment to find out?

This problem also spreads, virus-like, once we have conditional conformance 
that makes containers equatable when their elements are. [Double] would need to 
propagate its elements’ “exceptionality”, to avoid problems with [[Double]]. 
Double? will have to do the same.
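
Concretely (with those conditional conformances in place, Array and Optional 
just forward == to their elements, so the non-reflexivity composes all the way 
up):

    let xs: [Double] = [1.0, .nan]

    xs == xs                                      // false – not even equal to itself
    [xs] == [xs]                                  // false – [[Double]], same story
    Optional(Double.nan) == Optional(Double.nan)  // false – Double? too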

> The explanation that a method on `Float` is a "floating-point context" but a 
> method on `[Float]` is *not a "floating point context"* is, IMO, indefensible.

Nevertheless, I will attempt to defend it :)  

I find it odd that violating the documented requirements of a protocol is 
considered defensible, but expecting types to comply with those requirements is 
indefensible. A principled stance would be to say that Float shouldn’t conform 
to Equatable (because… it doesn’t!) and to require all callers to supply a 
predicate (and maybe amend types like Dictionary to allow you to supply 
one). That won’t fly though – users would complain – so instead we are in this 
murky ground.

Later in the thread, you mention a possible fix for sort:

> `sort()` is problematic, but not if a custom predicate is supplied.


So, we are potentially trading off one subtlety (that < behaves differently in 
generic and non-generic contexts) for another (that you need to know that you 
need to pass in a special predicate for sorting, or you get nonsense results). 
Knowing when an algorithm requires you to supply a predicate (like sort) versus 
when handling of the special case is built in (as with equality) seems a far 
worse complication to me than knowing one rule: that when constrained 
generically to Comparable, Float adheres to the requirements of Comparable. 
Always. That is a consistent rule that you need to learn once and that doesn’t 
vary depending on which algorithm you’re using.
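
And “supply a predicate” here means something like this (a sketch using the 
IEEE 754 total order, which the standard library already exposes as 
isTotallyOrdered(belowOrEqualTo:)):

    let values: [Double] = [2.0, .nan, 1.0]

    values.sorted()
    // some permutation – .nan answers false to every < it’s asked about

    values.sorted { !$1.isTotallyOrdered(belowOrEqualTo: $0) }
    // [1.0, 2.0, nan] – but only if the caller knows they need to do this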

Another alternative proposed in previous threads is to give Comparable an 
additional operator (<=> or .compare(to:)) that will always enforce a total 
ordering, and sort can use that. This is, afaict, C#’s solution – double.NaN < 
1.0, 1.0 < double.NaN and double.NaN == double.NaN all return false, but 
Comparer<double>.Default.Compare returns -1, 1 and 0 respectively.
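
In Swift terms the shape would be something like this (entirely hypothetical – 
the names are invented, and I’ve used an Int result to mirror C#’s Comparer 
rather than proposing anything specific):

    protocol TotallyComparable: Comparable {
        /// Required to be a total order over all values, exceptional ones included.
        func compare(to other: Self) -> Int    // -1, 0 or 1
    }

    extension Double: TotallyComparable {
        func compare(to other: Double) -> Int {
            if isTotallyOrdered(belowOrEqualTo: other) {
                return other.isTotallyOrdered(belowOrEqualTo: self) ? 0 : -1
            }
            return 1
        }
    }

    Double.nan < 1.0                // still false
    Double.nan == Double.nan        // still false
    Double.nan.compare(to: .nan)    // 0
    Double.nan.compare(to: 1.0)     // 1 – IEEE totalOrder puts NaN after the
                                    // numbers, where C#’s Comparer puts it first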

This is essentially a backdoor to having one comparison concretely, and another 
generically, except whether you get that different behavior is left to the whim 
of the algorithm’s implementor, depending on whether they used < or <=>. And 
what should sets use? Should a hashed set use == and allow multiple NaNs, but an 
ordered set use <=> and exclude them?

> - `index(of:)` works perfectly sensibly without such a relation; if no NaN is 
> equal to any other NaN, `index(of: .nan)` is appropriately `nil`.


It works sensibly when you are calling it from a concrete context, and you know 
that what you have is a collection of floating-point elements.

But it does not work sensibly if you are using index(of:) as a building block 
in a wider generic algorithm on Collection, which relies on the rules set out 
by Equatable. I find myself using index(of:) within generic algorithms 
frequently, and I really don’t know how I should reason about its use if it’s 
accepted that the rules of Equatable just don’t hold in all cases. When 
operating generically more than one level deep like this, you really need to be 
able to rely on those rules.
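
A trivial example of the kind of building block I mean (helper invented purely 
for illustration):

    extension Collection where Element: Equatable {
        /// Positions in `self` of elements that also appear in `other`.
        func indicesOfElements<C: Collection>(alsoIn other: C) -> [Index]
            where C.Element == Element {
            return indices.filter { other.index(of: self[$0]) != nil }
        }
    }

    [1.0, Double.nan].indicesOfElements(alsoIn: [Double.nan, 1.0])
    // [0] – the NaN silently drops out, and nothing in the generic signature
    // hints that it might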

This leads to the other argument against having algorithms generic over 
Equatable behave differently to ones on Float/Double (or FloatingPoint): that 
when you take your FP algorithm, and then loosen the constraints, it changes 
behavior in a way that may no longer be correct.

But we already have the opposite case, where algorithms are written generically 
and then don’t work correctly for floats. I would argue that is a far more 
harmful and subtle issue. I honestly don’t think the situation where someone 
writes an algorithm first against floats, then generalizes it, is a very common 
path to creating a generic algorithm. The inverse – where someone writes an 
algorithm using Equatable, and then people use it with floats – is certainly 
common, hence this discussion.

There really is no way to square this circle. Every option is going to have 
downsides. We have to balance correctness, least surprise/most expected 
behavior for most people, and consistency. For me, making generic uses of 
Equatable and Comparable stick to the documented requirements, while keeping 
FP-specific uses the way they are, is the least bad option.

