One possibility which might be easier would be to just deprecate Float’s Equatable conformance and provide a warning/fixit (suggesting how to use the new ‘~’ relation).
> On Oct 25, 2017, at 2:35 AM, Jonathan Hull <[email protected]> wrote:
>
> There is something interesting here. I like that ‘Bool?’ really represents what we need from the relation on Float. I think it solves the objection that Xiaodi mentioned with a PartialEq protocol. If everyone just switches to using PartialEq in generic contexts because they want it to work with Floats (which is the outcome he was worried about)… then they will be forced to deal with the optional! This brings Swift’s strengths into play, and we will have gained correctness.
>
> The issue is migration, though. Let’s say we name the partial equivalence relation ‘~’ and then we remove Float’s Equatable conformance (no more == for Float). To migrate, we would need to replace any ‘a == b’ where there is a Float with ‘(a ~ b) == true’. Not pretty, but not impossible.
>
> The trouble spot will be where someone has written their own method which is generic on Equatable, and they are passing Floats to it. We need to be careful about the warning given, so that it sends them in the right direction. The good news is that if they do rewrite the function to use ‘~’ it should actually cause them to fix bugs which were there (or at least consider the case of NaN) when dealing with Floats. So, overall good things, but a bit annoying while transitioning.
>
> We should keep thinking/discussing. I am convinced there is an interesting thread to pull on here...
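A minimal sketch of the idea above, assuming the ‘~’ spelling from the message (the operator declaration and implementation here are my own illustration, not part of the proposal):

```swift
// Hypothetical partial-equivalence operator from the discussion above:
// returns nil when the comparison is undefined (either side is NaN).
infix operator ~ : ComparisonPrecedence

extension Float {
    static func ~ (lhs: Float, rhs: Float) -> Bool? {
        if lhs.isNaN || rhs.isNaN { return nil }
        return lhs == rhs
    }
}

// Migration pattern suggested above: `a == b` becomes `(a ~ b) == true`.
let a: Float = .nan
let b: Float = 1.0
print((a ~ b) == true)  // false: a comparison involving NaN never yields true
print((b ~ b) == true)  // true
print((a ~ a) == nil)   // true: NaN ~ NaN is undefined, not false
```

Note how the `Bool?` return type forces every call site to say explicitly what should happen when the comparison is undefined.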
>> On Oct 24, 2017, at 11:25 PM, David Sweeris via swift-dev <[email protected]> wrote:
>>
>>> On Oct 24, 2017, at 9:06 PM, Xiaodi Wu via swift-dev <[email protected]> wrote:
>>>
>>> On Tue, Oct 24, 2017 at 10:08 PM, Ben Cohen <[email protected]> wrote:
>>>
>>> On Oct 24, 2017, at 6:48 PM, Xiaodi Wu <[email protected]> wrote:
>>>
>>>> On Tue, Oct 24, 2017 at 1:55 PM, Ben Cohen <[email protected]> wrote:
>>>>
>>>>> On Oct 19, 2017, at 4:29 PM, Xiaodi Wu via swift-dev <[email protected]> wrote:
>>>>>
>>>>> Differing behavior in generic and concrete contexts is simply too subtle to be understandable to the reader.
>>>>
>>>> Hardly more subtle than the current “Equatable works like this, with these strong guarantees. Oh, except for some cases it doesn’t, in which case ¯\_(ツ)_/¯”
>>>>
>>>> I'm not saying that the status quo is a superior alternative.
>>>>
>>>> However, one option is to _weaken_ the guarantees of Equatable such that it guarantees only partial equivalence for `==`. From the perspective of documented semantics, it's not subtle at all but a giant hammer of a change. However, from an actual what-does-the-implementation-do standpoint, it would be acknowledging what is already true. Only code that is already broken when used with floating-point values would become formally "incorrect" in the sense of relying on semantics that are then no longer guaranteed.
>>>>
>>>> Such a solution would avoid, as you might say, perpetuating the ¯\_(ツ)_/¯ approach to floating point.
>>>>
>>>> I realize that Comparable admits an exception for FP. This is, IMO, a serious mistake and needs to be reversed. Equatable has no such exception and rightly so.
>>>> The clearest demonstrations of how flawed this approach is can be found in the Standard Library. You can throw a brick at it and hit an example of something that’s broken by the presence of .nan: random sort orders, == implementations that differ based on the identity of the buffer,
>>>>
>>>> In my view, if a sort algorithm cannot accommodate NaN, it's entirely acceptable to trap on NaN--and that is a trivial change.
>>>
>>> I think this would be user-hostile. This isn’t like out-of-bounds subscript where it’s just not possible to reasonably proceed. NaNs crop up and people don’t expect them to trap when you sort – they expect them to sort to one end, like in Excel.
>>>
>>> Honestly, I don't know that most users have thought about this possibility at all. Sure, a sort that matches IEEE total order _might_ be justifiable. But users are as likely to expect that the last item in the sorted collection will be the greatest and that the first item in the sorted collection will be the smallest. Now, you can say that NaN compares larger than everything, everywhere. But the moment that they try to plug that last element into, say, an AppKit UI function, they're toast.
>>>
>>> I certainly disagree with ideas of trapping on NaN inside `==` or similar functions, but I really do think that an argument can be made that it is not reasonable to proceed with sorting an array that contains NaN.
>>>
>>>> After all, NaN is unordered with respect to everything and one cannot sort the unsortable. And, as shown, the `Array.==` implementation is trivially fixable. The entire standard library can be made NaN-safe in like manner.
>>>
>>> My point was, it’s not about what we can do in the standard library. The std lib only has a handful of methods and sure, we can fix them one by one.
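As an aside on the sort-to-one-end behavior discussed above: the standard library already exposes IEEE 754 total order as `isTotallyOrdered(belowOrEqualTo:)`, which can serve as a sort predicate that sends NaN deterministically to the high end rather than producing an unspecified order (a sketch of existing API, not a proposal):

```swift
let values: [Double] = [2.0, .nan, 1.0, -1.0]

// IEEE 754 totalOrder as a sort predicate: a (positively signed) NaN
// sorts above +infinity instead of scrambling the result.
let ordered = values.sorted { $0.isTotallyOrdered(belowOrEqualTo: $1) }
print(ordered)  // [-1.0, 1.0, 2.0, nan]
```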
>>> It’s about whether the standard library defines types and protocols such that it’s reasonable for programmers to use them to write and use generic algorithms correctly. I’m citing the existing std lib implementations as proof that it’s easy to make mistakes. And I think a more complicated approach, with more operators, more properties, more rules, won’t fix this problem.
>>>
>>> Well, to my mind, this problem you state really works out to:
>>>
>>> (a) People expect generic algorithms that operate on Comparable types to work correctly with floating-point types
>>> (b) Generic algorithms that operate on Comparable types don't work correctly with floating-point types unless the author is very, very careful
>>> (c) People shouldn't have to be very, very careful to write working generic algorithms that work with floating-point types
>>>
>>> Which, in turn, really boils down to:
>>>
>>> (d) People expect floating-point types not to have numerous unintuitive (but learnable) properties, including NaN being unordered
>>> (e) Floating-point types have numerous unintuitive (but learnable) properties, including NaN being unordered
>>>
>>> The reason I'm writing to swift-dev (rather than evolution) is that my interest is in fixing the standard library. I'm not even convinced that this problem you state is fixable, at least on your terms. In the interest of not increasing the API surface area, you would propose to blow away (e) in the generic but not concrete context. Now, while it's true that an alternative to increasing the API surface area is to have the same API exhibit context-specific behaviors, that certainly isn't any less complicated conceptually, as we would then be faced with the notion that floating-point types both have and do not have numerous unintuitive properties, depending on the context in which they are used.
>>>
>>>> arbitrary duplication in Set/Dictionary etc.
>>>> (I disagree that it's arbitrary. If NaN != NaN, then every NaN is properly unique.)
>>>>
>>>> The important point to take from this is not “how do we fix the Standard Library?” but rather “these errors are easy to make” by anyone writing generic code using standard protocols. If the Standard Library can’t get these right, how can we expect others to? There are potentially far worse bugs that could result. A differently-written sorting algorithm could corrupt elements (because it relied on substitutability). Other sorting or searching algorithms could easily go into an infinite loop. These problems exist because the code relies on the documented behavior of the protocol, because if you can’t, then what is the point in documenting that behavior?
>>>>
>>>> It's not that the standard library *can't* get these right, but that it currently *doesn't*, because it documents one set of semantics but implements another, then relies on documented semantics that it knows it does not implement. We both agree that this needs to be fixed.
>>>>
>>>> The question here is whether it is to be fixed by sticking to the documented semantic guarantees of `==` and bringing all implementations into proper conformance, or alternatively sticking to the implemented behavior of `==` and aligning the documented semantic guarantees to that.
>>>>
>>>> I don’t support solutions such as adding a property indicating “containsExceptionalValues” (whatever that means), and expecting every author of a generic algorithm that uses Equatable to remember to call it, and craft custom paranoid behavior (if there is any reasonable behavior) based on it. With recursive conformance landed on master, we finally have a generics system where writing algorithms against Collection can be considered approachable by ordinary users.
>>>> You no longer have to know things like how Collection.SubSequence needs to be constrained to also be a Collection – it just is. We would be negating this good work to now introduce a whole new set of gotchas that we expect people to know (without the type system even helping them in this case) about how some types, including standard library types, flout the documented rules for Equatable and Comparable, and that you need to use one of a handful of properties to hack in special cases to handle it.
>>>>
>>>> The gotchas aren't new; they arise when using floating point values, originate with the IEEE definition of floating point equivalence, and exist in some form in every language that has implemented collections of floating point values. Crucially, they exist today in Swift; only, we haven't documented it.
>>>>
>>>> And as a user of algorithms, what should you do? If a generic algorithm doesn’t document how it handles these special cases, should you assume it doesn’t? Check the code? Experiment to find out?
>>>>
>>>> This problem also spreads, virus-like, once we have conditional conformance that makes containers equatable when their elements are. [Double] would need to propagate its elements’ “exceptionality”, to avoid problems with [Double]. Double? will have to do the same.
>>>>
>>>> This isn't a _problem_. In fact, I consider this to be a key _feature_. Naturally, every protocol conformance (conditional or not) must implement all protocol requirements, so if we add additional requirements they must be implemented. What I'm saying here is that *it may be desirable* to have some protocol-based API to distinguish partial from full equivalence relations.
>>>> If you accept that premise, then it is the logical consequence that if you conditionally conform `Array` to `Equatable`, you will have to implement any new APIs, and in so doing, document how equivalence of arrays of floating point values relates to floating point equivalence. For me, this is a _good thing_: it documents _in code_ something that today is muddled through.
>>>>
>>>>> The explanation that a method on `Float` is a "floating-point context" but a method on `[Float]` is *not a "floating point context"* is, IMO, indefensible.
>>>>
>>>> Nevertheless, I will attempt to defend it :)
>>>>
>>>> I find it odd that violating the documented requirements of a protocol is considered defensible, but expecting types to comply with those requirements is indefensible. A principled stance would be to say that Float shouldn’t conform to Equatable (because… it doesn’t!) and requiring all calls to supply a predicate (and maybe amending types like Dictionary to allow you to supply one). That won’t fly though – users would complain – so instead we are in this murky ground.
>>>>
>>>> I don't think we should defend violating the documented requirements of a protocol. Either (a) Float should not conform to Equatable (agree, this is a non-starter); (b) how Float conforms to Equatable should be brought into conformance with documented semantics (your stance); or (c) what semantics are documented should be brought into alignment with how conformance is actually implemented (my stance). Naturally, in the last case, additional APIs should be added as needed to make such reduced semantic guarantees useful for generic algorithms.
>>>>
>>>> Later in the thread, you mention a possible fix for sort:
>>>>
>>>>> `sort()` is problematic, but not if a custom predicate is supplied.
>>>> So, we are potentially trading off one subtlety (that < behaves differently in generic and non-generic contexts) for another (that you need to know that you need to pass in a special predicate for sorting, or you get nonsense results). Knowing when an algorithm requires you to supply a predicate (like sort) vs when handling for the special case is built in (like equatable) seems a far worse complication to me than knowing one rule: that generically when constrained to Comparable, Float adheres to the requirements of Comparable. Always. That is a consistent rule that you need to learn once and that doesn’t vary depending on which algorithm you’re using.
>>>>
>>>> I would argue that Float should _always_ adhere to the requirements of Comparable, in all contexts. The question is, rather: what can be the requirements of Comparable such that Float can always adhere to them?
>>>>
>>>> Another alternative proposed in previous threads is to give Comparable an additional operator (<=> or .compare(to:)) that will always enforce a total ordering, and sort can use that. This is, afaict, C#’s solution – double.NaN < 1.0, 1.0 < double.NaN and double.NaN == double.NaN all return false, but Comparer<double>.Default.compare returns -1, 1 and 0 respectively.
>>>>
>>>> This is, essentially, the endpoint of what I'm proposing.
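The C# behavior quoted above (-1, 1 and 0) can be reproduced in a short Swift sketch; `totalCompare(to:)` is a hypothetical name, and the convention that NaN orders below everything matches the cited `Comparer<double>.Default` results:

```swift
extension Double {
    // Sketch of a C#-style total-order compare: NaN orders below
    // everything and equal to itself; otherwise IEEE `<`/`>` decide.
    func totalCompare(to other: Double) -> Int {
        if isNaN { return other.isNaN ? 0 : -1 }
        if other.isNaN { return 1 }
        if self < other { return -1 }
        if self > other { return 1 }
        return 0
    }
}

print(Double.nan.totalCompare(to: 1.0))   // -1
print((1.0).totalCompare(to: .nan))       // 1
print(Double.nan.totalCompare(to: .nan))  // 0
```

A sort built on this predicate always terminates and is deterministic, even for inputs containing NaN, which is exactly the property IEEE `<` fails to provide.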
>>>> Equatable would vend (modulo bikeshedding):
>>>> `==`, a partial equivalence relation
>>>> `~`, a full equivalence relation
>>>> `containsExceptionalValues` (yes, this is a deliberately terrible name, because it's meant to go through bikeshedding), a Boolean value to indicate whether `==` is the same as `~`
>>>>
>>>> Comparable would vend (modulo bikeshedding):
>>>> `<`, `>`, `<=`, `>=`, defined as now
>>>> `<=>`, as in C# `compare` (or maybe, to emphasize the point, `<~>`)
>>>> `containsExceptionalValues`, inherited from `Equatable`, to document the relationship between `<` (etc.) and the spaceship operator
>>>
>>> This looks to me to be an absurd mess of operations, none of which will have much hope of being used in a coherent fashion by most people. Should I use == or ~ here? What are the rules again? Will people remember to not use < when they really need <=>? Probably not. Did the author of this framework I’m using remember? Dunno.
>>>
>>> The syntax here is not the point (or if it is, it can be bikeshedded). The point I'm trying to make is that what you're criticizing as _incoherent_ is also _inescapable_. Floating-point types have a notion of equivalence that isn't full equivalence. For certain use cases (both concrete and generic), we want that partial equivalence, while for other use cases (both concrete and generic), we truly want full equivalence. To work with floating-point types correctly, a user must know that there is a difference between the two. If there is no hope of "most people" understanding this distinction when one relation is named `==` and the other is named `~`, then _a fortiori_ there is no hope of "most people" understanding the distinction when they're conflated into one operator `==` that has different behaviors in different contexts.
>>>
>>> The C# model of compare works because < is not available generically.
>>> There is no choice between < and <=>, and so the model is simple and easily understood by both algorithm implementors and users. And if you need a different ordering, you can supply your own custom comparator. As far as I can tell, it’s a good model and users are happy with it. Swift is different, since the concrete < is exposed to the generic implementation, but having two possibilities and expecting users to pick is IMO a bad idea. Hence the proposed fix that Float’s Comparable.< is required to be a total order, per the requirements of Comparable, essentially giving us the C# model.
>>>
>>> A true C# model would be fine, but the key point of that model to my mind is that partial equivalence and full equivalence are spelled differently (that is, `==` and `Equals`, respectively). It would not work with IEEE `==` being spelled the same way as Comparable `==`. If we were to rename the IEEE operation `&==` instead, then we'd functionally have a design that's broadly similar to the earlier version, only with different names:
>>>
>>> Equatable would vend `==`, a full equivalence relation (and `!=`)
>>> Comparable would vend `<`, `>`, `<=`, `>=`, now operators that reflect a total order over the set of all values; and maybe `<=>`
>>> Floating point would additionally vend `&==` and `&<` (and `&!=`, `&>`, `&<=`, `&>=`)
>>>
>>> One key difference here would be that the partial equivalence relation would now only be found on floating-point types, and it would not be possible to write a generic algorithm that operates on any partially equatable or equatable type.
>>> But the other--and major--issues would be (a) that all concrete uses of floating-point comparison operators would have to be migrated to append an extra `&`; and (b) this syntax suggests that most users want to use `==` *instead of* `&==`, which I'm not sure is the case--and certainly isn't the case if they're trying to do the same things they're used to doing with floating-point values in other languages.
>>
>> What about having the protocol hierarchy look like this? (All names subject to bikeshedding, of course)
>>
>> protocol MaybeEquatable {
>>     static func ?== (lhs: Self, rhs: Self) -> Bool?
>> }
>> protocol MostlyEquatable : MaybeEquatable {
>>     static func == (lhs: Self, rhs: Self) -> Bool
>> }
>> extension MostlyEquatable {
>>     // allows a `MostlyEquatable` or `Equatable` to function as a `MaybeEquatable` without any extra code
>>     static func ?== (lhs: Self, rhs: Self) -> Bool? { return lhs == rhs }
>> }
>> protocol Equatable : MostlyEquatable {} // purely a semantic difference, no extra syntax
>>
>> protocol MaybeComparable : MaybeEquatable {
>>     static func ?< (lhs: Self, rhs: Self) -> Bool?
>>     // plus the rest of them
>> }
>> protocol MostlyComparable : MaybeComparable, MostlyEquatable {
>>     static func < (lhs: Self, rhs: Self) -> Bool
>>     // plus the rest of them
>> }
>> extension MostlyComparable {
>>     // allows a `MostlyComparable` or `Comparable` to function as a `MaybeComparable` without any extra code
>>     static func ?< (lhs: Self, rhs: Self) -> Bool? { return lhs < rhs }
>>     // plus the rest of them
>> }
>> protocol Comparable : MostlyComparable, Equatable {} // purely a semantic difference, no extra syntax
>>
>> extension Double : MostlyComparable {
>>     static func ?== (lhs: Double, rhs: Double) -> Bool? {
>>         return lhs.isNaN || rhs.isNaN ? nil : lhs == rhs
>>     }
>>     static func ?< (lhs: Double, rhs: Double) -> Bool? {
>>         return lhs.isNaN || rhs.isNaN || (lhs.isInfinite && rhs.isInfinite && lhs.sign == rhs.sign) ? nil : lhs < rhs
>>     }
>>     static func == (lhs: Double, rhs: Double) -> Bool {
>>         // whatever current impl is
>>     }
>>     static func < (lhs: Double, rhs: Double) -> Bool {
>>         // whatever current impl is
>>     }
>> }
>>
>> This would let people easily switch between the two kinds of "correct" generic comparisons (returning a `Bool?` for types that could have invalid comparisons and a `Bool` for types that can't), as well as easily opting into using IEEE-754's current "arguably incompatible with sane generic programming" behavior (by constraining T to "MostlyComparable" instead of "Comparable") if that's what they really want.
>>
>> `Collection` could have different implementations of `==` depending on whether `Element` conforms to "MaybeEquatable" or "MostlyEquatable/Equatable", solving the "a = [Double.nan]; print(a == a) // prints true" issue.
>>
>> This doesn't involve trapping on anything, so dunno what the performance implications would be. All the extra work would be contained in the "?*" functions, though, so at least existing code wouldn't suddenly get slower.
>>
>> - Dave Sweeris
>> _______________________________________________
>> swift-dev mailing list
>> [email protected]
>> https://lists.swift.org/mailman/listinfo/swift-dev
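To make the consequence of the `Bool?` design concrete, here is a self-contained sketch of just the `?==` piece (the protocol and operator names are taken from the post above; the generic `count(of:in:)` helper is invented for illustration):

```swift
// Hypothetical operator and protocol from the post above.
infix operator ?== : ComparisonPrecedence

protocol MaybeEquatable {
    static func ?== (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MaybeEquatable {
    static func ?== (lhs: Double, rhs: Double) -> Bool? {
        (lhs.isNaN || rhs.isNaN) ? nil : lhs == rhs
    }
}

// A generic algorithm written against MaybeEquatable is forced to
// decide what a nil (undefined) comparison means for it.
func count<C: Collection>(of target: C.Element, in items: C) -> Int
    where C.Element: MaybeEquatable {
    items.reduce(0) { ($1 ?== target) == true ? $0 + 1 : $0 }
}

let xs: [Double] = [1.0, .nan, 1.0]
print(count(of: 1.0, in: xs))   // 2
print(count(of: .nan, in: xs))  // 0: NaN never compares equal to itself
```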
_______________________________________________
swift-dev mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-dev
