One possibility which might be easier would be to just deprecate Float’s Equatable conformance and provide a warning/fixit (suggesting how to use the new ‘~’ relation).
> On Oct 25, 2017, at 2:35 AM, Jonathan Hull <[email protected]> wrote:
>
> There is something interesting here. I like that ‘Bool?’ really represents what we need from the relation on Float. I think it solves the objection that Xiaodi mentioned with a PartialEq protocol. If everyone just switches to using PartialEq in generic contexts because they want it to work with Floats (which is the outcome he was worried about)… then they will be forced to deal with the optional! This brings Swift’s strengths into play, and we will have gained correctness.
>
> The issue is migration, though. Let’s say we name the partial equivalence relation ‘~’ and then we remove Float’s Equatable conformance (no more == for Float). To migrate, we would need to replace any ‘a == b’ where there is a Float with ‘(a ~ b) == true’. Not pretty, but not impossible.
>
> The trouble spot will be where someone has written their own method which is generic on Equatable, and they are passing Floats to it. We need to be careful about the warning given, so that it sends them in the right direction. The good news is that if they do rewrite the function to use ‘~’ it should actually cause them to fix bugs which were there (or at least consider the case of NaN) when dealing with Floats. So, overall good things, but a bit annoying while transitioning.
>
> We should keep thinking/discussing. I am convinced there is an interesting thread to pull on here...
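A minimal sketch of the idea above, assuming the ‘~’ spelling from the message (the operator declaration and implementation here are my own illustration, not part of the proposal):

```swift
// Hypothetical partial-equivalence operator from the discussion above:
// returns nil when the comparison is undefined (either side is NaN).
infix operator ~ : ComparisonPrecedence

extension Float {
    static func ~ (lhs: Float, rhs: Float) -> Bool? {
        if lhs.isNaN || rhs.isNaN { return nil }
        return lhs == rhs
    }
}

// Migration pattern suggested above: `a == b` becomes `(a ~ b) == true`.
let a: Float = .nan
let b: Float = 1.0
print((a ~ b) == true)  // false: a comparison involving NaN never yields true
print((b ~ b) == true)  // true
print((a ~ a) == nil)   // true: NaN ~ NaN is undefined, not false
```

Note how the `Bool?` return type forces every call site to say explicitly what should happen when the comparison is undefined.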
>> On Oct 24, 2017, at 11:25 PM, David Sweeris via swift-dev <[email protected]> wrote:
>>
>>> On Oct 24, 2017, at 9:06 PM, Xiaodi Wu via swift-dev <[email protected]> wrote:
>>>
>>> On Tue, Oct 24, 2017 at 10:08 PM, Ben Cohen <[email protected]> wrote:
>>>
>>> On Oct 24, 2017, at 6:48 PM, Xiaodi Wu <[email protected]> wrote:
>>>
>>>> On Tue, Oct 24, 2017 at 1:55 PM, Ben Cohen <[email protected]> wrote:
>>>>
>>>>> On Oct 19, 2017, at 4:29 PM, Xiaodi Wu via swift-dev <[email protected]> wrote:
>>>>>
>>>>> Differing behavior in generic and concrete contexts is simply too subtle to be understandable to the reader.
>>>>
>>>> Hardly more subtle than the current “Equatable works like this, with these strong guarantees. Oh, except for some cases it doesn’t, in which case ¯\_(ツ)_/¯”
>>>>
>>>> I'm not saying that the status quo is a superior alternative.
>>>>
>>>> However, one option is to _weaken_ the guarantees of Equatable such that it guarantees only partial equivalence for `==`. From the perspective of documented semantics, it's not subtle at all but a giant hammer of a change. However, from an actual what-does-the-implementation-do standpoint, it would be acknowledging what is already true. Only code that is already broken when used with floating-point values would become formally "incorrect" in the sense of relying on semantics that are then no longer guaranteed.
>>>>
>>>> Such a solution would avoid, as you might say, perpetuating the ¯\_(ツ)_/¯ approach to floating point.
>>>>
>>>> I realize that Comparable admits an exception for FP. This is, IMO, a serious mistake and needs to be reversed. Equatable has no such exception and rightly so.
>>>> The clearest demonstrations of how flawed this approach is can be found in the Standard Library. You can throw a brick at it and hit an example of something that’s broken by the presence of .nan: random sort orders, == implementations that differ based on the identity of the buffer,
>>>>
>>>> In my view, if a sort algorithm cannot accommodate NaN, it's entirely acceptable to trap on NaN--and that is a trivial change.
>>>
>>> I think this would be user-hostile. This isn’t like out-of-bounds subscript where it’s just not possible to reasonably proceed. NaNs crop up and people don’t expect them to trap when you sort – they expect them to sort to one end, like in Excel.
>>>
>>> Honestly, I don't know that most users have thought about this possibility at all. Sure, a sort that matches IEEE total order _might_ be justifiable. But users are as likely to expect that the last item in the sorted collection will be the greatest and that the first item in the sorted collection will be the smallest. Now, you can say that NaN compares larger than everything, everywhere. But the moment that they try to plug that last element into, say, an AppKit UI function, they're toast.
>>>
>>> I certainly disagree with ideas of trapping on NaN inside `==` or similar functions, but I really do think that an argument can be made that it is not reasonable to proceed with sorting an array that contains NaN.
>>>
>>>> After all, NaN is unordered with respect to everything and one cannot sort the unsortable. And, as shown, the `Array.==` implementation is trivially fixable. The entire standard library can be made NaN-safe in like manner.
>>>
>>> My point was, it’s not about what we can do in the standard library. The std lib only has a handful of methods and sure, we can fix them one by one.
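As an aside on the sort-to-one-end behavior discussed above: the standard library already exposes IEEE 754 total order as `isTotallyOrdered(belowOrEqualTo:)`, which can serve as a sort predicate that sends NaN deterministically to the high end rather than producing an unspecified order (a sketch of existing API, not a proposal):

```swift
let values: [Double] = [2.0, .nan, 1.0, -1.0]

// IEEE 754 totalOrder as a sort predicate: a (positively signed) NaN
// sorts above +infinity instead of scrambling the result.
let ordered = values.sorted { $0.isTotallyOrdered(belowOrEqualTo: $1) }
print(ordered)  // [-1.0, 1.0, 2.0, nan]
```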
>>> It’s about whether the standard library defines types and protocols such that it’s reasonable for programmers to use them to write and use generic algorithms correctly. I’m citing the existing std lib implementations as proof that it’s easy to make mistakes. And I think a more complicated approach, with more operators, more properties, more rules, won’t fix this problem.
>>>
>>> Well, to my mind, this problem you state really works out to:
>>>
>>> (a) People expect generic algorithms that operate on Comparable types to work correctly with floating-point types
>>> (b) Generic algorithms that operate on Comparable types don't work correctly with floating-point types unless the author is very, very careful
>>> (c) People shouldn't have to be very, very careful to write working generic algorithms that work with floating-point types
>>>
>>> Which, in turn, really boils down to:
>>>
>>> (d) People expect floating-point types not to have numerous unintuitive (but learnable) properties, including NaN being unordered
>>> (e) Floating-point types have numerous unintuitive (but learnable) properties, including NaN being unordered
>>>
>>> The reason I'm writing to swift-dev (rather than evolution) is that my interest is in fixing the standard library. I'm not even convinced that this problem you state is fixable, at least on your terms. In the interest of not increasing the API surface area, you would propose to blow away (e) in the generic but not concrete context. Now, while it's true that an alternative to increasing the API surface area is to have the same API exhibit context-specific behaviors, that certainly isn't any less complicated conceptually, as we would then be faced with the notion that floating-point types both have and do not have numerous unintuitive properties, depending on the context in which they are used.
>>>
>>>> arbitrary duplication in Set/Dictionary etc.
>>>> (I disagree that it's arbitrary. If NaN != NaN, then every NaN is properly unique.)
>>>>
>>>> The important point to take from this is not “how do we fix the Standard Library?” but rather “these errors are easy to make” by anyone writing generic code using standard protocols. If the Standard Library can’t get these right, how can we expect others to? There are potentially far worse bugs that could result. A differently-written sorting algorithm could corrupt elements (because it relied on substitutability). Other sorting or searching algorithms could easily go into an infinite loop. These problems exist because the code relies on the documented behavior of the protocol, because if you can’t, then what is the point in documenting that behavior?
>>>>
>>>> It's not that the standard library *can't* get these right, but that it currently *doesn't*, because it documents one set of semantics but implements another, then relies on documented semantics that it knows it does not implement. We both agree that this needs to be fixed.
>>>>
>>>> The question here is whether it is to be fixed by sticking to the documented semantic guarantees of `==` and bringing all implementations into proper conformance, or alternatively sticking to the implemented behavior of `==` and aligning the documented semantic guarantees to that.
>>>>
>>>> I don’t support solutions such as adding a property indicating “containsExceptionalValues” (whatever that means), and expecting every author of a generic algorithm that uses Equatable to remember to call it, and craft custom paranoid behavior (if there is any reasonable behavior) based on it. With recursive conformance landed on master, we finally have a generics system where writing algorithms against Collection can be considered approachable by ordinary users.
>>>> You no longer have to know things like how Collection.SubSequence needs to be constrained to also be a Collection – it just is. We would be negating this good work to now introduce a whole new set of gotchas that we expect people to know (without the type system even helping them in this case) about how some types, including standard library types, flout the documented rules for Equatable and Comparable, and that you need to use one of a handful of properties to hack in special cases to handle it.
>>>>
>>>> The gotchas aren't new; they arise when using floating point values, originate with the IEEE definition of floating point equivalence, and exist in some form in every language that has implemented collections of floating point values. Crucially, they exist today in Swift; only, we haven't documented it.
>>>>
>>>> And as a user of algorithms, what should you do? If a generic algorithm doesn’t document how it handles these special cases, should you assume it doesn’t? Check the code? Experiment to find out?
>>>>
>>>> This problem also spreads, virus-like, once we have conditional conformance that makes containers equatable when their elements are. [Double] would need to propagate its elements’ “exceptionality”, to avoid problems with [Double]. Double? will have to do the same.
>>>>
>>>> This isn't a _problem_. In fact, I consider this to be a key _feature_. Naturally, every protocol conformance (conditional or not) must implement all protocol requirements, so if we add additional requirements they must be implemented. What I'm saying here is that *it may be desirable* to have some protocol-based API to distinguish partial from full equivalence relations.
>>>> If you accept that premise, then it is the logical consequence that if you conditionally conform `Array` to `Equatable`, you will have to implement any new APIs, and in so doing, document how equivalence of arrays of floating point values relates to floating point equivalence. For me, this is a _good thing_: it documents _in code_ something that today is muddled through.
>>>>
>>>>> The explanation that a method on `Float` is a "floating-point context" but a method on `[Float]` is *not a "floating point context"* is, IMO, indefensible.
>>>>
>>>> Nevertheless, I will attempt to defend it :)
>>>>
>>>> I find it odd that violating the documented requirements of a protocol is considered defensible, but expecting types to comply with those requirements is indefensible. A principled stance would be to say that Float shouldn’t conform to Equatable (because… it doesn’t!) and requiring all calls to supply a predicate (and maybe amending types like Dictionary to allow you to supply one). That won’t fly though – users would complain – so instead we are in this murky ground.
>>>>
>>>> I don't think we should defend violating the documented requirements of a protocol. Either (a) Float should not conform to Equatable (agree, this is a non-starter); (b) how Float conforms to Equatable should be brought into conformance with documented semantics (your stance); or (c) what semantics are documented should be brought into alignment with how conformance is actually implemented (my stance). Naturally, in the last case, additional APIs should be added as needed to make such reduced semantic guarantees useful for generic algorithms.
>>>>
>>>> Later in the thread, you mention a possible fix for sort:
>>>>
>>>>> `sort()` is problematic, but not if a custom predicate is supplied.
>>>> So, we are potentially trading off one subtlety (that < behaves differently in generic and non-generic contexts) for another (that you need to know that you need to pass in a special predicate for sorting, or you get nonsense results). Knowing when an algorithm requires you to supply a predicate (like sort) vs when handling for the special case is built in (like equatable) seems a far worse complication to me than knowing one rule: that generically when constrained to Comparable, Float adheres to the requirements of Comparable. Always. That is a consistent rule that you need to learn once and that doesn’t vary depending on which algorithm you’re using.
>>>>
>>>> I would argue that Float should _always_ adhere to the requirements of Comparable, in all contexts. The question is, rather: what can be the requirements of Comparable such that Float can always adhere to them?
>>>>
>>>> Another alternative proposed in previous threads is to give Comparable an additional operator (<=> or .compare(to:)) that will always enforce a total ordering, and sort can use that. This is, afaict, C#’s solution – double.NaN < 1.0, 1.0 < double.NaN and double.NaN == double.NaN all return false, but Comparer<double>.Default.compare returns -1, 1 and 0 respectively.
>>>>
>>>> This is, essentially, the endpoint of what I'm proposing.
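The C# behavior quoted above (-1, 1 and 0) can be reproduced in a short Swift sketch; `totalCompare(to:)` is a hypothetical name, and the convention that NaN orders below everything matches the cited `Comparer<double>.Default` results:

```swift
extension Double {
    // Sketch of a C#-style total-order compare: NaN orders below
    // everything and equal to itself; otherwise IEEE `<`/`>` decide.
    func totalCompare(to other: Double) -> Int {
        if isNaN { return other.isNaN ? 0 : -1 }
        if other.isNaN { return 1 }
        if self < other { return -1 }
        if self > other { return 1 }
        return 0
    }
}

print(Double.nan.totalCompare(to: 1.0))   // -1
print((1.0).totalCompare(to: .nan))       // 1
print(Double.nan.totalCompare(to: .nan))  // 0
```

A sort built on this predicate always terminates and is deterministic, even for inputs containing NaN, which is exactly the property IEEE `<` fails to provide.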
>>>> Equatable would vend (modulo bikeshedding):
>>>> `==`, a partial equivalence relation
>>>> `~`, a full equivalence relation
>>>> `containsExceptionalValues` (yes, this is a deliberately terrible name, because it's meant to go through bikeshedding), a Boolean value to indicate whether `==` is the same as `~`
>>>>
>>>> Comparable would vend (modulo bikeshedding):
>>>> `<`, `>`, `<=`, `>=`, defined as now
>>>> `<=>`, as in C# `compare` (or maybe, to emphasize the point, `<~>`)
>>>> `containsExceptionalValues`, inherited from `Equatable`, to document the relationship between `<` (etc.) and the spaceship operator
>>>
>>> This looks to me to be an absurd mess of operations, none of which will have much hope of being used in a coherent fashion by most people. Should I use == or ~ here? What are the rules again? Will people remember to not use < when they really need <=>? Probably not. Did the author of this framework I’m using remember? Dunno.
>>>
>>> The syntax here is not the point (or if it is, it can be bikeshedded). The point I'm trying to make is that what you're criticizing as _incoherent_ is also _inescapable_. Floating-point types have a notion of equivalence that isn't full equivalence. For certain use cases (both concrete and generic), we want that partial equivalence, while for other use cases (both concrete and generic), we truly want full equivalence. To work with floating-point types correctly, a user must know that there is a difference between the two. If there is no hope of "most people" understanding this distinction when one relation is named `==` and the other is named `~`, then _a fortiori_ there is no hope of "most people" understanding the distinction when they're conflated into one operator `==` that has different behaviors in different contexts.
>>>
>>> The C# model of compare works because < is not available generically.
>>> There is no choice between < and <=>, and so the model is simple and easily understood by both algorithm implementors and users. And if you need a different ordering, you can supply your own custom comparator. As far as I can tell, it’s a good model and users are happy with it. Swift is different, since the concrete < is exposed to the generic implementation, but having two possibilities and expecting users to pick is IMO a bad idea. Hence the proposed fix that Float’s Comparable.< is required to be a total order, per the requirements of Comparable, essentially giving us the C# model.
>>>
>>> A true C# model would be fine, but the key point of that model to my mind is that partial equivalence and full equivalence are spelled differently (that is, `==` and `Equals`, respectively). It would not work with IEEE `==` being spelled the same way as Comparable `==`. If we were to rename the IEEE operation `&==` instead, then we'd functionally have a design that's broadly similar to the earlier version, only with different names:
>>>
>>> Equatable would vend `==`, a full equivalence relation (and `!=`)
>>> Comparable would vend `<`, `>`, `<=`, `>=`, now operators that reflect a total order over the set of all values; and maybe `<=>`
>>> Floating point would additionally vend `&==` and `&<` (and `&!=`, `&>`, `&<=`, `&>=`)
>>>
>>> One key difference here would be that the partial equivalence relation would now only be found on floating-point types, and it would not be possible to write a generic algorithm that operates on any partially equatable or equatable type.
>>> But the other--and major--issues would be (a) that all concrete uses of floating-point comparison operators would have to be migrated to append an extra `&`; and (b) this syntax suggests that most users want to use `==` *instead of* `&==`, which I'm not sure is the case--and certainly isn't the case if they're trying to do the same things they're used to doing with floating-point values in other languages.
>>
>> What about having the protocol hierarchy look like this? (All names subject to bikeshedding, of course)
>>
>> protocol MaybeEquatable {
>>     static func ?== (lhs: Self, rhs: Self) -> Bool?
>> }
>> protocol MostlyEquatable : MaybeEquatable {
>>     static func == (lhs: Self, rhs: Self) -> Bool
>> }
>> extension MostlyEquatable {
>>     // allows a `MostlyEquatable` or `Equatable` to function as a `MaybeEquatable` without any extra code
>>     static func ?== (lhs: Self, rhs: Self) -> Bool? { return lhs == rhs }
>> }
>> protocol Equatable : MostlyEquatable {} // purely a semantic difference, no extra syntax
>>
>> protocol MaybeComparable : MaybeEquatable {
>>     static func ?< (lhs: Self, rhs: Self) -> Bool?
>>     // plus the rest of them
>> }
>> protocol MostlyComparable : MaybeComparable, MostlyEquatable {
>>     static func < (lhs: Self, rhs: Self) -> Bool
>>     // plus the rest of them
>> }
>> extension MostlyComparable {
>>     // allows a `MostlyComparable` or `Comparable` to function as a `MaybeComparable` without any extra code
>>     static func ?< (lhs: Self, rhs: Self) -> Bool? { return lhs < rhs }
>>     // plus the rest of them
>> }
>> protocol Comparable : MostlyComparable, Equatable {} // purely a semantic difference, no extra syntax
>>
>> extension Double : MostlyComparable {
>>     static func ?== (lhs: Double, rhs: Double) -> Bool? {
>>         return lhs.isNaN || rhs.isNaN ? nil : lhs == rhs
>>     }
>>     static func ?< (lhs: Double, rhs: Double) -> Bool? {
>>         return lhs.isNaN || rhs.isNaN || (lhs.isInfinite && rhs.isInfinite && lhs.sign == rhs.sign) ? nil : lhs < rhs
>>     }
>>     static func == (lhs: Double, rhs: Double) -> Bool {
>>         // whatever current impl is
>>     }
>>     static func < (lhs: Double, rhs: Double) -> Bool {
>>         // whatever current impl is
>>     }
>> }
>>
>> This would let people easily switch between the two kinds of "correct" generic comparisons (returning a `Bool?` for types that could have invalid comparisons and a `Bool` for types that can't), as well as easily opting into using IEEE-754's current "arguably incompatible with sane generic programming" behavior (by constraining T to "MostlyComparable" instead of "Comparable") if that's what they really want.
>>
>> `Collection` could have different implementations of `==` depending on whether `Element` conforms to "MaybeEquatable" or "MostlyEquatable/Equatable", solving the "a = [Double.nan]; print(a == a) // prints true" issue.
>>
>> This doesn't involve trapping on anything, so dunno what the performance implications would be. All the extra work would be contained in the "?*" functions, though, so at least existing code wouldn't suddenly get slower.
>>
>> - Dave Sweeris
>> _______________________________________________
>> swift-dev mailing list
>> [email protected]
>> https://lists.swift.org/mailman/listinfo/swift-dev
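To make the consequence of the `Bool?` design concrete, here is a self-contained sketch of just the `?==` piece (the protocol and operator names are taken from the post above; the generic `count(of:in:)` helper is invented for illustration):

```swift
// Hypothetical operator and protocol from the post above.
infix operator ?== : ComparisonPrecedence

protocol MaybeEquatable {
    static func ?== (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MaybeEquatable {
    static func ?== (lhs: Double, rhs: Double) -> Bool? {
        (lhs.isNaN || rhs.isNaN) ? nil : lhs == rhs
    }
}

// A generic algorithm written against MaybeEquatable is forced to
// decide what a nil (undefined) comparison means for it.
func count<C: Collection>(of target: C.Element, in items: C) -> Int
    where C.Element: MaybeEquatable {
    items.reduce(0) { ($1 ?== target) == true ? $0 + 1 : $0 }
}

let xs: [Double] = [1.0, .nan, 1.0]
print(count(of: 1.0, in: xs))   // 2
print(count(of: .nan, in: xs))  // 0: NaN never compares equal to itself
```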
_______________________________________________
swift-dev mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-dev
