On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jh...@gbis.com> wrote: > > On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote: > > On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jh...@gbis.com> wrote: > >> >> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote: >> >>> That ordering can be arbitrary, but it shouldn’t leak internal >>>> representation such that the method used to create identical things affects >>>> the outcome of generic methods because of differences in internal >>>> representation. >>>> >>>>> >>>>> >>>>> It would be better to say that the iteration order is well-defined. >>>>> That will almost always mean documented, and usually predictable though >>>>> obviously e.g. RNGs and iterating in random order will not be predictable >>>>> by design. >>>>> >>>>>> >>>>>> That's actually more semantically constrained than what Swift calls a >>>>>> `Collection` (which requires conforming types to be multi-pass and(?) >>>>>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits >>>>>> conforming single-pass, infinite, and/or unordered types. >>>>>> >>>>>> >>>>>> I think you’re talking about Sequence here, I’ve lost track of your >>>>>> nonsense by now. Yes, the current Swift protocol named Sequence allows >>>>>> unordered types. You seem to keep asserting that but not actually >>>>>> addressing my argument, which is *that allowing Sequences to be >>>>>> unordered with the current API is undesired and actively harmful, and >>>>>> should* *therefore** be changed*. >>>>>> >>>>> >>>>> What is harmful about it? >>>>> >>>>> >>>>> After thinking about it, I think the harmful bit is that unordered >>>>> sequences are leaking internal representation (In your example, this is >>>>> causing people to be surprised when two sets with identical elements are >>>>> generating different sequences/orderings based on how they were created). >>>>> You are correct when you say that this problem is even true for for-in. >>>>> >>>> >>>> I would not say it is a problem. Rather, by definition, iteration >>>> involves retrieving one element after another; if you're allowed to do that >>>> with Set, then the elements of a Set are observably ordered in some way. >>>> Since it's not an OrderedSet--i.e., order doesn't matter--then the only >>>> sensible conclusion is that the order of elements obtained in a for...in >>>> loop must be arbitrary. If you think this is harmful, then you must believe >>>> that one should be prohibited from iterating over an instance of Set. >>>> Otherwise, Set is inescapably a Sequence by the Swift definition of >>>> Sequence. All extension methods on Sequence like drop(while:) are really >>>> just conveniences for common things that you can do with iterated access; >>>> to my mind, they're essentially just alternative ways of spelling various >>>> for...in loops. >>>> >>>> >>>> I think an argument could be made that you shouldn’t be able to iterate >>>> over a set without first defining an ordering on it (even if that ordering >>>> is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” >>>> protocol which defines things which can be turned into a sequence when >>>> combined with some sort of ordering. One possible ordering could be the >>>> internal representation (At least in that case we are calling it out >>>> specifically). If I had to say >>>> “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely >>>> be less surprised when it returns false even though setA == setB. >>>> >>> >>> Well, that's a totally different direction, then; you're arguing that >>> `Set` and `Dictionary` should not conform to `Sequence` altogether. That's >>> fine (it's also a direction that some of us explored off-list a while ago), >>> but at this point in Swift's evolution, realistically, it's not within the >>> realm of possible changes. >>> >>> >>> I am actually suggesting something slightly different. Basically, Set >>> and Dictionary’s conformance to Collection would have a different >>> implementation. They would conform to another protocol declaring that they >>> are unordered. That protocol would fill in part of the conformance to >>> sequence/collection using a default ordering, which is mostly arbitrary, >>> but guaranteed to produce the same ordering for the same list of elements >>> (even across collection types). This would be safer, but a tiny bit slower >>> than what we have now (We could also potentially develop a way for >>> collections like set to amortize the cost). For those who need to recover >>> speed, the new protocol would also define a property which quickly returns >>> a sequence/iterator using the internal ordering (I arbitrarily called it >>> .arbitraryOrder). >>> >>> I believe it would not be source breaking. >>> >> >> That is indeed something slightly different. >> >> In an ideal world--and my initial understanding of what you were >> suggesting--Set and Dictionary would each have a member like `collection`, >> which would expose the underlying data as a `SetCollection` or >> `DictionaryCollection` that in turn would conform to `Collection`; >> meanwhile, Set and Dictionary themselves would not offer methods such as >> `prefix`, or indexing by subscript, which are not compatible with being >> unordered. For those who want a particular ordering, there'd be something >> like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> >> {Set|Dictionary}Collection`. >> >> What you suggest here instead would be minimally source-breaking. >> However, I'm unsure of where these guarantees provide benefit to justify >> the performance cost. Certainly not for `first` or `dropFirst(_:)`, which >> still yields an arbitrary result which doesn't make sense for something >> _unordered_. We *could* have an underscored customization point named >> something like `_customOrderingPass` that is only invoked from >> `elementsEqual` or other such methods to pre-rearrange the internal >> ordering of unordered collections in some deterministic way before >> comparison. Is that what you have in mind? >> >> >> >> Something like that. Whatever we do, there will be a tradeoff between >> speed, correctness, and ergonomics. >> >> My suggestion trades speed for correctness, and provides a way to recover >> speed through additional typing (which is slightly less ergonomic). >> > > You haven't convinced me that this is at all improved in "correctness." It > trades one arbitrary iteration order for another on a type that tries to > model an unordered collection. > > >> We could do something like you suggest. I don’t think the method would >> need to be underscored… the ordering pass could just be a method on the >> protocol which defines it as unordered. Then we could provide a special >> conformance for things where order really matters based on adherence to >> that protocol. That might be an acceptable tradeoff. It would give us >> speed at the cost of having the correct implementation being less ergonomic >> and more error prone (you have to remember to check that it is unordered >> and call the ordering method when it mattered). >> >> I’d still be a bit worried that people would make incorrect generic >> algorithms based on expecting an order from unordered things, but at least >> it would be possible for them check and handle it correctly. I think I >> could get behind that tradeoff/compromise, given where we are in the swift >> process and Swift's obsession with speed (though I still slightly prefer >> the safer default). At least the standard library would handle all the >> things correctly, and that is what will affect the majority of programmers. >> > > What is an example of such an "incorrect" generic algorithm that would be > made correct by such a scheme? > > > To start with, the one you gave as an example at the beginning of this > discussion: Two sets with identical elements which have different internal > storage and thus give different orderings as sequences. You yourself have > argued that the confusion around this is enough of a problem that we need > to make a source-breaking change (renaming it) to warn people that the > results of the ‘elementsEqual’ algorithm are undefined for sets and > dictionaries. >
No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries). I don’t see why a non-source-breaking change is suddenly off-limits. > > But more than that, any generic algorithm which is assuming that the > sequence is coming from an ordered source (i.e. many things using > first/last). Some uses of first are ok because the programmer actually > means ‘any’, but anywhere where they actually mean first/last may be > problematic. > Such as...? Currently, there is no way to test for ordered-ness, so there is no way for > even a careful programmer to mitigate this problem. By adding a protocol > which states that something is unordered, we can either branch on it, or > create a separate version of an algorithm for things which conform. > It is clearly the case that Swift’s protocol hierarchy fits sets and collections imperfectly; however, it is in the nature of modeling that imperfections are present. The question is not whether it is possible to incur performance, API surface area, and other trade-offs to make the model more faithful, but rather whether this usefully solves any problem. What is the problem being mitigated? As I write above, Swift’s Set and Dictionary types meet the semantic requirements for Collection and moonlight as ordered collections. What is a generic algorithm on an ordered collection that is “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said, is not such an example.)
_______________________________________________ swift-evolution mailing list swift-evolution@swift.org https://lists.swift.org/mailman/listinfo/swift-evolution