> Am 15.10.2017 um 10:38 schrieb Xiaodi Wu via swift-evolution > <swift-evolution@swift.org>: > > On Sun, Oct 15, 2017 at 2:29 AM, Kevin Nattinger <sw...@nattinger.net > <mailto:sw...@nattinger.net>> wrote: > >> On Oct 14, 2017, at 7:54 PM, Xiaodi Wu <xiaodi...@gmail.com >> <mailto:xiaodi...@gmail.com>> wrote: >> >> On Sat, Oct 14, 2017 at 6:17 PM, Kevin Nattinger <sw...@nattinger.net >> <mailto:sw...@nattinger.net>> wrote: >>>> […] >>>> * A Swift `Sequence` is, to put it simplistically, a thing that can be >>>> iterated over in a `for...in` loop. If it would make you happy, for the >>>> rest of the discussion, let's suppose we called the protocol `ForLoopable` >>>> instead of `Sequence`. >>> >>> ForLoopable is so ugly. Since we’re just iterating over the elements, how >>> about, oh, say, `Iterable`? Hey, that looks familiar. >>> >>> I'm not trying to bikeshed the name of `Sequence`. I'm picking an >>> intentionally unwieldy name for the purposes of discussing the semantics of >>> this particular protocol. The point is that the underlying issue has >>> nothing to do with the name; it can be `Iterable` or it can be `SpongeBob` >>> for all I care. >> >> I’m not trying to bikeshed the name either, The underlying issue is that >> (what is currently) Sequence actually encompasses two separate >> functionalities, and those functionalities need to be separated with their >> separate semantic requirements documented. “Sequence: Iterable,” >> “OrderedSequence: Sequence,” “SpongeBob: ForLoopable,” the names are 100% >> irrelevant at this point; what’s important is that one is not necessarily >> ordered and the other guarantees an order. >> >> >> What are the "two separate functionalities”? > > Iteration, with convenience methods that don’t imply or rely on an order that > may not be there; and convenience methods applicable to sequences that do > have an intrinsic order. > > Sets, as a mathematical concept, have no intrinsic order. However, instances > of `Set`, which can be iterated over, *do* have at least one order which can > be said to be intrinsic in the following sense: as long as iteration is > possible, no API design can prevent that order from being observed and > associated with the instance. Put another way, if you can use an instance of > a type in a for...in loop, intrinsic to that functionality is a publicly > visible order.
I disagree. Sets are value types, therefore two instances of `Set` are equal if they contain the same elements. An intrinsic order should therefore only depend on the elements contained and should be the same for two instances of `Set` which are equal. This is not the case, though, as you can easily check in a playground by looking at Set([1,2,3,4,5,6]) and Set([6,5,4,3,2,1]) which represent the same value and are equal but do *not* have the same order. > > >> All the extension methods on Sequence are ways of spelling things that you >> can write in a few lines of code using a `for...in` loop; they're in the >> stdlib to allow a more functional style which some prefer. If you accept >> that a type should support iteration with a `for...in` loop, then what is >> your basis for claiming that these extension methods are "separate >> functionalities”? > > Just because you *can* express something in code doesn’t mean you should, or > that it’s correct. It is objectively false to say a Set has a first or last > object, because the objects therein have no order. You can take a random > object from the set and call it “first”, but that doesn’t make that a correct > definition of Set.first. A Set has no order, a specific iteration has an > “order” only in the sense that all and only the objects in the set have to > come out one at a time, but that doesn’t mean the Set itself has an order, > specifically a first or last object. > > Since Set conforms to Collection, it is guaranteed that if one element of an > instance of Set comes out first one time, it'll come out first every time > from that instance. If it helps, think of Swift's Set as modeling > (imperfectly, as all models must) both a mathematical set and a multi-pass > sequence, just as Swift's Int models both an integer and a sequence of bits. > > You’re a fan of the principal of least surprise. Tell me, which would be less > surprising: Set.dropFirst() actually drops a random element, or Set doesn’t > have a dropFirst()? And if you think dropFirst() removing an element at > random is not surprising, please explain why. > > I think Set.dropFirst removing the first element that I observe on iteration > is the least surprising answer, because Swift tells me that the stdlib Set > models a set but that it is also a sequence. The latter is exactly the problem Kevin did point out. A Set is an Iterable (in the sense that I can iterate over its elements with the order being a meaningless random side effect) but it is *not* a Sequence (in the sense that the order conveys any meaning). -Thorsten > >>>> […] >>> >>>> * If a type `T` conforms to `ForLoopable` and an instance `t` of that type >>>> has at least one element, then *something* has to be the first element in >>>> a `for element in t { ... }` loop. Put another way, every instance of a >>>> type that conforms to `ForLoopable` must have at least one publicly >>>> observable order (although, intriguingly, I'm not sure it has to be a >>>> repeatable one). It is possible, therefore, to have a semantic answer to >>>> the question of which element is `first` or (if finite) `last`; one can >>>> also `drop(while:)`, etc., and perform lexicographical comparisons. >>> >>> As a side effect of Swift being a procedural language each iteration >>> happens to occur in some order, yes, but that order is meaningless and >>> reflects nothing about the Set itself. In fact, I’d say that `first`, >>> `last`, etc. are not even defined on the original Set per se, only on the >>> specific order that a particular iteration resulted in. And that order is >>> not necessarily predictable, nor necessarily stable, as you yourself said. >>> >>> Consider an Iterable that gives a different order every time it’s iterated. >>> Should calling `.first` or `last` give a different object every time? >>> That’s absurd. >>> Should an object lexicographically compare not equal to itself? Even more >>> absurd. >>> >>> What's your basis for saying that such behavior is absurd? It is explicitly >>> permitted for instances of types conforming to `SpongeBob` to be >>> single-pass and/or infinite. For a single-pass `SpongeBob`, `first` will >>> certainly return a different value every time it is invoked. >> >> Is `first` mutating? No. Should it be? No! `first` and `last` are a peek at >> the state of the object. >> >> You're right, `first` should not be mutating; that's actually an important >> design consideration, as Ole pointed out, and it's not actually available on >> `Sequence` for that reason. However, `first { _ in true }` is available and >> is potentially mutating, as are all methods on Sequence by design. >> >> Is `elementsEqual` (or *shudder* lexicographicallyEqual) reflexive? IMO it >> clearly should be. Especially with the “lexicographically” part—from >> everything I can find, a lexicographical ordering is by definition >> reflexive. Do you have a citation for the idea that lexicographical equality >> can legitimately be non-reflexive? >> >> Clearly (tautologically?), such a function should be reflexive for any >> argument ordered with respect to itself. However, if there is no >> lexicographical comparison possible, then a thing cannot compare >> lexicographically equal to anything, even itself. > > And that’s PRECISELY why lexicographicallyEqual does not make sense to apply > to unordered sets. There is no lexicographical comparison possible, so why do > you keep insisting they should have a method that falsely claims to > lexicographically compare them? > > I agree! It doesn't make sense if no comparison is possible! But Swift tells > me that a `Set` is a `Sequence`! >> >>> A random number generator fulfills all the semantic requirements of >>> conforming to `SpongeBob`, and in fact I do just that in NumericAnnex >>> <https://github.com/xwu/NumericAnnex/blob/master/Sources/PRNG.swift#L53>. >>> `first` gives a different value every time, and a randomly generated >>> `SpongeBob` would unsurprisingly compare lexicographically not equal to >>> itself. >> >> > IMO that’s a bug in the implementation; lexicographical equality is >> > reflexive, period. >> >> > Presumably the `elementsEqual` method contains something along these lines >> > (take from SequenceAlgorithms.swift.gyb): >> >> var iter1 = self.makeIterator() >> var iter2 = other.makeIterator() >> >> > By creating two iterators, you’re mutating while iterating. Turns out >> > there’s even a warning against this in Sequence.swift: >> >> /// Using Multiple Iterators >> /// ======================== >> /// >> /// Whenever you use multiple iterators (or `for`-`in` loops) over a single >> /// sequence, be sure you know that the specific sequence supports repeated >> /// iteration, either because you know its concrete type or because the >> /// sequence is also constrained to the `Collection` protocol. >> /// >> /// Obtain each separate iterator from separate calls to the sequence's >> /// `makeIterator()` method rather than by copying. Copying an iterator is >> /// safe, but advancing one copy of an iterator by calling its `next()` >> method >> /// may invalidate other copies of that iterator. `for`-`in` loops are safe >> in >> /// this regard. >> >> > The default implementation of elementsEqual is therefore unsafe because it >> > has the potential for using an invalidated iterator. >> >> You are misunderstanding the warning in the second paragraph here. The >> implementation (not a default implementation, unless I'm mistaken, as it >> cannot be overridden) > >> makes each iterator using separate calls to `makeIterator()`, just as the >> documentation tells you to do. Calling next() on one iterator does not >> invalidate the other iterator, because the second is not a copy of the first. > > Indeed, I misread that comment. > That said, is there a well-defined behavior when iterating a one-shot > sequence with two iterators? > (It *is* a default implementation, btw) > > Not sure about that; I don't see a protocol requirement, in which case it can > only be shadowed in a concrete type but it can't be overridden. How to > accommodate single-pass sequences is an interesting question. Off the top of > my head, an iterator would have to be or wrap a reference type. >> >> You are, however, right that calling `rng.elementsEqual(rng)` is not >> advised. It isn't unsafe; it's just not useful. That said, calling >> `array.elementsEqual(array)` is equally safe and equally useless, and the >> uselessness of such a reflexive comparison is neither here nor there. > > Funny how you complain about my code being useless and yet you insist below, > "If it's not providing you with utility, then don't use it.” > Regardless, you’re wrong to dismiss this case. `foo.elementsEqual(foo)` on > its own makes little sense, sure, but you could easily find yourself in a > method comparing two iterators you obtained from elsewhere, and occasionally > they happen to be the identical object. Allowing an iterator to return > `false` for .elementsEqual on itself is unexpected and dangerous. > > You will always have to account for this possibility, because Swift's > `Equatable` explicitly allows "special values" to be not equal to themselves. > This is, at least in part, in order to accommodate the IEEE decree that NaN > != NaN: > > ``` > let x = [Double.nan] > x.elementsEqual(x) // false > ``` > > Changing this behavior is way beyond the scope of this thread (and has been > the topic of hours (actually, weeks and weeks) of fun on this list > previously). > >>> On the other hand, if I have a collection of objects that I want iterated >>> in a particular order, I can use a container that iterates in a specific, >>> known, well-defined way, and use that to construct the sequence of objects. >>> That’s clearly an Iterable collection, but the guarantee is stronger than >>> that. Since it iterates objects in a specific sequence, the logical way to >>> express that would be `Sequence: Iterable`. Again, we’ve seen that before. >>> >>> Now, since a Sequence is guaranteed to iterate the same every time, >>> suddenly our `first`, `last`, `drop*`, etc. methods have a meaning inherent >>> to the collection itself, rather than a specific iteration. >>> >>> What you call a "Sequence" here would have to be multi-pass, finite, and >>> ordered. >> >> > Ordered, yes, but it’s only admittedly poor wording that suggests >> > multi-pass, and I don’t think anything there suggests finite. >> >> If a Sequence is "guaranteed to iterate the same every time," then surely it >> must be multi-pass; what's the alternative? > > Not sure if you just missed the very next sentence or are actively ignoring > it just to be argumentative. I already acknowledged that that phrase didn’t > convey the meaning I intended, and a Sequence is not and should not be > required to be multi-pass. > > I entirely misunderstood your next sentence as asserting that being > multi-pass makes the iteration order well-defined. >> >> It would be better to say that the iteration order is well-defined. That >> will almost always mean documented, and usually predictable though obviously >> e.g. RNGs and iterating in random order will not be predictable by design. > > Wouldn't it then suffice to document, say, that a set's iteration order is > the insertion order? >> >>> That's actually more semantically constrained than what Swift calls a >>> `Collection` (which requires conforming types to be multi-pass and(?) >>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits >>> conforming single-pass, infinite, and/or unordered types. >> >> I think you’re talking about Sequence here, I’ve lost track of your nonsense >> by now. Yes, the current Swift protocol named Sequence allows unordered >> types. You seem to keep asserting that but not actually addressing my >> argument, which is that allowing Sequences to be unordered with the current >> API is undesired and actively harmful, and should therefore be changed. >> >> What is harmful about it? > > Well, for one, the issue you raised this thread about—two sets that are `==` > could return either true or false for `elementsEqual`, depending on how they > arrived at their current state. That’s not acceptable, and the problem isn’t > with the name of the method. > > Apple documentation calls this one of the "order-dependent" methods. It is > surely acceptable for a type that conforms to an order-dependent protocol to > have methods that are order-dependent; they do, however, have to be clearly > order-dependent to avoid confusion on unordered types. > > Then there are all the methods that imply a specific order of iteration. If > the “sequence” is unordered, who knows what you’ll get? It is incredibly easy > for an engineer to write a method that implicitly relies on a passed sequence > being intrinsically ordered and another engineer to pass it an unordered > “sequence.” The first engineer could neglect to document the dependency, or > even not notice it; or the second engineer could just fail to read the > documentation thoroughly enough. There is currently no way for the compiler > to enforce passing only an object that is (or at least claims to be) > intrinsically ordered. > > It is also incredibly easy for such an engineer to use `for...in` instead to > accomplish the same task, generic over ordered and unordered sequences > whatever you name such distinguished protocols. I think your beef really > still boils down to Set being compatible with `for...in` at all, as Jon > acknowledges. >> >> >>> As long as it is possible to iterate over a `SpongeBob`, it is meaningful >>> to ask what element is first obtained upon iteration or to drop the first >>> element obtained upon iteration. >>> And as long as iteration is not required to be repeatable (and it isn't), >>> it is perfectly acceptable for these algorithms to return a different >>> result every time. >> >> It is “meaningful” in the sense that it can technically be programmed. The >> actual results are meaningless beyond returning or dropping a random* >> element. >> >> *: Don’t nitpick the word “random”, you know exactly what I mean. It’s just >> shorter and no less clear than “technically more-or-less deterministic but >> not documented, not generally predictable, and probably but not necessarily >> consistent from one call to the next." >> >> >> I fail to see the issue here. If it's not providing you with utility, then >> don't use it. > > I have no problem with functions I don’t use provided they are well-defined > and reasonably accurately named. Functions requiring an order on unordered > collections don’t pass that bar. > > As I said, you're welcome to tackle the protocol hierarchy, but I really > doubt it's within the realm of realistic endpoints for Swift 5. I'm just > trying to propose a narrowly targeted pragmatic solution to one specific > limited harm that might be deliverable by the next point release. As a great > man once said, Swift is a pragmatic language. >> Since Collections do guarantee multi-pass iteration, Brent's example of >> `set.dropFirst().reduce(set.first!, ...)` provides just one instance where >> an unordered Collection can profitably make use of `first`. Permitting >> generic algorithms that can operate on either arrays or sets, for example, >> is the desired effect of having such a protocol; a generic algorithm that >> takes a Collection can ask for the first element, and in the case of an >> unordered Collection, an arbitrary element will do just fine. > > The generic algorithms should be on a protocol that specifies everything they > require. If one can work on anything you can iterate over, put it on > Iterable. If another requires the objects to be ordered, put it on Sequence. > Need to express that an algorithm requires a multi-pass sequence? Make a > MultiPassSequence protocol and put the algorithm on an extension containing > that requirement. Use protocols to express requirements, as they were > designed for. Don’t just tack a bunch of methods onto a protocol that isn’t > sufficient to describe their requirements and say, “oh, by the way, only use > this method if your implementation meets these conditions…" > > The benefits are likely outweighed by the costs of such an approach taken to > completion, because there are many axes to differentiate. The protocol > hierarchy for collections is already daunting, leading to monstrosities such > as `MutableRangeReplaceableRandomAccessSlice`. It stretches the bounds of > sensibility to have a > `LazyUnorderedInfiniteMultiPassMutableRangeReplaceableRandomAccessSlice`. > > The Swift stdlib deliberately eschews modeling everything in protocol > hierarchies with the highest level of granularity. There's some fudging, > deliberately, to find a happy medium between obtuse and approachable, between > too many/too specialized and not enough. For example, I pushed for protocols > such as `Field` and `Ring` at the top of the numeric hierarchy, which might > allow complex number types to slot into the hierarchy more sensibly, for > example. But we have a compromise protocol `Numeric` which doesn't quite have > the same guarantees but is much more approachable. Notice that we also don't > break down numeric protocols into `Addable`, `Subtractable`, etc.; we also > have that fudge factor built into `Equatable`, as I mentioned. >>> >>> `first` is the first object in the Sequence. It doesn’t matter how the >>> sequence came to be in that order; it doesn’t matter whether or not the >>> sequence has already been iterated or how many times. `first` is the first >>> object that is, was, and always will be presented by the Sequence’s >>> Iterator. (Until the collection is mutated, obviously). >>> >>> To summarize, >>> A Set has no intrinsic order. You can iterate over it, and a specific >>> iteration of a set has an order, but that order is not tied to the Set >>> itself beyond including all and only the items therein. Therefore, the Set >>> itself has no intrinsic `first`, `last`, lexicographical comparison, etc.; >>> only its iterations do, and they are not themselves Sets. >>> A Sequence does have an intrinsic order. The order of iteration reflects >>> the order inherent to the Sequence. Therefore, a Sequence has a `first`, >>> `last`, lexicographical comparison, etc. >>> >>> Just in case it’s not obvious, `Set` here is pretty much interchangeable >>> with any other unordered iterable. >>> >>> What you want to call a "Sequence" is what Swift calls a `Collection`, with >>> additional restrictions. What you want to be called "Iterable" is what >>> Swift calls `Sequence` (or now, `SpongeBob`). Clearly, shuffling names will >>> not make these protocols support any more functionality, so that can be put >>> aside. >> >> No, no, no! What I want to call “Iterable” is specified below. It is about >> HALF of what’s currently in Sequence—the half that has to do with iterating, >> whence the name. >> What I want to call Sequence is precisely what Swift now calls Sequence—the >> methods that are in Iterable by virtue of adopting Iterable, PLUS some >> methods that only make sense on iterable groups of objects where the >> iteration order is well-defined. >> >>> >>> Now, with that out of the way, why do you think that only `Collection` >>> types should have `first` and `last`? These helper properties and methods >>> are simply convenient ways to spell things that can be done with >>> `for...in`--the whole point of supplying them is to allow people to work >>> with these types in a more functional style. >> >> Apparently “collection" was a bad choice of word. What I clearly meant was >> not the current Swift Collection protocol, but rather an unordered >> assemblage of objects. UnorderedCollection, perhaps, or if that’s still >> going to cause issues, try UnorderedAssemblage. What `first` and `last` >> really mean in an UnorderedAssemblage is give me some object from the >> assembled objects, I don’t care which one. For which it’s much more clear to >> have an `anyObject()` as on NSSet; as another user has pointed out, >> `assemblage.makeIterator().next()` works just as well. (I just checked, and >> that’s actually how `first` is implemented. But it’s on Collection, which is >> guaranteed to be multipass,) >> >> Again, the point of having a protocol-based design is to allow useful >> _generic_ algorithms to be written; that `first` and `last` would be >> equivalent to an arbitrary element in the case that a collection is >> unordered is not at all an argument against these types conforming to >> `Collection`; if anything, it's an argument for it. > > If a protocol demands the first object, you should give it the first object. > If you don’t have a first object, maybe you shouldn’t conform to the > protocol. If the protocol really just needs any old object, call it > `anyObject`. > > Sure, but we *do* have a first element; it just happens to be the first that > is obtainable on iteration. That you could make a good case for any other > element to be first doesn't mean that this one isn't a perfectly cromulent > first. > > >> Just as `Sequence.underestimatedCount` is equivalent to `Collection.count` >> for types that conform to `Collection`, or the instance >> `BinaryInteger.bitWidth` is equivalent to a static `bitWidth` for types that >> conform to `FixedWidthInteger`. > > I don’t see how those are relevant, they all mean what they claim to, unlike > Set.first/dropFirst/etc. > >>>>> public protocol Iterable { >>>>> associatedtype Iterator: IteratorProtocol >>>>> func map<T>(...) -> [T] // Iterable where .Iterator.Element == T >>>>> func filter(...) -> [Iterator.Element] // Iterable where >>>>> .Iterator.Element == Self.Iterator.Element >>>>> func forEach(...) >>>>> func makeIterator() -> Iterator >>>>> var underestimatedCount: Int { get } >>>>> } >>>>> >>>>> public protocol Sequence: Iterable { // Maybe OrderedSequence just to >>>>> make the well-defined-order requirement explicit >>>>> associatedtype SubSequence >>>>> func dropFirst(...) -> SubSequence // Sequence where >>>>> .Iterator.Element == Self.Iterator.Element >>>>> func dropLast(...) -> SubSequence // " " >>>>> func drop(while...) -> SubSequence // " " >>>>> func prefix(...) -> SubSequence // " " >>>>> func prefix(while...) -> SubSequence // " " >>>>> func suffix(...) -> SubSequence // " " >>>>> func split(...where...) -> [SubSequence] // Iterable where >>>>> .Iterator.Element == (Sequence where .Iterator.Element == >>>>> Self.Iterator.Element) >>>>> } >>> >>> >>> And just to be explicit, >>> struct Set: Iterable {…} >>> struct Dictionary: Iterable {…} >>> struct Array: Sequence {…} >>> etc. >>> >>> Hopefully at some point: >>> struct OrderedSet: Sequence {…} > > > _______________________________________________ > swift-evolution mailing list > swift-evolution@swift.org > https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________ swift-evolution mailing list swift-evolution@swift.org https://lists.swift.org/mailman/listinfo/swift-evolution