> On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi...@gmail.com> wrote:
> 
> On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jh...@gbis.com 
> <mailto:jh...@gbis.com>> wrote:
> 
>> On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi...@gmail.com 
>> <mailto:xiaodi...@gmail.com>> wrote:
>>>>  That ordering can be arbitrary, but it shouldn’t leak internal 
>>>> representation such that the method used to create identical things 
>>>> affects the outcome of generic methods because of differences in internal 
>>>> representation.
>>>> 
>>>> 
>>>>>  It would be better to say that the iteration order is well-defined. That 
>>>>> will almost always mean documented, and usually predictable though 
>>>>> obviously e.g. RNGs and iterating in random order will not be predictable 
>>>>> by design.
>>>>> 
>>>>>> That's actually more semantically constrained than what Swift calls a 
>>>>>> `Collection` (which requires conforming types to be multi-pass and(?) 
>>>>>> finite). By contrast, Swift's `SpongeBob` protocol explicitly permits 
>>>>>> conforming single-pass, infinite, and/or unordered types. 
>>>>> 
>>>>> I think you’re talking about Sequence here, I’ve lost track of your 
>>>>> nonsense by now. Yes, the current Swift protocol named Sequence allows 
>>>>> unordered types. You seem to keep asserting that but not actually 
>>>>> addressing my argument, which is that allowing Sequences to be unordered 
>>>>> with the current API is undesired and actively harmful, and should 
>>>>> therefore be changed.
>>>>> 
>>>>> What is harmful about it?
>>>> 
>>>> After thinking about it, I think the harmful bit is that unordered 
>>>> sequences are leaking internal representation (In your example, this is 
>>>> causing people to be surprised when two sets with identical elements are 
>>>> generating different sequences/orderings based on how they were created).  
>>>> You are correct when you say that this problem is even true for for-in.
>>>> 
>>>> I would not say it is a problem. Rather, by definition, iteration involves 
>>>> retrieving one element after another; if you're allowed to do that with 
>>>> Set, then the elements of a Set are observably ordered in some way. Since 
>>>> it's not an OrderedSet--i.e., order doesn't matter--then the only sensible 
>>>> conclusion is that the order of elements obtained in a for...in loop must 
>>>> be arbitrary. If you think this is harmful, then you must believe that one 
>>>> should be prohibited from iterating over an instance of Set. Otherwise, 
>>>> Set is inescapably a Sequence by the Swift definition of Sequence. All 
>>>> extension methods on Sequence like drop(while:) are really just 
>>>> conveniences for common things that you can do with iterated access; to my 
>>>> mind, they're essentially just alternative ways of spelling various 
>>>> for...in loops.
>>> 
>>> I think an argument could be made that you shouldn’t be able to iterate 
>>> over a set without first defining an ordering on it (even if that ordering 
>>> is somewhat arbitrary).  Maybe we have something like a “Sequenc(e)able” 
>>> protocol which defines things which can be turned into a sequence when 
>>> combined with some sort of ordering.  One possible ordering could be the 
>>> internal representation (At least in that case we are calling it out 
>>> specifically).  If I had to say 
>>> “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely 
>>> be less surprised when it returns false even though setA == setB.
>>> 
>>> Well, that's a totally different direction, then; you're arguing that `Set` 
>>> and `Dictionary` should not conform to `Sequence` altogether. That's fine 
>>> (it's also a direction that some of us explored off-list a while ago), but 
>>> at this point in Swift's evolution, realistically, it's not within the 
>>> realm of possible changes.
>> 
>> I am actually suggesting something slightly different.  Basically, Set and 
>> Dictionary’s conformance to Collection would have a different 
>> implementation.  They would conform to another protocol declaring that they 
>> are unordered. That protocol would fill in part of the conformance to 
>> sequence/collection using a default ordering, which is mostly arbitrary, but 
>> guaranteed to produce the same ordering for the same list of elements (even 
>> across collection types).  This would be safer, but a tiny bit slower than 
>> what we have now (We could also potentially develop a way for collections 
>> like set to amortize the cost). For those who need to recover speed, the new 
>> protocol would also define a property which quickly returns a 
>> sequence/iterator using the internal ordering (I arbitrarily called it 
>> .arbitraryOrder).
>> 
>> I believe it would not be source breaking.
>> 
>> That is indeed something slightly different.
>> 
>> In an ideal world--and my initial understanding of what you were 
>> suggesting--Set and Dictionary would each have a member like `collection`, 
>> which would expose the underlying data as a `SetCollection` or 
>> `DictionaryCollection` that in turn would conform to `Collection`; 
>> meanwhile, Set and Dictionary themselves would not offer methods such as 
>> `prefix`, or indexing by subscript, which are not compatible with being 
>> unordered. For those who want a particular ordering, there'd be something 
>> like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> 
>> {Set|Dictionary}Collection`.
>> 
>> What you suggest here instead would be minimally source-breaking. However, 
>> I'm unsure of where these guarantees provide benefit to justify the 
>> performance cost. Certainly not for `first` or `dropFirst(_:)`, which still 
>> yields an arbitrary result which doesn't make sense for something 
>> _unordered_. We *could* have an underscored customization point named 
>> something like `_customOrderingPass` that is only invoked from 
>> `elementsEqual` or other such methods to pre-rearrange the internal ordering 
>> of unordered collections in some deterministic way before comparison. Is 
>> that what you have in mind?
> 
> 
> Something like that.  Whatever we do, there will be a tradeoff between speed, 
> correctness, and ergonomics.
> 
> My suggestion trades speed for correctness, and provides a way to recover 
> speed through additional typing (which is slightly less ergonomic).
> 
> You haven't convinced me that this is at all improved in "correctness." It 
> trades one arbitrary iteration order for another on a type that tries to 
> model an unordered collection.
>  
> We could do something like you suggest. I don’t think the method would need 
> to be underscored… the ordering pass could just be a method on the protocol 
> which defines it as unordered.  Then we could provide a special conformance 
> for things where order really matters based on adherence to that protocol.  
> That might be an acceptable tradeoff.  It would give us speed at the cost of 
> having the correct implementation being less ergonomic and more error prone 
> (you have to remember to check that it is unordered and call the ordering 
> method when it mattered).
> 
> I’d still be a bit worried that people would make incorrect generic 
> algorithms based on expecting an order from unordered things, but at least it 
> would be possible for them check and handle it correctly.  I think I could 
> get behind that tradeoff/compromise, given where we are in the swift process 
> and Swift's obsession with speed (though I still slightly prefer the safer 
> default).  At least the standard library would handle all the things 
> correctly, and that is what will affect the majority of programmers.
> 
> What is an example of such an "incorrect" generic algorithm that would be 
> made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this 
discussion: Two sets with identical elements which have different internal 
storage and thus give different orderings as sequences.  You yourself have 
argued that the confusion around this is enough of a problem that we need to 
make a source-breaking change (renaming it) to warn people that the results of 
the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.  I don’t 
see why a non-source-breaking change is suddenly off-limits.

But more than that, any generic algorithm which is assuming that the sequence 
is coming from an ordered source (i.e. many things using first/last).  Some 
uses of first are ok because the programmer actually means ‘any’, but anywhere 
where they actually mean first/last may be problematic.

Currently, there is no way to test for ordered-ness, so there is no way for 
even a careful programmer to mitigate this problem.  By adding a protocol which 
states that something is unordered, we can either branch on it, or create a 
separate version of an algorithm for things which conform.

Thanks,
Jon



_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Reply via email to