Hi Douglas, First of all, thanks very much for looking into this seemingly dark corner of the compiler. This caused us a lot of trouble already.
Comments inline. > On 1 Dec 2017, at 12:28 am, Douglas Gregor via swift-evolution > <swift-evolution@swift.org> wrote: > > Hello Swift community, > > Associated type inference, which is the process by which the Swift compiler > attempts to infer typealiases to satisfy associated-type requirements based > on other requirements, has caused both implementation problems and user > confusion for a long time. Some of you might remember a previous (failed) > attempt to remove this feature from the Swift language, in SE-0108 “Remove > associated type inference”. > > I’m not sure we can remove this feature outright (i.e., the concerns that > sank that proposal are still absolutely valid), because it is so very > convenient and a lot of user code likely depends on it in some form or other. > So, here I’d like to describe the uses of the feature, its current (very > problematic) implementation approach, and a half-baked proposal to narrow the > scope of associated type inference to something that I think is more > tractable. I need help with the design of this feature, because I feel like > it’s going to be a delicate balance between implementability and > expressiveness. > > A Motivating Example > As a motivating example, let’s consider a “minimal” random access collection: > > struct MyCollection<T> { > var contents: [T] > } > > extension MyCollection: RandomAccessCollection { > var startIndex: Int { return contents.startIndex } > var endIndex: Int { return contents.endIndex } > subscript(index: Int) -> T { return contents[index] } > } > > This is actually pretty awesome: by providing just two properties and a > subscript, we get the full breadth of the random access collection API! This > is relying heavily on associated type inference (for associated type > requirements) and default implementations specified on protocol extensions. > Without associated type inference, we would have had to write: > > > extension MyCollection: RandomAccessCollection { > typealias Element = T > typealias Index = Int > typealias Indices = CountableRange<Int> > typealias Iterator = IndexingIterator<MyCollection<T>> > typealias SubSequence = RandomAccessSlice<MyCollection<T>> > > var startIndex: Int { return contents.startIndex } > var endIndex: Int { return contents.endIndex } > subscript(index: Int) -> T { return contents[index] } > } > > where the bolded typealiases are currently inferred. It was worse back when > we reviewed SE-0108, because IIRC there were a few underscored associated > types (e.g., _Element) that have since been removed. Still, that’s a bit of > additional work to define a “minimal” collection, and requires quite a bit > more understanding: how do I know to choose IndexingIterator, and > CountableRange, and RandomAccessSlice? > > The issue would get worse with, e.g., SE-0174 “Change filter to return an > associated type”, which adds an associated type Filtered that almost nobody > will ever customize, and isn’t really fundamental to the way collections > work. Adding Filtered to the standard library would be a source-breaking > change, because users would have to write a typealias giving it its default. > > Associated Type Defaults > One of the ways in which we avoid having to specify typealiases is to use > associated type defaults. For example, the standard library contains > associated type defaults for Indices, Iterator, and SubSequence. Let’s focus > on Indices: > > protocol Collection : Sequence { > associatedtype Indices = DefaultIndices<Self> > // ... > } > > protocol BidirectionalCollection : Collection { > associatedtype Indices = DefaultBidirectionalIndices<Self> > // ... > } > > protocol RandomAccessCollection : BidirectionalCollection { > associatedtype Indices = DefaultRandomAccessIndices<Self> > // ... > } > > The basic idea here is that different protocols in the hierarchy provide > different defaults, and you presumably want the default from the most > specific protocol. If I define a type and make it conform to > BidirectionalCollection, I’d expect to get DefaultBidirectionalIndices<Self> > for Indices. If a define a type and make it conform to RandomAccessIterator, > I’d expect to get DefaultRandomAccessIndices<Self>. > > (Aside: DefaultRandomAccessIndices and DefaultBidirectionalIndices got > collapsed into DefaultIndices now that we have conditional conformances for > the standard library, but the issues I’m describing remain). > > Associated type defaults seem like a reasonable feature that fits well enough > into the design. However, it’s not the only thing in place with our > MyCollection example, for which Indices was inferred to CountableRange. How’s > that happen? > > Associated Type Inference > Associated type inference attempts to look at the requirements of a protocol, > and then looks into the conforming type for declarations that might satisfy > those requirements, and infers associated types from the types of those > declarations. Let’s look at some examples. RandomAccessCollection has some > requirements that mention the Index and Element types: > > protocol RandomAccessCollection : BidirectionalCollection { > associatedtype Element > associatedtype Index > > var startIndex: Index { get } > var endIndex: Index { get } > subscript (i: Index) -> Element { get } > } > > Associated type inference wil try to satisfy those requirements for > MyCollection, and will find these declarations: > > extension MyCollection: RandomAccessCollection { > var startIndex: Int { return contents.startIndex } > var endIndex: Int { return contents.endIndex } > subscript(index: Int) -> T { return contents[index] } > } > > and match them up. From startIndex, we infer Index := Int. From endIndex, we > infer Index := Int. From subscript, we infer Index := Int and Element := T. > Those results are all consistent, so we’ve properly inferred Index and > Element. Yay. > > Associated type inference often has to deal with ambiguities. For example, > there is an extension of Collection that provides a range subscript operator: > > extension Collection { > subscript (bounds: Range<Index>) -> Slice<Self> { … } > } > > When we look at that and match it to the subscript requirement in > RandomAccessCollection (remembering that argument labels aren’t significant > for subscripts by default!), we infer Index := Range<Index> (admittedly > weird) and Element := Slice<Self> (could be legitimate). We have to discard > this candidate because the deduction from startIndex/endIndex (Index := Int) > collides with this deduction (Index := Range<Index>). > > In our initial example, we saw that Indices was inferred to > CountableRange<Int>. Let’s look at another slice of the > RandomAccessCollection protocol that’s relevant to this associated type: > > protocol RandomAccessCollection : BidirectionalCollection { > associatedtype Index > associatedtype Indices: RandomAccessCollection where Indices.Element == > Index > > var indices: Indices { get } > } > > We will match that requirement for an “indices” property against a member of > a constrained protocol extension of RandomAccessCollection: > > extension RandomAccessCollection where Index : Strideable, Index.Stride == > IndexDistance, Indices == CountableRange<Index> { > public var indices: CountableRange<Index> { > return startIndex..<endIndex > } > } > > Associated type inference can determine here that Indices := > CountableRange<Index>, but only when Index: Strideable and Index.Stride == > IndexDistance. In other words, there are a whole bunch of other constraints > to verify before we can accept this inference for Indices. > > > What’s a Good Solution Look Like? > Our current system for associated type inference and associated type defaults > is buggy and complicated. The compiler gets it right often enough that people > depend on it, but I don’t think anyone can reasonably be expected to puzzle > out what’s going to happen, and this area is rife with bugs. If we were to > design a new solution from scratch, what properties should it have? > > • It should allow the author of a protocol to provide reasonable > defaults, so the user doesn’t have to write them > • It shouldn’t require users to write typealiases for “obvious” cases, > even when they aren’t due to defaults > • It shouldn’t infer an inconsistent set of typealiases > • It should be something that a competent Swift programmer could reason > about when it will succeed, when and why it will fail, and what the resulting > inferred typealiases would be > • It should admit a reasonable implementation in the compiler that is > performant and robust > > A Rough Proposal > I’ve been thinking about this for a bit, and I think there are three ways in > which we should be able to infer an associated type witness: > > • Associated type defaults, which are specified with the associated > type itself, e.g., > > associatedtype Indices = DefaultIndices<Self> > > These are easy to reason about for both the programmer and the compiler. > • Typealiases in (possibly constrained) protocol extensions, e.g., > > extension RandomAccessCollection where Index : Strideable, Index.Stride == > IndexDistance { > typealias RandomAccessCollection.Indices = CountableRange<Index> > } > > I’m intentionally using some odd ‘.’ syntax here to indicate that this > typealias is intended only to be found when trying to satisfy an associated > type requirement, and is not a general typealias that could be found by > normal name lookup. Let’s set the syntax bike shed aside for the moment. The > primary advantage of this approach (vs. inferring Indices from “var Indices: > CountableRange<Index>” in a constrained protocol extension) is that there’s a > real typealias declaration that compiler and programmer alike can look at and > reason about based just on the name “Indices”. > > Note that this mechanism technically obviates the need for (1), in the same > sense that default implementations in protocols are merely syntactic sugar. > • Declarations within the nominal type declaration or extension that > declares conformance to the protocol in question. This is generally the same > approach as described in “associated type inference” above, where we match > requirements of the protocol against declarations that could satisfy those > requirements and infer associated types from there. However, I want to turn > it around: instead of starting with the requirements of the protocol any > looking basically anywhere in the type or any protocol to which it conforms > (for implementations in protocol extensions), start with the declarations > that the user explicitly wrote at the point of the conformance and look for > requirements they might satisfy. For example, consider our initial example: > > extension MyCollection: RandomAccessCollection { > var startIndex: Int { return contents.startIndex } > var endIndex: Int { return contents.endIndex } > subscript(index: Int) -> T { return contents[index] } > } > > Since startIndex, endIndex, and subscript(_:) are declared in the same > extension that declares conformance to RandomAccessIterator, we should look > for requirements with the same name as these properties and subscript within > RandomAccessCollection (or any protocol it inherits) and infer Index := Int > and Element := T by matching the type signatures. This is still the most > magical inference rule, because there is no declaration named “Index” or > “Element” to look at. However, it is much narrower in scope than the current > implementation, because it’s only going to reason from the (probably small) > set of declarations that the user wrote alongside the conformance, so it’s > more likely to be intentional. Note that this is again nudging programmers > toward the style of programming where one puts one protocol conformance per > extension, which is admittedly my personal preference. > > Thoughts? > I think this approach is more predictable and more implementable than the > current model. I’m curious whether the above makes sense to someone other > than me, and whether it covers existing use cases well enough. Thoughts? This seems entirely reasonable to me and makes this a lot easier to understand 💯. The only requirement I'd have is that a use case as described in SR-6516 [1] will work. That is --- SNIP --- public struct Foo<A, B, C> {} public protocol P { associatedtype PA /* an implementation must set `PA` */ associatedtype PB = Never /* an implementation may set `PB` and `PC` */ associatedtype PC = Never associatedtype Context = Foo<PA, PB, PC> /* in our real-world example the right side of this associated type is a very ugly type that we don't want the user to ever write. That's why we want to define an alias "Context" here */ func f1(_ x: Context, _ y: PA) /* some protocol requirements */ func f2(_ x: Context, _ y: PB) func f3(_ x: Context, _ y: PC) func f4(_ x: Context) } public extension P { /* defaults implementations for all of those */ public func f1(_ x: Context, _ y: PA) { } public func f2(_ x: Context, _ y: PB) { } public func f3(_ x: Context, _ y: PC) { } public func f4(_ x: Context) { } } public struct S: P { /* the implementation sets `PA` as required, chooses to also set `PB` but lets `PC` alone (defaults to `Never`)) public typealias PA = String public typealias PB = Int /* implementations for two out of the four protocol methods using the associated types */ public func f1(_ x: Context, _ y: PA) { } public func f2(_ x: Context, _ y: PB) { } } --- SNAP --- [1]: https://bugs.swift.org/browse/SR-6516 -- Johannes > > - Doug > > _______________________________________________ > swift-evolution mailing list > swift-evolution@swift.org > https://lists.swift.org/mailman/listinfo/swift-evolution _______________________________________________ swift-evolution mailing list swift-evolution@swift.org https://lists.swift.org/mailman/listinfo/swift-evolution