Re: [swift-evolution] [RFC] Associated type inference

Johannes Weiß via swift-evolution Fri, 01 Dec 2017 04:21:07 -0800

Hi Douglas,

First of all, thanks very much for looking into this seemingly dark corner of 
the compiler. This caused us a lot of trouble already.


Comments inline.

> On 1 Dec 2017, at 12:28 am, Douglas Gregor via swift-evolution 
> <swift-evolution@swift.org> wrote:
> 
> Hello Swift community,
> 
> Associated type inference, which is the process by which the Swift compiler 
> attempts to infer typealiases to satisfy associated-type requirements based 
> on other requirements, has caused both implementation problems and user 
> confusion for a long time. Some of you might remember a previous (failed) 
> attempt to remove this feature from the Swift language, in SE-0108 “Remove 
> associated type inference”. 
> 
> I’m not sure we can remove this feature outright (i.e., the concerns that 
> sank that proposal are still absolutely valid), because it is so very 
> convenient and a lot of user code likely depends on it in some form or other. 
> So, here I’d like to describe the uses of the feature, its current (very 
> problematic) implementation approach, and a half-baked proposal to narrow the 
> scope of associated type inference to something that I think is more 
> tractable. I need help with the design of this feature, because I feel like 
> it’s going to be a delicate balance between implementability and 
> expressiveness.
> 
> A Motivating Example
> As a motivating example, let’s consider a “minimal” random access collection:
> 
> struct MyCollection<T> {
>     var contents: [T]
> }
> 
> extension MyCollection: RandomAccessCollection {    
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
> }
> 
> This is actually pretty awesome: by providing just two properties and a 
> subscript, we get the full breadth of the random access collection API! This 
> is relying heavily on associated type inference (for associated type 
> requirements) and default implementations specified on protocol extensions. 
> Without associated type inference, we would have had to write:
> 
> 
> extension MyCollection: RandomAccessCollection {
>     typealias Element = T
>     typealias Index = Int
>     typealias Indices = CountableRange<Int>
>     typealias Iterator = IndexingIterator<MyCollection<T>>
>     typealias SubSequence = RandomAccessSlice<MyCollection<T>>
>     
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
> }
> 
> where the bolded typealiases are currently inferred. It was worse back when 
> we reviewed SE-0108, because IIRC there were a few underscored associated 
> types (e.g., _Element) that have since been removed. Still, that’s a bit of 
> additional work to define a “minimal” collection, and requires quite a bit 
> more understanding: how do I know to choose IndexingIterator, and 
> CountableRange, and RandomAccessSlice?
> 
> The issue would get worse with, e.g., SE-0174 “Change filter to return an 
> associated type”, which adds an associated type Filtered that almost nobody 
> will ever customize, and isn’t really fundamental to the way collections 
> work. Adding Filtered to the standard library would be a source-breaking 
> change, because users would have to write a typealias giving it its default.
> 
> Associated Type Defaults
> One of the ways in which we avoid having to specify typealiases is to use 
> associated type defaults. For example, the standard library contains 
> associated type defaults for Indices, Iterator, and SubSequence. Let’s focus 
> on Indices:
> 
> protocol Collection : Sequence {
>   associatedtype Indices = DefaultIndices<Self>
>   // ...
> }
> 
> protocol BidirectionalCollection : Collection {
>   associatedtype Indices = DefaultBidirectionalIndices<Self>
>   // ...
> }
> 
> protocol RandomAccessCollection : BidirectionalCollection {
>   associatedtype Indices = DefaultRandomAccessIndices<Self>
>   // ...
> }
> 
> The basic idea here is that different protocols in the hierarchy provide 
> different defaults, and you presumably want the default from the most 
> specific protocol. If I define a type and make it conform to 
> BidirectionalCollection, I’d expect to get DefaultBidirectionalIndices<Self> 
> for Indices. If a define a type and make it conform to RandomAccessIterator, 
> I’d expect to get DefaultRandomAccessIndices<Self>.
> 
> (Aside: DefaultRandomAccessIndices and DefaultBidirectionalIndices got 
> collapsed into DefaultIndices now that we have conditional conformances for 
> the standard library, but the issues I’m describing remain).
> 
> Associated type defaults seem like a reasonable feature that fits well enough 
> into the design. However, it’s not the only thing in place with our 
> MyCollection example, for which Indices was inferred to CountableRange. How’s 
> that happen?
> 
> Associated Type Inference
> Associated type inference attempts to look at the requirements of a protocol, 
> and then looks into the conforming type for declarations that might satisfy 
> those requirements, and infers associated types from the types of those 
> declarations. Let’s look at some examples. RandomAccessCollection has some 
> requirements that mention the Index and Element types:
> 
> protocol RandomAccessCollection : BidirectionalCollection {
>   associatedtype Element
>   associatedtype Index
> 
>   var startIndex: Index { get }
>   var endIndex: Index { get }
>   subscript (i: Index) -> Element { get }
> }
> 
> Associated type inference wil try to satisfy those requirements for 
> MyCollection, and will find these declarations:
> 
> extension MyCollection: RandomAccessCollection {    
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
> }
> 
> and match them up. From startIndex, we infer Index := Int. From endIndex, we 
> infer Index := Int. From subscript, we infer Index := Int and Element := T. 
> Those results are all consistent, so we’ve properly inferred Index and 
> Element. Yay.
> 
> Associated type inference often has to deal with ambiguities. For example, 
> there is an extension of Collection that provides a range subscript operator:
> 
> extension Collection {
>   subscript (bounds: Range<Index>) -> Slice<Self> { … }
> }
> 
> When we look at that and match it to the subscript requirement in 
> RandomAccessCollection (remembering that argument labels aren’t significant 
> for subscripts by default!), we infer Index := Range<Index> (admittedly 
> weird) and Element := Slice<Self> (could be legitimate). We have to discard 
> this candidate because the deduction from startIndex/endIndex (Index := Int) 
> collides with this deduction (Index := Range<Index>).
> 
> In our initial example, we saw that Indices was inferred to 
> CountableRange<Int>. Let’s look at another slice of the 
> RandomAccessCollection protocol that’s relevant to this associated type:
> 
> protocol RandomAccessCollection : BidirectionalCollection {
>   associatedtype Index
>   associatedtype Indices: RandomAccessCollection where Indices.Element == 
> Index
> 
>   var indices: Indices { get }
> }
> 
> We will match that requirement for an “indices” property against a member of 
> a constrained protocol extension of RandomAccessCollection:
> 
> extension RandomAccessCollection where Index : Strideable, Index.Stride == 
> IndexDistance, Indices == CountableRange<Index> {
>   public var indices: CountableRange<Index> {
>     return startIndex..<endIndex
>   }
> }
> 
> Associated type inference can determine here that Indices := 
> CountableRange<Index>, but only when Index: Strideable and Index.Stride == 
> IndexDistance. In other words, there are a whole bunch of other constraints 
> to verify before we can accept this inference for Indices.
> 
> 
> What’s a Good Solution Look Like?
> Our current system for associated type inference and associated type defaults 
> is buggy and complicated. The compiler gets it right often enough that people 
> depend on it, but I don’t think anyone can reasonably be expected to puzzle 
> out what’s going to happen, and this area is rife with bugs. If we were to 
> design a new solution from scratch, what properties should it have?
> 
>       • It should allow the author of a protocol to provide reasonable 
> defaults, so the user doesn’t have to write them
>       • It shouldn’t require users to write typealiases for “obvious” cases, 
> even when they aren’t due to defaults
>       • It shouldn’t infer an inconsistent set of typealiases
>       • It should be something that a competent Swift programmer could reason 
> about when it will succeed, when and why it will fail, and what the resulting 
> inferred typealiases would be
>       • It should admit a reasonable implementation in the compiler that is 
> performant and robust
> 
> A Rough Proposal
> I’ve been thinking about this for a bit, and I think there are three ways in 
> which we should be able to infer an associated type witness:
> 
>       • Associated type defaults, which are specified with the associated 
> type itself, e.g.,
> 
>   associatedtype Indices = DefaultIndices<Self>
> 
> These are easy to reason about for both the programmer and the compiler.
>       • Typealiases in (possibly constrained) protocol extensions, e.g.,
> 
>   extension RandomAccessCollection where Index : Strideable, Index.Stride == 
> IndexDistance {
>     typealias RandomAccessCollection.Indices = CountableRange<Index>
>   }
> 
> I’m intentionally using some odd ‘.’ syntax here to indicate that this 
> typealias is intended only to be found when trying to satisfy an associated 
> type requirement, and is not a general typealias that could be found by 
> normal name lookup. Let’s set the syntax bike shed aside for the moment. The 
> primary advantage of this approach (vs. inferring Indices from “var Indices: 
> CountableRange<Index>” in a constrained protocol extension) is that there’s a 
> real typealias declaration that compiler and programmer alike can look at and 
> reason about based just on the name “Indices”. 
> 
> Note that this mechanism technically obviates the need for (1), in the same 
> sense that default implementations in protocols are merely syntactic sugar.
>       • Declarations within the nominal type declaration or extension that 
> declares conformance to the protocol in question. This is generally the same 
> approach as described in “associated type inference” above, where we match 
> requirements of the protocol against declarations that could satisfy those 
> requirements and infer associated types from there. However, I want to turn 
> it around: instead of starting with the requirements of the protocol any 
> looking basically anywhere in the type or any protocol to which it conforms 
> (for implementations in protocol extensions), start with the declarations 
> that the user explicitly wrote at the point of the conformance and look for 
> requirements they might satisfy. For example, consider our initial example:
> 
>   extension MyCollection: RandomAccessCollection {    
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
>   }
> 
> Since startIndex, endIndex, and subscript(_:) are declared in the same 
> extension that declares conformance to RandomAccessIterator, we should look 
> for requirements with the same name as these properties and subscript within 
> RandomAccessCollection (or any protocol it inherits) and infer Index := Int 
> and Element := T by matching the type signatures. This is still the most 
> magical inference rule, because there is no declaration named “Index” or 
> “Element” to look at. However, it is much narrower in scope than the current 
> implementation, because it’s only going to reason from the (probably small) 
> set of declarations that the user wrote alongside the conformance, so it’s 
> more likely to be intentional. Note that this is again nudging programmers 
> toward the style of programming where one puts one protocol conformance per 
> extension, which is admittedly my personal preference.
> 
> Thoughts?
> I think this approach is more predictable and more implementable than the 
> current model. I’m curious whether the above makes sense to someone other 
> than me, and whether it covers existing use cases well enough. Thoughts?

This seems entirely reasonable to me and makes this a lot easier to understand 
💯. The only requirement I'd have is that a use case as described in SR-6516 [1] 
will work. That is

--- SNIP ---
public struct Foo<A, B, C> {}

public protocol P {
    associatedtype PA /* an implementation must set `PA` */
    associatedtype PB = Never /* an implementation may set `PB` and `PC` */
    associatedtype PC = Never

    associatedtype Context = Foo<PA, PB, PC> /* in our real-world example the 
right side of this associated type is a very ugly type that we don't want the 
user to ever write. That's why we want to define an alias "Context" here */

    func f1(_ x: Context, _ y: PA) /* some protocol requirements */
    func f2(_ x: Context, _ y: PB)
    func f3(_ x: Context, _ y: PC)
    func f4(_ x: Context)
}

public extension P {
    /* defaults implementations for all of those */
    public func f1(_ x: Context, _ y: PA) {
    }
    public func f2(_ x: Context, _ y: PB) {
    }
    public func f3(_ x: Context, _ y: PC) {
    }
    public func f4(_ x: Context) {
    }
}

public struct S: P {
    /* the implementation sets `PA` as required, chooses to also set `PB` but 
lets `PC` alone (defaults to `Never`))
    public typealias PA = String
    public typealias PB = Int

    /* implementations for two out of the four protocol methods using the 
associated types */
    public func f1(_ x: Context, _ y: PA) {
    }
    public func f2(_ x: Context, _ y: PB) {
    }
}
--- SNAP ---

[1]: https://bugs.swift.org/browse/SR-6516

-- Johannes

> 
>       - Doug
> 
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] [RFC] Associated type inference

Reply via email to