Re: [swift-evolution] Why you can't make someone else's class Decodable: a long-winded explanation of 'required' initializers

Itai Ferber via swift-evolution Thu, 03 Aug 2017 10:06:16 -0700

Thanks for putting these thoughts together, Jordan! Some additional comments 
inline.


> On Aug 2, 2017, at 5:08 PM, Jordan Rose <jordan_r...@apple.com> wrote:
> 
> David Hart recently asked on Twitter 
> <https://twitter.com/dhartbit/status/891766239340748800> if there was a good 
> way to add Decodable support to somebody else's class. The short answer is 
> "no, because you don't control all the subclasses", but David already 
> understood that and wanted to know if there was anything working to mitigate 
> the problem. So I decided to write up a long email about it instead. (Well, 
> actually I decided to write a short email and then failed at doing so.)
> 
> The Problem
> 
> You can add Decodable to someone else's struct today with no problems:
> 
> extension Point: Decodable {
>   enum CodingKeys: String, CodingKey {
>     case x
>     case y
>   }
>   public init(from decoder: Decoder) throws {
>     let container = try decoder.container(keyedBy: CodingKeys.self)
>     let x = try container.decode(Double.self, forKey: .x)
>     let y = try container.decode(Double.self, forKey: .y)
>     self.init(x: x, y: y)
>   }
> }
> 
> But if Point is a (non-final) class, then this gives you a pile of errors:
> 
> - init(from:) needs to be 'required' to satisfy a protocol requirement. 
> 'required' means the initializer can be invoked dynamically on subclasses. 
> Why is this important? Because someone might write code like this:
> 
> func decodeMe<Result: Decodable>() throws -> Result {
>   let decoder = getDecoderFromSomewhere()
>   return try Result(from: decoder)
> }
> let specialPoint: VerySpecialSubclassOfPoint = try decodeMe()
> 
> …and the compiler can't stop them, because VerySpecialSubclassOfPoint is a 
> Point, and Point is Decodable, and therefore VerySpecialSubclassOfPoint is 
> Decodable. A bit more on this later, but for now let's say that's a sensible 
> requirement.
> 
> - init(from:) also has to be a 'convenience' initializer. That one makes 
> sense too—if you're outside the module, you can't necessarily see private 
> properties, and so of course you'll have to call another initializer that can.
> 
> But once it's marked 'convenience' and 'required' we get "'required' 
> initializer must be declared directly in class 'Point' (not in an 
> extension)", and that defeats the whole purpose. Why this restriction?
> 
> 
> The Semantic Reason
> 
> The initializer is 'required', right? So all subclasses need to have access 
> to it. But the implementation we provided here might not make sense for all 
> subclasses—what if VerySpecialSubclassOfPoint doesn't have an 'init(x:y:)' 
> initializer? Normally, the compiler checks for this situation and makes the 
> subclass reimplement the 'required' initializer…but that only works if the 
> 'required' initializers are all known up front. So it can't allow this new 
> 'required' initializer to go by, because someone might try to call it 
> dynamically on a subclass. Here's a dynamic version of the code from above:
> 
> func decodeDynamic(_ pointType: Point.Type) throws -> Point {
>   let decoder = getDecoderFromSomewhere()
>   return try pointType.init(from: decoder)
> }
> let specialPoint = try decodeDynamic(VerySpecialSubclassOfPoint.self)
> 
> 
> The Implementation Reason
> 
> 'required' initializers are like methods: they may require dynamic dispatch. 
> That means that they get an entry in the class's dynamic dispatch table, 
> commonly known as its vtable. Unlike Objective-C method tables, vtables 
> aren't set up to have entries arbitrarily added at run time.
> 
> (Aside: This is one of the reasons why non-@objc methods in Swift extensions 
> can't be overridden; if we ever lift that restriction, it'll be by using a 
> separate table and a form of dispatch similar to objc_msgSend. I sent a 
> proposal to swift-evolution about this last year but there wasn't much 
> interest.)
> 
> 
> The Workaround
> 
> Today's answer isn't wonderful, but it does work: write a wrapper struct that 
> conforms to Decodable instead:
> 
> struct DecodedPoint: Decodable {
>   var value: Point
>   enum CodingKeys: String, CodingKey {
>     case x
>     case y
>   }
>   public init(from decoder: Decoder) throws {
>     let container = try decoder.container(keyedBy: CodingKeys.self)
>     let x = try container.decode(Double.self, forKey: .x)
>     let y = try container.decode(Double.self, forKey: .y)
>     self.value = Point(x: x, y: y)
>   }
> }
> 
> This doesn't have any of the problems with inheritance, because it only 
> handles the base class, Point. But it makes everywhere else a little less 
> convenient—instead of directly encoding or decoding Point, you have to use 
> the wrapper, and that means no implicitly-generated Codable implementations 
> either.
> 
> I'm not going to spend more time talking about this, but it is the officially 
> recommended answer at the moment. You can also just have all your own types 
> that contain points manually decode the 'x' and 'y' values and then construct 
> a Point from that.

I would actually take this a step further and recommend that any time you 
intend to extend someone else’s type with Encodable or Decodable, you should 
almost certainly write a wrapper struct for it instead, unless you have 
reasonable guarantees that the type will never attempt to conform to these 
protocols on its own.
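
As a rough sketch of what that can look like in practice: the Point class below is only a stand-in for a type owned by another module, and PointArchive and Segment are hypothetical names chosen for illustration. The idea is that the archive format lives on a type you control, and your own types that store the wrapper can still rely on synthesized conformances.

```swift
import Foundation

// Stand-in for a class owned by another module; it is not Codable itself.
class Point {
    let x: Double
    let y: Double
    init(x: Double, y: Double) { self.x = x; self.y = y }
}

// Hypothetical wrapper: the encoded format is defined here, on a type you own.
struct PointArchive: Codable {
    var value: Point

    private enum CodingKeys: String, CodingKey { case x, y }

    init(_ value: Point) { self.value = value }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let x = try container.decode(Double.self, forKey: .x)
        let y = try container.decode(Double.self, forKey: .y)
        value = Point(x: x, y: y)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(value.x, forKey: .x)
        try container.encode(value.y, forKey: .y)
    }
}

// A type of your own that stores the wrapper gets synthesized Codable.
struct Segment: Codable {
    var start: PointArchive
    var end: PointArchive
}
```

If Point later gains its own conformance, nothing here stops compiling, and the format written by PointArchive stays whatever you defined it to be.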

This might sound extreme (and inconvenient), but Jordan describes the issue 
below, in The Dangers of Retroactive Modeling. Any time you conform a type which 
does not belong to you to a protocol, you make a decision about its behavior 
where you might not necessarily have the "right" to — if the type later adds 
conformance to the protocol itself (e.g. in a library update), your code will 
no longer compile, and you’ll have to remove your own conformance. In most 
cases, that’s fine, e.g., there’s not much harm done in dropping your custom 
Equatable conformance on some type if it starts adopting it on its own. The 
real risk with Encodable and Decodable is that unless you don’t care about 
backwards/forwards compatibility, the implementations of these conformances are 
forever.

Using Point here as an example, it’s not unreasonable for Point to eventually 
get updated to conform to Codable. It’s also not unreasonable for the 
implementation of Point to adopt the default conformance, i.e., get encoded as 
{"x": …, "y": …}. This form might not be the most compact, but it leaves room 
for expansion (e.g. if Point adds a z field, which might also be reasonable, 
considering the type doesn’t belong to you). If you update your library 
dependency with the new Point class and have to drop the conformance you added 
to it directly, you’ve introduced a backwards and forwards compatibility 
concern: all new versions of your app now encode and decode a new archive 
format, which now requires migration. Unless you don’t care about other 
versions of your app, you’ll have to deal with this:

- Old versions of your app which users may have on their devices cannot read 
  archives with this new format
- New versions of your app cannot read archives with the old format

Unless you don’t care for some reason, you will now have to write the wrapper 
struct, to either:

- Have new versions of your app attempt to read old archive versions and 
  migrate them forward (leaving old app versions in the dust), or
- Write all new archives with the old format so old app versions can still 
  read archives written with newer app versions, and vice versa

Either way, you’ll need to write some wrapper to handle this; it’s 
significantly safer to do that work up front on a type which you do control 
(and safely allow Point to change out underneath you transparently), rather 
than potentially end up between a rock and a hard place later on because a type 
you don’t own changes out from under you.
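
To make the first option concrete, here is a minimal sketch of a migrating wrapper. Everything in it is an assumption for illustration: the Point class is a stand-in for the updated library type with its own Codable conformance, MigratingPoint is a hypothetical name, and the legacy layout (a two-element array, [x, y]) stands in for whatever format your removed conformance used to write.

```swift
import Foundation

// Stand-in for the library's class after the update: it now ships its own
// Codable conformance, encoding as {"x": ..., "y": ...}.
class Point: Codable {
    let x: Double
    let y: Double
    init(x: Double, y: Double) { self.x = x; self.y = y }
}

// Hypothetical wrapper. The legacy layout [x, y] is an assumption standing in
// for the format written by the conformance you had to remove.
struct MigratingPoint: Codable {
    var value: Point

    init(_ value: Point) { self.value = value }

    init(from decoder: Decoder) throws {
        if var legacy = try? decoder.unkeyedContainer() {
            // Old archive: read the two positional values.
            let x = try legacy.decode(Double.self)
            let y = try legacy.decode(Double.self)
            value = Point(x: x, y: y)
        } else {
            // New archive: defer to Point's own conformance.
            value = try Point(from: decoder)
        }
    }

    func encode(to encoder: Encoder) throws {
        // Always write the new format, i.e. migrate forward.
        try value.encode(to: encoder)
    }
}
```

For the second option, encode(to:) would instead write the legacy layout, so that older app versions can keep reading new archives.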

> Future Direction: 'required' + 'final'
> 
> One language feature we could add to make this work is a 'required' 
> initializer that is also 'final'. Because it's 'final', it wouldn't have to 
> go into the dynamic dispatch table. But because it's 'final', we have to make 
> sure its implementation works on all subclasses. For that to work, it would 
> only be allowed to call other 'required' initializers…which means you're 
> still stuck if the original author didn't mark anything 'required'. Still, 
> it's a safe, reasonable, and contained extension to our initializer model.
> 
> 
> Future Direction: runtime-checked convenience initializers
> 
> In most cases you don't care about hypothetical subclasses or invoking 
> init(from:) on some dynamic Point type. If there was a way to mark 
> init(from:) as something that was always available on subclasses, but 
> dynamically checked to see if it was okay, we'd be good. That could take one 
> of two forms:
> 
> - If 'self' is not Point itself, trap.
> - If 'self' did not inherit or override all of Point's designated 
> initializers, trap.
> 
> The former is pretty easy to implement but not very extensible. The latter 
> seems more expensive: it's information we already check in the compiler, but 
> we don't put it into the runtime metadata for a class, and checking it at run 
> time requires walking up the class hierarchy until we get to the class we 
> want. This is all predicated on the idea that this is rare, though.
> 
> This is a much more intrusive change to the initializer model, and it's 
> turning a compile-time check into a run-time check, so I think we're less 
> likely to want to take this any time soon.
> 
> 
> Future Direction: Non-inherited conformances
> 
> All of this is only a problem because people might try to call init(from:) on 
> a subclass of Point. If we said that subclasses of Point weren't 
> automatically Decodable themselves, we'd avoid this problem. This sounds like 
> a terrible idea but it actually doesn't change very much in practice. 
> Unfortunately, it's also a very complicated and intrusive change to the Swift 
> protocol system, and so I don't want to spend more time on it here.
> 
> 
> The Dangers of Retroactive Modeling
> 
> Even if we magically make this all work, however, there's still one last 
> problem: what if two frameworks do this? Point can't conform to Decodable in 
> two different ways, but neither can it just pick one. (Maybe one of the 
> encoded formats uses "dx" and "dy" for the key names, or maybe it's encoded 
> with polar coordinates.) There aren't great answers to this, and it calls 
> into question whether the struct "solution" at the start of this message is 
> even sensible.
> 
> I'm going to bring this up on swift-evolution soon as part of the Library 
> Evolution discussions (there's a very similar problem if the library that 
> owns Point decides to make it Decodable too), but it's worth noting that the 
> wrapper struct solution doesn't have this problem.
> 
> 
> Whew! So, that's why you can't do it. It's not a very satisfying answer, but 
> it's one that falls out of our compile-time safety rules for initializers. 
> For more information on this I suggest checking out my write-up of some of 
> our initialization model problems 
> <https://github.com/apple/swift/blob/master/docs/InitializerProblems.rst>. 
> And I plan to write another email like this to discuss some solutions that 
> are actually doable.
> 
> Jordan
> 
> P.S. There's a reason why Decodable uses an initializer instead of a 
> factory-like method on the type but I can't remember what it is right now. I 
> think it's something to do with having the right result type, which would 
> have to be either 'Any' or an associated type if it wasn't just 'Self'. (And 
> if it is 'Self' then it has all the same problems as an initializer and would 
> require extra syntax.) Itai would know for sure.

To give background on this: the protocols were originally designed with factory 
initializers in mind (to allow for object replacement and avoid some of these 
issues), but without a "real" factory initializer pattern like we’re discussing 
here, the problems with this approach were intractable (all due to subclassing 
issues).

An initializer pattern like static func decode(from: Decoder) throws -> ??? has 
a few problems:

- The return type is one consideration. If we allow for an associated type 
  representing the return type, subclasses cannot override the associated type 
  to return something different. This makes object replacement impossible in 
  situations which use subclassing. The only reasonable thing is to return Self 
  (which would allow for returning instances of self, or of subclasses). (We 
  could return Any, but that defeats the entire purpose of having a type-safe 
  API to begin with; we want to avoid the dynamic casting altogether.)

- Even if we return Self, this method cannot be overridden by subclasses:

  - If implemented as static func decode(from: Decoder) throws -> Self, the 
    method clearly cannot be overridden in a subclass, as it is a static method.

  - The method cannot be implemented as class func decode(from: Decoder) throws 
    -> Self on a non-final class:

    protocol Foo {
        static func create() -> Self
    }

    class Bar : Foo {
        // error: method 'create()' in non-final class 'Bar' must return
        // 'Self' to conform to protocol 'Foo'
        class func create() -> Bar {
            return Bar()
        }
    }

    protocol Foo {
        static func create() -> Self
    }

    class Bar : Foo {
        class func create() -> Self {
            // error: cannot convert return expression of type 'Bar' to
            // return type 'Self'
            return Bar()
        }
    }

    protocol Foo {
        static func create() -> Self
    }

    class Bar : Foo {
        class func create() -> Self {
            // error: 'Self' is only available in a protocol or as the result
            // of a method in a class; did you mean 'Bar'?
            // warning: forced cast of 'Bar' to same type has no effect
            // error: cannot convert return expression of type 'Bar' to
            // return type 'Self'
            return Bar() as! Self
        }
    }

    final class Bar : Foo {
        class func create() -> Bar { // no problems
            return Bar()
        }
    }

  This means that we either allow adoption of these protocols on final classes 
  only (which, again, defeats the whole purpose!), or that every class which 
  implements these protocols has to have knowledge about all of its potential 
  subclasses and their implementations of these protocols. This is prohibitive 
  as well.

- Even if it were possible to subclass these types of methods, they don’t 
  follow the regular initializer pattern. In order to construct an instance of 
  a subclass, you need to be able to call a superclass initializer. But these 
  methods are not initializers; even if you call super’s factory initializer, 
  there’s nothing you can do with the returned instance of the superclass. 
  Unlike in ObjC, there’s no super- or self-reassignment (in general), so 
  classes would have to follow a completely different (and awkward) pattern of 
  creating an instance of the superclass, initializing from that instance in a 
  separate initializer (e.g. self.init(superInstance)), and also setting 
  decoded properties.

Overall, the lack of a true factory initializer pattern prevented us from doing 
something like this, and we took the regular initializer approach.
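
For contrast, here is a small sketch of how the regular initializer approach composes with subclassing. The Animal and Dog classes and their keys are hypothetical, chosen only for illustration. Each class decodes its own properties and then chains to the superclass initializer, which is exactly the step a factory method returning a finished instance has no equivalent for.

```swift
import Foundation

// Hypothetical base class that owns its Decodable conformance.
class Animal: Decodable {
    let name: String

    private enum CodingKeys: String, CodingKey { case name }

    // 'required' so that every subclass is guaranteed to provide it.
    required init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
    }
}

// Hypothetical subclass: decodes its own state, then chains to super.
class Dog: Animal {
    let breed: String

    private enum CodingKeys: String, CodingKey { case breed }

    required init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        breed = try container.decode(String.self, forKey: .breed)
        // Normal initializer delegation; no instance replacement needed.
        try super.init(from: decoder)
    }
}
```

Decoding Dog.self from {"name": "Rex", "breed": "Collie"} with JSONDecoder populates both properties through the ordinary initializer chain.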

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
