> On Mar 19, 2017, at 9:14 PM, Brent Royal-Gordon <br...@architechies.com> wrote:
> 
>> On Mar 19, 2017, at 5:51 PM, Matthew Johnson <matt...@anandabits.com> wrote:
>> 
>> I generally agree with you about casting.  However, my dislike isn’t the 
>> cast itself, but instead it is the lack of a static guarantee.  I’m not sure 
>> we’ll find a solution that provides a static guarantee that a required 
>> context exists that is also acceptable to the Foundation team.
> 
> I don't think we can get a static guarantee that the context is present, but 
> I still would like a static guarantee that the context is of the expected 
> type. That's what I'm trying to provide here.

This doesn't do any better job of that than a cast in user code.  I can see two 
meaningful differences.  First, your solution does not allow a user to see a 
context if they can't name the type (you can't get it as Any and use 
reflection, etc).  I don't see this restriction as being beneficial.  Second, 
your solution introduces several subtle problems mentioned in my last email 
which you didn't respond to (overlapping context types, etc).  

> 
>>> 
>>>     protocol Encoder {
>>>             // Retrieve the context instance of the indicated type.
>>>             func context<Context>(ofType type: Context.Type) -> Context?
>>>             
>>>             // This context is visible for `encode(_:)` calls from this encoder's containers all the way down, recursively.
>>>             func addContext<Context>(_ context: Context, ofType type: Context.Type)
>> 
>> What happens if you call `addContext` more than once with values of the same 
>> type?
> 
> It overrides the previous context, but only for the containers created by 
> this `encode(to:)` method and any containers nested within them.
> 
> (Although that could cause trouble for an encoder which only encodes objects 
> with multiple instances once. Hmm.)
> 
>> And why do you require the type to be passed explicitly when it is already 
>> implied by the type of the value?
> 
> As you surmised later, I was thinking in terms of `type` being used as a 
> dictionary key; in that case, if you stored a `Foo` into the context, you 
> would not later be able to look it up using one of `Foo`'s supertypes. But if 
> we really do expect multiple contexts to be rare, perhaps we don't need a 
> dictionary at all—we can just keep an array, loop over it with `as?`, and 
> return the first (or last?) match. If that's what we do, then we probably 
> don't need to pass the type explicitly.

The array approach is better because it gives the contexts an order, which lets 
us assign precise semantics in the presence of overlapping context types: you 
get the first (most recent) context that can be cast to the type you ask for.

That said, I think what you're really trying to model here is a context stack, 
isn't it?  Why don't we just do that?
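
Concretely, the array-plus-`as?` lookup being discussed could look something like this (just a sketch; the names are mine, not proposed API):

```swift
// A sketch of the array-based approach (hypothetical names, not proposed API).
// Contexts are kept in the order they were supplied, oldest first.
var contexts: [Any] = []

// Most-recent-first lookup: the last value that casts to `Context` wins.
func context<Context>(ofType type: Context.Type) -> Context? {
    for candidate in contexts.reversed() {
        if let match = candidate as? Context { return match }
    }
    return nil
}

// A type nobody registered, to show the miss case.
struct Missing {}

contexts.append("v1")   // a String context
contexts.append(42)     // an Int context
contexts.append("v2")   // shadows the earlier String context

let string = context(ofType: String.self)    // "v2", the most recent String
let int = context(ofType: Int.self)          // 42
let missing = context(ofType: Missing.self)  // nil, nothing matches
```

Note that the user never writes the cast themselves; the encoder performs it on their behalf.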

> 
>>>     }
>>>     // Likewise on Decoder
>>>     
>>>     // Encoder and decoder classes should accept contexts in their top-level API:
>>>     open class JSONEncoder {
>>>             open func encode<Value : Codable>(_ value: Value, withContexts contexts: [Any] = []) throws -> Data
>>>     }
>> 
>> What happens if more than one context of the same type is provided here?
> 
> Fail a precondition, probably.

I would never support this design.  Good news though: the context stack 
approach avoids the problem.  We allow multiple contexts of the same type to be 
on the stack, and the topmost context that can be cast to the requested type is 
used.

> 
>> Also, it’s worth pointing out that whatever reason you had for explicitly 
>> passing the type above you’re not requiring type information to be provided 
>> here.  Whatever design we have it should be self-consistent.
> 
> Yeah. I did this here because there was no way to specify a dictionary 
> literal of `(T.Type, T)`, where `T` could be different for different elements.
> 
>> Do you think it’s really important to allow users to dynamically provide 
>> context for children?  Do you have real world use cases where this is 
>> needed?  I’m sure there could be case where this might be useful.  But I 
>> also think there is some benefit in knowing that the context used for an 
>> entire encoding / decoding is the one you provide at the top level.  I 
>> suspect the benefit of a static guarantee that your context is used for the 
>> entire encoding / decoding has a lot more value than the ability to 
>> dynamically change the context for a subtree.
> 
> The problem with providing all the contexts at the top level is that then the 
> top level has to *know* what all the contexts needed are. Again, if you're 
> encoding a type from FooKit, and it uses a type from GeoKit, then you—the 
> user of FooKit—need to know that FooKit uses GeoKit and how to make contexts 
> for both of them. There's no way to encapsulate GeoKit's role in encoding.

The use cases I know of for contexts are really around helping a type choose an 
encoding strategy.  I can't imagine a real world use case where a Codable type 
would have a required context - it's easy enough to choose one strategy as the 
default.  That said, I can imagine really evil and degenerate API designs that 
would require the same type to be encoded differently in different parts of the 
tree.  I could imagine dynamic contexts being helpful in solving some of these 
cases, but often you would need to look at the codingKeyContext to get it right.

If you have a concrete real world use case involving module boundaries, please 
elaborate.  I'm having trouble imagining the details of a precise problem you 
would solve using dynamic contexts.  I get the impression you have something 
more concrete in mind than I can think of.

> 
> On the other hand, there *could* be a way to encapsulate it. Suppose we had a 
> context protocol:
> 
>       protocol CodingContext {
>               var underlyingContexts: [CodingContext] { get }
>       }
>       extension CodingContext {
>               var underlyingContexts: [CodingContext] { return [] }
>       }
> 
> Then you could have this as your API surface:
> 
>       protocol Encoder {
>               // Retrieve the context instance of the indicated type.
>               func context<Context: CodingContext>(ofType type: Context.Type) -> Context?
>       }
>       // Likewise on Decoder
>       
>       // Encoder and decoder classes should accept contexts in their top-level API:
>       open class JSONEncoder {
>               open func encode<Value : Codable>(_ value: Value, with context: CodingContext? = nil) throws -> Data
>       }
> 
> And libraries would be able to add additional contexts for dependencies as 
> needed.
> 
> (Hmm. Could we maybe do this?
> 
>       protocol Codable {
>               associatedtype CodingContextType: CodingContext = Never
>               
>               func encode(to encoder: Encoder) throws
>               init(from decoder: Decoder) throws
>       }
> 
>       protocol Encoder {
>               // Retrieve the context instance of the indicated type.
>               func context<CodableType: Codable>(for instance: Codable) -> CodableType.CodingContextType?
>       }
>       // Likewise on Decoder
>       
>       // Encoder and decoder classes should accept contexts in their top-level API:
>       open class JSONEncoder {
>               open func encode<Value : Codable>(_ value: Value, with context: Value.CodingContextType? = nil) throws -> Data
>       }
> 
> That would make sure that, if you did use a context, it would be the right 
> one for the root type. And I don't believe it would have any impact on types 
> which didn't use contexts.)

I think this is far more than we need.  I think we could just say encoders and 
decoders keep a stack of contexts.  Calls to encode or decode (including 
top-level ones) can provide a context, or an array of contexts interpreted as a 
stack: bottom on the left, top on the right.  When the call returns, the stack 
is popped back to where it was before the call.  We could also include an 
explicit `func push(contexts: Context...)` method on encoder and decoder to let 
a Codable type set context used by all of its members.  Everything pushed via 
`push` would be popped when the current call to encode / decode returns.

Users ask for a context from an encoder / decoder using `func 
context<Context>(of: Context.Type) -> Context?`.  The stack is searched from 
the top to the bottom for a value that can be successfully cast to Context.
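
To make those semantics concrete, here is a rough sketch (all names hypothetical, not proposed API).  Each scope remembers the stack depth on entry and pops back to it on exit, so pushed contexts are confined to the subtree, and lookup walks from the top down with `as?`:

```swift
struct APIVersion { let value: String }

// Hypothetical encoder keeping a scoped stack of contexts.
final class StackContextEncoder {
    private var contextStack: [Any] = []

    // Top-down search: the topmost value castable to `Context` wins.
    func context<Context>(of type: Context.Type) -> Context? {
        for candidate in contextStack.reversed() {
            if let match = candidate as? Context { return match }
        }
        return nil
    }

    // Push contexts for the duration of `body`; pop back to the entry
    // depth on return, even if `body` pushed more contexts itself.
    func withContexts(_ contexts: [Any], _ body: () throws -> Void) rethrows {
        let depth = contextStack.count
        contextStack.append(contentsOf: contexts)
        defer { contextStack.removeLast(contextStack.count - depth) }
        try body()
    }
}

let encoder = StackContextEncoder()
var seen: [String] = []
encoder.withContexts([APIVersion(value: "1.4")]) {
    encoder.withContexts([APIVersion(value: "2.0")]) {
        // Two contexts of the same type on the stack; the topmost wins.
        seen.append(encoder.context(of: APIVersion.self)!.value)   // "2.0"
    }
    // The inner context was popped when its scope returned.
    seen.append(encoder.context(of: APIVersion.self)!.value)       // "1.4"
}
```

A real encoder would do the push / pop bookkeeping around each `encode(to:)` call internally; a Codable type would only ever see `push` and `context(of:)`.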

> 
>> What benefit do you see in using types as context “keys” rather than 
>> something like `CodingUserInfoKey`?  As far as I can tell, it avoids the 
>> need for an explicit key which you could argue are somewhat redundant (it 
>> would be weird to have two context values of the same type in the cases I 
>> know of) and puts the cast in the Encoder / Decoder rather than user code.  
>> These seem like modest, but reasonable wins.  
> 
> I also see it as an incentive for users to build a single context type rather 
> than sprinkling in a whole bunch of separate keys. I really would prefer not 
> to see people filling a `userInfo` dictionary with random primitive-typed 
> values like `["json": true, "apiVersion": "1.4"]`; it seems too easy for 
> names to clash or people to forget the type they're actually using. 
> `context(…)` being a function instead of a subscript is similarly about 
> ergonomics: it discourages you from trying to mutate your context during the 
> encoding process (although it doesn't prevent it for reference types.)
> 

I agree with this sentiment, and I indicated to Tony the desire to steer people 
away from treating this as a dictionary to put a lot of stuff in, and toward 
defining an explicit context type.  That, together with the fact that keys 
would feel pretty arbitrary, is behind my desire to avoid the keys-and-dictionary 
approach.
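
To illustrate the contrast (all types and names invented for the example):

```swift
// The grab-bag style to discourage: stringly typed keys, loosely typed
// values, casts scattered through user code, and nothing preventing two
// libraries from colliding on a key like "apiVersion".
let userInfo: [String: Any] = ["json": true, "apiVersion": "1.4"]
let looseVersion = userInfo["apiVersion"] as? String

// The style to encourage: one explicit, fully typed context per library,
// retrieved by its type with no string keys involved.
struct FooKitEncodingContext {
    var apiVersion: String
    var emitJSONExtensions: Bool
}

let typed = FooKitEncodingContext(apiVersion: "1.4", emitJSONExtensions: true)
// An encoder would hand this back via something like
// context(ofType: FooKitEncodingContext.self), so the field names and
// types are checked by the compiler rather than at runtime.
let typedVersion = typed.apiVersion
```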

>> Unfortunately, I don't think there is a good answer to the question about 
>> multiple context values with the same type though.  I can’t think of a good 
>> way to prevent this statically.  Worse, the values might not have the same 
>> type, but be equally good matches for a type a user requests (i.e. both 
>> conform to the same protocol).  I’m not sure how a user-defined encoder / 
>> decoder could be expected to find the “best” match using semantics that 
>> would make sense to Swift users (i.e. following the rules that are kind of 
>> the inverse to overload resolution).  
>> 
>> Even if this were possible there are ambiguous cases where there would be 
>> equally good matches.  Which value would a user get when requesting a 
>> context in that case?  We definitely don’t want accessing the context to be 
>> a trapping or throwing operation.  That leaves returning nil or picking a 
>> value at random.  Both are bad choices IMO.
> 
> If we use the `underlyingContexts` idea, we could say that the context list 
> is populated breadth-first and the first context of a particular type 
> encountered wins. That would tend to prefer the context "closest" to the 
> top-level one provided by the caller, which will probably have the best 
> fidelity to the caller's preferences.

I'm not totally sure I follow you here, but I think you're describing 
stack-like semantics that are at least similar to what I have described.  I 
think the stack approach is a pretty cool one that targets the kinds of 
problems multiple contexts are trying to solve more directly than the 
dictionary approach would.

> 
> -- 
> Brent Royal-Gordon
> Architechies
> 
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
