> On Oct 8, 2016, at 10:09 AM, Karl <razie...@gmail.com> wrote:
>
> Could you add this (and John’s previous writeup) to the docs in the repo?

Yeah, it’s unfortunate that design discussions are buried in a flood of email. On the flip side, I’ve checked in some premature design docs that are probably nonsense now. I’m currently preparing a type-safe memory model design doc to check in. After that I’ll probably work on a document for SIL SSA with address-only types, which should cover John’s writeup. I’ll have to work with Michael Gottesman and John McCall to get the SIL ownership docs checked in.

> I was reasonably along the way to adding unowned optionals a while back but
> got totally lost in SILGen.
> This info looks really valuable, but personally I find that with the mailing
> list format it’s hard to ever find this kind of stuff when I need it.
>
> Thanks
>
> Karl
>
> P.S. going to pick up that unowned optional stuff soon, once I have time to
> read the docs about SILGen

There are SILGen docs somewhere?

-Andy

>
>> On 8 Oct 2016, at 08:10, Andrew Trick via swift-dev <swift-dev@swift.org
>> <mailto:swift-dev@swift.org>> wrote:
>>
>> On swift-dev, John already sent out a great writeup on SIL SSA:
>> Representing "address-only" values in SIL.
>>
>> While talking to John I also picked up a lot of insight into how
>> address types relate to SIL ownership and borrow checking. I finally
>> organized the information into these notes. This is not a
>> proposal. It's background information for those of us writing and
>> reviewing proposals. Just take it as a strawman for future
>> discussions. (There's also a good chance I'm getting something
>> wrong).
>>
>> [My commentary in brackets.]
>>
>> ** Recap of address-only.
>>
>> Divide address-only types into two categories:
>> 1. By abstraction (compiler doesn't know the size).
>> 2. The type is "memory-linked", i.e. the address is significant at runtime.
>>    - weak references (anything that registers its address).
>>    - C++ this.
>>    - Anything with interior pointers.
>>    - Any shared-borrowed value of a type with "nonmutating" properties.
>>      ["nonmutating" properties allow mutation of state attached to a value.
>>      Rust atomics are an example.]
>>
>> Address-only will not be reflected in SIL types. SIL addresses should
>> only be used for formal memory (pointers, globals, class
>> properties, captures). We'll get to inout arguments later...
>>
>> As with opaque types, when IRGen lowers a memory-linked borrowed type,
>> it needs to allocate storage.
>>
>> Concern: SILGen has built-in tracking of managed values that automates
>> insertion of cleanups. Lowering address-only types after SILOpt would
>> require rediscovering that information based on CFG analysis. Is this
>> too heroic?
>>
>> This was already described by John. Briefly recapping:
>>
>> e.g. Constructing Optional<Any>
>>
>> We want initialization to be in place, as such:
>>
>>   %0 = struct_element_addr .. #S.any
>>   %1 = init_existential_addr %0, $*Any, $Optional<X>
>>   %2 = inject_enum_data_addr %1, $Optional<X>.Some
>>   apply @initX(%2)
>>
>> SILValue initialization would look something like:
>>
>>   %0 = apply @initX()
>>   %1 = enum #Optional.Some, %0 : $X
>>   %2 = existential %1 : $Any
>>
>> [I'm not sure we actually want to represent an existential container
>> this way, but enum, yes.]
>>
>> Lowering now requires discovering the storage structure, bottom-up,
>> hoisting allocation, and inserting cleanups, as John explained.
>>
>> Side note: Before lowering, something like alloc_box would directly
>> take its initial value.
>>
>> ** SILFunction calling convention.
>>
>> For ownership analysis, there's effectively no difference between the
>> value/address forms of argument ownership:
>>
>>   @owned          / @in
>>   @guaranteed     / @in_guaranteed
>>   return          / @out
>>   @owned arg
>>   + @owned return / @inout
>>
>> Regardless of the representation we choose for @inout, @in/@out will
>> now be scalar types. SILFunction will maintain the distinction between
>> @owned/@in etc. based on whether the type is address-only.
>> We need this for reabstraction, but it only affects the function type,
>> not the calling convention.
>>
>> Rather than building a tuple, John prefers SIL support for anonymous
>> aggregates as "exploded values".
>>
>> [I'm guessing because tuples are a distinct formal type with their own
>> convention and common ownership. This may need some discussion though.]
>>
>> Example SIL function type:
>>
>>   $(@in P, @owned Q) -> (@owned R, @owned S, @out T, @out U)
>>
>>   %p = apply f: $() -> P
>>   %q = apply g: $() -> Q
>>   %exploded = apply h(%p, %q)
>>   %r = project_exploded %exploded, #0 : $R
>>   %s = project_exploded %exploded, #1 : $S
>>   %t = project_exploded %exploded, #2 : $T
>>   %u = project_exploded %exploded, #3 : $U
>>
>> Exploded types require all their elements to be projected, each with
>> its own independent ownership.
>>
>> ** Ownership terminology.
>>
>>   Swift "owned"    = Rust values           = SIL @owned      = implicitly consumed
>>   Swift "borrowed" = Rust immutable borrow = SIL @guaranteed = shared
>>   Swift "inout"    = Rust mutable borrow   = SIL @inout      = unique
>>
>> Swift "inout" syntax is already (nearly) sufficient.
>>
>> "borrowed" may not need syntax on the caller side, just a way to
>> qualify parameters. Swift still needs syntax for returning a borrowed
>> value.
>>
>> ** Representation of borrowed values.
>>
>> Borrowed values represent some shared storage location.
>>
>> We want some borrowed value references to be passed as SIL values, not SIL
>> addresses:
>> - Borrowed class references should not be indirected.
>> - Optimize borrowing other small non-memory-linked types.
>> - Support capture promotion, and other SSA optimizations.
>> - Borrow CoW values directly.
>>
>> [Address-only borrowed types will still be passed as SIL addresses (why
>> not?)]
>>
>> Borrowed types with potentially mutating properties must be passed by
>> SIL address because they are not actually immutable and their storage
>> location is significant.
>>
>> Borrowed references have a scope and need an end-of-borrow marker.
>>
>> [The end-of-borrow marker semantically changes the memory state, and
>> statically enforces non-overlapping memory states. It does not
>> semantically write back a value. Borrowed values with mutating fields
>> are semantically modified in place.]
>>
>> [Regardless of whether borrowed references are represented as SIL
>> values or addresses, they must be associated with formal storage. That
>> storage must remain immutable at the language level (although it may
>> have mutating fields) and the value cannot be destroyed during the
>> borrowed scope.]
>>
>> [Trivial borrowed values can be demoted to copies so we can eliminate
>> their scope.]
>>
>> [Anything borrowed from global storage (and not demoted to a copy)
>> needs its scope to be dynamically enforced. Borrows from local storage
>> are sufficiently statically enforced. However, in both cases the
>> optimizer must respect the static scope of the borrow.]
>>
>> [I think borrowed values are effectively passed @guaranteed. The
>> end-of-borrow scope marker will then always be at the top-level
>> scope. You can't borrow in a caller and end its scope in the callee.]
>>
>> ** Borrowed and inout scopes.
>>
>> inout value references are also scoped. We'll get to their
>> representation shortly. Within an inout scope, memory is in an
>> exclusive state. No borrowed scopes may overlap with an inout state,
>> which is to say, memory is either shared or exclusive.
>>
>> We need a flag for stored properties, even for simple trivial
>> types. That's the only way to provide a simple user model. At least we
>> don't need this to be implemented atomically; we're not detecting race
>> conditions. Optimizations will come later. We should be able to prove
>> that some stored properties are never passed as inout.
>>
>> The stored property flag needs to be a tri-state: owned, borrowed, exclusive.
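
As a rough illustration of the tri-state flag described above, here is a minimal sketch in Swift rather than SIL. Everything in it is hypothetical: `AccessState`, `AccessFlag`, and the method names are made up for illustration and are not compiler or runtime APIs. The borrow count is an added assumption so that overlapping shared borrows can be modeled; the notes only call for three states.

```swift
// Hypothetical model of a per-property access flag; not a compiler or runtime API.
enum AccessState: Equatable {
    case owned                 // no access in progress
    case borrowed(count: Int)  // one or more shared accesses (count is an assumption)
    case exclusive             // a single inout access
}

struct AccessFlag {
    private(set) var state: AccessState = .owned

    // Begin a shared borrow. Fails if an inout access is in progress.
    mutating func beginBorrow() -> Bool {
        switch state {
        case .owned:
            state = .borrowed(count: 1)
            return true
        case .borrowed(let n):
            state = .borrowed(count: n + 1)
            return true
        case .exclusive:
            return false
        }
    }

    // End one shared borrow; the last one returns the flag to the owned state.
    mutating func endBorrow() {
        guard case .borrowed(let n) = state else {
            preconditionFailure("endBorrow without a matching beginBorrow")
        }
        state = n > 1 ? .borrowed(count: n - 1) : .owned
    }

    // Begin an inout access. Fails if any other access is in progress.
    mutating func beginExclusive() -> Bool {
        guard state == .owned else { return false }
        state = .exclusive
        return true
    }

    mutating func endExclusive() {
        precondition(state == .exclusive, "endExclusive without a matching beginExclusive")
        state = .owned
    }

    // Destruction is only modeled as legal when no access is in progress.
    var canDestroy: Bool { state == .owned }
}
```

The flag is a plain value with no atomics, in line with the point that this check is not meant to detect race conditions.
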
>>
>> The memory value can only be destroyed in the owned state.
>>
>> The user may mark some storage locations as "unchecked" as an
>> opt-out. That doesn't change the optimizer's constraints. It simply
>> bypasses the runtime check.
>>
>> ** Ownership of loaded values.
>>
>> [MikeG already explained possibilities of load ownership in
>> [swift-dev] [semantic-arc][proposal] High Level ARC Memory Operations]
>>
>> For the sake of understanding the model, it's worth realizing that we
>> only need one form of load ownership: load_borrow. We don't
>> actually need an operation that loads an owned value out of formal
>> storage. This makes canonical sense because:
>>
>> - Semantically, a load must at least be a borrow because the storage
>>   location's non-exclusive flag needs to be dynamically checked
>>   anyway, even if the value will be copied.
>>
>> - Code motion in the SIL optimizer has to obey the same limitations
>>   within borrow scopes regardless of whether we fuse loads and copies
>>   (retains).
>>
>> [For the purpose of semantic ARC, the copy_value would be the RC
>> root. The load and copy_value would effectively be "coupled" by the
>> static scope of the borrow. e.g. we would not want to move a release
>> inside the static scope of a borrow.]
>>
>> [Purely in the interest of concise SIL, I still think we want a load [copy].]
>>
>> ** SIL value ownership and aggregates
>>
>> Operations on values:
>> 1. copy
>> 2. forward (move)
>> 3. borrow (share)
>>
>> A copy or forward produces an owned value.
>> An owned value has a single consumer.
>> A borrow has static scope.
>>
>> For simplicity, passing a bb argument only has move semantics (it
>> forwards the value). Later that can be expanded if needed.
>>
>> We want to allow simultaneous access to independent subelements of a
>> fragile aggregate. We should be able to borrow one field while
>> mutating another.
>>
>> Is it possible to forward a subelement within an aggregate? No.
>> But we can fully explode an owned aggregate into individual owned
>> elements and reconstruct the aggregate. This makes use of the @exploded
>> type feature described in the calling convention.
>>
>> [I don't think forwarding a subelement is useful anyway except for
>> modeling @inout semantics...]
>>
>> That leads us to this question: Does an @inout value reference have
>> formal storage (thus a SIL address) or is it just a convention for
>> passing owned SSA values?
>>
>> ** World 1: SSA @inout
>>
>> Projecting an element produces a new SILValue. Does this SILValue have
>> its own ownership associated with its lifetime, or is it derived
>> from its parent object by looking through projections?
>>
>> Either way, projecting any subelement requires reconstructing the
>> entire aggregate in SIL, through all nesting levels. This will
>> generate a massive number of SILValues. Superficially they all need
>> their own storage.
>>
>> [We could claim that projections don't need storage, but that only
>> solves one side of the problem.]
>>
>> [I argue that this actually obscures the producer/consumer
>> relationship, which is the opposite of the intention of moving to
>> SSA. Projecting subelements for mutation fundamentally doesn't make
>> sense. It does make sense to borrow a subelement (not for
>> mutation). It also makes sense to project a mutable storage
>> location. The natural way to project a storage location is by
>> projecting an address...]
>>
>> ** World 2: @inout formal storage
>>
>> In this world, @inout references continue to have SILType $*T with
>> guaranteed exclusive access.
>>
>> Memory state can be:
>> - uninitialized
>> - holds an owned value
>> - has exclusive access
>> - has shared access
>>
>> --- expected transitions need to be handled
>> - must become uninitialized
>> - must become initialized
>> - must preserve initialization state
>>
>> We need to mark initializers with some "must initialize" marker,
>> similar to how we mark deinitializers [this isn't clear to me yet].
>>
>> We could give address types qualifiers to distinguish the memory state
>> of their pointee (uninitialized, shared, exclusive). Addresses
>> themselves could be pseudo-linear types. This would provide the same
>> use-def guarantees as the SSA @inout approach, but producing a new
>> address each time memory changes state would also be complicated and
>> cumbersome (though not as bad as SSA).
>>
>> [[
>> We didn't talk about the alternative, but presumably exclusive
>> vs. shared scope would be delimited by pseudo memory operations as
>> such:
>>
>>   %a1 = alloc_stack
>>
>>   begin_exclusive %a1
>>   apply foo(%a1) // must be marked an initializer?
>>   end_exclusive %a1
>>
>>   begin_shared %a1
>>   apply bar(%a1) // immutable access
>>   end_shared %a1
>>
>>   dealloc_stack %a1
>>
>> Values loaded from shared memory also need to be scoped. They must be
>> consumed within the shared region. e.g.
>>
>>   %a2 = ref_element_addr
>>
>>   %x = load_borrow %a2
>>
>>   end_borrow %x, %a2
>>
>> It makes sense to me that a load_borrow would implicitly transition
>> memory to shared state, and end_borrow would implicitly return memory
>> to an owned state. If the address type is already ($* @borrow T), then
>> memory would remain in the shared state.
>> ]]
>>
>> For all sorts of analysis and optimization, from borrow checking to
>> CoW to ARC, we really need aliasing guarantees. Knowing we have a
>> unique address to a location is about as good as having an owned
>> value.
>>
>> To get this guarantee we need to structurally guarantee
>> unique addresses.
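
To make the begin/end scoping of a single address concrete, here is a small sketch of a checker over a straight-line sequence of pseudo memory operations. It is written in Swift, not SIL, and `MemOp`, `Scope`, and `verifyScopes` are hypothetical names for illustration only. It assumes one address, no control flow, and no nested shared scopes, and it ignores initialization state, so it is far simpler than anything a real verifier would need.

```swift
// Hypothetical pseudo memory operations on one address. The names mirror the
// begin_/end_ markers in the notes, but this is not SIL or a compiler API.
enum MemOp {
    case beginExclusive
    case endExclusive
    case beginShared
    case endShared
    case read
    case write
}

// Access scope currently open on the address.
enum Scope {
    case idle
    case shared
    case exclusive
}

// Returns nil if the sequence is well scoped, otherwise a description of the
// first violation. Assumes straight-line code and no nested shared scopes.
func verifyScopes(_ ops: [MemOp]) -> String? {
    var scope = Scope.idle
    for (index, op) in ops.enumerated() {
        switch (op, scope) {
        case (.beginExclusive, .idle):
            scope = .exclusive
        case (.beginShared, .idle):
            scope = .shared
        case (.endExclusive, .exclusive), (.endShared, .shared):
            scope = .idle
        case (.read, .shared), (.read, .exclusive), (.write, .exclusive):
            break  // reads need some scope; writes need the exclusive scope
        default:
            return "op \(index) (\(op)) is invalid in scope \(scope)"
        }
    }
    return scope == .idle ? nil : "scope \(scope) is never ended"
}
```

For example, `[.beginExclusive, .write, .endExclusive, .beginShared, .read, .endShared]` passes, while a `.write` inside a shared scope is reported as a violation.
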
>>
>> [Is there a way to do this without making all the element_addr
>> operations scoped?]
>>
>> With aliasing guarantees, verification should be able to statically
>> prove that most formal storage locations are properly initialized and
>> uninitialized (pseudo-linear type) by inspecting the memory
>> operations.
>>
>> Likewise, we can verify the shared vs. exclusive states.
>>
>> Representing @inout with addresses doesn't really add features to
>> SIL. In any case, SIL address types are still used for
>> formal storage. Exclusive access through any of the following
>> operations must be guaranteed dynamically:
>>
>> - ref_element_addr
>> - global_addr
>> - pointer_to_address
>> - alloc_stack
>> - project_box
>>
>> We end up with these basic SIL Types:
>>
>>   $T = owned value
>>
>>   $@borrowed T = shared value
>>
>>   $*T = exclusively accessed
>>
>>   $* @borrowed T = shared access
>>
>> [I think the non-address @borrowed type is only valid for concrete
>> types that the compiler knows are not memory-linked? This can be used
>> to avoid passing borrowed values indirectly for arrays and other
>> small, free-to-copy values].
>>
>> [We obviously need to work through concrete examples before we can
>> claim to have a real design.]
>>
>> -Andy
>>
>> _______________________________________________
>> swift-dev mailing list
>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-dev
>