> On Oct 7, 2016, at 6:04 PM, Michael Gottesman <[email protected]> wrote:
>> On Oct 7, 2016, at 5:09 PM, John McCall <[email protected]
>> <mailto:[email protected]>> wrote:
>>> On Oct 7, 2016, at 2:38 PM, Michael Gottesman <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> Attached below is an updated version of the proposal. Again a rendered
>>> version is located at:
>>>
>>> https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html
>>> <https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html>
>>>
>>> Michael
>>>
>>> ----
>>>
>>> # Summary
>>>
>>> This document proposes:
>>>
>>> 1. adding the following ownership qualifiers to `load`: `[take]`, `[copy]`,
>>> `[borrow]`, `[trivial]`.
>>> 2. adding the following ownership qualifiers to `store`: `[init]`,
>>> `[assign]`,
>>> `[trivial]`.
>>> 3. requiring all `load` and `store` operations to have ownership qualifiers.
>>> 4. banning the use of `load [trivial]`, `store [trivial]` on memory
>>> locations of
>>> `non-trivial` type.
>>>
>>> This will allow for:
>>>
>>> 1. eliminating optimizer miscompiles that occur due to releases being moved
>>> into
>>> the region in between a `load`/`retain`, `load`/`release`,
>>> `store`/`release`. (For a specific example, see the appendix).
>>> 2. explicitly modeling `load [trivial]`/`store [trivial]` as having `unsafe
>>> unowned` ownership semantics. This will be enforced via the verifier.
>>> 3. more aggressive ARC code motion.
>>>
>>> # Definitions
>>>
>>> ## ownership qualified load
>>>
>>> We propose four different ownership qualifiers for load. Define `load
>>> [trivial]`
>>> as:
>>>
>>> %x = load [trivial] %x_ptr : $*Int
>>>
>>> =>
>>>
>>> %x = load %x_ptr : $*Int
>>>
>>> A `load [trivial]` can not be used to load values of non-trivial type.
>>
>> Should we mandate the reverse as well, that e.g. load [copy] cannot be used
>> to
>> load values of trivial type? That's a little more work for substituting
>> cloners, but it
>> keeps everything nice and canonical.
>
> No. I think that in the trivial case, load [copy] optimizes to load [trivial]
> as a canonicalization. This is just like applying a retain_value to a trivial
> type.
I guess I'm fine with that.
>>> Define
>>> `load [copy]` as:
>>>
>>> %x = load [copy] %x_ptr : $*C
>>>
>>> =>
>>>
>>> %x = load %x_ptr : $*C
>>> retain_value %x : $C
>>>
>>> Then define `load [take]` as:
>>>
>>> %x = load [take] %x_ptr : $*Builtin.NativeObject
>>>
>>> =>
>>>
>>> %x = load %x_ptr : $*Builtin.NativeObject
>>>
>>> **NOTE** `load [take]` implies that the loaded from memory location no
>>> longer
>>> owns the result object (i.e. a take is a move). Loading from the memory
>>> location
>>> again without reinitialization is illegal.
>>>
>>> Next we provide `load [borrow]`:
>>>
>>> %x = load [borrow] %x_ptr : $*Builtin.NativeObject
>>> ...
>>> endBorrow(%x, %x_ptr)
>>>
>>> =>
>>>
>>> %x = load %x_ptr : $*Builtin.NativeObject
>>> ...
>>> endBorrow(%x, %x_ptr)
>>>
>>> `load [borrow]` implies that in the region between the `load` and the
>>> `endBorrow`, the loaded object must semantically remain alive.
>>
>> For consistency with other multi-word SIL instructions, this should be
>> end_borrow.
>
> Sure.
>
>>
>> I wonder whether it might make more sense for load [borrow] to be a
>> different instruction.
>> There's a couple reasons for that first. The first is that it's the only
>> load which introduces
>> a scope, which is a really big difference structurally. The second is that
>> it's the only load
>> which returns a non-owned value, which will be a typing difference when we
>> record
>> ownership in the type system.
>
> I am fine with a load_borrow. If this is the only change left that you want
> can I just send out a proposal with that small change and start implementing.
> I am nervous about perfection being the enemy of the good (and I want to
> start implementing this weekend if possible *evil smile*).
Yes, it's fine to introduce this incrementally.
We can discuss the formation rules on scopes concurrently. I think the same
theoretical
foundation is probably what we'll use for pseudo-linearity,
memory-initialization soundness,
and so on.
John.
>> Anyway, I really like that load [borrow] is scoped.. Are you planning to
>> include the formation
>> restrictions on scopes instructions immediately, or will you do that in a
>> later proposal?
>
> It will be done in a later proposal. We are just trying to set the stage for
> verification.
>
>>
>> The requirements we need from scopes are:
>> - there has to be a well-defined set of IPs that lie within the scope and
>> - there can't be a non-explicit transition into or out of the scope.
>>
>> One way to get the first condition is to say that there has to be a unique
>> scope-ender that
>> post-dominates the scope-beginner, but that's a pretty hard restriction for
>> SILGen to satisfy
>> (as well as the optimizer, I imagine), and it doesn't handle "unreachable"
>> or infinite loops.
>> We need to allow multiple scope-enders, and we need to allow scope-enders to
>> be missing
>> in some cases.
>
> I agree with you here. We definitely want to be able to support multiple
> scope-enders.
>
>> Here's the right formalism, I think:
>>
>> For all walks W beginning from the entry point of the function:
>> For each node B in the CFG which is a scope-beginner:
>> Let E be the set of scope-enders for B, and consider just the
>> sub-sequence S of nodes
>> of W where the node is a member of {B} U E. Then the elements of S at
>> even
>> positions (starting from 0) must be B, and the elements at odd positions
>> must be
>> members of E. Furthermore, if the walk ends in a return or throw
>> instruction, then
>> S must have even length.
>>
>> Note that for this to be true, all the scope-enders must be dominated by the
>> scope-beginner.
>>
>> It is sufficient to just consider the set of "beeline" paths, i.e. acyclic
>> paths ending in either a true
>> terminator (a return, throw, or unreachable) or an edge back to a node
>> already in the path.
>> No such path may include multiple scope-enders for the same scope-beginner.
>> If the path ends
>> in a return or throw, it must include a matching scope-ender after every
>> scope-beginner. If
>> it ends in a loop back, then for every scope-beginner in the path, if the
>> path contains a scope-ender
>> then the loop destination must either precede the scope-beginner or follow
>> the scope-ender;
>> otherwise, the loop destination must follow the scope-beginner.
>>
>> Or, as a decision algorithm in Swift for a single scope-beginner:
>>
>> var blockEntryIsInScope = [Block: Bool]()
>> func scan(startingFrom inst: Instruction, isInScope: Bool) {
>> if inst is ReturnInst || inst is ThrowInst {
>> guard !isInScope else { invalid("ended function while in scope") }
>> return
>> }
>>
>> if let term = inst as? TerminatorInst {
>> for successor in term.successors {
>> guard begin.dominates(successor) else {
>> guard !isInScope else { invalid("branch out of scope while in
>> scope") }
>> continue
>> }
>> if let cachedValue = blockEntryIsInScope[successor] {
>> if cachedValue != isInScope {
>> invalid(isInScope ? "branch out of scope while in scope" :
>> "branch into scope after exiting scope")
>> }
>> } else {
>> blockEntryIsInScope[successor] = isInScope
>> scan(startingFrom: successor.begin, isInScope: isInScope)
>> }
>> }
>> return
>> }
>>
>> if inst.endsScopeOf(begin) {
>> guard isInScope else { invalid("ending scope that was already ended") }
>> scan(startingFrom: inst.next, isInScope: false)
>> } else {
>> scan(startingFrom: inst.next, isInScope: isInScope)
>> }
>> }
>> scan(startingFrom: begin, isInScope: true)
>
> Since this is tangential to the current proposal, can we introduce a side
> thread?
>
>>
>> John.
>>
>>> The `endBorrow` communicates to the optimizer:
>>>
>>> 1. That the value in `%x_ptr` should not be destroyed before endBorrow.
>>> 2. Uses of `%x` should not be sunk past endBorrow since `%x` is only a
>>> shallow
>>> copy of the value in `%x_ptr` and past that point `%x_ptr` may not remain
>>> alive.
>>>
>>> An example of where this construct is useful is when one has a let binding
>>> to a
>>> class instance `c` that contains a let field `f`. In that case `c`'s
>>> lifetime
>>> guarantees `f`'s lifetime meaning that returning `f` over the function call
>>> boundary is safe.
>>>
>>> ## ownership qualified store
>>>
>>> First define a `store [trivial]` as:
>>>
>>> store %x to [trivial] %x_ptr : $*Int
>>>
>>> =>
>>>
>>> store %x to %x_ptr : $*Int
>>>
>>> The verifier will prevent this instruction from being used on types with
>>> non-trivial ownership. Define a `store [assign]` as follows:
>>>
>>> store %x to [assign] %x_ptr : $*C
>>>
>>> =>
>>>
>>> %old_x = load %x_ptr : $*C
>>> store %new_x to %x_ptr : $*C
>>> release_value %old_x : $C
>>>
>>> *NOTE* `store` is defined as a consuming operation. We also provide
>>> `store [init]` in the case where we know statically that there is no
>>> previous value in the memory location:
>>>
>>> store %x to [init] %x_ptr : $*C
>>>
>>> =>
>>>
>>> store %new_x to %x_ptr : $*C
>>>
>>> # Implementation
>>>
>>> ## Goals
>>>
>>> Our implementation strategy goals are:
>>>
>>> 1. zero impact on other compiler developers until the feature is fully
>>> developed. This implies all work will be done behind a flag.
>>> 2. separation of feature implementation from updating passes.
>>>
>>> Goal 2 will be implemented via a pass that transforms ownership qualified
>>> `load`/`store` instructions into unqualified `load`/`store` right after
>>> SILGen.
>>>
>>> ## Plan
>>>
>>> We begin by adding initial infrastructure for our development. This means:
>>>
>>> 1. Adding to SILOptions a disabled by default flag called
>>> "EnableSILOwnershipModel". This flag will be set by a false by default
>>> frontend
>>> option called "-enable-sil-ownership-mode".
>>>
>>> 2. Bots will be brought up to test the compiler with
>>> "-enable-sil-ownership-model" set to true. The specific bots are:
>>>
>>> * RA-OSX+simulators
>>> * RA-Device
>>> * RA-Linux.
>>>
>>> The bots will run once a day until the feature is close to completion.
>>> Then a
>>> polling model will be followed.
>>>
>>> Now that change isolation is borrow, we develop building blocks for the
>>> optimization:
>>>
>>> 1. Two enums will be defined: `LoadInstOwnershipQualifier`,
>>> `StoreInstOwnershipQualifier`. The exact definition of these enums are as
>>> follows:
>>>
>>> enum class LoadOwnershipQualifier {
>>> Unqualified, Take, Copy, Borrow, Trivial
>>> };
>>> enum class StoreOwnershipQualifier {
>>> Unqualified, Init, Assign, Trivial
>>> };
>>>
>>> *NOTE* `LoadOwnershipQualifier::Unqualified` and
>>> `StoreOwnershipQualifier::Unqualified` are only needed for staging
>>> purposes.
>>>
>>> 2. Creating a `LoadInst`, `StoreInst` will be changed to require an
>>> ownership
>>> qualifier. At this stage, this argument will default to `Unqualified`.
>>> "Bare"
>>> `load`, `store` when parsed via textual SIL will be considered to be
>>> unqualified. This implies that the rest of the compiler will not have to be
>>> changed as a result of this step.
>>>
>>> 3. Support will be added to SIL, IRGen, Serialization, SILPrinting, and SIL
>>> Parsing for the rest of the qualifiers. SILGen will not be modified at this
>>> stage.
>>>
>>> 4. A pass called the "OwnershipModelEliminator" will be implemented. It will
>>> blow up all `load`, `store` instructions with non `*::Unqualified`
>>> ownership
>>> into their constituant ARC operations and `*::Unqualified` `load`,
>>> `store`
>>> insts.
>>>
>>> 3. An option called "EnforceSILOwnershipMode" will be added to the
>>> verifier. If
>>> the option is set, the verifier will assert if:
>>>
>>> a. `load`, `store` operations with trivial ownership are applied to
>>> memory
>>> locations with non-trivial type.
>>>
>>> b. `load`, `store` operation with unqualified ownership type are present
>>> in
>>> the IR.
>>>
>>> Finally, we wire up the building blocks:
>>>
>>> 1. If SILOption.EnableSILOwnershipModel is true, then the after SILGen SIL
>>> verification will be performed with EnforceSILOwnershipModel set to true.
>>> 2. If SILOption.EnableSILOwnershipModel is true, then the pass manager will
>>> run
>>> the OwnershipModelEliminator pass right after SILGen before the normal
>>> pass
>>> pipeline starts.
>>> 3. SILGen will be changed to emit non-unqualified ownership qualifiers on
>>> load,
>>> store instructions when the EnableSILOwnershipModel flag is set. We will
>>> use
>>> the verifier throwing to guarantee that we are not missing any specific
>>> cases.
>>>
>>> Then once all of the bots are green, we change
>>> SILOption.EnableSILOwnershipModel
>>> to be true by default. After a cooling off period, we move all of the code
>>> behind the SILOwnershipModel flag in front of the flag. We do this so we can
>>> reuse that flag for further SILOwnershipModel changes.
>>>
>>> ## Optimizer Changes
>>>
>>> Since the SILOwnershipModel eliminator will eliminate the ownership
>>> qualifiers
>>> on load, store instructions right after ownership verification, there will
>>> be no
>>> immediate affects on the optimizer and thus the optimizer changes can be
>>> done in
>>> parallel with the rest of the ARC optimization work.
>>>
>>> But, in the long run, we want to enforce these ownership invariants all
>>> throughout the SIL pipeline implying these ownership qualified `load`,
>>> `store`
>>> instructions must be processed by IRGen, not eliminated by the
>>> SILOwnershipModel
>>> eliminator. Thus we will need to update passes to handle these new
>>> instructions. The main optimizer changes can be separated into the following
>>> areas: memory forwarding, dead stores, ARC optimization. In all of these
>>> cases,
>>> the necessary changes are relatively trivial to respond to. We give a quick
>>> taste of two of them: store->load forwarding and ARC Code Motion.
>>>
>>> ### store->load forwarding
>>>
>>> Currently we perform store->load forwarding as follows:
>>>
>>> store %x to %x_ptr : $C
>>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>> %y = load %x_ptr : $C
>>> use(%y)
>>>
>>> =>
>>>
>>> store %x to %x_ptr : $C
>>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>> use(%x)
>>>
>>> In a world, where we are using ownership qualified load, store, we have to
>>> also
>>> consider the ownership implications. *NOTE* Since we are not modifying the
>>> store, `store [assign]` and `store [init]` are treated the same. Thus
>>> without
>>> any loss of generality, lets consider solely `store`.
>>>
>>> store %x to [assign] %x_ptr : $C
>>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>> %y = load [copy] %x_ptr : $C
>>> use(%y)
>>>
>>> =>
>>>
>>> store %x to [assign] %x_ptr : $C
>>> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...
>>> strong_retain %x
>>> use(%x)
>>>
>>> ### ARC Code Motion
>>>
>>> If ARC Code Motion wishes to move the ARC semantics of ownership qualified
>>> `load`, `store` instructions, it must now consider read/write effects. On
>>> the
>>> other hand, it will be able to now not consider the side-effects of
>>> destructors
>>> when moving retain/release operations.
>>>
>>> ### Normal Code Motion
>>>
>>> Normal code motion will lose some effectiveness since many of the load/store
>>> operations that it used to be able to move now must consider ARC
>>> information. We
>>> may need to consider running ARC code motion earlier in the pipeline where
>>> we
>>> normally run Normal Code Motion to ensure that we are able to handle these
>>> cases.
>>>
>>> ### ARC Optimization
>>>
>>> The main implication for ARC optimization is that instead of eliminating
>>> just
>>> retains, releases, it must be able to recognize ownership qualified `load`,
>>> `store` and set their flags as appropriate.
>>>
>>> ### Function Signature Optimization
>>>
>>> Semantic ARC affects function signature optimization in the context of the
>>> owned
>>> to borrow optimization. Specifically:
>>>
>>> 1. A `store [assign]` must be recognized as a release of the old value that
>>> is
>>> being overridden. In such a case, we can move the `release` of the old
>>> value
>>> into the caller and change the `store [assign]` into a `store [init]`.
>>> 2. A `load [copy]` must be recognized as a retain in the callee. Then
>>> function
>>> signature optimization will transform the `load [copy]` into a `load
>>> [borrow]`. This would require the addition of a new `@borrow` return
>>> value convention.
>>>
>>> # Appendix
>>>
>>> ## Partial Initialization of Loadable References in SIL
>>>
>>> In SIL, a value of non-trivial loadable type is loaded from a memory
>>> location as
>>> follows:
>>>
>>> %x = load %x_ptr : $*S
>>> ...
>>> retain_value %x_ptr : $S
>>>
>>> At first glance, this looks reasonable, but in truth there is a hidden
>>> drawback:
>>> the partially initialized zone in between the load and the retain
>>> operation. This zone creates a period of time when an "evil optimizer" could
>>> insert an instruction that causes x to be deallocated before the copy is
>>> finished being initialized. Similar issues come up when trying to perform a
>>> store of a non-trival value into a memory location.
>>>
>>> Since this sort of partial initialization is allowed in SIL, the optimizer
>>> is
>>> forced to be overly conservative when attempting to move releases passed
>>> retains
>>> lest the release triggers a deinit that destroys a value like `%x`. Lets
>>> look at
>>> two concrete examples that show how semantically providing ownership
>>> qualified
>>> load, store instructions eliminate this problem.
>>>
>>> **NOTE** Without any loss of generality, we will speak of values with
>>> reference
>>> semantics instead of non-trivial values.
>>>
>>> ## Case Study: Partial Initialization and load [copy]
>>>
>>> ### The Problem
>>>
>>> Consider the following swift program:
>>>
>>> func opaque_call()
>>>
>>> final class C {
>>> var int: Int = 0
>>> deinit {
>>> opaque_call()
>>> }
>>> }
>>>
>>> final class D {
>>> var int: Int = 0
>>> }
>>>
>>> var GLOBAL_C : C? = nil
>>> var GLOBAL_D : D? = nil
>>>
>>> func useC(_ c: C)
>>> func useD(_ d: D)
>>>
>>> func run() {
>>> let c = C()
>>> GLOBAL_C = c
>>> let d = D()
>>> GLOBAL_D = d
>>> useC(c)
>>> useD(d)
>>> }
>>>
>>> Notice that both `C` and `D` have fixed layouts and separate class
>>> hierarchies,
>>> but `C`'s deinit has a call to the function `opaque_call` which may write to
>>> `GLOBAL_D` or `GLOBAL_C`. Additionally assume that both `useC` and `useD`
>>> are
>>> known to the compiler to not have any affects on instances of type `D`, `C`
>>> respectively and useC assigns `nil` to `GLOBAL_C`. Now consider the
>>> following
>>> valid SIL lowering for `run`:
>>>
>>> sil_global GLOBAL_D : $D
>>> sil_global GLOBAL_C : $C
>>>
>>> final class C {
>>> var x: Int
>>> deinit
>>> }
>>>
>>> final class D {
>>> var x: Int
>>> }
>>>
>>> sil @useC : $@convention(thin) () -> ()
>>> sil @useD : $@convention(thin) () -> ()
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>> (2)
>>>
>>> %c2 = load %global_c : $*C
>>> (3)
>>> strong_retain %c2 : $C
>>> (4)
>>> %d2 = load %global_d : $*D
>>> (5)
>>> strong_retain %d2 : $D
>>> (6)
>>>
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()
>>> (7)
>>>
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>>
>>> strong_release %d : $D
>>> (9)
>>> strong_release %c : $C
>>> (10)
>>> }
>>>
>>> Lets optimize this function! First we perform the following operations:
>>>
>>> 1. Since `(2)` is storing to an identified object that can not be
>>> `GLOBAL_C`, we
>>> can store to load forward `(1)` to `(3)`.
>>> 2. Since a retain does not block store to load forwarding, we can forward
>>> `(2)`
>>> to `(5)`. But lets for the sake of argument, assume that the optimizer
>>> keeps
>>> such information as an analysis and does not perform the actual
>>> load->store
>>> forwarding.
>>> 3. Even though we do not foward `(2)` to `(5)`, we can still move `(4)` over
>>> `(6)` so that `(4)` is right before `(7)`.
>>>
>>> This yields (using the ' marker to designate that a register has had
>>> load-store
>>> forwarding applied to it):
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>> (2)
>>>
>>> strong_retain %c : $C
>>> (4')
>>> %d2 = load %global_d : $*D
>>> (5)
>>> strong_retain %d2 : $D
>>> (6)
>>>
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()
>>> (7')
>>>
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>>
>>> strong_release %d : $D
>>> (9)
>>> strong_release %c : $C
>>> (10)
>>> }
>>>
>>> Then by assumption, we know that `%useC` does not perform any releases of
>>> any
>>> instances of class `D`. Thus `(6)` can be moved past `(7')` and we can then
>>> pair
>>> and eliminate `(6)` and `(9)` via the rules of ARC optimization using the
>>> analysis information that `%d2` and `%d` are th same due to the possibility
>>> of
>>> performing store->load forwarding. After performing such transformations,
>>> `run`
>>> looks as follows:
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>>
>>> %d2 = load %global_d : $*D
>>> (5)
>>> strong_retain %c : $C
>>> (4')
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()
>>> (7')
>>>
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>>
>>> strong_release %c : $C
>>> (10)
>>> }
>>>
>>> Now by assumption, we know that `%useD_func` does not touch any instances of
>>> class `C` and `%c` does not contain any ivars of type `D` and is final so
>>> none
>>> can be added. At first glance, this seems to suggest that we can move `(10)`
>>> before `(8')` and then pair/eliminate `(4')` and `(10)`. But is this a safe
>>> optimization perform? Absolutely Not! Why? Remember that since `useC_func`
>>> assigns `nil` to `GLOBAL_C`, after `(7')`, `%c` could have a reference count
>>> of 1. Thus `(10)` _may_ invoke the destructor of `C`. Since this destructor
>>> calls an opaque function that _could_ potentially write to `GLOBAL_D`, we
>>> may be
>>> be passing `%d2`, an already deallocated object to `%useD_func`, an illegal
>>> optimization!
>>>
>>> Lets think a bit more about this example and consider this example at the
>>> language level. Remember that while Swift's deinit are not asychronous, we
>>> do
>>> not allow for user level code to create dependencies from the body of the
>>> destructor into the normal control flow that has called it. This means that
>>> there are two valid results of this code:
>>>
>>> - Operation Sequence 1: No optimization is performed and `%d2` is passed to
>>> `%useD_func`.
>>> - Operation Sequence 2: We shorten the lifetime of `%c` before `%useD_func`
>>> and
>>> a different instance of `$D` is passed into `%useD_func`.
>>>
>>> The fact that 1 occurs without optimization is just as a result of an
>>> implementation detail of SILGen. 2 is also a valid sequence of operations.
>>>
>>> Given that:
>>>
>>> 1. As a principle, the optimizer does not consider such dependencies to
>>> avoid
>>> being overly conservative.
>>> 2. We provide constructs to ensure appropriate lifetimes via the usage of
>>> constructs such as fix_lifetime.
>>>
>>> We need to figure out how to express our optimization such that 2
>>> happens. Remember that one of the optimizations that we performed at the
>>> beginning was to move `(6)` over `(7')`, i.e., transform this:
>>>
>>> %d = alloc_ref $D
>>> %global_d_addr = global_addr GLOBAL_D : $D
>>> %d = load %global_d_addr : $*D (5)
>>> strong_retain %d : $D (6)
>>>
>>> // Call the user functions passing in the instances that we loaded
>>> from the globals.
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()
>>> (7')
>>>
>>> into:
>>>
>>> %global_d_addr = global_addr GLOBAL_D : $D
>>> %d2 = load %global_d_addr : $*D (5)
>>>
>>> // Call the user functions passing in the instances that we loaded
>>> from the globals.
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()
>>> (7')
>>> strong_retain %d2 : $D (6)
>>>
>>> This transformation in Swift corresponds to transforming:
>>>
>>> let d = GLOBAL_D
>>> useC(c)
>>>
>>> to:
>>>
>>> let d_raw = load_d_value(GLOBAL_D)
>>> useC(c)
>>> let d = take_ownership_of_d(d_raw)
>>>
>>> This is clearly an instance where we have moved a side-effect in between the
>>> loading of the data and the taking ownership of such data, that is before
>>> the
>>> `let` is fully initialized. What if instead of just moving the retain, we
>>> moved
>>> the entire let statement? This would then result in the following swift
>>> code:
>>>
>>> useC(c)
>>> let d = GLOBAL_D
>>>
>>> and would correspond to the following SIL snippet:
>>>
>>> %global_d_addr = global_addr GLOBAL_D : $D
>>>
>>> // Call the user functions passing in the instances that we loaded
>>> from the globals.
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()
>>> (7')
>>> %d2 = load %global_d_addr : $*D
>>> (5)
>>> strong_retain %d2 : $D
>>> (6)
>>>
>>> Moving the load with the strong_retain to ensure that the full
>>> initialization is
>>> performed even after code motion causes our SIL to look as follows:
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>>
>>> strong_retain %c : $C
>>> (4')
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c) : $@convention(thin) (@owned C) -> ()
>>> (7')
>>>
>>> %d2 = load %global_d : $*D
>>> (5)
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>>
>>> strong_release %c : $C
>>> (10)
>>> }
>>>
>>> Giving us the exact result that we want: Operation Sequence 2!
>>>
>>> ### Defining load [copy]
>>>
>>> Given that we wish the load, store to be tightly coupled together, it is
>>> natural
>>> to express this operation as a `load [copy]` instruction. Lets define the
>>> `load
>>> [copy]` instruction as follows:
>>>
>>> %1 = load [copy] %0 : $*C
>>>
>>> =>
>>>
>>> %1 = load %0 : $*C
>>> retain_value %1 : $C
>>>
>>> Now lets transform our initial example to use this instruction:
>>>
>>> Notice how now if we move `(7)` over `(3)` and `(6)` now, we get the
>>> following SIL:
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>> (2)
>>>
>>> %c2 = load [copy] %global_c : $*C
>>> (3)
>>> %d2 = load [copy] %global_d : $*D
>>> (5)
>>>
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()
>>> (7)
>>>
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>>
>>> strong_release %d : $D
>>> (9)
>>> strong_release %c : $C
>>> (10)
>>> }
>>>
>>> We then perform the previous code motion:
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>> (2)
>>>
>>> %c2 = load [copy] %global_c : $*C
>>> (3)
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()
>>> (7)
>>> strong_release %d : $D
>>> (9)
>>>
>>> %d2 = load [copy] %global_d : $*D
>>> (5)
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>> strong_release %c : $C
>>> (10)
>>> }
>>>
>>> We then would like to eliminate `(9)` and `(10)` by pairing them with `(3)`
>>> and
>>> `(8)`. Can we still do so? One way we could do this is by introducing the
>>> `[take]` flag. The `[take]` flag on a `load [take]` says that one is
>>> semantically loading a value from a memory location and are taking
>>> ownership of
>>> the value thus eliding the retain. In terms of SIL this flag is defined as:
>>>
>>> %x = load [take] %x_ptr : $*C
>>>
>>> =>
>>>
>>> %x = load %x_ptr : $*C
>>>
>>> Why do we care about having such a `load [take]` instruction when we could
>>> just
>>> use a `load`? The reason why is that a normal `load` has unsafe unowned
>>> ownership (i.e. it has no implications on ownership). We would like for
>>> memory
>>> that has non-trivial type to only be able to be loaded via instructions that
>>> maintain said ownership. We will allow for casting to trivial types as
>>> usual to
>>> provide such access if it is required.
>>>
>>> Thus we have achieved the desired result:
>>>
>>> sil @run : $@convention(thin) () -> () {
>>> bb0:
>>> %c = alloc_ref $C
>>> %global_c = global_addr @GLOBAL_C : $*C
>>> strong_retain %c : $C
>>> store %c to %global_c : $*C
>>> (1)
>>>
>>> %d = alloc_ref $D
>>> %global_d = global_addr @GLOBAL_D : $*D
>>> strong_retain %d : $D
>>> store %d to %global_d : $*D
>>> (2)
>>>
>>> %c2 = load [take] %global_c : $*C
>>> (3)
>>> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()
>>> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> ()
>>> (7)
>>>
>>> %d2 = load [take] %global_d : $*D
>>> (5)
>>> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()
>>> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> ()
>>> (8)
>>> }
>>>
>>>
>>>> On Oct 6, 2016, at 3:03 PM, John McCall <[email protected]
>>>> <mailto:[email protected]>> wrote:
>>>>
>>>>> On Oct 5, 2016, at 4:48 PM, Michael Gottesman <[email protected]
>>>>> <mailto:[email protected]>> wrote:
>>>>>> On Oct 5, 2016, at 4:40 PM, Michael Gottesman via swift-dev
>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Oct 4, 2016, at 1:04 PM, John McCall <[email protected]
>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Sep 30, 2016, at 11:54 PM, Michael Gottesman via swift-dev
>>>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>>>>
>>>>>>>> The document attached below contains the first "Semantic ARC" mini
>>>>>>>> proposal: the High Level ARC Memory Operations Proposal.
>>>>>>>>
>>>>>>>> An html rendered version of this markdown document is available at the
>>>>>>>> following URL:
>>>>>>>>
>>>>>>>> https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html
>>>>>>>>
>>>>>>>> <https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html>
>>>>>>>>
>>>>>>>> ----
>>>>>>>>
>>>>>>>> # Summary
>>>>>>>>
>>>>>>>> This document proposes:
>>>>>>>>
>>>>>>>> 1. adding the `load_strong`, `store_strong` instructions to SIL. These
>>>>>>>> can only
>>>>>>>> be used with memory locations of `non-trivial` type.
>>>>>>>
>>>>>>> I would really like to avoid using the word "strong" here. Under the
>>>>>>> current proposal, these instructions will be usable with arbitrary
>>>>>>> non-trivial types, not just primitive class references. Even if you
>>>>>>> think of an aggregate that happens to contain one or more strong
>>>>>>> references as some sort of aggregate strong reference (which is
>>>>>>> questionable but not completely absurd), we already have loadable
>>>>>>> non-strong class references that this operation would be usable with,
>>>>>>> like native unowned references. "load_strong %0 : $*@sil_unowned T" as
>>>>>>> an operation yielding a scalar "@sil_unowned T" is ridiculous, and it
>>>>>>> will only get more ridiculous when we eventually allow this operation
>>>>>>> to work with types that are currently address-only, like weak
>>>>>>> references.
>>>>>>>
>>>>>>> Brainstorming:
>>>>>>>
>>>>>>> Something like load_copy and store_copy would be a bit unfortunate,
>>>>>>> since store_copy doesn't actually copy the source operand and we want
>>>>>>> to have a load_copy [take].
>>>>>>>
>>>>>>> load_value and store_value seem excessively generic. It's not like
>>>>>>> non-trivial types aren't values.
>>>>>>>
>>>>>>> One question that comes to mind: do we actually need new instructions
>>>>>>> here other than for staging purposes? We don't actually need new
>>>>>>> instructions for pseudo-linear SIL to work; we just need to say that we
>>>>>>> only enforce pseudo-linearity for non-trivial types.
>>>>>>>
>>>>>>> If we just want the instruction to be explicit about ownership so that
>>>>>>> we can easily distinguish these cases, we can make the rule always
>>>>>>> explicit, e.g.:
>>>>>>> load [take] %0 : $*MyClass
>>>>>>> load [copy] %0 : $*MyClass
>>>>>>> load [trivial] %0 : $*Int
>>>>>>>
>>>>>>> store %0 to [initialization] %1 : $*MyClass
>>>>>>> store %0 to [assignment] %1 : $*MyClass
>>>>>>> store %0 to [trivial] %1 : $*Int
>>>>>>>
>>>>>>> John.
>>>>>>
>>>>>> The reason why I originally suggested to go the load_strong route is
>>>>>> that we already have load_weak, load_unowned instructions. If I could
>>>>>> add a load_strong instruction, then it would make sense to assign an
>>>>>> engineer to do a pass over all 3 of these instructions and combine them
>>>>>> into 1 load instruction. That is, first transform into a form amenable
>>>>>> for canonicalization and then canonicalize all at once.
>>>>>>
>>>>>> As you pointed out, both load_unowned and load_weak involve
>>>>>> representation changes in type (for instance the change of weak pointers
>>>>>> to Optional<T>). Such a change would be against the "spirit" of a load
>>>>>> instruction to perform such representation changes versus ownership
>>>>>> changes.
>>>>>>
>>>>>> In terms of the properties that we actually want here, what is important
>>>>>> is that we can verify that no non-trivially typed values are loaded in
>>>>>> an unsafe unowned manner. That can be done also with ownership flags on
>>>>>> load/store.
>>>>>>
>>>>>> Does this sound reasonable:
>>>>>>
>>>>>> 1. We introduce two enums that define memory ownership changes, one for
>>>>>> load and one for store. Both of these enums will contain a [trivial]
>>>>>> ownership.
>>>>>> 2. We enforce in the verifier that non-trivial types must have a
>>>>>> non-trivial ownership modifier on any memory operations that they are
>>>>>> involved in.
>>>>>
>>>>> Sorry for not being explicit. I will not add new instructions, just
>>>>> modifiers. Assuming that this is agreeable to you, I am going to prepare
>>>>> a quick additional version of the proposal document.
>>>>
>>>> That sounds great, thanks.
>>>>
>>>> John.
>
_______________________________________________
swift-dev mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-dev