Re: [swift-dev] copy-on-write proposal

Erik Eckstein via swift-dev Thu, 20 Oct 2016 10:00:54 -0700

> On Oct 20, 2016, at 8:51 AM, Dave Abrahams <dabrah...@apple.com> wrote:
> 
> We might want to leave some room in the design for a shared atomic cache 
> reference to live in the buffer, FWIW. It would have to be mutable even when 
> the buffer was multiply-referenced


Should be no problem with an attribute on that field. Like ‘mutable' in C++.
> 
> Sent from my moss-covered three-handled family gradunza
> 
> On Oct 20, 2016, at 8:41 AM, Erik Eckstein <eeckst...@apple.com 
> <mailto:eeckst...@apple.com>> wrote:
> 
>> 
>>> On Oct 19, 2016, at 6:36 PM, Andrew Trick via swift-dev 
>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
>>> 
>>>> 
>>>> On Oct 19, 2016, at 10:13 AM, Dave Abrahams via swift-dev 
>>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
>>>> 
>>>> 
>>>> on Tue Oct 18 2016, Erik Eckstein <swift-dev-AT-swift.org 
>>>> <http://swift-dev-at-swift.org/>> wrote:
>>>> 
>>>>>> On Oct 17, 2016, at 10:21 AM, Dave Abrahams <dabrah...@apple.com 
>>>>>> <mailto:dabrah...@apple.com>> wrote:
>>>>>> 
>>>>>> 
>>>>>> on Mon Oct 17 2016, Erik Eckstein <eeckstein-AT-apple.com 
>>>>>> <http://eeckstein-at-apple.com/>> wrote:
>>>>>> 
>>>>> 
>>>>>>> On Oct 16, 2016, at 2:05 PM, Dave Abrahams via swift-dev 
>>>>>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
>>>>>>> 
>>>>>>>> on Thu Oct 13 2016, Joe Groff <swift-dev-AT-swift.org 
>>>>>>>> <http://swift-dev-at-swift.org/> <http://swift-dev-at-swift.org/ 
>>>>>>>> <http://swift-dev-at-swift.org/>>> wrote:
>>>>>>>> 
>>>>>>>>>> On Oct 11, 2016, at 4:48 PM, Erik Eckstein via swift-dev 
>>>>>>>>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
>>>>>>>>>> 
>>>>>>>>>> This is a proposal for representing copy-on-write buffers in
>>>>>>>>>> SIL. Actually it’s still a draft for a proposal. It also heavily
>>>>>>>>>> depends on how we move forward with SIL ownership.
>>>>>>>>>> <CopyOnWrite.rst>
>>>>>>>>>> If you have any comments, please let me know.
>>>>>>>>> 
>>>>>>>>> The SIL-level design seems sensible to me at a glance. At the language
>>>>>>>>> level, I think it would make more sense to treat this as an attribute
>>>>>>>>> on class types rather than on properties in structs using the class. I
>>>>>>>>> don't think many people reuse class definitions as both shared
>>>>>>>>> reference types and as value type payloads, 
>>>>>>>> 
>>>>>>>> Foundation does, or would if they could.
>>>>>>>> 
>>>>>>>>> but beyond that, I think that making it an attribute of classes would
>>>>>>>>> put us into a better position to leverage the borrow model to enforce
>>>>>>>>> the "mutable-only-when-unique" aspect of COW implementations. John
>>>>>>>>> alluded to this in the "SIL address types and borrowing" thread:
>>>>>>>>> 
>>>>>>>>>> I wonder if it would make more sense to make copy-on-write buffer
>>>>>>>>>> references a move-only type, so that as long as you were just
>>>>>>>>>> working with the raw reference (as opposed to the CoW aggregate,
>>>>>>>>>> which would remain copyable) it wouldn't get implicitly copied
>>>>>>>>>> anymore.  You could have mutable and immutable buffer reference
>>>>>>>>>> types, both move-only, and there could be a consuming checkUnique
>>>>>>>>>> operation on the immutable one that, I dunno, returned an Either of
>>>>>>>>>> the mutable and immutable versions.
>>>>>>>>>> 
>>>>>>>>>> For CoW aggregates, you'd need some @copied attribute on the field
>>>>>>>>>> to make sure that the CoW attribute was still copyable.  Within the
>>>>>>>>>> implementation of the type, though, you would be projecting out the
>>>>>>>>>> reference immediately, and thereafter you'd be certain that you were
>>>>>>>>>> borrowing / moving it around as appropriate.
>>>>>>>>> 
>>>>>>>>> If 'copy-on-write' were a trait on classes, then we could distinguish
>>>>>>>>> unique and nonunique references to the class. A unique reference would
>>>>>>>>> act like a move-only type to prevent accidental loss of uniqueness. 
>>>>>>>> 
>>>>>>>> +1
>>>>>>>> 
>>>>>>>>> We can also allow a copy-on-write class to have "mutating" methods,
>>>>>>>>> and only allow mutation on unique references. It seems to me like,
>>>>>>>>> exploring this direction, we could also come up with a way for the
>>>>>>>>> high-level value-semantics operations on the struct to statically
>>>>>>>>> indicate which methods are known to leave the value's buffers in a
>>>>>>>>> unique state, or which return values that are uniquely owned, which
>>>>>>>>> would give the optimizer more ability to avoid uniqueness checks
>>>>>>>>> across calls without relying on inlining and IPO.
>>>>>>>> 
>>>>>>>> That's pretty cool.  However, I think there's nothing to prevent any
>>>>>>>> mutating method from storing a copy of self in a global, so I think 
>>>>>>>> we'd
>>>>>>>> need some participation from the programmer (either an agreement not to
>>>>>>>> do that, or an explicit claim of uniqueness on exit) in order to
>>>>>>>> identify operations that create/preserve uniqueness.
>>>>>>> 
>>>>>>> If a mutating reference (like self in a mutating method) is move-only
>>>>>>> then you would not be able to “copy” it to a global.
>>>>>> 
>>>>>> Yes, a reference to a move-only type would work for this purpose.
>>>>>> 
>>>>>> 
>>>>>>>> On Oct 16, 2016, at 2:01 PM, Dave Abrahams via swift-dev 
>>>>>>>> <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> on Tue Oct 11 2016, Erik Eckstein <swift-dev-AT-swift.org 
>>>>>>>> <http://swift-dev-at-swift.org/>> wrote:
>>>>>>>> 
>>>>>>>>> This is a proposal for representing copy-on-write buffers in
>>>>>>>>> SIL. Actually it’s still a draft for a proposal. It also heavily
>>>>>>>>> depends on how we move forward with SIL ownership.
>>>>>>>>> 
>>>>>>>>> :orphan:
>>>>>>>>> 
>>>>>>>>> .. highlight:: sil
>>>>>>>>> 
>>>>>>>>> ===================================
>>>>>>>>> Copy-On-Write Representation in SIL
>>>>>>>>> ===================================
>>>>>>>>> 
>>>>>>>>> .. contents::
>>>>>>>>> 
>>>>>>>>> Overview
>>>>>>>>> ========
>>>>>>>>> 
>>>>>>>>> This document proposes:
>>>>>>>>> 
>>>>>>>>> - An ownership attribute to define copy-on-write (COW) buffers in 
>>>>>>>>> Swift data
>>>>>>>>> types.
>>>>>>>>> 
>>>>>>>>> - A representation of COW buffers in SIL so that optimizations can 
>>>>>>>>> take benefit
>>>>>>>>> of it.
>>>>>>>>> 
>>>>>>>>> The basic idea is to enable the SIL optimizer to reason about COW 
>>>>>>>>> data types
>>>>>>>>> in the same way as a programmer can do.
>>>>>>>>> This means: a COW buffer can only be modified by its owning SIL 
>>>>>>>>> value, because
>>>>>>>>> either it's uniquely referenced or the buffer is copied before 
>>>>>>>>> modified.
>>>>>>>>> 
>>>>>>>>> .. note::
>>>>>>>>>  In the following the term "buffer" refers to a Swift heap object.
>>>>>>>>>  It can be any heap object, not necessarily a “buffer” with e.g. 
>>>>>>>>> tail-allocated elements.
>>>>>>>>> 
>>>>>>>>> COW Types
>>>>>>>>> =========
>>>>>>>>> 
>>>>>>>>> The basic structure of COW data types can be simplified as follows::
>>>>>>>>> 
>>>>>>>>>  class COWBuffer {
>>>>>>>>>    var someData: Int
>>>>>>>>>    ...
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>>  struct COWType {
>>>>>>>>>    var b : COWBuffer
>>>>>>>>> 
>>>>>>>>>    mutating func change_it() {
>>>>>>>>>      if (!isUniquelyReferenced(b)) {
>>>>>>>>>        b = copy_buffer(b)
>>>>>>>>>      }
>>>>>>>>>      b.someData = ...
>>>>>>>>>    }
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> Currently the COW behavior of such types is just defined by their 
>>>>>>>>> implementation.
>>>>>>>>> But there is no representation of this special behavior in the SIL.
>>>>>>>>> So the SIL optimizer has no clue about it and cannot take advantage 
>>>>>>>>> of it.
>>>>>>>>> 
>>>>>>>>> For example::
>>>>>>>>> 
>>>>>>>>>  func foo(arr : [Int]) {
>>>>>>>>>    x = arr[0]
>>>>>>>>>    opaque_function()
>>>>>>>>>    y = arr[0] // can RLE replace this with y = x?
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> If opaque_function() wants to change the contents of the array buffer 
>>>>>>>>> it first
>>>>>>>>> has to copy it. 
>>>>>>>> 
>>>>>>>> ...or determine that it's uniquely-referenced.
>>>>>>> 
>>>>>>> In this specific example, if opqaue_function holds a reference to arr’s 
>>>>>>> buffer, the buffer is not
>>>>>>> uniquely-referenced.
>>>>>> 
>>>>>> Right.
>>>>>> 
>>>>>>>> 
>>>>>>>>> But the optimizer does not know it so it has to conservatively assume
>>>>>>>>> that opaque_function() will write to the location of arr[0].
>>>>>>>>> 
>>>>>>>>> Copy-on-write Ownership Attribute
>>>>>>>>> =================================
>>>>>>>>> 
>>>>>>>>> This section proposes an ownership attribute to define a 
>>>>>>>>> copy-on-write buffer.
>>>>>>>>> 
>>>>>>>>> Swift Syntax
>>>>>>>>> ------------
>>>>>>>>> 
>>>>>>>>> A COW buffer reference can be defined with a new ownership attribute 
>>>>>>>>> for the
>>>>>>>>> buffer variable declaration (similar to “weak” and “unowned”)::
>>>>>>>>> 
>>>>>>>>>  struct COWType {
>>>>>>>>>    copy_on_write var b : COWBuffer
>>>>>>>>> 
>>>>>>>>>    // ...
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> The ``copy_on_write`` attribute is purely used for optimization 
>>>>>>>>> purposes.
>>>>>>>>> It does not change the semantics of the program.
>>>>>>>> 
>>>>>>>> Presumably, it changes what code you can execute on `b` without 
>>>>>>>> invoking
>>>>>>>> traps or undefined behavior.  Otherwise, the optimizer wouldn't be able
>>>>>>>> to do anything differently to take advantage of the annotation.
>>>>>>> 
>>>>>>> That’s true.
>>>>>>> 
>>>>>>>> What are the rules for writing code that uses `copy_on_write`?
>>>>>>> 
>>>>>>> See below ("The rules for using ``copy_on_write`` and the built-ins 
>>>>>>> are:”)
>>>>>> 
>>>>>> Yeah, I got there, eventually.  But just saying “doesn't change
>>>>>> semantics” at this point in the proposal leaves a gap, because it does
>>>>>> change semantic *requirements*.  You should mention that.
>>>>>> 
>>>>>>>>> .. note::
>>>>>>>>> 
>>>>>>>>> “copy_on_write” is a  working title. TODO: decide on the name.
>>>>>>>>> Maybe it should be a @-attribute, like @copy_on_write?
>>>>>>>>> Another question is if we should open this attribute for the public 
>>>>>>>>> or just
>>>>>>>>> use it internally in the library, because violating the implied rules
>>>>>>>>> (see below) could break memory safety.
>>>>>>>>> 
>>>>>>>>> Implementation
>>>>>>>>> --------------
>>>>>>>>> 
>>>>>>>>> The ``copy_on_write`` references can be represented in the AST as a 
>>>>>>>>> special
>>>>>>>>> ``StorageType``, just like how ``unowned`` and ``weak`` is 
>>>>>>>>> represented.
>>>>>>>>> The canonical type of a ``CopyOnWriteStorageType`` would be the 
>>>>>>>>> referenced
>>>>>>>>> buffer class type.
>>>>>>>>> 
>>>>>>>>> In SIL the buffer reference will have type::
>>>>>>>>> 
>>>>>>>>>  $@sil_cow COWBuffer
>>>>>>>>> 
>>>>>>>>> where ``COWBuffer`` is the type of the referenced heap object.
>>>>>>>>> 
>>>>>>>>> Two conversion instructions are needed to convert from a ``@sil_cow`` 
>>>>>>>>> reference
>>>>>>>>> type to a regular reference type::
>>>>>>>>> 
>>>>>>>>>  cow_to_ref
>>>>>>>>>  ref_to_cow
>>>>>>>>> 
>>>>>>>>> Again, this is similar to ``ref_to_unowned`` and ``unowned_to_ref``.
>>>>>>>>> 
>>>>>>>>> For example the SIL code for::
>>>>>>>>> 
>>>>>>>>>  var c: COWType
>>>>>>>>>  let x = c.b.someData
>>>>>>>>> 
>>>>>>>>> would be::
>>>>>>>>> 
>>>>>>>>>  %1 = struct_extract %1 : COWType, #COWType.b
>>>>>>>>>  %2 = cow_to_ref %1 : $@sil_cow COWBuffer
>>>>>>>>>  %3 = ref_element_addr %2 : $COWBuffer, #COWBuffer.someData
>>>>>>>>>  %4 = load %3 : $*Int
>>>>>>>>> 
>>>>>>>>> The ``ref_to_cow`` instruction is needed to store a new buffer 
>>>>>>>>> reference into a
>>>>>>>>> COW type.
>>>>>>>>> 
>>>>>>>>> COW Buffers and the Optimizer
>>>>>>>>> =============================
>>>>>>>>> 
>>>>>>>>> A reference to a COW buffer gives the optimizer additional 
>>>>>>>>> information:
>>>>>>>>> 
>>>>>>>>> *A buffer, referenced by a @sil_cow reference is considered to be 
>>>>>>>>> immutable
>>>>>>>>> during the lifetime of the reference.*
>>>>>>>> 
>>>>>>>> This seems like much too broad a rule to allow inplace mutations of
>>>>>>>> uniquely referenced buffers.
>>>>>>> 
>>>>>>> The point is that all mutations must be guarded by an is_unique, which
>>>>>>> takes the _address_ of the buffer reference as argument.
>>>>>>> And the optimizer considers this instruction as a potential write to 
>>>>>>> the buffer reference.
>>>>>>> The effect is that the lifetime of a buffer reference (as a SIL value)
>>>>>>> will not outlive a is_unique - regardless if this is inside a called
>>>>>>> function or inlined.
>>>>>> 
>>>>>> I don't see how that allows me to mutate a uniquely referenced buffer 
>>>>>> held
>>>>>> in a @sil_cow reference, given what you wrote above.
>>>>> 
>>>>> You would not be able to get a reference to a mutable buffer by
>>>>> reading the COW type’s @sil_cow field.  Instead you would only get
>>>>> such a reference as a result of the is_unique instruction/builtin. Or,
>>>>> of course, by creating a new buffer.
>>>>> 
>>>>> I’m not sure if this was the question, though.
>>>> 
>>>> I think it just comes down to precise phrasing.  AFAICT, what you really
>>>> mean to say is something like
>>>> 
>>>>  A buffer cannot be directly mutated through a @sil_cow reference;
>>>>  instead one must mutate it indirectly via the result of is_unique or
>>>>  start_unique.
>> 
>> Exactly, that’s what I wanted to say.
>> 
>>>> 
>>>> Saying that the buffer is “considered to be immmutable during the
>>>> lifetime of the reference” could be taken to mean that the compiler will
>>>> assume no mutations of the buffer can occur while the reference exists.
>>>> IIUC you are not planning to formally end the reference's lifetime at
>>>> the moment is_unique/start_unique returns.
>>> 
>>> To clarify: I proposed an alternate approach in which the @sil_cow 
>>> reference is only mutable during the Array’s @inout scope—to be 
>>> automatically enforced by the compiler once @inout scopes are enforced. But 
>>> the text in question is not referring to that approach, so your comments 
>>> are on target.
>> 
>> After thinking about Joe’s suggestion (having the cow attribute on the class 
>> type and make a reference to that type move-only), I’m more inclined to go 
>> with the isUnique builtin. If such a reference can only be returned by 
>> isUnique, it is really guaranteed that only a uniquely referenced buffer can 
>> be mutated. With the inout approach, the programmer is not forced to make 
>> the uniqueness check before modifying the buffer.
>> 
>>> -Andy 
>>> 
>>>>> 
>>>>> Plus: we will have an explicit conversion instruction (start_unique) to 
>>>>> convert an immutable
>>>>> reference to a mutable referece.
>>>>> A SIL optimization can replace an is_unique with this instruction if it 
>>>>> can prove that the reference
>>>>> is already unique at that point.
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> Unless you mean the reference is
>>>>>>>> immutable, rather than the storage being referred to by it.
>>>>>>>> 
>>>>>>>>> This means any address derived from a ``cow_to_ref`` instruction can 
>>>>>>>>> be
>>>>>>>>> considered to point to immutable memory.
>>>>>>>>> 
>>>>>>>>> Some examples of optimizations which will benefit from copy-on-write
>>>>>>>>> representation in SIL:
>>>>>>>>> 
>>>>>>>>> - Redundant load elimination
>>>>>>>>> 
>>>>>>>>> RLE can assume that opaque code does not modify a COW buffer.
>>>>>>>> 
>>>>>>>> How do you distinguish “opaque code” from “code that is meant to
>>>>>>>> modify the buffer and might do so in place if it's 
>>>>>>>> uniquely-referenced?”
>>>>>>> 
>>>>>>> Again, the is_unique which takes the address of the reference, will
>>>>>>> guarantee that during the lifetime of a buffer there are no
>>>>>>> modifications of the buffer.
>>>>>> 
>>>>>> Again, that sounds like it rules out inplace modification of uniquely
>>>>>> referenced buffers.
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>>> Example::
>>>>>>>>> 
>>>>>>>>>    %2 = cow_to_ref %1 : $@sil_cow COWBuffer
>>>>>>>>>    %3 = ref_element_addr %2 : $COWBuffer, #someData
>>>>>>>>>    %4 = load %3 : $*Int
>>>>>>>>>    %5 = apply %foo()                        // Cannot overwrite 
>>>>>>>>> memory location %3
>>>>>>>>>    %6 = load %3 : $*Int                     // Can be replaced by %4
>>>>>>>>> 
>>>>>>>>> Currently we do some ad-hoc optimizations for array, based on 
>>>>>>>>> semantics,
>>>>>>>>> like array count propagation. These hacks would not be needed
>>>>>>>>> anymore.
>>>>>>>> 
>>>>>>>> W0000000000000000000000t.
>>>>>>>> 
>>>>>>>>> Note that it’s not required to check if a ``cow_to_ref`` reference 
>>>>>>>>> (or a
>>>>>>>>> projected address) escapes. Even if it escapes, it will reference 
>>>>>>>>> immutable
>>>>>>>>> memory.
>>>>>>>>> 
>>>>>>>>> - CSE, loop hoisting
>>>>>>>>> 
>>>>>>>>> Similar to RLE: the optimizer can assume that opaque code cannot 
>>>>>>>>> modify a
>>>>>>>>> COW buffer
>>>>>>>> 
>>>>>>>> Same question here as above, then.
>>>>>>>>> 
>>>>>>>>> - ARC optimization
>>>>>>>>> 
>>>>>>>>> Knowing that some opaque code cannot overwrite a reference in the COW 
>>>>>>>>> buffer
>>>>>>>>> can remove retain/release pairs across such code::
>>>>>>>>> 
>>>>>>>>>    %2 = cow_to_ref %1 : $@sil_cow COWBuffer
>>>>>>>>>    %3 = ref_element_addr %2 : $COWBuffer, #someRef
>>>>>>>>>    %4 = load_strong %3 : $*MyClass          // Can do a load_strong 
>>>>>>>>> [guarantee]
>>>>>>>>>    %5 = apply %foo()                        // Cannot overwrite 
>>>>>>>>> someRef and dealloc the object
>>>>>>>>>    // Use %4
>>>>>>>>>    destroy_value %4 : $MyClass
>>>>>>>>> 
>>>>>>>>> Scoping instructions
>>>>>>>>> --------------------
>>>>>>>>> 
>>>>>>>>> To let the optimizer reason about the immutability of the COW buffer, 
>>>>>>>>> it is
>>>>>>>>> important to *bind* the lifetime of the buffer content to the 
>>>>>>>>> lifetime of the
>>>>>>>>> buffer reference. For example::
>>>>>>>>> 
>>>>>>>>>  %b1 = load %baddr : $@sil_cow COWBuffer  // load the buffer reference
>>>>>>>>>  // load something from %b1
>>>>>>>>>  %a = apply %foo(%baddr : $@sil_cow COWBuffer)
>>>>>>>>>  %b2 = load %baddr : $@sil_cow COWBuffer  // load the buffer 
>>>>>>>>> reference again
>>>>>>>>>  // load something from %b2
>>>>>>>>> 
>>>>>>>>> The question is: can RLE forward the load of the buffer reference and 
>>>>>>>>> replace
>>>>>>>>> ``%b2`` with ``%b1``? It must not be able to do so if ``foo()`` 
>>>>>>>>> modifies the
>>>>>>>>> buffer.
>>>>>>>>> 
>>>>>>>>> To enforce this restriction, the scope of any buffer modification 
>>>>>>>>> must be
>>>>>>>>> enclosed in a pair of SIL instructions. Those instructions define the 
>>>>>>>>> scope
>>>>>>>>> of the mutation. Both instructions take the *address* of the buffer
>>>>>>>>> reference as operand and act as a potential write to the buffer 
>>>>>>>>> reference. 
>>>>>>>>> 
>>>>>>>>> The purpose of the scoping instructions is to strictly separate the 
>>>>>>>>> liferanges
>>>>>>>>> of references to an immutable buffer and references to the mutable 
>>>>>>>>> buffer.
>>>>>>>> 
>>>>>>>> Looks reasonable.
>>>>>>>> 
>>>>>>>>> The following example shows why the scoping instructions 
>>>>>>>>> (specifically the
>>>>>>>>> end-of-scope instruction) are required to prevent loop-hoisting from
>>>>>>>>> interleaving mutable and immutable liferanges::
>>>>>>>>> 
>>>>>>>>>  // there should be a begin-of-scope %baddr
>>>>>>>>>  %mut_b = load %baddr
>>>>>>>>>  store %x to %mut_b    // modification of the buffer
>>>>>>>>>  // there should be a end-of-scope %baddr
>>>>>>>>> 
>>>>>>>>>  loop {
>>>>>>>>>    %b = load %baddr
>>>>>>>>>    %y = load %b        // load from the buffer
>>>>>>>>>    ...
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> If there is no end-of-scope instruction, loop hoisting could do::
>>>>>>>>> 
>>>>>>>>>  %mut_b = load %baddr
>>>>>>>>>  %b = load %baddr        // moved out of the loop
>>>>>>>>>  store %x to %mut_b
>>>>>>>>> 
>>>>>>>>>  loop {
>>>>>>>>>    %y = load %b
>>>>>>>>>    ...
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> Now the optimizer assumes that ``%b`` references an immutable buffer, 
>>>>>>>>> so it could
>>>>>>>>> also hoist the load::
>>>>>>>>> 
>>>>>>>>>  %mut_b = load %baddr
>>>>>>>>>  %b = load %baddr
>>>>>>>>>  %y = load %b          // Wrong! Will be overwritten by the following 
>>>>>>>>> store
>>>>>>>>>  store %x to %mut_b
>>>>>>>>> 
>>>>>>>>>  loop {
>>>>>>>>>    ...
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> The following sections describe two alternatives to implement the 
>>>>>>>>> scoping.
>>>>>>>>> 
>>>>>>>>> Scoping Alternative 1: Explicit Built-ins
>>>>>>>>> -----------------------------------------
>>>>>>>>> 
>>>>>>>>> SIL instructions
>>>>>>>>> ^^^^^^^^^^^^^^^^
>>>>>>>>> 
>>>>>>>>> The existing ``is_unique`` instruction is changed to a terminator 
>>>>>>>>> instruction::
>>>>>>>>> 
>>>>>>>>>  bb0:
>>>>>>>>>    is_unique_addr_br %0 : $*@sil_cow COWBuffer, bb1, bb2  // %0 is 
>>>>>>>>> the address of the COWBuffer reference
>>>>>>>>>  bb1(%1 : $COWBuffer): // the true-block. The payload %1 is the 
>>>>>>>>> unique reference. Physically identical to "load %0”
>>>>>>>>>    // usually empty
>>>>>>>>>    br bb3(%1 : $COWBuffer)
>>>>>>>>>  bb2:                  // the false-block
>>>>>>>>>    // usually contains:
>>>>>>>>>    %2 = apply %copy_buffer
>>>>>>>>>    %3 = cow_to_ref %2
>>>>>>>>>    store_strong %3 to %0 : $*@sil_cow COWBuffer
>>>>>>>>>    br bb3(%2 : $COWBuffer)
>>>>>>>>>  bb3(%4 : $COWBuffer):
>>>>>>>>>    // Modify the buffer referenced by %4
>>>>>>>>>    // ...
>>>>>>>>> 
>>>>>>>>> The end-of-scope instruction is::
>>>>>>>>> 
>>>>>>>>>  end_unique_addr %0 : $*COWBuffer
>>>>>>>>> 
>>>>>>>>> It is important that the references to the unique buffers (``%1``, 
>>>>>>>>> ``%2``) must
>>>>>>>>> not outlive ``end_unique_addr``. In most cases this can be check by 
>>>>>>>>> the SIL
>>>>>>>>> verifier.
>>>>>>>>> 
>>>>>>>>> The two instructions must be paired properly but not necessarily in 
>>>>>>>>> the
>>>>>>>>> same function.
>>>>>>>>> 
>>>>>>>>> The purpose of an ``is_unique_addr_br`` - ``end_unique_addr`` pair is 
>>>>>>>>> to
>>>>>>>>> separate the lifetimes of mutable and immutable accesses to the COW 
>>>>>>>>> buffer.
>>>>>>>>> Both instructions take an address to the COW buffer reference and are
>>>>>>>>> considered as potential stores to the reference.
>>>>>>>>> This makes sure that the SIL optimizer cannot mix-up buffer reference 
>>>>>>>>> lifetimes
>>>>>>>>> across these instructions.
>>>>>>>>> For example, RLE cannot combine two buffer loads which are 
>>>>>>>>> interleaved with
>>>>>>>>> a ``is_unique_addr_br``::
>>>>>>>>> 
>>>>>>>>>  %1 = load_strong %0 : $*@sil_cow COWBuffer
>>>>>>>>>  // do something with %1
>>>>>>>>>  …
>>>>>>>>>  is_unique_addr_br %0 : $*@sil_cow COWBuffer
>>>>>>>>>  …
>>>>>>>>>  %2 = load_strong %0 : $*@sil_cow COWBuffer // RLE cannot replace 
>>>>>>>>> this with %1
>>>>>>>>> 
>>>>>>>>> Another important thing is that the COW buffer can only be mutated by 
>>>>>>>>> using the
>>>>>>>>> reference of the ``is_unique_addr_br`` true-block argument.
>>>>>>>>> The COW buffer cannot be modified by simply loading/extracting the 
>>>>>>>>> reference
>>>>>>>>> from the COWType.
>>>>>>>>> Example::
>>>>>>>>> 
>>>>>>>>> %1 = load_strong %0 : $*COWBuffer
>>>>>>>>> %2 = cow_to_ref %1 : $@sil_cow COWBuffer
>>>>>>>>> %3 = ref_element_addr %2 : $COWBuffer, #someData
>>>>>>>>> store %7 : $Int to %3 : $*Int            // Violation!
>>>>>>>>> 
>>>>>>>>> Most obvious violations to this constraint can be catched by the 
>>>>>>>>> SILVerifier.
>>>>>>>>> 
>>>>>>>>> The ``_addr`` variants of the instructions also have a non-addr 
>>>>>>>>> counterpart::
>>>>>>>>> 
>>>>>>>>>  is_unique_br %0 : $COWBuffer, bb1, bb2.  // consumes %0 and produces 
>>>>>>>>> the true-block arg as owned
>>>>>>>>> 
>>>>>>>>>  %1 = end_unique %0 : $COWBuffer // consumes %0 and produces %1 as 
>>>>>>>>> owned
>>>>>>>>> 
>>>>>>>>> These instructions are generated by Mem2reg (or a similar 
>>>>>>>>> optimization)
>>>>>>>>> in case the COW value is stored (in a temporary alloc_stack location)
>>>>>>>>> just for the sake of passing an address to ``is_unique_addr_br`` and
>>>>>>>>> ``end_unique_addr``.
>>>>>>>>> For example in the following code, where the COW data is passed 
>>>>>>>>> as-value and
>>>>>>>>> all the mutating functions are inlined::
>>>>>>>>> 
>>>>>>>>>  func foo(arr : [Int], x: Int) {
>>>>>>>>>    arr[0] = 27
>>>>>>>>>    …
>>>>>>>>>    y = arr[x]
>>>>>>>>>    …
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>> Finally it’s probably a good idea to add an instruction for 
>>>>>>>>> converting an
>>>>>>>>> immutable reference to a mutable reference::
>>>>>>>>> 
>>>>>>>>>  %1 = start_unique %0 : $COWBuffer // consumes %0 and produces %1 : 
>>>>>>>>> $COWBuffer as owned
>>>>>>>>> 
>>>>>>>>> which is basically just a simpler representation of the following 
>>>>>>>>> pattern::
>>>>>>>>> 
>>>>>>>>>  bb0:
>>>>>>>>>    is_unique_br %0 : $@sil_cow COWBuffer, bb1, bb2
>>>>>>>>>  bb1(%1 : $COWBuffer):
>>>>>>>>>    … // main control flow continues here
>>>>>>>>>  bb2:
>>>>>>>>>    unreachable
>>>>>>>>> 
>>>>>>>>> An optimizations, which eliminate uniqueness checks, would replace a
>>>>>>>>> ``is_unique_br`` by a ``start_unique``.
>>>>>>>>> 
>>>>>>>>> Built-ins
>>>>>>>>> ^^^^^^^^^
>>>>>>>>> 
>>>>>>>>> A COW type implementor can generate the new instructions by using a 
>>>>>>>>> set of built-ins::
>>>>>>>>> 
>>>>>>>>>  func isUnique<BufferType>(_ buffer: inout BufferType) -> BufferType?
>>>>>>>>>  func endUnique<BufferType>(_ buffer: inout BufferType)  
>>>>>>>>> 
>>>>>>>>> For example::
>>>>>>>>> 
>>>>>>>>>  struct COWType {
>>>>>>>>>    copy_on_write var b : COWBuffer
>>>>>>>>> 
>>>>>>>>>    mutating func makeMutable() -> COWBuffer {
>>>>>>>>>      if let uniqueBuffer = isUnique(&self.b) {
>>>>>>>>>        return uniqueBuffer
>>>>>>>>>      }
>>>>>>>>>      let copiedBuffer = copyBuffer(self.b)
>>>>>>>>>      self.b = copiedBuffer
>>>>>>>>>      return copiedBuffer
>>>>>>>>>    }
>>>>>>>>> 
>>>>>>>>>    mutating func setSomeData(x: Int) {
>>>>>>>>>      let uniqueBuffer = makeMutable()
>>>>>>>>>      uniqueBuffer.someData = x
>>>>>>>>>      endUnique(&self.b)
>>>>>>>>>    }
>>>>>>>>>  }
>>>>>>>> 
>>>>>>>> This seems reasonable, but it also looks like the compiler could do the
>>>>>>>> `endUnique` dance for us based, e.g., on the mutability of methods.  
>>>>>>> 
>>>>>>> I agree, that would be ideal, e.g. the compiler could insert the 
>>>>>>> endUnique at the end of an inout
>>>>>>> scope.
>>>>>>> 
>>>>>>>> 
>>>>>>>>> The ``isUnique`` built-in returns an optional unique buffer reference.
>>>>>>>>> Physically this is the COW buffer which is passed as the inout 
>>>>>>>>> argument.
>>>>>>>>> The result is nil if the buffer is not uniquely referenced.
>>>>>>>>> In this case usually the original buffer is copied and the reference 
>>>>>>>>> to the
>>>>>>>>> copy is written back to the original buffer reference location
>>>>>>>>> (``self.b = copiedBuffer``).
>>>>>>>>> Starting at the point of the write-back, the reference to the copy 
>>>>>>>>> also becomes
>>>>>>>>> a unique buffer reference.
>>>>>>>>> 
>>>>>>>>> The ``isUnique`` built-in is lowered to the ``is_unique_addr_br`` 
>>>>>>>>> pattern which
>>>>>>>>> constructs the Optional in the successor blocks. Using ``isUnique`` 
>>>>>>>>> in an
>>>>>>>>> if-let (as shown above) will end up in two consecutive CFG "diamonds".
>>>>>>>>> Simplify-CFG can combine those into a single ``is_unique_addr_br`` 
>>>>>>>>> diamond.
>>>>>>>>> 
>>>>>>>>> .. note::
>>>>>>>>> This makes the definition of the unique buffer location lifetime a 
>>>>>>>>> little bit
>>>>>>>>> problematic, because the false-branch of ``isUnique`` is not 
>>>>>>>>> equivalent to
>>>>>>>>> the false-branch of the ``is_unique_addr_br`` instruction (before 
>>>>>>>>> SimplifyCFG
>>>>>>>>> can do its job).
>>>>>>>> 
>>>>>>>> I don't know what the implications of these diamonds and the problem
>>>>>>>> described above might be, FWIW.
>>>>>>>> 
>>>>>>>>> The rules for using ``copy_on_write`` and the built-ins are:
>>>>>>>>> 
>>>>>>>>> 1. ``isUnique`` must be paired with ``endUnique``, but not 
>>>>>>>>> necessarily in the
>>>>>>>>> same function.
>>>>>>>>> 
>>>>>>>>> 2. The COW buffer may only be mutated by using the unique buffer 
>>>>>>>>> reference.
>>>>>>>>> 
>>>>>>>>> 3. The COW buffer must not be mutated outside the ``isUnique`` - 
>>>>>>>>> ``endUnique``
>>>>>>>>> pair.
>>>>>>>>> 
>>>>>>>>> 4. During the lifetime of the unique buffer reference, the original 
>>>>>>>>> COW buffer
>>>>>>>>> reference must not be used in any way, e.g. for reading from the 
>>>>>>>>> buffer.
>>>>>>>>> 
>>>>>>>>> Note that the lifetime of the unique buffer reference does not 
>>>>>>>>> include the
>>>>>>>>> part between the begin of the ``isUnique`` false-branch and the 
>>>>>>>>> write-back
>>>>>>>>> of the copy. This means is okay to read from the buffer (using 
>>>>>>>>> ``self.b``)
>>>>>>>>> for the purpose of copying.
>>>>>>>>> 
>>>>>>>>> Examples::
>>>>>>>>> 
>>>>>>>>>  mutating func setSomeData(x: Int) {
>>>>>>>>>    let uniqueBuffer = makeMutable()
>>>>>>>>>    uniqueBuffer.someData = x
>>>>>>>>>    // violates rule 1
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>>  mutating func setSomeData(x: Int) {
>>>>>>>>>    makeMutable()
>>>>>>>>>    self.b.someData = x // violates rule 2
>>>>>>>>>    endUnique(&self.b)
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>>  mutating func setSomeData(x: Int) {
>>>>>>>>>    let uniqueBuffer = makeMutable()
>>>>>>>>>    uniqueBuffer.someData = x
>>>>>>>>>    endUnique(&self.b)
>>>>>>>>>    uniqueBuffer.someData = 27 // violates rule 3
>>>>>>>>>  }
>>>>>>>>> 
>>>>>>>>>  mutating func incrementSomeData() {
>>>>>>>>>    let uniqueBuffer = makeMutable()
>>>>>>>>>    uniqueBuffer.someData = self.b.someData + 1 // violates rule 4
>>>>>>>>>    endUnique(&self.b)
>>>>>>>>>  }
>>>>>>>> 
>>>>>>>> It would be instructive to write down the *correct* code for these
>>>>>>>> operations.
>>>>>>> 
>>>>>>> added to my todo list.
>>>>>>> 
>>>>>>>> 
>>>>>>>>> The intention of the rules is to ensure that there is no overlap of a
>>>>>>>>> "read-only" life-range with a "mutable" life-range of the buffer 
>>>>>>>>> reference.
>>>>>>>>> It’s the responsibility of the implementor to follow the rules.
>>>>>>>>> But the compiler (a mandatory diagnostics pass and the SIL verifier) 
>>>>>>>>> can
>>>>>>>>> statically detect rule violations in obvious cases (with 
>>>>>>>>> inter-procedural
>>>>>>>>> analysis maybe even in most cases).
>>>>>>>>> 
>>>>>>>>> This approach would require to change some of the internals of our
>>>>>>>>> current COW data structures in the stdlib (Array, Dictionary, etc.).
>>>>>>>>> For example, the Array make_mutable semantic functions currently do 
>>>>>>>>> not return
>>>>>>>>> the unique buffer.
>>>>>>>> 
>>>>>>>> No big deal.
>>>>>>>> 
>>>>>>>>> Scoping Alternative 2: Implicit Inout Scopes
>>>>>>>>> --------------------------------------------
>>>>>>>>> 
>>>>>>>>> There is an idea (proposal?) to change the representation of inout 
>>>>>>>>> variables
>>>>>>>>> in SIL. This is independent of this proposal, but can be helpful for 
>>>>>>>>> the
>>>>>>>>> purpose of defining the scope of a COW mutation.
>>>>>>>>> 
>>>>>>>>> The basic idea is that SILGen inserts scoping instructions for *all* 
>>>>>>>>> inout
>>>>>>>>> variables. And those scoping instructions can be used to define the 
>>>>>>>>> mutating
>>>>>>>>> scope of a COW buffer.
>>>>>>>>> 
>>>>>>>>> The scoping instructions which are inserted by SILGen for an inout 
>>>>>>>>> scope are::
>>>>>>>>> 
>>>>>>>>>  begin_exclusive
>>>>>>>>>  end_exclusive
>>>>>>>>> 
>>>>>>>>> Simliar to ``is_unique_addr_br`` and ``end_unique_addr``, those 
>>>>>>>>> instructions take the
>>>>>>>>> address of the inout variable as argument. For the optimizer those 
>>>>>>>>> instructions
>>>>>>>>> look like potential writes to the inout variable.
>>>>>>>>> 
>>>>>>>>> The implementor of a COW type has to follow the rule that the COW 
>>>>>>>>> buffer may
>>>>>>>>> only be modified in mutating functions of the COW type. But this is 
>>>>>>>>> the case
>>>>>>>>> anyway because any modification needs a uniqueness check and this can 
>>>>>>>>> only be
>>>>>>>>> done in mutating functions.
>>>>>>>>> 
>>>>>>>>> Example::
>>>>>>>>> 
>>>>>>>>>  // > mutating func setSomeData(x: Int) {
>>>>>>>>>  // Accepts a unique reference to the array value (avoiding refcount 
>>>>>>>>> operations)
>>>>>>>>>  sil @setSomeData : $(Int, @inout Array) -> () {
>>>>>>>>>  bb_entry(%x : Int, %arrayref : $*Array<T>) // Begin scope #0
>>>>>>>>> 
>>>>>>>>>  // >   makeMutable() (inlined)
>>>>>>>>>  // Forward the unique reference to the `self` array value, still 
>>>>>>>>> avoiding refcount operations.
>>>>>>>>>  // Begin the inlined exclusive scope (could be trivially removed).
>>>>>>>>>  begin_exclusive %arrayref : $*Array<T> // Begin scope #1
>>>>>>>>> 
>>>>>>>>>  // >    if !isUnique(&self._storage) {
>>>>>>>>>  // Extract a unique inout reference to the class reference to the 
>>>>>>>>> array storage.
>>>>>>>>>  // This begins the isUnique() argument's exclusive scope. The memory 
>>>>>>>>> is already exclusive
>>>>>>>>>  // but the scope helps ensure this is the only alias to _storage.
>>>>>>>>>  %arrayref._storageref = struct_element_addr [exclusive] %arrayref, 
>>>>>>>>> #Array._storage
>>>>>>>>> 
>>>>>>>>>  // Uniqueness checking requires an inout reference to the class 
>>>>>>>>> reference.
>>>>>>>>>  // The is_unique instruction does not need to create a new storage 
>>>>>>>>> reference.
>>>>>>>>>  // It's only purpose is to check the RC count, ensure that the 
>>>>>>>>> checked reference
>>>>>>>>>  // is inout, and prevent the inout scope from being optimized away.
>>>>>>>>>  %isuniq = is_unique %arrayref._storageref : $*@sil_cow 
>>>>>>>>> ArrayStorage<T>
>>>>>>>>> 
>>>>>>>>>  // End the isUnique argument's exclusive scope (can also be 
>>>>>>>>> trivially removed).
>>>>>>>>>  end_exclusive %arrayref._storageref : $*@sil_cow ArrayStorage<T>
>>>>>>>>> 
>>>>>>>>>  br %isuniq, bb_continue, bb_slow
>>>>>>>>> 
>>>>>>>>>  bb_slow:
>>>>>>>>>  // >      self._storage = copyBuffer(self._storage)
>>>>>>>>>  // Produce a new class reference to storage with verifiably unique 
>>>>>>>>> RC semantics.
>>>>>>>>>  %copied_storage_class = alloc_ref ...
>>>>>>>>>  // A begin/end exclusive scope is implicit in store [assign].
>>>>>>>>>  store [assign] %copied_storage_class to %arrayref._storageref
>>>>>>>>>  br bb_continue
>>>>>>>>> 
>>>>>>>>>  bb_continue:
>>>>>>>>> 
>>>>>>>>>  // This marks the end of makeMutable's inout `self` scope. Because 
>>>>>>>>> Array
>>>>>>>>>  // contains a "copy_on_write" property, the SIL verifier needs to
>>>>>>>>>  // prove that %arrayref.#_storage has not escaped at this point. This
>>>>>>>>>  // is equivalent to checking that %arrayref itself is not copied, and
>>>>>>>>>  // checking each projection of the "copy_on_write" storage property
>>>>>>>>>  // (%arrayref._storageref) is not copied. Or, if any copies are 
>>>>>>>>> present,
>>>>>>>>>  // they must be consumed within this scope.
>>>>>>>>>  end_exclusive %arrayref : $*Array<T> // End scope #1
>>>>>>>>> 
>>>>>>>>>  // >    self._storage.someData = x
>>>>>>>>>  // An _addr instruction with one load/store use doesn't really need 
>>>>>>>>> its own scope.
>>>>>>>>>  %arrayref._storageref = struct_element_addr %arrayref, 
>>>>>>>>> #Array._storage
>>>>>>>>> 
>>>>>>>>>  // ARC optimization can promote this to a borrow, replacing 
>>>>>>>>> strong_release with end_borrow.
>>>>>>>>>  %arrayref.cow_storage = load [copy] %arrayref._storageref : 
>>>>>>>>> $*@sil_cow ArrayStorage
>>>>>>>>>  %arrayref._storage = cow_to_ref %arrayref.cow_storage : $@sil_cow 
>>>>>>>>> ArrayStorage
>>>>>>>>> 
>>>>>>>>>  // Write some data into the CoW buffer.
>>>>>>>>>  // (For simplicity, pretend ArrayStorage has a "someData" field).
>>>>>>>>>  // A single-use _addr instruction, so no scope.
>>>>>>>>>  %somedata_addr = ref_element_addr %arrayref._storage, #someData
>>>>>>>>>  // A store with an implicit [exclusive] scope.
>>>>>>>>>  store [assign] %x to %somedata_addr
>>>>>>>>> 
>>>>>>>>>  strong_release %arrayref._storage : $*ArrayStorage<T>
>>>>>>>>> 
>>>>>>>>>  // End the isUnique argument's exclusive scope.
>>>>>>>>>  // The same verification is needed here, but the inner scope would 
>>>>>>>>> be eliminated.
>>>>>>>>>  end_exclusive %arrayref : $*Array<T> // End scope #0
>>>>>>>>> 
>>>>>>>>> In general this approach looks more "user-friendly" than the first
>>>>>>>>> alternative.
>>>>>>>> 
>>>>>>>> Well, I can't really tell, because you haven't shown the Swift code 
>>>>>>>> that
>>>>>>>> generates this SIL.
>>>>>>>> 
>>>>>>>>> But it depends on implementing the general feature to insert the inout
>>>>>>>>> scoping instructions.  Also, we still have to think through all the
>>>>>>>>> details of this approach.
>>>>>>>> 
>>>>>>>> FWIW, I am convinced we will need (and get) a stricter inout model that
>>>>>>>> would be conducive to inserting the scoping instructions.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Dependency between a buffer reference to the scope-begin
>>>>>>>>> --------------------------------------------------------
>>>>>>>> 
>>>>>>>> You can only have a dependency between two things, but as phrased “a
>>>>>>>> buffer reference to the scope-begin” sounds like one thing.  s/to/and/
>>>>>>>> would fix it.
>>>>>>>> 
>>>>>>>>> With both alternatives there is no explicit dependency from a buffer 
>>>>>>>>> reference
>>>>>>>>> to a scope-begin instruction::
>>>>>>>>> 
>>>>>>>>>  %b_cow = load %baddr
>>>>>>>>>  %b = cow_to_ref %b_cow
>>>>>>>>>  %x = load %b             // No dependency between this...
>>>>>>>>>  ...
>>>>>>>>>  begin_exclusive %baddr   // ... and this instruction.
>>>>>>>>>  ...
>>>>>>>>> 
>>>>>>>>> So in theory the optimizer is free to reschedule the instructions::
>>>>>>>>> 
>>>>>>>>>  %b_cow = load %baddr
>>>>>>>>>  %b = cow_to_ref %b_cow
>>>>>>>>>  ...
>>>>>>>>>  begin_exclusive %baddr
>>>>>>>>>  %x = load %b             // Wrong! Buffer could be modified here
>>>>>>>>>  ...
>>>>>>>>> 
>>>>>>>>> We still have to figure out how to cope with this.
>>>>>>>>> 
>>>>>>>>> - We could add an end-of-lifetime instruction for a COW buffer 
>>>>>>>>> reference, which
>>>>>>>>> the optimizer may not move over a begin-of-scope instruction.
>>>>>>>>> 
>>>>>>>>> - Or we just define the implicit rule for the optimizer that any use 
>>>>>>>>> of a COW
>>>>>>>>> reference may not be moved over a begin-of-scope instruction.
>>>>>>>>> 
>>>>>>>>> Preconditions
>>>>>>>>> =============
>>>>>>>>> 
>>>>>>>>> To benefit from COW optimizations in the stdlib Array, Set and 
>>>>>>>>> Dictionary data
>>>>>>>>> structures we first need eager bridging, meaning getting rid of the 
>>>>>>>>> bridged
>>>>>>>>> buffer. 
>>>>>>>> 
>>>>>>>> As you know I'm very much in favor of eager bridging, but I don't see
>>>>>>>> why this would be dependent on it.
>>>>>>> 
>>>>>>> We could use copy_on_write with eager bridging, but I don’t think it 
>>>>>>> will give any benefits to the
>>>>>>> optimizer.
>>>>>>> For example, the SIL code to get from an Array to a native
>>>>>>> ContiguousArrayStorage reference is pretty hard to understand for the
>>>>>>> optimizer (involves low level bit operations, etc.).
>>>>>> 
>>>>>> It wouldn't need to do low-level bit operations if our enums were
>>>>>> capable/controllable enough.  I'm just saying, there's no reason we
>>>>>> couldn't give the optimizer something to work with that has higher level
>>>>>> semantics than what we currently do.
>>>>>> 
>>>>>>>>> At least for Array this is implemented as low-level bit operations and
>>>>>>>>> optimizations cannot reason about it (e.g. finding a reasonable
>>>>>>>>> RC-root for the buffer reference).
>>>>>>>>> 
>>>>>>>>> Another thing is that we currently cannot use ``copy_on_write`` for 
>>>>>>>>> Array
>>>>>>>>> because of pinning. Array pins it’s buffer when passing an element 
>>>>>>>>> address to
>>>>>>>>> an inout parameter. This allows the array buffer to be modified even 
>>>>>>>>> if its
>>>>>>>>> reference count is > 1. With ``copy_on_write``, a programmer could 
>>>>>>>>> break memory
>>>>>>>>> safety when violating the inout rule. Example::
>>>>>>>>> 
>>>>>>>>>  var arr = [MyClass()]  // a global array
>>>>>>>>> 
>>>>>>>>>  foo(&arr[0])        // Pins the buffer of arr during the call
>>>>>>>>> 
>>>>>>>>>  func foo(_ x: inout MyClass) -> Int {
>>>>>>>>>    let b = arr       // The ref-count of the buffer is not 
>>>>>>>>> incremented, because it is pinned!
>>>>>>>>>    let r = b[0]      // optimizer removes the retain of r because it 
>>>>>>>>> thinks the following code cannot modify b
>>>>>>>>>    arr.removeAll()   // does not copy the array buffer and thus 
>>>>>>>>> de-allocates r
>>>>>>>>>    return r.i        // use-after-free!
>>>>>>>>>  }
>>>>>>>> 
>>>>>>>> I only know of one way to resolve inout and pinning:
>>>>>>>> 
>>>>>>>> * Semantically, references are replaced with a trap value when entering
>>>>>>>> an inout context so that all inout values are provably unique
>>>>>>>> references in the absence of unsafe code.  We drop pinning and provide
>>>>>>>> explicit operations that provide simultaneous lvalue accesses to
>>>>>>>> distinct regions, e.g. c.swap(i1, i2) where i1 and i2 are indices.
>>>>>>>> 
>>>>>>>> If there are other ideas out there, I'd like to hear them.  If not, we
>>>>>>>> should probably decide that this is what we're doing so that we can 
>>>>>>>> move
>>>>>>>> forward without this looming uncertainty.
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> -Dave
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> swift-dev mailing list
>>>>>>>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>>>>>>>> https://lists.swift.org/mailman/listinfo/swift-dev 
>>>>>>>> <https://lists.swift.org/mailman/listinfo/swift-dev>
>>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> -Dave
>>>>> 
>>>>> _______________________________________________
>>>>> swift-dev mailing list
>>>>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>>>>> https://lists.swift.org/mailman/listinfo/swift-dev 
>>>>> <https://lists.swift.org/mailman/listinfo/swift-dev>
>>>> 
>>>> -- 
>>>> -Dave
>>>> 
>>>> _______________________________________________
>>>> swift-dev mailing list
>>>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>>>> https://lists.swift.org/mailman/listinfo/swift-dev 
>>>> <https://lists.swift.org/mailman/listinfo/swift-dev>
>>> _______________________________________________
>>> swift-dev mailing list
>>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>>> https://lists.swift.org/mailman/listinfo/swift-dev 
>>> <https://lists.swift.org/mailman/listinfo/swift-dev>

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Re: [swift-dev] copy-on-write proposal

Reply via email to