I’m sending out a proposal for fundamentally changing SIL. This work feeds into generic code optimization, resilience, semantic ARC, and SIL ownership. This was discussed at length back in October—some info went out on swift-dev—but I realized there hasn’t been a formal proposal. So here it is. I want to make sure enough people have seen this before pushing my PR that puts the infrastructure in place: https://github.com/apple/swift/pull/6922.


Markdown:
# SIL Opaque Values

# Introduction

A SIL type is either loadable or address-only. A loadable type is one
whose object size and layout can be determined by the compiler *and*
whose values are not "pinned" to a memory address. Types are most
commonly address-only because their layout is opaque by
abstraction. Generic type parameters are address-only because their
concrete type is not statically identified. Resilient types are
statically identified, but have opaque layout for binary
compatibility.

Compiled code must access and pass address-only objects via their
memory addresses. We refer to this as "physical" access, as expressed
by LLVM IR. Currently, SIL also reflects physical access. For example,
SIL for a generic function relies on address types:
```
sil @identity : $<T> (@in T) -> @out T {
bb0(%0 : $*T, %1 : $*T):
  copy_addr %1 to [initialization] %0 : $*T
  destroy_addr %1 : $*T
  %4 = tuple ()
  return %4 : $()
}
```

If the generic type is bound to a loadable type, then the SIL values
are promoted to value types:
```
// generic specialization <Swift.Int> of identity <T> (T) -> T
sil @identity : $(Int) -> Int {
bb0(%0 : $Int):
  return %0 : $Int
}
```
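
For orientation, here is a minimal sketch of Swift source that could plausibly give rise to the two SIL functions above. Only the name `identity` comes from the SIL; the parameter label and the call sites are assumptions added for illustration.
```swift
// Hypothetical Swift source; only the name `identity` appears in the SIL above.
func identity<T>(_ value: T) -> T {
    // In the unspecialized function, T is opaque, so today's SIL passes
    // `value` and the result through memory addresses.
    return value
}

// Binding T to a loadable type such as Int lets the optimizer emit the
// specialized form, which passes and returns the value directly.
let number = identity(42)
let text = identity("opaque")
```
The source is identical for both forms; only the compiler's knowledge of the type's layout differs.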

This leads to drastically different SIL patterns in code that is
semantically almost identical. A key aspect of optimizing SIL is
converting values from an in-memory form to an SSA value. Being
address-only currently prevents opaque values from taking part in
normal optimization. Naturally, opaque types must limit some
optimizations, such as inlining. However, an important feature of the
SIL optimizer is the elimination of copies, and by extension of ARC
operations, which applies equally to opaque and loadable types.

This SIL code creates an unnecessary local copy:
```
%copy = alloc_stack $T
copy_addr %arg to [initialization] %copy : $*T
...
%ret = apply %callee<T>(%copy) : $@convention(thin) <τ_0_0> (@in τ_0_0) -> ()
...
dealloc_stack %copy : $*T
destroy_addr %arg : $*T
```

The same code expressed in SSA reveals that the copy is redundant:
```
%copy = copy_value %arg : $T
...
%ret = apply %callee<T>(%copy) : $@convention(thin) <τ_0_0> (@in τ_0_0) -> ()
...
destroy_value %arg : $T
```
Redundancy can be inferred from these facts: `%arg` by definition cannot mutate, it is consumed in-scope by a `destroy_value`, and it is not copied again after `%copy` has been consumed.

The address-only property of arguments determines physical calling
conventions in LLVM IR. These conventions cannot be completely hidden
from SIL code. In particular, SIL is responsible for handling
reabstraction of function types. If the caller and callee have
different views of an argument type, then a SIL-level thunk is
required to bridge between the two conventions.

For example, a concrete function may satisfy a protocol with generic
constraints. A protocol witness thunk will be generated to load the
address-only arguments and store address-only results. An extra copy
is also generated to convert a guaranteed self argument to owned:
```
sil @foo : $@convention(method) (Int, S) -> Int {
bb0(%0 : $Int, %1 : $S):
  return %0 : $Int
}

// protocol witness for P.foo (A.T) -> A.T in conformance S : P
sil [thunk] @_TTWV1t1SS_1PS_FS1_3foofwx1TwxS2_
  : $@convention(witness_method) (@in Int, @in_guaranteed S) -> @out Int {
bb0(%0 : $*Int, %1 : $*Int, %2 : $*S):
  %3 = load [trivial] %1 : $*Int
  %4 = load [trivial] %2 : $*S
  %5 = function_ref @foo : $@convention(method) (Int, S) -> Int
  %6 = apply %5(%3, %4) : $@convention(method) (Int, S) -> Int
  store %6 to [trivial] %0 : $*Int
  %8 = tuple ()
  return %8 : $()
}
```

Conceivably, the address-only types in the thunk could be expressed as SSA values. However, the `@in` and `@in_guaranteed` parameter conventions must remain to communicate indirection to the SIL compiler:
```
// protocol witness for P.foo (A.T) -> A.T in conformance S : P
sil [thunk] @_TTWV1t1SS_1PS_FS1_3foofwx1TwxS2_
  : $@convention(witness_method) (@in Int, @in_guaranteed S) -> @out Int {
bb0(%0 : $Int, %1 : $S):
  %2 = function_ref @foo : $@convention(method) (Int, S) -> Int
  %3 = apply %2(%0, %1) : $@convention(method) (Int, S) -> Int
  return %3 : $Int
}
```
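
As a rough guide, the witness thunk above would correspond to Swift source shaped like the following sketch. The names `P`, `S`, `foo`, and the associated type `T` are read off the SIL comment `P.foo (A.T) -> A.T in conformance S : P`; the parameter labels and the generic caller `callFoo` are hypothetical.
```swift
// Protocol with an associated type: callers that only know `A : P`
// see `A.T` as an opaque type.
protocol P {
    associatedtype T
    func foo(_ t: T) -> T
}

// Concrete conformance: T is inferred to be Int, so `S.foo` itself
// takes and returns loadable values.
struct S: P {
    func foo(_ t: Int) -> Int {
        return t
    }
}

// Hypothetical generic caller. Unspecialized, this call goes through the
// protocol witness, which bridges the generic convention to `S.foo`.
func callFoo<A: P>(_ a: A, _ t: A.T) -> A.T {
    return a.foo(t)
}

let witnessResult = callFoo(S(), 7)
```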

This proposal argues that address-only SIL values should be
represented within SIL function bodies as SSA values, unless SIL
semantics would otherwise require an address even if the value's type
were loadable. SIL parameter and result conventions will continue to
reflect argument indirection. Two SIL function signatures with
the same SIL types and conventions will always have the same ABI.

# Motivation and Goals

- Optimize generic and resilient code. Primarily done by avoiding
  unnecessary copies.

- Make ownership verification more efficient. 

- Simplify the SIL optimizer. SSA analyses for SILValues should apply
  to opaque types. Avoid developing non-SSA memory optimizations "in
  parallel". Avoid many redundant peepholes for address-type
  operations.

- Simplify SILGen. It should be a straightforward translation of the
  AST. A lot of the complexity currently has to do with lowering
  address-type values on-the-fly.

- Simplify IRGen. It should be a straightforward translation of
  lowered SIL into LLVM IR. Some on-the-fly logic for lowering
  addresses can be removed.

# Design

The loadable and address-only property of SIL types will not
change. However, address-only will only refer to the physical
properties of the type and will no longer determine the SIL-level
representation.

A new "lowered" SIL state will be introduced as a preparation for
IRGen. Lowered SIL will reflect the physical constraints of a type,
just as SIL currently does at all stages.

Generic code before address lowering:
```
sil @identity : $<T> (@in T) -> @out T {
bb0(%0 : $T):
  %2 = copy_value %0 : $T
  destroy_value %0 : $T
  return %2 : $T
}
```

Generic code after address lowering:
```
sil @identity : $<T> (@in T) -> @out T {
bb0(%0 : $*T, %1 : $*T):
  copy_addr %1 to [initialization] %0 : $*T
  destroy_addr %1 : $*T
  %4 = tuple ()
  return %4 : $()
}
```

The “lowered” SIL stage is conceptually part of IRGen (it is not
serialized in the module). Optimizations that allocate storage for SIL
types, handle physical calling conventions, and fold away computation
based on known sizes can be done here. The actual LLVM IR generation
can be a fairly literal translation of SIL.

This actually makes canonical SIL much more canonical by moving most
of the SIL address representation down to IRGen. alloc_stack will
still exist in canonical SIL for @inout and captures but operations
will be SSA-based.

The primary motivation is to optimize generic and resilient types, but
the side effect will be a drastically simplified SILGen and somewhat
simplified IRGen. SIL ownership verification will also be much more
efficient in practice.

SIL will continue supporting address types prior to lowering. `inout`
arguments, and by extension captured variables, must have a memory
location. These objects are semantically associated with a memory
location in SIL. This is entirely independent of whether the type is
address-only.
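
To make the last point concrete, the sketch below shows Swift source in which a memory location is semantically required even though every type involved is loadable. The helper `replace` and the function `demo` are hypothetical names chosen for illustration.
```swift
// Hypothetical helper: overwrites the caller's storage in place.
// `slot` denotes a memory location regardless of what T is bound to.
func replace<T>(_ slot: inout T, with newValue: T) {
    slot = newValue
}

func demo() -> (Int, Int) {
    var loadable = 1
    replace(&loadable, with: 2)   // Int is loadable, yet it is passed by address

    var counter = 0
    let bump = { counter += 1 }   // the closure captures `counter` by reference
    bump()
    bump()
    return (loadable, counter)
}

let demoResult = demo()
```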

# Alternatives

## Ignore address-only completely during SIL generation.

This would make formal calling conventions consistent with SIL-level
conventions. For example, `@in T` and `@owned T` arguments would both
be considered to be passed directly. This way, whether two conventions
are compatible could be determined merely by comparing the SIL types
of the arguments. A SIL address type would map to an indirect
convention and a SIL object type would map to a direct convention.

This would hide part of the ABI from SIL. However, reabstraction must
be exposed to SIL. Doing so simplifies IRGen, allows the SIL optimizer
to improve code within thunks, and allows the SIL optimizer to
perform function signature optimizations across calls.

Instead, the proposed approach introduces two separate notions of
indirection. An indirect formal parameter requires indirection at the
ABI level as determined by the function's formal type. An indirect SIL
argument requires a SIL address type as determined by constraints
within the SIL code. `@in T` will be a formally indirect parameter
that, prior to address lowering, accepts a direct SIL value from the
caller and provides a direct SIL value in the callee.

## Lower addresses during IRGen.

This would avoid the need to support two SIL representations depending
on the compilation stage. This could be done by analyzing SIL values
and supplying the information needed for address lowering to IRGen via
a side-channel.

Integrating lowering within IRGen is not seamless. It contradicts our
important goal of simplifying the IR bridging phases of the
compiler. Handling address lowering as an independent pass avoids
adding complexity to IRGen. Once in place, some of the existing
IRGen complexity can even be moved to the lowering pass.

By introducing a new SIL stage, the proposed approach opens up multiple
additional opportunities. It will be useful to allow some SIL
passes to operate on lowered SIL and have access to physical
properties of types. We have already added an `AllocStackHoisting`
pass and plan to add more.

# Plan

1. Introduce SILFunctionConventions.

  Update all code that deals with SILFunctionTypes and clarify the
  separation between formal parameter/result types vs. SIL types.

  This step was extremely invasive and time-consuming. I believe that
  it set the groundwork to make it easy to roll out the rest of the
  feature. I hope the process of migrating to this API shook out most
  of the bugs that we would have hit later when turning on the new
  feature.

  See [PR 6922](https://github.com/apple/swift/pull/6922).

2. Introduce EnableSILOpaqueValues option.

  Under this option SILGen will directly generate SSA values for
  opaque types. The SIL optimizer will be modified to handle opaque
  SSA values.

3. Introduce an AddressLowering pass.

  This will be the last stage in the SIL pipeline. It will run after
  SIL serialization, providing IRGen support for opaque SIL
  values. This way IRGen can continue to function almost unchanged.

  After AddressLowering, the SIL module will be in a lowered stage.

4. Verify SIL Opaque Values.

  SIL operations on address types will now be prohibited in canonical
  SIL with the exception of alloc_stack, load, and store.

  There is still much work to be done in SILGen here.

5. Optimized address lowering.

  The AddressLowering pass will be optimized to avoid allocating
  storage for temporaries.

  John McCall's example:
  ```
    try_apply %someFunction() normal %cont, unwind %handler
  cont(%value: $T):
    %enum = enum #MyEnum.foo, %value : $T
    %any = existential $Any, %enum
    %fn = function_ref @bar
    apply %fn(%any)
  handler(%error: $Error):
    throw %error
  ```
  "Naive allocation here is going to introduce a lot of moves.
  Optimally, we would receive the return value from %someFunction
  directly in the payload of %enum, which we want to build directly
  into the allocated existential buffer of %any.  But to do this, we
  actually need to allocate that existential buffer before executing
  the try_apply; and if the try_apply throws, we need to deallocate
  that existential buffer in the handler block.  The need to
  retroactively insert this kind of clean-up code adds a lot of
  complexity to this allocation approach.  Moreover, it's quite
  possible that complex intermediate control — for example, if there's
  a loop somewhere between the definition of a value and its consuming
  use — will tend to block this kind of analysis and cause more
  unnecessary moves."

  I will be focusing on this design work over the next couple months.

6. Copy Optimization.

  The copy forwarding pass will be redesigned to coalesce SSA values
  by eliminating copies.

  This will prepare SIL for "Semantic ARC Optimization".

  The existing CopyForwarding pass can be eliminated!

  I have been looking forward to doing this for a long time. As soon
  as we have performance parity with the address lowering pass, I can
  begin this work.

-Andy
_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev