> On Oct 3, 2016, at 12:10 PM, Alexis via swift-dev <swift-dev@swift.org> wrote:
> 
> When I first started reading this proposal, my primary objection was going to 
> be that SSA doesn’t seem to really jive well with the idea of values becoming 
> (in)valid at some point in the future (due to moves). You can see this in the 
> definite-init pass, which it works entirely with addresses to handle the idea 
> of a value which becomes assigned “eventually”. But if SIL’s notion of SSA 
> can be extended to handle these problems, this sounds great!
> 
> It’s not clear to me how this would work though. How does an optimization 
> pass reason about an SSA register becoming invalid because it was moved out 
> of? Or rather: in what ways do the interesting properties of SSA survive 
> passes needing to handle this? Is this a standard extension that’s been 
> researched/implemented before?

I think that’s why John claimed that we need to enforce “pseudo-linear” SIL 
values types. Moving out of an SSA value must be the last use of that value. 
SIL will enforce this single-consumer property throughout.

-Andy

> 
>> On Oct 1, 2016, at 4:32 AM, John McCall via swift-dev <swift-dev@swift.org 
>> <mailto:swift-dev@swift.org>> wrote:
>> 
>> Andy Trick and I had this conversation Thursday, and I thought I'd capture 
>> it here.
>> 
>> The Problem
>> 
>> It's a longstanding complaint that SIL uses drastically different code 
>> patterns for the same sequence of operations based on whether the types of 
>> the values involved are loadable or "address-only".  For example, suppose I 
>> have Swift code like this:
>>   let x, y : T
>>   let z = x + y
>> 
>> If T is a loadable type, this will generate SIL somewhat like this:
>>  // %x and %y are values of type $T
>> %lhs = copy_value %x
>> %rhs = copy_value %y
>>  %operator = function_ref T.+
>>  %result = apply %operator(%lhs, %rhs)
>>  %z = %result
>> 
>> (copy_value doesn't currently exist, but it's easier to understand, and as 
>> it happens we're thinking of adding it back.)
>> 
>> If T is an address-only type, this will generate SIL somewhat like this:
>>   // %x and %y are values of type $*T
>>   %z = alloc_stack $T
>>   %lhs = alloc_stack $T
>>   copy_addr %x to [initialization] %lhs
>>   %rhs = alloc_stack $T
>>   copy_addr %y to [initialization] %rhs
>>   %operator = function_ref T.+
>>   apply %operator(%z, %lhs, %rhs)
>>   dealloc_stack %rhs
>>   dealloc_stack %lhs
>> 
>> Notably, we're explicitly modeling the memory in which values are stored, 
>> which is both very verbose and — more importantly — loses any interesting 
>> SSA properties for tracking actual values around.  And this has a bunch of 
>> secondary effects where various high-level operations like dynamic casts end 
>> up needing two different instructions based on whether the value is stored 
>> in memory.  This is pretty dumb, and it's a major contributor to the reality 
>> that generic code is currently very poorly optimized.
>> 
>> It does, however, have some significant advantages: since the memory 
>> allocation is explicit, it's quite natural to express optimizations that 
>> e.g. hoist or sink those allocations, and the next level of lowering (IRGen) 
>> can be very simple.
>> 
>> Addresses and address-only types
>> 
>> Swift is an imperative language, and imperative languages make formal memory 
>> locations part of the high-level semantics.  DI and SSA formation allow us 
>> to eliminate formal memory locations for most local variables, but class 
>> properties, global variables, escaped local variables, and pointers are all 
>> fundamentally un-SSA-able.  Therefore we will always have some concept of an 
>> address.
>> 
>> But most values of address-only type aren't really being stored in that sort 
>> of formal memory location; we're just representing them that way in SIL.  
>> Why do we do this?  What is it about a type that makes it address-only?
>> 
>> Well, some types are inherently "memory-pinned": something about their 
>> representation only makes sense, or is only implementable, if the value is 
>> always kept in memory:
>>   - The representation of the value may involve interior pointers, as with 
>> LLVM's SmallVector.  This isn't currently a thing in Swift, but it's a 
>> possibility someday with the import of non-POD C++ types.
>>   - The address of the value may need to be registered elsewhere, as with 
>> weak references.
>>   - The value may allow internal mutation even from a shared reference, like 
>> a C++ class with a mutable field or a Rust atomic type; you can see weak 
>> references as analogous to this.
>> Such types are necessarily address-only at a low level.
>> 
>> Other types are address-only by abstraction: their representation isn't 
>> (fully) known, and so they have to be kept in memory (1) in case that 
>> representation includes another address-only value and (2) because there's 
>> no way to store a unbounded amount of data except in memory anyway.
>> 
>> But this sense of address-only types that we're describing is really a 
>> property of the final generated code.  It's necessary for LLVM IR generation 
>> to know about it, but it's not obviously necessary for SIL to know about it 
>> beyond the implications of the inherently memory-pinned cases above:
>>   - it is not generally safe to assume that the stored properties of a 
>> non-POD C++ type remain invariant across moves
>>   - weak references *do* represent the same value across moves, but of 
>> course that value can change dynamically at any time anyway, per the rules 
>> of weak references
>>   - mutable fields can be modified even by a shared borrowed reference, and 
>> so (if they are modeled at all in SIL at all, rather than just leaving the 
>> type opaque) there must be some way to project a mutable address from a 
>> shared borrow and so on.
>> 
>> Our current representation of address-only types arises directly from the 
>> low-level operations that are required, which does admit some interesting 
>> optimizations on its own, but the disadvantages of having to support wildly 
>> divergent code paths and completely give up SSA use/def chains are 
>> crippling.  What would SIL look like if we abandoned this entirely?
>> 
>> All types as SIL scalars
>> 
>> The fundamental issue here is that IR-level lowering does need to place 
>> memory-pinned values into memory.  Suppose we take a value, use it as an 
>> enum case payload, inject that enum into an Any, and pass that to a 
>> function.  In a future world where we consistently SSA all local values, 
>> this ideally looks something like this:
>> 
>>  // %value is a $T
>>  %enum = enum #MyEnum.foo, %value : $T
>>  %any = existential $Any, %enum
>>  %fn = function_ref @bar
>>   apply %fn(%any)
>> 
>> If all of these types were loadable, lowering this to IR doesn't impose any 
>> abstraction costs, because we can just manipulate it as a bit-pattern in 
>> memory, which LLVM (really, any reasonable compiler infrastructure) should 
>> be quite good at analyzing and optimizing.  But if all of these types are 
>> address-only, that's not obviously true.  Let's look at the details of 
>> lowering to memory.
>> 
>> Because all types are movable — even in a world with Rust-style ownership — 
>> there's a fairly straightforward lowering that works for all possible 
>> functions: every address-only SIL value gets its own allocation which is 
>> deallocated along the dominance frontier of the value.  Indirect arguments 
>> use the passed-in buffer.  Basic block arguments are copied from the 
>> allocation for the branch argument value to the allocation for the 
>> corresponding destination block parameter.  Indirect return values are 
>> copied from the allocation for the return value to the pointer passed in.
>> 
>> This admits some obvious optimizations.  If "owned" SIL values are 
>> pseudo-linear — i.e. along every path from their definition point, it is 
>> statically evident that they are (optionally) borrowed multiple times, 
>> consumed exactly once, and then never used again — then the copies can 
>> instead be moves.  Standard register-allocation algorithms can recognize 
>> when basic block parameters can use the same location as their arguments.  
>> SIL only permits a single return instruction, so there's no reason not to 
>> allocate returned values directly into the return slot.  These optimizations 
>> will tend to eliminate a lot of moves.
>> 
>> However, there's another problem, which is injections.  Consider the example 
>> above.  %any is an opaque existential, and initializing it formally involves 
>> allocating a buffer within the existential and moving the argument into that 
>> buffer.  Ideally, we would allocate that buffer first and then simply 
>> allocate %enum in-place into that buffer.  In this simple example, that's 
>> easy.  But if the control flow were more complex, detecting that this is 
>> possible becomes significantly more challenging, as does ensuring that the 
>> buffer is properly cleaned up along all paths.  For example, suppose that 
>> %value were computed by calling a throwing function:
>> 
>>   try_apply %someFunction() normal %cont, unwind %handler
>> cont(%value: $T):
>>   %enum = enum #MyEnum.foo, %value : $T
>>   %any = existential $Any, %enum
>>   %fn = function_ref @bar
>>   apply %fn(%any)
>> handler(%error: $Error):
>>   throw $error
>> 
>> Naive allocation here is going to introduce a lot of moves.  Optimally, we 
>> would receive the return value from %someFunction directly in the payload of 
>> %enum, which we want to build directly into the allocated existential buffer 
>> of %any.  But to do this, we actually need to allocate that existential 
>> buffer before executing the try_apply; and if the try_apply throws, we need 
>> to deallocate that existential buffer in the handler block.  The need to 
>> retroactively insert this kind of clean-up code adds a lot of complexity to 
>> this allocation approach.  Moreover, it's quite possible that complex 
>> intermediate control — for example, if there's a loop somewhere between the 
>> definition of a value and its consuming use — will tend to block this kind 
>> of analysis and cause more unnecessary moves.
>> 
>> In contrast, the current SIL-generation scheme tends to be aware of at least 
>> the immediate local uses of values and therefore emits a lot of these kinds 
>> of initialization "in-place" already.  But that said, it does proceed from a 
>> simple recursive examination of the AST, which means it will miss examples 
>> that are just slightly more opaque, like binding a return value as a let and 
>> only then returning it.  That kind of thing is currently left to the 
>> optimizer for no real good reason.
>> 
>> Summary
>> 
>> Overall, I think representing local address-only values as SIL scalars with 
>> full SSA use/def chains is really promising, and I do think we can write an 
>> allocation pass that does an acceptable job eliminating unnecessary moves.  
>> In order to actually do this, though, I think we need two things:
>>   - We need SIL to be "pseudo-linear" as discussed above.  We really don't 
>> want the allocation pass to have to worry about keeping values alive past 
>> consumptions, and thus potentially having to insert copies instead of moves.
>>   - We need the allocation pass to handle exits before initialization (like 
>> with try_apply) and other sorts of interfering control flow.  It will not be 
>> acceptable for this to be a block-local analysis with only a trivial amount 
>> of cross-block argument merging.
>> 
>> John.
>> _______________________________________________
>> swift-dev mailing list
>> swift-dev@swift.org <mailto:swift-dev@swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-dev
> 
> _______________________________________________
> swift-dev mailing list
> swift-dev@swift.org
> https://lists.swift.org/mailman/listinfo/swift-dev

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Reply via email to