> On Oct 3, 2016, at 12:10 PM, Alexis via swift-dev <swift-dev@swift.org> wrote:
>
> When I first started reading this proposal, my primary objection was going
> to be that SSA doesn’t seem to really jibe well with the idea of values
> becoming (in)valid at some point in the future (due to moves). You can see
> this in the definite-init pass, which works entirely with addresses to
> handle the idea of a value which becomes assigned “eventually”. But if
> SIL’s notion of SSA can be extended to handle these problems, this sounds
> great!
>
> It’s not clear to me how this would work, though. How does an optimization
> pass reason about an SSA register becoming invalid because it was moved out
> of? Or rather: in what ways do the interesting properties of SSA survive
> passes needing to handle this? Is this a standard extension that’s been
> researched/implemented before?

I think that’s why John claimed that we need to enforce “pseudo-linear” SIL
value types. Moving out of an SSA value must be the last use of that value.
SIL will enforce this single-consumer property throughout.
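For example, with a copy_value-style representation (as in John's sketch
below), the verifier could reject any use of an owned value after its
consuming use, roughly like this (instruction spellings approximate):

  %x = apply %makeT()       // %x is an owned $T
  %c = copy_value %x        // non-consuming use: always fine
  apply %consume(%c)        // consumes the copy
  apply %consume(%x)        // consumes %x itself; must be its last use
  destroy_value %x          // ill-formed: %x was already consumed above

An optimization pass then never has to reason about a register "becoming"
invalid: the unique consuming use statically ends the value's live range,
exactly like a last use in ordinary SSA liveness.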
-Andy

>> On Oct 1, 2016, at 4:32 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:
>>
>> Andy Trick and I had this conversation Thursday, and I thought I'd capture
>> it here.
>>
>> The Problem
>>
>> It's a longstanding complaint that SIL uses drastically different code
>> patterns for the same sequence of operations based on whether the types of
>> the values involved are loadable or "address-only". For example, suppose I
>> have Swift code like this:
>>
>>   let x, y : T
>>   let z = x + y
>>
>> If T is a loadable type, this will generate SIL somewhat like this:
>>
>>   // %x and %y are values of type $T
>>   %lhs = copy_value %x
>>   %rhs = copy_value %y
>>   %operator = function_ref T.+
>>   %result = apply %operator(%lhs, %rhs)
>>   %z = %result
>>
>> (copy_value doesn't currently exist, but it's easier to understand, and as
>> it happens we're thinking of adding it back.)
>>
>> If T is an address-only type, this will generate SIL somewhat like this:
>>
>>   // %x and %y are values of type $*T
>>   %z = alloc_stack $T
>>   %lhs = alloc_stack $T
>>   copy_addr %x to [initialization] %lhs
>>   %rhs = alloc_stack $T
>>   copy_addr %y to [initialization] %rhs
>>   %operator = function_ref T.+
>>   apply %operator(%z, %lhs, %rhs)
>>   dealloc_stack %rhs
>>   dealloc_stack %lhs
>>
>> Notably, we're explicitly modeling the memory in which values are stored,
>> which is both very verbose and — more importantly — loses any interesting
>> SSA properties for tracking actual values around. And this has a bunch of
>> secondary effects where various high-level operations like dynamic casts
>> end up needing two different instructions based on whether the value is
>> stored in memory. This is pretty dumb, and it's a major contributor to the
>> reality that generic code is currently very poorly optimized.
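>> (For example, a dynamic cast today is spelled differently for the two
>> cases, somewhat like this:
>>
>>   // T and U loadable:
>>   %u = unconditional_checked_cast %t : $T to $U
>>
>>   // T and U address-only:
>>   unconditional_checked_cast_addr %t : $*T to %u : $*U
>>
>> and every pass that cares about casts has to handle both forms.)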
>> It does, however, have some significant advantages: since the memory
>> allocation is explicit, it's quite natural to express optimizations that
>> e.g. hoist or sink those allocations, and the next level of lowering
>> (IRGen) can be very simple.
>>
>> Addresses and address-only types
>>
>> Swift is an imperative language, and imperative languages make formal
>> memory locations part of the high-level semantics. DI and SSA formation
>> allow us to eliminate formal memory locations for most local variables,
>> but class properties, global variables, escaped local variables, and
>> pointers are all fundamentally un-SSA-able. Therefore we will always have
>> some concept of an address.
>>
>> But most values of address-only type aren't really being stored in that
>> sort of formal memory location; we're just representing them that way in
>> SIL. Why do we do this? What is it about a type that makes it
>> address-only?
>>
>> Well, some types are inherently "memory-pinned": something about their
>> representation only makes sense, or is only implementable, if the value is
>> always kept in memory:
>>   - The representation of the value may involve interior pointers, as with
>>     LLVM's SmallVector. This isn't currently a thing in Swift, but it's a
>>     possibility someday with the import of non-POD C++ types.
>>   - The address of the value may need to be registered elsewhere, as with
>>     weak references.
>>   - The value may allow internal mutation even from a shared reference,
>>     like a C++ class with a mutable field or a Rust atomic type; you can
>>     see weak references as analogous to this.
>> Such types are necessarily address-only at a low level.
>>
>> Other types are address-only by abstraction: their representation isn't
>> (fully) known, and so they have to be kept in memory (1) in case that
>> representation includes another address-only value and (2) because there's
>> no way to store an unbounded amount of data except in memory anyway.
>>
>> But this sense of address-only types that we're describing is really a
>> property of the final generated code. It's necessary for LLVM IR
>> generation to know about it, but it's not obviously necessary for SIL to
>> know about it beyond the implications of the inherently memory-pinned
>> cases above:
>>   - it is not generally safe to assume that the stored properties of a
>>     non-POD C++ type remain invariant across moves
>>   - weak references *do* represent the same value across moves, but of
>>     course that value can change dynamically at any time anyway, per the
>>     rules of weak references
>>   - mutable fields can be modified even by a shared borrowed reference,
>>     and so (if they are modeled in SIL at all, rather than just leaving
>>     the type opaque) there must be some way to project a mutable address
>>     from a shared borrow; and so on.
>>
>> Our current representation of address-only types arises directly from the
>> low-level operations that are required, which does admit some interesting
>> optimizations on its own, but the disadvantages of having to support
>> wildly divergent code paths and completely give up SSA use/def chains are
>> crippling. What would SIL look like if we abandoned this entirely?
>>
>> All types as SIL scalars
>>
>> The fundamental issue here is that IR-level lowering does need to place
>> memory-pinned values into memory. Suppose we take a value, use it as an
>> enum case payload, inject that enum into an Any, and pass that to a
>> function. In a future world where we consistently SSA all local values,
>> this ideally looks something like this:
>>
>>   // %value is a $T
>>   %enum = enum #MyEnum.foo, %value : $T
>>   %any = existential $Any, %enum
>>   %fn = function_ref @bar
>>   apply %fn(%any)
>>
>> If all of these types were loadable, lowering this to IR doesn't impose
>> any abstraction costs, because we can just manipulate it as a bit-pattern
>> in memory, which LLVM (really, any reasonable compiler infrastructure)
>> should be quite good at analyzing and optimizing. But if all of these
>> types are address-only, that's not obviously true. Let's look at the
>> details of lowering to memory.
>>
>> Because all types are movable — even in a world with Rust-style ownership
>> — there's a fairly straightforward lowering that works for all possible
>> functions: every address-only SIL value gets its own allocation, which is
>> deallocated along the dominance frontier of the value. Indirect arguments
>> use the passed-in buffer. Basic block arguments are copied from the
>> allocation for the branch argument value to the allocation for the
>> corresponding destination block parameter. Indirect return values are
>> copied from the allocation for the return value to the pointer passed in.
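>> Concretely, that straightforward lowering would turn the enum/Any example
>> above into something like this (spellings approximate):
>>
>>   %value_mem = alloc_stack $T
>>   // ... %value is produced into %value_mem ...
>>   %enum_mem = alloc_stack $MyEnum
>>   %payload = init_enum_data_addr %enum_mem, #MyEnum.foo
>>   copy_addr %value_mem to [initialization] %payload
>>   inject_enum_addr %enum_mem, #MyEnum.foo
>>   %any_mem = alloc_stack $Any
>>   %buf = init_existential_addr %any_mem, $MyEnum
>>   copy_addr %enum_mem to [initialization] %buf
>>   %fn = function_ref @bar
>>   apply %fn(%any_mem)
>>   // ... destroy_addr and dealloc_stack each allocation in turn ...
>>
>> Every value gets its own memory, with a copy at each injection.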
>> This admits some obvious optimizations. If "owned" SIL values are
>> pseudo-linear — i.e. along every path from their definition point, it is
>> statically evident that they are (optionally) borrowed multiple times,
>> consumed exactly once, and then never used again — then the copies can
>> instead be moves. Standard register-allocation algorithms can recognize
>> when basic block parameters can use the same location as their arguments.
>> SIL only permits a single return instruction, so there's no reason not to
>> allocate returned values directly into the return slot. These
>> optimizations will tend to eliminate a lot of moves.
>>
>> However, there's another problem, which is injections. Consider the
>> example above. %any is an opaque existential, and initializing it formally
>> involves allocating a buffer within the existential and moving the
>> argument into that buffer. Ideally, we would allocate that buffer first
>> and then simply allocate %enum in place into that buffer. In this simple
>> example, that's easy. But if the control flow were more complex, detecting
>> that this is possible becomes significantly more challenging, as does
>> ensuring that the buffer is properly cleaned up along all paths. For
>> example, suppose that %value were computed by calling a throwing function:
>>
>>   try_apply %someFunction() normal %cont, unwind %handler
>>   cont(%value : $T):
>>     %enum = enum #MyEnum.foo, %value : $T
>>     %any = existential $Any, %enum
>>     %fn = function_ref @bar
>>     apply %fn(%any)
>>   handler(%error : $Error):
>>     throw %error
>>
>> Naive allocation here is going to introduce a lot of moves. Optimally, we
>> would receive the return value from %someFunction directly in the payload
>> of %enum, which we want to build directly into the allocated existential
>> buffer of %any. But to do this, we actually need to allocate that
>> existential buffer before executing the try_apply; and if the try_apply
>> throws, we need to deallocate that existential buffer in the handler
>> block. The need to retroactively insert this kind of clean-up code adds a
>> lot of complexity to this allocation approach. Moreover, it's quite
>> possible that complex intermediate control flow — for example, if there's
>> a loop somewhere between the definition of a value and its consuming use —
>> will tend to block this kind of analysis and cause more unnecessary moves.
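>> That is, for the optimal layout the allocator would have to produce
>> something like this (sketch; spellings approximate):
>>
>>   %any_mem = alloc_stack $Any
>>   %buf = init_existential_addr %any_mem, $MyEnum
>>   %payload = init_enum_data_addr %buf, #MyEnum.foo
>>   // receive the result of %someFunction directly into the payload
>>   try_apply %someFunction(%payload) normal %cont, unwind %handler
>>   cont(...):
>>     inject_enum_addr %buf, #MyEnum.foo
>>     %fn = function_ref @bar
>>     apply %fn(%any_mem)
>>     ...
>>   handler(%error : $Error):
>>     // retroactively-inserted clean-up: undo the existential
>>     // initialization without destroying a (never-initialized) value
>>     deinit_existential_addr %any_mem
>>     dealloc_stack %any_mem
>>     throw %error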
>> In contrast, the current SIL-generation scheme tends to be aware of at
>> least the immediate local uses of values and therefore emits a lot of
>> these kinds of initializations "in-place" already. But that said, it does
>> proceed from a simple recursive examination of the AST, which means it
>> will miss examples that are just slightly more opaque, like binding a
>> return value as a let and only then returning it. That kind of thing is
>> currently left to the optimizer for no especially good reason.
>>
>> Summary
>>
>> Overall, I think representing local address-only values as SIL scalars
>> with full SSA use/def chains is really promising, and I do think we can
>> write an allocation pass that does an acceptable job of eliminating
>> unnecessary moves. In order to actually do this, though, I think we need
>> two things:
>>   - We need SIL to be "pseudo-linear" as discussed above. We really don't
>>     want the allocation pass to have to worry about keeping values alive
>>     past consumptions, and thus potentially having to insert copies
>>     instead of moves.
>>   - We need the allocation pass to handle exits before initialization
>>     (like with try_apply) and other sorts of interfering control flow. It
>>     will not be acceptable for this to be a block-local analysis with only
>>     a trivial amount of cross-block argument merging (see the sketch
>>     below).
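>> By cross-block argument merging I mean at least recognizing patterns like
>> this and giving %a, %b, and the block parameter %p a single allocation, so
>> that neither branch turns into a copy or a move:
>>
>>   bb1:
>>     %a = ...  // some address-only $T
>>     br bb3(%a : $T)
>>   bb2:
>>     %b = ...
>>     br bb3(%b : $T)
>>   bb3(%p : $T):
>>     %fn = function_ref @use
>>     apply %fn(%p)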
>> John.