> On Feb 3, 2017, at 3:12 PM, John McCall <rjmcc...@apple.com> wrote:
>
> I think we can generalize this discussion a bit by describing some
> mostly-independent axes of variation:

That's great (sorry I used up the arabic+letter naming convention earlier in
the thread)...

> I. A call site:
>   I1) inlines the lookup
>   I2) calls a function that performs the lookup and returns a function pointer
>   I3) calls a function that performs the lookup and calls the function
> I3 minimizes code size at the call site, but it prevents the lookup (or
> components thereof) from being hoisted. I1 requires at least some details of
> the lookup to be ABI.

Exactly.

> II. The function that performs the lookup:
>   II1) is emitted lazily
>   II2) is emitted eagerly
> II2 can use static information to implement the lookup inline without
> affecting the ABI. II1 cannot; it must either rely on ABI assumptions or
> just recursively use another strategy, in which case the function is probably
> just a call-site code-size optimization.

Sure, but I was thinking of it as client side vs. class-definition side. My
suggestions explored the tradeoff between II1 and II2, where the client code
could optionally optimize for vtable dispatch by emitting some lazy lookup
helpers with a fall-back to the eagerly emitted lookup.

> III. The function that performs the lookup:
>   III1) is method-specific
>   III2) is class-specific
>   III3) is part of the language runtime
> III1 requires a lot more functions, but if emitted with the class definition,
> it allows the lookup to be statically specialized based on the method called,
> so e.g. if a non-open method is never overridden, it can be resolved to a
> specific function pointer immediately, possibly by just aliasing a symbol to
> the function definition. III3 minimizes the number of functions required but
> doesn't allow us to evolve dispatch beyond what's supported by the current
> language runtime and doesn't allow class-specific optimizations (e.g. if the
> class definition statically knows the offset to its v-table).

III1 is the sort of optimization that can be selectively added later if we're
willing to version the ABI.
I suppose by that logic we should start with III3. We could never ditch the
runtime/compiler support, but newer code could move toward III2 through ABI
versioning. I actually prefer to start with III2 because the static overhead
is not significantly worse than III3 and it can be optimized without ABI
migration.

> IV. The function that performs the lookup:
>   IV1) is parameterized by an isa
>   IV2) is not parameterized by an isa
> IV1 allows the same function to be used for super-dispatch but requires extra
> work to be inlined at the call site (possibly requiring a chain of resolution
> function calls).

In my first message I was trying to accomplish IV1. But IV2 is simpler and I
can't see a fundamental advantage to IV1. Why would it need a lookup chain?

> V. For any particular function or piece of information, it can be accessed:
>   V1) directly through a symbol
>   V2) through a class-specific table
>   V3) through a hierarchy-specific table (e.g. the class object)
> V1 requires more global symbols, especially if the symbol is per-method, but
> doesn't have any index-computation problems, and it's generally a bit more
> efficient.
> V2 allows stable assignment of fixed indexes to entries because of
> availability-sorting.
> V3 does not; it requires some ability to (at least) slide indexes of entries
> because of changes elsewhere in the hierarchy.
> If there are multiple instantiations of a table (perhaps with different
> information, like a v-table), V2 and V3 can be subject to table bloat.

I had proposed V2 as an option, but am strongly leaning toward V1 for ABI
simplicity and lower static costs (why generate vtables and offset tables?)

> So I think your alternatives were:
> 1. I3, II2, III1, IV2, V1 (for the dispatch function): a direct call to a
> per-method global function that performs the dispatch.
> We could apply V2 to this to decrease the number of global symbols
> required, but at the cost of inflating the call site and requiring a global
> variable whose address would have to be resolved at load time. Has an open
> question about super dispatch.
> 2. I1, V3 (for the v-table), V1 (for the global offset): a load of a
> per-method global variable giving an offset into the v-table. Joe's
> suggestion adds a helper function as a code-size optimization that follows
> I2, II1, III1, IV2. Again, we could also use V2 for the global offset to
> reduce the symbol-table costs.
> 3. I2, II2, III2, IV1, V2 (for the class offset / dispatch mechanism table).
> At least I think this is right? The difference between 3a and 3b seems to be
> about initialization, but maybe also shifts a lot of code-generation to the
> call site?

I'll pick the following option as a starting point because it constrains the
ABI the least in terms of static costs and potential directions for
optimization:

"I2; (II1+II2); III2; IV1; V1"

method_entry = resolveMethodAddress_ForAClass(isa, method_index, &vtable_offset)

(where both modules would need to opt into the vtable_offset.)

I think any alternative would need to be demonstrably better in terms of code
size or dynamic dispatch cost.

-Andy
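
For concreteness, here is a rough C sketch of the shape the
"I2; (II1+II2); III2; IV1; V1" starting point might take. It is an
illustration under assumptions, not a proposed ABI. The ClassObject layout,
the method names, the NO_VTABLE_OFFSET sentinel, and the lookup_bar helper are
all hypothetical. Treating vtable_offset as a byte offset from the isa that
the class-definition side may choose to publish is only one guess at how the
opt-in could work.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Method)(void *self);

/* Hypothetical class object: superclass pointer followed by v-table slots. */
typedef struct ClassObject {
    const struct ClassObject *superclass;
    Method vtable[2];               /* slot 0 = foo, slot 1 = bar */
} ClassObject;

/* Hypothetical sentinel meaning "no offset has been published". */
#define NO_VTABLE_OFFSET ((ptrdiff_t)-1)

static int A_foo(void *self) { (void)self; return 1; }
static int A_bar(void *self) { (void)self; return 2; }
static int B_bar(void *self) { (void)self; return 3; }  /* B overrides bar */

const ClassObject classA = { NULL,    { A_foo, A_bar } };
const ClassObject classB = { &classA, { A_foo, B_bar } };

/* Class-definition side: emitted eagerly with class A (II2), one resolver per
 * class (III2), parameterized by isa (IV1), reached directly through its
 * symbol (V1), and returning a function pointer rather than calling (I2). */
Method resolveMethodAddress_ForA(const ClassObject *isa, unsigned method_index,
                                 ptrdiff_t *vtable_offset)
{
    if (vtable_offset != NULL) {
        /* Opt-in on the defining side: publish where the slot lives relative
         * to the isa. A resolver that does not opt in would simply leave
         * *vtable_offset untouched. */
        *vtable_offset = (ptrdiff_t)(offsetof(ClassObject, vtable)
                                     + method_index * sizeof(Method));
    }
    return isa->vtable[method_index];
}

/* Client side: a lazily emitted helper (II1) that falls back to the eagerly
 * emitted resolver until an offset has been published. */
static ptrdiff_t bar_offset_cache = NO_VTABLE_OFFSET;

Method lookup_bar(const ClassObject *isa)
{
    if (bar_offset_cache != NO_VTABLE_OFFSET) {
        /* Fast path: load the slot at isa + cached byte offset. */
        return *(const Method *)((const char *)isa + bar_offset_cache);
    }
    /* Slow path: ask the class-specific resolver, which may fill the cache. */
    return resolveMethodAddress_ForA(isa, 1, &bar_offset_cache);
}
```

In this sketch the first lookup_bar(&classB) takes the slow path and caches
the offset; later calls load straight from the v-table. Because the resolver
takes an isa, super-dispatch can reuse it by passing classB.superclass, which
is the IV1 property.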
_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev