> On Feb 3, 2017, at 3:12 PM, John McCall <rjmcc...@apple.com> wrote:
> 
> I think we can generalize this discussion a bit by describing some 
> mostly-independent axes of variation:

That's great (sorry I used up the arabic+letter naming convention earlier in 
the thread)...

> I. A call site:
>  I1) inlines the lookup
>  I2) calls a function that performs the lookup and returns a function pointer
>  I3) calls a function that performs the lookup and calls the function
> I3 minimizes code size at the call site, but it prevents the lookup (or 
> components thereof) from being hoisted.  I1 requires at least some details of 
> the lookup to be ABI.

Exactly.
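
To make the three call-site shapes concrete, here is a minimal C
sketch. The object model, the struct layout, and every name in it are
assumptions made up for illustration; none of this is the actual Swift
ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object model, not the real Swift layout. */
typedef int (*method_fn)(void *self);

struct class_object {
    method_fn vtable[2];
};

struct object {
    const struct class_object *isa;
    int value;
};

enum { METHOD_VALUE = 0, METHOD_DOUBLED = 1, METHOD_COUNT = 2 };

int impl_value(void *self)   { return ((struct object *)self)->value; }
int impl_doubled(void *self) { return 2 * ((struct object *)self)->value; }

const struct class_object example_class = { { impl_value, impl_doubled } };

/* I1: the call site indexes the vtable itself, so the layout is ABI. */
int call_I1(struct object *obj) {
    return obj->isa->vtable[METHOD_DOUBLED](obj);
}

/* I2: a helper performs the lookup and returns the function pointer. */
method_fn lookup_method(const struct class_object *isa, size_t index) {
    assert(index < METHOD_COUNT);
    return isa->vtable[index];
}

/* With I2 the caller is free to hoist the lookup out of a loop. */
int call_I2_hoisted(struct object *obj, int iterations) {
    method_fn fn = lookup_method(obj->isa, METHOD_DOUBLED);
    int sum = 0;
    for (int i = 0; i < iterations; i++) {
        sum += fn(obj);
    }
    return sum;
}

/* I3: a helper performs the lookup and the call; the call site is the
 * smallest, but the lookup is hidden and cannot be hoisted. */
int dispatch_doubled(struct object *obj) {
    return lookup_method(obj->isa, METHOD_DOUBLED)(obj);
}
```

In this sketch only call_I1 bakes the table layout into client code,
which is the ABI cost mentioned above.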

> II. The function that performs the lookup:
> II1) is emitted lazily
> II2) is emitted eagerly
> II2 can use static information to implement the lookup inline without 
> affecting the ABI.  II1 cannot; it must either rely on ABI assumptions or 
> just recursively use another strategy, in which case the function is probably 
> just a call-site code-size optimization.

Sure, but I was thinking of it as client side vs. class-definition
side.  My suggestions explored the tradeoff between II1 and II2, where
the client code could optionally optimize for vtable dispatch by
emitting some lazy lookup helpers with a fall-back to the eagerly
emitted lookup.

> III. The function that performs the lookup:
>  III1) is method-specific
>  III2) is class-specific
>  III3) is part of the language runtime
> III1 requires a lot more functions, but if emitted with the class definition, 
> it allows the lookup to be statically specialized based on the method called, 
> so e.g. if a non-open method is never overridden, it can be resolved to a 
> specific function pointer immediately, possibly by just aliasing a symbol to 
> the function definition.  III3 minimizes the number of functions required but 
> doesn't allow us to evolve dispatch beyond what's supported by the current 
> language runtime and doesn't allow class-specific optimizations (e.g. if the 
> class definition statically knows the offset to its v-table).

III1 is the sort of optimization that can be selectively added later
if we're willing to version the ABI.

I suppose by that logic we should start with III3. We could never
ditch the runtime/compiler support, but newer code could move toward
III2 through ABI versioning.

I actually prefer to start with III2 because the static overhead is
not significantly worse than III3 and it can be optimized without ABI
migration.

> IV. The function that performs the lookup:
>  IV1) is parameterized by an isa
>  IV2) is not parameterized by an isa
> IV1 allows the same function to be used for super-dispatch but requires extra 
> work to be inlined at the call site (possibly requiring a chain of resolution 
> function calls).

In my first message I was trying to accomplish IV1. But IV2 is simpler
and I can't see a fundamental advantage to IV1. Why would it need a
lookup chain?
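
For reference, the two shapes being compared might look like this in
C. This is only my reading of IV1 vs. IV2; the class layout and all
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void *self);

/* Hypothetical class object with a superclass link. */
struct class_object {
    const struct class_object *superclass;
    method_fn vtable[1];
};

struct object {
    const struct class_object *isa;
};

int base_describe(void *self)    { (void)self; return 1; }
int derived_describe(void *self) { (void)self; return 2; }

const struct class_object base_class    = { NULL,        { base_describe } };
const struct class_object derived_class = { &base_class, { derived_describe } };

/* IV1: parameterized by an isa. The caller chooses which class to
 * look the method up in, so the same function serves super-dispatch. */
method_fn resolve_describe_in(const struct class_object *isa) {
    assert(isa != NULL);
    return isa->vtable[0];
}

/* IV2: not parameterized by an isa. The function always uses the
 * dynamic class of the object it is handed. */
method_fn resolve_describe(const struct object *obj) {
    return obj->isa->vtable[0];
}
```

With IV1, a super call from the derived class would pass
derived_class.superclass; with IV2 some other mechanism is needed for
that case.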

> V. For any particular function or piece of information, it can be accessed:
>  V1) directly through a symbol
>  V2) through a class-specific table
>  V3) through a hierarchy-specific table (e.g. the class object)
> V1 requires more global symbols, especially if the symbol is per-method, but 
> doesn't have any index-computation problems, and it's generally a bit more 
> efficient.
> V2 allows stable assignment of fixed indexes to entries because of 
> availability-sorting.
> V3 does not; it requires some ability to (at least) slide indexes of entries 
> because of changes elsewhere in the hierarchy.
> If there are multiple instantiations of a table (perhaps with different 
> information, like a v-table), V2 and V3 can be subject to table bloat.

I had proposed V2 as an option, but am strongly leaning toward V1 for
ABI simplicity and lower static costs (why generate vtables and offset
tables?).
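
As a rough illustration of the difference in exported symbols, here is
a C sketch of V1 vs. V2 applied to a per-method vtable index. The
class, the symbol names, and the table contents are all invented for
the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void *self);

struct class_object {
    method_fn vtable[2];
};

int shape_area(void *self)      { (void)self; return 10; }
int shape_perimeter(void *self) { (void)self; return 14; }

const struct class_object shape_class = { { shape_area, shape_perimeter } };

/* V1: one exported symbol per method. More symbols, but the client
 * never computes an index into a side table. */
const size_t Shape_area_vtable_index = 0;
const size_t Shape_perimeter_vtable_index = 1;

method_fn lookup_perimeter_v1(const struct class_object *isa) {
    return isa->vtable[Shape_perimeter_vtable_index];
}

/* V2: one exported table per class. Clients hard-code a stable entry
 * number (assumed here to be fixed by availability order). */
enum { SHAPE_ENTRY_AREA = 0, SHAPE_ENTRY_PERIMETER = 1 };
const size_t Shape_method_indexes[] = { 0, 1 };

method_fn lookup_v2(const struct class_object *isa, size_t entry) {
    assert(entry < sizeof Shape_method_indexes / sizeof Shape_method_indexes[0]);
    return isa->vtable[Shape_method_indexes[entry]];
}
```

V2 trades the per-method symbols for one table symbol plus an extra
indexed load at each lookup.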

> So I think your alternatives were:
> 1. I3, II2, III1, IV2, V1 (for the dispatch function): a direct call to a 
> per-method global function that performs the dispatch.  We could apply V2 to 
> this to decrease the number of global symbols required, but at the cost of 
> inflating the call site and requiring a global variable whose address would 
> have to be resolved at load time.  Has an open question about super dispatch.
> 2. I1, V3 (for the v-table), V1 (for the global offset): a load of a 
> per-method global variable giving an offset into the v-table.  Joe's 
> suggestion adds a helper function as a code-size optimization that follows 
> I2, II1, III1, IV2.  Again, we could also use V2 for the global offset to 
> reduce the symbol-table costs.
> 3. I2, II2, III2, IV1, V2 (for the class offset / dispatch mechanism table).  
> At least I think this is right?  The difference between 3a and 3b seems to be 
> about initialization, but maybe also shifts a lot of code-generation to the 
> call site?

I'll pick the following option as a starting point because it constrains
the ABI the least in terms of static costs and potential directions for
optimization:

"I2; (II1+II2); III2; IV1; V1"

method_entry = resolveMethodAddress_ForAClass(isa, method_index, &vtable_offset)

(where both modules would need to opt into the vtable_offset.)
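
One way this could play out, sketched in C. The semantics of the
vtable_offset out-parameter (the resolver reports a byte offset that
the client may cache), the sentinel value, the client helper, and the
class layout are all assumptions for illustration, not a settled
design.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void *self);

/* Hypothetical class object with an embedded vtable. */
struct class_object {
    method_fn vtable[2];
};

#define NO_VTABLE_OFFSET ((size_t)-1)

int impl_zero(void *self)   { (void)self; return 0; }
int impl_answer(void *self) { (void)self; return 42; }

const struct class_object sample_class = { { impl_zero, impl_answer } };

/* Eagerly emitted with the class definition (II2, III2, IV1). If the
 * caller passes a non-NULL vtable_offset, the resolver also reports
 * the byte offset of the entry from the start of the class object. */
method_fn resolveMethodAddress_ForAClass(const struct class_object *isa,
                                         size_t method_index,
                                         size_t *vtable_offset) {
    assert(method_index < 2);
    if (vtable_offset != NULL) {
        *vtable_offset = offsetof(struct class_object, vtable)
                       + method_index * sizeof(method_fn);
    }
    return isa->vtable[method_index];
}

/* Lazily emitted client-side helper (II1). It uses the cached offset
 * when it has one and otherwise falls back to the eager resolver. */
method_fn client_lookup(const struct class_object *isa,
                        size_t method_index,
                        size_t *cached_offset) {
    if (*cached_offset != NO_VTABLE_OFFSET) {
        return *(const method_fn *)((const char *)isa + *cached_offset);
    }
    return resolveMethodAddress_ForAClass(isa, method_index, cached_offset);
}
```

The cached offset is per method, and reusing it across classes assumes
subclasses keep inherited entries at the same offset. A client that
does not opt in would pass NULL and always take the slow path.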

I think any alternative would need to be demonstrably better in terms of code 
size or dynamic dispatch cost.

-Andy