On Thu, May 5, 2011 at 2:16 AM, Richard Guenther
<richard.guent...@gmail.com> wrote:
> On Thu, May 5, 2011 at 12:19 AM, Xinliang David Li <davi...@google.com> wrote:
>>>
>>> I can think of some more-or-less obvious high-level forms, one would
>>> for example simply stick a new DISPATCH tree into gimple_call_fn
>>> (similar to how we can have OBJ_TYPE_REF there), the DISPATCH
>>> tree would be of variable length, first operand the selector function
>>> and further operands function addresses.  That would keep the
>>> actual call visible (instead of a fake __builtin_dispatch call), something
>>> I'd really like to see.
>>
>> This sounds like a good long term solution.
>
> Thinking about it again maybe, similar to OBJ_TYPE_REF, have the
> selection itself lowered and only keep the set of functions as
> additional info.  Thus instead of having the selector function as
> first operand have a pointer to the selected function there (that also
> avoids too much knowledge about the return value of the selector).
> Thus,
>
>  sel = selector ();
>  switch (sel)
>   {
>   case A: fn = &bar; break;
>   case B: fn = &foo; break;
>   }
>  val = (*DISPATCH (fn, bar, foo)) (...);
>
> that way regular optimizations can apply to the selection, eventually
> discard the dispatch if fn becomes a known direct function (similar
> to devirtualization).  At expansion time the call address is simply
> taken from the first operand and an indirect call is assembled.
>
> Does the above still provide enough knowledge for the IPA path isolation?
>
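
For reference, the lowered form quoted above can be mimicked in plain C. This is only a rough sketch under stated assumptions: DISPATCH is a proposed tree code, not something C can express, so an ordinary function pointer stands in for it, and the function bodies, the `fn_t` type, and the `pick` and `call_dispatched` helpers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Selector values A and B follow the quoted pseudocode; everything else
   here (types, function bodies, helper names) is hypothetical. */
enum sel { A, B };

typedef int (*fn_t)(int);

static int bar(int x) { return x + 1; }   /* version selected by A */
static int foo(int x) { return x * 2; }   /* version selected by B */

/* Stand-in selector; a real one would inspect the CPU at run time. */
static enum sel selector(void) { return B; }

/* The selection is ordinary control flow, so constant propagation and
   similar passes can simplify it like any other switch. */
static fn_t pick(enum sel s)
{
    fn_t fn = NULL;
    switch (s) {
    case A: fn = bar; break;
    case B: fn = foo; break;
    }
    return fn;
}

/* Plays the role of val = (*DISPATCH (fn, bar, foo)) (...): the call
   itself is just an indirect call through the selected pointer. */
static int call_dispatched(int arg)
{
    fn_t fn = pick(selector());
    assert(fn != NULL);
    return fn(arg);
}
```

If the selector's result becomes a compile-time constant, `pick` folds to a single function address and the indirect call can turn into a direct one, which is the devirtualization-like effect described in the quote.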

I like your original proposal (extending call) better because related
information is tied together and is easier to hoist and clean up.

I want to propose a more general solution.

1) Generic Annotation Support for gcc IR -- it is used to attach
application/optimization-specific annotations to gimple statements, and
annotations can be passed around across passes. In gcc, I only see the
HISTOGRAM annotation for value profiling, which is not general enough.
2) Support of CallInfo for each callsite. This is an annotation, but
more standardized. The callinfo can be used to record information such
as call attributes, call side effects, mod-ref information etc ---
current gimple_call_flags can be folded into this Info structure.

Similarly (not related to this discussion), a LoopInfo structure can be
introduced to annotate loop back edge jumps to allow the FE to pass useful
information at loop level. For floating point operations, things
like the precision constraint, sensitivity to floating environment etc
can be recorded in FPInfo.
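
To make the CallInfo idea a bit more concrete, here is a minimal sketch of what such a per-callsite annotation might look like. All names below (`call_info`, `stmt`, `CALL_PURE`, `CALL_NOTHROW`, `call_has_flag`) are hypothetical and do not correspond to existing gcc data structures or flag values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; these are not gcc's actual call flags. */
enum call_flag { CALL_PURE = 1 << 0, CALL_NOTHROW = 1 << 1 };

typedef void (*target_fn)(void);

/* Hypothetical per-callsite annotation: flags plus the set of candidate
   callees, so related information stays tied to the call. */
struct call_info {
    unsigned flags;
    const target_fn *targets;
    size_t n_targets;
};

/* Hypothetical statement that optionally carries the annotation. */
struct stmt {
    int uid;
    struct call_info *cinfo;   /* NULL when the call is not annotated */
};

/* Query helper: an unannotated call conservatively has no flags set. */
static int call_has_flag(const struct stmt *s, unsigned flag)
{
    assert(s != NULL);
    return s->cinfo != NULL && (s->cinfo->flags & flag) != 0;
}
```

Because the annotation hangs off the statement, a pass that moves or copies the call carries the flags and candidate targets along with it.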



>>> Restricting ourselves to use the existing target attribute at the
>>> beginning (with a single, compiler-generated selector function)
>>> is probably good enough to get a prototype up and running.
>>> Extending it to arbitrary selector-function, value pairs using a
>>> new attribute is then probably easy (I don't see the exact use-case
>>> for that yet, but I suppose it exists if you say so).
>>
>> For the use cases, CPU model will be looked at instead of just the
>> core architecture -- this will give us more information about the
>> number of cores, size of caches etc. Intel's runtime library does
>> this checking at start up time so that the multi-versioned code can
>> look at those and make the appropriate decisions.
>>
>> It will be even more complicated for arm processors -- which can have
>> the same processor cores but configured differently w.r.t VFP, NEON
>> etc.
>
> Ah, indeed.  I hadn't thought about the tuning for different variants
> as opposed to enabling HW features.  So the interface for overloading
> would be sth like
>
> enum X { Foo = 0, Bar = 5 };
>
> enum X select () { return Bar; }
>
> void foo (void) __attribute__((dispatch(select, Bar)));
>

Yes, for overloading -- something like this looks good.
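
Since the `dispatch` attribute above is only a proposal, the following sketch emulates the (selector, value) pairing by hand in standard C. The table layout and the names `version`, `foo_versions`, `resolve`, `select_version`, `foo_for_Foo`, `foo_for_Bar` and `last_called` are assumptions made for illustration; the selector is renamed to avoid clashing with POSIX `select`.

```c
#include <assert.h>
#include <stddef.h>

enum X { Foo = 0, Bar = 5 };

typedef void (*version_fn)(void);

/* One row per version: call `fn` when the selector returns `value`. */
struct version { enum X value; version_fn fn; };

static int last_called = -1;   /* records which version ran, for illustration */
static void foo_for_Foo(void) { last_called = Foo; }
static void foo_for_Bar(void) { last_called = Bar; }

/* Stand-in for the user-supplied selector function. */
static enum X select_version(void) { return Bar; }

static const struct version foo_versions[] = {
    { Foo, foo_for_Foo },
    { Bar, foo_for_Bar },
};

/* Linear search for the version registered under value v. */
static version_fn resolve(enum X v, const struct version *tab, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].value == v)
            return tab[i].fn;
    return NULL;
}

/* Entry point: resolve once on first call, then reuse the cached pointer. */
static void foo(void)
{
    static version_fn cached;
    if (cached == NULL)
        cached = resolve(select_version(), foo_versions,
                         sizeof foo_versions / sizeof foo_versions[0]);
    assert(cached != NULL);
    cached();
}
```

With the attribute, the compiler would presumably build the equivalent of `foo_versions` from the declarations, so the user would only write the versions and the selector.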

Thanks,

David
