Jim-

Just out of idle curiosity, what would happen if the `complex' structure
were changed to something like:

	typedef struct { float re, im; int dummy; } complex;

Since this is no longer an HFA, would this kick the compiler into a mode
where the code would at least work, albeit not in the most efficient manner?

On Wed, Oct 31, 2001 at 02:23:25PM -0800, Jim Wilson wrote:
> The IA-64 ABI says that structures of floats are passed/returned decomposed
> into floating point registers.  The ABI calls them homogeneous floating-point
> aggregates, or HFA for short.  This also applies to complex types.  Thus your
> structure
> 	typedef struct {
> 	  float re, im;
> 	} complex;
> is handled by putting RE in one FP register, and IM in the next.  This is
> not normal practice, since the structure is 8 bytes, but ends up using 16
> bytes worth of register (ignoring long double to simplify the discussion).
> This requires special code to decompose/compose HFA arguments and return
> values on IA-64 when loading/storing them.  IA-32 does not use this
> convention, and thus does not need special code for HFAs.
>
> Because of the old design of the C front end, this special code is
> problematic.  The C front end generates low level code first, including
> code to compose/decompose HFAs, and then tries to do function inlining.
> When we inline a function, we have to optimize away the code that
> composes/decomposes HFAs, and this is so difficult that in practice it
> isn't worthwhile to try.  Thus we cannot inline a function that uses an
> HFA argument or return value.
>
> The C++ front end uses a more recent design that inlines first, and then
> generates low level code including the HFA compose/decompose code.  If you
> compile your example as C++ code, it will work.
>
> Work is underway to rewrite the C front end to make it work more like the
> C++ front end, or perhaps even just use the C++ front end for C.  When this
> work gets far enough, inlining of HFA functions will work in C.
> I just tried your example with the current FSF development sources, and it
> did work, so I think this is fixed as of Alexandre Oliva's 2001-10-05 gcc
> changes to the C front end.  I don't know how well it is working at the
> moment though.  However, I would expect it to be working fine by the time
> gcc 3.1 comes out in spring of 2002.
>
> Another consideration here is that the IL (Intermediate Language) used by
> gcc has no support for representing decomposed structures.  If we did,
> then we could get much better optimization of structures by separately
> optimizing every structure field as if it was a scalar.  But we don't,
> so the only way we can handle decomposed structures as arguments is to
> decompose them before the call, and then recompose them in the function
> prologue.  This is pretty inefficient, but it does work.  Fixing this will
> be a lot of work, and it will likely be a while before anyone tries.
>
> Jim

-- 
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
[EMAIL PROTECTED]
Ph: 303/652-0870x117
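
For anyone who wants to try the comparison, here is a minimal sketch of the
two layouts discussed above, each with a small inline function that takes and
returns the struct by value.  The names complex_hfa, complex_padded, mul_hfa,
mul_padded and check_layouts are made up for illustration and are not from the
original test case.  The sketch only sets the two declarations side by side;
whether the padded form actually avoids the inlining problem is the open
question and is not answered here.

```c
#include <assert.h>
#include <stddef.h>

/* Two floats and nothing else: per the quoted message, the IA-64 ABI
   treats this as an HFA and passes RE and IM in separate FP registers. */
typedef struct { float re, im; } complex_hfa;

/* Hypothetical variant from the question: the extra int member means the
   struct no longer consists only of floats. */
typedef struct { float re, im; int dummy; } complex_padded;

/* Complex multiply on the HFA layout, passed and returned by value. */
static inline complex_hfa mul_hfa(complex_hfa a, complex_hfa b)
{
    complex_hfa r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}

/* Same arithmetic on the padded layout; dummy is unused and set to 0. */
static inline complex_padded mul_padded(complex_padded a, complex_padded b)
{
    complex_padded r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    r.dummy = 0;
    return r;
}

/* Layout sanity checks: the HFA form is exactly two floats wide, and the
   padded form is larger by at least one int. */
static void check_layouts(void)
{
    assert(sizeof(complex_hfa) == 2 * sizeof(float));
    assert(sizeof(complex_padded) >= sizeof(complex_hfa) + sizeof(int));
}
```

Calling mul_hfa and mul_padded from the same caller and comparing the
generated assembly for the two versions is one way to see how a given
compiler handles each layout.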