| One way to make this happen would be for C-- call nodes to carry information
| about the calling convention of the target (e.g. how many arguments of each
| type the function expects; in the same way identifiers in Core carry their
| type).

That'd be entirely possible for "known" calls, where the target is known, but not for "unknown" (i.e. higher-order) ones, where the target of the call varies. The "Making a fast curry" paper goes into this in some detail.

I think we already have different entry points for these two cases. So maybe they could have different entry conventions...

Simon

| -----Original Message-----
| From: ghc-devs [mailto:[email protected]] On Behalf Of Ben Gamari
| Sent: 20 September 2017 16:54
| To: Moritz Angermann <[email protected]>; GHC developers <ghc-
| [email protected]>
| Subject: Re: The Curious Case of T6084 -or- Register Confusion with LLVM
|
| Moritz Angermann <[email protected]> writes:
|
| [snip]
| >
| > I should not have the YMM*, and ZMM* registers as I don’t have any AVX
| > nor AVX512; that looks like only a patch away. However we try to
| > optimize our register, such that we can pass up to six doubles or six
| > floats or any combination of both if needed in registers, without
| > having to allocate them on the stack, by assuming overlapping registers
| > (See Note [Overlapping global registers]).
| >
| > And as such a full function signature in LLVM would as opposed to one
| > that’s based on the “live” registers as we have right now, would
| > consist of 12 float/double registers, and LLVM only maps 6. My
| > current idea is to, pass only the explicit F1,D1,…,F3,D3 and try to
| > disable the register overlapping for LLVM. This would probably force
| > more floating values to be stack allocated rather than passed via
| > registers, but would likely guarantee that the registers match up.
| > The other option I can think of is to define some virtual generic
| > floating registers in the llvm code gen: V1,…,V6 and then perform
| > something like
| >
| >   F1 <- V1 as float
| >   D1 <- V1 as double
| >
| > in the body of the function, while trying to use the `live`
| > information at the call site to decide which of F1 or D1 to pass as V1.
| >
| Arguably the fundamental problem here is the assumption that all STG entry-
| points have the same machine-level calling convention. As you point out, our
| calling conventions in fact change due to things like register overlap.
| Ideally the LLVM we produce would reflect this.
|
| One way to make this happen would be for C-- call nodes to carry information
| about the calling convention of the target (e.g. how many arguments of each
| type the function expects; in the same way identifiers in Core carry their
| type). Unfortunately a brief look at the code generator suggests that this
| may require a fair amount of plumbing.
|
| It's important to note though that this overlap problem is something that
| will need to be addressed eventually if we are to have proper SIMD
| support (due to overlap between XMM, YMM, and ZMM).
|
| Cheers,
|
| - Ben
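
To make the known/unknown distinction in this thread concrete, here is a rough Haskell sketch of a call target that carries its floating-point argument classes. Everything in it is hypothetical: the names FpArg, Target, signatureRegs and regName do not exist in GHC, and it assumes the i-th floating-point argument lands in overlapping slot i (as F<i> or D<i>). A known target yields a precise register list; an unknown target has to fall back to naming both views of all six slots, which is the 12-register signature mentioned above.

```haskell
module Main where

-- Hypothetical sketch only: none of these names are GHC's own.

-- Class of a floating-point argument.
data FpArg = FloatArg | DoubleArg
  deriving (Eq, Show)

-- A known call carries the callee's argument classes;
-- an unknown (higher-order) call carries nothing.
data Target
  = Known String [FpArg]
  | Unknown
  deriving (Eq, Show)

-- Number of overlapping float/double slots discussed in the thread.
maxFpSlots :: Int
maxFpSlots = 6

-- Name of the global register for slot i, viewed at the given class.
regName :: Int -> FpArg -> String
regName i FloatArg  = 'F' : show i
regName i DoubleArg = 'D' : show i

-- Registers that would appear in the call's signature.
-- Assumption: the i-th FP argument occupies slot i.
signatureRegs :: Target -> [String]
signatureRegs (Known _ args) =
  zipWith regName [1 ..] (take maxFpSlots args)
signatureRegs Unknown =
  concat [ [regName i FloatArg, regName i DoubleArg] | i <- [1 .. maxFpSlots] ]

main :: IO ()
main = do
  print (signatureRegs (Known "f" [FloatArg, DoubleArg, FloatArg]))
  print (length (signatureRegs Unknown))
```

With a known callee taking (Float, Double, Float) this gives F1, D2, F3, so caller and callee can agree on three registers. The unknown case still lists all twelve names, which is why separate entry conventions for the two kinds of call might help.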
