On Dec 30, 2008, at 3:03 PM, Zack Rusin wrote:

> On Tuesday 30 December 2008 15:30:35 Chris Lattner wrote:
>> On Dec 30, 2008, at 6:39 AM, Corbin Simpson wrote:
>>>> However, the special instructions cannot directly be mapped to LLVM
>>>> IR, like "min", the conversion involves in 'extract' the vector,
>>>> create less-than-compare, create 'select' instruction, and create
>>>> 'insert-element' instruction.
>>
>> Using scalar operations obviously works, but will probably produce
>> very inefficient code.  One positive thing is that all target-specific
>> operations of supported vector ISAs (Altivec and SSE[1-4] currently)
>> are exposed either through LLVM IR ops or through target-specific
>> builtins/intrinsics.  This means that you can get access to all the
>> crazy SSE instructions, but it means that your codegen would have to
>> handle this target-specific code generation.
>
> I think Alex was referring here to an AOS layout which is completely
> not ready.
> The currently supported one is SOA layout which eliminates scalar
> operations.

Ok!

>> Sure, it would be very reasonable to make these target-specific
>> builtins when targeting a GPU, the same way we have target-specific
>> builtins for SSE.
>
> Actually currently the plan is to have essentially a "two pass" LLVM
> IR. I wanted the first one to never lower any of the GPU instructions
> so we'd have intrinsics or maybe even just function calls like
> gallium.lit, gallium.dot, gallium.noise and such. Then gallium should
> query the driver to figure out which instructions the GPU supports and
> run our custom llvm lowering pass that decomposes those into things
> the GPU supports.

That makes a lot of sense.  Note that there is no reason to use actual
LLVM intrinsics for this: naming them "gallium.lit" is just as good as
"llvm.gallium.lit" for example.

> Essentially I'd like to make as many complicated things in gallium as
> possible to make the GPU llvm backends in drivers as simple as
> possible and this would help us make the pattern matching in the
> generator /a lot/ easier (matching gallium.lit vs 9+ instructions it
> would be decomposed to) and give us a more generic GPU independent
> layer above. But that hasn't been done yet, I hope to be able to write
> that code while working on the OpenCL implementation for Gallium.

Makes sense.  For the more complex functions (e.g. texture lookup) you
can also just compile C code to LLVM IR and use the LLVM inliner to
inline the code if you prefer.

-Chris

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
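
For anyone following the thread, here is a rough sketch of the "two pass" idea discussed above: high-level ops such as gallium.dot stay intact until a lowering step knows what the driver supports, and only the unsupported ones get decomposed. This is plain Python over a toy instruction list, not LLVM or Gallium code. The tuple format, `lower_dot3`, `LOWERINGS`, `lower_for_driver` and the `supported_ops` set are all made up for illustration, and the 3-component dot product expansion is just one assumed decomposition.

```python
from itertools import count

# Toy instruction format (an assumption, not an LLVM or Gallium structure):
#   (op_name, destination, (operand, ...))


def lower_dot3(dst, args, fresh):
    """Expand dst = dot(a, b) on 3-component operands into mul/add ops."""
    a, b = args
    out, products = [], []
    for comp in "xyz":
        tmp = fresh()
        out.append(("mul", tmp, (f"{a}.{comp}", f"{b}.{comp}")))
        products.append(tmp)
    partial = fresh()
    out.append(("add", partial, (products[0], products[1])))
    out.append(("add", dst, (partial, products[2])))
    return out


# Hypothetical table of decompositions, keyed by high-level op name.
LOWERINGS = {"gallium.dot": lower_dot3}


def lower_for_driver(program, supported_ops):
    """Keep ops the driver reports as supported; decompose the rest."""
    counter = count()

    def fresh():
        # Generate a new temporary name for each intermediate value.
        return f"%t{next(counter)}"

    result = []
    for op, dst, args in program:
        if op in supported_ops:
            result.append((op, dst, args))
        elif op in LOWERINGS:
            result.extend(LOWERINGS[op](dst, args, fresh))
        else:
            raise ValueError(f"no lowering available for {op}")
    return result


program = [("gallium.dot", "%r", ("%a", "%b"))]
native = lower_for_driver(program, {"gallium.dot"})   # driver has a dot op
generic = lower_for_driver(program, {"mul", "add"})   # driver does not
```

With a driver that reports gallium.dot as supported, the program passes through unchanged and the backend can match a single op. Otherwise the same call is expanded into three multiplies and two adds before the backend sees it.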