Thanks to everyone who gave me feedback on my proposal. I have made some modifications based on the suggestions I received. The main change is that I am going to focus on doing the branch emulation and loop unrolling in the r300 compiler instead of doing it with TGSI. I left some time at the end of the project to explore doing a TGSI -> RC -> RC Branch Emulation -> TGSI translation to expose the branch emulation done in the r300 compiler to the rest of the Gallium drivers. I also modified some of the time estimates based on suggestions from Nicolai.

The full proposal is here:
http://socghop.appspot.com/gsoc/student_proposal/show/google/gsoc2010/tstellar/t126997450856

The project plan is reproduced below:

Tasks:

1. Improve branch emulation in the r300 compiler:
The goal of this task will be to improve upon the work done by Nicolai Hähnle in this branch:
http://cgit.freedesktop.org/~nh/mesa/log/?h=r300g-glsl
and fully support branch emulation in the r300 compiler. The first part of this task will involve testing the current branch emulation code to determine what works and what does not. Once that is done, work can begin on any part of the branch emulation that does not work correctly.

2. Unroll loops in the r300 compiler:
The goal of this task will be to unroll loops so that they can be executed by hardware that does not support them. The loop unrolling in this task is not meant as a code optimization. It is only being done to eliminate branch instructions. Loops where the number of iterations is known at compile time will be unrolled and may have additional optimizations applied. Loops that have an unknown number of iterations will have to be studied to see if there is a way to replace the loop with a set of instructions that produces the same output as the loop. For example, one solution might be to replace an ADD(src0, src0) instruction that is supposed to execute n times with a MUL(src0, n). It is possible that not all loops will be able to be unrolled successfully.

3. Loops and Conditionals for R500 fragment and vertex shaders:
The goal of this task will be to make use of the R500 hardware support for branches and loops. New radeon_compiler opcodes (RC_OPCODE_*) will need to be added to represent loops, and the corresponding TGSI instructions will need to be converted into these new opcodes during the TGSI_OPCODE_* to RC_OPCODE_* phase. Once this has been done, the code generator for R500 vertex and fragment shaders will need to be modified to output the correct hardware instructions for loops.

4. Optional Tasks:
Here is a list of things that could be explored if there is some extra time left at the end of the project.

a) Using r300 compiler for branch emulation/loop unrolling in Gallium drivers:
The goal of this task would be to apply branch emulation and loop unrolling to TGSI code. This would be accomplished by creating a Gallium util function that takes TGSI code, converts it to the r300 compiler intermediate language (RC), and then uses the r300 compiler to do the branch emulation and loop unrolling. The RC would then be converted back into TGSI and passed back to the driver calling the function.

b) More optimizations:
Revisit the work from previous tasks and explore doing some optimizations that may have been outside the scope of the original task.

c) Other GLSL features for the r300 compiler:
   i) Adding support for the gl_FrontFacing variable.
   ii) Handling varying modifiers like perspective, flat, and centroid.

d) Improving the GLSL frontend to add support for more language features.

Schedule / Deliverables:

1. Improve branch emulation in the r300 compiler (2 - 3 weeks)
2. Unroll loops in the r300 compiler (4 weeks)

Midterm Evaluation

3. Loops and Conditionals for R500 fragment and vertex shaders (4 weeks)
4. Optional Tasks (2 weeks)

Tasks 1-3 will be required for this project. Task 4 is optional.

Thanks.

-Tom

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
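
To make Tasks 1 and 2 of the plan above more concrete, here is a rough C sketch of the three rewrites they describe. It is only an illustration of the idea, not code from the r300 compiler. All function names are hypothetical, the trip count of 4 is an arbitrary choice, and the ADD example is read as accumulating src0 into a running total that starts at zero (an assumption about the intended example; an ADD that doubles src0 on every iteration would instead have the closed form src0 * 2^n). The select helper follows the CMP semantics of ARB_fragment_program (result is the second operand when the first is negative, otherwise the third).

```c
#include <assert.h>

/* Hypothetical stand-in for one loop body instruction: acc = acc + src0. */
static float add_step(float acc, float src0)
{
    return acc + src0;
}

/* The loop as the shader author wrote it: n iterations of the ADD. */
float loop_reference(float src0, int n)
{
    float acc = 0.0f;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        acc = add_step(acc, src0);
    return acc;
}

/* Trip count known at compile time (4 here): the loop becomes
 * straight-line code with no branch instructions left. */
float loop_unrolled_4(float src0)
{
    float acc = 0.0f;
    acc = add_step(acc, src0);
    acc = add_step(acc, src0);
    acc = add_step(acc, src0);
    acc = add_step(acc, src0);
    return acc;
}

/* Trip count not known at compile time: the whole loop is replaced
 * by a closed form, here a single multiply by n. */
float loop_as_mul(float src0, int n)
{
    assert(n >= 0);
    return src0 * (float)n;
}

/* CMP-style select: yields b when a < 0, otherwise c.
 * On the GPU this is one ALU instruction, not a jump. */
static float cmp_select(float a, float b, float c)
{
    return (a < 0.0f) ? b : c;
}

/* A conditional as the shader author wrote it. */
float branch_reference(float x)
{
    if (x < 0.0f)
        return -x * 2.0f;
    else
        return x + 1.0f;
}

/* Branch emulation: evaluate both sides unconditionally,
 * then pick one result with the select. */
float branch_emulated(float x)
{
    float then_val = -x * 2.0f;
    float else_val = x + 1.0f;
    return cmp_select(x, then_val, else_val);
}
```

Calling loop_reference(1.5f, 4), loop_unrolled_4(1.5f) and loop_as_mul(1.5f, 4) gives the same value for this input, but repeated floating-point addition and a single multiply can round differently in general, so a closed-form replacement is not always bit-exact. The emulated branch also pays for both sides of the conditional, which is the usual cost of this approach.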