Oh, I forgot... patch 20 should probably get cherry-picked as well.
On Tue, Dec 16, 2014 at 6:01 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>
>
> On Tue, Dec 16, 2014 at 2:52 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>>
>> Hi,
>>
>> On Tue, Dec 16, 2014 at 1:04 AM, Jason Ekstrand <ja...@jlekstrand.net>
>> wrote:
>> > NIR (pronounced "ner") is a new IR (intermediate representation) for the
>> > Mesa shader compiler that will sit between the old IR (GLSL IR) and the
>> > back-end compilers. The primary purpose of NIR is to be more efficient
>> > for doing optimizations and to generate better code for the back-ends.
>> > We have a lot of optimizations implemented in GLSL IR right now.
>> > However, they still generate fairly bad code, primarily because its
>> > tree-based structure makes writing good optimizations difficult. For
>> > this reason, we have implemented a lot of optimizations in the i965
>> > back-end compilers just to fix up the code we get from GLSL IR. The
>> > "proper fix" to this is to implement a better high-level IR; enter NIR.
>> >
>> > Most of the initial work on NIR, including setting up common data
>> > structures, helper methods, and a few basic passes, was by Connor
>> > Abbott, who interned with us over the summer. Connor did a fantastic
>> > job, but there is still a lot left to be done. I've spent the last two
>> > months trying to fill in the pieces that we need in order to get NIR off
>> > the ground. At this point, we now have competent into-SSA and
>> > out-of-SSA passes, are at zero piglit regressions for i965 SIMD8
>> > fragment shaders, and the shader-db numbers aren't terrible.
>> >
>> > This is still a bit experimental. I have been testing only on HSW but it
>> > should work ok on SNB and later. Eventually, once we get booleans fixed
>> > up, it should work fine on older chips as well. It also doesn't yet
>> > support SIMD16, so performance won't be that great.
>> > That said, I think we are at the point now where we should try and land
>> > this and I can stop developing in my massive private branch. Since this
>> > isn't quite ready for prime-time yet, using it requires setting the
>> > INTEL_USE_NIR environment variable.
>> >
>> > A few key points about NIR:
>> >
>> > 1. It is primarily an SSA-based IR.
>> > 2. It supports source/destination-modifiers and swizzles/*write-masks.
>> > 3. Standard GPU operations such as sin() and fmad() are first-class ALU
>> >    operations, not intrinsics.
>> > 4. GLSL concepts like inputs, outputs, uniforms, etc. are built into the
>> >    IR so we can do proper analysis on them.
>> > 5. Even though it's SSA, it still has a concept of registers and
>> >    write-masks in the core IR data structures. This means we can
>> >    generate code that is much closer to what backends want.
>> > 6. Control flow is structured explicitly in the IR.
>> >
>> > (*write-masks are not available for SSA values)
>> >
>> > While source/destination modifiers and writemasks/swizzles are not
>> > particularly useful for optimizations, having them represented in the IR
>> > gives us the ability to generate more useful code for backends.
>> >
>> > A few notes about review:
>> >
>> > 1. For those of you who aren't interested in the general compiler, I'm
>> >    sorry for the patch-bomb. However, several people have requested that
>> >    we maintain the history of the NIR development since Connor's
>> >    original drop at the end of the summer. Therefore, while I've
>> >    squashed several things, I've tried to leave the diff of what I've
>> >    done more-or-less preserved.
>> >
>> > 2. No, this is not LLVM. There was a long-winded discussion about that
>> >    when Connor dropped his patches that went a whole lot of nowhere as
>> >    usual. I would really prefer if we left that debate alone. If there
>> >    must be bikeshedding on the topic, please do so on the cover-letter
>> >    e-mail.
>> >
>> > 3. Please keep all bikeshedding about C++, typedefs, etc. on the core
>> >    datastructures e-mail. If we need, we can split that off in its own
>> >    thread.
>> >
>> > 4. While I welcome review, I don't plan to make non-trivial changes to
>> >    specific patches or squash anything beyond what has already been
>> >    squashed. I've tried thus far to more-or-less keep the history and
>> >    I'd like to continue this if we can.
>>
>> I know you've said this, but I think there might still be some benefit
>> from re-arranging a few things. In particular, I think patches 21, 36,
>> 39, 59, and 65 should probably get put first so that we can push them
>> + patch 1 right away (with appropriate review), since they're not
>> NIR-specific. I've reviewed the ones I feel qualified to review (and
>> that I didn't write!) to help with this. I know I got feedback on a
>> few of those prep patches that we should wait to commit them until the
>> things they introduce have users, but I think that since there are now
>> patches in the list and we want to land them soon-ish it might be a
>> good idea to commit them earlier in order to reduce the size of this
>> patch-bomb :) Feel free to disagree, though...
>
>
> I'm totally OK with cherry-picking and pushing those early. What I don't
> want is a bunch of "patch 34 and 76 should get squashed except for this one
> hunk which should go in 52". Unless, of course, I really did make a
> nonsense rebasing error. Splitting patches would probably be ok if needed
> though.
>
>> Connor
>>
>> >
>> > 5. Eric Anholt has also written NIR -> TGSI -> NIR passes which will
>> >    hopefully get landed soon after NIR initially lands. Exactly how that
>> >    all gets hooked up for other gallium drivers beyond vc4 is outside
>> >    the scope of this series.
>> >
>> > I have pushed a branch to my personal freedesktop.org account. For
>> > certain types of review, it may be easier to look at the end result
>> > rather than the patches.
>> > The branch can be found via freedesktop cgit here:
>> >
>> > http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/nir-v1
>> >
>> > Last week, I did a presentation for some of the other Intel people to
>> > try and help bring them up to speed on NIR concepts quickly. As part of
>> > this, I typed up a bunch of notes that provide a decent overview of a
>> > lot of NIR concepts. Those notes can be found here:
>> >
>> > http://www.jlekstrand.net/jason/projects/mesa/nir-notes/
>> >
>> > Happy reviewing!
>> >
>> > P.S. Connor, Don't do too much reviewing before your finals are done.
>> > :-P
>> >
>> > Connor Abbott (22):
>> >   exec_list: add a list_foreach_typed_reverse() macro
>> >   nir: add initial README
>> >   nir: add a simple C wrapper around glsl_types.h
>> >   nir: add the core datastructures
>> >   nir: add core helper functions
>> >   nir: add a printer
>> >   nir: add a validation pass
>> >   nir: add a glsl-to-nir pass
>> >   nir: add a pass to lower variables for scalar backends
>> >   nir: keep track of the number of input, output, and uniform slots
>> >   nir: add a pass to remove unused variables
>> >   nir: add a pass to lower sampler instructions
>> >   nir: add a pass to lower system value reads
>> >   nir: add a pass to lower atomics
>> >   nir: add an optimization to turn global registers into local registers
>> >   nir: calculate dominance information
>> >   nir: add a pass to convert to SSA
>> >   nir: add an SSA-based copy propagation pass
>> >   nir: add an SSA-based dead code elimination pass
>> >   i965/fs: make emit_fragcoord_interpolation() more general
>> >   i965/fs: Don't pass through the coordinate type
>> >   i965/fs: add a NIR frontend
>> >
>> > Jason Ekstrand (101):
>> >   i965/fs: Only use nir for 8-wide non-fast-clear shaders.
>> >   i965/fs_nir: Make the sampler register always unsigned
>> >   i965/fs_nir: Use the correct types for texture inputs
>> >   i965/fs_nir: Use the correct texture offset immediate
>> >   Fix what I think are a few NIR typos
>> >   Fix up varying pull constants
>> >   i965/fs_nir: Add support for sample_pos and sample_id
>> >   nir/glsl: Add support for saturate
>> >   nir: Add fine and coarse derivative opcodes
>> >   nir/glsl: Add support for coarse and fine derivatives
>> >   i965/fs_nir: Handle coarse/fine derivatives
>> >   nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE
>> >   i965/fs_nir: Add atomic counters support
>> >   i965/fs: Allow reinterpretation in constant propagation
>> >   nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean immediates
>> >   nir: Add intrinsics to do alternate interpolation on inputs
>> >   i965/fs: Don't take an ir_variable for emit_general_interpolation
>> >   i965/fs_nir: Don't duplicate emit_general_interpolation
>> >   nir: Add a naieve from-SSA pass
>> >   nir: Add a lower_vec_to_movs pass
>> >   i965/fs_nir: Convert the shader to/from SSA
>> >   nir/lower_variables_scalar: Silence a compiler warning
>> >   nir: Add a basic metadata management system
>> >   nir: Add an assert
>> >   nir/foreach_block: Return false if the callback on the last block fails
>> >   nir: Add a foreach_block_reverse function
>> >   nir: Add a function to detect if a block is immediately followed by an if
>> >   nir: Make the nir_index_* functions return the nuber of items
>> >   nir: Add an SSA-based liveness analysis pass.
>> >   nir: Add an initialization function for SSA definitions
>> >   nir: Automatically handle SSA uses when an instruction is inserted
>> >   nir: Add a function for rewriting all the uses of a SSA def
>> >   nir: Add a parallel copy instruction type
>> >   nir: Add a function for comparing two sources
>> >   nir: Add a better out-of-SSA pass
>> >   i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src
>> >   glsl/list: Fix the exec_list_validate function
>> >   nir: Validate all lists in the validator
>> >   nir/print: Don't reindex things
>> >   nir: Differentiate between signed and unsigned versions of find_msb
>> >   i965/fs_nir: Validate optimization passes
>> >   nir/nir: Fix a bug in move_successors
>> >   glsl/list: Add a foreach_list_typed_safe_reverse macro
>> >   nir/nir: Use safe iterators when iterating over the CFG
>> >   nir/nir: Patch up phi predecessors in move_successors
>> >   nir: Add a peephole select optimization
>> >   i965/fs_nir: Turn on the peephole select optimization
>> >   nir: Validate that the SSA def and register indices are unique
>> >   nir: Add a fused multiply-add peephole
>> >   nir: Add a basic CSE pass
>> >   i965/fs_nir: Add the CSE pass and actually run in a loop
>> >   i965/fs_nir: Use an array rather than a hash table for register lookup
>> >   i965/fs_nir: Handle SSA constants
>> >   i965/fs_nir: Properly saturate multiplies
>> >   nir: Add a helper for rewriting an instruction source
>> >   nir/lower_samplers: Use the nir_instr_rewrite_src function
>> >   nir: Clean up nir_deref helper functions
>> >   nir: Make array deref direct vs. indirect an enum
>> >   nir: Add a concept of a wildcard array dereference
>> >   nir: Use an integer index for specifying structure fields
>> >   nir: Don't require a function in ssa_def_init
>> >   nir/copy_propagate: Don't cause size mismatches on phi node sources
>> >   nir: Validate that the sources of a phi have the same size as the destination
>> >   nir/glsl: Don't allocate a state_slots array for 0 state slots
>> >   i965/fs_nir: Don't dump the shader.
>> >   nir: Use the enum for the variable mode
>> >   nir: Automatically update SSA if uses
>> >   nir: Add a copy splitting pass
>> >   nir: Add a pass to lower local variable accesses to SSA values
>> >   nir: Add a pass to lower local variables to registers
>> >   nir: Add a pass for lowering input/output loads/stores
>> >   nir: Add a pass to lower global variables to local variables
>> >   nir/glsl: Generate SSA NIR
>> >   i965/fs_nir: Use the new variable lowering code
>> >   nir/validate: Ensure that outputs are write-only and inputs are read-only
>> >   nir: Remove the old variable lowering code
>> >   nir: Vectorize intrinsics
>> >   nir/validate: Validate intrinsic source/destination sizes
>> >   nir: Add gpu_shader5 interpolation intrinsics
>> >   nir/glsl: Add support for gpu_shader5 interpolation instrinsics
>> >   nir: Add a helper for getting a constant value from an SSA source
>> >   i965/fs_nir: Add a has_indirect flag and clean up some of the input/output code
>> >   i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics
>> >   nir: Add neg, abs, and sat opcodes
>> >   nir: Add a lowering pass for adding source modifiers where possible
>> >   nir: Make the type casting operations static inline functions
>> >   nir/glsl: Emit abs, neg, and sat operations instead of source modifiers
>> >   nir: Add an expression matching framework
>> >   nir: Add infastructure for generating algebraic transformation passes
>> >   nir: Add an algebraic optimization pass
>> >   nir: Add a basic constant folding pass
>> >   nir: Remove the ffma peephole
>> >   nir: Make texture instruction names more consistent
>> >   nir: Constant fold array indirects
>> >   nir: Use a source for uniform buffer indices instead of an index
>> >   nir: Add a sampler index indirect to nir_tex_instr
>> >   nir: Rework the way samplers are lowered
>> >   i965/fs_nir: Add support for indirect texture arrays
>> >   nir/metadata: Rename metadata_dirty to metadata_preserve
>> >   nir: Call nir_metadata_preserve more places
>> >   nir: Make bcsel a fully vector operation
>> >
>> >  src/glsl/Makefile.am | 10 +-
>> >  src/glsl/Makefile.sources | 39 +-
>> >  src/glsl/list.h | 19 +-
>> >  src/glsl/nir/README | 118 ++
>> >  src/glsl/nir/glsl_to_nir.cpp | 1825 +++++++++++++++++
>> >  src/glsl/nir/glsl_to_nir.h | 40 +
>> >  src/glsl/nir/nir.c | 2042 ++++++++++++++++++++
>> >  src/glsl/nir/nir.h | 1433 ++++++++++++++
>> >  src/glsl/nir/nir_algebraic.py | 249 +++
>> >  src/glsl/nir/nir_dominance.c | 298 +++
>> >  src/glsl/nir/nir_from_ssa.c | 859 ++++++++
>> >  src/glsl/nir/nir_intrinsics.c | 49 +
>> >  src/glsl/nir/nir_intrinsics.h | 140 ++
>> >  src/glsl/nir/nir_live_variables.c | 282 +++
>> >  src/glsl/nir/nir_lower_atomics.c | 146 ++
>> >  src/glsl/nir/nir_lower_global_vars_to_local.c | 107 +
>> >  src/glsl/nir/nir_lower_io.c | 324 ++++
>> >  src/glsl/nir/nir_lower_locals_to_regs.c | 308 +++
>> >  src/glsl/nir/nir_lower_samplers.cpp | 181 ++
>> >  src/glsl/nir/nir_lower_system_values.c | 107 +
>> >  src/glsl/nir/nir_lower_to_source_mods.c | 181 ++
>> >  src/glsl/nir/nir_lower_variables.c | 1046 ++++++++++
>> >  src/glsl/nir/nir_lower_vec_to_movs.c | 96 +
>> >  src/glsl/nir/nir_metadata.c | 54 +
>> >  src/glsl/nir/nir_opcodes.c | 46 +
>> >  src/glsl/nir/nir_opcodes.h | 356 ++++
>> >  src/glsl/nir/nir_opt_algebraic.py | 67 +
>> >  src/glsl/nir/nir_opt_constant_folding.c | 355 ++++
>> >  src/glsl/nir/nir_opt_copy_propagate.c | 325 ++++
>> >  src/glsl/nir/nir_opt_cse.c | 269 +++
>> >  src/glsl/nir/nir_opt_dce.c | 186 ++
>> >  src/glsl/nir/nir_opt_global_to_local.c | 103 +
>> >  src/glsl/nir/nir_opt_peephole_select.c | 214 ++
>> >  src/glsl/nir/nir_print.c | 948 +++++++++
>> >  src/glsl/nir/nir_remove_dead_variables.c | 138 ++
>> >  src/glsl/nir/nir_search.c | 337 ++++
>> >  src/glsl/nir/nir_search.h | 80 +
>> >  src/glsl/nir/nir_split_var_copies.c | 225 +++
>> >  src/glsl/nir/nir_to_ssa.c | 660 +++++++
>> >  src/glsl/nir/nir_types.cpp | 143 ++
>> >  src/glsl/nir/nir_types.h | 75 +
>> >  src/glsl/nir/nir_validate.c | 912 +++++++++
>> >  src/mesa/drivers/dri/i965/Makefile.sources | 1 +
>> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 74 +-
>> >  src/mesa/drivers/dri/i965/brw_fs.h | 57 +-
>> >  .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +-
>> >  src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 32 +-
>> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 1778 +++++++++++++++++
>> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 39 +-
>> >  src/mesa/main/bitset.h | 1 +
>> >  50 files changed, 17301 insertions(+), 77 deletions(-)
>> >  create mode 100644 src/glsl/nir/README
>> >  create mode 100644 src/glsl/nir/glsl_to_nir.cpp
>> >  create mode 100644 src/glsl/nir/glsl_to_nir.h
>> >  create mode 100644 src/glsl/nir/nir.c
>> >  create mode 100644 src/glsl/nir/nir.h
>> >  create mode 100644 src/glsl/nir/nir_algebraic.py
>> >  create mode 100644 src/glsl/nir/nir_dominance.c
>> >  create mode 100644 src/glsl/nir/nir_from_ssa.c
>> >  create mode 100644 src/glsl/nir/nir_intrinsics.c
>> >  create mode 100644 src/glsl/nir/nir_intrinsics.h
>> >  create mode 100644 src/glsl/nir/nir_live_variables.c
>> >  create mode 100644 src/glsl/nir/nir_lower_atomics.c
>> >  create mode 100644 src/glsl/nir/nir_lower_global_vars_to_local.c
>> >  create mode 100644 src/glsl/nir/nir_lower_io.c
>> >  create mode 100644 src/glsl/nir/nir_lower_locals_to_regs.c
>> >  create mode 100644 src/glsl/nir/nir_lower_samplers.cpp
>> >  create mode 100644 src/glsl/nir/nir_lower_system_values.c
>> >  create mode 100644 src/glsl/nir/nir_lower_to_source_mods.c
>> >  create mode 100644 src/glsl/nir/nir_lower_variables.c
>> >  create mode 100644 src/glsl/nir/nir_lower_vec_to_movs.c
>> >  create mode 100644 src/glsl/nir/nir_metadata.c
>> >  create mode 100644 src/glsl/nir/nir_opcodes.c
>> >  create mode 100644 src/glsl/nir/nir_opcodes.h
>> >  create mode 100644 src/glsl/nir/nir_opt_algebraic.py
>> >  create mode 100644 src/glsl/nir/nir_opt_constant_folding.c
>> >  create mode 100644 src/glsl/nir/nir_opt_copy_propagate.c
>> >  create mode 100644 src/glsl/nir/nir_opt_cse.c
>> >  create mode 100644 src/glsl/nir/nir_opt_dce.c
>> >  create mode 100644 src/glsl/nir/nir_opt_global_to_local.c
>> >  create mode 100644 src/glsl/nir/nir_opt_peephole_select.c
>> >  create mode 100644 src/glsl/nir/nir_print.c
>> >  create mode 100644 src/glsl/nir/nir_remove_dead_variables.c
>> >  create mode 100644 src/glsl/nir/nir_search.c
>> >  create mode 100644 src/glsl/nir/nir_search.h
>> >  create mode 100644 src/glsl/nir/nir_split_var_copies.c
>> >  create mode 100644 src/glsl/nir/nir_to_ssa.c
>> >  create mode 100644 src/glsl/nir/nir_types.cpp
>> >  create mode 100644 src/glsl/nir/nir_types.h
>> >  create mode 100644 src/glsl/nir/nir_validate.c
>> >  create mode 100644 src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> >
>> > --
>> > 2.2.0
>> >
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev