After all the messages, some think that LLVM is the solution. So why is Connor's solution the right one?
This is a very hard problem, and some people want the easiest way out, which is LLVM. I think we need Connor's in-house approach, and I think we can have compiler experts here. If no one else wants to say it: Mesa developers fear compiler internals.

On Sat, Aug 16, 2014 at 3:12 AM, Connor Abbott <cwabbo...@gmail.com> wrote:
> I know what you might be thinking right now. "Wait, *another* IR? Don't
> we already have like 5 of those, not counting all the driver-specific
> ones? Isn't this stuff complicated enough already?" Well, there are some
> pretty good reasons to start afresh (again...). In the years we've been
> using GLSL IR, we've come to realize that, in fact, it's not what we
> want *at all* to do optimizations on. Ian has done a talk at FOSDEM that
> highlights some of the problems they've run into:
>
> https://video.fosdem.org/2014/H1301_Cornil/Saturday/Three_Years_Experience_with_a_Treelike_Shader_IR.webm
>
> But here's the summary:
>
> * GLSL IR is way too much of a memory hog, since it has to make a new
> variable for each temporary the compiler creates and then each time you
> want to dereference that temporary you need to create an
> ir_dereference_variable that points to it which is also very
> cache-unfriendly ("downright cache-mean!").
>
> * The expression trees were originally added so that we could do
> pattern matching to automatically optimize things, but this turned out
> to be both very difficult to do and not very helpful. Instead, all it
> does is add more complexity to the IR without much benefit - with SSA or
> having proper use-def chains, we could get back what the trees give us
> while also being able to do lots more optimizations.
>
> * We don't have the concept of basic blocks in GLSL IR, which makes a
> lot of optimizations harder because they were originally designed with
> basic blocks in mind - take, for example, my SSA series.
> I had to map a
> whole lot of concepts that were based on the control flow graph to this
> tree of statements that GLSL IR uses, and the end result wound up
> looking nothing at all like the original paper. This problem gets even
> worse for things like e.g. Global Code Motion that depend upon having
> the dominance tree.
>
> I originally wanted to modify GLSL IR to fix these problems by adding
> new instruction types that would address these issues and then
> converting back and forth between the old and the new form, but I
> realized that fixing all the problems would basically mean a complete
> rewrite - and if that's the case, then why don't we start from scratch?
> So I took Ken's suggestions and started designing, and then at Intel
> over the summer started implementing, a completely new IR which I call
> NIR that's at a lower level than GLSL IR, but still high-level enough to
> be mostly device-independent (different drivers may have different
> passes and different ways of lowering e.g. matrix multiplies) so that
> we can do generic optimizations on it. Having support for SSA from the
> beginning was also a must, because lots of optimisations that we really
> want for cleaning up DX9-translated games are either a lot easier in or
> made possible by SSA. I also made the decision for it to be typeless,
> because that's what the cool kids are all doing :) and for a
> lower-level, flat IR it seemed like the thing to do (it could have gone
> either way, though). So the key design points of NIR (pronounced either
> like "near" as in "NIR is near!"
> or to rhyme with "burr") are:
>
> * It's flat (no expression trees)
>
> * It's typeless
>
> * Modifiers (abs, negate, saturate), swizzles, and write masks are part
> of ALU instructions
>
> * It includes enough GLSL-like things (variables that you can load from
> or store to, function calls) to be hardware-agnostic (although we don't
> have a way to represent matrix multiplies right now, but that could
> easily be added) to be able to do optimizations at a high level, while
> having lowering passes that convert variables to registers and
> input/output/uniform loads/stores that will open up more opportunities
> for optimization and save memory while being more hardware-specific.
>
> * Control flow consists of a tree of if statements and loops, like in
> GLSL IR, except the leaves of the tree are now basic blocks instead of
> instructions. Also, each basic block keeps track of its successors and
> predecessors, so the control flow graph is explicit in the IR.
>
> * SSA is natively supported, and SSA uses point directly to the SSA
> definition, which means that the use-def chains are always there, and
> def-use chains are kept by tracking the set of all uses for each
> definition.
>
> * It's written in C.
>
> (see the README in patch 3 and nir.h in patch 4 for more details)
>
> Some things that are missing or could be improved:
>
> * There's currently no alias tracking for inputs, outputs, and uniforms.
> This is especially important for uniforms because we don't pack them
> like we pack inputs and outputs.
>
> * We need a way to represent matrix multiplies so that we can do
> matrix-flipping optimizations in NIR (currently GLSL IR does this for
> us).
>
> * I'm not entirely happy about how we represent loads and stores in the
> IR. Right now, they're intrinsics, but that means we need a different
> intrinsic for each size and combination of arguments (indirect vs. not
> indirect, etc.)
> and we might run into a combinatorial explosion problem
> in the future, so we might need to make separate load/store instructions
> like what I did for textures.
>
> * Right now, we only have a pass that lowers variables for scalar
> backends. We need to write a similar pass for vector backends that uses
> std140 packing or something similar, as well as porting
> lower_ubo_reference to NIR and changing it to output offsets in the
> hardware-native units instead of bytes.
>
> * We'll need to write a pass that splits up vector expressions for
> scalar backends.
>
> The first two patches are preparatory patches that I already sent to the
> list, but I'm re-sending them as part of the series as they haven't
> landed yet. Right now, the series only has code to convert GLSL IR to
> NIR, but no way to actually hook it up to a backend in order to generate
> code from it, and it also doesn't do anything with the SSA part of the
> IR. I have a branch on my Github that does the conversion to SSA and a
> few simple SSA-based optimizations, which hasn't been tested as much
> (since I haven't written a pass to get out of SSA or a backend that uses
> SSA):
>
> https://github.com/cwabbott0/mesa/tree/nir
>
> and an experimental backend for i965 fs that I hope to combine with
> Matt's SSA work; right now, there are only a few piglit regressions and
> most of them are because of the hacky way I changed boolean true to be
> 0xFFFFFF instead of 1 (with Matt's series to do the same thing in a
> better way, they should go away) or because of unimplemented features
> (atomics and some system values):
>
> https://github.com/cwabbott0/mesa/tree/nir-i965-fs
>
> NIR has been what I've worked on for my entire summer internship at
> Intel, and before I go off to my freshman year at college, I'd like to
> thank the other Intel folks for the knowledge they've given me and the
> many interesting discussions that made this go from an idea to a reality
> - I'll miss you guys!
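
To make the SSA design point quoted above concrete, here is a minimal C sketch of how I read it: each source points straight at its SSA definition (the use-def chain), and each definition records every instruction that reads it (the def-use chain). All names here (ssa_def, instr, instr_init, rewrite_uses, MAX_USES) are made up for illustration and are not taken from nir.h; the fixed-size array is only a stand-in for the set of uses that the cover letter describes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_USES 8   /* hypothetical cap; a real IR would use a growable set */

struct instr;

/* An SSA value: defined exactly once, by `parent`. */
struct ssa_def {
    struct instr *parent;          /* instruction that produces the value */
    struct instr *uses[MAX_USES];  /* def-use chain: every reader of the value */
    size_t num_uses;
};

/* A flat instruction: sources are SSA defs, never nested expressions. */
struct instr {
    const char *op;
    struct ssa_def *src[2];        /* use-def chain: direct pointers to defs */
    size_t num_srcs;
    struct ssa_def def;            /* the value this instruction defines */
};

/* Set up an instruction and register it as a user of each non-NULL source. */
void instr_init(struct instr *in, const char *op,
                struct ssa_def *a, struct ssa_def *b)
{
    struct ssa_def *srcs[2] = { a, b };

    in->op = op;
    in->num_srcs = 0;
    in->src[0] = in->src[1] = NULL;
    in->def.parent = in;
    in->def.num_uses = 0;

    for (size_t i = 0; i < 2; i++) {
        if (srcs[i] == NULL)
            continue;
        assert(srcs[i]->num_uses < MAX_USES);
        in->src[in->num_srcs++] = srcs[i];
        srcs[i]->uses[srcs[i]->num_uses++] = in;
    }
}

/* Make every reader of old_def read new_def instead, keeping both chains
 * consistent. Only the users of old_def are visited. */
void rewrite_uses(struct ssa_def *old_def, struct ssa_def *new_def)
{
    for (size_t u = 0; u < old_def->num_uses; u++) {
        struct instr *user = old_def->uses[u];

        for (size_t s = 0; s < user->num_srcs; s++) {
            if (user->src[s] == old_def)
                user->src[s] = new_def;
        }
        assert(new_def->num_uses < MAX_USES);
        new_def->uses[new_def->num_uses++] = user;
    }
    old_def->num_uses = 0;
}
```

With both chains in place, replacing one value by another is a walk over the old definition's users rather than a search through the whole shader, which is the kind of thing the expression trees make awkward.
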
>
> Connor
>
> Connor Abbott (16):
>   exec_list: add a list_foreach_typed_reverse() macro
>   glsl/linker: pass through the is_intrinsic flag
>   nir: add initial README
>   nir: add a simple C wrapper around glsl_types.h
>   nir: add the core datastructures
>   nir: add core helper functions
>   nir: add a printer
>   nir: add a validation pass
>   nir: add a glsl-to-nir pass
>   nir: add a pass to lower variables for scalar backends
>   nir: keep track of the number of input, output, and uniform slots
>   nir: add a pass to remove unused variables
>   nir: add a pass to lower sampler instructions
>   nir: add a pass to lower system value reads
>   nir: add a pass to lower atomics
>   nir: add an optimization to turn global registers into local registers
>
>  src/glsl/Makefile.sources                 |   18 +-
>  src/glsl/link_functions.cpp               |    2 +
>  src/glsl/list.h                           |    6 +
>  src/glsl/nir/README                       |  118 ++
>  src/glsl/nir/glsl_to_nir.cpp              | 1759 +++++++++++++++++++++++++++++
>  src/glsl/nir/glsl_to_nir.h                |   40 +
>  src/glsl/nir/nir.c                        | 1717 ++++++++++++++++++++++++++++
>  src/glsl/nir/nir.h                        | 1270 +++++++++++++++++++++
>  src/glsl/nir/nir_intrinsics.c             |   49 +
>  src/glsl/nir/nir_intrinsics.h             |  158 +++
>  src/glsl/nir/nir_lower_atomics.c          |  127 +++
>  src/glsl/nir/nir_lower_samplers.cpp       |  170 +++
>  src/glsl/nir/nir_lower_system_values.c    |  106 ++
>  src/glsl/nir/nir_lower_variables_scalar.c | 1243 ++++++++++++++++++++
>  src/glsl/nir/nir_opcodes.c                |   46 +
>  src/glsl/nir/nir_opcodes.h                |  346 ++++++
>  src/glsl/nir/nir_opt_global_to_local.c    |  103 ++
>  src/glsl/nir/nir_print.c                  |  916 +++++++++++++++
>  src/glsl/nir/nir_remove_dead_variables.c  |  138 +++
>  src/glsl/nir/nir_types.cpp                |  155 +++
>  src/glsl/nir/nir_types.h                  |   78 ++
>  src/glsl/nir/nir_validate.c               |  798 +++++++++++++
>  22 files changed, 9362 insertions(+), 1 deletion(-)
>  create mode 100644 src/glsl/nir/README
>  create mode 100644 src/glsl/nir/glsl_to_nir.cpp
>  create mode 100644 src/glsl/nir/glsl_to_nir.h
>  create mode 100644 src/glsl/nir/nir.c
>  create mode 100644 src/glsl/nir/nir.h
>  create mode 100644 src/glsl/nir/nir_intrinsics.c
>  create mode 100644 src/glsl/nir/nir_intrinsics.h
>  create mode 100644 src/glsl/nir/nir_lower_atomics.c
>  create mode 100644 src/glsl/nir/nir_lower_samplers.cpp
>  create mode 100644 src/glsl/nir/nir_lower_system_values.c
>  create mode 100644 src/glsl/nir/nir_lower_variables_scalar.c
>  create mode 100644 src/glsl/nir/nir_opcodes.c
>  create mode 100644 src/glsl/nir/nir_opcodes.h
>  create mode 100644 src/glsl/nir/nir_opt_global_to_local.c
>  create mode 100644 src/glsl/nir/nir_print.c
>  create mode 100644 src/glsl/nir/nir_remove_dead_variables.c
>  create mode 100644 src/glsl/nir/nir_types.cpp
>  create mode 100644 src/glsl/nir/nir_types.h
>  create mode 100644 src/glsl/nir/nir_validate.c
>
> --
> 1.9.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev