Re: [Mesa-dev] [PATCH 10/12] i965/fs: Make compact_virtual_grfs an optimization pass
On Fri, Sep 19, 2014 at 5:41 PM, Matt Turner wrote:
> On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> > Previously we disabled compact_virtual_grfs when dumping optimizations.
> > The idea here was to make it easier to diff the dumped shader because you
> > didn't have a sudden renaming. However, sometimes a bug is affected by
> > compact_virtual_grfs and, when this happens, you want to keep dumping
> > instructions with compact_virtual_grfs enabled. By turning it into an
> > optimization pass and dumping it along with the others, we retain the
> > ability to diff because you can just diff against the compact_virtual_grf
> > output.
>
> I'd like to understand the bug you encountered.

I really don't think you'd like that. Those bugs are a real pain. But yes, I've hit this more times than I can count while working on this stuff.

> I'm kind of concerned that we're going to just run the optimization
> loop an extra time for every shader now, since compact_virtual_grfs is
> going to set progress = true after the last actual optimization pass
> made progress. I guess we could remove that problem by calling
> compact_virtual_grfs at the end of the loop, rather than at the
> beginning.

Sure, we can do something to make it not run an extra time. I'm mostly concerned about not just shutting it off.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
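To make the extra-iteration concern in this thread concrete, here is a minimal sketch of a progress-driven optimization loop. Everything in it (ToyShader, toy_dce, toy_compact, toy_run_loop) is a hypothetical stand-in invented for illustration, not the i965 code; it only assumes that the compaction pass reports progress whenever it renumbers something.

```cpp
#include <cassert>

// Hypothetical stand-in for a shader: how much work a DCE-like pass still
// has, and whether the vgrf numbering currently has holes to compact.
struct ToyShader {
    int removable_insts;
    bool has_vgrf_holes;
};

// DCE-like pass: deletes one instruction per call and leaves a hole behind.
static bool toy_dce(ToyShader &s) {
    if (s.removable_insts == 0)
        return false;
    s.removable_insts--;
    s.has_vgrf_holes = true;
    return true;
}

// Compaction-like pass: reports progress only when it actually renumbers.
static bool toy_compact(ToyShader &s) {
    if (!s.has_vgrf_holes)
        return false;
    s.has_vgrf_holes = false;
    return true;
}

// Runs the loop until no pass reports progress; returns the iteration count.
static int toy_run_loop(ToyShader s, bool compact_first) {
    int iterations = 0;
    bool progress;
    do {
        progress = false;
        iterations++;
        if (compact_first)
            progress = toy_compact(s) || progress;
        progress = toy_dce(s) || progress;
        if (!compact_first)
            progress = toy_compact(s) || progress;
    } while (progress);
    return iterations;
}
```

In this toy model, a shader with two removable instructions needs four iterations when compaction runs at the top of the loop and three when it runs at the bottom, because the bottom placement cleans up in the same iteration in which the last real pass made progress.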
Re: [Mesa-dev] [PATCH 08/12] i965/fs: Make null_reg_* const members of fs_visitor instead of globals
On Fri, Sep 19, 2014 at 5:37 PM, Matt Turner wrote:
> On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> > We also set the register width equal to the dispatch width. Right now,
> > this is effectively a no-op since we don't do anything with it. However,
> > it will be important once we add an actual width field to fs_reg.
>
> I don't really see the point to be honest. We just wind up calling the
> constructor >1 time.
>
> I could see maybe see making them static members just to reduce their
> scope.

The point is to get a null register with a width of dispatch_width. We need that later.
Re: [Mesa-dev] [PATCH 07/12] i965/fs: Use the var_from_vgrf helper function instead of doing it manually
On Fri, Sep 19, 2014 at 5:16 PM, Matt Turner wrote: > On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand > wrote: > > Signed-off-by: Jason Ekstrand > > --- > > src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +- > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > > index 697b44a..036875f 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > > @@ -58,7 +58,7 @@ fs_visitor::dead_code_eliminate() > > int var = live_intervals->var_from_reg(&inst->dst); > > result_live = BITSET_TEST(live, var); > > } else { > > - int var = live_intervals->var_from_vgrf[inst->dst.reg]; > > + int var = live_intervals->var_from_reg(&inst->dst); > > for (int i = 0; i < inst->regs_written; i++) { > >result_live = result_live || BITSET_TEST(live, var + > i); > > This is wrong, isn't it? Before we get the base var and iterate 0 > through regs_written. After we're getting the var of the > register+offset and then iterating. > No, in fact this hunk is what prompted me to make the change. If we write to vgrf3+2.0, then the previous version would tacitly assume that the offset is 0 and treat it as if we were writing to vgrf3+0.0. > > > } > > @@ -78,19 +78,19 @@ fs_visitor::dead_code_eliminate() > > > > if (inst->dst.file == GRF) { > > if (!inst->is_partial_write()) { > > - int var = live_intervals->var_from_vgrf[inst->dst.reg]; > > + int var = live_intervals->var_from_reg(&inst->dst); > > for (int i = 0; i < inst->regs_written; i++) { > > - BITSET_CLEAR(live, var + inst->dst.reg_offset + i); > > + BITSET_CLEAR(live, var + i); > > } > > This hunk seems fine. 
> > > } > > } > > > > for (int i = 0; i < inst->sources; i++) { > > if (inst->src[i].file == GRF) { > > - int var = > live_intervals->var_from_vgrf[inst->src[i].reg]; > > + int var = live_intervals->var_from_reg(&inst->src[i]); > > > > for (int j = 0; j < inst->regs_read(this, i); j++) { > > - BITSET_SET(live, var + inst->src[i].reg_offset + j); > > + BITSET_SET(live, var + j); > > I think this is also fine. > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/20] i965/vec4: Preserve CFG in spill_reg().
--- This also means I'll drop 05/20. v2: Just pass block to emit_before(), rather than trying to get rid of emit_before(). src/mesa/drivers/dri/i965/brw_vec4.cpp | 6 +- src/mesa/drivers/dri/i965/brw_vec4.h | 13 +++-- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 11 ++-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 64 +- 4 files changed, 56 insertions(+), 38 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 6072962..c3e5c8a 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -804,10 +804,12 @@ vec4_visitor::move_push_constants_to_pull_constants() } } + calculate_cfg(); + /* Now actually rewrite usage of the things we've moved to pull * constants. */ - foreach_in_list_safe(vec4_instruction, inst, &instructions) { + foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) { for (int i = 0 ; i < 3; i++) { if (inst->src[i].file != UNIFORM || pull_constant_loc[inst->src[i].reg] == -1) @@ -817,7 +819,7 @@ vec4_visitor::move_push_constants_to_pull_constants() dst_reg temp = dst_reg(this, glsl_type::vec4_type); -emit_pull_constant_load(inst, temp, inst->src[i], +emit_pull_constant_load(block, inst, temp, inst->src[i], pull_constant_loc[uniform]); inst->src[i].file = temp.file; diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 186667c..4a264ef 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -405,7 +405,8 @@ public: vec4_instruction *emit(enum opcode opcode, dst_reg dst, src_reg src0, src_reg src1, src_reg src2); - vec4_instruction *emit_before(vec4_instruction *inst, + vec4_instruction *emit_before(bblock_t *block, + vec4_instruction *inst, vec4_instruction *new_inst); vec4_instruction *MOV(const dst_reg &dst, const src_reg &src0); @@ -549,17 +550,17 @@ public: void emit_untyped_surface_read(unsigned surf_index, dst_reg dst, src_reg offset); - src_reg 
get_scratch_offset(vec4_instruction *inst, + src_reg get_scratch_offset(bblock_t *block, vec4_instruction *inst, src_reg *reladdr, int reg_offset); - src_reg get_pull_constant_offset(vec4_instruction *inst, + src_reg get_pull_constant_offset(bblock_t *block, vec4_instruction *inst, src_reg *reladdr, int reg_offset); - void emit_scratch_read(vec4_instruction *inst, + void emit_scratch_read(bblock_t *block, vec4_instruction *inst, dst_reg dst, src_reg orig_src, int base_offset); - void emit_scratch_write(vec4_instruction *inst, + void emit_scratch_write(bblock_t *block, vec4_instruction *inst, int base_offset); - void emit_pull_constant_load(vec4_instruction *inst, + void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst, dst_reg dst, src_reg orig_src, int base_offset); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index ddab342..72c72e6 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -28,6 +28,7 @@ extern "C" { #include "brw_vec4.h" #include "brw_vs.h" +#include "brw_cfg.h" using namespace brw; @@ -326,8 +327,10 @@ vec4_visitor::spill_reg(int spill_reg_nr) assert(virtual_grf_sizes[spill_reg_nr] == 1); unsigned int spill_offset = c->last_scratch++; + calculate_cfg(); + /* Generate spill/unspill instructions for the objects being spilled. 
*/ - foreach_in_list(vec4_instruction, inst, &instructions) { + foreach_block_and_inst(block, vec4_instruction, inst, cfg) { for (unsigned int i = 0; i < 3; i++) { if (inst->src[i].file == GRF && inst->src[i].reg == spill_reg_nr) { src_reg spill_reg = inst->src[i]; @@ -342,16 +345,16 @@ vec4_visitor::spill_reg(int spill_reg_nr) temp.writemask |= (1 << BRW_GET_SWZ(inst->src[i].swizzle, c)); assert(temp.writemask != 0); -emit_scratch_read(inst, temp, spill_reg, spill_offset); +emit_scratch_read(block, inst, temp, spill_reg, spill_offset); } } if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) { - emit_scratch_write(inst, spill_offset); + emit_scratch_write(block, inst, spill_offset); } } - invalidate_live_intervals(); + invalidate_live_intervals(false); }
Re: [Mesa-dev] [PATCH 14/20] i965/vec4: Don't iterate between blocks with inst->next/prev.
On Wed, Sep 17, 2014 at 5:51 AM, Pohjolainen, Topi wrote:
> On Tue, Sep 02, 2014 at 09:34:25PM -0700, Matt Turner wrote:
>> The register coalescing portion of this patch hurts three shaders in
>> Guacamelee by one instruction each, but examining the diff makes me
>> believe that what we were generating was (perhaps harmlessly) incorrect.
>> ---
>> src/mesa/drivers/dri/i965/brw_vec4.cpp | 30 +-
>> 1 file changed, 9 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 6669281..e3869d6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -513,12 +513,9 @@ vec4_visitor::dead_code_eliminate()
>> }
>> }
>>
>> - for (exec_node *node = inst->prev, *prev = node->prev;
>> - prev != NULL && dead_channels != 0;
>> - node = prev, prev = prev->prev) {
>> - vec4_instruction *scan_inst = (vec4_instruction *)node;
>> -
>> - if (scan_inst->is_control_flow())
>
> Last instruction of the block is not considered in the iteration, but first
> instruction is. Hence if I'm reading this right, before DO and ENDIF weren't
> considered but now they are.

That's true, but it doesn't have any effect, since neither do nor endif take arguments.
Re: [Mesa-dev] [PATCH 00/14] i965: Instruction compaction improvements
On Thu, Aug 28, 2014 at 8:10 PM, Matt Turner wrote:
> This series adds instruction compaction support for G45 and Gen5
> and enables compaction of control flow instructions.

Ken reviewed the first four patches I think. Can I get someone to review the rest?
Re: [Mesa-dev] [PATCH 13/12] i965/fs: Refactor fs_inst::is_send_from_grf()
Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 11/12] i965/fs: Print BAD_FILE registers in dump_instruction
Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 10/12] i965/fs: Make compact_virtual_grfs an optimization pass
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> Previously we disabled compact_virtual_grfs when dumping optimizations.
> The idea here was to make it easier to diff the dumped shader because you
> didn't have a sudden renaming. However, sometimes a bug is affected by
> compact_virtual_grfs and, when this happens, you want to keep dumping
> instructions with compact_virtual_grfs enabled. By turning it into an
> optimization pass and dumping it along with the others, we retain the
> ability to diff because you can just diff against the compact_virtual_grf
> output.

I'd like to understand the bug you encountered.

I'm kind of concerned that we're going to just run the optimization loop an extra time for every shader now, since compact_virtual_grfs is going to set progress = true after the last actual optimization pass made progress. I guess we could remove that problem by calling compact_virtual_grfs at the end of the loop, rather than at the beginning.
Re: [Mesa-dev] [PATCH 08/12] i965/fs: Make null_reg_* const members of fs_visitor instead of globals
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> We also set the register width equal to the dispatch width. Right now,
> this is effectively a no-op since we don't do anything with it. However,
> it will be important once we add an actual width field to fs_reg.

I don't really see the point to be honest. We just wind up calling the constructor >1 time.

I could see maybe see making them static members just to reduce their scope.
Re: [Mesa-dev] [PATCH 09/12] i965/fs: Make immediate fs_reg constructors explicit
Yes, I've always been weirded out by this.

Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 06/12] i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions
Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 05/12] i965/fs: Use offset a lot more places
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> We have this wonderful offset() function for advancing registers, but we're
> not using it. Using offset() allows us to do some sanity checking and
> avoid manually touching fs_reg::reg_offset. In a few commits, we will make
> offset do even more nifty things for us.
>
> Signed-off-by: Jason Ekstrand
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 18 +--
> src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 12 +-
> src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 4 +-
> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 137 ++
> 4 files changed, 78 insertions(+), 93 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index af8c087..ea91705 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -310,8 +310,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst,
> inst->mlen = 1 + dispatch_width / 8;
> }
>
> - vec4_result.reg_offset += (const_offset & 3) * scale;
> - instructions.push_tail(MOV(dst, vec4_result));
> + fs_reg result = offset(vec4_result, (const_offset & 3) * scale);
> + instructions.push_tail(MOV(dst, result));

Isn't this going to cause us to copy an fs_reg twice, rather than just setting .reg_offset? I'd like to check the generated code.
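For readers weighing the two styles in that hunk, here is a small sketch of the trade-off, using hypothetical names (ToyFsReg, toy_offset) rather than the real fs_reg and offset(). The particular sanity check is invented for illustration; the thread only says that offset() allows some checking.

```cpp
#include <cassert>

// Hypothetical, stripped-down register: a vgrf number plus an offset in
// whole registers. The real fs_reg has many more fields.
struct ToyFsReg {
    int reg;
    int reg_offset;
};

// Helper style: takes the register by value, checks the request, and
// returns an advanced copy. The caller's register is left untouched.
static ToyFsReg toy_offset(ToyFsReg r, int delta) {
    assert(delta >= 0);  // illustrative check only, not the real one
    r.reg_offset += delta;
    return r;
}

// Manual style: mutates the caller's register in place, with no check.
static void toy_advance_in_place(ToyFsReg &r, int delta) {
    r.reg_offset += delta;
}
```

The helper form costs a copy of a small struct, which is the cost being asked about above; in exchange the original register stays valid for later use and every advance goes through one checked function.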
Re: [Mesa-dev] [PATCH 04/12] i965/fs: fix a comment in compact_virtual_grfs
Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 03/12] i965/fs: Rewrite fs_visitor::split_virtual_grfs
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> The original vgrf splitting code was written assuming that with the
> assumption that vgrfs came in two types: those that can be split into

s/ with the assumption that//

> single registers and those that can't be split at all

Period

> It was very
> conservative and bailed as soon as more than one element of a register was
> read or written. This won't work once we start allowing a regular MOV or
> ADD operation to operate on multiple registers. This rewrite allows for
> the case where a vgrf of size 5 may appropreately be split in to one

appropriately

> register of size 1 and two registers of size 2.

I'm not sure I understand enough yet to review.
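The size-5 example in the commit message can be sketched as follows. This is not the algorithm from the patch, which is not shown in this thread; it is a hypothetical toy (toy_split_sizes) that assumes each access is described by a starting register offset and a register count, and that a vgrf may be cut only at boundaries no access straddles.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Given the size of one vgrf and the (reg_offset, regs_accessed) pairs of
// every instruction touching it, return the sizes of the pieces it can be
// split into. Assumes vgrf_size >= 1.
static std::vector<int>
toy_split_sizes(int vgrf_size,
                const std::vector<std::pair<int, int>> &accesses) {
    // can_split[i] is true when a cut between register i-1 and i is allowed.
    std::vector<bool> can_split(vgrf_size, true);
    for (const auto &a : accesses) {
        // An access covering registers [first, first + second) forbids
        // every cut strictly inside that range.
        for (int i = a.first + 1; i < a.first + a.second && i < vgrf_size; i++)
            can_split[i] = false;
    }

    std::vector<int> sizes;
    int current = 1;
    for (int i = 1; i < vgrf_size; i++) {
        if (can_split[i]) {
            sizes.push_back(current);
            current = 1;
        } else {
            current++;
        }
    }
    sizes.push_back(current);
    return sizes;
}
```

With one single-register access at offset 0 and two-register accesses at offsets 1 and 3, a size-5 vgrf comes out as pieces of size 1, 2 and 2, matching the case described above. An all-or-nothing scheme, as the commit message describes the old code, would have refused to split it at all.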
Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
On Fri, Sep 19, 2014 at 8:14 PM, Emil Velikov wrote: > On 20/09/14 00:56, Ilia Mirkin wrote: >> On Fri, Sep 19, 2014 at 7:45 PM, Emil Velikov >> wrote: >>> On 20/09/14 00:13, Ilia Mirkin wrote: Do we do that anywhere else? Seems really hacky, and windows doesn't support symlinks among other things... I'd just as soon force a non-broken version of automake :) >>> Hmm just noticed that we should put the generated source(s) into the >>> nodist_* >>> >>> Please define "anywhere else". It does seem hacky but it's less hacky >>> than the current approach afaics. Cannot really parse "I'd just as soon >>> force a non-broken version of autmake". Can you elaborate ? >> >> You said that automake 2.0 is broken (in that it's not backwards >> compatible and doesn't support our setup). To resolve it, you're >> introducing a (IMO) horrible hack of adding a symlink. My suggested >> alternative is to just force a lower version of automake... >> > Hmm I feel that you hate autohell a bit too much... or is it the case of I hate pandering to broken tools, auto or otherwise. > "people fear what they don't understand" ? Not that I like/know autoslow > too much but I'm willing to (still) give it a chance. > > Automake 2.0 is not out, yet putting tape over our eyes and pleading > ignorance against it's (future) existence is a bit silly. I realise that > this does not look good but it's the most reasonable choice. I can't imagine that the intended way to resolve the issue is to add symlinks at build time. This sort of solution points to a severe mismatch between what the tool expects and what we want (or at least are currently doing, which may be dirty to begin with, for all I know). > Additionally I'd like to acknowledge OpenBSD people's existence, and > help them stop rolling their own build for every mesa release. > >>> >>> Have a sneaky feeling that we may get away with just creating a single >>> blob in aux/vl, rather than one per target, yet I would prefer to save >>> people (myself?) 
the headaches at things go pair-shape :) >> >> Yes, building the files where they are is the more common thing than >> referencing them from all over... >> > I fear that our current split (the gallium way) mandates it. And even if > it works(tm) now I _really_ want to prevent the headaches as it breaks. symlinks in builds cause headaches. Building files where they live is almost always the right answer. > > -Emil > >>> >>> -Emil >>> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov wrote: > Ensure that the object is build in the target folder, as automake 2.0 > will mandate subdir-objects. Pointed out by automake 1.14. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874 > Signed-off-by: Emil Velikov > --- > src/gallium/targets/omx/.gitignore | 1 + > src/gallium/targets/omx/Makefile.am | 10 -- > 2 files changed, 9 insertions(+), 2 deletions(-) > create mode 100644 src/gallium/targets/omx/.gitignore > > diff --git a/src/gallium/targets/omx/.gitignore > b/src/gallium/targets/omx/.gitignore > new file mode 100644 > index 000..4fd1800 > --- /dev/null > +++ b/src/gallium/targets/omx/.gitignore > @@ -0,0 +1 @@ > +vl_winsys_dri.c > diff --git a/src/gallium/targets/omx/Makefile.am > b/src/gallium/targets/omx/Makefile.am > index 4045548..f41719f 100644 > --- a/src/gallium/targets/omx/Makefile.am > +++ b/src/gallium/targets/omx/Makefile.am > @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR) > omx_LTLIBRARIES = libomx_mesa.la > > nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp > -libomx_mesa_la_SOURCES = \ > - $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c > +libomx_mesa_la_SOURCES = vl_winsys_dri.c > > libomx_mesa_la_LDFLAGS = \ > -shared \ > @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \ > $(OMX_LIBS) \ > $(GALLIUM_COMMON_LIB_DEPS) > > +BUILT_SOURCES = vl_winsys_dri.c > +CLEANFILES = vl_winsys_dri.c > + > +vl_winsys_dri.c: > + $(AM_V_GEN)$(LN_S) > $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c > + > + > if HAVE_GALLIUM_STATIC_TARGETS > > 
STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1 > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev >>> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/12] i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation
Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
On Fri, Sep 19, 2014 at 4:01 PM, Emil Velikov wrote: > Ensure that the object is build in the target folder, as automake 2.0 > will mandate subdir-objects. Pointed out by automake 1.14. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874 > Signed-off-by: Emil Velikov > --- > src/gallium/targets/omx/.gitignore | 1 + > src/gallium/targets/omx/Makefile.am | 10 -- > 2 files changed, 9 insertions(+), 2 deletions(-) > create mode 100644 src/gallium/targets/omx/.gitignore > > diff --git a/src/gallium/targets/omx/.gitignore > b/src/gallium/targets/omx/.gitignore > new file mode 100644 > index 000..4fd1800 > --- /dev/null > +++ b/src/gallium/targets/omx/.gitignore > @@ -0,0 +1 @@ > +vl_winsys_dri.c > diff --git a/src/gallium/targets/omx/Makefile.am > b/src/gallium/targets/omx/Makefile.am > index 4045548..f41719f 100644 > --- a/src/gallium/targets/omx/Makefile.am > +++ b/src/gallium/targets/omx/Makefile.am > @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR) > omx_LTLIBRARIES = libomx_mesa.la > > nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp > -libomx_mesa_la_SOURCES = \ > - $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c > +libomx_mesa_la_SOURCES = vl_winsys_dri.c > > libomx_mesa_la_LDFLAGS = \ > -shared \ > @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \ > $(OMX_LIBS) \ > $(GALLIUM_COMMON_LIB_DEPS) > > +BUILT_SOURCES = vl_winsys_dri.c > +CLEANFILES = vl_winsys_dri.c > + > +vl_winsys_dri.c: > + $(AM_V_GEN)$(LN_S) > $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c This file gets built by omx, xvmc, and vdpau, but is it actually built with different CPPFLAGS or something? That is, can't we actually just build it once in its subdirectory? I don't see any meaningful preprocessor checks in the source file that make me think it can't. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
On 20/09/14 00:56, Ilia Mirkin wrote: > On Fri, Sep 19, 2014 at 7:45 PM, Emil Velikov > wrote: >> On 20/09/14 00:13, Ilia Mirkin wrote: >>> Do we do that anywhere else? Seems really hacky, and windows doesn't >>> support symlinks among other things... I'd just as soon force a >>> non-broken version of automake :) >>> >> Hmm just noticed that we should put the generated source(s) into the >> nodist_* >> >> Please define "anywhere else". It does seem hacky but it's less hacky >> than the current approach afaics. Cannot really parse "I'd just as soon >> force a non-broken version of autmake". Can you elaborate ? > > You said that automake 2.0 is broken (in that it's not backwards > compatible and doesn't support our setup). To resolve it, you're > introducing a (IMO) horrible hack of adding a symlink. My suggested > alternative is to just force a lower version of automake... > Hmm I feel that you hate autohell a bit too much... or is it the case of "people fear what they don't understand" ? Not that I like/know autoslow too much but I'm willing to (still) give it a chance. Automake 2.0 is not out, yet putting tape over our eyes and pleading ignorance against it's (future) existence is a bit silly. I realise that this does not look good but it's the most reasonable choice. Additionally I'd like to acknowledge OpenBSD people's existence, and help them stop rolling their own build for every mesa release. >> >> Have a sneaky feeling that we may get away with just creating a single >> blob in aux/vl, rather than one per target, yet I would prefer to save >> people (myself?) the headaches at things go pair-shape :) > > Yes, building the files where they are is the more common thing than > referencing them from all over... > I fear that our current split (the gallium way) mandates it. And even if it works(tm) now I _really_ want to prevent the headaches as it breaks. 
-Emil >> >> -Emil >> >>> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov >>> wrote: Ensure that the object is build in the target folder, as automake 2.0 will mandate subdir-objects. Pointed out by automake 1.14. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874 Signed-off-by: Emil Velikov --- src/gallium/targets/omx/.gitignore | 1 + src/gallium/targets/omx/Makefile.am | 10 -- 2 files changed, 9 insertions(+), 2 deletions(-) create mode 100644 src/gallium/targets/omx/.gitignore diff --git a/src/gallium/targets/omx/.gitignore b/src/gallium/targets/omx/.gitignore new file mode 100644 index 000..4fd1800 --- /dev/null +++ b/src/gallium/targets/omx/.gitignore @@ -0,0 +1 @@ +vl_winsys_dri.c diff --git a/src/gallium/targets/omx/Makefile.am b/src/gallium/targets/omx/Makefile.am index 4045548..f41719f 100644 --- a/src/gallium/targets/omx/Makefile.am +++ b/src/gallium/targets/omx/Makefile.am @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR) omx_LTLIBRARIES = libomx_mesa.la nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp -libomx_mesa_la_SOURCES = \ - $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c +libomx_mesa_la_SOURCES = vl_winsys_dri.c libomx_mesa_la_LDFLAGS = \ -shared \ @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \ $(OMX_LIBS) \ $(GALLIUM_COMMON_LIB_DEPS) +BUILT_SOURCES = vl_winsys_dri.c +CLEANFILES = vl_winsys_dri.c + +vl_winsys_dri.c: + $(AM_V_GEN)$(LN_S) $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c + + if HAVE_GALLIUM_STATIC_TARGETS STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1 -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/12] i965/fs: Use the var_from_vgrf helper function instead of doing it manually
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote: > Signed-off-by: Jason Ekstrand > --- > src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > index 697b44a..036875f 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp > @@ -58,7 +58,7 @@ fs_visitor::dead_code_eliminate() > int var = live_intervals->var_from_reg(&inst->dst); > result_live = BITSET_TEST(live, var); > } else { > - int var = live_intervals->var_from_vgrf[inst->dst.reg]; > + int var = live_intervals->var_from_reg(&inst->dst); > for (int i = 0; i < inst->regs_written; i++) { >result_live = result_live || BITSET_TEST(live, var + i); This is wrong, isn't it? Before we get the base var and iterate 0 through regs_written. After we're getting the var of the register+offset and then iterating. > } > @@ -78,19 +78,19 @@ fs_visitor::dead_code_eliminate() > > if (inst->dst.file == GRF) { > if (!inst->is_partial_write()) { > - int var = live_intervals->var_from_vgrf[inst->dst.reg]; > + int var = live_intervals->var_from_reg(&inst->dst); > for (int i = 0; i < inst->regs_written; i++) { > - BITSET_CLEAR(live, var + inst->dst.reg_offset + i); > + BITSET_CLEAR(live, var + i); > } This hunk seems fine. > } > } > > for (int i = 0; i < inst->sources; i++) { > if (inst->src[i].file == GRF) { > - int var = live_intervals->var_from_vgrf[inst->src[i].reg]; > + int var = live_intervals->var_from_reg(&inst->src[i]); > > for (int j = 0; j < inst->regs_read(this, i); j++) { > - BITSET_SET(live, var + inst->src[i].reg_offset + j); > + BITSET_SET(live, var + j); I think this is also fine. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
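The disagreement over the first hunk comes down to which variable index the loop starts from. A minimal sketch, assuming (as the removed `var + inst->dst.reg_offset + i` lines in the later hunks suggest) that var_from_reg() is the vgrf's base variable plus the register's reg_offset. ToyDst, ToyLiveVars and make_toy_live_vars are hypothetical names, not the real live-variables class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical destination register: vgrf number plus register offset.
struct ToyDst {
    int reg;
    int reg_offset;
};

// Hypothetical live-variable numbering: one variable per register of each
// vgrf, numbered consecutively.
struct ToyLiveVars {
    std::vector<int> var_from_vgrf;  // first variable index of each vgrf

    // Assumed behaviour of the helper: base variable plus reg_offset.
    int var_from_reg(const ToyDst &r) const {
        return var_from_vgrf[r.reg] + r.reg_offset;
    }
};

// Builds the numbering from a list of vgrf sizes.
static ToyLiveVars make_toy_live_vars(const std::vector<int> &vgrf_sizes) {
    ToyLiveVars lv;
    int next = 0;
    for (int size : vgrf_sizes) {
        lv.var_from_vgrf.push_back(next);
        next += size;
    }
    return lv;
}
```

With vgrfs of sizes 1, 1, 1 and 4, a write to vgrf3+2 starts at variable 3 under the old indexing and at variable 5 under the helper. The reply elsewhere in this thread makes the same point: the old first hunk silently treated such a write as if it targeted vgrf3+0.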
Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
On Fri, Sep 19, 2014 at 7:45 PM, Emil Velikov wrote: > On 20/09/14 00:13, Ilia Mirkin wrote: >> Do we do that anywhere else? Seems really hacky, and windows doesn't >> support symlinks among other things... I'd just as soon force a >> non-broken version of automake :) >> > Hmm just noticed that we should put the generated source(s) into the > nodist_* > > Please define "anywhere else". It does seem hacky but it's less hacky > than the current approach afaics. Cannot really parse "I'd just as soon > force a non-broken version of autmake". Can you elaborate ? You said that automake 2.0 is broken (in that it's not backwards compatible and doesn't support our setup). To resolve it, you're introducing a (IMO) horrible hack of adding a symlink. My suggested alternative is to just force a lower version of automake... > > Have a sneaky feeling that we may get away with just creating a single > blob in aux/vl, rather than one per target, yet I would prefer to save > people (myself?) the headaches at things go pair-shape :) Yes, building the files where they are is the more common thing than referencing them from all over... > > -Emil > >> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov >> wrote: >>> Ensure that the object is build in the target folder, as automake 2.0 >>> will mandate subdir-objects. Pointed out by automake 1.14. 
>>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874 >>> Signed-off-by: Emil Velikov >>> --- >>> src/gallium/targets/omx/.gitignore | 1 + >>> src/gallium/targets/omx/Makefile.am | 10 -- >>> 2 files changed, 9 insertions(+), 2 deletions(-) >>> create mode 100644 src/gallium/targets/omx/.gitignore >>> >>> diff --git a/src/gallium/targets/omx/.gitignore >>> b/src/gallium/targets/omx/.gitignore >>> new file mode 100644 >>> index 000..4fd1800 >>> --- /dev/null >>> +++ b/src/gallium/targets/omx/.gitignore >>> @@ -0,0 +1 @@ >>> +vl_winsys_dri.c >>> diff --git a/src/gallium/targets/omx/Makefile.am >>> b/src/gallium/targets/omx/Makefile.am >>> index 4045548..f41719f 100644 >>> --- a/src/gallium/targets/omx/Makefile.am >>> +++ b/src/gallium/targets/omx/Makefile.am >>> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR) >>> omx_LTLIBRARIES = libomx_mesa.la >>> >>> nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp >>> -libomx_mesa_la_SOURCES = \ >>> - $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c >>> +libomx_mesa_la_SOURCES = vl_winsys_dri.c >>> >>> libomx_mesa_la_LDFLAGS = \ >>> -shared \ >>> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \ >>> $(OMX_LIBS) \ >>> $(GALLIUM_COMMON_LIB_DEPS) >>> >>> +BUILT_SOURCES = vl_winsys_dri.c >>> +CLEANFILES = vl_winsys_dri.c >>> + >>> +vl_winsys_dri.c: >>> + $(AM_V_GEN)$(LN_S) >>> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c >>> + >>> + >>> if HAVE_GALLIUM_STATIC_TARGETS >>> >>> STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1 >>> -- >>> 2.1.0 >>> >>> ___ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
On 20/09/14 00:13, Ilia Mirkin wrote:
> Do we do that anywhere else? Seems really hacky, and windows doesn't
> support symlinks among other things... I'd just as soon force a
> non-broken version of automake :)
>
Hmm just noticed that we should put the generated source(s) into the
nodist_*

Please define "anywhere else". It does seem hacky but it's less hacky
than the current approach afaics. Cannot really parse "I'd just as soon
force a non-broken version of automake". Can you elaborate?

Have a sneaky feeling that we may get away with just creating a single
blob in aux/vl, rather than one per target, yet I would prefer to save
people (myself?) the headaches as things go pear-shaped :)

-Emil

> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov wrote:
>> Ensure that the object is built in the target folder, as automake 2.0
>> will mandate subdir-objects. Pointed out by automake 1.14.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
>> Signed-off-by: Emil Velikov
>> ---
>>  src/gallium/targets/omx/.gitignore  |  1 +
>>  src/gallium/targets/omx/Makefile.am | 10 ++++++++--
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>  create mode 100644 src/gallium/targets/omx/.gitignore
>>
>> diff --git a/src/gallium/targets/omx/.gitignore
>> b/src/gallium/targets/omx/.gitignore
>> new file mode 100644
>> index 0000000..4fd1800
>> --- /dev/null
>> +++ b/src/gallium/targets/omx/.gitignore
>> @@ -0,0 +1 @@
>> +vl_winsys_dri.c
>> diff --git a/src/gallium/targets/omx/Makefile.am
>> b/src/gallium/targets/omx/Makefile.am
>> index 4045548..f41719f 100644
>> --- a/src/gallium/targets/omx/Makefile.am
>> +++ b/src/gallium/targets/omx/Makefile.am
>> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>>  omx_LTLIBRARIES = libomx_mesa.la
>>
>>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
>> -libomx_mesa_la_SOURCES = \
>> -	$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>>
>>  libomx_mesa_la_LDFLAGS = \
>>  	-shared \
>> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
>>  	$(OMX_LIBS) \
>>  	$(GALLIUM_COMMON_LIB_DEPS)
>>
>> +BUILT_SOURCES = vl_winsys_dri.c
>> +CLEANFILES = vl_winsys_dri.c
>> +
>> +vl_winsys_dri.c:
>> +	$(AM_V_GEN)$(LN_S)
>> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>> +
>> +
>>  if HAVE_GALLIUM_STATIC_TARGETS
>>
>>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
>> --
>> 2.1.0
Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
Do we do that anywhere else? Seems really hacky, and windows doesn't
support symlinks among other things... I'd just as soon force a
non-broken version of automake :)

On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov wrote:
> Ensure that the object is built in the target folder, as automake 2.0
> will mandate subdir-objects. Pointed out by automake 1.14.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
> Signed-off-by: Emil Velikov
> ---
>  src/gallium/targets/omx/.gitignore  |  1 +
>  src/gallium/targets/omx/Makefile.am | 10 ++++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
>  create mode 100644 src/gallium/targets/omx/.gitignore
>
> diff --git a/src/gallium/targets/omx/.gitignore
> b/src/gallium/targets/omx/.gitignore
> new file mode 100644
> index 0000000..4fd1800
> --- /dev/null
> +++ b/src/gallium/targets/omx/.gitignore
> @@ -0,0 +1 @@
> +vl_winsys_dri.c
> diff --git a/src/gallium/targets/omx/Makefile.am
> b/src/gallium/targets/omx/Makefile.am
> index 4045548..f41719f 100644
> --- a/src/gallium/targets/omx/Makefile.am
> +++ b/src/gallium/targets/omx/Makefile.am
> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>  omx_LTLIBRARIES = libomx_mesa.la
>
>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
> -libomx_mesa_la_SOURCES = \
> -	$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>
>  libomx_mesa_la_LDFLAGS = \
>  	-shared \
> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
>  	$(OMX_LIBS) \
>  	$(GALLIUM_COMMON_LIB_DEPS)
>
> +BUILT_SOURCES = vl_winsys_dri.c
> +CLEANFILES = vl_winsys_dri.c
> +
> +vl_winsys_dri.c:
> +	$(AM_V_GEN)$(LN_S)
> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
> +
> +
>  if HAVE_GALLIUM_STATIC_TARGETS
>
>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
> --
> 2.1.0
[Mesa-dev] [PATCH] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
 src/mesa/drivers/dri/i965/intel_screen.c     | 7 +------
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index b7c64c6..4e6627e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
    ctx->Extensions.OES_standard_derivatives = true;
    ctx->Extensions.OES_EGL_image_external = true;
 
-   if (brw->gen >= 7)
+   if (brw->gen >= 6)
       ctx->Const.GLSLVersion = 330;
-   else if (brw->gen >= 6)
-      ctx->Const.GLSLVersion = 150;
    else
       ctx->Const.GLSLVersion = 120;
    _mesa_override_glsl_version(&ctx->Const);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 8070e97..41964ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
    switch (screen->devinfo->gen) {
    case 8:
    case 7:
-      psp->max_gl_core_version = 33;
-      psp->max_gl_compat_version = 30;
-      psp->max_gl_es1_version = 11;
-      psp->max_gl_es2_version = 30;
-      break;
    case 6:
-      psp->max_gl_core_version = 32;
+      psp->max_gl_core_version = 33;
       psp->max_gl_compat_version = 30;
       psp->max_gl_es1_version = 11;
       psp->max_gl_es2_version = 30;
-- 
2.1.0
[Mesa-dev] [PATCH] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3
Hi

This is the first time I've used git send-email - hopefully it should be
inline now.

I've run piglit and there don't seem to be any failures related to the
new enablement (I do get some GS fails though).

Cheers

Mike
[Mesa-dev] [PATCH 2/3] targets/vdpau: create symlink to aux/vl/vl_winsys_dri.c at build time
Ensure that the object is built in the target folder, as automake 2.0
will mandate subdir-objects. Pointed out by automake 1.14.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
Signed-off-by: Emil Velikov
---
 src/gallium/targets/vdpau/.gitignore  | 1 +
 src/gallium/targets/vdpau/Makefile.am | 9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/targets/vdpau/.gitignore

diff --git a/src/gallium/targets/vdpau/.gitignore b/src/gallium/targets/vdpau/.gitignore
new file mode 100644
index 0000000..4fd1800
--- /dev/null
+++ b/src/gallium/targets/vdpau/.gitignore
@@ -0,0 +1 @@
+vl_winsys_dri.c
diff --git a/src/gallium/targets/vdpau/Makefile.am b/src/gallium/targets/vdpau/Makefile.am
index 440cf22..1b42a1d 100644
--- a/src/gallium/targets/vdpau/Makefile.am
+++ b/src/gallium/targets/vdpau/Makefile.am
@@ -7,8 +7,7 @@ vdpaudir = $(VDPAU_LIB_INSTALL_DIR)
 vdpau_LTLIBRARIES = libvdpau_gallium.la
 
 nodist_EXTRA_libvdpau_gallium_la_SOURCES = dummy.cpp
-libvdpau_gallium_la_SOURCES = \
-	$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+libvdpau_gallium_la_SOURCES = vl_winsys_dri.c
 
 libvdpau_gallium_la_LDFLAGS = \
 	-shared \
@@ -36,6 +35,12 @@ libvdpau_gallium_la_LIBADD = \
 	$(LIBDRM_LIBS) \
 	$(GALLIUM_COMMON_LIB_DEPS)
 
+BUILT_SOURCES = vl_winsys_dri.c
+CLEANFILES = vl_winsys_dri.c
+
+vl_winsys_dri.c:
+	$(AM_V_GEN)$(LN_S) $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+
 if HAVE_GALLIUM_STATIC_TARGETS
 
-- 
2.1.0
[Mesa-dev] [PATCH 3/3] targets/xvmc: create symlink to aux/vl/vl_winsys_dri.c at build time
Ensure that the object is built in the target folder, as automake 2.0
will mandate subdir-objects. Pointed out by automake 1.14.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
Signed-off-by: Emil Velikov
---
 src/gallium/targets/xvmc/.gitignore  | 1 +
 src/gallium/targets/xvmc/Makefile.am | 9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/targets/xvmc/.gitignore

diff --git a/src/gallium/targets/xvmc/.gitignore b/src/gallium/targets/xvmc/.gitignore
new file mode 100644
index 0000000..4fd1800
--- /dev/null
+++ b/src/gallium/targets/xvmc/.gitignore
@@ -0,0 +1 @@
+vl_winsys_dri.c
diff --git a/src/gallium/targets/xvmc/Makefile.am b/src/gallium/targets/xvmc/Makefile.am
index 884bccf..f6c7e03 100644
--- a/src/gallium/targets/xvmc/Makefile.am
+++ b/src/gallium/targets/xvmc/Makefile.am
@@ -7,8 +7,7 @@ xvmcdir = $(XVMC_LIB_INSTALL_DIR)
 xvmc_LTLIBRARIES = libXvMCgallium.la
 
 nodist_EXTRA_libXvMCgallium_la_SOURCES = dummy.cpp
-libXvMCgallium_la_SOURCES = \
-	$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+libXvMCgallium_la_SOURCES = vl_winsys_dri.c
 
 libXvMCgallium_la_LDFLAGS = \
 	-shared \
@@ -31,6 +30,12 @@ libXvMCgallium_la_LIBADD = \
 	$(LIBDRM_LIBS) \
 	$(GALLIUM_COMMON_LIB_DEPS)
 
+BUILT_SOURCES = vl_winsys_dri.c
+CLEANFILES = vl_winsys_dri.c
+
+vl_winsys_dri.c:
+	$(AM_V_GEN)$(LN_S) $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+
 if HAVE_GALLIUM_STATIC_TARGETS
 
-- 
2.1.0
[Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time
Ensure that the object is built in the target folder, as automake 2.0
will mandate subdir-objects. Pointed out by automake 1.14.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
Signed-off-by: Emil Velikov
---
 src/gallium/targets/omx/.gitignore  |  1 +
 src/gallium/targets/omx/Makefile.am | 10 ++++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/targets/omx/.gitignore

diff --git a/src/gallium/targets/omx/.gitignore b/src/gallium/targets/omx/.gitignore
new file mode 100644
index 0000000..4fd1800
--- /dev/null
+++ b/src/gallium/targets/omx/.gitignore
@@ -0,0 +1 @@
+vl_winsys_dri.c
diff --git a/src/gallium/targets/omx/Makefile.am b/src/gallium/targets/omx/Makefile.am
index 4045548..f41719f 100644
--- a/src/gallium/targets/omx/Makefile.am
+++ b/src/gallium/targets/omx/Makefile.am
@@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
 omx_LTLIBRARIES = libomx_mesa.la
 
 nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
-libomx_mesa_la_SOURCES = \
-	$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+libomx_mesa_la_SOURCES = vl_winsys_dri.c
 
 libomx_mesa_la_LDFLAGS = \
 	-shared \
@@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
 	$(OMX_LIBS) \
 	$(GALLIUM_COMMON_LIB_DEPS)
 
+BUILT_SOURCES = vl_winsys_dri.c
+CLEANFILES = vl_winsys_dri.c
+
+vl_winsys_dri.c:
+	$(AM_V_GEN)$(LN_S) $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+
+
 if HAVE_GALLIUM_STATIC_TARGETS
 
 STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
-- 
2.1.0
Re: [Mesa-dev] [PATCH] i965/gen6: Enable GL 3.3 and GLSL 3.30
Reviewed-by: Jordan Justen Mike, my reply for your patch was going to be: * Can you inline your patch? * Did you run piglit? On Fri, Sep 19, 2014 at 3:39 PM, Chris Forbes wrote: > Tested on my snb-gt2: > > 4 tests skip->pass in spec/EXT_texture_array > 51 tests skip->pass in spec.glsl-3.30 > 4 tests skip->pass in spec/!OpenGL 3.3 > No regressions; no skip->fail changes. > > Signed-off-by: Chris Forbes > --- > > Had the gen6 machine out anyway to try some other things; may as well get > this test run done at the same time :) > > src/mesa/drivers/dri/i965/intel_extensions.c | 4 +--- > src/mesa/drivers/dri/i965/intel_screen.c | 7 +-- > 2 files changed, 2 insertions(+), 9 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c > b/src/mesa/drivers/dri/i965/intel_extensions.c > index b7c64c6..4e6627e 100644 > --- a/src/mesa/drivers/dri/i965/intel_extensions.c > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c > @@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx) > ctx->Extensions.OES_standard_derivatives = true; > ctx->Extensions.OES_EGL_image_external = true; > > - if (brw->gen >= 7) > + if (brw->gen >= 6) >ctx->Const.GLSLVersion = 330; > - else if (brw->gen >= 6) > - ctx->Const.GLSLVersion = 150; > else >ctx->Const.GLSLVersion = 120; > _mesa_override_glsl_version(&ctx->Const); > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > b/src/mesa/drivers/dri/i965/intel_screen.c > index 8070e97..41964ec 100644 > --- a/src/mesa/drivers/dri/i965/intel_screen.c > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > @@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen) > switch (screen->devinfo->gen) { > case 8: > case 7: > - psp->max_gl_core_version = 33; > - psp->max_gl_compat_version = 30; > - psp->max_gl_es1_version = 11; > - psp->max_gl_es2_version = 30; > - break; > case 6: > - psp->max_gl_core_version = 32; > + psp->max_gl_core_version = 33; >psp->max_gl_compat_version = 30; >psp->max_gl_es1_version = 11; 
>psp->max_gl_es2_version = 30; > -- > 1.8.5.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/gen6: Enable GL 3.3 and GLSL 3.30
Tested on my snb-gt2:

4 tests skip->pass in spec/EXT_texture_array
51 tests skip->pass in spec.glsl-3.30
4 tests skip->pass in spec/!OpenGL 3.3
No regressions; no skip->fail changes.

Signed-off-by: Chris Forbes
---

Had the gen6 machine out anyway to try some other things; may as well get
this test run done at the same time :)

 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
 src/mesa/drivers/dri/i965/intel_screen.c     | 7 +------
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index b7c64c6..4e6627e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
    ctx->Extensions.OES_standard_derivatives = true;
    ctx->Extensions.OES_EGL_image_external = true;
 
-   if (brw->gen >= 7)
+   if (brw->gen >= 6)
       ctx->Const.GLSLVersion = 330;
-   else if (brw->gen >= 6)
-      ctx->Const.GLSLVersion = 150;
    else
       ctx->Const.GLSLVersion = 120;
    _mesa_override_glsl_version(&ctx->Const);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 8070e97..41964ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
    switch (screen->devinfo->gen) {
    case 8:
    case 7:
-      psp->max_gl_core_version = 33;
-      psp->max_gl_compat_version = 30;
-      psp->max_gl_es1_version = 11;
-      psp->max_gl_es2_version = 30;
-      break;
    case 6:
-      psp->max_gl_core_version = 32;
+      psp->max_gl_core_version = 33;
       psp->max_gl_compat_version = 30;
       psp->max_gl_es1_version = 11;
       psp->max_gl_es2_version = 30;
-- 
1.8.5.3
[Mesa-dev] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3
Hi

I'm pretty sure this is all that's needed to switch on GLSL 3.30 and
OpenGL 3.3 on Sandybridge

Cheers

Mike

From b16937f37681f8e44c86cdb86bd76fd1bbfab998 Mon Sep 17 00:00:00 2001
From: Mike Lothian
Date: Fri, 19 Sep 2014 22:56:46 +0100
Subject: [PATCH] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3

---
 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
 src/mesa/drivers/dri/i965/intel_screen.c     | 7 +------
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index b7c64c6..4e6627e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
    ctx->Extensions.OES_standard_derivatives = true;
    ctx->Extensions.OES_EGL_image_external = true;
 
-   if (brw->gen >= 7)
+   if (brw->gen >= 6)
       ctx->Const.GLSLVersion = 330;
-   else if (brw->gen >= 6)
-      ctx->Const.GLSLVersion = 150;
    else
       ctx->Const.GLSLVersion = 120;
    _mesa_override_glsl_version(&ctx->Const);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 8070e97..41964ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
    switch (screen->devinfo->gen) {
    case 8:
    case 7:
-      psp->max_gl_core_version = 33;
-      psp->max_gl_compat_version = 30;
-      psp->max_gl_es1_version = 11;
-      psp->max_gl_es2_version = 30;
-      break;
    case 6:
-      psp->max_gl_core_version = 32;
+      psp->max_gl_core_version = 33;
       psp->max_gl_compat_version = 30;
       psp->max_gl_es1_version = 11;
       psp->max_gl_es2_version = 30;
-- 
2.1.0
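For readers skimming the thread, the net effect of the patch can be sketched as a small standalone function. This is only an illustration: struct gl_limits and limits_for_gen() are made-up names, not Mesa API, and generations below 6 are outside the quoted hunk, so their screen limits are left at zero here rather than guessed.

```c
#include <assert.h>

/* Hypothetical container; the field names mirror the psp-> fields in the diff. */
struct gl_limits {
   int glsl_version;
   int max_gl_core_version;
   int max_gl_compat_version;
   int max_gl_es1_version;
   int max_gl_es2_version;
};

/* Post-patch behaviour as read from the two hunks: gen6 now shares the
 * gen7/gen8 values instead of stopping at GLSL 1.50 / GL 3.2 core. */
struct gl_limits
limits_for_gen(int gen)
{
   struct gl_limits l = {0};

   assert(gen > 0);

   l.glsl_version = (gen >= 6) ? 330 : 120;

   switch (gen) {
   case 8:
   case 7:
   case 6:
      l.max_gl_core_version = 33;
      l.max_gl_compat_version = 30;
      l.max_gl_es1_version = 11;
      l.max_gl_es2_version = 30;
      break;
   default:
      /* Not shown in the patch context; deliberately left unset. */
      break;
   }
   return l;
}
```

Folding "case 6" into the existing fall-through is what lets the patch delete more lines than it adds.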
[Mesa-dev] [PATCH 13/12] i965/fs: Refactor fs_inst::is_send_from_grf()
A switch statement is much easier to read/edit than one giant || expression.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a0b7c6a..527467a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -368,15 +368,22 @@ fs_inst::overwrites_reg(const fs_reg &reg) const
 bool
 fs_inst::is_send_from_grf() const
 {
-   return (opcode == FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7 ||
-           opcode == SHADER_OPCODE_SHADER_TIME_ADD ||
-           opcode == FS_OPCODE_INTERPOLATE_AT_CENTROID ||
-           opcode == FS_OPCODE_INTERPOLATE_AT_SAMPLE ||
-           opcode == FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET ||
-           opcode == FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET ||
-           (opcode == FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD &&
-            src[1].file == GRF) ||
-           (is_tex() && src[0].file == GRF));
+   switch (opcode) {
+   case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7:
+   case SHADER_OPCODE_SHADER_TIME_ADD:
+   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
+   case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
+   case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
+   case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
+      return true;
+   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD:
+      return src[1].file == GRF;
+   default:
+      if (is_tex())
+         return src[0].file == GRF;
+
+      return false;
+   }
 }
 
 bool
-- 
2.1.0
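A reduced, standalone sketch of the same refactor may help when checking that the two forms agree. Everything here is a stand-in: the enum values, struct inst and is_tex() are hypothetical simplifications written in plain C, not the real fs_inst class, and only two of the "always true" opcodes are kept.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-ins for the backend IR types. */
enum reg_file { BAD_FILE, GRF, MRF };
enum opcode {
   OP_SHADER_TIME_ADD,
   OP_INTERPOLATE_AT_SAMPLE,
   OP_UNIFORM_PULL_CONSTANT_LOAD,
   OP_TEX,
   OP_ADD
};

struct inst {
   enum opcode opcode;
   enum reg_file src_file[2];
};

/* Stand-in for fs_inst::is_tex(); here a single opcode counts as texturing. */
bool is_tex(const struct inst *i) { return i->opcode == OP_TEX; }

/* Shape of the old code: one long boolean expression. */
bool is_send_from_grf_expr(const struct inst *i)
{
   return (i->opcode == OP_SHADER_TIME_ADD ||
           i->opcode == OP_INTERPOLATE_AT_SAMPLE ||
           (i->opcode == OP_UNIFORM_PULL_CONSTANT_LOAD &&
            i->src_file[1] == GRF) ||
           (is_tex(i) && i->src_file[0] == GRF));
}

/* Shape of the new code: unconditional cases first, conditional ones after. */
bool is_send_from_grf_switch(const struct inst *i)
{
   switch (i->opcode) {
   case OP_SHADER_TIME_ADD:
   case OP_INTERPOLATE_AT_SAMPLE:
      return true;
   case OP_UNIFORM_PULL_CONSTANT_LOAD:
      return i->src_file[1] == GRF;
   default:
      if (is_tex(i))
         return i->src_file[0] == GRF;

      return false;
   }
}
```

The equivalence relies on texturing opcodes never matching one of the explicit case labels, which holds in this sketch and is presumably also true of the real opcode list.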
Re: [Mesa-dev] [PATCH 00/12] i965/fs: A bunch of cleanups in preparation for explicit register widths
Oops. Forgot one:

i965/fs: Refactor fs_inst::is_send_from_grf()

On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand wrote:
> I'm working on a series (which I hope to send out soon) that will allow us
> to have explicit register widths and instruction execution sizes in the fs
> backend IR. If you want to see where I'm going with this, I've got a
> working version here:
>
> http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/kill-mrf-v0.5
>
> I'm planning to get that cleaned up a bit more and hope to send the full
> series out by the end of today or maybe Monday. This series is a bunch of
> cleanup patches that will be needed eventually, but don't really change
> anything important on their own. They should be generally reviewable by
> anyone with a decent understanding of the i965 fs backend.
>
> Jason Ekstrand (12):
>   i965/fs: Manually generate the meta fast-clear shader
>   i965/fs_live_variables: Use var_from_vgrf instead of repeating the
>     calculation
>   i965/fs: Rewrite fs_visitor::split_virtual_grfs
>   i965/fs: fix a comment in compact_virtual_grfs
>   i965/fs: Use offset a lot more places
>   i965/fs: Use the UW type for the destination of
>     VARYING_PULL_CONSTANT_LOAD instructions
>   i965/fs: Use the var_from_vgrf helper function instead of doing it
>     manually
>   i965/fs: Make null_reg_* const members of fs_visitor instead of
>     globals
>   i965/fs: Make immediate fs_reg constructors explicit
>   i965/fs: Make compact_virtual_grfs an optimization pass
>   i965/fs: Print BAD_FILE registers in dump_instruction
>   i965/fs: Clean up emit_fb_writes
>
>  src/mesa/drivers/dri/i965/brw_fs.cpp               | 295 +-
>  src/mesa/drivers/dri/i965/brw_fs.h                 |  21 +-
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp           |  12 +-
>  .../dri/i965/brw_fs_dead_code_eliminate.cpp        |  10 +-
>  src/mesa/drivers/dri/i965/brw_fs_fp.cpp            |   2 +-
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp     |   4 +-
>  .../drivers/dri/i965/brw_fs_live_variables.cpp     |   4 +-
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   4 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp       | 338 +
>  src/mesa/drivers/dri/i965/brw_reg.h                |   6 +
>  10 files changed, 323 insertions(+), 373 deletions(-)
>
> --
> 2.1.0
[Mesa-dev] [PATCH 12/15] radeonsi: don't pass the context to the shader translator
From: Marek Olšák This should prevent accessing context state there. --- src/gallium/drivers/radeonsi/si_compute.c | 2 +- src/gallium/drivers/radeonsi/si_shader.c | 29 + src/gallium/drivers/radeonsi/si_shader.h | 7 +++ src/gallium/drivers/radeonsi/si_state.c | 2 +- 4 files changed, 18 insertions(+), 22 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index 9088268..4b2662d 100644 --- a/src/gallium/drivers/radeonsi/si_compute.c +++ b/src/gallium/drivers/radeonsi/si_compute.c @@ -81,7 +81,7 @@ static void *si_create_compute_state( for (i = 0; i < program->num_kernels; i++) { LLVMModuleRef mod = radeon_llvm_get_kernel_module(program->llvm_ctx, i, code, header->num_bytes); - si_compile_llvm(sctx, &program->kernels[i], mod); + si_compile_llvm(sctx->screen, &program->kernels[i], mod); LLVMDisposeModule(mod); } diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index fbc94d2..7aa65c9 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -2625,16 +2625,16 @@ static void preload_streamout_buffers(struct si_shader_context *si_shader_ctx) } } -int si_compile_llvm(struct si_context *sctx, struct si_shader *shader, - LLVMModuleRef mod) +int si_compile_llvm(struct si_screen *sscreen, struct si_shader *shader, + LLVMModuleRef mod) { unsigned r; /* llvm_compile result */ unsigned i; unsigned char *ptr; struct radeon_shader_binary binary; - bool dump = r600_can_dump_shader(&sctx->screen->b, + bool dump = r600_can_dump_shader(&sscreen->b, shader->selector ? 
shader->selector->tokens : NULL); - const char * gpu_family = r600_get_llvm_processor_name(sctx->screen->b.family); + const char * gpu_family = r600_get_llvm_processor_name(sscreen->b.family); unsigned code_size; /* Use LLVM to compile shader */ @@ -2690,20 +2690,20 @@ int si_compile_llvm(struct si_context *sctx, struct si_shader *shader, /* copy new shader */ code_size = binary.code_size + binary.rodata_size; r600_resource_reference(&shader->bo, NULL); - shader->bo = si_resource_create_custom(sctx->b.b.screen, PIPE_USAGE_IMMUTABLE, + shader->bo = si_resource_create_custom(&sscreen->b.b, PIPE_USAGE_IMMUTABLE, code_size); if (shader->bo == NULL) { return -ENOMEM; } - ptr = sctx->b.ws->buffer_map(shader->bo->cs_buf, sctx->b.rings.gfx.cs, PIPE_TRANSFER_WRITE); + ptr = sscreen->b.ws->buffer_map(shader->bo->cs_buf, NULL, PIPE_TRANSFER_WRITE); util_memcpy_cpu_to_le32(ptr, binary.code, binary.code_size); if (binary.rodata_size > 0) { ptr += binary.code_size; util_memcpy_cpu_to_le32(ptr, binary.rodata, binary.rodata_size); } - sctx->b.ws->buffer_unmap(shader->bo->cs_buf); + sscreen->b.ws->buffer_unmap(shader->bo->cs_buf); free(binary.code); free(binary.config); @@ -2713,7 +2713,7 @@ int si_compile_llvm(struct si_context *sctx, struct si_shader *shader, } /* Generate code for the hardware VS shader stage to go with a geometry shader */ -static int si_generate_gs_copy_shader(struct si_context *sctx, +static int si_generate_gs_copy_shader(struct si_screen *sscreen, struct si_shader_context *si_shader_ctx, bool dump) { @@ -2792,7 +2792,7 @@ static int si_generate_gs_copy_shader(struct si_context *sctx, if (dump) fprintf(stderr, "Copy Vertex Shader for Geometry Shader:\n\n"); - r = si_compile_llvm(sctx, si_shader_ctx->shader, + r = si_compile_llvm(sscreen, si_shader_ctx->shader, bld_base->base.gallivm->module); radeon_llvm_dispose(&si_shader_ctx->radeon_bld); @@ -2801,18 +2801,15 @@ static int si_generate_gs_copy_shader(struct si_context *sctx, return r; } -int 
si_shader_create( - struct pipe_context *ctx, - struct si_shader *shader) +int si_shader_create(struct si_screen *sscreen, struct si_shader *shader) { - struct si_context *sctx = (struct si_context*)ctx; struct si_shader_selector *sel = shader->selector; struct si_shader_context si_shader_ctx; struct tgsi_shader_info shader_info; struct lp_build_tgsi_context * bld_base; LLVMModuleRef mod; int r = 0; - bool dump = r600_can_dump_shader(&sctx->screen->b, sel->tokens); + bool dump = r600_can_dump_shader(&sscreen->b, sel->tokens); /* Dump
[Mesa-dev] [PATCH 08/15] radeonsi: disable gl_SampleMask fragment shader output if MSAA is disabled
From: Marek Olšák

This fixes piglit: arb_sample_shading-builtin-gl-sample-mask 0
---
 src/gallium/drivers/radeonsi/si_state.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 671e57b..7614bba 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -666,6 +666,8 @@ static void *si_create_rs_state(struct pipe_context *ctx,
 static void si_bind_rs_state(struct pipe_context *ctx, void *state)
 {
 	struct si_context *sctx = (struct si_context *)ctx;
+	struct si_state_rasterizer *old_rs =
+		(struct si_state_rasterizer*)sctx->queued.named.rasterizer;
 	struct si_state_rasterizer *rs = (struct si_state_rasterizer *)state;
 
 	if (state == NULL)
@@ -676,6 +678,10 @@ static void si_bind_rs_state(struct pipe_context *ctx, void *state)
 	sctx->pa_sc_line_stipple = rs->pa_sc_line_stipple;
 	sctx->pa_su_sc_mode_cntl = rs->pa_su_sc_mode_cntl;
 
+	if (sctx->framebuffer.nr_samples > 1 &&
+	    (!old_rs || old_rs->multisample_enable != rs->multisample_enable))
+		sctx->db_render_state.dirty = true;
+
 	si_pm4_bind_state(sctx, rasterizer, rs);
 	si_update_fb_rs_state(sctx);
 }
@@ -845,6 +851,8 @@ static void si_set_occlusion_query_state(struct pipe_context *ctx, bool enable)
 static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *state)
 {
 	struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
+	struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
+	unsigned db_shader_control;
 
 	r600_write_context_reg_seq(cs, R_028000_DB_RENDER_CONTROL, 2);
 
@@ -897,10 +905,16 @@ static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *s
 		r600_write_context_reg(cs, R_028010_DB_RENDER_OVERRIDE2, 0);
 	}
 
+	db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
+			    S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
+			    sctx->ps_db_shader_control;
+
+	/* Disable the gl_SampleMask fragment shader output if MSAA is disabled. */
+	if (sctx->framebuffer.nr_samples <= 1 || (rs && !rs->multisample_enable))
+		db_shader_control &= C_02880C_MASK_EXPORT_ENABLE;
+
 	r600_write_context_reg(cs, R_02880C_DB_SHADER_CONTROL,
-			       S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
-			       S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
-			       sctx->ps_db_shader_control);
+			       db_shader_control);
 }
 
 /*
@@ -2012,6 +2026,7 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 
 	if (sctx->framebuffer.nr_samples != old_nr_samples) {
 		sctx->msaa_config.dirty = true;
+		sctx->db_render_state.dirty = true;
 
 		/* Set sample locations as fragment shader constants. */
 		switch (sctx->framebuffer.nr_samples) {
-- 
1.9.1
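The register tweak in this patch boils down to clearing one bit under a simple condition, which can be sketched in isolation. Note the assumptions: MASK_EXPORT_ENABLE_BIT is an arbitrary bit position chosen for illustration (the real field layout lives in the hardware register headers), and effective_db_shader_control() is a hypothetical helper, not driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define MASK_EXPORT_ENABLE_BIT   (1u << 8)
/* Clear mask for that field, playing the role the C_... macro plays in the patch. */
#define MASK_EXPORT_ENABLE_CLEAR (~MASK_EXPORT_ENABLE_BIT)

/* Returns the control value that would be written: the sample-mask export
 * is dropped when the framebuffer is single-sampled, or when a rasterizer
 * state is bound and has multisampling turned off. */
uint32_t
effective_db_shader_control(uint32_t base, unsigned nr_samples,
                            bool have_rs, bool multisample_enable)
{
   uint32_t value = base;

   if (nr_samples <= 1 || (have_rs && !multisample_enable))
      value &= MASK_EXPORT_ENABLE_CLEAR;

   return value;
}
```

Because the result now depends on both the sample count and the rasterizer state, the patch also marks db_render_state dirty whenever either of those inputs changes.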
[Mesa-dev] [PATCH 15/15] radeonsi: properly destroy the GS copy shader and scratch_bo for compute
From: Marek Olšák

Cc: 10.2 10.3
---
 src/gallium/drivers/radeonsi/si_shader.c | 4 ++++
 src/gallium/drivers/radeonsi/si_state.c  | 7 ++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 7aa65c9..94db1dc 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2973,5 +2973,9 @@ out:
 
 void si_shader_destroy(struct pipe_context *ctx, struct si_shader *shader)
 {
+	if (shader->gs_copy_shader)
+		si_shader_destroy(ctx, shader->gs_copy_shader);
+
 	r600_resource_reference(&shader->bo, NULL);
+	r600_resource_reference(&shader->scratch_bo, NULL);
 }
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 2aa9aad..ed90f13 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2403,9 +2403,10 @@ static void si_delete_shader_selector(struct pipe_context *ctx,
 
 	while (p) {
 		c = p->next_variant;
-		if (sel->type == PIPE_SHADER_GEOMETRY)
+		if (sel->type == PIPE_SHADER_GEOMETRY) {
 			si_pm4_delete_state(sctx, gs, p->pm4);
-		else if (sel->type == PIPE_SHADER_FRAGMENT)
+			si_pm4_delete_state(sctx, vs, p->gs_copy_shader->pm4);
+		} else if (sel->type == PIPE_SHADER_FRAGMENT)
 			si_pm4_delete_state(sctx, ps, p->pm4);
 		else if (p->key.vs.as_es)
 			si_pm4_delete_state(sctx, es, p->pm4);
@@ -2418,7 +2419,7 @@ static void si_delete_shader_selector(struct pipe_context *ctx,
 
 	free(sel->tokens);
 	free(sel);
- }
+}
 
 static void si_delete_vs_shader(struct pipe_context *ctx, void *state)
 {
-- 
1.9.1
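The cleanup order in si_shader_destroy() can be illustrated with a toy model. The types and buffer_unref() below are hypothetical stand-ins, not the radeonsi structures or the real reference helper; the sketch only shows the pattern of releasing the dependent copy shader first and then dropping every buffer the shader holds.

```c
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
struct buffer {
   int refcount;
};

/* Hypothetical shader object with an optional dependent copy shader. */
struct shader {
   struct shader *gs_copy_shader;
   struct buffer *bo;
   struct buffer *scratch_bo;
};

/* Drop one reference and clear the pointer; a NULL buffer is a no-op. */
void buffer_unref(struct buffer **b)
{
   if (*b && --(*b)->refcount == 0)
      free(*b);
   *b = NULL;
}

/* Release the copy shader's resources first, then this shader's own buffers.
 * As in the patch, the shader structs themselves are not freed here. */
void shader_destroy(struct shader *s)
{
   if (s->gs_copy_shader)
      shader_destroy(s->gs_copy_shader);

   buffer_unref(&s->bo);
   buffer_unref(&s->scratch_bo);
}
```

Before the patch only the main buffer was released, so in a model like this the copy shader's buffer and the scratch buffer would each keep a stray reference.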
[Mesa-dev] [PATCH 00/15] RadeonSI: Random improvements
Patch 1: Documenting.
  radeonsi: document what si_descriptors.c does

Patches 2-8: Improvements and cleanups for DB registers and MSAA.
  radeonsi: remove unused variable si_pipe_shader::sprite_coord_enable
  radeonsi: move DB registers from draw_vbo into new db_render_state
  radeonsi: remove shader.ps_conservative_z, set db_shader_control instead
  radeonsi: set KILL_ENABLE during shader compilation, remove uses_kill flag
  radeonsi: move DB_SHADER_CONTROL into db_render_state
  radeonsi: only update MSAA-specific framebuffer state if nr_samples is changed
  radeonsi: disable gl_SampleMask fragment shader output if MSAA is disabled

Patches 9-10: Renaming stuff and simplification.
  radeonsi: merge si_pipe_shader into si_shader
  radeonsi: shorten si_pipe_* prefixes to si_*

Patches 11-15: Geometry shader fixes.
  radeonsi: don't snoop currently-bound GS shader when compiling ES
  radeonsi: don't pass the context to the shader translator
  radeonsi: don't use pipe_constant_buffer for GS rings
  radeonsi: release GS rings at context destruction
  radeonsi: properly destroy the GS copy shader and scratch_bo for compute

Please review.

Marek
[Mesa-dev] [PATCH 01/15] radeonsi: document what si_descriptors.c does
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_descriptors.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index 792d2c3..2543052 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -23,6 +23,17 @@
  * Authors:
  *      Marek Olšák
  */
+
+/* Resource binding slots and sampler states (each described with 8 or 4 dwords)
+ * live in memory on SI.
+ *
+ * This file is responsible for managing lists of resources and sampler states
+ * in memory and binding them, which means updating those structures in memory.
+ *
+ * There is also code for updating shader pointers to resources and sampler
+ * states. CP DMA functions are here too.
+ */
+
 #include "radeon/r600_cs.h"
 #include "si_pipe.h"
 #include "si_shader.h"
-- 
1.9.1
[Mesa-dev] [PATCH 07/15] radeonsi: only update MSAA-specific framebuffer state if nr_samples is changed
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_state.c | 50 ++---
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index b83b930..671e57b 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1943,6 +1943,7 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 	struct r600_surface *surf = NULL;
 	struct r600_texture *rtex;
 	bool old_cb0_is_integer = sctx->framebuffer.cb0_is_integer;
+	unsigned old_nr_samples = sctx->framebuffer.nr_samples;
 	int i;
 
 	if (sctx->framebuffer.state.nr_cbufs) {
@@ -2008,31 +2009,34 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 	sctx->framebuffer.atom.num_dw += 3; /* WINDOW_SCISSOR_BR */
 	sctx->framebuffer.atom.num_dw += 18; /* MSAA sample locations */
 	sctx->framebuffer.atom.dirty = true;
-	sctx->msaa_config.dirty = true;
 
-	/* Set sample locations as fragment shader constants. */
-	switch (sctx->framebuffer.nr_samples) {
-	case 1:
-		constbuf.user_buffer = sctx->b.sample_locations_1x;
-		break;
-	case 2:
-		constbuf.user_buffer = sctx->b.sample_locations_2x;
-		break;
-	case 4:
-		constbuf.user_buffer = sctx->b.sample_locations_4x;
-		break;
-	case 8:
-		constbuf.user_buffer = sctx->b.sample_locations_8x;
-		break;
-	case 16:
-		constbuf.user_buffer = sctx->b.sample_locations_16x;
-		break;
-	default:
-		assert(0);
+	if (sctx->framebuffer.nr_samples != old_nr_samples) {
+		sctx->msaa_config.dirty = true;
+
+		/* Set sample locations as fragment shader constants. */
+		switch (sctx->framebuffer.nr_samples) {
+		case 1:
+			constbuf.user_buffer = sctx->b.sample_locations_1x;
+			break;
+		case 2:
+			constbuf.user_buffer = sctx->b.sample_locations_2x;
+			break;
+		case 4:
+			constbuf.user_buffer = sctx->b.sample_locations_4x;
+			break;
+		case 8:
+			constbuf.user_buffer = sctx->b.sample_locations_8x;
+			break;
+		case 16:
+			constbuf.user_buffer = sctx->b.sample_locations_16x;
+			break;
+		default:
+			assert(0);
+		}
+		constbuf.buffer_size = sctx->framebuffer.nr_samples * 2 * 4;
+		ctx->set_constant_buffer(ctx, PIPE_SHADER_FRAGMENT,
+					 SI_DRIVER_STATE_CONST_BUF, &constbuf);
 	}
-	constbuf.buffer_size = sctx->framebuffer.nr_samples * 2 * 4;
-	ctx->set_constant_buffer(ctx, PIPE_SHADER_FRAGMENT,
-				 SI_DRIVER_STATE_CONST_BUF, &constbuf);
 }
 
 static void si_emit_framebuffer_state(struct si_context *sctx, struct r600_atom *atom)
-- 
1.9.1
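The change in this patch follows a common pattern: remember the previous value, apply the new state, and redo the dependent work only when the value actually differs. Below is a minimal standalone sketch of that pattern, not driver code. The `fb_state` struct, the `set_nr_samples` function and the `upload_count` counter are hypothetical stand-ins for the context and the constant-buffer upload; only the `nr_samples * 2 * 4` size computation is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the framebuffer part of si_context. */
struct fb_state {
	unsigned nr_samples;     /* current sample count */
	bool msaa_config_dirty;  /* set when MSAA state must be re-emitted */
	unsigned const_buf_size; /* size of the last sample-location upload */
	unsigned upload_count;   /* number of uploads, for illustration only */
};

/* Apply a new sample count; redo MSAA-specific work only if it changed. */
static void set_nr_samples(struct fb_state *fb, unsigned nr_samples)
{
	unsigned old_nr_samples = fb->nr_samples;

	fb->nr_samples = nr_samples;

	if (fb->nr_samples != old_nr_samples) {
		fb->msaa_config_dirty = true;
		/* Same size formula as the patch: nr_samples * 2 * 4 bytes
		 * (presumably two 4-byte coordinates per sample). */
		fb->const_buf_size = fb->nr_samples * 2 * 4;
		fb->upload_count++;
	}
}
```

Calling `set_nr_samples` twice with the same count leaves the dirty flag and the upload counter untouched on the second call, which is the saving the patch is after.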
[Mesa-dev] [PATCH 04/15] radeonsi: remove shader.ps_conservative_z, set db_shader_control instead
From: Marek Olšák

Also set the field on SI too. It's not just specific to CIK.
---
 src/gallium/drivers/radeonsi/si_shader.c | 7 ---
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 4
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 0a5ed96..19dc9ca 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2818,17 +2818,18 @@ int si_pipe_shader_create(
 
 		si_shader_ctx.radeon_bld.load_input = declare_input_fs;
 		bld_base->emit_epilogue = si_llvm_emit_fs_epilogue;
-		shader->shader.ps_conservative_z = V_02880C_EXPORT_ANY_Z;
 
 		for (i = 0; i < shader_info.num_properties; i++) {
 			switch (shader_info.properties[i].name) {
 			case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
 				switch (shader_info.properties[i].data[0]) {
 				case TGSI_FS_DEPTH_LAYOUT_GREATER:
-					shader->shader.ps_conservative_z = V_02880C_EXPORT_GREATER_THAN_Z;
+					shader->db_shader_control |=
+						S_02880C_CONSERVATIVE_Z_EXPORT(V_02880C_EXPORT_GREATER_THAN_Z);
 					break;
 				case TGSI_FS_DEPTH_LAYOUT_LESS:
-					shader->shader.ps_conservative_z = V_02880C_EXPORT_LESS_THAN_Z;
+					shader->db_shader_control |=
+						S_02880C_CONSERVATIVE_Z_EXPORT(V_02880C_EXPORT_LESS_THAN_Z);
 					break;
 				}
 				break;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index df7dbb0..e07d872 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -140,7 +140,6 @@ struct si_shader {
 	unsigned		gs_input_prim;
 	unsigned		gs_output_prim;
 	unsigned		gs_max_out_vertices;
-	unsigned		ps_conservative_z;
 
 	unsigned		nparam;
 	bool			uses_kill;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index fb1ddc0..37dc40b 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -269,10 +269,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 	if (shader->shader.uses_kill || shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
 		db_shader_control |= S_02880C_KILL_ENABLE(1);
 
-	if (sctx->b.chip_class >= CIK)
-		db_shader_control |=
-			S_02880C_CONSERVATIVE_Z_EXPORT(shader->shader.ps_conservative_z);
-
 	spi_ps_in_control = S_0286D8_NUM_INTERP(shader->shader.nparam) |
 		S_0286D8_BC_OPTIMIZE_DISABLE(1);
 
-- 
1.9.1
[Mesa-dev] [PATCH 05/15] radeonsi: set KILL_ENABLE during shader compilation, remove uses_kill flag
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_shader.c | 6 +-
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 3 ---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 19dc9ca..5893531 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -774,6 +774,8 @@ static void si_alpha_test(struct lp_build_tgsi_context *bld_base,
 				LLVMVoidTypeInContext(gallivm->context),
 				NULL, 0, 0);
 	}
+
+	si_shader_ctx->shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
 }
 
 static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
@@ -2751,7 +2753,9 @@ int si_pipe_shader_create(
 
 	tgsi_scan_shader(sel->tokens, &shader_info);
 
-	shader->shader.uses_kill = shader_info.uses_kill;
+	if (shader_info.uses_kill)
+		shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
+
 	shader->shader.uses_instanceid = shader_info.uses_instanceid;
 	bld_base->info = &shader_info;
 	bld_base->emit_fetch_funcs[TGSI_FILE_CONSTANT] = fetch_constant;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index e07d872..559e4e2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -142,7 +142,6 @@ struct si_shader {
 	unsigned		gs_max_out_vertices;
 
 	unsigned		nparam;
-	bool			uses_kill;
 	bool			uses_instanceid;
 	bool			fs_write_all;
 	bool			vs_out_misc_write;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 37dc40b..28e92fc 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -266,9 +266,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 
 	db_shader_control |= shader->db_shader_control;
 
-	if (shader->shader.uses_kill || shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
-		db_shader_control |= S_02880C_KILL_ENABLE(1);
-
 	spi_ps_in_control = S_0286D8_NUM_INTERP(shader->shader.nparam) |
 		S_0286D8_BC_OPTIMIZE_DISABLE(1);
 
-- 
1.9.1
[Mesa-dev] [PATCH 03/15] radeonsi: move DB registers from draw_vbo into new db_render_state
From: Marek Olšák

It's called db_misc_state in r600g.
---
 src/gallium/drivers/radeonsi/si_blit.c | 6 +++
 src/gallium/drivers/radeonsi/si_hw_context.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.h | 16 +++---
 src/gallium/drivers/radeonsi/si_state.c | 73 +---
 src/gallium/drivers/radeonsi/si_state_draw.c | 52
 5 files changed, 82 insertions(+), 66 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c
index 9f95a8a..4744154 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -146,6 +146,7 @@ static void si_blit_decompress_depth(struct pipe_context *ctx,
 	struct pipe_surface *zsurf, *cbsurf, surf_tmpl;
 
 	sctx->dbcb_copy_sample = sample;
+	sctx->db_render_state.dirty = true;
 
 	surf_tmpl.format = texture->resource.b.b.format;
 	surf_tmpl.u.tex.level = level;
@@ -179,6 +180,7 @@ static void si_blit_decompress_depth(struct pipe_context *ctx,
 
 	sctx->dbcb_depth_copy_enabled = false;
 	sctx->dbcb_stencil_copy_enabled = false;
+	sctx->db_render_state.dirty = true;
 }
 
 static void si_blit_decompress_depth_in_place(struct si_context *sctx,
@@ -190,6 +192,7 @@ static void si_blit_decompress_depth_in_place(struct si_context *sctx,
 	unsigned layer, max_layer, checked_last_layer, level;
 
 	sctx->db_inplace_flush_enabled = true;
+	sctx->db_render_state.dirty = true;
 
 	surf_tmpl.format = texture->resource.b.b.format;
 
@@ -227,6 +230,7 @@ static void si_blit_decompress_depth_in_place(struct si_context *sctx,
 	}
 
 	sctx->db_inplace_flush_enabled = false;
+	sctx->db_render_state.dirty = true;
 }
 
 void si_flush_depth_textures(struct si_context *sctx,
@@ -372,6 +376,7 @@ static void si_clear(struct pipe_context *ctx, unsigned buffers,
 		zstex->depth_clear_value = depth;
 		sctx->framebuffer.atom.dirty = true; /* updates DB_DEPTH_CLEAR */
 		sctx->db_depth_clear = true;
+		sctx->db_render_state.dirty = true;
 	}
 
 	si_blitter_begin(ctx, SI_CLEAR);
@@ -384,6 +389,7 @@ static void si_clear(struct pipe_context *ctx, unsigned buffers,
 		sctx->db_depth_clear = false;
 		sctx->db_depth_disable_expclear = false;
 		zstex->depth_cleared = true;
+		sctx->db_render_state.dirty = true;
 	}
 }
 
diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c b/src/gallium/drivers/radeonsi/si_hw_context.c
index bd8409b..eaefa6a 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -161,6 +161,7 @@ void si_begin_new_cs(struct si_context *ctx)
 
 	ctx->framebuffer.atom.dirty = true;
 	ctx->msaa_config.dirty = true;
+	ctx->db_render_state.dirty = true;
 	ctx->b.streamout.enable_atom.dirty = true;
 	si_all_descriptors_begin_new_cs(ctx);
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 6ec8d5d..df81e1f 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -106,6 +106,7 @@ struct si_context {
 			struct r600_atom *streamout_begin;
 			struct r600_atom *streamout_enable; /* must be after streamout_begin */
 			struct r600_atom *framebuffer;
+			struct r600_atom *db_render_state;
 			struct r600_atom *msaa_config;
 		} s;
 		struct r600_atom *array[0];
@@ -159,13 +160,14 @@ struct si_context {
 	union si_state	queued;
 	union si_state	emitted;
 
-	/* Additional DB state. */
-	bool dbcb_depth_copy_enabled;
-	bool dbcb_stencil_copy_enabled;
-	unsigned dbcb_copy_sample;
-	bool db_inplace_flush_enabled;
-	bool db_depth_clear;
-	bool db_depth_disable_expclear;
+	/* DB render state. */
+	struct r600_atom	db_render_state;
+	bool			dbcb_depth_copy_enabled;
+	bool			dbcb_stencil_copy_enabled;
+	unsigned		dbcb_copy_sample;
+	bool			db_inplace_flush_enabled;
+	bool			db_depth_clear;
+	bool			db_depth_disable_expclear;
 };
 
 /* si_blit.c */
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 1d6ae86..c66eac9 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -833,6 +833,71 @@ static void *si_create_db_flush_dsa(struct si_context *sctx)
 	return sctx->b.b.create_depth_stencil_alpha_state(&sctx->b.b, &dsa);
 }
 
+/* DB RENDER ST
[Mesa-dev] [PATCH 14/15] radeonsi: release GS rings at context destruction
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_pipe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 4f9c876..2cce5cc 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -38,6 +38,8 @@ static void si_destroy_context(struct pipe_context *context)
 
 	si_release_all_descriptors(sctx);
 
+	pipe_resource_reference(&sctx->esgs_ring, NULL);
+	pipe_resource_reference(&sctx->gsvs_ring, NULL);
 	pipe_resource_reference(&sctx->null_const_buf.buffer, NULL);
 	r600_resource_reference(&sctx->border_color_table, NULL);
 
-- 
1.9.1
[Mesa-dev] [PATCH 10/15] radeonsi: shorten si_pipe_* prefixes to si_*
From: Marek Olšák

This was the original naming convention in r600g and it somehow crept into
radeonsi.
---
 src/gallium/drivers/radeonsi/si_compute.c | 15 ++---
 src/gallium/drivers/radeonsi/si_descriptors.c | 14 ++--
 src/gallium/drivers/radeonsi/si_pipe.h | 15 +++--
 src/gallium/drivers/radeonsi/si_shader.c | 6 ++---
 src/gallium/drivers/radeonsi/si_shader.h | 10 -
 src/gallium/drivers/radeonsi/si_state.c | 32 +--
 src/gallium/drivers/radeonsi/si_state.h | 4 ++--
 src/gallium/drivers/radeonsi/si_state_draw.c | 19
 8 files changed, 57 insertions(+), 58 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 049f6c2..9088268 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -38,7 +38,7 @@
 #define NUM_USER_SGPRS 4
 #endif
 
-struct si_pipe_compute {
+struct si_compute {
 	struct si_context *ctx;
 
 	unsigned local_size;
@@ -59,8 +59,7 @@ static void *si_create_compute_state(
 	const struct pipe_compute_state *cso)
 {
 	struct si_context *sctx = (struct si_context *)ctx;
-	struct si_pipe_compute *program =
-					CALLOC_STRUCT(si_pipe_compute);
+	struct si_compute *program = CALLOC_STRUCT(si_compute);
 	const struct pipe_llvm_program_header *header;
 	const unsigned char *code;
 	unsigned i;
@@ -95,7 +94,7 @@ static void *si_create_compute_state(
 static void si_bind_compute_state(struct pipe_context *ctx, void *state)
 {
 	struct si_context *sctx = (struct si_context*)ctx;
-	sctx->cs_shader_state.program = (struct si_pipe_compute*)state;
+	sctx->cs_shader_state.program = (struct si_compute*)state;
 }
 
 static void si_set_global_binding(
@@ -105,7 +104,7 @@ static void si_set_global_binding(
 {
 	unsigned i;
 	struct si_context *sctx = (struct si_context*)ctx;
-	struct si_pipe_compute *program = sctx->cs_shader_state.program;
+	struct si_compute *program = sctx->cs_shader_state.program;
 
 	if (!resources) {
 		for (i = first; i < first + n; i++) {
@@ -169,7 +168,7 @@ static void si_launch_grid(
 		uint32_t pc, const void *input)
 {
 	struct si_context *sctx = (struct si_context*)ctx;
-	struct si_pipe_compute *program = sctx->cs_shader_state.program;
+	struct si_compute *program = sctx->cs_shader_state.program;
 	struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
 	struct r600_resource *input_buffer = program->input_buffer;
 	unsigned kernel_args_size;
@@ -383,7 +382,7 @@ static void si_launch_grid(
 
 
 static void si_delete_compute_state(struct pipe_context *ctx, void* state){
-	struct si_pipe_compute *program = (struct si_pipe_compute *)state;
+	struct si_compute *program = (struct si_compute *)state;
 
 	if (!state) {
 		return;
@@ -392,7 +391,7 @@ static void si_delete_compute_state(struct pipe_context *ctx, void* state){
 	if (program->kernels) {
 		for (int i = 0; i < program->num_kernels; i++){
 			if (program->kernels[i].bo){
-				si_pipe_shader_destroy(ctx, &program->kernels[i]);
+				si_shader_destroy(ctx, &program->kernels[i]);
 			}
 		}
 		FREE(program->kernels);
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index 2543052..a0780cd 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -330,8 +330,8 @@ static void si_sampler_views_begin_new_cs(struct si_context *sctx,
 	/* Add relocations to the CS. */
 	while (mask) {
 		int i = u_bit_scan(&mask);
-		struct si_pipe_sampler_view *rview =
-			(struct si_pipe_sampler_view*)views->views[i];
+		struct si_sampler_view *rview =
+			(struct si_sampler_view*)views->views[i];
 
 		r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx,
 				      rview->resource, RADEON_USAGE_READ,
@@ -354,8 +354,8 @@ static void si_set_sampler_view(struct si_context *sctx, unsigned shader,
 		return;
 
 	if (view) {
-		struct si_pipe_sampler_view *rview =
-			(struct si_pipe_sampler_view*)view;
+		struct si_sampler_view *rview =
+			(struct si_sampler_view*)view;
 
 		r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx,
 				      rview->resource, RADEON_USAGE_READ,
@@ -380,7 +380,7 @@ static void si_set_sampler_views(struct pipe_context *ctx,
 {
 	struct si_context *sctx = (struct si_context *)ctx;
[Mesa-dev] [PATCH 09/15] radeonsi: merge si_pipe_shader into si_shader
From: Marek Olšák

One is part of the other anyway.
---
 src/gallium/drivers/radeonsi/si_compute.c | 6 +--
 src/gallium/drivers/radeonsi/si_shader.c | 56 +++
 src/gallium/drivers/radeonsi/si_shader.h | 68 ++--
 src/gallium/drivers/radeonsi/si_state.c | 8 ++--
 src/gallium/drivers/radeonsi/si_state_draw.c | 44 +-
 5 files changed, 90 insertions(+), 92 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index fc842d4..049f6c2 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -45,7 +45,7 @@ struct si_pipe_compute {
 	unsigned private_size;
 	unsigned input_size;
 	unsigned num_kernels;
-	struct si_pipe_shader *kernels;
+	struct si_shader *kernels;
 	unsigned num_user_sgprs;
 
 	struct r600_resource *input_buffer;
@@ -77,7 +77,7 @@ static void *si_create_compute_state(
 
 	program->num_kernels = radeon_llvm_get_num_kernels(program->llvm_ctx, code,
 							header->num_bytes);
-	program->kernels = CALLOC(sizeof(struct si_pipe_shader),
+	program->kernels = CALLOC(sizeof(struct si_shader),
 							program->num_kernels);
 	for (i = 0; i < program->num_kernels; i++) {
 		LLVMModuleRef mod = radeon_llvm_get_kernel_module(program->llvm_ctx, i,
@@ -181,7 +181,7 @@ static void si_launch_grid(
 	uint64_t shader_va;
 	unsigned arg_user_sgpr_count = NUM_USER_SGPRS;
 	unsigned i;
-	struct si_pipe_shader *shader = &program->kernels[pc];
+	struct si_shader *shader = &program->kernels[pc];
 	unsigned lds_blocks;
 	unsigned num_waves_for_scratch;
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 5893531..9b70a35 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -59,7 +59,7 @@ struct si_shader_context
 	struct radeon_llvm_context radeon_bld;
 	struct tgsi_parse_context parse;
 	struct tgsi_token * tokens;
-	struct si_pipe_shader *shader;
+	struct si_shader *shader;
 	struct si_shader *gs_for_vs;
 	unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */
 	int param_streamout_config;
@@ -220,7 +220,7 @@ static void declare_input_vs(
 
 	if (divisor) {
 		/* Build index from instance ID, start instance and divisor */
-		si_shader_ctx->shader->shader.uses_instanceid = true;
+		si_shader_ctx->shader->uses_instanceid = true;
 		buffer_index = get_instance_index_for_fetch(&si_shader_ctx->radeon_bld, divisor);
 	} else {
 		/* Load the buffer index for vertices. */
@@ -257,7 +257,7 @@ static void declare_input_gs(
 {
 	struct si_shader_context *si_shader_ctx =
 		si_shader_context(&radeon_bld->soa.bld_base);
-	struct si_shader *shader = &si_shader_ctx->shader->shader;
+	struct si_shader *shader = si_shader_ctx->shader;
 
 	si_store_shader_io_attribs(shader, decl);
 
@@ -273,7 +273,7 @@ static LLVMValueRef fetch_input_gs(
 {
 	struct lp_build_context *base = &bld_base->base;
 	struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
-	struct si_shader *shader = &si_shader_ctx->shader->shader;
+	struct si_shader *shader = si_shader_ctx->shader;
 	struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
 	struct gallivm_state *gallivm = base->gallivm;
 	LLVMTypeRef i32 = LLVMInt32TypeInContext(gallivm->context);
@@ -352,7 +352,7 @@ static void declare_input_fs(
 	struct lp_build_context *base = &radeon_bld->soa.bld_base.base;
 	struct si_shader_context *si_shader_ctx =
 		si_shader_context(&radeon_bld->soa.bld_base);
-	struct si_shader *shader = &si_shader_ctx->shader->shader;
+	struct si_shader *shader = si_shader_ctx->shader;
 	struct lp_build_context *uint = &radeon_bld->soa.bld_base.uint_bld;
 	struct gallivm_state *gallivm = base->gallivm;
 	LLVMTypeRef input_type = LLVMFloatTypeInContext(gallivm->context);
@@ -782,7 +782,7 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
 			       LLVMValueRef (*pos)[9], LLVMValueRef *out_elts)
 {
 	struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
-	struct si_pipe_shader *shader = si_shader_ctx->shader;
+	struct si_shader *shader = si_shader_ctx->shader;
 	struct lp_build_context *base = &bld_base->base;
 	struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
 	unsigned reg_index;
@@ -799,7 +799,7 @@ static void si_llvm_emit_clipvertex
[Mesa-dev] [PATCH 13/15] radeonsi: don't use pipe_constant_buffer for GS rings
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_descriptors.c | 10 -
 src/gallium/drivers/radeonsi/si_pipe.h | 4 ++--
 src/gallium/drivers/radeonsi/si_state.h | 2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c | 32 ---
 4 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index a0780cd..fc535d0 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -732,7 +732,7 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint s
 /* RING BUFFERS */
 
 void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
-			struct pipe_constant_buffer *input,
+			struct pipe_resource *buffer,
 			unsigned stride, unsigned num_records,
 			bool add_tid, bool swizzle,
 			unsigned element_size, unsigned index_stride)
@@ -749,10 +749,10 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
 	assert(slot < buffers->num_buffers);
 	pipe_resource_reference(&buffers->buffers[slot], NULL);
 
-	if (input && input->buffer) {
+	if (buffer) {
 		uint64_t va;
 
-		va = r600_resource(input->buffer)->gpu_address;
+		va = r600_resource(buffer)->gpu_address;
 
 		switch (element_size) {
 		default:
@@ -807,9 +807,9 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
 			S_008F0C_INDEX_STRIDE(index_stride) |
 			S_008F0C_ADD_TID_ENABLE(add_tid);
 
-		pipe_resource_reference(&buffers->buffers[slot], input->buffer);
+		pipe_resource_reference(&buffers->buffers[slot], buffer);
 		r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx,
-				      (struct r600_resource*)input->buffer,
+				      (struct r600_resource*)buffer,
 				      buffers->shader_usage, buffers->priority);
 		buffers->desc.enabled_mask |= 1 << slot;
 	} else {
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 4b223b4..5f5404d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -154,8 +154,8 @@ struct si_context {
 	struct si_pm4_state		*gs_rings;
 	struct r600_atom		cache_flush;
 	struct pipe_constant_buffer	null_const_buf; /* used for set_constant_buffer(NULL) on CIK */
-	struct pipe_constant_buffer	esgs_ring;
-	struct pipe_constant_buffer	gsvs_ring;
+	struct pipe_resource		*esgs_ring;
+	struct pipe_resource		*gsvs_ring;
 
 	/* SI state handling */
 	union si_state	queued;
diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h
index a5c6720..d3a745a 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -234,7 +234,7 @@ void si_set_sampler_descriptors(struct si_context *sctx, unsigned shader,
 				unsigned start, unsigned count, void **states);
 void si_update_vertex_buffers(struct si_context *sctx);
 void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
-			struct pipe_constant_buffer *input,
+			struct pipe_resource *buffer,
 			unsigned stride, unsigned num_records,
 			bool add_tid, bool swizzle,
 			unsigned element_size, unsigned index_stride);
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index a9dedf9..61951ee 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -550,42 +550,38 @@ bcolor:
 /* Initialize state related to ESGS / GSVS ring buffers */
 static void si_init_gs_rings(struct si_context *sctx)
 {
-	unsigned size = 128 * 1024;
+	unsigned esgs_ring_size = 128 * 1024;
+	unsigned gsvs_ring_size = 64 * 1024 * 1024;
 
 	assert(!sctx->gs_rings);
 	sctx->gs_rings = si_pm4_alloc_state(sctx);
 
-	sctx->esgs_ring.buffer =
-		pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
-				   PIPE_USAGE_DEFAULT, size);
-	sctx->esgs_ring.buffer_size = size;
+	sctx->esgs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
+					     PIPE_USAGE_DEFAULT, esgs_ring_size);
 
-	size = 64 * 1024 * 1024;
-	sctx->gsvs_ring.buffer =
-		pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
-				   PIPE_USAGE_DEFAULT, size);
-
[Mesa-dev] [PATCH 06/15] radeonsi: move DB_SHADER_CONTROL into db_render_state
From: Marek Olšák

I will need this for fixing sample shading with 1 sample.

The good news is that all shader pm4 states no longer use the current context
state, so we can generate the pm4 states outside of draw_vbo if needed.
---
 src/gallium/drivers/radeonsi/si_pipe.h | 1 +
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state.c | 11 ++-
 src/gallium/drivers/radeonsi/si_state_draw.c | 18 +++---
 4 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index df81e1f..00d03be 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -168,6 +168,7 @@ struct si_context {
 	bool			db_inplace_flush_enabled;
 	bool			db_depth_clear;
 	bool			db_depth_disable_expclear;
+	unsigned		ps_db_shader_control;
 };
 
 /* si_blit.c */
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 559e4e2..9c6b238 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -186,7 +186,6 @@ struct si_pipe_shader {
 	unsigned		spi_shader_z_format;
 	unsigned		db_shader_control;
 	unsigned		cb_shader_mask;
-	bool			cb0_is_integer;
 	union si_shader_key	key;
 };
 
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index c66eac9..b83b930 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -896,6 +896,11 @@ static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *s
 	} else {
 		r600_write_context_reg(cs, R_028010_DB_RENDER_OVERRIDE2, 0);
 	}
+
+	r600_write_context_reg(cs, R_02880C_DB_SHADER_CONTROL,
+			       S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
+			       S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
+			       sctx->ps_db_shader_control);
 }
 
 /*
@@ -1937,6 +1942,7 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 	struct pipe_constant_buffer constbuf = {0};
 	struct r600_surface *surf = NULL;
 	struct r600_texture *rtex;
+	bool old_cb0_is_integer = sctx->framebuffer.cb0_is_integer;
 	int i;
 
 	if (sctx->framebuffer.state.nr_cbufs) {
@@ -1957,6 +1963,9 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 	sctx->framebuffer.cb0_is_integer = state->nr_cbufs && state->cbufs[0] &&
 				  util_format_is_pure_integer(state->cbufs[0]->format);
 
+	if (sctx->framebuffer.cb0_is_integer != old_cb0_is_integer)
+		sctx->db_render_state.dirty = true;
+
 	for (i = 0; i < state->nr_cbufs; i++) {
 		if (!state->cbufs[i])
 			continue;
@@ -2983,7 +2992,7 @@ static void si_need_gfx_cs_space(struct pipe_context *ctx, unsigned num_dw,
 void si_init_state_functions(struct si_context *sctx)
 {
 	si_init_atom(&sctx->framebuffer.atom, &sctx->atoms.s.framebuffer, si_emit_framebuffer_state, 0);
-	si_init_atom(&sctx->db_render_state, &sctx->atoms.s.db_render_state, si_emit_db_render_state, 7);
+	si_init_atom(&sctx->db_render_state, &sctx->atoms.s.db_render_state, si_emit_db_render_state, 10);
 
 	sctx->b.b.create_blend_state = si_create_blend_state;
 	sctx->b.b.bind_blend_state = si_bind_blend_state;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 28e92fc..54f2fd9 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -231,7 +231,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 {
 	struct si_context *sctx = (struct si_context *)ctx;
 	struct si_pm4_state *pm4;
-	unsigned i, spi_ps_in_control, db_shader_control;
+	unsigned i, spi_ps_in_control;
 	unsigned num_sgprs, num_user_sgprs;
 	unsigned spi_baryc_cntl = 0, spi_ps_input_ena;
 	uint64_t va;
@@ -242,9 +242,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 	if (pm4 == NULL)
 		return;
 
-	db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
-			    S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer);
-
 	for (i = 0; i < shader->shader.ninput; i++) {
 		switch (shader->shader.input[i].name) {
 		case TGSI_SEMANTIC_POSITION:
@@ -264,8 +261,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
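Read together with patches 04 and 05 of this series, the register value is now assembled at emit time from three sources: constant bits, a framebuffer-dependent bit, and bits the shader recorded at compile time. The following is a rough standalone sketch of that split, not driver code. The `FAKE_*` macros use made-up bit positions and are not the real `S_02880C_*` encodings, and the struct and function names are hypothetical. In particular, the visible part of the patch does not show where `ps_db_shader_control` is assigned, so `bind_ps` is only a guess at that step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit layout for illustration only; NOT the hardware encoding. */
#define FAKE_Z_ORDER_EARLY_THEN_LATE   (UINT32_C(1) << 0)
#define FAKE_ALPHA_TO_MASK_DISABLE(x)  ((uint32_t)((x) & 1) << 1)
#define FAKE_KILL_ENABLE(x)            ((uint32_t)((x) & 1) << 2)

/* Hypothetical per-shader result of compilation. */
struct shader {
	uint32_t db_shader_control; /* bits decided at compile time */
};

/* Hypothetical slice of the context state. */
struct context {
	bool cb0_is_integer;           /* framebuffer-dependent */
	uint32_t ps_db_shader_control; /* copied from the bound pixel shader */
	bool db_render_state_dirty;
};

/* Compile time: record what the shader itself needs. */
static void compile_shader(struct shader *s, bool uses_kill)
{
	s->db_shader_control = 0;
	if (uses_kill)
		s->db_shader_control |= FAKE_KILL_ENABLE(1);
}

/* Bind time (assumed step): copy the shader bits, mark dirty on change. */
static void bind_ps(struct context *c, const struct shader *s)
{
	if (c->ps_db_shader_control != s->db_shader_control) {
		c->ps_db_shader_control = s->db_shader_control;
		c->db_render_state_dirty = true;
	}
}

/* Emit time: combine constant, framebuffer and shader bits. */
static uint32_t emit_db_shader_control(struct context *c)
{
	c->db_render_state_dirty = false;
	return FAKE_Z_ORDER_EARLY_THEN_LATE |
	       FAKE_ALPHA_TO_MASK_DISABLE(c->cb0_is_integer) |
	       c->ps_db_shader_control;
}
```

Because `compile_shader` never reads the context, the per-shader bits can be computed before any draw call, which is the property the commit message points out.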
[Mesa-dev] [PATCH 11/15] radeonsi: don't snoop currently-bound GS shader when compiling ES
From: Marek Olšák Instead, pass the layout of GS inputs in memory to the ES using the shader key. Only 64 bits are needed to represent the layout in the key. Mixing and matching different VS and GS shaders should now always work. --- src/gallium/drivers/radeonsi/si_shader.c | 107 ++- src/gallium/drivers/radeonsi/si_shader.h | 4 ++ src/gallium/drivers/radeonsi/si_state.c | 6 +- 3 files changed, 101 insertions(+), 16 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 2fc1632..fbc94d2 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -60,7 +60,6 @@ struct si_shader_context struct tgsi_parse_context parse; struct tgsi_token * tokens; struct si_shader *shader; - struct si_shader *gs_for_vs; unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */ int param_streamout_config; int param_streamout_write_index; @@ -105,6 +104,84 @@ static struct si_shader_context * si_shader_context( #define SENDMSG_GS_OP_EMIT (2 << 4) #define SENDMSG_GS_OP_EMIT_CUT (3 << 4) +/** + * Returns a unique index for a semantic name and index. The index must be + * less than 64, so that a 64-bit bitmask of used inputs or outputs can be + * calculated. 
+ */ +static unsigned get_unique_index(unsigned semantic_name, unsigned index) +{ + switch (semantic_name) { + case TGSI_SEMANTIC_POSITION: + return 0; + case TGSI_SEMANTIC_PSIZE: + return 1; + case TGSI_SEMANTIC_CLIPDIST: + assert(index <= 1); + return 2 + index; + case TGSI_SEMANTIC_CLIPVERTEX: + return 4; + case TGSI_SEMANTIC_COLOR: + assert(index <= 1); + return 5 + index; + case TGSI_SEMANTIC_BCOLOR: + assert(index <= 1); + return 7 + index; + case TGSI_SEMANTIC_FOG: + return 9; + case TGSI_SEMANTIC_EDGEFLAG: + return 10; + case TGSI_SEMANTIC_GENERIC: + assert(index <= 63-11); + return 11 + index; + default: + assert(0); + return 63; + } +} + +/** + * Given a semantic name and index of a parameter and a mask of used parameters + * (inputs or outputs), return the index of the parameter in the list of all + * used parameters. + * + * For example, assume this list of parameters: + * POSITION, PSIZE, GENERIC0, GENERIC2 + * which has the mask: + * 110101 + * Then: + * querying POSITION returns 0, + * querying PSIZE returns 1, + * querying GENERIC0 returns 2, + * querying GENERIC2 returns 3. + * + * Which can be used as an offset to a parameter buffer in units of vec4s. + */ +static int get_param_index(unsigned semantic_name, unsigned index, + uint64_t mask) +{ + unsigned unique_index = get_unique_index(semantic_name, index); + int i, param_index = 0; + + /* If not present... 
*/ + if (!((1llu << unique_index) & mask)) + return -1; + + for (i = 0; mask; i++) { + uint64_t bit = 1llu << i; + + if (bit & mask) { + if (i == unique_index) + return param_index; + + mask &= ~bit; + param_index++; + } + } + + assert(0 && "unreachable"); + return -1; +} /** * Build an LLVM bytecode indexed load using LLVMBuildGEP + LLVMBuildLoad @@ -261,8 +338,12 @@ static void declare_input_gs( si_store_shader_io_attribs(shader, decl); - if (decl->Semantic.Name != TGSI_SEMANTIC_PRIMID) - shader->input[input_index].param_offset = shader->nparam++; + if (decl->Semantic.Name != TGSI_SEMANTIC_PRIMID) { + shader->gs_used_inputs |= + 1llu << get_unique_index(decl->Semantic.Name, +decl->Semantic.Index); + shader->nparam++; + } } static LLVMValueRef fetch_input_gs( @@ -282,6 +363,7 @@ static LLVMValueRef fetch_input_gs( LLVMValueRef t_list; LLVMValueRef args[9]; unsigned vtx_offset_param; + struct si_shader_input *input = &shader->input[reg->Register.Index]; if (swizzle != ~0 && shader->input[reg->Register.Index].name == TGSI_SEMANTIC_PRIMID) { @@ -327,7 +409,8 @@ static LLVMValueRef fetch_input_gs( args[0] = t_list; args[1] = vtx_offset; args[2] = lp_build_const_int32(gallivm, - ((shader->input[reg->Register.Index].param_offset * 4) + + (get_param_index(input->name, input->sid, + shader->gs_used_inputs) * 4 +
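The lookup done by get_param_index() in the hunk above amounts to counting the set bits of the mask that lie below the parameter's unique index. What follows is only a minimal standalone sketch of that idea, not the code from the patch: the names param_index_from_mask and example_mask are hypothetical, and the bit positions assume the numbering from get_unique_index() (POSITION = 0, PSIZE = 1, GENERIC n = 11 + n).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the patch's get_param_index(): returns the
// position of bit `unique_index` among the set bits of `mask`, or -1 if
// that bit is not set in the mask.
static int param_index_from_mask(unsigned unique_index, uint64_t mask)
{
    assert(unique_index < 64);
    const uint64_t bit = uint64_t(1) << unique_index;
    if (!(mask & bit))
        return -1;

    // Count the set bits strictly below unique_index.
    uint64_t below = mask & (bit - 1);
    int count = 0;
    while (below) {
        below &= below - 1; // clear the lowest set bit
        count++;
    }
    return count;
}

// Mask for POSITION, PSIZE, GENERIC0 and GENERIC2, assuming the unique
// indices 0, 1, 11 and 13 from get_unique_index().
static uint64_t example_mask()
{
    return (uint64_t(1) << 0) | (uint64_t(1) << 1) |
           (uint64_t(1) << 11) | (uint64_t(1) << 13);
}
```

Multiplying the returned index by 4 would then give a dword offset into the parameter buffer, in the same spirit as the fetch_input_gs() change in the patch.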
[Mesa-dev] [PATCH 02/15] radeonsi: remove unused variable si_pipe_shader::sprite_coord_enable
From: Marek Olšák
---
 src/gallium/drivers/radeonsi/si_shader.h     | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index a68c25a..df7dbb0 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -189,7 +189,6 @@ struct si_pipe_shader {
 	unsigned		db_shader_control;
 	unsigned		cb_shader_mask;
 	bool			cb0_is_integer;
-	unsigned		sprite_coord_enable;
 	union si_shader_key	key;
 };
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 2e9d951..9eeda9d 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -321,7 +321,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 	si_pm4_set_reg(pm4, R_02880C_DB_SHADER_CONTROL, db_shader_control);
 	shader->cb0_is_integer = sctx->framebuffer.cb0_is_integer;
-	shader->sprite_coord_enable = sctx->sprite_coord_enable;
 	sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }
-- 
1.9.1
[Mesa-dev] [PATCH 01/12] i965/fs: Manually generate the meta fast-clear shader
Previously, we were generating the fast-clear shader from GLSL. The problem is that fast clears require that we use a replicated write rather than a regular write instruction. In order to get this we had a complicated and somewhat fragile optimization pass that looked for places where we can use a replicated write and used it. Since replicated writes have a lot of restrictions, we only ever use them for fast-clear operations. This commit replaces the optimization pass with a function that just generates the shader we want. This is a) less code, b) less fragile than the optimization pass, and c) generates a more efficient shader. Signed-off-by: Jason Ekstrand Cc: Kristian Høgsberg --- src/mesa/drivers/dri/i965/brw_fs.cpp | 122 ++- src/mesa/drivers/dri/i965/brw_fs.h | 3 +- 2 files changed, 34 insertions(+), 91 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index fa95c81..3fb1545 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -2319,98 +2319,42 @@ fs_visitor::compute_to_mrf() * instructions to FS_OPCODE_REP_FB_WRITE. */ void -fs_visitor::try_rep_send() +fs_visitor::emit_repclear_shader() { - int i, count; - fs_inst *start = NULL; - bblock_t *mov_block; + int base_mrf = 1; + int color_mrf = base_mrf + 2; - /* From the Ivybridge PRM, Volume 4 Part 1, section 3.9.11.2 -* ("Message Descriptor - Render Target Write"): -* -* "SIMD16_REPDATA message must not be used in SIMD8 pixel-shaders." -*/ - if (dispatch_width != 16) - return; - - /* The constant color write message can't handle anything but the 4 color -* values. We could do MRT, but the loops below would need to understand -* handling the header being enabled or disabled on different messages. It -* also requires that the render target be tiled, which might not be the -* case for some EGLImage paths or if we some day do rendering to PBOs. 
-*/ - if (prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_DEPTH) || - payload.aa_dest_stencil_reg || - payload.dest_depth_reg || - dual_src_output.file != BAD_FILE) - return; - - /* The optimization is implemented as one pass through the instruction -* list. We keep track of the most recent block of MOVs into sequential -* MRFs from single, sequential float registers (ie uniforms). Then when -* we find an FB_WRITE opcode, we see if the payload registers match the -* destination registers in our block of MOVs. -*/ - count = 0; - foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { - if (count == 0) { - start = inst; - mov_block = block; - } - if (inst->opcode == BRW_OPCODE_MOV && - inst->dst.file == MRF && - inst->dst.reg == start->dst.reg + 2 * count && - inst->src[0].file == HW_REG && - inst->src[0].reg_offset == start->src[0].reg_offset + count) { - if (count == 0) { -start = inst; -mov_block = block; - } - count++; - } - - if (inst->opcode == FS_OPCODE_FB_WRITE && - count == 4 && - (inst->base_mrf == start->dst.reg || - (inst->base_mrf + 2 == start->dst.reg && inst->header_present))) { - fs_inst *mov = MOV(start->dst, start->src[0]); + fs_inst *mov = emit(MOV(vec4(brw_message_reg(color_mrf)), + fs_reg(UNIFORM, 0, BRW_REGISTER_TYPE_F))); + mov->force_writemask_all = true; + mov->force_uncompressed = true; - /* Make a MOV that moves the four floats into the replicated write - * payload. Since we're running at the very end of code generation - * we can use hw registers and generate the stride and offsets we - * need for this MOV. We use the first of the eight registers - * allocated for the SIMD16 payload for the four floats. 
- */ - mov->dst.fixed_hw_reg = -brw_vec4_reg(BRW_MESSAGE_REGISTER_FILE, - start->dst.reg, 0); - mov->dst.file = HW_REG; - mov->dst.type = mov->dst.fixed_hw_reg.type; - - mov->src[0].fixed_hw_reg = -brw_vec4_grf(mov->src[0].fixed_hw_reg.nr, 0); - mov->src[0].file = HW_REG; - mov->src[0].type = mov->src[0].fixed_hw_reg.type; - mov->force_writemask_all = true; - mov->dst.type = BRW_REGISTER_TYPE_F; - - /* Replace the four MOVs with the new vec4 MOV. */ - start->insert_before(mov_block, mov); - for (i = 0; i < 4; i++) -((fs_inst *) mov->next)->remove(mov_block); - - /* Finally, adjust the message length and set the opcode to - * REP_FB_WRITE for the send, so that the generator will use the - * replicated data mesage type. Then reset count so we'll start - * looking for a new block in case we're in a MRT shader
[Mesa-dev] [PATCH 11/12] i965/fs: Print BAD_FILE registers in dump_instruction
Sometimes these show up in LOAD_PAYLOAD instructions and it's nice to be able to see them.

Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a41cb4f..a0b7c6a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2896,7 +2896,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file)
    }
    fprintf(file, ":%s, ", brw_reg_type_letters(inst->dst.type));
-   for (int i = 0; i < inst->sources && inst->src[i].file != BAD_FILE; i++) {
+   for (int i = 0; i < inst->sources; i++) {
       if (inst->src[i].negate)
          fprintf(file, "-");
       if (inst->src[i].abs)
-- 
2.1.0
[Mesa-dev] [PATCH 07/12] i965/fs: Use the var_from_vgrf helper function instead of doing it manually
Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 697b44a..036875f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -58,7 +58,7 @@ fs_visitor::dead_code_eliminate()
                int var = live_intervals->var_from_reg(&inst->dst);
                result_live = BITSET_TEST(live, var);
             } else {
-               int var = live_intervals->var_from_vgrf[inst->dst.reg];
+               int var = live_intervals->var_from_reg(&inst->dst);
                for (int i = 0; i < inst->regs_written; i++) {
                   result_live = result_live || BITSET_TEST(live, var + i);
                }
@@ -78,19 +78,19 @@ fs_visitor::dead_code_eliminate()
          if (inst->dst.file == GRF) {
             if (!inst->is_partial_write()) {
-               int var = live_intervals->var_from_vgrf[inst->dst.reg];
+               int var = live_intervals->var_from_reg(&inst->dst);
                for (int i = 0; i < inst->regs_written; i++) {
-                  BITSET_CLEAR(live, var + inst->dst.reg_offset + i);
+                  BITSET_CLEAR(live, var + i);
                }
             }
          }
          for (int i = 0; i < inst->sources; i++) {
             if (inst->src[i].file == GRF) {
-               int var = live_intervals->var_from_vgrf[inst->src[i].reg];
+               int var = live_intervals->var_from_reg(&inst->src[i]);
                for (int j = 0; j < inst->regs_read(this, i); j++) {
-                  BITSET_SET(live, var + inst->src[i].reg_offset + j);
+                  BITSET_SET(live, var + j);
                }
             }
          }
-- 
2.1.0
[Mesa-dev] [PATCH 00/12] i965/fs: A bunch of cleanups in preparation for explicit register widths
I'm working on a series (which I hope to send out soon) that will allow us to have explicit register widths and instruction execution sizes in the fs backend IR. If you want to see where I'm going with this, I've got a working version here:

http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/kill-mrf-v0.5

I'm planning to get that cleaned up a bit more and hope to send the full series out by the end of today or maybe Monday.

This series is a bunch of cleanup patches that will be needed eventually, but don't really change anything important on their own. They should be generally reviewable by anyone with a decent understanding of the i965 fs backend.

Jason Ekstrand (12):
  i965/fs: Manually generate the meta fast-clear shader
  i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation
  i965/fs: Rewrite fs_visitor::split_virtual_grfs
  i965/fs: fix a comment in compact_virtual_grfs
  i965/fs: Use offset a lot more places
  i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions
  i965/fs: Use the var_from_vgrf helper function instead of doing it manually
  i965/fs: Make null_reg_* const members of fs_visitor instead of globals
  i965/fs: Make immediate fs_reg constructors explicit
  i965/fs: Make compact_virtual_grfs an optimization pass
  i965/fs: Print BAD_FILE registers in dump_instruction
  i965/fs: Clean up emit_fb_writes

 src/mesa/drivers/dri/i965/brw_fs.cpp               | 295 +-
 src/mesa/drivers/dri/i965/brw_fs.h                 | 21 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp           | 12 +-
 .../dri/i965/brw_fs_dead_code_eliminate.cpp        | 10 +-
 src/mesa/drivers/dri/i965/brw_fs_fp.cpp            | 2 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp     | 4 +-
 .../drivers/dri/i965/brw_fs_live_variables.cpp     | 4 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 4 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp       | 338 +
 src/mesa/drivers/dri/i965/brw_reg.h                | 6 +
 10 files changed, 323 insertions(+), 373 deletions(-)

-- 
2.1.0
[Mesa-dev] [PATCH 09/12] i965/fs: Make immediate fs_reg constructors explicit
Signed-off-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- src/mesa/drivers/dri/i965/brw_fs.h | 6 +++--- src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 2 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 ++- 4 files changed, 11 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index ea91705..002d40fd 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -279,7 +279,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst, */ fs_reg vec4_offset = fs_reg(this, glsl_type::int_type); instructions.push_tail(ADD(vec4_offset, - varying_offset, const_offset & ~3)); + varying_offset, fs_reg(const_offset & ~3))); int scale = 1; if (brw->gen == 4 && dispatch_width == 8) { diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index cb44037..402433b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -69,9 +69,9 @@ public: void init(); fs_reg(); - fs_reg(float f); - fs_reg(int32_t i); - fs_reg(uint32_t u); + explicit fs_reg(float f); + explicit fs_reg(int32_t i); + explicit fs_reg(uint32_t u); fs_reg(struct brw_reg fixed_hw_reg); fs_reg(enum register_file file, int reg); fs_reg(enum register_file file, int reg, enum brw_reg_type type); diff --git a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp index 526c817..ec05bfe 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp @@ -489,7 +489,7 @@ fs_visitor::emit_fragment_program_code() fs_inst *inst; if (brw->gen >= 7) { -inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index, fs_reg(0u), fpi->TexSrcUnit); +inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index, fs_reg(0u), fs_reg(fpi->TexSrcUnit)); } else if (brw->gen >= 5) { inst = emit_texture_gen5(ir, dst, coordinate, shadow_c, lod, dpdy, 
sample_index, fpi->TexSrcUnit); } else { diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 6a75b05..a8d2804 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -2611,10 +2611,10 @@ fs_visitor::visit_atomic_counter_intrinsic(ir_call *ir) deref_array->array_index->accept(this); fs_reg tmp(this, glsl_type::uint_type); - emit(MUL(tmp, this->result, ATOMIC_COUNTER_SIZE)); - emit(ADD(offset, tmp, location->data.atomic.offset)); + emit(MUL(tmp, this->result, fs_reg(ATOMIC_COUNTER_SIZE))); + emit(ADD(offset, tmp, fs_reg(location->data.atomic.offset))); } else { - offset = location->data.atomic.offset; + offset = fs_reg(location->data.atomic.offset); } /* Emit the appropriate machine instruction */ @@ -2734,7 +2734,8 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index, } /* Emit the instruction. */ - inst = emit(SHADER_OPCODE_UNTYPED_ATOMIC, dst, atomic_op, surf_index); + inst = emit(SHADER_OPCODE_UNTYPED_ATOMIC, dst, + fs_reg(atomic_op), fs_reg(surf_index)); inst->base_mrf = 0; inst->mlen = mlen; inst->header_present = true; @@ -2768,7 +2769,7 @@ fs_visitor::emit_untyped_surface_read(unsigned surf_index, fs_reg dst, mlen += operand_len; /* Emit the instruction. */ - inst = emit(SHADER_OPCODE_UNTYPED_SURFACE_READ, dst, surf_index); + inst = emit(SHADER_OPCODE_UNTYPED_SURFACE_READ, dst, fs_reg(surf_index)); inst->base_mrf = 0; inst->mlen = mlen; inst->header_present = true; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
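To illustrate what marking the immediate constructors explicit changes for callers, here is a small self-contained sketch. It does not use the real fs_reg class: reg_sketch and operand_kind are hypothetical, heavily simplified stand-ins that only model the three immediate constructors touched by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical, heavily simplified stand-in for fs_reg: it only remembers
// which kind of immediate it was built from.
struct reg_sketch {
    enum kind_t { BAD, IMM_F, IMM_D, IMM_UD } kind = BAD;
    union { float f; int32_t d; uint32_t ud; } imm = {};

    reg_sketch() = default;
    explicit reg_sketch(float f)    : kind(IMM_F)  { imm.f = f; }
    explicit reg_sketch(int32_t d)  : kind(IMM_D)  { imm.d = d; }
    explicit reg_sketch(uint32_t u) : kind(IMM_UD) { imm.ud = u; }
};

// Stand-in for an instruction builder that takes a register operand.
static reg_sketch::kind_t operand_kind(const reg_sketch &r)
{
    return r.kind;
}

// With explicit constructors a bare integer no longer converts silently:
//   operand_kind(4);              // would not compile
//   operand_kind(reg_sketch(4));  // the caller states the intent
static_assert(!std::is_convertible<int, reg_sketch>::value,
              "int must not convert implicitly");
static_assert(std::is_constructible<reg_sketch, int32_t>::value,
              "explicit construction still works");
```

This is presumably why the call sites in the diff gain fs_reg(...) wrappers: each place that passes a constant as an operand now has to say so.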
[Mesa-dev] [PATCH 05/12] i965/fs: Use offset a lot more places
We have this wonderful offset() function for advancing registers, but we're not using it. Using offset() allows us to do some sanity checking and avoid manually touching fs_reg::reg_offset. In a few commits, we will make offset do even more nifty things for us. Signed-off-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_fs.cpp | 18 +-- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 12 +- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 4 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 137 ++ 4 files changed, 78 insertions(+), 93 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index af8c087..ea91705 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -310,8 +310,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst, inst->mlen = 1 + dispatch_width / 8; } - vec4_result.reg_offset += (const_offset & 3) * scale; - instructions.push_tail(MOV(dst, vec4_result)); + fs_reg result = offset(vec4_result, (const_offset & 3) * scale); + instructions.push_tail(MOV(dst, result)); return instructions; } @@ -1019,7 +1019,7 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir) } else { emit(ADD(wpos, this->pixel_x, fs_reg(0.5f))); } - wpos.reg_offset++; + wpos = offset(wpos, 1); /* gl_FragCoord.y */ if (!flip && ir->data.pixel_center_integer) { @@ -1035,7 +1035,7 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir) emit(ADD(wpos, pixel_y, fs_reg(offset))); } - wpos.reg_offset++; + wpos = offset(wpos, 1); /* gl_FragCoord.z */ if (brw->gen >= 6) { @@ -1046,7 +1046,7 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir) this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC], interp_reg(VARYING_SLOT_POS, 2)); } - wpos.reg_offset++; + wpos = offset(wpos, 1); /* gl_FragCoord.w: Already set up in emit_interpolation */ emit(BRW_OPCODE_MOV, wpos, this->wpos_w); @@ -1120,7 +1120,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir) /* If there's no incoming 
setup data for this slot, don't * emit interpolation for it. */ - attr.reg_offset += type->vector_elements; + attr = offset(attr, type->vector_elements); location++; continue; } @@ -1135,7 +1135,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir) interp = suboffset(interp, 3); interp.type = reg->type; emit(FS_OPCODE_CINTERP, attr, fs_reg(interp)); - attr.reg_offset++; + attr = offset(attr, 1); } } else { /* Smooth/noperspective interpolation case. */ @@ -1173,7 +1173,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir) if (brw->gen < 6 && interpolation_mode == INTERP_QUALIFIER_SMOOTH) { emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w); } - attr.reg_offset++; + attr = offset(attr, 1); } } @@ -1284,7 +1284,7 @@ fs_visitor::emit_samplepos_setup() } /* Compute gl_SamplePosition.x */ compute_sample_position(pos, int_sample_x); - pos.reg_offset++; + pos = offset(pos, 1); inst = emit(MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1; if (dispatch_width == 16) { inst->force_uncompressed = true; diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index 9db6865..a09b0f4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp @@ -212,10 +212,8 @@ fs_visitor::opt_cse_local(bblock_t *block) fs_inst *copy; if (written > 1) { fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written); - for (int i = 0; i < written; i++) { - sources[i] = tmp; - sources[i].reg_offset = i; - } + for (int i = 0; i < written; i++) + sources[i] = offset(tmp, i); copy = LOAD_PAYLOAD(orig_dst, sources, written); } else { copy = MOV(orig_dst, tmp); @@ -235,10 +233,8 @@ fs_visitor::opt_cse_local(bblock_t *block) fs_inst *copy; if (written > 1) { fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written); - for (int i = 0; i < written; i++) { - sources[i] = tmp; - sources[i].reg_offset = i; - } + for (int i = 0; i < written; i++) + sources[i] = offset(tmp, i); copy = LOAD_PAYLOAD(dst, sources, written); 
} else { copy = MOV(dst, tmp); diff --git a/s
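The pattern this patch applies throughout (replace "reg.reg_offset++" with "reg = offset(reg, 1)") can be sketched in isolation. The code below is not the real helper: vreg is a hypothetical, simplified register type, and the bounds check is only an assumption about the kind of sanity checking the commit message alludes to.

```cpp
#include <cassert>

// Hypothetical, simplified register: a virtual GRF number, an offset
// counted in whole registers, and the size of the allocation.
struct vreg {
    int reg;
    int reg_offset;
    int size;
};

// Returns a copy of r advanced by delta registers. The argument is taken
// by value, so the caller's register is left untouched.
static vreg offset(vreg r, int delta)
{
    r.reg_offset += delta;
    // Assumed sanity check: a single helper is a natural place for one.
    assert(r.reg_offset >= 0 && r.reg_offset < r.size);
    return r;
}
```

Because the helper returns a new value instead of mutating its argument, code such as the opt_cse_local() loop can build sources[i] = offset(tmp, i) without copying and patching a temporary by hand.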
[Mesa-dev] [PATCH 03/12] i965/fs: Rewrite fs_visitor::split_virtual_grfs
The original vgrf splitting code was written with the assumption that vgrfs came in two types: those that can be split into single registers and those that can't be split at all. It was very conservative and bailed as soon as more than one element of a register was read or written. This won't work once we start allowing a regular MOV or ADD operation to operate on multiple registers. This rewrite allows for the case where a vgrf of size 5 may appropriately be split into one register of size 1 and two registers of size 2.

Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 132 ++-
 1 file changed, 85 insertions(+), 47 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3fb1545..10a3a20 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1627,15 +1627,39 @@ void
 fs_visitor::split_virtual_grfs()
 {
    int num_vars = this->virtual_grf_count;
-   bool split_grf[num_vars];
-   int new_virtual_grf[num_vars];
-   /* Try to split anything > 0 sized. */
+   /* Count the total number of registers */
+   int reg_count = 0;
+   int vgrf_to_reg[num_vars];
    for (int i = 0; i < num_vars; i++) {
-      if (this->virtual_grf_sizes[i] != 1)
-         split_grf[i] = true;
-      else
-         split_grf[i] = false;
+      vgrf_to_reg[i] = reg_count;
+      reg_count += virtual_grf_sizes[i];
+   }
+
+   /* An array of "split points". For each register slot, this indicates
+    * if this slot can be separated from the previous slot. Every time an
+    * instruction uses multiple elements of a register (as a source or
+    * destination), we mark the used slots as inseparable. Then we go
+    * through and split the registers into the smallest pieces we can.
+*/ + bool split_points[reg_count]; + memset(split_points, 0, sizeof(split_points)); + + /* Mark all used registers as fully splittable */ + foreach_in_list(fs_inst, inst, &instructions) { + if (inst->dst.file == GRF) { + int reg = vgrf_to_reg[inst->dst.reg]; + for (int j = 1; j < this->virtual_grf_sizes[inst->dst.reg]; j++) +split_points[reg + j] = true; + } + + for (int i = 0; i < inst->sources; i++) { + if (inst->src[i].file == GRF) { +int reg = vgrf_to_reg[inst->src[i].reg]; +for (int j = 1; j < this->virtual_grf_sizes[inst->src[i].reg]; j++) + split_points[reg + j] = true; + } + } } if (brw->has_pln && @@ -1645,61 +1669,75 @@ fs_visitor::split_virtual_grfs() * Gen6, that was the only supported interpolation mode, and since Gen6, * delta_x and delta_y are in fixed hardware registers. */ - split_grf[this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC].reg] = - false; + int vgrf = this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC].reg; + split_points[vgrf_to_reg[vgrf] + 1] = false; } foreach_in_list(fs_inst, inst, &instructions) { - /* If there's a SEND message that requires contiguous destination - * registers, no splitting is allowed. - */ - if (inst->regs_written > 1) { -split_grf[inst->dst.reg] = false; + if (inst->dst.file == GRF) { + int reg = vgrf_to_reg[inst->dst.reg] + inst->dst.reg_offset; + for (int j = 1; j < inst->regs_written; j++) +split_points[reg + j] = false; } - - /* If we're sending from a GRF, don't split it, on the assumption that - * the send is reading the whole thing. - */ - if (inst->is_send_from_grf()) { - for (int i = 0; i < inst->sources; i++) { -if (inst->src[i].file == GRF) { - split_grf[inst->src[i].reg] = false; -} + for (int i = 0; i < inst->sources; i++) { + if (inst->src[i].file == GRF) { +int reg = vgrf_to_reg[inst->src[i].reg] + inst->src[i].reg_offset; +for (int j = 1; j < inst->regs_read(this, i); j++) + split_points[reg + j] = false; } } } - /* Allocate new space for split regs. 
Note that the virtual -* numbers will be contiguous. -*/ + int new_virtual_grf[reg_count]; + int new_reg_offset[reg_count]; + + int reg = 0; for (int i = 0; i < num_vars; i++) { - if (split_grf[i]) { -new_virtual_grf[i] = virtual_grf_alloc(1); -for (int j = 2; j < this->virtual_grf_sizes[i]; j++) { - int reg = virtual_grf_alloc(1); - assert(reg == new_virtual_grf[i] + j - 1); - (void) reg; -} -this->virtual_grf_sizes[i] = 1; + /* The first one should always be 0 as a quick sanity check. */ + assert(split_points[reg] == false); + + /* j = 0 case */ + new_reg_offset[reg] = 0; + reg++; + int offset = 1; + + /* j > 0 case */ + for (int j = 1; j < virtual_grf_sizes[i]; j++
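The split-point idea described in the commit message can be tried out on a single register. The following is only a sketch under simplifying assumptions, not the code from the patch: access and split_sizes are hypothetical names, and each use is reduced to a (first slot, slot count) pair.

```cpp
#include <cassert>
#include <vector>

// One access to a register: `count` consecutive slots starting at `first`.
struct access {
    int first;
    int count;
};

// Hypothetical sketch for a single vgrf of `size` slots: start with every
// slot boundary splittable, then forbid any split that would fall inside
// an access. Returns the sizes of the resulting pieces, in order.
static std::vector<int> split_sizes(int size, const std::vector<access> &uses)
{
    assert(size >= 1);

    // can_split[j] is true if slot j may start a new piece.
    std::vector<bool> can_split(size, true);
    can_split[0] = false; // slot 0 always belongs to the first piece

    for (const access &a : uses) {
        assert(a.first >= 0 && a.first + a.count <= size);
        for (int j = 1; j < a.count; j++)
            can_split[a.first + j] = false;
    }

    std::vector<int> pieces;
    int current = 1;
    for (int j = 1; j < size; j++) {
        if (can_split[j]) {
            pieces.push_back(current);
            current = 1;
        } else {
            current++;
        }
    }
    pieces.push_back(current);
    return pieces;
}
```

With a size-5 register that is accessed as slot 0 alone, slots 1-2 together and slots 3-4 together, this yields pieces of size 1, 2 and 2, which is the case the commit message mentions.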
[Mesa-dev] [PATCH 02/12] i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation
Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index e7ecb0f..39fc61a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -56,7 +56,7 @@ void
 fs_live_variables::setup_one_read(bblock_t *block, fs_inst *inst,
                                   int ip, fs_reg reg)
 {
-   int var = var_from_vgrf[reg.reg] + reg.reg_offset;
+   int var = var_from_reg(&reg);
    assert(var < num_vars);
    /* In most cases, a register can be written over safely by the
@@ -108,7 +108,7 @@ void
 fs_live_variables::setup_one_write(bblock_t *block, fs_inst *inst,
                                    int ip, fs_reg reg)
 {
-   int var = var_from_vgrf[reg.reg] + reg.reg_offset;
+   int var = var_from_reg(&reg);
    assert(var < num_vars);
    start[var] = MIN2(start[var], ip);
-- 
2.1.0
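Judging from the lines it replaces, var_from_reg() computes var_from_vgrf[reg.reg] + reg.reg_offset. A minimal sketch of that mapping follows; reg_ref and live_vars_sketch are hypothetical types, and building var_from_vgrf as a running sum of register sizes is an assumption made for the example rather than something shown in the diff.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified register reference.
struct reg_ref {
    int reg;        // virtual GRF number
    int reg_offset; // offset within that GRF, in whole registers
};

struct live_vars_sketch {
    // First variable index of each vgrf (assumed: prefix sum of sizes).
    std::vector<int> var_from_vgrf;

    explicit live_vars_sketch(const std::vector<int> &sizes)
    {
        int next = 0;
        for (int s : sizes) {
            var_from_vgrf.push_back(next);
            next += s;
        }
    }

    // The calculation the patch stops repeating at each call site.
    int var_from_reg(const reg_ref *r) const
    {
        return var_from_vgrf[r->reg] + r->reg_offset;
    }
};
```

Keeping the arithmetic in one helper means a later change to how registers map to liveness variables only has to be made once.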
[Mesa-dev] [PATCH 12/12] i965/fs: Clean up emit_fb_writes
This splits emit_fb_writes into two functions: emit_fb_writes and emit_single_fb_write. This reduces the amount of duplicated code in emit_fb_writes and makes the register number fiddling less arcane. Signed-off-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_fs.h | 4 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 193 +++ 2 files changed, 83 insertions(+), 114 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 56f40b4..50b5fc1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -426,8 +426,10 @@ public: const struct prog_instruction *fpi, fs_reg dst, fs_reg src0, fs_reg src1, fs_reg one); - void emit_color_write(int target, int index, int first_color_mrf); + void emit_color_write(fs_reg color, int index, int first_color_mrf); void emit_alpha_test(); + fs_inst *emit_single_fb_write(fs_reg color1, fs_reg color2, + fs_reg src0_alpha, unsigned components); void emit_fb_writes(); void emit_shader_time_begin(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index a8d2804..f9bc82a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -2920,11 +2920,10 @@ fs_visitor::emit_interpolation_setup_gen6() } void -fs_visitor::emit_color_write(int target, int index, int first_color_mrf) +fs_visitor::emit_color_write(fs_reg color, int index, int first_color_mrf) { int reg_width = dispatch_width / 8; fs_inst *inst; - fs_reg color = outputs[target]; fs_reg mrf; /* If there's no color data to be written, skip it. 
*/ @@ -3042,8 +3041,9 @@ fs_visitor::emit_alpha_test() cmp->flag_subreg = 1; } -void -fs_visitor::emit_fb_writes() +fs_inst * +fs_visitor::emit_single_fb_write(fs_reg color0, fs_reg color1, + fs_reg src0_alpha, unsigned components) { this->current_annotation = "FB write header"; bool header_present = true; @@ -3053,13 +3053,6 @@ fs_visitor::emit_fb_writes() int base_mrf = 1; int nr = base_mrf; int reg_width = dispatch_width / 8; - bool src0_alpha_to_render_target = false; - - if (do_dual_src) { - no16("GL_ARB_blend_func_extended not yet supported in SIMD16."); - if (dispatch_width == 16) - do_dual_src = false; - } /* From the Sandy Bridge PRM, volume 4, page 198: * @@ -3069,19 +3062,15 @@ fs_visitor::emit_fb_writes() * thread message and on all dual-source messages." */ if (brw->gen >= 6 && - (brw->is_haswell || brw->gen >= 8 || !this->prog_data->uses_kill) && - !do_dual_src && + (brw->is_haswell || brw->gen >= 8 || !prog_data->uses_kill) && + color1.file == BAD_FILE && key->nr_color_regions == 1) { header_present = false; } - if (header_present) { - src0_alpha_to_render_target = brw->gen >= 6 && - !do_dual_src && -key->replicate_alpha; + if (header_present) /* m2, m3 header */ nr += 2; - } if (payload.aa_dest_stencil_reg) { push_force_uncompressed(); @@ -3100,13 +3089,34 @@ fs_visitor::emit_fb_writes() nr += 1; } - /* Reserve space for color. It'll be filled in per MRT below. */ - int color_mrf = nr; - nr += 4 * reg_width; - if (do_dual_src) - nr += 4; - if (src0_alpha_to_render_target) - nr += reg_width; + if (color0.file == BAD_FILE) { + /* Even if there's no color buffers enabled, we still need to send + * alpha out the pipeline to our null renderbuffer to support + * alpha-testing, alpha-to-coverage, and so on. 
+ */ + emit_color_write(this->outputs[0], 3, nr); + nr += 4 * reg_width; + } else if (color1.file == BAD_FILE) { + if (src0_alpha.file != BAD_FILE) { + fs_inst *inst; + inst = emit(MOV(fs_reg(MRF, nr, src0_alpha.type), src0_alpha)); + inst->saturate = key->clamp_fragment_color; + nr += reg_width; + } + + for (unsigned i = 0; i < components; i++) + emit_color_write(color0, i, nr); + + nr += 4 * reg_width; + } else { + for (unsigned i = 0; i < components; i++) + emit_color_write(color0, i, nr); + nr += 4 * reg_width; + + for (unsigned i = 0; i < components; i++) + emit_color_write(color1, i, nr); + nr += 4 * reg_width; + } if (source_depth_to_render_target) { if (brw->gen == 6) { @@ -3136,111 +3146,68 @@ fs_visitor::emit_fb_writes() nr += reg_width; } - if (do_dual_src) { - fs_reg src0 = this->outputs[0]; - fs_reg src1 = this->dual_src_output; - - this->current_annotation = ralloc_asprintf(this->mem_ctx, -"FB write
[Mesa-dev] [PATCH 08/12] i965/fs: Make null_reg_* const members of fs_visitor instead of globals
We also set the register width equal to the dispatch width. Right now, this is effectively a no-op since we don't do anything with it. However, it will be important once we add an actual width field to fs_reg.

Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.h           | 6 +++---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++
 src/mesa/drivers/dri/i965/brw_reg.h          | 6 ++++++
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index d5a96c8..cb44037 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -134,9 +134,6 @@ half(const fs_reg &reg, unsigned idx)
 }
 static const fs_reg reg_undef;
-static const fs_reg reg_null_f(retype(brw_null_reg(), BRW_REGISTER_TYPE_F));
-static const fs_reg reg_null_d(retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
-static const fs_reg reg_null_ud(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD));
 class ip_record : public exec_node {
 public:
@@ -206,6 +203,9 @@ public:
 class fs_visitor : public backend_visitor
 {
 public:
+   const fs_reg reg_null_f;
+   const fs_reg reg_null_d;
+   const fs_reg reg_null_ud;
    fs_visitor(struct brw_context *brw,
               void *mem_ctx,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 92a50a5..6a75b05 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -3277,6 +3277,9 @@ fs_visitor::fs_visitor(struct brw_context *brw,
                        unsigned dispatch_width)
    : backend_visitor(brw, shader_prog, &fp->Base, &prog_data->base,
                      MESA_SHADER_FRAGMENT),
+     reg_null_f(retype(brw_null_vec(dispatch_width), BRW_REGISTER_TYPE_F)),
+     reg_null_d(retype(brw_null_vec(dispatch_width), BRW_REGISTER_TYPE_D)),
+     reg_null_ud(retype(brw_null_vec(dispatch_width), BRW_REGISTER_TYPE_UD)),
      key(key), prog_data(prog_data),
      dispatch_width(dispatch_width)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_reg.h b/src/mesa/drivers/dri/i965/brw_reg.h
index 28d3d94..9638c77 100644
--- a/src/mesa/drivers/dri/i965/brw_reg.h
+++ b/src/mesa/drivers/dri/i965/brw_reg.h
@@ -603,6 +603,12 @@ brw_null_reg(void)
 }
 static inline struct brw_reg
+brw_null_vec(unsigned width)
+{
+   return brw_vecn_reg(width, BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_NULL, 0);
+}
+
+static inline struct brw_reg
 brw_address_reg(unsigned subnr)
 {
    return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ADDRESS, subnr);
-- 
2.1.0
[Mesa-dev] [PATCH 06/12] i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions
Using a floating-point type doesn't usually cause hangs on my HSW, but the simulator complains about it quite a bit. Signed-off-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 1bc10f5..bd0ee3a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1082,7 +1082,7 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst, uint32_t surf_index = index.dw1.ud; brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, send, dst); + brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW)); brw_set_src0(p, send, offset); brw_set_sampler_message(p, send, surf_index, @@ -1131,7 +1131,7 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst, /* dst = send(offset, a0.0) */ brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); - brw_set_dest(p, insn_send, dst); + brw_set_dest(p, insn_send, retype(dst, BRW_REGISTER_TYPE_UW)); brw_set_src0(p, insn_send, offset); brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/12] i965/fs: Make compact_virtual_grfs an optimization pass
Previously we disabled compact_virtual_grfs when dumping optimizations.
The idea here was to make it easier to diff the dumped shader because you
didn't have a sudden renaming. However, sometimes a bug is affected by
compact_virtual_grfs and, when this happens, you want to keep dumping
instructions with compact_virtual_grfs enabled. By turning it into an
optimization pass and dumping it along with the others, we retain the
ability to diff because you can just diff against the compact_virtual_grf
output.

Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 17 +++--
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 002d40fd..a41cb4f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1752,12 +1752,10 @@ fs_visitor::split_virtual_grfs()
  * to loop over all the virtual GRFs. Compacting them can save a lot of
  * overhead.
  */
-void
+bool
 fs_visitor::compact_virtual_grfs()
 {
-   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER))
-      return;
-
+   bool progress = false;
    int remap_table[this->virtual_grf_count];
    memset(remap_table, -1, sizeof(remap_table));

@@ -1775,7 +1773,12 @@ fs_visitor::compact_virtual_grfs()
    /* Compact the GRF arrays. */
    int new_index = 0;
    for (int i = 0; i < this->virtual_grf_count; i++) {
-      if (remap_table[i] != -1) {
+      if (remap_table[i] == -1) {
+         /* We just found an unused register. This means that we are
+          * actually going to compact something.
+          */
+         progress = true;
+      } else {
          remap_table[i] = new_index;
          virtual_grf_sizes[new_index] = virtual_grf_sizes[i];
          invalidate_live_intervals(false);
@@ -1818,6 +1821,8 @@ fs_visitor::compact_virtual_grfs()
          }
       }
    }
+
+   return progress;
 }

 /*
@@ -3255,7 +3260,7 @@ fs_visitor::run()
       iteration++;
       int pass_num = 0;

-      compact_virtual_grfs();
+      OPT(compact_virtual_grfs);

       OPT(remove_duplicate_mrf_writes);

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 402433b..56f40b4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -325,7 +325,7 @@ public:
    int choose_spill_reg(struct ra_graph *g);
    void spill_reg(int spill_reg);
    void split_virtual_grfs();
-   void compact_virtual_grfs();
+   bool compact_virtual_grfs();
    void move_uniform_array_access_to_pull_constants();
    void assign_constant_locations();
    void demote_pull_constants();
--
2.1.0
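For readers following the compaction logic outside the driver, here is a minimal standalone sketch in plain C of the remap-and-report-progress idea from the patch above. The function name compact_regs and its parameters are hypothetical and not part of Mesa; the real pass also rewrites instruction operands, which is omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pass, not Mesa code.
 * used[i]  : true if virtual register i is referenced anywhere.
 * sizes[]  : per-register sizes, compacted in place.
 * remap[i] : receives the new index of register i, or -1 if dropped.
 * *count   : number of registers on entry, number kept on return.
 * Returns true only if at least one unused register was removed.
 */
static bool
compact_regs(const bool *used, int *sizes, int *remap, int *count)
{
   bool progress = false;
   int new_index = 0;

   for (int i = 0; i < *count; i++) {
      if (!used[i]) {
         /* An unused register: something is actually being compacted. */
         remap[i] = -1;
         progress = true;
      } else {
         remap[i] = new_index;
         sizes[new_index] = sizes[i];
         new_index++;
      }
   }

   *count = new_index;
   return progress;
}
```

Because progress is only set when a register is dropped, running the sketch a second time on an already compact set returns false, which is the property an optimization loop relies on to terminate.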
[Mesa-dev] [PATCH 04/12] i965/fs: fix a comment in compact_virtual_grfs
Signed-off-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 10a3a20..af8c087 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1758,10 +1758,10 @@ fs_visitor::compact_virtual_grfs() if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) return; - /* Mark which virtual GRFs are used, and count how many. */ int remap_table[this->virtual_grf_count]; memset(remap_table, -1, sizeof(remap_table)); + /* Mark which virtual GRFs are used. */ foreach_in_list(const fs_inst, inst, &instructions) { if (inst->dst.file == GRF) remap_table[inst->dst.reg] = 0; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps
https://bugs.freedesktop.org/show_bug.cgi?id=84089

Iaroslav Andrusyak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Iaroslav Andrusyak ---
Yes, it helps

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] multi-planar tiled fourcc's in mesa and drm
So, lucky me, I have a scenario where I get to deal with NV12MT. Hurray!

I know there has been some reluctance in the past to combine tiling and color format, since in theory that could lead to a combinatorial explosion in formats. And, as long as the buffer usage is entirely within a single driver, you can approximately hide tiling (or compressed, etc.) permutations of a color format. On the other hand, there is already some precedent for fourcc or format values that represent tiled formats at the interface level (in the kernel, v4l, and in userspace, gstreamer and openmax).

But in this scenario, sharing buffers between other devices (video decoder/encoder), drm/kms (msm), and mesa (freedreno) via EGL_EXT_image_dma_buf_import[1], I don't really have any other way to pass around tiling flags. So I would propose adding custom fourcc's only in the more limited cases where formats are exchanged between devices. This should avoid an explosion of color_format * tiling_format.

For the kms part, it would mean merging a small patch to allow addfb2 for NV12MT[2].

For the mesa part, it looks like there is a bit of work needed to teach egl about multi-planar buffers, buffers where offset[n] != 0, etc. I'll start with patches to teach egl how to import plain NV12 buffers. But once that is done, for it to be of much use to me I'll need NV12MT, which means adding a new gallium format and __DRI_IMAGE_FOURCC_NV12MT.

Also, I'm still a bit undecided on how to represent multi-planar formats (i.e. a single pipe_resource encapsulating each of the planes? or a pipe_resource per plane, with pipe_sampler_view taught about textures which have multiple pipe_resource's, one per plane?).

Anyways, I'll start working on the mesa egl bits next week. The first step is just to add an 'offset' parameter to 'struct winsys_handle', which should hopefully be non-controversial. After that, I need to decide how to handle multi-planar, and I think that hinges on how folks want to handle multi-planar in gallium. I.e. if there is one pipe_resource per plane, then winsys_handle doesn't need any further change (but we need changes elsewhere); otherwise winsys_handle needs to have an array of handles.

Anyways, I'd appreciate feedback.

BR,
-R

[1] https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt
[2] http://lists.freedesktop.org/archives/dri-devel/2014-July/064828.html
[Mesa-dev] Mesa 10.3 released
Mesa 10.3 has been released! Mesa 10.3 is a feature release that includes many updates and enhancements. The full list is available in the release notes file in docs/relnotes/10.3.html. The tag in the GIT repository for Mesa 10.3 is 'mesa-10.3'. I have verified that the tag is in the correct place in the tree. Mesa 10.3 is available for download at ftp://freedesktop.org/pub/mesa/10.3/ SHA-256 checksums (can be verified with the sha256sum program): 9a1bf52040fc3dda81e83a35f944f1c3f532847dbe9fdf57161265cf71ea1bae MesaLib-10.3.0.tar.gz 0283bfe710fa449ed82e465cfa09612a269e19abb7e0382082608062ce7960b5 MesaLib-10.3.0.tar.bz2 221420763c2c3a244836a736e735612c4a6a0377b4e5223fca1e612f49906789 MesaLib-10.3.0.zip I have verified building from the .tar.bz2 file by doing: tar xjf MesaLib-10.3.0.tar.bz2 cd Mesa-10.3.0 ./configure --enable-gallium-llvm make -j6 make -j6 install I have also verified that I pushed the tag. -Emil -- Changes since mesa-10.3-rc3: Andreas Boll (1): gallium/util: add missing u_debug include Brian Paul (1): mesa: fix _mesa_free_pipeline_data() use-after-free bug Christian König (1): mesa/st: don't advertise NV_vdpau_interop if it doesn't work. Christoph Bumiller (2): nv50/ir/util: fix BitSet issues nvc0/ir: clarify recursion fix to finding first tex uses Connor Abbott (1): r300g: set register classes before interferences Emil Velikov (5): configure: bail out if building svga without libdrm configure: enable the gallium loader only when needed Bump version to 10.3 (final) docs: Update 10.3 release notes docs: Add 10.3 sha256 sums, news item and link release notes Gwenole Beauchesne (1): i965: add support for RGBA dma_buf imports. Iago Toral Quiroga (1): i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams. 
Ian Romanick (8): mesa: Document SYSTEM_VALUE_VERTEX_ID and SYSTEM_VALUE_INSTANCE_ID mesa: Add SYSTEM_VALUE_VERTEX_ID_ZERO_BASE mesa: Add SYSTEM_VALUE_BASE_VERTEX glsl/linker: Make get_main_function_signature public glsl: Add a lowering pass for gl_VertexID i965: Handle SYSTEM_VALUE_VERTEX_ID_ZERO_BASE i965: Request lowering gl_VertexID i965/vec4: Only examine virtual_grf_end for GRF sources Ilia Mirkin (4): nv50/ir: avoid array overrun when checking for supported mods nouveau: only enable the depth test if there actually is a depth buffer nouveau: only enable stencil func if the visual has stencil bits nouveau: change internal variables to avoid conflicts with macro args Jason Ekstrand (1): i965/blorp: Pass image formats seperately from the miptree Jonathan Gray (1): configure.ac: strip _GNU_SOURCE from llvm-config output Kenneth Graunke (15): i965: Handle ir_triop_csel in emit_if_gen6(). i965: Handle ir_binop_ubo_load in boolean expression code. mesa: Replace string comparisons with SYSTEM_VALUE enum checks. mesa: Fix glGetActiveAttribute for gl_VertexID when lowered. i965: Calculate start/base_vertex_location after preparing vertices. i965: Make gl_BaseVertex available in a buffer object. i965: Refactor Gen4-7 VERTEX_BUFFER_STATE emission into a helper. i965: Expose gl_BaseVertex via a vertex attribute. i965: Fix reference counting in new basevertex upload code. i965: Separate gl_InstanceID and gl_VertexID uploading. i965: Disable guardband clipping in the smaller-than-viewport case. i965: Skip allocating UNIFORM file storage for uniforms of size 0. i965/vec4: Make type_size() return 0 for samplers. glsl: Speed up constant folding for swizzles. i965: Mark delta_x/y as BAD_FILE if remapped away completely. 
Kristian Høgsberg (1): i965: Adjust fast-clear resolve rect for BDW Maarten Lankhorst (4): nouveau: re-allocate bo's on overflow nouveau: fix MPEG4 hw decoding nouveau: rework reference frame handling nouveau: remove unneeded assert Matt Turner (1): i965/vec4: Reswizzle sources when necessary. Richard Sandiford (1): gallivm: Fix uses of 2^24 Ulrich Weigand (1): gallivm: Fix Altivec pack intrinsics for little-endian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r300g: implement MSAA copies by resolving and upsampling
From: Marek Olšák There's no other way. It will use hw resolve + blit. --- src/gallium/drivers/r300/r300_blit.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r300/r300_blit.c b/src/gallium/drivers/r300/r300_blit.c index 2320abb..4e7efc5 100644 --- a/src/gallium/drivers/r300/r300_blit.c +++ b/src/gallium/drivers/r300/r300_blit.c @@ -679,7 +679,9 @@ static boolean r300_is_simple_msaa_resolve(const struct pipe_blit_info *info) unsigned dst_width = u_minify(info->dst.resource->width0, info->dst.level); unsigned dst_height = u_minify(info->dst.resource->height0, info->dst.level); -return info->dst.resource->format == info->src.resource->format && +return info->src.resource->nr_samples > 1 && + info->dst.resource->nr_samples <= 1 && + info->dst.resource->format == info->src.resource->format && info->dst.resource->format == info->dst.format && info->src.resource->format == info->src.format && !info->scissor_enable && @@ -803,7 +805,6 @@ static void r300_blit(struct pipe_context *pipe, /* MSAA resolve. */ if (info.src.resource->nr_samples > 1 && -info.dst.resource->nr_samples <= 1 && !util_format_is_depth_or_stencil(info.src.resource->format)) { r300_msaa_resolve(pipe, &info); return; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/4] st/mesa: drop dependence on API profile in st_init_extensions
From: Marek Olšák The extensions and limits being set in the conditional block are core-only anyway and don't have any effect on other profiles. --- src/mesa/state_tracker/st_context.c| 2 +- src/mesa/state_tracker/st_extensions.c | 20 +--- src/mesa/state_tracker/st_extensions.h | 1 - src/mesa/state_tracker/st_manager.c| 2 +- 4 files changed, 11 insertions(+), 14 deletions(-) diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index 768a667..1723513 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -242,7 +242,7 @@ st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe, /* GL limits and extensions */ st_init_limits(st->pipe->screen, &ctx->Const, &ctx->Extensions); - st_init_extensions(st->pipe->screen, ctx->API, &ctx->Const, + st_init_extensions(st->pipe->screen, &ctx->Const, &ctx->Extensions, &st->options, ctx->Mesa_DXTn); /* Enable shader-based fallbacks for ARB_color_buffer_float if needed. */ diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index c7bc0ca..681723a 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -407,7 +407,6 @@ get_max_samples_for_formats(struct pipe_screen *screen, * Some fine tuning may still be needed. 
*/ void st_init_extensions(struct pipe_screen *screen, -gl_api api, struct gl_constants *consts, struct gl_extensions *extensions, struct st_config_options *options, @@ -844,17 +843,16 @@ void st_init_extensions(struct pipe_screen *screen, consts->DisableVaryingPacking = GL_TRUE; } - if (api == API_OPENGL_CORE) { - consts->MaxViewports = screen->get_param(screen, PIPE_CAP_MAX_VIEWPORTS); - if (consts->MaxViewports >= 16) { - consts->ViewportBounds.Min = -16384.0; - consts->ViewportBounds.Max = 16384.0; - extensions->ARB_viewport_array = GL_TRUE; - extensions->ARB_fragment_layer_viewport = GL_TRUE; - if (extensions->AMD_vertex_shader_layer) -extensions->AMD_vertex_shader_viewport_index = GL_TRUE; - } + consts->MaxViewports = screen->get_param(screen, PIPE_CAP_MAX_VIEWPORTS); + if (consts->MaxViewports >= 16) { + consts->ViewportBounds.Min = -16384.0; + consts->ViewportBounds.Max = 16384.0; + extensions->ARB_viewport_array = GL_TRUE; + extensions->ARB_fragment_layer_viewport = GL_TRUE; + if (extensions->AMD_vertex_shader_layer) + extensions->AMD_vertex_shader_viewport_index = GL_TRUE; } + if (consts->MaxProgramTextureGatherComponents > 0) extensions->ARB_texture_gather = GL_TRUE; diff --git a/src/mesa/state_tracker/st_extensions.h b/src/mesa/state_tracker/st_extensions.h index 8d2724d..faff11f 100644 --- a/src/mesa/state_tracker/st_extensions.h +++ b/src/mesa/state_tracker/st_extensions.h @@ -38,7 +38,6 @@ extern void st_init_limits(struct pipe_screen *screen, struct gl_extensions *extensions); extern void st_init_extensions(struct pipe_screen *screen, - gl_api api, struct gl_constants *consts, struct gl_extensions *extensions, struct st_config_options *options, diff --git a/src/mesa/state_tracker/st_manager.c b/src/mesa/state_tracker/st_manager.c index 7bc3326..df6de73 100644 --- a/src/mesa/state_tracker/st_manager.c +++ b/src/mesa/state_tracker/st_manager.c @@ -928,7 +928,7 @@ static unsigned get_version(struct pipe_screen *screen, 
_mesa_init_extensions(&extensions); st_init_limits(screen, &consts, &extensions); - st_init_extensions(screen, api, &consts, &extensions, options, GL_TRUE); + st_init_extensions(screen, &consts, &extensions, options, GL_TRUE); return _mesa_get_version(&extensions, &consts, api); } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] st/mesa: redefine mapping from VARYING_SLOT_TEXi/PNTC/VARi to TGSI GENERIC[i]
From: Marek Olšák Generic varyings in TGSI were based on the value of VARYING_SLOT_TEX0, so VAR0 was always GENERIC[22] (with tessellation patches). Some drivers might not be able to cope with that. This commit defines a proper mapping, so that PNTC is GENERIC[8] and VAR0 is GENERIC[9]. --- src/mesa/state_tracker/st_atom_rasterizer.c | 3 +- src/mesa/state_tracker/st_program.c | 46 - src/mesa/state_tracker/st_program.h | 25 3 files changed, 52 insertions(+), 22 deletions(-) diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c index 71b7f1b..a228538 100644 --- a/src/mesa/state_tracker/st_atom_rasterizer.c +++ b/src/mesa/state_tracker/st_atom_rasterizer.c @@ -33,6 +33,7 @@ #include "main/macros.h" #include "st_context.h" #include "st_atom.h" +#include "st_program.h" #include "pipe/p_context.h" #include "pipe/p_defines.h" #include "cso_cache/cso_context.h" @@ -174,7 +175,7 @@ static void update_raster_state( struct st_context *st ) if (!st->needs_texcoord_semantic && fragProg->Base.InputsRead & VARYING_BIT_PNTC) { raster->sprite_coord_enable |= -1 << (VARYING_SLOT_PNTC - VARYING_SLOT_TEX0); +1 << st_get_generic_varying_index(st, VARYING_SLOT_PNTC); } raster->point_quad_rasterization = 1; diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index fbf8930..926086b 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_program.c @@ -275,17 +275,18 @@ st_prepare_vertex_program(struct gl_context *ctx, case VARYING_SLOT_TEX5: case VARYING_SLOT_TEX6: case VARYING_SLOT_TEX7: -stvp->output_semantic_name[slot] = st->needs_texcoord_semantic ? 
- TGSI_SEMANTIC_TEXCOORD : TGSI_SEMANTIC_GENERIC; -stvp->output_semantic_index[slot] = attr - VARYING_SLOT_TEX0; -break; - +if (st->needs_texcoord_semantic) { + stvp->output_semantic_name[slot] = TGSI_SEMANTIC_TEXCOORD; + stvp->output_semantic_index[slot] = attr - VARYING_SLOT_TEX0; + break; +} +/* fall through */ case VARYING_SLOT_VAR0: default: assert(attr < VARYING_SLOT_MAX); stvp->output_semantic_name[slot] = TGSI_SEMANTIC_GENERIC; -stvp->output_semantic_index[slot] = st->needs_texcoord_semantic ? - (attr - VARYING_SLOT_VAR0) : (attr - VARYING_SLOT_TEX0); +stvp->output_semantic_index[slot] = + st_get_generic_varying_index(st, attr); break; } } @@ -655,9 +656,8 @@ st_translate_fragment_program(struct st_context *st, * the user varyings on VAR0. Otherwise, we use TEX0 as base index. */ assert(attr >= VARYING_SLOT_TEX0); -input_semantic_index[slot] = st->needs_texcoord_semantic ? - (attr - VARYING_SLOT_VAR0) : (attr - VARYING_SLOT_TEX0); input_semantic_name[slot] = TGSI_SEMANTIC_GENERIC; +input_semantic_index[slot] = st_get_generic_varying_index(st, attr); if (attr == VARYING_SLOT_PNTC) interpMode[slot] = TGSI_INTERPOLATE_LINEAR; else @@ -974,16 +974,18 @@ st_translate_geometry_program(struct st_context *st, case VARYING_SLOT_TEX5: case VARYING_SLOT_TEX6: case VARYING_SLOT_TEX7: -stgp->input_semantic_name[slot] = st->needs_texcoord_semantic ? - TGSI_SEMANTIC_TEXCOORD : TGSI_SEMANTIC_GENERIC; -stgp->input_semantic_index[slot] = (attr - VARYING_SLOT_TEX0); -break; +if (st->needs_texcoord_semantic) { + stgp->input_semantic_name[slot] = TGSI_SEMANTIC_TEXCOORD; + stgp->input_semantic_index[slot] = attr - VARYING_SLOT_TEX0; + break; +} +/* fall through */ case VARYING_SLOT_VAR0: default: assert(attr >= VARYING_SLOT_VAR0 && attr < VARYING_SLOT_MAX); stgp->input_semantic_name[slot] = TGSI_SEMANTIC_GENERIC; -stgp->input_semantic_index[slot] = st->needs_texcoord_semantic ? 
- (attr - VARYING_SLOT_VAR0) : (attr - VARYING_SLOT_TEX0); +stgp->input_semantic_index[slot] = + st_get_generic_varying_index(st, attr); break; } } @@ -1069,17 +1071,19 @@ st_translate_geometry_program(struct st_context *st, case VARYING_SLOT_TEX5: case VARYING_SLOT_TEX6: case VARYING_SLOT_TEX7: -gs_output_semantic_name[slot] = st->needs_texcoord_semantic ? - TGSI_SEMANTIC_TEXCOORD : TGSI_SEMANTIC_GENERIC; -gs_output_semantic_index[slot] = (attr - VARYING_SLOT_TEX0); -break; +if (st->needs
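The body of st_get_generic_varying_index is cut off in this archive, so the following is only a guess at its shape, written as a standalone C sketch from the commit message (PNTC is GENERIC[8], VAR0 is GENERIC[9]) and from the removed code (VAR0 maps to index 0 when needs_texcoord_semantic is set). The SLOT_* values and the function name are hypothetical placeholders, not Mesa's VARYING_SLOT_* numbers, and the TEX0..TEX7 to 0..7 mapping is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot numbering, for illustration only. */
enum {
   SLOT_TEX0 = 4,
   SLOT_TEX7 = SLOT_TEX0 + 7,
   SLOT_PNTC = 20,
   SLOT_VAR0 = 30
};

/* Sketch of a fixed mapping from a varying slot to a GENERIC[] index. */
static unsigned
generic_varying_index(bool needs_texcoord_semantic, unsigned attr)
{
   if (attr >= SLOT_VAR0) {
      /* With dedicated TEXCOORD/PCOORD semantics, user varyings start at 0;
       * otherwise they start after the 8 texcoords and PNTC.
       */
      if (needs_texcoord_semantic)
         return attr - SLOT_VAR0;
      return 9 + (attr - SLOT_VAR0);
   }

   /* TEXi and PNTC only become generics without the dedicated semantics. */
   assert(!needs_texcoord_semantic);

   if (attr == SLOT_PNTC)
      return 8;

   assert(attr >= SLOT_TEX0 && attr <= SLOT_TEX7);
   return attr - SLOT_TEX0;
}
```

The point of a fixed mapping like this is that the generic index no longer depends on how far apart the slot enums happen to be, which is what made VAR0 land on GENERIC[22] before.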
[Mesa-dev] [PATCH 3/4] st/mesa: don't set coord_enable for gl_PointCoord if using TGSI_SEMANTIC_PCOORD
From: Marek Olšák This was missed when Christoph Bumiller added PIPE_CAP_TGSI_TEXCOORD. --- src/mesa/state_tracker/st_atom_rasterizer.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c index 2bad643..71b7f1b 100644 --- a/src/mesa/state_tracker/st_atom_rasterizer.c +++ b/src/mesa/state_tracker/st_atom_rasterizer.c @@ -171,7 +171,8 @@ static void update_raster_state( struct st_context *st ) raster->sprite_coord_enable |= 1 << i; } } - if (fragProg->Base.InputsRead & VARYING_BIT_PNTC) { + if (!st->needs_texcoord_semantic && + fragProg->Base.InputsRead & VARYING_BIT_PNTC) { raster->sprite_coord_enable |= 1 << (VARYING_SLOT_PNTC - VARYING_SLOT_TEX0); } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] st/mesa: use UniformBooleanTrue in glsl_to_tgsi
From: Marek Olšák

Just for consistency. This doesn't fix anything as the original code was
already pretty good.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b338a98..57f43a6 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2617,10 +2617,7 @@ glsl_to_tgsi_visitor::visit(ir_constant *ir)
    case GLSL_TYPE_BOOL:
       gl_type = native_integers ? GL_BOOL : GL_FLOAT;
       for (i = 0; i < ir->type->vector_elements; i++) {
-         if (native_integers)
-            values[i].u = ir->value.b[i] ? ~0 : 0;
-         else
-            values[i].f = ir->value.b[i];
+         values[i].u = ir->value.b[i] ? ctx->Const.UniformBooleanTrue : 0;
       }
       break;
    default:
--
1.9.1
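To see why a single integer store can replace both branches, here is a small standalone C sketch. It assumes (this is not stated in the patch itself) that UniformBooleanTrue holds ~0 on drivers with native integers and the bit pattern of 1.0f otherwise; the names const_value, boolean_true_for and bool_constant are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same storage viewed as an unsigned integer or as a float. */
union const_value {
   uint32_t u;
   float f;
};

/* Return the raw bits of a float without violating aliasing rules. */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

/* Assumed choice of the "true" value: all ones, or the bits of 1.0f. */
static uint32_t
boolean_true_for(bool native_integers)
{
   return native_integers ? ~0u : float_bits(1.0f);
}

/* One code path for both kinds of driver, as in the patched loop body. */
static union const_value
bool_constant(bool b, uint32_t uniform_boolean_true)
{
   union const_value v;
   v.u = b ? uniform_boolean_true : 0;
   return v;
}
```

Under that assumption, a float-only driver reading the constant through the float member still sees 1.0f or 0.0f, which matches what the old else branch produced.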
[Mesa-dev] [PATCH 1/2] mesa: don't set ES versions to GLSLVersion in _mesa_init_constants
From: Marek Olšák No place in Mesa expects an ES version there. Drivers don't even set it like this. --- src/mesa/main/context.c | 12 ++-- src/mesa/main/mtypes.h | 2 +- 2 files changed, 3 insertions(+), 11 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 682b9c7..53fb9c6 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -642,16 +642,8 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api) consts->MaxGeometryTotalOutputComponents = MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS; /* Shading language version */ - if (api == API_OPENGL_COMPAT || api == API_OPENGL_CORE) { - consts->GLSLVersion = 120; - _mesa_override_glsl_version(consts); - } - else if (api == API_OPENGLES2) { - consts->GLSLVersion = 100; - } - else if (api == API_OPENGLES) { - consts->GLSLVersion = 0; /* GLSL not supported */ - } + consts->GLSLVersion = 120; + _mesa_override_glsl_version(consts); /* GL_ARB_framebuffer_object */ consts->MaxSamples = 0; diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 553a216..7c237bd 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3452,7 +3452,7 @@ struct gl_constants GLuint MaxGeometryOutputVertices; GLuint MaxGeometryTotalOutputComponents; - GLuint GLSLVersion; /**< GLSL version supported (ex: 120 = 1.20) */ + GLuint GLSLVersion; /**< Desktop GLSL version supported (ex: 120 = 1.20) */ /** * Changes default GLSL extension behavior from "error" to "warn". It's out -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] mesa: allow forcing >=3.1 compatibility contexts with MESA_GL_VERSION_OVERRIDE
From: Marek Olšák E.g. the 4.0 compatibility profile can be forced with: MESA_GL_VERSION_OVERRIDE=4.0COMPAT Some tests that I have require 4.0 compatibility. --- src/mesa/main/version.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c index 4dea530..71f7011 100644 --- a/src/mesa/main/version.c +++ b/src/mesa/main/version.c @@ -57,13 +57,15 @@ check_for_ending(const char *string, const char *ending) * fwd_context is only valid if version > 0 */ static void -get_gl_override(int *version, GLboolean *fwd_context) +get_gl_override(int *version, GLboolean *fwd_context, +GLboolean *compat_context) { const char *env_var = "MESA_GL_VERSION_OVERRIDE"; const char *version_str; int major, minor, n; static int override_version = -1; static GLboolean fc_suffix = GL_FALSE; + static GLboolean compat_suffix = GL_FALSE; if (override_version < 0) { override_version = 0; @@ -71,6 +73,7 @@ get_gl_override(int *version, GLboolean *fwd_context) version_str = getenv(env_var); if (version_str) { fc_suffix = check_for_ending(version_str, "FC"); + compat_suffix = check_for_ending(version_str, "COMPAT"); n = sscanf(version_str, "%u.%u", &major, &minor); if (n != 2) { @@ -87,6 +90,7 @@ get_gl_override(int *version, GLboolean *fwd_context) *version = override_version; *fwd_context = fc_suffix; + *compat_context = compat_suffix; } /** @@ -129,16 +133,16 @@ _mesa_override_gl_version_contextless(struct gl_constants *consts, gl_api *apiOut, GLuint *versionOut) { int version; - GLboolean fwd_context; + GLboolean fwd_context, compat_context; - get_gl_override(&version, &fwd_context); + get_gl_override(&version, &fwd_context, &compat_context); if (version > 0) { *versionOut = version; if (version >= 30 && fwd_context) { *apiOut = API_OPENGL_CORE; consts->ContextFlags |= GL_CONTEXT_FLAG_FORWARD_COMPATIBLE_BIT; - } else if (version >= 31) { + } else if (version >= 31 && !compat_context) { *apiOut = API_OPENGL_CORE; } else { *apiOut 
= API_OPENGL_COMPAT; @@ -166,9 +170,9 @@ int _mesa_get_gl_version_override(void) { int version; - GLboolean fwd_context; + GLboolean fwd_context, compat_context; - get_gl_override(&version, &fwd_context); + get_gl_override(&version, &fwd_context, &compat_context); return version; } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
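The selection rule in this patch can be tried out in isolation with the following standalone C sketch. It mirrors the parsing and the if/else chain above, but the names gl_override, parse_override and pick_api are hypothetical, and the encoding of "4.0" as 40 (major * 10 + minor) is inferred from the comparisons against 30 and 31 rather than shown in the diff.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum api_kind { API_COMPAT, API_CORE };

struct gl_override {
   int version;          /* e.g. 40 for "4.0"; 0 if unset or malformed */
   bool fwd_context;     /* string ends in "FC" */
   bool compat_context;  /* string ends in "COMPAT" */
};

/* True if string s ends with the given suffix. */
static bool
ends_with(const char *s, const char *ending)
{
   size_t len1 = strlen(s);
   size_t len2 = strlen(ending);
   return len2 <= len1 && strcmp(s + (len1 - len2), ending) == 0;
}

/* Parse a value such as "3.0FC" or "4.0COMPAT". */
static struct gl_override
parse_override(const char *str)
{
   struct gl_override o = { 0, false, false };
   unsigned major, minor;

   if (str && sscanf(str, "%u.%u", &major, &minor) == 2) {
      o.version = (int)(major * 10 + minor);
      o.fwd_context = ends_with(str, "FC");
      o.compat_context = ends_with(str, "COMPAT");
   }
   return o;
}

/* Same decision order as the patched if/else chain. */
static enum api_kind
pick_api(const struct gl_override *o)
{
   if (o->version >= 30 && o->fwd_context)
      return API_CORE;
   if (o->version >= 31 && !o->compat_context)
      return API_CORE;
   return API_COMPAT;
}
```

With this rule, "4.0" alone still selects a core profile, while "4.0COMPAT" falls through to the compatibility branch, which is the behaviour the commit message asks for.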
[Mesa-dev] Mesa 10.2.8
(Resend with GPG signature.) Mesa 10.2.8 has been released. Mesa 10.2.8 is a bug fix release fixing bugs since the 10.2.7 release, (see below for a list of changes). The tag in the git repository for Mesa 10.2.8 is 'mesa-10.2.8'. Mesa 10.2.8 is available for download at ftp://freedesktop.org/pub/mesa/10.2.8/ SHA-256 checksums (can be verified with the sha256sum program): 4c5a25ccaf1a9734bbd10d62a1420cc8fd35a1060ce679f2fc846769a25fbeec MesaLib-10.2.8.tar.gz 1ef9ad3f241788d454f2ff8c9d65b6849dfc31c8fe91f70fd2930b81c8af1398 MesaLib-10.2.8.tar.bz2 d26218da3b44734b1d555267b4c63c48803c4c8b14d2bc53071be57014da37fa MesaLib-10.2.8.zip I have verified building from the .tar.bz2 file by doing: tar xjf MesaLib-10.2.8.tar.bz2 cd Mesa-10.2.8 ./configure --enable-gallium-llvm make -j6 make -j6 install I have also verified that I pushed the tag. -Emil -- Changes from 10.2.7 to 10.2.8: Aaron Watry (1): gallivm: Fix build after LLVM commit 211259 Christoph Bumiller (2): nv50/ir/util: fix BitSet issues nvc0/ir: clarify recursion fix to finding first tex uses Emil Velikov (4): docs: Add sha256 sums for the 10.2.7 release configure: bail out if building svga without libdrm Update VERSION to 10.2.8 Add release notes for the 10.2.8 release Ilia Mirkin (4): nv50/ir: avoid array overrun when checking for supported mods nouveau: only enable the depth test if there actually is a depth buffer nouveau: only enable stencil func if the visual has stencil bits nouveau: change internal variables to avoid conflicts with macro args Jonathan Gray (1): configure.ac: strip _GNU_SOURCE from llvm-config output José Fonseca (1): gallivm: Disable workaround for PR12833 on LLVM 3.2+. 
Maarten Lankhorst (4): nouveau: re-allocate bo's on overflow nouveau: fix MPEG4 hw decoding nouveau: rework reference frame handling nouveau: remove unneeded assert Marek Olšák (3): r600g,radeonsi: make sure there's enough CS space before resuming queries mesa: set UniformBooleanTrue = 1.0f by default st/mesa: use 1.0f as boolean true on drivers without integer support Richard Sandiford (1): gallivm: Fix uses of 2^24 Roland Scheidegger (1): gallivm: set mcpu when initializing llvm execution engine Thomas Hellstrom (1): winsys/svga: Fix incorrect type usage in IOCTL v2 signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] mesa: Replace gl_client_array usage in _mesa_print_arrays()
On Friday 19 September 2014, Kenneth Graunke wrote: > For now, this prints out the same information as before - just using the > newer/non-derived structures. Printing out each structure's fields > separately might be more useful, but I've never used this code, so I'm > not sure. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/main/varray.c | 47 +++ > 1 file changed, 31 insertions(+), 16 deletions(-) > > diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c > index 09bf52c..380a32e 100644 > --- a/src/mesa/main/varray.c > +++ b/src/mesa/main/varray.c > @@ -1904,16 +1904,19 @@ _mesa_copy_vertex_buffer_binding(struct gl_context > *ctx, > * Print vertex array's fields. > */ > static void > -print_array(const char *name, GLint index, const struct gl_client_array > *array) > +print_array(const char *name, GLint index, > +const struct gl_vertex_attrib_array *attrib, > +const struct gl_vertex_buffer_binding *binding) > { > if (index >= 0) >printf(" %s[%d]: ", name, index); > else >printf(" %s: ", name); > printf("Ptr=%p, Type=0x%x, Size=%d, ElemSize=%u, Stride=%d, > Buffer=%u(Size %lu)\n", > - array->Ptr, array->Type, array->Size, > - array->_ElementSize, array->StrideB, > - array->BufferObj->Name, (unsigned long) array->BufferObj->Size); > + _mesa_vertex_attrib_address(attrib, binding), > + attrib->Type, attrib->Size, > + attrib->_ElementSize, binding->Stride, > + binding->BufferObj->Name, (unsigned long) binding->BufferObj->Size); > } > > > @@ -1927,18 +1930,30 @@ _mesa_print_arrays(struct gl_context *ctx) > GLuint i; > > printf("Array Object %u\n", vao->Name); > - if (vao->_VertexAttrib[VERT_ATTRIB_POS].Enabled) > - print_array("Vertex", -1, &vao->_VertexAttrib[VERT_ATTRIB_POS]); > - if (vao->_VertexAttrib[VERT_ATTRIB_NORMAL].Enabled) > - print_array("Normal", -1, &vao->_VertexAttrib[VERT_ATTRIB_NORMAL]); > - if (vao->_VertexAttrib[VERT_ATTRIB_COLOR0].Enabled) > - print_array("Color", -1, &vao->_VertexAttrib[VERT_ATTRIB_COLOR0]); > - for (i = 0; i < 
ctx->Const.MaxTextureCoordUnits; i++) > - if (vao->_VertexAttrib[VERT_ATTRIB_TEX(i)].Enabled) > - print_array("TexCoord", i, &vao->_VertexAttrib[VERT_ATTRIB_TEX(i)]); > - for (i = 0; i < VERT_ATTRIB_GENERIC_MAX; i++) > - if (vao->_VertexAttrib[VERT_ATTRIB_GENERIC(i)].Enabled) > - print_array("Attrib", i, > &vao->_VertexAttrib[VERT_ATTRIB_GENERIC(i)]); > + if (vao->VertexAttrib[VERT_ATTRIB_POS].Enabled) { > + print_array("Vertex", -1, &vao->VertexAttrib[VERT_ATTRIB_POS], > +&vao->VertexBinding[VERT_ATTRIB_POS]); > + } > + if (vao->VertexAttrib[VERT_ATTRIB_NORMAL].Enabled) { > + print_array("Normal", -1, &vao->VertexAttrib[VERT_ATTRIB_NORMAL], > +&vao->VertexBinding[VERT_ATTRIB_NORMAL]); > + } > + if (vao->VertexAttrib[VERT_ATTRIB_COLOR0].Enabled) { > + print_array("Color", -1, &vao->VertexAttrib[VERT_ATTRIB_COLOR0], > + &vao->VertexBinding[VERT_ATTRIB_COLOR0]); > + } > + for (i = 0; i < ctx->Const.MaxTextureCoordUnits; i++) { > + if (vao->VertexAttrib[VERT_ATTRIB_TEX(i)].Enabled) { > + print_array("TexCoord", i, &vao->VertexAttrib[VERT_ATTRIB_TEX(i)], > +&vao->VertexBinding[VERT_ATTRIB_TEX(i)]); > + } > + } > + for (i = 0; i < VERT_ATTRIB_GENERIC_MAX; i++) { > + if (vao->VertexAttrib[VERT_ATTRIB_GENERIC(i)].Enabled) { > + print_array("Attrib", i, &vao->VertexAttrib[VERT_ATTRIB_GENERIC(i)], > + > &vao->VertexBinding[VERT_ATTRIB_GENERIC(i)]); The generic attributes are not always associated with the vertex binding of the same index. The VertexBinding array should be indexed by gl_vertex_attrib_array::VertexBinding. I think it would be easier to just pass the vao pointer and the index to print_array() and let it figure out which attrib array and binding it should use. > + } > + } > } > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
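The review comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `sketch_*` structures and functions are simplified, hypothetical stand-ins rather than Mesa's real `gl_vertex_array_object` and friends, and only the `VertexBinding` index field is taken from the review. The point is that the helper receives the VAO and the attribute index, then follows the attribute's own binding index instead of reusing the attribute index.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the Mesa structures discussed in
 * the review; only the fields needed for the illustration are modelled. */
#define SKETCH_MAX_ATTRIBS 4

struct sketch_attrib {
   unsigned Type;
   int Size;
   unsigned VertexBinding;   /* index into the VAO's binding array */
};

struct sketch_binding {
   int Stride;
   unsigned BufferName;
};

struct sketch_vao {
   struct sketch_attrib VertexAttrib[SKETCH_MAX_ATTRIBS];
   struct sketch_binding VertexBinding[SKETCH_MAX_ATTRIBS];
};

/* Resolve the binding through the attribute's VertexBinding field rather
 * than assuming attribute i always uses binding i. */
static const struct sketch_binding *
sketch_binding_for_attrib(const struct sketch_vao *vao, unsigned attrib_index)
{
   assert(attrib_index < SKETCH_MAX_ATTRIBS);
   const struct sketch_attrib *attrib = &vao->VertexAttrib[attrib_index];
   assert(attrib->VertexBinding < SKETCH_MAX_ATTRIBS);
   return &vao->VertexBinding[attrib->VertexBinding];
}

/* Format one array description into buf; the caller passes only the VAO and
 * the attribute index, and the helper picks attrib and binding itself. */
static int
sketch_format_array(char *buf, size_t len, const char *name,
                    const struct sketch_vao *vao, unsigned attrib_index)
{
   const struct sketch_attrib *attrib = &vao->VertexAttrib[attrib_index];
   const struct sketch_binding *binding =
      sketch_binding_for_attrib(vao, attrib_index);

   return snprintf(buf, len, "%s[%u]: Type=0x%x, Size=%d, Stride=%d, Buffer=%u",
                   name, attrib_index, attrib->Type, attrib->Size,
                   binding->Stride, binding->BufferName);
}
```

With this shape, a generic attribute that has been redirected to another binding point prints the stride and buffer it actually reads from.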
Re: [Mesa-dev] [PATCH 1/5] mesa: Remove some dead helper functions.
Patches 1, 2, 3 and 4 are: Reviewed-by: Fredrik Höglund On Friday 19 September 2014, Kenneth Graunke wrote: > Dead since the _MaxElement removal, but these functions seemed generally > applicable, so I decided to remove them in a separate patch. > > Signed-off-by: Kenneth Graunke > --- > src/mesa/main/arrayobj.h | 26 -- > 1 file changed, 26 deletions(-) > > diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h > index 1819cd1..3c1f918 100644 > --- a/src/mesa/main/arrayobj.h > +++ b/src/mesa/main/arrayobj.h > @@ -78,32 +78,6 @@ extern void > _mesa_update_vao_client_arrays(struct gl_context *ctx, > struct gl_vertex_array_object *vao); > > - > -/** Returns the bitmask of all enabled arrays in fixed function mode. > - * > - * In fixed function mode only the traditional fixed function arrays > - * are available. > - */ > -static inline GLbitfield64 > -_mesa_array_object_get_enabled_ff(const struct gl_vertex_array_object *vao) > -{ > - return vao->_Enabled & VERT_BIT_FF_ALL; > -} > - > -/** Returns the bitmask of all enabled arrays in arb/glsl shader mode. > - * > - * In arb/glsl shader mode all the fixed function and the arb/glsl generic > - * arrays are available. Only the first generic array takes > - * precedence over the legacy position array. > - */ > -static inline GLbitfield64 > -_mesa_array_object_get_enabled_arb(const struct gl_vertex_array_object *vao) > -{ > - GLbitfield64 enabled = vao->_Enabled; > - return enabled & ~(VERT_BIT_POS & (enabled >> VERT_ATTRIB_GENERIC0)); > -} > - > - > /* > * API functions > */ > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
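For readers unfamiliar with the bit trick in the removed `_mesa_array_object_get_enabled_arb()`, here is a small stand-alone sketch. The `SKETCH_*` constants are illustrative assumptions, not Mesa's actual enum values; the only property relied on is that the position attribute occupies bit 0, so shifting the mask right by the generic0 index lines generic0's bit up with the position bit.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in Mesa's headers.
 * The trick relies on the position attribute being bit 0. */
#define SKETCH_ATTRIB_POS       0
#define SKETCH_ATTRIB_GENERIC0  16
#define SKETCH_BIT_POS          (UINT64_C(1) << SKETCH_ATTRIB_POS)
#define SKETCH_BIT_GENERIC0     (UINT64_C(1) << SKETCH_ATTRIB_GENERIC0)

/* Clear the legacy position bit when generic attribute 0 is enabled, which
 * is how "generic0 takes precedence over position" is expressed. */
static uint64_t
sketch_enabled_arb(uint64_t enabled)
{
   /* Move generic0's bit down to bit 0 and keep only that bit. */
   const uint64_t generic0_as_pos =
      (enabled >> SKETCH_ATTRIB_GENERIC0) & SKETCH_BIT_POS;

   /* Drop the position bit only if generic0 was set. */
   return enabled & ~generic0_as_pos;
}
```

If both bits are set, only generic0 survives; if only one of them is set, the mask is returned unchanged.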
Re: [Mesa-dev] [PATCH] gallivm: add information about different sampler/view units if analyzing shader
On 19/09/14 18:12, srol...@vmware.com wrote: From: Roland Scheidegger Useful to know in some cases. --- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h | 6 ++ src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c | 4 2 files changed, 10 insertions(+) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h index 85411ce..029ca3c 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h @@ -127,6 +127,12 @@ struct lp_tgsi_info unsigned indirect_textures:1; /* +* Whether any of the texture (sample) ocpodes use different sampler +* and sampler view unit. +*/ + unsigned sampler_texture_units_different:1; + + /* * Whether any immediate values are outside the range of 0 and 1 */ unsigned unclamped_immediates:1; diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c index fcaa201..55acea8 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c @@ -243,6 +243,10 @@ analyse_sample(struct analysis_context *ctx, tex_info->texture_unit = inst->Src[1].Register.Index; tex_info->sampler_unit = inst->Src[2].Register.Index; + if (tex_info->texture_unit != tex_info->sampler_unit) { + info->sampler_texture_units_different = TRUE; + } + if (modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_DERIV || modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_LOD || modifier == LP_BLD_TEX_MODIFIER_LOD_BIAS || shadow) { LGTM. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallivm: add information about different sampler/view units if analyzing shader
From: Roland Scheidegger Useful to know in some cases. --- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h | 6 ++ src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c | 4 2 files changed, 10 insertions(+) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h index 85411ce..029ca3c 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h @@ -127,6 +127,12 @@ struct lp_tgsi_info unsigned indirect_textures:1; /* +* Whether any of the texture (sample) ocpodes use different sampler +* and sampler view unit. +*/ + unsigned sampler_texture_units_different:1; + + /* * Whether any immediate values are outside the range of 0 and 1 */ unsigned unclamped_immediates:1; diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c index fcaa201..55acea8 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c @@ -243,6 +243,10 @@ analyse_sample(struct analysis_context *ctx, tex_info->texture_unit = inst->Src[1].Register.Index; tex_info->sampler_unit = inst->Src[2].Register.Index; + if (tex_info->texture_unit != tex_info->sampler_unit) { + info->sampler_texture_units_different = TRUE; + } + if (modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_DERIV || modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_LOD || modifier == LP_BLD_TEX_MODIFIER_LOD_BIAS || shadow) { -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps
https://bugs.freedesktop.org/show_bug.cgi?id=84089 --- Comment #4 from Tom Stellard --- Created attachment 106551 --> https://bugs.freedesktop.org/attachment.cgi?id=106551&action=edit Fix Can you try this patch? Make sure you replace the original assert in the mesa code. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 1/5] mesa: Remove some dead helper functions.
The series looks good to me. Nice catches. Reviewed-by: Brian Paul -Brian On 09/18/2014 05:32 PM, Kenneth Graunke wrote: Dead since the _MaxElement removal, but these functions seemed generally applicable, so I decided to remove them in a separate patch. Signed-off-by: Kenneth Graunke --- src/mesa/main/arrayobj.h | 26 -- 1 file changed, 26 deletions(-) diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h index 1819cd1..3c1f918 100644 --- a/src/mesa/main/arrayobj.h +++ b/src/mesa/main/arrayobj.h @@ -78,32 +78,6 @@ extern void _mesa_update_vao_client_arrays(struct gl_context *ctx, struct gl_vertex_array_object *vao); - -/** Returns the bitmask of all enabled arrays in fixed function mode. - * - * In fixed function mode only the traditional fixed function arrays - * are available. - */ -static inline GLbitfield64 -_mesa_array_object_get_enabled_ff(const struct gl_vertex_array_object *vao) -{ - return vao->_Enabled & VERT_BIT_FF_ALL; -} - -/** Returns the bitmask of all enabled arrays in arb/glsl shader mode. - * - * In arb/glsl shader mode all the fixed function and the arb/glsl generic - * arrays are available. Only the first generic array takes - * precedence over the legacy position array. - */ -static inline GLbitfield64 -_mesa_array_object_get_enabled_arb(const struct gl_vertex_array_object *vao) -{ - GLbitfield64 enabled = vao->_Enabled; - return enabled & ~(VERT_BIT_POS & (enabled >> VERT_ATTRIB_GENERIC0)); -} - - /* * API functions */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4] glsl: Expand matrix flip optimization pass to cover more cases.
On vie, 2014-09-19 at 21:16 +1000, Timothy Arceri wrote: > On Fri, 2014-09-19 at 12:52 +0200, Iago Toral Quiroga wrote: > > Also, as suggested by Ian Romanick, make it so we don't need a bunch of > > individual handles to flippable matrices, instead we register > > matrix/transpose_matrix pairs in a hash table for all built-in matrices > > using the non-transpose matrix name as key. > > --- > > src/glsl/opt_flip_matrices.cpp | 159 > > +++-- > > 1 file changed, 121 insertions(+), 38 deletions(-) > > > > I think this never got the reviewed-by... This is a rebased version of the > > v3 > > patch that also fixes a silly mistake that I had introduced in that version. > > No piglit regressions observed on SandyBridge. > > > > Ian, do you think this version is good? > > > > diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp > > index 04c6170..bb449d6 100644 > > --- a/src/glsl/opt_flip_matrices.cpp > > +++ b/src/glsl/opt_flip_matrices.cpp > > @@ -29,43 +29,143 @@ > > * On some hardware, this is more efficient. > > * > > * This currently only does the conversion for built-in matrices which > > - * already have transposed equivalents. Namely, > > gl_ModelViewProjectionMatrix > > - * and gl_TextureMatrix. > > + * already have transposed equivalents. > > */ > > #include "ir.h" > > #include "ir_optimization.h" > > #include "main/macros.h" > > +#include "program/hash_table.h" > > > > namespace { > > class matrix_flipper : public ir_hierarchical_visitor { > > public: > > + struct matrix_and_transpose { > > + ir_variable *matrix; > > + ir_variable *transpose_matrix; > > + }; > > + > > matrix_flipper(exec_list *instructions) > > { > > + this->mem_ctx = ralloc_context(NULL); > >progress = false; > > - mvp_transpose = NULL; > > - texmat_transpose = NULL; > > + > > + /* Build a hash table of built-in matrices and their transposes. > > + * > > + * The key for the entries in the hash table is the non-transpose > > matrix > > + * name. 
This assumes that all built-in transpose matrices have the > > + * "Transpose" suffix. > > + */ > > + ht = hash_table_ctor(0, hash_table_string_hash, > > + hash_table_string_compare); > > > >foreach_in_list(ir_instruction, ir, instructions) { > > ir_variable *var = ir->as_variable(); > > + > > if (!var) > > continue; > > - if (strcmp(var->name, "gl_ModelViewProjectionMatrixTranspose") == > > 0) > > -mvp_transpose = var; > > - if (strcmp(var->name, "gl_TextureMatrixTranspose") == 0) > > -texmat_transpose = var; > > + > > + /* Must be a matrix or array of matrices. */ > > + if (!var->type->is_matrix() && > > + !(var->type->is_array() && > > var->type->fields.array->is_matrix())) > > > This can now be simplified to > > if(!var->type->without_array()->is_matrix()) Oh, nice! I have changed the patch to do this instead. Thanks! > > > +continue; > > + > > + /* Must be a built-in */ > > + if (!is_gl_identifier(var->name)) > > +continue; > > + > > + /* Create a new entry for this matrix if we don't have one yet */ > > + bool new_entry = false; > > + struct matrix_and_transpose *entry = > > +(struct matrix_and_transpose *) hash_table_find(ht, var->name); > > + if (!entry) { > > +new_entry = true; > > +entry = new struct matrix_and_transpose(); > > +entry->matrix = NULL; > > +entry->transpose_matrix = NULL; > > + } > > + > > + const char *transpose_ptr = strstr(var->name, "Transpose"); > > + if (transpose_ptr == NULL) { > > +entry->matrix = var; > > + } else { > > +/* We should not be adding transpose built-in matrices that do > > + * not end in 'Transpose'. 
> > + */ > > +assert(transpose_ptr[9] == 0); > > +entry->transpose_matrix = var; > > + } > > + > > + if (new_entry) { > > +char *entry_key; > > +if (transpose_ptr == NULL) { > > + entry_key = (char *) var->name; > > +} else { > > + entry_key = ralloc_strndup(this->mem_ctx, var->name, > > + transpose_ptr - var->name); > > +} > > +hash_table_insert(ht, entry, entry_key); > > + } > >} > > } > > > > + ~matrix_flipper() > > + { > > + hash_table_dtor(ht); > > + ralloc_free(this->mem_ctx); > > + } > > + > > ir_visitor_status visit_enter(ir_expression *ir); > > > > bool progress; > > > > private: > > - ir_variable *mvp_transpose; > > - ir_variable *texmat_transpose; > > + void transform_operands(ir_expression *ir, > > +
Re: [Mesa-dev] [PATCH v4] glsl: Expand matrix flip optimization pass to cover more cases.
On Fri, 2014-09-19 at 12:52 +0200, Iago Toral Quiroga wrote: > Also, as suggested by Ian Romanick, make it so we don't need a bunch of > individual handles to flippable matrices, instead we register > matrix/transpose_matrix pairs in a hash table for all built-in matrices > using the non-transpose matrix name as key. > --- > src/glsl/opt_flip_matrices.cpp | 159 > +++-- > 1 file changed, 121 insertions(+), 38 deletions(-) > > I think this never got the reviewed-by... This is a rebased version of the v3 > patch that also fixes a silly mistake that I had introduced in that version. > No piglit regressions observed on SandyBridge. > > Ian, do you think this version is good? > > diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp > index 04c6170..bb449d6 100644 > --- a/src/glsl/opt_flip_matrices.cpp > +++ b/src/glsl/opt_flip_matrices.cpp > @@ -29,43 +29,143 @@ > * On some hardware, this is more efficient. > * > * This currently only does the conversion for built-in matrices which > - * already have transposed equivalents. Namely, gl_ModelViewProjectionMatrix > - * and gl_TextureMatrix. > + * already have transposed equivalents. > */ > #include "ir.h" > #include "ir_optimization.h" > #include "main/macros.h" > +#include "program/hash_table.h" > > namespace { > class matrix_flipper : public ir_hierarchical_visitor { > public: > + struct matrix_and_transpose { > + ir_variable *matrix; > + ir_variable *transpose_matrix; > + }; > + > matrix_flipper(exec_list *instructions) > { > + this->mem_ctx = ralloc_context(NULL); >progress = false; > - mvp_transpose = NULL; > - texmat_transpose = NULL; > + > + /* Build a hash table of built-in matrices and their transposes. > + * > + * The key for the entries in the hash table is the non-transpose > matrix > + * name. This assumes that all built-in transpose matrices have the > + * "Transpose" suffix. 
> + */ > + ht = hash_table_ctor(0, hash_table_string_hash, > + hash_table_string_compare); > >foreach_in_list(ir_instruction, ir, instructions) { > ir_variable *var = ir->as_variable(); > + > if (!var) > continue; > - if (strcmp(var->name, "gl_ModelViewProjectionMatrixTranspose") == 0) > -mvp_transpose = var; > - if (strcmp(var->name, "gl_TextureMatrixTranspose") == 0) > -texmat_transpose = var; > + > + /* Must be a matrix or array of matrices. */ > + if (!var->type->is_matrix() && > + !(var->type->is_array() && > var->type->fields.array->is_matrix())) This can now be simplified to if(!var->type->without_array()->is_matrix()) > +continue; > + > + /* Must be a built-in */ > + if (!is_gl_identifier(var->name)) > +continue; > + > + /* Create a new entry for this matrix if we don't have one yet */ > + bool new_entry = false; > + struct matrix_and_transpose *entry = > +(struct matrix_and_transpose *) hash_table_find(ht, var->name); > + if (!entry) { > +new_entry = true; > +entry = new struct matrix_and_transpose(); > +entry->matrix = NULL; > +entry->transpose_matrix = NULL; > + } > + > + const char *transpose_ptr = strstr(var->name, "Transpose"); > + if (transpose_ptr == NULL) { > +entry->matrix = var; > + } else { > +/* We should not be adding transpose built-in matrices that do > + * not end in 'Transpose'. 
> + */ > +assert(transpose_ptr[9] == 0); > +entry->transpose_matrix = var; > + } > + > + if (new_entry) { > +char *entry_key; > +if (transpose_ptr == NULL) { > + entry_key = (char *) var->name; > +} else { > + entry_key = ralloc_strndup(this->mem_ctx, var->name, > + transpose_ptr - var->name); > +} > +hash_table_insert(ht, entry, entry_key); > + } >} > } > > + ~matrix_flipper() > + { > + hash_table_dtor(ht); > + ralloc_free(this->mem_ctx); > + } > + > ir_visitor_status visit_enter(ir_expression *ir); > > bool progress; > > private: > - ir_variable *mvp_transpose; > - ir_variable *texmat_transpose; > + void transform_operands(ir_expression *ir, > + ir_variable *mat_var, > + ir_variable *mat_transpose); > + void transform_operands_array_of_matrix(ir_expression *ir, > + ir_variable *mat_var, > + ir_variable *mat_transpose); > + struct hash_table *ht; > + void *mem_ctx; > }; > } > > +void > +matrix_flippe
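The simplification suggested in this reply can be checked with a small model. This is a sketch under assumptions: `sketch_type` is a hypothetical stand-in for `glsl_type`, array types are assumed never to be matrices themselves, and only a single array level is modelled. It shows that the two-clause test in the patch and the `without_array()` form accept the same types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for glsl_type: an array type has a non-NULL element
 * type and is assumed not to be a matrix itself. */
struct sketch_type {
   bool is_matrix;
   const struct sketch_type *element;
};

static bool
sketch_is_array(const struct sketch_type *t)
{
   return t->element != NULL;
}

/* Strip one array level, if there is one. */
static const struct sketch_type *
sketch_without_array(const struct sketch_type *t)
{
   return sketch_is_array(t) ? t->element : t;
}

/* The check as written in the patch: matrix, or array of matrices. */
static bool
sketch_check_long(const struct sketch_type *t)
{
   return t->is_matrix || (sketch_is_array(t) && t->element->is_matrix);
}

/* The simplified form proposed in the review. */
static bool
sketch_check_short(const struct sketch_type *t)
{
   return sketch_without_array(t)->is_matrix;
}
```

Both predicates agree on a matrix, an array of matrices, a scalar, and an array of scalars, which are the four cases the constructor loop has to distinguish.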
[Mesa-dev] [PATCH v4] glsl: Expand matrix flip optimization pass to cover more cases.
Also, as suggested by Ian Romanick, make it so we don't need a bunch of individual handles to flippable matrices, instead we register matrix/transpose_matrix pairs in a hash table for all built-in matrices using the non-transpose matrix name as key. --- src/glsl/opt_flip_matrices.cpp | 159 +++-- 1 file changed, 121 insertions(+), 38 deletions(-) I think this never got the reviewed-by... This is a rebased version of the v3 patch that also fixes a silly mistake that I had introduced in that version. No piglit regressions observed on SandyBridge. Ian, do you think this version is good? diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp index 04c6170..bb449d6 100644 --- a/src/glsl/opt_flip_matrices.cpp +++ b/src/glsl/opt_flip_matrices.cpp @@ -29,43 +29,143 @@ * On some hardware, this is more efficient. * * This currently only does the conversion for built-in matrices which - * already have transposed equivalents. Namely, gl_ModelViewProjectionMatrix - * and gl_TextureMatrix. + * already have transposed equivalents. */ #include "ir.h" #include "ir_optimization.h" #include "main/macros.h" +#include "program/hash_table.h" namespace { class matrix_flipper : public ir_hierarchical_visitor { public: + struct matrix_and_transpose { + ir_variable *matrix; + ir_variable *transpose_matrix; + }; + matrix_flipper(exec_list *instructions) { + this->mem_ctx = ralloc_context(NULL); progress = false; - mvp_transpose = NULL; - texmat_transpose = NULL; + + /* Build a hash table of built-in matrices and their transposes. + * + * The key for the entries in the hash table is the non-transpose matrix + * name. This assumes that all built-in transpose matrices have the + * "Transpose" suffix. 
+ */ + ht = hash_table_ctor(0, hash_table_string_hash, + hash_table_string_compare); foreach_in_list(ir_instruction, ir, instructions) { ir_variable *var = ir->as_variable(); + if (!var) continue; - if (strcmp(var->name, "gl_ModelViewProjectionMatrixTranspose") == 0) -mvp_transpose = var; - if (strcmp(var->name, "gl_TextureMatrixTranspose") == 0) -texmat_transpose = var; + + /* Must be a matrix or array of matrices. */ + if (!var->type->is_matrix() && + !(var->type->is_array() && var->type->fields.array->is_matrix())) +continue; + + /* Must be a built-in */ + if (!is_gl_identifier(var->name)) +continue; + + /* Create a new entry for this matrix if we don't have one yet */ + bool new_entry = false; + struct matrix_and_transpose *entry = +(struct matrix_and_transpose *) hash_table_find(ht, var->name); + if (!entry) { +new_entry = true; +entry = new struct matrix_and_transpose(); +entry->matrix = NULL; +entry->transpose_matrix = NULL; + } + + const char *transpose_ptr = strstr(var->name, "Transpose"); + if (transpose_ptr == NULL) { +entry->matrix = var; + } else { +/* We should not be adding transpose built-in matrices that do + * not end in 'Transpose'. 
+ */ +assert(transpose_ptr[9] == 0); +entry->transpose_matrix = var; + } + + if (new_entry) { +char *entry_key; +if (transpose_ptr == NULL) { + entry_key = (char *) var->name; +} else { + entry_key = ralloc_strndup(this->mem_ctx, var->name, + transpose_ptr - var->name); +} +hash_table_insert(ht, entry, entry_key); + } } } + ~matrix_flipper() + { + hash_table_dtor(ht); + ralloc_free(this->mem_ctx); + } + ir_visitor_status visit_enter(ir_expression *ir); bool progress; private: - ir_variable *mvp_transpose; - ir_variable *texmat_transpose; + void transform_operands(ir_expression *ir, + ir_variable *mat_var, + ir_variable *mat_transpose); + void transform_operands_array_of_matrix(ir_expression *ir, + ir_variable *mat_var, + ir_variable *mat_transpose); + struct hash_table *ht; + void *mem_ctx; }; } +void +matrix_flipper::transform_operands(ir_expression *ir, + ir_variable *mat_var, + ir_variable *mat_transpose) +{ +#ifndef NDEBUG + ir_dereference_variable *deref = ir->operands[0]->as_dereference_variable(); + assert(deref && deref->var == mat_var); +#endif + + void *mem_ctx = ralloc_parent(ir); + ir->operands[0] = ir->operands[1]; + ir->operands[1] = new
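The keying scheme described in the commit message (one hash table entry per matrix, keyed by the non-transpose name) boils down to stripping a trailing "Transpose" from the variable name. A minimal sketch of that step, assuming a hypothetical helper `sketch_matrix_key_len()` that is not part of the patch and that tests for a true suffix rather than using `strstr()` plus an assert as the patch does:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the hash key for a built-in matrix name: the name
 * itself, minus a trailing "Transpose" if present. A matrix and its
 * transpose therefore map to the same key. */
static size_t
sketch_matrix_key_len(const char *name)
{
   static const char suffix[] = "Transpose";
   const size_t suffix_len = sizeof(suffix) - 1;
   const size_t name_len = strlen(name);

   if (name_len >= suffix_len &&
       strcmp(name + name_len - suffix_len, suffix) == 0)
      return name_len - suffix_len;

   return name_len;
}
```

A caller would copy that many bytes of the name to build the key, much as the patch does with `ralloc_strndup()`.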
[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps
https://bugs.freedesktop.org/show_bug.cgi?id=84089 --- Comment #3 from Christian König --- (In reply to comment #2) > I use llvm-svn|git from yesterday, and 6-7 days ago Valley worked fine. Ah! Then some change in LLVM broke register spilling; please bisect LLVM to figure out what it was. Thanks, Christian.
[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps
https://bugs.freedesktop.org/show_bug.cgi?id=84089 --- Comment #2 from Iaroslav Andrusyak --- I use llvm-svn|git from yesterday, and 6-7 days ago Valley worked fine.
[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps
https://bugs.freedesktop.org/show_bug.cgi?id=84089 --- Comment #1 from Christian König --- (In reply to comment #0) > I changed num_sgprs <= 104 to num_sgprs <= 204 and add fprint for num_sgprs > and user_sgprs ??? 104 is a hardware limit, you can't change it. You probably just need to use a newer LLVM version which supports SGPR spilling.
[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps
https://bugs.freedesktop.org/show_bug.cgi?id=84089

Iaroslav Andrusyak changed:

    What     | Removed                         | Added
    ---------+---------------------------------+--------------------------------
    Assignee | dri-devel@lists.freedesktop.org | mesa-dev@lists.freedesktop.org
Re: [Mesa-dev] [PATCH 28/37] i965/gen6/gs: implement transform feedback support in gen6_gs_visitor
On vie, 2014-09-19 at 00:26 -0700, Jordan Justen wrote: > On Thu, Sep 18, 2014 at 11:50 PM, Samuel Iglesias Gonsálvez > wrote: > > On Thu, 2014-09-18 at 16:05 -0700, Jordan Justen wrote: > >> On Thu, Aug 14, 2014 at 4:12 AM, Iago Toral Quiroga > >> wrote: > >> > From: Samuel Iglesias Gonsalvez > >> > > >> > + this->xfb_output = src_reg(this, > >> > + glsl_type::uint_type, > >> > + linked_xfb_info->NumOutputs * > >> > + c->gp->program.VerticesOut); > >> > + this->xfb_output_offset = src_reg(this, glsl_type::uint_type); > >> > + emit(MOV(dst_reg(this->xfb_output_offset), src_reg(0u))); > >> > + /* Create a virtual register to hold destination indices in SOL */ > >> > + this->destination_indices = src_reg(this, glsl_type::uvec4_type); > >> > + /* Create a virtual register to hold temporal values in SOL */ > >> > + this->sol_temp = src_reg(this, glsl_type::uvec4_type); > >> > >> What is the duration of liveness for sol_temp? > >> > >> Would it be better to generate a new temp in each function to help out > >> register allocation? > >> > > > > Yes, it is better. I have made this change: create a new temp virtual > > register in every place it is needed (emit_thread_end(), xfb_write(), > > xfb_program()). > > Cool. Add Reviewed-by: Jordan Justen for > this patch, and: > i965/gen6/gs: Avoid buffering transform feedback varyings twice. > i965/gen6/gs: Fix binding table clash between TF surfaces and textures. > i965/gen6/gs: Enable transform feedback support in geometry shaders > i965/gen6/gs: upload ubo and pull constants surfaces. > i965/gen6/gs: Use a specific implementation of geometry shaders for gen6. > i965/gen6: enable GLSL 1.50 and OpenGL 3.2 > > That is the rest of the series, right? Yes. > Thank you both for all the great work on this series! Great, thanks for taking the time to review all the patches! I'll push them later today. Iago ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 28/37] i965/gen6/gs: implement transform feedback support in gen6_gs_visitor
On Thu, Sep 18, 2014 at 11:50 PM, Samuel Iglesias Gonsálvez wrote: > On Thu, 2014-09-18 at 16:05 -0700, Jordan Justen wrote: >> On Thu, Aug 14, 2014 at 4:12 AM, Iago Toral Quiroga >> wrote: >> > From: Samuel Iglesias Gonsalvez >> > >> > + this->xfb_output = src_reg(this, >> > + glsl_type::uint_type, >> > + linked_xfb_info->NumOutputs * >> > + c->gp->program.VerticesOut); >> > + this->xfb_output_offset = src_reg(this, glsl_type::uint_type); >> > + emit(MOV(dst_reg(this->xfb_output_offset), src_reg(0u))); >> > + /* Create a virtual register to hold destination indices in SOL */ >> > + this->destination_indices = src_reg(this, glsl_type::uvec4_type); >> > + /* Create a virtual register to hold temporal values in SOL */ >> > + this->sol_temp = src_reg(this, glsl_type::uvec4_type); >> >> What is the duration of liveness for sol_temp? >> >> Would it be better to generate a new temp in each function to help out >> register allocation? >> > > Yes, it is better. I have made this change: create a new temp virtual > register in every place it is needed (emit_thread_end(), xfb_write(), > xfb_program()). Cool. Add Reviewed-by: Jordan Justen for this patch, and: i965/gen6/gs: Avoid buffering transform feedback varyings twice. i965/gen6/gs: Fix binding table clash between TF surfaces and textures. i965/gen6/gs: Enable transform feedback support in geometry shaders i965/gen6/gs: upload ubo and pull constants surfaces. i965/gen6/gs: Use a specific implementation of geometry shaders for gen6. i965/gen6: enable GLSL 1.50 and OpenGL 3.2 That is the rest of the series, right? Thank you both for all the great work on this series! -Jordan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] i965/gen6/gs: implement transform feedback support in gen6_gs_visitor
From: Samuel Iglesias Gonsalvez This takes care of generating code required to handle transform feedback. Notice that transform feedback isn't enabled yet, since that requires additional setups in other parts of the code that will come in later patches. Signed-off-by: Samuel Iglesias Gonsalvez Acked-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_context.h | 113 ++ src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 309 +- src/mesa/drivers/dri/i965/gen6_gs_visitor.h | 13 ++ 3 files changed, 390 insertions(+), 45 deletions(-) V2: Allocate sol_temp in each function that requires the temporary register rather than allocating it once during the prolog. diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 9e04d81..3bdc480 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -556,48 +556,6 @@ struct brw_vs_prog_data { bool uses_instanceid; }; - -/* Note: brw_gs_prog_data_compare() must be updated when adding fields to - * this struct! - */ -struct brw_gs_prog_data -{ - struct brw_vec4_prog_data base; - - /** -* Size of an output vertex, measured in HWORDS (32 bytes). -*/ - unsigned output_vertex_size_hwords; - - unsigned output_topology; - - /** -* Size of the control data (cut bits or StreamID bits), in hwords (32 -* bytes). 0 if there is no control data. -*/ - unsigned control_data_header_size_hwords; - - /** -* Format of the control data (either GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID -* if the control data is StreamID bits, or -* GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT if the control data is cut bits). -* Ignored if control_data_header_size is 0. 
-*/ - unsigned control_data_format; - - bool include_primitive_id; - - int invocations; - - /** -* Dispatch mode, can be any of: -* GEN7_GS_DISPATCH_MODE_DUAL_OBJECT -* GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE -* GEN7_GS_DISPATCH_MODE_SINGLE -*/ - int dispatch_mode; -}; - /** Number of texture sampler units */ #define BRW_MAX_TEX_UNIT 32 @@ -644,6 +602,77 @@ struct brw_gs_prog_data #define SURF_INDEX_GEN6_SOL_BINDING(t) (t) #define BRW_MAX_GEN6_GS_SURFACES SURF_INDEX_GEN6_SOL_BINDING(BRW_MAX_SOL_BINDINGS) +/* Note: brw_gs_prog_data_compare() must be updated when adding fields to + * this struct! + */ +struct brw_gs_prog_data +{ + struct brw_vec4_prog_data base; + + /** +* Size of an output vertex, measured in HWORDS (32 bytes). +*/ + unsigned output_vertex_size_hwords; + + unsigned output_topology; + + /** +* Size of the control data (cut bits or StreamID bits), in hwords (32 +* bytes). 0 if there is no control data. +*/ + unsigned control_data_header_size_hwords; + + /** +* Format of the control data (either GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID +* if the control data is StreamID bits, or +* GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT if the control data is cut bits). +* Ignored if control_data_header_size is 0. +*/ + unsigned control_data_format; + + bool include_primitive_id; + + int invocations; + + /** +* Dispatch mode, can be any of: +* GEN7_GS_DISPATCH_MODE_DUAL_OBJECT +* GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE +* GEN7_GS_DISPATCH_MODE_SINGLE +*/ + int dispatch_mode; + + /** +* Gen6 transform feedback enabled flag. +*/ + bool gen6_xfb_enabled; + + /** +* Gen6: Provoking vertex convention for odd-numbered triangles +* in tristrips. +*/ + GLuint pv_first:1; + + /** +* Gen6: Number of varyings that are output to transform feedback. +*/ + GLuint num_transform_feedback_bindings:7; /* 0-BRW_MAX_SOL_BINDINGS */ + + /** +* Gen6: Map from the index of a transform feedback binding table entry to the +* gl_varying_slot that should be streamed out through that binding table +* entry. 
+*/ + unsigned char transform_feedback_bindings[BRW_MAX_SOL_BINDINGS]; + + /** +* Gen6: Map from the index of a transform feedback binding table entry to the +* swizzles that should be used when streaming out data through that +* binding table entry. +*/ + unsigned char transform_feedback_swizzles[BRW_MAX_SOL_BINDINGS]; +}; + /** * Stride in bytes between shader_time entries. * diff --git a/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp b/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp index 7a832ca..c9e8e66 100644 --- a/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp @@ -97,6 +97,43 @@ gen6_gs_visitor::emit_prolog() this->prim_count = src_reg(this, glsl_type::uint_type); emit(MOV(dst_reg(this->prim_count), 0u)); + if (c->prog_data.gen6_xfb_enabled) { + const struct gl_transform_feedback_info *linked_xfb_info = + &this->shader_prog->LinkedTransformFeedback; + + /* Gen6 geometry shaders are required to ask for