Re: [Mesa-dev] [PATCH] glsl: Eliminate unused built-in variables after compilation

2014-05-31 Thread Kenneth Graunke
On Thursday, May 29, 2014 02:37:34 PM Ian Romanick wrote:
> From: Ian Romanick 
> 
> After compilation (and before linking) we can eliminate quite a few
> built-in variables.  Basically, any uniform or constant (e.g.,
> gl_MaxVertexTextureImageUnits) that isn't used (with one exception) can
> be eliminated.  System values, vertex shader inputs (with one
> exception), and fragment shader outputs that are not used and not
> re-declared in the shader text can also be removed.
> 
> gl_ModelViewProjectMatrix and gl_Vertex are used by the built-in
> function ftransform.  There are some complications with eliminating
> these variables (see the comment in the patch), so they are not
> eliminated.
> 
> Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
> by 3.5MB on 64-bit.

Except that it doesn't.  Your patch removes these variables from the IR, which 
means that your memory counting trick stops accounting for them.  But they're 
allocated using the symbol table as the memory context, which lives as long as 
the gl_shader object.  So they're never actually *freed*.

Massif on an apitrace of Portal shows no difference in memory usage.

That said, perhaps this can be part of the solution...
 
> Before: IR MEM: variable usage / name / total: 5327760 894914 6222674
> After:  IR MEM: variable usage / name / total: 2156568 318192 2474760
> 
> Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
> by 2.8MB on 32-bit.
> 
> Before: IR MEM: variable usage / name / total: 4118280 644100 4762380
> After:  IR MEM: variable usage / name / total: 1473408 256871 1730279
> 
> v2: Don't remove any built-in with Transpose in the name.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS

2014-05-31 Thread Bruno Jimenez
On Fri, 2014-05-30 at 19:33 -0400, Alex Deucher wrote:
> On Fri, May 30, 2014 at 11:31 AM, Bruno Jiménez  wrote:
> > The data has been extracted from:
> > AMD Accelerated Parallel Processing OpenCL Programming Guide (rev 2.7)
> > Appendix D: Device Parameters
> 
> You should add a query for the number of compute units to the
> RADEON_INFO ioctl and then just ask the kernel how many CUs/SIMDs the
> hw has.  This will properly handle all boards (harvest, etc.) since we
> can read the actual number of CUs off the GPU.
> 
> Alex

Hi,

At first I tried to do so (as for the maximum clock frequency), but I
couldn't find how to query that value, nor many docs about what I could
ask the kernel for.

I think I have found now the appropiate docs, and I will try again to
query the kernel later.

Sorry for any inconvenience.
Bruno

> 
> > ---
> >  src/gallium/drivers/radeon/r600_pipe_common.c | 90 
> > +++
> >  1 file changed, 90 insertions(+)
> >
> > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> > b/src/gallium/drivers/radeon/r600_pipe_common.c
> > index 70c4d1a..c4abacd 100644
> > --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> > +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> > @@ -422,6 +422,89 @@ const char *r600_get_llvm_processor_name(enum 
> > radeon_family family)
> > }
> >  }
> >
> > +static uint32_t radeon_max_compute_units(enum radeon_family family)
> > +{
> > +   switch (family) {
> > +   case CHIP_CEDAR:
> > +   return 2;
> > +
> > +   /* Redwood PRO2: 4
> > +* Redwood PRO:  5
> > +* Redwood XT:   5 */
> > +   case CHIP_REDWOOD:
> > +   return 4;
> > +
> > +   /* Juniper LE:  9
> > +* Juniper XT: 10 */
> > +   case CHIP_JUNIPER:
> > +   return 9;
> > +
> > +   /* Cypress LE:  14
> > +* Cypress PRO: 18
> > +* Cypress XT:  20 */
> > +   case CHIP_CYPRESS:
> > +   return 14;
> > +
> > +   case CHIP_HEMLOCK:
> > +   return 40;
> > +
> > +   /* XXX: is Zacate really equal to Ontario?
> > +* Zacate E-350: 2
> > +* Zacate E-240: 2
> > +* Ontario C-50: 2
> > +* Ontario C-30: 2 */
> > +   case CHIP_PALM:
> > +   return 2;
> > +
> > +   /* Caicos:  2
> > +* Seymour LP:  2
> > +* Seymour PRO: 2
> > +* Seymour XT:  2
> > +* Seymour XTX: 2 */
> > +   case CHIP_CAICOS:
> > +   return 2;
> > +
> > +   /* Turks PRO:6
> > +* Turks XT: 6
> > +* Whistler LP:  6
> > +* Whistler PRO: 6
> > +* Whistler XT:  6 */
> > +   case CHIP_TURKS:
> > +   return 6;
> > +
> > +   /* Barts LE:  10
> > +* Barts PRO: 12
> > +* Barts XT:  14
> > +* Blackcomb PRO: 12 */
> > +   case CHIP_BARTS:
> > +   return 10;
> > +
> > +   /* Cayman PRO: 22
> > +* Cayman XT:  24
> > +* Cayman Gemini: 48 */
> > +   case CHIP_CAYMAN:
> > +   return 22;
> > +
> > +   /* Verde PRO:  8
> > +* Verde XT:  10 */
> > +   case CHIP_VERDE:
> > +   return 8;
> > +
> > +   /* Pitcairn PRO: 16
> > +* Pitcairn XT:  20 */
> > +   case CHIP_PITCAIRN:
> > +   return 16;
> > +
> > +   /* Tahiti PRO: 28
> > +* Tahiti XT:  32 */
> > +   case CHIP_TAHITI:
> > +   return 28;
> > +
> > +   default:
> > +   return 1;
> > +   }
> > +}
> > +
> >  static int r600_get_compute_param(struct pipe_screen *screen,
> >  enum pipe_compute_cap param,
> >  void *ret)
> > @@ -519,6 +602,13 @@ static int r600_get_compute_param(struct pipe_screen 
> > *screen,
> > }
> > return sizeof(uint32_t);
> >
> > +   case PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS:
> > +   if (ret) {
> > +   uint32_t *max_compute_units = ret;
> > +   *max_compute_units = 
> > radeon_max_compute_units(rscreen->family);
> > +   }
> > +   return sizeof(uint32_t);
> > +
> > default:
> > fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param);
> > return 0;
> > --
> > 1.9.3
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: Validate that built-in uniforms have backing state

2014-05-31 Thread Kenneth Graunke
On Wednesday, May 28, 2014 06:35:47 PM Ian Romanick wrote:
> From: Ian Romanick 
> 
> All built-in uniforms are supposed to be backed by some GL state.  The
> state_slots field describes this backing state.
> 
> This helped me track down a bug in a later patch.
> 
> Signed-off-by: Ian Romanick 
> ---
>  src/glsl/ir_validate.cpp | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
> index 2fca437..8f50980 100644
> --- a/src/glsl/ir_validate.cpp
> +++ b/src/glsl/ir_validate.cpp
> @@ -685,6 +685,14 @@ ir_validate::visit(ir_variable *ir)
>abort();
> }
> 
> +   if (ir->data.mode == ir_var_uniform
> +   && strncmp(ir->name, "gl_", 3) == 0
> +   && ir->get_state_slots() == NULL) {
> +  printf("built-in uniform has no state\n");
> +  ir->print();
> +  abort();
> +   }
> +
> return visit_continue;
>  }

This patch is:
Reviewed-by: Kenneth Graunke 

(It's always good to have more validation, and I verified that this condition 
ought to hold in all cases, AFAICS.) 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79469] Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79469

David Heidelberger (okias)  changed:

   What|Removed |Added

 CC||david.heidelber...@ixit.cz

--- Comment #3 from David Heidelberger (okias)  ---
Confirmed on r600g.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Bug 79039 depends on bug 79294, which changed state.

Bug 79294 Summary: Xlib-based build broken on non x86/x86-64 architectures
https://bugs.freedesktop.org/show_bug.cgi?id=79294

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79469] Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79469

Matt Turner  changed:

   What|Removed |Added

 CC||a...@nwnk.net,
   ||gli...@freedesktop.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79421] [llvmpipe] SIGSEGV src/gallium/drivers/llvmpipe/lp_rast_priv.h:218

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79421

Roland Scheidegger  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79421] [llvmpipe] SIGSEGV src/gallium/drivers/llvmpipe/lp_rast_priv.h:218

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79421

--- Comment #2 from Roland Scheidegger  ---
I still have no idea why this could happen with the fb size used by this piglit
test but fixed with 576868140bbb1abd177e7fd122720883d773137e.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Free the compiled shader IR after it has been linked.

2014-05-31 Thread Eric Anholt
Ian Romanick  writes:

> On 05/28/2014 01:57 PM, Eric Anholt wrote:
>> If the shader compiled once, then we can compile it again.  Compiled
>> shaders almost always get used in just one program, so holding that
>> compiled IR until the program is freed is just a waste of memory.
>
> Would this work with some madness like:
>
>glAttachShader(prog, sh1);
>glAttachShader(prog, sh2);
>glLinkProgram(prog);
>
>GLchar *empty = "";
>glShaderSource(sh1, 1, &empty, NULL);

When we get the shadersource call here on a previously linked-and-freed
shader, the lazy recompile call present in shadersource ensures that we
have the right IR.

>glBindAttribLocation(prog, 0, "foo");
>glLinkProgram(prog);

> There are some things like this that I think would work very nicely with
> glCreateShaderProgramv, but it may be difficult to get all the corner
> cases right otherwise.
>
>> On the other hand, if they are either reusing shader objects to compile
>> multiple times, or linking the same shader into multiple programs, we turn
>> off this memory savings hack to avoid spending CPU on recompiling.
>> 
>> Reduces peak memory allocation of glretrace of a trace of dolphin-emu by
>> 5.5MB.  It seems like this should be a big deal to DOTA2, but it was
>> triggering RecompiledAnyShader, and I failed to see a benefit even if I
>> removed the RecompiledAnyShader check (which confuses me).
>
> I think they use the same vertex shader with multiple fragment
> shaders... but it does seem weird that disabling the check didn't change
> the memory usage.  Did it change anything (bad rendering)?

Not that I noticed, not that I was paying close attention when it took
so long to run.



pgppc9J_JznQI.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] i965/fs: Debug the optimization passes by dumping instr to file.

2014-05-31 Thread Kenneth Graunke
On Friday, May 30, 2014 07:35:22 PM Matt Turner wrote:
> With INTEL_DEBUG=optimizer, write the output of dump_instructions() to a
> file each time an optimization pass makes progress. This lets you easily
> diff successive files to see what an optimization pass did.
> 
> Example filenames written when running glxgears:
>fs8-00-00-start
>fs8-00-01-04-opt_copy_propagate
>fs8-00-01-06-dead_code_eliminate
>fs8-00-01-12-compute_to_mrf
>fs8-00-02-06-dead_code_eliminate
>|   |  |   |
>|   |  |   `-- optimization pass name
>|   |  |
>|   |  `-- optimization pass number in the loop
>|   |
>|   `-- optimization loop interation
>|
>`-- shader program number
> 
> Note that with INTEL_DEBUG=optimizer, we disable compact_virtual_grfs,
> so that we can diff instruction lists across loop interations without
> the register numbers being changes.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 53 
> 
>  1 file changed, 41 insertions(+), 12 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index c9b31fe..0d56ac7 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1714,6 +1714,9 @@ fs_visitor::split_virtual_grfs()
>  void
>  fs_visitor::compact_virtual_grfs()
>  {
> +   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER))
> +  return;
> +
> /* Mark which virtual GRFs are used, and count how many. */
> int remap_table[this->virtual_grf_count];
> memset(remap_table, -1, sizeof(remap_table));
> @@ -3020,24 +3023,50 @@ fs_visitor::run()
>  
>opt_drop_redundant_mov_to_flags();
>  
> +#define OPT(pass, args...) do {\
> +  pass_num++;  \
> +  bool this_progress = pass(args); \
> +   \
> +  if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) {  \
> + char filename[64];\
> + snprintf(filename, 64, "fs%d-%02d-%02d-%02d-" #pass,  \

One thought...could we widen the shader number to:
"fs%d-%04d-%02d-%02-d"
Some programs have a lot of shaders (though admittedly you're probably not
using INTEL_DEBUG=optimizer on a whole game).

Either way's fine.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] i965/fs: Debug the optimization passes by dumping instr to file.

2014-05-31 Thread Matt Turner
On Sat, May 31, 2014 at 5:33 PM, Kenneth Graunke  wrote:
> On Friday, May 30, 2014 07:35:22 PM Matt Turner wrote:
>> With INTEL_DEBUG=optimizer, write the output of dump_instructions() to a
>> file each time an optimization pass makes progress. This lets you easily
>> diff successive files to see what an optimization pass did.
>>
>> Example filenames written when running glxgears:
>>fs8-00-00-start
>>fs8-00-01-04-opt_copy_propagate
>>fs8-00-01-06-dead_code_eliminate
>>fs8-00-01-12-compute_to_mrf
>>fs8-00-02-06-dead_code_eliminate
>>|   |  |   |
>>|   |  |   `-- optimization pass name
>>|   |  |
>>|   |  `-- optimization pass number in the loop
>>|   |
>>|   `-- optimization loop interation
>>|
>>`-- shader program number
>>
>> Note that with INTEL_DEBUG=optimizer, we disable compact_virtual_grfs,
>> so that we can diff instruction lists across loop interations without
>> the register numbers being changes.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 53 
>> 
>>  1 file changed, 41 insertions(+), 12 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index c9b31fe..0d56ac7 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -1714,6 +1714,9 @@ fs_visitor::split_virtual_grfs()
>>  void
>>  fs_visitor::compact_virtual_grfs()
>>  {
>> +   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER))
>> +  return;
>> +
>> /* Mark which virtual GRFs are used, and count how many. */
>> int remap_table[this->virtual_grf_count];
>> memset(remap_table, -1, sizeof(remap_table));
>> @@ -3020,24 +3023,50 @@ fs_visitor::run()
>>
>>opt_drop_redundant_mov_to_flags();
>>
>> +#define OPT(pass, args...) do {\
>> +  pass_num++;  \
>> +  bool this_progress = pass(args); \
>> +   \
>> +  if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) {  \
>> + char filename[64];\
>> + snprintf(filename, 64, "fs%d-%02d-%02d-%02d-" #pass,  \
>
> One thought...could we widen the shader number to:
> "fs%d-%04d-%02d-%02-d"
> Some programs have a lot of shaders (though admittedly you're probably not
> using INTEL_DEBUG=optimizer on a whole game).
>
> Either way's fine.

Sure thing. That's not a bad idea.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] i965: Skip IR annotations with INTEL_DEBUG=noann.

2014-05-31 Thread Kenneth Graunke
On Friday, May 30, 2014 07:35:23 PM Matt Turner wrote:
> Running shader-db with INTEL_DEBUG=noann reduces the runtime
> from ~90 to ~80 seconds on my machine. It also reduces the disk space
> consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on
> disk).
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 6 --
>  src/mesa/drivers/dri/i965/intel_debug.c  | 1 +
>  src/mesa/drivers/dri/i965/intel_debug.h  | 1 +
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 69eab59..1ab5f07 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -794,8 +794,10 @@ void annotate(struct brw_context *brw,
>  
> struct annotation *ann = &annotation->ann[annotation->ann_count++];
> ann->offset = offset;
> -   ann->ir = inst->ir;
> -   ann->annotation = inst->annotation;
> +   if ((INTEL_DEBUG & DEBUG_NO_ANNOTATION) != 0) {

This does the opposite of what you said.  With this patch,
annotations are off by default, and INTEL_DEBUG=noann turns them back on.

I think you mean == 0.

With that fixed, this series is:
Reviewed-by: Kenneth Graunke 

Thanks for adding the geteuid checks.

> +  ann->ir = inst->ir;
> +  ann->annotation = inst->annotation;
> +   }
>  
> if (cfg->blocks[annotation->cur_block]->start == inst) {
>ann->block_start = cfg->blocks[annotation->cur_block];
> diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
> b/src/mesa/drivers/dri/i965/intel_debug.c
> index bba873b..c72fce2 100644
> --- a/src/mesa/drivers/dri/i965/intel_debug.c
> +++ b/src/mesa/drivers/dri/i965/intel_debug.c
> @@ -65,6 +65,7 @@ static const struct dri_debug_control debug_control[] = {
> { "blorp", DEBUG_BLORP },
> { "nodualobj", DEBUG_NO_DUAL_OBJECT_GS },
> { "optimizer", DEBUG_OPTIMIZER },
> +   { "noann", DEBUG_NO_ANNOTATION },
> { NULL,0 }
>  };
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
> b/src/mesa/drivers/dri/i965/intel_debug.h
> index f257054..37dc34a 100644
> --- a/src/mesa/drivers/dri/i965/intel_debug.h
> +++ b/src/mesa/drivers/dri/i965/intel_debug.h
> @@ -61,6 +61,7 @@ extern uint64_t INTEL_DEBUG;
>  #define DEBUG_VUE 0x4000
>  #define DEBUG_NO_DUAL_OBJECT_GS 0x8000
>  #define DEBUG_OPTIMIZER   0x1
> +#define DEBUG_NO_ANNOTATION 0x2
>  
>  #ifdef HAVE_ANDROID_PLATFORM
>  #define LOG_TAG "INTEL-MESA"
> 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] i965: Skip IR annotations with INTEL_DEBUG=noann.

2014-05-31 Thread Matt Turner
On Sat, May 31, 2014 at 5:52 PM, Kenneth Graunke  wrote:
> On Friday, May 30, 2014 07:35:23 PM Matt Turner wrote:
>> Running shader-db with INTEL_DEBUG=noann reduces the runtime
>> from ~90 to ~80 seconds on my machine. It also reduces the disk space
>> consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on
>> disk).
>> ---
>>  src/mesa/drivers/dri/i965/brw_shader.cpp | 6 --
>>  src/mesa/drivers/dri/i965/intel_debug.c  | 1 +
>>  src/mesa/drivers/dri/i965/intel_debug.h  | 1 +
>>  3 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
>> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index 69eab59..1ab5f07 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -794,8 +794,10 @@ void annotate(struct brw_context *brw,
>>
>> struct annotation *ann = &annotation->ann[annotation->ann_count++];
>> ann->offset = offset;
>> -   ann->ir = inst->ir;
>> -   ann->annotation = inst->annotation;
>> +   if ((INTEL_DEBUG & DEBUG_NO_ANNOTATION) != 0) {
>
> This does the opposite of what you said.  With this patch,
> annotations are off by default, and INTEL_DEBUG=noann turns them back on.
>
> I think you mean == 0.

Oh, yeah. Thanks!

> With that fixed, this series is:
> Reviewed-by: Kenneth Graunke 

Thanks for the review!

> Thanks for adding the geteuid checks.
>
>> +  ann->ir = inst->ir;
>> +  ann->annotation = inst->annotation;
>> +   }
>>
>> if (cfg->blocks[annotation->cur_block]->start == inst) {
>>ann->block_start = cfg->blocks[annotation->cur_block];
>> diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
>> b/src/mesa/drivers/dri/i965/intel_debug.c
>> index bba873b..c72fce2 100644
>> --- a/src/mesa/drivers/dri/i965/intel_debug.c
>> +++ b/src/mesa/drivers/dri/i965/intel_debug.c
>> @@ -65,6 +65,7 @@ static const struct dri_debug_control debug_control[] = {
>> { "blorp", DEBUG_BLORP },
>> { "nodualobj", DEBUG_NO_DUAL_OBJECT_GS },
>> { "optimizer", DEBUG_OPTIMIZER },
>> +   { "noann", DEBUG_NO_ANNOTATION },
>> { NULL,0 }
>>  };
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
>> b/src/mesa/drivers/dri/i965/intel_debug.h
>> index f257054..37dc34a 100644
>> --- a/src/mesa/drivers/dri/i965/intel_debug.h
>> +++ b/src/mesa/drivers/dri/i965/intel_debug.h
>> @@ -61,6 +61,7 @@ extern uint64_t INTEL_DEBUG;
>>  #define DEBUG_VUE 0x4000
>>  #define DEBUG_NO_DUAL_OBJECT_GS 0x8000
>>  #define DEBUG_OPTIMIZER   0x1
>> +#define DEBUG_NO_ANNOTATION 0x2
>>
>>  #ifdef HAVE_ANDROID_PLATFORM
>>  #define LOG_TAG "INTEL-MESA"
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 19/19] i965/fs: Optimize SEL with the same sources into a MOV.

2014-05-31 Thread Kenneth Graunke
On Tuesday, May 27, 2014 06:47:50 PM Matt Turner wrote:
> instructions in affected programs: 474 -> 462 (-2.53%)
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index c0af6d0..453683c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2021,7 +2021,12 @@ fs_visitor::opt_algebraic()
>   }
>   break;
>case BRW_OPCODE_SEL:
> - if (inst->saturate && inst->src[1].file == IMM) {
> + if (inst->src[0].equals(inst->src[1])) {
> +inst->opcode = BRW_OPCODE_MOV;
> +inst->src[1] = reg_undef;
> +inst->predicate = BRW_PREDICATE_NONE;

I'd probably add

 inst->predicate_inverse = false;

just in case.  Otherwise, this patch is:
Reviewed-by: Kenneth Graunke 

> +progress = true;
> + } else if (inst->saturate && inst->src[1].file == IMM) {
>  switch (inst->conditional_mod) {
>  case BRW_CONDITIONAL_LE:
>  case BRW_CONDITIONAL_L:
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/19] i965/fs: Perform CSE on texture operations.

2014-05-31 Thread Kenneth Graunke
On Tuesday, May 27, 2014 06:47:49 PM Matt Turner wrote:
> Helps Unigine Tropics and some (old) gstreamer shaders in shader-db.
> 
> instructions in affected programs: 792 -> 744 (-6.06%)
> ---
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index 75c6aab..6e36b8c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -92,7 +92,7 @@ is_expression(const fs_inst *const inst)
> case SHADER_OPCODE_LOAD_PAYLOAD:
>return !is_copy_payload(inst);
> default:
> -  return false;
> +  return inst->is_tex();
> }
>  }
>  
> @@ -142,6 +142,16 @@ instructions_match(fs_inst *a, fs_inst *b)
>a->conditional_mod == b->conditional_mod &&
>a->dst.type == b->dst.type &&
>a->sources == b->sources &&
> +  (a->is_tex() ? (a->texture_offset == b->texture_offset &&
> +  a->mlen == b->mlen &&
> +  a->regs_written == b->regs_written &&
> +  a->base_mrf == b->base_mrf &&
> +  a->sampler == b->sampler &&
> +  a->target == b->target &&

target is a field for FB writes - it isn't used for sampling messages.
You should probably drop it.

Otherwise, this patch is (dubiously, given that I'm working backwards):
Reviewed-by: Kenneth Graunke 

> +  a->eot == b->eot &&
> +  a->header_present == b->header_present &&
> +  a->shadow_compare == b->shadow_compare)
> +   : true) &&
>operands_match(a, b);
>  }
>  
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Free the compiled shader IR after it has been linked.

2014-05-31 Thread Ian Romanick
On 05/31/2014 02:39 PM, Eric Anholt wrote:
> Ian Romanick  writes:
> 
>> On 05/28/2014 01:57 PM, Eric Anholt wrote:
>>> If the shader compiled once, then we can compile it again.  Compiled
>>> shaders almost always get used in just one program, so holding that
>>> compiled IR until the program is freed is just a waste of memory.
>>
>> Would this work with some madness like:
>>
>>glAttachShader(prog, sh1);
>>glAttachShader(prog, sh2);
>>glLinkProgram(prog);
>>
>>GLchar *empty = "";
>>glShaderSource(sh1, 1, &empty, NULL);
> 
> When we get the shadersource call here on a previously linked-and-freed
> shader, the lazy recompile call present in shadersource ensures that we
> have the right IR.

Ah.. I did miss that on the first read.  Maybe add a comment there?
Either way,

Reviewed-by: Ian Romanick 

>>glBindAttribLocation(prog, 0, "foo");
>>glLinkProgram(prog);
> 
>> There are some things like this that I think would work very nicely with
>> glCreateShaderProgramv, but it may be difficult to get all the corner
>> cases right otherwise.
>>
>>> On the other hand, if they are either reusing shader objects to compile
>>> multiple times, or linking the same shader into multiple programs, we turn
>>> off this memory savings hack to avoid spending CPU on recompiling.
>>>
>>> Reduces peak memory allocation of glretrace of a trace of dolphin-emu by
>>> 5.5MB.  It seems like this should be a big deal to DOTA2, but it was
>>> triggering RecompiledAnyShader, and I failed to see a benefit even if I
>>> removed the RecompiledAnyShader check (which confuses me).
>>
>> I think they use the same vertex shader with multiple fragment
>> shaders... but it does seem weird that disabling the check didn't change
>> the memory usage.  Did it change anything (bad rendering)?
> 
> Not that I noticed, not that I was paying close attention when it took
> so long to run.




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79469] Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79469

Ian Romanick  changed:

   What|Removed |Added

 CC||i...@freedesktop.org

--- Comment #4 from Ian Romanick  ---
Ugh... this should also affect the 10.2 release branch.  I'm not super excited
about reverting patches, but I don't want to ship a release with Unigine broken
either.

Ideas / suggestions?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Ian Romanick  changed:

   What|Removed |Added

 Depends on||79469

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 79469] Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven

2014-05-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79469

Ian Romanick  changed:

   What|Removed |Added

 Blocks||79039

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Free the compiled shader IR after it has been linked.

2014-05-31 Thread Ian Romanick
On 05/28/2014 01:57 PM, Eric Anholt wrote:
> If the shader compiled once, then we can compile it again.  Compiled
> shaders almost always get used in just one program, so holding that
> compiled IR until the program is freed is just a waste of memory.
> 
> On the other hand, if they are either reusing shader objects to compile
> multiple times, or linking the same shader into multiple programs, we turn
> off this memory savings hack to avoid spending CPU on recompiling.
> 
> Reduces peak memory allocation of glretrace of a trace of dolphin-emu by
> 5.5MB.  It seems like this should be a big deal to DOTA2, but it was
> triggering RecompiledAnyShader, and I failed to see a benefit even if I
> removed the RecompiledAnyShader check (which confuses me).

Does dota2 DeleteShader after AttachShader / LinkProgram?  That could
explain why it doesn't help there...

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/21] Reduce ir_variable memory usage

2014-05-31 Thread Ian Romanick
On 05/30/2014 05:52 PM, Kenneth Graunke wrote:
> On Friday, May 30, 2014 11:07:03 AM Eric Anholt wrote:
>> Ian Romanick  writes:
>>> This series reduces the memory usage of ir_variable quite significantly.
>>>
>>> The first couple patches add a mechanism to determine the amount of
>>> memory used by any kind of IR object.  This is used to collect the data
>>> that is shown in the commit messages through the series.
> 
> I think that talloc can give you this type of information for basically free 
> - 
> I'd rather see an implementation of the ralloc API atop talloc, and then just 
> turn the memory debugging on.  This is quite a bit of code, and incomplete...

It could give some of the same information, but not quite all.  For
example, if I want to know how much memory is used by
ir_dereference_record, I want to know how much is used by the
ir_dereference_record objects and the ir_dereference_record::name
allocations too.  Since everything in the IR gets (eventually)
reparented to the IR list, I don't see a way to get that information
from just the talloc tracking.

You were a lot deeper in the guts of talloc, so you may well know things
about it that I do not. :)




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/19] i965/fs: Store the number of sources an fs_inst has.

2014-05-31 Thread Kenneth Graunke
On Tuesday, May 27, 2014 06:47:35 PM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +++--
>  src/mesa/drivers/dri/i965/brw_fs.h   |  3 ++-
>  2 files changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index b06966a..a9a8ac1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -52,11 +52,12 @@ extern "C" {
>  #include "glsl/glsl_types.h"
>  
>  void
> -fs_inst::init()
> +fs_inst::init(int sources)
>  {
> memset(this, 0, sizeof(*this));
>  
> -   this->src = ralloc_array(this, fs_reg, 3);
> +   this->sources = sources;
> +   this->src = ralloc_array(this, fs_reg, sources);
>  
> this->conditional_mod = BRW_CONDITIONAL_NONE;
>  
> @@ -73,19 +74,19 @@ fs_inst::init()
>  
>  fs_inst::fs_inst()
>  {
> -   init();
> +   init(3);

Why 3?  Seems like a waste.

> this->opcode = BRW_OPCODE_NOP;
>  }
>  
>  fs_inst::fs_inst(enum opcode opcode)
>  {
> -   init();
> +   init(3);

Ditto (0?).

> this->opcode = opcode;
>  }
>  
>  fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
>  {
> -   init();
> +   init(3);

Ditto (0?).

> this->opcode = opcode;
> this->dst = dst;
>  
> @@ -95,7 +96,7 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
>  
>  fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0)
>  {
> -   init();
> +   init(3);

Ditto (1).

> this->opcode = opcode;
> this->dst = dst;
> this->src[0] = src0;
> @@ -108,7 +109,7 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg 
src0)
>  
>  fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1)
>  {
> -   init();
> +   init(3);

Ditto (2).  Given that most expressions are unary or binary operations, this 
could actually save a bit of memory...

> this->opcode = opcode;
> this->dst = dst;
> this->src[0] = src0;
> @@ -125,7 +126,7 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg 
src0, fs_reg src1)
>  fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
>fs_reg src0, fs_reg src1, fs_reg src2)
>  {
> -   init();
> +   init(3);
> this->opcode = opcode;
> this->dst = dst;
> this->src[0] = src0;
> @@ -146,9 +147,9 @@ fs_inst::fs_inst(const fs_inst &that)
>  {
> memcpy(this, &that, sizeof(that));
>  
> -   this->src = ralloc_array(this, fs_reg, 3);
> +   this->src = ralloc_array(this, fs_reg, that.sources);
>  
> -   for (int i = 0; i < 3; i++)
> +   for (int i = 0; i < that.sources; i++)
>this->src[i] = that.src[i];
>  }
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
> index 11a5c7c..4f8a2b2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -190,7 +190,7 @@ class fs_inst : public backend_instruction {
>  public:
> DECLARE_RALLOC_CXX_OPERATORS(fs_inst)
>  
> -   void init();
> +   void init(int sources);
>  
> fs_inst();
> fs_inst(enum opcode opcode);
> @@ -216,6 +216,7 @@ public:
> uint32_t texture_offset; /**< Texture offset bitfield */
> uint32_t offset; /* spill/unspill offset */
>  
> +   uint8_t sources; /**< Number of fs_reg sources. */
> uint8_t conditional_mod; /**< BRW_CONDITIONAL_* */
>  
> /* Chooses which flag subregister (f0.0 or f0.1) is used for conditional
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/19] i965/fs: Loop from 0 to inst->sources, not 0 to 3.

2014-05-31 Thread Kenneth Graunke
On Tuesday, May 27, 2014 06:47:36 PM Matt Turner wrote:
[snip]
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> index 069b60f..a1aff21 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> @@ -353,7 +353,7 @@ fs_visitor::try_constant_propagate(fs_inst *inst, 
acp_entry *entry)
> if (entry->src.file != IMM)
>return false;
>  
> -   for (int i = 2; i >= 0; i--) {
> +   for (int i = inst->sources - 1; i >= 0; i--) {
>if (inst->src[i].file != entry->dst.file ||
>inst->src[i].reg != entry->dst.reg ||
>inst->src[i].reg_offset != entry->dst.reg_offset ||

I am glad to see that you're using a signed integer type here.  For 
BRW_OPCODE_NOP, SHADER_OPCODE_PLACEHOLDER_HALT, and so on, inst->sources could 
conceievably be 0.  (Today, you set it to 3 for some reason, but I could see 
someone coming along and "fixing" that later...)

If you had used unsigned types, this would translate into i = 4294967296; i >= 
0; i--, which would take forever.  With signed types, it works, and no one 
should get bitten by that bug.

Patches 1-8 are:
Reviewed-by: Kenneth Graunke 
(and you can ignore my comment on patch 4 - I'm happy to see that done in 
later patches - no need to block this any further on trivialities)

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/19] i965/fs: Lower LOAD_PAYLOAD and clean up.

2014-05-31 Thread Kenneth Graunke
On Tuesday, May 27, 2014 06:47:41 PM Matt Turner wrote:
> Clean up with with register_coalesce()/dead_code_eliminate().
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 42 

>  src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
>  2 files changed, 43 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 0856b6b..c0af6d0 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2574,6 +2574,43 @@ fs_visitor::lower_uniform_pull_constant_loads()
> }
>  }
>  
> +bool
> +fs_visitor::lower_load_payload()
> +{
> +   bool progress = false;
> +
> +   foreach_list_safe(node, &instructions) {
> +  fs_inst *inst = (fs_inst *)node;
> +
> +  if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
> + fs_reg dst = inst->dst;
> +
> + /* The generator creates the message header if present, which is 
in
> +  * the first register of the message payload.
> +  */
> + if (!inst->header_present) {

I don't think this will work like you want.  In your emit_texture_gen7 patch, 
when the texturing operation needs a header, you set src[0] to reg_undef 
(BAD_FILE).  But you don't set header_present on the LOAD_PAYLOAD instruction.

Here, inst is the LOAD_PAYLOAD operation...so you're checking header_present
on...not what you thought.  It sure looks like this will generate

MOV(dst, reg_undef)

which is bad.

I think you actually should just go with your original version of this patch:
http://article.gmane.org/gmane.comp.video.mesa3d.devel/76592

AFAICT, the semantics of LOAD_PAYLOAD are:

/**
 * Combines multiple sources of size 1 into a larger virtual GRF.
 * For example, parameters for a send-from-GRF message.  Or, updating
 * channels of a size 4 VGRF used to store vec4s such as texturing results.
 *
 * This will be lowered into MOVs from each source to consecutive reg_offsets
 * of the destination VGRF.
 *
 * src[0] may be BAD_FILE.  If so, the lowering pass skips emitting the MOV,
 * but still reserves the first channel of the destination VGRF.  This can be
 * used to reserve space for, say, a message header set up by the generators.
 */

I'd like a comment like that to appear above the SHADER_OPCODE_LOAD_PAYLOAD 
opcode in brw_defines.h.  Feel free to improve the wording.

> +inst->insert_before(MOV(dst, inst->src[0]));
> + } else {
> +assert(inst->src[0].file == BAD_FILE);
> + }
> + dst.reg_offset++;
> +
> + for (int i = 1; i < inst->sources; i++) {
> +inst->insert_before(MOV(dst, inst->src[i]));
> +dst.reg_offset++;
> + }
> +
> + inst->remove();
> + progress = true;
> +  }
> +   }
> +
> +   if (progress)
> +  invalidate_live_intervals();
> +
> +   return progress;
> +}
> +
>  void
>  fs_visitor::dump_instructions()
>  {
> @@ -3071,6 +3108,11 @@ fs_visitor::run()
>   progress = OPT(compute_to_mrf) || progress;
>} while (progress);
>  
> +  if (lower_load_payload()) {
> + register_coalesce();
> + dead_code_eliminate();
> +  }
> +
>lower_uniform_pull_constant_loads();
>  
>assign_curb_setup();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
> index d0e459c..2b60945 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -391,6 +391,7 @@ public:
> void fail(const char *msg, ...);
> void no16(const char *msg, ...);
> void lower_uniform_pull_constant_loads();
> +   bool lower_load_payload();
>  
> void push_force_uncompressed();
> void pop_force_uncompressed();
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/19] i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().

2014-05-31 Thread Kenneth Graunke
On Friday, April 18, 2014 11:56:47 AM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 
+++
>  1 file changed, 73 insertions(+), 62 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 2aa3acd..91bbe0a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1257,8 +1257,11 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
dst, fs_reg coordinate,
> int reg_width = dispatch_width / 8;
> bool header_present = false;
>  
> -   fs_reg payload = fs_reg(this, glsl_type::float_type);
> -   fs_reg next = payload;
> +   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, 
MAX_SAMPLER_MESSAGE_SIZE);
> +   for (int i = 0; i < MAX_SAMPLER_MESSAGE_SIZE; i++) {
> +  sources[i] = fs_reg(this, glsl_type::float_type);
> +   }
> +   int length = 0;
>  
> if (ir->op == ir_tg4 || (ir->offset && ir->op != ir_txf) || sampler >= 
16) {
>/* For general texture offsets (no txf workaround), we need a header 
to
> @@ -1272,12 +1275,13 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
dst, fs_reg coordinate,
> * need to offset the Sampler State Pointer in the header.
> */
>header_present = true;
> -  next.reg_offset++;
> +  sources[length] = reg_undef;
> +  length++;
> }
>  
> if (ir->shadow_comparitor) {
> -  emit(MOV(next, shadow_c));
> -  next.reg_offset++;
> +  emit(MOV(sources[length], shadow_c));

Again, it really seems like these MOVs shouldn't help anything.  But, 
according to your data, they do...probably helping some optimization pass.  
I'm okay with leaving them for now, but it would be interesting (and probably 
worthwhile) to understand why later.

Notably, you don't emit such MOVs in the cube map array fixup.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev