Re: [Mesa-dev] [PATCH 10/12] i965/fs: Make compact_virtual_grfs an optimization pass

2014-09-19 Thread Jason Ekstrand
On Fri, Sep 19, 2014 at 5:41 PM, Matt Turner  wrote:

> On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand 
> wrote:
> > Previously we disabled compact_virtual_grfs when dumping optimizations.
> > The idea here was to make it easier to diff the dumped shader because you
> > didn't have a sudden renaming.  However, sometimes a bug is affected by
> > compact_virtual_grfs and, when this happens, you want to keep dumping
> > instructions with compact_virtual_grfs enabled.  By turning it into an
> > optimization pass and dumping it along with the others, we retain the
> > ability to diff because you can just diff against the compact_virtual_grf
> > output.
>
> I'd like to understand the bug you encountered.
>

I really don't think you'd like that.  Those bugs are a real pain.  But
yes, I've hit this more times than I can count while working on this stuff.


>
> I'm kind of concerned that we're going to just run the optimization
> loop an extra time for every shader now, since compact_virtual_grfs is
> going to set progress = true after the last actual optimization pass
> made progress. I guess we could remove that problem by calling
> compact_virtual_grfs at the end of the loop, rather than at the
> beginning.
>

Sure, we can do something to make it not run an extra time.  I'm mostly
concerned about not just shutting it off.
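
The extra-iteration concern can be sketched with a toy fixed-point loop. This is only an illustration under stated assumptions: `ToyShader`, `real_pass`, `compact`, and `run_optimization_loop` are hypothetical names, not the i965 code, and the "real" pass is assumed to do one rewrite per call.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for an optimization pass: returns true if it changed anything.
using Pass = std::function<bool()>;

// Runs every pass in order until one full iteration reports no progress.
// Returns the number of iterations executed.
int run_optimization_loop(const std::vector<Pass> &passes)
{
   int iterations = 0;
   bool progress;
   do {
      progress = false;
      for (const Pass &pass : passes)
         progress = pass() || progress;
      ++iterations;
   } while (progress);
   return iterations;
}

// Toy shader state (an assumption for illustration only).
struct ToyShader {
   explicit ToyShader(int pending) : pending_rewrites(pending) {}

   int pending_rewrites;            // work left for the "real" pass
   bool numbering_has_gaps = false; // true when compaction would change something

   // A "real" optimization: does one rewrite and leaves gaps in the numbering.
   bool real_pass()
   {
      if (pending_rewrites == 0)
         return false;
      --pending_rewrites;
      numbering_has_gaps = true;
      return true;
   }

   // Compaction: reports progress whenever it actually renumbered.
   bool compact()
   {
      if (!numbering_has_gaps)
         return false;
      numbering_has_gaps = false;
      return true;
   }
};

// Compaction at the beginning of each iteration.
int iterations_compact_first(int pending)
{
   ToyShader s(pending);
   return run_optimization_loop({[&] { return s.compact(); },
                                 [&] { return s.real_pass(); }});
}

// Compaction at the end of each iteration.
int iterations_compact_last(int pending)
{
   ToyShader s(pending);
   return run_optimization_loop({[&] { return s.real_pass(); },
                                 [&] { return s.compact(); }});
}
```

In this toy model, running compaction first costs one more trip around the loop than running it last, because the renumbering triggered by the final real change is only noticed on the following iteration.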
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/12] i965/fs: Make null_reg_* const members of fs_visitor instead of globals

2014-09-19 Thread Jason Ekstrand
On Fri, Sep 19, 2014 at 5:37 PM, Matt Turner  wrote:

> On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand 
> wrote:
> > We also set the register width equal to the dispatch width.  Right now,
> > this is effectively a no-op since we don't do anything with it.  However,
> > it will be important once we add an actual width field to fs_reg.
>
> I don't really see the point to be honest. We just wind up calling the
> constructor >1 time.
>
> I could maybe see making them static members just to reduce their
> scope.
>

The point is to get a null register with a width of dispatch_width.  We
need that later.
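
A minimal sketch of why a per-visitor const member can carry the dispatch width while a single global cannot. `ToyReg`, its `width` field, and `ToyVisitor` are hypothetical stand-ins, not the real fs_reg or fs_visitor.

```cpp
#include <cassert>

// Hypothetical register type; `width` stands in for the field the commit
// message says will be added to fs_reg later.
struct ToyReg {
   int width;
};

// Hypothetical visitor: each instance owns a null register whose width
// matches that instance's dispatch width.
class ToyVisitor {
public:
   explicit ToyVisitor(int dispatch_width)
      : dispatch_width(dispatch_width), null_reg_f{dispatch_width}
   {
   }

   const int dispatch_width;
   const ToyReg null_reg_f;
};
```

With this shape, a SIMD8 visitor and a SIMD16 visitor can exist side by side and each sees a null register of its own width, at the cost of constructing the register once per visitor.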


Re: [Mesa-dev] [PATCH 07/12] i965/fs: Use the var_from_vgrf helper function instead of doing it manually

2014-09-19 Thread Jason Ekstrand
On Fri, Sep 19, 2014 at 5:16 PM, Matt Turner  wrote:

> On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand 
> wrote:
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +-
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> > index 697b44a..036875f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> > @@ -58,7 +58,7 @@ fs_visitor::dead_code_eliminate()
> > int var = live_intervals->var_from_reg(&inst->dst);
> > result_live = BITSET_TEST(live, var);
> >  } else {
> > -   int var = live_intervals->var_from_vgrf[inst->dst.reg];
> > +   int var = live_intervals->var_from_reg(&inst->dst);
> > for (int i = 0; i < inst->regs_written; i++) {
> >result_live = result_live || BITSET_TEST(live, var +
> i);
>
> This is wrong, isn't it? Before, we got the base var and iterated 0
> through regs_written. After, we're getting the var of the
> register+offset and then iterating.
>

No, in fact this hunk is what prompted me to make the change.  If we write
to vgrf3+2.0, then the previous version would tacitly assume that the
offset is 0 and treat it as if we were writing to vgrf3+0.0.
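
The difference between the two lookups can be sketched as follows. This assumes, as the later hunks in the patch suggest, that var_from_reg() amounts to the base variable of the vgrf plus the register offset; `ToyReg` and `ToyLiveVariables` are hypothetical simplifications, not the real classes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical register reference: a vgrf number plus an offset within it.
struct ToyReg {
   int reg;
   int reg_offset;
};

// Hypothetical liveness bookkeeping: one variable per register of each vgrf.
struct ToyLiveVariables {
   std::vector<int> var_from_vgrf; // index of the first variable of each vgrf

   explicit ToyLiveVariables(const std::vector<int> &vgrf_sizes)
   {
      int next_var = 0;
      for (int size : vgrf_sizes) {
         var_from_vgrf.push_back(next_var);
         next_var += size;
      }
   }

   // Assumed behavior of the helper: base variable plus the register offset.
   int var_from_reg(const ToyReg *r) const
   {
      return var_from_vgrf[r->reg] + r->reg_offset;
   }
};
```

For a write to vgrf3+2 with vgrf sizes {1, 2, 1, 4}, the table lookup alone yields variable 4 (the start of vgrf3), while the helper yields variable 6, which is the register actually written.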


>
> > }
> > @@ -78,19 +78,19 @@ fs_visitor::dead_code_eliminate()
> >
> >   if (inst->dst.file == GRF) {
> >  if (!inst->is_partial_write()) {
> > -   int var = live_intervals->var_from_vgrf[inst->dst.reg];
> > +   int var = live_intervals->var_from_reg(&inst->dst);
> > for (int i = 0; i < inst->regs_written; i++) {
> > -  BITSET_CLEAR(live, var + inst->dst.reg_offset + i);
> > +  BITSET_CLEAR(live, var + i);
> > }
>
> This hunk seems fine.
>
> >  }
> >   }
> >
> >   for (int i = 0; i < inst->sources; i++) {
> >  if (inst->src[i].file == GRF) {
> > -   int var =
> live_intervals->var_from_vgrf[inst->src[i].reg];
> > +   int var = live_intervals->var_from_reg(&inst->src[i]);
> >
> > for (int j = 0; j < inst->regs_read(this, i); j++) {
> > -  BITSET_SET(live, var + inst->src[i].reg_offset + j);
> > +  BITSET_SET(live, var + j);
>
> I think this is also fine.
>


[Mesa-dev] [PATCH 04/20] i965/vec4: Preserve CFG in spill_reg().

2014-09-19 Thread Matt Turner
---
This also means I'll drop 05/20.

v2: Just pass block to emit_before(), rather than trying to get rid
of emit_before().

 src/mesa/drivers/dri/i965/brw_vec4.cpp |  6 +-
 src/mesa/drivers/dri/i965/brw_vec4.h   | 13 +++--
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 11 ++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 64 +-
 4 files changed, 56 insertions(+), 38 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 6072962..c3e5c8a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -804,10 +804,12 @@ vec4_visitor::move_push_constants_to_pull_constants()
   }
}
 
+   calculate_cfg();
+
/* Now actually rewrite usage of the things we've moved to pull
 * constants.
 */
-   foreach_in_list_safe(vec4_instruction, inst, &instructions) {
+   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
   for (int i = 0 ; i < 3; i++) {
 if (inst->src[i].file != UNIFORM ||
 pull_constant_loc[inst->src[i].reg] == -1)
@@ -817,7 +819,7 @@ vec4_visitor::move_push_constants_to_pull_constants()
 
 dst_reg temp = dst_reg(this, glsl_type::vec4_type);
 
-emit_pull_constant_load(inst, temp, inst->src[i],
+emit_pull_constant_load(block, inst, temp, inst->src[i],
 pull_constant_loc[uniform]);
 
 inst->src[i].file = temp.file;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 186667c..4a264ef 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -405,7 +405,8 @@ public:
vec4_instruction *emit(enum opcode opcode, dst_reg dst,
  src_reg src0, src_reg src1, src_reg src2);
 
-   vec4_instruction *emit_before(vec4_instruction *inst,
+   vec4_instruction *emit_before(bblock_t *block,
+ vec4_instruction *inst,
 vec4_instruction *new_inst);
 
vec4_instruction *MOV(const dst_reg &dst, const src_reg &src0);
@@ -549,17 +550,17 @@ public:
void emit_untyped_surface_read(unsigned surf_index, dst_reg dst,
   src_reg offset);
 
-   src_reg get_scratch_offset(vec4_instruction *inst,
+   src_reg get_scratch_offset(bblock_t *block, vec4_instruction *inst,
  src_reg *reladdr, int reg_offset);
-   src_reg get_pull_constant_offset(vec4_instruction *inst,
+   src_reg get_pull_constant_offset(bblock_t *block, vec4_instruction *inst,
src_reg *reladdr, int reg_offset);
-   void emit_scratch_read(vec4_instruction *inst,
+   void emit_scratch_read(bblock_t *block, vec4_instruction *inst,
  dst_reg dst,
  src_reg orig_src,
  int base_offset);
-   void emit_scratch_write(vec4_instruction *inst,
+   void emit_scratch_write(bblock_t *block, vec4_instruction *inst,
   int base_offset);
-   void emit_pull_constant_load(vec4_instruction *inst,
+   void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
dst_reg dst,
src_reg orig_src,
int base_offset);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
index ddab342..72c72e6 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
@@ -28,6 +28,7 @@ extern "C" {
 
 #include "brw_vec4.h"
 #include "brw_vs.h"
+#include "brw_cfg.h"
 
 using namespace brw;
 
@@ -326,8 +327,10 @@ vec4_visitor::spill_reg(int spill_reg_nr)
assert(virtual_grf_sizes[spill_reg_nr] == 1);
unsigned int spill_offset = c->last_scratch++;
 
+   calculate_cfg();
+
/* Generate spill/unspill instructions for the objects being spilled. */
-   foreach_in_list(vec4_instruction, inst, &instructions) {
+   foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
   for (unsigned int i = 0; i < 3; i++) {
  if (inst->src[i].file == GRF && inst->src[i].reg == spill_reg_nr) {
 src_reg spill_reg = inst->src[i];
@@ -342,16 +345,16 @@ vec4_visitor::spill_reg(int spill_reg_nr)
temp.writemask |= (1 << BRW_GET_SWZ(inst->src[i].swizzle, c));
 assert(temp.writemask != 0);
 
-emit_scratch_read(inst, temp, spill_reg, spill_offset);
+emit_scratch_read(block, inst, temp, spill_reg, spill_offset);
  }
   }
 
   if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) {
- emit_scratch_write(inst, spill_offset);
+ emit_scratch_write(block, inst, spill_offset);
   }
}
 
-   invalidate_live_intervals();
+   invalidate_live_intervals(false);
 }
 

Re: [Mesa-dev] [PATCH 14/20] i965/vec4: Don't iterate between blocks with inst->next/prev.

2014-09-19 Thread Matt Turner
On Wed, Sep 17, 2014 at 5:51 AM, Pohjolainen, Topi
 wrote:
> On Tue, Sep 02, 2014 at 09:34:25PM -0700, Matt Turner wrote:
>> The register coalescing portion of this patch hurts three shaders in
>> Guacamelee by one instruction each, but examining the diff makes me
>> believe that what we were generating was (perhaps harmlessly) incorrect.
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 30 +-
>>  1 file changed, 9 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 6669281..e3869d6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -513,12 +513,9 @@ vec4_visitor::dead_code_eliminate()
>>   }
>>}
>>
>> -  for (exec_node *node = inst->prev, *prev = node->prev;
>> -   prev != NULL && dead_channels != 0;
>> -   node = prev, prev = prev->prev) {
>> - vec4_instruction *scan_inst = (vec4_instruction  *)node;
>> -
>> - if (scan_inst->is_control_flow())
>
> Last instruction of the block is not considered in the iteration, but first
> instruction is. Hence if I'm reading this right, before DO and ENDIF weren't
> considered but now they are.

That's true, but it doesn't have any effect, since neither do nor
endif take arguments.


Re: [Mesa-dev] [PATCH 00/14] i965: Instruction compaction improvements

2014-09-19 Thread Matt Turner
On Thu, Aug 28, 2014 at 8:10 PM, Matt Turner  wrote:
> This series adds instruction compaction support for G45 and Gen5
> and enables compaction of control flow instructions.

Ken reviewed the first four patches I think. Can I get someone to
review the rest?


Re: [Mesa-dev] [PATCH 13/12] i965/fs: Refactor fs_inst::is_send_from_grf()

2014-09-19 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 11/12] i965/fs: Print BAD_FILE registers in dump_instruction

2014-09-19 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 10/12] i965/fs: Make compact_virtual_grfs an optimization pass

2014-09-19 Thread Matt Turner
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand  wrote:
> Previously we disabled compact_virtual_grfs when dumping optimizations.
> The idea here was to make it easier to diff the dumped shader because you
> didn't have a sudden renaming.  However, sometimes a bug is affected by
> compact_virtual_grfs and, when this happens, you want to keep dumping
> instructions with compact_virtual_grfs enabled.  By turning it into an
> optimization pass and dumping it along with the others, we retain the
> ability to diff because you can just diff against the compact_virtual_grf
> output.

I'd like to understand the bug you encountered.

I'm kind of concerned that we're going to just run the optimization
loop an extra time for every shader now, since compact_virtual_grfs is
going to set progress = true after the last actual optimization pass
made progress. I guess we could remove that problem by calling
compact_virtual_grfs at the end of the loop, rather than at the
beginning.


Re: [Mesa-dev] [PATCH 08/12] i965/fs: Make null_reg_* const members of fs_visitor instead of globals

2014-09-19 Thread Matt Turner
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand  wrote:
> We also set the register width equal to the dispatch width.  Right now,
> this is effectively a no-op since we don't do anything with it.  However,
> it will be important once we add an actual width field to fs_reg.

I don't really see the point to be honest. We just wind up calling the
constructor >1 time.

I could maybe see making them static members just to reduce their scope.


Re: [Mesa-dev] [PATCH 09/12] i965/fs: Make immediate fs_reg constructors explicit

2014-09-19 Thread Matt Turner
Yes, I've always been weirded out by this.

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 06/12] i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions

2014-09-19 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 05/12] i965/fs: Use offset a lot more places

2014-09-19 Thread Matt Turner
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand  wrote:
> We have this wonderful offset() function for advancing registers, but we're
> not using it.  Using offset() allows us to do some sanity checking and
> avoid manually touching fs_reg::reg_offset.  In a few commits, we will make
> offset do even more nifty things for us.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  18 +--
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp  |  12 +-
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |   4 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 137 
> ++
>  4 files changed, 78 insertions(+), 93 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index af8c087..ea91705 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -310,8 +310,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst,
>   inst->mlen = 1 + dispatch_width / 8;
> }
>
> -   vec4_result.reg_offset += (const_offset & 3) * scale;
> -   instructions.push_tail(MOV(dst, vec4_result));
> +   fs_reg result = offset(vec4_result, (const_offset & 3) * scale);
> +   instructions.push_tail(MOV(dst, result));

Isn't this going to cause us to copy an fs_reg twice, rather than just
setting .reg_offset?

I'd like to check the generated code.
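
The trade-off in question can be sketched with a by-value helper. `ToyReg` and `toy_offset` are hypothetical names for illustration; this is not claimed to be the actual offset() from the i965 backend.

```cpp
#include <cassert>

// Hypothetical register reference: a vgrf number plus an offset within it.
struct ToyReg {
   int reg;
   int reg_offset;
};

// Takes the register by value, advances the copy, and returns it.
// The caller's register is left untouched.
inline ToyReg toy_offset(ToyReg reg, int delta)
{
   assert(delta >= 0); // a place for the kind of sanity checking mentioned in the commit message
   reg.reg_offset += delta;
   return reg;
}
```

Compared with mutating `reg_offset` in place, this form keeps the original register intact and gives one place to put checks, but it does construct extra copies on paper. Whether those copies survive optimization depends on the compiler, which is why inspecting the generated code is a reasonable request.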


Re: [Mesa-dev] [PATCH 04/12] i965/fs: fix a comment in compact_virtual_grfs

2014-09-19 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 03/12] i965/fs: Rewrite fs_visitor::split_virtual_grfs

2014-09-19 Thread Matt Turner
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand  wrote:
> The original vgrf splitting code was written assuming that with the
> assumption that vgrfs came in two types: those that can be split into

s/ with the assumption that//

> single registers and those that can't be split at all

Period

> It was very
> conservative and bailed as soon as more than one element of a register was
> read or written.  This won't work once we start allowing a regular MOV or
> ADD operation to operate on multiple registers.  This rewrite allows for
> the case where a vgrf of size 5 may appropreately be split in to one

appropriately

> register of size 1 and two registers of size 2.

I'm not sure I understand enough yet to review.
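
One way to picture the size-5 example from the commit message is the sketch below. It is an assumption about the general idea (an access spanning several registers forbids splitting inside that span), not the actual split_virtual_grfs implementation; `ToyAccess` and `split_sizes` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one read or write: it starts at `reg_offset`
// within the vgrf and covers `regs` consecutive registers.
struct ToyAccess {
   int reg_offset;
   int regs;
};

// Returns the sizes of the pieces a vgrf of `vgrf_size` registers can be
// split into, given every access to it. Assumes non-negative offsets.
std::vector<int> split_sizes(int vgrf_size, const std::vector<ToyAccess> &accesses)
{
   // can_split[i] is true if a cut is allowed just before register i.
   std::vector<bool> can_split(vgrf_size, true);

   // A multi-register access forbids every cut that falls inside it.
   for (const ToyAccess &a : accesses) {
      for (int i = a.reg_offset + 1; i < a.reg_offset + a.regs && i < vgrf_size; i++)
         can_split[i] = false;
   }

   // Walk the registers and emit one piece per run between allowed cuts.
   std::vector<int> sizes;
   int start = 0;
   for (int i = 1; i <= vgrf_size; i++) {
      if (i == vgrf_size || can_split[i]) {
         sizes.push_back(i - start);
         start = i;
      }
   }
   return sizes;
}
```

With one single-register access at offset 0 and two-register accesses at offsets 1 and 3, a vgrf of size 5 comes out as pieces of size 1, 2, and 2, matching the case described in the commit message.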


Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Ilia Mirkin
On Fri, Sep 19, 2014 at 8:14 PM, Emil Velikov  wrote:
> On 20/09/14 00:56, Ilia Mirkin wrote:
>> On Fri, Sep 19, 2014 at 7:45 PM, Emil Velikov  
>> wrote:
>>> On 20/09/14 00:13, Ilia Mirkin wrote:
>>>> Do we do that anywhere else? Seems really hacky, and windows doesn't
>>>> support symlinks among other things... I'd just as soon force a
>>>> non-broken version of automake :)
>>>>
>>> Hmm just noticed that we should put the generated source(s) into the
>>> nodist_*
>>>
>>> Please define "anywhere else". It does seem hacky but it's less hacky
>>> than the current approach afaics. Cannot really parse "I'd just as soon
>>> force a non-broken version of automake". Can you elaborate?
>>
>> You said that automake 2.0 is broken (in that it's not backwards
>> compatible and doesn't support our setup). To resolve it, you're
>> introducing a (IMO) horrible hack of adding a symlink. My suggested
>> alternative is to just force a lower version of automake...
>>
> Hmm I feel that you hate autohell a bit too much... or is it the case of

I hate pandering to broken tools, auto or otherwise.

> "people fear what they don't understand" ? Not that I like/know autoslow
> too much but I'm willing to (still) give it a chance.
>
> Automake 2.0 is not out, yet putting tape over our eyes and pleading
> ignorance against its (future) existence is a bit silly. I realise that
> this does not look good but it's the most reasonable choice.

I can't imagine that the intended way to resolve the issue is to add
symlinks at build time. This sort of solution points to a severe
mismatch between what the tool expects and what we want (or at least
are currently doing, which may be dirty to begin with, for all I
know).

> Additionally I'd like to acknowledge OpenBSD people's existence, and
> help them stop rolling their own build for every mesa release.
>
>>>
>>> Have a sneaky feeling that we may get away with just creating a single
>>> blob in aux/vl, rather than one per target, yet I would prefer to save
>>> people (myself?) the headaches as things go pear-shaped :)
>>
>> Yes, building the files where they are is the more common thing than
>> referencing them from all over...
>>
> I fear that our current split (the gallium way) mandates it. And even if
> it works(tm) now I _really_ want to prevent the headaches as it breaks.

symlinks in builds cause headaches. Building files where they live is
almost always the right answer.

>
> -Emil
>
>>>
>>> -Emil
>>>
>>>> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov
>>>> wrote:
>>>>> Ensure that the object is built in the target folder, as automake 2.0
>>>>> will mandate subdir-objects. Pointed out by automake 1.14.
>>>>>
>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
>>>>> Signed-off-by: Emil Velikov
>>>>> ---
>>>>>  src/gallium/targets/omx/.gitignore  |  1 +
>>>>>  src/gallium/targets/omx/Makefile.am | 10 --
>>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>>  create mode 100644 src/gallium/targets/omx/.gitignore
>>>>>
>>>>> diff --git a/src/gallium/targets/omx/.gitignore
>>>>> b/src/gallium/targets/omx/.gitignore
>>>>> new file mode 100644
>>>>> index 000..4fd1800
>>>>> --- /dev/null
>>>>> +++ b/src/gallium/targets/omx/.gitignore
>>>>> @@ -0,0 +1 @@
>>>>> +vl_winsys_dri.c
>>>>> diff --git a/src/gallium/targets/omx/Makefile.am
>>>>> b/src/gallium/targets/omx/Makefile.am
>>>>> index 4045548..f41719f 100644
>>>>> --- a/src/gallium/targets/omx/Makefile.am
>>>>> +++ b/src/gallium/targets/omx/Makefile.am
>>>>> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>>>>>  omx_LTLIBRARIES = libomx_mesa.la
>>>>>
>>>>>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
>>>>> -libomx_mesa_la_SOURCES = \
>>>>> -   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>>>>> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>>>>>
>>>>>  libomx_mesa_la_LDFLAGS = \
>>>>> -shared \
>>>>> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
>>>>> $(OMX_LIBS) \
>>>>> $(GALLIUM_COMMON_LIB_DEPS)
>>>>>
>>>>> +BUILT_SOURCES = vl_winsys_dri.c
>>>>> +CLEANFILES = vl_winsys_dri.c
>>>>> +
>>>>> +vl_winsys_dri.c:
>>>>> +   $(AM_V_GEN)$(LN_S)
>>>>> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>>>>> +
>>>>> +
>>>>>  if HAVE_GALLIUM_STATIC_TARGETS
>>>>>
>>>>>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
>>>>> --
>>>>> 2.1.0
>>>>>
>>>>> ___
>>>>> mesa-dev mailing list
>>>>> mesa-dev@lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>


Re: [Mesa-dev] [PATCH 02/12] i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation

2014-09-19 Thread Matt Turner
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Matt Turner
On Fri, Sep 19, 2014 at 4:01 PM, Emil Velikov  wrote:
> Ensure that the object is built in the target folder, as automake 2.0
> will mandate subdir-objects. Pointed out by automake 1.14.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
> Signed-off-by: Emil Velikov 
> ---
>  src/gallium/targets/omx/.gitignore  |  1 +
>  src/gallium/targets/omx/Makefile.am | 10 --
>  2 files changed, 9 insertions(+), 2 deletions(-)
>  create mode 100644 src/gallium/targets/omx/.gitignore
>
> diff --git a/src/gallium/targets/omx/.gitignore 
> b/src/gallium/targets/omx/.gitignore
> new file mode 100644
> index 000..4fd1800
> --- /dev/null
> +++ b/src/gallium/targets/omx/.gitignore
> @@ -0,0 +1 @@
> +vl_winsys_dri.c
> diff --git a/src/gallium/targets/omx/Makefile.am 
> b/src/gallium/targets/omx/Makefile.am
> index 4045548..f41719f 100644
> --- a/src/gallium/targets/omx/Makefile.am
> +++ b/src/gallium/targets/omx/Makefile.am
> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>  omx_LTLIBRARIES = libomx_mesa.la
>
>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
> -libomx_mesa_la_SOURCES = \
> -   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>
>  libomx_mesa_la_LDFLAGS = \
> -shared \
> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
> $(OMX_LIBS) \
> $(GALLIUM_COMMON_LIB_DEPS)
>
> +BUILT_SOURCES = vl_winsys_dri.c
> +CLEANFILES = vl_winsys_dri.c
> +
> +vl_winsys_dri.c:
> +   $(AM_V_GEN)$(LN_S) 
> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c

This file gets built by omx, xvmc, and vdpau, but is it actually built
with different CPPFLAGS or something? That is, can't we actually just
build it once in its subdirectory?

I don't see any meaningful preprocessor checks in the source file that
make me think it can't.


Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Emil Velikov
On 20/09/14 00:56, Ilia Mirkin wrote:
> On Fri, Sep 19, 2014 at 7:45 PM, Emil Velikov  
> wrote:
>> On 20/09/14 00:13, Ilia Mirkin wrote:
>>> Do we do that anywhere else? Seems really hacky, and windows doesn't
>>> support symlinks among other things... I'd just as soon force a
>>> non-broken version of automake :)
>>>
>> Hmm just noticed that we should put the generated source(s) into the
>> nodist_*
>>
>> Please define "anywhere else". It does seem hacky but it's less hacky
>> than the current approach afaics. Cannot really parse "I'd just as soon
>> force a non-broken version of automake". Can you elaborate?
> 
> You said that automake 2.0 is broken (in that it's not backwards
> compatible and doesn't support our setup). To resolve it, you're
> introducing a (IMO) horrible hack of adding a symlink. My suggested
> alternative is to just force a lower version of automake...
> 
Hmm I feel that you hate autohell a bit too much... or is it the case of
"people fear what they don't understand" ? Not that I like/know autoslow
too much but I'm willing to (still) give it a chance.

Automake 2.0 is not out, yet putting tape over our eyes and pleading
ignorance against its (future) existence is a bit silly. I realise that
this does not look good but it's the most reasonable choice.
Additionally I'd like to acknowledge OpenBSD people's existence, and
help them stop rolling their own build for every mesa release.

>>
>> Have a sneaky feeling that we may get away with just creating a single
>> blob in aux/vl, rather than one per target, yet I would prefer to save
>> people (myself?) the headaches as things go pear-shaped :)
> 
> Yes, building the files where they are is the more common thing than
> referencing them from all over...
> 
I fear that our current split (the gallium way) mandates it. And even if
it works(tm) now I _really_ want to prevent the headaches as it breaks.

-Emil

>>
>> -Emil
>>
>>> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov  
>>> wrote:
>>>> Ensure that the object is built in the target folder, as automake 2.0
>>>> will mandate subdir-objects. Pointed out by automake 1.14.
>>>>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
>>>> Signed-off-by: Emil Velikov
>>>> ---
>>>>  src/gallium/targets/omx/.gitignore  |  1 +
>>>>  src/gallium/targets/omx/Makefile.am | 10 --
>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>  create mode 100644 src/gallium/targets/omx/.gitignore
>>>>
>>>> diff --git a/src/gallium/targets/omx/.gitignore
>>>> b/src/gallium/targets/omx/.gitignore
>>>> new file mode 100644
>>>> index 000..4fd1800
>>>> --- /dev/null
>>>> +++ b/src/gallium/targets/omx/.gitignore
>>>> @@ -0,0 +1 @@
>>>> +vl_winsys_dri.c
>>>> diff --git a/src/gallium/targets/omx/Makefile.am
>>>> b/src/gallium/targets/omx/Makefile.am
>>>> index 4045548..f41719f 100644
>>>> --- a/src/gallium/targets/omx/Makefile.am
>>>> +++ b/src/gallium/targets/omx/Makefile.am
>>>> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>>>>  omx_LTLIBRARIES = libomx_mesa.la
>>>>
>>>>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
>>>> -libomx_mesa_la_SOURCES = \
>>>> -   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>>>> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>>>>
>>>>  libomx_mesa_la_LDFLAGS = \
>>>> -shared \
>>>> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
>>>> $(OMX_LIBS) \
>>>> $(GALLIUM_COMMON_LIB_DEPS)
>>>>
>>>> +BUILT_SOURCES = vl_winsys_dri.c
>>>> +CLEANFILES = vl_winsys_dri.c
>>>> +
>>>> +vl_winsys_dri.c:
>>>> +   $(AM_V_GEN)$(LN_S)
>>>> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>>>> +
>>>> +
>>>>  if HAVE_GALLIUM_STATIC_TARGETS
>>>>
>>>>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
>>>> --
>>>> 2.1.0
>>>>
>>>> ___
>>>> mesa-dev mailing list
>>>> mesa-dev@lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>



Re: [Mesa-dev] [PATCH 07/12] i965/fs: Use the var_from_vgrf helper function instead of doing it manually

2014-09-19 Thread Matt Turner
On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand  wrote:
> Signed-off-by: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> index 697b44a..036875f 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> @@ -58,7 +58,7 @@ fs_visitor::dead_code_eliminate()
> int var = live_intervals->var_from_reg(&inst->dst);
> result_live = BITSET_TEST(live, var);
>  } else {
> -   int var = live_intervals->var_from_vgrf[inst->dst.reg];
> +   int var = live_intervals->var_from_reg(&inst->dst);
> for (int i = 0; i < inst->regs_written; i++) {
>result_live = result_live || BITSET_TEST(live, var + i);

This is wrong, isn't it? Before, we got the base var and iterated 0
through regs_written. After, we're getting the var of the
register+offset and then iterating.

> }
> @@ -78,19 +78,19 @@ fs_visitor::dead_code_eliminate()
>
>   if (inst->dst.file == GRF) {
>  if (!inst->is_partial_write()) {
> -   int var = live_intervals->var_from_vgrf[inst->dst.reg];
> +   int var = live_intervals->var_from_reg(&inst->dst);
> for (int i = 0; i < inst->regs_written; i++) {
> -  BITSET_CLEAR(live, var + inst->dst.reg_offset + i);
> +  BITSET_CLEAR(live, var + i);
> }

This hunk seems fine.

>  }
>   }
>
>   for (int i = 0; i < inst->sources; i++) {
>  if (inst->src[i].file == GRF) {
> -   int var = live_intervals->var_from_vgrf[inst->src[i].reg];
> +   int var = live_intervals->var_from_reg(&inst->src[i]);
>
> for (int j = 0; j < inst->regs_read(this, i); j++) {
> -  BITSET_SET(live, var + inst->src[i].reg_offset + j);
> +  BITSET_SET(live, var + j);

I think this is also fine.


Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Ilia Mirkin
On Fri, Sep 19, 2014 at 7:45 PM, Emil Velikov  wrote:
> On 20/09/14 00:13, Ilia Mirkin wrote:
>> Do we do that anywhere else? Seems really hacky, and windows doesn't
>> support symlinks among other things... I'd just as soon force a
>> non-broken version of automake :)
>>
> Hmm just noticed that we should put the generated source(s) into the
> nodist_*
>
> Please define "anywhere else". It does seem hacky but it's less hacky
> than the current approach afaics. Cannot really parse "I'd just as soon
> force a non-broken version of automake". Can you elaborate?

You said that automake 2.0 is broken (in that it's not backwards
compatible and doesn't support our setup). To resolve it, you're
introducing a (IMO) horrible hack of adding a symlink. My suggested
alternative is to just force a lower version of automake...

>
> Have a sneaky feeling that we may get away with just creating a single
> blob in aux/vl, rather than one per target, yet I would prefer to save
> people (myself?) the headaches as things go pear-shaped :)

Yes, building the files where they are is the more common thing than
referencing them from all over...

>
> -Emil
>
>> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov  
>> wrote:
>>> Ensure that the object is built in the target folder, as automake 2.0
>>> will mandate subdir-objects. Pointed out by automake 1.14.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
>>> Signed-off-by: Emil Velikov 
>>> ---
>>>  src/gallium/targets/omx/.gitignore  |  1 +
>>>  src/gallium/targets/omx/Makefile.am | 10 --
>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>  create mode 100644 src/gallium/targets/omx/.gitignore
>>>
>>> diff --git a/src/gallium/targets/omx/.gitignore 
>>> b/src/gallium/targets/omx/.gitignore
>>> new file mode 100644
>>> index 000..4fd1800
>>> --- /dev/null
>>> +++ b/src/gallium/targets/omx/.gitignore
>>> @@ -0,0 +1 @@
>>> +vl_winsys_dri.c
>>> diff --git a/src/gallium/targets/omx/Makefile.am 
>>> b/src/gallium/targets/omx/Makefile.am
>>> index 4045548..f41719f 100644
>>> --- a/src/gallium/targets/omx/Makefile.am
>>> +++ b/src/gallium/targets/omx/Makefile.am
>>> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>>>  omx_LTLIBRARIES = libomx_mesa.la
>>>
>>>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
>>> -libomx_mesa_la_SOURCES = \
>>> -   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>>> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>>>
>>>  libomx_mesa_la_LDFLAGS = \
>>> -shared \
>>> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
>>> $(OMX_LIBS) \
>>> $(GALLIUM_COMMON_LIB_DEPS)
>>>
>>> +BUILT_SOURCES = vl_winsys_dri.c
>>> +CLEANFILES = vl_winsys_dri.c
>>> +
>>> +vl_winsys_dri.c:
>>> +   $(AM_V_GEN)$(LN_S) 
>>> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>>> +
>>> +
>>>  if HAVE_GALLIUM_STATIC_TARGETS
>>>
>>>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
>>> --
>>> 2.1.0
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Emil Velikov
On 20/09/14 00:13, Ilia Mirkin wrote:
> Do we do that anywhere else? Seems really hacky, and windows doesn't
> support symlinks among other things... I'd just as soon force a
> non-broken version of automake :)
> 
Hmm just noticed that we should put the generated source(s) into the
nodist_*

Please define "anywhere else". It does seem hacky but it's less hacky
than the current approach afaics. Cannot really parse "I'd just as soon
force a non-broken version of automake". Can you elaborate?

Have a sneaky feeling that we may get away with just creating a single
blob in aux/vl, rather than one per target, yet I would prefer to save
people (myself?) the headaches if things go pear-shaped :)

-Emil

> On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov  
> wrote:
>> Ensure that the object is built in the target folder, as automake 2.0
>> will mandate subdir-objects. Pointed out by automake 1.14.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/gallium/targets/omx/.gitignore  |  1 +
>>  src/gallium/targets/omx/Makefile.am | 10 --
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>  create mode 100644 src/gallium/targets/omx/.gitignore
>>
>> diff --git a/src/gallium/targets/omx/.gitignore 
>> b/src/gallium/targets/omx/.gitignore
>> new file mode 100644
>> index 000..4fd1800
>> --- /dev/null
>> +++ b/src/gallium/targets/omx/.gitignore
>> @@ -0,0 +1 @@
>> +vl_winsys_dri.c
>> diff --git a/src/gallium/targets/omx/Makefile.am 
>> b/src/gallium/targets/omx/Makefile.am
>> index 4045548..f41719f 100644
>> --- a/src/gallium/targets/omx/Makefile.am
>> +++ b/src/gallium/targets/omx/Makefile.am
>> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>>  omx_LTLIBRARIES = libomx_mesa.la
>>
>>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
>> -libomx_mesa_la_SOURCES = \
>> -   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>>
>>  libomx_mesa_la_LDFLAGS = \
>> -shared \
>> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
>> $(OMX_LIBS) \
>> $(GALLIUM_COMMON_LIB_DEPS)
>>
>> +BUILT_SOURCES = vl_winsys_dri.c
>> +CLEANFILES = vl_winsys_dri.c
>> +
>> +vl_winsys_dri.c:
>> +   $(AM_V_GEN)$(LN_S) 
>> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
>> +
>> +
>>  if HAVE_GALLIUM_STATIC_TARGETS
>>
>>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
>> --
>> 2.1.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Ilia Mirkin
Do we do that anywhere else? Seems really hacky, and windows doesn't
support symlinks among other things... I'd just as soon force a
non-broken version of automake :)

On Fri, Sep 19, 2014 at 7:01 PM, Emil Velikov  wrote:
> Ensure that the object is built in the target folder, as automake 2.0
> will mandate subdir-objects. Pointed out by automake 1.14.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
> Signed-off-by: Emil Velikov 
> ---
>  src/gallium/targets/omx/.gitignore  |  1 +
>  src/gallium/targets/omx/Makefile.am | 10 --
>  2 files changed, 9 insertions(+), 2 deletions(-)
>  create mode 100644 src/gallium/targets/omx/.gitignore
>
> diff --git a/src/gallium/targets/omx/.gitignore 
> b/src/gallium/targets/omx/.gitignore
> new file mode 100644
> index 000..4fd1800
> --- /dev/null
> +++ b/src/gallium/targets/omx/.gitignore
> @@ -0,0 +1 @@
> +vl_winsys_dri.c
> diff --git a/src/gallium/targets/omx/Makefile.am 
> b/src/gallium/targets/omx/Makefile.am
> index 4045548..f41719f 100644
> --- a/src/gallium/targets/omx/Makefile.am
> +++ b/src/gallium/targets/omx/Makefile.am
> @@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
>  omx_LTLIBRARIES = libomx_mesa.la
>
>  nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
> -libomx_mesa_la_SOURCES = \
> -   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
> +libomx_mesa_la_SOURCES = vl_winsys_dri.c
>
>  libomx_mesa_la_LDFLAGS = \
> -shared \
> @@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
> $(OMX_LIBS) \
> $(GALLIUM_COMMON_LIB_DEPS)
>
> +BUILT_SOURCES = vl_winsys_dri.c
> +CLEANFILES = vl_winsys_dri.c
> +
> +vl_winsys_dri.c:
> +   $(AM_V_GEN)$(LN_S) 
> $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
> +
> +
>  if HAVE_GALLIUM_STATIC_TARGETS
>
>  STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
> --
> 2.1.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3

2014-09-19 Thread Mike Lothian
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
 src/mesa/drivers/dri/i965/intel_screen.c | 7 +--
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index b7c64c6..4e6627e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_standard_derivatives = true;
ctx->Extensions.OES_EGL_image_external = true;
 
-   if (brw->gen >= 7)
+   if (brw->gen >= 6)
   ctx->Const.GLSLVersion = 330;
-   else if (brw->gen >= 6)
-  ctx->Const.GLSLVersion = 150;
else
   ctx->Const.GLSLVersion = 120;
_mesa_override_glsl_version(&ctx->Const);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 8070e97..41964ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
switch (screen->devinfo->gen) {
case 8:
case 7:
-  psp->max_gl_core_version = 33;
-  psp->max_gl_compat_version = 30;
-  psp->max_gl_es1_version = 11;
-  psp->max_gl_es2_version = 30;
-  break;
case 6:
-  psp->max_gl_core_version = 32;
+  psp->max_gl_core_version = 33;
   psp->max_gl_compat_version = 30;
   psp->max_gl_es1_version = 11;
   psp->max_gl_es2_version = 30;
-- 
2.1.0
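
For anyone reading the second hunk cold: it removes gen7's assignments and its break, so case 8 and case 7 now fall through into the case 6 body. Below is a rough standalone sketch of that control flow in C. The struct gl_limits type and both function names are made up for illustration and are not Mesa's real structures; the zeroed fallback for older generations is a placeholder, since the patch does not touch those cases.

```c
/* Hypothetical stand-in for the per-screen version limits; not Mesa's type. */
struct gl_limits {
   int core;    /* max core-profile version, e.g. 33 means OpenGL 3.3 */
   int compat;  /* max compatibility-profile version */
   int es1;     /* max OpenGL ES 1.x version */
   int es2;     /* max OpenGL ES 2.x/3.x version */
};

static struct gl_limits
max_versions_for_gen(int gen)
{
   struct gl_limits v = {0, 0, 0, 0};

   switch (gen) {
   case 8:
   case 7:
   case 6:
      /* With the patch applied, gen6 shares one body with gen7 and gen8. */
      v.core = 33;
      v.compat = 30;
      v.es1 = 11;
      v.es2 = 30;
      break;
   default:
      /* Older generations are not covered by the patch; left zeroed here. */
      break;
   }
   return v;
}

/* Mirrors the first hunk: gen6 and later get GLSL 3.30, everything else 1.20. */
static int
glsl_version_for_gen(int gen)
{
   return gen >= 6 ? 330 : 120;
}
```

The point of the sketch is only that a single threshold (gen >= 6) and a single switch body now describe all three generations.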



[Mesa-dev] [PATCH] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3

2014-09-19 Thread Mike Lothian
Hi 

This is the first time I've used git send-email, so hopefully the patch is inline
now.

I've run piglit and there don't seem to be any failures related to the new 
enablement (I do get some GS fails though)

Cheers

Mike



[Mesa-dev] [PATCH 2/3] targets/vdpau: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Emil Velikov
Ensure that the object is built in the target folder, as automake 2.0
will mandate subdir-objects. Pointed out by automake 1.14.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
Signed-off-by: Emil Velikov 
---
 src/gallium/targets/vdpau/.gitignore  | 1 +
 src/gallium/targets/vdpau/Makefile.am | 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/targets/vdpau/.gitignore

diff --git a/src/gallium/targets/vdpau/.gitignore 
b/src/gallium/targets/vdpau/.gitignore
new file mode 100644
index 000..4fd1800
--- /dev/null
+++ b/src/gallium/targets/vdpau/.gitignore
@@ -0,0 +1 @@
+vl_winsys_dri.c
diff --git a/src/gallium/targets/vdpau/Makefile.am 
b/src/gallium/targets/vdpau/Makefile.am
index 440cf22..1b42a1d 100644
--- a/src/gallium/targets/vdpau/Makefile.am
+++ b/src/gallium/targets/vdpau/Makefile.am
@@ -7,8 +7,7 @@ vdpaudir = $(VDPAU_LIB_INSTALL_DIR)
 vdpau_LTLIBRARIES = libvdpau_gallium.la
 
 nodist_EXTRA_libvdpau_gallium_la_SOURCES = dummy.cpp
-libvdpau_gallium_la_SOURCES = \
-   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+libvdpau_gallium_la_SOURCES = vl_winsys_dri.c
 
 libvdpau_gallium_la_LDFLAGS = \
-shared \
@@ -36,6 +35,12 @@ libvdpau_gallium_la_LIBADD = \
$(LIBDRM_LIBS) \
$(GALLIUM_COMMON_LIB_DEPS)
 
+BUILT_SOURCES = vl_winsys_dri.c
+CLEANFILES = vl_winsys_dri.c
+
+vl_winsys_dri.c:
+   $(AM_V_GEN)$(LN_S) 
$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+
 
 if HAVE_GALLIUM_STATIC_TARGETS
 
-- 
2.1.0



[Mesa-dev] [PATCH 3/3] targets/xvmc: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Emil Velikov
Ensure that the object is built in the target folder, as automake 2.0
will mandate subdir-objects. Pointed out by automake 1.14.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
Signed-off-by: Emil Velikov 
---
 src/gallium/targets/xvmc/.gitignore  | 1 +
 src/gallium/targets/xvmc/Makefile.am | 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/targets/xvmc/.gitignore

diff --git a/src/gallium/targets/xvmc/.gitignore 
b/src/gallium/targets/xvmc/.gitignore
new file mode 100644
index 000..4fd1800
--- /dev/null
+++ b/src/gallium/targets/xvmc/.gitignore
@@ -0,0 +1 @@
+vl_winsys_dri.c
diff --git a/src/gallium/targets/xvmc/Makefile.am 
b/src/gallium/targets/xvmc/Makefile.am
index 884bccf..f6c7e03 100644
--- a/src/gallium/targets/xvmc/Makefile.am
+++ b/src/gallium/targets/xvmc/Makefile.am
@@ -7,8 +7,7 @@ xvmcdir = $(XVMC_LIB_INSTALL_DIR)
 xvmc_LTLIBRARIES = libXvMCgallium.la
 
 nodist_EXTRA_libXvMCgallium_la_SOURCES = dummy.cpp
-libXvMCgallium_la_SOURCES = \
-   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+libXvMCgallium_la_SOURCES = vl_winsys_dri.c
 
 libXvMCgallium_la_LDFLAGS = \
-shared \
@@ -31,6 +30,12 @@ libXvMCgallium_la_LIBADD = \
$(LIBDRM_LIBS) \
$(GALLIUM_COMMON_LIB_DEPS)
 
+BUILT_SOURCES = vl_winsys_dri.c
+CLEANFILES = vl_winsys_dri.c
+
+vl_winsys_dri.c:
+   $(AM_V_GEN)$(LN_S) 
$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+
 
 if HAVE_GALLIUM_STATIC_TARGETS
 
-- 
2.1.0



[Mesa-dev] [PATCH 1/3] targets/omx: create symlink to aux/vl/vl_winsys_dri.c at build time

2014-09-19 Thread Emil Velikov
Ensure that the object is built in the target folder, as automake 2.0
will mandate subdir-objects. Pointed out by automake 1.14.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69874
Signed-off-by: Emil Velikov 
---
 src/gallium/targets/omx/.gitignore  |  1 +
 src/gallium/targets/omx/Makefile.am | 10 --
 2 files changed, 9 insertions(+), 2 deletions(-)
 create mode 100644 src/gallium/targets/omx/.gitignore

diff --git a/src/gallium/targets/omx/.gitignore 
b/src/gallium/targets/omx/.gitignore
new file mode 100644
index 000..4fd1800
--- /dev/null
+++ b/src/gallium/targets/omx/.gitignore
@@ -0,0 +1 @@
+vl_winsys_dri.c
diff --git a/src/gallium/targets/omx/Makefile.am 
b/src/gallium/targets/omx/Makefile.am
index 4045548..f41719f 100644
--- a/src/gallium/targets/omx/Makefile.am
+++ b/src/gallium/targets/omx/Makefile.am
@@ -7,8 +7,7 @@ omxdir = $(OMX_LIB_INSTALL_DIR)
 omx_LTLIBRARIES = libomx_mesa.la
 
 nodist_EXTRA_libomx_mesa_la_SOURCES = dummy.cpp
-libomx_mesa_la_SOURCES = \
-   $(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+libomx_mesa_la_SOURCES = vl_winsys_dri.c
 
 libomx_mesa_la_LDFLAGS = \
-shared \
@@ -30,6 +29,13 @@ libomx_mesa_la_LIBADD = \
$(OMX_LIBS) \
$(GALLIUM_COMMON_LIB_DEPS)
 
+BUILT_SOURCES = vl_winsys_dri.c
+CLEANFILES = vl_winsys_dri.c
+
+vl_winsys_dri.c:
+   $(AM_V_GEN)$(LN_S) 
$(top_srcdir)/src/gallium/auxiliary/vl/vl_winsys_dri.c
+
+
 if HAVE_GALLIUM_STATIC_TARGETS
 
 STATIC_TARGET_CPPFLAGS = -DGALLIUM_STATIC_TARGETS=1
-- 
2.1.0



Re: [Mesa-dev] [PATCH] i965/gen6: Enable GL 3.3 and GLSL 3.30

2014-09-19 Thread Jordan Justen
Reviewed-by: Jordan Justen 

Mike, my reply for your patch was going to be:
* Can you inline your patch?
* Did you run piglit?

On Fri, Sep 19, 2014 at 3:39 PM, Chris Forbes  wrote:
> Tested on my snb-gt2:
>
> 4 tests skip->pass in spec/EXT_texture_array
> 51 tests skip->pass in spec.glsl-3.30
> 4 tests skip->pass in spec/!OpenGL 3.3
> No regressions; no skip->fail changes.
>
> Signed-off-by: Chris Forbes 
> ---
>
> Had the gen6 machine out anyway to try some other things; may as well get 
> this test run done at the same time :)
>
>  src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
>  src/mesa/drivers/dri/i965/intel_screen.c | 7 +--
>  2 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index b7c64c6..4e6627e 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.OES_standard_derivatives = true;
> ctx->Extensions.OES_EGL_image_external = true;
>
> -   if (brw->gen >= 7)
> +   if (brw->gen >= 6)
>ctx->Const.GLSLVersion = 330;
> -   else if (brw->gen >= 6)
> -  ctx->Const.GLSLVersion = 150;
> else
>ctx->Const.GLSLVersion = 120;
> _mesa_override_glsl_version(&ctx->Const);
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index 8070e97..41964ec 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
> switch (screen->devinfo->gen) {
> case 8:
> case 7:
> -  psp->max_gl_core_version = 33;
> -  psp->max_gl_compat_version = 30;
> -  psp->max_gl_es1_version = 11;
> -  psp->max_gl_es2_version = 30;
> -  break;
> case 6:
> -  psp->max_gl_core_version = 32;
> +  psp->max_gl_core_version = 33;
>psp->max_gl_compat_version = 30;
>psp->max_gl_es1_version = 11;
>psp->max_gl_es2_version = 30;
> --
> 1.8.5.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/gen6: Enable GL 3.3 and GLSL 3.30

2014-09-19 Thread Chris Forbes
Tested on my snb-gt2:

4 tests skip->pass in spec/EXT_texture_array
51 tests skip->pass in spec.glsl-3.30
4 tests skip->pass in spec/!OpenGL 3.3
No regressions; no skip->fail changes.

Signed-off-by: Chris Forbes 
---

Had the gen6 machine out anyway to try some other things; may as well get this 
test run done at the same time :)

 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
 src/mesa/drivers/dri/i965/intel_screen.c | 7 +--
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index b7c64c6..4e6627e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_standard_derivatives = true;
ctx->Extensions.OES_EGL_image_external = true;
 
-   if (brw->gen >= 7)
+   if (brw->gen >= 6)
   ctx->Const.GLSLVersion = 330;
-   else if (brw->gen >= 6)
-  ctx->Const.GLSLVersion = 150;
else
   ctx->Const.GLSLVersion = 120;
_mesa_override_glsl_version(&ctx->Const);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 8070e97..41964ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
switch (screen->devinfo->gen) {
case 8:
case 7:
-  psp->max_gl_core_version = 33;
-  psp->max_gl_compat_version = 30;
-  psp->max_gl_es1_version = 11;
-  psp->max_gl_es2_version = 30;
-  break;
case 6:
-  psp->max_gl_core_version = 32;
+  psp->max_gl_core_version = 33;
   psp->max_gl_compat_version = 30;
   psp->max_gl_es1_version = 11;
   psp->max_gl_es2_version = 30;
-- 
1.8.5.3



[Mesa-dev] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3

2014-09-19 Thread Mike Lothian
Hi

I'm pretty sure this is all that's needed to switch on GLSL 3.30 and
OpenGL 3.3 on Sandy Bridge

Cheers

Mike
From b16937f37681f8e44c86cdb86bd76fd1bbfab998 Mon Sep 17 00:00:00 2001
From: Mike Lothian 
Date: Fri, 19 Sep 2014 22:56:46 +0100
Subject: [PATCH] i965/gen6: Enable GLSL 3.30 and OpenGL 3.3

---
 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +---
 src/mesa/drivers/dri/i965/intel_screen.c | 7 +--
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index b7c64c6..4e6627e 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -243,10 +243,8 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_standard_derivatives = true;
ctx->Extensions.OES_EGL_image_external = true;
 
-   if (brw->gen >= 7)
+   if (brw->gen >= 6)
   ctx->Const.GLSLVersion = 330;
-   else if (brw->gen >= 6)
-  ctx->Const.GLSLVersion = 150;
else
   ctx->Const.GLSLVersion = 120;
_mesa_override_glsl_version(&ctx->Const);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 8070e97..41964ec 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1269,13 +1269,8 @@ set_max_gl_versions(struct intel_screen *screen)
switch (screen->devinfo->gen) {
case 8:
case 7:
-  psp->max_gl_core_version = 33;
-  psp->max_gl_compat_version = 30;
-  psp->max_gl_es1_version = 11;
-  psp->max_gl_es2_version = 30;
-  break;
case 6:
-  psp->max_gl_core_version = 32;
+  psp->max_gl_core_version = 33;
   psp->max_gl_compat_version = 30;
   psp->max_gl_es1_version = 11;
   psp->max_gl_es2_version = 30;
-- 
2.1.0



[Mesa-dev] [PATCH 13/12] i965/fs: Refactor fs_inst::is_send_from_grf()

2014-09-19 Thread Jason Ekstrand
A switch statement is much easier to read and edit than one giant ||
expression.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a0b7c6a..527467a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -368,15 +368,22 @@ fs_inst::overwrites_reg(const fs_reg &reg) const
 bool
 fs_inst::is_send_from_grf() const
 {
-   return (opcode == FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7 ||
-   opcode == SHADER_OPCODE_SHADER_TIME_ADD ||
-   opcode == FS_OPCODE_INTERPOLATE_AT_CENTROID ||
-   opcode == FS_OPCODE_INTERPOLATE_AT_SAMPLE ||
-   opcode == FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET ||
-   opcode == FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET ||
-   (opcode == FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD &&
-src[1].file == GRF) ||
-   (is_tex() && src[0].file == GRF));
+   switch (opcode) {
+   case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7:
+   case SHADER_OPCODE_SHADER_TIME_ADD:
+   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
+   case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
+   case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
+   case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
+  return true;
+   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD:
+  return src[1].file == GRF;
+   default:
+  if (is_tex())
+ return src[0].file == GRF;
+
+  return false;
+   }
 }
 
 bool
-- 
2.1.0
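
To show the shape of the refactor outside the driver, here is a small self-contained sketch in plain C (the real code is a C++ member function). The enums, struct inst and both function names are hypothetical stand-ins, and the opcode list is shortened; only the structure (unconditional cases, one operand-dependent case, and the texturing check in the default branch) follows the patch.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-ins; none of these are Mesa definitions. */
enum reg_file { BAD_FILE, GRF, MRF, IMM };

enum opcode {
   OP_MOV,
   OP_TEX,
   OP_SHADER_TIME_ADD,
   OP_INTERPOLATE_AT_CENTROID,
   OP_VARYING_PULL_CONSTANT_LOAD_GEN7,
   OP_UNIFORM_PULL_CONSTANT_LOAD
};

struct inst {
   enum opcode opcode;
   enum reg_file src_file[3];   /* register file of each source operand */
};

static bool
inst_is_tex(const struct inst *inst)
{
   return inst->opcode == OP_TEX;
}

static bool
inst_is_send_from_grf(const struct inst *inst)
{
   switch (inst->opcode) {
   case OP_VARYING_PULL_CONSTANT_LOAD_GEN7:
   case OP_SHADER_TIME_ADD:
   case OP_INTERPOLATE_AT_CENTROID:
      /* These opcodes always send from the GRF. */
      return true;
   case OP_UNIFORM_PULL_CONSTANT_LOAD:
      /* Depends on where the second source lives. */
      return inst->src_file[1] == GRF;
   default:
      /* Texturing covers a range of opcodes, so it is tested via a helper. */
      if (inst_is_tex(inst))
         return inst->src_file[0] == GRF;
      return false;
   }
}
```

Adding another opcode later is then a one-line case label rather than another clause in a long boolean expression.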



Re: [Mesa-dev] [PATCH 00/12] i965/fs: A bunch of cleanups in preparation for explicit register widths

2014-09-19 Thread Jason Ekstrand
Oops.  Forgot one:

i965/fs: Refactor fs_inst::is_send_from_grf()

On Fri, Sep 19, 2014 at 1:10 PM, Jason Ekstrand 
wrote:

> I'm working on a series (which I hope to send out soon) that will allow us
> to have explicit register widths and instruction execution sizes in the fs
> backend IR.  If you want to see where I'm going with this, I've got a
> working version here:
>
> http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/kill-mrf-v0.5
>
> I'm planning to get that cleaned up a bit more and hope to send the full
> series out by the end of today or maybe Monday.  This series is a bunch of
> cleanup patches that will be needed eventually, but don't really change
> anything important on their own.  They should be generally reviewable by
> anyone with a decent understanding of the i965 fs backend.
>
> Jason Ekstrand (12):
>   i965/fs: Manually generate the meta fast-clear shader
>   i965/fs_live_variables: Use var_from_vgrf instead of repeating the
> calculation
>   i965/fs: Rewrite fs_visitor::split_virtual_grfs
>   i965/fs: fix a comment in compact_virtual_grfs
>   i965/fs: Use offset a lot more places
>   i965/fs: Use the UW type for the destination of
> VARYING_PULL_CONSTANT_LOAD instructions
>   i965/fs: Use the var_from_vgrf helper function instead of doing it
> manually
>   i965/fs: Make null_reg_* const members of fs_visitor instead of
> globals
>   i965/fs: Make immediate fs_reg constructors explicit
>   i965/fs: Make compact_virtual_grfs an optimization pass
>   i965/fs: Print BAD_FILE registers in dump_instruction
>   i965/fs: Clean up emit_fb_writes
>
>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 295
> +-
>  src/mesa/drivers/dri/i965/brw_fs.h |  21 +-
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  12 +-
>  .../dri/i965/brw_fs_dead_code_eliminate.cpp|  10 +-
>  src/mesa/drivers/dri/i965/brw_fs_fp.cpp|   2 +-
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp |   4 +-
>  .../drivers/dri/i965/brw_fs_live_variables.cpp |   4 +-
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   4 +-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 338
> +
>  src/mesa/drivers/dri/i965/brw_reg.h|   6 +
>  10 files changed, 323 insertions(+), 373 deletions(-)
>
> --
> 2.1.0
>
>


[Mesa-dev] [PATCH 12/15] radeonsi: don't pass the context to the shader translator

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

This should prevent accessing context state there.
---
 src/gallium/drivers/radeonsi/si_compute.c |  2 +-
 src/gallium/drivers/radeonsi/si_shader.c  | 29 +
 src/gallium/drivers/radeonsi/si_shader.h  |  7 +++
 src/gallium/drivers/radeonsi/si_state.c   |  2 +-
 4 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 9088268..4b2662d 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -81,7 +81,7 @@ static void *si_create_compute_state(
for (i = 0; i < program->num_kernels; i++) {
LLVMModuleRef mod = 
radeon_llvm_get_kernel_module(program->llvm_ctx, i,
code, 
header->num_bytes);
-   si_compile_llvm(sctx, &program->kernels[i], mod);
+   si_compile_llvm(sctx->screen, &program->kernels[i], mod);
LLVMDisposeModule(mod);
}
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index fbc94d2..7aa65c9 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2625,16 +2625,16 @@ static void preload_streamout_buffers(struct 
si_shader_context *si_shader_ctx)
}
 }
 
-int si_compile_llvm(struct si_context *sctx, struct si_shader *shader,
-   LLVMModuleRef mod)
+int si_compile_llvm(struct si_screen *sscreen, struct si_shader *shader,
+   LLVMModuleRef mod)
 {
unsigned r; /* llvm_compile result */
unsigned i;
unsigned char *ptr;
struct radeon_shader_binary binary;
-   bool dump = r600_can_dump_shader(&sctx->screen->b,
+   bool dump = r600_can_dump_shader(&sscreen->b,
shader->selector ? shader->selector->tokens : NULL);
-   const char * gpu_family = 
r600_get_llvm_processor_name(sctx->screen->b.family);
+   const char * gpu_family = 
r600_get_llvm_processor_name(sscreen->b.family);
unsigned code_size;
 
/* Use LLVM to compile shader */
@@ -2690,20 +2690,20 @@ int si_compile_llvm(struct si_context *sctx, struct 
si_shader *shader,
/* copy new shader */
code_size = binary.code_size + binary.rodata_size;
r600_resource_reference(&shader->bo, NULL);
-   shader->bo = si_resource_create_custom(sctx->b.b.screen, 
PIPE_USAGE_IMMUTABLE,
+   shader->bo = si_resource_create_custom(&sscreen->b.b, 
PIPE_USAGE_IMMUTABLE,
   code_size);
if (shader->bo == NULL) {
return -ENOMEM;
}
 
-   ptr = sctx->b.ws->buffer_map(shader->bo->cs_buf, sctx->b.rings.gfx.cs, 
PIPE_TRANSFER_WRITE);
+   ptr = sscreen->b.ws->buffer_map(shader->bo->cs_buf, NULL, 
PIPE_TRANSFER_WRITE);
util_memcpy_cpu_to_le32(ptr, binary.code, binary.code_size);
if (binary.rodata_size > 0) {
ptr += binary.code_size;
util_memcpy_cpu_to_le32(ptr, binary.rodata, binary.rodata_size);
}
 
-   sctx->b.ws->buffer_unmap(shader->bo->cs_buf);
+   sscreen->b.ws->buffer_unmap(shader->bo->cs_buf);
 
free(binary.code);
free(binary.config);
@@ -2713,7 +2713,7 @@ int si_compile_llvm(struct si_context *sctx, struct 
si_shader *shader,
 }
 
 /* Generate code for the hardware VS shader stage to go with a geometry shader 
*/
-static int si_generate_gs_copy_shader(struct si_context *sctx,
+static int si_generate_gs_copy_shader(struct si_screen *sscreen,
  struct si_shader_context *si_shader_ctx,
  bool dump)
 {
@@ -2792,7 +2792,7 @@ static int si_generate_gs_copy_shader(struct si_context 
*sctx,
if (dump)
fprintf(stderr, "Copy Vertex Shader for Geometry Shader:\n\n");
 
-   r = si_compile_llvm(sctx, si_shader_ctx->shader,
+   r = si_compile_llvm(sscreen, si_shader_ctx->shader,
bld_base->base.gallivm->module);
 
radeon_llvm_dispose(&si_shader_ctx->radeon_bld);
@@ -2801,18 +2801,15 @@ static int si_generate_gs_copy_shader(struct si_context 
*sctx,
return r;
 }
 
-int si_shader_create(
-   struct pipe_context *ctx,
-   struct si_shader *shader)
+int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
 {
-   struct si_context *sctx = (struct si_context*)ctx;
struct si_shader_selector *sel = shader->selector;
struct si_shader_context si_shader_ctx;
struct tgsi_shader_info shader_info;
struct lp_build_tgsi_context * bld_base;
LLVMModuleRef mod;
int r = 0;
-   bool dump = r600_can_dump_shader(&sctx->screen->b, sel->tokens);
+   bool dump = r600_can_dump_shader(&sscreen->b, sel->tokens);
 
/* Dump 

[Mesa-dev] [PATCH 08/15] radeonsi: disable gl_SampleMask fragment shader output if MSAA is disabled

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

This fixes piglit: arb_sample_shading-builtin-gl-sample-mask 0
---
 src/gallium/drivers/radeonsi/si_state.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 671e57b..7614bba 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -666,6 +666,8 @@ static void *si_create_rs_state(struct pipe_context *ctx,
 static void si_bind_rs_state(struct pipe_context *ctx, void *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
+   struct si_state_rasterizer *old_rs =
+   (struct si_state_rasterizer*)sctx->queued.named.rasterizer;
struct si_state_rasterizer *rs = (struct si_state_rasterizer *)state;
 
if (state == NULL)
@@ -676,6 +678,10 @@ static void si_bind_rs_state(struct pipe_context *ctx, 
void *state)
sctx->pa_sc_line_stipple = rs->pa_sc_line_stipple;
sctx->pa_su_sc_mode_cntl = rs->pa_su_sc_mode_cntl;
 
+   if (sctx->framebuffer.nr_samples > 1 &&
+   (!old_rs || old_rs->multisample_enable != rs->multisample_enable))
+   sctx->db_render_state.dirty = true;
+
si_pm4_bind_state(sctx, rasterizer, rs);
si_update_fb_rs_state(sctx);
 }
@@ -845,6 +851,8 @@ static void si_set_occlusion_query_state(struct 
pipe_context *ctx, bool enable)
 static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom 
*state)
 {
struct radeon_winsys_cs *cs = sctx->b.rings.gfx.cs;
+   struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
+   unsigned db_shader_control;
 
r600_write_context_reg_seq(cs, R_028000_DB_RENDER_CONTROL, 2);
 
@@ -897,10 +905,16 @@ static void si_emit_db_render_state(struct si_context 
*sctx, struct r600_atom *s
r600_write_context_reg(cs, R_028010_DB_RENDER_OVERRIDE2, 0);
}
 
+   db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
+   
S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
+   sctx->ps_db_shader_control;
+
+   /* Disable the gl_SampleMask fragment shader output if MSAA is 
disabled. */
+   if (sctx->framebuffer.nr_samples <= 1 || (rs && 
!rs->multisample_enable))
+   db_shader_control &= C_02880C_MASK_EXPORT_ENABLE;
+
r600_write_context_reg(cs, R_02880C_DB_SHADER_CONTROL,
-  S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
-  
S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
-  sctx->ps_db_shader_control);
+  db_shader_control);
 }
 
 /*
@@ -2012,6 +2026,7 @@ static void si_set_framebuffer_state(struct pipe_context 
*ctx,
 
if (sctx->framebuffer.nr_samples != old_nr_samples) {
sctx->msaa_config.dirty = true;
+   sctx->db_render_state.dirty = true;
 
/* Set sample locations as fragment shader constants. */
switch (sctx->framebuffer.nr_samples) {
-- 
1.9.1
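
The core of the change is a bit-clear on the DB_SHADER_CONTROL value plus a dirty check when the rasterizer state changes. A minimal sketch of both pieces in C follows. The bit position, macro names and function names are assumptions made for illustration (the real field layout comes from the radeonsi register headers), and the rasterizer pointer is reduced to two booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position for the mask-export enable field. */
#define MASK_EXPORT_ENABLE_BIT  (UINT32_C(1) << 8)
/* "Clear" mask in the style of the C_* register macros: all bits except the field. */
#define C_MASK_EXPORT_ENABLE    ((uint32_t)~MASK_EXPORT_ENABLE_BIT)

/* Drop the sample-mask export when MSAA is effectively off.
 * have_rs models "a rasterizer state is bound" (rs != NULL in the patch). */
static uint32_t
filter_db_shader_control(uint32_t value, unsigned nr_samples,
                         bool have_rs, bool multisample_enable)
{
   if (nr_samples <= 1 || (have_rs && !multisample_enable))
      value &= C_MASK_EXPORT_ENABLE;
   return value;
}

/* Decide whether binding a new rasterizer state must re-emit the DB state:
 * only when the framebuffer is multisampled and the enable bit changed
 * (or there was no previous rasterizer state to compare against). */
static bool
rs_bind_dirties_db_state(unsigned nr_samples, bool have_old_rs,
                         bool old_multisample_enable,
                         bool new_multisample_enable)
{
   return nr_samples > 1 &&
          (!have_old_rs || old_multisample_enable != new_multisample_enable);
}
```

Computing the register value into a local first, as the patch does, is what makes the conditional clear possible before the single register write.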



[Mesa-dev] [PATCH 15/15] radeonsi: properly destroy the GS copy shader and scratch_bo for compute

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

Cc: 10.2 10.3 
---
 src/gallium/drivers/radeonsi/si_shader.c | 4 
 src/gallium/drivers/radeonsi/si_state.c  | 7 ---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 7aa65c9..94db1dc 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2973,5 +2973,9 @@ out:
 
 void si_shader_destroy(struct pipe_context *ctx, struct si_shader *shader)
 {
+   if (shader->gs_copy_shader)
+   si_shader_destroy(ctx, shader->gs_copy_shader);
+
r600_resource_reference(&shader->bo, NULL);
+   r600_resource_reference(&shader->scratch_bo, NULL);
 }
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 2aa9aad..ed90f13 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2403,9 +2403,10 @@ static void si_delete_shader_selector(struct pipe_context *ctx,
 
while (p) {
c = p->next_variant;
-   if (sel->type == PIPE_SHADER_GEOMETRY)
+   if (sel->type == PIPE_SHADER_GEOMETRY) {
si_pm4_delete_state(sctx, gs, p->pm4);
-   else if (sel->type == PIPE_SHADER_FRAGMENT)
+   si_pm4_delete_state(sctx, vs, p->gs_copy_shader->pm4);
+   } else if (sel->type == PIPE_SHADER_FRAGMENT)
si_pm4_delete_state(sctx, ps, p->pm4);
else if (p->key.vs.as_es)
si_pm4_delete_state(sctx, es, p->pm4);
@@ -2418,7 +2419,7 @@ static void si_delete_shader_selector(struct pipe_context *ctx,
 
free(sel->tokens);
free(sel);
- }
+}
 
 static void si_delete_vs_shader(struct pipe_context *ctx, void *state)
 {
-- 
1.9.1
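
The cleanup order in si_shader_destroy above can be sketched in isolation as follows. This is only an illustration in C: ex_resource and ex_resource_reference are hypothetical stand-ins for r600_resource and r600_resource_reference, and the sketch only counts references instead of freeing any buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted buffer. */
struct ex_resource {
    int refcount;
};

/* Point *dst at src and adjust both reference counts; src may be NULL. */
static void ex_resource_reference(struct ex_resource **dst,
                                  struct ex_resource *src)
{
    if (src)
        src->refcount++;
    if (*dst) {
        assert((*dst)->refcount > 0);
        (*dst)->refcount--;   /* a real driver would free the buffer at zero */
    }
    *dst = src;
}

struct ex_shader {
    struct ex_resource *bo;
    struct ex_resource *scratch_bo;
    struct ex_shader *gs_copy_shader;   /* only set for geometry shaders */
};

/* Release the nested copy shader first, then this shader's own buffers. */
static void ex_shader_destroy(struct ex_shader *shader)
{
    if (shader->gs_copy_shader)
        ex_shader_destroy(shader->gs_copy_shader);

    ex_resource_reference(&shader->bo, NULL);
    ex_resource_reference(&shader->scratch_bo, NULL);
}
```

Destroying a geometry shader in this sketch drops the references held by both the shader and its copy shader, which is the leak the patch title describes. Passing NULL for an unset scratch buffer is harmless because the helper checks the old pointer before touching it.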



[Mesa-dev] [PATCH 00/15] RadeonSI: Random improvements

2014-09-19 Thread Marek Olšák
Patch 1: Documentation.

  radeonsi: document what si_descriptors.c does

Patches 2-8: Improvements and cleanups for DB registers and MSAA.

  radeonsi: remove unused variable si_pipe_shader::sprite_coord_enable
  radeonsi: move DB registers from draw_vbo into new db_render_state
  radeonsi: remove shader.ps_conservative_z, set db_shader_control instead
  radeonsi: set KILL_ENABLE during shader compilation, remove uses_kill flag
  radeonsi: move DB_SHADER_CONTROL into db_render_state
  radeonsi: only update MSAA-specific framebuffer state if nr_samples is changed
  radeonsi: disable gl_SampleMask fragment shader output if MSAA is disabled

Patches 9-10: Renaming stuff and simplification.

  radeonsi: merge si_pipe_shader into si_shader
  radeonsi: shorten si_pipe_* prefixes to si_*

Patches 11-15: Geometry shader fixes.

  radeonsi: don't snoop currently-bound GS shader when compiling ES
  radeonsi: don't pass the context to the shader translator
  radeonsi: don't use pipe_constant_buffer for GS rings
  radeonsi: release GS rings at context destruction
  radeonsi: properly destroy the GS copy shader and scratch_bo for compute

Please review.

Marek


[Mesa-dev] [PATCH 01/15] radeonsi: document what si_descriptors.c does

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_descriptors.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index 792d2c3..2543052 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -23,6 +23,17 @@
  * Authors:
  *  Marek Olšák 
  */
+
+/* Resource binding slots and sampler states (each described with 8 or 4 dwords)
+ * live in memory on SI.
+ *
+ * This file is responsible for managing lists of resources and sampler states
+ * in memory and binding them, which means updating those structures in memory.
+ *
+ * There is also code for updating shader pointers to resources and sampler
+ * states. CP DMA functions are here too.
+ */
+
 #include "radeon/r600_cs.h"
 #include "si_pipe.h"
 #include "si_shader.h"
-- 
1.9.1



[Mesa-dev] [PATCH 07/15] radeonsi: only update MSAA-specific framebuffer state if nr_samples is changed

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 50 ++---
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index b83b930..671e57b 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1943,6 +1943,7 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
struct r600_surface *surf = NULL;
struct r600_texture *rtex;
bool old_cb0_is_integer = sctx->framebuffer.cb0_is_integer;
+   unsigned old_nr_samples = sctx->framebuffer.nr_samples;
int i;
 
if (sctx->framebuffer.state.nr_cbufs) {
@@ -2008,31 +2009,34 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
sctx->framebuffer.atom.num_dw += 3; /* WINDOW_SCISSOR_BR */
sctx->framebuffer.atom.num_dw += 18; /* MSAA sample locations */
sctx->framebuffer.atom.dirty = true;
-   sctx->msaa_config.dirty = true;
 
-   /* Set sample locations as fragment shader constants. */
-   switch (sctx->framebuffer.nr_samples) {
-   case 1:
-   constbuf.user_buffer = sctx->b.sample_locations_1x;
-   break;
-   case 2:
-   constbuf.user_buffer = sctx->b.sample_locations_2x;
-   break;
-   case 4:
-   constbuf.user_buffer = sctx->b.sample_locations_4x;
-   break;
-   case 8:
-   constbuf.user_buffer = sctx->b.sample_locations_8x;
-   break;
-   case 16:
-   constbuf.user_buffer = sctx->b.sample_locations_16x;
-   break;
-   default:
-   assert(0);
+   if (sctx->framebuffer.nr_samples != old_nr_samples) {
+   sctx->msaa_config.dirty = true;
+
+   /* Set sample locations as fragment shader constants. */
+   switch (sctx->framebuffer.nr_samples) {
+   case 1:
+   constbuf.user_buffer = sctx->b.sample_locations_1x;
+   break;
+   case 2:
+   constbuf.user_buffer = sctx->b.sample_locations_2x;
+   break;
+   case 4:
+   constbuf.user_buffer = sctx->b.sample_locations_4x;
+   break;
+   case 8:
+   constbuf.user_buffer = sctx->b.sample_locations_8x;
+   break;
+   case 16:
+   constbuf.user_buffer = sctx->b.sample_locations_16x;
+   break;
+   default:
+   assert(0);
+   }
+   constbuf.buffer_size = sctx->framebuffer.nr_samples * 2 * 4;
+   ctx->set_constant_buffer(ctx, PIPE_SHADER_FRAGMENT,
+SI_DRIVER_STATE_CONST_BUF, &constbuf);
}
-   constbuf.buffer_size = sctx->framebuffer.nr_samples * 2 * 4;
-   ctx->set_constant_buffer(ctx, PIPE_SHADER_FRAGMENT,
-SI_DRIVER_STATE_CONST_BUF, &constbuf);
 }
 
 static void si_emit_framebuffer_state(struct si_context *sctx, struct r600_atom *atom)
-- 
1.9.1
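
The change-detection idea in this patch (remember the old sample count, and redo the MSAA-specific work only when it differs) can be sketched on its own. This is a hedged illustration in C: ex_fb_state and its fields are hypothetical, and the "upload" is reduced to a counter and a size computed with the same nr_samples * 2 * 4 arithmetic as the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified framebuffer bookkeeping. */
struct ex_fb_state {
    unsigned nr_samples;
    bool msaa_config_dirty;
    size_t constbuf_size;   /* size in bytes of the last sample-location upload */
    unsigned uploads;       /* how many times the constants were re-set */
};

static void ex_set_nr_samples(struct ex_fb_state *fb, unsigned nr_samples)
{
    unsigned old_nr_samples = fb->nr_samples;

    fb->nr_samples = nr_samples;
    if (fb->nr_samples == old_nr_samples)
        return;   /* nothing MSAA-specific changed; skip the extra work */

    /* Same set of sample counts as the switch in the patch. */
    switch (nr_samples) {
    case 1: case 2: case 4: case 8: case 16:
        break;
    default:
        assert(0);
    }

    fb->msaa_config_dirty = true;
    fb->constbuf_size = (size_t)nr_samples * 2 * 4;
    fb->uploads++;
}
```

Calling ex_set_nr_samples twice in a row with the same count leaves the upload counter unchanged on the second call, which is the saving the patch is after.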



[Mesa-dev] [PATCH 04/15] radeonsi: remove shader.ps_conservative_z, set db_shader_control instead

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

Set the field on SI too; it is not specific to CIK.
---
 src/gallium/drivers/radeonsi/si_shader.c | 7 ---
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 4 
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 0a5ed96..19dc9ca 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2818,17 +2818,18 @@ int si_pipe_shader_create(
 
si_shader_ctx.radeon_bld.load_input = declare_input_fs;
bld_base->emit_epilogue = si_llvm_emit_fs_epilogue;
-   shader->shader.ps_conservative_z = V_02880C_EXPORT_ANY_Z;
 
for (i = 0; i < shader_info.num_properties; i++) {
switch (shader_info.properties[i].name) {
case TGSI_PROPERTY_FS_DEPTH_LAYOUT:
switch (shader_info.properties[i].data[0]) {
case TGSI_FS_DEPTH_LAYOUT_GREATER:
-   shader->shader.ps_conservative_z = V_02880C_EXPORT_GREATER_THAN_Z;
+   shader->db_shader_control |=
+       S_02880C_CONSERVATIVE_Z_EXPORT(V_02880C_EXPORT_GREATER_THAN_Z);
break;
case TGSI_FS_DEPTH_LAYOUT_LESS:
-   shader->shader.ps_conservative_z = V_02880C_EXPORT_LESS_THAN_Z;
+   shader->db_shader_control |=
+       S_02880C_CONSERVATIVE_Z_EXPORT(V_02880C_EXPORT_LESS_THAN_Z);
break;
}
break;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index df7dbb0..e07d872 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -140,7 +140,6 @@ struct si_shader {
unsigned gs_input_prim;
unsigned gs_output_prim;
unsigned gs_max_out_vertices;
-   unsigned ps_conservative_z;
 
unsigned nparam;
bool uses_kill;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index fb1ddc0..37dc40b 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -269,10 +269,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
if (shader->shader.uses_kill || shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
db_shader_control |= S_02880C_KILL_ENABLE(1);
 
-   if (sctx->b.chip_class >= CIK)
-   db_shader_control |=
-   S_02880C_CONSERVATIVE_Z_EXPORT(shader->shader.ps_conservative_z);
-
spi_ps_in_control = S_0286D8_NUM_INTERP(shader->shader.nparam) |
S_0286D8_BC_OPTIMIZE_DISABLE(1);
 
-- 
1.9.1
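
As a rough illustration of what the patch moves into shader compilation, the sketch below packs a conservative-Z field into a control word in C. The encodings, the shift, and all ex_ names are hypothetical; the real values are defined in the driver's register header and are not shown in the patch. The sketch also assumes that "export any Z" encodes as 0, which is what would let the patch drop the old default assignment without replacing it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field encoding, for illustration only. */
#define EX_EXPORT_ANY_Z           0u
#define EX_EXPORT_LESS_THAN_Z     1u
#define EX_EXPORT_GREATER_THAN_Z  2u
#define EX_CONSERVATIVE_Z_SHIFT   13
#define EX_CONSERVATIVE_Z_EXPORT(x) \
    (((uint32_t)(x) & 0x3u) << EX_CONSERVATIVE_Z_SHIFT)

enum ex_depth_layout {
    EX_DEPTH_LAYOUT_NONE,
    EX_DEPTH_LAYOUT_GREATER,
    EX_DEPTH_LAYOUT_LESS
};

/* OR the field for the given depth layout into a control word. */
static uint32_t ex_apply_depth_layout(uint32_t db_shader_control,
                                      enum ex_depth_layout layout)
{
    /* |= can only add bits, so the field has to start out clear. */
    assert((db_shader_control & EX_CONSERVATIVE_Z_EXPORT(0x3u)) == 0);

    switch (layout) {
    case EX_DEPTH_LAYOUT_GREATER:
        db_shader_control |= EX_CONSERVATIVE_Z_EXPORT(EX_EXPORT_GREATER_THAN_Z);
        break;
    case EX_DEPTH_LAYOUT_LESS:
        db_shader_control |= EX_CONSERVATIVE_Z_EXPORT(EX_EXPORT_LESS_THAN_Z);
        break;
    default:
        /* Nothing to set: "any Z" is assumed to encode as 0. */
        break;
    }
    return db_shader_control;
}
```

Because the field is ORed in rather than assigned, bits already present in the control word are preserved, which matters once other compile-time bits share the same word.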



[Mesa-dev] [PATCH 05/15] radeonsi: set KILL_ENABLE during shader compilation, remove uses_kill flag

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 6 +-
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 3 ---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 19dc9ca..5893531 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -774,6 +774,8 @@ static void si_alpha_test(struct lp_build_tgsi_context *bld_base,
LLVMVoidTypeInContext(gallivm->context),
NULL, 0, 0);
}
+
+   si_shader_ctx->shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
 }
 
 static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
@@ -2751,7 +2753,9 @@ int si_pipe_shader_create(
 
tgsi_scan_shader(sel->tokens, &shader_info);
 
-   shader->shader.uses_kill = shader_info.uses_kill;
+   if (shader_info.uses_kill)
+   shader->db_shader_control |= S_02880C_KILL_ENABLE(1);
+
shader->shader.uses_instanceid = shader_info.uses_instanceid;
bld_base->info = &shader_info;
bld_base->emit_fetch_funcs[TGSI_FILE_CONSTANT] = fetch_constant;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index e07d872..559e4e2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -142,7 +142,6 @@ struct si_shader {
unsigned gs_max_out_vertices;
 
unsigned nparam;
-   bool uses_kill;
bool uses_instanceid;
bool fs_write_all;
bool vs_out_misc_write;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 37dc40b..28e92fc 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -266,9 +266,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, struct si_pipe_shader *s
 
db_shader_control |= shader->db_shader_control;
 
-   if (shader->shader.uses_kill || shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
-   db_shader_control |= S_02880C_KILL_ENABLE(1);
-
spi_ps_in_control = S_0286D8_NUM_INTERP(shader->shader.nparam) |
S_0286D8_BC_OPTIMIZE_DISABLE(1);
 
-- 
1.9.1
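
The effect of this patch is that the kill bit is decided once per compiled shader variant instead of on every state update. A minimal sketch of that split in C follows. The ex_ names and the bit position are hypothetical, and the sketch assumes the alpha-test code path is only emitted for variants that actually have an alpha test, which the hunk above does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_KILL_ENABLE (1u << 6)   /* hypothetical bit position */

/* Hypothetical per-variant state filled in at compile time. */
struct ex_ps_variant {
    uint32_t db_shader_control;
};

/* Compile step: record every reason the shader may discard pixels. */
static void ex_compile_ps(struct ex_ps_variant *v, bool uses_kill,
                          bool emits_alpha_test)
{
    assert(v != NULL);
    v->db_shader_control = 0;

    if (uses_kill)
        v->db_shader_control |= EX_KILL_ENABLE;
    if (emits_alpha_test)   /* assumed: only emitted when an alpha test exists */
        v->db_shader_control |= EX_KILL_ENABLE;
}

/* State-update step: no shader inspection left, just merge the stored bits. */
static uint32_t ex_draw_time_control(uint32_t base,
                                     const struct ex_ps_variant *v)
{
    return base | v->db_shader_control;
}
```

The state-update function no longer needs to look at the shader key or the scan results, which is what allows the later patches in the series to move DB_SHADER_CONTROL out of the per-shader state.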



[Mesa-dev] [PATCH 03/15] radeonsi: move DB registers from draw_vbo into new db_render_state

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

It's called db_misc_state in r600g.
---
 src/gallium/drivers/radeonsi/si_blit.c   |  6 +++
 src/gallium/drivers/radeonsi/si_hw_context.c |  1 +
 src/gallium/drivers/radeonsi/si_pipe.h   | 16 +++---
 src/gallium/drivers/radeonsi/si_state.c  | 73 +---
 src/gallium/drivers/radeonsi/si_state_draw.c | 52 
 5 files changed, 82 insertions(+), 66 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c
index 9f95a8a..4744154 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -146,6 +146,7 @@ static void si_blit_decompress_depth(struct pipe_context *ctx,
struct pipe_surface *zsurf, *cbsurf, surf_tmpl;
 
sctx->dbcb_copy_sample = sample;
+   sctx->db_render_state.dirty = true;
 
surf_tmpl.format = texture->resource.b.b.format;
surf_tmpl.u.tex.level = level;
@@ -179,6 +180,7 @@ static void si_blit_decompress_depth(struct pipe_context *ctx,
 
sctx->dbcb_depth_copy_enabled = false;
sctx->dbcb_stencil_copy_enabled = false;
+   sctx->db_render_state.dirty = true;
 }
 
 static void si_blit_decompress_depth_in_place(struct si_context *sctx,
@@ -190,6 +192,7 @@ static void si_blit_decompress_depth_in_place(struct si_context *sctx,
unsigned layer, max_layer, checked_last_layer, level;
 
sctx->db_inplace_flush_enabled = true;
+   sctx->db_render_state.dirty = true;
 
surf_tmpl.format = texture->resource.b.b.format;
 
@@ -227,6 +230,7 @@ static void si_blit_decompress_depth_in_place(struct si_context *sctx,
}
 
sctx->db_inplace_flush_enabled = false;
+   sctx->db_render_state.dirty = true;
 }
 
 void si_flush_depth_textures(struct si_context *sctx,
@@ -372,6 +376,7 @@ static void si_clear(struct pipe_context *ctx, unsigned buffers,
zstex->depth_clear_value = depth;
sctx->framebuffer.atom.dirty = true; /* updates DB_DEPTH_CLEAR */
sctx->db_depth_clear = true;
+   sctx->db_render_state.dirty = true;
}
 
si_blitter_begin(ctx, SI_CLEAR);
@@ -384,6 +389,7 @@ static void si_clear(struct pipe_context *ctx, unsigned buffers,
sctx->db_depth_clear = false;
sctx->db_depth_disable_expclear = false;
zstex->depth_cleared = true;
+   sctx->db_render_state.dirty = true;
}
 }
 
diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c b/src/gallium/drivers/radeonsi/si_hw_context.c
index bd8409b..eaefa6a 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -161,6 +161,7 @@ void si_begin_new_cs(struct si_context *ctx)
 
ctx->framebuffer.atom.dirty = true;
ctx->msaa_config.dirty = true;
+   ctx->db_render_state.dirty = true;
ctx->b.streamout.enable_atom.dirty = true;
si_all_descriptors_begin_new_cs(ctx);
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 6ec8d5d..df81e1f 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -106,6 +106,7 @@ struct si_context {
struct r600_atom *streamout_begin;
struct r600_atom *streamout_enable; /* must be after streamout_begin */
struct r600_atom *framebuffer;
+   struct r600_atom *db_render_state;
struct r600_atom *msaa_config;
} s;
struct r600_atom *array[0];
@@ -159,13 +160,14 @@ struct si_context {
union si_state  queued;
union si_state  emitted;
 
-   /* Additional DB state. */
-   bool dbcb_depth_copy_enabled;
-   bool dbcb_stencil_copy_enabled;
-   unsigned dbcb_copy_sample;
-   bool db_inplace_flush_enabled;
-   bool db_depth_clear;
-   bool db_depth_disable_expclear;
+   /* DB render state. */
+   struct r600_atom db_render_state;
+   bool dbcb_depth_copy_enabled;
+   bool dbcb_stencil_copy_enabled;
+   unsigned dbcb_copy_sample;
+   bool db_inplace_flush_enabled;
+   bool db_depth_clear;
+   bool db_depth_disable_expclear;
 };
 
 /* si_blit.c */
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 1d6ae86..c66eac9 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -833,6 +833,71 @@ static void *si_create_db_flush_dsa(struct si_context *sctx)
return sctx->b.b.create_depth_stencil_alpha_state(&sctx->b.b, &dsa);
 }
 
+/* DB RENDER ST

[Mesa-dev] [PATCH 14/15] radeonsi: release GS rings at context destruction

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 4f9c876..2cce5cc 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -38,6 +38,8 @@ static void si_destroy_context(struct pipe_context *context)
 
si_release_all_descriptors(sctx);
 
+   pipe_resource_reference(&sctx->esgs_ring, NULL);
+   pipe_resource_reference(&sctx->gsvs_ring, NULL);
pipe_resource_reference(&sctx->null_const_buf.buffer, NULL);
r600_resource_reference(&sctx->border_color_table, NULL);
 
-- 
1.9.1



[Mesa-dev] [PATCH 10/15] radeonsi: shorten si_pipe_* prefixes to si_*

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

This was the original naming convention in r600g and it somehow crept
into radeonsi.
---
 src/gallium/drivers/radeonsi/si_compute.c | 15 ++---
 src/gallium/drivers/radeonsi/si_descriptors.c | 14 ++--
 src/gallium/drivers/radeonsi/si_pipe.h| 15 +++--
 src/gallium/drivers/radeonsi/si_shader.c  |  6 ++---
 src/gallium/drivers/radeonsi/si_shader.h  | 10 -
 src/gallium/drivers/radeonsi/si_state.c   | 32 +--
 src/gallium/drivers/radeonsi/si_state.h   |  4 ++--
 src/gallium/drivers/radeonsi/si_state_draw.c  | 19 
 8 files changed, 57 insertions(+), 58 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 049f6c2..9088268 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -38,7 +38,7 @@
 #define NUM_USER_SGPRS 4
 #endif
 
-struct si_pipe_compute {
+struct si_compute {
struct si_context *ctx;
 
unsigned local_size;
@@ -59,8 +59,7 @@ static void *si_create_compute_state(
const struct pipe_compute_state *cso)
 {
struct si_context *sctx = (struct si_context *)ctx;
-   struct si_pipe_compute *program =
-   CALLOC_STRUCT(si_pipe_compute);
+   struct si_compute *program = CALLOC_STRUCT(si_compute);
const struct pipe_llvm_program_header *header;
const unsigned char *code;
unsigned i;
@@ -95,7 +94,7 @@ static void *si_create_compute_state(
 static void si_bind_compute_state(struct pipe_context *ctx, void *state)
 {
struct si_context *sctx = (struct si_context*)ctx;
-   sctx->cs_shader_state.program = (struct si_pipe_compute*)state;
+   sctx->cs_shader_state.program = (struct si_compute*)state;
 }
 
 static void si_set_global_binding(
@@ -105,7 +104,7 @@ static void si_set_global_binding(
 {
unsigned i;
struct si_context *sctx = (struct si_context*)ctx;
-   struct si_pipe_compute *program = sctx->cs_shader_state.program;
+   struct si_compute *program = sctx->cs_shader_state.program;
 
if (!resources) {
for (i = first; i < first + n; i++) {
@@ -169,7 +168,7 @@ static void si_launch_grid(
uint32_t pc, const void *input)
 {
struct si_context *sctx = (struct si_context*)ctx;
-   struct si_pipe_compute *program = sctx->cs_shader_state.program;
+   struct si_compute *program = sctx->cs_shader_state.program;
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
struct r600_resource *input_buffer = program->input_buffer;
unsigned kernel_args_size;
@@ -383,7 +382,7 @@ static void si_launch_grid(
 
 
 static void si_delete_compute_state(struct pipe_context *ctx, void* state){
-   struct si_pipe_compute *program = (struct si_pipe_compute *)state;
+   struct si_compute *program = (struct si_compute *)state;
 
if (!state) {
return;
@@ -392,7 +391,7 @@ static void si_delete_compute_state(struct pipe_context *ctx, void* state){
if (program->kernels) {
for (int i = 0; i < program->num_kernels; i++){
if (program->kernels[i].bo){
-   si_pipe_shader_destroy(ctx, &program->kernels[i]);
+   si_shader_destroy(ctx, &program->kernels[i]);
}
}
FREE(program->kernels);
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index 2543052..a0780cd 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -330,8 +330,8 @@ static void si_sampler_views_begin_new_cs(struct si_context *sctx,
/* Add relocations to the CS. */
while (mask) {
int i = u_bit_scan(&mask);
-   struct si_pipe_sampler_view *rview =
-   (struct si_pipe_sampler_view*)views->views[i];
+   struct si_sampler_view *rview =
+   (struct si_sampler_view*)views->views[i];
 
r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx,
  rview->resource, RADEON_USAGE_READ,
@@ -354,8 +354,8 @@ static void si_set_sampler_view(struct si_context *sctx, unsigned shader,
return;
 
if (view) {
-   struct si_pipe_sampler_view *rview =
-   (struct si_pipe_sampler_view*)view;
+   struct si_sampler_view *rview =
+   (struct si_sampler_view*)view;
 
r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx,
  rview->resource, RADEON_USAGE_READ,
@@ -380,7 +380,7 @@ static void si_set_sampler_views(struct pipe_context *ctx,
 {
struct si_context *sctx = (struct si_context *)ctx;
  

[Mesa-dev] [PATCH 09/15] radeonsi: merge si_pipe_shader into si_shader

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

One is part of the other anyway.
---
 src/gallium/drivers/radeonsi/si_compute.c|  6 +--
 src/gallium/drivers/radeonsi/si_shader.c | 56 +++
 src/gallium/drivers/radeonsi/si_shader.h | 68 ++--
 src/gallium/drivers/radeonsi/si_state.c  |  8 ++--
 src/gallium/drivers/radeonsi/si_state_draw.c | 44 +-
 5 files changed, 90 insertions(+), 92 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index fc842d4..049f6c2 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -45,7 +45,7 @@ struct si_pipe_compute {
unsigned private_size;
unsigned input_size;
unsigned num_kernels;
-   struct si_pipe_shader *kernels;
+   struct si_shader *kernels;
unsigned num_user_sgprs;
 
struct r600_resource *input_buffer;
@@ -77,7 +77,7 @@ static void *si_create_compute_state(
 
program->num_kernels = radeon_llvm_get_num_kernels(program->llvm_ctx, code,
header->num_bytes);
-   program->kernels = CALLOC(sizeof(struct si_pipe_shader),
+   program->kernels = CALLOC(sizeof(struct si_shader),
program->num_kernels);
for (i = 0; i < program->num_kernels; i++) {
LLVMModuleRef mod = radeon_llvm_get_kernel_module(program->llvm_ctx, i,
@@ -181,7 +181,7 @@ static void si_launch_grid(
uint64_t shader_va;
unsigned arg_user_sgpr_count = NUM_USER_SGPRS;
unsigned i;
-   struct si_pipe_shader *shader = &program->kernels[pc];
+   struct si_shader *shader = &program->kernels[pc];
unsigned lds_blocks;
unsigned num_waves_for_scratch;
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 5893531..9b70a35 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -59,7 +59,7 @@ struct si_shader_context
struct radeon_llvm_context radeon_bld;
struct tgsi_parse_context parse;
struct tgsi_token * tokens;
-   struct si_pipe_shader *shader;
+   struct si_shader *shader;
struct si_shader *gs_for_vs;
unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */
int param_streamout_config;
@@ -220,7 +220,7 @@ static void declare_input_vs(
 
if (divisor) {
/* Build index from instance ID, start instance and divisor */
-   si_shader_ctx->shader->shader.uses_instanceid = true;
+   si_shader_ctx->shader->uses_instanceid = true;
buffer_index = get_instance_index_for_fetch(&si_shader_ctx->radeon_bld, divisor);
} else {
/* Load the buffer index for vertices. */
@@ -257,7 +257,7 @@ static void declare_input_gs(
 {
struct si_shader_context *si_shader_ctx =
si_shader_context(&radeon_bld->soa.bld_base);
-   struct si_shader *shader = &si_shader_ctx->shader->shader;
+   struct si_shader *shader = si_shader_ctx->shader;
 
si_store_shader_io_attribs(shader, decl);
 
@@ -273,7 +273,7 @@ static LLVMValueRef fetch_input_gs(
 {
struct lp_build_context *base = &bld_base->base;
struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
-   struct si_shader *shader = &si_shader_ctx->shader->shader;
+   struct si_shader *shader = si_shader_ctx->shader;
struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
struct gallivm_state *gallivm = base->gallivm;
LLVMTypeRef i32 = LLVMInt32TypeInContext(gallivm->context);
@@ -352,7 +352,7 @@ static void declare_input_fs(
struct lp_build_context *base = &radeon_bld->soa.bld_base.base;
struct si_shader_context *si_shader_ctx =
si_shader_context(&radeon_bld->soa.bld_base);
-   struct si_shader *shader = &si_shader_ctx->shader->shader;
+   struct si_shader *shader = si_shader_ctx->shader;
struct lp_build_context *uint = &radeon_bld->soa.bld_base.uint_bld;
struct gallivm_state *gallivm = base->gallivm;
LLVMTypeRef input_type = LLVMFloatTypeInContext(gallivm->context);
@@ -782,7 +782,7 @@ static void si_llvm_emit_clipvertex(struct lp_build_tgsi_context * bld_base,
LLVMValueRef (*pos)[9], LLVMValueRef *out_elts)
 {
struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
-   struct si_pipe_shader *shader = si_shader_ctx->shader;
+   struct si_shader *shader = si_shader_ctx->shader;
struct lp_build_context *base = &bld_base->base;
struct lp_build_context *uint = &si_shader_ctx->radeon_bld.soa.bld_base.uint_bld;
unsigned reg_index;
@@ -799,7 +799,7 @@ static void si_llvm_emit_clipvertex

[Mesa-dev] [PATCH 13/15] radeonsi: don't use pipe_constant_buffer for GS rings

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_descriptors.c | 10 -
 src/gallium/drivers/radeonsi/si_pipe.h|  4 ++--
 src/gallium/drivers/radeonsi/si_state.h   |  2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c  | 32 ---
 4 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index a0780cd..fc535d0 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -732,7 +732,7 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint s
 /* RING BUFFERS */
 
 void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
-   struct pipe_constant_buffer *input,
+   struct pipe_resource *buffer,
unsigned stride, unsigned num_records,
bool add_tid, bool swizzle,
unsigned element_size, unsigned index_stride)
@@ -749,10 +749,10 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
assert(slot < buffers->num_buffers);
pipe_resource_reference(&buffers->buffers[slot], NULL);
 
-   if (input && input->buffer) {
+   if (buffer) {
uint64_t va;
 
-   va = r600_resource(input->buffer)->gpu_address;
+   va = r600_resource(buffer)->gpu_address;
 
switch (element_size) {
default:
@@ -807,9 +807,9 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
  S_008F0C_INDEX_STRIDE(index_stride) |
  S_008F0C_ADD_TID_ENABLE(add_tid);
 
-   pipe_resource_reference(&buffers->buffers[slot], input->buffer);
+   pipe_resource_reference(&buffers->buffers[slot], buffer);
r600_context_bo_reloc(&sctx->b, &sctx->b.rings.gfx,
- (struct r600_resource*)input->buffer,
+ (struct r600_resource*)buffer,
  buffers->shader_usage, buffers->priority);
buffers->desc.enabled_mask |= 1 << slot;
} else {
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 4b223b4..5f5404d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -154,8 +154,8 @@ struct si_context {
struct si_pm4_state *gs_rings;
struct r600_atom cache_flush;
struct pipe_constant_buffer null_const_buf; /* used for set_constant_buffer(NULL) on CIK */
-   struct pipe_constant_buffer esgs_ring;
-   struct pipe_constant_buffer gsvs_ring;
+   struct pipe_resource *esgs_ring;
+   struct pipe_resource *gsvs_ring;
 
/* SI state handling */
union si_state  queued;
diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h
index a5c6720..d3a745a 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -234,7 +234,7 @@ void si_set_sampler_descriptors(struct si_context *sctx, unsigned shader,
unsigned start, unsigned count, void **states);
 void si_update_vertex_buffers(struct si_context *sctx);
 void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot,
-   struct pipe_constant_buffer *input,
+   struct pipe_resource *buffer,
unsigned stride, unsigned num_records,
bool add_tid, bool swizzle,
unsigned element_size, unsigned index_stride);
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index a9dedf9..61951ee 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -550,42 +550,38 @@ bcolor:
 /* Initialize state related to ESGS / GSVS ring buffers */
 static void si_init_gs_rings(struct si_context *sctx)
 {
-   unsigned size = 128 * 1024;
+   unsigned esgs_ring_size = 128 * 1024;
+   unsigned gsvs_ring_size = 64 * 1024 * 1024;
 
assert(!sctx->gs_rings);
sctx->gs_rings = si_pm4_alloc_state(sctx);
 
-   sctx->esgs_ring.buffer =
-   pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
-  PIPE_USAGE_DEFAULT, size);
-   sctx->esgs_ring.buffer_size = size;
+   sctx->esgs_ring = pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
+  PIPE_USAGE_DEFAULT, esgs_ring_size);
 
-   size = 64 * 1024 * 1024;
-   sctx->gsvs_ring.buffer =
-   pipe_buffer_create(sctx->b.b.screen, PIPE_BIND_CUSTOM,
-  PIPE_USAGE_DEFAULT, size);
-

[Mesa-dev] [PATCH 06/15] radeonsi: move DB_SHADER_CONTROL into db_render_state

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

I will need this for fixing sample shading with 1 sample.

The good news is that all shader pm4 states no longer use the current context
state, so we can generate the pm4 states outside of draw_vbo if needed.
---
 src/gallium/drivers/radeonsi/si_pipe.h   |  1 +
 src/gallium/drivers/radeonsi/si_shader.h |  1 -
 src/gallium/drivers/radeonsi/si_state.c  | 11 ++-
 src/gallium/drivers/radeonsi/si_state_draw.c | 18 +++---
 4 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index df81e1f..00d03be 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -168,6 +168,7 @@ struct si_context {
bool db_inplace_flush_enabled;
bool db_depth_clear;
bool db_depth_disable_expclear;
+   unsigned ps_db_shader_control;
 };
 
 /* si_blit.c */
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 559e4e2..9c6b238 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -186,7 +186,6 @@ struct si_pipe_shader {
unsigned spi_shader_z_format;
unsigned db_shader_control;
unsigned cb_shader_mask;
-   bool cb0_is_integer;
union si_shader_key key;
 };
 
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index c66eac9..b83b930 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -896,6 +896,11 @@ static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *s
} else {
r600_write_context_reg(cs, R_028010_DB_RENDER_OVERRIDE2, 0);
}
+
+   r600_write_context_reg(cs, R_02880C_DB_SHADER_CONTROL,
+  S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
+  S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
+  sctx->ps_db_shader_control);
 }
 
 /*
@@ -1937,6 +1942,7 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
struct pipe_constant_buffer constbuf = {0};
struct r600_surface *surf = NULL;
struct r600_texture *rtex;
+   bool old_cb0_is_integer = sctx->framebuffer.cb0_is_integer;
int i;
 
if (sctx->framebuffer.state.nr_cbufs) {
@@ -1957,6 +1963,9 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
sctx->framebuffer.cb0_is_integer = state->nr_cbufs && state->cbufs[0] &&
  util_format_is_pure_integer(state->cbufs[0]->format);
 
+   if (sctx->framebuffer.cb0_is_integer != old_cb0_is_integer)
+   sctx->db_render_state.dirty = true;
+
for (i = 0; i < state->nr_cbufs; i++) {
if (!state->cbufs[i])
continue;
@@ -2983,7 +2992,7 @@ static void si_need_gfx_cs_space(struct pipe_context 
*ctx, unsigned num_dw,
 void si_init_state_functions(struct si_context *sctx)
 {
si_init_atom(&sctx->framebuffer.atom, &sctx->atoms.s.framebuffer, 
si_emit_framebuffer_state, 0);
-   si_init_atom(&sctx->db_render_state, &sctx->atoms.s.db_render_state, 
si_emit_db_render_state, 7);
+   si_init_atom(&sctx->db_render_state, &sctx->atoms.s.db_render_state, 
si_emit_db_render_state, 10);
 
sctx->b.b.create_blend_state = si_create_blend_state;
sctx->b.b.bind_blend_state = si_bind_blend_state;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 28e92fc..54f2fd9 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -231,7 +231,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
struct si_pipe_shader *s
 {
struct si_context *sctx = (struct si_context *)ctx;
struct si_pm4_state *pm4;
-   unsigned i, spi_ps_in_control, db_shader_control;
+   unsigned i, spi_ps_in_control;
unsigned num_sgprs, num_user_sgprs;
unsigned spi_baryc_cntl = 0, spi_ps_input_ena;
uint64_t va;
@@ -242,9 +242,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
struct si_pipe_shader *s
if (pm4 == NULL)
return;
 
-   db_shader_control = S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z) |
-   
S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer);
-
for (i = 0; i < shader->shader.ninput; i++) {
switch (shader->shader.input[i].name) {
case TGSI_SEMANTIC_POSITION:
@@ -264,8 +261,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
struct si_pipe_shader *s
 

[Mesa-dev] [PATCH 11/15] radeonsi: don't snoop currently-bound GS shader when compiling ES

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

Instead, pass the layout of GS inputs in memory to the ES using the shader
key. Only 64 bits are needed to represent the layout in the key.

Mixing and matching different VS and GS shaders should now always work.
---
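Note for readers (not part of the commit): a minimal sketch of why a
64-bit mask is enough to describe the input layout.  This is not the
patch's code.  The patch's get_param_index() walks the mask bit by bit;
the sketch below instead counts the set bits below the queried slot, which
should yield the same position.  The name param_index_from_mask is
hypothetical, and the caller is assumed to have already mapped the
semantic to a unique slot below 64 (as get_unique_index() does).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the position of slot `unique_index` among
// the set bits of `mask`, or -1 if that slot is not present in the mask.
static int param_index_from_mask(unsigned unique_index, uint64_t mask)
{
    assert(unique_index < 64);
    const uint64_t bit = UINT64_C(1) << unique_index;

    // Slot not used by the consumer: no position in the buffer.
    if (!(mask & bit))
        return -1;

    // Every used slot below this one occupies an earlier vec4, so the
    // position is the number of set bits below `bit`.
    uint64_t lower = mask & (bit - 1);
    int count = 0;
    while (lower) {
        lower &= lower - 1; // clear the lowest set bit
        count++;
    }
    return count;
}
```

With the slot numbering from get_unique_index() in this patch (POSITION 0,
PSIZE 1, GENERIC0 11, GENERIC2 13), a mask with those four bits set maps
them to positions 0, 1, 2 and 3, and any other slot to -1.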
 src/gallium/drivers/radeonsi/si_shader.c | 107 ++-
 src/gallium/drivers/radeonsi/si_shader.h |   4 ++
 src/gallium/drivers/radeonsi/si_state.c  |   6 +-
 3 files changed, 101 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2fc1632..fbc94d2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -60,7 +60,6 @@ struct si_shader_context
struct tgsi_parse_context parse;
struct tgsi_token * tokens;
struct si_shader *shader;
-   struct si_shader *gs_for_vs;
unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */
int param_streamout_config;
int param_streamout_write_index;
@@ -105,6 +104,84 @@ static struct si_shader_context * si_shader_context(
 #define SENDMSG_GS_OP_EMIT (2 << 4)
 #define SENDMSG_GS_OP_EMIT_CUT (3 << 4)
 
+/**
+ * Returns a unique index for a semantic name and index. The index must be
+ * less than 64, so that a 64-bit bitmask of used inputs or outputs can be
+ * calculated.
+ */
+static unsigned get_unique_index(unsigned semantic_name, unsigned index)
+{
+   switch (semantic_name) {
+   case TGSI_SEMANTIC_POSITION:
+   return 0;
+   case TGSI_SEMANTIC_PSIZE:
+   return 1;
+   case TGSI_SEMANTIC_CLIPDIST:
+   assert(index <= 1);
+   return 2 + index;
+   case TGSI_SEMANTIC_CLIPVERTEX:
+   return 4;
+   case TGSI_SEMANTIC_COLOR:
+   assert(index <= 1);
+   return 5 + index;
+   case TGSI_SEMANTIC_BCOLOR:
+   assert(index <= 1);
+   return 7 + index;
+   case TGSI_SEMANTIC_FOG:
+   return 9;
+   case TGSI_SEMANTIC_EDGEFLAG:
+   return 10;
+   case TGSI_SEMANTIC_GENERIC:
+   assert(index <= 63-11);
+   return 11 + index;
+   default:
+   assert(0);
+   return 63;
+   }
+}
+
+/**
+ * Given a semantic name and index of a parameter and a mask of used parameters
+ * (inputs or outputs), return the index of the parameter in the list of all
+ * used parameters.
+ *
+ * For example, assume this list of parameters:
+ *   POSITION, PSIZE, GENERIC0, GENERIC2
+ * which has the mask:
+ *   110101
+ * Then:
+ *   querying POSITION returns 0,
+ *   querying PSIZE returns 1,
+ *   querying GENERIC0 returns 2,
+ *   querying GENERIC2 returns 3.
+ *
+ * Which can be used as an offset to a parameter buffer in units of vec4s.
+ */
+static int get_param_index(unsigned semantic_name, unsigned index,
+  uint64_t mask)
+{
+   unsigned unique_index = get_unique_index(semantic_name, index);
+   int i, param_index = 0;
+
+   /* If not present... */
+   if (!((1llu << unique_index) & mask))
+   return -1;
+
+   for (i = 0; mask; i++) {
+   uint64_t bit = 1llu << i;
+
+   if (bit & mask) {
+   if (i == unique_index)
+   return param_index;
+
+   mask &= ~bit;
+   param_index++;
+   }
+   }
+
+   assert(0 && "unreachable");
+   return -1;
+}
 
 /**
  * Build an LLVM bytecode indexed load using LLVMBuildGEP + LLVMBuildLoad
@@ -261,8 +338,12 @@ static void declare_input_gs(
 
si_store_shader_io_attribs(shader, decl);
 
-   if (decl->Semantic.Name != TGSI_SEMANTIC_PRIMID)
-   shader->input[input_index].param_offset = shader->nparam++;
+   if (decl->Semantic.Name != TGSI_SEMANTIC_PRIMID) {
+   shader->gs_used_inputs |=
+   1llu << get_unique_index(decl->Semantic.Name,
+decl->Semantic.Index);
+   shader->nparam++;
+   }
 }
 
 static LLVMValueRef fetch_input_gs(
@@ -282,6 +363,7 @@ static LLVMValueRef fetch_input_gs(
LLVMValueRef t_list;
LLVMValueRef args[9];
unsigned vtx_offset_param;
+   struct si_shader_input *input = &shader->input[reg->Register.Index];
 
if (swizzle != ~0 &&
shader->input[reg->Register.Index].name == TGSI_SEMANTIC_PRIMID) {
@@ -327,7 +409,8 @@ static LLVMValueRef fetch_input_gs(
args[0] = t_list;
args[1] = vtx_offset;
args[2] = lp_build_const_int32(gallivm,
-  
((shader->input[reg->Register.Index].param_offset * 4) +
+  (get_param_index(input->name, input->sid,
+   shader->gs_used_inputs) 
* 4 +

[Mesa-dev] [PATCH 02/15] radeonsi: remove unused variable si_pipe_shader::sprite_coord_enable

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.h | 1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index a68c25a..df7dbb0 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -189,7 +189,6 @@ struct si_pipe_shader {
unsigned db_shader_control;
unsigned cb_shader_mask;
bool cb0_is_integer;
-   unsigned sprite_coord_enable;
union si_shader_key key;
 };
 
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 2e9d951..9eeda9d 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -321,7 +321,6 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
struct si_pipe_shader *s
si_pm4_set_reg(pm4, R_02880C_DB_SHADER_CONTROL, db_shader_control);
 
shader->cb0_is_integer = sctx->framebuffer.cb0_is_integer;
-   shader->sprite_coord_enable = sctx->sprite_coord_enable;
sctx->b.flags |= R600_CONTEXT_INV_SHADER_CACHE;
 }
 
-- 
1.9.1



[Mesa-dev] [PATCH 01/12] i965/fs: Manually generate the meta fast-clear shader

2014-09-19 Thread Jason Ekstrand
Previously, we were generating the fast-clear shader from GLSL.  The
problem is that fast clears require that we use a replicated write rather
than a regular write instruction.  In order to get this we had a
complicated and somewhat fragile optimization pass that looked for places
where we could use a replicated write and substituted one.  Since replicated writes
have a lot of restrictions, we only ever use them for fast-clear
operations.

This commit replaces the optimization pass with a function that just
generates the shader we want.  This is a) less code, b) less fragile than
the optimization pass, and c) generates a more efficient shader.

Signed-off-by: Jason Ekstrand 
Cc: Kristian Høgsberg 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 122 ++-
 src/mesa/drivers/dri/i965/brw_fs.h   |   3 +-
 2 files changed, 34 insertions(+), 91 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index fa95c81..3fb1545 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2319,98 +2319,42 @@ fs_visitor::compute_to_mrf()
  * instructions to FS_OPCODE_REP_FB_WRITE.
  */
 void
-fs_visitor::try_rep_send()
+fs_visitor::emit_repclear_shader()
 {
-   int i, count;
-   fs_inst *start = NULL;
-   bblock_t *mov_block;
+   int base_mrf = 1;
+   int color_mrf = base_mrf + 2;
 
-   /* From the Ivybridge PRM, Volume 4 Part 1, section 3.9.11.2
-* ("Message Descriptor - Render Target Write"):
-*
-* "SIMD16_REPDATA message must not be used in SIMD8 pixel-shaders."
-*/
-   if (dispatch_width != 16)
-  return;
-
-   /* The constant color write message can't handle anything but the 4 color
-* values.  We could do MRT, but the loops below would need to understand
-* handling the header being enabled or disabled on different messages.  It
-* also requires that the render target be tiled, which might not be the
-* case for some EGLImage paths or if we some day do rendering to PBOs.
-*/
-   if (prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_DEPTH) ||
-   payload.aa_dest_stencil_reg ||
-   payload.dest_depth_reg ||
-   dual_src_output.file != BAD_FILE)
-  return;
-
-   /* The optimization is implemented as one pass through the instruction
-* list.  We keep track of the most recent block of MOVs into sequential
-* MRFs from single, sequential float registers (ie uniforms).  Then when
-* we find an FB_WRITE opcode, we see if the payload registers match the
-* destination registers in our block of MOVs.
-*/
-   count = 0;
-   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
-  if (count == 0) {
- start = inst;
- mov_block = block;
-  }
-  if (inst->opcode == BRW_OPCODE_MOV &&
- inst->dst.file == MRF &&
-  inst->dst.reg == start->dst.reg + 2 * count &&
-  inst->src[0].file == HW_REG &&
-  inst->src[0].reg_offset == start->src[0].reg_offset + count) {
- if (count == 0) {
-start = inst;
-mov_block = block;
- }
- count++;
-  }
-
-  if (inst->opcode == FS_OPCODE_FB_WRITE &&
-  count == 4 &&
-  (inst->base_mrf == start->dst.reg ||
-   (inst->base_mrf + 2 == start->dst.reg && inst->header_present))) {
- fs_inst *mov = MOV(start->dst, start->src[0]);
+   fs_inst *mov = emit(MOV(vec4(brw_message_reg(color_mrf)),
+   fs_reg(UNIFORM, 0, BRW_REGISTER_TYPE_F)));
+   mov->force_writemask_all = true;
+   mov->force_uncompressed = true;
 
- /* Make a MOV that moves the four floats into the replicated write
-  * payload.  Since we're running at the very end of code generation
-  * we can use hw registers and generate the stride and offsets we
-  * need for this MOV.  We use the first of the eight registers
-  * allocated for the SIMD16 payload for the four floats.
-  */
- mov->dst.fixed_hw_reg =
-brw_vec4_reg(BRW_MESSAGE_REGISTER_FILE,
- start->dst.reg, 0);
- mov->dst.file = HW_REG;
- mov->dst.type = mov->dst.fixed_hw_reg.type;
-
- mov->src[0].fixed_hw_reg =
-brw_vec4_grf(mov->src[0].fixed_hw_reg.nr, 0);
- mov->src[0].file = HW_REG;
- mov->src[0].type = mov->src[0].fixed_hw_reg.type;
- mov->force_writemask_all = true;
- mov->dst.type = BRW_REGISTER_TYPE_F;
-
- /* Replace the four MOVs with the new vec4 MOV. */
- start->insert_before(mov_block, mov);
- for (i = 0; i < 4; i++)
-((fs_inst *) mov->next)->remove(mov_block);
-
- /* Finally, adjust the message length and set the opcode to
-  * REP_FB_WRITE for the send, so that the generator will use the
-  * replicated data mesage type.  Then reset count so we'll start
-  * looking for a new block in case we're in a MRT shader

[Mesa-dev] [PATCH 11/12] i965/fs: Print BAD_FILE registers in dump_instruction

2014-09-19 Thread Jason Ekstrand
Sometimes these show up in LOAD_PAYLOAD instructions and it's nice to be
able to see them.

Signed-off-by: Jason Ekstrand 
---
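Note for readers (not part of the commit): a small sketch of what the
changed loop condition does to the dump.  All names here (src_sketch,
dump_sources, the "vgrfN" and "(null)" spellings) are hypothetical
stand-ins and not the real dump_instruction() output format.  The point is
only that stopping at the first BAD_FILE source also hides every source
after it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the register file and a source.
enum register_file_sketch { BAD_FILE_SKETCH, GRF_SKETCH };

struct src_sketch {
    register_file_sketch file;
    int reg;
};

// Placeholder formatting; the real dumper prints far more detail.
static std::string format_source(const src_sketch &s)
{
    if (s.file == BAD_FILE_SKETCH)
        return "(null)";
    return "vgrf" + std::to_string(s.reg);
}

// stop_at_bad_file == true models the old loop condition,
// stop_at_bad_file == false models the new one.
static std::string dump_sources(const std::vector<src_sketch> &srcs,
                                bool stop_at_bad_file)
{
    std::string out;
    for (std::size_t i = 0; i < srcs.size(); i++) {
        if (stop_at_bad_file && srcs[i].file == BAD_FILE_SKETCH)
            break;
        if (!out.empty())
            out += ", ";
        out += format_source(srcs[i]);
    }
    return out;
}
```

For a three-source instruction whose middle source is BAD_FILE, the old
condition prints only the first source, while the new one prints all
three, including the placeholder for the empty slot.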
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a41cb4f..a0b7c6a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2896,7 +2896,7 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
}
fprintf(file, ":%s, ", brw_reg_type_letters(inst->dst.type));
 
-   for (int i = 0; i < inst->sources && inst->src[i].file != BAD_FILE; i++) {
+   for (int i = 0; i < inst->sources; i++) {
   if (inst->src[i].negate)
  fprintf(file, "-");
   if (inst->src[i].abs)
-- 
2.1.0



[Mesa-dev] [PATCH 07/12] i965/fs: Use the var_from_vgrf helper function instead of doing it manually

2014-09-19 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 697b44a..036875f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -58,7 +58,7 @@ fs_visitor::dead_code_eliminate()
int var = live_intervals->var_from_reg(&inst->dst);
result_live = BITSET_TEST(live, var);
 } else {
-   int var = live_intervals->var_from_vgrf[inst->dst.reg];
+   int var = live_intervals->var_from_reg(&inst->dst);
for (int i = 0; i < inst->regs_written; i++) {
   result_live = result_live || BITSET_TEST(live, var + i);
}
@@ -78,19 +78,19 @@ fs_visitor::dead_code_eliminate()
 
  if (inst->dst.file == GRF) {
 if (!inst->is_partial_write()) {
-   int var = live_intervals->var_from_vgrf[inst->dst.reg];
+   int var = live_intervals->var_from_reg(&inst->dst);
for (int i = 0; i < inst->regs_written; i++) {
-  BITSET_CLEAR(live, var + inst->dst.reg_offset + i);
+  BITSET_CLEAR(live, var + i);
}
 }
  }
 
  for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file == GRF) {
-   int var = live_intervals->var_from_vgrf[inst->src[i].reg];
+   int var = live_intervals->var_from_reg(&inst->src[i]);
 
for (int j = 0; j < inst->regs_read(this, i); j++) {
-  BITSET_SET(live, var + inst->src[i].reg_offset + j);
+  BITSET_SET(live, var + j);
}
 }
  }
-- 
2.1.0



[Mesa-dev] [PATCH 00/12] i965/fs: A bunch of cleanups in preparation for explicit register widths

2014-09-19 Thread Jason Ekstrand
I'm working on a series (which I hope to send out soon) that will allow us
to have explicit register widths and instruction execution sizes in the fs
backend IR.  If you want to see where I'm going with this, I've got a
working version here:

http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/kill-mrf-v0.5

I'm planning to get that cleaned up a bit more and hope to send the full
series out by the end of today or maybe Monday.  This series is a bunch of
cleanup patches that will be needed eventually, but don't really change
anything important on their own.  They should be generally reviewable by
anyone with a decent understanding of the i965 fs backend.

Jason Ekstrand (12):
  i965/fs: Manually generate the meta fast-clear shader
  i965/fs_live_variables: Use var_from_vgrf instead of repeating the
calculation
  i965/fs: Rewrite fs_visitor::split_virtual_grfs
  i965/fs: fix a comment in compact_virtual_grfs
  i965/fs: Use offset a lot more places
  i965/fs: Use the UW type for the destination of
VARYING_PULL_CONSTANT_LOAD instructions
  i965/fs: Use the var_from_vgrf helper function instead of doing it
manually
  i965/fs: Make null_reg_* const members of fs_visitor instead of
globals
  i965/fs: Make immediate fs_reg constructors explicit
  i965/fs: Make compact_virtual_grfs an optimization pass
  i965/fs: Print BAD_FILE registers in dump_instruction
  i965/fs: Clean up emit_fb_writes

 src/mesa/drivers/dri/i965/brw_fs.cpp   | 295 +-
 src/mesa/drivers/dri/i965/brw_fs.h |  21 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  12 +-
 .../dri/i965/brw_fs_dead_code_eliminate.cpp|  10 +-
 src/mesa/drivers/dri/i965/brw_fs_fp.cpp|   2 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |   4 +-
 .../drivers/dri/i965/brw_fs_live_variables.cpp |   4 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   4 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 338 +
 src/mesa/drivers/dri/i965/brw_reg.h|   6 +
 10 files changed, 323 insertions(+), 373 deletions(-)

-- 
2.1.0



[Mesa-dev] [PATCH 09/12] i965/fs: Make immediate fs_reg constructors explicit

2014-09-19 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.h   |  6 +++---
 src/mesa/drivers/dri/i965/brw_fs_fp.cpp  |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 ++-
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ea91705..002d40fd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -279,7 +279,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst,
 */
fs_reg vec4_offset = fs_reg(this, glsl_type::int_type);
instructions.push_tail(ADD(vec4_offset,
-  varying_offset, const_offset & ~3));
+  varying_offset, fs_reg(const_offset & ~3)));
 
int scale = 1;
if (brw->gen == 4 && dispatch_width == 8) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index cb44037..402433b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -69,9 +69,9 @@ public:
void init();
 
fs_reg();
-   fs_reg(float f);
-   fs_reg(int32_t i);
-   fs_reg(uint32_t u);
+   explicit fs_reg(float f);
+   explicit fs_reg(int32_t i);
+   explicit fs_reg(uint32_t u);
fs_reg(struct brw_reg fixed_hw_reg);
fs_reg(enum register_file file, int reg);
fs_reg(enum register_file file, int reg, enum brw_reg_type type);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
index 526c817..ec05bfe 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
@@ -489,7 +489,7 @@ fs_visitor::emit_fragment_program_code()
 
  fs_inst *inst;
  if (brw->gen >= 7) {
-inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, 
sample_index, fs_reg(0u), fpi->TexSrcUnit);
+inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, 
sample_index, fs_reg(0u), fs_reg(fpi->TexSrcUnit));
  } else if (brw->gen >= 5) {
 inst = emit_texture_gen5(ir, dst, coordinate, shadow_c, lod, dpdy, 
sample_index, fpi->TexSrcUnit);
  } else {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 6a75b05..a8d2804 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2611,10 +2611,10 @@ fs_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
   deref_array->array_index->accept(this);
 
   fs_reg tmp(this, glsl_type::uint_type);
-  emit(MUL(tmp, this->result, ATOMIC_COUNTER_SIZE));
-  emit(ADD(offset, tmp, location->data.atomic.offset));
+  emit(MUL(tmp, this->result, fs_reg(ATOMIC_COUNTER_SIZE)));
+  emit(ADD(offset, tmp, fs_reg(location->data.atomic.offset)));
} else {
-  offset = location->data.atomic.offset;
+  offset = fs_reg(location->data.atomic.offset);
}
 
/* Emit the appropriate machine instruction */
@@ -2734,7 +2734,8 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, 
unsigned surf_index,
}
 
/* Emit the instruction. */
-   inst = emit(SHADER_OPCODE_UNTYPED_ATOMIC, dst, atomic_op, surf_index);
+   inst = emit(SHADER_OPCODE_UNTYPED_ATOMIC, dst,
+   fs_reg(atomic_op), fs_reg(surf_index));
inst->base_mrf = 0;
inst->mlen = mlen;
inst->header_present = true;
@@ -2768,7 +2769,7 @@ fs_visitor::emit_untyped_surface_read(unsigned 
surf_index, fs_reg dst,
mlen += operand_len;
 
/* Emit the instruction. */
-   inst = emit(SHADER_OPCODE_UNTYPED_SURFACE_READ, dst, surf_index);
+   inst = emit(SHADER_OPCODE_UNTYPED_SURFACE_READ, dst, fs_reg(surf_index));
inst->base_mrf = 0;
inst->mlen = mlen;
inst->header_present = true;
-- 
2.1.0



[Mesa-dev] [PATCH 05/12] i965/fs: Use offset a lot more places

2014-09-19 Thread Jason Ekstrand
We have this wonderful offset() function for advancing registers, but we're
not using it.  Using offset() allows us to do some sanity checking and
avoid manually touching fs_reg::reg_offset.  In a few commits, we will make
offset do even more nifty things for us.

Signed-off-by: Jason Ekstrand 
---
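Note for readers (not part of the commit): a minimal sketch of the
pattern this patch applies.  reg_sketch and offset_sketch are hypothetical
stand-ins for fs_reg and offset(), and the particular check shown (refusing
to advance a fixed register) is an assumption about the kind of sanity
checking a helper like this can centralize, not a copy of the real one.

```cpp
#include <cassert>

// Hypothetical, stripped-down register description.
struct reg_sketch {
    int reg;        // virtual register number
    int reg_offset; // offset within that register, in whole registers
    bool is_fixed;  // assumed flag: fixed registers must not be advanced
};

// Returns a copy of `r` advanced by `delta` registers.  Taking the
// argument by value leaves the caller's register untouched, so the
// result has to be assigned explicitly (wpos = offset(wpos, 1) style).
static reg_sketch offset_sketch(reg_sketch r, int delta)
{
    assert(delta >= 0);
    assert(delta == 0 || !r.is_fixed); // single place for the sanity check
    r.reg_offset += delta;
    return r;
}
```

Compared with writing reg.reg_offset++ at each call site, the helper keeps
the check in one place and makes every advance visible as an assignment.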
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  18 +--
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp  |  12 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |   4 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  | 137 ++
 4 files changed, 78 insertions(+), 93 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index af8c087..ea91705 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -310,8 +310,8 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst,
  inst->mlen = 1 + dispatch_width / 8;
}
 
-   vec4_result.reg_offset += (const_offset & 3) * scale;
-   instructions.push_tail(MOV(dst, vec4_result));
+   fs_reg result = offset(vec4_result, (const_offset & 3) * scale);
+   instructions.push_tail(MOV(dst, result));
 
return instructions;
 }
@@ -1019,7 +1019,7 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir)
} else {
   emit(ADD(wpos, this->pixel_x, fs_reg(0.5f)));
}
-   wpos.reg_offset++;
+   wpos = offset(wpos, 1);
 
/* gl_FragCoord.y */
if (!flip && ir->data.pixel_center_integer) {
@@ -1035,7 +1035,7 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir)
 
   emit(ADD(wpos, pixel_y, fs_reg(offset)));
}
-   wpos.reg_offset++;
+   wpos = offset(wpos, 1);
 
/* gl_FragCoord.z */
if (brw->gen >= 6) {
@@ -1046,7 +1046,7 @@ fs_visitor::emit_fragcoord_interpolation(ir_variable *ir)
this->delta_y[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC],
interp_reg(VARYING_SLOT_POS, 2));
}
-   wpos.reg_offset++;
+   wpos = offset(wpos, 1);
 
/* gl_FragCoord.w: Already set up in emit_interpolation */
emit(BRW_OPCODE_MOV, wpos, this->wpos_w);
@@ -1120,7 +1120,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
/* If there's no incoming setup data for this slot, don't
 * emit interpolation for it.
 */
-   attr.reg_offset += type->vector_elements;
+   attr = offset(attr, type->vector_elements);
location++;
continue;
 }
@@ -1135,7 +1135,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
   interp = suboffset(interp, 3);
interp.type = reg->type;
   emit(FS_OPCODE_CINTERP, attr, fs_reg(interp));
-  attr.reg_offset++;
+  attr = offset(attr, 1);
}
 } else {
/* Smooth/noperspective interpolation case. */
@@ -1173,7 +1173,7 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
if (brw->gen < 6 && interpolation_mode == 
INTERP_QUALIFIER_SMOOTH) {
   emit(BRW_OPCODE_MUL, attr, attr, this->pixel_w);
}
-  attr.reg_offset++;
+  attr = offset(attr, 1);
}
 
 }
@@ -1284,7 +1284,7 @@ fs_visitor::emit_samplepos_setup()
}
/* Compute gl_SamplePosition.x */
compute_sample_position(pos, int_sample_x);
-   pos.reg_offset++;
+   pos = offset(pos, 1);
inst = emit(MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1))));
if (dispatch_width == 16) {
   inst->force_uncompressed = true;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 9db6865..a09b0f4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -212,10 +212,8 @@ fs_visitor::opt_cse_local(bblock_t *block)
fs_inst *copy;
if (written > 1) {
   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written);
-  for (int i = 0; i < written; i++) {
- sources[i] = tmp;
- sources[i].reg_offset = i;
-  }
+  for (int i = 0; i < written; i++)
+ sources[i] = offset(tmp, i);
   copy = LOAD_PAYLOAD(orig_dst, sources, written);
} else {
   copy = MOV(orig_dst, tmp);
@@ -235,10 +233,8 @@ fs_visitor::opt_cse_local(bblock_t *block)
fs_inst *copy;
if (written > 1) {
   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written);
-  for (int i = 0; i < written; i++) {
- sources[i] = tmp;
- sources[i].reg_offset = i;
-  }
+  for (int i = 0; i < written; i++)
+ sources[i] = offset(tmp, i);
   copy = LOAD_PAYLOAD(dst, sources, written);
} else {
   copy = MOV(dst, tmp);
diff --git a/s

[Mesa-dev] [PATCH 03/12] i965/fs: Rewrite fs_visitor::split_virtual_grfs

2014-09-19 Thread Jason Ekstrand
The original vgrf splitting code was written with the assumption that
vgrfs came in two types: those that can be split into single registers and
those that can't be split at all.  It was very conservative and bailed as
soon as more than one element of a register was read or written.  This
won't work once we start allowing a regular MOV or ADD operation to operate
on multiple registers.  This rewrite allows for the case where a vgrf of
size 5 may appropriately be split into one register of size 1 and two
registers of size 2.

Signed-off-by: Jason Ekstrand 
---
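Note for readers (not part of the commit): a simplified sketch of the
split-point idea for a single vgrf.  The names access_sketch and
split_sizes are hypothetical.  Unlike the patch, the sketch handles one
register at a time, treats every slot as splittable by default, and returns
the resulting piece sizes instead of allocating new vgrfs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one source or destination access:
// it touches `regs` consecutive slots starting at `reg_offset`.
struct access_sketch {
    int reg_offset;
    int regs;
};

// Returns the sizes of the pieces a vgrf of `vgrf_size` slots can be
// split into, given every access made to it.
static std::vector<int> split_sizes(int vgrf_size,
                                    const std::vector<access_sketch> &accesses)
{
    // split_before[j] == true: slot j may be separated from slot j - 1.
    std::vector<bool> split_before(vgrf_size, true);
    if (vgrf_size > 0)
        split_before[0] = false; // slot 0 always begins the first piece

    // Slots used together by one access must stay in the same piece.
    for (const access_sketch &a : accesses) {
        for (int j = 1; j < a.regs; j++) {
            assert(a.reg_offset + j < vgrf_size);
            split_before[a.reg_offset + j] = false;
        }
    }

    // Cut at every remaining split point.
    std::vector<int> sizes;
    int current = 0;
    for (int j = 0; j < vgrf_size; j++) {
        if (j > 0 && split_before[j]) {
            sizes.push_back(current);
            current = 0;
        }
        current++;
    }
    if (current > 0)
        sizes.push_back(current);
    return sizes;
}
```

For the size-5 case from the commit message, accesses covering slot 0,
slots 1-2 and slots 3-4 give pieces of size 1, 2 and 2; a single access
covering the whole register leaves it unsplit.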
 src/mesa/drivers/dri/i965/brw_fs.cpp | 132 ++-
 1 file changed, 85 insertions(+), 47 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3fb1545..10a3a20 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1627,15 +1627,39 @@ void
 fs_visitor::split_virtual_grfs()
 {
int num_vars = this->virtual_grf_count;
-   bool split_grf[num_vars];
-   int new_virtual_grf[num_vars];
 
-   /* Try to split anything > 0 sized. */
+   /* Count the total number of registers */
+   int reg_count = 0;
+   int vgrf_to_reg[num_vars];
for (int i = 0; i < num_vars; i++) {
-  if (this->virtual_grf_sizes[i] != 1)
-split_grf[i] = true;
-  else
-split_grf[i] = false;
+  vgrf_to_reg[i] = reg_count;
+  reg_count += virtual_grf_sizes[i];
+   }
+
+   /* An array of "split points".  For each register slot, this indicates
+* if this slot can be separated from the previous slot.  Every time an
+* instruction uses multiple elements of a register (as a source or
+* destination), we mark the used slots as inseparable.  Then we go
+* through and split the registers into the smallest pieces we can.
+*/
+   bool split_points[reg_count];
+   memset(split_points, 0, sizeof(split_points));
+
+   /* Mark all used registers as fully splittable */
+   foreach_in_list(fs_inst, inst, &instructions) {
+  if (inst->dst.file == GRF) {
+ int reg = vgrf_to_reg[inst->dst.reg];
+ for (int j = 1; j < this->virtual_grf_sizes[inst->dst.reg]; j++)
+split_points[reg + j] = true;
+  }
+
+  for (int i = 0; i < inst->sources; i++) {
+ if (inst->src[i].file == GRF) {
+int reg = vgrf_to_reg[inst->src[i].reg];
+for (int j = 1; j < this->virtual_grf_sizes[inst->src[i].reg]; j++)
+   split_points[reg + j] = true;
+ }
+  }
}
 
if (brw->has_pln &&
@@ -1645,61 +1669,75 @@ fs_visitor::split_virtual_grfs()
* Gen6, that was the only supported interpolation mode, and since Gen6,
* delta_x and delta_y are in fixed hardware registers.
*/
-  split_grf[this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC].reg] =
- false;
+  int vgrf = this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC].reg;
+  split_points[vgrf_to_reg[vgrf] + 1] = false;
}
 
foreach_in_list(fs_inst, inst, &instructions) {
-  /* If there's a SEND message that requires contiguous destination
-   * registers, no splitting is allowed.
-   */
-  if (inst->regs_written > 1) {
-split_grf[inst->dst.reg] = false;
+  if (inst->dst.file == GRF) {
+ int reg = vgrf_to_reg[inst->dst.reg] + inst->dst.reg_offset;
+ for (int j = 1; j < inst->regs_written; j++)
+split_points[reg + j] = false;
   }
-
-  /* If we're sending from a GRF, don't split it, on the assumption that
-   * the send is reading the whole thing.
-   */
-  if (inst->is_send_from_grf()) {
- for (int i = 0; i < inst->sources; i++) {
-if (inst->src[i].file == GRF) {
-   split_grf[inst->src[i].reg] = false;
-}
+  for (int i = 0; i < inst->sources; i++) {
+ if (inst->src[i].file == GRF) {
+int reg = vgrf_to_reg[inst->src[i].reg] + inst->src[i].reg_offset;
+for (int j = 1; j < inst->regs_read(this, i); j++)
+   split_points[reg + j] = false;
  }
   }
}
 
-   /* Allocate new space for split regs.  Note that the virtual
-* numbers will be contiguous.
-*/
+   int new_virtual_grf[reg_count];
+   int new_reg_offset[reg_count];
+
+   int reg = 0;
for (int i = 0; i < num_vars; i++) {
-  if (split_grf[i]) {
-new_virtual_grf[i] = virtual_grf_alloc(1);
-for (int j = 2; j < this->virtual_grf_sizes[i]; j++) {
-   int reg = virtual_grf_alloc(1);
-   assert(reg == new_virtual_grf[i] + j - 1);
-   (void) reg;
-}
-this->virtual_grf_sizes[i] = 1;
+  /* The first one should always be 0 as a quick sanity check. */
+  assert(split_points[reg] == false);
+
+  /* j = 0 case */
+  new_reg_offset[reg] = 0;
+  reg++;
+  int offset = 1;
+
+  /* j > 0 case */
+  for (int j = 1; j < virtual_grf_sizes[i]; j++

[Mesa-dev] [PATCH 02/12] i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation

2014-09-19 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index e7ecb0f..39fc61a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -56,7 +56,7 @@ void
 fs_live_variables::setup_one_read(bblock_t *block, fs_inst *inst,
   int ip, fs_reg reg)
 {
-   int var = var_from_vgrf[reg.reg] + reg.reg_offset;
+   int var = var_from_reg(&reg);
assert(var < num_vars);
 
/* In most cases, a register can be written over safely by the
@@ -108,7 +108,7 @@ void
 fs_live_variables::setup_one_write(bblock_t *block, fs_inst *inst,
int ip, fs_reg reg)
 {
-   int var = var_from_vgrf[reg.reg] + reg.reg_offset;
+   int var = var_from_reg(&reg);
assert(var < num_vars);
 
start[var] = MIN2(start[var], ip);
-- 
2.1.0



[Mesa-dev] [PATCH 12/12] i965/fs: Clean up emit_fb_writes

2014-09-19 Thread Jason Ekstrand
This splits emit_fb_writes into two functions: emit_fb_writes and
emit_single_fb_write.  This reduces the amount of duplicated code in
emit_fb_writes and makes the register number fiddling less arcane.

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.h   |   4 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 193 +++
 2 files changed, 83 insertions(+), 114 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 56f40b4..50b5fc1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -426,8 +426,10 @@ public:
 const struct prog_instruction *fpi,
 fs_reg dst, fs_reg src0, fs_reg src1, fs_reg one);
 
-   void emit_color_write(int target, int index, int first_color_mrf);
+   void emit_color_write(fs_reg color, int index, int first_color_mrf);
void emit_alpha_test();
+   fs_inst *emit_single_fb_write(fs_reg color1, fs_reg color2,
+ fs_reg src0_alpha, unsigned components);
void emit_fb_writes();
 
void emit_shader_time_begin();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index a8d2804..f9bc82a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2920,11 +2920,10 @@ fs_visitor::emit_interpolation_setup_gen6()
 }
 
 void
-fs_visitor::emit_color_write(int target, int index, int first_color_mrf)
+fs_visitor::emit_color_write(fs_reg color, int index, int first_color_mrf)
 {
int reg_width = dispatch_width / 8;
fs_inst *inst;
-   fs_reg color = outputs[target];
fs_reg mrf;
 
/* If there's no color data to be written, skip it. */
@@ -3042,8 +3041,9 @@ fs_visitor::emit_alpha_test()
cmp->flag_subreg = 1;
 }
 
-void
-fs_visitor::emit_fb_writes()
+fs_inst *
+fs_visitor::emit_single_fb_write(fs_reg color0, fs_reg color1,
+ fs_reg src0_alpha, unsigned components)
 {
this->current_annotation = "FB write header";
bool header_present = true;
@@ -3053,13 +3053,6 @@ fs_visitor::emit_fb_writes()
int base_mrf = 1;
int nr = base_mrf;
int reg_width = dispatch_width / 8;
-   bool src0_alpha_to_render_target = false;
-
-   if (do_dual_src) {
-  no16("GL_ARB_blend_func_extended not yet supported in SIMD16.");
-  if (dispatch_width == 16)
- do_dual_src = false;
-   }
 
/* From the Sandy Bridge PRM, volume 4, page 198:
 *
@@ -3069,19 +3062,15 @@ fs_visitor::emit_fb_writes()
 *  thread message and on all dual-source messages."
 */
if (brw->gen >= 6 &&
-   (brw->is_haswell || brw->gen >= 8 || !this->prog_data->uses_kill) &&
-   !do_dual_src &&
+   (brw->is_haswell || brw->gen >= 8 || !prog_data->uses_kill) &&
+   color1.file == BAD_FILE &&
key->nr_color_regions == 1) {
   header_present = false;
}
 
-   if (header_present) {
-  src0_alpha_to_render_target = brw->gen >= 6 &&
-   !do_dual_src &&
-key->replicate_alpha;
+   if (header_present)
   /* m2, m3 header */
   nr += 2;
-   }
 
if (payload.aa_dest_stencil_reg) {
   push_force_uncompressed();
@@ -3100,13 +3089,34 @@ fs_visitor::emit_fb_writes()
   nr += 1;
}
 
-   /* Reserve space for color. It'll be filled in per MRT below. */
-   int color_mrf = nr;
-   nr += 4 * reg_width;
-   if (do_dual_src)
-  nr += 4;
-   if (src0_alpha_to_render_target)
-  nr += reg_width;
+   if (color0.file == BAD_FILE) {
+  /* Even if there's no color buffers enabled, we still need to send
+   * alpha out the pipeline to our null renderbuffer to support
+   * alpha-testing, alpha-to-coverage, and so on.
+   */
+  emit_color_write(this->outputs[0], 3, nr);
+  nr += 4 * reg_width;
+   } else if (color1.file == BAD_FILE) {
+  if (src0_alpha.file != BAD_FILE) {
+ fs_inst *inst;
+ inst = emit(MOV(fs_reg(MRF, nr, src0_alpha.type), src0_alpha));
+ inst->saturate = key->clamp_fragment_color;
+ nr += reg_width;
+  }
+
+  for (unsigned i = 0; i < components; i++)
+ emit_color_write(color0, i, nr);
+
+  nr += 4 * reg_width;
+   } else {
+  for (unsigned i = 0; i < components; i++)
+ emit_color_write(color0, i, nr);
+  nr += 4 * reg_width;
+
+  for (unsigned i = 0; i < components; i++)
+ emit_color_write(color1, i, nr);
+  nr += 4 * reg_width;
+   }
 
if (source_depth_to_render_target) {
   if (brw->gen == 6) {
@@ -3136,111 +3146,68 @@ fs_visitor::emit_fb_writes()
   nr += reg_width;
}
 
-   if (do_dual_src) {
-  fs_reg src0 = this->outputs[0];
-  fs_reg src1 = this->dual_src_output;
-
-  this->current_annotation = ralloc_asprintf(this->mem_ctx,
-"FB write

[Mesa-dev] [PATCH 08/12] i965/fs: Make null_reg_* const members of fs_visitor instead of globals

2014-09-19 Thread Jason Ekstrand
We also set the register width equal to the dispatch width.  Right now,
this is effectively a no-op since we don't do anything with it.  However,
it will be important once we add an actual width field to fs_reg.
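A minimal sketch of the pattern, not the driver code: const members cannot be assigned later, so they have to be built in the constructor's initializer list, which is what lets each visitor carry null registers sized to its own dispatch width.  Reg, RegType, Visitor and make_null_reg are hypothetical stand-ins for the Mesa types.

```cpp
#include <cassert>

// Hypothetical stand-ins for the driver's register types.
enum class RegType { F, D, UD };

struct Reg {
   unsigned width;  // number of channels the register spans
   RegType type;
};

// Illustrative helper: builds a null register spanning `width` channels.
inline Reg make_null_reg(unsigned width, RegType type)
{
   return Reg{width, type};
}

class Visitor {
public:
   // Const members must be set in the initializer list, so every
   // visitor instance gets null registers matching its own width.
   const Reg reg_null_f;
   const Reg reg_null_d;
   const Reg reg_null_ud;
   const unsigned dispatch_width;

   explicit Visitor(unsigned width)
      : reg_null_f(make_null_reg(width, RegType::F)),
        reg_null_d(make_null_reg(width, RegType::D)),
        reg_null_ud(make_null_reg(width, RegType::UD)),
        dispatch_width(width)
   {
   }
};
```

With globals there could only be one width for the whole program; with members, a SIMD8 visitor and a SIMD16 visitor can coexist.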

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.h   | 6 +++---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++
 src/mesa/drivers/dri/i965/brw_reg.h  | 6 ++
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index d5a96c8..cb44037 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -134,9 +134,6 @@ half(const fs_reg &reg, unsigned idx)
 }
 
 static const fs_reg reg_undef;
-static const fs_reg reg_null_f(retype(brw_null_reg(), BRW_REGISTER_TYPE_F));
-static const fs_reg reg_null_d(retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
-static const fs_reg reg_null_ud(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD));
 
 class ip_record : public exec_node {
 public:
@@ -206,6 +203,9 @@ public:
 class fs_visitor : public backend_visitor
 {
 public:
+   const fs_reg reg_null_f;
+   const fs_reg reg_null_d;
+   const fs_reg reg_null_ud;
 
fs_visitor(struct brw_context *brw,
   void *mem_ctx,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 92a50a5..6a75b05 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -3277,6 +3277,9 @@ fs_visitor::fs_visitor(struct brw_context *brw,
unsigned dispatch_width)
: backend_visitor(brw, shader_prog, &fp->Base, &prog_data->base,
  MESA_SHADER_FRAGMENT),
+ reg_null_f(retype(brw_null_vec(dispatch_width), BRW_REGISTER_TYPE_F)),
+ reg_null_d(retype(brw_null_vec(dispatch_width), BRW_REGISTER_TYPE_D)),
+ reg_null_ud(retype(brw_null_vec(dispatch_width), BRW_REGISTER_TYPE_UD)),
  key(key), prog_data(prog_data),
  dispatch_width(dispatch_width)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_reg.h b/src/mesa/drivers/dri/i965/brw_reg.h
index 28d3d94..9638c77 100644
--- a/src/mesa/drivers/dri/i965/brw_reg.h
+++ b/src/mesa/drivers/dri/i965/brw_reg.h
@@ -603,6 +603,12 @@ brw_null_reg(void)
 }
 
 static inline struct brw_reg
+brw_null_vec(unsigned width)
+{
+   return brw_vecn_reg(width, BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_NULL, 0);
+}
+
+static inline struct brw_reg
 brw_address_reg(unsigned subnr)
 {
return brw_uw1_reg(BRW_ARCHITECTURE_REGISTER_FILE, BRW_ARF_ADDRESS, subnr);
-- 
2.1.0



[Mesa-dev] [PATCH 06/12] i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions

2014-09-19 Thread Jason Ekstrand
Using a floating-point type doesn't usually cause hangs on my HSW, but the
simulator complains about it quite a bit.

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 1bc10f5..bd0ee3a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1082,7 +1082,7 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst,
   uint32_t surf_index = index.dw1.ud;
 
   brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
-  brw_set_dest(p, send, dst);
+  brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
   brw_set_src0(p, send, offset);
   brw_set_sampler_message(p, send,
   surf_index,
@@ -1131,7 +1131,7 @@ fs_generator::generate_varying_pull_constant_load_gen7(fs_inst *inst,
 
   /* dst = send(offset, a0.0) */
   brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND);
-  brw_set_dest(p, insn_send, dst);
+  brw_set_dest(p, insn_send, retype(dst, BRW_REGISTER_TYPE_UW));
   brw_set_src0(p, insn_send, offset);
   brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr);
 
-- 
2.1.0



[Mesa-dev] [PATCH 10/12] i965/fs: Make compact_virtual_grfs an optimization pass

2014-09-19 Thread Jason Ekstrand
Previously we disabled compact_virtual_grfs when dumping optimizations.
The idea here was to make it easier to diff the dumped shader because you
didn't have a sudden renaming.  However, sometimes a bug is affected by
compact_virtual_grfs and, when this happens, you want to keep dumping
instructions with compact_virtual_grfs enabled.  By turning it into an
optimization pass and dumping it along with the others, we retain the
ability to diff because you can just diff against the compact_virtual_grf
output.
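A simplified sketch of what "compaction as a pass" means, assuming a toy instruction model with one destination and one source register index per instruction (Inst and compact_registers are hypothetical names, and all indices are assumed to be in range).  The point is the return value: the pass reports progress only when it actually dropped an unused register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: one destination and one source,
// both given as virtual register indices.
struct Inst {
   int dst;
   int src;
};

// Renumbers the referenced registers into a dense range and reports
// whether any unused register was removed.
bool compact_registers(std::vector<Inst> &insts, std::vector<int> &reg_sizes)
{
   bool progress = false;
   std::vector<int> remap(reg_sizes.size(), -1);

   // Mark which registers are used.
   for (const Inst &inst : insts) {
      remap[inst.dst] = 0;
      remap[inst.src] = 0;
   }

   // Hand out new, dense indices to the used registers.
   int new_index = 0;
   for (std::size_t i = 0; i < remap.size(); i++) {
      if (remap[i] == -1) {
         // An unused register means something really gets compacted.
         progress = true;
      } else {
         remap[i] = new_index;
         reg_sizes[new_index] = reg_sizes[i];
         new_index++;
      }
   }
   reg_sizes.resize(new_index);

   // Rewrite the instructions to use the new indices.
   for (Inst &inst : insts) {
      inst.dst = remap[inst.dst];
      inst.src = remap[inst.src];
   }

   return progress;
}
```

Calling it a second time on already dense input returns false, so a loop driven by the returned flag still reaches a fixed point.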

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 17 +++--
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 002d40fd..a41cb4f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1752,12 +1752,10 @@ fs_visitor::split_virtual_grfs()
  * to loop over all the virtual GRFs.  Compacting them can save a lot of
  * overhead.
  */
-void
+bool
 fs_visitor::compact_virtual_grfs()
 {
-   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER))
-  return;
-
+   bool progress = false;
int remap_table[this->virtual_grf_count];
memset(remap_table, -1, sizeof(remap_table));
 
@@ -1775,7 +1773,12 @@ fs_visitor::compact_virtual_grfs()
/* Compact the GRF arrays. */
int new_index = 0;
for (int i = 0; i < this->virtual_grf_count; i++) {
-  if (remap_table[i] != -1) {
+  if (remap_table[i] == -1) {
+ /* We just found an unused register.  This means that we are
+  * actually going to compact something.
+  */
+ progress = true;
+  } else {
  remap_table[i] = new_index;
  virtual_grf_sizes[new_index] = virtual_grf_sizes[i];
  invalidate_live_intervals(false);
@@ -1818,6 +1821,8 @@ fs_visitor::compact_virtual_grfs()
  }
   }
}
+
+   return progress;
 }
 
 /*
@@ -3255,7 +3260,7 @@ fs_visitor::run()
  iteration++;
  int pass_num = 0;
 
- compact_virtual_grfs();
+ OPT(compact_virtual_grfs);
 
  OPT(remove_duplicate_mrf_writes);
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 402433b..56f40b4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -325,7 +325,7 @@ public:
int choose_spill_reg(struct ra_graph *g);
void spill_reg(int spill_reg);
void split_virtual_grfs();
-   void compact_virtual_grfs();
+   bool compact_virtual_grfs();
void move_uniform_array_access_to_pull_constants();
void assign_constant_locations();
void demote_pull_constants();
-- 
2.1.0



[Mesa-dev] [PATCH 04/12] i965/fs: fix a comment in compact_virtual_grfs

2014-09-19 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 10a3a20..af8c087 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1758,10 +1758,10 @@ fs_visitor::compact_virtual_grfs()
if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER))
   return;
 
-   /* Mark which virtual GRFs are used, and count how many. */
int remap_table[this->virtual_grf_count];
memset(remap_table, -1, sizeof(remap_table));
 
+   /* Mark which virtual GRFs are used. */
foreach_in_list(const fs_inst, inst, &instructions) {
   if (inst->dst.file == GRF)
  remap_table[inst->dst.reg] = 0;
-- 
2.1.0



[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps

2014-09-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84089

Iaroslav Andrusyak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Iaroslav Andrusyak  ---
Yes, it helps



[Mesa-dev] multi-planar tiled fourcc's in mesa and drm

2014-09-19 Thread Rob Clark
So, lucky me, I have a scenario where I get to deal with NV12MT.  Hurray!

I know there has been some reluctance in the past to combine tiling
and color format, since in theory that could lead to a combinatorial
explosion in formats.  And, as long as the buffer usage is entirely
within a single driver, you can approximately hide tiling (or
compressed, etc) permutations of a color format.  On the other hand,
there is already some precedent for fourcc or format values that
represent tiled formats at the interface level (v4l in the kernel,
gstreamer and openmax in userspace).

But in this scenario, sharing buffer between other devices (video
decoder/encoder) and drm/kms (msm) and mesa (freedreno) via
EGL_EXT_image_dma_buf_import[1], I sort of don't really have any other
way to pass around tiling flags.  So I would propose adding custom
fourcc's only in the more limited cases where formats are exchanged
between devices.  This should avoid an explosion of color_format *
tiling_format.
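For reference, a fourcc is just four characters packed into 32 bits, least significant byte first.  A small sketch of that packing, written as a constexpr function instead of the kernel's macro; FORMAT_NV12_TILED_EXAMPLE is a made-up code for illustration and is not the NV12MT value:

```cpp
#include <cassert>
#include <cstdint>

// Packs four characters into a 32-bit code, first character in the
// least significant byte.
constexpr std::uint32_t fourcc_code(char a, char b, char c, char d)
{
   return static_cast<std::uint32_t>(static_cast<unsigned char>(a)) |
          (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8) |
          (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16) |
          (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Linear NV12 is spelled 'N', 'V', '1', '2'.
constexpr std::uint32_t FORMAT_NV12 = fourcc_code('N', 'V', '1', '2');

// Hypothetical code for a tiled variant, for illustration only.
constexpr std::uint32_t FORMAT_NV12_TILED_EXAMPLE =
   fourcc_code('X', 'T', '1', '2');
```

Since the tiled layout gets its own 32-bit value, an importer that only understands linear NV12 can reject the buffer by comparing one integer.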

For the kms part, it would mean merging a small patch to allow addfb2
for NV12MT[2].

For the mesa part, it looks like there is a bit of work needed to
teach egl about multi-planar buffers, buffers where offset[n] != 0,
etc.  I'll start with patches to teach egl how to import plain NV12
buffers.  But once that is done, for it to be much use to me I'll need
NV12MT, which means adding a new gallium format and
__DRI_IMAGE_FOURCC_NV12MT.

Also, I'm still a bit undecided on how to represent multi-planar
formats (ie. single pipe_resource encapsulating each of the planes?
or pipe_resource per plane but teach pipe_sampler_view about textures
which have multiple pipe_resource's, one per plane).

Anyways, I'll start working on the mesa egl bits next week.  First
step is just add an 'offset' parameter to 'struct winsys_handle',
which should hopefully be non-controversial.  After that, I need to
decide how to handle multi-planar, and I think that hinges on how
folks want to handle multi-planar in gallium.  Ie. if one
pipe_resource per plane, then winsys_handle doesn't need any further
change (but we need changes elsewhere), otherwise winsys_handle needs
to have an array of handles.
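To make the offset[n] != 0 case concrete, here is a rough sketch of where the two planes of a linear NV12 image sit when both are packed back to back in a single buffer.  PlaneLayout, Nv12Layout and nv12_linear_layout are hypothetical names, and the sketch assumes even dimensions and no alignment padding; it says nothing about the tiled NV12MT layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one plane inside a shared buffer.
struct PlaneLayout {
   std::uint32_t offset;  // byte offset of the plane from the buffer start
   std::uint32_t stride;  // bytes per row
};

struct Nv12Layout {
   PlaneLayout y;       // full-resolution luma plane
   PlaneLayout cbcr;    // interleaved chroma plane, subsampled 2x2
   std::uint32_t size;  // total bytes needed
};

// Lays out linear NV12 with both planes in one buffer.
// Assumes even width and height and no extra padding.
inline Nv12Layout nv12_linear_layout(std::uint32_t width, std::uint32_t height)
{
   Nv12Layout l;
   l.y = {0, width};
   // The chroma plane has half as many rows; each row holds width/2
   // Cb/Cr pairs, which is again `width` bytes.
   l.cbcr = {width * height, width};
   l.size = width * height + width * (height / 2);
   return l;
}
```

For a 640x480 image the chroma plane starts at byte 307200, which is exactly the kind of non-zero offset a single-handle import has to be able to express.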

Anyways, I'd appreciate feedback.

BR,
-R

[1] https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt
[2] http://lists.freedesktop.org/archives/dri-devel/2014-July/064828.html


[Mesa-dev] Mesa 10.3 released

2014-09-19 Thread Emil Velikov
Mesa 10.3 has been released! Mesa 10.3 is a feature release that
includes many updates and enhancements. The full list is available in
the release notes file in docs/relnotes/10.3.html.

The tag in the GIT repository for Mesa 10.3 is 'mesa-10.3'.  I have
verified that the tag is in the correct place in the tree.

Mesa 10.3 is available for download at
ftp://freedesktop.org/pub/mesa/10.3/

SHA-256 checksums (can be verified with the sha256sum program):
9a1bf52040fc3dda81e83a35f944f1c3f532847dbe9fdf57161265cf71ea1bae  MesaLib-10.3.0.tar.gz
0283bfe710fa449ed82e465cfa09612a269e19abb7e0382082608062ce7960b5  MesaLib-10.3.0.tar.bz2
221420763c2c3a244836a736e735612c4a6a0377b4e5223fca1e612f49906789  MesaLib-10.3.0.zip

I have verified building from the .tar.bz2 file by doing:

tar xjf MesaLib-10.3.0.tar.bz2
cd Mesa-10.3.0
./configure --enable-gallium-llvm
make -j6
make -j6 install

I have also verified that I pushed the tag.

-Emil

--

Changes since mesa-10.3-rc3:

Andreas Boll (1):
  gallium/util: add missing u_debug include

Brian Paul (1):
  mesa: fix _mesa_free_pipeline_data() use-after-free bug

Christian König (1):
  mesa/st: don't advertise NV_vdpau_interop if it doesn't work.

Christoph Bumiller (2):
  nv50/ir/util: fix BitSet issues
  nvc0/ir: clarify recursion fix to finding first tex uses

Connor Abbott (1):
  r300g: set register classes before interferences

Emil Velikov (5):
  configure: bail out if building svga without libdrm
  configure: enable the gallium loader only when needed
  Bump version to 10.3 (final)
  docs: Update 10.3 release notes
  docs: Add 10.3 sha256 sums, news item and link release notes

Gwenole Beauchesne (1):
  i965: add support for RGBA dma_buf imports.

Iago Toral Quiroga (1):
  i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams.

Ian Romanick (8):
  mesa: Document SYSTEM_VALUE_VERTEX_ID and SYSTEM_VALUE_INSTANCE_ID
  mesa: Add SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
  mesa: Add SYSTEM_VALUE_BASE_VERTEX
  glsl/linker: Make get_main_function_signature public
  glsl: Add a lowering pass for gl_VertexID
  i965: Handle SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
  i965: Request lowering gl_VertexID
  i965/vec4: Only examine virtual_grf_end for GRF sources

Ilia Mirkin (4):
  nv50/ir: avoid array overrun when checking for supported mods
  nouveau: only enable the depth test if there actually is a depth buffer
  nouveau: only enable stencil func if the visual has stencil bits
  nouveau: change internal variables to avoid conflicts with macro args

Jason Ekstrand (1):
  i965/blorp: Pass image formats seperately from the miptree

Jonathan Gray (1):
  configure.ac: strip _GNU_SOURCE from llvm-config output

Kenneth Graunke (15):
  i965: Handle ir_triop_csel in emit_if_gen6().
  i965: Handle ir_binop_ubo_load in boolean expression code.
  mesa: Replace string comparisons with SYSTEM_VALUE enum checks.
  mesa: Fix glGetActiveAttribute for gl_VertexID when lowered.
  i965: Calculate start/base_vertex_location after preparing vertices.
  i965: Make gl_BaseVertex available in a buffer object.
  i965: Refactor Gen4-7 VERTEX_BUFFER_STATE emission into a helper.
  i965: Expose gl_BaseVertex via a vertex attribute.
  i965: Fix reference counting in new basevertex upload code.
  i965: Separate gl_InstanceID and gl_VertexID uploading.
  i965: Disable guardband clipping in the smaller-than-viewport case.
  i965: Skip allocating UNIFORM file storage for uniforms of size 0.
  i965/vec4: Make type_size() return 0 for samplers.
  glsl: Speed up constant folding for swizzles.
  i965: Mark delta_x/y as BAD_FILE if remapped away completely.

Kristian Høgsberg (1):
  i965: Adjust fast-clear resolve rect for BDW

Maarten Lankhorst (4):
  nouveau: re-allocate bo's on overflow
  nouveau: fix MPEG4 hw decoding
  nouveau: rework reference frame handling
  nouveau: remove unneeded assert

Matt Turner (1):
  i965/vec4: Reswizzle sources when necessary.

Richard Sandiford (1):
  gallivm: Fix uses of 2^24

Ulrich Weigand (1):
  gallivm: Fix Altivec pack intrinsics for little-endian





[Mesa-dev] [PATCH] r300g: implement MSAA copies by resolving and upsampling

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

There's no other way. It will use hw resolve + blit.
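A rough sketch of the resulting decision, assuming the resolve helper falls back to "resolve, then blit" whenever the direct path does not apply.  CopyPath, Surface and choose_copy_path are hypothetical names, and same_format_and_box stands in for the remaining checks of the "simple resolve" test.

```cpp
#include <cassert>

enum class CopyPath { PlainBlit, DirectResolve, ResolveThenBlit };

// Hypothetical, reduced view of a surface.
struct Surface {
   unsigned nr_samples;
   bool depth_or_stencil;
};

inline CopyPath choose_copy_path(const Surface &src, const Surface &dst,
                                 bool same_format_and_box)
{
   // Only multisampled color sources go through the resolve hardware.
   if (src.nr_samples <= 1 || src.depth_or_stencil)
      return CopyPath::PlainBlit;

   // A direct resolve needs a matching single-sampled destination.
   if (dst.nr_samples <= 1 && same_format_and_box)
      return CopyPath::DirectResolve;

   // Otherwise resolve first and blit the result, which upsamples when
   // the destination is itself multisampled.
   return CopyPath::ResolveThenBlit;
}
```

An MSAA-to-MSAA copy therefore lands in the last branch instead of being skipped.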
---
 src/gallium/drivers/r300/r300_blit.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_blit.c b/src/gallium/drivers/r300/r300_blit.c
index 2320abb..4e7efc5 100644
--- a/src/gallium/drivers/r300/r300_blit.c
+++ b/src/gallium/drivers/r300/r300_blit.c
@@ -679,7 +679,9 @@ static boolean r300_is_simple_msaa_resolve(const struct pipe_blit_info *info)
 unsigned dst_width = u_minify(info->dst.resource->width0, info->dst.level);
 unsigned dst_height = u_minify(info->dst.resource->height0, info->dst.level);
 
-return info->dst.resource->format == info->src.resource->format &&
+return info->src.resource->nr_samples > 1 &&
+   info->dst.resource->nr_samples <= 1 &&
+   info->dst.resource->format == info->src.resource->format &&
info->dst.resource->format == info->dst.format &&
info->src.resource->format == info->src.format &&
!info->scissor_enable &&
@@ -803,7 +805,6 @@ static void r300_blit(struct pipe_context *pipe,
 
 /* MSAA resolve. */
 if (info.src.resource->nr_samples > 1 &&
-info.dst.resource->nr_samples <= 1 &&
 !util_format_is_depth_or_stencil(info.src.resource->format)) {
 r300_msaa_resolve(pipe, &info);
 return;
-- 
1.9.1



[Mesa-dev] [PATCH 1/4] st/mesa: drop dependence on API profile in st_init_extensions

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

The extensions and limits being set in the conditional block are core-only
anyway and don't have any effect on other profiles.
---
 src/mesa/state_tracker/st_context.c|  2 +-
 src/mesa/state_tracker/st_extensions.c | 20 +---
 src/mesa/state_tracker/st_extensions.h |  1 -
 src/mesa/state_tracker/st_manager.c|  2 +-
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
index 768a667..1723513 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -242,7 +242,7 @@ st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe,
 
/* GL limits and extensions */
st_init_limits(st->pipe->screen, &ctx->Const, &ctx->Extensions);
-   st_init_extensions(st->pipe->screen, ctx->API, &ctx->Const,
+   st_init_extensions(st->pipe->screen, &ctx->Const,
   &ctx->Extensions, &st->options, ctx->Mesa_DXTn);
 
/* Enable shader-based fallbacks for ARB_color_buffer_float if needed. */
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index c7bc0ca..681723a 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -407,7 +407,6 @@ get_max_samples_for_formats(struct pipe_screen *screen,
  * Some fine tuning may still be needed.
  */
 void st_init_extensions(struct pipe_screen *screen,
-gl_api api,
 struct gl_constants *consts,
 struct gl_extensions *extensions,
 struct st_config_options *options,
@@ -844,17 +843,16 @@ void st_init_extensions(struct pipe_screen *screen,
  consts->DisableVaryingPacking = GL_TRUE;
}
 
-   if (api == API_OPENGL_CORE) {
-  consts->MaxViewports = screen->get_param(screen, PIPE_CAP_MAX_VIEWPORTS);
-  if (consts->MaxViewports >= 16) {
- consts->ViewportBounds.Min = -16384.0;
- consts->ViewportBounds.Max = 16384.0;
- extensions->ARB_viewport_array = GL_TRUE;
- extensions->ARB_fragment_layer_viewport = GL_TRUE;
- if (extensions->AMD_vertex_shader_layer)
-extensions->AMD_vertex_shader_viewport_index = GL_TRUE;
-  }
+   consts->MaxViewports = screen->get_param(screen, PIPE_CAP_MAX_VIEWPORTS);
+   if (consts->MaxViewports >= 16) {
+  consts->ViewportBounds.Min = -16384.0;
+  consts->ViewportBounds.Max = 16384.0;
+  extensions->ARB_viewport_array = GL_TRUE;
+  extensions->ARB_fragment_layer_viewport = GL_TRUE;
+  if (extensions->AMD_vertex_shader_layer)
+ extensions->AMD_vertex_shader_viewport_index = GL_TRUE;
}
+
if (consts->MaxProgramTextureGatherComponents > 0)
   extensions->ARB_texture_gather = GL_TRUE;
 
diff --git a/src/mesa/state_tracker/st_extensions.h b/src/mesa/state_tracker/st_extensions.h
index 8d2724d..faff11f 100644
--- a/src/mesa/state_tracker/st_extensions.h
+++ b/src/mesa/state_tracker/st_extensions.h
@@ -38,7 +38,6 @@ extern void st_init_limits(struct pipe_screen *screen,
struct gl_extensions *extensions);
 
 extern void st_init_extensions(struct pipe_screen *screen,
-   gl_api api,
struct gl_constants *consts,
struct gl_extensions *extensions,
struct st_config_options *options,
diff --git a/src/mesa/state_tracker/st_manager.c b/src/mesa/state_tracker/st_manager.c
index 7bc3326..df6de73 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -928,7 +928,7 @@ static unsigned get_version(struct pipe_screen *screen,
_mesa_init_extensions(&extensions);
 
st_init_limits(screen, &consts, &extensions);
-   st_init_extensions(screen, api, &consts, &extensions, options, GL_TRUE);
+   st_init_extensions(screen, &consts, &extensions, options, GL_TRUE);
 
return _mesa_get_version(&extensions, &consts, api);
 }
-- 
1.9.1



[Mesa-dev] [PATCH 4/4] st/mesa: redefine mapping from VARYING_SLOT_TEXi/PNTC/VARi to TGSI GENERIC[i]

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

Generic varyings in TGSI were based on the value of VARYING_SLOT_TEX0, so VAR0
was always GENERIC[22] (with tessellation patches). Some drivers might not
be able to cope with that.

This commit defines a proper mapping, so that PNTC is GENERIC[8] and VAR0 is
GENERIC[9].
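A sketch of the mapping described above, using made-up slot numbers (the real VARYING_SLOT_* values differ, and generic_varying_index is only a stand-in for the helper this patch adds).  The behaviour for drivers with dedicated texcoord semantics follows the old expressions visible in the diff: user varyings start at index 0 there.

```cpp
#include <cassert>

// Hypothetical slot numbering; VAR0 is deliberately not adjacent to
// PNTC to mimic the gap that the old TEX0-relative mapping exposed.
constexpr unsigned SLOT_TEX0 = 0;
constexpr unsigned SLOT_TEX7 = 7;
constexpr unsigned SLOT_PNTC = 8;
constexpr unsigned SLOT_VAR0 = 20;

inline unsigned generic_varying_index(bool needs_texcoord_semantic,
                                      unsigned attr)
{
   if (attr >= SLOT_VAR0) {
      // With dedicated TEXCOORD/PCOORD semantics the GENERIC range is
      // free, so user varyings start at 0; otherwise they follow PNTC.
      return needs_texcoord_semantic ? attr - SLOT_VAR0
                                     : 9 + (attr - SLOT_VAR0);
   }
   if (attr == SLOT_PNTC)
      return 8;

   // Remaining case: TEX0..TEX7 map to GENERIC[0..7].
   assert(attr >= SLOT_TEX0 && attr <= SLOT_TEX7);
   return attr - SLOT_TEX0;
}
```

The resulting indices no longer depend on how far apart the slots happen to sit in the enum.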
---
 src/mesa/state_tracker/st_atom_rasterizer.c |  3 +-
 src/mesa/state_tracker/st_program.c | 46 -
 src/mesa/state_tracker/st_program.h | 25 
 3 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c
index 71b7f1b..a228538 100644
--- a/src/mesa/state_tracker/st_atom_rasterizer.c
+++ b/src/mesa/state_tracker/st_atom_rasterizer.c
@@ -33,6 +33,7 @@
 #include "main/macros.h"
 #include "st_context.h"
 #include "st_atom.h"
+#include "st_program.h"
 #include "pipe/p_context.h"
 #include "pipe/p_defines.h"
 #include "cso_cache/cso_context.h"
@@ -174,7 +175,7 @@ static void update_raster_state( struct st_context *st )
   if (!st->needs_texcoord_semantic &&
   fragProg->Base.InputsRead & VARYING_BIT_PNTC) {
  raster->sprite_coord_enable |=
-1 << (VARYING_SLOT_PNTC - VARYING_SLOT_TEX0);
+1 << st_get_generic_varying_index(st, VARYING_SLOT_PNTC);
   }
 
   raster->point_quad_rasterization = 1;
diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index fbf8930..926086b 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -275,17 +275,18 @@ st_prepare_vertex_program(struct gl_context *ctx,
  case VARYING_SLOT_TEX5:
  case VARYING_SLOT_TEX6:
  case VARYING_SLOT_TEX7:
-stvp->output_semantic_name[slot] = st->needs_texcoord_semantic ?
-   TGSI_SEMANTIC_TEXCOORD : TGSI_SEMANTIC_GENERIC;
-stvp->output_semantic_index[slot] = attr - VARYING_SLOT_TEX0;
-break;
-
+if (st->needs_texcoord_semantic) {
+   stvp->output_semantic_name[slot] = TGSI_SEMANTIC_TEXCOORD;
+   stvp->output_semantic_index[slot] = attr - VARYING_SLOT_TEX0;
+   break;
+}
+/* fall through */
  case VARYING_SLOT_VAR0:
  default:
 assert(attr < VARYING_SLOT_MAX);
 stvp->output_semantic_name[slot] = TGSI_SEMANTIC_GENERIC;
-stvp->output_semantic_index[slot] = st->needs_texcoord_semantic ?
-   (attr - VARYING_SLOT_VAR0) : (attr - VARYING_SLOT_TEX0);
+stvp->output_semantic_index[slot] =
+   st_get_generic_varying_index(st, attr);
 break;
  }
   }
@@ -655,9 +656,8 @@ st_translate_fragment_program(struct st_context *st,
  * the user varyings on VAR0.  Otherwise, we use TEX0 as base index.
  */
 assert(attr >= VARYING_SLOT_TEX0);
-input_semantic_index[slot] = st->needs_texcoord_semantic ?
-   (attr - VARYING_SLOT_VAR0) : (attr - VARYING_SLOT_TEX0);
 input_semantic_name[slot] = TGSI_SEMANTIC_GENERIC;
+input_semantic_index[slot] = st_get_generic_varying_index(st, attr);
 if (attr == VARYING_SLOT_PNTC)
interpMode[slot] = TGSI_INTERPOLATE_LINEAR;
 else
@@ -974,16 +974,18 @@ st_translate_geometry_program(struct st_context *st,
  case VARYING_SLOT_TEX5:
  case VARYING_SLOT_TEX6:
  case VARYING_SLOT_TEX7:
-stgp->input_semantic_name[slot] = st->needs_texcoord_semantic ?
-   TGSI_SEMANTIC_TEXCOORD : TGSI_SEMANTIC_GENERIC;
-stgp->input_semantic_index[slot] = (attr - VARYING_SLOT_TEX0);
-break;
+if (st->needs_texcoord_semantic) {
+   stgp->input_semantic_name[slot] = TGSI_SEMANTIC_TEXCOORD;
+   stgp->input_semantic_index[slot] = attr - VARYING_SLOT_TEX0;
+   break;
+}
+/* fall through */
  case VARYING_SLOT_VAR0:
  default:
 assert(attr >= VARYING_SLOT_VAR0 && attr < VARYING_SLOT_MAX);
 stgp->input_semantic_name[slot] = TGSI_SEMANTIC_GENERIC;
-stgp->input_semantic_index[slot] = st->needs_texcoord_semantic ?
-   (attr - VARYING_SLOT_VAR0) : (attr - VARYING_SLOT_TEX0);
+stgp->input_semantic_index[slot] =
+   st_get_generic_varying_index(st, attr);
  break;
  }
   }
@@ -1069,17 +1071,19 @@ st_translate_geometry_program(struct st_context *st,
  case VARYING_SLOT_TEX5:
  case VARYING_SLOT_TEX6:
  case VARYING_SLOT_TEX7:
-gs_output_semantic_name[slot] = st->needs_texcoord_semantic ?
-   TGSI_SEMANTIC_TEXCOORD : TGSI_SEMANTIC_GENERIC;
-gs_output_semantic_index[slot] = (attr - VARYING_SLOT_TEX0);
-break;
+if (st->needs

[Mesa-dev] [PATCH 3/4] st/mesa: don't set coord_enable for gl_PointCoord if using TGSI_SEMANTIC_PCOORD

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

This was missed when Christoph Bumiller added PIPE_CAP_TGSI_TEXCOORD.
---
 src/mesa/state_tracker/st_atom_rasterizer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c
index 2bad643..71b7f1b 100644
--- a/src/mesa/state_tracker/st_atom_rasterizer.c
+++ b/src/mesa/state_tracker/st_atom_rasterizer.c
@@ -171,7 +171,8 @@ static void update_raster_state( struct st_context *st )
 raster->sprite_coord_enable |= 1 << i;
  }
   }
-  if (fragProg->Base.InputsRead & VARYING_BIT_PNTC) {
+  if (!st->needs_texcoord_semantic &&
+  fragProg->Base.InputsRead & VARYING_BIT_PNTC) {
  raster->sprite_coord_enable |=
 1 << (VARYING_SLOT_PNTC - VARYING_SLOT_TEX0);
   }
-- 
1.9.1



[Mesa-dev] [PATCH 2/4] st/mesa: use UniformBooleanTrue in glsl_to_tgsi

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

Just for consistency. This doesn't fix anything as the original code was
already pretty good.
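A small sketch of why one expression can cover both cases, assuming the configured "true" value is either all ones (native integers) or the bit pattern of 1.0f (float-only hardware) and that floats are IEEE-754.  encode_bool and float_bits are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw bits of a float (assumes a 32-bit IEEE-754 float).
inline std::uint32_t float_bits(float f)
{
   std::uint32_t bits;
   static_assert(sizeof(bits) == sizeof(f), "float must be 32 bits");
   std::memcpy(&bits, &f, sizeof(bits));
   return bits;
}

// Encodes a boolean constant with whatever pattern the driver treats
// as "true"; false is always zero.
inline std::uint32_t encode_bool(bool value, std::uint32_t uniform_boolean_true)
{
   return value ? uniform_boolean_true : 0u;
}
```

Writing float_bits(1.0f) through the unsigned view stores the same bytes as assigning 1.0f through the float view, so the two old branches collapse into one.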
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b338a98..57f43a6 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2617,10 +2617,7 @@ glsl_to_tgsi_visitor::visit(ir_constant *ir)
case GLSL_TYPE_BOOL:
   gl_type = native_integers ? GL_BOOL : GL_FLOAT;
   for (i = 0; i < ir->type->vector_elements; i++) {
- if (native_integers)
-values[i].u = ir->value.b[i] ? ~0 : 0;
- else
-values[i].f = ir->value.b[i];
+ values[i].u = ir->value.b[i] ? ctx->Const.UniformBooleanTrue : 0;
   }
   break;
default:
-- 
1.9.1



[Mesa-dev] [PATCH 1/2] mesa: don't set ES versions to GLSLVersion in _mesa_init_constants

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

No place in Mesa expects an ES version there.
Drivers don't even set it like this.
---
 src/mesa/main/context.c | 12 ++--
 src/mesa/main/mtypes.h  |  2 +-
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 682b9c7..53fb9c6 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -642,16 +642,8 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
consts->MaxGeometryTotalOutputComponents = MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS;
 
/* Shading language version */
-   if (api == API_OPENGL_COMPAT || api == API_OPENGL_CORE) {
-  consts->GLSLVersion = 120;
-  _mesa_override_glsl_version(consts);
-   }
-   else if (api == API_OPENGLES2) {
-  consts->GLSLVersion = 100;
-   }
-   else if (api == API_OPENGLES) {
-  consts->GLSLVersion = 0; /* GLSL not supported */
-   }
+   consts->GLSLVersion = 120;
+   _mesa_override_glsl_version(consts);
 
/* GL_ARB_framebuffer_object */
consts->MaxSamples = 0;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 553a216..7c237bd 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3452,7 +3452,7 @@ struct gl_constants
GLuint MaxGeometryOutputVertices;
GLuint MaxGeometryTotalOutputComponents;
 
-   GLuint GLSLVersion;  /**< GLSL version supported (ex: 120 = 1.20) */
+   GLuint GLSLVersion;  /**< Desktop GLSL version supported (ex: 120 = 1.20) */
 
/**
 * Changes default GLSL extension behavior from "error" to "warn".  It's out
-- 
1.9.1



[Mesa-dev] [PATCH 2/2] mesa: allow forcing >=3.1 compatibility contexts with MESA_GL_VERSION_OVERRIDE

2014-09-19 Thread Marek Olšák
From: Marek Olšák 

E.g. the 4.0 compatibility profile can be forced with:

MESA_GL_VERSION_OVERRIDE=4.0COMPAT

Some tests that I have require 4.0 compatibility.
---
 src/mesa/main/version.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 4dea530..71f7011 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -57,13 +57,15 @@ check_for_ending(const char *string, const char *ending)
  * fwd_context is only valid if version > 0
  */
 static void
-get_gl_override(int *version, GLboolean *fwd_context)
+get_gl_override(int *version, GLboolean *fwd_context,
+GLboolean *compat_context)
 {
const char *env_var = "MESA_GL_VERSION_OVERRIDE";
const char *version_str;
int major, minor, n;
static int override_version = -1;
static GLboolean fc_suffix = GL_FALSE;
+   static GLboolean compat_suffix = GL_FALSE;
 
if (override_version < 0) {
   override_version = 0;
@@ -71,6 +73,7 @@ get_gl_override(int *version, GLboolean *fwd_context)
   version_str = getenv(env_var);
   if (version_str) {
  fc_suffix = check_for_ending(version_str, "FC");
+ compat_suffix = check_for_ending(version_str, "COMPAT");
 
  n = sscanf(version_str, "%u.%u", &major, &minor);
  if (n != 2) {
@@ -87,6 +90,7 @@ get_gl_override(int *version, GLboolean *fwd_context)
 
*version = override_version;
*fwd_context = fc_suffix;
+   *compat_context = compat_suffix;
 }
 
 /**
@@ -129,16 +133,16 @@ _mesa_override_gl_version_contextless(struct gl_constants *consts,
   gl_api *apiOut, GLuint *versionOut)
 {
int version;
-   GLboolean fwd_context;
+   GLboolean fwd_context, compat_context;
 
-   get_gl_override(&version, &fwd_context);
+   get_gl_override(&version, &fwd_context, &compat_context);
 
if (version > 0) {
   *versionOut = version;
   if (version >= 30 && fwd_context) {
  *apiOut = API_OPENGL_CORE;
  consts->ContextFlags |= GL_CONTEXT_FLAG_FORWARD_COMPATIBLE_BIT;
-  } else if (version >= 31) {
+  } else if (version >= 31 && !compat_context) {
  *apiOut = API_OPENGL_CORE;
   } else {
  *apiOut = API_OPENGL_COMPAT;
@@ -166,9 +170,9 @@ int
 _mesa_get_gl_version_override(void)
 {
int version;
-   GLboolean fwd_context;
+   GLboolean fwd_context, compat_context;
 
-   get_gl_override(&version, &fwd_context);
+   get_gl_override(&version, &fwd_context, &compat_context);
 
return version;
 }
-- 
1.9.1
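
As an illustration of the selection logic in this patch, here is a
self-contained sketch of parsing the override string and picking the API.
It mirrors the branches shown in the diff, but the names override_result,
ends_with and parse_override are hypothetical, and the
"major * 10 + minor" encoding is an assumption since that line is outside
the quoted hunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum api_kind { API_COMPAT, API_CORE };

struct override_result {
   int version;              /* major * 10 + minor; 0 if unparsable */
   enum api_kind api;
   bool forward_compatible;
};

/* Hypothetical equivalent of a "does the string end with X" check. */
static bool
ends_with(const char *s, const char *ending)
{
   size_t ls = strlen(s), le = strlen(ending);
   return ls >= le && strcmp(s + ls - le, ending) == 0;
}

static struct override_result
parse_override(const char *s)
{
   struct override_result r = { 0, API_COMPAT, false };
   unsigned major, minor;
   bool fc, compat;

   assert(s != NULL);
   fc = ends_with(s, "FC");
   compat = ends_with(s, "COMPAT");

   if (sscanf(s, "%u.%u", &major, &minor) != 2)
      return r;

   r.version = (int)(major * 10 + minor);
   if (r.version >= 30 && fc) {
      /* Forward-compatible contexts are always core. */
      r.api = API_CORE;
      r.forward_compatible = true;
   } else if (r.version >= 31 && !compat) {
      /* 3.1 and later default to core unless COMPAT was requested. */
      r.api = API_CORE;
   } else {
      r.api = API_COMPAT;
   }
   return r;
}
```

Under these assumptions "4.0" selects a core context, while "4.0COMPAT"
keeps the compatibility profile, which is the case the patch adds.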



[Mesa-dev] Mesa 10.2.8

2014-09-19 Thread Emil Velikov
(Resend with GPG signature.)

Mesa 10.2.8 has been released. Mesa 10.2.8 is a bug-fix release that
fixes bugs found since the 10.2.7 release (see below for a list of
changes).

The tag in the git repository for Mesa 10.2.8 is 'mesa-10.2.8'.

Mesa 10.2.8 is available for download at
ftp://freedesktop.org/pub/mesa/10.2.8/

SHA-256 checksums (can be verified with the sha256sum program):
4c5a25ccaf1a9734bbd10d62a1420cc8fd35a1060ce679f2fc846769a25fbeec  MesaLib-10.2.8.tar.gz
1ef9ad3f241788d454f2ff8c9d65b6849dfc31c8fe91f70fd2930b81c8af1398  MesaLib-10.2.8.tar.bz2
d26218da3b44734b1d555267b4c63c48803c4c8b14d2bc53071be57014da37fa  MesaLib-10.2.8.zip

I have verified building from the .tar.bz2 file by doing:

tar xjf MesaLib-10.2.8.tar.bz2
cd Mesa-10.2.8
./configure --enable-gallium-llvm
make -j6
make -j6 install

I have also verified that I pushed the tag.

-Emil

--

Changes from 10.2.7 to 10.2.8:

Aaron Watry (1):
  gallivm: Fix build after LLVM commit 211259

Christoph Bumiller (2):
  nv50/ir/util: fix BitSet issues
  nvc0/ir: clarify recursion fix to finding first tex uses

Emil Velikov (4):
  docs: Add sha256 sums for the 10.2.7 release
  configure: bail out if building svga without libdrm
  Update VERSION to 10.2.8
  Add release notes for the 10.2.8 release

Ilia Mirkin (4):
  nv50/ir: avoid array overrun when checking for supported mods
  nouveau: only enable the depth test if there actually is a depth buffer
  nouveau: only enable stencil func if the visual has stencil bits
  nouveau: change internal variables to avoid conflicts with macro args

Jonathan Gray (1):
  configure.ac: strip _GNU_SOURCE from llvm-config output

José Fonseca (1):
  gallivm: Disable workaround for PR12833 on LLVM 3.2+.

Maarten Lankhorst (4):
  nouveau: re-allocate bo's on overflow
  nouveau: fix MPEG4 hw decoding
  nouveau: rework reference frame handling
  nouveau: remove unneeded assert

Marek Olšák (3):
  r600g,radeonsi: make sure there's enough CS space before resuming queries
  mesa: set UniformBooleanTrue = 1.0f by default
  st/mesa: use 1.0f as boolean true on drivers without integer support

Richard Sandiford (1):
  gallivm: Fix uses of 2^24

Roland Scheidegger (1):
  gallivm: set mcpu when initializing llvm execution engine

Thomas Hellstrom (1):
  winsys/svga: Fix incorrect type usage in IOCTL v2







Re: [Mesa-dev] [PATCH 5/5] mesa: Replace gl_client_array usage in _mesa_print_arrays()

2014-09-19 Thread Fredrik Höglund
On Friday 19 September 2014, Kenneth Graunke wrote:
> For now, this prints out the same information as before - just using the
> newer/non-derived structures.  Printing out each structure's fields
> separately might be more useful, but I've never used this code, so I'm
> not sure.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/varray.c | 47 +++
>  1 file changed, 31 insertions(+), 16 deletions(-)
> 
> diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
> index 09bf52c..380a32e 100644
> --- a/src/mesa/main/varray.c
> +++ b/src/mesa/main/varray.c
> @@ -1904,16 +1904,19 @@ _mesa_copy_vertex_buffer_binding(struct gl_context *ctx,
>   * Print vertex array's fields.
>   */
>  static void
> -print_array(const char *name, GLint index, const struct gl_client_array *array)
> +print_array(const char *name, GLint index,
> +const struct gl_vertex_attrib_array *attrib,
> +const struct gl_vertex_buffer_binding *binding)
>  {
> if (index >= 0)
>printf("  %s[%d]: ", name, index);
> else
>printf("  %s: ", name);
> printf("Ptr=%p, Type=0x%x, Size=%d, ElemSize=%u, Stride=%d, Buffer=%u(Size %lu)\n",
> -   array->Ptr, array->Type, array->Size,
> -   array->_ElementSize, array->StrideB,
> -   array->BufferObj->Name, (unsigned long) array->BufferObj->Size);
> +   _mesa_vertex_attrib_address(attrib, binding),
> +  attrib->Type, attrib->Size,
> +   attrib->_ElementSize, binding->Stride,
> +   binding->BufferObj->Name, (unsigned long) binding->BufferObj->Size);
>  }
>  
>  
> @@ -1927,18 +1930,30 @@ _mesa_print_arrays(struct gl_context *ctx)
> GLuint i;
>  
> printf("Array Object %u\n", vao->Name);
> -   if (vao->_VertexAttrib[VERT_ATTRIB_POS].Enabled)
> -  print_array("Vertex", -1, &vao->_VertexAttrib[VERT_ATTRIB_POS]);
> -   if (vao->_VertexAttrib[VERT_ATTRIB_NORMAL].Enabled)
> -  print_array("Normal", -1, &vao->_VertexAttrib[VERT_ATTRIB_NORMAL]);
> -   if (vao->_VertexAttrib[VERT_ATTRIB_COLOR0].Enabled)
> -  print_array("Color", -1, &vao->_VertexAttrib[VERT_ATTRIB_COLOR0]);
> -   for (i = 0; i < ctx->Const.MaxTextureCoordUnits; i++)
> -  if (vao->_VertexAttrib[VERT_ATTRIB_TEX(i)].Enabled)
> - print_array("TexCoord", i, &vao->_VertexAttrib[VERT_ATTRIB_TEX(i)]);
> -   for (i = 0; i < VERT_ATTRIB_GENERIC_MAX; i++)
> -  if (vao->_VertexAttrib[VERT_ATTRIB_GENERIC(i)].Enabled)
> - print_array("Attrib", i, &vao->_VertexAttrib[VERT_ATTRIB_GENERIC(i)]);
> +   if (vao->VertexAttrib[VERT_ATTRIB_POS].Enabled) {
> +  print_array("Vertex", -1, &vao->VertexAttrib[VERT_ATTRIB_POS],
> +&vao->VertexBinding[VERT_ATTRIB_POS]);
> +   }
> +   if (vao->VertexAttrib[VERT_ATTRIB_NORMAL].Enabled) {
> +  print_array("Normal", -1, &vao->VertexAttrib[VERT_ATTRIB_NORMAL],
> +&vao->VertexBinding[VERT_ATTRIB_NORMAL]);
> +   }
> +   if (vao->VertexAttrib[VERT_ATTRIB_COLOR0].Enabled) {
> +  print_array("Color", -1, &vao->VertexAttrib[VERT_ATTRIB_COLOR0],
> +   &vao->VertexBinding[VERT_ATTRIB_COLOR0]);
> +   }
> +   for (i = 0; i < ctx->Const.MaxTextureCoordUnits; i++) {
> +  if (vao->VertexAttrib[VERT_ATTRIB_TEX(i)].Enabled) {
> + print_array("TexCoord", i, &vao->VertexAttrib[VERT_ATTRIB_TEX(i)],
> +&vao->VertexBinding[VERT_ATTRIB_TEX(i)]);
> +  }
> +   }
> +   for (i = 0; i < VERT_ATTRIB_GENERIC_MAX; i++) {
> +  if (vao->VertexAttrib[VERT_ATTRIB_GENERIC(i)].Enabled) {
> + print_array("Attrib", i, &vao->VertexAttrib[VERT_ATTRIB_GENERIC(i)],
> +  &vao->VertexBinding[VERT_ATTRIB_GENERIC(i)]);

The generic attributes are not always associated with the vertex
binding of the same index.  The VertexBinding array should be indexed
by gl_vertex_attrib_array::VertexBinding.

I think it would be easier to just pass the vao pointer and the index
to print_array() and let it figure out which attrib array and binding
it should use.

> +  }
> +   }
>  }
>  
>  
> 
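
To make the suggestion above concrete, here is a minimal sketch of
looking up the binding through the attribute's own binding index rather
than through the attribute index. The structs are cut-down, hypothetical
stand-ins (array_object, attrib_array, buffer_binding) and not the real
Mesa types; binding_index plays the role of
gl_vertex_attrib_array::VertexBinding.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATTRIBS 4   /* hypothetical, kept tiny for illustration */

struct attrib_array {
   int enabled;
   unsigned binding_index;   /* which buffer binding this attrib uses */
};

struct buffer_binding {
   int stride;
};

struct array_object {
   struct attrib_array attribs[MAX_ATTRIBS];
   struct buffer_binding bindings[MAX_ATTRIBS];
};

/* Take the array object and the attribute index, and let the helper
 * work out which binding belongs to that attribute. */
static const struct buffer_binding *
binding_for_attrib(const struct array_object *vao, unsigned attrib)
{
   unsigned b;

   assert(attrib < MAX_ATTRIBS);
   b = vao->attribs[attrib].binding_index;
   assert(b < MAX_ATTRIBS);
   return &vao->bindings[b];
}
```

A print helper built on this would take only the vao pointer and the
index, so a caller cannot pair an attribute with the wrong binding.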



Re: [Mesa-dev] [PATCH 1/5] mesa: Remove some dead helper functions.

2014-09-19 Thread Fredrik Höglund
Patches 1, 2, 3 and 4 are:

Reviewed-by: Fredrik Höglund 

On Friday 19 September 2014, Kenneth Graunke wrote:
> Dead since the _MaxElement removal, but these functions seemed generally
> applicable, so I decided to remove them in a separate patch.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/arrayobj.h | 26 --
>  1 file changed, 26 deletions(-)
> 
> diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h
> index 1819cd1..3c1f918 100644
> --- a/src/mesa/main/arrayobj.h
> +++ b/src/mesa/main/arrayobj.h
> @@ -78,32 +78,6 @@ extern void
>  _mesa_update_vao_client_arrays(struct gl_context *ctx,
> struct gl_vertex_array_object *vao);
>  
> -
> -/** Returns the bitmask of all enabled arrays in fixed function mode.
> - *
> - *  In fixed function mode only the traditional fixed function arrays
> - *  are available.
> - */
> -static inline GLbitfield64
> -_mesa_array_object_get_enabled_ff(const struct gl_vertex_array_object *vao)
> -{
> -   return vao->_Enabled & VERT_BIT_FF_ALL;
> -}
> -
> -/** Returns the bitmask of all enabled arrays in arb/glsl shader mode.
> - *
> - *  In arb/glsl shader mode all the fixed function and the arb/glsl generic
> - *  arrays are available. Only the first generic array takes
> - *  precedence over the legacy position array.
> - */
> -static inline GLbitfield64
> -_mesa_array_object_get_enabled_arb(const struct gl_vertex_array_object *vao)
> -{
> -   GLbitfield64 enabled = vao->_Enabled;
> -   return enabled & ~(VERT_BIT_POS & (enabled >> VERT_ATTRIB_GENERIC0));
> -}
> -
> -
>  /*
>   * API functions
>   */
> 



Re: [Mesa-dev] [PATCH] gallivm: add information about different sampler/view units if analyzing shader

2014-09-19 Thread Jose Fonseca

On 19/09/14 18:12, srol...@vmware.com wrote:

From: Roland Scheidegger 

Useful to know in some cases.
---
  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h  | 6 ++
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c | 4 
  2 files changed, 10 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 85411ce..029ca3c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -127,6 +127,12 @@ struct lp_tgsi_info
 unsigned indirect_textures:1;

 /*
+* Whether any of the texture (sample) opcodes use different sampler
+* and sampler view unit.
+*/
+   unsigned sampler_texture_units_different:1;
+
+   /*
  * Whether any immediate values are outside the range of 0 and 1
  */
 unsigned unclamped_immediates:1;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
index fcaa201..55acea8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
@@ -243,6 +243,10 @@ analyse_sample(struct analysis_context *ctx,
tex_info->texture_unit = inst->Src[1].Register.Index;
tex_info->sampler_unit = inst->Src[2].Register.Index;

+  if (tex_info->texture_unit != tex_info->sampler_unit) {
+ info->sampler_texture_units_different = TRUE;
+  }
+
if (modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_DERIV ||
modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_LOD ||
modifier == LP_BLD_TEX_MODIFIER_LOD_BIAS || shadow) {



LGTM.

Jose


[Mesa-dev] [PATCH] gallivm: add information about different sampler/view units if analyzing shader

2014-09-19 Thread sroland
From: Roland Scheidegger 

Useful to know in some cases.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.h  | 6 ++
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c | 4 
 2 files changed, 10 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
index 85411ce..029ca3c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
@@ -127,6 +127,12 @@ struct lp_tgsi_info
unsigned indirect_textures:1;
 
/*
+* Whether any of the texture (sample) opcodes use different sampler
+* and sampler view unit.
+*/
+   unsigned sampler_texture_units_different:1;
+
+   /*
 * Whether any immediate values are outside the range of 0 and 1
 */
unsigned unclamped_immediates:1;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
index fcaa201..55acea8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c
@@ -243,6 +243,10 @@ analyse_sample(struct analysis_context *ctx,
   tex_info->texture_unit = inst->Src[1].Register.Index;
   tex_info->sampler_unit = inst->Src[2].Register.Index;
 
+  if (tex_info->texture_unit != tex_info->sampler_unit) {
+ info->sampler_texture_units_different = TRUE;
+  }
+
   if (modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_DERIV ||
   modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_LOD ||
   modifier == LP_BLD_TEX_MODIFIER_LOD_BIAS || shadow) {
-- 
1.9.1



[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps

2014-09-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84089

--- Comment #4 from Tom Stellard  ---
Created attachment 106551
  --> https://bugs.freedesktop.org/attachment.cgi?id=106551&action=edit
Fix

Can you try this patch?  Make sure you replace the original assert in the mesa
code.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/5] mesa: Remove some dead helper functions.

2014-09-19 Thread Brian Paul

The series looks good to me.  Nice catches.
Reviewed-by: Brian Paul 

-Brian


On 09/18/2014 05:32 PM, Kenneth Graunke wrote:

Dead since the _MaxElement removal, but these functions seemed generally
applicable, so I decided to remove them in a separate patch.

Signed-off-by: Kenneth Graunke 
---
  src/mesa/main/arrayobj.h | 26 --
  1 file changed, 26 deletions(-)

diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h
index 1819cd1..3c1f918 100644
--- a/src/mesa/main/arrayobj.h
+++ b/src/mesa/main/arrayobj.h
@@ -78,32 +78,6 @@ extern void
  _mesa_update_vao_client_arrays(struct gl_context *ctx,
 struct gl_vertex_array_object *vao);

-
-/** Returns the bitmask of all enabled arrays in fixed function mode.
- *
- *  In fixed function mode only the traditional fixed function arrays
- *  are available.
- */
-static inline GLbitfield64
-_mesa_array_object_get_enabled_ff(const struct gl_vertex_array_object *vao)
-{
-   return vao->_Enabled & VERT_BIT_FF_ALL;
-}
-
-/** Returns the bitmask of all enabled arrays in arb/glsl shader mode.
- *
- *  In arb/glsl shader mode all the fixed function and the arb/glsl generic
- *  arrays are available. Only the first generic array takes
- *  precedence over the legacy position array.
- */
-static inline GLbitfield64
-_mesa_array_object_get_enabled_arb(const struct gl_vertex_array_object *vao)
-{
-   GLbitfield64 enabled = vao->_Enabled;
-   return enabled & ~(VERT_BIT_POS & (enabled >> VERT_ATTRIB_GENERIC0));
-}
-
-
  /*
   * API functions
   */
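
As a side note on the second removed helper, the mask expression is easy
to misread, so here is a small worked sketch. The slot numbers below are
hypothetical (the real values live in Mesa's headers); the only property
the trick relies on is that the position attribute is bit 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical attribute slots, chosen for illustration only. */
#define ATTRIB_POS       0
#define ATTRIB_GENERIC0  16
#define BIT_POS          (UINT64_C(1) << ATTRIB_POS)
#define BIT_GENERIC0     (UINT64_C(1) << ATTRIB_GENERIC0)

/* Drop the legacy position bit when generic attribute 0 is enabled. */
static uint64_t
enabled_in_shader_mode(uint64_t enabled)
{
   /* The shift below lines generic0 up with bit 0, so this only works
    * when the position attribute occupies bit 0. */
   assert(ATTRIB_POS == 0);

   /* (enabled >> ATTRIB_GENERIC0) moves the generic0 bit onto the
    * position bit; ANDing with BIT_POS keeps just that one bit, and
    * the complement clears position only if generic0 was set. */
   return enabled & ~(BIT_POS & (enabled >> ATTRIB_GENERIC0));
}
```

So with both position and generic0 enabled, only generic0 survives; with
position alone, the mask is unchanged.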





Re: [Mesa-dev] [PATCH v4] glsl: Expand matrix flip optimization pass to cover more cases.

2014-09-19 Thread Iago Toral Quiroga
On vie, 2014-09-19 at 21:16 +1000, Timothy Arceri wrote:
> On Fri, 2014-09-19 at 12:52 +0200, Iago Toral Quiroga wrote:
> > Also, as suggested by Ian Romanick, make it so we don't need a bunch of
> > individual handles to flippable matrices, instead we register
> > matrix/transpose_matrix pairs in a hash table for all built-in matrices
> > using the non-transpose matrix name as key.
> > ---
> >  src/glsl/opt_flip_matrices.cpp | 159 +++--
> >  1 file changed, 121 insertions(+), 38 deletions(-)
> > 
> > > I think this never got the reviewed-by... This is a rebased version of the v3
> > patch that also fixes a silly mistake that I had introduced in that version.
> > No piglit regressions observed on SandyBridge.
> > 
> > Ian, do you think this version is good?
> > 
> > diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp
> > index 04c6170..bb449d6 100644
> > --- a/src/glsl/opt_flip_matrices.cpp
> > +++ b/src/glsl/opt_flip_matrices.cpp
> > @@ -29,43 +29,143 @@
> >   * On some hardware, this is more efficient.
> >   *
> >   * This currently only does the conversion for built-in matrices which
> > > - * already have transposed equivalents.  Namely, gl_ModelViewProjectionMatrix
> > - * and gl_TextureMatrix.
> > + * already have transposed equivalents.
> >   */
> >  #include "ir.h"
> >  #include "ir_optimization.h"
> >  #include "main/macros.h"
> > +#include "program/hash_table.h"
> >  
> >  namespace {
> >  class matrix_flipper : public ir_hierarchical_visitor {
> >  public:
> > +   struct matrix_and_transpose {
> > +  ir_variable *matrix;
> > +  ir_variable *transpose_matrix;
> > +   };
> > +
> > matrix_flipper(exec_list *instructions)
> > {
> > +  this->mem_ctx = ralloc_context(NULL);
> >progress = false;
> > -  mvp_transpose = NULL;
> > -  texmat_transpose = NULL;
> > +
> > +  /* Build a hash table of built-in matrices and their transposes.
> > +   *
> > > +   * The key for the entries in the hash table is the non-transpose matrix
> > +   * name. This assumes that all built-in transpose matrices have the
> > +   * "Transpose" suffix.
> > +   */
> > +  ht = hash_table_ctor(0, hash_table_string_hash,
> > +   hash_table_string_compare);
> >  
> >foreach_in_list(ir_instruction, ir, instructions) {
> >   ir_variable *var = ir->as_variable();
> > +
> >   if (!var)
> >  continue;
> > > - if (strcmp(var->name, "gl_ModelViewProjectionMatrixTranspose") == 0)
> > -mvp_transpose = var;
> > - if (strcmp(var->name, "gl_TextureMatrixTranspose") == 0)
> > -texmat_transpose = var;
> > +
> > + /* Must be a matrix or array of matrices. */
> > + if (!var->type->is_matrix() &&
> > > + !(var->type->is_array() && var->type->fields.array->is_matrix()))
> 
> 
> This can now be simplified to
> 
> if(!var->type->without_array()->is_matrix())

Oh, nice! I have changed the patch to do this instead.
Thanks!

> 
> > +continue;
> > +
> > + /* Must be a built-in */
> > + if (!is_gl_identifier(var->name))
> > +continue;
> > +
> > + /* Create a new entry for this matrix if we don't have one yet */
> > + bool new_entry = false;
> > + struct matrix_and_transpose *entry =
> > +(struct matrix_and_transpose *) hash_table_find(ht, var->name);
> > + if (!entry) {
> > +new_entry = true;
> > +entry = new struct matrix_and_transpose();
> > +entry->matrix = NULL;
> > +entry->transpose_matrix = NULL;
> > + }
> > +
> > + const char *transpose_ptr = strstr(var->name, "Transpose");
> > + if (transpose_ptr == NULL) {
> > +entry->matrix = var;
> > + } else {
> > +/* We should not be adding transpose built-in matrices that do
> > + * not end in 'Transpose'.
> > + */
> > +assert(transpose_ptr[9] == 0);
> > +entry->transpose_matrix = var;
> > + }
> > +
> > + if (new_entry) {
> > +char *entry_key;
> > +if (transpose_ptr == NULL) {
> > +   entry_key = (char *) var->name;
> > +} else {
> > +   entry_key = ralloc_strndup(this->mem_ctx, var->name,
> > +  transpose_ptr - var->name);
> > +}
> > +hash_table_insert(ht, entry, entry_key);
> > + }
> >}
> > }
> >  
> > +   ~matrix_flipper()
> > +   {
> > +  hash_table_dtor(ht);
> > +  ralloc_free(this->mem_ctx);
> > +   }
> > +
> > ir_visitor_status visit_enter(ir_expression *ir);
> >  
> > bool progress;
> >  
> >  private:
> > -   ir_variable *mvp_transpose;
> > -   ir_variable *texmat_transpose;
> > +   void transform_operands(ir_expression *ir,
> > + 

Re: [Mesa-dev] [PATCH v4] glsl: Expand matrix flip optimization pass to cover more cases.

2014-09-19 Thread Timothy Arceri
On Fri, 2014-09-19 at 12:52 +0200, Iago Toral Quiroga wrote:
> Also, as suggested by Ian Romanick, make it so we don't need a bunch of
> individual handles to flippable matrices, instead we register
> matrix/transpose_matrix pairs in a hash table for all built-in matrices
> using the non-transpose matrix name as key.
> ---
>  src/glsl/opt_flip_matrices.cpp | 159 +++--
>  1 file changed, 121 insertions(+), 38 deletions(-)
> 
> I think this never got the reviewed-by... This is a rebased version of the v3
> patch that also fixes a silly mistake that I had introduced in that version.
> No piglit regressions observed on SandyBridge.
> 
> Ian, do you think this version is good?
> 
> diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp
> index 04c6170..bb449d6 100644
> --- a/src/glsl/opt_flip_matrices.cpp
> +++ b/src/glsl/opt_flip_matrices.cpp
> @@ -29,43 +29,143 @@
>   * On some hardware, this is more efficient.
>   *
>   * This currently only does the conversion for built-in matrices which
> - * already have transposed equivalents.  Namely, gl_ModelViewProjectionMatrix
> - * and gl_TextureMatrix.
> + * already have transposed equivalents.
>   */
>  #include "ir.h"
>  #include "ir_optimization.h"
>  #include "main/macros.h"
> +#include "program/hash_table.h"
>  
>  namespace {
>  class matrix_flipper : public ir_hierarchical_visitor {
>  public:
> +   struct matrix_and_transpose {
> +  ir_variable *matrix;
> +  ir_variable *transpose_matrix;
> +   };
> +
> matrix_flipper(exec_list *instructions)
> {
> +  this->mem_ctx = ralloc_context(NULL);
>progress = false;
> -  mvp_transpose = NULL;
> -  texmat_transpose = NULL;
> +
> +  /* Build a hash table of built-in matrices and their transposes.
> +   *
> +   * The key for the entries in the hash table is the non-transpose matrix
> +   * name. This assumes that all built-in transpose matrices have the
> +   * "Transpose" suffix.
> +   */
> +  ht = hash_table_ctor(0, hash_table_string_hash,
> +   hash_table_string_compare);
>  
>foreach_in_list(ir_instruction, ir, instructions) {
>   ir_variable *var = ir->as_variable();
> +
>   if (!var)
>  continue;
> - if (strcmp(var->name, "gl_ModelViewProjectionMatrixTranspose") == 0)
> -mvp_transpose = var;
> - if (strcmp(var->name, "gl_TextureMatrixTranspose") == 0)
> -texmat_transpose = var;
> +
> + /* Must be a matrix or array of matrices. */
> + if (!var->type->is_matrix() &&
> + !(var->type->is_array() && var->type->fields.array->is_matrix()))


This can now be simplified to

if(!var->type->without_array()->is_matrix())


> +continue;
> +
> + /* Must be a built-in */
> + if (!is_gl_identifier(var->name))
> +continue;
> +
> + /* Create a new entry for this matrix if we don't have one yet */
> + bool new_entry = false;
> + struct matrix_and_transpose *entry =
> +(struct matrix_and_transpose *) hash_table_find(ht, var->name);
> + if (!entry) {
> +new_entry = true;
> +entry = new struct matrix_and_transpose();
> +entry->matrix = NULL;
> +entry->transpose_matrix = NULL;
> + }
> +
> + const char *transpose_ptr = strstr(var->name, "Transpose");
> + if (transpose_ptr == NULL) {
> +entry->matrix = var;
> + } else {
> +/* We should not be adding transpose built-in matrices that do
> + * not end in 'Transpose'.
> + */
> +assert(transpose_ptr[9] == 0);
> +entry->transpose_matrix = var;
> + }
> +
> + if (new_entry) {
> +char *entry_key;
> +if (transpose_ptr == NULL) {
> +   entry_key = (char *) var->name;
> +} else {
> +   entry_key = ralloc_strndup(this->mem_ctx, var->name,
> +  transpose_ptr - var->name);
> +}
> +hash_table_insert(ht, entry, entry_key);
> + }
>}
> }
>  
> +   ~matrix_flipper()
> +   {
> +  hash_table_dtor(ht);
> +  ralloc_free(this->mem_ctx);
> +   }
> +
> ir_visitor_status visit_enter(ir_expression *ir);
>  
> bool progress;
>  
>  private:
> -   ir_variable *mvp_transpose;
> -   ir_variable *texmat_transpose;
> +   void transform_operands(ir_expression *ir,
> +   ir_variable *mat_var,
> +   ir_variable *mat_transpose);
> +   void transform_operands_array_of_matrix(ir_expression *ir,
> +   ir_variable *mat_var,
> +   ir_variable *mat_transpose);
> +   struct hash_table *ht;
> +   void *mem_ctx;
>  };
>  }
>  
> +void
> +matrix_flippe

[Mesa-dev] [PATCH v4] glsl: Expand matrix flip optimization pass to cover more cases.

2014-09-19 Thread Iago Toral Quiroga
Also, as suggested by Ian Romanick, make it so we don't need a bunch of
individual handles to flippable matrices, instead we register
matrix/transpose_matrix pairs in a hash table for all built-in matrices
using the non-transpose matrix name as key.
---
 src/glsl/opt_flip_matrices.cpp | 159 +++--
 1 file changed, 121 insertions(+), 38 deletions(-)

I think this never got the reviewed-by... This is a rebased version of the v3
patch that also fixes a silly mistake that I had introduced in that version.
No piglit regressions observed on SandyBridge.

Ian, do you think this version is good?
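
For reviewers, here is a minimal sketch of how the hash key is derived
from a variable name in this approach: a matrix and its transpose both
map to the non-transpose name. This is plain C for illustration only;
matrix_key is a hypothetical helper, and the patch itself uses
ralloc_strndup() and Mesa's hash table rather than a caller-provided
buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Write the lookup key for `name` into `key` and report whether `name`
 * is the transposed variant. Like the patch, this assumes "Transpose"
 * only ever appears as a suffix of built-in matrix names. */
static bool
matrix_key(const char *name, char *key, size_t key_size)
{
   static const char suffix[] = "Transpose";
   const char *p = strstr(name, suffix);
   size_t len = p ? (size_t)(p - name) : strlen(name);

   /* A transpose built-in must end right after the suffix. */
   assert(p == NULL || p[strlen(suffix)] == '\0');
   assert(len < key_size);

   memcpy(key, name, len);
   key[len] = '\0';
   return p != NULL;
}
```

Both "gl_TextureMatrix" and "gl_TextureMatrixTranspose" yield the key
"gl_TextureMatrix", which is what lets a single entry hold the pair.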

diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp
index 04c6170..bb449d6 100644
--- a/src/glsl/opt_flip_matrices.cpp
+++ b/src/glsl/opt_flip_matrices.cpp
@@ -29,43 +29,143 @@
  * On some hardware, this is more efficient.
  *
  * This currently only does the conversion for built-in matrices which
- * already have transposed equivalents.  Namely, gl_ModelViewProjectionMatrix
- * and gl_TextureMatrix.
+ * already have transposed equivalents.
  */
 #include "ir.h"
 #include "ir_optimization.h"
 #include "main/macros.h"
+#include "program/hash_table.h"
 
 namespace {
 class matrix_flipper : public ir_hierarchical_visitor {
 public:
+   struct matrix_and_transpose {
+  ir_variable *matrix;
+  ir_variable *transpose_matrix;
+   };
+
matrix_flipper(exec_list *instructions)
{
+  this->mem_ctx = ralloc_context(NULL);
   progress = false;
-  mvp_transpose = NULL;
-  texmat_transpose = NULL;
+
+  /* Build a hash table of built-in matrices and their transposes.
+   *
+   * The key for the entries in the hash table is the non-transpose matrix
+   * name. This assumes that all built-in transpose matrices have the
+   * "Transpose" suffix.
+   */
+  ht = hash_table_ctor(0, hash_table_string_hash,
+   hash_table_string_compare);
 
   foreach_in_list(ir_instruction, ir, instructions) {
  ir_variable *var = ir->as_variable();
+
  if (!var)
 continue;
- if (strcmp(var->name, "gl_ModelViewProjectionMatrixTranspose") == 0)
-mvp_transpose = var;
- if (strcmp(var->name, "gl_TextureMatrixTranspose") == 0)
-texmat_transpose = var;
+
+ /* Must be a matrix or array of matrices. */
+ if (!var->type->is_matrix() &&
+ !(var->type->is_array() && var->type->fields.array->is_matrix()))
+continue;
+
+ /* Must be a built-in */
+ if (!is_gl_identifier(var->name))
+continue;
+
+ /* Create a new entry for this matrix if we don't have one yet */
+ bool new_entry = false;
+ struct matrix_and_transpose *entry =
+(struct matrix_and_transpose *) hash_table_find(ht, var->name);
+ if (!entry) {
+new_entry = true;
+entry = new struct matrix_and_transpose();
+entry->matrix = NULL;
+entry->transpose_matrix = NULL;
+ }
+
+ const char *transpose_ptr = strstr(var->name, "Transpose");
+ if (transpose_ptr == NULL) {
+entry->matrix = var;
+ } else {
+/* We should not be adding transpose built-in matrices that do
+ * not end in 'Transpose'.
+ */
+assert(transpose_ptr[9] == 0);
+entry->transpose_matrix = var;
+ }
+
+ if (new_entry) {
+char *entry_key;
+if (transpose_ptr == NULL) {
+   entry_key = (char *) var->name;
+} else {
+   entry_key = ralloc_strndup(this->mem_ctx, var->name,
+  transpose_ptr - var->name);
+}
+hash_table_insert(ht, entry, entry_key);
+ }
   }
}
 
+   ~matrix_flipper()
+   {
+  hash_table_dtor(ht);
+  ralloc_free(this->mem_ctx);
+   }
+
ir_visitor_status visit_enter(ir_expression *ir);
 
bool progress;
 
 private:
-   ir_variable *mvp_transpose;
-   ir_variable *texmat_transpose;
+   void transform_operands(ir_expression *ir,
+   ir_variable *mat_var,
+   ir_variable *mat_transpose);
+   void transform_operands_array_of_matrix(ir_expression *ir,
+   ir_variable *mat_var,
+   ir_variable *mat_transpose);
+   struct hash_table *ht;
+   void *mem_ctx;
 };
 }
 
+void
+matrix_flipper::transform_operands(ir_expression *ir,
+   ir_variable *mat_var,
+   ir_variable *mat_transpose)
+{
+#ifndef NDEBUG
+   ir_dereference_variable *deref = ir->operands[0]->as_dereference_variable();
+   assert(deref && deref->var == mat_var);
+#endif
+
+   void *mem_ctx = ralloc_parent(ir);
+   ir->operands[0] = ir->operands[1];
+   ir->operands[1] = new
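
The key derivation described in the constructor comment above can be looked at in isolation. The following is only a sketch, assuming (as the patch does) that every built-in transpose matrix name ends in "Transpose". The helper name matrix_key is hypothetical and is not part of the patch; it uses std::string instead of ralloc so that it builds on its own.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper, not part of the patch: derive the hash table key for a
// built-in matrix name by dropping a trailing "Transpose" suffix, so that a
// matrix and its transpose map to the same entry.
static std::string
matrix_key(const char *name)
{
   const char *transpose_ptr = std::strstr(name, "Transpose");
   if (transpose_ptr == nullptr)
      return std::string(name);

   // Same assumption as the assert in the patch: "Transpose" (9 characters)
   // must be the very end of the name.
   assert(transpose_ptr[9] == '\0');
   return std::string(name, transpose_ptr - name);
}
```

With this, both "gl_TextureMatrix" and "gl_TextureMatrixTranspose" yield the key "gl_TextureMatrix", which is what lets the constructor fill in both fields of a single matrix_and_transpose entry regardless of the order in which the variables are visited.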

[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps

2014-09-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84089

--- Comment #3 from Christian König  ---
(In reply to comment #2)
> I use llvm-svn|git from yesterday, and 6-7 days ago Valley worked fine.

Ah! Then some change in LLVM broke register spilling, please bisect LLVM to
figure out what it was.

Thanks,
Christian.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps

2014-09-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84089

--- Comment #2 from Iaroslav Andrusyak  ---
I use llvm-svn|git from yesterday, and 6-7 days ago Valley worked fine.



[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps

2014-09-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84089

--- Comment #1 from Christian König  ---
(In reply to comment #0)
> I changed num_sgprs <= 104 to num_sgprs <= 204 and added fprint for num_sgprs
> and user_sgprs

??? 104 is a hardware limit, you can't change it. You probably just need to use
a newer LLVM version which supports SGPR spilling.



[Mesa-dev] [Bug 84089] [radeonsi] hd 7790 need more sGPRS for ps

2014-09-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84089

Iaroslav Andrusyak  changed:

What     | Removed                         | Added
---------+---------------------------------+-------------------------------
Assignee | dri-devel@lists.freedesktop.org | mesa-dev@lists.freedesktop.org



Re: [Mesa-dev] [PATCH 28/37] i965/gen6/gs: implement transform feedback support in gen6_gs_visitor

2014-09-19 Thread Iago Toral Quiroga
On vie, 2014-09-19 at 00:26 -0700, Jordan Justen wrote:
> On Thu, Sep 18, 2014 at 11:50 PM, Samuel Iglesias Gonsálvez
>  wrote:
> > On Thu, 2014-09-18 at 16:05 -0700, Jordan Justen wrote:
> >> On Thu, Aug 14, 2014 at 4:12 AM, Iago Toral Quiroga  
> >> wrote:
> >> > From: Samuel Iglesias Gonsalvez 
> >> >
> >> > +  this->xfb_output = src_reg(this,
> >> > + glsl_type::uint_type,
> >> > + linked_xfb_info->NumOutputs *
> >> > + c->gp->program.VerticesOut);
> >> > +  this->xfb_output_offset = src_reg(this, glsl_type::uint_type);
> >> > +  emit(MOV(dst_reg(this->xfb_output_offset), src_reg(0u)));
> >> > +  /* Create a virtual register to hold destination indices in SOL */
> >> > +  this->destination_indices = src_reg(this, glsl_type::uvec4_type);
> >> > +  /* Create a virtual register to hold temporal values in SOL */
> >> > +  this->sol_temp = src_reg(this, glsl_type::uvec4_type);
> >>
> >> What is the duration of liveness for sol_temp?
> >>
> >> Would it be better to generate a new temp in each function to help out
> >> register allocation?
> >>
> >
> > Yes, it is better. I have made this change: create a new temp virtual
> > register in every place it is needed (emit_thread_end(), xfb_write(),
> > xfb_program()).
> 
> Cool. Add Reviewed-by: Jordan Justen  for
> this patch, and:
>  i965/gen6/gs: Avoid buffering transform feedback varyings twice.
>  i965/gen6/gs: Fix binding table clash between TF surfaces and textures.
>  i965/gen6/gs: Enable transform feedback support in geometry shaders
>  i965/gen6/gs: upload ubo and pull constants surfaces.
>  i965/gen6/gs: Use a specific implementation of geometry shaders for gen6.
>  i965/gen6: enable GLSL 1.50 and OpenGL 3.2
> 
> That is the rest of the series, right?

Yes.

> Thank you both for all the great work on this series!

Great, thanks for taking the time to review all the patches! I'll push
them later today.

Iago



Re: [Mesa-dev] [PATCH 28/37] i965/gen6/gs: implement transform feedback support in gen6_gs_visitor

2014-09-19 Thread Jordan Justen
On Thu, Sep 18, 2014 at 11:50 PM, Samuel Iglesias Gonsálvez
 wrote:
> On Thu, 2014-09-18 at 16:05 -0700, Jordan Justen wrote:
>> On Thu, Aug 14, 2014 at 4:12 AM, Iago Toral Quiroga  
>> wrote:
>> > From: Samuel Iglesias Gonsalvez 
>> >
>> > +  this->xfb_output = src_reg(this,
>> > + glsl_type::uint_type,
>> > + linked_xfb_info->NumOutputs *
>> > + c->gp->program.VerticesOut);
>> > +  this->xfb_output_offset = src_reg(this, glsl_type::uint_type);
>> > +  emit(MOV(dst_reg(this->xfb_output_offset), src_reg(0u)));
>> > +  /* Create a virtual register to hold destination indices in SOL */
>> > +  this->destination_indices = src_reg(this, glsl_type::uvec4_type);
>> > +  /* Create a virtual register to hold temporal values in SOL */
>> > +  this->sol_temp = src_reg(this, glsl_type::uvec4_type);
>>
>> What is the duration of liveness for sol_temp?
>>
>> Would it be better to generate a new temp in each function to help out
>> register allocation?
>>
>
> Yes, it is better. I have made this change: create a new temp virtual
> register in every place it is needed (emit_thread_end(), xfb_write(),
> xfb_program()).

Cool. Add Reviewed-by: Jordan Justen  for
this patch, and:
 i965/gen6/gs: Avoid buffering transform feedback varyings twice.
 i965/gen6/gs: Fix binding table clash between TF surfaces and textures.
 i965/gen6/gs: Enable transform feedback support in geometry shaders
 i965/gen6/gs: upload ubo and pull constants surfaces.
 i965/gen6/gs: Use a specific implementation of geometry shaders for gen6.
 i965/gen6: enable GLSL 1.50 and OpenGL 3.2

That is the rest of the series, right?

Thank you both for all the great work on this series!

-Jordan


[Mesa-dev] [PATCH v2] i965/gen6/gs: implement transform feedback support in gen6_gs_visitor

2014-09-19 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsalvez 

This takes care of generating code required to handle transform feedback.
Notice that transform feedback isn't enabled yet, since that requires
additional setups in other parts of the code that will come in later patches.

Signed-off-by: Samuel Iglesias Gonsalvez 
Acked-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h   | 113 ++
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 309 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   |  13 ++
 3 files changed, 390 insertions(+), 45 deletions(-)

V2: Allocate sol_temp in each function that requires the temporary register
rather than allocating it once during the prolog.
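
As a rough illustration of why the V2 change can help register allocation, here is a toy model. It is not Mesa code: it assumes a register is live from its first reference to its last one, and the names live_intervals and max_pressure are made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Toy model, not Mesa code: refs[ip] is the virtual register referenced by
// the instruction at index ip.  A register is treated as live from its first
// reference to its last one.
static std::map<int, std::pair<int, int>>
live_intervals(const std::vector<int> &refs)
{
   std::map<int, std::pair<int, int>> intervals;
   for (int ip = 0; ip < (int) refs.size(); ip++) {
      auto it = intervals.find(refs[ip]);
      if (it == intervals.end())
         intervals[refs[ip]] = std::make_pair(ip, ip);
      else
         it->second.second = ip;
   }
   return intervals;
}

// Largest number of registers that are live at the same instruction.
static int
max_pressure(const std::vector<int> &refs)
{
   const std::map<int, std::pair<int, int>> intervals = live_intervals(refs);
   int worst = 0;
   for (int ip = 0; ip < (int) refs.size(); ip++) {
      int live = 0;
      for (const auto &kv : intervals) {
         if (kv.second.first <= ip && ip <= kv.second.second)
            live++;
      }
      worst = std::max(worst, live);
   }
   return worst;
}
```

In this model, a single temporary shared by the first and last function, as in the reference sequence {0, 0, 1, 1, 2, 2, 0, 0}, stays live across everything in between, while giving each function its own temporary, as in {0, 0, 1, 1, 2, 2, 3, 3}, keeps every interval short and lets the allocator reuse the same hardware register.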

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 9e04d81..3bdc480 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -556,48 +556,6 @@ struct brw_vs_prog_data {
bool uses_instanceid;
 };
 
-
-/* Note: brw_gs_prog_data_compare() must be updated when adding fields to
- * this struct!
- */
-struct brw_gs_prog_data
-{
-   struct brw_vec4_prog_data base;
-
-   /**
-* Size of an output vertex, measured in HWORDS (32 bytes).
-*/
-   unsigned output_vertex_size_hwords;
-
-   unsigned output_topology;
-
-   /**
-* Size of the control data (cut bits or StreamID bits), in hwords (32
-* bytes).  0 if there is no control data.
-*/
-   unsigned control_data_header_size_hwords;
-
-   /**
-* Format of the control data (either GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID
-* if the control data is StreamID bits, or
-* GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT if the control data is cut bits).
-* Ignored if control_data_header_size is 0.
-*/
-   unsigned control_data_format;
-
-   bool include_primitive_id;
-
-   int invocations;
-
-   /**
-* Dispatch mode, can be any of:
-* GEN7_GS_DISPATCH_MODE_DUAL_OBJECT
-* GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE
-* GEN7_GS_DISPATCH_MODE_SINGLE
-*/
-   int dispatch_mode;
-};
-
 /** Number of texture sampler units */
 #define BRW_MAX_TEX_UNIT 32
 
@@ -644,6 +602,77 @@ struct brw_gs_prog_data
 #define SURF_INDEX_GEN6_SOL_BINDING(t) (t)
 #define BRW_MAX_GEN6_GS_SURFACES   SURF_INDEX_GEN6_SOL_BINDING(BRW_MAX_SOL_BINDINGS)
 
+/* Note: brw_gs_prog_data_compare() must be updated when adding fields to
+ * this struct!
+ */
+struct brw_gs_prog_data
+{
+   struct brw_vec4_prog_data base;
+
+   /**
+* Size of an output vertex, measured in HWORDS (32 bytes).
+*/
+   unsigned output_vertex_size_hwords;
+
+   unsigned output_topology;
+
+   /**
+* Size of the control data (cut bits or StreamID bits), in hwords (32
+* bytes).  0 if there is no control data.
+*/
+   unsigned control_data_header_size_hwords;
+
+   /**
+* Format of the control data (either GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID
+* if the control data is StreamID bits, or
+* GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT if the control data is cut bits).
+* Ignored if control_data_header_size is 0.
+*/
+   unsigned control_data_format;
+
+   bool include_primitive_id;
+
+   int invocations;
+
+   /**
+* Dispatch mode, can be any of:
+* GEN7_GS_DISPATCH_MODE_DUAL_OBJECT
+* GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE
+* GEN7_GS_DISPATCH_MODE_SINGLE
+*/
+   int dispatch_mode;
+
+   /**
+* Gen6 transform feedback enabled flag.
+*/
+   bool gen6_xfb_enabled;
+
+   /**
+* Gen6: Provoking vertex convention for odd-numbered triangles
+* in tristrips.
+*/
+   GLuint pv_first:1;
+
+   /**
+* Gen6: Number of varyings that are output to transform feedback.
+*/
+   GLuint num_transform_feedback_bindings:7; /* 0-BRW_MAX_SOL_BINDINGS */
+
+   /**
+* Gen6: Map from the index of a transform feedback binding table entry to the
+* gl_varying_slot that should be streamed out through that binding table
+* entry.
+*/
+   unsigned char transform_feedback_bindings[BRW_MAX_SOL_BINDINGS];
+
+   /**
+* Gen6: Map from the index of a transform feedback binding table entry to the
+* swizzles that should be used when streaming out data through that
+* binding table entry.
+*/
+   unsigned char transform_feedback_swizzles[BRW_MAX_SOL_BINDINGS];
+};
+
 /**
  * Stride in bytes between shader_time entries.
  *
diff --git a/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp b/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp
index 7a832ca..c9e8e66 100644
--- a/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp
@@ -97,6 +97,43 @@ gen6_gs_visitor::emit_prolog()
this->prim_count = src_reg(this, glsl_type::uint_type);
emit(MOV(dst_reg(this->prim_count), 0u));
 
+   if (c->prog_data.gen6_xfb_enabled) {
+  const struct gl_transform_feedback_info *linked_xfb_info =
+ &this->shader_prog->LinkedTransformFeedback;
+
+  /* Gen6 geometry shaders are required to ask for