Re: [Mesa-dev] [PATCH 1/3] nvc0: fix blit triangle size to fully cover FB's > 8192x8192

2016-03-13 Thread Samuel Pitoiset

What about nv50? Is there the same issue?
Don't you need to update nvc0_clear_buffer() accordingly?

On 03/13/2016 04:07 AM, Ilia Mirkin wrote:

The idea is that a single triangle will cover the whole area being
drawn, allowing the blit shader to do its work. However the max fb size
is 16384x16384, which means that the triangle we draw needs to be twice
that in order to cover the whole area fully. Increase the size of the
triangle to 32768x32768.

This fixes a number of dEQP tests that were failing because a blit was
involved which would miss some of the resulting texture.

Signed-off-by: Ilia Mirkin 
Cc: "11.1 11.2" 
---
  src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index ccfc9e2..f2ad4bf 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -1215,8 +1215,8 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct 
pipe_blit_info *info)
 x0 = (float)info->src.box.x - x_range * (float)info->dst.box.x;
 y0 = (float)info->src.box.y - y_range * (float)info->dst.box.y;

-   x1 = x0 + 16384.0f * x_range;
-   y1 = y0 + 16384.0f * y_range;
+   x1 = x0 + 32768.0f * x_range;
+   y1 = y0 + 32768.0f * y_range;

 x0 *= (float)(1 << nv50_miptree(src)->ms_x);
 x1 *= (float)(1 << nv50_miptree(src)->ms_x);
@@ -1327,14 +1327,14 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct 
pipe_blit_info *info)
*(vbuf++) = fui(y0);
*(vbuf++) = fui(z);

-  *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_x);
+  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_x);
*(vbuf++) = fui(0.0f);
*(vbuf++) = fui(x1);
*(vbuf++) = fui(y0);
*(vbuf++) = fui(z);

*(vbuf++) = fui(0.0f);
-  *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_y);
+  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_y);
*(vbuf++) = fui(x0);
*(vbuf++) = fui(y1);
*(vbuf++) = fui(z);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] nv50/ir: avoid folding mul + add if the mul has a dnz

2016-03-13 Thread Samuel Pitoiset

This doesn't seem crazy.

Reviewed-by: Samuel Pitoiset 

On 03/13/2016 04:07 AM, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 6192c06..66e7b2e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1635,11 +1635,10 @@ AlgebraicOpt::tryADDToMADOrSAD(Instruction *add, 
operation toOp)
 if (src->getUniqueInsn() && src->getUniqueInsn()->bb != add->bb)
return false;

-   if (src->getInsn()->saturate)
+   if (src->getInsn()->saturate || src->getInsn()->postFactor ||
+   src->getInsn()->dnz)
return false;

-   if (src->getInsn()->postFactor)
-  return false;
 if (toOp == OP_SAD) {
ImmediateValue imm;
if (!src->getInsn()->src(2).getImmediate(imm))


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] nv50, nvc0: handle SQRT lowering inside the driver

2016-03-13 Thread Samuel Pitoiset



On 03/13/2016 04:07 AM, Ilia Mirkin wrote:

First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to
find out whether the input is less than 0). Secondly the current
approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced
instead of inf.



When I had a look at this precision thing, this seemed a bit weird to me 
but I did not investigate much of time on it.



Instead we switch to the less accurate rcp(rsq(x)) method - this behaves
nicely for all valid inputs. We still don't do this for DSQRT since the
RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson
steps right now. Eventually we should have a separate library function
for DSQRT that does it more precisely (and perhaps move this lowering to
the post-opt phase).

This fixes a number of dEQP precision tests that were expecting better
behavior for infinite inputs.


Yep, this fixes distance, length, refract and sqrt deqp precision tests 
(37 tests). But they are still some fails with atan2, idexp and tanh.

Are you going to fix them too?

Except that you forgot to remove unused variables:
codegen/nv50_ir_lowering_nvc0.cpp: In member function ‘bool 
nv50_ir::NVC0LoweringPass::handleSQRT(nv50_ir::Instruction*)’:
codegen/nv50_ir_lowering_nvc0.cpp:1785:20: warning: unused variable 
‘mov’ [-Wunused-variable]

   Instruction *mov, *rsq;
^
codegen/nv50_ir_lowering_nvc0.cpp:1785:26: warning: unused variable 
‘rsq’ [-Wunused-variable]

   Instruction *mov, *rsq;
  ^
  CC   nvc0/nvc0_surface.lo

This patch is:

Reviewed-by: Samuel Pitoiset 
Tested-by: Samuel Pitoiset 



Signed-off-by: Ilia Mirkin 
---
  .../drivers/nouveau/codegen/nv50_ir_build_util.cpp |  6 +++-
  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  2 ++
  .../nouveau/codegen/nv50_ir_lowering_nv50.cpp  |  7 ++---
  .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 32 +++---
  src/gallium/drivers/nouveau/nv50/nv50_screen.c |  2 +-
  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  2 +-
  6 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
index f58cf97..84ebfdb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
@@ -585,6 +585,7 @@ BuildUtil::split64BitOpPostRA(Function *fn, Instruction *i,
   return NULL;
srcNr = 2;
break;
+   case OP_SELP: srcNr = 3; break;
 default:
// TODO when needed
return NULL;
@@ -601,7 +602,10 @@ BuildUtil::split64BitOpPostRA(Function *fn, Instruction *i,

 for (int s = 0; s < srcNr; ++s) {
if (lo->getSrc(s)->reg.size < 8) {
- hi->setSrc(s, zero);
+ if (s == 2)
+hi->setSrc(s, lo->getSrc(s));
+ else
+hi->setSrc(s, zero);
} else {
   if (lo->getSrc(s)->refCount() > 1)
  lo->setSrc(s, cloneShallow(fn, lo->getSrc(s)));
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index b06d86a..d284446 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -616,6 +616,7 @@ static nv50_ir::operation translateOpcode(uint opcode)

 NV50_IR_OPCODE_CASE(RCP, RCP);
 NV50_IR_OPCODE_CASE(RSQ, RSQ);
+   NV50_IR_OPCODE_CASE(SQRT, SQRT);

 NV50_IR_OPCODE_CASE(MUL, MUL);
 NV50_IR_OPCODE_CASE(ADD, ADD);
@@ -2689,6 +2690,7 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
 case TGSI_OPCODE_FLR:
 case TGSI_OPCODE_TRUNC:
 case TGSI_OPCODE_RCP:
+   case TGSI_OPCODE_SQRT:
 case TGSI_OPCODE_IABS:
 case TGSI_OPCODE_INEG:
 case TGSI_OPCODE_NOT:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index 8752b0c..12c5f69 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
@@ -1203,10 +1203,9 @@ NV50LoweringPreSSA::handleDIV(Instruction *i)
  bool
  NV50LoweringPreSSA::handleSQRT(Instruction *i)
  {
-   Instruction *rsq = bld.mkOp1(OP_RSQ, TYPE_F32,
-bld.getSSA(), i->getSrc(0));
-   i->op = OP_MUL;
-   i->setSrc(1, rsq->getDef(0));
+   bld.setPosition(i, true);
+   i->op = OP_RSQ;
+   bld.mkOp1(OP_RCP, i->dType, i->getDef(0), i->getDef(0));

 return true;
  }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index d181f15..29b77c9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1778,22 +1778,22 @@ NVC0Lowerin

[Mesa-dev] [Bug 94522] weston eglSwapBuffers crash in kms_swrast_dri.so on GMA500

2016-03-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94522

Bug ID: 94522
   Summary: weston eglSwapBuffers crash in kms_swrast_dri.so on
GMA500
   Product: Mesa
   Version: 11.1
  Hardware: x86 (IA32)
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: EGL/Wayland
  Assignee: wayland-b...@lists.freedesktop.org
  Reporter: comicfan...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

mesa version 11.1.2
llvm 3.7.1
cpu: intel ATOM z520 :32bit only,supports sse sse2 ssse3 but not sse4 nor avx
weston crash when calling eglSwapBuffers 
backtrace shows crash thread receives SIGBUS

Thread 2 "llvmpipe-0" received signal SIGBUS, Bus error.

#0  0xb755e633 in ?? ()
#1  0xb6e2960b in lp_rast_shade_tile (task=0x80a9ab4, arg=...) at lp_rast.c:352
#2  0xb6e29981 in do_rasterize_bin (bin=, x=,
y=, task=0x80a9ab4) at lp_rast.c:609
#3  rasterize_bin (y=, x=, bin=,
task=0x80a9ab4) at lp_rast.c:628
#4  rasterize_scene (task=task@entry=0x80a9ab4, scene=) at
lp_rast.c:688
#5  0xb6e2a10a in thread_function (init_data=0x80a9ab4) at lp_rast.c:828
#6  0xb6e29f25 in impl_thrd_routine (p=0x80a2498) at
../../../../include/c11/threads_posix.h:87
#7  0xb7c6f291 in start_thread () from target:/usr/lib/libpthread.so.0
#8  0xb7d75d7e in clone () from target:/usr/lib/libc.so.6



eglSwapBuffers calling thread:

 0xb7fdbc11 in __kernel_vsyscall ()
#1  0xb7c74aab in pthread_cond_wait@@GLIBC_2.3.2 () from
target:/usr/lib/libpthread.so.0
#2  0xb7d8248d in pthread_cond_wait@@GLIBC_2.3.2 () from
target:/usr/lib/libc.so.6
#3  0xb6e2a55a in cnd_wait (mtx=0x80a9b64, cond=0x80a9b7c) at
../../../../include/c11/threads_posix.h:159
#4  pipe_semaphore_wait (sema=0x80a9b64) at
../../../../src/gallium/auxiliary/os/os_thread.h:259
#5  lp_rast_finish (rast=0x80a9aa8) at lp_rast.c:771
#6  0xb6e35aab in lp_setup_rasterize_scene (setup=0x811cac8) at lp_setup.c:180
#7  set_scene_state (setup=setup@entry=0x811cac8,
new_state=new_state@entry=SETUP_FLUSHED, reason=0xb6f44308 <__func__.14289>
"do_flush") at lp_setup.c:330
#8  0xb6e3666f in lp_setup_flush (setup=0x811cac8, fence=0x0, reason=0xb6f44308
<__func__.14289> "do_flush") at lp_setup.c:359
#9  0xb6e287f5 in llvmpipe_flush (pipe=0x80fc1a0, fence=0x0, reason=0xb6f44308
<__func__.14289> "do_flush") at lp_flush.c:55
#10 0xb6e27e33 in do_flush (pipe=0x80fc1a0, fence=0x0, flags=1) at
lp_context.c:113
#11 0xb69b43b1 in st_flush (st=0x81ee428, fence=0x0, flags=1) at
state_tracker/st_cb_flush.c:87
#12 0xb69fe5eb in st_context_flush (stctxi=0x81ee428, flags=2, fence=0x0) at
state_tracker/st_manager.c:504
#13 0xb6ad192d in dri_flush (cPriv=0x80cb130, dPriv=0x80fb9b0, flags=5,
reason=__DRI2_THROTTLE_SWAPBUFFER) at dri_drawable.c:538
#14 0xb753b82b in dri2_flush_drawable_for_swapbuffers (disp=0x80b7458,
draw=0x80fb7f0) at drivers/dri2/egl_dri2.c:1318
#15 0xb75417e0 in dri2_drm_swap_buffers (drv=0x80b56c0, disp=0x80b7458,
draw=0x80fb7f0) at drivers/dri2/platform_drm.c:441
#16 0xb7538db8 in dri2_swap_buffers (drv=0x80b56c0, dpy=0x80b7458,
surf=0x80fb7f0) at drivers/dri2/egl_dri2.c:1333
#17 0xb75331b4 in eglSwapBuffers (dpy=0x80b7458, surface=0x80fb7f0) at
main/eglapi.c:1010


another thread backtrace:

#0  0xb6e2b9e8 in do_block_16_32_1 (c=, y=,
x=, plane=0xb35b10f4, tri=0x812beb0, task=0x80a9bb0) at
lp_rast_tri_tmp.h:136
#1  lp_rast_triangle_32_1 (task=0x80a9bb0, arg=...) at lp_rast_tri_tmp.h:232
#2  0xb6e29981 in do_rasterize_bin (bin=, x=,
y=, task=0x80a9bb0) at lp_rast.c:609
#3  rasterize_bin (y=, x=, bin=,
task=0x80a9bb0) at lp_rast.c:628
#4  rasterize_scene (task=task@entry=0x80a9bb0, scene=) at
lp_rast.c:688
#5  0xb6e2a10a in thread_function (init_data=0x80a9bb0) at lp_rast.c:828
#6  0xb6e29f25 in impl_thrd_routine (p=0x8091c30) at
../../../../include/c11/threads_posix.h:87
#7  0xb7c6f291 in start_thread () from target:/usr/lib/libpthread.so.0
#8  0xb7d75d7e in clone () from target:/usr/lib/libc.so.6

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 94522] weston eglSwapBuffers crash in kms_swrast_dri.so on GMA500

2016-03-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94522

--- Comment #1 from comicfans44  ---
local variable and args in crash thread:

(gdb) info locals
color = {0xb2298100 ,
0xc01064b3 , 
  0xb7d6de89 
"[=\001\360\377\377\017\203\213\227\363\377\303f\220f\220f\220f\220f\220e\203=\f",
0xb7c72c0b <__pthread_mutex_unlock_usercnt+11> "\201\307\365\003\001", 
  0xb715b000 "\360]\225", 0xb2d6a008 "0\303\017\b\260y,\b", 0xb2d6a1fc "",
0xb2d6a868
"\300\367\021\b\360B-\b\360B-\b\300\367\021\b\230X-\b\230X-\b\300\367\021\b0r-\b0r-\b"}
depth = 
i = 
stride = {4096, 3000410620, 134911960, 0, 3066076233, 0, 2621440, 3084396197}
depth_stride = 
scene = 0xb2d6a008
inputs = 0x812c030
state = 0x811f7c0
variant = 0x8202c30
tile_x = 64
tile_y = 448
x = 0
y = 12
(gdb) info args
task = 0x80a9ab4
arg = {shade_tile = 0x812c030, triangle = {tri = 0x812c030, plane_mask = 10},
set_state = 0x812c030, clear_rb = 0x812c030, clear_zstencil = {value =
43085119536, 
mask = 18446744030745985025}, state = 0x812c030, fence = 0x812c030,
query_obj = 0x812c030}

-- 
You are receiving this mail because:
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/11] nv50/ir: Check for valid insn instead of def size

2016-03-13 Thread Pierre Moreau
On Tesla cards, the first register $r0 contains the thread id; later
generations use a specialised register for it. In order to prevent the register
from being given to anyone, and thus lose the thread id information, an lvalue
is created to represent $r0 and is passed as an argument to the `main`
function.

However, since the inputs and outputs of a function are stored as value
definitions, a definition is added onto the previously created lvalue without
it being associated to an instruction. Therefore, checking the number of
definitions of an lvalue do not ensure that it is associated to an instruction.

Signed-off-by: Pierre Moreau 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 62b0aa1..7c319df 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -853,7 +853,7 @@ isShortRegOp(Instruction *insn)
 static bool
 isShortRegVal(LValue *lval)
 {
-   if (lval->defs.size() == 0)
+   if (lval->getInsn() == NULL)
   return false;
for (Value::DefCIterator def = lval->defs.begin();
 def != lval->defs.end(); ++def)
@@ -1467,7 +1467,7 @@ GCRA::allocateRegisters(ArrayList& insns)
  nodes[i].init(regs, lval);
  RIG.insert(&nodes[i]);
 
- if (lval->inFile(FILE_GPR) && lval->defs.size() > 0 &&
+ if (lval->inFile(FILE_GPR) && lval->getInsn() != NULL &&
  prog->getTarget()->getChipset() < 0xc0) {
 Instruction *insn = lval->getInsn();
 if (insn->op == OP_MAD || insn->op == OP_SAD)
-- 
2.7.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/11] nv50/ir: Check for valid insn instead of def size

2016-03-13 Thread Samuel Pitoiset

01/11? Where are the other patches?

On 03/13/2016 02:16 PM, Pierre Moreau wrote:

On Tesla cards, the first register $r0 contains the thread id; later
generations use a specialised register for it. In order to prevent the register
from being given to anyone, and thus lose the thread id information, an lvalue
is created to represent $r0 and is passed as an argument to the `main`
function.

However, since the inputs and outputs of a function are stored as value
definitions, a definition is added onto the previously created lvalue without
it being associated to an instruction. Therefore, checking the number of
definitions of an lvalue do not ensure that it is associated to an instruction.

Signed-off-by: Pierre Moreau 
---
  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 62b0aa1..7c319df 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -853,7 +853,7 @@ isShortRegOp(Instruction *insn)
  static bool
  isShortRegVal(LValue *lval)
  {
-   if (lval->defs.size() == 0)
+   if (lval->getInsn() == NULL)
return false;
 for (Value::DefCIterator def = lval->defs.begin();
  def != lval->defs.end(); ++def)
@@ -1467,7 +1467,7 @@ GCRA::allocateRegisters(ArrayList& insns)
   nodes[i].init(regs, lval);
   RIG.insert(&nodes[i]);

- if (lval->inFile(FILE_GPR) && lval->defs.size() > 0 &&
+ if (lval->inFile(FILE_GPR) && lval->getInsn() != NULL &&
   prog->getTarget()->getChipset() < 0xc0) {
  Instruction *insn = lval->getInsn();
  if (insn->op == OP_MAD || insn->op == OP_SAD)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/11] nv50/ir: Check for valid insn instead of def size

2016-03-13 Thread Pierre Moreau
Hum… Something went wrong, sorry. This is the same as the previous patch and
not the updated version…

Pierre

On 02:16 PM - Mar 13 2016, Pierre Moreau wrote:
> On Tesla cards, the first register $r0 contains the thread id; later
> generations use a specialised register for it. In order to prevent the 
> register
> from being given to anyone, and thus lose the thread id information, an lvalue
> is created to represent $r0 and is passed as an argument to the `main`
> function.
> 
> However, since the inputs and outputs of a function are stored as value
> definitions, a definition is added onto the previously created lvalue without
> it being associated to an instruction. Therefore, checking the number of
> definitions of an lvalue do not ensure that it is associated to an 
> instruction.
> 
> Signed-off-by: Pierre Moreau 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> index 62b0aa1..7c319df 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
> @@ -853,7 +853,7 @@ isShortRegOp(Instruction *insn)
>  static bool
>  isShortRegVal(LValue *lval)
>  {
> -   if (lval->defs.size() == 0)
> +   if (lval->getInsn() == NULL)
>return false;
> for (Value::DefCIterator def = lval->defs.begin();
>  def != lval->defs.end(); ++def)
> @@ -1467,7 +1467,7 @@ GCRA::allocateRegisters(ArrayList& insns)
>   nodes[i].init(regs, lval);
>   RIG.insert(&nodes[i]);
>  
> - if (lval->inFile(FILE_GPR) && lval->defs.size() > 0 &&
> + if (lval->inFile(FILE_GPR) && lval->getInsn() != NULL &&
>   prog->getTarget()->getChipset() < 0xc0) {
>  Instruction *insn = lval->getInsn();
>  if (insn->op == OP_MAD || insn->op == OP_SAD)
> -- 
> 2.7.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] nv50/ir: Check for valid insn instead of def size

2016-03-13 Thread Pierre Moreau
Functions arguments get a definition from the function itself, a definition
which is therefore not linked to any instruction. If a value ends up having a
definition but no linked instruction, the register allocation pass can skip
that value since it is not being used.

This fixes a null pointer dereference during the register allocation pass, if
a function had unused arguments.

v2: Rewrite commit message based on Ilia Mirkin's comment

Signed-off-by: Pierre Moreau 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 62b0aa1..7c319df 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -853,7 +853,7 @@ isShortRegOp(Instruction *insn)
 static bool
 isShortRegVal(LValue *lval)
 {
-   if (lval->defs.size() == 0)
+   if (lval->getInsn() == NULL)
   return false;
for (Value::DefCIterator def = lval->defs.begin();
 def != lval->defs.end(); ++def)
@@ -1467,7 +1467,7 @@ GCRA::allocateRegisters(ArrayList& insns)
  nodes[i].init(regs, lval);
  RIG.insert(&nodes[i]);
 
- if (lval->inFile(FILE_GPR) && lval->defs.size() > 0 &&
+ if (lval->inFile(FILE_GPR) && lval->getInsn() != NULL &&
  prog->getTarget()->getChipset() < 0xc0) {
 Instruction *insn = lval->getInsn();
 if (insn->op == OP_MAD || insn->op == OP_SAD)
-- 
2.7.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] require libdrm 2.4.66 for AMD drivers

2016-03-13 Thread Serge Martin
since 737b6ed13e8f813987b5566004f0f45e9c55f1e8
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c not longer compile:
error: unknown type name ‘drmDevicePtr’
---
 configure.ac | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 49be147..55b1c95 100644
--- a/configure.ac
+++ b/configure.ac
@@ -69,8 +69,8 @@ AC_SUBST([OPENCL_VERSION])
 
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
-LIBDRM_RADEON_REQUIRED=2.4.56
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_RADEON_REQUIRED=2.4.66
+LIBDRM_AMDGPU_REQUIRED=2.4.66
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/10] st/glsl_to_tgsi: set memory access type on image intrinsics

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This is required to preserve the image variable's coherent/restrict/volatile
qualifiers in TGSI.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index fcfd8b7..18cea60 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3616,6 +3616,13 @@ glsl_to_tgsi_visitor::visit_image_intrinsic(ir_call *ir)
 
inst->image_format = st_mesa_format_to_pipe_format(st_context(ctx),
  _mesa_get_shader_image_format(imgvar->data.image_format));
+
+   if (imgvar->data.image_coherent)
+  inst->buffer_access |= TGSI_MEMORY_COHERENT;
+   if (imgvar->data.image_restrict)
+  inst->buffer_access |= TGSI_MEMORY_RESTRICT;
+   if (imgvar->data.image_volatile)
+  inst->buffer_access |= TGSI_MEMORY_VOLATILE;
 }
 
 void
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/10] tgsi: add Texture and Format to tgsi_instruction_memory

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Frontends should have this information readily available, and it simplifies
image LOAD/STORE/ATOM* handling especially with indirect image access.
---
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 8 
 src/gallium/include/pipe/p_shader_tokens.h | 4 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index f232f38..c8b91bb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -637,6 +637,14 @@ iter_instruction(
  TXT(", ");
  ENM(bit, tgsi_memory_names);
   }
+  if (inst->Memory.Texture) {
+ TXT( ", " );
+ ENM( inst->Memory.Texture, tgsi_texture_names );
+  }
+  if (inst->Memory.Format) {
+ TXT( ", " );
+ TXT( util_format_name(inst->Memory.Format) );
+  }
}
 
switch (inst->Instruction.Opcode) {
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 9d4a96a..34e491e 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -743,7 +743,9 @@ struct tgsi_dst_register
 struct tgsi_instruction_memory
 {
unsigned Qualifier : 3;  /* TGSI_MEMORY_ */
-   unsigned Padding   : 29;
+   unsigned Texture   : 8;  /* only for images: TGSI_TEXTURE_ */
+   unsigned Format: 10; /* only for images: PIPE_FORMAT_ */
+   unsigned Padding   : 11;
 };
 
 #define TGSI_MEMBAR_SHADER_BUFFER (1 << 0)
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/10] st/mesa: set image access flags in st_bind_images

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/mesa/state_tracker/st_atom_image.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_image.c 
b/src/mesa/state_tracker/st_atom_image.c
index bf7486b..e96d10a1 100644
--- a/src/mesa/state_tracker/st_atom_image.c
+++ b/src/mesa/state_tracker/st_atom_image.c
@@ -70,6 +70,21 @@ st_bind_images(struct st_context *st, struct gl_shader 
*shader,
 
   img->resource = stObj->pt;
   img->format = st_mesa_format_to_pipe_format(st, u->_ActualFormat);
+
+  switch (u->Access) {
+  case GL_READ_ONLY:
+ img->access = PIPE_IMAGE_ACCESS_READ;
+ break;
+  case GL_WRITE_ONLY:
+ img->access = PIPE_IMAGE_ACCESS_WRITE;
+ break;
+  case GL_READ_WRITE:
+ img->access = PIPE_IMAGE_ACCESS_READ_WRITE;
+ break;
+  default:
+ unreachable("bad gl_image_unit::Access");
+  }
+
   if (stObj->pt->target == PIPE_BUFFER) {
  unsigned base, size;
  unsigned f, n;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/10] gallium: more preparation for shader images support

2016-03-13 Thread Nicolai Hähnle
Hi,

this is probably my last batch of hardware-independent patches that need
to be pushed before the series that finally adds ARB_shader_image_load_store
for radeonsi.

Patches 1 to 7 add various fields to TGSI and pipe_image_view to communicate
state and flags to the driver that are so far not communicated at all or are
only communicated in a less convenient form.

Patches 8 & 9 are two simple utility functions that I am using in my radeonsi
series.

Patch 10 already implements early_fragment_tests for radeonsi, even though it
cannot be used yet.

Please review!

Thanks,
Nicolai
--
 src/gallium/auxiliary/tgsi/tgsi_build.c  | 15 
 src/gallium/auxiliary/tgsi/tgsi_build.h  |  5 +++
 src/gallium/auxiliary/tgsi/tgsi_dump.c   |  8 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c|  1 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.c   | 12 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.h   |  8 +++--
 src/gallium/auxiliary/util/u_inlines.h   | 10 ++
 src/gallium/docs/source/tgsi.rst |  6 
 .../drivers/radeonsi/si_state_shaders.c  |  3 ++
 src/gallium/include/pipe/p_defines.h |  8 +
 src/gallium/include/pipe/p_shader_tokens.h   |  7 ++--
 src/gallium/include/pipe/p_state.h   |  3 +-
 src/mesa/state_tracker/st_atom_image.c   | 15 
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 34 +-
 14 files changed, 118 insertions(+), 17 deletions(-)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/10] st/glsl_to_tgsi: provide Texture and Format information for image ops

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 12 +---
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  8 ++--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 24 +++-
 3 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index e1a7278..ab1d034 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -1242,7 +1242,9 @@ ureg_emit_texture_offset(struct ureg_program *ureg,
 void
 ureg_emit_memory(struct ureg_program *ureg,
  unsigned extended_token,
- unsigned qualifier)
+ unsigned qualifier,
+ unsigned texture,
+ unsigned format)
 {
union tgsi_any_token *out, *insn;
 
@@ -1253,6 +1255,8 @@ ureg_emit_memory(struct ureg_program *ureg,
 
out[0].value = 0;
out[0].insn_memory.Qualifier = qualifier;
+   out[0].insn_memory.Texture = texture;
+   out[0].insn_memory.Format = format;
 }
 
 void
@@ -1413,7 +1417,9 @@ ureg_memory_insn(struct ureg_program *ureg,
  unsigned nr_dst,
  const struct ureg_src *src,
  unsigned nr_src,
- unsigned qualifier)
+ unsigned qualifier,
+ unsigned texture,
+ unsigned format)
 {
struct ureg_emit_insn_result insn;
unsigned i;
@@ -1430,7 +1436,7 @@ ureg_memory_insn(struct ureg_program *ureg,
  nr_dst,
  nr_src);
 
-   ureg_emit_memory(ureg, insn.extended_token, qualifier);
+   ureg_emit_memory(ureg, insn.extended_token, qualifier, texture, format);
 
for (i = 0; i < nr_dst; i++)
   ureg_emit_dst(ureg, dst[i]);
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index 6a3b5dd..04a62a6 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -541,7 +541,9 @@ ureg_memory_insn(struct ureg_program *ureg,
  unsigned nr_dst,
  const struct ureg_src *src,
  unsigned nr_src,
- unsigned qualifier);
+ unsigned qualifier,
+ unsigned texture,
+ unsigned format);
 
 /***
  * Internal instruction helpers, don't call these directly:
@@ -582,7 +584,9 @@ ureg_emit_texture_offset(struct ureg_program *ureg,
 void
 ureg_emit_memory(struct ureg_program *ureg,
  unsigned insn_token,
- unsigned qualifier);
+ unsigned qualifier,
+ unsigned texture,
+ unsigned format);
 
 void 
 ureg_emit_dst( struct ureg_program *ureg,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 92cd775..fcfd8b7 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5572,7 +5572,7 @@ compile_tgsi_instruction(struct st_translate *t,
 
int num_dst;
int num_src;
-   unsigned tex_target;
+   unsigned tex_target = 0;
 
num_dst = num_inst_dst_regs(inst);
num_src = num_inst_src_regs(inst);
@@ -5647,32 +5647,38 @@ compile_tgsi_instruction(struct st_translate *t,
   for (i = num_src - 1; i >= 0; i--)
  src[i + 1] = src[i];
   num_src++;
-  if (inst->buffer.file == PROGRAM_MEMORY)
+  if (inst->buffer.file == PROGRAM_MEMORY) {
  src[0] = t->shared_memory;
-  else if (inst->buffer.file == PROGRAM_BUFFER)
+  } else if (inst->buffer.file == PROGRAM_BUFFER) {
  src[0] = t->buffers[inst->buffer.index];
-  else
+  } else {
  src[0] = t->images[inst->buffer.index];
+ tex_target = st_translate_texture_target(inst->tex_target, 
inst->tex_shadow);
+  }
   if (inst->buffer.reladdr)
  src[0] = ureg_src_indirect(src[0], ureg_src(t->address[2]));
   assert(src[0].File != TGSI_FILE_NULL);
   ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src,
-   inst->buffer_access);
+   inst->buffer_access,
+   tex_target, inst->image_format);
   break;
 
case TGSI_OPCODE_STORE:
-  if (inst->buffer.file == PROGRAM_MEMORY)
+  if (inst->buffer.file == PROGRAM_MEMORY) {
  dst[0] = ureg_dst(t->shared_memory);
-  else if (inst->buffer.file == PROGRAM_BUFFER)
+  } else if (inst->buffer.file == PROGRAM_BUFFER) {
  dst[0] = ureg_dst(t->buffers[inst->buffer.index]);
-  else
+  } else {
  dst[0] = ureg_dst(t->images[inst->buffer.index]);
+ tex_target = st_translate_texture_target(inst->tex_target, 
inst->tex_shadow);
+  }
   dst[0] = ureg_writemask(dst[0], inst->dst[0].writemask);
   if (inst->buffer.reladdr)
  dst[0] = ureg_dst_indir

[Mesa-dev] [PATCH 04/10] tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/tgsi/tgsi_strings.c  | 1 +
 src/gallium/docs/source/tgsi.rst   | 6 ++
 src/gallium/include/pipe/p_shader_tokens.h | 3 ++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index b15ae69..6bd1a2e 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -144,6 +144,7 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"TES_POINT_MODE",
"NUM_CLIPDIST_ENABLED",
"NUM_CULLDIST_ENABLED",
+   "FS_EARLY_DEPTH_STENCIL",
 };
 
 const char *tgsi_return_type_names[TGSI_RETURN_TYPE_COUNT] =
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 489cbb0..af2df22 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -3206,6 +3206,12 @@ NUM_CULLDIST_ENABLED
 
 How many cull distance scalar outputs are enabled.
 
+FS_EARLY_DEPTH_STENCIL
+""
+
+Whether depth test, stencil test, and occlusion query should run before
+the fragment shader (regardless of fragment shader side effects). Corresponds
+to GLSL early_fragment_tests.
 
 Texture Sampling and Texture Formats
 
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 34e491e..7a34841 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -277,7 +277,8 @@ union tgsi_immediate_data
 #define TGSI_PROPERTY_TES_POINT_MODE 14
 #define TGSI_PROPERTY_NUM_CLIPDIST_ENABLED   15
 #define TGSI_PROPERTY_NUM_CULLDIST_ENABLED   16
-#define TGSI_PROPERTY_COUNT  17
+#define TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL 17
+#define TGSI_PROPERTY_COUNT  18
 
 struct tgsi_property {
unsigned Type : 4;  /**< TGSI_TOKEN_TYPE_PROPERTY */
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/10] gallium/u_inlines: add util_copy_image_view

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/util/u_inlines.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_inlines.h 
b/src/gallium/auxiliary/util/u_inlines.h
index d081203..0e80cef 100644
--- a/src/gallium/auxiliary/util/u_inlines.h
+++ b/src/gallium/auxiliary/util/u_inlines.h
@@ -622,6 +622,16 @@ util_copy_constant_buffer(struct pipe_constant_buffer *dst,
}
 }
 
+static inline void
+util_copy_image_view(struct pipe_image_view *dst,
+ const struct pipe_image_view *src)
+{
+   pipe_resource_reference(&dst->resource, src->resource);
+   dst->format = src->format;
+   dst->access = src->access;
+   dst->u = src->u;
+}
+
 static inline unsigned
 util_max_layer(const struct pipe_resource *r, unsigned level)
 {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/10] tgsi: add tgsi_full_src_register_from_dst helper function

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/auxiliary/tgsi/tgsi_build.c | 15 +++
 src/gallium/auxiliary/tgsi/tgsi_build.h |  5 +
 2 files changed, 20 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index cfe9b92..e5355f5 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -1425,3 +1425,18 @@ tgsi_build_full_property(
 
return size;
 }
+
+struct tgsi_full_src_register
+tgsi_full_src_register_from_dst(const struct tgsi_full_dst_register *dst)
+{
+   struct tgsi_full_src_register src;
+   src.Register = tgsi_default_src_register();
+   src.Register.File = dst->Register.File;
+   src.Register.Indirect = dst->Register.Indirect;
+   src.Register.Dimension = dst->Register.Dimension;
+   src.Register.Index = dst->Register.Index;
+   src.Indirect = dst->Indirect;
+   src.Dimension = dst->Dimension;
+   src.DimIndirect = dst->DimIndirect;
+   return src;
+}
diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.h 
b/src/gallium/auxiliary/tgsi/tgsi_build.h
index c5127e1..34d181a 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.h
@@ -30,6 +30,8 @@
 
 
 struct tgsi_token;
+struct tgsi_full_dst_register;
+struct tgsi_full_src_register;
 
 
 #if defined __cplusplus
@@ -111,6 +113,9 @@ tgsi_build_full_instruction(
 struct tgsi_instruction_predicate
 tgsi_default_instruction_predicate(void);
 
+struct tgsi_full_src_register
+tgsi_full_src_register_from_dst(const struct tgsi_full_dst_register *dst);
+
 #if defined __cplusplus
 }
 #endif
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/10] st/glsl_to_tgsi: set FS_EARLY_DEPTH_STENCIL when required

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 18cea60..1841405 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6121,6 +6121,9 @@ st_translate_program(
}
 
if (procType == TGSI_PROCESSOR_FRAGMENT) {
+  if (program->shader->EarlyFragmentTests)
+ ureg_property(ureg, TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL, 1);
+
   if (proginfo->InputsRead & VARYING_BIT_POS) {
   /* Must do this after setting up t->inputs. */
   emit_wpos(st_context(ctx), t, proginfo, ureg,
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/10] gallium: add access field to pipe_image_view

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This allows drivers to make smarter decisions e.g. about whether the image
has to be decompressed.
---
 src/gallium/include/pipe/p_defines.h | 8 
 src/gallium/include/pipe/p_state.h   | 3 ++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index a490b3b..bdd76ab 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -541,6 +541,14 @@ enum pipe_reset_status
PIPE_HANDLE_USAGE_WRITE)
 
 /**
+ * pipe_image_view access flags.
+ */
+#define PIPE_IMAGE_ACCESS_READ   (1 << 0)
+#define PIPE_IMAGE_ACCESS_WRITE  (1 << 1)
+#define PIPE_IMAGE_ACCESS_READ_WRITE (PIPE_IMAGE_ACCESS_READ | \
+  PIPE_IMAGE_ACCESS_WRITE)
+
+/**
  * Implementation capabilities/limits which are queried through
  * pipe_screen::get_param()
  */
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index c568c48..2e720ce 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -393,13 +393,14 @@ struct pipe_sampler_view
 
 
 /**
- * A description of a writable buffer or texture that can be bound to a shader
+ * A description of a buffer or texture image that can be bound to a shader
  * stage.
  */
 struct pipe_image_view
 {
struct pipe_resource *resource; /**< resource into which this is a view  */
enum pipe_format format;  /**< typed PIPE_FORMAT_x */
+   unsigned access;  /**< PIPE_IMAGE_ACCESS_x */
 
union {
   struct {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/10] radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL

2016-03-13 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 321b87d..5fe1f79 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1154,6 +1154,9 @@ static void *si_create_shader_selector(struct 
pipe_context *ctx,
break;
}
 
+   if (sel->info.properties[TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL])
+   sel->db_shader_control |= S_02880C_DEPTH_BEFORE_SHADER(1);
+
/* Compile the main shader part for use with a prolog and/or epilog. */
if (sel->type != PIPE_SHADER_GEOMETRY &&
!sscreen->use_monolithic_shaders) {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: avoid crash when a sampler state is bound for a buffer texture

2016-03-13 Thread Nicolai Hähnle

On 12.03.2016 19:09, Ilia Mirkin wrote:

On Fri, Mar 11, 2016 at 11:17 AM, Nicolai Hähnle  wrote:

From: Nicolai Hähnle 

Sampler states don't really make sense with buffer textures, but the PBO
upload code sets one because apparently nouveau needs it. It would be
nice to work that out at some point, but in any case being defensive
here is a good idea.


Sampler states are set in regular GL as well if you have a regular
buffer texture too, no?


I suppose you're right - we just haven't had many users of buffer 
textures before PBO upload, so nobody noticed this. I'll change the 
comment before pushing.


Cheers,
Nicolai





Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
Cc: mesa-sta...@lists.freedesktop.org
---
  src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 9aa4877..f5ad113 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -324,6 +324,7 @@ static void si_bind_sampler_states(struct pipe_context 
*ctx, unsigned shader,
  */
 if (samplers->views.views[i] &&
 samplers->views.views[i]->texture &&
+   samplers->views.views[i]->texture->target != PIPE_BUFFER &&
 ((struct 
r600_texture*)samplers->views.views[i]->texture)->fmask.size)
 continue;

--
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/10] gallium: more preparation for shader images support

2016-03-13 Thread Ilia Mirkin
I can't say I'm a huge fan of adding the format/etc info
per-instruction, but I'm also not opposed to it if it makes your life
a lot easier. Patches 1-9 are

Reviewed-by: Ilia Mirkin 

Thanks for taking care of the access/early depth bits :)

  -ilia

On Sun, Mar 13, 2016 at 10:29 AM, Nicolai Hähnle  wrote:
> Hi,
>
> this is probably my last batch of hardware-independent patches that need
> to be pushed before the series that finally adds ARB_shader_image_load_store
> for radeonsi.
>
> Patches 1 to 7 add various fields to TGSI and pipe_image_view to communicate
> state and flags to the driver that are so far not communicated at all or are
> only communicated in a less convenient form.
>
> Patches 8 & 9 are two simple utility functions that I am using in my radeonsi
> series.
>
> Patch 10 already implements early_fragment_tests for radeonsi, even though it
> cannot be used yet.
>
> Please review!
>
> Thanks,
> Nicolai
> --
>  src/gallium/auxiliary/tgsi/tgsi_build.c  | 15 
>  src/gallium/auxiliary/tgsi/tgsi_build.h  |  5 +++
>  src/gallium/auxiliary/tgsi/tgsi_dump.c   |  8 +
>  src/gallium/auxiliary/tgsi/tgsi_strings.c|  1 +
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c   | 12 +--
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h   |  8 +++--
>  src/gallium/auxiliary/util/u_inlines.h   | 10 ++
>  src/gallium/docs/source/tgsi.rst |  6 
>  .../drivers/radeonsi/si_state_shaders.c  |  3 ++
>  src/gallium/include/pipe/p_defines.h |  8 +
>  src/gallium/include/pipe/p_shader_tokens.h   |  7 ++--
>  src/gallium/include/pipe/p_state.h   |  3 +-
>  src/mesa/state_tracker/st_atom_image.c   | 15 
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 34 +-
>  14 files changed, 118 insertions(+), 17 deletions(-)
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nvc0: fix blit triangle size to fully cover FB's > 8192x8192

2016-03-13 Thread Ilia Mirkin
On Sun, Mar 13, 2016 at 7:43 AM, Samuel Pitoiset
 wrote:
> What about nv50? Is there the same issue?

Yes, but nv50's max fb size is 8192x8192, and there the blit
triangle's size is 16384x16384, so it all works out. (Also nv50 has
some funny business wrt MS textures that ends up being a bit different
on nvc0)

> Don't you need to update nvc0_clear_buffer() accordingly?

I don't think so, but perhaps I'm missing something? clear_buffer
doesn't draw any triangles... the point of this blit is to draw a
triangle that rasterizes over the full fb. By making the w/h of the
tri 2x the fb size, we ensure that the full fb is covered... here's
some ASCII art that might help:

+-+-/
| |   /
| | /
+-/
|   /
| /
/

note that the width of a char is 2x the height, which is why it looks
a bit funny. The square in the middle is the fb, and the larger
triangle is defined by the position coordinates that we're providing.


>
>
> On 03/13/2016 04:07 AM, Ilia Mirkin wrote:
>>
>> The idea is that a single triangle will cover the whole area being
>> drawn, allowing the blit shader to do its work. However the max fb size
>> is 16384x16384, which means that the triangle we draw needs to be twice
>> that in order to cover the whole area fully. Increase the size of the
>> triangle to 32768x32768.
>>
>> This fixes a number of dEQP tests that were failing because a blit was
>> involved which would miss some of the resulting texture.
>>
>> Signed-off-by: Ilia Mirkin 
>> Cc: "11.1 11.2" 
>> ---
>>   src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 8 
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
>> b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
>> index ccfc9e2..f2ad4bf 100644
>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
>> @@ -1215,8 +1215,8 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct
>> pipe_blit_info *info)
>>  x0 = (float)info->src.box.x - x_range * (float)info->dst.box.x;
>>  y0 = (float)info->src.box.y - y_range * (float)info->dst.box.y;
>>
>> -   x1 = x0 + 16384.0f * x_range;
>> -   y1 = y0 + 16384.0f * y_range;
>> +   x1 = x0 + 32768.0f * x_range;
>> +   y1 = y0 + 32768.0f * y_range;
>>
>>  x0 *= (float)(1 << nv50_miptree(src)->ms_x);
>>  x1 *= (float)(1 << nv50_miptree(src)->ms_x);
>> @@ -1327,14 +1327,14 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const
>> struct pipe_blit_info *info)
>> *(vbuf++) = fui(y0);
>> *(vbuf++) = fui(z);
>>
>> -  *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_x);
>> +  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_x);
>> *(vbuf++) = fui(0.0f);
>> *(vbuf++) = fui(x1);
>> *(vbuf++) = fui(y0);
>> *(vbuf++) = fui(z);
>>
>> *(vbuf++) = fui(0.0f);
>> -  *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_y);
>> +  *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_y);
>> *(vbuf++) = fui(x0);
>> *(vbuf++) = fui(y1);
>> *(vbuf++) = fui(z);
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] nv50, nvc0: handle SQRT lowering inside the driver

2016-03-13 Thread Ilia Mirkin
On Sun, Mar 13, 2016 at 7:52 AM, Samuel Pitoiset
 wrote:
>
>
> On 03/13/2016 04:07 AM, Ilia Mirkin wrote:
>>
>> First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to
>> find out whether the input is less than 0). Secondly the current
>> approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced
>> instead of inf.
>>
>
> When I had a look at this precision thing, this seemed a bit weird to me but
> I did not investigate much of time on it.
>
>> Instead we switch to the less accurate rcp(rsq(x)) method - this behaves
>> nicely for all valid inputs. We still don't do this for DSQRT since the
>> RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson
>> steps right now. Eventually we should have a separate library function
>> for DSQRT that does it more precisely (and perhaps move this lowering to
>> the post-opt phase).
>>
>> This fixes a number of dEQP precision tests that were expecting better
>> behavior for infinite inputs.
>
>
> Yep, this fixes distance, length, refract and sqrt deqp precision tests (37
> tests). But they are still some fails with atan2, idexp and tanh.
> Are you going to fix them too?

Not today. (that's "ldexp" btw) I glanced at atan2 - looks like it's
returning results out of range... perhaps a min/max on the poly
generated by the lowering logic is in order. Didn't investigate too
closely.

>
> Except that you forgot to remove unused variables:
> codegen/nv50_ir_lowering_nvc0.cpp: In member function ‘bool
> nv50_ir::NVC0LoweringPass::handleSQRT(nv50_ir::Instruction*)’:
> codegen/nv50_ir_lowering_nvc0.cpp:1785:20: warning: unused variable ‘mov’
> [-Wunused-variable]
>Instruction *mov, *rsq;
> ^
> codegen/nv50_ir_lowering_nvc0.cpp:1785:26: warning: unused variable ‘rsq’
> [-Wunused-variable]
>Instruction *mov, *rsq;
>   ^
>   CC   nvc0/nvc0_surface.lo

Yeah, I noticed that about 5s after I sent the patch out... already
fixed locally.

>
> This patch is:
>
> Reviewed-by: Samuel Pitoiset 
> Tested-by: Samuel Pitoiset 

Thanks!

>
>
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>   .../drivers/nouveau/codegen/nv50_ir_build_util.cpp |  6 +++-
>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  2 ++
>>   .../nouveau/codegen/nv50_ir_lowering_nv50.cpp  |  7 ++---
>>   .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 32
>> +++---
>>   src/gallium/drivers/nouveau/nv50/nv50_screen.c |  2 +-
>>   src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  2 +-
>>   6 files changed, 28 insertions(+), 23 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
>> index f58cf97..84ebfdb 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
>> @@ -585,6 +585,7 @@ BuildUtil::split64BitOpPostRA(Function *fn,
>> Instruction *i,
>>return NULL;
>> srcNr = 2;
>> break;
>> +   case OP_SELP: srcNr = 3; break;
>>  default:
>> // TODO when needed
>> return NULL;
>> @@ -601,7 +602,10 @@ BuildUtil::split64BitOpPostRA(Function *fn,
>> Instruction *i,
>>
>>  for (int s = 0; s < srcNr; ++s) {
>> if (lo->getSrc(s)->reg.size < 8) {
>> - hi->setSrc(s, zero);
>> + if (s == 2)
>> +hi->setSrc(s, lo->getSrc(s));
>> + else
>> +hi->setSrc(s, zero);
>> } else {
>>if (lo->getSrc(s)->refCount() > 1)
>>   lo->setSrc(s, cloneShallow(fn, lo->getSrc(s)));
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> index b06d86a..d284446 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> @@ -616,6 +616,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
>>
>>  NV50_IR_OPCODE_CASE(RCP, RCP);
>>  NV50_IR_OPCODE_CASE(RSQ, RSQ);
>> +   NV50_IR_OPCODE_CASE(SQRT, SQRT);
>>
>>  NV50_IR_OPCODE_CASE(MUL, MUL);
>>  NV50_IR_OPCODE_CASE(ADD, ADD);
>> @@ -2689,6 +2690,7 @@ Converter::handleInstruction(const struct
>> tgsi_full_instruction *insn)
>>  case TGSI_OPCODE_FLR:
>>  case TGSI_OPCODE_TRUNC:
>>  case TGSI_OPCODE_RCP:
>> +   case TGSI_OPCODE_SQRT:
>>  case TGSI_OPCODE_IABS:
>>  case TGSI_OPCODE_INEG:
>>  case TGSI_OPCODE_NOT:
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
>> index 8752b0c..12c5f69 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
>> @@ -1203,10 +1203,9 @@ NV50LoweringPreSSA::handleDIV(Instruction *i)
>>   bool
>>   NV50LoweringPreSSA::handleSQRT(Instruction *i)
>

[Mesa-dev] [PATCH] configure.ac require libdrm 2.4.65 for amdgpu because of drmGetDevice

2016-03-13 Thread Marek Olšák
From: Marek Olšák 

---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 49be147..d768db6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -70,7 +70,7 @@ AC_SUBST([OPENCL_VERSION])
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
 LIBDRM_RADEON_REQUIRED=2.4.56
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_AMDGPU_REQUIRED=2.4.65
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 09/10] i965: Add and use is_scheduling_barrier() function.

2016-03-13 Thread Matt Turner
---
 .../drivers/dri/i965/brw_schedule_instructions.cpp  | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 98fa5e3..befa9ff 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -892,6 +892,14 @@ fs_instruction_scheduler::is_compressed(fs_inst *inst)
return inst->exec_size == 16;
 }
 
+static bool
+is_scheduling_barrier(const fs_inst *inst)
+{
+   return inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
+  inst->is_control_flow() ||
+  inst->has_side_effects();
+}
+
 void
 fs_instruction_scheduler::calculate_deps()
 {
@@ -917,9 +925,7 @@ fs_instruction_scheduler::calculate_deps()
foreach_in_list(schedule_node, n, &instructions) {
   fs_inst *inst = (fs_inst *)n->inst;
 
-  if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
-  inst->is_control_flow() ||
-  inst->has_side_effects())
+  if (is_scheduling_barrier(inst))
  add_barrier_deps(n);
 
   /* read-after-write deps. */
@@ -1131,6 +1137,13 @@ fs_instruction_scheduler::calculate_deps()
}
 }
 
+static bool
+is_scheduling_barrier(const vec4_instruction *inst)
+{
+   return inst->is_control_flow() ||
+  inst->has_side_effects();
+}
+
 void
 vec4_instruction_scheduler::calculate_deps()
 {
@@ -1152,7 +1165,7 @@ vec4_instruction_scheduler::calculate_deps()
foreach_in_list(schedule_node, n, &instructions) {
   vec4_instruction *inst = (vec4_instruction *)n->inst;
 
-  if (inst->is_control_flow() || inst->has_side_effects())
+  if (is_scheduling_barrier(inst))
  add_barrier_deps(n);
 
   /* read-after-write deps. */
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 10/10] i965: Don't add barrier deps for FB write messages.

2016-03-13 Thread Matt Turner
Ken did this earlier, and this is just me reimplementing his patch a
little differently.
---
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index befa9ff..8d92584 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -897,7 +897,8 @@ is_scheduling_barrier(const fs_inst *inst)
 {
return inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
   inst->is_control_flow() ||
-  inst->has_side_effects();
+  inst->eot ||
+  (inst->has_side_effects() && inst->opcode != FS_OPCODE_FB_WRITE);
 }
 
 void
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] softpipe: fix misleading TGSI_QUAD_SIZE usage

2016-03-13 Thread sroland
From: Roland Scheidegger 

All these img filter loops iterate through NUM_CHANNELS, not QUAD_SIZE.
In practice both are of course the same unchangeable value (4), but it
makes the code look a bit confusing. Moreover, some of the functions were
actually given an array of 4 values according to the declaration, yet the
code was addressing values 0/4/8/12 out of it, so fix this by just saying
it's a pointer to floats like the other functions.

While here, also add comment about not quite correct filtering.

There's no actual code difference.
---
 src/gallium/drivers/softpipe/sp_tex_sample.c | 53 +++-
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c 
b/src/gallium/drivers/softpipe/sp_tex_sample.c
index e6bdd50..5703ca2 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -1047,7 +1047,7 @@ img_filter_2d_linear_repeat_POT(const struct 
sp_sampler_view *sp_sview,
}
 
/* interpolate R, G, B, A */
-   for (c = 0; c < TGSI_QUAD_SIZE; c++) {
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++) {
   rgba[TGSI_NUM_CHANNELS*c] = lerp_2d(xw, yw, 
tx[0][c], tx[1][c], 
tx[2][c], tx[3][c]);
@@ -1063,7 +1063,7 @@ static inline void
 img_filter_2d_nearest_repeat_POT(const struct sp_sampler_view *sp_sview,
  const struct sp_sampler *sp_samp,
  const struct img_filter_args *args,
- float rgba[TGSI_QUAD_SIZE])
+ float *rgba)
 {
const unsigned xpot = pot_level_size(sp_sview->xpot, args->level);
const unsigned ypot = pot_level_size(sp_sview->ypot, args->level);
@@ -1085,7 +1085,7 @@ img_filter_2d_nearest_repeat_POT(const struct 
sp_sampler_view *sp_sview,
addr.bits.z = sp_sview->base.u.tex.first_layer;
 
out = get_texel_2d_no_border(sp_sview, addr, x0, y0);
-   for (c = 0; c < TGSI_QUAD_SIZE; c++)
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
if (DEBUG_TEX) {
@@ -1098,7 +1098,7 @@ static inline void
 img_filter_2d_nearest_clamp_POT(const struct sp_sampler_view *sp_sview,
 const struct sp_sampler *sp_samp,
 const struct img_filter_args *args,
-float rgba[TGSI_QUAD_SIZE])
+float *rgba)
 {
const unsigned xpot = pot_level_size(sp_sview->xpot, args->level);
const unsigned ypot = pot_level_size(sp_sview->ypot, args->level);
@@ -1128,7 +1128,7 @@ img_filter_2d_nearest_clamp_POT(const struct 
sp_sampler_view *sp_sview,
   y0 = ypot - 1;

out = get_texel_2d_no_border(sp_sview, addr, x0, y0);
-   for (c = 0; c < TGSI_QUAD_SIZE; c++)
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
if (DEBUG_TEX) {
@@ -1141,7 +1141,7 @@ static void
 img_filter_1d_nearest(const struct sp_sampler_view *sp_sview,
   const struct sp_sampler *sp_samp,
   const struct img_filter_args *args,
-  float rgba[TGSI_QUAD_SIZE])
+  float *rgba)
 {
const struct pipe_resource *texture = sp_sview->base.texture;
const int width = u_minify(texture->width0, args->level);
@@ -1159,7 +1159,7 @@ img_filter_1d_nearest(const struct sp_sampler_view 
*sp_sview,
 
out = get_texel_1d_array(sp_sview, sp_samp, addr, x,
 sp_sview->base.u.tex.first_layer);
-   for (c = 0; c < TGSI_QUAD_SIZE; c++)
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
if (DEBUG_TEX) {
@@ -1191,7 +1191,7 @@ img_filter_1d_array_nearest(const struct sp_sampler_view 
*sp_sview,
sp_samp->nearest_texcoord_s(args->s, width, args->offset[0], &x);
 
out = get_texel_1d_array(sp_sview, sp_samp, addr, x, layer);
-   for (c = 0; c < TGSI_QUAD_SIZE; c++)
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
if (DEBUG_TEX) {
@@ -1225,7 +1225,7 @@ img_filter_2d_nearest(const struct sp_sampler_view 
*sp_sview,
sp_samp->nearest_texcoord_t(args->t, height, args->offset[1], &y);
 
out = get_texel_2d(sp_sview, sp_samp, addr, x, y);
-   for (c = 0; c < TGSI_QUAD_SIZE; c++)
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
if (DEBUG_TEX) {
@@ -1260,7 +1260,7 @@ img_filter_2d_array_nearest(const struct sp_sampler_view 
*sp_sview,
sp_samp->nearest_texcoord_t(args->t, height, args->offset[1], &y);
 
out = get_texel_2d_array(sp_sview, sp_samp, addr, x, y, layer);
-   for (c = 0; c < TGSI_QUAD_SIZE; c++)
+   for (c = 0; c < TGSI_NUM_CHANNELS; c++)
   rgba[TGSI_NUM_CHANNELS*c] = out[c];
 
if (DEBUG_TEX) {
@@ -1304,7 +1304,7 @@ img_filter_cube_nearest(const struct sp_sampl

Re: [Mesa-dev] [PATCH v2 10/10] i965: Don't add barrier deps for FB write messages.

2016-03-13 Thread Francisco Jerez
Matt Turner  writes:

> Ken did this earlier, and this is just me reimplementing his patch a
> little differently.

Heh, it seems a little mean to Ken to revert his patch only to implement
almost the same thing a few commits later.  I suggest you squash the
revert and this patch into PATCHv2 9 before pushing.  Either way the end
result looks reasonable so the three patches (4, 9, 10) are:

Reviewed-by: Francisco Jerez 

Thanks.

> ---
>  src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
> b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> index befa9ff..8d92584 100644
> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> @@ -897,7 +897,8 @@ is_scheduling_barrier(const fs_inst *inst)
>  {
> return inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
>inst->is_control_flow() ||
> -  inst->has_side_effects();
> +  inst->eot ||
> +  (inst->has_side_effects() && inst->opcode != FS_OPCODE_FB_WRITE);
>  }
>  
>  void
> -- 
> 2.4.10


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] draw: fix line stippling

2016-03-13 Thread sroland
From: Roland Scheidegger 

The logic was comparing actual ints, not true/false values.
This meant that it was emitting always multiple line segments instead of just
one even if the stipple test had the same result, which looks inefficient, and
the segments also overlapped thus breaking line aa as well.
(In practice, with the no-op default line stipple pattern, for a 10-pixel
long line from 0-9 it was emitting 10 segments, with the individual segments
ranging from 0-1, 0-2, 0-3 and so on.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94193

CC: 
---
 src/gallium/auxiliary/draw/draw_pipe_stipple.c | 30 +-
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_stipple.c 
b/src/gallium/auxiliary/draw/draw_pipe_stipple.c
index dcf05aa..04854ad 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_stipple.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_stipple.c
@@ -108,11 +108,11 @@ emit_segment(struct draw_stage *stage, struct prim_header 
*header,
 }
 
 
-static inline unsigned
+static inline boolean
 stipple_test(int counter, ushort pattern, int factor)
 {
int b = (counter / factor) & 0xf;
-   return (1 << b) & pattern;
+   return !!((1 << b) & pattern);
 }
 
 
@@ -126,7 +126,7 @@ stipple_line(struct draw_stage *stage, struct prim_header 
*header)
const float *pos0 = v0->data[pos];
const float *pos1 = v1->data[pos];
float start = 0;
-   int state = 0;
+   boolean state = 0;
 
float x0 = pos0[0];
float x1 = pos1[0];
@@ -143,29 +143,29 @@ stipple_line(struct draw_stage *stage, struct prim_header 
*header)
   stipple->counter = 0;
 
 
-   /* XXX ToDo: intead of iterating pixel-by-pixel, use a look-up table.
+   /* XXX ToDo: instead of iterating pixel-by-pixel, use a look-up table.
 */
for (i = 0; i < length; i++) {
-  int result = stipple_test( (int) stipple->counter+i,
- (ushort) stipple->pattern, stipple->factor );
+  boolean result = stipple_test((int)stipple->counter + i,
+(ushort)stipple->pattern, stipple->factor);
   if (result != state) {
  /* changing from "off" to "on" or vice versa */
-if (state) {
-   if (start != i) {
+ if (state) {
+if (start != i) {
/* finishing an "on" segment */
-  emit_segment( stage, header, start / length, i / length );
+   emit_segment(stage, header, start / length, i / length);
 }
-}
-else {
+ }
+ else {
 /* starting an "on" segment */
-   start = (float) i;
-}
-state = result;   
+start = (float)i;
+ }
+ state = result;
   }
}
 
if (state && start < length)
-  emit_segment( stage, header, start / length, 1.0 );
+  emit_segment(stage, header, start / length, 1.0);
 
stipple->counter += length;
 }
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] squash: Fix up VPM read optimization.

2016-03-13 Thread Varad Gautam
- Do not reorder instructions reading packed sources (fixes piglit
regressions with draw-vertices and draw-vertices-user).
---
Eric, Rhys, with this bit we should be good to go.

 src/gallium/drivers/vc4/vc4_opt_vpm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c 
b/src/gallium/drivers/vc4/vc4_opt_vpm.c
index a4ee6af..d15b0c1 100644
--- a/src/gallium/drivers/vc4/vc4_opt_vpm.c
+++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
@@ -77,7 +77,8 @@ qir_opt_vpm(struct vc4_compile *c)
 continue;
 
 for (int j = 0; j < qir_get_op_nsrc(inst->op); j++) {
-if (inst->src[j].file != QFILE_TEMP)
+if (inst->src[j].file != QFILE_TEMP ||
+inst->src[j].pack)
 continue;
 
 uint32_t temp = inst->src[j].index;
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nv50, nvc0: Set only NEW_CP_GLOBALS upon binding

2016-03-13 Thread Pierre Moreau
Signed-off-by: Pierre Moreau 
---
 src/gallium/drivers/nouveau/nv50/nv50_state.c | 2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index c73e3ba..b9efb3f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -1246,7 +1246,7 @@ nv50_set_global_bindings(struct pipe_context *pipe,
 
nouveau_bufctx_reset(nv50->bufctx_cp, NV50_BIND_CP_GLOBAL);
 
-   nv50->dirty_cp = NV50_NEW_CP_GLOBALS;
+   nv50->dirty_cp |= NV50_NEW_CP_GLOBALS;
 }
 
 void
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index c279093..36e3546 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1343,7 +1343,7 @@ nvc0_set_global_bindings(struct pipe_context *pipe,
 
nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_GLOBAL);
 
-   nvc0->dirty_cp = NVC0_NEW_CP_GLOBALS;
+   nvc0->dirty_cp |= NVC0_NEW_CP_GLOBALS;
 }
 
 void
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nv50: Mark compute states as dirty on context switch

2016-03-13 Thread Pierre Moreau
Signed-off-by: Pierre Moreau 
---
 src/gallium/drivers/nouveau/nv50/nv50_state_validate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
index 5536978..d06ba4a 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
@@ -437,6 +437,7 @@ nv50_switch_pipe_context(struct nv50_context *ctx_to)
   ctx_to->state = ctx_to->screen->save_state;
 
ctx_to->dirty = ~0;
+   ctx_to->dirty_cp = ~0;
ctx_to->viewports_dirty = ~0;
ctx_to->scissors_dirty = ~0;
 
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50, nvc0: Set only NEW_CP_GLOBALS upon binding

2016-03-13 Thread Samuel Pitoiset
This is there since ... 2013 ... but this was never really used because 
it's compute-related, that might explain why you are the first one to 
hit the issue. :-)


Luckily, this doesn't affect compute shaders on Fermi because globals 
buffers are validated *after* all other things.


Good catch!

Reviewed-by: Samuel Pitoiset 

On 03/13/2016 10:11 PM, Pierre Moreau wrote:

Signed-off-by: Pierre Moreau 
---
  src/gallium/drivers/nouveau/nv50/nv50_state.c | 2 +-
  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index c73e3ba..b9efb3f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -1246,7 +1246,7 @@ nv50_set_global_bindings(struct pipe_context *pipe,

 nouveau_bufctx_reset(nv50->bufctx_cp, NV50_BIND_CP_GLOBAL);

-   nv50->dirty_cp = NV50_NEW_CP_GLOBALS;
+   nv50->dirty_cp |= NV50_NEW_CP_GLOBALS;
  }

  void
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index c279093..36e3546 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1343,7 +1343,7 @@ nvc0_set_global_bindings(struct pipe_context *pipe,

 nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_GLOBAL);

-   nvc0->dirty_cp = NVC0_NEW_CP_GLOBALS;
+   nvc0->dirty_cp |= NVC0_NEW_CP_GLOBALS;
  }

  void


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/10] i965: Remove NOP insertion kludge in scheduler.

2016-03-13 Thread Francisco Jerez
Matt Turner  writes:

> Instead of removing every instruction in add_insts_from_block(), just
> move the instruction to its scheduled location. This is a step towards
> doing both bottom-up and top-down scheduling without conflicts.
>
> Note that this patch changes cycle counts for programs because it begins
> including control flow instructions in the estimates.
> ---
>  .../drivers/dri/i965/brw_schedule_instructions.cpp | 25 
> +-
>  1 file changed, 5 insertions(+), 20 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
> b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> index 46b45a5..98fa5e3 100644
> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> @@ -783,26 +783,13 @@ schedule_node::schedule_node(backend_instruction *inst,
>  void
>  instruction_scheduler::add_insts_from_block(bblock_t *block)
>  {
> -   /* Removing the last instruction from a basic block removes the block as
> -* well, so put a NOP at the end to keep it alive.
> -*/
> -   if (!block->end()->is_control_flow()) {
> -  backend_instruction *nop = new(mem_ctx) backend_instruction();
> -  nop->opcode = BRW_OPCODE_NOP;
> -  block->end()->insert_after(block, nop);
> -   }
> -

Thanks for removing this hack, I had been meaning to do the same for a
while.  What was really bothering me about it is the fact that it
inserts a naked backend_instruction into the basic block's instruction
list, which everything else assumes contains elements of a concrete
instruction class, fs_inst or vec4_instruction depending on the
back-end -- So thanks a lot,

Reviewed-by: Francisco Jerez 

> -   foreach_inst_in_block_safe(backend_instruction, inst, block) {
> -  if (inst->opcode == BRW_OPCODE_NOP || inst->is_control_flow())
> - continue;
> -
> +   foreach_inst_in_block(backend_instruction, inst, block) {
>schedule_node *n = new(mem_ctx) schedule_node(inst, this);
>  
> -  this->instructions_to_schedule++;
> -
> -  inst->remove(block);
>instructions.push_tail(n);
> }
> +
> +   this->instructions_to_schedule = block->end_ip - block->start_ip + 1;
>  }
>  
>  /** Recursive computation of the delay member of a node. */
> @@ -1463,7 +1450,6 @@ void
>  instruction_scheduler::schedule_instructions(bblock_t *block)
>  {
> const struct brw_device_info *devinfo = bs->devinfo;
> -   backend_instruction *inst = block->end();
> time = 0;
> if (!post_reg_alloc)
>reg_pressure = reg_pressure_in[block->num];
> @@ -1482,7 +1468,8 @@ instruction_scheduler::schedule_instructions(bblock_t 
> *block)
>/* Schedule this instruction. */
>assert(chosen);
>chosen->remove();
> -  inst->insert_before(block, chosen->inst);
> +  chosen->inst->exec_node::remove();
> +  block->instructions.push_tail(chosen->inst);
>instructions_to_schedule--;
>  
>if (!post_reg_alloc) {
> @@ -1551,8 +1538,6 @@ instruction_scheduler::schedule_instructions(bblock_t 
> *block)
>}
> }
>  
> -   if (block->end()->opcode == BRW_OPCODE_NOP)
> -  block->end()->remove(block);
> assert(instructions_to_schedule == 0);
>  
> block->cycle_count = time;
> -- 
> 2.4.10
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50: Mark compute states as dirty on context switch

2016-03-13 Thread Samuel Pitoiset
Well, without a new validation path for compute on Tesla this won't 
change anything because nv50_state_validate() is 3d-related and it 
should be never called by compute.


On 03/13/2016 10:11 PM, Pierre Moreau wrote:

Signed-off-by: Pierre Moreau 
---
  src/gallium/drivers/nouveau/nv50/nv50_state_validate.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
index 5536978..d06ba4a 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
@@ -437,6 +437,7 @@ nv50_switch_pipe_context(struct nv50_context *ctx_to)
ctx_to->state = ctx_to->screen->save_state;

 ctx_to->dirty = ~0;
+   ctx_to->dirty_cp = ~0;
 ctx_to->viewports_dirty = ~0;
 ctx_to->scissors_dirty = ~0;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50, nvc0: Set only NEW_CP_GLOBALS upon binding

2016-03-13 Thread Samuel Pitoiset



On 03/13/2016 10:18 PM, Samuel Pitoiset wrote:

This is there since ... 2013 ... but this was never really used because
it's compute-related, that might explain why you are the first one to
hit the issue. :-)

Luckily, this doesn't affect compute shaders on Fermi because globals
buffers are validated *after* all other things.


Err, because we don't need to bind global buffers.



Good catch!

Reviewed-by: Samuel Pitoiset 

On 03/13/2016 10:11 PM, Pierre Moreau wrote:

Signed-off-by: Pierre Moreau 
---
  src/gallium/drivers/nouveau/nv50/nv50_state.c | 2 +-
  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index c73e3ba..b9efb3f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -1246,7 +1246,7 @@ nv50_set_global_bindings(struct pipe_context *pipe,

 nouveau_bufctx_reset(nv50->bufctx_cp, NV50_BIND_CP_GLOBAL);

-   nv50->dirty_cp = NV50_NEW_CP_GLOBALS;
+   nv50->dirty_cp |= NV50_NEW_CP_GLOBALS;
  }

  void
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index c279093..36e3546 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1343,7 +1343,7 @@ nvc0_set_global_bindings(struct pipe_context *pipe,

 nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_GLOBAL);

-   nvc0->dirty_cp = NVC0_NEW_CP_GLOBALS;
+   nvc0->dirty_cp |= NVC0_NEW_CP_GLOBALS;
  }

  void


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Small changes in GL3.txt

2016-03-13 Thread Romain Failliot
Hi!

With the recent commits, I'm having some trouble parsing the GL3.txt
text file for mesamatrix.net.

With these set of 4 patches, I'm proposing to rationalize a little bit
this file. The content is exactly the same.
(Note: I'm not really used to propose patches, so if this format is
wrong, I'll be glad to do it the proper way)

# gl3_txt_1_howto.patch
Add some explanations on how to read (and edit) GL3.txt.
It tends to give some sort of easy to follow standard.

# gl3_txt_2_realign_status.patch
Realign the Status column to the same column number.

# gl3_txt_3_renormalize_some_extensions.patch
The comment for GL_ARB_texture_buffer_object was between "DONE" and
the brackets, it's an exception compared to the way it's done for
other extensions.
The status of GL_KHR_robustness was "90% done", it is now "in
progress" and "90% done" is in the extension comment.

# gl3_txt_4_renormalize_older_extensions.patch
On older extensions, there is an explanation first and the extension
name in brackets. I inverted that so we have the extension first and
then the explanation in brackets (it will help me later for the "all
drivers that support ").

I might have more patches later, but I'd like to see if these are OK
for you first.

Cheers!
Romain
diff --git a/docs/GL3.txt b/docs/GL3.txt
index ee7faca..1ed5c1a 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -1,13 +1,28 @@
+# Status of OpenGL extensions in Mesa
 
-Status of OpenGL 3.x features in Mesa
+Here's how to read this file:
 
+all DONE: , ...
+All the extensions are done for the given list of drivers.
 
-Note: when an item is marked as "DONE" it means all the core Mesa
-infrastructure is complete but it may be the case that few (if any) drivers
-implement the features.
+DONE
+The extension is done for Mesa and no implementation is necessary on the
+driver-side.
 
+DONE ()
+The extension is done for Mesa and all the drivers in the "all DONE" list.
 
-OpenGL Core and Compatibility context support
+DONE (, ...)
+The extension is done for Mesa, all the drivers in the "all DONE" list, and
+all the drivers in the brackets.
+
+in progress
+The extension is started but not finished yet.
+
+not started
+The extension isn't started yet.
+
+# OpenGL Core and Compatibility context support
 
 OpenGL 3.1 and later versions are only supported with the Core profile.
 There are no plans to support GL_ARB_compatibility. The last supported OpenGL
diff --git a/docs/GL3.txt b/docs/GL3.txt
index 1ed5c1a..17522d5 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -30,8 +30,8 @@ version with all deprecated features is 3.0. Some of the later GL features
 are exposed in the 3.0 context as extensions.
 
 
-Feature   Status
-- 
+Feature Status
+--- 
 
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 
@@ -109,170 +109,170 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
 
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
 
-  GL_ARB_draw_buffers_blendDONE (i965, nv50, llvmpipe, softpipe)
-  GL_ARB_draw_indirect DONE (i965, llvmpipe, softpipe)
-  GL_ARB_gpu_shader5   DONE (i965)
-  - 'precise' qualifierDONE
-  - Dynamically uniform sampler array indices  DONE (softpipe)
-  - Dynamically uniform UBO array indices  DONE ()
-  - Implicit signed -> unsigned conversionsDONE
-  - Fused multiply-add DONE ()
-  - Packing/bitfield/conversion functions  DONE (softpipe)
-  - Enhanced textureGather DONE (softpipe)
-  - Geometry shader instancing DONE (llvmpipe, softpipe)
-  - Geometry shader multiple streams   DONE ()
-  - Enhanced per-sample shadingDONE ()
-  - Interpolation functionsDONE ()
-  - New overload resolution rules  DONE
-  GL_ARB_gpu_shader_fp64   DONE (llvmpipe, softpipe)
-  GL_ARB_sample_shadingDONE (i965, nv50)
-  GL_ARB_shader_subroutine DONE (i965, nv50, llvmpipe, softpipe)
-  GL_ARB_tessellation_shader   DONE (i965)
-  GL_ARB_texture_buffer_object_rgb32   DONE (i965, llvmpipe, softpipe)
-  GL_ARB_texture_cube_map_arrayDONE (i965, nv50, llvmpipe, softpipe)
-  GL_ARB_texture_gatherDONE (i965, nv50, llvmpipe, softpipe)
-  GL_ARB_texture_query_lod DONE (i965, nv50, softpipe)
-  GL_ARB_transform_feedback2  

Re: [Mesa-dev] [PATCH] configure.ac require libdrm 2.4.65 for amdgpu because of drmGetDevice

2016-03-13 Thread Emil Velikov
Hi Marek,

On 13 March 2016 at 16:46, Marek Olšák  wrote:
> From: Marek Olšák 
>
The struct is available in 2.4.65, although the functions
(drm{Get,Free}Device) are available since 2.4.66 afaics.

Mildly related: Not sure it there are any non PCI ATI/AMD devices,
then again one should check drmDevice::bus_type prior to accessing
drmDevice::bus_info.pci.

> src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c not longer compile:
> error: unknown type name ‘drmDevicePtr’
> ---
>  configure.ac | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 49be147..d768db6 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -70,7 +70,7 @@ AC_SUBST([OPENCL_VERSION])
>  dnl Versions for external dependencies
>  LIBDRM_REQUIRED=2.4.60
>  LIBDRM_RADEON_REQUIRED=2.4.56
> -LIBDRM_AMDGPU_REQUIRED=2.4.63
> +LIBDRM_AMDGPU_REQUIRED=2.4.65
LIBDRM_REQUIRED should be bumped, as it's the one that provides the symbol(s).

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac require libdrm 2.4.65 for amdgpu because of drmGetDevice

2016-03-13 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-14 03:46, Marek Olšák wrote:

From: Marek Olšák 

---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 49be147..d768db6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -70,7 +70,7 @@ AC_SUBST([OPENCL_VERSION])
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
 LIBDRM_RADEON_REQUIRED=2.4.56
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_AMDGPU_REQUIRED=2.4.65
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: Fix structure splitting with per-vertex interface arrays.

2016-03-13 Thread Kenneth Graunke
We want to use interface_type, not vtn_var->type.  They're normally
equivalent, but for geometry/tessellation per-vertex interface arrays,
we need to unwrap a level.

Otherwise, we tried to iterate a structure members but instead used
an array length.  If the array length was longer than the number of
fields in the structure, we'd crash.
---
 src/compiler/nir/spirv/vtn_variables.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/spirv/vtn_variables.c 
b/src/compiler/nir/spirv/vtn_variables.c
index 31bf416..1628042 100644
--- a/src/compiler/nir/spirv/vtn_variables.c
+++ b/src/compiler/nir/spirv/vtn_variables.c
@@ -923,7 +923,7 @@ var_decoration_cb(struct vtn_builder *b, struct vtn_value 
*val, int member,
  vtn_var->var->data.explicit_location = true;
   } else {
  assert(vtn_var->members);
- unsigned length = glsl_get_length(vtn_var->type->type);
+ unsigned length = glsl_get_length(val->type->type);
  for (unsigned i = 0; i < length; i++) {
 vtn_var->members[i]->data.location = location;
 vtn_var->members[i]->data.explicit_location = true;
-- 
2.7.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa include guard style. (Was: [PATCH] i965/cfg: Remove redundant #pragma once.)

2016-03-13 Thread Francisco Jerez
Ian Romanick  writes:

> On 03/11/2016 03:46 PM, Eric Anholt wrote:
>> Ian Romanick  writes:
>> 
>>> On 03/10/2016 05:53 PM, Francisco Jerez wrote:
 Iago Toral  writes:

> On Wed, 2016-03-09 at 19:04 -0800, Francisco Jerez wrote:
>> Matt Turner  writes:
>>
>>> On Wed, Mar 9, 2016 at 1:37 PM, Francisco Jerez  
>>> wrote:
 Iago Toral  writes:

> On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
>> brw_cfg.h already has include guards, remove the "#pragma once" which
>> is redundant and non-standard.
>
> FWIW, I think using both #pragma once and include guards is a way to
> keep portability while still getting the performance advantage of
> #pragma once where it is supported.
>
 It's highly unlikely to make any significant difference on any
 reasonably modern compiler.  I cannot measure any change in compilation
 time locally from my cleanup.

> Also it seems that we do the same thing in many other files...
>
 Really?  I'm not aware of any other file where we use both.
>>>
>>> There are quite a few in glsl/
>>
>> Heh, apparently you're right.  Anyway it seems rather pointless to use
>> '#pragma once' in a bunch of scattered header files with the expectation
>> to gain some speed, the improvement from a single header file is so
>> minuscule (if it will make any difference at all on a modern compiler
>> and compilation workload, which I doubt) that we would have to use it
>> universally in order to have the chance to measure any improvement.
>>
>> Can we please just decide for one of the include guard styles and use it
>> consistently?  Given that the majority of header files in the Mesa
>> codebase use old-school define guards, that it's the only standard
>> option, that it has well-defined semantics in presence of file copies
>> and hardlinks, and that the performance argument against it is rather
>> dubious (although I definitely find '#pragma once' prettier and more
>> concise), I'd vote for using preprocessor define guards universally.
>>
>> What do other people think?
>
> I think we have to use define guards necessarily since #pragma once is
> not standard even it it has wide support. So the question is whether we
> want to use only define guards or define guards plus #pragma once. I am
> fine with doing only define guards as you propose.

 *Shrug* I have the impression that the only real advantage of '#pragma
 once' is that you no longer need to do the ifndef/define dance, so I
 don't think I can see much benefit in doing both.
>>>
>>> Several compilers will cache the file name where '#pragma once' occurs
>>> and never read that file again.  A #include of a file previously seen
>>> with '#pragma once' becomes a no-op.  Since the file is never read, the
>>> compiler avoids all the I/O and the parsing.  That is true of MSVC and,
>>> I thought, some versions of GCC.  As Iago points out, some compilers
>>> ignore the #pragma altogether.  Since Mesa supports (or does it?) some
>>> of these compilers, we have to have the ifdef/define/endif guards.
>> 
>> Compilers have noticed that ifdef/define/endif is a thing and optimized
>> it, anyway.
>> 
>> https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
>
> That's cool!  I don't think GCC did that when I looked into this in
> 2010.  It sounds like the #pragma actually breaks the GCC optimization,
> so let's get rid of them all.

Shall I take that as an Acked-by for my patch?  It doesn't get rid of
them all but it's a start. ;)


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] squash: Fix up VPM read optimization.

2016-03-13 Thread Rhys Kidd
On 13 March 2016 at 17:09, Varad Gautam  wrote:

> - Do not reorder instructions reading packed sources (fixes piglit
> regressions with draw-vertices and draw-vertices-user).
> ---
> Eric, Rhys, with this bit we should be good to go.
>
>  src/gallium/drivers/vc4/vc4_opt_vpm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c
> b/src/gallium/drivers/vc4/vc4_opt_vpm.c
> index a4ee6af..d15b0c1 100644
> --- a/src/gallium/drivers/vc4/vc4_opt_vpm.c
> +++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
> @@ -77,7 +77,8 @@ qir_opt_vpm(struct vc4_compile *c)
>  continue;
>
>  for (int j = 0; j < qir_get_op_nsrc(inst->op); j++) {
> -if (inst->src[j].file != QFILE_TEMP)
> +if (inst->src[j].file != QFILE_TEMP ||
> +inst->src[j].pack)
>  continue;
>
>  uint32_t temp = inst->src[j].index;
> --
> 2.6.2
>
>
No piglit regressions, so this gets my

Tested-by: Rhys Kidd 

This applies to the squashed set of [0], [1], and [2] plus this patch.
Given I believe you do not yet have Mesa commit access, perhaps ask Eric if
he'd commit that set for you.

Thanks for your contribution Varad!

[0] https://patchwork.freedesktop.org/patch/76069/
[1] https://patchwork.freedesktop.org/patch/76070/
[2] https://patchwork.freedesktop.org/patch/76528/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] LLVMInitializeAMDGPU* undefined?

2016-03-13 Thread Chih-Wei Huang
2016-03-11 11:50 GMT+08:00 Jan Vesely :
> On Fri, 2016-03-11 at 10:09 +0800, Chih-Wei Huang wrote:
>> cc1: some warnings being treated as errors
>>
>>
>> But I'm still not sure whether if it should be fixed on the mesa
>> side.
>>
>> So my question is, what kind of fix do we want (i.e., acceptable by
>> the upstream)?
>
> this is the result of using llvm without AMDGPU backend. the header
> "llvm-c/Target.h" includes "llvm/Config/Targets.def" and declares
> functions based on backends listed there (selected at llvm build time).
>
> Your "llvm/Config/Targets.def" needs to include this line:
> LLVM_TARGET(AMDGPU)

Ah, thank you for pointed it out.
Originally I thought adding Android.mk for AMDGPU is enough.
I'll try to add this.

Anything else we need to do to "enable" AMDGPU backend?

> otherwise the functions are not declared.
> probably the only patch needed on mesa side is to detect this at
> configure time.

Android doesn't need configure.
What we need is to detect Android version then
we know the llvm version it uses and what backends it has.
The patch is already in android-x86 codebase but
probably not submited to mesa-dev yet.


-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] get: reconcile aliasing enums for MaxCombinedShaderOutputResources

2016-03-13 Thread Kenneth Graunke
On Thursday, March 10, 2016 6:26:43 PM PDT Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and
> MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only
> appear once.
> 
> Noticed while implementing ARB_shader_image_load_store without previously
> implementing SSBO.
> ---
>  src/mesa/main/get.c  | 7 +++
>  src/mesa/main/get_hash_params.py | 6 --
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 67c4f99..b0fadc9 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -384,6 +384,13 @@ static const int 
extra_ARB_shader_storage_buffer_object_and_geometry_shader[] =
> EXTRA_END
>  };
>  
> +static const int 
extra_ARB_shader_image_load_store_shader_storage_buffer_object_es31[] = {
> +   EXT(ARB_shader_image_load_store),
> +   EXT(ARB_shader_storage_buffer_object),
> +   EXTRA_API_ES31,
> +   EXTRA_END
> +};
> +
>  static const int extra_ARB_framebuffer_no_attachments_and_geometry_shader[] 
= {
> EXTRA_EXT_FB_NO_ATTACH_GS,
> EXTRA_END
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/
get_hash_params.py
> index f9d66f8..12c2189 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -495,9 +495,12 @@ descriptor=[
>[ "MAX_COMBINED_SHADER_STORAGE_BLOCKS", 
"CONTEXT_INT(Const.MaxCombinedShaderStorageBlocks), 
extra_ARB_shader_storage_buffer_object_es31" ],
>[ "MAX_SHADER_STORAGE_BLOCK_SIZE", 
"CONTEXT_INT(Const.MaxShaderStorageBlockSize), 
extra_ARB_shader_storage_buffer_object_es31" ],
>[ "MAX_SHADER_STORAGE_BUFFER_BINDINGS", 
"CONTEXT_INT(Const.MaxShaderStorageBufferBindings), 
extra_ARB_shader_storage_buffer_object_es31" ],
> -  [ "MAX_COMBINED_SHADER_OUTPUT_RESOURCES", 
"CONTEXT_INT(Const.MaxCombinedShaderOutputResources), 
extra_ARB_shader_storage_buffer_object_es31" ],
>[ "SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT", 
"CONTEXT_INT(Const.ShaderStorageBufferOffsetAlignment), 
extra_ARB_shader_storage_buffer_object_es31" ],
>[ "SHADER_STORAGE_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_shader_storage_buffer_object_es31" ],
> +
> +  # GL_ARB_shader_image_load_store / GL_ARB_shader_storage_buffer_object / 
GLES 3.1
> +  # (MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS in 
GL_ARB_shader_image_load_store)
> +  [ "MAX_COMBINED_SHADER_OUTPUT_RESOURCES", 
"CONTEXT_INT(Const.MaxCombinedShaderOutputResources), 
extra_ARB_shader_image_load_store_shader_storage_buffer_object_es31" ],
>  ]},
>  
>  # Enums in OpenGL Core profile and ES 3.1
> @@ -841,7 +844,6 @@ descriptor=[
>[ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
"CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather"],
>  
>  # GL_ARB_shader_image_load_store
> -  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
"CONTEXT_INT(Const.MaxCombinedShaderOutputResources), 
extra_ARB_shader_image_load_store" ],
>[ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store" ],
>  
>  # GL_EXT_polygon_offset_clamp
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/30] i965/fs: Add missing analysis invalidation in opt_sampler_eot().

2016-03-13 Thread Francisco Jerez
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  opt_sampler_eot() was allocating
registers and inserting and removing instructions, which makes the
cached liveness analysis calculation inconsistent with the shader IR,
so it must be invalidated.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ff67cf4..42bc5e2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2441,8 +2441,10 @@ fs_visitor::opt_sampler_eot()
 * we have enough space, but it will make sure the dead code eliminator 
kills
 * the instruction that this will replace.
 */
-   if (tex_inst->header_size != 0)
+   if (tex_inst->header_size != 0) {
+  invalidate_live_intervals();
   return true;
+   }
 
fs_reg send_header = ibld.vgrf(BRW_REGISTER_TYPE_F,
   load_payload->sources + 1);
@@ -2473,6 +2475,7 @@ fs_visitor::opt_sampler_eot()
tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
tex_inst->src[0] = send_header;
 
+   invalidate_live_intervals();
return true;
 }
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 19/30] i965/fs: Pass single backend_shader argument to the fs_live_variables constructor.

2016-03-13 Thread Francisco Jerez
This removes the dependency of fs_live_variables on fs_visitor.  The
IR analysis framework requires the analysis result to be constructible
with a single argument -- The second argument was redundant anyway.
---
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp | 12 ++--
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h   |  6 ++
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index 6da026e..4b0943f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -238,22 +238,22 @@ fs_live_variables::compute_start_end()
}
 }
 
-fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg)
-   : v(v), cfg(cfg)
+fs_live_variables::fs_live_variables(const backend_shader *s)
+   : cfg(s->cfg)
 {
mem_ctx = ralloc_context(NULL);
 
-   num_vgrfs = v->alloc.count;
+   num_vgrfs = s->alloc.count;
num_vars = 0;
var_from_vgrf = rzalloc_array(mem_ctx, int, num_vgrfs);
for (int i = 0; i < num_vgrfs; i++) {
   var_from_vgrf[i] = num_vars;
-  num_vars += v->alloc.sizes[i];
+  num_vars += s->alloc.sizes[i];
}
 
vgrf_from_var = rzalloc_array(mem_ctx, int, num_vars);
for (int i = 0; i < num_vgrfs; i++) {
-  for (unsigned j = 0; j < v->alloc.sizes[i]; j++) {
+  for (unsigned j = 0; j < s->alloc.sizes[i]; j++) {
  vgrf_from_var[var_from_vgrf[i] + j] = i;
   }
}
@@ -323,7 +323,7 @@ fs_visitor::calculate_live_intervals()
if (this->live_intervals)
   return;
 
-   this->live_intervals = new(mem_ctx) fs_live_variables(this, cfg);
+   this->live_intervals = new(mem_ctx) fs_live_variables(this);
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index c0943c7..e1cd12c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -32,7 +32,7 @@
 #include "util/bitset.h"
 
 struct cfg_t;
-class fs_visitor;
+struct backend_shader;
 
 namespace brw {
 
@@ -66,7 +66,7 @@ public:
 
DECLARE_RALLOC_CXX_OPERATORS(fs_live_variables)
 
-   fs_live_variables(fs_visitor *v, const cfg_t *cfg);
+   fs_live_variables(const backend_shader *s);
~fs_live_variables();
 
bool vars_interfere(int a, int b) const;
@@ -118,10 +118,8 @@ protected:
void compute_live_variables();
void compute_start_end();
 
-   fs_visitor *v;
const cfg_t *cfg;
void *mem_ctx;
-
 };
 
 } /* namespace brw */
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/30] i965/ir: Introduce simple IR analysis pass framework.

2016-03-13 Thread Francisco Jerez
Motivated in detail in the source code.  The only piece missing here
from the analysis pass infrastructure is some sort of mechanism to
broadcast changes in the IR to all existing analysis passes, which
will be addressed by a future commit.  The analysis_dependency_class
enum might seem a bit silly at this point, more interesting dependency
categories will be defined later on.
---
 src/mesa/drivers/dri/i965/Makefile.sources  |   1 +
 src/mesa/drivers/dri/i965/brw_ir_analysis.h | 146 
 2 files changed, 147 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_ir_analysis.h

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 1c6d2e9..ba6f841 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -39,6 +39,7 @@ i965_compiler_FILES = \
brw_interpolation_map.c \
brw_ir.h \
brw_ir_allocator.h \
+   brw_ir_analysis.h \
brw_ir_fs.h \
brw_ir_vec4.h \
brw_nir.h \
diff --git a/src/mesa/drivers/dri/i965/brw_ir_analysis.h 
b/src/mesa/drivers/dri/i965/brw_ir_analysis.h
new file mode 100644
index 000..31989ec
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_ir_analysis.h
@@ -0,0 +1,146 @@
+/* -*- c++ -*- */
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef BRW_IR_ANALYSIS_H
+#define BRW_IR_ANALYSIS_H
+
+namespace brw {
+   /**
+* Bitset of state categories that can influence the result of IR analysis
+* passes.
+*/
+   enum analysis_dependency_class {
+  /**
+   * The analysis doesn't depend on the IR, its result is effectively a
+   * constant during the compilation.
+   */
+  DEPENDENCY_NOTHING = 0,
+  /**
+   * The analysis depends on the program being literally the same (good
+   * luck...), any change in the input invalidates previous analysis
+   * computations.
+   */
+  DEPENDENCY_EVERYTHING = ~0
+   };
+
+   inline analysis_dependency_class
+   operator|(analysis_dependency_class x, analysis_dependency_class y)
+   {
+  return static_cast(
+ static_cast(x) | static_cast(y));
+   }
+}
+
+/**
+ * Instantiate a program analysis class \p L which can calculate an object of
+ * type \p T as result.  \p C is a closure that encapsulates whatever
+ * information is required as argument to run the analysis pass.  The purpose
+ * of this class is to make sure that:
+ *
+ *  - The analysis pass is executed lazily whenever it's needed and multiple
+ *executions are optimized out as long as the cached result remains marked
+ *up-to-date.
+ *
+ *  - There is no way to access the cached analysis result without first
+ *calling L::require(), which makes sure that the analysis pass is rerun
+ *if necessary.
+ *
+ *  - The cached result doesn't become inconsistent with the program for as
+ *long as it remains marked up-to-date. (This is only enforced in debug
+ *builds for performance reasons)
+ *
+ * The requirements on \p T are the following:
+ *
+ *  - Constructible with a single argument, as in 'x = T(c)' for \p c of type
+ *\p C.
+ *
+ *  - 'x.dependency_class()' on const \p x returns a bitset of
+ *brw::analysis_dependency_class specifying the set of IR objects that are
+ *required to remain invariant for the cached analysis result to be
+ *considered valid.
+ *
+ *  - 'x.validate(c)' on const \p x returns a boolean result specifying
+ *whether the analysis result \p x is consistent with the input IR.  This
+ *is currently only used for validation in debug builds.
+ */
+#define BRW_ANALYSIS(L, T, C)   \
+   class L {\
+   public:  \
+   

[Mesa-dev] [PATCH 05/30] i965/ir: Move base IR definitions into a separate header file.

2016-03-13 Thread Francisco Jerez
This pulls out the i965 IR definitions into a separate file and leaves
the top-level backend_shader structure and back-end compiler entry
points in brw_shader.h.  The purpose is to keep things tidy and
prevent a nasty circular dependency between brw_cfg.h and
brw_shader.h.  The logical dependency between these data structures
looks like:

 backend_shader (brw_shader.h) -> cfg_t (brw_cfg.h)
-> bblock_t (brw_cfg.h) -> backend_instruction (brw_shader.h)

This circular header dependency is currently resolved by using forward
declarations of cfg_t/bblock_t in brw_shader.h and having brw_cfg.h
include brw_shader.h, which seems backwards and won't work at all when
the forward declarations of cfg_t/bblock_t are no longer sufficient in
a future commit.
---
 src/mesa/drivers/dri/i965/Makefile.sources |   1 +
 src/mesa/drivers/dri/i965/brw_ir.h | 164 +
 src/mesa/drivers/dri/i965/brw_shader.h | 138 +---
 3 files changed, 167 insertions(+), 136 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_ir.h

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 4689588..1c6d2e9 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -37,6 +37,7 @@ i965_compiler_FILES = \
brw_fs_visitor.cpp \
brw_inst.h \
brw_interpolation_map.c \
+   brw_ir.h \
brw_ir_allocator.h \
brw_ir_fs.h \
brw_ir_vec4.h \
diff --git a/src/mesa/drivers/dri/i965/brw_ir.h 
b/src/mesa/drivers/dri/i965/brw_ir.h
new file mode 100644
index 000..1e4d8c7
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_ir.h
@@ -0,0 +1,164 @@
+/* -*- c++ -*- */
+/*
+ * Copyright © 2010-2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef BRW_IR_H
+#define BRW_IR_H
+
+#include "brw_reg.h"
+#include "compiler/glsl/list.h"
+
+#define MAX_SAMPLER_MESSAGE_SIZE 11
+#define MAX_VGRF_SIZE 16
+
+#ifdef __cplusplus
+struct backend_reg : private brw_reg
+{
+   backend_reg() {}
+   backend_reg(const struct brw_reg ®) : brw_reg(reg) {}
+
+   const brw_reg &as_brw_reg() const
+   {
+  assert(file == ARF || file == FIXED_GRF || file == MRF || file == IMM);
+  assert(reg_offset == 0);
+  return static_cast(*this);
+   }
+
+   brw_reg &as_brw_reg()
+   {
+  assert(file == ARF || file == FIXED_GRF || file == MRF || file == IMM);
+  assert(reg_offset == 0);
+  return static_cast(*this);
+   }
+
+   bool equals(const backend_reg &r) const;
+
+   bool is_zero() const;
+   bool is_one() const;
+   bool is_negative_one() const;
+   bool is_null() const;
+   bool is_accumulator() const;
+   bool in_range(const backend_reg &r, unsigned n) const;
+
+   /**
+* Offset within the virtual register.
+*
+* In the scalar backend, this is in units of a float per pixel for pre-
+* register allocation registers (i.e., one register in SIMD8 mode and two
+* registers in SIMD16 mode).
+*
+* For uniforms, this is in units of 1 float.
+*/
+   uint16_t reg_offset;
+
+   using brw_reg::type;
+   using brw_reg::file;
+   using brw_reg::negate;
+   using brw_reg::abs;
+   using brw_reg::address_mode;
+   using brw_reg::subnr;
+   using brw_reg::nr;
+
+   using brw_reg::swizzle;
+   using brw_reg::writemask;
+   using brw_reg::indirect_offset;
+   using brw_reg::vstride;
+   using brw_reg::width;
+   using brw_reg::hstride;
+
+   using brw_reg::f;
+   using brw_reg::d;
+   using brw_reg::ud;
+};
+
+struct bblock_t;
+#endif
+
+#ifdef __cplusplus
+struct backend_instruction : public exec_node {
+   bool is_3src() const;
+   bool is_tex() const;
+   bool is_math() const;
+   bool is_control_flow() const;
+   bool is_commutative() const;
+   bool can_do_source_mods() const;
+   bool can_do_saturate() const;
+   bool can_do_cmod() const;
+   bool r

[Mesa-dev] [PATCH 15/30] i965/ir: Mark virtual_grf_interferes and vars_interfere as const.

2016-03-13 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.h| 2 +-
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp   | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4.h  | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index e47a8e5..4f5d085 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -162,7 +162,7 @@ public:
bool remove_duplicate_mrf_writes();
 
bool opt_sampler_eot();
-   bool virtual_grf_interferes(int a, int b);
+   bool virtual_grf_interferes(int a, int b) const;
void schedule_instructions(instruction_scheduler_mode mode);
void insert_gen4_send_dependency_workarounds();
void insert_gen4_pre_send_dependency_workarounds(bblock_t *block,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index 4ebefe4..676619a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -333,14 +333,14 @@ fs_visitor::calculate_live_intervals()
 }
 
 bool
-fs_live_variables::vars_interfere(int a, int b)
+fs_live_variables::vars_interfere(int a, int b) const
 {
return !(end[b] <= start[a] ||
 end[a] <= start[b]);
 }
 
 bool
-fs_visitor::virtual_grf_interferes(int a, int b)
+fs_visitor::virtual_grf_interferes(int a, int b) const
 {
return !(virtual_grf_end[a] <= virtual_grf_start[b] ||
 virtual_grf_end[b] <= virtual_grf_start[a]);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 6b49cfc..fb5aee7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -69,7 +69,7 @@ public:
fs_live_variables(fs_visitor *v, const cfg_t *cfg);
~fs_live_variables();
 
-   bool vars_interfere(int a, int b);
+   bool vars_interfere(int a, int b) const;
int var_from_reg(const fs_reg ®) const
{
   return var_from_vgrf[reg.nr] + reg.reg_offset;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index c3f9ca5..bb0329f0 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -143,7 +143,7 @@ public:
bool dead_code_eliminate();
int var_range_start(unsigned v, unsigned n) const;
int var_range_end(unsigned v, unsigned n) const;
-   bool virtual_grf_interferes(int a, int b);
+   bool virtual_grf_interferes(int a, int b) const;
bool opt_cmod_propagation();
bool opt_copy_propagation(bool do_constant_prop = true);
bool opt_cse_local(bblock_t *block);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 8e1efa4..ecf5dee 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -339,7 +339,7 @@ vec4_visitor::var_range_end(unsigned v, unsigned n) const
 }
 
 bool
-vec4_visitor::virtual_grf_interferes(int a, int b)
+vec4_visitor::virtual_grf_interferes(int a, int b) const
 {
return !((var_range_end(4 * alloc.offsets[a], 4 * alloc.sizes[a]) <=
  var_range_start(4 * alloc.offsets[b], 4 * alloc.sizes[b])) ||
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/30] i965/ir: Reverse inclusion dependency between brw_cfg.h and brw_shader.h.

2016-03-13 Thread Francisco Jerez
This reflects the natural dependency relationship between brw_cfg.h
and brw_shader.h.  brw_cfg.h only requires the base IR definitions
which are now part of a separate header.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp  | 1 +
 src/mesa/drivers/dri/i965/brw_cfg.h| 5 +++--
 src/mesa/drivers/dri/i965/brw_predicated_break.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_shader.h | 4 +---
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 5d46615..4fcee72 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -26,6 +26,7 @@
  */
 
 #include "brw_cfg.h"
+#include "brw_shader.h"
 
 /** @file brw_cfg.cpp
  *
diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
b/src/mesa/drivers/dri/i965/brw_cfg.h
index a2ca6b1..0813798 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -28,7 +28,7 @@
 #ifndef BRW_CFG_H
 #define BRW_CFG_H
 
-#include "brw_shader.h"
+#include "brw_ir.h"
 
 struct bblock_t;
 
@@ -46,7 +46,8 @@ struct bblock_link {
struct bblock_t *block;
 };
 
-struct backend_instruction;
+struct backend_shader;
+struct cfg_t;
 
 struct bblock_t {
 #ifdef __cplusplus
diff --git a/src/mesa/drivers/dri/i965/brw_predicated_break.cpp 
b/src/mesa/drivers/dri/i965/brw_predicated_break.cpp
index 607715d..8aeaa24 100644
--- a/src/mesa/drivers/dri/i965/brw_predicated_break.cpp
+++ b/src/mesa/drivers/dri/i965/brw_predicated_break.cpp
@@ -21,7 +21,7 @@
  * IN THE SOFTWARE.
  */
 
-#include "brw_cfg.h"
+#include "brw_shader.h"
 
 using namespace brw;
 
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index 3f38dc1..81d3f96 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -27,9 +27,7 @@
 #include "brw_reg.h"
 #include "brw_defines.h"
 #include "brw_context.h"
-#include "brw_ir.h"
-
-struct cfg_t;
+#include "brw_cfg.h"
 
 #ifdef __cplusplus
 #include "brw_ir_allocator.h"
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 18/30] i965/vec4: Restructure live intervals computation code.

2016-03-13 Thread Francisco Jerez
This makes the structure of the vec4 live intervals calculation more
similar to the FS back-end liveness analysis code.  The non-CF-aware
start/end computation is moved into the same pass that calculates the
block-local def/use sets, which saves quite a bit of code, while the
CF-aware start/end computation is moved into a separate
compute_start_end() function as is done in the FS back-end.
---
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 133 -
 .../drivers/dri/i965/brw_vec4_live_variables.h |   1 +
 2 files changed, 54 insertions(+), 80 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index d2a7adf..7042371 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -30,6 +30,8 @@
 
 using namespace brw;
 
+#define MAX_INSTRUCTION (1 << 30)
+
 /** @file brw_vec4_live_variables.cpp
  *
  * Support for computing at the basic block level which variables
@@ -40,7 +42,7 @@ using namespace brw;
  */
 
 /**
- * Sets up the use[] and def[] arrays.
+ * Sets up the use/def arrays and block-local approximation of the live ranges.
  *
  * The basic-block-level live variable analysis needs to know which
  * variables get used before they're completely defined, and which
@@ -73,13 +75,17 @@ vec4_live_variables::setup_def_use()
   foreach_inst_in_block(vec4_instruction, inst, block) {
  struct block_data *bd = &block_data[block->num];
 
-/* Set use[] for this instruction */
+ /* Set up the instruction uses. */
 for (unsigned int i = 0; i < 3; i++) {
if (inst->src[i].file == VGRF) {
for (unsigned j = 0; j < inst->regs_read(i); j++) {
   for (int c = 0; c < 4; c++) {
  const unsigned v =
 var_from_reg(alloc, offset(inst->src[i], j), c);
+
+ start[v] = MIN2(start[v], ip);
+ end[v] = ip;
+
  if (!BITSET_TEST(bd->def, v))
 BITSET_SET(bd->use, v);
   }
@@ -93,18 +99,23 @@ vec4_live_variables::setup_def_use()
 }
  }
 
-/* Check for unconditional writes to whole registers. These
- * are the things that screen off preceding definitions of a
- * variable, and thus qualify for being in def[].
- */
-if (inst->dst.file == VGRF &&
-(!inst->predicate || inst->opcode == BRW_OPCODE_SEL)) {
+ /* Set up the instruction defs. */
+ if (inst->dst.file == VGRF) {
 for (unsigned i = 0; i < inst->regs_written; i++) {
for (int c = 0; c < 4; c++) {
   if (inst->dst.writemask & (1 << c)) {
  const unsigned v =
 var_from_reg(alloc, offset(inst->dst, i), c);
- if (!BITSET_TEST(bd->use, v))
+
+ start[v] = MIN2(start[v], ip);
+ end[v] = ip;
+
+ /* Check for unconditional register writes, these are the
+  * things that screen off preceding definitions of a
+  * variable, and thus qualify for being in def[].
+  */
+ if ((!inst->predicate || inst->opcode == BRW_OPCODE_SEL) 
&&
+ !BITSET_TEST(bd->use, v))
 BITSET_SET(bd->def, v);
   }
}
@@ -182,6 +193,30 @@ vec4_live_variables::compute_live_variables()
}
 }
 
+/**
+ * Extend the start/end ranges for each variable to account for the
+ * new information calculated from control flow.
+ */
+void
+vec4_live_variables::compute_start_end()
+{
+   foreach_block (block, cfg) {
+  const struct block_data &bd = block_data[block->num];
+
+  for (int i = 0; i < num_vars; i++) {
+ if (BITSET_TEST(bd.livein, i)) {
+start[i] = MIN2(start[i], block->start_ip);
+end[i] = MAX2(end[i], block->start_ip);
+ }
+
+ if (BITSET_TEST(bd.liveout, i)) {
+start[i] = MIN2(start[i], block->end_ip);
+end[i] = MAX2(end[i], block->end_ip);
+ }
+  }
+   }
+}
+
 vec4_live_variables::vec4_live_variables(const simple_allocator &alloc,
  cfg_t *cfg)
: alloc(alloc), cfg(cfg)
@@ -189,6 +224,14 @@ vec4_live_variables::vec4_live_variables(const 
simple_allocator &alloc,
mem_ctx = ralloc_context(NULL);
 
num_vars = alloc.total_size * 4;
+   start = ralloc_array(mem_ctx, int, num_vars);
+   end = ralloc_array(mem_ctx, int, num_vars);
+
+   for (int i = 0; i < num_vars; i++) {
+  start[i] = MAX_INSTRUCTION;
+  end[i] = -1;
+   }
+
block_data = rzalloc_array(mem_ctx, struct block_data, cfg->num_blocks);
 
bitset_words = BITSET_WORDS(num_vars);
@@ -206,6 +249,7 @@ vec4_live_var

[Mesa-dev] [PATCH 22/30] i965/vec4: Add live interval validation pass.

2016-03-13 Thread Francisco Jerez
This could be improved somewhat with additional validation of the
calculated live in/out sets and by checking that the calculated live
intervals are minimal (which isn't strictly necessary to guarantee the
correctness of the program).  This should be good enough though to
catch accidental use of stale liveness results due to missing or
incorrect analysis invalidation.
---
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 43 ++
 .../drivers/dri/i965/brw_vec4_live_variables.h |  3 ++
 2 files changed, 46 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 297d027..a7c991f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -294,6 +294,49 @@ vec4_visitor::invalidate_live_intervals()
live_intervals = NULL;
 }
 
+static bool
+check_register_live_range(const vec4_live_variables *live, int ip,
+  unsigned var, unsigned n)
+{
+   for (unsigned j = 0; j < n; j += 4) {
+  if (var + j >= unsigned(live->num_vars) ||
+  live->start[var + j] > ip || live->end[var + j] < ip)
+ return false;
+   }
+
+   return true;
+}
+
+bool
+vec4_live_variables::validate(const backend_shader *s) const
+{
+   unsigned ip = 0;
+
+   foreach_block_and_inst(block, vec4_instruction, inst, s->cfg) {
+  for (unsigned c = 0; c < 4; c++) {
+ if (inst->dst.writemask & (1 << c)) {
+for (unsigned i = 0; i < 3; i++) {
+   if (inst->src[i].file == VGRF &&
+   !check_register_live_range(this, ip,
+  var_from_reg(alloc, 
inst->src[i], c),
+  inst->regs_read(i)))
+  return false;
+}
+
+if (inst->dst.file == VGRF &&
+!check_register_live_range(this, ip,
+   var_from_reg(alloc, inst->dst, c),
+   inst->regs_written))
+   return false;
+ }
+  }
+
+  ip++;
+   }
+
+   return true;
+}
+
 int
 vec4_live_variables::var_range_start(unsigned v, unsigned n) const
 {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index 7384f04..f6cefa0 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -68,6 +68,9 @@ public:
vec4_live_variables(const backend_shader *s);
~vec4_live_variables();
 
+   bool
+   validate(const backend_shader *s) const;
+
int num_vars;
int bitset_words;
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/30] i965/vec4: Move all live interval analysis results into vec4_live_variables.

2016-03-13 Thread Francisco Jerez
This moves the following methods that are currently defined in
vec4_visitor (even though they are side products of the liveness
analysis computation) and are already implemented in
brw_vec4_live_variables.cpp:

> int var_range_start(unsigned v, unsigned n) const;
> int var_range_end(unsigned v, unsigned n) const;
> bool virtual_grf_interferes(int a, int b) const;
> int *virtual_grf_start;
> int *virtual_grf_end;

It makes sense for them to be part of the vec4_live_variables object,
because they have the same lifetime as other liveness analysis results
and because this will allow some extra validation to happen wherever
they are accessed in order to make sure that we only ever use
up-to-date liveness analysis results.

The naming of the virtual_grf_start/end arrays was rather misleading,
they were indexed by variable rather than by vgrf, this renames them
start/end to match the FS liveness analysis pass.  The churn in the
definition of var_range_start/end is just in order to avoid a
collision between the start/end arrays and local variables declared
with the same name.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4.h   |  5 
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  3 +-
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 32 +-
 .../drivers/dri/i965/brw_vec4_live_variables.h | 11 
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  2 --
 7 files changed, 35 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index cf5b114..7c2c9cc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1061,7 +1061,8 @@ vec4_visitor::opt_register_coalesce()
   /* Can't coalesce this GRF if someone else was going to
* read it later.
*/
-  if (var_range_end(var_from_reg(alloc, inst->src[0]), 4) > ip)
+  if (live_intervals->var_range_end(
+ var_from_reg(alloc, inst->src[0]), 4) > ip)
 continue;
 
   /* We need to check interference with the final destination between this
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index bb0329f0..35bd265 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -101,8 +101,6 @@ public:
 
int first_non_payload_grf;
unsigned int max_grf;
-   int *virtual_grf_start;
-   int *virtual_grf_end;
brw::vec4_live_variables *live_intervals;
dst_reg userplane[MAX_CLIP_PLANES];
 
@@ -141,9 +139,6 @@ public:
bool opt_vector_float();
bool opt_reduce_swizzle();
bool dead_code_eliminate();
-   int var_range_start(unsigned v, unsigned n) const;
-   int var_range_end(unsigned v, unsigned n) const;
-   bool virtual_grf_interferes(int a, int b) const;
bool opt_cmod_propagation();
bool opt_copy_propagation(bool do_constant_prop = true);
bool opt_cse_local(bblock_t *block);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
index 38dd3ed..3a1dfda 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
@@ -246,7 +246,8 @@ vec4_visitor::opt_cse_local(bblock_t *block)
  * more -- a sure sign they'll fail operands_match().
  */
 if (src->file == VGRF) {
-   if (var_range_end(var_from_reg(alloc, *src), 4) < ip) {
+   if (live_intervals->var_range_end(
+  var_from_reg(alloc, *src), 4) < ip) {
   entry->remove();
   ralloc_free(entry);
   break;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index ecf5dee..d2a7adf 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -240,10 +240,6 @@ vec4_visitor::calculate_live_intervals()
 
int *start = ralloc_array(mem_ctx, int, this->alloc.total_size * 4);
int *end = ralloc_array(mem_ctx, int, this->alloc.total_size * 4);
-   ralloc_free(this->virtual_grf_start);
-   ralloc_free(this->virtual_grf_end);
-   this->virtual_grf_start = start;
-   this->virtual_grf_end = end;
 
for (unsigned i = 0; i < this->alloc.total_size * 4; i++) {
   start[i] = MAX_INSTRUCTION;
@@ -290,6 +286,11 @@ vec4_visitor::calculate_live_intervals()
 * this point we're distilling it down to vgrfs.
 */
this->live_intervals = new(mem_ctx) vec4_live_variables(alloc, cfg);
+   /* XXX -- This belongs in the constructor of vec4_live_variables, will be
+* cleaned up later.
+*/
+   this->live_intervals->start = start;
+   this->live_intervals->end = end;
 
foreach_block (block, cfg) {
   const struct vec4_live_variables::block_data *bd =

[Mesa-dev] [PATCH 10/30] i965/vec4: Reverse inclusion dependency between brw_vec4_live_variables.h and brw_vec4.h.

2016-03-13 Thread Francisco Jerez
brw_vec4.h (in particular vec4_visitor) is logically a user of the
live variables analysis pass, not the other way around.
brw_vec4_live_variables.h requires the definition of some VEC4 IR data
structures to compile, but those can be obtained directly from
brw_ir_vec4.h without including brw_vec4.h.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp| 1 -
 src/mesa/drivers/dri/i965/brw_vec4.h  | 4 +---
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.h   | 3 +--
 4 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index baf72a2..a3d1f7c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -27,7 +27,6 @@
 #include "brw_vs.h"
 #include "brw_nir.h"
 #include "brw_vec4_builder.h"
-#include "brw_vec4_live_variables.h"
 #include "brw_dead_control_flow.h"
 #include "program/prog_parameter.h"
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index d43a5a8..b56b35d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -29,6 +29,7 @@
 
 #ifdef __cplusplus
 #include "brw_ir_vec4.h"
+#include "brw_vec4_live_variables.h"
 #endif
 
 #include "compiler/glsl/ir.h"
@@ -52,9 +53,6 @@ brw_vec4_generate_assembly(const struct brw_compiler 
*compiler,
 } /* extern "C" */
 
 namespace brw {
-
-class vec4_live_variables;
-
 /**
  * The vertex shader front-end.
  *
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index b4476ef..8e1efa4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -25,7 +25,7 @@
  *
  */
 
-#include "brw_cfg.h"
+#include "brw_vec4.h"
 #include "brw_vec4_live_variables.h"
 
 using namespace brw;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index ae8e406..a6e04b2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -28,12 +28,11 @@
 #ifndef BRW_VEC4_LIVE_VARIABLES_H
 #define BRW_VEC4_LIVE_VARIABLES_H
 
+#include "brw_ir_vec4.h"
 #include "util/bitset.h"
-#include "brw_vec4.h"
 
 namespace brw {
 
-
 class vec4_live_variables {
 public:
struct block_data {
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/30] i965/ir: Nest definition of live variables block_data structures.

2016-03-13 Thread Francisco Jerez
These two structures have exactly the same name which prevents the two
files from being included at the same time and could cause serious
trouble in the future if it ever leads to a (silent) violation of the
C++ one definition rule.
---
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  | 52 +++---
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   |  3 +-
 .../drivers/dri/i965/brw_vec4_live_variables.h | 51 ++---
 3 files changed, 54 insertions(+), 52 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 96cadea..882315a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -32,34 +32,34 @@ struct cfg_t;
 
 namespace brw {
 
-struct block_data {
-   /**
-* Which variables are defined before being used in the block.
-*
-* Note that for our purposes, "defined" means unconditionally, completely
-* defined.
-*/
-   BITSET_WORD *def;
-
-   /**
-* Which variables are used before being defined in the block.
-*/
-   BITSET_WORD *use;
-
-   /** Which defs reach the entry point of the block. */
-   BITSET_WORD *livein;
-
-   /** Which defs reach the exit point of the block. */
-   BITSET_WORD *liveout;
-
-   BITSET_WORD flag_def[1];
-   BITSET_WORD flag_use[1];
-   BITSET_WORD flag_livein[1];
-   BITSET_WORD flag_liveout[1];
-};
-
 class fs_live_variables {
 public:
+   struct block_data {
+  /**
+   * Which variables are defined before being used in the block.
+   *
+   * Note that for our purposes, "defined" means unconditionally, 
completely
+   * defined.
+   */
+  BITSET_WORD *def;
+
+  /**
+   * Which variables are used before being defined in the block.
+   */
+  BITSET_WORD *use;
+
+  /** Which defs reach the entry point of the block. */
+  BITSET_WORD *livein;
+
+  /** Which defs reach the exit point of the block. */
+  BITSET_WORD *liveout;
+
+  BITSET_WORD flag_def[1];
+  BITSET_WORD flag_use[1];
+  BITSET_WORD flag_livein[1];
+  BITSET_WORD flag_liveout[1];
+   };
+
DECLARE_RALLOC_CXX_OPERATORS(fs_live_variables)
 
fs_live_variables(fs_visitor *v, const cfg_t *cfg);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 57d5fbb..b4476ef 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -292,7 +292,8 @@ vec4_visitor::calculate_live_intervals()
this->live_intervals = new(mem_ctx) vec4_live_variables(alloc, cfg);
 
foreach_block (block, cfg) {
-  struct block_data *bd = &live_intervals->block_data[block->num];
+  const struct vec4_live_variables::block_data *bd =
+ &live_intervals->block_data[block->num];
 
   for (int i = 0; i < live_intervals->num_vars; i++) {
  if (BITSET_TEST(bd->livein, i)) {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index 12d281e..f55bed7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -30,34 +30,35 @@
 
 namespace brw {
 
-struct block_data {
-   /**
-* Which variables are defined before being used in the block.
-*
-* Note that for our purposes, "defined" means unconditionally, completely
-* defined.
-*/
-   BITSET_WORD *def;
-
-   /**
-* Which variables are used before being defined in the block.
-*/
-   BITSET_WORD *use;
-
-   /** Which defs reach the entry point of the block. */
-   BITSET_WORD *livein;
-
-   /** Which defs reach the exit point of the block. */
-   BITSET_WORD *liveout;
-
-   BITSET_WORD flag_def[1];
-   BITSET_WORD flag_use[1];
-   BITSET_WORD flag_livein[1];
-   BITSET_WORD flag_liveout[1];
-};
 
 class vec4_live_variables {
 public:
+   struct block_data {
+  /**
+   * Which variables are defined before being used in the block.
+   *
+   * Note that for our purposes, "defined" means unconditionally, 
completely
+   * defined.
+   */
+  BITSET_WORD *def;
+
+  /**
+   * Which variables are used before being defined in the block.
+   */
+  BITSET_WORD *use;
+
+  /** Which defs reach the entry point of the block. */
+  BITSET_WORD *livein;
+
+  /** Which defs reach the exit point of the block. */
+  BITSET_WORD *liveout;
+
+  BITSET_WORD flag_def[1];
+  BITSET_WORD flag_use[1];
+  BITSET_WORD flag_livein[1];
+  BITSET_WORD flag_liveout[1];
+   };
+
DECLARE_RALLOC_CXX_OPERATORS(vec4_live_variables)
 
vec4_live_variables(const simple_allocator &alloc, cfg_t *cfg);
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/m

[Mesa-dev] [PATCH 02/30] i965/fs: Add missing analysis invalidation in fixup_3src_null_dest().

2016-03-13 Thread Francisco Jerez
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  fixup_3src_null_dest() was allocating
registers which makes the cached liveness analysis calculation
incomplete, so it must be invalidated.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 42bc5e2..86d2bd9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5190,12 +5190,18 @@ fs_visitor::optimize()
 void
 fs_visitor::fixup_3src_null_dest()
 {
+   bool progress = false;
+
foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
   if (inst->is_3src() && inst->dst.is_null()) {
  inst->dst = fs_reg(VGRF, alloc.allocate(dispatch_width / 8),
 inst->dst.type);
+ progress = true;
   }
}
+
+   if (progress)
+  invalidate_live_intervals();
 }
 
 void
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/30] i965: IR analysis pass framework.

2016-03-13 Thread Francisco Jerez
The purpose of this series is to introduce some lightweight
infrastructure intended to make room for additional analysis passes in
the i965 back-end without increasing the clutter of the visitor
objects, to reduce duplication of logic between analysis passes, and
to prevent some mistakes that are fairly difficult to avoid under the
current (rather informal) approach.

Patches 1-3 of this series fix some long-standing bugs that were
uncovered by the additional validation done by the analysis pass
framework.  The bugs could be hit during either shader-db or piglit
runs.  All the three caused us to use inconsistent live interval
metadata in some cases and could potentially lead to program
miscompilation.  I don't know whether there are any real-life
workloads where it actually caused corruption, in doubt it seems
sensible to back-port them to stable branches.

Patches 5-10 is mostly trivial header file churn intended to
clean things up and prevent circular header dependencies in cases
where the header inclusion order was the opposite of their
logical dependency order.

Patches 11-13 is the core of this series.  They introduce the general
framework that will be put to use in the following patches.

Patch 14 instruments the whole back-end with analysis metadata
invalidation and might be one of the more difficult to review.  On the
bright side it's no longer a particularly critical thing to get right,
because any mistake in the invalidation instrumentation will
eventually trigger an assertion failure if it leads to a situation
where we attempt to use analysis metadata that has become inconsistent
with the program, as demonstrated in patches 1-3.

The rest of the series is mainly about converting existing
analysis passes to the new framework.  Patches 16-25 port the
VEC4 and FS liveness analysis passes, patches 26-27 port the
dominance analysis pass and patch 29 ports the register pressure
calculation in the FS back-end.

For a branch in testable form see:
 https://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-ir-analysis

Enjoy.

[PATCH 01/30] i965/fs: Add missing analysis invalidation in opt_sampler_eot().
[PATCH 02/30] i965/fs: Add missing analysis invalidation in 
fixup_3src_null_dest().
[PATCH 03/30] i965/vec4: Consider removal of no-op MOVs as progress during 
register coalesce.
[PATCH 04/30] i965/fs: Restrict inequality that can only hold equal in saturate 
propagation.
[PATCH 05/30] i965/ir: Move base IR definitions into a separate header file.
[PATCH 06/30] i965/ir: Reverse inclusion dependency between brw_cfg.h and 
brw_shader.h.
[PATCH 07/30] i965/ir: Nest definition of live variables block_data structures.
[PATCH 08/30] i965/ir: Add include guards to the live variables header files.
[PATCH 09/30] i965/fs: Reverse inclusion dependency between 
brw_fs_live_variables.h and brw_fs.h.
[PATCH 10/30] i965/vec4: Reverse inclusion dependency between 
brw_vec4_live_variables.h and brw_vec4.h.
[PATCH 11/30] i965/ir: Introduce simple IR analysis pass framework.
[PATCH 12/30] i965/ir: Introduce backend_shader method to propagate IR changes 
to analysis passes.
[PATCH 13/30] i965/ir: Define more detailed analysis dependency classes.
[PATCH 14/30] i965/ir: Pass detailed dependency classes to 
invalidate_analysis().
[PATCH 15/30] i965/ir: Mark virtual_grf_interferes and vars_interfere as const.
[PATCH 16/30] i965/fs: Move all live interval analysis results into 
fs_live_variables.
[PATCH 17/30] i965/vec4: Move all live interval analysis results into 
vec4_live_variables.
[PATCH 18/30] i965/vec4: Restructure live intervals computation code.
[PATCH 19/30] i965/fs: Pass single backend_shader argument to the 
fs_live_variables constructor.
[PATCH 20/30] i965/vec4: Pass single backend_shader argument to the 
vec4_live_variables constructor.
[PATCH 21/30] i965/fs: Add live interval validation pass.
[PATCH 22/30] i965/vec4: Add live interval validation pass.
[PATCH 23/30] i965/fs: Switch liveness analysis to IR analysis framework.
[PATCH 24/30] i965/vec4: Switch liveness analysis to IR analysis framework.
[PATCH 25/30] i965/ir: Drop invalidate_live_intervals().
[PATCH 26/30] i965/ir: Move idom tree calculation and related logic into 
analysis object.
[PATCH 27/30] i965/ir: Move dominance tree data structure into idom_tree object.
[PATCH 28/30] i965/ir: Simplify new_idom reduction in dominance tree 
calculation.
[PATCH 29/30] i965/fs: Move register pressure calculation into IR analysis 
object.
[PATCH 30/30] i965/fs: Calculate num_instructions in O(1) during register 
pressure calculation.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 20/30] i965/vec4: Pass single backend_shader argument to the vec4_live_variables constructor.

2016-03-13 Thread Francisco Jerez
The IR analysis framework requires the analysis result to be
constructible with a single argument.
---
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp | 7 +++
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.h   | 4 +++-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 7042371..297d027 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -217,9 +217,8 @@ vec4_live_variables::compute_start_end()
}
 }
 
-vec4_live_variables::vec4_live_variables(const simple_allocator &alloc,
- cfg_t *cfg)
-   : alloc(alloc), cfg(cfg)
+vec4_live_variables::vec4_live_variables(const backend_shader *s)
+   : alloc(s->alloc), cfg(s->cfg)
 {
mem_ctx = ralloc_context(NULL);
 
@@ -285,7 +284,7 @@ vec4_visitor::calculate_live_intervals()
 * The control flow-aware analysis was done at a channel level, while at
 * this point we're distilling it down to vgrfs.
 */
-   this->live_intervals = new(mem_ctx) vec4_live_variables(alloc, cfg);
+   this->live_intervals = new(mem_ctx) vec4_live_variables(this);
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index f7a6932..7384f04 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -31,6 +31,8 @@
 #include "brw_ir_vec4.h"
 #include "util/bitset.h"
 
+struct backend_shader;
+
 namespace brw {
 
 class vec4_live_variables {
@@ -63,7 +65,7 @@ public:
 
DECLARE_RALLOC_CXX_OPERATORS(vec4_live_variables)
 
-   vec4_live_variables(const simple_allocator &alloc, cfg_t *cfg);
+   vec4_live_variables(const backend_shader *s);
~vec4_live_variables();
 
int num_vars;
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/30] i965/ir: Add include guards to the live variables header files.

2016-03-13 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h   | 5 +
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.h | 5 +
 2 files changed, 10 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 882315a..a5c1764 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -25,6 +25,9 @@
  *
  */
 
+#ifndef BRW_FS_LIVE_VARIABLES_H
+#define BRW_FS_LIVE_VARIABLES_H
+
 #include "brw_fs.h"
 #include "util/bitset.h"
 
@@ -113,3 +116,5 @@ protected:
 };
 
 } /* namespace brw */
+
+#endif
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index f55bed7..ae8e406 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -25,6 +25,9 @@
  *
  */
 
+#ifndef BRW_VEC4_LIVE_VARIABLES_H
+#define BRW_VEC4_LIVE_VARIABLES_H
+
 #include "util/bitset.h"
 #include "brw_vec4.h"
 
@@ -99,3 +102,5 @@ var_from_reg(const simple_allocator &alloc, const dst_reg 
®,
 }
 
 } /* namespace brw */
+
+#endif
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/30] i965/ir: Pass detailed dependency classes to invalidate_analysis().

2016-03-13 Thread Francisco Jerez
Have fun reading through the whole back-end optimizer to verify
whether I've missed any dependency flags -- Or alternatively, just
trust that any mistake here will trigger an assertion failure during
analysis pass validation if it ever poses a problem for the
consistency of any of the analysis passes managed by the framework.
---
 .../drivers/dri/i965/brw_dead_control_flow.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 41 +++---
 .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   |  2 +-
 .../drivers/dri/i965/brw_fs_combine_constants.cpp  |  2 +-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  3 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  2 +-
 .../dri/i965/brw_fs_dead_code_eliminate.cpp|  2 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  2 +-
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_sel_peephole.cpp  |  2 +-
 src/mesa/drivers/dri/i965/brw_predicated_break.cpp |  2 +-
 .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 +--
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 15 
 .../drivers/dri/i965/brw_vec4_cmod_propagation.cpp |  2 +-
 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  2 +-
 .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  2 +-
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |  2 +-
 18 files changed, 48 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp 
b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
index 65e009c..75c7be3 100644
--- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
+++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
@@ -111,7 +111,7 @@ dead_control_flow_eliminate(backend_shader *s)
}
 
if (progress)
-  s->invalidate_analysis(DEPENDENCY_EVERYTHING);
+  s->invalidate_analysis(DEPENDENCY_BLOCKS | DEPENDENCY_INSTRUCTIONS);
 
return progress;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 60924b6..9183d1f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1843,7 +1843,7 @@ fs_visitor::split_virtual_grfs()
  }
   }
}
-   invalidate_analysis(DEPENDENCY_EVERYTHING);
+   invalidate_analysis(DEPENDENCY_INSTRUCTION_DETAIL | DEPENDENCY_VARIABLES);
 }
 
 /**
@@ -1884,7 +1884,7 @@ fs_visitor::compact_virtual_grfs()
   } else {
  remap_table[i] = new_index;
  alloc.sizes[new_index] = alloc.sizes[i];
- invalidate_analysis(DEPENDENCY_EVERYTHING);
+ invalidate_analysis(DEPENDENCY_INSTRUCTION_DETAIL | 
DEPENDENCY_VARIABLES);
  ++new_index;
   }
}
@@ -2088,7 +2088,7 @@ fs_visitor::demote_pull_constants()
  inst->src[i].reg_offset = 0;
   }
}
-   invalidate_analysis(DEPENDENCY_EVERYTHING);
+   invalidate_analysis(DEPENDENCY_INSTRUCTIONS | DEPENDENCY_VARIABLES);
 }
 
 bool
@@ -2316,7 +2316,8 @@ fs_visitor::opt_algebraic()
}
 
if (progress)
-  invalidate_analysis(DEPENDENCY_EVERYTHING);
+  invalidate_analysis(DEPENDENCY_INSTRUCTION_DATA_FLOW |
+  DEPENDENCY_INSTRUCTION_DETAIL);
 
return progress;
 }
@@ -2367,7 +2368,7 @@ fs_visitor::opt_zero_samples()
}
 
if (progress)
-  invalidate_analysis(DEPENDENCY_EVERYTHING);
+  invalidate_analysis(DEPENDENCY_INSTRUCTION_DETAIL);
 
return progress;
 }
@@ -2446,7 +2447,7 @@ fs_visitor::opt_sampler_eot()
 * the instruction that this will replace.
 */
if (tex_inst->header_size != 0) {
-  invalidate_analysis(DEPENDENCY_EVERYTHING);
+  invalidate_analysis(DEPENDENCY_INSTRUCTIONS);
   return true;
}
 
@@ -2479,7 +2480,7 @@ fs_visitor::opt_sampler_eot()
tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
tex_inst->src[0] = send_header;
 
-   invalidate_analysis(DEPENDENCY_EVERYTHING);
+   invalidate_analysis(DEPENDENCY_INSTRUCTIONS | DEPENDENCY_VARIABLES);
return true;
 }
 
@@ -2532,7 +2533,7 @@ fs_visitor::opt_register_renaming()
}
 
if (progress) {
-  invalidate_analysis(DEPENDENCY_EVERYTHING);
+  invalidate_analysis(DEPENDENCY_INSTRUCTION_DETAIL | 
DEPENDENCY_VARIABLES);
 
   for (unsigned i = 0; i < ARRAY_SIZE(delta_xy); i++) {
  if (delta_xy[i].file == VGRF && remap[delta_xy[i].nr] != -1) {
@@ -2580,7 +2581,7 @@ fs_visitor::opt_redundant_discard_jumps()
}
 
if (progress)
-  invalidate_analysis(DEPENDENCY_EVERYTHING);
+  invalidate_analysis(DEPENDENCY_INSTRUCTIONS);
 
return progress;
 }
@@ -2740,7 +2741,7 @@ fs_visitor::compute_to_mrf()
}
 
if (progress)
-  invalidate_analysis(DEPENDENCY_EVERYTHING);
+  invalidate_analysis(DEPENDENCY_INSTRUCTIONS);
 
return progress;
 }
@@ -2790,7 +2791,7 @@ fs_visitor::eliminate_find_live_channel()
}
 
if (progress)
-  invalidate_analysis(DE

[Mesa-dev] [PATCH 12/30] i965/ir: Introduce backend_shader method to propagate IR changes to analysis passes.

2016-03-13 Thread Francisco Jerez
The invalidate_analysis() method knows what analysis passes there are
in the back-end and calls their invalidate() method to report changes
in the IR.  For the moment it just calls invalidate_live_intervals()
(which will eventually be fully replaced by this function) if anything
changed.

This makes all optimization passes invalidate DEPENDENCY_EVERYTHING,
which is clearly far from ideal -- The dependency classes passed to
invalidate_analysis() will be refined in a future commit.
---
 .../drivers/dri/i965/brw_dead_control_flow.cpp |  4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 49 ++
 src/mesa/drivers/dri/i965/brw_fs.h |  1 +
 .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   |  4 +-
 .../drivers/dri/i965/brw_fs_combine_constants.cpp  |  2 +-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  4 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  2 +-
 .../dri/i965/brw_fs_dead_code_eliminate.cpp|  4 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  2 +-
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |  4 +-
 src/mesa/drivers/dri/i965/brw_fs_sel_peephole.cpp  |  2 +-
 src/mesa/drivers/dri/i965/brw_predicated_break.cpp |  2 +-
 .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  7 
 src/mesa/drivers/dri/i965/brw_shader.h |  2 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +++---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 .../drivers/dri/i965/brw_vec4_cmod_propagation.cpp |  2 +-
 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  2 +-
 .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  2 +-
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |  2 +-
 22 files changed, 84 insertions(+), 41 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp 
b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
index 2c1abaf..65e009c 100644
--- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
+++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
@@ -29,6 +29,8 @@
 #include "brw_shader.h"
 #include "brw_cfg.h"
 
+using namespace brw;
+
 /* Look for and eliminate dead control flow:
  *
  *   - if/endif
@@ -109,7 +111,7 @@ dead_control_flow_eliminate(backend_shader *s)
}
 
if (progress)
-  s->invalidate_live_intervals();
+  s->invalidate_analysis(DEPENDENCY_EVERYTHING);
 
return progress;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 86d2bd9..60924b6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1843,7 +1843,7 @@ fs_visitor::split_virtual_grfs()
  }
   }
}
-   invalidate_live_intervals();
+   invalidate_analysis(DEPENDENCY_EVERYTHING);
 }
 
 /**
@@ -1884,7 +1884,7 @@ fs_visitor::compact_virtual_grfs()
   } else {
  remap_table[i] = new_index;
  alloc.sizes[new_index] = alloc.sizes[i];
- invalidate_live_intervals();
+ invalidate_analysis(DEPENDENCY_EVERYTHING);
  ++new_index;
   }
}
@@ -2088,7 +2088,7 @@ fs_visitor::demote_pull_constants()
  inst->src[i].reg_offset = 0;
   }
}
-   invalidate_live_intervals();
+   invalidate_analysis(DEPENDENCY_EVERYTHING);
 }
 
 bool
@@ -2314,6 +2314,10 @@ fs_visitor::opt_algebraic()
  }
   }
}
+
+   if (progress)
+  invalidate_analysis(DEPENDENCY_EVERYTHING);
+
return progress;
 }
 
@@ -2363,7 +2367,7 @@ fs_visitor::opt_zero_samples()
}
 
if (progress)
-  invalidate_live_intervals();
+  invalidate_analysis(DEPENDENCY_EVERYTHING);
 
return progress;
 }
@@ -2442,7 +2446,7 @@ fs_visitor::opt_sampler_eot()
 * the instruction that this will replace.
 */
if (tex_inst->header_size != 0) {
-  invalidate_live_intervals();
+  invalidate_analysis(DEPENDENCY_EVERYTHING);
   return true;
}
 
@@ -2475,7 +2479,7 @@ fs_visitor::opt_sampler_eot()
tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
tex_inst->src[0] = send_header;
 
-   invalidate_live_intervals();
+   invalidate_analysis(DEPENDENCY_EVERYTHING);
return true;
 }
 
@@ -2528,7 +2532,7 @@ fs_visitor::opt_register_renaming()
}
 
if (progress) {
-  invalidate_live_intervals();
+  invalidate_analysis(DEPENDENCY_EVERYTHING);
 
   for (unsigned i = 0; i < ARRAY_SIZE(delta_xy); i++) {
  if (delta_xy[i].file == VGRF && remap[delta_xy[i].nr] != -1) {
@@ -2576,7 +2580,7 @@ fs_visitor::opt_redundant_discard_jumps()
}
 
if (progress)
-  invalidate_live_intervals();
+  invalidate_analysis(DEPENDENCY_EVERYTHING);
 
return progress;
 }
@@ -2736,7 +2740,7 @@ fs_visitor::compute_to_mrf()
}
 
if (progress)
-  invalidate_live_intervals();
+  invalidate_analysis(DEPENDENCY_EVERYTHING);
 
return progr

[Mesa-dev] [PATCH 24/30] i965/vec4: Switch liveness analysis to IR analysis framework.

2016-03-13 Thread Francisco Jerez
This involves wrapping vec4_live_variables in a BRW_ANALYSIS object
and hooking it up to invalidate_analysis() so it's properly
invalidated.  Seems like a lot of churn but it's fairly
straightforward.  The vec4_visitor invalidate_ and
calculate_live_intervals() methods are no longer necessary after this
change.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  7 ++---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  6 ++--
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 10 +++---
 .../dri/i965/brw_vec4_dead_code_eliminate.cpp  | 10 +++---
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 36 ++
 .../drivers/dri/i965/brw_vec4_live_variables.h | 11 +--
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |  5 ++-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  3 +-
 8 files changed, 29 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 7c2c9cc..6695ddb 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1018,8 +1018,7 @@ vec4_visitor::opt_register_coalesce()
 {
bool progress = false;
int next_ip = 0;
-
-   calculate_live_intervals();
+   const vec4_live_variables &live = live_analysis.require();
 
foreach_block_and_inst_safe (block, vec4_instruction, inst, cfg) {
   int ip = next_ip;
@@ -1061,8 +1060,7 @@ vec4_visitor::opt_register_coalesce()
   /* Can't coalesce this GRF if someone else was going to
* read it later.
*/
-  if (live_intervals->var_range_end(
- var_from_reg(alloc, inst->src[0]), 4) > ip)
+  if (live.var_range_end(var_from_reg(alloc, inst->src[0]), 4) > ip)
 continue;
 
   /* We need to check interference with the final destination between this
@@ -1878,6 +1876,7 @@ void
 vec4_visitor::invalidate_analysis(brw::analysis_dependency_class c)
 {
backend_shader::invalidate_analysis(c);
+   live_analysis.invalidate(c);
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 35bd265..c95e741 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -101,7 +101,8 @@ public:
 
int first_non_payload_grf;
unsigned int max_grf;
-   brw::vec4_live_variables *live_intervals;
+   BRW_ANALYSIS(live_analysis, brw::vec4_live_variables,
+backend_shader *) live_analysis;
dst_reg userplane[MAX_CLIP_PLANES];
 
bool need_all_constants_in_pull_buffer;
@@ -132,7 +133,6 @@ public:
void move_push_constants_to_pull_constants();
void split_uniform_registers();
void pack_uniform_registers();
-   void calculate_live_intervals();
void invalidate_live_intervals();
virtual void invalidate_analysis(brw::analysis_dependency_class c);
void split_virtual_grfs();
@@ -141,7 +141,7 @@ public:
bool dead_code_eliminate();
bool opt_cmod_propagation();
bool opt_copy_propagation(bool do_constant_prop = true);
-   bool opt_cse_local(bblock_t *block);
+   bool opt_cse_local(bblock_t *block, const vec4_live_variables &live);
bool opt_cse();
bool opt_algebraic();
bool opt_register_coalesce();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
index 3a1dfda..04e03ab 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
@@ -134,7 +134,7 @@ instructions_match(vec4_instruction *a, vec4_instruction *b)
 }
 
 bool
-vec4_visitor::opt_cse_local(bblock_t *block)
+vec4_visitor::opt_cse_local(bblock_t *block, const vec4_live_variables &live)
 {
bool progress = false;
exec_list aeb;
@@ -246,8 +246,7 @@ vec4_visitor::opt_cse_local(bblock_t *block)
  * more -- a sure sign they'll fail operands_match().
  */
 if (src->file == VGRF) {
-   if (live_intervals->var_range_end(
-  var_from_reg(alloc, *src), 4) < ip) {
+   if (live.var_range_end(var_from_reg(alloc, *src), 4) < ip) {
   entry->remove();
   ralloc_free(entry);
   break;
@@ -268,11 +267,10 @@ bool
 vec4_visitor::opt_cse()
 {
bool progress = false;
-
-   calculate_live_intervals();
+   const vec4_live_variables &live = live_analysis.require();
 
foreach_block (block, cfg) {
-  progress = opt_cse_local(block) || progress;
+  progress = opt_cse_local(block, live) || progress;
}
 
if (progress)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
index eefb48a..113b253 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
@@ -69,17 +69,15 @@ bool
 vec4_visitor::dead_code_eliminate()
 {
bool progress = false;
-
-   calculate_live_intervals();
-
-   int num_var

[Mesa-dev] [PATCH 04/30] i965/fs: Restrict inequality that can only hold equal in saturate propagation.

2016-03-13 Thread Francisco Jerez
Should have no functional change.  The IP value of an instruction that
reads src_var cannot possibly be after the end of the live interval of
the variable it's reading from, by the definition of live interval.
Might save future readers a momentary WTF while trying to understand
this code.
---
 src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
index dc2b0c8..f59fdbd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
@@ -73,7 +73,7 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t *block)
 if (scan_inst->saturate) {
inst->saturate = false;
progress = true;
-} else if (src_end_ip <= ip || inst->dst.equals(inst->src[0])) {
+} else if (src_end_ip == ip || inst->dst.equals(inst->src[0])) {
if (scan_inst->can_do_saturate()) {
   if (scan_inst->dst.type != inst->dst.type) {
  scan_inst->dst.type = inst->dst.type;
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/30] i965/fs: Reverse inclusion dependency between brw_fs_live_variables.h and brw_fs.h.

2016-03-13 Thread Francisco Jerez
brw_fs.h (in particular fs_visitor) is logically a user of the live
variables analysis pass, not the other way around.
brw_fs_live_variables.h requires the definition of some FS IR data
structures to compile, but those can be obtained directly from
brw_ir_fs.h without including brw_fs.h.  The dependency of
fs_live_variables on fs_visitor is rather accidental and will be
removed in a future commit, a forward declaration is enough for the
moment.
---
 src/mesa/drivers/dri/i965/brw_fs.h  | 2 +-
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h   | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index d4acc87..3448275 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -30,6 +30,7 @@
 #include "brw_shader.h"
 #include "brw_ir_fs.h"
 #include "brw_fs_builder.h"
+#include "brw_fs_live_variables.h"
 #include "compiler/glsl/ir.h"
 #include "compiler/nir/nir.h"
 
@@ -39,7 +40,6 @@ namespace {
 }
 
 namespace brw {
-   class fs_live_variables;
 }
 
 struct brw_gs_compile;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index 66b70a9..4ebefe4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -25,7 +25,7 @@
  *
  */
 
-#include "brw_cfg.h"
+#include "brw_fs.h"
 #include "brw_fs_live_variables.h"
 
 using namespace brw;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index a5c1764..6b49cfc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -28,10 +28,11 @@
 #ifndef BRW_FS_LIVE_VARIABLES_H
 #define BRW_FS_LIVE_VARIABLES_H
 
-#include "brw_fs.h"
+#include "brw_ir_fs.h"
 #include "util/bitset.h"
 
 struct cfg_t;
+class fs_visitor;
 
 namespace brw {
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 27/30] i965/ir: Move dominance tree data structure into idom_tree object.

2016-03-13 Thread Francisco Jerez
It makes sense to keep the result of analysis passes independent from
the IR itself.  Instead of representing the idom tree as a pointer in
each basic block pointing to its immediate dominator, the whole
dominator tree is represented separately from the IR as an array of
pointers inside the idom_tree object.  This has the advantage that
it's no longer possible to use stale dominance results by accident
without having called require() beforehand, which makes sure that the
idom tree is recalculated if necessary.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 51 ++-
 src/mesa/drivers/dri/i965/brw_cfg.h   | 22 +--
 2 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index f586dd0..f951f44 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -54,7 +54,7 @@ link(void *mem_ctx, bblock_t *block)
 }
 
 bblock_t::bblock_t(cfg_t *cfg) :
-   cfg(cfg), idom(NULL), start_ip(0), end_ip(0), num(0)
+   cfg(cfg), start_ip(0), end_ip(0), num(0)
 {
instructions.make_empty();
parents.make_empty();
@@ -426,8 +426,9 @@ cfg_t::dump(backend_shader *s)
const idom_tree *idom = (s ? &s->idom_analysis.require() : NULL);
 
foreach_block (block, this) {
-  if (block->idom)
- fprintf(stderr, "START B%d IDOM(B%d)", block->num, block->idom->num);
+  if (idom && idom->parent(block))
+ fprintf(stderr, "START B%d IDOM(B%d)", block->num,
+ idom->parent(block)->num);
   else
  fprintf(stderr, "START B%d IDOM(none)", block->num);
 
@@ -455,14 +456,14 @@ cfg_t::dump(backend_shader *s)
  * (less than 1000 nodes) that this algorithm is significantly faster than
  * others like Lengauer-Tarjan.
  */
-idom_tree::idom_tree(const backend_shader *s)
+idom_tree::idom_tree(const backend_shader *s) :
+   num_parents(s->cfg->num_blocks),
+   parents(new bblock_t *[num_parents]())
 {
-   foreach_block(block, s->cfg) {
-  block->idom = NULL;
-   }
-   s->cfg->blocks[0]->idom = s->cfg->blocks[0];
-
bool changed;
+
+   parents[0] = s->cfg->blocks[0];
+
do {
   changed = false;
 
@@ -471,24 +472,29 @@ idom_tree::idom_tree(const backend_shader *s)
 continue;
 
  bblock_t *new_idom = NULL;
- foreach_list_typed(bblock_link, parent, link, &block->parents) {
-if (parent->block->idom) {
+ foreach_list_typed(bblock_link, parent_link, link, &block->parents) {
+if (parent(parent_link->block)) {
if (new_idom == NULL) {
-  new_idom = parent->block;
-   } else if (parent->block->idom != NULL) {
-  new_idom = intersect(parent->block, new_idom);
+  new_idom = parent_link->block;
+   } else if (parent(parent_link->block) != NULL) {
+  new_idom = intersect(parent_link->block, new_idom);
}
 }
  }
 
- if (block->idom != new_idom) {
-block->idom = new_idom;
+ if (parent(block) != new_idom) {
+parents[block->num] = new_idom;
 changed = true;
  }
   }
} while (changed);
 }
 
+idom_tree::~idom_tree()
+{
+   delete parents;
+}
+
 bblock_t *
 idom_tree::intersect(bblock_t *b1, bblock_t *b2) const
 {
@@ -498,23 +504,20 @@ idom_tree::intersect(bblock_t *b1, bblock_t *b2) const
 */
while (b1->num != b2->num) {
   while (b1->num > b2->num)
- b1 = b1->idom;
+ b1 = parent(b1);
   while (b2->num > b1->num)
- b2 = b2->idom;
+ b2 = parent(b2);
}
assert(b1);
return b1;
 }
 
 void
-idom_tree::dump(const backend_shader *s) const
+idom_tree::dump() const
 {
printf("digraph DominanceTree {\n");
-   foreach_block(block, s->cfg) {
-  if (block->idom) {
- printf("\t%d -> %d\n", block->idom->num, block->num);
-  }
-   }
+   for (unsigned i = 0; i < num_parents; i++)
+  printf("\t%d -> %d\n", parents[i]->num, i);
printf("}\n");
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
b/src/mesa/drivers/dri/i965/brw_cfg.h
index 1c9207c..fe6aa44 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -84,7 +84,6 @@ struct bblock_t {
 
struct exec_node link;
struct cfg_t *cfg;
-   struct bblock_t *idom;
 
int start_ip;
int end_ip;
@@ -349,6 +348,7 @@ namespace brw {
 */
struct idom_tree {
   idom_tree(const backend_shader *s);
+  ~idom_tree();
 
   bool
   validate(const backend_shader *) const
@@ -363,11 +363,29 @@ namespace brw {
  return DEPENDENCY_BLOCKS;
   }
 
+  const bblock_t *
+  parent(const bblock_t *b) const
+  {
+ assert(unsigned(b->num) < num_parents);
+ return parents[b->num];
+  }
+
+  bblock_t *
+  parent(bblock_t *b) const
+  {
+ assert(unsigned

[Mesa-dev] [PATCH 25/30] i965/ir: Drop invalidate_live_intervals().

2016-03-13 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.h| 1 -
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp   | 8 
 src/mesa/drivers/dri/i965/brw_shader.cpp  | 2 --
 src/mesa/drivers/dri/i965/brw_shader.h| 1 -
 src/mesa/drivers/dri/i965/brw_vec4.h  | 1 -
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp | 8 
 6 files changed, 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index ac5e8cf..7a1916d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -142,7 +142,6 @@ public:
bool compact_virtual_grfs();
void assign_constant_locations();
void demote_pull_constants();
-   void invalidate_live_intervals();
virtual void invalidate_analysis(brw::analysis_dependency_class c);
void calculate_register_pressure();
void validate();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index 0d5bef2..5c1e00b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -345,14 +345,6 @@ fs_live_variables::validate(const backend_shader *s) const
return true;
 }
 
-void
-fs_visitor::invalidate_live_intervals()
-{
-   /* XXX -- Leave this around for the moment to keep the fs_vistor object
-* concrete.
-*/
-}
-
 bool
 fs_live_variables::vars_interfere(int a, int b) const
 {
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 5b05725..4d701d9 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -1049,8 +1049,6 @@ backend_shader::calculate_cfg()
 void
 backend_shader::invalidate_analysis(brw::analysis_dependency_class c)
 {
-   if (c)
-  invalidate_live_intervals();
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index 47e6f02..a244f48 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -83,7 +83,6 @@ public:
 
void calculate_cfg();
 
-   virtual void invalidate_live_intervals() = 0;
virtual void invalidate_analysis(brw::analysis_dependency_class c);
 };
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index c95e741..a21eb48 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -133,7 +133,6 @@ public:
void move_push_constants_to_pull_constants();
void split_uniform_registers();
void pack_uniform_registers();
-   void invalidate_live_intervals();
virtual void invalidate_analysis(brw::analysis_dependency_class c);
void split_virtual_grfs();
bool opt_vector_float();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 665cf58..03830ab 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -256,14 +256,6 @@ vec4_live_variables::~vec4_live_variables()
ralloc_free(mem_ctx);
 }
 
-void
-vec4_visitor::invalidate_live_intervals()
-{
-   /* XXX -- Leave this around for the moment to keep the vec4_vistor object
-* concrete.
-*/
-}
-
 static bool
 check_register_live_range(const vec4_live_variables *live, int ip,
   unsigned var, unsigned n)
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/30] i965/ir: Define more detailed analysis dependency classes.

2016-03-13 Thread Francisco Jerez
I've deliberately separated this from the general analysis pass
infrastructure in order to discuss it independently.  The dependency
classes defined here refer to state changes of several objects of the
program IR, and are fully orthogonal and expected to change less often
than the set of analysis passes present in the compiler back-end.

The objective is to avoid unnecessary coupling between optimization
and analysis passes in the back-end.  By doing things in this way the
set of flags to be passed to invalidate_analysis() can be determined
from knowledge of a single optimization pass and a small set of well
specified dependency classes alone -- IOW there is no need to audit
all analysis passes to find out which ones might be affected by
certain kind of program transformation performed by an optimization
pass, as well as the converse, there is no need to audit all
optimization passes when writing a new analysis pass to find out which
ones can potentially invalidate the result of the analysis.

The set of dependency classes defined here is rather conservative and
mainly based on the requirements of the few analysis passes already
part of the back-end.  I've also used them without difficulty with a
few additional analysis passes I've written but haven't yet sent for
review.
---
 src/mesa/drivers/dri/i965/brw_ir_analysis.h | 46 +
 1 file changed, 46 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_analysis.h 
b/src/mesa/drivers/dri/i965/brw_ir_analysis.h
index 31989ec..f008ac7 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_analysis.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_analysis.h
@@ -37,6 +37,52 @@ namespace brw {
*/
   DEPENDENCY_NOTHING = 0,
   /**
+   * The analysis depends on the set of instructions in the program and
+   * their naming.  Note that because instructions are named sequentially
+   * by IP this implies a dependency on the control flow edges between
+   * instructions.  This will be signaled whenever instructions are
+   * inserted, removed or reordered in the program.
+   */
+  DEPENDENCY_INSTRUCTION_IDENTITY = 0x1,
+  /**
+   * The analysis is sensitive to the detailed semantics of instructions
+   * in the program, where "detailed" means any change in the instruction
+   * data structures other than the linked-list pointers (which are
+   * already covered by DEPENDENCY_INSTRUCTION_IDENTITY).  E.g. changing
+   * the negate or abs flags of an instruction source would signal this
+   * flag alone because it would preserve all other instruction dependency
+   * classes.
+   */
+  DEPENDENCY_INSTRUCTION_DETAIL = 0x2,
+  /**
+   * The analysis depends on the set of data flow edges between
+   * instructions.  This will be signaled whenever the dataflow relation
+   * between instructions has potentially changed, e.g. when the VGRF
+   * index of an instruction source or destination changes (in which case
+   * it will appear in combination with DEPENDENCY_INSTRUCTION_DETAIL), or
+   * when data-dependent instructions are reordered (in which case it will
+   * appear in combination with DEPENDENCY_INSTRUCTION_IDENTITY).
+   */
+  DEPENDENCY_INSTRUCTION_DATA_FLOW = 0x4,
+  /**
+   * The analysis depends on all instruction dependency classes.  These
+   * will typically be signaled simultaneously when inserting or removing
+   * instructions in the program (or if you're feeling too lazy to read
+   * through your optimization pass to figure out which of the instruction
+   * dependency classes above it invalidates).
+   */
+  DEPENDENCY_INSTRUCTIONS = 0x7,
+  /**
+   * The analysis depends on the set of VGRFs in the program and their
+   * naming.  This will be signaled when VGRFs are allocated or released.
+   */
+  DEPENDENCY_VARIABLES = 0x8,
+  /**
+   * The analysis depends on the set of basic blocks in the program, their
+   * control flow edges and naming.
+   */
+  DEPENDENCY_BLOCKS = 0x10,
+  /**
* The analysis depends on the program being literally the same (good
* luck...), any change in the input invalidates previous analysis
* computations.
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/30] i965/vec4: Consider removal of no-op MOVs as progress during register coalesce.

2016-03-13 Thread Francisco Jerez
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  The no-op MOV check in
opt_register_coalesce() was removing instructions which makes the
cached liveness analysis calculation inconsistent with the shader IR.
We were failing to set progress to true in that case though, which
means that invalidate_live_intervals() wouldn't necessarily be called
at the end of the function.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index cf62ed9..baf72a2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1051,6 +1051,7 @@ vec4_visitor::opt_register_coalesce()
 
  if (is_nop_mov) {
 inst->remove(block);
+progress = true;
 continue;
  }
   }
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 26/30] i965/ir: Move idom tree calculation and related logic into analysis object.

2016-03-13 Thread Francisco Jerez
This only does half of the work.  The actual representation of the
idom tree is left untouched, but the computation algorithm is moved
into a separate analysis result class wrapped in a BRW_ANALYSIS
object, along with the intersect() and dump_domtree() auxiliary
functions in order to keep things tidy.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp  | 44 ++
 src/mesa/drivers/dri/i965/brw_cfg.h| 38 ---
 .../drivers/dri/i965/brw_fs_combine_constants.cpp  |  4 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  3 +-
 src/mesa/drivers/dri/i965/brw_shader.h |  2 +
 5 files changed, 59 insertions(+), 32 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 4fcee72..f586dd0 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -34,6 +34,8 @@
  * blocks with successor/predecessor edges connecting them.
  */
 
+using namespace brw;
+
 static bblock_t *
 pop_stack(exec_list *list)
 {
@@ -158,7 +160,6 @@ cfg_t::cfg_t(exec_list *instructions)
block_list.make_empty();
blocks = NULL;
num_blocks = 0;
-   idom_dirty = true;
 
bblock_t *cur = NULL;
int ip = 0;
@@ -384,7 +385,6 @@ cfg_t::remove_block(bblock_t *block)
 
this->blocks[this->num_blocks - 1]->num = this->num_blocks - 2;
this->num_blocks--;
-   idom_dirty = true;
 }
 
 bblock_t *
@@ -423,8 +423,7 @@ cfg_t::make_block_array()
 void
 cfg_t::dump(backend_shader *s)
 {
-   if (idom_dirty)
-  calculate_idom();
+   const idom_tree *idom = (s ? &s->idom_analysis.require() : NULL);
 
foreach_block (block, this) {
   if (block->idom)
@@ -456,19 +455,18 @@ cfg_t::dump(backend_shader *s)
  * (less than 1000 nodes) that this algorithm is significantly faster than
  * others like Lengauer-Tarjan.
  */
-void
-cfg_t::calculate_idom()
+idom_tree::idom_tree(const backend_shader *s)
 {
-   foreach_block(block, this) {
+   foreach_block(block, s->cfg) {
   block->idom = NULL;
}
-   blocks[0]->idom = blocks[0];
+   s->cfg->blocks[0]->idom = s->cfg->blocks[0];
 
bool changed;
do {
   changed = false;
 
-  foreach_block(block, this) {
+  foreach_block(block, s->cfg) {
  if (block->num == 0)
 continue;
 
@@ -489,12 +487,10 @@ cfg_t::calculate_idom()
  }
   }
} while (changed);
-
-   idom_dirty = false;
 }
 
 bblock_t *
-cfg_t::intersect(bblock_t *b1, bblock_t *b2)
+idom_tree::intersect(bblock_t *b1, bblock_t *b2) const
 {
/* Note, the comparisons here are the opposite of what the paper says
 * because we index blocks from beginning -> end (i.e. reverse post-order)
@@ -511,26 +507,26 @@ cfg_t::intersect(bblock_t *b1, bblock_t *b2)
 }
 
 void
-cfg_t::dump_cfg()
+idom_tree::dump(const backend_shader *s) const
 {
-   printf("digraph CFG {\n");
-   for (int b = 0; b < num_blocks; b++) {
-  bblock_t *block = this->blocks[b];
-
-  foreach_list_typed_safe (bblock_link, child, link, &block->children) {
- printf("\t%d -> %d\n", b, child->block->num);
+   printf("digraph DominanceTree {\n");
+   foreach_block(block, s->cfg) {
+  if (block->idom) {
+ printf("\t%d -> %d\n", block->idom->num, block->num);
   }
}
printf("}\n");
 }
 
 void
-cfg_t::dump_domtree()
+cfg_t::dump_cfg()
 {
-   printf("digraph DominanceTree {\n");
-   foreach_block(block, this) {
-  if (block->idom) {
- printf("\t%d -> %d\n", block->idom->num, block->num);
+   printf("digraph CFG {\n");
+   for (int b = 0; b < num_blocks; b++) {
+  bblock_t *block = this->blocks[b];
+
+  foreach_list_typed_safe (bblock_link, child, link, &block->children) {
+ printf("\t%d -> %d\n", b, child->block->num);
   }
}
printf("}\n");
diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
b/src/mesa/drivers/dri/i965/brw_cfg.h
index 0813798..1c9207c 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -29,6 +29,9 @@
 #define BRW_CFG_H
 
 #include "brw_ir.h"
+#ifdef __cplusplus
+#include "brw_ir_analysis.h"
+#endif
 
 struct bblock_t;
 
@@ -272,12 +275,9 @@ struct cfg_t {
bblock_t *new_block();
void set_next_block(bblock_t **cur, bblock_t *block, int ip);
void make_block_array();
-   void calculate_idom();
-   static bblock_t *intersect(bblock_t *b1, bblock_t *b2);
 
void dump(backend_shader *s);
void dump_cfg();
-   void dump_domtree();
 #endif
void *mem_ctx;
 
@@ -286,8 +286,6 @@ struct cfg_t {
struct bblock_t **blocks;
int num_blocks;
 
-   bool idom_dirty;
-
unsigned cycle_count;
 };
 
@@ -344,4 +342,34 @@ struct cfg_t {
 !__scan_inst->is_head_sentinel();  \
 __scan_inst = (__type *)__scan_inst->prev)
 
+#ifdef __cplusplus
+namespace brw {
+   /**
+* Immediate dominator tree analysis of a shader.
+*/
+   struct idom_tree {
+  idom_tree(const backend_shader *s);
+
+  

[Mesa-dev] [PATCH 30/30] i965/fs: Calculate num_instructions in O(1) during register pressure calculation.

2016-03-13 Thread Francisco Jerez
And mark the variable declaration as const.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6de47fa..e8fad87 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5064,9 +5064,8 @@ fs_visitor::setup_cs_payload()
 brw::register_pressure::register_pressure(const fs_visitor *v)
 {
const fs_live_variables &live = v->live_analysis.require();
-   unsigned num_instructions = 0;
-   foreach_block(block, v->cfg)
-  num_instructions += block->instructions.length();
+   const unsigned num_instructions = v->cfg->num_blocks ?
+  v->cfg->blocks[v->cfg->num_blocks - 1]->end_ip + 1 : 0;
 
regs_live_at_ip = new unsigned[num_instructions]();
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 29/30] i965/fs: Move register pressure calculation into IR analysis object.

2016-03-13 Thread Francisco Jerez
This defines a new BRW_ANALYSIS object which wraps the register
pressure computation code along with its result.  For the rationale
see the previous commits converting the liveness and dominance
analysis passes to the IR analysis framework.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +--
 src/mesa/drivers/dri/i965/brw_fs.h   | 32 +---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  6 ++
 3 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 030e66f..6de47fa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4636,11 +4636,11 @@ fs_visitor::dump_instructions(const char *name)
}
 
if (cfg) {
-  calculate_register_pressure();
-  int ip = 0, max_pressure = 0;
+  const register_pressure &rp = regpressure_analysis.require();
+  unsigned ip = 0, max_pressure = 0;
   foreach_block_and_inst(block, backend_instruction, inst, cfg) {
- max_pressure = MAX2(max_pressure, regs_live_at_ip[ip]);
- fprintf(file, "{%3d} %4d: ", regs_live_at_ip[ip], ip);
+ max_pressure = MAX2(max_pressure, rp.regs_live_at_ip[ip]);
+ fprintf(file, "{%3d} %4d: ", rp.regs_live_at_ip[ip], ip);
  dump_instruction(inst, file);
  ip++;
   }
@@ -5061,27 +5061,32 @@ fs_visitor::setup_cs_payload()
}
 }
 
-void
-fs_visitor::calculate_register_pressure()
+brw::register_pressure::register_pressure(const fs_visitor *v)
 {
-   const fs_live_variables &live = live_analysis.require();
+   const fs_live_variables &live = v->live_analysis.require();
unsigned num_instructions = 0;
-   foreach_block(block, cfg)
+   foreach_block(block, v->cfg)
   num_instructions += block->instructions.length();
 
-   regs_live_at_ip = rzalloc_array(mem_ctx, int, num_instructions);
+   regs_live_at_ip = new unsigned[num_instructions]();
 
-   for (unsigned reg = 0; reg < alloc.count; reg++) {
+   for (unsigned reg = 0; reg < v->alloc.count; reg++) {
   for (int ip = live.vgrf_start[reg]; ip <= live.vgrf_end[reg]; ip++)
- regs_live_at_ip[ip] += alloc.sizes[reg];
+ regs_live_at_ip[ip] += v->alloc.sizes[reg];
}
 }
 
+brw::register_pressure::~register_pressure()
+{
+   delete regs_live_at_ip;
+}
+
 void
 fs_visitor::invalidate_analysis(brw::analysis_dependency_class c)
 {
backend_shader::invalidate_analysis(c);
live_analysis.invalidate(c);
+   regpressure_analysis.invalidate(c);
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 7a1916d..b913d3f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -39,7 +39,34 @@ namespace {
struct acp_entry;
 }
 
+class fs_visitor;
+
 namespace brw {
+   /**
+* Register pressure analysis of a shader.  Estimates how many registers
+* are live at any point of the program in GRF units.
+*/
+   struct register_pressure {
+  register_pressure(const fs_visitor *v);
+  ~register_pressure();
+
+  analysis_dependency_class
+  dependency_class() const
+  {
+ return (DEPENDENCY_INSTRUCTION_IDENTITY |
+ DEPENDENCY_INSTRUCTION_DATA_FLOW |
+ DEPENDENCY_VARIABLES);
+  }
+
+  bool
+  validate(const fs_visitor *) const
+  {
+ /* FINISHME */
+ return true;
+  }
+
+  unsigned *regs_live_at_ip;
+   };
 }
 
 struct brw_gs_compile;
@@ -143,7 +170,6 @@ public:
void assign_constant_locations();
void demote_pull_constants();
virtual void invalidate_analysis(brw::analysis_dependency_class c);
-   void calculate_register_pressure();
void validate();
bool opt_algebraic();
bool opt_redundant_discard_jumps();
@@ -330,8 +356,8 @@ public:
 
BRW_ANALYSIS(live_analysis, brw::fs_live_variables,
 backend_shader *) live_analysis;
-
-   int *regs_live_at_ip;
+   BRW_ANALYSIS(regpressure_analysis, brw::register_pressure,
+fs_visitor *) regpressure_analysis;
 
/** Number of uniform variable components visited. */
unsigned uniforms;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index ede39df..0388974 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -973,7 +973,7 @@ fs_visitor::fs_visitor(const struct brw_compiler *compiler, 
void *log_data,
: backend_shader(compiler, log_data, mem_ctx, shader, prog_data),
  key(key), gs_compile(NULL), prog_data(prog_data), prog(prog),
  input_vue_map(input_vue_map),
- live_analysis(this),
+ live_analysis(this), regpressure_analysis(this),
  dispatch_width(dispatch_width),
  shader_time_index(shader_time_index),
  bld(fs_builder(this, dispatch_width).at_end())
@@ -991,7 +991,7 @@ fs_visitor::fs_visitor(const st

[Mesa-dev] [PATCH 16/30] i965/fs: Move all live interval analysis results into fs_live_variables.

2016-03-13 Thread Francisco Jerez
This moves the following methods that are currently defined in
fs_visitor (even though they are side products of the liveness
analysis computation) and are already implemented in
brw_fs_live_variables.cpp:

> bool virtual_grf_interferes(int a, int b) const;
> int *virtual_grf_start;
> int *virtual_grf_end;

It makes sense for them to be part of the fs_live_variables object,
because they have the same lifetime as other liveness analysis results
and because this will allow some extra validation to happen wherever
they are accessed in order to make sure that we only ever use
up-to-date liveness analysis results.

This shortens the virtual_grf prefix in order to compensate for the
slightly increased lexical overhead from the live_intervals pointer
dereference.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  5 +--
 src/mesa/drivers/dri/i965/brw_fs.h |  3 --
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  3 +-
 .../drivers/dri/i965/brw_fs_live_variables.cpp | 40 +-
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  8 +
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  4 +--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  2 --
 .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 +--
 8 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 9183d1f..afa1db9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2627,7 +2627,7 @@ fs_visitor::compute_to_mrf()
   /* Can't compute-to-MRF this GRF if someone else was going to
* read it later.
*/
-  if (this->virtual_grf_end[inst->src[0].nr] > ip)
+  if (live_intervals->vgrf_end[inst->src[0].nr] > ip)
 continue;
 
   /* Found a move of a GRF to a MRF.  Let's see if we can go
@@ -5074,7 +5074,8 @@ fs_visitor::calculate_register_pressure()
regs_live_at_ip = rzalloc_array(mem_ctx, int, num_instructions);
 
for (unsigned reg = 0; reg < alloc.count; reg++) {
-  for (int ip = virtual_grf_start[reg]; ip <= virtual_grf_end[reg]; ip++)
+  for (int ip = live_intervals->vgrf_start[reg];
+   ip <= live_intervals->vgrf_end[reg]; ip++)
  regs_live_at_ip[ip] += alloc.sizes[reg];
}
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 4f5d085..82d5d63 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -162,7 +162,6 @@ public:
bool remove_duplicate_mrf_writes();
 
bool opt_sampler_eot();
-   bool virtual_grf_interferes(int a, int b) const;
void schedule_instructions(instruction_scheduler_mode mode);
void insert_gen4_send_dependency_workarounds();
void insert_gen4_pre_send_dependency_workarounds(bblock_t *block,
@@ -329,8 +328,6 @@ public:
 
int *param_size;
 
-   int *virtual_grf_start;
-   int *virtual_grf_end;
brw::fs_live_variables *live_intervals;
 
int *regs_live_at_ip;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index a484caf..d112378 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -328,7 +328,8 @@ fs_visitor::opt_cse_local(bblock_t *block)
 /* Kill any AEB entries using registers that don't get reused any
  * more -- a sure sign they'll fail operands_match().
  */
-if (src_reg->file == VGRF && virtual_grf_end[src_reg->nr] < ip) {
+if (src_reg->file == VGRF &&
+live_intervals->vgrf_end[src_reg->nr] < ip) {
entry->remove();
ralloc_free(entry);
break;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index 676619a..6da026e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -265,6 +265,13 @@ fs_live_variables::fs_live_variables(fs_visitor *v, const 
cfg_t *cfg)
   end[i] = -1;
}
 
+   vgrf_start = ralloc_array(mem_ctx, int, num_vgrfs);
+   vgrf_end = ralloc_array(mem_ctx, int, num_vgrfs);
+   for (int i = 0; i < num_vgrfs; i++) {
+  vgrf_start[i] = MAX_INSTRUCTION;
+  vgrf_end[i] = -1;
+   }
+
block_data= rzalloc_array(mem_ctx, struct block_data, cfg->num_blocks);
 
bitset_words = BITSET_WORDS(num_vars);
@@ -283,6 +290,13 @@ fs_live_variables::fs_live_variables(fs_visitor *v, const 
cfg_t *cfg)
setup_def_use();
compute_live_variables();
compute_start_end();
+
+   /* Merge the per-component live ranges to whole VGRF live ranges. */
+   for (int i = 0; i < num_vars; i++) {
+  const unsigned vgrf = vgrf_from_var[i];
+  vgrf_start[vgrf] = MIN2(vgrf_start[vgrf], start[i]);
+  vgrf_end[vgrf] = MAX2(vgrf_end[vgrf], end[i]);
+   }
 }
 
 fs_live_variables::~fs_live_variables()

[Mesa-dev] [PATCH 23/30] i965/fs: Switch liveness analysis to IR analysis framework.

2016-03-13 Thread Francisco Jerez
This involves wrapping fs_live_variables in a BRW_ANALYSIS object and
hooking it up to invalidate_analysis() so it's properly invalidated.
Seems like a lot of churn but it's fairly straightforward.  The
fs_visitor invalidate_ and calculate_live_intervals() methods are no
longer necessary after this change.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 12 
 src/mesa/drivers/dri/i965/brw_fs.h | 10 ---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   | 10 +++
 .../dri/i965/brw_fs_dead_code_eliminate.cpp| 18 ++--
 .../drivers/dri/i965/brw_fs_live_variables.cpp | 20 ++---
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  | 11 ++--
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 10 +++
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  | 33 +-
 .../dri/i965/brw_fs_saturate_propagation.cpp   | 13 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  3 +-
 .../drivers/dri/i965/brw_schedule_instructions.cpp | 19 ++---
 11 files changed, 72 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index afa1db9..030e66f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2596,7 +2596,7 @@ fs_visitor::compute_to_mrf()
if (devinfo->gen >= 7)
   return false;
 
-   calculate_live_intervals();
+   const fs_live_variables &live = live_analysis.require();
 
foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
   int ip = next_ip;
@@ -2627,7 +2627,7 @@ fs_visitor::compute_to_mrf()
   /* Can't compute-to-MRF this GRF if someone else was going to
* read it later.
*/
-  if (live_intervals->vgrf_end[inst->src[0].nr] > ip)
+  if (live.vgrf_end[inst->src[0].nr] > ip)
 continue;
 
   /* Found a move of a GRF to a MRF.  Let's see if we can go
@@ -5064,9 +5064,7 @@ fs_visitor::setup_cs_payload()
 void
 fs_visitor::calculate_register_pressure()
 {
-   invalidate_live_intervals();
-   calculate_live_intervals();
-
+   const fs_live_variables &live = live_analysis.require();
unsigned num_instructions = 0;
foreach_block(block, cfg)
   num_instructions += block->instructions.length();
@@ -5074,8 +5072,7 @@ fs_visitor::calculate_register_pressure()
regs_live_at_ip = rzalloc_array(mem_ctx, int, num_instructions);
 
for (unsigned reg = 0; reg < alloc.count; reg++) {
-  for (int ip = live_intervals->vgrf_start[reg];
-   ip <= live_intervals->vgrf_end[reg]; ip++)
+  for (int ip = live.vgrf_start[reg]; ip <= live.vgrf_end[reg]; ip++)
  regs_live_at_ip[ip] += alloc.sizes[reg];
}
 }
@@ -5084,6 +5081,7 @@ void
 fs_visitor::invalidate_analysis(brw::analysis_dependency_class c)
 {
backend_shader::invalidate_analysis(c);
+   live_analysis.invalidate(c);
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 82d5d63..ac5e8cf 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -132,7 +132,9 @@ public:
void assign_regs_trivial();
void calculate_payload_ranges(int payload_node_count,
  int *payload_last_use_ip);
-   void setup_payload_interference(struct ra_graph *g, int payload_reg_count,
+   void setup_payload_interference(struct ra_graph *g,
+   const brw::fs_live_variables &live,
+   int payload_reg_count,
int first_payload_node);
int choose_spill_reg(struct ra_graph *g);
void spill_reg(int spill_reg);
@@ -142,13 +144,12 @@ public:
void demote_pull_constants();
void invalidate_live_intervals();
virtual void invalidate_analysis(brw::analysis_dependency_class c);
-   void calculate_live_intervals();
void calculate_register_pressure();
void validate();
bool opt_algebraic();
bool opt_redundant_discard_jumps();
bool opt_cse();
-   bool opt_cse_local(bblock_t *block);
+   bool opt_cse_local(const brw::fs_live_variables &live, bblock_t *block);
bool opt_copy_propagate();
bool try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry);
bool try_constant_propagate(fs_inst *inst, acp_entry *entry);
@@ -328,7 +329,8 @@ public:
 
int *param_size;
 
-   brw::fs_live_variables *live_intervals;
+   BRW_ANALYSIS(live_analysis, brw::fs_live_variables,
+backend_shader *) live_analysis;
 
int *regs_live_at_ip;
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index d112378..f2c257e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -222,7 +222,7 @@ create_copy_instr(const fs_builder &bld, fs_inst *inst, 
fs_reg src, bool negate)
 }
 
 bool
-fs_visitor::opt_cse_local(bblock_t *block)
+fs_visitor::opt_cse_local(const f

[Mesa-dev] [PATCH 28/30] i965/ir: Simplify new_idom reduction in dominance tree calculation.

2016-03-13 Thread Francisco Jerez
Trivial, just use a few less tokens to do the same thing.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index f951f44..8701d16 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -474,11 +474,8 @@ idom_tree::idom_tree(const backend_shader *s) :
  bblock_t *new_idom = NULL;
  foreach_list_typed(bblock_link, parent_link, link, &block->parents) {
 if (parent(parent_link->block)) {
-   if (new_idom == NULL) {
-  new_idom = parent_link->block;
-   } else if (parent(parent_link->block) != NULL) {
-  new_idom = intersect(parent_link->block, new_idom);
-   }
+   new_idom = (new_idom ? intersect(new_idom, parent_link->block) :
+   parent_link->block);
 }
  }
 
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 21/30] i965/fs: Add live interval validation pass.

2016-03-13 Thread Francisco Jerez
This could be improved somewhat with additional validation of the
calculated live in/out sets and by checking that the calculated live
intervals are minimal (which isn't strictly necessary to guarantee the
correctness of the program).  This should be good enough though to
catch accidental use of stale liveness results due to missing or
incorrect analysis invalidation.
---
 .../drivers/dri/i965/brw_fs_live_variables.cpp | 41 ++
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  3 ++
 2 files changed, 44 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index 4b0943f..215349a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -304,6 +304,47 @@ fs_live_variables::~fs_live_variables()
ralloc_free(mem_ctx);
 }
 
+static bool
+check_register_live_range(const fs_live_variables *live, int ip,
+  const fs_reg ®, unsigned n)
+{
+   const unsigned var = live->var_from_reg(reg);
+
+   if (var + n > unsigned(live->num_vars) ||
+   live->vgrf_start[reg.nr] > ip || live->vgrf_end[reg.nr] < ip)
+  return false;
+
+   for (unsigned j = 0; j < n; j++) {
+  if (live->start[var + j] > ip || live->end[var + j] < ip)
+ return false;
+   }
+
+   return true;
+}
+
+bool
+fs_live_variables::validate(const backend_shader *s) const
+{
+   int ip = 0;
+
+   foreach_block_and_inst(block, fs_inst, inst, s->cfg) {
+  for (unsigned i = 0; i < inst->sources; i++) {
+ if (inst->src[i].file == VGRF &&
+ !check_register_live_range(this, ip,
+inst->src[i], inst->regs_read(i)))
+return false;
+  }
+
+  if (inst->dst.file == VGRF &&
+ !check_register_live_range(this, ip, inst->dst, inst->regs_written))
+ return false;
+
+  ip++;
+   }
+
+   return true;
+}
+
 void
 fs_visitor::invalidate_live_intervals()
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index e1cd12c..c2a3c63 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -69,6 +69,9 @@ public:
fs_live_variables(const backend_shader *s);
~fs_live_variables();
 
+   bool
+   validate(const backend_shader *s) const;
+
bool vars_interfere(int a, int b) const;
bool vgrfs_interfere(int a, int b) const;
int var_from_reg(const fs_reg ®) const
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/30] i965: IR analysis pass framework.

2016-03-13 Thread Matt Turner
1-4 are

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 21/30] i965/fs: Add live interval validation pass.

2016-03-13 Thread Matt Turner
On Sun, Mar 13, 2016 at 8:47 PM, Francisco Jerez  wrote:
> This could be improved somewhat with additional validation of the
> calculated live in/out sets and by checking that the calculated live
> intervals are minimal (which isn't strictly necessary to guarantee the
> correctness of the program).  This should be good enough though to
> catch accidental use of stale liveness results due to missing or
> incorrect analysis invalidation.
> ---
>  .../drivers/dri/i965/brw_fs_live_variables.cpp | 41 
> ++
>  src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  3 ++
>  2 files changed, 44 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> index 4b0943f..215349a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> @@ -304,6 +304,47 @@ fs_live_variables::~fs_live_variables()
> ralloc_free(mem_ctx);
>  }
>
> +static bool
> +check_register_live_range(const fs_live_variables *live, int ip,
> +  const fs_reg ®, unsigned n)
> +{
> +   const unsigned var = live->var_from_reg(reg);
> +
> +   if (var + n > unsigned(live->num_vars) ||
> +   live->vgrf_start[reg.nr] > ip || live->vgrf_end[reg.nr] < ip)
> +  return false;
> +
> +   for (unsigned j = 0; j < n; j++) {
> +  if (live->start[var + j] > ip || live->end[var + j] < ip)
> + return false;
> +   }
> +
> +   return true;
> +}
> +
> +bool
> +fs_live_variables::validate(const backend_shader *s) const
> +{
> +   int ip = 0;
> +
> +   foreach_block_and_inst(block, fs_inst, inst, s->cfg) {
> +  for (unsigned i = 0; i < inst->sources; i++) {
> + if (inst->src[i].file == VGRF &&
> + !check_register_live_range(this, ip,
> +inst->src[i], inst->regs_read(i)))
> +return false;
> +  }
> +
> +  if (inst->dst.file == VGRF &&
> + !check_register_live_range(this, ip, inst->dst, inst->regs_written))

Looks like the indentation is slightly off on this line.

> + return false;
> +
> +  ip++;
> +   }
> +
> +   return true;
> +}
> +
>  void
>  fs_visitor::invalidate_live_intervals()
>  {
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
> b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
> index e1cd12c..c2a3c63 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
> @@ -69,6 +69,9 @@ public:
> fs_live_variables(const backend_shader *s);
> ~fs_live_variables();
>
> +   bool
> +   validate(const backend_shader *s) const;

Return type on its own line -- intentional? Others, seen below, are on
the same line. I'd have put it on the same line.

> +
> bool vars_interfere(int a, int b) const;
> bool vgrfs_interfere(int a, int b) const;
> int var_from_reg(const fs_reg ®) const
> --
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 21/30] i965/fs: Add live interval validation pass.

2016-03-13 Thread Matt Turner
On Sun, Mar 13, 2016 at 8:47 PM, Francisco Jerez  wrote:
> This could be improved somewhat with additional validation of the
> calculated live in/out sets and by checking that the calculated live
> intervals are minimal (which isn't strictly necessary to guarantee the
> correctness of the program).  This should be good enough though to
> catch accidental use of stale liveness results due to missing or
> incorrect analysis invalidation.
> ---
>  .../drivers/dri/i965/brw_fs_live_variables.cpp | 41 
> ++
>  src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  3 ++
>  2 files changed, 44 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> index 4b0943f..215349a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> @@ -304,6 +304,47 @@ fs_live_variables::~fs_live_variables()
> ralloc_free(mem_ctx);
>  }
>
> +static bool
> +check_register_live_range(const fs_live_variables *live, int ip,
> +  const fs_reg ®, unsigned n)
> +{
> +   const unsigned var = live->var_from_reg(reg);
> +
> +   if (var + n > unsigned(live->num_vars) ||
> +   live->vgrf_start[reg.nr] > ip || live->vgrf_end[reg.nr] < ip)
> +  return false;
> +
> +   for (unsigned j = 0; j < n; j++) {
> +  if (live->start[var + j] > ip || live->end[var + j] < ip)
> + return false;

Braces in nested control flow. Elsewhere in this patch as well, and
also in the next patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Reminder: 2016 X.Org Board of Directors Elections Nomination period is NOW

2016-03-13 Thread Peter Hutterer

We are seeking nominations for candidates for election to the X.Org
Foundation Board of Directors.  All X.Org Foundation members are
eligible for election to the board.

Nominations for the 2016 election are now open and will remain open
until 23:59 UTC on 15 March 2016.

The Board consists of directors elected from the membership.  Each
year, an election is held to bring the total number of directors to
eight. The four members receiving the highest vote totals will serve
as directors for two year terms.

The directors who received two year terms starting in 2015 were
Peter Hutterer, Martin Peres, Rob Clark and Daniel Vetter, They
will continue to serve until their term ends in 2017.  Current
directors whose term expires in 2016 are Alex Deucher, Matt Dew,
Egbert Eich and Keith Packard.

A director is expected to participate in the fortnightly IRC meeting to
discuss current business and to attend the annual meeting of the X.Org
Foundation, which will be held at a location determined in advance by
the Board of Directors.

A member may nominate themselves or any other member they feel is
qualified. Nominations should be sent to the Election Committee at
electi...@x.org.

Nominees shall be required to be current members of the X.Org
Foundation, and submit a  personal statement of up to 200 words that
will be provided to prospective voters.  The collected statements,
along with the statement of contribution to the X.Org Foundation in
the members account page on http://members.x.org, will be made
available to all voters to help them make their voting decisions.

Nominations, membership applications or renewals and completed
personal statements must be received no later than 23:59 UTC on 15 March
2016.

The slate of candidates will be published 17 March
2016 and candidate Q&A will begin then.  The deadline for Xorg
membership applications and renewals is 17 March 2016.

Cheers,
  Peter, on behalf of the X.Org BoD

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Clean up preprocessor checks for execinfo.h

2016-03-13 Thread Aric Belsito
Fixes building with debug enabled on musl systems.
---
 configure.ac| 1 +
 src/gallium/auxiliary/util/u_debug_symbol.c | 4 ++--
 src/mapi/glapi/gen/gl_gentable.py   | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index 49be147..1369937 100644
--- a/configure.ac
+++ b/configure.ac
@@ -758,6 +758,7 @@ AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES 
-DHAVE_XLOCALE_H"])
 AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
 AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
 AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
+AC_CHECK_HEADER([execinfo.h], [DEFINES="$DEFINES -DHAVE_EXECINFO_H"])
 
 dnl Check to see if dlopen is in default libraries (like Solaris, which
 dnl has it in libc), or if libdl is needed to get it.
diff --git a/src/gallium/auxiliary/util/u_debug_symbol.c 
b/src/gallium/auxiliary/util/u_debug_symbol.c
index 10efdd5..4026089 100644
--- a/src/gallium/auxiliary/util/u_debug_symbol.c
+++ b/src/gallium/auxiliary/util/u_debug_symbol.c
@@ -219,7 +219,7 @@ debug_symbol_name_dbghelp(const void *addr, char* buf, 
unsigned size)
 #endif /* PIPE_OS_WINDOWS */
 
 
-#if defined(__GLIBC__) && !defined(__UCLIBC__)
+#if defined(HAVE_EXECINFO_H)
 
 #include 
 
@@ -240,7 +240,7 @@ debug_symbol_name_glibc(const void *addr, char* buf, 
unsigned size)
return TRUE;
 }
 
-#endif /* defined(__GLIBC__) && !defined(__UCLIBC__) */
+#endif /* defined(HAVE_EXECINFO_H) */
 
 
 void
diff --git a/src/mapi/glapi/gen/gl_gentable.py 
b/src/mapi/glapi/gen/gl_gentable.py
index 7cd475a..0d1ebd4 100644
--- a/src/mapi/glapi/gen/gl_gentable.py
+++ b/src/mapi/glapi/gen/gl_gentable.py
@@ -44,7 +44,7 @@ header = """/* GLXEXT is the define used in the xserver when 
the GLX extension i
 #endif
 
 #if (defined(GLXEXT) && defined(HAVE_BACKTRACE)) \\
-   || (!defined(GLXEXT) && defined(DEBUG) && !defined(__CYGWIN__) && 
!defined(__MINGW32__) && !defined(__OpenBSD__) && !defined(__NetBSD__) && 
!defined(__DragonFly__))
+   || (!defined(GLXEXT) && defined(DEBUG) && defined(HAVE_EXECINFO_H))
 #define USE_BACKTRACE
 #endif
 
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types

2016-03-13 Thread Samuel Iglesias Gonsálvez
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256



On 12/03/16 17:26, Connor Abbott wrote:
> On Fri, Mar 11, 2016 at 2:33 AM, Samuel Iglesias Gonsálvez 
>  wrote:
> 
> 
> On 11/03/16 01:08, Jason Ekstrand wrote:
 On Thu, Mar 10, 2016 at 4:00 PM, Connor Abbott 
  wrote:
 
> On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez 
>  wrote:
>> From: Jason Ekstrand 
>> 
>> v2: Fix size/type mask to properly handle 8-bit types.
>> 
>> Signed-off-by: Juan A. Suarez Romero
>>  --- src/compiler/nir/nir.h | 17
>> - 1 file changed, 16 insertions(+), 1
>> deletion(-)
>> 
>> diff --git a/src/compiler/nir/nir.h
>> b/src/compiler/nir/nir.h index cccb3a4..659e98c 100644
>> --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h
>> @@ -605,9 +605,24 @@ typedef enum { nir_type_float,
>> nir_type_int, nir_type_uint, -   nir_type_bool +
>> nir_type_bool, +   nir_type_bool32 =32 | 
>> nir_type_bool, +   nir_type_int8 =  8  |
>> nir_type_int, + nir_type_int16 = 16 | nir_type_int, +
>> nir_type_int32 = 32 | nir_type_int, +   nir_type_int64 =
>> 64 | nir_type_int, +   nir_type_uint8 = 8  |
>> nir_type_uint, + nir_type_uint16 =16 | nir_type_uint,
>> +   nir_type_uint32 = 32 | nir_type_uint, +
>> nir_type_uint64 =64 | nir_type_uint, +
>> nir_type_float16 =   16 | nir_type_float, + 
>> nir_type_float32 =   32 | nir_type_float, +
>> nir_type_float64 =   64 | nir_type_float, }
>> nir_alu_type;
>> 
>> +#define NIR_ALU_TYPE_SIZE_MASK 0xfff8 +#define 
>> NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007
> 
> So I'm not really the one to be reviewing this series
> (after all, I wrote most of it :) ) but one thing that I
> never quite liked, and didn't get around to fixing, is how
> we use these raw constants all over the place. Perhaps we
> could make things more readable by adding
> nir_get_sized_type(), nir_get_unsized_type(), and
> nir_type_size() helpers and then use those instead of 
> or-ing/and-ing things together everywhere.
> 
 
 Agreed.
 
 
> 
> Agreed. We saw it too but, as this is used in a lot in the fp64
> patches, we were thinking on apply one patch at the end of the fp64
> series adding those helper functions (maybe just macros like
> NIR_GET_UNSIZED_TYPE and NIR_GET_TYPE_SIZE) and adapting the users
> of the mask.
> 
>> I should probably mention, in general we tend to prefer inline 
>> functions over macros where possible since it's clearer what
>> their argument types and return type are and they tend to
>> integrate better with gdb.
> 

Thanks for the advice! We will use inline functions then.

Sam

> 
> However, we can add them here and modify the rest of fp64 patches
> if you prefer it.
> 
> Sam
> 
> 
>> + typedef enum { NIR_OP_IS_COMMUTATIVE = (1 << 0), 
>> NIR_OP_IS_ASSOCIATIVE = (1 << 1), -- 2.7.0
>> 
>> ___ mesa-dev 
>> mailing list mesa-dev@lists.freedesktop.org 
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___ mesa-dev
> mailing list mesa-dev@lists.freedesktop.org 
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
 
> 
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQIcBAEBCAAGBQJW5lSfAAoJEH/0ujLxfcNDeVMP/jv3UO2VNeW2R/3BX1d7KTni
GJ/0lH4BxEEDlJNUoxCOoxRGVmUVI8D/NNRKWg9lWjP1aPSKcpNLXCSzw5C3YHql
YzklKjnNi6nSZdzjjDWOtC7wJvOrLUhbst2ng6N9vu9SkjGTT2TOjp6AMSc0O5G6
ptnH+mdqKqBb31rh22w5LhlJnjIA2qmAsVpR2XZ8RCTaRZw/whKnQ+RN3MBSW7qs
ENMoeBeaG8pKdJc7w8YzfO1jM4Tgic8cFlPq2rWF3eeG0s9pACLmCOFroyBjJvD9
ysQkUQoNyG25ElVMA8jdhuI19CemHN7EK1BxmA+U0RQyUQfMmax4g4TUFLloALnw
FapL4FI0j4nuER9liW/EED/NVgn78cLpMMOYyVjxOEt3GjAZ9WwIHHtDr9ZTXmdu
u9pyciB724XQJNkcaMQiAF1vJgsnR3WdBuHKgOul8B3EV8urRtUYLOv+4eqVbaIT
p5E3n1On5zxyifeWX/Nq2JDFA029K74G+6bVN3VnyZNUv5dMw4Ry+fNY6CrAdYqY
cvTs9WMxWhsPwK3CRKVYImquKngoVdkXazA7sePRF44AGBMobFPN4tF2DOtRVfOq
EtvbP7nJ3UJSA527KKdnUG4scf/1XDVkvPVK48/nkVVpURIZPj4fkXZvdAdUsutk
VZR5zXfg5JuF1xNiV6gH
=DTjV
-END PGP SIGNATURE-
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev