[Nouveau] [Bug 76173] [NVAC] xbmc failure with vdpau enabled

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=76173

Martin Bednar seraf...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|10.1                        |10.3

--- Comment #6 from Martin Bednar seraf...@gmail.com ---
Still happens with mesa 10.3 and xbmc 13.2. 
Any news?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [Bug 88262] New: 3.19 kernels hang during boot for an NV28 based card

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88262

Bug ID: 88262
   Summary: 3.19 kernels hang during boot for an NV28 based card
   Product: xorg
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Driver/nouveau
  Assignee: nouveau@lists.freedesktop.org
  Reporter: br...@wolff.to
QA Contact: xorg-t...@lists.x.org

Created attachment 112047
  --> https://bugs.freedesktop.org/attachment.cgi?id=112047&action=edit
dmesg output from a successful 3.18 boot

3.19 kernels have been hanging for me during boot on one of my machines. 
This is still happening as of commit dc9319f5a3e1f67d2a2fbf190e30f6d03f569fed
(post rc3). The problem first happens with commit
ad4a362635353f7ceb66f4038269770fee1025fa. I tried reverting
ad4a362635353f7ceb66f4038269770fee1025fa from
dc9319f5a3e1f67d2a2fbf190e30f6d03f569fed, but there were conflicts and I don't
know enough to properly resolve them.
The hang is early enough that I can't get logging output from the boot.



[Nouveau] [Bug 88262] 3.19 kernels hang during boot for an NV28 based card

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88262

Bruno Wolff br...@wolff.to changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |br...@wolff.to

--- Comment #1 from Bruno Wolff br...@wolff.to ---
I filed a bug at kernel.org for this as well:
https://bugzilla.kernel.org/show_bug.cgi?id=91091



[Nouveau] [Bug 88262] [NV28] 3.19 kernels hang during boot

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88262

Pierre Moreau pierre.mor...@free.fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|3.19 kernels hang during    |[NV28] 3.19 kernels hang
                   |boot for an NV28 based card |during boot

--- Comment #2 from Pierre Moreau pierre.mor...@free.fr ---
You could blacklist nouveau (via /etc/modprobe.d/somefile.conf) and load it
manually once you're logged in. It seems you're using netconsole, but it only
comes up once nouveau is loaded (apparently the interface is first called enp5s3
and is renamed to eth0 too late for us), so loading nouveau manually should
also help you get some logs via netconsole.



[Nouveau] [Bug 88272] New: [NVAC] Flickering screen on 1920x1080 monitor with 9400M in MacbookPro 5,5

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88272

Bug ID: 88272
   Summary: [NVAC] Flickering screen on 1920x1080 monitor with
9400M in MacbookPro 5,5
   Product: xorg
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Driver/nouveau
  Assignee: nouveau@lists.freedesktop.org
  Reporter: n...@famalex.de
QA Contact: xorg-t...@lists.x.org

Created attachment 112063
  --> https://bugs.freedesktop.org/attachment.cgi?id=112063&action=edit
dmesg without drm.debug

I'm using nouveau from the git kernel to drive the GeForce 9400M, the only
graphics adapter in my MacbookPro 5,5.

My Dell U2312HM monitor (1920x1080@60, attached via a DisplayPort-to-DisplayPort
cable) won't show a steady picture, and moving pictures make the flickering even
worse. The problem persists at the highest pstate level and with DisplayPort
as the only active output. I also tried a DisplayPort-to-VGA adapter and got
the same result.

Included two dmesg:

1) Without drm.debug
2) With drm.debug=0xf, flooded



[Nouveau] [Bug 88262] [NV28] 3.19 kernels hang during boot

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88262

--- Comment #3 from Marcin Slusarz marcin.slus...@gmail.com ---
Should be fixed now.

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c7e873f85fb60b1af589ac1b0c62353cfe0bbb29



[Nouveau] [Bug 88272] [NVAC] Flickering screen on 1920x1080 monitor with 9400M in MacbookPro 5,5

2015-01-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=88272

--- Comment #1 from Nils Alex n...@famalex.de ---
Created attachment 112064
  --> https://bugs.freedesktop.org/attachment.cgi?id=112064&action=edit
dmesg with drm.debug



[Nouveau] [PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2

2015-01-10 Thread Roy Spliet
If liveness analysis indicates it's good, this should improve the chances
of being able to emit the short MAD form.

Signed-off-by: Roy Spliet rspl...@eclipso.eu
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 898653c..1273449 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -627,6 +627,7 @@ RegAlloc::BuildIntervalsPass::visit(BasicBlock *bb)
 #define JOIN_MASK_UNION  (1 << 1)
 #define JOIN_MASK_MOV    (1 << 2)
 #define JOIN_MASK_TEX    (1 << 3)
+#define JOIN_MASK_MAD    (1 << 4)
 
 class GCRA
 {
@@ -851,7 +852,7 @@ GCRA::coalesce(ArrayList& insns)
case 0x80:
case 0x90:
case 0xa0:
-  ret = doCoalesce(insns, JOIN_MASK_UNION | JOIN_MASK_TEX);
+  ret = doCoalesce(insns, JOIN_MASK_UNION | JOIN_MASK_TEX | JOIN_MASK_MAD);
   break;
case 0xc0:
case 0xd0:
@@ -995,6 +996,13 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
copyCompound(insn->getSrc(0), insn->getDef(0));
  }
  break;
+  case OP_MAD:
+ if (!(mask & JOIN_MASK_MAD))
+break;
+ if (insn->srcExists(2) && insn->src(2).getFile() == FILE_GPR &&
+ insn->def(0).getFile() == FILE_GPR)
+coalesceValues(insn->getDef(0), insn->getSrc(2), false);
+ break;
   case OP_TEX:
   case OP_TXB:
   case OP_TXL:
-- 
2.1.0





[Nouveau] [PATCH 3/3] nv50/ir: Fold IMM into MAD

2015-01-10 Thread Roy Spliet
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is
a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be
done post-RA because it is required that SDST == SSRC2.

Signed-off-by: Roy Spliet rspl...@eclipso.eu
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 52 ++
 1 file changed, 52 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 21d20ca..1fc3ae6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2259,6 +2259,56 @@ FlatteningPass::tryPredicateConditional(BasicBlock *bb)
 
 // =============================================================================
 
+// Fold Immediate into MAD; must be done after register allocation due to
+// constraint SDST == SSRC2
+// TODO:
+// Does NVC0+ have other situations where this pass makes sense?
+class NV50PostRaConstantFolding : public Pass
+{
+private:
+   virtual bool visit(BasicBlock *);
+};
+
+bool
+NV50PostRaConstantFolding::visit(BasicBlock *bb)
+{
+   Value *vtmp;
+   Instruction *def;
+
+   for (Instruction *i = bb->getFirst(); i; i = i->next) {
+  switch (i->op) {
+  case OP_MAD:
+ if(i->def(0).getFile() == FILE_GPR &&
+   i->src(0).getFile() == FILE_GPR &&
+   i->src(1).getFile() == FILE_GPR &&
+   i->src(2).getFile() == FILE_GPR &&
+   i->getDef(0)->reg.data.id == i->getSrc(2)->reg.data.id) {
+for (int s = 1; s >= 0; s--) {
+   def = i->getSrc(1)->getInsn();
+   if (def->op == OP_MOV && def->src(0).getFile() == FILE_IMMEDIATE) {
+  vtmp = i->getSrc(1);
+  i->setSrc(1, def->getSrc(0));
+  if (vtmp->refCount() == 0)
+ delete_Instruction(bb->getProgram(), def);
+  break;
+   }
+
+   vtmp = i->getSrc(0);
+   i->setSrc(0, i->getSrc(1));
+   i->setSrc(1, vtmp);
+}
+ }
+ break;
+  default:
+ break;
+  }
+   }
+
+   return true;
+}
+
+// =============================================================================
+
 // Common subexpression elimination. Stupid O^2 implementation.
 class LocalCSE : public Pass
 {
@@ -2629,6 +2679,8 @@ bool
 Program::optimizePostRA(int level)
 {
RUN_PASS(2, FlatteningPass, run);
+   if (getTarget()->getChipset() < 0xc0)
+  RUN_PASS(2, NV50PostRaConstantFolding, run);
return true;
 }
 
-- 
2.1.0





[Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015-01-10 Thread Roy Spliet
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit

Signed-off-by: Roy Spliet rspl...@eclipso.eu
---
 .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp  | 18 --
 .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp|  2 +-
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
index 2077388..b1e7409 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
@@ -939,9 +939,20 @@ CodeEmitterNV50::emitFMAD(const Instruction *i)
 
code[0] = 0xe0000000;
 
+   if (i->src(1).getFile() == FILE_IMMEDIATE) {
+  code[1] = 0;
+  emitForm_IMM(i);
+  code[0] |= neg_mul << 15;
+  code[0] |= neg_add << 22;
+  if (i->saturate)
+ code[0] |= 1 << 8;
+   } else
if (i->encSize == 4) {
   emitForm_MUL(i);
-  assert(!neg_mul && !neg_add);
+  code[0] |= neg_mul << 15;
+  code[0] |= neg_add << 22;
+  if (i->saturate)
+ code[0] |= 1 << 8;
} else {
   code[1]  = neg_mul << 26;
   code[1] |= neg_add << 27;
@@ -1931,11 +1942,6 @@ CodeEmitterNV50::getMinEncodingSize(const Instruction *i) const
 
// check constraints on short MAD
if (info.srcNr >= 2 && i->srcExists(2)) {
-  if (i->saturate || i->src(2).mod)
- return 8;
-  if ((i->src(0).mod ^ i->src(1).mod) ||
-  (i->src(0).mod | i->src(1).mod).abs())
- return 8;
   if (!i->defExists(0) ||
   i->def(0).rep()->reg.data.id != i->src(2).rep()->reg.data.id)
  return 8;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 48f996b..f4dedd7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -118,7 +118,7 @@ void TargetNV50::initOpInfo()
static const uint32_t shortForm[(OP_LAST + 31) / 32] =
{
   // MOV,ADD,SUB,MUL,SAD,L/PINTERP,RCP,TEX,TXF
-  0x00010e40, 0x00000040, 0x00000498, 0x00000000
+  0x00014e40, 0x00000040, 0x00000498, 0x00000000
};
static const operation noDestList[] =
{
-- 
2.1.0





Re: [Nouveau] [PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2

2015-01-10 Thread Ilia Mirkin
On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet rspl...@eclipso.eu wrote:
 If liveness analysis indicates it's good, this should improve the chances
 of being able to emit the short MAD form.

 Signed-off-by: Roy Spliet rspl...@eclipso.eu
 ---
  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

 diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
 b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
 index 898653c..1273449 100644
 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
 +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
 @@ -627,6 +627,7 @@ RegAlloc::BuildIntervalsPass::visit(BasicBlock *bb)
  #define JOIN_MASK_UNION  (1 << 1)
  #define JOIN_MASK_MOV    (1 << 2)
  #define JOIN_MASK_TEX    (1 << 3)
 +#define JOIN_MASK_MAD    (1 << 4)

  class GCRA
  {
 @@ -851,7 +852,7 @@ GCRA::coalesce(ArrayList& insns)
 case 0x80:
 case 0x90:
 case 0xa0:
 -  ret = doCoalesce(insns, JOIN_MASK_UNION | JOIN_MASK_TEX);
 +  ret = doCoalesce(insns, JOIN_MASK_UNION | JOIN_MASK_TEX | JOIN_MASK_MAD);
break;
 case 0xc0:
 case 0xd0:
 @@ -995,6 +996,13 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
 copyCompound(insn->getSrc(0), insn->getDef(0));
   }
   break;
 +  case OP_MAD:
 + if (!(mask & JOIN_MASK_MAD))
 +break;
 + if (insn->srcExists(2) && insn->src(2).getFile() == FILE_GPR &&
 + insn->def(0).getFile() == FILE_GPR)

Use the same check here as you do elsewhere: check that
insn->defExists(0). It might not exist if only flags are returned,
I guess? Not sure if mad actually allows that.
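
A minimal sketch of the guard suggested above, written against hypothetical stand-in types. Operand, Insn and canCoalesceMad are illustrative names and not the Mesa classes; only the value of JOIN_MASK_MAD is taken from the patch. The point is the ordering: test that def 0 and src 2 exist before looking at their register files.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the IR types; not the Mesa definitions.
enum File { FILE_NONE, FILE_GPR, FILE_FLAGS };

struct Operand {
    File file;
};

struct Insn {
    std::vector<Operand> defs;
    std::vector<Operand> srcs;
    bool defExists(std::size_t d) const { return d < defs.size(); }
    bool srcExists(std::size_t s) const { return s < srcs.size(); }
};

constexpr unsigned JOIN_MASK_MAD = 1u << 4;  // value from the patch

// Returns true when def 0 and src 2 are both present and both GPRs,
// and the caller asked for MAD coalescing via the mask.
bool canCoalesceMad(const Insn &insn, unsigned mask)
{
    if (!(mask & JOIN_MASK_MAD))
        return false;
    // Existence checks come first so the accesses below are always valid.
    if (!insn.defExists(0) || !insn.srcExists(2))
        return false;
    return insn.defs[0].file == FILE_GPR && insn.srcs[2].file == FILE_GPR;
}
```

With this shape, an instruction that produces no def 0 is simply skipped instead of being dereferenced.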

 +coalesceValues(insn->getDef(0), insn->getSrc(2), false);
 + break;
case OP_TEX:
case OP_TXB:
case OP_TXL:
 --
 2.1.0





Re: [Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015-01-10 Thread Ilia Mirkin
On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann
tobias.johannes.klausm...@mni.thm.de wrote:
 Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32

 Signed-off-by: Tobias Klausmann tobias.johannes.klausm...@mni.thm.de
 ---
 V2: beat me, whip me, split out F64

  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 81 
 ++
  1 file changed, 81 insertions(+)

 diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
 b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
 index 9a0bb60..741c74f 100644
 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
 +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
 @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
i->op = OP_MOV;
break;
 }
 +   case OP_CVT: {
 +  Storage res;
 +  bld.setPosition(i, true); /* make sure bld is init'ed */
 +  switch(i->dType) {
 +  case TYPE_U16:
 + switch (i->sType) {
 + case TYPE_F32:
 +if (i->saturate)
 +   res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
 +UINT16_MAX));

Where did this saturate stuff come from? It doesn't make sense to
saturate to a non-float dtype. I'd go ahead and just
assert(!i->saturate) in the int dtype cases.

One does wonder what the hw does if the float doesn't fit in the
destination... whether it saturates or not. I don't hugely care
though.
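
One way to make the overflow question concrete is to fold with an explicit clamp, so the folded value does not depend on what the hardware does. The sketch below is only an illustration under stated assumptions: the helper name foldF32ToU16Saturated is hypothetical and not part of Mesa, and mapping NaN to 0 is an arbitrary choice made here. It does not claim to match util_iround or the GPU.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: clamp to the u16 range first, then round to nearest.
// Assumption: NaN folds to 0 (chosen for this sketch only).
uint16_t foldF32ToU16Saturated(float f)
{
    if (std::isnan(f))
        return 0;
    if (f <= 0.0f)
        return 0;
    if (f >= 65535.0f)
        return UINT16_MAX;
    // In range, so the rounded result always fits in 16 bits.
    return static_cast<uint16_t>(std::lround(f));
}
```

Without a clamp of this kind, converting an out-of-range float to an integer type is undefined behaviour in C++, so a compile-time fold could disagree with whatever the hardware produces.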

 +else
 +   res.data.u16 = util_iround(imm0.reg.data.f32);
 +break;
 + default:
 +return;
 + }
 + i->setSrc(0, bld.mkImm(res.data.u16));
 + break;
 +  case TYPE_U32:
 + switch (i->sType) {
 + case TYPE_F32:
 +if (i->saturate)
 +   res.data.u32 = util_iround(CLAMP(imm0.reg.data.f32, 0,
 +UINT32_MAX));
 +else
 +   res.data.u32 = util_iround(imm0.reg.data.f32);
 +break;
 + default:
 +return;
 + }
 + i->setSrc(0, bld.mkImm(res.data.u32));
 + break;
 +  case TYPE_S16:
 + switch (i->sType) {
 + case TYPE_F32:
 +if (i->saturate)
 +   res.data.s16 = util_iround(CLAMP(imm0.reg.data.f32, INT16_MIN,
 +INT16_MAX));
 +else
 +   res.data.s16 = util_iround(imm0.reg.data.f32);
 +break;
 + default:
 +return;
 + }
 + i->setSrc(0, bld.mkImm(res.data.s16));
 + break;
 +  case TYPE_S32:
 + switch (i->sType) {
 + case TYPE_F32:
 +if (i->saturate)
 +   res.data.s32 = util_iround(CLAMP(imm0.reg.data.f32, INT32_MIN,
 +   INT32_MAX));
 +else
 +   res.data.s32 = util_iround(imm0.reg.data.f32);
 +break;
 + default:
 +return;
 + }
 + i->setSrc(0, bld.mkImm(res.data.s32));
 + break;
 +  case TYPE_F32:
 + switch (i->sType) {
 + case TYPE_U16: res.data.f32 = (float) imm0.reg.data.u16; break;
 + case TYPE_U32: res.data.f32 = (float) imm0.reg.data.u32; break;
 + case TYPE_S16: res.data.f32 = (float) imm0.reg.data.s16; break;
 + case TYPE_S32: res.data.f32 = (float) imm0.reg.data.s32; break;
 + default:
 +return;
 + }
 + i->setSrc(0, bld.mkImm(res.data.f32));
 + break;
 +  default:
 + return;
 +  }
 +  i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
 +  i->setSrc(1, NULL);

How can src(1) be set? OP_CVT only has the one arg...

 +  i->op = OP_MOV;
 +
 +  i->src(0).mod = Modifier(0); /* Clear the already applied modifier */
 +  break;
 +   }
 default:
return;
 }
 --
 2.2.1



Re: [Nouveau] [RFC] mesa/st: Avoid passing a NULL buffer to the drivers

2015-01-10 Thread Ilia Mirkin
Can you elaborate a bit as to why that's the right thing to do?

On Wed, Jan 7, 2015 at 1:52 PM, Tobias Klausmann
tobias.johannes.klausm...@mni.thm.de wrote:
 If we capture transform feedback from n streams into (n-1) buffers, we face a
 NULL buffer; use buffer (n-1) to capture the output of stream n.

 This fixes one piglit test with nvc0:
arb_gpu_shader5-xfb-streams-without-invocations

 Signed-off-by: Tobias Klausmann tobias.johannes.klausm...@mni.thm.de
 ---
  src/mesa/state_tracker/st_cb_xformfb.c | 5 +
  1 file changed, 5 insertions(+)

 diff --git a/src/mesa/state_tracker/st_cb_xformfb.c 
 b/src/mesa/state_tracker/st_cb_xformfb.c
 index 8f75eda..5a12da4 100644
 --- a/src/mesa/state_tracker/st_cb_xformfb.c
 +++ b/src/mesa/state_tracker/st_cb_xformfb.c
 @@ -123,6 +123,11 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
struct st_buffer_object *bo = st_buffer_object(sobj->base.Buffers[i]);

if (bo) {
 + if (!bo->buffer)
 +/* If we capture transform feedback from n streams into (n-1)
 + * buffers we have to write to buffer (n-1) for stream n.
 + */
 +bo = st_buffer_object(sobj->base.Buffers[i-1]);
   /* Check whether we need to recreate the target. */
   if (!sobj->targets[i] ||
   sobj->targets[i] == sobj->draw_count ||
 --
 2.2.1



Re: [Nouveau] [PATCH 3/3] nv50/ir: Fold IMM into MAD

2015-01-10 Thread Ilia Mirkin
On Sat, Jan 10, 2015 at 7:23 PM, Roy Spliet rspl...@eclipso.eu wrote:
 Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is
 a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be
 done post-RA because it is required that SDST == SSRC2.

because it requires that


 Signed-off-by: Roy Spliet rspl...@eclipso.eu
 ---
  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 52 
 ++
  1 file changed, 52 insertions(+)

 diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
 b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
 index 21d20ca..1fc3ae6 100644
 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
 +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
 @@ -2259,6 +2259,56 @@ FlatteningPass::tryPredicateConditional(BasicBlock *bb)

  // =============================================================================

 +// Fold Immediate into MAD; must be done after register allocation due to
 +// constraint SDST == SSRC2
 +// TODO:
 +// Does NVC0+ have other situations where this pass makes sense?
 +class NV50PostRaConstantFolding : public Pass
 +{
 +private:
 +   virtual bool visit(BasicBlock *);
 +};
 +
 +bool
 +NV50PostRaConstantFolding::visit(BasicBlock *bb)
 +{
 +   Value *vtmp;
 +   Instruction *def;
 +
 +   for (Instruction *i = bb->getFirst(); i; i = i->next) {
 +  switch (i->op) {
 +  case OP_MAD:
 + if(i->def(0).getFile() == FILE_GPR &&
 +   i->src(0).getFile() == FILE_GPR &&
 +   i->src(1).getFile() == FILE_GPR &&
 +   i->src(2).getFile() == FILE_GPR &&
 +   i->getDef(0)->reg.data.id == i->getSrc(2)->reg.data.id) {


This would be much easier to read as

if (... != GPR || != GPR || ...) break; (or continue...)

 +for (int s = 1; s >= 0; s--) {

You don't end up using 's' in the loop. Did you mean to have some
clever logic that flips the order of src0 and src1 in case the wrong
one came from an immediate?
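
The swap-and-retry idea can be sketched without a loop index at all. This is a simplified model, not the Mesa IR: Src, Mad and foldImmIntoMad are hypothetical names, and a boolean flag stands in for "this source is defined by a MOV from an immediate". It relies on the multiply operands of a MAD being interchangeable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a MAD source operand.
struct Src {
    bool fromMovImm;  // true if defined by "MOV dst, IMM"
    int value;        // payload used only to track which operand is which
};

// Hypothetical stand-in for the two multiply operands of a MAD.
struct Mad {
    Src src0;
    Src src1;
};

// Look at src1 first; if it is not an immediate MOV, swap the multiply
// operands and look once more. Returns true when an immediate ended up
// in src1. If neither operand qualifies, the two swaps cancel out and
// the original operand order is restored.
bool foldImmIntoMad(Mad &mad)
{
    for (int attempt = 0; attempt < 2; ++attempt) {
        if (mad.src1.fromMovImm)
            return true;
        std::swap(mad.src0, mad.src1);
    }
    return false;
}
```

The same effect could be had by indexing the sources with the loop variable instead of swapping; the swap form is shown because it mirrors the quoted loop body.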

 +   def = i->getSrc(1)->getInsn();
 +   if (def->op == OP_MOV && def->src(0).getFile() == FILE_IMMEDIATE) {
 +  vtmp = i->getSrc(1);
 +  i->setSrc(1, def->getSrc(0));
 +  if (vtmp->refCount() == 0)
 + delete_Instruction(bb->getProgram(), def);

This shouldn't be necessary, it's all allocated in an arena and will
get cleaned up later.

 +  break;
 +   }
 +
 +   vtmp = i->getSrc(0);
 +   i->setSrc(0, i->getSrc(1));
 +   i->setSrc(1, vtmp);
 +}
 + }
 + break;
 +  default:
 + break;
 +  }
 +   }
 +
 +   return true;
 +}
 +
 +// =============================================================================
 +
  // Common subexpression elimination. Stupid O^2 implementation.
  class LocalCSE : public Pass
  {
 @@ -2629,6 +2679,8 @@ bool
  Program::optimizePostRA(int level)
  {
 RUN_PASS(2, FlatteningPass, run);
 +   if (getTarget()->getChipset() < 0xc0)
 +  RUN_PASS(2, NV50PostRaConstantFolding, run);
 return true;
  }

 --
 2.1.0





Re: [Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015-01-10 Thread Roy Spliet

On 11-01-15 at 01:34, Ilia Mirkin wrote:

> And you're allowing saturate/neg emission on the short form.

Yes

> Is this already in envytools?

Tesla floating point instructions are poorly documented in the RST
documents; fmad is no exception. I'll make sure to check envydis.

> Also, what's the shortForm thing?

Documented in envytools; see
http://envytools.readthedocs.org/en/latest/hw/graph/tesla/cuda/isa.html#instruction-format
In short, opcodes are either 4 bytes (short) or 8 bytes (long).

> This change is
> probably fine, but the changelog needs work.

If you insist, I could elaborate a little further. However, documenting
what a short opcode is seems a bit superfluous.


On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet rspl...@eclipso.eu wrote:

MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit



Re: [Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015-01-10 Thread Ilia Mirkin
And you're allowing saturate/neg emission on the short form. Is this
already in envytools? Also, what's the shortForm thing? This change is
probably fine, but the changelog needs work.

On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet rspl...@eclipso.eu wrote:
 MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit



Re: [Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation

2015-01-10 Thread Ilia Mirkin
On Sat, Jan 10, 2015 at 7:45 PM, Roy Spliet se...@nimrod-online.com wrote:
> On 11-01-15 at 01:34, Ilia Mirkin wrote:
>
>> And you're allowing saturate/neg emission on the short form.
>
> Yes
>
>> Is this already in envytools?
>
> Tesla floating point instructions are poorly documented in the RST
> documents; fmad is no exception. I'll make sure to check envydis.

Sorry, I meant envydis

>> Also, what's the shortForm thing?
>
> Documented in envytools; see
> http://envytools.readthedocs.org/en/latest/hw/graph/tesla/cuda/isa.html#instruction-format
> In short, opcodes are either 4 bytes (short) or 8 bytes (long).

Yes, I'm aware of that bit :)

>> This change is
>> probably fine, but the changelog needs work.
>
> If you insist I could elaborate a little further. However, documenting what
> a short opcode is seems a bit superfluous.

I meant what was the reason for the change to the shortForm array in
target_nv50? I don't remember offhand what it is, and you were doing a
bunch of things in here and I wasn't sure which of them it was related
to.
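
For reference, the effect of the one-word change in the shortForm array can be checked mechanically. The sketch below uses only the two first-word constants that appear in the patch; the helper names hasShortForm and changedBit are hypothetical, and the idea that bit N of the array corresponds to the opcode with enum value N is an assumption suggested by the "MOV,ADD,SUB,MUL,SAD,..." comment, not something the thread confirms.

```cpp
#include <cassert>
#include <cstdint>

// Test whether bit `op` is set in a packed array of 32-bit words.
// Assumption: opcode enum values index the bits directly.
bool hasShortForm(const uint32_t *words, unsigned op)
{
    return ((words[op / 32] >> (op % 32)) & 1u) != 0;
}

// Index of the single bit that differs between two words,
// or -1 if zero or several bits differ.
int changedBit(uint32_t before, uint32_t after)
{
    uint32_t diff = before ^ after;
    if (diff == 0 || (diff & (diff - 1)) != 0)
        return -1;
    int bit = 0;
    while ((diff & 1u) == 0) {
        diff >>= 1;
        ++bit;
    }
    return bit;
}
```

Applied to 0x00010e40 and 0x00014e40, changedBit reports a single differing bit, so the patch marks exactly one more opcode as having a short encoding. Given the patch subject that opcode is presumably MAD, but the mapping from bit index to opcode is not stated in the thread.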



 On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet rspl...@eclipso.eu wrote:

 MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
