[Nouveau] [Bug 76173] [NVAC] xbmc failure with vdpau enabled
https://bugs.freedesktop.org/show_bug.cgi?id=76173

Martin Bednar <seraf...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|10.1                        |10.3

--- Comment #6 from Martin Bednar <seraf...@gmail.com> ---
Still happens with mesa 10.3 and xbmc 13.2. Any news?

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [Bug 88262] New: 3.19 kernels hang during boot for an NV28 based card
https://bugs.freedesktop.org/show_bug.cgi?id=88262

            Bug ID: 88262
           Summary: 3.19 kernels hang during boot for an NV28 based card
           Product: xorg
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Driver/nouveau
          Assignee: nouveau@lists.freedesktop.org
          Reporter: br...@wolff.to
        QA Contact: xorg-t...@lists.x.org

Created attachment 112047
  --> https://bugs.freedesktop.org/attachment.cgi?id=112047&action=edit
dmesg output from a successful 3.18 boot

3.19 kernels have been hanging during boot on one of my machines. This is
still happening as of commit dc9319f5a3e1f67d2a2fbf190e30f6d03f569fed (post
rc3). The problem first happens with commit
ad4a362635353f7ceb66f4038269770fee1025fa.

I tried reverting ad4a362635353f7ceb66f4038269770fee1025fa from
dc9319f5a3e1f67d2a2fbf190e30f6d03f569fed, but there were conflicts and I don't
know enough to properly resolve them.

The hang is early enough that I can't get logging output from the boot.
[Nouveau] [Bug 88262] 3.19 kernels hang during boot for an NV28 based card
https://bugs.freedesktop.org/show_bug.cgi?id=88262

Bruno Wolff <br...@wolff.to> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |br...@wolff.to

--- Comment #1 from Bruno Wolff <br...@wolff.to> ---
I filed a bug at kernel.org for this as well:
https://bugzilla.kernel.org/show_bug.cgi?id=91091
[Nouveau] [Bug 88262] [NV28] 3.19 kernels hang during boot
https://bugs.freedesktop.org/show_bug.cgi?id=88262

Pierre Moreau <pierre.mor...@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|3.19 kernels hang during    |[NV28] 3.19 kernels hang
                   |boot for an NV28 based card |during boot

--- Comment #2 from Pierre Moreau <pierre.mor...@free.fr> ---
You could blacklist nouveau (via /etc/modprobe.d/somefile.conf) and load it
manually once you're logged in.
It seems you're using netconsole, but it only comes up after nouveau is loaded
(apparently the interface is first called enp5s3 and is renamed to eth0 too
late for us), so loading nouveau manually should also help you get some logs
via netconsole.
[Nouveau] [Bug 88272] New: [NVAC] Flickering screen on 1920x1080 monitor with 9400M in MacbookPro 5,5
https://bugs.freedesktop.org/show_bug.cgi?id=88272

            Bug ID: 88272
           Summary: [NVAC] Flickering screen on 1920x1080 monitor with
                    9400M in MacbookPro 5,5
           Product: xorg
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Driver/nouveau
          Assignee: nouveau@lists.freedesktop.org
          Reporter: n...@famalex.de
        QA Contact: xorg-t...@lists.x.org

Created attachment 112063
  --> https://bugs.freedesktop.org/attachment.cgi?id=112063&action=edit
dmesg without drm.debug

I'm using nouveau from the git kernel to drive the GeForce 9400M, the only
graphics adapter in my MacbookPro 5,5. My Dell U2312HM monitor (1920x1080@60,
attached via a DisplayPort-to-DisplayPort cable) won't show a steady picture,
and moving pictures make the flickering even worse.

The problem persists at the highest pstate level and with only the DisplayPort
as active output. I also tried a DisplayPort-to-VGA adapter and got the same
result.

Two dmesg logs are attached:
1) Without drm.debug
2) With drm.debug=0xf, flooded
[Nouveau] [Bug 88262] [NV28] 3.19 kernels hang during boot
https://bugs.freedesktop.org/show_bug.cgi?id=88262

--- Comment #3 from Marcin Slusarz <marcin.slus...@gmail.com> ---
Should be fixed now.

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c7e873f85fb60b1af589ac1b0c62353cfe0bbb29
[Nouveau] [Bug 88272] [NVAC] Flickering screen on 1920x1080 monitor with 9400M in MacbookPro 5,5
https://bugs.freedesktop.org/show_bug.cgi?id=88272

--- Comment #1 from Nils Alex <n...@famalex.de> ---
Created attachment 112064
  --> https://bugs.freedesktop.org/attachment.cgi?id=112064&action=edit
dmesg with drm.debug
[Nouveau] [PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2
If liveness analysis indicates it's good, this should improve the chances of
being able to emit the short MAD form.

Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 898653c..1273449 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -627,6 +627,7 @@ RegAlloc::BuildIntervalsPass::visit(BasicBlock *bb)
 #define JOIN_MASK_UNION (1 << 1)
 #define JOIN_MASK_MOV   (1 << 2)
 #define JOIN_MASK_TEX   (1 << 3)
+#define JOIN_MASK_MAD   (1 << 4)
 
 class GCRA
 {
@@ -851,7 +852,7 @@ GCRA::coalesce(ArrayList& insns)
    case 0x80:
    case 0x90:
    case 0xa0:
-      ret = doCoalesce(insns, JOIN_MASK_UNION | JOIN_MASK_TEX);
+      ret = doCoalesce(insns, JOIN_MASK_UNION | JOIN_MASK_TEX | JOIN_MASK_MAD);
       break;
    case 0xc0:
    case 0xd0:
@@ -995,6 +996,13 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
             copyCompound(insn->getSrc(0), insn->getDef(0));
          }
          break;
+      case OP_MAD:
+         if (!(mask & JOIN_MASK_MAD))
+            break;
+         if (insn->srcExists(2) && insn->src(2).getFile() == FILE_GPR &&
+             insn->def(0).getFile() == FILE_GPR)
+            coalesceValues(insn->getDef(0), insn->getSrc(2), false);
+         break;
       case OP_TEX:
       case OP_TXB:
       case OP_TXL:
-- 
2.1.0
[Nouveau] [PATCH 3/3] nv50/ir: Fold IMM into MAD
Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a
MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done
post-RA because it is required that SDST == SSRC2.

Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 52 ++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 21d20ca..1fc3ae6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2259,6 +2259,56 @@ FlatteningPass::tryPredicateConditional(BasicBlock *bb)
 
 // =============================================================================
 
+// Fold Immediate into MAD; must be done after register allocation due to
+// constraint SDST == SSRC2
+// TODO:
+// Does NVC0+ have other situations where this pass makes sense?
+class NV50PostRaConstantFolding : public Pass
+{
+private:
+   virtual bool visit(BasicBlock *);
+};
+
+bool
+NV50PostRaConstantFolding::visit(BasicBlock *bb)
+{
+   Value *vtmp;
+   Instruction *def;
+
+   for (Instruction *i = bb->getFirst(); i; i = i->next) {
+      switch (i->op) {
+      case OP_MAD:
+         if(i->def(0).getFile() == FILE_GPR &&
+            i->src(0).getFile() == FILE_GPR &&
+            i->src(1).getFile() == FILE_GPR &&
+            i->src(2).getFile() == FILE_GPR &&
+            i->getDef(0)->reg.data.id == i->getSrc(2)->reg.data.id) {
+            for (int s = 1; s >= 0; s--) {
+               def = i->getSrc(1)->getInsn();
+               if (def->op == OP_MOV && def->src(0).getFile() == FILE_IMMEDIATE) {
+                  vtmp = i->getSrc(1);
+                  i->setSrc(1, def->getSrc(0));
+                  if (vtmp->refCount() == 0)
+                     delete_Instruction(bb->getProgram(), def);
+                  break;
+               }
+
+               vtmp = i->getSrc(0);
+               i->setSrc(0, i->getSrc(1));
+               i->setSrc(1, vtmp);
+            }
+         }
+         break;
+      default:
+         break;
+      }
+   }
+
+   return true;
+}
+
+// =============================================================================
+
 // Common subexpression elimination. Stupid O^2 implementation.
 class LocalCSE : public Pass
 {
@@ -2629,6 +2679,8 @@ bool
 Program::optimizePostRA(int level)
 {
    RUN_PASS(2, FlatteningPass, run);
+   if (getTarget()->getChipset() < 0xc0)
+      RUN_PASS(2, NV50PostRaConstantFolding, run);
 
    return true;
 }
-- 
2.1.0
[Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit

Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
---
 .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp      | 18 ++++++++++++------
 .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp    |  2 +-
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
index 2077388..b1e7409 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
@@ -939,9 +939,20 @@ CodeEmitterNV50::emitFMAD(const Instruction *i)
 
    code[0] = 0xe0000000;
 
+   if (i->src(1).getFile() == FILE_IMMEDIATE) {
+      code[1] = 0;
+      emitForm_IMM(i);
+      code[0] |= neg_mul << 15;
+      code[0] |= neg_add << 22;
+      if (i->saturate)
+         code[0] |= 1 << 8;
+   } else
    if (i->encSize == 4) {
       emitForm_MUL(i);
-      assert(!neg_mul && !neg_add);
+      code[0] |= neg_mul << 15;
+      code[0] |= neg_add << 22;
+      if (i->saturate)
+         code[0] |= 1 << 8;
    } else {
       code[1]  = neg_mul << 26;
       code[1] |= neg_add << 27;
@@ -1931,11 +1942,6 @@ CodeEmitterNV50::getMinEncodingSize(const Instruction *i) const
 
    // check constraints on short MAD
    if (info.srcNr >= 2 && i->srcExists(2)) {
-      if (i->saturate || i->src(2).mod)
-         return 8;
-      if ((i->src(0).mod ^ i->src(1).mod) ||
-          (i->src(0).mod | i->src(1).mod).abs())
-         return 8;
       if (!i->defExists(0) ||
           i->def(0).rep()->reg.data.id != i->src(2).rep()->reg.data.id)
          return 8;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 48f996b..f4dedd7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -118,7 +118,7 @@ void TargetNV50::initOpInfo()
    static const uint32_t shortForm[(OP_LAST + 31) / 32] =
    {
       // MOV,ADD,SUB,MUL,SAD,L/PINTERP,RCP,TEX,TXF
-      0x00010e40, 0x00000040, 0x00000498, 0x00000000
+      0x00014e40, 0x00000040, 0x00000498, 0x00000000
    };
    static const operation noDestList[] =
    {
-- 
2.1.0
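
The one-word change to shortForm can be read as flipping a single bit. The following is only a sketch, assuming shortForm is a bitset indexed by opcode number with 32 opcodes per word; the names kOldWord0, kNewWord0 and opBitSet are hypothetical and not part of Mesa.

```cpp
#include <cstdint>

// First word of the shortForm table before and after the patch
// (values taken from the diff; only word 0 changes).
static const uint32_t kOldWord0[1] = { 0x00010e40 };
static const uint32_t kNewWord0[1] = { 0x00014e40 };

// Hypothetical helper: test bit `op` in a bitset stored as 32-bit words,
// i.e. word index op / 32 and bit index op % 32.
inline bool opBitSet(const uint32_t *words, unsigned op)
{
   return ((words[op / 32] >> (op % 32)) & 1u) != 0;
}

// Bits that differ between the old and the new first word.
inline uint32_t changedBits()
{
   return kOldWord0[0] ^ kNewWord0[0];
}
```

With these values the only differing bit is bit 14. If opcode 14 is OP_MAD in the operation enum (an assumption here, the thread does not state it), the change marks MAD as having a short form.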
Re: [Nouveau] [PATCH 2/3] nv50/ir: For MAD, prefer SDST == SSRC2
On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspl...@eclipso.eu> wrote:
> If liveness analysis indicates it's good, this should improve the chances of
> being able to emit the short MAD form.
>
> Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> @@ -995,6 +996,13 @@ GCRA::doCoalesce(ArrayList& insns, unsigned int mask)
>              copyCompound(insn->getSrc(0), insn->getDef(0));
>           }
>           break;
> +      case OP_MAD:
> +         if (!(mask & JOIN_MASK_MAD))
> +            break;
> +         if (insn->srcExists(2) && insn->src(2).getFile() == FILE_GPR &&
> +             insn->def(0).getFile() == FILE_GPR)

Use the same check here as you do elsewhere... check that
insn->defExists(0). It might not exist if only flags are returned, I
guess? Not sure if mad actually allows that.

> +            coalesceValues(insn->getDef(0), insn->getSrc(2), false);
> +         break;
>        case OP_TEX:
>        case OP_TXB:
>        case OP_TXL:
> --
> 2.1.0
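
The condition under review, with the suggested defExists(0) guard added, could look roughly like this. It is a sketch on toy types only: ToyFile, ToyMadInsn and shouldCoalesceMad are hypothetical stand-ins, not Mesa classes, and only the JOIN_MASK_* values come from the patch.

```cpp
// Toy register file tags (hypothetical, modelling FILE_GPR and friends).
enum ToyFile { TOY_FILE_NONE, TOY_FILE_GPR, TOY_FILE_IMMEDIATE };

// Flag bits as they appear in the patch.
const unsigned JOIN_MASK_TEX = 1u << 3;
const unsigned JOIN_MASK_MAD = 1u << 4;

// Toy MAD instruction: just enough state for the coalescing decision.
struct ToyMadInsn {
   bool hasDef;       // models insn->defExists(0)
   bool hasSrc2;      // models insn->srcExists(2)
   ToyFile defFile;   // models insn->def(0).getFile()
   ToyFile src2File;  // models insn->src(2).getFile()
};

// Returns true when def(0) and src(2) would be handed to coalesceValues().
inline bool shouldCoalesceMad(const ToyMadInsn &insn, unsigned mask)
{
   if (!(mask & JOIN_MASK_MAD))
      return false;                  // MAD joining not requested
   if (!insn.hasDef || !insn.hasSrc2)
      return false;                  // the guard suggested in the review
   return insn.defFile == TOY_FILE_GPR && insn.src2File == TOY_FILE_GPR;
}
```

Checking existence before looking at the register file keeps the test safe even if a MAD without a destination turns out to be possible.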
Re: [Nouveau] [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann
<tobias.johannes.klausm...@mni.thm.de> wrote:
> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausm...@mni.thm.de>
> ---
> V2: beat me, whip me, split out F64
>
>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 81 ++++++++++++++++++++++
>  1 file changed, 81 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 9a0bb60..741c74f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
>        i->op = OP_MOV;
>        break;
>     }
> +   case OP_CVT: {
> +      Storage res;
> +      bld.setPosition(i, true); /* make sure bld is init'ed */
> +      switch(i->dType) {
> +      case TYPE_U16:
> +         switch (i->sType) {
> +         case TYPE_F32:
> +            if (i->saturate)
> +               res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
> +                                                UINT16_MAX));

Where did this saturate stuff come from? It doesn't make sense to
saturate to a non-float dtype. I'd go ahead and just
assert(!i->saturate) in the int dtype cases.

One does wonder what the hw does if the float doesn't fit in the
destination... whether it saturates or not. I don't hugely care
though.

> +            else
> +               res.data.u16 = util_iround(imm0.reg.data.f32);
> +            break;
> +         default:
> +            return;
> +         }
> +         i->setSrc(0, bld.mkImm(res.data.u16));
> +         break;
> +      case TYPE_U32:
> +         switch (i->sType) {
> +         case TYPE_F32:
> +            if (i->saturate)
> +               res.data.u32 = util_iround(CLAMP(imm0.reg.data.f32, 0,
> +                                                UINT32_MAX));
> +            else
> +               res.data.u32 = util_iround(imm0.reg.data.f32);
> +            break;
> +         default:
> +            return;
> +         }
> +         i->setSrc(0, bld.mkImm(res.data.u32));
> +         break;
> +      case TYPE_S16:
> +         switch (i->sType) {
> +         case TYPE_F32:
> +            if (i->saturate)
> +               res.data.s16 = util_iround(CLAMP(imm0.reg.data.f32, INT16_MIN,
> +                                                INT16_MAX));
> +            else
> +               res.data.s16 = util_iround(imm0.reg.data.f32);
> +            break;
> +         default:
> +            return;
> +         }
> +         i->setSrc(0, bld.mkImm(res.data.s16));
> +         break;
> +      case TYPE_S32:
> +         switch (i->sType) {
> +         case TYPE_F32:
> +            if (i->saturate)
> +               res.data.s32 = util_iround(CLAMP(imm0.reg.data.f32, INT32_MIN,
> +                                                INT32_MAX));
> +            else
> +               res.data.s32 = util_iround(imm0.reg.data.f32);
> +            break;
> +         default:
> +            return;
> +         }
> +         i->setSrc(0, bld.mkImm(res.data.s32));
> +         break;
> +      case TYPE_F32:
> +         switch (i->sType) {
> +         case TYPE_U16: res.data.f32 = (float) imm0.reg.data.u16; break;
> +         case TYPE_U32: res.data.f32 = (float) imm0.reg.data.u32; break;
> +         case TYPE_S16: res.data.f32 = (float) imm0.reg.data.s16; break;
> +         case TYPE_S32: res.data.f32 = (float) imm0.reg.data.s32; break;
> +         default:
> +            return;
> +         }
> +         i->setSrc(0, bld.mkImm(res.data.f32));
> +         break;
> +      default:
> +         return;
> +      }
> +      i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
> +      i->setSrc(1, NULL);

How can src(1) be set? OP_CVT only has the one arg...

> +      i->op = OP_MOV;
> +
> +      i->src(0).mod = Modifier(0); /* Clear the already applied modifier */
> +      break;
> +   }
>     default:
>        return;
>     }
> --
> 2.2.1
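
For the F32 to U16 case discussed above, a host-side fold could be sketched as follows. This is an illustration under assumptions, not the patch's code: foldF32ToU16 is a hypothetical helper, std::lround stands in for util_iround, and the clamp is there only because converting an out-of-range float to an integer is undefined behaviour in C++. It makes no claim about what the hardware does on overflow.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper modelling the F32 -> U16 constant fold.
inline uint16_t foldF32ToU16(float value, bool saturate)
{
   // Suggested in the review: saturate is not meaningful for int dtypes.
   assert(!saturate);
   (void)saturate;

   if (std::isnan(value))
      return 0;  // arbitrary choice for this sketch

   // Clamp on the host so the conversion below is always well defined.
   float clamped = std::fmin(std::fmax(value, 0.0f), 65535.0f);
   return static_cast<uint16_t>(std::lround(clamped));
}
```

Whether the folded result should match a clamped or a wrapped hardware result is exactly the open question raised in the review; the sketch only picks a well-defined behaviour for the compiler side.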
Re: [Nouveau] [RFC] mesa/st: Avoid passing a NULL buffer to the drivers
Can you elaborate a bit as to why that's the right thing to do?

On Wed, Jan 7, 2015 at 1:52 PM, Tobias Klausmann
<tobias.johannes.klausm...@mni.thm.de> wrote:
> If we capture transform feedback from n stream in (n-1) buffers we face a
> NULL buffer, use the buffer (n-1) to capture the output of stream n.
>
> This fixes one piglit test with nvc0:
>    arb_gpu_shader5-xfb-streams-without-invocations
>
> Signed-off-by: Tobias Klausmann <tobias.johannes.klausm...@mni.thm.de>
> ---
>  src/mesa/state_tracker/st_cb_xformfb.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c
> index 8f75eda..5a12da4 100644
> --- a/src/mesa/state_tracker/st_cb_xformfb.c
> +++ b/src/mesa/state_tracker/st_cb_xformfb.c
> @@ -123,6 +123,11 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
>        struct st_buffer_object *bo = st_buffer_object(sobj->base.Buffers[i]);
>
>        if (bo) {
> +         if (!bo->buffer)
> +            /* If we capture transform feedback from n streams into (n-1)
> +             * buffers we have to write to buffer (n-1) for stream n.
> +             */
> +            bo = st_buffer_object(sobj->base.Buffers[i-1]);
>           /* Check whether we need to recreate the target. */
>           if (!sobj->targets[i] ||
>               sobj->targets[i] == sobj->draw_count ||
> --
> 2.2.1
Re: [Nouveau] [PATCH 3/3] nv50/ir: Fold IMM into MAD
On Sat, Jan 10, 2015 at 7:23 PM, Roy Spliet <rspl...@eclipso.eu> wrote:
> Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a
> MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done
> post-RA because it is required that SDST == SSRC2.

because it requires that

>
> Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
> ---
>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 52 ++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 21d20ca..1fc3ae6 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2259,6 +2259,56 @@ FlatteningPass::tryPredicateConditional(BasicBlock *bb)
>
>  // =============================================================================
>
> +// Fold Immediate into MAD; must be done after register allocation due to
> +// constraint SDST == SSRC2
> +// TODO:
> +// Does NVC0+ have other situations where this pass makes sense?
> +class NV50PostRaConstantFolding : public Pass
> +{
> +private:
> +   virtual bool visit(BasicBlock *);
> +};
> +
> +bool
> +NV50PostRaConstantFolding::visit(BasicBlock *bb)
> +{
> +   Value *vtmp;
> +   Instruction *def;
> +
> +   for (Instruction *i = bb->getFirst(); i; i = i->next) {
> +      switch (i->op) {
> +      case OP_MAD:
> +         if(i->def(0).getFile() == FILE_GPR &&
> +            i->src(0).getFile() == FILE_GPR &&
> +            i->src(1).getFile() == FILE_GPR &&
> +            i->src(2).getFile() == FILE_GPR &&
> +            i->getDef(0)->reg.data.id == i->getSrc(2)->reg.data.id) {

This would be much easier to read as

if (... != GPR || != GPR || ...)
  break; (or continue...)

> +            for (int s = 1; s >= 0; s--) {

You don't end up using 's' in the loop. Did you mean to have some
clever logic that flips the order of src0 and src1 in case the wrong
one came from an immediate?

> +               def = i->getSrc(1)->getInsn();
> +               if (def->op == OP_MOV && def->src(0).getFile() == FILE_IMMEDIATE) {
> +                  vtmp = i->getSrc(1);
> +                  i->setSrc(1, def->getSrc(0));
> +                  if (vtmp->refCount() == 0)
> +                     delete_Instruction(bb->getProgram(), def);

This shouldn't be necessary, it's all allocated in an arena and will
get cleaned up later.

> +                  break;
> +               }
> +
> +               vtmp = i->getSrc(0);
> +               i->setSrc(0, i->getSrc(1));
> +               i->setSrc(1, vtmp);
> +            }
> +         }
> +         break;
> +      default:
> +         break;
> +      }
> +   }
> +
> +   return true;
> +}
> +
> +// =============================================================================
> +
>  // Common subexpression elimination. Stupid O^2 implementation.
>  class LocalCSE : public Pass
>  {
> @@ -2629,6 +2679,8 @@ bool
>  Program::optimizePostRA(int level)
>  {
>     RUN_PASS(2, FlatteningPass, run);
> +   if (getTarget()->getChipset() < 0xc0)
> +      RUN_PASS(2, NV50PostRaConstantFolding, run);
>
>     return true;
>  }
> --
> 2.1.0
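
The two structural points from the review (early exit instead of a nested if, and a loop that actually uses 's') can be sketched on a toy IR. Everything below is hypothetical: ToyKind, ToyValue, ToyMad and the helper functions are not Mesa types, and the sketch assumes the immediate has to end up in src1, as the emitter check in patch 1/3 suggests.

```cpp
#include <cstdint>
#include <utility>

// Toy operand: a plain GPR, a GPR defined by "MOV dst, IMM",
// or (after folding) the immediate itself.
enum ToyKind { TOY_GPR, TOY_GPR_FROM_IMM_MOV, TOY_IMMEDIATE };

struct ToyValue {
   ToyKind kind;
   uint32_t imm;  // meaningful unless kind == TOY_GPR
};

struct ToyMad {
   ToyValue src[3];
   bool dstIsSrc2;  // models the post-RA constraint SDST == SSRC2
};

inline ToyValue toyGpr() { ToyValue v = { TOY_GPR, 0 }; return v; }
inline ToyValue toyImmMov(uint32_t imm)
{
   ToyValue v = { TOY_GPR_FROM_IMM_MOV, imm };
   return v;
}
inline ToyMad makeMad(ToyValue a, ToyValue b, ToyValue c, bool dstIsSrc2)
{
   ToyMad mad = { { a, b, c }, dstIsSrc2 };
   return mad;
}

// Returns a copy of `mad` with an immediate folded into src1 when possible.
inline ToyMad foldImmIntoMad(ToyMad mad)
{
   if (!mad.dstIsSrc2)
      return mad;  // early exit keeps the main path unindented

   for (int s = 1; s >= 0; s--) {
      if (mad.src[s].kind != TOY_GPR_FROM_IMM_MOV)
         continue;
      if (s == 0)
         std::swap(mad.src[0], mad.src[1]);  // the multiply operands commute
      mad.src[1].kind = TOY_IMMEDIATE;
      break;
   }
   return mad;
}
```

Because the swap only happens when the fold succeeds, operands are left in their original order whenever nothing can be folded.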
Re: [Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
On 11-01-15 at 01:34, Ilia Mirkin wrote:
> And you're allowing saturate/neg emission on the short form.

Yes

> Is this already in envytools?

Tesla floating point instructions are poorly documented in the RST
documents; fmad is no exception. I'll make sure to check envydis.

> Also, what's the shortForm thing?

Documented in envytools; see
http://envytools.readthedocs.org/en/latest/hw/graph/tesla/cuda/isa.html#instruction-format
. In short, opcodes are either 4 bytes (short) or 8 bytes (long).

> This change is probably fine, but the changelog needs work.

If you insist I could elaborate a little further. However, documenting
what a short opcode is seems a bit superfluous.

> On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspl...@eclipso.eu> wrote:
>> MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
>>
>> Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
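
The 4-byte versus 8-byte distinction, and the one short-MAD constraint the patch leaves in getMinEncodingSize(), can be illustrated with a toy model. ToyMadRegs and minMadEncodingSize are hypothetical names, and the sketch deliberately ignores every other constraint the real function may apply.

```cpp
// Toy model of a MAD after register allocation (hypothetical type).
struct ToyMadRegs {
   bool hasDef;       // models i->defExists(0)
   unsigned defReg;   // models i->def(0).rep()->reg.data.id
   unsigned src2Reg;  // models i->src(2).rep()->reg.data.id
};

// Returns the smallest encoding size in bytes under this toy model:
// 4 (short) only when the destination register is also the addend.
inline unsigned minMadEncodingSize(const ToyMadRegs &r)
{
   if (!r.hasDef || r.defReg != r.src2Reg)
      return 8;  // long form required
   return 4;     // short form possible, SDST == SSRC2
}
```

The modifier and saturate checks that the patch removes are absent here on purpose, since the patch moves that information into the short encoding itself.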
Re: [Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
And you're allowing saturate/neg emission on the short form. Is this
already in envytools? Also, what's the shortForm thing?

This change is probably fine, but the changelog needs work.

On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspl...@eclipso.eu> wrote:
> MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
>
> Signed-off-by: Roy Spliet <rspl...@eclipso.eu>
Re: [Nouveau] [PATCH 1/3] nv50/ir: Add support for MAD short+IMM notation
On Sat, Jan 10, 2015 at 7:45 PM, Roy Spliet <se...@nimrod-online.com> wrote:
> On 11-01-15 at 01:34, Ilia Mirkin wrote:
>>
>> And you're allowing saturate/neg emission on the short form.
>
> Yes
>
>> Is this already in envytools?
>
> Tesla floating point instructions are poorly documented in the RST
> documents; fmad is no exception. I'll make sure to check envydis.

Sorry, I meant envydis

>> Also, what's the shortForm thing?
>
> Documented in envytools; see
> http://envytools.readthedocs.org/en/latest/hw/graph/tesla/cuda/isa.html#instruction-format
> . In short, opcodes are either 4 bytes (short) or 8 bytes (long).

Yes, I'm aware of that bit :)

>> This change is probably fine, but the changelog needs work.
>
> If you insist I could elaborate a little further. However, documenting
> what a short opcode is seems a bit superfluous.

I meant what was the reason for the change to the shortForm array in
target_nv50? I don't remember offhand what it is, and you were doing a
bunch of things in here and I wasn't sure which of them it was related
to.

>> On Sat, Jan 10, 2015 at 7:22 PM, Roy Spliet <rspl...@eclipso.eu> wrote:
>>>
>>> MAD IMM has a very specific SDST == SSRC2 requirement, so don't emit
>>>
>>> Signed-off-by: Roy Spliet <rspl...@eclipso.eu>