Re: [Mesa-dev] [RFC] i965/dbg: Expose cases hitting a presumably dead optimization
On Sat, Mar 12, 2016 at 08:44:54AM -0800, Jason Ekstrand wrote:
> On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" <topi.pohjolai...@intel.com> wrote:
>>
>> On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote:
>>> On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen
>>> <topi.pohjolai...@intel.com> wrote:
>>>
>>>> The logic iterates over param[] which contains pointers to
>>>> uniform storage set during linking (see
>>>> link_assign_uniform_locations()).
>>>> The pointers are unique and it should be impossible to find
>>>> matching entries.
>>>> I couldn't find any regressions with the test system. In addition
>>>> I tried several benchmarks on HSW and none hit this.
>>>> I'm hoping to remove this optimization attempt. This is the only
>>>> bit that depends on knowing about the actual storage during
>>>> compilation. All the rest deal with just relative push and pull
>>>> locations once the actual filling of pull_param[] is moved
>>>> outside the compiler just as param[]. (Filling pull_param is
>>>> based on the pull locations and doesn't need to be inside the
>>>> compiler).
>>>> Any thoughts?
>>>
>>> I'm not 100% sure what you're trying to do, but I have a branch that
>>> may be of interest:
>>>
>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniforms
>>>
>>> The branch enables support for pushing small uniform arrays. Among
>>> other things, it redoes the way we do push constants and gets rid of
>>> some of the data tracking in the backend compiler. The big reason why
>>> I haven't tried too hard to get it merged is because it regresses Sandy
>>> Bridge just a bit. I know I've seen and fixed the bug before in an
>>> alternate attempt, but I don't remember how.
>>> I'm going to be refreshing it soon because we need indirect push
>>> constants for the Vulkan driver. (The branch is already merged into
>>> the Vulkan branch.)
>>
>> I'd like to stop filling param[] before compilation. This is really not
>> needed by the compiler as it deals with pull and push constant locations,
>> i.e., positions in the push and pull files. Actual uniform values and their
>> location in the uniform storage are not needed until actual pipeline upload.
>>
>> My plan is to move the iteration over the core uniform storage to pipeline
>> upload time. We can fill push and pull buffers directly without the need of
>> storing pointers to param[] in the middle. Not only does this make things
>> simpler and more flexible in my mind, it also gives us the possibility to
>> upload floats with 16-bit precision instead of 32 bits. Current upload logic
>> only gets pointers to 32-bit values without knowing if they should be
>> represented with 16 bits, let alone whether the values are floats or
>> integers to begin with.
>
> Right. Kristian and I have talked about some related things that we
> need for pipeline caching and the Vulkan driver. In Vulkan, they
> aren't actual pointers at all but are, instead, offsets into a push
> constant block. Fortunately, the back-end compiler never dereferences
> them so you can shove whatever you want in there and it's OK. We've
> talked about turning the pull and push params into just a set of
> integers that means whatever the api and state setup code want. One of
> the problems with pointers is that you can't easily put them into an
> on-disk shader cache (which we have for Vulkan).
>
> When you talk about 16 or 64-bit values, what is your intention? Are
> 64-bit values still going to take up two slots or are they now one
> 64-bit slot? Are there two 16-bit values per slot or just one? Are
> 16-bit uniforms converted before they get uploaded or consumed directly
> by the shader? I'm still a little confused as to what problem you're
> trying to solve.

I'm seeing the 16-bit float as two-fold. First, the uniform storage always
represents them as normal 32-bit floats for the gl-api to work correctly
(even if they are marked as low/mediump, I don't think the api for setting
and querying them is allowed to operate with reduced precision. On the other
hand, such conversion back and forth in the core gl-api doesn't sound
appealing at all just from an implementation point of view). Therefore, after
the compiler has chosen to represent a particular uniform with reduced
precision and set the operand types accordingly, the upload logic has to
convert the 32-bit float into the equivalent 16-bit value before uploading.
Second is the question on how to pack the 16-bit values. I'm seeing this as
second step
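
For illustration only, the upload-time conversion described above (32-bit float in uniform storage, 16-bit value written to the push/pull buffer) could look roughly like the following sketch. The name `float_to_half` is made up here and this is not the i965 upload code; it assumes IEEE 754 binary16 as the target format and round-to-nearest-even, neither of which is stated in the thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert an IEEE 754 binary32 value to binary16 bits,
 * rounding to nearest even. Not taken from Mesa. */
static uint16_t float_to_half(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);          /* type-pun without aliasing issues */

    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t exp  = (bits >> 23) & 0xffu;
    uint32_t mant = bits & 0x7fffffu;

    if (exp == 0xffu)                         /* infinity or NaN */
        return (uint16_t)(sign | 0x7c00u | (mant ? 0x200u : 0u));

    int e = (int)exp - 127 + 15;              /* re-bias the exponent */

    if (e >= 31)                              /* too large: becomes infinity */
        return (uint16_t)(sign | 0x7c00u);

    if (e <= 0) {                             /* result is a half subnormal or zero */
        if (e < -10)
            return (uint16_t)sign;            /* below half of the smallest subnormal */
        mant |= 0x800000u;                    /* make the implicit leading 1 explicit */
        unsigned shift = (unsigned)(14 - e);  /* 14..24 */
        uint32_t half_mant = mant >> shift;
        uint32_t rem = mant & ((1u << shift) - 1u);
        uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half_mant & 1u)))
            half_mant++;
        return (uint16_t)(sign | half_mant);
    }

    /* Normal case; a carry out of the mantissa correctly bumps the exponent. */
    uint32_t half = ((uint32_t)e << 10) | (mant >> 13);
    uint32_t rem = mant & 0x1fffu;
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
        half++;
    return (uint16_t)(sign | half);
}
```

How two such 16-bit values would be packed into one 32-bit slot is the separate "second step" mentioned above and is deliberately left out of this sketch.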
Re: [Mesa-dev] [PATCH] softpipe: fix anisotropic filtering crash
On 03/13/2016 06:53 AM, srol...@vmware.com wrote:
> From: Roland Scheidegger
>
> The filt_args->offset wasn't assigned but was always used later leading
> to a crash (as far as I can tell, texel offsets don't actually make much
> sense with anisotropic filtering, but because there's no explicit setting
> if offsets are enabled there the array is always accessed).
>
> This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481
>
> CC:
> ---
>  src/gallium/drivers/softpipe/sp_tex_sample.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
> index e3e28a3..5e3d47b 100644
> --- a/src/gallium/drivers/softpipe/sp_tex_sample.c
> +++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
> @@ -2209,6 +2209,7 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
>                    const float t[TGSI_QUAD_SIZE],
>                    const float p[TGSI_QUAD_SIZE],
>                    const uint faces[TGSI_QUAD_SIZE],
> +                  const int8_t *offset,
>                    unsigned level,
>                    const float dudx, const float dvdx,
>                    const float dudy, const float dvdy,
> @@ -2268,6 +2269,8 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
>     /* F *= formScale; */ /* no need to scale F as we don't use it below here */
>     args.level = level;
> +   args.offset = offset;
> +
>     for (j = 0; j < TGSI_QUAD_SIZE; j++) {
>        /* Heckbert MS thesis, p. 59; scan over the bounding box of the ellipse
>         * and incrementally update the value of Ax^2+Bxy*Cy^2; when this
> @@ -2431,6 +2434,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view *sp_sview,
>     const float dvdy = (t[QUAD_TOP_LEFT] - t[QUAD_BOTTOM_LEFT]) * t_to_v;
>     struct img_filter_args args;
> +   args.offset = filt_args->offset;
> +
>     if (filt_args->control == TGSI_SAMPLER_LOD_BIAS ||
>         filt_args->control == TGSI_SAMPLER_LOD_NONE ||
>         /* XXX FIXME */
> @@ -2495,6 +2500,11 @@ mip_filter_linear_aniso(const struct sp_sampler_view *sp_sview,
>           args.p = p[j];
>           args.level = psview->u.tex.last_level;
>           args.face_id = filt_args->faces[j];
> +         /*
> +          * XXX: we overwrote any linear filter with nearest, so this
> +          * isn't right (albeit if last level is 1x1 and no border it
> +          * will work just the same).
> +          */
>           min_filter(sp_sview, sp_samp, &args, &rgba[0][j]);
>        }

Patch looks right but this comment seems unrelated to it. If that's the
case then perhaps it should be moved out to a patch of its own.

Other than that:

Reviewed-by: Eduardo Lima Mitev

Thanks.

Eduardo

>     }
> @@ -2503,8 +2513,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view *sp_sview,
>         * seem to be worth the extra running time.
>         */
>        img_filter_2d_ewa(sp_sview, sp_samp, min_filter, mag_filter,
> -                        s, t, p, filt_args->faces, level0,
> -                        dudx, dvdx, dudy, dvdy, rgba);
> +                        s, t, p, filt_args->faces, filt_args->offset,
> +                        level0, dudx, dvdx, dudy, dvdy, rgba);
>     }
>     if (DEBUG_TEX) {

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] softpipe: fix anisotropic filtering crash
From: Roland Scheidegger

The filt_args->offset wasn't assigned but was always used later leading
to a crash (as far as I can tell, texel offsets don't actually make much
sense with anisotropic filtering, but because there's no explicit setting
if offsets are enabled there the array is always accessed).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481

CC:
---
 src/gallium/drivers/softpipe/sp_tex_sample.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_tex_sample.c b/src/gallium/drivers/softpipe/sp_tex_sample.c
index e3e28a3..5e3d47b 100644
--- a/src/gallium/drivers/softpipe/sp_tex_sample.c
+++ b/src/gallium/drivers/softpipe/sp_tex_sample.c
@@ -2209,6 +2209,7 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
                   const float t[TGSI_QUAD_SIZE],
                   const float p[TGSI_QUAD_SIZE],
                   const uint faces[TGSI_QUAD_SIZE],
+                  const int8_t *offset,
                   unsigned level,
                   const float dudx, const float dvdx,
                   const float dudy, const float dvdy,
@@ -2268,6 +2269,8 @@ img_filter_2d_ewa(const struct sp_sampler_view *sp_sview,
    /* F *= formScale; */ /* no need to scale F as we don't use it below here */
    args.level = level;
+   args.offset = offset;
+
    for (j = 0; j < TGSI_QUAD_SIZE; j++) {
       /* Heckbert MS thesis, p. 59; scan over the bounding box of the ellipse
        * and incrementally update the value of Ax^2+Bxy*Cy^2; when this
@@ -2431,6 +2434,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view *sp_sview,
    const float dvdy = (t[QUAD_TOP_LEFT] - t[QUAD_BOTTOM_LEFT]) * t_to_v;
    struct img_filter_args args;
+   args.offset = filt_args->offset;
+
    if (filt_args->control == TGSI_SAMPLER_LOD_BIAS ||
        filt_args->control == TGSI_SAMPLER_LOD_NONE ||
        /* XXX FIXME */
@@ -2495,6 +2500,11 @@ mip_filter_linear_aniso(const struct sp_sampler_view *sp_sview,
          args.p = p[j];
          args.level = psview->u.tex.last_level;
          args.face_id = filt_args->faces[j];
+         /*
+          * XXX: we overwrote any linear filter with nearest, so this
+          * isn't right (albeit if last level is 1x1 and no border it
+          * will work just the same).
+          */
          min_filter(sp_sview, sp_samp, &args, &rgba[0][j]);
       }
    }
@@ -2503,8 +2513,8 @@ mip_filter_linear_aniso(const struct sp_sampler_view *sp_sview,
        * seem to be worth the extra running time.
        */
       img_filter_2d_ewa(sp_sview, sp_samp, min_filter, mag_filter,
-                        s, t, p, filt_args->faces, level0,
-                        dudx, dvdx, dudy, dvdy, rgba);
+                        s, t, p, filt_args->faces, filt_args->offset,
+                        level0, dudx, dvdx, dudy, dvdy, rgba);
    }
    if (DEBUG_TEX) {
-- 
2.1.4
[Mesa-dev] [Bug 94503] OpenCL segfaults during compilation
https://bugs.freedesktop.org/show_bug.cgi?id=94503

--- Comment #3 from Tyson Whitehead ---
Created attachment 122262
  --> https://bugs.freedesktop.org/attachment.cgi?id=122262&action=edit
Simplified kernel that causes other (different) compiler segfault
[Mesa-dev] [Bug 94503] OpenCL segfaults during compilation
https://bugs.freedesktop.org/show_bug.cgi?id=94503

--- Comment #2 from Tyson Whitehead ---
Thanks for the heads-up Matt.

I rebuilt the Debian package of mesa 11.2.0-rc3 against the Debian package
of llvm 3.9~svn262954 and am pleased to say the simplified kernel I provided
also now compiles for me.

Unfortunately the full set of my OpenCL code is still causing a segfault.
Pruning code reveals it is a different kernel though, and the backtrace is
entirely different too, so progress is being made!

I'm attaching a simplified version of this next kernel function. I would
appreciate it if you could give it a go on your setup and see if it is
segfaulting for you as well.

Program received signal SIGSEGV, Segmentation fault.
0x73bad2a0 in (anonymous namespace)::JoinVals::pruneValues (this=this@entry=0x7fffb8a0, Other=..., EndPoints=..., changeInstrs=changeInstrs@entry=false)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2388
#0  0x73bad2a0 in (anonymous namespace)::JoinVals::pruneValues (this=this@entry=0x7fffb8a0, Other=..., EndPoints=..., changeInstrs=changeInstrs@entry=false)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2388
#1  0x73bb38da in (anonymous namespace)::RegisterCoalescer::joinSubRegRanges (this=0x2386a50, this=0x2386a50, CP=..., LaneMask=8, RRange=..., LRange=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2569
#2  (anonymous namespace)::RegisterCoalescer::mergeSubRangeInto (this=this@entry=0x2386a50, LI=..., ToMerge=..., LaneMask=8, CP=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2622
#3  0x73bb4a31 in (anonymous namespace)::RegisterCoalescer::joinVirtRegs (this=this@entry=0x2386a50, CP=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2688
#4  0x73bb54a0 in (anonymous namespace)::RegisterCoalescer::joinIntervals (CP=..., this=0x2386a50)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2734
#5  (anonymous namespace)::RegisterCoalescer::joinCopy (Again=, CopyMI=0xb991b0, this=0x2386a50)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:1449
#6  (anonymous namespace)::RegisterCoalescer::copyCoalesceWorkList (this=this@entry=0x2386a50, CurrList=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2805
#7  0x73bb70bb in (anonymous namespace)::RegisterCoalescer::coalesceLocals (this=this@entry=0x2386a50)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2930
#8  0x73bb7da8 in (anonymous namespace)::RegisterCoalescer::joinAllIntervals (this=0x2386a50)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:2956
#9  (anonymous namespace)::RegisterCoalescer::runOnMachineFunction (this=0x2386a50, fn=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/CodeGen/RegisterCoalescer.cpp:3006
#10 0x739cc752 in llvm::FPPassManager::runOnFunction (this=0x238d260, F=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1550
#11 0x739cca8b in llvm::FPPassManager::runOnModule (this=0x238d260, M=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1571
#12 0x739cc3cf in (anonymous namespace)::MPPassManager::runOnModule (M=..., this=0x238cfd0)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1627
#13 llvm::legacy::PassManagerImpl::run (this=0xa96580, M=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1730
#14 0x739cc569 in llvm::legacy::PassManager::run (this=this@entry=0x7fffc6a0, M=...)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/IR/LegacyPassManager.cpp:1761
#15 0x74507ef7 in LLVMTargetMachineEmit (T=T@entry=0x239f8a0, M=M@entry=0xb42a60, OS=..., codegen=codegen@entry=LLVMObjectFile, ErrorMessage=ErrorMessage@entry=0x7fffc948)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/Target/TargetMachineC.cpp:206
#16 0x74508219 in LLVMTargetMachineEmitToMemoryBuffer (T=T@entry=0x239f8a0, M=M@entry=0xb42a60, codegen=codegen@entry=LLVMObjectFile, ErrorMessage=ErrorMessage@entry=0x7fffc948, OutMemBuf=OutMemBuf@entry=0x7fffcae8)
    at /tmp/buildd/llvm-toolchain-snapshot-3.9~svn262954/lib/Target/TargetMachineC.cpp:230
#17 0x76605584 in (anonymous namespace)::emit_code (tm=tm@entry=0x239f8a0, mod=mod@entry=0xb42a60, file_type=file_type@entry=LLVMObjectFile, out_buffer=out_buffer@entry=0x7fffcae8, r_log="test2.c:36:31: warning: double precision constant requires
[Mesa-dev] [PATCH 2/3] nv50/ir: avoid folding mul + add if the mul has a dnz
Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 6192c06..66e7b2e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1635,11 +1635,10 @@ AlgebraicOpt::tryADDToMADOrSAD(Instruction *add, operation toOp)
    if (src->getUniqueInsn() && src->getUniqueInsn()->bb != add->bb)
       return false;
-   if (src->getInsn()->saturate)
+   if (src->getInsn()->saturate || src->getInsn()->postFactor ||
+       src->getInsn()->dnz)
       return false;
-   if (src->getInsn()->postFactor)
-      return false;
    if (toOp == OP_SAD) {
       ImmediateValue imm;
       if (!src->getInsn()->src(2).getImmediate(imm))
-- 
2.4.10
[Mesa-dev] [PATCH 3/3] nv50, nvc0: handle SQRT lowering inside the driver
First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to
find out whether the input is less than 0). Secondly the current
approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced
instead of inf.

Instead we switch to the less accurate rcp(rsq(x)) method - this behaves
nicely for all valid inputs. We still don't do this for DSQRT since the
RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson
steps right now. Eventually we should have a separate library function
for DSQRT that does it more precisely (and perhaps move this lowering to
the post-opt phase).

This fixes a number of dEQP precision tests that were expecting better
behavior for infinite inputs.

Signed-off-by: Ilia Mirkin
---
 .../drivers/nouveau/codegen/nv50_ir_build_util.cpp |  6 +++-
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  2 ++
 .../nouveau/codegen/nv50_ir_lowering_nv50.cpp      |  7 ++---
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp      | 32 +++---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c     |  2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |  2 +-
 6 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
index f58cf97..84ebfdb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_build_util.cpp
@@ -585,6 +585,7 @@ BuildUtil::split64BitOpPostRA(Function *fn, Instruction *i,
          return NULL;
       srcNr = 2;
       break;
+   case OP_SELP: srcNr = 3; break;
    default:
       // TODO when needed
       return NULL;
@@ -601,7 +602,10 @@ BuildUtil::split64BitOpPostRA(Function *fn, Instruction *i,
    for (int s = 0; s < srcNr; ++s) {
       if (lo->getSrc(s)->reg.size < 8) {
-         hi->setSrc(s, zero);
+         if (s == 2)
+            hi->setSrc(s, lo->getSrc(s));
+         else
+            hi->setSrc(s, zero);
       } else {
          if (lo->getSrc(s)->refCount() > 1)
             lo->setSrc(s, cloneShallow(fn, lo->getSrc(s)));
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index b06d86a..d284446 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -616,6 +616,7 @@ static nv50_ir::operation translateOpcode(uint opcode)
    NV50_IR_OPCODE_CASE(RCP, RCP);
    NV50_IR_OPCODE_CASE(RSQ, RSQ);
+   NV50_IR_OPCODE_CASE(SQRT, SQRT);
    NV50_IR_OPCODE_CASE(MUL, MUL);
    NV50_IR_OPCODE_CASE(ADD, ADD);
@@ -2689,6 +2690,7 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
    case TGSI_OPCODE_FLR:
    case TGSI_OPCODE_TRUNC:
    case TGSI_OPCODE_RCP:
+   case TGSI_OPCODE_SQRT:
    case TGSI_OPCODE_IABS:
    case TGSI_OPCODE_INEG:
    case TGSI_OPCODE_NOT:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index 8752b0c..12c5f69 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
@@ -1203,10 +1203,9 @@ NV50LoweringPreSSA::handleDIV(Instruction *i)
 bool
 NV50LoweringPreSSA::handleSQRT(Instruction *i)
 {
-   Instruction *rsq = bld.mkOp1(OP_RSQ, TYPE_F32,
-                                bld.getSSA(), i->getSrc(0));
-   i->op = OP_MUL;
-   i->setSrc(1, rsq->getDef(0));
+   bld.setPosition(i, true);
+   i->op = OP_RSQ;
+   bld.mkOp1(OP_RCP, i->dType, i->getDef(0), i->getDef(0));
    return true;
 }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index d181f15..29b77c9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1778,22 +1778,22 @@ NVC0LoweringPass::handleMOD(Instruction *i)
 bool
 NVC0LoweringPass::handleSQRT(Instruction *i)
 {
-   Value *pred = bld.getSSA(1, FILE_PREDICATE);
-   Value *zero = bld.getSSA();
-   Instruction *rsq;
-
-   bld.mkOp1(OP_MOV, TYPE_U32, zero, bld.mkImm(0));
-   if (i->dType == TYPE_F64)
-      zero = bld.mkOp2v(OP_MERGE, TYPE_U64, bld.getSSA(8), zero, zero);
-   bld.mkCmp(OP_SET, CC_LE, i->dType, pred, i->dType, i->getSrc(0), zero);
-   bld.mkOp1(OP_MOV, i->dType, i->getDef(0), zero)->setPredicate(CC_P, pred);
-   rsq = bld.mkOp1(OP_RSQ, i->dType,
-                   bld.getSSA(typeSizeof(i->dType)), i->getSrc(0));
-   rsq->setPredicate(CC_NOT_P, pred);
-   i->op = OP_MUL;
-   i->setSrc(1, rsq->getDef(0));
-   i->setPredicate(CC_NOT_P, pred);
-
+   if (i->dType == TYPE_F64) {
+      Value *pred = bld.getSSA(1, FILE_PREDICATE);
+      Value *zero = bld.loadImm(NULL, 0.0d);
+      Value *dst = bld.getSSA(8);
+      Instruction *mov, *rsq;
+
[Mesa-dev] [PATCH 1/3] nvc0: fix blit triangle size to fully cover FB's > 8192x8192
The idea is that a single triangle will cover the whole area being
drawn, allowing the blit shader to do its work. However the max fb size
is 16384x16384, which means that the triangle we draw needs to be twice
that in order to cover the whole area fully. Increase the size of the
triangle to 32768x32768.

This fixes a number of dEQP tests that were failing because a blit was
involved which would miss some of the resulting texture.

Signed-off-by: Ilia Mirkin
Cc: "11.1 11.2"
---
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index ccfc9e2..f2ad4bf 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -1215,8 +1215,8 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
    x0 = (float)info->src.box.x - x_range * (float)info->dst.box.x;
    y0 = (float)info->src.box.y - y_range * (float)info->dst.box.y;
-   x1 = x0 + 16384.0f * x_range;
-   y1 = y0 + 16384.0f * y_range;
+   x1 = x0 + 32768.0f * x_range;
+   y1 = y0 + 32768.0f * y_range;
    x0 *= (float)(1 << nv50_miptree(src)->ms_x);
    x1 *= (float)(1 << nv50_miptree(src)->ms_x);
@@ -1327,14 +1327,14 @@ nvc0_blit_3d(struct nvc0_context *nvc0, const struct pipe_blit_info *info)
    *(vbuf++) = fui(y0);
    *(vbuf++) = fui(z);
-   *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_x);
+   *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_x);
    *(vbuf++) = fui(0.0f);
    *(vbuf++) = fui(x1);
    *(vbuf++) = fui(y0);
    *(vbuf++) = fui(z);
    *(vbuf++) = fui(0.0f);
-   *(vbuf++) = fui(16384 << nv50_miptree(dst)->ms_y);
+   *(vbuf++) = fui(32768 << nv50_miptree(dst)->ms_y);
    *(vbuf++) = fui(x0);
    *(vbuf++) = fui(y1);
    *(vbuf++) = fui(z);
-- 
2.4.10
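
The "twice the framebuffer size" argument can be checked with a little geometry. The sketch below assumes the blit triangle is a right triangle with its right angle at the origin and both legs of the same length along the axes (this matches the vertex data in the diff, but the helper name and the simplification to unscaled integer coordinates are illustrative assumptions). A point (x, y) lies inside such a triangle exactly when x + y <= leg, so the far corner of the framebuffer is the worst case.

```c
#include <stdbool.h>

/* Hypothetical helper: does a right triangle with vertices (0,0), (leg,0)
 * and (0,leg) cover the whole fb_width x fb_height rectangle anchored at
 * the origin? Only the far corner (fb_width, fb_height) needs checking. */
static bool triangle_covers_fb(unsigned long leg,
                               unsigned long fb_width,
                               unsigned long fb_height)
{
    return fb_width + fb_height <= leg;
}
```

Under these assumptions a 16384 leg covers framebuffers only up to 8192x8192, which is consistent with the patch subject, while a 32768 leg covers the full 16384x16384 maximum.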
Re: [Mesa-dev] Mesa include guard style. (Was: [PATCH] i965/cfg: Remove redundant #pragma once.)
On 03/11/2016 03:46 PM, Eric Anholt wrote:
> Ian Romanick writes:
>
>> On 03/10/2016 05:53 PM, Francisco Jerez wrote:
>>> Iago Toral writes:
>>>
>>>> On Wed, 2016-03-09 at 19:04 -0800, Francisco Jerez wrote:
>>>>> Matt Turner writes:
>>>>>
>>>>>> On Wed, Mar 9, 2016 at 1:37 PM, Francisco Jerez wrote:
>>>>>>> Iago Toral writes:
>>>>>>>
>>>>>>>> On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
>>>>>>>>> brw_cfg.h already has include guards, remove the "#pragma once" which
>>>>>>>>> is redundant and non-standard.
>>>>>>>>
>>>>>>>> FWIW, I think using both #pragma once and include guards is a way to
>>>>>>>> keep portability while still getting the performance advantage of
>>>>>>>> #pragma once where it is supported.
>>>>>>>
>>>>>>> It's highly unlikely to make any significant difference on any
>>>>>>> reasonably modern compiler. I cannot measure any change in compilation
>>>>>>> time locally from my cleanup.
>>>>>>>
>>>>>>>> Also it seems that we do the same thing in many other files...
>>>>>>>
>>>>>>> Really? I'm not aware of any other file where we use both.
>>>>>>
>>>>>> There are quite a few in glsl/
>>>>>
>>>>> Heh, apparently you're right. Anyway it seems rather pointless to use
>>>>> '#pragma once' in a bunch of scattered header files with the expectation
>>>>> to gain some speed, the improvement from a single header file is so
>>>>> minuscule (if it will make any difference at all on a modern compiler
>>>>> and compilation workload, which I doubt) that we would have to use it
>>>>> universally in order to have the chance to measure any improvement.
>>>>>
>>>>> Can we please just decide for one of the include guard styles and use it
>>>>> consistently? Given that the majority of header files in the Mesa
>>>>> codebase use old-school define guards, that it's the only standard
>>>>> option, that it has well-defined semantics in presence of file copies
>>>>> and hardlinks, and that the performance argument against it is rather
>>>>> dubious (although I definitely find '#pragma once' prettier and more
>>>>> concise), I'd vote for using preprocessor define guards universally.
>>>>>
>>>>> What do other people think?
>>>>
>>>> I think we have to use define guards necessarily since #pragma once is
>>>> not standard even if it has wide support. So the question is whether we
>>>> want to use only define guards or define guards plus #pragma once. I am
>>>> fine with doing only define guards as you propose.
>>>
>>> *Shrug* I have the impression that the only real advantage of '#pragma
>>> once' is that you no longer need to do the ifndef/define dance, so I
>>> don't think I can see much benefit in doing both.
>>
>> Several compilers will cache the file name where '#pragma once' occurs
>> and never read that file again. A #include of a file previously seen
>> with '#pragma once' becomes a no-op. Since the file is never read, the
>> compiler avoids all the I/O and the parsing. That is true of MSVC and,
>> I thought, some versions of GCC. As Iago points out, some compilers
>> ignore the #pragma altogether. Since Mesa supports (or does it?) some
>> of these compilers, we have to have the ifdef/define/endif guards.
>
> Compilers have noticed that ifdef/define/endif is a thing and optimized
> it, anyway.
>
> https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html

That's cool! I don't think GCC did that when I looked into this in
2010. It sounds like the #pragma actually breaks the GCC optimization,
so let's get rid of them all.
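
For readers unfamiliar with the "ifndef/define dance" being discussed, here is a minimal sketch of a define guard. The header contents and the names `EXAMPLE_CFG_H`, `example_block` and `example_block_length` are invented for illustration and are not taken from brw_cfg.h. The guarded text is repeated twice in one file to mimic a header being included twice; without the guard the second copy would be a redefinition error.

```c
/* First inclusion: EXAMPLE_CFG_H is not defined yet, so the body is seen. */
#ifndef EXAMPLE_CFG_H
#define EXAMPLE_CFG_H

struct example_block {
    int start_ip;
    int end_ip;
};

static inline int example_block_length(const struct example_block *b)
{
    return b->end_ip - b->start_ip + 1;
}

#endif /* EXAMPLE_CFG_H */

/* Second inclusion of the same header text: the guard macro is now defined,
 * so the preprocessor skips everything up to the matching #endif. */
#ifndef EXAMPLE_CFG_H
#define EXAMPLE_CFG_H

struct example_block {
    int start_ip;
    int end_ip;
};

#endif /* EXAMPLE_CFG_H */
```

The GCC page linked above describes how the preprocessor recognises a file wrapped in this pattern and avoids re-reading it on later inclusions.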
Re: [Mesa-dev] [PATCH] radeonsi: avoid crash when a sampler state is bound for a buffer texture
On Fri, Mar 11, 2016 at 11:17 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Sampler states don't really make sense with buffer textures, but the PBO
> upload code sets one because apparently nouveau needs it. It would be
> nice to work that out at some point, but in any case being defensive
> here is a good idea.

Sampler states are set in regular GL as well if you have a regular
buffer texture too, no?

>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 9aa4877..f5ad113 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -324,6 +324,7 @@ static void si_bind_sampler_states(struct pipe_context *ctx, unsigned shader,
>                  */
>                 if (samplers->views.views[i] &&
>                     samplers->views.views[i]->texture &&
> +                   samplers->views.views[i]->texture->target != PIPE_BUFFER &&
>                     ((struct r600_texture*)samplers->views.views[i]->texture)->fmask.size)
>                         continue;
>
> --
> 2.5.0
[Mesa-dev] [PATCH 2/3] vc4: Add a helper for NIR->QIR control flow function node
Templated implementation at present until the recently landed NIR function
support is plumbed through.

Signed-off-by: Rhys Kidd
---
 src/gallium/drivers/vc4/vc4_program.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/vc4/vc4_program.c b/src/gallium/drivers/vc4/vc4_program.c
index 4b625a2..b026013 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1686,6 +1686,13 @@ ntq_emit_block(struct vc4_compile *c, nir_block *block)
 }
 static void
+ntq_emit_function(struct vc4_compile *c, nir_function_impl *func)
+{
+        fprintf(stderr, "FUNCTIONS not handled.\n");
+        abort();
+}
+
+static void
 ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list)
 {
         foreach_list_typed(nir_cf_node, node, node, list) {
@@ -1699,6 +1706,10 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list)
                         ntq_emit_if(c, nir_cf_node_as_if(node));
                         break;
+                case nir_cf_node_function:
+                        ntq_emit_function(c, nir_cf_node_as_function(node));
+                        break;
+
                 default:
                         fprintf(stderr, "Unknown NIR node type\n");
                         abort();
-- 
2.5.0
[Mesa-dev] [PATCH 3/3] vc4: Add NIR->QIR control flow graph loops
Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04

Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested

No piglit regressions.

Signed-off-by: Rhys Kidd
---
 src/gallium/drivers/vc4/vc4_program.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_program.c b/src/gallium/drivers/vc4/vc4_program.c
index b026013..82dfdbe 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1685,6 +1685,14 @@ ntq_emit_block(struct vc4_compile *c, nir_block *block)
         }
 }
+static void ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list);
+
+static void
+ntq_emit_loop(struct vc4_compile *c, nir_loop *nloop)
+{
+        ntq_emit_cf_list(c, &nloop->body);
+}
+
 static void
 ntq_emit_function(struct vc4_compile *c, nir_function_impl *func)
 {
@@ -1697,7 +1705,6 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list)
 {
         foreach_list_typed(nir_cf_node, node, node, list) {
                 switch (node->type) {
-                        /* case nir_cf_node_loop: */
                 case nir_cf_node_block:
                         ntq_emit_block(c, nir_cf_node_as_block(node));
                         break;
@@ -1706,6 +1713,10 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list)
                         ntq_emit_if(c, nir_cf_node_as_if(node));
                         break;
+                case nir_cf_node_loop:
+                        ntq_emit_loop(c, nir_cf_node_as_loop(node));
+                        break;
+
                 case nir_cf_node_function:
                         ntq_emit_function(c, nir_cf_node_as_function(node));
                         break;
-- 
2.5.0
[Mesa-dev] [PATCH 1/3] vc4: Add better debug of NIR->QIR control flow graph failure
Ensure NIR control flow graph nodes that are unhandled in QIR are reported
with sufficient verbosity to aid debugging. This improves piglit outputs,
amongst other tools.

There are no other remaining uses of assert(0) as a blunt tool within vc4.

Signed-off-by: Rhys Kidd
---
 src/gallium/drivers/vc4/vc4_program.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_program.c b/src/gallium/drivers/vc4/vc4_program.c
index 5c91c02..4b625a2 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -1700,7 +1700,8 @@ ntq_emit_cf_list(struct vc4_compile *c, struct exec_list *list)
                         break;
                 default:
-                        assert(0);
+                        fprintf(stderr, "Unknown NIR node type\n");
+                        abort();
                 }
         }
 }
-- 
2.5.0
[Mesa-dev] [PATCH 0/3] vc4: Rework NIR control flow graph handling
Short patchset to go some way towards improving the handling of NIR
control flow graphs in vc4.

Whilst in no way completely addressing the known issues, this improves
piglit output, provides better internal handlers for loop and function
nir_cf_node types and creates a cleaner base upon which to build.

Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04

Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested

No piglit regressions.

Importantly, the specific piglit fixes reported were from a debug build
of Mesa. At present there are known vc4 problems exposed by piglit with
release builds. I hope to work on resolving these shortly. Nonetheless,
a full piglit run was also done with release builds and I confirm no
regressions were seen.

Rhys Kidd (3):
  vc4: Add better debug of NIR->QIR control flow graph failure
  vc4: Add a helper for NIR->QIR control flow function node
  vc4: Add NIR->QIR control flow graph loops

 src/gallium/drivers/vc4/vc4_program.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

-- 
2.5.0
Re: [Mesa-dev] [PATCH] mesa: Fix error condition for 1d array texture
On Mar 11, 2016 12:33 PM, "Alejandro Piñeiro" wrote:
>
> On 11/03/16 20:15, Anuj Phogat wrote:
> > yoffset is also applicable to 1d array textures.
> >
> > Signed-off-by: Anuj Phogat
> > ---
> > I don't know if it fixes any test, but it looked incorrect to me.
>
> No one fixed doing a piglit all.py run (also no regression). Didn't test
> with a deqp run.

There are very few tests for glGetTexImage. Not hitting one doesn't
mean much.

> In any case, I also agree that the change seems to make sense.
>
> >
> > src/mesa/main/texgetimage.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
> > index 06bc8f1..dc21551 100644
> > --- a/src/mesa/main/texgetimage.c
> > +++ b/src/mesa/main/texgetimage.c
> > @@ -1046,7 +1046,7 @@ dimensions_error_check(struct gl_context *ctx,
> >                    "%s(xoffset = %d)", caller, xoffset);
> >        return true;
> >     }
> > -   if (target != GL_TEXTURE_1D && target != GL_TEXTURE_1D_ARRAY) {
> > +   if (target != GL_TEXTURE_1D) {
> >        if (yoffset % bh != 0) {
> >           _mesa_error(ctx, GL_INVALID_VALUE,
> >                       "%s(yoffset = %d)", caller, yoffset);

I don't think this is correct. The check is for compressed textures to
ensure that the texture coordinates are a multiple of the block size of
the texture. I'm not sure what the rules are for 1-D array compressed
textures (if they even exist) but I'm pretty sure the compression
doesn't cross slices. If anything, we probably want to take the check
below that looks at height and pull it into the if too.
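A minimal sketch of the block-alignment rule described in the reply above, assuming that for a 1-D array texture the y offset selects an array slice and so is exempt from the block-height check. The function name and the y_is_slice flag are hypothetical; this is not the code in texgetimage.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when the offsets are acceptable for a compressed format
 * with block width bw and block height bh (both in texels, non-zero).
 * y_is_slice marks targets where yoffset is a slice index rather than
 * a texel row, so no block-height alignment applies to it.
 */
static bool
offsets_block_aligned(int xoffset, int yoffset,
                      unsigned bw, unsigned bh, bool y_is_slice)
{
    assert(bw > 0 && bh > 0);

    if (xoffset % (int)bw != 0)
        return false;

    if (!y_is_slice && yoffset % (int)bh != 0)
        return false;

    return true;
}
```

With 4x4 blocks, a y offset of 3 is rejected for a 2-D target but accepted when it names a slice.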
[Mesa-dev] ARB_shading_language_include
Hi all,

the game "Divinity: Original Sin - Enhanced Edition" uses
ARB_shading_language_include whenever it detects a non-Catalyst driver
on Linux.

Apitraces from the game running on Catalyst show that the shaders are
simply included within the game engine, and they replay fine with all
Mesa drivers as long as calls like

  glShaderSource(shader = 216, count = 1, string = [6BB0788BA6DFF7F4204CCFE5139E8AE6], length = [-1])

are ignored. So there are two issues:

1. The game just uses ARB_shading_language_include without checking if
it's actually there.

I have a WIP branch here:
https://github.com/karolherbst/mesa/commits/ARB_shading_language_include

That branch contains everything needed to run the game, but it also
hacks around the glShaderSource calls I mentioned above.

The big question is now: would a proper implementation be accepted in
Mesa, even when only one game actually requires it?

2. glShaderSource calls invalidate the compile status of shaders and
linking fails.

I have _no_ idea what the spec says about this, but the game actually
creates shaders, compiles them, links them, uses them, then makes those
glShaderSource calls, and links them again. Mesa fails with a linking
error indicating that a shader is uncompiled (because glShaderSource
marks a shader as uncompiled).

So what is going on there?

Many Thanks

Karol
Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support
I found on the PCC website that it was imported into the OpenBSD and
NetBSD systems, so the license should be compatible.
I think I will use it as a base for the add, multiply, absolute value,
negate, convert to/from single precision, and comparison functions.

Tomorrow, I will make a draft of my proposal for GSoC in which I will
summarize everything.

2016-03-11 22:00 GMT+01:00 Ian Romanick:

> On 03/10/2016 03:09 PM, Dylan Baker wrote:
> > Quoting Marek Olšák (2016-03-10 06:57:57)
> >> On Thu, Mar 10, 2016 at 3:30 PM, tournier.elie wrote:
> >>> First, thank you all for your answers.
> >>>
> >>> So if I summarize what was said, we need
> >>> Ian:
> >>> - add
> >>> - negate
> >>> - absolute value
> >>> - multiply
> >>> - reciprocal
> >>> - convert to single precision
> >>> - convert from single precision
> >>> Roland:
> >>> - sqrt
> >>> - comparaison (< / == / >)
> >>> - floor/ceil
> >>> I will contact Pat Brown (His name appear in the contact field in [1]) to
> >>> know if we need the function below for implement gpu_shader_fp64.
> >>> - pow
> >>> - exp
> >>> - log
> >>>
> >>> About the license
> >>>
> >>> Like I mentioned in the project description, there are quite a few
> >>> existing C implementations of these functions. Finding one of those
> >>> that you can understand and that has a compatible license is probably
> >>> the best place to start.
> >>>
> >>> Main Mesa code is under MIT license.
> >>> If I chose to use a GNU GPL license file like Linux kernel [3], my code must
> >>> be under GNU GPL and probably all the project too. Am I right?
> >>>
> >>> [1] https://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt
> >>> [2] http://www.mesa3d.org/license.html
> >>> [3] https://github.com/torvalds/linux/blob/097f70b3c4d84ffccca15195bdfde3a37c0a7c0f/arch/arm/nwfpe/softfloat.c
> >>
> >> You can't use GNU GPL for this project.
> >>
> >> The kernel as a whole is licensed under GNU GPL, but some source files
> >> aren't. The file you linked doesn't mention GNU GPL. Somebody needs to
> >> verify that the file you linked can be legally re-licensed under the
> >> MIT license. If not, I think you have to forget the contents of the
> >> file immediately, but I'm not a lawyer.
> >>
> >> Marek
> >
> > Most BSD style licenses are legally compatible, as long as none of the
> > developers object. One of the BSD kernels should have a softfloat
> > implementation that would be license compatible.
>
> Yes, and there are a couple C compilers that have compatible licenses.
> Portable C Compiler (PCC) being one. LLVM might also support some
> devices that lack floating-point hardware.
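To give a feel for the simplest functions on the list above (negate, absolute value, equality), here is a minimal sketch that operates on the raw IEEE 754 binary64 bit pattern using only integer operations. The fp64_* names are hypothetical and the code is not taken from PCC, the BSDs or any other existing softfloat implementation; add, multiply and the conversions need considerably more care (rounding, subnormals) and are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FP64_SIGN_BIT 0x8000000000000000ULL
#define FP64_EXP_MASK 0x7FF0000000000000ULL

/* Negation flips the sign bit and leaves exponent and mantissa alone. */
static uint64_t
fp64_neg(uint64_t a)
{
    return a ^ FP64_SIGN_BIT;
}

/* Absolute value clears the sign bit. */
static uint64_t
fp64_abs(uint64_t a)
{
    return a & ~FP64_SIGN_BIT;
}

/* NaN: exponent all ones and a non-zero mantissa. */
static bool
fp64_is_nan(uint64_t a)
{
    return (a & ~FP64_SIGN_BIT) > FP64_EXP_MASK;
}

/*
 * Equality: NaN compares unequal to everything, +0 equals -0,
 * otherwise the bit patterns must match.
 */
static bool
fp64_eq(uint64_t a, uint64_t b)
{
    if (fp64_is_nan(a) || fp64_is_nan(b))
        return false;
    if (((a | b) & ~FP64_SIGN_BIT) == 0)
        return true;
    return a == b;
}
```

For example, 1.0 is 0x3FF0000000000000 and fp64_neg() of it gives 0xBFF0000000000000, the pattern for -1.0.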
Re: [Mesa-dev] [PATCH] radeonsi: avoid crash when a sampler state is bound for a buffer texture
On Friday, 11 March 2016, 11:17:21 CET, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Sampler states don't really make sense with buffer textures, but the PBO
> upload code sets one because apparently nouveau needs it. It would be
> nice to work that out at some point, but in any case being defensive
> here is a good idea.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 9aa4877..f5ad113 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -324,6 +324,7 @@ static void si_bind_sampler_states(struct pipe_context *ctx, unsigned shader,
>                  */
>                 if (samplers->views.views[i] &&
>                     samplers->views.views[i]->texture &&
> +                   samplers->views.views[i]->texture->target != PIPE_BUFFER &&
>                     ((struct r600_texture*)samplers->views.views[i]->texture)->fmask.size)
>                         continue;

That fixed bug 94284, thanks

-- 
Laurent Carlier
http://www.archlinux.org
Re: [Mesa-dev] [PATCH 1/4] glcpp: Implicitly resolve version after the first non-space/hash token.
On 10/03/2016 19:26, Kenneth Graunke wrote:
> On Wednesday, March 9, 2016 3:18:50 PM PST Jon Turney wrote:
>> On 05/03/2016 03:33, Kenneth Graunke wrote:
>>> We resolved the implicit version directive when processing control lines,
>>> such as #ifdef, to ensure any built-in macros exist. However, we failed
>>> to resolve it when handling ordinary text.
>> [...]
>>> diff --git a/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
>>> new file mode 100644
>>> index 000..2872090
>>> --- /dev/null
>>> +++ b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
>>> @@ -0,0 +1,3 @@
>>> +0:1(3): preprocessor error: #version must appear on the first line
>>> +
>>> +
>>
>> This last test fails in glcpp-test-cr-lf for me (See attached).
>>
>> Can you just confirm that it passes for you, before I start looking into
>> why it might fail just for me...?
>
> Sorry about that. I had just run glcpp-test, but not glcpp-test-cr-lf.
>
> It turns out that our handling of hash followed by newline was not
> counting lines correctly, so it was returning either line 3 or line 4
> based on the line terminator characters. 0:1(3) in the test was wrong;
> it should have actually been 0:2(1).
>
> Iago just reviewed my patch to fix this, so I've pushed it. Hopefully
> master should work for you now.

Yes, that's fixed. Thanks!
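The class of bug discussed above, where a reported line number depends on the line terminator, can be illustrated with a small sketch. It counts line breaks so that "\n", "\r\n" and a lone "\r" each advance the count by exactly one. The function name is hypothetical and this is not how the glcpp lexer is written; it only shows the invariant that the two test variants are expected to agree on.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts line breaks in a NUL-terminated string. A CR immediately
 * followed by LF is treated as a single break, so the result does not
 * depend on which terminator style the input uses.
 */
static unsigned
count_line_breaks(const char *s)
{
    unsigned n = 0;

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == '\r') {
            /* s[i + 1] is at worst the terminating NUL, so this is safe. */
            if (s[i + 1] == '\n')
                i++;
            n++;
        } else if (s[i] == '\n') {
            n++;
        }
    }
    return n;
}
```

Running it on the same text with LF and with CR-LF terminators should give the same count, which is the property the cr-lf test variant checks for the preprocessor's error locations.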
Re: [Mesa-dev] vulkan: regression on Haswell
On 2016-03-12 19:21, Jason Ekstrand wrote:
>> > Haswell should still work just fine if
>> > you're on a 4.4 kernel, but we really should make it detect the command
>> > parser version and do something intelligent.
>>
>> I am confused now… Should it 'work just fine' without this hack on 4.4,
>> or is the remark about the 'fixed' version?
>>
>> Because:
>>
>> $ uname -r
>> 4.4.4-1
>
> Yeah, we've had the same confusion on Nanley's laptop. Still trying to
> get it all sorted out. What distro are you using?

PLD Linux. Quite niche, I could have built my own kernel as well.
Though, I can provide any specific information about that build, if
that may help.

Jacek
Re: [Mesa-dev] vulkan: regression on Haswell
On Mar 12, 2016 9:11 AM, "Jacek Konieczny" wrote:
>
> On 2016-03-12 17:58, Jason Ekstrand wrote:
> > There is a bug report that's tracking this regression:
> > https://bugs.freedesktop.org/show_bug.cgi?id=94468
> >
> > In the meantime, a workaround is comment out:
> > genX(cmd_buffer_config_l3)(cmd_buffer, false);
> > in src/intel/vulkan/genX_cmd_buffer.c.
> >
> >
> > I just pushed a hack patch that does exactly that for you on gen7.
> > Hopefully, Jordan can get the command parser version stuff figured out
> > soon. Until then, we'll just disable it to get haswell at least sort-of
> > working.
>
> Thanks! It works fine now.
>
> Though, from the commit:
>
> > Haswell should still work just fine if
> > you're on a 4.4 kernel, but we really should make it detect the command
> > parser version and do something intelligent.
>
> I am confused now… Should it 'work just fine' without this hack on 4.4,
> or is the remark about the 'fixed' version?
>
> Because:
>
> $ uname -r
> 4.4.4-1

Yeah, we've had the same confusion on Nanley's laptop. Still trying to
get it all sorted out. What distro are you using?

--Jason
Re: [Mesa-dev] vulkan: regression on Haswell
On 2016-03-12 17:58, Jason Ekstrand wrote:
> There is a bug report that's tracking this regression:
> https://bugs.freedesktop.org/show_bug.cgi?id=94468
>
> In the meantime, a workaround is comment out:
> genX(cmd_buffer_config_l3)(cmd_buffer, false);
> in src/intel/vulkan/genX_cmd_buffer.c.
>
>
> I just pushed a hack patch that does exactly that for you on gen7.
> Hopefully, Jordan can get the command parser version stuff figured out
> soon. Until then, we'll just disable it to get haswell at least sort-of
> working.

Thanks! It works fine now.

Though, from the commit:

> Haswell should still work just fine if
> you're on a 4.4 kernel, but we really should make it detect the command
> parser version and do something intelligent.

I am confused now… Should it 'work just fine' without this hack on 4.4,
or is the remark about the 'fixed' version?

Because:

$ uname -r
4.4.4-1

Jacek
Re: [Mesa-dev] vulkan: regression on Haswell
On Sat, Mar 12, 2016 at 8:29 AM, Nanley Chery wrote:
> On Sat, Mar 12, 2016 at 12:20:26PM +0100, Jacek Konieczny wrote:
> > On 2016-03-12 11:59, Jacek Konieczny wrote:
> > > Hi,
> > >
> > > I have been playing with Vulkan API and using the Mesa Intel Vulkan
> > > driver from the 'vulkan' branch.
> > >
> > > Recent driver upgrade has broken my, previously working code, causing
> > > massive flickering and graphical artifacts.
> > >
> > > git bisect have shown, that this is the breaking change:
> > >
> > > commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
> > > Author: Nanley Chery
> > > Date: Fri Mar 4 11:43:19 2016 -0800
> > >
> > > anv/meta: Minimize height of images used for copies
> >
> > I am sorry. It seems I have been using 'git bisect' wrong.
> >
> > This is the breaking change:
> >
> > commit 248ab61740c4082517424f5aa94b2f4e7b210d76 (HEAD)
> > Author: Jason Ekstrand
> > Date: Tue Mar 8 17:10:05 2016 -0800
> >
> > anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer
> >
> > And this seems to make more sense.
> >
>
> There is a bug report that's tracking this regression:
> https://bugs.freedesktop.org/show_bug.cgi?id=94468
>
> In the meantime, a workaround is comment out:
> genX(cmd_buffer_config_l3)(cmd_buffer, false);
> in src/intel/vulkan/genX_cmd_buffer.c.
>

I just pushed a hack patch that does exactly that for you on gen7.
Hopefully, Jordan can get the command parser version stuff figured out
soon. Until then, we'll just disable it to get haswell at least sort-of
working.
Re: [Mesa-dev] [RFC] i965/dbg: Expose cases hitting a presumably dead optimization
On Mar 11, 2016 11:47 PM, "Pohjolainen, Topi" wrote:
>
> On Fri, Mar 11, 2016 at 05:59:37PM -0800, Jason Ekstrand wrote:
> >On Fri, Mar 11, 2016 at 4:40 AM, Topi Pohjolainen
> ><[1]topi.pohjolai...@intel.com> wrote:
> >
> > The logic iterates over param[] which contains pointers to
> > uniform storage set during linking (see
> > link_assign_uniform_locations()).
> > The pointers are unique and it should be impossible to find
> > matching entries.
> > I couldn't find any regressions with test system. In addition
> > I tried several benchmarks on HSW and none hit this.
> > I'm hoping to remove this optimization attempt. This is the only
> > bit that depends on knowing about the actual storage during
> > compilation. All the rest deal with just relative push and pull
> > locations once the actual filling of pull_param[] is moved
> > outside the compiler just as param[]. (Filling pull_param is
> > based on the pull locations and doesn't need to be inside the
> > compiler).
> > Any thoughts?
> >
> >I'm not 100% sure what you're trying to do, but I have a branch that
> >may be of interest:
> >[2] https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-uniforms
> >The branch enables support for pushing small uniform arrays. Among
> >other things, it redoes the way we do push constants and gets rid of
> >some of the data tracking in the backend compiler. The big reason why
> >I haven't tried too hard to get it merged is because it regresses Sandy
> >Bridge just a bit. I know I've seen and fixed the bug before in an
> >alternate attempt, but I don't remember how.
> >I'm going to be refreshing it soon because we need indirect push
> >constants for the Vulkan driver. (The branch is already merged into
> >the Vulkan branch.)
>
> I'd like to stop filling param[] before compilation. This is really not
> needed by the compiler as it deals with pull and push constant locations,
> i.e., positions in the push and pull files. Actual uniform values and their
> location in the uniform storage are not needed until actual pipeline upload.
>
> My plan is to move the iteration over the core uniform storage to pipeline
> upload time. We can fill push and pull buffers directly without the need of
> storing pointers to param[] in the middle. Not only makes this things simpler
> and more flexible in my mind, does it give us the possibility to upload
> floats with 16-bit precision instead of 32-bits. Current upload logic only
> gets pointers to 32-bit values without knowing if they should be represented
> with 16 bits let alone whether the values are floats or integers to begin
> with.

Right. Kristian and I have talked about some related things that we
need for pipeline caching and the Vulkan driver. In Vulkan, they aren't
actual pointers at all but are, instead, offsets into a push constant
block. Fortunately, the back-end compiler never dereferences them so
you can shove whatever you want in there and it's OK. We've talked
about turning the pull and push params into just a set of integers that
means whatever the API and state setup code want. One of the problems
with pointers is that you can't easily put them into an on-disk shader
cache (which we have for Vulkan).

When you talk about 16 or 64-bit values, what is your intention? Are
64-bit values still going to take up two slots or are they now one
64-bit slot? Are there two 16-bit values per slot or just one? Are
16-bit uniforms converted before they get uploaded or consumed directly
by the shader? I'm still a little confused as to what problem you're
trying to solve.

One thing to think about as you work on this is that Vulkan doesn't
have individual uniforms but instead has a block of explicit push
constants. In the shader, the push constants are specified with
explicit offsets into that block similar to a UBO. The result is that
it's very difficult for the state setup code to know what size anything
is. Just chopping the push constant space into 32-bit hunks that the
compiler is free to rearrange is terribly convenient. We could use 16
or 8-bit chunks just as easily but having some be 32-bit, others
64-bit, and others 16-bit has the potential to get very painful.

Food for thought. Maybe I'm completely missing what you're trying to
do.

--Jason
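The idea of params as plain integers rather than pointers can be sketched roughly as follows, assuming every slot is one 32-bit dword and each param entry is a dword offset into a push constant block. The function and parameter names are hypothetical and do not correspond to the i965 or anv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fills the upload buffer 'out' from a push constant block at upload
 * time. param_offsets[] holds dword offsets into 'block', in the order
 * the compiler arranged the slots. Because the table contains only
 * integers, it could be written to an on-disk cache unchanged.
 */
static void
gather_push_constants(const uint32_t *block, size_t block_dwords,
                      const uint32_t *param_offsets, size_t num_params,
                      uint32_t *out)
{
    for (size_t i = 0; i < num_params; i++) {
        assert(param_offsets[i] < block_dwords);
        out[i] = block[param_offsets[i]];
    }
}
```

Mixing 16-bit, 32-bit and 64-bit slots would require a per-entry size in addition to the offset, which is the complication the message above warns about.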
Re: [Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types
On Fri, Mar 11, 2016 at 2:33 AM, Samuel Iglesias Gonsálvez wrote:
> On 11/03/16 01:08, Jason Ekstrand wrote:
>> On Thu, Mar 10, 2016 at 4:00 PM, Connor Abbott wrote:
>>
>>> On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez wrote:
>>>> From: Jason Ekstrand
>>>>
>>>> v2: Fix size/type mask to properly handle 8-bit types.
>>>>
>>>> Signed-off-by: Juan A. Suarez Romero
>>>> ---
>>>>  src/compiler/nir/nir.h | 17 ++++++++++++++++-
>>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>>> index cccb3a4..659e98c 100644
>>>> --- a/src/compiler/nir/nir.h
>>>> +++ b/src/compiler/nir/nir.h
>>>> @@ -605,9 +605,24 @@ typedef enum {
>>>>     nir_type_float,
>>>>     nir_type_int,
>>>>     nir_type_uint,
>>>> -   nir_type_bool
>>>> +   nir_type_bool,
>>>> +   nir_type_bool32 =   32 | nir_type_bool,
>>>> +   nir_type_int8 =      8 | nir_type_int,
>>>> +   nir_type_int16 =    16 | nir_type_int,
>>>> +   nir_type_int32 =    32 | nir_type_int,
>>>> +   nir_type_int64 =    64 | nir_type_int,
>>>> +   nir_type_uint8 =     8 | nir_type_uint,
>>>> +   nir_type_uint16 =   16 | nir_type_uint,
>>>> +   nir_type_uint32 =   32 | nir_type_uint,
>>>> +   nir_type_uint64 =   64 | nir_type_uint,
>>>> +   nir_type_float16 =  16 | nir_type_float,
>>>> +   nir_type_float32 =  32 | nir_type_float,
>>>> +   nir_type_float64 =  64 | nir_type_float,
>>>>  } nir_alu_type;
>>>>
>>>> +#define NIR_ALU_TYPE_SIZE_MASK 0xfff8
>>>> +#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007
>>>
>>> So I'm not really the one to be reviewing this series (after all,
>>> I wrote most of it :) ) but one thing that I never quite liked,
>>> and didn't get around to fixing, is how we use these raw
>>> constants all over the place. Perhaps we could make things more
>>> readable by adding nir_get_sized_type(), nir_get_unsized_type(),
>>> and nir_type_size() helpers and then use those instead of
>>> or-ing/and-ing things together everywhere.
>>>
>>
>> Agreed.
>>
>
> Agreed. We saw it too but, as this is used in a lot in the fp64 patches,
> we were thinking on apply one patch at the end of the fp64 series adding
> those helper functions (maybe just macros like NIR_GET_UNSIZED_TYPE and
> NIR_GET_TYPE_SIZE) and adapting the users of the mask.

I should probably mention, in general we tend to prefer inline
functions over macros where possible since it's clearer what their
argument types and return type are and they tend to integrate better
with gdb.

>
> However, we can add them here and modify the rest of fp64 patches if
> you prefer it.
>
> Sam
>
>>>> +
>>>>  typedef enum {
>>>>     NIR_OP_IS_COMMUTATIVE = (1 << 0),
>>>>     NIR_OP_IS_ASSOCIATIVE = (1 << 1),
>>>> --
>>>> 2.7.0
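A minimal sketch of what the inline helpers suggested in this thread might look like, assuming the encoding from the quoted patch: the base type lives in the low three bits and the bit size is OR-ed in above them. The enum values, mask definitions and function names here are illustrative assumptions and are not copied from nir.h.

```c
#include <assert.h>

/* Hypothetical base types; values are assumed to fit in three bits. */
enum alu_base_type {
    ALU_TYPE_FLOAT = 1,
    ALU_TYPE_INT   = 2,
    ALU_TYPE_UINT  = 3,
    ALU_TYPE_BOOL  = 4
};

#define ALU_TYPE_BASE_MASK 0x7u
#define ALU_TYPE_SIZE_MASK (~ALU_TYPE_BASE_MASK)

/* Bit size of a sized type (8, 16, 32, 64), or 0 if unsized. */
static inline unsigned
alu_type_size(unsigned type)
{
    return type & ALU_TYPE_SIZE_MASK;
}

/* Base type with the size bits stripped. */
static inline unsigned
alu_type_base(unsigned type)
{
    return type & ALU_TYPE_BASE_MASK;
}

/* Combines an unsized base type with a bit size. */
static inline unsigned
alu_type_sized(unsigned base, unsigned bit_size)
{
    assert((base & ALU_TYPE_SIZE_MASK) == 0);
    assert((bit_size & ALU_TYPE_BASE_MASK) == 0);
    return base | bit_size;
}
```

Compared with macros, the typed signatures make it clear what goes in and what comes out, and the functions can be stepped into from a debugger, which matches the preference stated in the reply.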
Re: [Mesa-dev] vulkan: regression on Haswell
On Sat, Mar 12, 2016 at 12:20:26PM +0100, Jacek Konieczny wrote:
> On 2016-03-12 11:59, Jacek Konieczny wrote:
> > Hi,
> >
> > I have been playing with Vulkan API and using the Mesa Intel Vulkan
> > driver from the 'vulkan' branch.
> >
> > Recent driver upgrade has broken my, previously working code, causing
> > massive flickering and graphical artifacts.
> >
> > git bisect have shown, that this is the breaking change:
> >
> > commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
> > Author: Nanley Chery
> > Date: Fri Mar 4 11:43:19 2016 -0800
> >
> > anv/meta: Minimize height of images used for copies
>
> I am sorry. It seems I have been using 'git bisect' wrong.
>
> This is the breaking change:
>
> commit 248ab61740c4082517424f5aa94b2f4e7b210d76 (HEAD)
> Author: Jason Ekstrand
> Date: Tue Mar 8 17:10:05 2016 -0800
>
> anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer
>
> And this seems to make more sense.
>

There is a bug report that's tracking this regression:
https://bugs.freedesktop.org/show_bug.cgi?id=94468

In the meantime, a workaround is to comment out:
genX(cmd_buffer_config_l3)(cmd_buffer, false);
in src/intel/vulkan/genX_cmd_buffer.c.

Regards,
Nanley
Re: [Mesa-dev] vulkan: regression on Haswell
On 2016-03-12 11:59, Jacek Konieczny wrote:
> Hi,
>
> I have been playing with Vulkan API and using the Mesa Intel Vulkan
> driver from the 'vulkan' branch.
>
> Recent driver upgrade has broken my, previously working code, causing
> massive flickering and graphical artifacts.
>
> git bisect have shown, that this is the breaking change:
>
> commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
> Author: Nanley Chery
> Date: Fri Mar 4 11:43:19 2016 -0800
>
> anv/meta: Minimize height of images used for copies

I am sorry. It seems I have been using 'git bisect' wrong.

This is the breaking change:

commit 248ab61740c4082517424f5aa94b2f4e7b210d76 (HEAD)
Author: Jason Ekstrand
Date: Tue Mar 8 17:10:05 2016 -0800

anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer

And this seems to make more sense.

Jacek
[Mesa-dev] vulkan: regression on Haswell
Hi,

I have been playing with the Vulkan API, using the Mesa Intel Vulkan
driver from the 'vulkan' branch.

A recent driver upgrade has broken my previously working code, causing
massive flickering and graphical artifacts.

git bisect has shown that this is the breaking change:

commit 7ebbc3946ae9cffb3c3db522dcbe2f1041633164 (refs/bisect/bad)
Author: Nanley Chery
Date: Fri Mar 4 11:43:19 2016 -0800

anv/meta: Minimize height of images used for copies

It might be that my code is broken, but it worked correctly before this
change and has been checked with validation layers and Valgrind.

Jacek
Re: [Mesa-dev] [PATCH] swrast: Delete the unused 'slice' parameter
On 12/03/16 00:16, Anuj Phogat wrote:
> Signed-off-by: Anuj Phogat

Any reason to not just move the slice assert at line 243 as part of the
checks of check_map_teximage?

> ---
> src/mesa/swrast/s_texture.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
> index 9ccd0e3..6ea7b6c 100644
> --- a/src/mesa/swrast/s_texture.c
> +++ b/src/mesa/swrast/s_texture.c
> @@ -178,7 +178,7 @@ _swrast_free_texture_image_buffer(struct gl_context *ctx,
>  */
> static void
> check_map_teximage(const struct gl_texture_image *texImage,
> -                  GLuint slice, GLuint x, GLuint y, GLuint w, GLuint h)
> +                  GLuint x, GLuint y, GLuint w, GLuint h)
> {
>
>    if (texImage->TexObject->Target == GL_TEXTURE_1D)
> @@ -216,7 +216,7 @@ _swrast_map_teximage(struct gl_context *ctx,
>    GLint stride, texelSize;
>    GLuint bw, bh;
>
> -   check_map_teximage(texImage, slice, x, y, w, h);
> +   check_map_teximage(texImage, x, y, w, h);
>
>    if (!swImage->Buffer) {
>       /* Either glTexImage was called with a NULL argument or